Open Access
February 2005
Bandwidth choice for nonparametric classification
Peter Hall, Kee-Hoon Kang
Ann. Statist. 33(1): 284-306 (February 2005). DOI: 10.1214/009053604000000959


It is shown that, for kernel-based classification with univariate distributions and two populations, optimal bandwidth choice has a dichotomous character. If the two densities cross at just one point, where their curvatures have the same signs, then minimum Bayes risk is achieved using bandwidths which are an order of magnitude larger than those which minimize pointwise estimation error. On the other hand, if the curvature signs are different, or if there are multiple crossing points, then bandwidths of conventional size are generally appropriate. The range of different modes of behavior is narrower in multivariate settings. There, the optimal size of bandwidth is generally the same as that which is appropriate for pointwise density estimation. These properties motivate empirical rules for bandwidth choice.
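As an illustration of the setting described in the abstract, the following is a minimal sketch of a two-population kernel classifier with a common bandwidth chosen by leave-one-out cross-validation of the classification error. It is not the authors' empirical rule: the Gaussian kernel, the shared bandwidth, the bandwidth grid, the simulated normal populations and all function names (`gauss_kde`, `classify`, `loo_error`) are assumptions made for the example. The two simulated densities, N(0, 1) and N(1.5, 1), cross at a single point (x = 0.75) where both are concave, which corresponds to the "same curvature sign" case.

```python
import numpy as np

def gauss_kde(x, sample, h):
    """Gaussian kernel density estimate built from `sample`, evaluated at `x`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = (x[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def classify(x, sample0, sample1, h, prior0=0.5):
    """Return 1 where prior1 * f1_hat(x) > prior0 * f0_hat(x), else 0."""
    f0 = gauss_kde(x, sample0, h)
    f1 = gauss_kde(x, sample1, h)
    return ((1 - prior0) * f1 > prior0 * f0).astype(int)

def loo_error(sample0, sample1, h, prior0=0.5):
    """Leave-one-out, prior-weighted estimate of classification error at bandwidth h."""
    # Each point is classified using estimates that exclude that point.
    wrong0 = [classify(sample0[i], np.delete(sample0, i), sample1, h, prior0)[0] != 0
              for i in range(len(sample0))]
    wrong1 = [classify(sample1[i], sample0, np.delete(sample1, i), h, prior0)[0] != 1
              for i in range(len(sample1))]
    return prior0 * np.mean(wrong0) + (1 - prior0) * np.mean(wrong1)

# Simulated training data (an assumption for the example, not data from the paper).
rng = np.random.default_rng(0)
sample0 = rng.normal(0.0, 1.0, 200)
sample1 = rng.normal(1.5, 1.0, 200)

# Evaluate the cross-validated error over a hypothetical bandwidth grid.
grid = np.geomspace(0.05, 2.0, 15)
cv = np.array([loo_error(sample0, sample1, h) for h in grid])
h_cv = grid[np.argmin(cv)]
print(f"bandwidth minimizing leave-one-out error: {h_cv:.3f}")
```

The criterion minimized here is an estimate of classification error rather than of pointwise density estimation error, which is the distinction the abstract draws. The cross-validated error is a step function of the bandwidth and can be nearly flat over a range of values, so the selected minimizer should be read with care.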



Peter Hall, Kee-Hoon Kang. "Bandwidth choice for nonparametric classification." Ann. Statist. 33 (1) 284-306, February 2005.


Published: February 2005
First available in Project Euclid: 8 April 2005

zbMATH: 1064.62075
MathSciNet: MR2157804
Digital Object Identifier: 10.1214/009053604000000959

Primary: 62C12, 62H30
Secondary: 62G07

Keywords: Bayes risk, bootstrap, classification error, cross-validation, discrimination, error rate, kernel methods, nonparametric density estimation

Rights: Copyright © 2005 Institute of Mathematical Statistics
