Abstract
Consider the multiclassification (discrimination) problem with known prior probabilities and a multi-dimensional vector of observations. Assume the underlying densities corresponding to the various classes are unknown, but a training sample of size $N$ is available from each class. Rates of convergence to Bayes risk are investigated under smoothness conditions on the underlying densities of the type often seen in nonparametric density estimation. Because these rates can be drastically affected by a small change in the prior probabilities, the error criterion used here is Bayes risk averaged (uniformly) over all prior probabilities. It is then shown that a certain rate, $N^{-r}$, is optimal in the sense that no rule can do better (uniformly over the class of smooth densities), and a rule is exhibited which attains it. The optimal value of $r$ depends on the smoothness of the distributions and the dimensionality of the observations in the same way as for nonparametric density estimation with integrated square error loss.
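To make the setup concrete, below is a minimal sketch of a plug-in rule of the kind alluded to above: each unknown class density is replaced by a nonparametric estimate, and an observation is assigned to the class that maximizes prior probability times estimated density. The Gaussian kernel estimator (scipy.stats.gaussian_kde) and the toy data are illustrative assumptions only; the paper's specific rate-optimal rule, with bandwidths tuned to the smoothness class, is not reproduced here. For reference, under the usual smoothness index (densities with $k$ bounded derivatives on $\mathbb{R}^d$), the optimal integrated-square-error rate for density estimation is $N^{-2k/(2k+d)}$, and the abstract states that $r$ depends on smoothness and dimension in the same way.

    import numpy as np
    from scipy.stats import gaussian_kde

    def plug_in_rule(train_samples, priors):
        # train_samples: list of (d, N) arrays, one training sample per class.
        # priors: known prior probability of each class.
        kdes = [gaussian_kde(sample) for sample in train_samples]  # density estimate per class

        def classify(x):
            # x: (d, M) array of observations; the plug-in Bayes rule picks,
            # for each observation, the class maximizing prior * estimated density.
            scores = np.stack([p * kde(x) for p, kde in zip(priors, kdes)])
            return np.argmax(scores, axis=0)

        return classify

    # Toy usage (hypothetical data): two classes in R^2, N = 200 per class.
    rng = np.random.default_rng(0)
    train = [rng.normal(loc=m, size=(2, 200)) for m in (0.0, 1.5)]
    rule = plug_in_rule(train, priors=[0.5, 0.5])
    print(rule(rng.normal(size=(2, 10))))  # predicted class labels, 0 or 1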
Citation
James Stephen Marron. "Optimal Rates of Convergence to Bayes Risk in Nonparametric Discrimination." Ann. Statist. 11 (4): 1142–1155, December 1983. https://doi.org/10.1214/aos/1176346328