Bernoulli

Nonparametrically consistent depth-based classifiers

Davy Paindaveine and Germain Van Bever

Full-text: Open access

Abstract

We introduce a class of depth-based classification procedures that are of a nearest-neighbor nature. Depth, after symmetrization, indeed provides the center-outward ordering that is necessary and sufficient to define nearest neighbors. Like all their depth-based competitors, the resulting classifiers are affine-invariant, and hence in particular insensitive to unit changes. Unlike those competitors, however, they achieve Bayes consistency under virtually any absolutely continuous distributions – a concept we call nonparametric consistency, to stress the difference with the stronger universal consistency of the standard $k$NN classifiers. We investigate the finite-sample performance of the proposed classifiers through simulations and show that they outperform affine-invariant nearest-neighbor classifiers obtained through an obvious standardization construction. We illustrate the practical value of our classifiers on two real data examples. Finally, we briefly discuss possible uses of our depth-based neighbors in other inference problems.
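To make the idea concrete, here is a minimal illustrative sketch (not the paper's exact procedure) of a depth-based nearest-neighbor rule: the sample is symmetrized about the query point, each training observation receives a depth with respect to the symmetrized cloud, and the "neighbors" of the query are the deepest observations. For simplicity the sketch uses Mahalanobis depth, which is affine-invariant; the function names `mahalanobis_depth` and `depth_knn_classify` are our own.

```python
import numpy as np

def mahalanobis_depth(points, cloud):
    """Mahalanobis depth of each row of `points` w.r.t. the data `cloud`:
    D(y) = 1 / (1 + (y - mu)' Sigma^{-1} (y - mu))."""
    mu = cloud.mean(axis=0)
    inv = np.linalg.inv(np.cov(cloud, rowvar=False))
    d = points - mu
    return 1.0 / (1.0 + np.einsum('ij,jk,ik->i', d, inv, d))

def depth_knn_classify(x, X, y, k=5):
    """Classify query x by majority vote among its k depth-based neighbors."""
    # Symmetrize the sample about x by adding the reflections 2x - X,
    # so that x becomes a center of the augmented cloud.
    sym = np.vstack([X, 2 * x - X])
    depths = mahalanobis_depth(X, sym)
    # Depth-based neighbors of x: the k training points deepest in `sym`.
    neighbors = np.argsort(depths)[::-1][:k]
    labels, counts = np.unique(y[neighbors], return_counts=True)
    return labels[np.argmax(counts)]

# Toy illustration with two well-separated Gaussian classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(depth_knn_classify(np.array([0.1, -0.2]), X, y, k=7))
```

Because Mahalanobis depth is invariant under affine transformations of the data, the resulting rule inherits the insensitivity to unit changes stressed in the abstract; the paper itself works with general depth functions satisfying suitable regularity conditions.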

Article information

Source
Bernoulli, Volume 21, Number 1 (2015), 62-82.

Dates
First available in Project Euclid: 17 March 2015

Permanent link to this document
https://projecteuclid.org/euclid.bj/1426597064

Digital Object Identifier
doi:10.3150/13-BEJ561

Mathematical Reviews number (MathSciNet)
MR3322313

Zentralblatt MATH identifier
1359.62258

Keywords
affine-invariance; classification procedures; nearest neighbors; statistical depth functions; symmetrization

Citation

Paindaveine, Davy; Van Bever, Germain. Nonparametrically consistent depth-based classifiers. Bernoulli 21 (2015), no. 1, 62–82. doi:10.3150/13-BEJ561. https://projecteuclid.org/euclid.bj/1426597064

References

  • [1] Biau, G., Devroye, L., Dujmović, V. and Krzyżak, A. (2012). An affine invariant $k$-nearest neighbor regression estimate. J. Multivariate Anal. 112 24–34.
  • [2] Chacón, J.E. (2009). Data-driven choice of the smoothing parametrization for kernel density estimators. Canad. J. Statist. 37 249–265.
  • [3] Chacón, J.E., Duong, T. and Wand, M.P. (2011). Asymptotics for general multivariate kernel density derivative estimators. Statist. Sinica 21 807–840.
  • [4] Croux, C. and Dehon, C. (2001). Robust linear discriminant analysis using $S$-estimators. Canad. J. Statist. 29 473–493.
  • [5] Cui, X., Lin, L. and Yang, G. (2008). An extended projection data depth and its applications to discrimination. Comm. Statist. Theory Methods 37 2276–2290.
  • [6] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics (New York) 31. New York: Springer.
  • [7] Donoho, D.L. and Gasko, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Statist. 20 1803–1827.
  • [8] Dümbgen, L. (1992). Limit theorems for the simplicial depth. Statist. Probab. Lett. 14 119–128.
  • [9] Dümbgen, L. (1998). On Tyler’s $M$-functional of scatter in high dimension. Ann. Inst. Statist. Math. 50 471–491.
  • [10] Dutta, S. and Ghosh, A.K. (2012). On robust classification using projection depth. Ann. Inst. Statist. Math. 64 657–676.
  • [11] Dutta, S. and Ghosh, A.K. (2012). On classification based on $L_{p}$ depth with an adaptive choice of $p$. Technical Report Number R5/2011, Statistics and Mathematics Unit, Indian Statistical Institute, Kolkata, India.
  • [12] Ghosh, A.K. and Chaudhuri, P. (2005). On data depth and distribution-free discriminant analysis using separating surfaces. Bernoulli 11 1–27.
  • [13] Ghosh, A.K. and Chaudhuri, P. (2005). On maximum depth and related classifiers. Scand. J. Statist. 32 327–350.
  • [14] Hartikainen, A. and Oja, H. (2006). On some parametric, nonparametric and semiparametric discrimination rules. In Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications. DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 72 61–70. Providence, RI: Amer. Math. Soc.
  • [15] He, X. and Fung, W.K. (2000). High breakdown estimation for multiple populations with applications to discriminant analysis. J. Multivariate Anal. 72 151–162.
  • [16] Hettmansperger, T.P. and Randles, R.H. (2002). A practical affine equivariant multivariate median. Biometrika 89 851–860.
  • [17] Hubert, M. and Van der Veeken, S. (2010). Robust classification for skewed data. Adv. Data Anal. Classif. 4 239–254.
  • [18] Jörnsten, R. (2004). Clustering and classification based on the $L_{1}$ data depth. J. Multivariate Anal. 90 67–89.
  • [19] Koshevoy, G. and Mosler, K. (1997). Zonoid trimming for multivariate distributions. Ann. Statist. 25 1998–2017.
  • [20] Lange, T., Mosler, K. and Mozharovskyi, P. (2014). Fast nonparametric classification based on data depth. Statist. Papers 55 49–69.
  • [21] Li, J., Cuesta-Albertos, J.A. and Liu, R.Y. (2012). $DD$-classifier: Nonparametric classification procedure based on $DD$-plot. J. Amer. Statist. Assoc. 107 737–753.
  • [22] Liu, R.Y. (1990). On a notion of data depth based on random simplices. Ann. Statist. 18 405–414.
  • [23] Liu, R.Y., Parelius, J.M. and Singh, K. (1999). Multivariate analysis by data depth: Descriptive statistics, graphics and inference. Ann. Statist. 27 783–858.
  • [24] Mosler, K. and Hoberg, R. (2006). Data analysis and classification with the zonoid depth. In Data Depth: Robust Multivariate Analysis, Computational Geometry and Applications. DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 72 49–59. Providence, RI: Amer. Math. Soc.
  • [25] Oja, H. and Paindaveine, D. (2005). Optimal signed-rank tests based on hyperplanes. J. Statist. Plann. Inference 135 300–323.
  • [26] Randles, R.H., Broffitt, J.D., Ramberg, J.S. and Hogg, R.V. (1978). Generalized linear and quadratic discriminant functions using robust estimates. J. Amer. Statist. Assoc. 73 564–568.
  • [27] Ripley, B.D. (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge Univ. Press.
  • [28] Rousseeuw, P.J. and Ruts, I. (1999). The depth function of a population distribution. Metrika 49 213–244.
  • [29] Rousseeuw, P.J. and Struyf, A. (2004). Characterizing angular symmetry and regression symmetry. J. Statist. Plann. Inference 122 161–173.
  • [30] Serfling, R.J. (2006). Multivariate symmetry and asymmetry. Encyclopedia Statist. Sci. 8 5338–5345.
  • [31] Stone, C.J. (1977). Consistent nonparametric regression. Ann. Statist. 5 595–645.
  • [32] Tukey, J.W. (1975). Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians (Vancouver, B.C., 1974), Vol. 2 523–531. Canad. Math. Congress, Montreal, Que.
  • [33] Tyler, D.E. (1987). A distribution-free $M$-estimator of multivariate scatter. Ann. Statist. 15 234–251.
  • [34] Yeh, I.C., Yang, K.J. and Ting, T.M. (2009). Knowledge discovery on RFM model using Bernoulli sequence. Expert Syst. Appl. 36 5866–5871.
  • [35] Zakai, A. and Ritov, Y. (2009). Consistency and localizability. J. Mach. Learn. Res. 10 827–856.
  • [36] Zuo, Y. (2003). Projection-based depth functions and associated medians. Ann. Statist. 31 1460–1490.
  • [37] Zuo, Y. and Serfling, R. (2000). General notions of statistical depth function. Ann. Statist. 28 461–482.
  • [38] Zuo, Y. and Serfling, R. (2000). Structural properties and convergence results for contours of sample statistical depth functions. Ann. Statist. 28 483–499.