The Annals of Statistics

Discussion of “Influential features PCA for high dimensional clustering”

Ery Arias-Castro and Nicolas Verzelen

Article information

Ann. Statist., Volume 44, Number 6 (2016), 2360-2365.

Received: May 2016
First available in Project Euclid: 23 November 2016

Arias-Castro, Ery; Verzelen, Nicolas. Discussion of “Influential features PCA for high dimensional clustering”. Ann. Statist. 44 (2016), no. 6, 2360–2365. doi:10.1214/16-AOS1423A.

References

  • Achlioptas, D. and McSherry, F. (2005). On spectral learning of mixtures of distributions. In Learning Theory. Lecture Notes in Computer Science 3559 458–469. Springer, Berlin.
  • Azizyan, M., Singh, A. and Wasserman, L. (2013). Minimax theory for high-dimensional Gaussian mixtures with sparse mean separation. In Advances in Neural Information Processing Systems 26 (C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Q. Weinberger, eds.). Lake Tahoe, NV.
  • Chan, Y. and Hall, P. (2010). Using evidence of mixed populations to select variables for clustering very high-dimensional data. J. Amer. Statist. Assoc. 105 798–809.
  • Delaigle, A. and Hall, P. (2009). Higher criticism in the context of unknown distribution, non-independence and classification. In Perspectives in Mathematical Sciences. I. Stat. Sci. Interdiscip. Res. 7 109–138. World Sci. Publ., Hackensack, NJ.
  • Delaigle, A., Hall, P. and Jin, J. (2011). Robustness and accuracy of methods for high dimensional data analysis based on Student’s t-statistic. J. R. Stat. Soc. Ser. B Stat. Methodol. 73 283–301.
  • De Soete, G. (1986). Optimal variable weighting for ultrametric and additive tree clustering. Qual. Quant. 20 169–180.
  • Efron, B. (2004). Large-scale simultaneous hypothesis testing: The choice of a null hypothesis. J. Amer. Statist. Assoc. 99 96–104.
  • Friedman, J. H. and Meulman, J. J. (2004). Clustering objects on subsets of attributes. J. R. Stat. Soc. Ser. B Stat. Methodol. 66 815–849.
  • Hall, P. and Jin, J. (2008). Properties of higher criticism under strong dependence. Ann. Statist. 36 381–402.
  • Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in correlated noise. Ann. Statist. 38 1686–1732.
  • Hall, P., Jin, J. and Miller, H. (2014). Feature selection when there are many influential features. Bernoulli 20 1647–1671.
  • Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57–89.
  • Jin, J., Ke, Z. T. and Wang, W. (2015). Phase transitions for high dimensional clustering and related problems. Preprint. Available at arXiv:1502.06952.
  • Jin, J., Ke, Z. T. and Wang, W. (2016). Optimal spectral clustering by higher criticism thresholding. Unpublished manuscript.
  • Kannan, R., Salmasian, H. and Vempala, S. (2005). The spectral method for general mixture models. In Learning Theory. Lecture Notes in Computer Science 3559 444–457. Springer, Berlin.
  • Karlin, S. and Studden, W. J. (1966). Tchebycheff Systems: With Applications in Analysis and Statistics. Pure and Applied Mathematics, Vol. XV. Wiley, New York.
  • Lee, A. B., Luca, D. and Roeder, K. (2010). A spectral graph approach to discovering genetic ancestry. Ann. Appl. Stat. 4 179–202.
  • Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist. 43 215–237.
  • Moitra, A. and Valiant, G. (2010). Settling the polynomial learnability of mixtures of Gaussians. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science FOCS 2010 93–102. IEEE Computer Soc., Los Alamitos, CA.
  • Ng, A., Jordan, M. and Weiss, Y. (2002). On spectral clustering: Analysis and an algorithm. Adv. Neural Inf. Process. Syst. 14 849–856.
  • Vempala, S. and Wang, G. (2004). A spectral algorithm for learning mixture models. J. Comput. System Sci. 68 841–860.
  • Verzelen, N. and Arias-Castro, E. (2014). Detection and feature selection in sparse mixture models. Available at arXiv:1405.1478.
  • von Luxburg, U. (2007). A tutorial on spectral clustering. Stat. Comput. 17 395–416.
  • Witten, D. M. and Tibshirani, R. (2010). A framework for feature selection in clustering. J. Amer. Statist. Assoc. 105 713–726.

See also

  • Main article: Influential features PCA for high dimensional clustering.