## Bernoulli

• Volume 24, Number 3 (2018), 1787-1833.

### On Gaussian comparison inequality and its application to spectral analysis of large random matrices

#### Abstract

Recently, Chernozhukov, Chetverikov, and Kato (Ann. Statist. 42 (2014) 1564–1597) developed a new Gaussian comparison inequality for approximating the suprema of empirical processes. This paper exploits this technique to devise sharp inference on spectra of large random matrices. In particular, we show that two long-standing problems in random matrix theory can be solved: (i) simple bootstrap inference on sample eigenvalues when true eigenvalues are tied; (ii) conducting the two-sample Roy’s covariance test in high dimensions. To establish the asymptotic results, we develop a generalized $\varepsilon$-net argument regarding the matrix rescaled spectral norm and several new empirical process bounds, which are of independent interest.

#### Article information

Source
Bernoulli, Volume 24, Number 3 (2018), 1787-1833.

Dates
Revised: July 2016
First available in Project Euclid: 2 February 2018

https://projecteuclid.org/euclid.bj/1517540460

Digital Object Identifier
doi:10.3150/16-BEJ912

Mathematical Reviews number (MathSciNet)
MR3757515

Zentralblatt MATH identifier
06839252

#### Citation

Han, Fang; Xu, Sheng; Zhou, Wen-Xin. On Gaussian comparison inequality and its application to spectral analysis of large random matrices. Bernoulli 24 (2018), no. 3, 1787–1833. doi:10.3150/16-BEJ912. https://projecteuclid.org/euclid.bj/1517540460

#### References

• [1] Adamczak, R. (2008). A tail inequality for suprema of unbounded empirical processes with applications to Markov chains. Electron. J. Probab. 13 1000–1034.
• [2] Anderson, T.W. (1963). Asymptotic theory for principal component analysis. Ann. Math. Stat. 34 122–148.
• [3] Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley-Interscience.
• [4] Bao, Z., Pan, G. and Zhou, W. (2015). Universality for the largest eigenvalue of sample covariance matrices with general population. Ann. Statist. 43 382–421.
• [5] Beran, R. and Srivastava, M.S. (1985). Bootstrap tests and confidence regions for functions of a covariance matrix. Ann. Statist. 13 95–115.
• [6] Berthet, Q. and Rigollet, P. (2013). Optimal detection of sparse principal components in high dimension. Ann. Statist. 41 1780–1815.
• [7] Cai, T. and Liu, W. (2011). Adaptive thresholding for sparse covariance matrix estimation. J. Amer. Statist. Assoc. 106 672–684.
• [8] Cai, T., Liu, W. and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. J. Amer. Statist. Assoc. 108 265–277.
• [9] Cai, T., Ma, Z. and Wu, Y. (2015). Optimal estimation and rank detection for sparse spiked covariance matrices. Probab. Theory Related Fields 161 781–815.
• [10] Cai, T.T. and Ma, Z. (2013). Optimal hypothesis testing for high dimensional covariance matrices. Bernoulli 19 2359–2388.
• [11] Cai, T.T., Ma, Z. and Wu, Y. (2013). Sparse PCA: Optimal rates and adaptive estimation. Ann. Statist. 41 3074–3110.
• [12] Chang, J., Zhou, W., Zhou, W.-X. and Wang, L. (2017). Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering. Biometrics 73 31–41.
• [13] Chen, S.X., Zhang, L.-X. and Zhong, P.-S. (2010). Tests for high-dimensional covariance matrices. J. Amer. Statist. Assoc. 105 810–819.
• [14] Chernozhukov, V., Chetverikov, D. and Kato, K. (2014). Gaussian approximation of suprema of empirical processes. Ann. Statist. 42 1564–1597.
• [15] Chernozhukov, V., Chetverikov, D. and Kato, K. (2015). Comparison and anti-concentration bounds for maxima of Gaussian random vectors. Probab. Theory Related Fields 162 47–70.
• [16] Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Stat. 27 642–669.
• [17] Eaton, M.L. and Tyler, D.E. (1991). On Wielandt’s inequality and its application to the asymptotic distribution of the eigenvalues of a random symmetric matrix. Ann. Statist. 19 260–271.
• [18] Fan, J. and Wang, W. (2015). Asymptotics of empirical eigen-structure for ultra-high dimensional spiked covariance model. Available at arXiv:1502.04733.
• [19] Fang, K.T., Kotz, S. and Ng, K.W. (1990). Symmetric Multivariate and Related Distributions. Monographs on Statistics and Applied Probability 36. London: Chapman & Hall.
• [20] Fujikoshi, Y. (1980). Asymptotic expansions for the distributions of the sample roots under nonnormality. Biometrika 67 45–51.
• [21] Hall, P., Härdle, W. and Simar, L. (1993). On the inconsistency of bootstrap distribution estimators. Comput. Statist. Data Anal. 16 11–18.
• [22] Hall, P., Lee, Y.K., Park, B.U. and Paul, D. (2009). Tie-respecting bootstrap methods for estimating distributions of sets and functions of eigenvalues. Bernoulli 15 380–401.
• [23] Jiang, T. (2004). The asymptotic distributions of the largest entries of sample correlation matrices. Ann. Appl. Probab. 14 865–880.
• [24] Johnstone, I.M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
• [25] Johnstone, I.M. (2008). Multivariate analysis and Jacobi ensembles: Largest eigenvalue, Tracy–Widom limits and rates of convergence. Ann. Statist. 36 2638–2716.
• [26] Johnstone, I.M. and Lu, A.Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc. 104 682–693.
• [27] Johnstone, I.M. and Nadler, B. (2017). Roy’s largest root test under rank-one alternatives. Biometrika 104 181–193.
• [28] Kendall, M. and Stuart, A. (1979). The Advanced Theory of Statistics 2, 4th ed. Oxford: Oxford Univ. Press.
• [29] Kritchman, S. and Nadler, B. (2009). Non-parametric detection of the number of signals: Hypothesis testing and random matrix theory. IEEE Trans. Signal Process. 57 3930–3941.
• [30] Li, J. and Chen, S.X. (2012). Two sample tests for high-dimensional covariance matrices. Ann. Statist. 40 908–940.
• [31] Ma, Z. (2013). Sparse principal component analysis and iterative thresholding. Ann. Statist. 41 772–801.
• [32] Mendelson, S. (2010). Empirical processes with a bounded $\psi_{1}$ diameter. Geom. Funct. Anal. 20 988–1027.
• [33] Moghaddam, B., Weiss, Y. and Avidan, S. (2006). Advances in neural information processing systems. 18. In Proceedings of the 19th Annual Conference (NIPS-19) Held December 5–10, 2005. Advances in Neural Information Processing Systems 18. Cambridge, MA: MIT Press. Edited by Yair Weiss, Bernard Schölkopf and John C. Platt, A Bradford Book.
• [34] Muirhead, R.J. (1982). Aspects of Multivariate Statistical Theory. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.
• [35] Péché, S. (2009). Universality results for the largest eigenvalues of some sample covariance matrix ensembles. Probab. Theory Related Fields 143 481–516.
• [36] Pillai, N.S. and Yin, J. (2012). Edge universality of correlation matrices. Ann. Statist. 40 1737–1763.
• [37] Roy, S.N. (1957). Some Aspects of Multivariate Analysis. New York: Wiley; Calcutta: Indian Statistical Institute.
• [38] Talagrand, M. (2014). Upper and Lower Bounds for Stochastic Processes: Modern Methods and Classical Problems. Ergebnisse der Mathematik und Ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics] 60. Heidelberg: Springer.
• [39] Tracy, C.A. and Widom, H. (1996). On orthogonal and symplectic matrix ensembles. Comm. Math. Phys. 177 727–754.
• [40] Tsybakov, A.B. (2004). Introduction à L’estimation Non-paramétrique. Mathématiques & Applications (Berlin) [Mathematics & Applications] 41. Berlin: Springer.
• [41] Tyler, D.E. (1983). The asymptotic distribution of principal component roots under local alternatives to multiple roots. Ann. Statist. 11 1232–1242.
• [42] van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. New York: Springer.
• [43] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing 210–268. Cambridge: Cambridge Univ. Press.
• [44] Waternaux, C.M. (1976). Asymptotic distribution of the sample roots for a nonnormal population. Biometrika 63 639–645.
• [45] Yuan, X.-T. and Zhang, T. (2013). Truncated power method for sparse eigenvalue problems. J. Mach. Learn. Res. 14 899–925.