Bernoulli

  • Volume 21, Number 2 (2015), 1089-1133.

CLT for linear spectral statistics of normalized sample covariance matrices with the dimension much larger than the sample size

Binbin Chen and Guangming Pan

Abstract

Let $\mathbf{A}=\frac{1}{\sqrt{np}}(\mathbf{X}^{T}\mathbf{X}-p\mathbf{I}_{n})$, where $\mathbf{X}$ is a $p\times n$ matrix consisting of independent and identically distributed (i.i.d.) real random variables $X_{ij}$ with mean zero and variance one. When $p/n\to\infty$, under fourth moment conditions, a central limit theorem (CLT) for linear spectral statistics (LSS) of $\mathbf{A}$, defined by its eigenvalues, is established. We also explore its applications in testing whether a population covariance matrix is an identity matrix.
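
As a rough numerical illustration of the objects named in the abstract, the following sketch builds $\mathbf{A}$ from a simulated $p\times n$ matrix and evaluates a linear spectral statistic. It is not the authors' code. The helper names `normalized_sample_covariance` and `linear_spectral_statistic`, the Gaussian entries, the sizes $n=50$, $p=5000$, and the convention of taking the LSS as the unnormalized sum $\sum_i f(\lambda_i)$ are all assumptions made for the example; the paper may scale and center the statistic differently.

```python
import numpy as np


def normalized_sample_covariance(X):
    """Return A = (X^T X - p I_n) / sqrt(n p) for a p x n data matrix X."""
    p, n = X.shape
    return (X.T @ X - p * np.eye(n)) / np.sqrt(n * p)


def linear_spectral_statistic(A, f):
    """Sum f over the eigenvalues of the symmetric matrix A.

    The unnormalized sum is an assumed convention for this sketch.
    """
    eigenvalues = np.linalg.eigvalsh(A)  # real eigenvalues of a symmetric matrix
    return float(np.sum(f(eigenvalues)))


# Simulated data: i.i.d. standard normal entries (mean zero, variance one),
# with the dimension p much larger than the sample size n.
rng = np.random.default_rng(0)
n, p = 50, 5000
X = rng.standard_normal((p, n))

A = normalized_sample_covariance(X)
lss_identity = linear_spectral_statistic(A, lambda x: x)     # equals tr(A)
lss_square = linear_spectral_statistic(A, lambda x: x ** 2)  # equals tr(A^2)
```

For polynomial test functions the statistic can be checked without an eigendecomposition: with $f(x)=x$ it is the trace of $\mathbf{A}$, and with $f(x)=x^{2}$ it is the squared Frobenius norm of $\mathbf{A}$.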

Article information

Source
Bernoulli, Volume 21, Number 2 (2015), 1089-1133.

Dates
First available in Project Euclid: 21 April 2015

Permanent link to this document
https://projecteuclid.org/euclid.bj/1429624972

Digital Object Identifier
doi:10.3150/14-BEJ599

Mathematical Reviews number (MathSciNet)
MR3338658

Zentralblatt MATH identifier
06445969

Keywords
central limit theorem; empirical spectral distribution; hypothesis test; linear spectral statistics; sample covariance matrix

Citation

Chen, Binbin; Pan, Guangming. CLT for linear spectral statistics of normalized sample covariance matrices with the dimension much larger than the sample size. Bernoulli 21 (2015), no. 2, 1089–1133. doi:10.3150/14-BEJ599. https://projecteuclid.org/euclid.bj/1429624972

