• Bernoulli
  • Volume 20, Number 3 (2014), 1029-1058.

High-dimensional covariance matrix estimation with missing observations

Karim Lounici



In this paper, we study the problem of high-dimensional covariance matrix estimation with missing observations. We propose a simple procedure that is computationally tractable in high dimensions and does not require imputation of the missing data. We establish non-asymptotic sparsity oracle inequalities for the estimation of the covariance matrix in the Frobenius and spectral norms. These inequalities are valid for any sample size, any probability of a missing observation, and any dimension of the covariance matrix. We further establish minimax lower bounds showing that our rates are minimax optimal up to a logarithmic factor.
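The abstract does not spell out the procedure, so the following is only a minimal sketch of one imputation-free approach consistent with it. It assumes zero-mean data whose entries are each observed independently with a known probability `delta`; the function names, `mask`, `delta`, and the tuning parameter `lam` are illustrative assumptions, not notation taken from the abstract. The first step rescales the empirical covariance of the zero-filled data to remove the bias caused by missing entries. The second soft-thresholds its eigenvalues, which is the closed-form solution of a nuclear-norm-penalized least-squares fit over positive semidefinite matrices.

```python
import numpy as np


def debiased_covariance(Y, mask, delta):
    """Bias-corrected empirical covariance from partially observed rows.

    Y     : (n, p) array of zero-mean data (values at unobserved entries are ignored)
    mask  : (n, p) boolean array, True where the entry was observed
    delta : assumed known probability that an entry is observed (0 < delta <= 1)
    """
    n = Y.shape[0]
    Z = np.where(mask, Y, 0.0)      # zero-fill the unobserved entries, no imputation
    S = Z.T @ Z / n                 # empirical covariance of the zero-filled data
    # Under independent masking, off-diagonal entries of S are shrunk by delta**2
    # in expectation and diagonal entries by delta, so rescale each accordingly.
    corrected = S / delta**2
    np.fill_diagonal(corrected, np.diag(S) / delta)
    return corrected


def soft_threshold_eigenvalues(S, lam):
    """Minimize ||S - A||_F^2 + lam * ||A||_* over positive semidefinite A.

    The minimizer keeps the eigenvectors of S and replaces each eigenvalue w
    by max(w - lam / 2, 0).
    """
    w, V = np.linalg.eigh(S)
    w = np.maximum(w - lam / 2.0, 0.0)
    return (V * w) @ V.T
```

A larger `lam` zeroes out more eigenvalues and therefore returns a lower-rank estimate; how to choose it in practice is not addressed by this sketch.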

Article information

Bernoulli, Volume 20, Number 3 (2014), 1029-1058.

First available in Project Euclid: 11 June 2014

Digital Object Identifier

doi:10.3150/12-BEJ487

Keywords: covariance matrix; Lasso; low-rank matrix estimation; missing observations; non-commutative Bernstein inequality; optimal rate of convergence


Lounici, Karim. High-dimensional covariance matrix estimation with missing observations. Bernoulli 20 (2014), no. 3, 1029-1058. doi:10.3150/12-BEJ487.


