The Annals of Statistics

Asymptotic normality and optimalities in estimation of large Gaussian graphical models

Zhao Ren, Tingni Sun, Cun-Hui Zhang, and Harrison H. Zhou

Full-text: Open access

Abstract

The Gaussian graphical model, a popular paradigm for studying relationships among variables in a wide range of applications, has attracted great attention in recent years. This paper considers a fundamental question: When is it possible to estimate low-dimensional parameters at the parametric square-root rate in a large Gaussian graphical model? A novel regression approach is proposed to obtain asymptotically efficient estimation of each entry of a precision matrix under a sparseness condition relative to the sample size. When the precision matrix is not sufficiently sparse, or equivalently the sample size is not sufficiently large, a lower bound is established to show that it is no longer possible to achieve the parametric rate in the estimation of each entry. This lower bound result, which provides an answer to the delicate sample size question, is established with a novel construction of a subset of sparse precision matrices in an application of Le Cam’s lemma. Moreover, the proposed estimator is shown to attain the optimal rate of convergence when the parametric rate cannot be achieved, under a minimal sample size requirement.
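The abstract does not spell out the form of the estimator, but the following minimal sketch illustrates one natural regression-based scheme consistent with its description: regress a pair of coordinates on the remaining ones with an $\ell_{1}$-penalized fit and invert the $2\times 2$ empirical covariance of the residuals to estimate the corresponding block of the precision matrix. The function name, the use of scikit-learn's Lasso as a stand-in for the scaled lasso, and the penalty level are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
from sklearn.linear_model import Lasso  # stand-in for the scaled lasso

def precision_entry_estimate(X, i, j, lam=None):
    """Sketch of a pairwise-regression estimate of the (i, j) precision entry.

    Columns i and j of X (assumed centered) are regressed on all remaining
    columns with an l1-penalized least-squares fit; the inverse of the 2x2
    empirical covariance of the residuals then estimates Omega[[i, j], [i, j]].
    Hypothetical illustration only, not the paper's exact estimator.
    """
    n, p = X.shape
    pair = [i, j]
    rest = [k for k in range(p) if k not in pair]
    if lam is None:
        lam = np.sqrt(2.0 * np.log(p) / n)  # a common theoretical penalty level (assumption)
    resid = np.empty((n, 2))
    for col, a in enumerate(pair):
        fit = Lasso(alpha=lam, fit_intercept=False).fit(X[:, rest], X[:, a])
        resid[:, col] = X[:, a] - X[:, rest] @ fit.coef_
    S_pair = resid.T @ resid / n   # 2x2 residual covariance
    return np.linalg.inv(S_pair)   # its (0, 1) entry estimates Omega_{ij}
```

Under the sparseness condition described above, the paper's entrywise estimator is asymptotically normal and efficient; the sketch only conveys the general regress-and-invert idea behind such a construction.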

The proposed estimator is applied to test the presence of an edge in the Gaussian graphical model or to recover the support of the entire model, to obtain adaptive rate-optimal estimation of the entire precision matrix as measured by the matrix $\ell_{q}$ operator norm, and to make inference about latent variables in the graphical model. All of this is achieved under a sparsity condition on the precision matrix and a side condition on the range of its spectrum. This significantly relaxes the commonly imposed uniform signal strength condition on the precision matrix, the irrepresentability condition on the Hessian tensor operator of the covariance matrix, or the $\ell_{1}$ constraint on the precision matrix. Numerical results confirm our theoretical findings. For support recovery, the ROC curve of the proposed algorithm, Asymptotic Normal Thresholding (ANT), significantly outperforms that of the popular GLasso algorithm.
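To make concrete how an asymptotically normal entrywise estimate yields an edge test and a support-recovery rule, here is a minimal thresholding sketch: each off-diagonal entry is standardized by a plug-in estimate of its asymptotic standard error and compared with a normal quantile. The plug-in variance formula, the Bonferroni-style cutoff, and the function name are assumptions made for illustration and need not coincide with the ANT procedure as defined in the paper.

```python
import numpy as np
from scipy.stats import norm

def threshold_support(Omega_hat, n, alpha=0.05):
    """Sketch of normal-quantile thresholding for edge/support recovery.

    Standardizes each off-diagonal entry of an entrywise precision-matrix
    estimate by an assumed plug-in asymptotic standard error and keeps the
    entries exceeding a Bonferroni-adjusted normal quantile.  Hypothetical
    illustration, not the paper's exact ANT rule.
    """
    p = Omega_hat.shape[0]
    # assumed plug-in asymptotic variance of each entry estimate
    var = (np.outer(np.diag(Omega_hat), np.diag(Omega_hat)) + Omega_hat ** 2) / n
    z = Omega_hat / np.sqrt(var)
    # one cutoff for all p(p-1)/2 off-diagonal entries (Bonferroni-style, assumption)
    cutoff = norm.ppf(1.0 - alpha / (p * (p - 1)))
    support = np.abs(z) > cutoff
    np.fill_diagonal(support, True)   # diagonal entries of a precision matrix are nonzero
    return support                    # support[i, j] flags an estimated edge between i and j
```

A single entry's standardized value can likewise be compared with the usual two-sided normal quantile to test the presence of one particular edge.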

Article information

Source
Ann. Statist., Volume 43, Number 3 (2015), 991-1026.

Dates
Received: August 2013
Revised: October 2014
First available in Project Euclid: 15 May 2015

Permanent link to this document
https://projecteuclid.org/euclid.aos/1431695636

Digital Object Identifier
doi:10.1214/14-AOS1286

Mathematical Reviews number (MathSciNet)
MR3346695

Zentralblatt MATH identifier
1328.62342

Subjects
Primary: 62H12: Estimation
Secondary: 62F12: Asymptotic properties of estimators; 62G09: Resampling methods

Keywords
Asymptotic efficiency; covariance matrix; inference; graphical model; latent graphical model; minimax lower bound; optimal rate of convergence; scaled lasso; precision matrix; sparsity; spectral norm

Citation

Ren, Zhao; Sun, Tingni; Zhang, Cun-Hui; Zhou, Harrison H. Asymptotic normality and optimalities in estimation of large Gaussian graphical models. Ann. Statist. 43 (2015), no. 3, 991--1026. doi:10.1214/14-AOS1286. https://projecteuclid.org/euclid.aos/1431695636


References

  • Antoniadis, A. (2010). Comment: $\ell_{1}$-penalization for mixture regression models [MR2677722]. TEST 19 257–258.
  • Belloni, A., Chernozhukov, V. and Hansen, C. (2014). Inference on treatment effects after selection among high-dimensional controls. Rev. Econ. Stud. 81 608–650.
  • Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98 791–806.
  • Bickel, P. J. and Levina, E. (2008a). Regularized estimation of large covariance matrices. Ann. Statist. 36 199–227.
  • Bickel, P. J. and Levina, E. (2008b). Covariance regularization by thresholding. Ann. Statist. 36 2577–2604.
  • Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
  • Bühlmann, P. (2013). Statistical significance in high-dimensional linear models. Bernoulli 19 1212–1242.
  • Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_{1}$ minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106 594–607.
  • Cai, T. T., Liu, W. and Zhou, H. H. (2012). Estimating sparse precision matrix: Optimal rates of convergence and adaptive estimation. Preprint. Available at arXiv:1212.2882.
  • Cai, T. T., Zhang, C.-H. and Zhou, H. H. (2010). Optimal rates of convergence for covariance matrix estimation. Ann. Statist. 38 2118–2144.
  • Cai, T. T. and Zhou, H. H. (2012). Optimal rates of convergence for sparse covariance matrix estimation. Ann. Statist. 40 2389–2420.
  • Candès, E. J. and Recht, B. (2009). Exact matrix completion via convex optimization. Found. Comput. Math. 9 717–772.
  • Chandrasekaran, V., Parrilo, P. A. and Willsky, A. S. (2012). Latent variable graphical model selection via convex optimization. Ann. Statist. 40 1935–1967.
  • d’Aspremont, A., Banerjee, O. and El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30 56–66.
  • El Karoui, N. (2008). Operator norm consistent estimation of large-dimensional sparse covariance matrices. Ann. Statist. 36 2717–2756.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
  • Horn, R. A. and Johnson, C. R. (1990). Matrix Analysis. Cambridge Univ. Press, Cambridge.
  • Javanmard, A. and Montanari, A. (2014). Hypothesis testing in high-dimensional regression under the Gaussian random design model: Asymptotic theory. IEEE Trans. Inform. Theory 60 6522–6554.
  • Koltchinskii, V. (2009). The Dantzig selector and sparsity oracle inequalities. Bernoulli 15 799–828.
  • Lam, C. and Fan, J. (2009). Sparsistency and rates of convergence in large covariance matrix estimation. Ann. Statist. 37 4254–4278.
  • Lauritzen, S. L. (1996). Graphical Models. Oxford Statistical Science Series 17. Oxford Univ. Press, New York.
  • Le Cam, L. (1973). Convergence of estimates under dimensionality restrictions. Ann. Statist. 1 38–53.
  • Liu, W. (2013). Gaussian graphical model estimation with false discovery rate control. Ann. Statist. 41 2948–2978.
  • Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436–1462.
  • Pang, H., Liu, H. and Vanderbei, R. (2014). The FASTCLIME package for linear programming and large-scale precision matrix estimation in R. J. Mach. Learn. Res. 15 489–493.
  • Raskutti, G., Wainwright, M. J. and Yu, B. (2010). Restricted eigenvalue properties for correlated Gaussian designs. J. Mach. Learn. Res. 11 2241–2259.
  • Ravikumar, P., Wainwright, M. J., Raskutti, G. and Yu, B. (2011). High-dimensional covariance estimation by minimizing $\ell_{1}$-penalized log-determinant divergence. Electron. J. Stat. 5 935–980.
  • Ren, Z. and Zhou, H. H. (2012). Discussion: Latent variable graphical model selection via convex optimization [MR3059067]. Ann. Statist. 40 1989–1996.
  • Ren, Z., Sun, T., Zhang, C.-H. and Zhou, H. H. (2015). Supplement to “Asymptotic normality and optimalities in estimation of large Gaussian graphical models.” DOI:10.1214/14-AOS1286SUPP.
  • Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Stat. 2 494–515.
  • Städler, N., Bühlmann, P. and van de Geer, S. (2010). $\ell_{1}$-penalization for mixture regression models. TEST 19 209–256.
  • Sun, T. and Zhang, C.-H. (2010). Comment: $\ell_{1}$-penalization for mixture regression models [MR2677722]. TEST 19 270–275.
  • Sun, T. and Zhang, C.-H. (2012a). Scaled sparse linear regression. Biometrika 99 879–898.
  • Sun, T. and Zhang, C.-H. (2012b). Comment: “Minimax estimation of large covariance matrices under $\ell_{1}$-norm” [MR3027084]. Statist. Sinica 22 1354–1358.
  • Sun, T. and Zhang, C.-H. (2013). Sparse matrix inversion with scaled lasso. J. Mach. Learn. Res. 14 3385–3418.
  • Thorin, G. O. (1948). Convexity theorems generalizing those of M. Riesz and Hadamard with some applications. Comm. Sem. Math. Univ. Lund [Medd. Lunds Univ. Mat. Sem.] 9 1–58.
  • van de Geer, S. A. and Bühlmann, P. (2009). On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3 1360–1392.
  • van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166–1202.
  • Ye, F. and Zhang, C.-H. (2010). Rate minimaxity of the Lasso and Dantzig selector for the $\ell_{q}$ loss in $\ell_{r}$ balls. J. Mach. Learn. Res. 11 3519–3540.
  • Yu, B. (1997). Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam 423–435. Springer, New York.
  • Yuan, M. (2010). High dimensional inverse covariance matrix estimation via linear programming. J. Mach. Learn. Res. 11 2261–2286.
  • Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94 19–35.
  • Zhang, T. (2009). Some sharp performance bounds for least squares regression with $L_{1}$ regularization. Ann. Statist. 37 2109–2144.
  • Zhang, C.-H. (2011). Statistical inference for high-dimensional data. In Mathematisches Forschungsinstitut Oberwolfach: Very High Dimensional Semiparametric Models. Report No. 48/2011 28–31.
  • Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression. Ann. Statist. 36 1567–1594.
  • Zhang, C.-H. and Zhang, T. (2012). A general theory of concave regularization for high-dimensional sparse estimation problems. Statist. Sci. 27 576–593.
  • Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc. Ser. B Stat. Methodol. 76 217–242.

Supplemental materials

  • Supplement to “Asymptotic normality and optimalities in estimation of large Gaussian graphical models”. In this supplement we collect the proofs of Theorems 1–3 in Section 2, the proofs of Theorems 6 and 8 in Section 3, and the proofs of Theorems 10 and 11 as well as Proposition 1 in Section 4.