The Annals of Statistics

Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

Po-Ling Loh and Martin J. Wainwright

Abstract

We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory and convex analysis. These population-level results have various consequences for graph selection methods, both known and new, including a method for structure estimation in the presence of missing or corrupted observations. We provide nonasymptotic guarantees for such methods and illustrate the sharpness of these predictions via simulations.
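The population-level claim can be checked numerically for a small tree. The sketch below is illustrative only and is not the authors' code: it assumes a five-node Ising chain with a common edge weight of 0.8 (both choices arbitrary), brute-forces the exact joint distribution, and verifies that the inverse of the ordinary covariance matrix of the spins vanishes on all non-edges, as the tree case of the theory predicts.

```python
# Minimal sketch (assumed model, not from the paper): exact check that, for a
# chain-structured Ising model, the inverse covariance matrix of the spins is
# supported only on the chain's edges.
import itertools
import numpy as np

p = 5          # number of nodes in the chain 1 - 2 - ... - 5 (assumed)
theta = 0.8    # common edge potential (assumed)

# Enumerate all 2^p configurations x in {-1, +1}^p with
# P(x) proportional to exp(theta * sum over chain edges of x_s * x_t).
configs = np.array(list(itertools.product([-1.0, 1.0], repeat=p)))
log_w = theta * np.sum(configs[:, :-1] * configs[:, 1:], axis=1)
probs = np.exp(log_w) / np.exp(log_w).sum()

# Exact (population) mean and covariance of the node variables.
mean = probs @ configs
centered = configs - mean
cov = centered.T @ (centered * probs[:, None])

# Invert and inspect the support: pairs with |i - j| > 1 are non-edges of the chain.
gamma = np.linalg.inv(cov)
idx = np.arange(p)
non_edge = np.abs(np.subtract.outer(idx, idx)) > 1
print("max |Gamma_ij| over non-edges:", np.max(np.abs(gamma[non_edge])))  # ~1e-15
print("|Gamma_12| on an edge        :", abs(gamma[0, 1]))                 # clearly nonzero
```

For graphs that are not trees, the "generalized" covariance matrices of the title come into play: the analogous support statement concerns covariances of augmented sufficient statistics rather than of the raw node variables alone.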

Article information

Source
Ann. Statist., Volume 41, Number 6 (2013), 3022-3049.

Dates
First available in Project Euclid: 1 January 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1388545677

Digital Object Identifier
doi:10.1214/13-AOS1162

Mathematical Reviews number (MathSciNet)
MR3161456

Zentralblatt MATH identifier
1288.62081

Subjects
Primary: 62F12: Asymptotic properties of estimators
Secondary: 68W25: Approximation algorithms

Keywords
Graphical models; Markov random fields; model selection; inverse covariance estimation; high-dimensional statistics; exponential families; Legendre duality

Citation

Loh, Po-Ling; Wainwright, Martin J. Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses. Ann. Statist. 41 (2013), no. 6, 3022–3049. doi:10.1214/13-AOS1162. https://projecteuclid.org/euclid.aos/1388545677


References

  • [1] Agarwal, A., Negahban, S. and Wainwright, M. J. (2012). Fast global convergence of gradient methods for high-dimensional statistical recovery. Ann. Statist. 40 2452–2482.
  • [2] Anandkumar, A., Tan, V. Y. F., Huang, F. and Willsky, A. S. (2012). High-dimensional structure estimation in Ising models: Local separation criterion. Ann. Statist. 40 1346–1375.
  • [3] Banerjee, O., El Ghaoui, L. and d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Mach. Learn. Res. 9 485–516.
  • [4] Barndorff-Nielsen, O. E. (1978). Information and Exponential Families. Wiley, Chichester.
  • [5] Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. J. R. Stat. Soc. Ser. B Stat. Methodol. 36 192–236.
  • [6] Bresler, G., Mossel, E. and Sly, A. (2008). Reconstruction of Markov random fields from samples: Some observations and algorithms. In Approximation, Randomization and Combinatorial Optimization. 343–356. Springer, Berlin.
  • [7] Brown, L. D. (1986). Fundamentals of Statistical Exponential Families. IMS, Hayward, CA.
  • [8] Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_{1}$ minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106 594–607.
  • [9] Carroll, R. J., Ruppert, D. and Stefanski, L. A. (1995). Measurement Error in Nonlinear Models. Chapman & Hall, London.
  • [10] Chow, C. K. and Liu, C. N. (1968). Approximating discrete probability distributions with dependence trees. IEEE Trans. Inform. Theory 14 462–467.
  • [11] Darroch, J. N. and Speed, T. P. (1983). Additive and multiplicative models and interactions. Ann. Statist. 11 724–738.
  • [12] d’Aspremont, A., Banerjee, O. and El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30 56–66.
  • [13] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 39 1–38.
  • [14] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
  • [15] Grimmett, G. R. (1973). A theorem about random fields. Bull. Lond. Math. Soc. 5 81–84.
  • [16] Horn, R. A. and Johnson, C. R. (1990). Matrix Analysis. Cambridge Univ. Press, Cambridge.
  • [17] Ibrahim, J. G., Chen, M.-H., Lipsitz, S. R. and Herring, A. H. (2005). Missing-data methods for generalized linear models: A comparative review. J. Amer. Statist. Assoc. 100 332–346.
  • [18] Jacob, L., Obozinski, G. and Vert, J.-P. (2009). Group Lasso with overlap and graph Lasso. In International Conference on Machine Learning (ICML) 433–440. ACM, New York.
  • [19] Jalali, A., Ravikumar, P. D., Vasuki, V. and Sanghavi, S. (2011). On learning discrete graphical models using group-sparse regularization. J. Mach. Learn. Res. Proceedings Track 15 378–387.
  • [20] Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge.
  • [21] Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York.
  • [22] Lauritzen, S. L. and Spiegelhalter, D. J. (1988). Local computations with probabilities on graphical structures and their application to expert systems. J. R. Stat. Soc. Ser. B Stat. Methodol. 50 157–224.
  • [23] Liu, H., Han, F., Yuan, M., Lafferty, J. and Wasserman, L. (2012). High-dimensional semiparametric Gaussian copula graphical models. Ann. Statist. 40 2293–2326.
  • [24] Liu, H., Lafferty, J. and Wasserman, L. (2009). The nonparanormal: Semiparametric estimation of high dimensional undirected graphs. J. Mach. Learn. Res. 10 2295–2328.
  • [25] Loh, P.-L. and Wainwright, M. J. (2012). High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity. Ann. Statist. 40 1637–1664.
  • [26] Loh, P. and Wainwright, M. J. (2013). Supplement to “Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses.” DOI:10.1214/13-AOS1162SUPP.
  • [27] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436–1462.
  • [28] Newman, M. E. J. and Watts, D. J. (1999). Scaling and percolation in the small-world network model. Phys. Rev. E (3) 60 7332–7342.
  • [29] Obozinski, G., Wainwright, M. J. and Jordan, M. I. (2011). Support union recovery in high-dimensional multivariate regression. Ann. Statist. 39 1–47.
  • [30] Ravikumar, P., Wainwright, M. J. and Lafferty, J. D. (2010). High-dimensional Ising model selection using $\ell_{1}$-regularized logistic regression. Ann. Statist. 38 1287–1319.
  • [31] Ravikumar, P., Wainwright, M. J., Raskutti, G. and Yu, B. (2011). High-dimensional covariance estimation by minimizing $\ell_{1}$-penalized log-determinant divergence. Electron. J. Stat. 5 935–980.
  • [32] Rockafellar, R. T. (1970). Convex Analysis. Princeton Univ. Press, Princeton, NJ.
  • [33] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Stat. 2 494–515.
  • [34] Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley, New York.
  • [35] Santhanam, N. P. and Wainwright, M. J. (2012). Information-theoretic limits of selecting binary graphical models in high dimensions. IEEE Trans. Inform. Theory 58 4117–4134.
  • [36] Wainwright, M. J. and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1 1–305.
  • [37] Xue, L. and Zou, H. (2012). Regularized rank-based estimation of high-dimensional nonparanormal graphical models. Ann. Statist. 40 2541–2571.
  • [38] Yuan, M. (2010). High dimensional inverse covariance matrix estimation via linear programming. J. Mach. Learn. Res. 11 2261–2286.
  • [39] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94 19–35.
  • [40] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. J. Mach. Learn. Res. 7 2541–2563.

Supplemental materials

  • Supplementary material for “Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses.” Due to space constraints, we have relegated technical details of the remaining proofs to the supplement [26].