Bayesian Analysis

Scaling It Up: Stochastic Search Structure Learning in Graphical Models

Hao Wang

Gaussian concentration graph models and covariance graph models are two classes of graphical models that are useful for uncovering latent dependence structures among multivariate variables. In the Bayesian literature, graphs are often determined through the use of priors over the space of positive definite matrices with fixed zeros, but these methods present daunting computational burdens in large problems. Motivated by the superior computational efficiency of continuous shrinkage priors for regression analysis, we propose a new framework for structure learning that is based on continuous spike and slab priors and uses latent variables to identify graphs. We discuss model specification, computation, and inference for both concentration and covariance graph models. The new approach produces reliable estimates of graphs and efficiently handles problems with hundreds of variables.
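The abstract's central device is a continuous spike-and-slab mixture prior on off-diagonal precision (or covariance) entries, with a latent binary indicator per edge. The following is a minimal sketch of that edge-indicator update only, not the paper's full block Gibbs sampler; the values `v0`, `v1`, and the prior inclusion probability `pi` are illustrative defaults chosen here, not taken from the paper.

```python
import math

def normal_pdf(x, sd):
    # Density of N(0, sd^2) evaluated at x.
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def edge_inclusion_prob(omega_ij, v0=0.02, v1=1.0, pi=0.5):
    """Conditional probability that the latent edge indicator z_ij = 1,
    given the current off-diagonal entry omega_ij, under a continuous
    spike N(0, v0^2) / slab N(0, v1^2) mixture prior.
    (Illustrative sketch; hyperparameter values are assumptions.)"""
    slab = pi * normal_pdf(omega_ij, v1)        # edge present
    spike = (1.0 - pi) * normal_pdf(omega_ij, v0)  # edge absent
    return slab / (slab + spike)
```

Intuitively, entries near zero are better explained by the narrow spike, so the indicator (and hence the edge) is switched off with high probability, while entries of appreciable magnitude favor the slab; iterating this update jointly with the matrix entries is what lets the sampler search graph structures without reversible-jump moves.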

Article information

Bayesian Anal., Volume 10, Number 2 (2015), 351-377.

First available in Project Euclid: 2 February 2015

Keywords: Bayesian inference; bi-directed graph; block Gibbs; concentration graph models; covariance graph models; credit default swap; undirected graph; structural learning


Wang, Hao. Scaling It Up: Stochastic Search Structure Learning in Graphical Models. Bayesian Anal. 10 (2015), no. 2, 351--377. doi:10.1214/14-BA916.
