Bayesian Analysis

Bayesian Structure Learning in Sparse Gaussian Graphical Models

A. Mohammadi and E. C. Wit

Full-text: Open access

Abstract

Decoding complex relationships among large numbers of variables with relatively few observations is one of the crucial issues in science. One approach to this problem is Gaussian graphical modeling, which describes conditional independence of variables through the presence or absence of edges in the underlying graph. In this paper, we introduce a novel and efficient Bayesian framework for Gaussian graphical model determination: a trans-dimensional Markov chain Monte Carlo (MCMC) approach based on a continuous-time birth-death process. We cover the theory and computational details of the method. It is easy to implement and computationally feasible for high-dimensional graphs. We show that our method outperforms alternative Bayesian approaches in terms of convergence, mixing in the graph space, and computing time. Unlike frequentist approaches, it gives a principled and, in practice, sensible approach to structure learning. We illustrate the efficiency of the method on a broad range of simulated data. We then apply the method to large-scale real applications from human and mammary gland gene expression studies to show its empirical usefulness. In addition, we implemented the method in the R package BDgraph, which is freely available at http://CRAN.R-project.org/package=BDgraph.
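
The link between edges and conditional independence mentioned in the abstract can be illustrated with a small numerical sketch. In a Gaussian graphical model, a missing edge between two variables corresponds to a zero entry in the precision (inverse covariance) matrix. The Python snippet below is only an illustration under that standard fact; it is not the birth-death MCMC sampler and does not use the BDgraph package. The function names `partial_correlations` and `edges_from_precision`, the tolerance `tol`, and the example matrix `K` are all hypothetical choices made for this sketch.

```python
import numpy as np

def partial_correlations(precision):
    """Partial correlations rho_ij = -k_ij / sqrt(k_ii * k_jj)."""
    d = np.sqrt(np.diag(precision))
    rho = -precision / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

def edges_from_precision(precision, tol=1e-10):
    """Edges (i, j), i < j, where the precision entry is non-zero.

    `tol` is an assumed numerical threshold for treating an entry as zero.
    """
    p = precision.shape[0]
    return [(i, j)
            for i in range(p)
            for j in range(i + 1, p)
            if abs(precision[i, j]) > tol]

# Hypothetical precision matrix for the chain graph 0 - 1 - 2.
# The zero in position (0, 2) encodes that variables 0 and 2 are
# conditionally independent given variable 1.
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

Sigma = np.linalg.inv(K)  # implied covariance matrix

print(edges_from_precision(K))     # edges of the graph
print(Sigma[0, 2])                 # marginal covariance of variables 0 and 2
print(partial_correlations(K))     # partial correlation matrix
```

In this example, variables 0 and 2 have a non-zero marginal covariance but a zero partial correlation, so the graph has no edge between them. Structure learning, in the sense of the abstract, is the problem of inferring which entries of the precision matrix are zero when the matrix is unknown and must be estimated from data.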

Article information

Source
Bayesian Anal., Volume 10, Number 1 (2015), 109-138.

Dates
First available in Project Euclid: 28 January 2015

Permanent link to this document
https://projecteuclid.org/euclid.ba/1422468425

Digital Object Identifier
doi:10.1214/14-BA889

Mathematical Reviews number (MathSciNet)
MR3420899

Zentralblatt MATH identifier
1335.62056

Keywords
Bayesian model selection; Sparse Gaussian graphical models; Non-decomposable graphs; Birth-death process; Markov chain Monte Carlo; G-Wishart

Citation

Mohammadi, A.; Wit, E. C. Bayesian Structure Learning in Sparse Gaussian Graphical Models. Bayesian Anal. 10 (2015), no. 1, 109-138. doi:10.1214/14-BA889. https://projecteuclid.org/euclid.ba/1422468425

