Statistical Science

Variational Inference for Generalized Linear Mixed Models Using Partially Noncentered Parametrizations

Linda S. L. Tan and David J. Nott

Full-text: Open access

Abstract

The effects of different parametrizations on the convergence of Bayesian computational algorithms for hierarchical models are well explored. Techniques such as centering, noncentering and partial noncentering can be used to accelerate convergence in MCMC and EM algorithms but are still not well studied for variational Bayes (VB) methods. As a fast deterministic approach to posterior approximation, VB is attracting increasing interest due to its suitability for large, high-dimensional data. Use of different parametrizations for VB has not only computational but also statistical implications, as different parametrizations are associated with different factorized posterior approximations. We examine the use of partially noncentered parametrizations in VB for generalized linear mixed models (GLMMs). Our paper makes four contributions. First, we show how to implement an algorithm called nonconjugate variational message passing for GLMMs. Second, we show that the partially noncentered parametrization can adapt to the quantity of information in the data and determine a parametrization close to optimal. Third, we show that partial noncentering can accelerate convergence and produce more accurate posterior approximations than centering or noncentering. Finally, we demonstrate how the variational lower bound, produced as part of the computation, can be useful for model selection.
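The interpolation between centering and noncentering that the abstract describes can be illustrated on the simplest conjugate case. The sketch below assumes a Gaussian random-intercept model rather than the paper's GLMMs, and the function name and weight formula are illustrative of the general idea, not the paper's nonconjugate variational message passing algorithm:

```python
# A minimal sketch of partial noncentering for a Gaussian
# random-intercept model:
#   y_ij | b_i ~ N(b_i, sigma2),   b_i ~ N(mu, tau2),   j = 1, ..., n_i.
# The centered parametrization works with b_i directly; the noncentered
# one works with bt_i = b_i - mu. Partial noncentering interpolates via
# a weight w_i in [0, 1]:
#   bt_i = b_i - w_i * mu   <=>   b_i = w_i * mu + bt_i.
# In this conjugate case the weight
#   w_i = tau2 / (tau2 + sigma2 / n_i)
# decorrelates mu and bt_i a posteriori, so the parametrization adapts
# per cluster: w_i -> 1 (centering) as the data in cluster i become
# informative, and w_i -> 0 (noncentering) as they become weak.

def partial_noncentering_weight(sigma2: float, tau2: float, n_i: int) -> float:
    """Data-adaptive weight tilting cluster i between centering (w_i = 1)
    and noncentering (w_i = 0)."""
    return tau2 / (tau2 + sigma2 / n_i)

if __name__ == "__main__":
    sigma2, tau2 = 1.0, 1.0
    for n_i in (1, 5, 100):
        w = partial_noncentering_weight(sigma2, tau2, n_i)
        print(f"n_i = {n_i:3d}  ->  w_i = {w:.3f}")
```

Because the weight is computed per cluster from the variance components and the cluster size, a single model can mix nearly centered and nearly noncentered clusters, which is what allows the parametrization to adapt to the quantity of information in the data.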

Article information

Source
Statist. Sci., Volume 28, Number 2 (2013), 168–188.

Dates
First available in Project Euclid: 21 May 2013

Permanent link to this document
https://projecteuclid.org/euclid.ss/1369147910

Digital Object Identifier
doi:10.1214/13-STS418

Mathematical Reviews number (MathSciNet)
MR3112404

Zentralblatt MATH identifier
1331.62167

Keywords
Variational Bayes; hierarchical centering; variational message passing; nonconjugate models; longitudinal data analysis

Citation

Tan, Linda S. L.; Nott, David J. Variational Inference for Generalized Linear Mixed Models Using Partially Noncentered Parametrizations. Statist. Sci. 28 (2013), no. 2, 168–188. doi:10.1214/13-STS418. https://projecteuclid.org/euclid.ss/1369147910


References

  • Attias, H. (1999). Inferring parameters and structure of latent variable models by variational Bayes. In Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence 21–30. Morgan Kaufmann, San Francisco, CA.
  • Attias, H. (2000). A variational Bayesian framework for graphical models. In Advances in Neural Information Processing Systems 12 209–215. MIT Press, Cambridge, MA.
  • Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer, New York.
  • Blocker, A. W. (2011). Fast Rcpp implementation of Gauss–Hermite quadrature. R package “fastGHQuad” version 0.1-1. Available at http://cran.r-project.org/.
  • Braun, M. and McAuliffe, J. (2010). Variational inference for large-scale models of discrete choice. J. Amer. Statist. Assoc. 105 324–335.
  • Breslow, N. E. and Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. J. Amer. Statist. Assoc. 88 9–25.
  • Brown, P. and Zhou, L. (2010). MCMC for generalized linear mixed models with glmmBUGS. The R Journal 2 13–16.
  • Browne, W. J. and Draper, D. (2006). A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian Anal. 1 473–513 (electronic).
  • Cai, B. and Dunson, D. B. (2008). Bayesian variable selection in generalized linear mixed models. In Random Effect and Latent Variable Model Selection. Lecture Notes in Statistics 192 63–91. Springer, New York.
  • Christensen, O. F., Roberts, G. O. and Sköld, M. (2006). Robust Markov chain Monte Carlo methods for spatial generalized linear mixed models. J. Comput. Graph. Statist. 15 1–17.
  • Corduneanu, A. and Bishop, C. M. (2001). Variational Bayesian model selection for mixture distributions. In Artificial Intelligence and Statistics 27–34. Morgan Kaufmann, San Francisco, CA.
  • De Backer, M., De Vroey, C., Lesaffre, E., Scheys, I. and De Keyser, P. (1998). Twelve weeks of continuous oral therapy for toenail onychomycosis caused by dermatophytes: A double-blind comparative trial of terbinafine 250 mg/day versus itraconazole 200 mg/day. Journal of the American Academy of Dermatology 38 57–63.
  • Fitzmaurice, G. and Laird, N. (1993). A likelihood-based method for analysing longitudinal binary responses. Biometrika 80 141–151.
  • Fong, Y., Rue, H. and Wakefield, J. (2010). Bayesian inference for generalised linear mixed models. Biostatistics 11 397–412.
  • Gelfand, A. E., Sahu, S. K. and Carlin, B. P. (1995). Efficient parameterisations for normal linear mixed models. Biometrika 82 479–488.
  • Gelfand, A. E., Sahu, S. K. and Carlin, B. P. (1996). Efficient parametrizations for generalized linear mixed models. In Bayesian Statistics 5 (Alicante, 1994) 165–180. Oxford Univ. Press, New York.
  • Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2004). Bayesian Data Analysis, 2nd ed. Chapman & Hall/CRC, Boca Raton, FL.
  • Ghahramani, Z. and Beal, M. J. (2001). Propagation algorithms for variational Bayesian learning. In Advances in Neural Information Processing Systems 13 507–513. MIT Press, Cambridge, MA.
  • Hoffman, M. D., Blei, D. M., Wang, C. and Paisley, J. (2012). Stochastic variational inference. Available at arXiv:1206.7051.
  • Jaakkola, T. S. and Jordan, M. I. (2000). Bayesian parameter estimation via variational methods. Statist. Comput. 10 25–37.
  • Kass, R. E. and Natarajan, R. (2006). A default conjugate prior for variance components in generalized linear mixed models (comment on article by Browne and Draper). Bayesian Anal. 1 535–542 (electronic).
  • Knowles, D. A. and Minka, T. P. (2011). Non-conjugate variational message passing for multinomial and binary regression. In Advances in Neural Information Processing Systems 24 1701–1709. Available at http://books.nips.cc/papers/files/nips24/NIPS2011_0962.pdf.
  • Liu, Q. and Pierce, D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika 81 624–629.
  • Liu, J. S. and Wu, Y. N. (1999). Parameter expansion for data augmentation. J. Amer. Statist. Assoc. 94 1264–1274.
  • Lunn, D. J., Thomas, A., Best, N. and Spiegelhalter, D. (2000). WinBUGS—A Bayesian modelling framework: Concepts, structure, and extensibility. Statist. Comput. 10 325–337.
  • Magnus, J. R. and Neudecker, H. (1988). Matrix Differential Calculus with Applications in Statistics and Econometrics. Wiley, Chichester.
  • Meng, X.-L. and van Dyk, D. (1997). The EM algorithm—An old folk-song sung to a fast new tune (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 59 511–567.
  • Meng, X.-L. and van Dyk, D. A. (1999). Seeking efficient data augmentation schemes via conditional and marginal augmentation. Biometrika 86 301–320.
  • O’Hagan, A. and Forster, J. (2004). Kendall’s Advanced Theory of Statistics V. 2B: Bayesian Inference, 2nd ed. Arnold, London.
  • Ormerod, J. T. and Wand, M. P. (2010). Explaining variational approximations. Amer. Statist. 64 140–153.
  • Ormerod, J. T. and Wand, M. P. (2012). Gaussian variational approximate inference for generalized linear mixed models. J. Comput. Graph. Statist. 21 2–17.
  • Overstall, A. M. and Forster, J. J. (2010). Default Bayesian model determination methods for generalised linear mixed models. Comput. Statist. Data Anal. 54 3269–3288.
  • Papaspiliopoulos, O., Roberts, G. O. and Sköld, M. (2003). Non-centered parameterizations for hierarchical models and data augmentation. In Bayesian Statistics 7 (Tenerife, 2002) 307–326. Oxford Univ. Press, New York.
  • Papaspiliopoulos, O., Roberts, G. O. and Sköld, M. (2007). A general framework for the parametrization of hierarchical models. Statist. Sci. 22 59–73.
  • Qi, Y. and Jaakkola, T. S. (2006). Parameter expanded variational Bayesian methods. In Advances in Neural Information Processing Systems 19 1097–1104. MIT Press, Cambridge, MA.
  • Raudenbush, S. W., Yang, M.-L. and Yosef, M. (2000). Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. J. Comput. Graph. Statist. 9 141–157.
  • Rijmen, F. and Vomlel, J. (2008). Assessing the performance of variational methods for mixed logistic regression models. J. Stat. Comput. Simul. 78 765–779.
  • Roos, M. and Held, L. (2011). Sensitivity analysis in Bayesian generalized linear mixed models for binary data. Bayesian Anal. 6 259–278.
  • Roulin, A. and Bersier, L. F. (2007). Nestling barn owls beg more intensely in the presence of their mother than in the presence of their father. Animal Behaviour 74 1099–1106.
  • Saul, L. K. and Jordan, M. I. (1998). A mean field learning algorithm for unsupervised neural networks. In Learning in Graphical Models 541–554. Kluwer Academic, Dordrecht.
  • Sturtz, S., Ligges, U. and Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software 12 1–16.
  • Tan, S. L. and Nott, D. J. (2013). Variational approximation for mixtures of linear mixed models. J. Comput. Graph. Statist. To appear. DOI:10.1080/10618600.2012.761138.
  • Thall, P. F. and Vail, S. C. (1990). Some covariance models for longitudinal count data with overdispersion. Biometrics 46 657–671.
  • Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, 4th ed. Springer, New York.
  • Wand, M. P. (2013). Fully simplified multivariate normal updates in non-conjugate variational message passing. Unpublished manuscript. Available at http://www.uow.edu.au/~mwand/fsupap.pdf.
  • Winn, J. and Bishop, C. M. (2005). Variational message passing. J. Mach. Learn. Res. 6 661–694.
  • Yu, Y. and Meng, X.-L. (2011). To center or not to center: That is not the question—An ancillarity–sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency. J. Comput. Graph. Statist. 20 531–570.
  • Yu, D. and Yau, K. K. W. (2012). Conditional Akaike information criterion for generalized linear mixed models. Comput. Statist. Data Anal. 56 629–644.
  • Zhao, Y., Staudenmayer, J., Coull, B. A. and Wand, M. P. (2006). General design Bayesian generalized linear mixed models. Statist. Sci. 21 35–51.
  • Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A. and Smith, G. M. (2009). Mixed Effects Models and Extensions in Ecology with R. Springer, New York.