Statistical Science

Mitigating Bias in Generalized Linear Mixed Models: The Case for Bayesian Nonparametrics

Joseph Antonelli, Lorenzo Trippa, and Sebastien Haneuse

Full-text: Open access


Generalized linear mixed models are a common statistical tool for the analysis of clustered or longitudinal data, where correlation is accounted for through cluster-specific random effects. In practice, the distribution of the random effects is typically taken to be a Normal distribution; if this assumption does not hold, the model is misspecified and standard estimation/inference may be invalid. An alternative is to perform a so-called nonparametric Bayesian analysis in which one assigns a Dirichlet process (DP) prior to the unknown distribution of the random effects. In this paper we examine operating characteristics for estimation of fixed effects and random effects based on such an analysis under a range of “true” random effects distributions. As part of this, we investigate various approaches for selection of the precision parameter of the DP prior. In addition, we illustrate the use of the methods with an analysis of post-operative complications among $n=18{,}643$ female Medicare beneficiaries who underwent a hysterectomy procedure at $N=503$ hospitals in the US. Overall, we conclude that using the DP prior in modeling the random effect distribution results in large reductions of bias with little loss of efficiency. While no single choice for the precision parameter will be optimal in all settings, certain strategies such as importance sampling or empirical Bayes can be used to obtain reasonable results in a broad range of data scenarios.
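To make the role of the DP precision parameter concrete, the following is a minimal sketch, not the authors' implementation. It draws cluster-level random effects from a DP prior through the Pólya urn (Blackwell-MacQueen) scheme, with a Normal base measure assumed purely for illustration. The function names, the base standard deviation and the seed are hypothetical; the cluster count of 503 simply echoes the number of hospitals mentioned above.

```python
import random


def polya_urn_random_effects(n_clusters, alpha, base_sd=1.0, seed=0):
    """Draw b_1, ..., b_N from a DP(alpha, N(0, base_sd^2)) prior.

    The random distribution G is integrated out, so the draws are
    generated sequentially with the Polya urn scheme.
    """
    if alpha <= 0:
        raise ValueError("alpha (the precision parameter) must be positive")
    rng = random.Random(seed)
    effects = []
    for i in range(n_clusters):
        # With probability alpha / (alpha + i), draw a fresh value from the
        # base measure; otherwise reuse the effect of a uniformly chosen
        # earlier cluster. For i = 0 the probability is 1, so the first
        # draw always comes from the base measure.
        if rng.random() < alpha / (alpha + i):
            effects.append(rng.gauss(0.0, base_sd))
        else:
            effects.append(rng.choice(effects))
    return effects


def expected_distinct(alpha, n_clusters):
    """Prior expected number of distinct random-effect values."""
    return sum(alpha / (alpha + i) for i in range(n_clusters))


if __name__ == "__main__":
    for alpha in (0.1, 1.0, 10.0, 100.0):
        draws = polya_urn_random_effects(503, alpha)
        print(alpha, len(set(draws)), round(expected_distinct(alpha, 503), 1))
```

Small values of the precision parameter concentrate the clusters on a few shared values, while large values make the draws behave almost like an independent sample from the base measure. This sensitivity is one reason the choice of the precision parameter deserves attention.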

Article information

Statist. Sci., Volume 31, Number 1 (2016), 80-95.

First available in Project Euclid: 10 February 2016

Keywords: Dirichlet process prior; generalized linear mixed models; model misspecification; random effects


Antonelli, Joseph; Trippa, Lorenzo; Haneuse, Sebastien. Mitigating Bias in Generalized Linear Mixed Models: The Case for Bayesian Nonparametrics. Statist. Sci. 31 (2016), no. 1, 80–95. doi:10.1214/15-STS533.


Supplemental materials

  • Supplement to “Mitigating bias in generalized linear mixed models: The case for Bayesian nonparametrics”. We include in the supplementary files a detailed description of both the model and prior specification for the Logistic-DP model. We also include extended simulation results that include all parameters from the model and an additional simulation that looks at a larger sample size. Finally, we include convergence diagnostics for all Bayesian models in the Medicare application.