Brazilian Journal of Probability and Statistics

Mixture models applied to heterogeneous populations

Carolina V. Cavalcante and Kelly C. M. Gonçalves

Abstract

Mixture models provide a flexible representation of heterogeneity through a finite number of latent classes. From the Bayesian point of view, Markov Chain Monte Carlo methods provide a way to draw inferences from these models. In particular, when the number of subpopulations is treated as unknown, more sophisticated methods are required to perform Bayesian analysis. Reversible Jump Markov Chain Monte Carlo is an alternative method for computing the posterior distribution by simulation in this case. Some problems arise frequently in the Bayesian analysis of this class of models, such as the so-called “label-switching” problem. However, as the level of heterogeneity in the population increases, these problems are expected to become less frequent and the model’s performance is expected to improve. Thus, the aim of this work is to evaluate the fit of the normal mixture model using simulated data under different settings of heterogeneity and of prior information about the mixture proportions. A simulation study is also presented to evaluate the model’s performance both when the number of components is known and when it is estimated. Finally, the model is applied to a censored real dataset containing antibody levels of Cytomegalovirus in individuals.
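
As a minimal illustration of the label-switching problem mentioned in the abstract, the sketch below simulates data from a two-component normal mixture and shows that the likelihood does not change when the component labels are permuted. It is not the authors' code: the function names, the sample size, and the parameter values (weights 0.3 and 0.7, means 0 and 4, unit standard deviations) are hypothetical choices made only for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, weights, means, sds):
    """Log-likelihood of a finite normal mixture evaluated on the data."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s)
                      for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total


def simulate_mixture(n, weights, means, sds, seed=0):
    """Draw n observations: pick a latent class, then sample from its normal."""
    rng = random.Random(seed)
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[k], sds[k]) for k in labels]


# Illustrative (assumed) setting: two well-separated subpopulations.
data = simulate_mixture(200, [0.3, 0.7], [0.0, 4.0], [1.0, 1.0])

# The same model written with the two component labels swapped.
ll_original = mixture_log_likelihood(data, [0.3, 0.7], [0.0, 4.0], [1.0, 1.0])
ll_swapped = mixture_log_likelihood(data, [0.7, 0.3], [4.0, 0.0], [1.0, 1.0])

# Both labelings give the same likelihood, so the data alone cannot
# tell the labelings apart.
print(abs(ll_original - ll_swapped) < 1e-9)
```

Because both labelings have the same likelihood, a prior that treats the components symmetrically leads to a posterior with one mode per permutation of the labels, which is why posterior summaries of individual components can be misleading when a sampler moves between those modes.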

Article information

Source
Braz. J. Probab. Stat., Volume 32, Number 2 (2018), 320-345.

Dates
Received: October 2015
Accepted: November 2016
First available in Project Euclid: 17 April 2018

Permanent link to this document
https://projecteuclid.org/euclid.bjps/1523952018

Digital Object Identifier
doi:10.1214/16-BJPS345

Mathematical Reviews number (MathSciNet)
MR3787757

Zentralblatt MATH identifier
06914678

Keywords
Identifiability; sensitivity analysis; subpopulations; frequentist properties; NHANES

Citation

Cavalcante, Carolina V.; Gonçalves, Kelly C. M. Mixture models applied to heterogeneous populations. Braz. J. Probab. Stat. 32 (2018), no. 2, 320–345. doi:10.1214/16-BJPS345. https://projecteuclid.org/euclid.bjps/1523952018

