Open Access
Variable selection methods for model-based clustering
Michael Fop, Thomas Brendan Murphy
Statist. Surv. 12: 18-65 (2018). DOI: 10.1214/18-SS119


Model-based clustering is a popular approach for clustering multivariate data and has been applied in numerous fields. As high-dimensional data have become increasingly common, the model-based clustering approach has been adapted to cope with the increasing dimensionality. In particular, the development of variable selection techniques has received considerable attention and research effort in recent years. Even for small problems, variable selection has been advocated to facilitate the interpretation of the clustering results. This review summarizes the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are identified and illustrated on two data analysis examples.



Received: 1 July 2017; Published: 2018
First available in Project Euclid: 26 April 2018

zbMATH: 06875306
MathSciNet: MR3794323
Digital Object Identifier: 10.1214/18-SS119

Keywords: Gaussian mixture model, latent class analysis, model-based clustering, R packages, variable selection

Vol.12 • 2018