Electronic Journal of Statistics

On strong identifiability and convergence rates of parameter estimation in finite mixtures

Nhat Ho and XuanLong Nguyen

Full-text: Open access

Abstract

This paper studies identifiability and convergence behaviors for parameters of multiple types, including matrix-variate ones, that arise in finite mixtures, and the effects of model fitting with extra mixing components. We consider several notions of strong identifiability in a matrix-variate setting, and use them to establish sharp inequalities relating the distance of mixture densities to the Wasserstein distances of the corresponding mixing measures. Characterization of identifiability is given for a broad range of mixture models commonly employed in practice, including location-covariance mixtures and location-covariance-shape mixtures, for mixtures of symmetric densities, as well as some asymmetric ones. Minimax lower bounds and rates of convergence for the maximum likelihood estimates are established for such classes, which are also confirmed by simulation studies.
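The mixing measures discussed in the abstract are discrete probability measures over the parameter space, and the Wasserstein distance between them is the loss under which convergence rates are stated. As a toy illustration (not code from the paper, and restricted to scalar parameters rather than the matrix-variate ones the paper treats), the first-order Wasserstein distance between two discrete measures on the real line can be computed from the CDF formula $W_1(P,Q) = \int |F_P(x) - F_Q(x)|\,dx$; the function name `wasserstein1` is illustrative:

```python
def wasserstein1(atoms_p, wts_p, atoms_q, wts_q):
    """W1 distance between two discrete measures on the real line.

    Uses W1(P, Q) = integral of |F_P(x) - F_Q(x)| dx, where the CDF
    difference is piecewise constant between consecutive atoms.
    """
    # Merge the atoms, carrying +weight for P and -weight for Q.
    events = sorted(
        [(a, w) for a, w in zip(atoms_p, wts_p)] +
        [(a, -w) for a, w in zip(atoms_q, wts_q)]
    )
    dist = 0.0
    cdf_diff = 0.0          # running value of F_P(x) - F_Q(x)
    prev = events[0][0]
    for x, w in events:
        dist += abs(cdf_diff) * (x - prev)  # area of the current flat piece
        cdf_diff += w
        prev = x
    return dist

# A point mass at 0 versus a point mass at 1 are exactly distance 1 apart.
print(wasserstein1([0.0], [1.0], [1.0], [1.0]))

# Overfitting in the spirit of the abstract: a two-atom mixing measure
# with atoms 0.9 and 1.1 sits at W1 distance 0.1 from the single atom 1.0,
# even though it has an extra component.
print(wasserstein1([0.9, 1.1], [0.5, 0.5], [1.0], [1.0]))
```

The second example shows why Wasserstein distance is a natural loss for overfitted mixtures: a fitted measure with redundant components that cluster near a true atom is close in $W_1$, whereas naive componentwise comparison of parameters would not even be well defined.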

Article information

Source
Electron. J. Statist., Volume 10, Number 1 (2016), 271-307.

Dates
Received: February 2015
First available in Project Euclid: 17 February 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1455715963

Digital Object Identifier
doi:10.1214/16-EJS1105

Mathematical Reviews number (MathSciNet)
MR3466183

Zentralblatt MATH identifier
1332.62095

Subjects
Primary: 62F15 (Bayesian inference); 62G05 (Estimation)
Secondary: 62G20 (Asymptotic properties)

Keywords
Mixture models, strong identifiability, Wasserstein distances, minimax bounds, maximum likelihood estimation

Citation

Ho, Nhat; Nguyen, XuanLong. On strong identifiability and convergence rates of parameter estimation in finite mixtures. Electron. J. Statist. 10 (2016), no. 1, 271--307. doi:10.1214/16-EJS1105. https://projecteuclid.org/euclid.ejs/1455715963


