The Annals of Statistics

Inference for mixtures of symmetric distributions

David R. Hunter, Shaoli Wang, and Thomas P. Hettmansperger

Abstract

This article discusses the problem of estimation of parameters in finite mixtures when the mixture components are assumed to be symmetric and to come from the same location family. We refer to these mixtures as semi-parametric because no additional assumptions other than symmetry are made regarding the parametric form of the component distributions. Because the class of symmetric distributions is so broad, identifiability of parameters is a major issue in these mixtures. We develop a notion of identifiability of finite mixture models, which we call k-identifiability, where k denotes the number of components in the mixture. We give sufficient conditions for k-identifiability of location mixtures of symmetric components when k=2 or 3. We propose a novel distance-based method for estimating the (location and mixing) parameters from a k-identifiable model and establish the strong consistency and asymptotic normality of the estimator. In the specific case of L2-distance, we show that our estimator generalizes the Hodges–Lehmann estimator. We discuss the numerical implementation of these procedures, along with an empirical estimate of the component distribution, in the two-component case. In comparisons with maximum likelihood estimation assuming normal components, our method produces somewhat higher standard error estimates in the case where the components are truly normal, but dramatically outperforms the normal method when the components are heavy-tailed.
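For readers unfamiliar with the Hodges–Lehmann estimator mentioned above, the following is a minimal sketch of the classical one-sample version only: the median of the pairwise (Walsh) averages of the data. It is not the mixture estimator developed in the article. The function name `hodges_lehmann` is illustrative, and including the pairs with i = j is one common convention (some definitions use only i < j).

```python
import itertools
import statistics


def hodges_lehmann(sample):
    """Classical one-sample Hodges-Lehmann location estimate.

    Returns the median of the Walsh averages (x_i + x_j) / 2 taken over
    all pairs with i <= j. This is an illustrative sketch, not the
    distance-based mixture estimator described in the abstract.
    """
    xs = list(sample)
    if not xs:
        raise ValueError("sample must be non-empty")
    # All pairwise averages, including each point paired with itself.
    walsh = [
        (a + b) / 2.0
        for a, b in itertools.combinations_with_replacement(xs, 2)
    ]
    return statistics.median(walsh)


if __name__ == "__main__":
    # A single gross outlier moves the estimate much less than it moves the mean.
    data = [1, 2, 3, 100]
    print(hodges_lehmann(data), sum(data) / len(data))
```

The sketch forms all n(n+1)/2 pairwise averages, so it is quadratic in the sample size and suited only to small illustrative samples.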

Article information

Source
Ann. Statist., Volume 35, Number 1 (2007), 224-251.

Dates
First available in Project Euclid: 6 June 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1181100187

Digital Object Identifier
doi:10.1214/009053606000001118

Mathematical Reviews number (MathSciNet)
MR2332275

Zentralblatt MATH identifier
1114.62035

Subjects
Primary: 62G05: Estimation

Keywords
Deconvolution; Hodges–Lehmann estimator; identifiability; semi-parametric mixtures

Citation

Hunter, David R.; Wang, Shaoli; Hettmansperger, Thomas P. Inference for mixtures of symmetric distributions. Ann. Statist. 35 (2007), no. 1, 224–251. doi:10.1214/009053606000001118. https://projecteuclid.org/euclid.aos/1181100187


References

  • Arcones, M. A., Chen, Z. and Giné, E. (1994). Estimators related to $U$-processes with applications to multivariate medians: Asymptotic normality. Ann. Statist. 22 1460–1477.
  • Billingsley, P. (1986). Probability and Measure, 2nd ed. Wiley, New York.
  • Bordes, L., Mottelet, S. and Vandekerkhove, P. (2006). Semiparametric estimation of a two-component mixture model. Ann. Statist. 34 1204–1232.
  • Cruz-Medina, I. R. and Hettmansperger, T. P. (2004). Nonparametric estimation in semi-parametric univariate mixture models. J. Stat. Comput. Simul. 74 513–524.
  • Ellis, S. P. (2002). Blind deconvolution when noise is symmetric: Existence and examples of solutions. Ann. Inst. Statist. Math. 54 758–767.
  • Hall, P. and Zhou, X.-H. (2003). Nonparametric estimation of component distributions in a multivariate mixture. Ann. Statist. 31 201–224.
  • Hettmansperger, T. P. and Thomas, H. (2000). Almost nonparametric inference for repeated measures in mixture models. J. R. Stat. Soc. Ser. B Stat. Methodol. 62 811–825.
  • Hodges, J. L., Jr. and Lehmann, E. L. (1963). Estimates of location based on rank tests. Ann. Math. Statist. 34 598–611.
  • Lee, A. J. (1990). $U$-Statistics: Theory and Practice. Dekker, New York.
  • Lindsay, B. G. (1995). Mixture Models: Theory, Geometry and Applications. IMS, Hayward, CA.
  • McLachlan, G. and Peel, D. A. (2000). Finite Mixture Models. Wiley, New York.
  • Pollard, D. (1985). New ways to prove central limit theorems. Econometric Theory 1 295–314.
  • Titterington, D. M., Smith, A. F. M. and Makov, U. E. (1985). Statistical Analysis of Finite Mixture Distributions. Wiley, Chichester.
  • Walther, G. (2001). Multiscale maximum likelihood analysis of a semiparametric model, with applications. Ann. Statist. 29 1297–1319.
  • Walther, G. (2002). Detecting the presence of mixing with multiscale maximum likelihood. J. Amer. Statist. Assoc. 97 508–513.
  • Yakowitz, S. J. and Spragins, J. D. (1968). On the identifiability of finite mixtures. Ann. Math. Statist. 39 209–214.