The Annals of Statistics

Semiparametric estimation of a two-component mixture model

Laurent Bordes, Stéphane Mottelet, and Pierre Vandekerkhove

Abstract

Suppose that univariate data are drawn from a mixture of two distributions that are equal up to a shift parameter. Such a model is known to be nonidentifiable from a nonparametric viewpoint. However, if we assume that the unknown mixed distribution is symmetric, we obtain the identifiability of this model, which is then defined by four unknown parameters: the mixing proportion, two location parameters and the cumulative distribution function of the symmetric mixed distribution. We propose estimators for these four parameters when no training data is available. Our estimators are shown to be strongly consistent under mild regularity assumptions and their convergence rates are studied. Their finite-sample properties are illustrated by a Monte Carlo study and our method is applied to real data.
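
As a minimal sketch of the model described in the abstract, the snippet below simulates data from a mixture of two copies of one symmetric distribution that differ only by a shift. The function name `sample_shift_mixture`, the Laplace noise, and the parameter values are illustrative assumptions and are not taken from the paper. The paper's estimators are not reproduced here.

```python
import numpy as np


def sample_shift_mixture(n, p, mu1, mu2, rng):
    """Draw n points from p * F(. - mu1) + (1 - p) * F(. - mu2), F symmetric about 0."""
    # Symmetric zero-centred noise; Laplace is an arbitrary illustrative choice.
    noise = rng.laplace(loc=0.0, scale=1.0, size=n)
    # Latent component labels exist only inside the simulation; they play the
    # role of the training data that the abstract says is unavailable.
    from_first = rng.random(n) < p
    return np.where(from_first, mu1, mu2) + noise


rng = np.random.default_rng(0)
x = sample_shift_mixture(100_000, p=0.3, mu1=-1.0, mu2=2.0, rng=rng)

# Because the noise has mean zero, the mixture mean is p*mu1 + (1 - p)*mu2.
mixture_mean = 0.3 * (-1.0) + 0.7 * 2.0
```

Only `x` would be observed in practice. The mixing proportion, the two locations, and the shape of the symmetric distribution are the four unknowns the abstract refers to.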

Article information

Source
Ann. Statist., Volume 34, Number 3 (2006), 1204-1232.

Dates
First available in Project Euclid: 10 July 2006

Permanent link to this document
https://projecteuclid.org/euclid.aos/1152540747

Digital Object Identifier
doi:10.1214/009053606000000353

Mathematical Reviews number (MathSciNet)
MR2278356

Zentralblatt MATH identifier
1112.62029

Subjects
Primary: 62G05: Estimation; 62G20: Asymptotic properties
Secondary: 62E10: Characterization and structure theory

Keywords
Semiparametric; two-component mixture model; identifiability; contrast estimators; consistency; rate of convergence; mixing operator

Citation

Bordes, Laurent; Mottelet, Stéphane; Vandekerkhove, Pierre. Semiparametric estimation of a two-component mixture model. Ann. Statist. 34 (2006), no. 3, 1204--1232. doi:10.1214/009053606000000353. https://projecteuclid.org/euclid.aos/1152540747

