Electronic Journal of Statistics

An MM algorithm for estimation of a two component semiparametric density mixture with a known component

Zhou Shen, Michael Levine, and Zuofeng Shang

Full-text: Open access

Abstract

We consider a semiparametric mixture of two univariate density functions where one of them is known while the weight and the other function are unknown. We do not assume any additional structure on the unknown density function. For this mixture model, we derive a new sufficient identifiability condition and pinpoint a specific class of distributions describing the unknown component for which this condition is mostly satisfied. We also suggest a novel approach to estimation of this model that is based on the idea of applying maximum smoothed likelihood to what would otherwise have been an ill-posed problem. We introduce an iterative MM (Majorization-Minimization) algorithm that estimates all of the model parameters. We establish that the algorithm possesses a descent property with respect to a log-likelihood objective functional and prove that the algorithm does converge. Finally, we illustrate the performance of our algorithm in a simulation study and apply it to a real dataset.
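To make the model in the abstract concrete, the sketch below fits a two-component mixture in which one density is known, alternating between posterior weights, the mixing proportion, and a weighted kernel estimate of the unknown density. It is an illustration only: it uses a plain EM-like fixed-point iteration with a Gaussian kernel, not the smoothed-likelihood MM algorithm of the paper, and it inherits none of that algorithm's descent or convergence guarantees. The names `normal_pdf`, `weighted_kde`, and `fit_two_component`, the bandwidth 0.4, the simulated sample, and the notation g(x) = (1 - p) f0(x) + p f(x) are all assumptions made here for the example.

```python
import numpy as np


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2), evaluated elementwise."""
    z = (np.asarray(x, dtype=float) - mean) / sd
    return np.exp(-0.5 * z**2) / (sd * np.sqrt(2.0 * np.pi))


def weighted_kde(points, data, weights, bandwidth):
    """Gaussian-kernel density estimate at `points`; weights are renormalized to sum to 1."""
    w = weights / weights.sum()
    scaled = (points[:, None] - data[None, :]) / bandwidth
    return (normal_pdf(scaled) * w[None, :]).sum(axis=1) / bandwidth


def fit_two_component(data, known_pdf, bandwidth, p_init=0.5, n_iter=200, tol=1e-8):
    """Heuristic fit of g = (1 - p) * f0 + p * f with f0 known (hypothetical helper).

    Returns the estimated weight p of the unknown component and the posterior
    probability, for each observation, of having come from that component.
    """
    data = np.asarray(data, dtype=float)
    f0 = known_pdf(data)  # known component at the data points
    f = weighted_kde(data, data, np.ones_like(data), bandwidth)  # initial guess for f
    p = p_init
    for _ in range(n_iter):
        # Posterior probability that each point belongs to the unknown component.
        w = p * f / (p * f + (1.0 - p) * f0)
        p_new = w.mean()
        # Re-estimate the unknown density as a weighted kernel density estimate.
        f = weighted_kde(data, data, w, bandwidth)
        converged = abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    return p, w


# Simulated sample: 60% from the known N(0, 1), 40% from an "unknown" N(3, 1).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(3.0, 1.0, 200)])
p_hat, w_hat = fit_two_component(sample, normal_pdf, bandwidth=0.4)
print(f"estimated weight of the unknown component: {p_hat:.3f}")
```

Because nothing is assumed about the unknown density, an unregularized iteration of this kind need not recover the true weight, which is one motivation for the identifiability condition and the smoothed-likelihood formulation described above.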

Article information

Source
Electron. J. Statist., Volume 12, Number 1 (2018), 1181-1209.

Dates
Received: July 2017
First available in Project Euclid: 28 March 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1522224150

Digital Object Identifier
doi:10.1214/18-EJS1417

Mathematical Reviews number (MathSciNet)
MR3780730

Zentralblatt MATH identifier
06864489

Subjects
Primary: 62G07: Density estimation
Secondary: 62G99: None of the above, but in this section

Keywords
Penalized smoothed likelihood; MM algorithm; regularization

Rights
Creative Commons Attribution 4.0 International License.

Citation

Shen, Zhou; Levine, Michael; Shang, Zuofeng. An MM algorithm for estimation of a two component semiparametric density mixture with a known component. Electron. J. Statist. 12 (2018), no. 1, 1181--1209. doi:10.1214/18-EJS1417. https://projecteuclid.org/euclid.ejs/1522224150

