Open Access
December 2017
Estimating a probability mass function with unknown labels
Dragi Anevski, Richard D. Gill, Stefan Zohren
Ann. Statist. 45(6): 2708-2735 (December 2017). DOI: 10.1214/17-AOS1542


In the context of a species sampling problem, we discuss a nonparametric maximum likelihood estimator for the underlying probability mass function. The estimator is known in the computer science literature as the high profile estimator. We prove strong consistency and derive the rates of convergence for an extended model version of the estimator. We also study a sieved estimator, for which similar consistency results are derived. Numerical computation of the sieved estimator is of great interest for practical problems, such as forensic DNA analysis, and we present a computational algorithm based on the stochastic approximation of the expectation maximisation algorithm. As an interesting byproduct of the numerical analyses, we introduce an algorithm for bounded isotonic regression, for which we also prove convergence.


Dragi Anevski, Richard D. Gill, Stefan Zohren. "Estimating a probability mass function with unknown labels." Ann. Statist. 45 (6) 2708-2735, December 2017.


Received: 1 May 2016; Published: December 2017
First available in Project Euclid: 15 December 2017

zbMATH: 06838148
MathSciNet: MR3737907
Digital Object Identifier: 10.1214/17-AOS1542

Primary: 62G05, 62G20, 62P10, 65C60

Keywords: high profile, monotone rearrangement, nonparametric, NPMLE, ordered, probability mass function, rates, SA-EM, sieve, strong consistency

Rights: Copyright © 2017 Institute of Mathematical Statistics

Vol. 45 • No. 6 • December 2017