## The Annals of Statistics

### Empirical Distributions in Selection Bias Models

Y. Vardi

#### Abstract

The following problem is treated: Given $s$ not-necessarily-random samples from an unknown distribution $F$, and assuming that we know the sampling rule of each sample, is it possible to combine the samples in order to estimate $F$, and if so what is the natural way of doing it? More formally, this translates to the problem of determining whether there exists a nonparametric maximum likelihood estimate (NPMLE) of $F$ on the basis of $s$ samples from weighted versions of $F$, with known weight functions, and if it exists, how to construct it? We give a simple necessary and sufficient condition, which can be checked graphically, for the existence and uniqueness of the NPMLE and, under this condition, we describe a simple method for constructing it. The method is numerically efficient and mathematically interesting because it reduces the problem to one of solving $s - 1$ nonlinear equations with $s - 1$ unknowns, the unique solution of which is easily obtained by the iterative, Gauss-Seidel type, scheme described in the paper. Extensions for the case where the weight functions are not completely specified and for censored samples, applications, numerical examples, and statistical properties of the NPMLE, are discussed. In particular, we prove under this condition that the NPMLE is a sufficient statistic for $F$. The technique has many potential applications, because it is not limited to the case where the sampled items are univariate. A FORTRAN program for the described algorithm is available from the author.

#### Article information

Source
Ann. Statist., Volume 13, Number 1 (1985), 178-203.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176346585

Digital Object Identifier
doi:10.1214/aos/1176346585

Mathematical Reviews number (MathSciNet)
MR773161

Zentralblatt MATH identifier
0578.62047

JSTOR