## The Annals of Statistics

### Large Sample Theory of Empirical Distributions in Biased Sampling Models

#### Abstract

Vardi (1985a) introduced an $s$-sample model for biased sampling, gave conditions which guarantee the existence and uniqueness of the nonparametric maximum likelihood estimator $\mathbb{G}_n$ of the common underlying distribution $G$ and discussed numerical methods for calculating the estimator. Here we examine the large sample behavior of the NPMLE $\mathbb{G}_n$, including results on uniform consistency of $\mathbb{G}_n$, convergence of $\sqrt n (\mathbb{G}_n - G)$ to a Gaussian process and asymptotic efficiency of $\mathbb{G}_n$ as an estimator of $G$. The proofs are based upon recent results for empirical processes indexed by sets and functions and convexity arguments. We also give a careful proof of identifiability of the underlying distribution $G$ under connectedness of a certain graph $\mathbf{G}$. Examples and applications include length-biased sampling, stratified sampling, "enriched" stratified sampling, "choice-based" sampling in econometrics and "case-control" studies in biostatistics. A final section discusses design issues and further problems.
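As a concrete illustration of the simplest example named above, the following is a minimal sketch of the estimator in one-sample length-biased sampling. It assumes the single biasing weight $w(x) = x$, in which case the NPMLE is the well-known weighted empirical distribution placing mass proportional to $1/x_i$ on each observation. It is not the general $s$-sample procedure of the paper, and the function names `length_biased_npmle` and `cdf_at` are hypothetical, chosen only for this example.

```python
from bisect import bisect_right


def length_biased_npmle(sample):
    """Weighted empirical CDF for one length-biased sample (assumed w(x) = x).

    Returns the sorted observations and the cumulative mass of the
    estimate of G at each of them. Each observation x_i receives mass
    (1 / x_i) / sum_j (1 / x_j).
    """
    xs = sorted(sample)
    if not xs or xs[0] <= 0:
        raise ValueError("sample must be non-empty with positive values")
    inverse = [1.0 / x for x in xs]
    total = sum(inverse)
    cumulative = []
    running = 0.0
    for w in inverse:
        running += w / total
        cumulative.append(running)
    return xs, cumulative


def cdf_at(t, xs, cumulative):
    """Evaluate the estimated CDF at t (right-continuous step function)."""
    k = bisect_right(xs, t)  # number of observations <= t
    return 0.0 if k == 0 else cumulative[k - 1]


if __name__ == "__main__":
    xs, cum = length_biased_npmle([4.0, 1.0, 2.0])
    for t in (0.5, 1.0, 3.0, 4.0):
        print(t, cdf_at(t, xs, cum))
```

Small observations receive the largest mass because length-biased sampling under-represents them in proportion to their size; reweighting by $1/x_i$ undoes that bias.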

#### Article information

Source
Ann. Statist., Volume 16, Number 3 (1988), 1069-1112.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176350948

Digital Object Identifier
doi:10.1214/aos/1176350948

Mathematical Reviews number (MathSciNet)
MR959189

Zentralblatt MATH identifier
0668.62024
