## The Annals of Statistics

### Estimation of a monotone density in $s$-sample biased sampling models

#### Abstract

We study the nonparametric estimation of a decreasing density function $g_{0}$ in a general $s$-sample biased sampling model with weight (or bias) functions $w_{i}$ for $i=1,\ldots,s$. Except for the case $s=1$, the determination of the monotone maximum likelihood estimator $\hat{g}_{n}$ and its asymptotic distribution has long been missing from the literature because of certain nonstandard structures of the likelihood function, such as nonseparability and a lack of strictly positive second-order derivatives of the negative log-likelihood function. In this article we establish the existence, uniqueness, self-characterization and consistency of $\hat{g}_{n}$, together with its asymptotic distribution at a fixed point. To overcome the barriers caused by the nonstandard likelihood structures, we show, for instance, the tightness of $\hat{g}_{n}$ via a purely analytic argument instead of an intrinsic geometric one, and we propose an indirect approach to attain the $\sqrt{n}$-rate of convergence of the linear functional $\int w_{i}\hat{g}_{n}$.
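
For orientation, the setting of the abstract can be written out as follows. This is a sketch in the standard notation for biased sampling models (as in Vardi, 1985, and Gill, Vardi and Wellner, 1988), not a quotation from the article; the symbols $f_{i}$, $W_{i}$, $n_{i}$ and $X_{ij}$ are assumptions introduced here for illustration.

```latex
% Density of the i-th biased sample (i = 1, ..., s); W_i is its normalizing constant.
f_{i}(x) = \frac{w_{i}(x)\, g_{0}(x)}{W_{i}(g_{0})},
\qquad
W_{i}(g) = \int w_{i}(u)\, g(u)\, du .

% Log-likelihood of a candidate decreasing density g, given observations
% X_{i1}, ..., X_{i n_i} from the i-th sample (assumed notation):
\ell_{n}(g) = \sum_{i=1}^{s} \sum_{j=1}^{n_{i}}
  \Bigl\{ \log w_{i}(X_{ij}) + \log g(X_{ij}) - \log W_{i}(g) \Bigr\} .
```

Under this reading, the terms $\log W_{i}(g)$ couple the values of $g$ at all points, which is one way to see the nonseparability mentioned in the abstract, and $W_{i}(\hat{g}_{n}) = \int w_{i}\hat{g}_{n}$ is the linear functional whose $\sqrt{n}$-rate is discussed there.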

#### Article information

Source
Ann. Statist., Volume 46, Number 5 (2018), 2125–2152.

Dates
Revised: May 2017
First available in Project Euclid: 17 August 2018

https://projecteuclid.org/euclid.aos/1534492831

Digital Object Identifier
doi:10.1214/17-AOS1614

Mathematical Reviews number (MathSciNet)
MR3845013

Zentralblatt MATH identifier
06964328

Subjects
Primary: 62G20: Asymptotic properties; 62E20: Asymptotic distribution theory
Secondary: 62G08: Nonparametric regression

#### Citation

Chan, Kwun Chuen Gary; Ling, Hok Kan; Sit, Tony; Yam, Sheung Chi Phillip. Estimation of a monotone density in $s$-sample biased sampling models. Ann. Statist. 46 (2018), no. 5, 2125–2152. doi:10.1214/17-AOS1614. https://projecteuclid.org/euclid.aos/1534492831

#### References

• Banerjee, M. (2007). Likelihood based inference for monotone response models. Ann. Statist. 35 931–956.
• Banerjee, M. and Wellner, J. A. (2001). Likelihood ratio tests for monotone functions. Ann. Statist. 29 1699–1731.
• Barlow, R. E., Bartholomew, D. J., Bremner, J. M. and Brunk, H. D. (1972). Statistical Inference Under Order Restrictions. Wiley, New York.
• Chan, K. C. G. and Wang, M.-C. (2012). Estimating incident population distribution from prevalent data. Biometrics 68 521–531.
• Chan, K. C. G., Ling, H. K., Sit, T. and Yam, S. C. P. (2018). Supplement to “Estimation of a monotone density in $s$-sample biased sampling models.” DOI:10.1214/17-AOS1614SUPP.
• Cook, R. C. and Martin, F. B. (1974). A model for quadrat sampling with visibility bias. J. Amer. Statist. Assoc. 69 345–349.
• Cox, D. R. (1968). Some sampling problems in technology. In New Developments in Survey Sampling (N. L. Johnson and H. Smith, eds.) 506–527. Wiley, New York.
• Davidov, O. and Iliopoulos, G. (2009). On the existence and uniqueness of the NPMLE in biased sampling models. J. Statist. Plann. Inference 139 176–183.
• Drummer, T. D. and McDonald, L. L. (1987). Size bias in line transect sampling. Biometrics 43 13–21.
• Dümbgen, L., Wellner, J. A. and Wolff, M. (2016). A law of the iterated logarithm for Grenander’s estimator. Stochastic Process. Appl. 126 3854–3864.
• El Barmi, H. and Nelson, P. I. (2002). A note on estimating a non-increasing density in the presence of selection bias. J. Statist. Plann. Inference 107 353–364.
• Gill, R. D., Vardi, Y. and Wellner, J. A. (1988). Large sample theory of empirical distributions in biased sampling models. Ann. Statist. 16 1069–1112.
• Grenander, U. (1956). On the theory of mortality measurement. II. Skand. Aktuarietidskr. 39 125–153.
• Groeneboom, P. (1985). Estimating a monotone density. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, Calif., 1983) 539–555. Wadsworth, Belmont, CA.
• Groeneboom, P. (1996). Lectures on inverse problems. In Lectures on Probability Theory and Statistics (Saint-Flour, 1994). Lecture Notes in Math. 1648 67–164. Springer, Berlin.
• Groeneboom, P., Hooghiemstra, G. and Lopuhaä, H. P. (1999). Asymptotic normality of the $L_{1}$ error of the Grenander estimator. Ann. Statist. 27 1316–1347.
• Groeneboom, P. and Jongbloed, G. (2014). Nonparametric Estimation Under Shape Constraints: Estimators, Algorithms and Asymptotics. Cambridge Series in Statistical and Probabilistic Mathematics 38. Cambridge Univ. Press, New York.
• Groeneboom, P. and Wellner, J. A. (1992). Information Bounds and Nonparametric Maximum Likelihood Estimation. DMV Seminar 19. Birkhäuser, Basel.
• Hausman, J. A. and Wise, D. A. (1981). Stratification on endogenous variables and estimation: The Gary income maintenance experiment. In Structural Analysis of Discrete Data with Econometric Applications (C. Manski and D. McFadden, eds.) 365–391. MIT Press, Cambridge, MA.
• Huang, J. and Wellner, J. A. (1995a). Asymptotic normality of the NPMLE of linear functionals for interval censored data, case 1. Stat. Neerl. 49 153–163.
• Huang, J. and Wellner, J. A. (1995b). Estimation of a monotone density or monotone hazard under random censoring. Scand. J. Stat. 22 3–33.
• Imbens, G. W. and Lancaster, T. (1996). Efficient estimation and stratified sampling. J. Econometrics 74 289–318.
• Jankowski, H. (2014). Convergence of linear functionals of the Grenander estimator under misspecification. Ann. Statist. 42 625–653.
• Kang, Q., Nelson, P. I. and Vahl, C. I. (2010). Parameter estimation from an outcome-dependent enriched sample using weighted likelihood method. Statist. Sinica 20 1529–1550.
• Patil, G. P. (1984). Studies in statistical ecology involving weighted distributions. In Statistics: Applications and New Directions (Calcutta, 1981) 478–503. Indian Statist. Inst., Calcutta.
• Patil, G. P. and Rao, C. R. (1978). Weighted distributions and size-biased sampling with applications to wildlife populations and human families. Biometrics 34 179–189.
• Perlis, S. (1991). Theory of Matrices. Dover, New York.
• Prakasa Rao, B. L. S. (1969). Estimation of a unimodal density. Sankhya, Ser. A 31 23–36.
• Smith, W. and Parnes, M. (1994). Mean streets: The median of a size-biased sample and the population mean. Amer. Statist. 106–10.
• van de Geer, S. (2000). Empirical Processes in M-Estimation. Cambridge Univ. Press, Cambridge.
• van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, New York.
• Vardi, Y. (1985). Empirical distributions in selection bias models. Ann. Statist. 13 178–205.
• Wang, M.-C. (1991). Nonparametric estimation from cross-sectional survival data. J. Amer. Statist. Assoc. 86 130–143.
• Wang, M.-C. (1992). The analysis of retrospectively ascertained data in the presence of reporting delays. J. Amer. Statist. Assoc. 87 397–406.
• Wang, X. and Zhou, H. (2006). A semiparametric empirical likelihood method for biased sampling schemes with auxiliary covariates. Biometrics 62 1149–1160.
• Woodroofe, M. and Sun, J. (1993). A penalized maximum likelihood estimate of $f(0+)$ when $f$ is nonincreasing. Statist. Sinica 3 501–515.

#### Supplemental materials

• Supplement to “Estimation of a monotone density in $s$-sample biased sampling models”. In the supplementary paper, we provide the proofs of Propositions 3.1, 3.2, 4.1 and 5.13; Lemmas 5.1, 5.2, 5.4, 5.5, 5.9, 5.11, 5.12, 5.14, 5.15, 6.1, 6.2, 6.3, 6.4 and 6.5; and Theorems 1.1 and 6.6. In addition, we state and prove in Proposition 8.1 that the function $\tilde{\mathcal{L}}_{n}$ defined in (3.3) is concave in $\boldsymbol{p}$, and hence establish the unique existence of $\hat{g}_{n}$ in Proposition 8.2.