The Annals of Applied Statistics

A covariate adjustment for zero-truncated approaches to estimating the size of hidden and elusive populations

Dankmar Böhning and Peter G. M. van der Heijden

Source: Ann. Appl. Stat. Volume 3, Number 2 (2009), 595-610.

Abstract

In this paper we consider the estimation of population size from one-source capture–recapture data, that is, a list in which individuals can potentially be found repeatedly and where the question is how many individuals are missed by the list. As a typical example, we provide data from a drug user study in Bangkok from 2001 where the list consists of drug users who repeatedly contact treatment institutions. Drug users with 1, 2, 3, … contacts occur, but drug users with zero contacts are not present, requiring the size of this group to be estimated. Statistically, these data can be considered as stemming from a zero-truncated count distribution. We revisit an estimator for the population size suggested by Zelterman that is known to be robust under potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a locally truncated Poisson likelihood which is equivalent to a binomial likelihood. This result allows the extension of the Zelterman estimator by means of logistic regression to include observed heterogeneity in the form of covariates. We also review an estimator proposed by Chao and explain why we are not able to obtain similar results for this estimator. The Zelterman estimator is applied in two case studies, the first a drug user study from Bangkok, the second an illegal immigrant study in the Netherlands. Our results suggest the new estimator should be used, in particular, if substantial unobserved heterogeneity is present.
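As a rough illustration of the estimators named in the abstract, the sketch below uses the standard closed forms from the capture–recapture literature. It assumes a Poisson count model and no covariates, and writes f1 and f2 for the number of individuals observed exactly once and exactly twice. The function names are hypothetical, and the counts are invented toy values, not data from the Bangkok or Netherlands studies. The second function shows the binomial view: restricted to counts of 1 or 2, the probability of a count of 2 is λ/(2 + λ), so the binomial maximum likelihood estimate gives back the same λ.

```python
import math


def zelterman_lambda(f1, f2):
    """Zelterman's estimate of the Poisson rate: 2 * f2 / f1."""
    if f1 <= 0:
        raise ValueError("f1 must be positive")
    return 2.0 * f2 / f1


def lambda_from_binomial(f1, f2):
    """Same rate via the binomial view on counts 1 and 2 only.

    Under a Poisson model, P(Y = 2 | Y in {1, 2}) = lam / (2 + lam).
    The binomial MLE is p_hat = f2 / (f1 + f2); solving for lam
    gives lam = 2 * p_hat / (1 - p_hat).
    """
    p_hat = f2 / (f1 + f2)
    return 2.0 * p_hat / (1.0 - p_hat)


def zelterman_population_size(n_observed, f1, f2):
    """Horvitz-Thompson type estimate: n / (1 - exp(-lam))."""
    lam = zelterman_lambda(f1, f2)
    if lam <= 0:
        raise ValueError("estimated rate must be positive")
    return n_observed / (1.0 - math.exp(-lam))


def chao_population_size(n_observed, f1, f2):
    """Chao's lower-bound estimate: n + f1^2 / (2 * f2)."""
    if f2 <= 0:
        raise ValueError("f2 must be positive")
    return n_observed + f1 ** 2 / (2.0 * f2)


# Invented toy counts (assumption, for illustration only).
n_observed, f1, f2 = 300, 200, 50

print("lambda (Zelterman):", zelterman_lambda(f1, f2))
print("lambda (binomial): ", lambda_from_binomial(f1, f2))
print("N (Zelterman):     ", zelterman_population_size(n_observed, f1, f2))
print("N (Chao):          ", chao_population_size(n_observed, f1, f2))
```

The binomial form is what makes a covariate extension by logistic regression possible, as the abstract describes; the sketch above stops at the covariate-free case.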

Keywords: Population size estimation; capture–recapture; estimation under model misspecification; truncated Poisson and binomial likelihood; elusive population

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1245676187
Digital Object Identifier: doi:10.1214/08-AOAS214

References

Bishop, Y. M. M., Fienberg, S. E. and Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge.
Mathematical Reviews (MathSciNet): MR381130
Böhning, D., Suppawattanabodee, B., Kusolvisitkul, W. and Viwatwongkasem, C. (2004). Estimating the number of drug users in Bangkok 2001: A capture–recapture approach using repeated entries in one list. European Journal of Epidemiology 19 1075–1083.
Böhning, D. and Schön, D. (2005). Nonparametric maximum likelihood estimation of the population size based upon the counting distribution. J. Roy. Statist. Soc. Ser. C 54 721–737.
Böhning, D. and Kuhnert, R. (2006). The equivalence of truncated count mixture distributions and mixtures of truncated count distributions. Biometrics 62 1207–1215.
Mathematical Reviews (MathSciNet): MR2307446
Digital Object Identifier: doi:10.1111/j.1541-0420.2006.00565.x
Böhning, D. and van der Heijden, P. G. M. (2008a). Supplement to “A covariate adjustment for zero-truncated approaches to estimating the size of hidden and elusive populations.” DOI: 10.1214/08-AOAS214SUPP.
Böhning, D. and van der Heijden, P. G. M. (2008b). Supplement to “A covariate adjustment for zero-truncated approaches to estimating the size of hidden and elusive populations.”
Böhning, D. and van der Heijden, P. G. M. (2008c). Supplement to “A covariate adjustment for zero-truncated approaches to estimating the size of hidden and elusive populations.”
Chao, A. (1987). Estimating the population size for capture–recapture data with unequal capture probabilities. Biometrics 43 783–791.
Mathematical Reviews (MathSciNet): MR920467
Digital Object Identifier: doi:10.2307/2531532
Chao, A. (1989). Estimating population size for sparse data in capture–recapture experiments. Biometrics 45 427–438.
Mathematical Reviews (MathSciNet): MR1010510
Digital Object Identifier: doi:10.2307/2531487
Gurmu, S. (1991). Tests for detecting overdispersion in the positive Poisson regression model. J. Bus. Econom. Statist. 9 215–222.
Hay, G. and Smit, F. (2003). Estimating the number of drug injectors from needle exchange data. Addiction Research and Theory 11 235–243.
Hook, E. B. and Regal, R. (1995). Capture–recapture methods in epidemiology: Methods and limitations. Epidemiologic Reviews 17 243–264.
Huggins, R. M. (1989). On the statistical analysis of capture experiments. Biometrika 76 133–140.
Mathematical Reviews (MathSciNet): MR991431
Zentralblatt MATH: 0664.62115
Digital Object Identifier: doi:10.1093/biomet/76.1.133
International Working Group for Disease Monitoring and Forecasting (1995a). Capture–recapture and multiple record systems estimation I: History and theoretical development. American Journal of Epidemiology 142 1047–1058.
International Working Group for Disease Monitoring and Forecasting (1995b). Capture–recapture and multiple record systems estimation II: Applications in human diseases. American Journal of Epidemiology 142 1059–1068.
McKendrick, A. G. (1926). Application of mathematics to medical problems. Proc. Edinb. Math. Soc. 44 98–130.
Roberts, J. M. and Brewer, D. D. (2006). Estimating the prevalence of male clients of prostitute women in Vancouver with a simple capture–recapture method. J. Roy. Statist. Soc. Ser. A 169 745–756.
Mathematical Reviews (MathSciNet): MR2291342
Digital Object Identifier: doi:10.1111/j.1467-985X.2006.00416.x
Ross, S. M. (1985). Introduction to Probability Models, 3rd ed. Academic Press, Orlando, FL.
Smit, F., Reinking, D. and Reijerse, M. (2002). Estimating the number of people eligible for health service use. Evaluation and Program Planning 25 101–105.
Thompson, S. K. (2002). Sampling, 2nd ed. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1891249
van der Heijden, P. G. M., Bustami, R., Cruyff, M., Engbersen, G. and van Houwelingen, H. C. (2003a). Point and interval estimation of the population size using the truncated Poisson regression model. Stat. Model. 3 305–322.
Mathematical Reviews (MathSciNet): MR2012155
Digital Object Identifier: doi:10.1191/1471082X03st057oa
van der Heijden, P. G. M., Cruyff, M. and van Houwelingen, H. C. (2003b). Estimating the size of a criminal population from police records using the truncated Poisson regression model. Statist. Neerlandica 57 1–16.
Mathematical Reviews (MathSciNet): MR2019847
Digital Object Identifier: doi:10.1111/1467-9574.00232
Van Hest, N. H. A., Grant, A. D., Smit, F., Story, A. and Richardus, J. H. (2007). Estimating infectious diseases incidence: Validity of capture–recapture analysis and truncated models for incomplete count data. Epidemiology and Infection 136 14–22.
Wilson, R. M. and Collins, M. F. (1992). Capture–recapture estimation with samples of size one using frequency data. Biometrika 79 543–553.
Zelterman, D. (1988). Robust estimation in truncated discrete distributions with applications to capture–recapture experiments. J. Statist. Plann. Inference 18 225–237.
Mathematical Reviews (MathSciNet): MR922210
Zentralblatt MATH: 0642.62021
Digital Object Identifier: doi:10.1016/0378-3758(88)90007-9

2009 © Institute of Mathematical Statistics