Open Access
Estimating hidden population size using Respondent-Driven Sampling data
Mark S. Handcock, Krista J. Gile, Corinne M. Mar
Electron. J. Statist. 8(1): 1491-1521 (2014). DOI: 10.1214/14-EJS923

Abstract

Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact them through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS.

The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is used that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. Finally, the method demonstrates sensible results when used to estimate the size of known networked populations from the National Longitudinal Study of Adolescent Health, and when used to estimate the size of a hard-to-reach population at high risk for HIV.
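As a rough illustration of the successive sampling idea mentioned above, the sketch below draws units one at a time without replacement, each with probability proportional to its size (here, a personal network size or degree) among the units not yet sampled. It is not the authors' implementation: the function name `successive_sample`, the toy population of 200 degrees, and the sample size of 100 are all assumptions made for illustration, and the Bayesian inference step for the population size is not shown.

```python
import random


def successive_sample(sizes, n, rng):
    """Draw n distinct unit indices, one at a time, each draw made with
    probability proportional to size among the units still unsampled.
    (Hypothetical helper for illustration only.)"""
    remaining = list(range(len(sizes)))
    order = []
    for _ in range(n):
        weights = [sizes[i] for i in remaining]
        # Pick a position in `remaining` with size-proportional probability.
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        order.append(remaining.pop(pos))
    return order


# Toy population (an assumption): 200 people with degrees cycling through 1..20.
rng = random.Random(0)
degrees = [1 + (i % 20) for i in range(200)]

order = successive_sample(degrees, 100, rng)
sampled_degrees = [degrees[i] for i in order]

# Compare the average degree of early and late draws.
first_half_mean = sum(sampled_degrees[:50]) / 50
second_half_mean = sum(sampled_degrees[50:]) / 50
print(first_half_mean, second_half_mean)
```

Under this mechanism, high-degree units tend to be drawn earlier, so the average degree tends to decline over the sample order as the high-degree units are depleted. How quickly it declines depends on how large the sample is relative to the population, which is, loosely, the kind of information in the ordered degree sequence that the abstract refers to.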

Citation

Mark S. Handcock, Krista J. Gile, Corinne M. Mar. "Estimating hidden population size using Respondent-Driven Sampling data." Electron. J. Statist. 8 (1) 1491 - 1521, 2014. https://doi.org/10.1214/14-EJS923

Information

Published: 2014
First available in Project Euclid: 2 September 2014

zbMATH: 1295.62011
MathSciNet: MR3263129
Digital Object Identifier: 10.1214/14-EJS923

Subjects:
Primary: 62D05, 91D30
Secondary: 60K35

Keywords: Hard-to-reach population sampling, model-based survey sampling, network sampling, social networks, successive sampling

Rights: Copyright © 2014 The Institute of Mathematical Statistics and the Bernoulli Society

Vol.8 • No. 1 • 2014