Open Access
Latent demographic profile estimation in hard-to-reach groups
Tyler H. McCormick, Tian Zheng
Ann. Appl. Stat. 6(4): 1795-1813 (December 2012). DOI: 10.1214/12-AOAS569


The sampling frame in most social science surveys excludes members of certain groups, known as hard-to-reach groups. These groups, or subpopulations, may be difficult to access (e.g., the homeless), camouflaged by stigma (e.g., individuals with HIV/AIDS), or both (e.g., commercial sex workers). Even basic demographic information about these groups is typically unknown, especially in many developing nations. We present statistical models that leverage social network structure to estimate demographic characteristics of these subpopulations using aggregated relational data (ARD), that is, responses to questions of the form “How many X’s do you know?” Unlike other network-based techniques for reaching these groups, ARD require no special sampling strategy and are easily incorporated into standard surveys. ARD also do not require respondents to reveal their own group membership. We propose a Bayesian hierarchical model for estimating the demographic characteristics of hard-to-reach groups, or latent demographic profiles, using ARD. We describe two estimation techniques. First, we propose a Markov chain Monte Carlo algorithm for existing data or for cases where the full posterior distribution is of interest. Second, for cases where new data can be collected, we propose guidelines and, based on these guidelines, a simple estimate motivated by a missing data approach. Using data from McCarty et al. [Human Organization 60 (2001) 28–39], we estimate the age and gender profiles of six hard-to-reach groups, such as individuals who have HIV, women who were raped, and homeless persons. We also evaluate our simple estimates using simulation studies.
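As an illustration of how ARD counts can carry information about a demographic profile, the following toy sketch may help. It is not the authors' Bayesian model or their simple estimate. It assumes only two demographic groups, a known mixing matrix (the share of each respondent group's acquaintances that fall in each alter group), known respondent degrees, and known alter-group population sizes. The function names `group_rates` and `estimate_profile` and all numbers are hypothetical.

```python
def group_rates(responses, n_groups):
    """Pool ARD answers by respondent (ego) group.

    responses: list of (ego_group, degree, count), where count is the answer
    to "How many X's do you know?" and degree is the respondent's assumed
    known network size. Returns, per ego group, the share of acquaintances
    who belong to the hidden group X.
    """
    counts = [0.0] * n_groups
    degrees = [0.0] * n_groups
    for ego, degree, count in responses:
        counts[ego] += count
        degrees[ego] += degree
    return [c / d for c, d in zip(counts, degrees)]


def estimate_profile(rates, mixing, alter_sizes):
    """Recover a two-group profile from pooled rates (toy assumption).

    Assumes rates[e] = sum_a mixing[e][a] * prevalence[a], where
    prevalence[a] is the share of alter group a belonging to the hidden
    group. Solves the 2x2 system by Cramer's rule, then converts
    prevalences to the share of hidden-group members in each alter group.
    """
    (m00, m01), (m10, m11) = mixing
    det = m00 * m11 - m01 * m10
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular; profile not identifiable")
    p0 = (rates[0] * m11 - m01 * rates[1]) / det
    p1 = (m00 * rates[1] - rates[0] * m10) / det
    # Clip negative prevalences that sampling noise could produce.
    members = [max(p0, 0.0) * alter_sizes[0], max(p1, 0.0) * alter_sizes[1]]
    total = sum(members)
    if total == 0:
        raise ValueError("estimated hidden group is empty")
    return [m / total for m in members]


# Hypothetical data: two respondents per ego group (0 = female, 1 = male).
responses = [(0, 1000, 10), (0, 1000, 8), (1, 500, 6), (1, 1500, 18)]
mixing = [[0.6, 0.4], [0.3, 0.7]]   # assumed known; rows sum to 1
alter_sizes = [50000, 50000]        # assumed known population sizes

profile = estimate_profile(group_rates(responses, 2), mixing, alter_sizes)
print(profile)  # estimated share of the hidden group in each alter group
```

The sketch shows why respondents never need to reveal their own membership: only their counts, combined with how their acquaintances are distributed across demographic groups, enter the calculation. When the mixing matrix is close to singular (all respondent groups have similar acquaintance patterns), the profile is poorly identified in this toy setup.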



Tyler H. McCormick, Tian Zheng. "Latent demographic profile estimation in hard-to-reach groups." Ann. Appl. Stat. 6 (4) 1795-1813, December 2012.


Published: December 2012
First available in Project Euclid: 27 December 2012

zbMATH: 1257.62122
MathSciNet: MR3058684
Digital Object Identifier: 10.1214/12-AOAS569

Keywords: aggregated relational data, hard-to-reach populations, hierarchical model, social network, survey design

Rights: Copyright © 2012 Institute of Mathematical Statistics
