Open Access
Inference for Nonprobability Samples
Michael R. Elliott, Richard Valliant
Statist. Sci. 32(2): 249-264 (May 2017). DOI: 10.1214/16-STS598


Although selecting a probability sample has been the standard for decades when making inferences from a sample to a finite population, incentives are increasing to use nonprobability samples. In a world of “big data”, large amounts of data are available that are faster and easier to collect than are probability samples. Design-based inference, in which the distribution for inference is generated by the random mechanism used by the sampler, cannot be used for nonprobability samples. One alternative is quasi-randomization, in which pseudo-inclusion probabilities are estimated based on covariates available for sample and nonsample units. Another is superpopulation modeling for the analytic variables collected on the sample units, in which the model is used to predict values for the nonsample units. We discuss the pros and cons of each approach.
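
As a rough illustration of the quasi-randomization idea described above, the following Python sketch estimates pseudo-inclusion probabilities within cells of a single categorical covariate. It is a minimal sketch under stated assumptions, not the authors' method: the function names (`pseudo_weights`, `weighted_mean`), the cell-based estimator, and the toy data are all hypothetical. It assumes a weighted reference sample is available to estimate the population count in each cell.

```python
from collections import defaultdict


def pseudo_weights(nonprob_cells, ref_cells, ref_weights):
    """Return one pseudo-weight per nonprobability-sample unit.

    Hypothetical helper: the pseudo-inclusion probability in a cell is
    estimated as (nonprobability-sample count in the cell) divided by
    (population count in the cell estimated from the reference sample).
    """
    # Estimated population count per cell from the weighted reference sample.
    pop_hat = defaultdict(float)
    for cell, w in zip(ref_cells, ref_weights):
        pop_hat[cell] += w

    # Number of nonprobability-sample units per cell.
    n_nonprob = defaultdict(int)
    for cell in nonprob_cells:
        n_nonprob[cell] += 1

    weights = []
    for cell in nonprob_cells:
        n_hat = pop_hat.get(cell, 0.0)
        if n_hat <= 0:
            raise ValueError(f"cell {cell!r} is absent from the reference sample")
        pi_hat = n_nonprob[cell] / n_hat  # estimated pseudo-inclusion probability
        weights.append(1.0 / pi_hat)      # pseudo-weight is its inverse
    return weights


def weighted_mean(y, w):
    """Weighted (Hajek-type) mean of y using weights w."""
    return sum(yi * wi for yi, wi in zip(y, w)) / sum(w)


# Toy data (assumed): the nonprobability sample over-represents "young" units.
nonprob_cells = ["young", "young", "young", "old"]
y = [1, 1, 1, 0]
ref_cells = ["young", "old"]
ref_weights = [50.0, 50.0]

w = pseudo_weights(nonprob_cells, ref_cells, ref_weights)
estimate = weighted_mean(y, w)
```

In this toy case the unweighted sample mean is 0.75, while the pseudo-weighted mean is 0.5, because the estimated population is split evenly between the two cells. The adjustment only helps to the extent that the covariate actually explains selection into the nonprobability sample.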



Michael R. Elliott, Richard Valliant. "Inference for Nonprobability Samples." Statist. Sci. 32(2): 249-264, May 2017.


Published: May 2017
First available in Project Euclid: 11 May 2017

zbMATH: 1381.62024
MathSciNet: MR3648958
Digital Object Identifier: 10.1214/16-STS598

Keywords: coverage error, hierarchical regression, quasi-randomization, reference sample, selection bias, superpopulation model

Rights: Copyright © 2017 Institute of Mathematical Statistics
