The Annals of Applied Statistics

Two-phase sampling experiment for propensity score estimation in self-selected samples

Sixia Chen and Jae-Kwang Kim

Full-text: Open access


Self-selected samples are frequently obtained due to different levels of survey participation propensity of the survey individuals. When the survey participation is related to the survey topic of interest, propensity score weighting adjustment using auxiliary information may lead to biased estimation. In this paper, we consider a parametric model for the response probability that includes the study variable itself in the covariates of the model and propose a novel application of two-phase sampling to estimate the parameters of the propensity model. The proposed method includes an experiment in which data are collected again from a subset of the original self-selected sample. With this two-phase sampling experiment, we can estimate the parameters in a propensity score model consistently. Then the propensity score adjustment can be applied to the self-selected sample to estimate the population parameters. Sensitivity to the selection model assumption is investigated in two limited simulation studies. The proposed method is applied to the 2012 Iowa Caucus Survey.
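The final weighting step described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: it assumes a logistic participation model whose parameters are already known (standing in for the estimates that the two-phase experiment would supply), and the population model, the parameter values in `phi`, and all function names are hypothetical choices made for illustration. It shows only why a sample whose participation depends on the study variable gives a biased naive mean, and how inverse-propensity weighting removes most of that bias.

```python
import numpy as np


def expit(t):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-t))


def simulate(n_pop=200_000, phi=(-2.0, 0.5, 1.0), seed=0):
    """Simulate a finite population and a self-selected sample.

    ASSUMPTION: participation follows a logistic model in (x, y) with
    illustrative parameters phi; none of these values come from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_pop)                   # auxiliary variable
    y = 1.0 + 0.5 * x + rng.normal(size=n_pop)   # study variable
    # Participation probability depends on y itself (nonignorable selection).
    pi = expit(phi[0] + phi[1] * x + phi[2] * y)
    in_sample = rng.random(n_pop) < pi           # who self-selects
    return y, pi, in_sample


def weighted_mean(y, pi):
    """Inverse-propensity-weighted (Hajek-type) mean over the sample."""
    w = 1.0 / pi
    return np.sum(w * y) / np.sum(w)


y, pi, in_sample = simulate()
pop_mean = y.mean()                              # target population mean
naive_mean = y[in_sample].mean()                 # unweighted sample mean
# ASSUMPTION: the true propensities are used here in place of estimated ones.
psa_mean = weighted_mean(y[in_sample], pi[in_sample])

print(f"population mean: {pop_mean:.3f}")
print(f"naive mean:      {naive_mean:.3f}")
print(f"weighted mean:   {psa_mean:.3f}")
```

In practice the propensities are unknown, and because they depend on the study variable they cannot be estimated from auxiliary information alone; that is the gap the two-phase sampling experiment is designed to fill.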

Article information

Ann. Appl. Stat., Volume 8, Number 3 (2014), 1492-1515.

First available in Project Euclid: 23 October 2014

Keywords: Leverage-saliency theory, measurement error models, nonignorable nonresponse, survey sampling, voluntary sampling


Chen, Sixia; Kim, Jae-Kwang. Two-phase sampling experiment for propensity score estimation in self-selected samples. Ann. Appl. Stat. 8 (2014), no. 3, 1492–1515. doi:10.1214/14-AOAS746.



