Statistical Science

Introduction to the Design and Analysis of Complex Survey Data

Chris Skinner and Jon Wakefield



We give a brief overview of common sampling designs used in a survey setting, and introduce the principal inferential paradigms under which data from complex surveys may be analyzed. In particular, we distinguish between design-based, model-based and model-assisted approaches. Simple examples highlight the key differences between the approaches. We discuss the interplay between inferential approaches and targets of inference and the important issue of variance estimation.
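As an illustration of the distinction drawn above, the following minimal sketch contrasts a design-based weighted estimate with an unweighted mean that ignores the design. It is not taken from the article: the two-stratum population, the sample sizes, and all function names are hypothetical and chosen only for illustration. The unweighted mean stands in for a naive analysis that assumes a single common mean; a model-based analysis that included the stratum in the model would behave differently.

```python
import random


def horvitz_thompson_total(values, inclusion_probs):
    """Design-based estimate of a population total: sum of y_i / pi_i."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


def weighted_mean(values, inclusion_probs):
    """Weighted (Hajek-type) mean using weights w_i = 1 / pi_i."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def unweighted_mean(values):
    """Sample mean that ignores the sampling design entirely."""
    return sum(values) / len(values)


# Hypothetical population (assumption): a large stratum with low values
# and a small stratum with high values.
rng = random.Random(0)
stratum_a = [rng.gauss(10, 1) for _ in range(900)]
stratum_b = [rng.gauss(50, 1) for _ in range(100)]
population_size = len(stratum_a) + len(stratum_b)
population_mean = (sum(stratum_a) + sum(stratum_b)) / population_size

# Stratified simple random sampling with equal sample sizes, so the small
# stratum is heavily oversampled.
n_a, n_b = 30, 30
sample_a = rng.sample(stratum_a, n_a)
sample_b = rng.sample(stratum_b, n_b)
values = sample_a + sample_b
inclusion_probs = [n_a / len(stratum_a)] * n_a + [n_b / len(stratum_b)] * n_b

ht_mean = horvitz_thompson_total(values, inclusion_probs) / population_size
design_weighted = weighted_mean(values, inclusion_probs)
naive = unweighted_mean(values)

print(f"population mean : {population_mean:.2f}")
print(f"weighted mean   : {design_weighted:.2f}")
print(f"unweighted mean : {naive:.2f}")
```

In this setup the weights sum to the population size, so the weighted mean coincides with the Horvitz-Thompson total divided by the population size, while the unweighted mean is pulled toward the oversampled stratum.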

Article information

Statist. Sci., Volume 32, Number 2 (2017), 165-175.

First available in Project Euclid: 11 May 2017


Keywords: design-based inference, model-assisted inference, model-based inference, weights, variance estimation


Skinner, Chris; Wakefield, Jon. Introduction to the Design and Analysis of Complex Survey Data. Statist. Sci. 32 (2017), no. 2, 165–175. doi:10.1214/17-STS614.


