## The Annals of Statistics

### Coupling methods for multistage sampling

Guillaume Chauvet

#### Abstract

Multistage sampling is commonly used for household surveys when there exists no sampling frame, or when the population is scattered over a wide area. Multistage sampling usually introduces a complex dependence in the selection of the final units, which makes asymptotic results quite difficult to prove. In this work, we consider multistage sampling with simple random without replacement sampling at the first stage, and with an arbitrary sampling design for further stages. We consider coupling methods to link this sampling design to sampling designs where the primary sampling units are selected independently. We first generalize a method introduced by Hájek [Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 (1960) 361–374] to get a coupling with multistage sampling and Bernoulli sampling at the first stage, which leads to a central limit theorem for the Horvitz–Thompson estimator. We then introduce a new coupling method with multistage sampling and simple random with replacement sampling at the first stage. When the first-stage sampling fraction tends to zero, this method is used to prove consistency of a with-replacement bootstrap for simple random without replacement sampling at the first stage, and consistency of bootstrap variance estimators for smooth functions of totals.
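As a rough illustration of the coupling idea mentioned in the abstract, the sketch below shows the classical single-stage construction usually attributed to Hájek (1960): a Bernoulli sample with inclusion probability n/N is drawn first, and units are then removed from it or added to it at random until exactly n units remain. It is not the multistage generalization developed in the paper. The function names `hajek_coupling` and `horvitz_thompson`, the population size, and the study variable `y` are assumptions made for this example only.

```python
import random


def hajek_coupling(N, n, rng):
    """Draw a coupled pair (Bernoulli sample, SRSWOR sample) of unit indices.

    Hypothetical helper for illustration: single-stage coupling only.
    """
    p = n / N
    # Bernoulli sampling: each unit is selected independently with probability n/N.
    bern = {i for i in range(N) if rng.random() < p}
    srs = set(bern)
    if len(srs) > n:
        # Too many units: drop the surplus, chosen uniformly without replacement.
        for i in rng.sample(sorted(srs), len(srs) - n):
            srs.remove(i)
    elif len(srs) < n:
        # Too few units: add the shortfall, chosen uniformly from unselected units.
        outside = [i for i in range(N) if i not in srs]
        srs.update(rng.sample(outside, n - len(srs)))
    return bern, srs


def horvitz_thompson(sample, y, pi):
    """Horvitz-Thompson estimator of the total of y with a common inclusion probability pi."""
    return sum(y[i] / pi for i in sample)


if __name__ == "__main__":
    rng = random.Random(2015)
    N, n = 10_000, 500
    y = [rng.gauss(50.0, 10.0) for _ in range(N)]  # made-up study variable
    bern, srs = hajek_coupling(N, n, rng)
    print("units on which the two samples differ:", len(bern ^ srs))
    print("HT estimate (Bernoulli):", horvitz_thompson(bern, y, n / N))
    print("HT estimate (SRSWOR):   ", horvitz_thompson(srs, y, n / N))
    print("true total:             ", sum(y))
```

By construction the two samples are nested and differ on exactly |n_B - n| units, where n_B is the Bernoulli sample size, so their Horvitz–Thompson estimators tend to stay close. This closeness is what lets a limit result for independent selection carry over to the fixed-size design.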

#### Article information

Source
Ann. Statist., Volume 43, Number 6 (2015), 2484–2506.

Dates
Revised: May 2015
First available in Project Euclid: 7 October 2015

https://projecteuclid.org/euclid.aos/1444222082

Digital Object Identifier
doi:10.1214/15-AOS1348

Mathematical Reviews number (MathSciNet)
MR3405601

Zentralblatt MATH identifier
1331.62071

Subjects
Primary: 62D05: Sampling theory, sample surveys
Secondary: 62E20: Asymptotic distribution theory; 62G09: Resampling methods

#### Citation

Chauvet, Guillaume. Coupling methods for multistage sampling. Ann. Statist. 43 (2015), no. 6, 2484–2506. doi:10.1214/15-AOS1348. https://projecteuclid.org/euclid.aos/1444222082

#### References

• [1] Antal, E. and Tillé, Y. (2011). A direct bootstrap method for complex sampling designs from a finite population. J. Amer. Statist. Assoc. 106 534–543.
• [2] Beaumont, J.-F. and Patak, Z. (2012). On the generalized bootstrap for sample surveys with special attention to Poisson sampling. Int. Stat. Rev. 80 127–148.
• [3] Bertail, P. and Combris, P. (1997). Bootstrap généralisé d’un sondage. Ann. Econom. Statist. 46 49–83.
• [4] Bickel, P. J. and Freedman, D. A. (1981). Some asymptotic theory for the bootstrap. Ann. Statist. 9 1196–1217.
• [5] Brändén, P. and Jonasson, J. (2012). Negative dependence in sampling. Scand. J. Statist. 39 830–838.
• [6] Chauvet, G. (2012). On a characterization of ordered pivotal sampling. Bernoulli 18 1320–1340.
• [7] Chauvet, G. (2015). Supplement to “Coupling methods for multistage sampling.” DOI:10.1214/15-AOS1348SUPP.
• [8] Chen, J. and Rao, J. N. K. (2007). Asymptotic normality under two-phase sampling designs. Statist. Sinica 17 1047–1064.
• [9] Cochran, W. G. (1977). Sampling Techniques, 3rd ed. Wiley, New York.
• [10] Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics 1. Cambridge Univ. Press, Cambridge.
• [11] Davison, A. C. and Sardy, S. (2007). Resampling variance estimation in surveys with missing data. J. Off. Statist. 23 371–386.
• [12] Ezzati, T. M., Hoffman, K., Judkins, D. R., Massey, J. T. and Moore, T. F. (1992). Sample design: Third national health and nutrition examination survey. Vital Health Stat. 113 1–35.
• [13] Fuller, W. A. (2011). Sampling Statistics. Wiley, New York.
• [14] Funaoka, F., Saigo, H., Sitter, R. R. and Toida, T. (2006). Bernoulli bootstrap for stratified multistage sampling. Surv. Methodol. 32 151–156.
• [15] Gordon, L. (1983). Successive sampling in large finite populations. Ann. Statist. 11 702–706.
• [16] Hájek, J. (1960). Limiting distributions in simple random sampling from a finite population. Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 361–374.
• [17] Hájek, J. (1961). Some extensions of the Wald–Wolfowitz–Noether theorem. Ann. Math. Statist. 32 506–523.
• [18] Hájek, J. (1964). Asymptotic theory of rejective sampling with varying probabilities from a finite population. Ann. Math. Statist. 35 1491–1523.
• [19] Hansen, M. H. and Hurwitz, W. N. (1943). On the theory of sampling from finite populations. Ann. Math. Statist. 14 333–362.
• [20] Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 663–685.
• [21] Isaki, C. T. and Fuller, W. A. (1982). Survey design under the regression superpopulation model. J. Amer. Statist. Assoc. 77 89–96.
• [22] Krewski, D. and Rao, J. N. K. (1981). Inference from stratified samples: Properties of the linearization, jackknife and balanced repeated replication methods. Ann. Statist. 9 1010–1019.
• [23] Lahiri, P. (2003). On the impact of bootstrap in survey sampling and small-area estimation. Statist. Sci. 18 199–210.
• [24] Lin, C. D., Lu, W. W., Rust, K. and Sitter, R. R. (2013). Replication variance estimation in unequal probability sampling without replacement: One-stage and two-stage. Canad. J. Statist. 41 696–716.
• [25] Mallows, C. L. (1972). A note on asymptotic joint normality. Ann. Math. Statist. 43 508–515.
• [26] Narain, R. D. (1951). On sampling without replacement with varying probabilities. J. Indian Soc. Agricultural Statist. 3 169–174.
• [27] Nigam, A. K. and Rao, J. N. K. (1996). On balanced bootstrap for stratified multistage samples. Statist. Sinica 6 199–214.
• [28] Ohlsson, E. (1986). Asymptotic normality of the Rao–Hartley–Cochran estimator: An application of the martingale CLT. Scand. J. Statist. 13 17–28.
• [29] Ohlsson, E. (1989). Asymptotic normality for two-stage sampling from a finite population. Probab. Theory Related Fields 81 341–352.
• [30] Preston, J. (2009). Rescaled bootstrap for stratified multistage sampling. Surv. Methodol. 35 227–234.
• [31] Rao, J. N. K., Wu, C. F. J. and Yue, K. (1992). Some recent work on resampling methods for complex surveys. Surv. Methodol. 18 209–217.
• [32] Rao, J. N. K., Hartley, H. O. and Cochran, W. G. (1962). On a simple procedure of unequal probability sampling without replacement. J. Roy. Statist. Soc. Ser. B 24 482–491.
• [33] Rao, J. N. K. and Wu, C.-F. J. (1988). Resampling inference with complex survey data. J. Amer. Statist. Assoc. 83 231–241.
• [34] Rosén, B. (1972). Asymptotic theory for successive sampling with varying probabilities without replacement. I, II. Ann. Math. Statist. 43 373–397, 748–776.
• [35] Saegusa, T. and Wellner, J. A. (2013). Weighted likelihood estimation under two-phase sampling. Ann. Statist. 41 269–295.
• [36] Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer, New York.
• [37] Sen, P. K. (1980). Limit theorems for an extended coupon collector’s problem and for successive subsampling with varying probabilities. Calcutta Statist. Assoc. Bull. 29 113–132.
• [38] Shao, J. and Tu, D. S. (1995). The Jackknife and Bootstrap. Springer, New York.
• [39] Thorisson, H. (2000). Coupling, Stationarity, and Regeneration. Springer, New York.

#### Supplemental materials

• Supplement to “Coupling methods for multistage sampling”. The supplement [7] contains additional proofs of Propositions in Section 1, and additional simulation results in Section 2.