## The Annals of Probability

### Central limit theorems and bootstrap in high dimensions

#### Abstract

This paper derives central limit and bootstrap theorems for probabilities that sums of centered high-dimensional random vectors hit hyperrectangles and sparsely convex sets. Specifically, we derive Gaussian and bootstrap approximations for probabilities $\mathrm{P}(n^{-1/2}\sum_{i=1}^{n}X_{i}\in A)$ where $X_{1},\dots,X_{n}$ are independent random vectors in $\mathbb{R}^{p}$ and $A$ is a hyperrectangle, or more generally, a sparsely convex set, and show that the approximation error converges to zero even if $p=p_{n}\to\infty$ as $n\to\infty$ and $p\gg n$; in particular, $p$ can be as large as $O(e^{Cn^{c}})$ for some constants $c,C>0$. The result holds uniformly over all hyperrectangles, or more generally, sparsely convex sets, and does not require any restriction on the correlation structure among coordinates of $X_{i}$. Sparsely convex sets are sets that can be represented as intersections of many convex sets whose indicator functions depend only on a small subset of their arguments, with hyperrectangles being a special case.
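As a rough illustration of the kind of statement above, the following sketch checks by simulation how well a bootstrap critical value calibrates the probability that $n^{-1/2}\sum_{i=1}^{n}X_{i}$ falls in the symmetric hyperrectangle $[-t,t]^{p}$ when $p>n$. The abstract does not say which bootstrap scheme is used, so the Gaussian multiplier bootstrap here is an assumption. The function names (`multiplier_bootstrap_quantile`, `coverage`), the uniform data-generating process, and all parameter values are hypothetical choices made for illustration, not taken from the paper.

```python
import numpy as np


def multiplier_bootstrap_quantile(X, alpha=0.05, n_boot=200, seed=None):
    """Bootstrap estimate of the (1 - alpha) quantile of max_j |n^{-1/2} sum_i X_ij|.

    Assumption: a Gaussian multiplier bootstrap, i.e. each centered row of X
    is reweighted by an independent N(0, 1) multiplier.
    """
    rng = np.random.default_rng(seed)  # also accepts an existing Generator
    n, _ = X.shape
    Xc = X - X.mean(axis=0)                 # center each coordinate at its sample mean
    e = rng.standard_normal((n_boot, n))    # one row of multipliers per bootstrap draw
    S_boot = e @ Xc / np.sqrt(n)            # shape (n_boot, p)
    T_boot = np.abs(S_boot).max(axis=1)     # sup-norm of each bootstrap draw
    return float(np.quantile(T_boot, 1.0 - alpha))


def coverage(n=100, p=500, reps=100, alpha=0.05, seed=0):
    """Fraction of replications in which n^{-1/2} sum_i X_i lies in [-t, t]^p,
    where t is the bootstrap critical value computed from the same sample."""
    rng = np.random.default_rng(seed)
    half_width = np.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance 1
    hits = 0
    for _ in range(reps):
        X = rng.uniform(-half_width, half_width, size=(n, p))
        t = multiplier_bootstrap_quantile(X, alpha=alpha, seed=rng)
        S = X.sum(axis=0) / np.sqrt(n)
        hits += bool(np.abs(S).max() <= t)  # event: S is in the hyperrectangle [-t, t]^p
    return hits / reps


if __name__ == "__main__":
    # With p = 500 and n = 100, the empirical coverage should be near 1 - alpha.
    print(coverage())
```

The event $\max_{j}|S_{j}|\le t$ is the same as $S\in[-t,t]^{p}$, which is why the sup-norm statistic is a convenient stand-in for a hyperrectangle probability. Bounded, symmetric data are used so that the check runs quickly at small $n$; heavier-tailed or skewed coordinates would typically need a larger $n$ for comparable accuracy.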

#### Article information

Source
Ann. Probab., Volume 45, Number 4 (2017), 2309-2352.

Dates
Received: April 2015
Revised: March 2016
First available in Project Euclid: 11 August 2017

Permanent link to this document
https://projecteuclid.org/euclid.aop/1502438428

Digital Object Identifier
doi:10.1214/16-AOP1113

Mathematical Reviews number (MathSciNet)
MR3693963

Zentralblatt MATH identifier
1377.60040

#### Citation

Chernozhukov, Victor; Chetverikov, Denis; Kato, Kengo. Central limit theorems and bootstrap in high dimensions. Ann. Probab. 45 (2017), no. 4, 2309–2352. doi:10.1214/16-AOP1113. https://projecteuclid.org/euclid.aop/1502438428
