The Annals of Statistics
- Ann. Statist.
- Volume 47, Number 6 (2019), 3438-3469.
Bootstrapping and sample splitting for high-dimensional, assumption-lean inference
Several new methods have been recently proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and the rest for inference. In this paper, we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-lean approach to inference and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension which we then use to assess the accuracy of regression inference. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. Finally, we elucidate an inference-prediction trade-off: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions.
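The splitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact procedure. It assumes a hypothetical selection rule (keep the `k` covariates most correlated with the response on the first half), an ordinary least-squares fit on the second half, and a pairs bootstrap with percentile intervals. The function name `split_and_bootstrap` and all of its parameters are illustrative.

```python
import numpy as np

def split_and_bootstrap(X, y, k=3, n_boot=500, alpha=0.1, seed=0):
    """Select k covariates on one half of the data, then bootstrap
    least-squares coefficients for those covariates on the other half."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    sel_idx, inf_idx = perm[: n // 2], perm[n // 2:]

    # Selection half. Assumed rule: largest absolute marginal correlation with y.
    Xs, ys = X[sel_idx], y[sel_idx]
    Xc = Xs - Xs.mean(axis=0)
    yc = ys - ys.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.sort(np.argsort(corr)[-k:])

    # Inference half: least squares on the selected covariates plus an intercept.
    Xi = np.column_stack([np.ones(len(inf_idx)), X[inf_idx][:, selected]])
    yi = y[inf_idx]
    beta_hat = np.linalg.lstsq(Xi, yi, rcond=None)[0]

    # Pairs bootstrap: resample rows of the inference half with replacement.
    m = len(yi)
    boot = np.empty((n_boot, k + 1))
    for b in range(n_boot):
        idx = rng.integers(0, m, size=m)
        boot[b] = np.linalg.lstsq(Xi[idx], yi[idx], rcond=None)[0]

    # Percentile intervals at level 1 - alpha, one per coefficient.
    lo = np.quantile(boot, alpha / 2, axis=0)
    hi = np.quantile(boot, 1 - alpha / 2, axis=0)

    # Drop the intercept so the outputs line up with `selected`.
    return selected, beta_hat[1:], lo[1:], hi[1:]
```

Because the selection step never sees the inference half, the intervals do not need to account for how the covariates were chosen. The cost is that both the selected model and the fit each use only half of the observations, which is one way to read the inference-prediction trade-off the abstract mentions.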
Received: April 2018
Revised: November 2018
First available in Project Euclid: 31 October 2019
Primary: 62F40: Bootstrap, jackknife and other resampling methods; 62F35: Robustness and adaptive procedures
Secondary: 62J05: Linear regression; 62G09: Resampling methods; 62G20: Asymptotic properties
Rinaldo, Alessandro; Wasserman, Larry; G’Sell, Max. Bootstrapping and sample splitting for high-dimensional, assumption-lean inference. Ann. Statist. 47 (2019), no. 6, 3438-3469. doi:10.1214/18-AOS1784. https://projecteuclid.org/euclid.aos/1572487399
- Supplement to “Bootstrapping and sample splitting for high-dimensional, assumption-lean inference”. This supplement provides additional material, including numerical examples, comments on other approaches, an alternative bootstrap approach, and algorithmic statements of the studied procedures. The supplement also includes proofs of many of the results stated in this paper.