The Annals of Statistics

Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework

Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, and Ying Wei

Abstract

In this paper, we develop procedures to construct simultaneous confidence bands for ${\tilde{p}}$ potentially infinite-dimensional parameters after model selection for general moment condition models, where ${\tilde{p}}$ is potentially much larger than the sample size of available data, $n$. This allows us to cover settings with functional response data where each of the ${\tilde{p}}$ parameters is a function. The procedure is based on the construction of score functions that approximately satisfy the Neyman orthogonality condition. The proposed simultaneous confidence bands rely on uniform central limit theorems for high-dimensional vectors (and not on Donsker arguments, as we allow for ${{\tilde{p}}\gg n}$). To construct the bands, we employ a multiplier bootstrap procedure, which is computationally efficient because it only involves resampling the estimated score functions (and does not require re-solving the high-dimensional optimization problems). We formally apply the general theory to inference on the regression coefficient process in the distribution regression model with a logistic link, where two implementations are analyzed in detail. Simulations and an application to real data illustrate the applicability of the results.
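
As a rough illustration of the multiplier bootstrap step described in the abstract, the sketch below builds simultaneous bands from a matrix of estimated scores. It is not the authors' implementation: the function name `multiplier_bootstrap_bands`, its arguments, and the assumption that each score column is already scaled as an influence function for its parameter (with standard error taken as the root mean square of the column divided by $\sqrt{n}$) are all hypothetical choices made for this example.

```python
import math
import random


def multiplier_bootstrap_bands(theta_hat, scores, alpha=0.05, n_boot=500, seed=0):
    """Sketch of sup-t bands via a Gaussian multiplier bootstrap.

    theta_hat: list of p point estimates (hypothetical input).
    scores:    n rows, each a list of p estimated score values, assumed to be
               scaled as influence functions with nonzero spread per column.
    Returns (critical_value, [(lower_j, upper_j), ...]).
    """
    rng = random.Random(seed)
    n, p = len(scores), len(theta_hat)

    # Root mean square of each score column (assumed scale estimate).
    sigma = [math.sqrt(sum(row[j] ** 2 for row in scores) / n) for j in range(p)]

    max_stats = []
    for _ in range(n_boot):
        # Fresh standard normal multipliers; the scores themselves are reused,
        # so no estimator is recomputed inside the loop.
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sums = [0.0] * p
        for weight, row in zip(xi, scores):
            for j in range(p):
                sums[j] += weight * row[j]
        # Maximum studentized statistic over all p coordinates.
        max_stats.append(
            max(abs(s) / (math.sqrt(n) * sg) for s, sg in zip(sums, sigma))
        )

    # Empirical (1 - alpha) quantile of the bootstrap maxima.
    max_stats.sort()
    index = min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)
    crit = max_stats[index]

    bands = []
    for est, sg in zip(theta_hat, sigma):
        half_width = crit * sg / math.sqrt(n)
        bands.append((est - half_width, est + half_width))
    return crit, bands
```

Each bootstrap draw costs only a weighted sum over the stored scores, which is the source of the computational saving the abstract mentions; a single critical value is then shared by all coordinates so that the bands hold simultaneously.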

Article information

Source
Ann. Statist., Volume 46, Number 6B (2018), 3643-3675.

Dates
Received: February 2016
Revised: October 2017
First available in Project Euclid: 11 September 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1536631286

Digital Object Identifier
doi:10.1214/17-AOS1671

Mathematical Reviews number (MathSciNet)
MR3852664

Zentralblatt MATH identifier
1407.62268

Subjects
Primary: 62-07: Data analysis
Secondary: 62H99: None of the above, but in this section

Keywords
Inference after model selection; moment condition models with a continuum of target parameters; Lasso and Post-Lasso with functional response data

Citation

Belloni, Alexandre; Chernozhukov, Victor; Chetverikov, Denis; Wei, Ying. Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework. Ann. Statist. 46 (2018), no. 6B, 3643-3675. doi:10.1214/17-AOS1671. https://projecteuclid.org/euclid.aos/1536631286


Supplemental materials

  • Supplement to “Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework”. The supplemental material contains additional proofs omitted in the main text, a discussion of the double selection method, a set of new results for $\ell_{1}$-penalized $M$-estimators with functional data, additional simulation results, and an empirical application.