The Annals of Statistics

Semiparametric efficiency bounds for high-dimensional models

Jana Janková and Sara van de Geer


Abstract

Asymptotic lower bounds for estimation play a fundamental role in assessing the quality of statistical procedures. In this paper, we propose a framework for obtaining semiparametric efficiency bounds for sparse high-dimensional models, where the dimension of the parameter is larger than the sample size. We adopt a semiparametric point of view: we concentrate on one-dimensional functions of a high-dimensional parameter. We follow two different approaches to reach the lower bounds: asymptotic Cramér–Rao bounds and Le Cam’s type of analysis. Both of these approaches allow us to define a class of asymptotically unbiased or “regular” estimators for which a lower bound is derived. Consequently, we show that certain estimators obtained by de-sparsifying (or de-biasing) an $\ell_{1}$-penalized M-estimator are asymptotically unbiased and achieve the lower bound on the variance: thus in this sense they are asymptotically efficient. The paper discusses in detail the linear regression model and the Gaussian graphical model.
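To make the de-sparsifying step described in the abstract concrete, the following is a minimal numerical sketch (not the authors' code) of the de-biased Lasso for one coordinate of a high-dimensional linear model, in the spirit of Zhang and Zhang (2014) and van de Geer et al. (2014): an l1-penalized estimator is corrected by a one-step update built from a nodewise-Lasso approximate inverse of the sample covariance. The helper names (nodewise_row, debiased_lasso_coord), the tuning constants and the crude noise estimate are illustrative assumptions, not the paper's choices.

import numpy as np
from sklearn.linear_model import Lasso

def nodewise_row(X, j, lam):
    """Row j of an approximate inverse of Sigma_hat = X'X/n via nodewise Lasso (illustrative)."""
    n, p = X.shape
    others = [k for k in range(p) if k != j]
    gamma = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, j]).coef_
    resid = X[:, j] - X[:, others] @ gamma
    tau2 = resid @ X[:, j] / n          # tau_j^2 = X_j'(X_j - X_{-j} gamma_j) / n
    theta = np.zeros(p)
    theta[j] = 1.0
    theta[others] = -gamma
    return theta / tau2

def debiased_lasso_coord(X, y, j, lam_beta=None, lam_node=None):
    """De-biased Lasso estimate of beta_j with an asymptotic standard error (sketch)."""
    n, p = X.shape
    lam_beta = lam_beta or np.sqrt(2 * np.log(p) / n)   # assumed universal-type tuning
    lam_node = lam_node or np.sqrt(2 * np.log(p) / n)
    beta_hat = Lasso(alpha=lam_beta, fit_intercept=False).fit(X, y).coef_
    theta_j = nodewise_row(X, j, lam_node)
    resid = y - X @ beta_hat
    # one-step bias correction: b_j = beta_hat_j + Theta_j' X'(y - X beta_hat) / n
    b_j = beta_hat[j] + theta_j @ (X.T @ resid) / n
    # crude noise-level estimate; the resulting variance matches the asymptotic form
    sigma_hat = np.sqrt(resid @ resid / max(n - np.count_nonzero(beta_hat), 1))
    se = sigma_hat * np.sqrt(theta_j @ (X.T @ X / n) @ theta_j / n)
    return b_j, se

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 500                      # p > n: high-dimensional regime
    X = rng.standard_normal((n, p))
    beta = np.zeros(p); beta[:3] = [1.0, -0.5, 0.25]
    y = X @ beta + rng.standard_normal(n)
    b0, se0 = debiased_lasso_coord(X, y, j=0)
    print(f"de-biased estimate of beta_1: {b0:.3f} +/- {1.96 * se0:.3f}")

Under the sparsity and design conditions of the paper, such a one-step corrected estimator is asymptotically normal with variance attaining the semiparametric lower bound for the coordinate beta_j; the sketch above only illustrates the construction, not the regularity conditions.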

Article information

Source
Ann. Statist., Volume 46, Number 5 (2018), 2336-2359.

Dates
Received: June 2016
Revised: August 2017
First available in Project Euclid: 17 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1534492838

Digital Object Identifier
doi:10.1214/17-AOS1622

Mathematical Reviews number (MathSciNet)
MR3845020

Zentralblatt MATH identifier
06964335

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62F12: Asymptotic properties of estimators

Keywords
Asymptotic efficiency; high-dimensional; sparsity; Lasso; linear regression; graphical models; Cramér–Rao bound; Le Cam’s lemma

Citation

Janková, Jana; van de Geer, Sara. Semiparametric efficiency bounds for high-dimensional models. Ann. Statist. 46 (2018), no. 5, 2336--2359. doi:10.1214/17-AOS1622. https://projecteuclid.org/euclid.aos/1534492838



References

  • Bellec, P. and Tsybakov, A. B. (2016). Bounds on the prediction error of penalized least squares estimators with convex penalty. Available at arXiv:1609.06675.
  • Bickel, P. J., Klaassen, C. A., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Springer, Berlin.
  • Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, Heidelberg.
  • Cai, T. T. and Guo, Z. (2017). Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity. Ann. Statist. 45 615–646.
  • Chernozhukov, V., Hansen, C. and Spindler, M. (2015). Valid post-selection and post-regularization inference: An elementary, general approach. Ann. Rev. Econ. 7 649–688.
  • Collier, O., Comminges, L. and Tsybakov, A. B. (2015). Minimax estimation of linear and quadratic functionals on sparsity classes. Available at arXiv:1502.00665.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical Lasso. Biostatistics 9 432–441.
  • Gao, C., Ma, Z. and Zhou, H. H. (2017). Sparse CCA: Adaptive estimation and computational barriers. Ann. Statist. 45 2074–2101.
  • Janková, J. and van de Geer, S. (2015). Confidence intervals for high-dimensional inverse covariance estimation. Electron. J. Stat. 9 1205–1229.
  • Janková, J. and van de Geer, S. (2017). Honest confidence regions and optimality for high-dimensional precision matrix estimation. TEST 26 143–162.
  • Janková, J. and van de Geer, S. (2018). Supplement to “Semiparametric efficiency bounds for high-dimensional models.” DOI:10.1214/17-AOS1622SUPP.
  • Javanmard, A. and Montanari, A. (2014a). Confidence intervals and hypothesis testing for high-dimensional regression. J. Mach. Learn. Res. 15 2869–2909.
  • Javanmard, A. and Montanari, A. (2014b). Hypothesis testing in high-dimensional regression under the Gaussian random design model: Asymptotic theory. IEEE Trans. Inform. Theory 60 6522–6554.
  • Javanmard, A. and Montanari, A. (2015). De-biasing the Lasso: Optimal sample size for Gaussian designs. Available at arXiv:1508.02757.
  • Knight, K. and Fu, W. (2000). Asymptotics for Lasso-type estimators. Ann. Statist. 28 1356–1378.
  • Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 1436–1462.
  • Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 246–270.
  • Ren, Z., Sun, T., Zhang, C.-H. and Zhou, H. H. (2015). Asymptotic normality and optimalities in estimation of large Gaussian graphical models. Ann. Statist. 43 991–1026.
  • van de Geer, S. (2016). Estimation and Testing under Sparsity. Springer, Berlin.
  • van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166–1202.
  • van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
  • Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low-dimensional parameters in high-dimensional linear models. J. Roy. Statist. Soc. Ser. B 76 217–242.

Supplemental materials

  • Supplement to “Semiparametric efficiency bounds for high-dimensional models”. The supplementary material contains proofs.