## The Annals of Statistics

### Semiparametric efficiency bounds for high-dimensional models

#### Abstract

Asymptotic lower bounds for estimation play a fundamental role in assessing the quality of statistical procedures. In this paper, we propose a framework for obtaining semiparametric efficiency bounds for sparse high-dimensional models in which the dimension of the parameter exceeds the sample size. We adopt a semiparametric point of view: we concentrate on one-dimensional functions of a high-dimensional parameter. We follow two different approaches to reach the lower bounds: asymptotic Cramér–Rao bounds and a Le Cam-type analysis. Both approaches allow us to define a class of asymptotically unbiased, or “regular,” estimators for which a lower bound is derived. We then show that certain estimators obtained by de-sparsifying (or de-biasing) an $\ell_{1}$-penalized M-estimator are asymptotically unbiased and achieve the lower bound on the variance; in this sense they are asymptotically efficient. The paper discusses in detail the linear regression model and the Gaussian graphical model.
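For the linear regression model discussed in the abstract, the de-sparsified (de-biased) Lasso adds a one-step correction $\hat b = \hat\beta + \hat\Theta X^{\top}(y - X\hat\beta)/n$ to the $\ell_{1}$-penalized estimator $\hat\beta$, where $\hat\Theta$ approximates the inverse of $\hat\Sigma = X^{\top}X/n$. The sketch below is not the authors' code: it is a minimal low-dimensional ($p < n$) illustration in which $\hat\Theta$ is taken to be the exact inverse, whereas in the $p > n$ regime treated in the paper $\hat\Theta$ would be estimated, for example by nodewise Lasso.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_norm = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]            # remove j-th contribution
            rho = X[:, j] @ resid / n
            # soft-thresholding update
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_norm[j]
            resid -= X[:, j] * beta[j]            # add it back
    return beta

def debiased_lasso(X, y, lam):
    """One-step de-biasing of the Lasso via a surrogate inverse covariance."""
    n = X.shape[0]
    beta = lasso_cd(X, y, lam)
    sigma_hat = X.T @ X / n
    theta_hat = np.linalg.inv(sigma_hat)          # nodewise Lasso when p > n
    return beta + theta_hat @ X.T @ (y - X @ beta) / n

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

b_db = debiased_lasso(X, y, lam=0.1)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# With the exact inverse, the correction removes the Lasso shrinkage bias
# entirely and reproduces the unbiased least-squares solution.
print(np.max(np.abs(b_db - b_ols)))
```

When $\hat\Theta = \hat\Sigma^{-1}$ exactly, the correction satisfies $\hat b = (X^{\top}X)^{-1}X^{\top}y$ for any initial $\hat\beta$, so the demo recovers ordinary least squares. The point of the construction in the high-dimensional setting is that an approximate $\hat\Theta$ still yields componentwise asymptotically normal estimators whose variance attains the efficiency bound.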

#### Article information

Source
Ann. Statist., Volume 46, Number 5 (2018), 2336–2359.

Dates
Revised: August 2017
First available in Project Euclid: 17 August 2018

https://projecteuclid.org/euclid.aos/1534492838

Digital Object Identifier
doi:10.1214/17-AOS1622

Mathematical Reviews number (MathSciNet)
MR3845020

Zentralblatt MATH identifier
06964335

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62F12: Asymptotic properties of estimators

#### Citation

Janková, Jana; van de Geer, Sara. Semiparametric efficiency bounds for high-dimensional models. Ann. Statist. 46 (2018), no. 5, 2336–2359. doi:10.1214/17-AOS1622. https://projecteuclid.org/euclid.aos/1534492838

#### References

• Bellec, P. and Tsybakov, A. B. (2016). Bounds on the prediction error of penalized least squares estimators with convex penalty. Available at arXiv:1609.06675.
• Bickel, P. J., Klaassen, C. A., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Springer, Berlin.
• Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, Heidelberg.
• Cai, T. T. and Guo, Z. (2017). Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity. Ann. Statist. 45 615–646.
• Chernozhukov, V., Hansen, C. and Spindler, M. (2015). Valid post-selection and post-regularization inference: An elementary, general approach. Ann. Rev. Econ. 7 649–688.
• Collier, O., Comminges, L. and Tsybakov, A. B. (2015). Minimax estimation of linear and quadratic functionals on sparsity classes. Available at arXiv:1502.00665.
• Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical Lasso. Biostatistics 9 432–441.
• Gao, C., Ma, Z. and Zhou, H. H. (2017). Sparse CCA: Adaptive estimation and computational barriers. Ann. Statist. 45 2074–2101.
• Janková, J. and van de Geer, S. (2015). Confidence intervals for high-dimensional inverse covariance estimation. Electron. J. Stat. 9 1205–1229.
• Janková, J. and van de Geer, S. (2017). Honest confidence regions and optimality for high-dimensional precision matrix estimation. TEST 26 143–162.
• Janková, J. and van de Geer, S. (2018). Supplement to “Semiparametric efficiency bounds for high-dimensional models.” DOI:10.1214/17-AOS1622SUPP.
• Javanmard, A. and Montanari, A. (2014a). Confidence intervals and hypothesis testing for high-dimensional regression. J. Mach. Learn. Res. 15 2869–2909.
• Javanmard, A. and Montanari, A. (2014b). Hypothesis testing in high-dimensional regression under the Gaussian random design model: Asymptotic theory. IEEE Trans. Inform. Theory 60 6522–6554.
• Javanmard, A. and Montanari, A. (2015). De-biasing the Lasso: Optimal sample size for Gaussian designs. Available at arXiv:1508.02757.
• Knight, K. and Fu, W. (2000). Asymptotics for Lasso-type estimators. Ann. Statist. 28 1356–1378.
• Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 1436–1462.
• Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 246–270.
• Ren, Z., Sun, T., Zhang, C.-H. and Zhou, H. H. (2015). Asymptotic normality and optimalities in estimation of large Gaussian graphical models. Ann. Statist. 43 991–1026.
• van de Geer, S. (2016). Estimation and Testing under Sparsity. Springer, Berlin.
• van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166–1202.
• van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
• Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low-dimensional parameters in high-dimensional linear models. J. Roy. Statist. Soc. Ser. B 76 217–242.

#### Supplemental materials

• Supplement to “Semiparametric efficiency bounds for high-dimensional models”. The supplementary material contains proofs.