Electronic Journal of Statistics

Uniformly valid confidence sets based on the Lasso

Karl Ewald and Ulrike Schneider

Full-text: Open access

Abstract

In a linear regression model of fixed dimension $p\leq n$, we construct confidence regions for the unknown parameter vector based on the Lasso estimator that uniformly and exactly hold the prescribed coverage in finite samples as well as in an asymptotic setup. We thereby quantify estimation uncertainty as well as the “post-model selection error” of this estimator. More concretely, in finite samples with Gaussian errors and asymptotically in the case where the Lasso estimator is tuned to perform conservative model selection, we derive exact formulas for computing the minimal coverage probability over the entire parameter space for a large class of shapes for the confidence sets, thus enabling the construction of valid confidence regions based on the Lasso estimator in these settings. The choice of shape for the confidence sets and the comparison with the confidence ellipse based on the least-squares estimator are also discussed. Moreover, in the case where the Lasso estimator is tuned to enable consistent model selection, we give a simple confidence region with minimal coverage probability converging to one. Finally, we also treat the case of unknown error variance and present some ideas for extensions.

Article information

Source
Electron. J. Statist., Volume 12, Number 1 (2018), 1358-1387.

Dates
Received: October 2016
First available in Project Euclid: 14 May 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1526284830

Digital Object Identifier
doi:10.1214/18-EJS1425

Mathematical Reviews number (MathSciNet)
MR3802261

Zentralblatt MATH identifier
06875403

Subjects
Primary: 62F25: Tolerance and confidence regions
Secondary: 62J05: Linear regression
62J07: Ridge regression; shrinkage estimators

Keywords
Sparsity; confidence region; valid inference

Rights
Creative Commons Attribution 4.0 International License.

Citation

Ewald, Karl; Schneider, Ulrike. Uniformly valid confidence sets based on the Lasso. Electron. J. Statist. 12 (2018), no. 1, 1358–1387. doi:10.1214/18-EJS1425. https://projecteuclid.org/euclid.ejs/1526284830

