Minimax risks for sparse regressions: Ultra-high dimensional phenomenons

Nicolas Verzelen

doi:10.1214/12-EJS666

Electron. J. Statist. 6: 38-90 (2012). DOI: 10.1214/12-EJS666

Abstract

Consider the standard Gaussian linear regression model Y=Xθ₀+ε, where Y∈ℝⁿ is a response vector and X∈ℝ^(n×p) is a design matrix. Numerous works have been devoted to building efficient estimators of θ₀ when p is much larger than n. In such a situation, a classical approach amounts to assuming that θ₀ is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of k-sparse vectors θ₀. These bounds shed light on the limitations due to high dimensionality. The results encompass the problem of prediction (estimation of Xθ₀), the inverse problem (estimation of θ₀) and linear testing (testing Xθ₀=0). Interestingly, an elbow effect occurs when the quantity k log(p/k) becomes large compared to n. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high dimensional setting. We also prove that even dimension reduction techniques cannot provide satisfying results in an ultra-high dimensional setting. Moreover, we compute the minimax risks when the variance of the noise is unknown. The knowledge of this variance is shown to play a significant role in the optimal rates of estimation and testing. All these minimax bounds provide a characterization of statistical problems that are so difficult that no procedure can provide satisfying results.
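
As a rough illustration of the comparison the abstract describes, the sketch below computes k log(p/k) and its ratio to n for a few problem sizes. The function names, the example sizes, and the use of the natural logarithm are assumptions made here for illustration. The abstract states no numeric cutoff, so the ratio only indicates whether k log(p/k) is small or large relative to n; it is not a threshold taken from the paper.

```python
import math


def sparsity_complexity(k: int, p: int) -> float:
    """Return k * log(p / k), the quantity the abstract compares with n.

    The natural logarithm is an assumption of this sketch.
    """
    if not 1 <= k <= p:
        raise ValueError("need 1 <= k <= p")
    return k * math.log(p / k)


def complexity_ratio(n: int, k: int, p: int) -> float:
    """Return k * log(p / k) / n; larger values mean a harder regime."""
    if n <= 0:
        raise ValueError("need n >= 1")
    return sparsity_complexity(k, p) / n


if __name__ == "__main__":
    # Hypothetical (n, k, p) triples, chosen only to contrast two regimes.
    for n, k, p in [(200, 5, 1_000), (50, 10, 100_000)]:
        ratio = complexity_ratio(n, k, p)
        print(f"n={n}, k={k}, p={p}: k*log(p/k)/n = {ratio:.2f}")
```

A ratio well below 1 corresponds to the setting where k log(p/k) is small compared to n. A ratio that is large compared to 1 corresponds to the ultra-high dimensional setting in which, according to the abstract, the minimax risks and separation distances blow up.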

Citation

Nicolas Verzelen. "Minimax risks for sparse regressions: Ultra-high dimensional phenomenons." Electron. J. Statist. 6 38 - 90, 2012. https://doi.org/10.1214/12-EJS666