Open Access
Recovery of weak signal in high dimensional linear regression by data perturbation
Yongli Zhang
Electron. J. Statist. 11(2): 3226-3250 (2017). DOI: 10.1214/17-EJS1320

Abstract

Recovering weak signals (i.e., small nonzero regression coefficients) is a difficult task in high dimensional feature selection problems. Both convex and nonconvex regularization methods fail to fully recover the true model whenever the design matrix has strong columnwise correlations or some nonzero coefficients fall below a certain threshold. To address these two challenges, we propose a procedure, Perturbed LASSO (PLA), that weakens correlations in the design matrix and strengthens signals by adding random perturbations to the design matrix. Moreover, we derive a quantitative relationship between the selection accuracy and the computing cost of PLA. We prove theoretically and demonstrate in simulations that PLA substantially improves the chance of recovering weak signals and outperforms comparable methods at a limited computational cost.
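
As a rough illustration of only the first claim above (that random perturbation weakens columnwise correlations), the following sketch adds independent Gaussian noise to an equicorrelated design matrix and compares the largest pairwise column correlation before and after. It is not the PLA procedure from the paper, whose details are not given in this abstract. The perturbation form `X + tau * Z`, the noise scale `tau`, the helper `max_abs_offdiag_corr`, and the simulated design are all assumptions made for illustration, and the sketch says nothing about signal strengthening or selection accuracy.

```python
import numpy as np


def max_abs_offdiag_corr(X):
    """Largest absolute sample correlation between two distinct columns of X."""
    corr = np.corrcoef(X, rowvar=False)
    off_diag = corr - np.eye(corr.shape[0])  # zero out the unit diagonal
    return float(np.max(np.abs(off_diag)))


rng = np.random.default_rng(0)
n, p, rho = 500, 5, 0.9  # sample size, number of columns, pairwise correlation

# Simulated design (an assumption): every pair of columns has correlation rho.
cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), cov, size=n)

# Hypothetical perturbation: independent standard Gaussian noise scaled by tau.
# With unit-variance columns, the population correlation shrinks from rho
# to rho / (1 + tau**2).
tau = 1.0
X_perturbed = X + tau * rng.standard_normal((n, p))

before = max_abs_offdiag_corr(X)
after = max_abs_offdiag_corr(X_perturbed)
print(f"max |corr| before perturbation: {before:.3f}")
print(f"max |corr| after perturbation:  {after:.3f}")
```

Larger values of `tau` reduce the correlation further but also bury more of the original design in noise, which is one reason a perturbation scheme would need a principled way to choose and aggregate perturbations.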

Citation

Yongli Zhang. "Recovery of weak signal in high dimensional linear regression by data perturbation." Electron. J. Statist. 11 (2) 3226 - 3250, 2017. https://doi.org/10.1214/17-EJS1320

Information

Received: 1 November 2016; Published: 2017
First available in Project Euclid: 25 September 2017

zbMATH: 1373.62373
MathSciNet: MR3705451
Digital Object Identifier: 10.1214/17-EJS1320

Subjects:
Primary: 62J07

Keywords: Beta-min condition, Data perturbation, high dimensional data, Irrepresentable Condition, Lasso, weak signal

Vol.11 • No. 2 • 2017