## The Annals of Statistics

- Ann. Statist.
- Volume 45, Number 5 (2017), 2133-2150.

### False discoveries occur early on the Lasso path

Weijie Su, Małgorzata Bogdan, and Emmanuel Candès

#### Abstract

In regression settings where explanatory variables have very low correlations and there are relatively few effects, each of large magnitude, we expect the Lasso to find the important variables with few errors, if any. This paper shows that in a regime of linear sparsity—meaning that the fraction of variables with a nonvanishing effect tends to a constant, however small—this cannot really be the case, even when the design variables are stochastically independent. We demonstrate that true features and null features are always interspersed on the Lasso path, and that this phenomenon occurs no matter how strong the effect sizes are. We derive a sharp asymptotic trade-off between false and true positive rates or, equivalently, between measures of type I and type II errors along the Lasso path. This trade-off states that if we ever want to achieve a type II error (false negative rate) under a critical value, then anywhere on the Lasso path the type I error (false positive rate) will need to exceed a given threshold so that we can never have both errors at a low level at the same time. Our analysis uses tools from approximate message passing (AMP) theory as well as novel elements to deal with a possibly adaptive selection of the Lasso regularizing parameter.
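To make the two error measures concrete, the following minimal sketch computes them at each step of a selection path. It assumes the usual definitions: the false discovery proportion (FDP) is the share of selected variables that are null, and the true positive proportion (TPP) is the share of true effects that have been selected. The function name `path_error_rates`, the entry order, and the support set are hypothetical illustrations rather than anything taken from the paper, and the sketch simplifies by assuming variables enter one at a time and never leave.

```python
from typing import List, Sequence, Set, Tuple


def path_error_rates(
    entry_order: Sequence[int], true_support: Set[int]
) -> List[Tuple[float, float]]:
    """Return (TPP, FDP) after each variable enters the path.

    Assumption (simplification): entry_order lists distinct variable
    indices in the order they become active, and none is later dropped.
    """
    k = max(len(true_support), 1)  # guard against an empty support
    rates = []
    true_hits = 0
    for step, j in enumerate(entry_order, start=1):
        if j in true_support:
            true_hits += 1
        false_hits = step - true_hits
        # TPP: fraction of true effects found so far.
        # FDP: fraction of current selections that are nulls.
        rates.append((true_hits / k, false_hits / step))
    return rates


# Hypothetical example: variables 0..3 carry true effects, the rest are nulls.
true_support = {0, 1, 2, 3}
entry_order = [0, 5, 1, 2, 7, 3]  # nulls 5 and 7 enter between true signals

rates = path_error_rates(entry_order, true_support)
for step, (tpp, fdp) in enumerate(rates, start=1):
    print(f"step {step}: TPP={tpp:.2f}, FDP={fdp:.2f}")
```

In this toy ordering the FDP is already 0.5 at the second step, while the TPP is only 0.25, which is the kind of interspersing of null and true features that the abstract describes.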

#### Article information

**Source**

Ann. Statist., Volume 45, Number 5 (2017), 2133-2150.

**Dates**

Received: June 2016

Revised: September 2016

First available in Project Euclid: 31 October 2017

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1509436830

**Digital Object Identifier**

doi:10.1214/16-AOS1521

**Mathematical Reviews number (MathSciNet)**

MR3718164

**Zentralblatt MATH identifier**

06821121

**Subjects**

Primary: 62F03: Hypothesis testing

Secondary:

- 62J07: Ridge regression; shrinkage estimators
- 62J05: Linear regression

**Keywords**

Lasso, Lasso path, false discovery rate, false negative rate, power, approximate message passing (AMP), adaptive selection of parameters

#### Citation

Su, Weijie; Bogdan, Małgorzata; Candès, Emmanuel. False discoveries occur early on the Lasso path. Ann. Statist. 45 (2017), no. 5, 2133-2150. doi:10.1214/16-AOS1521. https://projecteuclid.org/euclid.aos/1509436830

#### Supplemental materials

- Supplement to “False discoveries occur early on the Lasso path”. The supplementary materials contain proofs of some technical results in this paper.

  Digital Object Identifier: doi:10.1214/16-AOS1521SUPP

  Supplemental files are immediately available to subscribers. Non-subscribers gain access to supplemental files with the purchase of the article.