## Electronic Journal of Statistics

### The Generalized Lasso Problem and Uniqueness

#### Abstract

We study uniqueness in the generalized lasso problem, where the penalty is the $\ell_1$ norm of a matrix $D$ times the coefficient vector. We derive a broad uniqueness result that places weak assumptions on the predictor matrix $X$ and penalty matrix $D$: if $D$ is fixed and its null space is not too large (its dimension is at most the number of samples), and $X$ and the response vector $y$ jointly follow an absolutely continuous distribution, then the generalized lasso problem has a unique solution almost surely, regardless of the number of predictors relative to the number of samples. This effectively generalizes previous uniqueness results for the lasso problem [32] (which corresponds to the special case $D=I$). Further, we extend our study to the case in which the loss is the negative log-likelihood from a generalized linear model. In addition to uniqueness results, we derive results on the local stability of generalized lasso solutions that may be of interest in their own right.
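For concreteness, the problem under study is the convex program

$$
\hat{\beta} \in \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \; \frac{1}{2} \| y - X\beta \|_2^2 + \lambda \| D\beta \|_1,
$$

where $X \in \mathbb{R}^{n \times p}$ is the predictor matrix, $y \in \mathbb{R}^n$ the response, $D \in \mathbb{R}^{m \times p}$ the penalty matrix, and $\lambda \geq 0$ a tuning parameter. Taking $D = I$ recovers the lasso [29], while difference-based choices of $D$ give the fused lasso [31] and trend filtering [14]. As a minimal numerical sketch (not from the paper), the problem can be handed to a generic convex solver; the instance below uses the cvxpy package, a hypothetical first-difference (1d fused lasso) penalty matrix, and $p > n$, and its null-space check mirrors the paper's condition that $\mathrm{dim}(\mathrm{null}(D)) \leq n$.

```python
import numpy as np
import cvxpy as cp

# Illustrative sizes only (not from the paper); note p > n is allowed.
rng = np.random.default_rng(0)
n, p, lam = 50, 100, 1.0
X = rng.standard_normal((n, p))  # (X, y) drawn from a continuous distribution,
y = rng.standard_normal(n)       # matching the almost-sure uniqueness setup

# 1d fused lasso penalty: D is the (p-1) x p first-difference matrix, so
# ||D beta||_1 penalizes |beta_{i+1} - beta_i|.
D = np.diff(np.eye(p), axis=0)

# The paper's null-space condition: null(D) is spanned by the all-ones
# vector here, so its dimension (1) is at most the number of samples n.
assert p - np.linalg.matrix_rank(D) <= n

# Generalized lasso: minimize (1/2)||y - X beta||_2^2 + lam * ||D beta||_1.
beta = cp.Variable(p)
obj = cp.Minimize(0.5 * cp.sum_squares(y - X @ beta) + lam * cp.norm1(D @ beta))
cp.Problem(obj).solve()
print(beta.value[:5])  # first few coordinates of a solution
```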

#### Article information

Source: Electron. J. Statist., Volume 13, Number 2 (2019), 2307–2347.

Dates: Received May 2018; first available in Project Euclid 9 July 2019.

Permanent link: https://projecteuclid.org/euclid.ejs/1562637626

Digital Object Identifier: doi:10.1214/19-EJS1569

#### Citation

Ali, Alnur; Tibshirani, Ryan J. The Generalized Lasso Problem and Uniqueness. Electron. J. Statist. 13 (2019), no. 2, 2307–2347. doi:10.1214/19-EJS1569. https://projecteuclid.org/euclid.ejs/1562637626

#### References

• [1] Samrachana Adhikari, Fabrizio Lecci, James T. Becker, Brian W. Junker, Lewis H. Kuller, Oscar L. Lopez, and Ryan J. Tibshirani. High-dimensional longitudinal classification with the multinomial fused lasso. Statistics in Medicine, 38(12):2184–2205, 2019.
• [2] A. Albert and J. A. Anderson. On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1):1–10, 1984.
• [3] Heinz H. Bauschke and Jonathan M. Borwein. Legendre functions and the method of random Bregman projections. Journal of Convex Analysis, 4(1):27–67, 1997.
• [4] Emmanuel J. Candes and Yaniv Plan. Near ideal model selection by $\ell_1$ minimization. Annals of Statistics, 37(5):2145–2177, 2009.
• [5] Emmanuel J. Candes and Pragya Sur. The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression. arXiv:1804.09753, 2018.
• [6] David L. Donoho. For most large underdetermined systems of linear equations, the minimal $\ell_1$ solution is also the sparsest solution. Communications on Pure and Applied Mathematics, 59(6):797–829, 2006.
• [7] Charles Dossal. A necessary and sufficient condition for exact sparse recovery by $\ell_1$ minimization. Comptes Rendus Mathematique, 350(1–2):117–120, 2012.
• [8] Lawrence C. Evans and Ronald F. Gariepy. Measure Theory and Fine Properties of Functions. CRC Press, 1992.
• [9] Jean Jacques Fuchs. Recovery of exact sparse representations in the presence of bounded noise. IEEE Transactions on Information Theory, 51(10):3601–3608, 2005.
• [10] Shelby J. Haberman. Log-linear models for frequency tables derived by indirect observation: Maximum likelihood equations. Annals of Statistics, 2(5):911–924, 1974.
• [11] Holger Hoefling. A path algorithm for the fused lasso signal approximator. Journal of Computational and Graphical Statistics, 19(4):984–1006, 2010.
• [12] Murray C. Kemp and Yoshio Kimura. Introduction to Mathematical Economics. Springer, 1978.
• [13] Mohammad Khabbazian, Ricardo Kriebel, Karl Rohe, and Cecile Ane. Fast and accurate detection of evolutionary shifts in Ornstein-Uhlenbeck models. Methods in Ecology and Evolution, 7:811–824, 2016.
• [14] Seung-Jean Kim, Kwangmoo Koh, Stephen Boyd, and Dimitry Gorinevsky. $\ell_1$ trend filtering. SIAM Review, 51(2):339–360, 2009.
• [15] Robert Lang. A note on the measurability of convex sets. Archiv der Mathematik, 47(1):90–92, 1986.
• [16] Jason Lee, Yuekai Sun, and Jonathan Taylor. On model selection consistency of M-estimators with geometrically decomposable penalties. Electronic Journal of Statistics, 9(1):608–642, 2015.
• [17] Oscar Hernan Madrid-Padilla and James Scott. Tensor decomposition with generalized lasso penalties. Journal of Computational and Graphical Statistics, 26(3):537–546, 2017.
• [18] Pertti Mattila. Geometry of Sets and Measures in Euclidean Spaces: Fractals and Rectifiability. Cambridge University Press, 1995.
• [19] Sangnam Nam, Mike E. Davies, Michael Elad, and Remi Gribonval. The cosparse analysis model and algorithms. Applied and Computational Harmonic Analysis, 34(1):30–56, 2013.
• [20] Michael Osborne, Brett Presnell, and Berwin Turlach. On the lasso and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337, 2000.
• [21] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, 1970.
• [22] R. Tyrrell Rockafellar and Roger J-B Wets. Variational Analysis. Springer, 2009.
• [23] Saharon Rosset, Ji Zhu, and Trevor Hastie. Boosting as a regularized path to a maximum margin classifier. Journal of Machine Learning Research, 5:941–973, 2004.
• [24] Leonid I. Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1):259–268, 1992.
• [25] Veeranjaneyulu Sadhanala and Ryan J. Tibshirani. Additive models via trend filtering. arXiv:1702.05037, 2017.
• [26] Veeranjaneyulu Sadhanala, Yu-Xiang Wang, James Sharpnack, and Ryan J. Tibshirani. Higher-order total variation classes on grids: Minimax theory and trend filtering methods. Advances in Neural Information Processing Systems, 30, 2017.
• [27] Ulrike Schneider and Karl Ewald. On the distribution, model selection properties and uniqueness of the lasso estimator in low and high dimensions. arXiv:1708.09608, 2017.
• [28] Gabriel Steidl, Stephan Didas, and Julia Neumann. Splines in higher order TV regularization. International Journal of Computer Vision, 70(3):214–255, 2006.
• [29] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1):267–288, 1996.
• [30] Robert Tibshirani and Pei Wang. Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics, 9(1):18–29, 2008.
• [31] Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B, 67(1):91–108, 2005.
• [32] Ryan J. Tibshirani. The lasso problem and uniqueness. Electronic Journal of Statistics, 7:1456–1490, 2013.
• [33] Ryan J. Tibshirani and Jonathan Taylor. The solution path of the generalized lasso. Annals of Statistics, 39(3):1335–1371, 2011.
• [34] Ryan J. Tibshirani and Jonathan Taylor. Degrees of freedom in lasso problems. Annals of Statistics, 40(2):1198–1232, 2012.
• [35] Martin J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (lasso). IEEE Transactions on Information Theory, 55(5):2183–2202, 2009.
• [36] Yu-Xiang Wang, James Sharpnack, Alex Smola, and Ryan J. Tibshirani. Trend filtering on graphs. Journal of Machine Learning Research, 17(105):1–41, 2016.
• [37] Bo Xin, Yoshinobu Kawahara, Yizhou Wang, and Wen Gao. Efficient generalized fused lasso and its application to the diagnosis of Alzheimer’s disease. AAAI Conference on Artificial Intelligence, 28, 2014.