Electronic Journal of Statistics

The Generalized Lasso Problem and Uniqueness

Alnur Ali and Ryan J. Tibshirani


Abstract

We study uniqueness in the generalized lasso problem, where the penalty is the $\ell _{1}$ norm of a matrix $D$ times the coefficient vector. We derive a broad result on uniqueness that places weak assumptions on the predictor matrix $X$ and penalty matrix $D$; the implication is that, if $D$ is fixed and its null space is not too large (the dimension of its null space is at most the number of samples), and $X$ and response vector $y$ jointly follow an absolutely continuous distribution, then the generalized lasso problem has a unique solution almost surely, regardless of the number of predictors relative to the number of samples. This effectively generalizes previous uniqueness results for the lasso problem [32] (which corresponds to the special case $D=I$). Further, we extend our study to the case in which the loss is given by the negative log-likelihood from a generalized linear model. In addition to uniqueness results, we derive results on the local stability of generalized lasso solutions that might be of interest in their own right.
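For reference, the problem studied in the article, in its squared-error form, is

$$\hat{\beta} \in \mathop{\mathrm{arg\,min}}_{\beta \in \mathbb{R}^{p}} \; \frac{1}{2}\|y - X\beta\|_{2}^{2} + \lambda \|D\beta\|_{1},$$

where $X \in \mathbb{R}^{n \times p}$ is the predictor matrix, $y \in \mathbb{R}^{n}$ is the response vector, $D \in \mathbb{R}^{m \times p}$ is the penalty matrix, and $\lambda \ge 0$ is a tuning parameter. Taking $D = I$ recovers the lasso [29, 32]; other choices of $D$ give, for example, the fused lasso [31] and trend filtering [14].

The short sketch below is purely illustrative and not taken from the article: it assumes a first-difference penalty matrix $D$ (the 1d fused lasso), a Gaussian draw of $(X, y)$, and the cvxpy solver, and it checks the null-space condition from the abstract (nullity of $D$ at most $n$) before solving one small instance.

    # Illustrative sketch (not from the article): build a first-difference D,
    # check the null-space condition nullity(D) <= n from the abstract, and
    # solve one generalized lasso instance. The sizes, lambda, and the use of
    # cvxpy are assumptions made here for demonstration only.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, p, lam = 20, 50, 1.0

    # First-difference operator: (D beta)_i = beta_{i+1} - beta_i, so null(D) = span{1}.
    D = np.eye(p - 1, p, k=1) - np.eye(p - 1, p)
    nullity = p - np.linalg.matrix_rank(D)
    print(f"nullity(D) = {nullity}, n = {n}, nullity(D) <= n: {nullity <= n}")

    # Draw (X, y) from a continuous distribution, as in the almost-sure uniqueness result.
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)

    # Generalized lasso: minimize (1/2)||y - X beta||_2^2 + lam * ||D beta||_1.
    beta = cp.Variable(p)
    objective = cp.Minimize(0.5 * cp.sum_squares(y - X @ beta) + lam * cp.norm1(D @ beta))
    cp.Problem(objective).solve()
    print("fitted coefficients:", np.round(beta.value, 3))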

Article information

Source
Electron. J. Statist., Volume 13, Number 2 (2019), 2307-2347.

Dates
Received: May 2018
First available in Project Euclid: 9 July 2019

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1562637626

Digital Object Identifier
doi:10.1214/19-EJS1569

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 90C46: Optimality conditions, duality [See also 49N15]

Keywords
Generalized lasso; high-dimensional; uniqueness of solutions; generalized linear models; existence of solutions

Rights
Creative Commons Attribution 4.0 International License.

Citation

Ali, Alnur; Tibshirani, Ryan J. The Generalized Lasso Problem and Uniqueness. Electron. J. Statist. 13 (2019), no. 2, 2307--2347. doi:10.1214/19-EJS1569. https://projecteuclid.org/euclid.ejs/1562637626



References

  • [1] Samrachana Adhikari, Fabrizio Lecci, James T. Becker, Brian W. Junker, Lewis H. Kuller, Oscar L. Lopez, and Ryan J. Tibshirani. High-dimensional longitudinal classification with the multinomial fused lasso. Statistics in Medicine, 38(12):2184–2205, 2019.
  • [2] A. Albert and J. A. Anderson. On the existence of maximum likelihood estimates in logistic regression models. Biometrika, 71(1):1–10, 1984.
  • [3] Heinz H. Bauschke and Jonathan M. Borwein. Legendre functions and the method of random Bregman projections. Journal of Convex Analysis, 4(1):27–67, 1997.
  • [4] Emmanuel J. Candes and Yaniv Plan. Near-ideal model selection by $\ell_1$ minimization. Annals of Statistics, 37(5):2145–2177, 2009.
  • [5] Emmanuel J. Candes and Pragya Sur. The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression. arXiv:1804.09753, 2018.
  • [6] David L. Donoho. For most large underdetermined systems of linear equations, the minimal $\ell_1$ solution is also the sparsest solution. Communications on Pure and Applied Mathematics, 59(6):797–829, 2006.
  • [7] Charles Dossal. A necessary and sufficient condition for exact sparse recovery by $\ell_1$ minimization. Comptes Rendus Mathematique, 350(1–2):117–120, 2012.
  • [8] Lawrence C. Evans and Ronald F. Gariepy. Measure Theory and Fine Properties of Functions. CRC Press, 1992.
  • [9] Jean Jacques Fuchs. Recovery of exact sparse representations in the presence of bounded noise. IEEE Transactions on Information Theory, 51(10):3601–3608, 2005.
  • [10] Shelby J. Haberman. Log-linear models for frequency tables derived by indirect observation: Maximum likelihood equations. Annals of Statistics, 2(5):911–924, 1974.
  • [11] Holger Hoefling. A path algorithm for the fused lasso signal approximator. Journal of Computational and Graphical Statistics, 19(4):984–1006, 2010.
  • [12] Murray C. Kemp and Yoshio Kimura. Introduction to Mathematical Economics. Springer, 1978.
  • [13] Mohammad Khabbazian, Ricardo Kriebel, Karl Rohe, and Cecile Ane. Fast and accurate detection of evolutionary shifts in Ornstein-Uhlenbeck models. Evolutionary Quantitative Genetics, 7:811–824, 2016.
  • [14] Seung-Jean Kim, Kwangmoo Koh, Stephen Boyd, and Dimitry Gorinevsky. $\ell_1$ trend filtering. SIAM Review, 51(2):339–360, 2009.
  • [15] Robert Lang. A note on the measurability of convex sets. Archiv der Mathematik, 47(1):90–92, 1986.
  • [16] Jason Lee, Yuekai Sun, and Jonathan Taylor. On model selection consistency of M-estimators with geometrically decomposable penalties. Electronic Journal of Statistics, 9(1):608–642, 2015.
  • [17] Oscar Hernan Madrid-Padilla and James Scott. Tensor decomposition with generalized lasso penalties. Journal of Computational and Graphical Statistics, 26(3):537–546, 2017.
  • [18] Pertti Mattila. Geometry of Sets and Measures in Euclidean Spaces: Fractals and Rectifiability. Cambridge University Press, 1995.
  • [19] Sangnam Nam, Mike E. Davies, Michael Elad, and Remi Gribonval. The cosparse analysis model and algorithms. Applied and Computational Harmonic Analysis, 34(1):30–56, 2013.
  • [20] Michael Osborne, Brett Presnell, and Berwin Turlach. On the lasso and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337, 2000.
  • [21] R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, 1970.
  • [22] R. Tyrrell Rockafellar and Roger J-B Wets. Variational Analysis. Springer, 2009.
  • [23] Saharon Rosset, Ji Zhu, and Trevor Hastie. Boosting as a regularized path to a maximum margin classifier. Journal of Machine Learning Research, 5:941–973, 2004.
  • [24] Leonid I. Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1):259–268, 1992.
  • [25] Veeranjaneyulu Sadhanala and Ryan J. Tibshirani. Additive models via trend filtering. arXiv:1702.05037, 2017.
  • [26] Veeranjaneyulu Sadhanala, Yu-Xiang Wang, James Sharpnack, and Ryan J. Tibshirani. Higher-order total variation classes on grids: Minimax theory and trend filtering methods. Advances in Neural Information Processing Systems, 30, 2017.
  • [27] Ulrike Schneider and Karl Ewald. On the distribution, model selection properties and uniqueness of the lasso estimator in low and high dimensions. arXiv:1708.09608, 2017.
  • [28] Gabriel Steidl, Stephan Didas, and Julia Neumann. Splines in higher order TV regularization. International Journal of Computer Vision, 70(3):214–255, 2006.
  • [29] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1):267–288, 1996.
  • [30] Robert Tibshirani and Pei Wang. Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics, 9(1):18–29, 2008.
  • [31] Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B, 67(1):91–108, 2005.
  • [32] Ryan J. Tibshirani. The lasso problem and uniqueness. Electronic Journal of Statistics, 7:1456–1490, 2013.
  • [33] Ryan J. Tibshirani and Jonathan Taylor. The solution path of the generalized lasso. Annals of Statistics, 39(3):1335–1371, 2011.
  • [34] Ryan J. Tibshirani and Jonathan Taylor. Degrees of freedom in lasso problems. Annals of Statistics, 40(2):1198–1232, 2012.
  • [35] Martin J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (lasso). IEEE Transactions on Information Theory, 55(5):2183–2202, 2009.
  • [36] Yu-Xiang Wang, James Sharpnack, Alex Smola, and Ryan J. Tibshirani. Trend filtering on graphs. Journal of Machine Learning Research, 17(105):1–41, 2016.
  • [37] Bo Xin, Yoshinobu Kawahara, Yizhou Wang, and Wen Gao. Efficient generalized fused lasso and its application to the diagnosis of Alzheimer's disease. AAAI Conference on Artificial Intelligence, 28, 2014.