Electronic Journal of Statistics

Exact post-selection inference for the generalized lasso path

Sangwon Hyun, Max G’Sell, and Ryan J. Tibshirani

Abstract

We study tools for inference conditioned on model selection events that are defined by the generalized lasso regularization path. The generalized lasso estimate is given by the solution of a penalized least squares regression problem, where the penalty is the $\ell_{1}$ norm of a matrix $D$ times the coefficient vector. The generalized lasso path collects these estimates as the penalty parameter $\lambda$ varies (from $\infty$ down to 0). Leveraging a (sequential) characterization of this path from Tibshirani and Taylor [37], and recent advances in post-selection inference from Lee et al. [22] and Tibshirani et al. [38], we develop exact hypothesis tests and confidence intervals for linear contrasts of the underlying mean vector, conditioned on any model selection event along the generalized lasso path (assuming Gaussian errors in the observations).
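
In symbols, following the formulation of [37], the generalized lasso estimate at parameter $\lambda \ge 0$ is

\[
\hat{\beta}(\lambda) \;=\; \operatorname*{argmin}_{\beta \in \mathbb{R}^{p}} \; \tfrac{1}{2} \| y - X \beta \|_{2}^{2} \;+\; \lambda \| D \beta \|_{1},
\]

where $y \in \mathbb{R}^{n}$ is the observation vector, $X \in \mathbb{R}^{n \times p}$ is the design matrix (the identity in the signal approximator case), and $D \in \mathbb{R}^{m \times p}$ is the penalty matrix; the path is the map $\lambda \mapsto \hat{\beta}(\lambda)$, traced from $\lambda = \infty$ down to $\lambda = 0$.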

Our construction of inference tools holds for any penalty matrix $D$. By inspecting specific choices of $D$, we obtain post-selection tests and confidence intervals for specific cases of generalized lasso estimates, such as the fused lasso, trend filtering, and the graph fused lasso. In the fused lasso case, the underlying coordinates of the mean are assigned a linear ordering, and our framework allows us to test selectively chosen breakpoints or changepoints in these mean coordinates. This is an interesting and well-studied problem with broad applications; applying our framework to the trend filtering and graph fused lasso cases serves several further applications as well. Aside from the development of selective inference tools, we describe several practical aspects of our methods, such as (valid, i.e., fully accounted for) post-processing of generalized lasso estimates before performing inference in order to improve power, and problem-specific visualization aids that the data analyst may use to choose linear contrasts to be tested. Many examples, from both simulated and real data sources, are presented to examine the empirical properties of our inference methods.
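
As a concrete instance of the penalty matrix, the 1d fused lasso [39] corresponds to taking $D$ to be the first difference operator, so that the penalty sums the absolute differences of adjacent mean coordinates and the nonzero differences in the solution mark estimated changepoints:

\[
D \;=\;
\begin{pmatrix}
-1 & 1 & & & \\
 & -1 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & -1 & 1
\end{pmatrix}
\in \mathbb{R}^{(n-1) \times n},
\qquad
\| D \beta \|_{1} \;=\; \sum_{i=1}^{n-1} | \beta_{i+1} - \beta_{i} |.
\]

Trend filtering of order $k$ instead takes $D$ to be the discrete difference operator of order $k+1$ [21, 35], and the graph fused lasso takes one row of $D$ per edge of a graph [31].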

Article information

Source
Electron. J. Statist., Volume 12, Number 1 (2018), 1053–1097.

Dates
Received: January 2017
First available in Project Euclid: 17 March 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1521252212

Digital Object Identifier
doi:10.1214/17-EJS1363

Mathematical Reviews number (MathSciNet)
MR3777139

Zentralblatt MATH identifier
06864485

Subjects
Primary: 62F03: Hypothesis testing
Secondary: 62G15: Tolerance and confidence regions

Keywords
Generalized lasso; fused lasso; trend filtering; post-selection inference

Rights
Creative Commons Attribution 4.0 International License.

Citation

Hyun, Sangwon; G’Sell, Max; Tibshirani, Ryan J. Exact post-selection inference for the generalized lasso path. Electron. J. Statist. 12 (2018), no. 1, 1053–1097. doi:10.1214/17-EJS1363. https://projecteuclid.org/euclid.ejs/1521252212


References

  • [1] Arnold, T. and Tibshirani, R. J. (2016), ‘Efficient implementations of the generalized lasso dual path algorithm’, Journal of Computational and Graphical Statistics 25(1), 1–27.
  • [2] Bai, J. (1999), ‘Likelihood ratio tests for multiple structural changes’, Journal of Econometrics 91(2), 299–323.
  • [3] Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013), ‘Valid post-selection inference’, Annals of Statistics 41(2), 802–837.
  • [4] Brodsky, B. and Darkhovski, B. (1993), Nonparametric Methods in Change-Point Problems, Springer, Netherlands.
  • [5] Chambolle, A. and Darbon, J. (2009), ‘On total variation minimization and surface evolution using parametric maximum flows’, International Journal of Computer Vision 84, 288–307.
  • [6] Chen, J. and Chen, Z. (2008), ‘Extended Bayesian information criteria for model selection with large model spaces’, Biometrika 95(3), 759–771.
  • [7] Chen, J. and Gupta, A. (2000), Parametric Statistical Change Point Analysis, Birkhauser, Basel.
  • [8] Choi, Y., Taylor, J. and Tibshirani, R. (2014), Selecting the number of principal components: estimation of the true rank of a noisy matrix. arXiv: 1410.8260.
  • [9] Eckley, I., Fearnhead, P. and Killick, R. (2011), Analysis of changepoint models, in D. Barber, T. Cemgil and S. Chiappa, eds, ‘Bayesian Time Series Models’, Cambridge University Press, Cambridge, chapter 10, pp. 205–224.
  • [10] Fithian, W., Sun, D. and Taylor, J. (2014), Optimal inference after model selection. arXiv: 1410.2597.
  • [11] Fithian, W., Taylor, J., Tibshirani, R. and Tibshirani, R. J. (2015), Selective sequential model selection. arXiv: 1512.02565.
  • [12] Frick, K., Munk, A. and Sieling, H. (2014), ‘Multiscale change point inference’, Journal of the Royal Statistical Society. Series B: Statistical Methodology 76(3), 495–580.
  • [13] Friedman, J., Hastie, T., Hoefling, H. and Tibshirani, R. (2007), ‘Pathwise coordinate optimization’, Annals of Applied Statistics 1(2), 302–332.
  • [14] Fryzlewicz, P. (2014), ‘Wild binary segmentation for multiple change-point detection’, Annals of Statistics 42(6), 2243–2281.
  • [15] Grazier G’Sell, M., Wager, S., Chouldechova, A. and Tibshirani, R. (2016), ‘Sequential selection procedures and false discovery rate control’, Journal of the Royal Statistical Society: Series B 78(2), 423–444.
  • [16] Hastie, T., Tibshirani, R. and Friedman, J. (2009), The Elements of Statistical Learning; Data Mining, Inference and Prediction, Springer, New York. Second edition.
  • [17] Hinkley, D. (1970), ‘Inference about the change-point in a sequence of random variables’, Biometrika 57(1), 1–17.
  • [18] Hoefling, H. (2010), ‘A path algorithm for the fused lasso signal approximator’, Journal of Computational and Graphical Statistics 19(4), 984–1006.
  • [19] Horvath, L. and Rice, G. (2014), ‘Extensions of some classical methods in change point analysis’, TEST 23(2), 219–255.
  • [20] Jandhyala, V., Fotopoulos, S., Macneill, I. and Liu, P. (2013), ‘Inference for single and multiple change-points in time series’, Journal of Time Series Analysis 34(4), 423–446.
  • [21] Kim, S.-J., Koh, K., Boyd, S. and Gorinevsky, D. (2009), ‘$\ell_1$ trend filtering’, SIAM Review 51(2), 339–360.
  • [22] Lee, J., Sun, D., Sun, Y. and Taylor, J. (2016), ‘Exact post-selection inference, with application to the lasso’, Annals of Statistics 44(3), 907–927.
  • [23] Lee, J. and Taylor, J. (2014), ‘Exact post model selection inference for marginal screening’, Advances in Neural Information Processing Systems 27.
  • [24] Leeb, H. and Potscher, B. (2003), ‘The finite-sample distribution of post-model-selection estimators and uniform versus nonuniform approximations’, Econometric Theory 19(1), 100–142.
  • [25] Leeb, H. and Potscher, B. (2006), ‘Can one estimate the conditional distribution of post-model-selection estimators?’, Annals of Statistics 34(5), 2554–2591.
  • [26] Leeb, H. and Potscher, B. (2008), ‘Can one estimate the unconditional distribution of post-model-selection estimators?’, Econometric Theory 24(2), 338–376.
  • [27] Lockhart, R., Taylor, J., Tibshirani, R. J. and Tibshirani, R. (2014), ‘A significance test for the lasso’, Annals of Statistics 42(2), 413–468.
  • [28] Loftus, J. and Taylor, J. (2014), A significance test for forward stepwise model selection. arXiv: 1405.3920.
  • [29] Reid, S., Taylor, J. and Tibshirani, R. (2014), Post-selection point and interval estimation of signal sizes in Gaussian samples. arXiv: 1405.3340.
  • [30] Rudin, L. I., Osher, S. and Fatemi, E. (1992), ‘Nonlinear total variation based noise removal algorithms’, Physica D: Nonlinear Phenomena 60, 259–268.
  • [31] Sharpnack, J., Rinaldo, A. and Singh, A. (2012), ‘Sparsistency of the edge lasso over graphs’, Proceedings of the International Conference on Artificial Intelligence and Statistics 15, 1028–1036.
  • [32] Steidl, G., Didas, S. and Neumann, J. (2006), ‘Splines in higher order TV regularization’, International Journal of Computer Vision 70(3), 214–255.
  • [33] Tian, X. and Taylor, J. (2015a), Asymptotics of selective inference. arXiv: 1501.03588.
  • [34] Tian, X. and Taylor, J. (2015b), Selective inference with a randomized response. arXiv: 1507.06739.
  • [35] Tibshirani, R. J. (2014), ‘Adaptive piecewise polynomial estimation via trend filtering’, Annals of Statistics 42(1), 285–323.
  • [36] Tibshirani, R. J., Rinaldo, A., Tibshirani, R. and Wasserman, L. (2015), Uniform asymptotic inference and the bootstrap after model selection. arXiv: 1506.06266.
  • [37] Tibshirani, R. J. and Taylor, J. (2011), ‘The solution path of the generalized lasso’, Annals of Statistics 39(3), 1335–1371.
  • [38] Tibshirani, R. J., Taylor, J., Lockhart, R., and Tibshirani, R. (2016), ‘Exact post-selection inference for sequential regression procedures’, Journal of the American Statistical Association 111(514), 600–620.
  • [39] Tibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005), ‘Sparsity and smoothness via the fused lasso’, Journal of the Royal Statistical Society: Series B 67(1), 91–108.
  • [40] Tibshirani, R. and Wang, P. (2008), ‘Spatial smoothing and hot spot detection for CGH data using the fused lasso’, Biostatistics 9(1), 18–29. http://www.ncbi.nlm.nih.gov/pubmed/17513312
  • [41] Wang, Y.-X., Sharpnack, J., Smola, A. and Tibshirani, R. J. (2016), ‘Trend filtering on graphs’, Journal of Machine Learning Research 17(105), 1–41.
  • [42] Worsley, K. J. (1986), ‘Confidence-regions and tests for a change-point in a sequence of exponential family random-variables’, Biometrika 73(1), 91–104.