The Annals of Statistics

Oracle inequalities for the lasso in the Cox model

Jian Huang, Tingni Sun, Zhiliang Ying, Yi Yu, and Cun-Hui Zhang

We study the absolute penalized maximum partial likelihood estimator in sparse, high-dimensional Cox proportional hazards regression models where the number of time-dependent covariates can be larger than the sample size. We establish oracle inequalities based on natural extensions of the compatibility and cone invertibility factors of the Hessian matrix at the true regression coefficients. Similar results based on an extension of the restricted eigenvalue can also be proved by our method. However, the oracle inequalities presented here are sharper, since the compatibility and cone invertibility factors are always greater than the corresponding restricted eigenvalue. In the Cox regression model, the Hessian matrix is based on time-dependent covariates in censored risk sets, so the compatibility and cone invertibility factors, as well as the restricted eigenvalue, are random variables even when they are evaluated for the Hessian at the true regression coefficients. Under mild conditions, we prove that these quantities are bounded from below by positive constants for time-dependent covariates, including cases where the number of covariates is of greater order than the sample size. Consequently, the compatibility and cone invertibility factors can be treated as positive constants in our oracle inequalities.
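As a rough illustration of the estimator described in the abstract, the sketch below evaluates an absolute-penalized (lasso) objective for a Cox model: the negative log partial likelihood, scaled by the sample size, plus an L1 penalty. It is a minimal sketch under simplifying assumptions that are not taken from the paper: covariates are fixed in time (the paper allows time-dependent covariates), there are no tied event times, and the function names `neg_log_partial_likelihood` and `lasso_objective` and the penalty level `lam` are hypothetical.

```python
import math


def neg_log_partial_likelihood(beta, times, events, X):
    """Negative log partial likelihood divided by n.

    Assumptions (illustrative only): time-fixed covariates, no tied
    event times, and subject j is at risk at time t if times[j] >= t.
    """
    n = len(times)
    # Linear predictor x_i' beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored subjects enter only through risk sets
        risk = [eta[j] for j in range(n) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk)
        log_denom = m + math.log(sum(math.exp(r - m) for r in risk))
        total += log_denom - eta[i]
    return total / n


def lasso_objective(beta, times, events, X, lam):
    """Penalized criterion: scaled negative log partial likelihood + lam * |beta|_1."""
    penalty = lam * sum(abs(b) for b in beta)
    return neg_log_partial_likelihood(beta, times, events, X) + penalty


# Toy data: three subjects, two covariates, the last subject censored.
times = [1.0, 2.0, 3.0]
events = [1, 1, 0]
X = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.7]]
value_at_zero = lasso_objective([0.0, 0.0], times, events, X, lam=0.5)
```

At `beta = 0` every subject in a risk set has equal weight, so each observed event contributes the log of its risk-set size; this gives a quick sanity check on the implementation. Minimizing the objective over `beta` (for example by coordinate descent or proximal gradient) is left out of the sketch.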

Article information

Ann. Statist., Volume 41, Number 3 (2013), 1142-1165.

First available in Project Euclid: 13 June 2013

Primary: 62N02: Estimation
Secondary: 62G05: Estimation

Proportional hazards; regression; absolute penalty; regularization; oracle inequality; survival analysis


Huang, Jian; Sun, Tingni; Ying, Zhiliang; Yu, Yi; Zhang, Cun-Hui. Oracle inequalities for the lasso in the Cox model. Ann. Statist. 41 (2013), no. 3, 1142–1165. doi:10.1214/13-AOS1098.
