Electronic Journal of Statistics

Robust learning for optimal treatment decision with NP-dimensionality

Chengchun Shi, Rui Song, and Wenbin Lu

Full-text: Open access


To identify important variables involved in making optimal treatment decisions, Lu, Zhang and Zeng (2013) proposed a penalized least squares regression framework for a fixed number of predictors, which is robust against misspecification of the conditional mean model. Two problems arise: (i) in a world of explosively big data, effective methods are needed to handle ultra-high dimensional data sets, for example, those in which the dimension of the predictors is of non-polynomial (NP) order of the sample size; (ii) both the propensity score and conditional mean models need to be estimated from data under NP dimensionality.

In this paper, we propose a robust procedure for estimating the optimal treatment regime under NP dimensionality. In both steps of the procedure, penalized regressions are employed with a non-concave penalty function, where the conditional mean model of the response given the predictors may be misspecified. The asymptotic properties of the proposed estimators, such as weak oracle properties, selection consistency and oracle distributions, are investigated. In addition, we study the limiting distribution of the estimated value function for the obtained optimal treatment regime. The empirical performance of the proposed estimation method is evaluated by simulations and an application to a depression dataset from the STAR*D study.
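
As an illustration of what a non-concave penalty looks like, the following minimal Python sketch evaluates the SCAD penalty of Fan and Li (2001), which appears in the reference list. The abstract does not say which non-concave penalty the authors use, so the choice of SCAD here is an assumption, and the function name `scad_penalty` and the default `a = 3.7` (the value suggested by Fan and Li) are illustrative only.

```python
def scad_penalty(t, lam, a=3.7):
    """Evaluate the SCAD penalty p_lam(t) at a scalar coefficient t.

    Illustrative sketch only: lam > 0 is the tuning parameter and
    a > 2 controls where the penalty flattens out.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    x = abs(t)
    if x <= lam:
        # Small coefficients: behaves like the lasso penalty lam * |t|.
        return lam * x
    if x <= a * lam:
        # Intermediate coefficients: quadratic piece that bends the penalty down.
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for coef in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(coef, scad_penalty(coef, lam=1.0))
```

Unlike the lasso penalty, this function is constant beyond `a * lam`, which is the feature that lets large coefficients escape shrinkage and underlies oracle-type results for such penalties.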

Article information

Electron. J. Statist., Volume 10, Number 2 (2016), 2894-2921.

Received: November 2015
First available in Project Euclid: 13 October 2016

Keywords: Non-concave penalized likelihood, optimal treatment strategy, oracle property, variable selection


Shi, Chengchun; Song, Rui; Lu, Wenbin. Robust learning for optimal treatment decision with NP-dimensionality. Electron. J. Statist. 10 (2016), no. 2, 2894--2921. doi:10.1214/16-EJS1178. https://projecteuclid.org/euclid.ejs/1476368559


  • Bunea, F., Tsybakov, A. and Wegkamp, M. (2007). Sparsity oracle inequalities for the Lasso. Electron. J. Stat. 1 169–194.
  • Chakraborty, B., Murphy, S. and Strecher, V. (2010). Inference for non-regular parameters in optimal dynamic treatment regimes. Stat. Methods Med. Res. 19 317–343.
  • Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • Fan, A., Lu, W. and Song, R. (2015). Sequential advantage selection for optimal treatment regime. Ann. Appl. Stat. To appear.
  • Fan, J. and Lv, J. (2011). Nonconcave penalized likelihood with NP-dimensionality. IEEE Trans. Inform. Theory 57 5467–5484.
  • Fan, J., Xue, L. and Zou, H. (2014). Strong oracle optimality of folded concave penalized estimation. Ann. Statist. 42 819–849.
  • Gunter, L., Zhu, J. and Murphy, S. A. (2011). Variable selection for qualitative interactions. Stat. Methodol. 8 42–55.
  • Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Ann. Statist. 17 1009–1052.
  • Lu, W., Zhang, H. H. and Zeng, D. (2013). Variable selection for optimal treatment decision. Stat. Methods Med. Res. 22 493–504.
  • Murphy, S. A. (2003). Optimal dynamic treatment regimes. J. R. Stat. Soc. Ser. B Stat. Methodol. 65 331–366.
  • Qian, M. and Murphy, S. A. (2011). Performance guarantees for individualized treatment rules. Ann. Statist. 39 1180–1210.
  • Robins, J. M. (2004). Optimal structural nested models for optimal sequential decisions. In Proceedings of the Second Seattle Symposium in Biostatistics. Lecture Notes in Statist. 179 189–326. Springer, New York.
  • Robins, J. M., Hernan, M. A. and Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiol. 11 550–560.
  • Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. J. Edu. Psychol. 66 688–701.
  • Shi, C., Song, R. and Lu, W. (2016). Supplement to “Robust Learning for Optimal Treatment Decision with NP-Dimensionality”. DOI:10.1214/16-EJS1178SUPP.
  • Song, R., Kosorok, M., Zeng, D., Zhao, Y., Laber, E. and Yuan, M. (2015). On sparse representation for optimal individualized treatment selection with penalized outcome weighted learning. Stat. 4 59–68.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso: a retrospective. J. R. Stat. Soc. Ser. B Stat. Methodol. 73 273–282.
  • Tsiatis, A. A. (2006). Semiparametric theory and missing data. Springer Series in Statistics. Springer, New York.
  • Wang, L., Kim, Y. and Li, R. (2013). Calibrating nonconvex penalized regression in ultra-high dimension. Ann. Statist. 41 2505–2536.
  • Watkins, C. J. C. H. and Dayan, P. (1992). Q-learning. Mach. Learn. 8 279–292.
  • White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica 50 1–25.
  • Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38 894–942.
  • Zhang, B., Tsiatis, A. A., Laber, E. B. and Davidian, M. (2012). A robust method for estimating optimal treatment regimes. Biometrics 68 1010–1018.
  • Zhao, Y., Zeng, D., Rush, A. J. and Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. J. Amer. Statist. Assoc. 107 1106–1118.
