The Annals of Statistics

Hazard models with varying coefficients for multivariate failure time data

Jianwen Cai, Jianqing Fan, Haibo Zhou, and Yong Zhou

Full-text: Open access


Statistical estimation and inference for marginal hazard models with varying coefficients for multivariate failure time data are important subjects in survival analysis. A local pseudo-partial likelihood procedure is proposed for estimating the unknown coefficient functions. A weighted average estimator is also proposed in an attempt to improve the efficiency of the estimator. The consistency and asymptotic normality of the proposed estimators are established and standard error formulas for the estimated coefficients are derived and empirically tested. To reduce the computational burden of the maximum local pseudo-partial likelihood estimator, a simple and useful one-step estimator is proposed. Statistical properties of the one-step estimator are established and simulation studies are conducted to compare the performance of the one-step estimator to that of the maximum local pseudo-partial likelihood estimator. The results show that the one-step estimator can save computational cost without compromising performance both asymptotically and empirically and that an optimal weighted average estimator is more efficient than the maximum local pseudo-partial likelihood estimator. A data set from the Busselton Population Health Surveys is analyzed to illustrate our proposed methodology.
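The one-step idea in the abstract can be illustrated with a small numerical sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' procedure: a single failure type, a scalar covariate, a local-constant fit (one coefficient `b` standing in for the coefficient function near a point `v0`), an Epanechnikov kernel, and hypothetical function names. The sketch compares a fully iterated Newton maximizer of a kernel-weighted log partial likelihood with a single Newton step taken from an initial value.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def score_and_hessian(b, t, delta, x, v, v0, h):
    """Score and Hessian in b of a kernel-weighted (local-constant)
    log partial likelihood, localized around v0 with bandwidth h."""
    w = epanechnikov((v - v0) / h) / h
    # at_risk[i, j] = 1 if subject j is still at risk at time t[i]
    at_risk = (t[None, :] >= t[:, None]).astype(float)
    e = w * np.exp(b * x)                 # kernel-weighted relative risks
    s0 = at_risk @ e
    s1 = at_risk @ (e * x)
    s2 = at_risk @ (e * x**2)
    use = (delta == 1) & (w > 0)          # local, uncensored subjects only
    xbar = s1[use] / s0[use]
    score = np.sum(w[use] * (x[use] - xbar))
    hessian = -np.sum(w[use] * (s2[use] / s0[use] - xbar**2))
    return score, hessian

def one_step(b0, *data):
    """One Newton-Raphson update starting from the initial value b0."""
    score, hessian = score_and_hessian(b0, *data)
    return b0 - score / hessian

def maximize(b0, *data, tol=1e-10, max_iter=50):
    """Iterate Newton steps until the update is smaller than tol."""
    b = b0
    for _ in range(max_iter):
        b_new = one_step(b, *data)
        if abs(b_new - b) < tol:
            return b_new
        b = b_new
    return b

def simulate(n, seed=0):
    """Toy data (assumed design): hazard exp(beta(v) * x), beta(v) = 0.5 + v."""
    rng = np.random.default_rng(seed)
    v = rng.uniform(0.0, 1.0, n)
    x = rng.normal(size=n)
    t_fail = rng.exponential(1.0, n) / np.exp((0.5 + v) * x)
    t_cens = rng.exponential(2.0, n)      # independent censoring times
    t = np.minimum(t_fail, t_cens)
    delta = (t_fail <= t_cens).astype(int)
    return t, delta, x, v

t, delta, x, v = simulate(500)
data = (t, delta, x, v, 0.5, 0.2)         # v0 = 0.5, h = 0.2
b_full = maximize(0.0, *data)             # fully iterated local estimate
b_one = one_step(b_full + 0.1, *data)     # one step from a perturbed start
print(b_full, b_one)
```

In this toy setting a single Newton step from a reasonable starting value lands close to the fully iterated solution, which is the computational saving the abstract refers to. How the paper chooses the initial estimator, the local approximation and the bandwidth is not stated in the abstract.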

Article information

Ann. Statist., Volume 35, Number 1 (2007), 324-354.

First available in Project Euclid: 6 June 2007

Primary: 62G05: Estimation
Secondary: 62N01: Censored data models; 62N02: Estimation

Keywords: local pseudo-partial likelihood; marginal hazard model; martingale; multivariate failure time; one-step estimator; varying coefficients


Cai, Jianwen; Fan, Jianqing; Zhou, Haibo; Zhou, Yong. Hazard models with varying coefficients for multivariate failure time data. Ann. Statist. 35 (2007), no. 1, 324–354. doi:10.1214/009053606000001145.


