The Annals of Applied Statistics

Maximum likelihood and pseudo score approaches for parametric time-to-event analysis with informative entry times

Brian D. M. Tom, Vernon T. Farewell, and Sheila M. Bird

Full-text: Open access


We develop a maximum likelihood estimation approach for time-to-event Weibull regression models with outcome-dependent sampling, where the sampling of subjects depends on the residual fraction of the time left to developing the event of interest. Additionally, we propose a two-stage approach which proceeds by iteratively estimating, through a pseudo score, the Weibull parameters of interest (i.e., the regression parameters) conditional on the inverse probability of sampling weights, and then re-estimating these weights (given the updated Weibull parameter estimates) through the profiled full likelihood. With these two new methods, both the sampling mechanism parameters and the Weibull parameters are consistently estimated under correct specification of the conditional referral distribution. Standard errors for the regression parameters are obtained by inverting the observed information matrix under the full likelihood specification, and by calculating either bootstrap or robust standard errors under the hybrid pseudo score/profiled likelihood approach. Loss of efficiency with the latter approach is considered. Robustness of the proposed methods to misspecification of the referral mechanism and the time-to-event distribution is also briefly examined. Further, we show how to extend our methods to the family of parametric time-to-event distributions characterized by the generalized gamma distribution. The motivation for these two approaches came from data on time to cirrhosis from hepatitis C viral infection in patients referred to the Edinburgh liver clinic. We analyze these data here.
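The weighting idea behind the pseudo score stage can be illustrated with a minimal sketch. It is not the authors' implementation and makes strong simplifying assumptions: no covariates, no censoring, and sampling probabilities that are known rather than estimated through the profiled likelihood. The function names, the sampling probability `0.2 + 0.3 * min(t, 2.0)`, and the simulation settings are all hypothetical.

```python
import math
import random


def weighted_weibull_mle(times, weights, lo=0.01, hi=50.0, tol=1e-10):
    """Solve the weighted Weibull score equations (no covariates, no censoring).

    Returns (shape, scale). The shape is found by bisection on the profile
    score; the scale then follows in closed form.
    """
    sw = sum(weights)
    mean_log = sum(w * math.log(t) for t, w in zip(times, weights)) / sw

    def score(k):
        # Weighted profile score for the shape k, divided by the total weight.
        num = sum(w * t**k * math.log(t) for t, w in zip(times, weights))
        den = sum(w * t**k for t, w in zip(times, weights))
        return 1.0 / k + mean_log - num / den

    # score(k) decreases in k: positive near zero, negative for large k.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(w * t**shape for t, w in zip(times, weights)) / sw) ** (1.0 / shape)
    return shape, scale


def simulate_biased_sample(n, shape, scale, seed=0):
    """Draw Weibull times and keep each with a time-dependent probability."""
    rng = random.Random(seed)
    times, weights = [], []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # argument order is (scale, shape)
        p = 0.2 + 0.3 * min(t, 2.0)  # hypothetical sampling probability in [0.2, 0.8]
        if rng.random() < p:
            times.append(t)
            weights.append(1.0 / p)  # inverse probability of sampling weight
    return times, weights


times, weights = simulate_biased_sample(10000, shape=2.0, scale=1.0)
naive = weighted_weibull_mle(times, [1.0] * len(times))  # ignores the sampling
ipw = weighted_weibull_mle(times, weights)               # weighted pseudo score
print("naive (shape, scale):", naive)
print("IPW   (shape, scale):", ipw)
```

Because longer times are over-sampled in this toy setup, the unweighted fit tends to overstate the scale, while the weighted fit should land close to the simulated values. In the approach described in the abstract the weights depend on unknown sampling parameters and are re-estimated at each iteration, which this sketch does not attempt.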

Article information

Ann. Appl. Stat., Volume 8, Number 2 (2014), 726-746.

First available in Project Euclid: 1 July 2014

Keywords: Biased data; generalized gamma distribution; outcome-dependent sampling; pseudo score; robust standard error; survival analysis; Weibull distribution


Tom, Brian D. M.; Farewell, Vernon T.; Bird, Sheila M. Maximum likelihood and pseudo score approaches for parametric time-to-event analysis with informative entry times. Ann. Appl. Stat. 8 (2014), no. 2, 726–746. doi:10.1214/14-AOAS725.


Supplemental materials

  • Supplementary material: Appendix: Derivations of the expressions based on the generalized gamma and mixture of uniforms. Proofs of the expressions required to construct the likelihood and pseudo score, assuming that the time-to-event distribution is a generalized gamma distribution and the conditional referral distribution is a mixture of independent uniforms.