The Annals of Applied Statistics

Likelihood reweighting methods to reduce potential bias in noninferiority trials which rely on historical data to make inference

Lei Nie, Zhiwei Zhang, Daniel Rubin, and Jianxiong Chu

Full-text: Open access


It is generally believed that bias is minimized in well-controlled randomized clinical trials. However, bias can arise in active-controlled noninferiority trials because the inference relies on a previously estimated effect size obtained from a historical trial that may have been conducted in a different population. A likelihood reweighting method based on propensity scoring allows a study designed to estimate a treatment effect in one trial population to be used to estimate the treatment effect size in a different target population. We illustrate this method in active-controlled noninferiority trials, although it can also be used in other types of studies, such as historically controlled trials, meta-analyses, and comparative effectiveness analyses.
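The reweighting idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single binary covariate, a Gaussian outcome (so the weighted-likelihood estimate of the effect reduces to a weighted difference in means), and propensity odds estimated from stratum counts rather than from a fitted regression model. The data and the function names `odds_weights` and `weighted_effect` are hypothetical.

```python
from collections import Counter


def odds_weights(hist_x, target_x):
    """Weight each historical subject by the estimated odds of belonging to
    the target sample rather than the historical sample, given covariate x.
    With one discrete covariate the odds reduce to a ratio of stratum counts."""
    n_hist = Counter(hist_x)
    n_target = Counter(target_x)
    return [n_target[x] / n_hist[x] for x in hist_x]


def weighted_effect(records, weights):
    """Weighted difference in mean outcome, treated minus control.
    Each record is a tuple (covariate, arm, outcome) with arm 1 = treated."""
    def weighted_mean(arm):
        pairs = [(w, y) for (_, a, y), w in zip(records, weights) if a == arm]
        return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)
    return weighted_mean(1) - weighted_mean(0)


# Hypothetical historical trial: 80% of subjects have x = 0 (effect 1.0)
# and 20% have x = 1 (effect 3.0); each stratum is split evenly across arms.
hist = [(0, 1, 1.0)] * 4 + [(0, 0, 0.0)] * 4 + [(1, 1, 3.0)] + [(1, 0, 0.0)]

# Hypothetical target population: the covariate mix is reversed.
target_x = [0] * 2 + [1] * 8

weights = odds_weights([x for x, _, _ in hist], target_x)
naive = weighted_effect(hist, [1.0] * len(hist))   # ignores the population shift
adjusted = weighted_effect(hist, weights)          # reweighted to the target mix
print(naive, adjusted)
```

In this toy example the unweighted historical estimate is about 1.4, while the reweighted estimate is about 2.6, which matches the effect implied by the target covariate mix (0.2 × 1.0 + 0.8 × 3.0). In practice the propensity odds would be estimated from a model with several covariates, and the variance would need to account for the estimated weights.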

Article information

Ann. Appl. Stat., Volume 7, Number 3 (2013), 1796-1813.

First available in Project Euclid: 3 October 2013

Keywords: Bias; generalized linear model; inverse probability weighting; noninferiority; propensity score


Nie, Lei; Zhang, Zhiwei; Rubin, Daniel; Chu, Jianxiong. Likelihood reweighting methods to reduce potential bias in noninferiority trials which rely on historical data to make inference. Ann. Appl. Stat. 7 (2013), no. 3, 1796–1813. doi:10.1214/13-AOAS655.


Supplemental materials

  • Supplementary material: Supplement to “Likelihood reweighting methods to reduce potential bias in noninferiority trials which rely on historical data to make inference”. The supplement provides an assessment of the efficiency loss for the weighted likelihood method and a comparison between the likelihood reweighting method and related methods in historically controlled trials.