Statistical Science

Semiparametric Estimation of Treatment Effect in a Pretest–Posttest Study with Missing Data

Marie Davidian, Anastasios A. Tsiatis, and Selene Leon


The pretest–posttest study is commonplace in numerous applications. Typically, subjects are randomized to two treatments, and response is measured at baseline, prior to intervention with the randomized treatment (pretest), and at a prespecified follow-up time (posttest). Interest focuses on the effect of treatments on the change between mean baseline and follow-up response. Missing posttest response for some subjects is routine, and disregarding missing cases can lead to invalid inference. Despite the popularity of this design, a consensus on an appropriate analysis when no data are missing, let alone for taking into account missing follow-up, does not exist. Under a semiparametric perspective on the pretest–posttest model, in which limited distributional assumptions on pretest or posttest response are made, we show how the theory of Robins, Rotnitzky and Zhao (1994) may be used to characterize a class of consistent treatment effect estimators and to identify the efficient estimator in the class. We then describe how the theoretical results translate into practice. The development not only shows how a unified framework for inference in this setting emerges from the Robins, Rotnitzky and Zhao theory, but also provides a review and demonstration of the key aspects of this theory in a familiar context. The results are also relevant to the problem of comparing two treatment means with adjustment for baseline covariates.
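As a rough illustration of why disregarding missing cases can mislead, the following sketch simulates a two-arm pretest–posttest trial in which posttest responses are missing at random given the baseline response and treatment, then compares a complete-case difference in means with a simple inverse-probability-weighted (IPW) difference. It is not the efficient estimator characterized in the article. The data-generating model, the function names, and the use of known (rather than estimated) observation probabilities are all assumptions made for the example.

```python
import math
import random


def expit(x):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_trial(n, effect, seed):
    """Simulate (z, y1, y2, r, pi) rows for a hypothetical two-arm trial.

    z  : randomized treatment indicator (0 or 1)
    y1 : pretest (baseline) response
    y2 : posttest response, set to None when missing
    r  : 1 if the posttest response is observed, else 0
    pi : probability that the posttest is observed, which depends only on
         the always-observed (y1, z), so the data are missing at random
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        y1 = rng.gauss(0.0, 1.0)
        y2 = y1 + effect * z + rng.gauss(0.0, 0.5)
        # Assumed mechanism: high-baseline subjects stay in the treated arm,
        # low-baseline subjects stay in the control arm.
        slope = 1.0 if z == 1 else -1.0
        pi = expit(0.5 + slope * y1)
        r = 1 if rng.random() < pi else 0
        rows.append((z, y1, y2 if r else None, r, pi))
    return rows


def complete_case_difference(rows):
    """Difference in posttest means using only subjects with observed y2."""
    means = []
    for arm in (1, 0):
        obs = [y2 for z, _, y2, r, _ in rows if z == arm and r == 1]
        means.append(sum(obs) / len(obs))
    return means[0] - means[1]


def ipw_difference(rows):
    """Difference in normalized IPW posttest means.

    Each observed y2 is weighted by 1/pi, so observed subjects stand in for
    similar subjects whose posttest response is missing.
    """
    means = []
    for arm in (1, 0):
        num = sum(y2 / pi for z, _, y2, r, pi in rows if z == arm and r == 1)
        den = sum(1.0 / pi for z, _, _, r, pi in rows if z == arm and r == 1)
        means.append(num / den)
    return means[0] - means[1]


rows = simulate_trial(n=20000, effect=1.0, seed=0)
cc_estimate = complete_case_difference(rows)
ipw_estimate = ipw_difference(rows)
print(f"complete-case: {cc_estimate:.3f}, IPW: {ipw_estimate:.3f}")
```

Because randomization balances the baseline means across arms, the difference in posttest means equals the difference in mean change in this simulation. Under these assumptions, the complete-case estimate should drift well away from the simulated effect of 1.0, while the weighted estimate should stay close to it. In practice the observation probabilities would have to be modeled and estimated from the data.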

Article information

Statist. Sci., Volume 20, Number 3 (2005), 261-301.

First available in Project Euclid: 24 August 2005

Keywords: Analysis of covariance; covariate adjustment; influence function; inverse probability weighting; missing at random


Davidian, Marie; Tsiatis, Anastasios A.; Leon, Selene. Semiparametric Estimation of Treatment Effect in a Pretest–Posttest Study with Missing Data. Statist. Sci. 20 (2005), no. 3, 261--301. doi:10.1214/088342305000000151.


  • Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press.
  • Brogan, D. R. and Kutner, M. H. (1980). Comparative analyses of pretest--posttest research designs. Amer. Statist. 34 229--232.
  • Casella, G. and Berger, R. L. (2002). Statistical Inference, 2nd ed. Duxbury, Pacific Grove, CA.
  • Cleveland, W. S., Grosse, E. and Shyu, W. M. (1993). Local regression models. In Statistical Models in S (J. M. Chambers and T. J. Hastie, eds.) 309--376. Wadsworth, Pacific Grove, CA.
  • Crager, M. R. (1987). Analysis of covariance in parallel-group clinical trials with pretreatment baseline. Biometrics 43 895--901.
  • Follmann, D. A. (1991). The effect of screening on some pretest--posttest test variances. Biometrics 47 763--771.
  • Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundaker, H., Schooley, R. T., Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M., Hirsch, M. S. and Merigan, T. C., for The AIDS Clinical Trials Group Study 175 Study Team (1996). A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England J. Medicine 335 1081--1090.
  • Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall, London.
  • Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 663--685.
  • Koch, G. G., Tangen, C. M., Jung, J.-W. and Amara, I. A. (1998). Issues for covariance analysis of dichotomous and ordered categorical data from randomized clinical trials and non-parametric strategies for addressing them. Statistics in Medicine 17 1863--1892.
  • Laird, N. (1983). Further comparative analyses of pretest--posttest research designs. Amer. Statist. 37 329--330.
  • Leon, S., Tsiatis, A. A. and Davidian, M. (2003). Semiparametric estimation of treatment effect in a pretest--posttest study. Biometrics 59 1046--1055.
  • Luenberger, D. G. (1969). Optimization by Vector Space Methods. Wiley, New York.
  • Lunceford, J. K. and Davidian, M. (2004). Stratification and weighting via the propensity score in estimation of causal treatment effects: A comparative study. Statistics in Medicine 23 2937--2960.
  • Newey, W. K. (1990). Semiparametric efficiency bounds. J. Applied Econometrics 5 99--135.
  • Robins, J. M. (1999). Robust estimation in sequentially ignorable missing data and causal inference models. In ASA Proc. Bayesian Statistical Science Section 6--10. Amer. Statist. Assoc., Alexandria, VA.
  • Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. J. Amer. Statist. Assoc. 89 846--866.
  • Rubin, D. B. (1976). Inference and missing data (with discussion). Biometrika 63 581--592.
  • Scharfstein, D. O., Rotnitzky, A. and Robins, J. M. (1999). Rejoinder to "Adjusting for nonignorable drop-out using semiparametric nonresponse models." J. Amer. Statist. Assoc. 94 1135--1146.
  • Singer, J. M. and Andrade, D. F. (1997). Regression models for the analysis of pretest/posttest data. Biometrics 53 729--735.
  • Stanek, E. J., III (1988). Choosing a pretest--posttest analysis. Amer. Statist. 42 178--183.
  • Stein, R. A. (1989). Adjusting treatment effects for baseline and other predictor variables. In ASA Proc. Biopharmaceutical Section 274--280. Amer. Statist. Assoc., Alexandria, VA.
  • van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. Springer, New York.
  • Yang, L. and Tsiatis, A. A. (2001). Efficiency study of estimators for a treatment effect in a pretest--posttest trial. Amer. Statist. 55 314--321.