Logistic regression is widely used in medical studies to investigate the relationship between a binary response variable Y and a set of potential predictors X. The binary response may represent, for example, the occurrence of some outcome of interest (Y=1 if the outcome occurred and Y=0 otherwise). In this paper, we consider the problem of estimating the logistic regression model with a cure fraction. A sample of observations is said to contain a cure fraction when a proportion of the study subjects (the so-called cured individuals, as opposed to the susceptibles) cannot experience the outcome of interest. The difficulty is that it is usually unknown which subjects are cured and which are susceptible, unless the outcome of interest has been observed. In this setting, a logistic regression analysis of the relationship between X and Y among the susceptibles is no longer straightforward. We develop a maximum likelihood estimation procedure for this problem, based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model. Then, we establish the consistency and asymptotic normality of the proposed estimator, and we conduct a simulation study to investigate its finite-sample behavior.
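To make the joint-modeling idea concrete, the following minimal sketch writes down one natural form of the observed-data log-likelihood. It is an illustration under stated assumptions, not necessarily the exact specification used in the paper: we assume a latent susceptibility indicator S with P(S=1 | Z) logistic in a covariate vector Z (coefficients `theta`), and P(Y=1 | X, S=1) logistic in X (coefficients `beta`), so that P(Y=1 | X, Z) = P(S=1 | Z) P(Y=1 | X, S=1). The names `Z`, `theta`, `beta`, `sigmoid` and `log_likelihood` are hypothetical and introduced only for this example.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def log_likelihood(beta, theta, data):
    """Observed-data log-likelihood for an assumed cure-fraction model.

    data is an iterable of (y, x, z) triples, where y is 0 or 1 and
    x, z are covariate lists (each including a leading 1 for the intercept).
    Assumption: cure status is never observed directly, so a subject with
    y = 0 is either cured or susceptible without the event.
    """
    total = 0.0
    for y, x, z in data:
        # Probability of being susceptible (not cured), logistic in z.
        p_susceptible = sigmoid(sum(t * zj for t, zj in zip(theta, z)))
        # Probability of the event among susceptibles, logistic in x.
        p_event = sigmoid(sum(b * xj for b, xj in zip(beta, x)))
        # Marginal probability of observing y = 1.
        p_observed = p_susceptible * p_event
        total += math.log(p_observed) if y == 1 else math.log(1.0 - p_observed)
    return total
```

Maximizing this function over `beta` and `theta` (for example with a generic numerical optimizer) would give a maximum likelihood estimate under these assumptions. The sketch also hints at why identifiability needs investigation: with intercepts only, the likelihood depends on the two parameters solely through the product of the two probabilities, so they cannot be separated.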
"Maximum likelihood estimation in the logistic regression model with a cure fraction." Electron. J. Statist. 5 460 - 483, 2011. https://doi.org/10.1214/11-EJS616