Abstract
The receiver operating characteristic (ROC) curve describes the performance of a diagnostic test used to discriminate between healthy and diseased individuals based on a variable measured on a continuous scale. The data consist of a training set of m responses $X_1, \dots, X_m$ from healthy individuals and n responses $Y_1, \dots, Y_n$ from diseased individuals. The responses are assumed i.i.d. from unknown distributions F and G, respectively. We consider estimation of the ROC curve defined by $1 - G(F^{-1} (1 - t))$ for $0 \leq t \leq 1$ or, equivalently, the ordinal dominance curve (ODC) given by $F(G^{-1} (t))$. First we consider nonparametric estimators based on empirical distribution functions and derive asymptotic properties. Next we consider the so-called semiparametric "binormal" model, in which it is assumed that the distributions F and G are normal after some unknown monotonic transformation of the measurement scale. For this model, we propose a generalized least squares procedure and compare it with the estimation algorithm of Dorfman and Alf, which is based on grouped data. Asymptotic results are obtained; small sample properties are examined via a simulation study. Finally, we describe a minimum distance estimator for the ROC curve, which does not require grouping the data.
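As an illustration of the nonparametric estimator mentioned above, the following is a minimal sketch that plugs the empirical distribution functions $F_m$ and $G_n$ into $1 - G(F^{-1}(1 - t))$ and $F(G^{-1}(t))$. The function names, the left-continuous quantile convention $F_m^{-1}(p) = \inf\{u : F_m(u) \geq p\}$ with $F_m^{-1}(0) = -\infty$, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def _ecdf(sample_sorted, u):
    """Empirical CDF: fraction of the (sorted) sample that is <= u."""
    return np.searchsorted(sample_sorted, u, side="right") / len(sample_sorted)

def _equantile(sample_sorted, p):
    """Empirical quantile inf{u : ECDF(u) >= p}; returns -inf when p == 0 (assumed convention)."""
    n = len(sample_sorted)
    # k is the 1-based index of the order statistic; k == 0 maps to -inf.
    k = np.ceil(n * np.asarray(p, dtype=float)).astype(int)
    padded = np.concatenate(([-np.inf], sample_sorted))
    return padded[k]

def empirical_roc(x, y, t):
    """Plug-in ROC estimate 1 - G_n(F_m^{-1}(1 - t)); x = healthy, y = diseased."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    t = np.asarray(t, dtype=float)
    return 1.0 - _ecdf(y, _equantile(x, 1.0 - t))

def empirical_odc(x, y, t):
    """Plug-in ODC estimate F_m(G_n^{-1}(t))."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    return _ecdf(x, _equantile(y, t))

# Toy data (hypothetical): diseased responses tend to be larger.
healthy = [1.0, 2.0, 3.0, 4.0]
diseased = [2.5, 3.5, 4.5, 5.5]
print(empirical_roc(healthy, diseased, [0.0, 0.25, 0.5, 1.0]))
print(empirical_odc(healthy, diseased, [0.0, 0.5, 1.0]))
```

Both estimates are step functions of $t$, so evaluating them on a fine grid of $t$ values traces out the full empirical curve. Other tie-handling or quantile conventions would shift the jump points slightly.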
Citation
Fushing Hsieh, Bruce W. Turnbull. "Nonparametric and semiparametric estimation of the receiver operating characteristic curve." Ann. Statist. 24 (1), 25-40, February 1996. https://doi.org/10.1214/aos/1033066197
Information