Open Access
March 2016
Evaluating risk-prediction models using data from electronic health records
Le Wang, Pamela A. Shaw, Hansie M. Mathelier, Stephen E. Kimmel, Benjamin French
Ann. Appl. Stat. 10(1): 286-304 (March 2016). DOI: 10.1214/15-AOAS891


The availability of data from electronic health records facilitates the development and evaluation of risk-prediction models, but estimation of prediction accuracy could be limited by outcome misclassification, which can arise if events are not captured. We evaluate the robustness of prediction accuracy summaries, obtained from receiver operating characteristic curves and risk-reclassification methods, if events are not captured (i.e., “false negatives”). We derive estimators for sensitivity and specificity if misclassification is independent of marker values. In simulation studies, we quantify the potential for bias in prediction accuracy summaries if misclassification depends on marker values. We compare the accuracy of alternative prognostic models for 30-day all-cause hospital readmission among 4548 patients discharged from the University of Pennsylvania Health System with a primary diagnosis of heart failure. Simulation studies indicate that if misclassification depends on marker values, then the estimated accuracy improvement is also biased, but the direction of the bias depends on the direction of the association between markers and the probability of misclassification. In our application, 29% of the 1143 readmitted patients were readmitted to a hospital elsewhere in Pennsylvania, which reduced prediction accuracy. Outcome misclassification can result in erroneous conclusions regarding the accuracy of risk-prediction models.
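The case in which misclassification is independent of marker values can be illustrated with a short sketch. This is not the authors' derivation, which the abstract does not reproduce. It assumes only that a fixed fraction of true events is missed (recorded as non-events) regardless of the marker. Under that assumption, sensitivity computed from observed events is unchanged, while observed specificity is contaminated by missed events among the apparent non-events. The function names, the normally distributed marker, and the threshold are hypothetical choices for illustration. The prevalence (about 1143/4548) and the 29% miss probability echo the figures quoted above.

```python
import random
from statistics import NormalDist


def observed_specificity(sens, spec, prevalence, miss_prob):
    """Specificity measured against observed labels when each true event
    is missed with probability miss_prob, independently of the marker."""
    true_nonevents_below = (1 - prevalence) * spec
    missed_events_below = prevalence * miss_prob * (1 - sens)
    observed_nonevents = (1 - prevalence) + prevalence * miss_prob
    return (true_nonevents_below + missed_events_below) / observed_nonevents


def simulate(n, prevalence, miss_prob, threshold, seed=0):
    """Simulate a hypothetical marker, N(1, 1) in events and N(0, 1) in
    non-events, and return (sensitivity, specificity) computed from the
    observed, partially missed, outcome labels."""
    rng = random.Random(seed)
    tp = fn = tn = fp = 0
    for _ in range(n):
        event = rng.random() < prevalence
        marker = rng.gauss(1.0 if event else 0.0, 1.0)
        # A true event is captured only if it is not missed.
        captured = event and rng.random() >= miss_prob
        positive = marker > threshold
        if captured:
            tp += int(positive)
            fn += int(not positive)
        else:
            fp += int(positive)
            tn += int(not positive)
    return tp / (tp + fn), tn / (tn + fp)


# Assumed settings for illustration only.
PREVALENCE = 0.25
MISS_PROB = 0.29
THRESHOLD = 0.5

# True accuracy of the hypothetical marker at the chosen threshold.
true_sens = 1 - NormalDist(1.0, 1.0).cdf(THRESHOLD)
true_spec = NormalDist(0.0, 1.0).cdf(THRESHOLD)

expected_spec = observed_specificity(true_sens, true_spec, PREVALENCE, MISS_PROB)
sim_sens, sim_spec = simulate(200_000, PREVALENCE, MISS_PROB, THRESHOLD)

print(f"true sensitivity {true_sens:.3f}, observed {sim_sens:.3f}")
print(f"true specificity {true_spec:.3f}, observed {sim_spec:.3f}, "
      f"predicted by formula {expected_spec:.3f}")
```

In this sketch the observed specificity falls below the true value because missed events, which tend to have high marker values, are counted as non-events. If the miss probability instead depended on the marker, neither quantity would be protected, which is the setting the simulation studies address.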



Le Wang, Pamela A. Shaw, Hansie M. Mathelier, Stephen E. Kimmel, Benjamin French. "Evaluating risk-prediction models using data from electronic health records." Ann. Appl. Stat. 10(1): 286-304, March 2016.


Received: 1 December 2014; Revised: 1 July 2015; Published: March 2016
First available in Project Euclid: 25 March 2016

zbMATH: 1358.62109
MathSciNet: MR3480497
Digital Object Identifier: 10.1214/15-AOAS891

Keywords: Outcome misclassification, prediction accuracy, risk reclassification, ROC curves

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 10 • No. 1 • March 2016