Abstract
Multiple biomarkers are often combined to improve disease diagnosis. The uniformly optimal combination, that is, with respect to all reasonable performance metrics, unfortunately requires excessive distributional modeling, to which the estimation can be sensitive. An alternative strategy is rather to pursue local optimality with respect to a specific performance metric. Nevertheless, existing methods may not target clinical utility of the intended medical test, which usually needs to operate above a certain sensitivity or specificity level, or do not have their statistical properties well studied and understood. In this article, we develop and investigate a linear combination method to maximize the clinical utility empirically for such a constrained classification. The combination coefficient is shown to have cube root asymptotics. The convergence rate and limiting distribution of the predictive performance are subsequently established, exhibiting robustness of the method in comparison with others. An algorithm with sound statistical justification is devised for efficient and high-quality computation. Simulations corroborate the theoretical results, and demonstrate good statistical and computational performance. Illustration with a clinical study on aggressive prostate cancer detection is provided.
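To make the constrained objective concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the constraint is a minimum sensitivity, that clinical utility is measured by empirical specificity at the largest threshold meeting that constraint, and that only two biomarkers are combined, so the coefficient direction can be searched over a grid of angles. The function names `constrained_specificity` and `fit_two_marker_combination`, the grid search, and the toy data are all hypothetical illustrations.

```python
import math


def constrained_specificity(case_scores, control_scores, min_sensitivity):
    """Empirical specificity at the largest threshold whose empirical
    sensitivity is at least min_sensitivity (positive if score >= threshold)."""
    ranked = sorted(case_scores, reverse=True)
    # Number of cases that must test positive; the small tolerance guards
    # against floating-point overshoot in min_sensitivity * n.
    k = max(1, math.ceil(min_sensitivity * len(ranked) - 1e-9))
    threshold = ranked[k - 1]
    negatives = sum(1 for s in control_scores if s < threshold)
    return negatives / len(control_scores)


def fit_two_marker_combination(cases, controls, min_sensitivity, n_angles=360):
    """Grid search over unit-norm coefficients (cos t, sin t) for two markers.
    This brute-force search is an assumption made for illustration only."""
    best_beta, best_spec = None, -1.0
    for i in range(n_angles):
        theta = 2.0 * math.pi * i / n_angles
        b1, b2 = math.cos(theta), math.sin(theta)
        case_scores = [b1 * x + b2 * y for x, y in cases]
        control_scores = [b1 * x + b2 * y for x, y in controls]
        spec = constrained_specificity(case_scores, control_scores, min_sensitivity)
        if spec > best_spec:
            best_beta, best_spec = (b1, b2), spec
    return best_beta, best_spec


# Toy data (made up): diseased subjects have larger marker values.
cases = [(2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
controls = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0)]
beta, spec = fit_two_marker_combination(cases, controls, min_sensitivity=1.0)
```

Because the empirical objective is a step function of the coefficients, gradient-based optimizers do not apply directly. The grid search here sidesteps that only in two dimensions and does not scale to more markers.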
Funding Statement
The authors were supported in part by NIH Grants R01 CA230268, U01 CA113913 and P30 AI050409.
Acknowledgments
The authors thank the reviewers for their helpful comments and suggestions, in particular the Associate Editor for pointing out several mistakes in previous versions of the paper, and Dattatraya H. Patil for assistance in arranging the prostate cancer data set analyzed in Section 5.2.
Citation
Yijian Huang, Martin G. Sanda. "Linear biomarker combination for constrained classification." Ann. Statist. 50 (5) 2793-2815, October 2022. https://doi.org/10.1214/22-AOS2210
Information