Abstract
Real-world evidence synthesis through the integration of data from distributed research networks has gained increasing attention in recent years. Because of privacy concerns and restrictions on sharing patient-level data, distributed algorithms that do not require sharing patient-level information are greatly needed to facilitate multisite collaborations. At the same time, data collected at multiple sites often come from diverse populations, and there is substantial heterogeneity across sites in patient characteristics. Most existing distributed algorithms ignore such between-site heterogeneity. In this paper we aim to fill this methodological gap by proposing a general distributed algorithm. We develop our distributed algorithm based on a general semiparametric model, namely, the proportional likelihood ratio model (Biometrika 99 (2012) 211–222), which is a semiparametric extension of the generalized linear model. We formulate the proportional likelihood ratio model with site-specific baseline functions, to account for between-site heterogeneity, and shared regression parameters, to borrow information across sites. Under this flexible formulation, our distributed algorithm is designed to be privacy-preserving and communication-efficient (i.e., only one round of communication across sites is needed). We validate our method via simulation studies and demonstrate its utility via a multisite study of pediatric avoidable hospitalization based on electronic health record data from a total of 354,672 patients across 26 clinical sites within the Children’s Hospital of Philadelphia health system.
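For orientation, the model structure described in the abstract can be sketched as follows. This is an illustrative formulation only, written under the assumption that the site-level model takes the standard form of the cited proportional likelihood ratio model. The notation (K sites, outcome y, covariates x, shared parameter \beta, site-specific baseline g_k, dominating measure \mu) is introduced here for illustration and may differ from the paper's.

```latex
% Sketch (assumed notation): conditional density of the outcome at site k,
% k = 1, ..., K, with shared regression parameter \beta and an
% unspecified site-specific baseline function g_k.
\[
  f_k(y \mid x)
  = \frac{\exp\!\left(y\, x^{\top}\beta\right) g_k(y)}
         {\int \exp\!\left(u\, x^{\top}\beta\right) g_k(u)\, d\mu(u)},
  \qquad k = 1, \dots, K .
\]
% For two outcome values y_1, y_0 at the same site, the baseline-adjusted
% log likelihood ratio is linear in x and free of the normalizing integral:
\[
  \log \frac{f_k(y_1 \mid x)\, g_k(y_0)}{f_k(y_0 \mid x)\, g_k(y_1)}
  = (y_1 - y_0)\, x^{\top}\beta .
\]
```

In this sketch, leaving each g_k unspecified is what accommodates between-site heterogeneity, while the common \beta is what allows information to be borrowed across sites. If g_k is fixed to a particular exponential-family baseline, the first display reduces to a generalized linear model with canonical link.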
Funding Statement
This work was supported in part by the National Institutes of Health (1R01LM012607, 1R01AI130460, 1R01AG073435, 1R56AG074604, 1R01LM013519, 1R56AG069880, 1R01AG077820, 1U01TR003709). This work was also supported in part through Patient-Centered Outcomes Research Institute (PCORI) Project Program Awards (ME-2019C3-18315 and ME-2018C3-14899). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of the Patient-Centered Outcomes Research Institute (PCORI), its Board of Governors, or Methodology Committee.
Acknowledgments
We thank the Editor, an Associate Editor, and two anonymous reviewers for their constructive comments, which helped us improve the manuscript significantly. We also thank all the participants in PEDSnet.
Citation
Chongliang Luo, Rui Duan, Mackenzie Edmondson, Jiasheng Shi, Mitchell Maltenfort, Jeffrey S. Morris, Christopher B. Forrest, Rebecca Hubbard, Yong Chen. "Distributed proportional likelihood ratio model with application to data integration across clinical sites." Ann. Appl. Stat. 18 (1) 63–79, March 2024. https://doi.org/10.1214/23-AOAS1779