March 2024
Distributed proportional likelihood ratio model with application to data integration across clinical sites
Chongliang Luo, Rui Duan, Mackenzie Edmondson, Jiasheng Shi, Mitchell Maltenfort, Jeffrey S. Morris, Christopher B. Forrest, Rebecca Hubbard, Yong Chen
Ann. Appl. Stat. 18(1): 63-79 (March 2024). DOI: 10.1214/23-AOAS1779


Real-world evidence synthesis through integration of data from distributed research networks has gained increasing attention in recent years. Because of privacy concerns and restrictions on sharing patient-level data, distributed algorithms that do not require sharing patient-level information are greatly needed to facilitate multisite collaborations. At the same time, data collected at multiple sites often come from diverse populations, and patient characteristics are substantially heterogeneous across sites. Most existing distributed algorithms ignore such between-site heterogeneity. In this paper we aim to fill this methodological gap by proposing a general distributed algorithm based on a semiparametric model, namely, the proportional likelihood ratio model (Biometrika 99 (2012) 211–222), which is a semiparametric extension of the generalized linear model. We formulate the proportional likelihood ratio model with site-specific baseline functions, to account for between-site heterogeneity, and shared regression parameters, to borrow information across sites. Under this flexible formulation, our distributed algorithm is designed to be privacy-preserving and communication-efficient (i.e., only one round of communication across sites is needed). We validate our method via simulation studies and demonstrate its utility via a multisite study of pediatric avoidable hospitalization based on electronic health record data from a total of 354,672 patients across 26 clinical sites within the Children’s Hospital of Philadelphia health system.
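
To make the role of the site-specific baseline concrete, here is a minimal sketch, not the authors' distributed algorithm. It assumes the proportional likelihood ratio model takes the form f(y | x) proportional to exp(beta' x y) g_k(y), with g_k an unspecified baseline for site k. Under that assumption, for two patients i and j at the same site, the probability of the observed pairing given the unordered pair of outcomes is 1 / (1 + exp(-(y_i - y_j)(x_i - x_j)' beta)), which no longer involves g_k. The function names, the simulated binary outcomes, and the grid search are illustrative choices that do not come from the paper, and the data are pooled here for simplicity rather than kept at the sites.

```python
import numpy as np

def pairwise_loglik(beta, X, y):
    """Within-site pairwise conditional log-likelihood (hypothetical helper).

    Each pair contributes log(1 / (1 + exp(-eta))) with
    eta = (y_i - y_j) * (x_i - x_j)' beta, so the baseline g_k never appears.
    """
    i, j = np.triu_indices(len(y), k=1)        # all pairs i < j within one site
    eta = (y[i] - y[j]) * ((X[i] - X[j]) @ beta)
    return -np.sum(np.logaddexp(0.0, -eta))    # numerically stable log-sigmoid

def pooled_pairwise_loglik(beta, sites):
    """Sum of within-site contributions; pairs are never formed across sites."""
    return sum(pairwise_loglik(beta, X, y) for X, y in sites)

# Simulated example (assumption): binary outcomes, a shared slope of 1.0,
# and a different intercept at each site standing in for a site-specific baseline.
rng = np.random.default_rng(0)
true_beta = 1.0
sites = []
for intercept in (-1.0, 0.5):
    X = rng.normal(size=(300, 1))
    p = 1.0 / (1.0 + np.exp(-(intercept + true_beta * X[:, 0])))
    y = (rng.random(300) < p).astype(float)
    sites.append((X, y))

# Crude one-dimensional grid search for the shared regression parameter.
grid = np.linspace(-1.0, 3.0, 401)
values = [pooled_pairwise_loglik(np.array([b]), sites) for b in grid]
beta_hat = grid[int(np.argmax(values))]
print("estimated shared slope:", beta_hat)
```

Because the intercepts never enter the objective, the shared slope can be estimated without modeling each site's baseline. How the paper reduces this kind of estimation to a single round of communication is not shown in this sketch.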

Funding Statement

This work was supported in part by the National Institutes of Health (1R01LM012607, 1R01AI130460, 1R01AG073435, 1R56AG074604, 1R01LM013519, 1R56AG069880, 1R01AG077820, 1U01TR003709) and in part by Patient-Centered Outcomes Research Institute (PCORI) Project Program Awards (ME-2019C3-18315 and ME-2018C3-14899). All statements in this report, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of PCORI, its Board of Governors, or its Methodology Committee.


Acknowledgments

We thank the Editor, an Associate Editor, and two anonymous reviewers for their constructive comments, which helped us improve the manuscript significantly. We also thank all the participants in PEDSnet.




Received: 1 February 2021; Revised: 1 November 2021; Published: March 2024
First available in Project Euclid: 31 January 2024

Digital Object Identifier: 10.1214/23-AOAS1779

Keywords: distributed research network, heterogeneity-aware distributed algorithms, noniterative distributed algorithm, privacy-preserving, real-world evidence

Rights: Copyright © 2024 Institute of Mathematical Statistics


