Abstract
Mechanistic models fit to streaming surveillance data are critical for understanding the transmission dynamics of an outbreak as it unfolds in real time. However, transmission model parameter estimation can be imprecise, sometimes even impossible, because surveillance data are noisy and not informative about all aspects of the mechanistic model. To partially overcome this obstacle, Bayesian models have been proposed to integrate multiple surveillance data streams. We devised a modeling framework for integrating SARS-CoV-2 diagnostic test and mortality time series data as well as seroprevalence data from cross-sectional studies and tested the importance of individual data streams for both inference and forecasting. Importantly, our model for incidence data accounts for changes in the total number of tests performed. We apply our Bayesian data integration method to COVID-19 surveillance data collected in Orange County, California, between March 2020 and February 2021 and find that 32–72% of Orange County residents experienced SARS-CoV-2 infection by mid-January 2021. Despite this high number of infections, our results suggest that the abrupt end of the winter surge in January 2021 was due to both behavioral changes and a high level of accumulated natural immunity.
Funding Statement
We are grateful for funding from the UCI Infectious Disease Science Initiative. This work was made possible in part through support from the UC CDPH Modeling Consortium. D.B., I.H.G., and V.M.M. were supported in part by NIH grant R01AI147336. V.M.M. was supported, in part, by NIH grant R01AI170204 and NSF grant DMS 1936833. E.R. was supported by the Division of Intramural Research, NIAID, NIH. This project has been funded, in part, with federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. 75N91019D00024, Task Order No. 75N91019F00130. This work was, in part, supported by the intramural research programs of the National Institutes of Health, Bethesda, MD. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
Acknowledgments
This work utilized the infrastructure for high-performance and high-throughput computing, research data storage and analysis, and scientific software tool integration built, operated, and updated by the Research Cyberinfrastructure Center (RCIC) at the University of California, Irvine. We thank the UC Health & CDPH COVID Modeling Consortium, especially Marm Kilpatrick, Tomás Leon, Paul Mattern, Maya Petersen, and Joshua Schwab, for giving us feedback on this project.
Citation
Damon Bayer, Isaac H. Goldstein, Jonathan Fintzi, Keith Lumbard, Emily Ricotta, Sarah Warner, Jeffrey R. Strich, Daniel S. Chertow, Lindsay M. Busch, Daniel M. Parker, Bernadette Boden-Albala, Richard Chhuon, Matthew Zahn, Nichole Quick, Alissa Dratch, Volodymyr M. Minin. "Semiparametric modeling of SARS-CoV-2 transmission using tests, cases, deaths, and seroprevalence data." Ann. Appl. Stat. 18 (3) 2307–2325, September 2024. https://doi.org/10.1214/24-AOAS1882
Information