Abstract
Characterizing the cumulative burden of COVID-19 by race/ethnicity is of the utmost importance for public health researchers and policymakers seeking to design effective mitigation measures. This analysis is hampered, however, by surveillance case data with substantial missingness in race and ethnicity covariates. Worse yet, this missingness likely depends on the values of these missing covariates; that is, the covariates are not missing at random (NMAR). We propose a Bayesian parametric model that leverages joint information on spatial variation in the disease and covariate missingness processes and can accommodate both missing-at-random (MAR) and NMAR missingness. We show that the model is locally identifiable when the spatial distribution of the population covariates is known and observed cases can be associated with a spatial unit of observation. We also use a simulation study to investigate the model’s finite-sample performance. We compare our model’s performance on NMAR data against complete-case analysis and multiple imputation (MI), both of which are commonly used by public health researchers when confronted with missing categorical covariates. Finally, we model spatial variation in cumulative COVID-19 incidence in Wayne County, Michigan, using data from the Michigan Department of Health and Human Services. The analysis suggests that population relative risk estimates by race during the early part of the COVID-19 pandemic in Michigan were understated for non-white residents, compared to white residents, when cases missing race were dropped or had these values imputed using MI.
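As a rough illustration of why complete-case analysis can understate relative risk under NMAR missingness, the sketch below works through the arithmetic with expected counts. It is not the model proposed in the paper. All numbers and names (`group_a`, `group_b`, the populations, case counts, and recording probabilities) are hypothetical assumptions chosen for the example, including the assumption that race is recorded less often for the higher-incidence group.

```python
# Illustrative arithmetic only; all values below are hypothetical assumptions,
# not estimates from the paper or from Michigan surveillance data.

def relative_risk(cases_a, pop_a, cases_b, pop_b):
    """Ratio of cumulative incidence in group a to that in group b."""
    return (cases_a / pop_a) / (cases_b / pop_b)

# Known population sizes for two hypothetical groups.
population = {"group_a": 100_000, "group_b": 100_000}

# True case counts (unobservable in full when race is missing for some cases).
true_cases = {"group_a": 2_000, "group_b": 1_000}

# NMAR mechanism: the probability that race is recorded depends on race itself.
p_race_recorded = {"group_a": 0.5, "group_b": 0.8}

# Expected number of cases with race recorded, i.e. the complete cases.
complete_cases = {g: true_cases[g] * p_race_recorded[g] for g in true_cases}

true_rr = relative_risk(
    true_cases["group_a"], population["group_a"],
    true_cases["group_b"], population["group_b"],
)
complete_case_rr = relative_risk(
    complete_cases["group_a"], population["group_a"],
    complete_cases["group_b"], population["group_b"],
)

print(f"true relative risk:          {true_rr:.2f}")
print(f"complete-case relative risk: {complete_case_rr:.2f}")
```

Under these assumed values the complete-case estimate equals the true relative risk multiplied by the ratio of the recording probabilities (0.5 / 0.8), so dropping cases with missing race shrinks the apparent disparity. If the recording probabilities were equal across groups, the two estimates would coincide.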
Funding Statement
JZ and RT were supported by award #6 U01 IP00113801-01 from the U.S. Centers for Disease Control and Prevention, and award #812255 from the Simons Foundation.
This research was supported in part through computational resources and services provided by Advanced Research Computing (ARC), a division of Information and Technology Services (ITS) at the University of Michigan, Ann Arbor.
YC is supported by NSF Grants DMS-1811083 and 2113397.
Acknowledgments
We would like to thank Mitzi Morris, Andrew Gelman, and Bob Carpenter for their feedback on an earlier draft of the paper.
Citation
Rob Trangucci, Yang Chen, Jon Zelner. "Modeling racial/ethnic differences in COVID-19 incidence with covariates subject to nonrandom missingness." Ann. Appl. Stat. 17 (4) 2723-2758, December 2023. https://doi.org/10.1214/22-AOAS1711