Bayesian nested latent class models for cause-of-death assignment using verbal autopsies across multiple domains
Zehang Richard Li, Zhenke Wu, Irena Chen, Samuel J. Clark
Ann. Appl. Stat. 18(2): 1137-1159 (June 2024). DOI: 10.1214/23-AOAS1826

Abstract

Understanding cause-specific mortality rates is crucial for monitoring population health and designing public health interventions. Worldwide, two-thirds of deaths do not have a cause assigned. Verbal autopsy (VA) is a well-established tool for collecting information about deaths that occur outside of hospitals by surveying the caregivers of a deceased person. It is routinely implemented in many low- and middle-income countries. Statistical algorithms that assign cause of death using VAs are typically vulnerable to distribution shift between the data used to train the model and the target population. This presents a major challenge for analyzing VAs, as labeled data are usually unavailable in the target population. This article proposes a latent class model framework for VA data (LCVA) that jointly models VAs collected over multiple heterogeneous domains, assigns causes of death for out-of-domain observations, and estimates cause-specific mortality fractions for a new domain. We introduce a parsimonious representation of the joint distribution of the collected symptoms using nested latent class models and develop a computationally efficient algorithm for posterior inference. We demonstrate that LCVA outperforms existing methods in predictive performance and scalability. Supplementary Material and reproducible analysis codes are available online. The R package LCVA implementing the method is available on GitHub (https://github.com/richardli/LCVA).
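
As an illustration of the general idea behind a nested latent class model for binary symptoms, the sketch below may help. It is not the LCVA package or the paper's exact specification: the domain-level structure and the Bayesian posterior inference described in the abstract are not reproduced. It assumes that, within each cause, symptoms are conditionally independent given one of K latent classes, and it uses made-up toy parameters. The function names `cause_posterior` and `csmf_estimate` and all parameter values are hypothetical.

```python
import math

def cause_posterior(x, pi, lam, theta):
    """Posterior probability of each cause for one death.

    x     : list of 0/1 symptom indicators, length P
    pi    : cause-specific mortality fractions (prior over causes), length C
    lam   : lam[c][k], weight of latent class k within cause c
    theta : theta[c][k][j], P(symptom j = 1 | cause c, class k)
    All parameters are assumed known here (hypothetical toy values).
    """
    scores = []
    for c in range(len(pi)):
        lik_c = 0.0
        for k in range(len(lam[c])):
            # log P(class k | cause c) + log P(x | cause c, class k)
            log_p = math.log(lam[c][k])
            for j, xj in enumerate(x):
                p = theta[c][k][j]
                log_p += math.log(p) if xj == 1 else math.log(1.0 - p)
            # sum over latent classes gives P(x | cause c)
            lik_c += math.exp(log_p)
        scores.append(pi[c] * lik_c)
    total = sum(scores)
    return [s / total for s in scores]

def csmf_estimate(X, pi, lam, theta):
    """Plug-in summary: average the per-death cause posteriors."""
    totals = [0.0] * len(pi)
    for x in X:
        post = cause_posterior(x, pi, lam, theta)
        for c, p in enumerate(post):
            totals[c] += p
    return [t / len(X) for t in totals]

# Toy example: 2 causes, 2 latent classes per cause, 3 symptoms.
pi = [0.5, 0.5]
lam = [[0.6, 0.4], [0.5, 0.5]]
theta = [
    [[0.9, 0.8, 0.1], [0.7, 0.9, 0.2]],  # cause 0
    [[0.1, 0.2, 0.9], [0.2, 0.1, 0.8]],  # cause 1
]
deaths = [[1, 1, 0], [0, 0, 1], [1, 1, 0]]
print(cause_posterior(deaths[0], pi, lam, theta))
print(csmf_estimate(deaths, pi, lam, theta))
```

Averaging the per-death posteriors is only a simple plug-in summary chosen for this sketch; the paper describes estimating cause-specific mortality fractions for a new domain as part of its posterior inference.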

Funding Statement

ZRL and ZW were supported by grant R03HD110962 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). ZW and IC were supported in part by a seed grant from Michigan Institute of Data Science. SJC was supported by grant R01HD086227 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD). ZRL and SJC were funded in part by the Bill & Melinda Gates Foundation.
The findings and conclusions contained within are those of the authors and do not necessarily reflect positions or policies of the Bill & Melinda Gates Foundation.

Acknowledgments

The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their constructive comments that improved the quality of this paper. The authors are grateful to the openVA team for discussion and feedback on the paper.

Citation

Zehang Richard Li, Zhenke Wu, Irena Chen, Samuel J. Clark. "Bayesian nested latent class models for cause-of-death assignment using verbal autopsies across multiple domains." Ann. Appl. Stat. 18 (2) 1137-1159, June 2024. https://doi.org/10.1214/23-AOAS1826

Information

Received: 1 April 2022; Revised: 1 June 2023; Published: June 2024
First available in Project Euclid: 5 April 2024

Digital Object Identifier: 10.1214/23-AOAS1826

Keywords: data shift, dependent binary data, domain adaptation, mixture model, quantification learning

Rights: Copyright © 2024 Institute of Mathematical Statistics

JOURNAL ARTICLE
23 PAGES

