September 2022
Contrastive latent variable modeling with application to case-control sequencing experiments
Andrew Jones, F. William Townes, Didong Li, Barbara E. Engelhardt
Ann. Appl. Stat. 16(3): 1268-1291 (September 2022). DOI: 10.1214/21-AOAS1534

Abstract

High-throughput RNA-sequencing (RNA-seq) technologies are powerful tools for understanding cellular state. Often, it is of interest to quantify and to summarize changes in cell state that occur between experimental or biological conditions. Differential expression is typically assessed using univariate tests to measure genewise shifts in expression. However, these methods largely ignore changes in transcriptional correlation. Furthermore, there is a need to identify the low-dimensional structure of the gene expression shift to identify collections of genes that change between conditions. Here, we propose contrastive latent variable models designed for count data to create a richer portrait of differential expression in sequencing data. These models disentangle the sources of transcriptional variation in different conditions in the context of an explicit model of variation at baseline. Moreover, we develop a model-based hypothesis testing framework that can test for global and gene subset-specific changes in expression. We evaluate our model through extensive simulations and analyses with count-based gene expression data from perturbation and observational sequencing experiments. We find that our methods effectively summarize and quantify complex transcriptional changes in case-control experimental sequencing data.

Funding Statement

AJ, FWT, DL, and BEE were supported by a grant from the Helmsley Trust, a grant from the NIH Human Tumor Atlas Research Program, NIH NHLBI R01 HL133218, and NSF CAREER AWD1005627.

Acknowledgments

We would like to thank the Editor, Associate Editor, and anonymous referees for their constructive comments. We thank Danny Simpson and Isabella Grabski for helpful conversations. DL is also affiliated with the Department of Biostatistics, University of California, Los Angeles. DL and BEE are also affiliated with Gladstone Institutes.

Citation

Andrew Jones, F. William Townes, Didong Li, Barbara E. Engelhardt. "Contrastive latent variable modeling with application to case-control sequencing experiments." Ann. Appl. Stat. 16(3): 1268-1291, September 2022. https://doi.org/10.1214/21-AOAS1534

Information

Received: 1 February 2021; Revised: 1 July 2021; Published: September 2022
First available in Project Euclid: 19 July 2022

MathSciNet: MR4455880
zbMATH: 1498.62228
Digital Object Identifier: 10.1214/21-AOAS1534

Keywords: case-control data, contrastive models, differential expression, latent variable models, RNA sequencing

Rights: Copyright © 2022 Institute of Mathematical Statistics

JOURNAL ARTICLE
24 PAGES
