The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 9, Number 2 (2015), 621-639.
A Bayesian feature allocation model for tumor heterogeneity
Juhee Lee, Peter Müller, Kamalakar Gulukota, and Yuan Ji
Abstract
We develop a feature allocation model for inference on genetic tumor variation using next-generation sequencing data. Specifically, we record single nucleotide variants (SNVs) based on short reads mapped to the human reference genome and characterize tumor heterogeneity by latent haplotypes, defined as a scaffold of SNVs on the same homologous genome. For multiple samples from a single tumor, we assume that each sample is composed of sample-specific proportions of these haplotypes; we then fit the observed variant allele fractions of SNVs for each sample and estimate the proportions of haplotypes. Variation in haplotype proportions across samples is evidence of tumor heterogeneity, since it implies varying composition of cell subpopulations. Taking a Bayesian perspective, we proceed with a prior probability model for all relevant unknown quantities, including, in particular, a prior probability model on the binary indicators that characterize the latent haplotypes. Such prior models are known as feature allocation models. Specifically, we define a simplified version of the Indian buffet process, one of the most traditional feature allocation models. The proposed model allows overlapping clustering of SNVs in defining latent haplotypes, which reflects the evolutionary process of subclonal expansion in tumor samples.
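The structure described in the abstract can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the authors' model or code: it assumes a finite beta-Bernoulli prior as a stand-in for the simplified Indian buffet process, Dirichlet(1, ..., 1) haplotype proportions per sample, and binomial read counts, and it ignores background noise and copy number. All function and variable names are hypothetical.

```python
import random


def sample_binary_matrix(num_snvs, num_haplotypes, alpha, rng):
    """Draw a binary SNV-by-haplotype matrix Z.

    Assumption: finite beta-Bernoulli prior, pi_c ~ Beta(alpha / C, 1) and
    z[s][c] ~ Bernoulli(pi_c). An SNV may belong to several haplotypes,
    which is what allows overlapping clusters of SNVs.
    """
    columns = []
    for _ in range(num_haplotypes):
        pi_c = rng.betavariate(alpha / num_haplotypes, 1.0)
        columns.append([1 if rng.random() < pi_c else 0 for _ in range(num_snvs)])
    # Transpose so that rows index SNVs and columns index haplotypes.
    return [list(row) for row in zip(*columns)]


def sample_proportions(num_haplotypes, rng):
    """Draw one sample's haplotype proportions from a Dirichlet(1, ..., 1).

    Normalized unit-rate exponential variates give a uniform draw on the simplex.
    """
    draws = [rng.expovariate(1.0) for _ in range(num_haplotypes)]
    total = sum(draws)
    return [d / total for d in draws]


def expected_vaf(z, w):
    """Expected variant allele fraction p[s][t] = sum_c w[t][c] * z[s][c]."""
    return [
        [sum(w_t[c] * z_s[c] for c in range(len(z_s))) for w_t in w]
        for z_s in z
    ]


def simulate_variant_reads(p, depth, rng):
    """Draw n[s][t] ~ Binomial(depth, p[s][t]) as a sum of Bernoulli trials."""
    return [
        [sum(1 for _ in range(depth) if rng.random() < p_st) for p_st in row]
        for row in p
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    num_snvs, num_haplotypes, num_samples, depth = 6, 3, 4, 50
    z = sample_binary_matrix(num_snvs, num_haplotypes, alpha=2.0, rng=rng)
    w = [sample_proportions(num_haplotypes, rng) for _ in range(num_samples)]
    p = expected_vaf(z, w)
    n = simulate_variant_reads(p, depth, rng)
    for s in range(num_snvs):
        print(z[s], [round(count / depth, 2) for count in n[s]])
```

In this toy setup, differences between the rows of `w` are what produce different variant allele fractions for the same SNV across samples; inference in the paper runs in the opposite direction, from observed fractions back to the binary matrix and the proportions.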
Article information
Dates
Received: July 2014
Revised: January 2015
First available in Project Euclid: 20 July 2015
Permanent link to this document
https://projecteuclid.org/euclid.aoas/1437397104
Digital Object Identifier
doi:10.1214/15-AOAS817
Mathematical Reviews number (MathSciNet)
MR3371328
Zentralblatt MATH identifier
06499923
Keywords
Haplotypes; feature allocation models; Indian buffet process; Markov chain Monte Carlo; next-generation sequencing; random binary matrices; variant calling
Citation
Lee, Juhee; Müller, Peter; Gulukota, Kamalakar; Ji, Yuan. A Bayesian feature allocation model for tumor heterogeneity. Ann. Appl. Stat. 9 (2015), no. 2, 621--639. doi:10.1214/15-AOAS817. https://projecteuclid.org/euclid.aoas/1437397104
Supplemental materials
- Supplement to “A Bayesian feature allocation model for tumor heterogeneity”. The supplementary material includes the second simulation study. Digital Object Identifier: doi:10.1214/15-AOAS817SUPP