The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 13, Number 2 (2019), 874-899.
TreeClone: Reconstruction of tumor subclone phylogeny based on mutation pairs using next generation sequencing data
We present TreeClone, a latent feature allocation model to reconstruct tumor subclones subject to phylogenetic evolution that mimics tumor evolution. Like most current methods, we consider data from next-generation sequencing of tumor DNA. Unlike most methods, which use information in short reads mapped to single nucleotide variants (SNVs), we consider subclone phylogeny reconstruction using pairs of proximal SNVs that can be mapped by the same short reads. As part of the Bayesian inference model, we construct a phylogenetic tree prior. The use of the tree structure in the prior greatly strengthens inference: only subclones that can be explained by a phylogenetic tree are assigned non-negligible probabilities. The proposed Bayesian framework implies posterior distributions on the number of subclones, their genotypes, cellular proportions and the phylogenetic tree spanned by the inferred subclones. The proposed method is validated against different sets of simulated and real-world data using single and multiple tumor samples. An open-source software package is available at http://www.compgenome.org/treeclone.
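To make the notion of "proximal SNVs that can be mapped by the same short reads" concrete, the following is a minimal sketch, not TreeClone's actual pairing procedure. It assumes a single chromosome, a fixed read length, and a simple greedy rule in which each SNV joins at most one pair. The function name `proximal_snv_pairs` and the read length of 100 bp are hypothetical choices made for illustration only.

```python
from typing import List, Tuple


def proximal_snv_pairs(positions: List[int], read_length: int) -> List[Tuple[int, int]]:
    """Greedily pair neighboring SNV positions that one read could span.

    Illustrative assumption: a read of length L covers L consecutive
    positions, so it can cover two SNVs only if their distance is below L.
    """
    ordered = sorted(positions)
    pairs: List[Tuple[int, int]] = []
    i = 0
    while i < len(ordered) - 1:
        left, right = ordered[i], ordered[i + 1]
        if right - left < read_length:
            # Both SNVs fit inside a single read: record them as a pair.
            pairs.append((left, right))
            i += 2  # each SNV joins at most one pair in this sketch
        else:
            # Too far apart: the left SNV stays unpaired.
            i += 1
    return pairs


if __name__ == "__main__":
    # Hypothetical SNV coordinates on one chromosome.
    print(proximal_snv_pairs([100, 150, 900, 2000, 2040, 2060], read_length=100))
```

In this sketch, SNVs that have no close enough neighbor are simply left unpaired; how such SNVs are treated in the actual method is not stated in the abstract.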
Received: October 2017
Revised: August 2018
First available in Project Euclid: 17 June 2019
Zhou, Tianjian; Sengupta, Subhajit; Müller, Peter; Ji, Yuan. TreeClone: Reconstruction of tumor subclone phylogeny based on mutation pairs using next generation sequencing data. Ann. Appl. Stat. 13 (2019), no. 2, 874-899. doi:10.1214/18-AOAS1224. https://projecteuclid.org/euclid.aoas/1560758431
- Supplement to “TreeClone: Reconstruction of Tumor Subclone Phylogeny Based on Mutation Pairs using Next Generation Sequencing Data”. We provide the R package TreeClone, a glossary of biological terms and the supplementary details referenced in the main text.