The Annals of Applied Statistics

Quantification of multiple tumor clones using gene array and sequencing data

Yichen Cheng, James Y. Dai, Thomas G. Paulson, Xiaoyu Wang, Xiaohong Li, Brian J. Reid, and Charles Kooperberg

Cancer development is driven by genomic alterations, including copy number aberrations. The detection of copy number aberrations in tumor cells is often complicated by contamination of tumor samples with normal stromal cells and by intratumor heterogeneity, namely the presence of multiple clones of tumor cells. To quantify copy number aberrations correctly, it is critical to deconvolve the complex structure of the genetic information in tumor samples. In this article, we propose a general Bayesian method for estimating copy number aberrations when there are normal cells and potentially more than one tumor clone. Our method provides posterior probabilities for the proportions of tumor clones and normal cells. We incorporate prior information on the distribution of the copy numbers to prioritize biologically more plausible solutions and to alleviate identifiability issues that have been observed by many researchers. Our model is flexible and can work for both SNP array and next-generation sequencing data. We compare our method to existing ones and illustrate the advantage of our approach in multiple datasets.

Ann. Appl. Stat., Volume 11, Number 2 (2017), 967-991.

Received: May 2016
Revised: October 2016
First available in Project Euclid: 20 July 2017

Keywords: Copy number aberration; intratumor heterogeneity; identifiability; BIC


Cheng, Yichen; Dai, James Y.; Paulson, Thomas G.; Wang, Xiaoyu; Li, Xiaohong; Reid, Brian J.; Kooperberg, Charles. Quantification of multiple tumor clones using gene array and sequencing data. Ann. Appl. Stat. 11 (2017), no. 2, 967-991. doi:10.1214/17-AOAS1026.

