Abstract
In The Cancer Genome Atlas (TCGA) data set, there are many interesting nonlinear dependencies between pairs of genes that reveal important relationships and subtypes of cancer. Such genomic data analysis requires a rapid, powerful, and interpretable detection process, especially in a high-dimensional environment. We study the nonlinear patterns among the expression of pairs of genes from TCGA using a powerful tool called binary expansion testing. We find many nonlinear patterns, some of which are driven by known cancer subtypes and others of which are novel.
Funding Statement
Xiang’s research was supported by SAMSI, NIH/NIAMS Grants P30AR072580 and R21AR074685, and DMS-2152289 from NSF.
Perou’s research was supported by NCI Breast SPORE program P50-CA58223 and U01CA238475-01.
Zhang’s research was partially supported by NSF Grants DMS-1613112, IIS-1633212, DMS-1916237 and DMS-2152289.
Marron’s research was supported by NSF Grants IIS-1633074 and DMS-2113404.
Acknowledgments
The results published here are in whole or part based upon data from the Cancer Genome Atlas managed by the NCI and NHGRI (dbGaP accession phs000178).
Citation
Siqi Xiang, Wan Zhang, Siyao Liu, Katherine A. Hoadley, Charles M. Perou, Kai Zhang, J. S. Marron. "Pairwise nonlinear dependence analysis of genomic data." Ann. Appl. Stat. 17 (4) 2924-2943, December 2023. https://doi.org/10.1214/23-AOAS1745