The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 10, Number 1 (2016), 506-526.
Change point analysis of histone modifications reveals epigenetic blocks linking to physical domains
Histone modification is a vital epigenetic mechanism for transcriptional control in eukaryotes. High-throughput techniques have enabled whole-genome analysis of histone modifications in recent years. However, most studies assume that one combination of histone modifications invariantly translates to one transcriptional output, regardless of the local chromatin environment. In this study we hypothesize that the genome is organized into local domains that manifest a similar enrichment pattern of histone modifications, which leads to orchestrated regulation of the expression of genes with relevant biological functions. We propose a multivariate Bayesian Change Point (BCP) model to segment the Drosophila melanogaster genome into consecutive blocks on the basis of combinatorial patterns of histone marks. By modeling the sparse distribution of histone marks with a zero-inflated Gaussian mixture, our partitions capture local BLOCKs that manifest a relatively homogeneous enrichment pattern of histone marks. We further characterize BLOCKs by their transcription levels, distribution of genes, degree of co-regulation and GO enrichment. Our results demonstrate that these BLOCKs, although inferred merely from histone modifications, are strongly related to physical domains, which suggests that they play important roles in chromatin organization and coordinated gene regulation.
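To make the idea of a zero-inflated Gaussian and of change point segmentation concrete, the following is a minimal, univariate sketch. It is not the multivariate BCP model of the paper: it swaps in a plain maximum-likelihood scan for a single change point, and all function names and the parameters `min_sigma` and `min_len` are hypothetical choices made for illustration.

```python
import numpy as np

def zero_inflated_gaussian_loglik(x, pi_zero, mu, sigma):
    """Log-likelihood of x under a point mass at 0 (weight pi_zero)
    mixed with a Gaussian N(mu, sigma^2) (weight 1 - pi_zero)."""
    x = np.asarray(x, dtype=float)
    is_zero = (x == 0.0)
    # Gaussian log-density, used only for the nonzero observations
    log_gauss = -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
    loglik = np.where(is_zero, np.log(pi_zero), np.log1p(-pi_zero) + log_gauss)
    return float(loglik.sum())

def fit_segment(x, min_sigma=0.1):
    """Maximum-likelihood estimates for one segment (illustrative assumption:
    sigma is floored at min_sigma to avoid degenerate fits)."""
    x = np.asarray(x, dtype=float)
    nonzero = x[x != 0.0]
    # Clip the zero fraction away from 0 and 1 so the logs stay finite
    pi_zero = float(np.clip(1.0 - nonzero.size / x.size, 1e-6, 1 - 1e-6))
    mu = float(nonzero.mean()) if nonzero.size else 0.0
    sigma = max(float(nonzero.std()), min_sigma) if nonzero.size else min_sigma
    return pi_zero, mu, sigma

def best_single_split(x, min_len=5):
    """Return the index k that maximizes the summed log-likelihood of
    x[:k] and x[k:], each fitted with its own zero-inflated Gaussian."""
    x = np.asarray(x, dtype=float)
    best_pos, best_ll = None, -np.inf
    for k in range(min_len, x.size - min_len + 1):
        ll = sum(
            zero_inflated_gaussian_loglik(seg, *fit_segment(seg))
            for seg in (x[:k], x[k:])
        )
        if ll > best_ll:
            best_pos, best_ll = k, ll
    return best_pos

# Toy signal: a depleted region (all zeros) followed by an enriched region
signal = [0.0] * 30 + [4.5, 5.5] * 15
split = best_single_split(signal)
```

On this toy signal the scan places the boundary where the all-zero region ends. A Bayesian treatment would instead put priors on the segment parameters and report a posterior probability for each candidate boundary.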
Received: May 2014
Revised: August 2015
First available in Project Euclid: 25 March 2016
Chen, Mengjie; Lin, Haifan; Zhao, Hongyu. Change point analysis of histone modifications reveals epigenetic blocks linking to physical domains. Ann. Appl. Stat. 10 (2016), no. 1, 506--526. doi:10.1214/16-AOAS905. https://projecteuclid.org/euclid.aoas/1458909925
- Supplement A: modENCODEhistone. Number of enriched regions of 46 histone marks and nonhistone chromosomal proteins from the modENCODE project.
- Supplement B: BLOCKs. BLOCKs identified by BCP in S2 cells using posterior probability cutoff 0.75.
- Supplement C: EnrichedGenes. Gene lists in GO enriched BLOCKs in S2 cells.
- Supplement D: LargestVarianceBLOCKs. The 20% of BLOCKs with the largest deviations in transcription across 9 different developmental stages.
- Supplement E: SmallestVarianceBLOCKs. The 20% of BLOCKs with the smallest deviations in transcription across 9 different developmental stages.
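Supplement B reports BLOCKs obtained with a posterior probability cutoff of 0.75. As a rough illustration of how such a cutoff can turn per-position change point probabilities into blocks, here is a minimal sketch. The function name, the input layout (entry `i` is taken as the posterior probability of a boundary just before position `i`) and the strict `>` comparison are assumptions made for this example, not details taken from the paper.

```python
def blocks_from_posterior(change_prob, cutoff=0.75):
    """Return half-open (start, end) index pairs. A new block starts at every
    position whose posterior change point probability exceeds the cutoff."""
    n = len(change_prob)
    if n == 0:
        return []
    # Position 0 always opens the first block
    starts = [0] + [i for i in range(1, n) if change_prob[i] > cutoff]
    ends = starts[1:] + [n]
    return list(zip(starts, ends))

# Hypothetical posterior probabilities for six consecutive genomic bins
example = blocks_from_posterior([0.1, 0.2, 0.9, 0.1, 0.8, 0.3])
```

Raising the cutoff merges neighbouring blocks, while lowering it yields a finer partition.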