Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 4, Number 2 (2010), 871-892.
Statistical inference of transmission fidelity of DNA methylation patterns over somatic cell divisions in mammals
We develop Bayesian inference methods for a recently-emerging type of epigenetic data to study the transmission fidelity of DNA methylation patterns over cell divisions. The data consist of parent-daughter double-stranded DNA methylation patterns with each pattern coming from a single cell and represented as an unordered pair of binary strings. The data are technically difficult and time-consuming to collect, putting a premium on an efficient inference method. Our aim is to estimate rates for the maintenance and de novo methylation events that gave rise to the observed patterns, while accounting for measurement error. We model data at multiple sites jointly, thus using whole-strand information, and considerably reduce confounding between parameters. We also adopt a hierarchical structure that allows for variation in rates across sites without an explosion in the effective number of parameters. Our context-specific priors capture the expected stationarity, or near-stationarity, of the stochastic process that generated the data analyzed here. This expected stationarity is shown to greatly increase the precision of the estimation. Applying our model to a data set collected at the human FMR1 locus, we find that measurement errors, generally ignored in similar studies, occur at a nontrivial rate (inappropriate bisulfite conversion error: 1.6% with 80% CI: 0.9–2.3%). Accounting for these errors has a substantial impact on estimates of key biological parameters. The estimated average failure of maintenance rate and daughter de novo rate decline from 0.04 to 0.024 and from 0.14 to 0.07, respectively, when errors are accounted for. Our results also provide evidence that de novo events may occur on both parent and daughter strands: the median parent and daughter de novo rates are 0.08 (80% CI: 0.04–0.13) and 0.07 (80% CI: 0.04–0.11), respectively.
First available in Project Euclid: 3 August 2010
Fu, Audrey Qiuyan; Genereux, Diane P.; Stöger, Reinhard; Laird, Charles D.; Stephens, Matthew. Statistical inference of transmission fidelity of DNA methylation patterns over somatic cell divisions in mammals. Ann. Appl. Stat. 4 (2010), no. 2, 871-892. doi:10.1214/09-AOAS297. https://projecteuclid.org/euclid.aoas/1280842144
- Supplementary material A: Appendices. The pdf file contains biological background, experimental design issues, Markov chain Monte Carlo (MCMC) procedures and likelihood analyses for special cases.
- Supplementary material B: Data and MCMC code. The zip file contains the FMR1 data analyzed in this paper, the R code that implements the MCMC procedure and MCMC outputs summarized and displayed in Section 3.