The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 10, Number 2 (2016), 786-811.
A Bayesian graphical model for genome-wide association studies (GWAS)
The analysis of GWAS data has long been restricted to simple models that cannot fully capture the genetic architecture of complex human diseases. As a shift from standard approaches, we propose a general statistical framework for multi-SNP analysis of GWAS data based on a Bayesian graphical model. Our goal is to develop a general approach applicable to a wide range of genetic association problems, including GWAS and fine-mapping studies, and, more specifically, to be able to: (1) assess the joint effect of multiple SNPs, whether linked or unlinked and whether or not they interact; (2) explore the multi-SNP model space efficiently using the Mode Oriented Stochastic Search (MOSS) algorithm and determine the best models. We illustrate our new methodology with an application to the CGEM breast cancer GWAS data. Our algorithm selected several SNPs embedded in multi-locus models with high posterior probabilities. Most of the selected SNPs are biologically relevant. Interestingly, several of them have never been detected in standard single-SNP analyses. Finally, our approach has been implemented in the open source R package genMOSS.
Received: March 2013
Revised: September 2015
First available in Project Euclid: 22 July 2016
Briollais, Laurent; Dobra, Adrian; Liu, Jinnan; Friedlander, Matt; Ozcelik, Hilmi; Massam, Hélène. A Bayesian graphical model for genome-wide association studies (GWAS). Ann. Appl. Stat. 10 (2016), no. 2, 786-811. doi:10.1214/16-AOAS909. https://projecteuclid.org/euclid.aoas/1469199893
- Supplement A: Example of R code. A simple example of code for running our R package genMOSS.
- Supplement B: Complete Table 1 results. This table is similar to Table 1 but adds FDR results for each of the five simulated SNPs and for the pairwise SNP interactions.
- Supplement C: Additional simulation results. We performed additional simulations to assess the performance of MOSS in comparison with the standard Bonferroni correction. The R code used to generate the data is given in Supplement A [Briollais et al. (2016a)].
- Supplement D: Sensitivity analyses. In this section, we assess how sensitive the detection of rare and common genetic variants is to the choice of priors.
- Supplement E: Additional real data analyses. This section provides additional results from the real data analysis.