Open Access
December 2013
An empirical Bayes testing procedure for detecting variants in analysis of next generation sequencing data
Zhigen Zhao, Wei Wang, Zhi Wei
Ann. Appl. Stat. 7(4): 2229-2248 (December 2013). DOI: 10.1214/13-AOAS660

Abstract

Because of its decreasing cost and high digital resolution, next-generation sequencing (NGS) is expected to replace traditional hybridization-based microarray technology. In genetic studies, the first step in analyzing NGS data is often to identify genomic variants among the sequenced samples. Several statistical models and tests have been developed for variant calling in NGS studies. The existing approaches, however, are based on either conventional Bayesian or frequentist methods, which cannot address the multiplicity and testing efficiency issues simultaneously. In this paper, we derive an optimal empirical Bayes testing procedure to detect variants in NGS studies. We use the empirical Bayes technique to exploit information shared across the many testing sites in NGS data. We prove that our testing procedure is valid and optimal in the sense that it rejects the maximum number of nonnulls while controlling the Bayesian false discovery rate at a given nominal level. We show through both simulation studies and real data analysis that our procedure can greatly improve testing efficiency over existing frequentist approaches, which fail to pool and utilize information across the multiple testing sites.
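
The abstract does not give the form of the test statistic, so the following is only a generic illustration of what "rejecting the maximum number of nonnulls while the Bayesian false discovery rate is controlled" can look like in practice. It is a minimal sketch of a standard posterior-probability thresholding rule, not the authors' procedure. It assumes that a posterior null probability has already been estimated for every site (for example by an empirical Bayes fit pooled across sites); the function name `bayesian_fdr_reject` and the argument `post_null` are hypothetical.

```python
import numpy as np


def bayesian_fdr_reject(post_null, alpha=0.05):
    """Return a boolean mask marking the sites that are rejected.

    post_null[i] is an assumed, already estimated posterior probability
    that site i is null (no variant). The rule rejects the largest set of
    sites whose average posterior null probability is at most alpha.
    """
    post_null = np.asarray(post_null, dtype=float)
    # Rank sites from most to least likely to be a true variant.
    order = np.argsort(post_null, kind="stable")
    # Average posterior null probability among the k top-ranked sites;
    # this estimates the Bayesian FDR incurred by rejecting those k sites.
    running_mean = np.cumsum(post_null[order]) / np.arange(1, post_null.size + 1)
    # The running mean of ascending values is nondecreasing, so the
    # admissible k form a prefix and their count is the largest such k.
    k = int(np.sum(running_mean <= alpha))
    reject = np.zeros(post_null.size, dtype=bool)
    reject[order[:k]] = True
    return reject


if __name__ == "__main__":
    # Made-up posterior null probabilities for five sites, for illustration only.
    example = [0.01, 0.9, 0.02, 0.2, 0.03]
    print(bayesian_fdr_reject(example, alpha=0.05))
```

Averaging over the rejected set, rather than thresholding each site separately, is what lets such a rule reject as many sites as possible at a given nominal level; how the posterior null probabilities are estimated from the sequencing data is left open here.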

Citation

Zhigen Zhao, Wei Wang, Zhi Wei. "An empirical Bayes testing procedure for detecting variants in analysis of next generation sequencing data." Ann. Appl. Stat. 7 (4) 2229-2248, December 2013. https://doi.org/10.1214/13-AOAS660

Information

Published: December 2013
First available in Project Euclid: 23 December 2013

zbMATH: 1283.62011
MathSciNet: MR3161720
Digital Object Identifier: 10.1214/13-AOAS660

Keywords: Bayesian FDR, multiplicity control, next-generation sequencing, optimality, variant call

Rights: Copyright © 2013 Institute of Mathematical Statistics
