Abstract
In this study we present a data-driven method called false negative control (FNC) screening to address the challenge of detecting weak signals in underpowered genome-wide association studies (GWASs), where true signals are often obscured by a large amount of noise. Our approach focuses on controlling false negatives and efficiently regulates the proportion of false negatives at a user-specified level in realistic settings with arbitrary covariance dependence between variables. We calibrate overall dependence using a parameter that aligns with the existing phase diagram in high-dimensional sparse inference, allowing us to asymptotically explicate the joint effect of covariance dependence, signal sparsity, and signal intensity on the proposed method. Our new phase diagram shows that FNC screening can efficiently select a set of candidate variables to retain a high proportion of signals, even when the signals are not individually separable from noise. We compare the performance of FNC screening to several existing methods in simulation studies, and the proposed method outperforms the others in adapting to a user-specified false negative control level. Moreover, we apply FNC screening to 145 GWAS datasets, obtained from the UK Biobank, and demonstrate a substantial increase in power to retain true signals for downstream analyses.
Funding Statement
Research of Dr. Li is partially supported by NIH Grant R01HL146500.
Acknowledgments
The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their constructive comments that improved the quality of this paper. This study has been conducted using the UK Biobank Resource under Application Number 25953.
Citation
X. Jessie Jeng, Yifei Hu, Quan Sun, Yun Li. "Weak signal inclusion under dependence and applications in genome-wide association study." Ann. Appl. Stat. 18 (1) 841-857, March 2024. https://doi.org/10.1214/23-AOAS1815