Open Access
April 2018
Testing independence with high-dimensional correlated samples
Xi Chen, Weidong Liu
Ann. Statist. 46(2): 866-894 (April 2018). DOI: 10.1214/17-AOS1571

Abstract

Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging $n$ identically distributed $p$-dimensional random vectors into a $p\times n$ data matrix, we investigate the problem of testing independence among columns under the matrix-variate normal modeling of data. We propose a computationally simple and tuning-free test statistic, characterize its limiting null distribution, analyze its statistical power, and prove its minimax optimality. As an important by-product of the test statistic, a ratio-consistent estimator for the quadratic functional of a covariance matrix from correlated samples is developed. We further study the effect of correlation among samples on an important high-dimensional inference problem: large-scale multiple testing of Pearson’s correlation coefficients. Indeed, blindly using classical inference results based on the assumed independence of samples will lead to many false discoveries, which suggests the need for conducting independence testing before applying existing methods. To address the challenge arising from correlation among samples, we propose a “sandwich estimator” of Pearson’s correlation coefficient by de-correlating the samples. Based on this approach, the resulting multiple testing procedure asymptotically controls the overall false discovery rate at the nominal level while maintaining good statistical power. Both simulated and real data experiments are carried out to demonstrate the advantages of the proposed methods.
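
To make the data model in the abstract concrete, the following is a minimal sketch of drawing a $p\times n$ matrix-variate normal data matrix, once with independent columns (the null hypothesis) and once with correlated columns. It is not the authors' test statistic or estimator. The function names `sample_matrix_normal` and `ar1`, the AR(1) covariance structure, and the values of `p`, `n`, and the correlation parameters are all assumptions chosen for illustration.

```python
import numpy as np

def ar1(dim, rho):
    """Hypothetical AR(1) covariance: entry (i, j) equals rho**|i - j|."""
    idx = np.arange(dim)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sample_matrix_normal(row_cov, col_cov, rng):
    """Draw a p x n matrix X = A Z B^T with Z i.i.d. standard normal.

    row_cov (p x p) is the covariance among the p variables;
    col_cov (n x n) is the covariance among the n samples (columns).
    """
    A = np.linalg.cholesky(row_cov)  # row_cov = A A^T
    B = np.linalg.cholesky(col_cov)  # col_cov = B B^T
    Z = rng.standard_normal((row_cov.shape[0], col_cov.shape[0]))
    return A @ Z @ B.T

rng = np.random.default_rng(0)
p, n = 200, 50  # illustrative sizes only

# Null case: columns (samples) are independent.
X_indep = sample_matrix_normal(ar1(p, 0.5), np.eye(n), rng)
# Alternative: columns are correlated with AR(1) parameter 0.6.
X_corr = sample_matrix_normal(ar1(p, 0.5), ar1(n, 0.6), rng)

def mean_adjacent_column_corr(X):
    """Average empirical correlation between neighbouring columns."""
    C = np.corrcoef(X, rowvar=False)  # n x n correlation among columns
    return float(np.mean(np.diag(C, k=1)))

r_indep = mean_adjacent_column_corr(X_indep)
r_corr = mean_adjacent_column_corr(X_corr)
print(f"independent columns: {r_indep:.3f}, correlated columns: {r_corr:.3f}")
```

Under these assumptions, the average correlation between neighbouring columns is close to zero in the first case and clearly positive in the second, which is the kind of dependence among samples that the paper's test is designed to detect.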

Citation

Xi Chen. Weidong Liu. "Testing independence with high-dimensional correlated samples." Ann. Statist. 46 (2) 866 - 894, April 2018. https://doi.org/10.1214/17-AOS1571

Information

Received: 1 November 2015; Revised: 1 March 2017; Published: April 2018
First available in Project Euclid: 3 April 2018

zbMATH: 06870282
MathSciNet: MR3782387
Digital Object Identifier: 10.1214/17-AOS1571

Subjects:
Primary: 62F05
Secondary: 62H10

Keywords: false discovery rate, high-dimensional sample correlation matrix, independence test, matrix-variate normal, multiple testing of correlations, quadratic functional estimation

Rights: Copyright © 2018 Institute of Mathematical Statistics
