The Annals of Statistics
- Ann. Statist.
- Volume 45, Number 2 (2017), 557-590.
A rate optimal procedure for recovering sparse differences between high-dimensional means under dependence
This paper considers the problem of recovering the sparse components that differ between two high-dimensional means of column-wise dependent random vectors. We show that dependence can be utilized to lower the identification boundary for signal recovery. Moreover, an optimal convergence rate for the marginal false nondiscovery rate (mFNR) is established under dependence; this rate is faster than the optimal rate without dependence. To recover the sparse signal-bearing dimensions, we propose a Dependence-Assisted Thresholding and Excising (DATE) procedure, which is shown to be rate optimal for the mFNR with the marginal false discovery rate (mFDR) controlled at a pre-specified level. Extensions of the DATE to recover the differences in contrasts among multiple population means and the differences between two covariance matrices are also provided. Simulation studies and a case study are given to demonstrate the performance of the proposed signal identification procedure.
Received: November 2015
Revised: February 2016
First available in Project Euclid: 16 May 2017
Li, Jun; Zhong, Ping-Shou. A rate optimal procedure for recovering sparse differences between high-dimensional means under dependence. Ann. Statist. 45 (2017), no. 2, 557-590. doi:10.1214/16-AOS1459. https://projecteuclid.org/euclid.aos/1494921950
- Supplementary material for “A rate optimal procedure for recovering sparse differences between high-dimensional means under dependence”. The supplementary material provides the proofs of Lemmas 1–4 and Theorems 1–5.