Abstract
We introduce a new method for two-sample testing of high-dimensional linear regression coefficients without assuming that those coefficients are individually estimable. The procedure works by first projecting the matrices of covariates and response vectors along directions that are complementary in sign in a subset of the coordinates, a process which we call “complementary sketching.” The resulting projected covariates and responses are aggregated to form two test statistics, which are shown to have essentially optimal asymptotic power under a Gaussian design when the difference between the two regression coefficients is sparse and dense, respectively. Simulations confirm that our methods perform well in a broad class of settings, and an application to a large single-cell RNA sequencing dataset demonstrates their utility in the real world.
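To make the projection step concrete, the following is a minimal NumPy sketch of one way to obtain directions that are complementary in sign. It assumes the projection matrix is an orthonormal basis of the orthogonal complement of the column space of the stacked design, which requires n1 + n2 > p. The abstract does not state this construction, so it should be read as an illustrative assumption. The names `complementary_sketch`, `A1`, `A2`, `W` and `Z` are hypothetical, and the aggregation into the two test statistics is not shown.

```python
import numpy as np


def complementary_sketch(X1, Y1, X2, Y2):
    """Project two regression samples along sign-complementary directions.

    Assumption (not taken from the abstract): the directions are an
    orthonormal basis of the orthogonal complement of the column space
    of the stacked design matrix.
    """
    n1, p = X1.shape
    X = np.vstack([X1, X2])               # stacked (n1 + n2) x p design
    n = X.shape[0]
    if n <= p:
        raise ValueError("this sketch assumes n1 + n2 > p")
    # Complete QR: the last n - p columns of Q are orthogonal to col(X).
    Q, _ = np.linalg.qr(X, mode="complete")
    A = Q[:, p:]                          # n x (n - p), with A.T @ X == 0
    A1, A2 = A[:n1], A[n1:]               # split rows by sample
    W = A1.T @ X1                         # equals -(A2.T @ X2)
    Z = A1.T @ Y1 + A2.T @ Y2             # projected responses
    return A1, A2, W, Z


# Noiseless toy example: n1, n2 < p, so neither coefficient vector
# is individually estimable, yet their difference drives Z.
rng = np.random.default_rng(0)
n1, n2, p = 30, 30, 40
X1 = rng.standard_normal((n1, p))
X2 = rng.standard_normal((n2, p))
beta1 = rng.standard_normal(p)
beta2 = beta1.copy()
beta2[:3] += 2.0                          # sparse difference in 3 coordinates
Y1 = X1 @ beta1
Y2 = X2 @ beta2

A1, A2, W, Z = complementary_sketch(X1, Y1, X2, Y2)
print(np.allclose(A1.T @ X1, -(A2.T @ X2)))   # sign-complementary projections
print(np.allclose(Z, W @ (beta1 - beta2)))    # Z depends only on the difference
```

Under this assumed construction, the projected data follow a single linear model in the difference of the two coefficient vectors, so the two-sample problem reduces to testing whether that difference is zero.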
Funding Statement
This work was primarily supported by the Royal Society grant IEC/NSFC/170119. In addition, the research of FG was supported by NNSFC grants 11701095 and 11690013 and that of TW was supported by EPSRC grant EP/T02772X/1.
Acknowledgements
We would like to thank Chenqu Suo for suggesting the dataset and offering her valuable insights in the real data analysis. We thank the anonymous reviewers for helpful and constructive comments on an earlier draft.
Citation
Fengnan Gao, Tengyao Wang. "Two-sample testing of high-dimensional linear regression coefficients via complementary sketching." Ann. Statist. 50 (5) 2950-2972, October 2022. https://doi.org/10.1214/22-AOS2216