Abstract
Motivated by emerging applications in ecology, microbiology, and neuroscience, this paper studies high-dimensional regression with two-way structured data. To estimate the high-dimensional coefficient vector, we propose generalized matrix decomposition regression (GMDR), which efficiently leverages auxiliary information on row and column structures. GMDR extends principal component regression (PCR) to two-way structured data, but unlike PCR, GMDR selects the components that are most predictive of the outcome, leading to more accurate prediction. For inference on regression coefficients of individual variables, we propose generalized matrix decomposition inference (GMDI), a general high-dimensional inferential framework for a large family of estimators that includes the proposed GMDR estimator. GMDI provides more flexibility for incorporating relevant auxiliary row and column structures. As a result, GMDI does not require the true regression coefficients to be sparse but constrains the coordinate system representing the regression coefficients according to the column structure. GMDI also allows dependent and heteroscedastic observations. We study the theoretical properties of GMDI in terms of both the type-I error rate and power, and demonstrate the effectiveness of GMDR and GMDI in simulation studies and an application to human microbiome data.
Funding Statement
The second author was supported by National Institutes of Health Grant R01GM133848.
The third author was supported by National Institutes of Health Grants R01GM129512, R01HL1554178, and P50CA228944.
The fifth author was supported by National Institutes of Health Grant R01GM145772.
Acknowledgments
The authors would like to thank the anonymous referees, an Associate Editor, and the Editor for their constructive comments that improved the quality of this paper.
Citation
Yue Wang, Ali Shojaie, Timothy Randolph, Parker Knight, Jing Ma. "Generalized matrix decomposition regression: Estimation and inference for two-way structured data." Ann. Appl. Stat. 17 (4) 2944-2969, December 2023. https://doi.org/10.1214/23-AOAS1746