Open Access
April 2017
Interaction pursuit in high-dimensional multi-response regression via distance correlation
Yinfei Kong, Daoji Li, Yingying Fan, Jinchi Lv
Ann. Statist. 45(2): 897-922 (April 2017). DOI: 10.1214/16-AOS1474


Feature interactions can contribute to a large proportion of variation in many prediction models. In the era of big data, the coexistence of high dimensionality in both responses and covariates poses unprecedented challenges in identifying important interactions. In this paper, we suggest a two-stage interaction identification method for high-dimensional multi-response interaction models, called interaction pursuit via distance correlation (IPDC). The method first screens features by applying distance correlation to transformed variables and then performs feature selection. Such a procedure is computationally efficient, generally applicable beyond the heredity assumption, and effective even when the number of responses diverges with the sample size. Under mild regularity conditions, we show that this method enjoys desirable theoretical properties, including the sure screening property, support union recovery, and oracle inequalities in prediction and estimation for both interactions and main effects. The advantages of our method are supported by several simulation studies and a real data analysis.
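To make the screening stage more concrete, here is a minimal Python sketch. It computes the standard sample distance correlation and ranks covariates by the distance correlation between their squares and the squared response vector. Pairing squared covariates with squared responses is an assumption suggested by the "square transformation" keyword; the function names and the `keep` cutoff are hypothetical, and this is not the authors' implementation. The second-stage feature selection is omitted.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered matrix of pairwise Euclidean distances."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a vector as n observations of one variable
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between x (n x p) and y (n x q)."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = (A * B).mean()          # squared sample distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0                  # one of the inputs is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def screen_by_squared_dcor(X, Y, keep):
    """Rank covariates by dcor(X_k^2, Y^2) and return the top `keep`.

    Assumption: squaring both sides is one reading of the paper's
    "square transformation"; the original procedure may differ.
    """
    Y2 = np.asarray(Y, dtype=float) ** 2
    scores = np.array([
        distance_correlation(X[:, k] ** 2, Y2) for k in range(X.shape[1])
    ])
    top = np.argsort(scores)[::-1][:keep]
    return top, scores


if __name__ == "__main__":
    # Toy data: two responses driven by the interaction X_0 * X_1.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 8))
    signal = X[:, 0] * X[:, 1]
    Y = np.column_stack([
        signal + 0.1 * rng.standard_normal(200),
        -signal + 0.1 * rng.standard_normal(200),
    ])
    top, scores = screen_by_squared_dcor(X, Y, keep=3)
    print("retained covariates:", top)
```

Working with squared variables lets a covariate that enters only through an interaction still show dependence with the response, which is why a screen of this kind does not need to rely on the heredity assumption.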



Yinfei Kong, Daoji Li, Yingying Fan, Jinchi Lv. "Interaction pursuit in high-dimensional multi-response regression via distance correlation." Ann. Statist. 45(2): 897-922, April 2017.


Received: 1 December 2015; Published: April 2017
First available in Project Euclid: 16 May 2017

zbMATH: 1368.62140
MathSciNet: MR3650404
Digital Object Identifier: 10.1214/16-AOS1474

Primary: 62H12, 62J02
Secondary: 62F07, 62F12

Keywords: distance correlation, high dimensionality, interaction pursuit, multi-response regression, sparsity, square transformation

Rights: Copyright © 2017 Institute of Mathematical Statistics
