The Annals of Statistics
- Ann. Statist.
- Volume 36, Number 4 (2008), 1649-1668.
Dimension reduction based on constrained canonical correlation and variable filtering
The “curse of dimensionality” has remained a challenge for high-dimensional data analysis in statistics. The sliced inverse regression (SIR) and canonical correlation (CANCOR) methods aim to reduce the dimensionality of data by replacing the explanatory variables with a small number of composite directions without losing much information. However, the estimated composite directions generally involve all of the variables, making their interpretation difficult. To simplify the direction estimates, Ni, Cook and Tsai [Biometrika 92 (2005) 242–247] proposed the shrinkage sliced inverse regression (SSIR) based on SIR. In this paper, we propose the constrained canonical correlation (C3) method based on CANCOR, followed by a simple variable filtering method. As a result, each composite direction consists of a subset of the variables for interpretability as well as predictive power. The proposed method aims to identify simple structures without sacrificing the desirable properties of the unconstrained CANCOR estimates. The simulation studies demonstrate the performance advantage of the proposed C3 method over the SSIR method. We also use the proposed method in two examples for illustration.
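For readers unfamiliar with the background methods, the following is a minimal sketch of plain, unconstrained sliced inverse regression as it is commonly described. It is not the C3 method or the SSIR method of the paper. The function name `sir_directions` and the parameters `n_slices` and `n_directions` are illustrative assumptions, not taken from the article.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Illustrative sketch of basic SIR; names and defaults are assumptions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the observations by the sorted response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)  # unit-length columns
```

Each column of the returned matrix is one estimated composite direction. In general every entry of such a column is nonzero, which is the interpretability problem that motivates shrinkage or constrained estimates.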
First available in Project Euclid: 16 July 2008
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62H20: Measures of association (correlation, canonical correlation, etc.)
Zhou, Jianhui; He, Xuming. Dimension reduction based on constrained canonical correlation and variable filtering. Ann. Statist. 36 (2008), no. 4, 1649-1668. doi:10.1214/07-AOS529. https://projecteuclid.org/euclid.aos/1216237295