## The Annals of Statistics

### Canonical Variables as Optimal Predictors

#### Abstract

Let $\mathbf{X} = (X_1, \cdots, X_m)'$ and $\mathbf{Y} = (Y_1, \cdots, Y_n)'$ be two random vectors. Given any random vector $\mathbf{Z}$, let $\mathbf{Y}^\ast_Z$ be the best linear predictor of $\mathbf{Y}$ based on $\mathbf{Z}$. Let $p$ be any natural number smaller than $m$. We consider the problem of finding the $p$-dimensional random vector $\mathbf{Z} = (Z_1, \cdots, Z_p)'$, where each component $Z_i$ is a linear function of $\mathbf{X}$, which minimizes the determinant of $E(\mathbf{Y} - \mathbf{Y}^\ast_Z)(\mathbf{Y} - \mathbf{Y}^\ast_Z)'$. We show that $Z_1, \cdots, Z_p$ coincide with the first $p$ canonical variables (except for a nonsingular linear transformation). We also show that the square of the $(p + 1)$th canonical correlation coefficient measures the relative improvement in the prediction of $\mathbf{Y}$ when $p + 1$ $Z_i$'s are used instead of $p$.
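The two claims in the abstract can be checked numerically. The following is a minimal sketch, not the paper's own procedure: it assumes a known joint covariance matrix (here a synthetic positive definite one), obtains canonical weights from a singular value decomposition of the whitened cross-covariance, and reads "relative improvement" as the proportional drop in the determinant of the prediction-error covariance. The function names `inv_sqrt`, `canonical_weights`, `error_det`, and `demo` are hypothetical and chosen for illustration only.

```python
import numpy as np


def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T


def canonical_weights(Sxx, Sxy, Syy):
    """Return (A, rho): column i of A gives the i-th canonical variable A[:, i]' X,
    and rho holds the canonical correlations in decreasing order."""
    Kx = inv_sqrt(Sxx)
    Ky = inv_sqrt(Syy)
    U, rho, _ = np.linalg.svd(Kx @ Sxy @ Ky)
    return Kx @ U, rho


def error_det(A, Sxx, Sxy, Syy):
    """Determinant of E(Y - Y*_Z)(Y - Y*_Z)' for Z = A' X."""
    Szz = A.T @ Sxx @ A
    Szy = A.T @ Sxy
    return np.linalg.det(Syy - Szy.T @ np.linalg.solve(Szz, Szy))


def demo(m=4, n=3, p=2, seed=0, trials=200):
    rng = np.random.default_rng(seed)
    # Assumption: a synthetic positive definite joint covariance of (X, Y).
    G = rng.standard_normal((m + n, m + n))
    S = G @ G.T + (m + n) * np.eye(m + n)
    Sxx, Sxy, Syy = S[:m, :m], S[:m, m:], S[m:, m:]

    A, rho = canonical_weights(Sxx, Sxy, Syy)
    d_p = error_det(A[:, :p], Sxx, Sxy, Syy)
    d_p1 = error_det(A[:, :p + 1], Sxx, Sxy, Syy)

    # Proportional drop in the determinant when one more Z_i is added.
    rel_improvement = (d_p - d_p1) / d_p

    # A nonsingular linear transformation of Z leaves the criterion unchanged.
    T = rng.standard_normal((p, p)) + p * np.eye(p)
    d_transformed = error_det(A[:, :p] @ T, Sxx, Sxy, Syy)

    # Randomly chosen p-dimensional linear functions of X, for comparison.
    best_random = min(
        error_det(rng.standard_normal((m, p)), Sxx, Sxy, Syy)
        for _ in range(trials)
    )
    return rho, d_p, rel_improvement, d_transformed, best_random


if __name__ == "__main__":
    rho, d_p, rel, d_t, best_random = demo()
    print("rho_{p+1}^2          :", rho[2] ** 2)
    print("relative improvement :", rel)
    print("det, canonical Z     :", d_p)
    print("det, transformed Z   :", d_t)
    print("det, best random Z   :", best_random)
```

Under these assumptions the determinant for the first $p$ canonical variables equals $\det(\Sigma_{YY}) \prod_{i \le p} (1 - \rho_i^2)$, so the proportional drop from $p$ to $p + 1$ variables is $\rho_{p+1}^2$, and no randomly drawn competitor should fall below the canonical value.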

#### Article information

Source
Ann. Statist., Volume 8, Number 4 (1980), 865-869.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176345079

Digital Object Identifier
doi:10.1214/aos/1176345079

Mathematical Reviews number (MathSciNet)
MR572630

Zentralblatt MATH identifier
0463.62054
