## The Annals of Mathematical Statistics

### A Note on Wilks' Internal Scatter

H. Robert van der Vaart

#### Abstract

Let $Y_1$ and $Y_2$ be two real valued, independently and identically distributed random variables with variance $\sigma^2$. Then $\sigma^2 = \varepsilon(Y_1 - \varepsilon Y_1)^2 = 2^{-1} \cdot \varepsilon(Y_1 - Y_2)^2$. Note that \begin{equation*}\tag{0.1}(Y_1 - Y_2)^2 = \begin{vmatrix}1 & 1 \\ Y_1 & Y_2\end{vmatrix}^2\end{equation*} is the square of the length of the interval $\lbrack Y_1, Y_2\rbrack$. Given a sample of size $n$, a well known unbiased estimator for $\sigma^2$ is given by \begin{equation*}\tag{0.2}\lbrack n(n - 1)\rbrack^{-1} \sum_{1\leqq i_1<i_2\leqq n} (Y_{i_1} - Y_{i_2})^2 = (n - 1)^{-1} \sum^n_{i=1}(Y_i - \bar Y)^2.\end{equation*} The present note will discuss a $k$-dimensional generalization of this situation. Throughout our discussion we will assume that the following condition is satisfied.

Condition $\mathscr{I}$. $X_1, X_2, \cdots, X_n$ are $n\ (> k)$ independently and identically distributed, $k$-vector valued random variables with (unknown) expected vector $\mu$ and covariance matrix $\Sigma$.

One natural generalization of the above parameter $\sigma^2$ then is (the $X_i$ in the following formulae being understood as one-column matrices with $k$ components) the parameter $\theta$ defined as follows: \begin{equation*}\tag{0.3}\theta = \lbrack(k + 1)!\rbrack^{-1}\varepsilon \left(\begin{vmatrix}1 & 1 & \cdots & 1 & 1 \\ X_1 & X_2 & \cdots & X_k & X_{k+1}\end{vmatrix}^2\right).\end{equation*} Theorem 1 will show that $\theta = \det \Sigma$. Note that the absolute value of the determinant in the second member of (0.3) is known to be $k!$ times the $k$-dimensional content of the simplex (for a definition see, e.g., [4], p. 10) which has the $(k + 1)$ points (i.e., the $(k + 1)$ $k$-tuples) $X_1, X_2, \cdots, X_{k+1}$ for its vertices (an enlightening discussion of determinants as connected with volumes can be found in [3], pp. 152-162). This furnishes a geometric interpretation of the parameter $\theta = \det \Sigma$.

By an argument well known from the theory of $U$-statistics and using the equality $\binom{n}{k+1} \cdot (k + 1)! = n(n - 1) \cdots (n - k) \equiv n_{k+1}$, say, one finds (see Corollary 1.1) that an unbiased estimator for $\theta = \det \Sigma$ is given by $\hat{\theta}$, where $\hat{\theta}$ is defined by \begin{equation*}\tag{0.4}\hat{\theta} = n^{-1}_{k+1} \cdot \sum \begin{vmatrix}1 & 1 & \cdots & 1 \\ X_{i_1} & X_{i_2} & \cdots & X_{i_{k+1}}\end{vmatrix}^2,\end{equation*} where summation is over all $(i_1, i_2, \cdots, i_{k+1})$ with $1 \leqq i_1 < i_2 < \cdots < i_{k+1} \leqq n$. Theorem 2 will point out a simple relation between $\hat{\theta}$ and Wilks' internal scatter $S_{k,\bar{X},n} = \det U$ (cf. Wilks [5], equations (4.8), (4.2) and (4.3); Wilks [6], equation (18.1.23); and Section 2 below): \begin{equation*}\tag{0.5}(n - 1)(n - 2) \cdots (n - k) \hat{\theta} = S_{k,\bar{X},n} = \det U.\end{equation*} This settles a question left open by Wilks [5], p. 493, namely: under which conditions is it true that \begin{equation*}\tag{0.6}\varepsilon S_{k,\bar{X},n} = (n - 1)(n - 2) \cdots (n - k) \det \Sigma?\end{equation*} Corollary 2.1 shows that equality (0.6) is true as soon as the trivial Condition $\mathscr{I}$ is satisfied. This clearly constitutes a much stronger result than Wilks' preliminary statement. In fact, equation (0.6) is a full generalization of the equality $\varepsilon\sum_i (Y_i - \bar{Y})^2 = (n - 1)\sigma^2$, and valid under conditions of the same scope.
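The relation (0.5) is an algebraic identity in the sample values, so it can be checked exactly on a small data set. The following is a minimal sketch rather than anything taken from the paper: it assumes that $U$ is the matrix of sums of squares and products about the sample mean, $U = \sum_i (X_i - \bar{X})(X_i - \bar{X})'$, which agrees with the case $k = 1$ quoted above. The function names `det`, `theta_hat`, `internal_scatter` and `check_identity` are illustrative only. Exact rational arithmetic is used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction
from itertools import combinations
import random


def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero pivot in this column.
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def theta_hat(points):
    """Estimator (0.4): scaled sum of squared simplex determinants."""
    n, k = len(points), len(points[0])
    falling = 1
    for j in range(k + 1):
        falling *= n - j  # n(n-1)...(n-k)
    total = Fraction(0)
    for subset in combinations(points, k + 1):
        # One row (1, x_1, ..., x_k) per point; this is the transpose of the
        # matrix in (0.4), and transposition leaves the determinant unchanged.
        rows = [[1, *p] for p in subset]
        total += det(rows) ** 2
    return total / falling


def internal_scatter(points):
    """det U, with U ASSUMED to be the sum of (X_i - mean)(X_i - mean)'."""
    n, k = len(points), len(points[0])
    mean = [Fraction(sum(p[a] for p in points), n) for a in range(k)]
    u = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points)
          for b in range(k)] for a in range(k)]
    return det(u)


def check_identity(points):
    """Return the left and right members of (0.5) for the given sample."""
    n, k = len(points), len(points[0])
    factor = 1
    for j in range(1, k + 1):
        factor *= n - j  # (n-1)(n-2)...(n-k)
    return factor * theta_hat(points), internal_scatter(points)


# Illustrative data: six integer-valued points in the plane (k = 2, n = 6).
rng = random.Random(0)
sample = [[rng.randint(-5, 5) for _ in range(2)] for _ in range(6)]
lhs, rhs = check_identity(sample)
print(lhs == rhs)
```

The sum in `theta_hat` runs over all $\binom{n}{k+1}$ subsets, so this direct evaluation is only practical for small $n$ and $k$; under the stated assumption on $U$, the right member of (0.5) gives the same value at far lower cost.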

#### Article information

Source
Ann. Math. Statist., Volume 36, Number 4 (1965), 1308-1312.

Dates
First available in Project Euclid: 27 April 2007

https://projecteuclid.org/euclid.aoms/1177700006

Digital Object Identifier
doi:10.1214/aoms/1177700006

Mathematical Reviews number (MathSciNet)
MR178533

Zentralblatt MATH identifier
0156.39601
