The Annals of Mathematical Statistics

A Class of Statistics with Asymptotically Normal Distribution

Wassily Hoeffding

Abstract

Let $X_1, \cdots, X_n$ be $n$ independent random vectors, $X_\nu = (X^{(1)}_\nu, \cdots, X^{(r)}_\nu)$, and $\Phi(x_1, \cdots, x_m)$ a function of $m (\leq n)$ vectors $x_\nu = (x^{(1)}_\nu, \cdots, x^{(r)}_\nu)$. A statistic of the form $U = \sum''\Phi(X_{\alpha_1}, \cdots, X_{\alpha_m})/[n(n - 1) \cdots (n - m + 1)]$, where the sum $\sum''$ is extended over all permutations $(\alpha_1, \cdots, \alpha_m)$ of $m$ different integers, $1 \leq \alpha_i \leq n$, is called a $U$-statistic. If $X_1, \cdots, X_n$ have the same (cumulative) distribution function (d.f.) $F(x)$, $U$ is an unbiased estimate of the population characteristic $\theta(F) = \int \cdots \int\Phi(x_1, \cdots, x_m) dF(x_1) \cdots dF(x_m)$. $\theta(F)$ is called a regular functional of the d.f. $F(x)$. Certain optimal properties of $U$-statistics as unbiased estimates of regular functionals have been established by Halmos (cf. Section 4). The variance of a $U$-statistic as a function of the sample size $n$ and of certain population characteristics is studied in Section 5. It is shown that if $X_1, \cdots, X_n$ have the same distribution and $\Phi(x_1, \cdots, x_m)$ is independent of $n$, the d.f. of $\sqrt n(U - \theta)$ tends to a normal d.f. as $n \rightarrow \infty$ under the sole condition of the existence of $E\Phi^2(X_1, \cdots, X_m)$. Similar results hold for the joint distribution of several $U$-statistics (Theorems 7.1 and 7.2), for statistics $U'$ which, in a certain sense, are asymptotically equivalent to $U$ (Theorems 7.3 and 7.4), for certain functions of statistics $U$ or $U'$ (Theorem 7.5) and, under certain additional assumptions, for the case of the $X_\nu$'s having different distributions (Theorems 8.1 and 8.2). Results of a similar character, though under different assumptions, are contained in a recent paper by von Mises (cf. Section 7).
Examples of statistics of the form $U$ or $U'$ are the moments, Fisher's $k$-statistics, Gini's mean difference, and several rank correlation statistics such as Spearman's rank correlation and the difference sign correlation (cf. Section 9). Asymptotic power functions for the non-parametric tests of independence based on these rank statistics are obtained. They show that these tests are not unbiased in the limit (Section 9f). The asymptotic distribution of the coefficient of partial difference sign correlation which has been suggested by Kendall also is obtained (Section 9h).
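The definition of $U$ can be illustrated numerically. The following Python sketch is not taken from the paper: it assumes scalar observations ($r = 1$), and the names `u_statistic`, `mean_kernel`, `variance_kernel` and `gini_kernel` are hypothetical, chosen only for illustration. It evaluates the sum $\sum''$ by brute force over all ordered selections of $m$ different indices and divides by $n(n - 1) \cdots (n - m + 1)$. The cost grows like $n^m$, so it is suitable only for small samples.

```python
from itertools import permutations
from math import perm


def u_statistic(sample, kernel, m):
    """Average kernel(X_a1, ..., X_am) over all ordered m-tuples of distinct indices."""
    n = len(sample)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    total = sum(
        kernel(*(sample[i] for i in idx))
        for idx in permutations(range(n), m)
    )
    # perm(n, m) = n (n - 1) ... (n - m + 1), the number of terms in the sum
    return total / perm(n, m)


def mean_kernel(x):
    # m = 1: U is the sample mean
    return x


def variance_kernel(x, y):
    # m = 2: U is the unbiased sample variance
    return (x - y) ** 2 / 2


def gini_kernel(x, y):
    # m = 2: U is Gini's mean difference
    return abs(x - y)
```

For instance, `u_statistic([1.0, 2.0, 4.0, 7.0], variance_kernel, 2)` agrees with the usual unbiased sample variance of those four values, and the same call with `gini_kernel` gives the average absolute difference over all pairs.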

Article information

Source
Ann. Math. Statist., Volume 19, Number 3 (1948), 293-325.

Dates
First available in Project Euclid: 28 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aoms/1177730196

Digital Object Identifier
doi:10.1214/aoms/1177730196

Mathematical Reviews number (MathSciNet)
MR26294

Zentralblatt MATH identifier
0032.04101
