Abstract
In many practical problems the experimenter is faced with the task of deciding whether the variability within several classes is uniform throughout the classes or, if not, which class exhibits the greatest amount of variability. This type of problem arises when the data relate to several processes, to the same process at different times, to several different products, or to the same product from different sources. If the variability is not uniform throughout the classes, misleading results may be obtained when the classes are compared in other respects. If the experimenter expects the variability to be uniform throughout the different classes, and the variability in a particular class is large, he will consider the situation to be "out of control" and take measures to locate the source of the large variability.

The problem we will consider here is that of comparing the variances of $k$ populations, $\Pi_1, \Pi_2, \cdots, \Pi_k,$ on the basis of $n$ observations $x_{i1}, x_{i2}, \cdots, x_{in}$ from the $i$th population. We will assume that these observations are normally and independently distributed with unknown mean $m_i$ and unknown standard deviation $\sigma_i$ for $i = 1, 2, \cdots, k.$ Our problem is to find a statistical procedure which will, on the basis of these observations, decide whether all the populations have equal variances and, if not, which has the largest variance. We would like the procedure to be in some sense "optimum." We will say that our procedure is optimum if, subject to certain restrictions, it maximizes the probability of making the correct decision. A similar problem dealing with the means of several normal distributions has been studied by Paulson [1].

Let $D_0$ be the decision that all $k$ variances are equal, and let $D_j$ be the decision that $D_0$ is false and $\sigma^2_j = \max (\sigma^2_1, \cdots, \sigma^2_k)$ for $j = 1, 2, \cdots, k.$ Our problem now is to find a statistical procedure for selecting one of these $k + 1$ decisions. Let $x_{i\alpha}$ denote the $\alpha$th observation from the $i$th population, and let $\bar{x}_i = \sum^n_{\alpha=1} x_{i\alpha}/n.$ Let $s^2_i = \sum^n_{\alpha=1}(x_{i\alpha} - \bar{x}_i)^2/(n - 1)$ denote the unbiased estimate of the variance of the $i$th population. We will say that $\Pi_i$ has "slipped to the right" if the remaining $k - 1$ populations have a common variance $\sigma^2$ and $\sigma^2_i = \lambda^2\sigma^2$ where $\lambda^2 > 1.$

In our first formulation of the problem we want to find a statistical procedure which will select one of the $k + 1$ decisions $D_0, D_1, \cdots, D_k$ so that (a) when all the variances are equal, $D_0$ is selected with probability $1 - \alpha,$ where $\alpha$ is a small positive number fixed prior to the experiment. Since the class of possible decision procedures seems too large to admit an optimum solution, we will impose the following restrictions, which seem reasonable: (b) the procedure should be symmetric, that is, the probability of selecting $D_i$ when $\Pi_i$ has slipped to the right should be the same for all $i$; (c) the procedure should be invariant if all the observations are multiplied by the same positive constant; and (d) the procedure should be invariant if some constant $b_i$ is added to all the observations in the $i$th population. We will now reformulate the problem as follows.
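As a minimal computational sketch (not part of the original abstract), assuming the data are arranged as a $k \times n$ array with one row per population, the unbiased variance estimates $s^2_i$ defined above can be computed as follows; the function name `sample_variances` is ours, chosen for illustration.

```python
import numpy as np

def sample_variances(x):
    """Unbiased sample variances s_i^2 for a k-by-n array of observations.

    x[i, a] holds the a-th observation from population Pi_i.  With ddof=1,
    numpy computes sum_a (x_ia - xbar_i)^2 / (n - 1), matching s_i^2 above.
    """
    return x.var(axis=1, ddof=1)
```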
We want a statistical procedure for selecting one of the $k + 1$ decisions $D_0, D_1, \cdots, D_k$ which, subject to conditions (a), (b), (c), and (d), will maximize the probability of making the correct decision when one of the populations has slipped to the right. We shall prove that the optimum solution is the following: \begin{equation*}\begin{split} &\text{if } s^2_M\Big/\sum^k_{i=1} s^2_i > L_\alpha, \text{ select } D_M; \\ &\text{if } s^2_M\Big/\sum^k_{i=1} s^2_i \leqq L_\alpha, \text{ select } D_0,\end{split} \end{equation*} where $M$ denotes the population yielding the largest sample variance and $L_\alpha$ is a constant whose value is determined by restriction (a). This statistic has been suggested, on intuitive grounds, by Cochran [2], and a good tabulation of $L_\alpha$ for several values of $\alpha,$ $n,$ and $k$ is available [3].
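The decision rule can be sketched directly from the statistic above; the code below is an illustrative implementation, not the paper's. The helper `critical_value` is our own stand-in for the tabulated values of $L_\alpha$ in [3]: it approximates the critical value by Monte Carlo simulation, which is legitimate because under the null hypothesis the distribution of $s^2_M/\sum_i s^2_i$ does not depend on the common variance.

```python
import numpy as np

def slippage_test(x, L_alpha):
    """Cochran-type slippage test for the variances of k normal samples.

    x is a k-by-n array of observations; returns 0 for decision D_0 (all
    variances equal) or the 1-based index M for decision D_M (population M
    has the largest variance).
    """
    s2 = x.var(axis=1, ddof=1)       # unbiased sample variances s_i^2
    M = int(np.argmax(s2))           # population with the largest s_i^2
    if s2[M] / s2.sum() > L_alpha:
        return M + 1                 # select D_M
    return 0                         # select D_0

def critical_value(k, n, alpha, n_sim=100_000, seed=0):
    """Monte Carlo approximation of L_alpha (stand-in for the tables in [3]).

    Simulate k samples of size n from N(0, 1), compute max_i s_i^2 / sum_i s_i^2
    for each replication, and take the upper alpha quantile of its null
    distribution, so that D_0 is selected with probability about 1 - alpha
    when all variances are equal.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_sim, k, n))
    s2 = x.var(axis=2, ddof=1)
    stat = s2.max(axis=1) / s2.sum(axis=1)
    return float(np.quantile(stat, 1.0 - alpha))
```

For example, with $k = 5$ samples of size $n = 10$ and $\alpha = 0.05,$ `slippage_test(x, critical_value(5, 10, 0.05))` returns $0$ (decision $D_0$) unless one sample variance is conspicuously large relative to the total.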
Citation
Donald R. Truax. "An Optimum Slippage Test for the Variances of $k$ Normal Distributions." Ann. Math. Statist. 24 (4): 669-674, December 1953. https://doi.org/10.1214/aoms/1177728923