A Class of Multisample Distribution-free Tests

Jayant V. Deshpande

doi:10.1214/aoms/1177697204

Ann. Math. Statist. 41(1): 227-236 (February, 1970). DOI: 10.1214/aoms/1177697204

Abstract

Let $x_{i1}, x_{i2}, \cdots, x_{in_i}$ be a random sample of real observations from the $i$th population with cumulative distribution function (cdf) $F_i(x), i = 1,2, \cdots, c$. Let the $c$ samples be independent and the $F$'s continuous. In this paper we consider tests of the null hypothesis $H_0:F_1(x) = F_2(x) = \cdots = F_c(x) = F(x), \text{say}$. The statistics and tests proposed in this paper are based on $c$-plets of observations, formed by selecting one observation from each of the $c$ samples. The total number of distinct $c$-plets that can be formed in this way is $\prod^c_{i=1}n_i$. Within each $c$-plet we compare and rank the observations. Let $v_{ij}$ be the number of $c$-plets in which the observation selected from the $i$th sample is larger than exactly $(j - 1)$ observations and smaller than the other $(c - j)$ observations. Since the distributions are assumed to be continuous, ties occur with probability zero. Define $u_{ij} = v_{ij}/\prod^c_{k=1}n_k$; it is the proportion of $c$-plets that give rank $j$ to the observation from the $i$th sample. Let $N = \sum^c_{i=1}n_i, p_i = n_i/N, L_i = \sum^c_{j=1} a_ju_{ij}$, where the $a$'s are real constants, not all equal, and let \begin{equation*} \tag{1.1} A = \sum^c_{j=1} \sum^c_{l=1} a_ja_l \left\{\frac{\binom{c-1}{j-1}\binom{c-1}{l-1}}{(2c - 1)\binom{2c-2}{j+l-2}} - \frac{1}{c^2}\right\}.\end{equation*} Then we define a class of statistics $\mathscr{L}$ as \begin{equation*} \tag{1.2} \mathscr{L} = \frac{N(c - 1)^2}{Ac^2} \left\lbrack \sum^c_{i=1} p_iL_i^2 - \left(\sum^c_{i=1} p_i L_i\right)^2 \right\rbrack.\end{equation*} A particular member of the class is obtained by specifying the real constants $a_1, \cdots, a_c$. With each member of this class we associate a test of $H_0$: reject $H_0$ at significance level $\alpha$ if $\mathscr{L}$ exceeds a predetermined constant $\mathscr{L}_\alpha$.
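The definitions above can be traced numerically. The following is a minimal sketch, not the author's procedure: it enumerates all $\prod n_i$ $c$-plets by brute force, which is feasible only for small samples, and it assumes there are no ties. The function names `rank_proportions`, `constant_A`, and `l_statistic` are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import comb, prod


def rank_proportions(samples):
    """Return u[i][j-1]: proportion of c-plets giving rank j to sample i's observation."""
    c = len(samples)
    counts = [[0] * c for _ in range(c)]
    # Enumerate every c-plet (one observation from each sample).
    for cplet in product(*samples):
        for i, x in enumerate(cplet):
            # Number of observations in the c-plet smaller than x, i.e. j - 1.
            # Assumes no ties, as under continuous distributions.
            smaller = sum(y < x for y in cplet)
            counts[i][smaller] += 1
    total = prod(len(s) for s in samples)
    return [[v / total for v in row] for row in counts]


def constant_A(a):
    """Evaluate the constant A of (1.1) for the weights a_1, ..., a_c."""
    c = len(a)
    total = 0.0
    for j in range(1, c + 1):
        for l in range(1, c + 1):
            first = (comb(c - 1, j - 1) * comb(c - 1, l - 1)
                     / ((2 * c - 1) * comb(2 * c - 2, j + l - 2)))
            total += a[j - 1] * a[l - 1] * (first - 1.0 / c**2)
    return total


def l_statistic(samples, a):
    """Evaluate the statistic of (1.2) for the given samples and weights."""
    c = len(samples)
    N = sum(len(s) for s in samples)
    p = [len(s) / N for s in samples]
    u = rank_proportions(samples)
    L = [sum(aj * uij for aj, uij in zip(a, row)) for row in u]
    mean_of_squares = sum(pi * Li**2 for pi, Li in zip(p, L))
    square_of_mean = sum(pi * Li for pi, Li in zip(p, L)) ** 2
    return N * (c - 1) ** 2 / (constant_A(a) * c**2) * (mean_of_squares - square_of_mean)
```

For $c = 2$ and $a = (0, 1)$, (1.1) reduces to $A = 1/12$ and $L_1$ is the proportion of pairs in which the first sample's observation is the larger one, so the sketch yields a Mann-Whitney type statistic. Adding a constant to every $a_j$ leaves the value unchanged, because the $u_{ij}$ sum to one over $j$.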
We show later in this paper that, under $H_0$, $\mathscr{L}$ is distributed in the limit as $N \rightarrow \infty$ as a $\chi^2$ variate with $c - 1$ degrees of freedom. Hence, for sufficiently large $N$, $\mathscr{L}_\alpha$ may be approximated by the corresponding significance point of the $\chi^2$ distribution with $c - 1$ degrees of freedom. Tests proposed by Bhapkar [2], [3], Sugiura [13], and the author [5], [6] may be seen to belong to this class. This paper attempts to provide a unified treatment of statistics and tests based on $c$-plets, particularly those based on linear combinations of the $u$'s. The detailed properties of statistics belonging to this class are discussed under the null hypothesis and under the following two alternative hypotheses: (I) the alternative of different locations, or shift, the distributions being equal in all other respects; and (II) the alternative of different scales, the distributions again being equal in all other respects. Haller [7] has discussed the use and the properties of some statistics belonging to this class for testing $H_0$ against an alternative of stochastically ordered variables and for selection and ranking procedures. In the fourth section we give a condition on the distributions under which these tests are consistent against specified alternatives. In the fifth section $\mathscr{L}$ is shown to have a limiting noncentral $\chi^2$ distribution with $c - 1$ degrees of freedom under the pertinent alternative hypotheses. The noncentrality parameter is seen to be a quadratic form in the constants $a_1, \cdots, a_c$, involving $F$. The earlier test statistics mentioned above were constructed by taking into account the relative magnitudes of the $u$'s under the null and under the alternative hypotheses. The idea was to emphasize the difference between the two magnitudes.
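The large-sample rule stated above, comparing $\mathscr{L}$ with the upper $\alpha$ point of a $\chi^2$ distribution with $c - 1$ degrees of freedom, can be sketched as follows. This is an illustrative sketch only: the names `chi2_sf` and `reject_h0` are hypothetical, and the upper-tail probability is computed from the standard recurrence for the incomplete gamma function at integer degrees of freedom rather than taken from any particular library.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from Q(1, y) = exp(-y) for even df, or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, exp(-y)
    else:
        a, q = 0.5, erfc(sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), up to a = df / 2.
    while a < df / 2.0:
        q += y**a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q


def reject_h0(statistic, c, alpha=0.05):
    """Asymptotic test: reject H_0 when the chi-square(c - 1) tail probability is below alpha."""
    return chi2_sf(statistic, c - 1) < alpha
```

Rejecting when the tail probability falls below $\alpha$ is equivalent to rejecting when the statistic exceeds the approximate $\mathscr{L}_\alpha$. The approximation is justified only for large $N$; the sketch makes no claim about its accuracy in small samples.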
This "difference" is, in some sense, maximized if we can obtain the statistic from the class that has the largest noncentrality parameter under the alternative hypothesis of interest. This statistic would then be recommended for testing $H_0$ whenever that particular alternative is suspected. Moreover, for this particular alternative hypothesis, the test has maximum asymptotic relative efficiency (ARE, in the Pitman sense) among the class of statistics proposed. In the sixth section we show how to obtain the statistic with the above property and do so for certain specified alternatives. In the same section we compute the ARE of these tests with respect to some of their competitors.

Citation

Jayant V. Deshpande. "A Class of Multisample Distribution-free Tests." Ann. Math. Statist. 41 (1) 227 - 236, February, 1970. https://doi.org/10.1214/aoms/1177697204