The Annals of Mathematical Statistics

Asymptotic Distributions of "Psi-Squared" Goodness of Fit Criteria for $m$-th Order Markov Chains

Leo A. Goodman

Abstract

Let $\{X_1, X_2, \cdots, X_N\}$ be an observed sequence from a stochastic process, where $X_i$ can take any one of $s$ values $1, 2, \cdots, s$. Let $f_\mathfrak{u}$ be the frequency of the $m$-tuple $\mathfrak{u} = (u_1, u_2, \cdots, u_m)$ in the sequence. Let $H'_n$ be the composite hypothesis that the process is a Markov chain of order $n$. Let $H_n$ be any simple hypothesis belonging to $H'_n$. Let $H^{\ast}_n$ be the maximum likelihood $H_n$. Let the expected value of $f_\mathfrak{u}$ in a new sequence of length $N$ given $H_n$ be $f_{\mathfrak{u},n}$, and given $H^{\ast}_n$ be $f^{\ast}_{\mathfrak{u},n}$. Let $$\Psi^2_{m,n} = \sum_\mathfrak{u} (f_\mathfrak{u} - f_{\mathfrak{u},n})^2/f_{\mathfrak{u},n},$$ $$\Psi^{\ast 2}_{m,n} = \sum_\mathfrak{u} (f_\mathfrak{u} - f^{\ast}_{\mathfrak{u},n})^2/f^{\ast}_{\mathfrak{u},n},$$ $$\Psi^{\ast 2}_{n + 1,n} = 0.$$ Good had proposed in  the following two conjectures: (a) that the asymptotic distribution $(N \rightarrow \infty)$ of $\Psi^{\ast 2}_{m,n}$, when $H'_n$ is true, is $$\ast^{m - n - 1}_{\lambda = 1} K_{g(\lambda)} (x/\lambda),$$ where $\ast$ denotes convolution, $g(\lambda) = (s - 1)^2s^{m - 1 - \lambda}$, and $K_i(x)$ is the $\chi^2$-distribution with $i$ degrees of freedom; (b) that the asymptotic distribution of $\Psi^2_{m,n}$, when $H_n$ is true, is $$\ast^{m - 1}_{\lambda = 1} K_{g(\lambda)}(x/\lambda)\ast K_{s - 1}(x/m),$$ mathematically independent of $n$. Conjectures (a) and (b) were proved by Billingsley  for the special case $n = 0$. For the special case $n = -1$ (by convention, $H'_{-1}$ is the hypothesis of equiprobable or perfect randomness (see )), Conjecture (b) was proved by Good  when $s$ is prime. In the present paper, Conjecture (a) will be proved for the general case $n \geqq -1$; conjecture (b) will be shown to be incorrect for $n > 0$, although a modified version of (b) will be proved for $n \geqq -1$. A third conjecture by Good  will also be proved here. 
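As an illustration of the definition of $\Psi^2_{m,n}$, the following minimal Python sketch computes the statistic in the simplest case $n = -1$ (equiprobable randomness), where every $m$-tuple has expected frequency $N/s^m$. The function names `mtuple_counts` and `psi_squared_equiprobable` are hypothetical, and the sketch assumes the sequence is treated as circular so that exactly $N$ $m$-tuples are counted; the abstract itself does not state the counting convention.

```python
from collections import Counter
from itertools import product


def mtuple_counts(seq, m):
    """Count m-tuples in seq, wrapping around the end (assumed circular convention)."""
    ext = list(seq) + list(seq[: m - 1])  # append the first m-1 symbols
    return Counter(tuple(ext[i : i + m]) for i in range(len(seq)))


def psi_squared_equiprobable(seq, m, s):
    """Psi^2_{m,-1}: sum over all s^m tuples of (f_u - N/s^m)^2 / (N/s^m).

    Symbols are assumed to be the integers 1..s, as in the abstract.
    """
    N = len(seq)
    counts = mtuple_counts(seq, m)
    expected = N / s**m  # expected frequency under equiprobable randomness
    return sum(
        (counts.get(u, 0) - expected) ** 2 / expected
        for u in product(range(1, s + 1), repeat=m)
    )
```

For the alternating sequence `[1, 2, 1, 2]` with `s = 2`, the single-symbol counts match their expectations exactly, while the pairs `(1, 1)` and `(2, 2)` never occur, so the statistic is zero for `m = 1` and positive for `m = 2`.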
It was assumed in these earlier papers, and it will be assumed here, that all transition probabilities in the Markov chain are positive; the results can be modified accordingly when some of these probabilities are zero (see  and ). Let $M_{m,n} = -2 \log\lambda_{n,m - 1}$, where $\lambda_{n,m - 1}$ is the ratio of the maximum likelihood given $H'_n$ to that given $H'_{m - 1}$ (see ). For $m = n + 2$, the statistic $\Psi^{\ast 2}_{m,n}$ is asymptotically equivalent, when $H'_n$ is true, to the likelihood ratio statistic $M_{m,n}$. For $m > n + 2, \Psi^{\ast 2}_{m,n}$ is asymptotically equivalent, when $H'_n$ is true, to $\sum^{m - n - 1}_{\lambda = 1}\lambda M_{m + 1 - \lambda, m - 1 - \lambda}$, while $M_{m,n}$ is asymptotically equivalent to $$\sum^{m - n - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1 - \lambda}$$ (see , ). Thus, $\Psi^{\ast 2}_{m,n}$ corresponds asymptotically to a weighted sum of the likelihood ratio statistics $M_{n + 2,n}, M_{n + 3, n + 1}, \cdots, M_{m,m - 2}$, with the weights $m - n - 1, m - n - 2, \cdots, 1$, respectively, while $M_{m, n}$ weights these statistics equally (see  and reference to  in Section 4 herein). Let $L_{m,n} = -2 \log \mu_{n,m - 1}$, where $\mu_{n,m - 1}$ is the ratio of the likelihood given $H_n$ to the maximum likelihood given $H'_{m - 1}$. For $m - 1 = n = 0$, the statistic $\Psi^2_{m,n}$ is asymptotically equivalent, when $H_n$ is true, to $L_{m,n}$. For $m - 1 > n = 0, \Psi^2_{m,n}$ is asymptotically equivalent, when $H_n$ is true, to $$\sum^{m - 1}_{\lambda = 1} \lambda M_{m + 1 - \lambda, m - 1 - \lambda} + mL_{n + 1,n},$$ while $L_{m,n}$ is asymptotically equivalent to $\sum^{m - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1 - \lambda} + L_{n + 1, n}$. For $n > 0$, the relation between $\Psi^2_{m,n}$ and the likelihood ratio statistics $L_{m,n}$ and $M_{m,n}$ is not so straightforward. 
However, a modification $\Psi^{'2}_{m,n}$ of $\Psi^2_{m,n}$ (see Section 6 herein) is asymptotically equivalent, when $H_n$ is true, to $L_{m,n}$ for $m = n + 1$, and to $\sum^{m - n - 1}_{\lambda = 1} \lambda M_{m + 1 - \lambda, m - 1 - \lambda} + (m - n)L_{n + 1, n}$ for $m > n + 1$; while the likelihood ratio statistic $L_{m,n}$ is asymptotically equivalent to $$\sum^{m - n - 1}_{\lambda = 1} M_{m + 1 - \lambda, m - 1- \lambda} + L_{n + 1,n}.$$ In , the $m$-tuple $\mathfrak{u}$ was "split" into an $(m - n - 1)$-tuple, an $n$-tuple, and a 1-tuple; thus obtaining $s^n$ "contingency tables" $(n \geqq 0)$ each $s^{m - n - 1} \times s$ (see ). The statistic $M_{m,n}$ can be seen to be asymptotically equivalent to the sum of the "likelihood ratio statistics" (for testing "independence" in each table) for the $s^n$ tables, and the asymptotic distribution, when $H'_n$ is true, of $M_{m,n}$ will be $\chi^2$ with $s^n(s^{m - n - 1} - 1)(s - 1) = s^m - s^{m - 1} - s^{n + 1} + s^n$ degrees of freedom. It is also possible to "split" the $m$-tuple $\mathfrak{u}$ into an $(m - n - 1 - r)$-tuple, an $n$-tuple, and a $(1 + r)$-tuple $(0 \leqq r \leqq m - n - 2)$; thus obtaining $s^n$ "contingency tables," each $s^{m - n - 1 - r} \times s^{1 + r}$ (see ). The sum $_rM_{m,n}$ of the likelihood ratio (or any equivalent goodness of fit) statistics for the $s^n$ tables will have an asymptotic mean value, when $H'_n$ is true, of $$s^n(s^{m - n - 1 - r} - 1)(s^{1 + r} - 1) = s^m - s^{m - r - 1} - s^{n + 1 + r} + s^n,$$ but the asymptotic distribution will not be $\chi^2$ unless $r = 0$ or $m - n - 2$. 
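The two counting identities above are easy to check numerically. The sketch below (function names are hypothetical, and only $n \geqq 0$ is covered, as for the contingency tables) evaluates the factored and expanded forms of the degrees of freedom of $M_{m,n}$ and of the asymptotic mean of $_rM_{m,n}$, and confirms that the mean reduces to the degrees of freedom when $r = 0$ or $r = m - n - 2$.

```python
def df_M(s, m, n):
    """Degrees of freedom of M_{m,n}: s^n tables, each s^(m-n-1) by s."""
    return s**n * (s ** (m - n - 1) - 1) * (s - 1)


def df_M_expanded(s, m, n):
    """Expanded form s^m - s^(m-1) - s^(n+1) + s^n."""
    return s**m - s ** (m - 1) - s ** (n + 1) + s**n


def mean_rM(s, m, n, r):
    """Asymptotic mean of _rM_{m,n}: s^n tables, each s^(m-n-1-r) by s^(1+r)."""
    return s**n * (s ** (m - n - 1 - r) - 1) * (s ** (1 + r) - 1)


def mean_rM_expanded(s, m, n, r):
    """Expanded form s^m - s^(m-r-1) - s^(n+1+r) + s^n."""
    return s**m - s ** (m - r - 1) - s ** (n + 1 + r) + s**n
```

For example, with $s = 2$, $m = 4$, $n = 0$, the degrees of freedom are $(2^3 - 1)(2 - 1) = 7$, and the mean for $r = 1$ is $(2^2 - 1)(2^2 - 1) = 9$.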
It can be seen, using the methods developed in the present paper, that the statistic $_rM_{m,n}$ will be asymptotically equivalent, when $H'_n$ is true, to $$\sum^{m - n - 1}_{\lambda = 1} h(\lambda)M_{m + 1 - \lambda, m - 1 - \lambda},$$ where \begin{equation*}h(\lambda) = \begin{cases}\lambda & \text{for } 0 < \lambda \leqq v,\\ v & \text{for } v \leqq \lambda \leqq m - n - v,\\ m - n - \lambda & \text{for } m - n - v \leqq \lambda \leqq m - n - 1,\end{cases}\end{equation*} and $v = \min \lbrack r + 1, m - n - r - 1\rbrack$. Thus, the asymptotic distribution $(N \rightarrow \infty)$ of $_rM_{m,n}$ (or the corresponding asymptotically equivalent goodness of fit statistics), when $H'_n$ is true, is $$\ast^{m - n - 1}_{\lambda = 1} K_{g(\lambda)}\lbrack x/(h(\lambda))\rbrack.$$ This result generalizes the earlier published results concerning the asymptotic distribution of the likelihood ratio statistic $M_{m,n}$ (or the corresponding asymptotically equivalent goodness of fit statistics) for testing the null hypothesis $H'_n$ within $H'_{m - 1}$, since $_rM_{m,n}$ for $r = 0$ or $m - n - 2$ is asymptotically equivalent to $M_{m,n}$ (see , ). A proof of this result will not be given since the method of proof is quite similar to that presented here for the asymptotic distribution of $\Psi^{\ast 2}_{m,n}$.
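A short consistency check may help in reading the weights $h(\lambda)$. A $\chi^2$ variable with $g(\lambda)$ degrees of freedom scaled by $h(\lambda)$ has mean $h(\lambda)g(\lambda)$, so the mean of the convolution above should be $\sum_\lambda h(\lambda) g(\lambda)$, and this should agree with the asymptotic mean $s^n(s^{m - n - 1 - r} - 1)(s^{1 + r} - 1)$ quoted for $_rM_{m,n}$. The Python sketch below, with hypothetical function names and restricted to $n \geqq 0$, evaluates both sides.

```python
def g(lam, s, m):
    """g(lambda) = (s - 1)^2 * s^(m - 1 - lambda), as defined in the abstract."""
    return (s - 1) ** 2 * s ** (m - 1 - lam)


def h(lam, m, n, r):
    """Piecewise weight h(lambda) with v = min(r + 1, m - n - r - 1)."""
    v = min(r + 1, m - n - r - 1)
    if lam <= v:
        return lam
    if lam <= m - n - v:
        return v
    return m - n - lam


def weighted_mean(s, m, n, r):
    """Sum of h(lambda) * g(lambda) over lambda = 1, ..., m - n - 1."""
    return sum(h(lam, m, n, r) * g(lam, s, m) for lam in range(1, m - n))


def table_mean(s, m, n, r):
    """Asymptotic mean of _rM_{m,n} from the s^n contingency tables."""
    return s**n * (s ** (m - n - 1 - r) - 1) * (s ** (1 + r) - 1)
```

For $s = 2$, $m = 4$, $n = 0$, $r = 1$, the weights are $h = (1, 2, 1)$ and $g = (4, 2, 1)$, giving $4 + 4 + 1 = 9 = (2^2 - 1)(2^2 - 1)$. The overlapping cases of $h$ agree at their boundaries, so in this sketch $h(\lambda)$ coincides with $\min(\lambda, v, m - n - \lambda)$.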

Article information

Source
Ann. Math. Statist., Volume 29, Number 4 (1958), 1123-1133.

Dates
First available in Project Euclid: 27 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aoms/1177706445

Digital Object Identifier
doi:10.1214/aoms/1177706445

Mathematical Reviews number (MathSciNet)
MR99744

Zentralblatt MATH identifier
0113.12505

Citation
Goodman, Leo A. Asymptotic Distributions of "Psi-Squared" Goodness of Fit Criteria for $m$-th Order Markov Chains. Ann. Math. Statist. 29 (1958), no. 4, 1123--1133. doi:10.1214/aoms/1177706445. https://projecteuclid.org/euclid.aoms/1177706445