## The Annals of Probability

### Entropy and the Consistent Estimation of Joint Distributions

#### Abstract

The $k$th-order joint distribution of an ergodic finite-alphabet process can be estimated from a sample path of length $n$ by sliding a window of length $k$ along the sample path and counting frequencies of $k$-blocks. In this paper the problem of consistent estimation when $k = k(n)$ grows as a function of $n$ is addressed. It is shown that the variational distance between the true $k(n)$-block distribution and the empirical $k(n)$-block distribution goes to 0 almost surely for the class of weak Bernoulli processes, provided $k(n) \leq (\log n)/(H + \epsilon)$, where $H$ is the entropy of the process. The weak Bernoulli class includes the i.i.d. processes, the aperiodic Markov chains and functions thereof, and the aperiodic renewal processes. A similar result is also shown to hold for functions of irreducible Markov chains. This work sharpens prior results obtained for more general classes of processes by Ornstein and Weiss and by Ornstein and Shields, which used the $\bar{d}$-distance rather than the variational distance.
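The estimator described in the abstract can be sketched in a few lines. The snippet below (an illustration, not from the paper; all function names are ours) forms the empirical $k$-block distribution by sliding a window along a sample path, computes the variational (total variation) distance between two block distributions, and evaluates the paper's window-length bound $k(n) \leq (\log n)/(H + \epsilon)$, taking $H$ in nats so that natural logarithms apply.

```python
import math
from collections import Counter


def empirical_k_block_distribution(path, k):
    """Slide a length-k window along the sample path (overlapping
    positions) and return the relative frequency of each k-block."""
    n = len(path)
    total = n - k + 1  # number of window positions
    counts = Counter(tuple(path[i:i + k]) for i in range(total))
    return {block: c / total for block, c in counts.items()}


def variational_distance(p, q):
    """Variational (total variation) distance between two block
    distributions given as dicts: (1/2) * sum_b |p(b) - q(b)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in support)


def max_window_length(n, entropy, eps):
    """Largest integer k with k <= (log n)/(H + eps), the growth
    condition under which the paper proves consistency."""
    return math.floor(math.log(n) / (entropy + eps))
```

For example, for a fair-coin i.i.d. process ($H = \log 2$ nats) and $n = 1000$, `max_window_length(1000, math.log(2), 0.01)` permits windows up to length 9; the theorem then guarantees that the empirical 9-block distribution converges to the true one in variational distance as $n \to \infty$.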

#### Article information

- **Source:** Ann. Probab., Volume 22, Number 2 (1994), 960–977.
- **Dates:** First available in Project Euclid: 19 April 2007
- **Permanent link:** https://projecteuclid.org/euclid.aop/1176988736
- **Digital Object Identifier:** doi:10.1214/aop/1176988736
- **Mathematical Reviews number (MathSciNet):** MR1288138
- **Zentralblatt MATH identifier:** 0806.28014
