## Journal of Applied Probability

### Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes

#### Abstract

We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of noncontracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains, which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we provide a Dvoretzky-Kiefer-Wolfowitz-type inequality and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.
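For orientation, the classical i.i.d. statements that the two named inequalities generalize can be written as below. This is a background sketch only: the symbols $X_i$, $F$, $F_n$, $n$, and $\varepsilon$ are notation assumed here, not taken from this page, and the paper's Markov-chain versions (with their mixing-dependent constants) are not reproduced.

```latex
% Assumed setting: X_1, ..., X_n i.i.d. with distribution function F,
% and empirical distribution function F_n(x) = (1/n) \sum_{i=1}^n 1{X_i <= x}.

% Dvoretzky-Kiefer-Wolfowitz inequality (tight-constant form), for every \varepsilon > 0:
\[
  \Pr\Bigl( \sup_{x \in \mathbb{R}} \bigl| F_n(x) - F(x) \bigr| > \varepsilon \Bigr)
  \;\le\; 2 \exp\bigl( -2 n \varepsilon^2 \bigr).
\]

% Chernoff-Hoeffding bound for X_i taking values in [0, 1], for every \varepsilon > 0:
\[
  \Pr\Bigl( \Bigl| \frac{1}{n} \sum_{i=1}^{n} X_i - \mathbb{E} X_1 \Bigr| > \varepsilon \Bigr)
  \;\le\; 2 \exp\bigl( -2 n \varepsilon^2 \bigr).
\]
```

Neither right-hand side depends on the size of the state space, which is the sense of "dimension-free" in the i.i.d. case.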

#### Article information

Source
J. Appl. Probab., Volume 51, Number 4 (2014), 1100-1113.

Dates
First available in Project Euclid: 20 January 2015

Permanent link to this document
https://projecteuclid.org/euclid.jap/1421763330

Mathematical Reviews number (MathSciNet)
MR3301291

Zentralblatt MATH identifier
1320.60060

#### Citation

Kontorovich, Aryeh; Weiss, Roi. Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes. J. Appl. Probab. 51 (2014), no. 4, 1100–1113. https://projecteuclid.org/euclid.jap/1421763330
