Journal of Applied Probability

Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes

Aryeh Kontorovich and Roi Weiss


We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of noncontracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains, which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we provide a Dvoretzky-Kiefer-Wolfowitz-type inequality and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.
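
For orientation, the classical result for independent samples that a Dvoretzky-Kiefer-Wolfowitz-type bound generalizes can be stated as below. This is the standard textbook form with Massart's tight constant, given only as a baseline; it is not the Markov-chain version established in the article, whose exact form is not reproduced here. The symbols $F$, $\hat{F}_n$, and $\varepsilon$ are notation assumed for this illustration.

```latex
% Classical DKW inequality for i.i.d. samples X_1, ..., X_n with common CDF F.
% \hat{F}_n is the empirical distribution function of the sample.
\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\},
\qquad
\mathbb{P}\Bigl( \sup_{x \in \mathbb{R}} \bigl| \hat{F}_n(x) - F(x) \bigr| > \varepsilon \Bigr)
\le 2 e^{-2 n \varepsilon^{2}}
\quad \text{for all } \varepsilon > 0 .
```

The bound is uniform in $x$ and does not depend on $F$, which is the sense in which such inequalities are called distribution-free.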

Article information

J. Appl. Probab., Volume 51, Number 4 (2014), 1100-1113.

First available in Project Euclid: 20 January 2015

Primary: 60E15: Inequalities; stochastic orderings
Secondary: 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)

Concentration of measure; Markov chain; hidden Markov chain; Chernoff; Dvoretzky-Kiefer-Wolfowitz


Kontorovich, Aryeh; Weiss, Roi. Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes. J. Appl. Probab. 51 (2014), no. 4, 1100-1113.

