The Annals of Applied Probability

Error bounds for Metropolis–Hastings algorithms applied to perturbations of Gaussian measures in high dimensions

Andreas Eberle

Abstract

The Metropolis-adjusted Langevin algorithm (MALA) is a Metropolis–Hastings method for approximate sampling from continuous distributions. We derive upper bounds for the contraction rate in Kantorovich–Rubinstein–Wasserstein distance of the MALA chain with semi-implicit Euler proposals applied to log-concave probability measures that have a density w.r.t. a Gaussian reference measure. For sufficiently “regular” densities, the estimates are dimension-independent, and they hold for sufficiently small step sizes $h$ that do not depend on the dimension either. In the limit $h\downarrow0$, the bounds approach the known optimal contraction rates for overdamped Langevin diffusions in a convex potential.
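The following minimal sketch illustrates one step of such a chain. The abstract does not state the proposal formula, so the sketch rests on labeled assumptions: the reference measure is the standard normal distribution on $\mathbb{R}^d$, the target has density proportional to $\exp(-V)$ with respect to it, and the semi-implicit Euler proposal from $x$ is Gaussian with mean $(1-h/2)x-(h/2)\nabla V(x)$ and covariance $(h-h^{2}/4)I_d$. The function names and the example potential are hypothetical, and the acceptance probability is computed from the generic Metropolis–Hastings ratio rather than from any closed form in the paper.

```python
import math
import random


def log_target(x, V):
    # Unnormalised log-density of the target w.r.t. Lebesgue measure,
    # ASSUMING a standard normal reference measure: -|x|^2/2 - V(x).
    return -0.5 * sum(xi * xi for xi in x) - V(x)


def proposal_mean(x, grad_V, h):
    # Mean of the ASSUMED semi-implicit Euler proposal started at x.
    return [(1.0 - h / 2.0) * xi - (h / 2.0) * gi
            for xi, gi in zip(x, grad_V(x))]


def log_proposal(x, y, grad_V, h):
    # Log-density of proposing y from x, up to an additive constant that is
    # the same for every pair (x, y) and therefore cancels in the ratio.
    var = h - h * h / 4.0  # positive only for 0 < h < 4
    mean = proposal_mean(x, grad_V, h)
    return -sum((yi - mi) ** 2 for yi, mi in zip(y, mean)) / (2.0 * var)


def log_accept_ratio(x, y, V, grad_V, h):
    # Generic Metropolis-Hastings log-ratio: target(y) q(y, x) / (target(x) q(x, y)).
    return (log_target(y, V) + log_proposal(y, x, grad_V, h)
            - log_target(x, V) - log_proposal(x, y, grad_V, h))


def mala_step(x, V, grad_V, h, rng):
    # Draw a proposal, then accept it with probability min(1, ratio).
    std = math.sqrt(h - h * h / 4.0)
    y = [mi + std * rng.gauss(0.0, 1.0) for mi in proposal_mean(x, grad_V, h)]
    accept_prob = math.exp(min(0.0, log_accept_ratio(x, y, V, grad_V, h)))
    if rng.random() < accept_prob:
        return y, True
    return x, False


if __name__ == "__main__":
    # Hypothetical convex potential V(x) = |x|^2 / 2 in dimension 10.
    V = lambda x: 0.5 * sum(xi * xi for xi in x)
    grad_V = lambda x: list(x)
    rng = random.Random(0)
    x, accepted = [0.0] * 10, 0
    for _ in range(1000):
        x, ok = mala_step(x, V, grad_V, h=0.1, rng=rng)
        accepted += ok
    print("acceptance rate:", accepted / 1000)
```

Under these assumptions, the log acceptance ratio vanishes when $V\equiv 0$, because the proposal is then reversible with respect to the Gaussian reference measure; this gives a quick sanity check of the implementation.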

A similar approach also applies to Metropolis–Hastings chains with Ornstein–Uhlenbeck proposals. In this case, the resulting estimates are still independent of the dimension but weaker, reflecting the fact that MALA is a higher-order approximation of the diffusion limit than Metropolis–Hastings with Ornstein–Uhlenbeck proposals.
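For comparison, a Metropolis–Hastings step with an Ornstein–Uhlenbeck proposal can be sketched as follows. As an assumption not stated in the abstract, the reference measure is taken to be standard normal on $\mathbb{R}^d$ and the proposal from $x$ is $(1-h/2)x+\sqrt{h-h^{2}/4}\,Z$ with $Z$ standard normal. This proposal is reversible with respect to the reference measure, so for a target with density proportional to $\exp(-V)$ the acceptance probability reduces to $\min(1,\exp(V(x)-V(y)))$ and no gradient of $V$ is needed. The function names are hypothetical.

```python
import math
import random


def ou_coefficients(h):
    # Contraction factor and noise scale of the ASSUMED Ornstein-Uhlenbeck
    # proposal; they satisfy a**2 + std**2 == 1, which is what makes the
    # proposal preserve the standard normal reference measure.
    a = 1.0 - h / 2.0
    std = math.sqrt(h - h * h / 4.0)  # real only for 0 <= h <= 4
    return a, std


def ou_mh_step(x, V, h, rng):
    # Propose y = a*x + std*Z, then accept with probability
    # min(1, exp(V(x) - V(y))); the Gaussian terms cancel by reversibility.
    a, std = ou_coefficients(h)
    y = [a * xi + std * rng.gauss(0.0, 1.0) for xi in x]
    accept_prob = math.exp(min(0.0, V(x) - V(y)))
    if rng.random() < accept_prob:
        return y, True
    return x, False


if __name__ == "__main__":
    # Hypothetical convex potential V(x) = |x|^2 / 2 in dimension 10.
    V = lambda x: 0.5 * sum(xi * xi for xi in x)
    rng = random.Random(0)
    x, accepted = [0.0] * 10, 0
    for _ in range(1000):
        x, ok = ou_mh_step(x, V, h=0.1, rng=rng)
        accepted += ok
    print("acceptance rate:", accepted / 1000)
```

Unlike the MALA sketch, the proposal here ignores $\nabla V$ entirely, which is one informal way to see why it tracks the Langevin diffusion less accurately for a given step size.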

Article information

Source
Ann. Appl. Probab., Volume 24, Number 1 (2014), 337–377.

Dates
First available in Project Euclid: 9 January 2014

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1389278728

Digital Object Identifier
doi:10.1214/13-AAP926

Mathematical Reviews number (MathSciNet)
MR3161650

Zentralblatt MATH identifier
1296.60195

Citation

Eberle, Andreas. Error bounds for Metropolis–Hastings algorithms applied to perturbations of Gaussian measures in high dimensions. Ann. Appl. Probab. 24 (2014), no. 1, 337–377. doi:10.1214/13-AAP926. https://projecteuclid.org/euclid.aoap/1389278728
