The Annals of Statistics

Trajectory averaging for stochastic approximation MCMC algorithms

Faming Liang

Abstract

The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400–407]. After five decades of continual development, it has grown into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimization. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll, J. Amer. Statist. Assoc. 102 (2007) 305–320]. The application of the trajectory averaging estimator to other stochastic approximation MCMC algorithms, for example, a stochastic approximation MLE algorithm for missing data problems, is also considered in the paper.
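
To make the setting concrete, the sketch below illustrates the kind of recursion the abstract refers to: a Robbins–Monro update driven by samples from an MCMC kernel rather than by exact draws, with the trajectory averaging estimator computed from the iterates. It is a minimal toy illustration, not the SAMC algorithm of Liang, Liu and Carroll (2007): the target (the mean of a normal distribution), the gain exponent GAIN_EXP and the helper metropolis_step are illustrative assumptions, and for simplicity the Markov kernel here does not depend on the parameter, unlike a full SAMCMC algorithm.

```python
# A minimal, hypothetical sketch of a stochastic approximation recursion driven
# by MCMC samples, with a trajectory averaging estimator.  Toy problem: find the
# root of h(theta) = E[X] - theta, where X ~ N(MU, 1) is sampled by a
# random-walk Metropolis kernel rather than drawn exactly, so h(theta) is only
# observed through Markov-chain noise.

import math
import random

random.seed(0)

MU = 3.0         # true root of h(theta) = MU - theta (illustrative choice)
N_ITER = 20000   # number of stochastic approximation iterations
GAIN_EXP = 0.7   # gain a_k = k**(-GAIN_EXP); averaging is usually studied
                 # for gains with 1/2 < GAIN_EXP < 1

def metropolis_step(x, scale=1.0):
    """One random-walk Metropolis update targeting N(MU, 1)."""
    prop = x + random.gauss(0.0, scale)
    log_ratio = -0.5 * ((prop - MU) ** 2 - (x - MU) ** 2)
    return prop if math.log(random.random()) < log_ratio else x

theta = 0.0       # current estimate theta_k
x = 0.0           # current state of the Markov chain
theta_sum = 0.0   # running sum used by the trajectory average

for k in range(1, N_ITER + 1):
    x = metropolis_step(x)          # advance the chain one step
    a_k = k ** (-GAIN_EXP)          # slowly decreasing gain
    theta += a_k * (x - theta)      # Robbins-Monro update with MCMC noise
    theta_sum += theta              # accumulate the trajectory

theta_bar = theta_sum / N_ITER      # trajectory averaging estimator

print(f"last iterate       : {theta:.4f}")
print(f"trajectory average : {theta_bar:.4f}   (true value {MU})")
```

In this toy example the averaged estimate is typically far less variable across random seeds than the final iterate, which is the practical content of the asymptotic efficiency result studied in the paper.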

Article information

Source
Ann. Statist., Volume 38, Number 5 (2010), 2823–2856.

Dates
First available in Project Euclid: 20 July 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1279638541

Digital Object Identifier
doi:10.1214/10-AOS807

Mathematical Reviews number (MathSciNet)
MR2722457

Zentralblatt MATH identifier
1218.60064

Subjects
Primary: 60J22: Computational methods in Markov chains [See also 65C40]; 65C05: Monte Carlo methods

Keywords
Asymptotic efficiency; convergence; Markov chain Monte Carlo; stochastic approximation Monte Carlo; trajectory averaging

Citation

Liang, Faming. Trajectory averaging for stochastic approximation MCMC algorithms. Ann. Statist. 38 (2010), no. 5, 2823–2856. doi:10.1214/10-AOS807. https://projecteuclid.org/euclid.aos/1279638541


References

  • Andrieu, C. and Moulines, É. (2006). On the ergodicity properties of some adaptive MCMC algorithms. Ann. Appl. Probab. 16 1462–1505.
  • Andrieu, C., Moulines, É. and Priouret, P. (2005). Stability of stochastic approximation under verifiable conditions. SIAM J. Control Optim. 44 283–312.
  • Atchadé, Y. F. and Liu, J. S. (2010). The Wang–Landau algorithm in general state spaces: Applications and convergence analysis. Statist. Sinica 20 209–233.
  • Benveniste, A., Métivier, M. and Priouret, P. (1990). Adaptive Algorithms and Stochastic Approximations. Springer, New York.
  • Chandra, T. K. and Goswami, A. (2006). Cesàro α-integrability and laws of large numbers. II. J. Theoret. Probab. 19 789–816.
  • Chen, H. F. (1993). Asymptotically efficient stochastic approximation. Stochastics Stochastics Rep. 45 1–16.
  • Chen, H. F. (2002). Stochastic Approximation and Its Applications. Kluwer Academic, Dordrecht.
  • Chen, H. F., Guo, L. and Gao, A. (1988). Convergence and robustness of the Robbins–Monro algorithm truncated at randomly varying bounds. Stochastic Process. Appl. 27 217–231.
  • Chen, H. F. and Zhu, Y. M. (1986). Stochastic approximation procedures with randomly varying truncations. Sci. Sinica Ser. A 29 914–926.
  • Cheon, S. and Liang, F. (2007). Phylogenetic tree reconstruction using sequential stochastic approximation Monte Carlo. BioSystems 91 94–107.
  • Cheon, S. and Liang, F. (2009). Bayesian phylogeny analysis via stochastic approximation Monte Carlo. Mol. Phylog. Evol. 53 394–403.
  • Delyon, B., Lavielle, M. and Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Ann. Statist. 27 94–128.
  • Dippon, J. and Renz, J. (1997). Weighted means in stochastic approximation of minima. SIAM J. Control Optim. 35 1811–1827.
  • Duflo, M. (1997). Random Iterative Models. Springer, Berlin.
  • Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface (E. M. Keramidas, ed.) 156–163. Interface Foundation, Fairfax, VA.
  • Geyer, C. J. and Thompson, E. A. (1995). Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Amer. Statist. Assoc. 90 909–920.
  • Gu, M. G. and Kong, F. H. (1998). A stochastic approximation algorithm with Markov chain Monte Carlo method for incomplete data estimation problems. Proc. Natl. Acad. Sci. USA 95 7270–7274.
  • Gu, M. G. and Zhu, H. T. (2001). Maximum likelihood estimation for spatial models by Markov chain Monte Carlo stochastic approximation. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 339–355.
  • Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97–109.
  • Kushner, H. J. and Yang, J. (1993). Stochastic approximation with averaging of the iterates: Optimal asymptotic rate of convergence for general processes. SIAM J. Control Optim. 31 1045–1062.
  • Kushner, H. J. and Yang, J. (1995). Stochastic approximation with averaging and feedback: Rapidly convergent “on-line” algorithms. IEEE Trans. Automat. Control 40 24–34.
  • Kushner, H. J. and Yin, G. G. (2003). Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed. Springer, New York.
  • Liang, F. (2005). Generalized Wang–Landau algorithm for Monte Carlo computation. J. Amer. Statist. Assoc. 100 1311–1327.
  • Liang, F. (2007a). Continuous contour Monte Carlo for marginal density estimation with an application to a spatial statistical model. J. Comp. Graph. Statist. 16 608–632.
  • Liang, F. (2007b). Annealing stochastic approximation Monte Carlo for neural network training. Mach. Learn. 68 201–233.
  • Liang, F. (2009). Improving SAMC using smoothing methods: Theory and applications to Bayesian model selection problems. Ann. Statist. 37 2626–2654.
  • Liang, F., Liu, C. and Carroll, R. J. (2007). Stochastic approximation in Monte Carlo computation. J. Amer. Statist. Assoc. 102 305–320.
  • Liang, F. and Zhang, J. (2009). Learning Bayesian networks for discrete data. Comput. Statist. Data Anal. 53 865–876.
  • Marinari, E. and Parisi, G. (1992). Simulated tempering: A new Monte Carlo scheme. Europhys. Lett. 19 451–458.
  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1091.
  • Moyeed, R. A. and Baddeley, A. J. (1991). Stochastic approximation of the MLE for a spatial point pattern. Scand. J. Statist. 18 39–50.
  • Pelletier, M. (2000). Asymptotic almost sure efficiency of averaged stochastic algorithms. SIAM J. Control Optim. 39 49–72.
  • Polyak, B. T. (1990). New stochastic approximation type procedures. Avtomat. i Telemekh. 7 98–107 (in Russian).
  • Polyak, B. T. and Juditsky, A. B. (1992). Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30 838–855.
  • Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Math. Statist. 22 400–407.
  • Ruppert, D. (1988). Efficient estimators from a slowly convergent Robbins–Monro procedure. Technical Report 781, School of Operations Research and Industrial Engineering, Cornell Univ.
  • Tadić, V. (1997). Convergence of stochastic approximation under general noise and stability conditions. In Proceedings of the 36th IEEE Conference on Decision and Control 3 2281–2286. IEEE Systems Society, San Diego, CA.
  • Tang, Q. Y., L’Ecuyer, P. and Chen, H. F. (1999). Asymptotic efficiency of perturbation-analysis-based stochastic approximation with averaging. SIAM J. Control Optim. 37 1822–1847.
  • Wang, F. and Landau, D. P. (2001). Efficient, multiple-range random walk algorithm to calculate the density of states. Phys. Rev. Lett. 86 2050–2053.
  • Wang, I.-J., Chong, E. K. P. and Kulkarni, S. R. (1997). Weighted averaging and stochastic approximation. Math. Control Signals Systems 10 41–60.
  • Younes, L. (1989). Parametric inference for imperfectly observed Gibbsian fields. Probab. Theory Related Fields 82 625–645.
  • Younes, L. (1999). On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics Stochastics Rep. 65 177–228.
  • Yu, K. and Liang, F. (2009). Efficient P-value evaluation for resampling-based tests. Technical report, Dept. Statistics, Texas A&M Univ., College Station, TX.