The Annals of Applied Probability

Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms

Abdelkader Mokkadem and Mariane Pelletier

Abstract

The first aim of this paper is to establish the weak convergence rate of nonlinear two-time-scale stochastic approximation algorithms. Its second aim is to introduce the averaging principle in the context of two-time-scale stochastic approximation algorithms. We first define the notion of asymptotic efficiency in this framework, then introduce the averaged two-time-scale stochastic approximation algorithm, and finally establish its weak convergence rate. We show, in particular, that both components of the averaged two-time-scale stochastic approximation algorithm simultaneously converge at the optimal rate $\sqrt{n}$.
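
As a rough illustration of the setting described in the abstract, the sketch below runs a two-time-scale stochastic approximation recursion on a toy problem and averages the iterates. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar mean fields `-(theta + mu - 3)` and `-(mu - 2 * theta)` (with root $\theta^* = 1$, $\mu^* = 2$), the unit-variance Gaussian noise, the step sizes $n^{-0.8}$ (slow) and $n^{-0.6}$ (fast), and the function name `averaged_two_time_scale`. The averaging used here is a plain running mean of the iterates in the Polyak–Ruppert style; the paper's own algorithm and conditions may differ.

```python
import random


def averaged_two_time_scale(n_steps=20000, seed=0):
    """Toy two-time-scale stochastic approximation with iterate averaging.

    Hypothetical example: noisy observations of
        f(theta, mu) = -(theta + mu - 3)   (slow component)
        g(theta, mu) = -(mu - 2 * theta)   (fast component)
    whose common root is theta* = 1, mu* = 2.
    """
    rng = random.Random(seed)
    theta, mu = 0.0, 0.0          # raw iterates
    theta_bar, mu_bar = 0.0, 0.0  # running averages of the iterates

    for n in range(1, n_steps + 1):
        beta = n ** -0.8   # slow step size (assumed)
        gamma = n ** -0.6  # fast step size (assumed); beta / gamma -> 0

        # Noisy evaluations of the two mean fields at the current iterates.
        f_obs = -(theta + mu - 3.0) + rng.gauss(0.0, 1.0)
        g_obs = -(mu - 2.0 * theta) + rng.gauss(0.0, 1.0)

        # Simultaneous update of both components on their own time scales.
        theta, mu = theta + beta * f_obs, mu + gamma * g_obs

        # Incremental update of the running means.
        theta_bar += (theta - theta_bar) / n
        mu_bar += (mu - mu_bar) / n

    return theta, mu, theta_bar, mu_bar


if __name__ == "__main__":
    theta, mu, theta_bar, mu_bar = averaged_two_time_scale()
    print("raw iterates:     ", theta, mu)
    print("averaged iterates:", theta_bar, mu_bar)
```

In this toy run both averaged components should land close to the root, which is the kind of behaviour the abstract's $\sqrt{n}$ claim refers to; the sketch does not verify the rate itself.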

Article information

Source
Ann. Appl. Probab., Volume 16, Number 3 (2006), 1671-1702.

Dates
First available in Project Euclid: 2 October 2006

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1159804996

Digital Object Identifier
doi:10.1214/105051606000000448

Mathematical Reviews number (MathSciNet)
MR2260078

Zentralblatt MATH identifier
1104.62095

Subjects
Primary: 62L20: Stochastic approximation

Keywords
Stochastic approximation; two-time-scales; weak convergence rate; averaging principle

Citation

Mokkadem, Abdelkader; Pelletier, Mariane. Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms. Ann. Appl. Probab. 16 (2006), no. 3, 1671–1702. doi:10.1214/105051606000000448. https://projecteuclid.org/euclid.aoap/1159804996

