## The Annals of Applied Probability

### Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms

#### Abstract

The first aim of this paper is to establish the weak convergence rate of nonlinear two-time-scale stochastic approximation algorithms. Its second aim is to introduce the averaging principle in the context of two-time-scale stochastic approximation algorithms. We first define the notion of asymptotic efficiency in this framework, then introduce the averaged two-time-scale stochastic approximation algorithm, and finally establish its weak convergence rate. We show, in particular, that both components of the averaged two-time-scale stochastic approximation algorithm simultaneously converge at the optimal rate $\sqrt{n}$.
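The idea of a two-time-scale recursion with averaged iterates can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions and is not the construction analysed in the paper: the functions `f` and `g`, the step-size exponents 0.9 and 0.6, the Gaussian noise, and the use of plain arithmetic means (Polyak–Ruppert style) are all choices made here for the example. The slow component `theta` uses the smaller step size, the fast component `mu` the larger one, and running means of both are returned.

```python
import random


def averaged_two_time_scale(n_steps=20000, seed=0, noise_std=1.0):
    """Toy two-time-scale stochastic approximation with iterate averaging.

    Hypothetical test problem (not taken from the paper): find (theta, mu) with
        f(theta, mu) = 1 - mu       = 0   (slow component)
        g(theta, mu) = theta - mu   = 0   (fast component)
    whose unique root is theta = mu = 1.
    """
    rng = random.Random(seed)
    theta, mu = 0.0, 0.0          # current iterates
    theta_bar, mu_bar = 0.0, 0.0  # running arithmetic means of the iterates

    for n in range(1, n_steps + 1):
        beta = n ** -0.9   # slow step size (assumed exponent)
        gamma = n ** -0.6  # fast step size, so that beta / gamma -> 0

        v = rng.gauss(0.0, noise_std)  # noise on the slow observation
        w = rng.gauss(0.0, noise_std)  # noise on the fast observation

        # Both components are updated from the previous iterates.
        theta, mu = (
            theta + beta * (1.0 - mu + v),
            mu + gamma * (theta - mu + w),
        )

        # Incremental update of the averages over iterates 1..n.
        theta_bar += (theta - theta_bar) / n
        mu_bar += (mu - mu_bar) / n

    return theta_bar, mu_bar


if __name__ == "__main__":
    theta_bar, mu_bar = averaged_two_time_scale()
    print("averaged slow component:", theta_bar)
    print("averaged fast component:", mu_bar)
```

On this toy problem both averaged components should settle near the root value 1. Whether they do so at the rate $\sqrt{n}$ in a given setting depends on conditions of the kind the paper establishes, which this sketch does not check.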

#### Article information

Source
Ann. Appl. Probab., Volume 16, Number 3 (2006), 1671-1702.

Dates
First available in Project Euclid: 2 October 2006

https://projecteuclid.org/euclid.aoap/1159804996

Digital Object Identifier
doi:10.1214/105051606000000448

Mathematical Reviews number (MathSciNet)
MR2260078

Zentralblatt MATH identifier
1104.62095

Subjects
Primary: 62L20: Stochastic approximation

#### Citation

Mokkadem, Abdelkader; Pelletier, Mariane. Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms. Ann. Appl. Probab. 16 (2006), no. 3, 1671–1702. doi:10.1214/105051606000000448. https://projecteuclid.org/euclid.aoap/1159804996
