## Journal of Applied Probability

### On ε-optimality of the pursuit learning algorithm

#### Abstract

Estimator algorithms in learning automata are useful tools for adaptive, real-time optimization in computer science and engineering applications. In this paper we investigate theoretical convergence properties of a special case of estimator algorithms: the pursuit learning algorithm. We identify and fill a gap in existing proofs of probabilistic convergence for pursuit learning. It is traditional to take the pursuit learning tuning parameter to be fixed in practical applications, but our proof sheds light on the importance of a vanishing sequence of tuning parameters in a theoretical convergence analysis.

#### Article information

Source
J. Appl. Probab., Volume 49, Number 3 (2012), 795-805.

Dates
First available in Project Euclid: 6 September 2012

https://projecteuclid.org/euclid.jap/1346955334

Digital Object Identifier
doi:10.1239/jap/1346955334

Mathematical Reviews number (MathSciNet)
MR3012100

Zentralblatt MATH identifier
1251.68167

#### Citation

Martin, Ryan; Tilak, Omkar. On ε-optimality of the pursuit learning algorithm. J. Appl. Probab. 49 (2012), no. 3, 795–805. doi:10.1239/jap/1346955334. https://projecteuclid.org/euclid.jap/1346955334
