Journal of Applied Probability

On ε-optimality of the pursuit learning algorithm

Ryan Martin and Omkar Tilak



Estimator algorithms in learning automata are useful tools for adaptive, real-time optimization in computer science and engineering applications. In this paper we investigate theoretical convergence properties for a special case of estimator algorithms - the pursuit learning algorithm. We identify and fill a gap in existing proofs of probabilistic convergence for pursuit learning. It is tradition to take the pursuit learning tuning parameter to be fixed in practical applications, but our proof sheds light on the importance of a vanishing sequence of tuning parameters in a theoretical convergence analysis.
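To make the role of the tuning parameter concrete, here is a minimal sketch of one common textbook form of the continuous pursuit update. It is not taken from the paper. The function name `pursuit`, the callable `step_size`, the Bernoulli reward model, the rule that untried actions get an estimate of 0, and both example schedules are assumptions made for illustration. In particular, the vanishing schedule shown is only an example of a decaying sequence and is not claimed to satisfy the conditions derived in the paper.

```python
import random


def pursuit(reward_probs, step_size, n_steps, seed=0):
    """Run a simplified pursuit learning automaton (illustrative sketch).

    reward_probs: hypothetical Bernoulli reward probability of each action.
    step_size:    callable t -> tuning parameter in [0, 1] at iteration t.
    Returns the final action-probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r   # action probabilities, initially uniform
    counts = [0] * r    # number of times each action was tried
    wins = [0] * r      # number of rewards each action received

    for t in range(1, n_steps + 1):
        # Sample an action from the current probability vector.
        a = rng.choices(range(r), weights=p)[0]
        reward = 1 if rng.random() < reward_probs[a] else 0
        counts[a] += 1
        wins[a] += reward

        # Empirical reward estimates (assumption: untried actions get 0).
        est = [wins[i] / counts[i] if counts[i] else 0.0 for i in range(r)]
        best = max(range(r), key=lambda i: est[i])

        # Pursuit step: move p toward the unit vector of the current
        # estimated-best action by a fraction lam.
        lam = step_size(t)
        p = [(1.0 - lam) * p[i] + (lam if i == best else 0.0)
             for i in range(r)]
    return p


# Fixed tuning parameter, as is traditional in applications.
p_fixed = pursuit([0.2, 0.5, 0.8], lambda t: 0.05, n_steps=500)
# A vanishing sequence of tuning parameters (illustrative schedule only).
p_vanishing = pursuit([0.2, 0.5, 0.8], lambda t: 1.0 / (t + 10), n_steps=500)
```

Passing the tuning parameter as a function of the iteration index lets the same loop cover both the fixed-parameter practice and a vanishing sequence, which is the contrast the abstract draws.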

Article information

J. Appl. Probab., Volume 49, Number 3 (2012), 795-805.

First available in Project Euclid: 6 September 2012


Primary: 68Q87: Probability in computer science (algorithm analysis, random structures, phase transitions, etc.) [See also 68W20, 68W40]
Secondary: 68W27: Online algorithms; 68W40: Analysis of algorithms [See also 68Q25]

Keywords: convergence; indirect estimator algorithm; learning automaton


Martin, Ryan; Tilak, Omkar. On ε-optimality of the pursuit learning algorithm. J. Appl. Probab. 49 (2012), no. 3, 795–805. doi:10.1239/jap/1346955334.


