The Annals of Applied Probability

Multi-armed bandits in discrete and continuous time

Haya Kaspi and Avishai Mandelbaum

We analyze Gittins' Markovian model, as generalized by Varaiya, Walrand and Buyukkoc, in discrete and continuous time. The approach resembles Weber's modification of Whittle's, within the framework of both multi-parameter processes and excursion theory. It is shown that index-priority strategies are optimal, in concert with all the special cases that have been treated previously.

Article information

Ann. Appl. Probab., Volume 8, Number 4 (1998), 1270-1290.

First available in Project Euclid: 9 August 2002

Primary: 60G40: Stopping times; optimal stopping problems; gambling theory [See also 62L15, 91A60]
Secondary: 60J55: Local time and additive functionals; 60G44: Martingales with continuous parameter

Multi-armed bandits; optional increasing paths; multiparameter processes; excursions; local times; dual predictable projection


Kaspi, Haya; Mandelbaum, Avishai. Multi-armed bandits in discrete and continuous time. Ann. Appl. Probab. 8 (1998), no. 4, 1270-1290. doi:10.1214/aoap/1028903380.

  • [1] Azéma, J. (1985). Sur les fermés aléatoires. Séminaire de Probabilités XIX. Lecture Notes in Math. 1123 397-495. Springer, Berlin.
  • [2] Berry, D. A. and Fristedt, D. (1985). Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London.
  • [3] Cairoli, R. and Dalang, R. C. (1996). Sequential stochastic optimization. In Probability and Statistics. Wiley, New York.
  • [4] Dellacherie, C. and Meyer, P. A. (1978). Probabilities and Potentials. North-Holland, Amsterdam.
  • [5] El Karoui, N. and Karatzas, I. (1993). General Gittins index processes in discrete time. Proc. Nat. Acad. Sci. U.S.A. 90 1232-1236.
  • [6] El Karoui, N. and Karatzas, I. (1994). Dynamic allocation problems in continuous time. Ann. Appl. Probab. 4 255-286.
  • [7] El Karoui, N. and Karatzas, I. (1996). Synchronization and optimality for multi-armed bandit problems in continuous time. Unpublished manuscript.
  • [8] Gittins, J. C. (1989). Multi-armed Bandit Allocation Indices. Wiley, New York.
  • [9] Gittins, J. C. and Jones, D. M. (1974). A dynamic allocation index for the sequential design of experiments. In Progress in Statistics (J. Gani et al., eds.) 241-266. North-Holland, Amsterdam.
  • [10] Kaspi, H. and Maisonneuve, B. (1984/85). Predictable local times and exit systems. Séminaire de Probabilités XX. Lecture Notes in Math. 1204 95-100. Springer, Berlin.
  • [11] Kaspi, H. and Mandelbaum, A. (1994). Lévy bandits: multi-armed bandits driven by Lévy processes. Ann. Appl. Probab. 5 541-565.
  • [12] Mandelbaum, A. (1986). Discrete multiarmed bandits and multiparameter processes. Probab. Theory Related Fields 71 129-147.
  • [13] Mandelbaum, A. (1987). Continuous multi-armed bandits and multi-parameter processes. Ann. Probab. 15 1527-1556.
  • [14] Mandelbaum, A. and Vanderbei, R. J. (1981). Optimal stopping and supermartingales over partially ordered sets. Z. Wahrsch. Verw. Gebiete 57 253-264.
  • [15] Presman, E. L. and Sonin, I. N. (1990). Sequential Control with Incomplete Information: The Bayesian Approach to Multi-armed Bandit Problems. Academic Press, New York.
  • [16] Sharpe, M. (1988). General Theory of Markov Processes. Academic Press, New York.
  • [17] Snell, L. (1952). Applications of martingale systems theorems. Trans. Amer. Math. Soc. 73 293-312.
  • [18] Varaiya, P., Walrand, J. and Buyukkoc, C. (1985). Extensions of the multi-armed bandit problem. The discounted case. IEEE Trans. Automat. Control AC-30 426-439.
  • [19] Walsh, J. B. (1981). Optional increasing paths. Colloque ENST-CNET: Lecture Notes in Math. 863 172-201. Springer, Berlin.
  • [20] Weber, R. (1992). On the Gittins index for multi-armed bandits. Ann. Appl. Probab. 2 1024-1035.
  • [21] Whittle, P. (1980). Multi-armed bandits and the Gittins index. J. Roy. Statist. Soc. Ser. B 42 143-149.