Open Access
Arm-Acquiring Bandits
P. Whittle
Ann. Probab. 9(2): 284-292 (April, 1981). DOI: 10.1214/aop/1176994469

Abstract

We consider the problem of allocating effort between projects at different stages of development when new projects are also continually appearing. An expression (14) is derived for the expected reward yielded by the Gittins index policy. This is shown to satisfy the dynamic programming equation for the problem, so confirming optimality of the policy.

Citation

P. Whittle. "Arm-Acquiring Bandits." Ann. Probab. 9(2): 284-292, April, 1981. https://doi.org/10.1214/aop/1176994469

Information

Published: April, 1981
First available in Project Euclid: 19 April 2007

zbMATH: 0464.90081
MathSciNet: MR606990
Digital Object Identifier: 10.1214/aop/1176994469

Subjects:
Primary: 42C99
Secondary: 62C99

Keywords: allocation index, dynamic programming, multiarmed bandit

Rights: Copyright © 1981 Institute of Mathematical Statistics
