Abstract
We consider the problem of allocating effort between projects at different stages of development when new projects are also continually appearing. An expression (14) is derived for the expected reward yielded by the Gittins index policy. This is shown to satisfy the dynamic programming equation for the problem, so confirming optimality of the policy.
Citation
P. Whittle. "Arm-Acquiring Bandits." Ann. Probab. 9 (2) 284–292, April 1981. https://doi.org/10.1214/aop/1176994469