Open Access
2022 Gittins’ theorem under uncertainty
Samuel N. Cohen, Tanut Treetanthiploet
Electron. J. Probab. 27: 1-48 (2022). DOI: 10.1214/22-EJP742

Abstract

We study dynamic allocation problems for discrete-time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optimal choices. This involves studying the interaction of our uncertainty with controls which determine the filtration. We also run a simple numerical example which illustrates the interaction between the willingness to explore and the uncertainty aversion of the agent when making decisions.

Acknowledgments

Samuel Cohen thanks the Oxford-Man Institute for research support and acknowledges the support of The Alan Turing Institute under the Engineering and Physical Sciences Research Council grant EP/N510129/1. Tanut Treetanthiploet thanks the University of Oxford for research support while completing this work, and acknowledges the support of the Development and Promotion of Science and Technology Talents Project (DPST) of the Government of Thailand.

Citation

Samuel N. Cohen, Tanut Treetanthiploet. "Gittins’ theorem under uncertainty." Electron. J. Probab. 27: 1-48, 2022. https://doi.org/10.1214/22-EJP742

Information

Received: 21 August 2020; Accepted: 11 January 2022; Published: 2022
First available in Project Euclid: 31 January 2022

MathSciNet: MR4373324
zbMATH: 1485.91060
Digital Object Identifier: 10.1214/22-EJP742

Subjects:
Primary: 60G40, 91B32, 91B70, 93E35

Keywords: Gittins index, multi-armed bandits, nonlinear expectation, robustness, time-consistency, uncertainty

Vol.27 • 2022