Gittins’ theorem under uncertainty

Samuel N. Cohen; Tanut Treetanthiploet

doi:10.1214/22-EJP742

Electron. J. Probab. 27: 1-48 (2022). DOI: 10.1214/22-EJP742

Abstract

We study dynamic allocation problems for discrete time multi-armed bandits under uncertainty, based on the theory of nonlinear expectations. We show that, under an independence assumption on the bandits and with some relaxation in the definition of optimality, a Gittins allocation index gives optimal choices. This involves studying the interaction of our uncertainty with controls which determine the filtration. We also run a simple numerical example which illustrates the interaction between the willingness to explore and the uncertainty aversion of the agent when making decisions.

Acknowledgments

Samuel Cohen thanks the Oxford-Man Institute for research support and acknowledges the support of The Alan Turing Institute under the Engineering and Physical Sciences Research Council grant EP/N510129/1. Tanut Treetanthiploet thanks the University of Oxford for research support while completing this work, and acknowledges the support of the Development and Promotion of Science and Technology Talents Project (DPST) of the Government of Thailand.