Open Access
April 2016
Batched bandit problems
Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg
Ann. Statist. 44(2): 660-681 (April 2016). DOI: 10.1214/15-AOS1381


Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.



Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg. "Batched bandit problems." Ann. Statist. 44 (2) 660-681, April 2016.


Received: 1 May 2015; Revised: 1 August 2015; Published: April 2016
First available in Project Euclid: 17 March 2016

zbMATH: 1338.62180
MathSciNet: MR3476613
Digital Object Identifier: 10.1214/15-AOS1381

Primary: 62L05
Secondary: 62C20

Keywords: batches, grouped clinical trials, Multi-armed bandit problems, multi-phase allocation, regret bounds, sample size determination, switching cost

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 44 • No. 2 • April 2016