Following Woodroofe, we consider a Bayesian sequential allocation problem between two treatments that incorporates a covariate. The goal is to maximize the total expected discounted reward from an infinite population of patients. Although our model is more general than Woodroofe's, we recover his main result: the myopic rule is asymptotically optimal.
"One-Armed Bandit Problems with Covariates." Ann. Statist. 19 (4) 1978 - 2002, December, 1991. https://doi.org/10.1214/aos/1176348382