$\mathbf{Q}$- and $\mathbf{A}$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes

Phillip J. Schulte; Anastasios A. Tsiatis; Eric B. Laber; Marie Davidian

doi:10.1214/13-STS450

Statist. Sci. 29(4): 640-661 (November 2014). DOI: 10.1214/13-STS450

Abstract

In clinical practice, physicians make a series of treatment decisions over the course of a patient’s disease based on the patient’s baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is to estimate the optimal regime, that is, the regime that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
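
To make the idea of sequential decision rules concrete, the following is a minimal sketch of Q-learning with two decision points, fitted by backward induction with ordinary least squares. It is not the authors' implementation. The linear working models, the simulated data, and the names `fit_q`, `two_stage_q_learning`, and `decide` are all assumptions introduced here for illustration, and treatments are assumed binary and randomized with probability 0.5.

```python
import numpy as np


def fit_q(main, contrast, a, y):
    """OLS fit of a working model Q(h, a) = main @ beta + a * (contrast @ psi)."""
    X = np.hstack([main, a[:, None] * contrast])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    k = main.shape[1]
    return coef[:k], coef[k:]


def two_stage_q_learning(s1, a1, s2, a2, y):
    """Backward induction: fit stage 2 first, then stage 1 on a pseudo-outcome."""
    one = np.ones_like(s1)

    # Stage 2: the history includes everything observed before the second decision.
    main2 = np.column_stack([one, s1, a1, a1 * s1, s2])
    con2 = np.column_stack([one, s2])
    beta2, psi2 = fit_q(main2, con2, a2, y)

    # Pseudo-outcome: predicted outcome if the best stage-2 action were taken.
    v2 = main2 @ beta2 + np.maximum(con2 @ psi2, 0.0)

    # Stage 1: regress the pseudo-outcome on the baseline history and a1.
    main1 = np.column_stack([one, s1])
    con1 = np.column_stack([one, s1])
    _, psi1 = fit_q(main1, con1, a1, v2)
    return psi1, psi2


def decide(psi, s):
    """Estimated rule: treat (1) when the fitted contrast psi[0] + psi[1]*s is positive."""
    return (psi[0] + psi[1] * np.asarray(s, dtype=float) > 0).astype(int)


# Simulated data (hypothetical): larger y is better.
# By construction, the true stage-1 contrast is 0.5 - s1 and the true stage-2
# contrast is s2 - 0.2, so the optimal rules are "treat if s1 < 0.5" and
# "treat if s2 > 0.2".
rng = np.random.default_rng(0)
n = 5000
s1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
s2 = rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = s1 + a1 * (0.5 - s1) + a2 * (s2 - 0.2) + rng.normal(scale=0.5, size=n)

psi1, psi2 = two_stage_q_learning(s1, a1, s2, a2, y)
```

Calling `decide(psi1, s)` or `decide(psi2, s)` then returns the estimated treatment action for a given covariate value at each decision point. In this simulation the working models are correctly specified, so the fitted contrasts should be close to the true ones; with misspecified models that need not hold.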

Citation

Phillip J. Schulte, Anastasios A. Tsiatis, Eric B. Laber, Marie Davidian. "$\mathbf{Q}$- and $\mathbf{A}$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes." Statist. Sci. 29(4): 640-661, November 2014. https://doi.org/10.1214/13-STS450

Information

Published: November 2014

First available in Project Euclid: 15 January 2015

zbMATH: 1331.62437

MathSciNet: MR3300363

Digital Object Identifier: 10.1214/13-STS450

Keywords: advantage learning, bias-variance trade-off, model misspecification, personalized medicine, potential outcomes, sequential decision-making
