The Annals of Mathematical Statistics

Sequential Selection of Experiments

K. B. Gray, Jr.

Abstract

The problem of sequential selection of experiments, with fixed and optional stopping, is considered. Conditions are given which allow selection, stopping and terminal action rules to be based on a sequence $\{T_j\}$ of statistics, where $T_j$ is a function of past observations $\mathbf{X}^j = (X_1, \cdots, X_j)$ and experiment selections $\mathbf{E}^j = (E_1, \cdots, E_j)$. Randomized stopping, selection, and terminal action rules are allowed, and all probability distributions are defined by densities relative to $\sigma$-finite measures over Euclidean spaces. Here we give a heuristic description of the principal results for the case of optional stopping. At each time $j$ the random variable $X_j$ is observed and a decision is made to stop or continue. If the procedure is stopped, a terminal action $A$ is taken. If it is continued, an experiment $E_{j+1}$, to be performed at time $j + 1$, is chosen. At time $j$, all decisions are based on $\mathbf{X}^j,\mathbf{E}^j$, the past observations and experiment selections. Upon stopping, and taking action $A$, a loss $L(\theta, A)$, where $\theta$ is the unknown state of nature, is incurred. The sampling cost of stopping at $j$ is $C_j(\theta, \mathbf{X}^j, \mathbf{E}^j)$. Let the random variable $N$ denote the random stopping time. A selection rule $\gamma = (\gamma_0, \gamma_1, \cdots)$ is defined by the sequence of conditional densities $\gamma_j(e_{j+1}\mid\mathbf{x}^j, \mathbf{e}^j)$, a stopping rule $\mathbf{\phi} = (\phi_0, \phi_1, \cdots)$ by the probabilities $\phi_j(\mathbf{x}^j,\mathbf{e}^j) = P\{N = j\mid N \geqq j, \mathbf{x}^j,\mathbf{e}^j\}$, and a terminal action rule $\delta = (\delta_0, \delta_1, \cdots)$ by the conditional densities $\delta_j(a\mid\mathbf{x}^j,\mathbf{e}^j)$. Definition of the population densities $f_\theta(x_{j+1}\mid\mathbf{x}^j, \mathbf{e}^{j+1})$ for $j = 0, 1, 2, \cdots$ completely fixes the probability structure.
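As an illustration of how these ingredients fix the probability structure, the joint density of stopping at time $n$ with history $(\mathbf{x}^n, \mathbf{e}^n)$ and taking action $a$ can be written as a product of the rules and population densities defined above. This factorization is a sketch assembled from those definitions, not a formula quoted from the paper; $p_\theta$ is a name introduced here for the joint density.

$$p_\theta(\mathbf{x}^n, \mathbf{e}^n, N = n, a) = \left[\prod_{j=0}^{n-1} \bigl(1 - \phi_j(\mathbf{x}^j,\mathbf{e}^j)\bigr)\, \gamma_j(e_{j+1}\mid\mathbf{x}^j,\mathbf{e}^j)\, f_\theta(x_{j+1}\mid\mathbf{x}^j,\mathbf{e}^{j+1})\right] \phi_n(\mathbf{x}^n,\mathbf{e}^n)\, \delta_n(a\mid\mathbf{x}^n,\mathbf{e}^n)$$

Under this reading, the risk of a policy at $\theta$ is the expectation of $L(\theta, A) + C_N(\theta, \mathbf{X}^N, \mathbf{E}^N)$ with respect to $p_\theta$.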
Define $\{T_j\}$ to be parameter sufficient (PARS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\gamma}(\mathbf{X}^j, \mathbf{E}^j\mid T_j)$ is independent of $\theta$ for all $\gamma$, and policy sufficient (POLS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\phi,\gamma} (T_{j+1}\mid T_j, E_{j+1}, N \geqq j + 1)$ is independent of $\mathbf{\phi}, \mathbf{\gamma}$ for all $\theta$. THEOREM. If $\{T_j\}$ is PARS, then the class of policies $\{\mathbf{\phi}, \mathbf{\gamma}, \mathbf{\delta}^0\}$, where $\mathbf{\delta}^0$ is based on $\{T_j\}$, is essentially complete. THEOREM. If $\{T_j\}$ is PARS and POLS, and the sampling cost is of the form $C_j(\theta, T_j)$, then the class of policies $\{\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0\}$, where $\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0$ are based on $\{T_j\}$, is essentially complete. Conditions are given to aid in the verification of PARS and POLS. The theorems are applied to examples, including versions of the two-armed bandit problem.
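To make the idea of a policy "based on $\{T_j\}$" concrete, the following is a minimal sketch for a two-armed Bernoulli bandit, in which every decision reads only a summary statistic and never the raw history. Everything in it is an assumption chosen for illustration and is not taken from the paper: the statistic (per-arm success and failure counts), the uniform prior, the greedy selection rule, the fixed-horizon stopping rule, and all names (`Stat`, `select`, `stop`, `terminal_action`, `run`).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stat:
    """Hypothetical T_j: success and failure counts for each of two arms."""
    s: tuple = (0, 0)
    f: tuple = (0, 0)

    @property
    def j(self):
        # Number of observations taken so far.
        return sum(self.s) + sum(self.f)

    def update(self, arm, x):
        # T_{j+1} is computed from T_j, E_{j+1} = arm and X_{j+1} = x only.
        s, f = list(self.s), list(self.f)
        if x:
            s[arm] += 1
        else:
            f[arm] += 1
        return Stat(tuple(s), tuple(f))


def posterior_mean(t, arm):
    # Assumed uniform Beta(1, 1) prior on each arm's success probability.
    return (t.s[arm] + 1) / (t.s[arm] + t.f[arm] + 2)


def select(t, rng):
    # Selection rule gamma_j: greedy in the posterior mean, random on ties.
    m0, m1 = posterior_mean(t, 0), posterior_mean(t, 1)
    if m0 == m1:
        return rng.randrange(2)
    return 0 if m0 > m1 else 1


def stop(t, horizon):
    # Stopping rule phi_j: stop once `horizon` observations have been taken.
    return t.j >= horizon


def terminal_action(t):
    # Terminal action rule delta_j: declare the arm with larger posterior mean.
    return 0 if posterior_mean(t, 0) >= posterior_mean(t, 1) else 1


def run(theta, horizon, seed=0):
    """Simulate one run; theta = (p_arm0, p_arm1) is the state of nature."""
    rng = random.Random(seed)
    t = Stat()
    while not stop(t, horizon):
        e = select(t, rng)            # choose experiment E_{j+1} from T_j
        x = rng.random() < theta[e]   # observe X_{j+1}
        t = t.update(e, x)
    return t, terminal_action(t)
```

Calling `run((0.2, 0.8), horizon=50)` returns the final counts and the declared arm. The point of the sketch is structural: `select`, `stop`, and `terminal_action` take only `t` as input, which is the restriction the second theorem says loses essentially nothing when the statistic is PARS and POLS and the cost depends on the history only through $T_j$.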

Article information

Source
Ann. Math. Statist., Volume 39, Number 6 (1968), 1953-1977.

Dates
First available in Project Euclid: 27 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aoms/1177698025

Digital Object Identifier
doi:10.1214/aoms/1177698025

Mathematical Reviews number (MathSciNet)
MR243690

Zentralblatt MATH identifier
0187.16202
