## The Annals of Mathematical Statistics

### On the Design and Comparison of Certain Dichotomous Experiments

#### Abstract

It may frequently happen that a researcher, wishing to decide which one of a set of alternatives to accept, finds that there are several experiments available to him which he might perform to guide him in reaching his decision. Thus he is faced with making a preliminary decision as to which experiment or experiments he is to perform. If he admits the possibility of performing more than one, then the question of how many, which ones, and in what order arises. It is such questions as these that come under the heading of comparison and design of experiments. While a great deal of general theory of the design problem has been developed, e.g., by Wald [1] and Maguire [2], few actual solutions of particular problems, especially of the sequential type, have been investigated thus far. Robbins' paper [3] is the first published report dealing with various sequential rules for particular nontruncated design problems. The basic purpose of this paper is to investigate for certain cases the optimal design, which almost uniformly turns out to be exceedingly complicated (see Section 3), and to propose and determine some justification for certain simpler criteria.

Attention is restricted to problems in which there are but two alternatives, or hypotheses, $H_1$ and $H_2$, and it is required to decide between them with a loss of one unit if the false one is selected, while no loss results from selecting the correct one. Further, $\xi$ will denote the a priori probability that $H_1$ is true, and the basic criterion for comparison will be the Bayes risks associated with the various experiments. To say that an experiment is available to a researcher is to say that there is a real random variable which he can observe whose distribution is known under each hypothesis.

As an example of a situation in which this type of question may arise, consider the problem of deciding between utilizing a use test as against a specifications test for acceptance of a lot of manufactured items.
A large lot of items has been produced and a decision is to be made between, say, $w_1$ and $w_2$ as being the proportion of defectives in the lot. Let $X = 1$ or 0 according as an item selected at random is defective or not as determined by subjecting it to a use test. Let $Y = 1$ or 0 according as an item selected at random is classified as defective (because it fails to meet certain specifications) or not. If $\alpha$, the probability that a nondefective item fails to meet the specifications, and $\beta$, the probability that a defective item meets the specifications, are known, then both $X$ and $Y$ have a binomial distribution with known parameter under each hypothesis.

Again, it might be that in the course of a series of treatments of a material there are two points at which a certain characteristic may be measured, say the breaking strength of a metal undergoing a series of heat treatments. Let $X$ and $Y$, respectively, denote the value of the characteristic at the different points in the process. It is reasonable to assume that under each of two simple hypotheses concerning the process and material, $X$ and $Y$ have prescribed normal distributions.

In general, suppose that $X$ and $Y$ are two real random variables having distribution functions $F_i$ and $G_i$, respectively, under hypothesis $H_i$, with corresponding densities $f_i$ and $g_i$ with respect to a common measure, $\Psi$, such that $f_i > 0$ if and only if $g_i > 0$. Let $R_Z(\xi)$ denote the Bayes risk against $\xi$ when using experiment $Z$. Now the computation and comparison of the risks, $R_X(\xi)$ and $R_Y(\xi)$, for all $\xi$, which appears to be necessary in order to obtain an optimal design, is intrinsically complicated in most cases, as will be seen in the following sections. Hence it is of some interest to investigate some more convenient criteria for choosing between experiments.
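As a minimal sketch of the lot-acceptance example, the Bayes risk of a single 0/1 observation under the unit loss described above can be written as $R_Z(\xi) = \sum_z \min\{\xi f_1(z), (1 - \xi) f_2(z)\}$. The function names and the numerical values of $w_1$, $w_2$, $\alpha$, $\beta$ below are illustrative assumptions, not taken from the paper.

```python
def bayes_risk(xi, p1, p2):
    """Bayes risk under 0-1 loss for one observation Z in {0, 1},
    where p_i = P(Z = 1 | H_i) and xi is the prior probability of H_1."""
    return (min(xi * p1, (1 - xi) * p2)                 # contribution of Z = 1
            + min(xi * (1 - p1), (1 - xi) * (1 - p2)))  # contribution of Z = 0


def spec_test_param(w, alpha, beta):
    """P(Y = 1) when the lot has proportion w of defectives:
    a defective fails the specifications with probability 1 - beta,
    a nondefective fails them with probability alpha."""
    return w * (1 - beta) + (1 - w) * alpha


# Hypothetical values chosen only for illustration.
w1, w2, alpha, beta = 0.1, 0.3, 0.05, 0.1
q1 = spec_test_param(w1, alpha, beta)
q2 = spec_test_param(w2, alpha, beta)

# Compare the use test X and the specifications test Y on a grid of priors.
grid = [k / 20 for k in range(1, 20)]
risks = [(xi, bayes_risk(xi, w1, w2), bayes_risk(xi, q1, q2)) for xi in grid]
for xi, rx, ry in risks:
    print(f"xi={xi:.2f}  R_X={rx:.4f}  R_Y={ry:.4f}")
```

With these assumed values the printed $R_X$ column never exceeds the $R_Y$ column. That is what one would expect here, since $Y$ is obtained from $X$ by a misclassification step that does not depend on the hypothesis.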
Any such criterion should, of course, dictate the use of $X$ if $R_X \leqq R_Y$ (i.e., $R_X(\xi) \leqq R_Y(\xi)$ for all $\xi$). To be able to check on this is but one reason for interest in conditions that $R_X \leqq R_Y$. Another is that whenever it is true that $R_X \leqq R_Y$, then regardless of the choice of actions open to the researcher or the loss function used, use of $X$ will never yield a greater Bayes risk than $Y$. Also, if a total of, say, $n$ independent experiments is to be performed, the optimal sequential design is the nonsequential rule: take all $n$ observations of $X$ ([4], [5], [8]).

In Section 2.1 general conditions that $R_X \leqq R_Y$ are derived, some of which are related to those obtained by Blackwell [5] via consideration of the standard experiment. In Section 2.2 the Kullback-Leibler (abbreviated hereafter as K-L) information numbers are introduced, and it is shown, in particular, that they provide a criterion which yields an especially simple necessary condition that $R_X \leqq R_Y$. The K-L information numbers are also considered as functions of that transformation, $t$, such that the distribution of $t(X)$ under $H_1$ is the distribution of $X$ under $H_2$. The case in which all the distributions involved are normal is analyzed in some detail in Section 2.3, where it is seen that the K-L numbers do not yield a sufficient condition that $R_X \leqq R_Y$, though they do yield a sufficient condition that $R_X = R_Y$. The normal case gives an example in which a second criterion, that of being "locally more informative" (Bayes) at both 0 and 1, yields a condition both necessary and sufficient that $R_X \leqq R_Y$. $X$ is termed locally more informative than $Y$ at $\xi$ if $\xi$ lies in an interval $\lbrack\xi_1, \xi_2\rbrack$ such that on $\lbrack 0,1\rbrack \cap \lbrack\xi_1, \xi_2\rbrack$, $R_X(\xi) \leqq R_Y(\xi)$, with strict inequality at at least one end point. This latter criterion is discussed further in Section 2.4.
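The K-L information numbers can be illustrated for 0/1 experiments with a short sketch. The abstract does not spell out the necessary condition; the code assumes the usual form, namely that both K-L numbers of $X$ are at least the corresponding numbers of $Y$. The function name and the parameter values are hypothetical.

```python
import math


def kl_bernoulli(p, q):
    """K-L information number sum_z f_p(z) * log(f_p(z) / f_q(z))
    for a 0/1 variable with success probabilities p and q (0 < p, q < 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


# Hypothetical parameters P(Z = 1 | H_1), P(Z = 1 | H_2) for two experiments.
x_params = (0.1, 0.3)      # experiment X
y_params = (0.135, 0.305)  # experiment Y, a noisier version of X

for name, (p1, p2) in (("X", x_params), ("Y", y_params)):
    print(name,
          "I(1:2) =", round(kl_bernoulli(p1, p2), 4),
          "I(2:1) =", round(kl_bernoulli(p2, p1), 4))

# Assumed form of the necessary condition for R_X <= R_Y:
# both information numbers of X dominate those of Y.
necessary_ok = (kl_bernoulli(*x_params) >= kl_bernoulli(*y_params)
                and kl_bernoulli(*reversed(x_params)) >= kl_bernoulli(*reversed(y_params)))
print("necessary condition holds:", necessary_ok)
```

If the check fails for some pair of experiments, $R_X \leqq R_Y$ can be ruled out without computing either risk function. Passing the check proves nothing by itself, in line with the remark that the K-L numbers do not give a sufficient condition.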
The problem of determining the optimal designs, sequential and nonsequential, for the case in which all distributions are binomial and a fixed number of experiments is to be performed is discussed in Section 3. The complete solutions are found to be exceedingly complicated; a few are given. For the sequential design, a system for obtaining the optimal design which avoids the complete calculation of the successive risk functions was found. In the final section, a sequential rule for terminating experimentation is considered, and the problem of finding a sequential design which minimizes the expected number of experiments is posed. Two reasonable designs are proposed, are shown to be equivalent, and are shown to be better than either of the rules which require that all observations be of the same random variable. Throughout the paper it will be convenient to consider $\xi/(1 - \xi)$, and this will regularly be denoted by $\eta$.
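For a small fixed number of binomial experiments, the optimal sequential design can be found by brute-force backward induction on the posterior probability of $H_1$. The sketch below is not the shortcut system mentioned above; it computes the successive risk functions in full, and its names and parameter values are assumptions made for illustration.

```python
def posterior(xi, p1, p2, z):
    """Posterior probability of H_1 after observing z in {0, 1}, together
    with the marginal probability of z (assumes 0 < xi, p1, p2 < 1)."""
    l1 = p1 if z == 1 else 1 - p1
    l2 = p2 if z == 1 else 1 - p2
    marginal = xi * l1 + (1 - xi) * l2
    return xi * l1 / marginal, marginal


def sequential_risk(xi, n, x_params, y_params):
    """Minimal Bayes risk with n observations still to be taken, and the
    experiment ("X" or "Y") to perform next; (risk, None) when n == 0."""
    if n == 0:
        return min(xi, 1 - xi), None  # terminal decision under 0-1 loss
    best = None
    for name, (p1, p2) in (("X", x_params), ("Y", y_params)):
        risk = 0.0
        for z in (0, 1):
            post, marginal = posterior(xi, p1, p2, z)
            risk += marginal * sequential_risk(post, n - 1, x_params, y_params)[0]
        if best is None or risk < best[0]:
            best = (risk, name)
    return best


# Hypothetical parameters (P(Z = 1 | H_1), P(Z = 1 | H_2)).
x_params, y_params = (0.1, 0.3), (0.135, 0.305)
for n in range(4):
    print(n, sequential_risk(0.5, n, x_params, y_params))
```

The recursion branches four ways at each stage, so it is practical only for small $n$; that cost is one way to see why the complete solutions become complicated so quickly.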

#### Article information

Source
Ann. Math. Statist., Volume 27, Number 2 (1956), 390-409.

Dates
First available in Project Euclid: 28 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aoms/1177728265

Digital Object Identifier
doi:10.1214/aoms/1177728265

Mathematical Reviews number (MathSciNet)
MR87287

Zentralblatt MATH identifier
0072.36102
