The Annals of Statistics

Generalized Group Testing Procedures

Steven F. Arnold

Abstract

A person wishes to determine which, if any, of $n = \prod_{j=1}^{k} a_j$ i.i.d. random variables, $X(i_1,\ldots, i_k)$, $i_j = 1,\ldots, a_j$, lie in some specified set $A$. Such observations will be called unsafe. It is assumed that the density of the $X$'s is known and that $Y_j(i_1,\ldots, i_j)$, the sum of all the $X$'s whose first $j$ indices are $i_1,\ldots, i_j$, can be measured as easily as the individual $X$'s. In this paper, search procedures of the following form are studied. The person first measures $Y_0$, the sum of all the $X$'s. On the basis of $Y_0$, he decides whether to stop, and classify all the $X$'s as safe, or to continue and measure $Y_1(1),\ldots, Y_1(a_1 - 1)$ (and hence know $Y_1(a_1) = Y_0 - \sum_{i=1}^{a_1-1} Y_1(i)$). If he has decided to continue, he measures $Y_1(j)$. For each of $(Y_0, Y_1(j))$, he must decide whether to stop and classify as safe all $X$'s whose first index is $j$, or to continue and measure $Y_2(j, 1),\ldots, Y_2(j, a_2 - 1)$ (and hence know $Y_2(j, a_2)$). He continues in this fashion until each $X$ has either been classified safe or has been observed. Unlike most group testing problems, he is not restricted to procedures that will locate all the unsafe observations. Instead there is a loss function $L(x)$ measuring the loss if $X(i_1,\ldots, i_k) = x$ and is not observed. Let $V_1$ be the expected loss of a procedure (summed over all the $X$'s), and let $V_2$ be the expected number of measurements. For each $0 \leqq p \leqq 1$, a class of rules $D(p)$ is defined such that if a procedure is in $D(p)$, it minimizes $pV_1 + (1 - p)V_2$, and conversely, if a procedure minimizes $pV_1 + (1 - p)V_2$, then there is a rule in $D(p)$ that leads to the same decisions a.e. The union of the $D(p)$ is shown to be an essentially complete class of rules. A simpler form for the rules in $D(p)$ is derived for the case where the loss function is nondecreasing. More specific calculations are given for the case where the $X$'s are normally distributed, and $L(x)$ is the indicator function for the set $\{x \geqq d\}$.
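
The shape of such a search procedure can be illustrated with a minimal Python sketch. It is not the paper's rule: the optimal rules in $D(p)$ are not reproduced here, and the stopping decision below is a hypothetical simplification that compares only the current group sum with a fixed cutoff per level (the names `group_sum`, `search`, and `cutoffs` are assumptions made for illustration, and indices are 0-based). What the sketch does follow from the abstract is the bookkeeping: $Y_0$ costs one measurement, and splitting a group at level $j$ costs $a_{j+1} - 1$ further measurements because the last subgroup sum follows by subtraction.

```python
from itertools import product


def group_sum(x, shape, prefix):
    """Y_j(prefix): sum of all X's whose first j indices equal prefix."""
    rest = [range(a) for a in shape[len(prefix):]]
    return sum(x[prefix + tail] for tail in product(*rest))


def search(x, shape, cutoffs):
    """Run a hierarchical search and return (observed, measurements).

    x       : dict mapping index tuples (i_1, ..., i_k) to values
    shape   : (a_1, ..., a_k)
    cutoffs : hypothetical thresholds, one per level 0..k-1; a group is
              split further only if its sum is >= the cutoff for its
              level, otherwise all its X's are classified safe unobserved.
    """
    observed = {}
    measurements = 1  # the initial measurement of Y_0

    def visit(prefix):
        nonlocal measurements
        level = len(prefix)
        if level == len(shape):
            # A level-k "sum" is the individual X itself: it is observed.
            observed[prefix] = x[prefix]
            return
        if group_sum(x, shape, prefix) < cutoffs[level]:
            return  # stop: classify the whole group as safe
        # Measure all but one subgroup sum; the last follows by subtraction.
        measurements += shape[level] - 1
        for i in range(shape[level]):
            visit(prefix + (i,))

    visit(())
    return observed, measurements
```

For example, with `shape = (2, 2)`, a single large value at index `(1, 0)`, and cutoffs `[3, 3]`, the sketch splits the full set, stops on the first half, and observes both entries of the second half.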

Article information

Source
Ann. Statist., Volume 5, Number 6 (1977), 1170-1182.

Dates
First available in Project Euclid: 12 April 2007

https://projecteuclid.org/euclid.aos/1176344002

Digital Object Identifier
doi:10.1214/aos/1176344002

Mathematical Reviews number (MathSciNet)
MR448753

Zentralblatt MATH identifier
0383.62006
