The Annals of Statistics
- Ann. Statist.
- Volume 39, Number 1 (2011), 278-304.
Detection of an anomalous cluster in a network
We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors which exhibit an “unusual behavior.” Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough.
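As a rough illustration of the testing problem described above, the following is a minimal sketch of a scan statistic in the simplest special case: nodes arranged on a line, with candidate clusters taken to be contiguous intervals. The abstract covers far more general networks and cluster classes; the function names `scan_statistic` and `simulate`, the interval cluster class, and the standardization by the square root of the cluster size are assumptions made for this example, not the authors' construction.

```python
import math
import random


def scan_statistic(x, min_len=1, max_len=None):
    """Scan all contiguous intervals of x (an assumed cluster class).

    Returns the largest standardized sum, sum(x[a:b]) / sqrt(b - a),
    together with the half-open interval (a, b) that attains it.
    """
    n = len(x)
    if max_len is None:
        max_len = n
    # Prefix sums make each interval sum an O(1) lookup.
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    best, best_interval = -math.inf, None
    for length in range(min_len, max_len + 1):
        scale = math.sqrt(length)
        for start in range(0, n - length + 1):
            z = (prefix[start + length] - prefix[start]) / scale
            if z > best:
                best, best_interval = z, (start, start + length)
    return best, best_interval


def simulate(n, cluster=None, mu=0.0, rng=None):
    """Draw n i.i.d. standard normal values; shift those in `cluster` by mu.

    cluster=None corresponds to the null hypothesis; cluster=(a, b) with
    mu > 0 corresponds to the alternative with anomalous nodes a..b-1.
    """
    if rng is None:
        rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if cluster is not None:
        a, b = cluster
        for i in range(a, b):
            x[i] += mu
    return x


if __name__ == "__main__":
    null_value, _ = scan_statistic(simulate(200, rng=random.Random(1)))
    alt_value, alt_interval = scan_statistic(
        simulate(200, cluster=(80, 100), mu=1.5, rng=random.Random(1))
    )
    print("null:", null_value)
    print("alternative:", alt_value, alt_interval)
```

In practice a test would reject the null when the statistic exceeds a threshold; one simple way to pick that threshold, assumed here rather than taken from the paper, is to simulate the statistic many times under the null and use an upper quantile.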
First available in Project Euclid: 3 December 2010
Primary: 62C20: Minimax procedures; 62G10: Hypothesis testing
Secondary: 82B20: Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs
Keywords: Detecting a cluster of nodes in a network; minimax detection; Bayesian detection; scan statistic; generalized likelihood ratio test; disease outbreak detection; sensor networks; Richardson’s model; cellular automata
Arias-Castro, Ery; Candès, Emmanuel J.; Durand, Arnaud. Detection of an anomalous cluster in a network. Ann. Statist. 39 (2011), no. 1, 278--304. doi:10.1214/10-AOS839. https://projecteuclid.org/euclid.aos/1291388376
- Supplementary material: Technical Arguments. In the supplementary file, we prove the results stated here. It is divided into three sections. In the first section, we state and prove general lower bounds on the minimax rate and upper bounds on the detection rate achieved by an ɛ-scan statistic. We do this for the normal location model first and then extend these results to a general one-parameter exponential family. In the second section, we gather a number of results on volumes and node counts. In the third and last section, we prove the main results.