Annals of Statistics

Detection of correlations

Ery Arias-Castro, Sébastien Bubeck, and Gábor Lugosi


We consider the hypothesis testing problem of deciding whether an observed high-dimensional vector has independent normal components or, alternatively, has a small subset of correlated components. The correlated components may have a certain combinatorial structure known to the statistician. We establish upper and lower bounds for the worst-case (minimax) risk in terms of the size of the correlated subset, the level of correlation, and the structure of the class of possibly correlated sets. We show that some simple tests have near-optimal performance in many cases, while the generalized likelihood ratio test is suboptimal in some important cases.
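
As a rough illustration of the testing problem described above, the following sketch simulates both hypotheses and applies one simple test based on the squared sum of the components. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function names, the choice of the first k coordinates as the correlated subset, the common-factor construction of the correlation, and the threshold 3.841 (the 95% quantile of a chi-square distribution with one degree of freedom).

```python
import math
import random


def sample_vector(n, k=0, rho=0.0, rng=random):
    """Draw one n-vector with unit-variance normal components.

    Assumption for illustration: the first k components share pairwise
    correlation rho through a common factor; k=0 gives the null hypothesis
    of independent standard normal components.
    """
    shared = rng.gauss(0.0, 1.0)  # common factor inducing the correlation
    x = []
    for i in range(n):
        z = rng.gauss(0.0, 1.0)
        if i < k:
            x.append(math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * z)
        else:
            x.append(z)
    return x


def squared_sum_statistic(x):
    """(sum of components)^2 / n; chi-square with 1 d.o.f. under the null."""
    s = sum(x)
    return s * s / len(x)


def rejection_rate(n, k, rho, trials=2000, threshold=3.841, seed=0):
    """Fraction of simulated vectors on which the test rejects the null."""
    rng = random.Random(seed)
    hits = sum(
        squared_sum_statistic(sample_vector(n, k, rho, rng)) > threshold
        for _ in range(trials)
    )
    return hits / trials
```

Calling `rejection_rate(100, 0, 0.0)` estimates the false-alarm rate under the null, while `rejection_rate(100, 50, 0.5)` estimates the power against a large, strongly correlated subset. The statistic ignores any combinatorial structure of the subset, so the sketch says nothing about the structured cases or the minimax bounds the paper studies.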

Article information

Ann. Statist., Volume 40, Number 1 (2012), 412-435.

First available in Project Euclid: 16 April 2012

Primary: 62F03: Hypothesis testing
Secondary: 62F05: Asymptotic properties of tests

Keywords: Sparse covariance matrix; minimax detection; Bayesian detection; scan statistic; generalized likelihood ratio test


Arias-Castro, Ery; Bubeck, Sébastien; Lugosi, Gábor. Detection of correlations. Ann. Statist. 40 (2012), no. 1, 412–435. doi:10.1214/11-AOS964.
