The Annals of Statistics

Finding a large submatrix of a Gaussian random matrix

David Gamarnik and Quan Li


We consider the problem of finding a $k\times k$ submatrix with a large average entry in an $n\times n$ matrix with i.i.d. standard Gaussian entries. It was shown in [Bhamidi, Dey and Nobel (2012)] using nonconstructive methods that the largest average value of a $k\times k$ submatrix is $2(1+o(1))\sqrt{\log n/k}$, with high probability (w.h.p.), when $k=O(\log n/\log\log n)$. In the same paper, evidence was provided that a natural greedy algorithm called the Largest Average Submatrix ($\mathcal{LAS}$) for a constant $k$ should produce a matrix with average entry at most $(1+o(1))\sqrt{2\log n/k}$, namely smaller than the global optimum by a factor of approximately $\sqrt{2}$, though no formal proof of this fact was provided.

In this paper, we show that the average entry of the matrix produced by the $\mathcal{LAS}$ algorithm is indeed $(1+o(1))\sqrt{2\log n/k}$ w.h.p. when $k$ is constant and $n$ grows. Then, by drawing an analogy with the problem of finding cliques in random graphs, we propose a simple greedy algorithm which produces a $k\times k$ matrix with asymptotically the same average value $(1+o(1))\sqrt{2\log n/k}$ w.h.p., for $k=o(\log n)$. Since the greedy algorithm is the best known algorithm for finding cliques in random graphs, it is tempting to believe that beating the factor $\sqrt{2}$ performance gap suffered by both algorithms might be very challenging. Surprisingly, we construct a very simple algorithm which produces a $k\times k$ matrix with average value $(1+o_{k}(1)+o(1))(4/3)\sqrt{2\log n/k}$ for $k=o((\log n)^{1.5})$, that is, with the asymptotic factor $4/3$ when $k$ grows.
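
The abstract does not spell out the steps of $\mathcal{LAS}$, so the following is only a minimal sketch of the alternating row/column update usually associated with that name, under stated assumptions: start from arbitrary columns, keep the $k$ rows with the largest sums over them, then the $k$ columns with the largest sums over those rows, and repeat until nothing changes. The names `las_local_search`, `top_k` and `average_entry`, the iteration cap and the sizes `n = 200`, `k = 3` are illustrative choices, not taken from the paper.

```python
import math
import random


def top_k(scores, k):
    """Return the indices of the k largest scores, sorted increasingly."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return sorted(order[:k])


def las_local_search(matrix, k, rng, max_iters=1000):
    """Alternate row and column updates until the k x k submatrix is stable.

    Assumed form of the LAS iteration; max_iters is a safety cap only.
    """
    n = len(matrix)
    cols = sorted(rng.sample(range(n), k))  # arbitrary starting columns
    rows = []
    for _ in range(max_iters):
        # Keep the k rows with the largest sums over the current columns.
        row_sums = [sum(matrix[i][j] for j in cols) for i in range(n)]
        new_rows = top_k(row_sums, k)
        # Keep the k columns with the largest sums over the new rows.
        col_sums = [sum(matrix[i][j] for i in new_rows) for j in range(n)]
        new_cols = top_k(col_sums, k)
        if new_rows == rows and new_cols == cols:
            break  # neither step changes the submatrix any more
        rows, cols = new_rows, new_cols
    return rows, cols


def average_entry(matrix, rows, cols):
    """Average of the entries in the submatrix indexed by rows x cols."""
    total = sum(matrix[i][j] for i in rows for j in cols)
    return total / (len(rows) * len(cols))


rng = random.Random(0)
n, k = 200, 3  # illustrative sizes, far from the asymptotic regime
gaussian = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
rows, cols = las_local_search(gaussian, k, rng)
las_average = average_entry(gaussian, rows, cols)

# Leading-order values quoted in the abstract, without the o(1) terms.
las_scale = math.sqrt(2 * math.log(n) / k)
optimum_scale = 2 * math.sqrt(math.log(n) / k)
print(las_average, las_scale, optimum_scale)
```

At such a small $n$ the $o(1)$ corrections are not negligible, so the printed average should be read as a rough comparison with the two scales rather than as a check of the theorem.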

To gain insight into the algorithmic hardness of this problem, and motivated by methods originating in the theory of spin glasses, we conduct the so-called expected overlap analysis of matrices with average value asymptotically $(1+o(1))\alpha\sqrt{2\log n/k}$ for a fixed value $\alpha\in[1,\sqrt{2}]$. The overlap corresponds to the number of common rows and the number of common columns for pairs of matrices achieving this value (see the paper for details). We discover numerically an intriguing phase transition at $\alpha^{*}\triangleq5\sqrt{2}/(3\sqrt{3})\approx1.3608\ldots\in[4/3,\sqrt{2}]$: when $\alpha<\alpha^{*}$ the space of overlaps is a continuous subset of $[0,1]^{2}$, whereas $\alpha=\alpha^{*}$ marks the onset of discontinuity, and as a result the model exhibits an appropriately defined Overlap Gap Property (OGP) when $\alpha>\alpha^{*}$. We conjecture that the OGP observed for $\alpha>\alpha^{*}$ also marks the onset of algorithmic hardness: no polynomial-time algorithm exists for finding matrices with average value at least $(1+o(1))\alpha\sqrt{2\log n/k}$ when $\alpha>\alpha^{*}$ and $k$ is a mildly growing function of $n$.
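
As a quick arithmetic check of the constants quoted above, the short sketch below evaluates $\alpha^{*}$ and orders it against the other factors that multiply $\sqrt{2\log n/k}$: $1$ for the two greedy procedures, $4/3$ for the improved algorithm, and $\sqrt{2}$ for the global optimum. The variable names are illustrative only.

```python
import math

# Factors multiplying sqrt(2 log n / k), as stated in the abstract.
greedy_factor = 1.0              # LAS and the simple greedy algorithm
improved_factor = 4 / 3          # the improved algorithm, as k grows
optimum_factor = math.sqrt(2)    # global optimum: 2 sqrt(log n / k)

# Conjectured hardness threshold alpha* = 5 sqrt(2) / (3 sqrt(3)).
alpha_star = 5 * math.sqrt(2) / (3 * math.sqrt(3))

print(round(alpha_star, 4))
print(greedy_factor < improved_factor < alpha_star < optimum_factor)
```

The threshold sits strictly between $4/3$ and $\sqrt{2}$, so the conjecture leaves a narrow window above the $4/3$ algorithm in which polynomial-time algorithms are not ruled out.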

Article information

Ann. Statist., Volume 46, Number 6A (2018), 2511–2561.

Received: March 2016
Revised: June 2017
First available in Project Euclid: 7 September 2018

Primary: 68Q87: Probability in computer science (algorithm analysis, random structures, phase transitions, etc.) [See also 68W20, 68W40] 97K50: Probability theory 60C05: Combinatorial probability 68Q25: Analysis of algorithms and problem complexity [See also 68W40]

Random matrix; random graphs; maximum clique; submatrix detection; computational complexity; overlap gap property


Gamarnik, David; Li, Quan. Finding a large submatrix of a Gaussian random matrix. Ann. Statist. 46 (2018), no. 6A, 2511–2561. doi:10.1214/17-AOS1628.


  • [1] Achlioptas, D. and Coja-Oghlan, A. (2008). Algorithmic barriers from phase transitions. In 2008 49th Annual IEEE Symposium on Foundations of Computer Science 793–802. IEEE, New York.
  • [2] Achlioptas, D., Coja-Oghlan, A. and Ricci-Tersenghi, F. (2011). On the solution-space geometry of random constraint satisfaction problems. Random Structures Algorithms 38 251–268.
  • [3] Alon, N., Krivelevich, M. and Sudakov, B. (1998). Finding a large hidden clique in a random graph. Random Structures Algorithms 13 457–466.
  • [4] Berthet, Q. and Rigollet, P. (2013). Complexity theoretic lower bounds for sparse principal component detection. In Conference on Learning Theory 1046–1066.
  • [5] Berthet, Q. and Rigollet, P. (2013). Optimal detection of sparse principal components in high dimension. Ann. Statist. 41 1780–1815.
  • [6] Bhamidi, S., Dey, P. S. and Nobel, A. B. (2012). Energy landscape for large average submatrix detection problems in Gaussian random matrices. Preprint. Available at arXiv:1211.2284.
  • [7] Coja-Oghlan, A. and Efthymiou, C. (2011). On independent sets in random graphs. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms 136–144. SIAM, Philadelphia.
  • [8] Fortunato, S. (2010). Community detection in graphs. Phys. Rep. 486 75–174.
  • [9] Gamarnik, D. and Sudan, M. (2014). Limits of local algorithms over sparse random graphs. In Proceedings of the 5th Conference on Innovations in Theoretical Computer Science 369–376. ACM, New York.
  • [10] Gamarnik, D. and Sudan, M. (2014). Performance of the survey propagation-guided decimation algorithm for the random NAE-K-SAT problem. Preprint. Available at arXiv:1402.0052.
  • [11] Gamarnik, D. and Zadik, I. (2017). High-dimensional regression with binary coefficients. Estimating squared error and a phase transition. Preprint. Available at arXiv:1701.04455.
  • [12] Karp, R. M. (1976). The probabilistic analysis of some combinatorial search algorithms. In Algorithms and complexity: New directions and recent results 1–19.
  • [13] Leadbetter, M. R., Lindgren, G. and Rootzén, H. (1983). Extremes and Related Properties of Random Sequences and Processes. Springer, New York.
  • [14] Madeira, S. C. and Oliveira, A. L. (2004). Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1 24–45.
  • [15] Montanari, A. (2015). Finding one community in a sparse graph. J. Stat. Phys. 161 273–299.
  • [16] Rahman, M. and Virág, B. (2014). Local algorithms for independent sets are half-optimal. Preprint. Available at arXiv:1402.0485.
  • [17] Shabalin, A. A., Weigman, V. J., Perou, C. M. and Nobel, A. B. (2009). Finding large average submatrices in high dimensional data. Ann. Appl. Stat. 985–1012.
  • [18] Sun, X. and Nobel, A. B. (2013). On the maximal size of large-average and ANOVA-fit submatrices in a Gaussian random matrix. Bernoulli 19 275–294.