## The Annals of Statistics

### Oracle inequalities for network models and sparse graphon estimation

#### Abstract

Inhomogeneous random graph models encompass many network models, such as stochastic block models and latent position models. We consider the problem of estimating the matrix of connection probabilities from an observed adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct two estimators of the network connection probabilities: the ordinary block constant least squares estimator and its restricted version. We show that they satisfy oracle inequalities with respect to the block constant oracle. As a consequence, we derive optimal rates of estimation of the probability matrix. Our results cover the important setting of sparse networks. As a further consequence, we establish upper bounds on the minimax risks for graphon estimation in the $L_{2}$ norm when the probability matrix is sampled according to a graphon model. These bounds include an additional term accounting for the “agnostic” error induced by the variability of the latent unobserved variables of the graphon model. In this setting, the optimal rates depend not only on the bias and variance components, as in usual nonparametric problems, but also on a third component, the agnostic error. The results shed light on the differences between estimation under the empirical loss (probability matrix estimation) and under the integrated loss (graphon estimation).
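To make the setting concrete, the following is a minimal sketch, not the authors' procedure. It samples an adjacency matrix from a two-block stochastic block model and fits a block constant matrix by least squares for a fixed, known partition, where the fit reduces to averaging the off-diagonal entries within each block. The estimator described in the abstract also minimizes over the partition itself, a step omitted here. The function names `sample_sbm` and `block_average`, the matrix `Q`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def sample_sbm(labels, Q, rng):
    """Sample a symmetric adjacency matrix (zero diagonal) from an SBM.

    labels: integer block label of each node; Q: k x k block probabilities.
    """
    n = len(labels)
    theta = Q[np.ix_(labels, labels)]      # Theta_ij = Q[z_i, z_j]
    np.fill_diagonal(theta, 0.0)           # no self-loops
    upper = np.triu(rng.random((n, n)) < theta, k=1)
    A = (upper | upper.T).astype(float)    # symmetrize the upper triangle
    return A, theta


def block_average(A, labels, k):
    """Least squares block constant fit for a FIXED partition (assumption):
    each block value is the mean of the off-diagonal entries of A in it."""
    n = len(labels)
    Z = np.eye(k)[labels]                  # n x k one-hot membership matrix
    off = 1.0 - np.eye(n)                  # mask that excludes the diagonal
    sums = Z.T @ (A * off) @ Z             # sum of entries in each block
    counts = Z.T @ off @ Z                 # off-diagonal pairs in each block
    Q_hat = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    theta_hat = Q_hat[np.ix_(labels, labels)]
    np.fill_diagonal(theta_hat, 0.0)
    return theta_hat


# Illustrative values only (hypothetical, not taken from the paper).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)            # two blocks of 100 nodes each
Q = np.array([[0.6, 0.1], [0.1, 0.4]])
A, theta = sample_sbm(labels, Q, rng)
theta_hat = block_average(A, labels, k=2)

# Empirical (normalized Frobenius) loss between estimate and truth.
empirical_loss = np.sum((theta_hat - theta) ** 2) / len(labels) ** 2
print(f"empirical loss: {empirical_loss:.2e}")
```

The loss computed at the end corresponds to the empirical loss mentioned in the abstract. Assessing the integrated loss would additionally require a graphon and latent variables, which this sketch does not model.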

#### Article information

Source
Ann. Statist., Volume 45, Number 1 (2017), 316-354.

Dates
Revised: February 2016
First available in Project Euclid: 21 February 2017

Permanent link to this document
https://projecteuclid.org/euclid.aos/1487667625

Digital Object Identifier
doi:10.1214/16-AOS1454

Mathematical Reviews number (MathSciNet)
MR3611494

Zentralblatt MATH identifier
1367.62090

Subjects
Primary: 62G05: Estimation
Secondary: 60C05: Combinatorial probability

#### Citation

Klopp, Olga; Tsybakov, Alexandre B.; Verzelen, Nicolas. Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 (2017), no. 1, 316–354. doi:10.1214/16-AOS1454. https://projecteuclid.org/euclid.aos/1487667625
