## Bernoulli


### Structured matrix estimation and completion

#### Abstract

We study the problem of matrix estimation and matrix completion under a general framework. This framework includes several important models as special cases, such as the Gaussian mixture model, the mixed membership model, the bi-clustering model, and dictionary learning. We establish the optimal convergence rates, in a minimax sense, for estimation of the signal matrix under the Frobenius norm and under the spectral norm. As a consequence of our general result, we obtain minimax optimal rates of convergence for these special models.
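The setting can be illustrated with a small simulation. The sketch below (not the authors' estimator; all names and parameter values are hypothetical choices for illustration) generates a block-constant signal matrix of the bi-clustering type, reveals each entry independently with probability $p$ under Gaussian noise, and recovers the signal with a naive rank-truncated SVD of the inverse-probability-weighted observations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bi-clustering signal: rows and columns each fall into k groups,
# so the signal matrix is block-constant and has rank at most k.
n, m, k = 60, 50, 3
row_groups = rng.integers(0, k, size=n)
col_groups = rng.integers(0, k, size=m)
block_means = rng.normal(size=(k, k))
signal = block_means[row_groups][:, col_groups]  # n x m signal matrix

# Matrix completion: observe each entry with probability p, plus noise.
p, sigma = 0.5, 0.1
mask = rng.random((n, m)) < p
observed = np.where(mask, signal + sigma * rng.normal(size=(n, m)), 0.0)

# Naive estimator: rescale by 1/p (unbiased for the signal in expectation)
# and keep only the top-k singular directions.
U, s, Vt = np.linalg.svd(observed / p, full_matrices=False)
estimate = (U[:, :k] * s[:k]) @ Vt[:k]

# Relative error in the Frobenius norm.
err = np.linalg.norm(estimate - signal, "fro") / np.linalg.norm(signal, "fro")
```

With these (arbitrary) dimensions and noise level, the relative Frobenius error of the rank-truncated estimate is well below that of the raw rescaled observations, reflecting the gain from exploiting low-rank structure.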

#### Article information

**Source**
Bernoulli, Volume 25, Number 4B (2019), 3883–3911.

**Dates**
Revised: February 2019
First available in Project Euclid: 25 September 2019

https://projecteuclid.org/euclid.bj/1569398788

**Digital Object Identifier**
doi:10.3150/19-BEJ1114

**Mathematical Reviews number (MathSciNet)**
MR4010976

**Zentralblatt MATH identifier**
07110159

#### Citation

Klopp, Olga; Lu, Yu; Tsybakov, Alexandre B.; Zhou, Harrison H. Structured matrix estimation and completion. Bernoulli 25 (2019), no. 4B, 3883–3911. doi:10.3150/19-BEJ1114. https://projecteuclid.org/euclid.bj/1569398788

#### References

• [1] Achlioptas, D. (2003). Database-friendly random projections: Johnson–Lindenstrauss with binary coins. J. Comput. System Sci. 66 671–687. Special issue on PODS 2001 (Santa Barbara, CA).
• [2] Airoldi, E.M., Blei, D.M., Fienberg, S.E. and Xing, E.P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981–2014.
• [3] Azizyan, M., Singh, A. and Wasserman, L. (2013). Minimax theory for high-dimensional Gaussian mixtures with sparse mean separation. In Advances in Neural Information Processing Systems (C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger, eds.) 26 2139–2147. Curran Associates.
• [4] Bandeira, A.S. and van Handel, R. (2016). Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab. 44 2479–2506.
• [5] Belkin, M. and Sinha, K. (2010). Polynomial learning of distribution families. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science—FOCS 2010 103–112. Los Alamitos, CA: IEEE Computer Soc.
• [6] Borgs, C., Chayes, J. and Smith, A. (2015). Private graphon estimation for sparse graphs. In Advances in Neural Information Processing Systems 1369–1377.
• [7] Bunea, F., She, Y. and Wegkamp, M.H. (2011). Optimal selection of reduced rank estimators of high-dimensional matrices. Ann. Statist. 39 1282–1309.
• [8] Cai, T.T. and Zhou, W.-X. (2016). Matrix completion via max-norm constrained optimization. Electron. J. Stat. 10 1493–1525.
• [9] Candès, E.J. and Plan, Y. (2010). Matrix completion with noise. Proc. IEEE 98 925–936.
• [10] Candès, E.J. and Tao, T. (2010). The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Inform. Theory 56 2053–2080.
• [11] Chan, S.H. and Airoldi, E.M. (2014). A consistent histogram estimator for exchangeable graph models. In Proceedings of the 31st International Conference on Machine Learning 208–216.
• [12] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177–214.
• [13] Chaudhuri, K., Dasgupta, S. and Vattani, A. (2009). Learning mixtures of Gaussians using the $k$-means algorithm. Preprint. Available at arXiv:0912.0086.
• [14] Cheng, Y. and Church, G.M. (2000). Biclustering of expression data. In ISMB 8 93–103.
• [15] Dasgupta, S. (1999). Learning mixtures of Gaussians. In 40th Annual Symposium on Foundations of Computer Science (New York, 1999) 634–644. Los Alamitos, CA: IEEE Computer Soc.
• [16] Dasgupta, S. and Schulman, L.J. (2000). A two-round variant of EM for Gaussian mixtures. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, UAI ’00 152–159. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
• [17] Du, S.S., Wang, Y. and Singh, A. (2017). On the power of truncated SVD for general high-rank matrix estimation problems. In Advances in Neural Information Processing Systems 30 (NIPS) 2017 445–455.
• [18] Foygel, R. and Srebro, N. (2011). Concentration-based guarantees for low-rank matrix reconstruction. In 24th Annual Conference on Learning Theory (COLT).
• [19] Gao, C., Lu, Y., Ma, Z. and Zhou, H.H. (2016). Optimal estimation and completion of matrices with biclustering structures. J. Mach. Learn. Res. 17 Paper No. 161, 29.
• [20] Gao, C., Lu, Y. and Zhou, H.H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624–2652.
• [21] Giraud, C. (2015). Introduction to High-Dimensional Statistics. Monographs on Statistics and Applied Probability 139. Boca Raton, FL: CRC Press.
• [22] Gross, D. (2011). Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theory 57 1548–1566.
• [23] Hardt, M. (2014). Understanding alternating minimization for matrix completion. In 55th Annual IEEE Symposium on Foundations of Computer Science—FOCS 2014 651–660. Los Alamitos, CA: IEEE Computer Soc.
• [24] Hartigan, J.A. (1972). Direct clustering of a data matrix. J. Amer. Statist. Assoc. 67 123–129.
• [25] Holland, P.W., Laskey, K.B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109–137.
• [26] Hsu, D. and Kakade, S.M. (2013). Learning mixtures of spherical Gaussians: Moment methods and spectral decompositions. In ITCS’13—Proceedings of the 2013 ACM Conference on Innovations in Theoretical Computer Science 11–19. New York: ACM.
• [27] Hsu, D., Kakade, S.M. and Zhang, T. (2012). A tail inequality for quadratic forms of subgaussian random vectors. Electron. Commun. Probab. 17 no. 52, 6.
• [28] Karrer, B. and Newman, M.E.J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10.
• [29] Keshavan, R.H., Montanari, A. and Oh, S. (2010). Matrix completion from noisy entries. J. Mach. Learn. Res. 11 2057–2078.
• [30] Klopp, O. (2011). Rank penalized estimators for high-dimensional matrices. Electron. J. Stat. 5 1161–1183.
• [31] Klopp, O. (2014). Noisy low-rank matrix completion with general sampling distribution. Bernoulli 20 282–303.
• [32] Klopp, O., Lu, Y., Tsybakov, A. and Zhou, H. (2019). Supplement to “Structured matrix estimation and completion.” DOI:10.3150/19-BEJ1114SUPP.
• [33] Klopp, O., Tsybakov, A.B. and Verzelen, N. (2017). Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 316–354.
• [34] Koltchinskii, V., Lounici, K. and Tsybakov, A.B. (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist. 39 2302–2329.
• [35] Negahban, S. and Wainwright, M.J. (2012). Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. J. Mach. Learn. Res. 13 1665–1697.
• [36] Olshausen, B.A. and Field, D.J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vis. Res. 37 3311–3325.
• [37] Rigollet, P. and Tsybakov, A. (2011). Exponential screening and optimal rates of sparse estimation. Ann. Statist. 39 731–771.
• [38] Soni, A., Jain, S., Haupt, J. and Gonella, S. (2016). Noisy matrix completion under sparse factor models. IEEE Trans. Inform. Theory 62 3636–3661.
• [39] Tsybakov, A.B. (2009). Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer.
• [40] Vempala, S. and Wang, G. (2004). A spectral algorithm for learning mixture models. J. Comput. System Sci. 68 841–860.
• [41] Wolfe, P.J. and Olhede, S.C. (2013). Nonparametric graphon estimation. Preprint. Available at arXiv:1309.5936.
• [42] Xu, J., Massoulié, L. and Lelarge, M. (2014). Edge label inference in generalized stochastic block models: From spectral theory to impossibility results. In Conference on Learning Theory 903–920. Barcelona, Spain.
• [43] Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.

#### Supplemental materials

• Supplement to “Structured matrix estimation and completion”. The supplementary material [32] provides the remaining proofs.