Bernoulli, Volume 25, Number 4B (2019), 3883-3911.

Structured matrix estimation and completion

Olga Klopp, Yu Lu, Alexandre B. Tsybakov, and Harrison H. Zhou

Abstract

We study the problem of matrix estimation and matrix completion under a general framework. This framework includes several important models as special cases, such as the Gaussian mixture model, the mixed membership model, the bi-clustering model and dictionary learning. We establish the optimal convergence rates in a minimax sense for estimation of the signal matrix under the Frobenius norm and under the spectral norm. As a consequence of our general result, we obtain minimax optimal rates of convergence for various special models.

Article information

Source
Bernoulli, Volume 25, Number 4B (2019), 3883-3911.

Dates
Received: September 2017
Revised: February 2019
First available in Project Euclid: 25 September 2019

Permanent link to this document
https://projecteuclid.org/euclid.bj/1569398788

Digital Object Identifier
doi:10.3150/19-BEJ1114

Mathematical Reviews number (MathSciNet)
MR4010976

Zentralblatt MATH identifier
07110159

Keywords
matrix completion; matrix estimation; minimax optimality; mixture model; stochastic block model

Citation

Klopp, Olga; Lu, Yu; Tsybakov, Alexandre B.; Zhou, Harrison H. Structured matrix estimation and completion. Bernoulli 25 (2019), no. 4B, 3883-3911. doi:10.3150/19-BEJ1114. https://projecteuclid.org/euclid.bj/1569398788

References

  • [1] Achlioptas, D. (2003). Database-friendly random projections: Johnson–Lindenstrauss with binary coins. J. Comput. System Sci. 66 671–687. Special issue on PODS 2001 (Santa Barbara, CA).
  • [2] Airoldi, E.M., Blei, D.M., Fienberg, S.E. and Xing, E.P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981–2014.
  • [3] Azizyan, M., Singh, A. and Wasserman, L. (2013). Minimax theory for high-dimensional Gaussian mixtures with sparse mean separation. In Advances in Neural Information Processing Systems (C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger, eds.) 26 2139–2147. Curran Associates.
  • [4] Bandeira, A.S. and van Handel, R. (2016). Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab. 44 2479–2506.
  • [5] Belkin, M. and Sinha, K. (2010). Polynomial learning of distribution families. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science—FOCS 2010 103–112. Los Alamitos, CA: IEEE Computer Soc.
  • [6] Borgs, C., Chayes, J. and Smith, A. (2015). Private graphon estimation for sparse graphs. In Advances in Neural Information Processing Systems 1369–1377.
  • [7] Bunea, F., She, Y. and Wegkamp, M.H. (2011). Optimal selection of reduced rank estimators of high-dimensional matrices. Ann. Statist. 39 1282–1309.
  • [8] Cai, T.T. and Zhou, W.-X. (2016). Matrix completion via max-norm constrained optimization. Electron. J. Stat. 10 1493–1525.
  • [9] Candès, E.J. and Plan, Y. (2010). Matrix completion with noise. Proc. IEEE 98 925–936.
  • [10] Candès, E.J. and Tao, T. (2010). The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Inform. Theory 56 2053–2080.
  • [11] Chan, S.H. and Airoldi, E.M. (2014). A consistent histogram estimator for exchangeable graph models. In Proceedings of the 31st International Conference on Machine Learning 208–216.
  • [12] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177–214.
  • [13] Chaudhuri, K., Dasgupta, S. and Vattani, A. (2009). Learning mixtures of Gaussians using the $k$-means algorithm. Preprint. Available at arXiv:0912.0086.
  • [14] Cheng, Y. and Church, G.M. (2000). Biclustering of expression data. In ISMB 8 93–103.
  • [15] Dasgupta, S. (1999). Learning mixtures of Gaussians. In 40th Annual Symposium on Foundations of Computer Science (New York, 1999) 634–644. Los Alamitos, CA: IEEE Computer Soc.
  • [16] Dasgupta, S. and Schulman, L.J. (2000). A two-round variant of EM for Gaussian mixtures. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, UAI ’00 152–159. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
  • [17] Du, S.S., Wang, Y. and Singh, A. (2017). On the power of truncated SVD for general high-rank matrix estimation problems. In Advances in Neural Information Processing Systems 30 (NIPS) 2017 445–455.
  • [18] Foygel, R. and Srebro, N. (2011). Concentration-based guarantees for low-rank matrix reconstruction. In 24th Annual Conference on Learning Theory (COLT).
  • [19] Gao, C., Lu, Y., Ma, Z. and Zhou, H.H. (2016). Optimal estimation and completion of matrices with biclustering structures. J. Mach. Learn. Res. 17 Paper No. 161, 29.
  • [20] Gao, C., Lu, Y. and Zhou, H.H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624–2652.
  • [21] Giraud, C. (2015). Introduction to High-Dimensional Statistics. Monographs on Statistics and Applied Probability 139. Boca Raton, FL: CRC Press.
  • [22] Gross, D. (2011). Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inform. Theory 57 1548–1566.
  • [23] Hardt, M. (2014). Understanding alternating minimization for matrix completion. In 55th Annual IEEE Symposium on Foundations of Computer Science—FOCS 2014 651–660. Los Alamitos, CA: IEEE Computer Soc.
  • [24] Hartigan, J.A. (1972). Direct clustering of a data matrix. J. Amer. Statist. Assoc. 67 123–129.
  • [25] Holland, P.W., Laskey, K.B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109–137.
  • [26] Hsu, D. and Kakade, S.M. (2013). Learning mixtures of spherical Gaussians: Moment methods and spectral decompositions. In ITCS’13—Proceedings of the 2013 ACM Conference on Innovations in Theoretical Computer Science 11–19. New York: ACM.
  • [27] Hsu, D., Kakade, S.M. and Zhang, T. (2012). A tail inequality for quadratic forms of subgaussian random vectors. Electron. Commun. Probab. 17 no. 52, 6.
  • [28] Karrer, B. and Newman, M.E.J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10.
  • [29] Keshavan, R.H., Montanari, A. and Oh, S. (2010). Matrix completion from noisy entries. J. Mach. Learn. Res. 11 2057–2078.
  • [30] Klopp, O. (2011). Rank penalized estimators for high-dimensional matrices. Electron. J. Stat. 5 1161–1183.
  • [31] Klopp, O. (2014). Noisy low-rank matrix completion with general sampling distribution. Bernoulli 20 282–303.
  • [32] Klopp, O., Lu, Y., Tsybakov, A. and Zhou, H. (2019). Supplement to “Structured matrix estimation and completion.” DOI:10.3150/19-BEJ1114SUPP.
  • [33] Klopp, O., Tsybakov, A.B. and Verzelen, N. (2017). Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 316–354.
  • [34] Koltchinskii, V., Lounici, K. and Tsybakov, A.B. (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist. 39 2302–2329.
  • [35] Negahban, S. and Wainwright, M.J. (2012). Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. J. Mach. Learn. Res. 13 1665–1697.
  • [36] Olshausen, B.A. and Field, D.J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vis. Res. 37 3311–3325.
  • [37] Rigollet, P. and Tsybakov, A. (2011). Exponential screening and optimal rates of sparse estimation. Ann. Statist. 39 731–771.
  • [38] Soni, A., Jain, S., Haupt, J. and Gonella, S. (2016). Noisy matrix completion under sparse factor models. IEEE Trans. Inform. Theory 62 3636–3661.
  • [39] Tsybakov, A.B. (2009). Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer.
  • [40] Vempala, S. and Wang, G. (2004). A spectral algorithm for learning mixture models. J. Comput. System Sci. 68 841–860.
  • [41] Wolfe, P.J. and Olhede, S.C. (2013). Nonparametric graphon estimation. Preprint. Available at arXiv:1309.5936.
  • [42] Xu, J., Massoulié, L. and Lelarge, M. (2014). Edge label inference in generalized stochastic block models: From spectral theory to impossibility results. In Conference on Learning Theory 903–920. Barcelona, Spain.
  • [43] Yang, Y. and Barron, A. (1999). Information-theoretic determination of minimax rates of convergence. Ann. Statist. 27 1564–1599.

Supplemental materials

  • Supplement to “Structured matrix estimation and completion”. We provide the remaining proofs in the supplementary material [32].