The Annals of Statistics

Spectral method and regularized MLE are both optimal for top-$K$ ranking

Yuxin Chen, Jianqing Fan, Cong Ma, and Kaizheng Wang


Abstract

This paper is concerned with the problem of top-$K$ ranking from pairwise comparisons. Given a collection of $n$ items and a few pairwise comparisons across them, one wishes to identify the set of $K$ items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model—the Bradley–Terry–Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress toward characterizing the performance (e.g., the mean square error for estimating the scores) of several classical methods, including the spectral method and the maximum likelihood estimator (MLE). However, where they stand regarding top-$K$ ranking remains unsettled.
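The comparison mechanism of the Bradley–Terry–Luce model described above can be sketched in a few lines of Python. This is a generic illustration, not code from the paper: the variable names, the simulation of every pair, and the choice of $L$ comparisons per pair are ours (the paper instead samples pairs at random under an Erdős–Rényi-type model).

```python
import numpy as np

def btl_prob(w_i, w_j):
    """P(item i beats item j) under the Bradley-Terry-Luce model."""
    return w_i / (w_i + w_j)

rng = np.random.default_rng(0)
n = 5                                    # number of items
scores = np.exp(rng.standard_normal(n))  # latent preference scores w_i > 0
L = 200                                  # comparisons per observed pair

# Simulate L independent comparisons for every pair (i, j); the outcome of
# each comparison depends only on the relative scores of the two items.
wins = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        p = btl_prob(scores[i], scores[j])
        wins[i, j] = rng.binomial(L, p)  # number of times i beat j
        wins[j, i] = L - wins[i, j]      # number of times j beat i
```

Note that each outcome depends only on the ratio $w_i/w_j$, so the latent scores are identifiable only up to a global scaling.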

We demonstrate that under a natural random sampling model, the spectral method alone, or the regularized MLE alone, is minimax optimal in terms of the sample complexity—the number of paired comparisons needed to ensure exact top-$K$ identification, for the fixed dynamic range regime. This is accomplished via optimal control of the entrywise error of the score estimates. We complement our theoretical studies by numerical experiments, confirming that both methods yield low entrywise errors for estimating the underlying scores. Our theory is established via a novel leave-one-out trick, which proves effective for analyzing both iterative and noniterative procedures. Along the way, we derive an elementary eigenvector perturbation bound for probability transition matrices, which parallels the Davis–Kahan $\mathop{\mathrm{sin}}\nolimits \Theta $ theorem for symmetric matrices. This also allows us to close the gap between the $\ell_{2}$ error upper bound for the spectral method and the minimax lower limit.
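To make the spectral method concrete, here is a minimal Rank-Centrality-style sketch in the spirit of Negahban, Oh and Shah (cited below), not the paper's exact implementation; the function names, the parameter `d`, and the fixed power-iteration count are our choices. It builds the probability transition matrix whose stationary distribution estimates the scores, i.e., the kind of matrix to which the eigenvector perturbation bound mentioned above applies.

```python
import numpy as np

def spectral_scores(wins, d):
    """Rank-Centrality-style score estimate: the stationary distribution of
    a Markov chain built from pairwise win counts. `d` must upper-bound the
    number of opponents of any item, so each row stays substochastic."""
    n = wins.shape[0]
    counts = wins + wins.T          # total comparisons per pair
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and counts[i, j] > 0:
                # move from i to j in proportion to how often j beat i
                P[i, j] = wins[j, i] / counts[i, j] / d
        P[i, i] = 1.0 - P[i].sum()  # self-loop keeps P row-stochastic
    pi = np.full(n, 1.0 / n)        # stationary distribution via power iteration
    for _ in range(1000):
        pi = pi @ P
    return pi / pi.sum()

def top_k(wins, d, K):
    """Indices of the K items with the largest estimated scores."""
    return np.argsort(-spectral_scores(wins, d))[:K]
```

With noiseless expected win counts this chain satisfies detailed balance with $\pi_i \propto w_i$, so the stationary distribution recovers the true scores exactly; the difficulty addressed by the paper is controlling the entrywise error of the estimate when the counts are noisy and the comparison graph is sparse.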

Article information

Source
Ann. Statist., Volume 47, Number 4 (2019), 2204-2235.

Dates
Received: August 2017
Revised: July 2018
First available in Project Euclid: 21 May 2019

Permanent link to this document
https://projecteuclid.org/euclid.aos/1558425643

Digital Object Identifier
doi:10.1214/18-AOS1745

Mathematical Reviews number (MathSciNet)
MR3953449

Zentralblatt MATH identifier
07082284

Subjects
Primary: 62F07: Ranking and selection
Secondary: 62B10: Information-theoretic topics [See also 94A17]

Keywords
Top-$K$ ranking; pairwise comparisons; spectral method; regularized MLE; entrywise perturbation; leave-one-out analysis; reversible Markov chains

Citation

Chen, Yuxin; Fan, Jianqing; Ma, Cong; Wang, Kaizheng. Spectral method and regularized MLE are both optimal for top-$K$ ranking. Ann. Statist. 47 (2019), no. 4, 2204--2235. doi:10.1214/18-AOS1745. https://projecteuclid.org/euclid.aos/1558425643



References

  • Abbe, E., Fan, J., Wang, K. and Zhong, Y. (2017). Entrywise eigenvector analysis of random matrices with low expected rank. arXiv preprint. Available at arXiv:1709.09565.
  • Agarwal, A., Agarwal, S., Assadi, S. and Khanna, S. (2017). Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Conference on Learning Theory 39–75.
  • Ammar, A. and Shah, D. (2011). Ranking: Compare, don’t score. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton) 776–783. DOI:10.1109/Allerton.2011.6120246.
  • Ammar, A. and Shah, D. (2012). Efficient rank aggregation using partial data. In SIGMETRICS 40 355–366. ACM, New York.
  • Baltrunas, L., Makcinskas, T. and Ricci, F. (2010). Group recommendations with rank aggregation and collaborative filtering. In Proceedings of the Fourth ACM Conference on Recommender Systems. RecSys ’10 119–126. ACM, New York.
  • Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika 39 324–345.
  • Bubeck, S. (2015). Convex optimization: Algorithms and complexity. Found. Trends Mach. Learn. 8 231–357.
  • Busa-Fekete, R., Szörényi, B., Weng, P., Cheng, W. and Hüllermeier, E. (2013). Top-$k$ selection based on adaptive sampling of noisy preferences. In International Conference on Machine Learning.
  • Chen, Y. and Candès, E. J. (2016). The projected power method: An efficient algorithm for joint alignment from pairwise differences. Comm. Pure Appl. Math. To appear.
  • Chen, Y. and Candès, E. J. (2017). Solving random quadratic systems of equations is nearly as easy as solving linear systems. Comm. Pure Appl. Math. 70 822–883.
  • Chen, Y. and Suh, C. (2015). Spectral MLE: Top-$K$ rank aggregation from pairwise comparisons. In International Conference on Machine Learning 371–380.
  • Chen, X., Bennett, P. N., Collins-Thompson, K. and Horvitz, E. (2013). Pairwise ranking aggregation in a crowdsourced setting. In ACM International Conference on Web Search and Data Mining 193–202. ACM, New York.
  • Chen, X., Gopi, S., Mao, J. and Schneider, J. (2017). Competitive analysis of the top-$K$ ranking problem. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms 1245–1264. SIAM, Philadelphia, PA.
  • Chen, Y., Chi, Y., Fan, J. and Ma, C. (2018). Gradient descent with random initialization: Fast global convergence for nonconvex phase retrieval. Available at arXiv:1803.07726.
  • Chen, Y., Fan, J., Ma, C. and Wang, K. (2019). Supplement to “Spectral method and regularized MLE are both optimal for top-$K$ ranking.” DOI:10.1214/18-AOS1745SUPP.
  • Chung, F. R. K. (1997). Spectral Graph Theory. CBMS Regional Conference Series in Mathematics 92. Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the Amer. Math. Soc., Providence, RI.
  • Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7 1–46.
  • Dwork, C., Kumar, R., Naor, M. and Sivakumar, D. (2001). Rank aggregation methods for the Web. In International Conference on World Wide Web 613–622.
  • El Karoui, N. (2018). On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Related Fields 170 95–175.
  • Eldridge, J., Belkin, M. and Wang, Y. (2017). Unperturbed: Spectral analysis beyond Davis–Kahan. arXiv preprint. Available at arXiv:1706.06516.
  • Fan, J., Wang, W. and Zhong, Y. (2018). An $\ell _{\infty}$ eigenvector perturbation bound and its application. J. Mach. Learn. Res. 18 1–42.
  • Ford, L. R. Jr. (1957). Solution of a ranking problem from binary comparisons. Amer. Math. Monthly 64 28–33.
  • Hajek, B., Oh, S. and Xu, J. (2014). Minimax-optimal inference from partial rankings. In Neural Information Processing Systems 1475–1483.
  • Heckel, R., Shah, N. B., Ramchandran, K. and Wainwright, M. J. (2016). Active ranking from pairwise comparisons and when parametric assumptions don’t help. arXiv preprint. Available at arXiv:1606.08842.
  • Hunter, D. R. (2004). MM algorithms for generalized Bradley–Terry models. Ann. Statist. 32 384–406.
  • Jamieson, K. G. and Nowak, R. D. (2011). Active ranking using pairwise comparisons. In Neural Information Processing Systems 2240–2248.
  • Jang, M., Kim, S., Suh, C. and Oh, S. (2016). Top-$K$ ranking from pairwise comparisons: When spectral ranking is optimal. arXiv preprint. Available at arXiv:1603.04153.
  • Javanmard, A. and Montanari, A. (2018). Debiasing the lasso: Optimal sample size for Gaussian designs. Ann. Statist. 46 2593–2622.
  • Jiang, X., Lim, L.-H., Yao, Y. and Ye, Y. (2011). Statistical ranking and combinatorial Hodge theory. Math. Program. 127 203–244.
  • Keshavan, R. H., Montanari, A. and Oh, S. (2010). Matrix completion from noisy entries. J. Mach. Learn. Res. 11 2057–2078.
  • Koltchinskii, V. and Lounici, K. (2016). Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance. Ann. Inst. Henri Poincaré Probab. Stat. 52 1976–2013.
  • Koltchinskii, V. and Xia, D. (2016). Perturbation of linear forms of singular vectors under Gaussian noise. In High Dimensional Probability VII. Progress in Probability 71 397–423. Springer, Cham.
  • Lu, Y. and Negahban, S. N. (2014). Individualized rank aggregation using nuclear norm regularization. arXiv preprint. Available at arXiv:1410.0860.
  • Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Analysis. Wiley, New York; Chapman & Hall, London.
  • Ma, C., Wang, K., Chi, Y. and Chen, Y. (2017). Implicit regularization in nonconvex statistical estimation: Gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. arXiv preprint. Available at arXiv:1711.10467.
  • Massey, K. (1997). Statistical models applied to the rating of sports teams. Technical Report, Bluefield College, Bluefield, VA.
  • Negahban, S., Oh, S. and Shah, D. (2017). Rank centrality: Ranking from pairwise comparisons. Oper. Res. 65 266–287.
  • Negahban, S., Oh, S., Thekumparampil, K. K. and Xu, J. (2017). Learning from comparisons and choices. arXiv preprint. Available at arXiv:1704.07228.
  • Pananjady, A., Mao, C., Muthukumar, V., Wainwright, M. J. and Courtade, T. A. (2017). Worst-case vs average-case design for estimation from fixed pairwise comparisons. arXiv preprint. Available at arXiv:1707.06217.
  • Rajkumar, A. and Agarwal, S. (2014). A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In International Conference on Machine Learning I-118–I-126.
  • Rajkumar, A. and Agarwal, S. (2016). When can we rank well from comparisons of $O(n\log n)$ non-actively chosen pairs? In Conference on Learning Theory 1376–1401.
  • Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.
  • Shah, N. B. and Wainwright, M. J. (2015). Simple, robust and optimal ranking from pairwise comparisons. arXiv preprint. Available at arXiv:1512.08949.
  • Shah, N. B., Balakrishnan, S., Guntuboyina, A. and Wainwright, M. J. (2017). Stochastically transitive models for pairwise comparisons: Statistical and computational issues. IEEE Trans. Inform. Theory 63 934–959.
  • Soufiani, H. A., Chen, W. Z., Parkes, D. C. and Xia, L. (2013). Generalized method-of-moments for rank aggregation. In Proceedings of the 26th International Conference on Neural Information Processing Systems. NIPS’13 2706–2714.
  • Suh, C., Tan, V. Y. F. and Zhao, R. (2017). Adversarial top-$K$ ranking. IEEE Trans. Inform. Theory 63 2201–2225.
  • Sur, P., Chen, Y. and Candès, E. J. (2017). The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square. arXiv preprint. Available at arXiv:1706.01191.
  • Tropp, J. A. (2015). An introduction to matrix concentration inequalities. Found. Trends Mach. Learn. 8 1–230.
  • Zhong, Y. and Boumal, N. (2017). Near-optimal bounds for phase synchronization. Available at arXiv:1703.06605.

Supplemental materials

  • Additional Proofs. Additional proofs of the results in the paper can be found in the Supplementary Material.