Electronic Journal of Statistics

Upper bounds and aggregation in bipartite ranking

Sylvain Robbiano

Abstract

One main focus of learning theory is to find optimal rates of convergence. In classification, it is possible to obtain optimal fast rates (faster than $n^{-1/2}$) in a minimax sense, and aggregation procedures yield algorithms that are adaptive to the parameters of the class of distributions. Here, we investigate this issue in the bipartite ranking framework. We design a ranking rule by aggregating estimators of the regression function, using exponential weights based on the empirical ranking risk. Under several assumptions on the class of distributions, we show that this procedure is adaptive to the margin and smoothness parameters and achieves the same rates as in the classification framework. Moreover, we state a minimax lower bound that establishes the optimality of the aggregation procedure in a specific case.
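
To make the aggregation step concrete, the following Python sketch (not the paper's implementation) illustrates exponential-weight aggregation of a finite dictionary of candidate scoring functions (in the paper, estimators of the regression function), with weights driven by the empirical ranking risk, i.e. the fraction of mis-ordered positive-negative pairs (one minus the empirical AUC). The dictionary, the temperature parameter beta and the synthetic data are illustrative assumptions, not quantities from the paper.

    import numpy as np

    def empirical_ranking_risk(scores, labels):
        # Empirical ranking risk: fraction of positive-negative pairs that are
        # mis-ordered (ties count one half); equals 1 minus the empirical AUC.
        pos = scores[labels == 1]
        neg = scores[labels == 0]
        diff = pos[:, None] - neg[None, :]           # all pairwise score differences
        return np.mean((diff < 0) + 0.5 * (diff == 0))

    def exponential_weights(candidates, labels, beta=2.0):
        # Weight each candidate proportionally to exp(-n * risk / beta);
        # beta is an illustrative temperature parameter.
        n = len(labels)
        risks = np.array([empirical_ranking_risk(s, labels) for s in candidates])
        w = np.exp(-n * (risks - risks.min()) / beta)  # shift by the minimum for stability
        return w / w.sum()

    # Illustrative usage on synthetic data with a small dictionary of score functions.
    rng = np.random.default_rng(0)
    X = rng.uniform(size=200)
    eta = 0.2 + 0.6 * X                              # regression function eta(x) = P(Y=1 | X=x)
    y = (rng.uniform(size=200) < eta).astype(int)
    candidates = [X, X**2, -X, rng.uniform(size=200)]
    weights = exponential_weights(candidates, y)
    aggregated_score = sum(w * s for w, s in zip(weights, candidates))

As the sample size grows, the weights concentrate on the candidates with the smallest empirical ranking risk, which is the mechanism behind the adaptivity discussed above.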

Article information

Source
Electron. J. Statist., Volume 7 (2013), 1249-1271.

Dates
First available in Project Euclid: 29 April 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1367242158

Digital Object Identifier
doi:10.1214/13-EJS805

Mathematical Reviews number (MathSciNet)
MR3056074

Zentralblatt MATH identifier
1336.62068

Subjects
Primary: 62F07 (Ranking and selection); 62C20 (Minimax procedures)
Secondary: 62G08 (Nonparametric regression)

Keywords
Ranking; aggregation; minimax rates

Citation

Robbiano, Sylvain. Upper bounds and aggregation in bipartite ranking. Electron. J. Statist. 7 (2013), 1249-1271. doi:10.1214/13-EJS805. https://projecteuclid.org/euclid.ejs/1367242158


