Statistical Science

Elo Ratings and the Sports Model: A Neglected Topic in Applied Probability?

David Aldous



In a simple model for sports, the probability A beats B is a specified function of their difference in strength. One might think this would be a staple topic in Applied Probability textbooks (like the Galton–Watson branching process model, for instance) but it is curiously absent. Our first purpose is to point out that the model suggests a wide range of questions, suitable for “undergraduate research” via simulation but also challenging as professional research. Our second, more specific, purpose concerns Elo-type rating algorithms for tracking changing strengths. There has been little foundational research on their accuracy, despite a much-copied “30 matches suffice” claim, which our simulation study casts doubt upon.
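A minimal simulation sketch of the setup described above may help make it concrete. It is not taken from the article: the logistic form of the win-probability function, the update constant `k`, the uniformly random match schedule, and all function names are assumptions chosen for illustration. Strengths are also held fixed here, whereas the abstract is concerned with tracking changing strengths.

```python
import math
import random


def win_probability(x_a, x_b):
    """P(A beats B) as a function of the strength difference.

    The logistic function is an assumed choice; the model only
    requires some specified function of x_a - x_b.
    """
    return 1.0 / (1.0 + math.exp(-(x_a - x_b)))


def elo_update(r_a, r_b, a_won, k=0.1):
    """One Elo-type update: shift ratings by k * (outcome - expected).

    k is a hypothetical step size; the update is zero-sum by construction.
    """
    expected = win_probability(r_a, r_b)
    delta = k * ((1.0 if a_won else 0.0) - expected)
    return r_a + delta, r_b - delta


def simulate(strengths, n_matches, k=0.1, seed=0):
    """Play random pairings under fixed true strengths and track ratings."""
    rng = random.Random(seed)
    ratings = [0.0] * len(strengths)
    for _ in range(n_matches):
        # Pick two distinct players uniformly at random (assumed schedule).
        i, j = rng.sample(range(len(strengths)), 2)
        a_won = rng.random() < win_probability(strengths[i], strengths[j])
        ratings[i], ratings[j] = elo_update(ratings[i], ratings[j], a_won, k)
    return ratings
```

Calling `simulate([1.0, 0.0, -1.0], n_matches=2000)` and comparing the returned ratings with the true strengths for different values of `n_matches` is one simple way to explore how many matches such an algorithm needs.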

Article information

Statist. Sci., Volume 32, Number 4 (2017), 616-629.

First available in Project Euclid: 28 November 2017


Keywords: Elo rating; Bradley–Terry model; dynamic ratings; sports forecasting


Aldous, David. Elo Ratings and the Sports Model: A Neglected Topic in Applied Probability? Statist. Sci. 32 (2017), no. 4, 616–629. doi:10.1214/17-STS628.


