The Annals of Applied Statistics

Discussion of “Elicitability and backtesting: Perspectives for banking regulation”

Hajo Holzmann and Bernhard Klar



In our discussion of the insightful paper by Nolde and Ziegel, we further investigate comparative backtests based on consistent scoring rules. We use Diebold–Mariano tests in pairwise comparisons instead of mere rankings in terms of average scores, and illustrate the use of weighted proper scoring rules, which address the quality of forecasts of the full loss distribution in its upper tail rather than some specific risk measure such as the Value at Risk. Overall, at lower levels up to 95%, these weighted scoring rules allow for better discrimination between competing forecasting methods.
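To make the pairwise comparison concrete, the following is a minimal sketch of a Diebold–Mariano type test applied to two competing Value at Risk forecasts, scored with the quantile (pinball) score, which is a standard consistent scoring function for a quantile. It is not the authors' implementation: the function names, the simulated standard normal losses, the two constant forecasts, and the use of a plain sample variance without any autocorrelation correction are all assumptions made for illustration.

```python
import math
import random


def pinball_score(forecast, outcome, alpha):
    """Quantile (pinball) score; consistent for the alpha-quantile of the loss."""
    indicator = 1.0 if outcome <= forecast else 0.0
    return (indicator - alpha) * (forecast - outcome)


def diebold_mariano(scores_a, scores_b):
    """Return the DM statistic and a two-sided normal p-value.

    Assumption: score differences are serially uncorrelated, so the plain
    sample variance is used instead of a HAC (long-run) variance estimate.
    A negative statistic favours method A (lower average score).
    """
    d = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value: 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2))
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical illustration: i.i.d. standard normal losses.
random.seed(0)
alpha = 0.95
losses = [random.gauss(0.0, 1.0) for _ in range(2000)]

forecast_a = 1.6449  # approximate true 95% quantile of N(0, 1)
forecast_b = 1.0     # deliberately too low

scores_a = [pinball_score(forecast_a, y, alpha) for y in losses]
scores_b = [pinball_score(forecast_b, y, alpha) for y in losses]

stat, p_value = diebold_mariano(scores_a, scores_b)
print(f"DM statistic: {stat:.3f}, p-value: {p_value:.4f}")
```

In this sketch a significantly negative statistic indicates that the first forecast attains a lower average score than the second, which is a stronger statement than a ranking by average scores alone. For real, serially dependent score differences, the variance in the denominator would typically be replaced by a long-run variance estimate.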

Article information

Ann. Appl. Stat., Volume 11, Number 4 (2017), 1875-1882.

Received: May 2017
Revised: May 2017
First available in Project Euclid: 28 December 2017


Keywords

Backtesting; forecasting; risk management; scoring rule


Holzmann, Hajo; Klar, Bernhard. Discussion of “Elicitability and backtesting: Perspectives for banking regulation”. Ann. Appl. Stat. 11 (2017), no. 4, 1875–1882. doi:10.1214/17-AOAS1041A.



  • Davis, M. H. A. (2016). Verification of internal risk measure estimates. Stat. Risk Model. 33 67–93.
  • Diebold, F. X. and Mariano, R. S. (1995). Comparing predictive accuracy. J. Bus. Econom. Statist. 13 253–263.
  • Fissler, T. and Ziegel, J. F. (2016). Higher order elicitability and Osband’s principle. Ann. Statist. 44 1680–1707.
  • Fissler, T., Ziegel, J. F. and Gneiting, T. (2016). Expected shortfall is jointly elicitable with value at risk—implications for backtesting. Risk Mag. January 58–61.
  • Ghalanos, A. (2014). rugarch: Univariate GARCH models. R package version 1.3-5.
  • Gneiting, T. (2011). Making and evaluating point forecasts. J. Amer. Statist. Assoc. 106 746–762.
  • Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc. 102 359–378.
  • Gneiting, T. and Ranjan, R. (2011). Comparing density forecasts using threshold- and quantile-weighted scoring rules. J. Bus. Econom. Statist. 29 411–422.
  • Holzmann, H. and Klar, B. (2017). Focusing on regions of interest in forecast evaluation. Ann. Appl. Stat. 11 2404–2431.
  • McNeil, A. J. and Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. J. Empir. Finance 7 271–300.
  • McNeil, A. J., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton Series in Finance. Princeton Univ. Press, Princeton, NJ.
  • R Core Team (2016). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
  • Stephenson, A. G. (2002). evd: Extreme value distributions. R News 2 31–32.
  • Straumann, D. (2005). Estimation in Conditionally Heteroscedastic Time Series Models. Lecture Notes in Statistics 181. Springer, Berlin.

See also

  • Main article: Elicitability and backtesting: Perspectives for banking regulation.