The Annals of Applied Statistics

Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate

Brian Bader, Jun Yan, and Xuebin Zhang

Threshold selection is a critical issue for extreme value analysis with threshold-based approaches. Under suitable conditions, exceedances over a high threshold have been shown to follow the generalized Pareto distribution (GPD) asymptotically. In practice, however, the threshold must be chosen. If the chosen threshold is too low, the GPD approximation may not hold and bias can occur. If the chosen threshold is too high, the reduced sample size increases the variance of the parameter estimates. Commonly used selection methods such as graphical diagnostics are subjective and cannot be automated, which makes them unsuitable for batch analyses. We develop an efficient technique to evaluate and apply the Anderson–Darling test to the sample of exceedances above a fixed threshold. To automate threshold selection, this test is used in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing. Previous attempts in this setting do not account for the issue of ordered multiple testing. The performance of the method is assessed in a large-scale simulation study that mimics practical return level estimation. The procedure was then applied at hundreds of sites in the western US to generate return level maps of extreme precipitation.
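The ordered-testing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the stopping rule is the ForwardStop rule for ordered hypotheses, and that a goodness-of-fit p-value (for example, from the Anderson–Darling test of the GPD fit) has already been computed at each candidate threshold, with thresholds sorted from lowest to highest. The function names `forward_stop` and `select_threshold`, and the example thresholds and p-values, are hypothetical.

```python
import numpy as np


def forward_stop(p_values, alpha=0.05):
    """Number of leading ordered hypotheses rejected by ForwardStop.

    Finds the largest k such that the mean of -log(1 - p_i) over the
    first k tests is at most alpha; returns 0 if no such k exists.
    """
    p = np.asarray(p_values, dtype=float)
    # Clip so that a p-value of exactly 1 does not produce log(0).
    p = np.clip(p, 0.0, 1.0 - 1e-12)
    # Running mean of -log(1 - p_i) for k = 1, ..., len(p).
    running = np.cumsum(-np.log1p(-p)) / np.arange(1, p.size + 1)
    passing = np.nonzero(running <= alpha)[0]
    return 0 if passing.size == 0 else int(passing[-1]) + 1


def select_threshold(thresholds, p_values, alpha=0.05):
    """Pick the lowest threshold whose GPD fit is not rejected.

    thresholds must be in increasing order and aligned with p_values.
    Returns None when every candidate threshold is rejected.
    """
    k_hat = forward_stop(p_values, alpha)
    if k_hat == len(thresholds):
        return None
    return thresholds[k_hat]


# Hypothetical candidate thresholds and goodness-of-fit p-values.
thresholds = [10, 20, 30, 40, 50]
p_values = [0.001, 0.01, 0.02, 0.4, 0.7]
chosen = select_threshold(thresholds, p_values, alpha=0.05)
```

In this sketch the first three thresholds are rejected, so the fourth candidate is selected. Because the rule takes the largest qualifying index rather than stopping at the first non-rejection, a single moderate p-value at a low threshold does not necessarily end the search.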

Article information

Ann. Appl. Stat., Volume 12, Number 1 (2018), 310-329.

Received: April 2016
Revised: August 2017
First available in Project Euclid: 9 March 2018

Keywords: Batch analysis; exceedance diagnostic; specification test; stopping rule


Bader, Brian; Yan, Jun; Zhang, Xuebin. Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate. Ann. Appl. Stat. 12 (2018), no. 1, 310–329. doi:10.1214/17-AOAS1092.

