The Annals of Statistics

Effect of extrapolation on coverage accuracy of prediction intervals computed from Pareto-type data

Peter Hall, Liang Peng, and Nader Tajvidi

Abstract

A feature that distinguishes extreme-value contexts from more conventional statistical problems is that in the former we often wish to make predictions well beyond the range of the data. For example, one might have a 10-year sequence of observations of a phenomenon, and wish to make forecasts for the next 20 to 30 years. It is generally unclear how such long ranges of extrapolation affect prediction. In the present paper, and for extremes from a distribution with regularly varying tails at infinity, we address this problem. We approach it in two ways: first, from the viewpoint of predictive inference under a model that is admittedly only approximate, and where the errors of greatest concern are caused by the interaction of long-range extrapolation with model misspecification; second, where the model is accurate but errors arise from a combination of extrapolation and the fact that the method is only approximate. In both settings we show that, in a way which can be defined theoretically and confirmed numerically, one can make predictions exponentially far into the future without committing serious errors.
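The question the abstract raises, how coverage changes as the prediction horizon grows, can be explored numerically in a simplified setting. The sketch below is not the authors' method. It assumes an exact Pareto model with unit scale, F(x) = 1 - x^(-alpha) for x >= 1, a maximum-likelihood estimate of alpha from n observations, and a naive plug-in upper prediction bound for the maximum of m future observations. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def plugin_upper_bound(log_sample, m, p):
    """Plug-in level-p upper bound for the maximum of m future values.

    log_sample holds log(X_i) for a unit-scale Pareto sample (assumed model).
    Solving (1 - x**(-alpha_hat))**m = p gives x = (1 - p**(1/m))**(-1/alpha_hat).
    """
    alpha_hat = len(log_sample) / sum(log_sample)  # ML estimate of the tail index
    tail = -math.expm1(math.log(p) / m)            # 1 - p**(1/m), computed stably
    return tail ** (-1.0 / alpha_hat)


def true_coverage(bound, alpha, m):
    """P(max of m future values <= bound) under the true tail index alpha."""
    return math.exp(m * math.log1p(-bound ** (-alpha)))


def mean_coverage(alpha, n, m, p, reps=2000, seed=0):
    """Average true coverage of the plug-in bound over simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # log of a unit-scale Pareto(alpha) variable is exponential with rate alpha
        logs = [rng.expovariate(alpha) for _ in range(n)]
        total += true_coverage(plugin_upper_bound(logs, m, p), alpha, m)
    return total / reps


if __name__ == "__main__":
    # Compare nominal 90% coverage across increasingly long horizons m.
    for m in (10, 100, 10_000):
        print(m, round(mean_coverage(alpha=2.0, n=50, m=m, p=0.9), 3))
```

Because the conditional coverage is available in closed form here, only the past sample needs to be simulated. Varying m relative to n shows how far the horizon can be pushed before the plug-in bound's coverage drifts from its nominal level in this idealised case.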

Article information

Source
Ann. Statist., Volume 30, Number 3 (2002), 875-895.

Dates
First available in Project Euclid: 6 August 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1028674844

Digital Object Identifier
doi:10.1214/aos/1028674844

Mathematical Reviews number (MathSciNet)
MR1922544

Zentralblatt MATH identifier
1029.62079

Subjects
Primary: 62G30: Order statistics; empirical distribution functions
Secondary: 62G20: Asymptotic properties

Keywords
Bootstrap; calibration; coverage accuracy; domain of attraction; exceedence; extreme value; generalized Pareto distribution; peaks over threshold

Citation

Hall, Peter; Peng, Liang; Tajvidi, Nader. Effect of extrapolation on coverage accuracy of prediction intervals computed from Pareto-type data. Ann. Statist. 30 (2002), no. 3, 875-895. doi:10.1214/aos/1028674844. https://projecteuclid.org/euclid.aos/1028674844

