Bayesian Analysis

Bayesian nonparametrics for heavy tailed distribution. Application to food risk assessment

Jessica Tressou

Full-text: Open access


Based on the fact that any heavy-tailed distribution can be approximated by a possibly infinite mixture of Pareto distributions, this paper proposes two Bayesian methodologies tailored to inference on distribution tails belonging to the Fréchet domain of attraction. Firstly, a Bayesian Pareto-based clustering procedure is developed, where the mixing distribution is chosen to be the classical conjugate prior of the Pareto distribution. This allows the grouping of $n$ objects into a certain number of clusters according to their extremal behavior and also yields a new estimator for the tail index. Secondly, a nonparametric extension of the model-based clustering is proposed in which the parameter of interest is the mixing distribution. Estimation of the tail probability is conducted using a Dirichlet process prior for the unknown mixing distribution. To illustrate, both methodologies are applied to simulated data sets and a real data set concerning dietary exposure to a mycotoxin called Ochratoxin A.
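As a rough illustration of the conjugate building block mentioned in the abstract, the sketch below updates a Gamma prior on the tail index of a single Pareto component. It assumes a known scale (threshold) and a Gamma(shape, rate) prior, which is one standard conjugate choice and not necessarily the exact specification used in the paper. The function names `pareto_gamma_posterior` and `simulate_pareto` are hypothetical.

```python
import math
import random


def pareto_gamma_posterior(xs, scale, prior_shape, prior_rate):
    """Gamma posterior (shape, rate) for the tail index of a Pareto sample.

    Assumes density alpha * scale**alpha / x**(alpha + 1) for x >= scale,
    with `scale` known and a Gamma(prior_shape, prior_rate) prior on alpha.
    """
    if any(x < scale for x in xs):
        raise ValueError("all observations must be >= scale")
    # The sufficient statistic is the sum of log-excesses over the scale.
    log_excess = sum(math.log(x / scale) for x in xs)
    return prior_shape + len(xs), prior_rate + log_excess


def simulate_pareto(n, alpha, scale, rng):
    """Draw n Pareto(alpha, scale) variates by inverting the CDF."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [scale * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


# Simulated check: the posterior mean should sit close to the true index.
rng = random.Random(0)
data = simulate_pareto(5000, alpha=2.5, scale=1.0, rng=rng)
shape, rate = pareto_gamma_posterior(data, scale=1.0,
                                     prior_shape=1.0, prior_rate=1.0)
posterior_mean = shape / rate
print(f"posterior mean of tail index: {posterior_mean:.3f}")
```

With a vague prior, the posterior mean is close to the reciprocal of the average log-excess, the same statistic that underlies the Hill estimator. A mixture or Dirichlet process version would apply this update within each cluster.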

Article information

Bayesian Anal., Volume 3, Number 2 (2008), 367-391.

First available in Project Euclid: 22 June 2012


Keywords: Dirichlet process; model-based clustering; Ochratoxin A; tail index estimation


Tressou, Jessica. Bayesian nonparametrics for heavy tailed distribution. Application to food risk assessment. Bayesian Anal. 3 (2008), no. 2, 367--391. doi:10.1214/08-BA314.


