Bayesian Analysis

Expert Information and Nonparametric Bayesian Inference of Rare Events

Hwan-sik Choi

Full-text: Open access


Prior distributions are important in Bayesian inference on rare events because historical data are scarce, and experts are therefore an important source of information for eliciting a prior distribution. I propose a method to incorporate expert information into nonparametric Bayesian inference on rare events when expert knowledge is elicited only as moment conditions on a finite-dimensional parameter θ. I generalize the Dirichlet process mixture model by merging expert information into the Dirichlet process (DP) prior so that the expert's moment conditions are satisfied. Among all priors that comply with the expert's knowledge, I use the one closest to the original DP prior under the Kullback–Leibler information criterion. The resulting prior distribution is obtained by exponentially tilting the DP prior along θ. I provide a Metropolis–Hastings algorithm for sampling from posterior distributions under exponentially tilted DP priors. The proposed method combines prior information from a statistician and an expert by finding the least-informative prior consistent with the expert's information.
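The exponential-tilting step described above can be illustrated numerically: among all distributions satisfying a moment condition E[g(θ)] = 0, the Kullback–Leibler projection of a base distribution reweights it by exp(λ·g(θ)) for an appropriate tilting parameter λ. The sketch below is illustrative only and is not the paper's algorithm: the Gamma base distribution (standing in for draws of θ under the DP mixture prior), the expert's target mean m0, and the bisection solver are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo draws of theta from an assumed base prior (Gamma with mean 1.0);
# in the paper these would come from the DP mixture prior.
theta = rng.gamma(shape=2.0, scale=0.5, size=20_000)

m0 = 1.2       # illustrative expert moment condition: E[theta] = m0
g = theta - m0 # moment function, so the condition reads E[g(theta)] = 0


def tilted_moment(lam):
    """E[g] under the tilted measure with weights proportional to exp(lam * g)."""
    w = np.exp(lam * g - np.max(lam * g))  # subtract max for numerical stability
    w /= w.sum()
    return w @ g


# tilted_moment is increasing in lam (its derivative is the tilted variance of g),
# so bisection finds the tilting parameter that enforces the expert's condition.
lo, hi = -20.0, 20.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if tilted_moment(mid) < 0.0 else (lo, mid)
lam = 0.5 * (lo + hi)

weights = np.exp(lam * g - np.max(lam * g))
weights /= weights.sum()
print(f"tilting parameter lam = {lam:.3f}")
print(f"tilted mean of theta  = {weights @ theta:.3f}")  # matches m0 = 1.2
```

Because the base-prior mean (1.0) falls short of the expert's target (1.2), the solved λ is positive, up-weighting larger draws of θ. The paper applies this tilting idea to the DP prior itself along θ and then samples the resulting posterior with a Metropolis–Hastings algorithm, neither of which this toy reweighting sketch attempts.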

Article information

Bayesian Anal., Volume 11, Number 2 (2016), 421–445.

First available in Project Euclid: 26 May 2015


Keywords: Dirichlet process mixture; defaults; Kullback–Leibler information criterion; maximum entropy; Metropolis–Hastings algorithm


Choi, Hwan-sik. Expert Information and Nonparametric Bayesian Inference of Rare Events. Bayesian Anal. 11 (2016), no. 2, 421–445. doi:10.1214/15-BA956.


