Bayesian Analysis

Hierarchical Species Sampling Models

Federico Bassetti, Roberto Casarin, and Luca Rossini

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access


This paper introduces a general class of hierarchical nonparametric prior distributions that includes new hierarchical mixture priors, such as the hierarchical Gnedin measures, as well as well-known priors such as the hierarchical Pitman-Yor process and the hierarchical normalized random measures. The random probability measures are constructed from a hierarchy of generalized species sampling processes with possibly non-diffuse base measures. The proposed framework provides a probabilistic foundation for hierarchical random measures and allows their properties to be studied under the alternative assumptions of diffuse, atomic, and mixed base measures. We show that hierarchical species sampling models admit a Chinese Restaurant Franchise representation and can be used as prior distributions for Bayesian nonparametric inference. We provide a general sampling method for posterior approximation that easily accommodates non-diffuse base measures such as spike-and-slab priors.
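The Chinese Restaurant Franchise representation mentioned in the abstract can be illustrated with a small prior simulation. The sketch below is an illustrative assumption on our part, not the paper's algorithm: it simulates draws from a two-level hierarchical Pitman-Yor process through the franchise metaphor (customers sit at group-level tables; each table orders a dish from a shared base restaurant). The base measure is passed in as a sampler function, so a non-diffuse spike-and-slab base can be plugged in. All function and parameter names are hypothetical.

```python
import random

def py_seat(counts, d, theta, rng):
    """Pick a table under Pitman-Yor predictive weights.

    counts[k] = customers at table k.  Existing table k is chosen with
    probability (counts[k] - d) / (n + theta); a new table (returned as
    index len(counts)) with probability (theta + d * K) / (n + theta).
    """
    n = sum(counts)
    weights = [c - d for c in counts] + [theta + d * len(counts)]
    u = rng.random() * (n + theta)
    acc = 0.0
    for k, w in enumerate(weights):
        acc += w
        if u < acc:
            return k
    return len(weights) - 1

def crf_sample(group_sizes, d0, theta0, d1, theta1, base, rng):
    """Simulate a two-level Chinese Restaurant Franchise.

    Each group j has its own restaurant with PY(d1, theta1) seating;
    every newly opened local table sends one 'customer' to a shared base
    restaurant with PY(d0, theta0) seating, whose tables serve dishes
    drawn from `base`.  Returns one list of dish labels per group.
    """
    base_counts = []   # customers per base-restaurant table
    base_dishes = []   # dish label served at each base table
    out = []
    for n_j in group_sizes:
        table_counts = []  # customers per local table
        table_dish = []    # base-table index served at each local table
        obs = []
        for _ in range(n_j):
            t = py_seat(table_counts, d1, theta1, rng)
            if t == len(table_counts):        # new local table
                b = py_seat(base_counts, d0, theta0, rng)
                if b == len(base_counts):     # new base table: draw a dish
                    base_counts.append(0)
                    base_dishes.append(base(rng))
                base_counts[b] += 1
                table_counts.append(0)
                table_dish.append(b)
            table_counts[t] += 1
            obs.append(base_dishes[table_dish[t]])
        out.append(obs)
    return out
```

A non-diffuse base is obtained, for instance, with `base = lambda r: 0.0 if r.random() < 0.5 else r.gauss(0.0, 1.0)` (a spike at zero mixed with a Gaussian slab). With such an atomic component, distinct base tables can serve identical dish values, which is exactly the complication that the paper's generalized species sampling construction is designed to handle rigorously.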

Article information

Bayesian Anal., Advance publication (2018), 30 pages.

First available in Project Euclid: 2 October 2019


Primary: 62G05 (Estimation); 62F15 (Bayesian inference); 60G57 (Random measures); 60G09 (Exchangeability)

Keywords: Bayesian nonparametrics; generalized species sampling; Gibbs sampling; hierarchical random measures; spike-and-slab

Creative Commons Attribution 4.0 International License.


Bassetti, Federico; Casarin, Roberto; Rossini, Luca. Hierarchical Species Sampling Models. Bayesian Anal., advance publication, 2 October 2019. doi:10.1214/19-BA1168.



  • Argiento, R., Cremaschi, A., and Vannucci, M. (2019). “Hierarchical Normalized Completely Random Measures to Cluster Grouped Data.” Journal of the American Statistical Association, 1–43.
  • Arratia, R., Barbour, A. D., and Tavaré, S. (2003). Logarithmic combinatorial structures: a probabilistic approach. European Mathematical Society.
  • Bacallado, S., Battiston, M., Favaro, S., and Trippa, L. (2017). “Sufficientness Postulates for Gibbs-Type Priors and Hierarchical Generalizations.” Statistical Science, 32(4): 487–500.
  • Bassetti, F., Casarin, R., and Leisen, F. (2014). “Beta-product dependent Pitman-Yor processes for Bayesian inference.” Journal of Econometrics, 180(1): 49–72.
  • Bassetti, F., Casarin, R., and Rossini, L. (2019a). “Supplementary Material A to Hierarchical Species Sampling Models.” Bayesian Analysis.
  • Bassetti, F., Casarin, R., and Rossini, L. (2019b). “Supplementary Material B to Hierarchical Species Sampling Models.” Bayesian Analysis.
  • Billio, M., Casarin, R., and Rossini, L. (2019). “Bayesian nonparametric sparse VAR models.” Journal of Econometrics, 212: 97–115.
  • Camerlenghi, F., Lijoi, A., Orbanz, P., and Prünster, I. (2019). “Distribution theory for hierarchical processes.” Annals of Statistics, 47(1): 67–92.
  • Camerlenghi, F., Lijoi, A., and Prünster, I. (2017). “Bayesian prediction with multiple-samples information.” Journal of Multivariate Analysis, 156: 18–28.
  • Camerlenghi, F., Lijoi, A., and Prünster, I. (2018). “Bayesian nonparametric inference beyond the Gibbs-type framework.” Scandinavian Journal of Statistics, 45(4): 1062–1091.
  • Canale, A., Lijoi, A., Nipoti, B., and Prünster, I. (2017). “On the Pitman–Yor process with spike and slab base measure.” Biometrika, 104(3): 681–697.
  • Castillo, I., Schmidt-Hieber, J., and van der Vaart, A. (2015). “Bayesian linear regression with sparse priors.” Annals of Statistics, 43(5): 1986–2018.
  • Dahl, D. B. (2006). “Model-based clustering for expression data via a Dirichlet process mixture model.” In Do, K.-A., Müller, P. P., and Vannucci, M. (eds.), Bayesian Inference for Gene Expression and Proteomics, 201–218. Cambridge University Press.
  • De Blasi, P., Favaro, S., Lijoi, A., Mena, R. H., Prünster, I., and Ruggiero, M. (2015). “Are Gibbs-Type Priors the Most Natural Generalization of the Dirichlet Process?” IEEE Transactions on Pattern Analysis & Machine Intelligence, 37(2): 212–229.
  • Diaconis, P. and Ram, A. (2012). “A probabilistic interpretation of the Macdonald polynomials.” Annals of Probability, 40(5): 1861–1896.
  • Donnelly, P. (1986). “Partition structures, Pólya urns, the Ewens sampling formula, and the ages of alleles.” Theoretical Population Biology, 30(2): 271–288.
  • Donnelly, P. and Grimmett, G. (1993). “On the asymptotic distribution of large prime factors.” Journal of the London Mathematical Society (2), 47(3): 395–404.
  • Du, L., Buntine, W., and Jin, H. (2010). “A segmented topic model based on the two-parameter Poisson-Dirichlet process.” Machine Learning, 81(1): 5–19.
  • Dubey, A., Williamson, S., and Xing, E. (2014). “Parallel Markov chain Monte Carlo for Pitman-Yor mixture models.” In Uncertainty in Artificial Intelligence – Proceedings of the 30th Conference, UAI 2014, 142–151.
  • Escobar, M. (1994). “Estimating normal means with a Dirichlet process prior.” Journal of the American Statistical Association, 89(425): 268–277.
  • Escobar, M. and West, M. (1995). “Bayesian density estimation and inference using mixtures.” Journal of the American Statistical Association, 90(430): 577–588.
  • Ewens, W. J. (1972). “The sampling theory of selectively neutral alleles.” Theoretical Population Biology, 3: 87–112; erratum, ibid. 3 (1972), 240; erratum, ibid. 3 (1972), 376.
  • Favaro, S. and Teh, Y. W. (2013). “MCMC for Normalized Random Measure Mixture Models.” Statistical Science, 28(3): 335–359.
  • George, E. I. and McCulloch, R. E. (1993). “Variable Selection via Gibbs Sampling.” Journal of the American Statistical Association, 88(423): 881–889.
  • Gnedin, A. (2010). “A species sampling model with finitely many types.” Electronic Communications in Probability, 15(8): 79–88.
  • Gnedin, A. and Pitman, J. (2006). “Exchangeable Gibbs partitions and Stirling triangles.” Journal of Mathematical Sciences, 138(3): 5674–5685.
  • Griffin, J. E. and Steel, M. F. J. (2011). “Stick-breaking autoregressive processes.” Journal of Econometrics, 162(2): 383–396.
  • Hirano, K. (2002). “Semiparametric Bayesian Inference in autoregressive panel data models.” Econometrica, 70(2): 781–799.
  • Hjort, N. L., Holmes, C., Müller, P., and Walker, S. G. (2010). Bayesian Nonparametrics. Cambridge University Press.
  • Hoppe, F. M. (1984). “Pólya-like urns and the Ewens’ sampling formula.” Journal of Mathematical Biology, 20(1): 91–94.
  • Kallenberg, O. (2006). Probabilistic Symmetries and Invariance Principles. Springer-Verlag New York.
  • Kalli, M. and Griffin, J. E. (2018). “Bayesian nonparametric vector autoregressive models.” Journal of Econometrics, 203(2): 267–282.
  • Kalli, M., Griffin, J. E., and Walker, S. (2011). “Slice sampling mixture models.” Statistics and Computing, 21(1): 93–105.
  • Kim, S., Dahl, D. B., and Vannucci, M. (2009). “Spiked Dirichlet process prior for Bayesian multiple hypothesis testing in random effects models.” Bayesian Analysis, 4(4): 707–732.
  • Kingman, J. F. C. (1980). Mathematics of genetic diversity, volume 34 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa.
  • Lau, J. W. and Green, P. J. (2007). “Bayesian Model-Based Clustering Procedures.” Journal of Computational and Graphical Statistics, 16(3): 526–558.
  • Lim, K. W., Buntine, W., Chen, C., and Du, L. (2016). “Nonparametric Bayesian topic modelling with the hierarchical Pitman-Yor processes.” International Journal of Approximate Reasoning, 78(C): 172–191.
  • Miller, J. and Harrison, M. (2018). “Mixture models with a Prior on the number of components.” Journal of the American Statistical Association, 113(521): 340–356.
  • Müller, P. and Quintana, F. (2010). “Random partition models with regression on covariates.” Journal of Statistical Planning and Inference, 140(10): 2801–2808.
  • Navarro, D. J., Griffiths, T. L., Steyvers, M., and Lee, M. D. (2006). “Modeling individual differences using Dirichlet processes.” Journal of Mathematical Psychology, 50(2): 101–122.
  • Neal, R. (2000). “Markov Chain sampling methods for Dirichlet process mixture models.” Journal of Computational and Graphical Statistics, 9(2): 249–265.
  • Nguyen, X. (2016). “Borrowing strength in hierarchical Bayes: Posterior concentration of the Dirichlet base measure.” Bernoulli, 22(3): 1535–1571.
  • Papaspiliopoulos, O. and Roberts, G. O. (2008). “Retrospective Markov Chain Monte Carlo Methods for Dirichlet Process Hierarchical Models.” Biometrika, 95(1): 169–186.
  • Pitman, J. (1995). “Exchangeable and partially exchangeable random partitions.” Probability Theory and Related Fields, 102(2): 145–158.
  • Pitman, J. (1996). “Some developments of the Blackwell-MacQueen urn scheme.” In Statistics, probability and game theory, volume 30 of IMS Lecture Notes—Monograph Series, 245–267. Institute of Mathematical Statistics, Hayward, CA.
  • Pitman, J. (2003). “Poisson-Kingman partitions.” In Statistics and science: a Festschrift for Terry Speed, volume 40 of IMS Lecture Notes—Monograph Series, 1–34. Institute of Mathematical Statistics, Beachwood, OH.
  • Pitman, J. (2006). Combinatorial Stochastic Processes, volume 1875 of Lecture Notes in Mathematics. Springer-Verlag, Berlin.
  • Pitman, J. and Yor, M. (1997). “The two-parameter Poisson-Dirichlet distribution derived from a stable subordinator.” The Annals of Probability, 25(2): 855–900.
  • Rockova, V. and George, E. I. (2018). “The Spike-and-Slab LASSO.” Journal of the American Statistical Association, 113(521): 431–444.
  • Sangalli, L. M. (2006). “Some developments of the normalized random measures with independent increments.” Sankhyā, 68(3): 461–487.
  • Sohn, K.-A. and Xing, E. P. (2009). “A hierarchical Dirichlet process mixture model for haplotype reconstruction from multi-population data.” The Annals of Applied Statistics, 3(2): 791–821.
  • Stock, J. H. and Watson, M. W. (2002). “Forecasting Using Principal Components from a Large Number of Predictors.” Journal of the American Statistical Association, 97(460): 1167–1179.
  • Teh, Y. and Jordan, M. I. (2010). “Hierarchical Bayesian nonparametric models with applications.” In Hjort, N. L., Holmes, C., Müller, P., and Walker, S. (eds.), Bayesian Nonparametrics. Cambridge University Press.
  • Teh, Y. W. (2006). “A Hierarchical Bayesian Language Model Based on Pitman-Yor Processes.” In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, ACL-44, 985–992. Stroudsburg, PA, USA: Association for Computational Linguistics.
  • Teh, Y. W., Jordan, M. I., Beal, M. J., and Blei, D. M. (2006). “Hierarchical Dirichlet processes.” Journal of the American Statistical Association, 101(476): 1566–1581.
  • Walker, S. G. (2007). “Sampling the Dirichlet Mixture Model with Slices.” Communications in Statistics – Simulation and Computation, 36(1): 45–54.
  • Wood, F., Archambeau, C., Gasthaus, J., James, L. F., and Teh, Y. W. (2009). “A Stochastic Memoizer for Sequence Data.” In International Conference on Machine Learning (ICML), volume 26, 1129–1136.

Supplemental materials

  • Supplementary material A to Hierarchical Species Sampling Models. This document contains the derivations of the results in the paper and a detailed analysis of generalized species sampling processes (with a general base measure). It also describes the Chinese Restaurant Franchise sampler for hierarchical species sampling mixtures.
  • Supplementary material B to Hierarchical Species Sampling Models. This document provides further numerical illustrations and robustness checks.