Bernoulli

Volume 22, Number 2 (2016), 901-926.

Dynamic density estimation with diffusive Dirichlet mixtures

Ramsés H. Mena and Matteo Ruggiero

Abstract

We introduce a new class of nonparametric prior distributions on the space of continuously varying densities, induced by Dirichlet process mixtures which diffuse in time. These select time-indexed random functions without jumps, whose sections are continuous or discrete distributions depending on the choice of kernel. The construction exploits the widely used stick-breaking representation of the Dirichlet process and induces the time dependence by replacing the stick-breaking components with one-dimensional Wright–Fisher diffusions. These features combine appealing properties of the model, inherited from the Wright–Fisher diffusions and the Dirichlet mixture structure, with great flexibility and tractability for posterior computation. The construction can be easily extended to multi-parameter GEM marginal states, which include, for example, the Pitman–Yor process. A full inferential strategy is detailed and illustrated on simulated and real data.
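The construction described in the abstract can be illustrated with a minimal sketch. It assumes that each stick-breaking fraction evolves as an independent one-dimensional Wright–Fisher diffusion with Beta(1, theta) stationary law, that the mixture locations stay fixed over time, and that the kernel is Gaussian. The Euler–Maruyama discretisation with clipping is a stand-in chosen for brevity and is not the authors' simulation or inference scheme; the function names, the truncation level `n_sticks`, and all default values are hypothetical.

```python
import numpy as np


def stick_breaking_weights(v):
    """Map fractions v_1..v_K to weights w_i = v_i * prod_{j<i} (1 - v_j)."""
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def simulate_diffusive_weights(theta=1.0, n_sticks=20, n_steps=200, dt=0.01, seed=0):
    """Approximate paths of time-varying stick-breaking weights.

    Each fraction follows dV = 0.5 * [(1 - V) - theta * V] dt + sqrt(V (1 - V)) dW,
    a Wright-Fisher diffusion whose stationary law is Beta(1, theta).
    Assumption: Euler-Maruyama steps clipped to [0, 1], truncated to n_sticks.
    """
    rng = np.random.default_rng(seed)
    v = rng.beta(1.0, theta, size=n_sticks)  # start from the stationary law
    weights = np.empty((n_steps + 1, n_sticks))
    weights[0] = stick_breaking_weights(v)
    for t in range(1, n_steps + 1):
        drift = 0.5 * ((1.0 - v) - theta * v)
        noise = np.sqrt(v * (1.0 - v) * dt) * rng.standard_normal(n_sticks)
        v = np.clip(v + drift * dt + noise, 0.0, 1.0)  # keep fractions in [0, 1]
        weights[t] = stick_breaking_weights(v)
    return weights


def mixture_density(y, w, locs, scale=1.0):
    """Evaluate sum_i w_i * Normal(y | locs_i, scale^2) on a grid y."""
    z = (y[:, None] - locs[None, :]) / scale
    kernel = np.exp(-0.5 * z**2) / (scale * np.sqrt(2.0 * np.pi))
    return kernel @ w


if __name__ == "__main__":
    w = simulate_diffusive_weights()
    locs = np.random.default_rng(1).normal(0.0, 3.0, size=w.shape[1])
    grid = np.linspace(-10.0, 10.0, 201)
    # One density per time point; rows vary continuously in time.
    densities = np.stack([mixture_density(grid, w_t, locs) for w_t in w])
    print(densities.shape)
```

Because the sticks are truncated, each row of weights sums to slightly less than one; a larger `n_sticks` or a renormalisation step would reduce the missing mass.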

Article information

Source
Bernoulli, Volume 22, Number 2 (2016), 901-926.

Dates
Received: September 2013
Revised: May 2014
First available in Project Euclid: 9 November 2015

Permanent link to this document
https://projecteuclid.org/euclid.bj/1447077764

Digital Object Identifier
doi:10.3150/14-BEJ681

Mathematical Reviews number (MathSciNet)
MR3449803

Zentralblatt MATH identifier
06562300

Keywords
density estimation; Dirichlet process; hidden Markov model; nonparametric regression; Pitman–Yor process; Wright–Fisher diffusion

Citation

Mena, Ramsés H.; Ruggiero, Matteo. Dynamic density estimation with diffusive Dirichlet mixtures. Bernoulli 22 (2016), no. 2, 901–926. doi:10.3150/14-BEJ681. https://projecteuclid.org/euclid.bj/1447077764

