Electronic Journal of Statistics

Posterior concentration rates for mixtures of normals in random design regression

Zacharie Naulet and Judith Rousseau

Full-text: Open access


Previous works on location and location-scale mixtures of normals have established different upper bounds on posterior contraction rates, either in density estimation or in nonlinear regression. In both cases, the observations were assumed not to be too spread out, by requiring either that the true density has light tails or that the regression function has compact support. It has been conjectured that when the data are diffuse, location-scale mixtures may benefit from allowing a spatially varying order of approximation. Here we test this argument in the mean regression model with normal errors and random design. Although we cannot invalidate the conjecture, owing to the lack of a lower bound, we find slower upper bounds for location-scale mixtures, even under heavy-tail assumptions on the design distribution. However, the proofs suggest introducing hybrid location-scale mixtures, for which faster upper bounds are derived. Finally, we show that all tail assumptions on the design distribution can be relaxed at the price of making the prior distribution covariate dependent.

Article information

Electron. J. Statist., Volume 11, Number 2 (2017), 4065-4102.

Received: August 2016
First available in Project Euclid: 24 October 2017

Primary: 62G20: Asymptotic properties
Secondary: 62G08: Nonparametric regression

Keywords: Adaptive estimation; Bayesian nonparametric estimation; nonparametric regression; Hölder class; mixture prior; rate of contraction; heavy tails

Creative Commons Attribution 4.0 International License.


Naulet, Zacharie; Rousseau, Judith. Posterior concentration rates for mixtures of normals in random design regression. Electron. J. Statist. 11 (2017), no. 2, 4065-4102. doi:10.1214/17-EJS1344. https://projecteuclid.org/euclid.ejs/1508810899

