Electronic Journal of Statistics

Conditions for posterior contraction in the sparse normal means problem

S.L. van der Pas, J.-B. Salomond, and J. Schmidt-Hieber

Full-text: Open access


The first Bayesian results for the sparse normal means problem were proven for spike-and-slab priors. However, these priors are less convenient from a computational point of view. In the meantime, a large number of continuous shrinkage priors have been proposed. Many of these shrinkage priors can be written as a scale mixture of normals, which makes them particularly easy to implement. We propose general conditions on the prior on the local variance in scale mixtures of normals, such that posterior contraction at the minimax rate is assured. The conditions require tails at least as heavy as Laplace, but not too heavy, and a large amount of mass around zero relative to the tails, more so as the sparsity increases. These conditions give some general guidelines for choosing a shrinkage prior for estimation under a nearly black sparsity assumption. We verify these conditions for the class of priors considered in [12], which includes the horseshoe and the normal-exponential gamma priors, and for the horseshoe+, the inverse-Gaussian prior, the normal-gamma prior, and the spike-and-slab Lasso, and thus extend the list of shrinkage priors that are known to lead to posterior contraction at the minimax estimation rate.
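
The scale-mixture representation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the global scale `tau`, and the unit noise variance in the normal means model are illustrative assumptions. Each mean is drawn as theta_i | sigma_i^2 ~ N(0, sigma_i^2), and the prior is determined entirely by the distribution placed on the local variance sigma_i^2. Two standard choices are shown: a half-Cauchy local scale, which gives the horseshoe, and an exponential local variance, which gives the Laplace prior.

```python
import numpy as np


def sample_scale_mixture(draw_local_variance, n, rng):
    """Draw n means from a scale mixture of normals.

    draw_local_variance(n, rng) must return n non-negative local variances.
    """
    sigma2 = draw_local_variance(n, rng)
    theta = rng.normal(0.0, np.sqrt(sigma2))  # theta_i | sigma_i^2 ~ N(0, sigma_i^2)
    return theta, sigma2


def horseshoe_variance(tau):
    """Local variance (tau * lambda_i)^2 with lambda_i half-Cauchy(0, 1).

    tau is a hypothetical global scale chosen by the user.
    """
    def draw(n, rng):
        lam = np.abs(rng.standard_cauchy(n))
        return (tau * lam) ** 2
    return draw


def laplace_variance(rate):
    """Exponential local variance with mean 2 / rate^2.

    Mixing N(0, sigma^2) over this distribution gives a Laplace prior
    with density (rate / 2) * exp(-rate * |theta|).
    """
    def draw(n, rng):
        return rng.exponential(scale=2.0 / rate**2, size=n)
    return draw


def shrinkage_weight(sigma2):
    """Weight on X_i in E[theta_i | X_i, sigma_i^2], assuming X_i ~ N(theta_i, 1)."""
    return sigma2 / (1.0 + sigma2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta, sigma2 = sample_scale_mixture(horseshoe_variance(tau=0.1), 10, rng)
    print(np.round(shrinkage_weight(sigma2), 3))
```

Conditionally on the local variance, the posterior mean of theta_i is the observation multiplied by a weight between 0 and 1, so a local-variance prior with much mass near zero shrinks most coordinates strongly, while heavy tails leave large signals nearly unshrunk.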

Article information

Electron. J. Statist., Volume 10, Number 1 (2016), 976-1000.

Received: October 2015
First available in Project Euclid: 12 April 2016

Primary: 62F15: Bayesian inference
Secondary: 62G20: Asymptotic properties

Keywords: sparsity; nearly black vectors; normal means problem; horseshoe; horseshoe+; Bayesian inference; frequentist Bayes; posterior contraction; shrinkage priors


van der Pas, S.L.; Salomond, J.-B.; Schmidt-Hieber, J. Conditions for posterior contraction in the sparse normal means problem. Electron. J. Statist. 10 (2016), no. 1, 976--1000. doi:10.1214/16-EJS1130. https://projecteuclid.org/euclid.ejs/1460463652


  • [1] Andrews, D. F., and Mallows, C. L. Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B Stat. Methodol. (1974), 99–102.
  • [2] Bhadra, A., Datta, J., Polson, N. G., and Willard, B. The horseshoe+ estimator of ultra-sparse signals. arXiv:1502.00560v2, 2015.
  • [3] Bhattacharya, A., Pati, D., Pillai, N. S., and Dunson, D. B. Dirichlet-Laplace priors for optimal shrinkage. arXiv:1401.5398, 2014.
  • [4] Caron, F., and Doucet, A. Sparse Bayesian nonparametric regression. In Proceedings of the 25th International Conference on Machine Learning (New York, NY, USA, 2008), ICML ’08, ACM, pp. 88–95.
  • [5] Carvalho, C. M., Polson, N. G., and Scott, J. G. The horseshoe estimator for sparse signals. Biometrika 97, 2 (2010), 465–480.
  • [6] Castillo, I., Schmidt-Hieber, J., and van der Vaart, A. Bayesian linear regression with sparse priors. Ann. Statist. 43, 5 (2015), 1986–2018.
  • [7] Castillo, I., and Van der Vaart, A. W. Needles and straw in a haystack: Posterior concentration for possibly sparse sequences. Ann. Statist. 40, 4 (2012), 2069–2101.
  • [8] Damien, P., Wakefield, J., and Walker, S. Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 61, 2 (1999), 331–344.
  • [9] Datta, J., and Ghosh, J. K. Asymptotic properties of Bayes risk for the horseshoe prior. Bayesian Analysis 8, 1 (2013), 111–132.
  • [10] Donoho, D. L., Johnstone, I. M., Hoch, J. C., and Stern, A. S. Maximum entropy and the nearly black object (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 54, 1 (1992), 41–81.
  • [11] Ghosal, S., Ghosh, J. K., and Van der Vaart, A. W. Convergence rates of posterior distributions. Ann. Statist. 28, 2 (2000), 500–531.
  • [12] Ghosh, P., and Chakrabarti, A. Posterior concentration properties of a general class of shrinkage estimators around nearly black vectors. arXiv:1412.8161v2, 2015.
  • [13] Griffin, J. E., and Brown, P. J. Alternative prior distributions for variable selection with very many more variables than observations. Technical Report, University of Warwick (2005).
  • [14] Griffin, J. E., and Brown, P. J. Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis 5, 1 (2010), 171–188.
  • [15] Hoffmann, M., Rousseau, J., and Schmidt-Hieber, J. On adaptive posterior concentration rates. Ann. Statist. 43, 5 (2015), 2259–2295.
  • [16] Johnson, V. E., and Rossell, D. On the use of non-local prior densities in Bayesian hypothesis tests. J. R. Stat. Soc. Ser. B Stat. Methodol. 72, 2 (2010), 143–170.
  • [17] Johnstone, I. M., and Silverman, B. W. Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. Ann. Statist. 32, 4 (2004), 1594–1649.
  • [18] Martin, R., and Walker, S. G. Asymptotically minimax empirical Bayes estimation of a sparse normal mean vector. Electron. J. Stat. 8, 2 (2014), 2188–2206.
  • [19] Park, T., and Casella, G. The Bayesian lasso. J. Amer. Statist. Assoc. 103, 482 (2008), 681–686.
  • [20] Polson, N. G., and Scott, J. G. Shrink globally, act locally: Sparse Bayesian regularization and prediction. Bayesian Statistics 9 (2010), 501–538.
  • [21] Polson, N. G., and Scott, J. G. Good, great or lucky? Screening for firms with sustained superior performance using heavy-tailed priors. Ann. Appl. Stat. 6, 1 (2012), 161–185.
  • [22] Polson, N. G., and Scott, J. G. On the half-Cauchy prior for a global scale parameter. Bayesian Analysis 7, 4 (2012), 887–902.
  • [23] Robbins, H. An empirical Bayes approach to statistics. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics (Berkeley, California, 1956), University of California Press, pp. 157–163.
  • [24] Ročková, V. Bayesian estimation of sparse signals with a continuous spike-and-slab prior. Submitted manuscript, available at http://stat.wharton.upenn.edu/~vrockova/rockova2015.pdf, 2015.
  • [25] Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 58, 1 (1996), 267–288.
  • [26] van der Pas, S., Kleijn, B., and van der Vaart, A. The horseshoe estimator: Posterior concentration around nearly black vectors. Electron. J. Stat. 8 (2014), 2585–2618.
  • [27] Yang, Y., Wainwright, M. J., and Jordan, M. I. On the computational complexity of high-dimensional Bayesian variable selection. arXiv:1505.07925, 2015.