Bayesian Analysis

Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity

Prasenjit Ghosh, Xueying Tang, Malay Ghosh, and Arijit Chakrabarti

Full-text: Open access


Consider the problem of simultaneous testing for the means of independent normal observations. In this paper, we study some asymptotic optimality properties of certain multiple testing rules induced by a general class of one-group shrinkage priors in a Bayesian decision theoretic framework, where the overall loss is taken as the number of misclassified hypotheses. We assume a two-groups normal mixture model for the data and consider the asymptotic framework adopted in Bogdan et al. (2011), who introduced the notion of asymptotic Bayes optimality under sparsity in the context of multiple testing. The general class of one-group priors under study is rich enough to include, among others, the families of three-parameter beta and generalized double Pareto priors, and in particular the horseshoe, the normal–exponential–gamma and the Strawderman–Berger priors. We establish that, within our chosen asymptotic framework, the multiple testing rules under study asymptotically attain the risk of the Bayes Oracle up to a multiplicative factor, with the constant in the risk close to the constant in the Oracle risk. This is similar to a result obtained in Datta and Ghosh (2013) for the multiple testing rule based on the horseshoe estimator introduced in Carvalho et al. (2009, 2010). We further show that, under a very mild assumption on the underlying sparsity parameter, the induced decisions using an empirical Bayes estimate of the corresponding global shrinkage parameter proposed by van der Pas et al. (2014) asymptotically attain the optimal Bayes risk up to the same multiplicative factor. We provide a unifying argument applicable to the general class of priors under study. In the process, we settle a conjecture regarding an optimality property of the generalized double Pareto priors made in Datta and Ghosh (2013). Our work also shows that the result in Datta and Ghosh (2013) can be improved further.
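The setup described in the abstract can be illustrated numerically: draw data from a two-groups normal mixture, apply the Bayes Oracle rule (which uses the true sparsity level and signal strength), and compare it to the testing rule induced by the horseshoe prior, which rejects a null when the posterior shrinkage weight exceeds 1/2. This is only a minimal sketch under assumed parameter values (n, p, psi, tau) and a simple quadrature over the half-Cauchy local scale; it is not the paper's simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-groups model: with probability p, theta_i ~ N(0, psi^2); else theta_i = 0
n, p, psi = 200, 0.05, 5.0
signal = rng.random(n) < p
theta = np.where(signal, rng.normal(0.0, psi, n), 0.0)
x = theta + rng.normal(0.0, 1.0, n)

# Bayes Oracle (Bogdan et al., 2011, with 0-1 loss): reject H_0i when x_i^2 > c^2,
# the threshold obtained from the two-groups likelihood ratio
c2 = (1 + psi**2) / psi**2 * (np.log(1 + psi**2) + 2 * np.log((1 - p) / p))
oracle_reject = x**2 > c2

def horseshoe_weight(x, tau, m=2000):
    """E[1 - kappa | x] under the horseshoe prior, kappa = 1/(1 + tau^2 lambda^2),
    computed by midpoint quadrature with the substitution lambda = tan(u)."""
    u = (np.arange(m) + 0.5) * (np.pi / 2) / m     # uniform grid maps to half-Cauchy
    v = 1 + tau**2 * np.tan(u) ** 2                # marginal variance of x given lambda
    like = np.exp(-x[:, None] ** 2 / (2 * v)) / np.sqrt(v)
    post_kappa = (like / v).sum(axis=1) / like.sum(axis=1)   # since kappa = 1/v
    return 1 - post_kappa

# Induced rule: reject H_0i when the posterior shrinkage weight exceeds 1/2
# (tau set to the sparsity level p, an illustrative choice)
hs_reject = horseshoe_weight(x, tau=p) > 0.5

for name, rej in [("oracle", oracle_reject), ("horseshoe", hs_reject)]:
    print(name, "misclassified:", int((rej != signal).sum()))
```

On a draw like the one above, both rules misclassify only a handful of the 200 coordinates, illustrating the sense in which the horseshoe-induced rule tracks the Oracle.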

Article information

Bayesian Anal., Volume 11, Number 3 (2016), 753-796.

First available in Project Euclid: 16 September 2015


Primary: 62C10: Bayesian problems; characterization of Bayes procedures

Keywords: asymptotic optimality; Bayes Oracle; empirical Bayes; generalized double Pareto; three parameter beta; horseshoe; normal–exponential–gamma; Strawderman–Berger


Ghosh, Prasenjit; Tang, Xueying; Ghosh, Malay; Chakrabarti, Arijit. Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity. Bayesian Anal. 11 (2016), no. 3, 753--796. doi:10.1214/15-BA973.



  • Armagan, A., Dunson, D. B., and Clyde, M. (2011). “Generalized Beta Mixtures of Gaussians”. In: Shawe-Taylor, J., Zemel, R. S., Bartlett, P. L., Pereira, F. C. N., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems, volume 24, 523–531.
  • Armagan, A., Dunson, D. B., and Lee, J. (2012). “Generalized Double Pareto Shrinkage”. Statistica Sinica, 23(1): 119–143.
  • Benjamini, Y. and Hochberg, Y. (1995). “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing”. Journal of the Royal Statistical Society, Series B, 57(1): 289–300.
  • Bhattacharya, A., Pati, D., Pillai, N., and Dunson, D. B. (2012). “Bayesian Shrinkage”. arXiv:1212.6088v1.
  • — (2014). “Dirichlet-Laplace priors for optimal shrinkage”. arXiv:1401.5398v1.
  • Bingham, N. H., Goldie, C. M., and Teugels, J. L. (1987). Regular Variation. Encyclopedia of Mathematics and Its Applications. Cambridge: Cambridge University Press.
  • Bogdan, M., Chakrabarti, A., Frommlet, F., and Ghosh, J. K. (2011). “Asymptotic Bayes-Optimality under Sparsity of Some Multiple Testing Procedures”. The Annals of Statistics, 39(3): 1551–1579.
  • Bogdan, M., Ghosh, J. K., and Tokdar, S. T. (2008). “A comparison of the Benjamini–Hochberg procedure with some Bayesian rules for multiple testing”. In: Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen, volume 1 of IMS Collections, 211–230. Beachwood, OH: Institute of Mathematical Statistics.
  • Cai, T. T. and Jin, J. (2010). “Optimal Rates of Convergence for Estimating the Null Density and Proportion of Nonnull Effects in Large-Scale Multiple Comparisons”. The Annals of Statistics, 38(1): 100–145.
  • Cai, T. T., Jin, J., and Low, M. J. (2007). “Estimation and confidence sets for sparse normal mixtures”. The Annals of Statistics, 35(6): 2421–2449.
  • Carvalho, C., Polson, N., and Scott, J. (2009). “Handling sparsity via the horseshoe”. Journal of Machine Learning Research W&CP, 5: 73–80.
  • — (2010). “The horseshoe estimator for sparse signals”. Biometrika, 97(2): 465–480.
  • Datta, J. and Ghosh, J. K. (2013). “Asymptotic Properties of Bayes Risk for the Horseshoe Prior”. Bayesian Analysis, 8(1): 111–132.
  • Donoho, D. and Jin, J. (2004). “Higher criticism for detecting sparse heterogeneous mixtures”. The Annals of Statistics, 32(3): 962–994.
  • Efron, B. (2004). “Large-scale simultaneous hypothesis testing: The choice of a null hypothesis”. Journal of the American Statistical Association, 99(465): 96–104.
  • — (2008). “Microarrays, Empirical Bayes and the two-groups Model”. Statistical Science, 23(1): 1–22.
  • Gelman, A. (2006). “Prior distributions for variance parameters in hierarchical models”. Bayesian Analysis, 1(3): 515–533.
  • Ghosh, P. and Chakrabarti, A. (2015). “Posterior Concentration Properties of a General Class of Shrinkage Priors around Nearly Black Vectors”. arXiv:1412.8161v4.
  • Griffin, J. E. and Brown, P. J. (2005). “Alternative prior distributions for variable selection with very many more variables than observations”. Technical report, University of Warwick.
  • — (2010). “Inference with normal–gamma prior distributions in regression problems”. Bayesian Analysis, 5(1): 171–188.
  • — (2012). “Structuring shrinkage: some correlated priors for regression”. Biometrika, 99(2): 481–487.
  • — (2013). “Some priors for sparse regression modeling”. Bayesian Analysis, 8(3): 691–702.
  • Hans, C. (2009). “Bayesian lasso regression”. Biometrika, 96(4): 835–845.
  • Hoeffding, W. (1963). “Probability inequalities for sums of bounded random variables”. Journal of the American Statistical Association, 58: 13–30.
  • Ingster, Y. I. (1997). “Some problems of hypothesis testing leading to infinitely divisible distributions”. Mathematical Methods of Statistics, 6(1): 47–69.
  • Meinshausen, N. and Rice, J. (2006). “Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses”. The Annals of Statistics, 34(1): 373–393.
  • Mitchell, T. and Beauchamp, J. (1988). “Bayesian variable selection in linear regression (with discussion)”. Journal of the American Statistical Association, 83(404): 1023–1036.
  • Park, T. and Casella, G. (2008). “The Bayesian lasso”. Journal of the American Statistical Association, 103(482): 681–686.
  • Pati, D., Bhattacharya, A., Pillai, N., and Dunson, D. (2014). “Posterior contraction in sparse Bayesian factor models for massive covariance matrices”. The Annals of Statistics, 42(3): 1102–1130.
  • Polson, N. G. and Scott, J. G. (2011). “Shrink Globally, Act Locally: Sparse Bayesian Regularization and Prediction”. In: Bayesian Statistics 9, Proceedings of the 9th Valencia International Meeting, 501–538. Oxford University Press.
  • — (2012). “On the Half-Cauchy Prior for a Global Scale parameter”. Bayesian Analysis, 7(2): 1–16.
  • Scott, J. and Berger, J. O. (2006). “An exploration of aspects of Bayesian multiple testing”. Journal of Statistical Planning and Inference, 136(7): 2144–2162.
  • — (2010). “Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem”. The Annals of Statistics, 38(5): 2587–2619.
  • Scott, J. G. (2011). “Bayesian estimation of intensity surfaces on the sphere via needlet shrinkage and selection”. Bayesian Analysis, 6(2): 307–327.
  • Storey, J. D. (2007). “The optimal discovery procedure: a new approach to simultaneous significance testing”. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(3): 347–368.
  • Tipping, M. (2001). “Sparse Bayesian learning and the Relevance Vector Machine”. Journal of Machine Learning Research, 1: 211–244.
  • van der Pas, S. L., Kleijn, B. J. K., and van der Vaart, A. W. (2014). “The horseshoe estimator: Posterior concentration around nearly black vectors”. Electronic Journal of Statistics, 8: 2585–2618.