Electronic Journal of Statistics

Asymptotically minimax empirical Bayes estimation of a sparse normal mean vector

Ryan Martin and Stephen G. Walker

Full-text: Open access

Abstract

For the important classical problem of inference on a sparse high-dimensional normal mean vector, we propose a novel empirical Bayes model that admits a posterior distribution with desirable properties under mild conditions. In particular, our empirical Bayes posterior distribution concentrates on balls, centered at the true mean vector, with squared radius proportional to the minimax rate, and its posterior mean is an asymptotically minimax estimator. We also show that, asymptotically, the support of our empirical Bayes posterior has roughly the same effective dimension as the true sparse mean vector. Simulation from our empirical Bayes posterior is straightforward, and our numerical results demonstrate the quality of our method compared to others having similar large-sample properties.
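To make the setting concrete: the paper concerns observing $X_i = \theta_i + \varepsilon_i$ with standard normal noise, where the mean vector $\theta$ has only $s \ll n$ nonzero entries, and the minimax rate for squared-error loss over such sparse classes is of order $2s\log(n/s)$. The sketch below illustrates this benchmark with a classical universal hard-thresholding estimator; it is *not* the empirical Bayes posterior mean proposed in the paper, and the dimension, sparsity, and signal size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n, s = 10_000, 50          # dimension and sparsity level (illustrative choices)
theta = np.zeros(n)
theta[:s] = 7.0            # a few large signals; the remaining coordinates are zero
x = theta + rng.standard_normal(n)   # observe X_i ~ N(theta_i, 1)

# Hard thresholding at the universal level sqrt(2 log n) -- a classical
# benchmark estimator, NOT the authors' empirical Bayes procedure.
t = np.sqrt(2 * np.log(n))
theta_hat = np.where(np.abs(x) > t, x, 0.0)

loss = np.sum((theta_hat - theta) ** 2)   # realized squared-error loss
minimax_rate = 2 * s * np.log(n / s)      # asymptotic minimax risk over sparse balls

print(f"loss = {loss:.1f}, minimax rate = {minimax_rate:.1f}")
```

An asymptotically minimax procedure, such as the posterior mean studied in the paper, attains risk within a factor $1 + o(1)$ of this rate as $n \to \infty$; the thresholding estimator above is merely a simple rate-comparable baseline.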

Article information

Source
Electron. J. Statist., Volume 8, Number 2 (2014), 2188-2206.

Dates
First available in Project Euclid: 29 October 2014

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1414588191

Digital Object Identifier
doi:10.1214/14-EJS949

Mathematical Reviews number (MathSciNet)
MR3273623

Zentralblatt MATH identifier
1302.62015

Subjects
Primary: 62C12: Empirical decision procedures; empirical Bayes procedures. 62C20: Minimax procedures. 62F12: Asymptotic properties of estimators.

Keywords
Data-dependent prior; high-dimensional; fractional likelihood; posterior concentration; shrinkage; two-groups model

Citation

Martin, Ryan; Walker, Stephen G. Asymptotically minimax empirical Bayes estimation of a sparse normal mean vector. Electron. J. Statist. 8 (2014), no. 2, 2188--2206. doi:10.1214/14-EJS949. https://projecteuclid.org/euclid.ejs/1414588191

