Annals of Statistics

General maximum likelihood empirical Bayes estimation of normal means

Wenhua Jiang and Cun-Hui Zhang

Full-text: Open access


We propose a general maximum likelihood empirical Bayes (GMLEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal fraction of the minimum average MSE among all separable estimators which use a single deterministic estimating function on individual observations, provided that the risk is of greater order than $(\log n)^5/n$. We also prove that the GMLEB is uniformly approximately minimax in regular and weak $\ell_p$ balls when the order of the length-normalized norm of the unknown means is between $(\log n)^{\kappa_1}/n^{1/(p\wedge 2)}$ and $n/(\log n)^{\kappa_2}$. Simulation experiments demonstrate that the GMLEB outperforms the James–Stein and several state-of-the-art threshold estimators in a wide range of settings without much downside.
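As a rough illustration of the idea summarized above, the following sketch estimates the prior on the means by maximum likelihood and then applies the resulting posterior-mean rule to each observation. It is a minimal sketch under stated assumptions, not the authors' implementation: the errors are taken to have unit variance, the prior is restricted to a fixed grid spanning the data range, and the likelihood is maximized by a fixed number of EM iterations. The function name `gmleb_sketch` and the parameters `grid_size` and `n_iter` are hypothetical.

```python
import numpy as np


def gmleb_sketch(x, grid_size=200, n_iter=200):
    """Grid-based empirical Bayes estimate of normal means (illustrative only).

    Assumes x[i] = theta[i] + e[i] with e[i] i.i.d. N(0, 1).
    Returns the estimated means, the support grid, and the fitted prior weights.
    """
    x = np.asarray(x, dtype=float)
    # Assumption: the prior is supported on an even grid over the data range.
    grid = np.linspace(x.min(), x.max(), grid_size)
    # lik[i, j] = standard normal density of x[i] - grid[j].
    lik = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)
    # Start from uniform weights on the grid.
    w = np.full(grid_size, 1.0 / grid_size)
    for _ in range(n_iter):
        mix = lik @ w  # fitted mixture density at each observation
        # EM update for mixing weights; the weights keep summing to one.
        w = w * (lik / mix[:, None]).mean(axis=0)
    mix = lik @ w
    # Posterior mean of theta[i] given x[i] under the fitted prior.
    post_mean = (lik * grid[None, :]) @ w / mix
    return post_mean, grid, w


# Usage on a simulated sparse mean vector (simulation settings are arbitrary).
rng = np.random.default_rng(0)
theta = np.zeros(1000)
theta[:50] = 5.0
x = theta + rng.standard_normal(theta.size)
theta_hat, grid, weights = gmleb_sketch(x)
print("MSE of raw observations:", np.mean((x - theta) ** 2))
print("MSE of sketch estimator:", np.mean((theta_hat - theta) ** 2))
```

The estimator is separable in the sense used above: once the prior weights are fitted, the same function of a single observation is applied to every coordinate. The grid resolution and iteration count trade accuracy for cost and are not tuned here.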

Article information

Ann. Statist., Volume 37, Number 4 (2009), 1647-1684.

First available in Project Euclid: 18 June 2009

Primary:
  • 62C12: Empirical decision procedures; empirical Bayes procedures
  • 62G05: Estimation
  • 62G08: Nonparametric regression
  • 62G20: Asymptotic properties
  • 62C25: Compound decision problems

Keywords: compound estimation; empirical Bayes; adaptive estimation; white noise; shrinkage estimator; threshold estimator


Jiang, Wenhua; Zhang, Cun-Hui. General maximum likelihood empirical Bayes estimation of normal means. Ann. Statist. 37 (2009), no. 4, 1647–1684. doi:10.1214/08-AOS638.


