Statistical Science

On Improved Loss Estimation for Shrinkage Estimators

Dominique Fourdrinier and Martin T. Wells



Let X be a random vector with distribution Pθ, where θ is an unknown parameter. When estimating θ by an estimator φ(X) under a loss function L(θ, φ), classical decision theory advocates using a decision rule that has suitable properties with respect to the frequentist risk R(θ, φ). However, after observing X = x, instances arise in practice in which φ is to be accompanied by an assessment of its loss, L(θ, φ(x)), which is unobservable since θ is unknown. A common approach is to estimate L(θ, φ(x)) by an estimator δ, called a loss estimator. We present an expository development of loss estimation, with substantial emphasis on the normal distributional setting and its extension to the case where the underlying distribution is spherically symmetric. Our overview covers improved loss estimators for least squares but focuses primarily on shrinkage estimators. Bayes estimation is also considered, and comparisons are made with unbiased estimation.
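To make the idea of a loss estimator concrete, the following sketch (not taken from the article; the setup, sample size, and mean vector are illustrative assumptions) checks by Monte Carlo that Stein's unbiased estimator of loss for the James–Stein estimator matches the realized loss on average. For X ~ N_p(θ, I_p) with p ≥ 3, the James–Stein estimator is φ(X) = (1 − (p−2)/‖X‖²) X, and Stein's identity gives the unbiased loss estimator δ₀(X) = p − (p−2)²/‖X‖² for the quadratic loss ‖φ(X) − θ‖².

```python
import numpy as np

# Monte Carlo check: E[delta0(X)] = E[||phi(X) - theta||^2] = R(theta, phi).
# The dimension p, sample count n, and theta below are arbitrary choices.
rng = np.random.default_rng(0)
p, n = 10, 200_000
theta = np.full(p, 1.0)                       # fixed (in practice unknown) mean

X = theta + rng.standard_normal((n, p))       # n draws from N_p(theta, I_p)
norm2 = np.sum(X**2, axis=1)                  # ||X||^2 for each draw
phi = (1.0 - (p - 2) / norm2)[:, None] * X    # James-Stein estimates
loss = np.sum((phi - theta) ** 2, axis=1)     # realized (unobservable) losses
delta0 = p - (p - 2) ** 2 / norm2             # unbiased loss estimates

# Both averages approximate the frequentist risk at this theta.
print(loss.mean(), delta0.mean())
```

Note that δ₀ is data-dependent, so it reports a loss assessment for the particular observed x; Johnstone (1988) showed that in this normal setting such unbiased loss estimators can themselves be improved upon, which is a central theme of the article.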

Article information

Statist. Sci., Volume 27, Number 1 (2012), 61-81.

First available in Project Euclid: 14 March 2012


Keywords: Conditional inference; linear model; loss estimation; quadratic loss; risk function; robustness; shrinkage estimation; spherical symmetry; SURE; unbiased estimator of loss; uniform distribution on a sphere


Fourdrinier, Dominique; Wells, Martin T. On Improved Loss Estimation for Shrinkage Estimators. Statist. Sci. 27 (2012), no. 1, 61--81. doi:10.1214/11-STS380.



  • [1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (Tsahkadsor, 1971) 267–281. Akadémiai Kiadó, Budapest.
  • [2] Barron, A., Birgé, L. and Massart, P. (1999). Risk bounds for model selection via penalization. Probab. Theory Related Fields 113 301–413.
  • [3] Bartlett, P., Boucheron, S. and Lugosi, G. (2002). Model selection and error estimation. Machine Learning 48 85–113.
  • [4] Berger, J. (1985). The frequentist viewpoint and conditioning. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. I (Berkeley, Calif., 1983) (L. Le Cam and R. Olshen, eds.) 15–44. Wadsworth, Belmont, CA.
  • [5] Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd ed. Springer, New York.
  • [6] Berger, J. O. (1985). In defense of the likelihood principle: Axiomatics and coherency. In Bayesian Statistics, 2 (Valencia, 1983) (D. V. Lindley, J. M. Bernardo, M. H. DeGroot and A. F. M. Smith, eds.) 33–66. North-Holland, Amsterdam.
  • [7] Blanchard, D. and Fourdrinier, D. (1999). Non-trivial solutions of non-linear partial differential inequations and order cut-off. Rend. Mat. Appl. (7) 19 137–154.
  • [8] Bock, M. E. (1988). Shrinkage estimators: Pseudo-Bayes rules for normal mean vectors. In Statistical Decision Theory and Related Topics, IV, Vol. 1 (West Lafayette, Ind., 1986) (S. S. Gupta and J. O. Berger, eds.) 281–297. Springer, New York.
  • [9] Brown, L. (1968). Inadmissibility of the usual estimators of scale parameters in problems with unknown location and scale parameters. Ann. Math. Statist. 39 29–48.
  • [10] Brown, L. D. (1971). Admissible estimators, recurrent diffusions, and insoluble boundary value problems. Ann. Math. Statist. 42 855–903.
  • [11] Cellier, D. and Fourdrinier, D. (1995). Shrinkage estimators under spherical symmetry for the general linear model. J. Multivariate Anal. 52 338–351.
  • [12] Cellier, D., Fourdrinier, D. and Robert, C. (1989). Robust shrinkage estimators of the location parameter for elliptically symmetric distributions. J. Multivariate Anal. 29 39–52.
  • [13] Clevenson, M. L. and Zidek, J. V. (1975). Simultaneous estimation of the means of independent Poisson laws. J. Amer. Statist. Assoc. 70 698–705.
  • [14] Craven, P. and Wahba, G. (1978/79). Smoothing noisy data with spline functions. Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31 377–403.
  • [15] Dey, D. K. and Srinivasan, C. (1985). Estimation of a covariance matrix under Stein’s loss. Ann. Statist. 13 1581–1591.
  • [16] Donoho, D. L. and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc. 90 1200–1224.
  • [17] du Plessis, N. (1970). An Introduction to Potential Theory. University Mathematical Monographs 7. Hafner, Darien, CT.
  • [18] Efron, B. (2004). The estimation of prediction error: Covariance penalties and cross-validation. J. Amer. Statist. Assoc. 99 619–642.
  • [19] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407–499.
  • [20] Fang, K. T., Kotz, S. and Ng, K. W. (1990). Symmetric Multivariate and Related Distributions. Monographs on Statistics and Applied Probability 36. Chapman & Hall, London.
  • [21] Fourdrinier, D. and Lepelletier, P. (2008). Estimating a general function of a quadratic function. Ann. Inst. Statist. Math. 60 85–119.
  • [22] Fourdrinier, D. and Strawderman, W. E. (2003). On Bayes and unbiased estimators of loss. Ann. Inst. Statist. Math. 55 803–816.
  • [23] Fourdrinier, D., Strawderman, W. E. and Wells, M. T. (2003). Robust shrinkage estimation for elliptically symmetric distributions with unknown covariance matrix. J. Multivariate Anal. 85 24–39.
  • [24] Fourdrinier, D. and Wells, M. T. (1994). Comparaisons de procédures de sélection d’un modèle de régression: Une approche décisionnelle. C. R. Acad. Sci. Paris Sér. I Math. 319 865–870.
  • [25] Fourdrinier, D. and Wells, M. T. (1995). Estimation of a loss function for spherically symmetric distributions in the general linear model. Ann. Statist. 23 571–592.
  • [26] Fourdrinier, D. and Wells, M. T. (1995). Loss estimation for spherically symmetric distributions. J. Multivariate Anal. 53 311–331.
  • [27] Haff, L. R. (1979). An identity for the Wishart distribution with applications. J. Multivariate Anal. 9 531–544.
  • [28] Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models. Monographs on Statistics and Applied Probability 43. Chapman & Hall, London.
  • [29] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12 55–67.
  • [30] Hsieh, F. S. and Hwang, J. T. G. (1993). Admissibility under the frequentist’s validity constraint in estimating the loss of the least-squares estimator. J. Multivariate Anal. 44 279–285.
  • [31] Hudson, H. M. (1978). A natural identity for exponential families with applications in multiparameter estimation. Ann. Statist. 6 473–484.
  • [32] Johnstone, I. (1988). On inadmissibility of some unbiased estimates of loss. In Statistical Decision Theory and Related Topics, IV, Vol. 1 (West Lafayette, Ind., 1986) (S. S. Gupta and J. O. Berger, eds.) 361–379. Springer, New York.
  • [33] Kiefer, J. (1975). Conditional confidence approach in multi-decision problems. In Multivariate Analysis 4 (P. R. Krishnaiah, ed.). Academic Press, New York.
  • [34] Kiefer, J. (1976). Admissibility of conditional confidence procedures. Ann. Statist. 4 836–865.
  • [35] Kiefer, J. (1977). Conditional confidence statements and confidence estimators. J. Amer. Statist. Assoc. 72 789–827.
  • [36] Kubokawa, T. and Srivastava, M. S. (1999). Robust improvement in estimation of a covariance matrix in an elliptically contoured distribution. Ann. Statist. 27 600–609.
  • [37] Lehmann, E. L. and Scheffé, H. (1950). Completeness, similar regions and unbiased estimates. Sankhyā 17 305–340.
  • [38] Lele, C. (1992). Inadmissibility of loss estimators. Statist. Decisions 10 309–322.
  • [39] Lele, C. (1993). Admissibility results in loss estimation. Ann. Statist. 21 378–390.
  • [40] Lu, K. L. and Berger, J. O. (1989). Estimation of normal means: Frequentist estimation of loss. Ann. Statist. 17 890–906.
  • [41] Mallows, C. (1973). Some comments on Cp. Technometrics 15 661–675.
  • [42] Rukhin, A. L. (1988). Estimated loss and admissible loss estimators. In Statistical Decision Theory and Related Topics, IV, Vol. 1 (West Lafayette, Ind., 1986) (S. S. Gupta and J. O. Berger, eds.) 409–418. Springer, New York.
  • [43] Sandved, E. (1968). Ancillary statistics and prediction of the loss in estimation problems. Ann. Math. Statist. 39 1756–1758.
  • [44] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461–464.
  • [45] Stein, C. Estimating the covariance matrix. Unpublished manuscript.
  • [46] Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Statist. Math. 16 155–160.
  • [47] Steĭn, Č. (1977). Lectures on multivariate estimation theory. Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 74 4–65, 146, 148.
  • [48] Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9 1135–1151.
  • [49] Takemura, A. (1984). An orthogonally invariant minimax estimator of the covariance matrix of a multivariate normal population. Tsukuba J. Math. 8 367–376.
  • [50] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
  • [51] Wan, A. T. K. and Zou, G. (2004). On unbiased and improved loss estimation for the mean of a multivariate normal distribution with unknown variance. J. Statist. Plann. Inference 119 17–22.
  • [52] Ye, J. (1998). On measuring and correcting the effects of data mining and model selection. J. Amer. Statist. Assoc. 93 120–131.
  • [53] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 301–320.
  • [54] Zou, H., Hastie, T. and Tibshirani, R. (2007). On the “degrees of freedom” of the lasso. Ann. Statist. 35 2173–2192.