## The Annals of Statistics

### Empirical Bayes estimates for a two-way cross-classified model

#### Abstract

We develop an empirical Bayes procedure for estimating the cell means in an unbalanced, two-way additive model with fixed effects. We employ a hierarchical model, which reflects exchangeability of the effects within treatment and within block but not necessarily between them, as suggested earlier by Lindley and Smith [J. R. Stat. Soc., B 34 (1972) 1–41]. The hyperparameters of this hierarchical model, instead of being considered fixed, are replaced with data-dependent values chosen so that the point risk of the empirical Bayes estimator is small. Our method chooses the hyperparameters by minimizing an unbiased risk estimate and is shown to be asymptotically optimal for the estimation problem defined above, under suitable conditions. The usual empirical Best Linear Unbiased Predictor (BLUP) is shown to be substantially different from the proposed method in the unbalanced case and, therefore, performs suboptimally. Our estimator is implemented through a computationally tractable algorithm that scales to large designs. The case of missing cell observations is treated as well.
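The idea of choosing hyperparameters by minimizing an unbiased risk estimate can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a known error variance `sigma2`, at least one observation in every cell, a flat prior on the grand mean, and a plain grid search over two hypothetical variance ratios `lam_a` and `lam_b`. The function names `design`, `hat_matrix`, `sure` and `fit` are illustrative only. For a linear estimator $Hy$ of the cell means $\eta$, with $\mathrm{Cov}(y) = \sigma^2 M$, the quantity $\|y - Hy\|^2 + 2\sigma^2\,\mathrm{tr}(HM) - \sigma^2\,\mathrm{tr}(M)$ is unbiased for the risk $E\|Hy - \eta\|^2$.

```python
import numpy as np

def design(r, c):
    """Design matrix [1 | row indicators | column indicators] for an r x c layout."""
    rows, cols = np.divmod(np.arange(r * c), c)  # cell k sits in row k // c, column k % c
    return np.hstack([np.ones((r * c, 1)), np.eye(r)[rows], np.eye(c)[cols]])

def hat_matrix(X, counts, lam_a, lam_b, r, c):
    """Linear map from cell averages to fitted cell means (penalized least squares)."""
    W = np.diag(counts)  # precision of each cell average, up to the factor 1 / sigma2
    # No penalty on the grand mean; row and column effects are shrunk toward zero.
    P = np.diag(np.r_[0.0, np.full(r, 1.0 / lam_a), np.full(c, 1.0 / lam_b)])
    return X @ np.linalg.solve(X.T @ W @ X + P, X.T @ W)

def sure(y, H, counts, sigma2):
    """Unbiased estimate of E||H y - eta||^2 when Cov(y) = sigma2 * diag(1 / counts)."""
    M = 1.0 / counts
    resid = y - H @ y
    return resid @ resid + 2.0 * sigma2 * np.sum(np.diag(H) * M) - sigma2 * M.sum()

def fit(y, counts, sigma2, r, c, grid=np.logspace(-3, 3, 25)):
    """Grid search (an assumption of this sketch) for the SURE-minimizing pair."""
    X = design(r, c)
    _, lam_a, lam_b = min(
        (sure(y, hat_matrix(X, counts, a, b, r, c), counts, sigma2), a, b)
        for a in grid for b in grid
    )
    return hat_matrix(X, counts, lam_a, lam_b, r, c) @ y, (lam_a, lam_b)

# Simulated unbalanced 3 x 4 layout with additive cell means (illustrative data only).
rng = np.random.default_rng(0)
r, c, sigma2 = 3, 4, 1.0
counts = rng.integers(1, 6, size=r * c).astype(float)
eta = 2.0 + np.repeat(rng.normal(size=r), c) + np.tile(rng.normal(size=c), r)
y = eta + rng.normal(size=r * c) * np.sqrt(sigma2 / counts)
eta_hat, lam = fit(y, counts, sigma2, r, c)
```

Because the cell counts differ, the criterion weights cells differently from a likelihood-based choice of the same hyperparameters, which is one informal way to see why the two approaches can disagree in unbalanced designs.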

#### Article information

Source
Ann. Statist., Volume 46, Number 4 (2018), 1693–1720.

Dates
Revised: February 2017
First available in Project Euclid: 27 June 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1530086430

Digital Object Identifier
doi:10.1214/17-AOS1599

Mathematical Reviews number (MathSciNet)
MR3819114

Zentralblatt MATH identifier
06936475

#### Citation

Brown, Lawrence D.; Mukherjee, Gourab; Weinstein, Asaf. Empirical Bayes estimates for a two-way cross-classified model. Ann. Statist. 46 (2018), no. 4, 1693–1720. doi:10.1214/17-AOS1599. https://projecteuclid.org/euclid.aos/1530086430

#### References

• Bates, D. M. (2010). lme4: Mixed-effects modeling with R. http://lme4.r-forge.r-project.org/book.
• Brown, L. D., Mukherjee, G. and Weinstein, A. (2018). Supplement to “Empirical Bayes estimates for a two-way cross-classified model.” DOI:10.1214/17-AOS1599SUPP.
• Candès, E. J., Sing-Long, C. A. and Trzasko, J. D. (2013). Unbiased risk estimates for singular value thresholding and spectral estimators. IEEE Trans. Signal Process. 61 4643–4657.
• Dicker, L. H. (2013). Optimal equivariant prediction for high-dimensional linear models with arbitrary predictor covariance. Electron. J. Stat. 7 1806–1834.
• Dicker, L. H. and Erdogdu, M. A. (2017). Flexible results for quadratic forms with applications to variance components estimation. Ann. Statist. 45 386–414.
• Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1995). Wavelet shrinkage: Asymptopia? J. R. Stat. Soc., B 57 301–369.
• Draper, N. R. and Van Nostrand, R. C. (1979). Ridge regression and James–Stein estimation: Review and comments. Technometrics 21 451–466.
• Efron, B. and Morris, C. (1972). Empirical Bayes on vector observations: An extension of Stein’s method. Biometrika 59 335–347.
• Efron, B. and Morris, C. (1973). Stein’s estimation rule and its competitors—an empirical Bayes approach. J. Amer. Statist. Assoc. 68 117–130.
• Fahrmeir, L., Kneib, T., Lang, S. and Marx, B. (2013). Regression: Models, Methods and Applications. Springer, Heidelberg.
• Gelman, A. (2005). Analysis of variance—why it is more important than ever. Ann. Statist. 33 1–53.
• Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2004). Bayesian Data Analysis, 2nd ed. Chapman & Hall/CRC, Boca Raton, FL.
• Ghosh, M., Nickerson, D. M. and Sen, P. K. (1987). Sequential shrinkage estimation. Ann. Statist. 15 817–829.
• Goldstein, H., Browne, W. and Rasbash, J. (2002). Multilevel modelling of medical data. Stat. Med. 21 3291–3315.
• Harville, D. A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. J. Amer. Statist. Assoc. 72 320–340.
• Henderson, C. (1984). ANOVA, MIVQUE, REML, and ML algorithms for estimation of variances and covariances. In Statistics: An Appraisal: Proceedings 50th Anniversary Conference (H. A. David and H. T. David, eds.) 257–280. The Iowa State University Press, Ames, IA.
• James, W. and Stein, C. (1961). Estimation with quadratic loss. In Proc. 4th Berkeley Sympos. Math. Statist. and Prob., Vol. I 361–379. Univ. California Press, Berkeley, CA.
• Jiang, J., Nguyen, T. and Rao, J. S. (2011). Best predictive small area estimation. J. Amer. Statist. Assoc. 106 732–745.
• Johnstone, I. M. (2011). Gaussian estimation: Sequence and wavelet models. Unpublished Manuscript.
• Johnstone, I. M. and Silverman, B. W. (2004). Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. Ann. Statist. 32 1594–1649.
• Kou, S. C. and Yang, J. J. (2017). Optimal shrinkage estimation in heteroscedastic hierarchical linear models. In Big and Complex Data Analysis. Contrib. Stat. 249–284. Springer, Cham.
• Li, K.-C. (1986). Asymptotic optimality of $C_{L}$ and generalized cross-validation in ridge regression with application to spline smoothing. Ann. Statist. 14 1101–1112.
• Lindley, D. (1962). Discussion of the paper by Stein. J. R. Stat. Soc., B 24 265–296.
• Lindley, D. V. and Smith, A. F. M. (1972). Bayes estimates for the linear model. J. R. Stat. Soc., B 34 1–41.
• Mason, W. M., Wong, G. Y. and Entwisle, B. (1983). Contextual analysis through the multilevel linear model. Sociol. Method. 1984 72–103.
• McCulloch, C. E. and Searle, S. R. (2001). Generalized, Linear, and Mixed Models. Wiley, New York.
• Oman, S. D. (1982). Shrinking towards subspaces in multiple linear regression. Technometrics 24 307–311.
• Rasbash, J. and Goldstein, H. (1994). Efficient analysis of mixed hierarchical and cross-classified random structures using a multilevel model. J. Educ. Behav. Stat. 19 337–350.
• Rolph, J. E. (1976). Choosing shrinkage estimators for regression problems. Comm. Statist. Theory Methods A5 789–802.
• Sclove, S. L. (1968). Improved estimators for coefficients in linear regression. J. Amer. Statist. Assoc. 63 596–606.
• Searle, S. R., Casella, G. and McCulloch, C. E. (1992). Variance Components. Wiley, New York.
• Stein, C. M. (1962). Confidence sets for the mean of a multivariate normal distribution. J. R. Stat. Soc., B 24 265–296.
• Tan, Z. (2016). Steinized empirical Bayes estimation for heteroscedastic data. Statist. Sinica 26 1219–1248.
• Xie, X., Kou, S. C. and Brown, L. D. (2012). SURE estimates for a heteroscedastic hierarchical model. J. Amer. Statist. Assoc. 107 1465–1479.
• Zaccarin, S. and Rivellini, G. (2002). Multilevel analysis in social research: An application of a cross-classified model. Stat. Methods Appl. 11 95–108.

#### Supplemental materials

• Supplement to “Empirical Bayes estimates for a two-way cross-classified model”. The supplement [Brown, Mukherjee and Weinstein (2018)] contains detailed proofs of the lemmas used in the Appendix to prove the results of Section 3, as well as derivations and further discussion of the results of Sections 2 and 4.