## The Annals of Statistics

### Nonparametric estimation of mean-squared prediction error in nested-error regression models

#### Abstract

Nested-error regression models are widely used for analyzing clustered data. For example, they are often applied to two-stage sample surveys, and in biology and econometrics. Prediction is usually the main goal of such analyses, and mean-squared prediction error is the main way in which prediction performance is measured. In this paper we suggest a new approach to estimating mean-squared prediction error. We introduce a matched-moment, double-bootstrap algorithm, enabling the notorious underestimation of the naive mean-squared error estimator to be substantially reduced. Our approach does not require specific assumptions about the distributions of errors. Additionally, it is simple and easy to apply. This simplicity is achieved by using Monte Carlo simulation to develop implicitly the formulae which, in a more conventional approach, would be derived laboriously by mathematical arguments.
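The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a double bootstrap can reduce the downward bias of a naive mean-squared prediction error estimate. It is not the authors' implementation. It assumes a deliberately simplified model with no covariates and balanced clusters, Y_ij = mu + U_i + e_ij, simple moment estimators, and normal bootstrap draws that match only the estimated first two moments. All function names (`fit`, `predict`, `simulate`, `naive_mspe`, `bias_corrected_mspe`) are hypothetical, and the additive correction `2 * mspe1 - mspe2` is the generic double-bootstrap bias adjustment.

```python
import numpy as np

def fit(y):
    """Moment estimates for Y_ij = mu + U_i + e_ij, balanced data of shape (n, m)."""
    n, m = y.shape
    area_means = y.mean(axis=1)
    mu = area_means.mean()
    var_e = ((y - area_means[:, None]) ** 2).sum() / (n * (m - 1))
    # Var(area mean) = var_u + var_e / m; truncate the estimate at zero.
    var_u = max(area_means.var(ddof=1) - var_e / m, 0.0)
    return mu, var_u, var_e

def predict(y):
    """Empirical predictor of theta_i = mu + U_i: shrink each area mean toward mu."""
    mu, var_u, var_e = fit(y)
    m = y.shape[1]
    gamma = var_u / (var_u + var_e / m) if var_u > 0 else 0.0
    return mu + gamma * (y.mean(axis=1) - mu)

def simulate(params, shape, rng):
    """Draw one bootstrap data set whose first two moments match params.

    Normal draws are an assumption of this sketch, not a requirement."""
    mu, var_u, var_e = params
    n, m = shape
    theta = mu + rng.normal(0.0, np.sqrt(var_u), size=n)
    y_star = theta[:, None] + rng.normal(0.0, np.sqrt(var_e), size=(n, m))
    return theta, y_star

def naive_mspe(params, shape, b, rng):
    """Single-bootstrap MSPE estimate per area, plus the simulated data sets."""
    sq_err = np.zeros(shape[0])
    samples = []
    for _ in range(b):
        theta, y_star = simulate(params, shape, rng)
        sq_err += (predict(y_star) - theta) ** 2
        samples.append(y_star)
    return sq_err / b, samples

def bias_corrected_mspe(y, b1=50, b2=20, seed=0):
    """Return (naive, bias-corrected) MSPE estimates for each area."""
    rng = np.random.default_rng(seed)
    # First level: resample from the model fitted to the observed data.
    mspe1, samples = naive_mspe(fit(y), y.shape, b1, rng)
    # Second level: refit on each first-level sample and resample again.
    mspe2 = np.mean(
        [naive_mspe(fit(ys), y.shape, b2, rng)[0] for ys in samples], axis=0
    )
    # Generic additive double-bootstrap bias correction.
    return mspe1, 2.0 * mspe1 - mspe2

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = 5.0 + rng.normal(0, 1, size=(15, 1)) + rng.normal(0, 2, size=(15, 4))
    naive, corrected = bias_corrected_mspe(y)
    print(naive.round(3))
    print(corrected.round(3))
```

The additive correction can occasionally produce negative values for individual areas, so a practical implementation would need some safeguard; the sketch leaves the raw values visible on purpose.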

#### Article information

Source
Ann. Statist., Volume 34, Number 4 (2006), 1733-1750.

Dates
First available in Project Euclid: 3 November 2006

Permanent link to this document
https://projecteuclid.org/euclid.aos/1162567631

Digital Object Identifier
doi:10.1214/009053606000000579

Mathematical Reviews number (MathSciNet)
MR2283715

Zentralblatt MATH identifier
1246.62106

#### Citation

Hall, Peter; Maiti, Tapabrata. Nonparametric estimation of mean-squared prediction error in nested-error regression models. Ann. Statist. 34 (2006), no. 4, 1733-1750. doi:10.1214/009053606000000579. https://projecteuclid.org/euclid.aos/1162567631
