The Annals of Statistics

Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models

Snigdhansu Chatterjee, Partha Lahiri, and Huilin Li

Full-text: Open access

Abstract

The empirical best linear unbiased prediction (EBLUP) method uses a linear mixed model to combine information from different sources. This method is particularly useful in small area problems. The variability of an EBLUP is traditionally measured by the mean squared prediction error (MSPE), and interval estimates are generally constructed using estimates of the MSPE. Such methods have shortcomings such as under-coverage or over-coverage, excessive interval length, and lack of interpretability. We propose a parametric bootstrap approach to estimate the entire distribution of a suitably centered and scaled EBLUP. The bootstrap histogram is highly accurate, differing from the true EBLUP distribution by only O(d^3 n^{-3/2}), where d is the number of parameters and n the number of observations. This result is used to obtain highly accurate prediction intervals. Simulation results demonstrate the superiority of this method over existing techniques for constructing prediction intervals in linear mixed models.
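To make the construction concrete, the following is a minimal sketch of a parametric bootstrap prediction interval in the special case of the Fay–Herriot area-level model, using a simple Prasad–Rao-type moment estimator of the random-effect variance. All function names, the variance floor, and the choice of the scale s_i are illustrative assumptions made for this sketch; they are not the paper's estimators or code.

```python
import numpy as np

def fit_fay_herriot(y, X, D):
    """Fit the Fay-Herriot model  y_i = x_i' beta + v_i + e_i,
    with v_i ~ N(0, A) and e_i ~ N(0, D_i), D_i known.
    Uses a simple moment estimator of A (illustrative choice)."""
    m, p = X.shape
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    # hat-matrix diagonal, needed for the moment estimator of A
    h = np.einsum('ij,ji->i', X @ np.linalg.inv(X.T @ X), X.T)
    A_mom = (resid @ resid - (D * (1.0 - h)).sum()) / (m - p)
    A_hat = max(A_mom, 1e-6 * D.mean())  # small floor avoids a degenerate bootstrap
    # weighted least squares for beta given A_hat
    w = 1.0 / (A_hat + D)
    beta_hat = np.linalg.solve((X.T * w) @ X, (X.T * w) @ y)
    return beta_hat, A_hat

def eblup_and_scale(y, X, D, beta_hat, A_hat):
    """EBLUP of the area means theta_i and a naive leading-term scale s_i."""
    gamma = A_hat / (A_hat + D)
    theta_hat = X @ beta_hat + gamma * (y - X @ beta_hat)
    s = np.sqrt(gamma * D)
    return theta_hat, s

def bootstrap_prediction_intervals(y, X, D, alpha=0.05, B=1000, seed=None):
    """Parametric bootstrap approximation to the distribution of the centered
    and scaled EBLUP, (theta_hat_i - theta_i) / s_i, inverted into intervals."""
    rng = np.random.default_rng(seed)
    m = len(y)
    beta_hat, A_hat = fit_fay_herriot(y, X, D)
    theta_hat, s = eblup_and_scale(y, X, D, beta_hat, A_hat)

    t_star = np.empty((B, m))
    for b in range(B):
        # draw bootstrap "true" means and data from the fitted model
        theta_b = X @ beta_hat + rng.normal(0.0, np.sqrt(A_hat), size=m)
        y_b = theta_b + rng.normal(0.0, np.sqrt(D), size=m)
        # refit and recompute the EBLUP on the bootstrap data
        beta_bb, A_bb = fit_fay_herriot(y_b, X, D)
        theta_hat_b, s_b = eblup_and_scale(y_b, X, D, beta_bb, A_bb)
        t_star[b] = (theta_hat_b - theta_b) / s_b

    q_lo = np.quantile(t_star, alpha / 2.0, axis=0)
    q_hi = np.quantile(t_star, 1.0 - alpha / 2.0, axis=0)
    # P(q_lo <= (theta_hat - theta)/s <= q_hi) ~ 1 - alpha, so invert for theta
    return theta_hat - q_hi * s, theta_hat - q_lo * s
```

Here y would hold the direct area-level estimates, X the area-level covariates, and D the known sampling variances; the two returned arrays give lower and upper endpoints of an approximate (1 − alpha) prediction interval for each small area mean.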

Article information

Source
Ann. Statist., Volume 36, Number 3 (2008), 1221-1245.

Dates
First available in Project Euclid: 26 May 2008

Permanent link to this document
https://projecteuclid.org/euclid.aos/1211819562

Digital Object Identifier
doi:10.1214/07-AOS512

Mathematical Reviews number (MathSciNet)
MR2418655

Zentralblatt MATH identifier
1360.62378

Subjects
Primary: 62D05: Sampling theory, sample surveys
Secondary: 62F40: Bootstrap, jackknife and other resampling methods; 62F25: Tolerance and confidence regions

Keywords
Predictive distribution; prediction interval; linear mixed model; small area; bootstrap; coverage accuracy

Citation

Chatterjee, Snigdhansu; Lahiri, Partha; Li, Huilin. Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. Ann. Statist. 36 (2008), no. 3, 1221–1245. doi:10.1214/07-AOS512. https://projecteuclid.org/euclid.aos/1211819562


