International Statistical Review

Estimation in Two-Stage Models with Heteroscedasticity

John Buonaccorsi



A surprising number of important problems can be cast in the framework of estimating a mean and variance using data arising from a two-stage structure. The first stage is a random sampling of "units", with some quantity of interest associated with each unit. The second stage produces an estimate of that quantity and usually, but not always, an estimated standard error, which may change considerably across units. Heteroscedasticity in the estimates over different units can arise for a number of reasons, including variation associated with the unit and changing sampling effort over units. This paper presents a broad discussion of the problem of making inferences for the population mean and variance associated with the unobserved true values at the first stage of sampling. A careful discussion of the causes of heteroscedasticity is given, followed by an examination of ways in which inferences can be carried out in a manner that is robust to the nature of the within-unit heteroscedasticity. Among the conclusions is that, under any type of heteroscedasticity, an unbiased estimate of the mean, and of the variance of the estimated mean, can be obtained by using the estimates as if they were the true unobserved values from the first stage. The issue of using the mean versus a weighted average that tries to account for the heteroscedasticity is also discussed. An unbiased estimate of the population variance is given, and the variance of this estimate and its covariance with the estimated mean are provided under various types of heteroscedasticity. The two-stage setting arises in many contexts, including one-way random effects models with replication, meta-analysis, multi-stage sampling from finite populations and random coefficients models. We motivate and illustrate the problem with data arising from these various contexts, with the goal of providing a unified framework for addressing such problems.
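As an illustration of the estimators described in the abstract, the following is a minimal sketch and not the paper's own code. It assumes each unit supplies an estimate `y[i]` and an unbiased estimate `s2[i]` of its within-unit variance. The unweighted mean and its variance estimate treat the estimates as if they were true values, as the abstract states. The population variance estimate (sample variance of the estimates minus the average within-unit variance) is a standard moment-based form assumed here. The function names, the truncation of negative variance estimates at zero, and the inverse-variance weights are illustrative choices.

```python
import numpy as np

def two_stage_estimates(y, s2):
    """Moment-based estimates for a two-stage model (illustrative sketch).

    y  : estimates of the unit-level quantities, one per sampled unit
    s2 : estimated within-unit variances of those estimates (assumed unbiased)
    """
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n = y.size

    # Treat the estimates as if they were the true unit values.
    mean_hat = y.mean()
    sample_var = y.var(ddof=1)          # among-estimate sample variance
    var_mean_hat = sample_var / n       # estimated variance of the mean

    # The sample variance of the estimates overstates the population
    # variance by the average within-unit variance, so subtract it.
    # (Assumed form; the result can be negative in small samples.)
    pop_var_hat = sample_var - s2.mean()
    return mean_hat, var_mean_hat, pop_var_hat

def weighted_mean(y, s2, pop_var_hat):
    """Weighted average with weights 1 / (population variance + within-unit
    variance); a common alternative to the unweighted mean (assumed form)."""
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    tau2 = max(pop_var_hat, 0.0)        # truncate a negative estimate at zero
    w = 1.0 / (tau2 + s2)
    return float(np.sum(w * y) / np.sum(w))

# Hypothetical data: four units with unequal within-unit variances.
y = [10.0, 12.0, 9.0, 13.0]
s2 = [1.0, 0.5, 2.0, 0.5]
mean_hat, var_mean_hat, pop_var_hat = two_stage_estimates(y, s2)
w_mean = weighted_mean(y, s2, pop_var_hat)
```

The unweighted estimates require no knowledge of how the within-unit variances behave, which is the robustness property the abstract emphasizes. The weighted average depends on estimated weights, so its behaviour depends on how well those weights are estimated.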

Article information

Internat. Statist. Rev., Volume 74, Number 3 (2006), 403-418.

First available in Project Euclid: 4 December 2006


Keywords: Mean; Meta-analysis; Measurement error; Random coefficients; Random effects; Replication; Sampling; Variance


Buonaccorsi, John. Estimation in Two-Stage Models with Heteroscedasticity. Internat. Statist. Rev. 74 (2006), no. 3, 403-418.


