The Annals of Statistics

Asymptotics for generalized estimating equations with large cluster sizes

Minge Xie and Yaning Yang


Generalized estimating equations are used in regression analysis of longitudinal data, where observations on each subject are correlated. Statistical analysis using such methods is based on the asymptotic properties of regression parameter estimators. This paper presents asymptotic results when either the number of independent subjects or the cluster sizes (the number of observations on each subject) or both go to infinity. A set of (information matrix based) general conditions is developed, which leads to the weak and strong consistency as well as the asymptotic normality of the estimators. Most of the results are parallel to the elegant work of Fahrmeir and Kaufmann on maximum likelihood estimators related to the generalized linear models. The conditions for weak consistency and asymptotic normality are verified for several examples of general interest.
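
For orientation, the estimating equation behind these results is usually written in the form introduced by Liang and Zeger (1986). The notation below is assumed for illustration and does not come from the abstract: there are n independent subjects, subject i contributes m_i observations, and R_i(α) is a working correlation matrix that need not equal the true correlation.

```latex
% Assumed notation (illustrative, not from the abstract):
%   y_i            m_i-vector of responses for subject i
%   \mu_i(\beta)   marginal mean of y_i under the regression model
%   D_i            \partial \mu_i(\beta) / \partial \beta^{\top}
%   A_i            diagonal matrix of marginal variances of y_i
%   R_i(\alpha)    working correlation matrix
\sum_{i=1}^{n} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
V_i = A_i^{1/2} \, R_i(\alpha) \, A_i^{1/2}.
```

In this notation, the three regimes named in the abstract correspond to n → ∞ with bounded m_i, max_i m_i → ∞ with n fixed, or both growing together.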

Article information

Ann. Statist., Volume 31, Number 1 (2003), 310-347.

First available in Project Euclid: 26 February 2003

Primary: 62F12: Asymptotic properties of estimators; 62J12: Generalized linear models

Generalized estimation equations (GEE) longitudinal data cluster correlated observations infinite cluster sizes


Xie, Minge; Yang, Yaning. Asymptotics for generalized estimating equations with large cluster sizes. Ann. Statist. 31 (2003), no. 1, 310--347. doi:10.1214/aos/1046294467.


  • BILLINGSLEY, P. (1986). Probability and Measure, 2nd ed. Wiley, New York.
  • CHEN, K., HU, I. and YING, Z. (1999). Strong consistency of maximum quasi-likelihood estimators in generalized linear models with fixed and adaptive designs. Ann. Statist. 27 1155-1163.
  • CHOW, Y. S. and TEICHER, H. (1988). Probability Theory: Independence, Interchangeability, Martingales, 2nd ed. Springer, New York.
  • CRAMÉR, H. (1946). Mathematical Methods of Statistics. Princeton Univ. Press.
  • CROWDER, M. (1986). On consistency and inconsistency of estimating equations. Econometric Theory 2 305-330.
  • DIGGLE, P. J., LIANG, K.-Y. and ZEGER, S. L. (1996). Analysis of Longitudinal Data. Clarendon Press, Oxford.
  • EICKER, F. (1967). Limit theorems for regressions with unequal and dependent errors. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 1 59-82. Univ. California Press, Berkeley.
  • FAHRMEIR, L. and KAUFMANN, H. (1985). Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Ann. Statist. 13 342-368.
  • FAHRMEIR, L. and KAUFMANN, H. (1986). Asymptotic inference in discrete response models. Statist. Hefte 27 179-205.
  • GOURIEROUX, C. and MONFORT, A. (1981). Asymptotic properties of the maximum likelihood estimator in dichotomous logit models. J. Econometrics 17 83-97.
  • HABERMAN, S. J. (1977). Maximum likelihood estimates in exponential response models. Ann. Statist. 5 815-841.
  • HUBER, P. J. (1981). Robust Statistics. Wiley, New York.
  • LI, B. (1996). A minimax approach to consistency and efficiency for estimating equations. Ann. Statist. 24 1283-1297.
  • LIANG, K.-Y. and ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73 13-22.
  • NELDER, J. A. and WEDDERBURN, R. W. M. (1972). Generalized linear models. J. Roy. Statist. Soc. Ser. A 135 370-384.
  • NI, G. X. (1984). Basic Matrix Theories and Methods. Shanghai Science and Technology Publisher. (In Chinese.)
  • ROMANO, J. P. and SIEGEL, A. F. (1986). Counterexamples in Probability and Statistics. Wadsworth, Monterey, CA.
  • SCHOTT, J. R. (1997). Matrix Analysis for Statistics. Wiley, New York.
  • SERFLING, R. J. (1970). Convergence properties of Sn under moment restrictions. Ann. Math. Statist. 41 1235-1248.
  • WU, C. F. (1981). Asymptotic theory of nonlinear least squares estimation. Ann. Statist. 9 501-513.
  • XIE, M. and SIMPSON, D. G. (1998). Categorical exposure-response regression analysis of toxicology experiments. Case Studies in Environmental Statistics. Lecture Notes in Statist. 132 121-141. Springer, New York.
  • XIE, M., SIMPSON, D. G. and CARROLL, R. J. (2000). Random effects in interval-censored ordinal regression: Latent structure and Bayesian approach. Biometrics 56 376-383.
  • YOHAI, V. J. and MARONNA, R. A. (1979). Asymptotic behavior of M-estimators for the linear model. Ann. Statist. 7 258-268.
  • YUAN, K.-H. and JENNRICH, R. I. (1998). Asymptotics of estimating equations under natural conditions. J. Multivariate Anal. 65 245-260.
  • ZEGER, S. L. and LIANG, K.-Y. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics 42 121-130.