The Annals of Statistics

Local linear regression for generalized linear models with missing data

R. J. Carroll, Roberto G. Gutierrez, C. Y. Wang, and Suojin Wang

Full-text: Open access

Abstract

Fan, Heckman and Wand proposed locally weighted kernel polynomial regression methods for generalized linear models and quasilikelihood functions. When the covariate variables are missing at random, we propose a weighted estimator based on the inverse selection probability weights. Distribution theory is derived when the selection probabilities are estimated nonparametrically. We show that the asymptotic variance of the resulting nonparametric estimator of the mean function in the main regression model is the same as that when the selection probabilities are known, while the biases are generally different. This is different from results in parametric problems, where it is known that estimating weights actually decreases asymptotic variance. To reconcile the difference between the parametric and nonparametric problems, we obtain a second-order variance result for the nonparametric case. We generalize this result to local estimating equations. Finite-sample performance is examined via simulation studies. The proposed method is demonstrated via an analysis of data from a case-control study.
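The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions not stated above: an identity link (so the local quasilikelihood fit reduces to weighted least squares), a Gaussian kernel, a scalar covariate, and selection probabilities that depend only on one always-observed variable `v` and are estimated by a Nadaraya-Watson smoother. All function and variable names are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice of kernel)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def estimate_selection_prob(v, delta, bandwidth):
    """Nadaraya-Watson estimate of P(delta = 1 | V = v_i) at each v_i.

    v     : always-observed variable assumed to drive missingness
    delta : 1.0 if the covariate is observed, 0.0 if it is missing
    """
    k = gaussian_kernel((v[:, None] - v[None, :]) / bandwidth)
    return (k @ delta) / k.sum(axis=1)


def ipw_local_linear(x0, x, y, delta, pi_hat, bandwidth):
    """Local linear estimate of E[Y | X = x0] from complete cases,
    each weighted by kernel weight / estimated selection probability."""
    obs = delta == 1
    xc = x[obs] - x0                                   # centered covariate
    w = gaussian_kernel(xc / bandwidth) / pi_hat[obs]  # inverse-probability weights
    design = np.column_stack([np.ones_like(xc), xc])
    wd = design * w[:, None]
    beta = np.linalg.solve(design.T @ wd, wd.T @ y[obs])
    return beta[0]                                     # intercept = fitted mean at x0


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 500
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2.0 * np.pi * x) + 0.2 * rng.standard_normal(n)
    # Assumed missingness mechanism: depends on y only (missing at random).
    p_true = 1.0 / (1.0 + np.exp(-(1.0 + y)))
    delta = (rng.uniform(size=n) < p_true).astype(float)
    pi_hat = estimate_selection_prob(y, delta, bandwidth=0.3)
    print(ipw_local_linear(0.25, x, y, delta, pi_hat, bandwidth=0.1))
```

Replacing `pi_hat` with the true probabilities gives the known-weights estimator that the abstract compares against. The bandwidths here are arbitrary illustrative values, not recommendations.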

Article information

Source
Ann. Statist., Volume 26, Number 3 (1998), 1028-1050.

Dates
First available in Project Euclid: 21 June 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1024691087

Digital Object Identifier
doi:10.1214/aos/1024691087

Mathematical Reviews number (MathSciNet)
MR1635438

Zentralblatt MATH identifier
1073.62548

Subjects
Primary: 62G07: Density estimation
Secondary: 62G20

Keywords
Generalized linear models; kernel regression; local linear smoother; measurement error; missing at random; quasilikelihood functions

Citation

Wang, C. Y.; Wang, Suojin; Gutierrez, Roberto G.; Carroll, R. J. Local linear regression for generalized linear models with missing data. Ann. Statist. 26 (1998), no. 3, 1028--1050. doi:10.1214/aos/1024691087. https://projecteuclid.org/euclid.aos/1024691087



References

  • Bruemmer, B., White, E., Vaughan, T. and Cheney, C. (1996). Nutrient intake in relationship to bladder cancer among middle aged men and women. Amer. J. Epidemiology 144 485-495.
  • Carroll, R. J., Ruppert, D. and Welsh, A. H. (1998). Local estimating equations. J. Amer. Statist. Assoc. 93 214-227.
  • Carroll, R. J., Fan, J., Gijbels, I. and Wand, M. (1997). Generalized partially linear single-index models. J. Amer. Statist. Assoc. 92 477-489.
  • Fan, J., Heckman, N. E. and Wand, M. P. (1995). Local polynomial kernel regression for generalized linear models and quasilikelihood functions. J. Amer. Statist. Assoc. 90 141-150.
  • Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 663-685.
  • Little, R. J. A. and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York.
  • McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, London.
  • Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. J. Roy. Statist. Soc. Ser. A 135 370-384.
  • Prentice, R. L. and Pyke, R. (1979). Logistic disease incidence models and case-control studies. Biometrika 66 403-411.
  • Robins, J. M., Rotnitzky, A. and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. J. Amer. Statist. Assoc. 89 846-866.
  • Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581-592.
  • Schucany, W. R. (1995). Adaptive bandwidth choice for kernel regression. J. Amer. Statist. Assoc. 90 535-540.
  • Severini, T. A. and Staniswalis, J. G. (1994). Quasilikelihood estimation in semiparametric models. J. Amer. Statist. Assoc. 89 501-511.
  • Staniswalis, J. G. (1989). The kernel estimate of a regression function in likelihood-based models. J. Amer. Statist. Assoc. 84 276-283.
  • Wang, C. Y., Wang, S., Zhao, L. P. and Ou, S. T. (1997). Weighted semiparametric estimation in regression analysis with missing covariate data. J. Amer. Statist. Assoc. 92 512-525.
  • Wedderburn, R. W. M. (1974). Quasilikelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61 439-447.
  • White, J. E. (1982). A two stage design for the study of the relationship between a rare exposure and a rare disease. Amer. J. Epidemiology 115 119-128.