Open Access
March 2012
Marginal analysis of longitudinal count data in long sequences: Methods and applications to a driving study
Zhiwei Zhang, Paul S. Albert, Bruce Simons-Morton
Ann. Appl. Stat. 6(1): 27-54 (March 2012). DOI: 10.1214/11-AOAS507


Most of the available methods for longitudinal data analysis are designed and validated for the situation where the number of subjects is large and the number of observations per subject is relatively small. Motivated by the Naturalistic Teenage Driving Study (NTDS), which represents the exact opposite situation, we examine standard and propose new methodology for marginal analysis of longitudinal count data in a small number of very long sequences. We consider standard methods based on generalized estimating equations, under working independence or an appropriate correlation structure, and find them unsatisfactory for dealing with time-dependent covariates when the counts are low. For this situation, we explore a within-cluster resampling (WCR) approach that involves repeated analyses of random subsamples with a final analysis that synthesizes results across subsamples. This leads to a novel WCR method which operates on separated blocks within subjects and which performs better than all of the previously considered methods. The methods are applied to the NTDS data and evaluated in simulation experiments mimicking the NTDS.
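The within-cluster resampling idea described above can be illustrated with a minimal sketch. It is not the authors' exact procedure: it assumes a single binary time-dependent covariate, draws every `gap`-th observation from a random offset within each subject as a stand-in for "separated blocks", fits a Poisson log rate ratio to each pooled subsample, and combines the results with the usual WCR rule (average of the estimates; average within-resample variance minus the between-resample variance of the estimates). All function names and parameters (`spaced_subsample`, `log_rate_ratio`, `wcr`, `gap`, `n_resamples`) are hypothetical and introduced only for illustration.

```python
import math
import random
from statistics import mean, variance


def spaced_subsample(series, gap, rng):
    """Keep every `gap`-th observation of one subject, starting at a random offset."""
    offset = rng.randrange(gap)
    return series[offset::gap]


def log_rate_ratio(obs):
    """Poisson log rate ratio for a binary covariate.

    obs is a list of (x, y) pairs with x in {0, 1} and y a count.
    Returns (estimate, model-based variance), or None if a group is
    empty or has no events (the log rate ratio is then undefined).
    """
    n1 = sum(1 for x, _ in obs if x == 1)
    n0 = sum(1 for x, _ in obs if x == 0)
    s1 = sum(y for x, y in obs if x == 1)
    s0 = sum(y for x, y in obs if x == 0)
    if min(n1, n0, s1, s0) == 0:
        return None
    estimate = math.log((s1 / n1) / (s0 / n0))
    model_variance = 1.0 / s1 + 1.0 / s0
    return estimate, model_variance


def wcr(subjects, gap, n_resamples, seed=0):
    """Within-cluster resampling over spaced observations (illustrative only).

    subjects is a list of per-subject series, each a list of (x, y) pairs
    in time order. Returns (combined estimate, combined variance).
    """
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(n_resamples):
        pooled = []
        for series in subjects:
            pooled.extend(spaced_subsample(series, gap, rng))
        fit = log_rate_ratio(pooled)
        if fit is None:
            continue  # skip degenerate subsamples
        estimates.append(fit[0])
        variances.append(fit[1])
    if len(estimates) < 2:
        raise ValueError("too few usable subsamples")
    combined_estimate = mean(estimates)
    # Average within-resample variance minus between-resample variance;
    # this difference can be negative in finite samples.
    combined_variance = mean(variances) - variance(estimates)
    return combined_estimate, combined_variance
```

A call such as `wcr(subjects, gap=5, n_resamples=200)` returns the combined log rate ratio and its variance estimate. The choice of `gap` trades off the number of observations kept per subsample against the residual correlation between them; the value used here is an assumption, not a recommendation from the paper.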



Zhiwei Zhang, Paul S. Albert, Bruce Simons-Morton. "Marginal analysis of longitudinal count data in long sequences: Methods and applications to a driving study." Ann. Appl. Stat. 6 (1) 27-54, March 2012.


Published: March 2012
First available in Project Euclid: 6 March 2012

zbMATH: 1235.62037
MathSciNet: MR2951528
Digital Object Identifier: 10.1214/11-AOAS507

Keywords: correlation, generalized estimating equation, multiple outputation, overdispersion, random effect, separated blocks, within-cluster resampling

Rights: Copyright © 2012 Institute of Mathematical Statistics

Vol. 6 • No. 1 • March 2012