The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 6, Number 3 (2012), 831-852.
People born in the Middle East but residing in the Netherlands: Invariant population size estimates and the role of active and passive covariates
Including covariates in loglinear models of population registers improves population size estimates for two reasons. First, it makes it possible to take heterogeneity of inclusion probabilities over the levels of a covariate into account; second, it allows subdivision of the estimated population by the levels of the covariates, giving insight into characteristics of individuals that are not included in any of the registers. Whether marginalizing the full table of registers by covariates over one or more covariates leaves the population size estimate invariant is intimately related to the collapsibility of contingency tables [Biometrika 70 (1983) 567–578]. We show that, with information from two registers, population size invariance is equivalent to the simultaneous collapsibility of each margin consisting of one register and the covariates. We give a short path characterization of the loglinear model which describes when marginalizing over a covariate leads to different population size estimates. Covariates that are collapsible are called passive; covariates that are not collapsible are called active. We make the case that it can be useful to include passive covariates within the estimation model, because they allow a finer description of the population in terms of these covariates. As an example, we discuss the estimation of the population size of people born in the Middle East but residing in the Netherlands.
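The distinction between passive and active covariates can be illustrated with a small numerical sketch. It is not the authors' estimation procedure: it assumes the simplest case of two registers, A and B, that are independent within each level of a single covariate, and it uses the classical two-register estimate of the missed count (only in A times only in B, divided by in both). All function names and counts below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical counts per covariate level:
# (in both registers, only in register A, only in register B).
Counts = Tuple[float, float, float]


def missed(n11: float, n10: float, n01: float) -> float:
    """Estimated number in neither register, assuming A and B are independent."""
    return n10 * n01 / n11


def total_estimate(strata: Dict[str, Counts]) -> float:
    """Observed count plus the estimated missed count, summed over covariate levels."""
    observed = sum(sum(counts) for counts in strata.values())
    return observed + sum(missed(*counts) for counts in strata.values())


def collapse(strata: Dict[str, Counts]) -> Dict[str, Counts]:
    """Marginalize over the covariate by adding the counts of all levels."""
    n11 = sum(c[0] for c in strata.values())
    n10 = sum(c[1] for c in strata.values())
    n01 = sum(c[2] for c in strata.values())
    return {"all": (n11, n10, n01)}


# Illustrative data 1: the ratio (only in B) / (in both) is 0.5 at both levels,
# so inclusion in A does not depend on the covariate; only inclusion in B does.
passive = {"level 1": (40, 60, 20), "level 2": (10, 90, 5)}

# Illustrative data 2: both ratios change with the level, so the covariate
# is related to inclusion in both registers.
active = {"level 1": (40, 60, 20), "level 2": (10, 90, 45)}

for name, data in (("passive", passive), ("active", active)):
    print(name, total_estimate(data), total_estimate(collapse(data)))
```

With these made-up counts, the first data set gives an estimate of 300 both with and without the covariate, whereas the second gives 700 with the covariate and 460 after collapsing over it. In the first case the covariate still splits the estimated 75 missed individuals into 30 and 45 by level, which is the kind of finer description the abstract attributes to passive covariates.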
First available in Project Euclid: 31 August 2012
van der Heijden, Peter G. M.; Whittaker, Joe; Cruyff, Maarten; Bakker, Bart; van der Vliet, Rik. People born in the Middle East but residing in the Netherlands: Invariant population size estimates and the role of active and passive covariates. Ann. Appl. Stat. 6 (2012), no. 3, 831--852. doi:10.1214/12-AOAS536. https://projecteuclid.org/euclid.aoas/1346418564
- Supplementary material: Estimation in R. We make use of the CAT-procedure in R (Meng and Rubin (1991); Schafer [(1997a), Chapters 7 and 8], (1997b)). The CAT-procedure is a routine for the analysis of categorical variable data sets with missing values. We describe our application of this procedure in detail in the supplemental article [van der Heijden et al. (2012)].