Open Access
A hierarchical Bayesian approach to record linkage and population size problems
Andrea Tancredi, Brunero Liseo
Ann. Appl. Stat. 5(2B): 1553-1585 (June 2011). DOI: 10.1214/10-AOAS447

Abstract

We propose and illustrate a hierarchical Bayesian approach for matching statistical records observed on different occasions. We show how this model can be profitably adopted both in record linkage problems and in capture–recapture setups, where the size of a finite population is the real object of interest. There are at least two important differences between the proposed model-based approach and the current practice in record linkage. First, the statistical model is built up on the actually observed categorical variables and no reduction (to 0–1 comparisons) of the available information takes place. Second, the hierarchical structure of the model allows a two-way propagation of the uncertainty between the parameter estimation step and the matching procedure so that no plug-in estimates are used and the correct uncertainty is accounted for both in estimating the population size and in performing the record linkage. We illustrate and motivate our proposal through a real data example and simulations.
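To make the link between matching and population size concrete, here is a minimal sketch of the classical two-list Lincoln-Petersen estimator. It is not the hierarchical Bayesian model proposed in the paper; it is a textbook plug-in estimator, shown only to illustrate why errors in the matching step carry over to the population size estimate. The function name and all counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical two-list population size estimate.

    n1, n2: number of records on each list (hypothetical counts)
    m:      number of records judged to be matches across the lists
    """
    if m <= 0:
        raise ValueError("at least one match is needed for the estimate")
    if m > min(n1, n2):
        raise ValueError("matches cannot exceed the size of either list")
    return n1 * n2 / m


if __name__ == "__main__":
    # Illustrative numbers only: two lists and the matches found between them.
    n1, n2 = 500, 400

    # Estimate when all 200 true matches are found.
    print(lincoln_petersen(n1, n2, 200))

    # Estimate when the linkage step misses 10% of the true matches.
    print(lincoln_petersen(n1, n2, 180))
```

Because the number of matches sits in the denominator, missed matches inflate the estimate and false matches deflate it. A plug-in approach of this kind treats the match count as known, which is the source of unaccounted uncertainty that the abstract says the hierarchical model is designed to avoid.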

Citation

Andrea Tancredi, Brunero Liseo. "A hierarchical Bayesian approach to record linkage and population size problems." Ann. Appl. Stat. 5(2B): 1553-1585, June 2011. https://doi.org/10.1214/10-AOAS447

Information

Published: June 2011
First available in Project Euclid: 13 July 2011

zbMATH: 1223.62015
MathSciNet: MR2849786
Digital Object Identifier: 10.1214/10-AOAS447

Keywords: Capture–recapture methods, Conditional independence, Gibbs sampling, Metropolis–Hastings, Record linkage

Rights: Copyright © 2011 Institute of Mathematical Statistics
