Abstract
Multivariate data summarized over areal units (counties, zip codes, etc.) are common in the field of public health. Estimation or testing of geographic boundaries for such data may have varied goals. For example, for data on multiple disease outcomes, we may be interested in a single set of "composite" boundaries for all diseases, separate boundaries for each disease, or both. Different areal wombling (boundary analysis) techniques are needed to meet these different requirements. But in any case, the underlying statistical model needs to account for correlations across both diseases and locations. Utilizing recent developments in multivariate conditionally autoregressive (MCAR) distributions and spatial structural equation modeling, we suggest a variety of Bayesian hierarchical models for multivariate areal boundary analysis, including some that incorporate random neighborhood structure. Many of our models can be implemented via standard software, namely WinBUGS for posterior sampling and R for summarization and plotting. We illustrate our methods using Minnesota county-level esophagus, larynx, and lung cancer data, comparing models that account for both, only one, or neither of the aforementioned correlations. We identify both composite and cancer-specific boundaries, selecting the best statistical model using the DIC criterion. Our results indicate primary boundaries in both the composite and cancer-specific response surface separating the mining- and tourism-oriented northeast counties from the remainder of the state, as well as secondary (residual) boundaries in the Twin Cities metro area.
Citation
Bradley P. Carlin. Haijun Ma. "Bayesian multivariate areal wombling for multiple disease boundary analysis." Bayesian Anal. 2 (2) 281 - 302, June 2007. https://doi.org/10.1214/07-BA211