The Annals of Applied Statistics

Bayesian nonparametric multiresolution estimation for the American Community Survey

Terrance D. Savitsky

Abstract

Bayesian hierarchical methods implemented for small area estimation focus on reducing the noise variation in published government official statistics by borrowing information among dependent response values. Even the most flexible models confine parameters defined at the finest scale to link to each data observation in a one-to-one construction. We propose a Bayesian multiresolution formulation that utilizes an ensemble of observations at a variety of coarse scales in space and time to additively nest parameters we define at a finer scale, which serve as our focus for estimation. Our construction is motivated by and applied to the estimation of 1-year period employment totals, indexed by county, from statistics published at coarser areal domains and multi-year periods in the American Community Survey (ACS). We construct a nonparametric mixture of Gaussian processes as the prior on a set of regression coefficients of county-indexed latent functions over multiple survey years. We evaluate a modified Dirichlet process prior that incorporates county-year predictors as the mixing measure. Each county-year parameter of a latent function is estimated from multiple coarse-scale observations in space and time to which it links. The multiresolution formulation is evaluated on synthetic data and applied to the ACS.
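
The additive nesting of fine-scale parameters within coarse-scale observations can be illustrated with a small linear sketch. This is not the article's model: the function name `build_link_matrix`, the period-averaging weights, the toy dimensions, and the ridge-penalized least-squares solve (standing in for the nonparametric mixture of Gaussian processes prior) are all assumptions made for illustration only.

```python
import numpy as np


def build_link_matrix(n_counties, n_years, groups, window):
    """Map fine-scale county-year cells to coarse observations (hypothetical helper).

    groups: list of lists of county indices, one list per coarse areal domain.
    window: number of consecutive years pooled in each multi-year period.
    """
    rows = []
    for members in groups:
        for start in range(n_years - window + 1):
            row = np.zeros((n_counties, n_years))
            # Assumed weighting: sum over member counties of the
            # average over the years in the period.
            row[members, start:start + window] = 1.0 / window
            rows.append(row.ravel())
    return np.vstack(rows)


rng = np.random.default_rng(0)
n_counties, n_years = 4, 5

# Synthetic fine-scale (county-year) latent values; purely illustrative.
theta = rng.normal(loc=10.0, scale=2.0, size=(n_counties, n_years))

# Two coarse resolutions: county-level 3-year periods and
# all-county 1-year totals.
A = np.vstack([
    build_link_matrix(n_counties, n_years,
                      [[c] for c in range(n_counties)], window=3),
    build_link_matrix(n_counties, n_years,
                      [list(range(n_counties))], window=1),
])

# Each coarse observation is a noisy weighted sum of the cells it covers.
y = A @ theta.ravel() + rng.normal(scale=0.1, size=A.shape[0])

# Stand-in estimator: ridge-penalized least squares. The article instead
# places a Bayesian nonparametric prior on the fine-scale parameters.
lam = 1e-2
theta_hat = np.linalg.solve(
    A.T @ A + lam * np.eye(A.shape[1]), A.T @ y
).reshape(n_counties, n_years)
```

Each county-year cell appears in several rows of `A`, which mirrors the abstract's statement that each fine-scale parameter is estimated from the multiple coarse-scale observations to which it links. With fewer observations than cells, some form of prior or penalty is needed to identify the fine-scale values; the ridge term plays that role here only as a placeholder.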

Article information

Source
Ann. Appl. Stat., Volume 10, Number 4 (2016), 2157-2181.

Dates
Received: June 2015
Revised: July 2016
First available in Project Euclid: 5 January 2017

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1483606855

Digital Object Identifier
doi:10.1214/16-AOAS968

Mathematical Reviews number (MathSciNet)
MR3592052

Zentralblatt MATH identifier
06688772

Keywords
Survey sampling; small area estimation; latent models; Gaussian process; Dirichlet process; Bayesian hierarchical models; Markov chain Monte Carlo

Citation

Savitsky, Terrance D. Bayesian nonparametric multiresolution estimation for the American Community Survey. Ann. Appl. Stat. 10 (2016), no. 4, 2157–2181. doi:10.1214/16-AOAS968. https://projecteuclid.org/euclid.aoas/1483606855


Supplemental materials

  • Technical Appendices. The online supplement contains three technical appendices with detailed material on the following topics: 1. posterior computation; 2. posterior mixing; 3. simulation study 5-year county results.