Bayesian Analysis

Hyper-$g$ priors for generalized linear models

Daniel Sabanés Bové and Leonhard Held

Abstract

We develop an extension of Zellner's classical $g$-prior to generalized linear models. Any continuous proper hyperprior $f(g)$ can be used, giving rise to a large class of hyper-$g$ priors. Connections with the literature are described in detail. A fast and accurate integrated Laplace approximation of the marginal likelihood makes inference in large model spaces feasible. For posterior parameter estimation we propose an efficient and tuning-free Metropolis-Hastings sampler. The methodology is illustrated with variable selection and automatic covariate transformation in the Pima Indians diabetes data set.

Article information

Source
Bayesian Anal., Volume 6, Number 3 (2011), 387-410.

Dates
First available in Project Euclid: 13 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1339616469

Digital Object Identifier
doi:10.1214/11-BA615

Mathematical Reviews number (MathSciNet)
MR2843537

Zentralblatt MATH identifier
1330.62058

Subjects
Primary: 62C12: Empirical decision procedures; empirical Bayes procedures
Secondary: 62F15: Bayesian inference; 62J12: Generalized linear models; 62P10: Applications to biology and medical sciences

Keywords
$g$-prior; generalized linear model; integrated Laplace approximation; variable selection; fractional polynomials

Citation

Sabanés Bové, Daniel; Held, Leonhard. Hyper-$g$ priors for generalized linear models. Bayesian Anal. 6 (2011), no. 3, 387--410. doi:10.1214/11-BA615. https://projecteuclid.org/euclid.ba/1339616469

