Bayesian Analysis

Some Priors for Sparse Regression Modelling

Jim E. Griffin and Philip J. Brown



A wide range of methods, Bayesian and others, tackle regression when there are many variables. In the Bayesian context, the prior is constructed to reflect ideas of variable selection and to encourage appropriate shrinkage. The prior needs to be reasonably robust to different signal-to-noise structures. Two simple evergreen prior constructions stem from ridge regression on the one hand and g-priors on the other. We seek to embed recent ideas about sparsity of the regression coefficients and robustness into these priors. We also explore the gains that can be expected from these differing approaches.
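As a rough illustration of the two baseline constructions named above, the sketch below computes the posterior mean of the coefficients under the standard ridge prior, beta ~ N(0, (sigma^2 / lam) I), and under Zellner's g-prior, beta ~ N(0, g sigma^2 (X'X)^{-1}), both conditional on sigma^2. It shows only these textbook forms, not the sparse or robust extensions developed in the paper. The function names, the simulated data, and the fixed values of `lam` and `g` are assumptions made for the example.

```python
import numpy as np


def ridge_posterior_mean(X, y, lam):
    """Posterior mean under beta ~ N(0, (sigma^2 / lam) I): (X'X + lam I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def g_prior_posterior_mean(X, y, g):
    """Posterior mean under beta ~ N(0, g sigma^2 (X'X)^{-1}): g / (1 + g) times OLS."""
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    return (g / (1.0 + g)) * beta_ols


# Simulated data (hypothetical): n = 50 observations, p = 5 predictors,
# with only two non-zero true coefficients.
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
y = X @ beta_true + rng.standard_normal(n)

beta_ridge = ridge_posterior_mean(X, y, lam=1.0)  # lam chosen arbitrarily
beta_g = g_prior_posterior_mean(X, y, g=float(n))  # unit-information choice g = n
```

The ridge form shrinks each canonical direction by a different factor and remains defined when p > n, whereas this plain g-prior form shrinks every coefficient by the same factor g / (1 + g) and needs X'X to be invertible.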

Article information

Bayesian Anal., Volume 8, Number 3 (2013), 691-702.

First available in Project Euclid: 9 September 2013

Keywords: Correlated priors; Canonical reduction; Multiple regression; g-priors; Markov chain Monte Carlo; Normal-Gamma prior; p>n; Ridge regression; Robust priors; Sparsity


Griffin, Jim E.; Brown, Philip J. Some Priors for Sparse Regression Modelling. Bayesian Anal. 8 (2013), no. 3, 691–702. doi:10.1214/13-BA827.

