Electronic Journal of Statistics

Convergence properties of Gibbs samplers for Bayesian probit regression with proper priors

Saptarshi Chakraborty and Kshitij Khare

Full-text: Open access

Abstract

The Bayesian probit regression model (Albert and Chib [1]) is popular and widely used for binary regression. While the improper flat prior for the regression coefficients is an appropriate choice in the absence of any prior information, a proper normal prior is desirable when prior information is available or in modern high-dimensional settings where the number of coefficients ($p$) is greater than the sample size ($n$). For both choices of priors, the resulting posterior density is intractable and a Data Augmentation (DA) Markov chain is used to generate approximate samples from the posterior distribution. Establishing geometric ergodicity for this DA Markov chain is important as it provides theoretical guarantees for constructing standard errors for Markov chain-based estimates of posterior quantities. In this paper, we first show that in the case of proper normal priors, the DA Markov chain is geometrically ergodic for all choices of the design matrix $X$, $n$ and $p$ (unlike the improper prior case, where $n\geq p$ and another condition on $X$ are required for posterior propriety itself). We also derive sufficient conditions under which the DA Markov chain is trace-class, i.e., the eigenvalues of the corresponding operator are summable. In particular, this allows us to conclude that the Haar PX-DA sandwich algorithm (obtained by inserting an inexpensive extra step in between the two steps of the DA algorithm) is strictly better than the DA algorithm in an appropriate sense.
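
As a rough illustration of the two-step DA scheme mentioned in the abstract, the sketch below assumes the standard latent-variable construction of Albert and Chib [1] with a proper normal prior $\beta \sim N_p(b_0, Q_0^{-1})$. The names `da_probit`, `draw_latent`, `b0` and `Q0` are illustrative choices and do not come from the paper. The truncated normal draws use a simple inverse-CDF method that is not reliable far in the tails, and the Haar PX-DA extra step is not implemented.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf / inverse cdf


def draw_latent(mu, y, rng, eps=1e-12):
    """Step 1: z_i ~ N(mu_i, 1), truncated to (0, inf) if y_i == 1, else (-inf, 0]."""
    z = np.empty(len(mu))
    for i, (m, yi) in enumerate(zip(mu, y)):
        m = float(m)
        p0 = _STD.cdf(-m)                 # P(z_i <= 0) under N(m, 1)
        u = rng.uniform()
        q = p0 + u * (1.0 - p0) if yi == 1 else u * p0
        q = min(max(q, eps), 1.0 - eps)   # assumption: crude guard for inv_cdf
        z[i] = m + _STD.inv_cdf(q)
    return z


def da_probit(X, y, b0, Q0, n_iter=2000, seed=0):
    """Alternate z | beta, y and beta | z; return the beta draws (n_iter x p)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    prec = X.T @ X + Q0                   # precision of beta | z; positive definite
    cov = np.linalg.inv(prec)             # even when p > n, because Q0 is
    chol = np.linalg.cholesky(cov)
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        z = draw_latent(X @ beta, y, rng)                 # step 1
        mean = cov @ (X.T @ z + Q0 @ b0)
        beta = mean + chol @ rng.standard_normal(p)       # step 2
        draws[t] = beta
    return draws
```

Because `Q0` is positive definite, the matrix `X.T @ X + Q0` can be inverted for any design matrix, so the sketch runs unchanged when $p > n$, the setting the abstract highlights for proper priors.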

Note

When this article was first made public, on February 1, 2017, Kshitij Khare's funding information was left out of the article. The article was corrected on August 30, 2019.

Article information

Source
Electron. J. Statist., Volume 11, Number 1 (2017), 177-210.

Dates
Received: March 2016
First available in Project Euclid: 1 February 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1485939612

Digital Object Identifier
doi:10.1214/16-EJS1219

Mathematical Reviews number (MathSciNet)
MR3604022

Zentralblatt MATH identifier
1366.60093

Subjects
Primary: 60J05: Discrete-time Markov processes on general state spaces; 60J20: Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) [See also 90B30, 91D10, 91D35, 91E40]
Secondary: 33C10: Bessel and Airy functions, cylinder functions, $_0F_1$

Keywords
Bayesian probit model; binary regression; geometric ergodicity; proper normal prior; trace class; sandwich algorithms; Data Augmentation; Markov chain Monte Carlo

Rights
Creative Commons Attribution 4.0 International License.

Citation

Chakraborty, Saptarshi; Khare, Kshitij. Convergence properties of Gibbs samplers for Bayesian probit regression with proper priors. Electron. J. Statist. 11 (2017), no. 1, 177--210. doi:10.1214/16-EJS1219. https://projecteuclid.org/euclid.ejs/1485939612



References

  • [1] Albert, J.H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc., 88(422):669–679.
  • [2] Asmussen, S. and Glynn, P.W. (2011). A new proof of convergence of MCMC via the ergodic theorem. Statistics & Probability Letters, 81(10):1482–1485.
  • [3] Birnbaum, Z.W. (1942). An inequality for Mill’s ratio. Ann. Math. Statist., 13(2):245–246.
  • [4] Botev, Z.I. (2015). TruncatedNormal: Truncated Multivariate Normal. R package version 1.0.
  • [5] Chan, K.S. and Geyer, C.J. (1994). Discussion: Markov chains for exploring posterior distributions. Ann. Statist., 22(4):1747–1758.
  • [6] Chen, L.H. and Shao, Q.-M. (2000). Propriety of posterior distribution for dichotomous quantal response models. Proceedings of the American Mathematical Society, 129:293–302.
  • [7] Flegal, J.M. and Jones, G.L. (2010). Batch means and spectral variance estimators in Markov chain Monte Carlo. Ann. Statist., 38(2):1034–1070.
  • [8] Hobert, J.P. and Marchev, D. (2008). A theoretical comparison of the data augmentation, marginal augmentation and PX-DA algorithms. Ann. Statist., 36(2):532–554.
  • [9] Jones, G., Haran, M., Caffo, B., and Neath, R. (2006). Fixed-width output analysis for Markov chain Monte Carlo. J. Amer. Statist. Assoc., 101:1537–1547.
  • [10] Jörgens, K. (1982). Linear integral operators. Surveys and reference works in mathematics. Pitman Advanced Pub. Program.
  • [11] Khare, K. and Hobert, J.P. (2011). A spectral analytic comparison of trace-class Data Augmentation algorithms and their sandwich variants. Ann. Statist., 39(5):2585–2606.
  • [12] Liu, J.S. and Wu, Y.N. (1999). Parameter expansion for Data Augmentation. J. Amer. Statist. Assoc., 94(448):1264–1274.
  • [13] Meng, X.-L. and Van Dyk, D.A. (1999). Seeking efficient Data Augmentation schemes via conditional and Marginal Augmentation. Biometrika, 86(2):301–320.
  • [14] Meyn, S. and Tweedie, R. (1996). Markov Chains and Stochastic Stability. Communications and Control Engineering. Springer London.
  • [15] Mykland, P., Tierney, L., and Yu, B. (1995). Regeneration in Markov chain samplers. Journal of the American Statistical Association, 90(429):233–241.
  • [16] Pal, S., Khare, K., and Hobert, J.P. (2015). Improving the Data Augmentation algorithm in the two-block setup. Journal of Computational and Graphical Statistics, 24(4):1114–1133.
  • [17] R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
  • [18] Robert, C.P. (1995). Convergence control methods for Markov chain Monte Carlo algorithms. Statist. Sci., 10(3):231–253.
  • [19] Roberts, G. and Rosenthal, J. (1997). Geometric ergodicity and hybrid Markov chains. Electron. Commun. Probab., 2:13–25.
  • [20] Román, J.C. and Hobert, J.P. (2015). Geometric ergodicity of Gibbs samplers for Bayesian general linear mixed models with proper priors. Linear Algebra and its Applications, 473:54–77. Special issue on Statistics.
  • [21] Roy, V. (2012). Convergence rates for MCMC algorithms for a robust Bayesian binary regression model. Electron. J. Statist., 6:2463–2485.
  • [22] Roy, V. and Hobert, J.P. (2007). Convergence rates and asymptotic standard errors for Markov chain Monte Carlo algorithms for Bayesian probit regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(4):607–623.
  • [23] Trautmann, H., Steuer, D., Mersmann, O., and Bornkamp, B. (2014). truncnorm: Truncated normal distribution. R package version 1.0-7.
  • [24] van Dyk, D.A. and Meng, X.-L. (2001). The art of Data Augmentation. Journal of Computational and Graphical Statistics, 10(1):1–50.
  • [25] Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S. Springer, New York, fourth edition. ISBN 0-387-95457-0.
  • [26] Zellner, A. (1983). Applications of Bayesian analysis in Econometrics. Journal of the Royal Statistical Society. Series D (The Statistician), 32(1/2):23–34.