The Annals of Statistics
- Ann. Statist.
- Volume 44, Number 6 (2016), 2497-2532.
On the computational complexity of high-dimensional Bayesian variable selection
We study the computational complexity of Markov chain Monte Carlo (MCMC) methods for high-dimensional Bayesian linear regression under sparsity constraints. We first show that a Bayesian approach can achieve variable-selection consistency under relatively mild conditions on the design matrix. We then demonstrate that the statistical criterion of posterior concentration need not imply the computational desideratum of rapid mixing of the MCMC algorithm. By introducing a truncated sparsity prior for variable selection, we provide a set of conditions that guarantee both variable-selection consistency and rapid mixing of a particular Metropolis–Hastings algorithm. The mixing time is linear in the number of covariates up to a logarithmic factor. Our proof controls the spectral gap of the Markov chain by constructing a canonical path ensemble that is inspired by the steps taken by greedy algorithms for variable selection.
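As a rough illustration of the kind of sampler the abstract describes, the sketch below runs a Metropolis–Hastings chain over subsets of covariates under a sparsity prior truncated at a maximum model size. The abstract does not give these details, so the following are assumptions made only for this example: the Zellner g-prior form of the marginal likelihood, the prior proportional to p^(-kappa * |gamma|), the hyperparameter names `g`, `kappa` and `s_max`, the default `g = p**2`, and the symmetric flip/swap proposal. It should not be read as the authors' exact algorithm.

```python
import numpy as np


def log_posterior(gamma, X, y, g, kappa, s_max):
    """Unnormalized log posterior of the model selected by boolean mask `gamma`.

    Assumed form (not taken from the abstract): a g-prior marginal likelihood
    times a sparsity prior p^(-kappa * k), truncated to models of size k <= s_max.
    """
    n, p = X.shape
    k = int(gamma.sum())
    if k > s_max:
        return -np.inf  # truncation: oversized models get zero prior mass
    if k == 0:
        r2 = 0.0
    else:
        Xg = X[:, gamma]
        coef, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        fitted = Xg @ coef
        # Uncentered R^2 = ||projection of y||^2 / ||y||^2, clamped for round-off.
        r2 = min(float(fitted @ fitted) / float(y @ y), 1.0)
    log_prior = -kappa * k * np.log(p)
    log_marginal = -0.5 * k * np.log1p(g) - 0.5 * n * np.log1p(g * (1.0 - r2))
    return log_prior + log_marginal


def run_chain(X, y, n_iter=2000, g=None, kappa=1.0, s_max=10, seed=0):
    """Run the chain from the empty model; return per-covariate inclusion frequencies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    g = float(p) ** 2 if g is None else float(g)  # hypothetical default
    gamma = np.zeros(p, dtype=bool)
    cur = log_posterior(gamma, X, y, g, kappa, s_max)
    counts = np.zeros(p)
    for _ in range(n_iter):
        prop = gamma.copy()
        k = int(gamma.sum())
        if rng.random() < 0.5:
            # Single flip: add or remove one uniformly chosen covariate.
            j = rng.integers(p)
            prop[j] = not prop[j]
        elif 0 < k < p:
            # Swap: exchange one included covariate for one excluded covariate.
            i = rng.choice(np.flatnonzero(gamma))
            j = rng.choice(np.flatnonzero(~gamma))
            prop[i], prop[j] = False, True
        # Otherwise the proposal is the current state, which keeps it symmetric.
        new = log_posterior(prop, X, y, g, kappa, s_max)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.random()) < new - cur:
            gamma, cur = prop, new
        counts += gamma
    return counts / n_iter
```

Calling `run_chain(X, y, kappa=2.0, s_max=5)` on an `n`-by-`p` design matrix `X` and response vector `y` returns a length-`p` vector of estimated inclusion frequencies. Proposals that exceed `s_max` have zero posterior mass and are always rejected, which is how the truncation enters the chain.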
Received: May 2015
Revised: September 2015
First available in Project Euclid: 23 November 2016
Primary: 62F15: Bayesian inference
Secondary: 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)
Yang, Yun; Wainwright, Martin J.; Jordan, Michael I. On the computational complexity of high-dimensional Bayesian variable selection. Ann. Statist. 44 (2016), no. 6, 2497--2532. doi:10.1214/15-AOS1417. https://projecteuclid.org/euclid.aos/1479891626
- Supplement to “On the computational complexity of high-dimensional Bayesian variable selection”. Owing to space constraints, we have moved some materials and technical proofs to the Appendix, which is contained in the supplementary document.