Bayesian Analysis

A Bayesian Conjugate Gradient Method (with Discussion)

Jon Cockayne, Chris J. Oates, Ilse C.F. Ipsen, and Mark Girolami

Full-text: Open access

Abstract

A fundamental task in numerical computation is the solution of large linear systems. The conjugate gradient method is an iterative method which offers rapid convergence to the solution, particularly when an effective preconditioner is employed. However, for more challenging systems a substantial error can be present even after many iterations have been performed. The estimates obtained in this case are of little value unless further information can be provided about, for example, the magnitude of the error. In this paper we propose a novel statistical model for this error, set in a Bayesian framework. Our approach is a strict generalisation of the conjugate gradient method, which is recovered as the posterior mean for a particular choice of prior. The estimates obtained are analysed with Krylov subspace methods and a contraction result for the posterior is presented. The method is then analysed in a simulation study as well as being applied to a challenging problem in medical imaging.
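For orientation, the classical conjugate gradient iteration mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard, non-Bayesian method for a symmetric positive-definite system, not the authors' Bayesian algorithm. The function names, the tolerance, and the 2-by-2 test system are assumptions chosen for the example.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, x0=None, max_iter=None, tol=1e-10):
    """Classical CG for symmetric positive-definite A (illustrative sketch).

    Returns the final iterate and the list of all iterates, so that the
    error remaining after an early stop can be inspected.
    """
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]  # initial residual
    p = list(r)                                        # first search direction
    rs_old = dot(r, r)
    iterates = [list(x)]
    for _ in range(max_iter if max_iter is not None else n):
        if rs_old ** 0.5 < tol:  # residual norm small enough: stop
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                    # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        iterates.append(list(x))
        # New direction, A-conjugate to the previous ones
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, iterates


# Hypothetical 2-by-2 test system, chosen only for illustration
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterates = conjugate_gradient(A, b)
```

Stopping the loop early (a small `max_iter`) leaves an iterate whose error is unknown without further analysis. That is the situation the abstract describes as motivation for a statistical model of the error.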

Article information

Source
Bayesian Anal., Volume 14, Number 3 (2019), 937-1012.

Dates
First available in Project Euclid: 18 May 2019

Permanent link to this document
https://projecteuclid.org/euclid.ba/1558144846

Digital Object Identifier
doi:10.1214/19-BA1145

Mathematical Reviews number (MathSciNet)
MR4012393

Subjects
Primary:
62C10: Bayesian problems; characterization of Bayes procedures
62F15: Bayesian inference
65F10: Iterative methods for linear systems [See also 65N22]

Keywords
probabilistic numerics; linear systems; Krylov subspaces

Rights
Creative Commons Attribution 4.0 International License.

Citation

Cockayne, Jon; Oates, Chris J.; Ipsen, Ilse C.F.; Girolami, Mark. A Bayesian Conjugate Gradient Method (with Discussion). Bayesian Anal. 14 (2019), no. 3, 937–1012. doi:10.1214/19-BA1145. https://projecteuclid.org/euclid.ba/1558144846



