Statistical Science

Approximate Bayesian Computation and Simulation-Based Inference for Complex Stochastic Epidemic Models

Trevelyan J. McKinley, Ian Vernon, Ioannis Andrianakis, Nicky McCreesh, Jeremy E. Oakley, Rebecca N. Nsubuga, Michael Goldstein, and Richard G. White



Approximate Bayesian Computation (ABC) and other simulation-based inference methods are increasingly used for inference in complex systems, due to their relative ease of implementation. We briefly review some of the more popular variants of ABC and their application in epidemiology, before using a real-world model of HIV transmission to illustrate some of the challenges that arise when applying ABC methods to high-dimensional, computationally intensive models. We then discuss an alternative approach, history matching, that aims to address some of these issues, and conclude with a comparison between these different methodologies.
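To make the basic idea behind ABC concrete, the following is a minimal sketch of the simplest variant, ABC rejection sampling, applied to a toy epidemic model. It is not the HIV model or the algorithm used in the article: the Reed-Frost chain-binomial simulator, the uniform prior, the final-size summary statistic and every name and numerical value below are assumptions chosen purely for illustration.

```python
import random


def simulate_final_size(beta, n=100, i0=1, rng=random):
    """Toy Reed-Frost chain-binomial epidemic (an assumed stand-in model).

    Returns the total number of individuals ever infected.
    """
    s, i, total = n - i0, i0, i0
    while i > 0 and s > 0:
        # Probability that a susceptible is infected by at least one of i infectives.
        p_inf = 1.0 - (1.0 - beta) ** i
        new_i = sum(1 for _ in range(s) if rng.random() < p_inf)
        s -= new_i
        total += new_i
        i = new_i
    return total


def abc_rejection(observed, n_accept, tolerance, prior_upper=0.05, seed=1):
    """Keep prior draws whose simulated summary lies within `tolerance` of the data."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_accept:
        beta = rng.uniform(0.0, prior_upper)      # draw from the assumed uniform prior
        sim = simulate_final_size(beta, rng=rng)  # simulate a data set from the model
        if abs(sim - observed) <= tolerance:      # compare summary statistics
            accepted.append(beta)
    return accepted


# Hypothetical observation: 60 of 100 individuals were eventually infected.
samples = abc_rejection(observed=60, n_accept=200, tolerance=5)
posterior_mean = sum(samples) / len(samples)
```

The accepted values in `samples` approximate the posterior for `beta` given the summary statistic. Shrinking `tolerance` improves the approximation but lowers the acceptance rate, which is one reason the more elaborate sequential variants of ABC exist and why expensive simulators are difficult to handle this way.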

Article information

Statist. Sci., Volume 33, Number 1 (2018), 4-18.

First available in Project Euclid: 2 February 2018


Keywords: Approximate Bayesian Computation; history matching; emulation; Bayesian inference; infectious disease models


McKinley, Trevelyan J.; Vernon, Ian; Andrianakis, Ioannis; McCreesh, Nicky; Oakley, Jeremy E.; Nsubuga, Rebecca N.; Goldstein, Michael; White, Richard G. Approximate Bayesian Computation and Simulation-Based Inference for Complex Stochastic Epidemic Models. Statist. Sci. 33 (2018), no. 1, 4--18. doi:10.1214/17-STS618.




Supplemental materials

  • Supplement A: Bisection method. Details the bisection method used to generate tolerances at each generation of ABC.
  • Supplement B: Approximate posterior distributions for ABC vs. nonimplausible region for HM. Plots of the approximate posterior distributions after 11 generations of ABC, and depth plots after 9 waves of history matching. (Note that HM does not produce posterior samples; rather, these correspond to the densities of nonimplausible points.)
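
The notion of a nonimplausible point can be illustrated with a short sketch. It assumes the implausibility measure commonly used in the history matching literature: the distance between an observation and the emulator's expected output, scaled by the combined emulator, observation-error and model-discrepancy standard deviation, with a conventional cutoff of 3. The function names, the cutoff default and all numbers below are illustrative assumptions, not values taken from the article or its supplements.

```python
import math


def implausibility(z, em_mean, em_var, obs_var, disc_var):
    """Standardised distance between observation z and the emulator prediction.

    em_var, obs_var and disc_var are the (assumed) emulator, observation-error
    and model-discrepancy variances for this output.
    """
    total_var = em_var + obs_var + disc_var
    return abs(z - em_mean) / math.sqrt(total_var)


def is_nonimplausible(outputs, cutoff=3.0):
    """A point is kept only if its worst output falls below the cutoff.

    `outputs` is a list of (z, em_mean, em_var, obs_var, disc_var) tuples,
    one per model output being matched.
    """
    return max(implausibility(*o) for o in outputs) < cutoff


# Two hypothetical input points, each evaluated against two outputs.
candidate_a = [(0.10, 0.12, 1e-4, 1e-4, 2e-4), (0.25, 0.24, 4e-4, 1e-4, 4e-4)]
candidate_b = [(0.10, 0.30, 1e-4, 1e-4, 2e-4), (0.25, 0.24, 4e-4, 1e-4, 4e-4)]

keep_a = is_nonimplausible(candidate_a)  # both outputs are close to the data
keep_b = is_nonimplausible(candidate_b)  # first output is far from the data
```

Points that pass this test are not posterior samples; they are simply inputs that cannot yet be ruled out, which is why plots of the nonimplausible region are not directly comparable to ABC posterior densities.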