The Annals of Applied Statistics

Predictive modeling of cholera outbreaks in Bangladesh

Amanda A. Koepke, Ira M. Longini, Jr., M. Elizabeth Halloran, Jon Wakefield, and Vladimir N. Minin

Full-text: Open access


Despite seasonal cholera outbreaks in Bangladesh, little is known about the relationship between environmental conditions and cholera cases. We seek to develop a predictive model for cholera outbreaks in Bangladesh based on environmental predictors. To do this, we estimate the contribution of environmental variables, such as water depth and water temperature, to cholera outbreaks in the context of a disease transmission model. We implement a method which simultaneously accounts for disease dynamics and environmental variables in a Susceptible-Infected-Recovered-Susceptible (SIRS) model. The entire system is treated as a continuous-time hidden Markov model, where the hidden Markov states are the numbers of people who are susceptible, infected or recovered at each time point, and the observed states are the numbers of cholera cases reported. We use a Bayesian framework to fit this hidden SIRS model, implementing particle Markov chain Monte Carlo methods to sample from the posterior distribution of the environmental and transmission parameters given the observed data. We test this method using both simulation and data from Mathbaria, Bangladesh. Parameter estimates are used to make short-term predictions that capture the formation and decline of epidemic peaks. We demonstrate that our model can successfully predict an increase in the number of infected individuals in the population weeks before the observed number of cholera cases increases, which could allow for early notification of an epidemic and timely allocation of resources.
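The abstract's core idea (a hidden SIRS state, observed only through reported case counts, with a likelihood estimated by a particle filter) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. It replaces the continuous-time process with a discrete-time chain-binomial approximation, assumes a force of infection of the form beta * I / N + exp(a0 + a1 * covariate), and assumes reported cases are Binomial(I, rho). All function names, parameter names, and parameter values are hypothetical.

```python
import math
import random

# Hypothetical parameter values, chosen only so that the example runs.
EXAMPLE_PARAMS = {
    "beta": 0.5,   # person-to-person transmission rate
    "a0": -4.0,    # baseline log rate of environmental infection
    "a1": 1.0,     # effect of the environmental covariate
    "gamma": 0.3,  # recovery rate (I -> R)
    "mu": 0.05,    # rate of loss of immunity (R -> S)
    "rho": 0.5,    # probability that an infected person is reported
}


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sirs_step(rng, state, params, covariate, dt=1.0):
    """Advance (S, I, R) by one chain-binomial step; the total population is conserved."""
    s, i, r = state
    n = s + i + r
    # Assumed force of infection: human-to-human term plus environmental term.
    force = params["beta"] * i / n + math.exp(params["a0"] + params["a1"] * covariate)
    new_inf = binomial(rng, s, 1.0 - math.exp(-force * dt))
    new_rec = binomial(rng, i, 1.0 - math.exp(-params["gamma"] * dt))
    new_sus = binomial(rng, r, 1.0 - math.exp(-params["mu"] * dt))
    return (s - new_inf + new_sus, i + new_inf - new_rec, r + new_rec - new_sus)


def binom_logpmf(y, n, rho):
    """Log of the Binomial(n, rho) probability of observing y."""
    if y < 0 or y > n:
        return -math.inf
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + y * math.log(rho) + (n - y) * math.log(1.0 - rho)


def particle_loglik(rng, cases, covariates, params, init_state, n_particles=100):
    """Bootstrap particle filter estimate of the log-likelihood of the case counts."""
    particles = [init_state] * n_particles
    loglik = 0.0
    for y, c in zip(cases, covariates):
        # Propagate every particle through the hidden SIRS dynamics.
        particles = [sirs_step(rng, p, params, c) for p in particles]
        # Weight each particle by the probability of the observed count.
        logw = [binom_logpmf(y, p[1], params["rho"]) for p in particles]
        top = max(logw)
        if top == -math.inf:
            return -math.inf  # no particle can explain this observation
        w = [math.exp(lw - top) for lw in logw]
        loglik += top + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


if __name__ == "__main__":
    rng = random.Random(0)
    covariates = [0.1 * t for t in range(10)]  # made-up environmental series
    state, cases = (190, 10, 0), []
    for c in covariates:
        state = sirs_step(rng, state, EXAMPLE_PARAMS, c)
        cases.append(binomial(rng, state[1], EXAMPLE_PARAMS["rho"]))
    print(particle_loglik(rng, cases, covariates, EXAMPLE_PARAMS, (190, 10, 0)))
```

In a particle Markov chain Monte Carlo scheme, an estimate like the one returned by particle_loglik would stand in for the exact likelihood inside a Metropolis-Hastings acceptance ratio. The proposal distribution, priors, and tuning are omitted here.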

Article information

Ann. Appl. Stat., Volume 10, Number 2 (2016), 575-595.

Received: November 2014
Revised: December 2015
First available in Project Euclid: 22 July 2016

Keywords: Hidden Markov model; particle filter; MCMC; Bayesian; SIR


Koepke, Amanda A.; Longini, Jr., Ira M.; Halloran, M. Elizabeth; Wakefield, Jon; Minin, Vladimir N. Predictive modeling of cholera outbreaks in Bangladesh. Ann. Appl. Stat. 10 (2016), no. 2, 575–595. doi:10.1214/16-AOAS908.


Supplemental materials

Koepke, A. A., Longini, Jr., I. M., Halloran, M. E., Wakefield, J. and Minin, V. N. (2016). Supplement to “Predictive modeling of cholera outbreaks in Bangladesh.” DOI:10.1214/16-AOAS908SUPP.