The Annals of Applied Statistics

Reconstructing transmission trees for communicable diseases using densely sampled genetic data

Colin J. Worby, Philip D. O’Neill, Theodore Kypraios, Julie V. Robotham, Daniela De Angelis, Edward J. P. Cartwright, Sharon J. Peacock, and Ben S. Cooper

Full-text: Open access

Abstract

Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts. A data-augmented Markov chain Monte Carlo algorithm was used to sample over the transmission trees, providing a posterior probability for any given transmission route. We investigated the predictive performance of our methodology using simulated data, demonstrating high sensitivity and specificity, particularly for rapidly mutating pathogens with low transmissibility. We then analyzed data collected during an outbreak of methicillin-resistant Staphylococcus aureus in a hospital, identifying probable transmission routes and estimating epidemiological parameters. Our approach overcomes limitations of previous methods, providing a framework with the flexibility to allow for unobserved infection times, multiple independent introductions of the pathogen and within-host genetic diversity, as well as allowing forward simulation.
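
As a rough illustration of sampling over transmission trees to obtain a posterior probability for each transmission route, the following is a minimal sketch and not the authors' model. It makes several simplifying assumptions that are not in the paper: the order of infection is known, host 0 is the only introduction, each host has a uniform prior over earlier hosts as its source, and the SNP distance between an infector and infectee is Poisson distributed with a hypothetical rate `rate`. Unobserved infection times, and hence data augmentation, are omitted. The names `log_poisson` and `sample_trees` are invented for this example.

```python
import math
import random


def log_poisson(k, rate):
    """Log probability of k SNP differences under a Poisson(rate) model."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def sample_trees(dist, rate=2.0, n_iter=20000, seed=1):
    """Toy Metropolis sampler over transmission trees.

    dist[i][j] is the SNP distance between isolates from hosts i and j.
    Assumptions (illustrative only): hosts are ordered by known infection
    time, host 0 is the index case, and host j can only be infected by a
    host i < j. Returns post[j][i], the estimated posterior probability
    that host i infected host j.
    """
    rng = random.Random(seed)
    n = len(dist)
    source = [None] + [0] * (n - 1)  # start with every host infected by host 0
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        j = rng.randrange(1, n)    # pick a non-index host
        new = rng.randrange(0, j)  # propose a source uniformly among earlier hosts
        # Symmetric proposal and uniform prior: accept on the likelihood ratio.
        log_ratio = (log_poisson(dist[new][j], rate)
                     - log_poisson(dist[source[j]][j], rate))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            source[j] = new
        # Record the current tree once per iteration.
        for k in range(1, n):
            counts[k][source[k]] += 1
    return [[c / n_iter for c in row] for row in counts]


# Hypothetical SNP distances for four hosts (made-up numbers).
dist = [
    [0, 1, 6, 7],
    [1, 0, 2, 6],
    [6, 2, 0, 1],
    [7, 6, 1, 0],
]
post = sample_trees(dist)
for j in range(1, len(dist)):
    best = max(range(j), key=lambda i: post[j][i])
    print(f"host {j}: most probable source is host {best}")
```

Because the infection order is fixed here, the source of each host could be computed exactly without sampling; the sampler is shown only to convey how the proportion of iterations in which a route appears estimates its posterior probability.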

Article information

Source
Ann. Appl. Stat., Volume 10, Number 1 (2016), 395-417.

Dates
Received: July 2014
Revised: November 2015
First available in Project Euclid: 25 March 2016

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1458909921

Digital Object Identifier
doi:10.1214/15-AOAS898

Mathematical Reviews number (MathSciNet)
MR3480501

Zentralblatt MATH identifier
1358.62110

Keywords
Bayesian inference infectious disease epidemics outbreak investigation transmission routes

Citation

Worby, Colin J.; O’Neill, Philip D.; Kypraios, Theodore; Robotham, Julie V.; De Angelis, Daniela; Cartwright, Edward J. P.; Peacock, Sharon J.; Cooper, Ben S. Reconstructing transmission trees for communicable diseases using densely sampled genetic data. Ann. Appl. Stat. 10 (2016), no. 1, 395–417. doi:10.1214/15-AOAS898. https://projecteuclid.org/euclid.aoas/1458909921



