Bayesian Reconstruction of Disease Outbreaks by Combining Epidemiologic and Genomic Data

Open Access
Abstract
Recent years have seen progress in the development of statistically rigorous frameworks to infer outbreak transmission trees (“who infected whom”) from epidemiological and genetic data. Making use of pathogen genome sequences in such analyses remains a challenge, however, with a variety of heuristic approaches having been explored to date. We introduce a statistical method exploiting both pathogen sequences and collection dates to unravel the dynamics of densely sampled outbreaks. Our approach identifies likely transmission events and infers dates of infections, unobserved cases and separate introductions of the disease. It also proves useful for inferring numbers of secondary infections and identifying heterogeneous infectivity and super-spreaders. After testing our approach using simulations, we illustrate the method with the analysis of the beginning of the 2003 Singaporean outbreak of Severe Acute Respiratory Syndrome (SARS), providing new insights into the early stage of this epidemic. Our approach is the first tool for disease outbreak reconstruction from genetic data widely available as free software, the R package outbreaker. It is applicable to various densely sampled epidemics, and improves on previous approaches by detecting unobserved and imported cases, as well as allowing multiple introductions of the pathogen. Because of its generality, we believe this method will become a tool of choice for the analysis of densely sampled disease outbreaks, and will form a rigorous framework for subsequent methodological developments.

Author Summary
Understanding how infectious diseases are transmitted from one individual to another is essential for designing containment strategies and epidemic prevention. Recently, the reconstruction of transmission trees (“who infected whom”) has been revolutionized by the availability of pathogen genome sequences. Exploiting this information remains a challenge, however, with a variety of heuristic approaches having been explored to date. Here, we introduce a new method that uses both pathogen DNA and collection dates to gain insights into transmission events, and to detect unobserved cases and separate introductions of the disease. Our approach is also useful for identifying super-spreaders, i.e., cases that caused many subsequent infections. After testing our method using simulations, we use it to gain new insights into the beginning of the 2003 Singaporean outbreak of Severe Acute Respiratory Syndrome (SARS). Our approach is applicable to a wide range of diseases and is available in a free software package called outbreaker.
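To make the idea of combining collection dates with sequence data concrete, here is a deliberately simplified Python sketch. It is not the outbreaker model and does not use the outbreaker API. It picks, for each case, the single highest-scoring infector instead of sampling trees from a posterior distribution. It uses collection dates as a stand-in for infection dates, which the actual method infers. The toy cases, the discretised generation-time distribution, the per-site mutation probability, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy data: collection day and a short aligned sequence per case.
cases = [
    {"id": "A", "day": 0, "seq": "AAAAAAAAAA"},
    {"id": "B", "day": 3, "seq": "AAAAAAAAAT"},
    {"id": "C", "day": 5, "seq": "AAAAAAAATT"},
    {"id": "D", "day": 6, "seq": "AAAAAAAAAT"},
]

# Assumed generation-time distribution: P(delay between cases = d days).
GENERATION_TIME = {1: 0.1, 2: 0.3, 3: 0.3, 4: 0.2, 5: 0.1}
# Assumed per-site mutation probability per transmission event.
MUTATION_RATE = 0.05


def hamming(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def log_score(infector, infectee):
    """Log-likelihood of 'infector infected infectee' under the toy model."""
    delay = infectee["day"] - infector["day"]
    p_time = GENERATION_TIME.get(delay, 0.0)
    if p_time == 0.0:
        return -math.inf  # delay is impossible under the assumed distribution
    n_sites = len(infectee["seq"])
    n_mut = hamming(infector["seq"], infectee["seq"])
    log_genetic = (n_mut * math.log(MUTATION_RATE)
                   + (n_sites - n_mut) * math.log(1.0 - MUTATION_RATE))
    return math.log(p_time) + log_genetic


def best_infectors(cases):
    """Map each case id to its highest-scoring infector id, or None."""
    tree = {}
    for infectee in cases:
        scored = [(log_score(c, infectee), c["id"])
                  for c in cases if c["id"] != infectee["id"]]
        best_score, best_id = max(scored)
        # No plausible infector: treat the case as a possible introduction.
        tree[infectee["id"]] = best_id if best_score > -math.inf else None
    return tree


tree = best_infectors(cases)
# Count secondary infections per infector to flag potential super-spreaders.
secondary = Counter(v for v in tree.values() if v is not None)
print(tree)
print(secondary.most_common())
```

On this toy data, case A has no plausible infector and is flagged as a possible introduction, while case B is assigned two secondary cases. A full Bayesian treatment would report posterior support for each ancestry instead of a single best guess, and would also allow for unobserved intermediate cases.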