4. Modeling the Ebola Epidemic: Challenges and Lessons for the Future

Published on Mar 28, 2020


The mathematical analysis of infectious disease dynamics has a long history, going back as far as the late sixteenth and early seventeenth centuries, when plague epidemics raged through Europe.[1] Modern infectious disease epidemiology became established in the second half of the twentieth century[2] and has played a key role in understanding the spread of infectious diseases. At the center of this discipline lies the formulation of mathematical models that describe the contact between susceptible and infectious individuals in order to study the dynamics of transmission over time. Arguably the most important quantity in infectious disease epidemiology is the basic reproduction number, R0.[3][4][5] This quantity is defined as the average number of new infections caused by a typical infected individual in a population that is completely susceptible (figure 4.1). Knowledge of this quantity provides crucial information about the potential impact of control interventions. For example, the newest estimates of R0 for measles lie around 30, meaning that one person infected with measles will on average transmit the infection to 30 other people in a population that is unvaccinated and has never before been exposed to the measles virus.[6][7] The high value of R0 for measles explains the high vaccination rate (~95%) that is required to eliminate measles in a population.[7] On the other hand, seasonal influenza virus has an R0 around 2, meaning that even small levels of effective vaccination, paired with hygiene measures, can limit transmission to a certain extent.[8]
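
The link between R0 and the vaccination coverage needed for elimination can be made explicit. The following is a minimal sketch, assuming the standard homogeneous-mixing result that transmission declines once the immune fraction exceeds 1 - 1/R0. The function name and the `efficacy` parameter are illustrative assumptions and do not come from the cited studies.

```python
def critical_coverage(r0: float, efficacy: float = 1.0) -> float:
    """Fraction of the population that must be vaccinated so that each case
    causes fewer than one new infection, assuming homogeneous mixing."""
    if not 0.0 < efficacy <= 1.0:
        raise ValueError("efficacy must be in (0, 1]")
    if r0 <= 1.0:
        return 0.0  # the infection cannot sustain itself even without vaccination
    return (1.0 - 1.0 / r0) / efficacy


# R0 values as quoted in the text above.
for label, r0 in [("measles", 30.0), ("seasonal influenza", 2.0)]:
    print(f"{label}: R0 = {r0:g}, critical coverage = {critical_coverage(r0):.1%}")
```

With the R0 values quoted above, this simple threshold gives roughly 97% for measles and 50% for seasonal influenza, which is in line with the very high coverage cited for measles elimination.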

Figure 4.1 Schematic illustration of the basic reproduction number, R0. When the population is completely susceptible and R0 = 2, the first infectious case (gray circle) generates on average two secondary cases, each of which in turn generates two additional cases on average, and so forth.

Until the Ebola virus disease (EVD) outbreak in West Africa, relatively little was known about the transmission dynamics of EVD. In 2004, Chowell et al.[9] published the first estimates of the basic reproduction number for two previous EVD outbreaks in the Democratic Republic of Congo (R0 = 1.8) and Uganda (R0 = 1.3). These relatively low estimates of R0 indicated that EVD is not highly transmissible unless individuals are in direct contact with body fluids of an infected person and that control interventions such as case isolation, quarantine, and contact tracing have the potential to reduce transmission significantly. Apart from a notable second study published by Legrand et al.,[10] the transmission dynamics of EVD received little attention in the scientific community. However, this changed dramatically with the emergence of EVD in West Africa, a region that had never reported an EVD outbreak before 2013.

First Months of the Outbreak

It is now believed that the first case of the EVD outbreak in West Africa was a two-year-old child from the prefecture of Guéckédou in Guinea.[11] The child likely contracted the virus from an animal reservoir and died on December 6, 2013, infecting other family members along the way. The outbreak remained unnoticed until March 2014, when teams of the health ministry and Médecins sans Frontières (MSF) started an investigation. One month later, the New England Journal of Medicine published the first study describing the emergence of EVD in Guinea.[11] The outbreak received limited attention in the following months before the rapid increase in the number of infected cases caused the World Health Organization (WHO) to declare the outbreak a public health emergency of international concern (PHEIC) on August 8, 2014.[12]

The initial lack of studies investigating the outbreak dynamics was rather atypical for an outbreak of an emergent infectious disease. Other outbreaks, such as severe acute respiratory syndrome coronavirus (SARS-CoV) in 2003, H1N1 pandemic influenza in 2009, and Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012 saw the rapid publication of early outbreak analyses with descriptions of the transmission dynamics and estimates of R0.[13][14][15][16] These studies helped assess epidemic potential and impact of control interventions in real time. In contrast, there was still inadequate understanding of the outbreak dynamics of EVD in the three affected countries by summer 2014. This was all the more surprising as the reported cumulative number of clinical cases and deaths climbed to 1,603 and 887 respectively by August 1, 2014,[17] including four cases from an additional outbreak caused by an infected air traveler who exported the disease to Nigeria via the international airport in Lagos.[18][19]

Early Studies on Transmission Dynamics

At the beginning of August 2014, there was no study yet describing the transmission dynamics and basic reproduction number for the West African EVD outbreak. However, several researchers started to collect data from various websites of the WHO and local ministries of health that had published case data, and compiled data files that allowed them to analyze the different epidemic curves in the affected countries. Althaus[17] constructed a mathematical model, similar to the first model developed by Chowell et al.,[9] with different compartments that describe the transmission of EVD from infected to susceptible individuals and how infected individuals recover from the disease or die. Fitting this model to the data resulted in the first estimates of the basic reproduction number of EVD for each country. Furthermore, the model provided insights into whether control interventions had already led to a reduction in transmission. The study's findings were reassuring as well as frightening. The values of R0 in Guinea, Sierra Leone, and Liberia ranged between 1.5 and 2.5.[17] They were very similar to the estimates from previous outbreaks,[9][10] indicating that the transmissibility of EVD had not changed and that implementing the control interventions that had been used previously would limit further spread. Indeed, the model results showed that a certain level of control had been achieved in Guinea and Sierra Leone during May and July 2014. In stark contrast, the results also showed that the epidemic was completely out of control in Liberia, where the number of infected cases and deaths due to EVD continued to grow exponentially and was doubling every two weeks (figure 4.2).
The study was initially published on arXiv, an open-access repository for electronic preprints, and appeared on September 2, 2014, in PLOS Currents: Outbreaks, a specialized scientific journal that undertakes rapid peer review and was designed for such emergency situations.[17] The study’s findings were corroborated by a number of other modeling studies that were published in the days and weeks that followed.[20][21][22][23]
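
A compartmental model of the kind described above can be sketched in a few lines. The version below is an illustration, not the published implementation: it assumes susceptible, exposed, and infectious compartments, a transmission rate that decays exponentially once control measures start on day `tau`, and fixed-step Runge-Kutta integration. All parameter names and values are hypothetical.

```python
import math


def seir_with_decaying_transmission(beta0, k, tau, sigma, gamma, cfr,
                                    population, days, dt=0.1):
    """Integrate an SEIR-type model with fixed-step RK4.

    beta0: transmission rate before control (per day)
    k:     rate at which transmission decays once control starts (per day)
    tau:   day on which control measures start
    sigma: 1 / incubation period; gamma: 1 / infectious period
    cfr:   fraction of infectious individuals who die
    Returns a list of (day, cumulative_cases, cumulative_deaths).
    """
    def beta(t):
        return beta0 if t < tau else beta0 * math.exp(-k * (t - tau))

    def deriv(t, y):
        s, e, i, _cases, _deaths = y
        new_infections = beta(t) * s * i / population
        return (
            -new_infections,
            new_infections - sigma * e,
            sigma * e - gamma * i,
            sigma * e,        # cumulative symptomatic cases
            cfr * gamma * i,  # cumulative deaths
        )

    y = (population - 1.0, 0.0, 1.0, 1.0, 0.0)  # one infectious index case
    steps_per_day = round(1 / dt)
    out = [(0, y[3], y[4])]
    t = 0.0
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            k1 = deriv(t, y)
            k2 = deriv(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k1)))
            k3 = deriv(t + dt / 2, tuple(a + dt / 2 * b for a, b in zip(y, k2)))
            k4 = deriv(t + dt, tuple(a + dt * b for a, b in zip(y, k3)))
            y = tuple(a + dt / 6 * (p + 2 * q + 2 * r + w)
                      for a, p, q, r, w in zip(y, k1, k2, k3, k4))
            t += dt
        out.append((day, y[3], y[4]))
    return out


# Hypothetical parameters: R0 = beta0 / gamma = 2, control starting on day 60.
uncontrolled = seir_with_decaying_transmission(0.4, 0.0, 60, 0.1, 0.2, 0.7, 1e6, 150)
controlled = seir_with_decaying_transmission(0.4, 0.02, 60, 0.1, 0.2, 0.7, 1e6, 150)
print(f"day 150 cases, no control:   {uncontrolled[-1][1]:.0f}")
print(f"day 150 cases, with control: {controlled[-1][1]:.0f}")
```

Fitting such a model means searching for the parameter values whose cumulative case and death curves best match the reported data. The fitting step is omitted here.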

Figure 4.2 Dynamics of Ebola virus disease (EVD) outbreak in Liberia up to the end of August 2014. Reported data of the cumulative numbers of infected cases and deaths are shown as circles and squares, respectively. The lines represent the fit of the mathematical model to the data. Figure adapted from Althaus.[17]

The WHO Ebola Response Team published their long-awaited study describing the outbreak dynamics in great detail in the second half of September 2014.[24] The infectious disease epidemiologists from the WHO and their collaborators had access to clinical data that included the dates of symptom onset and hospitalization, and the time at which the patients had contact with other persons who had EVD. This allowed the researchers to study infection characteristics such as the incubation period, which is the time between infection and onset of symptoms. Knowing the length of this period was important for assessing the duration during which case contacts needed to be followed up. Furthermore, the study provided a detailed picture of the generation time (time between infection in an index case and infection in a patient infected by said index case), which allowed for more accurate estimates of R0. Overall, the WHO study came to the conclusion that the infection characteristics of EVD were similar to what had been observed in smaller outbreaks during the past decades. Nevertheless, a pessimistic outlook on the course of the outbreak remained. Assuming no change in control measures that would lead to a decrease in the reproduction number, the authors predicted that the cumulative number of cases in all three countries could exceed 20,000 by the beginning of November 2014.[24] Another study used a similar method but extrapolated the number of cases until the end of 2014 and found that 77,181 to 277,124 cases would be expected by then.[22] Around the same time, the US Centers for Disease Control and Prevention published a report with the most catastrophic scenario. They argued that the true number of EVD cases could be 2.5 times higher than reported and calculated that Sierra Leone and Liberia could reach 1.4 million cases by January 20, 2015, if this correction factor for underreporting was taken into account.[25]
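
The role of the generation time in estimating R0 can be illustrated with a standard relationship between the exponential growth rate of an epidemic and its reproduction number. The sketch below assumes exponentially distributed incubation and infectious periods, as in a simple SEIR-type model. This is a simplifying assumption and not the method used in the WHO study, and the durations in the example are hypothetical.

```python
import math


def growth_rate_from_doubling_time(doubling_time_days: float) -> float:
    """Exponential growth rate (per day) implied by a given doubling time."""
    return math.log(2.0) / doubling_time_days


def r0_from_growth_rate(r: float, incubation_days: float,
                        infectious_days: float) -> float:
    """R0 implied by growth rate r when the incubation and infectious
    periods are exponentially distributed with the given means."""
    return (1.0 + r * incubation_days) * (1.0 + r * infectious_days)


# Doubling every two weeks, with hypothetical mean durations of 10 and 5 days.
r = growth_rate_from_doubling_time(14.0)
print(f"growth rate: {r:.4f} per day")
print(f"implied R0:  {r0_from_growth_rate(r, 10.0, 5.0):.2f}")
```

For the same observed growth rate, a longer generation time implies a higher R0, which is why detailed clinical data on the timing of infection and symptom onset sharpen the estimates.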

While such long-term forecasts are error-prone, all these studies clearly showed that the epidemic was spreading unchecked and growing exponentially in certain areas. With such large numbers of cases to be expected, would traditional control measures such as case isolation, quarantine, and contact tracing still be feasible? Or was there a critical point beyond which it would prove almost impossible to manage the outbreak? Several studies approached these questions by incorporating various control interventions into their mathematical models. For example, it is possible to simulate isolation of infected patients by moving infected individuals into another compartment where they cannot transmit the disease further. These real-time studies provided important insights into the proportion of patients that needed to be hospitalized,[26] the required number of hospital beds,[27] and the benefits and risks of introducing community care centers to isolate suspected cases.[28]
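
The idea of an isolation compartment can be sketched as follows. This is a minimal illustration under stated assumptions, not a reconstruction of any of the cited models: infectious individuals are isolated at a constant rate, isolated individuals do not transmit at all, and the equations are integrated with a simple forward Euler scheme. All names and parameter values are hypothetical.

```python
def simulate_with_isolation(beta, sigma, gamma, isolation_rate,
                            population, days, dt=0.05):
    """SEIR model plus an isolated compartment H that does not transmit.

    Infectious individuals (I) leave the community either by recovery or
    death (rate gamma) or by isolation (rate isolation_rate).
    Returns the cumulative number of infections after `days`.
    """
    s, e, i, h = population - 1.0, 0.0, 1.0, 0.0
    cumulative = 1.0
    for _ in range(round(days / dt)):
        new_inf = beta * s * i / population * dt
        onset = sigma * e * dt
        removed = gamma * i * dt
        isolated = isolation_rate * i * dt
        s -= new_inf
        e += new_inf - onset
        i += onset - removed - isolated
        h += isolated - gamma * h * dt  # isolated cases recover or die in care
        cumulative += new_inf
    return cumulative


def reproduction_number(beta, gamma, isolation_rate):
    """Secondary cases per case: transmission rate times the mean time
    an infectious individual spends in the community."""
    return beta / (gamma + isolation_rate)


# Hypothetical values: without isolation R = 2, with isolation R = 0.8.
for rate in (0.0, 0.3):
    total = simulate_with_isolation(0.4, 0.1, 0.2, rate, 1e6, 150)
    print(f"isolation rate {rate}: R = {reproduction_number(0.4, 0.2, rate):.2f}, "
          f"infections by day 150 = {total:.0f}")
```

In this simplified setting the outbreak can only be contained if isolation is fast enough to push the reproduction number below 1, which links the question of feasibility directly to the speed of case detection and the number of available beds.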

Another factor of uncertainty was the impact of traditional funeral practices—which can involve washing, touching, or kissing of the body—on EVD transmission. The journal Science published a controversial modeling study suggesting that funeral transmission alone could sustain the epidemic in Liberia.[29] However, it is exceedingly difficult to quantify the separate contribution of community, hospital, and funeral transmission,[30] and the result of this particular study was based on various model assumptions about the risk of acquiring the infection at a funeral of a person who died of EVD. Other studies—of which some were based on epidemiological contact tracing—showed that the amount of funeral transmission was minor and contributed around 10% to overall transmission.[24][31][32]

Concomitant with the publication of these modeling studies, international aid to contain the epidemic in West Africa increased dramatically. The weekly numbers of new EVD cases that were reported started to decline after October 2014. How much of this decline can be attributed to the increase in health care capacities remains a matter of debate. Additional factors, such as behavior change in the population due to increased awareness, could have led to a reduction in transmission and might have supported the effect of the newly introduced control interventions.


The website of the WHO Regional Office for Africa began publishing regular updates about the epidemiology and surveillance of the outbreak after March 2014.[33] While this proved useful as scientists around the world could access the data, there were several major problems. First, there was considerable uncertainty around the data, as they were not assembled in a coordinated manner. Second, not all reported cases were laboratory-confirmed, and it was unclear whether this led to over- or underreporting of the actual number of cases. Third, the case numbers were mostly reported as cumulative numbers and at irregular intervals. This, and the fact that cases were sometimes reclassified as non-cases, complicated the calculation of the number of new infections observed every week. Finally, the data were presented as text or in tables in HTML or PDF documents. This made it difficult to automatically download the data using customized software tools. Caitlin Rivers, who was at the time a graduate student in computational epidemiology at Virginia Tech, aggregated the available data from the WHO outbreak news, situation reports, and the local ministries of health into machine-readable files and made them publicly available on her GitHub repository.[34] This made it much easier for other research groups to quickly access the most recent data to analyze the outbreak dynamics. At the peak of the epidemic, the WHO redesigned their website and began publishing up-to-date situation reports,[35] making it substantially easier to access and analyze the epidemiological data in real time. If these features had been available during spring 2014, researchers would have been able to analyze the epidemic trajectory much earlier. An understanding of the scale of the outbreak early on, and showing that the epidemic was out of control in some areas, might have helped inspire an earlier international response to the outbreak. 
This missed opportunity was mentioned in a report from the WHO Ebola Interim Assessment Panel published in July 2015, which stated that “data were not aggregated, analysed or shared in a timely manner and in some cases not at all.”[36]
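
The difficulty of deriving new infections from cumulative counts reported at irregular intervals can be illustrated with a short sketch. The function below is a hypothetical helper, and the numbers are made up for illustration and are not actual WHO figures. Negative increments, which arise when cases are reclassified as non-cases, are kept visible rather than silently clipped.

```python
from datetime import date


def incidence_from_cumulative(reports):
    """Turn (report_date, cumulative_cases) pairs into per-interval incidence.

    Returns (start, end, new_cases, cases_per_week) tuples. A negative
    new_cases value signals that earlier cases were reclassified; it is
    kept so that the analyst can decide how to handle it.
    """
    ordered = sorted(reports)
    rows = []
    for (d0, c0), (d1, c1) in zip(ordered, ordered[1:]):
        days = (d1 - d0).days
        if days == 0:
            continue  # two reports for the same date; skip the pair
        new_cases = c1 - c0
        rows.append((d0, d1, new_cases, 7.0 * new_cases / days))
    return rows


# Made-up numbers for illustration only.
reports = [
    (date(2014, 7, 1), 100),
    (date(2014, 7, 4), 118),
    (date(2014, 7, 12), 110),  # count revised downward after reclassification
    (date(2014, 7, 14), 150),
]
for start, end, new_cases, weekly in incidence_from_cumulative(reports):
    flag = "  <- reclassification?" if new_cases < 0 else ""
    print(f"{start} to {end}: {new_cases:+d} cases ({weekly:.1f} per week){flag}")
```

Scaling each increment to a weekly rate makes intervals of different length comparable, but it cannot recover information that the irregular reporting never captured.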

Another challenge is the inherent uncertainty in analyzing and interpreting epidemiological data and in making predictions about the future course of an epidemic. Small changes in model assumptions and parameters can lead to wildly different outcomes, in particular for long-term model projections as discussed above. Nevertheless, such models can still be useful to study worst-case scenarios and the type of interventions that would be needed to prevent them from happening. Estimating the probability of rare events also proves difficult, in particular if there is no previous information about such events. For example, several studies assessed the potential for international spread of EVD through air travel.[20][37][38] These studies made use of worldwide airline passenger data and came to the conclusion that the short-term probability of international spread was small but not negligible. Indeed, infected cases that traveled outside West Africa spread EVD to several countries. But to predict exactly which country would be affected proved nearly impossible. These two examples—epidemic forecasting and assessing the potential for international spread—illustrate the potential and limitations of mathematical models.
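
The sensitivity of long-term projections to small changes in assumptions can be shown with the simplest possible forecast, pure exponential growth. The sketch below assumes that growth continues unchanged over the whole horizon. The starting count and the doubling times are hypothetical and chosen only to show the spread.

```python
def project_cases(current_cases: float, doubling_time_days: float,
                  horizon_days: float) -> float:
    """Cumulative cases after horizon_days if growth stays purely exponential."""
    return current_cases * 2.0 ** (horizon_days / doubling_time_days)


# Hypothetical starting point of 1,000 cases and three plausible doubling times.
for doubling_time in (14, 21, 28):
    short = project_cases(1000, doubling_time, 28)
    long = project_cases(1000, doubling_time, 168)
    print(f"doubling every {doubling_time} d: "
          f"4 weeks -> {short:,.0f}, 24 weeks -> {long:,.0f}")
```

Between the fastest and slowest scenario, the four-week projections differ by a factor of 2, whereas the 24-week projections differ by a factor of 64. This is why short-term forecasts can be informative even when long-term projections are best read as scenarios.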

The ways in which some modeling studies were presented, interpreted, and communicated led to substantial criticism, and their use for public health policy making was sometimes met with resistance. A news piece in the journal Nature noted that model forecasts were not in line with the observed trajectory of the epidemic in Liberia and overestimated the number of cases during October 2014.[39] However, the projections cited in the piece were worst-case scenarios, and assumed that containment measures were ineffective and transmission did not change through other means. A group of modelers responded to this criticism and argued that focusing on the failure of models to project the epidemic accurately undervalues their other aims.[40] While the models played a role in informing the international response, they were also important tools for synthesizing and incorporating data from multiple sources to create a summary picture that could help guide decision makers during an epidemic.[41]

Lessons to Learn

Retrospectively, the EVD outbreak in West Africa has taught the scientific community several lessons on how to respond to similar events in the future. On one hand, the outbreak demonstrated how mathematical modeling of infectious diseases could provide crucial information for anticipating transmission dynamics and potential impact of control interventions. On the other hand, it became clear that several factors prevented the full potential of models from being realized during the outbreak. Some guidelines on what could be improved to enhance the use of models in outbreak situations follow:

  1. Epidemiological data should be readily accessible during the early phase of an outbreak. This requires a collaborative effort of local authorities and health ministries with international organizations such as the WHO. Rapid data sharing will allow experts in the field of mathematical and computational epidemiology to analyze the outbreak in real time and to provide recommendations for policy makers. The MERS-CoV outbreak in South Korea from May to July 2015 represents a promising example where the epidemic curve of the outbreak was published in real time.[42]

  2. Line lists that contain information about infected individuals such as age, sex, date of symptom onset, and infection outcome should be made available in machine-readable file formats, such as comma-separated values. This will facilitate the automated analysis of newly released data sets and prevent errors that could otherwise happen during data processing. Sharing of such information can pose privacy issues, and standardized protocols need to be established by the respective health agencies and ministries to better protect patients.[43]

  3. Alternative data sources that could improve our understanding of infectious disease transmission in a particular geographical area should be considered.[44] For example, maps of human mobility based on mobile phone network data[45] or high-resolution data on human population distributions[46] could help improve the parameterization of infectious disease models.

  4. The scientific research community should aim for rapid dissemination of their results and publish them via open-access journals, in addition to the use of digital repositories (e.g., GitHub). While it might be more prestigious to publish early outbreak analyses in leading subscription-based journals, publications in open-access journals often receive more citations, thus providing an incentive for academic researchers.[47] Before publication in peer-reviewed journals, manuscripts could be made available through the use of preprint servers such as arXiv, bioRxiv, or PeerJ Preprints.

  5. Mathematical modelers should be upfront about the potential and limitations of their models and results. Modelers should think carefully about what is really known about the transmission of the pathogen and the impact of interventions, as well as highlight the assumptions they made to come to their conclusions. In particular, modelers need to clearly distinguish between results that are inferred from data (data-driven) and results that are based on model assumptions (assumption-driven).

  6. The results of modeling studies should be relayed in a balanced way, particularly when communicating with the media. Scientists should do their best to clarify the inherent uncertainties around their results and projections and should make sure there is little room for misinterpretation of their statements.
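
To illustrate the second guideline, the sketch below shows how a line list in comma-separated values can be summarized automatically. The column names and the four records are invented for illustration and do not correspond to any real data set or reporting standard.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical line list; column names and records are invented.
LINE_LIST_CSV = """case_id,age,sex,onset_date,outcome
1,34,F,2014-08-04,died
2,27,M,2014-08-06,recovered
3,51,M,2014-08-13,died
4,8,F,2014-08-15,
"""


def summarize_line_list(text):
    """Count symptom onsets per ISO week and the fatality among resolved cases."""
    onsets_per_week = Counter()
    outcomes = Counter()
    for row in csv.DictReader(io.StringIO(text)):
        iso = date.fromisoformat(row["onset_date"]).isocalendar()
        onsets_per_week[(iso[0], iso[1])] += 1  # key: (ISO year, ISO week)
        if row["outcome"]:  # an empty field means the outcome is not yet known
            outcomes[row["outcome"]] += 1
    resolved = outcomes["died"] + outcomes["recovered"]
    fatality = outcomes["died"] / resolved if resolved else None
    return dict(onsets_per_week), fatality


weekly_onsets, fatality = summarize_line_list(LINE_LIST_CSV)
print("onsets per ISO week:", weekly_onsets)
print("fatality among resolved cases:", fatality)
```

Because the file has a fixed, documented structure, the same few lines can be rerun unchanged whenever an updated line list is released, which is the kind of automation that text or PDF reports prevent.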

The last decades have shown an increasing trend in emergent and re-emergent infectious diseases.[48] Climate change, increased population density, and human mobility will likely lead to new infectious disease outbreaks in the years to come. The points mentioned above highlight just a few important aspects, but taking them into consideration will likely result in a better use of mathematical modeling for future outbreaks of emerging and re-emerging infectious diseases. During the 2013–2016 outbreak of EVD, the modeling community was caught by surprise. If we don’t learn from the mistakes that we made, we could again face a situation where important insights from modeling studies appear a little too late.

Funding Statement

Christian L. Althaus received funding through an Ambizione grant from the Swiss National Science Foundation (SNSF).


I would like to thank my co-workers and collaborators, particularly Sandro Gsteiger and Fabienne Krauer, for contributing to the projects related to the Ebola outbreak in West Africa.

Copyright © 2016 Massachusetts Institute of Technology. (All rights reserved.)
