When does a minor outbreak become a major epidemic? Linking the risk from invading pathogens to practical definitions of a major epidemic

Forecasting whether or not initial reports of disease will be followed by a major epidemic is an important component of disease management, guiding optimal deployment of limited resources for surveillance and control. For example, the probability that undetected cases arriving in different countries would lead to a major epidemic was estimated during the 2014-16 Ebola epidemic in West Africa, and in the ongoing epidemic in DR Congo. Standard epidemic risk estimates involve assuming that infections occur according to a branching process. Surprisingly, however, these calculations are carried out without the factors differentiating major epidemics from minor outbreaks being defined precisely. We assess implications of this lack of explicitness by considering three practically relevant potential definitions of a major epidemic; namely, an outbreak in which: i) a large number of hosts are infected simultaneously; ii) a large number of infections occur in total; and iii) disease remains in the population for a long period. We show that the major epidemic probability under these definitions can be similar to, or different from, the standard estimate. This holds in a range of systems, highlighting that careful consideration of what constitutes a “major epidemic” in each outbreak is vital for accurate quantification of risk.


INTRODUCTION
Infectious disease epidemics in populations of humans, animals and plants represent a recurring risk worldwide [1-7]. An important question for policy-makers towards the start of an outbreak is whether initial cases will lead on to a major epidemic, or whether the pathogen will rapidly die out instead [8,9]. An important practical consequence is that, if an outbreak is likely to simply fade out, then costly interventions such as vaccination

where the square root accounts for the fact that it takes two generations for infected humans to generate new infections, since new infections require host-vector-host transmission [50,51]. We note that in some studies, e.g. [49], the square root is omitted from the definition of R0. In contrast to the expression calculated by Kucharski et al. [49], to facilitate simulation of the stochastic model we also explicitly track the total number of vectors, G, rather than the density.

Probability of a major epidemic (branching process estimate)

Standard estimate (stochastic SIS/SIR models)
The commonly used estimate for the probability of a major epidemic when a pathogen first arrives in a host population [8,9,21,27,29-35] can be derived by assuming that infections occur according to a branching process, under the assumptions that the susceptible population is large and that infection lineages arising from different infected hosts are independent. When a single infected host arrives in an otherwise susceptible population, the branching process estimate for the probability of a major epidemic is given by

$$\text{Prob(major epidemic)} = \begin{cases} 0 & \text{for } R_0 \le 1, \\ 1 - \dfrac{1}{R_0} & \text{for } R_0 > 1. \end{cases}$$

This expression is derived in Text S1.

If instead there are I(0) infected individuals initially rather than one, then for no major epidemic to occur, it is necessary for each initial infection lineage to die out, leading to the approximation given in equation (1)

$$\text{Prob(major epidemic)} = 1 - \left(\frac{1}{R_0}\right)^{I(0)} \quad \text{for } R_0 > 1. \qquad (1)$$

We denote the probability of no major epidemic occurring starting from i exposed or infectious human hosts, j exposed vectors and k infectious vectors by $q_{ijk}$. We must consider exposed and infectious vectors separately to account for the possibility that exposed vectors die before becoming infectious.

starting from a single exposed or infectious vector gives

We again assume that infection lineages are independent, permitting us to approximate terms with two exposed or infectious individuals by non-linear terms involving single exposed or infectious individuals, e.g. $q_{110} \approx q_{100}\, q_{010}$. Noting that $q_{000} = 1$, the three equations above can be solved to give expressions for $q_{100}$, $q_{010}$ and $q_{001}$. In particular, the probability of a major epidemic starting from a single infected host is then

$$1 - q_{100} = \begin{cases} 0 & \text{for } R_0 \le 1, \\ \dfrac{(R_0)^2 - 1}{(R_0)^2 + \text{5 GJ}} & \text{for } R_0 > 1. \end{cases}$$

We also consider the probability of a major epidemic according to the deterministic and stochastic SIS model for the "Total infections" and "Duration" definitions of a major epidemic. Under the "Total infections" definition, a major epidemic is assumed to be an outbreak in which at least F infections occur over the course of the outbreak. Under the "Duration" definition, a major epidemic is defined to be an outbreak that persists for at least T days.
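The standard branching process estimate discussed in this section can be written as a short function. The sketch below assumes the usual form of that estimate, 1 − (1/R0)^I(0) for R0 > 1 and zero otherwise; the function and argument names are illustrative and do not come from the paper.

```python
def branching_process_estimate(r0: float, initial_infected: int = 1) -> float:
    """Standard branching process estimate of the major epidemic probability.

    Assumes a large susceptible population and independent infection
    lineages, so that every one of the initial lineages must die out
    (each with probability 1/R0) for no major epidemic to occur.
    """
    if initial_infected < 1:
        raise ValueError("initial_infected must be at least 1")
    if r0 <= 1.0:
        # Below the threshold, every lineage dies out with probability one.
        return 0.0
    return 1.0 - (1.0 / r0) ** initial_infected
```

For example, `branching_process_estimate(2.0)` gives 0.5, and with three initial cases the estimate rises to 0.875.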

In the deterministic SIS model, whenever R0 > 1, the outbreak persists indefinitely with an infinite number of infection events. As a result, any outbreak is a major epidemic under the "Total infections" and "Duration" definitions.

In contrast, in any simulation of the stochastic SIS model, the number of infected individuals will always reach zero, even if this takes a long time. As a result, the probability of a major epidemic under the "Total infections" and "Duration" definitions is not simply one or zero depending on the value of R0. We approximate the probability of a major epidemic under these definitions by simulating the model 10,000 times using the Gillespie
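The simulation-based approximation can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a stochastic SIS model with infection rate βI(N − I)/N and recovery rate μI (so that R0 = β/μ), counts only infection events after the initial cases towards the "Total infections" threshold, and stops each run early once both thresholds have been passed. All names and parameter values are hypothetical.

```python
import random


def simulate_sis_outbreak(r0, n, mu, f_threshold, t_threshold, rng, i0=1):
    """Run one Gillespie simulation of the stochastic SIS model.

    Returns (total_infections, elapsed_time). The run stops at extinction,
    or early once both thresholds have been reached, because the
    classification of the outbreak cannot change after that point.
    """
    beta = r0 * mu            # infection rate parameter implied by R0 = beta / mu
    infected = i0
    elapsed = 0.0
    total_infections = 0      # infection events after the initial cases (assumed convention)
    while infected > 0:
        if total_infections >= f_threshold and elapsed >= t_threshold:
            break
        infection_rate = beta * infected * (n - infected) / n
        recovery_rate = mu * infected
        total_rate = infection_rate + recovery_rate
        elapsed += rng.expovariate(total_rate)        # waiting time to the next event
        if rng.random() * total_rate < infection_rate:
            infected += 1
            total_infections += 1
        else:
            infected -= 1
    return total_infections, elapsed


def estimate_major_epidemic_probabilities(r0, n, mu, f_threshold, t_threshold,
                                          runs=10_000, seed=1):
    """Estimate P(major epidemic) under the two definitions by repeated simulation."""
    rng = random.Random(seed)
    by_total = 0
    by_duration = 0
    for _ in range(runs):
        total, duration = simulate_sis_outbreak(
            r0, n, mu, f_threshold, t_threshold, rng)
        by_total += total >= f_threshold         # "Total infections" definition
        by_duration += duration >= t_threshold   # "Duration" definition
    return by_total / runs, by_duration / runs
```

Calling `estimate_major_epidemic_probabilities(2.0, 200, 0.2, 50, 20.0, runs=1000)` returns one estimated probability per definition. The early stop matters in practice because, for R0 > 1, an SIS outbreak that takes off can persist for a very long time before extinction.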

Under the stochastic SIR and Zika models, for R0 larger than and not close to one, the maximum number of simultaneously infected individuals whenever the pathogen invaded the host population was typically smaller than under the SIS model (cf. Fig 2). Nonetheless, we found qualitatively similar behaviour in these cases: the probability of a major epidemic approximated using a branching process corresponded to a wide range of values of the major epidemic threshold when R0 was high (Fig 4). However, even if that is the case, the practically relevant value of the major epidemic threshold (e.g. the number of available hospital beds) may not give a probability of a major epidemic that matches the branching process estimate. For example, if R0 = 2 and 250 beds are available, for the SIR model the probability of a major epidemic under the "Concurrent size" definition is 0 (solid grey line in Fig 4a), yet the branching process estimate for the probability of a major epidemic is 0.5 (dotted grey line in Fig 4a).

For the stochastic SIS model, we then calculated the probability of a major epidemic for different definitions of a major epidemic, specifically outbreaks in which there are at least F infection events (the "Total infections" definition, Fig 5a) or outbreaks that persist for at least T days (the "Duration" definition, Fig 5b).
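One way to see why the "Concurrent size" probability can be zero in the R0 = 2, 250-bed example above is to compare the bed threshold with the peak prevalence of the deterministic SIR model, which for an almost fully susceptible population of size N is N(1 − (1 + ln R0)/R0). The sketch below assumes N = 1000 purely for illustration; the population size is not stated in this excerpt, and the function name is hypothetical.

```python
import math


def deterministic_sir_peak(r0: float, n: float) -> float:
    """Peak number simultaneously infected in the deterministic SIR model.

    Assumes the outbreak starts with a negligible number of infected hosts
    in an otherwise fully susceptible population of size n.
    """
    if r0 <= 1.0:
        # No growth beyond the initial cases when R0 is at or below one.
        return 0.0
    return n * (1.0 - (1.0 + math.log(r0)) / r0)


# Assumed population size (not taken from the paper).
peak = deterministic_sir_peak(2.0, 1000)
print(f"Deterministic peak prevalence: {peak:.1f}")
```

Under this assumption the deterministic peak is roughly 153 hosts, well below 250, so stochastic outbreaks would very rarely exceed the bed threshold even though about half of introductions take off.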
result, in these specific cases (i.e. when the stochastic SIS model was used and a major epidemic was defined according to the "Total infections" or "Duration" definitions) it can be concluded that the branching process approximation often leads to sensible estimates of the risk posed by invading pathogens. Nonetheless, even in these cases, for small or very large values of the major epidemic thresholds the probability of a major epidemic does not match the branching process estimate, particularly when R0 is larger than but close to one (see e.g. red line in Fig 5b).

DISCUSSION
Determining the risk of an emerging outbreak developing into a major epidemic is vital for planning whether or not intervention and/or containment strategies will be necessary. When a pathogen arrives in a new location, the probability of a major epidemic can be approximated by assuming that infections occur according to a branching process. For simple models such as the stochastic SIS and SIR models, this leads to the probability of a major epidemic in equation (1). It is also possible to estimate the probability of a major epidemic according to the branching process approximation using models with more complexity, as we showed by considering the case of host-vector transmission (see equations (4) and (5)).

However, the branching process estimate for the probability of a major epidemic is not necessarily accurate when definitions of a major epidemic are used that address practical aspects of disease control (Fig 3). The branching process estimate corresponds to a range of choices of the epidemic thresholds in our definitions when R0 is much greater than one, or when the population size is extremely large (see e.g. different values of M in Fig 3a,b). However, when R0 is close to one and the population is not large, the standard estimate can correspond to a single choice of the epidemic threshold (see e.g. blue and red lines in Fig 3a, and Tables 1 and 2). For specific outbreaks, even when both R0 and the population size are large, the branching process estimate may not be relevant, since the range of choices of the major epidemic threshold that the standard estimate corresponds to may or may not include the specific threshold of practical importance in the outbreak, for example the number of hospital beds available.
Consequently, using the standard branching process estimate for the probability of a major epidemic could lead to the risk of a major epidemic being incorrectly assessed, potentially including scenarios in which a major epidemic develops when previously deemed unlikely. Our main conclusion, that the branching process estimate for the probability of a major epidemic may or may not match the true probability of a major epidemic when a practically relevant definition is used, holds for a range of epidemiological systems (Fig 4) as well as different definitions of a major epidemic that apply in alternative settings (Fig 5). We note that exactly how to contain one outbreak, and both epidemics are equally controllable, then it might be preferable to choose the one that is likely to generate more infections. In other real-world scenarios, alternative definitions might be appropriate. We also considered major epidemics defined as outbreaks that persist for a threshold length of time (the "Duration" definition). Different definitions of a major epidemic might appear contradictory: for example, treatment can act to reduce the total number of infections yet increase the duration of the outbreak [61], making a major epidemic less likely under the "Total infections" definition of a major epidemic but more likely under the "Duration" definition.

Our intention here was to use very simple models to demonstrate the principle that different definitions of a major epidemic lead to different probabilities of a major epidemic. Although simple models are commonly used, accurate outbreak forecasts require a model carefully matched to the epidemiology of the host-pathogen system, potentially including asymptomatic transmission [9,62,63] or spread between spatially distinct regions [29,64]. For certain definitions, it may be necessary to include convalescent hosts in the model explicitly.
For example, if convalescent individuals require resources, such as beds in treatment rooms or hospitals, and the definition of a major epidemic is linked to the availability of resources (as in the case of the "Concurrent size" definition), then these individuals should be modelled, potentially by including them in a new compartment following the infectious class. More complex definitions of major epidemics could also be used, for example requiring multiple criteria to be satisfied for an outbreak to be classified as a major epidemic. In these more complicated scenarios, analytic calculations of the probability of a major epidemic might not be possible. Model simulations can then be used to assess the probability of a major epidemic, as we showed for a host-vector model of Zika virus transmission (Fig 4b).

We note that practical use of the methods presented here at the start of an emerging outbreak to assess the major epidemic risk would require the wide range of interventions that are introduced in outbreak response settings to be integrated into the models explicitly. One way in which control can be included is to consider the effective