Epidemic spreading is well understood when a disease propagates around a contact graph. In a stochastic susceptible–infected–susceptible setting, spectral conditions characterize whether the disease vanishes. However, modelling human interactions using a graph is a simplification which only considers pairwise relationships. This does not fully represent the more realistic case where people meet in groups. Hyperedges can be used to record higher order interactions, yielding more faithful and flexible models and allowing for the rate of infection of a node to depend on group size and also to vary as a nonlinear function of the number of infectious neighbours. We discuss different types of contagion models in this hypergraph setting and derive spectral conditions that characterize whether the disease vanishes. We study both the exact individual-level stochastic model and a deterministic mean field ODE approximation. Numerical simulations are provided to illustrate the analysis. We also interpret our results and show how the hypergraph model allows us to distinguish between contributions to infectiousness that (i) are inherent in the nature of the pathogen and (ii) arise from behavioural choices (such as social distancing, increased hygiene and use of masks). This raises the possibility of more accurately quantifying the effect of interventions that are designed to contain the spread of a virus.
Compartmental models for disease propagation have a long and illustrious history [1,2], and they remain a fundamental predictive tool [3,4]. For a stochastic, individual-level model, it has been suggested recently that hyperedge information should be incorporated [5–7]. Hyperedges allow us to account directly for group interactions of any size, rather than, as in, for example, [8–10], treating them as a collection of essentially independent pairwise encounters.
In this work, we contribute to the modelling and analysis of disease spreading on a hypergraph. We include the case where the number of infected nodes in a hyperedge contributes nonlinearly to the overall infection rate; this covers the so-called collective contagion model setting and a new alternative that we call a collective suppression model. The main contributions of our work are
a mean field approximation ((4.1) and (4.2)) with a spectral condition for local asymptotic stability of the zero-infection state (theorem 6.1) and an extension to global asymptotic stability (theorem 6.5) when the nonlinear infection function is concave,
for the exact, individual-level model, a spectral condition for exponential decay of the non-extinction probability in the concave case (theorem 8.1) and a spectral bound on the expected time to extinction (corollary 8.2),
extensions of these results to more general partitioned hypergraph models, where distinct infection rates apply to different categories of hyperedge (theorems 6.2, 6.6 and 8.3 and corollary 8.4),
results for the non-concave collective contagion model (theorems 9.1 and 9.2),
a complementary condition that rules out extinction of the disease (theorem 8.6),
interpretations of these mathematical results: the spectral thresholds for disease extinction naturally distinguish between the inherent biological infectiousness of the disease and behavioural choices of the individuals in the population, allowing us to account for intervention strategies (§10).
The paper is organized as follows. In §2, we introduce the traditional graph-based susceptible–infected–susceptible (SIS) model and quote a spectral condition that characterizes control of the disease. We then discuss the generalization to hyperedges, and motivate the use of infection rates that do not scale linearly with respect to the number of infected neighbours. Section 3 formalizes the hypergraph model and shows how it may be simulated. In §4, we derive a mean field approximation to characterize the behaviour of the model, and in §5 we give computational results to illustrate its relevance. Section 6 analyses the deterministic mean field setting and gives a spectral condition for long-term decay of the disease. The result is local for general infection rates and global (independent of the initial condition) for the concave case. This spectral condition generalizes a well-known result concerning disease propagation on a graph. Section 7 provides further computational simulations to illustrate the spectral threshold. The full stochastic model is then studied in §8, where we extend the analysis in  to our hypergraph setting. Here, we study extinction of the disease in the case where the nonlinearity in the infection rate is concave. We also derive conditions for non-extinction. In §9, we extend our analysis to the so-called collective contagion model proposed in [5–7]. In §10, we summarize and interpret our results and discuss follow-on work.
For a review of recent studies of spreading processes on hypergraphs, including the dissemination of rumours, opinions and knowledge, we recommend [11, §7.1.2]. The model that we study fits into the framework of . This work introduced the idea of a nonlinear ‘infection pressure’ from each hyperedge, and derived a mean field approximation that was compared with microscale-level simulation results. In , the authors studied this type of model on simplicial complexes of degree up to 2 (a subclass of the more general hypergraph setting) and also studied a mean field approximation. These authors examined the mean field system from a dynamical systems perspective and analysed issues such as bistability, hysteresis and discontinuous transitions. Similarly, in [5,7], a hypergraph version was considered. Our work differs from these studies in (i) focusing on the derivation of spectral thresholds for extinction of a disease in both the exact and mean field settings and (ii) seeking to interpret the results from a mathematical modelling perspective. We mention that it would also be of interest to develop corresponding thresholds for the mean field models in [5,6,12].
2. Stochastic susceptible–infected–susceptible models
(a) Stochastic susceptible–infected–susceptible model on a graph
Classical ODE compartmental models are based on the assumption that any pair of individuals is equally likely to interact—this is the homogeneous mixing case . If, instead, we have knowledge of all possible pairwise interactions between individuals, then this information may be incorporated via a contact graph and used in a stochastic model. Here, each node represents an individual, and an edge between nodes i and j indicates that individuals i and j interact. For a population with n individuals, we may let denote the corresponding symmetric adjacency matrix, so nodes i and j interact if and only if . In this setting, a stochastic SIS model uses the two-state random variable to represent the status of node i at time t, with for a susceptible node and for an infected node. Each then follows a continuous time Markov process where the infection rate is given by
This model was studied in [10, theorem 1], where it was argued that the condition
(b) Why use a hypergraph?
It has been argued [11,13–16] that in many network science applications we lose information by recording only pairwise interactions. For example, e-mails can be sent to groups of recipients, scholarly articles may have multiple coauthors and many proteins may interact to form a complex. In such cases, recording the relevant lists of interacting nodes gives a more informative picture than reducing these down to a collection of edges.
In the setting of an SIS model, we may argue that individuals typically come together in well-defined groups; for example, in a household, a workplace or a social setting. Such groups may be handled by the use of hyperedges, leading to a hypergraph; these concepts are formalized in the next section.
With a classic graph model, as described in §2a, all types of (pairwise) interactions are treated equally and the rate of infection of a node is linearly proportional to the number of infectious neighbours. With a hypergraph we may consider more intricate contagion mechanisms. For example, using the terminology of , the collective contagion model is studied in [5–7]. Here, infection only starts spreading within a hyperedge after a certain critical number of infectious neighbours has been reached. This type of behaviour is relevant, for example, in a shared office area where infection occurs only when a minimal viral load is exceeded. The critical number may also depend on the environment and group size; in a given office space a small number of workers may be able to socially distance in a way that effectively eliminates the risk of infection. Hence, we may wish to use a different threshold for different sizes and categories of hyperedge.
We mention here that an alternative type of mechanism may also operate, which we call collective suppression. Imagine that a disease may be contracted through contact with a surface that was previously touched by an infected individual. Now suppose that a group of individuals is likely to use the same physical object, such as a door handle, hand rail, cash machine or water cooler. If an infected individual contaminates the object, then further contamination by other individuals is less relevant. In this case, doubling the number of infected users will increase the risk by a factor less than 2; generally, risk grows sublinearly as a function of the size of the hyperedge.
These arguments motivate us to allow the rate of infection of a node within a hyperdege to depend on a generic function f of the number of infectious neighbours in that hyperedge; this approach was also taken in . The arguments also motivate us to study the case where the form of the nonlinearity depends on the type of hyperedge. We will be particularly concerned with the case where f is concave, since this is tractable for analysis and allows us to draw conclusions about the collective contagion model. We note that if f is the identity, then we recover linear dependence on the number of infectious neighbours and the hypergraph model is equivalent to a virus spreading on the clique graph of the hypergraph.
3. Susceptible–infected–susceptible on a hypergraph
We continue with some standard definitions .
A hypergraph is a tuple of nodesV and hyperedgesE such that . Here, denotes the power set of V.
We will let n and m denote the number of nodes and hyperedges, respectively; that is, and . Loosely, a hypergraph generalizes the concept of a graph by allowing an ‘edge’ to be a list of more than two nodes.
Consider a hypergraph . The incidence matrix, , is the matrix such that if node i belongs to hyperedge and otherwise.
It is also useful to introduce . This matrix has the property that records the number of hyperedges containing both nodes i and j. In particular, if is a graph then W is the affinity matrix of the graph.
(b) General infection model
In our context, the nodes represent individuals and a hyperedge records a collection of individuals who are known to interact as a group. As in the graph case introduced in §2a, we use a state vector , which follows a continuous time Markov process, where, for each , if node i is infected at time t and otherwise. We continue to assume that an infectious node becomes susceptible with constant recovery rate . However, generalizing (2.1), we now assume that a susceptible node i becomes infectious with rate
If f is the identity, the rate of infection reduces to . This gives a weighted version of the infection rate of an SIS model on a graph. As discussed in §2b, it may be appropriate to choose nonlinear f in certain circumstances. We note that in  the authors have in mind functions which behave like the identity near the origin and have a horizontal asymptote. Instances of such functions are and for some . Relaxing these conditions, we may ask more generally in such a setting that the function be concave. On the other hand, the authors in [5,6] consider a collective contagion model, where infection spreads within a hyperdge only if a certain threshold of infectious vertices is reached in that hyperedge. A collective contagion model may be represented via the function for some , or for some .
(c) Partitioned hypergraph model
Following the discussion in §2b, we also introduce a more general case where we partition the hyperedges into K disjoint categories with each category having its own distinct rate of infection in response to the number of infected nodes in a hyperedge, represented by a function . For example, the categories may correspond to different types of housing, workplaces, hospitality venues or sports facilities, each representing different physical spaces and forms of interaction. We may then represent the infection rate of node i as
In this generalized case, a collective contagion model could be defined by first organizing the hyperedges into categories depending on their size, so that category k is the set of hyperedges of size . A collective contagion model may then be represented, for example, via the functions and , .
4. Mean field approximation
A classic approach to studying processes with random infection rates is to develop a mean field approximation for the expected process
5. Simulations and comparison between exact and mean field models
Let us emphasize that the approximate infection rates in (4.2) differ in general from the expectation of the random rates in (3.1). When the function f is concave, however, Jensen’s reverse inequality indicates that the rates in (4.2) are greater than the expectation of the rates in (3.1). Hence, in this case the expected quantities are overestimated by (4.1) and (4.2). This is fine when we are looking for conditions for the disease to vanish. If f is not concave (e.g. for a collective contagion model) it is less clear a priori how the exact model and mean field ODE compare. In this section, we therefore present results of computational simulations in order to gain insight into the accuracy of our mean field approximation.
(a) Simulation algorithm
Before presenting numerical results, we summarize our approach for simulating the individual-level stochastic model, which is based on a standard time discretization; see, for example, . Using a small fixed time step , we advance from time t to as follows. First, let be a random vector of i.i.d. values uniformly sampled from . For every node ,
when , set if
and set otherwise;
when , set if
and set otherwise.
(b) Computational results
In the simulations, we chose nodes with fixed recovery rate and a time step of . We look at results for different choices of (i) the infection strength, β, and (ii) the (independent) initial probability for each node to be infectious, which we denote . We simulated the mean field ODE using Euler’s method with time step of 0.05. The largest size of a hyperedge was 5 and we distributed the number of hyperedges for the hypergraph randomly as follows: 300 edges, 200 hyperedges of size 3, 100 hyperedges of size 4 and 50 hyperedges of size 5. To give a feel for the level of fluctuation, the individual-level paths are averaged over 10 runs, each with the same hypergraph connectivity and initial state.
For figure 4, we used a collective contagion model on a partitioned hypergraph. Assigning each hyperedge to a category in , where category k contains the hyperedges of size , we chose the following associated functions to determine the infection rates: , and for , .
The four figures show the proportion of infectious individuals as a function of time. In these simulations, and others not reported here, we observe that the initial value does not affect the asymptotic behaviour of the process: the process vanishes or converges to a non-zero equilibrium depending on the value of β but regardless of the value of . In figures 1–3, where f is concave, we know that the mean field model gives an upper bound on the expected proportion of infected individuals in the microscale model. We also see that the mean field model provides a reasonably sharp approximation. Moreover, we see a similar level of sharpness in figure 4 for the collective contagion model, where f is not concave.
A key advantage of the mean field approximation is that it gives rise to a deterministic autonomous dynamical system for which there exists a rich theory to study the asymptotic stability of equilibrium points. This motivates the analysis in the next section.
6. Stability analysis
We provide below spectral conditions which imply that the infection-free solution is a locally or globally asymptotically stable equilibrium of (4.1) and (4.2). We will find that local asymptotic stability can be shown with no structural assumptions on f. We will also find that global asymptotic stability follows under the same conditions when f is concave. Our conclusions fit into a framework that generalizes the graph case (2.2): the spectral threshold takes the form
Throughout this work, to be concrete we let denote the Euclidean norm.
(a) Local asymptotic stability
We see that , so is an equilibrium for (4.1). It remains to show that this solution is locally asymptotically stable. Appealing to a standard linearization result , it suffices to show that every eigenvalue of the Jacobian matrix has a negative real part. We compute
The proof of theorem 6.1 extends straightforwardly. We compute
(b) Global asymptotic stability for the concave infection model
We now show that when f is concave the condition in theorem 6.1 ensures global stability of the zero equilibrium, and hence guarantees that the disease dies out according to the mean field approximation.
Given a matrix A, define its symmetric version to be
Suppose that A and B arereal matrices, and suppose that there exists a diagonal matrix such that for all , and
Let be a unit eigenvector associated with . We have
Suppose f is concave. If
From the global asymptotic stability result in [19, lemma 1′], it is sufficient to show that all eigenvalues of the symmetric matrix are strictly less than 0, for all . We have
Letting B denote the matrix given by
Combining these inequalities and using the spectral condition in the statement of the theorem, we deduce that
When f is the identity—which is concave—we are effectively using the standard graph-based model, albeit with weighted edges. The inequality in theorem 6.5 then generalizes the well-known vanishing condition (2.2) found, for instance, in [8–10]. To our knowledge, previous arguments leading to this vanishing condition for graphs were not completely rigorous. Theorem 6.5 provides a new, rigorous justification for this spectral bound in the case of traditional mean field graph models.
A straightforward adaptation of the proof of theorem 6.5 yields the following global asymptotic stability result for the more general partitioned model.
Suppose allare concave. If
7. Simulations to test the spectral condition
We now show the results of experiments that test the sharpness of our spectral vanishing condition. Here, we used the concave functions (on figures 5a and 6) and (on figures 5a and 7) to construct partitioned models with and for all . We fixed a hypergraph with nodes, 400 edges, 200 hyperedges of size 3, 100 hyperedges of size 4 and 50 hyperedges of size 5. At time zero, each node was infected with independent probability and we used a recovery rate of . In addition to the mean field ODE, we also simulated the microscale model, averaged over five runs, using the discretization scheme described in §5b, with .
In figure 5, the circles (red) show the corresponding proportion of infected individuals according to the mean field model, , at time , for a range of different β between 0 and 0.2: . The crosses (blue) show the corresponding proportion of infected individuals from the microscale model, , at time . The vertical green line represents the critical value (we have on the left and on the right).
For the mean field model, we know from theorem 6.6 that guarantees global stability of the zero-infection state. We see that also lies close to the threshold beyond which extinction of the disease is lost in the mean field model. For the individual-level stochastic model, theorem 8.3 below shows that is also sufficient for eventual extinction of the disease. This is consistent with the results in figure 5.
In figures 6a and 7a, we show individual trajectories of the proportion of infected individuals, , according to the mean field model, for a range of β values. For the same range of β values, the plots on the right of these figures show the corresponding proportion of infected individuals from the microscale model, . The curves are coloured in red if the spectral vanishing condition is satisfied. We see qualitative agreement between the mean field and individual-level models, and extinction for the β values below the spectral threshold.
Having derived and tested spectral conditions that concern extinction of the disease at the mean field approximation level, in the next section, we study the microscale model directly.
8. Exact model
To proceed, we recall our setting where at time zero each node has the same, independent, probability, , of being infectious; so for all . This implies that is the expected number of infectious individuals at time zero.
We are interested in the stochastic process , which records the number of infected individuals. Our analysis generalizes arguments in , which considered a stochastic SIS model on a graph with f as the identity map. We are able to prove results in the case of concave nonlinearites, giving insights into the behaviour of the exact model and also allowing comparison with corresponding results for the mean field approximation. Furthermore, we show in §9 how these results can be extended to the case of collective contagion models.
Our first result shows that the spectral condition arising from the mean field analysis in theorems 6.1 and 6.5 is also relevant to the probability of extinction in the individual-level model.
Suppose f is concave in the hypergraph infection model (3.1). Then
Consider the continuous time Markov process taking values in , with transition of states defined for every and by
Suppose also that for all . Since f is concave,
We deduce, analogously to , the following corollary.
Suppose f is concave in the hypergraph infection model (3.1). Let τ denote the time of extinction of the disease and suppose, then
Using theorem 8.1,
Likewise, the partitioned case yields the following result.
Suppose everyis concave in the partitioned hypergraph model with infection rate (3.2). Then
We also have the following analogue of corollary 7.2 on the expected time to extinction for the partitioned case.
Suppose everyis concave in the partitioned hypergraph model with infection rate (3.2). Let τ denote the time of extinction of the disease and suppose, then
(b) Conditions that preclude extinction
So far, we have focused on deriving thresholds that imply extinction. In this subsection, following ideas from , we derive a condition under which the disease will persist.
Note that our analysis does not require the graph associated with W to be connected. The disconnected setting is relevant, for example, when interventions have been imposed in order to limit interactions. We let denote the Laplacian, and let denote the smallest non-zero eigenvalue of . We also let denote the size of the largest hyperedge.
Given a hypergraph , a function and a subset of the nodes , let
Note that when consists of those nodes for which , we can write the infection transition rate of as . More generally, may be regarded as the rate at which nodes in the set may be infected by nodes in the remainder of the network. When and , is the Cheeger constant, or isoperimetric number, associated with the weighted graph induced by . We may also regard as the smallest average infection rate over all subsets consisting of no more than half of the network.
The next theorem gives a probabilistic lower bound on the time to extinction.
Recall that τ denotes the hitting time of the state 0 for the processin the hypergraph model (3.1). If f is concave and, then
From standard Cheeger inequalities , we know that the Cheeger constant of the graph induced by W satisfies ; hence, using the assumptions on in the theorem, we see that indeed.
In order to prove this result, we introduce the following lemma.
If f is concave and non-decreasing, then
The proof is immediate once we see that, by concavity of f, for all
Now, to prove theorem 8.6 consider the Markov process valued in , with transition of states given by
9. Collective contagion models
We now consider the collective contagion models from [5–7], where infection only starts spreading within a hyperedge once a threshold number of infectious nodes in that hyperedge has been reached. As discussed in §2b, collective contagion models can be represented by nonlinear functions of the form for some , or for some . In these cases, it is obvious that the zero-infection state for the mean field approximation is locally asymptotically stable (and, indeed, theorem 6.1 applies). However, because the functions are not concave, the theory found in §8 for the exact model does not directly apply. Nonetheless, we can still derive similar spectral conditions for the vanishing of the disease by finding concave functions which serve as upper bounds for f. For instance using , the bounds in theorem 8.1 and corollary 8.2 lead to the following result.
Suppose thatfor some. Then
Likewise note that where we recall that is the largest size of a hyperedge of . Hence, we deduce the following result.
Suppose thatfor some. Then
In this work, we derived spectral conditions that control the spread of disease in an SIS model on a hypergraph. The conditions have the general form
We note that in the special case where (i) the hypergraph is an undirected graph and hence W becomes the binary adjacency matrix and (ii) we have linear dependence on the number of infectious neighbours for the infection rate of a node, so f is the identity function, the condition (10.1) reduces to the well-known vanishing spectral condition studied in, for example, [8–10].
There are two important points to be made about the general form of (10.1). First, the hypergraph structure appears only via the presence of the symmetric matrix . Recall that records the number of times that i and j both appear in the same hyperedge. Such weighted but pairwise information is all that feeds into this spectral threshold. On a positive note, this implies that useful predictions can be made about disease spread on a hypergraph without full knowledge of the types of hyperedge present and the distribution of nodes within them. (For example, when collecting human interaction data it is more reasonable to ask an individual to list each contact and state how many different ways they interact with that contact than to ask an individual to list all hyperedges they take part in.) However, this observation also raises the possibility that more refined analysis might lead to sharper bounds, perhaps at the expense of simplicity and interpretability.
Our second point is that the new vanishing condition (10.1) neatly separates into three aspects:
The biologically motivated infection parameter, β.
The interaction structure, captured in .
The coefficient that arises from modelling the nonlinear infection process. For instance, theorems 6.1 and 6.5 have . In the collective contagion model case , theorem 9.1 indicates that we can take .
We may view β as an invariant biological constant that reflects the underlying virulence of the disease and is not affected by human behaviour. The factor , which arises from the interaction structure, will be determined by regional and cultural issues, including population density, age demographics, typical household sizes and the nature of prevalent commercial and manufacturing activities. Interventions, including full or partial lockdowns, could be modelled through a change in . The third factor, , is strongly dependent upon human behaviour and may be adjusted to reflect individual-based containment strategies such as social distancing, mask wearing or more frequent hand washing.
This work has focused on modelling, analysis and interpretation at the abstract level, concentrating on the fundamental question of disease extinction. Having developed this theory, it would, of course, now be of great interest to perform practical experiments using realistic interaction and infection data, with the aim of
calibrating model parameters,
testing hypotheses about the appropriate functional form of the infection rate,
testing the predictive power of the modelling framework, especially in comparison with simpler homogeneous mixing and pairwise interaction versions,
quantifying the effect of different interventions.
Code to run the experiments reported here is available at https://www.maths.ed.ac.uk/dhigham/algfiles.html.
Both authors contributed to the modelling and analysis, and to the document preparation. H.-L.d.K. implemented the computational tests.
The authors have no competing interests.
Both authors were supported by Engineering and Physical Sciences Research Council grant no. EP/P020720/1.
Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
Anderson RM, May RM. 1992Infectious diseases of humans: dynamics and control. Oxford, UK: Oxford University Press. Google Scholar
Ganesh A, Massoulié L, Towsley D. 2005The effect of network topology on the spread of epidemics. Proc. IEEE INFOCOM 2, 1455-1466. Google Scholar
Wang Y, Chakrabarti D, Wang C, Faloutsos C. 2003Epidemic spreading in real networks: an eigenvalue point of view. In Proc. 22nd Int. Symp. on Reliable Distributed Systems (SRDS 2003), Florence, Italy, 6–8 October 2003, pp. 25–34. Piscataway, NJ: IEEE Press. Google Scholar
Gillespie DT. 1991Markov processes: an introduction for physical scientists. Cambridge, MA: Academic Press. Google Scholar