## Abstract

Epidemic spreading is well understood when a disease propagates around a contact graph. In a stochastic susceptible–infected–susceptible setting, spectral conditions characterize whether the disease vanishes. However, modelling human interactions using a graph is a simplification which only considers pairwise relationships. This does not fully represent the more realistic case where people meet in groups. Hyperedges can be used to record higher order interactions, yielding more faithful and flexible models and allowing for the rate of infection of a node to depend on group size and also to vary as a nonlinear function of the number of infectious neighbours. We discuss different types of contagion models in this hypergraph setting and derive spectral conditions that characterize whether the disease vanishes. We study both the exact individual-level stochastic model and a deterministic mean field ODE approximation. Numerical simulations are provided to illustrate the analysis. We also interpret our results and show how the hypergraph model allows us to distinguish between contributions to infectiousness that (i) are inherent in the nature of the pathogen and (ii) arise from behavioural choices (such as social distancing, increased hygiene and use of masks). This raises the possibility of more accurately quantifying the effect of interventions that are designed to contain the spread of a virus.

### 1. Introduction

Compartmental models for disease propagation have a long and illustrious history [1,2], and they remain a fundamental predictive tool [3,4]. For a stochastic, individual-level model, it has been suggested recently that hyperedge information should be incorporated [5–7]. Hyperedges allow us to account directly for group interactions of any size, rather than, as in, for example, [8–10], treating them as a collection of essentially independent pairwise encounters.

In this work, we contribute to the modelling and analysis of disease spreading on a hypergraph. We include the case where the number of infected nodes in a hyperedge contributes nonlinearly to the overall infection rate; this covers the so-called *collective contagion model* setting and a new alternative that we call a *collective suppression model*. The main contributions of our work are

- a mean field approximation ((4.1) and (4.2)) with a spectral condition for local asymptotic stability of the zero-infection state (theorem 6.1) and an extension to global asymptotic stability (theorem 6.5) when the nonlinear infection function is concave,
- for the exact, individual-level model, a spectral condition for exponential decay of the non-extinction probability in the concave case (theorem 8.1) and a spectral bound on the expected time to extinction (corollary 8.2),
- extensions of these results to more general partitioned hypergraph models, where distinct infection rates apply to different categories of hyperedge (theorems 6.2, 6.6 and 8.3 and corollary 8.4),
- results for the non-concave collective contagion model (theorems 9.1 and 9.2),
- a complementary condition that rules out extinction of the disease (theorem 8.6),
- interpretations of these mathematical results: the spectral thresholds for disease extinction naturally distinguish between the inherent biological infectiousness of the disease and behavioural choices of the individuals in the population, allowing us to account for intervention strategies (§10).

The paper is organized as follows. In §2, we introduce the traditional graph-based susceptible–infected–susceptible (SIS) model and quote a spectral condition that characterizes control of the disease. We then discuss the generalization to hyperedges, and motivate the use of infection rates that do not scale linearly with respect to the number of infected neighbours. Section 3 formalizes the hypergraph model and shows how it may be simulated. In §4, we derive a mean field approximation to characterize the behaviour of the model, and in §5 we give computational results to illustrate its relevance. Section 6 analyses the deterministic mean field setting and gives a spectral condition for long-term decay of the disease. The result is local for general infection rates and global (independent of the initial condition) for the concave case. This spectral condition generalizes a well-known result concerning disease propagation on a graph. Section 7 provides further computational simulations to illustrate the spectral threshold. The full stochastic model is then studied in §8, where we extend the analysis in [8] to our hypergraph setting. Here, we study extinction of the disease in the case where the nonlinearity in the infection rate is concave. We also derive conditions for non-extinction. In §9, we extend our analysis to the so-called collective contagion model proposed in [5–7]. In §10, we summarize and interpret our results and discuss follow-on work.

For a review of recent studies of spreading processes on hypergraphs, including the dissemination of rumours, opinions and knowledge, we recommend [11, §7.1.2]. The model that we study fits into the framework of [12]. This work introduced the idea of a nonlinear ‘infection pressure’ from each hyperedge, and derived a mean field approximation that was compared with microscale-level simulation results. In [6], the authors studied this type of model on simplicial complexes of degree up to 2 (a subclass of the more general hypergraph setting) and also studied a mean field approximation. These authors examined the mean field system from a dynamical systems perspective and analysed issues such as bistability, hysteresis and discontinuous transitions. Similarly, in [5,7], a hypergraph version was considered. Our work differs from these studies in (i) focusing on the derivation of spectral thresholds for extinction of a disease in both the exact and mean field settings and (ii) seeking to interpret the results from a mathematical modelling perspective. We mention that it would also be of interest to develop corresponding thresholds for the mean field models in [5,6,12].

### 2. Stochastic susceptible–infected–susceptible models

#### (a) Stochastic susceptible–infected–susceptible model on a graph

Classical ODE compartmental models are based on the assumption that any pair of individuals is equally likely to interact—this is the homogeneous mixing case [1]. If, instead, we have knowledge of all possible pairwise interactions between individuals, then this information may be incorporated via a contact graph and used in a stochastic model. Here, each node represents an individual, and an edge between nodes *i* and *j* indicates that individuals *i* and *j* interact. For a population with *n* individuals, we may let $A\in {\mathbb{R}}^{n\times n}$ denote the corresponding symmetric adjacency matrix, so nodes *i* and *j* interact if and only if ${A}_{ij}=1$. In this setting, a stochastic SIS model uses the two-state random variable ${X}_{i}(t)$ to represent the status of node *i* at time *t*, with ${X}_{i}(t)=0$ for a susceptible node and ${X}_{i}(t)=1$ for an infected node. Each ${X}_{i}(t)$ then follows a continuous time Markov process where the infection rate is given by

$$\beta \sum_{j=1}^{n} A_{ij} X_{j}(t), \tag{2.1}$$

and the recovery rate is *δ*. Here, $\beta >0$ and $\delta >0$ are parameters governing the strength of the two effects. In this model, we see from (2.1) that the current chance of infection increases linearly in proportion to the current number of infected neighbours.

This model was studied in [10, theorem 1], where it was argued that the condition

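$$\lambda(A)\,\frac{\beta}{\delta}<1 \tag{2.2}$$

ensures that the disease vanishes; here $\lambda(A)$ denotes the largest eigenvalue of $A$ (in the homogeneous mixing case, *A* corresponds to the complete graph).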

#### (b) Why use a hypergraph?

It has been argued [11,13–16] that in many network science applications we lose information by recording only pairwise interactions. For example, e-mails can be sent to groups of recipients, scholarly articles may have multiple coauthors and many proteins may interact to form a complex. In such cases, recording the relevant lists of interacting nodes gives a more informative picture than reducing these down to a collection of edges.

In the setting of an SIS model, we may argue that individuals typically come together in well-defined groups; for example, in a household, a workplace or a social setting. Such groups may be handled by the use of hyperedges, leading to a hypergraph; these concepts are formalized in the next section.

With a classic graph model, as described in §2a, all types of (pairwise) interactions are treated equally and the rate of infection of a node is linearly proportional to the number of infectious neighbours. With a hypergraph we may consider more intricate contagion mechanisms. For example, using the terminology of [7], the *collective contagion* model is studied in [5–7]. Here, infection only starts spreading within a hyperedge after a certain critical number of infectious neighbours has been reached. This type of behaviour is relevant, for example, in a shared office area where infection occurs only when a minimal viral load is exceeded. The critical number may also depend on the environment and group size; in a given office space a small number of workers may be able to socially distance in a way that effectively eliminates the risk of infection. Hence, we may wish to use a different threshold for different sizes and categories of hyperedge.

We mention here that an alternative type of mechanism may also operate, which we call *collective suppression*. Imagine that a disease may be contracted through contact with a surface that was previously touched by an infected individual. Now suppose that a group of individuals is likely to use the same physical object, such as a door handle, hand rail, cash machine or water cooler. If an infected individual contaminates the object, then further contamination by other individuals is less relevant. In this case, doubling the number of infected users will increase the risk by a factor less than 2; generally, risk grows sublinearly as a function of the size of the hyperedge.

These arguments motivate us to allow the rate of infection of a node within a hyperedge to depend on a generic function *f* of the number of infectious neighbours in that hyperedge; this approach was also taken in [12]. The arguments also motivate us to study the case where the form of the nonlinearity depends on the type of hyperedge. We will be particularly concerned with the case where *f* is concave, since this is tractable for analysis and allows us to draw conclusions about the collective contagion model. We note that if *f* is the identity, then we recover linear dependence on the number of infectious neighbours and the hypergraph model is equivalent to a virus spreading on the clique graph of the hypergraph.

### 3. Susceptible–infected–susceptible on a hypergraph

#### (a) Background

We continue with some standard definitions [17].

#### Definition 3.1.

A *hypergraph* is a tuple $\mathcal{H}:=(V,E)$ of *nodes* *V* and *hyperedges* *E* such that $E\subset \mathcal{P}(V)$. Here, $\mathcal{P}(V)$ denotes the power set of *V*.

We will let *n* and *m* denote the number of nodes and hyperedges, respectively; that is, $|V|=n$ and $|E|=m$. Loosely, a hypergraph generalizes the concept of a graph by allowing an ‘edge’ to be a list of more than two nodes.

#### Definition 3.2.

Consider a hypergraph $\mathcal{H}:=(V,E)$. The *incidence matrix*, $\mathcal{I}$, is the $n\times m$ matrix such that ${\mathcal{I}}_{ih}=1$ if node *i* belongs to hyperedge $h$ and ${\mathcal{I}}_{ih}=0$ otherwise.

It is also useful to introduce $W:=\mathcal{I}{\mathcal{I}}^{T}$. This $n\times n$ matrix has the property that ${W}_{ij}$ records the number of hyperedges containing both nodes *i* and *j*. In particular, if $\mathcal{H}$ is a graph then *W* is the affinity matrix of the graph.
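As a concrete illustration of definition 3.2 and of $W=\mathcal{I}\mathcal{I}^{T}$, the following minimal sketch builds both matrices for a small hypergraph. The function names and the four-node example are hypothetical choices made for illustration; nodes are indexed from 0.

```python
def incidence_matrix(n, hyperedges):
    """Return the n x m incidence matrix as a list of rows (entries 0 or 1)."""
    inc = [[0] * len(hyperedges) for _ in range(n)]
    for h, members in enumerate(hyperedges):
        for i in members:
            inc[i][h] = 1
    return inc


def pairwise_weights(inc):
    """Return W = I I^T, so W[i][j] counts the hyperedges containing both i and j."""
    n = len(inc)
    return [[sum(a * b for a, b in zip(inc[i], inc[j])) for j in range(n)]
            for i in range(n)]


# Hypothetical toy hypergraph: 4 nodes, one edge and two hyperedges of size 3.
hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
inc = incidence_matrix(4, hyperedges)
W = pairwise_weights(inc)
for row in W:
    print(row)
```

Note that the diagonal entry `W[i][i]` counts the hyperedges that contain node *i*.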

#### (b) General infection model

In our context, the nodes represent individuals and a hyperedge records a collection of individuals who are known to interact as a group. As in the graph case introduced in §2a, we use a state vector $X(t)$, which follows a continuous time Markov process, where, for each $1\le i\le n$, ${X}_{i}(t)=1$ if node *i* is infected at time *t* and ${X}_{i}(t)=0$ otherwise. We continue to assume that an infectious node becomes susceptible with constant recovery rate $\delta >0$. However, generalizing (2.1), we now assume that a susceptible node *i* becomes infectious with rate

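$$\beta \sum_{h=1}^{m} \mathcal{I}_{ih}\, f\Big(\sum_{j=1}^{n} X_{j}(t)\,\mathcal{I}_{jh}\Big), \tag{3.1}$$

where $f:\mathbb{R}_{+}\to \mathbb{R}_{+}$. When *f* is the identity, the contribution of each hyperedge to the infection rate of node *i* is assumed to increase in proportion to the number of infected nodes in that hyperedge. Throughout our analysis, we will always assume that $f(0)=0$ and *f* is ${C}^{1}$ in a neighbourhood of 0.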

If *f* is the identity, the rate of infection reduces to $\beta \sum _{j=1}^{n}{W}_{ij}{X}_{j}(t)$. This gives a weighted version of the infection rate of an SIS model on a graph. As discussed in §2b, it may be appropriate to choose nonlinear *f* in certain circumstances. We note that in [12] the authors have in mind functions which behave like the identity near the origin and have a horizontal asymptote. Instances of such functions are $x\mapsto \arctan(x)$ and $x\mapsto \min\{x,c\}$ for some $c>0$. Relaxing these conditions, we may ask more generally in such a setting that the function be concave. On the other hand, the authors in [5,6] consider a *collective contagion model*, where infection spreads within a hyperedge only if a certain threshold of infectious vertices is reached in that hyperedge. A collective contagion model may be represented via the function $x\mapsto {c}_{2}\mathbb{1}(x\ge {c}_{1})$ for some ${c}_{1},{c}_{2}>0$, or $x\mapsto \max\{0,x-c\}$ for some $c>0$.
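To make the role of *f* concrete, the sketch below evaluates the rate $\beta \sum_{h}\mathcal{I}_{ih}f(\sum_{j}X_{j}(t)\mathcal{I}_{jh})$ for a susceptible node under three choices of *f*. The function name, the toy hypergraph and the parameter values ($\beta=0.5$, cap $c=1$, threshold ${c}_{1}=2$, ${c}_{2}=1$) are assumptions chosen only for illustration.

```python
def infection_rate(i, x, hyperedges, beta, f):
    """Rate at which susceptible node i becomes infected, given state vector x."""
    total = 0.0
    for members in hyperedges:
        if i in members:
            infected_in_h = sum(x[j] for j in members)  # infected nodes in hyperedge h
            total += f(infected_in_h)
    return beta * total


def identity(s):
    return float(s)


def capped(s):
    return min(float(s), 1.0)          # concave: min{x, c} with c = 1


def threshold(s):
    return 1.0 if s >= 2 else 0.0      # collective contagion: c2 * 1(x >= c1)


hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
x = [0, 1, 1, 1]                       # node 0 susceptible, the others infected

rate_identity = infection_rate(0, x, hyperedges, 0.5, identity)
rate_capped = infection_rate(0, x, hyperedges, 0.5, capped)
rate_threshold = infection_rate(0, x, hyperedges, 0.5, threshold)
print(rate_identity, rate_capped, rate_threshold)
```

Node 0 sees one infected neighbour through the edge and two through the hyperedge of size 3, so the three choices of *f* weight the same exposure quite differently.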

#### (c) Partitioned hypergraph model

Following the discussion in §2b, we also introduce a more general case where we partition the hyperedges into *K* disjoint categories with each category $1\le k\le K$ having its own distinct rate of infection in response to the number of infected nodes in a hyperedge, represented by a function ${f}_{k}$. For example, the categories may correspond to different types of housing, workplaces, hospitality venues or sports facilities, each representing different physical spaces and forms of interaction. We may then represent the infection rate of node *i* as

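$$\beta \sum_{k=1}^{K}\sum_{h=1}^{m} \mathcal{I}_{ih}^{(k)}\, f_{k}\Big(\sum_{j=1}^{n} X_{j}(t)\,\mathcal{I}_{jh}^{(k)}\Big), \tag{3.2}$$

where ${\mathcal{I}}_{ih}^{(k)}=1$ if node *i* belongs to hyperedge *h* in the category *k* and ${\mathcal{I}}_{ih}^{(k)}=0$ otherwise; so ${\mathcal{I}}^{(k)}$ is the incidence matrix of the subhypergraph consisting of only the hyperedges from category *k*. We will refer to this as a *partitioned hypergraph model*.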

In this generalized case, a collective contagion model could be defined by first organizing the hyperedges into categories depending on their size, so that category *k* is the set of hyperedges of size $k+1$. A collective contagion model may then be represented, for example, via the functions ${f}_{1}:x\mapsto x$ and ${f}_{k}:x\mapsto {c}_{2,k}\mathbb{1}(x\ge {c}_{1,k})$, $k\in \{2,\dots ,K\}$.
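The size-based partition just described can be sketched as follows. This is an illustrative implementation only: the function names and the toy data are hypothetical, and the thresholds ${f}_{k}(x)=(k-1)\mathbb{1}(x\ge k-1)$ are borrowed from the choice used for figure 4 in §5b.

```python
def partitioned_rate(i, x, hyperedges, beta, f_by_category):
    """Infection rate of susceptible node i when each hyperedge uses the
    function attached to its category (here: category k = size k + 1)."""
    total = 0.0
    for members in hyperedges:
        if i in members:
            k = len(members) - 1
            total += f_by_category[k](sum(x[j] for j in members))
    return beta * total


def make_threshold(k):
    """f_k(x) = (k - 1) * 1(x >= k - 1), a collective contagion choice."""
    return lambda s: float(k - 1) if s >= k - 1 else 0.0


f_by_category = {1: lambda s: float(s)}        # pairwise edges: linear
for k in (2, 3, 4):
    f_by_category[k] = make_threshold(k)

# Hypothetical example: node 0 lies in one hyperedge of each size 2, 3 and 4.
hyperedges = [(0, 1), (0, 2, 3), (0, 1, 2, 3)]
x = [0, 1, 0, 1]
rate = partitioned_rate(0, x, hyperedges, 0.1, f_by_category)
print(rate)
```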

### 4. Mean field approximation

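A classic approach to studying processes with random infection rates is to develop a mean field approximation for the expected process

$$p_{i}(t):=\mathbb{E}[X_{i}(t)],\quad 1\le i\le n.$$

Node *i* becomes infected at the rate in (3.1) if and only if ${X}_{i}(t)=0$, and recovers at rate *δ* if and only if ${X}_{i}(t)=1$, so that

$$\frac{\mathrm{d}p_{i}(t)}{\mathrm{d}t}=\mathbb{E}\Big[\beta \sum_{h=1}^{m}\mathcal{I}_{ih}\,f\Big(\sum_{j=1}^{n}X_{j}(t)\,\mathcal{I}_{jh}\Big)\big(1-X_{i}(t)\big)\Big]-\delta\,p_{i}(t).$$

Making the simplifying assumption that the infection rate for node *i* is independent of its state, we have

$$\mathbb{E}\Big[\beta \sum_{h=1}^{m}\mathcal{I}_{ih}\,f\Big(\sum_{j=1}^{n}X_{j}(t)\,\mathcal{I}_{jh}\Big)\big(1-X_{i}(t)\big)\Big]\approx \mathbb{E}\Big[\beta \sum_{h=1}^{m}\mathcal{I}_{ih}\,f\Big(\sum_{j=1}^{n}X_{j}(t)\,\mathcal{I}_{jh}\Big)\Big]\big(1-p_{i}(t)\big).$$

We then exchange the order of *f* and the expectation in the first factor on the right-hand side. We arrive at the deterministic mean field ODE

$$\frac{\mathrm{d}P(t)}{\mathrm{d}t}=g(P(t)), \tag{4.1}$$

where $P(t):=(p_{1}(t),\dots ,p_{n}(t))^{T}$ and

$$g_{i}(P(t)):=\beta \sum_{h=1}^{m}\mathcal{I}_{ih}\,f\Big(\sum_{j=1}^{n}p_{j}(t)\,\mathcal{I}_{jh}\Big)\big(1-p_{i}(t)\big)-\delta\,p_{i}(t). \tag{4.2}$$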
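A minimal sketch of integrating the mean field ODE with Euler's method (the scheme mentioned in §5b) is given below. It assumes the right-hand side has the form ${g}_{i}(P)=\beta \sum_{h}\mathcal{I}_{ih}f(\sum_{j}{p}_{j}\mathcal{I}_{jh})(1-{p}_{i})-\delta {p}_{i}$; the function names, the toy hypergraph and the two values of *β* are hypothetical.

```python
import math


def mean_field_rhs(p, hyperedges, beta, delta, f):
    """Evaluate g(P) for the mean field ODE on a list of hyperedges."""
    n = len(p)
    pressure = [0.0] * n
    for members in hyperedges:
        contrib = f(sum(p[j] for j in members))   # f(expected infected in h)
        for i in members:
            pressure[i] += contrib
    return [beta * pressure[i] * (1.0 - p[i]) - delta * p[i] for i in range(n)]


def euler(p0, hyperedges, beta, delta, f, dt, steps):
    """Advance dP/dt = g(P) with the explicit Euler method."""
    p = list(p0)
    for _ in range(steps):
        g = mean_field_rhs(p, hyperedges, beta, delta, f)
        p = [pi + dt * gi for pi, gi in zip(p, g)]
    return p


hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
p0 = [0.5] * 4
p_low = euler(p0, hyperedges, 0.1, 1.0, math.atan, 0.05, 2000)   # small beta
p_high = euler(p0, hyperedges, 2.0, 1.0, math.atan, 0.05, 2000)  # large beta
print(max(p_low), min(p_high))
```

In this toy example, the small value of *β* drives every ${p}_{i}$ towards zero, while the large value leaves a non-zero level of infection.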

### 5. Simulations and comparison between exact and mean field models

Let us emphasize that the approximate infection rates in (4.2) differ in general from the expectation of the random rates in (3.1). When the function *f* is concave, however, Jensen’s reverse inequality indicates that the rates in (4.2) are greater than the expectation of the rates in (3.1). Hence, in this case the expected quantities ${p}_{i}(t)$ are overestimated by (4.1) and (4.2). This is fine when we are looking for conditions for the disease to vanish. If *f* is not concave (e.g. for a collective contagion model) it is less clear *a priori* how the exact model and mean field ODE compare. In this section, we therefore present results of computational simulations in order to gain insight into the accuracy of our mean field approximation.
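As a small worked illustration of the Jensen step, consider a hyperedge of size 3 containing node *i*, take the concave function $f(x)=\min\{x,1\}$, and suppose (an assumption made only for this example) that the two other members are infected independently, each with probability $p$. Writing $S$ for the number of infected neighbours in that hyperedge, we have

$$\mathbb{E}[f(S)]=1-(1-p)^{2}=2p-{p}^{2}\le \min\{2p,1\}=f(\mathbb{E}[S]).$$

For instance, $p=0.3$ gives $\mathbb{E}[f(S)]=0.51$ and $f(\mathbb{E}[S])=0.6$, so moving the expectation inside *f* overestimates the infection rate, as described above.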

#### (a) Simulation algorithm

Before presenting numerical results, we summarize our approach for simulating the individual-level stochastic model, which is based on a standard time discretization; see, for example, [12]. Using a small fixed time step $\mathrm{\Delta}t$, we advance from time *t* to $t+\mathrm{\Delta}t$ as follows. First, let $r\in {[0,1]}^{n}$ be a random vector of i.i.d. values uniformly sampled from $[0,1]$. For every node $1\le i\le n$,

- when ${X}_{i}(t)=0$, set ${X}_{i}(t+\mathrm{\Delta}t)=1$ if $${r}_{i}<1-\exp\Big(-\beta \sum _{h}{\mathcal{I}}_{ih}\,f\Big(\sum _{j}{X}_{j}(t){\mathcal{I}}_{jh}\Big)\mathrm{\Delta}t\Big),$$ and set ${X}_{i}(t+\mathrm{\Delta}t)=0$ otherwise;
- when ${X}_{i}(t)=1$, set ${X}_{i}(t+\mathrm{\Delta}t)=0$ if $${r}_{i}<1-\exp(-\delta \mathrm{\Delta}t),$$ and set ${X}_{i}(t+\mathrm{\Delta}t)=1$ otherwise.
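The update rule above can be sketched in a few lines. This is an illustrative implementation rather than the code used for the figures: the function name `step` and the explicit random number generator argument are hypothetical, and `random.Random.random()` samples from $[0,1)$ rather than $[0,1]$.

```python
import math
import random


def step(x, hyperedges, beta, delta, f, dt, rng):
    """Advance the individual-level state x from time t to t + dt."""
    n = len(x)
    pressure = [0.0] * n
    for members in hyperedges:
        contrib = f(sum(x[j] for j in members))   # f(number infected in h)
        for i in members:
            pressure[i] += contrib
    new_x = []
    for i in range(n):
        r = rng.random()
        if x[i] == 0:
            # Susceptible node: infected with probability 1 - exp(-rate * dt).
            p_infect = 1.0 - math.exp(-beta * pressure[i] * dt)
            new_x.append(1 if r < p_infect else 0)
        else:
            # Infected node: recovers with probability 1 - exp(-delta * dt).
            p_recover = 1.0 - math.exp(-delta * dt)
            new_x.append(0 if r < p_recover else 1)
    return new_x


rng = random.Random(0)
hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
x = [1, 0, 0, 1]
for _ in range(20):                                # simulate up to t = 1
    x = step(x, hyperedges, 0.5, 1.0, math.atan, 0.05, rng)
print(x)
```

Because $f(0)=0$, the all-susceptible state is absorbing in this scheme, which is the extinction event studied in §8.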

#### (b) Computational results

In the simulations, we chose $n=400$ nodes with fixed recovery rate $\delta =1$ and a time step of $\mathrm{\Delta}t=0.05$. We look at results for different choices of (i) the infection strength, *β*, and (ii) the (independent) initial probability for each node to be infectious, which we denote ${i}_{0}$. We simulated the mean field ODE using Euler’s method with time step of 0.05. The largest size of a hyperedge was 5 and we distributed the number of hyperedges for the hypergraph randomly as follows: 300 edges, 200 hyperedges of size 3, 100 hyperedges of size 4 and 50 hyperedges of size 5. To give a feel for the level of fluctuation, the individual-level paths are averaged over 10 runs, each with the same hypergraph connectivity and initial state.

Figures 1–3 show results for three concave choices of *f*; respectively,

- $f(x)=\min(x,3)$,
- $f(x)=\log(1+x)$,
- $f(x)=\arctan(x)$.

For figure 4, we used a collective contagion model on a partitioned hypergraph. Assigning each hyperedge to a category in $\{1,2,3,4\}$, where category *k* contains the hyperedges of size $k+1$, we chose the following associated functions to determine the infection rates: ${f}_{1}(x):=x$, and for $k\in \{2,3,4\}$, ${f}_{k}(x):=(k-1)\mathbb{1}(x\ge k-1)$.

The four figures show the proportion of infectious individuals as a function of time. In these simulations, and others not reported here, we observe that the initial value ${i}_{0}$ does not affect the asymptotic behaviour of the process: the process vanishes or converges to a non-zero equilibrium depending on the value of *β* but regardless of the value of ${i}_{0}$. In figures 1–3, where *f* is concave, we know that the mean field model gives an upper bound on the expected proportion of infected individuals in the microscale model. We also see that the mean field model provides a reasonably sharp approximation. Moreover, we see a similar level of sharpness in figure 4 for the collective contagion model, where *f* is not concave.

A key advantage of the mean field approximation is that it gives rise to a deterministic autonomous dynamical system for which there exists a rich theory to study the asymptotic stability of equilibrium points. This motivates the analysis in the next section.

### 6. Stability analysis

We provide below spectral conditions which imply that the infection-free solution $0\in {\mathbb{R}}^{n}$ is a locally or globally asymptotically stable equilibrium of (4.1) and (4.2). We will find that local asymptotic stability can be shown with no structural assumptions on *f*. We will also find that global asymptotic stability follows *under the same conditions* when *f* is concave. Our conclusions fit into a framework that generalizes the graph case (2.2): the spectral threshold takes the form

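$$\lambda(W)\,{f}^{\prime}(0)\,\frac{\beta}{\delta}<1,$$

where $W=\mathcal{I}{\mathcal{I}}^{T}$ and the nonlinearity enters only through the derivative at zero of *f*.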

Throughout this work, to be concrete we let $||\cdot ||$ denote the Euclidean norm.

#### (a) Local asymptotic stability

#### Theorem 6.1.

*If*

$$\lambda(W)\,{f}^{\prime}(0)\,\frac{\beta}{\delta}<1,$$

*then* $0\in {\mathbb{R}}^{n}$ *is a locally asymptotically stable equilibrium of* (*4.1*) *and* (*4.2*); *that is, there exists a positive γ such that* $\|P(0)\|<\gamma \Rightarrow \lim_{t\to \infty}\|P(t)\|=0$.

#### Proof.

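We see that $g(0)=0$, so $0\in {\mathbb{R}}^{n}$ is an equilibrium for (4.1). It remains to show that this solution is locally asymptotically stable. Appealing to a standard linearization result [18], it suffices to show that every eigenvalue of the Jacobian matrix $\mathrm{\nabla}g(0)$ has a negative real part. We compute

$$\mathrm{\nabla}g(0)=\beta \,{f}^{\prime}(0)\,W-\delta I,$$

where we used $f(0)=0$. Since $W$ is symmetric, the eigenvalues of $\mathrm{\nabla}g(0)$ are real, and under the stated condition the largest of them, $\beta {f}^{\prime}(0)\lambda(W)-\delta$, is negative because $\beta {f}^{\prime}(0)\lambda(W)$ is smaller than *δ*, and the result follows. ▪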
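The linearization used in this proof can be checked numerically. The sketch below assumes the mean field right-hand side ${g}_{i}(P)=\beta \sum_{h}\mathcal{I}_{ih}f(\sum_{j}{p}_{j}\mathcal{I}_{jh})(1-{p}_{i})-\delta {p}_{i}$ and compares a forward-difference Jacobian at $P=0$ with $\beta {f}^{\prime}(0)W-\delta I$; the function names, the toy hypergraph and the parameter values are hypothetical.

```python
import math


def mean_field_rhs(p, hyperedges, beta, delta, f):
    """Evaluate g(P) for the mean field ODE."""
    n = len(p)
    pressure = [0.0] * n
    for members in hyperedges:
        contrib = f(sum(p[j] for j in members))
        for i in members:
            pressure[i] += contrib
    return [beta * pressure[i] * (1.0 - p[i]) - delta * p[i] for i in range(n)]


def jacobian_at_zero(n, hyperedges, beta, delta, f, eps=1e-6):
    """Forward-difference approximation of the Jacobian of g at P = 0."""
    g0 = mean_field_rhs([0.0] * n, hyperedges, beta, delta, f)
    jac = [[0.0] * n for _ in range(n)]
    for l in range(n):
        p = [0.0] * n
        p[l] = eps
        g = mean_field_rhs(p, hyperedges, beta, delta, f)
        for i in range(n):
            jac[i][l] = (g[i] - g0[i]) / eps
    return jac


n, beta, delta = 4, 0.3, 1.0
hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
# W[i][j] counts the hyperedges containing both i and j.
W = [[sum(1 for h in hyperedges if i in h and j in h) for j in range(n)]
     for i in range(n)]
# arctan has derivative 1 at zero, so the predicted Jacobian is beta * W - delta * I.
expected = [[beta * W[i][j] - (delta if i == j else 0.0) for j in range(n)]
            for i in range(n)]
J = jacobian_at_zero(n, hyperedges, beta, delta, math.atan)
max_error = max(abs(J[i][j] - expected[i][j]) for i in range(n) for j in range(n))
print(max_error)
```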

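Theorem 6.1 extends to the partitioned model in (3.2). In this case, ${g}_{i}(P(t))$ in the mean field ODE (4.1) is defined as

$$g_{i}(P(t)):=\beta \sum_{k=1}^{K}\sum_{h=1}^{m}\mathcal{I}_{ih}^{(k)}\,f_{k}\Big(\sum_{j=1}^{n}p_{j}(t)\,\mathcal{I}_{jh}^{(k)}\Big)\big(1-p_{i}(t)\big)-\delta\,p_{i}(t). \tag{6.2}$$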

#### Theorem 6.2.

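*If*

$$\lambda\Big(\sum_{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)}\Big)\frac{\beta}{\delta}<1,$$

*where* ${W}^{(k)}:={\mathcal{I}}^{(k)}{({\mathcal{I}}^{(k)})}^{T}$, *then* $0\in {\mathbb{R}}^{n}$ *is a locally asymptotically stable equilibrium of* (*4.1*), *with* ${g}_{i}$ *defined in* (*6.2*).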

#### Proof.

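The proof of theorem 6.1 extends straightforwardly. We compute

$$\mathrm{\nabla}g(0)=\beta \sum_{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)}-\delta I,$$

whose largest eigenvalue, $\beta \lambda(\sum_{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)})-\delta$, is negative under the stated condition. ▪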

#### (b) Global asymptotic stability for the concave infection model

We now show that when *f* is concave the condition in theorem 6.1 ensures global stability of the zero equilibrium, and hence guarantees that the disease dies out according to the mean field approximation.

#### Definition 6.3.

Given a matrix *A*, define its symmetric version to be

$${A}^{(S)}:=\frac{A+{A}^{T}}{2}.$$

#### Lemma 6.4.

*Suppose that A and B are* $n\times n$ *real matrices, and suppose that there exists a diagonal matrix* $\Lambda$ *such that for all* $i\in \{1,2,\dots ,n\}$, ${\Lambda}_{ii}\ge 0$, *and*

$$A=B-\Lambda .$$

*Then the largest eigenvalues of A and B satisfy* $\lambda (A)\le \lambda (B)$, *and the largest eigenvalues of* ${A}^{(S)}$ *and* ${B}^{(S)}$ *also satisfy* $\lambda ({A}^{(S)})\le \lambda ({B}^{(S)})$.

#### Proof.

Let $x$ be a unit eigenvector associated with $\lambda ({A}^{(S)})$. We have

$$\lambda({A}^{(S)})={x}^{T}{A}^{(S)}x={x}^{T}{B}^{(S)}x-{x}^{T}\Lambda x\le {x}^{T}{B}^{(S)}x\le \lambda({B}^{(S)}),$$

where the final inequality is the Rayleigh quotient bound for the symmetric matrix ${B}^{(S)}$.

#### Theorem 6.5.

*Suppose f is concave. If*

$$\lambda(W)\,{f}^{\prime}(0)\,\frac{\beta}{\delta}<1,$$

*then* $0\in {\mathbb{R}}^{n}$ *is a globally asymptotically stable equilibrium of* (*4.1*) *and* (*4.2*); *so* $\lim_{t\to \infty}\|P(t)\|=0$ *for any valid initial condition (that is, with* $0\le {p}_{i}(0)\le 1$*).*

#### Proof.

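From the global asymptotic stability result in [19, lemma 1′], it is sufficient to show that all eigenvalues of the symmetric matrix ${(\mathrm{\nabla}g(P))}^{(S)}$ are strictly less than 0, for all $P\ne 0$. We have

$${(\mathrm{\nabla}g(P))}_{il}=\beta (1-{p}_{i})\sum_{h=1}^{m}\mathcal{I}_{ih}\mathcal{I}_{lh}\,{f}^{\prime}\Big(\sum_{j=1}^{n}{p}_{j}\mathcal{I}_{jh}\Big)-\mathbb{1}(i=l)\Big(\delta +\beta \sum_{h=1}^{m}\mathcal{I}_{ih}\,f\Big(\sum_{j=1}^{n}{p}_{j}\mathcal{I}_{jh}\Big)\Big).$$

Letting *B* denote the $n\times n$ matrix given by

$${B}_{il}:=\beta (1-{p}_{i})\sum_{h=1}^{m}\mathcal{I}_{ih}\mathcal{I}_{lh}\,{f}^{\prime}\Big(\sum_{j=1}^{n}{p}_{j}\mathcal{I}_{jh}\Big)-\delta \,\mathbb{1}(i=l),$$

we may write $\mathrm{\nabla}g(P)=B-\Lambda$, where $\Lambda$ is the diagonal matrix with non-negative entries ${\Lambda}_{ii}=\beta \sum_{h}\mathcal{I}_{ih}f(\sum_{j}{p}_{j}\mathcal{I}_{jh})$, so lemma 6.4 gives $\lambda ({(\mathrm{\nabla}g(P))}^{(S)})\le \lambda ({B}^{(S)})$. Furthermore, $0\le {f}^{\prime}(x)\le {f}^{\prime}(0)$ for all $x\ge 0$ because *f* is non-negative and concave. Hence ${B}^{(S)}+\delta I\le \mathrm{\nabla}g(0)+\delta I$ entrywise, and since ${B}^{(S)}+\delta I$ has only non-negative entries, appealing to the Perron–Frobenius theorem, we have $\lambda ({B}^{(S)})\le \lambda (\mathrm{\nabla}g(0))$.

Combining these inequalities and using the spectral condition in the statement of the theorem, we deduce that

$$\lambda({(\mathrm{\nabla}g(P))}^{(S)})\le \lambda(\mathrm{\nabla}g(0))=\beta \,{f}^{\prime}(0)\,\lambda(W)-\delta <0. \qquad ▪$$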

When *f* is the identity—which is concave—we are effectively using the standard graph-based model, albeit with weighted edges. The inequality $\lambda (W)\beta /\delta <1$ in theorem 6.5 then generalizes the well-known vanishing condition (2.2) found, for instance, in [8–10]. To our knowledge, previous arguments leading to this vanishing condition for graphs were not completely rigorous. Theorem 6.5 provides a new, rigorous justification for this spectral bound in the case of traditional mean field graph models.

A straightforward adaptation of the proof of theorem 6.5 yields the following global asymptotic stability result for the more general partitioned model.

#### Theorem 6.6.

*Suppose all* ${f}_{k}$ *are concave. If*

$$\lambda\Big(\sum_{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)}\Big)\frac{\beta}{\delta}<1,$$

*then* $0\in {\mathbb{R}}^{n}$ *is a globally asymptotically stable equilibrium of* (*4.1*), *with* ${g}_{i}$ *defined in* (*6.2*).

### 7. Simulations to test the spectral condition

We now show the results of experiments that test the sharpness of our spectral vanishing condition. Here, we used the concave functions $f(x)=2\log(1+x)$ (in figures 5*a* and 6) and $f(x)=\arctan(x)$ (in figures 5*b* and 7) to construct partitioned models with ${f}_{1}(x)=x$ and ${f}_{k}(x)=f(x)$ for all $k\ge 2$. We fixed a hypergraph with $n=400$ nodes, 400 edges, 200 hyperedges of size 3, 100 hyperedges of size 4 and 50 hyperedges of size 5. At time zero, each node was infected with independent probability ${i}_{0}=0.5$ and we used a recovery rate of $\delta =1$. In addition to the mean field ODE, we also simulated the microscale model, averaged over five runs, using the discretization scheme described in §5a, with $\mathrm{\Delta}t=0.1$.

In figure 5, the circles (red) show the corresponding proportion of infected individuals according to the mean field model, $\sum _{i}{p}_{i}(t)/n$, at time $t=150$, for a range of different *β* between 0 and 0.2: $\beta \in \{(0.01)k\hspace{0.17em}|\hspace{0.17em}k\in \{0,\dots ,20\}\}$. The crosses (blue) show the corresponding proportion of infected individuals from the microscale model, $\sum _{i}{X}_{i}(t)/n$, at time $t=150$. The vertical green line represents the critical value ${\beta}_{c}:=\delta /\lambda (\sum _{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)})$ (we have ${\beta}_{c}\approx 0.0265$ on the left and ${\beta}_{c}\approx 0.0490$ on the right).
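The critical value ${\beta}_{c}$ can be computed directly from the hypergraph. The sketch below assumes that ${W}^{(k)}$ counts, for each pair of nodes, the shared hyperedges of category *k* (size $k+1$), and uses power iteration as a simple stand-in for an eigenvalue solver; the function names and the toy hypergraph are hypothetical, with slopes ${f}_{1}^{\prime}(0)=1$ and ${f}_{2}^{\prime}(0)=2$ matching ${f}_{1}(x)=x$ and ${f}_{2}(x)=2\log(1+x)$.

```python
def weighted_pair_matrix(n, hyperedges, slope_at_zero):
    """Return sum_k f_k'(0) W^(k), with category k = hyperedges of size k + 1."""
    mat = [[0.0] * n for _ in range(n)]
    for members in hyperedges:
        weight = slope_at_zero[len(members) - 1]
        for i in members:
            for j in members:
                mat[i][j] += weight
    return mat


def largest_eigenvalue(matrix, iterations=200):
    """Power iteration; adequate here because all entries are positive."""
    n = len(matrix)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)                      # current eigenvalue estimate
        v = [wi / lam for wi in w]        # rescale so the largest entry is 1
    return lam


hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
slope_at_zero = {1: 1.0, 2: 2.0}          # f_1'(0) and f_2'(0)
delta = 1.0

M = weighted_pair_matrix(4, hyperedges, slope_at_zero)
lam = largest_eigenvalue(M)
beta_c = delta / lam
print(lam, beta_c)
```

For this toy example the largest eigenvalue can be found by hand as $6+2\sqrt{5}$, which gives a convenient check on the iteration.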

For the mean field model, we know from theorem 6.6 that $\beta <{\beta}_{c}$ guarantees global stability of the zero-infection state. We see that ${\beta}_{c}$ also lies close to the threshold beyond which extinction of the disease is lost in the mean field model. For the individual-level stochastic model, theorem 8.3 below shows that $\beta <{\beta}_{c}$ is also sufficient for eventual extinction of the disease. This is consistent with the results in figure 5.

In figures 6*a* and 7*a*, we show individual trajectories of the proportion of infected individuals, $\sum _{i}{p}_{i}(t)/n$, according to the mean field model, for a range of *β* values. For the same range of *β* values, the plots on the right of these figures show the corresponding proportion of infected individuals from the microscale model, $\sum _{i}{X}_{i}(t)/n$. The curves are coloured in red if the spectral vanishing condition $\beta <{\beta}_{c}$ is satisfied. We see qualitative agreement between the mean field and individual-level models, and extinction for the *β* values below the spectral threshold.

Having derived and tested spectral conditions that concern extinction of the disease at the mean field approximation level, in the next section, we study the microscale model directly.

### 8. Exact model

To proceed, we recall our setting where at time zero each node has the same, independent, probability, ${i}_{0}$, of being infectious; so $\mathbb{P}({X}_{j}(0)=1)={i}_{0}$ for all $1\le j\le n$. This implies that $n\hspace{0.17em}{i}_{0}$ is the expected number of infectious individuals at time zero.

We are interested in the stochastic process $\sum _{i}{X}_{i}(t)$, which records the number of infected individuals. Our analysis generalizes arguments in [8], which considered a stochastic SIS model on a graph with *f* as the identity map. We are able to prove results in the case of concave nonlinearities, giving insights into the behaviour of the exact model and also allowing comparison with corresponding results for the mean field approximation. Furthermore, we show in §9 how these results can be extended to the case of collective contagion models.

#### (a) Extinction

Our first result shows that the spectral condition arising from the mean field analysis in theorems 6.1 and 6.5 is also relevant to the probability of extinction in the individual-level model.

#### Theorem 8.1.

*Suppose f is concave in the hypergraph infection model* (*3.1*). *Then*

*Hence, if* $\lambda (W){f}^{\prime}(0)\beta /\delta <1$, *then the disease vanishes at an exponential rate*.

#### Proof.

Consider the continuous time Markov process ${\{{({Y}_{i}(t))}_{t\ge 0}\}}_{i=1}^{n}$ taking values in ${\mathbb{N}}^{n}$, with transition of states defined for every $1\le i\le n$ and $\hspace{0.17em}t\ge 0$ by

Suppose also that ${X}_{i}(0)={Y}_{i}(0)$ for all $1\le i\le n$. Since *f* is concave,

We deduce, analogously to [8], the following corollary.

#### Corollary 8.2.

*Suppose f is concave in the hypergraph infection model* (*3.1*). *Let τ denote the time of extinction of the disease and suppose*$\lambda (W){f}^{\prime}(0)\beta <\delta $, *then*

#### Proof.

Using theorem 8.1,

Likewise, the partitioned case yields the following result.

#### Theorem 8.3.

*Suppose every*${f}_{k}$*is concave in the partitioned hypergraph model with infection rate* (*3.2*). *Then*

*Hence, if* $\lambda (\sum _{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)})\beta /\delta <1$, *then the disease vanishes at an exponential rate*.

We also have the following analogue of corollary 8.2 on the expected time to extinction for the partitioned case.

#### Corollary 8.4.

*Suppose every*${f}_{k}$*is concave in the partitioned hypergraph model with infection rate* (*3.2*). *Let τ denote the time of extinction of the disease and suppose*$\lambda (\sum _{k=1}^{K}{f}_{k}^{\prime}(0){W}^{(k)})\beta /\delta <1$, *then*

#### (b) Conditions that preclude extinction

So far, we have focused on deriving thresholds that imply extinction. In this subsection, following ideas from [8], we derive a condition under which the disease will persist.

Note that our analysis does not require the graph associated with *W* to be connected. The disconnected setting is relevant, for example, when interventions have been imposed in order to limit interactions. We let $\mathrm{\Delta}:=D-W$ denote the Laplacian, and let ${\lambda}_{c}(\mathrm{\Delta})>0$ denote the smallest non-zero eigenvalue of $\mathrm{\Delta}$. We also let ${e}_{max}$ denote the size of the largest hyperedge.

#### Definition 8.5.

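Given a hypergraph $\mathcal{H}$, a function $f:{\mathbb{R}}_{+}\to {\mathbb{R}}_{+}$ and a subset of the nodes $S\subset V$, let

$$E(S,f):=\sum_{i\in S}\sum_{h}\mathcal{I}_{ih}\,f\Big(\sum_{j\notin S}\mathcal{I}_{jh}\Big),$$

and, for a positive integer $m$, let

$$\eta(\mathcal{H},m,f):=\min_{S\subset V,\;0<|S|\le m}\frac{E(S,f)}{|S|},$$

writing $\eta(\mathcal{H},m)$ when $f$ is the identity.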

Note that when $S$ consists of those nodes for which ${X}_{i}(t)=0$, we can write the infection transition rate of $\sum _{i=1}^{n}{X}_{i}(t)$ as $\beta E(S,f)$. More generally, $\beta E(S,f)$ may be regarded as the rate at which nodes in the set $S$ may be infected by nodes in the remainder of the network. When $f=Id$ and $m=\lfloor n/2\rfloor $, $\eta (\mathcal{H},m)$ is the Cheeger constant, or isoperimetric number, associated with the weighted graph induced by $W=\mathcal{I}{\mathcal{I}}^{T}$. We may also regard $\eta (\mathcal{H},m,f)$ as the smallest average infection rate over all subsets consisting of no more than half of the network.
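For small hypergraphs these quantities can be evaluated by brute force. The sketch below assumes the forms $E(S,f)=\sum_{i\in S}\sum_{h}\mathcal{I}_{ih}f(\sum_{j\notin S}\mathcal{I}_{jh})$ and $\eta(\mathcal{H},m,f)=\min_{0<|S|\le m}E(S,f)/|S|$, which are consistent with the description above; the function names and the toy hypergraph are hypothetical, and the enumeration is exponential in *n*, so it is only meant for illustration.

```python
from itertools import combinations


def boundary_rate(S, hyperedges, f):
    """E(S, f): total rate (per unit beta) at which nodes in S are infected
    when every node outside S is infected."""
    S = set(S)
    total = 0.0
    for members in hyperedges:
        inside = sum(1 for j in members if j in S)
        outside = len(members) - inside
        total += inside * f(outside)      # each node of S in h feels f(outside)
    return total


def eta(n, hyperedges, m, f):
    """Smallest average rate E(S, f) / |S| over non-empty subsets with |S| <= m."""
    best = float("inf")
    for size in range(1, m + 1):
        for S in combinations(range(n), size):
            best = min(best, boundary_rate(S, hyperedges, f) / size)
    return best


hyperedges = [(0, 1), (1, 2, 3), (0, 2, 3)]
n = 4
eta_identity = eta(n, hyperedges, n // 2, lambda s: float(s))
print(eta_identity)
```

With *f* the identity and $m=\lfloor n/2\rfloor$, the value returned is the Cheeger-type constant of the weighted graph induced by *W* for this toy example.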

The next theorem gives a probabilistic lower bound on the time to extinction.

#### Theorem 8.6.

*Recall that τ denotes the hitting time of the state* 0 *for the process* ${(\sum _{j}{X}_{j}(t))}_{t\ge 0}$ *in the hypergraph model* (*3.1*). *If f is concave and* ${\lambda}_{c}(\mathrm{\Delta})>2\,\dfrac{{e}_{\max}-1}{f({e}_{\max}-1)}\,\dfrac{\delta}{\beta}$, *then*

*where* $r:=\dfrac{({e}_{\max}-1)\,\delta}{f({e}_{\max}-1)\,\beta \,\eta (\mathcal{H},m)}<1$ *and* $m:=\lfloor n/2\rfloor $.

From standard Cheeger inequalities [21], we know that the Cheeger constant of the graph induced by *W* satisfies $2\eta (\mathcal{H},m)\ge {\lambda}_{c}(\mathrm{\Delta})$; hence, using the assumptions on ${\lambda}_{c}(\mathrm{\Delta})$ in the theorem, we see that $r<1$ indeed.

In order to prove this result, we introduce the following lemma.

#### Lemma 8.7.

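*If f is concave and non-decreasing, then*

$$\eta(\mathcal{H},m,f)\ge \frac{f({e}_{\max}-1)}{{e}_{\max}-1}\,\eta(\mathcal{H},m).$$

#### Proof.

The proof is immediate once we see that, by concavity of *f*, for all $x\in \{0,\dots ,{e}_{\max}-1\}$

$$f(x)\ge \frac{f({e}_{\max}-1)}{{e}_{\max}-1}\,x. \qquad ▪$$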

Now, to prove theorem 8.6 consider the Markov process ${(Z(t))}_{t\ge 0}$ valued in $\{0,\dots ,m\}$, with transition of states given by

### 9. Collective contagion models

We now consider the collective contagion models from [5–7], where infection only starts spreading within a hyperedge once a threshold number of infectious nodes in that hyperedge has been reached. As discussed in §2b, collective contagion models can be represented by nonlinear functions of the form $f(x):=\max\{0,x-c\}$ for some $c>0$, or $f(x):={c}_{2}\mathbb{1}(x\ge {c}_{1})$ for some ${c}_{1},{c}_{2}>0$. In these cases, the zero-infection state for the mean field approximation is locally asymptotically stable (indeed, theorem 6.1 applies with ${f}^{\prime}(0)=0$). However, because the functions are not concave, the theory found in §8 for the exact model does not directly apply. Nonetheless, we can still derive similar spectral conditions for the vanishing of the disease by finding concave functions which serve as upper bounds for *f*. For instance, using ${c}_{2}\mathbb{1}(x\ge {c}_{1})\le \frac{{c}_{2}}{{c}_{1}}x\,\mathbb{1}(x<{c}_{1})+{c}_{2}\mathbb{1}(x\ge {c}_{1})=\min\{({c}_{2}/{c}_{1})x,\,{c}_{2}\}$, the bounds in theorem 8.1 and corollary 8.2 lead to the following result.

#### Theorem 9.1.

*Suppose that* $f(x):=c_{2}\mathbb{1}(x\ge c_{1})$ *for some* $c_{1},c_{2}>0$. *Then*

*In particular, if* $\lambda(W)<\dfrac{c_{1}}{c_{2}}\,\dfrac{\delta}{\beta}$,

*then the disease asymptotically vanishes with exponential decay and the extinction time τ satisfies*

Likewise, note that $\max\{0,x-c\}\le \dfrac{e_{\max}-1-c}{e_{\max}-1}\,x$ for $x\in\{0,\dots,e_{\max}-1\}$, where we recall that $e_{\max}$ is the largest size of a hyperedge of $\mathcal{H}$. Hence, we deduce the following result.
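Before stating the result, the inequality can be verified directly. The following sketch assumes $0<c<e_{\max}-1$ and $0\le x\le e_{\max}-1$; these range assumptions are ours and reflect the fact that a node has at most $e_{\max}-1$ infectious neighbours within a hyperedge.

$$
\frac{e_{\max}-1-c}{e_{\max}-1}\,x \;=\; x-c\,\frac{x}{e_{\max}-1}\;\ge\; x-c,
\qquad
\frac{e_{\max}-1-c}{e_{\max}-1}\,x\;\ge\;0 .
$$

The first bound holds because $x/(e_{\max}-1)\le 1$, and the second because both factors are non-negative. Together they give $\max\{0,x-c\}\le \dfrac{e_{\max}-1-c}{e_{\max}-1}\,x$, with equality at $x=e_{\max}-1$.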

#### Theorem 9.2.

*Suppose that* $f(x):=\max\{0,x-c\}$ *for some* $c>0$. *Then*

*In particular, if* $\lambda(W)<\dfrac{e_{\max}-1}{e_{\max}-1-c}\,\dfrac{\delta}{\beta}$,

*then the disease asymptotically vanishes with exponential decay and the extinction time τ satisfies*

### 10. Discussion

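In this work, we derived spectral conditions that control the spread of disease in an SIS model on a hypergraph. The conditions have the general form

$$
\lambda(W)<\frac{1}{c_{f}}\,\frac{\delta}{\beta}, \tag{10.1}
$$

where $c_{f}$ is a constant that depends on the function *f* that determines the nonlinear infection rate within a hyperedge.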

We note that in the special case where (i) the hypergraph is an undirected graph and hence *W* becomes the binary adjacency matrix and (ii) we have linear dependence on the number of infectious neighbours for the infection rate of a node, so *f* is the identity function, the condition (10.1) reduces to the well-known vanishing spectral condition studied in, for example, [8–10].

There are two important points to be made about the general form of (10.1). First, the hypergraph structure appears only via the presence of the symmetric matrix $W\in {\mathbb{R}}^{n\times n}$. Recall that ${W}_{ij}$ records the number of times that *i* and *j* both appear in the same hyperedge. Such weighted but *pairwise* information is all that feeds into this spectral threshold. On a positive note, this implies that useful predictions can be made about disease spread on a hypergraph without full knowledge of the types of hyperedge present and the distribution of nodes within them. (For example, when collecting human interaction data it is more reasonable to ask an individual to list each contact and state how many different ways they interact with that contact than to ask an individual to list all hyperedges they take part in.) However, this observation also raises the possibility that more refined analysis might lead to sharper bounds, perhaps at the expense of simplicity and interpretability.
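To make the role of *W* concrete, the following minimal sketch builds the pairwise co-occurrence matrix from a list of hyperedges and estimates its largest eigenvalue $\lambda(W)$ by shifted power iteration. It is an illustration under stated assumptions rather than the code used for the experiments in this paper: the function names are hypothetical, nodes are assumed to be labelled $0,\dots,n-1$, and the diagonal of *W* is taken to be zero.

```python
from itertools import combinations


def co_occurrence_matrix(n, hyperedges):
    """W[i][j] = number of hyperedges containing both i and j (i != j).

    Assumption: nodes are labelled 0..n-1 and the diagonal is left at zero.
    """
    W = [[0.0] * n for _ in range(n)]
    for edge in hyperedges:
        for i, j in combinations(sorted(set(edge)), 2):
            W[i][j] += 1.0
            W[j][i] += 1.0
    return W


def largest_eigenvalue(W, iters=2000):
    """Estimate the largest eigenvalue of a symmetric non-negative matrix.

    Power iteration is applied to W + shift*I, where the shift exceeds the
    largest row sum, so that the dominant eigenvalue of the shifted matrix
    corresponds to the largest (not merely largest-modulus) eigenvalue of W.
    """
    n = len(W)
    shift = max(sum(row) for row in W) + 1.0
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(W[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Wv = [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
    # Rayleigh quotient; v has unit Euclidean norm after the loop.
    return sum(v[i] * Wv[i] for i in range(n))


if __name__ == "__main__":
    # Hypothetical example: one group of three people and one pair.
    W = co_occurrence_matrix(4, [[0, 1, 2], [2, 3]])
    print(largest_eigenvalue(W))
```

Only the counts $W_{ij}$ enter the computation, which is the sense in which the threshold depends on weighted pairwise information alone.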

Our second point is that the new vanishing condition (10.1) neatly separates into three aspects:

(i) The biologically motivated infection parameter, $\beta$.

(ii) The interaction structure, captured in $\lambda (W)$.

(iii) The coefficient ${c}_{f}$ that arises from modelling the nonlinear infection process. For instance, theorems 6.1 and 6.5 have ${c}_{f}={f}^{\prime}(0)$. In the collective contagion model case $f(x)={c}_{2}\mathbb{1}(x\ge {c}_{1})$, theorem 9.1 indicates that we can take ${c}_{f}={c}_{2}/{c}_{1}$.

We may view *β* as an invariant biological constant that reflects the underlying virulence of the disease and is not affected by human behaviour. The factor $\lambda (W)$, which arises from the interaction structure, will be determined by regional and cultural issues, including population density, age demographics, typical household sizes and the nature of prevalent commercial and manufacturing activities. Interventions, including full or partial lockdowns, could be modelled through a change in $\lambda (W)$. The third factor, ${c}_{f}$, is strongly dependent upon human behaviour and may be adjusted to reflect individual-based containment strategies such as social distancing, mask wearing or more frequent hand washing.
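The separation into three factors can be illustrated with a small sketch. It checks a condition of the form $\lambda(W)<\delta/(\beta\,c_{f})$, which is the form taken by the condition in theorem 9.1 when $c_{f}=c_{2}/c_{1}$. The function name and all numerical values are hypothetical and chosen only for illustration, and $\delta$ is taken here to be the recovery rate, as in a standard SIS setting.

```python
def extinction_condition_holds(beta, delta, lam_W, c_f):
    """Return True if lam_W < delta / (beta * c_f).

    beta  : infection parameter (viewed as a fixed property of the pathogen)
    delta : recovery rate (assumed meaning of delta)
    lam_W : largest eigenvalue of the co-occurrence matrix W
    c_f   : coefficient arising from the nonlinear infection function f
    """
    return beta * c_f * lam_W < delta


if __name__ == "__main__":
    beta, delta = 0.2, 1.0  # hypothetical values

    # Baseline: no intervention.
    print(extinction_condition_holds(beta, delta, lam_W=8.0, c_f=1.0))

    # Individual-level measures (e.g. masks) modelled as a smaller c_f.
    print(extinction_condition_holds(beta, delta, lam_W=8.0, c_f=0.5))

    # Partial lockdown modelled as a smaller lambda(W).
    print(extinction_condition_holds(beta, delta, lam_W=4.0, c_f=1.0))
```

In this toy setting, halving either $c_{f}$ or $\lambda(W)$ is enough to satisfy the condition, while $\beta$ is left unchanged throughout.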

This work has focused on modelling, analysis and interpretation at the abstract level, concentrating on the fundamental question of disease extinction. Having developed this theory, it would, of course, now be of great interest to perform practical experiments using realistic interaction and infection data, with the aim of

- calibrating model parameters,

- testing hypotheses about the appropriate functional form of the infection rate,

- testing the predictive power of the modelling framework, especially in comparison with simpler homogeneous mixing and pairwise interaction versions,

- quantifying the effect of different interventions.

### Data accessibility

Code to run the experiments reported here is available at https://www.maths.ed.ac.uk/dhigham/algfiles.html.

### Authors' contributions

Both authors contributed to the modelling and analysis, and to the document preparation. H.-L.d.K. implemented the computational tests.

### Competing interests

The authors have no competing interests.

### Funding

Both authors were supported by Engineering and Physical Sciences Research Council grant no. EP/P020720/1.

### References

1. Anderson RM, May RM. 1992 **Infectious diseases of humans: dynamics and control**. Oxford, UK: Oxford University Press.
2. Kermack WO, McKendrick AG. 1927 A contribution to the mathematical theory of epidemics. **Proc. R. Soc. A** **115**, 700-721.
3. Estrada E. 2020 COVID-19 and SARS-CoV-2. Modeling the present, looking at the future. **Phys. Rep.** **869**, 1-51. (doi:10.1016/j.physrep.2020.07.005)
4. Giordano G, Blanchini F, Bruno R, Colaneri P, Di Filippo A, Di Matteo A, Colaneri M. 2020 Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy. **Nat. Med.** **26**, 855-860. (doi:10.1038/s41591-020-0883-7)
5. de Arruda GF, Petri G, Moreno Y. 2020 Social contagion models on hypergraphs. **Phys. Rev. Res.** **2**, 023032. (doi:10.1103/PhysRevResearch.2.023032)
6. Iacopini I, Petri G, Barrat A, Latora V. 2019 Simplicial models of social contagion. **Nat. Commun.** **10**, 1-9. (doi:10.1038/s41467-018-07882-8)
7. Landry NW, Restrepo JG. 2020 The effect of heterogeneity on hypergraph contagion models. **Chaos** **30**, 103117. (doi:10.1063/5.0020034)
8. Ganesh A, Massoulié L, Towsley D. 2005 The effect of network topology on the spread of epidemics. **Proc. IEEE INFOCOM** **2**, 1455-1466.
9. Mieghem PV, Omic J, Kooij R. 2009 Virus spread in networks. **IEEE Trans. Netw.** **17**, 1-14. (doi:10.1109/TNET.2008.925623)
10. Wang Y, Chakrabarti D, Wang C, Faloutsos C. 2003 Epidemic spreading in real networks: an eigenvalue point of view. In *Proc. 22nd Int. Symp. on Reliable Distributed Systems (SRDS 2003), Florence, Italy, 6–8 October 2003*, pp. 25–34. Piscataway, NJ: IEEE Press.
11. Battiston F, Cencetti G, Iacopini I, Latora V, Lucas M, Patania A, Young J-G, Petri G. 2020 Networks beyond pairwise interactions: structure and dynamics. **Phys. Rep.** **874**, 1-92. (doi:10.1016/j.physrep.2020.05.004)
12. Bodó A, Katona G, Simon P. 2016 SIS epidemic propagation on hypergraphs. **Bull. Math. Biol.** **78**, 713-735. (doi:10.1007/s11538-016-0158-0)
13. Alvarez-Rodriguez U, Battiston F, de Arruda GF, Moreno Y, Perc M, Latora V. 2021 Evolutionary dynamics of higher-order interactions in social networks. **Nat. Hum. Behav.** **5**, 586-595. (doi:10.1038/s41562-020-01024-1)
14. Benson AR, Abebe R, Schaub MT, Jadbabaie A, Kleinberg J. 2018 Simplicial closure and higher-order link prediction. **Proc. Natl Acad. Sci. USA** **115**, E11221-E11230. (doi:10.1073/pnas.1800683115)
15. Benson AR, Gleich DF, Leskovec J. 2016 Higher-order organization of complex networks. **Science** **353**, 163-166. (doi:10.1126/science.aad9029)
16. Estrada E, Rodríguez-Velázquez JA. 2006 Subgraph centrality and clustering in complex hyper-networks. **Physica A** **364**, 581-594. (doi:10.1016/j.physa.2005.12.002)
17. Bretto A. 2013 **Hypergraph theory: an introduction**. Berlin, Germany: Springer.
18. Verhulst F. 1990 **Nonlinear differential equations and dynamical systems**. Berlin, Germany: Springer.
19. Hartman P. 1961 On the stability in the large for systems of ordinary differential equations. **Can. J. Math.** **13**, 480-492. (doi:10.4153/CJM-1961-040-6)
20. Gillespie DT. 1991 **Markov processes: an introduction for physical scientists**. Cambridge, MA: Academic Press.
21. Montenegro R, Tetali P. 2006 Mathematical aspects of mixing times in Markov chains. **Found. Trends Theor. Comput. Sci.** **1**, 237-354. (doi:10.1561/0400000003)