Exogenous re-infection and the dynamics of tuberculosis epidemics: local effects in a network model of transmission
Abstract
Infection with Mycobacterium tuberculosis leads to tuberculosis (TB) disease by one of the three possible routes: primary progression after a recent infection; re-activation of a latent infection; or exogenous re-infection of a previously infected individual. Recent studies show that optimal TB control strategies may vary depending on the predominant route to disease in a specific population. It is therefore important for public health policy makers to understand the relative frequency of each type of TB within specific epidemiological scenarios. Although molecular epidemiologic tools have been used to estimate the relative contribution of recent transmission and re-activation to the burden of TB disease, it is not possible to use these techniques to distinguish between primary disease and re-infection on a population level. Current estimates of the contribution of re-infection therefore rely on mathematical models which identify the parameters most consistent with epidemiological data; these studies find that exogenous re-infection is important only when TB incidence is high. A basic assumption of these models is that people in a population are all equally likely to come into contact with an infectious case. However, theoretical studies demonstrate that the social and spatial structure can strongly influence the dynamics of infectious disease transmission. Here, we use a network model of TB transmission to evaluate the impact of non-homogeneous mixing on the relative contribution of re-infection over realistic epidemic trajectories. In contrast to the findings of previous models, our results suggest that re-infection may be important in communities where the average disease incidence is moderate or low as the force of infection can be unevenly distributed in the population. These results have important implications for the development of TB control strategies.
1. Introduction
The global burden of tuberculosis (TB) has increased over the past two decades, despite widespread implementation of control measures including BCG vaccination and the World Health Organization's DOTS strategy which focuses on case finding and short-course chemotherapy. This rise has been attributed to the spread of HIV, the collapse of public health programs and the emergence of drug-resistant strains of Mycobacterium tuberculosis. The rise in TB incidence has led to a growing consensus that new strategies will be needed to achieve TB control in sub-Saharan Africa and Eastern Europe (Corbett et al. 2003; Harries & Dye 2006). Proposed approaches include active case finding, isoniazid preventive therapy (IPT), anti-retroviral therapy among the HIV-infected and improved detection and treatment of patients with multidrug-resistant TB (Corbett et al. 2006; Dye 2006; Zignol et al. 2006). Novel anti-TB drugs, shortened drug regimens and the use of improved vaccines may also serve an important role in TB control in the future.
Infection with M. tuberculosis leads to disease by one of the three routes. A small proportion of those infected will develop primary disease within several years of their first infection. Those who escape primary disease may eventually re-activate this latent infection at some point during their lifetimes; as such, re-activation may occur decades after an initial transmission event. Lastly, latently infected patients can be re-infected and develop disease as a result of this new exposure. In the short term, strategies such as infection control measures designed to reduce TB transmission are expected to lower the incidence of disease that arises from primary or exogenous re-infection, but will have little effect on the incidence of re-activation disease which results from infection that was acquired in the distant past. Conversely, strategies that emphasize the diagnosis and treatment of latent infection will reduce re-activation TB, but may not be an effective way to interrupt an ongoing chain of transmission. Researchers also argue that TB control measures will have variable effects which depend on the frequency of re-infection. Gomes et al. (2004) developed a model in which they postulated that TB interventions such as IPT and BCG will perform less effectively when re-infection is the predominant route to disease.
Given the practical importance of identifying the relative contributions of primary, re-activation and re-infection disease to the burden of TB, it is unfortunate that there are no empirical means with which to quantify the incidence of re-infection TB. Although molecular epidemiologic tools have been used to estimate the relative contribution of recent transmission and re-activation (Alland et al. 1994; Small et al. 1994), it is not possible to use these techniques to distinguish between primary and re-infection TB on a population level. Estimates of the contribution of re-infection therefore rely on mathematical models which can be used to infer the average degree of immunity to subsequent infection afforded by a first infection with TB (Romeyn 1970; Sutherland et al. 1982; Vynnycky & Fine 1997). Chiang & Riley (2005) recently summarized the conclusions of these models in their excellent review on exogenous re-infection. While the details of these models differ, most find that re-infection increases in frequency as the prevalence of TB rises and consequently that it presents a problem only in especially high burden communities.
The finding that exogenous re-infection is unimportant where disease is rare, however, depends strongly on the assumption, shared by each of the models reviewed, that individuals mix randomly in a population. As noted by Ferguson et al. (2003), this assumption of homogeneous mixing is especially unrealistic for diseases in which close contact is necessary for transmission. Individuals are more likely to contact those with whom they share a home, a school or a workplace, than those with whom they have no particular link. Incorporation of the spatial and social structure into models of infectious diseases has revealed that these factors influence disease dynamics and the structure of pathogen and host populations (Keeling et al. 1997; Watts & Strogatz 1998; Keeling 1999; Read & Keeling 2003; Keeling & Eames 2005). In particular, several studies have demonstrated that the efficacy of interventions depends on assumptions of how members of a population interact (Eames & Keeling 2003; Pourbohloul et al. 2005).
While previous models have considered the effects of heterogeneity in contact structure (Aparicio et al. 2000; Song et al. 2002; Schinazi 2003), here we describe the first network model of TB that permits exploration of the effects of non-random mixing over realistic epidemic trajectories. We use the model, with standard assumptions about the dynamics of primary progression, endogenous re-activation and exogenous re-infection, to assess the importance of these routes to disease for populations with structured mixing. We then compare our findings to models which assume random mixing. We highlight the similarities and differences in our results and discuss the impact of these findings for the design and evaluation of TB interventions.
2. Model description
2.1 Generation of networks
We develop a network model of TB in an idealized community. Following Read & Keeling (2003), we simulate a population where each individual is placed at random on a square patch at a constant average density. Contacts between individuals are drawn as edges connecting vertices; these edges represent sufficient contact for transmission of TB. We specify that the chance of a link between two individuals decreases as the distance between them increases, such that infection is transmitted preferentially to individuals in the proximity of an infectious case. Thus, individuals located nearest to each other on the network can be thought of as family members, while those slightly farther away may be neighbours, friends or other social contacts. Because links are assigned with a probability related to distance rather than by a fixed cut-off, some longer-distance contacts will exist in the network. We specify the relative probability of making shorter- versus longer-distance connections by setting a single parameter, D. Networks with low D values have most of their edges between nearby individuals while networks with high D values are more global; adjustment of this single parameter allows us to calibrate the extent of clustering. We also set n, the average number of contacts of each vertex; n is called the average degree of the graph (see appendix A for additional details).
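The construction described above can be sketched in a few lines. This is a minimal illustration only: it assumes a Gaussian distance kernel of width D (an illustrative choice, not necessarily the form used in appendix A), and the function name `build_spatial_network` and its arguments are hypothetical. The dense pairwise-distance matrix is suitable for small graphs such as those in figure 1, not for 100 000 individuals.

```python
import numpy as np

def build_spatial_network(num_people, avg_degree, D, density=1.0, seed=0):
    """Place people uniformly on a square and link pairs with a
    probability that decays with distance (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    side = np.sqrt(num_people / density)          # square patch at constant density
    xy = rng.uniform(0.0, side, size=(num_people, 2))

    # Pairwise Euclidean distances (O(N^2) memory; small graphs only).
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # ASSUMPTION: Gaussian kernel of width D; low D favours short edges.
    weight = np.exp(-dist ** 2 / (2.0 * D ** 2))
    np.fill_diagonal(weight, 0.0)                 # no self-contacts

    # Rescale so the expected mean degree equals avg_degree
    # (clipping at 1 can lower it slightly when D is very small).
    scale = avg_degree * num_people / weight.sum()
    prob = np.clip(scale * weight, 0.0, 1.0)

    # Draw each unordered pair once, then symmetrise.
    upper = np.triu(rng.random((num_people, num_people)) < prob, k=1)
    adj = upper | upper.T
    return xy, adj
```

Calling `build_spatial_network(300, 8, D=1.0)` and `build_spatial_network(300, 8, D=10.0)` gives two graphs of similar average degree whose edge lengths differ markedly, in the spirit of the comparison in figure 1.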
2.2 Network characteristics
Transmission dynamics within a network depend in part on the clustering coefficient C (transitivity) of the graph, i.e. the average likelihood that two contacts of a given individual in the network are contacts of each other. In this framework, lower D values yield graphs with higher clustering coefficients. This means that the likelihood that two contacts of a single individual also share an edge, forming a triangle, increases as D decreases. Low D graphs represent communities in which there is tighter grouping of respiratory contacts that may lead to transmission of TB within clusters. Figure 1 shows two small graphs with different D values and the corresponding histograms of the length of edges in these graphs.
Figure 1 Graphs with two different D values. (a,c) D=1. (b,d) D=10. Clustering coefficients (C) are calculated using the formula in appendix A. While the degree distribution and average number of contacts for each of these graphs are similar, the average length of each of these connections is much higher on the D=10 graph (note that different length-scales are used for c and d). To allow better visualization of the networks, these graphs have 300 individuals and n (average degree)=8; graphs used for the simulations have 100 000 individuals and n=15.
Read & Keeling's method allows us to tune the clustering coefficient while maintaining a relatively small average degree. We fixed n at 15 for the simulations presented in this paper. Again, n represents the average number of people with whom each individual is in consistent and close enough respiratory contact to potentially transmit infection if that individual was to fall ill with TB. Importantly, the actual number of intimate respiratory contacts for each individual varies about this mean; such individual-level variation has important effects on the dynamics of respiratory diseases (Lloyd-Smith et al. 2005).
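As an illustration of the clustering coefficient discussed in this section, the sketch below computes the transitivity of a graph from its adjacency matrix, i.e. three times the number of triangles divided by the number of connected triples. This is one standard definition and is assumed here for illustration; the function name `transitivity` is hypothetical.

```python
import numpy as np

def transitivity(adj):
    """Fraction of connected triples that are closed into triangles."""
    a = np.asarray(adj, dtype=np.int64)
    degree = a.sum(axis=1)
    # trace(A^3) counts every triangle six times (3 start vertices x 2 directions).
    closed = np.trace(a @ a @ a)
    # sum k(k-1) counts every connected triple twice (ordered neighbour pairs).
    triples = (degree * (degree - 1)).sum()
    return closed / triples if triples > 0 else 0.0
```

A triangle gives a value of 1 and a simple chain gives 0; graphs generated with low D are expected to fall nearer the former than graphs generated with high D.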
2.3 Modelling the natural history and transmission of TB on the network
We model the natural history of TB using a modified susceptible–exposed–infectious–recovered structure consistent with previous TB models (Blower et al. 1996; Dye et al. 1998; Murray & Salomon 1998; Cohen & Murray 2004; Salomon et al. 2006). Individuals exist in several mutually exclusive states: susceptible to infection (S), latently infected (L), infectious (I) or recovered (R) (figure 2a). Following Vynnycky & Fine (1997), we incorporate re-infection assuming that individuals with latent infections or who recover from disease retain partial immunity which confers some protection from progression to disease upon re-infection. If an individual is linked to an infectious individual, he may become latently infected with a probability τ for each month that he is in contact with the infectious individual. Thus, a susceptible individual who is in contact with k infectious individuals in a given month has a probability of 1−(1−τ)^k of infection in that month. Parameter values are listed in table 1 and the expressions for the probabilities of transitions between states are provided in table 2.
Figure 2 Disease model transitions. (a) Infection model. Individuals are born into the susceptible state (S); if infected they move into a state of latency (L) from which they suffer primary progression to disease (vertical grey bars) for the first 5 years after infection, endogenous re-activation (diagonal black bars) if they progress to disease more than 5 years after an infection (or re-infection) event or exogenous re-infection (dark grey) if they progress to disease within 5 years after a re-infection event. Individuals in the diseased state (I) are infectious until they are either cured by drugs and move to the recovered state (R) (in our simulations this happens only after drugs become available in 1950), they self-recover and return to latency (arrows not shown), or they die. (b) Routes to disease. Progression for a hypothetical individual who is infected three times over the course of his life. The height of the bars represents the probability of progression to disease (by each route) as a function of time.
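Table 1 Parameter values. Values in parentheses are the ranges used in the sensitivity analysis (appendix A.5).
parameter | description | value | incidence sensitivity (%) | source |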
---|---|---|---|---|
τ | infectiousness per contact, per month | 0.17 (0.12, 0.23) | −51,+87 | fit |
μ | mortality, per year | 0.02 (0.014, 0.026) | +27,−26 | assumption |
μTB | TB-mortality, per year | 0.3 (0.22, 0.37) | +2,−3 | Rutledge & Crouch (1919) and Grzybowski & Enarson (1978) |
γ | birth rate, per year | 0.2 | n.a. | to maintain stable population size |
rS | self-recovery, per year | 0.2 (0.14, 0.25) | +5,−6 | Springett (1971) and Enarson & Rouillon (1994) |
λr | relapse probability after ‘cure’, per year | 0.05 (0.035, 0.065) | −34,+38 | |
rD | treatment efficacy, per month | 0.82 (0.58, 0.93) | +86,−11 | assumption |
z | partial immunity (protect from progression) | 0.4 (0.28, 0.52) | +16,−16 | Vynnycky & Fine (1997) and Dye et al. (1998) |
p1 | primary progression probability, per year | 0.03 (0.02, 0.04) | −57,+115 | Vynnycky & Fine (1997) |
p2 | endogenous re-activation probability, per year | 0.0003 (0.00021, 0.00039) | −12,+6 | Horwitz (1969) and Vynnycky & Fine (1997) |
fT | fraction of population with access to treatment | 0 (pre-1950) 0.9 (post-1950) | n.a. | assumption |
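Table 2 Probabilities of transitions between states. Columns give the current state and rows the next state; D in this table denotes the dead (vacant) state, not the network parameter D.
state | D | S | L | I | R |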
---|---|---|---|---|---|
D | 1−γ | μ | μ | (1−rD)μTB | μ |
S | γ | 1−μ−(1−μ)pk | 0 | 0 | 0 |
L | 0 | (1−μ)pk | (1−μ)−(1−μ)λ | (1−rD)(1−μTB)rS | 0 |
I | 0 | 0 | (1−μ)λ | 1−(1−rD)μTB−(1−rD)(1−μTB)rS−rD | (1−μ)λr |
R | 0 | 0 | 0 | rD | 1−μ−(1−μ)λr |
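The entries of table 2 can be assembled into a matrix and checked for consistency: since each column lists the probabilities of leaving one state, each column should sum to one. The sketch below is illustrative only. The function name and argument names are hypothetical, all inputs are assumed to be already expressed per simulation time step, and `p_k` and `lam` stand for the infection probability and the applicable progression probability appearing in the table.

```python
import numpy as np

STATES = ["D", "S", "L", "I", "R"]  # row/column order used in table 2

def transition_matrix(mu, mu_tb, gamma, p_k, lam, lam_r, r_s, r_d):
    """Entry [i, j] is the probability of moving from state j to state i."""
    stay_i = 1 - (1 - r_d) * mu_tb - (1 - r_d) * (1 - mu_tb) * r_s - r_d
    return np.array([
        # from D     from S                     from L                     from I                         from R
        [1 - gamma,  mu,                        mu,                        (1 - r_d) * mu_tb,             mu],
        [gamma,      1 - mu - (1 - mu) * p_k,   0,                         0,                             0],
        [0,          (1 - mu) * p_k,            (1 - mu) - (1 - mu) * lam, (1 - r_d) * (1 - mu_tb) * r_s, 0],
        [0,          0,                         (1 - mu) * lam,            stay_i,                        (1 - mu) * lam_r],
        [0,          0,                         0,                         r_d,                           1 - mu - (1 - mu) * lam_r],
    ], dtype=float)
```

For any admissible inputs, `transition_matrix(...).sum(axis=0)` should return a vector of ones, which offers a quick check when transcribing the table.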
Individuals infected for the first time move from the susceptible to the latently infected class. For the first 5 years after the infection, the per-year probability of progression to active disease is equal to p1. If these 5 years pass without progression, the per-year probability of endogenous re-activation is reduced to p2; this corresponds to the observation that the risk of disease is highest soon after infection and is substantially lower if this initial period elapses without disease progression (Holm 1969; Styblo 1991). Latently infected individuals are asymptomatic, are not infectious, and experience the same per-year probability of death (μ) as susceptibles.
Individuals with persistent latent infection may be re-infected. Individuals in this state are exposed to the same force of infection (i.e. 1−(1−τ)^k) as they would be if they were susceptible. However, the partial immunity (z) conferred by their first infection reduces their per-year probability of progression to pI=(1−z)p1 during the 5-year period after the re-infection event (figure 2b). Individuals may be re-infected on multiple occasions.
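A short worked example may make these two expressions concrete. Using the table 1 values τ=0.17, z=0.4 and p1=0.03, the sketch below evaluates the monthly infection probability for an individual with k infectious contacts and the reduced progression probability after re-infection. The function names are hypothetical and chosen only for this illustration.

```python
def infection_probability(tau, k):
    """Monthly probability of infection given k infectious contacts."""
    return 1.0 - (1.0 - tau) ** k

def reinfection_progression(p1, z):
    """Per-year progression probability in the 5 years after re-infection."""
    return (1.0 - z) * p1

for k in (0, 1, 2, 3):
    print(k, round(infection_probability(0.17, k), 4))
# prints 0 0.0, 1 0.17, 2 0.3111, 3 0.4282 (one pair per line)

print(round(reinfection_progression(0.03, 0.4), 4))  # prints 0.018
```

The infection probability saturates rather than growing linearly in k, so an individual embedded in a cluster with several infectious contacts faces a high, but bounded, monthly risk.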
Individuals with active disease experience an elevated TB-specific per-year probability of mortality (μTB). Individuals may contain their infection without intervention at a self-cure per-year probability (rS); these individuals return to a state of persistent latent infection. We also introduce curative antibiotic treatment. The distribution and efficacy of treatment is indexed by two terms: (i) the probability that an individual will be treated (fT) and (ii) the per-month efficacy of treatment (rD). Those receiving curative therapy progress to the recovered state from which they experience a per-year probability of relapse (λr).
2.4 Simulations
To produce trends similar to those in Western/Northern Europe over the past 100 years, we first fit the transmission parameter (τ) to generate epidemics that equilibrate near disease indices estimated for England and Wales in 1900. We then lower τ to allow the burden of disease to decrease at a rate similar to that observed in Western and Northern Europe over the first half of the century. Finally, for the final 50 years, we simulate antibiotic treatment by specifying the proportion of those who will be treated and the probability of treatment success. The sudden change in τ and the abrupt introduction of antibiotics allow the simulations to mimic the documented decline in TB over this period and are not intended to realistically represent gradual improvements in public health that contributed to this decline.
3. Results
3.1 Validity
Trends in TB incidence and annual risk of infection (ARI) from 1900 to 2000 generated with our model mirror patterns observed in Northern and Western Europe over this same period (figure 3; Vynnycky & Fine 1997). Here, we assume that 90% of TB cases are detected and treated after 1950. These simulations also re-create the ratio of TB prevalence : TB incidence : TB mortality (4 : 2 : 1) characteristic of developing countries before the introduction of antibiotics (results not shown; Styblo 1991).
Figure 3 Simulations. (a) TB incidence and (b) ARI on graphs with D=2 (dashed orange) and D=10 (solid blue) with 100 000 individuals and parameter values as listed in table 1. Snapshots showing 10 000 individuals during epidemics at (c,e) high and (d,f) low incidence on graphs of (c,d) D=2 and (e,f) D=10. Grey pixels are unoccupied spaces, blue pixels represent individuals who are susceptible to infection and yellow pixels represent individuals with latent infection. Even at incidence levels considered high for TB epidemics, infectious individuals (red) are not present in large enough numbers to be seen easily at this resolution and edges are not shown in this representation as they were in the much smaller network depicted in figure 1a,b. Clustering of those with latent infection is evident in the more local graph at low levels of incidence (d), but not easily detected at higher incidence (c) or on the more global graph at any time in the epidemic (e,f).
3.2 Impact of clustering
Figure 3a,b demonstrates the decline in TB incidence and ARI during the twentieth century on networks in which the distribution of respiratory contacts is more (D=2) and less (D=10) tightly clustered. Snapshots of simulations (figure 3c–f) show the relationship between clustering and the localization of infection and disease at different stages of the epidemic. Since individuals in low D networks are less likely to have long-distance contacts, the pathogen is transmitted locally and results in a high density of infection in the neighbourhoods surrounding infectious individuals. This effect is less dramatic early in the course of these simulations when disease prevalence is high and the majority of the population is latently infected but becomes more pronounced as levels of disease decline over time. We present snapshots from a growing TB epidemic to emphasize the modest effect of clustering at high incidence (figure 3c,e) and substantial effect of clustering at low incidence (figure 3d,f); the inverse relation between the importance of clustering and the incidence also holds during the decline of the epidemic, but is less marked than these snapshots indicate.
3.3 Exogenous re-infection
Figure 4 shows that the proportion of disease due to exogenous re-infection depends on both the TB incidence and the structure of the underlying contact network. Re-infection is only possible if individuals with latent infection are linked to infectious individuals. When disease is widespread, a susceptible or latently infected individual is likely to contact an infectious individual regardless of whether the network is dominated by tight clustering (lower D) or by more long-distance connections (higher D). In contrast, as disease incidence declines, the likelihood that a latently infected individual will be re-exposed becomes increasingly dependent on the structure of the contact network.
Figure 4 Importance of exogenous re-infection at different incidence levels and values of D. D has a substantial effect on estimates of the fraction of disease due to exogenous re-infection at low levels of TB incidence, but less impact at higher levels of disease occurrence.
If the relevant contact structure is dominated by short connections and tight clusters, communities with low average levels of disease may nonetheless support high local incidence of infection and disease. Thus, latently infected individuals may be repeatedly exposed to infectious individuals simply because they are in contact with others who are also at high risk of disease. Clustering results in a higher proportion of disease due to exogenous re-infection than has been previously postulated for low incidence settings using models which assume that individuals mix at random. On local graphs, we found that exogenous re-infection may account for 25% of the total progression to active disease even when incidence is only approximately 50 cases in 100 000 annually.
4. Discussion
Accurate estimates of the contribution of exogenous re-infection to the dynamics of TB epidemics would facilitate the selection of new intervention strategies and allow for better assessments of the performance of existing control programs. Our results are consistent with the prevailing view that exogenous re-infection may be the dominant mechanism for disease in areas of high incidence, but challenge the conventional wisdom that re-infection is negligible in low incidence settings. In simulations where local mixing predominates, we find that latent infection and consequently infectious TB occur in clusters. Thus, even when average TB incidence across a community is low, there may be pockets of disease and subgroups may be exposed to high local forces of infection. This may have implications for the divergence of distinct strains of TB.
Though the exact structure of respiratory contact networks in communities is not known, the socio-spatial structure represented by networks of low D aligns well with the existence of vulnerable subgroups in low incidence areas. For example, in the United States, individuals living in congregate settings, such as homeless shelters and prisons, are known to be at high risk of infection and re-infection (Nardell et al. 1986; MacIntyre et al. 1997). In contrast to most previous models which assume homogeneous mixing, this model allows that an individual who has been exposed to M. tuberculosis once is more likely to be exposed again than someone who has never been exposed, particularly if incidence is low. Conversely, when the average disease burden is moderate or high, there may be some individuals who are relatively protected from infection simply because they contact only a subset of the population which is not likely to have disease. This network model captures the notion that each member of the population may experience a unique force of infection which depends on the infection status of the people with whom he interacts.
Our results demonstrate that in large communities with low TB incidences, non-random mixing of the population allows that re-infection may play a larger role in disease dynamics than previously recognized. This result does not depend on the explicit inclusion of subpopulations with compromised immunity or otherwise distinctive disease risk. Prolonged or intense contact, such as the interactions occurring within closed settings like households, workplaces and hospitals, is generally considered necessary for transmission of M. tuberculosis (Grzybowski et al. 1975; van Geuns et al. 1975). Since each of these contexts shares the general property that two individuals in contact with another individual are also likely to be in contact with each other, it is reasonable to assert that the connections between individuals important for TB transmission can be qualitatively represented on graphs with high clustering coefficients. In areas where a substantial proportion of transmission is due to casual contacts, a network with a higher D value would better represent the contact structure. Observations of substantial transmission through casual contacts have mainly been reported (Classen et al. 1999) and modelled (Aparicio et al. 2000) in areas of high incidence. In these areas, re-infection is thought to be important regardless of specific contact structure.
The results presented here are based on a simplified model of an idealized network which attempts to capture some of the complexity of social contacts in the real world and explore the implications for TB dynamics. In particular, our network structure itself is static, though individuals move within the network upon birth or death of nearby individuals. Relaxation of the requirement that these networks are static may allow infection and disease to escape more easily from local clusters; thus, our simulations may overestimate the importance of re-infection in low incidence settings. However, we have used conservative estimates of exogenous re-infection: disease occurring more than 5 years after a re-infection event was labelled endogenous re-activation and all disease after treatment was counted as relapse. We have also excluded immigration of individuals with latent infection or active disease.
In contrast with models which assume homogeneous mixing, results from our network model suggest that exogenous re-infection may occur commonly even in communities with low average incidences of TB. This finding has important implications for the planning of interventions and the evaluation of existing control programs in low-incidence settings. While increased levels of re-infection may limit the effectiveness of interventions such as preventive therapy, the identification of clusters of individuals among whom disease is circulating may allow more targeted interventions and increase the utility of active case finding and contact tracing. Lessons learned from core group theory and the transmission of sexually transmitted diseases may be adapted by policy makers to assist the control of TB in low-incidence settings (Klovdahl et al. 1994; Kretzschmar et al. 1996).
A.1 Network characteristics
Following Read & Keeling (2003), we construct a network in the following way: given two vertices separated by a distance d, the probability of an edge linking them is given by

A.2 Model transitions and computations
The transitions between states are shown in table 2. Our model was run in Matlab using a vectorized Monte Carlo method to update the individuals' states. Clustering coefficients shown in figure 1 were computed according to

In figure 4, each point used to generate the surface represents an average of 50 simulations on two different graphs for each D. The incidences shown resulted from varying τ. Cubic spline smoothing was applied to generate a smoother surface, but did not change the surface shape.
A.3 Birth/recruitment process
Our approach, like Read & Keeling's, allows the population to be maintained at a stable size and avoids introducing susceptibles into areas with either too high or too low disease incidence, which would result in unintended spatial correlations (Keeling 2000). When an individual dies, for example at vertex x, a new susceptible individual is born with probability γ per year. If a birth occurs, we relocate agents according to the following scheme: a neighbour y of x moves into x's spot and assumes the contacts that x had. A neighbour of y then moves into y's spot, and after a set number of such replacements, a susceptible individual is born. We have used three replacements per birth in the simulations reported here. This method allows us to represent some social continuity upon death and reduces the need for continual network regeneration during the simulations.
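The replacement scheme above can be sketched as a chain of moves along the static network. The following is an illustration under stated assumptions rather than the Matlab implementation: each mover is chosen uniformly at random among neighbours not already in the chain, only the disease state is carried along (in a full model, infection history would move with the individual), and the names `replace_after_death`, `states` and `neighbours` are hypothetical.

```python
import random

def replace_after_death(states, neighbours, x, replacements=3, rng=random):
    """Fill the vacancy at vertex x by shifting individuals along a chain
    of neighbours, then place a newborn susceptible at the last vacated vertex.

    states     : mutable sequence of state labels, one per vertex
    neighbours : adjacency list (vertex -> list of contact vertices)
    Returns the vertex at which the susceptible individual is born.
    """
    vacant = x
    visited = {x}
    for _ in range(replacements):
        # ASSUMPTION: pick uniformly among neighbours not yet in the chain.
        candidates = [v for v in neighbours[vacant] if v not in visited]
        if not candidates:
            break
        mover = rng.choice(candidates)
        states[vacant] = states[mover]   # mover takes the spot and its contacts
        visited.add(mover)
        vacant = mover
    states[vacant] = "S"                 # newborn susceptible
    return vacant
```

Because vertices and edges are never altered, the individual who moves into a vertex automatically assumes the contacts of the previous occupant, which is why no regeneration of the network is needed.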
A.4 Estimating the proportion of disease due to exogenous re-infection
We classify each progression from the latent to the infectious state as resulting from one of the three mechanisms: primary progression, endogenous re-activation or exogenous re-infection. If an individual progresses within 5 years of their first infection, we count this as primary progression. After 5 years pass, if there is no re-infection, and the individual progresses to active disease, we count this as endogenous re-activation (Holm 1969). Finally, disease occurring within 5 years of a re-infection event may either be due to primary progression of the recent infection (exogenous re-infection) or to re-activation of the original early infection (endogenous re-activation). We use the relative probabilities of progression, which comprise the total progression rate pI, to decide which occurs in each such case. In other words, we label each progression in this class a result of endogenous re-activation with probability p2/pI, and exogenous re-infection with probability 1−p2/pI. Finally, if disease occurs more than 5 years after a re-infection event, we classify this as endogenous re-activation.
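The classification rules in this section can be written as a small decision function. The sketch below is illustrative: the function name and arguments are hypothetical, pI is taken to be (1−z)p1 as defined in section 2.3, and relapse after treatment, which the model counts separately, is not handled.

```python
import random

def classify_progression(years_since_last_infection, reinfected,
                         p1, p2, z, rng=random):
    """Label a progression from latency to disease by its route.

    years_since_last_infection : time since the most recent (re-)infection
    reinfected                 : True if that event was a re-infection
    """
    if years_since_last_infection > 5:
        # More than 5 years after any infection event.
        return "endogenous re-activation"
    if not reinfected:
        # Within 5 years of the first infection.
        return "primary progression"
    # Within 5 years of a re-infection: split at random by relative probability.
    p_i = (1.0 - z) * p1
    if rng.random() < p2 / p_i:
        return "endogenous re-activation"
    return "exogenous re-infection"
```

With the table 1 values (p1=0.03, p2=0.0003, z=0.4), p2/pI is about 0.017, so under this reading nearly all disease arising within 5 years of a re-infection event is attributed to the re-infection.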
A.5 Sensitivity analysis
The sensitivity of our model to the particular choices of parameters given in table 1 was assessed in the following way. For the parameters τ, μ, μTB, γ, p1, p2, z, rS and rp, we ran the model changing one parameter at a time (leaving the others at their default values) and noted how much difference this caused in the incidence of TB. The results are presented in table 1; changing each parameter to the values shown in parentheses yielded the changes in incidence indicated in the adjacent column. As with previously published TB models, our model is quite sensitive to the rate of progression p1, from latency to active disease, and to the transmission probability τ. However, within the realistic range consistent with data reviewed by Styblo (1991), our central results remain unchanged.
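The one-at-a-time procedure described here can be sketched generically. In the illustration below, `model` is a placeholder for any function that returns TB incidence from keyword parameters; it is not the network simulation itself, and the names `one_at_a_time`, `defaults` and `ranges` are hypothetical.

```python
def one_at_a_time(model, defaults, ranges):
    """Percentage change in model output when each parameter is moved,
    in turn, to the low and high ends of its range.

    model    : callable taking keyword parameters and returning incidence
    defaults : dict of default parameter values
    ranges   : dict mapping parameter name -> (low, high)
    """
    base = model(**defaults)
    result = {}
    for name, (low, high) in ranges.items():
        changes = []
        for value in (low, high):
            params = dict(defaults, **{name: value})  # vary one parameter only
            changes.append(100.0 * (model(**params) - base) / base)
        result[name] = tuple(changes)
    return result
```

For a stochastic simulation, `model` would typically average incidence over repeated runs before the percentage change is taken.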
T.C. is supported by NIH grant 5K08AI055985-04.
Footnotes
This material has not been presented previously.