Self-clearance of Mycobacterium tuberculosis infection: implications for lifetime risk and population at-risk of tuberculosis disease

Background: it is widely assumed that individuals with Mycobacterium tuberculosis (Mtb) infection remain at lifelong risk of tuberculosis (TB) disease. However, there is substantial evidence that self-clearance of Mtb infection can occur. We infer a curve of self-clearance by time since infection and explore its implications for TB epidemiology. Methods and findings: data for self-clearance were inferred using post-mortem and tuberculin-skin-test reversion studies. A cohort model allowing for self-clearance was fitted in a Bayesian framework before estimating the lifetime risk of TB disease and the population infected with Mtb in India, China and Japan in 2019. We estimated that 24.4% (17.8–32.6%, 95% uncertainty interval (UI)) of individuals self-clear within 10 years of infection, and 73.1% (64.6–81.7%) over a lifetime. The lifetime risk of TB disease was 17.0% (10.9–22.5%), compared to 12.6% (10.1–15.0%) assuming lifelong infection. The population at risk of TB disease in India, China and Japan was 35–80% (95% UI) smaller in the self-clearance scenario. Conclusions: the population with a viable Mtb infection may be markedly smaller than generally assumed, with such individuals at greater risk of TB disease. The ability to identify these individuals could dramatically improve the targeting of preventive programmes and inform TB vaccine development, bringing TB elimination within reach of feasibility.


Introduction
Tuberculosis (TB) remains the largest cause of death by an infectious agent [1], with one-quarter of the global population estimated to have been infected with Mycobacterium tuberculosis (Mtb) [2]. It is commonly assumed that all such individuals retain a lifelong viable infection, defined here as being at risk of TB disease in the absence of treatment and reinfection [3,4]. This is unlikely to be true, however [5,6].
A range of evidence suggests that a proportion of an initially infected cohort may self-clear their infection, defined here as meaning that their risk of TB disease in the absence of treatment and reinfection becomes effectively zero [6–9]. Yet, there is also historical and recent evidence for a long-term rate of TB disease that persists for many years after infection [10–14].
The implications of these two observations for TB prevention are numerous. Tackling incident TB disease arising from the Mtb infected reservoir is necessary to meet World Health Organisation End TB Strategy targets [2,15,16]. However, TB prevention policies are hampered by the size of the infected reservoir as estimated by current tests, which are sensitive to historical infection and not necessarily current, viable infections [5,17]. In combination with the relatively low individual risk of TB disease and the costs and potential side-effects of preventive therapy, the cost-benefit of mass testing and treatment for infection using current tests is often unacceptable at both the individual and population levels [18].
If, however, we consider that self-clearance of Mtb infection is possible, our estimates for the population at risk of TB disease in the absence of (re)infection could be significantly reduced, as many individuals initially infected with Mtb may no longer harbour a viable infection. Moreover, because the number of people progressing to TB disease would remain unchanged, the decreased population at-risk would lead to a greater lifetime risk of TB disease in those that retain a viable infection. This would, in turn, improve the cost-benefit threshold of preventive therapy should a test for viable infection become available.
To adequately capture the process and implications of self-clearance in a population over time, and in turn, motivate research for tests of viable infection, it is crucial to provide quantitative estimates of the extent of self-clearance by time since infection. While an estimate of the highest potential proportion of individuals that may have self-cleared their infection has been provided [6], an estimate of the proportion of individuals that self-clear by time since infection is currently missing.
To infer this metric from empirical studies, data need to include an estimate for the time of initial infection, or at least enable a reasonable estimate. Examples include tuberculin skin test (TST) reversions among cohorts of initial TST-converters [8,19], as well as an absence of viable bacilli in people who had a history of TB but died of other causes [7,20]. As this inference relies on subjective interpretation of indirect measures of self-clearance, we consider it important to make conservative assumptions to provide a strong lower bound for self-clearance over time.
In this paper, we use a modelling approach to provide this conservative estimate of the extent of self-clearance of Mtb infection and its impact on TB epidemiology. We review the potential evidence to inform a cohort model of TB natural history before estimating two key outcomes: the increase in lifetime risk of TB disease following infection for those who retain a viable infection; and the decrease in the population with a viable infection in three epidemiologically distinct settings of India, China and Japan.

Methods (a) Data for self-clearance of Mycobacterium tuberculosis infection
To find suitable evidence, we considered reviews of the spectrum of Mtb infection and TB disease published in the last decade [17,21–29] before extracting references seemingly pertinent to self-clearance of infection and references therein. We also considered specific references [6,30–36] known to the authors and references therein. We only considered studies in humans in which individuals received no chemotherapeutic treatment or Bacillus Calmette-Guérin (BCG) vaccination (see the electronic supplementary material for further details of the literature review). Following the review, two principal sources of evidence were found to be suitable for inferring self-clearance by time since infection. Firstly, TST-reversion studies, where we interpret a transition from TST-positive to TST-negative as waning of the adaptive immune response as a result of clearance of Mtb in the host. Studies had to report a time of initial TST conversion (i.e. becoming positive on a TST) or be among initially TST-positive children to provide a reasonable prior for the age, and therefore, time of infection. Individuals then had to be retested after a stated time interval. We inferred the proportion of individuals that cleared their infection from the proportion that TST-reverted upon retest. In line with our aim, this would probably represent a lower bound for self-clearance, because it is possible that some individuals may retain an immune response in the absence of a viable infection [6].
Secondly, we used autopsy studies that attempted to identify viable Mtb bacilli in the lungs of individuals that had histopathology consistent with a historic active infection but died of causes other than TB. We inferred the proportion of individuals that cleared their infection from the proportion in which no viable bacilli could be identified via culture and/or guinea pig inoculation. Historical annual risk of infection (ARI) estimates were then used to provide a reasonable prior on the average age at first infection, and hence the average time since first infection, for each age group (see the model description in the following section for further details).
Data from eligible studies were reviewed for potential further biases that could lead to an overestimate of the extent of self-clearance and, where necessary, we made the more conservative selection. To mitigate small chance reversions caused by test instability [37], we selected only transitions from indurations of 10 mm or more to 5 mm or less. Similarly, to avoid biases caused by potential TST boosting or persistent non-converters [38], we opted for studies with the fewest repeat TSTs. Finally, to provide more robust estimates, the age-group specific results were only extracted if the group consisted of at least 30 individuals.
Self-clearance can also be estimated under certain assumptions from the number of TST-positive individuals that do not develop TB disease following immunosuppression [6]. We did not include such data, however, because estimating the average time since infection is more problematic for contemporary immunosuppression studies, where transmission is highly heterogeneous both geographically and temporally [39]. Moreover, because most of these cohorts have comorbidities sufficient to require immunosuppression, individuals with a viable infection are more likely to have progressed to TB disease before being immunosuppressed. Overall, while such studies can, therefore, provide a useful upper bound for self-clearance of infection in the general population, they are at odds with our aim to provide a robust lower bound of self-clearance by time since infection.

(b) Cohort model and lifetime risk of tuberculosis disease
A simple deterministic, compartmental model of TB natural history was used to represent a cohort of simultaneously infected people (see figure 1 and table 1). All individuals are modelled to be infected at a given age before either progressing rapidly to TB disease through the 'fast progression' infected compartment or more gradually through the 'slow progression' infected compartment, as is common in models of TB natural history [3,4]. Conversely, individuals may clear their infection, again through either of the two infected compartments. From here, they are no longer at risk of TB disease in the absence of reinfection. Background mortality is included to account for death from causes other than TB. For simplicity, and to retain conservative estimates, we did not include reinfection.
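The compartmental structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the rate values are placeholders, the fast-to-slow transition is an assumed structural choice (the text does not specify how individuals reach the slow compartment), and background mortality is omitted for brevity. The names `Rates` and `run_cohort` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Per-year rates. Every value is an illustrative placeholder, not a fitted estimate."""
    fast_to_disease: float = 0.05     # progression from the fast compartment
    fast_to_slow: float = 0.5         # ASSUMED transition from fast to slow
    fast_clear: float = 0.03          # self-clearance from fast (f_c)
    slow_to_disease: float = 0.0005   # progression from the slow compartment
    slow_clear: float = 0.03          # self-clearance from slow (s_c)


def run_cohort(rates, years=60.0, dt=0.01):
    """Forward-Euler integration of a cohort that is entirely infected at t = 0.

    Returns a list of (time, fast, slow, disease, cleared) tuples, each state
    expressed as a proportion of the initial cohort. Background mortality and
    reinfection are left out to keep the sketch short.
    """
    fast, slow, disease, cleared = 1.0, 0.0, 0.0, 0.0
    history = [(0.0, fast, slow, disease, cleared)]
    for step in range(1, int(round(years / dt)) + 1):
        to_disease = fast * rates.fast_to_disease + slow * rates.slow_to_disease
        to_cleared = fast * rates.fast_clear + slow * rates.slow_clear
        fast_to_slow = fast * rates.fast_to_slow
        d_fast = -(fast * rates.fast_to_disease + fast * rates.fast_clear + fast_to_slow)
        d_slow = fast_to_slow - slow * (rates.slow_to_disease + rates.slow_clear)
        fast += d_fast * dt
        slow += d_slow * dt
        disease += to_disease * dt
        cleared += to_cleared * dt
        history.append((step * dt, fast, slow, disease, cleared))
    return history


with_clearance = run_cohort(Rates())
lifelong = run_cohort(Rates(fast_clear=0.0, slow_clear=0.0))
print("cleared after 60 years:", round(with_clearance[-1][4], 3))
print("viable after 60 years (lifelong scenario):",
      round(lifelong[-1][1] + lifelong[-1][2], 3))
```

Setting both clearance rates to zero reproduces the lifelong-infection comparison scenario within the same structure, which is why the two scenarios can share one function.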
royalsocietypublishing.org/journal/rspb Proc. R. Soc. B 288: 20201635
The model was simultaneously fitted to data on the proportion of individuals that self-cleared infection by time since birth or progressed to TB disease by time since infection. Data for self-clearance were inferred from the studies identified in the preceding subsection [7,8] (see the results section and electronic supplementary material for further details). To account for potentially different ages of first infection between the studies, the model was simultaneously fitted to TST-reversion and autopsy data with the same natural history parameters but independent ages of infection. Data for progression to TB disease were taken from the placebo arm of the British Medical Research Council's BCG vaccine trials [10,40,41]. For more than 10 years post-infection, we used a distal progression risk of 20 per 100 000 per year, which is representative of the values found in the literature [42–44]. This risk was applied to the BCG vaccine placebo group as if infection remains lifelong, which is how distal progression risks are measured and presented (see the electronic supplementary material for further details). Background mortality was modelled using a gamma distribution, described by a mean life expectancy of 60 years with a standard deviation of 10 years. These values are broadly representative of current survival curves in low- and middle-income countries [45] and are more likely to reflect the survival curves in the historical studies used. Table 1 summarizes the model parameters.
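For readers who want to reproduce the mortality assumption, a gamma distribution with a given mean and standard deviation can be parametrized by moment matching. The sketch below assumes the gamma distribution describes age at death (the text does not spell out the parametrization), and the helper name `gamma_shape_scale` is hypothetical.

```python
import random


def gamma_shape_scale(mean, sd):
    """Moment matching: mean = shape * scale and variance = shape * scale**2."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


# Mean life expectancy of 60 years with a standard deviation of 10 years.
shape, scale = gamma_shape_scale(60.0, 10.0)
print("shape:", shape, "scale:", round(scale, 4))

# Draw illustrative ages at death; random.gammavariate takes (shape, scale).
rng = random.Random(1)
ages_at_death = [rng.gammavariate(shape, scale) for _ in range(20000)]
sample_mean = sum(ages_at_death) / len(ages_at_death)
print("sample mean age at death:", round(sample_mean, 1))
```

With these inputs the shape is 36 and the scale is 10/6 years, so the sensitivity analyses at 50 and 70 years change both parameters if the standard deviation is held at 10 years.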
Model fitting was performed in a Bayesian framework using a flat prior over 0-7 years for the age at infection for the TST-reversion cohort, as detailed in the associated study [8]. The prior for the age at infection for the autopsy cohort was calculated by assuming a normal distribution for the contextual ARI of the study [7], informed by historical TST surveys among adolescents (see the electronic supplementary material for further details). A flat prior was used for all other varied model parameters.
A beta function was used to characterise the likelihood function comparing model outcomes with data on self-clearance and progression to TB disease. A delayed-rejection, adaptive-Metropolis Markov chain Monte Carlo (MCMC) algorithm was used to generate posterior estimates for the model parameters. Each fitting procedure consisted of a 500-iteration burn-in, followed by a chain of 5000 further iterations. The median and equal-tailed 95% uncertainty intervals for the posterior model parameters were then generated from these chains.
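The general shape of such a fitting procedure can be illustrated with a much simpler sampler. The sketch below uses a plain random-walk Metropolis algorithm, not the delayed-rejection adaptive variant used in the paper, and fits a toy one-rate clearance curve rather than the full cohort model. The data points are synthetic, and the beta likelihood's precision parameter and all function names are assumptions made for illustration.

```python
import math
import random

# SYNTHETIC illustration data: (years since infection, proportion self-cleared).
# These numbers are made up for the example and are not taken from any study.
SYNTHETIC_DATA = [(5.0, 0.14), (10.0, 0.26), (30.0, 0.59)]


def beta_logpdf(y, a, b):
    """Log density of a Beta(a, b) distribution evaluated at y in (0, 1)."""
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def log_posterior(rate, data, precision=50.0):
    """Flat prior on (0, 1) per year plus a beta likelihood (precision is assumed)."""
    if not 0.0 < rate < 1.0:
        return -math.inf
    total = 0.0
    for years, observed in data:
        predicted = 1.0 - math.exp(-rate * years)           # toy one-rate model
        predicted = min(max(predicted, 1e-9), 1.0 - 1e-9)   # keep beta parameters positive
        total += beta_logpdf(observed, predicted * precision,
                             (1.0 - predicted) * precision)
    return total


def metropolis(data, n_burn=500, n_keep=5000, step=0.01, seed=0):
    """Plain random-walk Metropolis sampler for the single clearance rate."""
    rng = random.Random(seed)
    current = 0.05
    current_lp = log_posterior(current, data)
    chain = []
    for i in range(n_burn + n_keep):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= n_burn:
            chain.append(current)   # keep only post-burn-in iterations
    return chain


def equal_tailed_interval(samples, level=0.95):
    """Return (lower, median, upper) using empirical quantiles of the chain."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[int((1.0 - level) / 2.0 * last)]
    upper = ordered[int((1.0 + level) / 2.0 * last)]
    return lower, ordered[last // 2], upper


chain = metropolis(SYNTHETIC_DATA)
lower, median, upper = equal_tailed_interval(chain)
print(f"clearance rate per year: {median:.3f} ({lower:.3f}-{upper:.3f})")
```

The burn-in and chain lengths mirror the 500 and 5000 iterations quoted in the text; the interval function shows how an equal-tailed 95% interval is read directly from the sorted chain.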
The model was then run 125 times, sampling from the posterior model parameters. The median and equal-tailed 95% uncertainty intervals for the proportion of the cohort that self-clear infection and the cumulative risk of TB disease among individuals that retain a viable infection, both over time since infection, were then calculated. To arrive at the latter, the risk of TB disease during each time step is calculated by dividing the number of individuals that progressed to TB disease during the time step by the number of individuals that had a viable infection at the beginning of the time step, before integrating over the desired timescale following infection.
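The per-time-step calculation described above can be made concrete with a small example. The sketch below takes "integrating" to mean summing the per-step risks, which is one reading of the text; the input numbers are hypothetical and the function name is an assumption.

```python
def cumulative_risk_among_viable(new_cases, viable_at_start):
    """Running sum of (new TB cases in a step) / (viable infections at the start of that step)."""
    total = 0.0
    running = []
    for cases, viable in zip(new_cases, viable_at_start):
        if viable > 0:
            total += cases / viable
        running.append(total)
    return running


# Hypothetical cohort of 100 people followed over three time steps.
new_cases = [5.0, 2.0, 1.0]            # people progressing to TB disease in each step
viable_at_start = [100.0, 80.0, 50.0]  # people still holding a viable infection

risk_among_viable = cumulative_risk_among_viable(new_cases, viable_at_start)
risk_whole_cohort = sum(new_cases) / viable_at_start[0]
print("risk among those retaining a viable infection:", round(risk_among_viable[-1], 3))
print("risk relative to the whole initial cohort:", round(risk_whole_cohort, 3))
```

Because the denominator shrinks as people self-clear while the number of cases is unchanged, the risk among those retaining a viable infection exceeds the risk computed against the whole initial cohort.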
For purposes of comparison, the above analysis was repeated assuming lifelong infection (i.e. the self-clearance rates f_c and s_c were set to 0 per year) and the model was fitted to progression to TB disease data only. To make appropriate comparisons, we use as an input the posterior results for the age at infection from the self-clearance scenario.
Finally, independent sensitivity analyses were performed to assess the impact of the assumed risk of distal progression to TB disease (using 5 and 35 per 100 000 per year) and the mean life expectancy of the cohort (using 50 and 70 years).

(c) Country-level model and population at risk of tuberculosis disease
To estimate the age-specific population with a viable Mtb infection in a given country for a particular year, we applied the cohort model to consecutive 5-year birth cohorts, analogous to the approach for estimating the global burden of latent TB infection in [2] (see the electronic supplementary material for further details). A time-dependent force of infection was applied to each cohort using ARI estimates, which were derived by re-performing the Gaussian process regression in [2]. We used only the median estimated ARI in each country, because the focus of this work is to illustrate the relative difference between the population with a viable infection in the self-clearance and lifelong infection scenarios, not estimates of absolute numbers. To parametrize the TB natural history components of the model, the posterior parameter values derived from the single cohort model were used. Uncertainty in the results was then solely owing to that of the natural history parameters. We applied the model to three epidemiologically distinct settings: India, China and Japan. Japan, for example, has an older population and less recent transmission compared to China, and in turn compared to India. All three countries also have a sufficiently low prevalence of HIV infection as to not require modification of the TB natural history components of the model presented in figure 1.
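The birth-cohort approach can be illustrated with a deliberately reduced sketch. It tracks first infections only, uses a single constant clearance rate in place of the fitted natural history model, and ignores mortality and progression to disease. The ARI curve is a made-up declining function, not an estimate for any country, and all names are hypothetical.

```python
import math


def ari(year):
    """Hypothetical declining annual risk of infection (placeholder, not an estimate)."""
    return 0.02 * math.exp(-0.03 * (year - 1930))


def viable_fraction(birth_year, target_year=2019, clearance_rate=0.0):
    """Fraction of a birth cohort holding a viable infection at the start of target_year."""
    susceptible, viable = 1.0, 0.0
    for year in range(birth_year, target_year):
        viable *= math.exp(-clearance_rate)       # self-clearance among earlier infections
        newly_infected = susceptible * ari(year)  # first infections in this calendar year
        susceptible -= newly_infected
        viable += newly_infected
    return viable


def viable_ratio(birth_year, clearance_rate=0.03):
    """Self-clearance scenario as a fraction of the lifelong-infection scenario."""
    return (viable_fraction(birth_year, clearance_rate=clearance_rate)
            / viable_fraction(birth_year))


# Consecutive 5-year birth cohorts, oldest first.
for birth_year in range(1930, 2015, 5):
    print(birth_year, round(viable_ratio(birth_year), 2))
```

Even in this reduced form, older cohorts show a lower ratio than younger ones because their infections are, on average, older and have had longer to clear.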
For each setting, the model was run 125 times, sampling from the posterior model parameters. The median and equal-tailed 95% uncertainty intervals for the population with a viable Mtb infection in 2019 were calculated, by age as well as overall. For purposes of comparison, the above analysis was repeated assuming lifelong infection.
All analyses were conducted using R v. 3.5.0 [46]. Modelling and Bayesian fitting were performed using the deSolve [47] and FME packages [48], respectively. Plots were constructed using the ggplot2 [49] package. Replication data and analysis scripts are available on GitHub.

Results
The eligible studies comprised two TST-reversion studies [8,19] and two autopsy studies [7,20]. Of the TST-reversion studies, Adams et al. [19] included a significant proportion of individuals that could have undergone at least five TSTs in under 2 years. Such a high rate of testing could result in bias owing to effects such as boosting [38]. To provide the more conservative estimate, we use only the results of Ferebee [8], in which initially TST-positive children were retested after a 10-year interval. Figure 2a shows these data presented over time since infection. Uncertainty in the horizontal axis reflects the 95% uncertainty interval of the estimate for the age of infection of the TST-reversion cohort, while the uncertainty in the vertical axis represents the equal-tailed 95% confidence interval of the data.
Of the two autopsy studies, Feldman & Baggenstoss [20] considered only the lesions found in the lung and their surrounding tissues, while Opie & Aronson [7] considered both the lesions and the bulk of the lung. This is the likely reason why Opie & Aronson identified viable Mtb bacilli in a markedly greater proportion of individuals, and to provide a more conservative estimate of self-clearance, we extracted data from this study only. Figure 2a shows these data presented over time since infection. The data have been aggregated into three age groups, where only groups with greater than 30 individuals have been included. The uncertainty in the horizontal axis reflects the 95% uncertainty interval of the estimate for the age of infection of the autopsy cohort, while the uncertainty in the vertical axis represents the equal-tailed 95% confidence interval of the data.
Owing to a paucity of data, we could not further disaggregate by sex, location (e.g. high versus low burden setting), or age at infection.

(b) Cohort model and lifetime risk of tuberculosis disease
Table 1 shows the median and equal-tailed 95% uncertainty intervals for the posterior model parameters for both the case of self-clearance and lifelong infection. Prior versus posterior model parameters are presented in the electronic supplementary material, table S4. Figure 2 shows the results of the cohort model allowing for self-clearance of infection, fitted to the self-clearance and progression to disease data, including the median and equal-tailed 95% uncertainty intervals (see the electronic supplementary material, figure S4 for the model fitting results assuming lifelong infection). As such, we estimate 24.4% (17.8-32.6%, 95% uncertainty interval (UI)) of individuals self-clear within 10 years of infection and 73.1% (64.6-81.7%, 95% UI) over a lifetime.
This translates into a lifetime risk of TB disease following infection, in those that retain a viable infection, of 17.0% (10.9-22.5%, 95% UI), compared to 12.6% (10.1-15.0%, 95% UI) in the standard scenario of lifelong infection.

(c) Country-level models and population at risk of tuberculosis disease
Figure 3 shows the population with a viable Mtb infection by age in each setting, including the median and equal-tailed 95% uncertainty intervals. The impact of self-clearance is most pronounced in older age groups. Figure 3 also shows the total population with a viable Mtb infection in each setting allowing for self-clearance of infection, expressed as a percentage of the population with a viable infection assuming infection is lifelong. The impact is smallest in India (56%, 47-65% UI), followed by China (37%, 28-47% UI), with the greatest impact in Japan (27%, 20-37% UI).
See the electronic supplementary material, figures S5-8 and tables S5-12 for the results of sensitivity analyses performed using different mean life expectancies and risks of distal progression to TB disease. Neither sensitivity analysis makes a qualitative difference to our results, though the results are more sensitive to the assumed life expectancy than to the risk of distal progression to TB disease.
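The country-level percentages in this section are expressed relative to the lifelong-infection scenario, whereas the abstract quotes how much smaller the at-risk population is. The short calculation below converts one into the other using only the values reported in the text; the function name is hypothetical.

```python
# (median, lower, upper) population with a viable infection in the self-clearance
# scenario, as a percentage of the lifelong-infection scenario (values from the text).
relative_population = {
    "India": (56, 47, 65),
    "China": (37, 28, 47),
    "Japan": (27, 20, 37),
}


def percent_smaller(median, lower, upper):
    """Convert 'percentage remaining' into 'percentage smaller'; the bounds swap."""
    return 100 - median, 100 - upper, 100 - lower


reductions = {country: percent_smaller(*values)
              for country, values in relative_population.items()}
for country, (median, lower, upper) in reductions.items():
    print(f"{country}: {median}% smaller ({lower}-{upper}%)")

# Span of the uncertainty intervals across the three settings.
overall_range = (min(r[1] for r in reductions.values()),
                 max(r[2] for r in reductions.values()))
print("overall range:", overall_range)
```

The resulting span of 35 to 80% matches the range quoted in the abstract for the reduction in the population at risk across India, China and Japan.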

Discussion
We have shown that self-clearance of Mtb infection is likely to have a significant impact on TB epidemiology. Our results provide a robust lower-bound estimate, with at least 24.4% (17.8-32.6%, 95% UI) of individuals self-clearing within 10 years of infection [...] [6]. Our work, intended to provide a robust lower bound before exploring the resultant epidemiological implications, finds a lower, albeit still substantial, result for the extent of such self-clearance.
One immediate implication of our work is that a significant proportion of the 1.7 billion people currently estimated to be at risk of TB disease [2] are likely to have cleared their infection, such that the global number should be revised, or at least re-interpreted. Within that reduced number however, the increased risk of developing TB disease would improve the risk/benefit of preventive therapy programmes, particularly if used in conjunction with current methods for identifying high-risk groups [50]. More widely, self-clearance of infection, and whether those that have self-cleared have any protection from future reinfection, will have implications for mathematical modelling of TB natural history and interventions as a whole. Results of pre-versus post-infection novel vaccine candidate models, for example, will need to carefully consider the characteristics of those that have self-cleared their infection [51] as will analyses estimating the impact and cost-effectiveness of preventive therapy [52]. Importantly, however, any potential benefits of self-clearance will depend on developing and validating a test for viable Mtb infection that can outperform the positive predictive value of TSTs.
Our analysis has a number of potential limitations, primarily owing to the absence of a validated test of viable infection, necessitating instead the indirect inference of self-clearance. With respect to TST-reversion studies, while it seems reasonable that a large reversion implies waning of the sensitization response as a result of clearance of infection, this is not always the case, as in Noguera-Julian et al. [53], where immunosuppression suppresses TST reactivity. Moreover, test instability can result in small chance reversions [37], which we partially mitigated by requiring a reversion from an induration greater than 10 mm to one less than 5 mm. The possibility of false negative results remains, however. The lack of reversion in some participants after preventive therapy in a study by Houk et al. [54] should also be considered, although groups in the same study that started preventive therapy shortly after TST conversion did exhibit significant reversion thereafter. In either case, isoniazid is known to have limited ability to sterilize (i.e. fully clear) Mtb infection, at least in individuals living with human immunodeficiency virus [55], such that TST reversion may not necessarily be expected following isoniazid preventive therapy alone. Finally, it is possible that some individuals were re-infected after initially self-clearing and reverting their TST during the 10-year follow-up in [8], thus underestimating the extent of reversion and self-clearance.
Our use of autopsy studies to infer self-clearance also has certain limitations. Even though we chose the study with the most extensive exploration of the lung, it may be possible that viable Mtb bacilli could be found elsewhere. While Mtb DNA has been identified outside of the lung [56,57], identification of DNA does not equate to viable bacilli, and it remains unclear as to whether such reservoirs could seed future disease in the lung. Estimating the age of infection of the cohorts relied on ARI estimates for the late nineteenth and early twentieth centuries in the USA. While uncertain, it is accepted that the high ARI during this period was sufficient to assume all cohorts were first infected during adolescence. We also explicitly accounted for the uncertainty in the time since infection in our analysis.
Reinfection will have probably occurred in both data sources given the high background ARI in both contexts, which we did not consider in our modelling. Including reinfection would lead us to increase our estimate for the extent of self-clearance, either owing to secondary conversions among TST-reverters (TST reversion studies), or by estimating the time since last infection as opposed to the time since first infection (autopsy studies). As a consequence, the case is strengthened for our results providing a robust lower-bound estimate of self-clearance.
With respect to our cohort model, we explored parameter uncertainty that could materially alter our results, namely the assumed value for the rate of distal progression to TB disease and the life expectancy of the cohort. Independent sensitivity analyses found that neither of these assumptions qualitatively altered our conclusions (see the electronic supplementary material).
Several open questions remain. While we have used TST-reversions to infer self-clearance of infection, in principle, they should also be factored into the ARI estimated from TST survey data among children, as has been considered in [37,58]. For our purposes, while this short-term self-clearance among children would increase the estimated ARI and in turn the number of people at risk of TB disease, the long-term self-clearance of infection in adults would probably outweigh this effect and still lead to a net reduction with respect to the assumption of lifelong infection. The wider issue of incorporating TST-reversions into the standard methodology for estimating ARIs has yet to be addressed, however [59]. Finally, it is unclear how self-clearance affects the protection afforded by previous Mtb infection [60], which will be a key input for transmission models intending to include self-clearance.
Our results highlight the urgent need for development and validation of a test for viable Mtb infection that can outperform the positive predictive value of TSTs. Such a tool would enable study of the immunology and epidemiology of self-clearance and in turn improve TB vaccine design and the targeting of preventive programmes. While research into correlates of risk to discern those most at risk of TB disease is ongoing [61,62], efforts to identify those least at risk would be similarly prudent [63]. Thereafter, in order to understand the impact of self-clearance on transmission and inform vaccine strategy development, it is important to better understand whether, and by how much, self-clearance of infection provides protection from future reinfection.

Conclusion
Owing to self-clearance of Mtb infection, the population with a viable infection may be markedly smaller than generally assumed, with fewer individuals retaining a viable infection and yet each at greater risk of TB disease. Coupling these wide-ranging implications for TB epidemiology with the ability to identify individuals that have self-cleared could dramatically improve the targeting of preventive programmes and inform TB vaccine design, bringing TB elimination within reach of feasibility.