Exploring surveillance data biases when estimating the reproduction number: with insights into subpopulation transmission of COVID-19 in England

The time-varying reproduction number (Rt: the average number of secondary infections caused by each infected person) may be used to assess changes in transmission potential during an epidemic. While new infections are not usually observed directly, they can be estimated from data. However, data may be delayed and potentially biased. We investigated the sensitivity of Rt estimates to different data sources representing COVID-19 in England, and we explored how this sensitivity could track epidemic dynamics in population sub-groups. We sourced public data on test-positive cases, hospital admissions and deaths with confirmed COVID-19 in seven regions of England over March through August 2020. We estimated Rt using a model that mapped unobserved infections to each data source. We then compared differences in Rt with the demographic and social context of surveillance data over time. Our estimates of transmission potential varied for each data source, with the relative inconsistency of estimates varying across regions and over time. Rt estimates based on hospital admissions and deaths were more spatio-temporally synchronous than estimates from all test positives. We found these differences may be linked to biased representations of subpopulations in each data source. These included spatially clustered testing, and outbreaks in hospitals, care homes and young age groups that reflected the link between age and severity of the disease. We highlight that policy makers could better target interventions by considering the source populations of Rt estimates. Further work should clarify the best way to combine and interpret Rt estimates from different data sources based on the desired use. This article is part of the theme issue ‘Modelling that shaped the early COVID-19 pandemic response in the UK’.


Background
Within six months of its emergence in late 2019, the novel coronavirus SARS-CoV-2 had caused over six million cases of disease (COVID-19) worldwide [1]. Its rapid initial spread and high death rate prompted global policy interventions to prevent continued transmission, with widespread temporary bans on social interaction outside the household [2]. Introducing and adjusting such policy measures depend on a judgement in balancing continued transmission potential with the multidimensional consequences of interventions. It is, therefore, critical to inform the implementation of policy measures with a clear and timely understanding of ongoing epidemic dynamics [3,4].
In principle, transmission could be tracked by directly recording all new infections. In practice, real-time monitoring of the COVID-19 epidemic relies on surveillance of indicators that are subject to different levels of bias and delay. In England, widely available surveillance data across the population include: (i) the number of positive tests, biased by changing test availability and practice, and delayed by the time from infection to symptom onset (if testing is symptom-based), from symptom onset to a decision to be tested and from test to test result; (ii) the number of new hospital admissions, biased by differential severity that triggers care seeking and hospitalization, and additionally delayed by the time to develop severe disease; and (iii) the number of new deaths due to COVID-19, biased by the differential risk of death and the exact definition of a COVID-19 death, and further delayed by the time to death.
Each of these indicators provides a different view on the epidemic and therefore contains potentially useful information. However, any interpretation of their behaviour needs to reflect these biases and lags and is best done in combination with the other indicators. One approach that allows this in a principled manner is to use the different datasets to separately track the time-varying reproduction number, Rt, the average number of secondary infections generated by each new infected person [5]. Because Rt quantifies changes in infection levels, it is independent of the level of overall ascertainment as long as this does not change over time or is explicitly accounted for [6]. At the same time, the underlying observations in each data source may result from different lags from infection to observation. However, if these delays are correctly specified then transmission behaviour over time can be consistently compared via estimates of Rt.
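The point about ascertainment can be illustrated with the renewal equation. The following is a sketch only, assuming that a constant fraction α of infections is observed without delay; the symbols α, w (a generation time distribution) and C (observed counts) are illustrative and are not the notation of the fitted model.

```latex
I_t = R_t \sum_{s \ge 1} w_s \, I_{t-s},
\qquad
C_t = \alpha \, I_t
\quad\Longrightarrow\quad
C_t = R_t \sum_{s \ge 1} w_s \, C_{t-s}.

% With time-varying ascertainment \alpha_t the factor no longer cancels:
C_t = R_t \sum_{s \ge 1} w_s \, \frac{\alpha_t}{\alpha_{t-s}} \, C_{t-s}.
```

Under these assumptions the constant α cancels, so the same Rt is recovered from the observed series. When ascertainment changes over time, the ratio of current to past ascertainment remains in the sum and would be absorbed into the estimate unless it is modelled explicitly.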
Different methods exist to estimate the time-varying reproduction number, and in the UK a number of mathematical and statistical methods have been used to produce estimates used to inform policy [7-9]. Empirical estimates of Rt can be obtained by estimating time-varying patterns in transmission events from a directly observed time-series indicator of infection, such as reported symptomatic cases. This can be based on the probabilistic assignment of transmission pairs [10], the exponential growth rate [11] or the renewal equation [12,13]. Alternatively, Rt can be estimated via mechanistic models that explicitly compartmentalize the disease transmission cycle into stages from susceptible through exposed, infectious and recovered [14,15]. This can include accounting for varying population structures and context-specific biases in observation processes, before fitting to a source of observed cases. Across all methods, key parameters include the time after an infection to the onset of symptoms in the infecting and infected, and the source of data used as a reference point for earlier transmission events [16,17].
In this study, we used a modelling framework based on the renewal equation, adjusting for delays in observation to estimate regional and national reproduction numbers of SARS-CoV-2 across England. The same method was repeated for each of three sources of data that are available in real time. After assessing differences in Rt estimates by data source, we explored why this variation may exist. We compared the divergence between Rt estimates with spatio-temporal variation in case detection, and the proportion at risk of severe disease, represented by the age distribution of test-positive cases and hospital admissions and the proportion of deaths in care homes.

Methods
(a) Data management
Three sources of data provided the basis for our R t estimates. Time-series case data were available by specimen date of test. This was a de-duplicated dataset of COVID-19 positive tests notified from all National Health Service (NHS) settings (Pillar One of the UK Government's testing strategy) [18] and by commercial partners in community settings outside of healthcare (Pillar Two). Hospital admissions were also available by date of admission if a patient had tested positive prior to admission, or by the day preceding diagnosis if they were tested after admission. Death data were available by date of death and included only those that occurred within 28 days of a positive COVID-19 test in any setting. All data were publicly available and taken from the UK government source [19,20], and were aggregated to the seven English regions used by the NHS.
To provide context for Rt estimates, we sourced weekly data on regional and national test positivity (the percentage of positive tests among all tests conducted) from Public Health England [21], available as weekly average percentages from 10 May. From the same source, we also identified the age distributions of cases admitted to the hospital and all test-positive cases. Hospital admissions by age were available as age bands with rates per 100 000, so we used regional population data from 2019 [22] to approximate the raw count. We separately sourced daily data on the number of deaths in care homes by region from March 2020, available from 12 April [23]. Care homes are defined as supported living facilities (residential homes, nursing homes, rehabilitation units and assisted living units). Data were available by date of notification, which included an average 2-3 days' lag after the date of death. We also drew on a database that tracked COVID-19 UK policy updates by date and area [24].
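The conversion from a rate per 100 000 to an approximate raw count can be sketched as below. This is a minimal illustration, assuming the population denominator corresponds to the same age band and region as the rate; the function name and the numbers are hypothetical and are not taken from the study data.

```python
def rate_to_count(rate_per_100k: float, population: float) -> float:
    """Approximate a raw count from a rate per 100 000 and a population size."""
    return rate_per_100k * population / 100_000


# Hypothetical values for illustration only (not from the study data).
weekly_rate = 12.5          # admissions per 100 000 in one age band
band_population = 800_000   # assumed population of that age band and region
approx_admissions = rate_to_count(weekly_rate, band_population)
print(approx_admissions)
```

Because published rates are usually rounded, counts recovered this way are approximate, particularly for small age bands.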

(b) Rt estimation
We estimated R t using EpiNow2 v. 1.2.0, an open-source package in R [13,25,26]. This package implements a Bayesian latent variable approach using the probabilistic programming language Stan [27]. To initialize the model, infections were imputed prior to the first observed case using a log-linear model with priors based on the first week of observed cases. This means that the initial observations both inform the initial parameters and are then also fit, which makes the initial R t estimates less reliable than later estimates. This was a pragmatic choice to allow the model to be identifiable when only estimating part of the observed epidemic. We explored other parameterizations, but these suffered from poor model identification. For each subsequent time step with observed cases, new infections were imputed using the sum of previous modelled infections weighted by the generation time probability mass function, and combined with an estimate of R t , to give the prevalence at time t [12]. The generation time was assumed to follow a gamma distribution that was fixed over time but varied between samples, with priors drawn from the literature for the mean and standard deviation [28].
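The renewal step described above (new infections as the generation-time-weighted sum of past infections, scaled by Rt) can be sketched as follows. This is a deterministic illustration under stated assumptions, not the EpiNow2 implementation: the function names, the three-day generation time distribution and the seed values are all hypothetical.

```python
from typing import List, Sequence


def renewal_step(past_infections: Sequence[float],
                 gen_time_pmf: Sequence[float],
                 r_t: float) -> float:
    """One renewal-equation step.

    past_infections is ordered oldest to newest; gen_time_pmf[0] is the
    probability that the generation time is one day, gen_time_pmf[1] two days, etc.
    """
    total = 0.0
    for lag, weight in enumerate(gen_time_pmf, start=1):
        if lag <= len(past_infections):
            total += weight * past_infections[-lag]
    return r_t * total


def simulate(seed: Sequence[float],
             gen_time_pmf: Sequence[float],
             r_series: Sequence[float]) -> List[float]:
    """Extend a seeded infection series by one step per value in r_series."""
    infections = list(seed)
    for r_t in r_series:
        infections.append(renewal_step(infections, gen_time_pmf, r_t))
    return infections


# Hypothetical generation time distribution and seed infections.
pmf = [0.2, 0.5, 0.3]
trajectory = simulate([10.0, 10.0, 10.0], pmf, [1.0, 1.0, 2.0])
print(trajectory)
```

With Rt equal to 1 the series stays level, and doubling Rt doubles the next day's infections. In the fitted model the direction is reversed: Rt is inferred so that the implied infections are consistent with the observed data.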
royalsocietypublishing.org/journal/rstb Phil. Trans. R. Soc. B 376: 20200283
These infection trajectories were mapped to reported case counts (Dt) by convolving over an incubation period distribution and report delay distribution (ξ). We assumed a negative binomial observation model for observed reported case counts (Ct), with mean Dt and overdispersion ϕ, where ϕ had an exponential prior with mean 1. We combined this with a multiplicative day of the week effect (ω(t mod 7)) with an independent effect for each day of the week. We controlled temporal variation using an approximate Gaussian process [29] with a squared exponential kernel (GP).
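The mapping from infections to expected reports can be sketched as a discrete convolution followed by a day-of-week multiplier. This is an illustration under simplifying assumptions: a single combined delay distribution stands in for the incubation period and report delay, the series is truncated at its start, and all names and values are hypothetical.

```python
from typing import List, Sequence


def expected_reports(infections: Sequence[float],
                     delay_pmf: Sequence[float],
                     day_effect: Sequence[float],
                     first_weekday: int = 0) -> List[float]:
    """Convolve infections with a delay distribution, then apply weekday effects.

    delay_pmf[d] is the probability that an infection is reported d days later.
    day_effect holds seven multipliers, indexed by (first_weekday + t) mod 7.
    """
    reports = []
    for t in range(len(infections)):
        max_delay = min(t, len(delay_pmf) - 1)
        d_t = sum(delay_pmf[d] * infections[t - d] for d in range(max_delay + 1))
        reports.append(d_t * day_effect[(first_weekday + t) % 7])
    return reports


# Hypothetical inputs: constant infections, reports split over delays of 1 and
# 2 days, and reporting halved on the last two days of each week.
infections = [100.0] * 10
delay_pmf = [0.0, 0.5, 0.5]
day_effect = [1.0, 1.0, 1.0, 1.0, 1.0, 0.5, 0.5]
reports = expected_reports(infections, delay_pmf, day_effect)
print(reports)
```

The first values are understated because infections before the start of the series are not available to the convolution, which is one reason the model imputes infections prior to the first observed case.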
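In mathematical notation:

$$\log R_t = \log R_{t-1} + \mathrm{GP}_t$$
$$I_t = R_t \sum_{\tau} w_{\tau} \, I_{t-\tau}$$
$$D_t = \sum_{\tau} \xi_{\tau} \, I_{t-\tau}$$
$$C_t \sim \mathrm{NB}\left(D_t \, \omega_{(t \bmod 7)}, \; \phi\right)$$

where $I_t$ is the number of infections at time $t$ and $w$ is the generation time probability mass function.

The length scale and magnitude of the kernel were estimated during model fitting. We used an inverse gamma prior for the length scale, optimizing shape and scale values to give a distribution with 98% of the density between 2 and 21 days, and the prior on the magnitude was standard normal. Each region was fitted independently using Markov-chain Monte Carlo (MCMC). Eight chains were used with a warmup of 1000 samples and 2000 samples post warmup. Convergence was assessed using the R hat diagnostic.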
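The role of the kernel length scale can be illustrated with the exact squared exponential kernel. This sketch is not the approximate Gaussian process used in fitting; the function name is hypothetical, and the two length scales are simply the bounds of the prior range stated above.

```python
import math


def sq_exp_kernel(t1: float, t2: float, magnitude: float, length_scale: float) -> float:
    """Squared exponential covariance between two time points (in days)."""
    return magnitude ** 2 * math.exp(-((t1 - t2) ** 2) / (2 * length_scale ** 2))


# Correlation between values one week apart at the two ends of the prior range.
short_scale = sq_exp_kernel(0, 7, magnitude=1.0, length_scale=2.0)
long_scale = sq_exp_kernel(0, 7, magnitude=1.0, length_scale=21.0)
print(round(short_scale, 4), round(long_scale, 4))
```

A length scale of 2 days leaves values a week apart almost uncorrelated, whereas 21 days keeps them strongly correlated, so the prior range spans rapid to slow changes in the modelled trend.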
We used a gamma-distributed generation time with mean 3.6 days (standard deviation (s.d.) 0.7), and s.d. of 3.1 days (s.d. 0.8), sourced from [28]. Instead of the incubation period used in the original study (which was based on fewer data points), we refitted using a lognormal incubation period with a mean of 5.2 days (s.d. 1.1) and s.d. of 1.52 days (s.d. 1.1) [30]. This incubation period was also used to convolve from unobserved infections to unobserved symptom onsets (or a corresponding viral load in asymptomatic cases) in the model. When fitting the model, the time interval distributions had independent priors placed on the mean and standard deviation of their respective lognormal distributions.
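For readers reproducing these distributions, the central values can be converted to standard parameters by moment matching. The sketch below assumes the quoted means and standard deviations are on the natural (day) scale, which is an interpretation rather than something stated in the text; it uses only the central values and ignores the uncertainty placed on them. Function names are hypothetical.

```python
import math


def gamma_shape_rate(mean: float, sd: float) -> tuple:
    """Shape and rate of a gamma distribution with the given mean and sd."""
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    return shape, rate


def lognormal_mu_sigma(mean: float, sd: float) -> tuple:
    """Log-scale mean and sd of a lognormal with the given natural-scale mean and sd."""
    sigma_sq = math.log(1 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2
    return mu, math.sqrt(sigma_sq)


# Central values quoted in the text, assumed here to be natural-scale moments.
gt_shape, gt_rate = gamma_shape_rate(3.6, 3.1)       # generation time
inc_mu, inc_sigma = lognormal_mu_sigma(5.2, 1.52)    # incubation period
print(gt_shape, gt_rate, inc_mu, inc_sigma)
```

If the source instead reports log-scale parameters for the lognormal distribution, the conversion for the incubation period should be skipped.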
We estimated both the delay from symptom onset to positive test (either in the community or in hospital) and the delay from symptom onset to death as lognormal distributions using a subsampled Bayesian bootstrapping approach (with 100 subsamples each using 250 samples) from given data on these delays. Our delay from the date of onset to date of positive test (either in the community or in hospital) was taken from a publicly available linelist of international cases [31]. We removed countries with outlying delays (Mexico and the Philippines). The resulting delay data had a mean of 4.4 days and s.d. 5.6. Delays for hospital admissions and test positives were treated as having the same delay from infection to onset and observation. For the delay from onset to death we used data taken from a large observational UK study [32]. We re-extracted the delay from confidential raw data, with a mean delay of 14.3 days (s.d. 9.5). There were insufficient data available on the various reporting delays to estimate spatially or temporally varying delays, so they were considered to be static over the course of the epidemic, although we discuss the effects of this assumption. We have also discussed this approach more extensively in [25].
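The subsampled bootstrap can be sketched as below. Two simplifications are assumed: each subsample is fitted by maximum likelihood on the log scale rather than by the Bayesian fit used in the analysis, and the delay data are synthetic draws rather than the linelist. All names are hypothetical.

```python
import math
import random
import statistics


def fit_lognormal(delays):
    """Maximum likelihood lognormal fit: mean and sd of the log delays."""
    logs = [math.log(d) for d in delays]
    return statistics.fmean(logs), statistics.pstdev(logs)


def subsampled_bootstrap(delays, n_boot=100, n_sub=250, seed=1):
    """Fit a lognormal to n_boot subsamples of size n_sub drawn with replacement."""
    rng = random.Random(seed)
    fits = [fit_lognormal(rng.choices(delays, k=n_sub)) for _ in range(n_boot)]
    mus = [mu for mu, _ in fits]
    sigmas = [sigma for _, sigma in fits]
    return {
        "mu_mean": statistics.fmean(mus), "mu_sd": statistics.stdev(mus),
        "sigma_mean": statistics.fmean(sigmas), "sigma_sd": statistics.stdev(sigmas),
    }


# Synthetic, strictly positive delays for illustration (not the study data).
data_rng = random.Random(0)
delays = [data_rng.lognormvariate(1.0, 0.8) for _ in range(2000)]
summary = subsampled_bootstrap(delays)
print(summary)
```

Fitting subsamples of 250 rather than the full dataset widens the spread of the fitted parameters, which is the mechanism by which this approach inflates prior uncertainty. Real delay data can contain zero-day delays, which would need handling before taking logarithms.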

(c) Comparison of Rt estimates
We compared R t estimates by data source, plotting each by region over time. To avoid the first epidemic wave obscuring visual differences, all plots were limited to the earliest date that any R t estimate for England crossed below 1 after the peak. We also identified the time at which each R t estimate fell below 1, the local minima and maxima of median R t estimates and the number of times in the time-series that each R t estimate crossed its own median, before comparing these across regions and against the total count of the raw data.
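Two of the summaries above, the first time an estimate falls below 1 and the number of times it crosses its own median, can be computed as in the sketch below. The function names and the example series are hypothetical, and treating values exactly equal to the median as neither above nor below is an assumption.

```python
import statistics


def first_index_below(series, threshold=1.0):
    """Index of the first value below the threshold, or None if there is none."""
    for i, value in enumerate(series):
        if value < threshold:
            return i
    return None


def count_median_crossings(series):
    """Number of times the series moves from one side of its median to the other."""
    median = statistics.median(series)
    sides = [1 if value > median else -1 for value in series if value != median]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


# Hypothetical median Rt estimates for illustration.
rt = [1.4, 1.2, 0.9, 0.8, 1.1, 0.7, 1.05]
print(first_index_below(rt), count_median_crossings(rt))
```

For this series the estimate first falls below 1 at index 2 and crosses its median of 1.05 three times.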
We investigated correlations between R t estimates and the demographic and social context of transmission. We used linear regression to assess whether the level of raw data count influenced oscillations in R t . We assessed the influence of local outbreaks using test positivity. We used a 5% threshold for positivity as the level at which testing is either insufficient to keep pace with widespread community transmission [33], or where outbreaks have already been detected and tests targeted to those more likely to be positive. We plotted this against raw data and R t , and also used linear regression to test the association. We interpreted results in light of known outbreaks and policy changes. We plotted and qualitatively assessed variation in R t estimates against the age distribution of cases over time, and similarly explored patterns in R t estimates against the qualitative proportion of cases to all deaths. The latter was not assessed quantitatively due to differences in reference dates [23]. With the exception of fitting the delay from onset to death (held confidentially), code and data to reproduce this analysis are available [34].
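The positivity threshold and a simple linear regression can be sketched as follows. The weekly counts and the paired outcome are hypothetical, and pairing positivity with a count of oscillations is an illustrative assumption about how such an association could be tested, not a description of the exact regression fitted.

```python
def positivity_percent(positives: int, tests: int) -> float:
    """Percentage of tests that were positive."""
    return 100.0 * positives / tests


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical weekly (positives, tests) pairs and a hypothetical paired outcome.
weekly = [(80, 1000), (90, 2000), (100, 4000)]
positivity = [positivity_percent(p, n) for p, n in weekly]
above_threshold = [p > 5.0 for p in positivity]   # 5% threshold from the text
oscillations = [10, 6, 5]
slope, intercept = ols_slope_intercept(positivity, oscillations)
print(positivity, above_threshold, slope, intercept)
```

Note that in this example positive tests rise while positivity falls, because testing volume grows faster, which is why positivity rather than case counts is used as the indicator of targeted or insufficient testing.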

Results
Across England, the COVID-19 epidemic peaked at 4798 reported test-positive cases (on 22 April 2020), 3099 admissions (1 April 2020) and 975 deaths (8 April 2020) per day (figure 1a). Following the peak, a declining trend continued for daily counts of admissions and deaths, while daily case counts from all reported test-positive cases increased from July and had more than tripled by August (from 571 on 30 June to 1929 on 1 September). Regions followed similar patterns over time to national trends. However, in the North East and Yorkshire, Midlands and North West, the incidence of test-positive cases did not decline to near the count of admissions as in other regions, and also saw a small temporary increase during the overall rise in case counts in early August.
Following the initial epidemic peak in mid-March 2020, the date at which Rt estimates crossed below 1 varied by both data source and geography (figures 1b and 2). The first region to cross into a declining epidemic was London, on 26 March according to an Rt estimated from deaths (where the lower 90% credible interval (CrI) crossed below 1 on 24 March and the upper CrI on 28 March). However, the data source used to estimate Rt was as important as any regional variation in estimating the earliest date of epidemic decline. Rt estimated from hospital admissions gave the earliest estimate of a declining epidemic, while using all test-positive cases to estimate Rt took the longest time to reach a declining epidemic, in all but one region (East of England). This difference by data source varied by up to 21 days between Rt estimates (figure 1b). In some regions, the difference between Rt estimates was consistent over time, such as between Rt from admissions and deaths in the South East.
In other regions such as the Midlands, this was not the case, with the divergence between the Rt estimates from test-positive cases, admissions, and deaths each varying over time. Rt estimates from test-positive cases were the most likely to differ from estimates derived from other data sources across all regions. Across all regions, Rt estimates from deaths had slower damped oscillations compared to estimates from test-positive cases or hospital admissions. However, oscillations in Rt estimates did not appear to be linked to the level of raw data counts in each source (electronic supplementary material, figure S2).

More rapid oscillations in Rt estimates from test-positive cases appeared to be linked to targeted testing of case clusters, seen in high test positivity (electronic supplementary material, table SI2). Both the North East and Yorkshire and the Midlands saw more frequent oscillations in Rt estimates from test-positive cases than other regions. The Rt estimate from cases crossed its own median 10 times over the time-series in both regions, while in all other NHS regions this averaged 6 times, and oscillations in Rt estimates from cases also had a shorter duration in the North East and Yorkshire and the Midlands compared to other regions (electronic supplementary material, table SI1). Across all regions, 84% of weeks with over 5% positivity (N = 19) were in the North East and Yorkshire and the Midlands. In these regions, positivity peaked on the week of 9 May 2020 at 14% and 12%, respectively, and overall averaged 6% (95% CI 4.4-7.6%) and 5.9% (95% CI 4.6-7.2%, weeks of 10 May to 22 August), respectively. High test positivity is likely to have resulted from targeted testing among known local outbreaks in these regions. In the Midlands, these included local restrictions and increased testing across Leicester and in a Luton factory (restrictions between 4 and 25 July [35]). In Yorkshire, case clusters were detected with local restrictions in Bradford, Calderdale and Kirklees (with restrictions from 5 August [36]).
In England, a divergence between R t estimated from cases versus R t estimated from deaths and admissions coincided with a decline in the age distribution among all test-positive cases in England to a younger population (electronic supplementary material, figure SI2A). From mid-April to June 2020, national estimates of R t from test-positive cases remained around the same level as those from admissions or deaths, while after this, cases diverged to a higher steady state (figure 1a). On 23 May, the median R t estimated from cases matched that of deaths at 0.83 (both with 90%CrIs 0.78-0.89), but this was followed by a 78 day period before the two estimates were again comparable, on 8 August. Over this period the median R t estimate from cases was on average 14% higher (95%CI 12-15%). Meanwhile, the share of test-positive cases under age 50 increased from under one-quarter of cases in the week of 28 March (24%, N = 16 185), to accounting for nearly three-quarters of cases by 22 August (77%, N = 6733). While the percentage of test-positive cases aged 20-49 increased consistently from April to August, the 0-19 age group experienced a rapid increase over mid-May through July, increasing by a mean 1% each week over 9 May through 1 August (from 4% of 18 774 cases to 14.8% of 5017 cases).
Similarly, Rt estimates from admissions in England oscillated over June through July 2020, potentially linked to the age distribution of hospital admissions. From 0.92 (90% CrI 0.87-0.98) on 11 June, Rt estimated from admissions fell to 0.8 (90% CrI 0.75-0.85) on 27 June. By contrast, this transition was not observed in the Rt estimate based on test-positive cases (figure 1a). Older age groups dominated COVID-19 hospital admissions, where 0-44 years never accounted for more than 12.8% of hospital-based cases (a maximum in the week of 22 August, N = 690; electronic supplementary material, figure SI2B). While the proportion of hospital admissions aged 75+ remained steady over May through mid-June, this proportion appeared to oscillate over July through August (standard deviation of weekly percentage at 6.1 over June-August, compared to 5.4 in months March-May). These variations were not seen in the proportion aged 70+ in the test-positive case data, which saw a continuous decline from 30% at the start of June to 7% by August.
R t estimated from either admissions or deaths experienced near-synchronous local peaks across regions over April and May 2020. We compared this R t estimated from deaths with its source data and a separate regional dataset of deaths in care homes. In the South East and South West, the R t estimates from deaths rose over April, with a peak in early May. In the South West, the median R t estimate from deaths increased by 0.04 from 22 April to 7 May (from 0.8 (90%CrI 0.72-0.88) to 0.84 (90%CrI 0.76-0.95)); and by 0.06 from 17 April to 4 May in the South East (from 0.82 (90% CrI 0.77-0.9) to 0.88 (90%CrI 0.72-0.88)). In both these regions, this early May peak in R t estimates from deaths coincided with similarly rising R t estimates from hospital admissions, while the reverse trend was seen in R t estimates from cases. In all regions, care home deaths peaked over 22-29 April (by date of notification; electronic supplementary material, figure SI3). This was later than regional peaks in the raw count of all deaths in any setting (which peaked between 8 and 16 April, by date of death), even accounting for a 2-3 day reporting lag. This meant that the proportion of deaths from care homes varied over time, where in the South East and South West, deaths in care homes appeared to account for nearly all deaths for at least the period mid-May to July.

Discussion
We estimated the time-varying reproduction number for COVID-19 over March through August 2020 across England and English NHS regions, using test-positive cases, hospital admissions and deaths with confirmed COVID-19. Our estimates of transmission potential varied for each of these sources of infections, and the divergence between estimates from each data source was not consistent within or across regions over time, although estimates based on hospital admissions and deaths were more spatio-temporally synchronous than estimates from cases. We compared differences in Rt estimates to the extent and context of transmission and found that the difference between Rt estimated from cases, admissions and deaths may be linked to uneven rates of testing, the changing age distribution of cases and outbreaks in care home populations. Rt estimates varied by data source, and the extent of variation itself differed by region and over time. Following the initial epidemic peak in mid-March, the date at which Rt estimates crossed below 1 varied by both data source and geography, following which Rt estimates from all data sources varied when not undergoing a clear state change. The differences in these oscillations by data source may indicate different underlying causes. This implies that each data source was influenced differently by changes in subpopulations over time.

Figure 2. Dates in 2020 on which Rt estimate crossed 1 after first epidemic peak, median and 90% credible interval, by the data source for England and seven NHS regions.
Increasingly rapid oscillations in Rt estimates from test-positive cases were associated with higher test-positivity rates. Increasing test-positivity rates could be an indication of inconsistent community testing, with the observation of an initial rise in transmission amplified by expanded testing and local interventions where a cluster of new, mild cases had been identified [18]. This targeted testing may have driven regionally localized instability in case detection and resulting Rt estimates but may not reflect changes in underlying transmission. This is a limitation of monitoring epidemic dynamics using test-positive surveillance data in areas where testing rates vary across the population and over time. This also suggests that Rt estimates from admissions may be more reliable than those from all test-positive cases for indicating the relative intensity of an epidemic over time [37].
We hypothesized that variations in R t estimates were also related to changes in the age distribution of cases over time, because age is associated with severity [38,39]. If each data source represented a different sample of this age-severity gradient, and transmission also varied by age or severity, R t estimates from each source would diverge. Early in the epidemic, tests were largely limited to hospital settings, and disproportionately represented healthcare workers compared to the general population. This sampling bias would be reflected in the R t from test-positive cases. The early peak in R t could then represent a substantial separate route of transmission in healthcare settings, in a wave of nosocomial infections [40]. If healthcare workers were less susceptible to severe disease than those older than working age, an early peak in R t estimated from test-positive cases would not have been represented in R t estimated from hospital admissions or deaths. Meanwhile, either hospital admissions or deaths data would be more representative of sampling a separate route of transmission among the general population. If infections spread through the general population later than nosocomial infections, then the timing of peaks in R t estimates from each data source would not have matched.
From late spring, outbreaks in care homes may have contributed to a divergence between Rt estimates from test-positive cases and other data sources. All regions saw a near-synchronous local peak in Rt estimated from hospital admissions over spring, which was not seen in Rt estimated from test-positive cases. This may have reflected the known widespread regional outbreaks in care homes. The care home population is on average older and more clinically vulnerable than the general population, while also being less likely to appear for community testing [41,42]. Increased transmission in care homes would then be seen in an increased Rt from hospital admissions, but not observed in an Rt from test-positive cases.
Similarly, the age-severity gradient may have impacted transmission estimates later in the epidemic when community testing became more widely available. We found that from June 2020 onwards, Rt estimates from all test-positive cases appeared to increasingly diverge away from Rt estimates from admissions and deaths, transitioning into a separate, higher, steady state. This was followed by the observed age distribution of all test-positive cases becoming increasingly younger, while the age distribution of admissions remained approximately level. Because of the severity gradient, this suggested that the Rt estimates from all test-positive cases and admissions were more biased by the relative proportion of younger cases and older cases, respectively, than the Rt estimates from admissions or deaths.
Our analysis was limited where data or modelling assumptions did not reflect underlying differences in transmission. R t estimates can become increasingly uncertain and unstable with lower case counts. Further, estimated unobserved infections were mapped to reported cases or deaths using two delay distributions: the time from infection to test in the community or hospital, and a longer delay from infection to death. Mis-specification of the priors would have created bias in the temporal distribution of all resulting R t estimates, with estimated dates of infection and R t incorrectly shifted too much or too little in time compared to the true infection curve, and decreased accuracy of R t estimates [43].
We used the same distribution priors for both delays after symptom onset to positive test, and to hospital admission. This may be inaccurate where cases with mild symptoms take longer to present for testing than severe cases presenting for hospital admission, or vice versa. The difference between the two delays over time may also have varied, with a possible decrease in delay to reported tests when mass community testing became available over the summer of 2020. This would have had a differential impact on the accuracy of R t estimates over time in either direction, which could explain some of the oscillations in R t estimates from test-positive case data compared to hospital admissions. We had no data over time on delays from symptom onset to reporting in each data source with which to test this hypothesis. However, we have mitigated some of the impact of this by using a sub-sampled bootstrap of the available delay data when estimating the delay distribution priors. This inflated the uncertainty of these priors in line with the hypothesis that they varied over time. This adjustment may be conservative if the delay distributions are stable over time.
Spatial dependence in delay distributions may also have contributed to their mis-specification and increased uncertainty in Rt estimates. We observed that the variation in Rt estimates from admissions and deaths often showed comparable levels and patterns in oscillations over time but were out of phase with each other. This may have been due to using data sources from different populations for each delay estimate. To estimate the delay between symptom onset to either a positive test or hospitalization, we used a linelist of all patients publicly reported globally, which had a mean delay of 5.4 days (s.d. 5.6). This varied only slightly from an early estimate in the UK epidemic, where the delay from onset to hospitalization had a mean of 5.14 days (s.d. 4.2) in confidential Public Health England (FF100) data [44]. Meanwhile, the same global public linelist contained few records with delay from onset to death, with mean 11.4 (s.d. 16.5). We compared this to confidential UK data from an observational study that had mean delay 14.3 days (s.d. 9.5) [32].
Comparing each type and source of delay, we judged the benefits of using open data to outweigh the minor observed spatial variation of the delay from onset to test or admission, although at the expense of increased uncertainty. However, we judged that the difference in delay from onset to death in the UK compared to public (international) data was sufficiently meaningful to justify using confidential UK data in order to maintain the accuracy of the R t estimate from deaths. The difference in the geographical source of delay distributions should not have substantially altered our conclusions about discrepancies between central estimates of R t from either test-positives or admissions, compared to R t estimated from deaths. However, using the international public linelist for the delay to test or admission may have introduced additional uncertainty around the respective R t estimates, compared to greater accuracy (reduced uncertainty) in estimates of R t from deaths based on a UK-specific delay distribution.
The data sources themselves may also have been inaccurate or biased, which would change the representation of the population we have assumed here. For example, we excluded data from other nations of the UK (Wales, Scotland and Northern Ireland) in our analysis, as these differed in both availability over time and in data collection and reporting practices [19,45]. English regional data may also contain bias where new parts of the population might be under focus for testing efforts, or the population characteristics of hospital admissions from COVID-19 may have changed over time with changes in clinical criteria or hospital capacity for admission. This would mean that an R t estimate from these data sources would represent different source populations over time, limiting our ability to reliably compare against R t estimates from other data sources. Where possible we highlighted this by comparing R t estimates to known biases and changes in case detection and reporting.
Our approach is unable to make strong causal conclusions about varying transmission, and assumptions about sampling and the representation of subpopulations remain implicit. Alternatively, varying epidemics in subpopulations could have been addressed with mechanistic models that explicitly consider transmission in different settings and are fitted to multiple data sources. However, these require additional assumptions, detailed data to parameterize and may be time-consuming to develop. In the absence of data, the number of assumptions required for these models can introduce inherent structural biases. Our approach contains few structural assumptions and therefore may be more robust when data are sparse, or information is required in real-time.
We conclude that when estimating R t , the choice of data source should be guided by the policy context in which the estimates will be used and interpreted. This work highlights that there is no clear superior choice of data source, while R t estimates are sensitive to assumptions about the underlying population of each data source. This means that both producers and users of R t estimates should understand relevant biases in the data source's population sampling strategy, such as by community case detection or patient severity, before drawing conclusions about transmission in the population as a whole.
We also recommend presenting concurrent R t estimates jointly, rather than pooling estimates of R t from different data sources. Pooling estimates would both suffer from unclear weighting and lose useful information about variation in subpopulation transmission. Although the reconstruction of the underlying transmission process from the reporting processes is robust, it is unclear how likelihood-based weights would be assigned to estimates from different data sources. Further, the variation in concurrent R t estimates provides more information about population transmission than any single estimate, when considered in light of the sampling biases of each data source. This additional information can be useful to identify transmission intensity by subpopulation where high quality disaggregated data are not available in real time. While this variation can be difficult to interpret without specific knowledge of population structure and dynamics, the information would be lost altogether in a single or pooled estimate of R t. By contrast, if policy were based on either a single or an averaged R t estimate, it would be unclear what any recommendation should be and for whom.
Future work could explore systematic differences in the influence of data sources on R t estimates by extending the comparison of R t by data source to other countries or infectious diseases. Future work should also clarify the potential for comparing R t estimates in real-time tracking of outbreaks, and explore inconsistencies in case detection over time and space, such as where a cluster of cases leads to a highly localized expansion of community testing and so creates an uneven spatial bias in transmission estimates. Such work may improve R t estimation and yield findings of use for epidemic control. Based on the work presented here we now provide R t estimates, updated each day, for test-positive cases, admissions, and deaths in each NHS region and in England. Our estimates are visualized on our website, are available for download, and are produced using publicly accessible code [46,47].
Tracking differences by data source can improve understanding of variation in testing bias in data collection, highlight outbreaks in new subpopulations, indicate differential rates of transmission among vulnerable populations and clarify the strengths and limitations of each data source. Our approach can quickly identify such patterns in developing epidemics that might require further investigation and early policy intervention. Our method is simple to deploy and scale over time and space using existing open-source tools, and all code and estimates used in this work are available to be used or re-purposed by others.

In context
In the UK, public policy and the media have prominently used the effective reproduction number (R t ) of COVID-19 to summarise ongoing pandemic transmission. Several teams in the UK have been contributing estimates of R t that are aggregated into a consensus range, but the methods, approaches, and data sources for estimating transmission have varied among teams and over time. For example, data sources could, amongst others, include counts of test-positive cases, hospital admissions, or deaths due to COVID-19. In our team's submissions to the Scientific Pandemic Influenza Group on Modelling (SPI-M) from March onwards, we saw that even when using a consistent method, R t estimates were not a single, clear-cut number, but varied depending on the source of data.
In late May, we started to explore whether these differences in transmission estimates from each data source could be a policy-relevant indicator of biased data sampling and subpopulation epidemics. We first presented a summary of the differences in our team's R t estimates by data source to SPI-M as a short note in early June. From June onwards we used all three data sources to estimate R t and contributed them separately to the weekly reproduction number estimates published by SPI-M and considered by the Scientific Advisory Group for Emergencies (SAGE). Over this time, we have adapted our work to support the changing UK policy context. This has meant there are several differences in available data, methods, and implications of this work between the time we first generated the SPI-M report and the time of this publication.
As COVID-19 data became more openly accessible, we started to publish a daily comparison of UK R t estimates by data source (epiforecasts.io/covid/posts/national/united-kingdom). This had initially been impossible as there were very few sources of public subnational data. Thanks to the Public Health England dashboard (coronavirus.data.gov.uk), public data sources for England increased in both quantity and quality and from October we were able to produce subnational R t estimates using a variety of public data sources. We felt that presenting these estimates publicly would be useful given the high level of interest in the government's claimed use of R t as a policy decision tool.
Between generating the original SPI-M submission and this publication, we significantly developed and improved the software we have built to estimate R t ("EpiNow2"). We continue to refine our methods for estimating R t , although the improved methods did not substantially change the trend or direction of differences between estimates and our resulting conclusions.
Our interpretation of the differences in R t estimates has changed over time as we saw new evidence for concentrated transmission in subpopulations. In the earliest paper presented to SPI-M, discussion centred on the likely effects of hospital-acquired infection and testing availability on differences between R t from test-positives compared to admissions or deaths over March and May. However, increasing evidence for a widespread and severe epidemic in care homes provided an alternative explanation for such differences. We realised that, even without disaggregated data by age or residence, simply identifying the differences in R t estimates could have been an early indicator of the epidemic in this vulnerable subpopulation. We therefore continued to track these differences, which once again became wider over the summer as transmission moved between age groups after restrictions were lifted and mass testing became available.
Most importantly, we continue to find new insights into the state of the UK pandemic from comparing R t estimates. One of the clearest trends we have seen in varying R t estimates by data source has followed from the National Health Service vaccination campaign. R t estimates from deaths are now consistently below those from hospitalisations and cases. This is a strong indicator of the positive impact of vaccination, and an encouraging further use for this work.