Abstract
This article is devoted to parameter identification in the SI model. We consider several methods, starting with an exponential fit to the early cumulative data of SARS-CoV2 in mainland China. This methodology provides a way to compute the parameters at the early stage of the epidemic. Next, we establish an identifiability result. Then we use the Bernoulli–Verhulst model as a phenomenological model to fit the data and derive some results on parameter identification. The last part of the paper is devoted to numerical algorithms for fitting a daily piecewise constant rate of transmission.
1. Introduction
Estimating the average transmission rate is one of the most crucial challenges in the epidemiology of communicable diseases. This rate conditions the entry into the epidemic phase of the disease and its return to the extinction phase, if it has diminished sufficiently. It combines three factors: the coefficient of virulence, linked to the infectious agent (in the case of infectious transmissible diseases); the coefficient of susceptibility, linked to the host (these two being summarized in the probability of transmission); and the number of contacts per unit of time between individuals [1]. The coefficient of virulence may change over time due to mutation over the course of the disease history. The second and third factors may also change if mitigation measures are taken, as was the case in China from the start of the pandemic [2]. Monitoring the decrease in the average transmission rate is an excellent way to monitor the effectiveness of these mitigation measures. Estimating the rate is therefore a central problem in the fight against epidemics.
The goal of this article is to understand how to compare the SI model to the reported epidemic data, so that the model can be used to predict the future evolution of the epidemic spread and to test various possible scenarios of social mitigation measures. For t ≥ t0, the SI model is the following:
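A form of system (1.1), with initial condition (1.2), that is consistent with the quantities used later in the text (the flux νI(t) of individuals leaving the infectious class, the transmission rate τ(t), and R0(t) = τ(t)S(t)/ν) is given below. It is a reconstruction from those definitions and should be read as such:

```latex
\begin{equation*}
\begin{cases}
S'(t) = -\tau(t)\, S(t)\, I(t),\\[2pt]
I'(t) = \tau(t)\, S(t)\, I(t) - \nu\, I(t),
\end{cases}
\qquad
S(t_0) = S_0, \quad I(t_0) = I_0 .
\end{equation*}
```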
The quantity 1/ν is the average duration of the infectious period and νI(t) is the flux of recovering or dying individuals. At the end of the infectious period, we assume that a fraction f ∈ (0, 1] of the infectious individuals is reported. Let CR(t) be the cumulative number of reported cases. We assume that
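The reporting relation referred to later as (1.3) can be written as follows. This form is inferred from the relation CR′(t) = νf I(t) used in the caption of figure 2 and from the cumulative number of infectious CI(t) of §2, with CR0 denoting the number of cases already reported at time t0 (an assumption about notation):

```latex
\begin{equation*}
CR(t) = CR_0 + \nu f\, CI(t), \qquad
CI(t) = \int_{t_0}^{t} I(\sigma)\, d\sigma , \qquad t \ge t_0 .
\end{equation*}
```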
Assumption 1.1.
We assume that
- S0 > 0 is the number of susceptible individuals at time t0, when we start to use the model;
- 1/ν > 0 is the average duration of the infectious period;
- f > 0 is the fraction of reported individuals.
Throughout this paper, the parameter S0 = 1.4 × 10^9 will be the entire population of mainland China (since COVID-19 is a newly emerging disease). The actual number of susceptibles S0 can be smaller since some individuals can be partially (or totally) immunized by previous infections or other factors. This is also true for SARS-CoV2, even if COVID-19 is a newly emerging disease. In fact, for COVID-19 the level of susceptibility may depend on blood group and genetic lineage. It is indeed suspected that blood group O is associated with a lower susceptibility to SARS-CoV2, while a gene cluster inherited from Neanderthals has been identified as a risk factor for severe symptoms [3,4].
At the very beginning of the epidemic, the average duration of the infectious period 1/ν is unknown, since the virus has never been investigated in the past. Therefore, at the very beginning of the COVID-19 epidemic, medical doctors and public health scientists used previously estimated average durations of the infectious period to make public health recommendations. Here we show that the average infectious period is impossible to estimate by using only the time series of reported cases, and must therefore be identified by other means. Indeed, with the data of SARS-CoV2 in mainland China, we can fit the cumulative number of reported cases almost perfectly for any non-negative value 1/ν < 3.3 days. In the literature, several estimates have been obtained: 11 days in [5], 9.5 days in [6], 8 days in [7] and 3.5 days in [8]. The recent survey by Byrne et al. [9] focuses on this subject.
Result
In §3, our analysis shows that:
- It is hopeless to estimate the exact value of the duration of infectiousness by using SI models. Several values of the average duration of the infectious period give the exact same fit to the data.
- We can estimate an upper bound for the duration of infectiousness by using SI models. In the case of SARS-CoV2 in mainland China, this upper bound is 3.3 days.
In [10], it is reported that transmission of COVID-19 infection may occur from an infectious individual who is not yet symptomatic. In [11], it is reported that COVID-19-infected individuals generally develop symptoms, including mild respiratory symptoms and fever, on average 5–6 days after the infection date (with a confidence of , range 1–14 days). In [12], it is reported that the median time prior to symptom onset is 3 days, the shortest 1 day, and the longest 24 days. It is evident that these time periods play an important role in understanding COVID-19 transmission dynamics. Here the fraction of reported individuals f is unknown as well.
Result
In §3, our analysis shows that:
- It is hopeless to estimate the fraction of reported individuals by using SI models. Several values of the fraction of reported individuals give the exact same fit to the data.
- We can estimate a lower bound for the fraction of reported individuals. We obtain 3.83 × 10^{-5} < f ≤ 1. This lower bound is not significant. Therefore, we cannot say anything about the fraction of unreported individuals from this class of models.
As a consequence, the parameters 1/ν and f have to be estimated by another method, for instance by a direct survey carried out on an appropriate sample of the population in order to evaluate the two parameters.
The goal of this article is to focus on the estimation of the two remaining parameters. Namely, knowing the above-mentioned parameters, we plan to identify
- I0, the initial number of infectious individuals at time t0;
- τ(t), the rate of transmission at time t.
In §2, we will explain how to apply the method introduced in Liu et al. [18] to fit the early cumulative data of SARS-CoV2 in China. This method provides a way to compute I0 and τ0 = τ(t0) at the early stage of the epidemic. In §3, we establish an identifiability result in the spirit of Hadeler [19].
In §4, we use the Bernoulli–Verhulst model as a phenomenological model to describe the data. As it was observed in several articles, the data from mainland China (and other countries as well) can be fitted very well by using this model. As a consequence, we will obtain an explicit formula for τ(t) and I0 expressed as a function of the parameters of the Bernoulli–Verhulst model and the remaining parameters of the SI model. This approach gives a very good description of this set of data. The disadvantage of this approach is that it requires an evaluation of the final size CR∞ from the early beginning (or at least it requires an estimation of this quantity).
Therefore, in order to be predictive, we will explore in the remaining sections of the paper the possibility of constructing a day-by-day rate of transmission. Here we should refer to Bakhta et al. [20] where another novel forecasting method was proposed.
In §5, we will prove that the daily cumulative data can be approached perfectly by at most one sequence of day-by-day piecewise constant transmission rates. In §6, we propose a numerical method to compute such a (piecewise constant) rate of transmission. Section 7 is devoted to the discussion, and we will present some figures showing the daily basic reproduction number for the COVID-19 outbreak in mainland China.
2. Estimating τ(t0) and I0 at the early stage of the epidemic
In this section, we apply the method presented in [21] to the SI model. At the early stage of the epidemic, we can assume that S(t) is almost constant and equal to S0. We can also assume that τ(t) remains constant and equal to τ0 = τ(t0). Therefore, by substituting these values into the I-equation of system (1.1), we obtain

Figure 1. In this figure, we plot the best fit of the exponential model to the cumulative number of reported cases of COVID-19 in mainland China between 19 February and 1 March. We obtain χ1 = 3.7366, χ2 = 0.2650 and χ3 = 615.41 with t0 = 19 Feb. The parameter χ3 is obtained by minimizing the error between the best exponential fit and the data.
The estimated initial number of infected and transmission rate
By using (1.3) and (2.3), we obtain
and by using (2.1)
Remark 2.1.
Fixing f = 0.5 and ν = 0.2, we obtain I0 = 1521 and τ0 = 3.3214 × 10^{-10}.
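The computation behind remark 2.1 can be sketched numerically. The sketch below assumes the early-stage relations that follow from the linearized I-equation, namely I(t) = I0 e^{χ2 (t − t0)} with χ2 = τ0 S0 − ν, together with CR′(t) = νf I(t) and the exponential form CR(t) = χ1 e^{χ2 t} − χ3. It also assumes that t0 is the day index 19. The function name `early_stage_estimates` is hypothetical.

```python
import math

def early_stage_estimates(chi1, chi2, t0, S0, nu, f):
    """Return (I0, tau0) from an exponential fit CR(t) = chi1*exp(chi2*t) - chi3.

    Assumed relations (early stage, S(t) ~ S0 and tau(t) ~ tau0):
      CR'(t0) = nu * f * I0     ->  I0   = chi1*chi2*exp(chi2*t0) / (nu*f)
      chi2    = tau0 * S0 - nu  ->  tau0 = (chi2 + nu) / S0
    """
    I0 = chi1 * chi2 * math.exp(chi2 * t0) / (nu * f)
    tau0 = (chi2 + nu) / S0
    return I0, tau0

# Values quoted in the caption of figure 1; t0 = 19 is an assumed day index.
I0, tau0 = early_stage_estimates(chi1=3.7366, chi2=0.2650, t0=19.0,
                                 S0=1.4e9, nu=0.2, f=0.5)
print(f"I0 ~ {I0:.0f}, tau0 ~ {tau0:.4e}")
```

With f = 0.5 and ν = 0.2, the result is close to the values I0 = 1521 and τ0 = 3.3214 × 10^{-10} used for figure 2.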
The influence of the errors made in the estimations (at the early stage of the epidemic) has been considered in the recent article by Roda et al. [25]. To understand this problem, let us first consider the case of the rate of transmission τ(t) = τ0 in the model (1.1). In that case (1.1) becomes
Theorem 2.2.
Let t > t0 be fixed. The cumulative number of infectious CI(t) is strictly increasing with respect to the following quantities:
(i) I0 > 0, the initial number of infectious individuals;
(ii) S0 > 0, the initial number of susceptible individuals;
(iii) τ > 0, the transmission rate;
(iv) 1/ν > 0, the average duration of the infectiousness period.
Error in the estimated initial number of infected and transmission rate
Assume that the parameters χ1 and χ2 are estimated with a confidence interval
Remark 2.3.
By using the data for mainland China, we obtain
In figure 2, we plot the upper solution CR+(t) and the lower solution CR−(t), obtained by using the upper and lower ends of the confidence intervals, respectively; they bound the blue region. The black curve corresponds to the best estimated values I0 = 1521 and τ0 = 3.3214 × 10^{-10}.
Figure 2. In this figure, the black curve corresponds to the cumulative number of reported cases CR(t) obtained from the model (2.6) with CR′(t) = νf I(t) by using the values I0 = 1521 and τ0 = 3.32 × 10^{-10} obtained from our method and the early data from 19 February to 1 March. The blue region corresponds to the confidence interval when the rate of transmission τ(t) is constant and equal to the estimated value τ0 = 3.32 × 10^{-10}.
Recall that the final size of the epidemic corresponds to the positive equilibrium of (2.7)
3. Theoretical formula for τ(t)
By using the S-equation of model (1.1) we obtain
Theorem 3.1.
Let S0, ν, f, I0 > 0 and CR0 ≥ 0 be given. Let t → I(t) be the second component of system (1.1). Let be a two times continuously differentiable function satisfying
Proof.
Assume first (3.7) is satisfied. Then by using equation (3.1) we deduce that
Conversely, assume that τ(t) is given by (3.8). Then if we define and , by using (3.3) we deduce that
Formula (3.8) was already obtained by Hadeler ([19], see corollary 2).
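For readers who want the intermediate steps, the computation leading to a formula of the type (3.8) can be sketched as follows. The derivation uses only CR′(t) = νf I(t) and the two equations of the SI model (S′ = −τSI and I′ = τSI − νI, taken here as assumptions); the final expression is a reconstruction under those assumptions and should be checked against (3.8):

```latex
\begin{align*}
I(t) &= \frac{CR'(t)}{\nu f}, \qquad
\frac{I'(t)}{I(t)} = \frac{CR''(t)}{CR'(t)} = \tau(t)\,S(t) - \nu,\\[4pt]
S(t) &= S_0 + I_0 - I(t) - \nu \int_{t_0}^{t} I(\sigma)\,d\sigma
      = S_0 + I_0 - \frac{CR'(t)}{\nu f} - \frac{CR(t) - CR_0}{f},\\[4pt]
\tau(t) &= \frac{\nu f\left(\dfrac{CR''(t)}{CR'(t)} + \nu\right)}
               {\nu f\,(I_0 + S_0) - CR'(t) - \nu\,\bigl(CR(t) - CR_0\bigr)} .
\end{align*}
```

The second line is obtained by adding the S- and I-equations, which gives (S + I)′ = −νI, and integrating from t0 to t.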
4. Explicit formula for τ(t) and I0
Many phenomenological models have been compared to the data during the first phase of the COVID-19 outbreak. We refer to the paper of Tsoularis & Wallace [27] for a nice survey on the generalized logistic equations. Let us consider here for example, the Bernoulli–Verhulst equation
Assumption 4.1.
We assume that the cumulative numbers of reported cases CRData(ti) are known for a sequence of times t0 < t1 < · · · < tn+1 (see figure 3).
Figure 3. In this figure, we plot the best fit of the Bernoulli–Verhulst model to the cumulative number of reported cases of COVID-19 in China. We obtain χ2 = 0.66 and θ = 0.22. The black dots correspond to data for the cumulative number of reported cases and the red curve corresponds to the model.
Estimated initial number of infected
By combining (1.3) and the Bernoulli–Verhulst equation (4.1) for t → CR(t), we deduce the initial number of infected
Remark 4.2.
We fix f = 0.5, from the COVID-19 data in mainland China and formula (4.3) (with CR0 = 198), we obtain
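The quantities in this remark can be sketched in code. The block assumes the usual form of the Bernoulli–Verhulst equation, CR′ = χ2 CR (1 − (CR/CR∞)^θ), its closed-form solution, and the relation CR′(t0) = νf I0. The function names are hypothetical, ν = 0.2 is an assumed choice, and the rounded parameters from the figure captions are used, so the result is only close to the value I0 = 954 quoted for figure 5.

```python
import math

def bv_cumulative(t, t0, CR0, CR_inf, chi2, theta):
    """Closed-form solution of CR' = chi2*CR*(1 - (CR/CR_inf)**theta), CR(t0) = CR0."""
    a = (CR0 / CR_inf) ** theta
    growth = math.exp(chi2 * theta * (t - t0))
    return CR0 * math.exp(chi2 * (t - t0)) / (1.0 + a * (growth - 1.0)) ** (1.0 / theta)

def bv_initial_infected(CR0, CR_inf, chi2, theta, nu, f):
    """I0 from CR'(t0) = nu*f*I0, with CR'(t0) given by the Bernoulli-Verhulst equation."""
    return chi2 * CR0 * (1.0 - (CR0 / CR_inf) ** theta) / (nu * f)

# Rounded parameters from the captions of figures 3 and 4; nu = 0.2 is an assumption.
I0 = bv_initial_infected(CR0=198, CR_inf=67102, chi2=0.66, theta=0.22, nu=0.2, f=0.5)
late = bv_cumulative(t=200.0, t0=0.0, CR0=198, CR_inf=67102, chi2=0.66, theta=0.22)
print(f"I0 ~ {I0:.0f}; CR after 200 days ~ {late:.0f}")
```

The closed form starts at CR0 and saturates at CR∞, which is why an estimate of the final size is needed before the formula can be used.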
By using (4.1), we deduce that
Estimated rate of transmission
By using the Bernoulli–Verhulst equation (4.1) and substituting (4.4) in (3.8), we obtain
This formula (4.5) combined with (4.2) gives an explicit formula for the rate of transmission.
Since CR(t) < CR∞, by considering the sign of the numerator and the denominator of (4.5), we obtain the following proposition.
Proposition 4.3.
The rate of transmission τ(t) given by (4.5) is non-negative for all t ≥ t0 if
Compatibility of the model SI with the COVID-19 data for mainland China
The SI model is compatible with the data only when τ(t) stays positive for all t ≥ t0. From our estimation for the COVID-19 data of mainland China, we obtain χ2 θ = 0.14. Therefore, from (4.6), we deduce that the model is compatible with the data only when
This means that the average duration of infectious period 1/ν must be shorter than 3.3 days.
Similarly, the condition (4.7) implies
and since we have CR0 = 198 and CR∞ = 67 102, we obtain
So, according to this estimation, the fraction of reported individuals 0 < f ≤ 1 can be almost as small as we want.
Figure 4 illustrates proposition 4.3. We observe that the formula for the rate of transmission (4.5) becomes negative whenever ν < χ2θ. In figure 5, we plot the numerical simulation obtained from (1.1) to (1.3) when t → τ(t) is replaced by the explicit formula (4.5). It is surprising that we can reproduce perfectly the original Bernoulli–Verhulst even when τ(t) becomes negative (see figure 3). This was not guaranteed at first, since the I-class of individuals is losing some individuals which are recovering.
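The sign change described here can be checked numerically. The sketch below evaluates a transmission rate of the form τ(t) = νf (CR″/CR′ + ν) / (νf (I0 + S0) − CR′ − ν (CR − CR0)), which is what the SI model and CR′ = νf I imply, on a Bernoulli–Verhulst curve, for which CR″/CR′ = χ2 (1 − (1 + θ)(CR/CR∞)^θ). It is meant as an illustration under those assumptions, not as a transcription of (4.5); the function name `bv_tau` is hypothetical.

```python
import math

def bv_tau(t, nu, f=0.5, S0=1.4e9, CR0=198.0, CR_inf=67102.0, chi2=0.66, theta=0.22):
    """Transmission rate implied by a Bernoulli-Verhulst curve; t is measured from t0."""
    a = (CR0 / CR_inf) ** theta
    g = math.exp(chi2 * theta * t)
    x_theta = a * g / (1.0 + a * (g - 1.0))          # (CR(t)/CR_inf)**theta
    CR = CR_inf * x_theta ** (1.0 / theta)           # CR(t)
    dCR = chi2 * CR * (1.0 - x_theta)                # CR'(t)
    ratio = chi2 * (1.0 - (1.0 + theta) * x_theta)   # CR''(t)/CR'(t)
    I0 = chi2 * CR0 * (1.0 - a) / (nu * f)           # from CR'(t0) = nu*f*I0
    numerator = nu * f * (ratio + nu)
    denominator = nu * f * (I0 + S0) - dCR - nu * (CR - CR0)
    return numerator / denominator

for nu in (0.2, 0.1):
    rates = [bv_tau(day, nu) for day in range(0, 61)]
    print(f"nu = {nu}: min tau over 60 days = {min(rates):.3e}")
```

With the rounded values χ2 = 0.66 and θ = 0.22, the minimum stays positive for ν = 0.2 and becomes negative for ν = 0.1, in line with figure 4.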
Figure 4. In this figure, we plot the rate of transmission obtained from formula (4.5) with f = 0.5, χ2 θ = 0.145 < ν = 0.2 (in (a)) and ν = 0.1 < χ2 θ = 0.145 (in (b)), χ2 = 0.66 and θ = 0.22, and CR∞ = 67 102, which is the latest value obtained from the cumulative number of reported cases for China.
Figure 5. In this figure, we plot the number of reported cases by using model (1.1) and (1.3), with the rate of transmission obtained in (4.5). The parameter values are f = 0.5, ν = 0.1 or ν = 0.2, χ2 = 0.66 and θ = 0.22, and CR∞ = 67 102 is the latest value obtained from the cumulative number of reported cases for China. Furthermore, we use S0 = 1.4 × 10^9 for the total population of China and I0 = 954, which is obtained from formula (4.3). The black dots correspond to observed data for the cumulative number of reported cases and the blue curve corresponds to the model.
5. Computing numerically a day-by-day piecewise constant rate of transmission
Assumption 5.1.
We assume that the rate of transmission τ(t) is piecewise constant and, for each i = 0, …, n,
For t ∈ [ti−1, ti], we deduce by using assumption 5.1 that
Theorem 5.2.
Let assumptions 1.1, 4.1 and 5.1 be satisfied. Let I0 be fixed. Then we can find a unique sequence τ0, τ1, …, τn of non-negative numbers such that t → CR(t), the solution of (3.2), fits the data exactly at every time ti, that is to say that
Remark 5.3.
The above theorem means that the data are identifiable for this model SI if and only if the conditions (5.4) and (5.6) are satisfied. Moreover, in that case, we can find a unique sequence of transmission rates τi ≥ 0 which gives a perfect fit to the data.
6. Numerical simulations
In this section, we propose a numerical method to fit the day-by-day rate of transmission. The goal is to take advantage of the monotone property of CR(t) with respect to τi on the time interval [ti, ti+1]. Recently, more sophisticated methods were proposed by Bakhta et al. [20] by using several types of approximation methods for the rate of transmission.
We start with the simplest algorithm, algorithm 1, in order to show the difficulties in identifying the rate of transmission.
Algorithm 1
Step 1: We fix S0 = 1.4 × 10^9, ν = 0.1 or ν = 0.2 and f = 0.5. We consider the system
The map τ0 → CR(t1) being monotone increasing, we can apply a bisection method to find the unique value τ0 solving
Then we proceed by induction.
Step i: For each subsequent integer i, we consider the system
The map τi → CR(ti+1) being monotone increasing, we can apply a bisection method to find the unique value τi solving
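Algorithm 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the SI system with CR′ = νf I, a fixed-step fourth-order Runge–Kutta integrator, a search bracket [lo, hi] chosen by hand, and synthetic data generated from known daily rates. The names `advance_one_day` and `fit_daily_rates` are hypothetical.

```python
def advance_one_day(state, tau, nu, f, substeps=50):
    """Integrate S' = -tau*S*I, I' = tau*S*I - nu*I, CR' = nu*f*I over one day (RK4)."""
    S, I, CR = state
    h = 1.0 / substeps

    def rhs(S, I):
        return -tau * S * I, tau * S * I - nu * I, nu * f * I

    for _ in range(substeps):
        k1 = rhs(S, I)
        k2 = rhs(S + 0.5 * h * k1[0], I + 0.5 * h * k1[1])
        k3 = rhs(S + 0.5 * h * k2[0], I + 0.5 * h * k2[1])
        k4 = rhs(S + h * k3[0], I + h * k3[1])
        S += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        I += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        CR += h * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
    return S, I, CR

def fit_daily_rates(data, S0, I0, nu, f, lo, hi, iterations=100):
    """Bisection on each daily rate so that CR matches the next data point."""
    state = (S0, I0, data[0])
    rates = []
    for target in data[1:]:
        a, b = lo, hi
        for _ in range(iterations):
            mid = 0.5 * (a + b)
            if advance_one_day(state, mid, nu, f)[2] < target:
                a = mid   # CR too small at the end of the day: increase the rate
            else:
                b = mid
        tau = 0.5 * (a + b)
        rates.append(tau)
        state = advance_one_day(state, tau, nu, f)  # restart from the fitted state
    return rates

# Synthetic check: generate data from known rates, then recover them.
S0, I0, nu, f = 1.4e9, 1521.0, 0.2, 0.5
true_rates = [3.3e-10, 3.0e-10, 2.5e-10]
state, data = (S0, I0, 198.0), [198.0]
for tau in true_rates:
    state = advance_one_day(state, tau, nu, f)
    data.append(state[2])
recovered = fit_daily_rates(data, S0, I0, nu, f, lo=-5.0 / S0, hi=5.0 / S0)
print(recovered)
```

The bracket deliberately includes negative values, since the text reports negative fitted rates. On exact synthetic data the daily rates are recovered; the instability discussed in this section arises when the same procedure is applied to the reported data.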
In figure 6, we plot an example of such a perfect fit, which is the same for ν = 0.1 and ν = 0.2. In figure 7, we plot the rate of transmission obtained numerically for ν = 0.2 in (a) and ν = 0.1 in (b). This is an example of a negative rate of transmission. Figure 7 should be compared to figure 4 which gives a similar result.
Figure 6. In this figure, we plot the perfect fit to the cumulative number of reported cases of COVID-19 in China. We fix the parameters f = 0.5 and ν = 0.2 or ν = 0.1 and we apply our algorithm 1 to obtain the perfect fit. The black dots correspond to data for the cumulative number of reported cases and the blue curve corresponds to the model.
Figure 7. In this figure, we plot the rate of transmission obtained for the reported cases of COVID-19 in China with the parameters f = 0.5 and ν = 0.2 in (a) and ν = 0.1 in (b). This rate of transmission corresponds to the perfect fit obtained in figure 6.
In figures 8–10, we use algorithm 1 and plot the rate of transmission obtained from the reported cases of COVID-19 in China, with the parameters fixed as f = 0.5 and ν = 0.2. In all three figures, we observe an oscillating rate of transmission that alternates between positive and negative values. These oscillations are due to the amplification of the error in the numerical method itself. In figure 8, we run the same simulation as in figure 9 but over a shorter period. There we can see that the slope of CR(t) at the times t = ti separating 2 consecutive days (the black dots) is amplified from 1 day to the next.
Figure 8. In (a), we plot the cumulative number of reported cases obtained from the data (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable. We obtain a rate of transmission that oscillates back and forth between positive and negative values.
Figure 9. In (a), we plot the cumulative number of reported cases obtained from the data (black dots) and the model (blue curve) over a period six times longer than in figure 8. In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable, as in figure 8. We obtain a rate of transmission that oscillates back and forth between positive and negative values.
Figure 10. We apply algorithm 1 to the regularized data. In (a), we plot the regularized cumulative number of reported cases obtained from the data (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 1. We see that we can fit the data perfectly. But the method is very unstable. We obtain a rate of transmission that oscillates back and forth between positive and negative values.
In figure 10, we first smooth the original cumulative data by using the Matlab function CRData = smoothdata(CRData,'gaussian',50) to regularize the data and we apply algorithm 1. Unfortunately, smoothing the data does not help to solve the instability problem in figure 10.
We need to introduce a correction when choosing the next initial value I(ti). In algorithm 1, the errors are due to the following relationship:
In figure 11, we smooth the data first by using the Matlab function CRData = smoothdata(CRData,'gaussian',50), and we apply algorithm 2 by approximating equation (6.6) by

Figure 11. In this figure, we plot the rate of transmission obtained by using the reported cases of COVID-19 in China with the parameters f = 0.5 and ν = 0.2. We first regularize the data by applying the Matlab function CRData = smoothdata(CRData,'gaussian',50). Then we apply algorithm 2 to the regularized data. In (a), we plot the regularized cumulative number of reported cases obtained after smoothing (black dots) and the model (blue curve). In (b), we plot the daily rate of transmission obtained by using algorithm 2. We see that we can fit the data perfectly and this time the rate of transmission becomes reasonable.
Algorithm 2
We fix S0 = 1.4 × 10^9, ν = 0.1 or ν = 0.2 and f = 0.5. Then we fit the data by using the method described in §2 to estimate the parameters χ1, χ2 and χ3 from day 1 to 10. Then we use
For each integer , we consider the system
The key idea of this new algorithm is the following correction on the I-component of the system. We start a new step by using the value obtained from the previous iteration and
In figure 12, we plot several types of regularized cumulative data in (a) and several types of regularized daily data in (b). Among the different regularization methods, an important one is the Bernoulli–Verhulst best-fit approximation.
Figure 12. In this figure, we plot the cumulative number of reported cases (a) and the daily number of reported cases (b). The black curves are obtained by applying the cubic spline Matlab function spline(Days,DATA) to the cumulative data. The left-hand side is obtained by using the cubic spline function and the right-hand side is obtained by using the derivative of the cubic spline interpolation. The blue curves are obtained by applying the cubic spline function to the day-by-day values of the cumulative number of cases obtained from the best fit of the Bernoulli–Verhulst model. The orange curves are obtained by computing the rolling weekly average of the daily number of cases (we use the Matlab function smoothdata(DAILY,'movmean',7)) and then by applying the cubic spline function to the corresponding cumulative number of cases. The yellow curves are obtained by applying Gaussian weekly smoothing to the daily number of cases (we use the Matlab function smoothdata(DAILY,'gaussian',7)) and then by applying the cubic spline function to the corresponding cumulative number of cases.
In figure 13, we plot the rate of transmission t → τ(t) obtained by using algorithm 2. We can see that the original data give a negative transmission rate while, at the other extreme, the Bernoulli–Verhulst regularization seems to give the most regular transmission rate. In figure 13a, we observe that we now recover almost perfectly the theoretical transmission rate obtained in §4. In figure 13b (rolling weekly average regularization) and figure 13c (Gaussian weekly average regularization), the transmission rate still varies a lot and, in both cases, becomes negative after some time. In figure 13d, the original data give a transmission rate that is negative from the beginning. We conclude that it is crucial to find a ‘good’ regularization of the daily number of cases. So far the best regularization is obtained by using the best fit of the Bernoulli–Verhulst model.
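As an illustration of the rolling weekly regularization, the sketch below computes a centered 7-day moving mean of the daily counts and rebuilds the cumulative series. It is a plain-Python stand-in for the Matlab call quoted in the caption of figure 12, under the assumption that the averaging window is simply truncated at both ends of the series; the function names are hypothetical.

```python
def moving_mean(values, window=7):
    """Centered moving mean; the window is truncated near both ends of the series."""
    half = window // 2
    smoothed = []
    for k in range(len(values)):
        chunk = values[max(0, k - half):k + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def regularized_cumulative(cumulative, window=7):
    """Smooth the daily increments, then re-accumulate from the first data point."""
    daily = [b - a for a, b in zip(cumulative, cumulative[1:])]
    smooth_daily = moving_mean(daily, window)
    result = [float(cumulative[0])]
    for increment in smooth_daily:
        result.append(result[-1] + increment)
    return result

# First ten values of table 1 (19 to 28 January).
cumulative = [198, 291, 440, 571, 830, 1287, 1975, 2744, 4515, 5974]
smoothed = regularized_cumulative(cumulative)
print(smoothed)
```

Smoothing the daily increments rather than the cumulative values keeps the regularized cumulative curve non-decreasing whenever the daily counts are non-negative.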
Figure 13. In this figure, we plot the transmission rates t → τ(t) obtained by using algorithm 2 with the parameters f = 0.5 and ν = 0.2. We use the cumulative data obtained by using (a) the Bernoulli–Verhulst regularization, (b) the rolling weekly average regularization, (c) the Gaussian weekly average regularization and in (d) we use the original cumulative data.
Remark 6.1.
For each of the simulations in figure 13b,c, it is possible to obtain a transmission rate t → τ(t) that is non-negative for all time t by sufficiently increasing the parameter ν. Nevertheless, we do not present these simulations here because the values of ν needed to obtain a non-negative τ(t) are unrealistic.
In figure 14a–d, we plot the daily basic reproduction number corresponding to figure 13a–d, respectively. The red line corresponds to R0 = 1. We see some complex behaviour in figure 14b–d, which is again unrealistic.
Figure 14. In this figure, we plot the daily basic reproduction number t → R0(t) = τ(t)S(t)/ν obtained by using algorithm 2 with the parameters f = 0.5 and ν = 0.2. We use the cumulative data obtained by using (a) the Bernoulli–Verhulst regularization, (b) the rolling weekly average regularization, (c) the Gaussian weekly average regularization and in (d) we use the original cumulative data.
7. Discussion
Estimating the parameters of an epidemiological model is always difficult and generally requires strong assumptions about their value and their consistency and constancy over time. Despite this, it is often shown that many sets of parameter values are compatible with a good fit of the observed data. The new approach developed in this article consists first of all in postulating a phenomenological model of growth of infectious, based on the very classic model of Verhulst, proposed in demography in 1838 [28]. Then, obtaining explicit formulae for important parameter values such as the transmission rate or the initial number of infected (or for lower and/or upper limits of these values), gives an estimate allowing an almost perfect reconstruction of the observed dynamics.
The use of phenomenological models can also be regarded as a way of smoothing the data. Indeed, the errors concerning the observations of new infected cases are numerous:
- the census is rarely regular: many countries report late the cases that occurred during the weekend and, at varying times, add data from specific counts, such as those from homes for the elderly;
- the number of observed cases is still underestimated, and the calculation of the unreported new cases of infected is always a difficult problem [21];
- the raw data are sometimes reduced for medical reasons of poor diagnosis or lack of detection tools, or for reasons of domestic policy of states.
In this article, we developed several methods to understand how to reconstruct the rate of transmission from the data. In §2, we reconsidered the method presented in [21], based on an exponential fit to the early data. This approach gives a first estimate of I0 and τ0. In §3, we proved a result connecting the time-dependent cumulative reported data and the transmission rate. In §4, we compared the data to the Bernoulli–Verhulst model and used this model as a phenomenological model. The Bernoulli–Verhulst model fits the data for mainland China very well. Next, by replacing the data by the solution of the Bernoulli–Verhulst model, we obtained an explicit formula for the transmission rate. From it, we derived some conditions on the parameters for the applicability of the SI model to the data for mainland China. In §5, we discretized the rate of transmission and observed that, given some daily cumulative data, we can get at most one perfect fit to the data. Therefore, in §6, we provided two algorithms to compute numerically the daily rates of transmission. Such numerical questions turn out to be delicate. This problem was previously considered by another French group, Bakhta et al. [20]. Here we use some simple ideas to approximate the derivative of the cumulative reported cases, combined with a smoothing method applied to the data.
To conclude this article, we plot the daily basic reproduction number R0(t) = τ(t)S(t)/ν in figure 15.

Figure 15. In this figure we plot R0(t) = τ(t)S(t)/ν the daily basic reproduction number and we vary the parameter f (a) and ν (b).
Concerning contagious diseases, public health physicians are constantly facing four challenges. The first concerns the estimation of the average transmission rate. Until now, no explicit formula had been obtained in the case of the SIR model, according to the observed data of the epidemic, that is to say the number of reported cases of infected patients. Here, from realistic simplifying assumptions, a formula is provided (formula (4.5)), making it possible to accurately reconstruct theoretically the curve of the observed cumulative cases. The second challenge concerns the estimation of the mean duration of the infectious period for infected patients. As for the transmission rate, the same realistic assumptions make it possible to obtain an upper limit to this duration (inequality (4.8)), which makes it possible to better guide the individual quarantine measures decided by the authorities in charge of public health. This upper bound also makes it possible to obtain a lower bound for the percentage of unreported infected patients (inequality (4.8)), which gives an idea of the quality of the census of cases of infected patients, which is the third challenge faced by epidemiologists, specialists of contagious diseases. The fourth challenge is the estimation of the average transmission rate for each day of the infectious period (dependent on the distribution of the transmission over the ‘ages’ of infectivity), which will be the subject of further work and which poses formidable problems, in particular those related to the age (biological age or civil age) class of the patients concerned. Another interesting prospect is the extension of methods developed in the present paper to the contagious non-infectious diseases (i.e. without causal infectious agent), such as social contagious diseases, the best example being that of the pandemic linked to obesity [29–31], for which many concepts and modelling methods remain available.
Data accessibility
The data in my paper are public and can be found at: https://en.wikipedia.org/wiki/COVID-19_pandemic_in_mainland_China; http://www.nhc.gov.cn/yjb/pzhgli/new_list.shtml.
Authors' contributions
P.M. conceived and designed the study, and analysed the data. P.M. and Q.G. carried out the analysis and performed numerical simulations, and all authors conducted the literature review. All authors participated in writing and reviewing of the manuscript.
Competing interests
The authors declare no conflict of interest.
Funding
This research was funded by the Agence Nationale de la Recherche in France (Project name: MPCUII (P.M.) and (Q.G.)).
Appendix A. Supplementary table
We use cumulative reported data from the National Health Commission of the People’s Republic of China and the Chinese CDC for mainland China. Before 11 February, the data were based on confirmed testing. From 11 February to 15 February, the data included cases that were not tested for the virus, but were clinically diagnosed based on medical imaging showing signs of pneumonia. There were 17 409 such cases from 10 February to 15 February. The data from 10 February to 15 February specified both types of reported cases. From 16 February, the data did not separate the two types of reporting, but reported the sum of both types. We subtracted 17 409 cases from the cumulative reported cases after 15 February to obtain the cumulative reported cases based only on confirmed testing after 15 February. The data are given in table 1 with this adjustment.
Table 1. Cumulative reported cases for mainland China, by day. Values after 15 February are written as the raw count minus the 17 409 clinically diagnosed cases.
January
Day | 19 | 20 | 21 | 22 | 23 | 24 | 25
Cases | 198 | 291 | 440 | 571 | 830 | 1287 | 1975
Day | 26 | 27 | 28 | 29 | 30 | 31
Cases | 2744 | 4515 | 5974 | 7711 | 9692 | 11 791
February
Day | 1 | 2 | 3 | 4 | 5 | 6 | 7
Cases | 14 380 | 17 205 | 20 438 | 24 324 | 28 018 | 31 161 | 34 546
Day | 8 | 9 | 10 | 11 | 12 | 13 | 14
Cases | 37 198 | 40 171 | 42 638 | 44 653 | 46 472 | 48 467 | 49 970
Day | 15 | 16 | 17 | 18 | 19 | 20 | 21
Cases | 51 091 | 70 548 − 17 409 | 72 436 − 17 409 | 74 185 − 17 409 | 75 002 − 17 409 | 75 891 − 17 409 | 76 288 − 17 409
Day | 22 | 23 | 24 | 25 | 26 | 27 | 28
Cases | 76 936 − 17 409 | 77 150 − 17 409 | 77 658 − 17 409 | 78 064 − 17 409 | 78 497 − 17 409 | 78 824 − 17 409 | 79 251 − 17 409
Day | 29
Cases | 79 824 − 17 409
March
Day | 1 | 2 | 3 | 4 | 5 | 6 | 7
Cases | 79 824 − 17 409 | 79 824 − 17 409 | 79 824 − 17 409 | 80 409 − 17 409 | 80 552 − 17 409 | 80 651 − 17 409 | 80 695 − 17 409
Day | 8 | 9 | 10 | 11 | 12 | 13 | 14
Cases | 80 735 − 17 409 | 80 754 − 17 409 | 80 778 − 17 409 | 80 793 − 17 409 | 80 813 − 17 409 | 80 824 − 17 409 | 80 844 − 17 409
Day | 15 | 16 | 17 | 18
Cases | 80 860 − 17 409 | 80 881 − 17 409 | 80 894 − 17 409 | 80 928 − 17 409
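The adjustment applied to the table can be written as a short sketch. It assumes raw counts indexed by the day of February and subtracts the 17 409 clinically diagnosed cases from every value dated after 15 February; the dictionary below holds only three entries of the table for illustration, and the function name is hypothetical.

```python
CLINICALLY_DIAGNOSED = 17409  # cases diagnosed from medical imaging, not confirmed by testing

def adjust_cumulative(raw_by_february_day):
    """Subtract the clinically diagnosed cases from raw counts dated after 15 February."""
    return {
        day: count - CLINICALLY_DIAGNOSED if day > 15 else count
        for day, count in raw_by_february_day.items()
    }

raw = {15: 51091, 16: 70548, 17: 72436}   # day of February -> raw cumulative count
adjusted = adjust_cumulative(raw)
print(adjusted)
```

The same subtraction applies unchanged to the March entries, since the raw cumulative counts keep including the clinically diagnosed cases.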