Epidemiological profiles and associated risk factors of SARS-CoV-2 positive patients based on a high-throughput testing facility in India

We describe the epidemiological characteristics and associated risk factors of those presenting at a large testing centre for SARS-CoV-2 infection. This is a retrospective record review of individuals who underwent SARS-CoV-2 testing by reverse transcription-polymerase chain reaction (RT-PCR) at a high-throughput national-level government facility located in the north of India. Samples collected from 6 April to 31 December 2020 are included in this work and represent four highly populous regions. Additionally, there was a prospective follow-up of 1729 cases through telephone interviews from 25 May 2020 to 20 June 2020. Descriptive analysis was performed to profile the clinico-epidemiological aspects of suspect cases. Multivariable logistic regression analysis was undertaken to determine risk factors associated with SARS-CoV-2 test positivity and symptom status. Details of a total of 125 600 participants are included in this report. The mean (s.d.) age of the participants was 33.1 (±15.3) years and 66% were male. Among those tested, 9515 (7.6%) were positive for COVID-19. A large proportion of positive cases were asymptomatic. In symptomatic positive cases, the commonest symptoms were cough and fever. Increasing age (groups 20-59 and ≥60 years compared to age group less than 5 years), male sex, history of international travel, presence of symptoms of SARS-CoV-2 infection, and being from Delhi or Madhya Pradesh were positively associated with SARS-CoV-2 test positivity. Co-morbidity, risk behaviours and intra-familial positivity were associated with higher odds of exhibiting SARS-CoV-2 symptoms. Intensified testing and isolation of cases, identification of both asymptomatic and symptomatic individuals and additional care of those with co-morbidities and risk behaviours will all be collectively important for disease containment in India. Reasons for differentials in testing between men and women remain an important area for in-depth study.
The increased deployment of vaccines is likely to impact the trajectory of COVID-19 in the coming time, and therefore our data will serve as a comparative resource as India experiences the second wave of infection in light of newer variants that are likely to accelerate disease spread.

SM, 0000-0002-4979-5700; MR, 0000-0003-0932-0935; PD, 0000-0003-3459-1278; RC, 0000-0003-1437-5121; JC-G, 0000-0003-4654-5463; AA, 0000-0001-8455-6054; HS, 0000-0002-1736-368X; CPY, 0000-0001-7531-6307; AS, 0000-0002-3305-0034

Introduction
The first case of novel coronavirus disease in India was reported on 30 January 2020 in a student from Thrissur, Kerala, who had returned from Wuhan, China [1]. Later, cases were reported from other parts of the country; these were mostly connected either with a recent history of international travel or with exposure to a confirmed case of COVID-19. India reported the largest number of confirmed COVID-19 cases in Asia and ranked second worldwide after the United States during the first wave. As of 30 April 2021, India has confirmed a total of greater than 16.9 million cases with 192 311 deaths attributed to COVID-19 [2]. India is now dealing with an explosive second wave of SARS-CoV-2 infections and has the dubious distinction of recording the largest number of daily cases worldwide. The brutal second wave of COVID-19 has hit the nation, with more than 0.3 million cases reported daily since mid-April 2021. Possible reasons for the second wave in India include: a large susceptible population, the presence of virulent mutant strains of the virus, very poor planning and roll-out of vaccination for its masses, complacency in control measures following indications that India was achieving herd immunity, congregation of masses in religious gatherings and political rallies, pandemic fatigue and lowering of guard due to the false propaganda that COVID had been defeated. The rise of the second wave is much steeper than that of the first wave, which peaked in September 2020. Although this study is restricted to the first wave of the Indian epidemic, our analyses will help to interpret data from the second and future waves of SARS-CoV-2 infections.
At the onset of the pandemic, the government of India carried out testing of the exposed people for SARS-CoV-2 based on set criteria which underwent periodic changes in light of the evolving scenario of the pandemic. The criteria for testing were first laid down on 17 March and a nationwide lockdown was implemented on 24 March 2020. The laboratory testing capacity for SARS-CoV-2 has been ramped up since the initial lockdown in the country and it stands at 2501 laboratories as of 30 April 2021 [3] with a testing rate of 1418 daily tests per million population. India had nearly 253 daily confirmed cases per million and its daily positive rate was 21% as of 30 April 2021 [4].
Internationally, there is limited information from SARS-CoV-2 testing centres in regard to the socio-demographic profiles of those who got tested. Countries like Brazil, China, Italy, the UK and the USA have mapped the clinical and epidemiological features of patients with COVID-19. Docherty et al. [5] performed a large prospective cohort study and characterized the clinical features of 20 133 patients who were admitted to hospital with COVID-19 in the UK. The median age of patients admitted to hospital with COVID-19 or diagnosed in hospital was 73 years. More men were admitted than women. The commonest co-morbidities were chronic cardiac disease, uncomplicated diabetes, non-asthmatic chronic pulmonary disease and chronic kidney disease. Within an Asian setting, Huang et al. [6] have reported the epidemiological, clinical, laboratory and radiological features of patients in Wuhan, China. Most of the infected patients were men; the median age was 49 years and less than half had underlying diseases, including diabetes, hypertension and cardiovascular disease. Common symptoms at the onset of illness were fever, cough and myalgia or fatigue. Grasselli et al. [7] have characterized the patients with COVID-19 symptoms in the Lombardy region of Italy. Of the 1591 patients included in the study, the median age was 63 years and 1304 (82%) were males. Of the 1043 patients with available data, 709 (68%) had at least one co-morbidity and 509 (49%) had hypertension.
We report here the clinico-epidemiological features, along with the risk factors among positive patients, in one of the largest cohorts of potential COVID-19 cases (n = 125 600) tested by reverse transcription-polymerase chain reaction (RT-PCR) for the detection of SARS-CoV-2 infection between 6 April 2020 and 31 December 2020 at the National Institute of Biologicals (NIB), an autonomous institute of the Ministry of Health and Family Welfare situated in Noida, Uttar Pradesh, India (figure 2). The deployment of vaccines is likely to impact the trajectory of COVID-19 in the coming time, and therefore our data will serve as a comparative resource as the pandemic continues, especially in light of newer variants or mutant strains that can accelerate disease spread.

Study design and data collection
The study has two components. The first is a retrospective analysis of the data from individuals who were tested for SARS-CoV-2 at NIB within north India. A total of 130 132 samples were tested in the period from 6 April 2020 to 31 December 2020. After excluding records with missing information on demographic variables, symptom status or test results, and repeat samples, a total of 125 600 individuals were included in this study. In addition, a subset of positive cases was followed up prospectively by telephone interview to enquire about symptomatic status, morbidity profile and outcome. The required information was collated from 1729 positive cases in the period from 25 May 2020 to 20 June 2020. The individuals included in this study were people suspected of having been exposed to a confirmed case of COVID-19, symptomatic frontline workers, symptomatic individuals who had undertaken international travel, and those who presented for laboratory testing from containment zones, quarantine centres or self-isolation. The government of India brought in the first guidelines for testing on 17 March 2020, which mandated that the following be tested: all symptomatic individuals (having cough, fever, difficulty in breathing) who (a) had undertaken international travel in the past 14 days, (b) were contacts of laboratory-confirmed cases or (c) were health workers managing COVID-19 patients. Shortly after the first guidelines (on 20 March and 9 April 2020), the testing criteria were broadened to include patients with severe acute respiratory illnesses (SARI) and with influenza-like illness (ILI) belonging to hotspots and gatherings. Major changes in the testing strategy were brought in during the months of May and September 2020 [8,9]. The guidelines released on 18 May 2020, in addition to the existing guidance, identified all symptomatic individuals with SARI and ILI, contacts and migrants as eligible for testing. SARI and ILI were defined clinically.
Another major change in testing strategy came on 4 September 2020, when routine surveillance in containment zones by rapid antigen test was introduced [9]. Under this policy, all patients with ILI/SARI and asymptomatic high-risk patients in hospital or requiring hospitalization due to any co-morbidities (age ≥65 years/immunocompromised status/pregnant women in or near labour, etc.) had to be tested by RT-PCR/TrueNat/CBNAAT. In addition, the guidelines permitted people to self-refer and get themselves tested for SARS-CoV-2 without any prescription. This allowed for on-demand and wider testing. In view of the above time points, we have divided our data into three periods for ease of analysis: P1 (from 6 April to 17 May 2020), P2 (from 18 May to 3 September 2020) and P3 (from 4 September to 31 December 2020) (figure 1). Most individuals belonged to the three neighbouring states of Delhi, Uttar Pradesh and Madhya Pradesh and were identified for testing as per government regulations; additionally, there were 290 samples (0.2%) from Ladakh. During the initial period of this study, NIB was the only centre with high-throughput RT-PCR and hence these Indian states sent their samples to this laboratory for processing (figure 2). Epidemiological data pertaining to demographic characteristics, clinical presentation/symptoms, co-morbidities, hospitalization details, recent travel history, case referral state, etc. were recorded on the Specimen Referral Form (SRF) as per standards laid out by the Indian Council of Medical Research (ICMR), Ministry of Health and Family Welfare, Government of India [10].
royalsocietypublishing.org/journal/rsob Open Biol. 11: 200288
The second part of the study involved a prospective follow-up of individuals who were found to be SARS-CoV-2 positive during the period 6 April to 7 June 2020. We intended to follow up all 2158 positive cases by telephone and were successful in 1729 cases (i.e. a response rate of 80%). The remaining 429 positive individuals could not be followed up because of wrong/invalid numbers and/or unwillingness to respond. The telephone interviews for the 1729 cases were carried out from 25 May 2020 to 20 June 2020 to assess their health status (figure 3). A questionnaire was designed to collect information on family members and their COVID-19 status, lifestyle habits such as tobacco smoking and alcohol intake, course of the disease, hospitalization and recovery. Since the interviews were conducted by telephone, participants' height and weight could not be measured directly to define their obesity status, but proxy indicators were collected in the form of self-perceived status of weight (normal, lean, overweight) and height (normal, short and tall stature).

Laboratory procedures
Trained personnel collected nasopharyngeal and oropharyngeal samples following standard guidelines laid out by the Indian Council of Medical Research (ICMR) [11]. Both types of swab were placed in a single viral transport medium (VTM) tube, packed in a triple-layered casing and transported under cold chain maintenance to the National Institute of Biologicals (NIB), Noida, Uttar Pradesh, India, which had set up a Biosafety level-2 (BSL-2) laboratory under negative pressure. A total of 600 microlitres of sample in VTM was transferred to barcoded secondary tubes, which were fed to the system. A fully automated high-throughput M/s Roche Cobas 6800 system was used for COVID-19 testing, based on detection of the viral genome by real-time RT-PCR. Cobas SARS-CoV-2 diagnostic kits were used; this is a real-time RT-PCR two-target test intended for the qualitative detection of nucleic acids from SARS-CoV-2 in nasopharyngeal/oropharyngeal samples collected in VTM. Limit of detection studies determine the lowest detectable concentration of SARS-CoV-2 at which at least 95% of all (true positive) replicates test positive. The concentration levels with observed hit rates greater than or equal to 95% were 0.009 and 0.003 TCID50/mL for SARS-CoV-2 (Target 1) and pan-Sarbecovirus (Target 2), respectively. The Cobas system processed the samples in batches of 96, including one positive and one negative control. The Cobas 6800 machine could test up to 940 samples in a day under standard operational conditions. The output of the tests was interpreted according to the chart described elsewhere [12].
Figure 1. [...] were adapted from the World Health Organization Coronavirus Disease Dashboard. The red line on the graph represents the cumulative number of suspected cases in the database of the current study. Policy changes in the different time periods were as follows. P1: all symptomatic patients with international travel history, contacts of positive cases, and healthcare workers managing COVID-19 positive patients to be tested. P2: in addition to P1, all healthcare/frontline workers involved in containment and mitigation of COVID-19, all hospitalized patients who developed ILI symptoms, and all symptomatic individuals among returnees and migrants within 7 days of illness to be tested. P3: routine surveillance in containment by rapid diagnostic tests; exposed and asymptomatic individuals, symptomatic patients and high-risk patients (those requiring hospitalization with co-morbidities/elderly ≥65 years/immunocompromised/pregnant women) to be tested by molecular techniques; self-referrals also allowed.
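The batch layout described in this section can be turned into simple throughput arithmetic. The sketch below is illustrative only: it assumes that every batch of 96 is full and that exactly two positions per batch go to controls, and the function names are hypothetical. Under those assumptions each batch carries 94 patient samples, so the stated capacity of 940 samples per day corresponds to ten batches.

```python
# Illustrative throughput arithmetic for a 96-position batch.
# Assumption: each batch holds one positive and one negative control,
# and all remaining positions are filled with patient samples.
POSITIONS_PER_BATCH = 96
CONTROLS_PER_BATCH = 2
DAILY_CAPACITY = 940        # samples per day, as stated in the text
TOTAL_SAMPLES = 130_132     # samples tested from 6 April to 31 December 2020


def patient_samples_per_batch(positions=POSITIONS_PER_BATCH,
                              controls=CONTROLS_PER_BATCH):
    """Number of patient samples that fit in one batch."""
    return positions - controls


def batches_needed(n_samples, per_batch):
    """Smallest number of batches that can hold n_samples (ceiling division)."""
    return -(-n_samples // per_batch)


per_batch = patient_samples_per_batch()
batches_per_day = batches_needed(DAILY_CAPACITY, per_batch)
minimum_batches_overall = batches_needed(TOTAL_SAMPLES, per_batch)

print(f"patient samples per batch: {per_batch}")
print(f"batches per day at stated capacity: {batches_per_day}")
print(f"minimum batches for all samples: {minimum_batches_overall}")
```

The overall batch count is a lower bound, since it ignores partially filled batches and repeat runs.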

Statistical analysis
The RT-PCR positive and negative data were entered in Microsoft Excel and line lists were prepared for all cases. A series of quality checks was applied to ensure data quality. The dataset was then locked and analysed using the statistical software STATA Version 12.0 (StataCorp LP, College Station, TX 77845, USA), in which descriptive analyses were performed. Categorical variables were reported as frequencies and proportions, and continuous variables were summarized as mean and standard deviation (s.d.). We also examined changes in these variables over the three testing periods (P1, P2, P3). Multivariable logistic regression analysis was undertaken for two outcome variables: SARS-CoV-2 test positivity and clinical symptom positivity among the cases followed up. A SARS-CoV-2 positive case was considered symptom positive when any of the symptoms listed by ICMR was present [13]. A range of explanatory variables was included and their adjusted odds ratios along with 95% confidence intervals were calculated. p-values < 0.05 were considered statistically significant.
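As a reading aid for the adjusted odds ratios (AORs) defined above, the following sketch shows how an AOR translates into a percentage change in odds and how a 95% confidence interval is checked against the null value of 1. It does not reproduce the STATA regression; it only interprets the three overall estimates reported under "Associated factors for test positivity for SARS-CoV-2". The function names are hypothetical.

```python
# Interpreting reported adjusted odds ratios (AORs).
# This is not the regression itself; it only restates published estimates.

def percent_change_in_odds(odds_ratio):
    """Percentage change in odds relative to the reference group."""
    return round((odds_ratio - 1.0) * 100.0, 1)


def ci_excludes_null(lower, upper, null_value=1.0):
    """True when the confidence interval does not contain the null value."""
    return lower > null_value or upper < null_value


# (AOR, lower 95% CI bound, upper 95% CI bound) as reported in the results
reported = {
    "age 20-59 vs <5 years": (1.46, 1.24, 1.73),
    "age >=60 vs <5 years": (1.91, 1.60, 2.28),
    "men vs women": (1.08, 1.03, 1.13),
}

summary = {
    label: (percent_change_in_odds(aor), ci_excludes_null(lo, hi))
    for label, (aor, lo, hi) in reported.items()
}

for label, (pct, excludes_one) in summary.items():
    print(f"{label}: {pct:+.1f}% odds, CI excludes 1: {excludes_one}")
```

An interval that excludes 1 is consistent with the p < 0.05 threshold used in this study, although the exact p-value depends on the fitted model.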

Profiles of participants
The study flow chart is shown in figure 3. Details of a total of 125 600 individuals tested at NIB were retrieved from records and included in our study. Two-thirds of samples (66%) were collected from men and less than 1% had a recent history of international travel in the past one to two months (table 1). Institutional quarantine as a measure of curtailing transmission was self-reported by 18% of individuals (table 1). Most samples were received from the state of Uttar Pradesh (83%), followed by the national capital, Delhi (14%). The distribution of characteristics in tested individuals across the three periods of varying inclusion criteria for testing (referred to as P1 to P3 in this paper, as explained earlier) is shown in table 2. No major change was seen across the three policy periods in the percentage of the different age categories tested, except that the proportion of 6-19-year-olds tested increased from 11.8% in P1 to 17.6% in P3. A slight increase in the proportion tested in P3 was also seen for the elderly age group. We also observed an increase in the proportion of women being tested from P1 to P3. Self-report of institutional quarantine decreased from P1 to P3.

Associated factors for test positivity for SARS-CoV-2
The raw data are shown in table 4. Age was found to be associated with test positivity. Participants aged 20-59 years and ≥60 years had 46% (adjusted odds ratio (AOR) 1.46 (95% confidence interval (CI): 1.24, 1.73), p < 0.001) and 91% higher odds (AOR 1.91 (95% CI: 1.60, 2.28), p < 0.001) than participants aged less than 5 years, respectively. A different pattern emerged during P2 and P3, in which only the elderly were found to have a significant association with test positivity (table 5). Sex was found to be associated with SARS-CoV-2 positivity. Overall, the odds of a positive test were higher for men than for women (AOR 1.08 (95% CI: 1.03, 1.13), p < 0.001). The associations for sex differed across the three time frames of our study. In P1, higher odds were found for women, and in P3, higher odds were found for men (

Prospective follow-up of COVID-19 cases
Of the 2158 COVID-19 cases, we could successfully conduct telephone interviews for their health outcomes in 1729 (80%) cases. The mean age (s.d.) of the participants was 33.5 (±15.1) years, and there were 1194 men (69%) in this subset. Of these, 160 (9.2%) participants had co-morbidities. The most common conditions were hypertension and diabetes (39% and 37%, respectively). Current smoking and alcohol intake were reported by 4% and 6% of the participants, respectively. We enquired about the perceived status of weight and height stature as surrogate measures of obesity in this subset of data. Self-reported weight status as perceived by participants was normal for 1365 (79%), lean for 258 (15%) and overweight/obese for 106 (6%). Height was self-perceived as normal for 1352 (78%), short-statured for 120 (7%) and tall statured for 257 (15%). History of more than one family member who had tested as SARS-CoV-2 positive was elicited in 468 (27%) participants.
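The proportions in this paragraph all use the 1729 followed-up cases as the denominator, apart from the response rate, which uses the 2158 positive cases. The short check below recomputes them from the counts given in the text, rounded to whole percentages; the dictionary keys and function name are hypothetical labels chosen for illustration.

```python
# Recomputing follow-up proportions from the counts quoted in the text.
POSITIVE_CASES = 2158   # positive cases targeted for follow-up
FOLLOWED_UP = 1729      # cases successfully interviewed

counts = {
    "men": 1194,
    "weight: normal": 1365,
    "weight: lean": 258,
    "weight: overweight/obese": 106,
    "height: normal": 1352,
    "height: short": 120,
    "height: tall": 257,
    "more than one family member positive": 468,
}


def whole_percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


response_rate = whole_percent(FOLLOWED_UP, POSITIVE_CASES)
not_reached = POSITIVE_CASES - FOLLOWED_UP
percentages = {k: whole_percent(v, FOLLOWED_UP) for k, v in counts.items()}

# Each self-perceived category set should account for every participant.
weight_total = sum(v for k, v in counts.items() if k.startswith("weight"))
height_total = sum(v for k, v in counts.items() if k.startswith("height"))

print(f"response rate: {response_rate}% ({not_reached} not reached)")
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

The weight and height categories each sum to 1729, so no participant is missing from either self-perceived classification.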

Clinical symptoms and care
Of 1729 participants, 1272 (74%) remained completely asymptomatic. Of the 457 symptomatic participants, 427 (93.3%) had recovered and were symptomless at the time of telephone interviews. However, 27 (6%) had current symptoms. The two most common symptoms were fever and cough, which were seen in 322 (70.5%) and 236 (52%) of the symptomatic participants, respectively. Influenza-like illness (ILI with fever and cough) was reported by 170 participants (9.8% of all 1729 followed up) and complaints of breathlessness were recorded in 60 (13%) patients. Loss of taste or smell was reported by only 13 (3%) persons. All people interviewed recalled their symptoms and status well, perhaps due to the high awareness in the general population of the COVID-19 pandemic. Associated factors with clinical symptom positivity are shown in

Discussion
This study presents characteristics of a large cohort of 130 132 individuals who were tested for SARS-CoV-2 infection, of whom 125 600 were selected for further analysis. The overall test positivity for SARS-CoV-2 infection was 7.6%, while over the study period it ranged from 4.6% to 9.0% in India. The test positivity rate at a population level depends on the number of tests performed and the stage of transmission within a pandemic setting. In this study, the test samples were received predominantly from Uttar Pradesh, Delhi and Madhya Pradesh. Among the study states, India's capital (Delhi) recorded the highest positivity, ranging from 6.3% to 20.0% during the study period [14]. The test positivity rate in Uttar Pradesh ranged from 2.5% to 6.8% and that in Madhya Pradesh from 3.5% to 9.7% [14], reflecting changes in disease progression and penetration into the interior of India at different time points. COVID-19 probably spread in cities and urban areas first due to overcrowding and favourable circumstances for spread [15]. There have been three nationwide rounds of sero-surveillance, which reported sero-prevalence in the general population increasing from 0.7% (first round, May-June 2020) to 7% (second round, August-September 2020) and 24% (third round, December 2020-January 2021) [15-17]. Our test positivity was maximum (9%) in P2 (May to September 2020) and minimum (5%) during P3 (September onwards). The presented findings reflect the clinico-epidemiological profiles of a heterogeneous population suspected to be SARS-CoV-2 infected at varying intensities of infection in their respective geographic locations from April to December 2020. Examining the time periods of varying testing guidelines, we observed that younger people, women and asymptomatic participants made up a larger share of those tested at NIB during P3.
The third nationwide survey also reported an increase in sero-positivity among younger people (25% among the 10-17-year-olds surveyed) [17]. Overall, test positivity declined in P3 compared to P1 and P2, again resembling nationwide reporting, where test positivity was highest from June to September and had been declining from October onwards [14]. Participants of all ages tested positive for SARS-CoV-2, and age was found to be positively associated with test positivity and also with exhibiting symptoms. Compared to children, maximum odds of test positivity were seen in elderly subjects, followed by adults in the age group of 20-59 years. The risk could be explained partially by concomitant co-morbidities and possible exposure to other infected people within households, who could be asymptomatically infected. This pattern of higher infection rates is consistent with what is reported for the country. Older age groups are considered to have a higher risk of infection and also of severe disease outcomes (like admissions to intensive care units and deaths), as consistently seen in the global and local literature [18,19]. In India, 80% of deaths among SARS-CoV-2 cases have been reported in those above 50 years of age [20]. In our group of participants who were tested, younger age groups were also infected, albeit with lower positivity risks, indicating that all age groups are susceptible to infection. In P3, younger age groups, including children and adolescents, represented a higher proportion than in prior testing time frames, probably due to the combined efforts of an expansive surveillance strategy and the widening of opportunities for testing to all persons. In absolute numbers, more men in our sample were tested and were found to be positive overall, and the odds of test positivity were 6% higher in men compared to women. Male sex has consistently been reported to be an independent risk factor for test positivity [21].
A nationwide sero-surveillance study done in India reported more sero-positivity in men compared to women in the first round, and nearly equal sero-positivity in men (6.7%) and women (6.5%) in the second round [15]. We also found varying risks in different time periods, with almost equal risk in our P2 from May to September 2020. Men have been reported to have more severe SARS-CoV-2 infections and higher fatalities than women in India and internationally [5,22]. Varying immunogenic responses have been linked to differentials in response to SARS-CoV-2 infections by sex [23]. Gender differentials also have been reported within local and regional studies within India. We also postulate that there exists a differential in testing utilization rates for men and women, as occurs in Indian communities for a variety of health conditions; there are known variations in patterns of health system utilization by men and women [24,25]. The sero-survey conducted in Delhi reported a higher sero-positivity in women [26].
Our sample included 42 779 women, who comprised only approximately one-third of the entire sample tested. Reasons for differentials in testing between men and women remain an important area for in-depth qualitative enquiry. A study in two southern Indian states (Tamil Nadu and Andhra Pradesh), based on contact tracing and active case finding, reported that contacts were younger and more often women than index cases. The same study reported a secondary attack rate of 11% for high-risk close contacts [27].
Participants with a history of international travel had significantly higher odds of test positivity in our sample. India, in its active response, institutionalized a strong and early tracking system by issuing travel advisories, screening and testing passengers, and, at a later stage, curbing international travel to prevent transmission of the disease. As indigenous transmission expanded, fewer cases with international travel history were reported. The guidelines and indications for testing have been periodically revised over the course of the pandemic in India [9,10,13]. Travel-related spread has once again gained importance against the backdrop of increasing transmission in European countries and newer variants or mutational strains accelerating the local spread. The Government of India has issued recent advisories curbing international travel from select nations where enhanced transmission of SARS-CoV-2 has been reported [28]. Also, there is now provision for genomic surveillance and additional tracking of mutant strains of SARS-CoV-2 [28-31]. Further, web-based COVID tracking dashboards like PRACRITI make predictions of the subsequent three weeks in terms of active cases along with basic reproduction number (R0) values for each state in India [32]. The PRACRITI predictions for the second wave of infections that are ongoing (April 2021 onwards) are very alarming.
Significantly, participants with clinical symptoms had two times higher odds of test positivity. Cough and fever were the two most common symptoms reported in our symptomatic patients, followed by sore throat and breathlessness. Similar symptoms have also been reported by other international and Indian studies [5,6,22,33-36]. The profile of participants differs in settings outside India, where the mean age of participants is on the older side, compared to Indian settings, where younger age groups are affected more because of the pre-existing population demographic profile. In our study, the mean age of participants was 33 years. It is noteworthy that for every symptomatic positive case for SARS-CoV-2, there were approximately six times more symptomatic persons whose symptoms were due to reasons other than SARS-CoV-2. Also, there was a large proportion of asymptomatic positive cases in our group of participants. For every one positive case with symptoms, approximately 16 cases were symptomless. A high proportion of asymptomatic SARS-CoV-2 positive cases has also been reported among hospitalized COVID-19 patients and in other reports from India [34-36]. It is already established that there is a high percentage of asymptomatic cases with COVID-19, and these, when undetected, pose significant challenges for the containment of the virus [37-39]. Epidemiological analysis in the state of Karnataka reported predominantly asymptomatic cases in the younger age group of 16-45 years and symptomatic cases in the age band of 31-65 years. The study also suggested that both asymptomatic and symptomatic cases contribute to the spread of infection [40]. Sero-surveillance studies within the country suggest a high infection load in select regions of the country, though actual cases confirmed via testing are strikingly lower in comparison. It has been reported that for every case there are between 27 and 31 infections in the community [15,17,27].
Taking a conservative adjustment factor of 27, the actual number of infected cases up to 31 December 2020 in our study areas would be 16.8 million in Delhi (against 0.6 million cases reported), 15.8 million in Uttar Pradesh (against 0.5 million cases reported) and 6.5 million in Madhya Pradesh (against 0.2 million cases reported). Thus, the potential to go undetected (and possibly still spread the virus) is very high in the Indian context. Another peculiar pattern, though on a positive note, that has been reported in India is the overall high recovery rate and lower fatalities [41]. Nonetheless, co-morbidities have been found to be associated with the expression of symptoms in our sample. Cases with co-morbidities have been found to be associated with severe illnesses requiring hospitalization and critical care (and include fatalities) [22,42,43]. In any case, the claims of a low fatality rate in India are unreliable due to incomplete testing and the lack of availability of data on death records for analysis.
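The adjustment-factor arithmetic above can be written out explicitly. The sketch below assumes the simple relation estimated infections = reported cases × 27 and uses only the figures quoted in the text; the function names are hypothetical. Because the reported counts are quoted to one decimal place, multiplying them by 27 does not reproduce the stated totals exactly, so the sketch also divides the stated totals by 27 to show the reported counts they imply.

```python
# Adjustment-factor arithmetic for undetected infections.
# Assumption: estimated infections = reported cases * adjustment factor.
ADJUSTMENT_FACTOR = 27  # lower end of the 27-31 infections per case range

# Estimated infections up to 31 December 2020, in millions, as stated.
estimated_infections_millions = {
    "Delhi": 16.8,
    "Uttar Pradesh": 15.8,
    "Madhya Pradesh": 6.5,
}


def estimated_infections(reported_millions, factor=ADJUSTMENT_FACTOR):
    """Scale reported cases up to estimated infections."""
    return reported_millions * factor


def implied_reported_cases(infections_millions, factor=ADJUSTMENT_FACTOR):
    """Back-calculate the reported cases implied by an infection estimate."""
    return infections_millions / factor


implied = {
    state: round(implied_reported_cases(total), 3)
    for state, total in estimated_infections_millions.items()
}

for state, reported in implied.items():
    print(f"{state}: about {reported} million reported cases implied")
```

Using the upper end of the reported range (31 infections per case) in place of 27 would raise each estimate proportionally.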
In this work, lifestyle habits like smoking and alcohol consumption appeared as independent risk factors in symptomatic patients. Smoking is already linked to severe COVID-19 illness and fatalities [44]. Both smoking and alcohol can mediate a heightened inflammatory response and weaken host immune defences [45]. Indeed, the extended periods of lockdown and then release could have promoted alcohol abuse [46]. We also found excess weight as an independent factor with higher odds for developing COVID-19 symptoms. We were unable to measure body mass index, and hence, this association may be interpreted with caution. Nevertheless, obesity has been linked with poor immunity and leads to poorer outcomes [47]. Strikingly, our data suggest higher familial SARS-CoV-2 case positivity is significantly associated with increased odds for developing symptoms. Familial clustering does increase disease transmission, including from asymptomatic individuals who aid disease spread [48]. This finding reiterates the pivotal role of contact tracing and subsequent isolation/quarantine of family members in halting transmission of the virus. In the Indian context, this insight is critical given that most households are typified by individuals who share living spaces and facilities. During self-isolation, pulse oximetry also became an integral component of home-based COVID-19 patients' respiratory disease management [49].
Our study has several strengths. It is based on a very large sample of 125 600 participants who were tested for SARS-CoV-2 and it included all age groups. However, this cohort represents only those who had access to healthcare services and presented themselves for testing. We lack data on several parameters that influence health and disease outcomes, including socio-economic index, urban/rural status, poverty and vulnerability status. We do not have information on the exact source of exposure in our cases. We were constrained by the non-availability of data in the records reviewed, because these data are collected at a high-volume testing centre operating under resource and management constraints. Future studies should include outcome data, especially in the context of India, where there are wide social and economic disparities. We followed up a small fraction of positive cases in the initial part of the data collection period, which provided valuable insights into factors associated with symptomatic cases. Our patients largely had a milder spectrum of disease, a pattern that appears to be generalizable to the Indian context for reasons that remain unknown, apart from the overall young demography. Our findings are based on self-reports and information filled in at the time of test requisition. We excluded participants whose information was deficient. Also, infections in our tested sample were predominantly asymptomatic. It was difficult for us to truly differentiate between pre-symptomatic and asymptomatic infections; as has been reported in the literature, people who were asymptomatic when the sample was obtained may exhibit symptoms later in the clinical course of infection [36,50]. Resolving this will require prospective studies on these asymptomatic patients, which will improve our understanding of symptom and outcome status in these infected individuals.
Table 5. Age and sex characteristics of SARS-CoV-2 positive cases among those tested over three testing periods.
In sum, we found a 7.6% positivity rate in a large cohort of those tested for SARS-CoV-2 between April and December 2020 in India. Most of the positive patients were asymptomatic, with cough and sore throat being the commonest symptoms reported. In our follow-up sub-study, a large majority of COVID-19 cases recovered fully, with only 2% continuing to have symptoms. Concomitant disease, smoking, drinking, obesity and having family members with COVID-19 were independent risk factors for expressing symptoms among positive cases. Our findings have several implications for programmatic action directed at COVID-19 containment. As a majority of COVID-19 patients in this cohort were asymptomatic, it is imperative to intensify efforts towards testing, identification and isolation of cases. It is noteworthy that all ages were found to be susceptible to infection, including children and adolescents (less than 19 years). Thus, public health workforces should carry out household-level surveillance for case detection targeting all age groups. Significant factors associated with test positivity were increasing age, male sex, international travel and having symptoms. Co-morbidity was significantly associated with the exhibition of symptoms. A considerable area and population in India remain susceptible to SARS-CoV-2 infection, and the risk factors identified above must be given due attention in testing and surveillance operations. India, with its considerable number of young people who have largely not been infected, remains vulnerable, especially as mutant variants with higher transmissibility and a propensity to infect young people have been reported from other parts of the world [51]. The clinico-epidemiological profiles presented here will be more valuable alongside comparative data from other parts of India and from other COVID-19-afflicted regions of the world.
Also, the COVID-19 mitigation steps taken by the government have provided a blueprint for other infectious diseases [52,53]. Past sero-surveys and the second wave of infections show that a high proportion of the population in India, including in rural areas, remains vulnerable to acquiring SARS-CoV-2 infection. Subsequent waves, driven by current variants and new ones that will arise, may infect even more of the remaining population, and thus only widespread and rapid immunization campaigns can protect the masses in India. The planning and roll-out of COVID-19 vaccines in India have been hampered and severely delayed by pandemic mismanagement, which has also resulted in severe shortages of oxygen, hospital beds, ICUs, medicines and healthcare workers. Vaccination is thus proving to be a huge challenge for the already frail Indian healthcare system, and India may draw valuable lessons from success stories such as the polio vaccination drive [54]. Studies like the current work will be imperative for inferring trends of disease spread in subsequent waves of infection, especially against the backdrop of concomitant immunization and the evolution of new variants of concern.
Ethics. This study was approved by the Institutional Ethics Committee.
Competing interests. We declare we have no competing interests.
Funding. This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.