The history, genome and biology of NCTC 30: a non-pandemic Vibrio cholerae isolate from World War One

The sixth global cholera pandemic lasted from 1899 to 1923. However, despite widespread fear of the disease and of its negative effects on troop morale, very few soldiers in the British Expeditionary Forces contracted cholera between 1914 and 1918. Here, we have revived and sequenced the genome of NCTC 30, a 102-year-old Vibrio cholerae isolate, which we believe is the oldest publicly available live V. cholerae strain in existence. NCTC 30 was isolated in 1916 from a British soldier convalescent in Egypt. We found that this strain does not encode cholera toxin, thought to be necessary to cause cholera, and is not part of V. cholerae lineages responsible for the pandemic disease. We also show that NCTC 30, which predates the introduction of penicillin-based antibiotics, harbours a functional β-lactamase antibiotic resistance gene. Our data corroborate and provide molecular explanations for previous phenotypic studies of NCTC 30 and provide a new high-quality genome sequence for historical, non-pandemic V. cholerae.

Additional genomes used in this study (attached .xls spreadsheet).

Bacterial rehydration and recovery
NCTC 30, NCTC 5395 and NCTC 10732 were recovered from lyophilised stocks according to the method published by Public Health England Culture Collections (https://www.pheculturecollections.org.uk/). Ampoules containing lyophilised bacterial stocks were broken under sterile conditions, and the contents were rehydrated using 0.5 ml LB medium for five minutes at room temperature. This suspension was mixed well and applied to LB agar plates, which were incubated overnight at 37 ˚C (passage 1). Three well-isolated colonies from these plates were single-colony purified onto both LB and TCBS agar, a medium selective for Vibrio species (passage 2). Colonies were taken from TCBS plates (or from LB plates if growth on TCBS agar was poor) and used to inoculate 3 ml LB liquid media, which was incubated for 24 h with shaking (180 rpm) at 37 ˚C (passage 3). Cultures were mixed with glycerol (25% v/v final concentration) and frozen at -80 ˚C.
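
The final glycerol concentration above follows from a simple dilution calculation, sketched here in Python. The concentration of the glycerol stock is not stated in the method, so `stock_fraction` is an assumption, as are the function name and the example volumes. The calculation also assumes that volumes are additive.

```python
def glycerol_stock_volume(culture_ml, final_fraction, stock_fraction):
    """Volume of glycerol stock (ml) to add to a culture to reach a final v/v fraction.

    Derived from final = stock * Vg / (Vc + Vg), assuming additive volumes.
    """
    if not 0 < final_fraction < stock_fraction <= 1:
        raise ValueError("require 0 < final_fraction < stock_fraction <= 1")
    return culture_ml * final_fraction / (stock_fraction - final_fraction)


# Hypothetical example: 1 ml of culture and a 50% v/v glycerol stock (assumed).
volume_50 = glycerol_stock_volume(1.0, 0.25, 0.5)
# Hypothetical example: 1.5 ml of culture and undiluted glycerol (assumed).
volume_100 = glycerol_stock_volume(1.5, 0.25, 1.0)
print(volume_50, volume_100)
```

Under these assumptions, equal volumes of culture and a 50% v/v stock give the 25% v/v final concentration.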

Genomic DNA isolation
Total nucleic acids were extracted from V. cholerae using the Masterpure Complete DNA and RNA Purification kit (Epicentre, #MC85200), with modifications to the manufacturer's instructions. DNA was isolated from two independent clones of NCTC 30 picked at passage 2 (designated MJD382 and MJD439) and one clone of NCTC 5395 (MJD389), a strain that is closely related to 7PET V. cholerae [12]. All clones had been frozen at passage 3. Single colonies isolated from frozen bacterial stocks (passage 4) were used to lawn LB agar plates, which were incubated overnight at 37 ˚C (passage 5). Five loopfuls of bacterial lawn were added to 300 µl Tissue & Cell Lysis Solution supplemented with Proteinase K. Samples were vortexed (10 sec) and incubated at 65 ˚C with intermittent vortexing for 20-25 min or until the suspension had cleared. Samples were treated with RNase A for 30 min and chilled on ice. Proteins were precipitated using MPC Protein Precipitation Solution (150 µl), followed by centrifugation (16,000 x g; 10 min; 4 ˚C). Residual protein was precipitated by re-treating the samples with MPC reagent (30 µl). Genomic DNA (gDNA) was precipitated from the cleared supernatant using room-temperature isopropanol, collected by centrifugation (16,000 x g; 10 min; 4 ˚C), washed twice with 1 ml room-temperature 70% v/v ethanol, dried, and resuspended in 80 µl nuclease-free water. EDTA was excluded from the resuspension solution to avoid interference with PacBio sequencing chemistry.

NCTC 30 genome assembly, annotation, and quality checks
Single-contig assemblies were generated for each of the two NCTC 30 chromosomes from PacBio read data, using HGAP v3 and the RS_HGAP_Assembly.2 protocol via SMRT Portal running SMRT Analysis v2.3.0.140936.p5.167094 [15]. Subreads were filtered for a minimum length of 500 bases; minimum polymerase read quality and length were set to 0.8 and 100, respectively. For assembly, the minimum seed read length was set to 6,000, and the following options were passed to BLASR: '-noSplitSubreads -minReadLength 200 -maxScore -1000 -maxLCPLength 16'. The expected genome size was set to 5 Mbp with a target coverage of 30. The resultant assembly comprised two contigs, one per V. cholerae chromosome. These sequences were circularised with Circlator v1.5.3 [16], using the assembly and the corrected reads as input. A final assembly was obtained by using the circularised sequences as a reference for re-assembly of the PacBio reads with the RS_Resequencing.1 protocol (minimum subread length of 50 bases, minimum polymerase read quality of 75%, minimum polymerase read length of 50 bases, BLASR maximum divergence of 30% and minimum anchor size of 12), which was corrected using Quiver v1. Assemblies were annotated using Prokka v1.5 [17] and a genus-specific database [18]. The PacBio sequencing reads covered the finished assembly to an average depth of 148.01X. To check the accuracy of the PacBio assembly, the corresponding Illumina short reads were mapped to the assembly using SMALT v0.5.8 (http://www.sanger.ac.uk/science/tools/smalt-0), with a maximum and minimum insert size of 1000 and 50, respectively. No single nucleotide polymorphisms were identified in the assembly upon mapping of these data.
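
As a rough illustration of two quantities mentioned above, the Python sketch below applies a minimum subread length filter and computes an average depth of coverage as total aligned bases divided by assembly length. It is not the SMRT Analysis implementation: the function names and toy values are hypothetical, and only the 500-base threshold is taken from the text.

```python
def filter_subreads(lengths, min_length=500):
    """Keep subread lengths at or above the minimum length (500 bases in the text)."""
    return [n for n in lengths if n >= min_length]


def mean_depth(aligned_bases, assembly_length):
    """Average depth of coverage: total aligned bases divided by assembly length."""
    if assembly_length <= 0:
        raise ValueError("assembly_length must be positive")
    return aligned_bases / assembly_length


# Toy values for illustration only; these are not data from the study.
toy_lengths = [120, 450, 500, 8000, 12000]
kept = filter_subreads(toy_lengths)
depth = mean_depth(sum(kept), assembly_length=1000)
print(kept, depth)
```

In practice the depth would be calculated from reads aligned to the finished assembly rather than from raw read lengths, so this sketch gives only an upper bound for a given read set.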

Plasmid extraction, PCR, and molecular cloning
Plasmid DNA was isolated from Escherichia coli cultures using the QIAprep Spin Miniprep kit (Qiagen, #27104). Reaction intermediates were purified using the QIAquick PCR Purification kit (Qiagen, #28104). To clone blaCARB-like, primers oMJD96 and oMJD97 were used to amplify blaCARB-like from MJD382 gDNA. The primers were designed to incorporate restriction enzyme sites and STOP codons as outlined in Figure S7. PCR was carried out [...] Five microlitres of ligation mixtures were used to transform competent E. coli (NEB, #C2987I) according to the manufacturer's instructions. Transformants were selected on solid LB media. E. coli that exhibited resistance to both ampicillin and chloramphenicol upon transformation were cultured and stored as glycerol stocks; these were also confirmed to be tetracycline-sensitive. Plasmids were prepared from these transformants as described above.
The presence of blaCARB-like in pMJD61 was checked by PCR using oMJD88 and oMJD89 (homologous to blaCARB-like), and confirmed by Sanger sequencing (GATC/Eurofins) with oMJD98 and oMJD99 (homologous to the sequences flanking tet on pACYC184).
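
The primer design described above (restriction sites plus STOP codons) can be checked in silico. The Python sketch below tests a sequence for the BamHI (GGATCC) and SalI (GTCGAC) recognition sites named in the Figure S7 caption and finds the first in-frame stop codon. The sequence is a made-up example, not oMJD96 or oMJD97, and the function names are hypothetical.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
SALI_SITE = "GTCGAC"   # SalI recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_site(seq, site):
    """Return True if the recognition site occurs anywhere in the sequence."""
    return site in seq.upper()


def first_in_frame_stop(seq, frame=0):
    """Return the 0-based index of the first stop codon in the given frame, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical sequence for illustration; not a primer from this study.
toy_seq = "GCGGGATCCTAAATGAAA"
print(has_site(toy_seq, BAMHI_SITE), has_site(toy_seq, SALI_SITE))
print(first_in_frame_stop(toy_seq))
```

For a real construct, the frame argument would need to be set relative to the tet start codon, which depends on the insertion position shown in Figure S7.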

Growth curves
Single colonies of V. cholerae were suspended in 0.5 ml LB broth by vortexing (10 sec

VC_1450 Yes (100%) No

Table S3. Presence of select accessory virulence factor genes in the NCTC 30 genome. Identity percentages were calculated by alignment of protein sequences from NCTC 30 and N16961 using BLASTp. The presence of genes encoding MSHA is also illustrated in Figure S6. Since N16961 does not harbour NAG-ST, the translated NAG-ST nucleotide sequence (accession M85198.1) was used as a tBLASTx query to scan the NCTC 30 genome assembly for the presence of this enterotoxin.

Figure S1. The NCTC quality check card for NCTC 30. This isolate has been passaged four times since its initial lyophilisation in 1950. Batch 3, prepared in 1962, was investigated and sequenced in this study. Information which is not relevant to the conclusions drawn in this study has been redacted (black rectangles).

[...] plots were produced using ACT [23] to visualise the presence and absence of nucleotide sequences in the NCTC 30 genome assembly. When compared to the N16961 reference sequence, the absence of CTXφ (A), VPI-1 (B) and VSP-1 (C) from NCTC 30 is evident. No sequences homologous to these elements were detected anywhere in the NCTC 30 assembly. The genes that encode the MSHA accessory virulence determinant are present in NCTC 30 (D), although the NCTC 30 mshQ gene is dissimilar to that of N16961.

Figure S7. Strategy for cloning blaCARB-like into a low-copy plasmid. pACYC184 is a low-copy cloning vector that encodes resistance to both chloramphenicol and tetracycline. blaCARB-like was amplified from the NCTC 30 genome using primers oMJD96 and oMJD97, incorporating BamHI and SalI restriction sites. The amplicon was inserted into the pACYC184 tet gene, such that a premature STOP codon would be introduced in-frame into tet. An analysis of the sequence of blaCARB-like using BPROM (http://softberry.com) predicted