Immune activation by a multigene family of lectins with variable tandem repeats in oriental river prawn (Macrobrachium nipponense)

Genomic regions with repeated sequences are unstable and prone to rapid DNA diversification. However, the role of tandem repeats within the coding region is not fully characterized. Here, we have identified a new hypervariable C-type lectin gene family with different numbers of tandem repeats (Rlecs; R means repeat) in oriental river prawn (Macrobrachium nipponense). Two types of repeat units (33 or 30 bp) are identified in the second exon, and the number of repeat units varies from 1 to 9. Rlecs can be classified into 15 types through phylogenetic analysis. The amino acid sequences in the same type of Rlec are highly conserved outside the repeat regions. The main differences among the Rlec types are evident in exon 5. A variable number of tandem repeats in Rlecs may be produced by slip mispairing during gene replication. Alternative splicing contributes to the multiplicity of forms in this lectin gene family, and different types of Rlecs vary in terms of tissue distribution, expression quantity and response to bacterial challenge. These variations suggest that Rlecs have functional diversity. The results of experiments on sugar binding, microbial inhibition and clearance, regulation of antimicrobial peptide gene expression and prophenoloxidase activation indicate that the function of Rlecs with the motif of YRSKDD in innate immunity is enhanced when the number of tandem repeats increases. Our results suggest that Rlecs undergo gene expansion through gene duplication and alternative splicing, which ultimately leads to functional diversity.

XZ, 0000-0002-1100-6294; QR, 0000-0002-5347-5697

Introduction
Repetitive DNA sequences are widespread and abundant in genomic DNA; almost half of the human genome consists of repeats [1,2]. A subset of repetitive DNA consists of tandem repeats, in which short sequence units (e.g. CAG) lie adjacent to each other. The terms 'microsatellites' and 'minisatellites' are often used to represent tandem repeats with short (less than or equal to 9 bp) and long (greater than 9 bp) repeat units, respectively. Tandem repeats can be mutation hotspots due to their repetitive features. Slippage during DNA replication or recombination events results in alleles with different numbers of repeat units, which are referred to as 'copy numbers'. Tandem repeats have mutation rates that are 10-10 000 times the average rate of other genomic loci [3]. Most tandem repeats have no direct biological function because of their instability and lack of genetic information. Such tandem repeats are referred to as 'junk' DNA [4,5]. However, tandem repeats are useful as genetic markers in genotyping and forensic science and offer additional advantages for genome-wide linkage studies [6]. Furthermore, repeats are present in the functional (coding and regulatory) regions of genomes [7] and can alter the function and/or expression of genes to enable organisms to adapt quickly to new environments [8].
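The length threshold and the notion of copy number described above can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and serve only as an illustration; the 9 bp cut-off is the one quoted in the text.

```python
def classify_tandem_repeat(unit: str) -> str:
    """Classify a repeat unit by length (<= 9 bp: microsatellite, > 9 bp: minisatellite)."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    return "microsatellite" if len(unit) <= 9 else "minisatellite"


def copy_number(sequence: str, unit: str) -> int:
    """Count how many times `unit` occurs back-to-back at the start of `sequence`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    count = 0
    # Step through the sequence one unit at a time while the unit keeps matching.
    while sequence.startswith(unit, count * len(unit)):
        count += 1
    return count


toy_allele = "CAGCAGCAGCAGTTA"  # hypothetical allele carrying four CAG units
print(classify_tandem_repeat("CAG"))     # microsatellite
print(copy_number(toy_allele, "CAG"))    # 4
print(classify_tandem_repeat("A" * 33))  # minisatellite (a 33 bp unit)
```

Under this definition, the 30 and 33 bp repeat units reported for Rlecs fall in the minisatellite range.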
Repetitive sequences are important promoters of biological genomic DNA evolution, but the origin and evolutionary mechanism of tandem repeats have been controversial [9]. An early explanation for this variation is that DNA slip mismatch occurs during replication, and the DNA is then repaired and recombined to produce a repetitive sequence [10]. However, no study has fully explained the diversity of repeat sequence types within the same genome, and between the coding and the non-coding regions. Host defence proteins are important in combating microbial infections. Few excess tandem repeat variations have been observed in human defence proteins, but tandem repeat polymorphisms may arise in invertebrate defence proteins, which have a large population size [11]. Thus, the excessive number of tandem repeat polymorphisms in invertebrate defence proteins needs further investigation.
Host immunity is a continuous game between host and pathogen. Pathogens can invade the host quickly and efficiently, and the immune system is responsible for protecting the host from pathogens. This long-term coevolution between hosts and pathogens undergoes a process in which the mutation rate of long-generation hosts is low, whereas that of short-generation microorganisms is high [12]. This evolutionary mechanism enables hosts to protect themselves effectively against pathogens that show evolutionary variations. The best option for the host is to find ways to generate random or near-random diversification and expand immune receptors.
Lectins are pattern recognition receptors (PRRs) that are actively involved in various life processes, including protein trafficking, cell adhesion, phagocytosis, cell signalling, complement activation and non-self recognition [13]. Lectins have carbohydrate recognition domains (CRDs), which bind to sugars. Different CRDs have characteristic structures with unique amino acid motifs that can recognize specific types of sugars [14]. Among the lectins, C-type lectins (CTLs) are the most abundant and widely studied [15]. The main characteristic of CTLs is the presence of at least one CRD containing a highly conserved Ca²⁺-binding site 2 with a QPD (Gln-Pro-Asp) or EPN (Glu-Pro-Asn) motif, which is specific for galactose or mannose binding, respectively [16]. CTLs, which exist in almost all metazoans, are highly conserved in vertebrates and considerably diverse in invertebrates [15]. A single species can contain a variety of CTLs that differ widely in amino acid sequence. However, these CTLs do not belong to a single family. A growing number of crustacean CTLs have been identified as PRRs or effectors participating in a series of immune defence responses [15]. The presence of tandem repeat polymorphisms in a single family of CTLs, as a kind of host defence protein, has not been studied.
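As a minimal sketch of how the QPD and EPN motifs can be located in a protein sequence, the following Python snippet scans for both tripeptides and reports the sugar specificity that the text associates with each. The function name and the toy sequence are hypothetical, and a motif hit is only a prediction of specificity, not experimental evidence of binding.

```python
import re

# Motif-to-sugar mapping as stated in the text (QPD: galactose, EPN: mannose).
CRD_MOTIFS = {"QPD": "galactose", "EPN": "mannose"}


def find_crd_motifs(protein: str):
    """Return (motif, 1-based start position, predicted sugar) for every hit."""
    hits = []
    # A lookahead is used so that overlapping matches would also be reported.
    for match in re.finditer(r"(?=(QPD|EPN))", protein.upper()):
        motif = match.group(1)
        hits.append((motif, match.start() + 1, CRD_MOTIFS[motif]))
    return hits


toy_crd = "MKTWCEPNNSGAQPDW"  # hypothetical toy sequence, not a real Rlec
print(find_crd_motifs(toy_crd))
# [('EPN', 6, 'mannose'), ('QPD', 13, 'galactose')]
```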
Here, we have identified a hypervariable CTL gene family with tandem repeat polymorphisms (Rlecs; R means repeat) in oriental river prawn (Macrobrachium nipponense). A common feature of this lectin gene family is that all Rlecs have variable numbers of tandem repeats in the coding regions outside the CRDs. The number of repeat units varies from 1 to 9. Moreover, we have characterized the arrangement patterns of the repeat units in this lectin gene family. Rlecs with a motif of six different amino acids can be classified into 15 types through phylogenetic analysis. We have also determined the tissue distributions and expression patterns of eight types of Rlecs under bacterial challenge. Finally, a functional study on Rlecs-YRSKDD with variable numbers of tandem repeats (1, 2, 7 or 9) in innate immunity is conducted.

Characterization and sequence analysis of Rlecs
We found a lectin gene containing tandem repeats by analysing the transcriptome data of M. nipponense. More than one band was amplified using a pair of gene-specific primers (data not shown). After sequencing many clones, we found that these lectin genes belong to a multigene family. Because these lectin genes had tandem repeats, they were named Rlecs. A common feature of these lectin genes was their tandem repeat region. The tandem repeat unit was either 33 or 30 bp long, and the number of tandem repeat units varied from 1 to 9 (electronic supplementary material, table S1).
We selected an Rlec with the YRSKDD motif (Rlec-YRSKDD) as an example to describe the characteristics of Rlec gene sequences in detail. The full-length cDNA of Rlec-YRSKDD that contained nine tandem repeats was 1408 bp, including an 894 bp open reading frame that encoded a 297-amino acid protein, a 47 bp 5′ untranslated region (UTR) and a 467 bp 3′ UTR with a canonical polyadenylation signal site (AATAAA) (electronic supplementary material, figure S1a). The repeat region of Rlec-YRSKDD was 279 bp long, and the array of the nine tandem repeats was 33-30-30-30-30-30-33-30-33. The multiple alignments of the nine tandem repeats indicated that they were highly conserved (electronic supplementary material, figure S1b). Rlec-YRSKDD protein had a putative signal peptide containing 30 amino acid residues, 2 internal repeats (amino acids 59-96 and 109-148) and one CRD (amino acids 159-283). EPN (Glu248-Pro249-Asn250) was found in the CRD of Rlec-YRSKDD. In addition, the theoretical isoelectric point and the molecular weight of Rlec-YRSKDD were 5.10 and 32.24 kDa, respectively.
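The lengths quoted above are mutually consistent, which can be checked with a short Python sketch. All numbers are taken from the text; the variable names are illustrative only, and the calculation assumes that the 894 bp open reading frame includes the stop codon.

```python
# Repeat array of Rlec-YRSKDD with nine tandem repeats (unit lengths in bp).
repeat_array = [33, 30, 30, 30, 30, 30, 33, 30, 33]
repeat_region_bp = sum(repeat_array)

# cDNA components in bp: 5' UTR, open reading frame, 3' UTR.
utr5_bp, orf_bp, utr3_bp = 47, 894, 467
cdna_bp = utr5_bp + orf_bp + utr3_bp

# Assumption: the ORF length includes the stop codon, so one codon encodes no residue.
residues = orf_bp // 3 - 1

print(repeat_region_bp)       # 279
print(cdna_bp)                # 1408
print(residues)               # 297
print(repeat_region_bp // 3)  # 93 residues encoded by the repeat region
```

Every repeat unit is a multiple of 3 bp, so the repeat region adds whole codons to the open reading frame.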
The multiple alignments of the 15 different types of Rlecs by using amino acid sequences before and after the repeat region indicated that the main sequential variance came from the fifth exon (figure 1b). The YRSKDD-type was used as an example to study the differences of the same type of Rlecs. Multiple alignments showed that the amino acid sequences of the same type of Rlecs were highly conserved (figure 1c). The genomic sequences of Rlec isoforms varied, but their genome structures were probably similar to each other. All Rlec isoforms contained six exons interrupted by five introns. The repeat region was located in the second exon. The length and nucleotide sequences of introns in the Rlec isoforms also varied. The lengths of the first, third and sixth exons of the Rlec isoforms were not changed and were consistent with each isoform. Here, the Rlec-CNDSGD was used as an example.

DNA slip mismatch and alternative splicing modes of Rlecs
The expansion or contraction of nucleotides at tandem repeat regions may occur during DNA replication and is called replication slippage or slipped-strand mispairing [17]. We analysed the arrangement patterns of the tandem repeats of all 76 Rlecs to find examples of slipped-strand mispairing that could explain how the different numbers of repeat units were formed. Five possible examples of DNA slip mismatch were found. As shown in figure 3, type 1 can be described as follows. The first repeat unit and the last three repeat units were identical, and the excision of one or more 30 bp repeat units from the tandem repeat regions was found. This example can be found in the Rlecs containing the YRSKDD, YTYKED, YYYKED, YNYFDD or FQSKDG motif. In type 2, the first two repeat units and the last repeat unit were not changed, and one or more 33 bp repeat units were excised. This example can be found in the Rlecs containing the FHFKGD, YRSKDD or FHYKGD motif. In type 3, the first and the last repeat units were identical, and five 30 bp repeat units were missing in Rlec-CNDSGD. Type 4 showed that the first repeat unit and the last two repeat units were the same, and several 30 bp repeat units were lost in FHFKGD-, YTYKED- and FHYKGD-type Rlecs. In type 5, all repeat units were 33 bp, and several 33 bp repeat units were excised in Rlecs containing the FHYKGD, FHFKGD or YTYKED motif.
Gene diversity can be produced by alternative splicing. In our research, the diversity of Rlecs was also produced through alternative splicing. Three alternative splicing modes, including alternative acceptor sites, alternative donor and acceptor sites, and exon skipping, were found in Rlecs. The exon-intron boundaries of the genomic sequences of the Rlecs were GT and AG at the 5′ and 3′ splice sites, respectively. An example of an alternative acceptor event is shown in figure 4a. One alternative acceptor event had two acceptor sites (canonical or exonic) and one donor site. Alternative splicing at the exonic site induced the loss of the first 11 bp in the fifth exon in one transcript, which finally produced a different gene. This splicing mode was found in Rlecs-YHYQEH with 2, 3 or 7 tandem repeats. Different donor sites (canonical or exonic) and different acceptor sites (canonical or exonic) were selected in another alternative splicing event (figure 4b). Thus, the last 79 bp in the fifth exon and the first 58 bp coding region of the sixth exon were lost in one transcript. This splicing mode was found in Rlec-YHYQEH with seven tandem repeats. Exon 5 skipping was found in Rlec-YHYQEH with seven tandem repeats (figure 4c), and exon 3 skipping existed in Rlec-FHYKGD with three tandem repeats (figure 4d).
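The splice-site rule and the deletion lengths reported above can be examined with a minimal Python sketch. The helper names and the toy intron are hypothetical. The frame calculation is plain arithmetic on the stated lengths and assumes that all deleted bases lie within the coding sequence; the text itself does not describe the downstream consequences of each event.

```python
def is_canonical_intron(intron: str) -> bool:
    """Check the GT...AG rule at the 5' (donor) and 3' (acceptor) ends of an intron."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def frame_effect(deleted_bp: int) -> str:
    """Report whether removing `deleted_bp` coding bases keeps the reading frame."""
    if deleted_bp < 0:
        raise ValueError("deleted length must be non-negative")
    return "in-frame" if deleted_bp % 3 == 0 else "frameshift"


# Deletion lengths (bp) as reported in the text for the two events with stated sizes.
events = {
    "alternative acceptor (first 11 bp of exon 5)": 11,
    "alternative donor and acceptor (79 bp of exon 5 + 58 bp of exon 6)": 79 + 58,
}

for name, deleted in events.items():
    print(f"{name}: {deleted} bp removed, {frame_effect(deleted)}")

print(is_canonical_intron("GTAAGTCTTTCAG"))  # True for this hypothetical intron
```

Neither 11 nor 137 is a multiple of three, so under the stated assumption both events would alter the downstream reading frame rather than simply delete a block of residues.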

Expression pattern analysis of the eight types of Rlecs
The 15 different types of Rlecs were difficult to distinguish through quantitative real-time polymerase chain reaction (qRT-PCR). Therefore, only eight types of Rlecs were selected to study tissue distribution and expression pattern upon different bacterial challenges. The qRT-PCR analysis results showed that all the eight types of Rlecs were widely expressed in haemocytes, heart, hepatopancreas, gills, stomach and intestine. However, the expression level of each type of Rlecs was different. Rlecs-FHYKGD, Rlecs-YNYFDD, Rlecs-YYYKED, Rlecs-YRSKDD, Rlecs-YTYKED and Rlecs-YVVSDD showed the highest expression levels in the gills, and the expression level of Rlecs-YRSKDD was higher than those of Rlecs-FHYKGD, Rlecs-YTYKED, Rlecs-YNYFDD, Rlecs-YVVSDD and Rlecs-YYYKED. Rlecs-FQSKDG and Rlecs-YHYQEH were expressed in the hepatopancreas, and the expression level of Rlecs-FQSKDG was higher than that of Rlecs-YHYQEH (figure 5a). The expression pattern of the eight types of Rlecs in the gills challenged by Staphylococcus aureus or Vibrio parahaemolyticus was further studied. The seven types of Rlecs (except Rlecs-YVVSDD) were upregulated to varying degrees at certain challenge time points upon S. aureus challenge (figure 5b). Five types of Rlecs were downregulated in the gills challenged with V. parahaemolyticus, and the expression level of Rlecs-YNYFDD did not change upon V. parahaemolyticus challenge.
Rlecs-FHYKGD was initially downregulated and then gradually increased to its highest expression level. Rlecs-YTYKED was upregulated at 2, 6, 12 and 24 h of V. parahaemolyticus challenge compared with the control (figure 5c).

Discussion
Figure legend (fragment): Prawns were injected with rRlecs-YRSKDD plus LPS or PGN. The gills were collected at 12 h post-injection, and PO enzymatic activity was measured using L-DOPA as substrate and monitored by spectrophotometry at 490 nm. Each sample was composed of three prawns. Data were analysed by ANOVA. Results are expressed as mean ± s.d. derived from three independent experiments. Significant differences are marked with different letters (a-g).
royalsocietypublishing.org/journal/rsob Open Biol. 10: 200141
Many different CTLs with diverse functions have been reported in crustaceans. However, these lectins belong to different families and share few common features. In this study, a new lectin gene family (Rlec) with different numbers of repeat units, different arrangement patterns of repeat units, variable exon 5 sequences and hypervariability was found in prawns, and these isoforms originated from gene duplications. The changes in gene copy number provide a rapid mechanism for adaptation to new or different environments by using existing functional sequences to respond to changing conditions [18]. As this process repeats, gene duplication can lead to the expansion of gene families and provide important evolutionary fodder on which selection can facilitate adaptation [19]. In this study, each genomic sequence corresponded to a specific Rlec isoform transcript. Thus, a strong evolutionary pressure has driven the diversification of Rlec genes and given rise to neo- or subfunctionalization and retention in the prawn genome. In addition, the tandem repeat regions of Rlec isoforms were composed of tandem arrangements of closely related motifs, whose number ranged from 1 to 9. Notably, the tandem repeats in the coding region were not identical, yet they did not produce frameshift mutations because each repeat unit was 33 or 30 bp long, that is, a multiple of three; such in-frame coding repeats are extremely rare. In penaeid shrimp, dinucleotide repeats (AT)n, (AC)n and (AG)n accounted for 81.40% of the total simple sequence repeats, but most repeat sequences are located in the intergenomic regions (24.63%) and the introns of protein-coding genes (22.07%) [20]. The variation in tandem repeat copy number has been reported in humans [11], but the tandem repeats in the coding region are rare compared with those in the non-coding region because of the strong selection against repeats that result in frameshift mutations [21]. Several examples of slipped-strand mispairing can explain the characteristics of Rlecs' tandem repeat sequences in the coding region.
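The frame-preserving nature of repeat-unit excision can be sketched in Python by treating a repeat region as a list of unit lengths. The nine-unit array is the one reported for Rlec-YRSKDD; the seven-unit array produced below is a hypothetical outcome used only to illustrate the type 1 pattern (first unit and last three units unchanged, 30 bp units excised), not a sequenced isoform.

```python
from typing import List


def excise_units(array: List[int], start: int, count: int) -> List[int]:
    """Remove `count` consecutive repeat units beginning at index `start`."""
    if start < 0 or count < 0 or start + count > len(array):
        raise ValueError("excision range falls outside the repeat array")
    return array[:start] + array[start + count:]


def stays_in_frame(before: List[int], after: List[int]) -> bool:
    """True if the length change between two arrays is a multiple of 3 bp."""
    return (sum(before) - sum(after)) % 3 == 0


nine_units = [33, 30, 30, 30, 30, 30, 33, 30, 33]  # array reported in the text
seven_units = excise_units(nine_units, start=1, count=2)  # hypothetical slippage product

print(seven_units)                                  # [33, 30, 30, 30, 33, 30, 33]
print(sum(nine_units) - sum(seven_units))           # 60 bp removed
print(stays_in_frame(nine_units, seven_units))      # True
print((sum(nine_units) - sum(seven_units)) // 3)    # 20 residues removed
```

Because both unit lengths are multiples of three, any combination of excised units changes the protein by whole residues and leaves the downstream CRD in frame.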
In addition to gene duplications, alternative splicing is another pathway that contributes to the diversity of Rlecs at the post-transcriptional level. Most eukaryotic protein-coding genes are interrupted by introns, which must be precisely removed from the precursor transcripts to produce functional mRNAs [22]. RNA splicing has been considered as the main driver of transcriptional diversity, and a large number of regulatory mechanisms have been described. Our results exhibited the three alternative splicing modes of Rlec genes. Different mRNA splicing isoforms were produced from one mRNA precursor of Rlecs by various splicing methods (different splicing site combinations were selected). The selection of 5′ (3′) splice sites in Rlec genes can increase the diversity of Rlec genes. Despite the differences in sequence characteristics, most Rlec genes had similar structures, wherein six exons were interrupted by five introns, and some Rlec isoforms without exon 3 or 5 can be produced by exon skipping. Substantial gene expansion combined with alternative splicing can induce hypervariable Rlec isoforms, which may prompt the functional diversity or enhancement of the Rlec gene family in prawns.
The pathogen-associated molecular pattern-PRR interaction activates the immune cascade, which leads to changes in the expression of effector genes. The primary effector systems are the production of antimicrobial peptides (AMPs) and phenoloxidase (PO)-dependent melanization. AMPs are peptides found in almost all living organisms. AMPs play crucial roles in innate immunity by killing invading microorganisms and regulating other immune or inflammatory responses [23]. The prophenoloxidase (proPO) activation system is considered a vital innate defence mechanism in the immune system of crustaceans. Inactive proPO is transformed into active PO through a protease cascade reaction, after which the active PO catalyses the melanin biosynthesis pathway by hydroxylation. This catalysis leads to the oxidation of o-diphenols to o-quinones and then non-specifically cross-links all molecules to form stable melanin, which directly kills and clears invading microbes [24].
Our results showed that LPS and PGN could not upregulate the expression levels of AMPs or PO activity in prawns in which Rlecs-YRSKDD (the main type of Rlecs) was knocked down. These results indicated that Rlecs-YRSKDD played a vital role in the regulation of AMP expression and proPO system activation. Furthermore, the variation in the number of tandem repeats influenced the immune function of Rlecs-YRSKDD. The immune functions of rRlecs-YRSKDD, including sugar binding, microbial inhibition and clearance, regulation of AMP gene expression and PO activation, were enhanced as the number of tandem repeats increased. Tandemly repeated DNA sequences are highly dynamic genome components, which can exert a regulatory influence on protein function. The variation in the number of tandem repeats in genes provides the functional diversity of cell surface antigens that allows rapid adaptation to the environment and/or escape from the host immune system in fungi and other pathogens [3]. Studies on the evolution of such repeated sequences have focused on sequences found in the non-coding and the coding regions of the genome. A 40 bp variable number of tandem repeat (VNTR) polymorphism in the non-coding region of the dopamine transporter (DAT) gene does not alter the amino acid sequence of the protein. However, the function of DAT is altered, and the VNTR is identified as a functional polymorphism [25]. In prokaryotes, the collagen-like protein (BclA) is the first identified Bacillus anthracis protein that contains an internal collagen-like region of GXX repeats, which is associated with phenotypic variation [26]. In vertebrates, VNTR polymorphism in the P-selectin glycoprotein ligand 1 gene has been associated with ischaemic cerebrovascular disease [27]. In humans, host defence proteins do not have an excess of tandem repeat variation [11]. However, such advantageous tandem repeat polymorphisms are predicted to occur in invertebrate host defence proteins [11]. In this study, we provided an example of VNTR polymorphism in the crustacean immune protein of Rlecs, and VNTR can alter the immune function of Rlecs.
The expansion of genes promotes functional diversification and affords organisms with great phenotypic flexibility [19]. Lectins are important PRRs that can recognize and bind bacteria. In general, invertebrates only rely on innate immunity to combat diverse pathogens, and innate immunity is generally considered non-specific. However, innate immune PRRs often have high specificity for target molecules. Adaptive immunity with specificity exists only in vertebrates [28]. However, the innate defence systems of invertebrates also exhibit some specific immune responses against specific pathogens [29,30]. In our study, the Rlecs in prawns underwent dramatic gene expansions. Different types of Rlecs varied in tissue distribution, expression level and response to bacterial challenge. Different types of Rlecs produced various responses towards a certain bacterium. Therefore, the expansion of the Rlecs gene family due to gene duplications may be used by prawns to recognize different pathogens.
For the bacterial challenge experiments, the prawns were randomly divided into two groups, and each group contained 30 individuals. In two experimental groups, approximately 50 µl S. aureus (3 × 10⁶ cells) or V. parahaemolyticus (3 × 10⁶ cells) were injected into the abdominal segment of M. nipponense by using a 1 ml sterile syringe. The prawns in the control group were injected with the same volume of phosphate-buffered saline (PBS; 0.14 M NaCl, 3 mM KCl, 8 mM Na₂HPO₄ and 1.5 mM KH₂PO₄; pH 7.4). After treatment, the prawns were returned to the culture water tanks. Gills were randomly collected from five prawns from the experimental and control groups at 2, 6, 12 and 24 h post-injection. The heart, hepatopancreatic, gill, stomach and intestinal tissues and haemocytes were also collected from healthy prawns for RNA extraction. The haemolymph was extracted from the ventral sinus by using a 1 ml syringe containing one-third volume of anticoagulant buffer (acid citrate dextrose B (ACD-B): 1.47 g glucose, 0.48 g citric acid, 1.32 g trisodium citrate; made up to 100 ml volume by using double distilled water; pH 7.3) and then immediately centrifuged at 2000 r.p.m. for 10 min at 4°C for the isolation of haemocytes.
Total RNA was extracted from the tissues by using the RNApure high-purity total RNA rapid extraction kit (Spin-column, BioTeke, Beijing, China) in accordance with the manufacturer's protocols. The RNA quality was assessed using electrophoresis on a 1.0% agarose gel. The first-strand cDNA of the samples for qRT-PCR was obtained through the PrimeScript 1st strand cDNA synthesis kit (Takara, Japan) with the Oligo(dT) primer. First-strand cDNA was synthesized using the 5′-CDS primer A and the SMARTer IIA oligo for 5′ fragment cloning and the 3′-CDS primer A for 3′ fragment cloning by using the SMARTer™ rapid amplification of cDNA ends (RACE) cDNA amplification kit (Clontech, Mountain View, CA, USA). All procedures were performed in accordance with the manufacturer's protocols.

Amplification of the intermediate sequences of 76 Rlecs isoforms and cloning of the full-length cDNA of Rlec-YRSKDD with nine tandem repeats
The partial sequence of a lectin gene with tandem repeats was found in the transcriptome database of M. nipponense. Two primers (Rlecs-RT-F and Rlecs-RT-R, table 1) were designed on the basis of the partial sequences to amplify the middle fragments by using gills, stomach and intestine cDNA as templates. More than one band was amplified, and the mixed fragments were gel purified and cloned into the pEasy-T3 vector (TransGen Biotech, China). First, we selected 10 positive clones to sequence. Sequence analysis showed that these lectin genes had sequence diversity. Thus, more positive clones were selected for sequencing. Finally, 76 different Rlec genes were identified after removing the redundant sequences, and these genes were used to construct the phylogenetic tree.
The obtained Rlec genes showed some differences, but their sequences had high similarity. Thus, designing primers to amplify the full length of a unique lectin gene was difficult. Rlecs-F and Rlecs-R were designed in accordance with the conserved region of the 76 Rlec genes. 5′ and 3′ RACE were performed using Rlecs-R and Rlecs-F, respectively (table 1). 5′- and 3′-RACE-ready cDNA were synthesized using the Clontech SMARTer™ RACE cDNA amplification kit (Takara). The PCR volume was 50 µl (2.5 µl of 5′-RACE-ready cDNA or 3′-RACE-ready cDNA, 5 µl of 10× Advantage 2 PCR buffer, 1 µl of 10 mM dNTPs, 1 µl of 10 mM Rlecs-R or Rlecs-F, 5 µl of Universal Primer A mix, 34.5 µl of PCR-grade water and 1 µl of 50× Advantage 2 polymerase mix). The PCR conditions were set as follows: 5 cycles of 94°C for 30 s and 72°C for 3 min, 5 cycles of 94°C for 30 s and 70°C for 30 s, and 25 cycles of 94°C for 30 s, 68°C for 30 s and 72°C for 3 min. The 5′- and the 3′-RACE fragments were then cloned into the pEasy-T3 vector (TransGen Biotech, People's Republic of China). A number of positive clones were selected for sequencing. Numerous 5′ and 3′ cDNA sequences were obtained because this lectin gene family had gene diversity. Only 5′- and 3′-end sequences whose overlapping regions were identical could be assembled into one gene. We successfully obtained the full-length cDNA of Rlec-YRSKDD with nine tandem repeats.
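The component volumes listed for the RACE PCR can be checked, and scaled to several reactions, with a short Python sketch. The volumes are those given in the text; the `master_mix` helper and its `extra` parameter (spare reactions to cover pipetting loss) are hypothetical conveniences, not part of the described protocol.

```python
# Component volumes per 50 µl reaction, in µl, as listed in the text.
REACTION_UL = {
    "RACE-ready cDNA": 2.5,
    "10x Advantage 2 PCR buffer": 5.0,
    "dNTPs": 1.0,
    "gene-specific primer (Rlecs-R or Rlecs-F)": 1.0,
    "Universal Primer A mix": 5.0,
    "PCR-grade water": 34.5,
    "50x Advantage 2 polymerase mix": 1.0,
}


def master_mix(n_reactions: int, extra: int = 1) -> dict:
    """Scale every component except the template for n reactions plus spare ones."""
    if n_reactions < 1 or extra < 0:
        raise ValueError("need at least one reaction and a non-negative extra")
    factor = n_reactions + extra
    return {
        name: volume * factor
        for name, volume in REACTION_UL.items()
        if name != "RACE-ready cDNA"  # template is added to each tube separately
    }


total_ul = sum(REACTION_UL.values())
print(total_ul)                                                     # 50.0
print(total_ul / REACTION_UL["10x Advantage 2 PCR buffer"])         # 10.0, so 10x buffer ends at 1x
print(total_ul / REACTION_UL["50x Advantage 2 polymerase mix"])     # 50.0, so 50x mix ends at 1x
print(master_mix(4)["PCR-grade water"])                             # 172.5
```

The sum confirms the stated 50 µl reaction volume, and the dilution factors match the 10× and 50× stock concentrations named in the text.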

Bioinformatics analysis of Rlec gene family
The gene translation and prediction of the deduced protein were conducted using ExPASy (https://web.expasy.org/translate/). Multiple sequence alignments were generated using the ClustalW2 program (http://www.ebi.ac.uk/tools/clustalw2). MEGA 7.0 was used to produce phylogenetic trees, and the neighbour-joining method was used for phylogenetic analysis [31].
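As an illustration of the translation step, the following minimal Python sketch translates a coding sequence with the standard genetic code. It is not the ExPASy implementation; the function name and the toy sequence are hypothetical, and only a single forward frame is handled.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first base until the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_cds = "ATGGAACCGAACTAA"  # hypothetical sequence encoding Met-Glu-Pro-Asn
print(translate(toy_cds))  # MEPN
```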

Amplification of Rlec DNA sequences
A total of 76 Rlec genes that belonged to a lectin gene family were identified. Whether each Rlec isoform had a corresponding DNA sequence in the genome was not known. Therefore, the genomic sequences of Rlecs needed to be amplified. Direct amplification by using genomic DNA as template and genome walking were used to obtain the DNA of Rlecs. Genome walking was performed using the primers (Rlecs-walk-F1 and Rlecs-walk-F2) in table 1. The Universal GenomeWalker kit (Clontech, USA) was used to amplify the genomic sequences of Rlecs. Experiments were performed in accordance with the protocols of the kit. The Rlecs DNA sequences with variable numbers of tandem repeats were obtained after analysing the sequencing results, which indicated that each Rlec transcript had its corresponding DNA sequence. Two primers (Rlecs-gF and Rlecs-gR) were designed to amplify the Rlecs DNA sequences directly. Multiple Rlecs DNA sequences corresponding to different types of Rlecs were identified. This finding also confirmed that each Rlec transcript had a corresponding DNA sequence. However, the two sets of amplified Rlec DNA sequences did not match because of the diversity of the Rlec genes. The results of genome walking and direct PCR amplification showed that the Rlecs DNA had six exons interrupted by five introns. Next, we selected Rlecs-CNDSGD to amplify its DNA sequence. Two pairs of primers (Rlecs-CNDSGD-gF1, Rlecs-CNDSGD-gR1; Rlecs-CNDSGD-gF2, Rlecs-CNDSGD-gR2) were designed for the direct PCR