Enhanced antimicrobial peptide-induced activity in the mollusc Toll-2 family through evolution via tandem Toll/interleukin-1 receptor

Toll receptors play an important role in the innate immunity of invertebrates. All previously reported Tolls have only one Toll/interleukin-1 receptor (TIR) domain at the C-terminus. In this study, numerous Tolls with tandem TIRs at the C-terminus were found in molluscs. Such Tolls have an extra TIR (TIR-1) compared with Toll-I. Thus, Toll-I might be the ancestor of the tandem TIR-containing Tolls. To test this hypothesis, 83 Toll-I and Toll-2 genes (most Toll-2s have two TIRs, but others seem to be evolutionary intermediates) from 29 shellfish species were identified. These Tolls were divided into nine groups based on phylogenetic analyses. A strong correlation between phylogeny and motif composition was found. All Toll proteins contained the TIR-2 domain, whereas the TIR-1 domain existed only in some Toll-2 proteins, suggesting that TIR-1 domain insertion may play an important role in Toll protein evolution. Further analyses of functional divergence and adaptive evolution showed that some of the critical sites responsible for functional divergence may have been under positive selection. In addition, intragenic recombination played an important role in the evolution of the Toll-I and Toll-2 genes. To investigate the functional difference between Toll-I and Toll-2, Hcu_Toll-I or Hcu_Toll-2-2 was overexpressed in Drosophila S2 cells. Results showed that Hcu_Toll-2-2 had stronger antimicrobial peptide (AMP)-induced activity than Hcu_Toll-I. Therefore, enhanced AMP-induced activity resulted from tandem TIRs in the Toll-2s of molluscs during evolutionary history.

© 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Introduction
Invertebrates account for 95% of all animal species in the world [1]. Like vertebrates, they suffer from pathogens, such as microbes and viruses. To survive, invertebrates have developed the ability to resist pathogens during evolution. Invertebrates may lack the adaptive immune system found in vertebrates, but they possess a powerful and efficient innate immune system [2].
Among the signalling pathways of the invertebrate immune system, the Toll signalling pathway is the best characterized. The Toll gene was originally identified through its function in specifying the dorsal-ventral polarity of the Drosophila embryo [3]. Tolls are ubiquitous in embryos, but they are activated only by spatially restricted cleavage of Spätzle (Spz) [4]. A previous study reported that the Drosophila Toll pathway shows remarkable similarity to the mammalian interleukin-1 pathway, which activates NF-κB, a protein responsible for inflammatory and immune responses [5]. Toll (Toll-1) has been proven to play a role in the defence of Drosophila against fungi and Gram-positive bacteria, independent of its morphogenetic functions [4,6]. The discovery of an immune function for Toll in flies led to the identification of vertebrate Toll-like receptors (TLRs) [7]. TLRs have been identified in animals ranging from cnidarians to mammals.
Pathogen-associated molecular patterns (PAMPs) are small molecular motifs conserved within groups of pathogens. Upon pathogen invasion, they are recognized by TLRs and other pattern recognition receptors (PRRs) and initiate an immune response. In Drosophila, a signalling cascade is triggered once PRRs recognize PAMPs. This cascade involves Spz, myeloid differentiation factor 88 (MyD88), tube, pelle, dorsal, inhibitor of NF-κB kinases (IKKs) and cactus. It activates NF-κB, which enhances the expression of immune factor genes, such as AMPs, lysozyme and anti-lipopolysaccharide factors (ALFs), for resistance to infection. The Drosophila genome encodes nine Toll proteins, but only Toll-1 has a role in immunity. Some vertebrate species have more than one TLR involved in immune responses; for example, Homo sapiens TLR1, TLR2, TLR3, TLR4, TLR5, TLR6, TLR7 and TLR9 all recognize different PAMPs and activate distinct immune responses.
TLRs are now known to play roles in infection prevention and other immune-related processes. A previous study found that dual antigen-specific B cell receptor (BCR) and TLR engagement can fine-tune functional B cell responses, directly linking cell-intrinsic innate and adaptive immune programmes [8]. TLR activation upregulates pro-tumourigenic pathways, including the induction of inducible nitric oxide synthase (iNOS2) and cyclooxygenase (COX) 2, to promote a feed-forward loop leading to tumour progression and the development of more aggressive tumour phenotypes [9]. Furthermore, TLR4 is expressed in intestinal stem cells and regulates their proliferation and apoptosis via the p53 upregulated modulator of apoptosis [10]. Notably, an increasing number of studies have indicated that TLRs are involved in autoimmunity; for example, TLR4 in T cells promotes autoimmune inflammation, and TLR7, TLR8 and TLR9 play a key role in promoting the production of autoantibodies reactive with DNA- or RNA-associated autoantigens [11,12]. The functions of TLRs in invertebrates other than Drosophila have also been investigated in recent years, but most of these studies focused on their roles in defence against pathogens [13,14].
The functional diversity of TLRs is reflected in their structures. Toll receptors are transmembrane proteins with extracellular leucine-rich repeat (LRR) motifs and an intracellular Toll/interleukin-1 receptor (TIR) domain [15]. The ectodomain of TLRs contains LRR motifs, which are defined by a consensus sequence of 24-29 amino acids in length, and the number of LRRs in different TLRs varies from 19 to 26 [16]. Such tandem arrays of LRRs have been found in proteins involved in ribosome and DNA binding, signal transduction, enzyme inhibition and cell adhesion [17]. The intracellular domain of TLRs is similar in sequence to that of the mammalian interleukin-1 receptors and is therefore called the TIR domain. This domain is composed of about 150 amino acid residues. Recognition of PAMPs and cytokines by TLRs stabilizes a dimeric form of the receptor through the TIR domains and provides a scaffold for the recruitment of cytosolic adaptor proteins [18]. To date, most TLRs that have been investigated contain only one TIR domain.
In our previous study, a Toll receptor (Hcu_Toll-2-1) with tandem TIR domains at the C-terminus was found and studied in Hyriopsis cumingii [19]. A single TIR-containing Toll (Hcu_Toll-I) similar to Toll-I from Mytilus galloprovincialis (AFU48617.1) was also identified in H. cumingii. Hcu_Toll-2-1 and Hcu_Toll-I have similar domain structures, but an extra TIR (TIR-1) is found in Hcu_Toll-2-1. Thus, Hcu_Toll-I may be the ancestor gene of Hcu_Toll-2-1. To test this hypothesis, transcriptome sequencing was performed in 29 mollusc species to obtain more sequences similar to the Toll-I or Hcu_Toll-2-1 genes.
A total of 83 Toll-I and Toll-2 genes from 29 aquatic molluscs were identified. Phylogenetic analyses revealed that these Tolls could be divided into the Toll-I and Toll-2 families. Most members of Group VII in the Toll-2 family contained two TIR domains. Functional divergence analyses also indicated that the Toll genes diverged functionally from each other, resulting in different evolutionary rates. Overexpression of Hcu_Toll-2-2 or Hcu_Toll-I in Drosophila S2 cells showed that the tandem TIR-containing Toll (Hcu_Toll-2-2) had stronger antimicrobial peptide (AMP)-induced activity than the single TIR Toll (Hcu_Toll-I). Therefore, tandem TIR Toll genes have been preserved in mollusc species during evolution, possibly because of their enhanced function.

Animals
In this study, 29 mollusc species (the species tree is shown in the electronic supplementary material, S1) were selected to investigate the evolution of mollusc Toll receptors. Three freshwater mussels, namely H. cumingii, Sinanodonta woodiana and Cristaria plicata, were purchased from Wuhu City, Anhui Province, China. The other 26 species, all seawater molluscs, were purchased from aquatic markets in Nanjing and Hangzhou. Among these 26 species, Peronidia zyonoensis, Qicaibei, Wenbei and Hongbei were purchased from Hangzhou City, Zhejiang Province, China. The remaining species were purchased from Nanjing, Jiangsu Province, China. Among the 29 species, only six (i.e. Haliotis rubra, Neptunea cumingi, Cymbium melo, Babylonia areolata, Rapana bezona and Turritella terebra) belong to Gastropoda. The other species belong to Bivalvia. Detailed information on the species in this study can be found in the electronic supplementary material, S2. Because no Latin names could be obtained for Qicaibei, Wenbei, Hongbei and Jinqianbei, their Chinese Pinyin names were used in this study.

Transcriptome sequencing of mollusc species and cDNA cloning
The transcriptomes of the mollusc species were sequenced using Illumina HiSeq™ 2000. The raw reads were de novo assembled using the Trinity program. The transcriptome assemblies were searched for sequences similar to Toll-I or Hcu_Toll-2-1 using the tBLASTn algorithm. BLAST results showed that 83 Toll-I and Toll-2 genes were found. Some of them possessed complete coding regions, whereas others had incomplete coding regions and contained only the 3′ or 5′ ends. To obtain the coding regions of these sequences, 3′ or 5′ RACE was performed using the SMARTer RACE 5′/3′ Kit (Clontech), following the manufacturer's manual. Detailed methods can be found in our previously published paper [19]. The primers used for RACE are shown in the electronic supplementary material, S3.
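The screening step above can be illustrated with a minimal sketch that filters tBLASTn hits saved in BLAST tabular format (-outfmt 6, whose 12 default columns end with e-value and bit score). The e-value cut-off of 1e-5, the helper names and the contig identifiers are assumptions made for illustration only; the source does not state the thresholds that were actually used.

```python
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str      # protein used as the tBLASTn query
    subject: str    # assembled transcript that was hit
    identity: float
    evalue: float
    bitscore: float


def parse_tabular_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> List[Hit]:
    """Parse BLAST tabular rows and keep hits with e-value <= max_evalue.

    The default cut-off is an assumption, not a value taken from the paper.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment rows
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        hit = Hit(fields[0], fields[1], float(fields[2]),
                  float(fields[10]), float(fields[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return hits


def best_hit_per_subject(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for every transcript, the query that gives the highest bit score."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.subject not in best or hit.bitscore > best[hit.subject].bitscore:
            best[hit.subject] = hit
    return best


# Hypothetical rows; "contig_1" and "contig_2" are made-up transcript names.
example_rows = [
    "Hcu_Toll-2-1\tcontig_1\t62.5\t400\t150\t3\t1\t400\t30\t1229\t1e-120\t350",
    "Hcu_Toll-2-1\tcontig_2\t30.0\t50\t35\t0\t1\t50\t10\t159\t0.5\t25",
    "Toll-I\tcontig_1\t45.0\t380\t200\t5\t1\t380\t40\t1179\t1e-60\t210",
]
example_hits = parse_tabular_hits(example_rows)
example_best = best_hit_per_subject(example_hits)
```

Transcripts that pass such a filter would still need manual inspection of their coding regions, as described above, before any RACE extension.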

Phylogenetic analyses of the Toll-I and Toll-2 protein families
Multiple sequence alignments of the TIR-2 domain sequences were performed using MUSCLE 3.52, followed by manual comparisons and refinement [20]. Phylogenetic analyses of the Toll protein family, based on TIR-2 domain sequences, were performed with the neighbour-joining (NJ) method using MEGA 6 [21]. We also used TLR1 protein from Mus musculus as an outgroup. Bootstrap support values were estimated using 1000 pseudo-replicates.
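The bootstrap procedure mentioned above resamples alignment columns with replacement to build each pseudo-replicate. The sketch below shows only that resampling idea together with a simple p-distance; it is not the NJ implementation in MEGA 6, and the function names, the toy sequences and the "closest pair" support measure are illustrative assumptions.

```python
import random
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing positions, ignoring columns with a gap in either sequence."""
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement to build one pseudo-replicate."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def closest_pair_support(alignment: Dict[str, str], pair: Tuple[str, str],
                         replicates: int = 1000, seed: int = 0) -> float:
    """Fraction of pseudo-replicates in which `pair` is the closest pair of sequences."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    names = sorted(alignment)
    wins = 0
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        closest = min(combinations(names, 2),
                      key=lambda p: p_distance(rep[p[0]], rep[p[1]]))
        if closest == target:
            wins += 1
    return wins / replicates


# Toy alignment (hypothetical sequences, not real TIR-2 domains).
toy = {"A": "MKLVMKLV", "B": "MKLVMKLV", "C": "GGGGGGGG"}
support_ab = closest_pair_support(toy, ("A", "B"))
```

In a real analysis each pseudo-replicate would be used to rebuild the whole tree, and the support of a clade would be the fraction of replicate trees that contain it.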

Functional divergence analyses
Some residues are highly conserved during evolution, whereas others are highly variable. We used DIVERGE (v. 2.0) [22,23] to analyse the type I functional divergence between different groups of Toll receptor proteins. The functional divergence between two groups was measured as the coefficient of functional divergence (θ). A coefficient equal to 0 indicates that the evolutionary rates of the duplicate genes are entirely consistent at each site. When the coefficient is greater than 0, the evolutionary rates of the duplicate genes differ at some critical amino acid residues. The software predicts the sites responsible for functional divergence.
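The intuition behind type I functional divergence is a site that is conserved in one group but variable in the other. The following sketch flags such sites by a plain conservation contrast. It is a crude proxy written for illustration, not the likelihood-based estimate of θ or the posterior site prediction performed by DIVERGE, and the function names and toy sequences are assumptions.

```python
from typing import List


def column_is_conserved(column: str) -> bool:
    """A column is conserved when all non-gap residues are identical."""
    residues = {c for c in column if c != "-"}
    return len(residues) == 1


def candidate_type_one_sites(group_a: List[str], group_b: List[str]) -> List[int]:
    """Return 1-based alignment positions conserved in exactly one of the two groups."""
    length = len(group_a[0])
    sites = []
    for i in range(length):
        col_a = "".join(seq[i] for seq in group_a)
        col_b = "".join(seq[i] for seq in group_b)
        if column_is_conserved(col_a) != column_is_conserved(col_b):
            sites.append(i + 1)
    return sites


# Toy aligned sequences (hypothetical, not taken from the Toll alignment).
group_one = ["MKLV", "MKLV", "MKLV"]
group_two = ["MRLV", "MQLV", "MKLI"]
flagged = candidate_type_one_sites(group_one, group_two)
```

For the toy input, positions 2 and 4 are invariant in the first group and variable in the second, so they are the ones flagged.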

Site-specific selection assessment and testing
We used the Selecton Server (http://selecton.tau.ac.il/) [24] to calculate site-specific purifying and positive selection. In this study, Ka/Ks values were used to estimate the two types of substitution events by calculating the synonymous rate (Ks) and the non-synonymous rate (Ka) at each codon. Three evolutionary models (M8 (ωs ≥ 1), M7 (beta) and M5 (gamma)) were used to describe, in probabilistic terms, how the characters evolve. Each model makes different biological assumptions, and the model that best fitted the data was selected. All of these models assume a statistical distribution to account for heterogeneous Ka/Ks values among sites. The distributions were approximated using eight discrete categories, and the Ka/Ks values were computed by calculating the expectation of a posterior distribution [24].
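To make the distinction between synonymous and non-synonymous changes concrete, the sketch below compares two aligned coding sequences codon by codon using the standard genetic code. It returns raw counts of differing codons only. A Ka/Ks estimate additionally requires normalization by the numbers of synonymous and non-synonymous sites and a correction for multiple substitutions, and Selecton uses Bayesian codon models rather than this counting scheme. The function name and the example sequences are illustrative assumptions.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_codon_changes(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Count differing codons that keep (synonymous) or change (non-synonymous) the amino acid."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, in frame and of equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip codons with gaps or ambiguous bases
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical 3-codon example: GCT->GCC keeps Ala, AAA->AGA changes Lys to Arg.
changes = count_codon_changes("ATGGCTAAA", "ATGGCCAGA")
```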

Detection of recombination events
Coding sequences (CDSs) in the different Toll groups were first aligned. The recombination detection program RDP v. 3.44 [25] was used to explore potential recombination events between divergent nucleotide sequences. This software implements several methods for detecting recombination signals. In this study, three methods (RDP [26], Geneconv [27] and MaxChi [28]) were used to detect signals. The highest acceptable P cut-off value was set to 0.05. Significance was evaluated with 100 permutation tests.
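The permutation-based significance test can be sketched as follows. The statistic is a simplified maximum chi-square over all candidate breakpoints, computed from a 0/1 vector marking the sites at which two aligned sequences differ. It is loosely inspired by the maximum chi-square idea and is not the implementation in RDP v. 3.44; the input encoding, function names and seed are assumptions.

```python
import random
from typing import List, Sequence


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic of the 2x2 table [[a, b], [c, d]]; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def max_chi_square(mismatches: Sequence[int]) -> float:
    """Largest chi-square over all breakpoints (left versus right mismatch counts)."""
    total = sum(mismatches)
    n = len(mismatches)
    best = 0.0
    left = 0
    for k in range(1, n):
        left += mismatches[k - 1]
        right = total - left
        best = max(best, chi_square_2x2(left, k - left, right, (n - k) - right))
    return best


def permutation_p_value(mismatches: Sequence[int], permutations: int = 100,
                        seed: int = 0) -> float:
    """Share of site-order permutations whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = max_chi_square(mismatches)
    shuffled: List[int] = list(mismatches)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if max_chi_square(shuffled) >= observed:
            extreme += 1
    return (extreme + 1) / (permutations + 1)


# Hypothetical pattern: identical first half, fully divergent second half.
mosaic = [0] * 20 + [1] * 20
p_mosaic = permutation_p_value(mosaic)
```

With 100 permutations the smallest attainable P value is 1/101, which is below the 0.05 cut-off used above.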

Dual-luciferase activity assay in S2 cells
Dual-luciferase activity assay in S2 cells was conducted according to the protocol of a previously published paper. In brief, the abovementioned recombinant plasmid or empty pAc5.1/V5-His B plasmid (0.3 µg) along with pGL-Pen4 or empty pGL3-Basic plasmid (0.2 µg) and 0.02 µg of pRL-TK plasmid (wild-type Renilla luciferase control reporter vector; Promega, USA) were co-transfected into Drosophila S2 cells. The reporter gene plasmid (pGL-Pen4) was constructed using the promoter sequence of shrimp Penaeidin-4 (PEN4). S2 cells were cultured in standard Drosophila medium (serum-free; Invitrogen, USA) containing 10% fetal bovine serum (Invitrogen) at 27°C. Cellfectin II reagent (Invitrogen) was used for DNA transfection into S2 cells. The firefly and Renilla luciferase activities were measured after 48 h of transfection with the Dual-Luciferase Reporter Assay System (Promega), according to the manufacturer's instructions. All these assays were performed from three independent
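Dual-luciferase readings are commonly summarized by dividing the firefly signal by the Renilla signal of the same well and then expressing the result relative to the empty-vector control. The sketch below shows this calculation under that assumption; the source does not describe the exact normalization used, and all readings are made-up numbers.

```python
from statistics import mean
from typing import List, Sequence


def relative_activity(firefly: Sequence[float], renilla: Sequence[float]) -> List[float]:
    """Firefly/Renilla ratio for each replicate well."""
    if len(firefly) != len(renilla):
        raise ValueError("each firefly reading needs a matching Renilla reading")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_induction(sample: Sequence[float], control: Sequence[float]) -> float:
    """Mean normalized activity of the sample relative to the control."""
    return mean(sample) / mean(control)


# Hypothetical triplicate readings (arbitrary luminescence units).
toll_ratios = relative_activity([200.0, 220.0, 180.0], [10.0, 11.0, 9.0])
empty_ratios = relative_activity([50.0, 55.0, 45.0], [10.0, 11.0, 9.0])
fold = fold_induction(toll_ratios, empty_ratios)
```

Normalizing to Renilla first corrects for well-to-well differences in transfection efficiency before the receptor constructs are compared.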

Transcriptome sequencing and identification of Toll-2 and Toll-I
A total of 83 Toll-I and Toll-2 (most have two TIRs) genes from 29 mollusc species were identified using high-throughput sequencing and RACE technology (electronic supplementary material, S1). Based on their living environment, these mollusc species could be divided into freshwater and seawater species. Among these species, only H. cumingii, S. woodiana and C. plicata are freshwater mussels. According to the classification of species, these species could be divided into Gastropoda and Bivalvia. Among these species, H. rubra, N. cumingi, C. melo, B. areolata, R. bezona and T. terebra belong to Gastropoda. In some species, only Toll-I could be found and no Toll-2 could be identified. One Toll-I in Pinna rudis, C. melo, Crassostrea gigas, Hongbei and Scapharca subcrenata, two Toll-I in Tegillarca granosa and three Toll-I in Moerella iridescens, Mytilus edulis and Mactra veneriformis were identified. In N. cumingi, only one Toll-2 was found and no Toll-I could be identified. In other species, both Toll-I and Toll-2 could be found. Based on their isoform number, they could be divided into nine different situations.

Phylogenetic and structural analyses of the Toll-I and Toll-2 proteins
To predict the evolutionary relationships of the Toll-I and Toll-2 families, we constructed a NJ tree based on alignment of the TIR-2 sequences of the Toll-I and Toll-2 proteins. The majority of the phylogenetic clades had well-supported bootstrap values. These genes were categorized into nine major groups, named Groups I to IX. Groups I, II, III, IV, V and VI belong to the Toll-I gene family, whereas Groups VII, VIII and IX belong to the Toll-2 gene family. The largest group was VII, which contained 24 members. Group VI contained only four genes.
To further confirm the phylogenetic relationships and to examine the diversity of Toll proteins, we searched for conserved motifs in Toll proteins using the MEME web server (http://meme.sdsc.edu) [29]. As shown in figure 1, five conserved motifs (motifs 1-5) were identified in these Toll proteins. All the predicted Tolls had conserved motifs 2 and 3, and the genes in the same group had similar conserved motifs, but some divergence was observed between groups. These results suggested that the Toll genes in the same group may have similar functions and that some specific motif architectures may have important effects on group-specific functions. We also noticed that most proteins in Group VII contained motif 4, which did not exist in other groups, except in one member (Mme_Toll-2-6) of Group IX, implying that this motif may be related to the specific functions of these proteins. Therefore, the motif compositions of the Toll proteins in each group provide additional support for the phylogenetic analyses. We also used Pfam [30] to identify the major domains of the Toll protein family. Interestingly, most members of Group VII possessed two TIR domains (TIR-1, corresponding to motif 4, and TIR-2, corresponding to motif 3), whereas other Tolls contained only the TIR-2 domain. Among the Toll-2 members, 14 appeared to be evolutionary intermediates between single TIR Tolls and tandem TIR Tolls. Five of these 14 Tolls were from Gastropoda, in which no complete tandem TIR Tolls could be found. BLASTP results showed that these 14 Tolls were similar to Hcu_Toll-2-1, which contains two TIRs. However, SMART analysis showed that they contained incomplete tandem TIRs. Thus, these 14 Tolls may be evolutionary intermediates.
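The comparison of motif composition with phylogenetic grouping can be expressed as a small tabulation. The sketch below counts, for each group, how many members carry each motif and lists the motifs found in one group only. The protein names and motif assignments are hypothetical placeholders, not the MEME results reported above (where motif 4 also occurs in one Group IX member).

```python
from collections import defaultdict
from typing import Dict, Set


def group_motif_profile(motifs_by_protein: Dict[str, Set[int]],
                        group_by_protein: Dict[str, str]) -> Dict[str, Dict[int, int]]:
    """For each group, count how many members carry each motif."""
    counts: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for protein, motifs in motifs_by_protein.items():
        group = group_by_protein[protein]
        for motif in motifs:
            counts[group][motif] += 1
    return {group: dict(motif_counts) for group, motif_counts in counts.items()}


def motifs_restricted_to(profile: Dict[str, Dict[int, int]], group: str) -> Set[int]:
    """Motifs present in `group` and absent from every other group."""
    own = set(profile.get(group, {}))
    others: Set[int] = set()
    for name, motif_counts in profile.items():
        if name != group:
            others |= set(motif_counts)
    return own - others


# Hypothetical proteins and motif hits for illustration only.
example_motifs = {"toll_a": {2, 3, 4}, "toll_b": {2, 3, 4}, "toll_c": {2, 3}}
example_groups = {"toll_a": "VII", "toll_b": "VII", "toll_c": "I"}
profile = group_motif_profile(example_motifs, example_groups)
specific = motifs_restricted_to(profile, "VII")
```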

Selective pressure at amino acid sites in the Toll family members
The Ka/Ks ratio measures selection pressure on amino acid substitutions. A Ka/Ks ratio greater than 1 suggests positive selection and a ratio less than 1 suggests purifying selection [36]. Amino acids in a protein are usually expected to be under different selective pressures and to possess different Ka/Ks ratios [37]. To test for the presence of positive or negative selection at individual amino acids, the Ka/Ks ratios were calculated with the Selecton Server (http://selecton.tau.ac.il) [24]. We used three evolutionary models (M8 (ωs ≥ 1), M7 (beta) and M5 (gamma)) implemented in this server to perform the tests.
The results showed that the Ka/Ks ratios of the sequences from different Toll groups were significantly different (table 2). For example, higher Ka/Ks ratios existed in Groups II, VI and IX, indicating a higher evolutionary rate or site-specific selective relaxation within members of the same group. Despite the differences in Ka/Ks values, all the estimated Ka/Ks values were substantially lower than 1, suggesting that the Toll sequences within each of the groups were under strong purifying selection pressure. The selection model M7 did not indicate the presence of positively selected sites, whereas the M5 model did in Groups II, V, VI and IX (table 2). These observations suggested that selection spurred the potential for amino acid diversity at some residues, whereas other residues evolved under purifying or neutral selection. These positively selected residues might have changed the protein structure, thereby accelerating functional divergence during long periods of evolution.

Recombination analysis within Toll genes
Recombination plays a key role in the generation of genetic diversity. To determine whether homologous recombination has shaped the evolution of the Toll genes, we analysed all Toll CDS segments to test whether some of them had undergone intragenic recombination. Recombination signals of the Toll genes were investigated with the RDP [26], Geneconv [27] and MaxChi [28] methods embedded in the program RDP v. 3.44 [25]. All nine groups were found to contain similar mosaic segments, indicating that intragenic recombination had occurred. As summarized in table 3, 57 Toll genes in these groups exhibited evidence of intragenic recombination (p < 0.05 based on 100 permutations). As an example, a recombination event between Cpl_Toll-2-5 and Cpl_Toll-2-4 detected by the RDP method is presented in figure 2. A significant recombination event occurred between the 5′ ends of Cpl_Toll-2-5 and Cpl_Toll-2-4. Our results indicated that the Toll genes underwent frequent recombination events. Therefore, intragenic recombination played an important role in the evolution of the Toll genes, as in other gene families [38,39]. Further studies are required to evaluate the influence of recombination on function and to investigate the mechanisms underlying Toll recombination.