Proceedings of the Royal Society B: Biological Sciences

Substantially adaptive potential in polyploid cyprinid fishes: evidence from biogeographic, phylogenetic and genomic studies

Xinxin Li

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China

and
Baocheng Guo

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China

University of Chinese Academy of Sciences, Beijing, People's Republic of China

Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, People's Republic of China

Published: https://doi.org/10.1098/rspb.2019.3008

Abstract

Whole genome duplication (WGD) is commonly believed to play key roles in vertebrate evolution. However, polyploidy now exists in only a few fish, amphibian and reptile groups, and seems to be an evolutionary dead end in vertebrates. We investigate the evolutionary significance of polyploidization in Cyprinidae, a fish family that contains more polyploid species than any other vertebrate group, with integrated biogeographic, phylogenetic and genomic analyses. First, polyploid species are found to be significantly more frequent than diploid species in areas of higher altitude and lower mean annual temperature in Cyprinidae. Second, a polyploidy-related diversification rate shift is observed in Cyprinidae. This increased net diversification rate is seen in only three polyploid lineages; the other polyploid lineages have net diversification rates similar to those of diploid lineages in Cyprinidae. Interestingly, significant ‘lag times’ existed between polyploidization and radiation in Cyprinidae. Multiple polyploid lineages were established approximately 15 Ma through recurrent allopolyploidization events, but the net diversification rate did not start to increase until approximately 5 Ma, long after the polyploidization events. Environmental changes associated with the continuous uplift of the Tibetan Plateau and climate change have probably promoted the initial establishment and subsequent radiation of polyploidy in Cyprinidae. Finally, unique retention of duplicated genes is found in polyploid cyprinids adapted to harsh environments. Taken together, our results suggest that polyploidy in Cyprinidae is far more than an evolutionary dead end, and instead shows substantial adaptive potential. Polyploid cyprinids thus constitute an ideal model system for unveiling largely unexplored consequences of WGD in vertebrates, from genomic evolution to species diversification.

1. Introduction

Polyploidization, often referred to as whole genome duplication (WGD), is most common in plants, but also occurs in insects, amphibians, reptiles and fishes. The evolutionary significance of polyploidization (e.g. its impact on species diversification and biological complexity, and its contribution to the generation of evolutionary novelties) is well recognized [1–5]. However, there are also disadvantages associated with polyploidy, such as detrimental effects on fertility and fitness due to genomic instability, mitotic and meiotic abnormalities, and gene expression and epigenetic changes [1]. As such, the evolutionary adaptive potential of polyploidy has been questioned. In fact, polyploidy has been suggested to be an evolutionary ‘dead end’, considering the paucity of polyploidization events that lead to the survival and establishment of lineages over long evolutionary time frames [4–6]. Nevertheless, it is believed that the adaptive potential of polyploidy could be evident even on short evolutionary time scales when analysed in connection with environmental changes [5]. First, polyploidy is expected to generate higher robustness in challenging environments. Polyploids usually have a broader range of environmental tolerance, and often occur in newly created, disrupted or harsh environments compared with diploids [2,5,7,8]. Second, polyploidy is expected to contribute to species diversification, especially in unstable environments, although this process is usually slow due to the stabilization of environmental conditions after polyploidization [2,3,5,6]. Therefore, how polyploidy affects speciation and diversification rates remains controversial, largely because it is difficult to find a polyploid lineage in the wild that simultaneously shows both of the abovementioned signs of adaptive potential (i.e. prevalence in unstable environments and increasing diversification rates) and could thus refute the evolutionary ‘dead end’ hypothesis. This is particularly relevant in vertebrates, in which polyploidy is much rarer than in plants, although WGD is believed to play a key role in vertebrate evolution [9].

Among vertebrates, fishes exhibit more polyploid species than amphibians or reptiles. Besides the well-known teleost WGD [9], polyploidization has occurred repeatedly during fish diversification. Specifically, the family Cyprinidae contains more polyploid species than any other well-known polyploid group of fishes (e.g. sturgeons and salmonids [10]). Cyprinidae, the largest family of freshwater fishes with 367 genera and 3006 nominated species [11], includes many well-known polyploid species such as common carp (Cyprinus carpio) and goldfish (Carassius auratus). In Cyprinidae, polyploid species are mostly found in the largest subfamily, Cyprininae, with over 400 polyploids among approximately 1300 species [12]. Molecular phylogenetics suggests that Cyprininae can be subdivided into 11 tribes: Probarbini, Labeonini, Torini, Smiliogastrini, Poropuntiini, Cyprinini, Acrossocheilini, Spinibarbini, Schizothoracini, Schizopygopsini and Barbini [12]. Polyploidy is prevalent in Probarbini, Torini, Barbini, Spinibarbini, Cyprinini, Schizothoracini and Schizopygopsini [12], and all known species of Cyprinini, Schizothoracini and Schizopygopsini are polyploid. Molecular phylogenetics suggests that polyploid cyprinid species have multiple origins [12,13], indicating that recurrent polyploidization accompanied Cyprinidae diversification. However, the effects of polyploidization on cyprinid evolution (i.e. environmental robustness and diversification rate) remain unknown. Specifically, it is unclear whether extant cyprinid polyploids retain adaptive potential or rather represent an evolutionary ‘dead end’.

To this end, we first compiled geographical occurrence data of Cyprinidae to explore whether polyploidization has increased their environmental robustness. Next, we estimated diversification rates in different cyprinid lineages to evaluate whether polyploidization facilitates their diversification. Finally, we used genomic data to determine if autopolyploidization or allopolyploidization occurred in major polyploid cyprinid groups, and to investigate whether unique retention of duplicated genes from polyploidization has contributed to the adaptation to harsh environments. Taken together, our results provide convincing evidence for the adaptive potential of polyploidy in Cyprinidae.

2. Materials and methods

(a) Geographical and climatic data

A list of valid Cyprinidae species (electronic supplementary material, table S1) was obtained from FishBase (http://www.fishbase.org). Ploidy information (electronic supplementary material, table S1) was retrieved from FishBase, Fish Karyome and references [12–14]. Geographical occurrence records were retrieved from the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/) with the rgbif package in R. Records for species absent from the databases (e.g. fishes in Schizothoracini, Schizopygopsini and Sinocyclocheilus) were added by literature mining [12–14]. Geographical occurrence records were filtered to exclude errors (e.g. records at sea, and records with zero–zero or other erroneous coordinates) and duplicates. To avoid the effects of invasive and/or introduced species, an endemic occurrence dataset was first assembled by extracting the earliest occurrence record for each species (table 1; electronic supplementary material, table S2). Then, to validate the observations from the endemic occurrence dataset, a dataset containing all occurrence records for each species was assembled (table 1; electronic supplementary material, table S2). Altitude and mean annual temperature values for each occurrence point were extracted from WorldClim (http://worldclim.org/) at 2.5 min resolution using the raster package in R. Geographical occurrence records of Cyprinidae species were mapped onto the worldwide terrain elevation map and mean annual temperature map, respectively. To test whether diploids and polyploids are distributed differently along environmental gradients, altitude and mean annual temperature were compared globally between diploids and polyploids in Cyprinidae. Furthermore, polyploid occurrence frequency, defined as the proportion of polyploid species among all species, was calculated globally and within various altitude and temperature ranges.
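The record-filtering and 'earliest record per species' steps can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used the rgbif and raster packages in R. The `Occurrence` record type, its field names and the example records are hypothetical, and the removal of records at sea is left out because it would need a land mask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical occurrence record (field names are assumptions)."""
    species: str
    lat: float
    lon: float
    year: int


def clean(records):
    """Drop zero-zero or out-of-range coordinates and exact duplicates."""
    seen = set()
    kept = []
    for r in records:
        if r.lat == 0.0 and r.lon == 0.0:
            continue  # zero-zero coordinates are treated as errors
        if not (-90.0 <= r.lat <= 90.0 and -180.0 <= r.lon <= 180.0):
            continue  # impossible coordinates
        if r in seen:
            continue  # exact duplicate of an earlier record
        seen.add(r)
        kept.append(r)
    return kept


def earliest_per_species(records):
    """Keep only the earliest record for each species."""
    earliest = {}
    for r in records:
        if r.species not in earliest or r.year < earliest[r.species].year:
            earliest[r.species] = r
    return earliest


# Made-up records for two placeholder species.
records = [
    Occurrence("sp_A", 30.0, 100.0, 1950),
    Occurrence("sp_A", 30.0, 100.0, 1950),  # duplicate
    Occurrence("sp_A", 45.0, 10.0, 1900),
    Occurrence("sp_B", 0.0, 0.0, 1990),     # zero-zero coordinates
    Occurrence("sp_B", 95.0, 20.0, 1980),   # latitude out of range
    Occurrence("sp_B", 25.0, 105.0, 2001),
]
cleaned = clean(records)
earliest = earliest_per_species(cleaned)
```

Filtering before selecting the earliest record matters here: otherwise an erroneous early record (such as the 1980 record with an impossible latitude) would be chosen as the species' earliest occurrence.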

Table 1. Geographical occurrence of diploids and polyploids in Cyprinidae. Note: Endemic dataset includes only the earliest geographical occurrence records for each species; large dataset includes all geographical occurrence records for each species.

| region | endemic dataset: no. diploids | endemic dataset: no. polyploids | large dataset: no. diploids | large dataset: no. polyploids |
| --- | --- | --- | --- | --- |
| Africa | 294 | 90 | 425 | 99 |
| Asia | 820 | 240 | 923 | 258 |
| Europe | 253 | 41 | 291 | 57 |
| North America | 313 | 1 | 333 | 4 |
| total | 1680 | 372 | 1995 | 393 |

(b) Phylomitogenomic analysis

To reconstruct the Cyprinidae phylogeny, 13 mitochondrial protein-coding genes were used to avoid paralogue problems in the nuclear genome. A total of 246 mitochondrial genomes were retrieved from GenBank, including 242 Cyprinidae and four outgroup species (electronic supplementary material, table S3). Sequences were aligned using MAFFT v. 7.305 [15] with default settings. The best partitioning schemes and nucleotide substitution models were determined with PartitionFinder v. 2.1.1 [16] using the corrected Akaike information criterion (electronic supplementary material, table S4). Both maximum-likelihood (ML) and Bayesian inference (BI) methods were used for inferring the Cyprinidae phylogeny from the concatenated alignment of the 13 mitochondrial protein-coding genes. A BI tree was constructed with MrBayes v. 3.2.6 [17]. Markov chain Monte Carlo analyses were run for 200 million generations and sampled once per 1000 generations, and the first 25% of samples were discarded as burn-in. Bayesian posterior probabilities were determined from the remaining samples. The ML tree was estimated using RAxML v. 8.2.11 [18] with the GTRGAMMA model. The rapid bootstrap algorithm was used with a thorough ML search and 1000 replicates to generate trees.

(c) Divergence time estimation

Divergence time in Cyprinidae was estimated with MCMCtree in PAML v. 4.9e [19], using the concatenated alignment of the 13 mitochondrial protein-coding genes and the ML topology. The following constraints were used for time calibration: (i) since the earliest fossils of Labeo-like and Barbus-like fishes were reported from the Early Miocene [20,21], a minimum age constraint of 16.0 Ma (million years ago) was assigned to the most recent common ancestor of Labeonini and of Barbini, respectively; (ii) since a fossil Labeo species was reported from the Late Miocene [21,22], a minimum age constraint of 5.33 Ma was assigned to the root of Labeonini; (iii) since the earliest definite cyprinid fossils were reported from the Eocene [23], a minimum age constraint of 33.90 Ma was assigned to the root of Cyprinidae. The MCMC run was first executed for 2000 iterations as burn-in, and then for an additional 20 000 generations with a sample frequency of 10. The MCMC run was performed twice to assess consistency between runs. The final time-calibrated tree was visualized using FigTree v. 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree).

(d) Diversification rate calculation

To investigate whether there is heterogeneity in diversification rates in each of the 11 tribes in Cyprinidae, Bayesian analysis of macroevolutionary mixtures (BAMM) v. 2.5 [24] was used for inferring diversification dynamics across the Cyprinidae phylogeny. The time-calibrated Cyprinidae phylogenetic tree was used as input. To avoid potential bias caused by incomplete taxon sampling, the sampling fraction was specified at the genus level. Priors for speciation and extinction were set empirically with the setBAMMpriors function (lambdaInitPrior = 1.0499, lambdaShiftPrior = 0.0457, muInitPrior = 1.0499). The BAMM analysis was run multiple times, each for 10 million generations with sampling once per 1000 generations. Convergence and effective sample size were examined using the coda package in R, and the first 10% of estimates were discarded as burn-in. Bayes factors were computed to compare all rate shift models that were sampled in the posterior. Rates through time and the rate shift configuration with the highest maximum a posteriori probability were summarized and visualized with BAMMtools [24].
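As a quick check on the chain settings quoted for MrBayes and BAMM, the number of posterior samples retained after burn-in can be worked out as follows. This is a minimal Python sketch; the function name is hypothetical, and it ignores any extra initial-state sample that a particular program may write.

```python
def retained_samples(generations: int, sample_every: int, burnin_percent: int) -> int:
    """Number of posterior samples left after discarding the burn-in."""
    total = generations // sample_every
    discarded = total * burnin_percent // 100  # integer arithmetic avoids float rounding
    return total - discarded


# MrBayes: 200 million generations, sampled every 1000, 25% burn-in.
mrbayes_retained = retained_samples(200_000_000, 1000, 25)
# BAMM: 10 million generations, sampled every 1000, 10% burn-in.
bamm_retained = retained_samples(10_000_000, 1000, 10)
print(mrbayes_retained, bamm_retained)  # prints: 150000 9000
```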

(e) Characterizing ohnologue evolution

To infer polyploidization events in representative polyploid cyprinid lineages, ohnologue evolution was characterized in six polyploid Cyprinidae species: C. carpio (Cyprinini), tetraploid, 2n = 100; Gymnodiptychus dybowskii (Schizopygopsini), tetraploid, 2n = 98; Schizothorax pseudoaksaiensis (Schizothoracini), ploidy and 2n unknown; Sinocyclocheilus anshuiensis (Cyprinini), tetraploid, 2n unknown; Sinocyclocheilus grahami (Cyprinini), tetraploid, 2n = 96; and Sinocyclocheilus rhinocerous (Cyprinini), tetraploid, 2n unknown. Human (Homo sapiens) and zebrafish (Danio rerio) were used as outgroups. Hereafter, 'S.' in S. anshuiensis, S. grahami and S. rhinocerous abbreviates Sinocyclocheilus.

Genomes of C. carpio [25] and three Sinocyclocheilus species [26], and transcriptomes of Gymnodiptychus dybowskii and Schizothorax pseudoaksaiensis [27], were downloaded from GenBank. Human and zebrafish genomic sequences were downloaded from Ensembl (release 76). Singleton genes with a one-to-one orthologous relationship in human and diploid teleost genomes were retrieved from [28] (electronic supplementary material, table S5). Orthologous groups among human, zebrafish and the selected polyploid species were identified with OrthoMCL v. 2.0.9 [29]. OrthoMCL was run with a BLAST E-value of 1 × 10⁻⁵, a minimum aligned sequence length coverage of 50% of the query sequence, and an inflation index of 1.5. Orthologous groups that have one sequence copy in human and zebrafish and two sequence copies in the selected polyploid species were chosen for subsequent phylogenetic analysis and paralogue divergence analysis. Multiple-sequence alignment was done with MAFFT v. 7.305 for each selected orthologous group. Ambiguous regions in the multiple-sequence alignments were removed with Gblocks v. 0.91b [30]. The number of synonymous substitutions per synonymous site (Ks) between duplicates in each polyploid species was calculated with codeml in PAML v. 4.9e [19]. Ohnologues that resulted from polyploidization in each polyploid Cyprinidae species were defined as duplicates with Ks ≤ 0.50. A synonymous substitution rate (r) of 3.51 × 10⁻⁹ substitutions per synonymous site per year was used as the mutation rate in Cyprinidae [25]. The divergence time (T) between ohnologues in each polyploid species was calculated with the equation T = Ks/(2r). Auto- or allopolyploidy was inferred by comparison of the Ks distribution of ohnologue pairs within polyploid species with mitogenomic divergence among polyploid species [31]. The topology of each selected ohnologue group with different polyploid cyprinid species was inferred using the ML method in RAxML v. 8.2.11, with 100 bootstraps and the -m PROTGAMMAAUTO option to automatically select the best-fitting amino acid substitution model. The ML tree for each orthologous group was integrated into a tree set and visualized with DensiTree v. 2.2.6 [32]. Shared or independent polyploidization events within/among representative polyploid cyprinid lineages were inferred from ohnologue genealogies according to Guo et al. [33].
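The conversion from a Ks peak to a divergence time can be sketched in a few lines of Python. The factor of two reflects that both duplicate copies accumulate substitutions after they split. The rate is the one stated in this section and the Ks peaks are those reported in the Results; the function and variable names are hypothetical.

```python
RATE_PER_SITE_PER_YEAR = 3.51e-9  # synonymous substitution rate r used in this section


def divergence_time_ma(ks: float, rate: float = RATE_PER_SITE_PER_YEAR) -> float:
    """Return T = Ks / (2r), expressed in millions of years (Ma)."""
    if ks < 0 or rate <= 0:
        raise ValueError("Ks must be non-negative and the rate must be positive")
    return ks / (2.0 * rate) / 1e6


# Ks peaks between ohnologues as reported in the Results.
KS_PEAKS = {
    "C. carpio": 0.1694,
    "Sinocyclocheilus (lowest peak)": 0.1214,
    "Sinocyclocheilus (highest peak)": 0.1419,
    "Gymnodiptychus dybowskii": 0.1369,
    "Schizothorax pseudoaksaiensis": 0.1596,
}

for name, ks in KS_PEAKS.items():
    print(f"{name}: Ks = {ks:.4f} -> {divergence_time_ma(ks):.2f} Ma")
```

With these inputs the sketch gives 24.13, 17.29, 20.21, 19.50 and 22.74 Ma, matching the divergence times quoted in the Results.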

(f) Gene ontology enrichment analysis of pseudogenes

To test whether unique retention of duplicated genes might contribute to adaptation to environmental instability in polyploid cyprinids, genomes of four polyploid Cyprinidae species (C. carpio, S. grahami, S. anshuiensis and S. rhinocerous) were explored. Of the four polyploid Cyprinidae species, two (S. anshuiensis and S. rhinocerous) are cave-dwelling species. This provides an appropriate example for testing whether adaptation to harsh environments is facilitated by unique retention of duplicated genes after polyploidization in Cyprinidae. Pseudogenes were used as an index of unique retention of duplicated genes, and gene ontology (GO) enrichment analysis of pseudogenes was performed in each of the four polyploid cyprinids. Pseudogene annotations for the four species were retrieved from their respective genome annotations. Pseudogene sequences were then obtained with BEDTools [34]. GO annotations of pseudogenes were obtained by BLAST searches against UniProt.

GO enrichment analysis of pseudogenes in each genome was done using the ‘enricher’ function with Benjamini–Hochberg correction (q < 0.05) in the clusterProfiler package in R [35]. A hypergeometric test was used to determine whether GO term enrichment was significant for pseudogenes in a given species. RichFactor is calculated as the ratio of the number of pseudogenes annotated to a GO term to the number of all genes annotated to the same GO term.
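The statistics behind this step (a one-sided hypergeometric test per GO term, followed by Benjamini–Hochberg adjustment) can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the clusterProfiler code; the function names and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for a population of M genes, n of them in the GO term, N drawn."""
    upper = min(n, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(max(k, 0), upper + 1))
    return tail / comb(M, N)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 20 annotated genes, 5 in the GO term,
# 4 pseudogenes overall, 3 of which fall in the GO term.
p_term = hypergeom_sf(k=3, M=20, n=5, N=4)
rich_factor = 3 / 5  # pseudogenes in the term / all genes in the term
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

In the toy example the enrichment p-value is 155/4845 (about 0.032), and the adjusted p-values keep the input order, so each GO term can be compared directly against the 0.05 threshold.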

3. Results

(a) Geographical occurrence of diploid and polyploid Cyprinidae

In total, geographical occurrence data were assembled for 2052 Cyprinidae species in 320 genera (table 1). This represents 68% of the nominated Cyprinidae species and 87% of genera in Cyprinidae. Of the 2052 Cyprinidae species, 1680 from 279 genera are diploids, and 372 from 41 genera are polyploids. The geographical distribution of the earliest occurrence record for each species is shown in figure 1 (left panels). As is well known, Cyprinidae is distributed in Eurasia, Africa and North America, and absent from Australia and South America. Polyploid species are commonly found in Eurasia and Africa, but less so in North America. The dataset with all occurrence records for each species in Cyprinidae suggests a geographical distribution pattern similar to that of the dataset with only the earliest occurrence record for each species (table 1).

Figure 1. Global geographical distribution of the earliest occurrence record for Cyprinidae species along altitude (left upper panel) and mean annual temperature gradient (left lower panel). Comparison of diploid and polyploid distributions in Cyprinidae along altitude (right upper panel) and mean annual temperature gradient (right lower panel; Mann–Whitney U-tests, ***p < 0.001). (Online version in colour.)

The proportion of polyploid cyprinids is approximately 18% globally in the dataset with only the earliest geographical occurrence record for each species (table 2). Polyploid species tend to be found in areas of higher altitude and lower mean annual temperature globally compared with diploids (right panels, figure 1; Mann–Whitney U-tests, p < 0.001), and this observation holds when polyploid species in Schizopygopsini and Schizothoracini, which are distributed specifically in the Qinghai–Tibetan Plateau (QTP) and its adjacent region, are excluded (right panels, figure 1; Mann–Whitney U-tests, p < 0.001). Furthermore, polyploid occurrence frequency is found to be significantly higher in high-altitude areas than its global occurrence frequency, and accordingly, it is significantly higher in areas of low mean annual temperature (table 2; and electronic supplementary material, figures S1–S4). These observations hold for both the dataset with only the earliest occurrence record and the dataset with all occurrence records for each species (table 2).

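Table 2. Proportions of polyploid cyprinids in global and in different environmental gradients. Altitude ranges are in m.a.s.l.; temperature ranges are in °C.

| dataset | measure | global | altitude <500 | altitude 500–1500 | altitude 1500–2500 | altitude >2500 | temperature <5 | temperature 5–15 | temperature 15–25 | temperature >25 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| endemic dataset | polyploids | 372 | 120 | 129 | 73 | 58 | 31 | 88 | 221 | 32 |
| endemic dataset | diploids | 1680 | 1084 | 468 | 120 | 8 | 31 | 436 | 758 | 455 |
| endemic dataset | polyploid frequency (%) | 18.13 | 9.97 | 21.61** | 37.82*** | 86.21*** | 50*** | 16.79 | 22.57** | 6.57 |
| large dataset | polyploids | 393 | 162 | 207 | 320 | 60 | 39 | 151 | 321 | 43 |
| large dataset | diploids | 1995 | 1337 | 954 | 320 | 63 | 117 | 804 | 1213 | 540 |
| large dataset | polyploid frequency (%) | 16.46 | 10.81 | 17.83 | 50*** | 48.78*** | 25* | 15.81 | 20.92* | 7.38 |

*p < 0.05; **p < 0.01; ***p < 0.001 (χ2-test).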
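The polyploid frequencies in table 2 follow directly from the species counts, as the Python sketch below shows for the temperature ranges of the endemic dataset. The χ2 part is an assumption: the text does not state which contingency table was tested, so the sketch compares one range against all other ranges combined (Pearson χ2, one degree of freedom, no continuity correction). Its p-values need not match the significance marks in the table. All function and variable names are hypothetical.

```python
from math import erfc, sqrt

# (polyploids, diploids) per temperature range, endemic dataset, from table 2.
TEMP_BINS = {"<5": (31, 31), "5-15": (88, 436), "15-25": (221, 758), ">25": (32, 455)}
GLOBAL = (372, 1680)


def frequency(polyploids: int, diploids: int) -> float:
    """Polyploid occurrence frequency in per cent."""
    return 100.0 * polyploids / (polyploids + diploids)


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square statistic and p-value (df = 1) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2.0))  # chi-square survival function for df = 1
    return stat, p_value


for label, (poly, dip) in TEMP_BINS.items():
    # Assumed design: this range versus all other ranges combined.
    rest_poly, rest_dip = GLOBAL[0] - poly, GLOBAL[1] - dip
    stat, p = chi2_2x2(poly, dip, rest_poly, rest_dip)
    print(f"{label} °C: {frequency(poly, dip):.2f}% polyploid, chi2 = {stat:.2f}, p = {p:.2e}")
```

The frequencies printed for the four ranges (50.00, 16.79, 22.57 and 6.57%) and the global value of 18.13% agree with the endemic-dataset row of table 2.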

(b) Phylogeny and divergence time in Cyprinidae

Both BI and ML analyses resulted in highly resolved, well-supported and highly compatible phylogenetic trees (electronic supplementary material, figures S5 and S6). The main clades in Cyprinidae are compatible with those from earlier studies [12,13]. The monophyly of Cyprininae is strongly supported, and Cyprininae comprises 11 tribes. Polyploid species are common in Probarbini, Torini, Cyprinini, Barbini, Spinibarbini, Schizopygopsini and Schizothoracini, and no polyploid species are found in Labeonini, Smiliogastrini, Poropuntiini and Acrossocheilini.

The origin of Cyprinidae was approximately 31.68 Ma. Cyprininae diverged from other subfamilies approximately 25.18 Ma (95% CI: 20.75–28.03 Ma), and began to diversify approximately 17.03 Ma (95% CI: 13.95–19.01 Ma). The Cyprinini radiation started approximately 12.44 Ma (95% CI: 10.18–14.02 Ma). Schizopygopsini and Schizothoracini diverged approximately 12.16 Ma (95% CI: 9.93–13.61 Ma).

(c) Diversification rate in Cyprinidae

Seven shifts in net diversification rate were detected in Cyprinidae evolution (electronic supplementary material, figure S7), and all started approximately 5 Ma (right top and middle panels, figure 2). The net diversification rate of the family Cyprinidae fluctuated along with the diversification of the subfamily Cyprininae, whereas net diversification rates of other subfamilies showed a steady decline across time (right top panel, figure 2). Before the diversification of the subfamily Cyprininae, the net diversification rate of Cyprinidae showed a steady decline similar to that of other subfamilies. With the beginning of the Cyprininae diversification, the net diversification rate of Cyprinidae remained stable until 5 Ma, and since then it has increased in parallel with that of the subfamily Cyprininae. Within Cyprininae, net diversification rates of eight tribes (Probarbini, Labeonini, Smiliogastrini, Poropuntiini, Cyprinini, Acrossocheilini, Spinibarbini and Barbini) showed a steady decline across time. In the tribes Torini, Schizopygopsini and Schizothoracini, the net diversification rate decreased in the same way as in the other eight tribes until 5 Ma, and has since increased from 0.27 to 0.36 in Torini, from 0.27 to 0.47 in Schizopygopsini, and from 0.27 to 0.82 in Schizothoracini (right middle panel, figure 2).

Figure 2. (a) Time-calibrated phylogenetic tree of Cyprinidae. Blue tip labels are diploids, and red tip labels are polyploids. Plio., Pliocene; Plei., Pleistocene. (b,c) Net diversification rate through time in Cyprinidae. Solid lines represent the net diversification rate in each lineage, and shaded areas represent 95% Bayesian credible intervals. (d) Global temperature trend from the Oligocene to the present (adapted from [36]). The vertical dotted line indicates 5 Ma. (Online version in colour.)

(d) Evolution of ohnologues in polyploid Cyprinidae

With the criterion of Ks ≤ 0.50 between duplicates, 357 ohnologue pairs were identified in the three Sinocyclocheilus species, 109 in the two Schizothoracinae species, 170 in C. carpio and the three Sinocyclocheilus species, and 97 in C. carpio, S. rhinocerous, and Gymnodiptychus dybowskii/Schizothorax pseudoaksaiensis. The Ks distribution between ohnologues in C. carpio peaked at 0.1694, which corresponds to a divergence time of 24.13 Ma (left panel, figure 3). In the three Sinocyclocheilus species, the Ks peaks ranged from 0.1214 to 0.1419, corresponding to divergence times between 17.29 and 20.21 Ma (left panel, figure 3). The Ks peak was 0.1369 between ohnologues in Gymnodiptychus dybowskii, and 0.1596 in Schizothorax pseudoaksaiensis (data not shown), corresponding to divergence times of 19.50 Ma and 22.74 Ma, respectively. The ohnologue topology across different polyploid groups is summarized in figure 3 (right panel). Ohnologues form two clades across polyploid species in most (89 of 109) orthologous groups when Schizopygopsini and Schizothoracini species were analysed together. A similar pattern was seen in the three Sinocyclocheilus species (204 of 357 orthologous groups), and in C. carpio and the three Sinocyclocheilus species (111 of 170 orthologous groups). Both ohnologues in Schizopygopsini and Schizothoracini species are grouped with one clade of ohnologues in C. carpio and S. rhinocerous in 85 of 97 orthologous groups, when C. carpio, S. rhinocerous and Gymnodiptychus dybowskii or Schizothorax pseudoaksaiensis were analysed.

Figure 3. Distribution of the number of synonymous substitutions per synonymous site (Ks) between ohnologues in each of the four polyploid Cyprinidae species (left panel). The vertical dashed line indicates the Ks value corresponding to the peak. Major phylogenetic relationships of ohnologues across polyploid Cyprinidae species (right panel). (Online version in colour.)

(e) GO enrichment of pseudogenes in Cyprinidae genomes

The number of pseudogenes is 8926 in C. carpio, 3194 in S. grahami, 3610 in S. rhinocerous and 3932 in S. anshuiensis (electronic supplementary material, table S6). The GO enrichment results for pseudogenes in each species are listed in electronic supplementary material, table S7 and shown in electronic supplementary material, figure S8. Thirteen GO terms are found to comprise significantly more pseudogenes in the genomes of the two surface-dwelling polyploid species (C. carpio and S. grahami) than in the two cave-dwelling Sinocyclocheilus genomes (electronic supplementary material, figure S9). This indicates that, compared with the two surface-dwelling polyploid species, the two cave-dwelling Sinocyclocheilus species show a significant bias towards retention of genes in those 13 GO terms.

4. Discussion

The most salient finding of this study is that polyploidy in Cyprinidae shows clear adaptive potential, in terms of its simultaneous effects on environmental robustness and species diversification. Additionally, possible genetic advantages underlying the adaptive potential of polyploid Cyprinidae (i.e. independent allopolyploid origins and unique retention of duplicated genes between closely related polyploid species in distinct environments) are explored. In the following, these findings are discussed in relation to cyprinid evolution, and the evolutionary significance of polyploidization in general.

(a) Polyploidy in Cyprinidae: far more than an evolutionary dead end

Among the phenomena suggested to indicate adaptive potential in polyploids, environmental robustness (higher stress tolerance than diploids) is the most straightforward [5]. This has been proposed based on the fact that extant polyploids are often found in unstable environments [5,7,8], which is also the case in Cyprinidae. Polyploid cyprinids are significantly more frequent in areas of higher altitude and lower mean annual temperature (figure 1 and table 2). The observation persists across the different datasets (see Results). Schizopygopsini and Schizothoracini fishes best represent polyploid cyprinids that are adapted to higher altitudes, since they are restricted to the QTP and its adjacent area and are the only cyprinids that occur naturally at high elevation there. In addition to high altitude, polyploid cyprinids are often found in other unstable environments. For example, Sinocyclocheilus from Cyprinini is the only cyprinid genus endemic to karst caves in southwest China. The observation that only polyploid cyprinids are found in higher-altitude areas of the QTP and in karst caves in southwest China could be explained by two alternative scenarios. One is that polyploids once coexisted with their diploid progenitors there, but their diploid progenitors have since emigrated or become extinct. The alternative is that polyploids first diverged from their diploid progenitors under different environmental conditions and subsequently migrated to their current habitats. Phylogenetic analyses and fossil records together might help distinguish these competing scenarios. The Cyprinidae phylogeny here (figure 2) and in an earlier study [12] suggest possible relatives of the diploid progenitors of polyploid Schizopygopsini, Schizothoracini and Sinocyclocheilus. A fossil study found that an Oligocene cyprinid species (Tchunglinius tchangii Wang et Wu) from the Nima Basin in the centre of the QTP [37] is closely related to the present-day genus Puntius in Smiliogastrini, which is distributed in South Asia and Africa. Similarly, Plesioschizothorax macrocephalus Wu et Chen, a Late Oligocene/Early Miocene fossil species from the Lunbola Basin in the north of the QTP [23,38], is closely related to the high-altitude-occurring Schizopygopsini and Schizothoracini. These fossil records suggest that Schizopygopsini and Schizothoracini species probably coexisted with their diploid progenitors before the recent uplift of the QTP, after which their diploid progenitors became extinct. No fossils have been recorded in karst caves in southwest China, so the two scenarios cannot be distinguished for Sinocyclocheilus. It is worth noting that occurrence records in Russia, especially in Siberia, are sparse in our datasets owing to the absence of records in GBIF and the literature. Nevertheless, the prevalence of polyploid Cyprinidae in high-altitude areas of the QTP and in karst caves in southwest China supports the environmental robustness of polyploidy.

In addition to increasing environmental robustness, polyploidization is expected to affect the diversification rate [5]. If polyploidy is not an evolutionary dead end, the diversification rate in a polyploid lineage should be at least comparable with that in its diploid relative lineage. Studies in plants suggest that extant polyploid lineages usually have lower net diversification rates than their diploid relatives, supporting the evolutionary dead end hypothesis [6]. Nevertheless, radiations have been observed in many plant groups that experienced ancient WGD, and usually occur a substantial amount of time after WGD [39]. A WGD radiation lag-time model has thus been proposed to link higher diversification rates with WGD in plants [39]; it suggests significant ‘lag times’ between WGD and radiation, possibly in response to environmental changes. The pattern predicted by the WGD radiation lag-time model is clearly observed in Cyprinidae. Polyploid species are common in seven of the 11 tribes in Cyprinidae (figure 2), but the net diversification rate increased only in Torini, Schizothoracini and Schizopygopsini (figure 2). All polyploid lineages in Cyprinidae originated approximately 15 Ma (figure 2), yet the net diversification rate did not increase in Torini, Schizopygopsini and Schizothoracini until approximately 5 Ma. Since all of the polyploid lineages with an increased net diversification rate occur in the QTP (Schizopygopsini and Schizothoracini) or its adjacent area (Torini), the environmental change that triggered the radiation of polyploid Cyprinidae lineages is most likely to be the continuous uplift of the QTP and the simultaneous climate changes, which have been associated with diversification in many organisms [40]. The WGD radiation lag-time process in polyploid Cyprinidae is probably as follows. Polyploid Cyprinidae lineages were established during a period of sharp global temperature decline in the Mid-Miocene (right panel, figure 2), since increased production of unreduced gametes induced by temperature shock is a key step in polyploid formation in fishes [5]. After a lag time, the final uplift of the QTP (approx. 10 Ma to present) [40] and the Late Miocene global cooling (approx. 7–5 Ma) [41] might together have increased net diversification rates in polyploid Cyprinidae lineages. Therefore, polyploidization in Torini, Schizothoracini and Schizopygopsini seems to drive diversification in the way that the WGD radiation lag-time model proposes: radiation occurs several million years after polyploidization, once environmental changes arise. A lineage-specific ohnologue resolution (LORe) model has been proposed to explain the lag time between WGD and diversification in salmonid fishes [42]. It proposes that the functional outcomes of WGD need not appear ‘explosively’, but can arise gradually over tens of millions of years owing to delayed rediploidization by the functional divergence of ohnologues responsible for lineage-specific adaptations and diversification [42]. The LORe model might also explain the WGD radiation lag time observed in polyploid Cyprinidae lineages, although allopolyploids are prevalent in Cyprinidae. It is notable that, in addition to polyploidization coupled with environmental changes, an increasing net diversification rate in polyploid lineages might result from other factors, such as niche availability due to a lack of competition from non-Cyprinidae species. However, this might not be the case for polyploid Cyprinidae lineages (Torini, Schizothoracini and Schizopygopsini) according to the following observations. Polyploid Cyprinidae coexist with Tibetan loaches (Triplophysa, in Balitoridae) and glyptosternoid catfishes (in Sisoridae) in the QTP and its adjacent area (electronic supplementary material, figure S10). Polyploid Cyprinidae, Tibetan loaches and glyptosternoid catfishes have similar diets (feeding on macroinvertebrates, algae, etc. [43]). Tibetan loaches and glyptosternoid catfishes also showed increasing diversification associated with the final uplift of the QTP [40]. Taken together, polyploidy in Cyprinidae is far more than an evolutionary dead end, and instead shows adaptive potential in terms of its positive effects on environmental robustness and diversification.

(b) Genetic basis of adaptive potential in polyploid Cyprinidae

Although Cyprinidae contains more polyploid species than any other vertebrate group, the origin of polyploidy in cyprinid species remains largely unknown. C. carpio, the most studied species, is reported to be allotetraploid according to cytogenetic [44] and genomic [25,45] evidence. Tor species in Torini are also speculated to be allotetraploid based on the ohnologue genealogy of Sox genes [33]. Comparing the Ks distribution of ohnologue pairs within polyploid species with the mitogenomic divergence among polyploid species has proven to be a reliable way to establish allopolyploidy that resulted from divergent diploid progenitors [31]. In such cases of allopolyploidy, the divergence time corresponding to the peak of the Ks distribution of ohnologue pairs within polyploid species is expected to be older than the mitogenomic divergence among polyploid species [31]. According to this criterion, Sinocyclocheilus, Schizopygopsini and Schizothoracini fishes are allotetraploids, as is C. carpio. Furthermore, ohnologue genealogy can distinguish whether polyploidization events in different polyploid species are shared or independent [33]. Specifically, if two polyploid species share a polyploidization event that predates their divergence, ohnologues are expected to group across species. Conversely, ohnologues are expected to group by species if polyploidization occurred independently after the species diverged [33]. As such, the Schizopygopsini and Schizothoracini lineages, which are deeply divergent in their mitogenomes, probably resulted from a shared polyploidization event (figure 3), as was also seen in the three Sinocyclocheilus species. Interestingly, ohnologue genealogy suggests that C. carpio and Sinocyclocheilus species result from a shared polyploidization event (figure 3). However, we cannot rule out the possibility that C. carpio and Sinocyclocheilus species arose from different polyploidization events involving closely related diploid progenitors. 
This caveat also applies to the Schizopygopsini and Schizothoracini lineages. It is clear that Schizopygopsini and Schizothoracini do not share a polyploidization event with C. carpio and Sinocyclocheilus species (figure 3). These observations suggest that allopolyploidization has occurred recurrently in Cyprinidae.
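
The two ohnologue-based criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The conversion T = Ks / (2r) is a common molecular-clock approximation, and the substitution rate, the divergence times, the species labels (spA, spB) and all function names are hypothetical placeholders chosen only for illustration.

```python
"""Sketch of the two ohnologue-based criteria described in the text.

All names, rates and numbers are hypothetical illustrations,
not values or code from the study.
"""


def ks_to_divergence_ma(ks_peak, subs_per_site_per_myr):
    """Convert a Ks peak into a divergence time (Myr) using T = Ks / (2r).

    The rate r is an assumed, user-supplied synonymous substitution rate.
    """
    if subs_per_site_per_myr <= 0:
        raise ValueError("substitution rate must be positive")
    return ks_peak / (2.0 * subs_per_site_per_myr)


def suggests_divergent_progenitors(ks_peak, subs_per_site_per_myr,
                                   mito_divergence_ma):
    """True if the ohnologue divergence predates the mitogenomic
    divergence among the polyploid species."""
    ohnologue_ma = ks_to_divergence_ma(ks_peak, subs_per_site_per_myr)
    return ohnologue_ma > mito_divergence_ma


def _clades(tree):
    """Return (leaf set of this node, leaf sets of all internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, clades = frozenset(), []
    for child in tree:
        child_leaves, child_clades = _clades(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def classify_polyploidization(tree):
    """Classify a gene tree of ohnologues sampled from two species.

    The tree is a nested tuple whose leaves are strings 'species_copy'.
    Sister pairs that mix species mean ohnologues group across species
    ('shared'); sister pairs from one species mean they group by species
    ('independent'). Anything else is reported as 'ambiguous'.
    """
    _, clades = _clades(tree)
    pairs = [clade for clade in clades if len(clade) == 2]
    if not pairs:
        return "ambiguous"
    mixed = [len({leaf.split("_")[0] for leaf in pair}) == 2 for pair in pairs]
    if all(mixed):
        return "shared"
    if not any(mixed):
        return "independent"
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical inputs: Ks peak 0.2, rate 0.005 per site per Myr,
    # mitogenomic divergence 12 Ma.
    print(suggests_divergent_progenitors(0.2, 0.005, 12.0))
    print(classify_polyploidization((("spA_1", "spB_1"), ("spA_2", "spB_2"))))
    print(classify_polyploidization((("spA_1", "spA_2"), ("spB_1", "spB_2"))))
```

A real analysis would work from full gene-tree distributions with support values rather than single four-tip topologies. As noted above, a 'shared' topology cannot by itself exclude independent events involving closely related diploid progenitors.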

Gene redundancy is one well-known advantage of becoming polyploid [1]. It has been thought to be a major evolutionary force because it provides opportunities for genetic innovation [46], which has frequently been seen in fish evolution. For example, sub/neofunctionalization of an elastin gene generated by teleost WGD contributed to the origin of the bulbus arteriosus, an evolutionarily novel organ in the teleost heart outflow tract [47]. Extreme anoxia tolerance in polyploid cyprinids (Carassius species) was acquired through the neofunctionalization of duplicated genes from a recent polyploidization, which created a new ethanol-producing pyruvate decarboxylase pathway [48]. Genome-wide studies show that, after WGD, complex genes are preferentially retained as duplicates in teleosts, increasing genomic complexity [28], and LORe lasting over tens of millions of years is assumed to be responsible for specific adaptations and diversification in salmonids [42]. Thus, biased gene retention after WGD might play a key role in long-term adaptation in polyploids [5]. However, the evolutionary significance of polyploidization with regard to duplicated gene evolution after WGD has not been considered in studies of cyprinid evolution, for example in genomic studies of C. carpio [25] and Sinocyclocheilus [26], or in a number of transcriptome-based studies of Schizopygopsini and Schizothoracini fishes. Considering that most duplicated genes are usually silenced within a few million years after polyploidization [49], pseudogenes could serve as an index of the unique retention of duplicated genes. As expected, both the number and proportion of pseudogenes in the four polyploid Cyprinidae genomes are significantly higher than those in diploid teleost genomes (χ²-tests, p < 0.01). GO enrichment of pseudogenes suggests that many genes are preferentially retained in the two cave-dwelling polyploids (S. rhinocerous and S. anshuiensis), but lost in the surface-dwelling polyploids (C. carpio and S. 
grahami; electronic supplementary material, figure S9). These include genes contributing to the regulation of innate immune responses (GO:0045088), the regulation of defence responses (GO:0031347) and the negative regulation of NF-κB transcription factor activity (GO:0032088); NF-κB is a critical regulator of immediate responses to pathogens. In fact, differences in infection susceptibility are commonly observed between cave- and surface-dwelling fishes [50]. Thus, preferential retention of immune-related genes might be responsible for cave adaptation in Sinocyclocheilus. However, while increased net diversification has been observed in polyploid cyprinids (figure 2), it is not known how this diversification has been directly or indirectly affected by unique gene retention. Asexual reproduction is another notable advantage of becoming polyploid in general [1], and in Cyprinidae in particular [51]. For example, Gibel carp Carassius gibelio is so far the only vertebrate species described in which sexually and asexually reproducing natural populations coexist sympatrically. This mixed mode of sexual and asexual reproduction is believed to be key to the ability of this polyploid cyprinid to adapt to environmental fluctuation and to colonize new habitats easily as an invasive species [52]. Asexuality may have played a key role in the early stage of Cyprinidae polyploid evolution in the context of QTP uplift and climate changes by ensuring survival in periods when sexual mates were scarce.
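
The comparison of pseudogene proportions between polyploid and diploid genomes reported above can be framed as a test on a 2 × 2 contingency table. The following Python sketch shows one way such a test might be computed. It assumes a Pearson χ² test without continuity correction, which may differ from the exact procedure used in this study, and the function name and the counts in the usage example are hypothetical, not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                    pseudogenes   other genes
        polyploid        a             b
        diploid          c             d

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stat, p = chi_square_2x2(10, 20, 20, 10)
    print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

With genome-scale gene counts even small differences in proportion become statistically significant, so an effect size such as the ratio of pseudogene proportions is worth reporting alongside the p-value.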

5. Conclusion

In this study, the evolutionary significance of polyploidization is investigated in Cyprinidae. Increased environmental robustness and an increased net diversification rate are both observed in multiple polyploid lineages. A lag time is observed between polyploidization events and the radiation of polyploid lineages in Cyprinidae. The continuous uplift of QTP and simultaneous climate changes since the Mid-Miocene are likely to be responsible for the diversification of polyploids in Cyprinidae. Polyploids in Cyprinidae formed through recurrent allopolyploidization events, and a unique gene retention profile after polyploidization might contribute to their adaptation to harsh environments. Taken together, these results suggest that polyploidy in Cyprinidae is far more than an evolutionary dead end, and instead shows substantial adaptive potential.

Data accessibility

Electronic supplementary material available from the Dryad Digital Repository: https://dx.doi.org/10.5061/dryad.mcvdncjwr [53].

Authors' contributions

B.G. conceived the study. X.L. analysed the data. B.G. and X.L. wrote the manuscript. Both authors read and approved the final version of the manuscript.

Competing interests

We declare that we have no competing interests.

Funding

This work was funded by the CAS Pioneer Hundred Talents Program, the National Natural Science Foundation of China (grant no. 31970382), and the Second Tibetan Plateau Scientific Expedition and Research Program (STEP, grant no. 2019QZKK0501).

Acknowledgements

We thank Ming Zou for help in genomic data analyses, and Jacquelin DeFaveri for language checking.

Footnotes

Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.4829943.

Published by the Royal Society. All rights reserved.

References