Proceedings of the Royal Society B: Biological Sciences
Open Access | Research articles

Widespread sympatry in a species-rich clade of marine fishes (Carangoidei)

Jessica R. Glass

College of Fisheries and Ocean Sciences, University of Alaska Fairbanks, Fairbanks, AK 99775, USA

South African Institute for Aquatic Biodiversity, Makhanda 6140, South Africa

Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA

Contribution: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Richard C. Harrington

Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA

Contribution: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing

Peter F. Cowman

College of Science and Engineering, James Cook University, Townsville, Queensland 4811, Australia

Biodiversity and Geosciences Program, Museum of Tropical Queensland, Queensland Museum, Townsville, Queensland 4810, Australia

Contribution: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

Brant C. Faircloth

Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA

Contribution: Conceptualization, Formal analysis, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Thomas J. Near

Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA

Yale Peabody Museum of Natural History, Division of Vertebrate Zoology, New Haven, CT 06520, USA

Contribution: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

A universal paradigm describing patterns of speciation across the tree of life has been debated for decades. In marine organisms, inferring patterns of speciation using contemporary and historical patterns of biogeography is challenging due to the deficiency of species-level phylogenies and information on species' distributions, as well as conflicting relationships between species’ dispersal, range size and co-occurrence. Most research on global patterns of marine fish speciation and biogeography has focused on coral reef or pelagic species. Carangoidei is an ecologically important clade of marine fishes that use coral reef and pelagic environments. We used sequence capture of 1314 ultraconserved elements (UCEs) from 154 taxa to generate a time-calibrated phylogeny of Carangoidei and its parent clade, Carangiformes. Age-range correlation analyses of the geographical distributions and divergence times of sister species pairs reveal widespread sympatry, with 73% of sister species pairs exhibiting sympatric geographical distributions, regardless of node age. Most species pairs coexist across large portions of their ranges. We also observe greater disparity in body length and maximum depth between sympatric relative to allopatric sister species. These and other ecological or behavioural attributes probably facilitate sympatry among the most closely related carangoids.

1. Introduction

For decades, biologists have debated whether there is a universal paradigm to explain patterns and processes of speciation in marine habitats [1–3]. From describing modes of speciation and mechanisms of dispersal [4,5], to characterizing latitudinal and longitudinal diversity gradients [6–8] and hypothesizing geographical origins of diversity [9–11], the rise of genetic methods and oceanographic modelling has upended traditional assumptions that vicariance leading to allopatry [12] is the default mechanism of speciation in the ocean [1,11]. Although biogeographic barriers have been shown to result in allopatric speciation in certain circumstances [1,13–15], in oceanic environments with fewer obvious geographical barriers to dispersal, other factors such as body size, pelagic larval duration and dispersal ability may be the prominent facilitators, rather than artefacts, of speciation [2,4,16–18].

To assess contemporary and historical patterns of marine speciation and biogeography, scientists have employed a variety of approaches [1,14,19–21]. One comparative method, age-range correlation, analyses the extent of range overlap between sister species pairs compared to the age of the phylogenetic node immediately subtending them as a proxy of species' age [22–26]. Age-range correlations can be examined across sister species pairs to look for associations between geographical patterns and relative node ages. Assuming an allopatric speciation model, random, independent changes in ranges over time should lead to greater sympatry at older nodes, whereas a sympatric speciation model should reflect greater range overlap in recently diverged sister species compared to more distantly related sister clades [23,24]. Peripatric speciation, caused when a population becomes isolated at the periphery of its ancestral distribution, can be assessed by examining range size evenness (range symmetry) between sister species. Peripatric speciation is thought to occur when the ranges of recently diverged sister species are highly asymmetrical due to one species having a smaller range on the edge of the larger ancestral range [22,27].

Few studies have applied analyses of age-range correlation and range symmetry in large and taxonomically inclusive lineages of vertebrates, particularly marine fishes [18,23,24,28–31]. The use of these approaches is hindered by a lack of comprehensive taxon sampling and limited availability of range data for many species. Existing age-range correlation studies on marine fishes have focused mainly on taxa occupying tropical coral reefs [1,6,11,14,32, cf. 33]. While these methods have challenges, such as distinguishing between sympatric speciation and secondary sympatry (i.e. allopatric speciation with subsequent range changes) [19,22,26,34], examining relationships between species ranges and node ages across large clades remains useful for understanding contemporary and historic biogeography in marine fishes. For pelagic and non-reef obligate species with high dispersal abilities, traditional models of allopatric and parapatric speciation that are thought to affect coral reef species [1,14] may be less important than sympatric speciation involving ecological divergence through habitat partitioning or reproductive timing [2,16,32]. Examining clade-level patterns of species ranges and integrating approaches such as age-range correlation with ecological data allow one to quantify the relationship between biogeography and speciation at larger taxonomic scales and assess potential drivers or outcomes of different speciation mechanisms (e.g. character displacement, competitive release).

Here, we integrate ecological trait data and characterize contemporary patterns of biogeography in species belonging to a large clade of coastal-pelagic percomorph marine fishes, Carangoidei [35], which contains Coryphaena (dolphinfishes), Echeneidae (remoras), Rachycentron canadum (cobia) and Carangidae (trevallies). These fishes prefer habitats ranging from reef-associated with pelagic-neritic to brackish, although the group can broadly be classified as coastal-pelagic. Life-history characteristics of the carangoids frequently exclude them from studies on coral reef obligate fishes, as well as studies that focus on open-water pelagic species such as tunas, because they do not exhibit ecological traits characteristic of entirely one group. For example, some genera within Carangoidei (e.g. Seriola, Caranx, Remora) have high dispersal potential due to their large body size and association with drifting seaweed rafts, similar to pelagic fishes [17]. Yet many carangoid species also display restricted home ranges [36–39], a trait more characteristic of reef fishes. Carangoids are assumed to have pelagic larval dispersal but the length of larval drifting and juvenile settlement patterns varies by species [40–42]. Moreover, the importance of pelagic larval duration on dispersal, range size and speciation rate for marine fishes remains disputed [18,43–45]. The regions of highest species richness of Carangoidei are the reef-abundant Indo-Australian Archipelago and Western Indian Ocean (figure 1a), making carangoids important for discussions on origins of tropical and sub-tropical fish biodiversity [6]. Carangoids are thus a key group for studying biogeographic patterns of fishes that span coral reefs and coastal habitats to the open ocean.

Figure 1. (a) Heatmap of Carangoidei species richness using species range extent of occurrence data from IUCN and probability of occurrence data from Aquamaps [46]. (b) Allopatric sister species pair, Alectis indica (dark green) and Alectis alexandrina (light green), exhibit no range overlap and low range symmetry. (c) Sympatric sister species pair, Caranx sexfasciatus (light green) and Caranx papuensis (dark green), exhibit high range overlap and high range symmetry. The dark green regions in (c) represent complete overlap between C. sexfasciatus and C. papuensis.

A comprehensive, time-calibrated phylogeny is highly desirable to study patterns of speciation. However, the monophyly and taxonomic composition of Carangoidei and the more inclusive lineage, Carangiformes, have been contentious since the first phylogenies of these groups were published in the late twentieth century [42,47,48]. For example, early morphological phylogenies suggested Carangoidei encompassed the Carangidae, Echeneidae, Rachycentron canadum, Coryphaena, Nematistius pectoralis (roosterfish) and Mene maculata (moonfish) [42,49,50], whereas molecular data consistently resolved Carangidae as paraphyletic, but only inclusive of Rachycentron canadum, Coryphaena and Echeneidae [35,51–53]. Moreover, molecular studies have disagreed on whether Carangiformes is paraphyletic [54,55] or monophyletic [35,56–58]. These prior studies have been limited by combinations of taxonomic or locus sampling [51,53,56] or insufficient fossil calibrations [52]. Here, we perform a comprehensive phylogenomic analysis using a dataset of more than 955 ultraconserved element (UCE) loci [59] collected from 80% of the recognized species of Carangoidei. We combine this phylogenetic framework with data on geographical distributions, depth distributions and body size to address patterns of allopatry and sympatry in Carangoidei and examine phylogenetic signal in traits thought to influence speciation.

2. Material and methods

(a) Specimen sampling, genomic library construction and DNA sequencing

We obtained tissues for 154 species including nine outgroup species of Carangiformes through field collection and museum loans (electronic supplementary material, table S1). We prepared dual-indexed libraries [60] for targeted enrichment using the HyperPrep Kit (KAPA Biosystems, Wilmington, MA) following the manufacturer's protocols (electronic supplementary material). We used a probe set targeting 1314 UCE loci informative for phylogenetic analyses of Carangiformes and other acanthomorph fishes across evolutionary time scales [59]. We followed the methods of Ghezelayagh and Harrington [61]; see detailed protocol in electronic supplementary material. UCE sequence data were processed prior to phylogenetic analyses with phyluce v1.6 [62], which we used to construct alignments of individual UCE loci and perform edge trimming. We generated two data matrices for phylogenetic analyses to compare tree topologies with different amounts of missing data: one where 75% of taxa (115 out of 154) were present in each alignment and one where 95% of taxa (146 out of 154) were present.
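The two completeness thresholds described above can be illustrated with a small sketch. It is not the phyluce implementation: the function names and the per-locus taxon counts are hypothetical, and rounding down is an assumption chosen only because it reproduces the minimums of 115 and 146 taxa reported in the text.

```python
import math


def min_taxa(total_taxa, completeness):
    """Minimum number of taxa a locus must contain to enter the matrix.

    Rounding down is an assumption; it matches the 115 and 146 taxon
    minimums given in the text for 154 taxa.
    """
    return math.floor(total_taxa * completeness)


def filter_loci(taxa_per_locus, total_taxa, completeness):
    """Return the names of loci whose taxon count meets the threshold."""
    threshold = min_taxa(total_taxa, completeness)
    return [name for name, count in taxa_per_locus.items() if count >= threshold]


# Hypothetical taxon counts for three loci (not real data).
example_counts = {"uce-1": 150, "uce-2": 120, "uce-3": 90}
kept_75 = filter_loci(example_counts, 154, 0.75)
kept_95 = filter_loci(example_counts, 154, 0.95)
```

With these invented counts, two loci pass the 75% threshold and one passes the 95% threshold, which mirrors the trade-off between locus number and missing data that the two matrices are meant to explore.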

(b) Phylogenetic and relaxed molecular clock analyses

We implemented the UCE-specific Sliding Window Site Characteristics approach with site entropy (SWSC-EN) to identify UCE core and flanking regions at each locus [63]. We used these results as input for PartitionFinder v2 [64] to determine the optimal number of partitions for loci in the 75% complete and 95% complete matrices. We inferred a partitioned maximum-likelihood (ML) phylogeny using IQ-TREE [65] and implemented the ultrafast bootstrap approximation approach using 1000 bootstrap replicates and a relaxed hierarchical clustering algorithm (rcluster) that included the top 10% partition merging schemes [66]. We rooted the tree with the myctophid Ceratoscopelus warmingii.

To account for stochasticity in the evolutionary history among individual UCE loci, we also performed a coalescent-based analysis using loci from the 75% complete matrix. We first generated individual locus trees in MrBayes v3.2.6 (see electronic supplementary material) [67]. We inferred a species tree from these locus trees using ASTRAL-II v5.6.2 with the default parameters [68]. Given the similar topologies generated from the 75% complete concatenated dataset compared to the coalescent-based tree, we used the ML phylogeny as a fixed topology to estimate divergence times of carangiform lineages by implementing a relaxed molecular clock approach in BEAST v1.10.4 [69]. Because BEAST has computational limitations when analysing hundreds of loci simultaneously, we performed replicate analyses using different combinations of loci, sensu Harrington et al. [56] and Branstetter et al. [70]. We included nine fossil calibration points from Harrington et al. [56] that spanned the Carangiformes clade and assigned the same lognormal prior distributions to incorporate age priors for select nodes (electronic supplementary material).

(c) Biogeographic and trait analyses

We obtained range data for 125 species from the IUCN Red List database, which consists of expert-validated range maps depicting the known ‘extent of occurrence’ of each species in the form of spatial polygons [71]. We obtained ranges for species missing from the IUCN database (n = 25) from Aquamaps, a database of species range predictions that uses a combination of occurrence data and species range modelling to assign relative probabilities of occurrence for each point coordinate [46]. Species for which no range data were available (n = 4) were trimmed from the time-calibrated phylogeny prior to biogeographic analyses. We also trimmed the time-calibrated phylogeny to only include carangoid taxa (Carangidae, Echeneidae, Coryphaenidae and Rachycentridae). The final dataset contained 123 species (electronic supplementary material, table S1).

To assess relationships between morphological traits, ecological traits and biogeography, we compiled trait data on maximum body length and maximum depth in the water column for the 125 carangoid species from the IUCN database [71]. If IUCN data were missing, we used FishBase [72]. We also compiled data on habitat class (reef or non-reef-associated) and diet (piscivorous or non-piscivorous) from a prior study [73] and the IUCN database [71].

We extracted divergence time estimates for each node of the time-calibrated phylogeny and conducted pairwise comparisons of range overlap and range symmetry [26]. We defined range overlap as the area shared by both species of a given pair divided by the area of the species with the smaller range [26]. This produced an index ranging from 0 to 1, with 0 indicating no overlap (figure 1b) and 1 indicating complete overlap (figure 1c). Complete overlap meant both species co-occur throughout their entire respective ranges or that the range of one species is entirely encompassed by the other. To incorporate possible errors in points of occurrence, we classified species as allopatric if the range overlap index was less than 0.05 and sympatric if the range overlap index was greater than or equal to 0.05. We performed these analyses between all species in the phylogeny and extracted sister species pairs for additional analyses. A sister species pair was defined as two species sharing a unique common ancestor, i.e. an ancestor not shared with any other taxa in our phylogeny. We analysed 41 distinct sister species pairs, of which at least 32 we believe to be direct sister species (reciprocally monophyletic). The remaining nine pairs were uncertain due to unsampled species which may represent a closer relative to one of the species in those pairs. Hereafter, we use 'sister species pairs' to include both those believed to be direct sister species and those that exclusively share a single ancestor based on our phylogenetic sampling. To test for peripatry, we calculated range symmetry for each sister species pair, defined as the smaller of the two species' ranges divided by the sum of both species’ ranges [22]. The range symmetry metric falls between 0 and 0.5, where 0.5 indicates that both species have equal-sized ranges.
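The two indices and the 0.05 threshold defined above can be written out as a minimal sketch. It assumes that range areas and the shared area have already been measured from the range polygons (for example in a GIS); the function names and the example areas are hypothetical and are not taken from the study's code or data.

```python
def range_overlap(area_a, area_b, shared_area):
    """Overlap index: shared area divided by the smaller of the two ranges (0 to 1)."""
    smaller = min(area_a, area_b)
    if smaller <= 0:
        raise ValueError("range areas must be positive")
    return shared_area / smaller


def range_symmetry(area_a, area_b):
    """Symmetry index: smaller range divided by the sum of both ranges (0 to 0.5)."""
    total = area_a + area_b
    if total <= 0:
        raise ValueError("range areas must be positive")
    return min(area_a, area_b) / total


def classify(overlap, threshold=0.05):
    """Label a pair sympatric when the overlap index reaches the threshold."""
    return "sympatric" if overlap >= threshold else "allopatric"


# Hypothetical areas in arbitrary units: the smaller range lies entirely
# inside the larger one, so the overlap index is 1.
nested_overlap = range_overlap(100.0, 400.0, 100.0)
nested_symmetry = range_symmetry(100.0, 400.0)
```

In the nested example the overlap index is 1 while the symmetry index is 0.2, which shows why the two metrics are reported separately: complete overlap says nothing about whether the two ranges are similar in size.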

(d) Phylogenetic signal in Carangoidei

To examine the ecological and physical similarity of closely related species in Carangoidei, we tested three continuous traits (maximum body length, maximum water column depth and range size) and two discrete traits (habitat and piscivory) for phylogenetic signal, defined as the tendency for related species to resemble each other more than they resemble species drawn at random from the tree [74,75]. For continuous traits, we tested for phylogenetic signal using Blomberg's K [75]. We tested for phylogenetic signal in the discrete traits—habitat (reef or non-reef) and piscivory (piscivorous or non-piscivorous)—using Fritz's D [76] (electronic supplementary material).

We tested for phylogenetic signal of range overlap and range symmetry in Carangoidei using multiple matrix regression by means of a partial Mantel test with 1000 phylogenetically informed permutations in ‘phytools’ in R [77,78]. Although the Mantel test has been criticized for its low power, it is a suitable option for testing phylogenetic signal in data that are inherently pairwise contrasts, such as measures of range overlap and symmetry [79].
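To make the permutation logic concrete, the following sketch implements a plain two-matrix Mantel test. It is a simplification, not the analysis used here: it is not a partial Mantel test, it permutes taxa uniformly at random rather than in a phylogenetically informed way, and it does not use 'phytools'. All function names and the example matrix are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mantel(dist_a, dist_b, permutations=1000, seed=0):
    """Simple Mantel test: correlate two distance matrices and estimate a
    two-sided p-value by permuting the rows and columns of the second."""
    rng = random.Random(seed)
    n = len(dist_a)
    flat_a = upper_triangle(dist_a)
    observed = pearson(flat_a, upper_triangle(dist_b))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[dist_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(flat_a, upper_triangle(permuted))) >= abs(observed):
            hits += 1
    # Count the observed arrangement as one permutation to avoid p = 0.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical symmetric distance matrix for four taxa (not real data).
example = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 3.0, 5.0],
    [4.0, 3.0, 0.0, 2.0],
    [6.0, 5.0, 2.0, 0.0],
]
r_obs, p_value = mantel(example, example, permutations=200, seed=1)
```

Permuting rows and columns together keeps each taxon's distances intact while breaking the pairing between the two matrices, which is the reason a Mantel-type test suits inherently pairwise quantities such as range overlap.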

We used linear regression to examine the relationships between divergence times for 41 sister species pairs, range overlap, and range symmetry because these metrics are hypothesized to be informative about speciation mechanisms [22,26,29]. We also used Welch's t-tests to statistically examine the association of allopatry and sympatry with ecological trait differences. For each sister species pair, we calculated trait contrasts: the differences in body length and maximum water column depth between sister species. We chose water depth as a proxy for species' utilization of the water column. Limited availability of data on minimum water column depth prohibited us from calculating depth distributions across Carangoidei. We recognized this limitation and used the most comprehensive depth datasets available for the clade. We also analysed differences in body length and water column depth for sympatric sister species categorized by habitat type, i.e. whether both sister species occupied the same or different habitat type (reef or non-reef). We excluded one sister pair, Trachinotus mookalee (Indian pompano) and Trachinotus anak (oyster pompano), because no maximum depth data were available for T. mookalee [71,72].
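The trait-contrast comparison above can be sketched as follows. Treating a contrast as the absolute difference between the two sister species is an assumption, the function names and example values are hypothetical, and the p-value is omitted because it requires a t-distribution that the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def trait_contrast(value_a, value_b):
    """Contrast for one sister pair; the absolute difference is assumed here."""
    return abs(value_a - value_b)


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    se_a = variance(sample_a) / n_a  # squared standard error of mean A
    se_b = variance(sample_b) / n_b  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical depth contrasts in metres (not data from the study).
allopatric = [1.0, 2.0, 3.0, 4.0, 5.0]
sympatric = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(allopatric, sympatric)
```

Welch's version is a reasonable choice when the two groups differ in size and spread, as with 11 allopatric and 30 sympatric pairs, because it does not pool the variances.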

3. Results

(a) Phylogenomic analyses and divergence times

We collected sequence data from an average of 958 loci for 154 individuals. Following alignment trimming, mean locus length was 972 bp (range: 319–1570 bp) and each locus contained a mean of 270.6 parsimony informative sites. The 75% complete alignment included 986 loci and the 95% complete alignment included 371 loci. PartitionFinder produced 143 partitions for the 75% complete matrix and 94 partitions for the 95% complete matrix. IQ-TREE (electronic supplementary material, figures S1 and S2) and ASTRAL (electronic supplementary material, figure S3) inferred similar trees with high ultrafast bootstrap support and local posterior probabilities, respectively (electronic supplementary material, figures S4 and S5). We observe similar topologies generated from the 75% complete concatenated dataset compared to the coalescent-based tree (electronic supplementary material, figure S4) and low Robinson-Foulds values between the trees (electronic supplementary material, table S3).

Most nodes in the ML tree generated from the 75% complete matrix are strongly supported, with 90% having ultrafast bootstrap support of 100 (figure 2). Our phylogenomic analyses suggest four distinct lineages within carangiform fishes: (1) a clade containing Lates calcarifer (barramundi), Centropomidae, Lactarius lactarius and Sphyraena; (2) a clade containing Polynemidae and Pleuronectoidei (flatfishes); (3) a clade containing Leptobrama, Toxotes, Nematistius pectoralis, Mene maculata (moonfish), Xiphias gladius and Istiophoridae; and (4) a clade containing Echeneidae, Rachycentron canadum and Coryphaena nested in a paraphyletic Carangidae (figure 2a; electronic supplementary material, figures S1–S3).

Figure 2. Time-calibrated phylogeny of 145 species of Carangiformes and two outgroup species generated by BEAST using a guide tree from the 75% complete matrix constructed in IQ-TREE and nine fossil calibration points. Blue bars indicate 95% highest posterior densities (HPD) around point estimates. Nodes represent median ages from a maximum clade-credibility tree. Ultrafast bootstrap support values (BT) are indicated as circles on each node. No circle indicates 95–100% BT support. Dark blue rectangles indicate nodes calibrated with priors based on fossil data. Diamond node labels indicate sympatric (light blue) and allopatric (black) carangoid sister species pairs. One sister pair (Trachurus indicus and T. delagoa) is unlabeled due to missing range data. Fish image sources are in electronic supplementary material, table S5.

Within Carangoidei, our phylogenetic hypotheses resolve two large subclades. The first consists of the echenoids (Echeneidae, Rachycentron canadum and Coryphaena). This clade is sister to the recently elevated Trachinotidae that includes Trachinotus (pompanos), Lichia amia (leerfish) and Scomberoidinae (Oligoplites [leatherjackets] and Scomberoides [queenfishes]) [80]. Lichia amia was previously classified with Trachinotus in Trachinotini [42,49]; however, we resolve L. amia as the sister lineage of Scomberoidinae with strong support (figure 2; electronic supplementary material, figures S1–S3). We delimit the second major subclade within Carangoidei as Carangidae, inclusive of Naucratinae and Caranginae, which contain numerous paraphyletic genera (figure 2). In Alepes, Decapterus, Seriola, and Caranx, one or two species classified in other genera resolve within these clades (figure 2; electronic supplementary material, figures S1–S3). Carangoides is polyphyletic, with species distributed across nine clades (figure 2b; electronic supplementary material, figures S1–S3). Our phylogenetic hypotheses are broadly congruent with recent molecular analyses focusing on Carangoidei that contain dense taxonomic sampling [52,53,56].

Species relationships within Carangiformes are largely consistent across the different methods of analysis (IQ-TREE, ASTRAL) and matrix composition (75% or 95% complete). Differences in phylogenetic relationships between the 75% ML topology and ASTRAL coalescent-based tree involve the phylogenetic placement of Centropomus medius, Seriola nigrofasciata, Decapterus macarellus, D. akaadsi, Caranx crysos and C. caballus, as well as Parastromateus niger (electronic supplementary material, figure S4). Maximum likelihood phylogenies inferred using the 75% and 95% matrices differ in the resolution of Trachurus trecae, Decapterus akaadsi, Uraspis uraspis and Carangoides bajad (electronic supplementary material, figure S5).

Using relaxed-clock molecular dating analyses, we generated similar estimates of divergence times and overlapping 95% highest posterior densities (HPD) across four random subsets of 25 UCE loci (figure 2; electronic supplementary material, figure S6). These analyses estimate the age of the most recent common ancestor (MRCA) of Carangiformes as 66.66 Ma (95% HPD: 62.43–71.59 Ma) and of Carangoidei as 53.82 Ma (95% HPD: 51.50–56.77 Ma; figure 2). These divergence times are younger than those of a previous study on Carangoidei [52] but fall within the 95% confidence intervals of other phylogenetic studies that estimate the ages of Carangiformes and Carangoidei [56–59].

(b) Phylogenetic signal

Tests of continuous trait variables within Carangoidei using Blomberg's K suggest there is phylogenetic signal for body length (K = 0.123, p = 0.001), water column depth (K = 0.090, p = 0.006) and range size (K = 0.307, p = 0.001), as K values lower than one imply that variance is less than expected by a Brownian process. Phylogenetic least squares regression with an OU error model suggests a correlation between maximum body length and maximum water column depth (t = 2.593, p = 0.011), but not between maximum body length and geographical range size (t = 1.004, p = 0.317).

We calculated Fritz's D to examine phylogenetic signal in discrete variables (reef habitat and piscivory) and compared observed D values to simulated sums of expected character changes under Brownian motion and random models. The test for reef habitat (D = 0.663) suggests a departure from Brownian motion expectations (p[D > 0] < 0.001) but more phylogenetic signal than expected from a random distribution of habitat traits across the phylogeny (p[D < 1] < 0.001). The test of Fritz's D for diet (piscivory or non-piscivory; D = 0.069) suggests the evolution of diet resembles a Brownian process (p[D > 0] = 0.365; p[D < 1] < 0.001). Mantel tests of contrast variables reveal a correlation between phylogeny and overlap of geographical ranges (R² = 0.020, p = 0.009), as well as phylogeny and geographical range size symmetry (R² = 0.033, p = 0.001).

(c) Sister species analyses

Among the 41 resolved carangoid sister species pairs, 30 (73%) are sympatric (range overlap ≥ 0.05) and 11 (27%) are allopatric (figure 2). All allopatric sister pairs have range overlap values of zero except Trachinotus anak and T. mookalee, with a range overlap value of 0.02. All sympatric species pairs have range overlap values > 0.6 except Uraspis helvola and U. secunda, whose range overlap is 0.16 (figure 3a). The node ages of sympatric sister species pairs range from 0.08 to 17.74 Ma (median: 1.65 Ma), while the node ages of allopatric pairs range from 1.23 to 6.31 Ma (median: 2.14 Ma). We find no effect of node age on range overlap (r = 2.25 × 10⁻⁴, p = 0.926; figure 3a) or geographical range size symmetry (r = 0.013, p = 0.480), nor is there a correlation between range overlap and range size symmetry (r = 0.072, p = 0.091; figure 3b). Median range size symmetry is 0.273 for allopatric species pairs and 0.348 for sympatric species pairs. Notably, there are greater differences in maximum water depth between sister species in sympatry versus those in allopatry (t = 2.513, p = 0.017; figure 4a). We also observe greater differences in maximum body length between sympatric sister species pairs compared to allopatric pairs (figure 4b), though these are not significant (t = 1.821, p = 0.081).

Figure 3. Range overlap as a function of node age (a) and range symmetry as a function of overlap (b) for 41 sister species pairs within Carangoidei.

Figure 4. Contrasts between allopatric and sympatric sister species pairs for maximum water column depth (a) and maximum body length (b). Results of Welch's t-tests are presented to show significance between allopatric and sympatric sister species pairs.

Most allopatric sister species pairs comprise non-reef-associated species. Out of 11 allopatric sister pairs, 73% (n = 8 pairs) contain two non-reef-associated species (electronic supplementary material, table S4). We also find greater differences in maximum water depth (t = 2.173, p = 0.034) between sympatric sister species pairs that occupy the same habitat (e.g. both occupy reef or non-reef habitats) compared to sympatric pairs that occupy different habitats (electronic supplementary material, figure S7).

4. Discussion

(a) Carangiform phylogeny and timing of diversification

With a dataset averaging 958 UCE loci and representing 80% of the known species diversity within Carangoidei, we provide phylogenomic resolution for the relationships within this clade. The molecular and phylogenomic perspective on carangoid relationships is notable in the consistent paraphyly of the traditional delimitation of Carangidae when excluding the echenoids; thus, we confirm a newly revised classification for the carangoid subclades Carangidae and Trachinotidae [80]. Our resolution of subclades within Carangiformes is concordant with previous analyses using UCEs, with the exception of the relationships of Latidae, Centropomidae and Sphyraena [35,56]; these different phylogenetic relationships of early diverging carangiform lineages reflect ongoing challenges using molecular data to resolve taxonomic relationships due to short internal branches [81].

Our estimates of node ages suggest the origin of Carangoidei was approximately 53 Ma during the early Eocene (electronic supplementary material, figure S6). Estimates from UCE data for the age of Carangoidei are much younger than those of a previous study, which suggested a Late Cretaceous origin; this is probably due to that study's fossil calibrations, which have older age estimates within Carangoidei [52]. Due to several identical fossil calibration points within Carangiformes shared across studies, our age estimates are similar to those of phylogenomic analyses using UCEs [56,59] and exons from protein coding genes [57]. Our results suggest most of the species-level diversification occurred during the last 10 million years, during the late Miocene (approx. 11.63–5.33 Ma). The late Miocene was a period of warmer global climate and expanding coral reef habitats, which is congruent with the observed diversification of other tropical and sub-tropical coral reef fish lineages with diversity centered in the Indo-Pacific Archipelago region [6,9].

(b) Patterns of carangoid sympatry and allopatry

While sympatry of sister species pairs is ubiquitous across Carangoidei, regardless of node age, 27% of sister species pairs were allopatric. Most cases of allopatry (64%) were likely caused by vicariance, specifically the Isthmus of Panama. Seven out of eleven allopatric pairs have divergence times younger than 5 Ma and presently exhibit parallel range patterns, where one species occupies the eastern Pacific (e.g. Selene brevoortii [Mexican lookdown]) and its sister species inhabits the western Atlantic (e.g. Selene vomer [lookdown]; electronic supplementary material, figure S8A). The other cases of allopatry are potentially maintained by the cold-water barrier formed by the Benguela and Agulhas currents off the southern coast of South Africa separating the Atlantic and Indian Oceans (Alectis indica [diamond trevally] and A. alexandrina [African threadfish]), or open-ocean barriers in the Atlantic (e.g. Selene setapinnis [Atlantic moonfish] and S. dorsalis [African moonfish]; electronic supplementary material, figure S8B) and Indo-Pacific (Trachinotus mookalee and T. anak; Trachurus japonicus [Japanese horse mackerel] and T. novaezelandiae [yellowtail horse mackerel]). Isolating barriers are unknown for the oldest (approx. 6.3 Ma) diverging allopatric sister pair, Pseudocaranx dentex (white trevally) and Carangoides equula (whitefin trevally), which are found throughout the Atlantic and Indo-Pacific, respectively.

To our knowledge, only one study has demonstrated such widespread sympatry (76%) in a large lineage of marine fishes, Myctophidae [82], although high degrees of sympatry have been demonstrated in some genera of coral reef fishes – for example, over 80% of sister species pairs in Pomacanthus (angelfishes; n = 13 spp.) [83] and Haemulon (grunts; n = 21 spp.) [84]. Most comparable analyses of marine fishes found a higher prevalence of allopatry between sister species, from 62% in New World haemulid fishes (n = 42 spp.) [31] to 64% in parrotfishes (n = 61 spp.) [85] to 88% in Halichoeres (wrasses; n = 24 spp.) [86]. In Holacanthus (angelfishes; n = 7 spp.), one sister pair is sympatric while the other species are allopatric [87].

Our age-range correlation analysis revealed patterns that were not clearly consistent with a single geographical mechanism of speciation (e.g. sympatry, allopatry, peripatry), potentially due to range shifts in this clade over time [19]. While we cannot rule out sympatric speciation due to lack of appropriate data, we follow long-held assumptions and empirical evidence that allopatric speciation is the predominant mechanism driving diversification [88,89]. As such, the clade-wide pattern we observed may suggest secondary sympatry occurring after allopatric speciation. Moreover, the patterns we observe in range overlap and range symmetry imply close relatives in Carangoidei coexist and maintain sympatry across large portions of their ranges. The greater divergence in body size and water column depth in sympatric sister pairs compared to allopatric pairs may be prezygotic isolation mechanisms that reduce interspecific competition (e.g. character displacement), facilitating secondary sympatry among closely related species [90]. Similar examples of transitions from allopatry to secondary sympatry are sparse but have been observed in birds [30,91,92] and coral reef fishes [25,93]. Even under an assumed model of allopatric speciation, recent evidence suggests diverging and recently diverged lineages of birds, mammals and amphibians evolve under similar macro-selective pressures, contradicting long-standing ideas that divergent, allopatric adaptation initiates the earliest stages of speciation [33]. Reef fishes and marine cetaceans exhibit higher transition rates to sympatry than birds and other vertebrate lineages [93]. Although node age is a significant predictor of transition from allopatry to sympatry in terrestrial organisms, the probability of sympatry is independent of node age in coral reef fishes and cetaceans due to frequent, fast transitions between allopatric and sympatric states [93]. These fast transitions are attributed to higher intrinsic dispersal abilities in lineages of marine organisms compared to terrestrial vertebrates [93], even though dispersal ability, including pelagic larval duration, has a nuanced correlation with range size [18,44].
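The range overlap and range symmetry metrics mentioned above can be illustrated with a minimal sketch. This section does not restate how the metrics were computed, so the sketch assumes the definitions commonly used in age-range correlation work: overlap is the shared area divided by the area of the smaller range, and symmetry is the area of the smaller range divided by the summed area of both ranges. Ranges are represented as sets of equal-area grid cells, and the function names and toy ranges are hypothetical.

```python
def range_overlap(cells_a, cells_b):
    """Shared area divided by the area of the smaller range.

    0 means the two ranges are fully allopatric; 1 means the smaller
    range is completely nested within the larger one.
    """
    a, b = set(cells_a), set(cells_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        raise ValueError("each range must contain at least one cell")
    return len(a & b) / smaller


def range_symmetry(cells_a, cells_b):
    """Area of the smaller range divided by the summed area of both ranges.

    Values approach 0 when one range is tiny relative to the other and
    reach 0.5 when both ranges are the same size.
    """
    a, b = set(cells_a), set(cells_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        raise ValueError("each range must contain at least one cell")
    return smaller / (len(a) + len(b))


# Hypothetical toy ranges on an equal-area grid (not real species data).
species_x = {(0, 0), (0, 1), (1, 0), (1, 1)}
species_y = {(1, 1), (1, 2)}

overlap = range_overlap(species_x, species_y)    # 1 shared cell / 2 cells
symmetry = range_symmetry(species_x, species_y)  # 2 cells / 6 cells
print(f"overlap = {overlap:.2f}, symmetry = {symmetry:.2f}")
```

In an age-range correlation, values like these would be computed for each node and compared against node age; the grid-cell representation here is only a stand-in for real range polygons.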

(c) Ecological signature of secondary sympatry in carangoid fishes

We observe higher divergence in maximum water depth and body size in sympatric sister species pairs, suggesting ecological factors facilitate sympatry among the most closely related species of carangoids. Water depth differences between sympatric sister species have also been documented in New World Halichoeres fishes [86], and sympatric sister pairs of parrotfish exhibit greater differences in body size, morphology, habitat type and colour patterns [85]. Body size and water column depth are reflective of resource use, with the former being a strong correlate of prey consumption [94] and the latter indicative of habitat niche partitioning [95,96]. The trait differences we examined in Carangoidei may be the result of character displacement, which is represented by the divergence of character traits in two or more lineages occurring in sympatry [90,97–100], but at present, character displacement is difficult to prove due to the lack of detailed ecological trait data across carangoid species' ranges. Given that body size and maximum depth in the water column are positively correlated [101], it is unclear if divergence in body size is driving divergence in water depth distribution. Further research on these and other traits is warranted, particularly to compare sister species’ traits between areas of overlap versus non-overlap.
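One generic way to ask whether trait divergence is greater in sympatric than in allopatric sister pairs is a one-sided permutation test on the difference in group means. The sketch below is not the authors' analysis (this section does not describe the test they used); it is a swapped-in illustration, and the function name and depth values are hypothetical toy data.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """One-sided permutation test for mean(group_a) > mean(group_b).

    Returns the observed difference in means and a p-value estimated by
    repeatedly shuffling the group labels.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the p-value is never zero.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical absolute differences in maximum depth (m) within sister
# pairs; these numbers are invented for illustration only.
sympatric_pairs = [120, 85, 200, 60, 150]
allopatric_pairs = [20, 35, 10, 40]

observed, p_value = permutation_test(sympatric_pairs, allopatric_pairs)
print(f"observed difference = {observed:.2f} m, p = {p_value:.4f}")
```

A permutation test makes no distributional assumption about the trait differences, which suits small numbers of sister pairs, but it does not by itself account for the correlation between body size and depth noted above.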

The displacement of ecological and behavioural characters, in part by minimizing competition, is hypothesized to facilitate sympatry between closely related species [97,99]. In coral reef environments, habitat complexity may influence character displacement in reef fishes, be it divergence in mate recognition [102], trophic partitioning [103], reef preference [85], or territoriality [104]. Yet, since fewer than half of carangoid species (45%) are classified as reef-associated, carangoid niche partitioning might be shaped by different factors than those affecting coral reef fishes. Most (80%) allopatric carangoid sister pairs are non-reef-associated, while 80% of sympatric sister species pairs contain at least one reef-associated species. A previous analysis of carangoid body shape and ecological traits found that shifts from reef to non-reef environments increased rates of morphological diversification, implying that non-reef environments influenced morphological changes more than reef environments [73]. Although the authors did not find an effect of habitat type on rates of phylogenetic lineage diversification, their lineage diversification rates may have been skewed by their age estimates of Carangoidei [52], which are substantially older than our age estimates and those of other phylogenomic studies [56,57]. Our results suggest habitat and diet resemble a Brownian motion model of trait evolution, but we did not test for the effects of trait evolution on rates of diversification. Ecological partitioning among closely related species occupying non-reef environments might be one reason why carangoids exhibit such high disparity in body shape and body size relative to other percomorphs [48,105]. Despite this variation in body shape and size, we still observe phylogenetic signal in body length, which corroborates previous morphological work suggesting similarity in the evolution of carangoid body types among major subclades [51].

Our tests of phylogenetic covariance suggest that the evolution of certain morphological and ecological traits has been conserved during carangoid lineage diversification. Notably, although we observe weak but significant phylogenetic signal in body length and water column depth in Carangoidei, the prevalence of sympatry coincides with evidence of morphological and environmental niche-partitioning in body size and depth in the water column between sister taxa. Our results highlight the benefits of performing sister species analyses, not only because such analyses pose less risk of overestimating divergence times due to extinction events [24], but also because independent replicates are less likely to be phylogenetically confounded [106] and may reveal trait divergences that are masked by analyses of phylogenetic signal across the entire clade. Additional studies examining the mechanistic processes underlying speciation in Carangoidei, including mate selection, reproductive timing, and mechanisms of dispersal at the species level, will shed further light on the drivers of speciation in this unique clade of fishes.


Specimen numbers for museum tissues used in this manuscript are listed in the electronic supplementary material, table S1.

Data accessibility

All data underlying the analyses of this work are available on Dryad [107]. Sequence data generated for this manuscript are archived as raw reads in the NCBI Sequence Read Archive (SRA) under NCBI BioProjects PRJNA1028788, PRJNA758064 and PRJNA341709.

Supplementary material is available online [108].

Declaration of AI use

We have not used AI-assisted technologies in creating this article.

Authors' contributions

J.R.G.: conceptualization, data curation, formal analysis, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing—original draft, writing—review and editing; R.C.H.: conceptualization, formal analysis, investigation, methodology, project administration, supervision, visualization, writing—original draft, writing—review and editing; P.F.C.: conceptualization, formal analysis, investigation, methodology, project administration, validation, visualization, writing—original draft, writing—review and editing; B.C.F.: conceptualization, formal analysis, investigation, methodology, resources, software, supervision, validation, visualization, writing—original draft, writing—review and editing; T.J.N.: conceptualization, funding acquisition, investigation, methodology, project administration, supervision, validation, visualization, writing—original draft, writing—review and editing.

All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

Conflict of interest declaration

We declare we have no competing interests.


Funding

This research was supported by the Yale Institute for Biospheric Studies, the NSF Doctoral Dissertation Improvement Grant (grant no. DEB 1701597), the NSF Graduate Research Fellowship Program (grant no. DGE 1122492), the USAID Research and Innovation Fellowship, the Yale Department of Ecology and Evolutionary Biology Chair's Fund, and the Yale MacMillan Center International Dissertation Fellowship. P.F.C. was also funded by the Australian Research Council (grant no. DE170100516).


Acknowledgements

We thank Gregory Watkins-Colwell of the Yale Peabody Museum of Natural History and Roger Bills of the South African Institute for Aquatic Biodiversity. We are grateful to several museum collections, listed in electronic supplementary material, table S1, for providing samples for this study. Lastly, we thank the anonymous reviewers for their feedback.



Published by the Royal Society under the terms of the Creative Commons Attribution License, which permits unrestricted use, provided the original author and source are credited.