Chance long-distance or human-mediated dispersal? How Acacia s.l. farnesiana attained its pan-tropical distribution

Acacia s.l. farnesiana, which originates from Mesoamerica, is the most widely distributed Acacia s.l. species across the tropics. It is assumed that the plant was transferred across the Atlantic to southern Europe by Spanish explorers, and then spread across the Old World tropics through a combination of chance long-distance and human-mediated dispersal. Our study uses genetic analysis and information from historical sources to test the relative roles of chance and human-mediated dispersal in its distribution. The results confirm the Mesoamerican origins of the plant and show three patterns of human-mediated dispersal. Samples from Spain showed greater genetic diversity than those from other Old World tropics, suggesting more instances of transatlantic introductions from the Americas to that country than to other parts of Africa and Asia. Individuals from the Philippines matched a population from South Central Mexico and were likely to have been direct, trans-Pacific introductions. Australian samples were genetically unique, indicating that the arrival of the species in the continent was independent of these European colonial activities. This suggests the possibility of pre-European human-mediated dispersal across the Pacific Ocean. These significant findings raise new questions for biogeographic studies that assume chance or transoceanic dispersal for disjunct plant distributions.

Collection localities of populations of Acacia farnesiana (L.) Willd. used in this study and groupings resulting from cluster analysis using STRUCTURE. Samples were collected from populations in northern Australia, South Central Mexico, Spain, Madagascar, Réunion, India and Fiji. Voucher specimens were retained for a representative subset of individuals and deposited in the National Herbarium of Victoria (MEL). These samples were supplemented with herbarium specimens from additional populations in the Americas, Cape Verde and the Philippines. For details of individual specimens and vouchers, see the electronic supplementary material, appendix S1.

population [41] and plotted graphically using DISTRUCT v. 1.1 [42]. We used Analysis of Molecular Variance (AMOVA) [43], run in GENALEX v. 6.41 [39], to determine the statistical significance of genetic structure at the regional level and at the population level. Populations with fewer than five individuals were excluded. Analyses were based on the R_ST measure of genetic diversity, with 9999 permutations.

Coalescent modelling
We used coalescent modelling to assess migration rates between geographically defined populations, effective population sizes and divergence times. Different programs model the coalescent process to estimate a subset of these parameters. We used MIGRATE-N [44] to estimate rates of gene flow between populations and effective population sizes of these populations. We used IMA2 [45] to estimate divergence time between genetically distinct regions (determined from analyses of geographical structure outlined above), and to estimate gene flow rates and effective population sizes of these regions, and the ancestral population. Analyses were run on an SGI Altix XE Cluster through the Victorian Life Sciences Computing Initiative (VLSCI).

Analysis with MIGRATE-N
We used genotypes of individuals at each locus in a coalescent model to estimate mutation-scaled migration rates between populations (M = m/µ, where m is the rate of migration for each gene copy per generation and µ is the mutation rate per gene copy per generation), and population sizes for extant and ancestral populations (θ = 4Nµ, where N is the effective population size for a diploid species) [46,47].
Populations were pooled to reduce the number of parameters being estimated. Three pooled populations were considered in the Americas based on regional biogeography, rather than national boundaries: Arizona and Northwest Mexico (north of the Trans-Mexican Volcanic Belt); South Central Mexico (from the Trans-Mexican Volcanic Belt to the Isthmus of Tehuantepec); and Central and South America plus the Caribbean Islands (including Campeche, Mexico, which is south of the Isthmus of Tehuantepec). The other two pooled populations considered were the Old World, as defined above, and Australia. Estimated migration rates were considered to be significant if zero was not within the 95% highest posterior density (HPD) interval.
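As a rough illustration of how these parameters relate, the sketch below combines θ = 4Nµ with the mutation-scaled migration rate (m/µ) to give 4Nm, in which the mutation rate cancels, and then applies the 95% HPD rule described above. The function names and all numeric inputs are hypothetical and are not values from this study.

```python
def immigrants_4nm(theta, scaled_migration):
    """Return 4Nm from theta = 4*N*mu and the scaled rate m/mu (mu cancels)."""
    return theta * scaled_migration


def is_significant(hpd_lower, hpd_upper):
    """Rule from the text: zero must lie outside the 95% HPD interval."""
    return not (hpd_lower <= 0.0 <= hpd_upper)


# Hypothetical inputs, chosen only to show the arithmetic.
theta_recipient = 0.5   # mutation-scaled size of the receiving population
scaled_migration = 8.0  # mutation-scaled migration rate into that population

four_nm = immigrants_4nm(theta_recipient, scaled_migration)
print("4Nm:", four_nm)
print("Nm (immigrants per generation):", four_nm / 4.0)
print("interval [0.0, 12.5] significant?", is_significant(0.0, 12.5))
print("interval [1.5, 12.5] significant?", is_significant(1.5, 12.5))
```

Because the uniform priors are bounded below by zero, an interval whose lower limit sits exactly at zero is treated here as including zero, which is one reasonable reading of the criterion.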
A simple stepwise mutation (SSM) model of molecular evolution was used for each of the microsatellite loci, and mutation rates were allowed to vary among loci. For each hypothesized divergence, we ran two parallel Bayesian MCMC analyses of four heated chains (1.0, 1.5, 3.0 and 1 000 000.0), with independent random starting points. Starting parameters were based on a UPGMA tree and F_ST. Bayesian uniform priors for θ and m were bounded between 0 and 50, and between 0 and 100, respectively. A burn-in of 50 000 steps was followed by a further 50 000 recorded steps with sampling every 100 steps for each locus. Convergence on stationary distributions of parameters was assessed based on the similarity of posterior distributions of independent runs, and the effective sample size.

Analysis with IMA2
Divergence time between genetically distinct regions (determined from analyses of genetic structure, above), along with gene flow rates and effective population sizes, was estimated using IMA2 [45]. We used an SSM model of molecular evolution for each microsatellite locus. Four preliminary MCMC analyses with independent random starting points were run with 100 heated chains, under a geometric heating scheme with heating parameters ranging from 0.3 to 0.995. Each ran for 300 000 steps of burn-in followed by one sampled genealogy, at which point the Markov chain state was saved to a file. Each of these Markov chain state files was then used as a starting point for five further runs under the same conditions, for a total of 20 runs. For each of these runs, a further short burn-in of 10 000 steps was used to ensure that the run had moved to an independent starting point before saving genealogies. Following this extra burn-in, each MCMC analysis continued until 5000 genealogies had been saved, saving every 100 steps, giving a total of 100 000 saved genealogies across the 20 runs.
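The run design above can be checked with simple bookkeeping. The sketch below assumes that the stated total of 100 000 refers to saved genealogies pooled over all 20 runs; the variable names are ours.

```python
# Values taken from the run description in the text.
preliminary_runs = 4          # independent preliminary analyses
followups_per_state = 5       # further runs started from each saved chain state
extra_burn_in_steps = 10_000  # short burn-in at the start of each further run
genealogies_per_run = 5_000   # genealogies saved per run
steps_between_saves = 100     # sampling interval

total_runs = preliminary_runs * followups_per_state
sampling_steps_per_run = genealogies_per_run * steps_between_saves
steps_per_run = extra_burn_in_steps + sampling_steps_per_run
total_saved_genealogies = total_runs * genealogies_per_run

print("runs:", total_runs)
print("sampling steps per run:", sampling_steps_per_run)
print("steps per run including extra burn-in:", steps_per_run)
print("saved genealogies pooled over runs:", total_saved_genealogies)
```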
Coalescent parameters were estimated as the peak of the posterior probability distribution from the combined MCMC runs, and the confidence interval was determined as the 95% HPD. Divergence models were assessed using likelihood ratio tests, with likelihood ratios expected to follow a χ²-distribution. Specifically, we tested for differences in effective population size of extant and ancestral populations, non-zero migration rates and asymmetrical migration rates. Relative divergence times, population sizes and rates of gene flow were converted into absolute values assuming a generation time of 3 years [48], and mutation rates of either 2.4 × 10⁻⁴ mutations/marker/generation (based on direct observation of the microsatellite mutation rate in wheat [49]) or 5.0 × 10⁻⁴ mutations/marker/generation (considered to be the average mutation rate over many species [50,51]), as there were no fossil data available at the species level to calibrate mutation rates.
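The conversion to absolute values can be sketched as follows. This assumes the usual mutation-scaled definitions (scaled time = generations × µ, and θ = 4Nµ); the function names and the example inputs are hypothetical and are not estimates from this study. Only the generation time and the two mutation rates come from the text.

```python
GENERATION_TIME_YEARS = 3.0
MUTATION_RATES = {
    "wheat, direct observation": 2.4e-4,
    "cross-species average": 5.0e-4,
}


def divergence_time_years(t_scaled, mu, generation_time=GENERATION_TIME_YEARS):
    """Convert mutation-scaled time (generations * mu) into years."""
    return t_scaled / mu * generation_time


def effective_size(theta, mu):
    """Convert theta = 4*N*mu into the effective population size N."""
    return theta / (4.0 * mu)


# Hypothetical scaled estimates, for illustration only.
t_example = 0.12
theta_example = 1.0

for label, mu in MUTATION_RATES.items():
    years = divergence_time_years(t_example, mu)
    n_e = effective_size(theta_example, mu)
    print(f"{label}: {years:.0f} years, N = {n_e:.0f}")
```

The slower mutation rate gives older divergence times and larger effective sizes for the same scaled estimate, which is why two alternative sets of absolute values are reported.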

Genetic diversity
At the broad regional level, we found genetic diversity in A. farnesiana to be highest in the Americas, and lowest in the Old World (table 2). The number of alleles per locus and expected heterozygosity were both highest in the Americas. Private alleles were detected in the Americas and Australia, but not in the Old World. Within populations in the Old World, genetic diversity was slightly higher in Spain than in other populations. Populations in the Old World all had strongly negative values for F.

Geographical structure
Bayesian analysis using STRUCTURE inferred a maximum K [52] at K = 2, indicating this to be the number of clusters that best explained the data. A peak at K = 2 in STRUCTURE analyses can be an artefact (e.g. [53,54]), so we also examined clustering based on the secondary peak at K = 5 (figure 1a,b). At K = 2, both clusters were broadly distributed in the Americas (figure 2). Outside of the Americas, Cluster A was found in Australia, while Cluster B was found in the Old World.
At K = 5, Clusters 1, 3 and 4 were all found in the Americas (figure 3). Cluster 1 was only found in the Americas. Individuals from Madagascar, Réunion, the East Atlantic, India and Fiji were assigned to a single cluster (Cluster 4), suggesting a common source population. Individual plants from Spain and one plant from Cape Verde displayed approximately equal probability of assignment to Clusters 4 and 5 (figure 3b). One individual from the Philippines had ambiguous ancestry; the other was assigned to the same cluster as the population from South Central Mexico (Cluster 3). Australian populations were assigned a unique cluster (Cluster 2). No individuals from any population across the species distribution were assigned to Cluster 5 with a probability higher than 0.75.
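A minimal sketch of the kind of threshold rule implied by these assignments: an individual is assigned to a cluster only if its highest membership probability (Q) reaches a chosen cut-off, and is otherwise treated as admixed or ambiguous. The helper name, the sample labels and the Q vectors are hypothetical. The 0.75 default mirrors the threshold mentioned for Cluster 5, and the cut-off is left as a parameter because other passages use 0.5.

```python
from typing import Optional, Sequence


def assign_cluster(q_scores: Sequence[float], threshold: float = 0.75) -> Optional[int]:
    """Return the 1-based cluster whose Q meets the threshold, else None."""
    best = max(range(len(q_scores)), key=lambda i: q_scores[i])
    if q_scores[best] >= threshold:
        return best + 1
    return None


# Hypothetical membership vectors for K = 5 (each sums to 1).
individuals = {
    "sample_A": [0.02, 0.93, 0.02, 0.02, 0.01],  # clearly one cluster
    "sample_B": [0.03, 0.02, 0.05, 0.46, 0.44],  # split between two clusters
}

for name, q in individuals.items():
    cluster = assign_cluster(q)
    label = f"Cluster {cluster}" if cluster is not None else "admixed or ambiguous"
    print(name, "->", label)
```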
A two-level AMOVA, with populations grouped into regions (Americas, Australia and Old World), partitioned 0% of total genetic variation among regions, 19% among populations, 75% among individuals and 6% within individuals. The variation between regions was not significant (R_RT = −0.026; p = 1.0000). The variation between populations across the entire range, and between populations within regions, was significant (R_ST = 0.172; p = 0.0001 and R_SR = 0.194; p = 0.0001, respectively). This means that although the populations we defined are genetically distinct, populations in different regions are no more genetically distinct than populations within the same regions.
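The three statistics reported above can be cross-checked against each other. Assuming the standard hierarchical definitions (among regions relative to total, among populations relative to within-region variation, and among populations relative to total), they satisfy (1 − overall) = (1 − among-region)(1 − within-region). The sketch below applies that identity to the reported values; the variable names are ours.

```python
# Values reported in the text.
r_rt = -0.026          # among regions, relative to total
r_sr = 0.194           # among populations within regions
r_st_reported = 0.172  # among populations, relative to total

# Hierarchical identity: (1 - R_ST) = (1 - R_RT) * (1 - R_SR).
r_st_implied = 1.0 - (1.0 - r_rt) * (1.0 - r_sr)

print("implied overall statistic:", round(r_st_implied, 3))
print("difference from reported:", round(abs(r_st_implied - r_st_reported), 3))
```

The implied value is about 0.173, which agrees with the reported 0.172 to within the rounding of the inputs. The slightly negative among-region value is also why the overall statistic is smaller than the within-region one.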

Migration rate estimates from MIGRATE-N
Using MIGRATE-N, we detected significant, but low, levels of migration between regions within the Americas (table 3 and figure 4). Significant gene flow was detected from populations in the Americas to populations in the Old World and Australia (although this is inconsistent with the results from IMA2 analysis; see below). Based on the estimates of gene flow, we can infer the source populations for the introduced samples. The migration rate into the Old World from South Central Mexico was significantly higher than all other inferred migration rates. Rates of migration from each American region to Australia were moderate, but within this the migration rates from South Central Mexico, and from northwest Mexico and Baja California, were marginally higher than the migration rates from Central and South America.

Divergence time estimates from IMA2
Based on analyses of population genetic structure (see above), the only regions that had diverged sufficiently to allow calculation of divergence time were Australia and the Americas. Using IMA2, we estimated the divergence time between these regions as 795 (95% HPD confidence interval: 165–3795) or 1695 (360–8115) years ago, depending on the mutation rate used for scaling. The broad confidence intervals for divergence time make these results difficult to interpret (table 4). We tested for statistical significance of migration rates using likelihood ratio tests and found that they were not significant (tables 4 and 5). Although this is inconsistent with the non-zero migration rates inferred by MIGRATE-N analysis between the American regions and Australia, it must be noted that the rates inferred by MIGRATE-N were also low, and we did not run likelihood ratio tests to determine statistical significance in the MIGRATE-N analysis.

Table 3. Bayesian estimates (mode and 95% posterior probability interval) of migration rates (number of immigrants per generation) and mutation-scaled effective population size (θ) (a parameter that defines population size in terms of the diversity of genotypes) of Acacia farnesiana (L.) Willd. regional groups based on analysis with MIGRATE-N for all loci combined. The migration direction is represented with the immigrant population on the columns.

Effective population size estimates from MIGRATE-N and IMA2
Estimates of θ (mutation-scaled effective population size) from MIGRATE-N analysis varied from 2.32 in Arizona and northwest Mexico to 0.42 in the Old World, but these values have large confidence intervals, most of which overlap (table 3). Within our IMA2 analysis, we conducted likelihood ratio tests to determine whether differences in effective population sizes were statistically significant. We found that effective population sizes were significantly smaller for Australia than for either the Americas or the ancestral population. This is consistent with a founder effect following introduction from a source population in the Americas.
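For readers unfamiliar with likelihood ratio tests of nested models, the sketch below shows the general calculation: twice the difference in log-likelihood between the full and the constrained model is compared against a χ²-distribution. It is a generic illustration, not the IMA2 implementation. The log-likelihood values are hypothetical, and tests of a parameter at a boundary (such as a zero migration rate) are sometimes compared against a mixture distribution instead, which this sketch does not attempt. Only one and two degrees of freedom are handled, using closed forms from the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("only df = 1 or 2 are covered in this sketch")


def likelihood_ratio_test(loglik_full, loglik_nested, df):
    """Return (statistic, p-value) for a nested-model comparison."""
    statistic = max(2.0 * (loglik_full - loglik_nested), 0.0)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods: full model vs. model with equal population sizes.
stat, p_value = likelihood_ratio_test(-100.0, -101.92, df=1)
print(f"statistic = {stat:.2f}, p = {p_value:.4f}")
```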

Discussion
Our results are consistent with published records that A. farnesiana originates in the Americas [11–13]. This region has the highest genetic diversity (expected and observed heterozygosity) and the highest effective population size (based on coalescent analyses). All genetic clusters inferred by STRUCTURE at K = 2 and K = 5 are found in at least one sample from the Americas with a Q-score of more than 50%, and only a subset of these clusters are found outside the Americas. The majority of genetic clusters are widespread within the Americas, with many individuals displaying admixture between clusters. This implies high levels of dispersal within the region. We discuss our hypotheses regarding the introduction of the plant to different Old World locations and to Australia below.

Introductions from the Americas to Southern Europe
The genetic data support our hypothesis that the introductions of A. farnesiana to southern Europe were via colonial interactions with the Americas. This is consistent with historical accounts of cultivation of the species in Italy and Spain as an ornamental during the seventeenth century [4,14]. The source populations for these introductions were probably from Central America, South America or the Caribbean Islands, based on assignment to similar genetic clusters in the STRUCTURE analysis. At a broader geographical scale, estimates of migration rates using MIGRATE-N show a high probability of South Central Mexico as a source population for introductions to the Old World. This could be driven by the inclusion of samples from the Philippines with the Old World at this broad clustering level, and genetic similarities between samples from the Philippines and South Central Mexico (see below).

Secondary introductions via Southern Europe to other parts of the Old World
The genetic data support the hypothesis that the plant underwent secondary introductions to other parts of the Old World from southern Europe. Samples from Spain show the highest genetic diversity within the Old World. Other populations from the Eastern Atlantic Islands, Madagascar and Mascarene Islands, and India are genetically similar to those from Spain and contain a subset of this diversity. This could be the result of an initial introduction to Spain, with an associated population bottleneck followed by subsequent spread to other populations and further bottlenecks. Alternatively, the higher genetic diversity in Spain could be the result of multiple introductions, with only one of these introductions involving other parts of the Old World. The high level of admixture of genetic clusters at K = 5 in Spain would be consistent with the latter scenario.
The dispersal of A. farnesiana to Asia from Spain may have followed sixteenth-century Mediterranean trade routes connecting southern Europe with North Africa, Arabia and India [57]. The A. farnesiana populations in the Eastern Atlantic islands, Madagascar and Mascarene islands in the Indian Ocean, and Fiji in the Pacific Ocean are all part of the same genetic cluster that includes southern Europe and the Indian subcontinent. This corresponds with the expansion of Portuguese trade along Africa's Atlantic coast and in the Indian Ocean region during the sixteenth and seventeenth centuries, followed subsequently by Dutch, French and British trade and colonization in Africa, Asia and the Pacific during the eighteenth and nineteenth centuries. The plant's introduction to Fiji is ascribed to the gardening efforts of foreign traders and European missionaries during the 1860s [58], who may have introduced it from southern Europe or southern Asia.
For Southeast Asia, the genetic data demonstrate that introductions of A. farnesiana to the Philippines were independent of the introduction via southern Europe. The STRUCTURE analysis at K = 5 assigned samples from the Philippines to the same genetic cluster as plants from South Central Mexico (Puebla–Morelos). This genetic connection with South Central Mexico appears to be limited to the Philippines, with no further dispersal of this genotype to other Pacific islands or Old World locations. The Spanish galleon trade across the Pacific Ocean from Mexico from the sixteenth to the nineteenth centuries involved the introduction of numerous plants to the Philippines and the Mariana Islands (Guam) [59–61], and A. farnesiana may have been introduced during this period by people travelling between these two places.

Acacia farnesiana in Australia
There are no historical records for the introduction of A. farnesiana to Australia. Some of the earliest botanical explorations of northern Australia [62,63] note the species as being widespread at that time in some areas, which suggests its arrival prior to British settlement [64–66]. Hence, several Australian sources treat the plant as indigenous (e.g. [32,67]).
We tested three alternative hypotheses to explain the pre-British presence of A. farnesiana in the continent: (i) arrival via southeast Asia through colonial Portuguese or Spanish interactions; (ii) direct arrival from the Americas through European colonial voyages or (iii) pre-European arrival either through oceanic or human-assisted dispersal.

Arrival via Southeast Asia
The combined genetic and historical evidence does not support the hypothesis of A. farnesiana introductions from Old World colonial networks in Southeast Asia. If the Australian populations had arrived through Spanish or Portuguese colonial trade networks, they would most probably share a genetic cluster with populations from the Old World or Southeast Asia. However, the genetic clustering in the STRUCTURE analysis at both K = 2 and K = 5 shows that the populations from Australia form a different genetic cluster from the Old World or the Philippines populations, hence suggesting a separate introduction.

Direct arrival from the Americas through European colonial voyages
The combined genetic and historical evidence does not support the hypothesis of a direct introduction of A. farnesiana to Australia from the Americas through European colonial voyages or subsequent interactions. The divergence times based on the IMA2 coalescent analysis are difficult to interpret due to the large confidence intervals, but suggest introduction prior to European voyages and colonial interactions across the Pacific Ocean. The results of the STRUCTURE analyses also do not support direct

Pre-European arrival from the Americas
The genetic data offer strong evidence for a pre-European arrival from the Americas. This could have been through chance via oceanic dispersal, or through human agency. Genetic matching does not correspond with Spanish and Portuguese colonial activity in the Americas or in Southeast Asia, as discussed above. Estimates of migration rates between Australia and the Americas, using MIGRATE-N and IMA2, are low or non-significant. At K = 5, Australian populations are assigned to Cluster 2. A small number of samples from northwest Mexico and Baja California have Q-values above 0.5 for assignment to this cluster, and these may reflect the source populations for earlier introduction to Australia. These results could imply that following an early dispersal event from northwest Mexico to Australia, the populations subsequently diverged due to isolation or due to no further introductions from the Americas.

The enigma of Acacia farnesiana's arrival in Australia
How A. farnesiana arrived in Australia remains a historical enigma. It may be that a single chance event of oceanic dispersal [68] brought the plant's seeds from northwest Mexico across the Pacific Ocean to northern Australia, followed by gradual spread inland through wind, water, animals, birds and humans. Transoceanic dispersal by birds from Mesoamerica to Australia appears unlikely, since there are no records of bird migrations between these two regions [69]. Alternatively, A. farnesiana may have arrived through pre-European human-mediated introduction [70]. Our results are consistent with this. Although this may also seem improbable due to lack of any historical or archaeological evidence of pre-European human interactions between the Americas and Australia, a growing body of research using linguistic and genetic analysis indicates, for example, that sweet potato (Ipomoea batatas) was transferred by Austronesian sailors from the Americas into Oceania in pre-Columbian times [34,71]. As noted earlier, the genetic analysis of A. farnesiana samples from Fiji showed they were relatively recent introductions from southern Europe or India [58] and therefore not the source of early dispersals to Australia. Additional sampling and genetic analysis of A. farnesiana from intermediate islands in the north and south Pacific between the Americas and Australia may offer some clues.
One question that arises in relation to pre-European human-mediated introduction of A. farnesiana in Australia is whether the plant has long-standing recognition or use by indigenous groups. Our fieldwork in northwest Australia indicates that some Aboriginal languages in the Kimberley region such as Miriwoong have an indigenous name for A. farnesiana (moorloomboo), and that it is identified as a native plant typically growing on black soil country [72]. Further investigations of names and uses of A. farnesiana in other Indigenous languages of northern Australia combined with genetic analyses may provide insights into how the plant may have arrived and spread inland.

Conclusion
Our study is significant in providing the first genetic analysis of a plant introduction into continental Australia within a historical time frame of probably more than 750 years. While not conclusive, it also demonstrates the remarkable possibility of human-mediated dispersal of A. farnesiana from the Americas across the Pacific Ocean well before the arrival of Europeans to Australia. There are other plant species associated by their names and Aboriginal stories with pre-British colonial introductions to northern Australia from Southeast Asia and possibly further away from other parts of the Indian Ocean region. These include moringa (Moringa oleifera), commonly referred to as Koepanger's tree (Kupang being the capital of West Timor) [73] and tamarind (Tamarindus indica) [74], which is often associated with the activities of Makassan trepangers (collectors of sea cucumber) and their trade with Aboriginal groups in northern Australia [75,76]. With increasing recognition of the long history of anthropogenic influence on vegetation change in the Australasian region [35,36], our study forges a new frontier for investigating ancient and precolonial interactions between Australia and the Pacific world and the Americas through integration of genetic analysis of plant species from these regions with available historical data.
The importance of recognizing the role of pre-European human-assisted plant dispersal goes beyond the Australia-Pacific region, and has broader implications for biogeographic studies of disjunct plant distributions around the world. Debates regarding disjunct plant distributions usually assume this is due to chance or transoceanic dispersal (e.g. [68]). The role of human-assisted dispersal is typically discussed in the context of European trade expansion and colonization of various world regions [77,78], but pre- and non-European human-mediated dispersal is rarely considered a possibility. In the absence of evidence, we see no reason to favour hypotheses of passive transoceanic dispersal as more parsimonious than explanations involving pre-European human interactions. There are other species in the genus Acacia s.l. with disjunct intercontinental distributions, such as Acacia heterophylla and Acacia koa [3], which are assumed to be the result of chance dispersal, but these should be reconsidered by including alternative hypotheses of human-assisted dispersal using multidisciplinary datasets including genetic, ecological, archaeological, historical, linguistic and social data.
There is a small, but growing, body of literature using this interdisciplinary approach to investigate the ancient human history behind the current biogeographic distributions of various plant species [79–84]. Further research of this kind may not only solve the enigma of arrival of A. farnesiana to Australia, but also demand fundamental reconsideration of the pre-European history of indigenous interactions throughout the world.
Data accessibility. The datasets supporting this study have been uploaded as part of the electronic supplementary material.