The impact of global selection on local adaptation and reproductive isolation

Despite the homogenizing effect of strong gene flow between two populations, adaptation under symmetric divergent selection pressures results in partial reproductive isolation: adaptive substitutions act as local barriers to gene flow, and if divergent selection continues unimpeded, this will result in complete reproductive isolation of the two populations, i.e. speciation. However, a key issue in framing the process of speciation as a tension between local adaptation and the homogenizing force of gene flow is that the mutation process is blind to changes in the environment and therefore tends to limit adaptation. Here we investigate how globally beneficial mutations (GBMs) affect divergent local adaptation and reproductive isolation. When phenotypic divergence is finite, we show that the presence of GBMs limits local adaptation, generating a persistent genetic load at the loci that contribute to the trait under divergent selection and reducing genome-wide divergence. Furthermore, we show that while GBMs cannot prohibit the process of continuous differentiation, they induce a substantial delay in the genome-wide shutdown of gene flow. This article is part of the theme issue ‘Towards the completion of speciation: the evolution of reproductive isolation beyond the first barriers’.


Introduction
Felsenstein [1] pointed out that by any measure there are many more niches than species and demonstrated that gene flow is a strong homogenizing force that will tend to prevent populations adapting to different niches. As an illustration, each broad-leaved tree provides two equal-sized niches (patches) for feeding caterpillars: under-leaf and over-leaf. These niches likely differ in predator pressure. Why then does each species not bifurcate into an under-and over-leaf phenotype? It seems they cannot, even though variation exists in nature for under-and over-leaf caterpillar lifestyles. Felsenstein's answer is that the divergent selection caused by differing predator pressure on the pair of patches is insufficiently strong to overcome the homogenizing effects of gene flow between incipient patch-populations.
Several verbal models and simulation studies have framed the process of speciation as a tension between local adaptation and the homogenizing force of gene flow [2][3][4][5]. Flaxman et al. [2] introduced BU2S (build-up-to-speciation), a model of divergent selection acting on many loci between two populations connected by gene flow. Intended as an extension of Felsenstein's model, they concluded that their model has an emergent property: populations can adapt to the opposing environmental stresses experienced in a pair of equal-sized patches even in the face of strong gene flow and, in the process, become (partly) reproductively isolated. Multilocus extensions deriving the conditions under which a barrier with gene flow can be maintained have been described previously (e.g. [6]), and so, on superficial inspection, these results seem plausible.
The population genetics or sweep-based model simulated by Flaxman et al. [2] assumes that local adaptation is neverending, i.e. each new mutation confers a fixed selective advantage locally. By contrast, other simulation studies of a pair of populations connected by gene flow have taken on the quantitative genetics point of view by modelling an explicit phenotype and studying local adaptation to a new set of fixed local optima [3,5]. However, while these sweep-based and trait-based studies of divergent selection make different assumptions about the genetic basis of local adaptation (an eternal stream of local sweeps versus adaptation to a fixed set of local phenotypic optima), they share an important feature: only locally beneficial mutations (LBMs) that affect the trait(s) under divergent selection are considered. This 'adaptationist' simplification ignores a central tenet of Darwinian evolution: namely, that the mutational process is blind to changes in the environment [7] and therefore will tend to limit adaptation.
Ignoring deleterious mutations, a substantial fraction of new beneficial mutations must be advantageous in many environmental contexts. Returning to our toy example of caterpillars in over-and under-leaf patches, such globally beneficial mutations (GBMs) include all variants that increase caterpillar fitness in both leaf patches as well as any mutation that increases fitness at the adult stage. Even in the presence of a barrier to gene flow, such GBMs will tend to selectively sweep across patches [8]. Previous analytic work has focused on the interaction between GBMs and LBMs in the context of adaptive introgression: how likely are GBMs to introgress from one species into another if they are linked to alleles with locally deleterious effects [8][9][10]. Uecker et al. [10] show that the probability of a single locally deleterious allele hitchhiking to fixation decays to zero over a distance 1/2N e r, where N e is the effective population size and r is the scaled recombination rate. For example, given parameters for modern humans (N e = 10 000 and r = 10 −8 ), the relevant distance is ca 5kb, shorter than the average gene. Given that many sexual species have higher scaled recombination rates [11], recombination should be frequent enough to prevent the majority of locally deleterious mutations from hitchhiking to fixation [9]. While the sweep dynamics of individual variants that are linked to locally deleterious mutations have been characterized in some detail [9,10], the flip-side of this interaction has received little attention: to what extent is local adaptation impeded when there is aconstant supply of both GBMs and LBMs arising in a genome? Moreover, the hitchhiking-fixation of locally deleterious mutations is not necessarily the main factor slowing down adaptation. Instead, local adaptation may be limited chiefly by Hill-Robertson interference, the process whereby ongoing sweeps of mutations at partially linked loci decrease each others' probabilities of fixation and build up negative linkage equilibrium [12] between GBMs and LBMs prior to their fixation.
Here we use simulations to investigate how the presence of GBMs affects the process of divergent local adaptation and reproductive isolation. We extend the existing simulation frameworks of [2,3,5] in which divergence evolves under a constant high rate of mutational influx. In these models, local adaptation involves many loci and the dynamics resulting from the selective interference of LBMs and GBMs cannot be captured by the analytic results that are available for the simpler case of a single introgressing locus.
Specifically, we (i) ask to what extent adaptation to locally divergent trait optima is impeded by selective sweep interference from GBMs, (ii) assess how the effect of GBMs depends on the assumptions about local adaptation (trait-based versus sweepbased models) and (iii) consider how GBMs influence the evolution of reproductive isolation.

Methods (a) Trait-based model of local adaptation
We study the impact of global selection by adding globally beneficial mutations (GBMs) to a multilocus model of divergent local adaptation similar to that studied by Yeaman & Whitlock [5] and Rafajlovićet al. [3]. Simulations under this trait-based model were implemented in SLiM3.3 [13]. We consider two Wright-Fisher populations (discrete non-overlapping generations) of N e diploid individuals that exchange, on average, M = 4N e m migrants per generation and experience soft selection. We consider M = 1 throughout, which, in the absence of GBMs, guarantees rapid local adaptation [3]. Recombination is modelled by assuming that cross-over events occur uniformly at random, i.e. we ignore gene conversion and physical constraints on double cross-over events. Both populations start out perfectly adapted to a shared local optimum set at 0 until the onset of divergent selection. An instantaneous change in the environment shifts their optima to θ + , and θ − = −θ + , respectively. We thus assume that adaptation is de novo and that the optima are stationary. We further assume that the phenotype of each individual is determined by the sum of effect sizes of all mutations affecting the trait under divergent selection. As in Rafajlovićet al. [3], the fitness of an individual in population i with phenotype z and with selection strength σ is given by (2:1) Here, σ denotes the strength of selection on the divergently selected trait and is assumed to be the same in both populations. σ affects the width of the fitness curve, with smaller values representing weak selection. Similar to Rafajlovićet al. [3], we explore two values of σ representing weak and strong selection on the trait under divergent selection (table 1). New offspring are generated each generation until the fixed population size N e is reached. Each individual's chance of being the parent to each new offspring is determined by its fitness W. Locally beneficial mutations (LBMs) arising at a total rate U l per individual per generation, are assumed to be codominant with effect sizes drawn from a mirrored exponential distribution with mean α. Assuming θ = 2 and mean mutation effect sizes of 0.01, on average 100 mutations are required for perfect adaptation. Mutations can occur with equal probability at each position along a contiguous chromosome of map length R. We assume a large number of evenly spaced sites (100 000), such that the probability of multiple mutations occurring at the same site becomes negligible (approaching an infinite sites model). If a site does get hit by a second mutation of the same type (LBM or GBM, see below), it erases the previous one (house of cards model). This differs from the model studied by Yeaman & Whitlock [5] and Rafajlovićet al. [3] which assumes that mutations occur at a small number (50-2000) of evenly spaced loci with effects accumulating at loci that are hit by several mutations (continuum-of-alleles model).

(b) Adding globally beneficial mutations
We assume that both populations are also adapting to a shared moving optimum on a second orthogonal trait. This global optimum represents an n-dimensional vector coordinate in the space of all traits that are not under divergent selection. We assume that this trait space is so large that the optimum will never be stationary long enough for a population to achieve perfect adaptation. To this end we further assume the population always trails this optimum at a fixed distance θ + (the same value as for the optimum of LBMs). This makes direct comparison with the distribution of fitness effects (DFE) of LBMs possible at all times. Note however that we assume that there is no pleiotropy, i.e. each beneficial mutation contributes to either local adaptation or global adaptation, but not both. This ensures that local adaptation can be limited only by genetic hitchhiking and sweep interference of LBMs and GBMs but not pleiotropic effects of mutation on local and global fitness.
To reduce computation time we do not simulate the moving optimum explicitly, but rather draw selection coefficients for GBMs from an exponential distribution with mean s g . We combine the effects of LBMs and GBMs on an individual's relative fitness in population i by multiplying (2.1) by w j = 1 + 0.5s j (heterozygotes) or w j = 1 + s j (homozygotes) for all n g GBMs, thus assuming codominance. This results in: We determine s g by calculating the selection coefficients for all LBMs bringing an individual with phenotype = 0 (at the onset of phenotypic divergence), Δz closer to θ (with Δz drawn from an exponential distribution with mean α). We then match the best-fitting exponential distribution to the resulting distribution of selection coefficients. Therefore, at the onset of local selection the distributions of selection coefficients of GBMs and LBMs are identical. Note however that the DFE of LBMs changes with approach to the local optima.
To measure the impact of GBMs on trait divergence, we track the mean phenotype in both populations for the trait under local selection. Phenotypic divergence is measured as the difference Δz between population means, where 'perfect' differentiation corresponds to Δz = 2θ + . We also track the time and population of origin of LBMs and use the tree sequence recording in SLiM3.3 to record and assess the distribution of cross-population coalescence times at the end of each run (T) using tskit [14].
To facilitate comparisons between scenarios with and without GBMs, the rate U l of LBMs is kept constant throughout. Reduced divergence is therefore not caused by a reduction in the mutational supply of LBMs. Since the relative ratio of GBMs to LBMs is unknown in nature, we explore varying U g /U l but ensure that the total rate of beneficial mutations (U g + U l ≤ U) is biologically plausible given empirical estimates of de novo mutation rates and the distribution of fitness effects (see §2d 'Choice of parameter space'). For each parameter combination, we run 200 replicate simulations.

(c) Globally beneficial mutations in a sweep-based model
We re-implement the model studied by Flaxman et al. [2] in SLiM3 [13,15] by making minor changes to the trait-based model described above. For ease of comparison, we keep model assumptions and parameters (U l , U g , M, T) the same whenever possible (table 1). As before, we assume no pleiotropy, i.e. new mutations are either LBMs or GBMs. However, the effects of LBMs on fitness no longer depends on an explicit phenotype and a fitness function relating it to an optimum but are drawn directly from an exponential distribution with mean s l . Each LBM i confers a homozygous fitness w 1i = 1 + s i in population 1 and w 2i = 1/(1 + s i ) in population 2. Selection is thus purely directional. As before, co-dominance is assumed and fitness effects of all n g GBMs and n l LBMs (at any point in time) across multiple loci are multiplicative. Selection coefficients for GBMs and LBMs are drawn from the same exponential distribution. The impact of GBMs on fitness is modelled in the same way as before resulting in:

(d) Choice of parameter space
We seek to cover a biologically plausible range of parameters including humans and Drosophila. We focus on two scenarios that are loosely motivated by two contrasting estimates of the  [16], using inferences based on the site frequency spectra of synonymous and non-synonymous nucleotide sites, estimated that a fraction p a = 0.01 of mutations is selectively advantageous with a mean effect size of N e s = 5. By contrast, recent estimates based on the correlation between within-population synonymous nucleotide site diversity of a gene and its divergence from related species at non-synonymous sites suggest that beneficial mutations may be much rarer ( p a = 10 −4 ) [17] but of larger effect (mean N e s = 250). Given that both estimates are likely biased by their contrasting assumptions about the effect of selection on linked neutral sites and the fact that targets of selections are unlikely to be uniformly distributed in real genomes, we have implemented both a 'weak-frequent' (weak selection, frequent mutations) and a 'strong-rare' scenario and explored a range of U values for each case (see table 1 for parameter values). All simulations were run for two different lengths of sequence R = 0.5 and 0.05 morgan (M). Assuming a typical Drosophila chromosome (R = 0.5 M) of length 50 Mb, a rate of de novo mutation μ = 3.5 × 10 −9 per base and generation [18] and p a = 0.01 (i.e. the weak-frequent scenario), we expect a total rate of beneficial mutations U = 0.00175 per generation.

Results and discussion
The results section is structured as follows: we first describe the effect of globally beneficial mutations (GBMs) on adaptation to a fixed set of local trait optima. We then ask how GBMs affect genome-wide divergence measured in terms of the distribution of between-population coalescence times and consider their influence on clustering of locally beneficial mutations (LBMs). Finally, we investigate GBMs in the context of the more extreme model of continued divergence considered by Flaxman et al. [2] and discuss their effect on the evolution of reproductive isolation during speciation. We focus primarily on the results for the larger map length (0.5 M), which is biologically more realistic (analogous to one chromosome).

(a) GBMs reduce local adaptation
Comparing the mean divergence along the local trait axis between simulations with and without GBMs shows clearly that GBMs reduce mean phenotypic divergence and thus local adaptation (Δz in figure 1). Unsurprisingly, the effect of GBMs on local adaptation depends on the relative mutation rates of GBMs to LBMs (U g /U l , yellow versus orange in figure 1). We do not find any qualitative differences between the 'strong-rare' and the 'weak-frequent' scenario.
Rather than slowing down local adaptation, GBMs halt the process at a certain point, thus decreasing the (maximum) possible mean trait divergence between populations. In other words, GBMs induce a constant genetic load [19] by reducing the proportion of locally beneficial LBMs segregating in the population and increasing the proportion of locally maladaptive LBMs (figure 2). Although the mean trait divergence asymptotes to an equilibrium value below 2θ + , we want to emphasize that this equilibrium is highly dynamic, i.e. subject to high variance both within and among individual runs (figures 1 and 3). For large map lengths (0.5 M), GBMs do not substantially impede local adaptation but lead to more erratic evolutionary trajectories. Assuming a shorter map length (0.05 M), i.e. GBMs arising in much closer association with LBMs, leads to a more pronounced reduction of local adaptation (electronic supplementary material, figure S1).
To understand why the observed dynamics differ from the impact of single introgression events of beneficial mutations (here GBMs) associated with a deleterious load (here foreign LBMs), it is useful to consider the continuous process of sweeping GBMs during the adaptive trajectory of a population: at the onset of divergent selection, each population is displaced far from the new local optimum. As a consequence, the initial local adaptation is due to a few mutations of large effect that rapidly sweep to fixation [20]. During this phase, which involves a small number of rapid sweeps of large effect LBMs, segregating GBMs are unlikely to occur in close proximity to these LBMs. The subsequent approach to the optimum proceeds through mutations of progressively smaller effect sizes [20] with longer sojourn times. Given the increasing number of segregating LBMs, the probability that GBMs arise in close proximity increases, as does the chance of deleterious LBMs rising to appreciable frequencies before they are dissociated from sweeping GBMs by recombination ( figure 3). As the fitness landscape becomes less steep upon approaching the optimum, the time it takes and the frequency at which locally deleterious LBMs recombine away from sweeping GBMs is increased [10]. Once a LBM is at appreciable frequency, its trajectory will be affected by GBMs that arise sufficiently close (and in the same population).
To summarize, the effects of sweeps on the adaptive trajectories of populations overlap (in time) for many generations. Thus analytical results for sweeping GBMs occurring in isolation from each other (single introgression events) cannot capture the potential sweep interference between LBMs and GBMs. For example, although the frequency trajectory of an established LBM can only be affected by GBMs in a small fraction of genome around it [9,10], introgression happens continuously. Even if deleterious LBMs recombine away and are purged, their sojourn time in the 'wrong' population will be increased relative to a scenario without GBMs (figure 2). Sweep interference results from the time overlap of these sojourns, and its potential to limit adaptation is increased with GBMs.
Additionally, while the probability of any individual GBMs dragging locally deleterious LBMs to fixation is low, such events do accumulate (figure 2b, right) and are compensated by new LBMs. This turn-over is visible in the age distribution of LBMs which, in the presence of GBMs, shows a skew towards the recent past (figure 2a). By contrast, without GBMs, LBMs contributing to local adaptation tend to be old, i.e. have fixed during the initial phase of local adaptation (figure 2, left).

(b) GBMs reduce genome-wide divergence and increase clustering
The evolution of reproductive isolation can be characterized as a gradual reduction in effective migration rate (m e ) [21]. A measurable (in practice) metric for long-term m e is the genome-wide distribution of pairwise coalescence times between species, f(T 2 ). As long as gene flow is ongoing, recent coalescence is likely. However, once reproductive isolation is complete, f(T 2 ) is fixed and can only shift pastwards. Inspection of f(T 2 ) genome-wide (averaged across simulation replicates) shows that in the absence of GBMs, selection against maladapted migrants reduces m e [22] and hence the chance of recent coalescence compared with the neutral expectation (figure 4 and electronic supplementary material, figure S6). By contrast, the continued global sweeps induced by introgressing GBMs have the opposite effect of increasing m e and the fraction of genome with recent ancestry. The question of whether local adaptation in the face of gene flow leads to clustered genetic architectures for local adaptation has received much attention [3,5]. Given that Yeaman & Whitlock [5] described weak selection promoting clustering, we compared the distribution of pairwise distances between consecutive LBMs in the presence and absence of GBMs for the 'weak-frequent' scenario. Similar to Yeaman & Whitlock [5], we conditioned on LBMs that contribute to phenotypic divergence. Summarizing this royalsocietypublishing.org/journal/rstb Phil. Trans. R. Soc. B 375: 20190531 distribution across simulation replicates and weighting LBMs by their effect size shows that the presence of GBMs increases clustering (electronic supplementary material, figure S7). However, we stress that even over the long timespan we consider, both the amount of clustering that emerges overall and the increase in clustering due to GBMs are small. While our results confirm previous findings that clustering emerges even in the absence of GBMs, a direct comparison with Yeaman & Whitlock [5] and Rafajlovićet al. [3] is challenging given the differences in models: while we approximate an infinite sites model, Yeaman & Whitlock [5] and Rafajlovicé t al. [3] assume a limited number of loci at which mutational effects build up. Thus their model has clustering inbuilt and it is perhaps unsurprising that, given enough time, large effect loci arise [5, Fig. 4].
Booker et al. [24] have recently shown that GBMs also present a challenge for attempts to infer the targets of divergent selection in genome scans and may be impossible to distinguish from LBMs using F st [25,26]. However, given the different effects LBMs and GBMs have on the distribution of between-population coalescence times, it should be possible to distinguish them in real data using richer summaries of sequence variation than F st . Indeed, powerful methods for detecting individual introgression sweeps already exist [27]. An interesting direction for future work will be to exploit the information contained in the distribution in coalescence times around a large number of putative targets of local and global selection in genome-wide data. This can only increase the power for such inference. royalsocietypublishing.org/journal/rstb Phil. Trans. R. Soc. B 375: 20190531 more extreme, though arguably less realistic. Flaxman et al. [2] do not consider phenotypes explicitly, but instead assume that local selection is never-ending, which is equivalent to assuming eternally diverging local optima. While in this case strong reproductive isolation (genome congealing) is inevitable, GBMs still affect the speed at which reproductive isolation evolves. Although GBMs do not result in any increase in short-term m e measured over a single generation as the relative average fitness of migrants (electronic supplementary material, figure S8), their long-term cumulative effect is to delay the completion of strong reproductive isolation. This delay in the genome-wide shutdown of gene flow is reflected in the widening of the distribution of between-population coalescence times (electronic supplementary material, figure S9).

Conclusions and future directions
We set out to address an 'adaptationist' imbalance in existing models of evolution under divergent selection. We believe there is no reason to assume that, following the onset of local adaptation, all beneficial mutations will be subject to divergent selection. We termed the remaining fraction globally beneficial mutations (GBMs) and studied their impact on local adaptation and the emergence of reproductive isolation under two existing models.
We show that GBMs lead to more erratic evolutionary trajectories during local adaptation in the face of gene flow and may slow down the evolution of reproductive isolation regardless of whether we consider local adaptation to a fixed set of optima or as a runaway process. Our implementation approximates an infinite-sites-model with uniformly distributed mutational targets for both locally beneficial mutations (LBMs) and GBMs. Given the importance of linkage, any deviation from our assumption of a polygenic trait under divergent selection, or from the assumption that mutational targets for LBMs and GBMs do not arise in strictly separate parts of the genome, will reduce the negative impact of GBMs on local adaptation. We only considered U g /U l of up to 2. While the ratio of GBMs to LBMs is an unknown quantity in nature (again depending on the distribution and the number of potential sites of LBMs and GBMs relative to each other), it seems plausible that there are many more mutations that affect global fitness than local adaptation.
Our intent was to explore a more general mutational model for adaptation under divergent selection. However, for the sake of comparability, we have followed many of the same simplifying assumptions as existing work, i.e. complete symmetry between populations both in terms of their demography (N e , M) and selection (an instantaneous switch to symmetric trait optima). We have also introduced our own symmetric simplification: the same DFE for GBMs and LBMs. Given that such many-fold symmetry is biologically unrealistic, an important task for future work is to test the robustness of these results to violations of symmetry. While we can venture to make educated guesses about relaxing symmetry assumptions for individual parameters (e.g. GBMs should reduce local adaptation in both populations even when only one is given a new optimum), the effects of GBMs under a completely general model are difficult to intuit. Further research with these kinds of models would benefit from studying the impact of hard selection, given that these models rely on continuous diverging selection for very long time periods. It would also be helpful to understand to what extent GBMs can facilitate the local fixation of recombination modifiers such as inversions [28] or chromosomal fusions.
The theme of this special issue is the evolution of strong reproductive isolation during speciation. It might be suggested that our results are of little relevance since they appear to address primarily the initial evolution of reproductive barriers through adaptation to stable but divergent trait optima in two populations connected through migration. We show that in this case, migration shuts down only in regions of the genome that contribute to local adaptation causing partial reproductive isolation. However, we would argue that the equilibrium shown in figure 1 corresponds to a stable endpoint, despite the fact that barriers between  Figure 4. The genome-wide distribution of pairwise coalescence times f(T 2 ) for the 'strong-rare' scenario. The neutral expectation (assuming M = 1) [23, eqn 10] is shown as a black solid line; distributions in the presence (U g /U l = 2) and absence of GBMs are shown in tan and grey respectively. 69% (without GBMs) and 99% (with GBMs) of pairwise coalescence times are smaller than 10 × 2N e generations. Data correspond to figure 1(b,iii). The x-axis is truncated at 10 × 2N e . royalsocietypublishing.org/journal/rstb Phil. Trans. R. Soc. B 375: 20190531 locally adapted taxa remain permeable and globally beneficial genes can introgress with negligible delay [8]. Biologists studying gene flow in nature discarded the biological species concept (BSC) more than a quarter of a century ago [29] because its simplistic requirement of complete reproductive isolation, while attractive for the purposes of categorization, contradicts the widespread evidence of hybridization. There is now strong evidence from genomic data that species barriers have been permeable across entire radiations (e.g. Heliconius butterflies [30]). Likewise, the discovery that humans have received adaptive introgression across permeable species barriers from archaic hominins [31] has not led to a re-categorization of hominins as a single BSC species owing to their incomplete reproductive isolation. While we tend to focus on speciation as the evolution of complete isolation and the so-called congealing of the genome, partial reproductive isolation may be an alternative and evolutionarily stable endpoint, even in the presence of other mechanisms that reduce genetic exchange between locally adapted populations such as epistasis and reinforcement [32].
Taxa that maintain distinctive genomes in the face of gene flow are separated by permeable reproductive barriers. If maintenance is exogenous, for example depending on niche existence, environmental change may make them transient [33]. If endogeneous barriers develop, long-term persistence becomes more likely. In this context, it would be interesting to test how incorporating epistasis between LBMs (for example, by modifying the exponent in the fitness function [34]) in our model would affect the evolution of reproductive isolation. Although if we assume adaptation to be highly polygenic, epistasis is likely to only have modest effects [35].