How successful are mutants in multiplayer games with fluctuating environments? Sojourn times, fixation and optimal switching

Using a stochastic model, we investigate the probability of fixation, and the average time taken to achieve fixation, of a mutant in a population of wild-types. We do this in a context where the environment in which the competition takes place is subject to stochastic change. Our model takes into account interactions which can involve multiple participants. That is, the participants take part in multiplayer games. We find that under certain circumstances, there are environmental switching dynamics which minimize the time that it takes for the mutants to fixate. To analyse the dynamics more closely, we develop a method by which to calculate the sojourn times for general birth–death processes in fluctuating environments.

The subscript σ is not relevant for the argument that follows, so we omit it for the remainder of this section of the Supplement. Given that each of the rates contains a factor x(1 − x), there are always two trivial fixed points, x = 0 and x = 1. The remaining fixed points of Eq. (S1) are the non-trivial roots of ω+(x) − ω−(x) = 0. This is equivalent to the condition that the expected payoffs of the two types are equal, $\sum_{j=0}^{n-1} \binom{n-1}{j} x^j (1-x)^{n-1-j} (a_j - b_j) = 0$, where a_j and b_j are the payoffs to a player of type A and of type B, respectively, when facing j opponents of type A in an n-player game. We note that $\binom{n-1}{j} x^j (1-x)^{n-j-1}$ is the probability for a given player to face j opponents of type A and n − 1 − j opponents of type B in an n-player game, if the fraction of type A-players in the population is x.

General construction
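The construction of the game with specified non-trivial fixed points x_1, ..., x_{n−1} is equivalent to finding coefficients c_j ≡ a_j − b_j, j = 0, 1, ..., n − 1, such that $\sum_{j=0}^{n-1} \binom{n-1}{j} c_j x^j (1-x)^{n-1-j} = 0$ for x ∈ {x_1, ..., x_{n−1}}. We construct a solution by induction. The problem is straightforward for n = 2, in which case it reduces to finding c_0 and c_1 such that c_0(1 − x_1) + c_1 x_1 = 0. One finds $c_1 = -c_0 (1 - x_1)/x_1$, with c_0 arbitrary. Suppose now that we have constructed c_0, ..., c_{n−1} such that x_1, ..., x_{n−1} are the roots of $\sum_{j=0}^{n-1} \binom{n-1}{j} c_j x^j (1-x)^{n-1-j}$. We can introduce a new root at x_n by multiplying by $(1 - x) + (1 - 1/x_n)\,x$, which equals $1 - x/x_n$. One then obtains $\sum_{j=0}^{n-1} \binom{n-1}{j} c_j \left[ x^j (1-x)^{n-j} + (1 - 1/x_n)\, x^{j+1} (1-x)^{n-1-j} \right]$. This can be written in the form $\sum_{j=0}^{n} \binom{n}{j} c'_j x^j (1-x)^{n-j}$ with $c'_j = \frac{n-j}{n} c_j + \frac{j}{n} (1 - 1/x_n)\, c_{j-1}$, using the convention c_{−1} = c_n = 0. This completes the inductive construction.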
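The inductive step can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `add_root`, `construct_coefficients` and `payoff_difference` are hypothetical, and the update rule is the one obtained by collecting terms after multiplying by (1 − x) + (1 − 1/x_n) x. The example roots are chosen to match the 4-player case shown in Fig. S2.

```python
from math import comb

def add_root(c, x_new):
    """Raise the degree by one and add a root at x_new.

    c holds c_0..c_{m-1} for sum_j C(m-1, j) c_j x^j (1-x)^(m-1-j).
    Returns c'_0..c'_m for the product with (1-x) + (1 - 1/x_new) x.
    """
    m = len(c)                      # degree of the new polynomial
    beta = 1.0 - 1.0 / x_new
    new = []
    for j in range(m + 1):
        keep = c[j] * (m - j) / m if j < m else 0.0
        shift = c[j - 1] * beta * j / m if j > 0 else 0.0
        new.append(keep + shift)
    return new

def construct_coefficients(roots, c0=1.0):
    """Build c_0..c_{n-1} for an n-player game with the given internal roots."""
    c = [c0]
    for x_k in roots:
        c = add_root(c, x_k)
    return c

def payoff_difference(c, x):
    """Evaluate sum_j C(d, j) c_j x^j (1-x)^(d-j) with d = len(c) - 1."""
    d = len(c) - 1
    return sum(comb(d, j) * c[j] * x**j * (1 - x)**(d - j) for j in range(d + 1))

# Example: 4-player game with internal fixed points at 0.25, 0.5 and 0.75.
roots = [0.25, 0.5, 0.75]
c = construct_coefficients(roots, c0=1.0)
residuals = [payoff_difference(c, x) for x in roots]
```

The residuals should vanish up to rounding error, and the first coefficient stays equal to the chosen c_0 at every step, consistent with c_0 being a free parameter.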

Three-player games
The resulting relations are relatively compact for three-player games. We will have internal fixed points at x_1 and x_2 if we choose $c_{1,\sigma} = c_{0,\sigma} \left[ 1 - \tfrac{1}{2} \left( 1/x_1 + 1/x_2 \right) \right]$ and $c_{2,\sigma} = c_{0,\sigma} (1 - 1/x_1)(1 - 1/x_2)$. We have here included the subscript σ to indicate the dependence of the game on the environmental state. The coefficient c_{0,σ} can be chosen arbitrarily; its sign determines the direction of the flow at x = 0, and hence the stability of x_1 and x_2, respectively. In the main paper we use c_{0,σ=1} = 1 and c_{0,σ=−1} = −1.
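To illustrate how the sign of c_0 sets the stability pattern, the sketch below evaluates the three-player payoff difference on either side of each internal fixed point. It assumes the coefficients that follow from two steps of the inductive construction, and it assumes that the flow is directed towards larger x wherever the payoff difference is positive. The fixed points x1 = 0.3 and x2 = 0.7 and the function names are hypothetical choices for illustration only.

```python
def three_player_coefficients(x1, x2, c0):
    """Coefficients (c0, c1, c2) giving internal roots at x1 and x2."""
    c1 = c0 * (1.0 - 0.5 * (1.0 / x1 + 1.0 / x2))
    c2 = c0 * (1.0 - 1.0 / x1) * (1.0 - 1.0 / x2)
    return c0, c1, c2

def payoff_difference(coeffs, x):
    """c0 (1-x)^2 + 2 c1 x (1-x) + c2 x^2 for a three-player game."""
    c0, c1, c2 = coeffs
    return c0 * (1 - x) ** 2 + 2 * c1 * x * (1 - x) + c2 * x ** 2

x1, x2 = 0.3, 0.7            # hypothetical internal fixed points
probe_points = (0.1, 0.5, 0.9)  # one point in each interval

signs = {}
for c0 in (1.0, -1.0):
    coeffs = three_player_coefficients(x1, x2, c0)
    signs[c0] = [1 if payoff_difference(coeffs, x) > 0 else -1
                 for x in probe_points]
```

Under the stated assumption, c_0 = 1 gives a flow towards x_1 from both sides (x_1 stable, x_2 unstable), and c_0 = −1 reverses every arrow.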

Analytical calculation of sojourn times
The sojourn time is the mean amount of time that the system spends in any one position i, given the starting position and environment. Both conditional (only trajectories which result in fixation contribute to the average) and unconditional sojourn times can be calculated. In the following subsections, we go about deriving expressions for the sojourn times in fluctuating environments. We note that the following discussion is only directly applicable to well-mixed populations with two species (strategies), but it would also be possible, in theory, to formulate the problem more generally. Obtaining closed-form solutions to this more general problem would be distinctly more complicated as the solution in the 2-species case relies on the use of telescopic sums. Another simplification we have made is the assumption that the environmental switching probability is independent of the state i of the population. The mathematical procedure is still possible without making such an assumption but things do simplify in our case.

Initial steps of the calculation
In order to compute sojourn times we introduce the quantity $\phi_{i,\sigma;j,\sigma'} = \mathrm{Prob}\left( \text{there exists a time } t \ge t_0 \text{ at which the system reaches state } j \text{, and when it does so for the first time the environment is in state } \sigma' \,\middle|\, \text{the population is started in state } i \text{ and the environment in } \sigma \text{ at initial time } t_0 \right)$. (S12)
We note that the initial time t_0 is immaterial, as the dynamics is Markovian. One then has a recursion relation for ϕ_{i,σ;j,σ'}, Eq. (S13), with the following boundary conditions, Eq. (S14), for 0 < j < N: $\phi_{0,\sigma;j,\sigma'} = 0$, $\phi_{N,\sigma;j,\sigma'} = 0$, and $\phi_{j,\sigma;j,\sigma'} = \delta_{\sigma,\sigma'}$. The first and second of these reflect the fact that the states i = 0 and i = N are absorbing, so once the population is in state i = 0 or i = N it remains there at all future times. In the third relation in Eq. (S14) we have i = j, so that trajectories started at (i, σ) reach state j immediately at t = t_0; they then only contribute to the above probability if σ' = σ.
For a fixed j, Eqs. (S14) impose constraints on ϕ_{i,σ;j,σ'} at i = 0, i = N and at i = j. Focusing on a given value of j, it is hence convenient to treat the cases i < j and i > j separately. Focusing first on j > i, we introduce auxiliary quantities ν_{i,σ;j,σ'} and, after substitution in Eq. (S13), arrive at a relation for them, Eq. (S17). We have used vector and matrix notation for convenience. Indices run over the states of the environment; for example, the components of ν_{k,σ;j} correspond to the index σ', and µ has entries µ_{σ→σ'}. We then use the constraints above to determine ν_{i,σ;j,σ'}, from which the probabilities ϕ_{i,σ;j,σ'} can be found. Similarly, for j < i we write λ_{i,σ;j,σ'} = ψ_{i,σ;j,σ'} − ψ_{i+1,σ;j,σ'} and proceed analogously. We now have at our disposal a means by which to calculate the complete set of probabilities ϕ_{i,σ;j,σ'}. Using these probabilities, one can then compute the unconditional and conditional sojourn times.
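As a cross-check on the analytical route, the first-hitting probabilities can also be obtained by a direct linear solve on the combined state space (i, σ). The sketch below is not the telescoping-sum method described above; it uses the standard absorbing-chain identity instead. The helper names, the toy transition probabilities, and the update order (population step in the current environment, followed by an independent environmental switch) are all assumptions made for illustration.

```python
import numpy as np

def first_hit_probabilities(P, targets, absorbing=()):
    """H[s, k] = probability that the chain started in s first enters the
    target set through targets[k]; a start inside the set counts as a hit."""
    P = np.asarray(P, dtype=float)
    n_states = P.shape[0]
    targets = list(targets)
    stop = set(targets) | set(absorbing)
    transient = [s for s in range(n_states) if s not in stop]
    H = np.zeros((n_states, len(targets)))
    for k, s in enumerate(targets):
        H[s, k] = 1.0               # phi_{j,sigma;j,sigma'} = delta
    if transient:
        Q = P[np.ix_(transient, transient)]
        R = P[np.ix_(transient, targets)]
        H[transient, :] = np.linalg.solve(np.eye(len(transient)) - Q, R)
    return H

def toy_chain(N, birth, death, mu):
    """Toy birth-death chain on i = 0..N with environments e = 0..len(mu)-1.
    Assumed update: population moves with birth[e]/death[e], then the
    environment switches with probabilities mu[e][e2]. State index: i*n_env+e."""
    n_env = len(mu)
    P = np.zeros(((N + 1) * n_env, (N + 1) * n_env))
    for i in range(N + 1):
        for e in range(n_env):
            interior = 0 < i < N
            up = birth[e] if interior else 0.0
            down = death[e] if interior else 0.0
            moves = {i + 1: up, i - 1: down, i: 1.0 - up - down}
            for i2, p in moves.items():
                if p == 0.0:
                    continue
                for e2 in range(n_env):
                    P[i * n_env + e, i2 * n_env + e2] += p * mu[e][e2]
    return P

N, n_env = 4, 2
mu = [[0.9, 0.1], [0.2, 0.8]]           # hypothetical switching probabilities
P = toy_chain(N, birth=[0.3, 0.3], death=[0.3, 0.3], mu=mu)
j = 2
targets = [j * n_env + e for e in range(n_env)]
absorbing = [0, 1, N * n_env, N * n_env + 1]
phi = first_hit_probabilities(P, targets, absorbing)
```

Row `i * n_env + e` of `phi` plays the role of ϕ_{i,σ;j,σ'} for fixed j, with the columns indexing σ'. In this toy example the population dynamics is a symmetric walk in both environments, so the row sums starting from i = 1 reduce to the gambler's-ruin value 1/2.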

Calculation of unconditional sojourn times
As the next step we compute $r_{i,\sigma;\sigma'} = \mathrm{Prob}\left( \text{there exists a time } t > t_0 \text{ at which the population returns to state } i \text{, and when it does so for the first time the environment is in state } \sigma' \,\middle|\, \text{the population is started in state } i \text{ and the environment in } \sigma \text{ at initial time } t_0 \right)$. (S21)
We stress the requirement that t be strictly greater than t_0, marking a difference compared to the above definition of ϕ_{i,σ;j,σ'}, where we only require t ≥ t_0. Hence, r_{i,σ;σ'} is in general distinct from ϕ_{i,σ;i,σ'} = δ_{σ,σ'}. The return probabilities can be expressed in terms of the ϕ_{i,σ;j,σ'}, Eq. (S22). We can now turn to sojourn times. Consider a trajectory which begins in state (i, σ). The probability of spending a total of t time steps in a particular state j, irrespective of the state the environment is in at that time, is then given by $q_t(j|i,\sigma) = \sum_{\sigma_1 \ldots \sigma_{t-1}} \phi_{i,\sigma;j,\sigma_1}\, r_{j,\sigma_1;\sigma_2}\, r_{j,\sigma_2;\sigma_3} \cdots r_{j,\sigma_{t-2};\sigma_{t-1}} \left[ 1 - \sum_{\sigma_t} r_{j,\sigma_{t-1};\sigma_t} \right]$.
The trajectory first has to reach state j, as indicated by ϕ_{i,σ;j,σ_1}; it then has to 'return' t times [in the sense of Eq. (S22)], as indicated by the factors r_{j,σ_1;σ_2} r_{j,σ_2;σ_3} ... r_{j,σ_{t−2};σ_{t−1}}; and it must then not return to j again, see the factor $1 - \sum_{\sigma_t} r_{j,\sigma_{t-1};\sigma_t}$. This can be written in a more compact matrix notation. The unconditional sojourn time is then the first moment of the distribution over t defined by q_t(j|i, σ). Letting $S_j = (I - r_j)^{-1}$, one can evaluate the series in closed form. Therefore, one can calculate the unconditional sojourn times once the probabilities ϕ_ij have been obtained as described above.
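The resummation of the series can be illustrated with a short numerical sketch. It assumes one particular counting convention, namely that a trajectory spending t steps at j contributes ϕ r^{t−1} (I − r) 1 with t ≥ 1 (first arrival, t − 1 returns, then no further return), so that the first moment becomes ϕ (I − r)^{−1} 1. The matrices below are hypothetical toy values, not results from the paper, and the function names are illustrative.

```python
import numpy as np

def mean_sojourn_closed(phi, r):
    """First moment via the resolvent S = (I - r)^(-1): phi @ S @ 1."""
    phi = np.asarray(phi, dtype=float)
    r = np.asarray(r, dtype=float)
    n = r.shape[0]
    S = np.linalg.inv(np.eye(n) - r)
    return phi @ S @ np.ones(n)

def mean_sojourn_series(phi, r, t_max=2000):
    """Same quantity from the truncated series sum_t t * phi r^(t-1) (I - r) 1."""
    phi = np.asarray(phi, dtype=float)
    r = np.asarray(r, dtype=float)
    n = r.shape[0]
    no_return = (np.eye(n) - r) @ np.ones(n)   # probability of never returning
    total = np.zeros(phi.shape[0])
    r_power = np.eye(n)                        # holds r^(t-1)
    for t in range(1, t_max + 1):
        total += t * (phi @ r_power @ no_return)
        r_power = r_power @ r
    return total

# Hypothetical two-environment example: rows index the starting environment
# sigma, columns the environment at first arrival (phi) or at return (r).
phi_j = [[0.3, 0.2], [0.1, 0.4]]
r_j = [[0.5, 0.1], [0.2, 0.4]]
closed = mean_sojourn_closed(phi_j, r_j)
series = mean_sojourn_series(phi_j, r_j)
```

The two results agree to numerical precision whenever the spectral radius of r_j is below one, which holds as long as the population eventually leaves state j for good with non-zero probability.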

Conditional sojourn times
The conditional sojourn times can be calculated in a similar way. We introduce a shorthand for the probability that the population spends exactly t steps at j before absorption, given that the starting point is (i, σ) and that the mutant reaches fixation.

n-Player games (n > 3)
We have discussed in the main part of the paper the case of 3-player games, which have 2 internal fixed points, and we have contrasted this with the case of 2-player games, which have 1 internal fixed point. One might well wonder about the observed phenomena in games with more than 3 players. In Figs. S2 and S3 we show the fixation probabilities and conditional fixation times for 4- and 5-player games, respectively. We also include the corresponding plots for 2-player games in Fig. S1 for comparison. The qualitative behaviour of the fixation probabilities is the same in the 2-, 3-, 4- and 5-player cases.
The conditional fixation times for the 2- and 4-player cases are very similar to each other, as are those for the 3- and 5-player games, suggesting that it is the pattern of stable and unstable fixed points that primarily determines the behaviour of the conditional fixation times. In particular, one only observes the minimum in the conditional fixation times when the fixed points at the two absorbing boundaries have opposite stability.
In the cases where the absorbing fixed points are of the same type (both stable, or both unstable), the environment where conditional fixation occurs most slowly is the one where the mutants are also most likely to succeed, σ = 1. Spending a lot of time in the σ = 1 environment, where the absorbing boundary at x = 1 is repulsive, is bound to make the conditional fixation time larger. By contrast, if the system spends more time in the environment where the boundary at x = 1 is stable, only the trajectories which are direct will tend to achieve fixation. So, the σ = −1 environment is more conducive to a low conditional fixation time when the absorbing boundaries are of the same type.
In the cases where the absorbing boundaries are of different types, the σ = 1 environment is both the environment where the boundary at x = 1 is attractive and the environment where the mutants are most likely to be successful. The mutants will be repulsed by the fixed point at x = 1 in the σ = −1 environment, but they will also be less likely to fixate and thereby contribute to the conditional fixation time. For this reason, neither environment has the lower conditional fixation time outright. Switching between the two environments allows a balance to be struck between these two effects, hence we observe the minimum in the conditional fixation times.
Figure S1. Fixation probabilities (top two panels) and conditional fixation times (bottom two panels) for 2-player games (1 internal fixed point). The starting environment is σ = 1 for the panels on the left and σ = −1 for the panels on the right. The fixed point is located at x = 0.5. The behaviour of the fixation probabilities is very similar for all the n-player games tested. The behaviour of the conditional fixation times turns out to be very similar to the 4-player case but not to the 3- or 5-player cases.
Figure S2. Fixation probabilities (top two panels) and conditional fixation times (bottom two panels) for 4-player games (3 internal fixed points). The starting environment is σ = 1 for the panels on the left and σ = −1 for the panels on the right. The fixed points are located at x = 0.25, x = 0.5 and x = 0.75. The fixation probabilities are qualitatively similar in the 2-, 3-, 4- and 5-player cases. That is, the system is more likely to fixate in one environment than the other and no novel effects due to the switching are observed. The conditional fixation times in this case are qualitatively similar to the 2-player case shown in Fig. S1.
Figure S3. Fixation probabilities (top two panels) and conditional fixation times (bottom two panels) for 5-player games (4 internal fixed points). The starting environment is σ = 1 for the panels on the left and σ = −1 for the panels on the right. The fixed points are located at x = 0.2, x = 0.4, x = 0.6 and x = 0.8. The fixation probabilities are qualitatively similar in all the 2-, 3-, 4- and 5-player cases. That is, the system is more likely to fixate in one environment than the other and no novel effects due to the switching are observed. The conditional fixation times in this case are qualitatively similar to the 3-player case.

Environments with the same direction of flow, but different magnitudes
One may also wonder about the case where the direction of flow between the fixed points does not change between the two environments, but the strength of the flow in different regions does, as illustrated in Fig. S4.
Figure S4. Strength of flow as a function of x, σ = 1 in red and σ = −1 in blue.
The fixation probabilities in Fig. S5 as a function of T and δ+ are qualitatively similar to the cases in Figs. S1, S2 and S3. This is to say that the σ = 1 environment is more conducive to fixation, and the more time that is spent in that environment, the more likely fixation is to occur. In this case, something very similar is also true of the conditional fixation times. The mutants fixate more quickly in the σ = −1 environment than in the σ = 1 environment. This is because the stable fixed point at x = 0.3 is more attractive in the σ = 1 environment than in the σ = −1 environment, as can be seen from the sojourn times in Fig. S6. These results indicate that the σ = 1 environment gives the higher probability of fixation, whereas the σ = −1 environment gives the lower conditional fixation time. There does not appear to be any novel behaviour, such as global minima or maxima, that occurs as a result of switching in this case.
Figure S6. Unconditional sojourn times (left) and conditional sojourn times (right). As the amount of time spent in the σ = 1 environment is increased, the effective strength of the stable fixed point at x = 0.3 is also increased. This acts to both decrease the likelihood of absorption at the x = 0 boundary and increase the conditional sojourn time.