The impact of time delays on mutant fixation in pairwise social dilemma games
Abstract
Evolutionary game theory examines how strategies spread and persist in populations through reproduction and imitation based on their fitness. Traditionally, models assume instantaneous dynamics where fitness depends on the current population state. However, some real-world processes unfold over time, with outcomes emerging from history. This motivates incorporating time delays into evolutionary game models, where fitness relies on the past. We study the impact of time delays on mutant fixation in a Moran Birth–death process with two strategies in a well-mixed population. At each time step of the process, an individual reproduces proportionally to fitness coming from the past. We model this as an absorbing Markov chain, allowing computational calculation of the fixation probability and time. We focus on three important games: the Stag-Hunt, Snowdrift and Prisoner’s Dilemma. We will show time delays reduce the fixation probability in the Stag-Hunt and the Prisoner’s Dilemma but increase it in the Snowdrift. For the Stag-Hunt and the Prisoner’s Dilemma, time delays lengthen the fixation time until a critical point, then reduce it. The Snowdrift exhibits the opposite trend.
1. Introduction
Evolutionary Game Theory (EGT) provides a framework for understanding how effective strategies spread through imitation or reproduction and persist in systems ranging from biology to economics [1–6]. The theory gives insights into phenomena like cooperation, competition and diversity in populations [7–13]. It examines how strategies or types of individuals change in frequency in a population based on their fitness from interactions. Strategies that produce higher fitnesses will tend to spread and become more common as individuals reproduce or copy them. Traditional EGT models have primarily focused on well-mixed, infinitely large populations, where individuals interact at random and strategy frequencies evolve deterministically according to replicator dynamics or other differential equation models [2–4,14]. However, real-world populations are typically finite and structured. To address these limitations, recent work has incorporated stochastic processes, such as the Moran process, into EGT models to study evolutionary dynamics in finite populations [4,15,15–25].
The classical concept of an Evolutionarily Stable Strategy (ESS) does not fully capture evolutionary dynamics in finite populations. A true ESS must now satisfy two criteria, it must resist invasion by rare mutant strategies, and its probability of replacement by a rare mutant must be lower than under neutral drift [15,16,19,20]. The first requirement provides an equilibrium condition, while the second provides a stability condition. Selection opposes invasion when mutants cannot attain higher fitness than resident individuals in a pure population. It also opposes fixation when mutants have no better chance of reaching fixation than neutral mutations. Satisfying both criteria helps ensure a strategy is robust and maintains equilibrium [6]. Therefore the study of fixation probabilities became a crucial component of comprehending EGT.
The Moran process is a classic model for studying evolution in finite populations [26]. In this stochastic process, a population of individuals with different types undergoes birth and death events over discrete time steps. At each step, one individual is selected to reproduce and its offspring replaces another chosen individual [27]. The individual’s fitness can influence birth and death events in this process. In general, individuals reproduce with a probability proportional to their fitness and replace another individual at random (Birth-death (Bd) process). We note that this process can also happen in other ways, and there are a number of different dynamics used. Fitness can be constant or frequency-dependent based on game interactions between individuals [15,16]. Typically the population is assumed well-mixed so that all individuals have the chance to be replaced by others. In this process, we track the fate of any emergent mutant, assessing the probability and the time that this single novel mutant ultimately fixes in the population (fixation probability and time). Also monitoring the trajectory of the mutant through the population provides insight into the dynamics of how novel mutations establish and propagate themselves in populations.
The fixation probability and time have been extensively studied across various models [28–37]. Previous work has focused on how factors like population size [28,34–36], environmental fluctuations [38–41] and population structure [42–51] influence mutant fixation. However, most of these studies assumed that fitness is determined instantaneously based on the current population state and payoffs. This assumption may not always be realistic, as biological and social processes often involve temporal delays [52,53]. It is therefore natural for EGT models to incorporate time delays, where fitness depends on history. The effects of time delays have been explored in the context of replicator dynamics for games with an interior stable equilibrium [54–64]. In [54], the authors studied a model where individuals at time t imitate strategies that had higher average payoffs at time for some delay . They showed the interior fixed point is locally asymptotically stable for small , but becomes unstable for large , leading to oscillations. In [55], individuals are born after their parents play and receive payoffs. Two coupled equations governed strategy frequency and population size. Analysis showed oscillations are not possible, with the original fixed point globally asymptotically stable regardless of the time delay. In [56], the authors modified the above model by allowing time delays to vary based on the strategies used by individuals. Under this framework, they observe novel behaviour where the equilibrium states become dependent on the delays. Most previous work only considered replicator dynamics and the impact of delays on equilibrium stability. Less attention has been given to finite populations and the effect of delays on the fixation process.
Our study looks into the influence of time delays on the fixation process in evolutionary games. We explore the impact of time delays on the fixation probability and fixation time in a well-mixed population comprising two strategies, examining a generalized Moran Bd process, where the fitness of both strategies at time step hinges on the fitness values from steps previously. By constructing an absorbing Markov chain, whose states are defined by indices, indices representing the history, and one the current mutant abundance, we determine the transition matrix, enabling us to analyse the fixation probability and fixation time for various game types.
The paper is structured as follows. Section 2 reviews the Moran process definition and basic properties for two-strategy games. Section 3 introduces a general framework to calculate absorption probabilities and times in Markov chains related to the process. Section 4 presents our model incorporating time delays into the Moran Bd process and defines a new associated Markov chain. Section 5 details the results of our model and the effects of time delays on the fixation probability and time across different games. The paper finishes with a discussion of the key findings and a summary of the main results.
2. Moran process
We consider the evolutionary dynamics of individuals adopting two distinct strategies, A and B within a well-mixed population, the game being played characterized by the payoff matrix [16]:
When a player with strategy A interacts with an A player (a B player), they receive a payoff of a (b). On the other hand, a B player receives a payoff of c (d) from an A (B) player. Consequently, the average payoffs when there are individuals with strategy A reads as follows:
where self-interactions are excluded. The corresponding average fitnesses are proportional to:
The variable represents the intensity of selection, indicating how much the payoff of individuals affects fitness [15]. When approaches zero, each individual’s payoff makes a minimal contribution to overall fitness, leading to what is known as weak selection. When , all individuals possess the same level of the fitness, resulting in neutral drift. Conversely, as increases, the influence of payoffs on fitness grows and we have strong selection.
There are various update rules when individuals update their strategies during each time step of a process [15,16,29,42,65,66]. We use the Bd process, where the fitness of individuals is considered in the Birth event, which is a type of frequency-dependent Moran process [15,16]. During each time step, an individual is chosen for reproduction based on their fitness. The newly born individual then replaces a randomly selected member of the population, ensuring that the total population size, , remains constant. At each time step, can change by a maximum of one based on the following transition probabilities:
Here the objective is to examine the Bd process involving a population of residents with strategy B when introduced to a single mutant with strategy A. This process ultimately results in one of two possible outcomes: the complete replacement of the population by the mutant, known as fixation, or the eradication of the mutant, referred to as extinction. Two quantities hold significant importance in this process, the fixation probability, which indicates the probability of the mutant becoming fixed in the population, and the fixation time, which represents the number of time steps required for the fixation [34,49,67]. There are three main approaches used to analyse the fixation process: direct analytical solutions, numerical methods using transition matrices of Markov chains, and Monte Carlo simulations. Analytical solutions can be found for simple Moran processes without population structure or with regular structure, providing formulae for fixation times and probabilities. However, as models become more complex, researchers must study the dynamics using numerical techniques and Monte Carlo simulations.
In general, we can consider an absorbing Markov chain associated with the process to investigate and compute the probability and time of fixation. Figure 1 provides a visual representation of the Markov chain. The state is determined by the number of mutants present in the population . Essentially, the Markov chain represents a random walk on sites ranging from to . Starting from state , the walk eventually reaches either the absorbing states of (mutant extinction), or (mutant fixation). The transition matrix is defined as follows:

Figure 1. Absorbing Markov chain with states associated with the Moran Bd process on population with size . Each state is determined by the number of mutants in the population . States and are the absorbing states related to the extinction and fixation of mutants, respectively.
The fixation probability from state with initial mutants with strategy A under a Bd process can be calculated using the recursive equations:
Using this recursive relationship and the boundary conditions that and , the fixation probability for a single mutant is [34,67]
The same recursive equation applies for the absorption time and fixation time as follows:
With the initial conditions and we have [34,67]:
3. General framework to calculate the absorbing probability and time for Markov chains
To evaluate the absorption probabilities and time for Markov chains with states ( transition state and absorbing states), a widely employed technique in numerical analysis, as outlined in [49,68], can be employed. The transition matrix can be re-expressed in a canonical form as follows:
where denotes the probabilities of transitions between the transient states, signifies the probabilities of transitions from the transient states to the absorbing states and represents an identity matrix. The Moran process in a well-mixed population is related to a Markov chain with states, including two absorbing states at and . Based on the canonical transition matrix, we can construct a new matrix denoted by and called a Fundamental matrix. The element represents the expected number of steps that the process which started in state spends in state before eventually reaching one of the absorbing states (the expected sojourn time) [49,68]. Based on this matrix, we can derive another matrix:
where represents the probability that the process becomes absorbed in absorbing state , given that it starts in state . For a Moran process in a well-mixed population, the matrix is composed of two columns, representing extinction and fixation probabilities.
The absorption time starting from state can be obtained by summing the row of the fundamental matrix and is given by the following expression:
To determine the fixation time, we employ a method described in previous studies [49,69,70]. If represents the fixation probability starting from mutants, then the fixation time starting from state can be calculated as follows:
By identifying the Markov chain and associated transition matrix for various models, we can utilize this numerical method to determine the fixation probability and times.
4. Model and methods
A random process should have a starting point. In our model, this is the time that a single mutant appears in the population, labelled time . In the above, we assume that the process is homogeneous, so that transition probabilities depend only upon the current state and not on the specific time that the state is reached. Thus for example in equation (2.5), the transition and fixation probabilities are independent of the specific time point, and so are not functions of time . In the model developed below, we will similarly consider the process homogeneous, but develop the concept of the state of the process to include not only the population size at present, but also at the most recent times, to incorporate time delays. This still leaves the question of how to deal with times too close to the start of the process for the full required history to exist. We explain how we do this, by selecting a specific initial state, in §5a.
Now we introduce a time delay in the Bd process for a well-mixed population. In the standard Bd process, the fitness of strategies at the current (discrete) time is important for the birth of new offspring. However, an individual’s fitness can depend on conditions in the past. Thus, we incorporate a time delay , so that the fitness of strategies comes from time , considering the same time delay for all strategies. Thus, both mutant and resident strategies receive payoffs from steps before. Since the Bd process proceeds in discrete time steps, takes integer values in the range. The case corresponds to the standard Bd process described previously.
To study this process, we model it as a Markov chain and use the previous numerical approach. To fully characterize each Markov chain state, we need to know both the number of mutants at the current time and the historical ones from time to the present. This allows us to define the state space describing the full process dynamics. Specifically, each state has indices, with indices representing the number of mutants at each of the previous time steps, and one index for the current number at time . When , the state reduces to a single index, as originally discussed. With the general state form , the number of mutants can only change by one per time step. This imposes between adjacent indices. As an example, for each state is characterized by , with . Within this Markov chain, states characterized by serve as absorbing states, signifying mutant extinction and fixation.
Deriving the whole state space of the Markov chain for a time delay of in a population of size as shown in figure 2, can be accomplished through the subsequent procedure.

Figure 2. Three Markov chains for with population size . The Markov chain with time delays has dimensions. To generate state spaces for a Markov chain with dimensions, we must erase the absorbing states from the -dimensional Markov chain and add three possible new indexes for each state like including (see algorithm 1). For instance, a population with size has (a): ; (b): ; and (c): The transition and absorbing states are represented by blue and red circles, respectively. When , there are more absorbing states with indicating mutant fixation and extinction, respectively.
The previous procedure allows us to find the state space of based on . As shown in figure 2, state space can be represented as a dimensional Markov chain. Simply we can obtain states of from . We identify and delete the absorbing states of where or , representing extinction or fixation. The remaining states of become the historical component of each state in ( in corresponding to in ). Since the number of mutants can change by at most one, in takes values , , or ( in is in ). The process excludes the absorbing states corresponding to and adds one more dimension (one more index to each state satisfying previous conditions) to find the Markov chain of . Algorithm 1 (described at the end of this section) can be used to find the state space for a given population size and time delay.
The next step is to find the transition matrix for the Markov chain with state space . Since we are considering the Bd process, time delays can affect the Birth part where fitness for reproduction comes from the past. If we denote the state as , there are three possible transitions, as shown in figure 3. In each generation of the process, the number of mutants can either increase by one, decrease by one or remain unchanged. In each transition, the history changes such that the new becomes from the previous state. To find the transition probability, the fitness for the birth part comes from , which shows the number of mutants steps ago. However, for the death part, the number of current mutants is important. Considering these details, the transition probabilities are:

Figure 3. There are three possible transitions from state in which the number of mutants in the population represented by can rise by one, drop by one or remain unchanged. In each transition, the history will transform in such a way that .
where and are given in (2.2). Algorithm 2 provides a useful way to determine the transition matrix for the time delay . Once is obtained, we can identify all absorbing states where and put the matrix into canonical form by finding and . This then allows the numerical approach from the previous section to be applied to calculate the fixation probability and time. As mentioned before, there can be additional fixation states (depending on ) with . Since the sum of each row of the absorbing probability matrix, is 1 (the sum of absorption probabilities starting from one state), we can find the fixation probability by summing the probabilities leading to fixation with . By dividing the state space into two parts for transient states and for absorbing states, we can define the fixation probability and time starting from state as:
5. Results
In this section, we examine how time delays impact the fixation of mutants. Before presenting the main results, it is instructive to demonstrate some limitations of our numerical method. As figure 4 shows, the state space size grows with both and . For large and values, the number of states becomes enormous, necessitating more memory to store the data. Furthermore, calculating the inverse matrix has higher computational complexity as the matrix size increases. For instance, with and , there are 1398 states, yielding a matrix with size of . This matrix would require approximately MB of memory ( bytes). However, when grows to and to , the state space has states, requiring around GB of memory: a prohibitive amount. Clearly, for larger populations and time delays, our numerical method has limitations, and computer simulations become preferable.

Figure 4. The number of states as a function of and .
(a) Fixation probability and time
In the following, we investigate the effect of time delays on the fixation probability and time in some well-known games: the Stag-Hunt, Snowdrift and Prisoner’s Dilemma. When studying the fixation of one mutant in a population of residents, there may be multiple states with that differ in their historical components for a given time delay . For consistency, we will use the initial state for all our results; this is consistent with starting with a single mutant and that for times within of the start of our process, the pay-offs used to calculate the fitness are assumed to take the value at this starting point. We compare the results of our numerical method (as much as possible) with those from computer simulations. For the simulations, we conduct different Monte Carlo simulations for each set of fixed parameters. The fixation probability is calculated as the ratio of fixed processes to total simulations. We determine the average fixation time only using processes that eventually undergo fixation. We constrained the pay-off matrix by setting and . By varying the remaining two parameters, we can generate a range of distinct games. This approach reduces the total number of parameters while still allowing for diverse game scenarios.
(i) Stag-Hunt game
The Stag-Hunt game is one of the earliest and most analysed coordination games [9]. This game is defined by a payoff matrix where the payoffs satisfy the relation . The key feature of the Stag-Hunt is that players are incentivized to coordinate on the same strategy, represented by the payoffs and along the leading diagonal of the payoff matrix. However, there is also a risk, captured by the off-diagonal payoffs and , if the players fail to match strategies. Given that is fixed at and at , the value of must fall within the range , while should be less than .
We first analyse how time delays impact the fixation probability when there is one mutant with strategy A (the ‘stag’ strategy) in a population of residents using strategy B. Figure 5 shows how varying the payoff matrix elements, population size and selection intensity affects the fixation probability in the presence of time delays. The lines represent numerical solutions while the dots show computer simulations. As noted previously, our numerical method is limited computationally, restricting the population size and time delay values we could examine. Thus for larger and , only simulation results are presented. Regardless of the network size and payoff element values, longer time delays consistently yield lower fixation probabilities. The results demonstrate that time delays reduce the ability of a mutant to take over a population compared with the undelayed cases.

Figure 5. The effect of time delays on fixation probabilities in the Stag-Hunt game ( and ) according to numerical solutions (lines) and computer simulations (dots). Panel (a) shows the influence of time delays for various network sizes with , and . Panel (b) illustrates the effect of time delays for different payoff matrix parameters with and . Panel (c) depicts how selection intensity affects outcomes for and with , . Lastly, panel (d) demonstrates the impact of network size for with , and .
Figure 5a,b demonstrate that as increases, the fixation probabilities appear to converge to a constant value. Our model indicates that as approaches infinity, fitness remains unchanged throughout the fixation process. This is because fitness originates from the starting state with one mutant and persists across all steps until fixation. Thus, we have a Moran process with constant fitness, where the fixation probability follows equation (2.6). With and the constant fitness defined as :
Therefore the fixation probability with constant fitness is:
For sufficiently large values of , the fixation probability depends only on the element of the payoff matrix and . When and , the fixation probability is . For , this gives fixation probabilities of , , and , respectively, matching figure 5a. The same behaviour is observed in panel (b) for a large time delay for different values of the payoff matrix. The influence of the intensity of selection is shown in panel (c) for a fixed payoff matrix with and . When , the payoff has no effect on fitness, resulting in neutral selection with a fixation probability of . As increases, the role of the payoff and the time delay becomes more apparent. Notably, the fixation probability initially rises with increasing before later decreasing. Panel (d) demonstrates the effect of the population size for three time delay values. For sufficiently large populations, the impact of the time delay on the fixation probability is negligible. However, for small populations, the time delay has a considerable effect on the fixation probability.
Figure 6 shows how time delays affect the fixation time in this game. Panel (a) demonstrates the impact of for various population sizes. Initially, increasing the time delay at a constant population size causes the fixation time to rise. However, after a critical value, the time delay starts to reduce fixation time until it levels off at a fixed value for very large . As discussed previously, a sufficiently long time delay produces a Moran Bd process with fixed fitness derived from the initial state with one mutant. Therefore, for very large , the fixation time can be calculated using equation (2.8) when the ratio of in (2.8) is .

Figure 6. The effect of time delays on fixation time in the Stag-Hunt game ( and ) according to numerical solutions (lines) and computer simulations (dots). Panel (a) shows the influence of the time delays for various network sizes with , and . Panel (b) illustrates the effect of time delays for different payoff matrix parameters with and . Panel (c) depicts how selection intensity affects outcomes for and with , . Lastly, panel (d) demonstrates the impact of network size for with , and .
The influence of the time delay on the fixation time in a population with constant size for different parameters of the payoff matrix is illustrated in panel (b) of figure 6. As the time delay increases, the fixation time rises to a maximum before decreasing. Ultimately, all fixation times converge to a constant value. Panels (c) and (d) show how the intensity of selection and population size impact the fixation time for certain time delays. As the game becomes more influential in determining fitness and selection intensity increases, the fixation time varies with the time delay. Additionally, for a fixed time delay, increasing the population size leads to greater fixation times, with network size having a larger effect than the time delay.
(ii) Snowdrift game
Generally, a Snowdrift game is characterized by a payoff matrix for which [6,71]. This setting of the payoff matrix promotes cooperation (playing strategy A) by an individual when they encounter a defecting opponent (playing strategy B). The game describes two people who are stuck in a snowdrift and both want to get out of it but are not eager to share the cost of clearing the snow. To have a Snowdrift game, given that is fixed at and at , the value of must be greater than (i.e. ), while b should fall within the range .
The time delay impacts the fixation probability in the Snowdrift game, as shown in figure 7. The top panels illustrate how increasing affects the fixation probability for various payoff parameters and population sizes. Unlike the Stag-Hunt game, a longer time delay boosts fixation probability in the Snowdrift game. Notably in panel (a), a small leads to a lower value in larger populations, but as grows, larger populations exhibit a greater value. In both panels, fixation probabilities plateau at constant values dependent on population size and game parameters when is sufficiently high. For large time delays, the fixation probability, like the Stag-Hunt game, approaches a constant value. This value corresponds to the fixation probability in a scenario where fitness is determined by the initial condition, with a single mutant in a population of residents. According to equation (5.1), when and , for large enough , increases as increases, as shown in panel (a). In panel (b), we see that for , the curves exhibit similar behaviour for a large enough delay.

Figure 7. The effect of time delays on the fixation probabilities in the Snowdrift game ( and ) according to numerical solutions (lines) and computer simulations (dots). Panel (a) shows the influence of time delays for various network sizes with , and . Panel (b) illustrates the effect of time delays for different payoff matrix parameters with and . Panel (c) depicts how selection intensity affects outcomes for and with , . Lastly, panel (d) demonstrates the impact of network size for with , and .
Panel (c) illustrates how the population size impacts the fixation probability across three values of the time delay. Initially, increasing the population size reduces the probability of fixation regardless of the time delay. However, for sufficiently large populations, further increases in population size increase the probability of fixation. We observe differing patterns at small versus big population sizes. In small populations, longer time delays correspond to higher fixation probabilities for mutants. Conversely, in large populations, longer delays diminish the fixation probability.
In the Snowdrift game, the impact of time delays on the fixation time is the reverse of what we saw in the Stag-Hunt game, as shown in figure 8. With a constant population size and payoff matrix values, increasing the time delay first lowers the fixation time, which then rises as the delay continues increasing. As before, for a large enough delay , the fixation time converges to a constant value that is less than the fixation time when there is no delay.

Figure 8. The effect of time delays on fixation time in the Snowdrift game ( and ) according to numerical solutions (lines) and computer simulations (dots). Panel (a) shows the influence of a time delay for various network sizes with , and . Panel (b) illustrates the effect of a time delay for different payoff matrix parameters with and . Panel (c) depicts how selection intensity affects outcomes for and with , . Lastly, panel (d) demonstrates the impact of network size for with , and .
(iii) Prisoner’s Dilemma game
Another interesting case is the Prisoner’s Dilemma. This game is characterized by the payoff relationship [7,8]. In such a setting, the defect strategy D is a dominant strategy, so that for a rational player it is always optimal to play strategy D irrespective of the other player’s choice. The standard Prisoner’s Dilemma offers players the option of defecting or cooperating; for mutual cooperation, two interacting players are offered a reward, , and for mutual defection, . In this scenario, if one player cooperates while the other defects, then the cooperator would receive the sucker’s payoff , and the defector would receive the temptation-to-defect payoff . In this game, the value of should be greater than , while should be less than .
We examine two scenarios in this game. Figures 9 and 10 depict how the fixation probability and time vary with changes in the time delay, for different population sizes and payoff matrices where the initial mutant is cooperator and defector, respectively. As expected, the fixation probability is much higher when the initial mutant is a defector rather than a cooperator under the same conditions since defectors, on average, obtain higher payoffs than cooperators when playing against both cooperators and defectors. In both scenarios, increasing the time delay leads to small changes in the fixation probability and time. However, larger time delays tend to decrease the fixation probability. We also observe an increase followed by a decrease in the fixation time as the time delay grows. This observation is clearer in the case where the initial mutant is a cooperator. For sufficiently large time delays, where the fitness depends only on the initial state, the constant fitness values () are for an initial mutant with the cooperator strategy and for an initial mutant with the defector strategy (when ).

Figure 9. The effect of the time delay on the fixation probability and time in the Prisoner’s Dilemma game ( and ) according to numerical solutions (lines) and computer simulations (dots) where the initial mutant is a cooperator. Panels (a) and (b) show the effect of time delays on the fixation probability and panels (c) and (d) show the effect of time delays on the fixation time for different network sizes and payoff values.

Figure 10. The effect of time delays on the fixation probability and time in the Prisoner’s Dilemma game ( and ) according to numerical solutions (lines) and computer simulations (dots) where the initial mutant is a defector. Panels (a) and (b) show the effect of time delays on the fixation probability and panels (c) and (d) show the effect of time delays on fixation time for different network sizes and payoff values.
(b) Comparison between fixation probabilities
In the preceding section, we observed that different games exhibit different patterns when incorporating a time delay into our model. To gain a more comprehensive understanding, we now compare all games by examining how changes in two parameters of the payoff matrix affect patterns influenced by a time delay. We set and , while varying from to and from to . Our analysis focuses solely on the fixation probability, as it holds greater significance than the time in the literature. Figures 11 and 12 illustrate the difference between fixation probability with time delay and fixation probability without a time delay for mutants with strategies A and B, respectively. Negative values indicate a decrease in fixation probability, while positive values signify an increase.

Figure 11. Fixation probability difference between delayed () and without a delay for an initial mutant with strategy A. Fixed parameters: , . Varying other parameters yields different game types: Prisoner’s Dilemma (PD), Snowdrift (SD), Stag-Hunt (SH) and Harmony (H). The time delay is from left to the right.

Figure 12. Fixation probability difference between delayed () and without a delay for an initial mutant with strategy B. Fixed parameters: , . Varying other parameters yields different game types: Prisoner’s Dilemma (PD), Snowdrift (SD), Stag-Hunt (SH) and Harmony (H). The time delay is from left to the right.
Our figures reveal that different parameter ranges give rise to distinct games. Among these, we observe a new game, referred to as the Harmony (H) game in the literature, which is less well known and studied compared with other games. Across the Stag-Hunt (SH) game, we observe a consistent inverse relationship between time delays and the fixation probability regardless of the strategy of the initial mutant. In the Snowdrift (SD) game, the fixation probability also decreases with a delay in both figures, suggesting that the effect of time delays is consistent regardless of the initial mutant’s strategy. In the Prisoner’s Dilemma (PD) game, the relationship between the time delay and the fixation probability is contingent on the game’s parameters. While there are scenarios where increased time delay leads to a higher fixation probability, a broad set of parameters results in a decreased fixation probability, irrespective of the initial mutant’s strategy. Notably, as we vary a single parameter while keeping others fixed (for instance, changing from a value below , corresponding to the Prisoner’s Dilemma, to a higher value associated with the Snowdrift game for ), we observe a shift in behaviour. This is reflected in the change of fixation probability differences from negative to positive values, highlighting the profound impact of parameter variation on game dynamics. Introducing time delays and considering payoffs from past interactions implies that both strategies receive payoffs by engaging with a higher proportion of resident strategies. Consequently, this scenario proves unfavourable for mutant strategies in Stag-Hunt games, while being advantageous in the Snowdrift game, regardless of the mutant’s strategy. This leads to a decrease in fixation probability for the Stag-Hunt game and an increase for Snowdrift game. For the Prisoner’s Dilemma game, it is dependent on the value of the payoff matrix.
(c) Conditional sojourn time
The impact of time delays on the conditional sojourn time will be examined in this section. Conditional sojourn time refers to the duration a process spends in a particular state before fixation. The conditional sojourn time starting from state refers to the number of times a process is in a transition state with mutants before fixation occurs [49,68]. Understanding conditional sojourn times is crucial for examining how traits or genotypes evolve within populations over time. When we introduce a time delay into models, we can analyse how the sojourn times in each state before fixation change. Specifically, the fixation time starts from one mutant which is the sum of the sojourn times across all states before fixation occurs. Therefore, by studying how time delays affect sojourn times, we can better comprehend their impact on overall fixation times.
For a process without time delays, when the process starts with mutants, the conditional sojourn time in the state with is . With time delays, each state with mutants may have different historical components. To find the conditional sojourn time associated with a state having mutants, we sum over all where . Using the fundamental matrix , the conditional sojourn time for a state with mutants starting from state is defined as:
Figures 13 and 14 show the mean conditional sojourn time for the Stag-Hunt and Snowdrift games, respectively, in a population of size 5. For the Stag-Hunt game, we set the payoff matrix values to . With no time delay, the smallest conditional sojourn time is when there are two mutants in the population. As the time delay increases to , which corresponds to the largest fixation time as seen in the right panel, the conditional sojourn time for two mutants increases. On average, the conditional sojourn time at is higher than for other time delays, matching the highest fixation time. This suggests is a critical value where the process spends more time in transition states for . For a sufficiently large delay (here ), where the fixation time converges to a constant, the conditional sojourn time decreases as the number of mutants increases, so that the process spends a small fraction of time in states with more mutants. Compared with the case without time delays, has a much smaller conditional sojourn time in states with four mutants. In other states, the conditional sojourn times are almost the same for and . This leads to a lower fixation time for compared with .

Figure 13. Conditional sojourn times for the Stag-Hunt game in a population of size 5 for different values of the time delay. The right panel shows the fixation time as the delay increases. We use as the payoff matrix.

Figure 14. Conditional sojourn time for the Snowdrift game in a population of size 5 for different values of the time delay. The right panel shows the fixation time as the delay increases. We use as the payoff matrix.
In the Snowdrift game, the results are the reverse of those for the Stag-Hunt. Using the payoff matrix , the largest conditional sojourn time on average occurs when there are two mutants. This time decreases as the time delay increases from until the critical delay of and then increases again. On average, has the smallest conditional sojourn time in all states, matching the lowest fixation time. In contrast to the Stag-Hunt game, for a sufficiently large delay, the conditional sojourn time increases as the number of mutants increases.
6. Discussion
Understanding fixation processes is vital for analysing evolutionary games in finite populations. Two important metrics—fixation probability and time—have received significant research attention across various evolutionary game models [28–37]. Most previous models frequently assume that fitness instantaneously adjusts according to the current population state and payoffs [28–51]. This assumption of immediate fitness determination may not reflect reality and most of real phenomena exhibit temporal delays. Therefore we have investigated the intricate influence of time delays on mutant fixation in evolutionary games.
Previously, researchers have studied the effect of time delays in deterministic replicator equation models [54–64]. They have been incorporated in two different ways: first, by considering only the payoffs at time coming from time in the past [54,55]; and second, by allowing individuals born in the past to replicate now based on past payoffs [56]. These studies have focused on how time delays impact the existence and stability of interior stationary states. Under the first approach, sufficiently long delays can generate limit cycle oscillations after a supercritical Hopf bifurcation [54,55]. Under the second approach, strategy-dependent delays can shift the locations of stationary states to disfavour delayed strategies [56]. However, the impact of time delays on fixation dynamics has remained an open question.
Our study extends the investigation of time delay effects from replicator dynamics to the Moran Bd process, which represents a shift in perspective. While both models are used to study evolutionary dynamics, the Moran process operates on finite populations and captures stochastic effects that are particularly relevant in small populations, which are common in real-world scenarios. This individual-based approach allows us to examine phenomena such as fixation probabilities and times, which are not directly accessible in the deterministic replicator dynamics framework [28–32]. The Moran process has widespread applications, particularly in biology, where understanding the fate of mutations (such as in cancer) is of paramount importance.
This study provides a more realistic model of how historical interactions influence current fitness and strategy adoption in social dilemmas. By examining how lags affect the spread and fixation of strategies in populations, the research offers deeper insights into the emergence and stabilization of cooperative and competitive behaviours. This approach not only enriches the theoretical framework of evolutionary dynamics but also has potential applications across various fields, from biology to economics, where time-dependent interactions significantly impact outcomes.
Here we employed a well-mixed population model where individuals adhere to one of two strategies, with pay-offs contingent upon their strategic choice and the prevalence of strategies [16]. Initially, a solitary mutant individual with strategy A emerges amidst a population of residents employing strategy B. At each discrete time step, an individual reproduces in proportion to its pay-off, randomly replacing another individual [15,16]. In the absence of a time delay, reproductive fitness is determined solely by the number of mutants in the current state [15,16,29,42,65,66]. However, when a time delay is introduced, reproductive fitness becomes contingent upon the population state steps back in time.
Generally, both strategies’ fitness depends on the frequencies of the strategies. When a mutant has relatively higher fitness in the population, it has a greater chance of becoming fixed. When we add a time delay to the model, it changes the relative fitness of mutants and residents because looking at past strategy frequencies can raise or lower fitness based on the payoff matrix. In the Stag-Hunt game, a mutant adopting strategy A experiences a greater reward when collaborating with fellow A mutants compared with interacting with B residents [9]. On the contrary, residents sticking to strategy B receive the same payoff regardless of whether they play with A or B strategies. As the A mutant population expands, they reap enhanced benefits by coordinating with each other, boosting their payoff (and fitness), consequently favouring mutant fixation. However, introducing a time delay disrupts this advantage by basing mutant reproduction on a past state where A mutants were less prevalent. This lagged payoff is lower than the real-time payoff, hindering A mutants’ ability to reproduce and thrive. Notably, this time delay does not affect B residents’ payoff. As a result, the effective payoff for A mutants diminishes, leading to a reduced fixation probability.
In contrast to the Stag-Hunt game, in the Snowdrift game, both A mutants and B residents attain higher payoffs from coordinating with A mutants [6,71]. As the time delay increases, both strategies interact more with B residents, reducing their payoffs. However, A mutants maintain a superior effective payoff compared with B residents, enlarging the fitness ratio () beneficial to mutants. Thus, unlike the Stag-Hunt, lengthening the time delay in the Snowdrift game enhances A mutants’ fixation prospects. In the Prisoner’s Dilemma game [7,8], the cooperator and defector strategies obtain higher payoffs when playing with a cooperator. When the initial mutant is a cooperator, increasing the time delay leads to a decrease in the effective payoff of both strategies for reproduction. However, the effective payoff for the cooperator from the past is smaller than for the defector, so the fixation probability decreases. When the initial mutant is a defector, although both strategies get more payoff when fitness comes from the past due to playing with more cooperators, the ratio of can be smaller or bigger based upon the elements of the payoff matrix. Thus, the time delay can decrease or increase the fixation probability according to how much better it is to play with more cooperators for mutants and residents.
In general, introducing a time delay alters the effective payoffs that both mutant and resident obtain for reproduction, impacting the fixation probability. A time delay tends to reduce the fixation probability if it reduces the relative payoff advantage of the mutant over the resident (). It tends to increase the fixation probability if it enhances this comparative advantage. The key factor is the relative change in payoffs for mutant versus resident resulting from the delay. In summary, there are three scenarios where the time delay reduces the fixation probability:
— | If it decreases the mutant’s payoff but increases the resident’s payoff | ||||
— | If it decreases both payoffs but the resident’s proportionately less than the mutant’s | ||||
— | If it increases both payoffs but the resident’s proportionately more than the mutant’s |
Conversely, there are scenarios where the time delay increases the fixation probability:
— | If it increases the mutant’s payoff but decreases the resident’s payoff | ||||
— | If it increases both payoffs but the resident’s proportionately less than the mutant’s | ||||
— | If it decreases both payoffs but the resident’s proportionately more than the mutant’s |
We also have examined the joint effect of a time delay and population size on fixation probability in the Stag-Hunt and Snowdrift games. Previous work showed that population size can increase fixation probability in some games [36]. Here, in Stag-Hunt games, increased population size decreases fixation probability regardless of time delays, although time delays affect smaller populations more. By contrast, in Snowdrift games, increased population size increases fixation probability. However, time delays affect small and large populations differently: bigger delays help mutant fixation in small populations, while smaller delays help in large populations.
Fixation time has been less studied in past literature. This measurement can vary substantially depending on the model used and the structure of the population [34,49,67]. In some cases, fascinating behavioural patterns may emerge related to fixation time. The effect of time delays on fixation times varies across the three studied games. In the Stag-Hunt game, as time delays initially increase, fixation times grow longer, peaking and then decreasing after a critical time delay point. Analysing the conditional sojourn time (the duration the process stays in each state before fixation), we see the process lingers mostly in states with few or many mutants in this game. However, the process spends more time in intermediate states as delays rise until reaching a peak, then declining. By contrast, the Snowdrift game demonstrates the opposite pattern: fixation times decrease and then increase again after a critical delay point. Here, conditional sojourn times in intermediate states dramatically fall with longer delays until the minimum fixation time is reached. Meanwhile, time delays have little influence on fixation times in the Prisoner’s Dilemma game, although there is a slight increase then a decrease similar to the Stag-Hunt game.
Our findings suggest that incorporating time delays into evolutionary game dynamics introduces novel and potentially significant effects. While our study focused on scenarios with uniform time delays for both mutant and resident strategies, further investigations are needed to explore cases where the time delay is strategy-dependent. Additionally, our exploration of the Bd update rule, where birth is influenced by fitness and a time delay, highlights the potential of alternative update rules, such as birth–Death, death–Birth, Death–birth, and imitation, to unveil distinct time delay effects. Moreover, our analysis of well-mixed populations without structure provides a foundation for future studies in structured populations, including line graphs, star graphs and more complex graphs such as scale-free and random graphs. Examining strategy-dependent time delays across structured populations represents a promising direction for future research on time delays in evolutionary games.
Data accessibility
This article has no additional data.
Declaration of AI use
We have not used AI-assisted technologies in creating this article.
Authors’ contributions
J.M.: conceptualization, formal analysis, investigation, visualization, writing—original draft, writing—review and editing; M.B.: conceptualization, formal analysis, funding acquisition, supervision, writing—review and editing.
Both authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 955708.