Fitness differences suppress the number of mating types in evolving isogamous species

Sexual reproduction is not always synonymous with the existence of two morphologically different sexes; isogamous species produce sex cells of equal size, typically falling into multiple distinct self-incompatible classes, termed mating types. A long-standing open question in evolutionary biology is: what governs the number of these mating types across species? Simple theoretical arguments imply an advantage to rare types, suggesting the number of types should grow consistently; however, empirical observations are very different. While some isogamous species exhibit thousands of mating types, such species are exceedingly rare, and most have fewer than 10. In this paper, we present a mathematical analysis to quantify the role of fitness variation—characterized by different mortality rates—in determining the number of mating types emerging in simple evolutionary models. We predict that the number of mating types decreases as the variance of mortality increases.


Introduction
Reproductive processes play a key role in evolutionary biology. Yet the evolution of sexual reproduction itself is still an intensely debated subject that gives rise to many puzzling questions [1,2,3]. For a complete understanding, it is important to consider biological features that, although sometimes taken for granted as common, actually display startling variety across the tree of life [4]. In this paper we will be concerned with just such a feature and its evolution; the number of "sexes" in a given species.
For clarity, let us compare the more familiar sexual features of mammals with those of the slime mold Dictyostelium discoideum. Mammals posses two sexes, defined in terms of the size of the gametes (sex cells) that they produce; males produce small, motile and numerous sperm while females produce large, sessile and well-provisioned eggs. A union between haploid gametes of opposite types (sperm and egg) permits the formation of a zygote. This state, in which gametes display a clear size dimorphism, is termed anisogamy. In contrast, D. discoideum is an isogamous species; it produces gametes that are morphologically indistinguishable [3]. These gametes still come in a number of self-incompatible, genetically for more than two mating types).
While each of these hypotheses may be biologically plausible, they also lead to a somewhat bimodal prediction for the number of mating types; under certain conditions two (or sometimes three) mating types are selected for, while outside these conditions the number of mating types should consistently grow. Thus these theories do not account for the intermediate number of mating types found in ciliates and fungi.
Recently, attention has been turned to the power of neutral models for an explanation as to the number of mating types observed across species [20,21,22]. These models take a similar approach to [7] in assuming that there are no differences (other than selfincompatibility) between types. Their analysis considers population genetic effects of finitepopulation size and investigates the number of mating types predicted at long times under a mutation-extinction balance. Conceptually, this approach is similar to classic population genetic studies estimating the number of self-incompatibility alleles in certain plant populations [23,24,25]. (In these species, further self-incompatibilities have arisen between the sexes themselves in order to prevent inbreeding depression [26,27]).
A key biological feature affecting the typical dynamics of the self-incompatibility alleles that define mating types is the balance between sexual and asexual reproduction. Many isogamous species are facultatively sexual, experiencing only rare bouts of sexual reproduction between many rounds of asexual division. In [28], it was shown that prolonged rounds of asexual division could substantially increase the extinction risk to mating type alleles. In [20] it was then shown analytically that the expected number of mating types under a mutation-extinction balance would decrease dramatically as a function of decreasing rates of sex. These results were extended in [22] to predict the invasion and extinction dynamics of mating type alleles in finite populations, showing that these predictions were broadly consistent with empirical observations. The above neutral models each assume that every mating type experiences a symmetric negative frequency dependent selection (the rare sex advantage), but that otherwise each mating type is equally fit. However, as verbally argued in [29], such perfect symmetry in fitnesses may be unlikely. In many diverse lineages, mating type loci are found to feature large regions of suppressed recombination [30,10], analogous to non-recombining sex chromosomes in animals. This suppressed recombination has two important consequences. First, linked deleterious mutations are more likely to accumulate on mating type alleles as a function of their reduced effective population size [30,31]. Second, we expect that suppressed recombination will rapidly lead to divergence between the mating type alleles [32].
In obligately sexual species with just two mating types, any effect of fitness differences between mating type alleles would be somewhat muted by Fisher's principle [33]; the need to reproduce sexually guarantees an equal abundance of each mating type in a pool of successfully partnered parents. However, when sex is facultative, differences in the reproductive rate of mating types during asexual reproduction can distort the population's sex-ratios away from an even distribution [34]. In an extreme scenario then, fitness differences between the mating types could drive competitive exclusion and a reduction in mating type diversity. Preliminary simulations in [20] supported this view, but no analytic results were obtained.
In this paper, we will address this gap in the literature, focussing on how mortality differences between mating type alleles may alter the expected number of mating types in isogamous species. We examine a mathematical evolutionary model of isogamous populations of infinite size and find an analytic expression for the expected number of mating type alleles. We will derive general results that apply to models in which the mortality rates of mutant types are drawn from arbitrary probability distributions.
The paper is structured as follows. We begin in Section 2 by presenting the model that we will use for our analysis, which can be broken down into two components; a short timescale model of the mating type population dynamics and longer timescale model of the evolutionary dynamics. In Section 3.1 we analyse the behaviour of the short term population dynamics, deriving conditions for the stability of the population. Section 3.2 utilises these results to make predictions about the long-term evolutionary dynamics when mutation and extinction events occur. First examining a special case when mutant mortality rates are drawn from a delta distribution, we infer the distribution-dependent scaling factor of the expected number of mating type alleles using an integral transform. Finally, we conclude by exploring the biological implications of our results with reference to available empirical data.

Population and evolutionary models
In constructing a mathematical model for the evolution of mating types, we must determine appropriate descriptions of facultative sexual reproduction in a population, and of the emergence of mutations. Let us briefly discuss our choices, and their biological implications, with reference to the literature.
Some authors have modelled facultative sex as a population switching between purely sexual and asexual reproductive modes [34,28], while others consider a population average rate of asexual and sexual reproduction [20,22], with individuals able to engage in either mode at any given time. We take the latter approach as it does not require a population-level coordination in the selection of reproductive mode. The actual rate of sexual reproduction in a facultatively sexual population is likely a complex function of the ecology and population genetics of the species under consideration. Moreover, the exact theoretical mechanisms that generate selection for sexual reproduction are an intensely debated subject [35]. To understand the effect of the rate of sexual reproduction on the number of mating types, we treat it as a control parameter in our model.
Regarding mutation, we must distinguish between two relevant types of mutations that our model should account for: those that affect the (frequency independent) fitness of individuals, and those that affect their mating behaviour. In principle, mutations affecting frequency-independent fitness may or may not be linked to the underlying mating type, however, in our model only one of these is relevant. Unlinked mutations would become disassociated from the mating type background on which they arise via genetic recombination during sexual reproduction. Thus the effect of such unlinked mutations would be most clearly seen during periods of purely asexual reproduction in the population. Such dynamics would therefore be obscured in our chosen modelling framework, in which the timing of sexual reproduction is not correlated across the population. We therefore only consider mutations affecting frequency-independent fitness that are explicitly linked to the mating type locus.
Finally we note that mutations that generate new mating type properties may be realised in different ways. Some studies consider the gradual emergence of new mating types from their self-incompatible ancestors, with intermediate stages accounted for in which the mutant mating type may not be fully compatible with all residents mating types [17]. However, the vast majority of studies take a simplified modelling approach, in which new self-incompatible mating type alleles are introduced to the population in a state fully compatible with all resident types. We will follow this second approach, but note that frequency independent fitness mutations that are linked to mating type alleles could be interpreted in a loose sense as accounting for increased or decreased maladaption between a given mating type and the remaining residents.
We now proceed to describe the precise mathematical details of the model. The model is split into two distinct time scales: Short-term population dynamics and the long-term evolution of mating types. We describe the short-term dynamics in terms of deterministic ordinary differential equations for the frequencies of the mating types in the population. These are assumed to reach a stable stationary state on a much faster timescale than mutations occur. By then introducing random mutation events, we explain how the system evolves in the long term.

Short-term population dynamics
The model for our short term population dynamics is similar to that introduced in [20]. An allele at a single locus determines the mating type of a haploid individual (see Appendix A for a full biological motivation). In order to investigate the evolution of mating type number, we allow an infinite number of alleles at that locus, i.e. there is a priori no upper bound for the number of different mating types in a population. The population dynamics are governed by the following types of events: asexual reproduction, sexual reproduction, death. The processes are depicted in Fig. 1. We use a Moran type modelling framework (the population size is fixed by coupled birth-death events and generations are non-overlapping), but take the limit of infinite population size.
When reproducing asexually, the individuals produce exact copies of themselves. We take a coarse-grained approach and consider a "population average" rate of asexual and sexual reproduction (i.e. sexual reproduction is not correlated in time across the entire population). The rate at which a given mating type reproduces asexually is given by a fixed clonal reproduction rate c. For a population that only reproduces asexually, c = 1. In contrast, we have c = 0 for a population that only reproduces sexually, and values in between for facultative sex.
During a sexual reproduction event, which occurs at rate (1 − c), individuals of two different mating types are required. Hence, the frequency of sexual reproduction depends on the frequencies of both mating types. We assume mass-action encounter rate dynamics. For simplicity, we further assume that each sexual pairing produces just one progeny. This offspring inherits the mating type of either parent with probability 1/2.
As addressed in the introduction to this section, we assume that mutations that affect the frequency independent fitness of a mating type are linked to the mating-type allele. We now make a further simplification and assume that this fitness is predetermined and invariant with time. Differences in fitnesses between mating types are realised through a mating-type specific mortality rate, D i (which can be interpreted as an inverse fitness of the i th type). Death events for a given type i (which we recall are coupled to reproduction events) are then weighted by the rate D i .
We now formulate the above model mathematically (for more details we refer to the supplementary material in [20]). Given a population with M distinct mating types, we write x = (x 1 , . . . , x M ) for the relative frequency of individuals with mating type i, so that M i=1 x i = 1. The rate at which type i replaces type j is given by the matrix elements The first term in the bracket describes the asexual reproduction with the clonal reproduction rate c. The second term describes the sexual reproduction between an individual of mating type i and any other type which is not the same. Individuals mate with each other at rate 1 − c and the offspring inherits the mating type i with probability 1/2. Keeping the total population size fixed, reproduction is coupled with the removal of individuals from the population, with death rate D j for type j.
The population dynamics for this model are then described by the ODE system where we have introduced the parameter for mathematical convenience in the forthcoming analysis. Note that since we are working in the limit of infinite population size, the dynamics are entirely deterministic. Thus we ignore the possibility of rare types being lost through genetic drift (for studies analysing such stochastic invasion dynamics, see [21,22]). The only way that extinctions can occur in this model are when the dynamics of Eq.(2) drive one or more mating type frequencies to zero at long times. The mortality rates D i are arbitrary positive parameters. We are interested in the effect of mortality differences between mating type alleles, not the evolution of the actual values themselves. Hence, we impose the rule i D i /M = 1 and decompose where the r i are the residuals describing the difference between the average mortality rate of the population and that of type i. We interpret these values as representing fitness, since larger r i implies lower mortality rates and therefore higher reproductive success.

Long-term evolutionary dynamics
For the long-term evolution of the system we consider mutations and extinctions of mating types. When a mutation event occurs, a novel mating type is introduced to the population. The novel mating type obeys the same dynamical rules as the resident population, but has a distinct mortality rate. Note that by assuming this mortality rate is coupled to the mating type allele and invariant in time, we are able to characterise the model in terms of a single mutation rate, the arrival rate of new mating type alleles. The precise value for the mutant mortality rate is drawn from an arbitrary distribution p. From a theoretical standpoint, the choice of such a distribution poses a common problem, with empirical data suggesting that fitness distributions vary across species [36]. We therefore make no a priori assumptions about the form of this distribution and instead proceed with the aim of deriving results for arbitrary distributions.
We next assume a low mutation rate, so that the short-term population dynamics is always in a stationary state upon the introduction of a new mutant mating type. From a consideration of the form of Eq. (2), it is clear that the fixed points of the short-term population dynamics (along with their stability) will be dependent on the precise values of the mortality rates. We will characterise these properties mathematically in Section 3.1.
Here we note that upon introduction of a mutant mating type there are only a limited class  : Schematic algorithm for the long-term population dynamics. First, we compute the stationary state of the short-term system according to Eq. (10). Then, we check the stability of the fixed point using the criteria in Eq. (11). If the fixed point is stable (i.e. mating types co-exist), a mutation event occurs and we add a novel mating type to the population. If the fixed point is not stable, the mating type with the highest mortality rate goes extinct, i.e. is removed from the population. After each mutation and extinction event, we compute the new stationary state.
of events that can occur.
Suppose a novel mating type µ, with frequency x µ , is introduced to the population. If the short-term dynamics approach a stable interior (i.e. 0 < x i < 1 for i = 1, 2, . . . , M, µ) fixed point, we interpret the mutant as having successfully invaded and is now part of the resident population; x If however the interior fixed point 0 < x i < 1 for i = 1, 2, . . . , M, µ is unstable, the short term dynamics must tend to one of the remaining fixed points in the system, where at least one of the mating type frequencies will be reduced to zero population density (i.e. at least one of the mating types will go extinct). We first consider the situation where just one of the mating types, e, is extinct. Note that under the symmetry of the mating dynamics, the mating type with the lowest frequency (i.e. x e = 0) will be that with the highest mortality. We use this fact to identify the extinct mating type e, and remove it from the population; We note that at this stage the number of mating types has not changed, with the mutant mating type either having displaced a resident or been itself driven to extinction. A new interior fixed point now exists at (i.e. 0 < x i < 1 for i = e). If this new interior fixed point is stable, we can proceed to introduce another mutation and repeat the above process. If the new interior fixed point is unstable, we must seek the next mating type to go extinct e . By the same logic as above, this will be the mating type with the highest mortality among the remaining mating types; This process is repeated until the short-term dynamics succeeds in reaching a stable fixed point where co-existence is possible. The general algorithm is illustrated in Fig. 2.
As mentioned before, we are not interested in the evolution of the mortality rates D i themselves. Therefore, after each mutation and extinction event, we re-centre the mortality rates around a mean value of one. For this, we use the decomposition into fitnesses r i in Eq. (4). Let r 1 , . . . , r M be the fitnesses for the residing mating types in the population. Note that the fitnesses, r i , sum to zero by definition. After adding a mutant with value r M +1 , we obtain the new fitnesses Similarly, after removing an extinct mating type with the lowest fitness r min = min(r i ) from the population, we obtain

Results
In the long-term evolutionary model, the number of mating types M is a random variable which evolves over time. Simulations show that we can expect the process to reach a stationary distribution with the number of mating types fluctuating close to some mean value. Our goal is to calculate this value. In particular, we aim to predict the average number of mating types in terms of the model parameters, including the fitness variability. We now proceed with the full mathematical analysis. For readers less concerned with the specific mathematical details of the derivation, our key result is given in Eq. (30).
In order to determine how the system evolves in the long-term, we first need to know how the short-term population dynamics behave. For this, we analyse the stationary states of the short-term dynamical system and prove conditions for stability in terms of the fitnesses r i ; see Theorem 1 below. In analysing the long-term dynamics we seek to link the spread in fitness (described by the distribution p from which the mortality rates of new mutants are drawn) to the expected number of mating types at long times, E p M . Our analysis will proceed by first studying the special case that new mutants always have the same fitness advantage over the residents (i.e. p is a Dirac delta distribution), then applying a heuristic to expand these results to the general case.

Short-term dynamics: stable fixed points
The dynamics of the short-term model with no variation in fitness are simple; when D i ≡ 1, the mating type frequencies approach the stable fixed point x i = 1/M . In the general heterogeneous case, however, simulations show that the fixed point destabilises with increasing mortality rate variability (see Fig. 3). For the co-existence of mating types in a population, we require the short-term dynamics to reach a stable stationary state with non-zero mating type frequencies. Otherwise, the frequencies of the mating types with lower fitness values would eventually reach zero, i.e. the mating types go extinct. Hence, we ask if a stable fixed point exists and under which conditions with respect to the range of allowed fitness values.
Theorem 1 (i) The ODE system in Eq. (2) has a fixed point x given by x is interior if and only if the fitnesses obey the inequalities where we have introduced R(M ) = (M/γ − 1) −1 .

Proof
We first compute the non-trivial fixed point. Noting that since the addition of a constant factor will not change our conclusions about the fixed point or its stability, we rescale time using t → 2 1−c t and write with Since i r i = 0, it is easy to check that the fixed point x proposed in (10) obeys i x i = 1 as required. From (12) it will therefore suffice to show that g ij (x ) = 0 for all i, j. This is an over-specified linear system, and it is straightforward to check that and hence the result of Eq. (10) follows.
Next, we undertake a linear stability analysis of Eq. (12) around x . The Jacobian matrix has entries Now, and by construction g ik (x ) = 0, so in fact we have Note that, since i x i = 0, we know J must be a degenerate matrix having a zero eigenvalue. To show linear stability of x , we are required to prove that the other eigenvalues of J all have negative real part. It is well known (see e.g. [37]) that all eigenvalues of a matrix lie inside the union of the Gershgorin discs in the complex plane; disc i has centre given by the i-th diagonal element of the matrix and a radius equal to the sum of the absolute values of the other entries of row/column i. The radii of the Gershgorin discs for our Jacobian matrix can be computed as follows: With the diagonal element as the centre of disc i, we find that if λ is an eigenvalue of J, we must have Re[λ] ≤ J ii + ρ i = 0. We know of the existence of one zero eigenvalue of J; to prove stability we must establish its uniqueness. If we assume that x is interior, the Jacobian matrix J is Metzler, meaning that it has strictly positive off-diagonal entries. Addition of a sufficiently large constant to the diagonal of J would give it all positive entries. The Perron-Frobenius theorem states that a real matrix with all strictly positive entries has a unique, real, rightmost eigenvalue. Therefore J also has a unique rightmost eigenvalue, which in this case must be zero, and hence x is linearly stable.
Let us now deduce the criterion for the fixed point x to be interior, in terms of the difference in mortality rates between mating types. At one end of the allowed range we obtain and at the other as required.

Analysis of long-term model
Let us first discuss the general behaviour of the long-term evolutionary dynamics of our model (described in Section 2.2). In each simulation step we introduce a new mutant and compute the fixed point of the initial system and check its stability according to the conditions given in Eq. (11). There are 3 cases to discuss: Failure If the mortality rate of the mutant mating type allele is too high relative to the existing population, no stable interior fixed point exists. In other words, the mutant does not invade the population, and the number of mating types does not change (M → M ).
Invasion with coexistence If the mortality rate of the mutant is not too high or too low, an interior fixed point may exist. In that case, the mutant mating type coexists with the residing types in the population. Thus, the number of mating type alleles increases by one (M → M + 1).
Invasion with extinction For very fit mutants (i.e. low mortality rate), it is possible that no stable fixed point exists with all M + 1 types. Hence, one or more of the residing mating type alleles with the highest mortality rates go extinct. Potentially, a mutant can wipe out the whole population. The number of mating type alleles is reduced by the number of extinct residents (M → M + 1 − M extinct ). Note that in the case of a single extinction event, the resident mating type is simply replaced by the mutant, hence the number of mating type alleles does not change.
After each mutation event, we record the number of mating types residing in the population. This way, we obtain a time series of the evolution of the number of mating types M as in Fig. 4. In the homogeneous case where all mortality rates are equal (D i = 1), we see a constant growth of the number of mating types as expected. A stable interior fixed point is always reached and thus, mutant mating type alleles invade the population. With mortality differences (D i = D j ), the growth of the number of mating types is limited. The fixed point destabilises and mating types with higher mortality rates are pushed towards extinction. Higher asexual reproduction rates c can amplify the extinction rate. Thus, we see fewer mating types as c → 1. In the long term, the average number of mating types approaches a specific value as we discuss in the following.
From simulations we observe that the average number of mating types changes according to the model parameter γ, which is a measure of the rate of sex in the population (see Eq. (3)). Meanwhile, changing the distribution p from which the fitnesses (r) of new mutants are drawn appears to affect the average number of mating types only through scaling by a constant factor (see Fig. 5). This relation is demonstrated in Fig. 5, where we show the average number of mating types obtained from arbitrary fitness distributions p as examples. This observation motivates us to seek a functional φ[p] depending on the fitness distribution p such that φ[p] · E p M is invariant. That is, for some function m(γ) we have To find the scaling functional, we use a heuristic approach based on two assumptions: (i) only beneficial mutations (r > 0) are relevant, since deleterious mutations are very unlikely to invade, and (ii) the functional φ can be obtained as the average of some function of fitness. The consequence of these assumptions mathematically is that we may write the integral transform for some kernel function k. To compute the scaling functional, we need to determine the kernel function k(r). Since k does not depend on p, we are free to choose a solvable model to analyse.

Non-random mutant mortality rates
In the previous section, we found that the missing quantity we needed to determine the number of mating type predicted by our model was a kernel function k(r). However we also saw that we were free to determine this function using any distribution of mating type fitnesses, p. In this section we therefore choose p to have a particularly simple form, that of the delta-distribution. Let p = δr be a Dirac distribution centred at positionr > 0. Note that this implies a degenerate case where fitnesses are non-random; the Dirac distribution always returns mutants with a fitness of preciselyr (i.e. The delta-distributed fitness values have mean fitnessr and variance zero). We emphasise that this distribution is not biologically relevant, but will provide a route to obtaining general analytical results for more realistic distributions.
According to Eq. (18), we have If we can compute an expression for the expected number of mating types of the functional form then comparing with Eq. (19) should allow us to determine the scaling functional for generic distributions, which can be constructed as the superposition of Dirac delta distributions.

Stationary configuration of fitnesses.
For mutant mortality rates drawn from a delta distribution, simulations imply that the system is driven towards a stationary configuration of fitnesses. Fig. 6 shows an example time series of the fitnesses r i linked to the mating type alleles residing in the population. From simulations we observe that there appears to exist an absorbing stationary configuration in which the fitnesses are equally spaced.
For this configuration to be stable under the addition of a new mutant requires that the invader replaces the residing mating type with the lowest fitness, such that (after recentring) the same equispace configuration is recovered. It follows that we must therefore haver > max(r i ).
To compute this state we write the fitnesses as ranked values, (i.e. r 1 is the highest fitness value in the population). Since the values are centred at 0, we have r M < 0 < r 1 . In addition, we assume that there are no duplicate fitness values, which is ensured when mortality rates are drawn from a delta distribution. Now let us discuss the stability of the fitness configurations before and after a mutation event. Since the stationary configuration must be stable, we require according to Eq. (11) that r M > −R(M ). After adding a mutant with r 0 > r 1 , we compute the new fitnesses and obtain with Eq. (8) with i = 0, . . . , M . For a stationary configuration, the mating type with the largest mortality rate must go extinct, i.e. we require an unstable fixed point with r M < −R(M + 1). After eliminating the mating type with the fitness r M , we compute the new fitnesses again. Here, we use Eq. (9) and insert the result from Eq. (21) to obtain with i = 0, . . . , M − 1.
Since we assume a stationary state, we have Here, we see that the fitnesses are equally spaced with distance ∆r = (r 0 − r M )/M . To compute the distance, we use that the sum of the fitnesses is zero, i.e. we have We solve for ∆r and obtain ∆r = r 0 2 M + 1 .
Finally, for the stationary configuration of fitnesses we have The stationary configuration we derived analytically is also plotted in Fig. 6.

Expected number of mating types.
To compute the expected number of mating types, we must check which values of M admit stationary configurations of the form derived above. The conditions for a stationary state from Theorem 1 deliver bounds on the allowed values of M . Let r 0 =r be the mutant mortality rate drawn from a delta distribution p ∼ δr. Using (24) we havē We solve for M and obtain the upper bound, Likewise, the opposite limit deliversr and we obtain the lower bound We observe that, for different values ofr and γ, the region of possible numbers of mating types implied by the above bounds may contain more than one integer. However, for small r (corresponding to small fitness differences between mating types) we have the first-order scaling law Hence we identify Fig. 7 shows a heat map of the expected number of mating types (approximated by (M + +M − )/2) for different combinations of the model parameter γ and the delta distribution peak p = δr. For low sex rates (c → 1) the number of mating type alleles is small. As the mortality differences ∆r ∼r increase, the number of mating types is lowered further.
In Fig. 8, we compare the analytically derived expected number of mating types in Eq. (28) with simulations. We see that the simulated number of mating types converges to the expected value. Hence, our assumptions about the stationary configuration of the fitnesses and the conclusion we draw for the number of mating types are validated.

Average number of mating types for general mortality distributions
We now compare the result of the previous section, Eq. (28) with Eq. (20) to determine the simple rules m(γ) = γ and k(r) = r. For general fitness distributions we therefore expect the scaling function of Eq. (18) to be given by In words, the scaling functional determining the expected number of mating types is simply the mean fitness value of beneficial mutations. Finally, we obtain an approximation for the average number of mating types: Let us compare our analytic prediction of the average number of mating type alleles with values obtained from simulations. Fig. 9 shows the average number of mating types as a function of the clonal rate c for different mortality rate distributions. We see that the analytic result in Eq. (30) predicts the average number of mating types very well, especially for small values of c.
Noting that we have defined the parameter γ = (1 − c)/(1 + c), as c → 1 we would expect our approximation to break down as higher order terms in the scaling law Eq. (27) become non-negligible. Indeed, intuitively we note that while the number of mating type alleles is discrete and bounded below by one, Eq. (30) is continuous and takes values in [0, ∞). Nevertheless, as demonstrated in Fig. 9, the approximation continues to provide good estimates for the expected number of types.

Discussion
In this paper we have addressed the problem of why the number of mating types in isogamous species is often low, when naive evolutionary reasoning suggests there should be very many types, each at very low frequency. To this end, we analysed a simple model that incorporates differences in fitness between mating type alleles. Our main result, Eq. (30), shows that unless mating type mutants have precisely equal fitness to their ancestors, the number of mating types in a population will not grow indefinitely, but rather plateau at a finite value.
In a biological sense, one of the most interesting features of Eq. (30) is its dependence on the rate of asexual to sexual reproduction, c. As this parameter increases, the average number of mating types in the population decreases for any choice of mutant mating type fitness distribution. This is in line with previous observations that the rate of facultative sex may be a key empirical predictor of the number of mating types observed in isogamous species [20,22]. The number of mating types also decreases as the variance of the mutant mating type fitness from its ancestor increases. We can then see that when sex is very rare, mating type mutants must have extremely similar fitnesses for the maintenance of multiple mating types to be possible. For instance, if sex occurred on the order of once every thousand generations (c = 0.999), as has been recorded in some yeasts and algae [38,39], mutant fitness (measured by relative death rate) would need to have a variance of less than 10 −6 from their ancestors if drawn from a normal distribution for even two mating types to be observed (see Fig. 5).
We may then ask what empirical support exists for fitness differences between mating type alleles. As we have demonstrated in Fig. 3, we would expect prominent deviations from an even ratio (x * i = 1/M ) in the rare sex regime. Thus skewed mating type ratios may provide evidence for variability in mating type fitnesses. Such skewed ratios have been recently observed in D. discoideum [40], as well as the isogamous species Cryptococcus neoformans [41] and Candida glabrata [42] (both with two mating types). Meanwhile indirect support for differences in mating type fitness comes from evidence of the repeated loss of mating type alleles in species phylogenies. Such losses appear to have occurred many times in ciliates [9] and fungi [43]. As it is possible that these extinctions could also be driven by genetic drift (demographic stochasticity) alone [20], we next turn to the potential causes of differential fitness between mating types.
Beneficial mutations on specific mating type alleles have been shown to be particularly prevalent in pathogenic fungi [44,43], where they can imbue increased virulence. Meanwhile deleterious mutations have also been shown to accumulate more rapidly on mating type alleles as a result of suppressed genetic recombination at the mating type locus [31]. Within facultative sexuals, beneficial mutations at unlinked loci may also play a role. If a highly beneficial mutation arises during a period of asexual reproduction, the single mating type associated with this mutation may sweep to fixation before the next round of sexual reproduction. Indeed, this behaviour has been observed experimentally in Chlamydomonas reinhardtii [45]. Our model, which assumes a constant fitness for each novel mating type and an average rate of sexual reproduction across the population, takes a coarse-grained approach to these dynamics. Accounting for these additional processes (the accumulation of mutations on mating type alleles and switching between asexual and sexual environments) in a precise manner would be an interesting area for future investigation, but one that would require the specification of two mutation timescales (one for the arrival rate of new mating types and another for the mutations affecting the fitness of extant mating type alleles). Nevertheless we would expect the qualitative picture to remain the same.
From a mathematical perspective, our techniques for modelling and analysing the effect of mutation differ from many standard approaches. Perhaps the most ubiquitous is adaptive dynamics [46], which considers the outcome of evolutionary dynamics in a continuous trait space [47] (that is, very small mutational steps are assumed [48]). However, such an approach is problematic when the quantity of interest (in our case, the average number of mating types) is explicitly a function of the size of the mutational steps. For instance, in Eq. (28) we find that if we take the limit of infinitely small mutational steps (i.e. we take the limit r → 0), the model predicts infinitely many mating types at leading order. Thus it is not clear that an adaptive dynamics approach would capture the interesting and biologically relevant dynamics revealed by our analysis.
In cases such as those described above, modelling discrete, randomly chosen mutations is more appropriate. However, one must then decide what distribution these should be sampled from, whether it be delta [20], normal [49], exponential [50] or generalised. As empirical fitness distributions have been shown to vary between species (and even between genetic loci) [36], the qualitative model dynamics must be independent of such a choice for any strong biological conclusion to be drawn. In this paper we have shown that a heuristic approach, based on the assumption of the existence of a scaling factor, can provide a way to make analytic conclusions without specifying any precise form for the distribution of mutations. Interestingly, a similar observation has been made for a game theoretic model of the evolution of average population fitness [51], although the analytic form of the observed scaling function was distinct. An important avenue for further investigations will be to determine if this approach can be used to as a general method for addressing problems involving stochastic mutation events.
In the shorter term, the approach we have developed will be useful for other problems in which balancing selection plays a role. In particular, this includes the study of alternate self-incompatibility systems. In the model we analyse in this paper, we assume that mating types are determined by one of an infinite set of potential alleles at one single locus. While this is a plausible model for many isogamous species, alleles at 2 distinct loci control mating type expression in many fungal species [13,12]. Future research would therefore involve extending the model presented here to account for this genetic feature.
As addressed in the introduction, self-incompatibility is also prevalent in many plant systems and, as a result of suppressed recombination, deleterious mutations can also accumulate at the self-incompatibility locus [52]. As these species are in general obligately sexual (i.e c = 0), our analysis suggests that such plants would be able to tolerate larger variance in the fitness of self-incompatibility alleles than facultatively sexual species. Once again, more specific modelling would need to be conducted to obtain a precise mathematical prediction for these diploid species, in which the genetic determination of self-incompatibility can sometimes be complex [53]. However, as effective population sizes in these species are often small (of order 10 3 [54]) genetic drift is likely to play a more important role. It will therefore be interesting to investigate whether the techniques we have developed here for populations of infinite size can be generalised to capture the effect of finite population size.

Data availability
All data that is used to support our conclusions is retrieved from stochastic simulations. The source code is available at: https://gitlab.com/YvonneKrumbeck/mating-type-model. Here, the sex of the organism is determined by a pair of chromosomes (X/X in females, X/Y in males). The life cycle of cells in many isogamous species is dominated by a haploid state (green background). Here an allele on a single chromosome determines the mating type (e.g. I, II and III in Dictyostelium discoideum). Mammals produce short-lived haploid sex cells (yellow background) that carry one chromosome type of the diploid parent. Their morphology differs according to the sex of their parent. A large egg cell and small sperm cell merge and form a diploid zygote which develops to the progeny. Cells of isogamous species, on the other hand, are equally sized and can reproduce asexually and sexually. During the sexual phase, two cells of different mating types merge to form a diploid zygote. From this short-lived diploid state (yellow background), the progeny emerge as haploid cells inheriting the parental mating types with equal probability. haploid micronucleus during sexual conjugation). Finally, while mating types are mostly inherited deterministically in a genetic manner, there are exceptions to this rule; many yeasts have evolved the ability to switch between mating types from generation to generation [55], while some ciliates inherit their mating type epigenetically or stochastically [9]. Models investigating the effect of these additional species-specific considerations are interesting, and indeed there have been a number of theoretical studies towards this end [34,28,56].