Interface Focus
You have accessResearch article

The canonical equation of adaptive dynamics for Mendelian diploids and haplo-diploids

Johan A. J. Metz

Johan A. J. Metz

Mathematical Institute, Leiden University, 2333CA Leiden, The Netherlands

Institute of Biology, Leiden University, 2333CA Leiden, The Netherlands

Evolution and Ecology Program, International Institute for Applied Systems Analysis, 2361 Laxenburg, Austria

Naturalis Biodiversity Center, 2333CR Leiden, The Netherlands

[email protected]

Google Scholar

Find this author on PubMed

and
Carolien G. F. de Kovel

Carolien G. F. de Kovel

Department of Medical Genetics, Universitair Medisch Centrum Utrecht, Sectie Complex Genetics, Utrecht, The Netherlands

[email protected]

Google Scholar

Find this author on PubMed

Published:https://doi.org/10.1098/rsfs.2013.0025

    Abstract

    One of the powerful tools of adaptive dynamics is its so-called canonical equation (CE), a differential equation describing how the prevailing trait vector changes over evolutionary time. The derivation of the CE is based on two simplifying assumptions, separation of population dynamical and mutational time scales and small mutational steps. (It may appear that these two conditions rarely go together. However, for small step sizes the time-scale separation need not be very strict.) The CE was derived in 1996, with mathematical rigour being added in 2003. Both papers consider only well-mixed clonal populations with the simplest possible life histories. In 2008, the CE's reach was heuristically extended to locally well-mixed populations with general life histories. We, again heuristically, extend it further to Mendelian diploids and haplo-diploids. Away from strict time-scale separation the CE does an even better approximation job in the Mendelian than in the clonal case owing to gene substitutions occurring effectively in parallel, which obviates slowing down by clonal interference.

    1. Introduction

    For context, it is useful to distinguish between micro-, meso- and macro-evolution. The term micro-evolution customarily refers to changes in gene frequencies on a population dynamical time-scale. We will refer to the evolution of quantitative traits through the repeated substitution of novel mutants, including the splitting of the evolutionary path into separate evolutionary lines, as meso-evolution. The term macro-evolution then becomes restricted to larger scale changes such as anatomical innovations, where one cannot even speak in terms of a fixed set of traits.

    Adaptive dynamics (AD) was devised as a mathematical framework for dealing with meso-evolution in ‘realistic’ ecological settings. It differs from more classical approaches to modelling evolutionary change, which generally assume constant fitnesses, by its focus on the population dynamical basis for those fitnesses, and hence on their inevitable change over evolutionary time. Of course, the greater realism at the ecological end is brought about by making different simplifying assumptions, this time genetically unrealistic ones. The main simplification is (i) separation of the population dynamical and mutational time scales. In order to concentrate on ecological aspects, unencumbered by the complexities of the genetic architecture and genotype to phenotype map, most AD research moreover assumes (ii) clonal inheritance.

    One of the powerful tools of AD is its so-called canonical equation (CE), a differential equation describing how the prevailing trait vector changes over evolutionary time. The derivation of the CE is based on a subsequent further simplifying assumption: (iii) small mutational steps. As the speed at which a mutant substitutes is proportional to its effect, conditions (i) and (iii) will only rarely be met together. In §5, we give arguments why for small step sizes the time-scale separation need not be very strict.

    The CE was first derived in [1] for well-mixed clonal populations with the simplest possible life histories. This was underpinned by a level of rigour sufficient to satisfy hard probabilists in [2] and [3]. The initial results were extended in [4] to, possibly spatially distributed, locally well-mixed populations with general life histories, with [5] giving a probabilistically rigorous underpinning for the case of simple age dependence. In this paper, we describe the extension to Mendelian diploids and haplo-diploids, once again on a physicist level of rigour. As it turns out, the CE for the simplest and most complex life histories differs only in a scalar factor summarizing how the intricacies of the life history feed through to the invasion probabilities of advantageous mutants. Mendelian diploidy brings in an additional factor 2, owing to the doubling of the number of mutant alleles per individual over a substitution.

    Collet et al. [6] also derives a CE for Mendelian diploids, with full rigour. However, this in essence is a CE for a single locus trait, selected by a two-tiered ecology, within and between diploid bodies, the latter with the simplest possible life history. ([6] also lists the early applications of AD to Mendelian models.) We consider general life histories and phenotypic traits that are underlain by many loci.

    In §2, we give a general heuristic derivation of the CE for clonal and for haploid and diploid Mendelian populations. In §3, we work out the details, and in §4 we consider haplo-diploids. The technicalities can be found in a suite of appendices. In the final §5, we discuss the strengths and weaknesses of the CE as a tool for evolutionary understanding.

    2. Deriving the canonical equation for the textbook genetic scenarios

    Mathematically, the CE is derived by taking two subsequent limits: (i) letting the system size K (and hence the average population size Inline Formula) go to infinity (to make the community dynamics deterministic) and the mutation probability per birth event ɛ go to zero in such a manner that (a) ɛK ln(K) → 0 (to make the time for a substitution shorter than the time between mutations) and (b) ɛK exp(αK) → for sufficiently small positive α (to keep the population from going extinct on the time scale of the trait changes), followed by (ii) letting the mutational step sizes go to zero, all the while keeping the trait changes in view by rescaling time. Biologically, the CE is best seen as an approximation. From that perspective, it is expedient to express the result in the original time scale so that the basic biological parameters are kept in view.

    Under the assumptions that K is large, ɛK ln(K) is small and ɛK exp(αK) is large for a sufficiently small positive α, that mutations are unbiased and that the environment as perceived by the individuals does not fluctuate (note that this implies a non-fluctuating resident population), the rate of change of a trait vector X can in the clonal case be expressed approximately as

    Display Formula
    2.1
    with s(Y|X) the invasion fitness of Y mutants in the environment generated by X residents (see [7,8]), ∂1s the derivative of s for its first argument, a row vector, and Inline Formula the corresponding column vector, Ts the mean survival time of the residents, Tr their average age at reproduction, Inline Formula a measure for the variability of their lifetime offspring production (detailed in §3) and C the covariance matrix of the mutational steps. (In the AD literature, the quantity Inline Formula is known as the selection gradient.)

    The first term in square brackets after the second equals sign in (2.1) comes from multiplying the probability of a mutation per birth by the population birth rate, Inline Formula, a formula derived from the consistency relation Inline Formula (the average number of particles in a ‘reservoir’ equals the average entrance rate multiplied by the average residence time).

    The second term comes from combining three approximation formulae to calculate the probability p that a mutant invades into the resident population, all coming from a branching process approximation for the invasion phase of the mutant dynamics, followed by averaging the product of the resulting expression and the mutational effect Inline Formula over the distribution g of Z.

    Display Formula
    2.2
    with Inline Formula if x ≥ 0 and Inline Formula if x ≤ 0, and R0 the average lifetime offspring number of the mutant. (Note that the general R0 concept allows for multiple birth states as, for example, in spatially distributed populations [9].) Equation (2.2) is derived through a perturbation expansion from an equation for the invasion probabilities of a branching process ([1018], appendix B).
    Display Formula
    2.3
    Equation (2.3) follows from R0(X|X) = 1.
    Display Formula
    2.4
    Equation (2.4) is derived through a perturbation expansion from the characteristic equation for the Malthusian parameter of a branching process ([4,19,20], appendix A). (Note that dim(R0) = 1 and dim(s) = 1/time.) Together (i)–(iii) result in
    Display Formula
    2.5
    When multiplying (2.5) by the mutational effect, it pays first to take transposes to make use of
    Display Formula
    2.6
    Equation (2.6) follows from the facts that for unbiased mutational effects ∫ZZTg(Z)dZ = C and that the ( )+ means that we effectively integrate over only the half space where 1s(X|X)Z > 0.

    In the derivation of (2.6), we have used the very strong interpretation of ‘unbiased’ that Z not only has mean zero but also is distributed symmetrically around that mean. Relaxing these assumptions gives an expression with considerably less appeal [13]. As relaxing them would make the following arguments less easy to follow while their essence stays the same, we have chosen to stick to the time-honoured simplification.

    Moreover, in writing down (2.1), we have tacitly assumed that an invading mutant that makes it through the stochastic boundary layer, where its population dynamics can be approximated by a branching process, also makes it to a full substitution. This ‘invasion implies substitution’ rule presumably holds good for small mutational steps, away from population dynamical bifurcation points and evolutionarily singular strategies (characterized by 1s(X|X) = 0). A hard proof is available for the case where the community dynamics allows a finite dimensional representation [2123]. However, it looks as if with the right mathematical expertise the rule should be extendable to the required generality by combining the approaches in [24] and [25,26].

    The two terms in square brackets in (2.1) connect to the simplifying assumptions in the following manner. The first term is contingent on the time-scale separation assumption. Only when substitutions do not interfere does the speed of trait movement become proportional to the number of invasion attempts per time unit. The derivation of the second term is contingent on the smallness of the mutational steps and also on the time-scale separation to produce the constant environment underlying the branching process calculations (which assume fixed probability distributions for the lifetime offspring numbers). More in particular, time-scale separation means that in between the negligibly short substitution events the environment is stationary and resident populations are genetically homogeneous. (More about the latter, seemingly unrealistic, consequence in §5.)

    We now consider the Mendelian case. For chromosomal sex determination, we focus on the autosomes, deferring the allosomal contributions to §4.

    Thanks to the genetic homogeneity of the residents, the argument in (2.1) extends seamlessly to haploids. The only difference from the clonal case is that thinking genetics points one to the concept of genotype to phenotype map and mutations that occur on multiple loci. However, when substitutions occur singly the latter multiplicity becomes phenotypically inconsequential.

    Moving on to diploids, we first take a closer look at genotype to phenotype maps. The prevalent view in Evo-Devo nowadays is that the trait changes that AD attempts to model are mostly caused not by changes in the coding regions of genes but at their regulatory regions (e.g. [27,28]). The latter determine the activity of the genes in different parts of the body, at different times during development and under different micro-environmental conditions. This scenario allows us to look at the genotype as a sequence of vectors Inline Formula of expression levels, with a (ai) a placeholder for the name of (an allele on) a generic locus. The genotype to phenotype map Φ transforms this sequence into phenotypic traits. It is from this perspective that one should judge the assumption of smallness of mutational steps: the influence of any specific regulatory site among its many colleagues tends to be relatively minor. And it is this perspective that allows us to assume that (iv) genotype to phenotype maps are smooth.

    Lemma

    ([29], A. Pugliese 1996, personal communication). When there are no parental effects, smooth genotype to phenotype maps are locally additive, i.e. if at a the expression vector Ua mutates to UA, then

    Display Formula
    2.7

    Proof.

    Without parental effects

    Display Formula
    2.8

    Hence,

    Display Formula
    2.9

    In diploids, an invading mutant allele A practically always shares a body with a resident allele a and this aA reproduces through backcrossing with a resident aa. Hence, the allele population initially grows as clonally reproducing aAs (producing aas on the side), and its invasion fitness corresponds to the asymptotic average per capita growth of that clonal population in an environmental background provided exclusively by, also seemingly clonally reproducing (for homogeneous), residents. The invasion implies that substitution theorem also applies unchanged (thanks to the local additivity). However, after substitution the population consists of mutant homozygotes, making the resulting step in phenotype space twice as large as in the clonal and haploid cases. We thus conclude that the CE for Mendelian diploids reads

    Display Formula
    2.10

    On the right-hand side, we have put first the ecologically determined number of resident haplotypes (sets of chromosomes as present in gametes), followed by the life-history statistics controlling the initial demographic stochasticity of allelic substitutions, followed by an expression quantifying the per birth mutational variability generated by the genetic architecture and genotype to phenotype map, to conclude with the selection gradient, summarizing the ecology's current tendency for filtering novel genetic variation. In Metz & Jansen [30], it is argued that Inline Formula times the second group of quantities precisely equals the effective population size from population genetics.

    3. Filling in the details

    It may seem that with (2.10) we are done. However, the devil is in the detail, to wit the calculation of s, Ts, Tr, Inline Formula and C. For background material on the ecological models covered, see [20,31] for the resident and [32,33] for the invader dynamics.

    In principle Inline Formula also presents a problem, although only when the resident population fluctuates. This is even so in the clonal case as Inline Formula is not just a time average but a peculiarly weighted one (e.g. [34,35]). To keep things simple, we have confined the argument to non-fluctuating residents.

    We first consider R0, although this quantity does not appear in our (2.1)-derived list. The reason that it does not do so is that we wanted to write the CE in the form customary in the literature. However, for most structured populations R0 is far easier to calculate than s. Therefore we might just as well in (2.1) and (2.10) drop Tr and replace s with R0.

    R0 equals the dominant eigenvalue of the next-generation matrix L(Y|X) (or operator in the case of infinitely many birth states; e.g. [9]). For the clonal case, L is constructed by calculating from a model for the behaviour of individuals how many offspring in different birth states they produce on average, dependent on their own birth state.

    Given the next-generation matrix, we can introduce two further quantities for later use: the stationary birth state distribution, i.e. normalized positive right eigenvector of L(X|X) going with R0 = 1, Inline Formula, 1TU = 1 with Inline Formula, and the corresponding reproductive values, i.e. co-normalized left eigenvector, Inline Formula, VU = 1. vi equals the expected contribution of a newborn of type i to future birth rates.

    How can we define R0 for evolving sexual diploids? The first answer is that in hermaphrodites one can just add the numbers of offspring that individuals father and mother over their life (i.e. produce through the micro- and macro-gametic routes) and divide by 2. The factor 1/2 comes from the fact that in the Mendelian process each allele is only transferred with that probability. When hermaphrodites are born stochastically equal (i.e. their birth states have the same probability distribution),

    Display Formula
    3.1
    with m and f the average number of offspring fathered or mothered by a randomly chosen individual. For later use, we moreover note that for the resident
    Display Formula
    3.2
    as resident densities are supposedly constant and every individual has one father and one mother.

    Equations (3.1) and (3.2) also hold good for dioecious organisms, but with a twist. For later use, we note that we then can rewrite (3.1) and (3.2) by letting pf and pm denote the fractions of newly produced females and males and f+ and m+ the average lifetime numbers of offspring begotten by a female or male. Then f = pff+ and m = pmm+, so that

    Display Formula
    3.3
    and for the residents
    Display Formula
    3.4
    The above results are not completely trivial, as, in contrast to hermaphrodites, individuals of dioecious species are born in different flavours. To account for this, L(Y|X) should be properly extended. If no other birth state distinctions are needed
    Display Formula
    3.5
    with ff, fm the average lifetime numbers of daughters of a female, male, and mf, mm the corresponding average lifetime numbers of sons, all for mutant heterozygotes in the environmental and genetic background provided by the resident. The simplest case is when there is no connection between the traits and sex determination so that ff = pff+, mf = pmf+, fm = pfm+ and mm = pmm+. Then L has rank one and R0 = 1/2(pff+ + pmm+). Another way of getting the latter result is by observing that in this case all offspring are born stochastically equal, with each having the same probability of being born female. We can then proceed as if sex is determined after birth and calculate R0 by averaging over the possible courses of a life. When there is a connection, we end up with the same formula by defining pm and pf as the asymptotic probabilities of being born male or female, i.e. by choosing for pm and pf the components of the right eigenvector U of L, and defining m+ and f+ again as the average numbers of offspring fathered or mothered over a lifetime given the parental sex, f+ = ff + mf, m+ = fm + mm. By using R0 = 1TLU, we again get R0 = 1/2(pff+ + pmm+). However, only the similarity of the expression is pleasing as this time it is not explicit, as to calculate pm and pf one first needs to calculate R0.

    Appendix C indicates how the preceding considerations extend in the presence of additional birth state distinctions.

    Next we consider Ts, the quantity that had to be combined with Inline Formula to get the population birth rate.

    Display Formula
    3.6
    with Inline Formula, γi being the probability density of time until death of a resident born in state i. In the dioecious case, with all offspring born stochastically equal
    Display Formula
    3.7
    The mean age at reproduction Tr needs more thought as the offspring may also differ in their birth states. The perturbation expansion for s in appendix A tells us that we should weight those offspring with their reproductive values.
    Display Formula
    3.8
    with Λ(a) composed of 1/2 times the average per capita parenting rates of age a residents split according to the birth states of offspring and parent. This formula generalizes the usual definition of the age at reproduction for all offspring born equal. (Note that Inline Formula so that VΛ(a)U is a probability density.) When all offspring are born stochastically equal Inline Formula with λ = λf + λm, λf and λm being half the average per capita mothering and fathering rates of the residents. For dioecious organisms, λf = pfλf+, λm = pmλm+ with λf+, λm+ being half the average female, male parenting rates. Yet, Tr,f+ = Tr,f, Tr,m+ = Tr,m, thanks to the normalization of λf+ and λm+ before calculating the mean parenting ages.

    Inline Formula, appearing as a coefficient in the approximation formula for the invasion probability p, is the most complicated beast. We first give its formula for the clonal case

    Display Formula
    3.9
    Here, the ki are the lifetime numbers of i-offspring of residents and var(h|j) means that the variance of the random quantity h for an individual born in state j. Equation (3.9) emerges from the perturbation expansion (appendix B).

    To calculate Inline Formula for the Mendelian diploid case, we apply (3.9) to the alleles. We only consider the case with all offspring born stochastically equal.

    Display Formula
    3.10
    with Inline Formula and Inline Formula the variances of the lifetime offspring numbers kf and km produced, respectively, through the macro- and micro-gametic routes, and c their covariance (appendix D). The 1 inside the inner brackets comes from the Mendelian sampling of alleles.

    For dioecious organisms, c = −1 (as Ekfkm = 0 and Ekf = Ekm = 1), and Inline Formula, Inline Formula Inline Formula, θf+ = σf+/f+ and θm+ = σm+/m+. The latter result is obtained from the following lemma together with the observation that E+kf = f+ and E+ km = m+, with E+ the expectation conditional on the parental sex.

    Lemma.

    Let the discrete non-negative random variable k be constructed from a random variable k+ by setting k = k+ with probability p and k = 0 with probability 1 − p, then Ek = pEk+ and var(k) = Ek2 − (Ek)2 = p[var(k+) + (Ek+)2] − (pEk+)2. When moreover Ek = 1 so that Ek+ = p−1,

    Display Formula
    3.11

    The mutational covariance matrix C is primarily a phenomenological quantity, although in principle it can be expressed in terms of per locus statistics and the genotype to phenotype map,

    Display Formula
    3.12
    with pa the relative frequency with which a mutation occurs at locus a and ga the probability density of mutational steps in the allelic trait vector of that locus.

    As a final point, we need to say something about the traits and the concept of phenotype. In general, the phenotypes of AD should be seen as reaction norms, i.e. maps from micro-environmental conditions to characteristics of individuals (another term is conditional strategies). Only in the simplest cases a reaction norm is degenerate, taking only a single value. The dioecious case is similar in that the development of the sexes need not be, and in fact rarely is, equal. Hence, trait vectors will generally consist of two components, traits of the male, Xm, and of the female, Xf. In general, the traits of the two sexes evolve dependently as they are coupled by their mutational covariances. In the extreme case that the mutational covariances between male and female traits are all zero the female and male coevolve as if they are separate species. At the opposite end of the spectrum Xm = Gm(Xf), or Xf = Gf(Xm), and the mutational variation in the male and female traits is fully correlated. The upshot is that, except in the fully correlated case, we cannot work with a monolithic trait vector influencing both macro- and micro-gametic reproduction, as in hermaphrodites. Instead, we should take into account the fact that when the sexes are separate they can evolve in their own ways.

    4. Haplo-diploids: a not uncommon, but often-neglected, reproductive mode

    In addition to the haploid and diploid ones, there exist all sorts of other lifestyles. One common type is where haploid and diploid phases alternate (as in, for example, ferns, mosses and a great variety of algae). It is then necessary, as in dioecious diploids, to introduce trait vectors for each separate phase. We can then consider a diploid plus its haploid offspring as a single generalized individual ([36] gives a relaxed introduction to this useful concept) and apply the theory of the previous two sections, where for the diploid phase traits we again have to put in a factor 2.

    Still another lifestyle is the so-called haplo-diploid one, where one sex is diploid and the other haploid (the supplementary material to [37] lists the many known haplo-diploid taxa). Although this case can also be treated through the mental construction of appropriate generalized individuals, we follow the strategy of §3, and treat the two sexes as different birth states, as this is simpler and allows us to illustrate further tricks of the trade.

    Although the opposite also occurs, we shall for definiteness take the hymenopteran situation as reference and assume that females are diploid. In that case only the female reproductive output needs to be discounted by 1/2. We allow any physiological structure and only assume that within each sex all offspring are born equal. The next-generation matrix then becomes

    Display Formula
    4.1
    with ff and fm the average lifetime numbers of female and male offspring of a female, and m the average lifetime number of (female) offspring of a male. ff and fm depend only on the female traits, m only on the male traits.

    For the resident ff = 1 (as the density of females is constant) and hence fm = r, the sex ratio at birth (the number of males born relative to females), and m = r1 (since all females have a father and one female is born for r males). Hence, the resident's next-generation matrix is

    Display Formula
    4.2
    with co-normalized eigenvectors
    Display Formula
    4.3
    As is commonly the case in models with more than one birth state, the resulting expression for R0 does not excel in transparency. However, we do not need R0 but its derivative, for which there exists a simple formula given in the following lemma, derived by differentiating through the characteristic equation and solving for ∂R0/∂Y (see [38,39] and in particular [40]).

    Lemma.

    Let Inline Formula and Inline Formula Inline Formula, then

    Display Formula
    4.4

    So far, we have listed the additional ideas. The rest is work, following the route outlined in §§2 and 3.

    We begin with the selection gradient

    Display Formula
    4.5
    As we start from R0 instead of s, there is no need for us to calculate Tr, and as Ts has nothing to do with the reproductive system, (3.7) holds good as for any other sex-differentiated population. There remains
    Display Formula
    4.6
    where the |f, |m in the indices indicates that the quantity is calculated conditional on the parent being female or male. There are nicer expressions than (4.6). If you wish, you can check it by working through appendix E. The term 1 + r inside the inner brackets derives from the Mendelian sampling in the females. Furthermore, for the male traits the factor 2 in front of Inline Formula in (2.10) should be removed. Lastly, a factor accounting for the reduced number of haplotypes per individual has to be inserted—for sex-independent per haplotype mutation frequencies (1 + (1/2)r)/(1 + r).

    Our arguments for diploids only considered autosomes. To obtain the full CE in the case of chromosomal sex determination, it suffices to add the allosomal contributions to the autosomal one. The contribution of the chromosome occurring only in the heterogametic sex follows the rules for clonal reproduction, that of the chromosome of the homogametic sex follows those for haplo-diploids.

    5. Discussion

    Away from evolutionarily singular strategies (where 1s(X|X) = 0), the time-scale separation assumption combined with the assumption of small mutational steps guarantees practically permanent genetic homogeneity of the resident population. More in particular, a population with sufficiently restricted variability will become homogeneous on the time scale of the substitutions, thanks to the close to additive genetics, and will effectively stay so until the next substitution. The exceptions to this homogeneity occur when an evolutionary trajectory comes close to an evolutionarily singular strategy. The reason is that, where everywhere else we have an approximately linear selection regime, near the singular strategies the quadratic terms in the local expansion of the fitness landscape start to dominate, creating epistasis (non-additivity) at the level of the fitnesses.

    Genetic homogeneity of the resident population lies at the base of the approximations made to reach the CE and of its easy extension to a Mendelian world. At first sight this seems a strong argument against the CE's applicability, as genetic variability appears to be rampant. There are two arguments for yet thinking that the CE may often be a fair description of reality. The first one is that we need mutation limitation only for genes affecting the trait. The thrust of our theoretical deductions is not affected by selectively protected variability that is developmentally and selectively unconnected to the focal traits or variability on neutral loci subjected to mutation, drift and draft (hitchhiking with loci under selection). The only effect that selectively kept variability at unconnected loci may have is that it makes the lifetime offspring numbers of the substituting alleles more variable, enlarging Inline Formula. However, as we treated the components of Inline Formula as empirical quantities, any genetic causes of this variability are automatically included.

    The second argument has a bearing on variability owing to a lack of strict time-scale separation. As long as that variability stays sufficiently limited, it should in Mendelian populations have little effect on the quality of the CE approximation. The reason is that the effect of such variability on the invasion fitness is of higher order in the mutational step size than the terms retained in the derivation of the CE. (The argument for the latter statement may be found in [41]. This paper admittedly only considers unstructured populations. However, the nature of the argument suggests that with the right mathematical expertise it should be extendable to structured ones.) In clonal populations, a lack of mutation limitation yet presents an obstacle to the quantitative applicability of the CE owing to clonal interference: a substituting clone may be ousted en route by a better mutant so that only a fraction of the mutants that invade contribute to longer run evolutionary change. We may thus expect ‘reality’ to be slower than the CE. Mendelian populations do not suffer this slowing down as mutants on different loci effectively substitute in parallel without interfering thanks to recombination (and the approximate additivity coming from the smallness of mutational effects).

    The picture of multiple substituting mutants with small additive effect may seem close to the one considered by Lande [42,43]. However, there is a difference as in our picture only a small and variable number of loci are simultaneously substituting, with new mutants continually coming and old ones going. By contrast, the quantitative genetics picture underlying Lande's work has its basis in Fisher's picture of evolutionary change coming from changes in the allele frequencies on a large number of loci with all alleles present from the start [44]. (An argument against that picture is that the genetic differences between, say, a choanoflagellate and a human are so many that the attendant polymorphism would contain more genotypes than the number of atoms in the Earth, invalidating the classical population genetic calculations.) Of course, Lande did not subscribe to this simplified picture (e.g. [45,46]), and neither did Fisher, at least not wholeheartedly [47, chapter IV]. However, the mathematical details of the connection between the approximations introduced in Lande's various papers remains to be worked out.

    One of the consequences of having only a few loci substituting at any one time is that the within-population variance of the trait fluctuates stochastically and appears to be highly variable (J. Ripa 2005, personal communication). This notwithstanding a negative feedback: when the standing variability is high, mutant alleles will find it more difficult to pass through the stochastic boundary layer because of the larger fluctuations in their offspring numbers. Unfortunately, this picture does not easily lead to explicit expressions, as the fluctuations in variability occur on a time scale similar to that of the substitutions so that we cannot use our earlier branching process calculation. However, we surmise that in reality the allelic reproductive variability coming from co-substituting alleles can usually be neglected relative to the variability in offspring numbers caused by alleles not affecting the traits and by micro-environmental variability such as some individuals running into a predator and others not.

    Although the previous paragraphs may give the impression that we are quite optimistic about the applicability of the CE, there is one problem that in applications is usually ignored: there is no need for C to stay constant. It may perhaps seem, when the mutations with small and hence additive effect all occur on different loci, that also their cumulative effect will be additive. However, this is not the case. This can be seen by looking at the developmental process as a sequence of maps, each of which transforms the genetic information plus environmental inputs further towards the final phenotype. Even if linearity were to prevail early on, the accumulated change will be appreciable, and when the output from those early stages is fed into a nonlinear map to get to the next stage, approximate linearity is lost and the derivatives of Φ that appear in (3.12) change over evolutionary time.

    In principle, it is possible to write down a CE for the extended trait vector (X, C) to obtain the change in C as correlated response to selection on X. However, going through the calculations in the manner of (3.12) shows that the expression for the covariances between the mutational steps in X and C generically involves higher moments. We should thus see the CE as the first step in a truncated moment expansion.

    All this does not mean that using the CE with fixed C is never more than a heuristic tool, with no strong connection to reality. There are scenarios where one may with impunity assume C to be constant, in particular when investigating the behaviour of evolutionary trajectories close to evolutionarily singular strategies. In a linearized stability analysis as in [48] or in the analysis of scenarios for initial divergence close to an adaptive branching point (as defined in [4951]) as in [4,5254], only those situations are considered where the change in X, and hence the associated change in C, is small. (Such arguments involve two trait scales, a small one for linearizing the CE and a smaller one for the mutational steps.)

    Acknowledgements

    We thank Jacques van Alphen for inviting the first author to a workshop on insect parasitoids, thus sparking the research reported in §4.

    Funding statement

    C.G.F.d.K. was supported by the Netherlands Organization of Scientific Research (NWO), grant 813.04.001. J.A.J.M. benefited from the support of the Chair Modélisation Mathématique et Biodiversité VEOLIA-École Polytechnique-MNHN-F.X.c.

    Appendix A. Approximating a mutant's invasion fitness

    For the sake of the exposition, we first consider the case of a single birth state. Then s satisfies the characteristic equation (Lotka's equation)

    Display Formula
    A1
    depending on whether we are dealing with continuous or discrete time community dynamics. Here, λ(a) is the average birth rate or ratio at age a, where the average is taken over whatever stochastic trajectories individuals may follow during their life.

    From here on, we concentrate on the continuous time case. Rewriting (A 1) by introducing the probability density of the age at reproduction a, and its cumulant generating function ϕ,

    Display Formula
    A2
    gives
    Display Formula
    A3
    Expanding ϕ as
    Display Formula
    A4
    with κi the ith cumulant, Inline Formula, Inline Formula, and solving subsequently for the first- and second-order terms (on the assumption that |ln(R0)| is small) gives
    Display Formula
    A5
    Equation (A 5) often performs far better than the estimate of the error term suggests. The reason is that a similar result appears for birth kernels with potentially large R0 but narrow spread. When the ith central moment μi of a scales like σn, (A 1) can be written as
    Display Formula
    A6

    Taking logarithms and solving subsequently for first- and second-order terms gives

    Display Formula
    A7

    The two approximations agree up to their second-order terms (but not in higher order ones).

    For more than one birth state, we introduce Λ(a) = λij(a), with λij(a) the average rate at which an individual born in state j gives birth to offspring in state i. Then s can be calculated from the characteristic equation

    Display Formula
    A8
    with
    Display Formula
    A9
    or, equivalently, as the rightmost solution of
    Display Formula
    A10
    Let λd stand for ‘dominant eigenvalue of’ and Inline Formula be the mutational step. Then (A 8) can be rewritten as
    Display Formula
    A11
    Expanding ψ as a function of its first argument gives
    Display Formula
    A12

    On the assumption that s = O(||Z||) for small ||Z|| we can rewrite (A 12) as

    Display Formula
    A13

    Hence,

    Display Formula
    A14

    It remains to calculate

    Display Formula
    A15
    The denominator in the first factor equals R0(X|X) = 1. The second factor equals (e.g. [55])
    Display Formula
    A16
    Hence,
    Display Formula
    A17

    In Appendix B of [4] it is moreover proved that when ln(R0) = O(||Z||2), as is the case around evolutionarily singular strategies, it is still possible to use (A 17) with O(||Z||2) replaced by O(||Z||3).

    Appendix B. Approximating a mutant's invasion probability

    Equation (3.9) has a long history, starting with [10,11] for single birth states, with refinements in [1214]. The work of Eshel [15] starts the treatment for multiple birth states, with refinements in [1618]. All later papers are rather complicated as they aim at the strongest possible results. However, when the offspring numbers have third moments a standard perturbation expansion suffices.

    To begin, we consider the case of a single birth state. The following calculation can be found in any textbook devoting space to branching processes. Denote the lifetime number of offspring of a representative individual as k where the underlining signifies that k is a random variable, and its so-called generating function as

    Display Formula
    B1
    with qk the probability that an individual begets k offspring. By differentiating one finds for n = 0,1,…
    Display Formula
    B2
    with Inline Formula and Inline Formula. Ek[n] is known as the nth factorial moment. Let p denote the probability of invasion. When an individual is known to beget k offspring, the chance that its line goes extinct is (1 − p)k. Hence,
    Display Formula
    B3
    We know that p = 0 for R0 ≤ 1. For small R0 − 1, and hence small p
    Display Formula
    B4

    Inline Formula. Substituting the ansatz p = c(R0 − 1) + O((R0 − 1)2) and solving for c gives

    Display Formula
    B5

    The calculation for more than one birth state starts from the vector of generating functions Inline Formula Inline Formula with

    Display Formula
    B6
    j the birth state of the parent. By a similar argument as before (e.g. [33])
    Display Formula
    B7
    with Inline Formula and a prime denoting the derivative for Z (Inline Formula is thus a row vector with components ∂g/∂zi and G’ a matrix), and
    Display Formula
    B8
    with
    Display Formula
    and Lm j the jth column of Lm.

    Let P denote the vector of invasion probabilities depending on the birth state of the newly arrived mutant. By a similar argument as before, P can be shown to satisfy

    Display Formula
    B9
    [33]. Expanding and making the ansatz that P = CZ + O(||Z||2) gives
    Display Formula
    B10
    with Inline Formula, Inline Formula and Inline Formula Inline Formula. Collecting the first-order terms gives
    Display Formula
    B11
    Hence,
    Display Formula
    B12
    with κ = O(||Z||). Substituting this in the equation that results from collecting the second-order terms gives
    Display Formula
    B13
    which, on dividing by κ and right multiplying with U to get the projection on the dominant left eigenspace, gives
    Display Formula
    B14

    Finally use that VΔLU = ln(R0) + O(||Z||2) to arrive at

    Display Formula
    B15
    (The step from the factorial moments in G’’T(1, 0)[V, V]U to variances is made by observing that Inline Formula Inline Formula.)

    Now suppose that there is no relation between birth state and mutation propensity. Then the birth state distribution of a newly appeared mutant is U, and the probability that a random mutant invades can be approximated as

    Display Formula
    B16

    As in the invasion fitness case, (B 16), with O(||Z||2) replaced by O(||Z||3), also applies near evolutionarily singular strategies where ln(R0) = O(||Z||2).

    Appendix C. Calculating R0 for a population in two patches with separate sexes

    The proper next-generation matrix for this situation is

    Display Formula
    C1
    with fm,12 the average lifetime number of female offspring produced in patch 1 by a mutant male born in patch 2 (through fertilizing resident type females), etc.

    Now assume that an individual's sex is at most dependent on its birth patch, as is, for example, the case when sex determination is fully environmental or when fathers have no influence on the sex of their offspring, mothers let the sex of their offspring depend at most on where those offspring are born, and the traits are not connected with sex allocation. Then L can be written as

    Display Formula
    C2
    with Inline Formula, etc. and Inline Formula, etc., and thus has rank 2. After one generation, the births can be written as Inline Formula. From then on multiplication with L gives
    Display Formula
    C3

    Hence, R0 can be calculated as the dominant eigenvalue of

    Display Formula
    C4

    When in the general case we write the dominant eigenvector of L as Inline Formula, with Inline Formula, Inline Formula, Inline Formula, multiplying U with L also gives (C 3). Hence, in this case R0 also corresponds to the dominant eigenvalue of a matrix constructed by adding the average numbers of offspring that individuals father or mother over their life and dividing by 2. However, in general these quantities are no more than population averages that can be determined only after establishing the stable birth state distribution of the full next-generation operator.

    Appendix D. The effective offspring variance for sexual diploid residents

    We only consider the case where except for genetic and possibly sex differences all individuals are born equal. In sexually reproducing individuals, an allele is born in either of two states, with potentially different futures, to wit in a macro-gamete or micro-gamete. In the simplest case, these states are randomly allotted, independent of which route put the parental allele in the parental body. This happens, for example, in hermaphrodites or when sex is determined environmentally or by alleles on a different chromosome. We concentrate on that simplest case. Then Inline Formula is the variance of the number of allelic copies reaching the next generation of zygotes from a copy that has just ended up in a zygote. Let km be the number of alleles that does so micro-gametically, and kf the number that does so macro-gametically. Generally, km and kf are dependent. For example, for a seed, and hence for the two alleles in it, the size of the plant it engenders is a random variable, with larger plants usually producing more ovules as well as pollen so that km and kf are positively correlated. As the extreme opposite, in dioecious organisms a new zygote will go on to produce either micro- or macro-gametes, i.e. kmkf = 0. Hence, the starting point of the calculation is the generating function of the pair (kf, km)

    Display Formula
    D1
    The generating function of the number k of aA offspring of an aA individual reproducing in an aa population background is then
    Display Formula
    D2
    with ½ + ½w the generating function of the number of A alleles, 0 or 1, in a gamete. For the calculation of Inline Formula the allele A is supposed to have no phenotypic effect. Hence, at population dynamical equilibrium both 1g(1, 1) = 2g(1, 1) = 1 because the number of individuals stays constant over the generations and each offspring has one father and one mother. Therefore,
    Display Formula
    D3
    and
    Display Formula
    D4
    with
    Display Formula
    D5
    the covariance between macro- and micro-gametically produced offspring.

    Appendix E. The effective offspring variance for haplo-diploid residents

    Following the pattern from appendix D, we first express the variances and covariances of the allelic offspring numbers in terms of the variances and covariances of the numbers of offspring of the resident individuals. Let ga denote the generating function of the lifetime offspring numbers of a residing in a female and ending up in female and male children of that female and let g denote the generating function of those numbers of children kf and km irrespective of their genotype, then

    Display Formula
    E1
    Differentiating gives, with i, j = f, m,
    Display Formula
    E2

    Hence

    Display Formula
    E3
    and
    Display Formula
    E4

    Therefore,

    Display Formula
    E5
    where the |f in the indices of σ and c indicates that the quantities are calculated conditional on the parent being a female. This gives all the ingredients for calculating Inline Formula. First, calculate
    Display Formula
    E6
    Substituting (4.3) and (E 6) in (3.9) gives
    Display Formula
    E8

    For good measure we add the formula for the average age at reproduction

    Display Formula
    E9

    Footnotes

    One contribution of 11 to a Theme Issue ‘Modelling biological evolution: recent progress, current challenges and future direction’.