Why is combinatorial communication rare in the natural world, and why is language an exception to this trend?

In a combinatorial communication system, some signals consist of the combinations of other signals. Such systems are more efficient than equivalent, non-combinatorial systems, yet despite this they are rare in nature. Why? Previous explanations have focused on the adaptive limits of combinatorial communication, or on its purported cognitive difficulties, but neither of these explains the full distribution of combinatorial communication in the natural world. Here we present a nonlinear dynamical model of the emergence of combinatorial communication that, unlike previous models, considers how initially non-communicative behaviour evolves to take on a communicative function. We derive three basic principles about the emergence of combinatorial communication. We hence show that the interdependence of signals and responses places significant constraints on the historical pathways by which combinatorial signals might emerge, to the extent that anything other than the most simple form of combinatorial communication is extremely unlikely. We also argue that these constraints can be bypassed if individuals have the socio-cognitive capacity to engage in ostensive communication. Humans, but probably no other species, have this ability. This may explain why language, which is massively combinatorial, is such an extreme exception to nature's general trend for non-combinatorial communication.


Replicator equations for composite signalling strategies
In evolutionary game theory, the dynamics of a strategy S in a population are governed by the replicator equation [1]

$$\frac{dx(S)}{dt} = x(S)\left[ f(S;\{x\}) - \sum_{S'} x(S')\, f(S';\{x\}) \right] \tag{S1}$$

where x(S) is the frequency of the strategy S in the population, f(S; {x}) is the growth rate of agents with that strategy given the frequencies of all other strategies in the population, and the sum is over the set of strategies S′ that are in direct competition with S (including S itself).
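As a rough illustration of how a replicator equation of this kind behaves, the sketch below integrates it with a forward-Euler scheme for three competing strategies. The constant growth rates, the step size and the function names are assumptions chosen for the example, not quantities from the model.

```python
def replicator_step(x, fitness, dt=0.01):
    """One forward-Euler step of dx_i/dt = x_i * (f_i - sum_j x_j f_j)."""
    f = fitness(x)
    mean_f = sum(xi * fi for xi, fi in zip(x, f))
    return [xi + dt * xi * (fi - mean_f) for xi, fi in zip(x, f)]


def integrate(x0, fitness, steps=5000, dt=0.01):
    """Repeat the Euler step; returns the frequencies after steps * dt time units."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, fitness, dt)
    return x


def constant_fitness(x):
    # Hypothetical, frequency-independent growth rates for three strategies.
    return [1.0, 1.5, 0.5]


final = integrate([1 / 3, 1 / 3, 1 / 3], constant_fitness)
print(final)  # the strategy with the largest growth rate dominates
```

Because the mean growth rate is subtracted, the frequencies stay normalised and only fitness differences affect the outcome.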
In a model of communication, there are many different sets of competing strategies. First, the action, A, performed by an agent, α, can vary according to the state of the environment, E. We assume that a single action A_α(E) is performed by agent α whenever the environment is in state E: previous work on communication [2,3] has shown that probabilistic strategies (where more than one action might be performed in a given environment) are not evolutionarily stable. Thus, if ψ(A, E) is the frequency of agents that perform action A in environment E, we have the replicator equation

$$\frac{d\psi(A,E)}{dt} = \psi(A,E)\left[ u(A,E) - \sum_{A'} \psi(A',E)\, u(A',E) \right] \tag{S2}$$

where u(A, E) is the growth rate of the action strategy E → A and depends on the frequencies of all other strategies in the population. In the model of composite communication described in the main text, we assume that the action performed in a composite environment E_1 • E_2 is the composite of the actions performed in the environments separately, i.e., A_α(E_1) • A_α(E_2) for agent α. This means that the frequencies ψ(A, E) are explicitly defined only for those environmental states E that are not composites.
The situation for reactions is directly analogous: for each action A that is performed, an agent can perform a single reaction R, and for each action the different possible reactions compete with one another. Hence we have

$$\frac{d\phi(R,A)}{dt} = \phi(R,A)\left[ v(R,A) - \sum_{R' \in \mathcal{R}} \phi(R',A)\, v(R',A) \right] \tag{S3}$$

where φ(R, A) is the frequency of the reaction strategy A → R in the population, ℛ is the set of all possible reactions and v(R, A) is the growth rate of that strategy, which again depends on the frequencies of all other strategies in the population.
The crucial question, then, is what form the fitnesses u(A, E) and v(R, A) should take. First, we associate the growth rate s(R|E) with reaction R in environment E (which may include composite environments). We also assign a cost c_α to an agent for maintaining a given set of signalling strategies. For simplicity, we assume that each non-default action strategy (i.e., environment E for which A_α(E) ≠ A_0) costs an amount χ and that each non-default reaction strategy (i.e., action A for which R_α(A) ≠ R_0) costs an amount η. We further assume the state E is present a fraction f(E) of the time. Finally, if q_α(R, E) is the probability that agent α performs the reaction R in environment E as a consequence of some other agent performing an appropriate action, the mean rate of offspring production (fitness) of that individual is

$$\sum_{E} f(E) \sum_{R} q_\alpha(R,E)\, s(R|E) - c_\alpha .$$

The fitnesses u(A, E) and v(R, A) are then obtained by averaging over the set of individuals who have the specific strategy E → A or A → R.
There are two approaches that can be taken to calculate these fitnesses that lead to exactly the same outcome. The first is to assume, as in [2,3], that all agents can observe the behaviour of all other agents (i.e., the population is spatially well-mixed) and that signallers receive the same payoff, s(R|E), as the agents who respond to their signals. The second is instead to assume that agents can only observe the behaviour of conspecifics (i.e., those with the same action and reaction strategies), thereby leading to an indirect benefit for actors who also perform the corresponding reaction.
In order that different strategies may compete with one another, we further require in this case that recombination is efficient, so that the fraction of individuals with two particular strategies is given by the product of their individual frequencies. Below, we show that these two approaches, which amount to different ways to ensure the stability of cooperative behaviour in the population, lead to the same dynamics.

Direct benefit to both signaller and receiver in a spatially-mixed population
We consider first the case where an agent can observe the behaviour of all other agents, and the payoff for performing a reaction R in environment E is passed on from the reactor to actor.
The easiest fitness to evaluate is v(R, A), i.e., that for a reaction strategy A → R. This is because the payoff for performing a reaction is direct. There are three contributions to this quantity. The first comes from actors who perform A in a (non-composite) environmental state E. These actors make up a proportion ψ(A, E) of the population, so under the assumption that agents are spatially-mixed, the probability q(R, E) that a reactor interacts with such an actor is ψ(A, E). This first contribution is then simply

$$\sum_{E} f(E)\, \psi(A,E)\, s(R|E) .$$

The second contribution comes from composite environmental states E_1 • E_2, in which an actor performs the composite A_1 • A_2 of the actions A_1 and A_2 that it performs in E_1 and E_2 separately. In a spatially well-mixed population, the probability that a randomly chosen actor behaves in this way is ψ(A_1, E_1) ψ(A_2, E_2). To decide if a given combination of actions A_1 • A_2 is equivalent to A, we employ the Kronecker delta symbol δ(A_1 • A_2, A), which equals 1 if the two arguments are equal, and zero otherwise. Then, we find

$$\sum_{\langle E_1, E_2 \rangle} f(E_1 \bullet E_2)\, s(R|E_1 \bullet E_2) \sum_{A_1, A_2} \psi(A_1,E_1)\, \psi(A_2,E_2)\, \delta(A_1 \bullet A_2, A)$$

where ⟨E_1, E_2⟩ denotes a sum over distinct pairs of non-default, non-composite environmental states. The final contribution to the fitness comes from the fact that maintaining each non-default reaction decreases the payoff by an amount η, no matter what behaviour the agent actually engages in. Since only fitness differences matter, we can equally ascribe a fitness benefit of η to the default reaction, again by using the Kronecker delta symbol:

$$\eta\, \delta(R, R_0) .$$

Adding these three terms together gives the expression quoted in the main text.
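The three contributions to v(R, A) can be sketched in code as follows. This is only an illustration under stated assumptions: the dictionary layout, the representation of a composite as an unordered pair (a `frozenset`), the helper names and every numerical value are hypothetical, and unspecified frequencies and growth rates are treated as zero.

```python
from itertools import combinations

R0 = "R0"  # label for the default reaction (hypothetical name)


def compose(a1, a2):
    # Assumption: a composite action is the unordered pair of its parts.
    return frozenset((a1, a2))


def reaction_fitness(R, A, psi, f, s, eta, envs, actions):
    """Sketch of v(R, A) as the sum of the three contributions in the text."""
    # (1) Actors that perform A in a single, non-composite state E.
    total = sum(f[E] * psi.get((A, E), 0.0) * s.get((R, E), 0.0) for E in envs)
    # (2) Actors whose actions in E1 and E2 combine to give A.
    for E1, E2 in combinations(envs, 2):
        pair = frozenset((E1, E2))
        if pair not in f:
            continue  # this composite state never occurs
        for A1 in actions:
            for A2 in actions:
                if compose(A1, A2) == A:  # plays the role of the Kronecker delta
                    weight = psi.get((A1, E1), 0.0) * psi.get((A2, E2), 0.0)
                    total += f[pair] * weight * s.get((R, pair), 0.0)
    # (3) Fitness benefit eta ascribed to the default reaction.
    if R == R0:
        total += eta
    return total


# Hypothetical homogeneous population: action "R" in state L, action "G" in state E.
LE = frozenset(("L", "E"))
RG = compose("R", "G")
envs = ["L", "E"]
actions = ["A0", "R", "G"]
psi = {("R", "L"): 1.0, ("G", "E"): 1.0}
f = {"L": 0.2, "E": 0.2, LE: 0.1}
s = {("U", "L"): 1.0, ("F", LE): 2.0}
eta = 0.05

v_up = reaction_fitness("U", "R", psi, f, s, eta, envs, actions)
v_flee = reaction_fitness("F", RG, psi, f, s, eta, envs, actions)
v_default = reaction_fitness(R0, "R", psi, f, s, eta, envs, actions)
```

In this toy population the reaction U to the action R is rewarded only through the single state L, whereas the reaction F to the composite R • G is rewarded only through the composite state L • E.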
We now turn to the fitness u(A, E) of the action strategy E → A. To do this we need to identify the mean growth rate of agents employing the strategies A → R for fixed A but variable R. Again, this has three contributions. First, the probability q(R, E) that a randomly chosen reactor exhibits the reaction R to the action A in environment E is φ(R, A). Hence, the first contribution to the fitness is

$$f(E) \sum_{R} \phi(R,A)\, s(R|E) .$$

The second contribution comes from the case where the environmental state E co-occurs with some other state E′ (which is distinct from E and E_0). In the composite state E • E′, the probability that the composite action A • A′ is observed by a randomly-chosen reactor is ψ(A′, E′), given that action A is already performed by some actor. The probability that the reactor also performs the reaction R to A • A′ is φ(R, A • A′). Hence, the second contribution to the fitness is

$$\sum_{E'} f(E \bullet E') \sum_{A'} \psi(A',E') \sum_{R} \phi(R, A \bullet A')\, s(R|E \bullet E') .$$

Finally, we can assign a fitness advantage to the default strategy A_0 via

$$\chi\, \delta(A, A_0) .$$

Again, summing these three contributions together we obtain the expression for u(A, E) given in the main text.
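A corresponding sketch for u(A, E) is given below. As before, it is an illustration rather than the implementation used for the results: the data layout, the `frozenset` representation of composites, the names and all numbers are assumptions, and missing dictionary entries are read as zero.

```python
A0 = "A0"  # label for the default action (hypothetical name)


def action_fitness(A, E, psi, phi, f, s, chi, envs, actions, reactions):
    """Sketch of u(A, E) as the sum of the three contributions in the text."""

    def mean_payoff(action, state):
        # Average of s(R|state) over reactions R, weighted by phi(R, action).
        return sum(phi.get((R, action), 0.0) * s.get((R, state), 0.0)
                   for R in reactions)

    # (1) State E occurs on its own and reactors respond to A.
    total = f[E] * mean_payoff(A, E)
    # (2) State E co-occurs with another state E2, where actors perform A2,
    #     so that reactors observe the composite of A and A2.
    for E2 in envs:
        if E2 == E:
            continue
        pair = frozenset((E, E2))
        if pair not in f:
            continue  # this composite state never occurs
        for A2 in actions:
            composite = frozenset((A, A2))
            total += f[pair] * psi.get((A2, E2), 0.0) * mean_payoff(composite, pair)
    # (3) Fitness benefit chi ascribed to the default action.
    if A == A0:
        total += chi
    return total


# Hypothetical homogeneous population with a ritualised composite reaction.
LE = frozenset(("L", "E"))
RG = frozenset(("R", "G"))
envs = ["L", "E"]
actions = [A0, "R", "G"]
reactions = ["R0", "U", "D", "F"]
psi = {("R", "L"): 1.0, ("G", "E"): 1.0}
phi = {("U", "R"): 1.0, ("D", "G"): 1.0, ("F", RG): 1.0, ("R0", A0): 1.0}
f = {"L": 0.2, "E": 0.2, LE: 0.1}
s = {("U", "L"): 1.0, ("R0", "L"): -1.0, ("F", LE): 2.0}
chi = 0.05

u_signal = action_fitness("R", "L", psi, phi, f, s, chi, envs, actions, reactions)
u_default = action_fitness(A0, "L", psi, phi, f, s, chi, envs, actions, reactions)
```

With these made-up numbers the signalling action outperforms the default action in state L, so the action strategy L → R would be maintained despite its cost.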

Indirect benefit through kin discrimination with random mating
The foregoing expressions for the fitnesses were obtained by using the fact that, when an actor or reactor is chosen at random from the population, the probability that it has a given strategy E → A or A → R is just given by the frequency of that strategy in the population, ψ(A, E) or φ(R, A). This is appropriate when agents are well mixed in space. An alternative approach is to assume that agents interact only with their conspecifics. Then, when considering an agent as an actor with the strategy E → A, for example, and asking whether a reactor that agent interacts with exhibits the strategy A → R, this is equivalent to asking whether that same actor also has the strategy A → R. In principle, strong correlations could build up between different strategies. However, if we assume that some mating process acts so that offspring acquire random combinations of parents' strategies, and that this process acts sufficiently quickly that it reaches equilibrium on the timescale of the growth dynamics, then the probability that an agent has the strategy A → R, say, is φ(R, A) no matter what other strategies it may possess. Thus, asking questions about a single agent in this picture is equivalent to asking those same questions about randomly-chosen agents in the previous section. Hence, the fitnesses that arise from this approach are exactly equivalent. It is possible to show this more formally by deriving the replicator dynamics from first principles using, e.g., the Price equation [4,5] as a starting point.

Conditions for evolutionary stability
Evolutionarily stable strategies are found by identifying stable fixed points of the replicator equations (S2) and (S3). To obtain vanishing right-hand sides of these equations, we must have that ψ(A, E) = 0 for all but one action A in each environment E, and that φ(R, A) = 0 for all but one reaction R to each action A. Thus, only homogeneous populations are fixed points of the replicator equations. The only exception to this is when multiple fitnesses have the same value: then one has neutral stability in mixed populations. This will rarely be the case in situations of interest to us: even then, stochastic contributions (not considered here) will tend to lead to a homogeneous population. Hence, only homogeneous populations can be evolutionarily stable, as stated in the main text.
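The stability of a homogeneous population can be probed with a simple calculation. For a single set of competing strategies obeying a replicator equation, linearising about a homogeneous population shows that a rare competitor grows at a rate equal to its fitness minus that of the resident strategy, with both evaluated in the homogeneous population. The sketch below applies this check; the strategy labels and fitness values are hypothetical.

```python
def invasion_rates(resident, fitness):
    """Linear growth rate of each rare competitor near a homogeneous population.

    `fitness` maps each competing strategy to its fitness evaluated in the
    population where everyone plays `resident`.
    """
    return {name: value - fitness[resident]
            for name, value in fitness.items() if name != resident}


def is_stable(resident, fitness):
    """A homogeneous population is stable if no rare competitor can grow."""
    return all(rate < 0 for rate in invasion_rates(resident, fitness).values())


# Hypothetical fitnesses of three competing reactions to one action.
fitness = {"U": 0.4, "D": 0.1, "R0": 0.3}

rates = invasion_rates("U", fitness)
stable_u = is_stable("U", fitness)
stable_d = is_stable("D", fitness)
```

Strictly, the fitnesses should be re-evaluated for each candidate resident, since they depend on the composition of the population; a single table is reused here only to keep the example short.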
To determine whether a particular homogeneous population is evolutionarily stable, we need to examine the behaviour of deviations away from the corresponding fixed point in (S2) and (S3). Ultimately, we find that the requirement for stability (i.e., that the Hessian matrix evaluated at a fixed point has negative eigenvalues [6]) is satisfied only if, in each environment E, the action adopted by the population has a strictly higher fitness u(A, E) than every competing action and, for each action A, the reaction adopted by the population has a strictly higher fitness v(R, A) than every competing reaction, with all fitnesses evaluated in the homogeneous population.

Table S1: Growth rates s(R|E) as a function of the behaviour R ∈ {R_0, U, D, F} in each environmental state E ∈ {E_0, L, E, X, L • E}. The idea is that there is a penalty α when one predator is present, and 2α when both are present; a cost β if food is scarce; a cost γ of moving away from a predator; a benefit δ if the movement away from a predator leads to a greater chance of survival; and a cost ε for fleeing. It is assumed that all the costs and benefits are cumulative, except for fleeing which (by taking the agent to a completely new location) is taken to be independent of the environmental state.

Case study of the principles for the emergence of combinatorial communication: Putty-nosed monkeys
We illustrate the general principles for the emergence of combinatorial communication set out in the main text with the concrete example of the putty-nosed monkey's communication system. To recap, there are three basic environmental states that are relevant to communication: L, where leopards are present; E, where eagles are present; and X, in which food is scarce. We assume that the optimal behaviour in these environmental states is U, to move up, D, to move down, and F, to flee, respectively. We also assume that the only composite state that may exist is the presence of both predators, L • E, and that the optimal response in this environment is to flee (F), since moving away from one of the predators (U or D) will inevitably entail moving towards the other.
Taking into account the default environment E_0 and the default reaction R_0, we find that even this simple model has twenty different growth rates s(R|E). To keep the parameters to a manageable number, we introduce a set of costs α, β, γ and ε which relate to the presence of a predator, the absence of food, moving away from a predator and fleeing respectively, and a benefit δ for moving away from a predator. Combining these costs additively leads to the set of growth rates specified in Table S1. In addition to these growth rates, we must also specify the costs η and χ for maintaining components of a communication system, and the frequencies f(E) with which the various environmental states are present. Again, for simplicity, we assume that leopards and eagles are equally frequent, f(L) = f(E).
To complete the definition of the model, we specify three actions R, G and B that we construe as different colours (rather than sounds like 'pyow' and 'hack'), and the single composite action R • G.
We now demonstrate how principles 2 and 3 stated in the main text allow us to understand the constraints on reaching the actual communication system exhibited by putty-nosed monkeys (i.e., the signals L → R → U, E → G → D and X → R • G → F) from a simpler system. Since one needs at least two signals to create a composite signal, we consider starting points whereby two signals have separately evolved. There are two distinct starting points: (A) one in which one of the predator signals is present alongside the food signal, i.e., L → R → U (or E → G → D which is equivalent, due to the symmetry in the predators we have built into this model) and X → B → F; and (B) one in which both predator signals are present, i.e., L → R → U and E → G → D. Note that in the former case, there was no option other than to use the non-composite action B to act as a cue for the absence of food.

Table S2: Parameters used in direct numerical integration of the system of ODEs (S2) and (S3).
Principle 2 in the main text states that given starting point (A), the only way that a new signal can be added is by some external trigger: if the system (A) is unstable, it can be unstable only to losing existing actions or reactions, rather than to adding a new action or reaction, since these necessarily incur a cost χ or η respectively to no benefit. The only way a more complex system can be constructed in this case is the addition of a new signal by an external trigger. The only signal absent from (A) is the remaining predator signal. This leads to a system in which all three non-composite environmental states map to distinct non-composite actions, each of which yields a distinct reaction. It is possible that once this second predator signal is added, ritualisation of the action R • G → F may occur due to R • G being performed in the composite state L • E, and because the flee reaction is optimal in the presence of both predators. This leads to the establishment of a pseudo-composite signal within the classification scheme outlined in the main text. For the set of model parameters given in Table S2, we find that this is exactly what happens: see Figure S1, which shows the results of direct numerical integration of the ODE system (S2) and (S3) under these conditions.
Now we consider starting point (B). Principle 3 in the main text states that this starting point may be vulnerable to the emergence of a composite signal without any external triggering. As we further discuss in the main text, this is permitted when the system comprising three non-composite signals is unstable. In particular, this is true for the combination of parameter values given in Table S2. Figure S2 shows that ritualisation of R • G → F is followed by sensory manipulation of X → R • G, generating the putty-nosed monkey's communication system without the need for an external trigger.
On the other hand, when the non-composite signalling system is stable, this pathway is suppressed and the starting point (B) is expected to be stable, even if the system with a fully-composite signal is also stable (and therefore, in principle, a possible endpoint of the dynamics). Direct numerical integration of (S2) and (S3) under these conditions shows no change in strategy frequencies over time, indicating that the two-signal system (B) is stable in this case, and further signals can be added only by means of an external trigger. As we argue in the main text, the most likely scenario is that the stable system of three non-composite signals will be reached through such a mechanism, rather than the system with a composite signal, even though that is also stable.
This simple example thus demonstrates two constraints on the trigger-free emergence of fully-composite signals: (i) in order for a composite action to be used as a cue for an unrelated environmental state, no other action may be in use as part of a signal for that state; and (ii) the system in which the unrelated environmental state is signalled by a non-composite action must in itself be unstable.

Figure S1: Ritualisation of a pseudo-composite signal from a system of three non-composite signals in the absence of an external trigger. The left panel shows the frequencies ψ(A, E) of relevant action strategies; the right panel shows the frequencies φ(R, A) of relevant reaction strategies. These were obtained by direct numerical integration of the system (S2) and (S3) with the set of model parameters given in Table S2.

Figure S2: Evolution of a fully-composite signal from a system of two non-composite signals in the absence of an external trigger. This is achieved first by ritualisation of R • G → F, which creates a state that is then vulnerable to sensory manipulation of X → R • G. The left panel shows the frequencies ψ(A, E) of relevant action strategies; the right panel shows the frequencies φ(R, A) of relevant reaction strategies. These were obtained by direct numerical integration of the system (S2) and (S3) with the set of model parameters given in Table S2.