Competition for resources can explain patterns of social and individual learning in nature

In nature, animals often ignore socially available information despite the multiple theoretical benefits of social learning over individual trial-and-error learning. Using information filtered by others is quicker, more efficient and less risky than randomly sampling the environment. To explain the mix of social and individual learning used by animals in nature, most models penalize the quality of socially derived information as either out of date, of poor fidelity or costly to acquire. Competition for limited resources, a fundamental evolutionary force, provides a compelling, yet hitherto overlooked, explanation for the evolution of mixed-learning strategies. We present a novel model of social learning that incorporates competition and demonstrates that (i) social learning is favoured when competition is weak, but (ii) if competition is strong, social learning is favoured only when resource quality is highly variable and there is low environmental turnover. The frequency of social learning in our model always evolves until it reduces the mean foraging success of the population. The results of our model are consistent with empirical studies showing that individuals rely less on social information where resources vary little in quality and where there is high within-patch competition. Our model provides a framework for understanding the evolution of social learning, a prerequisite for human cumulative culture.


Introduction
Animals rely on current information about their environment, which they gain by directly interacting with their environment (individual learning) or by observing others (social learning). Theoretically, social learning has many benefits. In particular, it allows animals to learn behaviours that have been tested and retained by others, and these should have higher expected returns than untried new behaviours [1][2][3][4][5]. Social learning also has evolutionary consequences by facilitating local adaptation, and it is a requisite mechanism for the evolution of culture [1,6,7] including cumulative culture in humans. A recent tournament model demonstrated a consistent competitive advantage for individuals relying primarily on social learning [2]. Together these arguments suggest that social learning should be ubiquitous in nature, but observational and experimental evidence for example from goats [8], honeybees [9] and guppies [10] suggests that animals often opt for individual learning even when social information is available.
Social learning theory often focuses on how individuals are expected to deploy social learning strategies in a foraging context [11][12][13]. Models have explored the relative effectiveness of individual and social learning in spatially heterogeneous or temporally fluctuating environments [1,7,14], or when and where individuals should selectively use social learning [2,15,16]. In order to reproduce the coexistence of learning strategies seen in nature, models generally attribute either costs [7,14,15,17] or learning errors [2,15,18] to social learning (reviewed in [19]). The costs of social learning have been attributed to the development and maintenance of a sophisticated behavioural repertoire and nervous system [14]; to a metabolic cost for searching or joining [18]; or to acquiring inappropriate or outdated information [2]. Some models just assume arbitrary costs [7,17]. Previous models have shown that the fitness of learning strategies is frequency-dependent [6]. Although these models generate the mixed-learning strategies seen in nature, there is little empirical evidence for either the costs or low fidelity of social learning when compared with individual learning [20][21][22]. Thus, the wide use of individual learning in nature remains poorly explained.
Despite the common use of a foraging framework to study social learning, relatively little theoretical attention has been paid to the impact of competition. This is despite the recognition, since Malthus [23,24], that competition plays a fundamental role in shaping populations and behaviour. Resource competition should disproportionately affect social learners, since by copying others social learners are more likely to be forced to share resources in depleted patches. By contrast, directly sampling the environment allows individual learners to find unexploited patches. The argument is the same if individuals copy complex foraging behaviours rather than foraging patch choices. Two individuals that use the same behaviours will exploit similar resources, while an individual that innovates can access resources that are unused by others.
Social learning models [2] often assume that resource quality is highly variable and that resources within patches are unlimited. The latter assumption implies that there is no competition and is usually made to simplify model dynamics. However, in nature competition is ubiquitous [24-26]. Consider for example mate-choice preferences, nesting, feeding and oviposition sites. All represent cases of limited resources (mates, space, food or a combination). Where resources are sufficiently large or abundant, individuals experience little competition. Where resources are less abundant, individual success in resource acquisition is negatively impacted by competition. In addition to resource abundance, resource distribution should also affect the impact of competition on social learning. Where resource quality varies widely, the costs of sharing resources should be outweighed by the advantages of using others to find highly profitable patches. However, where there is little variation in resource quality, individuals may do best by identifying unused patches.
Where individuals compete for access to resources using aggression [27], resources can be lost or the net gain can be reduced by the physiological costs of contest. For example, house sparrows (Passer domesticus) foraging in small patches aggressively drive competitors away from resources. Dominant individuals forage less efficiently because they invest time in this resource-guarding behaviour, and subordinate individuals may fail to forage at all [28]. Thus, fewer resources are collected from patches with more foragers. By contrast, in pure exploitative competition individuals simply divide rewards rather than engaging in conflict. Thus, the distribution of resources can affect learning or foraging strategies in several ways; highly clumped and variable resources are more divisible but also potentially more likely to be contested than evenly spread resources. By using environmentally appropriate learning strategies in a frequency-dependent manner, individuals should be able to mitigate the effect of competition. Therefore, competition deserves consideration in learning models: it is important to understand how learning strategies compete against each other [29].
To study how competition affects the success of individual and social learning, we developed an agent-based model, where individuals forage for resources divided among patches. Patches contain different amounts of resource, which can change over time. Individuals learn about patches by direct exploration (individual learning) or by observing another individual that has visited that patch (social learning). Our model is akin to the classical producer-scrounger game [11]: individual learners produce new knowledge about the system, and social learners exploit the knowledge that others have produced. In contrast to earlier social learning models, resources are limited and individuals foraging at the same patch compete to collect them. We use this model to ask the following questions: (i) does competition reduce the effectiveness of social learning relative to individual learning, (ii) does environmental stability or variation in resource quality affect the relative effectiveness of social learning and individual learning, and (iii) how is the mean individual foraging success affected by different proportions of social learning in a group?

Material and methods
We used agent-based stochastic simulations to study the relative fitness of individual and social learning in different environments and under different competition strengths. We describe our model as if the information that individuals must learn is the location of resources that are distributed among patches. However, the model is equally appropriate if individuals must learn foraging strategies that offer different resource pay-offs, similar to the multi-armed bandit model used by Rendell et al. [2].

(a) The model
We modelled an environment comprised of patches. Each patch has a resource value, which describes the amount of resources that one insatiable individual foraging alone could collect from that patch in a single time step. We drew the resource value for each patch independently from a gamma distribution. In each time step, the resource value of each patch changes with probability t. If the resource value changes, a new resource value for that patch is drawn independently from the same gamma distribution. We used different gamma distributions to study environments with different resource distributions. We set the parameters of each gamma distribution so that the expected resource value in any patch was the same in all environments (here set to 4), but the variability among patches differed among environments. Thus, each environment we studied could be fully characterized by the number of patches, the rate of change t and the evenness G of the resource distribution. We measured G using the Gini index [30], which is widely employed by economists to measure wealth inequality (see the electronic supplementary materials). The Gini index can range from G = 0 (if all patches have the same resource value) to G = 1 (if all resources are concentrated in a single patch). We studied values between G = 0.14, where there is little variability among patches, and G = 0.83, where most patches are poor and a few patches are very rich (electronic supplementary material, table S1).
Individuals in our model collect resources from patches. Individuals can collect resources only from patches that they know about, and they learn about patches in one of two ways. Each individual is either an individual learner or a social learner. A social learner observes a randomly selected exploiting individual, and learns the amount of resources that individual collects from the patch it is currently exploiting.
An individual learner observes a randomly selected patch, and learns the amount of resources that individuals in that patch are currently collecting. If no individuals are exploiting the patch, the individual learner learns the full resource value of the patch. We call the amount the individual learns the anticipated reward of the patch. When an individual first learns about a patch, the anticipated reward is the amount that is currently being collected, and is not corrected for the added competition that will occur if the individual joins exploiters in the patch. In nature, animals may or may not be able to adjust anticipated rewards for anticipated competition. We have chosen this formulation because it ensures that social and individual learners obtain equally accurate information. Thus, there is no implied penalty on the quality of information obtained by social learners. An individual that has learned the anticipated reward of a patch maintains that information until it is updated (for example, if it visits that patch to forage or if it learns about the patch again) or until the individual dies. Thus, if the resource value or number of individuals using a patch changes, an individual's anticipated reward from that patch will be incorrect until it is updated.
rspb.royalsocietypublishing.org Proc. R. Soc. B 282: 20151405
Time in our model is divided into steps that we call rounds. In each round, each individual either learns (with probability b) or exploits resources from patches it has previously learned about (with probability 1 − b). We set b = 0.2, which was found to be most advantageous for individual learners in a previous model [31]. If an individual exploits, it visits the patch that it knows about and from which it anticipates the highest reward.
It collects an amount of resources from that patch that depends on p, n, u and c. Here, p is the resource value of the patch, n is the number of individuals that visit the patch in that round, u describes the maximum resources that an individual can collect in a single round (e.g. due to satiation) and c scales the strength of competition between individuals in a patch (e.g. [32,33]). If c = 0, then individuals are not affected by other individuals in the same patch and each receives the full resource value p (i.e. there is no competition). Where c = 1, individuals receive exactly p/n (i.e. there is full exploitative competition), and where c > 1, individuals receive less than p/n (i.e. there is interference competition). Every individual that exploits a patch updates its anticipated reward of that patch according to the amount of resources it has collected. If an individual is selected to exploit but has not yet learned about any patches, then it does nothing in that round.
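The ingredients of the model described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the gamma shape parameter `shape` is a hypothetical knob for resource evenness, the Gini index is computed with the standard sample formula, and the pay-off rule min(u·p, p/n^c) is only one form that is consistent with the limiting cases given in the text (c = 0, c = 1 and c > 1 when u = 1).

```python
import random

MEAN_RESOURCE = 4.0  # expected resource value per patch, as stated in the text


def draw_patch_value(shape, rng):
    """Draw one resource value from a gamma distribution with mean MEAN_RESOURCE.

    `shape` is a hypothetical parameter: smaller values give a more uneven
    distribution (a few rich patches, many poor ones).
    """
    return rng.gammavariate(shape, MEAN_RESOURCE / shape)


def turnover(patches, shape, t, rng):
    """Redraw each patch value independently with probability t."""
    return [draw_patch_value(shape, rng) if rng.random() < t else v
            for v in patches]


def gini(values):
    """Sample Gini index: summed absolute differences over 2 * n * total."""
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def collected(p, n, c, u):
    """Assumed pay-off rule (not confirmed by the source): min(u*p, p / n**c).

    p: patch resource value, n: number of visitors this round,
    c: competition strength, u: cap on what one individual can collect,
    expressed as a fraction of p.
    """
    return min(u * p, p / n ** c)
```

With u = 1, four foragers sharing a patch of value 8 would each collect 8 when c = 0, 2 when c = 1 and 0.5 when c = 2 under this assumed rule.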
In each round, each individual in the population dies with probability d = 0.02. Any individual that has survived its first 100 rounds dies after its 100th round. Thus, the life expectancy of individuals is 43.37 rounds. When an individual dies, a new individual with no knowledge of the environment is immediately born to replace it. Thus, the population size is constant.
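The stated life expectancy can be checked directly. The sketch below assumes that the round in which an individual dies counts as a round lived, which is the reading that reproduces the figure in the text; the function name is illustrative.

```python
def life_expectancy(d, max_rounds):
    """Expected number of rounds lived with per-round death probability d
    and a hard age limit of max_rounds.

    An individual is still alive at the start of round k with probability
    (1 - d) ** (k - 1), so the expectation is the sum of these terms.
    """
    return sum((1 - d) ** (k - 1) for k in range(1, max_rounds + 1))


# Closed form of the same geometric sum: (1 - (1 - d) ** max_rounds) / d
print(round(life_expectancy(0.02, 100), 2))  # 43.37
```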
(b) Analysis 1: frequency-dependent effects of social and individual learning without and with competition
Our first goal was to understand how the relative fitness of individual and social learners depends on the frequencies of each strategy. This is important because frequency dependence is needed to explain how learning strategies coexist in nature, and the source of this frequency dependence is still poorly understood. The relative amount of resources collected by an individual is a performance measure that is commonly used as a fitness proxy in foraging theory [2,34], and we used this here. We ran simulations with 100 patches and 100 individuals. We initialized each simulation with a fixed frequency of individual learners, and we iterated 10^4 model rounds. To maintain fixed frequencies of each learning strategy, we assumed that whenever an individual died it was immediately replaced by the birth of a new individual with the same learning strategy. In each of the last 2500 rounds of each simulation, we recorded the average resources collected by each individual or social learner. We conducted 100 simulations for each frequency of individual learners in the set {0.01, 0.1, 0.2, ..., 0.9, 0.99} and evenness in the set G ∈ {0.14, 0.83} both without (c = 0) and with (c = 1) competition. In all simulations, we set u = 1 (i.e. individuals can collect up to the full resource value of the patch they visit).
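A compact version of such a fixed-frequency simulation might look as follows. It is a sketch, not the authors' code, and several choices are assumptions made for illustration: the gamma `shape` parameter stands in for the evenness G, the default turnover rate `t` is arbitrary, the pay-off rule min(u·p, p/n^c) is one form consistent with the limiting cases in the text, exploitation is resolved before learners observe within a round, and the default run length is shorter than the 10^4 rounds used in the study.

```python
import random


def simulate(freq_individual, c, shape, t=0.1, rounds=2000, record_last=500,
             n_patches=100, n_agents=100, b=0.2, d=0.02, max_age=100,
             u=1.0, mean_value=4.0, seed=0):
    """Return mean resources per round for (individual, social) learners."""
    rng = random.Random(seed)

    def draw():
        return rng.gammavariate(shape, mean_value / shape)

    patches = [draw() for _ in range(n_patches)]
    n_ind = round(freq_individual * n_agents)
    agents = [{"ind": i < n_ind, "age": 0, "memory": {}}
              for i in range(n_agents)]
    total = {True: 0.0, False: 0.0}  # resources summed by strategy
    count = {True: 0, False: 0}      # agent-rounds by strategy

    for rnd in range(rounds):
        # Environmental turnover: each patch is redrawn with probability t.
        patches = [draw() if rng.random() < t else v for v in patches]

        # Split the population into learners and exploiters for this round.
        learners, visits = [], {}
        for a in agents:
            if rng.random() < b:
                learners.append(a)
            elif a["memory"]:
                best = max(a["memory"], key=a["memory"].get)
                visits.setdefault(best, []).append(a)
            # otherwise the agent knows no patch yet and does nothing

        # Exploiters share each visited patch (assumed pay-off rule).
        gain = {}  # patch index -> amount collected by each visitor
        for patch, visitors in visits.items():
            value = patches[patch]
            gain[patch] = min(u * value, value / len(visitors) ** c)
            for a in visitors:
                a["memory"][patch] = gain[patch]

        # Learners observe the outcome of this round's exploitation.
        occupied = [patch for patch, v in visits.items() for _ in v]
        for a in learners:
            if a["ind"]:
                patch = rng.randrange(n_patches)
                a["memory"][patch] = gain.get(patch, patches[patch])
            elif occupied:
                patch = rng.choice(occupied)  # patch of a random exploiter
                a["memory"][patch] = gain[patch]

        if rnd >= rounds - record_last:
            for patch, visitors in visits.items():
                for a in visitors:
                    total[a["ind"]] += gain[patch]
            for a in agents:
                count[a["ind"]] += 1

        # Deaths: each is replaced by a naive newborn with the same strategy.
        for a in agents:
            a["age"] += 1
            if rng.random() < d or a["age"] >= max_age:
                a["age"], a["memory"] = 0, {}

    return tuple(total[k] / count[k] if count[k] else None
                 for k in (True, False))
```

Calling `simulate(0.5, c=1.0, shape=0.5)` returns a pair of averages for individual and social learners; sweeping `freq_individual` over a grid and repeating with different seeds would mirror the design described above.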
(c) Analysis 2: the effect of environmental parameters on the fitness of learning strategies
The approach above allows us to predict the frequency of individual learning in the population at which each social or individual learner collects the same amount of resources. We expect this to be the frequency at which the strategies coexist [6]. However, this approach is computationally intensive. Therefore, to understand how competition affects the fitness of learning strategies across a broad range of environmental conditions, we used an evolutionary algorithm. As above, we modelled systems with 100 patches and 100 individuals. We initialized each population with 50 individual and 50 social learners, and we iterated 10^4 model rounds. When an individual in the population died, it was immediately replaced by the offspring of a surviving individual. The probability that each surviving individual was selected as the parent was proportional to the average resources per round that the individual had collected over the course of its lifetime. This translation of resources collected to reproductive potential has been used in previous studies of individual and social learning [2]. Offspring inherited their learning strategy from their parent, except that with probability 0.01 their learning strategy mutated to the opposite strategy. Mutation prevents strategies from becoming fixed due to stochastic drift. We conducted 100 simulations for each combination of G ∈ {0.14, 0. 16 figure S3). We calculated the average frequency of individual learners in the population over the last 2500 rounds of each simulation. This analysis assumes that each forager is either an individual learner or a social learner, and uses only that learning strategy. In the electronic supplementary materials, we analyse a similar model in which each individual can use a combination of the two learning strategies (electronic supplementary material, figure S2).
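The replacement rule described for analysis 2 (a parent chosen in proportion to its lifetime average intake, with a small chance of mutation) can be sketched as below. The function names, the tuple layout and the uniform fallback when no survivor has collected anything are assumptions made for this illustration.

```python
import random

MUTATION_RATE = 0.01  # probability of switching strategy at birth (from the text)


def choose_parent(survivors, rng):
    """Pick a parent with probability proportional to mean resources per round.

    survivors: list of (strategy, mean_resources_per_round) tuples.
    Assumption: if nobody has collected anything yet, choose uniformly.
    """
    weights = [fitness for _, fitness in survivors]
    if sum(weights) == 0:
        return rng.choice(survivors)
    return rng.choices(survivors, weights=weights, k=1)[0]


def offspring_strategy(parent_strategy, rng, mutation_rate=MUTATION_RATE):
    """Inherit the parent's strategy, flipping it with probability mutation_rate."""
    if rng.random() < mutation_rate:
        return "social" if parent_strategy == "individual" else "individual"
    return parent_strategy


rng = random.Random(42)
survivors = [("individual", 1.5), ("social", 3.0), ("social", 0.5)]
parent_strategy, _ = choose_parent(survivors, rng)
child = offspring_strategy(parent_strategy, rng)
```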

(e) Analysis 4: the effect of social learners on mean individual foraging success
In analyses 1-3, we asked how environmental parameters and competition strength affect the stable frequency of social learners in a population. In this final analysis, we asked how the frequency of social learners affects mean individual foraging success. We defined mean individual foraging success as the average resources collected per individual per foraging round. Using the approach from analysis 1, we conducted simulations in which the frequency of social learners was fixed at f. We conducted 100 independent simulations over 10^4 rounds, and averaged the mean individual foraging success in populations with f ∈ {0.01, 0.1, 0.2, ..., 0.9, 0.99} over the last 2500 rounds. Then, using the approach from analyses 2 and 3, we conducted 100 additional simulations in which the frequency of social learners was allowed to evolve. We iterated simulations for 10^4 rounds and calculated the mean evolved frequency of social learners and the mean individual foraging success in the evolved population over the last 2500 rounds of each simulation. We compared the mean individual foraging success in evolved populations to that in populations with fixed frequencies of social learning. We conducted this analysis for populations and environments characterized by each combination of t ∈ {0.01, 0.1}, c ∈ {0.6, 1, 1.6}, u ∈ {2, 1} and G = 0.83.

Results
(a) Competition enables the coexistence of learning strategies
When there is no competition, social learners collect more resources than individual learners regardless of the relative frequency of the two strategies (figure 1a,b). Thus, social learners will outcompete and exclude individual learners from the population. With competition, however, there is frequency dependence and each strategy is favoured when it is rare (figure 1c,d). The frequency at which the two strategies have the same fitness, and thus at which they should coexist, depends on the resource distribution (cf. figure 1c,d).

(b) Stable and highly skewed resources favour social learning
With competition, social learning is favoured when resources are unevenly distributed and highly predictable (figure 2). This is because foragers gain higher returns by identifying and exploiting high-quality patches, even if they must share those patches with others. If resources are more evenly distributed, individual learning is favoured. In this case, foragers do better by spreading themselves evenly among patches and avoiding competition. Rapidly changing patch quality also favours individual learning. In this case, the information held by others quickly becomes obsolete. The slightly better information a forager obtains by learning from others is not worth the competition it faces from the individual it has copied. We found qualitatively similar effects of resource distribution and rate of environmental change using a model in which each individual was allowed to employ a mixture of individual and social learning (electronic supplementary material, figure S2).
The steady state for social learning in analysis 2 does not precisely match the predictions of analysis 1. This is due to the effect of mutation in analysis 2. Symmetrical mutation pushes the ratio of social and individual learners towards 50%. When selection is weak (e.g. on the lower left side of figure 2), the steady state in the evolutionary model is close to 50%. When selection is strong (e.g. on the right side of figure 2), the results of the evolutionary analysis are similar to the predictions of analysis 1.
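The pull towards 50% can be seen in a short calculation that ignores selection and drift (a simplifying assumption made only for this illustration). Let x be the frequency of social learners among parents and m the mutation probability (m = 0.01 in analysis 2). If a whole cohort of offspring were produced at once, the expected frequency of social learners among them, x', would be:

```latex
\[
x' = (1-m)\,x + m\,(1-x) = x + m\,(1-2x)
\quad\Longrightarrow\quad
x' - \tfrac{1}{2} = (1-2m)\left(x - \tfrac{1}{2}\right),
\qquad x^{*} = \tfrac{1}{2}.
\]
```

Each such replacement shrinks the deviation from 1/2 by a factor of (1 − 2m), so selection has to outweigh this pull for the evolved frequency to settle far from 50%.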

(c) Competition favours individual learning
Social learning dominates when competition is very weak, regardless of resource distribution (figure 3). This is because every individual can collect nearly the full resource value from a patch regardless of how many individuals are present, so there is little disadvantage in joining others. With increasing competition, there is more incentive to find unexploited patches and consequently individual learning increases. The more evenly resources are distributed, the faster the proportion of social learning decreases with increasing competition.
If multiple foragers in a patch cannot collect more total resources than one forager could collect alone (i.e. if c ≥ 1 and u = 1), then the mean individual foraging success is maximized when all foragers are individual learners. This is true because every social learner reduces the success of the foragers it copies by at least as much as it collects itself. If multiple foragers in a patch can sometimes collect more total resources than one forager could collect alone (i.e. if c < 1 or u < 1), then some social learning can increase the mean individual foraging success. In this case, an individual learner that discovers an unexploited high-value patch may be unable to exploit all of the resources in that patch. By following that individual, social learners can find the high-value patch and access the unexploited resources more quickly than individual learners can. These social learners gain more by copying than the patch finders lose by being copied. Thus, the mean individual foraging success is higher in a population with some social learners than in a population with all individual learners. If the frequency of social learning in a population is allowed to evolve, the mean individual foraging success at the evolved frequency (dots in figure 4) can be higher or lower than in a population of all individual learners. Nonetheless, the frequency to which social learning evolves always exceeds the frequency that maximizes the mean individual foraging success. Thus, at the frequency to which they evolve, social learners negatively affect the mean foraging success in their system.
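The distinction between the two regimes can be made concrete by computing the total amount that n foragers remove from one patch. The sketch below assumes, purely for illustration, that each forager collects min(u·p, p/n^c); this form matches the limiting cases described in the methods but is not confirmed as the authors' formula, and the function name is hypothetical.

```python
def total_yield(p, n, c, u):
    """Total resources collected by n foragers in a patch of value p,
    assuming each forager collects min(u * p, p / n**c)."""
    return n * min(u * p, p / n ** c)


p = 8.0
# c >= 1 and u = 1: joining never raises the total taken from the patch.
print(total_yield(p, 1, 1.0, 1.0), total_yield(p, 4, 1.0, 1.0))
print(total_yield(p, 2, 1.6, 1.0) < p)
# c < 1 or u < 1: two foragers together can collect more than one alone.
print(total_yield(p, 2, 0.6, 1.0) > total_yield(p, 1, 0.6, 1.0))
print(total_yield(p, 2, 1.0, 0.5) > total_yield(p, 1, 1.0, 0.5))
```

Under this assumed rule, a copier in the first regime can only take resources that the copied forager would otherwise have collected, whereas in the second regime it can add to the total harvested from the patch.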

Discussion
Our results demonstrate that in contrast to a competition-free world, which favours social learning, competition for resources can promote mixed individual and social learning strategies. Social learning is highly effective where resources are unevenly distributed because it allows individuals to quickly find good but densely occupied patches. Even with competition, pay-offs in these patches can be higher than in randomly sampled patches. By contrast, when patches vary little in quality, individual learning is advantageous. In this case, there are no patches with extremely high resource values, and individual learning allows foragers to find unoccupied patches and so to avoid competition. Moreover, where environmental turnover is high and resource predictability is low, information quickly becomes outdated. When information is less accurate, the positive effect of collecting information from other foragers becomes small relative to the negative effect of competing with those foragers for resources, and individual learning is again advantageous. Thus, social learning is favoured where resource distribution is uneven yet predictable, and individual learning is favoured where resource distribution is even and/or unpredictable.
Previous social learning models have typically not incorporated competition (e.g. [2,6,7,14-16]). Instead, they have explained the coexistence of individual and social learning strategies by attributing arbitrary costs, low fidelity or inefficiencies to social learning [2,6-10]. Given that competition is ubiquitous in nature, it is commonly incorporated in foraging models, for instance in ideal-free distribution [35] and producer-scrounger games [33]. Our analysis demonstrates the importance of including competition in models of social learning: models that ignore competition may give incomplete or misleading results.
The results from our competition-based learning model are consistent with what we find in nature. Sticklebacks (Pungitius pungitius), for example, use individual learning to make foraging decisions when competition is high, but rely almost entirely on social learning when competition is low [36]. Honeybees (Apis mellifera) rely heavily on social information when foraging resources are unevenly distributed and are more likely to use individual information when resources are evenly distributed [20]. Foraging returns for rats (Rattus norvegicus) are likely to be highly variable because some potential food items are harmful or fatal, and rats exhibit strong preferences for socially learned food items [37]. By contrast, herbivores like goats experience homogeneous resource distributions and rely on individual information when exhibiting grazing preference [8].
In our model, social learning can evolve to high frequencies. This is true because social learning can be advantageous to individuals. However, in a classic thought experiment, Rogers [6] predicted that social learning at its evolutionarily stable frequency would have no effect on population mean fitness. This has become known as Rogers' paradox [15]. It is a paradox because most researchers believe that the capacity for social learning to increase population mean fitness explains the evolution of culture, and if Rogers' prediction holds this cannot be true. Rogers' model assumes that the fitness of individual learners does not depend on the frequency of social learners. Because the fitness of social learners is reduced as they become more common, social learners increase in the population only until they have the same fitness as individual learners. Therefore, at the stable state, the mean fitness of the population with social learners is the same as the mean fitness of the population without them. In the presence of competition, the frequency of social learners does affect the fitness of individual learners. In this case, social learning at its evolved frequency can either increase or decrease population mean fitness relative to populations with all individual learners, as we show here. Thus, competition offers a resolution to Rogers' paradox. Interestingly, the evolved frequency of social learning is always greater than is necessary to maximize the population mean fitness. Thus, at the evolved frequency of social learning, individuals that use social learning beyond the optimal frequency have a negative effect on population mean fitness.
We have restricted our analysis to cases in which within-patch resource collection by individuals is either strictly independent (i.e. c = 0) or competitive (i.e. c > 0). In nature, resource collection can also be cooperative (i.e. c < 0). In this case, two or more individuals working together can access resources that one individual could not access alone. Cooperative foraging has been observed in mammals [38], birds [39], spiders [40], possibly fish [41] and is common in eusocial insects [42]. If foraging is cooperative, evolutionarily stable frequencies of social learning might not reduce mean individual foraging success. However, cooperative foraging is likely to be a derived trait that evolves from simpler intraspecific interactions. Thus, as cooperative foraging evolves, social learning may still pass through a stage in which it reduces the fitness of its population.
The ability to learn from others is a prerequisite for the evolution of culture [1,6,7,43,44], including the cumulative culture that has made humans so successful. We have shown that the evolution of social learning is shaped by competition. This result provides an important new context for future studies of cultural evolution.