The evolution of cooperation by social exclusion

The exclusion of freeriders from common privileges or public acceptance is widely found in the real world. Current models on the evolution of cooperation with incentives mostly assume peer sanctioning, whereby a punisher imposes penalties on freeriders at a cost to itself. It is well known that such costly punishment has two substantial difficulties. First, a rare punishing cooperator barely subverts the asocial society of freeriders, and second, natural selection often eliminates punishing cooperators in the presence of non-punishing cooperators (namely, "second-order" freeriders). We present a game-theoretical model of social exclusion in which a punishing cooperator can exclude freeriders from benefit sharing. We show that such social exclusion can overcome the above-mentioned difficulties even if it is costly and stochastic. The results do not require a genetic relationship, repeated interaction, reputation, or group selection. Instead, only a limited number of freeriders are required to prevent the second-order freeriders from eroding the social immune system.


Introduction
We frequently engage in voluntary joint enterprises with non-relatives, activities that are fundamental to society. The evolution of cooperative behaviours is an important issue because without any supporting mechanism [1], natural selection often favours those that contribute less at the expense of those that contribute more. A minimal situation could easily cause the ruin of a commune of cooperators, namely, the 'tragedy of the commons' [2]. Here, we consider different types of punishment, such as a monetary fine [3][4][5][6][7] and ostracism [8][9][10][11], for the evolution of cooperation. Punishment can reduce the expected payoff for the opponent, and subsequently, change natural selection preferences, to encourage additional contributions to communal efforts [12]. Our model looks at this situation, because 'very little work has addressed questions about the form that punishment is likely to take in reality and about the relative efficacy of different types of punishment' [13].
Here, we choose to focus on social exclusion, which is a common and powerful tool to penalize deviators in human societies, and includes behaviours such as eviction, shunning and ignoring [14][15][16]. For self-sustaining human systems, indeed, the ability to distinguish among individuals and clarify who should participate in the sharing of communal benefits is crucial and expected (of its members) [17]. A specific example is found in the case of traffic violators who are punished, often strictly by suspending or revoking their driver licence for public roads. Among non-humans, shunning through partner switching is a common mechanism for inequity aversion and cooperation enforcement [13,18,19]. Experimental studies have shown, for instance, that chimpanzees can use a mechanism to exclude less cooperative partners from potential collaborations [20], or that reef fish will terminate interaction with cleaner fish that cheat by eating the host's mucus rather than parasites [21].
In joint enterprises, by excluding freeriders from benefit sharing, the punishers can naturally benefit, because such exclusion often decreases the number of beneficiaries, with little effect on the total benefit. Consider the example of the division of a pie provided by some volunteers to a group. If a person is one of the volunteers, it may be justifiable in terms of fairness to suggest or even force freeriders to refrain from sharing in the pie. Although excluding freeriders can be stressful, it increases the share of the pie for the contributors, including the person who performs the actual exclusion. If the situation calls for it, the excluded freerider's share of the group benefits may separately be redistributed among the remaining members in the group [22,23]. Therefore, in either case, the excluded member will obtain nothing from the joint enterprise and the exclusion causes immediate increases in the payoff for the punisher and also the other remaining members in the group. This is a 'self-serving' form of punishment [13,18]. Importantly, if the cost of excluding is smaller than the reallocated benefit, social exclusion can provide immediate net benefits even to the punisher. This can potentially motivate the group members to contribute to the exclusion of freeriders; however, our understanding of how cooperation unfolds through social exclusion is still 'uncharted territory' [24].
Most game-theoretical works on cooperation with punishment have focused on other forms of punishment, for example, costly punishment that reduces the payoffs of both the punishers and those who are punished. As is well known, costly punishment poses fundamental puzzles with regard to its emergence and maintenance. First, costly punishment is unlikely to emerge in a sea of freeriders, in which almost all freeriders are unaffected and a rare punisher would suffer a decrease in its payoff through punishing left and right [18,25,26,27]. Moreover, although punishers can stabilize cooperation when they are initially prevalent, non-punishing cooperators (so-called 'second-order freeriders') can undermine full cooperation once it is established [3,13,17,24,28,29].
In terms of self-serving punishments, however, we have only started to confront the puzzles that emerge in these scenarios. We ask here what happens if social exclusion is applied: do players move towards excluding others, and can freeriders be eliminated? Or will others in the group resist? Our main contribution is to provide a detailed comparative analysis of social exclusion and costly punishment, two different types of punishment, from the viewpoint of their emergence and maintenance. With the self-serving function, social exclusion is predicted to emerge and be maintained more easily than costly punishment.
Few theoretical works have investigated the conditions under which cooperation can evolve by the exclusion of freeriders. Our model requires no additional modules, such as a genetic relationship, repeated games, reputation or group selection. Considering these modules is imperative for understanding the evolution of cooperation in realistic settings. In fact, these modules may have already been incorporated in earlier game-theoretical models that included the exclusion of freeriders [30][31][32], but we are interested in first looking at the most minimal of situations to get at the core relative efficacy of costly punishment versus social exclusion.

Game-theoretical model and analysis
To describe these punishment schemes in detail, we begin with standard public good games with a group size of $n \ge 2$ [26,33,34] in an infinitely large, well-mixed population of players. We specifically apply a replicator system [35] for the dynamic analysis, as based on preferentially imitating strategies of the more successful individuals. In the game, each player has two options. The 'cooperator' contributes $c > 0$ to a common pool, and the 'defector' contributes nothing. The total contribution is multiplied by a factor of $r > 1$ and then shared equally among all $n$ group members. A cooperator will thus pay a net cost $s = c(1 - r/n)$ through its own contribution. If all cooperate, the group yields the optimal benefit $c(r - 1)$ for each; if all defect, the group does nothing. To adhere to the spirit of the tragedy of the commons, we hereafter assume that $r < n$ holds, in which case a defecting player can improve its payoff by $s > 0$, whatever the co-players do, and the defectors dominate the cooperators. To observe the robustness for stochastic effects, we also consider an individual-based simulation with a pairwise comparison process [36,37]. See the electronic supplementary material for these details. In what follows, we extend the standard public good game to one of the different types of punishment, costly punishment or social exclusion, and investigate the evolutionary fate of populations.
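The defector's advantage in the standard game can be checked numerically. The following is a minimal sketch, not the authors' code: the helper `pgg_payoffs` is hypothetical, and the parameter values (n = 5, r = 3, c = 1) are illustrative assumptions rather than values taken from the text. It confirms that switching from cooperation to defection gains exactly s = c(1 - r/n), whatever the co-players do.

```python
def pgg_payoffs(n, r, c, k):
    """Payoffs in one public good game with k cooperators among n players.

    Returns (cooperator_payoff, defector_payoff). Hypothetical helper for
    illustration only.
    """
    share = r * c * k / n          # multiplied pool, shared equally by all n
    return share - c, share


# Illustrative parameters (assumed), chosen so that 1 < r < n.
n, r, c = 5, 3.0, 1.0
s = c * (1 - r / n)                # net cost of one's own contribution

for m in range(n):                 # m = cooperators among the n - 1 co-players
    as_cooperator, _ = pgg_payoffs(n, r, c, m + 1)
    _, as_defector = pgg_payoffs(n, r, c, m)
    # Defecting always improves the focal payoff by exactly s.
    assert abs((as_defector - as_cooperator) - s) < 1e-12

print("gain from defecting:", s)
print("payoff if all cooperate:", pgg_payoffs(n, r, c, n)[0])
```

Because the gain from defecting does not depend on m, defection dominates cooperation for any r < n, which is the dilemma the two punishment schemes below are meant to resolve.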

(a) Type A: costly punishment
We then introduce a third strategy, 'punisher', which contributes $c$ and, moreover, punishes the defectors. Punishing incurs a cost $g > 0$ per defector to the punisher and imposes a fine $b > 0$ per punisher on the defector. We denote by $x$, $y$ and $z$ the frequencies of the cooperator (C), defector (D) and punisher (P), respectively. Thus, $x, y, z \ge 0$ and $x + y + z = 1$. Given the expected payoffs $P_S$ for the three strategies ($S$ = C, D and P), the replicator system is written by

$$\dot{x} = x(P_C - \bar{P}), \quad \dot{y} = y(P_D - \bar{P}), \quad \dot{z} = z(P_P - \bar{P}), \qquad (2.1)$$

where $\bar{P} := xP_C + yP_D + zP_P$ describes the average payoff in the entire population. Three homogeneous states ($x = 1$, $y = 1$ and $z = 1$) are equilibria. Indeed,

$$P_C = \frac{rc}{n}(n-1)(x+z) - s, \qquad (2.2a)$$

$$P_D = \frac{rc}{n}(n-1)(x+z) - b(n-1)z \qquad (2.2b)$$

and

$$P_P = \frac{rc}{n}(n-1)(x+z) - s - g(n-1)y. \qquad (2.2c)$$

Here, the common first term denotes the benefit that resulted from the expected $(n-1)(x+z)$ contributors among the $(n-1)$ co-players, and $b(n-1)z$ and $g(n-1)y$ give the expected fine on a defector and expected cost to a punisher, respectively. First, consider only the defectors and punishers (figure 1). Thus, $y + z = 1$ and the replicator system reduces to $\dot{z} = z(1-z)(P_P - P_D)$. Solving $P_P = P_D$ shows that, if the interior equilibrium R between the two strategies exists, it is uniquely determined by

$$z = \frac{s + g(n-1)}{(n-1)(b+g)}. \qquad (2.3)$$

The point R is unstable. If the fine is too small, $b < s/(n-1) =: b_0$, punishment has no effect on defection dominance; otherwise, R appears and the dynamics become bistable [33,34]: R separates the state space into basins of attraction of the different homogeneous states for both the defector and punisher. The smaller $g$ or larger $b$, the more the coordinate of R shifts to the defector end: the more relaxed the initial condition required to establish a punisher population (figure 1a). Note that a rare punisher is incapable of invading a defector population, because the resident defectors, almost all unpunished, earn 0 on average, whereas the rare punisher earns $-s - g(n-1) < 0$.

rspb.royalsocietypublishing.org Proc R Soc B 280: 20122498
Next, consider all of the cooperators, defectors and punishers (figure 1b). Without defectors, no punishing cost arises. Thus, no natural selection occurs between the cooperators and punishers, and the edge between the cooperators and punishers ($x + z = 1$) consists of fixed points. The segment consisting of those fixed points with $z > b_0/b$ is stable against the invasion of rare defectors, and the other segment is not [33,34]. Therefore, this stable segment appears on the edge PC if and only if the edge PD is bistable. We denote by $K_0$ the boundary point, with $z = b_0/b$. There can thus be two attractors: the vertex D and the segment $PK_0$. The smaller $g$ or larger $b$, the broader the basin of attraction for the mixture states of the contributors. That is, the higher the punishment efficiency, the more relaxed the initial condition required to establish a cooperative state. This may corroborate evidence from recent public good experiments [38-40], which suggest the positive effects of increasing the punishment efficiency on average cooperation.
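The payoffs and the two landmark points of the costly punishment game can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the payoff expressions follow the verbal definitions above (common benefit term, net cost s, expected fine b(n - 1)z, expected punishing cost g(n - 1)y), and n = 5, r = 3, c = 1 are assumed values. Only b = 0.5 and g = 0.03 appear in the figure 1 caption.

```python
def punishment_payoffs(x, y, z, n, r, c, b, g):
    """Expected payoffs (P_C, P_D, P_P) under costly punishment.

    x, y, z: frequencies of cooperators, defectors, punishers (x + y + z = 1).
    b: fine per punisher; g: punishing cost per defector.
    """
    s = c * (1 - r / n)                       # net cost of contributing
    benefit = r * c / n * (n - 1) * (x + z)   # common first term
    p_c = benefit - s
    p_d = benefit - b * (n - 1) * z
    p_p = benefit - s - g * (n - 1) * y
    return p_c, p_d, p_p


def equilibrium_R(n, r, c, b, g):
    """Punisher frequency z at R on the edge PD, or None if b <= b_0."""
    s = c * (1 - r / n)
    b0 = s / (n - 1)
    if b <= b0:
        return None                           # defectors simply dominate
    return (s + g * (n - 1)) / ((n - 1) * (b + g))


n, r, c, b, g = 5, 3.0, 1.0, 0.5, 0.03        # n, r, c are assumed values
z_R = equilibrium_R(n, r, c, b, g)
_, p_d, p_p = punishment_payoffs(0.0, 1 - z_R, z_R, n, r, c, b, g)
assert abs(p_d - p_p) < 1e-12                 # payoffs coincide at R

z_K0 = (c * (1 - r / n) / (n - 1)) / b        # boundary point K_0: z = b_0 / b
print("z at R:", z_R, " z at K_0:", z_K0)
```

Evaluating `punishment_payoffs(0, 1, 0, ...)` reproduces the invasion argument: the resident defectors earn 0 while a rare punisher earns -s - g(n - 1).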
However, the stability of $PK_0$ is not robust to small perturbations of the population. Because $P_P < P_C$ holds in the interior space, we have $d(z/x)/dt = (z/x)(P_P - P_C) < 0$ there: the frequency ratio of the punishers to cooperators decreases over time while an interior trajectory converges to the boundary. Thus, if rare defectors are introduced, for example by mutation or immigration, into a stable population of the two types of contributors, the punishers will gradually decline with each elimination of the defectors. Such small perturbations push the population into an unstable regime around $K_0C$, where the defectors can invade the population and then take it over. See the electronic supplementary material, figure S1 and also Hauert et al. [26] for individual-based simulations.
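The erosion of punishers described above can be illustrated by integrating the replicator equations numerically. The sketch below uses a plain explicit Euler scheme, which is an assumption made for brevity and not necessarily the method the authors used. The parameter values n = 5, r = 3, c = 1 and the initial state are also assumed. It starts from a population of contributors with a few defectors and shows that the ratio z/x is lower once the defectors have been eliminated.

```python
def punishment_payoffs(x, y, z, n, r, c, b, g):
    """Expected payoffs (P_C, P_D, P_P) under costly punishment."""
    s = c * (1 - r / n)
    benefit = r * c / n * (n - 1) * (x + z)
    return (benefit - s,
            benefit - b * (n - 1) * z,
            benefit - s - g * (n - 1) * y)


def replicator_step(state, payoffs, dt):
    """One explicit Euler step of the replicator equations."""
    avg = sum(f * p for f, p in zip(state, payoffs))
    new = [f + dt * f * (p - avg) for f, p in zip(state, payoffs)]
    total = sum(new)
    return [f / total for f in new]          # renormalise against rounding drift


n, r, c, b, g = 5, 3.0, 1.0, 0.5, 0.03       # n, r, c are assumed values
state = [0.30, 0.05, 0.65]                   # x, y, z: few defectors, many punishers
ratio_start = state[2] / state[0]

for _ in range(5000):
    payoffs = punishment_payoffs(*state, n, r, c, b, g)
    state = replicator_step(state, payoffs, 0.01)

ratio_end = state[2] / state[0]
print("z/x before:", ratio_start, "after:", ratio_end)
```

Each such episode lowers z/x a little, so repeated introductions of rare defectors move the population along the edge PC towards $K_0$ and eventually past it.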

(b) Type B: social exclusion
We turn next to social exclusion. The third strategy is now replaced with the excluder (E) that contributes $c$ and also tries to exclude defectors from sharing benefits at a cost to itself of $g > 0$ per defector. The multiplied contribution is shared equally among the remaining members in the group. We assume that an excluder succeeds in excluding a defector with the probability $b$ and that the excluded defector earns nothing. For simplicity, we conservatively assume that the total sanctioning cost for an excluder is given by $g$ times the number of defectors in a group, whatever others do.
We focus on perfect exclusion with $b = 1$: exclusion never fails. Even under this condition, we can analyse the nature of social exclusion considered for cooperation. Indeed, we formalize the expected payoffs as follows:

$$P_C = c(r-1) - (1-z)^{n-1}\,\frac{rc}{n}(n-1)\frac{y}{1-z}, \qquad (2.4a)$$

$$P_D = (1-z)^{n-1}\,\frac{rc}{n}(n-1)\frac{x}{1-z} \qquad (2.4b)$$

and

$$P_E = c(r-1) - g(n-1)y. \qquad (2.4c)$$

Equation (2.4c) describes that the excluder can constantly receive the group optimum $c(r-1)$ at the exclusion cost expected as $g(n-1)y$. In equations (2.4a) and (2.4b), $(1-z)^{n-1}$ denotes the probability that we find no excluder among the $(n-1)$ co-players, and if so, $(n-1)y/(1-z)$ and $(n-1)x/(1-z)$ give the expected numbers of the defectors and cooperators, respectively, among the co-players. Hence, the second term of equation (2.4a) specifies an expected benefit that could have occurred without freeriding, and equation (2.4b) describes an expected amount that a defector has nibbled from the group benefit, in the group with no excluder. The expected payoffs for any $b$ are formalized in the electronic supplementary material.

Figure 1 (caption fragment). Here, we specifically assume $b = 0.5$ and $g = 0.03$, which result in an unstable equilibrium R within PD and the segmentation of PC into the stable part $PK_0$ and the unstable part $K_0C$. The interior of the triangle is separated into the basins of attraction of D and $PK_0$. In fact, given the occasional mutation to a defector, the population's state must leave $PK_0$ and then enter the neighbourhood of the unstable segment $K_0C$, because $P_P < P_C$ holds over the interior space. The population eventually converges to D.
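The closed-form payoffs for perfect exclusion can be cross-checked against a direct enumeration of group compositions. The sketch below makes several assumptions: the function names are hypothetical, n = 5, r = 3, c = 1 and the frequencies are illustrative values, and the defector payoff is the expression implied by the verbal description of equation (2.4b). It sums the cooperator's payoff over every possible make-up of the n - 1 co-players and compares the result with equation (2.4a).

```python
from math import comb


def exclusion_payoffs(x, y, z, n, r, c, g):
    """Closed-form expected payoffs (P_C, P_D, P_E) for perfect exclusion (b = 1)."""
    no_excluder = (1 - z) ** (n - 2)   # (1 - z)^(n-1) / (1 - z), safe at z = 1
    p_c = c * (r - 1) - r * c / n * (n - 1) * y * no_excluder
    p_d = r * c / n * (n - 1) * x * no_excluder
    p_e = c * (r - 1) - g * (n - 1) * y
    return p_c, p_d, p_e


def cooperator_payoff_enumerated(x, y, z, n, r, c):
    """P_C obtained by summing over all compositions of the n - 1 co-players."""
    total = 0.0
    for j in range(n):                      # j excluders among the co-players
        for k in range(n - j):              # k defectors among the co-players
            m = n - 1 - j - k               # m cooperators among the co-players
            prob = (comb(n - 1, j) * comb(n - 1 - j, k)
                    * z ** j * y ** k * x ** m)
            if j >= 1:
                payoff = c * (r - 1)        # every defector is excluded
            else:
                payoff = r * c * (m + 1) / n - c   # defectors share the pool
            total += prob * payoff
    return total


n, r, c, g = 5, 3.0, 1.0, 0.03              # n, r, c are assumed values
x, y, z = 0.5, 0.3, 0.2
p_c, p_d, p_e = exclusion_payoffs(x, y, z, n, r, c, g)
assert abs(p_c - cooperator_payoff_enumerated(x, y, z, n, r, c)) < 1e-9
print(p_c, p_d, p_e)
```

The enumeration makes the structure of equation (2.4a) explicit: a cooperator falls short of the group optimum only in groups that happen to contain no excluder.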
First, the dynamics between the excluders and defectors can only exhibit bistability or excluder dominance for $b = 1$ (figure 2a). Considering that $P_D = 0$ holds whatever the fraction of excluders, solving $P_E = 0$ shows that, if the interior equilibrium R exists, it is uniquely determined by

$$z = 1 - \frac{(r-1)c}{(n-1)g}. \qquad (2.5)$$

The point R is unstable. As before, for larger values of $g$, the dynamics between the two strategies are bistable. The smaller the value of $g$, the larger the basin of attraction of the vertex E. In contrast to costly punishment, an excluder population can evolve, irrespective of the initial condition, for sufficiently small values of $g$. When $g$ decreases beyond a threshold value, R exits at the vertex D, and thus the bistable dynamics turn into excluder dominance. Substituting $z = 0$ into equation (2.5), the threshold value is calculated as $g_0 = (r-1)c/(n-1)$. We note that the dynamics exhibit defector dominance, no matter what $g$ is, if $b$ is smaller than $z_0$, which follows from solving $(1-b)^{n-1}\,rc(n-1)/n > c(r-1)$: the unexcluded rare defector is better off than the resident excluders.
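For reference, the location of R and the threshold cost follow in a few lines from the payoffs for perfect exclusion. The sketch below assumes only that, on the edge ED (where x = 0 and y = 1 - z), the defector earns P_D = 0 and the excluder earns P_E = c(r - 1) - g(n - 1)y.

```latex
\begin{align*}
P_E = P_D = 0
  &\iff c(r-1) - g(n-1)(1-z) = 0 \\
  &\iff z = 1 - \frac{(r-1)c}{(n-1)g}, \\[4pt]
0 < z < 1
  &\iff g > \frac{(r-1)c}{n-1} =: g_0 .
\end{align*}
```

For g below g_0 there is no interior equilibrium, and P_E stays positive along the whole edge, which is the excluder dominance described above.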
Next, consider all three strategies (figure 2b). Solving $P_C = P_D$ results in

$$z = 1 - \left[\frac{n(r-1)}{r(n-1)}\right]^{1/(n-1)} =: z_0. \qquad (2.6)$$

By the assumption $r < n$, we have $0 < z_0 < 1$. Let us denote by $K_0$ the point at which the line $z = z_0$ connects to the edge EC ($x + z = 1$). This edge consists of fixed points, each of which corresponds to a mixed state of the excluders and cooperators. The fixed points on the segment $EK_0$ ($z > z_0$) are stable, and those on the segment $K_0C$ are unstable.
Similarly, solving $P_E = P_C$ gives

$$z = 1 - \left(\frac{ng}{rc}\right)^{1/(n-2)} =: z_1. \qquad (2.7)$$

We denote by $K_1$ the point at which the line $z = z_1$ connects to EC. These two lines are parallel, and thus there is no generic interior equilibrium. Importantly, the time derivative of $z/x$ is positive in the interior region with $z < z_1$. Therefore, the dynamics around the segment $K_1K_0$ are found to be the opposite of those under costly punishment if $z_1 > z_0$ (otherwise, $K_1K_0$ is unstable against rare defectors). In this case, when rare defectors are introduced, the excluders gradually rise along $K_1K_0$, yet fall along the segment $EK_1$, with each elimination of the defectors. Consequently, with such small perturbations, the population can remain attracted to the vicinity of $K_1$, not converging to D. Moreover, if $g < g_0$, the excluders dominate the defectors, and thus all interior trajectories converge to the segment $EK_0$, which appears globally stable (figure 2b). This result remains robust for the intermediate exclusion probability (figure 3). See the electronic supplementary material, figures S2 and S3 for individual-based simulations.
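The thresholds that organise the dynamics under perfect exclusion can be computed directly. The sketch below is illustrative only: the function name is hypothetical, and the closed forms for z_0 and z_1 are the ones obtained by solving P_C = P_D and P_E = P_C with the payoffs for b = 1. The values n = 5, r = 3, c = 1 are assumed, and n > 2 is required for z_1.

```python
def exclusion_thresholds(n, r, c, g):
    """Return (z0, z1, g0) for perfect exclusion (b = 1), assuming n > 2."""
    z0 = 1 - (n * (r - 1) / (r * (n - 1))) ** (1 / (n - 1))   # from P_C = P_D
    z1 = 1 - (n * g / (r * c)) ** (1 / (n - 2))               # from P_E = P_C
    g0 = (r - 1) * c / (n - 1)                                # R reaches vertex D
    return z0, z1, g0


n, r, c = 5, 3.0, 1.0                      # assumed values, with 1 < r < n
for g in (0.03, 0.55):
    z0, z1, g0 = exclusion_thresholds(n, r, c, g)
    excluders_dominate = g < g0            # no interior equilibrium on ED
    k1_above_k0 = z1 > z0                  # excluders rise along K_1 K_0
    print(g, round(z0, 4), round(z1, 4), g0, excluders_dominate, k1_above_k0)
```

With these assumed parameters, the low-cost case g = 0.03 satisfies both g < g_0 and z_1 > z_0, the combination under which the text reports a globally stable cooperative segment with the population held near $K_1$.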

Discussion
Our results regarding social exclusion show that it can be a powerful incentive and appears in stark contrast to costly punishment. What is the logic behind this outcome? First, it is a fact that the exclusion of defectors can decrease the number of beneficiaries, especially when it does not affect the contributions, thereby increasing the share of the group benefit. Therefore, in a mixed group of excluders and defectors, the excluder's net payoff can become higher than the excluded defector's payoff, which is nothing, especially if the cost to exclude is sufficiently low. If social exclusion is capable of 100 per cent rejection at a cheap cost, it can thus emerge in a sea of defectors and dominate them. In our model, self-serving punishment can emerge even when freeriding is initially prevalent by allowing high net benefits from the self-serving action.

Figure 2 (caption fragment). In the presence of second-order freeriders. The triangle is as in figure 1b, except that $z$ denotes the excluder frequency and the vertex E corresponds to its homogeneous state. Similarly, the edge EC consists of a continuum of equilibria. Here, we specifically assume $b = 1$ and $g = 0.03$. EC is separated into stable and unstable segments. The coloured area in the interior of the triangle is the region in which $P_E > P_C$ holds. In fact, given the occasional mutation to a defector, the population's state must converge to the vicinity of the point $K_1$, because the advantage of the excluders over the cooperators becomes broken when the population's state goes up beyond $K_1$.
Moreover, we find that an increase in the fraction of excluders produces a higher probability of an additional increase in the excluder's payoff. This effect can yield the well-known Simpson's paradox [41]: the excluders can obtain a higher average payoff than the cooperators, despite the fact that the cooperators always do better than the excluders for any mixed group of the cooperators, defectors, and excluders. Hence, in the presence of defectors, the replicator dynamics often favour the excluders at the expense of the cooperators. Significantly, if a player may occasionally mutate to a defector, social exclusion is more likely than costly punishment to sustain a cooperative state in which all contribute. In our model, a globally stable, cooperative regime can be sustained when solving the second-order freerider problem by allowing mutation to freeriders.
Sanctioning the second-order freeriders has also often been considered for preventing their proliferation [3,29,34,36], although such second-order sanction appears rare in experimental settings [42]. And, allowing for our simple model, it is obvious that in the presence of defectors and cooperators, a second-order punisher that also punishes the cooperators is worse off than the existing punisher, and thus, does not affect defector dominance as in our main model. However, given that excluding more co-players can cause an additional increase in the share of the group benefit, it is worth exploring whether the second-order excluder that also excludes the cooperators is more powerful than the excluder. Interestingly, our preliminary individual-based investigation often finds that second-order excluders are undermined by the excluders and cooperators, which forms a stable coexistence (see the electronic supplementary material, figure S4): second-order exclusion can be redundant.
A fundamental assumption of the model is that defection can be detected with no or little cost. This assumption appears most applicable to local public goods and team production settings in which the co-worker's contribution can be easily monitored. However, if the monitoring of co-players for defection imposes a certain cost on the excluders, the cooperators dominate the excluders, and the exclusion-based full cooperation is no longer stable. A typical example is found in a potluck party that will often rotate, so that every member takes charge of the party by rotation. This rotation system can promote the equal sharing of the hosting cost; otherwise, no one would take turns playing host. Another example is given by studies on coastal fisheries management. In a laboratory experiment using young fishers in a fishing community, it was found that the possibility of ostracism can decrease overfishing in a common-pool resource setting [43]. Another field study has also observed that a profit-sharing local fishing group, in which mutual monitoring and peer pressure are common, works efficiently [44]. In the latter case, shunning profitable collective actions (e.g. search of promising spots and development of fishing techniques) could be a credible sanction on defective behaviours. Indeed, empirical evidence suggests that the profit sharing observed was primarily considered to make the various collective actions self-enforcing: that is, to avoid the tragedy of the commons [44].
We assessed by extensive numerical investigations the robustness of our results with respect to the following variants (see the electronic supplementary material, figures S5 and S6). First, we considered a different group size $n$ [3,45]. In costly punishment, the stable segment $PK_0$ expands with $n$, yet our main results were unaffected: with small perturbations, the population eventually converges to a non-cooperative state in which all freeride. In social exclusion, our results remain qualitatively robust with smaller and larger sizes ($n = 4$ and $n = 10$), but the limit exclusion cost $g$ becomes more restricted as $n$ increases. Next, we considered a situation in which a punisher or excluder can choose the number of defectors they sanction. For simplicity, here we assume that each of them sanctions only one [22,46], who is selected randomly from all defectors in the group. Our results remain unaffected, except that social exclusion becomes incapable of emerging in a defector population, in which the payoff of a rare excluder is only given by $rc/(n-1) - c - g < 0$. To bring forth the possibility of an emergence, a rare excluder is required to exclude more than $n - rc/(c+g)$ defectors. We have to note that the model on social exclusion studied in this paper has a considerable limitation: only the self-serving aspect of social exclusion is included in the model. In our model, an excluder can directly gain an additional benefit by excluding defectors from a game, since the number of exploiters in the game will reduce by the exclusion. In real life, however, the self-serving function does not seem to be the only mechanism of social exclusion. There is in fact an experimental result that indicates the existence of social exclusion without a self-serving feature [47]. In the experiment, social exclusion is shown to still work even when there is a negative (short-term) effect on payoffs of excluders. It was not yet possible to overcome the complications raised by this aspect of social exclusion.

Figure 3. Effects of intermediate social exclusion in the presence of second-order freeriders. The parameters and triangles are as in figure 1, except that $b = 0.5$ and $g = 0.03$ (a), 0.13 (b), 0.18 (c), or 0.28 (d). EC is separated into stable and unstable segments. The coloured area is the interior region in which $P_E > P_C$ holds. (a) The dynamics of ED are unidirectional to E. All interior trajectories converge onto the stable segment $EK_0$. Moreover, occasionally mutating to a defector leads to upgrading E to a global attractor. (b-d) An unstable equilibrium R appears on ED. The interior space is separated into the basins of attraction of D and $EK_0$. R is a saddle (b) or source (c,d). In (c) especially, the interior space has a saddle point Q. Given the mutant defectors, the population's state around $EK_0$ will gradually move to $K_1$ (b,c), or to the unstable segment $K_0C$ (d). The last case is followed by a convergence towards D. Legend: stable equilibria; unstable equilibria; outcomes; alternative outcomes.
Our results spur new questions about earlier studies on the evolution of cooperation with punishment. A fascinating extension is to the social structures through which individuals interact. To date, a large body of work on cooperation has looked at how costly punishment can propagate throughout a social network [48-50]: for example, the interplay of costly punishment and reputation can promote cooperation [51]; strict-and-severe punishment and cooperation can jointly evolve with continuously varying strategies [52]; and evolution can favour anti-social punishment that targets cooperators [53]. Our results show that social exclusion as considered here is simple, yet extremely powerful. That is, even a straightforward application of it to previous studies can greatly help us understand how humans and non-humans have been incentivized to exclude freeriders. It is also worth exploring the idea that a mix of these different types of punishment (for instance, monetary penalties and licence suspension for traffic violators) could more effectively maintain a stable social structure of cooperation than each type in isolation. A fine is often applied flexibly and mainly on material terms, whereas social exclusion can also cause an unexpected loss of standing in the community [32].
To resist the exclusion, it is likely that conditional cooperators capable of detecting ostracism [8] evolve. This would then raise the overall cost of exclusion to the excluders, because freeriders become more difficult to find and there are fewer opportunities to exclude them. This situation can then drive an arms race between the exclusion technique and the exclusion detection system. An extensive investigation of the joint evolution of these systems is left for future work.