Philosophical Transactions of the Royal Society B: Biological Sciences

The adaptive value of probability distortion and risk-seeking in macaques’ decision-making

A. Nioche
Department of Communications and Networking, Aalto University, Espoo, Finland

N. P. Rougier
Inria Bordeaux Sud-Ouest, 33405 Talence, France
Institut des Maladies Neurodégénératives, Université de Bordeaux, 33000 Bordeaux, France
Institut des Maladies Neurodégénératives, CNRS, UMR 5293, 33000 Bordeaux, France
LaBRI, Université de Bordeaux, INP, CNRS, UMR 5800, 33405 Talence, France

M. Deffains
Institut des Maladies Neurodégénératives, Université de Bordeaux, 33000 Bordeaux, France
Institut des Maladies Neurodégénératives, CNRS, UMR 5293, 33000 Bordeaux, France

S. Bourgeois-Gironde
Laboratoire d’Economie Mathématique et de Microéconomie Appliquée, Université Panthéon-Assas, 75006 Paris, France
Institut Jean Nicod, Département d’Etudes Cognitives, ENS, EHESS, PSL Research University, 75005 Paris, France
Institut Jean Nicod, CNRS, UMR 8129, 75005 Paris, France

S. Ballesta
Laboratoire de Neurosciences Cognitives et Adaptatives, UMR 7364, 67000 Strasbourg, France
Centre de Primatologie de l’Université de Strasbourg, 67207 Niederhausbergen, France

T. Boraud
Institut des Maladies Neurodégénératives, Université de Bordeaux, 33000 Bordeaux, France
Institut des Maladies Neurodégénératives, CNRS, UMR 5293, 33000 Bordeaux, France
Centre Expert Parkinson, CHU Bordeaux, 33000 Bordeaux, France
[email protected]

    Abstract

    In humans, the attitude toward risk is not neutral and differs between bets involving gains and bets involving losses. The existence and prevalence of these decision features in non-human primates are unclear. In addition, only a few studies have tried to simulate the evolution of agents based on their attitude toward risk. Therefore, we still do not know to what extent Prospect theory’s claims are evolutionarily rooted. To shed light on this issue, we collected data from nine macaques that performed bets involving gains or losses. We confirmed that their overall behaviour is coherent with Prospect theory’s claims. In parallel, we used a genetic algorithm to simulate the evolution of a population of agents across several generations. We showed that the algorithm progressively selects agents that exhibit risk-seeking and have an inverted S-shaped distorted perception of probability. We compared these two results and found that monkeys’ attitude toward risk is only congruent with the simulation when they are facing losses. This result is consistent with the idea that gambling in the loss domain is analogous to deciding in a context of life-threatening challenges where a certain level of risk-seeking behaviour and probability distortion may be adaptive.

    This article is part of the theme issue ‘Existence and prevalence of economic behaviours among non-human primates’.

    1. Introduction

    Making decisions with uncertain outcomes involves solving a trade-off. A decision-maker’s attitude towards risk depends on his/her capacity to value quantities and perceive probabilities. Prospect theory offers a framework to define this attitude toward risk that relies on two key concepts: (i) Choices are performed according to a reference point with respect to which expected gains and losses are contrasted; the evaluation of quantities is captured by the shape of the utility function, and reference-dependence entails risk aversion in the domain of gains (concave utility function) and risk-seeking in the domain of losses (convex utility function). (ii) The perception of the probabilities of the outcomes, described by the shape of the probability weighting function, is nonlinear.

    An increasing number of studies have tackled the biological relevance of Prospect theory’s claims [1–7]. In the domain of gains, risk aversion has been reported in several taxa, including rodents [8], birds [9], insects [10] and plants [11]. In apes and monkeys, a variable level of risk-seeking behaviour has been observed [12] and only a few studies [13,14] have reported risk aversion (see [15], in this same theme issue, for an exhaustive review of the literature on the rhesus monkey (Macaca mulatta)). Attitude toward risk in losses has seldom been reported in independent studies [15]. Prospect theory’s claim of an asymmetry between risk aversion for gains and risk-seeking for losses has itself been assessed in only a few studies [15]. An asymmetry similar to the one observed in humans has been reported in capuchins [16] and in rhesus macaques [17], but other studies have not found any asymmetry for the latter [15]. Interestingly, this absence of asymmetry has also been reported in rodents [8]. Overall, this observed variability in primates’ attitudes toward risk can be partly explained by contextual factors or sampling effects [18–21].

    Concerning probability distortion, the picture seems clearer. Rhesus macaques and rats, for instance, perceive rare events as more probable, and frequent events as less probable, than they actually are [8,17,22,23]. This consistency across species, despite individual variability, suggests that probability distortion may be anchored at a basic neurobiological level [24,25], but more data need to be collected.

    The patchy nature of the data collected up to now, concerning the shape of both the utility and probability functions in primates, calls for more comprehensive studies that assess the whole span of Prospect theory in larger populations in order to assess both intraindividual and interindividual variability, in more ecological conditions.

    In addition, little is known about the adaptive value of the shape of the probability weighting function and the asymmetry of the utility function [26].

    The goal of this study is therefore threefold: (i) to confirm the data previously collected with two rhesus macaques [17] using a larger number of animals, in more ecologically valid conditions (i.e. semi-free-ranging Tonkean macaques exposed to autonomous learning and testing devices); (ii) to address the adaptive value of the nonlinearity of the utility and probability weighting function using a genetic algorithm-based simulation; and (iii) to compare the above two findings.

    2. Material and methods

    (a) Subjects

    At the University of Strasbourg, we collected data on one social group of Tonkean macaques (Macaca tonkeana), housed at the Centre de Primatologie (CdP) of the University of Strasbourg. Animals lived in semi-free-ranging conditions in a wooded park of 3788 m² with permanent access to an indoor–outdoor shelter (2.5 × 7.5 × 4 m). The group included 23 individuals with an even sex ratio among adults, which is comparable to the composition of wild groups [27]. All subjects have research experience in cognitive testing using touchscreens. Among the 23 individuals, 14 were involved in the current study; however, only the seven that were sufficiently trained and performed our economic task without a significant side bias (one side chosen in less than 80% of trials) are included in this report. The species Macaca tonkeana is a member of a group of closely related Sulawesi macaque species, all living in multi-male and multi-female egalitarian and highly tolerant societies [28,29]. Indeed, compared with other species of macaques, Sulawesi macaques' social interactions are more complex and more influenced by friendships than by dominance and kinship [29–33]. However, the non-social cognition skills of Tonkean macaques seem to be comparable to those observed in other species of macaques [34]. Unfortunately, owing to the demographic heterogeneity and small size of the experimental population for each species (see electronic supplementary material, table S1), the comparison of the economic decisions of Tonkean macaques with those of rhesus monkeys is beyond the scope of the present study. However, we believe that the experimental methods reported here are particularly well suited to the future study of potential variations in economic decision-making in different species of primates. Water was provided ad libitum and monkeys were fed with commercial primate pellets twice a day and received fresh fruit and vegetables once a week.

    At the Institut des Maladies Neurodégénératives, Bordeaux, France (IMN), the study was performed on two female rhesus macaques: Hav (born in 2012) and Gla (born in 2011). Animals were housed in the animal facility of the IMN under standard conditions (a 12 h light/dark cycle with the light on from 7.00 to 19.00; humidity at 60%, temperature 22 ± 2°C). During the time of the experiment, animals had controlled access to water 5 days per week. They were fed with commercial primate pellets once a day, and received fresh fruit and vegetables once a week. The nine animals (five males and four females) that composed the final experimental cohort weighed 10 ± 3.1 kg and were 6.8 ± 2 years old.

    (b) Data collection

    At the University of Strasbourg, data were collected using four Machines for Automated Learning and Testing (MALTs), which were directly accessible to the monkeys from their living environment (figure 1a,b). Several cognitive tasks were available to the macaques at the MALT, presented via a touchscreen interface [35]. Each MALT was accessible freely at all times, except for 2 h cleaning and refill sessions, at least once a week. MALTs allow automatic identification of each subject using a radio-frequency identification (RFID) dual-detection system [36]. For that purpose, subjects were all equipped with two RFID microchips (UNO MICRO ID/12, ISO Transponder 2.12 × 12 mm), injected into each forearm during the macaques’ veterinary health check under appropriate anaesthesia, to individually identify them when using a MALT (figure 1b,c). At the IMN, data were collected daily (approx. 1 h per session, approx. 5 sessions per week). During each session, animals were seated in a primate restraint chair located in a dark room equipped with a video monitoring system and faced a touchscreen on which the task was displayed (figure 1d). A resting bar was mounted in the lower part of the chair at waist level to accurately measure arm-reaching movement parameters. The behavioural and video data were stored on a separate computer located outside the room for further analysis.


    Figure 1. (a,b) Four MALTs were set up in a shelter placed inside the macaques’ wooded park and behavioural tasks similar to the ones used in Nioche et al. [17] were presented via a touchscreen interface. Correct trials were rewarded with diluted syrups according to the probability associated with the choice. (c) Screenshot from the control video streaming while a subject was performing a trial. The pattern of each slice represents the number of tokens earned or lost (see Material and methods). (d) Experimental device for rhesus monkeys at the IMN. The monkey sat in a primate restraint chair positioned 20 cm from a touchscreen installed in an electrically isolated dark room. A resting bar was mounted on the lower part of the chair at waist level. A tube positioned directly in front of the monkey’s mouth dispensed small amounts of apple sauce as a reward. Task/video monitoring and data acquisition were performed by a separate computer located outside the dark room. (Online version in colour.)

    Each monkey performed between 11 681 and 51 853 trials. Some semi-free-ranging subjects (Tonkeans) did not perform the task optimally and adopted alternative response strategies. Indeed, five tested individuals chose the target on one side in more than 80% of the cases (in lotteries involving either gains or losses) and were consequently not included in this study. Two other individuals were also not considered because they performed fewer than 2000 trials. Concerning the rhesus monkeys, we recorded overall 54 904 trials, of which 39 853 (72.59%) have already been used for a previous publication [17].

    (c) Task

    The task is adapted from Nioche et al. [17], without any major change and is shown in figure 2.


    Figure 2. (a) The orientation of the parallel lines constituting the pattern indicates a quantity (horizontal lines represent 0; clockwise rotation of one-, two-, three-quarters of 90° represents, respectively, a loss of 1, 2, 3 tokens; counter-clockwise rotation of one-, two-, three-quarters of 90° represents, respectively, a gain of 1, 2, 3 tokens). (b,c) Each lottery is represented by a pie chart, as in Stauffer et al. [22]. Each pie chart is composed of two slices. Each slice encodes one possible outcome of the lottery (x or 0). The arc length of each slice represents the probability of the corresponding outcome (p or 1 − p). The relative positions on the pie chart of each slice are randomly determined at each trial. Panels (b) and (c) represent the task as it was coded at the University of Strasbourg and the IMN of Bordeaux, respectively. (Online version in colour.)

    Lottery: We consider simple lotteries L ∈ ℒ, which can be defined as a tuple (x, p), such that if L = (x, p), it yields the outcome x ∈ ℝ with probability p ∈ [0, 1] and 0 with probability 1 − p. For all trials, the probability of receiving (losing) tokens is drawn from the set p ∈ {0.25, 0.50, 0.75, 1.00}.

    Choice: Each choice has to be made between two lotteries L1 = (x1, p1) and L2 = (x2, p2).

    Rewards: The monkey starts each trial with three tokens and can gain up to three tokens and lose up to three tokens. At the end of the trial, the monkey is given a liquid reward proportional to the tokens earned (between 0 and 6 tokens at the end of each trial). At the University of Strasbourg, monkeys could be rewarded with diluted syrup (1/10; 1 token of reward corresponds to 0.25 ml of liquid reward; figure 2) and at the IMN, with diluted apple sauce (1/3, 1 token corresponds to 0.1 ml of liquid reward).

    (d) Experimental paradigm

    At the beginning of the trial, a gauge with three tokens was displayed (figure 2b,c). In Bordeaux, the monkey had to grasp a grip (i.e. resting bar at waist level) for a short duration that varied randomly from trial to trial in the range 150–300 ms to ensure that the monkey could not anticipate the stimulus display. If the monkey did not hold the grip long enough, the trial was considered to be failed and an error was raised. If an error was raised, the screen turned black, and the monkey had to wait 2000 ms for the beginning of the next trial. If the monkey held the grip for the required amount of time, two circles representing two lotteries appeared on the screen. The monkey had 2000 ms to decide which lottery to choose. If the monkey did not choose within the allotted decision time, an error was raised. Once one lottery was selected by touching the corresponding circle, the other circle disappeared. The monkey had 5000 ms to return its hand to the grip (otherwise an error was raised). Once the monkey returned its hand to the grip, the outcome was determined based on the probabilities shown in the two slices of the chosen circle. The amount of reward (loss) was indicated to the animal by the disappearance of the slice corresponding to the non-occurring output. The gauge was filled (emptied) by the amount earned (forfeited), one token at a time. The time of the filling animation was 1500 ms. The inter-trial interval varied randomly between 150 and 300 ms.

    At the University of Strasbourg, similar experiments were performed using touchscreens only. The experiment started with a central coloured square. Directly (25 ms) after the subject touched this cue, two circles representing two lotteries appeared on the screen. The animal had 15 000 ms to decide which lottery to choose. After making a choice by touching one of the two circles the other circle disappeared. On the remaining circle, the outcome was determined based on the probabilities shown in the two slices of the chosen circle. The gauge filled (emptied) by the amount earned (forfeited), all tokens at the same time. During reward delivery, auditory feedback (bell sound) was played for each token earned by the subject. The gauge and tokens stayed on the screen for 8000 ms plus a 500 ms intertrial interval where the screen was left black.

    (e) Modelling of monkeys’ decisions

    We characterize the monkeys’ decision-making using a model based on Prospect theory [26,37], similar to the one used in Nioche et al. [17].

    (i) Probability weighting function and probability distortion

    Following Prelec [38] the subjective probability perception is defined as:

    $$w(p) = e^{-(-\ln p)^{\alpha}}, \tag{2.1}$$
    with p ∈ (0, 1) the actual probability, and with α ∈ (0, ∞), a free parameter indicating the distortion of the probability perception. We assume that w(0) = 0. For α ∈ (0, 1), the closer α is to zero, the more the small probabilities are overestimated and the large probabilities underestimated. For α = 1, the subjective probabilities are the same as the objective probabilities. For α ∈ (1, ∞), the higher α, the more the small probabilities are underestimated and the large probabilities overestimated.
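    As a quick numerical illustration of equation (2.1), the sketch below evaluates the weighting function at the four probabilities used in the task. The function name `prelec_weight` and the three values of α are illustrative choices, and returning 1 for p = 1 is simply the limit of the formula.

```python
import math

def prelec_weight(p: float, alpha: float) -> float:
    """One-parameter Prelec weighting of equation (2.1), with w(0) = 0."""
    if p <= 0.0:
        return 0.0          # convention stated in the text
    if p >= 1.0:
        return 1.0          # limit of the formula when -ln p reaches 0
    return math.exp(-((-math.log(p)) ** alpha))

# Illustrative values of alpha: below, at and above 1.
for alpha in (0.6, 1.0, 1.4):
    weights = [round(prelec_weight(p, alpha), 3) for p in (0.25, 0.50, 0.75, 1.00)]
    print(f"alpha={alpha}: {weights}")
```

    With α below 1 the weight of p = 0.25 exceeds 0.25 while the weights of p = 0.50 and p = 0.75 fall below their objective values; the curve crosses the diagonal at p = 1/e whatever the value of α.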

    (ii) Utility function and risk aversion

    The utility of a normalized outcome x ∈ [−1, 1] is defined as:

    $$u(x) = \begin{cases} x^{1-\beta} & \text{if } x > 0, \\ -|x|^{1+\beta} & \text{if } x < 0, \\ 0 & \text{otherwise}, \end{cases} \tag{2.2}$$
    with β ∈ ℝ a free parameter describing the risk aversion of the decision-maker [39]. If β is positive, u″ is negative, indicating risk-averse preferences [40]; if β is negative, u″ is positive, indicating risk-seeking preferences; if β is equal to 0, ∀x: u(x) = x and u″(x) = 0, indicating risk-neutral preferences.
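    The link between the sign of β and the attitude toward risk can be checked on a pair of lotteries with equal expected value. The sketch below is only an illustration: the function name `utility`, the values β = ±0.5 and the two lotteries (a sure 0.5 against a 50% chance of 1) are assumptions, and probabilities are left undistorted (α = 1).

```python
def utility(x: float, beta: float) -> float:
    """Utility of a normalized outcome x in [-1, 1], equation (2.2)."""
    if x > 0:
        return x ** (1.0 - beta)
    if x < 0:
        return -(abs(x) ** (1.0 + beta))
    return 0.0

# Sure outcome versus a risky lottery with the same expected value,
# in the gain domain (sign = +1) and in the loss domain (sign = -1).
for beta in (0.5, 0.0, -0.5):
    for sign in (+1, -1):
        sure = 1.00 * utility(sign * 0.5, beta)
        risky = 0.50 * utility(sign * 1.0, beta)
        if sure > risky:
            preferred = "sure"
        elif risky > sure:
            preferred = "risky"
        else:
            preferred = "indifferent"
        print(f"beta={beta:+.1f} sign={sign:+d}: {preferred}")
```

    A positive β favours the sure option in both domains and a negative β favours the risky one, which matches the reading of u″ given above.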

    Subjective expected utility and side bias. The subjective expected utility (SEU) of a lottery L is given by:

    $$\mathrm{SEU}(L = (x, p)) = \begin{cases} w(p)\,u(x) - \gamma & \text{if } \gamma < 0 \text{ and } L \text{ is on the left}, \\ w(p)\,u(x) + \gamma & \text{if } \gamma > 0 \text{ and } L \text{ is on the right}, \\ w(p)\,u(x) & \text{otherwise}, \end{cases} \tag{2.3}$$
    with γ ∈ ℝ, a free parameter describing to what extent the decision-maker is biased towards one side. If γ = 0, the decision-maker is not biased. Otherwise, the higher γ, the more the decision-maker is biased towards the right, the lower γ, the more the decision-maker is biased towards the left.

    (iii) Choice probability and stochasticity

    We also assume that action selection is probabilistic: the option that has the highest subjective utility is chosen only with a probability greater than the other options, and not with certainty. We model this stochasticity with a classic softmax function [41] such that the probability of choosing the lottery Li is given by:

    $$p(\text{choice} = L_i) = \frac{1}{1 + e^{-(\mathrm{SEU}(L_i) - \mathrm{SEU}(L_j))/\lambda}}, \tag{2.4}$$
    with Lj the alternative option, and λ ∈ (0, ∞) a free parameter describing to what extent decision-making is stochastic. The higher λ, the more the decision-making is stochastic.
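    Equations (2.3) and (2.4) can be combined in a few lines. In the sketch below, `side_bonus` returns only the additive side-bias term of equation (2.3) and `p_choose` is the two-option softmax of equation (2.4); both names, the string encoding of the sides and the numerical values are illustrative assumptions.

```python
import math

def side_bonus(gamma: float, side: str) -> float:
    """Additive side-bias term of equation (2.3) for a lottery shown on `side`."""
    if gamma < 0 and side == "left":
        return -gamma       # negative gamma favours the left option
    if gamma > 0 and side == "right":
        return gamma        # positive gamma favours the right option
    return 0.0

def p_choose(seu_i: float, seu_j: float, lam: float) -> float:
    """Equation (2.4): probability of choosing lottery i over lottery j."""
    return 1.0 / (1.0 + math.exp(-(seu_i - seu_j) / lam))

# Two lotteries with the same unbiased value w(p)u(x) = 0.30 (made-up number):
# only the side bias separates them.
gamma = 0.05
seu_left = 0.30 + side_bonus(gamma, "left")
seu_right = 0.30 + side_bonus(gamma, "right")
for lam in (0.05, 0.5):
    print(lam, round(p_choose(seu_right, seu_left, lam), 3))
```

    The larger λ is, the closer the probability stays to 0.5 for the same difference in SEU, which is the sense in which a high λ means more stochastic choices.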

    (f) Data analysis

    We will consider separately the choices involving only potential gains (xi > 0 for i ∈ {1, 2}) and only potential losses (xi < 0 for i ∈ {1, 2}). Furthermore, we will distinguish two groups of choices.

    Group 1. There is a better response regardless of the risk attitude of the decision-maker: there is i, j ∈ {1, 2} s.t. pi ≥ pj and xi > xj, or pi > pj and xi ≥ xj (30 different pairs of lotteries for gains, 30 for losses ignoring the order/side of presentation; 60 otherwise).

    Group 2. A trade-off between risk and potential gain/loss has to be made: there is i, j ∈ {1, 2} s.t. pi > pj and xi < xj (18 different pairs of lotteries for gains, 18 for losses ignoring the order/side of presentation—36 otherwise).

    The choices from Group 1 are used as control and choices from Group 2 to assess attitude towards risk.

    Control 1: performance assessment. Lottery pairs with a better response (Group 1) are used to assess the monkeys’ performance. We consider specifically the cases where there exist i, j ∈ {1, 2} s.t.:

    [Same p] pi = pj but xi > xj in order to assess the discrimination of the quantities (12 different pairs of lotteries for gains, 12 for losses ignoring the order/side of presentation—24 otherwise);

    [Same x] xi = xj but pi > pj in order to assess the discrimination of the probabilities (18 different pairs of lotteries for gains, 18 for losses ignoring the order/side of presentation—36 otherwise).
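    The pair counts quoted for the [Same p] and [Same x] controls and for the trade-off pairs of Group 2 can be recovered by enumeration. The sketch below covers the gain domain only and ignores the side of presentation; the variable names are illustrative.

```python
from itertools import combinations, product

amounts = (1, 2, 3)                       # tokens that can be gained
probs = (0.25, 0.50, 0.75, 1.00)          # probabilities used in the task
lotteries = list(product(amounts, probs)) # 12 lotteries (x, p)
pairs = list(combinations(lotteries, 2))  # unordered pairs

def is_tradeoff(a, b):
    """Group 2: one lottery is more likely, the other offers more."""
    (xa, pa), (xb, pb) = a, b
    return (pa > pb and xa < xb) or (pb > pa and xb < xa)

same_p = [(a, b) for a, b in pairs if a[1] == b[1] and a[0] != b[0]]
same_x = [(a, b) for a, b in pairs if a[0] == b[0] and a[1] != b[1]]
tradeoff = [(a, b) for a, b in pairs if is_tradeoff(a, b)]

print(len(same_p), len(same_x), len(tradeoff))  # 12 18 18
```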

    We model the probability of choosing the right option depending on the difference of expected values using an ordinary sigmoid function:

    $$p(\Delta) = \frac{1}{1 + e^{-k(\Delta - x_0)}}, \tag{2.5}$$
    with Δ the difference of expected values between the lottery on the right and the lottery on the left, k the slope parameter and x0 the intercept parameter. We use a Levenberg–Marquardt algorithm (SciPy library) to optimize the parameters.
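    A fit of this kind could look as follows. This is a minimal sketch and not the authors' analysis script: it assumes that the fit goes through `scipy.optimize.curve_fit` with `method="lm"` (one of several ways to reach a Levenberg–Marquardt solver in SciPy), and it uses synthetic, noise-free choice frequencies generated from made-up values of k and x0.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(delta, k, x0):
    """Two-parameter sigmoid of equation (2.5)."""
    return 1.0 / (1.0 + np.exp(-k * (delta - x0)))

# Synthetic data: differences of expected value and the matching choice
# frequencies, generated from k = 3.0 and x0 = 0.2 (illustrative values).
delta = np.linspace(-2.0, 2.0, 9)
freq = sigmoid(delta, 3.0, 0.2)

# method="lm" selects the Levenberg-Marquardt solver (unbounded problems only).
(k_hat, x0_hat), _ = curve_fit(sigmoid, delta, freq, p0=(1.0, 0.0), method="lm")
print(k_hat, x0_hat)
```

    With real data, `freq` would hold the observed proportion of choices for each value of Δ, and the starting point `p0` may need adjusting.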

    Control 2: consideration of the difference between expected values when trading-off between quantity and probability. Results for lottery pairs with a trade-off between quantity and probability (Group 2) are used to check if the frequency with which the riskiest option is chosen is dependent on the difference between the expected values of the safest option and of the riskiest option.

    We model this relation using an ordinary sigmoid function:

    $$p(\Delta) = \frac{1}{1 + e^{-k(\Delta - x_0)}}, \tag{2.6}$$
    with Δ the difference of expected values between the riskiest lottery and the safest lottery, k the slope parameter and x0 the intercept parameter. We use a Levenberg–Marquardt algorithm (SciPy library) to optimize the parameters.

    (i) Assessment of the attitude towards risk

    The choices from Group 2 are used to characterize the attitude toward risk. To this end, separately for each monkey and for choices involving potential losses and for choices involving potential gains, we optimize the free parameters of our model (θ = {α, β, γ, λ}), using an SLSQP optimization algorithm [42] (SciPy library), in order to maximize the likelihood of the data given the model. More precisely, the log-likelihood is estimated as follows:

    $$\ln \mathcal{L}(O \mid \theta) = \sum_{i \in O} \ln p(o_i \mid \theta) \tag{2.7}$$
    for θ = {α, β, γ, λ} a set of parameter values, and with O the set of observations under consideration, p the probability according to our decision-making model of making the choice o_i ∈ O given θ.
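    The log-likelihood of equation (2.7) can be written directly from equations (2.1) to (2.4). The sketch below is one possible reading: the trial layout `(x_left, p_left, x_right, p_right, chose_right)`, the normalization of tokens by 3, the clamp on probabilities (a numerical guard that is not part of the model) and the example trials are all assumptions.

```python
import math

def w(p, alpha):
    """Probability weighting, equation (2.1)."""
    return 0.0 if p <= 0 else math.exp(-(abs(math.log(p)) ** alpha))

def u(x, beta):
    """Utility, equation (2.2)."""
    if x > 0:
        return x ** (1.0 - beta)
    if x < 0:
        return -(abs(x) ** (1.0 + beta))
    return 0.0

def log_likelihood(trials, alpha, beta, gamma, lam):
    """Equation (2.7) for trials (x_left, p_left, x_right, p_right, chose_right)."""
    total = 0.0
    for x_l, p_l, x_r, p_r, chose_right in trials:
        # Equation (2.3): the side bias is added to the favoured side only.
        seu_l = w(p_l, alpha) * u(x_l, beta) + (-gamma if gamma < 0 else 0.0)
        seu_r = w(p_r, alpha) * u(x_r, beta) + (gamma if gamma > 0 else 0.0)
        # Equation (2.4): probability of choosing the right-hand lottery.
        p_right = 1.0 / (1.0 + math.exp(-(seu_r - seu_l) / lam))
        p_obs = p_right if chose_right else 1.0 - p_right
        total += math.log(min(max(p_obs, 1e-12), 1.0 - 1e-12))
    return total

# Made-up trials in the gain domain, outcomes normalized by 3 tokens.
trials = [
    (1 / 3, 1.00, 3 / 3, 0.25, False),
    (2 / 3, 0.75, 3 / 3, 0.50, False),
    (1 / 3, 0.50, 2 / 3, 0.25, True),
]
print(log_likelihood(trials, alpha=0.8, beta=0.3, gamma=0.0, lam=0.1))
```

    The negative of this function is what would be handed to a minimizer such as `scipy.optimize.minimize` with `method="SLSQP"`, together with bounds on the four parameters.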

    To assess the stability of the fit, we bin the trials in chunks of 200 trials by chronological order for each monkey, and optimize separately for each chunk the free parameters of the model.
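    The binning step amounts to slicing the chronologically ordered trials. In the sketch below, keeping the trailing chunk that holds fewer than 200 trials is an assumption, since the text does not say how incomplete chunks are handled; `chunk_trials` is an illustrative name.

```python
def chunk_trials(trials, size=200):
    """Split chronologically ordered trials into consecutive chunks of `size`.

    Assumption: a final, shorter chunk is kept rather than discarded.
    """
    return [trials[start:start + size] for start in range(0, len(trials), size)]

# 450 placeholder trials give two full chunks and one partial chunk.
chunks = chunk_trials(list(range(450)))
print([len(c) for c in chunks])  # [200, 200, 50]
```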

    (g) Simulations

    The simulation is based on a genetic algorithm [43]. We consider a set $\mathcal{L} = \{l_1, \ldots, l_{N_L}\}$ of lotteries where each lottery $l_i$ is described by a probability $p_i = i/N_L$ of reward and an associated reward $x_i = 1/p_i$ such that the expected value of each lottery is equal to 1. We consider a set $A = \{a_1, \ldots, a_{N_A}\}$ of agents where each agent $a_k$ is fully described by a couple of parameters $(\alpha_k, \beta_k)$. When asked to choose between lotteries $l_i$ and $l_j$, the choice of agent $a_k$ is:

    $$\text{choice}(a_k, l_i, l_j) = \begin{cases} i & \text{if } w_{\alpha_k}(p_i)\,u_{\beta_k}(x_i) > w_{\alpha_k}(p_j)\,u_{\beta_k}(x_j), \\ j & \text{otherwise}, \end{cases} \tag{2.8}$$
    with $w_\alpha$ : [0, 1] → [0, 1] the probability weighting function w described above with a value of distortion parameter equal to α, and $u_\beta$ : ℝ → ℝ the utility function u described above with risk aversion parameter equal to β.

    The initial population $A^0$ is built from a set of parameters α and β uniformly drawn from [αmin, αmax] × [βmin, βmax] such that we have $a_k^0 = \{\alpha_k, \beta_k\}$.

    At each epoch, each agent completes a set of NT trials. Each trial is composed of two lotteries randomly and uniformly drawn from the set $\mathcal{L}$. Individual gains are computed according to the agent’s choices, and a proportion of the best scorers is selected using a selection rate γ that may vary depending on the simulation (see Results). The next generation is computed by iteratively selecting two random parents among the selected agents and by computing the linear interpolation of their respective parameters (α and β) so as to generate two offspring. More precisely, considering an agent $a_i^n$ and an agent $a_j^n$ at epoch n, we generate two new agents $a_k^{n+1}$ and $a_{k+1}^{n+1}$ using a random and uniform factor λ ∈ [0, ɛ] (ɛ being the mixture rate) such that:

    $$\left.\begin{aligned} a_k^{n+1} &= \left(\lambda\alpha_i^n + (1-\lambda)\alpha_j^n,\; \lambda\beta_i^n + (1-\lambda)\beta_j^n\right) \\ \text{and}\quad a_{k+1}^{n+1} &= \left((1-\lambda)\alpha_j^n + \lambda\alpha_i^n,\; (1-\lambda)\beta_j^n + \lambda\beta_i^n\right). \end{aligned}\right\} \tag{2.9}$$

    The procedure is iterated until we reach a population whose size is the same as the initial population. After this new population has been generated, we apply a mutation of a small proportion of the new agents using a mutation rate δ such that δ × NA agents benefit from a random mutation. This mutation consists of replacing the agents’ set of parameters by values randomly drawn from [αmin, αmax] × [βmin, βmax]. The whole procedure is iterated for a fixed number of epochs NE.
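    The whole procedure can be sketched in plain Python. This is a scaled-down illustration under several assumptions, not the code used for the reported simulations: gains are taken to be the realized payoffs (each lottery pays x with probability p), the two lotteries of a trial are drawn with replacement, the second offspring uses the two weights of equation (2.9) exchanged between the parents, and every function name is hypothetical. The default sizes are far below those of table 1, so the output is not expected to reproduce the figures.

```python
import math
import random

def w(p, alpha):
    """Probability weighting, equation (2.1)."""
    return math.exp(-(abs(math.log(p)) ** alpha))

def u(x, beta):
    """Utility of a positive reward, equation (2.2)."""
    return x ** (1.0 - beta)

def play(agent, lotteries, n_trials, rng):
    """Realized gain of one agent over n_trials random pairs of lotteries."""
    alpha, beta = agent
    gain = 0.0
    for _ in range(n_trials):
        (xi, pi), (xj, pj) = rng.choice(lotteries), rng.choice(lotteries)
        # Equation (2.8): choose i only if its subjective value is strictly larger.
        if w(pi, alpha) * u(xi, beta) > w(pj, alpha) * u(xj, beta):
            x, p = xi, pi
        else:
            x, p = xj, pj
        if rng.random() < p:              # assumption: realized, not expected, payoff
            gain += x
    return gain

def next_generation(pop, scores, bounds, sel, mut, mix, rng):
    """Selection, interpolation of parents and mutation, in that order."""
    n = len(pop)
    ranked = sorted(range(n), key=lambda k: scores[k], reverse=True)
    parents = [pop[k] for k in ranked[:max(2, int(sel * n))]]
    children = []
    while len(children) < n:
        (ai, bi), (aj, bj) = rng.sample(parents, 2)
        lam = rng.uniform(0.0, mix)
        children.append((lam * ai + (1 - lam) * aj, lam * bi + (1 - lam) * bj))
        # Assumption: the second child exchanges the weights of the two parents.
        children.append(((1 - lam) * ai + lam * aj, (1 - lam) * bi + lam * bj))
    children = children[:n]
    (a_min, a_max), (b_min, b_max) = bounds
    for k in rng.sample(range(n), int(mut * n)):
        children[k] = (rng.uniform(a_min, a_max), rng.uniform(b_min, b_max))
    return children

def simulate(n_agents=100, n_lotteries=100, n_trials=50, n_epochs=20, seed=123):
    rng = random.Random(seed)
    bounds = ((0.25, 1.75), (-0.80, 0.80))    # ranges of alpha and beta (table 1)
    # Lottery i pays x_i = 1 / p_i with probability p_i = i / N_L.
    lotteries = [(n_lotteries / i, i / n_lotteries) for i in range(1, n_lotteries + 1)]
    pop = [(rng.uniform(*bounds[0]), rng.uniform(*bounds[1])) for _ in range(n_agents)]
    for _ in range(n_epochs):
        scores = [play(agent, lotteries, n_trials, rng) for agent in pop]
        pop = next_generation(pop, scores, bounds, sel=0.20, mut=0.02, mix=0.25, rng=rng)
    return pop

final = simulate()
print(sum(a for a, _ in final) / len(final), sum(b for _, b in final) / len(final))
```

    Because every lottery has an expected value of 1, selection in this sketch can only act through the spread of realized payoffs, so the mean parameters drift rather than converge at such small sizes.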

    (h) Statistics

    To compare measures, we used a Wilcoxon signed-rank (for paired data) and rank-sum test (for independent data) with p < 0.05. We also consider that monkeys’ behaviours were significantly biased based on the 95% confidence intervals given by the best-fit parameter value of the model.
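    With nine animals, a paired signed-rank test has a limited set of attainable p-values. The sketch below enumerates the exact null distribution for n untied, non-zero differences; it assumes the exact two-sided variant of the test (the text does not say whether an exact or an approximate p-value was computed), and `exact_signed_rank_p` is an illustrative name.

```python
from itertools import product

def exact_signed_rank_p(n: int, w_minus: int) -> float:
    """Exact two-sided p-value of the signed-rank test for n untied, non-zero pairs.

    w_minus is the sum of the ranks attached to the negative differences.
    """
    ranks = range(1, n + 1)
    total = n * (n + 1) // 2
    w_obs = min(w_minus, total - w_minus)
    # Under the null hypothesis all 2**n sign patterns are equally likely.
    count = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r for r, s in zip(ranks, signs) if s) <= w_obs
    )
    return min(1.0, 2.0 * count / 2 ** n)

print(exact_signed_rank_p(9, 0))  # 0.00390625
```

    Under these assumptions the smallest attainable value is 2/2⁹ ≈ 0.0039, obtained when all nine differences share the same sign.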

    3. Results

    (a) Monkeys’ attitude toward risk

    We analysed the results of nine macaque monkeys that performed a total of 256 976 trials (28 553 ± 12 609 per subject). Each trial consisted of a bet involving either gains or losses. We assessed the performance of the monkeys by evaluating how they considered the difference of expected value between the available options. On average, monkeys were sensitive to the difference of expected value between the two options. The electronic supplementary material provides details about each individual's behaviour (electronic supplementary material, figures S1–S3 and tables S1–S8). Monkeys were sensitive to the difference of expected value between the two options when probabilities were equal but amounts differed (figure 3a,d) for 8/9 individuals in the gain domain and 7/9 in the loss domain (95% confidence intervals of the best-fit parameter value). Monkeys were also sensitive to this difference when amounts were equal but probabilities differed (figure 3b,e) for 9/9 subjects in the gain and the loss domains, and when there was a trade-off between quantity and probability (figure 3c,f) for 9/9 and 6/9 individuals for the gain and the loss domain, respectively.


    Figure 3. Consideration of the difference between expected values (EVs) when probabilities are equal but amounts differ, when amounts are equal but probabilities differ, or when there is a trade-off between quantity and probability. (a,d) Probability of choosing the right option against the difference of EV when probabilities are equal but amounts differ. (b,e) Probability of choosing the right option against the difference of EV when amounts are equal but probabilities differ. (c,f) Probability of choosing the riskiest option against the difference of EV when there is a trade-off between quantity and probability. Blue lines (top row, a–c) are related to choices involving gains, and orange lines (bottom row, d–f) to choices involving losses. Thin lines represent the values of ordinary two-parameter sigmoid functions using the best-fit parameter values of each individual (one line corresponds to one individual). Thick lines represent the average value of these functions. (Online version in colour.)

    Based on Prospect theory [26,37], we characterize the probability weighting function (i.e. subjective perception of probabilities), the utility function (i.e. subjective valuation of the rewards), the stochasticity in choice (i.e. to what extent choices reflect subjective expected utilities), and the side bias for each individual. On average, the best-fit values of the risk-aversion parameter of the macaques’ utility functions are significantly different for gains and losses (Wilcoxon signed-rank test, p = 0.0039). The recovered utility functions are, respectively, concave in the gains domain, indicating risk-averse preferences (Wilcoxon signed-rank test, p = 0.02, figure 4a) and convex in the losses domain, indicating risk-seeking preferences (Wilcoxon signed-rank test, p = 0.0039, figure 4d), reproducing the known asymmetry. Monkeys also overweight small probabilities in both the losses and gains domains (Wilcoxon signed-rank test, both p = 0.0039, figure 4b,e). This distortion is slightly more pronounced for losses than for gains, but the difference does not reach significance (Wilcoxon signed-rank test, p = 0.098). The steepness of the softmax function that fits their choice probabilities given the subjective expected utilities (figure 4c,f) is not significantly different between the gain and the loss domain (Wilcoxon signed-rank test, p = 0.36). The side bias in their decisions is not significantly different between the gain and the loss domain (Wilcoxon signed-rank test, p = 0.91). Electronic supplementary material, figures S4 and S5 provide details about each individual.


    Figure 4. Modelling of monkeys' attitude toward risk. (a,d) Monkeys’ utility function, (b,e) monkeys’ probability weighting function, (c,f) monkeys’ probability to choose the right lottery according to the difference of subjective expected utility (SEU) between right and left lotteries. Blue lines (top row, a–c) are related to choices involving gains, and orange lines (bottom row, d–f) to choices involving losses. Thin lines represent the indicated function using the average best-fit parameter values over the data chunks of one individual, and thick lines represent the indicated function using the mean of the average best-fit parameter values over all the individuals. The mean value (± s.d.) for each parameter for each condition is indicated in the top left corner. (Online version in colour.)

    (b) Evolution of attitude toward risk in a population of artificial agents

    Unless stated otherwise, we use the parameters given in table 1 for all the simulations. The initial and final populations are, respectively, depicted in figure 5a,b and figure 5c,d. The thick black lines indicate the mean probability weighting function (figure 5c) and the mean utility function (figure 5d) over the whole population at the end of the selection process.

    Table 1. Parameters used in all simulations unless stated otherwise.

    parameter name                              value
    number of lotteries NL                      1000
    number of agents NA                         1000
    number of epochs NE                         1000
    number of trials NT                         100
    wα                                          x ↦ e^(−(−log x)^α)
    uβ                                          x ↦ x^(1−β)
    αmin, αmax                                  0.25, +1.75
    βmin, βmax                                  −0.80, +0.80
    γ (selection rate)                          0.20
    δ (mutation rate)                           0.02
    ε (mixture rate)                            0.25
    PRNG seed used for all displayed results    123


    Figure 5. Simulation of 1000 agents playing 100 lotteries over 1000 generations. (a,c) Probability weighting function. (b,d) Utility function. The initial population is generated using parameters (αk, βk) randomly and uniformly drawn from [αmin, αmax] × [βmin, βmax]. The apparent dissymmetry of initial curves comes from the nonlinearity of parameter effects on function shapes. After 1000 epochs where we applied the selection of the best agents, generated new agents from their parameters and introduced a few random variations, the final mean behaviour of agents is shown using a thick black curve. (Online version in colour.)

    Overall, figure 5 reveals that more than 95% of the agents in the final population tend to overestimate low probabilities and underestimate high probabilities (see electronic supplementary material, figure S6c). It also appears that more than 95% of the agents in the final population have a convex utility function, indicating risk-seeking behaviour (see electronic supplementary material, figure S6d).

    The mean gain of a subset of the artificial agents depending on the shapes of their probability weighting and utility functions is depicted in figure 6. The mean gain of the best individuals is sometimes better than the expected value, allowing the selection process to be efficient. As expected, an overestimation of small probabilities and underestimation of high probabilities (α = 0.6) leads to higher gains with only a marginal influence of the utility. When there is no distortion of probabilities (α = 1.0), only the utility function influences the mean gain of the luckiest agents. For an underestimation of small probabilities and overestimation of high probabilities (α = 1.4), the influence of the utility function is negligible.

    Figure 6. Analysis of the mean gain of a population of agents. Each point on the left panel corresponds to the mean gain of a subset (20% in the figure) of a group of NA agents playing NT lotteries. Depending on their probability weighting and utility functions, the mean gain of the best individuals might be better than the expected value. The white line marks the separation at the median score (1.23) of all simulations. On the right, we display the utility function (blue) and the probability weighting function (red) for a few representative points (a–i).

    The exact shape of the final population’s functions is dependent on the selection rate, mutation rate and mixture rate as shown in figure 7. However, the influence of the mutation rate is small, as is the influence of the mixture rate, and for neighbouring values of the selection rate (between 5 and 30%), the shapes of the curves remain similar to the ones shown in figure 5. If the selection rate is very large (45% or higher), by contrast, the shapes of the curves are inverted: the probability weighting function would be S-shaped (instead of inverted S-shaped), indicating an underestimation of low probabilities and an overestimation of high probabilities, and the utility function would be concave, indicating risk-averse preferences. Further analyses of the evolution and influence of parameters on the distribution of the final population are provided in electronic supplementary material, figures S7–S13.

    Figure 7. Parameter sensitivity. The exact shape of the final population’s functions is dependent on the selection rate, mutation rate and mixture rate. However, for a selection rate below 35%, the general pattern remains the same: risk-seeking (convexity of the utility function) and overestimation of the small probabilities and underestimation of the high probabilities (inverted S-shaped probability weighting function).

    (c) Comparison between monkeys’ and agents’ populations

    We compared the observations made in monkeys and the observations made in artificial agents. Figure 8a,b summarizes the results of this analysis.

    Figure 8. Comparison of monkeys’ and agents’ behaviour. (a) Using the result from the monkeys’ data fitting, we can represent the population for gain (blue) and loss (red) showing that the monkeys display a tendency to overestimate small probabilities and underestimate large probabilities (in both gain and loss domain) and that they are risk-averse in the gain domain but risk-seeking in the loss domain. Subjects ‘Hav’ and ‘Gla’ are rhesus monkeys. (b) Initial population and final population of agents. The mean behaviour (black cross) is closer to monkeys’ mean behaviour in the loss domain (red cross) than to monkeys’ mean behaviour in the gain domain (blue cross). Note that the exact final position depends on the selection, mutation and mixture rates (figure 7).

    As the exact characteristics of the final population depend on the selection rate, we report statistical results for a selection parameter included in the set {0.05, 0.1, … , 0.35}, that is, values neighbouring the one used for the results presented in figure 5 (±0.15). The probability weighting function has an inverted S-shape both in populations of monkeys and in selected artificial agents. The values of the probability distortion parameter in the populations of monkeys and of selected artificial agents are not significantly different for gains (figure 8b, Wilcoxon rank-sum, all p > 0.05, common language effect size [44], f = [0.32, 0.58]); this result is less robust for losses (figure 8b, Wilcoxon rank-sum, for γ ∈ [0.05, 0.2], all p > 0.05, f = [0.49, 0.63] and for γ ∈ [0.25, 0.35], all p < 0.05, f = [0.12, 0.29]). On the other hand, the values of the risk-aversion parameter are significantly different for gains (figure 8b, Wilcoxon rank-sum, all p < 0.001, f = [0.91, 1]) and, to a lesser extent, for losses (figure 8b, Wilcoxon rank-sum, all p < 0.1, f = [0.12, 0.32]), although the values observed both for monkeys in the loss domain and for artificial agents lead to risk-seeking preferences (convexity of the utility function).
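
    For readers unfamiliar with the two statistics reported above, the following sketch shows one common way to compute them using only the Python standard library. It is an illustration and not the analysis code behind the reported values: the rank-sum test uses the normal approximation without tie correction, the effect size f is taken to be the share of pairs in which a value from the first sample exceeds a value from the second (ties counted as one half), and the two samples at the end are made-up numbers, not the fitted parameters of the monkeys or agents.

```python
import math


def common_language_effect_size(x, y):
    """Share of (x_i, y_j) pairs with x_i > y_j, counting ties as one half."""
    wins = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    return wins / (len(x) * len(y))


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction)."""
    pooled = sorted(list(x) + list(y))
    # Assign every distinct value the average of the ranks it occupies (ranks start at 1).
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n1, n2 = len(x), len(y)
    w = sum(rank_of[v] for v in x)                 # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2.0                # its expectation under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))         # two-sided p-value
    return z, p


if __name__ == "__main__":
    # HYPOTHETICAL parameter values, for illustration only.
    sample_a = [0.55, 0.62, 0.48, 0.71, 0.66, 0.59, 0.52, 0.80, 0.45]
    sample_b = [0.60, 0.58, 0.65, 0.70, 0.50, 0.63, 0.57, 0.68, 0.61, 0.54]
    z, p = rank_sum_test(sample_a, sample_b)
    f = common_language_effect_size(sample_a, sample_b)
    print("z =", z, "p =", p, "f =", f)
```

    Under this definition, f = 0.5 means that neither sample tends to be larger, while values close to 0 or 1 indicate almost complete separation of the two samples.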

    4. Discussion

    We measured probability distortion and risk aversion in nine macaque monkeys of different ages and genders, belonging to two species, by using a similar task in different experimental conditions. We found that, overall, monkeys showed an inverted S-shape probability distortion pattern. In addition, on average, animals were risk-averse for gains and risk-seeking for losses, confirming an asymmetry of treatment between gains and losses as we previously described [17]. Our conclusion is drawn from a dataset that includes a reasonable number of subjects, and most of them voluntarily performed the task in unconstrained environments, providing ecological validity to our findings. Our results therefore reliably reproduce in two species of non-human primates the classical pattern of Prospect theory found in humans [45]. It is still worth noting that a substantial level of heterogeneity has been reported in the attitude of monkeys and humans towards risk [21] (for more information, see [15], in this special issue). Many confounding factors have been considered as the source of this heterogeneity, such as the amount or the nature of rewards, the kind of behavioural task or even the temporal organization of trials within a given task. Further research is needed to better understand the influence of these factors on primates’ decision-making.

    Efficiency in foraging and reproduction has been optimized by natural selection as individuals that followed sub-optimal strategies lost out to competitors [46,47]. The classical inverted S-shape profile of the probability weighting function has been described in many different species, and it may be a ubiquitous characteristic of decision-making of living organisms [8,22,23]. One can thus speculate that such a decisional strategy may be adaptive. The evolutionary advantages of the nonlinearity of the probability weighting and utility functions remain debated [17]. Our results, which bring together real-life and simulation data, offer new insights into this issue.

    Indeed, to assess if these cognitive biases may confer an evolutionary advantage at the population level, we ran a simulation that allowed agents to freely compete and select the decision strategies that had the best fitness. The results of this simulation show that the final population exhibits an inverted S-shaped probability weighting function and a convex utility function. This indicates that the selected decision-makers are those that have a preference for risk (i.e. the convexity of the utility function), while overestimating the small probabilities and underestimating the high probabilities (i.e. inverted S-shaped probability weighting function). Hence, in a context where the total number of choices is limited, and the expected value is constant between the options, the selection process promotes the lucky gamblers that are biased towards the ‘high-stake’ bets (small probability and large reward). The strategy with the best fitness in our simulation is therefore congruent with the monkeys’ behavioural pattern in the loss domain.

    Gambling in the loss domain can be arguably considered analogous to deciding in a context of life-threatening challenges (e.g. predation avoidance, social challenges). Cognitive adaptation in humans seems to support this hypothesis [48]. For instance, loss cues attracted more attentional resources than gain cues [49,50]. In real life, suboptimal decision-making in the gains domain may thus not lead to consequences as dramatic as in the losses domain. Hence, one can speculate that different levels of selective pressure could have been applied to the biological mechanisms responsible for decision-making under risk in the gain and the loss domains, therefore fostering an asymmetry of treatment.

    Our simulation did not capture this known asymmetry of treatment between gains and losses in primates [26,37]. This was expected because, for agents in the simulation, gains and losses were differentiated only by their polarity. However, the mean decisional strategy adopted by monkeys in the losses domain was the closest to the selected behaviours of agents (both risk-seeking). Several proposals may help to interpret this result. First, satiety is known to influence economic decisions [51–54]. In real life, losses could arguably be infinite (i.e. losing social status, territories or physical integrity, or even facing death), whereas accumulation of goods is usually limited by time and/or space (e.g. limited accumulation of goods or satiety during foraging). These features were not implemented in the simulation and may be good candidates to better explain the known asymmetry of treatment between gains and losses. Second, it has been noted that the emotional state of subjects and/or social context of decisions can influence attitude toward risk and loss aversion [19,55,56]. Our simulations did not consider these other important features of biological agents. Despite these limits, we found that the final population of agents exhibits distorted probability perception and a nonlinear utility function, thus showing that these behavioural strategies can be adaptive in a given context of decision-making. In conclusion, our study shows that integrating simulation and real-life data provides new insights about the evolutionary roots of cognitive biases, therefore reassessing the biological dimension of decision theory.

    Ethics

    Experiments were conducted at the Centre de Primatologie (CdP) de l’Université de Strasbourg (Niederhausbergen, France; LNCA UMR-7364) and Institut des Maladies Neurodégénératives (IMN, CNRS, UMR-5293, University of Bordeaux, Bordeaux). At the University of Strasbourg, experiments were approved by the ethical committee of the CdP, which is authorized to house non-human primates (registration no. B6732636). The research further complied with the EU Directive 2010/63/EU for animal experiments. At the IMN, experimental procedures were performed in accordance with the Council Directive of 2010 (2010/63/EU) of the European Community and the National Institute of Health Guide for the Care and Use of Laboratory Animals. The protocol received agreement from the Ethical Committee for Animal Research CE50 (registration no. C33063268).

    Data accessibility

    The data necessary to reproduce the analyses presented in this article are provided at https://github.com/aureliennioche/EvoProspect.

    Authors' contributions

    A.N., M.D. and S.B. performed the experiments on monkeys; A.N. and N.P.R. performed the simulations; A.N., N.P.R. and S.B. performed the data analysis; A.N., N.P.R., M.D., S.B.-G., S.B. and T.B. designed the study and co-wrote the manuscript.

    Competing interests

    We declare we have no competing interests.

    Funding

    T.B. and M.D. are supported by the CNRS; S.B. by the University of Strasbourg; S.B.-G. by the University of Panthéon-Assas; N.P.R. by Inria and A.N. by the Agence Nationale de la Recherche (ANR-16-CE38-0003), the Ministère de la Recherche et de la Technologie (French Ministry for Research and Technology), Sorbonne Université and Aalto University. The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

    Acknowledgements

    We are grateful to Silabe and the University of Strasbourg for expert animal care of the Tonkean colony and financial support. We thank Hélène Meunier for the initial training of the Tonkean colony on MALT and Adam Rimele for technical support and data management. The rhesus macaques are housed in the Centre Paul Broca Nouvelle Aquitaine, thanks to Labex BRAIN financial support. Hugues Orignac and Tho-Hai N’Guyen take care of them with great skill.

    Footnotes

    One contribution of 17 to a theme issue ‘Existence and prevalence of economic behaviours among non-human primates’.

    These authors have contributed equally to this study.

    Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.5230724.

    Published by the Royal Society. All rights reserved.