Eye pupil signals information gain

In conditions of constant illumination, the eye pupil diameter indexes the modulation of arousal state and responds to a large breadth of cognitive processes, including mental effort, attention, surprise, decision processes, decision biases, value beliefs, uncertainty, volatility, exploitation/exploration trade-off, or learning rate. Here, I propose an information theoretic framework that has the potential to explain the ensemble of these findings as reflecting pupillary response to information processing. In short, updates of the brain’s internal model, quantified formally as the Kullback–Leibler (KL) divergence between prior and posterior beliefs, would be the common denominator to all these instances of pupillary dilation to cognition. I show that stimulus presentation leads to pupillary response that is proportional to the amount of information the stimulus carries about itself and to the quantity of information it provides about other task variables. In the context of decision making, pupil dilation in relation to uncertainty is explained by the wandering of the evidence accumulation process, leading to large summed KL divergences. Finally, pupillary response to mental effort and variations in tonic pupil size are also formalized in terms of information theory. On the basis of this framework, I compare pupillary data from past studies to simple information-theoretic simulations of task designs and show good correspondence with data across studies. The present framework has the potential to unify the large set of results reported on pupillary dilation to cognition and to provide a theory to guide future research.


Surprise and self-information
One of the first cognitive variables that was shown to influence pupillary responses is surprise, defined in information theory as the negative logarithm of the probability of an event. This quantity is also called self-information, because it measures how much information is gained when observing an event. Pupil size has been shown to respond vigorously and robustly to surprise, with dilation in response to events in inverse proportion to their frequency of occurrence in a trial [51,14,52]. The pupil also responds to stimulus disappearance, in inverse proportion to how likely the stimulus is to disappear at that given time [9]. Along the same line, pupillary dilation has been reported in relation to the probability of a reward outcome, independently of its sign (i.e. responses are equivalent for losses and rewards; Van Slooten et al. [25], Lavín et al. [6], Satterthwaite et al. [37]) or even to the occurrence of errors, as a function of their likelihood [53]. When events have probability distributions defined along continuous feature spaces (e.g. position, number line), the pupil also responds in inverse proportion to the probability density of occurrence of that feature [13,10]. When event occurrences depend on past trial history, pupil responses reflect surprise taking account of that history [12,10]. Despite this apparent consistency of findings, no attempts have been made so far to assess whether the relationship between pupil size and event probability follows a logarithmic trend, as predicted if the pupil signals self-information. To step in this direction, the data from the aforementioned studies are plotted against quantified surprise values in Fig. 1 (see squares in figure). This analysis is restricted to studies that reported probabilities quantitatively and measured pupil size in millimetres or percent. Precise comparison across studies is not possible given that detailed conditions are not available (i.e. time and performance pressure, lighting conditions, baseline arousal levels, etc.) and that measurement methods may differ. However, this visualization already suggests that pupil dilation is linearly proportional to self-information, within and across studies.
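To make the surprise measure concrete, the following minimal sketch computes self-information as the negative logarithm of an event probability. The choice of base 2 (bits) and the function name `self_information` are assumptions made for illustration; the text above does not fix a unit.

```python
import math


def self_information(p: float) -> float:
    """Surprise of an event with probability p, in bits (base 2 is an assumption)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


if __name__ == "__main__":
    # Rarer events carry more self-information: halving p adds one bit.
    for p in (0.5, 0.25, 0.125):
        print(f"p = {p}: {self_information(p)} bits")
```

Under this definition, a pupil response that is linear in self-information corresponds to a logarithmic dependence on event probability.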

Information about task variables
The examples mentioned so far show that pupil size dilates in proportion to the amount of information needed to encode sensory stimuli. When a surprising stimulus is presented, self-information is large and pupils dilate. However, sensory stimuli, such as cues, can also carry information about other, separate events. Pupillary response to such cases was investigated in Preuschoff et al. [7], in which stimuli informed participants about their winning probability. Subjects had to bet on which of two cards, whose values were revealed afterwards, was going to be larger. In this study, Preuschoff and colleagues looked at the pupil response to the display of the first card value. Here all values (from 1 to 10) were equally likely, such that self-information was equal in all conditions. However, some cards provided more information than others about the chance of having a winning or losing bet. For example, when the first card was a 10, there was a guarantee of winning/losing if the participant had bet on the first card being larger/smaller (there were no ties in the game). Conversely, a 5 provided little information about the chance of winning, since probabilities were still close to 50-50. Such gradual gain of information about the probability distribution of a variable (chance of winning in the present case) can be quantified by the Kullback-Leibler (KL) divergence between prior and posterior variable distributions. KL divergence can be interpreted as the amount of information gained about the true probability distribution of a variable, after receiving new data. KL divergence provides a generalized measure of information gain that is equivalent to self-information in the case of detection or discrimination tasks. Remarkably, the pupillary response to first card value presentation in Preuschoff et al. [7] followed closely the KL divergence between subjects' belief on winning probability before and after observing the first card value (see light blue circles in figure 1 and panel A in figure 2), even though these results were not discussed as such in the paper.
When the second card was presented, different situations could occur. The predictions could be confirmed, in which case little information would be gained (e.g. first card was 8, predicting first card being larger, and second card was 5, confirming predictions), or they could be contradicted, in which case a lot of information would be gained (e.g. first card was 8 but second card was 9). Here again, pupil responded in proportion to the amount of information being gained about winning probability, quantified as KL divergence (see figure 1 and panel B in figure 2). The findings of Preuschoff et al. [7] are compelling for several reasons. First, pupil size variations occurred following participants' choice and were thereby not affected by decision processes or motor responses, reflecting purely inferential processes. Second, they allow us to make clear quantitative predictions in terms of information processing and these predictions are strikingly confirmed.
Figure 1. Relationship between information cost and pupil dilation in previous studies. Information cost was quantified as the KL divergence between prior and posterior beliefs. Squares in the graph illustrate pupillary responses to discrimination or detection tasks, in which KL divergence simplifies to stimulus self-information. Circles illustrate pupil dilations in response to task variables and decision making. See supplementary information for details.
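The information carried by the first card in the task of Preuschoff et al. [7] can be illustrated with a short sketch. It is not the simulation behind Fig. 1 or Fig. 2; it rests on several assumptions made here for illustration: the belief about winning is a two-outcome (win/lose) distribution, the prior is 50-50, the second card is drawn from the nine remaining values, information gain is computed as KL(posterior || prior) in bits, and the participant has bet that the first card is larger. The function names are hypothetical.

```python
import math


def bernoulli_kl(p: float, q: float) -> float:
    """KL(Bernoulli(p) || Bernoulli(q)) in bits, taking 0 * log(0) as 0.

    Assumes 0 < q < 1, which holds for the 50-50 prior used below.
    """
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            total += a * math.log2(a / b)
    return total


def first_card_information(card: int, n_cards: int = 10) -> float:
    """Information gained about winning after seeing the first card.

    Assumes a bet on 'first card larger' and no ties, so the second card
    is one of the n_cards - 1 remaining values.
    """
    p_win = (card - 1) / (n_cards - 1)
    return bernoulli_kl(p_win, 0.5)


if __name__ == "__main__":
    for card in range(1, 11):
        print(card, round(first_card_information(card), 3))
```

Under these assumptions the gain is largest for extreme cards (1 or 10, where the outcome becomes certain) and smallest for middle values such as 5 or 6, which matches the qualitative pattern described above.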
One difference between surprise and KL divergence models of pupil response is that, if the pupil responded only to surprise, its response would always depend on the frequency of occurrence of the presented stimulus, independently of the task. In contrast, KL divergence models predict that the pupil will respond to the amount of information provided by stimuli about task variables. This difference was exploited in two studies by Reinhard and colleagues [8,54] in which stimulus probabilities were manipulated in GO/NOGO tasks. In accordance with the information model, Reinhard et al. showed that pupillary response depended only on the probability of occurrence of the features of the GO/NOGO stimuli that were informative about the task (e.g. when GO was defined by the occurrence of 1-letter as opposed to 2-letter stimuli, the identity of the letter being presented was irrelevant and failed to affect pupil response; see simulated results in figure 1). More generally, several studies have found that pupillary responses to stimuli depend on whether they are attended to or not [31,32,34,55,56] and that these responses scale with the subjective salience of the stimuli [35,56,57]. In attentional blink experiments, targets that closely follow previous targets sometimes remain undetected. In these cases, pupillary response to target occurrence is greatly diminished [32]. Larger pupil dilation is associated with larger distractor interference [58], and increased processing of subliminal cues [59], in agreement with the view that pupil response scales with the quantity of visual information being processed.

Decision making
When decisions are made in the absence of uncertainty, such as in simple stimulus-response association tasks, the relationship between pupil response and information gain is straightforward. For example, Richer and colleagues showed that both reaction time and pupil dilation vary as a function of the number of stimulus-response associations [38], in accordance with the classical Hick-Hyman law [60]. Here the information cost of the decision can be quantified as the log of the number of possible stimulus-response associations in the task, which is equivalent to the KL divergence between prior and posterior beliefs [61] (see figure 1, yellow circles).
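The equivalence between the log of the number of associations and the KL divergence can be written out as a short derivation. It is a sketch under assumptions not spelled out in the text: a uniform prior over the N stimulus-response associations, a posterior that places all its mass on the selected association a*, information gain taken as the divergence of the posterior q from the prior p, and logarithms in base 2.

```latex
\[
p(a_i) = \frac{1}{N}, \qquad
q(a_i) =
\begin{cases}
1 & \text{if } a_i = a^{*} \\
0 & \text{otherwise}
\end{cases}
\]
\[
D_{\mathrm{KL}}(q \,\|\, p)
  = \sum_{i=1}^{N} q(a_i) \log_2 \frac{q(a_i)}{p(a_i)}
  = 1 \cdot \log_2 \frac{1}{1/N}
  = \log_2 N
\]
```

Terms with q(a_i) = 0 contribute nothing to the sum, so the cost grows by one bit each time the number of equally likely associations doubles, which is the form of the Hick-Hyman law.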
In conditions of uncertainty, the situation is slightly more complex. Satterthwaite and colleagues tested participants on a task similar to that of Preuschoff et al. [7], except that the decision followed, rather than preceded, the display of the first card value [37]. Participants had to pick either the face-up or face-down deck of cards. The second card value was then revealed and the trial was won if the card from the chosen deck was the larger one [37]. Interestingly, in that case, the results were exactly opposite those of Preuschoff: when the first card was less informative (e.g. 5), making it more difficult for the subject to choose which deck to pick, the pupil response was larger than when the first number was either small or large, a case in which the decision was easier to make. The reaction time associated with the decision followed the same pattern, being larger for less informative values. This observed relationship between reaction time and pupillary dilation has been found in many studies [46,19,62,24,63,2,64] and pupillary responses are best modelled by means of regressors that extend during the whole reaction time period of the trial rather than by brief pulses limited to stimulus onset [18,22]. These findings suggest that the process from which pupillary dilation originates is maintained during the whole decision process.
The finding that uncertain or conflictual decisions are slower than decisions for which more information is available from the stimulus is classical in the decision making literature. It can be modeled as a drift diffusion process in which noisy evidence accumulates until a threshold is reached and in which the rate of accumulation depends on how close the option values are to each other [65,66]. Drift diffusion models can also be interpreted as time-resolved Bayesian decision making processes in which each accumulation step corresponds to the update of prior to posterior belief [67]. The noisier the evidence, the more updates will tend to go in the wrong direction. Therefore, the summed quantity of information accumulated over the whole decision process is larger when evidence is noisy than when it is not. Thus, results from Satterthwaite et al. [37] can be accounted for by considering the sum of the KL divergences resulting from every update along the drift diffusion process (see figure 3 and light orange circles in figure 1). In Urai et al. [19] and Colizoli et al. [24], pupil size was measured during motion discrimination tasks and was shown to vary in parallel with decision uncertainty and reaction time: it decreased with stimulus strength for correct trials (low uncertainty), but increased with stimulus strength in error trials (high uncertainty). This pattern of results can also be explained by resorting to drift diffusion models of decision making and by assuming variable drift rates [66]. Along the same line, Cheadle et al. [45] showed that during a task in which evidence accumulated over eight successive stimulus presentations, pupillary responses were proportional to the amount of evidence provided by each stimulus. Moreover, this response was modulated by recency and confirmation biases, which both also affected decisions. So pupil responses tracked decision updates, as predicted by our proposal. In de Gee et al. [18] and de Gee et al. [22], pupil responses in detection and 2-alternative forced choice tasks were shown to be inversely proportional to the probability of the choice and hence to the KL divergence between prior and posterior: in conservative participants (biased towards NO), YES choices led to larger responses, while the opposite tended to be found in more liberal participants (biased towards YES). Pupil responses were also shown to vary as a function of the influence of the prior on perceptual decisions in de Gee et al. [22] and Krishnamurthy et al. [21]: when prior beliefs have less weight (because of better control or attentional allocation or because of low prior reliability), more information is extracted from the sensory stimulus, KL divergence is larger and the pupil dilates more. Along the same line, when the occurrence of surprising outcomes suggests the task structure may have changed, pupil dilation is even larger [10,21,26]. This is because such environmental volatility is associated with increased learning rate and thus increased influence of sensory evidence on internal models of the task. Indeed, the extent to which volatility affected learning rate correlates with the magnitude of the pupil response [10,21]. Together, these findings on pupillary response to volatility and surprise confirm that pupil diameter scales with how much novel sensory evidence is used to update current belief states.
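The summed-KL account of drift diffusion described above can be illustrated with a toy simulation. This is only a sketch and not the simulation reported in figure 3. All modelling choices are assumptions made here: unit-variance Gaussian evidence samples, a fixed decision bound on the accumulated evidence, a belief obtained by passing the accumulated evidence through a logistic function with a fixed gain (so the observer does not rescale beliefs by trial difficulty), and a per-step cost equal to the KL divergence of the new belief from the previous one, in bits. All names and parameter values are hypothetical.

```python
import math
import random


def bernoulli_kl(p: float, q: float) -> float:
    """KL(Bernoulli(p) || Bernoulli(q)) in bits, taking 0 * log(0) as 0."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            total += a * math.log2(a / b)
    return total


def summed_kl_one_trial(drift, rng, bound=5.0, gain=0.5, max_steps=10_000):
    """Accumulate noisy evidence to a bound; return (summed KL, number of steps)."""
    evidence, belief, total, steps = 0.0, 0.5, 0.0, 0
    while abs(evidence) < bound and steps < max_steps:
        evidence += rng.gauss(drift, 1.0)           # one noisy evidence sample
        new_belief = 1.0 / (1.0 + math.exp(-gain * evidence))
        total += bernoulli_kl(new_belief, belief)   # cost of this belief update
        belief = new_belief
        steps += 1
    return total, steps


def mean_summed_kl(drift, n_trials=2000, seed=0):
    """Average summed KL over simulated trials with a fixed random seed."""
    rng = random.Random(seed)
    totals = [summed_kl_one_trial(drift, rng)[0] for _ in range(n_trials)]
    return sum(totals) / n_trials


if __name__ == "__main__":
    for drift in (0.05, 0.2, 0.4):
        print(f"drift = {drift}: mean summed KL = {mean_summed_kl(drift):.3f} bits")
```

In this toy setting, weak drift makes the belief wander for longer before the bound is reached, so more updates are summed. The fixed gain matters for this outcome: an ideal observer who scaled each update by the known evidence strength would not necessarily show the same dependence on drift.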

Mental effort
Another common finding in the literature is that pupil size varies as a function of task demands and the subject's engagement in the task, suggesting the view that pupillary dilation indexes mental effort [2,4,5,68,69,70]. We have recently proposed that mental effort too can be quantified as the average KL divergence between prior and posterior beliefs [61]. Effortful tasks often include a large number of associations between stimuli and responses, resulting in low prior beliefs for each association and requiring large updates in order to reach precise posterior beliefs (e.g. N-back task; see simulations of N-back task from Rondeel et al. [68] in Fig. 1, red circle). Other cases of difficult tasks are those in which prior beliefs do not match task statistics (e.g. Stroop task), or in which task statistics change constantly (e.g. switch tasks), also implying large updates and large information costs (see simulations of Stroop and switch tasks from Rondeel et al. [68] in Fig. 1, orange and yellow circles). So the present proposal that pupil size scales with information gain can also be applied to complex tasks and accounts for the classical relation between mental effort and pupillary dilation.

Tonic pupil size
So far we have restricted our discussion to phasic pupil responses, i.e. the change in pupil size that follows event onset. However, the tonic variations in pupillary diameter, usually measured during baseline epochs that precede trial onsets, also have some interesting properties. These tonic pupillary changes have been related to the modes of discharge observed in noradrenergic neurons [29,63,48,47,50,30]. Large phasic responses occur when baseline firing rates of noradrenergic neurons are low and would correspond to small tonic pupil size, whereas large baseline noradrenergic activity would be associated with large tonic pupil size but small phasic responses [71,29,63,72,47]. Indeed, negative correlations between spontaneous changes in tonic and phasic pupil size have been reported repeatedly [29,20,63,44,73,18,74,72], even though task-induced or interindividual changes in tonic and phasic pupil size often go in the same direction [12,21,10,30,25,75].
Assuming that tonic variations of pupil size, like phasic task-induced changes, reflect quantitatively the amount of information being processed by the brain may help reconcile these contradictory findings in a parsimonious way. When information is attached to an abrupt sensory signal, it leads to phasic dilation whose magnitude is proportional to the KL divergence between prior and posterior beliefs. In the absence of a clear onset, tonic pupil size reflects information processing from memory, i.e. manipulation of working memory, planning, mind-wandering, mental imagery or offline learning. Therefore, tonic pupil size would increase when cognitive activity occurs out of sync with task events [76], hence decreasing the limited cognitive resources available for the main task [61], leading to distractibility and exploratory behaviour, but it would also increase during demanding covert computations on working memory [81,82,25]. However, confirming this hypothesis requires quantifying out-of-sync information processing in terms of KL divergence, as we did for phasic pupillary responses. Since we cannot provide such quantified predictions on the basis of current literature, this will have to rely on future experimental studies.

Relation to alternative theories
Pupillary responses, because of their relation to the noradrenergic system [71], have previously been linked to unexpected uncertainty [27,84], sometimes taken as a synonym of surprise [7,28,6] and sometimes as an equivalent of volatility, i.e. how likely the environment dynamics are to change [84,27,21,85,86,87]. These two definitions are strongly related since surprising observations suggest that the statistical structure of the environment may have changed [88]. While surprise is event-related and could be linked to phasic pupillary changes [28], volatility varies slowly and could be related to tonic pupil size [27]. Unexpected uncertainty also relates strongly to the problem of exploitation/exploration trade-off, another concept linked to pupillary responses [30,29,83,89]: when confidence in the internal model of the environment drops following surprising observations, exploitation strategies lose value with respect to alternative exploration strategies [84]. However, recent data have shown that variations of tonic pupil size are not indicative of unexpected uncertainty, but are rather a signature of reducible uncertainty (ambiguity resulting from a poor model of the environment, caused by undersampling; Krishnamurthy et al. [21]) or expected uncertainty (related to the variance of the task; De Berker et al. [12]). This is also in line with the finding that pupil size does not depend only on noradrenaline but also on other neuromediators such as acetylcholine [50], whose function has been associated with encoding of expected uncertainty [27]. Phasic pupillary responses, on the contrary, were shown to correlate with unexpected uncertainty [21]. However, since volatility is a slow-changing property of the environment, this observed correlation with phasic pupillary changes must reflect the fact that, when prior knowledge of the environment is unreliable (i.e. volatility is high), more weight is given to new sensory evidence, as opposed to prior biases [90,84,27], and model updates between prior and posterior beliefs are more expansive [90], leading to larger pupillary dilations. Overall, current evidence does not seem to favour the view that pupil dilation would be indicative of specific types of uncertainty but, as I argue in the present work, would rather signal information processing, which itself depends strongly on uncertainty conditions.

Limitations
Notably, two studies reported results that appear to be in contradiction with our information model. In O'Reilly et al. [13], the onset of unexpected saccadic targets led to pupillary dilations, but when these violations of expectation indicated the need to update the internal model of saccade target distributions, pupillary responses were smaller than when these unexpected events were identified as being outliers (identified by their colour). In Van Slooten et al. [23], pupillary response to the outcome of subjects' choices in a 2-arm bandit task was shown not to depend on modelled expectations: when subjects were thought to expect a large reward, their pupillary response was similar regardless of feedback. Further, the magnitude of the decision-related response scaled with the difference between the available options, and feedback pupillary response was inversely proportional to the model learning rate, both results being in apparent contradiction with previous literature [10,37,19] and the present proposal. In both aforementioned cases, pupillary responses were compared to variables of computational models fitted to behaviour, as opposed to direct task variables. These behavioural models are based on assumptions and conclusions drawn from the models are valid only to the extent that these assumptions are justified. For example, in O'Reilly et al. [13] the model assumed participants did not update their internal model when faced with outlier stimuli. However, it could be argued that participants always updated their internal models in the face of surprising targets but had to put in extra work to cancel these updates when figuring out that the target was an outlier. So while the results of O'Reilly et al. [13] and Van Slooten et al. [23] appear to contradict our view and invite us to remain cautious in our conclusions, possible alternative interpretations of their data suggest that more investigations should be conducted to resolve this apparent inconsistency.

Conclusion
In the present paper, the factors that trigger changes in pupil-linked arousal were discussed in the light of an information-theoretic framework. The hypothesis that pupil size scales with the amount of information being processed allowed us to explain a wide range of data, sometimes with quantitative predictions. This view applies both to tonic and phasic pupillary responses, the difference being that phasic responses mark information processing triggered by precise event onset while tonic pupillary changes are not precisely aligned to external events.
Besides the factors that trigger pupillary changes, an equally important issue concerns the computational effects of pupil-linked arousal, and more generally, its functional role in brain computations. This issue goes beyond the scope of the present paper and will be discussed in future work.