Proceedings of the Royal Society B: Biological Sciences

Physiological arousal predicts gaze following in infants

Mitsuhiko Ishikawa


Department of Psychology, Graduate School of Letters, Kyoto University, Yoshida-Honmachi, Kyoto 606-8501, Japan

Japan Society for Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan


Shoji Itakura


Department of Psychology, Graduate School of Letters, Kyoto University, Yoshida-Honmachi, Kyoto 606-8501, Japan




    According to the natural pedagogy theory, infant gaze following is based on an understanding of the communicative intent of specific ostensive cues. However, it has remained unclear how eye contact affects this understanding and why it induces gaze following behaviour. In this study, we examined infant arousal in different gaze following contexts and whether arousal levels during eye contact predict gaze following. Twenty-five infants, ages 9–10 months participated in this study. They watched a video of an actress gazing towards one of two objects and then either looking directly into the camera to make eye contact or not showing any communicative intent. We found that eye contact led to an elevation in the infants' heart rates (HRs) and that HR during eye contact was predictive of later gaze following. Furthermore, increases in HR predicted gaze following whether it was accompanied by communicative cues or not. These findings suggest that infant gaze following behaviour is associated with both communicative cues and physiological arousal.

    1. Introduction

    From an early stage, infants are highly sensitive to the human gaze, and they refer to the gaze of others for social learning. Csibra & Gergely [1] proposed the theory of natural pedagogy, which stipulates that infants follow the gaze of others because they are sensitive to their communicative intent. According to this theory, gaze following is necessarily accompanied by specific ostensive cues, such as direct gaze and infant-directed speech, which promote infant gaze following [2–5].

    However, a recent study has indicated that infants demonstrate gaze following behaviour in situations without ostensive cues. Gredebäck et al. [6] examined infant gaze following both with and without communicative intent by creating three experimental contexts using models presenting: (i) eye contact (social, ostensive), (ii) no cue (non-social, non-attention-seeking), and (iii) shivering (social, non-ostensive, attention-seeking). Their results suggested that infants follow the human gaze independently of communicative intentions.

    Thus, the effects of eye contact on infant gaze following have remained unclear. This may be partly because previous empirical studies have focused primarily on visual attention and have not examined what infants experience during eye contact or how eye contact promotes gaze following. Szufnarowska et al. [7] showed that infants follow others' gaze direction in situations that are highly stimulating and grab their attention. They suggested that gaze following is based on an infant's attentional state and that eye contact can strongly elicit an infant's attention. Other studies have measured infant attention as looking time at an actress' face [2,6], but the results have been inconsistent across studies. Senju & Csibra [2] used animation to engage infants' attention; however, even though the infants attended closely to the screen, they did not show gaze following. Therefore, a different approach to measuring infant state is needed to clarify the promotive effects of eye contact on gaze following.

    According to the Aston–Jones model of attention (AJMA), animals tend to be unresponsive to changes in external stimuli at low levels of arousal. Conversely, animals are more sensitive to the surrounding environment and more responsive to changes in external stimuli at high levels of arousal [8,9].

    Arousal is commonly associated with visceral, autonomic and endocrinal changes in the body, induced by subcortical structures, particularly the amygdala [10,11]. Some studies have argued that eye contact directly activates arousal systems in the brain [12]. Increased arousal influences the perceptual and cognitive processes dealing with social stimuli [13]. For example, eye contact elevates physiological arousal [14,15], affecting such social cognitive processes as face perception [16] and mentalizing [17].

    Physiological arousal may increase when eye contact promotes gaze following because the amygdala, which is an important part of the social brain network [18], connects these processes [19,20]. It has also been shown that amygdala activation is closely related to indices of physiological arousal, such as heart rate (HR) [21]. Ishikawa & Itakura [22] have indicated that enhanced amygdala activation could promote the response to others' gaze direction. Because the amygdala is located deep within the brain, its activity cannot be measured by near-infrared spectroscopy or electroencephalography. Thus, measuring physiological arousal is the most feasible way to examine how eye contact affects infant states in the context of gaze following.

    In the current study, we examined how physiological arousal in infants changes in the context of gaze following [23–26] and whether arousal level during eye contact predicts gaze following behaviour. The main purpose of this study is to examine how eye contact affects infants in the situation of gaze following. In infant studies, HR has been the most common and robust measurement used to monitor physiological arousal [27,28]. Therefore, in the current study, we used infant HR as an index of physiological arousal to measure their arousal levels while they watched a video. We predicted that eye contact would elevate HR levels [14,15]. We also predicted that infants would follow another's gaze only when there was eye contact [2]. In addition, we expected that HR levels during eye contact would predict later gaze following behaviour.

    2. Methods

    The experimental protocol was approved by the Research Ethics Review Board of the Kyoto University Department of Psychology in Japan. The parents of all participants provided written informed consent before their infants participated in this study.

    (a) Participants

    Twenty-five (12 females and 13 males) 10-month-old infants completed the study. The sample size was decided based on a recent study measuring infant HR and eye tracking [29]. The mean age of the infants was 295.64 days (range: 281–335 days). A further seven infants were excluded from the analyses owing to their inability to maintain attentiveness (infants who had fewer than three trials with a gaze at one of the objects, and infants who removed the electrocardiogram (ECG) electrodes).

    (b) Apparatus

    A Tobii T60 Eye Tracker integrated with a 17-inch TFT monitor (Tobii Studio 2.2.8, Tobii Technology, Stockholm, Sweden) was used to present the stimuli to the infants and to record eye movements at a 60 Hz sampling rate. The final videos were edited using Adobe Premiere Pro CS6 in order to control the duration of each phase.

    Participants were seated in a carer's lap approximately 60 cm from the monitor. Prior to the recording, a five-point calibration was conducted. Figure 1 shows the stimuli used during the familiarization and test phases.


    Figure 1. Selected frames of the stimulus videos. All videos started with the baseline phase (a), followed by the action phase (b–d) and the gazing phase (e). The action phase consisted of three conditions: eye contact (EC), no cue (NC) and shivering (SV).

    For the HR recording, we used a Polymate (Digitex Lab, Japan) to measure the ECG data at a sampling rate of 1000 Hz. The ECG was monitored with a 4-lead configuration, and rubbing alcohol was applied to the infant's skin before attaching the electrodes to reduce impedance.

    (c) Stimuli and procedure

    We modelled our procedures and stimuli on those used by Gredebäck et al. [6]. Each stimulus video clip began with an onscreen female actress seated at a table and gazing downward. Two toys were placed on the table, with one to each side of the actress (figure 1). These toys were alternately used as a target object and a distractor object. The videos consisted of three phases. In the baseline phase of each condition, the actress remained still for 2 s and then looked up gradually into the camera with both eyes closed. This was followed by the action phase, which differed between conditions. In the eye contact (EC) condition, the actress raised her eyes and looked into the camera for 3 s. In the no cue (NC) condition, the actress closed her eyes for 3 s. In the shivering (SV) condition, the actress made a few rapid horizontal head shakes as if shivering for 3 s with closed eyes. The third phase was the gazing phase. In this phase, the actress turned her head approximately 45° toward one of the two objects on the table and fixated on it for 5 s. In the NC and SV conditions, the actress opened her eyes just before shifting her head. The actress maintained a neutral facial expression and remained silent throughout the entire experimental sequence.

    All three conditions were run within-subject, and a total of 12 trials were presented to each infant. Trials were presented in a quasi-randomized order across conditions. The stimulus in each trial used one toy as the target object and one toy as the distractor object in a randomized sequence. The direction of the actress' gaze was counterbalanced in ABBABA order. Half of the infants saw a leftward gaze in the first trial, and the other half saw a rightward gaze first. Before the start of each trial, the infant's attention was drawn to the centre of the screen, on which the actress' face appeared along with colourful animations and sounds. When the infant began paying attention to the screen, the experimenter pressed a key to begin the trial. The total length of all trials was approximately 5 min.

    (d) Data analysis

    We used a Clearview fixation filter for the eye-tracking data. A fixation was defined as gaze recorded within a 50-pixel diameter for a minimum of 200 ms, and this criterion was applied to the raw eye-tracking data to determine the duration of each fixation. On average, 61.63% of gaze samples were recorded (s.d. = 12.03%, range: 45–90%). Seven infants were excluded from the final analysis because of inattentiveness.
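    The fixation criterion above (gaze within a 50-pixel diameter for at least 200 ms, sampled at 60 Hz) can be illustrated with a simple dispersion-based detector. This is only a sketch under stated assumptions: it is not the Clearview filter itself, the function names are hypothetical, and "within a 50-pixel diameter" is interpreted here as every pair of samples lying at most 50 pixels apart.

```python
from math import hypot

SAMPLE_RATE_HZ = 60      # eye-tracker sampling rate reported in the text
MAX_DIAMETER_PX = 50     # spatial criterion reported in the text
MIN_DURATION_MS = 200    # temporal criterion reported in the text


def within_diameter(points, max_diameter):
    """True if every pair of points is at most max_diameter apart (assumed reading)."""
    return all(
        hypot(x1 - x2, y1 - y2) <= max_diameter
        for i, (x1, y1) in enumerate(points)
        for (x2, y2) in points[i + 1:]
    )


def detect_fixations(samples):
    """Return (start_index, end_index, duration_ms) for each detected fixation.

    samples is a list of (x, y) gaze positions in pixels, one per 60 Hz sample.
    """
    fixations = []
    start = 0
    n = len(samples)
    while start < n:
        end = start + 1
        # Grow the window while all samples stay within the spatial limit.
        while end < n and within_diameter(samples[start:end + 1], MAX_DIAMETER_PX):
            end += 1
        duration_ms = (end - start) * 1000 / SAMPLE_RATE_HZ
        if duration_ms >= MIN_DURATION_MS:
            fixations.append((start, end - 1, duration_ms))
            start = end      # continue after the fixation
        else:
            start += 1       # too short: slide the window by one sample
    return fixations
```

    At 60 Hz, the 200 ms minimum corresponds to 12 consecutive samples, so shorter clusters of gaze points are not counted as fixations in this sketch.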

    (e) Eye-tracking data

    The principal measure of gaze following was the infant's first saccade towards an object during the gazing phase. If the infant looked at the same object the actress gazed at, the gaze behaviour was coded as gaze following, and the percentage of gaze following was calculated. Infants were required to make this type of saccade in at least three trials to be included in the analyses. In addition, we characterized the infants' attention to the face by calculating the duration of gaze at the actress' face separately for the baseline and action phases. In total, 20% of all trials were excluded from the analysis of gaze following (EC: 5.33%, NC: 6.66%, SV: 7.66%). Seven infants were excluded because of inattentiveness, so 25 infants contributed data to the analysis.
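    The scoring rule described above can be sketched as follows. The data layout (a list of first-look and target pairs) and the function name are hypothetical, and the sketch assumes the three-trial minimum is applied to whatever set of trials is passed in; the text does not state whether it was applied per condition or overall.

```python
MIN_VALID_TRIALS = 3  # inclusion criterion reported in the text


def gaze_following_percentage(trials):
    """Percentage of valid trials in which the first look matched the target.

    trials is a list of (first_look, target) pairs, where first_look is
    'left', 'right', or None when the infant made no saccade to an object.
    Returns None when there are too few valid trials (infant excluded).
    """
    valid = [(look, target) for look, target in trials if look is not None]
    if len(valid) < MIN_VALID_TRIALS:
        return None
    followed = sum(1 for look, target in valid if look == target)
    return 100.0 * followed / len(valid)
```

    A score of 50% corresponds to chance, because each trial offers two objects.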

    (f) Heart rate

    Trials with excessive infant movement were excluded before we calculated R-wave-to-R-wave (R-R) intervals. R-R intervals were then visually inspected to identify occasional missed beats, which were replaced by interpolation from neighbouring R-R intervals. We separated the ECG data into the three phases (baseline, action and gazing) and averaged the R-R intervals in each phase. In total, 80.4% of all trials were used for the analysis of HR, and 98% of the excluded trials overlapped with those excluded from the eye-tracking data. We calculated beats per minute for the HR analysis from the averaged R-R intervals. To examine how HR increase predicts gaze following, we calculated the degree of increase from the baseline to the action phase for each trial.
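    The conversion from averaged R-R intervals to beats per minute, and the baseline-to-action increase, can be sketched as below. The function names are hypothetical, and expressing the "degree of increase" as a proportional change relative to baseline is an assumption; the text does not give the exact formula.

```python
from statistics import mean


def bpm_from_rr(rr_intervals_ms):
    """Heart rate in beats per minute from R-R intervals given in milliseconds."""
    return 60000.0 / mean(rr_intervals_ms)


def hr_increase(baseline_rr_ms, action_rr_ms):
    """Proportional HR change from baseline to action phase (assumed definition)."""
    baseline_bpm = bpm_from_rr(baseline_rr_ms)
    action_bpm = bpm_from_rr(action_rr_ms)
    return (action_bpm - baseline_bpm) / baseline_bpm
```

    For example, a mean R-R interval of 500 ms corresponds to 120 bpm, and shorter intervals in the action phase yield a positive increase.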

    3. Results

    For analysis of gaze following, we used the same statistical method as Gredebäck et al. [6]. We conducted a one-way ANOVA with the gaze following rate used as the dependent variable across the three conditions. Results showed a significant main effect of condition (F(2,48) = 8.059, p = 0.001, ηp² = 0.251). Bonferroni post hoc tests showed that a higher gaze following rate was found in the EC condition relative to the NC condition. A marginal difference was found between the EC and SV conditions (EC versus NC: p = 0.001; EC versus SV: p = 0.069).
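    As a reading aid, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The helper name below is hypothetical; the example uses the values reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of condition reported in the text: F with 2 and 48 degrees of freedom.
print(round(partial_eta_squared(8.059, 2, 48), 3))  # 0.251
```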

    In addition, we performed two-tailed t-tests against a chance level of 50% (figure 2). Infants exhibited significant gaze following behaviour only in the EC condition (EC: M = 74.9%, t(24) = 4.465, p < 0.001, d = 1.82; NC: M = 42.9%, t(24) = −1.342, p = 0.192, d = 0.54; SV: M = 53.9%, t(24) = 0.765, p = 0.452, d = 0.31).
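    A one-sample t statistic against the 50% chance level can be computed as sketched below. The function names are hypothetical. The conversion d = 2t/√df is an assumption: it reproduces the reported d for the EC condition, but the text does not state which effect-size formula was used.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(percentages, chance=50.0):
    """Return (t, df) for a one-sample t-test of per-infant scores against chance."""
    n = len(percentages)
    standard_error = stdev(percentages) / sqrt(n)
    t = (mean(percentages) - chance) / standard_error
    return t, n - 1


def d_from_t(t, df):
    """Effect size from t and df (assumed conversion, not confirmed by the text)."""
    return 2 * t / sqrt(df)


# EC condition values reported in the text: t = 4.465 with 24 degrees of freedom.
print(round(d_from_t(4.465, 24), 2))  # 1.82
```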


    Figure 2. Results of gaze following during the gazing phase and the proportion of gaze following in each condition. The x-axis depicts condition and the y-axis depicts the percentage of gaze following. Asterisks indicate statistical significance, p < 0.05; n.s. = not significant. Error bars depict standard error.

    As an additional measure of gaze following, we compared the amount of time the infants spent looking at the target versus the distractor object across conditions. Based on the protocol used in Gredebäck et al. [6], we conducted a 2 × 3 ANOVA with two levels of area of interest (AOI; target object and distractor object) and three levels of condition (EC, NC, SV). There was a marginal interaction effect between AOI and condition (F(2,48) = 2.16, p = 0.079, ηp² = 0.083). Bonferroni post hoc tests showed that infants looked longer at the target object (M = 0.73 s) than at the distractor object (M = 0.50 s) only in the EC condition (p = 0.003).

    To examine how HR changed while infants watched the videos, we conducted a 3 × 3 ANOVA with three levels of condition (EC, NC, SV) and three levels of phase (baseline, action, gazing). There was a significant interaction between condition and phase (figure 3; F(4,96) = 2.49, p = 0.048, ηp² = 0.094). Bonferroni post hoc tests showed that the EC condition was associated with an HR increase from the baseline (M = 126.50 bpm) to the action phase (M = 127.83 bpm, p = 0.002) and an HR decrease from the action phase to the gazing phase (M = 126.33 bpm, p < 0.001). The SV condition also showed a decrease in HR from the action phase to the gazing phase. In the action phase, HR was on average higher in the EC condition than in the NC condition, and there was no difference between the EC and SV conditions (EC versus NC: p = 0.001; EC versus SV: p = 0.342).


    Figure 3. Mean heart rate (HR) levels during each phase for each condition. The x-axis depicts video phase and the y-axis depicts HR in beats per minute. Error bars depict standard error.

    We conducted generalized linear model (GLM) analyses to predict gaze following from condition and HR increase, calculated as the degree of increase in HR from the baseline to the action phase. We first fitted a generalized linear mixed model and, because there was no random effect of individual difference, we selected the GLM. According to the GLM analysis, the HR increase rate predicted later gaze following (estimate ± s.e. = 68.71 ± 11.91, Z = 5.769, p = 0.001) in all three conditions, with a higher HR increase associated with gaze following behaviour (figure 4). In addition, we tested the difference between slopes across the three conditions. Results indicated that only the EC condition differed from the NC condition and there were no differences between any other conditions (EC versus NC: estimate ± s.e. = −1.71 ± 0.46, Z = −3.727, p < 0.001; EC versus SV: estimate ± s.e. = −0.733 ± 0.40, Z = −1.829, p = 0.159; NC versus SV: estimate ± s.e. = 0.593 ± 0.38, Z = 1.59, p = 0.247). The model predicted that, although infants show the same general level of HR increase from the baseline to the action phase in all three conditions, they will follow the actress' gaze most often in the EC condition.


    Figure 4. Results of GLM investigating the effects of the experimental condition and HR variability on gaze following. The x-axis depicts HR variability and the y-axis depicts the percentage of gaze following.

    In addition, we conducted a causal mediation analysis to examine whether HR increase mediates the relationship between condition and gaze following. We used the R package ‘mediation’ [30] because it can be used with GLMs. The indirect effect was significant (95% confidence interval (CI) = 0.0283–0.19, an interval that did not include zero; p = 0.006). The direct effect of condition was also significant (CI = 0.1044–0.38, p = 0.002). These results show that HR increase partially mediates the relationship between condition and gaze following.

    Since it is possible that attention to the actress' face affected infant HR and gaze following, we examined the infants' attention to the actress' face across conditions in all three phases. For each phase, we conducted a one-way ANOVA across the three conditions using gaze time at the actress' face as the dependent variable (table 1). There were no differences between conditions in any phase (baseline: F(2,48) = 0.302, p = 0.741, ηp² = 0.012; action phase: F(2,48) = 0.233, p = 0.793, ηp² = 0.010; gazing phase: F(2,48) = 1.237, p = 0.299, ηp² = 0.049).

    Table 1. Mean looking time (s) and standard deviations for each condition in three phases.

                      eye contact    no cue    shivering
    baseline phase
     mean             1.86           1.78      1.76
     s.e.             0.57           0.74      0.63
    action phase
     mean             1.49           1.37      1.39
     s.e.             0.29           0.25      0.26
    gazing phase
     mean             0.94           0.88      0.97
     s.e.             0.42           0.39      0.38

    4. Discussion

    We examined the ways in which EC affects HR in infants and the degree to which HR predicts later gaze following. Our results showed that the infant's HR increased only during the EC condition. Moreover, HR increase during the action phase predicted later gaze following for all three conditions. The HR only partially mediated between condition and gaze following.

    The eye-tracking results replicated those of Senju & Csibra [2]. Eye contact was the only condition that promoted infant gaze following, and it was the only condition in which the HR increased from the baseline. Prior research has demonstrated that EC elevates physiological arousal [14] and enhances social cognitive processes [16,17]. Our results suggest that, in the context of gaze following, eye contact may elevate infant physiological arousal and thereby induce gaze following.

    The mechanism of promotive EC effects on gaze following is probably related to an increase in physiological arousal.

    According to the results of the studies by Gredebäck et al. [6] and Szufnarowska et al. [7], infants showed significant gaze following in the SV condition. This may link to the fact that infants show gaze following in situations which grab their attention [7]. However, in the current study, infants only followed the actress' gaze direction in the EC condition, and their fixation time spent on her face did not differ across conditions. These results suggest that infant gaze following is not dependent on simple visual attention to a person performing an activity.

    Our finding that HR increase predicted later gaze following in all three conditions is consistent with the AJMA's description of the relationship between high arousal and attention [8,9]. Barbaro et al. [31] examined how gaze behaviour is dynamically influenced by autonomic arousal. They also showed that arousal changes (e.g. changes in HR) occur before changes in attentional state, as predicted by the AJMA. Our results indicated that only the EC condition induced gaze following at above-chance levels and that HR was enhanced only in that condition. Furthermore, the causal mediation analysis showed that HR partially mediated the relationship between condition and gaze following. On the basis of these findings, the AJMA appears to account only partially for the mechanism underlying EC effects on gaze following. The HR increase predicted gaze following in all conditions; therefore, we conclude that the AJMA may also apply generally to infant responsiveness to external stimuli. However, if ostensive cues are presented and infant physiological arousal is sufficiently increased, infants are more likely to show gaze following.

    The results of the present study showed that physiological arousal may play an important role in the mechanism of infant gaze following behaviour. However, based on the GLM and causal mediation analysis results, we also found that HR cannot fully explain gaze following, as there was a strong promotive effect of EC regardless of HR. One possible explanation is that HR does not reflect all processing of EC information. Farroni et al. [32] showed that direct gaze at a presentation time of only 500 ms affects gaze following behaviour in newborns, suggesting that a fast pathway operating before the HR increase may be related to the promotion of gaze following.

    Senju & Johnson [13] proposed that EC might be processed in the amygdala during fast-track modulation. Because of the role of the amygdala in monitoring another's gaze direction, it seems probable that the activation of the amygdala may directly promote gaze following. However, it is still unclear how the limbic system of an infant activates when EC is perceived. Therefore, further studies are required to examine the fast promotive effects on gaze following.

    From our results, EC can enhance physiological arousal and EC itself may have promotive effects on gaze following. Moreover, physiological arousal can predict later gaze following without requiring ostensive cues. Therefore, we propose that infant gaze following is dependent on both communicative cues and physiological arousal. Because physiological arousal partially mediated between ostensive conditions and gaze following, it is possible that infants can follow another's gaze in a state of high arousal without ostensive cues [6,7]. This hybrid theory is a feasible explanation for the conflicting results of previous gaze-following studies.

    Duncan et al. [33] have described the problem of replication and robustness in developmental research. Underscoring this is the fact that arousal has been shown to be affected by diurnal rhythms, sleep and mood [34]. Therefore, infants may be in very different states depending on the experimental setting, the nature of stimuli, the testing time and other factors. Because of this, arousal measurement would be a helpful addition to infant research, to further explain results and control for the myriad infant affective states.

    This study is, to our knowledge, the first to examine physiological arousal in the context of gaze following, and we showed that physiological arousal is highly related to gaze following in infants, especially with ostensive cues. The primary limitation of this study is the small number of participants. Gredebäck et al. [6] ran the same three conditions between-subject, with about 30 infants per condition. Therefore, it must be considered that infants may also show significant gaze following in the NC and SV conditions, but that the effect sizes may be small. Although it is technically difficult to collect gaze data and physiological arousal at the same time in infants, future studies should recruit more participants and run between-subject analyses. In addition, the slight difference in stimuli might have affected the current results. In Gredebäck et al. [6], the actress kept her gaze lowered in the NC and SV conditions, whereas the actress closed her eyes in our study. Therefore, closed eyes may have decreased infant gaze following. Gredebäck et al.'s [6] study should be replicated with exactly the same stimuli while measuring physiological arousal.

    In conclusion, our study demonstrated that both communicative cues and physiological arousal affect gaze following in infants. Physiological arousal may play an important role in understanding the mechanisms behind eye contact effects.



    Data accessibility

    Data used in analysis are available in the electronic supplementary material.

    Authors' contributions

    M.I. developed the study concept, and conducted experiments and data analysis. All authors approved the experiment design and discussed the results. S.I. supervised this study.

    Competing interests

    Authors have no conflicts of interest.


    This work was funded by Young Fellowship grants to M.I. from the Japan Society for the Promotion of Science and grants to S.I. from the Japan Society for the Promotion of Science (grant nos. 25245067 and 16H06301).


    We appreciate the cooperation of all families that agreed to participate in this study. We would also like to thank the anonymous reviewers and colleagues who have provided us with useful feedback.


    Electronic supplementary material is available online at

    Published by the Royal Society. All rights reserved.