Eye and head movements are complementary in visual selection

In the natural environment, visual selection is accomplished by a system of nested effectors, moving the head and body within space and the eyes within the visual field. However, it is not yet known if the principles of selection for these different effectors are the same or different. We used a novel gaze-contingent display in which an asymmetric window of visibility (a horizontal or vertical slot) was yoked to either head or eye position. Participants showed highly systematic changes in behaviour, revealing clear differences in the principles underlying selection by eye and head. Eye movements were more likely to move in the direction of visible information—horizontally when viewing with a horizontal slot, and vertically with a vertical slot. Head movements showed the opposite and complementary pattern, moving to reveal new information (e.g. vertically with a horizontal slot and vice versa). These results are consistent with a nested system in which the head favours exploration of unknown regions, while the eye exploits what can be seen with finer-scale saccades.


Introduction
During routine behaviour in naturalistic environments, we are faced with ongoing decisions about where to attend and how to orient our sensory effectors to obtain information about the world. In vision, our moment-to-moment grasp of the visual world is severely limited, with the visual field encompassing less than 180° (approximately 120° binocular), and within that range acuity falls precipitously away from the focal (foveated) area, with peripheral information severely degraded, low frequency, and having minimal chromatic sensitivity [1].
Given the importance of orienting behaviours, a large body of research has been devoted to understanding the processes that govern the selection of where and when to deploy [...]

[Figure caption] (a) In gaze-contingent displays, an opaque mask obscures the image, with the exception of a 'window', whose position is determined moment to moment by the recorded position of the eye or the head. (b) In asymmetrical windows, we differentiate between gaze shifts aligned with the long axis of the window, which result in inspection of areas where information is already available (1), and gaze shifts aligned perpendicular to the long axis, which result in inspection of areas where no information is currently available (2).

[...] head movements should be little different from eye movements insofar as their selection characteristics are concerned. Indeed, it has routinely been observed that eye and head movements are tightly and systematically coordinated, both in low-level neural circuits (in cats [21,22] and in primates [23,24]), and in behaviour during attentional shifts (e.g. [25,26]). Although top-down influences can override these coordinated behaviours (e.g. [27,28]), the apparent 'default' pattern suggests that head- and eye-based attentional systems orient most often using common or shared general principles.
In this study, we evaluate the extent to which orienting behaviours using different effectors may exhibit differential selection characteristics and biases. In particular, we combine the methods of Foulsham et al. [17] and Solman & Kingstone [9] to evaluate scene viewing with asymmetric gaze-contingent windows, controlled either by the eyes or by the head. Participants viewed scenes for 10 s and were given a simple yes/no question about the content following each scene (e.g. 'Was there a red chair?'). There were four viewing conditions and two orienting effector conditions. Participants scanned the scene either in full view, or through an effector-contingent window, which displayed only a portion of the scene at the moment-to-moment position indicated by the effector (figure 1). Effector-contingent windows were either square, horizontal slots or vertical slots. Effector conditions were eye contingent and head contingent, with the head-contingent window condition (conceptually equivalent to viewing a dark room with a headlamp) following the methods in Solman & Kingstone [9].
The key comparisons concerned how effector condition would interact with window condition. In particular, we asked whether head movements would, like eye movements, follow the long axis of asymmetrical windows, or instead shift the window in the direction of the short axis, so as to reveal more new information that could then be inspected by the eyes.

Displays and procedure
Displays subtended 39.2° by 29.4°, and windows were 15.3° by 3.8° for the asymmetrical windows, and 7.7° for the square window. Each participant viewed the same 40 scenes in randomized order, with window condition blocked and counterbalanced (10 scenes per window condition). Each trial began with a black dot displayed on a grey background, in one of the four corners or in the centre of the display. Participants shifted their gaze onto the dot to trigger onset of the image, which ensured that gaze-shift patterns did not reflect stereotyped patterns specific to a central starting position. Images were viewed for 10 s, and each was followed by a yes/no question about its content (e.g. 'Was there a red chair?') in order to motivate visual exploration of the images. Participants responded via keypress ('z' or 'm') with yes-no mapping counterbalanced across individuals. No feedback was provided, and the following trial began immediately after the participant's response. Participants completed two practice trials with a square window to familiarize themselves with the task, and a short break and recalibration occurred between each window condition.
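As a rough illustration of the display geometry, the sketch below converts the reported window sizes from degrees of visual angle into pixels for the eye-contingent set-up. It assumes that the 39.2° extent corresponds to the 1280-pixel horizontal span of a flat screen and that the window is centred on the line of sight; the function name `deg_to_px` is hypothetical and is not taken from the experiment code.

```python
import math

def deg_to_px(size_deg, screen_px, screen_deg):
    """Pixel extent of a centred stimulus subtending size_deg on a flat screen.

    Assumption: the full screen extent (screen_px) subtends screen_deg.
    """
    # Viewing distance expressed in pixels, derived from the full screen extent.
    distance_px = (screen_px / 2) / math.tan(math.radians(screen_deg / 2))
    return 2 * distance_px * math.tan(math.radians(size_deg / 2))

# Eye-contingent set-up (assumed mapping): 1280 px across 39.2 degrees.
slot_long_px = deg_to_px(15.3, 1280, 39.2)   # long axis of the slot
slot_short_px = deg_to_px(3.8, 1280, 39.2)   # short axis of the slot
square_px = deg_to_px(7.7, 1280, 39.2)       # side of the square window
```

Under these assumptions the long axis of the slot comes out at roughly four times the length of its short axis. The conversion is only approximate for windows displayed away from the screen centre.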

Tracking methods
Native data rates were 1000 Hz for the eye-contingent apparatus and 100 Hz for the head-contingent apparatus. For both effector conditions, we performed temporal smoothing of detected positions to reduce jitter in the displayed window position. Both raw (unsmoothed) and displayed (smoothed) coordinates were stored for analysis at a rate of 60 Hz. Approximate latency differed across systems (approximately 2 ms for the eye-contingent condition and approximately 10 ms for the head-contingent condition), but was in both cases shorter than the duration of a single screen refresh (approximately 16.7 ms at 60 Hz).
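The smoothing filter is not specified here, so the following is only a sketch of one common choice, an exponential moving average, and should not be read as the filter actually used. The function name `smooth_positions`, the weight `alpha` and the sample values are all hypothetical. The last lines check the reported latencies against the duration of one frame at 60 Hz.

```python
def smooth_positions(samples, alpha=0.3):
    """Exponentially smooth a sequence of (x, y) positions.

    Assumption: alpha (0 < alpha <= 1) is an illustrative weight, not a
    value taken from the experiment.
    """
    smoothed = []
    prev = None
    for x, y in samples:
        if prev is None:
            prev = (x, y)  # the first sample is passed through unchanged
        else:
            prev = (alpha * x + (1 - alpha) * prev[0],
                    alpha * y + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed

# Hypothetical jittery samples around a stable position (pixels).
raw = [(100.0, 100.0), (102.0, 99.0), (98.0, 101.0), (101.0, 100.0)]
displayed = smooth_positions(raw)

# Duration of one frame at 60 Hz, for comparison with the reported latencies.
frame_interval_ms = 1000 / 60
latencies_ms = {"eye": 2, "head": 10}
within_one_frame = all(v < frame_interval_ms for v in latencies_ms.values())
```

A smaller `alpha` suppresses more jitter but makes the window lag further behind the effector, so any such filter trades stability against responsiveness.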

Apparatus: eye contingent
The experiment was created in Matlab, using version 3 of the Psychophysics Toolbox [29,30] and the Eyelink Toolbox [31], and run on an SR Research supplied host PC with a 2.67 GHz Intel Core2 Quad CPU. The stimulus displays were presented on a 17 Dell P2411Hb monitor at a resolution of 1280 by 960 pixels and a distance of 55 cm. Eye movements were recorded throughout the task using a desktop-mounted Eyelink 1000 system (SR Research), with participants' heads stabilized by a chin and forehead rest. Velocity and acceleration were calculated online through SR Research's system, and saccades were identified by a velocity threshold of 30° s⁻¹ and an acceleration threshold of 8000° s⁻², as well as a motion threshold of 0.15°. Saccade offset was registered when these thresholds were no longer surpassed.
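To illustrate the logic of threshold-based saccade detection, the sketch below marks runs of samples whose speed exceeds a velocity threshold. It is a deliberately simplified stand-in and not SR Research's parser: it uses a two-point velocity estimate and omits the acceleration and motion thresholds described above. The function name `detect_saccades` and the example trace are hypothetical.

```python
import math

def detect_saccades(positions_deg, sample_rate_hz=1000.0, velocity_threshold=30.0):
    """Return (start, end) sample-index pairs where speed exceeds the threshold.

    positions_deg: list of (x, y) gaze positions in degrees of visual angle.
    Simplification: velocity criterion only, two-point difference.
    """
    dt = 1.0 / sample_rate_hz
    saccades = []
    start = None
    for i in range(1, len(positions_deg)):
        (x0, y0), (x1, y1) = positions_deg[i - 1], positions_deg[i]
        speed = math.hypot(x1 - x0, y1 - y0) / dt  # degrees per second
        if speed > velocity_threshold and start is None:
            start = i - 1                    # onset: last sample before the jump
        elif speed <= velocity_threshold and start is not None:
            saccades.append((start, i - 1))  # offset: threshold no longer exceeded
            start = None
    if start is not None:                    # trace ended mid-saccade
        saccades.append((start, len(positions_deg) - 1))
    return saccades

# Hypothetical trace: fixation, a 2 degree rightward shift, then fixation.
trace = [(0.0, 0.0)] * 5 + [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0), (2.0, 0.0)] + [(2.0, 0.0)] * 4
events = detect_saccades(trace)
```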

Apparatus: head contingent
The experiment was created in Matlab, using version 3 of the Psychophysics Toolbox, and run on a Dell Precision T3500 computer with a 3.07 GHz Intel Xeon Processor. Stimulus displays were rear-projected with a Canon LV8235-UST projector onto a Da-Lite screen with a diagonal span of 132 at a resolution of 1920 by 1080 pixels and a distance of 300 cm. Note that, to match the aspect ratio of the monitor in the eye-contingent apparatus, we used only a restricted horizontal span of the projection screen for image display, so that in practice the resolution used was 1440 by 1080 pixels, and the diagonal span was commensurately reduced. Head position was recorded using an OptiTrack optical motion tracking system (Natural Point, Inc.) with six V100:R2 cameras. The participant's head was tracked using a rigid body (providing 6DOF position and orientation) defined by five passive reflection markers, affixed to the front of a baseball cap. The position and orientation of the head were then used to determine the window position on the screen, as if the participant were wearing a headlamp.
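The headlamp mapping can be pictured as intersecting a ray along the head's facing direction with the screen plane. The sketch below shows that geometry under assumed conventions: the screen lies in the plane z = 0, x points right, y points up, and the participant sits at positive z facing the screen. The coordinate frame and the function name `headlamp_point` are assumptions for illustration, not details of the OptiTrack set-up.

```python
import math

def headlamp_point(head_pos, forward):
    """Intersect the head's forward ray with the screen plane z = 0.

    head_pos: (x, y, z) head position in cm, with z > 0 in front of the screen.
    forward:  (dx, dy, dz) facing direction; dz < 0 points towards the screen.
    Returns the (x, y) point on the screen in cm, or None if there is no hit.
    """
    hx, hy, hz = head_pos
    dx, dy, dz = forward
    if dz >= 0:
        return None       # facing away from, or parallel to, the screen
    t = -hz / dz          # ray parameter at which z reaches 0
    return (hx + t * dx, hy + t * dy)

# Head 300 cm from the screen, turned 10 degrees to the right.
yaw = math.radians(10)
point = headlamp_point((0.0, 0.0, 300.0), (math.sin(yaw), 0.0, -math.cos(yaw)))
```

With the head 300 cm from the screen, a 10° turn moves the window centre about 53 cm sideways, so small head rotations produce large window shifts at this viewing distance.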

Analysis
Analyses were based on the angular distribution of gaze movements during scene viewing (example scanpaths are shown in electronic supplementary material, figure S1). Each eye or head movement was classified on the basis of its onscreen angle (i.e. for the vector from the start to the end position of the [...] For eye movements, we examined the distribution of saccade directions. For head movements, which are not ballistic in the way that saccadic eye movements are, we estimated the distribution using a sample-based approach. In particular, for each sample position $p_n = (x_n, y_n)$, we stepped forward through successive samples $(n + 1, n + 2, \ldots, n + k)$ until the magnitude of the vector from $p_n$ to $p_{n+k}$ exceeded 1.0 degrees of visual angle (equivalent to a fairly short saccade; see electronic supplementary material, figure S2).
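The sample-based procedure for head movements can be sketched as follows. For each sample the code steps forward until the displacement exceeds 1.0°, then records the direction of that displacement. The rule used afterwards to label a movement as horizontal (closer to the horizontal than to the vertical axis) is an assumption, because the classification criterion is not recoverable from the text above; the function names are likewise hypothetical.

```python
import math

def movement_directions(samples, min_amplitude=1.0):
    """Direction (degrees) of the first displacement exceeding min_amplitude.

    samples: list of (x, y) positions in degrees of visual angle.
    Samples with no later position beyond min_amplitude contribute nothing.
    """
    angles = []
    for n in range(len(samples)):
        xn, yn = samples[n]
        for k in range(n + 1, len(samples)):
            dx = samples[k][0] - xn
            dy = samples[k][1] - yn
            if math.hypot(dx, dy) > min_amplitude:
                angles.append(math.degrees(math.atan2(dy, dx)))
                break
    return angles

def horizontal_proportion(angles):
    """Share of movements closer to the horizontal axis (assumed rule)."""
    if not angles:
        return None
    horizontal = sum(
        1 for a in angles
        if abs(math.cos(math.radians(a))) > abs(math.sin(math.radians(a)))
    )
    return horizontal / len(angles)

# Hypothetical head trace drifting rightwards at 0.4 degrees per sample.
trace = [(0.0, 0.0), (0.4, 0.0), (0.8, 0.0), (1.2, 0.0), (1.6, 0.0)]
angles = movement_directions(trace)
bias = horizontal_proportion(angles)
```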
Finally, these bias scores were used to determine an axis congruency measure, i.e. the degree to which movements align themselves with as opposed to against the long axis of asymmetrical windows. In particular, we take the average horizontal bias in horizontal windows and the average vertical bias in vertical windows. These two values are combined, and the mid-point (0.5) is subtracted, to yield positive scores when movements are biased to align with the asymmetrical window, and negative scores when movements are biased in the orthogonal direction. Both conditions had significant effects of window, eye contingent: $F_{3,57} = 6.59$, m.s.e. = 0.008, p < 0.001, and head contingent: $F_{3,57} = 89.5$, m.s.e. = 0.008, p < 0.001. These effects were followed up with targeted contrasts, first comparing each of the asymmetrical windowed conditions to the baseline square window condition, then additionally directly comparing horizontal and vertical window conditions.
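The axis congruency computation can be written out in a few lines. This sketch assumes that the two bias values are combined by averaging and that each bias is a proportion between 0 and 1 with 0.5 indicating no preference; neither detail is stated explicitly above. The function name and the input values are hypothetical and are not data from the study.

```python
def axis_congruency(horizontal_bias_hwin, vertical_bias_vwin):
    """Axis congruency from two bias proportions (assumed to be averaged).

    horizontal_bias_hwin: horizontal bias when viewing with a horizontal window.
    vertical_bias_vwin:   vertical bias when viewing with a vertical window.
    Positive: movements tend to follow the long axis of the window.
    Negative: movements tend to run orthogonal to it.
    """
    return (horizontal_bias_hwin + vertical_bias_vwin) / 2 - 0.5

# Hypothetical participants, for illustration only.
follows_long_axis = axis_congruency(0.60, 0.50)
crosses_long_axis = axis_congruency(0.30, 0.20)
```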

Results
For the eye-contingent window, there was no difference in horizontal bias between the horizontal window and the square window, $F_{1,19} < 1$, p = 0.648, but a significantly reduced horizontal bias (i.e. more vertical movements) for the vertical window when compared with the square window, $F_{1,19} = 6.44$, m.s.e. = 0.016, p < 0.05. Most critically, there was a significantly greater horizontal bias in the horizontal window when compared with the vertical window, $F_{1,19} = 5.84$, m.s.e. = 0.025, p < 0.05, replicating the key finding in [17]. When viewing with an eye-contingent horizontal window, the majority of saccades were horizontal. When viewing with a vertical window, a higher proportion were vertical. Thus, rather than moving the window to show more information, participants tended to target locations that were already revealed.
For the head-contingent condition, there was both a reduced horizontal bias for horizontal when compared with square window conditions, $F_{1,19} = 123.0$, m.s.e. = 0.011, p < 0.001, and an increased horizontal bias for vertical when compared with square window conditions, $F_{1,19} = 66.1$, m.s.e. = 0.005, p < 0.001. Finally, directly comparing the asymmetrical windows, we find a greater horizontal bias for vertical when compared with horizontal window conditions, $F_{1,19} = 492.6$, m.s.e. = 0.006, p < 0.001. In other words, head-contingent participants showed the complete opposite effect of asymmetrical window condition to the eye-contingent group. The majority of head movements elicited by a horizontal window were vertical, exposing more new information. With a vertical window, this switched systematically to a horizontal bias.

Axis congruency
A more targeted measure to assess the influence of asymmetrical windows is to compute the average axis congruency of orienting movements, i.e. the degree to which movements align themselves with as opposed to against the long axis of asymmetrical windows (see Analysis in Material and methods). Both effector conditions showed significant bias (figure 4), confirmed with two-tailed single-sample t-tests against 0 (i.e. against the assumption of no bias). For eye-contingent viewing, there was a significant positive bias, $t_{19} = 2.417$, p < 0.05, with more movements along the main axis of asymmetrical windows. For head-contingent viewing, there was instead a significant negative bias, $t_{19} = -22.194$, p < 0.001, with more movements against the main axis of asymmetrical windows.
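For reference, the single-sample t statistic against 0 can be computed as in the sketch below, which uses only the Python standard library. The reported 19 degrees of freedom correspond to 20 scores per effector condition. The function name and the example scores are hypothetical and are not data from the study.

```python
import math
import statistics

def one_sample_t(scores, mu=0.0):
    """t statistic and degrees of freedom for a one-sample test against mu."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical axis congruency scores for three participants.
t_value, df = one_sample_t([0.1, 0.2, 0.3])
```

The sign of t follows the sign of the mean congruency score, so a positive value indicates movements along the long axis of the window and a negative value indicates movements against it.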

Discussion
We sample our visual environment in multiple ways, by moving our bodies, heads and eyes. Despite this, we know rather little about how these different movements are deployed and coordinated to explore the world. This study provides good evidence that, rather than different types of movements being driven by the same underlying guidance processes, overt attentional orienting is guided differently based on the particular effector being used. The different windowed viewing conditions in this work resulted in clear and effector-specific changes in behaviour. When the window was yoked to eye position, overall slightly more horizontal than vertical saccades were made, a bias typical of eye movements and reported previously [32]. Critically, when viewing through asymmetrical windows, this bias was modulated in the direction parallel to the long axis of the window, with vertical windows leading to a reduction of the typical horizontal bias, so that participants more often shifted their gaze up and down within the window. In other words, participants moved their eyes to exploit the currently visible information more often than they made a saccade into an unseen region (cf. [17]). This pattern is the opposite of what one would expect if the eyes were moving to reveal new information [33], which is surprising for some models of eye movement control (e.g. [14]).
When viewing was instead controlled by head position, however, the core pattern of results was entirely reversed. Horizontally oriented windows led to an increase in vertical head movements, and vertically oriented windows resulted in far more horizontal head movements. In these conditions, therefore, participants tended to explore the scene by moving perpendicular to the long axis of the window, resulting in new information becoming visible. This information could then be exploited by making eye movements within the window. Critically, even though the windows restricted visibility in exactly the same way for the head-contingent and eye-contingent conditions, participants oriented their attention differently according to the effector involved. Notably, the magnitude of the orienting bias for asymmetrical windows was considerably larger for the head than for the eyes. This may suggest that eye movements are less readily influenced by top-down or strategic factors than are slower and more deliberate head movements, so that the underlying horizontal bias in eye movements (independent of asymmetrical viewing restrictions) remains prominent across conditions. This default horizontal bias may reflect a combination of asymmetry in the visual field, the side-by-side placement of the eyes, and priors on information localized along the horizon. The vertical window, then, attenuates this bias but does not remove it. Given that inhibition of reflexive eye movements is notoriously difficult (e.g. antisaccade tasks; [34]), it is perhaps unsurprising that typical biases in eye movements are more resistant to experimental manipulation than are head movements.
The interaction between window shape and effector reported here provides some key insights into attentional selection in systems, such as the human visual system, which contain nested effectors for orienting. In particular, our results suggest that while eye movements may be biased to target regions of the visual field where there is some pre-existing information, head movements are instead biased to target unknown regions. This distinction provides a fundamental complementarity in nested orienting systems, so that coarse orientation provides entirely novel information, while fine orientation more selectively scrutinizes existing information. When considering how people scan scenes, several authors have made the distinction between large global shifts which change the focus to a new and unknown region and smaller local movements which focus on detailed inspection [33,35]. In order to see what we need to see, a balance must be struck between exploration and exploitation. Our results suggest that there is a division of labour in this regard between the eyes and the head, and presumably the rest of the body.
Why do the eyes and head respond in this way to the presence of asymmetric windows? It is important to note that in unrestricted viewing, humans have a wider field of view and greater resolution horizontally than vertically [1]. In effect, the conditions of naturalistic viewing are themselves asymmetric along the horizontal axis. Commensurately, we note here that the open (no window) and horizontal windows produced the most similar behaviours, in both eye- and head-contingent conditions. In the case of eye movements, a horizontal bias was observed (as noted elsewhere [32]), and in the case of head movements, a vertical bias was observed. Thus, under unrestricted viewing conditions, we find eye movements preferentially exploiting the greater horizontal span of the visual field, while head movements serve the role of shifting this field vertically. When the horizontal range is restricted (through square or vertical windows), these orienting systems accommodate. Under eye-contingent viewing, the typical horizontal bias is attenuated, with more gaze shifts drawn into alignment with the window, following the visual information presented there. Under head-contingent viewing, the eye is no longer able to drive horizontal exploration, and so the head assumes this role.
A similar scenario has been explored in the context of reading [36]. Head and eye movements were recorded while participants read a passage on either a vertically or horizontally extended page. In this case, the eyes were found to exploit information both horizontally and vertically, but with some attenuation along the longer axis. Head movements, on the other hand, supplemented this eye-based exploration with increased vertical scanning for vertically arranged material, and increased horizontal scanning for horizontally arranged material. These results may at first seem to contradict ours, but it is important to emphasize the difference between asymmetry in the environment and asymmetry in the ability to inspect the environment. In the former case, there is by definition no information to be had outside of the prescribed area: if the information is itself extended along one axis, there is necessarily nothing to be learned from exploration orthogonal to that axis. By contrast, if information is distributed uniformly, but visibility is restricted to one axis, then shifts orthogonal to that axis may prove useful. An additional key distinction concerns the goals which drive the visual system. In the reading study [36], the sequence of gaze positions was determined by the task, and with the exception of line changes on the horizontally oriented page, successive words were always within the visual field. Here, then, head movements were recruited solely to reduce the oculomotor strain from eccentric eye positions, or to support shifts too large for the eyes alone. In this study, the task is explicitly exploratory, and although salient points in a scene may preferentially capture attention, there is nothing like the regularity of sequential fixations seen in reading, nor is there strong systematicity in salient points across different images.
In this exploratory setting, we find that the head is far more proactive, guided in such a way as to uncover new information to a greater extent than movements of the eyes under the same conditions. This complementarity between effectors probably optimizes the utility of foveal attention. In eye movements alone, there have been suggestions that fixation locations are selected to maximize uncertainty reduction, targeting regions of the visual field likely to provide the most new information [13-15]. Importantly, however, these selections are all made in the presence of peripheral vision in those regions. In other words, information-maximizing eye movements select on the basis of existing coarse visual information. Shifting attention to a region entirely outside of the immediate visual field, however, necessitates foveating an arbitrary and completely unknown region, which may often have little informational value (e.g. a patch of clear sky or a blank wall). To engage saccadic targeting processes for such a gaze shift is probably a waste of resources. Head movements may therefore serve an important bridging role, shifting the visual field at a coarse level, then allowing eye movements to select the important regions of this new vista.
It is important to note that this account, at face value, seems to contradict reliably observed patterns of coordination between eye and head movements during large gaze shifts, with the eye characteristically leading or else synchronous with the head in movement onset (e.g. [4,25,26]). Based on these results, our account seems untenable: it is difficult to argue that the head's role is to provide a novel field for the eye if head movements characteristically lag behind eye movements. Critically, however, this characteristic pattern is observed only for gaze shifts to unpredictable targets within the visual field. By contrast, head movements do lead the eyes when targets are predictable (e.g. [25,37]) and when shifts are voluntary (i.e. without a target [38]), suggesting that the temporal coordination between eye and head movements is fundamentally different when gaze shifts are reactive (e.g. to a target stimulus) than when they are intentional. In exploratory orienting, it is the observer driving the selection of gaze locations, not a discrete target stimulus. Consequently, consistent with our account, we would expect the intentional mode of coordination, with the head leading the eyes. Combining these two orienting systems, then, efficient and thorough visual exploration can be achieved. Eye movements are tasked with analysing in detail only the most important regions, conserving foveal processing, driven by informed selection within the existing visual field. Head and body movements, on the other hand, provide novel information by selecting the visual field itself, with an exploratory bias towards the unknown.
Ethics. Informed consent was obtained from all participants, and all experimental procedures and protocols were reviewed and approved by the University of British Columbia Behavioral Research Ethics Board (Protocol No. H10-00527).
Data accessibility. Our data are deposited at the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.33kh1 [39].
Authors' contributions. The experimental design was based on work by T.F. and A.K., modified by G.J.F.S. to incorporate head tracking, per previous work by G.J.F.S. and A.K. Data preparation and statistical analyses were performed by G.J.F.S. Manuscript preparation was shared by all authors.
Competing interests. The authors declare no competing interests.
Funding. This work was supported by the Natural Sciences and Engineering Research Council of Canada, RGPIN 170077-11.