Binocular combination of stimulus orientation

When two sine waves that differ slightly in orientation are presented to the two eyes separately, a single cyclopean sine wave is perceived. However, it is unclear how the brain calculates its orientation. Here, we used a signal detection rating method to estimate the perceived orientation when the two eyes were presented with Gabor patches that differed in both orientation and contrast. We found a nearly linear combination of orientation when both targets had the same contrast. However, the binocular percept shifted away from the linear prediction towards the orientation with the higher contrast, depending on both the base contrast and the contrast ratio. We found that stimuli that differ slightly in orientation are combined into a single percept, similarly for monocular and binocular presentation, with a bias that depends on the interocular contrast ratio. Our results are well fitted by gain-control models, and are consistent with a previous study that favoured the DSKL model that successfully predicts binocular phase and contrast combination and binocular contrast discrimination. In this model, the departures from linearity may be explained on the basis of mutual suppression and mutual enhancement, both of which are stronger under dichoptic than monocular conditions.


Introduction
The human visual system combines slightly different visual inputs from the two eyes into a single coherent cyclopean percept in which the two inputs are 'fused'. Binocular combination has been studied for almost two centuries [1], using both neurophysiological and behavioural approaches involving a wide variety of stimuli and tasks . The simplest theory of binocular combination is linear summation, in which the combined fused image is the linear sum of the images presented 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

Apparatus
The experiments were controlled by a PC, and the stimuli were displayed as a grey-level modulation on a Philips 107P colour monitor, 100 Hz refresh rate. The mean display luminance was 40 cd m −2 in an otherwise dark environment. Screen resolution was 1024 × 768 pixels; gamma correction was applied. The stimuli were viewed from a distance of 150 cm.

Stimuli
Stimuli were localized grey-level gratings (Gabor patches, GPs) with a spatial frequency of three cycles per degree (wavelength, λ), equal distribution (STD, σ , 0.23 degrees, allowing minimum two cycles in the GP). Stimuli were presented foveally using stereo goggles (crystal eyes 3, StereoGraphycs). Cross-talk of our goggles was minimal (2.2% at the highest contrast level, determined by measuring the luminance of a patch shown to each eye through the other eye's shutter using a photometer), and was invisible to observers at the brief exposure duration.   (table 1). Table 1. A sample orientation-ratios matrix. The contrasts (peak-to-peak Michelson contrast, in %) used for the eight possible contrast ratios when a representative orientation combination of 80 and 100 degree GPs were presented to the left (L) and right (R) eyes, respectively. Base contrast = 20%. Note that contrast ratio = 1 is presented twice. Orientations with a difference of 10, 15 and 20 degrees were combined, resulting in a total of eight combinations: 80-95, 80-100, 85-95, 85-100, 95-80, 95-85, 100-80 and 100-85 (figure 1a). We tested this limited set of orientation differences (up to 20 degrees) in order to avoid conditions where rivalry (either monocular or binocular) occurs. For each orientation combination, one eye (e.g. the right eye) was stimulated with a fixed higher contrast or base contrast (e.g. 20%), whereas the fellow eye was presented with one of the four contrast possibilities: 5%, 10%, 15% or 20% (figure 1b). This was repeated with the same base contrast (20%) in the other eye (e.g. the left eye) and the varying contrast in the fellow eye. Note that in this scheme, contrast ratio = 1 is presented twice, resulting in eight interocular contrast ratios per orientation combination (0.25, 0.5, 0.75, 1, 1, 1.33, 2 and 4; table 1). We tested four base contrasts (10%, 20%, 40% and 60%). For each base contrast, five baseline control conditions were also tested when interocular contrast ratios = 0 or ∞, i.e. only base contrast was presented to one eye, and the other eye just viewed a blank with grey background (80, 85, 90, 95 and 100 degrees; figure 1a). In addition, for each base contrast and contrast ratio, the monocular control conditions were also tested when the two gratings were presented in the same eye (monocular orientation combination). During monocular presentation, the other eye was presented with a blank screen at the mean contrast level.

Visual task
The task was to rate the perceived orientation as one of 5 options, corresponding to 80, 85, 90, 95 and 100 degrees, by pressing a computer keyboard button immediately after making their decision, using the dominant hand. An additional (sixth) option was reserved for the 'cannot decide' response (with 0% for this response in three out of four observers and up to 5% in observer D.V.; data not shown). Each trial was preceded by a fixation circle (approx. 0.3 degrees) at the centre of the display until the observer signalled their readiness using the keyboard space buttons. Observers were instructed to begin a trial  Table 2. A sample stimulus-response matrix. The frequencies of orientation responses of 80, 85, 90, 95 and 100 degrees when 80 and 100 degree GPs were presented to the left (L) and right (R) eyes, respectively. The frequencies were summed over the two directions of rotation and the two eyes. The eight interocular contrast ratios L/R = 0, 0.25, 0.5, 0.75, 1,1, 1.33, 2, 4, or ∞, the base contrast = 20% and stimulus duration = 80 ms. (Observer = O.E.). Note that the frequency of orientation responses is double for contrast ratio = 1 because ratio 1 was run twice.

Orientation tuning
Each of the orientation combinations was tested both monocularly and dichoptically. There were 32 conditions (eight orientation differences × four contrast ratios). In a given run of 192 trials, we intermixed 16 conditions (all eight orientation differences and two contrast ratios-12 trials per condition). In the monocular trials, the fellow eye was presented with a mean isoluminance background, with observers unaware of the origin of stimulation. Each run was repeated four times per eye receiving the base contrast yielding 48 measurements per condition. The same condition was repeated with the base contrast switched to the other eye, yielding another 48 measurements (combining the two eyes resulted in 96 trials per condition). The total number of trials for each contrast base was 3072, yielding 12 288 trials for the four base contrasts altogether. The monocular control conditions were tested in additional 12 288 trials, using the same protocol (the monocular and dichoptic trials were not interleaved). The baseline control conditions were tested in additional 960 trials, using runs of 10 blocks of 12 trials each, repeated eight times, yielding 96 measurements per condition. Overall, a total of 25 536 trials were performed by each subject, which took approximately 40-50 h per subject to complete the experiment. The first additional run of the baseline control served as 'preview' training and was not included in the analysis. Data were obtained for each observer and then merged for the group analysis.

Data analysis
We used a rating scale signal detection analysis [43] to obtain estimates of perceived orientation for each observer and condition, using Matlab. Briefly, a stimulus-response matrix was obtained for each base contrast and orientation combination. Table 2 shows an example matrix when base contrast = 20% and the GPs with 80 and 100 degree orientation were present to the left and right eyes, respectively. To eliminate possible biases (e.g. response bias and binocular bias), for each observer, the response frequencies were pooled over the two directions of rotation and the two eyes. The signal detection theory (SDT) model was used to fit the final stimulus-response matrix. The model contains nine Gaussian distributions, differing in their mean values but with the same standard deviation, corresponding to the binocular-combined orientations with nine interocular contrast ratios. The model has a total of 14 parameters, four criteria for five response options, nine mean values of the Gaussian distributions to give the perceived orientations and their standard deviations. The mean values of the first and the last Gaussian distributions were fixed to be 80 and 100 degrees, respectively, corresponding to the control conditions when interocular contrast ratio = 0 and ∞.  Table 3. SDT model parameters. The reduced χ 2 = 1.22; the number of degrees of freedom = 24; µ 1 = 80 and µ 9 = 100 were fixed. The standard errors were estimated from the standard deviation of the residuals and the Jacobean matrix at the point of best fit in parameter space.  Table 3 shows the parameters of the SDT model that best fit the sample stimulus-response matrix shown in table 2. The mean values of the first and the last Gaussian distributions were fixed to be 80 and 100 degrees, respectively (i.e. µ 1 = 80 and µ 9 = 100, the perceived orientations under control conditions when the contrast ratio = 0 and ∞). The perceived orientation at a given test contrast ratio is given by the mean value of its Gaussian distribution. Using this method, we estimated the perceived orientations for all conditions for each observer, and averaged the results across the four observers.

Contrast detection threshold
Binocular and monocular presentations with a GP (90 degrees orientation) were used to measure contrast detection thresholds per observer. Monocular and dichoptic trials were intermixed, whereas in the monocular trials, the fellow eye was presented with a mean luminance background, with observers unaware of the origin of stimulation. A one-up three-down staircase procedure was used to estimate thresholds. Data were obtained for each observer and then merged for the group. The averaged thresholds were 5.0 ± 0.52% for monocular contrast detection, and 3.2 ± 1.80% for binocular contrast detection-a binocular summation ratio of approximately 1.56-well within the range reported in previous studies [25].

Modelling
When the two eyes were presented with two GPs with slightly different orientations, a single fused GP was perceived. In order to better understand the mechanisms underlying binocular combination of orientation, we tested several models.
The simplest model is linear vector summation. If a GP of one orientation is represented by a vector GP(m, θ ), with the vector length representing the GP's contrast m, and the vector angle representing the orientation angle θ, the linear summation model assumes that the fused GP is the linear vector summation of the two monocular GP vectors. Note that this vector summation is not the real mathematical summation of two GPs with different orientations. Let GP(m 1 , θ 1 ) and GP(m 2 , θ 2 ) be two GPs presented to the two eyes, respectively. The fused GP vector is given by GP(m,θ) = GP(m 1 , θ 1 ) + GP(m 2 , θ 2 ). The perceived orientation is given bŷ Ding & Sperling [6] proposed an interocular gain-control model to explain binocular phase combination. Later, the model was modified by adding interocular gain-enhancement to explain both binocular phase and contrast combination (DSKL model- [7]). Ding et al. [7] compared five nested models to predict both phase and contrast binocular combination. The first nested model was a contrastweighted summation model, a simplified Ding-Sperling model with the gain-control threshold = 0, which can explain binocular phase combination [6,7], but failed to predict both phase and contrast in a binocularly combined sine wave [7]. The second and third nested models were the Ding-Sperling models with symmetric (the second) or asymmetric (the third) double-layer interocular gain-controls. The fourth nested model added interocular gain-enhancement to the third model, and the full model (the fifth nested model) added interocular gain-control of gain enhancement to the fourth model.
In this study, we set out to test these five nested models for orientation combination. However, it is very difficult to obtain reliable data for perceived orientation when the stimulus contrast is near threshold. Thus, in the model fitting, the gain-control thresholds were not significant and we therefore fixed the gain-control threshold and the threshold for gain-control of gain-control to be 0 for all five nested models. This makes the first three nested models identical. However, mutual enhancement had to be assumed to account for the decreasing mutual suppression when the base contrast increases. Therefore, we tested only three nested models in this study. GP(m 2 , θ 2 ), (2.2) and the perceived orientation is given bŷ This is the first model of the five nested models in our previous study [7]. Similar to the prediction of perceived phase [7], the power summation model [5] and the two-stage model [12] give the same prediction of the perceived orientation as given by equation (2.3).
Model 2: contrast-weighted summation plus contrast gain-enhancement. This is the fourth nested model in our previous study [7] with gain-control threshold = 0. The fused GP vector is given by The formula for the perceived orientation is given in electronic supplementary material.
Model 3: DSKL model. The DSKL model consists of three layers for each eye before the binocular linear summation site: (i) a selective signal layer that receives both gain-control and gain-enhancement from the other eye and outputs the signal to the binocular summation site; (ii) a non-selective gain-control layer that first extracts and sums image contrast energy (ε) across frequency channels and orientations (total contrast energy, TCE) and then exerts gain-control to the other eye's three layers separately with different gain-control efficiencies; (iii) a gain-enhancement layer that extracts image contrast energy (ε * ) (TCE*) and exerts gain-enhancement only to the other eye's signal layer. For normal vision without eye bias, the full DSKL model has five parameters: gain-control threshold g c , gain-enhancement threshold g e , the threshold for gain-control of gain-control g cc , the threshold for gain-control of gain-enhancement g ce , and the gamma exponent γ for calculation of image contrast energy. When g c = 0 and g cc = 0, the fused GP vector is given by The formula for the perceived orientation is given in the electronic supplementary material.

Results
In order to explore how the orientation information is combined to form a unified fused percept, we varied both the interocular (dichoptic) and monocular orientation difference and the interocular and monocular contrast ratio. Varying the interocular contrast ratio has provided strong constraints and important insights into both phase and contrast combination [6,7,14]. For each dichoptic orientation combination, one eye was stimulated with a fixed 'base' contrast (10%, 20%, 40% or 60%), whereas the fellow eye was presented with one of the four contrast levels, resulting in eight interocular contrast ratios varying from 0.25 to 4 for each orientation combination (see Material and methods). For example, when one eye was stimulated with a 20% base contrast, the fellow eye was presented with 5% (which is near the monocular threshold of contrast detection, see Methods), 10%, 15% or 20% contrast. We also used identical conditions to compare between binocular and monocular combination of orientation.
At each base contrast and contrast ratio, the raw rating data were analysed using SDT to calculate the perceived orientation when a grating with one orientation was presented to the left eye and a grating with a different orientation was presented to the right eye (binocular orientation combination), or when the two gratings were presented in the same eye (monocular orientation combination). In order to remove any possible bias, for each observer, the results were averaged across eyes and directions of rotation. These results were averaged across the four observers and were fitted with several orientation combination models (figures 2-5). We also fitted the individual data (shown in electronic supplementary material). (c) (d) Figure 2. Perceived orientation when the orientation difference of the two input gratings was 20 degrees. The red and blue curves are the best fits of the DSKL model [7] to the data (red squares and blue circles) averaged across four observers, and the black line is the prediction of orientation linear vector summation model. The red and blue dots indicate data for individual observers. Error bars: ± 1 s.e. Figure 2a shows the results for an orientation difference of 20 degrees and a base contrast of 10%. For both binocular (blue) and monocular (red) conditions, perceived orientation shifted from the left eye's orientation (80 degrees) to the right eye's orientation (100 degrees) as the right eye/left eye contrast ratio increased. Open circles are the average data, small dots show the individual data. The red and blue curves are the best fits of an orientation combination model, i.e. the DSKL model [7] (see Material and methods), to the data averaged across four observers, and the black line is the prediction of a linear vector summation model. The data fall close to the linear summation prediction when the interocular contrast ratio is close to 1; however, when the two input orientations had different contrasts (contrast ratio = 1), the data shifts away from the orientation predicted by linear summation, and towards the orientation with the higher contrast. This shift is a consequence of mutual suppression between the two gratings. As base contrast increases (figure 2b-d), the mutual suppression decreases, resulting in a smaller deviation from the linear prediction. This reduced mutual suppression can be explained by mutual enhancement between the two gratings. It is also noteworthy that while monocular and binocular combination followed very similar patterns, both mutual suppression and mutual enhancement were systematically stronger under binocular than under monocular conditions. Both mutual suppression and enhancement were independent of input orientation difference over the limited range tested. Similar results were obtained when the input orientation difference was 15 (figure 3) or 10 degrees (figure 4). The astute reader might note that the two models converge as the orientation difference becomes smaller at all base contrast levels (compare figures 2 and 4). Figure 5 summarizes our main results by comparing the apparent contrast ratio (i.e. the ratio of the monocular model outputs of the two eyes, see §2.8) to the input stimulus contrast ratio. Without any interorientation interaction, the apparent contrast ratio is identical to the contrast ratio (black dashed lines, which show the prediction of linear summation in figure 5). Mutual suppression reduced the apparent contrast ratio when the stimulus contrast ratio was below 1. Note that this reduction in the We note that there are substantial individual differences among observers (see electronic supplementary material and discussion below). The variance across observers is summarized in electronic supplementary material, table S1. We note that, for observers D.V. and O.E., there is a strong effect of mutual suppression and mutual enhancement under binocular conditions, whereas under monocular conditions, mutual suppression is a weak effect, and there is no mutual enhancement. For these two observers, the monocular and dichoptic data are significantly different (p < 0.05) at 10% base contrast; the mutual suppression was much stronger under binocular conditions than under monocular conditions. However, as base contrast increased, mutual enhancement cancelled the mutual suppression under binocular conditions, resulting in similar data to those under monocular conditions at high base contrast (60%). For observers Y.S. and S.T., the monocular and dichoptic data are very similar; under both binocular and monocular conditions, observer Y.S. showed strong suppression and enhancement, whereas observer S.T. showed strong suppression but no enhancement. Table 4 shows model parameters when fitting the DSKL model to the average data of four observers (red and blue curves in figures 2-4). We compared three nested models, in which a previous model is nested within its successor. When the threshold of gain-control of gain-enhancement g ce = ∞, the DSKL model is simplified to be model 2, and when the gain-enhancement threshold g e = ∞, model 2 is further simplified to be model 1. Their fitting statistics (reduced Chi-square (χ 2 /v) and corrected Akaike information criterion (AIC-see electronic supplementary material for details) are shown in tables 5 and 6. The comparison of two neighbouring models was made through an F-test (see electronic supplementary materials for details) with the F-value given in the row of the second model (F-test and its p-value are only shown for the average data). When fitting averaged data, under both binocular and monocular conditions, the F-test shows that the goodness of fit was significantly improved by adding interocular gain-enhancement to the model (model 2 versus model 1). Although the F-test also shows a further improvement of fit performance by adding gain-control of gain-enhancement (DSKL model  Table 4. DSKL model parameters. The standard errors were estimated from the standard deviation of the residuals and the Jacobean matrix at the point of best fit in parameter space.  Table 5. Model fitting statistics (binocular). N p , the number of parameters; v, the number of the degrees of freedom; AIC c , the Akaike information criterion with a correction for finite sample sizes.  Table 6. Model fitting statistics (monocular). N p , the number of parameters; v, the number of the degrees of freedom; AIC c , the Akaike information criterion with a correction for finite sample sizes. versus model 2), we cannot exclude the possibility of overfitting the model to the noise in the data, because the reduced Chi-square (χ 2 /v) was less than one [44]. In this study, we did not have enough data to test gain-control of gain-enhancement in the DSKL model, although it significantly improved the model fit in our previous studies [7,45]. For three observers, adding gain-enhancement (observers D.V. and Y.S.) and further adding gain-control of gain-enhancement (observer O.E.) also improved the fits to the individual data under binocular conditions (table 5). However, for the fourth observer (S.T.), whose data showed no interocular gain-enhancement (the interocular suppression remained constant when base contrast increased-see electronic supplementary material), neither adding gain-enhancement nor adding gain-control of gain-enhancement (model 2 or DSKL model) improved the fit. Under monocular conditions (table 6), three observers showed no interocular gain-enhancement, and no improvement was observed when fitting their data with model 2 or the DSKL model.

Discussion
Earlier behavioural studies showed that orientation discrimination thresholds are lower under binocular than monocular conditions when the stimuli are brief and have low contrast, but the binocular advantage is essentially eliminated for contrasts above about 15% [17]. However, our aim was not to compare monocular and binocular sensitivity, but rather to directly investigate the mechanism behind the combined fused percept that substitutes the two separate monocular percepts. Our results show that stimuli that differ in orientation by up to 20 degrees around the vertical meridian, within the 'effective limit on single vision' [35] are combined to form a single percept, regardless of whether the two orientations are presented monocularly or binocularly, with a bias that depends on the interocular contrast ratio, demonstrating that the brain can produce a coherent percept from a combination of two different orientations that is similar to a pure single-orientation stimulation.
There is no previous evidence on how different orientations presented to the two eyes are combined when the task is to explicitly address the perceived orientation and not to use it as a cue for depth (we note that in our experiments using small GPs and brief exposures, slant was not experienced by any of the participants (or by the authors)). Because both physiology and behaviour showed orientation tuning of about ±10 degrees around the preferred orientation, we assume that dichoptic stimuli, with a total orientation difference of up to 20 degrees used here, probably activate the same processing channel with the same orientation preference. Importantly, a substantial proportion of cells in both cat striate cortex [32] and macaque V1 [33] differ significantly in the preferred orientations of the two eyes. In the same vein, it has been suggested that if two gratings that activate the same processing channel with the same orientation preference are presented monocularly, they provide a stable percept of a single orientation [40].
Based on recordings from neurons in the cat striate cortex, Hubel and Wiesel showed that the inputs from the two eyes exhibited linear spatial summation [46,47]. Smith et al. [48] showed in monkey simple cells that the effectiveness of a stimulus in producing a response reflects interocular differences in the relative balance of inputs to a given cell; however, the eye of origin has no specific consequence. Simple cells showed linear spatial summation between the left-and right-eye receptive fields. The same mechanism of linear summation has been suggested to account for orientation selectivity, before a cell's nonlinear mechanisms. This suggestion appears to be applicable to our findings when the interocular contrast ratio is close to 1. However, as shown in figures 2-4, the results depart from simple linear (vector) summation when the two eyes have different contrasts. This nonlinear contrast influence on interocular effects is reminiscent of the nonlinear contrast influence on context effects of collinearly oriented flanking elements falling outside target's receptive field [49,50]. Based on our DSKL model, the departures from linearity, both under monocular and dichoptic viewing, may be explained on the basis of mutual suppression and mutual enhancement, both of which are stronger under dichoptic than monocular conditions. So far as we know, no physiological studies have directly addressed the mutual interaction of two gratings with similar orientations. However, based on studies using gratings with orthogonal orientations, previous physiological studies [51,52] demonstrated that interocular suppression is substantially different from monocular cross-orientation suppression. These studies suggest that interocular suppression is mediated by inhibitory circuitry within the visual cortex, whereas monocular cross-orientation suppression is mediated by subcortical mechanisms.
We fitted several models to the data. The linear vector summation model (black lines in figures 2-4) is clearly a poor fit when the interocular contrast ratio is not equal to one. The data are well fitted by a gain-control model (model 2) and by the full DSKL model [7]-red and blue curves in figures 2-4which has been used to successfully predict binocular combination of perceived phase and contrast [7] and also contrast discrimination [45,53]. Here, we show that it can also predict both the binocular and monocular combination of nearby orientations. Binocular orientation and phase combination differ from each other in several important ways. In binocular phase combination, the interocular suppression becomes stronger as the base contrast increases, and at low contrast and short stimulus duration, the summation is almost linear, without interocular suppression [27]. However, in binocular orientation combination, the interocular suppression becomes weaker as the base contrast increases. We assume interocular enhancement to account for this decreasing interocular suppression at higher contrasts. This interocular enhancement cannot be observed directly in binocular phase combination because of the strong mutual interocular suppression in observers with normal vision. However, it can be observed in observers with amblyopia, because of the weak or absent interocular suppression from the non-dominant eye to the dominant eye [45].
It is not altogether surprising that the binocular orientation and phase combinations have different behaviours given the differences in stimuli and tasks. For example, stimuli differing in phase can be combined into one mathematically by simply summing the two stimuli. However, stimuli differing in orientation cannot be combined into one mathematically by simply summing the two. Moreover, in the phase task, observers match the perceived phase of the dichoptically combined gratings to a binocularly viewed reference line, whereas in the orientation task, observers match the perceived orientation to an internal subjective standard.
Our study was limited to briefly presented stimuli with orientation differences of up to 20 degrees. We have not explored larger orientation differences, where the two orientations cannot be combined into a percept of a single orientation, and are probably processed by separate channels [39]. Similarly, we have not explored long durations, where two Gabor stimuli of slightly different orientations presented binocularly will continue to produce a single fused percept, but temporally interleaved monocular stimuli do not, instead producing an oscillating percept, most likely owing to after-images [42]. For our stimuli, it is plausible that the two inputs are processed by a single channel, because the orientation