Depth perception in disparity-defined objects: finding the balance between averaging and segregation

Deciding what constitutes an object, and what background, is an essential task for the visual system. This presents a conundrum: averaging over the visual scene is required to obtain a precise signal for object segregation, but segregation is required to define the region over which averaging should take place. Depth, obtained via binocular disparity (the differences between two eyes’ views), could help with segregation by enabling identification of object and background via differences in depth. Here, we explore depth perception in disparity-defined objects. We show that a simple object segregation rule, followed by averaging over that segregated area, can account for depth estimation errors. To do this, we compared objects with smoothly varying depth edges to those with sharp depth edges, and found that perceived peak depth was reduced for the former. A computational model used a rule based on object shape to segregate and average over a central portion of the object, and was able to emulate the reduction in perceived depth. We also demonstrated that the segregated area is not predefined but is dependent on the object shape. We discuss how this segregation strategy could be employed by animals seeking to deter binocular predators. This article is part of the themed issue ‘Vision in our three-dimensional world’.

The shape of the test patch is defined by Eq. 1. For the test patch, we predict that the perceived peak disparity of the object will equal the average disparity over a square window centred on the peak disparity. For the first experiment, the smooth function describing disparity at each point in the object is defined by Eq. 1, whose parameters are the peak disparity of the object, the smoothness coefficient, the coordinates of a given point, and the width and height of the patch.
The simplest way to calculate the total disparity enclosed within the averaging window is to integrate the disparity function over the window (Eq. 4). Substituting this back into Eq. 2, we obtain Eq. 5. To translate this into a prediction of the peak depth, all we need is to divide by the area of the window (Eq. 6).
This gives us the predicted peak depth for an arbitrary window size and smoothness coefficient; we only consider square windows, so the window's width and height are equal. We calculated the predicted peak depth for a range of window sizes and all smoothness coefficients used in Experiment 1.
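The averaging step can be illustrated numerically. The sketch below is a minimal example that approximates the mean disparity over a centred square window with a midpoint rule, rather than the closed-form integral used in the text. The names `window_average` and `example_profile` are hypothetical, and `example_profile` is an assumed smooth-edged shape chosen only for illustration; it is not the stimulus function of Eq. 1.

```python
import numpy as np


def window_average(disparity_fn, half_width, n=400):
    """Approximate the mean of disparity_fn over the square
    [-half_width, half_width] x [-half_width, half_width].

    Uses a midpoint rule on an n x n grid of equal cells, so the mean of
    the samples approximates (integral over window) / (window area).
    """
    step = 2.0 * half_width / n
    centres = -half_width + step * (np.arange(n) + 0.5)
    x, y = np.meshgrid(centres, centres)
    return float(np.mean(disparity_fn(x, y)))


def example_profile(x, y, peak=1.0, smoothness=5.0, half_size=1.0):
    """Hypothetical smooth-edged patch (an assumption, NOT the paper's Eq. 1).

    Each axis uses a difference of tanh terms, giving a flat top that rolls
    off more gradually as `smoothness` decreases.
    """
    def edge(t):
        return 0.5 * (np.tanh(smoothness * (t + half_size))
                      - np.tanh(smoothness * (t - half_size)))

    return peak * edge(x) * edge(y)
```

With this assumed profile, a smoother edge lowers the window average relative to the peak, which is the qualitative effect the model is meant to capture.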
For each tested window size, we compared the predicted peak depth at each smoothness coefficient to the results for each participant. The reduced chi-squared score was computed across all smoothness coefficients to give a goodness of fit for each window size. We selected the window size with the minimum reduced chi-squared score as the best prediction for the individual participant. The reduced chi-squared statistic was used because it takes into account the uncertainty in the participant's data at each smoothness coefficient, giving less weight to data with greater uncertainty and therefore producing a better fit.
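The selection procedure described above can be sketched as follows. This is a minimal illustration, assuming the usual definition of the reduced chi-squared statistic (sum of squared, uncertainty-weighted residuals divided by the degrees of freedom) with one fitted parameter, the window size. The function names and the `predict` callable are hypothetical; `predict(w)` stands in for the model's predicted peak depth at every smoothness coefficient for window size `w`.

```python
import numpy as np


def reduced_chi_squared(observed, predicted, sigma, n_fitted=1):
    """Uncertainty-weighted squared residuals divided by degrees of freedom.

    n_fitted is the number of fitted parameters (assumed to be 1 here:
    the window size).
    """
    observed, predicted, sigma = (
        np.asarray(a, dtype=float) for a in (observed, predicted, sigma)
    )
    dof = observed.size - n_fitted
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return float(np.sum(((observed - predicted) / sigma) ** 2) / dof)


def best_window(window_sizes, predict, observed, sigma):
    """Return (window size, score) with the minimum reduced chi-squared."""
    scores = [
        reduced_chi_squared(observed, predict(w), sigma) for w in window_sizes
    ]
    best = int(np.argmin(scores))
    return window_sizes[best], scores[best]
```

Data points with larger `sigma` contribute less to the score, which is the weighting property the text relies on.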
This process was repeated for all participants, until each participant had a predicted window size calculated from the best fit of the model. We then used the R² statistic to quantify how well the model accounted for the variance in each participant's data.
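For reference, the R² score can be computed as in the sketch below. It assumes the common coefficient-of-determination definition (one minus the ratio of residual to total sum of squares); the text does not state which definition was used, and the function name `r_squared` is hypothetical.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total.

    Assumes the observed values are not all identical (SS_total > 0).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Under this definition a perfect model scores 1, a model no better than the mean of the data scores 0, and a model that fits worse than the mean scores below 0.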

Experiment 2
This process can also be repeated for the stimulus in Experiment 2, resulting in Eq. 8. Owing to the symmetrical disparity distribution in both condition 1 and condition 2, and because the stimulus and window were square, this equation holds for both conditions (although the two variables are switched algebraically).
Fitting was done as in Experiment 1, where a range of window sizes was tested. The window size that best fitted all smoothness coefficients simultaneously, according to the reduced chi-squared score, was considered to be the best fit.

Experiment 3
From Experiment 1, the model delivered an averaging window size within 15% of the stimulus plateau size, for all participants. However, this was also the size of the disparate standard object patch.
Experiment 3 was designed to test whether the visual system 'chose' its averaging window on the basis of the stimulus-specific plateau, or using the standard patch size as a template. Thus, we implemented two models, the half-depth averaging model (based on the stimulus plateau size), and the template model (based on the standard stimulus patch size). Note therefore that there are no free parameters in these models. The experiment used a range of plateau sizes so that we could decide which model best fit the human data.
We repeated the logic used above to define the models here. The stimulus is described by Eq. 10 in terms of the plateau size and the size of the border around the plateau. Following the same methodology, and defining a function built from ln cosh and sech terms (Eq. 11), we obtained the corresponding prediction (Eq. 12).
For the half-depth model, where averaging is dependent on the size of the plateau of the smooth stimulus, we used the size of the plateau for each stimulus in the experiment as the window size.
Performance was analysed using the R² test statistic in comparison to the participants' data.

For the template model, averaging was dependent on the size of the standard stimulus patch; we therefore chose a constant window size for all shapes of the smooth stimulus. Again, this left us with no free parameters to fit, so we simply tested performance using the R² test statistic.
Here, we used the window matrix defined in Eq. 13. The Hadamard product of the disparity matrix and this window matrix was then taken. Dividing the sum of this product by the sum of the window matrix (which equals the number of non-zero elements in the Hadamard product), we obtained the average disparity within the circle (Eq. 14).
An analogous model using this method and a square window was also developed; its fits were within one pixel of window size (1.07 min arc) of those of the original integral model, although with a significantly inferior runtime and lower accuracy than the integral method presented above.
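The matrix-based averaging can be sketched as below. This is a minimal illustration, assuming the window matrix is a binary mask (1 inside the circle, 0 outside) defined on the same pixel grid as the disparity matrix. The function name, the `radius` and `centre` parameters, and the choice to include pixels lying exactly on the circle's boundary are all assumptions, not details taken from Eq. 13.

```python
import numpy as np


def circular_window_average(disparity, radius, centre=None):
    """Average disparity inside a circular window via an elementwise mask.

    disparity : 2-D array of disparity values (one per pixel)
    radius    : window radius in pixels (assumed parameterisation)
    centre    : (row, col) of the window centre; defaults to the array centre
    """
    disparity = np.asarray(disparity, dtype=float)
    rows, cols = disparity.shape
    if centre is None:
        centre = ((rows - 1) / 2.0, (cols - 1) / 2.0)
    r, c = np.ogrid[:rows, :cols]
    # Binary mask: 1 where the pixel lies within the circle, 0 elsewhere.
    mask = ((r - centre[0]) ** 2 + (c - centre[1]) ** 2
            <= radius ** 2).astype(float)
    # Hadamard (elementwise) product, summed, divided by the mask's sum.
    return float(np.sum(disparity * mask) / np.sum(mask))
```

Because the mask is binary, its sum is simply the number of pixels inside the window, so the ratio is the mean disparity over those pixels.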

Experiment 1
See the main body of the paper (Figure 3) for the fitting parameters for the square window model in Experiment 1. See Supplementary Figure 1, below, for best-fit conditions for the circular model applied to each participant's data. The figure shows that the model is clearly a poor fit to participant performance. Analysis of R² (see Table 1) confirmed this: an R² of -1 implies that the data are better fitted by a horizontal straight line at their mean than by the model. We therefore rejected this circular window model in favour of the square-based model presented in the main text.