How predation shapes the social interaction rules of shoaling fish

Predation is thought to shape the macroscopic properties of animal groups, making moving groups more cohesive and coordinated. Precisely how predation has shaped individuals' fine-scale social interactions in natural populations, however, is unknown. Using high-resolution tracking data of shoaling fish (Poecilia reticulata) from populations differing in natural predation pressure, we show how predation adapts individuals' social interaction rules. Fish originating from high predation environments formed larger, more cohesive, but not more polarized groups than fish from low predation environments. Using a new approach to detect the discrete points in time when individuals decide to update their movements based on the available social cues, we determine how these collective properties emerge from individuals' microscopic social interactions. We first confirm predictions that predation shapes the attraction–repulsion dynamic of these fish, reducing the critical distance at which neighbours move apart, or come back together. While we find strong evidence that fish align with their near neighbours, we do not find that predation shapes the strength or likelihood of these alignment tendencies. We also find that predation sharpens individuals' acceleration and deceleration responses, implying key perceptual and energetic differences associated with how individuals move in different predation regimes. Our results reveal how predation can shape the social interactions of individuals in groups, ultimately driving differences in groups' collective behaviour.


Supplementary Methods and Analyses
1.1 Trials with eight fish 1

.1.1 Preprocessing of tracks
All analyses were performed using MATLAB (2016a) and we made use of the inbuilt MATLAB functions where noted. Track segments generated from the Didson tracking software were stitched together using a custom built script. These were then smoothed using a Savitzky-Golay filter with span 7, and degree 4 using the intrinsic smooth function. The coordinates of each fish were located in a cartesian coordinate system, with each fish i at time t represented as (x i (t), y i (t)). These coordinates, measured in pixels, were then transformed into coordinates in mm, calibrated using a known length within the arena.

Distance to centroid
On each frame, we calculated the global centroid as the mean coordinates of all fish in the arena; (x(t),ȳ(t)). The mean distance between each fish's coordinate and the global centroid was then calculated and averaged across trials (Fig. 1).

Detection of subgroups
After lifting the holding cylinder, the fish took approximately one minute before they began to explore the arena. This is observed in Figure 1, where the first minute of the trial involves fish emerging from the holding cylinder and hence the mean distance to the centroid is typically low (because not all fish have emerged). Therefore, in the following analyses, we discarded trajectories within the first minute from when the first fish had been detected as emerging.
To identify the fish that were exploring the arena together in a 'subgroup', for each fish in turn, and on each frame, we identified all neighbours that were within 100 mm of the focal fish. We chose 100 mm because 4-5 body lengths is commonly used to determine group membership in fish shoals [1][2][3], and the fish were on average 20.5 mm ± 4.3 SD mm length. If a neighbour was within 100 mm of the focal fish, we defined these two fish as 'connected'. We detected all group members that were connected with one another, and also determined the largest one of these connected components using the networkComponents function by Daniel Larremore (see: http://se.mathworks.com/matlabcentral/fileexchange/42040-find-network-components). This component represented the subgroup that contained the largest number of fish that were exploring the arena together on that frame. If there were two subgroups of equally large size on the same frame, then we randomly chose one of the subgroups. We determined the total number of times the largest subgroup size was a group of 8 fish, versus the times when the largest subgroup size contained less than 8 fish across each trial.

Calculation of subgroup size transitional probabilities
For each trial, we determined the modal size of the largest subgroup exploring the arena within 2 second windows (48 frames). We then asked whether this largest subgroup changed in size between successive windows. We obtained, therefore, the total number of times the largest subgroup (of a certain size) increased in size, the total number of times it decreased in size, the total number of times it remained the same size, and hence the total number of possible transitions.

Calculations of the structure and movements of subgroups
The corners of the arenas disrupted the shoaling behaviour of the guppies, often causing them to cross and reducing their directional alignment. To ensure we only analysed times when the fish were shoaling, for the following measures we removed tracks that were within 150 (∼ 130 mm) pixels of the corners of the arena.
To determine the average width and height of the subgroups that were exploring the arena together, on each frame, we rotated the coordinates of fish belonging to a particular subgroup size by the average direction those subgroup members were facing. In effect, the subgroups were rotated so that the group's mean heading was aligned and facing along the positive X axis. The fish at the very front of the group, therefore, had the largest x coordinate and the fish at the back of the group having the smallest x coordinate. We determined the distance between the fish with the smallest and largest x coordinate as a measure of the subgroup's length. In a similar way, the determined the distance between the fish with the largest and smallest y coordinate, giving the subgroups's width at that frame. For each trial, we calculated the mean length and width for each subgroup size.
To visualise the shape of the subgroups of 8 fish as shown in Figure 3 and Figure S2, we produced heatmaps of the positions of group members relative the subgroup's centroid. To construct these heat-maps, we subtracted the centroid from the subgroup's rotated coordinates. This transformed the coordinates so that the group's centroid was located at (0,0) and was facing along the positive X axis. We then determined the number of coordinates that fell within the range x = [-150:1:150] and y = [-150:1:150]. In effect, this represented a 301 x 301 matrix where, the values in each cell of the matrix corresponded to the number of observations that a fish was observed in that position relative to the shoal's centroid. We then summed these matrices from each trial (separately for male or females, and for fish from high or low predation environments). These summed matrices were then normalised by dividing each cell by the maximum number of observations that occurred within one cell of the resultant matrix. For plotting purposes, the resultant heat-maps were rotated anticlockwise by 90 degrees (so that shoals faced along the positive Y axis), and were smoothed with a gaussian filter (using the intrinsic imgaussfilt function with sigma = 7). Note that all statistics were performed on the raw data of the subgroup lengths and width, and these plots were purely for visualisation purposes.
To calculate the modal nearest neighbour distances between fish in a trial, we first calculated the distance between each fish and its nearest neighbour on every frame. We then determined the modal nearest neighbour distance between fish in a trial by counting the number of times fish were observed within 0 -2 mm, 2 -4 mm, 4 -6 mm, etc from each other. We then determined the bin with the largest number of observations and used the mean of the bin's edges as a measure of the modal nearest neighbour distance between fish.
On each frame, we also calculated each detected subgroup's centre as the mean coordinates of all fish belonging to that subgroup; (x sg (t),ȳ sg (t)). We then estimated the velocity of each subgroup's centre at time t using: where ∆t is 1/24. We then calculated the speed of the subgroup's centroid as: For each trial, we calculated the median speed of each subgroup size within the trial. We also calculated the median speed of all individuals in each subgroup size.
Polarisation was calculated as where u i is the unit vector of fish i, and N are the number of fish in the subgroup. Polarisation scores take values between 0 (no alignment) to 1 (complete alignment). For each trial, we calculated the median polarisation of each subgroup size. We also counted the total number of times subgroups were observed with polarisation scores above 0.85 and the total number of times they were observed. To create the heat plots shown in Figure S5, we determined the number of times a subgroup was observed having speed in the range [0:10:200] mm/s and polarisation scores from [0:0.05:1]. We therefore created a 21 x 21 cell matrix (for each predation and sex combination), where the number in each cell represented the number of times a subgroup had been observed at the speed and polarisation associated with that cell. These maps were normalised by dividing each cell by the sum of all cells. For plotting purposes, these heat-maps were smoothed using a gaussian filter (sigma = 1). As before, however, all statistics were performed on the raw data.

Preprocessing of tracks
Track segments generated from the CTrax were stitched together using a custom built script. Like with the eights, the corners of the arena disrupted the shoaling behaviour of the pairs. Because of this, we did not manually correct tracks that were within 150 (∼ 130 mm) pixels of the corners of the arena, and hence tracks were discarded within these regions. We wanted to ensure that each trial contributed a sufficient amount data in order to accurately quantify the fish's social interactions. Therefore, we only analysed videos where we had at least 2 minutes worth of data for when the pair were shoaling together. Shoaling was defined as when the two fish were less than 100 mm apart (4-5 body lengths) and the fish had moved at least 20 mm within a second. This classification ensured we did not include trials where either the fish had not explored the arena, were not moving, or were not exhibiting shoaling behaviour. Again, we chose 100 mm because this represented 4-5 body lengths of the fish. In practise, the majority of social interactions occurred within this region (Fig. S7A). In total, we removed 30 of the 254 trials (11 from low predation populations and 19 from high predation populations) where data did not reach these criteria.

Acceleration and Deceleration Decisions
We determined the x and y components of each fish's, velocity using the standard forward-difference approximations: where ∆t = 1/24 s was the constant duration between consecutive video frames. A fish's speed at time t was then approximated as: To detect the times when a fish decided to move (increase in speed), we smoothed these speeds using a Savitzky-Golay filter with a span of 12 frames (1/2 second) and degree 3. We then detected the times and heights of the peaks and troughs of these speeds using the findpeaks inbuilt function in MATLAB, with a minimum distance of 8 frames (1/3) second between adjacent troughs or peaks (Fig. 4A). Because fish showed strong directional alignment when they were less than 100 mm apart (Fig. S7A), and because we were interested in how fish updated their position as a function of its neighbour's position when they were shoaling, we only analysed decisions when fish were less than 100 mm apart. Using a custom built script, we paired each trough (magenta points in Fig. 4A) with the following peak (yellow points in Fig. 4A) of the speed profile. Each paired trough and peak was classified as a decision. We removed spurious decisions where the change in speed between the trough and peak of a decision was less than 20 mm/s. We determined this cutoff by plotting a histogram of all decision speeds across all trials, which revealed a strong peak below 20 mm/s (Fig. S7B), revealing the spurious decisions (noise). An example of one of these spurious decisions can be seen in Figure 4A at ∼ 9.5 seconds. We calculated the acceleration within each of these decisions as the change in speed between the trough and the peak of a decision, divided by the time lag between the trough and the peak of the decision. To calculate the deceleration after a decision, we calculated the change in speed between the decision's peak and the next decision's trough, divided by the time between the decision's peak and the next decision's trough. Because the fish did not always make a second decision immediately after the first decision, we only included decelerations when the time between the decision peak and next decision's trough was less that 15 frames (0.62 seconds). This ensured we captured the true deceleration of the fish, without including long spurious periods of time between non-sequential decisions.
We calculated the distance a fish moved during each of its decisions (Fig. S11A). To do this, we determined the fish's position at the start of its decision (when the fish had its minimum speed) and the fish's position at the start of its next decision. We then calculated the distance between these two points. We only used consecutive decisions if there was less than 50 frames (∼ 2 seconds between these two decisions. This was determined by plotting the probability distribution of the time between decisions (Fig. S11B).
We identified the distance between the pair when a fish made a decision, and determined the acceleration of the fish as a function of this distance (Fig. S10). This indicated that beyond 200 mm, the fish's acceleration were relatively consistent, albeit with a slight decay with distance. We tested whether fish from high or low predation differed in their acceleration when fish were more separated, we calculated the mean acceleration of the fish when fish were separated by more than 200 mm. Because the fish were generally shoaling together, there was far less data in this region than when fish were less than 200 mm apart. We also had to remove clear two outliers from these data for statistical analysis.
To generate the heat-maps shown in Figure 4D-G and Figure S9C-F, for each decision, we determined the relative location of the neighbour with respect to the focal fish that made the decision. Negative and positive x values of the neighbour indicated that the neighbour was to the left or to the right of the focal fish, respectively. Negative and positive y values of the neighbour indicated that the neighbour was behind or in front of the focal fish, respectively. We then discretised these locations in the range [-100:5:100] mm in both the x axis and the y axis. We then determined the mean acceleration and deceleration of fish when their neighbour was in each of these discretised locations across all decisions, but separately for males or females, and for fish from high or low predation populations. For plotting purposes, these heat-maps were smoothed with a Gaussian filter, with sigma = 4. We also determined the total number of decisions that were made when neighbours were in these locations, which was used to determine the probability contour regions using the probcontour function. Note that the statistics were performed on the raw measures, and these plots were purely for visualisation purposes.
At each decision point, we identified the fish that was in front or behind the other fish relative to the pair's direction of travel. To remove any ambiguity about which fish was in front or behind the other, we only assessed this positioning when the pair's polarisation was above 0.5. We then determined how the distance between the pair varied in the two seconds following a decision by either the lead fish or the following fish (see Fig. 4A). We also determined whether the pair switched positions during these decisions, by assessing whether the fish that was at the front of the pair remained at the front of the pair 15 frames (0.62 seconds) after either fish had made a decision. If the two fish had changed position, we counted this as a switch. The number of times the fish did not switch position during these decisions was also counted.
To determine a decision rate for each fish, we counted the number of decisions a fish made, and divided this by the number of seconds the fish had been observed less than 100 mm from its partner. On average, males updated their position at a rate of 0.84 movements s -1 (± 0.02 SE) whereas the larger females updated their position at a rate of 0.65 movements s -1 (± 0.02 SE). Pairs of fish did not update their position at the same time. For each fish, we determined the time delay between when one fish moved, and then the next fish moved. To ensure the decision made by the first fish could have been influencing the decision made by the second fish, we set a 'cut-off' time for when the second fish had to have made a decision after the first fish's decision, if these decisions were to be counted as paired. Because males made on average 0.84 decisions per second, this cut-off time was set to 28.6 frames (1.19 s), and because females made on average 0.65 decisions per second, this cut-off was set to 37.1 frames (1.54 s). Males updated their position 0.47 (±0.003 SD) seconds after their neighbour had moved, whereas females updated their position 0.54 (±0.004 SD) seconds after their neighbour had moved.

Turning Decisions
Each fish's heading was calculated from the ellipse fitted to each fish's body orientation during tracking [4] and took values between 0 and 2π. These headings were made continuous by unwrapping heading changes around 2π. For example, we converted a heading change from 350 to 10 degrees to 350 to 370 degrees. This removed any discontinuity around 0 and 2π allowing us to compare changes in angles around this junction. This also assumes that fish do not turn more than 180 degrees in one frame, and because the fish are not exhibiting escape responses, our temporal resolution (24 fps) should have removed this possibility. The fish typically travelled at the same heading, interspersed with sharp changes in heading (Fig. S7C). To detect these changes in heading, we used an algorithm developed by Weinmann et al. (2015) [5]. The function minL1Potts.m was applied to the unwrapped headings of a fish with γ = 5. This produced a stepwise function highlighting the changes in direction (Fig. S7C). Since the algorithm marks the step-points in the middle of the change in angle, and not when the final heading is reached, we deducted 6 frames (0.25 seconds) from the timestamps of the direction changes to determine the instance when a fish started to change its heading. All code associated with these analyses can be downloaded from http://pottslab.de/. The size and direction of a turn, therefore, was calculated as the change in angle between successive changes in heading. Positive turn angles indicate the fish turned to the left, whilst negative turn angles indicate the fish made a turn to the right. We also assessed the rate at which fish made turns by calculating the number of turns performed, divided by the total time they were less than 100 mm apart. Predation did not affect the turning rates of males or females (Table S5), however, smaller females made more turns than larger females (χ 2 = 4.97, d.f = 1, P = 0.03).
To create the heat-map in Figure 5A, like the acceleration heat-maps, we determined the relative location of the neighbour when a fish made a turning decision. Again, these relative locations were discretised in the range [-100:5:100] mm in both the x and the y axes. We then determined the average turn angle when neighbours were in each of these discretised locations. This is represented by the colour in Figure 5A. For plotting purposes, these heat-maps were smoothed with a Gaussian filter, sigma = 2. These heat-maps revealed a fish's turning responses could be broken down into 3 distinct regions as a function of their neighbour's position. To define these regions, we divided the relative location of the neighbour into three 120 degree sections, indicated by the dotted lines in Figure 5A. In each trial, we counted the total number of turns a fish made when their neighbour was in each of these top two sections. We also counted the number of times a fish turned left or right when their neighbour was in each of these top two sections. In each trial, therefore, we determined the number of times a fish turned towards its neighbour out of the total number of turns it made. We also determined the mean size of the turns that were directed towards the location of the neighbour in these top two sections within each trial.
To create the heat-map in Figure 5B, for each turning decision, we determined the relative direction of the neighbour with respect to the focal fish, and the heading of the neighbour. These directions and heading were discretised in the range [−π : 2π/49 : π]. With regards to the direction of the neighbour, negative values of π indicate a neighbour was to the right of the focal individual making the turn, and positive values of π indicate a neighbour was to the left of the focal individual making the turn. We determined the mean turn size when neighbours were within each of these discretised direction-heading combinations. This is shown as the colour in Figure 5B, smoothed with a Gaussian filter, sigma = 2. Where the signs of the direction and heading of the neighbour are the same, the neighbour is located to the left or right of the focal individual and facing away from it. In these cases, turns towards a neighbour could either be due to attraction responses (turns towards position of the neighbour), alignment responses (turns to align with the direction of the neighbour) or both responses combined. Note that if the fish only performed turns towards their neighbour's position, then this heat-map could be broken down into just two regions along the line X = 0. This is not the case, and the turning responses as a function of the heading and direction of a fish's neighbour, need to be broken down into four regions as indicated by the dotted lines in Figure 5B. In the two regions where the signs of the direction and heading of the neighbour are opposite, we counted the number of times a fish turned to align with their neighbour's heading, out of the total number of turning decisions in these regions. This was determined separately for each trial. We also determine the mean size of the turn that would have resulted in the fish becoming aligned within these two regions in each trial.

General Model Formulation
We modelled all response variables using generalised linear mixed effects models. These were fitted with predation regime (high or low), sex, subgroup size (where applicable) and body size (see Fig. S1) as fixed effects. River (nested within predation and crossed with sex), and trial (where applicable) were fitted as random factors to the data. Sex, Predation and subgroup size (where applicable) were treated as categorical variables in all analyses, whereas body size was treated as a continuous. We assessed the interactions of predation and sex, but not subgroups size (where applicable) or body size, because these are expected to scale with our shoaling measures. The full fixed effect models were simplified by removal of non-significant terms. If interactions were observed between predation and sex, without an overall significant predation effect, we analysed sexes in separate models. In practice, we found significant interactions between predation and sex in the polarisation of groups of eight fish (χ 2 = 4.81, df = 1, P = 0.028), the deceleration of pairs (χ 2 = 6.19 df = 1,P = 0.013), the proportion of alignment responses in pairs (pMCMC = 0.048), the mean response lag time (χ 2 = 6.88 df = 1,P = 0.009), and the decision rates (speed and turning rates) in pairs (speed: χ 2 = 13.9 df = 1,P < 0.001; turning: χ 2 = 4.69 df = 1,P < 0.03). All statistical analyses were performed in R.

Binomially Distributed Variables
Binomially distributed variables (i.e. increase/decrease of shoal size, observations in/out of a high polarised state, number of switches/non switches) were typically over-dispersed, and therefore all binomially distributed data were modelled with Markov-Chain Monte Carlo Generalised Linear Mixed Models (MCMCglmm) that allows for over dispersion. All MCMCglmm models were run for 4040000 iterations after a burnin of 40000 and a thinning interval of 4000 to yielding a total posterior sample of 1000 posterior samples per chain. Flat priors were used for the fixed effects and locally uninformative priors were used for the random effects, both representing little prior knowledge. Convergences of each model were assessed using diagnostic plots (using the scapeMCMC library in R) and using the Gelman-Rubin diagnostic statistic [6] implemented through the coda package in R [7]. We ran duplicates of all models with over-dispersed starting values, and then compared the mixing properties of chains between identical models. All models showed adequate convergence (multivariate potential scale reduction factors all <1.02). All auto-correlations were within the interval -0.1 and 0.1.

Continuous Variables
Continuous response variables were modelled with linear mixed effects models (LME). These models were fitted using maximum likelihood. To avoid violating LME assumptions (i.e. residual distribution and hetero-scedasticity which were checked with diagnostic plots), shoal width, shoal length, modal distance (eights), average distance moved during acceleration/deceleration periods, mean alignment response during turns, and mean decision lag were log transformed. The polarisation of groups was -log(1-P) transformation, where P was the median polarisation of the subgroup. All other variables were untransformed. Contour lines represent regions containing the proportion of total observations where individuals were found relative to the shoal centroid located at (0,0). Shoals from high predation populations were generally more compact than shoals from low predation populations. These patterns were consistent in shoals of 8 female fish (Fig. 3) and across different subgroup sizes (Fig. S3).     Fig. 4A). We filtered decisions that were less than 20 mm/s changes in speed (red line), as these represented spurious decisions likely due to noise (i.e. decision at approximately 9.5 s in Fig. 4A). (C) Example of a guppy's turning profile. Guppies swim in the same direction with intermittent changes in heading. We used a step detection algorithm to detect these changes in direction. Blue line represents the original headings calculated during tracking. The red line represents the step changes detected. The times of these changes are plotted in Fig.  4A as the dashed lines.