A data-driven method for reconstructing and modelling social interactions in moving animal groups
Abstract
Group-living organisms that collectively migrate range from cells and bacteria to human crowds, and include swarms of insects, schools of fish, and flocks of birds or ungulates. Unveiling the behavioural and cognitive mechanisms by which these groups coordinate their movements is a challenging task. These mechanisms take place at the individual scale and can be described as a combination of interactions between individuals and interactions between these individuals and the physical obstacles in the environment. Thanks to the development of novel tracking techniques that provide large and accurate datasets, the main characteristics of individual and collective behavioural patterns can be quantified with an unprecedented level of precision. However, in a large number of studies, social interactions are usually described by force map methods that only have a limited capacity of explanation and prediction, being rarely suitable for a direct implementation in a concise and explicit mathematical model. Here, we present a general method to extract the interactions between individuals that are involved in the coordination of collective movements in groups of organisms. We then apply this method to characterize social interactions in two species of shoaling fish, the rummy-nose tetra (Hemigrammus rhodostomus) and the zebrafish (Danio rerio), which both present a burst-and-coast motion. From the detailed quantitative description of individual-level interactions, it is thus possible to develop a quantitative model of the emergent dynamics observed at the group level, whose predictions can be checked against experimental results. This method can be applied to a wide range of biological and social systems.
This article is part of the theme issue ‘Multi-scale analysis and modelling of collective migration in biological systems’.
1. Introduction
The identification and characterization of interactions between the constituent elements of a living system are a major challenge for understanding its dynamic and adaptive properties [1,2]. In recent years, this issue is also at the heart of research conducted in the field of collective behaviour in animal societies [3,4]. However, the identification from field data of the interaction rules between individuals in species whose level of cognitive complexity can be quite high remains problematic. Indeed, the way in which individuals interact is strongly influenced and modulated by the physical characteristics of the environment in which the organisms live, such as temperature, humidity, brightness, or even the presence of air currents (see for instance [5,6] in social insects). The situation is quite different in the laboratory, where conditions can be precisely controlled and monitored. Moreover, new tracking techniques make it possible to record the behaviour of individuals alone or in groups for relatively long periods of time [7–11]. Using large sets of tracking data, one can then reconstruct and model the social interactions between two individuals of the same species, and between them and the obstacles present in their environment [12,13].
Explaining collective behaviour in groups of organisms consists in describing the mechanisms by which the behaviours of an individual are influenced by the behaviour of the other group members that are present in its neighbourhood [14,15]. This social influence is often assumed to result from the additive combination of pairwise interactions of the individual with part or all of the members of the group. Determining how these pairwise interactions may be combined, and which neighbours must be taken into account, is a central problem in the study of collective behaviour [16].
The behavioural response of an individual is the set of its successive positions during a given period of time, usually at some discrete time steps. From these data, it is possible to draw the trajectories of all the individuals in a group and calculate their instantaneous velocity and acceleration, their distance and angle of incidence to obstacles (e.g. the wall of an experimental tank), their heading, as well as relative quantities such as the distance between individuals, the angle of their relative position, and group quantities such as cohesion and heading polarization. These measures can reveal individual behavioural patterns such as the average velocity or the frequency of heading changes close to obstacles, and also collective behavioural patterns such as the level of cohesion and polarization of the group. Thus, the analysis of collective behaviour consists in measuring behavioural changes at the individual scale that likely result from social interactions, and associating these measures with the relative state of the individuals involved in these interactions [12]. The relative state of an individual j with respect to a focal individual i is determined by the distance d_{ij} between i and j, its relative velocity v_{ij}, its angular position with respect to i, ψ_{ij} (viewing angle), and its relative heading ϕ_{ij} (figure 1). The behavioural changes are precisely given by the variations of an individual’s position and velocity, or, equivalently, by the position, speed and heading variations.
In fish that have a burst-and-coast swimming mode, the behavioural changes of an individual correspond to significant variations of its heading that occur exactly at the onset of the acceleration phase (i.e. the bursts). These discrete behavioural decisions are called ‘kicks’ [12,17]. Other quantities such as the intensity of the acceleration in the direction perpendicular to the direction of motion, or simply the turning direction (right or left), can be used to detect behavioural changes. Thus, the task is to relate the heading variation of a focal fish δϕ_{i} with its state variables, that is, to find a function δϕ_{i}(d, v, ψ, ϕ).
In this article, we first show that force maps, which are widely used to describe the effects of social interactions on the behaviour of individuals, have important limitations when it comes to describe these interactions. These limitations result mostly from the limited number of variables that force maps can handle, the difficulty of identifying intermediate contributions to behavioural patterns, and the difficulty of distinguishing the effects of state variables from constitutive parameters. We then describe in detail a method to analyse behavioural data obtained from digitized individual trajectories. The method allows us (1) to quantify the social interactions between two individuals and describe how the intensities of these interactions vary as a function of the state variables of the individuals, and (2) to reconstruct the interaction functions and to derive an explicit and concise mathematical model reproducing the observed behaviours. Finally, we apply this method to the analysis of the social interactions in two species of fish that both have a burst-and-coast type of swimming and that are characterized by very different levels of coordination when swimming in groups. The reconstruction of interaction rules allows to understand the origin of the differences in the level of coordination, and to predict in which experimental conditions other behavioural differences can arise.
2. Use and limitations of force maps to infer social interactions
A first way to infer social interactions between individuals directly from experimental data consists in using the force-map technique [13,18]. This technique has been used, for instance, to estimate from experiments performed with two fish the effective turning and speeding forces experienced by an individual, once the relevant variables on which they may depend have been chosen [11,18,19]. This is a simple way to visualize the strength and direction of behavioural changes. However, force maps have strong limitations that can induce profound misunderstandings.
(a) Visualization of social interactions with force maps
Force maps are planar representations of two-dimensional functions of the form f(x, y) where the variation of the value of the function is represented by a colour gradient. Generating and interpreting force maps is easy and this explains their success for inferring interactions between moving groups of individuals.
A 2D function f(x, y) given by a dataset is a sequence of N triplets (x^{n}, y^{n}, f^{n}), where the index n denotes for example the instant of time t^{n}, n = 1, …, N. To build a force map of this function, the (x, y)-space is discretized in I × J boxes of the form $[{\hat{x}}_{i},{\hat{x}}_{i+1}]\times [{\hat{y}}_{j},{\hat{y}}_{\hspace{0.17em}j+1}]$, where the nodes ${\hat{x}}_{i}$ and ${\hat{y}}_{j}$ are given by
Figures 2 and 3 show the force maps of the heading variation δϕ of an individual fish performing a burst-and-coast type of swimming, as a function of different variables, in the presence of a conspecific in a circular tank. In the case of Hemigrammus rhodostomus (figure 2), we used a tank of radius R = 0.25 m, about 8.3 times the body length (BL) of the fish, and in the case of Danio rerio (figure 3), a tank of radius R_{Z} = 0.29 m, about 6.4 BL. Both figures 2a and 3a show the intensity of δϕ as a function of d_{ij}, the distance between fish, and ψ_{ij}, the angle with which fish i perceives fish j, respectively, and figures 2b and 3b show δϕ as a function of d_{ij} and ϕ_{ij}, the heading difference between fish.
These force maps provide some information about the individual behaviour of fish. In H. rhodostomus, figure 2a shows that the focal fish tends to turn towards its neighbour, to the left (resp. right) when the neighbour is on the left (resp. right), except when the neighbour is very close. In that case, the behaviour is quite complex: the fish performs small angular changes of amplitude $ca\hspace{0.17em}\hspace{0.17em}30-{60}^{\circ}$, probably owing to collision avoidance manoeuvres. When the neighbour is further away from the focal fish, it maintains its heading (i.e. the white circular region at ${d}_{ij}\approx 1-2\hspace{0.17em}\text{BL}$). From this force map, one could conclude that a fish is attracted by its neighbour when it is beyond a distance of about 2 BL, and repulsed when its neighbour is too close (d_{ij} < 1 BL). The force map in figure 2b shows that the focal fish turns left (resp. right) when the relative heading of the neighbour is shifted to the left (resp. right). The larger the heading difference, the stronger the turn: the colour intensity increases as |ϕ_{ij}| grows from 0 to 120°. When fish swim in more or less opposite directions, the intensity of the heading change is small. This force map reveals that a fish tends to align with its neighbour. In D. rerio, figure 3a shows that the focal fish turns towards its neighbour when they are close to each other (${d}_{ij}\approx 1-2\hspace{0.17em}\text{BL}$) or when the neighbour is located behind the focal fish (|ψ_{ij}| > 90°, whatever the distance). When the neighbour is at ${d}_{ij}\approx 2-3\hspace{0.17em}\text{BL}$ in front of the focal fish (|ψ_{ij}| < 60°), the focal fish turns away from its neighbour, as well as when its neighbour is very close to it (d_{ij} < 0.5 BL). The reaction to the neighbour’s heading is less intense than in H. rhodostomus, as shown by the wide white regions in figure 3b. When fish are beyond 3 BL from each other, the focal fish turns to adopt the orientation of its neighbour. At short distances, the behaviour is more complex, with changes of smaller size than those observed in H. rhodostomus.
The visualization of the data by means of force maps therefore suggests the presence of two distinct types of interaction: an attraction interaction, which leads a fish to turn towards its neighbour to get closer to it, and an alignment interaction, which leads a fish to turn so as to adopt the same heading as its neighbour. However, nothing can be deduced from these colour maps about what happens when the two contributions to heading variation have different signs. Thus, would a fish turn right or left when its neighbour is on its left side? Attraction alone would induce the focal fish to turn left. However, if the relative heading of the neighbour is turned to the right, alignment alone would induce the focal fish to turn right. This difficulty comes from the fact that a function (δϕ) that depends on three variables (d_{ij}, ψ_{ij} and ϕ_{ij}) cannot be represented in 3D. To overcome this limitation, some authors use a kind of force map where the relative position of a fish with respect to a focal fish is decomposed into the left–right (LR) and front–back (FB) distances [13]. In figure 2a, (d_{ij}, ψ_{ij}) are the polar coordinates of fish j in the system of reference centred on fish i pointing north. This is a continuous system of reference in which all the relative positions of fish can be represented. Instead, the LR and FB distances are projections of the relative position of a neighbour with respect to the focal fish on the d_{ij}-axis, where all the points of figure 2a that are in the left semicircle of radius 1 BL are averaged in a single point where ${d}_{ij}^{\mathrm{LR}}=-1\hspace{0.17em}\text{BL}$ (where ${d}_{ij}^{\mathrm{LR}}$ is the LR distance of fish j with respect to fish i), because these points are at 1 BL to the left of the focal fish. Then, the third variable ϕ_{ij} is used to expand this averaged point on a vertical line with different values of ϕ_{ij}, in a system of reference with coordinates $({d}_{ij}^{\mathrm{LR}},{\varphi}_{ij})$, giving rise to figure 2c. Similarly, the upper semicircles of figure 2a are averaged on the ${d}_{ij}^{\mathrm{FB}}$-axis in figure 2d.
Force maps in figures 2c,d and 3c,d provide additional information about individual fish behaviour. In H. rhodostomus, figure 2c shows that turning direction is homogeneously distributed in the upper and lower half-planes, meaning that the focal fish turns to adopt the heading of its neighbour almost independently of the LR distance separating them, although the turning intensity is larger when the neighbour is far from the focal fish and perpendicular to it (i.e. the regions of highly intense colour at $|{d}_{ij}^{\mathrm{LR}}|>2\hspace{0.17em}\text{BL}$ and ϕ_{ij} ≈ ±90°). In the white horizontal region, the focal fish maintains its heading when it is aligned with its neighbour (|ϕ_{ij}| < 10°), whatever the horizontal distance between them. Figure 2d exhibits two large regions homogeneous in colour, showing that the focal fish turns almost always to adopt the heading direction of its neighbour, except when this one is far behind it (i.e. the small regions of the opposite colour for ${d}_{ij}^{\mathrm{FB}}<-2\hspace{0.17em}\text{BL}$). In D. rerio, the colour of each vertical half-plane of figure 3c is almost uniform, except for some regions close to the focal fish (d^{LR}_{ij} ≈ ±1 BL) at ϕ_{ij} ≈ ±90°, and some distant regions located at d^{LR}_{ij} ≈ ± 3 BL and ϕ_{ij} ≈ ±135°). This means that the focal fish turns almost always towards its neighbour, and almost independently from its relative heading, except when both fish are close and perpendicular to each other, and when they are very far and almost anti-aligned. In figure 3d, one can see that the fish turns to adopt the same direction as that of its neighbour when the neighbour is close and in front of it (i.e. the large green and orange homogeneous regions in the centre of the figure). But the fish tends to turn towards the opposite direction when its neighbour is far ahead (small regions where ${d}_{ij}^{\mathrm{FB}}>2\hspace{0.17em}\text{BL}$) or behind it and not very close to it (regions where ${d}_{ij}^{\mathrm{FB}}<-1\hspace{0.17em}\text{BL}$). This is a different and more complex behaviour than the one observed in H. rhodostomus, where heading changes depend on ϕ_{ij} but not on ${d}_{ij}^{\mathrm{LR}}$. In D. rerio, we observe the opposite, and the dependence on the FB distance is more complex than in H. rhodostomus. However, neither ${d}_{ij}^{\mathrm{LR}}$ nor ${d}_{ij}^{\mathrm{FB}}$, separately, can determine the distance at which the neighbour is (e.g. a fish j located at 1 BL to the left can be at 0.5 or 2 BL to the front). Therefore, this kind of representation hides the effect of the absolute intensity of the interactions as a function of the distance between fish, and moreover does not allow the contribution of intermediate effects such as attraction and alignment to be disentangled.
(b) Limitations of force maps
The above descriptions show that the use of force maps to characterize interactions between individuals raises several problems, especially when more than two state variables must be taken into account to describe the behaviour of fish. Other issues are not exclusive to force maps. For example, data can be scarce; not only because they are difficult to collect but also because the phenomenon under observation rarely produces data of a given kind. This is what happens for instance when we consider the repulsive interactions: as individuals repel each other, there are very few cases in which the distance between them is small, so that repulsive interactions are very difficult to describe when the distance between fish is small. This particular issue is common to other methods, as well as ours. More importantly, data are rarely homogeneously distributed, so that different regions of the map can result from averaging a very disparate number of data, meaning that similar colour intensities do not have the same relevance. For instance, figure 4 shows that the relative position of the neighbour of fish i is not homogeneously distributed on the (d, ψ)-plane, and is differently distributed in both panels, so that, e.g., information from the frontal region in figure 4a is more relevant to describe the behaviour of H. rhodostomus than the information from the same region in figure 4b is to describe the behaviour of D. rerio.
Thus, the relative simplicity to visualize and interpret data with force maps comes at the cost of important limitations and oversimplifications. Four of the most critical of these limitations are the following:
(i) | the reduced number of state variables involved in individual behaviour that force maps can handle is limited to two, at most three. This limitation leads to the use of projections and average values that can hide crucial features of the phenomenon under study. Moreover, when data are not homogeneously distributed, the same limitation can even produce wrong or inaccurate observations or conclusions. | ||||
(ii) | the difficulty of identifying and disentangling intermediate contributions such as attraction and alignment to behavioural patterns; | ||||
(iii) | the difficulty of finding simple analytical expressions of the interaction functions in order to implement them in a mathematical model; | ||||
(iv) | the difficulty of distinguishing the effects of variables (e.g. the distance between fish d_{ij}) from the effect of parameters (e.g. the BL of fish). |
(i) When a function depends on more than two variables, force maps are mere projections of the function on a 3D surface, where the value of the function has been averaged with respect to one or more variables. However, averaging raises two problems. First, the function can be odd with respect to the variable used to calculate the average, as is the case of the heading change δϕ with respect to the angle of perception ψ_{ij} and the relative heading ϕ_{ij} in H. rhodostomus, where δϕ(d_{ij}, ψ_{ij}, ϕ_{ij}) = −δϕ(d_{ij}, − ψ_{ij}, ϕ_{ij}) = −δϕ(d_{ij}, ψ_{ij}, − ϕ_{ij}) [12]. Then, for instance, huge variations of δϕ with respect to ψ_{ij}, but of different sign, can cancel each other and yield a small average value, as if δϕ was almost independent of ψ_{ij}. In other words, a crucial feature of the behaviour can be hidden by the averaging process. Second, averaging by simply adding the values and dividing by the number of values implicitly assumes that the probability of occurrence is the same for all the possible states.^{1} This would mean for instance that the probability of a fish j being in the state d_{ij} = 1 BL, ψ_{ij} = 90° and ϕ_{ij} = 10° with respect to a focal fish i is the same as the probability of being at (d_{ij}, ψ_{ij}, ϕ_{ij}) = (2, 0, 180), which is clearly not the case, at least in H. rhodostomus [12].
Thus, even if averaging over some state variables to obtain a 2D force map can provide qualitative information, it can also hide crucial effects and can even produce a completely misleading result owing to the strong correlations between the states variables along the actual trajectories. For instance, averaging the interaction between two fish (and neglecting the effect of the wall) over their distance d, to only keep the dependence on the two angular variables ψ (viewing angle) and ϕ (heading difference) in a 2D representation, involves the unknown correlation function C(d, ψ, ϕ) between d, ψ and ϕ along the experimental trajectories. Note that even if the true interactions were in fact separable into a product of one-variable interaction functions, δϕ = f(d) g(ψ) h(ϕ), the 2D projection would then be $\delta \varphi (\psi ,\varphi )=g(\psi )\hspace{0.17em}h(\varphi )\times {\int}_{0}^{\mathrm{\infty}}C(r,\psi ,\varphi )\hspace{0.17em}f(r)\phantom{\rule{1pt}{0ex}}\phantom{\rule{1pt}{0ex}}\phantom{\rule{1pt}{0ex}}\mathrm{d}r$. The integral term (averaging over the distance between the two fish) leads to an unknown and non-trivial function of ψ, and ϕ, which is in general not separable unless the correlation function is itself separable. And it is not: if two fish are close, they are likely to be aligned (ϕ ≈ 0), whereas if they are far apart, their headings are much less correlated (ϕ nearly uniformly distributed). This simple example demonstrates that force map projections not only do not recover the true interaction (g(ψ) h(ϕ) in this case), but can artificially produce a non-separable interaction, even if the actual interaction takes a product form.
As a consequence, the small number of dimensions that can be represented by force maps constitutes a crucial limitation for an accurate description of behaviour, especially if more variables are taken into account, such as, e.g. the relative velocity v_{ij}, or the interaction of fish with an obstacle (such as the wall of a tank), which would require two more variables r_{w}^{i} and θ_{w}^{i}, the distance and angle to the wall, respectively. Even for the depicted intermediate values, the precise contributions of the different variables to the heading change are still entangled. For instance, how does ϕ_{i} vary when j points to the left and is located at the right side of i, and how does this change vary with the distance d_{ij}? The contribution of each variable cannot be disentangled from those of the other variables.
Moreover, force maps of 3D functions like δϕ require huge amounts of data. Using 30 bins per variable with an average of 100 points per bin would require 2.7 × 10^{6} data points, which is very far from being the amount of data collected in most of the experiments on collective motion. We anticipate here that, in order to get the same level of precision, our procedure described below reduces the required number of data by up to 150 times.
(ii) Another limitation of force maps is that intermediate contributions are difficult or often impossible to identify. A function f can depend on the state variables x and y through two intermediate functions a and b such that f(x, y) = f(a(x, y), b(x, y)). This is precisely the case for the heading change δϕ, which depends on the state variables d_{ij}, ψ_{ij} and ϕ_{ij} through the combination of at least two intermediate contributions, attraction and alignment, themselves depending on the three state variables [12]. Moreover, attraction and alignment can have opposite contributions, so that their combined effect can be cancelled as if the fish were isolated from each other, while both forces are in action. Force maps are not able to identify which part of the heading change is due to the attraction or the alignment, that is, the functions a(x, y) and b(x, y) cannot be directly extracted from force maps. Even if representations in 7D were possible, these ‘maps’ would not allow identification of intermediate contributions.
(iii) Extracting analytical expressions from a colour map is difficult unless the relation between variables is very simple. However, it is essential to build mathematical models to understand how the combination of social interactions gives rise to the observed behaviour [14,21]. The simulations of these models will then be used to make predictions in other experimental situations, to draw phase portraits, etc. Simple piecewise linear functions interpolating the data are not suitable because they lack the physical or biological meaning and do not help to build explicit and concise mathematical models. Figures 2 and 3 show the force maps of δϕ for two different species of fish. Significant differences appear between maps, e.g. in panels (a), the left–right symmetry of the heading change with respect to the position of the neighbour in H. rhodostomus, while in D. rerio the symmetry is with respect to the position of the focal fish; in panels (c), the half-planes of homogeneous colour are horizontal in H. rhodostomus, but vertical in D. rerio. Will such a difference be observed if larger groups are considered? Force maps cannot be extrapolated from one experimental situation to another.
(iv) Finally, it is difficult to distinguish the effect of a variable from the effect of a parameter in a force map, e.g. the effect of the instantaneous distance between fish at each instant of time, from the effect of the fixed BL of individuals. Are the differences observed in the previous maps of each species due to the experimental conditions (e.g. the radius of the arena, which is a variable of the experimental set-up), or to the physical characteristics of the species (e.g. the BL, which is a fixed parameter of the species)?
The method we present below does not have these limitations, first, because it can handle a large number of state variables (i.e. of dimensions), and second, because the analytic expressions it provides are precisely the optimal way to disentangle the interaction functions at play and to describe the role of each state variable and each parameter of the system. These analytic expressions can then be exploited to build an explicit and yet concise model, whose agreement with experiment can be tested, and whose predictions can be further investigated experimentally.
3. Method to extract and model social interactions from behavioural data
Section 2 shows that force maps are representations of one quantity (action force, acceleration, heading variation, etc.) as a function of pairs of other quantities (relative position, velocity or orientation, angle of perception of other individuals, etc.). These quantities only make sense in a framework of the physical world described by a mathematical model which, even if it is often not mentioned explicitly in studies [18,19,22,23] (but see also [13,24,25]), is usually based on equations of motion built in analogy with Newtonian mechanics.
The method essentially consists in defining this framework and deriving the mathematical model describing the relation between the quantities used to quantify the behaviour of an organism and its interactions with other organisms or physical objects that are present in its environment.
Of course, our procedure requires the measurements (i.e. the distribution) of the heading changes δϕ, and also calculation of the average in each box of the discretized grid. However, our procedure does not require at any moment to represent heading changes as a function of two or more variables, i.e. using force maps.
(a) Outline of the method
Figure 5 provides a general overview of the method used to design the mathematical model and to extract the social interaction functions. The method consists of five consecutive steps, starting from experimental observations and ending with numerical simulations of a model reproducing these observations. These five steps are repeated in a cycle as shown in figure 5, as the model predictions allow better targeted experiments to be performed, which in turn allow the model to be refined: the experiments feed the model, and in turn the model helps the design of experiments. Figure 6 shows the same five steps in a detailed flow chart that includes the test of the model.
(i) Data collection and preparation
The method starts by carrying out experiments, motivated by a fundamental question, or the results of the previous cycle of an ongoing modelling process. Then, data are collected and prepared in the form of individual trajectories u_{i}(t) resulting from a tracking procedure conveniently purged from identification errors (step 1 in figures 5 and 6). The velocity vector v_{i} of an individual i is calculated from successive positions with finite differences, and its heading ϕ_{i} is extracted directly from the tracking data, or, alternatively, taken as the orientation of the velocity vector.
(ii) Identification and analysis of state variables and observables
The second step consists in identifying and analysing the state variables and the relevant observables that can be used to describe the phenomenon under study (step 2 in figures 5 and 6).
In the case considered here, the individual state of a fish is characterized by its position and velocity vectors, or, equivalently, by its speed and its distance and orientation to the wall, v, r_{w} and θ_{w}, respectively. The state of a fish i with respect to another fish j is characterized by the distance between them d_{ij}, the relative speed v_{ij}, the viewing angle with which j is perceived by i, ψ_{ij} (≠ψ_{ji}), and their relative heading ϕ_{ij} = ϕ_{j} − ϕ_{i}. The heading angle change δϕ_{S} resulting from social interactions is thus a function of four variables.
Then, an analysis is performed on a series of observables that can be used to describe both the individual and collective behaviour of fish when swimming in pairs. These observables are the probability density function (PDF) of the spatial distribution of individuals, their velocity, their distance and orientation to obstacles, their heading variation, the distance between individuals (cohesion), their relative positions, and their relative heading (polarization). Figure 7 shows the PDF of d_{ij}, ψ_{ij} and ϕ_{i} corresponding to the species of fish studied here. The analysis of these observables relies exclusively on the experimental data and is independent of a model. Eventually, force maps can be drawn.
(iii) Modelling hypotheses
The next step consists in defining a class of models that can provide the most suitable description of the observed phenomenon (step 3 in figures 5 and 6). This is by far the most crucial part of the method.
This process has been described in detail in [12] in the case of H. rhodostomus; here we simply summarize it, as it is identical for D. rerio. We first consider that fish move in a straight line during short time intervals of different durations. These time intervals are separated by instantaneous ‘kicks’ during which fish adjust their direction and make an abrupt acceleration, after which the speed decays exponentially as the fish glides between two kicks. These assumptions come from the analysis of the trajectories and the variation of the velocity [12]. We also assume that the effects of social interactions occur exclusively when kicks are performed, that is, at the decision times when fish adjust their heading. Then, we provide an explicit equation for the heading variation of the focal fish δϕ when the fish performs a kick, in terms of the quantities that can potentially have an effect on δϕ. The selection of the duration and length of the gliding phase and the motion of the fish, with quasi-exponential decay of the velocity between two kicks, are given by simple analytical probability distributions fairly well reproducing the experimental ones, as described in [12].
The equations for the time-evolution of the vector position u(t) and the heading ϕ(t) of a fish are
(iv) Extraction of interaction functions from trajectory data
Step 4, which is the one we wish to emphasize in this article, consists in extracting, from the experimental data, the interaction functions determining the contributions of a neighbouring fish and of the obstacles in the environment to the instantaneous heading variation of a fish. It is a relatively long step that involves several substeps, for which an overview is presented in figure 8.
Discretizing the six state variables in Ω = Q × P × I × J × K × M boxes, and calculating the mean value of δϕ in each box from the experimental data, equation (3.1) can be rewritten as a system of Ω equations and Ω unknowns,
The key hypothesis to the procedure consists in assuming that the contribution of each state variable to each kind of interaction can be separated in a product form, which is the case for physical particles [12]:
For the sake of simplicity, we consider here the case where fish are far enough from the tank wall that the contribution of δϕ_{w} to the heading variation can be neglected with respect to the effects of social interactions. The procedure can easily be extended to extract both the social interactions and the interaction with the wall, but at the cost of more complicated notation. We refer the reader to the work of Calovi et al. [12], in which the disentangling of the combined effects of a tank wall and social interactions between fish is described in detail.
Hence, we are left with four state variables, so that the heading change δϕ(t) is averaged in four-dimensional ijkm-boxes. We introduce the notation f(d_{i}) = f_{i}, g(ψ_{j}) = g_{j}, …, $\hat{l}({v}_{m})={l}_{m}$, so that, after removing the interaction with the wall, the system (3.3) becomes
The goal is now to solve this system of equations. This will provide the discrete values of the interaction functions (the unknowns ${f}_{i},{g}_{j},\dots ,{\hat{l}}_{m}$) as a function of the known values δϕ_{ijkm} (calculated from the experimental data as the mean heading change in each box). Systems with more equations than unknowns are called overdetermined systems and rarely have a solution. One way of overcoming this problem consists in reducing the overdetermined system to a solvable one by minimizing the error Δ with which the overdetermined system is satisfied by a candidate solution that is updated iteratively. This corresponds to step 4.1 in the flowchart figure 6, which is described in detail in box 1 and in §3b below for a simple 2D case where the equation is f_{i} g_{j} = a_{ij}. Figure 9 shows how the error function is built in this simple case.
The error function is a function Δ(X) of D = 2(I + J + K + M) variables which returns a real value calculated as follows:
The minima of Δ(X) are obtained by finding the zeros of the gradient of Δ (see the inset in box 1). Finding the zeros of a function (or root finding) can be carried out by different methods such as descent methods or other iterative methods. Here, we use an iterative method based on the fact that the equation that each component must verify is linear in this component, so that each component can be written explicitly in terms of the other components. This relation is found as follows.
Minimizing the error between data and a product form.
Let us consider an N × M two-dimensional array a_{ij}, for instance, built after averaging some quantity obtained from experiments on an N × M grid. We look for the best approximation of a_{ij} under a product or separable form, a_{ij} ≈ x_{i} × y_{j}, i = 1, …, N, j = 1, …, M (see the system (3.7), in the more general case when more than two dimensions are involved). Note that in the product form, the number of unknown variables is only N + M, compared with the much larger size N × M of the original array of data a_{ij}. In order to quantify the accuracy of the product description, we define the quadratic error function Δ:
The goal is thus to find a set of values x_{1}, …, x_{N}, y_{1}, …, y_{M} that minimizes the error Δ. Δ = 0 implies that x_{i} y_{j} = a_{ij}, for all i and j, and that the original data had exactly a product form. Finding the minima of a function requires finding the zeros of its derivative, which, in high dimensions, is the gradient vector $\mathbf{\nabla}\mathrm{\Delta}$, given by the partial derivatives of Δ with respect to its components; see the Finding minima in the boxed inset below.
Finding minima
A minimum of a one-variable function f(x) is a point x_{m} where the derivative vanishes, f′(x_{m}) = 0, and the second derivative is positive, f″(x_{m}) > 0.
In several dimensions, the ‘derivative’ of a function $F\hspace{0.17em}:\hspace{0.17em}{\mathbb{R}}^{D}\to \mathbb{R}$ is the gradient vector $\mathbf{\nabla}F(\mathit{x})$, whose components are given by the partial derivatives of the function with respect to each component of x: $\mathbf{\nabla}F(\mathit{x})=(\mathrm{\partial}F/\mathrm{\partial}{x}_{1},\mathrm{\partial}F/\mathrm{\partial}{x}_{2},\dots ,\mathrm{\partial}F/\mathrm{\partial}{x}_{D})$. The gradient vector points in the direction of maximum variation of F, and is therefore zero (i.e. equal to the null vector 0) when x is an extremum, and in particular, a minimum:
In principle, to ensure that the extremum is indeed a minimum, one also has to check that the Hessian matrix of second-order partial derivatives has only strictly positive eigenvalues, a condition generically satisfied for squared error functions like the one considered here.
Differentiating (6.1) with respect to x_{k} gives
For the gradient to be zero, we must have
The resulting set of explicit equations for each component can be written in the following form,
Fixed point iterations.
A point ${x}^{\ast}\in \mathbb{R}$ is a fixed point of a function $f\hspace{0.17em}:\hspace{0.17em}\mathbb{R}\to \mathbb{R}$ if f(x*) = x*. Fixed points can be found under certain conditions^{2} by means of the iterative method
In d dimensions, a vector ${\mathit{x}}^{\hspace{0.17em}\ast}\in {\mathbb{R}}^{d}$ is a fixed point of a function $\mathit{F}\hspace{0.17em}:\hspace{0.17em}{\mathbb{R}}^{d}\to {\mathbb{R}}^{d}$ if F(x*) = x*.
When the dimension of the system d = N + M is very large, and in order to improve the stability of the recursion dynamics, it is convenient to use a relaxation method,
In the end, the procedure provides the values of each interaction function at the points representing each box minimizing the error functions. Depending on the number of boxes, a satisfactory analytical form of each one-variable function can be obtained, based on physical principles and specific observations. This part corresponds to the step 4.3 in figure 6 and depends on the specific phenomenon under study; it is detailed in §4 for the case of fish swimming in pairs. For example, the function of the angle of perception g(ψ) must be odd, i.e. g(−ψ) = −g(ψ), because the attractive effect of a neighbour on the heading change of a focal fish has the same intensity wherever the neighbour is located at the right or at the left of the focal fish, but has opposite directions, and hence opposite signs, in each case: if the neighbour is at the right (resp. left) side, the focal fish would turn right (resp. left) to approach the neighbour. Physical properties of the interactions at play, such as the exponential decay of some interactions, must be taken into account and guide the choice of the final analytical expressions.
(v) Numerical simulations of the model
The last step consists in performing numerical simulations of the model with the double purpose of (1) verifying that the simulation results are in good agreement with the experimental results, and (2) making new predictions that can be ultimately confirmed by new experiments. If one of these two points is not satisfactorily verified, then it is necessary to go back to a previous step: step 3 to revise or reformulate the model, step 2 to use alternative state variables or observables, or even step 1 to carry out new experiments or measures of the data (figure 6).
(b) Application to a simple case study
In order to illustrate how the extraction procedure (step 4 of our methodology in figure 6) can be used, we have produced artificial data for a simple case in which the interaction functions are known and in which we controlled the level of noise and the distribution of data. This simple case also allows us to illustrate the efficiency of the procedure and the accuracy that can be obtained according to the quality of the data.
Figure 10a,b shows the colour map of a real function of two variables a(x, y) going from [0, L] × [−π, π] to $\mathbb{R}$, for two levels of noise
We now apply the procedure described in §3a(iv), which provides analytical expressions of the interaction functions of x and y that give rise to this colour map.
The key hypothesis is that there exist two decoupled functions f(x) and g(y) such that a(x, y) = f(x) g(y). Evaluating these functions in each ij-cell, this means that
To minimize the error function, we look for the zeros of its gradient. The partial derivatives are
Thus, starting from an initial guess of the solution, ${\mathit{X}}^{0}=({f}_{1}^{0},{f}_{2}^{0},\dots ,{f}_{I}^{0},{g}_{1}^{0},{g}_{2}^{0},\dots ,{g}_{J}^{0})$, it is possible to obtain the values of the f_{i} s and the g_{j} s after one iteration, that is, ${\mathit{X}}^{1}=({f}_{1}^{1},{f}_{2}^{1},\dots ,{f}_{I}^{1},{g}_{1}^{1},{g}_{2}^{1},\dots ,{g}_{J}^{1})$, repeating this process iteratively, X^{1} → X^{2} → X^{3} → · · · until convergence, i.e. X^{n+1} ≈ X^{n}.
When the jumps between iterations are excessively abrupt, for example if the initial guess is far from a solution, it is convenient to smooth or relax the iterative process by averaging the new values with those of the previous step weighted with a larger coefficient
Figure 11 shows the successive values that each component f_{i}, g_{j} adopts during the first 120 steps of the iteration process, starting from an initial condition where ${f}_{i}^{0}={g}_{j}^{0}=1$ for all i and j. We used I = 11, J = 17, a value of λ = 0.975 quite close to 1 in order to illustrate the convergence process in detail, and a final tolerance |Δ(X^{n+1}) − Δ(X^{n})| < ɛ = 1.87 × 10^{−7} (using ɛ = I × J × 10^{−9}). Typical values of λ can be much smaller (λ = 0.75), depending on how close the initial guess is from the solution.
The final state to which the iterative method has converged, that is, the points ${\{{f}_{i}\}}_{i=1}^{I}$ and ${\{{g}_{j}\}}_{\hspace{0.17em}j=1}^{J}$ that solve the reduced system (6.5), is shown in figure 10c,d. There is an excellent agreement with the original data, despite the addition of noise, including in the case of a much larger noise than the one inferred in the actual data of our fish experiments.
Note that if f and g are a solution of (6.5), then the functions (1/α)f and αg, where α is a real number, are also a solution of (6.5), since the product of the two functions remains invariant. Hence, we need to impose an additional condition in order to account for this under-determination and to generally allow for a proper comparison of reconstructed interaction functions. Following [12], we chose to normalize all angular functions so that their squared average is equal to 1. In particular, g is normalized such that $(1/2\pi ){\int}_{-\pi}^{\pi}g{(y)}^{2}\hspace{0.17em}dy=1$, a normalization applied in figure 10c,d.
The last step consists in finding simple analytical expressions that interpolate the discrete values of the reconstructed interaction functions, in order to implement them in an explicit mathematical model. There is an infinite number of combinations, so that one must be guided by key physical features of the phenomenon under study that are well established, properties such as symmetries or analogies with other physical systems. For example, in the case of angular functions, the parity is often easily identifiable from the data or can be asserted from general principles (mirror symmetry, left/right symmetry etc.), so that a small number of Fourier modes can be sufficient to interpolate the angular functions from the data that result from the reconstruction procedure.
(c) On the hypothesis of separation of variables of interaction functions
The central assumption of the extraction procedure is the separation of variables made in (3.4)–(3.6). Without this hypothesis, the solution of equation (3.3) far from the wall is δϕ_{S}(d_{i}, ψ_{j}, ϕ_{k}, v_{m}) = δϕ_{ijkm}, and we are led back to force maps, with all contributions still entangled and no analytical expressions of the interaction functions. Separability of variables is thus crucial for our method. However, what happens if the interaction functions do not satisfy this condition?
Consider a 2D function a(d, ψ), odd in ψ. Such a function can always be expanded in a Fourier series as
A simple and biologically meaningful example, inspired by the fish interaction functions, would consist in writing ${c}_{k}(d)={\alpha}_{k}\mathrm{exp}[-{d}^{2}/(2{l}_{k}^{2})]$ in (3.19) and using only the modes k = 0 and 2:
Let us see the result of applying our procedure to a similar function b(d, ψ) that does not satisfy the separability condition. Using the Fourier expansion (3.19), we consider two examples b_{1}(d, ψ) and b_{2}(d, ψ) of the form
Figure 12a,b shows the colour maps of b_{1}(d, ψ) and the solution found by the reconstruction procedure f_{i} g_{j}. The general shape is quite well reproduced, and differences appear only at a relatively small scale. Figure 12c,d shows the reconstructed functions f(d), g(ψ) for which we have found the following analytical expressions,
Although the procedure provided an apparently satisfactory result (figure 12a,b are quite similar), the reconstructed functions fail to reproduce some features such as the (two) changes of variation of the intensity along the radial coordinate (first increasing, then decreasing) when the neighbour is at one side of the focal individual (ψ ≈ ±π/2), and introduce an artificial slight decrease in the intensity when the neighbour is far away (d ≈ 3 BL) and at one side (ψ ≈ ±π/2) of the focal individual (figure 12c,d).
A similar qualitative result is obtained in the second case with b_{2}(d, ψ), as shown in figure 13. In that case, we found β_{0} = 0.69, ${\overline{l}}_{0}=0.078$, β_{1} = 1.99, β_{2} = 0.9, so the interaction function provided by our procedure is
Let us now discuss the general validity of the product assumption on the illustrative example of the interaction of a fish with a circular wall. In this case, the heading angle change between two time steps (kicks) is δϕ(r_{w}, θ_{w}), where r_{w} is the distance of the fish from the wall, and θ_{w} is the angle between the fish heading and the normal to the wall. Here, the product hypothesis amounts to assuming that δϕ(r_{w}, θ_{w}) = f(r_{w})g(θ_{w}). As explained in [12], the product assumption is verified for physical particles interacting with a wall via a conservative force (deriving from a potential energy), with g(θ_{w}) given exactly by g(θ_{w}) = sin θ_{w} (projection of the central force on the normal to the velocity, i.e. on the angular acceleration). For animals, and in particular fish, the anisotropic perception of their environment generally leads to non-conservative interactions. The product hypothesis assumes that this anisotropic perception can be fully encoded in a non-trivial function g(θ_{w}) generally different from a simple sinus, while the function remains odd if the left/right symmetry is preserved. As explained above, δϕ(r_{w}, θ_{w}) for animals takes the general Fourier expansion form of equation (3.19), $\delta \varphi ({r}_{\mathrm{w}},{\theta}_{\mathrm{w}})=\mathrm{sin}({\theta}_{\mathrm{w}})\sum _{k=0}^{\mathrm{\infty}}{c}_{k}({r}_{\mathrm{w}})\mathrm{cos}(k{\theta}_{\mathrm{w}})$. Strictly speaking, δϕ(r_{w}, θ_{w}) can be separated into the product of two functions of r_{w} and θ_{w} only if all functions c_{k}(r_{w}) are proportional to each other. However, if these functions decay similarly with the distance to the wall r_{w}, the proportionality assumption and hence the product assumption would be only weakly invalid, as illustrated in the practical examples presented above. In fact, we generally expect only a few Fourier modes to be relevant in the above expansion (in [12], it was found that only one or two non-zero modes were enough to describe all interactions; see next section). Hence, the product assumption can be only severely invalidated if the few relevant functions c_{k}(r_{w}) have very different behaviour. However, on general biological grounds, we expect that all these functions characterizing the interaction with the wall should decay smoothly with the distance r_{w}. Similarly, for the corresponding functions describing the attractive interaction between two fish, we expect them to first increase with distance between the fish before decaying at larger range. Hence, even if these few functions are not strictly proportional to each other, their similar anticipated behaviour should lead to effective product interactions grasping the most important features of the actual interactions.
In conclusion of this section, the product assumption allows for an efficient method to reconstruct the individual one-variable interaction functions, only requiring a moderate amount of experimental data. Even if the product assumption is not strictly verified by the actual interactions, general biological considerations ensure that the product interactions would still describe their main features. In any case, the product form offers an explicit representation of the interactions, separating the different contributions (interaction with the wall, attraction/repulsion and alignment between individuals). The reconstructed one-variable interaction functions can be fitted by simple analytic forms (see next section) and the full interaction functions can then be straightforwardly implemented in an explicit and concise model whose predictions can be checked against experimental results, hence providing a further validation of the product assumption. In addition to ultimately producing models for the dynamics of animal groups, the simple and explicit form of the interactions allows for a precise analysis of the behavioural interactions at play in the system, and in particular, for the disentangling of their different components.
4. Extraction and comparison of social interactions in different species of fish
The model proposed in §3 is based on the assumption that social interactions are combined in an additive form (see step 3, figure 6). In equation (3.2), two functions δϕ_{Att} and δϕ_{Ali} were introduced to account for the attraction and alignment interactions, for which no a priori assumptions were made except that their dependence on the state variables d_{ij}, ψ_{ij} and ϕ_{ij} is decoupled; see equations (3.5) and (3.6). Steps 4.1–4.3 in figure 6 provide us with functional forms, but do not determine which functional form corresponds to which kind of interaction.
To do that, the analysis of the relevant observables (step 2 in figure 6) is used to say that δϕ must change sign when ψ_{ij} and ϕ_{ij} both change sign, and that, consequently, the same must happen for δϕ_{Att} and δϕ_{Ali}. This way, the parity of the angular components of the interaction functions is univocally determined. Thus, to have an interaction of attraction, the fish must turn left (resp. right) if its neighbour is on its left (resp. right); that is, δϕ > 0 if ψ_{ij} > 0 (resp. δϕ < 0 if ψ_{ij} < 0). Assuming perfect left/right symetry, this exactly means that δϕ_{Att} must be an odd function of ψ_{ij}, and thus an even function of ϕ_{ij}, provided fish do not have side preferences to turn (that is, fish do not prefer turning left to turning right). Similarly, to have an interaction of alignment, the fish must turn left when the relative heading of its neighbour is turned to the left, and turn right if it is turned to the right; that is, δϕ_{Ali} must be an odd function of ϕ_{ij}, and thus an even function of ψ_{ij}.
This allows us to rewrite the interaction functions from equations (3.5) and (3.6) as follows:
Figure 14 shows the social interaction functions reconstructed with the procedure described in the previous section in the case of two H. rhodostomus (figure 14a–c) and two D. rerio (figure 14d–f) swimming in circular arenas. In both species, attraction and alignment have been detected and, although some important differences can be observed, each kind of interaction has essentially a similar shape and intensity, although the interaction ranges are very different. Hence, we used the analytical expressions introduced in [12] for H. rhodostomus (solid lines in figure 14) to fit the discrete values of D. rerio extracted from the experimental data in step 4 of our method (figure 6), that is, with the procedure described in §3 (points in figure 14).
These expressions are, for the intensity of the social interaction of attraction and alignment, as follows:
H. rhodostomus | D. rerio | |
---|---|---|
R (m) | 0.25 | 0.29 |
R/BL | 8.33 | 6.44 |
γ_{Att} | 0.124 | 0.42 |
d_{Att} (m) | 0.03 | 0.015 |
l_{Att} (m) | 0.193 | 0.042 |
γ_{Ali} | 0.092 | 0.32 |
d_{Ali} (m) | 0.03 | 0.045 |
l_{Ali} (m) | 0.16 | 0.1 |
Regarding the normalized angular functions for D. rerio, we found the following expansions (with at most two Fourier modes in addition to the trivial zero mode):
Following step 5 of our methodology (figure 6), these analytical expressions should be implemented in the model introduced in §3. Then, numerical simulations of the model should be performed and compared with the known experiments, and predictions should be made, which must be verified a posteriori. This corresponds to step 6 of our methodology (figure 6); this has been done for H. rhodostomus in [12], and will be done elsewhere for D. rerio and other species.
Comparing the interaction functions found for both species, we observe that they have a similar shape, especially the angular functions (see the angular functions of attraction in figure 14b, for which one can use the same function of the angle of perception ψ_{ij}). The intensities of attraction and alignment have the same order of magnitude in both species: the maximum of the attraction is around 0.4 in both species, and the maximum of the alignment is around 0.2–0.3 (figure 14a).
The most important difference between the two species is that the range of the interactions is much larger in H. rhodostomus than in D. rerio: in H. rhodostomus, the maximum intensities of attraction and alignment are around 7 and 3.5 BL respectively, while in D. rerio these maxima are both near 1.5 BL. Moreover, the intensity of these interactions decays more rapidly in D. rerio than in H. rhodostomus, especially with respect to the fish BL (in D. rerio, the alignment intensity is zero beyond 5 BL (approx. 22.5 cm), but it is still noticeable at a distance of 10 BL (approx. 30 cm) in H. rhodostomus). Attraction almost always dominates alignment in D. rerio (the intersection of the red and blue lines in figure 14a is at around 0.5 BL), while, in H. rhodostomus, alignment was found to dominate attraction at short distances (under 2.5 BL).
In H. rhodostomus, a fish i is subject to a stronger attraction when the other fish j is at its right or left side (O_{Att} reaches its highest values when ψ_{ij} ≈ ±90°; see figure 14b) and moves more or less perpendicular to it (E_{Att} is higher when ${\varphi}_{ij}\approx 90\text{--}{100}^{\circ}$; see figure 14c). Alignment is stronger when the other fish is in front (E_{Ali} is higher when |ψ_{ij}| < 80°). In D. rerio, attraction is stronger when the other fish is clearly behind the focal fish (O_{Att} is higher when ψ_{ij} ≈ ±135°) and moves in the opposite direction (E_{Att} is higher when |ϕ_{ij}| > 100°; see figure 14e,f).
In both species, the strength of the alignment vanishes when fish are already almost aligned (O_{Ali} ≈ 0 when |ϕ_{ij}| < 30°; see figure 14c,f). In D. rerio, the strength of the alignment is active essentially when both fish are perpendicular to each other (the high intensity of |O_{Ali}| ≈ 1.5 is reached when ϕ_{ij} ≈ ±85°) and the focal fish has its neighbour at one of its sides (E_{Ali} peaks at ψ_{ij} ≈ ±85°), while in H. rhodostomus, the ranges of interaction both in the angle of relative heading and in the angle of perception of the neighbour are much wider: alignment is active when the neighbour is ahead of the focal fish (|ψ_{ij}| < 100°) and fish are simply slightly aligned (45° < |ϕ_{ij}| < 135°).
In summary, when swimming in pairs, H. rhodostomus interact in a much wider range of situations than D. rerio. This is true with respect to the three state variables of a focal fish: (1) the distance d_{ij} at which both attraction and alignment interactions are active is much larger in H. rhodostomus than in D. rerio; (2) the zone around a focal fish where the strength of the interactions with a neighbour is important is much wider in H. rhodostomus than in D. rerio, and (3) the same is true for the range of relative headings for which the intensity of interactions between fish is not negligible. Finally, the maximum intensity of the interaction is similar in both species, although high-intensity values are reached in a much wider range of situations in H. rhodostomus than in D. rerio.
5. Discussion and conclusion
Behavioural biology has recently become a ‘big-data science’ mainly supported by the advances in imaging and tracking techniques. These new tools have revolutionized the observation and quantification of individual and collective animal behaviour, improving to unprecedented levels the variety and precision of available data [9,26–28]. As the access to large volumes of data is gradually stepping animal behaviour research into a new era, there is also a growing need for understanding interactions between individuals and the collective properties that emerge from these interactions. Animal societies are complex systems whose properties are qualitatively different from those of their individual members, and whose behaviours are impossible to predict from a priori knowledge of individuals [3,4]. However, understanding how the interactions between individuals in swarms of insects, schools of fish, flocks of birds, herds of ungulates, or human crowds give rise to the ‘collective level’ properties requires the development of mathematical models. These models allow investigation of how complex processes are connected, to systematically analyse the impact of perturbations on collective behaviour (e.g. when a predator is detected in the neighbourhood), to develop hypotheses to guide the design of new experimental tests, and ultimately, to assess how each biological variable contributes to the emergent group-level properties.
We have presented a general methodology that leads to the measurement of the social interactions (defined in the framework of a model) from a set of individual trajectories. This procedure, illustrated here on two different species of fish, can be similarly applied on any set of trajectories of other organisms, including humans [29,30]. Once the experimental trajectories have been obtained, the extraction of interaction functions only makes sense after they have been defined in the framework of a general model for the equation of motion of an individual interacting with its environment (i.e. the obstacles and another individual). The model should involve the relevant variables regarding the interaction with obstacles (distance to the nearest wall, angle of the velocity with respect to the normal to the wall etc.) and another individual (distance between individuals, relative velocities, viewing angle etc.). In addition, each interaction component (repulsion, attraction, alignment etc.) is reasonably assumed to contribute additively: for instance, the influence of the wall and another individual on the focal individual is the independent sum of the two corresponding interactions. More importantly, the central hypothesis of our approach consists in assuming that the contribution of each interaction can be adequately described by a product of unknown single-variable functions of the relevant variables, or of a combination of these variables. The structure of the model is obviously also constrained by the considered species, their motion mode, and their anticipated interactions. For instance, for the fish species that have a burst-and-coast swimming mode, the dynamical model is intrinsically discrete in time and returns the angle change of an individual after each kick. For humans or some other fish species with a smooth swimming mode, a continuous-time model is necessary.
The unknown interaction functions defined in the structure of the model are not constrained, and the aim of the extraction procedure (step 4 in figure 6) is to measure them, without any a priori assumption about their form or intensity. In order to achieve that, each unknown single-variable interaction function is tabulated on a one-dimensional grid, and their values at each grid point are the fitting parameters. These parameters are then determined by minimizing the mean quadratic error between the prediction of the model and the experimental angle changes after a kick (in the case of discrete dynamics) or the experimental acceleration (in the case of a continuous-time dynamics). This minimization process (box 1) is achieved by solving, for instance with an iterative method (box 2), the equations expressing the vanishing of the partial derivatives of the error with respect to each fitting parameter, as we have done here, or by gradient descent methods. Once the interaction functions have been obtained on their respective grid, they are fitted and represented by simple analytical forms that best capture their general shape (figure 14).
The original general equation of motion is now complemented by these explicit interaction functions, leading to a concise and explicit model, which can be straightforwardly implemented numerically and can even be studied mathematically. The ability of the model to reproduce the experimental results and even to predict the behaviour of individuals in other situations not yet investigated experimentally can then be assessed. The original model can also be extended and the full extraction procedure repeated if an important feature appears to be missing. In particular, quantities like the probability distributions of the distance to the wall, of the distance between two individuals, of the angle between the two velocities or between the velocity and the normal to the wall, as well as other observables, permit the predictive power of the model to be assessed. Note that the interaction functions appearing in the model are determined by finding the best equation of motion describing the instantaneous decisions of the individuals. It is by no means trivial that this is enough for the resulting dynamical model to be able to reproduce observable quantities measured after averaging over many trajectories, or to predict the behaviour of individuals in different experimental conditions. A model able to achieve this certainly provides a convincing indication that its original and general structure and its extracted interaction functions properly represent and describe the behaviour and motion of the studied species.
The main limitations of our methodology lie in the reasonable assumption of additive contributions for the different interactions, and more critically, in assuming that each of these interactions is the product of single-variable interaction functions. This is the cost to pay for only involving a limited number of fitting parameters, yet capturing a large part of the complex structure of these interactions, and also for ultimately obtaining a concise, explicit and exploitable model.
To address these issues, we are planning in the near future to test our models on a robotic platform [31]. Such a platform could also allow us to study bidirectional interactions between robots reproducing the trajectories generated by our models and real fish, to obtain a more representative validation of our models of interactions [32].
Yet, the advantages and benefits of our approach are numerous. First, the number of fitting parameters (typically 30 for each of the typically five to eight interaction functions), although apparently large (a total of typically 150–300 parameters), is in general much smaller than the number of data points in the available experimental trajectories (typically 10^{4} for human groups [29], and 10^{5} for fish [12]). In comparison, a complete force map in typically five or more dimensions (one dimension per relevant variable) on a mesh involving 30^{5} boxes, with enough data points in each of them, would require millions if not billions of experimental data points, which is in general impossible to achieve. Two-dimensional force maps obtained after projection (i.e. averaging on the other 5 − 2 = 3 variables), although less noisy than the original force map, cannot be exploited to build a model, since it can be shown that they are strongly affected by the existing correlations between these variables along actual trajectories. For instance, if a fish is very close to a wall, there is a high probability that it swims parallel to the wall, so that its distance from the wall is strongly correlated with its heading angle. On the other hand, it can be shown [12] that our methodology to extract interaction functions is not at all affected by the likely correlations present in the system, and actually exploits them.
Our method is also robust with respect to the presence of noise in the data (intrinsic behavioural noise or unwanted experimental noise), and can actually be used to measure the spontaneous fluctuations of the speed and heading angle of the individuals [12,30]. Moreover, the extraction of interaction functions requires very limited computing power, being obtained within a few seconds on a standard workstation. Ultimately, the analysis of the resulting interaction function allows general qualitative conclusions to be made about the interaction at play for the considered species. Force maps can also help in this analysis, but our approach allows for an even finer analysis thanks to the disentangling of interactions by means of separate and explicit interaction functions, instead of mere projected colour maps affected in an uncontrolled manner by the inherent correlations present in the system and the mixing of the different interaction contributions.
More importantly, our methodology ultimately leads to a concise and explicit model that can be exploited to understand and explain diverse experimental features and various forms of collective behaviour, and that has a predictive power, while force maps cannot be directly exploited to build such explicit models. Note that the structure of the model should be robust for different species having comparable motion mode. For instance, this structure is the same for H. rhodostomus and D. rerio (or for any species with a burst-and-coast or run-and-tumble motion mode), and the behavioural differences between the species are solely and fully encoded in their different measured interaction functions.
In the specific cases of H. rhodostomus and D. rerio, we have also found that the interaction functions characterizing their interaction with the wall are very similar (see [12] for H. rhodostomus), which explains their common tendency to swim close to the wall, especially when a fish swims alone in a tank. However, even if the interaction functions describing the repulsion/attraction and alignment interactions between two fish have a similar general structure and shape for both species (figure 14), the range of the attraction and alignment interactions is much shorter for D. rerio. In addition, the intensity of both interactions in D. rerio is strongly reduced when the focal fish is behind and hence follows the other fish (i.e. when |ψ_{ij}|< 30°). Both features contribute to a much weaker coordination of the motion in groups of two fish in D. rerio, compared with H. rhodostomus, which can already be qualitatively noticed by observing recorded trajectories. This weaker coordination is quantitatively illustrated by the probability distribution of the distance between two fish (figure 14a,d), which is wider for D. rerio and extends up to much larger distances. Moreover, the probability distribution of the heading angle difference between the two fish is less peaked near ϕ_{ij} = 0 in D. rerio (figure 14f). Finally, the weaker coordination observed in D. rerio results in a similar behaviour for the geometrical leader and follower, whereas the leader and follower have a very distinct behaviour in H. rhodostomus.
6. Material and methods
(a) Ethics
Experiments with H. rhodostomus have been approved by the Ethics Committee for Animal Experimentation of the Toulouse Research Federation in Biology no. 1 and comply with the European legislation for animal welfare. Experiments with D. rerio were conducted under authorization by the state ethical board of the Department of Consumer and Veterinary Affairs of the Canton de Vaud (SCAV) of Switzerland (authorization no. 2778). During the experiments, no mortality occurred.
(b) Study species
Hemigrammus rhodostomus were purchased from Amazonie Labège (http://www.amazonie.com) in Toulouse, France. Fish were kept in 150 l aquariums on a 12 : 12 h, dark : light photoperiod, at 26.8°C (±1.6°C) and were fed ad libitum with fish flakes. The average BL of the fish used in the experiments was 31 mm. Sixty wild-type D. rerio with short fins (AB strain) were acquired from a pet shop, and stored in a 60 l aquarium. The average BL of the fish used in the experiments was approximately 4.5 cm. The water in the housing aquarium was kept at a temperature of 26°C. The fish were fed once per day with commercial food between 16.00 and 18.00. Additionally, we opted to use enrichment for the aquarium in the form of plastic plants, Cladophora, gravel, rocks and aquatic snails.
(c) Experimental procedures and data collection
The experimental tank (120 × 120 cm) used to investigate swimming behaviour in H. rhodostomus was made of glass and was set on top of a box to isolate fish from vibrations. The set-up, placed in a chamber made by four opaque white curtains, was surrounded by four LED light panels giving isotropic lighting. A circular tank of radius R = 25 cm was set inside the experimental tank filled with 7 cm of water of controlled quality (50% water purified by reverse osmosis and 50% water treated by activated carbon) heated to 26.1°C (±0.3°C). Reflections of light due to the bottom of the experimental tank were avoided thanks to a white PVC layer. Each trial started by placing two fish randomly sampled from their breeding tank in a circular tank. Fish were left for 10 min to habituate before the start of the trial. A trial consisted in 1 or 3 h of fish freely swimming (i.e. without any external perturbation) in a circular tank. A total of 16 trials were performed. Fish trajectories were recorded by a Sony HandyCam HD camera filming from above the set-up at 50 Hz (50 frames per second) in HDTV resolution (1920 × 1080 pixels).
In the case of D. rerio, the experimental set-up had dimensions of 100 × 100 × 25 cm, inside which a circular tank of radius R = 29 cm was placed. The sides and bottom part of the tank were covered with a Teflon plate to avoid reflections. Furthermore, the set-up was confined behind white sheets to isolate the fish from external stimuli in the room, while also maintaining a consistent lighting environment inside the set-up bounds. A uniform luminosity for the room was provided by four 110 W fluorescent lamps placed at each of the four sides of the tank. Prior to placing fish in the experimental set-up, we ensured that the height of the water was 6 cm. Then, 10 videos of 70 min duration each were recorded in the circular tank. Subsequently, a group of fish was randomly selected and caught from the rearing tanks to participate in the experiment. A pair of fish was then chosen and placed in the set-up. The fish were allowed to habituate for 5 min before starting the 70 min long recording. After a single experiment was completed, the fish were returned to the original rearing tank without being re-inserted in the selected group (i.e. no individual was used twice in the same day). The positions of fish on each frame were tracked with idTracker 2.1 [8]. Time series of positions were converted from pixels to metres and the origin of the coordinate system was set to the centre of the ring-shaped tank. Tracking errors (approx. 20% of the data) were corrected and instances where at least one fish moved less than 0.5 BL per second during 4 s were removed. More than 8 h remained during which one fish could kick.
We provide the dataset of each species in Figshare: https://doi.org/10.6084/m9.figshare.9933356.v1.
(d) Segmentation of trajectories
Instants of kicks were identified as local minima of the velocity preceding local maxima (which are more easy to identify), along time intervals [t − t_{w}, t + t_{w}], where t_{w} is a time window of 0.32 s. Trajectories are thus considered as a sequence of kicks. Fish almost never kick at the same time (asynchronous kicks); the position of the other fish is calculated at each instant of kick of the focal fish by simple linear interpolation.
(e) Fortran code
We provide the code to perform the analysis described in §3b, together with a simple script for quick visualization of the result in the gnuplot environment and the required data in Figshare: https://doi.org/10.6084/m9.figshare.11777325.
Data accessibility
Supporting data are available from the Figshare Repository: https://doi.org/10.6084/m9.figshare.9933356. We provide the code to perform the analysis described in §3.2, together with a simple script for quick visualization of the result in the gnuplot environment and the required data in Figshare: https://doi.org/10.6084/m9.figshare.11777325.
Authors' contributions
R.E., C.S. and G.T. conceived and designed the study. R.E., V.L., V.P. and F.B. performed research; R.E., V.L., C.S. and G.T. analysed data. R.E., C.S. and G.T. wrote the paper with an input from all other authors.
Competing interests
We declare we have no competing interest.
Funding
This work was supported by the Germaine de Staël project no. 2019-17 and the Swiss National Science Foundation project ‘Self-Adaptive Mixed Societies of Animals and Robots’, grant no. 175731. R.E. was supported by Marie Curie Core/Program Grant Funding, grant no. 655235-SmartMass. V.L. was supported by doctoral fellowships from the scientific council of the Université Paul Sabatier.
Acknowledgements
G.T. gratefully acknowledges the Indian Institute of Science to serve as Infosys visiting professor at the Centre for Ecological Sciences in Bengaluru. We thank Gérard Latil for technical assistance.
Footnotes
Endnotes
1 The average function of a(x, y) with respect to the variable x is given by ${\u27e8a\u27e9}_{x}(y)={\int}_{x}p(x,y)a(x,y)\mathrm{dx}$, where p(x, y) is the probability of occurrence of the state (x, y). However, force maps calculate the 〈a〉_{x}(y) as the mean of the values of a(x, y) over all the values of x, i.e. ${\u27e8a\u27e9}_{x}(y)=\sum _{x}a(x,y)/\sum _{x}1$, as if p(x, y) = 1 for all (x, y), because knowing p(x, y) is part of the problem.
References
- 1.
Kaneko K . 2006 Life: an introduction to complex systems biology.Understanding Complex Systems . Berlin, Germany: Springer. Crossref, Google Scholar - 2.
- 3.
Camazine S, Deneubourg J-L, Franks N, Sneyd J, Theraulaz G, Bonabeau E 2001 Self-organization in biological systems.Princeton studies in complexity . Princeton, NJ: Princeton University Press. Crossref, Google Scholar - 4.
Sumpter DJT 2010 Collective animal behavior. Princeton, NJ: Princeton University Press. Crossref, Google Scholar - 5.
Jost C, Verret J, Casellas E, Gautrais J, Challet M, Lluc J, Blanco S, Clifton MJ, Theraulaz G . 2007 The interplay between a self-organized process and an environmental template: corpse clustering under the influence of air currents in ants. J. R. Soc. Interface 4, 107-116. (doi:10.1098/rsif.2006.0156) Link, Web of Science, Google Scholar - 6.
Casellas E, Gautrais J, Fournier R, Blanco S, Combe M, Fourcassié V, Theraulaz G, Jost C . 2008 From individual to collective displacements in heterogeneous environments. J. Theor. Biol. 250, 424-434. (doi:10.1016/j.jtbi.2007.10.011) Crossref, PubMed, Web of Science, Google Scholar - 7.
Branson K, Robie AA, Bender J, Perona P, Dickinson MH . 2009 High-throughput ethomics in large groups of Drosophila. Nat. Methods 6, 451-457. (doi:10.1038/nmeth.1328) Crossref, PubMed, Web of Science, Google Scholar - 8.
Pérez-Escudero A, Vicente-Page J, Hinz RC, Arganda S, de Polavieja GG . 2014 idTracker: tracking individuals in a group by automatic identification of unmarked animals. Nat. Methods 11, 743-748. (doi:10.1038/nmeth.2994) Crossref, PubMed, Web of Science, Google Scholar - 9.
Dell AI et al. 2014 Automated image-based tracking and its application in ecology. Trends Ecol. Evol. 29, 417-428. (doi:10.1016/j.tree.2014.05.004) Crossref, PubMed, Web of Science, Google Scholar - 10.
Anderson DJ, Perona P . 2014 Toward a science of computational ethology. Neuron 1, 18-31. (doi:10.1016/j.neuron.2014.09.005) Crossref, Web of Science, Google Scholar - 11.
Romero-Ferrero F, Bergomi MG, Hinz RC, Heras FJH, de Polavieja GG . 2019 Idtracker.ai: tracking all individuals in small or large collectives of unmarked animals. Nat. Methods 16, 179-182. (doi:10.1038/s41592-018-0295-5) Crossref, PubMed, Web of Science, Google Scholar - 12.
Calovi DS, Litchinko A, Lecheval V, Lopez U, Pérez Escudero A, Chaté H, Sire C, Theraulaz G . 2018 Disentangling and modeling interactions in fish with burst-and-coast swimming reveal distinct alignment and attraction behaviors. PLoS Comput. Biol. 14, e1005933. (doi:10.1371/journal.pcbi.1005933) Crossref, PubMed, Web of Science, Google Scholar - 13.
Zienkiewicz AK, Ladu F, Barton DAW, Porfiri M, Bernardo MD . 2018 Data-driven modelling of social forces and collective behaviour in zebrafish. J. Theor. Biol. 443, 39-51. (doi:10.1016/j.jtbi.2018.01.011) Crossref, PubMed, Web of Science, Google Scholar - 14.
Lopez U, Gautrais J, Couzin ID, Theraulaz G . 2012 From behavioural analyses to models of collective motion in fish schools. Interface Focus 2, 693-707. (doi:10.1098/rsfs.2012.0033) Link, Web of Science, Google Scholar - 15.
Herbert-Read JE . 2016 Understanding how animal groups achieve coordinated movement. J. Exp. Biol. 219, 2971-2983. (doi:10.1242/jeb.129411) Crossref, PubMed, Web of Science, Google Scholar - 16.
Lei L, Escobedo R, Sire C, Theraulaz G . 2020 Computational and robotic modeling reveal parsimonious combinations of interactions between individuals in schooling fish. PLoS Comput. Biol. 16, e1007194. (doi:10.1371/journal.pcbi.1007194) Crossref, PubMed, Web of Science, Google Scholar - 17.
Mwaffo V, Anderson RP, Butail S, Porfiri M . 2015 A jump persistent turning walker to model zebrafish locomotion. J. R. Soc. Interface 12, 20140884. (doi:10.1098/rsif.2014.0884) Link, Web of Science, Google Scholar - 18.
Katz Y, Tunstrøm K, Ioannou CC, Huepe C, Couzin ID . 2011 Inferring the structure and dynamics of interactions in schooling fish. Proc. Natl Acad. Sci. USA 108, 18 720-18 725. (doi:10.1073/pnas.1107583108) Crossref, Web of Science, Google Scholar - 19.
Herbert-Read JE, Perna A, Mann RP, Schaerf TM, Sumpter DJT, Ward AJW . 2011 Inferring the rules of interaction of shoaling fish. Proc. Natl Acad. Sci. USA 108, 18 726-18 731. (doi:10.1073/pnas.1109355108) Crossref, Web of Science, Google Scholar - 20.
Lee S, Wolberg G, Shin SY . 1997 Scattered data interpolation with multilevel B-splines. IEEE Trans. Vis. Comput. Graph 3, 228-244. (doi:10.1109/2945.620490) Crossref, Web of Science, Google Scholar - 21.
Weitz S, Blanco S, Fournier R, Gautrais J, Jost C, Theraulaz G . 2012 Modeling collective animal behavior with a cognitive perspective: a methodological framework. PLoS ONE 7, e38588. (doi:10.1371/journal.pone.0038588) Crossref, PubMed, Web of Science, Google Scholar - 22.
Torney CJ, Lamont M, Debell L, Angohiatok RJ, Leclerc L-M, Berdahl AM . 2018 Inferring the rules of social interaction in migrating caribou. Phil. Trans. R. Soc. B 373, 20170385. (doi:10.1098/rstb.2017.0385) Link, Web of Science, Google Scholar - 23.
Quera V, Gimeno E, Beltran FS, Dolado R . 2019 Local interaction rules and collective motion in black neon tetra (Hyphessobrycon herbertaxelrodi) and zebrafish (Danio rerio). J. Comp. Psychol. 133, 143-155. (doi:10.1037/com0000172) Crossref, PubMed, Web of Science, Google Scholar - 24.
Lukeman R, Li Y-X, Edelstein-Keshet L . 2010 Inferring individual rules from collective behavior. Proc. Natl Acad. Sci. USA 107, 12 576-12 580. (doi:10.1073/pnas.1001763107) Crossref, Web of Science, Google Scholar - 25.
Pettit B, Perna A, Biro D, Sumpter DJT . 2013 Interaction rules underlying group decisions in homing pigeons. J. R. Soc. Interface 10, 20130529. (doi:10.1098/rsif.2013.0529) Link, Web of Science, Google Scholar - 26.
Egnor SER, Branson K . 2016 Computational analysis of behavior. Annu. Rev. Neurosci. 39, 217-236. (doi:10.1146/annurev-neuro-070815-013845) Crossref, PubMed, Web of Science, Google Scholar - 27.
Robie AA, Seagraves KM, Egnor SER, Branson K . 2017 Machine vision methods for analyzing social interactions. J. Exp. Biol. 220, 25-34. (doi:10.1242/jeb.142281) Crossref, PubMed, Web of Science, Google Scholar - 28.
Gernat T, Rao VD, Middendorf M, Dankowicz H, Goldenfeld N, Robinson GE . 2018 Automated monitoring of behavior reveals bursty interaction patterns and rapid spreading dynamics in honeybee social networks. Proc. Natl Acad. Sci. USA 115, 1433-1438. (doi:10.1073/pnas.1713568115) Crossref, PubMed, Web of Science, Google Scholar - 29.
Jayles B, Escobedo R, Pasqua R, Zanon C, Blanchet A, Roy M, Tredan G, Theraulaz G, Sire C . 2020 Collective information processing in human phase separation. Phil. Trans. R. Soc. B 375, 20190801. (doi:10.1098/rstb.2019.0801) Google Scholar - 30.
Escobedo R, Jayles B, Trédan G, Roy M, Pasqua R, Zanon C, Blanchet A, Theraulaz G, Sire C . In preparation. Measuring and modeling individual level interactions in a swarming crowd. Google Scholar - 31.
Bonnet F, Rétornaz P, Halloy J, Gribovskiy A, Mondada F . 2012 Development of a mobile robot to study the collective behavior of zebrafish. In 2012 4th IEEE RAS & EMBS Int. Conf. Biomedical Robotics and Biomechatronics (BioRob), Rome, Italy, 24–27 June 2012, pp. 437–442. Piscataway, NJ: IEEE. Google Scholar - 32.
Papaspyros V, Bonnet F, Collignon B, Mondada F . 2019 Bidirectional interactions facilitate the integration of a robot into a shoal of zebrafish Danio rerio. PLoS ONE 14, e0220559. (doi:10.1371/journal.pone.0220559) Crossref, PubMed, Web of Science, Google Scholar