Multi-robot replication of ant collective towing behaviours

In this work, teams of small mobile robots are used to test hypotheses about cooperative transport by ants. This study attempts to explain a decrease in steady-state transport speed with increasing team size that was previously observed in the ant Novomessor cockerelli. Two models of one-dimensional collective towing are compared: one in which transporters with different maximum speeds pull the payload with continuous, variable forces and another in which transporters with identical speeds pull with intermittent, unsynchronized forces. A statistical analysis of ant data supports the hypothesis that ants behave according to the first model, in which the steady-state transport speed is the maximum speed of the slowest teammate. By contrast, the ant data are not consistent with the second model, which predicts constant speed regardless of team size. To verify these predictions, the ant behaviours in each model are translated into decentralized controllers and implemented on teams of two to four robots. The controller for the first model incorporates a real-time reinforcement learning algorithm that successfully reproduces the observed relationship between ant team size and transport speed. The controller for the second model yields the predicted invariance of transport speed with team size. These results show the value of robotic swarms for testing mechanistic hypotheses about biological collectives.

SW, 0000-0002-6282-4772

Introduction
Cooperative transport of large food items by ants is an impressive example of robust multi-agent coordination that is fully decentralized, scalable with the number of transporters, and effective in unknown environments with uneven terrain and obstacles. The ant behaviours that drive this phenomenon are still poorly understood [1,2], but they appear to involve interactions among transporters that are mediated by the food item being carried [3,4]. These interactions are an example of stigmergy, a mechanism by which individuals communicate through modifications to their environment (in this case, the payload). However, stigmergic behaviour is most likely not the sole factor contributing to successful transport. Coordination may also depend on the members of the transport team having some common information, such as the direction to the nest, and may rely on direct or explicit communication between transporters.
Further understanding of group food retrieval in ants can be applied toward developing control policies for cooperative transport by robotic swarms. Cooperative multi-robot manipulation of heavy payloads in unstructured environments has potential applications in disaster response and search-and-rescue operations, as well as automated construction and assembly tasks in remote environments. Previous approaches to multi-robot manipulation have relied on explicit communication between robots, leader-follower strategies, and prior information about the environment, physical properties of the payload, and configuration of robots around the payload [5-7]. In other approaches, collective transport is indirectly effected through the pushing efforts of robots that make contact with the load while responding to an external stimulus [4,8,9].
This study considers group transport of artificial payloads by Novomessor cockerelli, a species of desert ant that is capable of highly coordinated, stable transport of large food items in teams [10]. An experimental study of these ants [11] found that the steady-state load transport speed decreased with increasing team size, even when the load weight per ant was held constant. The authors speculated that this effect may be caused by variation in ant orientation with respect to the payload. That is, teams usually distribute themselves along the load's perimeter, and the force that an ant can apply may vary according to its position. For instance, a dynamical model of the ant data in [11] successfully reproduced transport dynamics by assuming that ants on the leading edge of the load pulled and lifted, while ants on the trailing edge only lifted [12]. Effects such as these might be expected to reduce speed for larger team sizes, if additional transporters are required to occupy less advantageous positions. However, this idea was not supported by a more recent experimental study in which all transporters adopted the same position [13]. This was achieved by requiring ants to move a rectangular load by pulling on strings attached to one of its sides, as shown in figure 1a. The results confirmed the relationship observed previously, in which load speed decreased as team size increased and per capita load remained constant.
Buffin et al. [13] also noted a large variance in the speed of single ants towing a load by themselves, suggesting that the members of transport teams vary in the maximum speeds and pulling forces they can attain. They performed simulations that assembled virtual teams from random samples of a normally distributed population of maximum individual speeds. If each team was assumed to have the speed of its slowest member, the simulations reproduced the observed decline in transport speed with team size [13]. However, that study did not examine the coordination mechanisms that would allow a team to regulate its speed in this manner.
In the present study, a multi-robot testbed is used to test hypotheses about ant group transport behaviours in order to gain insight into the observed relationship between load speed and team size [11,13]. It is assumed that all the ants in the transport team know the direction to the goal (the ants' nest) and can navigate even while moving backwards [14], but that they have no knowledge of the load characteristics, the number of teammates or the teammates' locations on the load relative to its centre of mass. Section 2 describes the multi-robot testbed, figure 1b, that was designed to mimic experiments that show a decrease in ant transport speed with increasing team size [13]. Section 3 presents two candidate models of collective towing behaviour in an effort to reproduce this observed trend. The first model assumes a team of ants with heterogeneous maximum speeds that pull on the load with continuous, variable forces, requiring the team to move at the speed of the slowest member for stable transport to occur. The second model assumes a homogeneous team of ants that pull on the load with intermittent, identical forces as they take uncoordinated steps backward toward the nest. For both models, the average steady-state transport speed is predicted and compared to the ant data. Section 4 proposes a decentralized robot controller for cooperative towing that can be implemented on a team of robots with heterogeneous maximum speeds. The controller is based on a reinforcement learning algorithm that uses only stigmergic feedback, similar to the type of information that would be available to the ants. This controller was implemented on teams of two to four Pheeno robots [15] that cooperatively towed a rectangular payload, along with a second controller that produced intermittent, uncoordinated pulling forces. Section 5 presents the steady-state transport speeds in these robot experiments and compares them to the transport speeds observed in the ant experiments. 
Finally, §6 discusses possible causes and advantages of the observed transport strategy in N. cockerelli, and §7 summarizes the results and outlines directions for future work.

Materials and methods: multi-robot experiments
Collective towing experiments were conducted with teams of two, three and four Pheeno robots. Each team pulled a load in a parallel configuration designed to mimic the arrangement of pulling ants in the experiments of Buffin et al. [13]. Figure 1b shows an overhead snapshot of the experimental set-up with three robot transporters. The payload in these experiments was a 76.2 × 10.2 × 10.2 cm L-shaped acrylic frame weighing 500 g. A three-dimensional-printed plastic basket attached to the frame allowed different masses to be added. A sensor suite was designed for each robot to measure the force vector that it applied to the load throughout the transport process. The sensor suite consisted of a three-dimensional-printed sliding pressure plate mounted on a potentiometer that allowed 270° of rotation. A circular force-sensitive resistor was used to measure the magnitude of the force that the pressure plate exerted due to the robot's pulling force, and the potentiometer determined the direction of the force with respect to the load's orientation. The sensor suites were affixed to the payload, and a hemp cord connected each robot to a sensor suite, as shown in figure 2. The experiments were filmed with an overhead camera (Microsoft Life Cam, resolution of 720p) at a rate of 30 frames per second. The robots and payload were marked with two-dimensional binary identification tags to enable real-time tracking of their positions and orientations by the overhead camera.
Two models of ant behaviours and the resulting payload speed

Ants with continuous, adaptive pulling forces
In the ant data plotted in figure 3, individual transport speed was significantly faster than transport by teams of two ants ($p_{1,2} = 0.014$), three ants ($p_{1,3} = 1.60 \times 10^{-5}$) or four ants ($p_{1,4} = 6.34 \times 10^{-6}$), but there was no significant difference among teams of different size ($p_{2,3} = 0.70$, $p_{2,4} = 0.42$ and $p_{3,4} = 0.96$). The plot also reveals a large variance in transport speed by single ants. As noted in Buffin et al. [13], this suggests that an ant transport team can be modelled as a heterogeneous group whose members are capable of different maximum speeds and pulling forces. Such a team would be able to cooperatively tow the load only as fast as the speed of its slowest member, requiring coordination among the teammates. In this model, it is assumed that an ant's maximum towing speed is distributed according to a Gaussian probability density function (pdf), $f(v)$, with corresponding cumulative distribution function (cdf), $F(v)$ (see the electronic supplementary material). Under this assumption, a Gaussian distribution was fitted to the data in figure 3 on the steady-state towing speed of a single ant ($\mu = 0.7$ cm s$^{-1}$, $\sigma = 0.36$ cm s$^{-1}$). The expected speed of the slowest member of a transport team with $N$ members was then computed using order statistics [16]. The expected value of the $r$th-order statistic $X_{(r)}$ from $n$ samples of the distribution $f(v)$, where $0 < r \le n$, is given by
$$E[X_{(r)}] = \frac{n!}{(r-1)!\,(n-r)!} \int_{-\infty}^{\infty} v\, F(v)^{r-1} \big(1 - F(v)\big)^{n-r} f(v)\, \mathrm{d}v. \tag{3.1}$$
The variance of the $r$th-order statistic is
$$\mathrm{Var}[X_{(r)}] = \frac{n!}{(r-1)!\,(n-r)!} \int_{-\infty}^{\infty} \big(v - E[X_{(r)}]\big)^2 F(v)^{r-1} \big(1 - F(v)\big)^{n-r} f(v)\, \mathrm{d}v. \tag{3.2}$$
The speed of the slowest member of a team is distributed according to the first-order statistic ($r = 1$), for which equations (3.1) and (3.2) simplify to
$$E[X_{(1)}] = n \int_{-\infty}^{\infty} v\, \big(1 - F(v)\big)^{n-1} f(v)\, \mathrm{d}v \tag{3.3}$$
and
$$\mathrm{Var}[X_{(1)}] = n \int_{-\infty}^{\infty} \big(v - E[X_{(1)}]\big)^2 \big(1 - F(v)\big)^{n-1} f(v)\, \mathrm{d}v. \tag{3.4}$$
Figure 3. Steady-state transport speed of ant teams versus team size, from the data of [13]. The circles with error bars represent the mean ± standard deviation across 30 experimental trials for one ant, 24 for two ants, 25 for three ants and 20 for four ants. Red plots: mean ± standard deviation of the first-order statistic of $n = 2, 3, 4$ samples from a normal distribution that is fit to individual ant transport speed data.
rsos.royalsocietypublishing.org R. Soc. open sci. 5: 180409
The mean and standard deviation of the ant transport speed for each team size are very close to those of the corresponding first-order statistics. This similarity supports the hypothesis that ant transport teams move at the speed of the slowest teammate, which implies that the ants can coordinate transport in a decentralized fashion without explicit communication, information about the payload, or knowledge of the team size and configuration.
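The first-order statistic described above can be evaluated numerically. The following is a minimal sketch, not the authors' code: it assumes the fitted Gaussian parameters quoted in the text (mean 0.7 cm s$^{-1}$, standard deviation 0.36 cm s$^{-1}$) and integrates $n\,v\,(1 - F(v))^{n-1} f(v)$ with a trapezoid rule. The function name `expected_min_speed`, the integration range of eight standard deviations and the step count are illustrative assumptions.

```python
from statistics import NormalDist


def expected_min_speed(n, mu=0.7, sigma=0.36, steps=20000):
    """Expected value of the minimum of n Gaussian samples (trapezoid rule).

    mu and sigma default to the single-ant fit quoted in the text (cm/s).
    The integration range and step count are assumptions for this sketch.
    """
    dist = NormalDist(mu, sigma)
    lo, hi = mu - 8.0 * sigma, mu + 8.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        v = lo + i * h
        # Integrand of the first-order statistic: n * v * (1 - F)^(n-1) * f
        integrand = n * v * (1.0 - dist.cdf(v)) ** (n - 1) * dist.pdf(v)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * integrand
    return total * h


if __name__ == "__main__":
    for team_size in (1, 2, 3, 4):
        print(team_size, round(expected_min_speed(team_size), 3))
```

For a single transporter the result reduces to the fitted mean, and it decreases as the team grows, which is the trend the model attributes to the slowest teammate.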

Ants with intermittent, constant pulling forces
Besides the ants' variability in speed, their unsynchronized gaits during transport could have contributed to the decrease in load speed with increasing team size. The ants in a transport team were observed to step backward while towing the load, and their out-of-phase stepping could have coincided with their application of intermittent, uncoordinated forces on the load. In this section, a dynamical model is developed to investigate the effect of this hypothetical behaviour on the steady-state transport speed.
Collective transport behaviours have previously been studied with the Kilobot robotic platform [17] and the Bristlebot, Hexapod and mTug platforms [18]. Rubenstein et al. [17] found that the transport speed remains the same regardless of team size when the payload mass per robot is kept constant. However, that study did not take into account out-of-phase stepping by the robots. Christensen et al. [18] investigated the effect of uncoordinated stepping on effective force per robot during group towing. They found no effect for the case where the robots' gait is non-impulsive; i.e. the case where each robot's contact time with the ground, during which it exerts a pulling force on the load, is not extremely short compared with the stride period of its gait. Here, the dynamic model presented in [17] is combined with the gait consideration discussed in [18] to predict the effect of out-of-phase stepping on the steady-state transport speed.
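During the experiments, the ants transported the load along an approximately straight path and produced very little load rotation. Hence, the load dynamics can be modelled as translation in the plane using Newton's second law of motion:
$$\sum_{i=1}^{N} \mathbf{F}_i + \mathbf{F}_f = m\mathbf{a}, \tag{3.5}$$
where $\mathbf{F}_i \in \mathbb{R}^2$ is the force applied to the load by ant $i$, $N$ is the number of ants, $\mathbf{F}_f \in \mathbb{R}^2$ is the kinetic frictional force on the load, $m$ is the load's mass and $\mathbf{a} \in \mathbb{R}^2$ is the load's acceleration. Using an ideal motor assumption to relate force to velocity, the force applied by ant $i$ can be modelled by the linear relation
$$\mathbf{F}_i = K\,(v_{max}\,\hat{\mathbf{x}} - \mathbf{v}_i), \tag{3.6}$$
where $K > 0$ is a constant gain, $v_{max}$ is the maximum ant speed under no load, $\mathbf{v}_i \in \mathbb{R}^2$ is the velocity of ant $i$ and $\hat{\mathbf{x}} \in \mathbb{R}^2$ is the direction of the load transport. Under the assumption that the load is in static equilibrium in the vertical direction, the frictional force is given by
$$\mathbf{F}_f = -\mu_k m g\, \hat{\mathbf{x}}, \tag{3.7}$$
where $\mu_k$ is the coefficient of kinetic friction of the load on the ground, and $g$ is the acceleration due to gravity.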
In this model, all ants move at the same speed. Since the ants pull on strings to tow the load, they are rigidly attached to the load when applying force (i.e. the ants may not move faster than the load). This model assumes that a transporting ant will not accumulate enough slack in its string such that its pulling effort does not affect the payload, as was evident in the ant transport videos. During transport, the load stops immediately when the ants stop pulling it. This allows for the use of the quasi-static motion assumption [19,20], implying that the load velocity $\mathbf{v}_L \in \mathbb{R}^2$ is in the same direction as the net force applied by the ants. These assumptions simplify the sum over the forces defined in equation (3.6) to
$$\sum_{i=1}^{N} \mathbf{F}_i = N K\,(v_{max} - \|\mathbf{v}_L\|)\,\hat{\mathbf{x}}. \tag{3.8}$$
Inserting this total applied force into equation (3.5), along with the frictional force defined in equation (3.7), and solving equation (3.5) for $\|\mathbf{v}_L\|$ at steady state ($\mathbf{a} = 0$) yields
$$\|\mathbf{v}_L\| = v_{max} - \frac{\mu_k m g}{N K}. \tag{3.9}$$
To include the effect of asynchronous ant gaits, the model can incorporate a probability $p_f = t_c/t_s$ that an ant applies a pulling force at any given time, where $t_c$ is its contact time with the ground and $t_s > t_c$ is the period of its gait, as defined in [18]. In the case where $N$ ants pull in parallel with their steps beginning at independent, uniformly random starting times between 0 and $t_s$, the number of transporters $n \le N$ that apply force at the same time is described by the binomial distribution, $n \sim B(N, p_f)$. Then, the average total force exerted by the ants on the load is given by
$$\bar{\mathbf{F}} = p_f N K\,(v_{max} - \|\mathbf{v}_L\|)\,\hat{\mathbf{x}}. \tag{3.10}$$
Inserting this total applied force into equation (3.5) and solving for the steady-state load velocity results in an equation similar to equation (3.9):
$$\|\mathbf{v}_L\| = v_{max} - \frac{\mu_k m g}{p_f N K}. \tag{3.11}$$
What is important to note in equation (3.11) is that the term producing a slowing effect, $\mu_k m g/(p_f N K)$, includes the load mass $m$ in the numerator and the team size $N$ in the denominator.
Therefore, if the ratio m/N of the load mass to the team size is kept constant, as was done in the ant experiments, then the steady-state load speed should remain the same regardless of the team size, as observed in [17]. This contradicts the observed trend in load speed from the ant experiments.
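The invariance argument can be checked with a few lines of arithmetic. The sketch below evaluates the steady-state speed $v_{max} - \mu_k m g/(p_f N K)$ for teams of two to four transporters while holding the mass per transporter fixed. All numerical values (maximum speed, gain, friction coefficient, duty factor, mass per transporter) are hypothetical and chosen only for illustration; the function name `steady_state_speed` is likewise an assumption.

```python
def steady_state_speed(n_transporters, load_mass, v_max, gain, mu_k, p_f=1.0, g=9.81):
    """Steady-state load speed v_max - mu_k*m*g / (p_f*N*K), floored at zero.

    Units are SI (m/s, kg, N s/m). p_f = 1 recovers the continuous-pulling case.
    """
    slowdown = mu_k * load_mass * g / (p_f * n_transporters * gain)
    return max(0.0, v_max - slowdown)


# Hypothetical parameters for illustration only (not taken from the experiments).
V_MAX = 0.08                 # m/s, unloaded maximum speed
GAIN = 50.0                  # N s/m, ideal-motor gain K
MU_K = 0.3                   # coefficient of kinetic friction
P_F = 0.5                    # fraction of the gait period spent pulling
MASS_PER_TRANSPORTER = 0.5   # kg

constant_ratio_speeds = [
    steady_state_speed(n, MASS_PER_TRANSPORTER * n, V_MAX, GAIN, MU_K, P_F)
    for n in (2, 3, 4)
]
fixed_mass_speeds = [
    steady_state_speed(n, 1.0, V_MAX, GAIN, MU_K, P_F) for n in (2, 3, 4)
]

if __name__ == "__main__":
    print("constant m/N:", constant_ratio_speeds)
    print("fixed m:     ", fixed_mass_speeds)
```

With the ratio $m/N$ held constant the three speeds coincide up to rounding, whereas with a fixed total mass the speed rises with team size, so neither case produces the decline seen in the ant data.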

Design of robot controllers for adaptive, continuous pulling
One possible method for a heterogeneous transport team to adjust to the speed of its slowest member is through learning based on stigmergic feedback; in this case, measurements of changes in the load's rotation in response to forces applied by all the transporters. As stated in §1, it is assumed that the ants all know the direction to their nest and that they tow the load in this direction along a straight line. Under these assumptions, the ants' configuration on the load during the experiments (figure 1a) would have resulted in a net zero moment on the load if all ants pulled with identical forces. Any disparities in towing force would produce differences in the ants' maximum possible steady-state towing speeds as well as rotation of the transported load. However, significant rotation of the payload was not observed in the ant experiments after the transport reached steady state (i.e. the load oscillated slowly but did not rotate, which would have indicated that one ant was moving faster than the others). We hypothesize that the ants detect small rotations of the load or rotational forces on the load, and that they act to reduce these before they become large. We also assume that the ants do not know their location on the load with respect to its centre of mass; i.e. they have no a priori knowledge of how changing their speed of transport will affect the rotation of the load. Thus, they cannot be using a simple feedback mechanism to reject the load's rotation, since they do not know before transport whether they should speed up or slow down when sensing a change in the load orientation. Ants could have implicitly communicated their differences in towing speed by measuring the collective effect of these differences on the load's orientation and rotational dynamics during transport. To illustrate, figure 4 shows a cooperative towing scenario with two robots. Robot $R_1$ is moving faster than robot $R_2$, causing the load to rotate in the clockwise direction.
If robot $R_2$ were moving faster than robot $R_1$, then the load would rotate in the counterclockwise direction. By making an association between the load's direction of rotation and the required speed change to reject the rotation, and by measuring its own pulling force, each robot should be able to learn the conditions under which it should speed up or slow down in order to move at the same speed as the rest of the transport team. These conditions do not require the robot to know its location on the load. When there is an incentive for each robot to move as fast as possible, the competing objectives of preventing load rotation and transporting the load quickly will cause the speeds of the faster team members to oscillate around the speed of the slowest member when it is moving at its maximum speed.
A real-time reinforcement learning algorithm was developed to implement this behaviour on robots. The algorithm drives a team of robots with heterogeneous maximum speeds to transport a load at the speed of its slowest member. Each robot runs the two-layer neural network shown in figure 5. There is no distinction between a learning phase and an exploitation phase; instead, an ε-greedy algorithm is chosen to adapt to the unpredictable and changing pulling forces during transport. This is done to allow learning errors that can be made during the start of the transport to be corrected.
For this application, the goal is for the robots to learn the association between the direction of the load's rotation and the necessary adjustments to the individual speeds of the transporters to reject this rotation and keep the load at its initial orientation. This learning is done at discrete times, where, for simplicity, the times are defined in unit increments. The input vector to the neural network of robot $k$ at time $t$, $\mathbf{p}_k(t) = [p_{k,1}\; p_{k,2}\; p_{k,3}\; p_{k,4}]^T = [\mathrm{sign}(\theta)\; \mathrm{sign}(\dot{\theta})\; \mathrm{sign}(\ddot{\theta})\; \mathrm{sign}(\dot{v}_k)]^T$, contains the directions of the load's orientation $\theta$, angular velocity $\dot{\theta}$, and angular acceleration $\ddot{\theta}$, as well as the direction of the change in the robot's velocity $\dot{v}_k$, all at time $t$. The output vector of the neural network of robot $k$ is defined as $\mathbf{a}_k = [a_{k,1}\; a_{k,2}]^T = [v_{k+}\; v_{k-}]^T$, whose entries indicate a binary decision for the robot to speed up or slow down. Each robot $k$ computes the value $a_{k,j}(t)$ of its output neuron $j \in \{1, 2\}$ at time $t$ as
$$a_{k,j}(t) = \sum_{i=1}^{4} w_{k,ij}(t)\, p_{k,i}(t), \tag{4.1}$$
where $w_{k,ij}(t)$ is the weight value of the connection between the robot's $i$th input neuron and $j$th output neuron. An ε-greedy algorithm is applied to determine the action taken by the robot. With probability ε, the robot randomly chooses to speed up or slow down without using the neural network to make a decision. Otherwise, the output neuron with the larger value is chosen as the winner. If $v_{k+} \ge v_{k-}$, then robot $k$ speeds up; otherwise, it slows down, according to the velocity controller of equation (4.2), in which $K_a$ and $K_v$ are constant gains, $\theta_{max} = \pi$, $v_{k,max}$ is the robot's maximum possible towing speed, and $F_k$ is the magnitude of the pulling force applied by the robot. When the robot measures $F_k = 0$ during some time period, it speeds up but does not update its weights $w_{k,ij}$ (i.e. it does not learn) during that period.
The velocity controller in equation (4.2) drives inherently faster robots with higher maximum speeds to be more agile than slower robots, since the quantity $(1 - v_k/v_{k,max})$ is larger for faster robots for a given robot speed $v_k$, producing a higher acceleration. This component of the controller prevents the robots from all making the decision to change their transport speeds by the same amount, which would keep their relative speeds constant and, therefore, maintain the direction and speed of the load's rotation. The robots would not learn anything substantial in this scenario, since any action they take would not affect the load's rotation and every decision would be penalized according to the reward function described in the next paragraph, even if it were the correct one. The controller also makes larger adjustments to the robots' accelerations when the payload is far from its initial orientation. The gains $K_a$ and $K_v$ weight the competing objectives of maintaining the load in its original orientation and moving the load as fast as possible.
Figure 4. A two-robot towing scenario. Robot $R_1$ is moving faster than robot $R_2$, causing the load to rotate by angle $\theta$ and the towing string of robot $R_2$ to go slack.
After taking an action of speeding up or slowing down, each robot computes a reward that is based on the resulting change in the load's orientation. The reward function $E_k(t)$ for robot $k$ at time $t$ can be interpreted as follows. When a robot's action causes the load to rotate toward its initial orientation ($\mathrm{sign}(\theta\dot{\theta}) < 0$), return to its initial orientation ($\theta = 0$), or stop rotating ($\dot{\theta} = 0$), the robot is rewarded based on the ratio between its current and maximum speeds and the discrepancy that it measures between the load's current and initial orientations. The robot is penalized based on this discrepancy if its action causes the load to rotate away from its initial orientation ($\mathrm{sign}(\theta\dot{\theta}) > 0$). The difference in reward values $E_k(t-1)$ and $E_k(t)$ at times $t-1$ and $t$, respectively, determines the adjustments to the neural network weights according to the Instar rule [21]. Reward constants $r_{k,j}(t)$ are defined at time $t$ such that only the output neuron associated with the chosen action (the neuron $j$ with $a_{k,j} > a_{k,l}$, $l \ne j$) is rewarded or penalized, and only when the change in reward exceeds a threshold $\tau$ in magnitude; otherwise, $r_{k,j}(t) = 0$. The threshold value $\tau$ is chosen to differentiate between sensor noise and a significant reward change. The weights $w_{k,ij}$ are then updated using these reward constants, and the process is repeated. A flowchart of the robot behaviour is shown in figure 6.
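The decision step of the learning algorithm can be illustrated with a short sketch. It is not the robots' firmware: it assumes that each output neuron is a plain weighted sum of the four sign inputs with no activation function, and it covers only action selection, not the velocity controller, the reward function or the weight update. The names `choose_action` and `sign`, and the layout of `weights` as a 4 × 2 nested list, are hypothetical.

```python
import random


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def choose_action(weights, theta, theta_dot, theta_ddot, v_dot, epsilon=0.2, rng=random):
    """Epsilon-greedy choice between 'speed_up' and 'slow_down'.

    weights[i][j] is the weight from input neuron i (0..3) to output neuron j
    (0 = speed up, 1 = slow down). The weighted-sum output is an assumption.
    """
    inputs = [sign(theta), sign(theta_dot), sign(theta_ddot), sign(v_dot)]
    # With probability epsilon, explore: ignore the network entirely.
    if rng.random() < epsilon:
        return rng.choice(["speed_up", "slow_down"])
    outputs = [sum(weights[i][j] * inputs[i] for i in range(4)) for j in range(2)]
    # Ties are resolved in favour of speeding up (v_k+ >= v_k-).
    return "speed_up" if outputs[0] >= outputs[1] else "slow_down"


if __name__ == "__main__":
    example_weights = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    print(choose_action(example_weights, 0.1, 0.0, 0.0, 0.0, epsilon=0.0))
```

Setting `epsilon=0.0` makes the choice purely greedy, which is convenient for inspecting what a given weight matrix has learned.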

Continuous, adaptive pulling
The learning algorithm described in §4 was implemented on teams of two to four Pheeno robots to investigate whether learning through stigmergic feedback could cause a transport team to tow the payload at the speed of the slowest member. In these experiments, the mass of the load was maintained at 500 g, which a single robot is capable of towing. This was done to ensure that the decrease in payload speed could be attributed entirely to the learning algorithm, not to the experimental set-up. To imitate the ants' directionality of transport toward the nest, all the robots were assigned to drive in the same direction. The robots changed their speed according to the learning algorithm at a rate of 0.5 Hz. Each robot was programmed with a probability $\varepsilon = 0.2$ of choosing a random action, a forgetting rate $\gamma = 0.02$, a learning rate $\alpha = 0.1$, a significance threshold value $\tau = 0.5$, and controller gains $K_a = 1$ and $K_v = 0.8$. Ten trials were run for each team size. The teams consisted of members with different maximum speeds: [4, 12] cm s$^{-1}$ for the two-robot team and [4, 6, 9, 12] cm s$^{-1}$ for the four-robot team (figure 7). The 95% confidence interval of the average load velocity in figure 7 includes the value 4 cm s$^{-1}$ throughout the duration of towing. There are slight discrepancies between the robots' reference velocities and their actual velocities, which were measured from the robot locations tracked by the overhead camera. These differences were caused by camera error, wheel slip and tag placement error.

Intermittent, constant pulling
To simulate out-of-phase stepping during transport as described in §3.2, towing experiments with two to four Pheeno robots were run in which the robots' motors were periodically turned on and off to mimic the contact and swing phases of an ant's gait. The robots' motors were turned on for 1.5 s during a total step time of 3 s. The robots' start times for each step were randomly drawn from a uniform distribution. Each robot 'stepped' with the same maximum velocity of 8 cm s$^{-1}$. The load mass was scaled with the number of robots in the team, with a constant mass of 500 g per robot. Ten trials were run for each team size. Figure 8 shows that the average steady-state payload speed during these experiments matches the value predicted by the model in equation (3.11). There was no significant difference in speed among teams of different size (ANOVA: $F_{2,27} = 0.017$, $p = 0.98$). Thus, it can be concluded that out-of-phase stepping is not the primary cause of the decrease in steady-state transport speed with respect to team size that is observed in N. cockerelli.
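The duty cycle used in these experiments (motors on for 1.5 s of every 3 s step) corresponds to a pulling probability of 0.5 per robot. The following Monte Carlo sketch, written under the assumption of independent, uniformly random step phases, estimates the average number of robots pulling at a randomly chosen instant; it should approach the binomial mean, team size times pulling probability. The function name `mean_pulling_count`, the sample count and the seed are illustrative assumptions.

```python
import random


def mean_pulling_count(n_robots, t_contact=1.5, t_step=3.0, samples=20000, seed=0):
    """Estimate the mean number of robots pulling at a random instant.

    Each robot pulls for t_contact seconds at the start of every t_step-second
    step, and step start times are drawn independently and uniformly.
    """
    rng = random.Random(seed)
    pulling_total = 0
    for _ in range(samples):
        t = rng.uniform(0.0, t_step)  # observation time within one period
        for _ in range(n_robots):
            phase = rng.uniform(0.0, t_step)  # random start of this robot's step
            # Time elapsed since the robot's most recent step began.
            if (t - phase) % t_step < t_contact:
                pulling_total += 1
    return pulling_total / samples


if __name__ == "__main__":
    for team_size in (2, 3, 4):
        print(team_size, mean_pulling_count(team_size))
```

Because the expected number of pulling robots scales linearly with team size, the average applied force per unit of load mass is unchanged when the mass per robot is held constant, in line with the prediction of the intermittent-pulling model.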
In addition, out-of-phase stepping was tested on teams of Pheeno robots with different maximum speeds. Under these conditions, the robot teams exhibited uncoordinated transport in which the load was pulled into the slowest robot, which was then dragged along by the efforts of the other robots. This type of behaviour was never observed during the ant towing experiments. Wheel slip ultimately stalled the robots or made them uncontrollable and prevented further transport of the load.

Discussion
The statistical analysis in §3.1 of data on collective towing by Novomessor cockerelli ants and the results of the multi-robot towing experiments described in §5.1 support the conclusion that ants adjust their speeds during collective transport to accommodate the slowest member of the transport team. Hence, robotic implementation of ant-like transport strategies could enable adaptive, decentralized transport by robot teams.
Previous studies have addressed other scenarios where group efficiency is a sublinear function of group size. Krieger et al. found that in a group of robots mimicking an ant foraging behaviour, the contribution of each individual to the foraging process decreased as the group size increased due to more time spent avoiding collisions [22]. Lerman & Galstyan also discovered this sublinear relationship in foraging behaviours and predicted an optimal group size for foraging before interference among individuals begins to lower the net group performance [23]. In contrast to these group foraging examples, the sublinear relationship between transport speed and team size that has been investigated here appears to be caused by heterogeneity of teammate capabilities rather than interference arising from uncoordinated individual behaviours (e.g. out-of-phase stepping, unaligned pulling angles or avoidance of fellow transporters).
The differences in individual N. cockerelli speeds could be due to variation among ants in fatigue, health or age. However, they may also reflect behavioural differences with implications for colony function [24]. Behavioural disparities may arise from genetic variation among ants or from differences in nutrition, developmental environment or experience. Individual differences can also be amplified via positive feedback generated by social interactions among colony members [25]. Potential benefits of interindividual variation include improved resistance to pathogens and parasites, more efficient allocation of distinct tasks among workers, and more robust response to perturbations [26,27]. For example, honeybee colonies do better at keeping nest temperature within optimal bounds when their workers vary in the temperature threshold that triggers thermoregulatory behaviour [28]. Whether differences in speed among individual N. cockerelli ants have similar effects on colony function remains to be investigated.
Figure 7. Results of towing experiments with the reinforcement learning algorithm (panel titles include: two-robot team, maximum individual speeds [4, 12] cm s$^{-1}$; four-robot team, maximum individual speeds [4, 6, 9, 12] cm s$^{-1}$). The plots display the time evolution of the load and robot velocities for each team size. In all plots, the black line is the maximum reference speed of the slowest team member. The first row shows the average load velocity across ten trials, with the solid blue line and shaded area representing the mean and 95% confidence interval, respectively. The second row shows the measured velocities of the robots and the load during a single experimental trial, and the third row shows the corresponding reference velocity that was calculated by each robot using the learning algorithm.
Cooperative transport allows colonies to harvest large food items that would otherwise require dissection into pieces small enough for single ants to carry [11]. In some ant species, particularly Eciton burchellii and other swarm raiders, transport has the further advantage of being superefficient. That is, groups can carry more weight per ant than solitary transporters, without any loss in speed [1,29]. This property helps swarm raiders achieve the high rates of food capture needed to support their large, fast-growing colonies. N. cockerelli, in contrast, have much smaller colonies and workers forage largely on their own. They also compete with mass-recruiting ant species, such as Forelius foetidus and Solenopsis xyloni, that can readily displace them from rich food items [10]. This competition may not afford N. cockerelli the opportunity to wait for the fastest foragers to retrieve a food item before it is claimed by other ants.
A heterogeneous transport team that adapts to its slowest member will move the load more slowly as the team size increases; however, this strategy also has various benefits. If this adaptation did not occur, the fastest individuals would need to overwhelm the efforts of the slowest members and handle a heavier portion of the load, possibly dragging the slower teammates during the transport. By accommodating the slowest member, large transport teams can use the strength of all teammates, not just the fastest ones, and thus can transport heavier loads. Another advantage may arise when the load must be transported over different types of terrain. A large transport team could apply a high net force to a load in order to pull it up a hill or over an obstacle. The presence of slow members in the team could potentially stabilize this manoeuvre since, from a control-theoretic perspective, less aggressive systems are easier to control. Therefore, an ant-like strategy could potentially increase the robustness of the transport to changing, unanticipated environmental conditions.
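The statement that a team paced by its slowest member slows down as it grows follows from order statistics: the expected minimum of n independent draws from a common distribution does not increase with n. The Python sketch below illustrates this under the assumption that individual maximum speeds are independent and uniformly distributed between 2 and 12 cm s⁻¹. The distribution, its bounds and the function names are illustrative assumptions and are not fitted to the ant data.

```python
import random
import statistics


def mean_slowest_speed(team_size, trials=20000, low=2.0, high=12.0, seed=0):
    """Monte Carlo estimate of the expected speed of the slowest teammate.

    Each teammate's maximum speed is drawn independently from
    Uniform(low, high). This distribution is an assumption made
    for illustration only.
    """
    rng = random.Random(seed)
    slowest = [
        min(rng.uniform(low, high) for _ in range(team_size))
        for _ in range(trials)
    ]
    return statistics.fmean(slowest)


def exact_mean_minimum(team_size, low=2.0, high=12.0):
    """Closed form for the uniform case: E[min] = low + (high - low) / (n + 1)."""
    return low + (high - low) / (team_size + 1)


if __name__ == "__main__":
    # Print team size, simulated mean and closed-form mean of the slowest speed.
    for n in range(1, 5):
        print(n, round(mean_slowest_speed(n), 2), round(exact_mean_minimum(n), 2))
```

Under these assumptions the mean speed of the slowest member falls monotonically with team size, which is the qualitative trend discussed above. Other speed distributions would change the numbers but not the direction of the trend.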

Conclusion
In this work, we have investigated the observation that the transport speed of a load towed by several Novomessor cockerelli ants decreases as a function of the team size, even with the same load mass per ant. A control approach in which a homogeneous team of robots pulls on the load with intermittent, constant forces was tested as a possible ant towing behaviour. A dynamical model was used to predict the steady-state speed of a load that is transported in this manner by a team of known size. The predicted load speed was independent of team size, as long as the mass per transporter remained constant, and the prediction was verified through experiments with teams of two to four small mobile robots. Using order statistics, it was found that the load speed decrease can be attributed to the heterogeneous abilities of individual ants. Since ants within a given colony may move at a range of possible speeds, as a result of differences in characteristics such as age, energy and ability to orient during transport, it was hypothesized that an ant transport team must move at the speed of the slowest member for successful load transport. Owing to their biological limitations, the ants would need to identify the slowest member without explicit communication or global information. To implement a multi-robot towing strategy with these constraints, a real-time reinforcement learning algorithm was developed that relies on implicit communication through the load and local measurements by each robot. This algorithm was tested on teams of two to four robots with significantly different individual maximum speeds. The experimental results supported the hypothesis that the ants are towing the load at the speed of the slowest team member. The experiments on collective transport described in this paper have focused on one-dimensional payload transport by both ants and robots in flat environments with no obstacles.
In the future, experiments and analyses should be conducted for transport along two-dimensional trajectories through more complex environments. To more closely emulate the ant behaviours, multi-robot experiments can be performed using legged robots that are closer in design to the ants' anatomy. In addition, a rigorous analysis of the reinforcement learning algorithm presented in this work is needed to characterize the existence and stability of equilibrium payload speeds. This analysis would provide theoretical guarantees on the transport dynamics, and therefore enable more confidence in the algorithm's effectiveness in a wide range of scenarios.
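The team-size invariance of the intermittent-pulling model summarized in the Conclusion can be illustrated with a minimal simulation. The sketch below is not the dynamical model used in this work. It assumes a hypothetical one-dimensional load of mass n·m0 that is subject to viscous drag proportional to its mass, with each of the n transporters independently applying a constant force F with probability p at every time step. Under these assumptions the mean steady-state speed is pF/(b·m0), which does not depend on n. All parameter names and values are illustrative.

```python
import random


def simulate_mean_speed(team_size, mass_per_robot=0.1, pull_force=0.5,
                        duty=0.5, drag_per_mass=5.0, dt=0.01,
                        steps=40000, seed=1):
    """Time-averaged load speed under intermittent, unsynchronized pulling.

    Assumed (hypothetical) dynamics, integrated with forward Euler:
        M dv/dt = k(t) * F - b * M * v,   M = team_size * mass_per_robot,
    where k(t) is the number of transporters pulling at time t.
    """
    rng = random.Random(seed)
    mass = team_size * mass_per_robot
    v = 0.0
    total = 0.0
    counted = 0
    for step in range(steps):
        # Each transporter decides independently whether to pull this step.
        pulling = sum(1 for _ in range(team_size) if rng.random() < duty)
        net_force = pulling * pull_force - drag_per_mass * mass * v
        v += dt * net_force / mass
        # Average over the second half only, to discard the transient.
        if step >= steps // 2:
            total += v
            counted += 1
    return total / counted


def predicted_speed(mass_per_robot=0.1, pull_force=0.5, duty=0.5,
                    drag_per_mass=5.0):
    """Mean steady-state speed of the assumed model: p * F / (b * m0)."""
    return duty * pull_force / (drag_per_mass * mass_per_robot)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, round(simulate_mean_speed(n), 3), predicted_speed())
```

Because both the mean pulling force and the drag scale linearly with team size when the mass per transporter is fixed, the team size cancels in the force balance. A different friction model would change the predicted value and would need to be checked separately.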
Data accessibility. The data from the robot experiments, comprising the robot input data and overhead camera tracking data, can be accessed through the Autonomous Collective Systems Laboratory GitHub repository at: https://github.com/ACSLaboratory/robot_towing_data_set/. The data from the ant experiments associated with this paper can be found with the publication Buffin et al. [13], which further explains these data.