Understanding the group dynamics and success of teams

Complex problems often require coordinated group effort and can consume significant resources, yet our understanding of how teams form and succeed has been limited by a lack of large-scale, quantitative data. We analyse activity traces and success levels for approximately 150 000 self-organized, online team projects. While larger teams tend to be more successful, workload is highly focused across the team, with only a few members performing most work. We find that highly successful teams are significantly more focused than average teams of the same size, that their members have worked on more diverse sets of projects, and the members of highly successful teams are more likely to be core members or ‘leads’ of other teams. The relations between team success and size, focus and especially team experience cannot be explained by confounding factors such as team age, external contributions from non-team members, nor by group mechanisms such as social loafing. Taken together, these features point to organizational principles that may maximize the success of collaborative endeavours.

shows the Google search volume for GitHub and its competitors, as gathered from Google Trends (https://www.google.com/trends/).

S2 Number of commits per push
A "push" to GitHub can consist of multiple "commits", where a commit is an individual update to the project's git repository. However, most pushes consist of only a single commit (Fig. S2). A commit can contain many or few changes to a repository, and a change that eventually affects only a small piece of a file may have taken an enormous amount of effort to achieve. This variability makes "lines of code" work metrics noisy, although they are still commonly analyzed.

by the lead (w_1/W) decreases on average as team size grows. The lower bound 1/M is almost never reached, indicating that workloads remain "front-loaded" for all team sizes. Figure S3B shows how the average workload per team member decreases as teams grow. The ten curves correspond to deciles of the total workload distribution. They all follow the same functional form, except possibly the highest decile, which decays slightly more slowly. This indicates that the distribution of workload across team members is independent of total workload: more active teams distribute their workloads across their members in the same manner as less active teams.
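The lead's share of work and its lower bound can be illustrated with a minimal sketch. It assumes the lead is the member with the largest work count and that work is available as one count per member; the function name `lead_share` and the example numbers are hypothetical and not taken from the data set.

```python
def lead_share(work_per_member):
    """Return (w_1 / W, 1 / M) for one team.

    work_per_member: list of non-negative work counts, one per member.
    Assumption: the lead is the member with the largest work count.
    """
    total = sum(work_per_member)      # W, total work of the team
    if total == 0:
        raise ValueError("team has no recorded work")
    w1 = max(work_per_member)         # w_1, work done by the lead
    m = len(work_per_member)          # M, team size
    return w1 / total, 1 / m


# Hypothetical team of four members with made-up work counts.
share, lower_bound = lead_share([70, 20, 8, 2])
```

The share equals the lower bound 1/M only when every member contributes the same amount of work, which is why reaching the bound would indicate a perfectly even workload.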

S4 Outliers in success do not skew team focus results
In Fig. S4 we reproduce the results of main text Fig. 2 but withhold the projects in the top 1% of success S. Our results do not change, indicating that the trends observed in main text Fig. 2 are not due to outliers. The magnitude of the trend in panel D does decrease noticeably, but the trend is still evident.

The average work per team member decreases as team size grows. The ten curves correspond to the deciles of the total work distribution. The functional form is identical for each decile, except perhaps the highest decile, which decays slightly more slowly with M, indicating that the way in which work is distributed over a team is independent of the total activity of that team.
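The outlier check of Section S4 amounts to dropping the highest-S projects before repeating the analysis. A minimal sketch of that filtering step follows; it assumes success values are held in a plain list, and the function name `withhold_top` is hypothetical rather than part of the original analysis code.

```python
import math


def withhold_top(success, fraction=0.01):
    """Return the success values with the top `fraction` removed.

    success: list of success values S, one per project.
    The result is sorted in ascending order.
    """
    ranked = sorted(success)
    n_drop = math.ceil(len(ranked) * fraction)   # number of outliers to withhold
    return ranked[: len(ranked) - n_drop]


# Hypothetical example: 200 projects with S = 1, 2, ..., 200.
kept = withhold_top(list(range(1, 201)))
```

Any trend that survives on the retained projects cannot be driven solely by the few most successful ones.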

S5 Experience is weakly related to team success
One of the quantities we used to characterize a team is its experience E, defined as the average number of other teams that members of the team belong to. We found in the main text that E and S are significantly correlated when considered on their own, but that E is not significant when the other variables (total team size, effective team size, diversity of experience, and number of lead members) are included. This was shown using a linear regression model. Here we further underscore these results. In Fig. S5A we find that top teams have significantly higher experience than average teams, but this distinction disappears for M ≥ 7. Figure S5B shows how E and S relate directly, independent of other quantities. We do see a significant increase in S as E grows, but the change in S is far smaller in magnitude than that for diversity D or number of leads L (shown in the main text). Moreover, this effect is confounded with team size, and larger teams may even show a decrease in success as experience grows, although there are insufficient statistics to conclude that a trend exists.
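The definition of E can be made concrete with a short sketch. It assumes team membership is available as a mapping from each user to the set of teams they belong to; the function name `team_experience` and the toy data are hypothetical.

```python
def team_experience(team, memberships):
    """Average number of *other* teams that members of `team` belong to.

    team: name of the team of interest.
    memberships: dict mapping each user to the set of teams they belong to.
    """
    members = [user for user, teams in memberships.items() if team in teams]
    if not members:
        raise ValueError(f"no members found for team {team!r}")
    # For each member, count teams other than the one under consideration.
    other_counts = [len(memberships[user] - {team}) for user in members]
    return sum(other_counts) / len(members)


# Hypothetical membership data.
memberships = {
    "alice": {"alpha", "beta", "gamma"},
    "bob": {"alpha"},
    "carol": {"alpha", "beta"},
    "dave": {"beta"},
}
experience_alpha = team_experience("alpha", memberships)
```

In this toy example the three members of team "alpha" belong to 2, 0, and 1 other teams, so the team's experience is their mean.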

S6 Linear model with secondary team contributions
In Table S1 we present the linear model corresponding to that of the main text but augmented with two additional independent variables: M_ext, the number of users who have submitted at least one pull request to the team, and W_ext, the number of submitted pull requests. These variables are both significantly correlated with team success S. Including them in the linear regression model did not change the significance of any other variable, although it did alter the regression coefficients.

Figure S4: Teams are focused, and top teams are more focused than average teams of the same size. As per main text Fig. 2. Outliers (above the 99th percentile in S) were filtered out to ensure they do not skew the model.

Figure S5: Experience weakly correlates with success. (A) Top teams have higher experience than average teams, but only for team sizes M < 7. (B) As experience grows, success grows on average, but the change in S is quite weak compared to that of diversity and number of lead members (main text). Linear models also indicated that the effects of E are entirely due to the other quantities.
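The augmented model of Section S6 can be written schematically as below. This is a sketch under stated assumptions, not the exact specification: the symbol m_eff for effective team size is hypothetical, and whether any variables were transformed (for example, logged) before fitting is not stated here.

```latex
% Schematic form only; variable transformations and the symbol m_eff are assumptions.
S = \beta_0 + \beta_M M + \beta_m m_{\mathrm{eff}} + \beta_D D + \beta_L L + \beta_E E
    + \beta_{M_{\mathrm{ext}}} M_{\mathrm{ext}} + \beta_{W_{\mathrm{ext}}} W_{\mathrm{ext}} + \varepsilon
```

In this form, the statement that the other significances are unchanged means that the tests on the original coefficients reach the same conclusions once the two external-contribution terms are included.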