A relationship between attractiveness and performance in professional cyclists

Females often prefer to mate with high quality males, and one aspect of quality is physical performance. Although a preference for physically fitter males is therefore predicted, the relationship between attractiveness and performance has rarely been quantified. Here, I test for such a relationship in humans and ask whether variation in (endurance) performance is associated with variation in facial attractiveness within elite professional cyclists that finished the 2012 Tour de France. I show that riders that performed better were more attractive, and that this preference was strongest in women not using a hormonal contraceptive. Thereby, I show that, within this preselected but relatively homogeneous sample of the male population, facial attractiveness signals endurance performance. Provided that there is a relationship between performance-mediated attractiveness and reproductive success, this suggests that human endurance capacity has been subject to sexual selection in our evolutionary past.


Introduction
Choosy females prefer to mate with high quality males, because they make 'good fathers' (direct benefits), and/or because they provide 'good genes' for their offspring (indirect benefits) [1]. One aspect of quality is whole-organism performance, defined as any quantitative measure of how well an organism performs an ecologically relevant, dynamic behaviour [2]. In non-human animals, for example, locomotor performance is often positively associated with fitness [3]. However, whereas the importance of performance in shaping the outcome of male-male interactions has been shown repeatedly, less is known about its importance in the context of female mate choice [2].
In humans, the link between attractiveness and quality has proved elusive [4], and the few studies that have quantified the link between attractiveness and performance typically used a random sample from the general population ([5], but see [6]). In such a sample, there are many variables that affect attractiveness and/or performance, including differences in training and diet, which may obscure or generate associations between the two. Also, the measures of performance employed predominantly capture variation in strength and coordination, rather than endurance, which is more difficult to quantify. However, it has been hypothesized that it is endurance capacity in particular that has been subject to strong selection in our evolutionary past [7,8].
Here, I use data from elite professional cyclists that finished the 2012 Tour de France, generally considered to be one of the hardest endurance events. In this unique subset of the male population, which is relatively homogeneous in terms of training effort and motivation, I test for a relationship between attractiveness and performance. Furthermore, I test whether this relationship is stronger when attractiveness is scored by naturally cycling women than when it is scored by women using a hormonal contraceptive or by men [9].

Material and methods
(a) Measuring attractiveness
Eighty portraits of riders that participated in the 2012 Tour de France, taken on the day before the start of the race, were obtained from http://www.letour.fr, together with their date of birth, nationality, height and weight. Portraits showed the head, neck and part of the shoulders and were standardized in terms of lighting, distance and background.
I created two online surveys, each containing the portraits of 40 riders in a random order, at http://www.fluidsurveys.com. Participants were first asked to rate each rider in terms of attractiveness on a discrete scale from 1 to 5, with 5 being the highest. Before moving on to the next portrait, raters were asked to also provide a masculinity and a likeability score for this rider. Masculinity may be correlated with attractiveness [10] and mediate a relationship between attractiveness and performance, and likeability captures variation in facial expression (i.e. smiling).
In addition, participants provided information on, among other things, their sex and age, and on whether they thought they knew the rider. Furthermore, women were asked whether they used a hormonal contraceptive, and if they did not, for the average length of their cycle and how many days had passed since the start of their last period.
In total, 398 + 418 = 816 people participated, 72% of whom were female (for more demographic information, see the electronic supplementary material, Results). Of 32 468 attractiveness ratings, 282 (0.9%) were excluded because the rater indicated that he or she recognized the rider. For more information on rider selection and data collection, the inference of female fertile phase, variation in facial expression of the riders, and on rider height and weight, see the electronic supplementary material, Methods.

(b) Quantifying performance
To quantify rider performance, I performed a principal component analysis on the time it took for each rider to complete the prologue, the two individual time trials and the complete race (minus the time for the prologue and the time trials). I extracted the first principal component and, to ensure that faster riders had higher values, multiplied it by −1 (for details, see the electronic supplementary material, Methods).
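The following is a minimal sketch of this performance score, not the code used for the analyses. It assumes that the four times are standardized before the principal component analysis (i.e. that the correlation matrix is used), finds the leading eigenvector by power iteration, and uses invented times for five hypothetical riders. All function and variable names are introduced here for illustration only.

```python
import math


def standardize(values):
    """Return z-scores (mean 0, sample standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def first_component_scores(columns, iterations=500):
    """Scores on the first principal component of the correlation matrix.

    The leading eigenvector is found by power iteration, which is an
    assumption of this sketch; any eigen-solver would do.
    """
    z = [standardize(col) for col in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return [sum(vec[j] * z[j][r] for j in range(k)) for r in range(n)]


def performance_scores(columns):
    """Orient the first component so that faster riders score higher."""
    scores = first_component_scores(columns)
    total_time = [sum(col[r] for col in columns) for r in range(len(scores))]
    mean_total = sum(total_time) / len(total_time)
    association = sum(s * (t - mean_total)
                      for s, t in zip(scores, total_time))
    # The sign of a principal component is arbitrary: flip it (multiply
    # by -1) whenever longer times go with higher scores.
    return [-s for s in scores] if association > 0 else scores


# Hypothetical times in seconds for five riders (not data from the study).
prologue = [440, 445, 450, 455, 462]
time_trial_1 = [3050, 3080, 3120, 3150, 3210]
time_trial_2 = [3900, 3960, 3990, 4080, 4150]
rest_of_race = [305000, 306500, 309000, 312000, 316000]

performance = performance_scores(
    [prologue, time_trial_1, time_trial_2, rest_of_race])
print([round(p, 2) for p in performance])
```

In this invented example the first rider is fastest in every stage and therefore receives the highest score. Checking the direction of the component, rather than always multiplying by −1, guards against the arbitrary sign that different eigen-solvers may return.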

(c) Statistical analyses
I fitted linear mixed models by restricted maximum likelihood (REML) to test for systematic differences in attractiveness among riders and among raters, with rider and rater identity as random effects, and assessed their significance using one-sided likelihood-ratio (LR) tests.
I subsequently tested whether performance was a predictor of attractiveness by including performance, as well as various rater-specific variables that might explain additional variation in attractiveness scores. Note that at this stage, no other rider-specific variables (e.g. age or weight) were included, as these might be mediators of a relationship between attractiveness and performance. For all covariates, both linear and quadratic terms were fitted. Rater nationality was fitted as a random effect. I performed backward elimination of non-significant terms, starting with the least significant quadratic terms. Significance of fixed effects was assessed using LR tests on models fitted by maximum likelihood (ML). Parameter estimates of significant terms were obtained from the final model (fitted using REML), and for non-significant terms they were obtained by reintroducing them one-by-one into the final model.
Having estimated the overall effect of performance on attractiveness, other rider-specific variables were added to the model arrived at above, again followed by backward elimination. Note that starting with a full model including all rider- and rater-specific variables resulted in the same final model. The proportion of variance in attractiveness among riders and raters explained by the rider- and rater-specific fixed effects retained in the final model was calculated following [11].
Finally, I tested for rater-specific variation in the relationship between attractiveness and performance by expanding the model arrived at above with a random slope for the regression of attractiveness on performance for each rater, and included an interaction between performance and various rater-specific variables. Note that whereas the effect of performance on attractiveness is tested on the level of the rider (N = 80), interactions between performance and rater-specific variables are tested on the level of the rater (N = 816).
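For concreteness, a random-slope model of this kind could be written as below. The notation is an assumption introduced here for illustration and is not taken from the original analysis; the random effect of rater nationality and the remaining rider- and rater-specific covariates are omitted for brevity.

```latex
y_{ij} = \beta_0 + \beta_1 P_i + \beta_2 G_j + \beta_3 P_i G_j
         + u_i + v_j + w_j P_i + \varepsilon_{ij},
\qquad
u_i \sim N(0, \sigma^2_u), \quad
v_j \sim N(0, \sigma^2_v), \quad
w_j \sim N(0, \sigma^2_w), \quad
\varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)
```

Here $y_{ij}$ is the attractiveness score given to rider $i$ by rater $j$, $P_i$ is the performance of rider $i$, $G_j$ is a rater-specific variable (for a categorical variable such as rater group, one term per non-reference level), $u_i$ and $v_j$ are the random intercepts for rider and rater, and $w_j$ is the rater-specific deviation from the average slope $\beta_1$. In this notation, $\beta_1$ is informed by differences among riders, whereas $\beta_3$ and $\sigma^2_w$ are informed by differences among raters.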
I repeated all analyses for masculinity and likeability, as well as for attractiveness corrected for likeability and vice versa. Residual attractiveness, masculinity and likeability scores were normally distributed. All analyses were run in R v. 3.0.0 [12]. Linear mixed models were run using lme4 0.999999-2 [13].
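The LR tests described in this section compare twice the difference in log-likelihood between two nested models with a chi-square distribution. The sketch below is an illustration under stated assumptions, not the R code used for the analyses: the function names are hypothetical, and only the closed-form upper-tail probabilities for 1 and 3 degrees of freedom are implemented. The example statistics are those reported in the Results.

```python
import math


def lr_statistic(loglik_full, loglik_reduced):
    """Likelihood-ratio statistic for two nested models fitted by ML."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate.

    Only the closed forms for df = 1 and df = 3 are implemented here.
    """
    if x < 0:
        raise ValueError("the statistic must be non-negative")
    root = math.sqrt(x / 2.0)
    if df == 1:
        return math.erfc(root)
    if df == 3:
        return (math.erfc(root)
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")


# Test statistics and degrees of freedom as reported in the Results.
reported = [
    ("effect of performance on attractiveness", 4.58, 1),
    ("slope differs among the four rater groups", 12.5, 3),
    ("men and pill users vs. other women", 11.8, 1),
    ("men vs. pill-using women", 0.18, 1),
    ("fertile vs. non-fertile phase", 0.54, 1),
]

for label, statistic, df in reported:
    print(f"{label}: p = {chi2_sf(statistic, df):.4f}")
```

For the one-sided tests of random effects, a common convention is to halve the p-value obtained with one degree of freedom, because a variance cannot be negative; whether exactly this convention was applied here is an assumption.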

Results
(a) Variation in attractiveness
There is significant variation among riders in attractiveness, with rider ID explaining 28% of the variation in attractiveness scores ($\chi^2_1 = 12709$, $p < 0.001$). Part of this variation is associated with their performance during the 2012 Tour de France, with better performing riders receiving on average higher attractiveness scores ($b = 0.091 \pm 0.043$, $\chi^2_1 = 4.58$, $p = 0.032$, $R^2 = 5.5\%$; figure 1a; electronic supplementary material, figure S1). In those riders that also took part in the 2013 Tour de France, there is a very similar association with their performance in that year (see electronic supplementary material, Results).
Which rider-specific variables shape performance, and which rater-specific variables shape attractiveness scores, is outlined in the electronic supplementary material, Results.

(b) Variation in the relationship between performance and attractiveness
Despite substantial individual variation (see electronic supplementary material, figure S1), the slope of attractiveness on performance differed significantly among female raters using the pill, female raters in the non-fertile part of their cycle, female raters in the fertile part of their cycle and male raters ($\chi^2_3 = 12.5$, $p = 0.006$; figure 1b; electronic supplementary material, figure S1c). Although still positive, the slope was significantly weaker in men and pill-using women ($\chi^2_1 = 11.8$, $p < 0.001$). There was no significant difference in the slope between men and pill-using women ($\chi^2_1 = 0.18$, $p = 0.67$) or between women in the fertile and in the non-fertile part of their cycle ($\chi^2_1 = 0.54$, $p = 0.46$). None of the interactions between other rater-specific variables and performance was significant (see electronic supplementary material, Results).
rsbl.royalsocietypublishing.org Biol. Lett. 10: 20130966

(c) Masculinity and likeability
There was no association between performance and masculinity, whereas there was a positive association between performance and likeability. The significant relationship between attractiveness and performance, as well as the significant difference between men and pill-using women versus non-pill-using women remained when attractiveness was corrected for likeability, whereas there was no relationship between performance and likeability corrected for attractiveness (see electronic supplementary material, Results).

Discussion
Why is there an association between a rider's attractiveness and his performance during the Tour de France? First, performance may be positively correlated with general health, vigour or strength, or certain personality characteristics (e.g. competitiveness), which in their turn may be associated with attractiveness. Alternatively, facial attractiveness may signal endurance performance in particular. Indeed, high endurance performance is thought to have been the target of selection in early hominids, as being able to efficiently cover large distances allowed for more efficient hunting, gathering and scavenging, resulting in a number of uniquely human adaptations [7].
If true, individuals with higher endurance capacity were likely to be better resource providers for their partner and progeny. By choosing a mate with high endurance capacity, a woman would thus have gained direct (e.g. more resources for her and her offspring) and/or indirect (i.e. physically fitter offspring) benefits. Interestingly, across cultures, women place a lot of value on the provisioning ability of their prospective partner [14]. So, provided the association of endurance performance (i.e. physical fitness) with attractiveness translates into an association with reproductive success (i.e. evolutionary fitness) [15], endurance performance may have been subject to natural as well as sexual selection [8].
Although their preference was significantly weaker, (heterosexual) men also rated faster cyclists as more attractive. Furthermore, there was a close correlation between male and female ratings (see electronic supplementary material, Results). This suggests either that men know what (heterosexual) women find attractive, or that preference functions for performance-mediated attractiveness are to some degree independent of sex. Pill-using women also showed a reduced preference for faster cyclists. Although the difference is relatively small and women using the pill are not a random subset of the female population, this is in line with other studies demonstrating a reduced preference for indicators of male quality in pill-using women [9].
To summarize, I was able to simultaneously investigate the effects of several rider- and rater-specific variables on attractiveness scores and show a relationship between facial attractiveness and performance. Although the mechanism mediating this relationship remains to be elucidated, this provides a fascinating new insight into the nature of human endurance performance.
Acknowledgements. I am grateful to all participants. This study would have been impossible without the rider portraits and biographies made available by the Amaury Sport Organisation. Jonas Mechtersheimer performed the pilot study which this study was based on. I thank Barbara Tschirren, Simon Lailvaux, Stefanie Muff, Michael Jennions and three anonymous reviewers for discussion and comments.

Figure 1. (a) The relationship between attractiveness and performance. Grey dots depict a rider's attractiveness score, averaged across raters and plotted against his performance. The solid and dashed lines depict the relationship between attractiveness and performance and its 95% CI, obtained from a mixed model including additional rider- and rater-specific variables. (b) The mean rater-specific slope of this relationship and its standard error, for women in the fertile part of their cycle, women in the non-fertile part of their cycle, pill-using women and men. Also see the electronic supplementary material, figure S1.