Evidence for sexual conflict over major histocompatibility complex diversity in a wild songbird

Sex differences in parasite load and immune responses are found across a wide range of animals, with females generally having lower parasite loads and stronger immune responses than males. Intrigued by these general patterns, we investigated if there was any sign of sex-specific selection on an essential component of adaptive immunity that is known to affect fitness, the major histocompatibility complex class I (MHC-I) genes, in a 20-year study of great reed warblers. Our analyses on fitness related to MHC-I diversity showed a highly significant interaction between MHC-I diversity and sex, where males with higher, and females with lower, MHC-I diversity were more successful in recruiting offspring. Importantly, mean MHC-I diversity did not differ between males and females, and consequently neither sex reached its MHC-I fitness optimum. Thus, there is an unresolved genetic sexual conflict over MHC-I diversity in great reed warblers. Selection from pathogens is known to maintain MHC diversity, but previous theory ignores that the immune environments are considerably different in males and females. Our results suggest that sexually antagonistic selection is an important, previously neglected, force in the evolution of vertebrate adaptive immunity, and have implications for evolutionary understanding of costs of immune responses and autoimmune diseases.

From each individual, we collected blood samples of 20-80 µl from the brachial or tarsus veins. The samples were suspended in 500-600 µl of SET-buffer (0.15 M NaCl, 0.05 M TRIS, 0.001 M ethylenediaminetetraacetate) and frozen at -20°C. Later, genomic DNA was isolated using standard phenol/chloroform-isoamylalcohol extraction [10], and the purified DNA was stored in 1´ TE buffer and frozen at -50 or -80°C.
We used already published primers, forward (HNalla) TCCCCACAGGTCTCCACAC and reverse (HN46) ATCCCAAATTCCCACCCACCTT [11], that amplify a 262 bp region of exon 3, and have been designed based on MHC-I cDNAs in combination with intron 2 and intron 3 sequences from great reed warblers [11]. Tag sequences of six nucleotides (i.e., molecular identifiers, MIDs) were added to the 5' end of the primers in order to create unique combinations of tagged primer pairs for each sample, enabling us to reassign sequences to samples after the 454 sequencing [12]. We used eight forward tags and 14 reverse tags. The samples from this study, plus 56 additional samples, were divided into four pools to be distributed over a 454 fiber-optic slide with a four-region gasket, using the same combinations of tagged primers in each pool. The total number of individuals included was 351, of which 42 samples were replicated, five were triplicated, and three were quadruplicated, resulting in a total number of 412 samples. In the PCR step, we added 16 blank controls. The PCR reaction mix for each sample consisted of 1.0 µl DNA solution (20 ng/µl in ddH2O), 1.2 µl tagged forward primer (2.5 µM), 1.2 µl tagged reverse primer (2.5 µM), 9.5 µl Multimix (Qiagen N.V., Venlo, Netherlands), and 2.1 µl ddH2O. PCRs were initiated by a 15 minutes heating phase at 95°C, followed by 30 cycles of 30 sec denaturation at 95°C, 90 sec annealing at 65°C, 60 sec elongation at 72°C, and ended with 10 min at 72°C. The PCR results were confirmed by gel electrophoresis in 2% agarose gels, and samples were then pooled eight by eight and cleaned using the MinElute PCR Purification Kit (Qiagen N.V., Venlo, Netherlands). The DNA concentration of each pool was measured on a Nanodrop 2000C spectrophotometer, and finally the samples pooled again to equal concentrations and delivered for 454-sequencing. Sequencing was done in a Roche 454 GS FLX (F. Hoffmann-La Roche AG, Basel, Switzerland) following the manufacturer's instructions at the Department of Biology at Lund University.

Filtering and screening of 454-sequence data
The 454-sequencing data contained 97,900 unique sequence variants and the total coverage was 551,258 reads. The data was demultiplexed using the software jMHC [13]. Sequences with no reads or to which tags had not been successfully assigned were filtered out in this step. Identical sequences were detected and merged using the web-based freewares Seqeqseq (http://mbio-serv2.mbioekol.lu.se/apps/seqeqseq.html) and mergeMatrix (http://mbio-serv2.mbioekol.lu.se/apps/mergeMatrix.html). Sequences with < 3 reads across all samples were removed, reducing the data set to 16,587 variants with 455,718 reads among 412 samples. Further filtering was performed in the web-based freeware popMatrix (http://mbio-serv2.mbioekol.lu.se/apps/popMatrix.html), which is designed to filter data from highthroughput sequencing following the principles in Galan et al. (2010) [14]. In this process, samples with < 240 reads and sequences with a relative abundance of < 0.012 within each sample were removed. The threshold for the first filter was estimated from the distribution of the number of reads per sample. The threshold for the second filter was estimated from the distribution of within-sample frequencies of all sequences, following Galan et al. (2010) [14], and verified by matching the replicated samples in the data set. After this filtering step, 511 sequence variants remained with 311,108 reads. 3 samples with < 240 reads had been removed. The remaining sequences were aligned and inspected manually using BioEdit (version 7.1.11). Sequences that contained stop codons or indels obstructing the reading frame, other nonfunctional sequences and known pseudogene alleles were removed from the data set at this point. PCR chimeras and single nucleotide substitution errors were also identified and removed. After final filtering and screening, there were 329 sequence variants in the data set. The replicated samples in the data set were merged so that any sequence that remained in any one sample after filtering and screening (unique variants) got to stay in the data set. One genotyped sample was excluded from further analyses because it could not be assigned to an individual in the database. The proportions of unique variants (PUV) between replicated samples were calculated and the repeatability of the total sequencing reaction was calculated as [1 -mean PUV] = 0.94.
All DNA sequences were identified by blasting against the NCBI database (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and novel sequences given new names following the MHC standardized nomenclature [12].

Statistical models The effects of MHC-I diversity on life span
The linear regression model included life span as dependent variable, MHC-I diversity (i.e., the number of different MHC-I alleles per individual) and MHC-I diversity squared as explanatory variables, sex as a fixed factor, and the interactions MHC-I diversity × sex, and MHC-I diversity squared × sex. The model was simplified step by step by removing the least significant terms. Analyses of variance and changes in the Akaike information criterion (AIC) were used to confirm that each simplification step improved the fit of the model. Histograms of the residuals from the life span models showed that they did not follow a normal distribution (Fig. S5 a, b), and we therefore ran the models also as log-linked generalized linear models with negative binomial error distribution using the package 'MASS' in R [15]. The results from these confirmed the results obtained with the Gaussian linear regression models (Table S7 a-c).

The effects of MHC-I diversity on offspring fledging success
We tested the effect of MHC-I diversity on the lifetime number of fledglings with a linear regression model that included life span as covariate (i.e., an analysis of offspring fledging success). The other terms in the model were specified as in the model on life span (above) and the model was simplified step by step by removing the least significant terms. Analyses of variance and changes in AIC were used to confirm that each simplification step improved the fit of the model. The residuals from these models were normally distributed (Fig. S6 a, b).
For illustrative purposes, residuals from a linear model between lifetime number of fledglings and life span for each sex (Table S8) were saved as a new variable, that explains offspring fledging success, and this was correlated with MHC-I diversity using a linear regression model.

The effects of MHC-I diversity on offspring recruitment success
We tested the effect of MHC-I diversity on the lifetime number of recruiting offspring with a linear regression model including lifetime number of fledglings as covariate (i.e., an analysis of offspring recruitment success). The other terms in the model were specified as in the model on life span (above) and and the model was simplified step by step by removing the least significant terms. Analyses of variance and changes in AIC were used to confirm that each simplification step improved the fit of the model. The residuals from this model are normally distributed (Fig.  S7). The models were also run independently for each sex, testing the effect of MHC-I diversity on the lifetime number of recruiting offspring with lifetime number of fledglings as a covariate. The residuals from these models are normally distributed (Fig. S8 a, b).
In order to obtain selection gradients [16], we calculated the relative lifetime number of recruiting offspring of each individual by dividing the values with the mean of all individuals in our data set. We then repeated the final models with relative lifetime number of recruiting offspring as response variable (with lifetime number of fledglings as covariate).
For illustrative purposes, residuals from a linear model between lifetime number of recruiting offspring and lifetime number of fledglings for each sex (Table S9) were saved as a new variable, that explains offspring recruitment success, and this was correlated with the number of different MHC-I alleles using a linear regression model.

The effects of MHC-I diversity on territory attractiveness rank
The effect of the MHC-I diversity on the territory attractiveness rank was tested with linear mixed effects models using the package 'lme4' in R [17]. The models were conducted independently for each sex and had yearly recorded values of the territory attractiveness rank as response variable, the MHC-I diversity as explanatory variable, and age given individual as random factor. Two-tailed p-values were approximated from a t-distribution given the degrees of freedom of the parameter estimates.

Parent-offspring regression of MHC-I diversity
The model included MHC-I diversity in the offspring as response variable, the combined MHC-I diversity in the parent pairs (i.e., the sum of the number of alleles in each parent subtracting the number of shared alleles) as explanatory variable, and the pair ID of the parents as random factor. The combined MHC-I diversity in the parent pairs takes into account that the MHC-I diversity in a parent pair may be smaller than the sum of the MHC-I diversity from each parent because certain alleles may be shared between the parents. The combined MHC diversity in parent pairs was estimated using the package 'MHCtools' in R [18]. The model was run with the package 'lme4' in R [17]. A two-tailed p-value was approximated from a t-distribution given a number of degrees of freedom estimated with Kenward-Roger approximation using the package 'pbkrtest' in R [19].

The effects of MHC-I diversity in offspring on their recruitment success
We tested whether MHC-I diversity in the offspring affected their recruitment success using a generalized linear mixed model with logit link function. The model had offspring recruitment status (i.e., recruited or not) as response variable, MHC-I diversity as explanatory variable, and included nest ID as a random factor. We also tested if offspring sex affected the relationship between their MHC-I diversity and recruitment success, by repeating the model including sex as a fixed factor and the interaction MHC-I diversity × sex. These models were run with the package 'lme4' in R [17] using the data from the 145 fledglings of the 1998 cohort.

Supplementary Results
First, we analyzed whether MHC-I diversity affected life span, when including sex as a fixed factor. The full quadratic model showed no significance for the interactions MHC-I diversity × sex and MHC-I diversity squared × sex (Table S1 a), and AIC decreased from 716.78 to 713.01 when they were removed from the model. This was confirmed by an anova comparing the models with or without the interaction terms (p = 0.9). In the model without the interaction terms, sex was least significant (Table S1 b), and removing that term from the model was confirmed by confirmed by a drop from 713.01 to 711.10 in AIC and by comparing the two models with an anova (p = 0.77). In the model without sex, MHC-I diversity was the least significant term (Table S1 c) and removing that term from the model reduced the AIC from 711.10 to 709.44 and was confirmed by an anova (p = 0.56). The final model included only MHC-I diversity squared and showed no significant effect (quadratic regression, b = 0.00037, SE = 0.0017, t = 0.22, d.f. = 169, p = 0.83; Table S1 d).
Second, we analyzed the effect of MHC-I diversity on offspring fledging success. This model had the lifetime number of fledglings as dependent variable, including life span as covariate, and sex as a fixed factor. The full quadratic model showed no significance for the interactions MHC-I diversity × sex and MHC-I diversity squared × sex (Table S2 a), and AIC was reduced from 1152.56 to 1149.36 when they were removed from the model. This was confirmed by an anova comparing the models with or without the interaction terms (p = 0.68). In the model without the interaction terms, MHC-I diversity was the least significant term (Table S2 b Table S2 c).
Third, we tested the effect of MHC-I diversity on offspring recruitment success. This model had the lifetime number of recruiting offspring as dependent variable and lifetime number of fledglings as covariate, including sex as a fixed factor. The full quadratic model showed no significance for the interactions MHC-I diversity × sex and MHC-I diversity squared × sex (Table S3 a), but AIC increased from 588.24 to 595.03 when they were removed, so we retained them in the model. The linear term MHC-I diversity was less significant than MHC-I diversity squared and removing MHC-I diversity and the interaction MHC-I diversity × sex from the model reduced the AIC from 588.24 to 586.35. An anova model confirmed removing those terms (p = 0.37). The final model included lifetime number of fledglings, MHC-I diversity squared, sex, and the interactions lifetime number of fledglings × sex and MHC-I diversity squared × sex. It showed a highly significant for the interaction between MHC-I diversity squared and sex (quadratic regression, b = 0.0071, SE = 0.0023, t = 3.04, d.f. = 165, p = 0.0028; Fig. 1; Table 1). The model was repeated for males and females separately. In males, MHC-I diversity squared was the least significant term (Table S3 b Table 1). In females, MHC-I diversity was the least significant term (Table  S3 c Table 1).