Mitochondrial diversity in European and Chinese pigs is consistent with population expansions that occurred prior to domestication
Abstract
Mitochondrial DNA (mtDNA) diversity in European and Asian pigs was assessed using 1536 samples representing 45 European and 21 Chinese breeds. Diagnostic nucleotide differences in the cytochrome b (Cytb) gene between the European and Asian mtDNA variants were determined by pyrosequencing as a rapid screening method. Subsequently, 637 bp of the hypervariable control region was sequenced to further characterize mtDNA diversity. All sequences belonged to the D1 and D2 clusters of pig mtDNA originating from ancestral wild boar populations in Europe and Asia, respectively. The average frequency of Asian mtDNA haplotypes was 29% across European breeds, but varied from 0 to 100% within individual breeds. A neighbour-joining (NJ) tree of control region sequences showed that European and Asian haplotypes form distinct clusters consistent with the independent domestication of pigs in Asia and Europe. The Asian haplotypes found in the European pigs were identical or closely related to those found in domestic pigs from Southeast China. The star-like pattern detected by network analysis for both the European and Asian haplotypes was consistent with a previous demographic expansion. Mismatch analysis supported this notion and suggested that the expansion was initiated before domestication.
1. Introduction
The domestic pig is an important farm animal worldwide. Darwin (1868) recognized two major forms of domestic pigs (Sus domesticus), the European and Asian form, both originating from the wild boar (Sus scrofa). In the light of the similarity in osteological characteristics, European and Southeast Asian subspecies of the wild boar are thought to be the main ancestors of the domestic pig (Clutton-Brock 1987). Previous studies on domestic pigs have demonstrated considerable genetic structure. A significant differentiation between the European and Chinese domestic pigs has been revealed by mitochondrial DNA (mtDNA) analyses (Giuffra et al. 2000; Okumura et al. 2001; Watanobe et al. 2001; Kim et al. 2002). Studies using nuclear markers have also documented a significant genetic distance between the European and Chinese domestic pig populations (Fan et al. 2002; Yang et al. 2003; Fang et al. 2005). These studies also showed that human activity has influenced the genetic structure of domestic pigs, since the independent domestication of the European and Asian pigs was followed by a subsequent introgression from Asia to Europe as documented by the historical records and molecular studies of mtDNA (Darwin 1868; Jones 1998; Giuffra et al. 2002; Larson et al. 2005).
More than 200 domesticated pig breeds exist in the world; about 30% of these are from China and another 33% originate from Europe, according to the Domestic Animal Diversity Information System of the Food and Agriculture Organization (http://dad.fao.org/en/home.htm). It has been proposed that Chinese pigs were domesticated from local wild boar populations in several different regions, and the South China wild boar (S. scrofa chirodontus) and the North China wild boar (S. scrofa moupiensis) are considered the two main ancestors (Zhang 1986). Zhang (1986) described 48 Chinese indigenous pig breeds, which he classified into six types according to their geographic origin, distribution, body conformation and colour. The various breeds were developed to enhance the yield and quality of meat according to the demand of local people. For more than 20 years, Chinese indigenous pig breeds have been in direct competition with highly selected western breeds with high growth rate and high lean content, and many of the indigenous pig breeds are in danger of being displaced. Population sizes are often small, with some verging on extinction or even being effectively lost. Drastic reductions in population size will increase the risk of extinction through inbreeding depression.
In Europe, only a limited number of pig breeds with a fast growth rate and very high lean content are widely used in the pig industry. In a recent survey, the Large White pig breed was reported to represent 30% of the total gene pool of the European fattening pigs (http://www.projects.roslin.ac.uk/pigbiodiv/). Under these circumstances, it is important to measure and preserve genetic diversity for future exploitation.
In this study, mtDNA diversity among the European and Chinese domestic pigs was investigated by sequence analyses of cytochrome b (Cytb) and control region haplotypes. Mitochondrial DNA has some clear advantages as a tool for population genetic studies (Brown et al. 1979; Wolstenholme 1992; Avise 1994). Mitochondrial DNA variation may reveal a history of past isolation, even in the event of contemporary admixture of groups that evolved in allopatry (Avise 1994). The aim of this study was to elucidate the genetic structure and the degree of differentiation among the European and Chinese domestic pigs using a unique collection of 45 European and 21 Chinese breeds. We also wanted to assess the relative impact of the introgression of Asian germplasm into Europe across both widely used commercial pig lines and local pig breeds.
2. Material and methods
(a) Animal material
Samples from 934 unrelated European pigs (no common grandparents) representing 45 different European pig populations were used (see the electronic supplementary material). The material included 19 local breeds and many widely used commercial lines: Landrace (11 lines), Large White (eight lines), Pietrain (three lines), Duroc (two lines) and Hampshire (two lines). A total of 602 unrelated Chinese animals (no common grandparents) representing 21 breeds were used (electronic supplementary material). All Chinese pigs were from preservation farms in China. More details about the country of origin and the number of samples for each breed are given in the electronic supplementary material and the geographic distribution of the Chinese breeds is indicated in figure 5. Genomic DNA was extracted from blood by a standard phenol–chloroform method or from hairs by Chelex extraction.
(b) mtDNA analysis
Fourteen base pairs (bp) of Cytb were analysed by pyrosequencing as previously described (Clop et al. 2004). The region contains several diagnostic nucleotide substitutions that can be used to distinguish mtDNA sequences belonging to the European and Asian clusters.
A part of the control region (637 bp) was amplified with primers L15433 (5′-TGC AAC CAA AAC AAG CAT) and H16108 (5′-GCA CCT TGT TTG GAT TGT CG) (Watanobe et al. 2001) and the following PCR conditions: 45 cycles of 45 s each at 94 and 54 °C followed by 50 s at 72 °C. The amplification was preceded by 7 min at 94 °C and terminated with 10 min at 72 °C. This fragment could not be reliably amplified from degraded DNA, and in these cases the following three primer pairs were used to amplify three overlapping fragments covering the entire region: L15433+H62 (5′-CCTGCCAAGCGGGTTGCTGG), L119 (5′-CAGTCAACATGCATATCACC)+H124 (5′-ATGGCTGAGTCCAAGCATCC) and L104 (5′-TGGACTAATGACTAATCAGCCCAT)+H16108 (Watanobe et al. 2001; Larson et al. 2005).
The amplified products were sequenced directly without cloning using the Big Dye Terminator Sequencing kit (Amersham). The sequences from one sample were compiled into a single contiguous fragment using the Sequencher software (Gene Codes, Ann Arbor, Michigan, USA). Ambiguous positions were verified by resequencing using the nested primers (H62, L119, H124 or L104) depending on the position. All new sequences have been deposited in GenBank (DQ152842–DQ152899 and DQ379016–DQ379231) and all data are available as a PopSet (http://www.ncbi.nlm.nih.gov/entrez/batchseq.cgi?db=popset&view=ps&val=76162779).
(c) Data analyses
Cytochrome b haplotypes were defined on the basis of six polymorphic positions. The frequency of each haplotype in each population and the frequency of Asian haplotypes within each European line were calculated. The control region sequences were aligned using ClustalW (http://www.ebi.ac.uk/clustalw/). Identical sequences obtained from different animals were detected using Arlequin 2.0 (Schneider et al. 2000). All singletons (a polymorphic site represented by a single variant sequence) were checked carefully to exclude sequencing errors.
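The frequency calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name, the line labels and the sample records are made up, and the rule that a haplotype name beginning with "A" denotes an Asian Cytb type is an assumption based on the haplotype naming used in this paper.

```python
from collections import Counter, defaultdict


def asian_frequency(samples):
    """Return {line: fraction of animals carrying an Asian Cytb haplotype}.

    `samples` is an iterable of (line, haplotype) pairs. Haplotypes whose
    name starts with "A" (A1, A2, A4) are counted as Asian (assumption).
    """
    counts = defaultdict(Counter)
    for line, haplotype in samples:
        counts[line][haplotype] += 1

    frequencies = {}
    for line, haplotype_counts in counts.items():
        total = sum(haplotype_counts.values())
        asian = sum(n for h, n in haplotype_counts.items() if h.startswith("A"))
        frequencies[line] = asian / total
    return frequencies


# Hypothetical records, not data from this study.
samples = [
    ("LineX", "E1"), ("LineX", "A1"), ("LineX", "E1"), ("LineX", "A2"),
    ("LineY", "E1"), ("LineY", "E1"),
]
freqs = asian_frequency(samples)
```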
Genetic distances based on the gamma distribution (α=0.5) were calculated using Kimura's two-parameter algorithm, as implemented in the MEGA software (Kumar et al. 2001). A neighbour-joining (NJ) tree was constructed based on the genetic distance matrix. Median-joining (MJ) networks were also constructed for a visualization of the relationship among the European and Asian haplotypes separately. The MJ network was enhanced by first generating a reduced median network with software package Network 4.1 (Bandelt et al. 1999) to eliminate non-parsimonious links.
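For readers who want to see what such a distance looks like in practice, the following Python sketch computes the gamma-corrected Kimura two-parameter distance between two aligned sequences using the standard closed-form expression. It is not the MEGA implementation; the function name is hypothetical, and skipping every site that contains a gap or an ambiguous base is an assumption made here for simplicity.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_gamma_distance(seq1, seq2, alpha=0.5):
    """Gamma-corrected Kimura two-parameter distance for two aligned sequences.

    d = (alpha / 2) * [(1 - 2P - Q)^(-1/alpha) + 0.5 * (1 - 2Q)^(-1/alpha) - 1.5]
    where P and Q are the proportions of transitions and transversions.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for this correction")

    exponent = -1.0 / alpha
    return (alpha / 2) * (term1 ** exponent + 0.5 * term2 ** exponent - 1.5)


# Ten sites with a single A<->G transition: P = 0.1, Q = 0.
d_example = k2p_gamma_distance("AAAAAAAAAA", "GAAAAAAAAA")
d_identical = k2p_gamma_distance("ACGTACGT", "ACGTACGT")
```

For small P and Q the expression reduces to roughly P + Q, so the correction only matters as sequences diverge.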
The mismatch analysis as well as the calculation of Tajima's D-values (Tajima 1989) was done with the Arlequin software (Excoffier et al. 1992; Schneider et al. 2000).
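As an illustration of the statistic itself, Tajima's D can be computed from an alignment with the published formula (Tajima 1989), as in the Python sketch below. This is not the Arlequin implementation: the function name and the toy alignment are hypothetical, gap characters are treated as an ordinary character state (an assumption), and no significance test is included.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy star-like sample: one central sequence plus four single-step derivatives.
star_like = ["AAAA", "AAAT", "AATA", "ATAA", "TAAA"]
d_star = tajimas_d(star_like)
```

An excess of rare variants, as in this star-like toy sample, pulls the mean pairwise difference below the expectation from the number of segregating sites and gives a negative D.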
3. Results
(a) mtDNA diversity
Nucleotide positions 15033–15046 in the Cytb gene were screened using pyrosequencing; nucleotide positions are assigned according to the complete pig mtDNA reference sequence (Ursing & Arnason 1998). Six variable sites (nt15035, 15036, 15038, 15041, 15044 and 15045) were detected when analysing 934 European and 602 Chinese pigs. Three European (E1, E3 and E4) and three Asian haplotypes (A1, A2 and A4) were found (figure 1); a proposed evolutionary relationship of these Cytb haplotypes is also given in figure 1. The E2 type has previously been found in Italian wild boars and is associated with the second major European cluster of mtDNA haplotypes denoted EII by Giuffra et al. (2000) and D4 by Larson et al. (2005). A3 has been found in Chinese wild boars (Fang 2006, unpublished). The polymorphism at positions nt15035 and 15044 and the haplotypes A4, E3 and E4 were detected for the first time in this study and they were all rare. The polymorphism at nt15045 is the only one causing an amino acid substitution. The distribution of Cytb haplotypes and the estimated frequency of Asian haplotypes among breeds are compiled in the electronic supplementary material. Five haplotypes were found among the European pigs (E1, E3, E4, A1 and A2) while only three Asian haplotypes (A1, A2 and A4) were found among Chinese pigs.
A portion of the hypervariable control region (637 bp, positions nt15451–16088) was sequenced using 175 European samples and 99 Chinese samples. Analysis of the 175 samples from the European pigs revealed a total of 39 variable sites forming 36 distinct haplotypes, which included 26 European haplotypes (EH1–26) and 10 Asian haplotypes (EAH1–10). Twelve out of the 36 haplotypes were represented by a single sequence while all others were found in at least two animals. Twenty-two variable positions and 28 Asian haplotypes (AH1–28) were found among Chinese pigs. Six Asian haplotypes (EAH5–10) were shared between the European and Chinese populations. The region between positions 15533–15733 bp of the control region was highly variable in particular among the European pigs; 12% of the positions were polymorphic representing 61.5% of the total number of polymorphic sites. Single nucleotide substitutions or insertion/deletion polymorphisms at positions 15544, 15567, 15573, 15582, 15595 and 15829 were diagnostic for distinguishing Asian and European haplotypes. Positions 15590 and 15733 were also informative, but they were not diagnostic since the same substitution was observed in at least one sequence from the other cluster.
(b) Population genetic analysis
Genetic distances for control region sequences showed that all the Chinese mtDNA haplotypes (mean±s.e., 0.006±0.001 within group) and all the European haplotypes (0.005±0.001 within group) formed two distinct clusters that are well separated from each other (0.024±0.005 between groups), consistent with previous studies indicating independent domestication of pigs in Europe and Asia (Giuffra et al. 2000; Larson et al. 2005). A NJ tree was constructed for all control region haplotypes based on pairwise genetic distances (figure 2). Sequences representing the major clusters of mtDNA haplotypes detected in a recent survey of a diverse sample of wild boars and domestic pigs from Eurasia (Larson et al. 2005) are included in figure 2 and labelled according to their GenBank accession numbers. This comparison revealed that all mtDNA haplotypes found in the European and Chinese domestic pigs belonged to the D1 and D2 clusters. Many of the bootstrap values for nodes within the D1 and D2 clusters were low, implying that the substructure within the major clusters is uncertain, whereas assignments of haplotypes to the D1 and D2 clusters are very reliable. There was also perfect agreement in the classification of Cytb and control region sequences as being of Asian or European origin; the corresponding Cytb haplotypes are indicated in figure 2. This means that we found no evidence for any past recombination events that would be possible if paternal transmission of mtDNA was to occur rarely in mammals (Kraytsberg et al. 2004).
Networks of the local European and Chinese haplotypes were constructed to better visualize the relationship among haplotypes (figure 3). Two core lineages were revealed both among the European and Chinese haplotypes. Each core haplotype was surrounded by a star-like pattern, consistent with a recent population expansion. A demographic expansion was also supported by negative Tajima's D-values (Tajima 1989) that reached statistical significance for the European haplotypes (Europe: D=−1.51, p=0.04; China: D=−0.68, p=0.28).
Mismatch distributions were calculated for the two major haplogroups to further investigate the hypothesis of a population expansion (figure 4). The mismatch distributions for the European and Chinese haplogroups were unimodal and fully consistent with a population expansion. The approximate time since expansion can be estimated by the formula t=gτ/2u (Rogers & Harpending 1992); t=time in years, g=generation interval, τ=position of the peak of the mismatch distribution, which measures the time since expansion in mutational units of 1/(2u) generations, and u=mutation rate per nucleotide site multiplied by the number of nucleotides compared (in this case 637 bp of the mtDNA control region). τ was estimated at 2.4 and 3.4 from the mismatch distributions for the European and Asian haplotype groups, respectively. This translates to an initiation of expansion about 190 000 (Europe) and 275 000 (Asia) years BP if we assume a generation interval around 1.5 years and a substitution rate of 1.37×10⁻⁸ per nucleotide site and year, as previously estimated for the control region of mammalian mtDNA (Pesole et al. 1999).
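The structure of the formula t=gτ/2u can be sketched in Python as follows. This is a sketch under stated assumptions, not the calculation performed in this study: the function name is hypothetical, u is taken to be the per-generation mutation rate for the whole fragment (per-site, per-year rate × generation interval × number of sites), and the numerical inputs are round placeholder values chosen only to make the arithmetic easy to follow.

```python
def expansion_time_years(tau, rate_per_site_per_year, n_sites, generation_years):
    """Time since expansion from t = g * tau / (2 * u).

    Assumption: u is the mutation rate per generation for the whole
    fragment, obtained by scaling a per-site, per-year rate.
    """
    u = rate_per_site_per_year * generation_years * n_sites
    generations = tau / (2 * u)          # tau = 2 * u * t, with t in generations
    return generations * generation_years


# Placeholder inputs, not the values estimated in this study.
t_example = expansion_time_years(
    tau=3.0,
    rate_per_site_per_year=1e-8,
    n_sites=500,
    generation_years=2.0,
)
```

Under this reading the generation interval cancels, so the estimate depends only on τ, the per-year rate and the number of sites; other conventions for u would change the result.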
4. Discussion
Previous studies have revealed multiple centres of pig domestication and the existence of a clear phylogenetic structure of mitochondrial sequences found in wild boars and domestic pigs (Giuffra et al. 2000; Larson et al. 2005). Larson et al. (2005) identified six major clusters (denoted D1 to D6) of pig mitochondrial control region sequences that were assumed to reflect domestication from genetically distinct subpopulations of the wild boar; the D1 (Europe), D2 (Asia) and D4 (Italy) clusters have previously been denoted EI, A and EII, respectively (Giuffra et al. 2000; Kijas & Andersson 2001). The present study provides the most comprehensive screening of mtDNA variation among domestic pigs including altogether 1536 individuals representing 21 Chinese breeds and 45 European pig populations. Our strategy has been to sequence diagnostic nucleotide differences in the Cytb gene by pyrosequencing as a rapid screening method to assess the frequency of major European and Asian mtDNA clusters followed by sequencing 637 bp of the hypervariable control region as a further characterization of mtDNA diversity. Only mtDNA sequences belonging to the major European (D1) and Asian (D2) clusters were represented among the diverse set of breeds included in this study. Thus, domestic pigs associated with the other four clusters of mtDNA types (D3–D6) appear to have had no or only minor impact on the development of the European and Chinese pig breeds, at least as regards the maternal contribution.
The common occurrence of Asian mtDNA haplotypes among the European domestic pigs is fully consistent with the results of previous molecular studies and the well-documented introgression of Asian pigs into Europe primarily during the eighteenth and nineteenth centuries (Jones 1998). The average frequency of Asian mtDNA haplotypes was 29% across the European breeds but varied from 0 to 100% within breeds. The frequency of Asian haplotypes was low or absent in Duroc and Hampshire lines. Our data showed that Landrace lines (mean=12.8%, range=0–43%) were less affected by Asian introgression than Large White lines (mean=76.0%, range=14–100%). Pietrain, a breed originally developed in Belgium, exhibited a very high frequency of Asian haplotypes in some lines (France) but a complete absence in others (Germany and PIC).
In general, there was a good agreement between the known breed history and the molecular data. For instance, some breeds with a well-documented Asian influence like Berkshire and Large White, both originating from England, exhibited a high frequency of Asian mtDNA haplotypes. Accordingly, the presence of Asian haplotypes in two Spanish pig breeds, Manchado de Jabugo and Negro Canario, was consistent with the known introgression of Tamworth and Black pigs, carrying Asian haplotypes, from the United Kingdom to Spain in 1980 (EAAP-Animal Genetic Data Bank, http://www.tiho-hannover.de/einricht/zucht/eaap/index.htm). Furthermore, two control region haplotypes (EH7 and EH15) were detected in Duroc pigs. One of these, EH15, was also found in the Spanish pig breeds Negro Iberico and Retinto, which is consistent with the fact that Red Iberian pigs contributed to the development of the Duroc (Jones 1998). Only European haplotypes were detected in some local breeds like Bisaro from Portugal, Negro Iberico from Spain and Landrace from Iceland. No European mtDNA haplotypes were detected among the 21 breeds of Chinese pigs included in this study. This shows that maternal introgression from the European domestic pigs has had no or very little impact on these Chinese breeds. An obvious topic for future research will be to use Y-specific markers to assess male-mediated introgression both in Europe and China.
Although the historical introgression of Asian pigs into Europe is well-documented, it is not clear from which region(s) of Asia these pigs originated. We observed a considerable overlap between the Asian haplotypes present in the European domestic pigs and in Chinese pigs. The control region sequences for 6 out of 10 Asian haplotypes detected in the European pigs were identical with mtDNA haplotypes found in contemporary Chinese pigs. The remaining four showed not more than two nucleotide differences compared with the most similar haplotype found in China (figure 3b). Furthermore, an examination of the haplotype distribution among Chinese pigs revealed a clear phylogeographic signal. The Asian mtDNA haplotypes present in the European pigs were all found in breeds from Southeast China (figure 5). Thus, domestic pigs from Southeast China or closely related pigs from other parts of Asia carrying mtDNA haplotypes belonging to the D2 cluster were used for the introgression into Europe.
The historical demography of the populations of domestic pigs in Europe and China was examined using mismatch distributions, which represent the frequency distribution of pairwise differences among all sampled haplotypes. Theoretical studies have shown that population bottlenecks and population expansions have a strong effect on the pattern of genetic polymorphism among haplotypes in the population (Rogers & Harpending 1992). For instance, populations in long and stable demographic equilibrium have multimodal (ragged and chaotic) mismatch distributions, whereas the distribution appears unimodal after recent demographic expansions (Rogers & Harpending 1992; Harpending 1994). The mismatch distributions as well as the network analysis were consistent with population expansions in the ancestors of both contemporary Chinese and European domestic pigs (figures 3 and 4). The crucial question is whether the population expansion occurred before or after domestication. Similar signatures of population expansions in goats and sheep have been interpreted to reflect rapid demographic expansion subsequent to domestication (Bruford et al. 2003). In contrast, our analysis of the mismatch distributions suggested that the expansions of the European and Asian domestic pig populations were initiated about 190 000 and 275 000 years BP, respectively, i.e. long before domestication, which occurred approximately 9000 years BP (Epstein & Bichard 1984). These are very rough estimates due to the difficulty in measuring the time-scale for recent divergences based on molecular data (Ho & Larson 2006) and because the molecular clock rate for the hypervariable control region of mtDNA in the pig lineage has not been calibrated. We have used an average estimate for the substitution rate of the control region among mammalian species (Pesole et al. 1999), and a 20-fold higher rate in the pig lineage would be required to obtain an estimate of the initiation of expansion consistent with the time of domestication.
In order to test the clock rate within the pig lineage, we estimated the time since divergence between the wild boar and the closely related warthog (Phacochoerus aethiopicus) using the same clock rate and the formula T=d/2λ (Nei 1987); T=time in years, d=genetic distance and λ=substitution rate per site per year. This gave a reasonable estimate of 4.4 Myr since divergence based on an observed genetic distance of 0.12±0.02 between Sus and Phacochoerus (GenBank AB046876) for the same 637 bp of the control region as used in this study. This estimate is similar to estimates in the range 1.4–2.8 Myr for the split between S. scrofa and P. aethiopicus based on sequence analysis of porcine SINEs (Sulandari et al. 1997). A 20-fold higher substitution rate would give an unrealistic estimate of 220 000 years since divergence. Furthermore, the average genetic distance for the control region between the European and Asian haplotypes is 0.025±0.006. Using the same formula and the same clock rate, this genetic distance corresponds to an estimated time since divergence of 900 000 years BP for the Asian and European wild boar sequences. This estimate is identical to the one we previously obtained based on a sequence divergence of 1.2% across the entire mtDNA (Kijas & Andersson 2001). Thus, this exercise shows that there is no indication of the exceptionally high substitution rate for the mtDNA control region in the pig lineage that would be required to make our data consistent with a population expansion subsequent to domestication. An initiation of the demographic expansion before domestication is also supported by the observation that control region sequences from wild boars are intermingled with sequences from domestic pigs in network trees, and similar signs of population expansions were also noted for control region sequences from wild boars (Larson et al. 2005).
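The mismatch distribution referred to in this paragraph, i.e. the frequency distribution of pairwise differences, can be illustrated with a short Python sketch. The function name and the three toy sequences are hypothetical; a real analysis would use the aligned control region sequences, and would also fit an expansion model, which is not attempted here.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Return {number of differences: proportion of sequence pairs}."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are needed")

    # Count, for every pair of sequences, the number of differing sites.
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


# Toy aligned sequences, not data from this study.
dist = mismatch_distribution(["AAAA", "AAAT", "AATT"])
```

A single smooth peak in such a distribution is the unimodal pattern associated with past expansion, whereas several irregular peaks correspond to the ragged pattern expected at long-term equilibrium.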
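The arithmetic behind these divergence estimates follows directly from T=d/2λ and can be checked with the short Python sketch below. The function and variable names are hypothetical; the distances and the substitution rate are the values quoted in the text.

```python
def divergence_time_years(distance, rate_per_site_per_year):
    """Time since divergence from T = d / (2 * lambda)."""
    return distance / (2 * rate_per_site_per_year)


RATE = 1.37e-8  # substitutions per site per year (Pesole et al. 1999), as quoted in the text

sus_vs_warthog = divergence_time_years(0.12, RATE)           # about 4.4 Myr
europe_vs_asia = divergence_time_years(0.025, RATE)          # about 900 000 years
warthog_fast_clock = divergence_time_years(0.12, 20 * RATE)  # about 220 000 years
```

Because T is inversely proportional to λ, a 20-fold faster clock shortens every estimate by the same factor of 20, which is why the warthog split collapses from about 4.4 Myr to about 220 000 years.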
Furthermore, we have recently analysed the mtDNA control region from the European wild boars with a 2n=36 karyotype that differs by a single centric fusion from the 2n=38 karyotype present in wild boars from Central Europe, Asia and in all domestic pigs (Fang et al. submitted). The results reveal a close genetic relationship to mtDNA haplotypes found in European domestic pigs (the D1 cluster) and the network analysis indicated that this wild boar population found in Western Europe originates from the same population expansion as European domestic pigs. This shows that the population expansion must predate domestication unless there has been an extensive maternal gene flow from domestic pigs to this wild boar population. The latter is contradicted by the differences in karyotype and in the distribution of mtDNA haplotypes; the most common haplotypes found in wild boars with 2n=36 were closely related but not identical to haplotypes found in domestic pigs (Fang et al. submitted). Thus, we conclude that the pattern of mtDNA diversity in domestic pigs is consistent with a demographic expansion that predates domestication and may rather be related to population fluctuations of this large mammal during the last glaciation period.
We thank Ulla Gustafsson for excellent technical assistance, Greger Larson and Sarah Blott for valuable comments on the manuscript. DNA samples were provided by the PigBioDiv2 consortium (Agence de la Sélection Porcine, INRA, Georg-August University Göttingen, University of Córdoba, Nordic Gene Bank, University of Trás-os-Montes e Alto Douro, the Rare Breeds Survival Trust, Roslin Institute, Sygen International plc, China Agricultural University, Jiangxi Agricultural University and Huazhong Agricultural University). The study was funded by the European Commission PigBioDiv2 project (QLK5-CT-2002-01059) (www.pigbiodiv2.com).
Footnotes
The electronic supplementary material is available at http://dx.doi.org/10.1098/rspb.2006.3514 or via http://www.journals.royalsoc.ac.uk.