Amy2B copy number variation reveals starch diet adaptations in ancient European dogs

Extant dog and wolf DNA indicates that dog domestication was accompanied by the selection of a series of duplications on the Amy2B gene coding for pancreatic amylase. In this study, we used a palaeogenetic approach to investigate the timing and expansion of the Amy2B gene in the ancient dog populations of Western and Eastern Europe and Southwest Asia. Quantitative polymerase chain reaction was used to estimate the copy numbers of this gene for 13 ancient dog samples, dated to between 15 000 and 4000 years before present (cal. BP). This evidenced an increase of Amy2B copies in ancient dogs from as early as the 7th millennium cal. BP in Southeastern Europe. We found that the gene expansion was not fixed across all dogs within this early farming context, with ancient dogs bearing between 2 and 20 diploid copies of the gene. The results also suggested that selection for the increased Amy2B copy number started 7000 years cal. BP, at the latest. This expansion reflects a local adaptation that allowed dogs to thrive on a starch rich diet, especially within early farming societies, and suggests a biocultural coevolution of dog genes and human culture.

MO, 0000-0002-8361-4221; EA, 0000-0001-6748-5450
... brain-case volume, a shorter snout, tooth crowding and a higher frequency of dental defects. All the individuals used in this study belonged to the domestic form, according to one or several of these criteria.
When possible, measurements were taken from mandibles, particularly the five dimensions frequently measurable in broken archaeological specimens (dimensions #8, 10, 11, 19, 20, after [25]; electronic supplementary material, table S1; only measurements for individuals providing aDNA results are reported). The data obtained for our archaeological Holocene canids were then compared with the data derived from (i) a series of Pleistocene wolf mandibles from Arcy-sur-Cure (France) [23] dated between 100 000 and 60 000 years BP, prior to any suspicion of domestication; (ii) a series of Pleistocene canid mandibles from Předmostí (Czech Republic) [26], attributed to the wolf and dated to 27 000-26 000 BP; (iii) a series of modern Eurasian wolf mandibles from the National Museum of Natural History, Paris [23]; and (iv) a series of modern wolf mandibles from Southeastern Europe [27] (electronic supplementary material, figure S1). The length of the tooth row (dimension #8 [25]) differed significantly between the Holocene canids and each of the four wolf series (Mann-Whitney tests with Bonferroni correction, p < 0.05). The only individual in the Holocene series located at the very margin of the modern wolves' variation interval (CH1075; electronic supplementary material, figure S1) carried a colour mutation typical of domestic animals, as shown in one of our previous studies on the same material [15]. Therefore, the canid series analysed in this study can be identified as the domestic form C. familiaris.

Ancient DNA extraction
The external surface of the bones was scraped with a sterile scalpel to produce a clean piece, which was then reduced to powder with a sterile hammer. The powder (150-300 mg) was then digested for 18 h at 55°C with agitation in 4.7 ml of buffer (0.5 M EDTA (ethylenediaminetetraacetic acid), pH = 8.0), 50 µl of proteinase K (1 mg ml⁻¹) and 250 µl of 0.5% N-lauryl-sarcosyl [28]. A silica-based method modified from Rohland & Hofreiter [32] was used to retrieve the aDNA. Mock extractions were performed in order to rule out contamination from reagents. In addition, cross-contamination was monitored by combining the aDNA from our samples with the aDNA from other species (i.e. owls, fish and sheep) for each extraction session.

Ancient DNA pre-amplification and quantitation
In order to restore a sufficient quantity of aDNA for each sample, we co-amplified the nuclear fragment of the Amy2B gene alongside a fragment of a nuclear reference gene present in two copies per diploid genome (C7orf28b), in a multiplex polymerase chain reaction (PCR). Such pre-amplification procedures have been shown to improve the sensitivity of quantitative PCR (qPCR) analysis on modern DNA [34,35] and aDNA [36]. We followed previous recommendations to perform robust and highly accurate targeted pre-amplification in combination with qPCR [34-36].
Both fragment sequences were amplified using dog-specific primers [16,18,37]: Amy2B fragment of 76 bp: forward 5′-CCAAACCTGGACGGACATCT-3′ and reverse 5′-TATCGTTCGCATTCAAGAGCAA-3′. ... two different controls in all PCR assays: an aerosol control (tube kept open throughout the manipulation to monitor airborne contamination) and a PCR-mix control (to monitor contamination of reagents). Amplification products were quantified using the QuantiFluor® dsDNA System (Promega). This system enables the sensitive quantification of small amounts of double-stranded DNA by means of a fluorescent DNA-binding dye.
We followed the same experimental design as previously published [16,18]: the reaction was performed in a 25 µl reaction volume containing 12.5 µl of Taqman Genotyping master mix (Applied Biosystems), 0.9 µM of each primer, 0.25 µM of each probe and 2 ng of DNA. The cycling conditions were one step at 50°C for 2 min, one step at 95°C for 10 min, followed by 40 cycles of one step at 95°C for 15 s and one step at 60°C for 1 min. All reactions were run in triplicate for each sample in the same qPCR plate. We systematically added three qPCR-mix controls to monitor contamination of reagents in each assay and three aerosol controls to monitor airborne contaminations during plate preparation.
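The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. The following is a minimal sketch only: the names `Step`, `HOLDS`, `CYCLE` and `programmed_seconds` are hypothetical, and the calculation ignores ramping between temperatures, so the real instrument time will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


# Values taken from the protocol text; the structure itself is illustrative.
HOLDS = [Step(50, 120), Step(95, 600)]   # 50°C for 2 min, 95°C for 10 min
CYCLE = [Step(95, 15), Step(60, 60)]     # 95°C for 15 s, 60°C for 1 min
N_CYCLES = 40


def programmed_seconds(holds, cycle, n_cycles):
    """Sum of programmed hold times; temperature ramping is not included."""
    return sum(s.seconds for s in holds) + n_cycles * sum(s.seconds for s in cycle)


total = programmed_seconds(HOLDS, CYCLE, N_CYCLES)
print(f"{total / 60:.0f} min of programmed hold time")
```

The 12 min of initial holds plus 40 cycles of 75 s each account for 62 min of programmed time per plate.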

First tests on present-day canids
In present-day wolves, the amylase copy number variation ranges from two to eight copies, with 60% of wolves bearing only two copies [17]. In order to choose a wolf reference sample to account for interplate variability in subsequent studies, we performed an independent qPCR on 16 wolves to evaluate the number of Amy2B copies (electronic supplementary material, table S2a), following the previous protocol. The same protocol was also used to test 16 present-day dogs (electronic supplementary material, table S2b).
Modern DNA work was performed in a distinct laboratory (IGDR, CNRS-UMR6290, Rennes, France). The modern DNA samples came from the biobank Cani-DNA_CRB, in IGDR-CNRS, Rennes.

Quantitative polymerase chain reaction on ancient canids
The pre-amplification step was independently repeated before every qPCR attempt for each sample, so that each set of qPCR results (e.g. qPCR results of two different plates for the same sample) derived from independent pre-amplification. Pre-amplification controls relating to samples tested for qPCR were systematically added in the assay. We followed the protocol described above using a present-day wolf sample as a reference to account for inter-plate variability (sample reference 8278-cani-DNA Biobank IGDR, CNRS-UMR6290, Rennes, France). Whenever possible, three positive full replicates (i.e. pre-amplification + qPCR in triplicates) were analysed for each sample and each gene.

Quantitative polymerase chain reaction analysis
Data were analysed using the CopyCaller software (Applied Biosystems), and relative quantitative ratios (RQ) were estimated for each sample and each run. Copy numbers for each target were normalized to the reference modern wolf (sample reference 8278, bearing two amylase copies). Raw copy number data were rounded to the nearest whole number. The confidence value of the associated predicted copy number was calculated for each sample (for more details, see: https://tools.thermofisher.com/content/sfs/manuals/cms_062369.pdf).
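The normalization described above can be illustrated with the standard comparative CT (ΔΔCT) relation. This is a sketch under stated assumptions, not the internal algorithm of CopyCaller: it assumes perfect doubling per cycle for both targets, the function names are hypothetical, and the CT values in the usage lines are invented for illustration.

```python
import math


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """CT of the target (Amy2B) minus CT of the two-copy reference gene."""
    return ct_target - ct_reference


def relative_quantity(sample_dct: float, calibrator_dct: float) -> float:
    """RQ = 2^(-ddCT), assuming 100% amplification efficiency (an assumption)."""
    return 2.0 ** -(sample_dct - calibrator_dct)


def copy_number(sample_dct: float, calibrator_dct: float, calibrator_copies: int = 2):
    """Return (raw, rounded) copy number, scaled by the calibrator's known copies."""
    raw = calibrator_copies * relative_quantity(sample_dct, calibrator_dct)
    # Round half up explicitly; Python's round() rounds halves to even.
    return raw, math.floor(raw + 0.5)


# Hypothetical CT values, for illustration only.
calibrator = delta_ct(ct_target=26.0, ct_reference=26.0)  # reference wolf, 2 copies
sample = delta_ct(ct_target=24.0, ct_reference=26.0)      # target detected 2 cycles earlier
print(copy_number(sample, calibrator))
```

In this invented case the target crosses the threshold two cycles earlier than in the calibrator, giving an RQ of 4 and therefore eight copies.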

Results
For each aDNA sample, nuclear fragments of the Amy2B gene and of a reference gene present in two copies per diploid genome (C7orf28b) were co-amplified by a qPCR procedure. In order to estimate the Amy2B copy number, RQ between these two genes were estimated for each sample and then normalized to the reference modern wolf (bearing two Amy2B copies). The protocol was tested on 16 ... (figure 1a). The two samples from north France (CH735) and Turkmenistan (CH1075) yielded between 4 and 12 estimated copies. The third sample, from north France (CH734), presented between 8 and 16 Amy2B copies. The fourth sample, from Romania (CH1585), presented the highest estimated Amy2B copy number, varying between 12 and 20. These four samples presented high RQ value variations between replicates (three to six replicates, with variances of 3.07, 2.11, 2.63, 4.53 for CH734, CH735, CH1075, CH1585, respectively; electronic supplementary material, table S4).
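One way to read the ranges and variances reported above is as summaries over the replicate RQ values of each sample. The sketch below shows that reading; it is an assumption about how such figures could be derived, the function name `summarize_replicates` is hypothetical, and the RQ values are invented rather than taken from table S4.

```python
from statistics import variance


def summarize_replicates(rq_values, calibrator_copies=2):
    """Summarize replicate RQ values for one sample.

    Copies are taken as calibrator_copies * RQ, since the reference wolf
    bears two copies; the variance is the sample variance of the RQ values.
    """
    copies = [calibrator_copies * rq for rq in rq_values]
    return {
        "min_copies": min(copies),
        "max_copies": max(copies),
        "rq_variance": variance(rq_values),
    }


# Hypothetical RQ values for three full replicates of one sample.
summary = summarize_replicates([2.0, 4.0, 6.0])
print(summary)
```

With these invented replicates the estimated copy number spans 4 to 12 and the RQ variance is 4.0, which is the kind of spread described for the high-copy samples.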
The four dogs showing Amy2B gene expansion (more than eight copies) came from several regions of Europe and Southwest Asia (i.e. CH1585, Borduşani, Romania, 7th millennium cal. BP; CH1075, Ulug Depe, Turkmenistan, mid- to late 5th millennium cal. BP; CH735 and CH734, Bury, France, mid- to late 4th millennium cal. BP; see the electronic supplementary material), but no link could be established between the number of gene copies and a given geographical area. We also compared the mandibles of these four individuals (electronic supplementary material, table S1 and figure S1). The first one (CH1585) had a very short tooth row and showed oligodontia. The other three were larger and had no dental defects. No link between the number of gene copies and the morphological characteristics (i.e. size and mandible shape; electronic supplementary material, table S1 and figure S1) could be found. These results therefore did not allow us to associate the Amy2B gene expansion with a particular ancient dog population or morphotype.

Discussion
We obtained results for 13 of 88 samples. This success rate (15%) can be explained by aDNA degradation: (i) the estimated number of copies can differ between replicates and, therefore, must be interpreted as a minimum number of copies that could be detected and (ii) inhibition was observed in amplification curves from the majority of failed amplification attempts. We highlighted the difficulty of precisely estimating high copy numbers, as it is already established that the ability to distinguish copy numbers decreases as they increase [38]. Consequently, the confidence values are often lower for high copy number samples even under optimal experimental conditions, due to the compression of the CT sub-distributions for high copy numbers [38]. This explains some of the high within-sample variance calculated for the four individuals showing more than eight Amy2B copies. This phenomenon was amplified by the fact that we worked with aDNA (due to inhibition and degradation) and that two genes were targeted (reference C7orf28b and Amy2B). The pre-amplification step was necessary to restore a sufficient amount of aDNA but did not guarantee equal preservation of both targeted fragments. Enzymatic repair of the aDNA extracts as well as droplet digital PCR (ddPCR) could be explored to improve detection efficiency. In particular, ddPCR has been shown to reduce mean coefficients of variation by 37-86% and improve reproducibility by a factor of 7 [39].
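The compression of CT values at high copy numbers follows from the logarithmic relation between template quantity and threshold cycle. As a minimal illustration, assuming perfect doubling per cycle (an idealization, and the function name `ct_gap` is hypothetical), the number of cycles separating n copies from n + 1 copies is log2((n + 1)/n), which shrinks quickly as n grows.

```python
import math


def ct_gap(n: int) -> float:
    """Cycles separating n from n + 1 copies, assuming perfect doubling per cycle."""
    return math.log2((n + 1) / n)


# The gap narrows as the copy number rises, so neighbouring copy numbers
# become harder to tell apart against replicate-to-replicate noise.
for n in (2, 4, 8, 16):
    print(f"{n} vs {n + 1} copies: {ct_gap(n):.3f} cycles")
```

Under this idealization, two and three copies are separated by more than half a cycle, whereas 16 and 17 copies are separated by less than a tenth of a cycle, so small measurement noise is enough to blur high copy number estimates.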
This study is, to our knowledge, the first report of qPCR being used to estimate the copy number variation from aDNA; this has led to three main issues. ... dogs to thrive on a starch-rich diet, in comparison with the mostly carnivorous diet of wolves [18]. This constituted an important selective advantage for dogs feeding on human leftovers within a farming context. However, the scarcity of data anterior to the Neolithic does not allow us to assess whether this expansion took place before the Neolithic transition, or emerged during the Neolithic under new selection pressures related to the development of agriculture.

Antiquity of the Amy2B gene expansion
Currently, only a few dog lineages, such as the dingo (two copies) and the Siberian husky (three to four copies), show an unusually low Amy2B copy number. These dogs come from regions with no, or only recent, agricultural practices [17,18,37]. This supports the hypothesis that the development of a dog's capability to digest starch efficiently does not result from a relaxation of the natural selection pressures related to domestication. It is more likely to result from an adaptation to the shift in human food habits during the Neolithic.

Persistence of a small number of copies in ancient dogs
We found ancient dogs from early farming contexts with two copies of the Amy2B gene at Isaccea, Hârşova, Borduşani (Romania), Ulug Depe (Turkmenistan) and Bercy (north France). A low copy number is exceptional in present-day dogs and is only found in lineages associated with recent nomadic hunter-gatherers, such as the dingo and the Siberian husky. These two lineages also appear as basal on phylogenetic trees of extant dog breeds [40,41], probably as a result of a lack of recent admixture with other dog breeds due to geographical and cultural isolation [4]. Our early farming series suggests that the low Amy2B copy number present in their genome could stem from an ancient gene pool.

The Amy2B copy number was not fixed in early dogs
The two dog series from Borduşani and Hârşova can be considered together, as they are contemporary (mid- to late 7th millennium cal. BP) and belong to two neighbouring sites in southeast Romania. The archaeozoological series also displayed identical patterns of animal exploitation [42,43]. The Borduşani/Hârşova set includes dogs bearing either two, two to eight or more than eight Amy2B copies, indicating that strongly differing Amy2B copy numbers could exist concurrently in the same population.
On a wider geographical level, our results show that dogs bearing the Amy2B copy number expansion came from various regions. Similarly, we found no link between the number of gene copies and specific morphotypes, although adaptation to a starchy diet would be expected to affect not only digestive functions but also morphological traits linked to biting and chewing (e.g. teeth, skull and mandible conformation [44]). These observations are congruent with the situation in modern dogs, where there is no fixation of the number of Amy2B copies in a given breed [16,17].

Conclusion
In this study, we have provided evidence for an increase of the amylase gene copy number in ancient dog genomes, with a firm terminus ante quem during the 7th millennium cal. BP in Southeastern Europe. We have demonstrated that the modern capability of numerous dogs to digest starch does not result from the selection of lineages during Classical Antiquity or the nineteenth century selection of modern breeds [19-21], but began, at the latest, during the Neolithic, between the 10th and 7th-5th millennium cal. BP, at least in various regions of West and East Europe and Southwest Asia. We also demonstrated, on the basis of archaeological remains, that the Amy2B copy number increase was not fixed in all dogs from Neolithic farming societies. In addition, we showed the relatively late persistence of only two copies of the Amy2B gene in ancient dogs, well beyond the first appearance of farming. This situation is uncommon in modern dog lineages and could not have been demonstrated without ancient data.
Further analyses, on larger samples of ancient Eurasian dogs and wolves from the Palaeolithic to the Bronze Age, would help define the precise chronology and rhythm of the Amy2B expansion during early dog breeding. They would also help to pinpoint the date(s) and the location(s) of the first occurrence(s) of the Amy2B expansion (i.e. more than eight copies).
In humans, the pattern of variation in copy numbers of the human amylase gene (AMY1) is consistent with a history of diet-related selection pressures [45]: higher AMY1 copy numbers and protein levels likely improve the digestion of starchy diets. Human starch consumption increased significantly during the Neolithic transition and is correlated with a gradual increase of AMY1 copy numbers [46,47].