The York Gospels: a one thousand year biological palimpsest

Medieval manuscripts, carefully curated and conserved, represent not only an irreplaceable documentary record but also a remarkable reservoir of biological information. Palaeographic and codicological investigation can often locate and date these documents with great precision. The York Gospels (York Minster Ms. Add. 1) is one such codex, one of only a small collection of pre-conquest Gospel books to have survived the Reformation. By extending the non-invasive triboelectric (eraser-based) sampling technique eZooMS to include the analysis of DNA, we report a cost-effective and simple-to-use biomolecular sampling technique. We apply this combined methodology to document for the first time a rich palimpsest of biological information contained within the York Gospels, which has accumulated over the 1,000-year lifespan of this cherished object that remains an active participant in the life of York Minster. These biological data provide insights into the decisions made in the selection of materials, the construction of the codex and the use history of the object.

Introduction

Illuminated manuscripts are objects of great worth, often elaborately decorated and bound in the past, emphasising their importance not only as literary texts but also as physical objects of intrinsic and spiritual value. Moreover, a contemporaneous collection of animal skins bound together provides a remarkable biological resource, which may inform on the husbandry of the animals and in turn shed light on the assembly of the codex. The utility of parchment documents as a store of biological information is confirmed by a number of molecular studies, which have successfully retrieved DNA sequences from parchments and produced comparisons with modern reference populations of cattle, sheep and goat [1][2][3][4].
These analyses have utilised isolated parchment fragments, which are then digested as part of the DNA extraction process. Understandably, such studies have not yet included bifolia from bound volumes.

In the light of the vast potential afforded by ancient manuscripts, several authors are experimenting with non-destructive sampling of documents. Whilst most of these methods involve some form of spectroscopic analysis, novel approaches which release molecules from the surface, such as the use of synthetic gel films [5,6], have been reported, and this work can be seen as part of a wider push to develop sampling methods for material culture [7][8][9][10][11]. These studies also form part of an increasingly sophisticated analysis of the conservation status of objects, which includes analyses of the microflora within buildings and upon objects. In the case of parchment, this includes concerns over both parchment deterioration and risks to personnel health caused by mould (hyphomycetous fungi) [6][12][13][14].

Analyses of the traces left by the handling and use of an object are common in Palaeolithic and Neolithic archaeology and have been applied to personal adornment artefacts [e.g. 15,16] as well as stone tools. Much less common in the historical period, they can illuminate the choices made in the selection of raw materials, and the life history of an object [17]. A pioneering application of this analysis is Kate Rudy's [18] use of densitometry to map the discoloration of parchment caused by repeated usage. Her work highlights a third level of biological data, the grime on the surfaces of parchments that attests to handling. In some cases, folia have experienced much more active intervention, such as devotional kissing or rubbing, activities that are likely to leave distinct microbial traces, which can be explored using biomolecular methods.

The analysis of both the raw materials of the codex (i.e. 
the skins selected from flocks and herds), and the microorganisms on the object (which may highlight its use history and conservation risk) must be compatible with conventional conservation treatment. Our contribution to this field has been to develop [19], and here to refine, a triboelectric (eraser-based) sampling method to recover first protein and now DNA from parchment. Dry cleaning with PVC erasers is a common and widely used conservation technique [20], and we analyse the waste material from this process, which would otherwise be discarded.

We apply our approach to document for the first time the vast array of biological information contained within a single codex, recovering both DNA and proteins from the dry eraser waste of skins from the York [...]

Protein Analysis

This is the first systematic application of eZooMS [19] to identify the source species of every bifolium contained within a single codex (supplementary table 1). The York Gospels were found to be composed of skins from two animal species: the original gospels are of calfskin (except for one bifolium made of sheepskin) and the later additions (14th C) are made exclusively of sheepskin (figure 1, supplementary figure 1). Due to extensive conservation treatments carried out in the 1950s that involved covering entire folios in silk gauze, we were unable to determine the species for the final quire and flyleaves (folios 162-167).

DNA Analysis

DNA extraction was attempted from eight bifolia of the York Gospels from which large volumes of eraser waste (150-250 µl) had been generated; sufficient DNA concentrations were recovered from all samples for successful library preparation and high-throughput sequencing. Initial species identification was undertaken with FastQ Screen and in all cases the genetic species assignments agreed with those produced through eZooMS (supplementary table 2).
One sample (folio 158, Quire XXIII), which had previously provided an inconclusive species identification of calf during the proteomic analysis, was also assigned as calf in the genetic analysis. However, this page is one of those that underwent extensive invasive conservation, and even this combined assessment remains somewhat tentative. Insufficient data were recovered from two samples taken from the 14th C additions to the Gospels to confidently assign the source species using solely genetic methods; however, both samples had previously been conclusively identified as sheep via eZooMS.

Genetic analysis

DNA sequences recovered from the parchment samples were stringently aligned (supplementary table 3, supplementary methods) to the genome of the identified host (production) species to estimate the proportion of endogenous DNA retained within the bifolia of the York Gospels. This analysis resulted in a mean endogenous percentage of 19.3% (range 0.7-51.4%) over all the samples (supplementary table 3). For one bifolium, folio 125 (quire XVIII), 51.4% of reads could be aligned to the genome of the source species (cow); however, this proportion fell to 5.6% when filters for read mapping quality were applied (supplementary table 3). While some loss of mapped reads is expected post filtering, this extreme reduction suggests that there may be a bias in DNA sequence preservation or retrieval from the manuscript, one that in this case favoured repetitive genomic regions over more gene-rich euchromatic regions. This apparent taphonomic bias may be a reflection of the harsh alkaline treatment that is an integral part of the parchment production process, which might selectively degrade the more loosely packed euchromatin over the tightly bound heterochromatin (supplementary figure 2).
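The endogenous percentages above are simple ratios of host-aligned reads to total reads, computed before and after mapping-quality filtering. The following is a minimal sketch of that arithmetic only; the read counts are hypothetical values chosen to reproduce the percentages quoted for folio 125, and the function name is an illustrative assumption rather than part of the authors' pipeline.

```python
def endogenous_percent(mapped_reads: int, total_reads: int) -> float:
    """Percentage of sequenced reads that align to the host genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Hypothetical read counts, chosen only to match the quoted percentages.
total = 100_000           # all sequenced reads for the sample
mapped_all = 51_400       # reads aligned before quality filtering
mapped_filtered = 5_600   # reads surviving the mapping-quality filter

before = endogenous_percent(mapped_all, total)
after = endogenous_percent(mapped_filtered, total)

# Share of the initially mapped reads that pass the quality filter.
retained = 100.0 * mapped_filtered / mapped_all

print(f"endogenous: {before:.1f}% before, {after:.1f}% after filtering")
print(f"mapped reads retained by the filter: {retained:.1f}%")
```

Under these assumed counts only about one in nine of the initially mapped reads survives the filter, which is the scale of loss the text attributes to an excess of repetitive sequence.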

A mapDamage2.0 [22] analysis was completed on the filtered host reads recovered from the York Gospels to explore DNA damage patterns and the authenticity of the recovered DNA (supplementary figure 3). All samples were found to have characteristic markers of degraded ancient DNA, with an increase in deamination at the ends of reads. However, the frequency of these modifications is much lower than would be predicted for bone of a similar age [23]. Given that the DNA fragment lengths are also short in these recovered sequences (supplementary figure 4), this result may suggest that different DNA degradation processes are active in parchment and bone.

Sufficient sequence data were recovered from three samples (Fol. 13, Fol. 101 and Fol. 125) to permit further population genetic analyses. These samples were placed onto a reference dataset of modern cattle [24] in a principal component analysis (supplementary figure 5a and b) using the LASER2.0 software [25,26]. Although only a relatively small number of variants could be called in each of the three samples, they can be seen to cluster with modern cattle breeds of Northern Europe (supplementary figure 5b). Specifically, the sample with the highest genomic coverage, folio 101 (2,139 bovine HD SNPs called), falls just outside a cluster of Norwegian Red and Holstein animals, with the two lower coverage samples, folio 13 and folio 125, falling outside of the major European distribution (supplementary figure 5b). The position of folio 13 and 125 likely reflects the limited SNP recovery in these samples; however, it could also be an indication of true genetic diversity in these animals. As further studies of geographically and temporally localised animal bone reveal the genetic landscape of medieval cattle, we will be in a stronger position to localise these and future parchment objects.
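The damage assessment described earlier in this section rests on counting C-to-T mismatches near the 5' ends of reads. The sketch below illustrates that counting on toy data; it is not mapDamage2.0's implementation (which also models other substitutions and fits a statistical model), and it assumes gap-free, equal-length reference/read pairs written 5' to 3'. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def c_to_t_by_position(pairs, n_positions=5):
    """Return the C->T mismatch frequency at each of the first n read positions.

    pairs: iterable of (reference, read) strings, aligned without gaps
    and oriented 5' -> 3' (an assumption of this sketch).
    """
    ref_c = Counter()   # positions where the reference base is C
    c_to_t = Counter()  # positions where reference C is read as T
    for ref, read in pairs:
        for i, (r, q) in enumerate(zip(ref[:n_positions], read[:n_positions])):
            if r == "C":
                ref_c[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else 0.0 for i in range(n_positions)]


# Toy alignments: two of the three reference cytosines at position 0 read as T.
toy = [
    ("CACGT", "TACGT"),
    ("CCGTA", "CCGTA"),
    ("CGCAT", "TGCAT"),
    ("ACCGT", "ACCGT"),
]
freqs = c_to_t_by_position(toy, n_positions=3)
print(freqs)
```

A profile that is elevated at the first position and falls away inside the read is the pattern usually taken as a marker of ancient DNA; the text reports that this elevation is present in the parchment samples but weaker than expected for bone of similar age.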

Sex identification [27] was attempted for all eight bifolia sampled for DNA analysis (supplementary table 2), with high-confidence assignments being deduced for five. Four out of the five reliably typed animals in the original Gospel document (prior to the later circa 14th C additions) were found to be female.

Exogenous DNA

Human

As the novel non-invasive DNA sampling technique utilised in this study samples molecules from the surface of documents, the exogenous/environmental DNA residing on the York Gospels was explored. Firstly, to estimate the upper bound of human DNA on the document, reads recovered from the Gospels were aligned to the human genome (hg19 [...]

Microbiome

To further classify the York Gospels metagenome, recovered sequences were analysed using two independent metagenomic pipelines, One Codex [30] and metaBIT [31] (supplementary figure 6a and b), to try to reduce method-based classification biases [32]. Six samples from archival documents held in the Borthwick Archive were also included for comparison. The taxon distributions generated from the metaBIT and One Codex analyses of the York Gospel samples were found to be consistent with those previously reported for the skin microbiome (supplementary figure 6a and b) [31,33]. This result is in agreement with the fact that the skin microbiome is the most common component of the urban microbiome [34], in particular on handled surfaces [35].

To further investigate this skin microbial signature, a principal coordinates analysis (PCoA) of the York Gospel samples was conducted with the Human Microbiome Project (HMP) human-associated microbial profiles provided with metaBIT at genus level (figure 2a). All the York Gospel samples, irrespective of their human/endogenous DNA percentages or conservation status, were found to have a microbial profile which placed them within the HMP skin and nose diversity (figure 2a).
This tight clustering of the York Gospel samples is not seen in the six comparative documents, whose broader distribution may reflect their more diverse life histories prior to and within the archives (supplementary figure 7). The skin microbial signature is further highlighted in an abundance heatmap (figure 2b), which shows the presence of skin microbial markers (e.g. Propionibacterium and Staphylococcus) at high relative frequencies. Importantly, these skin microbiome signatures are absent in the control sample (supplementary figure 8), suggesting they reflect microbiota colonising the parchment itself and are not the result of laboratory contamination [36,37]. Of importance to the continued conservation of the York Gospels is the discovery of the Saccharopolyspora genus on all bifolia, including the conserved sample folio 158. This bacterial genus has been identified previously by Piñar et al. [13] as a possible cause of a measles-like (maculae) spotting of parchment, which is associated with localised collagen damage and document degradation [13].

To explore shared patterns of microbial colonisation within the York Gospels, samples were clustered according to their genus profiles (dendrogram, figure 2b). This analysis placed the microbial composition of the highly conserved folio 158 as an outlier to all other York Gospel samples; moreover, the two later additions to the manuscript (Fol. 3 and Fol. 6) are seen to cluster. Interestingly, two samples with relatively high endogenous DNA content (Fol. 13 and 101) also fall together, a tentative hint at the possibility of future correlations between microbial colonisation and DNA retrieval. The clustering analysis was then repeated using the combined Borthwick Archive / York Gospels dataset (supplementary figure 8). In this analysis three major groupings can be seen; firstly, folio 158 is again seen as an outlier.
Two further internal groupings are then revealed (supplementary figure 8) [...] figure 9c), highlighting five significantly differentiated taxa between the groups (Saccharopolyspora, Pseudonocardia, Actinopolyspora, Propionibacterium and Staphylococcus). These results seem to further reflect the differences in the level of handling between the documents, with the York Gospels microbiome containing significantly more of the skin microbiome components Propionibacterium and Staphylococcus, but might also provide insights into their relative conservation priorities, with the York Gospels being more heavily colonised overall by Saccharopolyspora (supplementary figure 9c).

The limited number of significantly differentiable taxa in these metagenome comparisons may be indicative of the small sample size in this study and the inherent difficulty of metagenomic taxonomic assignment from shotgun sequencing [32]. However, it could also be an indication of a common parchment microbiome, maintained by the specific microbial growth conditions (salt rich) provided by the parchment's surface [13].

[...] represent >5% abundance in at least one sample. Clustering of samples (dendrogram) was completed using the complete metaBIT genus filtered output. The highly conserved sample Fol. 158 is seen as an outlier to the other York Gospel samples and the two later additions (Fol. 3 and 6) are seen to cluster.
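The genus-level comparison above can be illustrated with a short sketch: genera are retained if they exceed 5% abundance in at least one sample, and samples are then compared by a pairwise dissimilarity. The Bray-Curtis measure used here is an assumption for illustration, since the text does not state which distance underlies the dendrogram, and the sample names and abundances are hypothetical.

```python
def filter_genera(profiles, threshold=5.0):
    """Keep genera whose abundance exceeds the threshold in at least one sample."""
    keep = {g for p in profiles.values() for g, a in p.items() if a > threshold}
    return {s: {g: p.get(g, 0.0) for g in sorted(keep)} for s, p in profiles.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    genera = set(a) | set(b)
    num = sum(abs(a.get(g, 0.0) - b.get(g, 0.0)) for g in genera)
    den = sum(a.get(g, 0.0) + b.get(g, 0.0) for g in genera)
    return num / den if den else 0.0


# Hypothetical relative abundances (percent); not values from the study.
profiles = {
    "sample_A": {"Propionibacterium": 40.0, "Staphylococcus": 30.0,
                 "Saccharopolyspora": 28.0, "RareGenus": 2.0},
    "sample_B": {"Propionibacterium": 38.0, "Staphylococcus": 32.0,
                 "Saccharopolyspora": 27.0, "RareGenus": 3.0},
    "sample_C": {"Propionibacterium": 5.0, "Staphylococcus": 4.0,
                 "Saccharopolyspora": 90.0, "RareGenus": 1.0},
}

filtered = filter_genera(profiles)
d_ab = bray_curtis(filtered["sample_A"], filtered["sample_B"])
d_ac = bray_curtis(filtered["sample_A"], filtered["sample_C"])
print(f"A vs B: {d_ab:.3f}, A vs C: {d_ac:.3f}")
```

In this toy data the two similar profiles sit close together while the third is far from both, which is the kind of structure a dendrogram would render as one tight pair and one outlier.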

Illuminated manuscripts represent an irreplaceable historical record, but the need to conserve these documents is seemingly at odds with their value as an important reservoir of contemporaneous biological information. By extending the non-invasive eZooMS method to include the analysis of DNA, we propose a cost-effective and simple-to-use biomolecular sampling technique to enable this historical resource to be explored. The current study applies this combined methodology to document for the first time a rich palimpsest of biological information from a complete book object of great cultural value over its 1,000-year history.

Species composition and book production
Within this study the species composition of the York Gospels has been revealed: the primary document is composed almost exclusively of calfskin, apart from a single bifolio made of sheepskin. This detailed analysis further highlights the utility of eZooMS for high-throughput species identification of manuscripts. With a low cost and analysis time per sample, this technique has the potential to describe the species composition of many other documents and provide further insights into illuminated manuscript production.

Zooarchaeology usually struggles to obtain accurate population-sized assemblages, with collections often processed and fragmented and rarely constrained to a narrow time range [39,40]. The analysis of the species composition of manuscripts may therefore have implications which extend beyond the documents' production, by providing a more refined understanding of past animal population sizes with a tighter chronology than can often be obtained from archaeological assemblages alone.

Although a possible artefact of the small sample number, the frequency of female animals (4 females, 1 male) among the calves is worthy of mention. As cattle are slaughtered for parchment production as juveniles (supplementary materials and methods), male calves, of lesser reproductive value than females, would be hypothesised to be most often selected for this purpose. If this contradictory pattern of an excess of female calves were to be confirmed by further sampling, a possible explanation could lie in the correlation between the writing of the manuscript and a historical outbreak of murrain, tentatively identified as rinderpest or a closely related morbillivirus ancestor. Although the composition date of the York Gospels is still debated, it is possible they were written in Canterbury around 990 CE, shortly after a major outbreak of cattle plague occurred in Great Britain and Ireland.
At least seven independent sources describe widespread cattle mortalities in England, Wales, Ireland and possibly Scotland between 986 and 988 [41,42]. Medieval and later texts demonstrate that flaying the carcass of a diseased animal to use its skin was an accepted way of cutting losses. The murrain could therefore have produced an abundant source of foetal and newborn calfskins of both sexes, possibly so numerous that they were still being used as parchment two to four years after the outbreak. Indeed, even if these skins were not produced as a consequence of the murrain but there was a widespread local mortality in the years before the text was written, it would seem unusual to sacrifice the very animals (females) required to rebuild the herd. Further analyses of the mortality patterns and local outbreaks of disease could help to refine the chronology of the text.

An alternative explanation could lie in the value of the female calves themselves, as cattle are proposed to have been a great source of wealth and power in the Anglo-Saxon world. If female calves were of higher value than male, perhaps the selection of female animals reflects the choice of the very best and most expensive material available to receive the holy word: a sacrifice of prized animals fit to answer the sacrifice of the Lord, and to demonstrate the faith and the wealth of the commissioner of the manuscript [43].

Finally, our assumption about the higher value of female calves, which relies mostly on data from earlier (Roman period) or later (13th-14th century) texts, may be erroneous. The Saxon economy relied mostly on oxen for traction [44], and the latter may have out-valued heifers, leading to occasional surfeits of female calves.

In this context of the perceived value of calves, the presence of a single sheepskin bifolio in the original Gospel text seems out of place.
In the accounts of a Cistercian Abbey at Beaulieu (1269-1270) the best sheepskin parchment is worth less than the worst calf parchment [45]. Determining the extent to which the selection of sex and of species reflected differences in quality, perceived notions of value or a spiritual dimension would require a more comprehensive study of Anglo-Saxon manuscripts.

In relation to the later sheepskin additions to the document, a 16th century inventory describes the York Gospels as "A text, decorated with silver, not well gilt, on which the oaths of the dean and other dignities and canons are inserted at the beginning" [21]. This description would seem to imply that not only the text but the bifolia themselves were later 14th C additions. A possible explanation for the use of sheepskin for these subsequent additions, which contain oaths, deeds and personal correspondence, is found in the Dialogus de Scaccario [46], which describes a preference for legal documents to be written on sheepskin to avoid erasure and fraud of the written details. Unlike calfskin and goatskin, sheepskin has a lower density of collagen fibres at the base of the (more abundant) hair follicles, which means that the skin can split and the upper layer peel away if the parchment is roughly abraded.
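The caution earlier in this section that the sex ratio (4 females, 1 male) may be an artefact of small sample size can be made concrete with a simple binomial calculation. The sketch below assumes an equal probability of either sex and independent sampling of animals; both are simplifying assumptions for illustration and not an analysis reported by the authors.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """Probability of observing k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of at least 4 females among 5 typed calves if both sexes are equally likely.
p_four_or_more = binomial_tail(4, 5)
print(f"P(>= 4 females of 5) = {p_four_or_more:.4f}")
```

Under these assumptions a result at least this skewed would arise in roughly one of every five or six such samples by chance alone, which is consistent with the text's call for further sampling before interpreting the pattern.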

Population genetic analysis
This study is, to our knowledge, the first to use a non-invasive method to retrieve host genetic data from parchment, arguably the most important biological material of the Medieval world. We were able to recover endogenous DNA from all six of the samples taken from the original, almost one-thousand-year-old York Gospels document. Although only limited DNA sequencing was undertaken in this analysis, there were still sufficient data to estimate the genetic affinities of half of these samples, placing them within European cattle diversity.

The loss of more data than would be expected through standard bioinformatic filters for repetitive sequences may represent a limitation of this nucleotide retrieval technique, as the abundance of these sequences in our dataset restricted the utility of the host genomic data. However, this increase in repetitive sequences has also been observed, to a lesser extent, in DNA samples from younger cut pieces of sheep parchment [1]. Given the extensive sampling opportunities that could be provided by our novel non-invasive method, a further comprehensive analysis of a diverse range of documents is warranted to see whether this effect is truly a consistent artefact of eraser-based extraction, or an as yet undescribed feature of DNA recovered from older parchments. Encouragingly, one sample (folio 101) did have sufficient genetic density to enable clear comparisons with reference populations and was found to closely resemble breeds of North West Europe, a result that matches the provenance of the object.

These analyses highlight a second text contained within the manuscripts of the Medieval world, which speaks to the historical management of animals and the further possibility of geolocating the source of the parchment by comparison with the archaeological record of butchered bone.
Moreover, as methods for DNA recovery and analysis are optimised further, the phenotype of animals selected for parchment production may be documented [47,48]. Genetic analyses could reveal not only the sex of the animals, but also their coat colour and morphological features (polled vs. non-polled), while epigenetic analyses may in the future provide insight into the animal's age [49,50].

The analysis of microbial communities that inhabit the built environment is an expanding field of research, and is moving into heritage science. We have developed an easy-to-use non-invasive sampling method, which can recover high-resolution microbial data from sensitive documents. Our culture-free technique has benefits over other recently proposed sampling methodologies [6] in that it co-opts a widely accepted manuscript cleaning technique [20] and as such can be reliably implemented into workflows by conservators and codicologists.

One of the greatest factors influencing interpretation of microbiomes is technical variation in the analysis [51]; we therefore opted for a shotgun approach to guard against known artefacts in the 16S analysis of short ancient DNA sequences [52]. Moreover, although it is known that high-molecular-weight (HMWt) DNA is preserved on filter paper (Owens and Szalanski 2005), a surface in some ways analogous to parchment, we chose not to shear the extracted DNA, in an attempt to exclude recent DNA transferred by handling and preserve highly degraded endogenous molecules [53]. Reagent contamination is also a known issue for studies such as ours with limited DNA starting concentrations [36,37], which we addressed through the use of appropriate blank controls. Finally, we utilised multiple software packages in our analysis to improve the resolution of taxon presence and abundance [32]. Despite the relative immaturity of shotgun metagenomic analyses, it is encouraging that the different analytical pipelines used in the study gave relatively consistent results, and that the species of interest overlap with those previously described in the parchment microbiome.

The microbial signature of the York Gospels reflects the nature and use of this document in that it resembles that of the human skin microbiome, a finding reinforced by the level of human DNA discovered on the surface of the document, which exceeded 15% of recovered reads in some cases. The increased abundance of the Propionibacterium genus seen in this analysis compared to other studies likely reflects the significant amount of handling the York Gospels has been subject to, including its use in ecclesiastical ceremonies to this day. An alternative explanation is that the true extent of colonisation by Propionibacterium species on parchments may have been underestimated by some 16S studies, as the targeted sequencing of hypervariable region 4 has been shown to limit the resolution of skin commensal microbiota, particularly Propionibacterium [54]. The discovery of the possibly destructive Saccharopolyspora genus [13] at high concentrations on the surface of the York Gospels further highlights the utility of metagenomic analyses that seek to describe the parchment metagenome. Illuminated manuscripts are part of our collective heritage, and methods like these that can help to target conservation efforts will be of great use.

Comparison of the microbial signatures generated in this analysis highlighted a distinct difference between a highly conserved bifolio and the remaining bifolia from the York Gospels and enabled documents that were later additions to the manuscript to be distinguished. These results raise the possibility of future molecular provenancing of documents as more manuscripts have their microbiomes explored. The metagenomic signals from our comparisons of the Borthwick Archive and York Gospel samples strengthen this assertion as, overall, we could distinguish between the two groups.
The limited number of statistically differentiated taxa between these sample sets with very divergent conservation histories raises the possibility that the microbial colonisation of documents, though widespread, may be limited to certain species that can tolerate the growth conditions at the parchment's surface [13].

Conclusion

This study is the first non-invasive biomolecular analysis of a complete book object. Both protein and genetic analyses have been applied to reveal the animal origins of the 167 parchment folios that make up the York Gospels. This is the first time a non-invasive sampling technique has been used to recover host DNA from parchment manuscripts, which has allowed not only the determination of source species but also, with limited sequencing, an estimate of the genetic affinities and sex of the animals. In addition, our novel sampling method enabled the recovery of detailed information on the microbiome associated with individual folios, which will not only be of great interest for book conservation but also has the potential to inform about past storage and handling of book objects.

A full description of materials and methods is provided in supplementary methods. Briefly, the York Gospels and archival documents were sampled using the dry non-invasive eraser-based sampling technique of Fiddyment et al. [19], with 86 folios sampled for protein analysis, and a further eight folios and six archival documents sampled for DNA analysis. eZooMS analysis of the York Gospels and archival documents was completed following the protocol of Fiddyment et al. [19]. DNA was extracted from the eraser crumbs following a modified version of the protocol of Fiddyment et al. [...] and sequenced on an Illumina MiSeq. Raw sequencing reads were trimmed of adapter sequences using cutadapt [57], aligned to appropriate reference genomes using BWA [58] and filtered with SAMtools [59]. DNA damage assessments were completed using mapDamage2.0 [22] and population genetic PCA analyses with LASER2.0 [25]. Metagenomic analyses were completed from host-filtered datasets (supplementary methods) using metaBIT [31], One Codex [30] and STAMP [38].