Bone need not remain an elephant in the room for radiocarbon dating

Radiocarbon (14C) analysis of skeletal remains by accelerator mass spectrometry is an essential tool in multiple branches of science. However, bone 14C dating results can be inconsistent and not comparable due to disparate laboratory pretreatment protocols that remove contamination. Moreover, pretreatments are rarely discussed or reported by end-users, making them an ‘elephant in the room’ for Quaternary scientists. Through a questionnaire survey, I quantified consensus on the reliability of collagen pretreatments for 14C dating across 132 experts (25 countries). I discovered that while more than 95% of the audience was wary of contamination and would avoid gelatinization alone (minimum pretreatment used by most 14C facilities), 52% asked laboratories to choose the pretreatment method for them, and 58% could not rank the reliability of at least one pretreatment. Ultrafiltration was highly popular, and purification by XAD resins seemed restricted to American researchers. Isolating and dating the amino acid hydroxyproline was perceived as the most reliable pretreatment, but is expensive, time-consuming and not widely available. Solid evidence supports that only molecular-level dating accommodates all known bone contaminants and guarantees complete removal of humic and fulvic acids and conservation substances, with three key areas of progress: (i) innovation and more funded research are required to develop affordable analytical chemistry that can handle low-mass samples of collagen amino acids, (ii) a certification agency overseeing dating-quality control is needed to enhance methodological reproducibility and dating accuracy among laboratories, and (iii) more cross-disciplinary work with better 14C reporting etiquette will promote the integration of 14C dating across disciplines. Those developments could conclude long-standing debates based on low-accuracy data used to build chronologies for animal domestications, human/megafauna extirpations and migrations, archaeology, palaeoecology, palaeontology and palaeoclimate models.


Decision letter (RSOS-201351.R0)
We hope you are keeping well at this difficult and unusual time. We continue to value your support of the journal in these challenging circumstances. If Royal Society Open Science can assist you at all, please don't hesitate to let us know at the email address below.

Dear Dr Herrando-Pérez
The Editors assigned to your paper RSOS-201351 "Bone need not remain an elephant in the room for radiocarbon dating" have now received comments from reviewers and would like you to revise the paper in accordance with the reviewer comments and any comments from the Editors. Please note this decision does not guarantee eventual acceptance.
We invite you to respond to the comments supplied below and revise your manuscript. Below the referees' and Editors' comments (where applicable) we provide additional requirements. Final acceptance of your manuscript is dependent on these requirements being met. We provide guidance below to help you prepare your revision.
We do not generally allow multiple rounds of revision so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Editors, your manuscript will be sent back to one or more of the original reviewers for assessment. If the original reviewers are not available, we may invite new reviewers.
Please submit your revised manuscript and required files (see below) no later than 21 days from today's (ie 20-Oct-2020) date. Note: the ScholarOne system will 'lock' if submission of the revision is attempted 21 or more days after the deadline. If you do not think you will be able to meet this deadline please contact the editorial office immediately.
Please note article processing charges apply to papers accepted for publication in Royal Society Open Science (https://royalsocietypublishing.org/rsos/charges). Charges will also apply to papers transferred to the journal from other Royal Society Publishing journals, as well as papers submitted as part of our collaboration with the Royal Society of Chemistry (https://royalsocietypublishing.org/rsos/chemistry). Fee waivers are available but must be requested when you submit your revision (https://royalsocietypublishing.org/rsos/waivers).

Both reviewers agree that this paper represents a valuable contribution in giving non-specialists tools to be more informed consumers of radiocarbon dates, and in advocating for greater transparency in publications about the metadata of dated specimens. However, Reviewer 1 raises some valid concerns about how some material is presented and the recommendations made, and I agree that the authors should incorporate these suggestions before the paper can be accepted for publication.
Reviewer comments to Author:
Reviewer: 1 Comments to the Author(s)
I'm in two minds about this paper. On the one hand it will be really refreshing for radiocarbon scientists to get valuable input from someone knowledgeable, but slightly outside the field. It is also great to have many of the concerns radiocarbon scientists have shouted about for years (often to people submitting radiocarbon samples, and then publishing dates) forcibly described. But on the other hand, I worry that it will 1. scare potential users of radiocarbon labs into never attempting to radiocarbon date bone (except at Oxford…?!) and 2. appears to claim scientific credibility over which pretreatment methods are best, but is really based on the opinion of people who submit radiocarbon samples (and acknowledge they don't know about pretreatment) and radiocarbon lab staff. I think a lot of this can be addressed by the way in which the paper is written. I've also made some suggestions so as not to alienate the C14 community.
As a starting point, I think the author needs to show how often, and when radiocarbon dates on bone actually are problematic -and do this very early on in the paper. Note -this is not the same as taking the statistic from the survey to show that people submitting bones think they are contaminated. First the author needs to establish whether he is just talking about Pleistocene bones or all bones (as implied in the current text). The author mentions the problem of young contamination in very old samples, but 1% modern contamination in a sample that is 2000 years old will only shift the age by 23 years. All of the references I checked are about Pleistocene samples where samples are very sensitive to young contaminants. Second, the author would need to use recent examples (last 10 years or so), as pretreatment protocols and background calculation methods have recently changed. The assumption that radiocarbon dates on bone are always problematic is a little outdated in my opinion (though I have no survey to prove this). This is very important, as I think someone reading the paper with little knowledge of the field would assume that the author is implying that all, or most, radiocarbon dates on bone are probably wrong. The author nicely described the two cases where there are still currently problems -humic (severely contaminated e.g. black bone in peat) and conservation treatments (could add embalming as well). These are actually quite rare in my experience (although I admit the conserved bones are often valuable specimens and often very important to get an accurate age on). These samples are very easy to identify prior to pretreatment. Even though many people think the samples they submit are contaminated (line 202) -this is people 'thinking' they are contaminated. What does this really mean -do they have evidence they are contaminated and how severely are they contaminated? Of course, we would all think that buried bones may be contaminated but this does not mean that they fall into the two categories above, or that contaminants cannot be removed with the standard ultrafiltration pretreatment.
I think the search for the 'best' method is not advisable. The paper does touch on this aspect, but I think more needs to be made of it. There are multiple degrees and types of contamination, so why should there only be one pretreatment? This is especially important when that 'best' method is much more destructive and so irresponsible and unethical (or impossible) to use in many (or I would argue all) circumstances. I take the view that it is better to start with a 'good' pretreatment method that we know works in the majority of cases (ie ultrafiltration or XAD-resin) and escalate to a single amino acid approach only where needed (and from what I've read of the papers coming out of the Oxford lab, this is very rare).

Comments by section:
Abstract -Perhaps you could argue that journals need to take the lead on reporting etiquette? People have been shouting about this for years (Millard 2014 Radiocarbon for example), but few seem to listen. Unless a radiocarbon chemist is included on a paper, the necessary details are still rarely published, as the author rightly states. There are simply not enough radiocarbon chemists to help write every paper including radiocarbon dates!
Lines 55-57 -no reference is really needed, the maths is straightforward (isotope mixing equation, and then F14C to BP). Also, consider the effect in Holocene aged bones. 1% modern C contamination in a sample that is 2000 years old will only shift the age by 23 years. This is well within the quoted uncertainties (often ±25 14C yr at 1 sigma). Whilst it may cause a problem in the most high precision models if every sample is systematically affected, it is unlikely to be noticeable in the vast majority of applications (and I would envisage that other factors are likely to be much more of an issue).
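The arithmetic behind this point can be reproduced in a few lines. The following is a minimal sketch and not part of the original review: it assumes a simple two-component mixing model, a contaminant with F14C = 1 ('modern'), and the Libby mean-life of 8033 years on which conventional radiocarbon ages are based. The function names are hypothetical.

```python
import math

# Libby mean-life in years; conventional 14C ages are defined with this value.
LIBBY_MEAN_LIFE = 8033.0


def age_to_f14c(age_bp: float) -> float:
    """Convert a conventional radiocarbon age (yr BP) to F14C."""
    return math.exp(-age_bp / LIBBY_MEAN_LIFE)


def f14c_to_age(f14c: float) -> float:
    """Convert F14C to a conventional radiocarbon age (yr BP)."""
    return -LIBBY_MEAN_LIFE * math.log(f14c)


def contaminated_age(true_age_bp: float,
                     contaminant_fraction: float,
                     contaminant_f14c: float = 1.0) -> float:
    """Apparent age after mixing the sample with a carbon contaminant.

    Assumption: linear two-component isotope mixing by carbon mass.
    """
    mixed = ((1.0 - contaminant_fraction) * age_to_f14c(true_age_bp)
             + contaminant_fraction * contaminant_f14c)
    return f14c_to_age(mixed)


# Compare a Holocene sample with two Pleistocene samples, each with 1% modern carbon.
for true_age in (2000, 20000, 40000):
    apparent = contaminated_age(true_age, 0.01)
    print(f"true {true_age} BP -> apparent {apparent:.0f} BP "
          f"(shift {true_age - apparent:.0f} yr)")
```

Under these assumptions the 2000-year-old sample shifts by roughly 23 years, as stated above, whereas the same 1% of modern carbon shifts a 40,000-year-old sample by several thousand years.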
Line 76 -'contamination from human and megafauna bone' -just these types of bone? I think you really mean 'Pleistocene age bone'.
Line 97 -There is not just one pretreatment, in the same way as there is not just one type of contamination. So, why should we be looking for 'the one'?
Line 219 -I think you need to remind the reader here that the question was about when 'a bone sample was *severely* contaminated with exogenous carbon' (my own emphasis). To most people, I would imagine that this meant soaked in consolidant (it certainly means that to me rereading the question). These are not the normal bones that you would find in the normally dated, recently excavated/ museum curated, bagged faunal bone (which is often selected because it is worked in some way or of a specific species, but has been kept away from the museum conservators working on the beautiful specimens…). If true, it would imply that people were not responding to what they would do for a routine pretreatment.
Results section: general It would be interesting to see the breakdown in pretreatment selection by people who work on collagen extraction separately to those who just use the data and don't work in a lab. Are they different?
The first paragraph of the discussion is very repetitive.
Line 265 -the cost of a hyp date would be much much larger in a lab if only a few dates were generated each year -they would need to pay a post-doc/ technician with substantial prior knowledge dedicated to run this method, and for the vast numbers of standards required to monitor column bleed. I'd imagine for most labs the cost (excluding the equipment which needs to be dedicated to radiocarbon to keep contamination under control) would be too much.
Line 279 -it might be nice to acknowledge that ORAU did not in fact propose the ultrafiltration method or use it first -this was Brown 1988. They have a very loud spokesperson who has driven some large high profile projects using the method. I think the list of chosen publications is in part due to publicity (this is not in any way meant to be a negative comment about Oxford, it just reflects good communication).
Line 288 -289 -this would only matter if the methods did not give accurate/ equivalent results surely. The problem is the few studies which compare the methods. I personally think that different methods might be suited to different locations. There would be little point in a lab working on bone from cold regions focusing on methods to clean low yielding bones. And there would be no point in a lab focusing on an area with limited museum curation to use methods to remove consolidants.
Line 299 -301 -this is nothing to do with added contamination, only that we don't really know what the ultrafilters remove. It should go right up in the intro where you first discuss what the ultrafilters do.
Line 304 -the age of the humectant actually changes between batches. It was old, but is now mainly modern in age (i.e. much more problematic for Pleistocene aged bones), or occasionally late Holocene. I can't remember when the change in age happened, but it was prior to 2010. See Wood et al. 2012 Radiocarbon 52 (2). There is another paper, I think by Brock in Radiocarbon, on a batch with a late Holocene age, but I can't find it.
Line 307 -acid washing the membrane would damage it -so the increase in collagen yield is due to the membrane not doing the same thing as before (and note an earlier comment that I think we are not exactly sure what ultrafiltration does to improve the age…).
Line 358 -whole-bone %N is not used to assess collagen quality in this reference. It is only used as a very rough indicator of whether collagen may be present. It is a prescreening method only.
Line 360 -near infra-red methods are screening methods (see comment for line 358).
Line 367 -' Dating individual samples several times will however be often beyond a researcher's budget' -but HPLC is not?!
Line 384 -My major issue is the need for a dedicated and knowledgeable researcher to run and maintain the setup. Plus, unless a lab is using it continuously, the constant background samples that would need to be run would be prohibitive in terms of time and money (and precious standard material).
Line 385-393 -Really great to have exciting suggestions here, though I do worry about adding carbonaceous reagents to the process. Given the problems you talk about with column bleed and membrane humectant, adding reagents is a worry I think you need to mention.
Line 400 -I would add to this -culturally valuable material. Why destroy large amounts of material for no extra accuracy? Bone from indigenous Australian/ American (and probably numerous other places I don't know about) cannot be destroyed unless absolutely necessary. Many museum curators, even in western museums, will not allow unnecessary destruction (and in my view, rightly so). Is it not better to try ultrafiltration/XAD and escalate to single amino acid where needed? This has two advantages: 1. you don't destroy material unnecessarily; 2. you will know if there is enough collagen present to justify taking a larger sample, and how much to sample.
Line 420 -It might be an idea to look up some of the work that Heritage England (Alex Bayliss) has done on comparisons. They replicate many bone samples between labs to assess accuracy and comparability. Only a few labs are involved, but there are many samples involved.
Line 434 -my initial response to this was 'If the case, screw universities, and let's make all labs commercial with no freedom to innovate. I'm leaving...'. I doubt I'd be the only radiocarbon researcher to think this. To avoid alienating some of your audience, can you explain how this would not be the case?
Line 437 -I think this is already accepted, and from what you have written, I think you also already accept this. On the other hand -would anyone really put too much weight on any dataset generated in the 1960s and say it is comparable to present day data without any cross checking? It seems a bit harsh to say that this is only a problem with radiocarbon dates.
Line 450 -the easiest samples are ones which fall beyond the limit of the radiocarbon method. Then, any 14C is a contaminant. The method is incredibly sensitive to them (but also to the calculation of the lab background -another massive problem for very old dates). And we do not need to know the age of the samples -only that they are >60,000 yr. This vastly increases the number of samples available as U-series/ OSL/ ESR and even tephrochronology are not without their own problems. The problem will be in finding large enough *severely and consistently contaminated* samples which can be completely destroyed when sending to so many radiocarbon labs. Perhaps this is why the process has not happened?
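To illustrate how sensitive such 'radiocarbon-dead' material is, here is a minimal sketch, again not part of the original review. It assumes the sample itself contains no 14C, that the contaminant is modern (F14C = 1), and that no laboratory background correction is applied; the function name is hypothetical.

```python
import math

# Libby mean-life in years; conventional 14C ages are defined with this value.
LIBBY_MEAN_LIFE = 8033.0


def apparent_age_of_dead_sample(modern_fraction: float) -> float:
    """Apparent conventional age (yr BP) of a sample with no 14C of its own.

    Assumption: all measured 14C comes from a modern contaminant that makes up
    `modern_fraction` of the carbon, so the measured F14C equals that fraction.
    """
    if not 0.0 < modern_fraction <= 1.0:
        raise ValueError("modern_fraction must be in the interval (0, 1]")
    return -LIBBY_MEAN_LIFE * math.log(modern_fraction)


# Apparent ages for decreasing amounts of modern carbon.
for fraction in (0.01, 0.001, 0.0005):
    age = apparent_age_of_dead_sample(fraction)
    print(f"{fraction:.2%} modern carbon -> apparent age {age:.0f} BP")
```

Under these assumptions, 1% modern carbon alone gives an apparent age of roughly 37,000 14C yr BP, and 0.1% gives roughly 55,500 14C yr BP, so any finite age measured on such material points to contamination or to the background correction.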
Line 477 -I think this quote will alienate much of the 14C community. I would argue that they now spend huge amounts of money on pretreatment. Think of all of the technician time processing the samples let alone the time in pretreatment research or quality control, especially now an AMS can be run by 1 person working part of the time. This seems a ridiculous and slightly offensive comment to me. Radiocarbon labs have long histories of trying to address contamination issues, and long histories of trying to communicate the difficulties with their users (look at any 14C review). This quote seems to suggest that labs have ignored the problem, and this is demonstrably not the case. Perhaps count the number of papers in the journal Radiocarbon on pretreatment (remembering we need to work on numerous different materials) vs. the number on AMS development.

Reviewer: 2 Comments to the Author(s)
The paper is very good as it is and I think is tackling a very important issue for the scientific community that does research around the radiocarbon dating of bone.
I have one comment with regards to Figure 2: starting from powder is not the standard procedure. Many bone demineralization protocols start with pieces of bone instead of powder. It would be good to place that option in the figure. In line 963: it says "xcliding". Fix it.
In several parts of the text, the word "imino" acid appears. I'm not sure if you are actually referring to imino acids (a molecule related to amino acids) or if this is just a mistake. Please check. I'm not attaching any file, since I don't have any further comments.

===PREPARING YOUR MANUSCRIPT===
Your revised paper should include the changes requested by the referees and Editors of your manuscript. You should provide two versions of this manuscript and both versions must be provided in an editable format: one version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); a 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them. This version will be used for typesetting if your manuscript is accepted. Please ensure that any equations included in the paper are editable text and not embedded images.
Please ensure that you include an 'acknowledgements' section before your reference list/bibliography. This should acknowledge anyone who assisted with your work, but does not qualify as an author per the guidelines at https://royalsociety.org/journals/ethicspolicies/openness/.
While not essential, it will speed up the preparation of your manuscript proof if accepted if you format your references/bibliography in Vancouver style (please see https://royalsociety.org/journals/authors/author-guidelines/#formatting). You should include DOIs for as many of the references as possible.
If you have been asked to revise the written English in your submission as a condition of publication, you must do so, and you are expected to provide evidence that you have received language editing support. The journal would prefer that you use a professional language editing service and provide a certificate of editing, but a signed letter from a colleague who is a native speaker of English is acceptable. Note the journal has arranged a number of discounts for authors using professional language editing services (https://royalsociety.org/journals/authors/benefits/language-editing/).

===PREPARING YOUR REVISION IN SCHOLARONE===
To revise your manuscript, log into https://mc.manuscriptcentral.com/rsos and enter your Author Centre -this may be accessed by clicking on "Author" in the dark toolbar at the top of the page (just below the journal name). You will find your manuscript listed under "Manuscripts with Decisions". Under "Actions", click on "Create a Revision".
Attach your point-by-point response to referees and Editors at Step 1 'View and respond to decision letter'. This document should be uploaded in an editable file type (.doc or .docx are preferred). This is essential.
Please ensure that you include a summary of your paper at Step 2 'Type, Title, & Abstract'. This should be no more than 100 words to explain to a non-scientific audience the key findings of your research. This will be included in a weekly highlights email circulated by the Royal Society press office to national UK, international, and scientific news outlets to promote your work.

At Step 3 'File upload' you should include the following files:
--Your revised manuscript in editable file format (.doc, .docx, or .tex preferred). You should upload two versions: 1) One version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); 2) A 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them.
--An individual file of each figure (EPS or print-quality PDF preferred [either format should be produced directly from original creation package], or original software format).
--An editable file of each table (.doc, .docx, .xls, .xlsx, or .csv).
--An editable file of all figure and table captions. Note: you may upload the figure, table, and caption files in a single Zip folder.
--Any electronic supplementary material (ESM).
--If you are requesting a discretionary waiver for the article processing charge, the waiver form must be included at this step.
--If you are providing image files for potential cover images, please upload these at this step, and inform the editorial office you have done so. You must hold the copyright to any image provided.
--A copy of your point-by-point response to referees and Editors. This will expedite the preparation of your proof.

At Step 6 'Details & comments', you should review and respond to the queries on the electronic submission form. In particular, we would ask that you do the following:
--Ensure that your data access statement meets the requirements at https://royalsociety.org/journals/authors/author-guidelines/#data. You should ensure that you cite the dataset in your reference list. If you have deposited data etc in the Dryad repository, please include both the 'For publication' link and 'For review' link at this stage.
--If you are requesting an article processing charge waiver, you must select the relevant waiver option (if requesting a discretionary waiver, the form should have been uploaded at Step 3 'File upload' above).
--If you have uploaded ESM files, please ensure you follow the guidance at https://royalsociety.org/journals/authors/author-guidelines/#supplementary-material to include a suitable title and informative caption. An example of appropriate titling and captioning may be found at https://figshare.com/articles/Table_S2_from_Is_there_a_trade-off_between_peak_performance_and_performance_breadth_across_temperatures_for_aerobic_scope_in_teleost_fishes_/3843624.

At Step 7 'Review & submit', you must view the PDF proof of the manuscript before you will be able to submit the revision. Note: if any parts of the electronic submission form have not been completed, these will be noted by red message boxes.

Author's Response to Decision Letter for (RSOS-201351.R0)
See Appendix A.

Decision letter (RSOS-201351.R1)
We hope you are keeping well at this difficult and unusual time. We continue to value your support of the journal in these challenging circumstances. If Royal Society Open Science can assist you at all, please don't hesitate to let us know at the email address below.
Dear Dr Herrando-Pérez,
It is a pleasure to accept your manuscript entitled "Bone need not remain an elephant in the room for radiocarbon dating" in its current form for publication in Royal Society Open Science. The comments of the Editors are included at the foot of this letter.
Please note the comments made by the Associate Editor --we believe that these are helpful and we hope that you will consider making the suggested minor changes when you receive your manuscript proofs.
You can expect to receive a proof of your article in the near future. Please contact the editorial office (openscience_proofs@royalsociety.org) and the production office (openscience@royalsociety.org) to let us know if you are likely to be away from e-mail contact --if you are going to be away, please nominate a co-author (if available) to manage the proofing process, and ensure they are copied into your email to the journal.
Due to rapid publication and an extremely tight schedule, if comments are not received, your paper may experience a delay in publication.
Please see the Royal Society Publishing guidance on how you may share your accepted author manuscript at https://royalsociety.org/journals/ethics-policies/media-embargo/.

The author has done a very thorough job of addressing the reviewers' comments and suggestions and I believe that this article will represent a very useful contribution to the discussion around radiocarbon dating practices. I am suggesting a couple of phrasing changes for clarity:
--Abstract: "I argue that only molecular-level dating..." As this paper is not a chemistry research paper, this first-person assertion seems out of place; this statement also seems to directly contradict the first sentence of section 4.2 (and the glue example later in that section). I would recommend rephrasing to convey the idea that molecular techniques are currently the most effective at removing such contaminants --this can be stated as fact based on the various sources cited in the paper, rather than as something the author is *asserting*. I would also recommend splitting off the following recommendations as a separate sentence.
--Final paragraph: "No modeling approach..." I would rephrase this to be more specific and perhaps less offensive; obviously no scientist wants to "subjugate the use of high-quality data" in their research --the key is understanding what impact a lack of precision or potential contamination could have on individual research questions, and in which circumstances contamination is most likely.

Manuscript RSOS-201351 Salvador Herrando-Pérez
Overall, the Associate Editor and the two reviewers highlight the value of my study as follows: Associate Editor: Both reviewers agree that this paper represents a valuable contribution in giving non-specialists tools to be more informed consumers of radiocarbon dates, and in advocating for greater transparency in publications about the metadata of dated specimens.
Reviewer 1: It will be really refreshing for radiocarbon scientists to get valuable input from someone knowledgeable, but slightly outside the field. It is also great to have many of the concerns radiocarbon scientists have shouted about for years (often to people submitting radiocarbon samples, and then publishing dates) forcibly described.

Reviewer 2: The paper is very good as it is and I think is tackling a very important issue for the scientific community that does research around the radiocarbon dating of bone.
The Associate Editor and the two reviewers provide a number of useful areas of improvement. I provide below a point-by-point discussion of their comments and how I have used those comments to modify and improve my manuscript. I use italics to quote their comments, and underline those parts of the text subjected to change, expansion or removal, respectively. Comments by Reviewer 1 are listed from 1.1 to 1.27, and those by Reviewer 2 from 2.1 to 2.3. All cited references are listed at the end of this document.

REVIEWER 1
Overall comments
Comment 1.1: I worry that it will 1. scare potential users of radiocarbon labs into never attempting to radiocarbon date bone (except at Oxford…?!) and 2. appears to claim scientific credibility over which pretreatment methods are best, but is really based on the opinion of people who submit radiocarbon samples (and acknowledge they don't know about pretreatment) and radiocarbon lab staff. I think a lot of this can be addressed by the way in which the paper is written. I've also made some suggestions so as not to alienate the C14 community.

Response:
The overarching goal of my manuscript is to promote improvements rather than to discredit the field of radiocarbon (14C) dating, so I have made every effort in the revision to follow the reviewer's specific recommendations and tone down the narrative without weakening the message. To that end, I now emphasize the critical (current and future) role of 14C dating in scientific research (readers, particularly 14C users, must be reassured of the power of the tool) in the introductory paragraph of the final section 5.
"14C dating has meritoriously established itself as one of the most powerful tools for dating cultural and palaeontological deposits from the late Quaternary [1,2]. The method is conceptually simple and well understood (see Introduction). Along with its prominence in the Quaternary sciences, its importance in modern research has been, and will be even more, heightened by the growing application of palaeoarchives and fossil materials to understand ongoing global ecosystem shifts and anthropogenic impacts on biodiversity and the environment [3,4]."
Further, in the last paragraph of the manuscript, I have stated that researchers are not expected to be 14C experts, though that does not absolve them of the responsibility for using high-quality data.
"Surely if an author does not report a piece of information, it must be because it is deemed to be unimportant. One respondent in the survey noted that "I am basically a consumer [of 14 C data], but I learn that I need to be more involved [in how the data are generated]" (respondent #126: table S2). And when in my work I have requested unpublished pretreatment details of published 14 C dates a typical type of response has been "Your request can only be answered by the radiocarbon lab! I am palaeontologist and morphologist" (confidential pers. comm., 15/08/2019) or "I have not the faintest idea what you are asking. I am an archaeologist and I use dating to contextualize archaeological levels and at most generate population models" (confidential pers. comm., 07/11/2020). These attitudes align with the >50% of the surveyed experts who ask 14 C laboratories to choose bone pretreatment for them. This is not an inappropriate approach per se as the personnel of AMS 14 C facilities should be the true chemistry, geochronology and physics experts. The problem is when authors fail to acknowledge the importance of 14 C protocols relative to the importance of the research questions they attempt to answer. No modelling approach (no matter how sophisticated it is) and no research hypothesis (no matter how global, trendy or scientifically novel it is) should subjugate the use of high-quality data, even if less but more reliable data should decrease the power of a statistical analysis and the scope of the emerging inferences. I contend that scientists using 14 C data should be conceptually more involved in the chemical processes of data generation -without such involvement, bone pretreatment might yet remain for many years an elephant in the room of 14 C dating." 
I have also stressed the importance of sample provenance (subsection 4.2):
"One can expect that all 14C protocols of collagen purification dealt with in this study should be reliable under minimal to near-zero humate contamination for bones free of conservation substances. This best-case scenario will apply to samples from subarctic to arctic regions or from habitats (e.g., caves) having limited soil growth and associated humate production. On those grounds, ultrafiltration might be a valid 14C bone pretreatment for the enormous amount of past and ongoing palaeochronological research undertaken in Beringia, Canada, Northern Eurasia and Patagonia but their reliability remains to be compared to lower-latitude sites. If bone destruction is to be minimized and contamination is expected, XAD-2 purification might arguably be the best compromise because it purifies all collagen amino acids, rather than only hydroxyproline or the ultrafiltered (high-molecular-weight) fraction."
I have highlighted in the Introduction the recent publication of the new series of calibration curves that have expanded 14C calibration by 5,000 years relative to the 2013 series.
"While modern particle accelerators can technically determine ages to ten 14C half-lives [5], or approximately 55-57,000 years, the practical dating limit is eight half-lives (~ 48,000 years) due to sample type (inorganic versus organic carbon), pretreatment chemistry, and efficiency to remove contaminants [6]. The development of calibration curves (IntCal20, SHCal20, Marine20) allows the calibration of 14C dates up to 55,000 calendar years Before Present (BP, where present is 1950 AD [Anno Domini]) [7]."
Finally, I have rephrased the opening paragraph of subsection 4.4, in which I summarize where the emphasis could be put for future research to improve bone 14C chemistry.
"Molecular-level dating seems the way to go to advance the accuracy of bone 14C dating. The rationale is obvious in that, rather than using the gelatine from a bone sample, or a purified version of it, the safest way of avoiding carbonaceous contamination is to date the molecular bricks forming the chemical architecture of collagen. Only molecular-level dating appears to accommodate all known bone contaminants and can guarantee complete removal of humic and fulvic acids, conservation substances and any other contaminant of bone collagen. How the amino acids are separated from the contaminants following collagen hydrolysis, and how to maximize the datable mass of amino acids given a fossil's initial mass and degree of collagen preservation are the steps requiring research innovation. If dating of collagen amino acids is to galvanize a future revolution in the chronological study of skeletal remains from the Quaternary fossil record, chemistry procedures need to be developed that are contamination-free, affordable by the majority of 14C users across scientific disciplines, and able to handle low-mass fractions of amino acids and valuable specimens."
To scrutinize and, where required, tone down or improve the flow, content or narrative of my manuscript, the final draft has been proof-read by (in this order) Professor Gregory McDonald (palaeontologist with museum and government experience and advocate for high-quality 14C pretreatment protocols, www.researchgate.net/profile/H_McDonald), Professor Paula Reimer (Chrono AMS Facility Director leading 14C-calibration developments, https://pure.qub.ac.uk/en/persons/paula-reimer), Professor Thomas Stafford (world-class geochronologist, www.stafford-research.com) and Dr Kieren Mitchell (evolutionary biologist using and generating 14C dates in palaeogenomic research, https://researchers.adelaide.edu.au/profile/kieren.mitchell).

Comment 1.2:
As a starting point, I think the author needs to show how often, and when radiocarbon dates on bone actually are problematic --and do this very early on in the paper. Note: this is not the same as taking the statistic from the survey to show that people submitting bones think they are contaminated. First the author needs to establish whether he is just talking about Pleistocene bones or all bones (as implied in the current text). The author mentions the problem of young contamination in very old samples, but 1% modern contamination in a sample that is 2000 years old will only shift the age by 23 years. All of the references I checked are about Pleistocene samples where samples are very sensitive to young contaminants.
Response: As scientists, we should strive for reducing the uncertainty of our measurements, and avoid being acquiescent with those uncertainties because they might be low in some circumstances and not in others (see my response to Comment 1.27).
We lack a comprehensive quantification of how often and for what types of samples and geographic/geological contexts contamination with exogenous carbon biases 14 C dating. I would rather not speculate about the magnitude of the problem in modern versus Holocene-age versus Pleistocene-age samples and I respectfully argue that such quantification is beyond the scope of my study. The reality is that this issue has lingered since the conception of 14 C dating by Willard Libby. Some AMS facilities argue that purification beyond gelatinization is not required in most scenarios (see my response to Comment 1.23), and many 14 C consumers have fully ignored, and are fully ignoring, pretreatment caveats (my survey shows evidence for this problem, and my database work in progress quantifies it conclusively: see my response to Comment 1.5). What we know (see manuscript paragraph quoted below and my response to Comment 1.25) is that virtually every time that authors have re-dated batches of 'real samples' using different pretreatments, and even the same pretreatment but applied by different labs, age discrepancies emerge (I use the term 'real sample' here relative to the reference materials used in routine interlaboratory comparisons -see also my response to Comment 1.22).
My manuscript proposes several paths of action to assess/abate sample contamination for 14C dating (see my response to Comment 1.26). I hope that my study prompts comprehensive pretreatment comparisons since the data are already available. For instance, I am aware through my ongoing database work (see my response to Comment 1.5) that ORAU has dated heaps of samples using gelatinization (PCode = AG) versus ultrafiltration (PCode = AF or AF*) on the same bone material, and that some authors have similar (unpublished) data comparing XAD-2 purification versus ultrafiltration or gelatinization.
"A major limitation faced by the growing community of scientists using 14C data is that laboratory protocols vary among AMS 14C facilities, even for the same bone pretreatment [8]. Such a procedural variance can make 14C dates of skeletal materials non-comparable from one laboratory to another and from one research paper to another. Lack of comparability could question the validity of the increasing number of studies collating 14C dates from multiple sources (see subsection 4.4) to deal with hotly debated topics such as the causes of extinction of late-Quaternary megafauna [9,10] or the timing of the global dispersal of anatomically modern humans [11,12]. We might have highly sophisticated analytical and modelling tools to unravel the mechanisms behind those extraordinary demographic phenomena, but they will be useless if we are unable to time exactly when those individuals, populations and species (dis)appeared. This rationale has been put forward by archaeologists whereby the prowess of Bayesian chronological models [13] can be truncated by the low quality of 14C data, sample pretreatment and/or reporting etiquette [12,14]."

Comment 1.3: The assumption that radiocarbon dates on bone are always problematic is a little outdated in my opinion (though I have no survey to prove this). This is very important, as I think someone reading the paper with little knowledge of the field would assume that the author is implying that all, or most, radiocarbon dates on bone are probably wrong. The author nicely described the two cases where there are still currently problems --humic contamination (severely contaminated, e.g. black bone in peat) and conservation treatments (could add embalming as well)-- which are actually quite rare in my experience (although I admit the conserved bones are often valuable specimens and often very important to get an accurate age on). These samples are very easy to identify prior to pretreatment.
Even though many people think the samples they submit are contaminated (line 202) -this is people 'thinking' they are contaminated. What does this really mean -do they have evidence they are contaminated and how severely are they contaminated? Of course, we would all think that buried bones may be contaminated but this does not mean that they fall into the two categories above, or that contaminants cannot be removed with the standard ultrafiltration pretreatment.

Response:
The primary literature attests that the problems associated with dating bone described in my study are not outdated. The latest two international interlaboratory comparisons conclude so using reference materials [15], and will be further assessing this matter in their next comparison [16]. Additionally, I now flag in the Introduction that the last decade of methodological improvements to deal with collagen-contamination issues for 14C
"To address those issues, gelatine isolated by the chemical method adapted for 14C dating by Robert Longin in 1971 [17] (denaturing collagen in slightly acidic, hot water [18]) has become the primary bone pretreatment method, and is the minimum if not final pretreatment used by the vast majority of AMS 14C laboratories (figure 2). However, many authors acknowledge that gelatinization alone fails to remove mild to severe carbon contamination from Pleistocene-age bone [19-25]. Consequently, gelatinization is combined with any of three additional steps: ultrafiltration [26], XAD-2 purification [27] or isolation of individual amino acids (molecular-level dating) [28,29]. These additional steps are also part of the menu of services offered by some AMS facilities (figure 2), although they add time and cost to the sample preparation. Concisely, ultrafiltration assumes that molecules larger than 30,000 Daltons (30 kDa) --approximately one third the mass of the non-cross-linked chains of the heterotrimer collagen type I α1 (2 per molecule) and α2 (1 per molecule) in bone [30] (300 kDa)-- are from bone collagen, while smaller molecules (<30 kDa) are presumed to include non-collagenous contaminants unsuitable for dating [26]. XAD-2 purification uses a nonpolar, hydrophobic resin through which hydrolyzed gelatine or hydrolyzed collagen solution is passed, and the eluate is collected and dated. Contaminants, predominately humic compounds, remain on the resin and are either discarded or afterwards eluted to determine the 'Fraction modern' of the contaminant [44]. Lastly, molecular-level dating uses mostly the imino acid hydroxyproline [31] or, less frequently, amino acids [e.g., glycine, alanine, aspartic acid; 28, 32] for their direct AMS 14C dating. The 18 amino acids comprising collagen range from 75 to 181 Da and are isolated from gelatine hydrolysates by using high performance liquid chromatography (HPLC) [33,34]. The focus on 131 Da hydroxyproline occurs because it is virtually unique to collagen and constitutes 9 molar % of total amino acid content [31,32]. In subsection 4.2, I address 14C research over the last decades to refine methods dealing with contamination issues."
I have included 14C dating of embalmed mummies as another case scenario requiring specific pretreatments, and added the observation that some conservation substances can also be used in the field and need to be factored in when pretreating bone for 14C dating.
"The human and animal late-Quaternary fossil record is inherently rare and mostly consists of one or, rarely, a few bones per individual. Consequently, fossil specimens from museum collections pose unique (despite destructive) opportunities for 14C dating and ancient DNA sequencing [35]. Bone curation in those collections, including embalmed ancient mummies [36,37], entails the application of a range of conservation substances (adhesives, coatings, consolidants) that stabilize skeletal materials and prevent microbiological decomposition [38]. Because those substances contain carbon, it is critical prior to AMS 14C dating that museum materials be treated with routine acid-alkali-acid rinses in combination with organic solvents specific to every conservation substance [39,40] (figure 2). The apparent 14C ages of the most commonly used solvents group into two classes: ones with modern 14C content (post-1950 AD) and those with age ranges of 15,000 to >40,000 years [41]. Depending on the age of the fossil, solvents can therefore cause bones to be dated older or younger than their actual age. Tests on synthetic, porous material indicate that solvents might achieve complete removal of some but not all types of conservation substances [42] due to a suite of complex interactions between solvents, conservation substances and the study material (e.g., crosslinks, oxidation, aging degradation). Consequently, where contamination is suspected or confirmed from these sources, reliable AMS 14C dating of museum bones could be guaranteed by dating individual amino acids and/or by selecting the regions of the bone least impregnated by conservation substances [40]. This will of course fail if animal hide or bone-collagen glues were used as a preservative [43] because it would be impossible to distinguish a fossil's collagen amino acids from those in a collagen glue. In addition, other substances can also be applied in the field to consolidate or preserve bone and these materials might not have been recorded. These situations emphasize the importance of maintaining museum records that detail all treatments a fossil receives, from sampling to storage."

Comment 1.4: I think the search for the 'best' method is not advisable. The paper does touch on this aspect, but I think more needs to be made of it. There are multiple degrees and types of contamination, so why should there only be one pretreatment? This is especially important when that 'best' method is much more destructive and so irresponsible and unethical (or impossible) to use in many (or I would argue all) circumstances? I take the view that it is better to start with a 'good' pretreatment method that we know works in the majority of cases (ie ultrafiltration or XAD-resin) and escalate to a single amino acid approach only where needed (and from what I've read of the papers coming out of the Oxford lab, this is very rare).
Response: Please see my responses to Comments 1.8, 1.23 and 1.24.
I have fully reworded the Abstract to accommodate the content of the revised manuscript.
"Radiocarbon (14C) analysis of skeletal remains by accelerator mass spectrometry is an essential tool in multiple branches of science. However, bone 14C dating results can be inconsistent and not comparable due to disparate laboratory pretreatment protocols that remove contamination. And pretreatments are rarely discussed or reported by end-users, making it an 'elephant in the room' for Quaternary scientists. Through a questionnaire survey, I quantified consensus on the reliability of collagen pretreatments for 14C dating across 132 experts (25 countries). I discovered that while >95% of the audience was wary of contamination and avoid gelatinization alone (minimum pretreatment used by most 14C facilities), 52% asked laboratories to choose the pretreatment method for them, and 58% could not rank the reliability of at least one pretreatment. Ultrafiltration was highly popular, and purification by XAD resins seemed restricted to American researchers. Isolating and dating the amino acid hydroxyproline was perceived as the most reliable pretreatment, but is expensive, time consuming and not widely available. I argue that only molecular-level dating accommodates all known bone contaminants and guarantees complete removal of humic and fulvic acids and conservation substances, with three key areas of progress: (1) Innovation and more funded research is required to develop affordable analytical chemistry that can handle low-mass samples of collagen amino acids. (2) A certification agency overseeing dating-quality control is needed to enhance methodological reproducibility and dating accuracy among laboratories. And (3) more cross-disciplinary work with better 14C reporting etiquette will promote the integration of 14C dating across disciplines. Those developments could conclude long-standing debates based on low-accuracy data used to build chronologies for animal domestications, human/megafauna extirpations and migrations, palaeoecology and palaeoclimate models."

Comment 1.5: [Abstract] Perhaps you could argue that journals need to take the lead on reporting etiquette? People have been shouting about this for years (Millard 2014 Radiocarbon for example), but few seem to listen. Unless a radiocarbon chemist is included on a paper, the necessary details are still rarely published, as the author rightly states. There are simply not enough radiocarbon chemists to help write every paper including radiocarbon dates!
Response: I am working on a database manuscript (14C megafauna ages based on ultrafiltration, XAD-2 purification and hydroxyproline) where I discuss, at length, prevailing caveats in 14C reporting etiquette in the primary literature. I expect to submit this manuscript to a mainstream multidisciplinary journal by the end of the year. Along those lines, I partly disagree with the reviewer's comment in that describing 14C protocols should not be a chemist's duty. To explain why, I will use an analogy I present in my database manuscript in progress. Much palaeo-research is quantitative in nature and requires the use of complex statistical tools often beyond the mathematical expertise of publishing authors. However, even if we are not mathematicians, we must describe with some degree of detail how we analyze our data, and mainstream journals would not accept failure to do so in the peer-review process. In my view, the same can be claimed for current 14C reporting practices, i.e., many of us doing (palaeo)ecology (or, for that matter, archaeology or palaeontology) are not chemists but we must understand the basics of chemical protocols to support that the 14C data we publish, analyse and use to test hypotheses, and all the inferences we make, are reliable. The poor reporting habits, in my view, reflect that scientists are not conceptually involved in how 14C data are generated from sample collection to dating, and many seem to blindly rely on what 14C labs do, which is at odds with how science operates. See my response to Comment 1.1.

Comment 1.6: Lines 55-57 -no reference is really needed, the maths is straightforward (isotope mixing equation, and then F14C to BP). Also, consider the effect in Holocene aged bones. 1% modern C contamination in a sample that is 2000 years old will only shift the age by 23 years. This is well within the quoted uncertainties (often +-25 14C at 1 sigma). Whilst it may cause a problem in the most high precision models if every sample is systematically affected, it is unlikely to be noticeable in the vast majority of applications (and I would envisage that other factors are likely to be much more of an issue).
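The arithmetic referred to in this comment (a two-component isotope mixing equation, followed by conversion of F14C to a conventional age) can be sketched in a few lines of Python. This is an illustrative calculation only and is not taken from the manuscript or the review: it assumes conventional 14C ages based on the Libby mean-life of 8033 years, a simple linear mixture of sample and contaminant carbon, and a contaminant with F14C = 1 ('modern'). All function and variable names are hypothetical.

```python
import math

# Conventional 14C ages are computed with the Libby mean-life (years).
LIBBY_MEAN_LIFE = 8033.0


def f14c_from_age(age_bp):
    """Fraction modern (F14C) corresponding to a conventional 14C age."""
    return math.exp(-age_bp / LIBBY_MEAN_LIFE)


def age_from_f14c(f14c):
    """Conventional 14C age (years BP) corresponding to a fraction modern."""
    return -LIBBY_MEAN_LIFE * math.log(f14c)


def apparent_age(true_age_bp, contaminant_fraction, contaminant_f14c=1.0):
    """Age measured on a sample in which a given carbon fraction is contaminant.

    Assumption: simple two-component mixing of F14C values, weighted by
    the carbon fraction contributed by each component.
    """
    mixed = ((1.0 - contaminant_fraction) * f14c_from_age(true_age_bp)
             + contaminant_fraction * contaminant_f14c)
    return age_from_f14c(mixed)


# Holocene example from the comment: 2000-year-old bone, 1% modern carbon.
shift_holocene = 2000.0 - apparent_age(2000.0, 0.01)
print(f"Age shift for a 2000-year-old sample: {shift_holocene:.1f} years")
```

Under these assumptions the 2000-year-old sample appears roughly 23 years too young, in line with the figure quoted in the comment. Applying the same function to Pleistocene-age samples gives shifts of thousands of years, although the exact value depends on the assumptions listed above.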
Response: The estimated number is given by [44] so I respectfully argue that I must acknowledge the source of my statement.
"Suffice it to say that a 55,000 year old sample contaminated with only 1% modern 14 C will result in a 40,000 year 14 C measurement [44]. If we were characterizing the environment experienced by the animal or human individual being dated, this 15,000 year error would place the fossil in any of three different transitions from cold (stadial) to warm (interstadial) paleoclimates over the Last Glacial Period [45]."

Comment 1.7: Line 76 -'contamination from human and megafauna bone' -just these types of bone? I think you really mean 'Pleistocene age bone'.
Response: I have now incorporated the change suggested as follows (see also my response to Comment 1.27).
"However, many authors acknowledge that gelatinization alone fails to remove mild to severe carbon contamination from Pleistocene-age bone [19-25]. Consequently, gelatinization is combined with any of three additional steps: ultrafiltration [26], XAD-2 purification [27] or isolation of individual amino acids (molecular-level dating) [28,29]."

Comment 1.8: Line 97 -There is not just one pretreatment, in the same way as there is not just one type of contamination. So, why should we be looking for 'the one'?
Response: I partly disagree. If molecular-level dating (i.e., dating the amino acids forming the collagen of the target bone) was the cheapest and fastest method, it is hard to envisage that 14 C users would not be routinely using it to date skeletal materials. The comment made by the reviewer aligns well with the main question I attempt to address in my manuscript: "However, researchers in many disciplines currently ignore whether there is consensus in the research community about which pretreatment protocols provide the most accurate 14 C ages of fossil bones."

Comment 1.9: Line 219 -I think you need to remind the reader here that the question was about when 'a bone sample was *severely* contaminated with exogenous carbon' (my own emphasis). To most people, I would imagine that this meant soaked in consolidant (it certainly means that to me rereading the question). These are not the normal bones that you would find among the normally dated, recently excavated/museum-curated bagged faunal bone (which is often selected because it is worked in some way or of a specific species, but has been kept away from the museum conservators working on the beautiful specimens…). If true, it would imply that people were not responding to what they would do for a routine pretreatment.
Response: I have now included the qualifier "severe" when I report the results the reviewer is referring to.
"When respondents were asked to pick one single bone pretreatment for its reliability to remove severe carbonaceous contamination prior to AMS 14 C dating (assuming no limitations of funding or sample size), hydroxyproline isolation from collagen gelatine was the preferred option (39% of the audience) followed by ultrafiltration (23%) and finally XAD-2 purification (9%) (figure 5a)." It is important to note that the qualifier is clearly stated in the Methods and Results sections.
[Methods] "In Section 2 -'Pretreatment' (four questions), researchers (2.1) ranked the reliability of four pretreatments (namely, gelatinization alone and gelatinization with further steps of ultrafiltration, XAD-2 purification or hydroxyproline isolation, see Introduction; figure 2) from 1 (low reliability) to 5 (high reliability) in order to remove contamination of exogenous carbon from a bone sample before AMS 14C dating (including an 'I don't know/I am unsure' option for each pretreatment), (2.2) chose one of the former four pretreatments should they a priori know that a bone sample was severely contaminated with exogenous carbon (including an 'I don't know/I am unsure' option), and confirmed whether they customarily (2.3) request a specific pretreatment when submitting bone samples to a 14C laboratory (including an 'I have never submitted bone, tooth, or ivory samples to an AMS 14C dating laboratory' option) and (2.4) use pretreatment information as a criterion to rank the reliability of 14C dates collated from the literature (including an 'I have never collected/used 14C dates from the literature')."
[Results] "When respondents were asked to pick one single bone pretreatment for its reliability to remove severe carbonaceous contamination prior to AMS 14C dating (assuming no limitations of funding or sample size), hydroxyproline isolation from collagen gelatine was the preferred option (39% of the audience) followed by ultrafiltration (23%) and finally XAD-2 purification (9%) (figure 5a)."

Comment 1.10: [Results: general statement] It would be interesting to see the breakdown in pretreatment selection by people who work on collagen extraction separately to those who just use the data and don't work in a lab. Are they different?
Response: I now report pretreatment-selection rates for respondents with past or current 14 C-lab experience.
"Only <5% of the respondents chose gelatinization alone, and 23% did not know or were unsure of what pretreatment to choose (figure 5a). In accord with the previous results, when researchers were asked to rank each of the four pretreatments from low (rank = 1) to high (rank = 5) reliability, hydroxyproline isolation (4.05 ± 0.12SE) and ultrafiltration (4.24 ± 0.13SE) were ranked higher than XAD-2 purification (3.39 ± 0.16SE) and, particularly, gelatinization alone (2.55 ± 0.17SE) (figure 5b). Relative to the full set of respondents, best choice, relative pretreatment rankings and main conclusions prevailed for 74 respondents with prior or ongoing experience working at an AMS 14 C facility, except that hydroxyproline isolation led mean rankings (4.16 ± 0.23SE) above ultrafiltration (3.77 ± 0.50SE), XAD-2 purification (3.52 ± 0.22SE) and gelatinization alone (2.82 ± 0.56SE)."

Comment 1.11: Line 265 - the cost of a hyp date would be much, much larger in a lab if only a few dates were generated each year - they would need to pay a post-doc/technician with substantial prior knowledge dedicated to running this method, and for the vast numbers of standards required to monitor column bleed. I'd imagine for most labs the cost (excluding the equipment, which needs to be dedicated to radiocarbon to keep contamination under control) would be too much.
Response: I definitely agree and I make this point in my manuscript.
"Ultrafiltration is two to three times cheaper per sample than hydroxyproline isolation by HPLC. Costs can partly explain why ultrafiltration is the preferred choice, after hydroxyproline isolation, in the survey."

Comment 1.12: Line 279 - it might be nice to acknowledge that ORAU did not in fact propose the ultrafiltration method or use it first - this was Brown 1988. They have a very loud spokesperson who has driven some large high-profile projects using the method. I think the list of chosen publications is in part due to publicity (this is not in any way meant to be a negative comment about Oxford, it just reflects good communication).
Response: I now acknowledge the North American development of ultrafiltration later in the manuscript (see my response to Comment 1.23), while I also refer to ORAU's communication strategy (which, I agree, is hard to overlook) and to the merit of the scientist (not the lab) who first published hydroxyproline dates, as follows.
"In that respect, many respondents cite papers produced by ORAU to back their choice of pretreatment (table S1), which might reflect ORAU's efficient communication strategy and/or that 14 C chemist Richard Gillespie at this laboratory was the first to publish hydroxyproline 14 C dates [32] and that ORAU was the first 14 C laboratory to adopt ultrafiltration as default bone pretreatment [46,47]."

Comment 1.13: Lines 288-289 - this would only matter if the methods did not give accurate/equivalent results, surely. The problem is the few studies which compare the methods. I personally think that different methods might be suited to different locations. There would be little point in a lab working on bone from cold regions focusing on methods to clean low-yielding bones. And there would be no point in a lab focusing on an area with limited museum curation using methods to remove consolidants.

Response:
We lack consistent guidelines about what set of properties makes a (bone) sample suitable for a particular pretreatment (see my response to Comment 1.23). Instead, 14 C-lab personnel resort to their own criteria/experience/expertise when samples are submitted for dating. On the other hand, Tom Stafford once stated that "…the reason that ultrafiltration has temporarily won the day is it is an easy extraction. It yields a white solid easily combusted. XAD on the other hand yields a viscous syrup that is somewhat difficult to transfer into quartz combustion tubes" (pers. comm., 29/08/2019). From my readings of the primary literature, the current situation whereby XAD-2 purification is excluded from the menu of services offered by European AMS facilities has no scientific basis and fails to follow objective principles of sample suitability (see my response to Comment 1.23). I have added a statement arguing that the choice of XAD-2 purification might be partly determined by affiliation.
"So, a scientist in the US (27% of the target audience) is more likely to know the qualities of, and consequently select, XAD-2 purification of bone for AMS 14 C dating than a scientist from other parts of the world. It goes without saying that a scientist's geographical affiliation should be uncorrelated with the reliability of a given bone pretreatment. Having said that, sample-shipping costs, including onerous customs regulations (e.g., sending biological samples from Europe or America to Australia), might play a role in researchers choosing pretreatment protocols from AMS facilities located closest to their working place."

Comment 1.14: Line 299 -301 -this is nothing to do with added contamination, only that we don't really know what the ultrafilters remove. It should go right up in the intro where you first discuss what the ultrafilters do.
Response: I have added the reviewer's point as follows.
"Equipment-related contamination has been attenuated at ORAU with the sonication and rinsing with ultrapure water of the ultrafilters [46], though the >30 kDa fraction still retains <30 kDa material along with non-collagenous proteins and non-proteinaceous organic compounds [48], indicating that the chemical composition of the ultrafiltered fraction is not yet properly understood."
Please note that the 'Introduction' already articulates that the idea that the ultrafiltered fraction only includes collagen from the target bone is, as far as we know, no more than an assumption.
"Concisely, ultrafiltration assumes that molecules larger than 30,000 Daltons (30 kDa) - approximately 1/3rd the mass of the non-cross-linked chains of the heterotrimer collagen type I α1 (2 per molecule) and α2 (1 per molecule) in bone [30] (300 kDa) - are from bone collagen, while smaller molecules (<30 kDa) are presumed to include non-collagenous contaminants unsuitable for dating [26]."

Comment 1.15: Line 304 - the age of the humectant actually changes between batches. It was old, but is now mainly modern in age (i.e. much more problematic for Pleistocene aged bones), or occasionally late Holocene. I can't remember when the change in age happened, but it was prior to 2010. See Wood et al. 2012 Radiocarbon 52(2). There is another paper, I think by Brock in Radiocarbon, on a batch with a late Holocene age, but I can't find it.
Response: This is definitely an important point which I have now incorporated as follows (including the two references suggested by the Reviewer). The new text has been revised by ORAU's Director Tom Higham.
"The ratio of contaminant versus collagen concentrations increased for ORAU's samples where the collagen yield was low, resulting in offsets of 100 to 300 years for bones younger than two 14 C half-lives (~12,000 years BP) [21] because the apparent age of the humectant (glycerol) was >35,000 years BP. In fact, later work has shown that the humectant's age has changed between batches of samples from fossil (~12,000 to >35,000 years BP) in 2006 to modern (post-1950 AD [Anno Domini]) by 2011 [49,50]. The age of the humectant in ultrafilters used at ORAU continues to be post-1950 AD and the cleaning regime removes any trace of humectant below a few micrograms (Thomas F. G. Higham, pers. comm., 28/10/2020)."

Comment 1.16: Line 307 - acid washing the membrane would damage it - so the increase in collagen yield is due to the membrane not doing the same thing as before (and note an earlier comment that I think we are not exactly sure what ultrafiltration does to improve the age…).
Response: I contacted John Southon to address this comment as his lab did the experiment and published the paper [51]. This paper currently has zero citations in Scopus and, to my knowledge, the experiment has not been replicated. Southon's response to my email seems to challenge the Reviewer's comment, thus I would rather not modify my one-sentence statement on the topic.
"It is always good practice to (pre)screen samples for collagen preservation [53]. Methods and metrics include percentage yield of collagen after each pretreatment step, atomic C:N (Carbon:Nitrogen) ratios, stable C and N isotope values [54], whole-bone %N [52] and, less frequently, relatively expensive but extremely quantitative HPLC [55] and near-infrared spectroscopic methods [56]

Response: Agreed. I have added a note of caution at the end of the paragraph. Prior to it, I now also describe the basis of the ninhydrin pretreatment.
"One possible route is using N-phenacylthiazolium bromide to cleave glucose-derived protein cross-links [57]. Using this reagent has allowed researchers to improve the amplification of ancient DNA from megafauna dung [58,59], but remains to be applied to bone samples for AMS 14 C dating. Another possible route involves first derivatizing the amino acids in a gelatine hydrolysate with a reagent that does not react with the imino acids (proline and hydroxyproline) - such as o-phthaldialdehyde as employed for amino-acid racemisation dating [60] - combined with SPE cartridges [61]. Those SPE cartridges have been successfully employed to extract collagen from bone [62] and should be simpler and cheaper than HPLC methods. Lastly, the specific chemical reaction of ninhydrin with the alpha-carboxyl group of free amino acids (hence not interacting with humates [63]) produces CO 2 that has been used for isotopic fractionation [64] and bone 14 C dating [65][66][67]. This 14 C pretreatment also uses collagen hydrolysed to amino acids in 1 M HCl and is simpler and cheaper than HPLC, but has been criticized for requiring abundant glassware and a minimum bone mass of ~1 gram per sample and remains open for improvement [68]. Any new developments should of course gauge the extent to which novel reagents might add carbonaceous contaminants."

Response: This is a valid point that I have now included and expanded in the manuscript.
"For bones that are severely degraded during burial, or consist of one or a few small fragments (the majority of the late-Quaternary fossil record!), and/or belong to small body-sized taxa such as shrews or mice, and/or have cultural value (e.g., ancient humans or unique animal specimens), molecular-level dating might remain unfeasible unless AMS 14 C dating incorporates methods of protein enrichment that could increase the yield of collagen amino acids while aiding in the removal of potential contaminants [69,70]."

Comment 1.22: 420 - It might be an idea to look up some of the work that Heritage England (Alex Bayliss) has done on comparisons. They replicate many bone samples between labs to assess accuracy and comparability. Only a few labs are involved, but there are many samples involved.
Response: This is a useful suggestion that I have included in the Discussion as follows.
"The main attempt to evaluate dating consistency in the 14 C field has been the International Radiocarbon Intercomparison led by the University of Glasgow (UK) and endorsed by the journal Radiocarbon [71]. This scheme aims to identify reference materials that can be dated and compared over time as 14 C techniques evolve. In each of the six assessments undertaken to date [15,[72][73][74][75][76], a range of 14 C laboratories has been invited to voluntarily participate and date the same set of samples, then the Glasgow team has quantified dating consensus across laboratories. The major limitation of this initiative is that these reference materials contain no contaminants, or do not contain the levels and types of contamination found in fossil bones. Bone has been included only in the last two assessments (and will be part of the next one [16]), not surprisingly concluding that there is a need for "… an investigation of pretreatment effects, especially for the bone samples" [15]. English Heritage has accumulated >400 bone samples with replicate 14 C measurements from >1 laboratory, showing age inconsistencies at p = 0.05 (probability of the data given a null hypothesis that several measurements are equal) by a mean of (i) 234 years for 11% of samples subjected to gelatinization, (ii) 30 years for 16% of samples subjected to ultrafiltration, and (iii) 19 years for 23% of samples subjected to ultrafiltration versus gelatinization [77]. However, these authors [77] note that "… this dataset consists of measurements on generally well-preserved bone from a temperate climate, which is predominantly less than one half-life in age. This reproducibility may not be obtained on older or poorly preserved material."
I think we need to expand those comparisons across more 14 C labs and do it systematically for hundreds of samples across a gradient from low to high (ideally known, e.g., humic) contamination (see response to Comment 1.25) and over a range of time windows in Holocene and Pleistocene times (see also my responses to Comment 1.1 and 1.2).
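To make concrete what a replicate comparison of this kind involves, here is a minimal Python sketch of a chi-square consistency test in the style of Ward and Wilson (1978), which is commonly used to ask whether several 14 C measurements of one sample agree within their quoted errors. Whether the dataset in [77] was analysed with exactly this test is an assumption on my part, and the function names, the small table of critical values and all ages below are hypothetical illustrations, not data from the manuscript.

```python
# Sketch of a Ward-and-Wilson-style chi-square consistency test for replicate
# 14C ages of one sample. All numbers used here are hypothetical illustrations.

# Upper 5% critical values of the chi-square distribution, by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def pooled_age(ages, errors):
    """Return the inverse-variance weighted mean age and its 1-sigma error."""
    weights = [1.0 / (e * e) for e in errors]
    total = sum(weights)
    mean = sum(w * a for w, a in zip(weights, ages)) / total
    return mean, (1.0 / total) ** 0.5


def consistency_test(ages, errors):
    """Return (T, df, consistent) for replicate ages with 1-sigma errors.

    T is the sum of squared, error-scaled deviations from the pooled mean.
    The replicates are called consistent at p = 0.05 when T does not exceed
    the chi-square critical value for df = n - 1 degrees of freedom.
    """
    if len(ages) != len(errors) or len(ages) < 2:
        raise ValueError("need at least two ages, each with an error")
    df = len(ages) - 1
    if df not in CHI2_CRIT_05:
        raise ValueError("critical value not tabulated for this many replicates")
    mean, _ = pooled_age(ages, errors)
    t_stat = sum(((a - mean) / e) ** 2 for a, e in zip(ages, errors))
    return t_stat, df, t_stat <= CHI2_CRIT_05[df]


if __name__ == "__main__":
    # Two hypothetical laboratories dating the same bone (14C years BP).
    print(consistency_test([4500, 4530], [30, 30]))  # small offset
    print(consistency_test([4500, 4700], [30, 30]))  # large offset
```

The table of critical values is hard-coded only to keep the sketch free of dependencies; a statistics library would normally supply them for any number of replicates.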
[Abstract] "I argue that only molecular-level dating accommodates all known bone contaminants and guarantees complete removal of humic and fulvic acids and conservation substances, with three key areas of progress: (1) Innovation and more funded research is required to develop affordable analytical chemistry that can handle low-mass samples of collagen amino acids.
(2) A certification agency overseeing dating-quality control is needed to enhance methodological reproducibility and dating accuracy among laboratories. And (3) more cross-disciplinary work with better 14 C reporting etiquette will promote the integration of 14

Response: I believe my study will stimulate debate and discussion about how innovation can lead to universality in pretreatments for 14 C dating, but I also agree that this point needs to be expanded in the manuscript. As a 14 C-data user, I find it disconcerting that some 14 C labs are vehemently stating that collagen gelatinization, or simple alkali extraction with no further purification steps, is the way to go (Groningen is probably the icon for this [78], and Beta Analytic conceded in 2013: see www.radiocarbon.com/miami-bones-ultrafiltration). I don't think the 14 C-dating industry can afford such a strong disparity of criteria. I have added this rationale in the following two paragraphs (see response to Comment 1.24).
"At the heart of this conundrum lies the fact that no international agency oversees quality control, training and certification in the field of 14 C dating. Currently, should the necessary funding exist, 14 C facilities can be discretionally created with freedom to adopt specific pretreatment protocols to compete for customers in a competitive market among >150 AMS facilities currently operating globally [79]. We are indeed far from an arguably ideal scenario whereby 14 C pretreatment procedures are universal across laboratories. Countering that scenario, one respondent (aligning with many 14 C-laboratory personnel and palaeoresearchers I have communicated with) states that "… for the effort a 14 C measurement is requiring, every sample deserves the best individual pretreatment" (respondent #40: table S2). The pitfall is that with different 14 C laboratories favouring different bone pretreatments [80], what 'best' means for every sample can have multiple answers. To my knowledge, no comprehensive guidelines have been published in the primary literature defining what set of consistent properties make a given (bone) sample suitable for a given chemical protocol prior to AMS 14 C dating. This is not to say that pretreatment protocols can be expected to reach infallibility, nor that AMS facilities should not lead or partake in innovation along with their business activity. The overarching goals of 14 C innovation should be to attain methodological reproducibility and dating consistency across laboratories and high accuracy (i.e., 14 C ages capturing the true age of a fossil). However, it is unlikely that pretreatment developments led by one AMS facility are to be promptly adopted by others. It can take time for information to be disseminated at conferences or through research papers and for AMS facilities to test promising procedures rather than adopting them directly. These tests often leave no trace in the literature (Paula Reimer, pers. comm., 03/11/2020). 
For instance, collagen ultrafiltration for 14 C dating was an initiative of Simon Fraser University (Canada) published in 1988 [26]; ORAU did not adopt it until 2000, with some European sites following six or seven years later, in some cases when they first acquired an AMS (e.g., Aarhus, Belfast, Poznań, Zurich), and others never including it in their default protocols (e.g., Groningen, Kiel, Vienna). In contrast, the fact that no European AMS facility provides XAD-2 purification seems surprising by sheer criteria of dating reliability. A different model could be explored whereby: (i) Those AMS facilities interested in innovation were coordinated within several nodes of research sites (including universities), each node pushing chemical developments for specialized aspects of AMS 14 C dating (e.g., types of samples versus types of pretreatment). And (ii) an international certification agency regulated the transition from development to customer service according to available personnel's expertise and the equipment hosted by AMS facilities. In such a model, AMS facilities would have an incentive to participate in innovation, as all would directly contribute to, and benefit from, developments."

Comment 1.24: Line 437 - I think this is already accepted, and from what you have written, I think you also already accept this. On the other hand - would anyone really put too much weight on any dataset generated in the 1960s and say it is comparable to present day data without any cross checking? It seems a bit harsh to say that this is only a problem with radiocarbon dates.
Response: The focus of my paper is 14 C dating, so it does not touch on other chronological methods. I agree that 14 C dates from the 1960s and 1970s are of little use, but I am unsure about later dates, particularly those generated in the 1980s and 1990s using XAD-2 purification, as I explain in the following paragraph.
"Should authors compile sets of ages of fossil bone from multiple sources and publication years to test hypotheses and make broad inferences about ancient populations and species? Without data-quality control or ways to rank 14 C bone chemistry, the enterprise is certainly risky but keeps attracting the attention of high-profile journals. If a widespread standardization of bone pretreatment protocols came true, we might have to be ready to face the eventuality that many 14 C dates of fossil bone (as well as the inferences made from them) published in the scientific literature over the last seven decades might be inaccurate or wrong, hence hardly comparable with new 14 C dates. XAD-2 purification protocols have remained procedurally constant since their conception in the 1980s, so bone 14 C ages generated through this method should arguably share a similar degree of reliability over time. Unfortunately, there does not exist a year or time interval before and after which 14 C ages should be deemed (un)reliable [but see 77,81] partly because individual AMS 14

Comment 1.25:
Line 450 -the easiest samples are ones which fall beyond the limit of the radiocarbon method. Then, any 14C is a contaminant. The method is incredibly sensitive to them (but also to the calculation of the lab background -another massive problem for very old dates). And we do not need to know the age of the samples -only that they are >60,000 yr. This vastly increases the number of samples available as U-series/ OSL/ ESR and even tephrochronology are not without their own problems. The problem will be in finding large enough *severely and consistently contaminated* samples which can be completely destroyed when sending to so many radiocarbon labs. Perhaps this is why the process has not happened?
Response: This is an extremely useful suggestion that I have added to the manuscript as follows.
"The published evidence for the reliability of bone ages obtained through different pretreatments is sketchy, definitely not comprehensive, and would benefit from a global experiment using skeletal materials of known age from multiple geological deposits, latitudes and time periods. Although I partly concur with the view that "… the most important criterion, far more important than pretreatment, and one that is often not considered (as exemplified in this survey) is 'context' of the specimen. That is, clear and unambiguous control of association and context of the sample with respect to the cultural activities in question" (respondent #3: table S2), the reality is that the stratigraphic integrity of most archaeological and palaeontological sites cannot be confirmed with 100% confidence. So the selection of sites for the global experiment suggested above would have to be based on a careful selection of reliably dated fossil-containing deposits (e.g., volcanic tephras) and/or deposits showing high chronological agreement by several dating methods (e.g., electron spin resonance, optical techniques, [thermo]luminescence, uranium-series; reviewed by Walker [82]). The latter would require framing fossil dates into a comparable ranking of reliability across multiple dating methods, which has so far only been applied to megafauna fossils from Australia and Papua New Guinea [83-85] and awaits developments with global scope. A complementary approach would be to apply different pretreatments to a comprehensive set of samples whose age (determined by other chronological methods and/or stratigraphic evidence) conclusively exceeds the limit of 14 C dating. For such old samples the presence of 14 C would be an unequivocal signature of contamination given an appropriate selection of background samples [see 86]."

Comment 1.27: Line 477 - I think this quote will alienate much of the 14C community. I would argue that they now spend huge amounts of money on pretreatment. Think of all of the technician time processing the samples, let alone the time in pretreatment research or quality control, especially now that an AMS can be run by 1 person working part of the time. This seems a ridiculous and slightly offensive comment to me. Radiocarbon labs have long histories of trying to address contamination issues, and long histories of trying to communicate the difficulties with their users (look at any 14C review). This quote seems to suggest that labs have ignored the problem, and this is demonstrably not the case. Perhaps count the number of papers in the journal Radiocarbon on pretreatment (remembering we need to work on numerous different materials) vs. the number on AMS development.
Response: I fully understand the concern. However, I think the amount of labour required to process the daily load of samples for 14 C dating at any given laboratory should not be conflated with the research innovation needed to improve chemistry protocols. My communication with some of the top 14 C chemistry experts does indicate that the 14 C community has been quicker to incorporate AMS developments than chemical developments, and it seems incontestable that 21st-century bone pretreatment relies on procedures developed in the 1960s to 1980s. I have toned down my narrative in the paragraph in question (and the entire manuscript) and made sure that readers understand that AMS facilities and their personnel are not to blame for the paucity of chemical developments for bone 14 C dating (I never intended to mean that), as those developments are arguably only feasible through multidisciplinary efforts.
"However, progress in the physics of modern AMS 14 C dating has driven a revolutionary transition from beta counters to particle accelerators [87] (see Introduction) and from there to the prompt incorporation of the latest accelerators MICADAS [88,89] already functioning in many 14 C laboratories [e.g., 78,90]. The focus of those developments has been put on minimizing the required amount of datable mass [91][92][93]. In contrast, one of the respondents bluntly expressed that "… if AMS labs spent as much money on chemistry and biology as they do on physics, the inherent inaccuracy in most 14 C bone ages would have been eliminated years ago" (respondent #4: table S2). Indeed, the chemistry of AMS 14 C dating still rests on refined versions of procedures developed during the 1960s to 1980s (figure 2) and awaits a revolution of its own. We cannot expect this revolution to be prompted by AMS personnel, geochronologists and Quaternary scientists alone given the multidisciplinary applications of 14 C data (figure 1). Additionally, although contamination of fossil samples with modern carbon might be most problematic for Late-Pleistocene bone, scientists should not be acquiescent with contamination issues in modern and Holocene-age materials as science should always strive for reducing uncertainty. More cross-disciplinary communication and research, particularly with chemists, is a critical endeavour to better understand the factors that drive the accuracy of AMS 14 C dating and to unite efforts towards integrating chemical protocols and 14 C research with our own fields of specialization. Those efforts should go hand in hand with funding agencies supporting research projects focusing on the improvement of less expensive 14 C chemistry."

Comment 2.1: I have one comment with regards to Figure 2: starting from powder is not the standard procedure. Many bone demineralization protocols start with pieces of bone instead of powder. It would be good to place that option in the figure.
Response: I have added to Figure 2 an additional drawing representing pieces of bone along with bone powder.
"Contamination awareness shown as the number of respondents (n = 132) who suspect, with four levels of increasing confidence (never, sometimes, often, always), that bone samples are contaminated with exogenous carbon prior to radiocarbon (14 C) dating. Left and right stacked bars represent respondents who submit raw samples of bone (n = 112 after excluding 14 respondents who stated not to have submitted raw bone to a 14 C laboratory) or extract the collagen gelatine (n = 101 after excluding 25 respondents who stated not to have extracted collagen gelatine) for 14 C dating, respectively."

Comment 2.3:
In several parts of the text, the word "imino" acid appears. I'm not sure if you are actually referring to imino acids (a molecule related to amino acids) or if this is just a mistake. Please check.
Response: "Imino acid" refers to an amino acid with an imine (>C=NH) rather than an amino (-NH2) group linked to the carboxyl (-C(=O)-OH) group. In the context of animal hydroxyproline, 'imino acid' conceptualizes a post-translational event that makes hydroxyproline all the more specific to animal tissues. I have explained this in the text.
"Lastly, molecular-level dating uses mostly the imino acid hydroxyproline [31] or, less frequently, amino acids [e.g., glycine, alanine, aspartic acid; 28, 32] for their direct AMS 14 C dating. The 18 amino acids comprising collagen range from 75 to 181 Da and are isolated from gelatine hydrolysates by using high performance liquid chromatography (HPLC) [33,34]. The focus on 131 Da hydroxyproline occurs because it is virtually unique to collagen and constitutes 9 molar percent of total amino acid content [31,32]."
"Hydroxyproline originates from the post-translational modification of the other imino acid, proline, and contributes to the stabilization of the collagen triple helix in animal tissues [94] where it contributes ~13% of the total amino acid carbon [33]. Hydroxyproline occurs in plant cell walls (<1% dry weight) [95] and is enriched in soil organic matter during mineralization [96], but any plant material attached to a sample of bone will contain negligible amounts of hydroxyproline and, most importantly, be insoluble during the gelatinization step and therefore unable to contribute any hydroxyproline during AMS 14 C dating [97]."

Lastly, I have updated Figure 1 (publications by year and Scopus' subject area) to include current publication numbers and the term 'antler' and the wild card skelet* as key words (along with bone, tooth, teeth and ivory).