Longitudinal analysis of pinnipeds in the northwest Atlantic provides insights on endemic circulation of phocine distemper virus

Phocine distemper virus (PDV) is a morbillivirus that circulates within pinnipeds in the North Atlantic. PDV has caused two known unusual mortality events (UMEs) in western Europe (1988, 2002), and two UMEs in the northwest Atlantic (2006, 2018). Infrequent cross-species transmission and waning immunity are believed to contribute to periodic outbreaks with high mortality in western Europe. The viral ecology of PDV in the northwest Atlantic is less well defined and outbreaks have exhibited lower mortality than those in western Europe. This study sought to understand the molecular and ecological processes underlying PDV infection in eastern North America. We provide phylogenetic evidence that PDV was introduced into northwest Atlantic pinnipeds by a single lineage and is now endemic in local populations. Serological and viral screening of pinniped surveillance samples from 2006 onward suggest there is continued circulation of PDV outside of UMEs among multiple species with and without clinical signs. We report six full genome sequences and nine partial sequences derived from harbour and grey seals in the northwest Atlantic from 2011 through 2018, including a possible regional variant. Work presented here provides a framework towards greater understanding of how recovering populations and shifting species may impact disease transmission.


1. Can you be clearer in the Methods/Results/Legend what sequences are included in Figure 1? What is the length (in nucleotides) of the full-length genomes versus Phosphoprotein-Matrix-Fusion-Hemagglutinin gene? If there is a large difference in sequence length it would make sense to also infer a tree just for the P-M-F-H as comparison.

2. Figure 5 contains a number of over-interpretations. It is impossible to infer from Figures 1 and 3 that endemic circulation of PDV began in the Northwest Atlantic population in 1987. The 2006 virus and 2017-2018 viruses share a common ancestor in the 2000s, as far back as you can reasonably infer, not to the TMRCA of the entire tree. And that is when the seal population reached the minimum population size needed to sustain endemic transmission. The dotted arrow with the 2001 trans-Atlantic movement is therefore highly speculative and should be removed.

3. The rooting is confusing in Figure 3. For consistency could you root the tree similar to Figure 1, rooted by the oldest 1988 viruses? Otherwise it's difficult to reconcile the two trees, particularly the US 2006 virus.

Comments to the Author
This paper examines the phenomenon of phocine distemper virus outbreaks in Atlantic pinniped populations, which have caused significant mortality in some cases and received a lot of scientific and public interest over the past decades. One of the main conundrums about these outbreaks is why they appear to be limited to the eastern Atlantic, whereas outbreaks of comparable severity have not been seen in populations of the same species on the western side of the Atlantic. The study seeks to provide new answers to this question through a combination of viral genetic and seal host serology data. Based on these data, the authors argue that continuous circulation of a less pathogenic form of the virus in the northwest Atlantic might maintain high enough levels of immunity within these populations to prevent the kind of outbreaks seen in eastern populations.
I found the study interesting and well presented but I am not convinced by the authors' data and conclusions. Specifically, the argument about endemic circulation creating partial herd immunity rests on such endemic viruses being limited to western Atlantic populations but being absent in the East. Published data to document this might exist but if so they should be presented for comparison. The part about genetic differentiation of the viruses on either side of the Atlantic is interesting but doesn't appear to be novel. The last part of the study, focussed on variation in the virus' hemagglutinin gene which the authors argue is responsible for different virus phenotypes, seems very speculative and likely involves incorrect analyses (see below). I appreciate that the data for this type of work are hard to come by and that the current study is exceptional in terms of the amount of data that has been assembled. Still, the study would benefit from strengthening the arguments where possible and otherwise refraining from speculative claims, presenting the findings as hypothesis-generating rather than confirmatory. There are also several methodological issues that need to be addressed.
Specific points:
1. Based on serological and qPCR results, the authors argue that some form of PDV is circulating endemically in North American seals. For comparison, do we know that such evidence of consistent exposure is definitely not detectable in European populations? The authors cite data from the eastern Atlantic comparable to their own in the discussion but for the readers to evaluate this comparison it would be helpful to include these published data in their figures or tables. It is also not clear to me whether the available data are limited to serology or whether the same patterns hold when screening eastern populations by qPCR for circulating PDV.
2. Please indicate which sequences in Fig 1 were new and which had been previously published and analysed. This would make it easier to appreciate what the current study is able to add to previous work and what had been known before in terms of geographical clades and their divergence times. It would also be helpful to include highest posterior density intervals on the key internal nodes (i.e. split between NE and NW Atlantic).
3. Before attempting to reconstruct time-scaled phylogenies it would be important to confirm that there is actually enough temporal signal (increase in divergence over time) to estimate a molecular clock. This should be done based on the ML phylogeny using TempEst and results could go into the supplement rather than the main text.
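The temporal-signal check requested here amounts to a root-to-tip regression: genetic distance from the root is regressed against sampling date, and a positive slope with a reasonable fit indicates that divergence accumulates over time. The following is only a minimal sketch of that idea, not TempEst itself. It assumes the root is already fixed (TempEst can also search for the best-fitting root), and the function name and all dates and distances are hypothetical illustrative values, not data from the manuscript.

```python
# Root-to-tip regression: a simplified illustration of a temporal-signal check.
# Assumption: the tree is already rooted and root-to-tip distances were
# extracted from an ML phylogeny beforehand.

def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, x_intercept, r_squared). The slope approximates the
    substitution rate (substitutions/site/year); the x-intercept
    approximates the date of the root.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three (date, distance) pairs")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    x_intercept = -intercept / slope if slope != 0 else float("nan")
    return slope, x_intercept, r_squared


# Hypothetical tips: sampling year and root-to-tip distance (subs/site).
example_dates = [1990.0, 2000.0, 2005.0, 2010.0, 2015.0]
example_distances = [0.0020, 0.0075, 0.0105, 0.0130, 0.0160]

slope, root_date, r2 = root_to_tip_regression(example_dates, example_distances)
print(f"rate ~ {slope:.2e} subs/site/year, root ~ {root_date:.1f}, R^2 = {r2:.3f}")
```

A slope near zero or a negative slope would suggest too little temporal signal for reliable molecular-clock dating.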
4. Results of the model selection for molecular clock and demographic priors should be documented - the authors state that they decided on an exponential growth model but we don't know what other models were considered. Exponential growth might not be an obvious choice for a virus that appears to be circulating endemically and the choice of demographic prior can have a significant effect on estimated divergence times.
5. The selection analysis in Table S4 looks wrong: the simpler models (M1a, M7) have a much higher likelihood than their complex counterparts with two extra parameters, which shouldn't be the case (additional parameters might fail to improve the likelihood but they can't reduce it). This suggests that some of the PAML analyses didn't converge and that the results are unreliable. Given that there appears to be a lack of sites experiencing more than one non-synonymous substitution, I would be surprised if the data contained any evidence of positive selection. I don't think it is appropriate to use the BEAST tree as the topology for the PAML analysis; this should be a tree estimated without a molecular clock.
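The convergence concern about Table S4 can be checked mechanically. A nested model with extra parameters should never have a lower maximised log-likelihood than its simpler counterpart, and when it does not, the usual comparison is a likelihood-ratio test. The sketch below assumes the conventional chi-square approximation with two degrees of freedom for the M1a/M2a and M7/M8 pairs. The function name, tolerance and log-likelihood values are hypothetical and are not taken from the manuscript.

```python
import math


def compare_nested_models(lnl_simple, lnl_complex, tolerance=1e-3):
    """Sanity-check and test a nested pair of models differing by two parameters.

    Returns (plausible, statistic, p_value). If the complex model has a
    lower log-likelihood than the simple one (beyond the tolerance), the
    optimisation most likely failed to converge, so no test is reported.
    """
    delta = lnl_complex - lnl_simple
    if delta < -tolerance:
        return False, None, None
    statistic = max(0.0, 2.0 * delta)
    # Survival function of a chi-square distribution with 2 degrees of
    # freedom has the closed form exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return True, statistic, p_value


# Hypothetical log-likelihoods for illustration only.
print(compare_nested_models(-1000.0, -997.0))   # plausible pair, test reported
print(compare_nested_models(-1000.0, -1005.0))  # complex model worse: flag it
```

A pair flagged as implausible would normally be rerun, for example from different starting values, before any conclusion about selection is drawn.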
6. L378-382. I don't understand the point made in this paragraph - genetic variation was found in this position, but how is this evidence of intra-host selection?
7. L391-393. The effect of these mutations on the virus's ability for fusion is complete speculation - it is not appropriate to refer to this as a 'low-fusion lineage' in the Discussion.
Minor comments:
- The description of virus genomic sequencing is not consistent (e.g. read length, paired/unpaired reads). Please add missing information.
- Line 430 - I believe this should say "Northeast Atlantic"?
- The scale bars on the phylogenetic trees represent the 'substitution rate' not 'mutation rate' - please change throughout.
- I like Fig. 5 but I find it very difficult to read, especially the part under the time line in the right panel.
- Highlight species of interest in the trees shown in supplemental figures S2 and S3?
- How was the tree in Fig 3 rooted?
- The trees would benefit from better formatting: if the viewer is supposed to evaluate the position of specific taxa (e.g. 'orphan sequences', sequences from UME or non-UME events, different host species), these should be visually highlighted in the tree.

Decision letter (RSPB-2021-0437.R0)

19-Apr-2021
Dear Dr Sawatzki: I am writing to inform you that your manuscript RSPB-2021-0437 entitled "Longitudinal analysis of pinnipeds in the Northwest Atlantic provides insights on endemic circulation and regional immunity to Phocine distemper virus" has, in its current form, been rejected for publication in Proceedings B.
This action has been taken on the advice of referees, who have recommended that substantial revisions are necessary. With this in mind we would be happy to consider a resubmission, provided the comments of the referees are fully addressed. However please note that this is not a provisional acceptance.
The resubmission will be treated as a new manuscript. However, we will approach the same reviewers if they are available and it is deemed appropriate to do so by the Editor. Please note that resubmissions must be submitted within six months of the date of this email. In exceptional circumstances, extensions may be possible if agreed with the Editorial Office. Manuscripts submitted after this date will be automatically rejected.
Please find below the comments made by the referees, not including confidential reports to the Editor, which I hope you will find useful. If you do choose to resubmit your manuscript, please upload the following: 1) A 'response to referees' document including details of how you have responded to the comments, and the adjustments you have made. 2) A clean copy of the manuscript and one with 'tracked changes' indicating your 'response to referees' comments document. 3) Line numbers in your main document. 4) Data -please see our policies on data sharing to ensure that you are complying (https://royalsociety.org/journals/authors/author-guidelines/#data).
To upload a resubmitted manuscript, log into http://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Resubmission." Please be sure to indicate in your cover letter that it is a resubmission, and supply the previous reference number.
Sincerely,
Professor Hans Heesterbeek
mailto: proceedingsb@royalsociety.org

Associate Editor
Board Member: 1
Comments to Author:
Both reviewers recognize the potential contribution of this work but raise substantial concerns that must be addressed, both regarding methodological approaches and presentation of results.

Reviewer(s)' Comments to Author:
Referee: 1
Comments to the Author(s)
Phocine distemper virus is an ecologically complex disease system and the authors do a commendable job of trying to piece together how multiple seal species located in North America, Greenland, and Europe intersect to cause periodic outbreaks of PDV. The genomic data add an important dimension to the story that leads to an intriguing hypothesis that mutations led to an attenuated strain that became endemic in the North American population and this explains high rates of seropositivity but less severe outbreaks in American seals. But it's a complex story with a lot of missing data and at times it can be difficult to piece it together the way it's presented.
1. Can you be clearer in the Methods/Results/Legend what sequences are included in Figure 1? What is the length (in nucleotides) of the full-length genomes versus Phosphoprotein-Matrix-Fusion-Hemagglutinin gene? If there is a large difference in sequence length it would make sense to also infer a tree just for the P-M-F-H as comparison.
2. Figure 5 contains a number of over-interpretations. It is impossible to infer from Figures 1 and 3 that endemic circulation of PDV began in the Northwest Atlantic population in 1987. The 2006 virus and 2017-2018 viruses share a common ancestor in the 2000s, as far back as you can reasonably infer, not to the TMRCA of the entire tree. And that is when the seal population reached the minimum population size needed to sustain endemic transmission. The dotted arrow with the 2001 trans-Atlantic movement is therefore highly speculative and should be removed.
3. The rooting is confusing in Figure 3. For consistency could you root the tree similar to Figure 1, rooted by the oldest 1988 viruses? Otherwise it's difficult to reconcile the two trees, particularly the US 2006 virus.

Bootstrap values indicate support for specific nodes, not branches, and should be placed next to nodes in Figure 3.
What's going on with Tiger and Cougar in Figure S3? Is that a misalignment? Sequencing errors?
3. It would be helpful to label the virus strain names more simply in the phylogenetic trees (remove the month and date at the end of the name so it's just year).
4. The map of the seal species ranges in Figure 5 is nice. Is it possible to include a map that shows the spatial differences in animal density? For example to show how European numbers are smaller? Can you describe a little more about what is known about long-distance movements/ranges of these seal species? Some more background on the ecology would be helpful.
Referee: 2
Comments to the Author(s)
This paper examines the phenomenon of phocine distemper virus outbreaks in Atlantic pinniped populations, which have caused significant mortality in some cases and received a lot of scientific and public interest over the past decades. One of the main conundrums about these outbreaks is why they appear to be limited to the eastern Atlantic, whereas outbreaks of comparable severity have not been seen in populations of the same species on the western side of the Atlantic. The study seeks to provide new answers to this question through a combination of viral genetic and seal host serology data. Based on these data, the authors argue that continuous circulation of a less pathogenic form of the virus in the northwest Atlantic might maintain high enough levels of immunity within these populations to prevent the kind of outbreaks seen in eastern populations.
I found the study interesting and well presented but I am not convinced by the authors' data and conclusions. Specifically, the argument about endemic circulation creating partial herd immunity rests on such endemic viruses being limited to western Atlantic populations but being absent in the East. Published data to document this might exist but if so they should be presented for comparison. The part about genetic differentiation of the viruses on either side of the Atlantic is interesting but doesn't appear to be novel. The last part of the study, focussed on variation in the virus' hemagglutinin gene which the authors argue is responsible for different virus phenotypes, seems very speculative and likely involves incorrect analyses (see below). I appreciate that the data for this type of work are hard to come by and that the current study is exceptional in terms of the amount of data that has been assembled. Still, the study would benefit from strengthening the arguments where possible and otherwise refraining from speculative claims, presenting the findings as hypothesis-generating rather than confirmatory. There are also several methodological issues that need to be addressed.
Specific points:
1. Based on serological and qPCR results, the authors argue that some form of PDV is circulating endemically in North American seals. For comparison, do we know that such evidence of consistent exposure is definitely not detectable in European populations? The authors cite data from the eastern Atlantic comparable to their own in the discussion but for the readers to evaluate this comparison it would be helpful to include these published data in their figures or tables. It is also not clear to me whether the available data are limited to serology or whether the same patterns hold when screening eastern populations by qPCR for circulating PDV.
2. Please indicate which sequences in Fig 1 were new and which had been previously published and analysed. This would make it easier to appreciate what the current study is able to add to previous work and what had been known before in terms of geographical clades and their divergence times. It would also be helpful to include highest posterior density intervals on the key internal nodes (i.e. split between NE and NW Atlantic).
3. Before attempting to reconstruct time-scaled phylogenies it would be important to confirm that there is actually enough temporal signal (increase in divergence over time) to estimate a molecular clock. This should be done based on the ML phylogeny using TempEst and results could go into the supplement rather than the main text.
4. Results of the model selection for molecular clock and demographic priors should be documented - the authors state that they decided on an exponential growth model but we don't know what other models were considered. Exponential growth might not be an obvious choice for a virus that appears to be circulating endemically and the choice of demographic prior can have a significant effect on estimated divergence times.
5. The selection analysis in Table S4 looks wrong: the simpler models (M1a, M7) have a much higher likelihood than their complex counterparts with two extra parameters, which shouldn't be the case (additional parameters might fail to improve the likelihood but they can't reduce it). This suggests that some of the PAML analyses didn't converge and that the results are unreliable. Given that there appears to be a lack of sites experiencing more than one non-synonymous substitution, I would be surprised if the data contained any evidence of positive selection. I don't think it is appropriate to use the BEAST tree as the topology for the PAML analysis; this should be a tree estimated without a molecular clock.
6. L378-382. I don't understand the point made in this paragraph - genetic variation was found in this position, but how is this evidence of intra-host selection?
7. L391-393. The effect of these mutations on the virus's ability for fusion is complete speculation - it is not appropriate to refer to this as a 'low-fusion lineage' in the Discussion.
Minor comments:
- The description of virus genomic sequencing is not consistent (e.g. read length, paired/unpaired reads). Please add missing information.
- Line 430 - I believe this should say "Northeast Atlantic"?
- The scale bars on the phylogenetic trees represent the 'substitution rate' not 'mutation rate' - please change throughout.
- I like Fig. 5 but I find it very difficult to read, especially the part under the time line in the right panel.
- Highlight species of interest in the trees shown in supplemental figures S2 and S3?
- How was the tree in Fig 3 rooted?
- The trees would benefit from better formatting: if the viewer is supposed to evaluate the position of specific taxa (e.g. 'orphan sequences', sequences from UME or non-UME events, different host species), these should be visually highlighted in the tree.

Recommendation
Accept with minor revision (please list in comments)

Scientific importance: Is the manuscript an original and important contribution to its field? Excellent
General interest: Is the paper of sufficient general interest? Good
Quality of the paper: Is the overall quality of the paper suitable? Good

Do you have any concerns about statistical analyses in this paper? If so, please specify them explicitly in your report. No
It is a condition of publication that authors make their supporting data, code and materials available -either as supplementary material or hosted in an external repository. Please rate, if applicable, the supporting data on the following criteria.

Comments to the Author
The authors did a commendable job incorporating the reviewers' comments and have made it much more clear what the contribution of their study is and what their key findings are. I only have a few specific points, listed below, which I suggest the authors should look at.
One general point is that the authors could do more to convey the broader importance of their work. The abstract for example is quite focussed on the virus studied but provides little indication as to why we need to know these things and what can be taken away from the study. Instead of reporting in the final sentence what sequence data have been generated, I would expect to see some general conclusions, possibly beyond PDV. Similar opportunities exist in the Introduction and the Discussion.
Specific points:
I appreciate the additional information provided about the marginal likelihood estimation and model comparison in BEAST in the supplement. Were those done using default settings? If so, please state that or otherwise provide further detail. Some of the results look questionable - the two methods (stepping stone and path sampling) should produce very similar estimates but for some of the models, including for the model selected as the top one, there are major differences, in the order of several hundred log units. This suggests that these analyses were probably not run for a sufficient length of time (steps). I don't expect this to change the main conclusions of the paper but it does not look like good practice.
It is not clear how the authors arrived at the "estimated minimum level of 250,000 needed to support endemic infection of morbillivirus" (line 1108-1110). This is a really interesting point (mirroring what is seen in other morbilliviruses like measles, maybe cite some of this work?). The cited reports look like they would contain estimated seal numbers but did they also report this persistence threshold for PDV? If so, how was it inferred?
line 545 - poorly constructed sentence: "Individual PDV genes were aligned for... genomes..." - why not simply "Sequences were aligned using Clustal"?
line 550 - how were dates formatted when only year was available? Or were all dates simplified to 'year only'?
line 554 - should say "RELAXED CLOCK with an uncorrelated lognormal distribution"
line 798 - I wonder about the heading: to me the key finding here is not the single introduction (there might have been others that simply weren't detected) but the continuous circulation of a single PDV lineage in the Northeastern Atlantic since the early 2000s
line 816, 997, 1128 - 'predicted' should be replaced with 'hypothesised'. The word prediction has a specific meaning in science and generally involves some quantitative (i.e. statistical) basis so it doesn't seem appropriate here.
line 825 - "...and is referred to ..." should be "... with the latter being referred to..."

Your manuscript has now been peer reviewed and the reviews have been assessed by an Associate Editor. The reviewer's comments (not including confidential comments to the Editor) and the comments from the Associate Editor are included at the end of this email for your reference. As you will see, the reviewer and the Associate Editor have raised some issues and we would like to invite you to revise your manuscript to address them.
We do not allow multiple rounds of revision so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Associate Editor, your manuscript will be sent back to one or more of the original reviewers for assessment. If the original reviewers are not available we may invite new reviewers. Please note that we cannot guarantee eventual acceptance of your manuscript at this stage.
To submit your revision please log into http://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions", click on "Create a Revision". Your manuscript number has been appended to denote a revision.
When submitting your revision please upload a file under "Response to Referees" in the "File Upload" section. This should document, point by point, how you have responded to the reviewers' and Editors' comments, and the adjustments you have made to the manuscript. We require a copy of the manuscript with revisions made since the previous version marked as 'tracked changes' to be included in the 'response to referees' document.
Your main manuscript should be submitted as a text file (doc, txt, rtf or tex), not a PDF. Your figures should be submitted as separate files and not included within the main manuscript file.
When revising your manuscript you should also ensure that it adheres to our editorial policies (https://royalsociety.org/journals/ethics-policies/). You should pay particular attention to the following: Research ethics: If your study contains research on humans please ensure that you detail in the methods section whether you obtained ethical approval from your local research ethics committee and gained informed consent to participate from each of the participants.
Use of animals and field studies: If your study uses animals please include details in the methods section of any approval and licences given to carry out the study and include full details of how animal welfare standards were ensured. Field studies should be conducted in accordance with local legislation; please include details of the appropriate permission and licences that you obtained to carry out the field work.
Data accessibility and data citation: It is a condition of publication that you make available the data and research materials supporting the results in the article (https://royalsociety.org/journals/authors/author-guidelines/#data). Datasets should be deposited in an appropriate publicly available repository and details of the associated accession number, link or DOI to the datasets must be included in the Data Accessibility section of the article (https://royalsociety.org/journals/ethics-policies/data-sharing-mining/). Reference(s) to datasets should also be included in the reference list of the article with DOIs (where available).
In order to ensure effective and robust dissemination and appropriate credit to authors the dataset(s) used should also be fully cited and listed in the references.
If you wish to submit your data to Dryad (http://datadryad.org/) and have not already done so you can submit your data via this link http://datadryad.org/submit?journalID=RSPB&manu=(Document not available), which will take you to your unique entry in the Dryad repository.
If you have already submitted your data to dryad you can make any necessary revisions to your dataset by following the above link.
For more information please see our open data policy http://royalsocietypublishing.org/datasharing.
Electronic supplementary material: All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI. Please try to submit all supplementary material as a single file.
Online supplementary material will also carry the title and description provided during submission, so please ensure these are accurate and informative. Note that the Royal Society will not edit or typeset supplementary material and it will be hosted as provided. Please ensure that the supplementary material includes the paper details (authors, title, journal name, article DOI). Your article DOI will be 10.1098/rspb.[paper ID in form xxxx.xxxx e.g. 10.1098/rspb.2016.0049].
Please submit a copy of your revised paper within three weeks. If we do not hear from you within this time your manuscript will be rejected. If you are unable to meet this deadline please let us know as soon as possible, as we may be able to grant a short extension.
Thank you for submitting your manuscript to Proceedings B; we look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.
Best wishes,
Professor Hans Heesterbeek
mailto: proceedingsb@royalsociety.org

Associate Editor
Board Member
Comments to Author:
Thank you for addressing the reviewers' comments in this revision. We have sent the manuscript back to one of the original referees, who has provided additional suggestions. In addition to addressing these critiques, please consider (a) moving some of the additional methodological detail added in this version on generation of the genomes into the supplement and (b) providing access to the serological data (in addition to the genetic data, which we note is already on GenBank).
Reviewer(s)' Comments to Author:
Referee: 2
Comments to the Author(s)
The authors did a commendable job incorporating the reviewers' comments and have made it much more clear what the contribution of their study is and what their key findings are. I only have a few specific points, listed below, which I suggest the authors should look at.
One general point is that the authors could do more to convey the broader importance of their work. The abstract for example is quite focussed on the virus studied but provides little indication as to why we need to know these things and what can be taken away from the study. Instead of reporting in the final sentence what sequence data have been generated, I would expect to see some general conclusions, possibly beyond PDV. Similar opportunities exist in the Introduction and the Discussion.
Specific points:
I appreciate the additional information provided about the marginal likelihood estimation and model comparison in BEAST in the supplement. Were those done using default settings? If so, please state that or otherwise provide further detail. Some of the results look questionable - the two methods (stepping stone and path sampling) should produce very similar estimates but for some of the models, including for the model selected as the top one, there are major differences, in the order of several hundred log units. This suggests that these analyses were probably not run for a sufficient length of time (steps). I don't expect this to change the main conclusions of the paper but it does not look like good practice.
It is not clear how the authors arrived at the "estimated minimum level of 250,000 needed to support endemic infection of morbillivirus" (line 1108-1110). This is a really interesting point (mirroring what is seen in other morbilliviruses like measles, maybe cite some of this work?). The cited reports look like they would contain estimated seal numbers but did they also report this persistence threshold for PDV? If so, how was it inferred?
line 545 - poorly constructed sentence: "Individual PDV genes were aligned for... genomes..." - why not simply "Sequences were aligned using Clustal"?

The Associate Editor has recommended publication, but also suggests some minor revisions to your manuscript. Therefore, I invite you to respond to the comment and revise your manuscript. Because the schedule for publication is very tight, it is a condition of publication that you submit the revised version of your manuscript within 7 days. If you do not think you will be able to meet this date please let us know.
To revise your manuscript, log into https://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision. You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript and upload a new version through your Author Centre.
When submitting your revised manuscript, you will be able to respond to the comments made by the referee(s) and upload a file "Response to Referees". You can use this to document any changes you make to the original manuscript. We require a copy of the manuscript with revisions made since the previous version marked as 'tracked changes' to be included in the 'response to referees' document.
Before uploading your revised files please make sure that you have: 1) A text file of the manuscript (doc, txt, rtf or tex), including the references, tables (including captions) and figure captions. Please remove any tracked changes from the text before submission. PDF files are not an accepted format for the "Main Document".
2) A separate electronic file of each figure (tiff, EPS or print-quality PDF preferred). The format should be produced directly from original creation package, or original software format. PowerPoint files are not accepted.
3) Electronic supplementary material: this should be contained in a separate file and where possible, all ESM should be combined into a single file. All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
Online supplementary material will also carry the title and description provided during submission, so please ensure these are accurate and informative. Note that the Royal Society will not edit or typeset supplementary material and it will be hosted as provided. Please ensure that the supplementary material includes the paper details (authors, title, journal name, article DOI). Your article DOI will be 10.1098/rspb.[paper ID in form xxxx.xxxx e.g. 10.1098/rspb.2016.0049].
4) A media summary: a short non-technical summary (up to 100 words) of the key findings/importance of your manuscript.

5) Data accessibility section and data citation
It is a condition of publication that data supporting your paper are made available either in the electronic supplementary material or through an appropriate repository.
In order to ensure effective and robust dissemination and appropriate credit to authors the dataset(s) used should be fully cited. To ensure archived data are available to readers, authors should include a 'data accessibility' section immediately after the acknowledgements section. This should list the database and accession number for all data from the article that has been made publicly available, for instance:
• DNA sequences: Genbank accessions F234391-F234402
• Phylogenetic data: TreeBASE accession number S9123
• Final DNA sequence assembly uploaded as online supplemental material
• Climate data and MaxEnt input files: Dryad doi:10.5521/dryad.12311
NB. From April 1 2013, peer reviewed articles based on research funded wholly or partly by RCUK must include, if applicable, a statement on how the underlying research materials - such as data, samples or models - can be accessed. This statement should be included in the data accessibility section.
If you wish to submit your data to Dryad (http://datadryad.org/) and have not already done so you can submit your data via this link http://datadryad.org/submit?journalID=RSPB&manu=(Document not available) which will take you to your unique entry in the Dryad repository. If you have already submitted your data to Dryad you can make any necessary revisions to your dataset by following the above link. Please see https://royalsociety.org/journals/ethics-policies/data-sharing-mining/ for more details.
6) For more information on our Licence to Publish, Open Access, Cover images and Media summaries, please visit https://royalsociety.org/journals/authors/author-guidelines/.
Once again, thank you for submitting your manuscript to Proceedings B and I look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.

19-Oct-2021
Dear Dr Sawatzki,
I am pleased to inform you that your manuscript entitled "Longitudinal analysis of pinnipeds in the Northwest Atlantic provides insights on endemic circulation of Phocine distemper virus" has been accepted for publication in Proceedings B.
You can expect to receive a proof of your article from our Production office in due course, please check your spam filter if you do not receive it. PLEASE NOTE: you will be given the exact page length of your paper which may be different from the estimation from Editorial and you may be asked to reduce your paper if it goes over the 10 page limit.
If you are likely to be away from e-mail contact please let us know. Due to rapid publication and an extremely tight schedule, if comments are not received, we may publish the paper as it stands.
If you have any queries regarding the production of your final article or the publication date please contact procb_proofs@royalsociety.org. Your article has been estimated as being 10 pages long. Our Production Office will be able to confirm the exact length at proof stage.
Data Accessibility section: Please remember to make any data sets live prior to publication, and update any links as needed when you receive a proof to check. It is good practice to also add data sets to your reference list.
Open Access: You are invited to opt for Open Access, making your article freely available to all as soon as it is ready for publication under a CCBY licence. Our article processing charge for Open Access is £1700. Corresponding authors from member institutions (http://royalsocietypublishing.org/site/librarians/allmembers.xhtml) receive a 25% discount to these charges. For more information please visit http://royalsocietypublishing.org/open-access.
Paper charges: An e-mail request for payment of any related charges will be sent out shortly. The preferred payment method is by credit card; however, other payment options are available.
Electronic supplementary material: All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
You are allowed to post any version of your manuscript on a personal website, repository or preprint server. However, the work remains under media embargo and you should not discuss it with the press until the date of publication. Please visit https://royalsociety.org/journals/ethicspolicies/media-embargo for more information.
Thank you for your fine contribution. On behalf of the Editors of the Proceedings B, we look forward to your continued contributions to the Journal.
Our manuscript addresses a question that has perplexed marine wildlife ecologists and virologists for decades: Why does Phocine distemper virus cause massive mortality events in Northern European waters, but not along the North American Atlantic coast?
We appreciate the Editors' interest in giving our manuscript further consideration. We are grateful to the reviewers for their careful reading of our work and the insightful suggestions on data analyses and interpretation. We are pleased that both reviewers were intrigued by the data and saw how it contributed to our knowledge of PDV. We also appreciate feedback on places where the narrative was unclear and the suggestions on ways to clarify the data presentation.
We have incorporated all suggestions from both reviewers and believe that we now present a much more robust and clear representation of the data. The three overarching critiques involved 1) bioinformatic methodology concerns, which we addressed and clarified, 2) interpretations of the more speculative pieces of data, which we either removed altogether or softened, and 3) the complex flow of the data components, which we have reformatted and streamlined for clarity to readers.
Full responses and revisions are found at the end of this letter.
We believe that the work presented in this manuscript is an interesting story and important contribution to viral ecology. We believe it will be of interest to readers from multiple disciplines including virology, epidemiology, ecology and animal sciences. Thank you for your continued consideration of this manuscript for publication in Proceedings of the Royal Society B.
Comments to the Author(s)
Phocine distemper virus is an ecologically complex disease system and the authors do a commendable job of trying to piece together how multiple seal species located in North America, Greenland, and Europe intersect to cause periodic outbreaks of PDV. The genomic data adds an important dimension to the story that leads to an intriguing hypothesis that mutations led to an attenuated strain that became endemic in the North American population and this explains high rates of seropositivity but less severe outbreaks in American seals. But it's a complex story with a lot of missing data and at times it can be difficult to piece it together the way it's presented.
We agree on the complexity of the story and the difficulty in how best to present the interwoven pieces. For this resubmission, we substantially reworked the flow of the manuscript and how the data are presented and cross-referenced, and we unified how the data are referred to across the figures and throughout the text. We removed the more speculative and tangential components of the manuscript. As such, the tables and figures are now in a different order than they were in the initial submission and some have been removed.
The flow of the data presentation that remains in the new format is:
• There is evidence of ongoing PDV in North America outside of UME events, and PDV is present in different species, ages, and clinical presentations (data from archived samples, strandings, and live-captures)
• Genetic sequence from both virus and host support the observed absence of species barrier to PDV infection that we observe in North America
• Phylogeny from new sequences support a single incursion into North America
• Within North America, we found evidence for the ongoing presence of PDV in the population and a regional lineage, in addition to the UME associated lineage
Additional changes that were made to improve clarity:
• We relabeled sequence and animal IDs to standardize across all tables and figures.
• We added a standardized color scheme across figures
• We unified how regions are referred to and wherever possible, standardized the text to be Northwest Atlantic and Northeast Atlantic
• We removed the discussion on the possible reseeding of PDV into Europe from virus circulating in North America and adjusted our interpretation of the incursion timeframe for North America.
• We softened the interpretations related to a putative impact on fusogenicity in the regional strain identified in this study and removed the PAML analysis looking at possible positive selection.
We believe that the new format is significantly more streamlined to provide better clarity on how the most important pieces fit together and that the remaining interpretations are well supported by the presented data.
1. Can you be clearer in the Methods/Results/Legend what sequences are included in Figure 1? What is the length (in nucleotides) of the full-length genomes versus Phosphoprotein-Matrix-Fusion-Hemagglutinin gene? If there is a large difference in sequence length it would make sense to also infer a tree just for the P-M-F-H as comparison.
The figures have been reordered and the previous Figure 1 is now Figure 2. The sequences that were used for all figures and tables have been better described and annotated throughout the text, figures, and legends. Whenever newly derived sequences are shown in tables or figures, they are marked with an asterisk (currently Figure 2, 3, 4; Supplemental Table S2, S3). Wherever a sequence dataset is utilized (phylogenies of genome and H gene) the precise composition of the included sequences, and whether they are newly reported here or previously published, is described in both the methods and results.
The sequence length for the full genome is 15,696 nucleotides and the partial genome of P-M-F-H is 7,268. The full analysis was run on the composite sample set reported here (8 full available genomes, combined with 44 P-M-F-H sequences from prior publications) in order to preserve all available data within the model. This was selected to preserve resolution of the North American clade, particularly given the relatively small number of available sequences. An additional tree was generated with all sequences truncated to the P-M-F-H region and the resulting phylogeny was highly comparable to that from the tree presented in the manuscript with only minor changes in highly similar tips. The additional tree with truncated sequence and all related output files are now available on the GitHub repository (https://github.com/ksawatzki/Supp_data/), and text and a supplemental figure added (lines 297-298, Figure S5).
We appreciate the reviewer's suggestions on how to improve the summary map and related text for Figure 5 and have revised accordingly. The goal for Figure 5 is to summarize what is well supported with what we hypothesize based on the new data presented here, so we have tried to more clearly make that distinction in this revision. We removed all discussion of the possible seeding of Europe in 2001 from North America and the trans-Atlantic dotted arrow has been removed. We have also adjusted the estimated incursion to state "by 2001" and referenced that a more precise estimate is currently hindered by limited sequence availability, particularly from North America prior to 2000.

3. The rooting is confusing in Figure 3. For consistency could you root the tree similar to Figure 1, rooted by the oldest 1988 viruses? Otherwise it's difficult to reconcile the two trees, particularly the US 2006 virus.
We replaced the H gene tree with the original paired supplemental figure (BEAST) to match the type of analysis performed in Figure 2. All viral phylogenetic trees are now time-scaled.
4. Is there any experimental evidence that the US viruses are attenuated?
No. We do not yet have experimental evidence that the regional North American variant is attenuated. Our analysis is currently in silico but provides a mechanistically grounded hypothesis. We have removed several points of discussion on this variant and softened the remaining language throughout the manuscript to reflect the fact that the phenotype of this variant is currently speculative.
Minor 1. Bootstrap values indicate support for specific nodes, not branches, and should be placed next to nodes in Figure 3.
This tree has been replaced so this no longer applies.
2. What's going on with Tiger and Cougar in Figure S3? Is that a misalignment? Sequencing errors?
This now refers to supplemental Figure S4. The tiger and cougar Nectin-4 sequences do cluster with the other large cats, but have significantly longer branch lengths than any of the other sequences. The alignment is robust and there are no reported issues with the available sequence that was used in this analysis, though the Nectin-4 sequences from tiger and cougar each contain multiple insertions and deletions. It is unclear to us why those particular species are so unusual, but the sequences are from the RefSeq annotations for both animals.
3. It would be helpful to label the virus strain names more simply in the phylogenetic trees (remove the month and date at the end of the name so it's just year).
This change has been made throughout the manuscript text, figures, and tables. Where possible, we also standardized the naming scheme of the sequences to reflect country/species/identifier/year.
4. The map of the seal species ranges in Figure 5 is nice. Is it possible to include a map that shows the spatial differences in animal density? For example to show how European numbers are smaller? Can you describe a little more about what is known about long-distance movements/ranges of these seal species? Some more background on the ecology would be helpful.
Two paragraphs have been added to the discussion to describe population numbers and movement of all 3 species on both sides of the Atlantic (lines 364-394). We agree that a map that includes animal density would be a helpful tool and, based on the reviewer's suggestion, we attempted to add that to the summary figure or as a supplemental figure. Given that the available density data varies significantly in terms of surveillance sampling effort, frequency and geographic region, the information did not map well in a way that informed on the three different species. We have instead added an additional bar chart as part (b) of Figure 5 that bins the available data according to region of the Atlantic and helps to provide a visual representation of species density in the broad regions of interest. This is intentionally placed under the map to encourage easier interpretation of the east to west alignment.
Referee: 2
Comments to the Author(s)
This paper examines the phenomenon of phocine distemper virus outbreaks in Atlantic pinniped populations, which have caused significant mortality in some cases and received a lot of scientific and public interest over the past decades. One of the main conundrums about these outbreaks is why they appear to be limited to the eastern Atlantic, whereas outbreaks of comparable severity have not been seen in populations of the same species on the western side of the Atlantic. The study seeks to provide new answers to this question through a combination of viral genetic and seal host serology data. Based on these data, the authors argue that continuous circulation of a less pathogenic form of the virus in the northwest Atlantic might maintain high enough levels of immunity within these populations to prevent the kind of outbreaks seen in eastern populations.
I found the study interesting and well presented but I am not convinced by the authors' data and conclusions. Specifically, the argument about endemic circulation creating partial herd immunity rests on such endemic viruses being limited to western Atlantic populations but being absent in the East. Published data to document this might exist but if so they should be presented for comparison. The part about genetic differentiation of the viruses on either side of the Atlantic is interesting but doesn't appear to be novel. The last part of the study, focused on variation in the virus' hemagglutinin gene which the authors argue is responsible for different virus phenotypes, seems very speculative and likely involves incorrect analyses (see below). I appreciate that the data for this type of work are hard to come by and that the current study is exceptional in terms of the amount of data that has been assembled. Still, the study would benefit from strengthening the arguments where possible but to otherwise refrain from speculative claims and present the findings as hypothesis-generating rather than confirmatory. There are also several methodological issues that need to be addressed.
Given the speculative nature of the possible cross-protection from a putative less virulent strain, we have reworked the full manuscript to decrease that discussion point, and refocused primarily on the ongoing presence of PDV in North America. The interpretations on the regional variant have been reworded to more clearly convey that they are meant to be hypothesis generating and intriguing considerations for future work. We have also addressed the methodological concerns that were raised, with each further described in the specific points below.
Specific points: 1. Based on serological and qPCR results, the authors argue that some form of PDV is circulating endemically in North American seals. For comparison, do we know that such evidence of consistent exposure is definitely not detectable in European populations? The authors cite data from the eastern Atlantic comparable to their own in the discussion but for the readers to evaluate this comparison it would be helpful to include these published data in their figures or tables. It is also not clear to me whether the available data are limited to serology or whether the same patterns hold when screening eastern populations by qPCR for circulating PDV.
There have been a handful of published studies looking for PDV in European populations. Cumulatively they have included over one thousand animals, primarily harbor seals, though some grey and ringed seals have also been reported. Published surveillance includes samples from 1988 through 2014, spanning the first European UME and extending 12 years beyond the second UME. Studies have focused primarily on serology, though a smaller dataset of 117 animals from the North Sea and Greenland did include RT-PCR screening and failed to detect any RT-PCR positive animals. From the serology data in Europe, antibodies are detectable for the first few years post UME, but rapidly decline to a point of being undetectable except in the case of older animals who had lived through one of the UME timeframes. Beyond 2 years post UME, antibodies are no longer detected in young of the year or pups. This point has been elaborated on with citations in the introduction text of the manuscript (lines 66-70).
2. Please indicate which sequences in Fig 1 were new and which had been previously published and analysed. This would make it easier to appreciate what the current study is able to add to previous work and what had been known before in terms of geographical clades and their divergence times. It would also be helpful to include highest posterior density intervals on the key internal nodes (i.e. split between NE and NW Atlantic).
The manuscript has undergone significant restructuring in order to make the data presentation flow more clearly. As such the previous Figure 1 is now Figure 2 in the current version. In all text, figures, and tables throughout the manuscript all newly reported sequences are now marked with an asterisk. Each data presentation also includes more detailed description of which sequences were included. The split between NE and NW Atlantic is estimated at 2001 (HPD 95% 1998-2002) and has been added in the results (lines 295-297).
3. Before attempting to reconstruct time-scaled phylogenies it would be important to confirm that there is actually enough temporal signal (increase in divergence over time) to estimate a molecular clock. This should be done based on the ML phylogeny using TempEst and results could go into the supplement rather than the main text. & 4. Results of the model selection for molecular clock and demographic priors should be documented - the authors state that they decided on an exponential growth model but we don't know what other models were considered. Exponential growth might not be an obvious choice for a virus that appears to be circulating endemically and the choice of demographic prior can have a significant effect on estimated divergence times.
These analyses are now described in the supplemental materials methods with supporting figures in supplemental Figure S1, supplemental table S2 and noted in the main text in lines 200-202.
The H gene tree reconstructed using maximum likelihood methods with dated tips was input into TempEst v1.5.3 to evaluate clock-like evolution. The results indicate that divergence increased as a function of time with an R-squared value of 0.859, and a correlation coefficient of 0.9273.
To perform this analysis the tree was evaluated using the 'best-fitting root'. We interpreted this as a good temporal signal, indicating that the H gene of PDV is fit for analysis with Bayesian phylogenetics (Figure S1).
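For readers less familiar with this check, the root-to-tip analysis amounts to a linear regression of genetic divergence on sampling date. The following is a minimal Python sketch of that regression only; it does not reproduce TempEst's root optimisation, the function name `root_to_tip_regression` is hypothetical, and the dates and distances are illustrative placeholders rather than the PDV data.

```python
from statistics import mean

def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r, r_squared). The slope approximates the
    substitution rate and the x-intercept approximates the age of the root.
    """
    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, -intercept / slope, r, r * r

# Illustrative placeholder values only (decimal years, substitutions/site).
example_dates = [1988.5, 2002.5, 2006.5, 2011.5, 2018.5]
example_distances = [0.001, 0.012, 0.016, 0.019, 0.026]
slope, root_age, r, r_squared = root_to_tip_regression(example_dates, example_distances)
```

In a simple linear regression the R-squared value is the square of the correlation coefficient, which is consistent with the two values reported above (0.9273 squared is approximately 0.86).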
A table describing the model testing output has been added to the supplemental material as Table S2 and is now described in the supplemental methods. All raw data and output from model testing in BEAST have been added to the GitHub repository (https://github.com/ksawatzki/Supp_data/). Six parameterizations were tested using path and stepping stone sampling in BEAST, and the log Bayes factor compared. Constant size models using strict and relaxed clocks were used as the null, and compared against exponential growth and GMRF Bayesian Skyride models with strict and relaxed clocks.
5. The selection analysis in Table S4 looks wrong: the simpler models (M1a, M7) have a much higher likelihood than their complex counterparts with two extra parameters, which shouldn't be the case (additional parameters might fail to improve the likelihood but they can't reduce it). This suggests that some of the PAML analyses didn't converge and that the results are unreliable.
Given that there appears to be a lack of sites experiencing more than one non-synonymous substitution, I would be surprised if the data contained any evidence of positive selection. I don't think it is appropriate to use the BEAST tree as the topology for the PAML analysis, this should be a tree estimated without a molecular clock.
We appreciate this useful interpretation of the PAML output and we agree with the reviewer.
We have decided to remove this from the manuscript.
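The nestedness argument raised by the reviewer can be expressed as a simple sanity check before any likelihood ratio test is interpreted. The sketch below is illustrative only: the function name `nested_lrt` and the log-likelihood values are hypothetical, the pairing of M1a with M2a and M7 with M8 is the standard codeml site-model convention rather than something stated in this letter, and the 2-degree-of-freedom chi-square reference is the conventional approximation.

```python
import math

def nested_lrt(lnl_simple, lnl_complex, tol=1e-3):
    """Likelihood ratio test for a nested pair with two extra parameters.

    Raises ValueError if the complex model has a lower log-likelihood than
    the simpler model nested within it, which points to a convergence
    problem rather than a biological result.
    Returns (statistic, p_value) using a chi-square with 2 degrees of
    freedom, whose survival function is exp(-x / 2).
    """
    if lnl_complex < lnl_simple - tol:
        raise ValueError(
            "complex model fits worse than its nested simpler model; "
            "rerun the optimisation before interpreting the test"
        )
    statistic = max(0.0, 2.0 * (lnl_complex - lnl_simple))
    return statistic, math.exp(-statistic / 2.0)

# Hypothetical log-likelihoods for a well-behaved pair (e.g. M7 vs M8).
statistic, p_value = nested_lrt(-1000.0, -995.0)
```

A statistic above 5.991 corresponds to p < 0.05 under this reference distribution.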
6. L378-382. I don't understand the point made in this paragraph -genetic variation was found in this position, but how is this evidence of intra-host selection?
This section has been removed from the text.
7. L391-393. The effect of these mutations on the virus ability for fusion is complete speculation - it is not appropriate to refer to this as a 'low-fusion lineage' in the Discussion
This has been either completely removed, or significantly softened and reworded as speculative in the few instances where the reference was preserved.
minor comments:
-description of virus genomic sequencing is not consistent (e.g. read length, paired/unpaired reads). Please add missing information
We agree that the genomic sequencing methodology was confusing as presented. The 6 genomes were generated at 3 different institutions using different platforms and processing approaches, even within an institution. We have reorganized and reworded the methods so that details are provided for each genome, rather than by institution and the information is presented in a more consistent manner. More specific sequencing methods were added where missing.
-line 430 -I believe this should say "Northeast Atlantic"?
The manuscript has been significantly reworked and this specific line no longer exists; we have carefully reviewed the current version for similar mistakes.
-the scale bars on the phylogenetic trees represent the 'substitution rate' not 'mutation rate'; please change throughout
This has been corrected throughout the manuscript.
-I like Fig. 5 but I find it very difficult to read, especially the part under the time line in the right panel
We agree that the annotations under the figure were difficult to read. We have removed them from the figure as we have decided that they were not necessary to the purpose of that figure and created unnecessary clutter. Figure 5 has been further modified to include a 3rd panel (now Figure 5b) to show species density.
-Highlight species of interest in the trees shown in supplemental figures S2 and S3?
This has been added. Grey and harbor seals are marked in blue, other marine mammals are marked in yellow.
-how was the tree in Fig 3 rooted?
The previous unrooted RAxML Figure 3 has been removed and is now a time-scaled phylogenetic analysis of the H gene.
-the trees would benefit from better formatting: if the viewer is supposed to evaluate the position of specific taxa (e.g. 'orphan sequences', sequences from UME or non-UME events, different host species), these should be visually highlighted in the tree
This has been fixed throughout. Species are now defined by color and newly reported sequences are all marked with an asterisk. Sample names have been standardized and simplified where possible.
All models were run with specified tree priors and clocks using default parameters for the HKYγ substitution model. This has been clarified in the methods. We further thank the reviewer for noticing the PS/SS difference in two tested models. In light of this, we re-ran these model tests using longer chain lengths, which resulted in highly comparable PS/SS log marginal likelihood values; these have been added to Supplemental table 2. This resulted in the constant model having slightly better support than exponential growth, and therefore we re-ran the BEAST analysis using the constant size tree prior. As anticipated, the outputs are qualitatively indistinguishable, with only within-clade movement in homogeneous clusters. We have replaced figures 2 and 3 with the result of the updated model. There is no change in quantitative results (reported tMRCAs and 95% HPD), our interpretation or text. We sincerely appreciate the expert guidance in strengthening the support for this paper.
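As an illustration of the comparison described here, the log Bayes factor of each model against a null is the difference between their log marginal likelihoods. The sketch below assumes hypothetical model labels and placeholder log marginal likelihood values; it does not reproduce the values in the supplemental table, and the function name `log_bayes_factors` is introduced only for this example.

```python
def log_bayes_factors(log_marginal_likelihoods, null_model):
    """Log Bayes factor of every model relative to the chosen null model.

    ln BF(model vs null) = ln ML(model) - ln ML(null); positive values
    favour the model over the null.
    """
    baseline = log_marginal_likelihoods[null_model]
    return {name: lml - baseline for name, lml in log_marginal_likelihoods.items()}

# Placeholder log marginal likelihoods for six hypothetical parameterizations
# (tree prior x clock); these are NOT the values from the manuscript.
example_lml = {
    "constant_strict": -12010.0,
    "constant_relaxed": -12000.0,
    "exponential_strict": -12011.5,
    "exponential_relaxed": -12001.0,
    "skyride_strict": -12012.0,
    "skyride_relaxed": -12003.0,
}
example_lbf = log_bayes_factors(example_lml, null_model="constant_strict")
best_model = max(example_lbf, key=example_lbf.get)
```

Comparing path sampling and stepping stone estimates for the same model in this way is also a quick check that both estimators have converged to similar values.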
It is not clear how the authors arrived at the "estimated minimum level of 250,000 needed to support endemic infection of morbillivirus" (line 1108-1110). This is a really interesting point (mirroring what is seen in other morbilliviruses like measles, maybe cite some of this work?). The cited reports look like they would contain estimated seal numbers but did they also report this persistence threshold for PDV? If so, how was it inferred?
As the reviewer has noted, this is the estimate for morbillivirus that has been derived from measles and the references were inadvertently omitted. We thank the reviewer for catching this omission and have corrected it. These references are citations 39-41 (Black, Bartlett, and Keeling & Grenfell).
line 550 -how were dates formatted when only year was available? Or were all dates simplified to 'year only'?
For BEAST tip dating, in the few cases when only a year was available, the tip date was set to the midpoint of the known time period (year or month) with uncertainty set to cover the full period. For instance, if only a year was known, tip date was set to 'YYYY.5' with 0.5 uncertainty.
We have included a sentence describing this in the methods. For figures, dates were simplified to year only in response to a prior reviewer suggestion to simplify the sequence names throughout the manuscript.
Line 178-180: When exact date was not known, the date was set at the mid-point of the known month or year with uncertainty spanning the time period.
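The midpoint rule described above can be sketched as a small helper. This is an illustration under stated assumptions, not the script used for the analysis: the function name `tip_date` is hypothetical, and months are treated as twelve equal fractions of a year for simplicity.

```python
def tip_date(year, month=None):
    """Decimal tip date and symmetric uncertainty for an imprecise sampling date.

    Year only: midpoint of the year (YYYY.5) with 0.5 years of uncertainty.
    Year and month: midpoint of the month with half a month of uncertainty,
    approximating each month as 1/12 of a year.
    """
    if month is None:
        return year + 0.5, 0.5
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return year + (month - 0.5) / 12.0, 1.0 / 24.0

year_only = tip_date(2006)        # (2006.5, 0.5)
year_month = tip_date(2018, 1)    # midpoint of January 2018
```

The returned pair corresponds to the tip date and the half-width of the sampling window that the date is allowed to vary over.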
line 554 -should say "RELAXED CLOCK with an uncorrelated lognormal distribution"
This change has been made as suggested.
Line 181-184: Final analyses were independently run 5 times on BEAST (v1.10.4) with a chain length of 100 million generations using a constant growth coalescent tree prior, relaxed clock with an uncorrelated lognormal distribution, an HKY γ substitution model, and default parameters
line 798 -I wonder about the heading: to me the key finding here is not the single introduction (there might have been others that simply weren't detected) but the continuous circulation of a single PDV lineage in the Northeastern Atlantic since the early 2000's
This heading has been changed.
Line 262: Continuous circulation of a single PDV lineage in the Northwest Atlantic
line 816, 997, 1128 -'predicted' should be replaced with 'hypothesised'. The word prediction has a specific meaning in science and generally involves some quantitative (i.e. statistical) basis so it doesn't seem appropriate here.
The changes have been made as suggested.
Line 279-280: A clade seen in regional pinnipeds in eastern North America exhibits a Hemagglutinin substitution hotspot hypothesized to decrease fusogenic activity
Lines 306-308: Given the location of these substitutions in a region critical for viral fusion and the persistence of this substitution over multiple years, we hypothesize viruses with this hotspot may have impaired fusogenicity.
Lines 381-382: This regional variant has a persistent substitution hotspot that we hypothesize will decrease viral fusion and may result in a naturally attenuated virus.
line 825 -"...and is referred to ..." should be "... with the latter being referred to..."
This change has been made as suggested.
Lines 88-90: All sequences from the Northwest Atlantic spanning 2011-2015 fell into one conserved, distinct lineage that was not shared by any of the Northeast Atlantic sequences with the latter being referred to as the North American lineage from here on.
The color used in Figure 3 has been adjusted to match the color used in Figure 2, and the sequences have been labelled to distinguish the Northwestern and Northeastern sequences.