Human listeners’ perception of behavioural context and core affect dimensions in chimpanzee vocalizations

Vocalizations linked to emotional states are partly conserved among phylogenetically related species. This continuity may allow humans to accurately infer affective information from vocalizations produced by chimpanzees. In two pre-registered experiments, we examine human listeners' ability to infer behavioural contexts (e.g. discovering food) and core affect dimensions (arousal and valence) from 155 vocalizations produced by 66 chimpanzees in 10 different positive and negative contexts at high, medium or low arousal levels. In experiment 1, listeners (n = 310) categorized the vocalizations in a forced-choice task with 10 response options, and rated arousal and valence. In experiment 2, participants (n = 3120) matched vocalizations to production contexts using yes/no response options. The results show that listeners were accurate at matching vocalizations of most contexts in addition to inferring arousal and valence. Judgments were more accurate for negative as compared to positive vocalizations. An acoustic analysis demonstrated that listeners made use of brightness and duration cues, and relied on noisiness in making context judgements, and pitch to infer core affect dimensions. Overall, the results suggest that human listeners can infer affective information from chimpanzee vocalizations beyond core affect, indicating phylogenetic continuity in the mapping of vocalizations to behavioural contexts.

It is a condition of publication that authors make their supporting data, code and materials available - either as supplementary material or hosted in an external repository. Please rate, if applicable, the supporting data on the following criteria.

Do you have any ethical concerns with this paper? No
Comments to the Author
Human Listeners' Perception of Behavioural Context and Core Affect Dimensions in Chimpanzee Vocalisations Kamiloğlu et al.
Overall, I found this manuscript to be a generally thorough and interesting examination of cross-species vocal perception. It contains very detailed acoustic analyses and also two main experiments to determine human perception of chimp calls, based on behavioural context in which the calls were uttered, as well as arousal and valence (two key components linked to emotions). The results mostly appear robust and convincing, except see my comments on Exp 1 below.
I suggest the following changes: Please insert line numbers in order to make the lives of reviewers and editors a little easier.
Key words should be in alphabetical order.
The order of results should be changed. A full understanding of Experiments 1 and 2 is based on section 3 of the results - Acoustic Analysis. Reading Experiment 1 at first, I thought the classification of arousal in calls was based on subjective assessments of one of the authors (See: Text S1: Recording of chimpanzee vocalisations) Table 1. I assume the classification of calls according to arousal and valence is based on the results of the Acoustic Analysis section? However, this is not clear. For example, it is possible that, at least subjectively and without acoustic data, Tantrum screams might be considered High instead of Medium arousal, and Whimpers might be considered Low arousal instead of Medium.
Please provide examples of the call types as Supplemental files. These should be available at the review stage.
Please insert the sample size for chimps used in the study and not just the number of vocalisations (n = 155). The chimp sample size used for sourcing the call examples also needs to be more prominent in both the Methods (e.g. page 7/32) and Results of the manuscript; currently it is quite difficult to locate.
Page 8/32. Experiment 1. "Participants listened to the 155 chimpanzee vocalisations". I have reservations about how informative this sort of setup could be (with 10 behavioural contexts, three arousal levels). How long (average and SD) did it take human participants to work their way through 155 chimp calls? Would you expect the same level of accuracy and focus in the human subjects at call 10 or 14, versus call 145 or 150? There is no analysis reported that checks whether the human subjects were better at classifying the first 30 calls versus the last 30, for example.
The formatting of references contains many inconsistencies, which are not in the journal style.
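The order-effect check requested in the comment on Experiment 1 (accuracy on the first 30 calls versus the last 30) could be sketched as follows. This is a minimal Python illustration, not the authors' analysis: the function name, the (presentation position, correct) trial format, and the toy data are all hypothetical assumptions.

```python
from statistics import mean


def early_late_accuracy(trials, k=30):
    """Compare one listener's accuracy on the first k and last k calls heard.

    trials: list of (presentation_position, correct) pairs (hypothetical format).
    Returns (early_accuracy, late_accuracy, late_minus_early).
    """
    ordered = sorted(trials, key=lambda t: t[0])  # order by presentation position
    if len(ordered) < 2 * k:
        raise ValueError("need at least 2*k trials so the windows do not overlap")
    early = mean(1.0 if correct else 0.0 for _, correct in ordered[:k])
    late = mean(1.0 if correct else 0.0 for _, correct in ordered[-k:])
    return early, late, late - early


# Toy listener (invented data): 155 trials, accuracy drops in the second half.
toy = [
    (pos, (pos % 2 == 0) if pos <= 77 else (pos % 4 == 0))
    for pos in range(1, 156)
]
early, late, diff = early_late_accuracy(toy, k=30)
print(early, late, diff)
```

A negative difference for most listeners would point to declining accuracy over the session. A full analysis would more likely model trial position as a predictor across all listeners rather than compare two windows.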
Decision letter (RSPB-2020-0465.R0)
08-Apr-2020
Dear Ms Kamiloglu:
I am writing to inform you that your manuscript RSPB-2020-0465 entitled "Human Listeners' Perception of Behavioural Context and Core Affect Dimensions in Chimpanzee Vocalisations" has, in its current form, been rejected for publication in Proceedings B.
This action has been taken on the advice of referees, who have recommended that substantial revisions are necessary. With this in mind we would be happy to consider a resubmission, provided the comments of the referees are fully addressed. However please note that this is not a provisional acceptance.
The resubmission will be treated as a new manuscript. However, we will approach the same reviewers if they are available and it is deemed appropriate to do so by the Editor. Please note that resubmissions must be submitted within six months of the date of this email. In exceptional circumstances, extensions may be possible if agreed with the Editorial Office. Manuscripts submitted after this date will be automatically rejected.
Please find below the comments made by the referees, not including confidential reports to the Editor, which I hope you will find useful. If you do choose to resubmit your manuscript, please upload the following: 1) A 'response to referees' document including details of how you have responded to the comments, and the adjustments you have made.
2) A clean copy of the manuscript and one with 'tracked changes' indicating your 'response to referees' comments document.
3) Line numbers in your main document.
To upload a resubmitted manuscript, log into http://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Resubmission." Please be sure to indicate in your cover letter that it is a resubmission, and supply the previous reference number.
Sincerely,
Dr Robert Barton
mailto:proceedingsb@royalsociety.org
Associate Editor Board Member: 1
Comments to Author:
The two reviewers agree that the study is interesting and informative. There are, however, a number of areas where further clarification of the methodology is required. Reviewer 2 also raises important concerns about potential changes in the accuracy of classifications over time. It would be relatively straightforward to include additional analyses to address this issue.
Reviewer(s)' Comments to Author:
Referee: 1
Comments to the Author(s)
This is an interesting study that looked at humans' ability to categorize level of arousal, valence, and behavioural context from nonhuman primate vocalizations. Participants were good at the arousal and valence categorizations, but poorer at context categorizations.
I have no concerns, but I do wonder whether the authors administered any questionnaires to participants to determine their level of experience with nonhuman animals' vocalizations.

Referee: 2
Comments to the Author(s)
Human Listeners' Perception of Behavioural Context and Core Affect Dimensions in Chimpanzee Vocalisations Kamiloğlu et al.
Overall, I found this manuscript to be a generally thorough and interesting examination of cross-species vocal perception. It contains very detailed acoustic analyses and also two main experiments to determine human perception of chimp calls, based on behavioural context in which the calls were uttered, as well as arousal and valence (two key components linked to emotions). The results mostly appear robust and convincing, except see my comments on Exp 1 below.
I suggest the following changes: Please insert line numbers in order to make the lives of reviewers and editors a little easier.
Key words should be in alphabetical order.
The order of results should be changed. A full understanding of Experiments 1 and 2 is based on section 3 of the results - Acoustic Analysis. Reading Experiment 1 at first, I thought the classification of arousal in calls was based on subjective assessments of one of the authors (See: Text S1: Recording of chimpanzee vocalisations) Table 1. I assume the classification of calls according to arousal and valence is based on the results of the Acoustic Analysis section? However, this is not clear. For example, it is possible that, at least subjectively and without acoustic data, Tantrum screams might be considered High instead of Medium arousal, and Whimpers might be considered Low arousal instead of Medium.
Please provide examples of the call types as Supplemental files. These should be available at the review stage.
Please insert the sample size for chimps used in the study and not just the number of vocalisations (n = 155). The chimp sample size used for sourcing the call examples also needs to be more prominent in both the Methods (e.g. page 7/32) and Results of the manuscript; currently it is quite difficult to locate.
Page 8/32. Experiment 1. "Participants listened to the 155 chimpanzee vocalisations". I have reservations about how informative this sort of setup could be (with 10 behavioural contexts, three arousal levels). How long (average and SD) did it take human participants to work their way through 155 chimp calls? Would you expect the same level of accuracy and focus in the human subjects at call 10 or 14, versus call 145 or 150? There is no analysis reported that checks whether the human subjects were better at classifying the first 30 calls versus the last 30, for example.
The formatting of references contains many inconsistencies, which are not in the journal style.

It is a condition of publication that authors make their supporting data, code and materials available - either as supplementary material or hosted in an external repository. Please rate, if applicable, the supporting data on the following criteria.
Is it adequate? Yes

The referee(s) do not recommend any further changes. Therefore, please proof-read your manuscript carefully and upload your final files for publication. Because the schedule for publication is very tight, it is a condition of publication that you submit the revised version of your manuscript within 7 days. If you do not think you will be able to meet this date please let me know immediately.
To upload your manuscript, log into http://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision.
You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, upload a new version through your Author Centre.
Before uploading your revised files please make sure that you have: 1) A text file of the manuscript (doc, txt, rtf or tex), including the references, tables (including captions) and figure captions. Please remove any tracked changes from the text before submission. PDF files are not an accepted format for the "Main Document".
2) A separate electronic file of each figure (tiff, EPS or print-quality PDF preferred). The format should be produced directly from original creation package, or original software format. Please note that PowerPoint files are not accepted.
3) Electronic supplementary material: this should be contained in a separate file from the main text and the file name should contain the author's name and journal name, e.g. authorname_procb_ESM_figures.pdf. All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI. Please see: https://royalsociety.org/journals/authors/author-guidelines/
4) Data-Sharing and data citation: It is a condition of publication that data supporting your paper are made available. Data should be made available either in the electronic supplementary material or through an appropriate repository. Details of how to access data should be included in your paper. Please see https://royalsociety.org/journals/ethics-policies/data-sharing-mining/ for more details.
If you wish to submit your data to Dryad (http://datadryad.org/) and have not already done so you can submit your data via this link http://datadryad.org/submit?journalID=RSPB&manu=RSPB-2020-1148 which will take you to your unique entry in the Dryad repository.
If you have already submitted your data to Dryad you can make any necessary revisions to your dataset by following the above link.
5) For more information on our Licence to Publish, Open Access, Cover images and Media summaries, please visit https://royalsociety.org/journals/authors/author-guidelines/.
Once again, thank you for submitting your manuscript to Proceedings B and I look forward to receiving your final version. If you have any questions at all, please do not hesitate to get in touch.
Sincerely,
Dr Robert Barton
mailto:proceedingsb@royalsociety.org
Associate Editor Board Member
Comments to Author:
The revised manuscript has been reviewed again by the original reviewer 2, who is now happy that their original concerns have been addressed. The paper will make an important contribution to the literature.

27-May-2020
Dear Ms Kamiloglu
I am pleased to inform you that your manuscript entitled "Human Listeners' Perception of Behavioural Context and Core Affect Dimensions in Chimpanzee Vocalisations" has been accepted for publication in Proceedings B.
You can expect to receive a proof of your article from our Production office in due course; please check your spam filter if you do not receive it. PLEASE NOTE: you will be given the exact page length of your paper which may be different from the estimation from Editorial and you may be asked to reduce your paper if it goes over the 10 page limit.
If you are likely to be away from e-mail contact please let us know. Due to rapid publication and an extremely tight schedule, if comments are not received, we may publish the paper as it stands.
If you have any queries regarding the production of your final article or the publication date please contact procb_proofs@royalsociety.org.
Your article has been estimated as being 10 pages long. Our Production Office will be able to confirm the exact length at proof stage.
Open Access: You are invited to opt for Open Access, making your article freely available to all as soon as it is ready for publication under a CC BY licence. Our article processing charge for Open Access is £1700. Corresponding authors from member institutions (http://royalsocietypublishing.org/site/librarians/allmembers.xhtml) receive a 25% discount to these charges. For more information please visit http://royalsocietypublishing.org/open-access.
Paper charges: An e-mail request for payment of any related charges will be sent out shortly. The preferred payment method is by credit card; however, other payment options are available.
Electronic supplementary material: All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
You are allowed to post any version of your manuscript on a personal website, repository or preprint server. However, the work remains under media embargo and you should not discuss it with the press until the date of publication. Please visit https://royalsociety.org/journals/ethics-policies/media-embargo for more information.
Thank you for your fine contribution. On behalf of the Editors of the Proceedings B, we look forward to your continued contributions to the Journal.

Dear Professor Robert Barton,
We thank you and the referees for the useful comments on our manuscript RSPB-2020-0465 entitled "Human Listeners' Perception of Behavioural Context and Core Affect Dimensions in Chimpanzee Vocalisations" and the opportunity to resubmit a revised manuscript for consideration. We greatly appreciate the thoughtful feedback, which has helped us to improve our manuscript. We hope that you will find the revised manuscript suitable for publication in Proceedings B.
The changes made in response to each point raised by the referees are detailed in the point-by-point response below.

Referees' comments:
Referee #1: This is an interesting study that looked at humans' ability to categorize level of arousal, valence, and behavioural context from nonhuman primate vocalizations. Participants were good at the arousal and valence categorizations, but poorer at context categorizations.
I have no concerns, but I do wonder whether the authors administered any questionnaires to participants to determine their level of experience with nonhuman animals' vocalizations.
Thank you for this comment. We agree that participants' prior experience with vocalisations of nonhuman animals, especially chimpanzees, could influence their recognition accuracy. To ensure that the listeners had minimal prior exposure to chimpanzee vocalisations, we recruited participants who had no experience working with or studying chimpanzees; the recruitment text included the phrase "no experience working with or studying chimpanzees". Additionally, at the end of Experiment 1, we asked participants (N = 300) to report their familiarity with each behavioural context (How familiar are you with the chimpanzees in the context of X (e.g., discovering a large food source) from zoo settings or media?), and a representative vocalisation from each context (How familiar are you with this chimpanzee vocalization from zoo settings or media?) on a 5-point scale ('1 = not at all', '2 = slightly', '3 = moderately', '4 = very', '5 = extremely'). In the revised manuscript, we now report this measure (p.8/31, line 168): "Finally, participants reported their familiarity with both each behavioural context (How familiar are you with the chimpanzees in the context of X (e.g., discovering a large food source) from zoo settings or media?), and a representative vocalisation from each context (How familiar are you with this chimpanzee vocalization from zoo settings or media?) on a 5-point scale (1 = not at all, 5 = extremely)." The results show that participants rated behavioural contexts as less than "Slightly familiar" on average, and representative vocalisations as less than "Moderately familiar". We report this in the revised manuscript (p.10/31, line 232): "On average, on the 1-5 Likert scale where 1 = not at all familiar, participants rated both behavioural contexts (M = 1.86, SD = 0.89) and representative vocalisations (M = 2.14, SD = 0.98) as unfamiliar." These results indicate that the listeners were not familiar with the chimpanzee vocalisations prior to our experiment.
Moreover, they were not familiar with the behavioural contexts, suggesting that the behavioural context categorisation task used in Experiment 1 was likely to have been challenging for the listeners, as we also suggest.
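The familiarity summary reported above (a mean and standard deviation of ratings on the 1-5 scale) can be illustrated with a short Python sketch. It is not the authors' code: the function name and the toy ratings are hypothetical, and the use of the sample standard deviation is an assumption, since the response does not say which estimator was used.

```python
from statistics import mean, stdev

# Scale labels as given in the response letter.
SCALE = {1: "not at all", 2: "slightly", 3: "moderately", 4: "very", 5: "extremely"}


def summarise_familiarity(ratings):
    """Return (M, SD) of 1-5 familiarity ratings, rounded to two decimals.

    Uses the sample standard deviation (n - 1 denominator); this is an
    assumption, not something stated in the source.
    """
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return round(mean(ratings), 2), round(stdev(ratings), 2)


# Invented ratings from eight hypothetical listeners for one behavioural context.
toy_ratings = [1, 2, 2, 1, 3, 2, 1, 2]
m, sd = summarise_familiarity(toy_ratings)
print(m, sd, "below 'slightly familiar'" if m < 2 else "at or above 'slightly familiar'")
```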

Referee #2
Overall, I found this manuscript to be a generally thorough and interesting examination of cross-species vocal perception. It contains very detailed acoustic analyses and also two main experiments to determine human perception of chimp calls, based on behavioural context in which the calls were uttered, as well as arousal and valence (two key components linked to emotions). The results mostly appear robust and convincing, except see my comments on Exp 1 below.
I suggest the following changes: Please insert line numbers in order to make the lives of reviewers and editors a little easier.
Thank you for pointing this out. Line numbers have been added in the revised manuscript.
Key words should be in alphabetical order.
Thank you, this has been corrected.
The order of results should be changed. A full understanding of the Experiments 1 and 2 is based on section 3 of the results -Acoustic Analysis. Reading Experiment 1 at first, I thought the classification of arousal in calls was based on subjective assessments of one of the authors (See: Text S1: Recording of chimpanzee vocalisations). Table 1. I assume the classification of calls according to arousal and valence is based on the results of the Acoustic Analysis section? However, this is not clear. For example, it is possible that, at least subjectively and without acoustic data, Tantrum screams might be considered High instead of Medium arousal, and Whimpers might be considered Low arousal instead of Medium.
Thank you for pointing us to the fact that the arousal and valence classification was unclear. The classifications of arousal (and valence) levels were determined by one of the authors, K.E.S., who is an expert on chimpanzee vocal communication. K.E.S. has over 15 years of experience studying chimpanzee communication, including long-term behavioural research on both wild and captive populations of chimpanzees. In previous research, it is common to use expert classifications of levels of arousal (e.g., Kelly et al., 2017) as well as valence (e.g., Belin et al., 2008; Braby, Shapira & Simmons, 2001; Maigrot, Hillmann, & Briefer, 2018; Scheumann, Hasting, Kotz, & Zimmermann, 2014) from animal vocalisations. To clarify our approach, we have added the following part to the revised manuscript (p.7/31, line 148): "The behavioural contexts were recorded by author K.E.S. in real time, alongside the sound recordings of vocalisations, and K.E.S., an expert in chimpanzee vocal communication, provided classifications of the arousal level (high, medium, low) and valence (positive, negative) of each call type (see Table 1)." K.E.S. used her knowledge, accrued from years of direct observation of chimpanzee vocal behaviour, of the vocaliser's typical behaviour, the response of other individuals, and the context to provide the classifications of arousal and valence of each call type (Table 1).
In Experiments 1 and 2, we test whether naive participants can infer arousal levels and valence from chimpanzee vocalisations. Their results are consistent with the expert classifications of arousal and valence. In section 3, we conduct a classification analysis based on acoustic features (p.18/31, line 399) in order to test whether the expert's and lay listeners'