For Review Only

Towards a new taxonomy of primate vocal production learning

Journal: Philosophical Transactions B

Manuscript ID RSTB-2019-0045.R1

Article Type: Review

Date Submitted by the Author: n/a

Complete List of Authors:
Fischer, Julia; German Primate Centre Leibniz Institute for Primate Research, Cognitive Ethology Laboratory; Georg-August-Universität Göttingen, Primate Cognition
Hammerschmidt, Kurt; German Primate Centre Leibniz Institute for Primate Research, Cognitive Ethology Laboratory

Issue Code (this should have already been entered but please contact the Editorial Office if it is not present): COMMUNICATION

Subject: Behaviour < BIOLOGY, Cognition < BIOLOGY

Keywords: alarm calls, Chlorocebus, speech evolution, Papio, vocal production, learning

http://mc.manuscriptcentral.com/issue-ptrsb

Submitted to Phil. Trans. R. Soc. B - Issue

Manuscript accepted on June 18, 2019

doi: 10.1098/rstb.2019.0045

Under media embargo until official publication. Copyright: Royal Society

Author-supplied statements

Relevant information will appear here if provided.

Ethics

Does your article include research that required ethical approval or permits?: This article does not present research with ethical considerations

Data

It is a condition of publication that data, code and materials supporting your paper are made publicly available. Does your paper present new data?: My paper has no data

Conflict of interest

I/We declare we have no competing interests

Authors’ contributions

This paper has multiple authors and our individual contributions were as below

Statement (if applicable): JF and KH conceived the manuscript, wrote the paper and gave final approval of the version to be published.

Towards a new taxonomy of primate vocal production learning

Julia Fischer1,2,3* & Kurt Hammerschmidt1,3

1 Cognitive Ethology Laboratory, German Primate Center, Kellnerweg 4, 37077 Göttingen, Germany

2 Department of Primate Cognition, Georg August University Göttingen, Göttingen, Germany

3 Leibniz ScienceCampus Primate Cognition, Göttingen

ORCID Julia Fischer: 0000-0002-5807-0074

Keywords: alarm calls, Chlorocebus, speech evolution, Papio, vocal production, learning

Main Text

Summary

The extent to which vocal learning can be found in nonhuman primates is key to reconstructing the evolution of speech. Regarding the adjustment of vocal output in relation to auditory experience (vocal production learning in the narrow sense), effects on the ontogenetic trajectory of vocal development as well as adjustment to group-specific call features have been found. Yet, a comparison of the vocalizations of different primate genera revealed striking similarities in the structure of calls and repertoires in different species of the same genus, indicating that the structure of nonhuman primate vocalizations is highly conserved. Thus, modifications in relation to experience only appear to be possible within relatively tight species-specific constraints. In contrast, comprehension learning may be extremely rapid and open-ended. In conjunction, these findings corroborate the idea of an ancestral independence of vocal production and auditory comprehension learning. To overcome the futile debate about whether or not vocal production learning can be found in nonhuman primates, we suggest focusing on the different mechanisms that may mediate the adjustment of vocal output in response to experience; these mechanisms may include auditory facilitation and learning from success.

Background

Conventionalized communication in the auditory-vocal domain, such as in human speech, crucially requires learning both in the production and comprehension of sounds. To shed light on the evolutionary origins of vocal learning, numerous studies have investigated vocal learning in nonhuman primates (hereafter ‘primates’). In the context of vocal learning, it is important to distinguish between vocal learning by the caller and vocal learning by the listener. Vocal learning by callers encompasses adjustment of the structure of the vocalizations (vocal production learning in the narrow sense) and adjustment of usage in relation to experience (vocal usage learning). Vocal learning by listeners comprises auditory comprehension learning, which refers to the ability to associate a sound with its source and/or what the sound ‘stands for’, that is, what it predicts (1–4).

Classic studies provided clear evidence that primates do not require auditory input to develop normal species-specific vocalizations: monkey infants raised under social and acoustic isolation (5,6) or cross-fostered between species (7) developed species-typical vocalizations. Yet, a number of recent studies reported evidence for vocal learning in chimpanzees and marmosets. In the following, we will review these studies, also taking some of the earlier studies into consideration, before we turn to our own work on baboons and Savanna monkeys. We will conclude with a discussion of the potential mechanisms that underlie the observed effects. We argue that some of the divergent views on the issue of vocal learning may be reconciled by distinguishing more clearly between the different ways in which social and auditory experience may contribute to variation in acoustic structure.

*Author for correspondence ([email protected]).

†Present address: Cognitive Ethology Laboratory, German Primate Center, Kellnerweg 4, 37077, Göttingen, Germany

Evidence from chimpanzees

A highly influential study on vocal learning in the food grunts of chimpanzees explored changes in acoustic structure over four years (8), after integration of 9 adult subjects from a group of chimpanzees previously housed in the Beekse Bergen Safari Park (BB group) in the Netherlands, with 9 adult subjects housed at the Edinburgh zoo (ED group). Observations were made during three years over a four-year period between 2010 and 2013. The data sets comprised recordings of food grunts during the feeding of apples, as well as proximity data from instantaneous scans, to track changes in the social network over time (8). Over time, the acoustic features of BB subjects became more similar to those of the ED subjects, as evidenced by a significant interaction between group and year of study (8), as well as significant differences in non-parametric follow-up tests at the beginning of the study, and a lack of such differences in the last year of the study (9). A detailed inspection of the individual acoustic features suggested, however, that there was substantial overlap in acoustic features to begin with, as six out of seven of the BB subjects also produced calls that fell within the range of the ED subjects from the beginning of the study (10). This overlap is relevant for gauging the potential mechanisms underpinning the observed change, as detailed further below in our framework for vocal production learning.

To assess the degree of plasticity in chimpanzee vocalizations, different studies explored the variation between wild social groups or populations. For instance, in their analysis of community-specific differences, Crockford and colleagues (11) investigated the acoustic structure of pant-hoots given by males from three neighboring communities. A discriminant function analysis revealed that the calls of the three communities could be well distinguished from each other, but when calls from males residing in a distant community were added, the classification results dropped considerably. These results were taken as indirect evidence that male chimpanzees converge in their pant-hoot characteristics, so as to be distinct from males in neighboring communities (11). Since the authors did not report the variation in specific parameters, it is difficult to judge the extent of the variation between communities in absolute terms.

Similarly, Mitani and colleagues observed significant differences in temporal characteristics and in the frequency characteristics of the climax element of chimpanzee ‘pant-hoots’ uttered by members of two different groups (12). Marshall, Wrangham, and Arcadi (13) studied the acoustic structure of pant-hoots of two groups of captive chimpanzees, and compared the recordings to calls from three male chimpanzees belonging to the Kanyawara community in Uganda. Irrespective of whether the pant-hoots were recorded in captivity or the wild, the spectral structure of the elements of this multisyllable call was largely similar, although the authors observed significant variation in temporal aspects of the two captive groups comprising 11 and 3 males.

If vocal learning and conventionalization played a major role in shaping vocal output, one would expect marked differences between communities that never interact. Mitani and colleagues examined variation in the long-distance calls (‘pant-hoots’) between two large populations of Eastern chimpanzees, with 10 males from Mahale and 12 males from Kibale (14). These populations live about 700 km apart. While the qualitative comparison of calls revealed considerable heterogeneity within populations in terms of the structure of pant-hoots, the quantitative comparison of specific acoustic features identified several significant acoustic differences. Males from Kibale produced longer introductory elements than Mahale males, and Kibale males also produced longer build-up elements at a slower rate than Mahale males. Phase duration also varied between individuals of the two populations (14). The authors discussed a number of factors that may explain the observed variation, including differences in the transmission characteristics of the different habitats, differences in the sound environment, and variation in body size. Although the drivers of the observed variation could not be clearly identified, the study further corroborated the assumption of a strong hard-wired component of vocal output, with little evidence for a role of learning or conventionalization.

In summary, despite some variation between groups and populations, it appears that the call structure of chimpanzees is largely innate. Even the distantly related Eastern and Western chimpanzees produce the same general call types (15,16). Within the constraints imposed by the neural mechanisms and genetic architecture underpinning vocal patterns, there appears to be some limited potential for plasticity at the individual and group level. One caveat is that many of the studies on chimpanzees are based on small sample sizes and thus it is difficult to judge the extent of this plasticity with greater confidence.

Evidence from marmosets and macaques

In recent years, another group of primate species has attracted increasing attention with regard to their potential vocal learning abilities, namely marmosets and tamarins. These small callitrichid monkeys live in small family groups with extended family care for young. Females typically give birth to twins, and both the father and older offspring carry the young (17). When separated, both young and adult subjects emit vocalizations that function to re-establish contact (18,19). Moreover, calling by one party often prompts counter-calling from the other party (20). This “antiphonal” calling provides researchers with an excellent opportunity to study the effects of vocal feedback on the development of offspring vocalizations.

During the first two months of life, marmoset vocalizations undergo distinct changes. Specifically, young first emit twitter calls and trills, as well as cries, phee-cries, and subharmonic phees. The latter three call types are proposed to transition into adult phee calls (21). Subjects almost exclusively produce phee calls from an age of about 2 months. Some of the changes seen in the first two months can be attributed to physiological growth and an increase in strength. At the same time, marmosets exhibit a developmental pattern of FoxP2 expression in their thalamocortical-basal ganglia circuit (22) that has been likened to that of songbirds and humans (23). Therefore, Takahashi et al. hypothesized that marmoset vocal development may also be prone to the influence of vocal feedback (21). Indeed, infants who received more contingent parental responses in the form of antiphonal calling began to use the adult form earlier than those who received fewer contingent responses (21).

These observations formed the basis for a subsequent experimental study in which marmoset infants received either high or low rates of contingent parental calls in response to their own spontaneous vocalizations (24). Three pairs of marmoset twins (six infants) were briefly separated from their family group. The first 5-10 min of a 40-min experimental session were used to estimate spontaneous calling rates of the infants (baseline), while the remaining time comprised closed-loop feedback with high (100%) or low (10%) rates of contingent vocal feedback. Because one infant of each twin pair was assigned to the high and the other to the low feedback condition, genetic factors and perinatal experience were controlled for. The study provided compelling evidence that infants in the high feedback condition developed the adult phee call variant earlier than infants in the low feedback condition (24). The authors also checked whether infants in the high feedback condition produced more calls on average, by examining the call rate in the baseline period, i.e. prior to acoustic feedback. Yet, they did not report whether calling rates in the experimental period were affected. If parental calls are not only given in response to infant calls, but also elicit antiphonal calls by the infants themselves, this may have resulted in substantially higher call rates and thus more practice for infants in the high feedback condition.

Another study on marmosets investigated whether parental auditory feedback is an obligate requirement for proper vocal development or whether it simply accelerates vocal development, by tracing the vocal behavior of two sets of offspring. Two infants were normally raised, while the other three were separated from the parents after the third postnatal month (25). All five monkeys eventually produced mature vocalizations. In contrast to normally raised monkeys, however, marmosets with limited parental feedback also produced infant-specific vocal behavior up to an age of 13 months. The social interactions between infants and parents affected the maturation of the vocal behavior, including changes in acoustic call structures during development. Specifically, subjects that experienced only limited parental input produced calls with a higher entropy than normally reared monkeys (26).

In summary, there is converging evidence that contingent auditory feedback plays a key role in shaping the developmental trajectory of marmosets. At the same time, irrespective of rearing history and amount and temporal contingency of parental feedback, subjects are ultimately able to produce the regular adult call type.

In order to explore which factors may influence modifications in call structure during early development, Hammerschmidt and colleagues studied the development of “coo” calls in young rhesus macaques (27). This call type is useful to study ontogenetic changes because it can be elicited reliably from birth on. Calls were recorded during brief periods of separation from more than 20 infant rhesus macaques (Macaca mulatta) from the first week of life until the age of five months. Infants were either raised with their mothers in normal breeding colonies or separated from their mothers at birth and housed in a nursery with age-matched peers. With increasing age, the “coo” calls underwent several changes: calls dropped in pitch and showed reduced variability in call amplitude and fundamental frequency. In addition, call duration increased slightly. Aside from high residual intra-individual variability throughout the recording period, no significant influence of sex or rearing conditions could be found. After controlling for weight as a reliable proxy measure for body growth and changes in vocal tract characteristics (28,29), all except one significant correlation with age could be excluded. The only acoustic parameter that could not be explained by weight gain was a parameter describing the portion of amplitude gaps (Fig. 1). To produce a constant amplitude throughout a call, it is necessary to produce the correct lung pressure in relation to vocal fold tension. Without a correct combination, no audible sound can be produced (30). Apparently, young rhesus macaques need some practice to find the correct combination of lung pressure and vocal fold tension to produce the coo modulation (Fig. 1). All other changes could be explained by growth. The fact that there were no differences between nursery-reared and mother-reared animals confirmed the view that young rhesus macaques do not require an adult model to produce species-specific “coo” calls.
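The logic of controlling for weight can be illustrated with a first-order partial correlation. The sketch below is only an illustration and not the analysis used in (27): the function names and the toy values are hypothetical, and the data are constructed so that the acoustic parameter depends on age only through weight.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical, mean-centred toy values (not data from the study).
weight = [-3, -1, 1, 3]   # proxy for body growth
age = [-2, -2, 0, 4]      # increases with weight
pitch = [2, 4, -4, -2]    # acoustic parameter, decreases with weight

raw = pearson(age, pitch)                      # clearly negative
controlled = partial_corr(age, pitch, weight)  # close to zero for these toy data
print(raw, controlled)
```

A parameter such as the portion of amplitude gaps would, in this picture, be one whose correlation with age remains after weight has been partialled out.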

Comparison of baboon repertoires

Baboons constitute an interesting case to assess the influence of the degree of despotism and other social system characteristics on the vocal repertoire structure, as members of the genus vary greatly in the degree of male tolerance as well as their social organisation (31). The different species are distributed in a wide range of habitats across sub-Saharan Africa and the Arabian peninsula (Fig. 2). In our analysis, we focussed on three species: the highly despotic female philopatric chacma baboon, the less despotic female philopatric olive baboon, and the tolerant male philopatric Guinea baboon (32).

We conducted a detailed acoustic analysis of two types of calls (‘grunts’ and ‘loud calls’) from the three species. Grunts are produced by males and females of all age classes immediately before or during affiliative interactions, but also before or during group movement (33,34). Loud calls (Fig. 2) are also given by male and female animals of all age classes when they lose contact with the group or specific individuals (35). In addition, loud calls are used as alarm calls upon sighting predators such as lions or leopards. Furthermore, chacma baboons use loud calls as part of their “wahoo displays” in male-male competition (36).

The overall structure of these two call types was comparable in all three species. Yet, both grunts and loud calls varied between species in terms of several acoustic features, including call duration, mean fundamental frequency, and mean peak frequency (32). Based on a discriminant function analysis, it was possible to assign the calls to the respective species, but there was considerable misclassification. For female grunts, 65.5% of calls could be correctly classified; for males, the correct classification was 70.4% (32). Loud calls showed greater variation between species, as predicted by the hypothesis that loud calls serve to distinguish groups from one another, and correspondingly yielded better classification results. For females, 86.3% of calls were correctly assigned to the respective species; for males, the correct classification was 88.9%. Thus, although the general structure of the calls is comparable, detectable differences between the species exist. These may be partly due to size differences between the species. Interestingly, we found no correlation between the degree of acoustic differentiation and phylogenetic distance. In contrast, the acoustic variation between different species of leaf monkeys and gibbons revealed a high concordance with their phylogenetic distance (37,38). The finding that the acoustic differentiation in baboons is less pronounced than in gibbons and leaf monkeys supports the idea that there were no strong selection pressures favoring species recognition and territorial defense in this genus (39).

In addition, we compared the overall structure of the vocal repertoires of Guinea and chacma baboons, as we had the most comprehensive database for these two species. We found an overall similarity between the most common calls of the two species. We therefore assume that the neural pattern generators giving rise to the different call types are highly conserved between the different baboon species. In summary, selective pressures such as sexual selection or inter-group competition may add to variation of calls, but only within certain constraints existing within the genus (32). A further striking finding was that species that differ so prominently in their social system characteristics as olive and chacma baboons on the one hand, and Guinea baboons on the other, do not differ in terms of their vocal diversity (32).

A proper comparison of entire repertoires is difficult to achieve, as it takes an extreme sampling effort to collect a sufficient number of calls across all call types and from a sufficient number of individuals (32). As we pointed out before (32), one would need to collect data from several populations per species for a solid assessment. Yet, from the presently available data, it seems unlikely that a very different picture would emerge. All qualitative and quantitative comparisons available to date clearly suggest that the general call types do not vary fundamentally between different baboon species. Given that the six species differ considerably in terms of their social organisation and the degree of sexual selection, this finding is quite remarkable. In contrast, behavioural dispositions such as aggressiveness vary strongly, most likely as a result of variation in inter- and intra-group competition. As a side note, the frequently held assumption that vocal complexity varies with social complexity (40) does not seem to hold in the genus Papio.

Alarm calls in the genus Chlorocebus

Alarm calls are the focus of many of the most influential studies of primate vocal communication, specifically with regard to the semantic content (41–44) and syntactic properties (45–47) of primate vocalizations. The single most influential example in this context is the alarm call system of vervet monkeys, Chlorocebus pygerythrus. In brief, vervet monkeys have evolved different adaptive escape strategies in response to their main predator categories. They climb into trees upon the appearance of a leopard, scan the sky or run into cover when spotting an eagle, and stand bipedally after detecting a snake. They also produce different types of alarm calls in response to each of these main predator categories (48). Playback experiments revealed that the calls alone are sufficient to elicit adaptive escape responses (49). Yet, the classic study was conducted on a single population and until recently, remarkably little was known about the variation between populations or species, although such knowledge provides important insights into the potential flexibility of the alarm call system. More specifically, such comparative research helps to distinguish between local and potentially conventionalized learnt communication systems on the one hand, and rather hard-wired, evolved solutions in response to predation pressure on the other. We therefore initiated a study of the alarm call system of a congener of vervet monkeys, namely the West African green monkey, Chlorocebus sabaeus, in the Niokolo Koba National Park in Senegal, with complementary research on a subspecies of C. pygerythrus in South Africa (50).

To elicit alarm calls, we presented snake and leopard models, as well as a model of an eagle perched on a tree to green monkeys (51). The monkeys only responded with alarm calls, vigilance, and escape responses to the snake and leopard models, while they largely ignored the eagle model. Putty-nosed monkeys, Cercopithecus nictitans martini, in contrast, produced strong alarm responses, including alarm calls, in response to a similar looking eagle model (52). Because we had never observed the monkeys to give alarm calls to any of the birds of prey in the area since the beginning of our studies in 2009, we considered the possibility that the monkeys in this area are not preyed upon by raptors. This in turn provided us with the opportunity to present to them a novel aerial threat, a flying drone, to assess their vocal responses (53).

When we flew the drone over the monkeys, the animals produced distinct calls, and a number of subjects ran into cover (53,54). We conducted an acoustic analysis of these calls and compared them to the calls given by members of the same study population in response to leopard and snake models recorded in a previous study (51). Using the classification procedure of a discriminant function analysis (DFA), we found that for female subjects, 80.0% of calls could be correctly assigned to the context in which they were given. This was significantly better than chance, as assessed by a DFA based on permuted data (55). Drone alarms were clearly distinct from the other two categories, with 95.2% correct classification. For male subjects, the overall correct classification was slightly lower, with 71.2% of calls correctly assigned to the context in which they were given, but yet again, this was significantly better than chance (53). Similar to the findings for females, male drone alarms could be most readily distinguished from the other two alarm call categories: 92.6% of drone calls by male green monkeys were correctly classified. In conclusion, the calls given in response to a novel flying object by green monkeys differed from those given to snake and leopard models.
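The idea of testing classification success against permuted data can be sketched as follows. This is a simplified illustration under stated assumptions, not the permuted DFA of (55): a leave-one-out nearest-centroid classifier stands in for the discriminant function analysis, the nested structure of calls within individuals is ignored, and all names and feature values are hypothetical.

```python
import random

def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(features, labels)):
        sums, counts = {}, {}
        # Build one centroid per context without the held-out call.
        for j, (vec, lab) in enumerate(zip(features, labels)):
            if j == i:
                continue
            acc = sums.setdefault(lab, [0.0] * len(vec))
            for k, value in enumerate(vec):
                acc[k] += value
            counts[lab] = counts.get(lab, 0) + 1
        # Assign the held-out call to the closest centroid (squared distance).
        predicted = min(
            sums,
            key=lambda c: sum((s / counts[c] - v) ** 2
                              for s, v in zip(sums[c], held_out)),
        )
        correct += predicted == true_label
    return correct / len(labels)

def permutation_test(features, labels, n_perm=200, seed=0):
    """Observed accuracy and the share of label shufflings that match or beat it."""
    rng = random.Random(seed)
    observed = loo_accuracy(features, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loo_accuracy(features, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical calls described by two made-up acoustic features.
calls = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
         (10.0, 0.0), (10.1, 0.0), (10.0, 0.1), (10.1, 0.1)]
contexts = ["drone"] * 4 + ["leopard"] * 4 + ["snake"] * 4

accuracy, p_value = permutation_test(calls, contexts)
print(accuracy, p_value)
```

Shuffling the context labels keeps the number of calls per context fixed, so the permuted accuracies show what classification success would look like if call structure carried no information about context.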

In a second step, we investigated how the green monkey alarm calls compared to those of their East African congeners. We were particularly interested in how green monkey calls given in response to the drone compared to vervet monkey eagle alarms. For this comparison, we used the original recordings of alarm calls collected by Tom Struhsaker, and Dorothy Cheney and Robert Seyfarth that were part of a previous analysis of the vervet monkey alarm call repertoire (56). The visual inspection of the spectrograms (Fig. 3) as well as a statistical analysis of key acoustic parameters for females and males (Table 1) revealed significant variation in relation to context and only marginal differences in relation to species. For females, both species revealed a relatively similar pattern: alarm calls given to leopards had the longest element duration, calls given in response to snakes had the highest frequency range, and calls given in response to aerial threats had the lowest median frequency. For males, the picture was more differentiated: leopard alarms were much longer than snake alarms in green monkeys, but only slightly longer in vervet monkeys. In green monkeys, snake alarms had the highest frequency range, while in vervets, the frequency range did not differ between leopard and snake alarms. Aerial alarms had the lowest frequency characteristics in both species (Fig. 4). These differences between species for male calls point to a differential role of sexual selection; this conjecture requires further investigation.

With 80% correct classification of calls to the three contexts in the discriminant function analysis, however, female green monkey alarm calls were less distinct from each other than the alarm calls of female vervet monkeys. In vervets, the correct classification of calls given to eagles, leopards, and snakes was 93.3%. In male green monkeys, the alarm calls were also less distinct, with 71.2% correct classification, than those of male vervet monkeys, with 81.3% correct classification. The aerial alarms had the highest similarity values between the two species (Fig. 5).

These findings extend and confirm an earlier study of the acoustic variation of male Chlorocebus barks (50). Male barks of two subspecies of C. pygerythrus, whose last common ancestor lived about 1.5 million years ago (57, but see 58), revealed only marginal acoustic differences, and males of C. sabaeus, whose last common ancestor with C. pygerythrus lived around 2.1 million years ago, also produce barks with a highly similar call structure (50). In summary, despite considerable geographic and phylogenetic distance, the overall structure of these calls and the structure of the alarm call repertoire of these two species appear highly conserved. Studies of further members of the genus would be needed to assess whether this conclusion holds for the entire genus.

Rapid comprehension learning in green monkeys

While our survey of the variation in vocal production revealed only moderate flexibility, there is ample evidence that recipients are able to learn to attribute meaning to a variety of sounds (see 59 for a review). Our study on the green monkeys’ responses to the drone provided us with the opportunity to test just how rapidly the animals attached meaning to the sound of the drone. When we prepared to fly the drone over the monkeys for the first time, we noticed that the animals appeared to respond to the sound of the drone even before it became visible. This suggested that they are highly sensitive to novel sounds in the environment; further experimental studies are needed to test this notion.

More important in this context, after we had presented the drone, we conducted a playback experiment in which we played back the sound of the drone to the animals (53). Following the presentation of the drone sound, the monkeys looked significantly longer in the direction of the loudspeaker and were more vigilant than in the control condition, in which we played different familiar broad-band noises, such as the sound of the nearby generator. This was true even after the drone had been presented only a single time. Strikingly, the animals were also more likely to look up and scan the sky after the presentation of the drone sound than after control sounds. In three cases, the subject immediately ran into cover after the presentation of the drone sound. This never happened in response to control sounds (53). These findings are relevant for two reasons. First, they demonstrate that comprehension learning may be extremely rapid. Second, they reinforce the question of why operant conditioning in the auditory domain is so (excruciatingly) difficult to achieve in the lab. Numerous studies have attempted to train monkeys in auditory discrimination tasks, and typically needed hundreds of trials (60). Why nonhuman primates struggle in such lab settings, while they may immediately comprehend what a sound predicts in more naturalistic settings, remains a question for further investigation.

Overcoming the dichotomous view of primate vocal production learning

The available evidence for plasticity in nonhuman primate vocal production learning suggests two fundamental principles. First, the overall pattern seems to be relatively strongly genetically fixed, in some cases not only at the level of the species but also at the level of the genus. Second, within the species-specific reaction norms, a certain degree of plasticity appears to be possible, resulting in minor modifications of vocal output in relation to experience. The idea of minor modification as a result of social experience within species-typical patterns is not new (61–63); yet we now know much more about the extent of this plasticity during development. At the same time, we also know more about the tight link between phylogenetic relationships and acoustic variation between species. In light of the available evidence, we feel that it is time to move beyond the debate about whether or not nonhuman primates have vocal production learning. To fundamentally advance our understanding of primate vocal production learning, we must turn to the mechanisms that support vocal adjustment in relation to (auditory) experience.

Along similar lines, Arriaga and Jarvis (64) proposed the “continuum hypothesis” for vocal learning in a broader range of taxa (including mice and birds). Arriaga and Jarvis argued that progress in the field of vocal production learning may be hampered by the classification of species into ‘haves’ and ‘have-nots’. They distinguished between vocalizations based on a template and vocalizations generated de novo. Vocalizations based on a template could range from calls that are strictly determined by an innate central pattern generator, to those where the spectral-temporal structure is “guided by an externally acquired target (imitation-based modification of a template)” (64, p. 112). Vocalizations generated de novo encompass improvisation, i.e., the versatile use of elements that need not have been acquired by experience, as well as full mimicry, i.e., the “modification of vocal output guided by an externally acquired target” (64, p. 113). Song learning in songbirds typically falls under this latter category, where young birds acquire the template of the adult song by exposure to their father’s or other adults’ song during a sensitive phase, and then proceed through a phase of ‘babbling’ or practice, until the song crystallizes (65).

While we generally agree with Arriaga and Jarvis (64) that it is time to overcome the dichotomous view of vocal production learning, we caution that the assumption of a continuum may be premature, or even slightly misleading, as adjustment of vocal output in relation to (auditory) experience may be mediated by very different mechanisms. A “many roads” metaphor might therefore be more appropriate, at least for the time being. We have identified the following (non-exhaustive) list of mechanisms that underpin vocal production in primates (see 4 for the neurobiology underpinning vocal production in primates).

i. Null model: Innate call types with potential for within and between-call type variability in relation to arousal and valence (and perhaps potency) of the caller. No auditory input necessary. This model is sufficient to account for context- and urgency-related variation in nonhuman vocal production.

ii. General auditory facilitation: Auditory input increases the likelihood of vocal production (irrespective of structure); if increased auditory input consistently leads to increased vocal output, then this may affect maturation processes and developmental trajectories. This may be a simple explanation for some of the variation observed in marmosets deprived of parental input if, and only if, these deprived subjects indeed call less than their normally reared counterparts.

iii. Specific auditory facilitation: Auditory input of specific call types facilitates the production of the corresponding call type by the listener. This mechanism presupposes that several variants of the same call type may be represented in the caller’s brain, of which certain variants are preferably activated in response to specific auditory input. This mechanism has been suggested to underlie ‘action based learning’ (63) and may account for the emergence of group specific calls observed in different primate species.

iv. Learning from success: Subjects may learn that the production of specific call variants is more likely to elicit the desired response than the production of other variants. Given that nonhuman primates appear to have a certain degree of control over call usage, this model may provide an alternative explanation for the occurrence of group-specific calls.

v. Vocal copying: the vocal production is shaped according to an auditory template stored as a short-term sensory representation.

vi. Template learning: an auditory template is acquired during a sensitive phase and vocal output modified to match that template.

vii. Innovation/Improvisation: volitional generation of (literally) unheard voiced sounds.

We suggest that (i)-(iv) may be found in nonhuman primates, with (i) explaining the greatest share of variation in nonhuman primate vocalizations. The evidence for (v) in nonhuman primates is weak (or difficult to judge), and for (vi) and (vii) largely absent, at least for voiced sounds. Note that after intensive training some individuals, such as the chimpanzee Viki, were able to produce an extremely small number of “words”, including “cup”, “up”, and “mama” (reviewed in 66), but there seems to be no inclination on the animals’ side to acquire speech. Instead, there is wide-ranging consensus that nonhuman primates are not obligate learners (4). A crucial question for further research is to identify the mechanisms that give rise to acoustic changes in relation to social experience, as described for chimpanzees by Watson and colleagues (8). In principle, the observed changes could be due to specific auditory facilitation or learning from success and need not constitute instances of vocal learning.
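For readers who wish to code studies or observations against this list, the seven mechanisms and the evidence assessment just given can be written down as a small lookup structure. The sketch below is one possible encoding, not part of the framework itself: the class names, field names and evidence labels are illustrative assumptions, and the evidence levels simply restate the preceding paragraph.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Evidence(Enum):
    """Evidence in nonhuman primates, as summarized in the text (labels are assumptions)."""
    MAY_OCCUR = "may be found in nonhuman primates"
    WEAK = "weak or difficult to judge"
    LARGELY_ABSENT = "largely absent, at least for voiced sounds"


@dataclass(frozen=True)
class Mechanism:
    numeral: str        # roman numeral used in the list above
    name: str           # short label of the mechanism
    evidence: Evidence  # assessment for nonhuman primates


MECHANISMS = (
    Mechanism("i", "Null model", Evidence.MAY_OCCUR),
    Mechanism("ii", "General auditory facilitation", Evidence.MAY_OCCUR),
    Mechanism("iii", "Specific auditory facilitation", Evidence.MAY_OCCUR),
    Mechanism("iv", "Learning from success", Evidence.MAY_OCCUR),
    Mechanism("v", "Vocal copying", Evidence.WEAK),
    Mechanism("vi", "Template learning", Evidence.LARGELY_ABSENT),
    Mechanism("vii", "Innovation/Improvisation", Evidence.LARGELY_ABSENT),
)


def by_evidence(level: Evidence) -> List[str]:
    """Return the numerals of all mechanisms with the given evidence level."""
    return [m.numeral for m in MECHANISMS if m.evidence is level]


if __name__ == "__main__":
    for level in Evidence:
        print(f"{level.value}: {', '.join(by_evidence(level))}")
```

A flat list of separately named mechanisms, rather than a single ordered scale, mirrors the “many roads” view: nothing in the structure implies that (v) is simply “more” of (iv).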

Although the proposed framework may be far from perfect, it stresses the importance of sensory-motor integration at a very basal level. Further, the framework lends itself to empirical tests. Regarding the developmental trajectory of vocalizations, it is critical to control for practice, to assess the extent to which auditory input mostly functions as a trigger of infant vocal output (ii). To test (iii), one would need to expose subjects to specific call variants that fall within the species-typical frame, to test whether this alters the call characteristics of vocal output, compared to a baseline condition. Ethical considerations discourage the ideal experiment in which infants would be temporarily muted. Likewise, very detailed studies of the use of specific variants and the effects they may have on conspecifics may provide a test of (iv).

We assume that (i) through (iv) are ancestral and also present in humans. Humans also show minor adjustments of their speech in the form of “vocal accommodation” (67). Yet, the development of speech critically relies on a combination of (v) to (vii), with speech acquisition constituting a prime example of obligate auditory learning (68). The list of mechanisms compiled above is compatible with the dual network (or dual pathway (69)) model of primate vocal production, which suggests an increasing integration of the more ancestral vocal motor network that produces species-specific calls with a largely fixed structure, with more derived cortical networks (70,71). To summarize, we strongly suggest that future research should aim to clarify the mechanisms supporting variation in vocal output in relation to experience, while the question of “whether or not” nonhuman primates have vocal learning should be set aside. The questions are to what extent they show vocal production learning, how this is mediated, and how much vocal production is integrated with the processing of information from social and non-social sources other than auditory input.

Acknowledgments

We thank Ludwig Ehrenreich for help with the figures. This research is supported by the Leibniz ScienceCampus Primate Cognition by the Leibniz Association.

References

1. Janik VM, Slater PJB. 2000. The different roles of social learning in vocal communication. Anim. Behav. 60, 1–11. (doi:10.1006/anbe.2000.1410)

2. Seyfarth RM, Cheney DL. 1997. Some general features of vocal development in nonhuman primates. In: Snowdon CT, Hausberger M, editors. Social influences on vocal development. Cambridge University Press; p. 249–73.

3. Petkov CI, Jarvis ED. 2012. Birds, primates, and spoken language origins: Behavioral phenotypes and neurobiological substrates. Front. Evol. Neurosci. 4, 12. (doi:10.3389/fnevo.2012.00012)

4. Fischer J, Hage SR. 2019. Primate vocalization as a model for human speech: Scopes and limits. In: Hagoort P, editor. Human Language: From Genes and Brains to Behavior. Cambridge MA: MIT Press; p. 639–56.

5. Hammerschmidt K, Freudenstein T, Jürgens U. 2001. Vocal development in squirrel monkeys. Behaviour. 138, 1179–204. (doi:10.1163/156853901753287190)

6. Winter PP, Handley D, Schott D. 1973. Ontogeny of squirrel monkey calls under normal conditions and under acoustic isolation. Behaviour. 47, 230–9.

7. Owren MJ, Dieter JA, Seyfarth RM, Cheney DL. 1993. Vocalizations of rhesus (Macaca mulatta) and Japanese (M. fuscata) macaques cross-fostered between species show evidence of only limited modification. Dev. Psychobiol. 26, 389–406. (doi:10.1002/dev.420260703)

8. Watson SK, Townsend SW, Schel AM, Wilke C, Wallace EK, Cheng L, et al. 2015. Vocal learning in the functionally referential food grunts of chimpanzees. Curr. Biol. 25, 495–9. (doi:10.1016/j.cub.2014.12.032)

9. Watson SK, Townsend SW, Schel AM, Wilke C, Wallace EK, Cheng L, et al. 2015. Reply to Fischer et al. Curr. Biol. 25, R1030–1. (doi:10.1016/j.cub.2015.09.024)

10. Fischer J, Wheeler BC, Higham JP. 2015. Is there any evidence for vocal learning in chimpanzee food calls? Curr. Biol. 25, R1028–9. (doi:10.1016/j.cub.2015.09.010)

11. Crockford C, Boesch C. 2003. Context-specific calls in wild chimpanzees, Pan troglodytes verus: Analysis of barks. Anim. Behav. 66, 115–25. (doi:10.1006/anbe.2003.2166)

12. Mitani JC, Hasegawa T, Gros-Louis J, Marler P, Byrne RW. 1992. Dialects in wild chimpanzees? Am. J. Primatol. 27, 233–43. (doi:10.1002/ajp.1350270402)

13. Marshall AJ, Wrangham RW, Arcadi AC. 1999. Does learning affect the structure of vocalizations in chimpanzees? Anim. Behav. 58, 825–30. (doi:10.1006/anbe.1999.1219)

14. Mitani JC, Hunley KL, Murdoch ME. 1999. Geographic variation in the calls of wild chimpanzees: A reassessment. Am. J. Primatol. 47, 133–51.

15. Marler P. 1969. Vocalizations of wild chimpanzees - an introduction. In: Proceedings 2nd International Congress Primatology, Atlanta 1968. Basel: Karger; p. 94–100.

16. Mitani JC, Macedonia JM. 1996. Selection for acoustic individuality within the vocal repertoire of wild chimpanzees. Int. J. Primatol. 17, 569–83.

17. Zahed SR, Prudom SL, Snowdon CT, Ziegler TE. 2007. Male parenting and response to infant stimuli in the common marmoset (Callithrix jacchus). Am. J. Primatol. 70, 84–92. (doi:10.1002/ajp.20460)

18. Martins Bezerra B, Souto A. 2008. Structure and usage of the vocal repertoire of Callithrix jacchus. Int. J. Primatol. 29, 671–701. (doi:10.1007/s10764-008-9250-0)

19. Schrader L, Todt D. 1993. Contact call parameters covary with social context in common marmoset, Callithrix j. jacchus. Anim. Behav. 46, 1026–1028. (doi:10.1006/anbe.1993.1288)

20. Miller CT, Beck K, Meade B, Wang X. 2009. Antiphonal call timing in marmosets is behaviorally significant: Interactive playback experiments. J. Comp. Physiol. A. 195, 783–9. (doi:10.1007/s00359-009-0456-1)

21. Takahashi DY, Fenley AR, Teramoto Y, Narayanan DZ, Borjon JI, Holmes P, et al. 2015. The developmental dynamics of marmoset monkey vocal production. Science. 349, 734–8. (doi:10.1126/science.aab1058)

22. Kato M, Okanoya K, Koike T, Sasaki E, Okano H, Watanabe S, et al. 2014. Human speech- and reading-related genes display partially overlapping expression patterns in the marmoset brain. Brain Lang. 133, 26–38. (doi:10.1016/j.bandl.2014.03.007)

23. Teramitsu I, Kudo LC, London SE, Geschwind DH, White SA. 2004. Parallel FoxP1 and FoxP2 expression in songbird and human brain predicts functional interaction. J. Neurosci. 24, 3152–63. (doi:10.1523/jneurosci.5589-03.2004)

24. Takahashi DY, Liao DA, Ghazanfar AA. 2017. Vocal learning via social reinforcement by infant marmoset monkeys. Curr. Biol. 27, 1844-1852.e6. (doi:10.1016/j.cub.2017.05.004)

25. Gultekin YB, Hage SR. 2017. Limiting parental feedback disrupts vocal development in marmoset monkeys. Nat. Commun. 8, 14046. (doi:10.1038/ncomms14046)

26. Gultekin YB, Hage SR. 2018. Limiting parental interaction during vocal development affects acoustic call structure in marmoset monkeys. Sci. Adv. 4, 11. (doi:10.1126/sciadv.aar4012)

27. Hammerschmidt K, Newman JD, Champoux M, Suomi SJ. 2000. Changes in rhesus macaque “coo” vocalizations during early development. Ethology. 106, 873–86. (doi:10.1046/j.1439-0310.2000.00611.x)

28. Fitch WT. 1997. Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J. Acoust. Soc. Am. 102, 1213–22.

29. Pfefferle D, Fischer J. 2006. Sounds and size: Identification of acoustic variables that reflect body size in hamadryas baboons, Papio hamadryas. Anim. Behav. 72, 43–51. (doi:10.1016/j.anbehav.2005.08.021)

30. Dusterhoft F, Hausler U, Jürgens U. 2000. On the search for the vocal pattern generator. A single-unit recording study. Neuroreport. 11, 2031–4.

31. Fischer J, Kopp GH, Dal Pesco F, Goffe AS, Hammerschmidt K, Kalbitzer U, et al. 2017. Charting the neglected West: The social system of Guinea baboons. Am. J. Phys. Anthropol. 162, 15–31. (doi:10.1002/ajpa.23144)

32. Hammerschmidt K, Fischer J. 2019. Baboon vocal repertoires and the evolution of primate vocal diversity. J. Hum. Evol. 126, 1–13. (doi:10.1016/j.jhevol.2018.10.010)

33. Rendall D, Seyfarth RM, Cheney DL, Owren MJ. 1999. The meaning and function of grunt variants in baboons. Anim. Behav. 57, 583–92. (doi:10.1006/anbe.1998.1031)

34. Meise K, Keller C, Cowlishaw G, Fischer J. 2011. Sources of acoustic variation: Implications for production specificity and call categorization in chacma baboon (Papio ursinus) grunts. J. Acoust. Soc. Am. 129, 1631–41. (doi:10.1121/1.3531944)

35. Ey E, Hammerschmidt K, Seyfarth RM, Fischer J. 2007. Age- and sex-related variations in clear calls of Papio ursinus. Int. J. Primatol. 28, 947–60. (doi:10.1007/s10764-007-9139-3)

36. Fischer J, Kitchen DM, Seyfarth RM, Cheney DL. 2004. Baboon loud calls advertise male quality: Acoustic features and their relation to rank, age, and exhaustion. Behav. Ecol. Sociobiol. 56, 140–8. (doi:10.1007/s00265-003-0739-4)

37. Meyer D, Hodges JK, Rinaldi D, Wijaya A, Roos C, Hammerschmidt K. 2012. Acoustic structure of male loud-calls support molecular phylogeny of Sumatran and Javanese leaf monkeys (genus Presbytis). BMC Evol. Biol. 12, 16. (doi:10.1186/1471-2148-12-16)

38. Van Ngoc T, Hallam C, Roos C, Hammerschmidt K. 2011. Concordance between vocal and genetic diversity in crested gibbons. BMC Evol. Biol. 11, 36. (doi:10.1186/1471-2148-11-36)

39. Wilkins MR, Seddon N, Safran RJ. 2013. Evolutionary divergence in acoustic signals: causes and consequences. Trends Ecol. Evol. 28, 156–66. (doi:10.1016/j.tree.2012.10.002)

40. Freeberg TM. 2006. Social complexity can drive vocal complexity: Group size influences vocal information in Carolina chickadees. Psychol. Sci. 17, 557–61. (doi:10.1111/j.1467-9280.2006.01743.x)

41. Wheeler BC, Fischer J. 2012. Functionally referential signals: A promising paradigm whose time has passed. Evol. Anthropol. 21, 195–205. (doi:10.1002/evan.21319)

42. Zuberbühler K. 2003. Referential signaling in non-human primates - cognitive precursors and limitations for the evolution of language. Adv. Study Behav. 33, 265–307.

43. Townsend SW, Manser MB. 2013. Functionally referential communication in mammals: The past, present and the future. Ethology. 119, 1–11. (doi:10.1111/eth.12015)

44. Marler P, Evans CS, Hauser MD. 1992. Animal signals: Motivational, referential, or both? In: Papoušek H, Jürgens U, Papoušek M, editors. Nonverbal vocal communication. Cambridge: Cambridge University Press; p. 66–86.

45. Arnold K, Zuberbühler K. 2008. Meaningful call combinations in a non-human primate. Curr. Biol. 18, 202–3. (doi:10.1016/j.cub.2008.01.040)

46. Schamberg I, Cheney DL, Clay Z, Hohmann G, Seyfarth RM. 2016. Call combinations, vocal exchanges and interparty movement in wild bonobos. Anim. Behav. 122, 109–16. (doi:10.1016/j.anbehav.2016.10.003)

47. Coye C, Ouattara K, Arlet ME, Lemasson A, Zuberbühler K. 2018. Flexible use of simple and combined calls in female Campbell’s monkeys. Anim. Behav. 141, 171–81. (doi:10.1016/j.anbehav.2018.05.014)

48. Struhsaker TT. 1967. Auditory communication among vervet monkeys (Cercopithecus aethiops). In: Altmann S, editor. Social Communication Among Primates. Chicago: University of Chicago Press; p. 281–324.

49. Seyfarth RM, Cheney DL, Marler P. 1980. Vervet monkey alarm calls: Semantic communication in a free-ranging primate. Anim. Behav. 28, 1070–94. (doi:10.1016/S0003-3472(80)80097-2)

50. Price T, Ndiaye O, Hammerschmidt K, Fischer J. 2014. Limited geographic variation in the acoustic structure of and responses to adult male alarm barks of African green monkeys. Behav. Ecol. Sociobiol. 68, 815–25. (doi:10.1007/s00265-014-1694-y)

51. Price T, Fischer J. 2014. Meaning attribution in the West African green monkey: Influence of call type and context. Anim. Cogn. 17, 277–86. (doi:10.1007/s10071-013-0660-9)

52. Arnold K, Pohlner Y, Zuberbühler K. 2008. A forest monkey’s alarm call series to predator models. Behav. Ecol. Sociobiol. 62, 549–59. (doi:10.1007/s00265-007-0479-y)

53. Wegdell FL, Hammerschmidt K, Fischer J. 2019. Conserved alarm calls but rapid auditory learning in monkey responses to novel flying objects. Nat. Ecol. Evol. (doi:10.1038/s41559-019-0903-5)

54. Fischer J, Wegdell FL, Hammerschmidt K. 2019. Conserved alarm calls but rapid auditory learning in monkey responses to novel flying objects - Data Set. OSF.org; (doi:10.17605/OSF.IO/F4UTP)

55. Mundry R, Sommer C. 2007. Discriminant function analysis with nonindependent data: consequences and an alternative. Anim. Behav. 74, 965–76. (doi:10.1016/j.anbehav.2006.12.028)

56. Price T, Wadewitz P, Cheney DL, Seyfarth RM, Hammerschmidt K, Fischer J. 2015. Vervets revisited: A quantitative analysis of alarm call structure and context specificity. Sci. Rep. 5, 13220. (doi:10.1038/srep13220)

57. Perelman P, Johnson WE, Roos C, Seuánez HN, Horvath JE, Moreira MAM, et al. 2011. A molecular phylogeny of living primates. PLoS Genet. 7, e1001342.

58. Warren WC, Jasinska AJ, García-Pérez R, Svardal H, Tomlinson C, Rocchi M, et al. 2015. The genome of the vervet (Chlorocebus aethiops sabaeus). Genome Res. 25, 1921–33. (doi:10.1101/gr.192922.115)

59. Fischer J, Price T. 2017. Meaning, intention, and inference in primate vocal communication. Neurosci. Biobehav. Rev. 82, 22–31. (doi:10.1016/j.neubiorev.2016.10.014)

60. Scott BH, Mishkin M. 2016. Auditory short-term memory in the primate auditory cortex. Brain Res. 1640, 264–77. (doi:10.1016/j.brainres.2015.10.048)

61. Fischer J. 2002. Developmental modifications in the vocal behaviour of nonhuman primates. In: Ghazanfar AA, editor. Primate Audition. Boca Raton: CRC Press; p. 109–25. (doi:10.1201/9781420041224.ch7)

62. Hammerschmidt K, Fischer J. 2008. Constraints in primate vocal production. In: Oller DK, Griebel U, editors. Evolution of Communicative Flexibility: Complexity, Creativity, and Adaptability in Human and Animal Communication. Cambridge MA: MIT Press; p. 93–119. (doi:10.7551/mitpress/9780262151214.001.0001)

63. Fischer J. 2008. Transmission of acquired information in nonhuman primates. In: Menzel R, Byrne J, editors. Learning and Memory: A Comprehensive Reference. Oxford: Elsevier; p. 299–313. (doi:10.1016/B978-012370509-9.00055-3)

64. Arriaga G, Jarvis ED. 2013. Mouse vocal communication system: Are ultrasounds learned or innate? Brain Lang. 124, 96–116. (doi:10.1016/j.bandl.2012.10.002)

65. Catchpole CK, Slater PJB. 2008. Bird Song: Biological Themes and Variations. 2nd ed. Cambridge: Cambridge University Press.

66. Wallman J. 1992. Aping Language. Cambridge: Cambridge University Press.

67. Snowdon CT. 1997. Affiliative processes and vocal development. Ann. N. Y. Acad. Sci. 807, 340–51.

68. Westermann G, Mani N. 2017. Early Word Learning (Current Issues in Developmental Psychology series). Westermann G, Mani N, editors. Oxford: Taylor & Francis.

69. Jürgens U. 2009. The neural control of vocalization in mammals: A review. J. Voice. 23, 1–10. (doi:10.1016/j.jvoice.2007.07.005)

70. Hage SR, Nieder A. 2016. Dual neural network model for the evolution of speech and language. Trends Neurosci. 39, 813–829. (doi:10.1016/j.tins.2016.10.006)

71. Owren MJ, Amoss RT, Rendall D. 2011. Two organizing principles of vocal production: Implications for nonhuman and human primates. Am. J. Primatol. 73, 530–44. (doi:10.1002/ajp.20913)

Tables

Table 1. Variation in Savanna monkey alarm calls. Results of multivariate analysis of variance (only P-values indicated, corrected for multiple testing) of four of the most decisive acoustic variables assessing differences in relation to species and context in C. sabaeus and C. pygerythrus alarm calls, separately for females and males.

                        Females                      Males
Acoustic parameters     Species    Alarm context     Species    Alarm context
Call duration           0.212      0.000             0.788      0.000
DFA2 mean               0.212      0.000             0.788      0.000
DF1 mean                0.168      0.000             0.788      0.000
Frequency range         0.016      0.000             0.744      0.000

Figure captions

Figure 1. Spectrograms of rhesus macaque coo calls. Recordings were obtained from two subjects at the age of 1-2 weeks, 3-4 weeks, and 4-5 months, respectively.

Figure 2. Spectrograms of baboon loud calls from Guinea baboons, olive baboons and chacma baboons. A. Females. B. Males. C. Distribution of different baboon species on the African continent. Baboon drawings by Steven Nash.

Figure 3. Spectrograms of West African green monkey and East African vervet monkey alarm calls, in response to different predator/threat types. A. Female green monkey calls; B. male green monkey calls; C. female vervet monkey calls; D. male vervet monkey calls.

Figure 4. Acoustic differences of Chlorocebus alarm calls in relation to context and species. Boxplots and individual values for female and male vervet monkey (C. p.) and green monkey (C. s.) alarm calls (blue: aer = aerial, orange: leo = leopard, green: snk = snake). (a) Element duration, (b) Mean of the central frequency (DFA2), (c) Mean dominant frequency (DF1), (d) mean frequency range. Boxplots indicate median and interquartile range. Whiskers show values within 1.5 times of the inter-quartile range. Dots represent individual values. Reprinted with permission from (53).

Figure 5. Heat maps reflecting the acoustic similarity of West African green monkey (GM) and East African vervet monkey alarm calls (V). A. males; B. females. aer = aerial alarms, snk = snake alarms, leo = leopard alarms.

Additional Information

Authors' Contributions

JF and KH conceived the manuscript, wrote the paper and gave final approval of the version to be published.

[Figure 1 image: spectrogram panels labelled week 1-2, week 3-4 and month 4-5, shown for each of two subjects; frequency (kHz) against time (s).]

[Figure 2 image: spectrogram panels for Papio papio, Papio anubis and Papio ursinus in rows A and B; frequency (kHz) against time (s). Map labels: P. cynocephalus, P. anubis, P. papio, P. ursinus, P. kindae, P. hamadryas.]

[Figure 3 image: spectrogram panels (a) to (d); frequency (kHz) against time (s). Call labels: Leopard, Drone, Snake; Leopard, Snake, Aerial.]

[Figure 4 image: boxplot panels a to d, shown separately for females and males. Groups on the x-axis: C.p. aer, C.p. leo, C.p. snk, C.s. aer, C.s. leo, C.s. snk. Y-axes: duration [ms], DFA2 mean [Hz], DF1 mean [Hz], range mean [Hz].]
