Song pattern recognition in crickets based on a delay-line and coincidence-detector mechanism

Acoustic communication requires filter mechanisms to process and recognize key features of the perceived signals. We analysed such a filter mechanism in field crickets (Gryllus bimaculatus), which communicate with species-specific repetitive patterns of sound pulses and chirps. A delay-line and coincidence-detection mechanism, in which each sound pulse has an impact on the processing of the following pulse, is implicated to underlie the recognition of the species-specific pulse pattern. Based on this concept, we hypothesized that altering the duration of a single pulse or inter-pulse interval in three-pulse chirps will lead to different behavioural responses. Phonotaxis was tested in female crickets walking on a trackball exposed to different sound paradigms. Changing the duration of either the first, second or third pulse of the chirps led to three different characteristic tuning curves. Long first pulses decreased the phonotactic response whereas phonotaxis remained strong when the third pulse was long. Chirps with three pulses of increasing duration of 5, 20 and 50 ms elicited phonotaxis, but the chirps were not attractive when played in reverse order. This demonstrates specific, pulse duration-dependent effects while sequences of pulses are processed. The data are in agreement with a mechanism in which processing of a sound pulse has an effect on the processing of the subsequent pulse, as outlined in the flow of activity in a delay-line and coincidence-detector circuit. Additionally our data reveal a substantial increase in the gain of phonotaxis, when the number of pulses of a chirp is increased from two to three.


Introduction
Signalling with repetitive sound patterns is an essential strategy for mate attraction in many insects and vertebrates [1][2][3]. Understanding how the animals process their communication signals in the auditory pathways and what specific mechanisms they employ to recognize mate-specific calls represent fundamental questions in neuroethology [4,5]. Owing to their simple song patterns some acoustically communicating insects are ideal systems to study auditory processing and feature detection. At the receiver side, auditory pattern recognition requires neural processing mechanisms tuned to the species-specific acoustic signals [6][7][8]. In female bispotted crickets (Gryllus bimaculatus), which orient to sequences of chirps composed of three to five sound pulses, behavioural studies have characterized the temporal tuning of phonotactic behaviour, which robustly represents the tuning of the underlying processing mechanism [9][10][11][12]. These studies also led to a concept of temporal pattern recognition based on a delay-line and a coincidence-detector [13,14]. According to this concept, the coincidence-detector integrates an internally delayed response to a sound pulse with the direct response to a subsequent pulse and responds best, when the pulse period matches the internal delay. As a fundamental principle of temporal processing, delay-lines and coincidence-detectors are also employed for the processing of pitch in the auditory system of mammals [15,16], for directional processing of sound signals in birds [17,18] and in visual pathways for the detection of movements [19]. In crickets, the auditory brain neurons of the pattern recognition network have recently been identified [14,20] with functional properties in close agreement with the delay-line and coincidence-detector concept.
An inherent consequence of this pattern recognition mechanism is that the coincidence-detector will receive a combination of different direct and delayed inputs for each sound pulse over the course of a chirp. Here, we test the hypothesis that manipulating the duration of individual sound pulses or pulse intervals at the beginning, middle and end of a chirp will have specific effects on cricket phonotaxis that reflect the mechanism of processing in the pattern recognition network.

Material and methods

(a) Animals
Female last instar larvae of Gryllus bimaculatus were selected from a colony at the Department of Zoology, Cambridge University; they were individually housed, acoustically isolated from singing males, had continuous access to water and food, and were kept at 26-28 °C. Phonotaxis tests were performed in a soundproof chamber and started 7-21 days after their final moult.

(b) Trackball system
An open-loop trackball system was used to measure the phonotaxis of tethered females walking towards sound patterns presented at 45° to the left or the right of their long axis. We calculated the lateral deviation towards the active speaker for the duration of each stimulus presentation. This provides a reliable measure of the phonotactic response (see [21,22] for details).

(c) Acoustic patterns for phonotaxis tests
We used chirps with three pulses and systematically changed the duration of either a single inter-pulse interval or a single pulse, while keeping the duration of the other interval and of the pulses constant at 20 ms. The two inter-pulse intervals of a chirp are labelled I1 and I2 (figure 1b). In the interval paradigms, we adjusted I1 or I2 to 5, 10, 20, 25, 30, 40, 50, 60, 80 and 100 ms. The three pulses of a chirp are labelled P1, P2 and P3 (figure 1b). In the pulse duration paradigms, P1, P2 or P3 were set to 5, 10, 20, 25, 30, 40, 50, 60, 80 and 100 ms. We also tested two chirp patterns. In one, the duration of P1, P2 and P3 increased from 5 to 20 and 50 ms; in the other, reversed pattern it decreased correspondingly. Inter-pulse intervals were 20 ms in both.
In all tests, we presented a sequence of two-pulse chirps with 20 ms pulse duration and 20 ms inter-pulse interval as a reference signal, which is the minimum chirp pattern that elicits a weak phonotactic response. This allowed us to report relative increases as well as decreases of the phonotactic response.
All sound patterns had a chirp period of 360 ms. Pulses had a rising and falling ramp of 2 ms, except for 5 ms pulses, where the ramp was 1 ms. The carrier frequency was 4.8 kHz and the sound intensity was calibrated to 75 dB SPL RMS. Patterns with different pulse intervals or pulse durations were presented sequentially for 30 s each from the left and the right speaker; a silent period of 15 s separated different patterns to avoid any carry-over effects [22]. Tests were presented with increasing or decreasing order of pulse intervals or pulse durations. Each animal was tested three to five times with each paradigm.
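The stimulus parameters listed above can be turned into a waveform as follows. This is a minimal sketch, not the authors' stimulus software: the function names, the 48 kHz sampling rate and the linear ramp shape are assumptions, and the calibration to 75 dB SPL is not modelled.

```python
import math

def chirp_envelope(pulse_ms, interval_ms, chirp_period_ms=360.0, fs=48000):
    """Amplitude envelope (0..1) of one chirp period.

    pulse_ms:    pulse durations in ms, e.g. (20, 20, 20)
    interval_ms: silent inter-pulse intervals in ms, e.g. (20, 20)
    Ramps are 2 ms, or 1 ms for 5 ms pulses (values from the text);
    the linear ramp shape is an assumption.
    """
    assert len(interval_ms) == len(pulse_ms) - 1
    env = [0.0] * round(chirp_period_ms * fs / 1000)
    t_ms = 0.0
    for i, dur in enumerate(pulse_ms):
        ramp_ms = 1.0 if dur <= 5 else 2.0
        start = round(t_ms * fs / 1000)
        n_pulse = round(dur * fs / 1000)
        n_ramp = round(ramp_ms * fs / 1000)
        for k in range(n_pulse):
            rise = min(1.0, k / n_ramp)
            fall = min(1.0, (n_pulse - 1 - k) / n_ramp)
            env[start + k] = min(rise, fall)
        t_ms += dur
        if i < len(interval_ms):
            t_ms += interval_ms[i]
    return env

def chirp_waveform(pulse_ms, interval_ms, carrier_hz=4800.0, fs=48000):
    """Envelope multiplied by a sine carrier (4.8 kHz as in the text)."""
    env = chirp_envelope(pulse_ms, interval_ms, fs=fs)
    return [a * math.sin(2 * math.pi * carrier_hz * n / fs)
            for n, a in enumerate(env)]

# Standard three-pulse chirp and the two-pulse reference chirp.
standard = chirp_waveform((20, 20, 20), (20, 20))
reference = chirp_waveform((20, 20), (20,))
```

Changing a single entry of `pulse_ms` or `interval_ms` gives the P1, P2, P3, I1 and I2 paradigms; `(5, 20, 50)` and `(50, 20, 5)` give the increasing and reversed chirp patterns.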

(d) Data analysis
The lateral deviation of a female towards the active speaker was measured for each test pattern over the course of 1 min, combining the responses to the left and right presentation, and was averaged over all trials. As the behaviour of individuals varies (see [13]), we pooled data from 25 phonotactically responding females to obtain the characteristic response curves for changes in pulse intervals or durations. For each test, data for an example recording (n = 1) and the pooled results (n = 25) are listed in the electronic supplementary material, tables. Responses to test patterns are given with ± s.e.m. and are compared with the response to the two-pulse reference chirp, which was 14.9 ± 0.8 cm min⁻¹. We describe the mean phonotactic responses as strong, moderate, weak or as no-response, depending on the significance levels by which responses were different from the reference response. Strong phonotaxis responses of more than 30 cm min⁻¹ were always highly significantly different (p < 0.001) from the reference value, and moderate responses in the range of 20-30 cm min⁻¹ were also different with high significance (p < 0.001). Phonotaxis responses with scores between 20 cm min⁻¹ and 8.2 cm min⁻¹ were not different from the reference value or were different at a significance level lower than p 0.003, and are described as weak responses. Any responses with scores lower than 8.1 cm min⁻¹ [...] normally distributed; statistical analysis was performed using the Wilcoxon signed-rank test. For calculating the heat-map diagrams from the tuning curves (figure 4e), we linearly interpolated the phonotactic tuning curves at 5 ms intervals.

Figure 1. (a) Flow of activity within a delay-line coincidence-detector circuit. The response to a sound pulse (P) is forwarded directly (P DR) towards a coincidence-detector (CD) and also via a delay-line (P DL). If the internal delay matches the period of the pulse pattern, the direct spiking and the delayed graded input will coincide and the output of the detector is boosted. (b) Diagram revealing the flow of activity for a chirp with three sound pulses. Each sound pulse (P) elicits a direct (P DR) and a delayed input (P DL) to the coincidence-detector (CD). The CD output remains low (small boxes) if a direct input and a delayed input do not coincide, like the direct input by the first pulse (P1 DR) or the delayed input by the third pulse (P3 DL). When a direct and a delayed input coincide, the CD output is boosted (large boxes), as for the delayed response to P1 coinciding with the direct response to P2 (P1 DL + P2 DR).

rspb.royalsocietypublishing.org Proc. R. Soc. B 284: 20170745
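The descriptive response categories from the data analysis can be sketched as a small helper. This is an illustrative simplification: it uses only the score ranges given in the text and ignores the significance tests on which the categories are also based. The function name is hypothetical, and treating every score below 8.2 cm min⁻¹ as no-response is an assumption, because the text does not assign scores between 8.1 and 8.2 cm min⁻¹.

```python
REFERENCE_CM_PER_MIN = 14.9  # two-pulse reference response from the text

def classify_response(score_cm_per_min):
    """Label a mean lateral deviation (cm/min) by score range only.

    Assumption: the significance levels used in the paper are not
    checked here; only the numerical ranges are applied.
    """
    if score_cm_per_min > 30.0:
        return "strong"
    if score_cm_per_min >= 20.0:
        return "moderate"
    if score_cm_per_min >= 8.2:
        return "weak"
    return "no-response"

# Mean responses reported in the results section.
for label, score in [("20-20-20 ms", 48.8), ("5-20-50 ms", 34.7),
                     ("reference", REFERENCE_CM_PER_MIN),
                     ("50-20-5 ms", 7.9)]:
    print(label, classify_response(score))
```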

Results
The framework of our experiments is based on a delay-line and coincidence-detector mechanism for auditory pattern recognition (figure 1a) as proposed in [13]. In the brain, sound pulses (P) elicit a direct response (P DR) and, via a parallel line, an internally delayed response (P DL). Both are forwarded to a coincidence-detector (CD), which integrates the delayed response to a pulse (P DL) with the direct response (P DR) of a subsequent pulse. The CD requires a sequence of at least two pulses and is fully activated if the period of the pulse pattern corresponds to the internal delay. As an inherent property of this mechanism, the CD will receive different combinations of direct and delayed inputs at the beginning, middle and end of a chirp; this becomes obvious when the flow of activity is depicted (figure 1b).

(a) Conceptual framework for the design of auditory test patterns
For a chirp with three sound pulses, the response to the first pulse (P1) will forward a direct input (P1 DR) to the CD (figure 1b, left). The CD output will remain low (small box), as there is no previous pulse providing a delayed input. When the second pulse (P2) occurs, the delayed input P1 DL coincides with the direct input from this pulse (P2 DR), and thus enhances the response of the CD (large box). Therefore, systematic changes of the duration of P1, while keeping all other parameters constant, are predicted to reveal the time course of the delayed input P1 DL to the CD. The effect should be mirrored by a characteristic change in phonotaxis. Similarly, pulse P2 will generate a delayed input (P2 DL), which will interact with the direct input from the next pulse (P3 DR). Varying the duration of pulse P2 will have an impact on the CD output by interacting with the delayed response to P1 and by interacting with the direct input of P3 (figure 1b, middle), and should reveal to what degree the direct and/or the delayed signal of P2 shape the phonotactic response.
In the case of P3, the direct input (P3 DR) will coincide with the delayed input P2 DL and enhance the CD output. Its delayed signal (P3 DL), however, will not coincide with an input from a subsequent pulse and the CD will not be activated. Varying the duration of P3 will therefore demonstrate the effect of P3 DR on phonotaxis (figure 1b, right).
When keeping all pulse durations constant, the effect of inter-pulse intervals can be analysed. Changing inter-pulse interval I1 will alter the temporal overlap between the delayed input P1 DL and the direct input P2 DR at the CD (figure 1b, left). As the direct input P2 DR is kept constant, the time course by which P1 DL impacts on phonotaxis will be revealed. Similarly, varying interval I2 will change the temporal overlap between the delayed input P2 DL and the direct input P3 DR, and will demonstrate how the delayed input P2 DL affects phonotaxis.
Based on this activity flow for chirps with three pulses, systematic changes in inter-pulse intervals and pulse durations are predicted to lead to characteristically different phonotactic responses. For each test, we present an example phonotaxis response from an individual female for a qualitative description and the pooled data from 25 phonotactically responding females.
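The flow of activity described in this section can be illustrated with a toy model. This is a minimal sketch under strong assumptions: the direct input is binary, the delayed input is simply a time-shifted copy of the direct input, and the internal delay is set to 40 ms (the period of a 20 ms pulse plus a 20 ms interval). The function names and the 1 ms resolution are illustrative and not taken from the study. The sketch does not model the graded time course of the delayed signal, so it cannot by itself reproduce different tuning curves for P1, P2 and P3.

```python
def pulse_train(pulse_ms, interval_ms, total_ms=400):
    """Binary direct input at 1 ms resolution: 1 while a pulse is on."""
    direct = [0] * total_ms
    t = 0
    for i, dur in enumerate(pulse_ms):
        for k in range(t, t + dur):
            direct[k] = 1
        t += dur
        if i < len(interval_ms):
            t += interval_ms[i]
    return direct

def coincidence_ms(pulse_ms, interval_ms, delay_ms=40):
    """Milliseconds during which direct and delayed input overlap.

    Assumption: the delay line outputs a copy of the direct input
    shifted by delay_ms; the detector is a logical AND.
    """
    direct = pulse_train(pulse_ms, interval_ms)
    delayed = [0] * delay_ms + direct[:-delay_ms]
    return sum(d and dl for d, dl in zip(direct, delayed))

# One coincidence for two pulses, two coincidences for three pulses.
print(coincidence_ms((20, 20), (20,)))          # 20
print(coincidence_ms((20, 20, 20), (20, 20)))   # 40
# A long I1 removes the P1 DL + P2 DR coincidence.
print(coincidence_ms((20, 20, 20), (100, 20)))  # 20
```

In this simplified form the first direct input (P1 DR) and the last delayed input (P3 DL) never contribute to the overlap, which corresponds to the small boxes in figure 1b.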

(b) Phonotactic responses to changes of pulse intervals I1 and I2
The trackball measurements for testing the effect of changing the duration of I1 show a female's typical phonotactic steering response (figure 2a). Sound presentation for a test sequence always started with the left speaker; the lateral deviation measurement was reset to zero at the start of each test. During a strong phonotactic response (e.g. when I1 is at 10, 20 or [...]

For inter-pulse intervals of 5-40 ms, the tests demonstrate similar phonotaxis responses for both I1 and I2 with an optimum centred around 20 ms. Long I1 of 50-100 ms, however, abolished phonotaxis, whereas correspondingly long I2 still elicited a weak response (figure 2c).

[...] with the centre of the P1 response shifted towards 5 ms pulse durations; the map for P3 (bottom) stands out as its maximum is shifted towards longer pulses and as it reveals sustained phonotaxis even towards long sound pulses. The characteristic tuning curves and the heat maps reveal the impact of pulse and interval durations on phonotaxis, and indicate that the auditory pattern recognition system is in a different functional state for each pulse that is processed during a chirp.

(g) Designing and testing putative attractive and non-attractive chirp patterns
The different tuning curves for P1, P2 and P3 prompted us to explore the underlying processing mechanism in more detail.
The comparison of the three tuning curves (figure 3d) with the reference response shows that a 5 ms pulse strongly enhanced the gain of phonotaxis by a factor of 3.1 when presented as P1, but had only a moderate effect of 1.4 when presented as P3. A pulse of 20 ms duration always had a strong effect, whether presented as P1, P2 or P3. A 50 ms pulse had a weak effect of 0.9 on the phonotactic response when presented as P1; however, when presented as P3 it strongly enhanced phonotaxis by a factor of 2.4. Consequently, chirps combining a 5 ms, a 20 ms and a 50 ms pulse (5-20-50 ms) should be effective in eliciting phonotaxis. However, when played in reverse order (50-20-5 ms) they should be much less attractive. Presenting crickets with these patterns composed of the same sound pulses, just in reverse order (figure 3d inset), should reveal if these patterns [...]

[...] 4b), phonotaxis towards the 50-20-5 ms pattern was 7.9 ± 0.8 cm min⁻¹ and significantly lower than the reference of 14.9 ± 0.8 cm min⁻¹ (p < 0.001). The 5-20-50 ms chirps elicited a strong phonotaxis response of 34.7 ± 1.4 cm min⁻¹, which was significantly higher than the reference value (p < 0.001) and significantly higher than the value of the 50-20-5 ms pattern (p < 0.001). The phonotaxis response towards the 20-20-20 ms chirp pattern reached 48.8 ± 1.9 cm min⁻¹ and was significantly higher than any other response (p < 0.001). The very different phonotactic responses towards the attractive 5-20-50 ms and the non-attractive 50-20-5 ms chirps confirm that the functional state of the pattern recognition system depends on the duration of the pulses that are processed, and that it changes during the course of a chirp.
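The position-specific gain factors quoted in this section can be collected in a small look-up table. The sketch below only retrieves the factors for the first and last pulse of a three-pulse pattern; the names are hypothetical, and no rule for combining the factors into a predicted response is assumed, because the text does not state one.

```python
# Gain factors relative to the two-pulse reference, as quoted in the text.
# Keys are (pulse position, pulse duration in ms).
GAIN = {
    ("P1", 5): 3.1,
    ("P3", 5): 1.4,
    ("P1", 50): 0.9,
    ("P3", 50): 2.4,
}

def edge_gains(pattern_ms):
    """Return the (P1, P3) gain factors for a three-pulse pattern."""
    p1, _, p3 = pattern_ms
    return GAIN[("P1", p1)], GAIN[("P3", p3)]

forward = edge_gains((5, 20, 50))   # (3.1, 2.4)
reverse = edge_gains((50, 20, 5))   # (0.9, 1.4)
print(forward, reverse)
```

Both factors are larger for the 5-20-50 ms pattern than for the reversed 50-20-5 ms pattern, which matches the direction of the prediction made in the text.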

Discussion

(a) Comparison to previous phonotaxis experiments
Cricket phonotaxis experiments were typically performed by altering the duration of all pulses or pulse intervals in a chirp [9][10][11][12][23]. In G. bimaculatus and its sister species G. campestris, the resulting tuning curves [9][10][11][20,24] point to the importance of pulses and inter-pulse intervals of 15-25 ms for calling song pattern recognition, and show that short (5 ms) and long (50 ms) pulses and pulse intervals are not efficient [9][10][11][20]. Owing to the previous design of these test paradigms, the phonotactic response always depended on changes in all pulses and/or pulse intervals of a chirp pattern.
Here we analysed cricket phonotaxis in response to chirp patterns in which the duration of one pulse or one pulse interval was systematically altered. This allowed us to single out specific effects on the behaviour, which now can be linked to the neuronal processing underlying song pattern recognition, as so far revealed by single cell recordings [14,20,25].

(b) Phonotactic tuning curves and neural processing in the delay-line coincidence-detector circuit
A previous concept [13] and intracellular studies of auditory brain neurons point to a delay-line and coincidence-detector circuit for pattern recognition in the cricket brain [14]. Sound pulses (P) elicit a direct response (P DR) mediated by the activity of a spiking ascending interneuron, and via a parallel line an internally delayed response (P DL). The delayed response is a graded excitatory potential (figure 5a) of a non-spiking neuron, which is generated after an initial inhibition of the interneuron. The direct and the delayed signals are forwarded to a coincidence-detector (CD), which integrates the delayed response to a pulse (P DL) with the direct response (P DR) of a subsequent pulse. Its response is strongest at the species-specific pulse period, when the spiking response coincides with the delayed excitatory graded response. The interaction of the graded and spiking response is outlined for intervals and pulses of 20 and 100 ms (figure 5b-d).
Changing interval I1 alters the temporal interaction between the delayed input P1 DL and the direct input P2 DR at the CD (figure 1b), and reveals the time course by which the graded signal P1 DL impacts on phonotaxis. Correspondingly, varying interval I2 demonstrates the effect of P2 DL. Based on the neuronal processing, both characteristic response curves reveal how the amplitude of the excitatory delayed signal changes with the duration of the pulse interval (figure 5b). For interval durations of 20 ms, the spike activity of P2 DR coincides with the graded excitatory response (figure 5b, dark grey line) and will activate the CD. I1 and I2 showed a similar tuning curve with a best response at 20 ms (figure 2c). These phonotactic responses to changes in pulse intervals are in good agreement with the time course of the graded delayed response recorded in the pattern recognition network in the brain [14, fig. 5b]. They also correspond to previously reported tuning curves [20]. At intervals of 100 ms, the spiking activity coincides only with the falling phase of the delayed graded signal (figure 5b, bright grey line) and the CD will not be activated. I1 intervals longer than 50 ms [...] Negative effects on phonotaxis due to non-attractive signals have also been reported before [26]. In the case of I2, however, weak phonotaxis is maintained, as processing of the first two pulses of the chirps is not affected by changes of I2.
Corresponding to the flow of activity in the circuit, the P1 tuning curve (figure 3d) reflects how the time course and amplitude of the graded delayed signal P1 DL depend on the duration of P1. For 20 ms P1 pulses, the maximum of the graded signal (figure 5c, dark blue line) will coincide with the direct spiking input of P2 and will drive the CD response, whereas for 100 ms durations of P1, spike activity will coincide only with a weak excitatory delayed signal (figure 5c, light blue line). Different from previous data on pattern recognition [9][10][11][20], pulse durations of 5 ms were efficient to elicit strong phonotaxis and the best response occurred at 10 ms. In electrophysiological experiments 5 ms pulses have not yet been tested, but pulses longer than 10 ms were sufficient to trigger the graded response in the delay-line neuron. Interestingly, in electrophysiological experiments the graded delayed response was always coupled to the end of a sound pulse, even for 50 ms long pulses [14, fig. 4d]. Also, the coincidence-detection mechanism tolerates 50 ms pulses [25, fig. 7b]. We therefore expected that phonotaxis would not be affected when the duration of P1 was increased to 50 ms and beyond. The tuning curve for P1 (figure 3d), however, shows that this is not the case, as phonotaxis reliably started to break down for P1 longer than 30 ms.
Pulse P2 generates a delayed input (P2 DL), which interacts with the direct input from the next pulse (P3 DR). Varying the duration of pulse P2 therefore has an impact on the CD by interacting with the delayed response to P1 and by interacting with the direct input of P3 (figure 1b, middle); thus the direct and/or the delayed signal of P2 may shape the phonotactic response. The tuning curve for P2 (figure 3d) shows the best response at a 20 ms pulse duration, as previously reported for standard chirps [9][10][11][20]. It is similar to the tuning curve obtained by changing I2 (figure 2d) and the P1 tuning curve (figure 3d), indicating that the delayed graded response of P2 provides the dominant effect on phonotaxis.
In the case of P3, the direct spiking input (P3 DR) coincides with the graded delayed excitatory input P2 DL. The delayed signal P3 DL, however, does not coincide with any subsequent signal (figure 1b, right). Varying the duration of P3 therefore demonstrates the effect of P3 DR on phonotaxis. The characteristic curve (figure 3d) shows an increase of the response up to 25 ms, which is in line with previous data for standard chirps [9][10][11][20] and the time course of the delayed graded excitation of the delay-line neuron [14]. However, different from previous data, P3 durations up to 100 ms were efficient to drive phonotaxis. This may be elucidated by the interaction of the direct spike response P3 DR with the delayed graded response P2 DL (figure 5d). For P3 pulses of 20 ms duration, the spike activity (figure 5d, dark red spikes) will coincide with the peak of the graded excitatory response driven by P2. Even for 100 ms pulses, the initial spike response (P3 DR) will coincide with the peak of the graded signal P2 DL. This will boost the response of the CD, which may explain the phonotactic response even to long P3. The ongoing spike response of P3 DR (figure 5d, light red spikes) will only coincide with the decaying part of the graded signal and will not enhance the CD output. As the delayed signal P3 DL will not coincide with any subsequent signal (figure 1b, right), this part of the response has no effect on phonotaxis.

Figure 5. (a) Diagram indicating the flow of neuronal activity in the delay-line and coincidence-detector circuit in response to a sound pulse (P). Spike activity provides the direct input (P DR) to the CD, whereas the delayed input (P DL) is based on a graded excitatory signal following an initial inhibitory response due to post-inhibitory rebound.

(c) A pattern recognition concept based on a delay-line and coincidence-detector mechanism
Testing the pulse duration of P1, P2 or P3 revealed three different characteristic responses (figure 3d,e). From P1 to P3, there is a shift of the best response towards longer pulse durations. For P1, the response is tuned towards 5-25 ms with a peak at 10 ms, the response to P2 is best at 10-30 ms with a broad peak at 10-20 ms, and for P3 the range is 10-100 ms with a peak at 25 ms. Long P1 pulses of 60-100 ms abolished phonotaxis, in the case of P2 weak phonotaxis was maintained, and long P3 elicited a strong response. In different crickets, bushcrickets and grasshoppers such different tuning curves may be described as species-specific profiles for acoustic communication [27]. In G. bimaculatus, however, such different filter properties occur sequentially while the pulses of chirps are processed. Short (5 ms) or long (50 ms) pulses have very different effects on phonotaxis, when the pulses are presented at the beginning or the end of a chirp (figure 3d,e). Correspondingly, a chirp pattern composed of 5-20-50 ms pulses elicited phonotaxis, but failed to do so when played in reverse order of 50-20-5 ms (figure 4). Changes in the responsiveness to reversely played song patterns have not yet been reported in crickets but do occur in grasshoppers [28]. In crickets, the different responses towards these chirps with increasing or decreasing pulse durations reveal that the functional state of the processing mechanism changes while sequences of sound pulses are processed. These changes depend on the duration of pulses and intervals, and determine the pattern-recognition process. The characteristic tuning curves provide a look-up table to compose attractive and non-attractive chirp patterns.
The characteristic curves do not indicate a constant, steady filter process underlying song pattern recognition, but rather that filtering is based on dynamic changes in the processing properties of the neuronal network. These dynamic changes can be linked to the neuronal activity within the delay-line and coincidence-detector circuit in the cricket brain [13,14,25], in which the response to a sound pulse impacts on the processing of the subsequent pulse.

(d) Changes in the gain of the phonotactic response
In comparison with the response to the two-pulse reference chirp, a three-pulse chirp increased the phonotactic response by a factor of more than 3.0, indicating a nonlinear impact on the gain of the response. An increase of the response by a factor of 1.5 might be expected if the score simply scales with the number of pulses in a chirp and an increase by a factor of 2.0 as the coincidence detector is fully activated twice due to the additional pulse. Moreover, first pulses (P1), which were followed by a long interval (I1) or which had a long duration, decreased the gain and abolished phonotaxis. These nonlinear changes of the response are not predicted by the mechanistic temporal processing in the delay-line and coincidence-detector. They rather indicate an additional facilitation and/or neuromodulation effect that changes auditory processing and that currently is not reflected in the response properties of the auditory brain neurons [14]. As the change in gain has an impact on the processing of subsequent sound pulses, its rather long time course may provide a filter at a different time scale as required for the processing of chirp durations [11,26].
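The two linear expectations named above can be checked with simple arithmetic. This is a sketch with hypothetical function names; using the 20-20-20 ms response of 48.8 cm min⁻¹ from the chirp pattern test as the three-pulse value is an assumption about which measurement the factor of more than 3.0 refers to.

```python
def gain_by_pulse_count(n_test, n_ref):
    """Expected gain if the score simply scales with the number of pulses."""
    return n_test / n_ref

def gain_by_coincidences(n_test, n_ref):
    """Expected gain if the score scales with the number of coincidences.

    A chirp of n pulses yields n - 1 pairings of a delayed input with
    the direct input of the following pulse.
    """
    return (n_test - 1) / (n_ref - 1)

REFERENCE = 14.9    # cm/min, two-pulse reference chirp (from the text)
THREE_PULSE = 48.8  # cm/min, 20-20-20 ms chirp (from the text)

expected_pulses = gain_by_pulse_count(3, 2)          # 1.5
expected_coincidences = gain_by_coincidences(3, 2)   # 2.0
observed = THREE_PULSE / REFERENCE                   # about 3.28

print(expected_pulses, expected_coincidences, round(observed, 2))
```

The observed ratio exceeds both linear expectations, which is the nonlinearity discussed in this section.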

Conclusion
Our analysis demonstrates how detailed phonotaxis responses can be revealed and linked to the organization of auditory pattern recognition based on a delay-line and coincidence-detector mechanism [13,14]. Further electrophysiological experiments should elucidate how the characteristic tuning curves are mirrored in the neuronal activity of the network. Recent computational approaches to auditory pattern recognition in insects based on Gabor filters present a phenomenological description of the phonotaxis preference functions [24,29,30]. The current experiments provide the basis to refine computational approaches to model this pattern-recognition system. They also may inspire behavioural experiments to analyse delay-line and coincidence-detection mechanisms in other sensory pathways.