L10M: Long Talks 10 - Structure
Friday, 27/Jul/2018:
23:00 - 23:59

Location: Montreal_2

ID: 741

Linguistic structure and listener characteristics modulate the “speech-to-song illusion”

Tamara Rathcke1, Simone Falk2, Simone Dalla Bella3

1University of Kent, United Kingdom; 2Sorbonne-Nouvelle University, France; 3University of Montpellier, France

BACKGROUND. The "speech-to-song illusion" (S2S) is a perceptual phenomenon in which a spoken phrase comes to be heard as sung after a series of repetitions. This transformation indicates a tight link between language and music perception, and has received much attention since its discovery in 1995 (Deutsch 1995). In a previous study, we showed that acoustic characteristics of looped phrases influenced the perceptual shifts (Falk et al. 2014). However, it is still unclear whether linguistic (lexical or syntactic) properties of phrases influence S2S. Moreover, listeners themselves are likely to contribute to their experience of the shift. S2S has been observed in musicians and non-musicians, yet musicality itself is likely to increase the frequency of reported shifts (Falk et al. 2014). Further open questions are whether individual differences in cognitive processing styles and previous language experience also shape this perceptual phenomenon.

HYPOTHESES. We hypothesized that the transformation is achieved via functional re-evaluation of prosodic properties: aspects relevant to speech processing dominate the perception initially and diminish during repetitions when underlying rhythmicity comes to light, enabling a melodic re-analysis of the sentence as singing. This general hypothesis allowed for predictions involving both linguistic structure and listener characteristics: for example, a smaller memory span, longer sentences, and sentences with a semantic or syntactic violation were expected to delay the transformation due to higher demands on speech processing.

METHOD. Two sets of sentence pairs were created in English. The first set contained alternations in the plausibility of lexical constituents (Ducks can fly vs. Trains can fly). The second set comprised "garden path" sentences in which prosodic break location influenced the sentence interpretability (While the woman washed (.) the cat (.) purred). Sentence length was varied in terms of the number of syllables (3-14). Forty native English listeners participated in the experiment. They rated each test sentence on a scale from 1 (clearly speech) to 8 (clearly song) before and after being exposed to its massed repetitions. Individual data (autistic traits, auditory working memory capacity, flexibility, divided attention, alertness, self-reported musicality and foreign language proficiency) were collected via an online questionnaire and the TAP battery (Zimmermann and Fimm 2011). The data of 40 non-native listeners (native speakers of the prosodically dissimilar French) are currently being collected.

RESULTS. Preliminary results show that, overall, all stimuli sounded significantly more song-like after exposure to their looped versions. Shorter phrases transformed into song much more quickly and more often than longer phrases. Most transformations occurred during the 3rd-5th repetition. The shift occurred earlier, however, for listeners showing worse performance in the divided attention test. These preliminary findings with S2S provide the opportunity to gain insights into the cognitive and structural factors governing the links between music and language.

ID: 578

Prosody, Poetry and Processing: ERP evidence for hierarchical metrical structure in silent reading

Michelle Oraa Ali1,2, Ahren B. Fitzroy2,3, Mara E. Breen2

1Massachusetts Institute of Technology; 2Mount Holyoke College; 3University of Massachusetts Amherst


Under the Implicit Prosody Hypothesis, readers generate prosodic structures during silent reading that can direct their real-time interpretations of text (Fodor, 2002). Evidence for the realization of metric structure during silent reading comes from longer reading times for metrically unpredictable words than for predictable ones (Breen & Clifton, 2011), but the cognitive processes underlying metric structure processing in silent reading are unclear.


The current study was designed to investigate whether metric unpredictability in silent reading is processed similarly to metric unpredictability in listening to speech and music.


We analyzed ERPs from nineteen participants (18 female; 1 nonbinary) who silently read 160 rhyming couplets. We manipulated the lexical stress pattern (strong-weak, weak-strong) and metrical predictability (predictable, unpredictable) of the target word (present in [1-4]) in a 2x2 design. In this way, the first syllable in the target word appeared as: [1] a strong syllable aligned with a strong beat (predictable); [2] a strong syllable aligned with a weak beat (unpredictable); [3] a weak syllable aligned with a weak beat (predictable); [4] a weak syllable aligned with a strong beat (unpredictable). An additional 160 metrically predictable rhyming couplets served as fillers. Each couplet was presented in center-embedded 1-to-4-word segments for 700 ms each; the rhyme prime (peasant in [1,4]) was presented for 1000 ms. The target word was presented alone for 1000 ms.

1. Trochaic; Predictable:

There once was a penniless peasant // Who couldn’t afford a nice PREsent

2. Trochaic; Unpredictable:

There once was a clever young gent // Who gave to his girl a *PREsent

3. Iambic; Predictable:

There once was a clever young gent // Who had a nice talk to preSENT

4. Iambic; Unpredictable:

There once was a penniless peasant // Who went to his master to *preSENT


Metrically unpredictable trochaic targets (*PREsent in [2]) elicited a negativity between 325 and 400 ms over left and medial-frontocentral scalp regions relative to predictable trochaic targets (PREsent in [1]). Conversely, there was no difference between iambic targets on strong or weak beats.


The larger negativity for the occurrence of a strong syllable on a predicted weak beat is consistent with results from overt listening (Bohn et al., 2003), demonstrating that consistent metric structure creates temporal expectancies even during silent reading. Moreover, this finding is consistent with music perception results showing larger negativities to metrically unexpected notes (Ladinig et al., 2009), and points to cognitive overlap between hierarchical timing processes in speech and music.


