{"id":61,"date":"2020-12-16T23:58:06","date_gmt":"2020-12-16T23:58:06","guid":{"rendered":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/?page_id=61"},"modified":"2020-12-17T00:11:40","modified_gmt":"2020-12-17T00:11:40","slug":"research","status":"publish","type":"page","link":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/research\/","title":{"rendered":"Research"},"content":{"rendered":"\n<p>Some themes we have focused on:<\/p>\n\n\n\n<p><strong>1. Reduced speech is processed just as well as clear speech<\/strong><\/p>\n\n\n\n<p>Since the 1960s, researchers have consistently shown that there is a bias toward \u201cclear speech\u201d by listeners.\u00a0 Specifically, since clear speech is slower, has a more dispersed vowel space, and has more release bursts for final segments (as examples), these types of productions are thought to benefit the auditory system.\u00a0 I showed that the bulk of this research was biased by the stimuli used in these studies: clear speech was recorded, and \u201creduced\u201d speech was created by manipulating that sample to arrive at a reduced version.\u00a0 The studies were designed to show a clear speech benefit.\u00a0 By using spoken language at different speech rates, I showed that naturally produced (within-accent) variation is processed equally well by listeners.\u00a0 Importantly, I also showed that variation in the speech signal is helpful, not harmful, to successful recognition.\u00a0 This finding was replicated multiple times and reshaped our theories of language processing and representation: rather than searching for differences between clear speech and manipulated speech, the field shifted toward investigating speech processing as speech is actually produced by talkers, raising more interesting questions about processing, representation, and variation, which I detail below.\u00a0 <\/p>\n\n\n\n<p>Sumner, M., and Samuel, A. G. (2005). 
Perception and representation of regular variation: The case of final \/t\/. <em>Journal of Memory and Language<\/em>, 52, 322 \u2013 338.<\/p>\n\n\n\n<p>Sumner, M. (2011). The role of variation in the perception of accented speech. <em>Cognition<\/em>, 119, 131\u2013136.<\/p>\n\n\n\n<p>de Marneffe, M.-C., Tomlinson, J., Tice, M., and Sumner, M. (2011). The interaction of lexical frequency and phonetic variability in the perception of accented speech. In L. Carlson, C. H\u00f6lscher, &amp; T. Shipley (Eds.), <em>Proceedings of the 33rd Annual Conference of the Cognitive Science Society<\/em>. Austin, TX: Cognitive Science Society.<\/p>\n\n\n\n<p>Sumner, M. (2013). A phonetic explanation of phonological variant effects. <em>Journal of the Acoustical Society of America<\/em>, 134, EL26 \u2013 EL32.<\/p>\n\n\n\n<p><strong>2. Equivalence in online processing tasks does not imply an equivalence in memory<\/strong><\/p>\n\n\n\n<p>For years, psycholinguists and linguists alike have investigated the nature of language representations.&nbsp; To do this, researchers typically use some immediate processing task (semantic priming, word naming, shadowing, etc.) 
to assess the responses of participants to variable speech patterns.&nbsp; Any similarities or differences across conditions are then taken as evidence about representation (e.g., a word like CAT might facilitate the recognition of the word DOG, but the same word produced slightly differently (with the final T having a sound similar to the medial sound in <em>mitten<\/em>) might also facilitate recognition of the word DOG).&nbsp; Before my work, this type of equivalence would have been used to support a traditional, decades-long theory in linguistics that language representations are abstract and devoid of detail.&nbsp; My work called out this way of making implicit assumptions about representations, arguing that we need to conduct memory experiments in order to make claims about representation.&nbsp; I show quite clearly, within and across speakers of various social groups, that listeners readily understand variably produced words in immediate tasks, but that these equivalences in processing map onto inequivalences in memory.&nbsp; Specifically, listeners recall less frequent \u201cclear speech\u201d forms better than they recall reduced speech forms, even though the two are processed equivalently in the immediate term.&nbsp; This work highlighted interactions in memory specificity and the effects of language use patterns on the system as a whole, and challenged our notions of representation in linguistics.<\/p>\n\n\n\n<p>Sumner, M., and Samuel, A. G. (2009). The effect of experience on the perception and representation of dialect variants.&nbsp; <em>Journal of Memory and Language<\/em>, 60, 487 \u2013 501.<\/p>\n\n\n\n<p>Sumner, M. (2013). A phonetic explanation of phonological variant effects. <em>Journal of the Acoustical Society of America, <\/em>134, EL26 \u2013 EL32<em>.<\/em><\/p>\n\n\n\n<p>Sumner, M. and Kim, S. K. (2020). Some thoughts on the phonetics-psycholinguistics interface. In J. Setter &amp; R.-A. 
Knight (Eds), <em>The Cambridge Handbook of Phonetics<\/em>, Cambridge: Cambridge University Press, to appear.<\/p>\n\n\n\n<p><strong>3. Spoken words are socially weighted<\/strong><\/p>\n\n\n\n<p>Research in the realm of spoken word recognition typically asks questions like those articulated above: How do we move from a variable speech signal to representations, and how is it that we take any number of physical instantiations of a given spoken word and quickly map them to meaning, without conversational breakdowns?&nbsp; In this work, speech is treated as a string of linguistic units.&nbsp; But speech is not just a string of linguistic units.&nbsp; Speech carries phonetic information, in the form of predictable acoustic variation, that tells listeners not only about what was said, but also about who said it.&nbsp; Through various immediate and long-term spoken word recognition tasks, and across various populations that carry with them social baggage, I have found that words carry social weight.&nbsp; Specifically, the inequality of words in memory described in (2) is predictable based on how often a word is said, how the word is uttered, and also the social context in which the word has been experienced by the listener population.&nbsp; This finding challenged the explanatory power of purely frequency-based theories of lexical representation and access, and suggested that we can never disentangle talker-based characteristics from the speech signal in order to investigate \u201clinguistic\u201d processing free from the influence of the other information inherent in the signal.&nbsp; We offered a mechanism through which this weighting occurs, which has been tested and supported by a variety of researchers.<\/p>\n\n\n\n<p>Sumner, M., Kim, S. K., King, E., and McGowan, K. (2014). The socially-weighted encoding of spoken words: A dual-route approach to speech perception.&nbsp; <em>Frontiers in Psychology<\/em>, 4, 1 \u2013 13.<\/p>\n\n\n\n<p>Sumner, M. (2015). 
The social weight of spoken words. <em>Trends in Cognitive Sciences<\/em>, 19, 238-239.<\/p>\n\n\n\n<p><strong>4. Phonetically-cued social information affects immediate processing early<\/strong><\/p>\n\n\n\n<p>Logically building on the impact of the findings in (3), I followed up on the notion that talker information influences foundational behaviors in speech processing that had previously been investigated from a purely linguistic standpoint.&nbsp; For example, we have found that the voice of a talker influences responses in word association tasks.&nbsp; Specifically, when given a word like <em>academy<\/em> produced by a man, and asked for the first word to come to mind, most listeners say <em>school<\/em>.&nbsp; When the same word is produced by a woman, the top associate is <em>awards<\/em>.&nbsp; Going one step further, these voice-specific associates are also predictive of online processing (when given prime-target pairs like academy \u2013 awards, listeners are faster at recognizing the target when the pair is produced by a woman than when produced by a man).&nbsp; This is not exclusive to gender.&nbsp; Hearing a word in an emotionally angry prosody (TABLE!!!) facilitates the recognition not only of semantically related words like <em>chair<\/em>, but also of the word <em>mad<\/em>.&nbsp; To be clear, there is no transparent, meaning-based relationship between the words <em>table<\/em> and <em>mad<\/em>.&nbsp; It is purely the prosody, not the lexical meaning, that activates this word, and activates it early enough to show semantic facilitation.&nbsp; These results have impacted the field in various ways, ranging from a shift toward understanding social influences in speech processing generally, to working through ways to understand how this all fits into a single system.&nbsp;<\/p>\n\n\n\n<p>Kim, S. K., and Sumner, M. (2017). 
The effect of emotional prosody on spoken word recognition.&nbsp; <em>Journal of the Acoustical Society of America, <\/em>142, EL49 &#8211; EL55<em>.<\/em><\/p>\n\n\n\n<p>Kim, S. K., and Sumner, M. (2015). Effects of emotional prosody on semantic priming. <em>Proceedings of the 37th Annual Conference of the Cognitive Science Society<\/em>.<\/p>\n\n\n\n<p>King, E., and Sumner, M. (2015). Voice-specific effects in semantic association. <em>Proceedings of the 37th Annual Conference of the Cognitive Science Society.<\/em><\/p>\n\n\n\n<p><strong>5. Social activation from speech introduces biases that modulate the encoding of linguistic events<\/strong><\/p>\n\n\n\n<p>Historically, linguistic processing has been considered independent from other aspects of the cognitive architecture.\u00a0 By linking to attention and memory, I have challenged this notion, showing that we modulate our cognitive resources variably based on social information gleaned from speech.\u00a0 This happens subjectively: We may rate a woman\u2019s voice as reliable when it is produced on its own, but the rating shifts downward when it is presented in the context of a man\u2019s voice.\u00a0 This happens objectively: As attention to a particular talker increases, memory for that talker is more accurate (e.g., American listeners have better recall for a British speaker than for a NYC speaker).\u00a0 And these differences influence the subsequent processing and recall of speakers of different social categories (e.g., listeners process speech produced by black speakers quite differently than that produced by white speakers, nearly always along metalinguistic stereotypical lines).\u00a0 This work has had an impact not only on theories of linguistics, shifting from a language-independent view to a complex-systems view, but also on society more broadly, showing that bias permeates processes as automatic as spoken word 
recognition.<\/p>\n\n\n\n<p>King, S., and Sumner, M. (2014). Voices and variants: Effects of voice on the form-based processing of words with different phonological variants. In P. Bello, M. Guarini, M. McShane, &amp; B. Scassellati (Eds.), <em>Proceedings of the 36th Annual Conference of the Cognitive Science Society<\/em> (pp. 2913 &#8211; 2918). Austin, TX: Cognitive Science Society.<\/p>\n\n\n\n<p>Sumner, M., and Kataoka, R. (2013). Effects of phonetically-cued talker variation on semantic encoding. <em>Journal of the Acoustical Society of America, <\/em>134, EL485\u2013EL491.<\/p>\n\n\n\n<p>What are gender barriers made of? Freakonomics radio broadcast, July 2016, <a href=\"http:\/\/freakonomics.com\/podcast\/gender-barriers\/\">http:\/\/freakonomics.com\/podcast\/gender-barriers\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some themes we have focused on: 1. Reduced speech is processed just as well as clear speech Since the 1960s, researchers have consistently shown that there is a bias toward \u201cclear speech\u201d by listeners.\u00a0 Specifically, since clear speech is slower, has a more dispersed vowel space, and has more release bursts for final segments (as 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/pages\/61"}],"collection":[{"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/comments?post=61"}],"version-history":[{"count":2,"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/pages\/61\/revisions"}],"predecessor-version":[{"id":99,"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/pages\/61\/revisions\/99"}],"wp:attachment":[{"href":"https:\/\/web.stanford.edu\/group\/phonlab\/cgi-bin\/wordpress\/wp-json\/wp\/v2\/media?parent=61"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}