Voice and Prosody Licia Sbattella Natural Language Processing Politecnico di Milano

Post on 21-May-2018

• Charlie Chaplin - The Great Dictator (1940)

• The King’s Speech

• The Real King’s Speech

Ulysses and the Sirens

The Voice of Pythagoras

Edvard Munch, The Scream (1893)

Voice and silence
• Two common uses of voice:
• The voice supporting meaning (let’s go fighting)
• The voice supporting aesthetic experience (let’s go to the Opera)
• The voice supporting interaction and relationships
• Speaking machines … thinking machines
• Voice, language and thought ↔ humanity
• Voice and life
• Silence and death

Always ‘something more’
• Voice points to meaning (creating an expectation of meaning)
BUT
• Voice is an evanescent medium (Fredric Jameson): it makes meaning possible but disappears once meaning has been produced (we get used to it)
• Phonology / Semantics / Meaning
• Voice Analysis → Meaning extraction / Aesthetic Dimension … the living voice is always SOMETHING MORE
• → Prosody in discourse and dialogue, …

Always ‘something more’
• Accent (voice and singing activity). Dominant/Official speech
• Intonation and opposite meaning: irony, sarcasm …
• Tone: the voice as a ‘fingerprint’, immediately recognized and identified
Articulation: from no articulation to hyper-articulation → The voice as ENUNCIATION PROCESS → The voice as (need?) desire of … others

Voice and Speech / NLP
• We are not interested in Phonetics (in phonemes and their production, their physical quality, their physical and physiological properties)
• We are interested in Phonology: the relational nature of phonemes, whose positions in utterances contribute to meaning. Their aptitude to ‘take a place’ helps us to group and distinguish meaningful utterance subparts (Jakobson, De Saussure, …)
• We are interested in Prosody …

Prosody: introduction
• Prosody is a general term encompassing intonation, rhythm, tempo, loudness, and pauses as these interact with syntax, lexical meaning, and segmental phonology in spoken texts.
• In the field of discourse analysis there has been an increasing awareness of the central role of these aspects of language.

Prosody: introduction
• BUT Prosody still has a sideline status in discourse analysis because:
• With the exception of punctuation and certain occasional special fonts, prosodic features are not readily available in English orthography
• ASR → loss of prosodic details. Conversation Analysts have included prosodic components in their transcription coding systems, but not all transcribers consistently use the codes
• Much of the current work on prosody (particularly on intonation and rhythm) is written by and for phonologists and phoneticians → limitations for the discourse analyst.

Prosody: introduction
• Limitations for discourse analysts:
• Constructed examples tend to be single utterances, leaving the role of prosody in extended discourse inaccessible
• Presupposed contexts may influence the intonation of “out-of-the-blue” utterances (citation form)
• Ironically, the recent development of computerized speech technology introduced a new set of methodological decisions about how to proceed

Prosody: introduction
• We will present a reference model for prosodic analysis, aimed at non-phonologists and discourse-oriented, which also encourages mixed processing
• Discourse ↔ Prosody
• Prosody is not an added flourish or a superimposed feature

From MES - Wennerstrom

SUMMARY OF THE DISCOURSE FUNCTION OF INTONATION

Discourse Function and Intonation

• The model that we adopt: A. Wennerstrom’s model:
• ToBI: Pierrehumbert (1980) and Pierrehumbert, Hirschberg (1990)
• Key: Brazil (1985, 1997)
• Paratone: Brown (1977), Yule (1980)

From MES - Wennerstrom

From MES - Wennerstrom

PITCH ACCENTS
1) Multisyllabic words: the stress is marked with accent symbols on the vowels (é = primary stress; è = secondary stress)
2) H* pitch accent: indicates information being added to the discourse as new
3) L+H* pitch accent: underlined text; indicates information contrasting with a prior item or idea in the discourse
4) L* pitch accent: in subscripted capital letters; indicates information that is not to be added to the discourse as new, either because it is already believed to be accessible or because it is extra-propositional
5) L*+H pitch accent: in subscripted and underlined capital letters; indicates that the relevance of an item to the discourse is questioned in contrast to some other item
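For readers who process annotated transcripts computationally, the pitch accent inventory above can be written as a small lookup table. This is only a sketch: the names `PITCH_ACCENT_MEANING` and `describe`, and the example utterance with its labels, are illustrative assumptions and are not taken from MES or from the ToBI guidelines.

```python
# Hypothetical lookup table: pitch accent label -> discourse function.
# The glosses paraphrase the slide; the table itself is an illustration.
PITCH_ACCENT_MEANING = {
    "H*": "new information added to the discourse",
    "L+H*": "contrast with a prior item or idea",
    "L*": "not added as new: already accessible or extra-propositional",
    "L*+H": "relevance questioned in contrast to some other item",
}


def describe(annotated):
    """Turn (word, accent) pairs into (word, accent, gloss) triples.

    Words whose accent is None are unaccented and are skipped.
    """
    return [
        (word, accent, PITCH_ACCENT_MEANING[accent])
        for word, accent in annotated
        if accent is not None
    ]


# Invented example: the labels are assumed, not measured from a recording.
example = [("I", None), ("bought", None), ("a", None),
           ("RED", "L+H*"), ("car", "L*")]

for word, accent, gloss in describe(example):
    print(f"{word:>5}  {accent:<5} {gloss}")
```

A table like this only stores the labels; deciding which accent a speaker actually produced still requires listening to (or measuring) the pitch contour.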

From MES - Wennerstrom

From MES - Wennerstrom

From MES - Wennerstrom

From MES - Wennerstrom

From MES - Wennerstrom

From MES - Wennerstrom

The Intonational Phrase
• The Intonational Phrase is a more or less continuous pitch contour with, at minimum, an initial Key, a number of pitch accents, and a pitch boundary
• The “ideal” Intonational Phrase coincides with a cluster of linguistic features, any of which may be useful in its identification. It can be uttered in one breath; it is often set off by pauses; it gradually declines in pitch throughout its duration; it ends with final lengthening, slower tempo, and possible changes in voice quality; it forms the domain of certain phonological rules; and it may coincide with syntactic constituents (MES, p. 31)
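One of the identification cues listed above, "often set off by pauses", is easy to try out on time-aligned words. The sketch below is a rough heuristic and not the MES procedure: the function name `split_on_pauses`, the 0.3 s threshold, and the word timings are all assumptions made for illustration. A real analysis would combine pauses with the other cues (pitch declination, final lengthening, syntax).

```python
def split_on_pauses(words, min_pause=0.3):
    """Group time-aligned words into candidate intonational phrases.

    words: list of (token, start_seconds, end_seconds), in temporal order.
    min_pause: silence (in seconds) treated as a phrase break; 0.3 is an
               assumed value, not one taken from the source.
    """
    phrases, current, previous_end = [], [], None
    for token, start, end in words:
        gap = None if previous_end is None else start - previous_end
        if gap is not None and gap >= min_pause and current:
            phrases.append(current)  # close the phrase before the pause
            current = []
        current.append(token)
        previous_end = end
    if current:
        phrases.append(current)
    return phrases


# Invented timings, as a forced aligner might produce them.
aligned = [("well", 0.00, 0.30), ("I", 0.35, 0.45),
           ("think", 0.45, 0.80), ("so", 1.40, 1.70)]
print(split_on_pauses(aligned))
```

Pauses alone over-segment hesitant speech and miss phrase boundaries produced without silence, which is why the slide speaks of a cluster of features.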

The Intonational Phrase
• Natural Discourse / Altered baseline
• According to Pierrehumbert’s “compositional” model of intonational meaning, each component or “tone” adds a small element of meaning to the discourse as a whole

Intonation System and other systems of prosody
• The intonation system interacts with other systems of prosody:
• The tempo and volume with which a sentence is delivered can have consequences for its interpretation
• A paralinguistic variation of pitch can be used to highlight particular constituents, exaggerating their basic intonation patterns
• Changes in voice quality can occur in mimicry or in the use of stereotypical voices to achieve interactional goals
• → their role in the analysis of conversation
• → their role in narratives

Stress and Rhythm
• Rhythm is an organizing force in language
• Rhythm provides an underlying hierarchical structure upon which Stress is built, while Intonation components are associated with the high points of these rhythmic hierarchies
• Rhythm is maintained across intonation boundaries and from one speaker to the next.

Stress and Rhythm
• Stress is not Intonation
• They both use pitch, volume and length changes in speech production BUT
• Stress is a phonological characteristic of lexical items and is largely fixed and predictable, whereas intonation can be altered depending on the discourse role played by the constituents with which tones are associated
• → Intonation has the greater potential to influence discourse meaning

Stress and Rhythm
• A basic stress pattern (including primary stress, secondary stress, no stress and compound stress) is a phonological property of lexical items and compounds, depending on their history, morphology, and syllable structure
• Intonation is a system of meaning in discourse, which associates with text by mapping onto the rhythmic stress patterns of words

Rhythm Underlies Spoken Language
• A sense of rhythm is a universal human trait
• “The fact that speech, verse and music all have hierarchically organized metrical structure implies … a common cognitive origin. Not only are the principles of organization surprisingly similar for all three faculties, but they also allow the same play-off between abstract construct or underlying structure and actual realization” (Couper-Kuhlen, 1993)

Rhythm Underlies Spoken Language
• Feet and the energy of the rhythmic beats. Where?
• At the beginning of the foot, usually on the stressed syllable of the first content word?
• English is trochaic (energy at the beginning of the foot)
• English is often referred to as a ‘stress-timed’ language
• “In natural speech [the rhythm] is not as regular as in counting or in children’s rhymes. NEVERTHELESS there is a strong tendency in English for the salient syllables to occur at regular intervals; speakers of English like their feet to be all roughly the same length” (Halliday, 1994)

Rhythm Underlies Spoken Language
• In actual measurements of beat duration:
• The intervals between beats are never exact. However, a regular rhythm can still be perceived because a series of beats is processed as an “auditory gestalt”
• Through a ‘constructive process’ these data become parts of a holistic scheme, which is then able to incorporate further details from the incoming signals
• Silent beats are time fillers in natural speech
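The claim that inter-beat intervals are "never exact" yet still close to regular can be checked numerically once beat times have been marked. The following is a minimal sketch under stated assumptions: the beat timestamps are invented, and using the coefficient of variation (standard deviation divided by mean) as the regularity measure is a choice made here, not a method taken from MES.

```python
from statistics import mean, pstdev


def beat_intervals(beat_times):
    """Durations between successive beats (seconds)."""
    return [later - earlier
            for earlier, later in zip(beat_times, beat_times[1:])]


def interval_variability(beat_times):
    """Coefficient of variation of the inter-beat intervals.

    0.0 means perfectly even beats; larger values mean a less regular
    rhythm. Needs at least two beats.
    """
    intervals = beat_intervals(beat_times)
    if not intervals:
        raise ValueError("need at least two beat times")
    return pstdev(intervals) / mean(intervals)


# Invented beat onsets (seconds) for one stretch of speech.
beats = [0.0, 0.5, 1.02, 1.49, 2.0]
print(beat_intervals(beats))
print(round(interval_variability(beats), 3))
```

A low value on real data would be compatible with the "auditory gestalt" account: the beats are not identical in duration, but they are close enough to be heard as a regular pulse.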

From MES - Wennerstrom

Rhythmic Hierarchies
• A handy formalism to show how word stress interacts with intonation
• The source of pitch accent lies outside the metrical structure, depending on a speaker’s intentions and assessments about the discourse
• Rules of ‘eurhythmy’ are phonological rules that restructure patterns of rhythmic weight so that stress is distributed more evenly and the overall structure is balanced.

From MES - Wennerstrom

From MES - Wennerstrom

Rhythm in Interaction
• The importance of Rhythm as an organizing force in phonology has further implications for the analysis of interactional discourse
• Rhythmic speech is easy to process
• Hearers are able to perceive and process the discourse in regular beat-sized cycles
• Turn-taking
• Synchronized overlapping
• Agreement on “awkward” pauses
• Synchronized rhythm of talk across turn boundaries
• Uncomfortable moments: when the rhythmic structure of the discourse breaks down and the content also reflects some awkwardness or disagreement

Rhythm in Interaction
• Rhythmic stability is global
• Speech with regular rhythm is optimal for processing. However, starting from this idealized balance, speakers also have the option to manipulate rhythm for interactional purposes.

From MES - Wennerstrom

Paralanguage: The Color of Everyday Speech

From MES - Wennerstrom

Paralanguage Usage
• Oral Narrative → prosody of narrative
• Storytellers enrich their stories with gestures, mimicry, volume, and variation of pitch (performance features)
• Quoted speech is marked by paralinguistic shifts: tempo, pitch range, volume and other aspects of voice quality change during the quoted portion
• Emotion Recognition
• Development of individual speech style
• Cross-cultural communication style analysis

Intention, Mental Representation, Coherence
• Coherence: cohesion and connection among the ideas behind a discourse
• Intonation plays a role in cohesion and, thereby, coherence
• Examples:
• A pitch boundary at the end of a phrase can anticipate the next constituent to complete its interpretation
• An L* pitch accent can be associated with lexical items that the speaker believes should already be accessible to the hearer
• An L+H* pitch accent can signal a contrast relationship with items that have come before
• Other pitch morphemes help hearers to draw connections between what is uttered and what is already represented in their minds during the interaction

A cooperative process
• The collaborative nature of comprehension
• Two participants readjust their mental models to incorporate each new idea in a coherent manner
• It’s important to note the contribution of intonation to the cohesion of a text and (indirectly) to coherence
• Intonation can provide more information than the lexicogrammatical structure alone
• The lexicogrammatical structure combines with other sources of input to create a mental representation of discourse
• A. Wennerstrom → L* is associated with what a speaker assumes to be accessible to hearers → studies of community membership and social affiliation, but also of a professor’s assumptions about what is common knowledge and what students need to learn.

Mental Representation and Discourse
• As people communicate, each builds a mental representation of the discourse as it progresses.
• New contributions involve a speaker’s assessment of how others are likely to have constructed their mental representations and what common knowledge they may already share
• However, despite our best efforts at collaboration, comprehension is not an all-or-nothing state of affairs that either succeeds or fails. Speakers and listeners have independent and different goals, …
• → a cooperative process … an interesting game

Coherence from the listener’s perspective
• Coherence is a relative notion → we can call a discourse coherent from a listener’s perspective if it supports the construction of a mental representation that is as adequate as possible or necessary
• A. Wennerstrom underlines 3 important sources of input to the mental representation of discourse:
• 1) Linguistic Input (Anaphora-Antecedent Relationships, Conjunctions, …)
• 2) Previous Knowledge
• 3) Perception of the physical environment

Coherence from the listener’s perspective
• 1) Linguistic Input (Anaphora-Antecedent Relationships, Conjunctions, …): Prosody plays an important role. Speakers associate particular intonational morphemes with the lexicogrammatical structure of the text, working on the listener’s mental model (is the item new, already accessible, contrastive?): pitch accents, pitch boundaries, key and pauses. Speakers and listeners organize their topics, also supporting RE-FOREGROUNDING

Coherence from the listener’s perspective
• 2) Previous Knowledge:
• H* pitch accents direct listeners’ attention to material that has to be added to their mental representation.
• L* pitch accents associate with those details assumed by the speaker to be accessible in the mental representation of listeners through memory schemata.
• L+H* pitch accents associate with items whose ties to the mental representation are in contrast to what is already accessible via the schemata, even if a direct antonym has not been verbalized in the text.

•  …..

Coherence from the listener’s perspective
• 3) Perception of the physical environment
• Deictic language (deictic pronouns: this, that, … I, or You, …)
• Pointing gestures
• Direction of gaze
These bring certain features of the immediate environment to the foreground of attention, while other features remain in the background. WE SUPPORT → GESTALT THEORY OF PERCEPTION

Pitch accents can be associated with lexical items whose referents are perceivable in the surrounding environment, either to foreground them (H*), contrast them (L+H*), indicate that they are already believed to be present in the mental representation of the discourse (L*), or question their relevance (L*+H)
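The way each pitch accent acts on the listener's mental representation can be mimicked with a deliberately simple toy model. Everything in this sketch is an assumption made for illustration: the class `ListenerModel`, its attribute names, and the example items are hypothetical, and a set of strings is of course a very crude stand-in for a mental representation.

```python
class ListenerModel:
    """Toy stand-in for a listener's mental representation (illustrative)."""

    def __init__(self, accessible=()):
        self.accessible = set(accessible)  # items already represented
        self.contrasts = []    # items marked as contrastive (L+H*)
        self.questioned = []   # items whose relevance is questioned (L*+H)
        self.unresolved = []   # L* items the listener cannot locate

    def hear(self, item, accent):
        if accent == "H*":        # new: add to the representation
            self.accessible.add(item)
        elif accent == "L+H*":    # contrast with something accessible
            self.contrasts.append(item)
            self.accessible.add(item)
        elif accent == "L*":      # speaker assumes it is already there
            if item not in self.accessible:
                self.unresolved.append(item)
        elif accent == "L*+H":    # relevance is being questioned
            self.questioned.append(item)
        else:
            raise ValueError(f"unknown pitch accent: {accent!r}")


# Invented classroom example: the listener already knows about "lecture".
listener = ListenerModel(accessible={"lecture"})
for item, accent in [("exam", "H*"), ("lecture", "L*"), ("FRIDAY", "L+H*"),
                     ("syllabus", "L*"), ("room", "L*+H")]:
    listener.hear(item, accent)

print(sorted(listener.accessible))
print(listener.unresolved)
```

The `unresolved` list models the mismatch case: the speaker used L* for something the listener did not in fact have available, which is the kind of assumption about common knowledge that studies of classroom discourse look at.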

Research Topics
• What is the relationship between Reference and Intonation?
• What is the relationship between Intonation and Contrast?
• How does Intonation reflect Community Membership in Cross-Cultural Interaction?
• How do Children use intonation as they develop Schematic Organization?
• How does the Intonation of Classroom Discourse reflect the building of new schemata in the Learning Process?

Prosody as a Discourse Marker
• Discourse Markers are sequentially dependent elements which bracket units of talk (Schiffrin, 1987)
• “Sequentially dependent” means: the occurrence of a marker depends on the sequence of events at the level of discourse, rather than at the local level of the clause.
• “Bracket” means that discourse markers tend to occur at the periphery of other “units of talk”
• “Units” is generic: discourse markers may associate with different types of constituents
• The unit may be syntactic (the phrase or clause); semantic (the proposition); or phonological (the intonational phrase)

Prosody and Lexical Discourse Markers
• In many situations discourse markers are associated with L* pitch accents (they have organizational and interactional functions and do not contribute to the information structure of the discourse). Pierrehumbert and Hirschberg (1990) refer to them as ‘cue phrases’
• L*/H*: You know, like, oh, well,

From MES - Wennerstrom
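Since discourse markers "bracket" units of talk, a first computational pass can simply look for the listed cue phrases at the edges of a transcribed unit. The sketch below is a lexical heuristic only: the function name `peripheral_markers` and the example sentence are assumptions, and the word list is just the four items from the slide. It cannot tell the marker "like" from the verb "like"; that distinction is exactly what the pitch accent (L* versus H*) would add.

```python
import re

# The four cue phrases listed on the slide; any real inventory is larger.
CUE_PHRASES = ("you know", "like", "oh", "well")


def peripheral_markers(unit):
    """Find cue phrases at the start or end of one transcribed unit.

    Returns a list of (cue_phrase, position) pairs, where position is
    "initial" or "final". Matching is purely lexical.
    """
    tokens = re.findall(r"[a-z']+", unit.lower())
    found = []
    for cue in CUE_PHRASES:
        cue_tokens = cue.split()
        n = len(cue_tokens)
        if tokens[:n] == cue_tokens:
            found.append((cue, "initial"))
        # Require extra material before a final cue, so a one-word unit
        # is not reported twice.
        if len(tokens) > n and tokens[-n:] == cue_tokens:
            found.append((cue, "final"))
    return found


print(peripheral_markers("Well, it was kind of strange, you know."))
```

Candidates found this way would then be checked against the recording to see whether they carry the low pitch accent the slide describes.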

The Paratone as a Discourse Marker
• Consider the relationship between prosody itself and organizational structure
• The Paratone (Fox, 1973) is the prosodic equivalent of a written paragraph, whereby speakers manipulate pitch, volume, tempo and pause at transition points between topical constituents to indicate the relationship among those topics
• High Paratones
• Low Paratones
• Embedding Paratones

High Paratones

From MES - Wennerstrom

High Paratones

From MES - Wennerstrom

Research Topics
• What can the study of Prosody add to our knowledge of Lexical Discourse Markers?
• What is the relationship between Prosody and Rhetorical Structures?
• How does Prosody interact with Grammatical Correlates of Topics? (study of the triangulation effects: Rhetorical Structure / Grammatical Correlates of Topics / Paratones)
• What is the relationship between Paratone and Key?
• What is the relationship between Prosody and Discourse Organization in other Discourse Genres?

Bibliography
• A. Wennerstrom, The Music of Everyday Speech: Prosody and Discourse Analysis, Oxford Univ. Press, 2001 [ref. as MES]
• J. Pierrehumbert, J. Hirschberg, The meaning of intonational contours in discourse. In P. Cohen, J. Morgan, M. Pollack (eds), Intentions in Communication (pp. 271-311), Cambridge Univ. Press - MIT Press, 1990
• J. Pierrehumbert, The Phonology and Phonetics of English Intonation. Unpublished doctoral dissertation, MIT, 1980
• A. Fox, Prosodic Features and Prosodic Structures, Oxford Univ. Press, 2007
• D. Brazil, The Communicative Value of Intonation in English, Cambridge Univ. Press, 1997
• G. Brown, G. Yule, Discourse Analysis, Cambridge Univ. Press, 1983

Bibliography
• M. Dolar, A Voice and Nothing More, MIT, Cambridge, 2006
• David Le Breton, Éclats de voix. Une anthropologie des voix, Métailié, 2011
• M. D’Imperio, Toward a strategy for ToBI labeling varieties of Italian, in S.-A. Jun (editor), Prosodic Typology and Transcription: A Unified Approach, Oxford University Press, 2001
• M. Halliday, R. Hasan, Cohesion in English, Longman, 1976
• B. J. Birner, Introduction to Pragmatics, Wiley-Blackwell, 2013
From MIT: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-911-transcribing-prosodic-structure-of-spoken-utterances-with-tobi-january-iap-2006/