Dialogue Acts
Bin Zhang
February 1, 2011
Outline
A. Stolcke et al. 2000, Dialogue act modeling for automatic tagging and recognition of conversational speech
D. Davidov et al. 2010, Semi-supervised recognition of sarcastic sentences in Twitter and Amazon
Dialogue Act
I It is a specialized speech act
I Typical dialogue acts
  I Statement
  I Question
  I Backchannel
  I Agreement
  I Disagreement
  I Apology
Examples of Dialogue Acts
Dialogue Act Labeling
I Labeled dialogue data from the Switchboard corpus
  I Contains conversational telephone speech
  I Speech is segmented into utterances
  I Each utterance has a dialogue act label
I Tag set
  I Dialogue Act Markup in Several Layers (DAMSL)
  I SWBD-DAMSL tag set: 50 original tags, reduced to 42 tags for better annotator consistency
I 1115 conversations (205,000 utterances) annotated by 8 linguistics graduate students in 3 months, κ = 0.80 (excellent agreement)
Common Dialogue Act Types
Statements: descriptive, narrative, or personal statements
Opinions: often include such hedges as I think, I believe, it seems, and I mean
Questions: yes-no questions, declarative questions, WH questions
Backchannels: short utterances that play discourse-structuring roles, e.g., indicating that the speaker should go on talking
Abandoned utterances: those that the speaker breaks off without finishing, and that are followed by a restart
Turn exits: similar to abandoned utterances, but with speaker change
Answers: yes answers and no answers
Agreements: mark the degree to which a speaker accepts some previous proposal, plan, opinion, or statement
Sequence Modeling of Dialogue Acts
I A conversation can be considered as a sequence of utterances
I Dialogue acts of the utterances in a conversation are inter-dependent
I Denote U as the dialogue act sequence of the conversation, and E as the evidence about the conversation (audio and/or text)

    U∗ = argmax_U P(U|E)

I Using Bayes' rule

    U∗ = argmax_U P(E|U) P(U)

  P(E|U): dialogue act likelihood
  P(U): discourse grammar
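The decision rule above can be sketched by brute force on a toy three-utterance conversation. The act inventory and all probabilities below are illustrative inventions, not values from the paper:

```python
import itertools

# Toy act inventory and hand-picked probabilities (illustrative only).
ACTS = ("statement", "question", "backchannel")

# P(E_i | U_i): likelihood of each utterance's evidence under each act,
# e.g. for "i went home", "did you", "uh-huh".
likelihood = [
    {"statement": 0.6, "question": 0.1, "backchannel": 0.3},
    {"statement": 0.2, "question": 0.7, "backchannel": 0.1},
    {"statement": 0.1, "question": 0.1, "backchannel": 0.8},
]

# P(U_i | U_{i-1}): a bigram discourse grammar, plus an initial prior.
bigram = {
    ("statement", "statement"): 0.4, ("statement", "question"): 0.4,
    ("statement", "backchannel"): 0.2,
    ("question", "statement"): 0.6, ("question", "question"): 0.1,
    ("question", "backchannel"): 0.3,
    ("backchannel", "statement"): 0.5, ("backchannel", "question"): 0.3,
    ("backchannel", "backchannel"): 0.2,
}
prior = {"statement": 0.5, "question": 0.25, "backchannel": 0.25}

def sequence_score(seq):
    """P(E|U) P(U) under the independence and Markov assumptions."""
    p = prior[seq[0]] * likelihood[0][seq[0]]
    for i in range(1, len(seq)):
        p *= bigram[(seq[i - 1], seq[i])] * likelihood[i][seq[i]]
    return p

# U* = argmax_U P(E|U) P(U), by exhaustive enumeration.
best = max(itertools.product(ACTS, repeat=3), key=sequence_score)
print(best)  # ('statement', 'question', 'backchannel')
```

Exhaustive search is exponential in the conversation length, which is what motivates the HMM factorization on the next slide.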
Dialogue Act HMM
I Markov assumption

    P(Ui | U1, U2, . . . , Ui−1) = P(Ui | Ui−k, . . . , Ui−1)

I Independence assumption

    P(E|U) = ∏_i P(Ei | Ui)
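Under these two assumptions the argmax can be found without enumerating every sequence. A minimal Viterbi sketch for a first-order (k = 1) model, with the same kind of toy numbers as before (not values from the paper):

```python
# Toy inputs: per-utterance likelihoods P(E_i | U_i), a bigram discourse
# grammar P(U_i | U_{i-1}), and an initial prior (illustrative numbers).
likelihood = [
    {"statement": 0.6, "question": 0.1, "backchannel": 0.3},
    {"statement": 0.2, "question": 0.7, "backchannel": 0.1},
    {"statement": 0.1, "question": 0.1, "backchannel": 0.8},
]
bigram = {
    ("statement", "statement"): 0.4, ("statement", "question"): 0.4,
    ("statement", "backchannel"): 0.2,
    ("question", "statement"): 0.6, ("question", "question"): 0.1,
    ("question", "backchannel"): 0.3,
    ("backchannel", "statement"): 0.5, ("backchannel", "question"): 0.3,
    ("backchannel", "backchannel"): 0.2,
}
prior = {"statement": 0.5, "question": 0.25, "backchannel": 0.25}

def viterbi(likelihood, bigram, prior):
    """Exact argmax_U P(E|U) P(U) under the HMM assumptions: at each
    step, keep for every act the best partial sequence ending in it."""
    acts = list(prior)
    # delta[a] = (score, path) of the best partial sequence ending in a
    delta = {a: (prior[a] * likelihood[0][a], [a]) for a in acts}
    for lik in likelihood[1:]:
        delta = {a: max((delta[b][0] * bigram[(b, a)] * lik[a],
                         delta[b][1] + [a]) for b in acts)
                 for a in acts}
    return max(delta.values())[1]

print(viterbi(likelihood, bigram, prior))
# ['statement', 'question', 'backchannel']
```

This runs in time linear in the number of utterances and quadratic in the number of acts, versus the exponential cost of enumeration.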
Discourse Grammar
I As is the nature of dialogues, the dialogue acts of the utterances in a dialogue are highly related
I Back-off dialogue act n-gram models serve as a simple and efficient discourse model
I If the speaker is known, speaker-dependent dialogue act n-gram models are better
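The back-off idea can be sketched as below. This is a generic absolute-discounting back-off bigram over act labels, assumed for illustration; it is not the paper's exact estimator and its normalization is only approximate:

```python
from collections import Counter

def train_backoff_bigram(sequences, discount=0.5):
    """Back-off bigram over dialogue act labels: seen bigrams get a
    discounted relative frequency; unseen ones back off to a scaled
    unigram (a sketch, not a properly normalized LM)."""
    uni, bi = Counter(), Counter()
    for seq in sequences:
        uni.update(seq)
        bi.update(zip(seq, seq[1:]))
    total = sum(uni.values())

    def prob(prev, cur):
        if (prev, cur) in bi:
            return (bi[(prev, cur)] - discount) / uni[prev]
        # mass freed by discounting, redistributed via the unigram
        n_types = len([c for (p, c) in bi if p == prev])
        alpha = discount * n_types / uni[prev]
        return alpha * uni[cur] / total

    return prob

# Toy act sequences: s=statement, q=question, a=answer, b=backchannel
prob = train_backoff_bigram([["s", "q", "a"], ["s", "s", "b"], ["q", "a", "s"]])
print(prob("s", "q"), prob("s", "a"))  # seen bigram vs backed-off estimate
```

A speaker-dependent variant would simply train one such model per speaker (or per speaker-change condition).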
Combining Evidence
I Words (W )
I ASR acoustics (A)
I Prosodic features (F): pitch, duration, energy, etc., of the speech signal

    P(W, A, F | U)
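A toy sketch of combining the sources: if, given the act, the prosodic features are assumed conditionally independent of the words and acoustics, the per-source likelihoods simply multiply. All numbers below are invented for illustration:

```python
# P(W, A | U_i): word/acoustic likelihood of one utterance per act.
p_text = {"statement": 0.30, "question": 0.05}
# P(F | U_i): prosodic likelihood per act (e.g. pitch contour features).
p_prosody = {"statement": 0.20, "question": 0.60}

def combined_likelihood(act):
    # P(W, A, F | U_i) ≈ P(W, A | U_i) * P(F | U_i)
    # under the conditional-independence assumption.
    return p_text[act] * p_prosody[act]

best = max(p_text, key=combined_likelihood)
print(best)  # 'statement'
```

Here the textual evidence outweighs the prosodic evidence; with a strongly rising pitch (larger prosodic likelihood ratio) the combination could flip to "question".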
Experimental Results
Outline
A. Stolcke et al. 2000, Dialogue act modeling for automatic tagging and recognition of conversational speech
D. Davidov et al. 2010, Semi-supervised recognition of sarcastic sentences in Twitter and Amazon
Sarcasm
I Sarcasm – the activity of saying or writing the opposite of what you mean, or of speaking in a way intended to make someone else feel stupid or show them that you are angry
Example of Twitter
Example of Amazon
Sarcastic Sentences in Two Genres
I In Twitter messages, sarcastic sentences appear in a wide range of contexts. Some sarcastic sentences are marked with #sarcasm by the user
I In Amazon product reviews, sarcastic sentences usually appear in negative reviews
Examples of Sarcasm
I thank you Janet Jackson for yet another year of Super Bowl classic rock! (Twitter)
I He's with his other woman: XBox 360. It's 4:30 fool. Sure I can sleep through the gunfire (Twitter)
I Wow GPRS data speeds are blazing fast (Twitter)
I [I] Love The Cover (book, Amazon)
I Defective by design (music player, Amazon)
The Semi-supervised Approach
I Data processing
I Pattern extraction
I Pattern selection
I Data enrichment
I Classification
Data Processing
I For Twitter: actual user names, URLs, and hashtags are replaced by the tokens [USER], [URL], and [HASHTAG], respectively
I For Amazon: product, author, company, and book names are replaced by the tokens [PRODUCT], [AUTHOR], [COMPANY], and [TITLE], respectively
Pattern Extraction
I Construct pattern templates
  I Words are classified into high-frequency words and content words (less frequent)
  I Punctuation marks are considered high-frequency words
  I Templates are created using both word classes
I Instantiation
  I Templates are instantiated from the data, with high-frequency words replaced by actual words
  I Hundreds of patterns are collected
  I During matching, partial matching is allowed
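A minimal sketch of the high-frequency/content-word split behind the templates. The frequency cutoff and the one-pattern-per-sentence simplification are assumptions made here for brevity; the paper uses corpus-wide frequency thresholds and extracts many patterns per sentence:

```python
from collections import Counter

def extract_patterns(sentences, hfw_cutoff=2):
    """Words occurring >= hfw_cutoff times stay literal (high-frequency
    words); rarer content words are abstracted to a CW slot."""
    freq = Counter(w for s in sentences for w in s.lower().split())
    patterns = set()
    for s in sentences:
        patterns.add(" ".join(
            w if freq[w] >= hfw_cutoff else "CW" for w in s.lower().split()))
    return patterns

pats = extract_patterns(["this is great", "this is awful", "great stuff here"])
print(pats)  # e.g. contains 'this is CW' and 'great CW CW'
```

In this toy corpus "awful", "stuff", and "here" are rare, so they collapse into CW slots, yielding patterns that generalize across specific content words.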
Pattern Selection
I Remove patterns that originated from one book/product (Amazon)
I Remove patterns that occur in both classes
Data Enrichment
I Used to get more labeled data
I Motivation: sarcastic sentences usually co-occur
I Use labeled sentences as queries to search for more sentences on the web
I Retrieved sentences are assigned a label similar to that of the query
Classification
I Feature vectors are composed of:
  I Match score of each pattern in the sentence
  I Additional punctuation-based features
I A k-nearest-neighbor classifier is used
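The two steps can be sketched as below. The graded match-score values, the patterns, and the training vectors are invented stand-ins for the paper's feature construction:

```python
def match_score(pattern, sentence):
    """Pattern-match feature: 1.0 for an exact occurrence, 0.1 when all
    pattern words appear but not contiguously (a crude stand-in for the
    paper's graded partial-match scores), 0.0 otherwise."""
    s = sentence.lower()
    if pattern.lower() in s:
        return 1.0
    tokens = set(s.split())
    if all(w in tokens for w in pattern.lower().split()):
        return 0.1
    return 0.0

def knn_label(vec, train, k=3):
    """Label by majority vote of the k nearest training vectors
    (Euclidean distance)."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    labels = [lab for _, lab
              in sorted(train, key=lambda ex: dist(vec, ex[0]))[:k]]
    return max(set(labels), key=labels.count)

# Hypothetical patterns and labeled training vectors (one score per pattern).
patterns = ["thank you for", "i just love"]
train = [([1.0, 0.0], "sarcastic"), ([1.0, 1.0], "sarcastic"),
         ([0.0, 0.0], "neutral"), ([0.0, 0.1], "neutral")]
vec = [match_score(p, "well thank you for yet another delay") for p in patterns]
print(knn_label(vec, train))  # 'sarcastic'
```

A real feature vector would also append the punctuation-based features (e.g. counts of "!" and "?") before the distance computation.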
Data Annotation
I Sentences are annotated using five labels (1, 2, 3, 4, 5), 1 being not sarcastic, 5 being clearly sarcastic. In classification, 1 and 2 are negative labels; 3, 4, and 5 are positive labels
I The Twitter #sarcasm hashtag is found to be biased and noisy
I Inter-annotator agreement (fair agreement)
  I Twitter: κ = 0.41
  I Amazon: κ = 0.34
I Amazon training data

               Positive   Negative
    Seed             80        505
    Enriched        471       5020

I Evaluation data: for each genre, 90 positive + 90 negative sentences
Experimental Results
I Baseline (star-sentiment, Amazon only): low-rating reviews with strong positive sentiment
Thanks!
Improving Speech Recognition using Dialogue Acts
I Word recognition (with evidence)

    Wi∗ = argmax_Wi P(Wi | Ai, E)

I Dialogue acts can be considered as a factor that the acoustic and language models depend on
I When the dialogue acts are unknown, the models are mixtures of dialogue act-dependent models
I Mixture-of-posteriors

    P(Wi | Ai, E) = Σ_Ui [ P(Wi | Ui) P(Ai | Wi) / P(Ai | Ui) ] P(Ui | E)

I Mixture-of-LMs

    P(Wi | Ai, E) ≈ Σ_Ui P(Wi | Ui) P(Ui | E) P(Ai | Wi) / P(Ai)
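With toy numbers, the mixture-of-LMs rescoring looks like the sketch below; the two word hypotheses and every probability are invented for illustration:

```python
# Toy rescoring of two word hypotheses for one utterance with the
# mixture-of-LMs approximation.
acts = ("statement", "question")
p_act_given_E = {"statement": 0.7, "question": 0.3}        # P(Ui | E)
p_word_given_act = {                                       # P(Wi | Ui)
    "i see":   {"statement": 0.05, "question": 0.01},
    "you see": {"statement": 0.01, "question": 0.06},
}
p_acoustic_given_word = {"i see": 0.4, "you see": 0.3}     # P(Ai | Wi)

def mixture_of_lms_score(w):
    # P(Ai) is the same for every hypothesis, so it drops out of the argmax.
    return p_acoustic_given_word[w] * sum(
        p_word_given_act[w][u] * p_act_given_E[u] for u in acts)

best = max(p_word_given_act, key=mixture_of_lms_score)
print(best)  # 'i see'
```

The act posterior P(Ui | E) here favors statements, so the statement-flavored language model dominates the mixture and the statement-like hypothesis wins.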
Experimental Results