
Page 1: Approaches for Automatically Tagging Affect

Approaches for Automatically Tagging Affect

Nathanael Chambers, Joel Tetreault, James Allen

University of Rochester

Department of Computer Science

Page 2: Approaches for Automatically Tagging Affect

Affective Computing

• Why use computers to detect affect?
  – Make human-computer interaction more natural
    • Computers express emotion
    • And detect the user's emotion
    • Tailor responses to the situation
  – Use affect for text summarization

• Understanding affect improves computer-human interaction systems

Page 3: Approaches for Automatically Tagging Affect

From the Psychologist's P.O.V.

• However, if computers can detect affect, they can also help humans understand it

• By observing changes in emotion and attitude as people converse, psychologists can determine appropriate treatments for patients

Page 4: Approaches for Automatically Tagging Affect

Marriage Counseling

• Emotion and communication are important to mental and physical health

• Psychological theories suggest that how well a couple copes with serious illness is related to how well they interact to deal with it

• Poor interactions (i.e., disengagement during conversations) can at times exacerbate an illness

• The hypothesis was tested by observing the engagement levels of conversations between married couples presented with a task

Page 5: Approaches for Automatically Tagging Affect

Example Interactions

• Good interaction sequence:
W: Well I guess we'd just have to develop a plan wouldn't we?
H: And we would be just more watchful or plan or maybe not, or be together more when the other one went to do something
W: In other words going together
H: Going together more
W: That's right. And working more closely together and like you say, doing things more closely together. And I think we certainly would want to share with the family openly what we felt was going on so we could kind of work out family plans

• Poor interaction sequence:
W: So how would you deal with that?

H: I don't know. I'd probably try to help. And you know, go with you or do things like that if I, if I could. And you know, I don't know. I would try to do the best I could to help you

Page 6: Approaches for Automatically Tagging Affect

Testing theory

• Record and transcribe conversations of married couples presented with a "what-if" scenario of one of them having Alzheimer's
  – Participants asked to discuss how they would deal with the sickness

• Tag sentences of transcripts with affect-related codes. Certain textual patterns evoke negative or positive connotations

• Use distribution of tags to look for correlations between communication and marital satisfaction

• Use tag distribution to decide on treatment for couple

Page 7: Approaches for Automatically Tagging Affect

Problem

• However, tagging (step 2) is time-consuming, requires training time for new annotators, and can be unreliable

• Solution: use computers to do tagging work so psychologists can spend more time with patients and less time coding

Page 8: Approaches for Automatically Tagging Affect

Goals

• Develop algorithms to automatically tag transcripts of the Marriage Counseling Corpus (Shields, 1997)

• Develop a tool that pre-tags a transcript using the best algorithm, which human annotators can then quickly correct

Page 9: Approaches for Automatically Tagging Affect

Outline

• Background

• Marriage Counseling Corpus

• N-gram based approaches

• Information-Retrieval/Call Routing approaches

• Results

• CATS Tool

Page 10: Approaches for Automatically Tagging Affect

Background

• Affective computing, or detecting emotion in texts or from a user, is a young field

• Earliest approaches used keyword matching
• Tagged dictionaries with grammatical features (Boucouvalas and Ze, 2002)
• Statistical methods: LSA (Webmind project), TSB (Wu et al., 2000) to tag a dialogue
• Liu et al. (2003) use common-sense rules to detect emotion in emails

Page 11: Approaches for Automatically Tagging Affect

New Methods for Tagging Affect

• Our approaches differ from others in two ways:
  – Use different statistical methods based on computing N-grams
  – Tag individual sentences as opposed to discourse chunks
• Our approaches are based on methods that have been successful in another domain: discourse act tagging

Page 12: Approaches for Automatically Tagging Affect

Marriage Counseling Corpus

• 45 annotated transcripts of married couples working on a task concerning Alzheimer's

• Collected by psychologists in the Center for Future Health, Rochester, NY

• Transcripts broken into “thought units” – one or more sentences that represent how the speaker feels toward a topic (4,040 total)

• Tagging thought units takes into account positive and negative words, level of detail, sensitivity, and comments on health, family, travel, etc.

Page 13: Approaches for Automatically Tagging Affect

Code Tags

• DTL – "Detail" (11.2%): speaker's verbal content is concise and distinct with regard to illness, emotions, dealing with death:
  – "It would be hard for me to see you so helpless"

• GEN – "General" (41.6%): verbal content toward illness is vague or generic, or the speaker does not take ownership of emotions:
  – "I think that it would be important"

Page 14: Approaches for Automatically Tagging Affect

Code Tags

• SAT – "Statements About the Task" (7.2%): the couple discusses what the task is, how to perform it:
  – "I thought I would be the caregiver"

• TNG – "Tangent" (2.9%): statements that are way off topic

• ACK – "Acknowledgments" (22.8%) of the other speaker's comments:
  – "Yeah", "right"

Page 15: Approaches for Automatically Tagging Affect

N-Gram Based Approaches

n-gram: a sequence of n consecutive words, used to encode the likelihood that the phrase will appear again

Involves splitting a sentence into chunks of consecutive words of length n

“I don’t know what to say”

1-gram (unigram): I, don't, know, what, to, say
2-gram (bigram): I don't, don't know, know what, what to, to say
3-gram (trigram): I don't know, don't know what, know what to, what to say
…
n-gram
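
For concreteness, here is a minimal Python sketch of the n-gram splitting described above (tokenization here is a simple whitespace split; the slides do not specify the exact preprocessing):

    def ngrams(sentence, n):
        """Return all consecutive word sequences of length n."""
        tokens = sentence.split()
        return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    sentence = "I don't know what to say"
    print(ngrams(sentence, 1))  # unigrams: ['I', "don't", 'know', 'what', 'to', 'say']
    print(ngrams(sentence, 2))  # bigrams: ["I don't", "don't know", ...]
    print(ngrams(sentence, 3))  # trigrams: ["I don't know", "don't know what", ...]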

Page 16: Approaches for Automatically Tagging Affect

Frequency Table (Training)

n-gram                  GEN   DTL   ACK   SAT
"I don't want to be"    0.0   1.0   0.0   0.0
"Don't want to be"      0.2   0.8   0.0   0.0
"I"                     0.5   0.2   0.2   0.1
"Yeah"                  0.3   0.2   0.4   0.1

Each entry: the probability that the n-gram is labeled with that tag
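
A table like this can be estimated from the annotated corpus by counting. The following Python sketch of the training step is an assumption about the details (the function name and toy corpus are illustrative); it builds P(tag | n-gram) for every n-gram:

    from collections import Counter, defaultdict

    def build_frequency_table(tagged_units, max_n=5):
        """Estimate P(tag | n-gram) by counting how often each n-gram
        occurs in thought units labeled with each tag."""
        counts = defaultdict(Counter)  # n-gram -> Counter over tags
        for text, tag in tagged_units:
            tokens = text.split()
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    counts[" ".join(tokens[i:i + n])][tag] += 1
        table = {}
        for gram, tag_counts in counts.items():
            total = sum(tag_counts.values())
            table[gram] = {t: c / total for t, c in tag_counts.items()}
        return table

    # Hypothetical toy corpus of (thought unit, tag) pairs:
    corpus = [("I don't want to be chained", "DTL"),
              ("yeah", "ACK"),
              ("I think that it would be important", "GEN")]
    table = build_frequency_table(corpus)
    print(table["I"])  # {'DTL': 0.5, 'GEN': 0.5}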

Page 17: Approaches for Automatically Tagging Affect

N-Gram Motivation

Advantages
• Encode not just keywords, but also word ordering, automatically
• Models are not biased by hand-coded lists of words, but are completely dependent on real data
• Learning features of each affect type is relatively fast and easy

Disadvantages
• Long-range dependencies are not captured
• Dependent on having a corpus of data to train from
  – Sparse data for low-frequency affect tags adversely affects the quality of the n-gram model

Page 18: Approaches for Automatically Tagging Affect

Naïve Approach

P(tag_i | utt) = max_{j,k} P(tag_i | ngram_jk)
• Where i is one of {GEN, DTL, ACK, SAT, TNG}
• And ngram_jk is the j-th n-gram of length k
• So for all n-grams in a thought unit, find the one with the highest probability for a given tag, and select that tag
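
A minimal sketch of this selection rule, reusing the table produced by the build_frequency_table sketch shown earlier (that helper name is illustrative, not from the slides):

    def naive_tag(utterance, table, max_n=5):
        """P(tag_i | utt) = max_{j,k} P(tag_i | ngram_jk): return the tag
        of the single most confident n-gram in the utterance."""
        best_tag, best_prob = None, -1.0
        tokens = utterance.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = " ".join(tokens[i:i + n])
                for tag, prob in table.get(gram, {}).items():
                    if prob > best_prob:
                        best_tag, best_prob = tag, prob
        return best_tag, best_prob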

Page 19: Approaches for Automatically Tagging Affect

Naïve Approach Example

I don’t want to be chained to a wall.

k   Tag   Top N-gram            Probability
1   GEN   don't                 0.665
2   GEN   to a                  0.692
3   GEN   <s> I don't           0.524
4   DTL   don't want to be      0.833
5   DTL   I don't want to be    1.00

Page 20: Approaches for Automatically Tagging Affect

N-Gram Approaches

• Weighted Approach
  – Weight the longer n-grams higher in the stochastic model (see the sketch after this list)
• Lengths Approach
  – Include a length-of-utterance factor, capturing the differences in utterance length between affect tags
• Weights with Lengths Approach
  – Combine Weighted with Lengths
• Repetition Approach
  – Combine all the above information, with overlap of words between thought units
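
As an illustration of the Weighted Approach, the sketch below scores each tag by summing n-gram probabilities weighted by n-gram length. The slides do not give the exact weighting scheme, so using the length n itself as the weight is an assumption:

    from collections import defaultdict

    def weighted_tag(utterance, table, max_n=5):
        """Sum P(tag | n-gram) over all n-grams, weighting longer
        n-grams more heavily (weight = n is an assumed choice)."""
        scores = defaultdict(float)
        tokens = utterance.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = " ".join(tokens[i:i + n])
                for tag, prob in table.get(gram, {}).items():
                    scores[tag] += n * prob
        return max(scores, key=scores.get) if scores else None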

Page 21: Approaches for Automatically Tagging Affect

Repetition Approach

Many acknowledgment (ACK) utterances were being mistagged as GEN by the previous approaches. Most of the errors came from grounding that involved word repetition:

A - So then you check that your tire is not flat.
B - Check the tire

• We created a model that takes into account word repetition in adjacent utterances in a dialogue.

• We also include a length probability to capture the Lengths Approach.
• Only unigrams are used to avoid sparseness in the training data.
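
The repetition signal itself can be as simple as word overlap with the previous utterance. A minimal sketch (the exact formula used in the model is not given in the slides):

    def repetition_overlap(current, previous):
        """Fraction of the current thought unit's words that also
        appear in the previous speaker's thought unit."""
        cur = set(current.lower().split())
        prev = set(previous.lower().split())
        return len(cur & prev) / len(cur) if cur else 0.0

    print(repetition_overlap("check the tire",
                             "so then you check that your tire is not flat"))
    # ~0.67 -- high overlap suggests an acknowledgment (ACK)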

Page 22: Approaches for Automatically Tagging Affect

IR-based approaches

• Work is based on the call-routing algorithm of Chu-Carroll and Carpenter (1999)

• Problem: route a user’s call to a financial call center to the correct destination

• Do this by converting a query from the user (speech converted to text) into a vector and comparing it with a list of possible destination vectors in a database
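
The comparison step is standard cosine similarity between term-weight vectors. A minimal sketch with hypothetical tag vectors (the real vectors come from the trained database):

    import math

    def cosine(u, v):
        """Cosine similarity between two sparse term-weight vectors (dicts)."""
        dot = sum(u[t] * v.get(t, 0.0) for t in u)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    # Hypothetical tag vectors over n-gram terms:
    gen = {"i": 0.5, "yeah": 0.3}
    ack = {"i": 0.2, "yeah": 0.4}
    query = {"yeah": 1.0}
    print(cosine(query, gen), cosine(query, ack))  # ACK scores higher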

Page 23: Approaches for Automatically Tagging Affect

Database Table (Training)

Database (same construction as the frequency table):

n-gram                  GEN   DTL   ACK   SAT
"I don't want to be"    0.0   1.0   0.0   0.0
"Don't want to be"      0.2   0.8   0.0   0.0
"I"                     0.5   0.2   0.2   0.1
"yeah"                  0.3   0.2   0.4   0.1

Query: "yeah, that's right" is converted to a vector over the same n-gram entries (0.0, 1.0, 0.0, 0.0)

Cosine comparison: the query (thought unit) is compared against each tag vector in the database

Page 24: Approaches for Automatically Tagging Affect

Database Creation

• Construct the database in the same manner as the N-gram frequency table

• The database is then normalized

• Filter: Inverse Document Frequency (IDF) – lowers the weight of terms that occur in many documents:

IDF(t) = log2(N / d(t))

• Where d(t) is the number of tags containing n-gram t, and N is the total number of tags
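
As a worked example of the formula above: with N = 5 tags, a term that occurs under all five tags gets IDF = log2(5/5) = 0, while a term that occurs under only one tag gets IDF = log2(5/1) ≈ 2.32. In Python:

    import math

    def idf(d_t, n_tags):
        """IDF(t) = log2(N / d(t)); common terms get weight near zero."""
        return math.log2(n_tags / d_t)

    print(idf(5, 5))  # 0.0   (term occurs under every tag)
    print(idf(1, 5))  # ~2.32 (term occurs under only one tag)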

Page 25: Approaches for Automatically Tagging Affect

Method 1: Routing-based method

• Modified the call-routing method with entropy (a measure of disorder) to further reduce the contribution of terms that occur frequently

• Also created two more terms (rows in the database):
  – Sentence length: tags may be correlated with sentences of a certain length
  – Repetition: acknowledgments tend to repeat words stated in the previous thought unit
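
The slides do not give the exact entropy formula; one plausible sketch down-weights a term in proportion to the entropy of its tag distribution, so a term spread evenly across tags contributes nothing:

    import math

    def entropy_weight(tag_probs):
        """Assumed entropy-style filter: 1 - H(p) / H_max, where H is the
        entropy of the term's tag distribution. Uniform -> 0, peaked -> 1."""
        probs = [p for p in tag_probs if p > 0]
        h = -sum(p * math.log2(p) for p in probs)
        h_max = math.log2(len(tag_probs))
        return 1.0 - h / h_max if h_max else 0.0

    print(entropy_weight([0.25, 0.25, 0.25, 0.25]))  # 0.0 (uninformative term)
    print(entropy_weight([1.0, 0.0, 0.0, 0.0]))      # 1.0 (highly informative term)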

Page 26: Approaches for Automatically Tagging Affect

Method 1: Example

Cosine scores for tags compared against the query vector for "I don't want to be chained to a wall":

DTL = 0.073
GEN = 0.072
SAT = 0.014
ACK = 0.002
TNG = 0.0001

Page 27: Approaches for Automatically Tagging Affect

Method 2: Direct Comparison

• Instead of comparing queries to a normalized database of exemplar documents, compare them to all individual training sentences

• Advantage: no normalizing or construction of documents

• The cosine test is used to get the top ten matches. Scores of matches with the same tag are summed, and the tag with the highest sum is selected.
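
A minimal sketch of this nearest-neighbor style selection, reusing the cosine() helper from the earlier IR sketch (how sentence vectors are built is assumed to happen elsewhere):

    from collections import defaultdict

    def direct_compare(query_vec, tagged_vectors, k=10):
        """Score the query against every sentence vector, keep the top k
        matches, sum cosine scores per tag, and return the best tag."""
        scored = sorted(((cosine(query_vec, vec), tag)
                         for vec, tag in tagged_vectors), reverse=True)[:k]
        sums = defaultdict(float)
        for score, tag in scored:
            sums[tag] += score
        return max(sums, key=sums.get)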

Page 28: Approaches for Automatically Tagging Affect

Method 2: Example

Cosine Score   Tag   Sentence
0.64           SAT   Are we supposed to get them?
0.60           GEN   That sounds good
0.60           TNG   That's due to my throat
0.56           DTL   But if I said to you I don't want…
0.55           DTL   If it were me, I'd want to be a guinea pig to try things

DTL selected with a total score of 1.11

Page 29: Approaches for Automatically Tagging Affect

Evaluation

• Performed six-fold cross-validation over the Marriage Corpus and Switchboard Corpus

• Averaged scores from each of the six evaluations
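
A minimal sketch of the six-fold setup (evaluate() is a hypothetical scoring function standing in for the train-then-tag accuracy computation):

    def six_fold_splits(units, folds=6):
        """Partition the data into six folds; train on five, test on one."""
        size = len(units) // folds
        for i in range(folds):
            test = units[i * size:(i + 1) * size]
            train = units[:i * size] + units[(i + 1) * size:]
            yield train, test

    # accuracies = [evaluate(train, test) for train, test in six_fold_splits(corpus)]
    # print(sum(accuracies) / len(accuracies))  # average over the six runs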

Page 30: Approaches for Automatically Tagging Affect

Results

Naive    Weighted   Lengths   Weights with Lengths   Repetition
66.80%   67.43%     64.35%    66.02%                 66.60%

6-Fold Cross Validation for N-gram Methods

Original   Entropy   Repetition   Length   Repetition and Length   Direct
61.37%     66.16%    66.39%       66.76%   66.76%                  63.16%

6-Fold Cross Validation for IR Methods

Page 31: Approaches for Automatically Tagging Affect

Discussion

• N-gram approaches do slightly better than IR approaches over the Marriage Counseling Corpus

• Incorporating the additional features of sentence length and repetition improves both models

• The entropy model is better than IDF in the call-routing system (a 4% boost)

• Psychologists are currently using the tool to tag their work. Notably, the computer sometimes tags better than the human annotators

Page 32: Approaches for Automatically Tagging Affect

CATS

CATS: An Automated Tagging System for affect and other similar information retrieval tasks.

• Written in Java for cross-platform interoperability.
• Implements the Naïve approach with unigrams and bigrams only.
• Builds the stochastic models automatically from a tagged corpus, input by the user into the GUI display.
• Automatically tags new data using the user's models. Each tag also receives a confidence score, allowing the user to hand-check the dialogue quickly and with greater confidence.

Page 33: Approaches for Automatically Tagging Affect

The CATS GUI provides a clear workspace for text and tags. Tagging new data and training on old data is done with a mouse click.

Page 34: Approaches for Automatically Tagging Affect

Customizable models are available. Create your own list of tags, provide a training corpus, and build a new model.

Page 35: Approaches for Automatically Tagging Affect

Tags are marked with confidence scores based on the probabilistic models.