Extracting What We Think and How We Feel from What We Say in Social Media: Subjective Information Extraction

Lu Chen, Kno.e.sis Center, Wright State University
http://cdryan.com/blog/think-feel/

Upload: lu-chen

Post on 01-Nov-2014

TRANSCRIPT

Page 1: Extracting What We Think and How We Feel from What We Say in Social Media

Subjective Information Extraction, Lu Chen 1

Extracting What We Think and How We Feel from What We Say in Social Media

---- Subjective Information Extraction

Lu Chen
Kno.e.sis Center
Wright State University

http://cdryan.com/blog/think-feel/

Page 2: Extracting What We Think and How We Feel from What We Say in Social Media

Directions

• From coarse-grained to fine-grained
  – Document level -> sentence level -> expression level
  – General sentiment -> domain-dependent sentiment -> target-dependent sentiment
  – Sentiment -> subjective information
    • Sentiment (positive/negative/neutral) -> emotion (happy, sad, angry, surprised, etc.)
    • Other types of subjective information: intent, suggestion/recommendation, wish/expectation, outlook, viewpoint, etc.

• From static to dynamic
  – Our attitudes can change during social communication.
    • Modeling, detecting, and tracking the change of attitude
    • What leads to the change of attitude? E.g., a persuasion campaign

[Figure: subjective information along two axes: static -> dynamic and coarse-grained -> fine-grained]

Page 3: Extracting What We Think and How We Feel from What We Say in Social Media

Extracting a diverse and richer set of sentiment-bearing expressions, including formal and slang words/phrases

Assessing the target-dependent polarity of each sentiment expression

A novel formulation of assigning polarity to a sentiment expression as a constrained optimization problem over the tweet corpus

Extracting Diverse Sentiment Expressions With Target-dependent Polarity from Twitter

Lu Chen, Wenbo Wang, Meenakshi Nagarajan, Shaojun Wang, and Amit P. Sheth

Page 4: Extracting What We Think and How We Feel from What We Say in Social Media

Approach

Extracting Candidate Expressions

Identifying Inter-Expression Relations

Assessing Target-dependent Polarity

Page 5: Extracting What We Think and How We Feel from What We Say in Social Media

Extracting Candidate Expressions

• Root word: a word that is considered sentiment-bearing in a general sense.

• Collecting root words from
  – General-purpose sentiment lexicons: MPQA, General Inquirer, and SentiWordNet
  – Slang dictionary: Urban Dictionary

• For each tweet, selecting the “on-target” root words, and extracting all the n-grams that contain at least one selected root word as candidates
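
The candidate-extraction step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `extract_candidates`, the whitespace tokenizer, the n-gram length limit `max_n`, and the tiny root-word set are all hypothetical, and the “on-target” selection is skipped (every root word found in the tweet is treated as on-target).

```python
def extract_candidates(tweet, root_words, max_n=3):
    """Return every n-gram (n <= max_n) that contains at least one root word."""
    # Naive tokenization (an assumption): split on whitespace, strip punctuation.
    tokens = [t.strip(".,!?\"'").lower() for t in tweet.split()]
    tokens = [t for t in tokens if t]
    candidates = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            ngram = tokens[start:start + n]
            # Keep the n-gram only if it covers a sentiment-bearing root word.
            if any(tok in root_words for tok in ngram):
                candidates.append(" ".join(ngram))
    return candidates


# Hypothetical root words; the slides collect them from MPQA, General Inquirer,
# SentiWordNet, and Urban Dictionary.
ROOT_WORDS = {"good", "overrated", "predictable"}

candidates = extract_candidates("The Avengers was good.", ROOT_WORDS, max_n=2)
print(candidates)
```

Most of these n-grams will not be real sentiment expressions; the later steps (relations and polarity assessment) are what decide which candidates carry sentiment toward the target.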

Page 6: Extracting What We Think and How We Feel from What We Say in Social Media

Identifying Inter-Expression Relations

• Connecting the candidate expressions via two types of inter-expression relations: consistency relation and inconsistency relation

• Basic ideas:
  – A sentiment expression is inconsistent with its negation; two sentiment expressions linked by contrasting conjunctions are likely to be inconsistent.
  – Two adjacent expressions are consistent if they do not overlap, no extra negation is applied to them, and no contrasting conjunction connects them.
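
A rough Python sketch of these rules for a pair of candidate spans in one sentence is given below. Everything concrete in it is an assumption made for illustration: the word lists `CONTRAST` and `NEGATIONS`, the function name `relation`, the span representation (token start index, exclusive end index), and the choice to leave negated pairs undecided rather than resolve them.

```python
CONTRAST = {"but", "however", "although", "though", "yet"}  # assumed word list
NEGATIONS = {"not", "no", "never"}                          # assumed word list


def relation(tokens, span_a, span_b):
    """Classify two candidate spans as 'consistent', 'inconsistent', or None.

    Each span is (start, end) over `tokens`, with `end` exclusive.
    """
    (a_start, a_end), (b_start, b_end) = sorted([span_a, span_b])
    if b_start < a_end:
        return None  # overlapping spans: no relation is asserted
    between = tokens[a_end:b_start]
    if any(t in CONTRAST for t in between):
        return "inconsistent"  # linked by a contrasting conjunction
    negated = any(t in NEGATIONS for t in between) or (
        a_start > 0 and tokens[a_start - 1] in NEGATIONS
    )
    if negated:
        return None  # an extra negation is present: left undecided in this sketch
    return "consistent"


tokens = "the avengers was good but the plot was just simple minded and predictable".split()
print(relation(tokens, (3, 4), (12, 13)))   # "good" vs. "predictable"
print(relation(tokens, (9, 11), (12, 13)))  # "simple minded" vs. "predictable"
```

Counting how often each pair is labeled consistent or inconsistent across the corpus would give the edge weights of the two relation networks.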

Page 7: Extracting What We Think and How We Feel from What We Say in Social Media

An Example

1. I saw The Avengers yesterday evening. It was long but it was very good!

2. I do enjoy The Avengers, but it's both overrated and problematic.

3. Saw the avengers last night. Mad overrated. Cheesy lines and horrible writing. Very predictable.

4. The avengers was good but the plot was just simple minded and predictable.

5. The Avengers was good. I was not disappointed.

Page 8: Extracting What We Think and How We Feel from What We Say in Social Media

Assessing Target-dependent Polarity

• For each candidate expression $c_i$,
  – P-Probability $\Pr^{P}(c_i)$ – the probability that $c_i$ indicates positive sentiment
  – N-Probability $\Pr^{N}(c_i)$ – the probability that $c_i$ indicates negative sentiment

$$\Pr^{P}(c_i) + \Pr^{N}(c_i) = 1$$

• For each pair of candidate expressions $c_i$ and $c_j$,
  – Consistency probability – the probability that $c_i$ and $c_j$ have the same polarity:

$$\Pr^{cons}(c_i, c_j) = \Pr^{P}(c_i)\,\Pr^{P}(c_j) + \Pr^{N}(c_i)\,\Pr^{N}(c_j)$$

  – Inconsistency probability – the probability that $c_i$ and $c_j$ have different polarities:

$$\Pr^{incons}(c_i, c_j) = \Pr^{P}(c_i)\,\Pr^{N}(c_j) + \Pr^{N}(c_i)\,\Pr^{P}(c_j)$$
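
As a small worked illustration, the two pairwise probabilities can be computed directly from the P-Probabilities, since each N-Probability is one minus the corresponding P-Probability. The function names and the example values 0.9 and 0.2 are assumptions chosen for this sketch.

```python
def cons_prob(p_i, p_j):
    """Probability that two expressions share a polarity.

    p_i and p_j are P-Probabilities; the N-Probabilities are 1 - p_i and 1 - p_j.
    """
    return p_i * p_j + (1 - p_i) * (1 - p_j)


def incons_prob(p_i, p_j):
    """Probability that two expressions have different polarities."""
    return p_i * (1 - p_j) + (1 - p_i) * p_j


# Hypothetical values: one expression is likely positive, the other likely negative.
p_good, p_predictable = 0.9, 0.2
print(round(cons_prob(p_good, p_predictable), 2))
print(round(incons_prob(p_good, p_predictable), 2))
```

For any pair the two values sum to 1, which follows from expanding the product of the two per-expression constraints.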

Page 9: Extracting What We Think and How We Feel from What We Say in Social Media

An Optimization Model

• We want the consistency and inconsistency probabilities derived from the P-Probabilities and N-Probabilities of the candidates to be as close as possible to the expectations suggested by the relation networks.

• Objective Function:

$$\text{minimize} \quad \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left[ w_{ij}^{cons} \left(1 - \Pr^{cons}(c_i, c_j)\right)^2 + w_{ij}^{incons} \left(1 - \Pr^{incons}(c_i, c_j)\right)^2 \right]$$

where $w_{ij}^{cons}$ and $w_{ij}^{incons}$ are the weights of the edges (the frequency of the relations) between $c_i$ and $c_j$ in the consistency and inconsistency relation networks, and $n$ is the total number of candidate expressions.
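
The objective can be sketched in Python as below. This is an illustration only: the slides do not say which solver is used, so plain projected gradient descent with numerical gradients is swapped in here, and the three candidates, their edge weights, the initial P-Probabilities, and all function names are hypothetical.

```python
def cons_prob(p_i, p_j):
    return p_i * p_j + (1 - p_i) * (1 - p_j)


def incons_prob(p_i, p_j):
    return p_i * (1 - p_j) + (1 - p_i) * p_j


def objective(p, w_cons, w_incons):
    """Weighted squared gap between each observed relation and its probability."""
    total = 0.0
    for (i, j), w in w_cons.items():
        total += w * (1 - cons_prob(p[i], p[j])) ** 2
    for (i, j), w in w_incons.items():
        total += w * (1 - incons_prob(p[i], p[j])) ** 2
    return total


def minimize(p, w_cons, w_incons, steps=500, lr=0.1, eps=1e-6):
    """Projected gradient descent (an assumed solver); keeps each value in [0, 1]."""
    p = list(p)
    for _ in range(steps):
        grad = []
        for k in range(len(p)):
            hi = p[:k] + [p[k] + eps] + p[k + 1:]
            lo = p[:k] + [p[k] - eps] + p[k + 1:]
            diff = objective(hi, w_cons, w_incons) - objective(lo, w_cons, w_incons)
            grad.append(diff / (2 * eps))
        p = [min(1.0, max(0.0, x - lr * g)) for x, g in zip(p, grad)]
    return p


# Hypothetical candidates: 0 = "good", 1 = "predictable", 2 = "overrated".
w_incons = {(0, 1): 1.0}  # "good" and "predictable" joined by "but"
w_cons = {(1, 2): 1.0}    # "predictable" and "overrated" appear side by side
p_init = [0.9, 0.5, 0.5]  # only "good" has a lexicon prior in this toy setup

p_final = minimize(p_init, w_cons, w_incons)
print(objective(p_init, w_cons, w_incons), objective(p_final, w_cons, w_incons))
print([round(x, 2) for x in p_final])
```

In this toy network the positive prior on "good" propagates through the inconsistency edge to push "predictable" toward negative, and then through the consistency edge to push "overrated" the same way.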

Page 10: Extracting What We Think and How We Feel from What We Say in Social Media

The Example

Page 11: Extracting What We Think and How We Feel from What We Say in Social Media

Evaluation

• Datasets:
  – 168,005 tweets about movies
  – 258,655 tweets about persons

• Gold standard:
  – 1,500 tweets labeled with sentiment expressions and overall polarities for the movie targets
  – 1,500 tweets labeled with sentiment expressions and overall polarities for the person targets

• Baseline methods:
  – MPQA, GI, SWN: For each extracted root word regarding the target, simply look up its polarity in MPQA, General Inquirer, and SentiWordNet, respectively.
  – PROP: a propagation approach proposed by Qiu et al. (2009)
  – COM-const: Assign 0.5 to all the candidates as their initial P-Probabilities.
  – COM-gelex: Initialize the candidates’ polarities according to the root word set.

Reference: Qiu, G.; Liu, B.; Bu, J.; and Chen, C. 2009. Expanding domain sentiment lexicon through double propagation. In Proc. of IJCAI.

Page 12: Extracting What We Think and How We Feel from What We Say in Social Media

Page 13: Extracting What We Think and How We Feel from What We Say in Social Media

Page 14: Extracting What We Think and How We Feel from What We Say in Social Media

Application

Page 15: Extracting What We Think and How We Feel from What We Say in Social Media

Relevance of User Groups Based on Demographics and Participation to Social Media Based Prediction

---- A Case Study of 2012 U.S. Republican Presidential Primaries

Lu Chen, Wenbo Wang, and Amit P. Sheth

• Existing studies on predicting election results assume that all users should be treated equally.

• How do different groups of users differ in their power to predict election results?

1. Providing a detailed analysis of the social media users on different dimensions

2. Estimating the “vote” of each user by analyzing his/her tweets, and predicting the results based on “vote-counting”

3. Examining the predictive power of different user groups in predicting the results of Super Tuesday races in 10 states
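
The “vote-counting” idea in step 2 can be sketched as follows. The slides do not specify how a user's vote is estimated, so the rule used here (the candidate with the highest net count of positive tweets, and no vote if that count is not positive) is an assumption, as are the function names, the candidate labels "A" and "B", and the toy data.

```python
from collections import Counter


def estimate_vote(tweets):
    """Guess one user's vote from (candidate, polarity) pairs, polarity in {+1, 0, -1}."""
    score = Counter()
    for candidate, polarity in tweets:
        score[candidate] += polarity
    if not score:
        return None
    best, best_score = max(score.items(), key=lambda kv: kv[1])
    # Assumed rule: a user with no net-positive candidate casts no vote.
    return best if best_score > 0 else None


def predict_winner(users):
    """Count one estimated vote per user and return (winner, vote counts)."""
    votes = Counter()
    for tweets in users.values():
        vote = estimate_vote(tweets)
        if vote is not None:
            votes[vote] += 1
    winner = votes.most_common(1)[0][0] if votes else None
    return winner, votes


# Made-up users and tweets for one state.
users = {
    "u1": [("A", 1), ("B", -1)],
    "u2": [("B", 1), ("B", 1), ("A", 0)],
    "u3": [("A", 1)],
    "u4": [("B", -1)],
}
winner, votes = predict_winner(users)
print(winner, dict(votes))
```

Restricting `users` to one user group at a time (for example, by engagement degree or political preference) and comparing the predicted winners would mirror the group-wise comparison described in step 3.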

Page 16: Extracting What We Think and How We Feel from What We Say in Social Media

User Categorization

Engagement Degree

Tweet Mode

Content Type

Political Preference

Location

Page 17: Extracting What We Think and How We Feel from What We Say in Social Media

Electoral Prediction with Different User Groups

Revealing the challenge of identifying the vote intent of the “silent majority”

Retweets may not necessarily reflect users' attitudes.

Page 18: Extracting What We Think and How We Feel from What We Say in Social Media

Electoral Prediction with Different User Groups

Prediction of a user’s vote based on more opinion tweets is not necessarily more accurate than prediction based on more information tweets.

The right-leaning user group provides the most accurate prediction result. It correctly predicts the winners in 8 out of 10 states. To some extent, this demonstrates the importance of identifying likely voters in electoral prediction.

Page 19: Extracting What We Think and How We Feel from What We Say in Social Media

Emotion

• Discovering Fine-grained Sentiment in Suicide Notes: Classify each sentence from suicide notes into 15 emotional categories, e.g., love, pride, guilt, blame, hopelessness, etc.

• Emotion Identification from Twitter Data: 7 emotion categories, including joy, sadness, anger, love, fear, thankfulness, and surprise
  – Can we automatically create a large emotion dataset with high-quality labels from Twitter? How?
  – What features can effectively improve the performance of supervised machine learning algorithms?
  – How much performance will be gained by increasing the size of the training data?
  – Can the system developed on Twitter data be directly applied to identify emotions from other datasets?

Page 20: Extracting What We Think and How We Feel from What We Say in Social Media

What’s next?

[Figure: subjective information along two axes: static -> dynamic and coarse-grained -> fine-grained]

Detecting the change of attitude during persuasive communication

Discriminating other types of subjective information from sentiment, e.g., wish, intent

Page 21: Extracting What We Think and How We Feel from What We Say in Social Media

Thank you!