SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani. The Eleventh Extended Semantic Web Conference (ESWC2014), May 2014

Upload: knowledge-media-institute-the-open-university

Post on 26-Jan-2015


DESCRIPTION

Lexicon-based approaches to Twitter sentiment analysis are gaining popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, in which a collection of words is marked with fixed sentiment polarities. However, a word's sentiment orientation (positive, neutral, negative) and/or sentiment strength can change depending on context and targeted entities. In this paper we present SentiCircle, a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when compared to the state-of-the-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure.

TRANSCRIPT

Page 1: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani

The Eleventh Extended Semantic Web Conference (ESWC2014), May 2014

Page 2: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Outline

o Sentiment Analysis

o Approaches

o SentiCircles

o Evaluation

o Conclusion

Page 3: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Sentiment Analysis

“Sentiment analysis is the task of identifying positive and negative opinions, emotions and evaluations in text”

Opinion: yes, It is sunny, but also very humid :(

Opinion: The weather is great today :)

Fact: I think its almost 30 degrees today

Page 4: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Conventional Text

o Rich

o Formal Language {Well Structured Sentences}

o Domain Specific

Page 5: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Twitter Data

o Short (140-Chars)

o Noisy {gr8, lol, :), :P}

o Open Environment

Page 6: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Sentiment Analysis

Approaches

Lexicon-Based Approach

Machine Learning Approach

Page 7: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Machine Learning Approach

Page 8: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Lexicon-based Approach

Example tweet: I had nightmares all night long last night :(

Text Processing Algorithm + Sentiment Lexicon → Negative

Sentiment Lexicon: great, success, sad, pretty, down, wrong, horrible, beautiful, mistake, love, good
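The lexicon-based pipeline on this slide can be illustrated with a minimal sketch. It assumes a toy lexicon built from the words shown on the slide, with +1/-1 scores chosen only for illustration; the entries "nightmares" and ":(" and the function name `label_tweet` are hypothetical additions so that the example tweet can be scored. It is not the text processing algorithm used in the talk.

```python
# Toy lexicon: words from the slide with assumed +1/-1 polarities.
LEXICON = {
    "great": 1, "success": 1, "pretty": 1, "beautiful": 1, "love": 1, "good": 1,
    "sad": -1, "down": -1, "wrong": -1, "horrible": -1, "mistake": -1,
    # Hypothetical entries, added only so the slide's example tweet can be scored.
    "nightmares": -1, ":(": -1,
}


def label_tweet(text, lexicon=LEXICON):
    """Sum the fixed polarities of known tokens and map the sign to a label."""
    score = 0
    for token in text.lower().split():
        if token in lexicon:
            score += lexicon[token]
        else:
            # Retry after stripping common punctuation (keeps emoticons intact above).
            score += lexicon.get(token.strip(".,!?#\"'"), 0)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


print(label_tweet("I had nightmares all night long last night :("))
```

Because every word carries one fixed score, a sketch like this cannot react to context, which is the limitation the later slides address.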

Page 9: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Machine Learning Approach on Twitter?

o Requires Labeled Twitter Corpora

- Labor Intensive Task

- Distant Supervision (Noisy Labeling)

o Domain Specific

- Re-Training with new domains

Page 10: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Lexicon-based Approach on Twitter?

Traditional Lexicons

- MPQA & SentiWordNet, etc.

- Not tailored to Twitter noisy data: lol, gr8, wow, :), :P

- Fixed number of words

Sentiment Lexicon: great, success, sad, pretty, down, wrong, horrible, beautiful, mistake, love, good

Twitter terms: gr8, lol, :), :P

Page 11: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Twitter-specific Lexicon-based Methods

- Such as SentiStrength

- Rule-based method for sentiment analysis on the social web

- Uses Thelwall-Lexicon

- Built specifically to work on social data

- Contains lists of emoticons, slang, abbreviations, etc.

Page 12: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Thelwall-Lexicon & SentiStrength

• Fixed Number of words

• Offer Context-Insensitive Prior Sentiment Orientations and Strength of words

[Figure: the Sentiment Lexicon labels "Great" as Positive in both "Great Problem" and "Great Smile"]

Page 13: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

We Need.. Unsupervised Approach

Understands the Semantics of Words

Captures their Contexts

Updates Sentiment

Page 14: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles

Page 15: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles

Lexicon-based Approach

Builds Dynamic representation of words

Captures Contextual & Conceptual Semantics of words

Updates words’ sentiment orientation and strength accordingly

Page 16: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Contextual Semantics

“Words that occur in similar context tend to have similar meaning” Wittgenstein (1953)

“You Shall know the word by the company it keeps” Firth (1955)

[Figure: the words "Great" and "Amazing" shown with the context words Problem, Look, Smile, Concert, Song, Weather, Loss, Game, Taylor Swift]

Page 17: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Capturing Contextual Semantics

(1) Context-Term Vector: Term (m) with context terms C1, C2, …, Cn (example: m = Great; context terms Smile, Look)

(2) For each context term: Degree of Correlation with m, and Prior Sentiment from a Sentiment Lexicon

(3) Contextual Sentiment Orientation (Positive, Negative, Neutral) and Contextual Sentiment Strength [-1 (very negative), +1 (very positive)]
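A minimal sketch of how a Context-Term Vector might be collected, assuming that the context terms of m are simply the other tokens of the tweets in which m occurs. The helper name `context_term_vector` and the whitespace tokenisation are illustrative assumptions, and raw co-occurrence counts are only a stand-in for the Degree of Correlation, which this slide does not define.

```python
from collections import Counter


def context_term_vector(term, tweets):
    """Count, for each other token, the number of tweets it shares with `term`."""
    counts = Counter()
    for tweet in tweets:
        tokens = set(tweet.lower().split())  # one vote per tweet
        if term in tokens:
            counts.update(t for t in tokens if t != term)
    return counts


# Hypothetical mini-corpus, for illustration only.
tweets = ["great smile", "great look", "great smile today", "bad day"]
print(context_term_vector("great", tweets))
```

Each count would then be paired with the context term's prior sentiment from the lexicon, which is the input to step (3).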

Page 18: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

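Capturing Contextual Semantics: SentiCircles Model

Each context term Ci of Term (m), with its Degree of Correlation and Prior Sentiment, is placed at a point (xi, yi) in the SentiCircle of m (example: m = Great, Ci = Smile):

xi = ri * cos(θi)

yi = ri * sin(θi)

ri = TDOC(Ci)

θi = Prior_Sentiment(Ci) * π

[Figure: SentiCircle of "Great" with the context term "Smile" at (xi, yi), radius ri and angle θi; X and Y axes from -1 to +1; quadrants Very Positive, Positive, Very Negative, Negative; Neutral Region]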
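The polar-to-Cartesian mapping on this slide can be written out as a short sketch. It assumes the radius (the TDOC value) and the prior sentiment in [-1, +1] are already available as numbers; the function name `senticircle_point` and the sample values are hypothetical.

```python
import math


def senticircle_point(r, prior_sentiment):
    """Map a context term to (x, y) using theta = prior_sentiment * pi."""
    theta = prior_sentiment * math.pi
    return r * math.cos(theta), r * math.sin(theta)


# Hypothetical values: a context term with radius 0.8 and prior sentiment +0.5.
x, y = senticircle_point(0.8, 0.5)
print(round(x, 3), round(y, 3))
```

With this mapping a positive prior sentiment gives a positive y and a negative prior gives a negative y, while a prior of 0 puts the term on the positive X axis.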

Page 19: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles (Example)

Page 20: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

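Overall Contextual Sentiment

Senti-Median of SentiCircle

Sentiment Function

[Figure: SentiCircle of term m with a context term Ci at (xi, yi), radius ri and angle θi; X and Y axes from -1 to +1; quadrants Very Positive, Positive, Very Negative, Negative; Neutral Region]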
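One way to read the Senti-Median is as a single representative point for all context-term points of a SentiCircle, to which a sentiment function is then applied. The sketch below assumes the geometric median, approximated with Weiszfeld's iteration, and a sentiment function that thresholds the y coordinate. The slide does not spell out either choice, so the algorithm, the function names and the `neutral_threshold` value are assumptions for illustration.

```python
import math


def geometric_median(points, iterations=100, eps=1e-9):
    """Approximate the point minimising the summed Euclidean distance to `points`."""
    gx = sum(x for x, _ in points) / len(points)  # start from the centroid
    gy = sum(y for _, y in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for x, y in points:
            d = math.hypot(x - gx, y - gy)
            if d < eps:
                continue  # simplification: skip points that coincide with the estimate
            num_x += x / d
            num_y += y / d
            denom += 1.0 / d
        if denom == 0.0:
            break
        gx, gy = num_x / denom, num_y / denom
    return gx, gy


def sentiment_from_median(median_point, neutral_threshold=0.05):
    """Assumed sentiment function: the sign of y, with a small neutral band."""
    _, y = median_point
    if y > neutral_threshold:
        return "positive"
    if y < -neutral_threshold:
        return "negative"
    return "neutral"


# Hypothetical SentiCircle points, for illustration only.
points = [(0.2, 0.5), (-0.1, 0.6), (0.3, 0.4)]
print(sentiment_from_median(geometric_median(points)))
```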

Page 21: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles & Conceptual Semantics

Page 22: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Enriching SentiCircles with Conceptual Semantics

Sushi time for fabulous Jesse's last day on dragons den [Jesse: Person]

@Stace_meister Ya, I have Rugby in an hour [Rugby: Sport]

Dear eBay, if I win I owe you a total 580.63 bye paycheck [eBay: Company]

Page 23: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Enriching SentiCircles with Conceptual Semantics

Cycling under a heavy rain.. What a #luck!

[Figure: "rain" is linked to the concept Weather Condition, shown alongside Wind, Snow and Humidity]

Page 24: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles for Tweet-level Sentiment Analysis

Detecting the overall Sentiment of a given tweet message (positive vs. negative)

Page 25: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles for Tweet-level Sentiment Analysis

(1) The Median Method

Cycling under a heavy rain.. what a #luck!

[Figure: one S-Median (Senti-Median) is computed for each term of the tweet]

The Median of Senti-Medians
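A rough sketch of the Median Method, assuming each term of the tweet already has a Senti-Median (x, y). For brevity it combines them with a component-wise median from Python's `statistics` module, which is a simplification and not necessarily the median used in the talk. The function name, the neutral threshold and the sample points are hypothetical.

```python
from statistics import median


def tweet_sentiment(term_medians, neutral_threshold=0.05):
    """Label a tweet from the median of its terms' Senti-Medians."""
    if not term_medians:
        return "neutral"  # no scorable terms
    # Component-wise median, used here as a simple stand-in.
    y = median(py for _, py in term_medians)
    if y > neutral_threshold:
        return "positive"
    if y < -neutral_threshold:
        return "negative"
    return "neutral"


# Hypothetical Senti-Medians for three terms of a tweet.
print(tweet_sentiment([(-0.1, 0.3), (0.2, -0.4), (0.1, -0.2)]))
```

Taking a median instead of a mean keeps one strongly polarised term from dominating the label for the whole tweet.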

Page 26: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

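Tweet-level Sentiment Analysis

(2) The Pivot Method

Example tweet tk: I like my new iPad

[Figure: SentiCircle of the pivot term pj (iPad) in tweet tk; the other tweet terms w1 (like), w2 (new), ..., wn are placed at radii r1, r2 and angles θ1, θ2, with labels Sj1, Sj2, ...; X and Y axes with quadrants Very Positive, Positive, Very Negative, Negative]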

Page 27: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Experiments

Page 28: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Experimental Setup

(1) Datasets

(2) Sentiment Lexicons

- SentiWordNet [3]

- MPQA Subjectivity Lexicon [4]

- Thelwall-Lexicon [5]

Page 29: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Experimental Setup

(3) Baselines

1. Lexicon-Labeling (MPQA & SentiWordNet)

- Average of positive & negative words in a tweet.

2. SentiStrength (State-of-the-art)

- Lexicon-based method built for Twitter

- Applies a set of syntactic rules

Page 30: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Results

Page 31: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Sentiment Detection with Contextual Semantics

Page 32: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles vs. Lexicon-Labeling Methods

Method          Accuracy (%)   F-Measure (%)
MPQA-Lex        52.35          52.34
SentiWNet-Lex   52.74          52.30
SentiCircle     74.96          68.06

Page 33: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircle vs. SentiStrength (best-performing method per dataset)

Dataset     Accuracy        F1
OMD         SentiCircle     SentiCircle
HCR         SentiCircle     SentiStrength
STS-Gold    SentiStrength   SentiStrength
Average     SentiCircle     SentiStrength

Page 34: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Why Such Variance..

• The sentiment class distribution in our datasets

– SentiCircle produces, on average, 2.5% lower recall than SentiStrength on positive tweet detection

– Our datasets contain more negative tweets than positive ones

• Topic Distribution in the three datasets

• More research is required

Page 35: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Sentiment Detection with Conceptual Semantics

Win/Loss in Accuracy and F-measure of incorporating conceptual semantics into SentiCircles, where Mdn: SentiCircle with Median method, Pvt: SentiCircle with Pivot method.

Page 36: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Conclusion

• We proposed a novel semantic sentiment approach called SentiCircle

• SentiCircles capture context and update sentiment accordingly

• We showed how SentiCircle can be applied to tweet-level sentiment analysis

• SentiCircles outperformed other lexicon-labeling methods and overtook the state-of-the-art SentiStrength approach in accuracy, with a marginal drop in F-measure.

Page 37: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

SentiCircles for Sentiment Analysis

1. Tweet-level Sentiment Analysis: Saif et al. (2014) at ESWC Conference. Crete, Greece

2. Entity-Level Sentiment Analysis: Saif et al. (2014), IPM Journal

3. Sentiment Lexicon Adaptation: Saif et al. (2014) at ESWC Conference

4. Dynamic Stopwords Generation: Saif et al. (2014) at LREC Conference. Reykjavik, Iceland

5. Sentiment Patterns Discovery: Saif et al. (2014) submitted to ISWC Conference.

Page 38: SentiCircles for Contextual and Conceptual Semantic Sentiment Analysis of Twitter

Thank You

Email: [email protected]: hrsaif

Website: tweenator.com