introduction to sentiment analysis - eth zürich - … · 2017-05-12 · introduction to sentiment...

COSS

Machine Learning and Modelling for Social Networks
Lloyd Sanders, Olivia Woolley, Iza Moize, Nino Antulov-Fantulin
D-GESS: Computational Social Science

Introduction to Sentiment Analysis



Overview

§ What is Sentiment Analysis?
§ Classifying Sentiment
§ Feature Creation and Selection
§ Use Case: Public Health and Vaccine Sentiment
§ References and Reading


What is Sentiment Analysis?

Sentiment analysis is the operation of understanding the intent or emotion behind a given piece of text.

§ In its simplest form, a positive or negative polarity is assigned to a text.
§ The sentiment ‘space’ is being expanded to accommodate more than a single dimension.
§ Classification with respect to emotion (joy, frustration, sadness) is also being done.
§ Classification with respect to stance (either for or against a position) is similar to, but not entirely the same as, sentiment classification.
§ Sentiment analysis is also known as opinion mining.


What is Sentiment Analysis?

Sentiment analysis is the operation of understanding the intent or emotion behind a given piece of text.

§ Sentiment analysis is a branch of computer science that overlaps heavily with machine learning and computational linguistics.
§ Why? One seeks to understand the general opinion across many documents within a corpus (e.g., all tweets relating to a given brand).
§ Labeling every document by hand is labor intensive, so we use ML: a classifier is trained on a labeled dataset (supervised learning) and then labels the remaining documents automatically.


Sentiment examples in the wild – Business Reviews

Yelp.com


Sentiment examples in the wild – Product Reviews

Amazon.com


Emotional Arcs of Fiction

§ Vonnegut posited in his Master’s thesis that there were 6 basic shapes to a story:
    § Rags to Riches (rise)
    § Riches to Rags (fall)
    § Man in a hole (fall then rise)
    § Icarus (rise then fall)
    § Cinderella (rise then fall then rise)
    § Oedipus (fall then rise then fall)
§ A team used sentiment analysis to verify this with over 1700 English fiction novels [Reagan et al. 2016].


Emotional Arcs of Fiction

[Figure: the six emotional arcs, one panel each: Rags to Riches, Riches to Rags, Man in a Hole, Icarus, Cinderella, Oedipus. Source: Reagan et al. 2016]


DeepBreath

https://github.com/googlecloudplatform/deepbreath


Why is it useful?

Sentiment could be considered a latent variable in social behavior. Measuring and understanding this behavior could lead to a better understanding of social phenomena.

§ Sentiment analysis often correlates well with real-world observables.
§ For commercial aspects: brand awareness
§ Stock fluctuations and public opinion [Bollen et al. 2010]
§ Health related: vaccine sentiment vs. coverage [covered later]
§ Public safety: situational awareness in mass emergencies via Twitter [Verma et al. 2011]


Sentiment Classification is Difficult

§ Sentiment is very domain specific, and on social media it is also time specific.
§ Different contexts alter the polarity of words (e.g., ‘unpredictable’ is good in a movie review but bad when describing driving).
§ Slang: ‘Movie is bad ass’
§ Sentiment has multiple levels:
    § Document or message (tweet/SMS) level
    § Term/aspect level: “The coffee was amazing, but the atmosphere was dull”
    § Word level / within-word level (severity of sentiment per word)
§ Negations and sloppy spelling/structure compound the difficulty.


Classifying Sentiment: A Recipe

§ Gather a large quantity of data; the more the better
§ Construct a labeled set of data sorted into your classes (e.g. positive/negative/neutral)
§ Split your labeled set into training/test sets
§ Construct your features
§ Train a classifier (SVM, Naïve Bayes, ensemble methods, neural nets)
§ Assess accuracy
§ Let loose on the full set
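The recipe can be sketched end to end in a few lines. This is a minimal illustration only, assuming a tiny hand-written corpus (`LABELED`, hypothetical), bag-of-words features, and a multinomial Naïve Bayes classifier with add-one smoothing written from scratch; a real project would more likely use a library such as scikit-learn and far more data.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus: (text, label) pairs standing in for a labeled set.
LABELED = [
    ("great food and great service", "pos"),
    ("loved the friendly staff", "pos"),
    ("amazing coffee and lovely atmosphere", "pos"),
    ("really good value", "pos"),
    ("terrible food and rude staff", "neg"),
    ("awful coffee and dull atmosphere", "neg"),
    ("bad service never again", "neg"),
    ("really poor value", "neg"),
]


def featurize(text):
    """Bag-of-words features: token -> count."""
    return Counter(text.lower().split())


def train_naive_bayes(examples):
    """Collect per-class document counts and per-class word counts."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for word, n in featurize(text).items():
            word_counts[label][word] += n
            vocab.add(word)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word, n in featurize(text).items():
            if word in vocab:  # ignore words never seen in training
                score += n * math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(predict(model, text) == label for text, label in examples)
    return hits / len(examples)


# Split the labeled set into training and test sets (fixed seed for repeatability).
rng = random.Random(0)
shuffled = LABELED[:]
rng.shuffle(shuffled)
train, test = shuffled[:6], shuffled[6:]

model = train_naive_bayes(train)
print("test accuracy:", accuracy(model, test))
```

With only eight documents the reported accuracy is meaningless; the point is the order of the steps: label, split, featurize, train, assess, and only then apply the model to unlabeled text.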


Labeling Training Data

“Put junk in, get junk out”

§ It’s important to have well-labeled data, and there are a number of ways of obtaining it:
    § Self-annotation, which can lead to biases
    § Crowdsourced annotation (e.g., mturk.com, crowdflower.com)


Labeling Training Data

“Put junk in, get junk out”

§ Pseudo-labeling data can have a net positive effect.
§ On social media, for example, pseudo-labels can be derived from hashtags or emoticons/emojis [Kouloumpis et al. 2011; Davidov et al. 2010].
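As a rough illustration of emoticon-based pseudo-labeling, the sketch below assigns a label only when a tweet contains emoticons of a single polarity. The emoticon sets and the function name `pseudo_label` are hypothetical and chosen for this example; they are not taken from the cited papers.

```python
# Hypothetical emoticon sets; a real project would use a larger curated list.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", "=)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}


def pseudo_label(tweet):
    """Return (label, text_without_emoticons), or None if the signal is absent or mixed."""
    tokens = tweet.split()
    has_pos = any(tok in POSITIVE_EMOTICONS for tok in tokens)
    has_neg = any(tok in NEGATIVE_EMOTICONS for tok in tokens)
    if has_pos == has_neg:  # no emoticon at all, or conflicting emoticons
        return None
    label = "pos" if has_pos else "neg"
    # Strip the emoticons so a classifier cannot simply memorise the label source.
    text = " ".join(
        tok for tok in tokens
        if tok not in POSITIVE_EMOTICONS and tok not in NEGATIVE_EMOTICONS
    )
    return label, text


tweets = [
    "Just got my flu shot :)",
    "Waited two hours at the clinic :(",
    "Vaccine news today",
    "So tired :( but done :)",
]
for tweet in tweets:
    print(repr(tweet), "->", pseudo_label(tweet))
```

Labels obtained this way are noisy, so it is usually worth checking a sample by hand before training on them.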


Constructing text features

§ A common practice is the bag-of-words technique, which discards structure (word order) but retains word counts.
§ Each document in the corpus is disassembled into a bag of words, represented as a vector.
§ TF-IDF weighting can be applied to this bag-of-words vector [see Iza’s lecture on Big Data].
§ The bag-of-words vector for each document will be sparse, which can be exploited in computation.

Bag of words:

$$\vec{d}_i = [x_1, x_2, \cdots, x_n]^T$$
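To make the bag-of-words vector concrete, here is a small sketch that builds the count vector for each document over a shared vocabulary, a sparse version of the same vector, and a TF-IDF weighting. The three example documents are made up, and the TF-IDF formula used (raw count times log(N / df)) is only one common variant, assumed here for illustration.

```python
import math
from collections import Counter

docs = [
    "the coffee was amazing",
    "the atmosphere was dull",
    "amazing amazing coffee",
]

tokenized = [doc.split() for doc in docs]
vocab = sorted({word for tokens in tokenized for word in tokens})


def bag_of_words(tokens):
    """Dense count vector [x_1, ..., x_n] over the shared vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def sparse_bag_of_words(tokens):
    """Sparse form: only the non-zero entries, keyed by vocabulary index."""
    index = {word: i for i, word in enumerate(vocab)}
    return {index[word]: n for word, n in Counter(tokens).items()}


def tf_idf(tokens):
    """Weight raw counts by idf = log(N / df), one common TF-IDF variant."""
    n_docs = len(tokenized)
    counts = Counter(tokens)
    weights = []
    for word in vocab:
        df = sum(1 for other in tokenized if word in other)  # document frequency
        weights.append(counts[word] * math.log(n_docs / df))
    return weights


print(vocab)
for tokens in tokenized:
    print(bag_of_words(tokens), sparse_bag_of_words(tokens))
```

Words that appear in most documents receive a small weight, while words confined to a single document receive the largest one.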


N-grams

§ N-grams are a simple technique to capture some document structure.
§ When considering words: a unigram is a single word, a bigram is a sequence of two words.
§ Bigrams can begin to capture negations such as ‘this food was not_good’, but will miss ‘this food was not_very_good’ (less severe).
§ One can construct skip n-grams, e.g., not_*_good.
§ N-grams are also possible with characters: ‘good’ is a character 4-gram, ‘happy’ is a character 5-gram.
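The three kinds of n-gram mentioned here can be generated with a few list comprehensions. The function names below are illustrative, and the skip n-gram variant shown (exactly one interior word replaced by `*`) is an assumption about what "skip n-gram" means in this context.

```python
def word_ngrams(tokens, n):
    """Contiguous word n-grams, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_ngrams(tokens, n):
    """Word n-grams with exactly one interior word replaced by '*' (needs n >= 3)."""
    grams = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        for gap in range(1, n - 1):
            grams.append("_".join(window[:gap] + ["*"] + window[gap + 1:]))
    return grams


def char_ngrams(text, n):
    """Contiguous character n-grams."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


tokens = "this food was not very good".split()
print(word_ngrams(tokens, 2))
print(skip_ngrams(tokens, 3))
print(char_ngrams("happy", 4))
```

The bigram list contains `not_very` and `very_good` but nothing linking `not` to `good`; the skip trigram `not_*_good` recovers that link.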


Negations and how to deal with them

“This food was not good”

§ A negation word can flip the polarity of an entire sentence.
§ Bigrams or trigrams go some way towards capturing this, as mentioned before.
§ How else can one take negations into account?
    § Preprocess the text to mark negated words: ‘not good’ => ‘good_neg’
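One possible form of this preprocessing step is sketched below. It assumes a small, hypothetical list of negation words and a widely used convention that the slide itself does not specify: every word after a negation is marked until the next punctuation mark.

```python
import re

# Hypothetical, deliberately small negation list.
NEGATION_WORDS = {"not", "no", "never", "cannot"}
CLAUSE_END = re.compile(r"[.,;:!?]")


def mark_negations(text):
    """Append '_neg' to every word between a negation word and the next punctuation mark."""
    marked = []
    negating = False
    for token in text.lower().split():
        word = CLAUSE_END.sub("", token)
        if not word:
            negating = False  # token was pure punctuation
            continue
        if word in NEGATION_WORDS or word.endswith("n't"):
            negating = True
            marked.append(word)
        elif negating:
            marked.append(word + "_neg")
        else:
            marked.append(word)
        if CLAUSE_END.search(token):
            negating = False  # punctuation closes the negation scope
    return marked


print(mark_negations("This food was not good"))
print(mark_negations("The food was not very good, but the coffee was great."))
```

After this step, ‘good’ and ‘good_neg’ are separate features, so a bag-of-words classifier can learn opposite weights for them.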


Sentiment Lexicons

There are many publicly available sentiment (and emotion) lexicons. These can be used to construct complementary features for your classifiers (especially for out-of-vocabulary words, i.e. those not in your corpus).

§ General Inquirer [http://www.wjh.harvard.edu/~inquirer/homecat.htm]
§ SentiWordNet [http://sentiwordnet.isti.cnr.it/]
§ Bing Liu’s lexicons [https://www.cs.uic.edu/~liub/]
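A lexicon can be turned into a handful of numeric features per document. The sketch below uses a made-up four-word lexicon (`TOY_LEXICON`, hypothetical) rather than any of the resources listed, each of which has its own file format and scoring scheme; the particular features computed are likewise only an example.

```python
# Hypothetical toy lexicon: word -> polarity score. Published lexicons ship in
# their own file formats and would need to be parsed into a mapping like this.
TOY_LEXICON = {"amazing": 1.0, "good": 0.5, "dull": -0.5, "terrible": -1.0}


def lexicon_features(tokens, lexicon=TOY_LEXICON):
    """Summarise the lexicon scores of the tokens that appear in the lexicon."""
    scores = [lexicon[tok] for tok in tokens if tok in lexicon]
    positives = [s for s in scores if s > 0]
    negatives = [s for s in scores if s < 0]
    return {
        "n_pos": len(positives),                   # number of positive words
        "n_neg": len(negatives),                   # number of negative words
        "total": sum(scores),                      # net polarity
        "max_pos": max(positives, default=0.0),    # strongest positive word
        "min_neg": min(negatives, default=0.0),    # strongest negative word
    }


tokens = "the coffee was amazing but the atmosphere was dull".split()
print(lexicon_features(tokens))
```

These features can be appended to the n-gram features, which lets the classifier use words it never saw in its own training corpus.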


Feature Vectors for short informal texts: a bird’s eye view

§ Here is a sample of the features used by a state-of-the-art Twitter sentiment classifier:
    § Word n-grams (up to 4), skip n-grams with 1 missing word
    § Character n-grams up to 5
    § All caps: number of words in capitals
    § Number of hashtags
    § Number of contiguous punctuation marks (exclamation, question, or mixed), and whether the text ends with one of these
    § Presence of emoticons


Feature Vectors: a bird’s eye view

§ Here is a sample of the features used by a state-of-the-art Twitter sentiment classifier:
    § Number of elongated words (one character repeated more than twice: ‘raaaaaad’)
    § Normalization: URLs to http://someurl; user IDs to @someuser
    § Part-of-speech tagged tweets: number of occurrences of each POS tag
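Several of these surface features can be approximated with regular expressions. The patterns below are rough guesses written for this example (the emoticon pattern in particular is hypothetical and far from complete), and they are not the feature extractors used in the cited work.

```python
import re

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
ELONGATED_RE = re.compile(r"([a-zA-Z])\1{2,}")   # same letter three or more times
PUNCT_RUN_RE = re.compile(r"[!?]{2,}")           # runs of ! and/or ?
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")   # hypothetical, very rough pattern


def normalize(tweet):
    """Replace URLs and user mentions with fixed placeholder tokens."""
    tweet = URL_RE.sub("http://someurl", tweet)
    return USER_RE.sub("@someuser", tweet)


def is_all_caps(token):
    """True for words of two or more letters written entirely in capitals."""
    word = token.strip("!?.,;:")
    return len(word) > 1 and word.isalpha() and word.isupper()


def surface_features(tweet):
    text = normalize(tweet)
    tokens = text.split()
    return {
        "n_all_caps": sum(1 for t in tokens if is_all_caps(t)),
        "n_hashtags": sum(1 for t in tokens if t.startswith("#") and len(t) > 1),
        "n_elongated": sum(1 for t in tokens if ELONGATED_RE.search(t)),
        "n_punct_runs": len(PUNCT_RUN_RE.findall(text)),
        "ends_with_punct": text.rstrip().endswith(("!", "?")),
        "has_emoticon": bool(EMOTICON_RE.search(text)),
    }


example = "@bob this is sooooo GOOD!!! #happy :) http://t.co/abc"
print(normalize(example))
print(surface_features(example))
```

Part-of-speech counts are left out because they require a tagger, which is beyond a standard-library sketch.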


Classifying your sentiment

§ Sentiment analysis is a classification problem.
§ Typically, people have used Naïve Bayes or Support Vector Machines (SVM) in the past [Mohammad et al. 2013].
§ Artificial neural nets are also becoming more popular now [Nogueira dos Santos & Gatti, 2014].


Sentiment Accuracy

§ How does one construct a baseline for accuracy?
§ As always, we refer to a ‘better than chance’ baseline.
§ In the context of pos/neg/neu, the classes are often not evenly split.
§ One can use the most likely (majority) class as the baseline: if pos makes up 70% of the labels, always predicting pos already gives 70% accuracy.
§ For multiple classes, as a single measure, it is common to use the macro F-score.
§ For the binary case, the go-to measure is the area under the ROC curve (AUC ROC).
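The effect of class imbalance on these measures can be seen with a small calculation. The sketch below uses made-up labels with a 70/20/10 split, predicts the majority class for every document, and compares plain accuracy with the macro F-score (the unweighted mean of the per-class F1 scores).

```python
from collections import Counter


def majority_baseline(train_labels):
    """Most frequent training label; always predicting it gives the baseline."""
    return Counter(train_labels).most_common(1)[0][0]


def f1_per_class(y_true, y_pred, label):
    """F1 score for one class, treating that class as 'positive'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Made-up, imbalanced labels: 70% pos, 20% neg, 10% neu.
y_true = ["pos"] * 7 + ["neg"] * 2 + ["neu"] * 1
baseline = majority_baseline(y_true)
y_pred = [baseline] * len(y_true)

baseline_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("baseline label:", baseline)
print("accuracy:", baseline_accuracy)
print("macro F1:", macro_f1(y_true, y_pred))
```

The majority-class predictor reaches 70% accuracy while its macro F-score stays low, because it scores zero on two of the three classes. This is why the macro F-score is preferred for imbalanced multi-class data.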


Ablation Experiments of features

Kiritchenko et al. 2014


Use case: Public Health and Vaccine Sentiment

The authors wanted to investigate the correlation between sentiment towards vaccines and vaccine uptake. Usual survey methods are expensive, so they took a new approach using Twitter. They then took the model further to understand what outbreaks would look like if such sentiments were clustered similarly within real-world communities.


Synopsis

Salathe & Khandelwal [2011]

§ Analyzed over 100k Twitter users over 6 months to assess how sentiment towards a new (2009) H1N1 vaccine correlated with actual coverage of the vaccine.
§ 478k tweets (320k relevant to H1N1): 256k neutral, 27k negative, 36k positive (an imbalanced data set).


Sentiment-Coverage Correlation

Salathe & Khandelwal [2011]

Given this correlation, the technique shows promise as a cost-effective probing tool for staging vaccine interventions.


Methodology of the Classifier

§ Built a web app which was used by 64 ‘volunteers’.
§ Each student was given 1400 tweets (with heavy overlap w.r.t. other students’ tweet sets).
§ 47k tweets were rated. Each tweet was labeled by majority decision.
§ The high-confidence* test set numbered 630. These were those rated 44 times.
§ Built an ensemble classifier: Naïve Bayes (pos/neg) and Max. Entropy (irrelevant/neu)
§ Accuracy was 84.29%
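Labeling by majority decision can be sketched as follows. The function name, the minimum number of ratings, and the agreement threshold are hypothetical parameters chosen for illustration; the slide does not give the exact rules the authors applied.

```python
from collections import Counter


def aggregate_ratings(ratings, min_ratings=1, min_agreement=0.5):
    """Majority label for one tweet, or None if too few ratings or too little agreement.

    min_ratings and min_agreement are illustrative thresholds, not values
    taken from the study.
    """
    if not ratings or len(ratings) < min_ratings:
        return None
    label, votes = Counter(ratings).most_common(1)[0]
    agreement = votes / len(ratings)
    if agreement <= min_agreement:  # require a strict majority by default
        return None
    return label, agreement


print(aggregate_ratings(["pos", "pos", "neg"]))
print(aggregate_ratings(["pos", "neg"]))
print(aggregate_ratings(["neg"], min_ratings=2))
```

Tweets that return `None` can be dropped or sent back for more ratings; keeping only tweets with many ratings and high agreement is one way to build a high-confidence test set.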


Social Network – Homophily and Herd Immunity

§ Created a directed graph of 40k nodes and 685k edges.
§ Nodes are users with either a pos or a neg sentiment score.
§ A directed edge is created if a user follows another user.
§ Measured the assortative mixing (coefficient r) of users with a qualitatively similar opinion on vaccination (homophily):
    § 0 < r <= 1: nodes are mostly connected to nodes of the same type
    § -1 <= r < 0: nodes are mostly connected to nodes of the opposite type
§ r = 0.144: people with the same vaccination opinion are more likely to be connected. Sentiment gives a measure of information flow.
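For intuition about r, the sketch below computes an assortativity coefficient on a toy follower graph. It assumes Newman's coefficient for categorical node attributes, r = (Σ e_ii − Σ a_i b_i) / (1 − Σ a_i b_i), where e_ij is the fraction of edges running from type i to type j; the slide does not state which formula the authors used, and the graph here is invented.

```python
from collections import Counter


def assortativity(edges, node_type):
    """Newman's assortativity coefficient r for a categorical node attribute.

    edges: iterable of (follower, followed) pairs; node_type: node -> category.
    Undefined (division by zero) if every edge joins nodes of a single type.
    """
    edges = list(edges)
    total = len(edges)
    pair_counts = Counter((node_type[u], node_type[v]) for u, v in edges)
    types = sorted(set(node_type.values()))
    e = {pair: n / total for pair, n in pair_counts.items()}
    a = {t: sum(e.get((t, s), 0.0) for s in types) for t in types}  # edge sources
    b = {t: sum(e.get((s, t), 0.0) for s in types) for t in types}  # edge targets
    same = sum(e.get((t, t), 0.0) for t in types)
    expected = sum(a[t] * b[t] for t in types)
    return (same - expected) / (1 - expected)


# Invented example: users A and B are 'pos', users C and D are 'neg'.
sentiment = {"A": "pos", "B": "pos", "C": "neg", "D": "neg"}
follows = [("A", "B"), ("B", "A"), ("C", "D"), ("D", "C"), ("A", "C"), ("C", "B")]
print(assortativity(follows, sentiment))
```

A graph whose edges only join users of the same type gives r = 1, and one whose edges only join opposite types gives r = -1; the mixed example above falls in between.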


Consequences for disease spread


References and Reading

§ Working with Text Data, user guide from scikit-learn, http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
§ Sentiment Analysis of Short Informal Texts, Kiritchenko et al., Journal of Artificial Intelligence Research, 2014
§ Stance and Sentiment in Tweets, Mohammad et al., arXiv, 2016
§ Assessing vaccination sentiments with online social media: Implications for infectious disease dynamics and control, Salathe & Khandelwal, PLoS Comp. Bio., 2011
§ The emotional arcs of stories are dominated by six basic shapes, Reagan et al., arXiv, 2016
§ Survey on Aspect-level sentiment analysis, Schouten and Frasincar, IEEE, 2016
§ Twitter mood predicts the stock market, Bollen, Mao, and Zeng, 2010
§ Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts, Cicero Nogueira dos Santos & Maira Gatti, 2014


Tweet test set

§ High-confidence test set: tweets had to have a percentage polarity of over 50%, or be agreed on by the two authors.