Introduction to Sentiment Analysis - ETH Z


Post on 12-Mar-2020






  • | | COSS

    Machine Learning and Modelling for Social Networks Lloyd Sanders, Olivia Woolley, Iza Moize, Nino Antulov-Fantulin D-GESS: Computational Social Science

    Introduction to Sentiment Analysis

  • | | COSS

    §  What is Sentiment Analysis?
    §  Classifying Sentiment
    §  Feature Creation and Selection
    §  Use Case: Public Health and Vaccine Sentiment
    §  References and Reading

    L Sanders 2


  • | | COSS

    What is Sentiment Analysis?

    Sentiment analysis is the task of understanding the intent or emotion behind a given piece of text.

    §  A positive or negative polarity is assigned to a text.
    §  The sentiment ‘space’ is being expanded to accommodate more than a single dimension.
    §  Classification with respect to emotion (joy, frustration, sadness) is also being done.
    §  Classification with respect to stance (either for or against a position) is similar to, but not entirely the same as, sentiment.
    §  Sentiment analysis is also known as opinion mining.

  • | | COSS

    What is Sentiment Analysis?

    §  Sentiment analysis is a branch of computer science and overlaps heavily with machine learning and computational linguistics.
    §  Why? One seeks to understand the general opinion across many documents within a corpus (e.g., all tweets relating to a given brand).
    §  Labeling every document by hand is labor intensive, so we train a classifier on a labeled dataset (supervised learning) and use it to label the remaining documents automatically.

  • | | COSS

    Sentiment examples in the wild – Business Reviews

  • | | COSS

    Sentiment examples in the wild – Product Reviews

  • | | COSS

    Emotional Arcs of Fiction

    §  Vonnegut posited in his Master’s thesis that there are 6 basic shapes to a story:
        §  Rags to Riches (rise)
        §  Riches to Rags (fall)
        §  Man in a Hole (fall then rise)
        §  Icarus (rise then fall)
        §  Cinderella (rise then fall then rise)
        §  Oedipus (fall then rise then fall)
    §  A team used sentiment analysis to test this on over 1700 English fiction novels [Reagan et al. 2016].

  • | | COSS

    Emotional Arcs of Fiction

    [Figure: emotional arcs of fiction, with panels labeled Oedipus, Icarus, Man in Hole, Rags to Riches, and Riches to Rags. Source: Reagan et al. 2016]


  • | | COSS

    Why is it useful?

    Sentiment could be considered a latent variable in social behavior. Measuring and understanding it could lead to a better understanding of social phenomena.

    §  Sentiment analysis often correlates well with real-world observables:
        §  Commercial: brand awareness
        §  Stock fluctuations and public opinion [Bollen et al. 2010]
        §  Health related: vaccine sentiment vs. coverage [Later]
        §  Public safety: situational awareness in mass emergencies via Twitter [Verma et al. 2011]

  • | | COSS

    Sentiment Classification is Difficult

    §  Sentiment is very domain specific and, on social media, also time specific.
    §  Different contexts alter the polarity of words (e.g., ‘unpredictable’ is good in a movie review but bad when describing driving).
    §  Slang: ‘Movie is bad ass’ is positive even though it contains ‘bad’.
    §  Sentiment has multiple levels:
        §  Document or message (tweet/SMS) level
        §  Term/aspect level: “The coffee was amazing, but the atmosphere was dull”
        §  Word level / within-word level (severity of sentiment per word)
    §  Negations and sloppy spelling or structure compound the difficulty.

  • | | COSS

    Classifying Sentiment: A Recipe

    §  Gather a large quantity of data: the more the better.
    §  Label a subset of the data with your classes (e.g. positive/negative/neutral).
    §  Split the labeled set into training and test sets.
    §  Construct your features.
    §  Train a classifier (SVM, Naïve Bayes, ensemble methods, neural nets).
    §  Assess accuracy on the test set.
    §  Let the classifier loose on the full set.
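The recipe above can be sketched end to end in plain Python. This is only a minimal illustration under stated assumptions: the eight-sentence corpus is invented, the features are raw word counts, the classifier is a hand-written multinomial Naïve Bayes with add-one smoothing, and the names `DATA`, `train_nb`, `predict_nb` and `accuracy` are hypothetical helpers, not part of the lecture.

```python
import math
import random
from collections import Counter, defaultdict

# Toy labeled corpus, invented for illustration only.
DATA = [
    ("great food and great service", "pos"),
    ("i love this place", "pos"),
    ("amazing coffee", "pos"),
    ("what a lovely evening", "pos"),
    ("terrible food and rude staff", "neg"),
    ("i hate the long wait", "neg"),
    ("awful coffee", "neg"),
    ("what a boring evening", "neg"),
]


def train_nb(examples):
    """Count classes and per-class words for multinomial Naive Bayes."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    for text, label in examples:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for w in text.split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    hits = sum(predict_nb(model, text) == label for text, label in examples)
    return hits / len(examples)


# Split into training and test sets, train, then assess on held-out data.
rng = random.Random(0)
shuffled = DATA[:]
rng.shuffle(shuffled)
split = int(0.75 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]
model = train_nb(train)
print("held-out accuracy:", accuracy(model, test))
```

With so little data the held-out accuracy is not meaningful; the point is only the order of the steps: label, split, featurize, train, assess.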

  • | | COSS

    Labeling Training Data

    “Put junk in, get junk out”

    §  It’s important to have well-labeled data, and there are a number of ways of obtaining it:
        §  Self-annotation, which can lead to biases
        §  Crowd-sourced annotation

  • | | COSS

    Labeling Training Data

    §  Pseudo-labeling data can have a net positive effect.
    §  On social media, for example, this can be achieved through hashtags or emoticons/emojis [Kouloumpis et al. 2011, Davidov et al. 2010].
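As a rough sketch of emoticon-based pseudo-labeling, the function below assigns a label only when a message contains emoticons of a single polarity. The emoticon sets and the name `pseudo_label` are assumptions made for illustration; the cited papers use their own, larger lists of hashtags and emoticons.

```python
# Hypothetical emoticon sets; real systems use much longer lists.
POSITIVE = {":)", ":-)", ":D", "=)"}
NEGATIVE = {":(", ":-(", ":'("}


def pseudo_label(tweet):
    """Return (label, text_without_emoticons), or None if the signal is absent or mixed."""
    tokens = tweet.split()
    has_pos = any(t in POSITIVE for t in tokens)
    has_neg = any(t in NEGATIVE for t in tokens)
    if has_pos == has_neg:  # neither kind present, or both kinds present
        return None
    # Drop the emoticons so a classifier cannot simply memorize the label source.
    kept = [t for t in tokens if t not in POSITIVE and t not in NEGATIVE]
    return ("pos" if has_pos else "neg", " ".join(kept))


print(pseudo_label("loved the show :)"))
print(pseudo_label("missed my train :("))
```

Labels obtained this way are noisy, so it is worth checking a sample by hand before training on them.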

  • | | COSS

    Constructing text features

    §  A common practice is the bag-of-words technique, which discards structure but keeps word counts.
    §  Each document in the corpus is disassembled into a bag of words, represented as a vector.
    §  TF-IDF can be applied to this bag-of-words vector [see Iza’s lecture on Big Data].
    §  The bag-of-words vector of each document will be sparse, which can be leveraged in computation.

    Bag of words

    $$d_i = [x_1, x_2, \cdots, x_n]^T$$
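A small sketch of the bag-of-words vector, assuming whitespace tokenization and reading each entry of the vector as the count of one vocabulary word in the document. The three documents are invented, and the TF-IDF weighting shown (raw count times log(N / document frequency)) is just one common variant, not necessarily the one used in the lecture.

```python
import math
from collections import Counter

# Invented mini-corpus for illustration.
docs = [
    "the coffee was good",
    "the coffee was bad",
    "good good service",
]
tokenized = [d.split() for d in docs]

# Fixed vocabulary order so every document maps to a vector of the same length.
vocab = sorted({w for doc in tokenized for w in doc})


def bow_vector(tokens):
    """Dense count vector: one entry per vocabulary word."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def sparse_bow(tokens):
    """Sparse form: store only the non-zero counts."""
    return dict(Counter(tokens))


# Document frequency: in how many documents each word occurs.
df = Counter(w for doc in tokenized for w in set(doc))


def tfidf(tokens):
    """Sparse TF-IDF weights using tf * log(N / df)."""
    n_docs = len(tokenized)
    return {w: c * math.log(n_docs / df[w]) for w, c in Counter(tokens).items()}


print(vocab)
print(bow_vector(tokenized[2]))
print(tfidf(tokenized[2]))
```

The dictionary form stores only the words that occur, which is what makes large vocabularies tractable.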


  • | | COSS

    §  N-grams are a simple technique to capture document structure.
    §  When considering words: a unigram is a single word, a bigram is a sequence of two words.
    §  Bigrams can begin to capture negations such as ‘this food was not_good’, but will miss ‘this food was not_very_good’ (less severe).
    §  One can construct skip n-grams, e.g.: not_*_good.
    §  N-grams are also possible with characters: ‘good’ is a character 4-gram, ‘happy’ is a character 5-gram.
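The n-gram variants above can be sketched as follows. The function names are hypothetical, tokens are joined with underscores to match the slide's notation, and the skip n-gram shown replaces exactly one interior word with `*`.

```python
def word_ngrams(tokens, n):
    """Contiguous word n-grams, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_ngrams(tokens, n):
    """Word n-grams with one interior word replaced by the wildcard '*'."""
    grams = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        for j in range(1, n - 1):  # never replace the first or last word
            grams.append("_".join(window[:j] + ["*"] + window[j + 1:]))
    return grams


def char_ngrams(text, n):
    """Contiguous character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


tokens = "this food was not very good".split()
print(word_ngrams(tokens, 2))
print(skip_ngrams(tokens, 3))
print(char_ngrams("happy", 3))
```

Here the skip trigram `not_*_good` matches both ‘not very good’ and ‘not so good’, which a plain bigram or trigram would treat as unrelated features.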


  • | | COSS

    Negations and how to deal with them

    “This food was not good”

    §  A negation word can flip the polarity of an entire sentence.
    §  Bigrams or trigrams go some way towards capturing this, as mentioned before.
    §  How else can one take negations into account?
    §  Preprocess text to take negations into account: ‘not good’ =>
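One widely used preprocessing scheme, shown here as an assumption rather than as the rewrite the slide had in mind, appends a `_NEG` suffix to every token between a negation word and the next punctuation mark. The negation list, the suffix and the name `mark_negation` are illustrative choices.

```python
import re

# Small illustrative negation list; real lists are longer.
NEGATIONS = {"not", "no", "never", "cannot"}
CLAUSE_END = {".", ",", ";", "!", "?"}


def mark_negation(text):
    """Suffix tokens with _NEG from a negation word up to the next punctuation mark."""
    tokens = re.findall(r"[a-z']+|[.,;!?]", text.lower())
    out, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False  # punctuation closes the negation scope
            out.append(tok)
        elif tok in NEGATIONS or tok.endswith("n't"):
            negated = True
            out.append(tok)
        else:
            out.append(tok + "_NEG" if negated else tok)
    return out


print(mark_negation("This food was not good, but the tea was fine."))
```

After this step ‘good’ and ‘good_NEG’ are separate vocabulary entries, so even a unigram bag of words can tell the two apart.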

  • | | COSS

    Sentiment Lexicons

    There are many publicly available sentiment (and emotion) lexicons. They can be used to construct complementary features for your classifiers (especially for out-of-vocabulary words, i.e. those not in your corpus).

    §  General Inquirer
    §  SentiWordNet
    §  Bing Liu’s lexicons
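A minimal sketch of turning a lexicon into classifier features. `TOY_LEXICON` is a made-up four-word stand-in, not an excerpt from any of the lexicons named above, and the feature names are illustrative.

```python
# Made-up word scores for illustration; not taken from any real lexicon.
TOY_LEXICON = {"amazing": 1.0, "good": 0.5, "dull": -0.5, "awful": -1.0}


def lexicon_features(tokens, lexicon=TOY_LEXICON):
    """Summarize the lexicon scores of the tokens that appear in the lexicon."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return {
        "n_pos": sum(s > 0 for s in scores),   # number of positive words
        "n_neg": sum(s < 0 for s in scores),   # number of negative words
        "total": sum(scores),                  # summed score
        "max_abs": max((abs(s) for s in scores), default=0.0),
    }


sentence = "the coffee was amazing but the atmosphere was dull"
print(lexicon_features(sentence.split()))
```

These values would typically be appended to the n-gram features rather than used on their own.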

  • | | COSS

    Feature Vectors for short informal texts: a bird’s eye view

    §  Here is a sample of the features used by a state-of-the-art Twitter sentiment classifier:
        §  Word n-grams (up to 4), skip n-grams with 1 missing word
        §  Character n-grams up to 5
        §  All caps: number of words in capitals
        §  Number of hashtags
        §  Number of runs of consecutive punctuation marks (exclamation, question, or mixed), and whether the last character is one of these
        §  Presence of emoticons

  • | | COSS

    Feature Vectors: a bird’s eye view

    §  Here is a sample of the features used by a state-of-the-art Twitter sentiment classifier:
        §  Number of elongated words (one character repeated more than twice: ‘raaaaaad’)
        §  Normalization: URLs to http://someurl; user IDs to @someuser
        §  Part-of-speech tagged tweets: number of occurrences of each POS tag
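Several of the surface features listed on these two slides can be computed with a few regular expressions. The sketch below is an approximation under stated assumptions: whitespace tokenization, ASCII punctuation, and the placeholder `@someuser` for user IDs; the function and key names are hypothetical, not those of the classifier the slides describe.

```python
import re


def normalize(tweet):
    """Replace URLs and user mentions with fixed placeholders."""
    tweet = re.sub(r"https?://\S+", "http://someurl", tweet)
    return re.sub(r"@\w+", "@someuser", tweet)


def surface_features(tweet):
    """Count simple surface cues in a (whitespace-tokenized) tweet."""
    tokens = tweet.split()
    return {
        # words written entirely in capitals (single letters excluded)
        "all_caps": sum(1 for t in tokens if len(t) > 1 and t.isalpha() and t.isupper()),
        "hashtags": sum(1 for t in tokens if t.startswith("#") and len(t) > 1),
        # a word character repeated three or more times in a row
        "elongated": sum(1 for t in tokens if re.search(r"(\w)\1{2,}", t)),
        # runs of two or more exclamation/question marks
        "punct_runs": len(re.findall(r"[!?]{2,}", tweet)),
        "ends_with_punct": tweet.rstrip().endswith(("!", "?")),
    }


print(normalize("thanks @bob see http://t.co/x"))
print(surface_features("WOW this is raaaaaad #happy!!!"))
```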

  • | | COSS

    Classifying your sentiment

    §  Sentiment analysis is a classification problem.
    §  Typically, people have used Naïve Bayes or Support Vector Machines (SVM) in the past [Mohammad et al. 2013].
    §  Artificial neural nets are also becoming more popular now [Nogueira dos Santos & Gatti, 2014].

  • | | COSS

    Sentiment Accuracy

    §  How does one construct a baseline for accuracy?
    §  As always, we refer to a ‘better than chance’ baseline.
    §  In the context of pos/neg/neu, the classes are often not split evenly.
    §  One can use the majority class as the baseline: if pos makes up 70% of the labeled documents, always predicting pos already gives 70% accuracy.
    §  For multiple classes, a common single measure is the macro F-score.
    §  For the binary case, the go-to measure is AUC ROC.
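To make the baseline and the macro F-score concrete, the sketch below scores an "always predict the majority class" classifier on an invented 70/20/10 label distribution. The function names are hypothetical, and a class with no true or predicted instances is given an F1 of 0 by convention.

```python
from collections import Counter


def majority_baseline(labels):
    """Most frequent class and the accuracy obtained by always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


# Invented label distribution: 70% pos, 20% neg, 10% neu.
y_true = ["pos"] * 7 + ["neg"] * 2 + ["neu"]
y_pred = ["pos"] * len(y_true)  # the majority-class "classifier"

print(majority_baseline(y_true))
print(macro_f1(y_true, y_pred))
```

The baseline reaches 70% accuracy while its macro F-score stays below 0.3, because every class counts equally in the macro average; this is why accuracy alone is misleading on imbalanced sentiment data.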
