a taste of sentiment analysis - institute for statistics...

105
A Taste of Sentiment Analysis Rob Zinkov May 26th, 2011 Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 1 / 105

Upload: dangdiep

Post on 20-Mar-2018

234 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

A Taste of Sentiment Analysis

Rob Zinkov

May 26th, 2011

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 1 / 105

Page 2: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Outline

1 Introduction

2 Basics of NLP

3 Basic Techniques for Sentiment Analysis

4 Advanced Techniques for Sentiment Analysis

5 Further Questions

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 2 / 105

Page 3: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

What is Sentiment Analysis?

Sentiment Analysis is a subfield of Computational Linguisticsconcerned with extracting emotions from text

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 3 / 105

Page 4: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 4 / 105

Page 5: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications - Political Blogs

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 5 / 105

Page 6: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications - Political Blogs

• Tracking opinions on issues

• Tracking which issues are held emotionally

• Tracking subjectivity of bloggers

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 6 / 105

Page 7: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Political Blogs - Challenges

• Identifying opinion holder

• Associating opinions with issue

• Identifying public figures and legislation

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 7 / 105

Page 8: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications - Product Reviews

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 8 / 105

Page 9: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Product Reviews - Challenges

• Identifying aspects of product

• Associating opinions with aspects of product

• Identifying Fake Reviews

• No canonical form

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 9 / 105

Page 10: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications - Financial News

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 10 / 105

Page 11: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Financial News - Challenges

• Identifying the equity in the article (think commodities)

• Associating entities with market symbols

• Specialized financial terms with distinct sentiment

• Articles rarely only about one equity

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 11 / 105

Page 12: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Applications - Brand Tracking

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 12 / 105

Page 13: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Brand Tracking - Challenges

• Text likely to be unstructured

• Identifying Brand

• Identifying Opinion Holder/Demographic

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 13 / 105

Page 14: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Goals

• Give a broad overview of the field

• Showcase the best current tools and approaches

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 14 / 105

Page 15: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Caveats

• There are no good R code/libraries to do this (yet)

• This talk is biased towards my domains

• No one in this area really knows what they are doing

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 15 / 105

Page 16: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

History

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 16 / 105

Page 17: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

History

• Grew out of Web integration Field

• Started as extension of knowledge extraction

• This is why field sometimes called Opinion Mining

• Also why papers as likely to occur in ACL as in WWW

• Many early algorithms are extraction patterns

• Field was still largely academic

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 17 / 105

Page 18: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Then something happened

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 18 / 105

Page 19: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Unique Challenges in Sentiment Analysis

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 19 / 105

Page 20: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Opinions are not Facts

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 20 / 105

Page 21: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Order Matters

• Sentences at end of article have stronger influence on sentiment

• Sentences at beginning of article have stronger influence on sentiment

• Irrelevant sentences influence sentiment of document.

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 21 / 105

Page 22: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Order Matters - Valience Shifts

The camera is reasonable,but there are far better ones at this priceThe meal could have been better,though still tasty.

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 22 / 105

Page 23: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Sentiment Orientation

• shifts in sentiment noted by special words

• special words usually have no sentiment of their own

• sentiment though consistent in each phrase

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 23 / 105

Page 24: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Sentiment Orientation - continued

• Naive method misses these shifts

• Bag of Words model fails here

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 24 / 105

Page 25: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Opinions polarize

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 25 / 105

Page 26: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 26 / 105

Page 27: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Opinions have context

Small screenSmall carbon footprint

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 27 / 105

Page 28: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

Opinions need to be normalized

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 28 / 105

Page 29: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Introduction

People disagree on what words mean

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 29 / 105

Page 30: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Basics of Natural Language Processing

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 30 / 105

Page 31: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Introduction to NLP

• Computational Linguistics in centered in Frequency Counts

• Frequency Counts become statistic through which we reason

• This statistic has flaws but still useful

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 31 / 105

Page 32: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Stemming

It is useful to combine words with a common root.When counting terms this groups words that denote the same termThis is done by dropping the end

sleepingsleepersleeps

sleep

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 32 / 105

Page 33: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Stopwords

It is important to remove common words as they dominate all countsCommon words in English:

a, the, an, is, be, could, there

Most NLP libraries packaged with a list of stopwords

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 33 / 105

Page 34: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Sometimes words will need to more finely processedThe following tools exist in most NLP packagesI prefer the Stanford NLP software suitehttp://nlp.stanford.edu/software/index.shtml

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 34 / 105

Page 35: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Parsing

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 35 / 105

Page 36: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Parsing

• Structure also derivable by parsing sentences

• Treat text like programming language

• Algorithms can then convert text into Tree

• Algorithms exist to learn grammar

• Very Heavyweight

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 36 / 105

Page 37: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Shallow Parsing

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 37 / 105

Page 38: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Shallow Parsing

• Less heavy to use than a full parser

• Processes words into phrases

• Training Chunking parser significantly easier/faster

• Requires having words tagged with their part of speech

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 38 / 105

Page 39: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Part of Speech tagging

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 39 / 105

Page 40: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

POS tagging

• Simplest operation to perform on words

• All NLP libraries support this operation

• Provides lightweight metadata

• Very common word feature

• Used by nearly all more complex NLP techniques

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 40 / 105

Page 41: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Dependency Parsing

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 41 / 105

Page 42: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Dependency Parsing

• Traditional Treebank Parsing is a bit bureaucratic

• Hides relations words have with each in sentence

• Dependency Parsing provides a lightweight alternative

• Alternative has looser representation, more language agnostic

• More readily captures which words modify each other

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 42 / 105

Page 43: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Wordnet

• Words can be related by how similar they are

• Words are similar if they mean similar things

• Words are similar is they are a type of another word

• Words can have many meanings

• Wordnet is a hand curated ontology that annotates these relations

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 43 / 105

Page 44: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Wordnet synsets

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 44 / 105

Page 45: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Wordnet concept network

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 45 / 105

Page 46: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Topic Modeling

Topic Modeling is a way to group and categorize documentsUsually unsupervised approach

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 46 / 105

Page 47: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

CTM - Coorelated Topic Models

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 47 / 105

Page 48: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

CTM - Coorelated Topic Models

• CTMs model the underlying topics within a document

• They differ from earlier approaches in capturing correlations betweentopics

• Give superior performance compared to other unsupervised models

• Available for use as an R package in CRAN (topicmodels)

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 48 / 105

Page 49: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basics of NLP

Named Entity Recognition

The purpose of NER is to extract out and label phrases in a sentence

Bill Clinton arrived at the United Nations Building in Manhattan.

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 49 / 105

Page 50: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Sentiment Definitions

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 50 / 105

Page 51: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Opinion

A vector denoting representing an opinionwith values positive, negative, or neutral gradings

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 51 / 105

Page 52: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Opinion Holder

The agent an opinion belongs to.This mostly relevant in political blogs

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 52 / 105

Page 53: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Item Features

Facets of the object that are readily available

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 53 / 105

Page 54: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Sentiment Features

Facets of the object that an opinion may be subscribed.These are usually hard to tease out of the text

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 54 / 105

Page 55: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

1. Gather a Seed set

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 55 / 105

Page 56: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Opinion corpora available at:

• Wiebe’s corpora http://www.cs.pitt.edu/mpqa/

• Sentiwordnet: http://sentiwordnet.isti.cnr.it/

• Personal dictionaries (available on request)

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 56 / 105

Page 57: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Gathering initial seed words

• Wiebe’s work comes with subjectivity scores in addition to sentiment

• Sentiwordnet was autogenerated, quality could be better

• Personal dictionaries hand generated, small but good quality

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 57 / 105

Page 58: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

2. Learn sentiment of unknown words

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 58 / 105

Page 59: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Learn sentiment - Supervised

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 59 / 105

Page 60: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Learn sentiment - Supervised

• Get a large collection of them labeled

• Use this collection as is

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 60 / 105

Page 61: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Learn sentiment - Unsupervised - Turney

• Use Turney’s Method

• Calculate Pointwise Mutual Information between every word and theseed words ’excellent’ ’poor’

SO(w) = lg(hits(w NEAR excellent)hits(excellent)

hits(w NEAR poor)hits(poor)

)where hits(w NEAR y) = number of times w is within 10 words of the y

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 61 / 105

Page 62: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Learn sentiment - Unsupervised - Twitter

• Use Turney’s Method with Twitter

• Calculate Pointwise Mutual Information between every word andwhenever it appears with _̈ or ¨̂ within a tweet

• This method has the advantage of being multilingual, other kinds ofsmiles aside

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 62 / 105

Page 63: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Learn sentiment - Unsupervised - Wordnet

• Use wordnet to walk random paths from start word until arriving at aseed word

• Average across sentiments of all seed words arrived at• This method is the fastest and most accurate

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 63 / 105

Page 64: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

3. Apply rules to simplify document

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 64 / 105

Page 65: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

• Rules make words more independent

• Rewrites make it less likely to misclassify a phrase

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 65 / 105

Page 66: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

4. Identify opinion phrases

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 66 / 105

Page 67: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

• Shallow Parse the document into chunks

• Remove chunks with mostly neutral words

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 67 / 105

Page 68: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Alternatively, extract with some rules

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 68 / 105

Page 69: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

5. Extend sentiment to phrases and sentences

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 69 / 105

Page 70: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

• Ultimately, sentiment is for phrases and sentences

• Use sentiment on individual words as priors

• Sentiment is based on joint probability across words in phrase

• Use Naive Bayes or a Markov Model as needed

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 70 / 105

Page 71: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

6. Aggregate sentiments for display

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 71 / 105

Page 72: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Group phrases based on what you want the sentiment

• Entities

• Topics

• Sentiment Features

• Item Features

• Users

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 72 / 105

Page 73: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

8. Generating Summary

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 73 / 105

Page 74: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Generating Summary

• Largely only relevant when you returning text

• Rate all sentences based on readability

• Return snippet of text for each group with sentiment vector attached

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 74 / 105

Page 75: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Basic Techniques for Sentiment Analysis

Summary

1 Gather a seed set

2 Learn sentiment of unknown words

3 Apply rules to simplify document

4 Identify opinion phrases

5 Extend sentiment to phrases and document

6 Aggregate sentiments for display

7 Generate summary

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 75 / 105

Page 76: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Anaphora Resolution

• Many articles refer entities by their name only a few times

• Opinions will usually co-occur with an anaphora of the entity

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 76 / 105

Page 77: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Anaphora Resolution

• Simplest solution, replace all anaphora with their referent

• Trickier solution, aggregate all opinions associated with anaphora later

• Other options?

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 77 / 105

Page 78: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Sentiment Analysis is fundamentally a DiscriminativeLearning Task

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 78 / 105

Page 79: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Conditional Random Fields

• Sentiment is clearly affected by its surrounding context

• Sentiment is also affected by orientation shifting words

• Why not make these connections explicit in our model?

• Conditional Random Fields (CRFs) are a flexible way of representingthese connections.

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 79 / 105

Page 80: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Conditional Random Fields

In a CRF, we represent posterior probability of a set of sentiments giventhe underlying text. A is a collection of cliques in the graph of connections.

p(y |x) =1

Z

∏A

ΨA(xA, yA)

ΨA(xA, yA) = exp

{∑k

θAk fAk(xA, yA)

}

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 80 / 105

Page 81: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Linear Chain CRFs

If we assume the sentiment of any given word only depends on theprevious, the formula simplifies to

p(y |x) =1

Z

t∏exp

{∑k

θk fk(xt , yt , yt−1)

}

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 81 / 105

Page 82: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Linear Chain CRFs are best understood as a discriminative version ofHidden Markov Models

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 82 / 105

Page 83: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Skip-chain CRFs

But we can assume sentiment depends on words much further away

We can now connect entities to each other and connect phrases explicitlyseparated by a sentiment shifting word.

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 83 / 105

Page 84: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

CRFs - Conclusions

• CRFs allow us to add context to opinion

• Properly used they can handle the connections between sentiments onphrases as well as words

• CRFs allow us to link arbitrary features of words and labels to eachother

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 84 / 105

Page 85: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Extensions

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 85 / 105

Page 86: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Extensions - Time Series

Just order your documents in time, and can plot changes in sentiment

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 86 / 105

Page 87: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Extensions - Time Series

• This one tends to get used with financial data and monitoring brands

• Requires having access to lots of articles to make sense

• There can be sparsity issues so apply proper shrinkage

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 87 / 105

Page 88: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Beyond Positive and Negative

We can be more subtle

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 88 / 105

Page 89: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Sarcasm

If you deal with Product Review this is helpful

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 89 / 105

Page 90: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Sarcasm is best detected through punctuation and capitalization features

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 90 / 105

Page 91: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Detecting Fake Reviews

• Fake Reviews are best treated as a classification task

• Collect enough and use frequency counts for features

• This is useful in production deployments and simple to implement

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 91 / 105

Page 92: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Multilingual Sentiment Analysis

• Sentiment does not translate well

• Words that mean the same thing can not correspond wrt sentiment

• Retrain for each new language you wish to support

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 92 / 105

Page 93: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Word-sense disambiguation

• This is largely not worth the effort

• Using the first sense of the word gives comparable performance tomore sophisticated approaches

• Exception: domain specific corpus where word is unlikely to be thefirst sense. Use specialized dictionaries for this case

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 93 / 105

Page 94: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Advanced Techniques for Sentiment Analysis

Comparisons

• Sometimes opinions are stated relevant two separate entities

• Superlatives are a special case of this

• Treat these as a ranking problem and handle as a separate problem

• Merge sentiments during aggregation

R is much better than SPSS

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 94 / 105

Page 95: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Lingering Questions

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 95 / 105

Page 96: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

What keeps me from doing this in R?

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 96 / 105

Page 97: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Further Questions - Large Data

• Text analysis is hard to do in R

• R has memory limits

• Using Hadoop or BigMemory usually means giving up many libraries

• tm.plugins.distributed helps a bit

• snow and OpenMPI gives mixed results

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 97 / 105

Page 98: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Further Questions - Metadata

Is there a lightweight metadata format?Index Offset Property Value

2 10 POS NP

35 5 Sentiment Positive

17 7 POS JJ

51 20 Chunk NULL

20 8 Entity Person

2 45 Sentence NULL

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 98 / 105

Page 99: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Further Questions - Model Files

• Not enough of the tools take model files

• Model files are needed for tokenization,sentence splitting, postagging, chunking

• Without easy support for model files, multilingual support is difficult

• Without easy support, impossible to train better models as databecomes available

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 99 / 105

Page 100: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Further Questions - Rule Files

• No standard on preprocessing rules

• DSL required for them

• Is this something we need to provide?

• Until better techniques come around, essential for any performance

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 100 / 105

Page 101: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Theoretical Formulation

• Can these techniques be made less hacky?

• Dependency Parses provide much of the structure for trackingsentiment orientation

• Can structure be handled in a more unsupervised manner?

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 101 / 105

Page 102: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

References

Best starting point:Sentiment Analysis and Subjectivity by Bing Liuhttp://www.cs.uic.edu/ liub/FBS/NLP-handbook-sentiment-analysis.pdf

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 102 / 105

Page 103: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

References (More)

• Joint Extraction of Entities and Relations for Opinion Recognition(Choi 2006)

• Mining Opinion Features in Customer Reviews (Liu 2004)

• A Holistic Lexicon-Based Approach to Opinion Mining (Deng 2008)

• I Cant Recommend This Paper Highly Enough (Dillard thesis)

• Entity Discovery and Assignment for Opinion Mining Applications(Deng 2009)

• Extracting Product Features and Opinions from Reviews (Popescu2005)

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 103 / 105

Page 104: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Conclusions

• Sentiment Analysis is a relatively young area

• Still plenty of ideas to be explored

• Widely applicable

• Really fun

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 104 / 105

Page 105: A Taste of Sentiment Analysis - Institute for Statistics ...statmath.wu.ac.at/research/talks/resources/sentimentanalysis.pdf · Introduction What is Sentiment Analysis? Sentiment

Further Questions

Questions?

Rob Zinkov () A Taste of Sentiment Analysis May 26th, 2011 105 / 105