Sentiment Analysis - UGR · 2018. 12. 17.

Sentiment Analysis: An Introduction to Opinion Mining and its Applications

Ana Valdivia, Granada, 17/11/2016

About me

Ana Valdivia

Degree in Mathematics (UPC)

MSc in Data Science (UGR)

Paper about museums: Martínez‐de‐Albéniz, V. and Valdivia, A.; “Measuring and Exploiting the Impact of Exhibitions Scheduling on Museum Attendance”.

Master’s thesis about Sentiment Analysis

Organizer of @DataBeersGRX

Ana Valdivia ©

ROADMAP

1. Introduction

2. The Sentiment Analysis Problem

3. The Sentiment Analysis Process

4. My Master’s Thesis

1. INTRODUCTION

What is SA?

Sentiment Analysis (SA) is the field of knowledge that analyses people’s opinions, reviews or thoughts about products, companies or experiences, identifying their sentiment.

Also referred to as Opinion Mining.

“Alhambra with General Life parks and gardens, the tower and Nazrid palaces is absolutely amazing. If you are in Granada you must not miss it.”

“DO NOT EVEN TRY TO VISIT - A total waste of time!!!. Spent 5 hours in the ticket queue in the broiling sun 35 degrees. An officious staff member told us when we reached the head of the queue that there were no more tickets and to buy online…”

“Most visited monument in Spain. There are no words to descibe this place - beaty awaits around every corner. THe mixture of two cultures in one place makes it very special…”

Where does it come from?

NLP:
‐ Sentiment Analysis
‐ Parsing
‐ Discourse analysis
‐ Named entity recognition (NER)
‐ Part-of-speech tagging (POS)
‐ Topic segmentation
‐ Machine translation
‐ Automatic summarization

Why is SA becoming popular?

Social Networks

Web 2.0

Customer satisfaction

Source: http://www.slideshare.net/robin_allfamous/sentiment‐analysis‐and‐applications‐in‐the‐news‐and‐media‐industry

Why is SA becoming popular?

Social media sentiment is the #nofilter voice of the people.

Source: http://www.slideshare.net/robin_allfamous/sentiment‐analysis‐and‐applications‐in‐the‐news‐and‐media‐industry

2. THE SENTIMENT ANALYSIS PROBLEM

What’s an opinion?

“If we cannot structure a problem, we probably do not understand the problem.”
B. Liu

Liu’s proposal:

Book remark: B. Liu, Sentiment Analysis and Opinion Mining

Polarity

One example is worth a thousand words…

“We were very tired after a loong walk. We stopped her for a rest, the first nice thing here, is the view, and the fruit juices were excellent. We felt much better after drunk it. Also the desert were very good. Thank you.”

Liu’s proposal:
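In Liu's book an opinion is modelled as a quintuple: entity, aspect, sentiment, holder and time. The sketch below shows one possible way to hold that structure in Python and to annotate the review quoted above. The class name `Opinion`, the entity name "restaurant" and the hand-assigned labels are assumptions made for illustration; they are not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    """One opinion as a quintuple (entity, aspect, sentiment, holder, time)."""
    entity: str
    aspect: str
    sentiment: str                # "positive", "negative" or "neutral"
    holder: Optional[str] = None  # who expresses the opinion, if known
    time: Optional[str] = None    # when it was expressed, if known


# Hand-made annotation of the review (assumed labels, not the author's output).
# "dessert" stands for the word the reviewer wrote as "desert".
review_opinions = [
    Opinion(entity="restaurant", aspect="view", sentiment="positive"),
    Opinion(entity="restaurant", aspect="fruit juices", sentiment="positive"),
    Opinion(entity="restaurant", aspect="dessert", sentiment="positive"),
]


def polarity_by_aspect(opinions):
    """Map each aspect to the sentiment attached to it."""
    return {op.aspect: op.sentiment for op in opinions}


print(polarity_by_aspect(review_opinions))
```

Holder and time are left empty here because the quoted review does not state them.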

Different analytic levels

‐ Document level

‐ Sentence level

‐ Aspect or entity level
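The difference between the first two levels can be sketched with a toy lexicon-based scorer: the same text may look neutral as a whole document while its sentences carry opposite polarities. The lexicon entries and function names below are hypothetical and chosen only for this example; the aspect level is not covered by this sketch.

```python
import re

# Toy polarity lexicon: hypothetical entries, only for illustration.
LEXICON = {"excellent": 1, "good": 1, "nice": 1, "amazing": 1,
           "tired": -1, "waste": -1}


def score(text):
    """Sum the lexicon polarities of the words found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def label(value):
    """Turn a numeric score into a polarity label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def document_level(review):
    """One label for the whole review."""
    return label(score(review))


def sentence_level(review):
    """One label per sentence, splitting on '.', '!' and '?'."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]
    return [(s, label(score(s))) for s in sentences]


review = "We were very tired after a long walk. The fruit juices were excellent."
print(document_level(review))
print(sentence_level(review))
```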

Main concerns

‐ Different types of opinions: direct/indirect, comparative, explicit/implicit, …
‐ Deal with text mining: grammar mistakes, emoticons, …
‐ Irony and sarcasm
‐ Fake or spam opinions

3. THE SENTIMENT ANALYSIS PROCESS

Step by step

Sentiment identification

‐ Expert or user
‐ Sentiment extraction algorithms:
  ‐ Stanford CoreNLP
  ‐ MeaningCloud
  ‐ Microsoft Azure
  ‐ …

Feature Selection

‐ Bag of Words
‐ Term‐Document Matrix
‐ tf‐idf
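As a rough illustration of these three ideas, the sketch below builds a bag-of-words term-document matrix for three invented mini-reviews and then applies one common tf-idf variant (raw term count multiplied by log(N / document frequency)). The example texts, the whitespace tokenizer and the function names are assumptions; other tf-idf weightings exist and the slides do not say which one was used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def term_document_matrix(docs):
    """Return (vocabulary, matrix) with one row per term and one column per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]   # bag of words per document
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def tf_idf(matrix):
    """Weight each count by log(N / df), where df is the number of documents containing the term."""
    n_docs = len(matrix[0])
    weighted = []
    for row in matrix:
        df = sum(1 for count in row if count > 0)
        idf = math.log(n_docs / df)
        weighted.append([count * idf for count in row])
    return weighted


docs = ["amazing place", "long queue", "amazing view amazing gardens"]
vocab, matrix = term_document_matrix(docs)
weights = tf_idf(matrix)
for term, row in zip(vocab, weights):
    print(term, [round(w, 3) for w in row])
```

A term that appears in every document gets weight 0 under this variant, which is why tf-idf tends to push down very common words.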

Feature Selection: Text Preprocessing

‐ Parsing
‐ Stemming: {nightmare, nighttime, nocturnal, nightlife...} → night
‐ Remove stop words
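These preprocessing steps can be sketched in a few lines of Python. The stop-word list and the suffix-stripping rule below are small, hypothetical stand-ins for the resources a real pipeline would use, and the stemmer is intentionally naive: it only removes a few English suffixes, so it would not reproduce the conceptual grouping shown on the slide (for example, it cannot map "nocturnal" to "night").

```python
import re

# Small hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "in", "of", "and", "to", "it"}

# Suffixes tried in order by the naive stemmer below.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The gardens and the palaces are amazing"))
```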

Feature Selection

‐ N‐grams
‐ More sophisticated…
  ‐ Aspect‐Based Sentiment Analysis
  ‐ ASUM
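N-grams are runs of n consecutive tokens, so unigrams are single words and bigrams are pairs of adjacent words. A minimal sketch, with a hypothetical helper name and a phrase borrowed from one of the reviews quoted earlier:

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "a total waste of time".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
```

Bigrams keep some local word order ("total waste") that a unigram bag of words discards.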

Medhat, Walaa, Ahmed Hassan, and Hoda Korashy. "Sentiment analysis algorithms and applications: A survey." Ain Shams Engineering Journal 5.4 (2014): 1093‐1113.

4. MY MASTER’S THESIS

Objectives

1. Study correlation between human and machine sentiment

2. Classify opinions

3. Discover interesting patterns in negative opinions

Studying correlation between different sentiment labels

SentimentCoreNLP SentimentValue

53.08 % of coincidence

93.49 % of coincidence
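A percentage of coincidence between two label sources can be computed as the share of opinions on which both sources give the same label. The sketch below assumes that reading; the function name and the tiny label lists are invented for illustration and do not reproduce the figures on the slides.

```python
def coincidence(labels_a, labels_b):
    """Percentage of positions where two label sequences agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both sequences must have the same length")
    if not labels_a:
        raise ValueError("at least one label is required")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Invented labels: one list standing for user ratings, one for a tool's output.
user_labels = ["positive", "positive", "negative", "neutral"]
tool_labels = ["positive", "negative", "negative", "positive"]
print(coincidence(user_labels, tool_labels))
```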

Classification problem

‐ UFSM: positive / negative
‐ BFSM: positive / negative

‐ TripAdvisor Alhambra data set
‐ Document‐Term Matrix: use UFSM and BFSM
‐ Split it in three sets depending on sentiment class label
‐ Preprocessing: if it is very unbalanced, apply oversampling techniques
‐ Split it up: split the complete set into a 75% training set and a 25% testing set
‐ Classification algorithms: apply different machine learning algorithms to the training set with 5cv
‐ Evaluate results: check measure values and discuss the best model
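The data-handling parts of this workflow can be sketched with the standard library alone. The helpers below are hypothetical: an imbalance ratio (IR, majority class size over minority class size), a shuffled 75/25 split, random oversampling that duplicates minority rows, and a 5-fold cross-validation splitter. Random duplication is only one of several oversampling techniques, and the slides do not say which one was used or in which order the steps were applied; here oversampling is applied to the training part only, which is one common choice. The classifier itself is left out.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def split_75_25(rows, seed=0):
    """Shuffle the rows and return (training set, testing set)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(0.75 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def random_oversample(rows, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class matches the largest.

    Each row is a (features, label) pair.
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[1], []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


def five_fold_splits(rows, k=5):
    """Yield (train, validation) pairs, using each of the k folds once for validation."""
    folds = [rows[i::k] for i in range(k)]
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, folds[i]


# Invented data: 16 positive and 4 negative rows with dummy features.
rows = [([i], "positive") for i in range(16)] + [([i], "negative") for i in range(16, 20)]
train, test = split_75_25(rows)
balanced_train = random_oversample(train)
print(imbalance_ratio([label for _, label in balanced_train]))
```

After oversampling, every class present in the training part has the same number of rows, which corresponds to IR = 1.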

XGBoost

IR = 1

unigrams

Subgroup Discovery

negative

SD‐Map algorithm
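The sketch below is not SD-Map, which is an exhaustive, FP-growth-based algorithm. It only illustrates the kind of question subgroup discovery asks, by scoring single-term subgroups ("reviews that contain this word") against the negative class with weighted relative accuracy (WRAcc), a quality measure commonly used in subgroup discovery. The documents, labels and function names are invented for illustration.

```python
def wracc(covered, is_target):
    """Weighted relative accuracy: coverage * (target share in subgroup - target share overall)."""
    n = len(covered)
    n_sub = sum(covered)
    if n == 0 or n_sub == 0:
        return 0.0
    p_sub = sum(c and t for c, t in zip(covered, is_target)) / n_sub
    p_all = sum(is_target) / n
    return (n_sub / n) * (p_sub - p_all)


def rank_terms(docs, labels, target="negative"):
    """Score every term as a one-condition subgroup and sort by decreasing WRAcc."""
    is_target = [label == target for label in labels]
    tokenized = [set(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*tokenized))
    scored = [(wracc([term in doc for doc in tokenized], is_target), term)
              for term in vocab]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


docs = ["long queue no tickets", "amazing gardens",
        "long queue in the sun", "amazing palaces"]
labels = ["negative", "positive", "negative", "positive"]
for quality, term in rank_terms(docs, labels)[:3]:
    print(term, quality)
```

Terms with a high positive score cover many reviews and are over-represented among negative ones, which is the sort of pattern the third objective of the thesis looks for.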

SUMMARY

SA is a very challenging problem

Lots of applications

New research line

THANKS!

Any questions?

[email protected]

@ana_valdi