sentiment analysis potsdam, 7 june 2012 · natural language processing sentiment analysis potsdam,...
TRANSCRIPT
![Page 1: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/1.jpg)
Natural Language ProcessingSentiment Analysis
Potsdam, 7 June 2012
Saeedeh MomtaziInformation Systems Group
based on the slides of the course book
![Page 2: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/2.jpg)
Sentiment Analysis
---------------------------------------------------------------------------------------------------------
Saeedeh Momtazi | NLP | 07.06.2012
2
![Page 3: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/3.jpg)
Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach
Saeedeh Momtazi | NLP | 07.06.2012
3
![Page 4: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/4.jpg)
Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach
Saeedeh Momtazi | NLP | 07.06.2012
4
![Page 5: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/5.jpg)
Hotel Reviews
Saeedeh Momtazi | NLP | 07.06.2012
5
![Page 6: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/6.jpg)
Product Reviews
Picture Quality
Ease of Use
Size
Weight
Color
Zoom
Saeedeh Momtazi | NLP | 07.06.2012
6
![Page 7: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/7.jpg)
Social Media
Saeedeh Momtazi | NLP | 07.06.2012
7
![Page 8: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/8.jpg)
Event Analysis and Prediction
Analyzing the side effects of events in different communitiesPredicting the election resultsPredicting the Stock exchange...
Saeedeh Momtazi | NLP | 07.06.2012
8
![Page 9: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/9.jpg)
Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach
Saeedeh Momtazi | NLP | 07.06.2012
9
![Page 10: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/10.jpg)
Sentiment Analysis Levels
�� ��Text
⇓ ⇓�� ��Opinion�� ��Fact
⇓ ⇓�� ��+�� ��−
⇓ ⇓
⇓ ⇓�
�happy
surprised...
�
�angry
afraid...
Saeedeh Momtazi | NLP | 07.06.2012
10
![Page 11: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/11.jpg)
Advanced Sentiment Analysis
Opinion holderOpinion target / aspect
�
�Students︸ ︷︷ ︸ like Wikipedia︸ ︷︷ ︸ because it is easy to use and it sounds authoritative.
op holder target�
�I had a nice stay in this hotel and the rooms︸ ︷︷ ︸ were very clean.
. aspect
Mixed opinions��
��The restaurant has an amazing view but it is very dirty.
Saeedeh Momtazi | NLP | 07.06.2012
11
![Page 12: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/12.jpg)
Other Names
Opinion miningOpinion extractionSentiment miningSubjectivity detectionSubjectivity analysis
Saeedeh Momtazi | NLP | 07.06.2012
12
![Page 13: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/13.jpg)
Sentiment Analysis Approaches
Machine learning methods⇒classification
Rule-based methods⇒ dictionary oriented
Saeedeh Momtazi | NLP | 07.06.2012
13
![Page 14: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/14.jpg)
Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach
Saeedeh Momtazi | NLP | 07.06.2012
14
![Page 15: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/15.jpg)
Machine Learning Approach
Training
T1 → C1T2 → C2
...
Tn → Cn
−−−−→
f1f2...
fn
−−−−→�� ��Model
Testing
Tn+1 →? −−−−→ fn+1 −−−−→
←−−−−−−−−
Cn+1
Saeedeh Momtazi | NLP | 07.06.2012
15
![Page 16: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/16.jpg)
Sentiment Classification
Using any kinds of supervised classifiersK Nearest NeighborSupport Vector MachinesNaïve BayesMaximum EntropyLogistic Regression...
Saeedeh Momtazi | NLP | 07.06.2012
16
![Page 17: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/17.jpg)
Features
Word
All words or adjectives?All words works better than adjectives only
Word occurrence or frequency?Word occurrence is more useful than frequency
Using binary value for wordsReplace all word counts higher than 0 in each text by 1
Saeedeh Momtazi | NLP | 07.06.2012
17
![Page 18: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/18.jpg)
Features
Negation
Negation words change the text polarityAdding prefix NOT− to every word between negation and nextpunctuation
�� ��“I did not like the restaurant location, but the food ...”
I did not NOT-like NOT-the NOT-restaurant NOT-location but the food ...
Saeedeh Momtazi | NLP | 07.06.2012
18
![Page 19: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/19.jpg)
Features
Other emotions
Considering emoticons as additional features:):(
Saeedeh Momtazi | NLP | 07.06.2012
19
![Page 20: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/20.jpg)
Fine-grained Analysis
Dealing with finer classes of sentiment-3,-2,-1,+1,+2,+3
ApproachesUsing multiclass classifier (6 classes in this case)Using two level classifier
First level: polarity classifier (positive or negative)Second level: strength classifier (1 or 2 or 3)
Saeedeh Momtazi | NLP | 07.06.2012
20
![Page 21: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/21.jpg)
Outline
1 Applications
2 Task
3 Machine Learning Approach
4 Rule-based Approach
Saeedeh Momtazi | NLP | 07.06.2012
21
![Page 22: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/22.jpg)
Rule-based Approach
Training
T1 → C1T2 → C2...
Tn → Cn
�
�
�
goodlove
braveintelligent
nice...
�
�
�
badhatelie
uglypoor...
Testing
Tn+1 →? −−−−−−−→
←−−−−
Cn+1
Saeedeh Momtazi | NLP | 07.06.2012
22
![Page 23: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/23.jpg)
Rule-based Approach
Looking for opinionated words in each textClassifying the text based on the number of positive andnegative words
Considering different rules for classificationFine-grained dictionaryNegation wordsBooster wordsIdiomsEmoticonsMixed opinionsLinguistic features of the language
Saeedeh Momtazi | NLP | 07.06.2012
23
![Page 24: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/24.jpg)
Rule-based Approach
Fine-grained Dictionary
�� ��“It was a good song.”
�� ��“The song was excellent.”
Saeedeh Momtazi | NLP | 07.06.2012
24
![Page 25: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/25.jpg)
Rule-based Approach
Negation Words
�� ��“The song was good.”
�� ��“The song was not good.”
Saeedeh Momtazi | NLP | 07.06.2012
25
![Page 26: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/26.jpg)
Rule-based Approach
Booster Words
�� ��“The song was interesting.”
�� ��“The song was very interesting.”
�� ��“The song was somewhat interesting.”
Saeedeh Momtazi | NLP | 07.06.2012
26
![Page 27: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/27.jpg)
Rule-based Approach
Idioms
�� ��“shock horror”
Saeedeh Momtazi | NLP | 07.06.2012
27
![Page 28: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/28.jpg)
Rule-based Approach
Mixed Opinions
�� ��“The song was good, but I think its title was strange.”
Saeedeh Momtazi | NLP | 07.06.2012
28
![Page 29: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/29.jpg)
Rule-based Approach
German Linguistic Features
�
�
�
�“I do not love the song.” ⇒ “Ich liebe nicht das Lied.”
“Ich liebe das Lied nicht.”
Saeedeh Momtazi | NLP | 07.06.2012
29
![Page 30: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/30.jpg)
Opinion Dictionary
English
Subjectivity Clues (2005)SentiSpin (2005)SentiWordNet (2006)Polarity Enhancement (2009)SentiStrength (2010)
German
GermanPolarityClues (2010)SentiWortSchatz (2010)GermanSentiStrength (2012)
Saeedeh Momtazi | NLP | 07.06.2012
30
![Page 31: Sentiment Analysis Potsdam, 7 June 2012 · Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the](https://reader030.vdocuments.mx/reader030/viewer/2022040314/5e16086abd809f26f97401c8/html5/thumbnails/31.jpg)
Machine Learningwith Opinion Dictionary
Using opinion words as a feature in the algorithmsIgnoring other words in the text
Adjectives alone do not work well,but opinion words are the best features to be used
Saeedeh Momtazi | NLP | 07.06.2012
31