![Page 1: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/1.jpg)
Inferring User Interests from Microblog Data through Opinion Mining
Student: Elvis Saravia
Advisor: Prof. Yi-Shin Chen
Institution: National Tsing Hua University
Program: International Master Program in Information Systems and Applications (IMPISA)
![Page 2: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/2.jpg)
Our Journey...
→ Introduction
→ Related Work
→ Objectives
→ Framework
→ Experiment & Results
→ Conclusion & Future Work
→ Q & A
![Page 3: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/3.jpg)
Introduction
→ Rapid Growth of the Web
  ○ Web 2.0 (user-generated content)
  ○ Data generated rapidly
  ○ Social sharing platforms (Facebook & Twitter)
→ Online User-Behaviour Data
  ○ Introduced research opportunities
  ○ The most valuable asset that a company possesses
![Page 4: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/4.jpg)
Online User Behaviors
Interests Emotions
![Page 5: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/5.jpg)
Objectives
→ This work aims to develop a behavior-based model for identifying user interests.
→ The proposed algorithms combine contextual analysis and emotion analysis to improve the extraction of user interests.
![Page 6: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/6.jpg)
Motivation
→ Economic value
  ○ Recommendation services (dating sites & ad targeting)
  ○ Personalized systems (e-commerce & search engines)
→ Personalization
  ○ We love to be uniquely identified
  ○ Reduce extraction of ambiguous interests
“I may be exactly the same demographic as my neighbor, but that has nothing to do with what I eat.” - Lesperance, VP of Digital Marketing and CRM for GrubHub
![Page 7: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/7.jpg)
Related Work
→ Ontology [Mylonas et al. 2008] [Bakalov et al. 2009]
  ○ Search logs and contextual information to build ontology
→ Social Structure [Bao et al., WWW 2010] [Wen et al., SIGKDD 2010]
  ○ Focuses on the user social graph (friends and follows)
![Page 8: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/8.jpg)
Related Work
→ Contextual Information [Piao et al., 2011] [Yang et al., JCIS 2012]
  ○ Natural Language Processing (NLP) and latent Dirichlet allocation (LDA)
→ Behavior-Based [Zhou et al. 2008; Xing et al., WWW 2010]
  ○ Collaborative filtering and social actions
  ○ User interactions (printing, copying and saving)
![Page 9: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/9.jpg)
Interest Definition
→ Considerations:
  ○ Not everything we say or write interests us
  ○ Our interests shouldn’t be ambiguous
  ○ Ranking interests is challenging
→ Observations:
  ○ Interests ← Motivation ← Positive Emotions [Silvia et al., 2002]
  ○ Our personal interests are interlinked with our positive emotions
I am in New York.
I cannot wait for the Facebook Developer Conference
![Page 10: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/10.jpg)
Framework
Contextual analysis + Emotion analysis
![Page 11: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/11.jpg)
[Framework diagram] Twitter Corpus → Pre-processing → Interest Candidates Extraction (POS Tagging, Keyword Extraction, Rule-Based Extraction) → Emotion Analysis (Emotion Classification, Emotion Tagging & Filtering) → Interest Identification → Output file
![Page 12: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/12.jpg)
Pre-processing
Filter out information that doesn’t provide any knowledge or value to user interest identification
![Page 13: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/13.jpg)
Pre-processing
→ Filter out non-English posts and retweets
→ Remove useless punctuation marks
I am loving Jeremy Lin! I am loving Jeremy Lin!
For every post (P) in a collection of Tweets (T)
![Page 14: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/14.jpg)
Pre-processing
→ Remove tweets containing hyperlinks (no emotion)
→ Remove repeated tweets (same emotion)
Linsanity comes to LA. http://espn.com
...
Linsanity comes to LA. http://espn.com
For every post (P) in a collection of Tweets (T)
![Page 15: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/15.jpg)
Pre-processing
→ Remove terms shorter than 3 characters and terms containing the “@” symbol
  ○ (e.g. “to” and “@jason”)
@jason I love to go to New York @jason I love to go to New York
For every post (P) in a collection of Tweets (T)
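
The pre-processing rules on the last three slides can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the `"RT "` prefix test for retweets, the ASCII check used as a stand-in for language detection, and the set of punctuation marks treated as "useless" are all assumptions made here.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Assumption: which punctuation counts as "useless" is not given in the slides.
USELESS_PUNCT = ".,!?;:\"'()"


def preprocess(tweets, is_english=str.isascii):
    """Apply the slide's filters to every post P in a collection of tweets T."""
    seen = set()
    cleaned = []
    for post in tweets:
        # Assumption: retweets are recognised by the conventional "RT " prefix.
        if post.startswith("RT "):
            continue
        # Tweets containing hyperlinks are dropped (treated as bearing no emotion).
        if URL_RE.search(post):
            continue
        # Crude stand-in for a real language detector.
        if not is_english(post):
            continue
        # Repeated tweets are dropped (they carry the same emotion).
        key = post.strip().casefold()
        if key in seen:
            continue
        seen.add(key)
        terms = []
        for token in post.split():
            if "@" in token:  # mentions such as @jason
                continue
            token = token.strip(USELESS_PUNCT)
            if len(token) < 3:  # short terms such as "to"
                continue
            terms.append(token)
        if terms:
            cleaned.append(" ".join(terms))
    return cleaned
```

With this sketch, `"@jason I love to go to New York"` is reduced to `"love New York"`, and the hyperlinked "Linsanity" post is discarded entirely.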
![Page 16: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/16.jpg)
![Page 17: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/17.jpg)
Interest Candidates Extraction
3-phase interest candidates algorithm to extract as many interest candidates as possible
![Page 18: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/18.jpg)
Interest Candidates Extraction (1)
→ POS-tagging
  ○ Part-of-speech tagging
  ○ Nouns, Proper Nouns and Named entities
  ○ Limitation: Naïve interest candidates
I cannot wait for the Facebook Developer Conference
I cannot wait for the Facebook Developer Conference
For every post (P) in a collection of Tweets (T)
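
As a rough illustration of phase 1, the sketch below groups consecutive noun tokens into interest candidates. It assumes the post has already been tagged with Penn Treebank tags by some tagger (for example NLTK's `nltk.pos_tag`); the tags in the usage example are hand-written for illustration, and named-entity recognition is left out.

```python
# Penn Treebank tags for common and proper nouns.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def noun_candidates(tagged_tokens):
    """Group runs of adjacent noun tokens into candidate phrases.

    tagged_tokens is a list of (token, tag) pairs, as produced by a POS tagger.
    """
    candidates = []
    current = []
    for token, tag in tagged_tokens:
        if tag in NOUN_TAGS:
            current.append(token)
        elif current:
            candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


# Hand-tagged example post (tags are illustrative, not tagger output).
tagged = [
    ("I", "PRP"), ("can", "MD"), ("not", "RB"), ("wait", "VB"),
    ("for", "IN"), ("the", "DT"),
    ("Facebook", "NNP"), ("Developer", "NNP"), ("Conference", "NNP"),
]
candidates = noun_candidates(tagged)
```

Every noun run becomes a candidate regardless of whether the user cares about it, which is one way to read the "naïve interest candidates" limitation.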
![Page 19: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/19.jpg)
Interest Candidates Extraction (2)
→ Keyword Extraction (RAKE) [Rose et al., 2009]
  ○ Extract keywords from posts
  ○ Limitation: phrase boundaries
I cannot wait for the Facebook Developer Conference
I cannot wait for the Facebook Developer Conference
I enjoyed watching Mr. Bean
For every post (P) in a collection of Tweets (T)
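
A compact sketch of the RAKE idea for phase 2: split the text into candidate phrases at stopwords and punctuation, score each word by degree divided by frequency, and score each phrase as the sum of its word scores. The tiny stopword list and the tokenisation regex are assumptions for this illustration; a real run would use a full stopword list.

```python
import re
from collections import defaultdict

# Assumption: a deliberately small stopword list, for illustration only.
STOPWORDS = frozenset({
    "a", "an", "the", "i", "am", "is", "are", "for", "to", "of",
    "in", "on", "and", "or", "cannot", "not",
})


def candidate_phrases(text):
    """Split text into phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.!?,;:()\"]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9#']+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text):
    """Return (phrase, score) pairs sorted by descending RAKE score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, including the word itself
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in phrases
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this sketch the period in "Mr. Bean" is treated as a phrase delimiter, so the name is split into "enjoyed watching mr" and "bean". That is one plausible reading of the phrase-boundary limitation noted on the slide.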
![Page 20: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/20.jpg)
Interest Candidates Extraction (3)
→ Previous Phases: Unreliable and Inconsistent
→ Emerging Interest Concepts?
  ○ Previous phases cannot extract them
  ○ Provide better insights about users’ current interests
Wimbledon 2015
![Page 21: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/21.jpg)
Interest Candidates Extraction (3)
→ Rule-Based Concept Extraction [Hsu et al., 2015]
  ○ Extract frequent emerging concepts based on “wisdom of the crowd”
  ○ 80,000,000 tweets (3,000,000 users)
  ○ 6 patterns were defined
![Page 22: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/22.jpg)
Interest Candidates Extraction (3)
![Page 23: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/23.jpg)
Interest Candidates Extraction (3)
![Page 24: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/24.jpg)
Interest Candidates Extraction (3)
I am loving Wimbledon 2015 #WC2015 Wimbledon 2015
Crowd-wisdom
![Page 25: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/25.jpg)
Interest Candidates Extraction
→ Combine the results of the 3-phase interest candidates extraction algorithm.
  ○ Duplicate interest candidates are removed
For every post (P) in a collection of Tweets (T)
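
Merging the three phases could look like the sketch below. The case-insensitive comparison used to detect duplicates is an assumption; the slides only say that repeated candidates are removed.

```python
def combine_candidates(*phases):
    """Merge candidate lists from several phases, keeping first occurrences."""
    seen = set()
    combined = []
    for phase in phases:
        for candidate in phase:
            key = candidate.casefold()  # assumption: duplicates differ only in case
            if key not in seen:
                seen.add(key)
                combined.append(candidate)
    return combined
```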
![Page 26: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/26.jpg)
![Page 27: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/27.jpg)
Emotion Analysis
Tagging interest candidates with their associated emotion
![Page 28: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/28.jpg)
Emotion Classification
→ Pattern-based approach [Argueta et al., 2015]
  ○ Appropriate for the grammatical informality of tweets
  ○ Effective for multilingual applications
  ○ Contribution Degree
→ Why positive emotions?
  ○ Anticipation, Joy and Trust
  ○ Highly related to motivation and interests
Emotion classes: Anticipation, Joy, Trust, Surprise, Sadness, Disgust, Anger, Fear
![Page 29: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/29.jpg)
Emotion Analysis
→ Tag interests with emotion
  ○ Every interest candidate is tagged with its pertaining emotion
  ○ Original post is classified (no pre-processing)
→ Only positive emotions considered:
  ○ Anticipation, Joy and Trust
  ○ Negative emotions were not considered in this work
![Page 30: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/30.jpg)
Emotion Filtering
→ Filtering process: the following posts are removed
  ○ Posts bearing no emotion
  ○ Short posts
  ○ Posts that bear opposite emotions (ambiguous)
![Page 31: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/31.jpg)
Emotion Classification
I am loving Jeremy Lin right now → joy: Jeremy Lin
The traffic today is okay!
...
Feeling excited for the ASONAM Conference. #feelingblessed → trust: ASONAM Conference
For every post P in a collection of Tweets T
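
The tagging and filtering steps can be sketched as below. The `classify` callable is hypothetical and stands in for the pattern-based classifier of [Argueta et al., 2015], which is not reproduced here. The opposite pairs follow Plutchik's wheel, which the eight listed emotions match; whether the thesis defines "opposite" this way, and the `min_words` threshold for "short posts", are assumptions.

```python
POSITIVE = {"anticipation", "joy", "trust"}

# Opposite pairs on Plutchik's wheel of emotions.
OPPOSITES = [
    frozenset({"joy", "sadness"}),
    frozenset({"trust", "disgust"}),
    frozenset({"anticipation", "surprise"}),
    frozenset({"fear", "anger"}),
]


def is_ambiguous(emotions):
    """A post is ambiguous if it bears both emotions of an opposite pair."""
    return any(pair <= emotions for pair in OPPOSITES)


def tag_and_filter(posts, classify, min_words=3):
    """Tag candidates with the positive emotions of their post.

    posts is a list of (original_text, candidates) pairs; classify is a
    hypothetical function mapping the original text to a set of emotions.
    """
    tagged = {emotion: [] for emotion in sorted(POSITIVE)}
    for text, candidates in posts:
        emotions = set(classify(text))
        if not emotions:                    # post bears no emotion
            continue
        if len(text.split()) < min_words:   # short post (threshold is assumed)
            continue
        if is_ambiguous(emotions):          # opposite emotions
            continue
        for emotion in sorted(emotions & POSITIVE):
            tagged[emotion].extend(candidates)
    return tagged
```

Note that the original post text is passed to the classifier, in line with the slide's remark that classification happens without pre-processing.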
![Page 32: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/32.jpg)
![Page 33: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/33.jpg)
Interest Identification
Ranking interest candidates in each emotion set
![Page 34: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/34.jpg)
Interest Identification
→ Repetitive Interest Candidates
  ○ Interest candidates found under several emotions are kept
→ Ambiguity
  ○ Remove interests that are ambiguous
  ○ The emotion classifier helps considerably with this
![Page 35: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/35.jpg)
Interest Identification
→ Occurrence
  ○ Calculate the frequency of each interest candidate (ws)
  ○ Frequency (f) is based on occurrence
Anticipation: ws1 (f), ws2 (f), ..., wsn (f)
Joy: Jeremy Lin (f), ws2 (f), ..., wsn (f)
Trust: ACM Conference (f), ws2 (f), ..., wsn (f)
![Page 36: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/36.jpg)
Interest Identification
→ Ranking
  ○ Calculate the weight of each interest candidate (ws)
  ○ Rank them by weight (w)
Anticipation: ws1 (w), ws2 (w), ..., wsn (w)
Joy: Jeremy Lin (w), ws2 (w), ..., wsn (w)
Trust: ACM Conference (w), ws2 (w), ..., wsn (w)
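
The occurrence and ranking steps can be illustrated with the sketch below. The slides do not give the weight formula in text form, so the weight used here (a candidate's frequency divided by the total count in its emotion set) is only a placeholder assumption, as is the case-insensitive counting.

```python
from collections import Counter


def rank_interests(tagged, top_k=15):
    """Rank the candidates of each emotion set by a frequency-based weight.

    tagged maps an emotion to a list of candidates, with repeats.
    """
    ranked = {}
    for emotion, candidates in tagged.items():
        counts = Counter(c.casefold() for c in candidates)  # frequency f per ws
        total = sum(counts.values())
        # Placeholder weight w = f / total; the thesis formula may differ.
        ranked[emotion] = [
            (ws, f / total) for ws, f in counts.most_common(top_k)
        ]
    return ranked
```

The default `top_k=15` mirrors the top-15 lists used later in the experiments.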
![Page 37: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/37.jpg)
Interest Identification
![Page 38: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/38.jpg)
Experiments and Results
Two different types of experiments were conducted
![Page 39: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/39.jpg)
Experiment (1)
→ Experimental Setup
  ○ 3 active Twitter users (A, B, C)
  ○ The latest 3000+ English posts crawled from feed
  ○ The top-15 most frequent interests per emotion
  ○ Results rated by the users
![Page 40: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/40.jpg)
Evaluation
Rating scale: 0 = not related; 1~4 = related; 5~10 = highly related
User A: Top 15 frequent interests per emotion
![Page 41: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/41.jpg)
Evaluation
Rating scale: 0 = not related; 1~4 = related; 5~10 = highly related
User B: Top 15 frequent interests per emotion
![Page 42: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/42.jpg)
Evaluation
Rating scale: 0 = not related; 1~4 = related; 5~10 = highly related
User C: Top 15 frequent interests per emotion
![Page 43: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/43.jpg)
Evaluation
[Figure panels: User A, User B, User C]
![Page 44: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/44.jpg)
Experiment (2)
→ Experimental Setup
  ○ Online Surveys
  ○ 7 Users (A,B,C,D,E,F)
  ○ Top 5 interests (including 5 sub-category interests)
  ○ The latest 3000+ English posts crawled from feed
  ○ Interests are categorized (ConceptNet)
![Page 45: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/45.jpg)
Categorizing Interests
→ Hierarchical Interests Extraction
  ○ Top 15 interests in the 3 emotion sets are combined and categorized
  ○ ConceptNet API
  ○ 2 level “is-a” relationship
  ○ Observation: top interest candidates were highly related
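
The two-level "is-a" lookup can be sketched without network access as below. The `is_a` dictionary is a hypothetical stand-in for edges that would, in practice, be retrieved from ConceptNet's IsA relation; the entries shown are invented for illustration and are not ConceptNet data.

```python
def categorize(interest, is_a, levels=2):
    """Follow "is-a" edges upward for a fixed number of levels.

    is_a maps a concept to its parent concept; returns the list of
    ancestors found, nearest first.
    """
    path = []
    node = interest
    for _ in range(levels):
        parent = is_a.get(node)
        if parent is None:
            break
        path.append(parent)
        node = parent
    return path


# Hypothetical edges for illustration only.
toy_is_a = {
    "jeremy lin": "basketball player",
    "basketball player": "athlete",
    "athlete": "person",
}
categories = categorize("jeremy lin", toy_is_a)
```

Capping the walk at two levels keeps categories specific enough to be useful while still grouping related interests.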
![Page 46: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/46.jpg)
Evaluation
Precision of system on raw data (Twitter feed)
![Page 47: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/47.jpg)
Evaluation
● Precision of system when including ambiguous tweets
● Ambiguous tweets bear opposite or no emotion
![Page 48: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/48.jpg)
Evaluation
● Precision of the full system when considering positive emotions
● Average precision of approx. 81% as top performance (top-10).
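
The slides do not spell out how precision was computed. Assuming the standard precision-at-k definition, averaged over users, the measurement could be sketched as follows; the function names and the input format are illustrative.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked interests that the user marked relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def mean_precision_at_k(per_user, k):
    """Average precision-at-k over users.

    per_user is a list of (ranked_interests, relevant_set) pairs.
    """
    if not per_user:
        return 0.0
    return sum(precision_at_k(r, rel, k) for r, rel in per_user) / len(per_user)
```

Under this reading, a top-10 average of about 81% means that roughly 8 of every 10 returned interests were confirmed by the users.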
![Page 49: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/49.jpg)
Evaluation
Performance of components per user
![Page 50: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/50.jpg)
Evaluation
Precision comparison of all components evaluated
![Page 51: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/51.jpg)
Conclusion
→ Positive emotions contribute substantially to the identification of user interests, as seen in the experiments section.
→ Emotion analysis is an important component for effectively ranking users’ interests and removing ambiguous information.
![Page 52: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/52.jpg)
Future Work
→ Analyze emotion distribution to observe if there are patterns in the change of interests.
→ Adopt machine learning techniques to automate feature extraction for interest identification.
→ Improve approach by considering temporal information and negative emotions as a weighting factor.
→ Improve Interest categorization.
![Page 53: Thesis oral defense 2015 elvis saravia](https://reader033.vdocuments.mx/reader033/viewer/2022052706/5a64c0767f8b9ac21c8b55ff/html5/thumbnails/53.jpg)
Thanks for listening...
Q & A