Introduction to Sentiment Mining
Sentiment Mining
Prof. Maurice Mulvenna
University of Kassel 14 December 2011
Outline
§ Ulster
§ What is Sentiment Mining
§ Why Sentiment Mining
§ Challenges
§ Methods
§ Data Sources
§ Applications
§ Examples
§ Simple Keyword-based Prototype
§ Some Results
The Right Choice
FOUR CAMPUSES, ONE UNIVERSITY
COLERAINE, JORDANSTOWN, MAGEE, BELFAST
• Largest university in Ireland, with a student body of over 25,500 local, national and international students
• International reputation in research
• “Excellence” in teaching
• Graduate employment well above national average
• Excellent study and recreational facilities
University of Ulster in Top 10 UK universities in applications
What to Study
Around 600 degree programmes:
§ Computing and Multimedia
§ Electrical and Mechanical Engineering
§ Humanities/Performing Arts
§ Life and Health Sciences
§ Social Sciences
§ Art, Design and Built Environment
§ Business and Management
Faculty of Computing and Engineering
Within the Faculty there are:
§ 5 Schools
§ Approximately 3000 students
§ 200 staff
§ Extensive specialist facilities on the Coleraine, Jordanstown and Magee Campuses
What is Sentiment Mining
§ Also referred to as sentiment analysis or opinion mining
§ It refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. (Wikipedia)
§ Its aim is to determine the attitude or mood of a user or user group (e.g. happy or sad); the contextual polarity of statements or larger documents (e.g. positive or negative); and the intended emotional communication (e.g. sarcasm or irony)
Why Sentiment Mining
§ Capture and analyse public opinion
§ Capture the word-of-mouth effect
§ Evaluate the social profile of individuals
§ News detection and analysis
§ Quantify the emotional state of users (e.g. duress, stress, sadness, anger, etc.)
§ Feedback mechanism, e.g. for policy makers
§ National (e.g. UK riots) and international (الربيع العربي or ‘Arab Spring’) events that impact and resonate in people’s daily lives
Marketing
http://mashable.com/2011/11/23/kindle-fire-nook-ipad-online-buzz/
Challenges
§ Sentiment is a subjective measure and as such is subject to interpretation
§ Data volumes: number of statements, users, documents, etc.; size of documents and their complexity (topic, sentence, paragraph, chapter, document level)
§ Noise and unstructured data
§ Slang, vernaculars, abbreviations (e.g. wdc, cu, ru, lol, etc.)
§ Language heterogeneity: demographic dependencies, social dependency
§ Ambivalence
§ Complexity of NLP tasks
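One common way to reduce the noise from slang and abbreviations is to expand them before any analysis. The following is a minimal sketch only: the `ABBREVIATIONS` table and the `normalise` function are hypothetical names introduced here for illustration, and the expansions cover just a few of the abbreviations listed above.

```python
import re

# Hypothetical lookup table; a real system would need a much larger,
# community-specific dictionary of slang and abbreviations.
ABBREVIATIONS = {
    "cu": "see you",
    "ru": "are you",
    "lol": "laughing out loud",
}

def normalise(text):
    """Lowercase the text, drop punctuation and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

print(normalise("RU ok? CU later, LOL"))
```

A table lookup like this cannot resolve abbreviations whose meaning depends on context, which is part of why noise remains a challenge.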
Methods
§ Keyword-based approaches
§ Machine learning techniques: latent semantic analysis, support vector machines, “bag of words” methods, Naive Bayes classifiers
§ Other NLP tools that allow the detailed parsing of text-related sources, including the underlying grammar
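As an illustration of the keyword-based approach combined with a bag-of-words representation, the sketch below counts positive and negative keywords in a text and reports the polarity. It is not the prototype presented in this talk: the keyword sets and the function names `bag_of_words` and `keyword_polarity` are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a usable lexicon would be far larger.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}

def bag_of_words(text):
    """Count each lowercase word in the text, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def keyword_polarity(text):
    """Classify text as 'positive', 'negative' or 'neutral' by keyword counts."""
    bag = bag_of_words(text)
    # A Counter returns 0 for words that do not occur in the text.
    score = sum(bag[w] for w in POSITIVE) - sum(bag[w] for w in NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(keyword_polarity("I love this great phone"))
print(keyword_polarity("Terrible battery, bad screen"))
```

Because word order is discarded, a sketch like this misreads negation ("not good") and sarcasm, which is one motivation for the machine learning and parsing-based methods.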
Data Sources
§ Any single document or document collection (e.g. reviews of any kind: travel, food, movie, etc.)
§ Social media networks (e.g. Twitter)
§ Spoken communication (either directly or after converting it into a textual representation)
→ Any source in which an opinion or emotion is expressed or communicated
Applications
§ Reputation Management
§ Customer Profiling
§ Product Management
§ News Detection and Analysis
§ Public Opinion Analysis
§ Affective Computing, where systems should interpret the emotional state of users and adapt their behaviour accordingly, providing an appropriate response to the emotions detected
The essence of the book is Lanier's attempt to answer the question: "What happens when we stop shaping technology and technology starts shaping us?"
Prototype Architecture
Prototype Interface
Topic Map & Tag Clouds
Timeline
Runtime
Keyword
100 Keywords
6500 Keywords
Evaluation & Accuracy
Manual Classification
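One plausible way to measure accuracy against a manual classification is the fraction of items on which the automatic label matches the human label. The sketch below assumes that definition; the `accuracy` function is a hypothetical helper and the labels are made-up illustration data, not results from this talk.

```python
def accuracy(predicted, manual):
    """Return the share of items where the predicted label equals the manual one."""
    if len(predicted) != len(manual):
        raise ValueError("both label lists must have the same length")
    if not manual:
        raise ValueError("at least one labelled item is required")
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)

# Made-up labels for illustration only.
predicted = ["positive", "negative", "neutral", "positive"]
manual = ["positive", "negative", "positive", "positive"]
print(accuracy(predicted, manual))
```

Since sentiment is subjective, the manual labels themselves may vary between annotators, so a score like this is best read relative to human agreement.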
SentiGen
Thank you