Sentiment Analysis Using Solr
DESCRIPTION
Solr is a popular, widely used open source information retrieval (IR) engine. It can also serve as a simple sentiment analysis and sentiment retrieval tool. Its multi-language analyzers, together with the UIMA (Unstructured Information Management Architecture) framework, can be extended for sentiment extraction. Each sentence passes through a series of pluggable annotators, which detect an entity and its associated polarity. The polarity of each sentence is stored in the Solr index. Persistent model files can be created from training data and accessed at run time.
TRANSCRIPT
By: Pradeep Pujari
Working mostly in Search domain
Search = IR + ML + NLP
Who am I?
Works for
Contributing to SolrSherlock
- Open Source Project
http://solrsherlock.github.io/SolrSherlock/
What is Sentiment Analysis?
A linguistic analysis technique that identifies opinion early in a piece of text.
The movie is great.
The movie stars Mr. X
The movie is horrible.
Challenging
[Figure: axes labeled "Difficulty" (from "Too easy" to "Too hard") and "misclassification"]
What is Sentiment Analysis?
Sentiment Analysis
NLP
Cognitive Science
What is Sentiment Analysis?
Humans can easily understand emotions.
Can a machine be trained to do it?
What is Sentiment Analysis?
Solr ?
HTTP Request Servlet
Admin Interface
Update Servlet
Standard Request Handler
Custom Request Handler
Response Writer
Solr Core
Lucene
Analysis
UIMA
Config
Caching
Update Handler
Linguistics module: stems, lemmas and synonyms
Multi-language capability: CJKAnalyzer, UIMA analyzers
UIMA integration: UpdateProcessorChain
Why Solr ?
Why Solr ?
Extract domain-specific entities and concepts
Time and Cost
Solr Set Up – 5 mins
UIMA Annotators - 5 days
Enrich text, write to dedicated field
Tagging entities in review text
Usecase
I wasn't really in the market for another tablet, but my girlfriend ended up getting one for me so she got me on this one. I would like to say that this tablet reminds me of the first Motorola Droid smartphone that came out several years back. The phone jam packed a ton of bells & whistles into its hardware and software to give a lot of bang for your buck. This is what it feels like amazon has done with the Kindle Fire 8.9. They have put a lot of advanced hardware and innovative software, so for the average user, specially someone who absorbs a lot of media, you get a lot for the price. But just because you get a lot for the price, doesn't mean it is without its flaws.
Usecase
Consumer feedback about products
Which product features are more relevant?
Polarity
Digital SLR with Full 1080p HD Video
There are many preprogrammed scene modes that make this a very easy camera to use.
The picture quality is beyond belief, and even better for the price.
Price:
Usecase
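The use case above, finding which product features a review mentions and with what polarity, can be sketched roughly as follows. This is a minimal illustration and not the presenter's implementation: the feature list and the `POSITIVE` and `NEGATIVE` word sets are hypothetical stand-ins for what trained annotators or lexicons would supply.

```python
import re

# Hypothetical feature names and opinion words, chosen for illustration only.
FEATURES = ["picture quality", "price", "scene modes"]
POSITIVE = {"easy", "better", "great"}
NEGATIVE = {"horrible", "flaws", "poor"}


def feature_polarity(sentence):
    """Return {feature: polarity} for every known feature found in the sentence."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z']+", text))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    # Every feature in the sentence gets the sentence-level polarity.
    return {f: polarity for f in FEATURES if f in text}


reviews = [
    "There are many preprogrammed scene modes that make this a very easy camera to use.",
    "The picture quality is beyond belief, and even better for the price.",
]
for sentence in reviews:
    print(feature_polarity(sentence))
```

Assigning the sentence-level polarity to every feature in the sentence is the simplest possible choice; it breaks down when one sentence praises one feature and criticizes another.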
Why UIMA ?
The UIMA framework manages components and data flow – no coding
Deploy a pipeline of analysis engines (AEs)
AEs wrap NLP algorithms
Person
Place
Organization
Language Detection
Aggregate analysis engine
Sentence Annotator
POS Annotator
NER
Index
Lucene
Solr Update RequestProcessor
Solr
QParser Data
Solr+UIMA
UIMA AE
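The idea of an aggregate analysis engine, where annotators run in sequence and each adds annotations over the same text, can be illustrated with a plain-Python analogue. This sketch does not use the UIMA API (in UIMA the pipeline is wired up through descriptors rather than code). The `cas` dictionary is only loosely modeled on UIMA's CAS, the gazetteer is hypothetical, and the POS step is omitted for brevity.

```python
import re

# Hypothetical gazetteer; a real NER annotator would use a trained model.
GAZETTEER = {"Amazon": "organization", "Motorola": "organization"}


def sentence_annotator(cas):
    """Append one 'sentence' annotation per sentence-like span."""
    for m in re.finditer(r"[^.!?]+[.!?]?", cas["text"]):
        if m.group().strip():
            cas["annotations"].append(("sentence", m.start(), m.end()))


def entity_annotator(cas):
    """Append one annotation per gazetteer entry found in the text."""
    for name, kind in GAZETTEER.items():
        for m in re.finditer(re.escape(name), cas["text"]):
            cas["annotations"].append((kind, m.start(), m.end()))


def run_aggregate(text, annotators):
    """Run annotators in order over one shared structure."""
    cas = {"text": text, "annotations": []}
    for annotate in annotators:
        annotate(cas)
    return cas


cas = run_aggregate(
    "This is what Amazon has done. It reminds me of the first Motorola Droid.",
    [sentence_annotator, entity_annotator],
)
for kind, begin, end in cas["annotations"]:
    print(kind, repr(cas["text"][begin:end]))
```

Because annotations are stored as offsets into the original text, later steps can combine them, for example to find which entities fall inside which sentence.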
NLP+UIMA
Use POS in query understanding
Boosting terms
Synonym expansion
Extract concepts/entities
Faceting using entities
Identify places in query and use spatial queries
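Two of the query-side ideas above, boosting terms and synonym expansion, can be sketched as a small query builder. The output uses the standard Lucene/Solr query syntax (`field:term^boost`, `OR`, `AND`). The synonym table, the field name `review_text`, and the boost value are assumptions made for illustration.

```python
# Hypothetical synonym table; in practice this could come from a synonyms file.
SYNONYMS = {"tablet": ["slate", "ipad"], "cheap": ["inexpensive", "affordable"]}


def expand_query(terms, field="review_text", original_boost=2.0):
    """Build a query string in which original terms outrank their synonyms."""
    clauses = []
    for term in terms:
        alternatives = [f"{field}:{term}^{original_boost}"]
        alternatives += [f"{field}:{syn}" for syn in SYNONYMS.get(term, [])]
        clauses.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(clauses)


print(expand_query(["cheap", "tablet"]))
```

Boosting the original term keeps documents that match the user's own wording ranked above those that match only a synonym.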
Ideas: Sentiment Analysis App
Identify Subjective Sentences from text
Remove noisy sentences – Regex, conditional probability
Graph min cut – LingPipe
Subjectivity Lexicons
Discard Facts and Objective Sentences
Subjectivity detector
Subjective
Objective
Polarity Classifier
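The two-stage flow above, a subjectivity detector followed by a polarity classifier, can be sketched as below. This is a lexicon-lookup stand-in, not the graph min-cut approach mentioned for LingPipe; the word lists are tiny and hypothetical.

```python
import re

# Tiny hypothetical lexicons for illustration; real systems use much larger ones.
SUBJECTIVE_CUES = {"great", "horrible", "love", "hate", "easy"}
POSITIVE = {"great", "love", "easy"}
NEGATIVE = {"horrible", "hate"}


def tokens(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def is_subjective(sentence):
    """Stage 1: keep a sentence only if it contains an opinion cue."""
    return any(t in SUBJECTIVE_CUES for t in tokens(sentence))


def polarity(sentence):
    """Stage 2: compare counts of positive and negative words."""
    toks = tokens(sentence)
    score = sum(t in POSITIVE for t in toks) - sum(t in NEGATIVE for t in toks)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def classify(sentences):
    """Discard objective sentences, then label the remaining ones."""
    return [(s, polarity(s)) for s in sentences if is_subjective(s)]


for sentence, label in classify(
    ["The movie is great.", "The movie stars Mr. X", "The movie is horrible."]
):
    print(label, "-", sentence)
```

The factual sentence "The movie stars Mr. X" never reaches the polarity stage, which is the point of filtering for subjectivity first.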
Ideas: Sentiment Analysis App
Sentiment intensity - SentiWordNet
WordNet-Affect: WordNet + annotated concepts
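Sentiment intensity goes beyond a positive/negative label by attaching a strength to each word. A rough sketch, assuming a lexicon with per-word positive and negative scores in the style of SentiWordNet: the numbers below are made up for illustration and are not taken from SentiWordNet itself, which scores WordNet synsets rather than bare words.

```python
import re

# Hypothetical (positive, negative) scores; not real SentiWordNet values.
SCORES = {
    "good": (0.5, 0.0),
    "great": (0.75, 0.0),
    "bad": (0.0, 0.5),
    "horrible": (0.0, 0.75),
}


def intensity(sentence):
    """Average (positive - negative) over scored words; 0.0 if none are found."""
    found = [SCORES[t] for t in re.findall(r"[a-z']+", sentence.lower()) if t in SCORES]
    if not found:
        return 0.0
    return sum(pos - neg for pos, neg in found) / len(found)


for s in ["The movie is good.", "The movie is great.", "The movie is horrible."]:
    print(f"{intensity(s):+.2f}  {s}")
```

With these assumed scores, "great" ranks above "good", which a plain positive/negative label cannot express.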
Ideas: Sentiment Analysis App
Hybrid model with an added dictionary
Update Handler with processor chain
Remove Duplicates processor
Logging processor
Custom Transform processor
Index processor
Update Processor Chain
Text Analyzers
Lucene
Lucene Index
Sentence Detection processor
Sentiment Classifier
Company Name Annotator
Sentiment Score processor
Product Reviews
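The processor chain above can be pictured as a list of steps that each receive a document, enrich or drop it, and pass it on until it reaches the index. The sketch below is a plain-Python analogue and not Solr code (in Solr the chain is declared in solrconfig.xml and the processors are Java classes). The field names `sentences` and `sentiment_score` and the word lists are hypothetical.

```python
import re

# Hypothetical word lists for illustration.
POSITIVE = {"great", "easy", "better"}
NEGATIVE = {"horrible", "flaws"}


def remove_duplicates(seen):
    """Build a processor that drops documents whose text was already seen."""
    def process(doc):
        if doc["text"] in seen:
            return None
        seen.add(doc["text"])
        return doc
    return process


def sentence_detection(doc):
    """Split the text into sentences and store them on the document."""
    parts = re.split(r"(?<=[.!?])\s+", doc["text"])
    doc["sentences"] = [s.strip() for s in parts if s.strip()]
    return doc


def sentiment_score(doc):
    """Write a simple word-count score to a dedicated field."""
    toks = re.findall(r"[a-z']+", doc["text"].lower())
    doc["sentiment_score"] = sum(t in POSITIVE for t in toks) - sum(t in NEGATIVE for t in toks)
    return doc


def run_chain(docs, processors):
    """Pass each document through the processors in order; None means dropped."""
    index = []
    for doc in docs:
        for process in processors:
            doc = process(doc)
            if doc is None:
                break
        else:
            index.append(doc)  # stands in for the final index step
    return index


reviews = [
    {"id": "1", "text": "The picture quality is great. It is easy to use."},
    {"id": "2", "text": "The picture quality is great. It is easy to use."},
    {"id": "3", "text": "It is not without its flaws."},
]
chain = [remove_duplicates(set()), sentence_detection, sentiment_score]
for doc in run_chain(reviews, chain):
    print(doc["id"], doc["sentiment_score"], doc["sentences"])
```

Keeping the score in its own field means it can later be used for filtering, sorting or faceting without re-running the classifier.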
http://lucene.apache.org/solr/
http://uima.apache.org/
http://alias-i.com/lingpipe/demos/tutorial/sentiment/read-me.html
http://openie.cs.washington.edu/
Questions ?
Thank You
Email: [email protected]