Movie Sentiment Analysis

Post on 17-Jul-2015



Team: Mohammad Saquib Aslam, Prathusha Etikyala, Saicharantej Kommuri, Aniket Baheti, Neha Garg, Ankur Wadhwa

So tell me, which movie should I watch?

Opinion Extraction, Better Promotion

Quick Review Classification, Rich User Engagement, Personalization

Accurate Rating, Good Recommendation, Minimum Search

Classification

Variables/Tool Used
Training dataset: 8000 reviews of different movies
Given variables: Phrase Id, Sentence Id, Phrase, Sentiment
Output variable: Sentiment (0 or 1)
Input variables: features extracted through the LightSIDE tool

Cleaning Steps
Every review was initially broken into phrases in separate rows with different phrase IDs
We kept only the phrase containing the entire review and discarded the other phrases
Initially 5 sentiments in the training dataset: +ve, Somewhat +ve, Neutral, -ve and Somewhat -ve
Discarded the Neutral sentiments and grouped:
i) +ve and Somewhat +ve into 1
ii) -ve and Somewhat -ve into 0

Models Used
Logistic Regression: accuracy 78%
Naive Bayes: accuracy 78.46%
Support Vector Machines: accuracy 74.68%
Highest accuracy was achieved with Naive Bayes, at 21.54% error

Can we do better? YES!
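The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline: the label strings, the tuple layout, and the rule that "the phrase containing the entire review" is the longest phrase for a given Sentence Id are all assumptions made for the example.

```python
# Assumed label strings; the slides name five sentiment levels but not their encoding.
LABEL_TO_BINARY = {
    "+ve": 1,
    "Somewhat +ve": 1,
    "-ve": 0,
    "Somewhat -ve": 0,
    # "Neutral" is deliberately missing so that neutral reviews are dropped.
}


def clean_reviews(rows):
    """Reduce phrase-level rows to one binary-labelled row per review.

    rows: iterable of (phrase_id, sentence_id, phrase, sentiment) tuples.
    Returns a list of (sentence_id, phrase, label) tuples sorted by sentence_id.
    """
    # Assumption: the full review is the longest phrase sharing a sentence_id.
    full_review = {}
    for _phrase_id, sentence_id, phrase, sentiment in rows:
        best = full_review.get(sentence_id)
        if best is None or len(phrase) > len(best[0]):
            full_review[sentence_id] = (phrase, sentiment)

    cleaned = []
    for sentence_id in sorted(full_review):
        phrase, sentiment = full_review[sentence_id]
        label = LABEL_TO_BINARY.get(sentiment)
        if label is not None:  # skip Neutral (or unknown) labels
            cleaned.append((sentence_id, phrase, label))
    return cleaned


if __name__ == "__main__":
    sample = [
        (1, 1, "A charming debut", "Somewhat +ve"),
        (2, 1, "charming", "+ve"),
        (3, 2, "An average film", "Neutral"),
    ]
    print(clean_reviews(sample))
```

Each surviving row carries the full review text and a 0/1 label, which is the shape the classifiers listed above would be trained on.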

Eliminate the need for manual ratings
1000 reviews per movie
20 movies in a week
A whopping $3000 per movie, just to rate them!
$3000/wk × 50 wks/yr = $150,000/year

Improve the success of sequels
70% success rate for a blockbuster's sequel
20 sequels on average in a year
Expected revenue from sequels = $1400 million/year
At a mere 1% of revenues as commission for improvement (90%) = $200 mn × 1% = $2 mn

Expanding the subscriber's lifetime value (LTV)
Present user churn rate at 50% accuracy = 25%
LTV = ARPA × gross profit margin / customer churn
Average LTV = $8 × 11% / 25% = $3.2/user
Increase LTV by 7 times for a rating improvement from 50% to 78%

Ad revenue on recommendation websites
10 million unique users monthly
Average revenue per engagement = $0.5
Current revenue at 50% accuracy = $60 million
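The two calculations above can be checked with a short Python sketch. The function names are hypothetical, and the ad-revenue function assumes one engagement per unique user per month over 12 months, which is the reading that reproduces the $60 million figure. With the inputs as printed ($8 ARPA, 11% margin, 25% churn) the LTV formula evaluates to about $3.52 rather than $3.2, so one of the printed figures may be rounded or mistyped; a 10% margin would give exactly $3.2.

```python
def lifetime_value(arpa, gross_margin, churn_rate):
    """LTV = ARPA x gross profit margin / customer churn."""
    if churn_rate <= 0:
        raise ValueError("churn_rate must be positive")
    return arpa * gross_margin / churn_rate


def annual_ad_revenue(monthly_users, revenue_per_engagement, months=12):
    """Assumes one engagement per unique user per month (not stated in the slides)."""
    return monthly_users * revenue_per_engagement * months


if __name__ == "__main__":
    print(round(lifetime_value(8, 0.11, 0.25), 2))  # inputs as printed on the slide
    print(round(lifetime_value(8, 0.10, 0.25), 2))  # margin that would match $3.2
    print(annual_ad_revenue(10_000_000, 0.5))
```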

Potential savings at 78% accuracy = $1.018 million

Guess the review sentiment!!
Review | Guess the sentiment | Rotten Tomatoes | Our prediction
The movie is so thoughtlessly assembled
Director Tom Dey demonstrated a knack for mixing action and idiosyncratic humor in his charming 2000 debut Shanghai Noon, but Showtime's uninspired send-up of TV cop show cliches mostly leaves him shooting blanks
Roman Polanski directs The Pianist like a surgeon mends a broken heart; very meticulously but without any passion
`Synthetic' is the best description of this well-meaning, beautifully produced film that sacrifices its promise for a high-powered star pedigree
A film of empty, fetishistic violence in which murder is casual and fun

Way forward
Can we make it more robust? Can we expand the market scope? Can we reuse our model?
2) Category expansion: Expose the sentiment analysis as an API for consumption by books, entertainment and e-commerce websites
Suggest the right movie for improving TRP: there is no precise method behind screening movies on a TV network, only a flat rate based on popularity at the box office
3) Algorithmic improvement: Improve the algorithms to interpret ambiguous phrases, e.g. "this is great", "this is not great", "this could be great", "if this were great", "this is just great"
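To illustrate why phrases such as "this is great" and "this is not great" are hard for word-count features, here is a minimal Python sketch of negation marking, one common preprocessing trick. It is not something the slides say the team used or planned; the negator list and the `NOT_` prefix are assumptions for the example, and it does nothing for conditional phrases such as "if this were great".

```python
# Assumed, deliberately small list of negation words.
NEGATORS = {"not", "no", "never"}


def mark_negation(tokens):
    """Prefix every token after a negator with NOT_ until the phrase ends.

    The negator itself is dropped, so "great" and "NOT_great" become
    distinct features instead of sharing the single feature "great".
    """
    marked = []
    negated = False
    for token in tokens:
        if token.lower() in NEGATORS:
            negated = True
            continue
        marked.append("NOT_" + token if negated else token)
    return marked


if __name__ == "__main__":
    for phrase in ["this is great", "this is not great"]:
        print(mark_negation(phrase.split()))
```

A classifier trained on the marked tokens can learn opposite weights for `great` and `NOT_great`, which plain unigram counts cannot express.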

Not sure which movie to watch this weekend?

You know who to ask. Thank you!!