Why Sentiment Analysis is a Market for Lemons … and How to Fix it
![Page 1: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/1.jpg)
Language Intelligence
Why Sentiment Analysis is a Market for Lemons … and How to Fix it
Robert Munro
![Page 2: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/2.jpg)
With thanks!
Gary King & Jana Thompson:
<- other Idibon people here: Michelle Casbon & Nick Gaylord
![Page 3: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/3.jpg)
What is a market for lemons?
• Information asymmetry between buyers and sellers, leaving only "lemons" behind (George Akerlof)
• Buyers cannot distinguish good from bad products
• Prices are equally low for all products
• The buyer's adverse selection problem drives the high-quality products from the market
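The unraveling described in these bullets can be sketched in a few lines. This is a toy illustration, not Akerlof's full model: the quality values are hypothetical, and it assumes each seller's reservation price equals the product's quality while buyers, unable to tell products apart, pay the average quality of whatever is still on offer.

```python
def unravel(qualities, rounds=10):
    """Return (price, sellers_remaining) for each round of a toy lemons market."""
    market = sorted(qualities)
    history = []
    for _ in range(rounds):
        if not market:
            break
        # Buyers cannot distinguish products, so they pay the average quality.
        price = sum(market) / len(market)
        history.append((price, len(market)))
        # Sellers whose product is worth more than the price withdraw.
        remaining = [q for q in market if q <= price]
        if remaining == market:
            break  # nobody else leaves; the market has settled
        market = remaining
    return history


# Hypothetical quality values for five sellers.
for price, sellers in unravel([1000, 2000, 3000, 4000, 5000]):
    print(f"price={price:.0f} sellers={sellers}")
```

Under these assumptions the price falls each round and only the lowest-quality seller remains at the end.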
![Page 4: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/4.jpg)
Competition is not increasing accuracy
• 100+ companies offering some form of sentiment analysis
• Accuracy hovering around 70% for real-world applications for almost a decade
![Page 5: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/5.jpg)
The most honest sentiment analysis results you will see

| System | Accuracy | F-Score | Recall (Positive) | Recall (Negative) | Recall (Neutral) | Precision (Positive) | Precision (Negative) | Precision (Neutral) | F-Score (Positive) | F-Score (Negative) | F-Score (Neutral) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Semantria | 0.59 | 0.59 | 0.56 | 0.47 | 0.78 | 0.68 | 0.80 | 0.45 | 0.62 | 0.59 | 0.57 |
| MonkeyLearn | 0.50 | 0.38* | 0.84 | 0.54 | 0.00 | 0.45 | 0.60 | 0.00 | 0.59 | 0.57 | 0.00 |
| MetaMind | 0.66 | 0.66 | 0.68 | 0.46 | 0.88 | 0.78 | 0.88 | 0.50 | 0.73 | 0.60 | 0.64 |
| Idibon Public | 0.68 | 0.67 | 0.76 | 0.75 | 0.49 | 0.66 | 0.69 | 0.72 | 0.71 | 0.72 | 0.58 |

• Even within the best results for one domain, there is no clear leader when broken down by category
• All systems could have best results in other domains
• All could adapt here: MonkeyLearn had errors with the ‘Neutral’ category, but we are sure they could update their models

Source: Sentiment 140 corpus, 3-way sentiment on social data: http://cs.stanford.edu/people/alecmgo/trainingandtestdata.zip
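As a sanity check on the figures reported above, the per-category F-scores can be recomputed from the precision and recall values. The sketch below assumes the overall F-Score column is an unweighted (macro) average of the three per-category F-scores; the slide does not say so, but the arithmetic is consistent with that reading. The function names are illustrative.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(per_class):
    """Unweighted mean of the per-category F-scores (an assumed definition)."""
    scores = [f1(p, r) for (r, p) in per_class.values()]
    return sum(scores) / len(scores)


# (recall, precision) per category, copied from the results above.
systems = {
    "Semantria": {"positive": (0.56, 0.68), "negative": (0.47, 0.80), "neutral": (0.78, 0.45)},
    "MonkeyLearn": {"positive": (0.84, 0.45), "negative": (0.54, 0.60), "neutral": (0.00, 0.00)},
    "MetaMind": {"positive": (0.68, 0.78), "negative": (0.46, 0.88), "neutral": (0.88, 0.50)},
    "Idibon Public": {"positive": (0.76, 0.66), "negative": (0.75, 0.69), "neutral": (0.49, 0.72)},
}

for name, per_class in systems.items():
    print(f"{name}: macro F-score = {macro_f1(per_class):.2f}")
```

Rounded to two places, the macro averages match the reported overall F-Scores (0.59, 0.38, 0.66, 0.67), which also shows how a single failing category pulls MonkeyLearn's overall score well below its Positive and Negative scores.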
![Page 6: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/6.jpg)
Data beats algorithms; feedback beats data
[Chart: Distinguishing the correct ‘Ford’. Series: precision, recall, F-value, on a 0% to 100% axis. Data labels shown: 0.457, 0.473, 0.615, 0.948.]
Distinguishing “Ford” the company from people called “Ford”
![Page 7: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/7.jpg)
Consumers are uncertain
• When consumers try out-of-domain analysis, they lose confidence from the poor results.
• Domain-dependence means that even bad models will be accurate in some areas
• Consumers can only evaluate anecdotally or by precision, not recall
• Uncertainty prevails
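The precision/recall point can be made concrete with a small sketch. The ten labeled messages below are hypothetical toy data, not from the talk. A buyer browsing what the system flagged as "negative" can judge how many of those are right (precision), but never sees the negative messages the system missed, which is what recall measures.

```python
# Hypothetical toy data: (true_label, predicted_label) for ten messages.
items = [
    ("negative", "negative"),
    ("negative", "negative"),
    ("negative", "negative"),
    ("negative", "neutral"),
    ("negative", "neutral"),
    ("negative", "positive"),
    ("neutral", "neutral"),
    ("neutral", "neutral"),
    ("positive", "positive"),
    ("neutral", "negative"),
]


def precision(items, label):
    """Share of items predicted as `label` that truly are `label`."""
    flagged = [true for true, pred in items if pred == label]
    return sum(t == label for t in flagged) / len(flagged)


def recall(items, label):
    """Share of items that truly are `label` that were predicted as `label`."""
    relevant = [pred for true, pred in items if true == label]
    return sum(p == label for p in relevant) / len(relevant)


print("precision:", precision(items, "negative"))
print("recall:", recall(items, "negative"))
```

In this toy set, three of the four flagged items are correct, so the system looks good on inspection, yet it found only three of the six truly negative messages. Measuring that second number requires labeling data the system did not surface.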
![Page 8: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/8.jpg)
Market forces are not breeding innovation
• Can’t innovate through code alone
• More training data!
• But low price-points mean low margins
• Lack of capital to find & label enough training data
![Page 9: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/9.jpg)
The Solution
• A different economic model for useful sentiment analysis:
• Data-sharing for more accurate training data
• Protecting sensitive data from public release
![Page 10: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/10.jpg)
[Diagram: hybrid architecture.
Components: Machine learning; Optimization; Human annotation; Cloud prediction engine; On-site prediction engine; Actionable intelligence; firewall.
Labeled flows: Copy & Sync Models; App Requests; Ambiguous, Novel & Interesting Items.
Legend: Internal Data Flow; Hybrid Model Data Flow; Application Data Flow.]
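One way the "Ambiguous, Novel & Interesting Items" flow could work is sketched below. The slide does not say how such items are selected; this sketch assumes a simple confidence threshold, and every name in it (`Prediction`, `route`, the 0.7 cutoff) is hypothetical rather than part of any Idibon product.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    text: str
    label: str
    confidence: float  # model's confidence in `label`, between 0 and 1


def route(predictions, threshold=0.7):
    """Split predictions into those answered directly and those sent for human annotation.

    The threshold rule is an assumption made for illustration only.
    """
    answered, for_annotation = [], []
    for p in predictions:
        if p.confidence >= threshold:
            answered.append(p)
        else:
            for_annotation.append(p)
    return answered, for_annotation


# Hypothetical predictions from a prediction engine.
batch = [
    Prediction("love the new model", "positive", 0.93),
    Prediction("well that was something", "neutral", 0.41),
    Prediction("worst service ever", "negative", 0.88),
]
answered, for_annotation = route(batch)
print(len(answered), "answered;", len(for_annotation), "queued for annotation")
```

Only the low-confidence items would need human attention, which is where labeling effort adds the most to the shared models.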
![Page 11: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/11.jpg)
The Benefits
• Multiple organizations can share in the benefits of better sentiment analysis, without sacrificing privacy
• Single point of human-contact: no expensive duplicate manual labeling of data
• Keeps lemons out of the market
![Page 12: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/12.jpg)
Idibon Public: our implementation
• Free product, offered in addition to our enterprise Idibon Studio and Idibon Terminal solutions
![Page 13: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/13.jpg)
Applies to NLP and Machine Learning more broadly
Every human communication
• Any task can be bundled this way
• Allows margins for use cases that were not otherwise viable
• … including the full diversity of languages, priced out when everyone started in English
![Page 14: Why Sentiment Analysis is a Market for Lemons … and How to Fix it](https://reader035.vdocuments.mx/reader035/viewer/2022062523/58ed4cb91a28ab68588b46d7/html5/thumbnails/14.jpg)
Language Intelligence
Why Sentiment Analysis is a Market for Lemons … and How to Fix it
QUESTIONS?
Robert Munro