when to use the different text analytics tools - meaning cloud
Post on 12-Apr-2017
Text Analytics Tools: When and How to Use Them
February 8th, 2017
Webinar
Text Analytics Tools
Before we get started…
Presenter: Antonio Matarranz, CMO
How to participate
• Send questions with the chat feature, or
• Click the “Raise your hand” button to speak and we’ll enable your mic
• Afterwards, you’ll be able to access a recording of the webinar and its contents as tutorials on our blog
The purpose of this webinar…
Learn what the main Text Analytics functions are and what they can do for us
Agenda
Introduction to text analytics
Application scenarios. Benefits and challenges
Text analytics functions. Description and use cases
Quality of text analytics tools
A look at MeaningCloud’s roadmap
Conclusions and Q&A
Why should we be using text analytics?
Structured data
Unstructured content
Text analytics
Semantic analysis: opinions, facts, concepts, organizations, people, relationships, themes
Extract meaning and actionable insights from unstructured content
Automation of costly manual activities
Text analytics functions
Information extraction, NER
Categorization
Clustering
Sentiment analysis
Morphosyntactic analysis
…
APPLICATION SCENARIOS
Social media analysis
Management of user-generated content
Security & defense
Challenge: informal language
Understand the conversation in social networks, blogs, forums…
Brand and reputation monitoring
Signals, customer journey, intent, social leads
User profiling
Voice of the Customer (VoC) / Customer Experience
Extend your view of the customer to new, non-traditional data sources: comments in surveys, contact center interactions, social conversations…
360º vision: demographic data, CRM / marketing automation, contact center interactions, devices, product use, navigation, social, orders and payments
Unsolicited, unstructured sources help create an integrated 360º customer view
An integrated customer view helps provide personalized, consistent, context-specific and relevant experiences
Voice of the Citizen/Voter
Analysis of social opinions and segmentation makes it possible to understand citizen attitudes and behaviors
Citizen profiling. Opinions and trends about the political situation, the government and its services
Emergency detection and lifecycle management
Voice of the Employee / People Analytics
Leaders
Regular Army
Geeks
Improve workforce understanding
Analysis of surveys, performance reviews, exit interviews, CVs, communications
Attitudes/skills/behaviors most present among top performers
Effective talent management and employee retention
Intelligent content (media, publishers)
Semantic analysis of content for enhanced exploitation and relation
Better understanding and use of the archive. Generation of high-value content
Improved audience engagement thanks to personalization, recommendation and topical content
New ways of monetization: targeted advertising, distribution and syndication
Moderation and understanding of user-generated content
Knowledge management
For knowledge-intensive industries and departments
Leverage the tacit knowledge hidden in your document repositories
Semantic tagging and analysis of documents for advanced retrieval and exploitation
E-discovery and regulatory compliance
Analysis of electronic documents and communications to discover evidence
Legal proceedings, regulated industries (e.g., financial services)
Sources: documents, phone call transcriptions, email, chat, social…
Low latency enables criminal behavior prevention and quick response
TEXT ANALYTICS FUNCTIONS
MeaningCloud: “Meaning as a Service”
(SaaS and on-premises)
Sign up, and use it for FREE at
http://www.meaningcloud.com
MeaningCloud’s APIs
• Identifies occurrences of names of people, organizations, abstract concepts, quantities, etc.
• Theme classification according to predefined taxonomies
• Identifies general and attribute-level polarity
• Distinguishes among 60 languages
• Detailed morphosyntactic analysis
• Evaluates the impact of opinions on several reputational axes
• Discovers meaningful topics and similarities among texts without relying on predefined taxonomies
Add-in for Excel
An experience fully integrated into Excel
Easy to use - No programming!
The most convenient way to evaluate, prototype, and use MeaningCloud
Topic Extraction API
Disambiguate appearances of brands, companies, organizations, people, concepts… and many more
Contextual disambiguation
• Apple = company (not fruit)
Coreference
Based on standard ontology
Extendable/customizable dictionaries
In a filing with the SEC today, Apple revealed that CEO Tim Cook has donated the equivalent to approximately $6.5 million in Apple stock shares to charity this week. Since becoming CEO in 2011, Cook has promoted charity as a key part of Apple’s mission. Upon taking over, Cook initiated an employee charity program. Apple has also expanded its offerings for employees to help their communities.
Topic detected    Semantic information
Tim Cook          Person, Timothy Donald Cook, Executive Apple Inc.
Apple             Company, Apple Inc., Technology, USA
SEC               Organization, Securities and Exchange Commission, Government, USA
$6.5 million      Monetary amount, USD, 6.5 million
charity           Concept, charity
MeaningCloud: standard ontology
Built-in ontology
437 nodes
78 themes
250,000+ lemmas/language
Continuously updated
https://www.meaningcloud.com/developer/documentation/ontology
What is topic extraction for?
Sophisticated detection of appearances/mentions of brands, people, companies, concepts…
• Context-aware disambiguation
• Considering variants
• Coreference
Application examples:
• Keyword extraction
• Document annotation: news, books, emails, records
• Social media monitoring
• Voice of the Customer / Employee / Citizen / Patient analysis
• User profiling (interests)
Text Classification API (featuring standard models, e.g. IAB)
Mix machine learning and rules to accurately classify text according to predefined categories
The World Cup is the best way to see the potential football can have for your inbound travel, economic success and positive public image:
The 2006 World Cup in Germany was a prime example of this power with: $200+ per day average tourist spending, 50,000 new jobs created, 18 million people at Fan-Fests, total worldwide TV viewership at 30 billion and 4.2 billion official webpage views. In a survey, 90% of foreigners who visited the World Cup said they felt welcome there and would recommend Germany as a holiday destination. "The World Cup marks an enormous gain in Germany's image, even if it's difficult to put an economic figure on this change in image, the economy as a whole will certainly benefit from it." the German economics minister, Michael Glos, said.
Model: IAB (English)
Categories                           Relevance
Sports - World soccer                0.7
Travel - Europe                      0.2
Arts & Entertainment - Television    0.3
Hybrid technology
• Machine learning and/or rules
Features standard classification models
• IPTC (news), IAB (advertising), EuroVoc (public administration), Social Media, Business Reputation
Customizable classification models
MeaningCloud: standard classification models
‘Out-of-the-box’ support of well-known predefined classification standards
IPTC: news
IAB: targeted advertising
EuroVoc: public administration
Social Media: social conversations
… and more to come
https://www.meaningcloud.com/developer/documentation/supported-models
Classification technologies
Classifiers use patterns/vectors that represent each category
Technologies to generate those representations
• Statistical
• Rule-based
Statistical: training documents for category → machine learning → category representation
Rule-based: rules for category (Rule 1, Rule 2, Rule 3, Rule 4) → rule codifier → category representation
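The two routes to a category representation can be sketched in a few lines of Python. This is only a toy illustration of the general idea, not MeaningCloud's implementation: the category names, training snippets, rules and the 0.5 blending weight are all invented for the example.

```python
from collections import Counter
import math
import re


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def centroid(docs):
    """Statistical route: add up the term counts of a category's training documents."""
    total = Counter()
    for doc in docs:
        total.update(tokens(doc))
    return total


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical training documents (statistical route).
TRAINING = {
    "Sports": ["the team won the football match", "a late goal decided the cup final"],
    "Travel": ["cheap flights and hotels for your holiday", "the best holiday destination in europe"],
}

# Hypothetical rules (rule-based route): a rule fires when all of its terms appear.
RULES = {
    "Sports": [{"world", "cup"}, {"football"}],
    "Travel": [{"holiday", "destination"}, {"tourist"}],
}


def classify(text, weight=0.5):
    """Blend the statistical similarity with a rule hit and rank the categories."""
    words = Counter(tokens(text))
    scores = {}
    for category in TRAINING:
        statistical = cosine(words, centroid(TRAINING[category]))
        rule_hit = any(rule <= set(words) for rule in RULES[category])
        scores[category] = weight * statistical + (1 - weight) * (1.0 if rule_hit else 0.0)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


print(classify("The World Cup shows the potential of football"))
```

The blend is deliberately naive; in practice the weight, the rule language and the learning method would be chosen per model.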
What is text classification for?
Theme categorization: category is inferred from whole content
• Text is similar to others belonging to the category
• Text satisfies certain rules
• In general, no specific term needs to appear explicitly
Application examples:
• Document annotation: news, books, emails, records
• Voice of the Customer / Employee / Citizen / Patient analysis
• Conversation analysis in social media
• User profiling (interests)
Text Clustering API
Group similar texts and discover meaningful themes
Financial crisis
Greenhouse effect
No predefined taxonomy required (unsupervised learning)
Text-specific processing
Text grouping based on
• Adherence to a theme
• Content similarity
Cluster title        Size   Score   Document list
Financial crisis     4      0.96    Doc1, Doc4, Doc7, Doc8
Greenhouse effect    5      0.34    Doc2, Doc3, Doc5, Doc6, Doc9
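Grouping by content similarity without a predefined taxonomy can be illustrated with a very small sketch. It assumes a bag-of-words representation and a greedy single-pass grouping with a similarity threshold; the documents, stopword list and threshold are invented for the example, and the API's actual algorithm is not described in this deck.

```python
from collections import Counter
import math
import re

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on", "for"}


def vector(text):
    """Bag-of-words term counts, ignoring a few common stopwords."""
    return Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Assign each document to its most similar cluster, or start a new one."""
    clusters = []
    for name, text in docs.items():
        vec = vector(text)
        best, best_score = None, 0.0
        for candidate in clusters:
            score = cosine(vec, candidate["centroid"])
            if score > best_score:
                best, best_score = candidate, score
        if best is not None and best_score >= threshold:
            best["centroid"].update(vec)   # fold the document into the cluster
            best["members"].append(name)
        else:
            clusters.append({"centroid": Counter(vec), "members": [name]})
    return clusters


# Hypothetical documents, loosely following the two themes on the slide.
DOCS = {
    "Doc1": "Banks struggle as the financial crisis deepens",
    "Doc2": "Greenhouse gas emissions keep rising",
    "Doc3": "The financial crisis hits European banks",
    "Doc4": "Scientists warn about the greenhouse effect and emissions",
}

for group in cluster(DOCS):
    print(len(group["members"]), group["members"])
```

No category names are supplied anywhere: the two groups emerge only from the overlap in vocabulary.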
What is text clustering for?
Grouping of similar texts and discovery of meaningful themes
• Without relying on predefined taxonomies
Application examples:
• Duplicate detection
• Discovery of structure in document collections
• Discovery of conversation themes in social media
• Discovery of the "new voice" of Customer / Employee / Citizen / Patient
Sentiment Analysis API
Assign multilevel polarity to entities and other aspects, discriminate facts from opinions and detect irony
Example text: “Excelsior Hotel has the most amazing landscapes I've ever seen, but the rooms are disgusting.”
Aspect                         Sentiment
Excelsior Hotel - landscapes   P+
Excelsior Hotel - rooms        N-
General                        NEU, DISAGREEMENT, SUBJECTIVE, NON IRONIC
5-level polarity (plus absence of polarity) scoring
Aspect-based analysis
Objective (fact) / subjective (opinion) discrimination
Irony detection (beta)
Customizable sentiment models
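A request for this kind of analysis could look roughly like the following sketch, which uses only Python's standard library. The endpoint URL, the parameter names (`key`, `lang`, `txt`) and the response field names (`score_tag`, `agreement`, `subjectivity`, `irony`) are assumptions that are not stated in this deck and should be checked against MeaningCloud's developer documentation; the sample response is hand-written to mirror the "General" row above and is not real API output.

```python
import json
import urllib.parse
import urllib.request

# Assumption: endpoint and parameter names; verify them in the official docs.
SENTIMENT_URL = "https://api.meaningcloud.com/sentiment-2.1"


def build_request(text, api_key, lang="en"):
    """Build a form-encoded POST request without sending it."""
    data = urllib.parse.urlencode({"key": api_key, "lang": lang, "txt": text}).encode("utf-8")
    return urllib.request.Request(SENTIMENT_URL, data=data, method="POST")


def analyze(text, api_key, lang="en"):
    """Send the request and decode the JSON response (needs network and a valid key)."""
    with urllib.request.urlopen(build_request(text, api_key, lang), timeout=30) as response:
        return json.load(response)


def summarize(response):
    """Pick the document-level fields; the field names are assumptions."""
    fields = ("score_tag", "agreement", "subjectivity", "irony")
    return {field: response.get(field) for field in fields}


# Hypothetical response, written by hand for illustration only.
sample = {"score_tag": "NEU", "agreement": "DISAGREEMENT", "subjectivity": "SUBJECTIVE"}
print(summarize(sample))
```

Keeping `build_request` separate from `analyze` makes it possible to inspect the payload before any call is made.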
What is sentiment analysis for?
Opinion analysis and mining (polarity)
• General and at attribute/aspect level
• Fact/opinion discrimination
Application examples:
• Social media monitoring
• Voice of the Customer / Employee / Citizen / Patient analysis
Lemmatization, PoS and Parsing API
Detailed morphosyntactic and semantic analysis
Syntactic analysis
Lemmatization
Part of Speech tagging
Relationships
Quotations
Topics: entities, concepts, etc.
Sentiment analysis
What is morphosyntactic analysis for?
Analysis of a text’s deep structure
• Morphological, grammatical, semantic
Application examples:
• Text proofreading: spelling, grammar and style
• Support for the detection of semantic relationships, e.g., “CompanyX has invested in CompanyY”
• In MeaningCloud’s case, applications of Topics Extraction and Sentiment Analysis
Use it for FREE at www.mystilus.com
User Profiling API
Use the profile and content generated by users to infer their demographic and psychographic attributes
Social posts
• 20% of companies say process digitization yields actionable #analytics
• Is your IT team talking SMAC (#social, #mobile, #analytics, & #cloud)?
• Five Rules of Modern Icon Design http://bit.ly/1y3B6i6
• What Twitter Can Be. http://wp.me/p2Gq8C-6E Just if they'd play nice with the ecosystem ... #socialtv #recommendation
• What your name says about your age, where you live, your politics & your job http://wapo.st/1RkqDcA
Social profile
Londoner, hooked on data science, NLP and REST.
Attribute             Value
Person/Organization   Person
Gender                Male
Age                   25-35
Location              London
Occupation            Engineer
Brands                IBM
Demographic: person/organization, gender, age, location, occupation
Psychographic: affinities, lifestyle…
What is user profiling for?
Demographic and psychographic profiling of users
Application examples:
• Audience/Market understanding and segmentation
• Community analysis in social media
• Influencer marketing
IS THIS ALL A QUESTION OF PRECISION?
Just how precise is precise?
Precision is relative
Even experts aren’t 100% precise
• Tests involving human analysts: 85-95% agreement
Along with precision, recall is also important
Figure: items identified by the algorithm under high precision and high recall, high precision and low recall, and low precision and high recall
Accuracy: precision & recall
Precision and recall tend to be inversely related
• Trade-off needed
Requirements are application-specific
• Brand monitoring in social media: high precision, low recall
• Counter-terrorism: high recall, low precision
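The trade-off can be made concrete with a few lines of Python. Precision is the share of items identified by the algorithm that are actually relevant; recall is the share of relevant items that the algorithm identified. The document IDs below are invented for illustration.

```python
def precision_recall(identified, relevant):
    """Return (precision, recall) for a set of identified items against the relevant ones."""
    identified, relevant = set(identified), set(relevant)
    true_positives = len(identified & relevant)
    precision = true_positives / len(identified) if identified else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Ten documents that really mention the brand (hypothetical IDs).
relevant = set(range(1, 11))

# A cautious system flags only four documents, all of them correct.
cautious = {1, 2, 3, 4}
print(precision_recall(cautious, relevant))   # (1.0, 0.4)

# A greedy system flags twenty documents and catches every relevant one.
greedy = set(range(1, 21))
print(precision_recall(greedy, relevant))     # (0.5, 1.0)
```

The cautious system resembles the brand-monitoring case (few false alarms, some mentions missed), while the greedy one resembles the counter-terrorism case (nothing missed, many false alarms).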
Customized linguistic resources improve accuracy
Example: analysis of a bank’s customer opinions
Opinions
• The sentence “The highest interest rate in industry!” is positive if talking about savings, negative if talking about mortgages
Mentions
• Names of banks and financial companies, e.g., JPMorgan, BNP Paribas, Citibank
• Product names, e.g., Your Way Account, Compass Account…
Themes
• Products, Accounts, Checking, Savings, Borrowing, Credit, Mortgage, Channel, Office, Phone, Internet
MeaningCloud customization tools
Customization tools
Create your own dictionaries, classification models, and sentiment models
Graphical user interface - no programming!
Improve precision & recall
Learn more about customization in this webinar
A view into the future
MeaningCloud’s roadmap
Extension for RapidMiner: combine data and text analytics
New languages: Russian, Chinese, Arabic… and many more
New APIs: Summarization, Parts of Document
Vertical Packs: VoC (general and several industries), VoE, Health
Insight Extractor: a granular categorizer and information extractor based on semantic rules
Timeline: Q1 2017, Q2 2017, Q3 2017, Q4 2017, Q1 2018
Milestones: Extension for RapidMiner, Insight Extractor, additional languages, Summarization API, Parts of Document API, Vertical Packs
In conclusion
Tools that turn text into insights
Countless applications
Accuracy = customization
MeaningCloud: specialists in text analytics
Q & A
Stay tuned to our emails and blog
We’ll be posting a recording of the webinar and its contents as tutorials soon
Thank you for your attention!
Questions, suggestions...
Antonio Matarranz
CMO
amatarranz@meaningcloud.com
http://www.meaningcloud.com