dandelion: semantic text analytics as a service
TRANSCRIPT
![Page 1: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/1.jpg)
Ugo Scaiella
R&D Team Lead @ Smarter Engagement – Milano, 20.05.2016
Dandelion: semantic text analytics as a service
![Page 2: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/2.jpg)
The bag-of-words paradigm
The Mona Lisa is a 16th century oil on canvas painted by Leonardo. It's held at the Louvre in Paris.
![Page 3: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/3.jpg)
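The bag-of-words paradigm

| Term | Freq |
|---|---|
| the | 2 |
| mona | 1 |
| leonardo | 1 |
| century | 1 |
| oil | 1 |
| Paris | 1 |
| Lisa | 1 |
| By | 1 |
| painted | 1 |
| at | 1 |
| canvas | 1 |
| ... | ... |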
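The term-frequency table on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the speaker's code: the tokenization rule (lowercase, then runs of letters, digits, and apostrophes) and the names `TEXT`, `bag_of_words`, and `term_freq` are assumptions made for illustration.

```python
import re
from collections import Counter

# The example sentence from the slides.
TEXT = (
    "The Mona Lisa is a 16th century oil on canvas painted by Leonardo. "
    "It's held at the Louvre in Paris."
)


def bag_of_words(text: str) -> Counter:
    """Lowercase the text, split it into word tokens, and count each term."""
    # Assumed tokenization rule: runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


term_freq = bag_of_words(TEXT)

# Print the most frequent terms, one per line, as "term<TAB>frequency".
for term, freq in term_freq.most_common(5):
    print(f"{term}\t{freq}")
```

Word order and meaning are discarded: the counts alone cannot tell whether "paris" is a city or a mythological figure.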
![Page 4: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/4.jpg)
Classic NLP pipeline
Segmentation → Tokenization → PoS Tagging → Chunking → Dependency Parsing
![Page 5: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/5.jpg)
Classic NLP pipeline

1 The the DT O 3 det
2 Mona Mona NNP O 3 compound
3 Lisa Lisa NNP O 8 nsubj
4 is be VBZ O 8 cop
5 a a DT O 8 det
6 16th 16th JJ DATE 8 amod
7 century century NN DATE 8 compound
8 oil oil NN O 0 ROOT
9 on on IN O 10 case
10 canvas canvas NN O 8 nmod
11 painted paint VBN O 10 acl
12 by by IN O 13 case
13 Leonardo Leonardo NNP PERSON 11 nmod
14 . . . O _ _

1 It it PRP O 3 nsubjpass
2 's be VBZ O 3 auxpass
3 held hold VBN O 0 ROOT
...
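The parser output above reads as one token per row. The sketch below loads the first sentence into a small data structure and queries it. The column interpretation (index, form, lemma, PoS tag, NER tag, head index, dependency relation) is an assumption based on the values shown, and `Token`, `parse_rows`, and `ROWS` are hypothetical names, not part of any particular NLP toolkit.

```python
from dataclasses import dataclass
from typing import List, Optional

# First sentence of the slide, one token per line, whitespace-separated.
ROWS = """\
1 The the DT O 3 det
2 Mona Mona NNP O 3 compound
3 Lisa Lisa NNP O 8 nsubj
4 is be VBZ O 8 cop
5 a a DT O 8 det
6 16th 16th JJ DATE 8 amod
7 century century NN DATE 8 compound
8 oil oil NN O 0 ROOT
9 on on IN O 10 case
10 canvas canvas NN O 8 nmod
11 painted paint VBN O 10 acl
12 by by IN O 13 case
13 Leonardo Leonardo NNP PERSON 11 nmod
14 . . . O _ _
"""


@dataclass
class Token:
    index: int
    form: str
    lemma: str
    pos: str
    ner: str
    head: Optional[int]    # None when the parser left the field as "_"
    deprel: Optional[str]  # None when the parser left the field as "_"


def parse_rows(rows: str) -> List[Token]:
    """Turn the whitespace-separated rows into Token objects."""
    tokens = []
    for line in rows.strip().splitlines():
        idx, form, lemma, pos, ner, head, deprel = line.split()
        tokens.append(
            Token(
                index=int(idx),
                form=form,
                lemma=lemma,
                pos=pos,
                ner=ner,
                head=None if head == "_" else int(head),
                deprel=None if deprel == "_" else deprel,
            )
        )
    return tokens


tokens = parse_rows(ROWS)

# The syntactic root of the sentence and the tokens attached directly to it.
root = next(t for t in tokens if t.deprel == "ROOT")
dependents = [t.form for t in tokens if t.head == root.index]

# Tokens that the named-entity tagger labelled with something other than "O".
entities = [(t.form, t.ner) for t in tokens if t.ner != "O"]
```

Even with a full parse, "Leonardo" is only tagged as a PERSON; nothing in these rows says which Leonardo is meant.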
![Page 6: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/6.jpg)
Limitations
“The book is on the table”
![Page 7: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/7.jpg)
Limitations
Training: expensive, hard
![Page 8: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/8.jpg)
![Page 9: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/9.jpg)
![Page 10: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/10.jpg)
The graph of concepts
The Mona Lisa is a 16th century oil on canvas painted by Leonardo. It's held at the Louvre in Paris.
![Page 11: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/11.jpg)
The graph of concepts
The Mona Lisa is a 16th century oil on canvas painted by Leonardo. It's held at the Louvre in Paris.
![Page 12: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/12.jpg)
The graph of concepts
PERSON: birthDate, birthPlace, deathDate, authorOf, ...
CONCEPT: ...
WORK: ...
PLACE: coords, capitalOf, population, ...
BUILDING: coords, ...
![Page 13: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/13.jpg)
Spots (aka mentions, surface forms)
• paris
• leonardo
• oil on canvas
• mona lisa

Concepts
• Oil painting
• Paris (mythology)
• Mona Lisa (painting)
• Mona Lisa (movie)
• Paris (city)
• Leonardo da Vinci
• Leonardo do Nascimento
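One way to picture how spots are resolved to concepts is a voting scheme over the concept graph: each candidate concept is scored by how many of the other spots have a candidate linked to it. The sketch below is a toy illustration of that general idea and is not presented as Dandelion's actual algorithm. The candidate lists come from the slide, while the `EDGES` set is hypothetical hand-written data standing in for a real knowledge graph.

```python
from typing import Dict, List

# Spots and their candidate concepts, as listed on the slide.
CANDIDATES: Dict[str, List[str]] = {
    "paris": ["Paris (city)", "Paris (mythology)"],
    "leonardo": ["Leonardo da Vinci", "Leonardo do Nascimento"],
    "oil on canvas": ["Oil painting"],
    "mona lisa": ["Mona Lisa (painting)", "Mona Lisa (movie)"],
}

# Hypothetical, hand-written links between concepts (undirected).
# A real system would derive these from a knowledge graph.
EDGES = {
    frozenset(pair)
    for pair in [
        ("Mona Lisa (painting)", "Leonardo da Vinci"),
        ("Mona Lisa (painting)", "Oil painting"),
        ("Mona Lisa (painting)", "Paris (city)"),
        ("Leonardo da Vinci", "Oil painting"),
    ]
}


def linked(a: str, b: str) -> bool:
    """Return True if the toy graph has an edge between the two concepts."""
    return frozenset((a, b)) in EDGES


def disambiguate(candidates: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick, for every spot, the candidate supported by the most other spots."""
    result = {}
    for spot, options in candidates.items():

        def score(concept: str) -> int:
            # Count the other spots having at least one linked candidate.
            return sum(
                any(linked(concept, other) for other in others)
                for other_spot, others in candidates.items()
                if other_spot != spot
            )

        # max() keeps the first option on ties.
        result[spot] = max(options, key=score)
    return result


resolved = disambiguate(CANDIDATES)
for spot, concept in resolved.items():
    print(f"{spot} -> {concept}")
```

In this toy graph the painting, the painter, the technique, and the city reinforce one another, so the mythological Paris, the movie, and the footballer receive no votes.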
![Page 14: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/14.jpg)
Advantages
• Less training
• Speed
• Customization
• Robustness to syntax
• … but still (may) use classic NLP to improve results
![Page 15: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/15.jpg)
Applications
• Entity Extraction
• Classification
• Similarity & clustering
… basically any IR task
![Page 16: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/16.jpg)
Applications: an example
Cameron wins the Oscar
Cameron wins general elections
All nominees for the Academy Awards
See more on https://dandelion.eu
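One plausible reading of the three headlines above is a similarity comparison: word overlap pairs the two "Cameron wins" headlines, while concept overlap pairs the Oscar headline with the Academy Awards one. The sketch below compares the two views with Jaccard similarity. The concept annotations in `CONCEPTS` are assigned by hand for illustration (they are assumptions, not output from the Dandelion API), and all function names are hypothetical.

```python
import re
from typing import Set

OSCAR = "Cameron wins the Oscar"
ELECTIONS = "Cameron wins general elections"
NOMINEES = "All nominees for the Academy Awards"

# Hand-assigned concept annotations, assumed for illustration only.
CONCEPTS = {
    OSCAR: {"James Cameron", "Academy Awards"},
    ELECTIONS: {"David Cameron", "General election"},
    NOMINEES: {"Academy Awards"},
}


def words(text: str) -> Set[str]:
    """Bag-of-words view: the set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_similarity(x: str, y: str) -> float:
    return jaccard(words(x), words(y))


def concept_similarity(x: str, y: str) -> float:
    return jaccard(CONCEPTS[x], CONCEPTS[y])


for other in (ELECTIONS, NOMINEES):
    print(
        f"{OSCAR!r} vs {other!r}: "
        f"words={word_similarity(OSCAR, other):.2f}, "
        f"concepts={concept_similarity(OSCAR, other):.2f}"
    )
```

With these hand-made annotations the ranking flips between the two views: the word sets favor the elections headline, the concept sets favor the Academy Awards one.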
![Page 17: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/17.jpg)
Real World Use Cases
![Page 18: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/18.jpg)
Use case #1: Lawful interception
Identify potential terrorism threats on social networks and message boards
Customized domain-specific taxonomy
![Page 19: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/19.jpg)
Use case #2: Website tagging
Profile a company by looking at its website
• Entity extraction: products, locations
• People & Roles
Sales intelligence for lead generation
http://atoka.io
![Page 20: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/20.jpg)
![Page 21: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/21.jpg)
Use case #3: News stream monitoring
News stream of 70k articles per day
• BI vertical of semantic engine
• Entity extraction: companies, people
• Business signals extraction
![Page 22: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/22.jpg)
![Page 23: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/23.jpg)
Use case #4: Social media analysis
• Entity extraction, sentiment analysis
• Dashboard, tag-cloud
![Page 24: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/24.jpg)
Use case #5: Travel recommendation
Crawl the web and understand people’s behavior
Display travel offers that match user preferences
![Page 25: Dandelion: semantic text analytics as a service](https://reader034.vdocuments.mx/reader034/viewer/2022051520/588362d51a28ab536b8b4d13/html5/thumbnails/25.jpg)
Use case #6: E-Commerce Optimization
Collect and annotate customer reviews from e-commerce websites
Dashboard for product ratings analysis