![Page 1: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/1.jpg)
Adding Neurons to Your Assistants
Christophe Bourguignat
Matthieu Bizien
@zelrosAI @ParisNLP
![Page 2: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/2.jpg)
What we want to solve at Zelros
![Page 3: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/3.jpg)
Datasource Connection
Predictive Modeling
Dialogs & NLP Configuration
AI Education
INTELLIGENT VIRTUAL ASSISTANT PLATFORM
![Page 4: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/4.jpg)
Understanding Natural Language Understanding
Source: http://nlp.stanford.edu/~wcmac/papers/20140716-UNLU.pdf
![Page 5: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/5.jpg)
Our playground
Today: retrieval-based systems, which is what works in practice for conversational agents
Given a user sentence and its context, find the best answer among a predefined set of intents
Tomorrow: generative models, self-learning
![Page 8: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/8.jpg)
Bots are not born equal
Success
Error
Fallback
![Page 9: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/9.jpg)
A bit of history
http://disi.unitn.it/~riccardi/papers/specom97.pdf
![Page 10: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/10.jpg)
Things are going fast
Timeline: 01/15, 06/16, 09/16, 11/16, 12/16
![Page 11: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/11.jpg)
Why we want to build our own NLU system
More fun!
Data privacy
Performance on our use cases
Our own roadmap
...
![Page 12: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/12.jpg)
BUNT, The first public benchmarker for NLU APIs
https://github.com/zelros/bunt
![Page 13: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/13.jpg)
Approach 1: Supervised N-grams

| Intent | Utterance | X | Y |
|---|---|---|---|
| NAME | What is your name? | N-gram | 1 |
| NEED | What do you need? | N-gram | 2 |
| NEED | Do you need anything? | N-gram | 2 |

✔ Works in practice
❌ Out-of-vocabulary words?
❌ Fallback detection?
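
As a rough illustration of the supervised n-gram approach, the sketch below turns each utterance into word unigrams and bigrams and trains a small multinomial Naive Bayes classifier on them. The classifier choice, the `ngrams` helper and the `NgramNaiveBayes` class are assumptions made for this example; the slides do not say which model was used.

```python
import math
from collections import Counter, defaultdict

def ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased utterance."""
    tokens = text.lower().replace("?", " ").split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

class NgramNaiveBayes:
    """Multinomial Naive Bayes over n-gram counts, with add-one smoothing."""

    def fit(self, utterances, intents):
        self.counts = defaultdict(Counter)   # intent -> n-gram counts
        self.priors = Counter(intents)       # intent -> number of examples
        for text, intent in zip(utterances, intents):
            self.counts[intent].update(ngrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, text):
        # N-grams never seen in training are dropped: the out-of-vocabulary weakness.
        grams = [g for g in ngrams(text) if g in self.vocab]
        total = sum(self.priors.values())
        best_intent, best_score = None, -math.inf
        for intent, prior in self.priors.items():
            denom = sum(self.counts[intent].values()) + len(self.vocab)
            score = math.log(prior / total)
            score += sum(math.log((self.counts[intent][g] + 1) / denom) for g in grams)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent

model = NgramNaiveBayes().fit(
    ["What is your name?", "What do you need?", "Do you need anything?"],
    ["NAME", "NEED", "NEED"],
)
print(model.predict("What is your name?"))      # NAME
print(model.predict("Do you need something?"))  # NEED
print(model.predict("Hello there"))             # NEED
```

The last call shows both weaknesses listed on the slide: none of the words are known, so the model falls back on the class prior and still returns an intent instead of signalling a fallback.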
![Page 14: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/14.jpg)
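Approach 2: Word2Vec. Step 1: Distance function

| What | is | your | name | Utterance (MEAN) |
|---|---|---|---|---|
| 0.34 | 0.32 | 0.01 | 0.31 | 0.27 |
| 0.21 | 0.21 | 0.04 | 0.42 | 0.22 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.12 | 0.5 | 0.44 | 0.41 |

| What | was | your | mood | Sentence (MEAN) |
|---|---|---|---|---|
| 0.34 | 0.12 | 0.01 | 0.21 | 0.17 |
| 0.21 | 0.51 | 0.04 | 0.52 | 0.32 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.22 | 0.5 | 0.24 | 0.36 |

Cosine similarity between the two MEAN vectors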
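
The distance function on this slide can be sketched as follows: average the word vectors of each sentence, then take the cosine similarity of the two averages. The 3-dimensional `WORD_VECTORS` table is made up for the example; real Word2Vec vectors are learned from a corpus and have far more dimensions.

```python
import math

# Hypothetical toy word vectors, for illustration only.
WORD_VECTORS = {
    "what": [0.3, 0.2, 0.5],
    "is":   [0.3, 0.2, 0.1],
    "was":  [0.1, 0.5, 0.2],
    "your": [0.0, 0.0, 0.5],
    "name": [0.3, 0.4, 0.4],
    "mood": [0.2, 0.5, 0.2],
}

def sentence_vector(sentence):
    """Average the vectors of the known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in sentence.lower().split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known word in sentence")
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

similarity = cosine_similarity(
    sentence_vector("What is your name"),
    sentence_vector("What was your mood"),
)
print(round(similarity, 3))
```

Because the sentence vector is a plain mean, word order is ignored, which is one reason the later slides look at learning the distance function instead.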
![Page 15: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/15.jpg)
Approach 2: Word2Vec. Step 2: Classifier
Distance Function + Nearest Neighbors → Best Intent
⚠ Works in practice
✔ Out-of-vocabulary words?
❌ Fallback detection?
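
A minimal sketch of the classifier step: given any similarity function, return the intent of the most similar labeled example. To stay self-contained, the example uses a stand-in similarity (`word_overlap`, a Jaccard overlap of words) rather than the Word2Vec cosine similarity of the slides; both function names are assumptions of this sketch.

```python
def word_overlap(a, b):
    """Stand-in similarity (Jaccard overlap of words), not the Word2Vec distance."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

def nearest_intent(utterance, examples, similarity=word_overlap):
    """Return (intent, score) of the labeled example most similar to the utterance."""
    best_text, best_intent = max(examples, key=lambda ex: similarity(utterance, ex[0]))
    return best_intent, similarity(utterance, best_text)

EXAMPLES = [
    ("what is your name", "NAME"),
    ("what do you need", "NEED"),
    ("do you need anything", "NEED"),
]

print(nearest_intent("do you need help", EXAMPLES))  # ('NEED', 0.6)
print(nearest_intent("good morning", EXAMPLES))      # ('NAME', 0.0)
```

The second call illustrates the missing fallback detection: an unrelated sentence still gets an intent, even though its best score is zero.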
![Page 16: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/16.jpg)
Going further: ML without data
Distance Function + Nearest Neighbors → Best Intent
Learn the Distance Function
Better algorithm
Fallback detection
![Page 17: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/17.jpg)
ML without data. Available datasets: QA/QC
15,500 questions
Supervised: label = type of intent
● 6 coarse classes: abbreviation, description, entity, human, location, numeric value
● 50 fine classes

| Sentence | Coarse class | Fine class |
|---|---|---|
| How did serfdom develop in and then leave Russia ? | DESC | manner |
| What films featured the character Popeye Doyle ? | ENTY | cremat |
| How can I find a list of celebrities ' real names ? | DESC | manner |
| What is the full form of .com ? | ABBR | exp |
| What contemptible scoundrel stole the cork from my lunch ? | HUM | ind |

https://cogcomp.cs.illinois.edu/Data/QA/QC/
![Page 18: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/18.jpg)
ML without data. Available datasets: SNLI
570k human-written English sentence pairs
Label = entailment, contradiction, or neutral (judgments of five turkers)

| Text | Judgments | Hypothesis |
|---|---|---|
| A man inspects the uniform of a figure in some East Asian country. | contradiction: C C C C C | The man is sleeping |
| An older and younger man smiling. | neutral: N N E N N | Two men are smiling and laughing at the cats playing on the floor. |
| A black race car starts up in front of a crowd of people. | contradiction: C C C C C | A man is driving down a lonely road. |
| A soccer game with multiple males playing. | entailment: E E E E E | Some men are playing a sport. |

https://nlp.stanford.edu/projects/snli/
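
The Judgments column lists the five individual annotations, and the label in front of them is the majority vote. A small sketch of that aggregation, assuming a label needs at least three of the five votes (the `gold_label` helper and the `LABELS` mapping are names made up for this example):

```python
from collections import Counter

LABELS = {"E": "entailment", "C": "contradiction", "N": "neutral"}

def gold_label(judgments):
    """Return the label chosen by at least 3 of the 5 annotators, or None without a majority."""
    votes = Counter(judgments.split())
    label, count = votes.most_common(1)[0]
    return LABELS[label] if count >= 3 else None

print(gold_label("N N E N N"))  # neutral
print(gold_label("C C C C C"))  # contradiction
print(gold_label("E E N N C"))  # None
```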
![Page 19: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/19.jpg)
ML without data. Available datasets: Quora

| Sentence 1 | Duplicate | Sentence 2 |
|---|---|---|
| What is the step by step guide to invest in share market in india? | False | What is the step by step guide to invest in share market? |
| Why am I mentally very lonely? How can I solve it? | False | Find the remainder when [math]23^{24}[/math] is divided by 24,23? |
| How do we prepare for UPSC? | True | How do I prepare for civil service? |
| How can I be a good geologist? | True | What should I do to be a great geologist? |

400,000 pairs of questions
Supervised: label = are they duplicates?
![Page 20: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/20.jpg)
ML without data. Available datasets: Quora
![Page 21: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/21.jpg)
Siamese Network
![Page 22: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/22.jpg)
Siamese Network: Creation of a sentence embedding

| What | is | your | name | Embedding |
|---|---|---|---|---|
| 0.34 | 0.32 | 0.01 | 0.31 | 0.27 |
| 0.21 | 0.21 | 0.04 | 0.42 | 0.22 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.12 | 0.5 | 0.44 | 0.41 |

Layers, from input to output:
Embedding(nb_tokens+1, 300, input_length=25, trainable=False)
TimeDistributed(Dense(300, activation='relu'))
Lambda(lambda x: K.max(x, axis=1), output_shape=(300,))
![Page 23: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/23.jpg)
Siamese Network: A simple architecture

| What | is | your | name | Embedding |
|---|---|---|---|---|
| 0.34 | 0.32 | 0.01 | 0.31 | 0.27 |
| 0.21 | 0.21 | 0.04 | 0.42 | 0.22 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.12 | 0.5 | 0.44 | 0.41 |

| What | was | your | mood | Embedding |
|---|---|---|---|---|
| 0.34 | 0.12 | 0.01 | 0.21 | 0.17 |
| 0.21 | 0.51 | 0.04 | 0.52 | 0.32 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.22 | 0.5 | 0.24 | 0.36 |

Both sentences go through the same layers, with the same weights (from input to output):
Embedding(nb_tokens+1, 300, input_length=25, trainable=False)
TimeDistributed(Dense(300, activation='relu'))
Lambda(lambda x: K.max(x, axis=1), output_shape=(300,))
SIMILARITY = 1 / (1 + |h - h'|²)
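
To make the data flow concrete, here is a plain-Python sketch of the forward pass described on this slide: embedding lookup, a per-token dense layer with ReLU, a max over time steps, and the similarity 1 / (1 + |h - h'|²) between the two outputs. It is not the Keras model itself: the dimensions are tiny, the vocabulary is made up, and the weights are random rather than trained.

```python
import random

EMBED_DIM, HIDDEN_DIM, MAX_LEN = 4, 3, 5   # the slide uses 300, 300 and 25
VOCAB = {"<pad>": 0, "what": 1, "is": 2, "was": 3, "your": 4, "name": 5, "mood": 6}

rng = random.Random(0)
# One set of weights, shared by both branches (random here, for illustration only).
EMBEDDINGS = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in VOCAB]
DENSE_W = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(HIDDEN_DIM)]
DENSE_B = [0.0] * HIDDEN_DIM

def encode(sentence):
    """Embedding lookup, per-token Dense + ReLU, then max over time steps."""
    ids = [VOCAB[w] for w in sentence.lower().split()][:MAX_LEN]
    ids += [VOCAB["<pad>"]] * (MAX_LEN - len(ids))
    hidden = []
    for token_id in ids:
        x = EMBEDDINGS[token_id]
        hidden.append([
            max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(DENSE_W, DENSE_B)
        ])
    return [max(step[j] for step in hidden) for j in range(HIDDEN_DIM)]

def similarity(h1, h2):
    """1 / (1 + |h - h'|^2), as on the slide."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(h1, h2))
    return 1.0 / (1.0 + squared_distance)

h = encode("What is your name")
h_prime = encode("What was your mood")
print(similarity(h, h_prime))
```

Calling `encode` for both sentences with the same `EMBEDDINGS` and `DENSE_W` is what "same weights" means here: the two branches are one network applied twice.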
![Page 24: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/24.jpg)
Siamese Network: Learned Similarity
https://github.com/bradleypallen/keras-quora-question-pairs
![Page 25: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/25.jpg)
Siamese Network: Going further
Dropout?
BiLSTM?
GRU?
MeanPooling?
BatchNorm?
More Layers?
N-char?
Maxout?
![Page 26: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/26.jpg)
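Siamese Network: Conclusion

| What | is | your | name | Embedding |
|---|---|---|---|---|
| 0.34 | 0.32 | 0.01 | 0.31 | 0.27 |
| 0.21 | 0.21 | 0.04 | 0.42 | 0.22 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.12 | 0.5 | 0.44 | 0.41 |

| What | was | your | mood | Embedding |
|---|---|---|---|---|
| 0.34 | 0.12 | 0.01 | 0.21 | 0.17 |
| 0.21 | 0.51 | 0.04 | 0.52 | 0.32 |
| ... | ... | ... | ... | ... |
| 0.5 | 0.22 | 0.5 | 0.24 | 0.36 |

SIMILARITY between the two embeddings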
![Page 27: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/27.jpg)
Non-Siamese network: Example
Bilateral Multi-Perspective Matching for Natural Language Sentences
![Page 28: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/28.jpg)
Better Algorithm: SVM
Distance Function + Nearest Neighbors → Best Intent
✅ Learn the Distance Function
Better algorithm
Fallback detection
![Page 30: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/30.jpg)
Better Algorithm: SVM
J.P. Vert
So the Siamese networks are kernels!
So we can use an SVM.
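
One way to read this slide: if the learned similarity is used as a kernel, the pairwise similarities between training sentences form a kernel (Gram) matrix, which is the input an SVM with a precomputed kernel expects (for instance scikit-learn's `SVC(kernel="precomputed")`). The sketch below only builds that matrix; the embeddings are hypothetical values standing in for the output of a trained network, and `gram_matrix` is a name chosen for this example.

```python
def similarity(h1, h2):
    """Similarity from the Siamese slides: 1 / (1 + |h - h'|^2)."""
    return 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(h1, h2)))

def gram_matrix(embeddings):
    """Pairwise similarities between all embeddings, i.e. the kernel matrix."""
    return [[similarity(a, b) for b in embeddings] for a in embeddings]

# Hypothetical sentence embeddings, for illustration only.
embeddings = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
K = gram_matrix(embeddings)
for row in K:
    print([round(value, 3) for value in row])
```

The matrix is symmetric with ones on the diagonal, and close embeddings (the first two) get a higher kernel value than distant ones.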
![Page 31: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/31.jpg)
Fallback Detection
Distance Function + Nearest Neighbors → Best Intent
✅ Learn the Distance Function
✅ Better algorithm
Fallback detection
![Page 32: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/32.jpg)
Fallback Detection
If Probability < Threshold, then Fallback
External dataset of unrelated sentences
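
The thresholding rule can be sketched in a few lines. The `choose_intent` helper, the `FALLBACK` constant and the default threshold of 0.5 are assumptions for illustration; in practice the threshold would be tuned, for example using unrelated sentences like the external dataset mentioned above.

```python
FALLBACK = "FALLBACK"

def choose_intent(probabilities, threshold=0.5):
    """Return the most probable intent, or FALLBACK when its probability is below the threshold."""
    intent, probability = max(probabilities.items(), key=lambda item: item[1])
    return intent if probability >= threshold else FALLBACK

print(choose_intent({"NAME": 0.85, "NEED": 0.15}))                 # NAME
print(choose_intent({"NAME": 0.40, "NEED": 0.35, "OTHER": 0.25}))  # FALLBACK
```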
![Page 33: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/33.jpg)
Learned Distance Function + SVM → Best Intent or Fallback
✅ Learn the Distance Function
✅ Better algorithm
✅ Fallback detection
![Page 34: @zelrosAI @ParisNLP Matthieu Bizien Christophe Bourguignat · 2017. 6. 4. · Available datasets: Quora Sentence 1 Duplicate Sentence 2 What is the step by step guide to invest in](https://reader033.vdocuments.mx/reader033/viewer/2022060903/609f651787bcb726b9579381/html5/thumbnails/34.jpg)
Thanks!
@chris_bour
@MatthieuBizien
@ParisNLP