TRANSCRIPT
Incorporating Contextual and Syntactic Structures
Improves Semantic Similarity Modeling
Linqing Liu, Wei Yang, Jinfeng Rao, Raphael Tang, and Jimmy Lin
Semantic Textual Similarity
Given a sentence pair, predict a similarity score on a 1–5 scale. Examples:
● "Some people are walking" / "People are walking" → 4.7
● "Some people are walking" / "A group of scouts are camping in the grass" → 1.5
● "A group of people is on a beach" / "A group of people is near the sea" → 4.0
● "A group of people is on a beach" / "A group of people is on a mountain" → 2.6
SICK dataset: Marelli, Marco, et al. "SemEval-2014 Task 1: Evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment." Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). 2014.
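A toy bag-of-words cosine baseline (the function name is my own, not from the talk) illustrates why this task needs more than surface overlap: lexical overlap scores the near-paraphrase pairs high, but cannot distinguish "near the sea" from "on a mountain".

```python
from collections import Counter
from math import sqrt

def cosine_bow(s1: str, s2: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two sentences."""
    a, b = Counter(s1.lower().split()), Counter(s2.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(cosine_bow("Some people are walking", "People are walking"))  # ≈ 0.866
```

The neural models below replace these raw counts with contextualized word representations.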
Current Models for Sentence Pair Modeling
● Shortcut-stacked Sentence Encoders (SSE) [Nie and Bansal, 2017]
● Decomposable Attention Model (DecAtt) [Parikh et al., 2016]
● Bi-LSTM Max-pooling Network (InferSent) [Conneau et al., 2017]
● Enhanced Sequential Inference Model (ESIM) [Chen et al., 2017]
● Pairwise Word Interaction model (PWIM) [He and Lin, 2016]
Current Models for Sentence Pair Modeling
Models compared: SSE, DecAtt, InferSent, ESIM, PWIM
Components:
● Sentence Representation
● Sentence Pair Interaction & Attention Mechanism
● Encoding Contextual Information by LSTM
● Incorporation of Syntactic Parsing Information
Do contextual and syntactic structures help sentence pair modeling?
Pairwise Word Interaction Model (PWIM)
Input: a sentence pair, e.g. "Cats sit on the mat" (w11 … w1n) and "On the mat there sit cats" (w21 … w2m).
● A Bi-LSTM encodes the context of each word in each sentence
● Pairwise word interactions: for every word pair, compute cos(w1n, w2m), L2Euclid(w1n, w2m), and DotProduct(w1n, w2m)
● A 19-layer deep CNN over the resulting interaction cube
● Fully connected layers produce the final prediction
Our first modification replaces the single Bi-LSTM with multi-layer BiLSTM sentence encoders.
PWIM: He, Hua, and Jimmy Lin. "Pairwise word interaction modeling with deep neural networks for semantic similarity measurement." Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.
Lan, Wuwei, and Wei Xu. "Neural network models for paraphrase identification, semantic textual similarity, natural language inference, and question answering." COLING 2018.
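The pairwise interaction step of PWIM can be sketched in plain Python (toy vectors stand in for Bi-LSTM hidden states; the function name is mine, and the real model stacks these channels as input to the CNN):

```python
from math import sqrt

def interaction_cube(sent1, sent2):
    """PWIM-style pairwise interactions between two lists of word vectors:
    for each (i, j) word pair, compute cosine similarity, L2 Euclidean
    distance, and dot product. Returns a len(sent1) x len(sent2) x 3 grid."""
    cube = []
    for u in sent1:
        row = []
        for v in sent2:
            dot = sum(a * b for a, b in zip(u, v))
            nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
            cos = dot / (nu * nv) if nu and nv else 0.0
            l2 = sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
            row.append((cos, l2, dot))
        cube.append(row)
    return cube

cube = interaction_cube([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
print(cube[0][0])  # identical vectors: (1.0, 0.0, 1.0)
```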
Pairwise Word Interaction Model (PWIM)
Our second modification replaces the sequential Bi-LSTM with a Tree-LSTM*, which composes each sentence bottom-up along its syntactic parse tree (nodes w1 … w6) rather than left to right, before the same pairwise interaction, CNN, and fully connected layers.
*Tai, Kai Sheng, Richard Socher, and Christopher D. Manning. "Improved semantic representations from tree-structured long short-term memory networks." arXiv preprint arXiv:1503.00075 (2015).
Experiments on eight datasets
❏ Semantic Textual Similarity: STS-2014, SICK (similarity score, 0–5 / 1–5)
❏ Paraphrase Identification: Quora, Twitter, PIT-2015 (paraphrase / non-paraphrase)
❏ Question Answering: WikiQA, TrecQA (true / false)
❏ Natural Language Inference: SNLI (entailment / neutral / contradiction)
Experiment Results Analyses
Additional Contextual Structure?
+0.019 +0.006 +0.009 +0.012 +0.01 +0.008 +0.015 +0.029
+0.336 -0.016 +0.107 +0.19 +0.093 +0.146 -0.004
Additional Syntactic Structure?
+0.023 +0.016 +0.017 -0.002 +0.021 +0.026 +0.022 +0.033
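Similarity results like these are conventionally reported as Pearson's r between predicted and gold scores (assuming that is the metric behind the deltas above); a minimal implementation:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between predicted and gold similarity scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# toy check: predictions that track the gold scores correlate highly
print(pearson_r([4.7, 1.5, 4.0, 2.6], [4.5, 1.2, 4.1, 3.0]))
```

On this scale, a delta of +0.019 means roughly two correlation points, so the gains above are modest but consistent across datasets.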
Sentence Pair Visualization
mPWIM_seq vs. mPWIM_seq+tree
● We set out to answer the question: do contextual and syntactic structures help sentence pair modeling?
● They also help to form a better student model.
Takeaway
Incorporating structural information contributes to consistent improvements over strong baselines.