TRANSCRIPT
Jointly Learning Word and Phrase Embeddings Using
Neural Networks and Implicit Tensor Factorization
Kazuma Hashimoto
Tsuruoka Laboratory, University of Tokyo
19/06/2015 Talk@UCL Machine Reading Lab.
• Name
– Kazuma Hashimoto (橋本 和真 in Japanese)
– http://www.logos.t.u-tokyo.ac.jp/~hassy/
• Affiliation
– Tsuruoka Laboratory, University of Tokyo
• April 2015 – present Ph.D. student
• April 2013 – March 2015 Master’s student
– National Centre for Text Mining (NaCTeM)
• Research Interest
– Word/phrase/document embeddings and their
applications
Self Introduction
19/06/2015 Talk@UCL Machine Reading Lab. 2 / 39
1. Background
– Word and Phrase Embeddings
2. Jointly Learning Word and Phrase Embeddings
– General Idea
3. Our Methods Focusing on Transitive Verb Phrases
– Word Prediction (EMNLP 2014)
– Implicit Tensor Factorization (CVSC 2015)
4. Experiments and Results
5. Summary
Today’s Agenda
19/06/2015 Talk@UCL Machine Reading Lab. 3 / 39
• Word: String → Index → Vector
• Why vectors?
– Word similarities can be measured using distance
metrics of the vectors (e.g., the cosine similarity)
Assigning Vectors to Words
[Figure: words embedded in a 2D vector space, with cause/trigger, disorder/disease, and animal/mouse/rat forming clusters]
Embedding words in a vector space
19/06/2015 Talk@UCL Machine Reading Lab. 5 / 39
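The distance metric named above can be sketched in a few lines; the vectors here are toy 3-dimensional values for illustration, not learned embeddings:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity: close to 1.0 for similar directions, near 0.0 otherwise.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy embeddings (illustrative values only)
mouse = np.array([0.9, 0.1, 0.0])
rat   = np.array([0.8, 0.2, 0.1])
cause = np.array([0.0, 0.1, 0.9])

print(cosine(mouse, rat))    # high: similar words
print(cosine(mouse, cause))  # low: dissimilar words
```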
• Two approaches using large corpora:
(systematic comparison of them in Baroni+ (2014))
– Count-based approach
• e.g.) Reducing the dimension of a word co-occurrence matrix using SVD
– Prediction-based approach
• e.g.) Predicting words from their contexts using
neural networks
• We focus on the prediction-based approach
– Why?
Approaches to Word Representations
19/06/2015 Talk@UCL Machine Reading Lab. 6 / 39
• Prediction-based approaches usually
– parameterize the word embeddings
– learn them based on co-occurrence statistics
• Embeddings of words that appear in similar contexts get close to each other
Learning Word Embeddings
[Figure: given text data such as "… the prevalence of drunken driving and accidents caused by drinking …", the surrounding words are predicted from the embedding of the target word]
SkipGram model (Mikolov+, 2013) in word2vec
19/06/2015 Talk@UCL Machine Reading Lab. 7 / 39
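A minimal sketch of the SkipGram idea with negative sampling; the vocabulary size, dimensionality, learning rate, and word indices are all illustrative, not the actual word2vec settings:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 8                                # vocabulary size, embedding dimension
W_in = 0.5 * rng.standard_normal((V, d))    # target ("input") embeddings
W_out = 0.5 * rng.standard_normal((V, d))   # context ("output") embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(target, context, negatives, lr=0.2):
    # One SkipGram-with-negative-sampling update: raise the score of the
    # observed (target, context) pair, lower it for sampled negative words.
    v = W_in[target].copy()
    grad_v = np.zeros(d)
    for w, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        g = sigmoid(np.dot(v, W_out[w])) - label   # d(loss)/d(score)
        grad_v += g * W_out[w]
        W_out[w] -= lr * g * v
    W_in[target] -= lr * grad_v

# e.g. a target word repeatedly observed with the same context word
for _ in range(300):
    sgns_step(target=3, context=5, negatives=rng.integers(0, V, 2).tolist())

print(sigmoid(np.dot(W_in[3], W_out[5])))  # should be high after training
```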
• Learning word embeddings for relation classification
– To appear at CoNLL 2015 (just advertising)
Task-Oriented Word Embeddings
19/06/2015 Talk@UCL Machine Reading Lab. 8 / 39
• Treating phrases and sentences as well as words
– gaining much attention recently!
Beyond Word Embeddings
[Figure: phrases such as "make payment" and "pay money" placed close together in a vector space]
Embedding phrases in a vector space
19/06/2015 Talk@UCL Machine Reading Lab. 9 / 39
• Element-wise addition/multiplication (Lapata+, 2010)
– v(sentence) = Σᵢ v(wᵢ)
• Recursive autoencoders (Socher+, 2011; Hermann+, 2013)
– Using parse trees
– v(parent) = f(v(left child), v(right child))
• Tensor/matrix-based methods
– v(adj noun) = M(adj) v(noun) (Baroni+, 2010)
– M(verb) = Σᵢ v(subjᵢ) v(objᵢ)ᵀ (Grefenstette+, 2011)
• M(subj, verb, obj) = (v(subj) v(obj)ᵀ) ⊙ M(verb)
• v(subj, verb, obj) = (M(verb) v(obj)) ⊙ v(subj) (Kartsaklis+, 2012)
Approaches to Phrase Embeddings
19/06/2015 Talk@UCL Machine Reading Lab. 10 / 39
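The first few composition functions above can be sketched as follows; the dimensions and parameters are random placeholders, so only the shapes and operations match the formulas:

```python
import numpy as np

d = 4
rng = np.random.default_rng(1)
v = {w: rng.standard_normal(d) for w in ["make", "payment"]}

# (1) Element-wise addition (Lapata+, 2010): v(phrase) = sum of word vectors
additive = v["make"] + v["payment"]

# (2) Recursive composition (Socher+, 2011 style):
# v(parent) = f(v(left child), v(right child)) with a shared weight matrix
W = rng.standard_normal((d, 2 * d))
parent = np.tanh(W @ np.concatenate([v["make"], v["payment"]]))

# (3) Matrix-vector composition (Baroni+, 2010): v(adj noun) = M(adj) v(noun)
M_adj = rng.standard_normal((d, d))
adj_noun = M_adj @ v["payment"]

print(additive.shape, parent.shape, adj_noun.shape)  # all d-dimensional
```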
• Co-occurrence matrix + SVD
• C&W (Collobert+, 2011)
• RNNLM (Mikolov+, 2013)
• SkipGram/CBOW (Mikolov+, 2013)
• vLBL/ivLBL (Mnih+, 2013)
• Dependency-based SkipGram (Levy+, 2014)
• Glove (Pennington+, 2014)
Which Word Embeddings are the Best?
Which word embeddings should we use for which composition methods?
→ Joint learning
19/06/2015 Talk@UCL Machine Reading Lab. 11 / 39
• Word co-occurrence statistics → word embeddings
• How about phrase embeddings?
– Phrase co-occurrence statistics!
Co-Occurrence Statistics of Phrases
The importer made payment in his own domestic currency
The businessman pays his monthly fee in yen
similar contexts → similar meanings?
19/06/2015 Talk@UCL Machine Reading Lab. 13 / 39
• Using Predicate-Argument Structures (PAS)
– Enju parser (Miyao+, 2008)
• Analyzes relations between phrases and words
How to Identify Phrase-Word Relations?
The importer made payment in his own domestic currency
[Parse diagram: the verb "made" and the preposition "in" are predicates; the NPs are their arguments]
19/06/2015 Talk@UCL Machine Reading Lab. 14 / 39
• Meanings of transitive verbs are affected by their
arguments
– e.g.) run, make, etc.
→ A good target for testing composition models
Why Transitive Verb Phrases?
make payment ≈ pay, make money ≈ earn, make use (of) ≈ use
19/06/2015 Talk@UCL Machine Reading Lab. 16 / 39
• Embedding subject-verb-object tuples in a vector space
– Semantic similarities between SVOs can be used!
Possible Application: Semantic Search
19/06/2015 Talk@UCL Machine Reading Lab. 17 / 39
• Focusing on the role of prepositional adjuncts
– Prepositional adjuncts complement the meanings of
verb phrases → they should be useful
Training Data from Large Corpora
How to model the relationships between predicates and arguments?
[Diagram: large corpora (English Wikipedia, BNC, etc.) are parsed into predicate-argument structures, then simplified into training tuples]
19/06/2015 Talk@UCL Machine Reading Lab. 18 / 39
• Predicting words in predicate-argument tuples
Word Prediction Model (like word2vec): PAS-CLBLM
[Diagram: for the tuple "[importer make payment] in currency", the embeddings v(arg1) and v(pred) are combined into a feature vector p used to predict the word "currency" against a negative sample "furniture"]
Feature vector for the word prediction:
p = tanh(h(arg1, prep) ⊙ v(arg1) + h(pred, prep) ⊙ v(pred))
Cost function: max(0, 1 − s(currency) + s(furniture))
19/06/2015 Talk@UCL Machine Reading Lab. 20 / 39
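A sketch of the PAS-CLBLM feature vector and hinge cost for the example tuple; the embeddings and weight vectors are random placeholders, and the score s(·) as a simple dot product is one plausible reading of the slide, not the paper's exact definition:

```python
import numpy as np

d = 8
rng = np.random.default_rng(2)
# Placeholder parameters for the tuple "[importer make payment] in currency"
v_arg1 = rng.standard_normal(d)     # embedding of the argument phrase
v_pred = rng.standard_normal(d)     # embedding of the predicate "in"
h_arg1 = rng.standard_normal(d)     # per-preposition weight vectors
h_pred = rng.standard_normal(d)
v_out = {w: rng.standard_normal(d) for w in ["currency", "furniture"]}

# Feature vector for the word prediction (element-wise operations only)
p = np.tanh(h_arg1 * v_arg1 + h_pred * v_pred)

def s(word):
    # Plausibility score for predicting `word` from the feature vector
    return float(np.dot(p, v_out[word]))

# Hinge cost: the observed word should outscore a sampled negative by a margin
cost = max(0.0, 1.0 - s("currency") + s("furniture"))
print(cost)
```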
• Two methods:
– (a) assigning a vector to each SVO tuple
– (b) composing SVO embeddings
How to Compute SVO Embeddings?
(a) [importer make payment]: a parameterized vector assigned to the whole tuple
(b) v(subj) + v(verb) + v(obj): a vector composed from the word embeddings
19/06/2015 Talk@UCL Machine Reading Lab. 21 / 39
• Only element-wise vector operations
– Pros: Fast training
– Cons: Poor interaction between predicates and
arguments
• Interactions between predicates and arguments are
important for transitive verbs
Weakness of PAS-CLBLM
make payment ≈ pay, make money ≈ earn, make use (of) ≈ use
19/06/2015 Talk@UCL Machine Reading Lab. 23 / 39
• Tensor/matrix-based approaches (Noun: vector)
– Adjective: matrix (Baroni+, 2010)
– Transitive verb: matrix
(Grefenstette+, 2011; Van de Cruys+, 2013)
Focusing on Tensor-Based Approaches
[Diagram: a PMI tensor over (subject, verb, object) tuples, e.g. PMI(importer, make, payment) = 0.31, is approximated by a d×d verb matrix with given, pre-trained subject and object vectors]
19/06/2015 Talk@UCL Machine Reading Lab. 24 / 39
• Parameterizing
– Predicate matrices and
– Argument embeddings
Implicit Tensor Factorization (1)
[Diagram: the same factorization, but now the d×d predicate matrices and the d-dimensional argument embeddings are all parameterized and learned]
19/06/2015 Talk@UCL Machine Reading Lab. 25 / 39
• Calculating plausibility scores
– Using predicate matrices & argument embeddings
Implicit Tensor Factorization (2)
[Diagram: the plausibility score T(i, j, k) is computed from the matrix of predicate i and the embeddings of arguments j and k]
19/06/2015 Talk@UCL Machine Reading Lab. 26 / 39
• Learning model parameters
– Using plausibility judgment task
• Observed tuple: (i, j, k)
• Collapsed tuple: (i’, j, k), (i, j’, k), (i, j, k’)
–Negative sampling (Mikolov+, 2013)
Implicit Tensor Factorization (3)
19/06/2015 Talk@UCL Machine Reading Lab.
Cost function: discriminating observed tuples from collapsed ones
27 / 39
• Discriminating between observed and collapsed ones
Example
19/06/2015 Talk@UCL Machine Reading Lab.
(i, j, k) = (in, importer make payment, currency)
(i’, j, k) = (on, importer make payment, currency)
(i, j’, k) = (in, child eat pizza, currency)
(i, j, k’) = (in, importer make payment, furniture)
28 / 39
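The discrimination step can be sketched as follows, assuming the plausibility score T(i, j, k) is a bilinear form a1(j)ᵀ M(i) a2(k) (an assumption about the model's exact shape) and a logistic negative-sampling cost:

```python
import numpy as np

d = 6
rng = np.random.default_rng(3)
# Placeholder parameters: one d×d matrix per predicate, one vector per argument
M = {p: rng.standard_normal((d, d)) for p in ["in", "on"]}
a1 = {w: rng.standard_normal(d) for w in ["importer make payment", "child eat pizza"]}
a2 = {w: rng.standard_normal(d) for w in ["currency", "furniture"]}

def score(pred, arg1, arg2):
    # Plausibility T(i, j, k) as a bilinear form: a1(j)^T M(i) a2(k)
    return float(a1[arg1] @ M[pred] @ a2[arg2])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

observed = ("in", "importer make payment", "currency")
collapsed = [("on", "importer make payment", "currency"),
             ("in", "child eat pizza", "currency"),
             ("in", "importer make payment", "furniture")]

# Negative-sampling cost: observed tuples should score high, collapsed ones low
cost = -np.log(sigmoid(score(*observed)))
cost -= sum(np.log(sigmoid(-score(*t))) for t in collapsed)
print(cost)
```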
• Two methods:
– (a) assigning a vector to each SVO tuple
– (b) composing SVO embeddings
How to Compute SVO Embeddings?
(a) [importer make payment]: a parameterized vector assigned to the whole tuple
(b) a vector composed using the parameterized verb matrices (Kartsaklis+, 2012)
19/06/2015 Talk@UCL Machine Reading Lab. 29 / 39
• The copy-subject function is presented in Kartsaklis+ (2012)
– Using verb matrices in Grefenstette+ (2011)
• Our verb matrices are related to Grefenstette+
(2011)
• The function can compute
– verb-object phrase embeddings
– subject-verb-object phrase embeddings
Why the Copy-Subject Function?
19/06/2015 Talk@UCL Machine Reading Lab. 30 / 39
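A sketch of the copy-subject function under these definitions; M_verb, v_subj, and v_obj are random placeholders, and reading the verb-object embedding as the bare matrix-vector product is an assumption:

```python
import numpy as np

d = 5
rng = np.random.default_rng(4)
M_verb = rng.standard_normal((d, d))   # learned verb matrix
v_subj = rng.standard_normal(d)
v_obj = rng.standard_normal(d)

# Verb-object phrase embedding: the matrix-vector product M(verb) v(obj)
vo = M_verb @ v_obj

# Copy-subject SVO embedding: the same product, gated element-wise by the subject
svo = vo * v_subj

print(vo.shape, svo.shape)  # both d-dimensional
```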
• Training corpus: English Wikipedia
– SVO data: 23.6 million instances
– SVO-preposition-noun data: 17.3 million instances
• Parameter Initialization: random values
• Optimization: mini-batch AdaGrad (Duchi+, 2011)
• Embedding dimensionality
– PAS-CLBLM: 200
– Tensor method: 50
• The number of model parameters of PAS-CLBLM is
slightly larger than that of the tensor method
Experimental Settings
19/06/2015 Talk@UCL Machine Reading Lab. 32 / 39
• Case 1: assigning a vector to each SVO tuple
Examples of Learned SVO Embeddings
Adjuncts seem to be helpful in learning the meanings of verb phrases
This approach omits the information about individual words!
19/06/2015 Talk@UCL Machine Reading Lab. 33 / 39
• Case 2: composing SVO embeddings
Examples of Learned SVO Embeddings
[Table comparing composed SVO embeddings: Tensor (CVSC 2015), which is more flexible, vs. PAS-CLBLM (EMNLP 2014), which strongly enhances the head word]
19/06/2015 Talk@UCL Machine Reading Lab. 34 / 39
• In the latest approach, the learned verb matrices
capture multiple meanings
Multiple Meanings in Verb Matrices
19/06/2015 Talk@UCL Machine Reading Lab. 35 / 39
• Measuring semantic similarities of verb pairs taking
the same subjects and objects (Grefenstette+, 2011)
– Evaluation: Spearman’s rank correlation between
similarity scores and human ratings
Verb Sense Disambiguation Task
verb pair with subj & obj | human rating
student write name / student spell name | 7
child show sign / child express sign | 6
system meet criterion / system visit criterion | 1
19/06/2015 Talk@UCL Machine Reading Lab. 36 / 39
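The evaluation metric can be computed as the Pearson correlation of rank-transformed scores; a minimal version (no tie handling) using the slide's three human ratings and hypothetical model similarity scores:

```python
import numpy as np

def spearman(xs, ys):
    # Spearman's rho: Pearson correlation of the ranks (no tie averaging here)
    def ranks(a):
        order = np.argsort(a)
        r = np.empty(len(a))
        r[order] = np.arange(1, len(a) + 1)
        return r
    rx, ry = ranks(np.asarray(xs)), ranks(np.asarray(ys))
    return float(np.corrcoef(rx, ry)[0, 1])

# Human ratings from the slide vs. hypothetical model similarity scores
human = [7, 6, 1]           # write/spell, show/express, meet/visit
model = [0.81, 0.65, 0.10]  # cosine similarities a model might assign
print(spearman(model, human))  # ≈ 1.0: the model ranks the pairs like humans
```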
• State-of-the-art results on the disambiguation task
– Prepositional adjuncts improve the results
• How about other kinds of adjuncts?
Results
Method | Spearman’s rank correlation
Tensor (only verb data) | 0.480
Tensor (verb and preposition data) | 0.614
PAS-CLBLM (this experiment) | 0.374
Milajevs+, 2014 | 0.456
Hashimoto+, 2014 | 0.422
Future work: improving real-world applications using the method
19/06/2015 Talk@UCL Machine Reading Lab. 37 / 39
• Word and phrase embeddings are jointly learned
using large corpora parsed by syntactic parsers
– The tensor-based method is suitable for verb sense
disambiguation
– Adjuncts are useful in learning verb phrases
• Future directions:
– improving the embedding methods
– applying them to real-world NLP applications
• What kind of information should be captured?
Summary
19/06/2015 Talk@UCL Machine Reading Lab. 39 / 39