Dependency-Based Word Embeddings Omer Levy Yoav Goldberg Bar-Ilan University Israel


Page 1:

Dependency-Based Word Embeddings

Omer Levy, Yoav Goldberg
Bar-Ilan University, Israel

Page 2:

Neural Embeddings
• Dense vectors
• Each dimension is a latent feature
• word2vec (Mikolov et al., 2013)
• State-of-the-Art: Skip-Gram with Negative Sampling
• “Linguistic Regularities”

king - man + woman ≈ queen

Linguistic Regularities in Sparse and Explicit Word Representations
Friday, 2:00 PM, CoNLL 2014

Page 3:

Our Main Contribution:

Generalizing Skip-Gram with Negative Sampling

Page 4:

Skip-Gram with Negative Sampling v2.0
• Original implementation assumes bag-of-words contexts
• We generalize to arbitrary contexts

• Dependency contexts create qualitatively different word embeddings

• Provide a new tool for linguistically analyzing embeddings

Page 5:

Context Types

Page 6:

Australian scientist discovers star with telescope

Example

Page 7:

Australian scientist discovers star with telescope

Target Word

Page 8:

Australian scientist discovers star with telescope

Bag of Words (BoW) Context
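
A minimal sketch of how a bag-of-words context can be read off the example sentence: every word within k positions of the target counts as a context, regardless of its grammatical role. The helper name `bow_contexts`, the choice of "discovers" as the target, and the window size are illustrative assumptions, not taken from the slides.

```python
def bow_contexts(tokens, target_index, k):
    """Return the words within k positions of the target (hypothetical helper)."""
    start = max(0, target_index - k)
    end = min(len(tokens), target_index + k + 1)
    # Everything in the window except the target itself is a context.
    return [tokens[i] for i in range(start, end) if i != target_index]


sentence = "Australian scientist discovers star with telescope".split()
target = sentence.index("discovers")

print(bow_contexts(sentence, target, k=2))
```

With k=2 the window picks up "Australian", "scientist", "star" and "with", and misses "telescope", even though the telescope is what the discovering is done with.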

Page 11:

Australian scientist discovers star with telescope

Syntactic Dependency Context

Page 12:

Australian scientist discovers star with telescope

Syntactic Dependency Context

Dependency labels shown: nsubj, dobj, prep_with
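
A minimal sketch of how dependency contexts could be derived for the example sentence. The arcs are written by hand rather than produced by a parser, the `amod` arc is added here for completeness and does not appear on the slide, and the helper name `dependency_contexts` is hypothetical. The `word/label` and `word/label-1` naming mirrors the context strings that appear later in the talk (for example `students/prep_at-1`).

```python
from collections import defaultdict

# Hand-written (head, label, modifier) arcs for the example sentence; a real
# pipeline would obtain these from a dependency parser. The preposition "with"
# is assumed to be collapsed into the prep_with label.
ARCS = [
    ("discovers", "nsubj", "scientist"),
    ("discovers", "dobj", "star"),
    ("discovers", "prep_with", "telescope"),
    ("scientist", "amod", "Australian"),  # assumption: not shown on the slide
]


def dependency_contexts(arcs):
    """Map each word to its contexts: modifier/label for heads, head/label-1 for modifiers."""
    contexts = defaultdict(list)
    for head, label, modifier in arcs:
        contexts[head].append(f"{modifier}/{label}")
        contexts[modifier].append(f"{head}/{label}-1")
    return dict(contexts)


contexts = dependency_contexts(ARCS)
print(contexts["discovers"])
print(contexts["telescope"])
```

Unlike a fixed window, the contexts of "discovers" here include "telescope" and exclude "Australian", because membership is decided by the parse tree rather than by distance.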

Page 14:

Generalizing Skip-Gram with Negative Sampling

Page 15:

How does Skip-Gram work?
• Skip-gram represents each word as a vector

• Skip-gram represents each context word as a different vector

• Same word has 2 different embeddings (as “word”, as “context”)
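
The two-table idea can be sketched as follows. The tiny vocabulary, the dimensionality, and the random initialization are illustrative assumptions; the only point is that a word is looked up in one table when it acts as the target and in a separate table when it acts as a context, and that a pair is scored by the sigmoid of the dot product between the two.

```python
import math
import random

random.seed(0)
DIM = 4  # illustrative dimensionality

vocab = ["scientist", "discovers", "star"]

# Two separate lookup tables: one for a word in its role as target "word",
# one for the same word in its role as "context".
word_vecs = {w: [random.uniform(-0.5, 0.5) for _ in range(DIM)] for w in vocab}
context_vecs = {w: [random.uniform(-0.5, 0.5) for _ in range(DIM)] for w in vocab}


def pair_probability(word, context):
    """Sigmoid of the dot product between a word vector and a context vector."""
    dot = sum(a * b for a, b in zip(word_vecs[word], context_vecs[context]))
    return 1.0 / (1.0 + math.exp(-dot))


# "star" has two unrelated vectors, one per role.
print(word_vecs["star"] != context_vecs["star"])
print(pair_probability("discovers", "star"))
```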

Page 16:

How does Skip-Gram work?

Text → Bag of Words Contexts → Word-Context Pairs → Learning

Page 19:

Our Modification

Text → Arbitrary Contexts → Word-Context Pairs → Learning

Modified word2vec publicly available!
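
Once contexts are arbitrary strings, the learning step only needs a list of (word, context) pairs. The following is a rough sketch of skip-gram with negative sampling over such pairs, written in plain Python. It is not the authors' released code: the function name `train_sgns`, all hyperparameters, and the uniform negative sampling are assumptions made for illustration (word2vec draws negatives from a smoothed unigram distribution).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_sgns(pairs, dim=8, negatives=2, epochs=200, lr=0.05, seed=0):
    """Train word and context vectors from arbitrary (word, context) pairs."""
    rng = random.Random(seed)
    words = sorted({w for w, _ in pairs})
    contexts = sorted({c for _, c in pairs})
    W = {w: [rng.uniform(-0.5, 0.5) / dim for _ in range(dim)] for w in words}
    C = {c: [0.0] * dim for c in contexts}

    def update(w, c, label):
        # One logistic-regression step: label 1 for observed pairs, 0 for negatives.
        dot = sum(a * b for a, b in zip(W[w], C[c]))
        g = lr * (label - sigmoid(dot))
        for i in range(dim):
            w_i = W[w][i]
            W[w][i] += g * C[c][i]
            C[c][i] += g * w_i

    for _ in range(epochs):
        for w, c in pairs:
            update(w, c, 1.0)
            for _ in range(negatives):
                neg = rng.choice(contexts)  # uniform sampling: a simplifying assumption
                if neg == c:
                    continue
                update(w, neg, 0.0)
    return W, C


# Contexts can be any strings, e.g. dependency contexts instead of plain words.
pairs = [
    ("discovers", "scientist/nsubj"),
    ("discovers", "star/dobj"),
    ("discovers", "telescope/prep_with"),
    ("scientist", "discovers/nsubj-1"),
    ("star", "discovers/dobj-1"),
    ("telescope", "discovers/prep_with-1"),
]
W, C = train_sgns(pairs)
print(len(W), len(C))
```

Nothing in the update rule looks at where a context came from, which is why swapping window-based contexts for syntactic ones requires no change to the learning step.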

Page 22:

Our Modification: Example

Text (Wikipedia) → Syntactic Contexts (Stanford Dependencies) → Word-Context Pairs → Learning

Page 23:

What is the effect of different context types?

Page 24:

What is the effect of different context types?
• Thoroughly studied in explicit representations (distributional)
• Lin (1998), Padó and Lapata (2007), and many others…

General Conclusion:
• Bag-of-words contexts induce topical similarities
• Dependency contexts induce functional similarities
  • Share the same semantic type
  • Cohyponyms

• Does this hold for embeddings as well?

Page 25:

Embedding Similarity with Different Contexts

Target Word: Hogwarts (Harry Potter’s school)

Bag of Words (k=5) | Dependencies
Dumbledore | Sunnydale
hallows | Collinwood
half-blood | Calarts
Malfoy | Greendale
Snape | Millfield
Related to Harry Potter | Schools
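
Lists of most-similar words like these are usually produced by ranking the vocabulary by cosine similarity to the target word's vector. A minimal sketch, assuming cosine similarity is the measure used; the 2-d vectors are made up for illustration and are not the trained embeddings, and the helper name `most_similar` is hypothetical.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(target, vectors, top_n=5):
    """Rank all other words by cosine similarity to the target."""
    scores = [(cosine(vectors[target], vec), word)
              for word, vec in vectors.items() if word != target]
    return [word for _, word in sorted(scores, reverse=True)[:top_n]]


# Made-up 2-d vectors for illustration only; they are not the trained embeddings.
toy_vectors = {
    "star": [1.0, 1.0],
    "telescope": [0.4, 1.0],
    "scientist": [1.0, 0.2],
    "discovers": [-1.0, 0.1],
}

print(most_similar("star", toy_vectors, top_n=2))
```

Running the same ranking over vectors trained with bag-of-words contexts and over vectors trained with dependency contexts is what yields the two columns being compared.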

Page 26:

Embedding Similarity with Different Contexts

Target Word: Turing (computer scientist)

Bag of Words (k=5) | Dependencies
nondeterministic | Pauling
non-deterministic | Hotelling
computability | Heting
deterministic | Lessing
finite-state | Hamming
Related to computability | Scientists

Page 27:

Embedding Similarity with Different Contexts

Target Word: dancing (dance gerund)

Bag of Words (k=5) | Dependencies
singing | singing
dance | rapping
dances | breakdancing
dancers | miming
tap-dancing | busking
Related to dance | Gerunds

Online Demo!

Page 28:

Embedding Similarity with Different Contexts
• Dependency-based embeddings have more functional similarities

• This phenomenon goes beyond these examples

• Quantitative Analysis (in the paper)

Page 29:

Dependency-based embeddings have more functional similarities

Quantitative Analysis

[Figure: precision-recall curves (x-axis: Recall, y-axis: Precision) comparing Dependencies, BoW (k=2), and BoW (k=5)]

Page 30:

Why do dependencies induce functional similarities?

Page 31:

Dependency Contexts & Functional Similarity
• Thoroughly studied in explicit representations (distributional)
• Lin (1998), Padó and Lapata (2007), and many others…

• In explicit representations, we can look at the features and analyze

• But embeddings are a black box!
• Dimensions are latent and don’t necessarily have any meaning

Page 32:

Analyzing Embeddings

Page 33:

Peeking into Skip-Gram’s Black Box
• Skip-Gram allows a peek…

• Contexts are embedded in the same space!

• Given a word w, find the contexts c it “activates” most: argmax_c (v_w · v_c)
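
Because word and context vectors live in the same space, one reading of "activates most" is the contexts whose vectors have the highest dot product with the word's vector. A minimal sketch under that reading; the helper name `top_contexts` is hypothetical and the vectors are made up for illustration, whereas real ones would come from a trained model.

```python
def top_contexts(word, word_vecs, context_vecs, top_n=5):
    """Return the contexts with the highest dot product against the word's vector."""
    w = word_vecs[word]
    scored = [(sum(a * b for a, b in zip(w, c_vec)), c)
              for c, c_vec in context_vecs.items()]
    scored.sort(reverse=True)
    return [c for _, c in scored[:top_n]]


# Made-up vectors for illustration only; real ones come from a trained model.
word_vecs = {"Hogwarts": [0.9, 0.1]}
context_vecs = {
    "students/prep_at-1": [1.0, 0.0],
    "machine/nn-1": [0.0, 1.0],
    "ballroom/nn": [-1.0, 0.0],
}

print(top_contexts("Hogwarts", word_vecs, context_vecs, top_n=2))
```

The query needs the context table as well as the word table, so both sets of vectors have to be kept after training for this kind of inspection.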

Page 34:

Associated Contexts

Target Word: Hogwarts

Dependencies:
students/prep_at-1
educated/prep_at-1
student/prep_at-1
stay/prep_at-1
learned/prep_at-1

Page 35:

Associated Contexts

Target Word: Turing

Dependencies:
machine/nn-1
test/nn-1
theorem/poss-1
machines/nn-1
tests/nn-1

Page 36:

Associated Contexts

Target Word: dancing

Dependencies:
dancing/conj
dancing/conj-1
singing/conj-1
singing/conj
ballroom/nn

Page 37:

Analyzing Embeddings
• We found a way to linguistically analyze embeddings

• Together with the ability to engineer contexts…

• …we now have the tools to create task-tailored embeddings!

Page 38:

Conclusion

Page 39:

Conclusion
• Generalized Skip-Gram with Negative Sampling to arbitrary contexts

• Different contexts induce different similarities

• Suggest a way to peek inside the black box of embeddings

• Code, demo, and word vectors available from our websites

• Make linguistically-motivated task-tailored embeddings today!

Thank you for listening :)