lda2vec text by the bay 2016

Post on 15-Apr-2017

1.598 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

lda2vec (word2vec, and lda)

Christopher Moody @ Stitch Fix

About

@chrisemoody Caltech Physics PhD. in astrostats supercomputing sklearn t-SNE contributor Data Labs at Stitch Fix github.com/cemoody

Gaussian Processes t-SNE

chainer deep learning

Tensor Decomposition

word2vec

lda

1

23ld

a2vec

1. king - man + woman = queen 2. Huge splash in NLP world 3. Learns from raw text 4. Pretty simple algorithm 5. Comes pretrained

word2vec

1. Set up an objective function 2. Randomly initialize vectors 3. Do gradient descent

word2vec

word

2vec

word2vec: learn word vector w from it’s surrounding context

w

word

2vec

“The fox jumped over the lazy dog”Maximize the likelihood of seeing the words given the word over.

P(the|over) P(fox|over)

P(jumped|over) P(the|over) P(lazy|over) P(dog|over)

…instead of maximizing the likelihood of co-occurrence counts.

word

2vec

P(fox|over)

What should this be?

word

2vec

P(vfox|vover)

Should depend on the word vectors.

P(fox|over)

word

2vec

“The fox jumped over the lazy dog”

P(w|c)

Extract pairs from context window around every input word.

word

2vec

“The fox jumped over the lazy dog”

c

P(w|c)

Extract pairs from context window around every input word.

word

2vec

“The fox jumped over the lazy dog”

w

P(w|c)

c

Extract pairs from context window around every input word.

word

2vec

P(w|c)

w c

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

“The fox jumped over the lazy dog”

P(w|c)

w c

Extract pairs from context window around every input word.

word

2vec

P(w|c)

c w

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

c w

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

c w

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

w c

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

cw

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

cw

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

cw

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

c w

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

word

2vec

P(w|c)

c w

“The fox jumped over the lazy dog”

Extract pairs from context window around every input word.

objectiv

e

Measure loss between w and c?

How should we define P(w|c)?

objectiv

e

w . c

How should we define P(w|c)?

Measure loss between w and c?

word

2vec

w . c ~ 1

objectiv

e

w

c

vcanada . vsnow ~ 1

word

2vec

w . c ~ 0

objectiv

e

w

cvcanada . vdesert ~0

word

2vec

w . c ~ -1

objectiv

e

w

c

word

2vec

w . c ∈ [-1,1]

objectiv

e

word

2vec

But we’d like to measure a probability.

w . c ∈ [-1,1]

objectiv

e

word

2vec

But we’d like to measure a probability.

objectiv

e

∈ [0,1]σ(c·w)

word

2vec

But we’d like to measure a probability.

objectiv

e

∈ [0,1]σ(c·w)

w c

w c

SimilarDissimilar

word

2vec

Loss function:

objectiv

e

L=σ(c·w)

Logistic (binary) choice. Is the (context, word) combination from our dataset?

word

2vec

The skip-gram negative-sampling model

objectiv

e

Trivial solution is that context = word for all vectors

L=σ(c·w)w

c

word

2vec

The skip-gram negative-sampling model

L = σ(c·w) + σ(-c·wneg)

objectiv

e

Draw random words in vocabulary.

word

2vec

The skip-gram negative-sampling model

objectiv

e

Discriminate positive from negative samples

Multiple Negative

L = σ(c·w) + σ(-c·wneg) +…+ σ(-c·wneg)

word

2vec

The SGNS ModelPMI

ci·wj = PMI(Mij) - log k

…is extremely similar to matrix factorization!

Levy & Goldberg 2014

L = σ(c·w) + σ(-c·wneg)

word

2vec

The SGNS ModelPMI

Levy & Goldberg 2014

‘traditional’ NLP

L = σ(c·w) + σ(-c·wneg)

ci·wj = PMI(Mij) - log k

…is extremely similar to matrix factorization!

word

2vec

The SGNS Model

L = σ(c·w) + Σσ(-c·w)

PMI

ci·wj = log

Levy & Goldberg 2014

#(ci,wj)/n

k #(wj)/n #(ci)/n

‘traditional’ NLP

word

2vec

The SGNS Model

L = σ(c·w) + Σσ(-c·w)

PMI

ci·wj = log

Levy & Goldberg 2014

popularity of c,wk (popularity of c) (popularity of w)

‘traditional’ NLP

word

2vec

PMI

99% of word2vec is counting.

And you can count words in SQL

word

2vec

PMI

Count how many times you saw c·w

Count how many times you saw c

Count how many times you saw w

word

2vec

PMI

…and this takes ~5 minutes to compute on a single core. Computing SVD is a completely standard math library.

word2vec

ITEM_3469 + ‘Pregnant’

+ ‘Pregnant’

= ITEM_701333 = ITEM_901004 = ITEM_800456

what about?LDA?

LDA on Client Item Descriptions

LDA on Item

Descriptions (with Jay)

LDA on Item

Descriptions (with Jay)

LDA on Item

Descriptions (with Jay)

lda vs word2vec

Bayesian Graphical ModelML Neural Model

word2vec is local: one word predicts a nearby word

“I love finding new designer brands for jeans”

“I love finding new designer brands for jeans”

But text is usually organized.

“I love finding new designer brands for jeans”

But text is usually organized.

“I love finding new designer brands for jeans”

In LDA, documents globally predict words.

doc 7681

typical word2vec vector

[ 0%, 9%, 78%, 11%]

typical LDA document vector

[ -0.75, -1.25, -0.55, -0.12, +2.2]

All sum to 100%All real values

5D word2vec vector

[ 0%, 9%, 78%, 11%]

5D LDA document vector

[ -0.75, -1.25, -0.55, -0.12, +2.2]

Sparse All sum to 100%

Dimensions are absolute

Dense All real values

Dimensions relative

100D word2vec vector

[ 0%0%0%0%0% … 0%, 9%, 78%, 11%]

100D LDA document vector

[ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]

Sparse All sum to 100%

Dimensions are absolute

Dense All real values

Dimensions relative

dense sparse

100D word2vec vector

[ 0%0%0%0%0% … 0%, 9%, 78%, 11%]

100D LDA document vector

[ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]

Similar in fewer ways (more interpretable)

Similar in 100D ways (very flexible)

+mixture +sparse

can we do both? lda2vec

-1.9 0.85 -0.6 -0.3 -0.5

Lufthansa is a German airline and when

fox

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Lufthansa is a German airline and when

German

word2vec predicts locally: one word predicts a nearby word

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

German

Document vector predicts a word from

a global context

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

We’re missing mixtures & sparsity!

German

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

We’re missing mixtures & sparsity!

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

Now it’s a mixture.

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

Trinitarian baptismal

Pentecostals Bede

schismatics excommunication

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

#topicsDocument weight

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

topic 1 = “religion” Trinitarian baptismal

Pentecostals Bede

schismatics excommunication

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

#topicsDocument weight

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

Milosevic absentee

Indonesia Lebanese Isrealis

Karadzic

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

#topicsDocument weight

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

topic 2 = “politics” Milosevic absentee

Indonesia Lebanese Isrealis

Karadzic

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

#topicsDocument weight

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

Sparsity!

0.34 -0.1 0.17

41% 26% 34%

-1.4 -0.5 -1.4

-1.9-1.7 0.75

0.96-0.7 -1.9

-0.2-1.1 0.6

-0.7 -0.4 -0.7 -0.3 -0.3-1.9 0.85 -0.6 -0.3 -0.5

-2.6 0.45 -1.3 -0.6 -0.8

Lufthansa is a German airline and when

#topics

#topicsfox

#hidden units

#topics

#hidden units#hidden units

#hidden units

Skip grams from sentences

Word vector

Negative sampling loss

Topic matrix

Document proportion

Document weight

Document vector

Context vector

x

+

Lufthansa is a German airline and when

34% 32% 34%

t=0

41% 26% 34%

t=10

99% 1% 0%

t=∞

time

@chrisemoody

lda2vec.com

+ API docs + Examples + GPU + Tests

@chrisemoody

lda2vec.com

@chrisemoody Example Hacker News comments

Topics: http://nbviewer.jupyter.org/github/cemoody/lda2vec/blob/master/examples/

hacker_news/lda2vec/lda2vec.ipynb

Word vectors: https://github.com/cemoody/

lda2vec/blob/master/examples/hacker_news/lda2vec/

word_vectors.ipynb

@chrisemoody

lda2vec.com

human-interpretable doc topics, use LDA.

machine-useable word-level features, use word2vec.

if you like to experiment a lot, and have topics over user / doc / region / etc. features, use lda2vec. (and you have a GPU)

If you want…

?@chrisemoody

Multithreaded Stitch Fix

@chrisemoody

lda2vec.com

“PS! Thank you for such an awesome idea”

@chrisemoody

doc_id=1846

Can we model topics to sentences? lda2lstm

Can we model topics to sentences? lda2lstm

“PS! Thank you for such an awesome idea”doc_id=1846

@chrisemoody

Can we model topics to images? lda2ae

TJ Torres

and now for something completely crazy4Fun Stuff

translation

(using just a rotation matrix)

Miko

lov

2013

English

Spanish

Matrix Rotation

deepwalk

Perozz

i

et al 2

014

learn word vectors from sentences

“The fox jumped over the lazy dog”

vOUT vOUT vOUT vOUTvOUTvOUT

‘words’ are graph vertices ‘sentences’ are random walks on the graph

word2vec

Playlists at Spotify

context

sequence

lear

ning

‘words’ are song indices ‘sentences’ are playlists

Playlists at Spotify

contextErik

Bernhar

dsson

Great performance on ‘related artists’

Fixes at Stitch Fix

sequence

lear

ning

Let’s try: ‘words’ are items ‘sentences’ are fixes

Fixes at Stitch Fix

context

Learn similarity between styles because they co-occur

Learn ‘coherent’ styles

sequence

lear

ning

Fixes at Stitch Fix?

context

sequence

lear

ningGot lots of structure!

Fixes at Stitch Fix?

context

sequence

lear

ning

Fixes at Stitch Fix?

context

sequence

lear

ning

Nearby regions are consistent ‘closets’

?@chrisemoody

Multithreaded Stitch Fix

context dependent

Levy

& G

oldberg

2014

Australian scientist discovers star with telescopecontext +/- 2 words

context dependent

context

Australian scientist discovers star with telescope

Levy

& G

oldberg

2014

context dependent

context

Australian scientist discovers star with telescopecontext

Levy

& G

oldberg

2014

context dependent

context

BoW DEPS

topically-similar vs ‘functionally’ similar

Levy

& G

oldberg

2014

?@chrisemoody

Multithreaded Stitch Fix

Crazy Approaches

Paragraph Vectors (Just extend the context window)

Content dependency (Change the window grammatically)

Social word2vec (deepwalk) (Sentence is a walk on the graph)

Spotify (Sentence is a playlist of song_ids)

Stitch Fix (Sentence is a shipment of five items)

CBOW

“The fox jumped over the lazy dog”

Guess the word given the context

~20x faster. (this is the alternative.)

vOUT

vIN vINvIN vINvIN vIN

SkipGram

“The fox jumped over the lazy dog”

vOUT vOUT

vIN

vOUT vOUT vOUTvOUT

Guess the context given the word

Better at syntax. (this is the one we went over)

lda2

vec

vDOC = a vtopic1 + b vtopic2 +…

Let’s make vDOC sparse

lda2

vec

This works! 😀 But vDOC isn’t as interpretable as the topic vectors. 😔

vDOC = topic0 + topic1

Let’s say that vDOC ads

lda2

vec

softmax(vOUT * (vIN+ vDOC))

theory of lda2vec

lda2

vec

pyLDAvis of lda2vec

lda2

vec

LDA Results

context

History

I loved every choice in this fix!! Great job!

Great Stylist Perfect

LDA Results

context

History

Body Fit

My measurements are 36-28-32. If that helps. I like wearing some clothing that is fitted.

Very hard for me to find pants that fit right.

LDA Results

context

History

Sizing

Really enjoyed the experience and the pieces, sizing for tops was too big.

Looking forward to my next box!

Excited for next

LDA Results

context

History

Almost Bought

It was a great fix. Loved the two items I kept and the three I sent back were close!

Perfect

All of the following ideas will change what ‘words’ and ‘context’ represent.

parag

raph

vecto

r

What about summarizing documents?

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.

parag

raph

vecto

r

Normal skipgram extends C words before, and C words after.

IN

OUT OUT

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.

parag

raph

vecto

r

A document vector simply extends the context to the whole document.

IN

OUT OUT

OUT OUTdoc_1347

fromgensim.modelsimportDoc2Vecfn=“item_document_vectors”model=Doc2Vec.load(fn)model.most_similar('pregnant')matches=list(filter(lambdax:'SENT_'inx[0],matches))

#['...Iamcurrently23weekspregnant...',#'...I'mnow10weekspregnant...',#'...notshowingtoomuchyet...',#'...15weeksnow.Babybump...',#'...6weekspostpartum!...',#'...12weekspostpartumandamnursing...',#'...Ihavemybabyshowerthat...',#'...amstillbreastfeeding...',#'...Iwouldloveanoutfitforababyshower...']

sente

nce

sear

ch

top related