TensorFlow Deep Learning Quick Start Course -- Natural Language Processing Applications


TensorFlow Deep Learning Quick Start Course

Part 4: Natural Language Processing Applications

By Mark Chang

•  Introduction to Natural Language Processing
•  The Word2vec Neural Network
•  Hands-on Semantic Computation

Introduction to Natural Language Processing

Natural Language Processing

•  Natural language processing is a branch of artificial intelligence and linguistics.
   –  It studies how to process and make use of natural language.
•  Natural language understanding systems
   –  Convert natural language into a form that is easy for computers to process.
•  Natural language generation systems
   –  Convert computer program data into natural language.
•  https://zh.wikipedia.org/wiki/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86

Semantic Understanding

https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

Machine Translation

http://arxiv.org/abs/1409.0473

Poetry Generation

http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf

Image Caption Generation

http://arxiv.org/pdf/1411.4555v2.pdf

Visual Question Answering

http://arxiv.org/pdf/1505.00468v6.pdf

The Word2vec Neural Network

Word Semantics

•  The meaning of a word can be inferred from the contexts it appears in.

dog and cat have similar meanings:

The dog run. A cat run. A dog sleep. The cat sleep. A dog bark. The cat meows.

Semantic Vectors

The dog run. A cat run. A dog sleep. The cat sleep. A dog bark. The cat meows.

        the   a   run   sleep   bark   meow
dog      1    2    2      2      1      0
cat      2    1    2      2      0      1
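As a minimal sketch of how such count vectors can be built (the variable names and the per-sentence context window are illustrative choices, not from the slides), counting context words over the toy corpus could look like this:

from collections import Counter

# Toy corpus from the slide above.
sentences = [
    "the dog run", "a cat run", "a dog sleep",
    "the cat sleep", "a dog bark", "the cat meows",
]

# For each word, count every other word appearing in the same sentence.
context_counts = {}
for sentence in sentences:
    words = sentence.split()
    for i, word in enumerate(words):
        counts = context_counts.setdefault(word, Counter())
        for j, other in enumerate(words):
            if i != j:
                counts[other] += 1

print(context_counts["dog"])  # context counts for "dog"
print(context_counts["cat"])  # context counts for "cat"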

Semantic Vectors

dog (1, 2, ..., xn)
cat (2, 1, ..., xn)
car (0, 0, ..., xn)

Semantic Vector Similarity

•  The cosine similarity of A and B is:

$$\frac{A \cdot B}{|A||B|}$$

For dog = (a1, a2, ..., an) and cat = (b1, b2, ..., bn), the cosine similarity of dog and cat is:

$$\frac{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}\;\sqrt{b_1^2 + b_2^2 + \cdots + b_n^2}}$$
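A minimal sketch of this computation, using the count vectors from the table above (numpy is an implementation choice, not part of the slides):

import numpy as np

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

dog = np.array([1.0, 2.0, 2.0, 2.0, 1.0, 0.0])  # (the, a, run, sleep, bark, meow)
cat = np.array([2.0, 1.0, 2.0, 2.0, 0.0, 1.0])

print(cosine_similarity(dog, cat))  # about 0.857: dog and cat share most contexts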

Semantic Vector Arithmetic

Woman + King - Man = Queen

(Figure: in the embedding space, the offset King - Man is roughly parallel to Queen - Woman, so adding it to Woman lands near Queen.)

Semantic Vectors Are Too High-Dimensional

dog (x1 = the, x2 = a, ..., xn)

The dimensionality of a semantic vector equals the total vocabulary size.

The Word2vec Neural Network

(Figure: the one-hot encoding of dog, [1, 0, 0, 0], is fed into the word2vec network, which outputs a compressed semantic vector such as [1.2, 0.7, 0.5].)

One-Hot Encoding

With the vocabulary {dog, cat, run, fly}, each word is a one-hot vector; e.g. dog = [1, 0, 0, 0].

Initialize Weights

Two weight matrices are initialized, one row per vocabulary word (rows correspond to dog, cat, run, fly):

$$W = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \\ w_{41} & w_{42} & w_{43} \end{bmatrix} \qquad V = \begin{bmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \\ v_{41} & v_{42} & v_{43} \end{bmatrix}$$

Compressing the Semantic Vector

(Figure: multiplying dog's high-dimensional one-hot vector by V selects a single row, V1 = (v11, v12, v13), yielding a low-dimensional dense vector.)

Compressed Vectors

Each word's compressed vector is one row of the weight matrices:

dog = (v11, v12, v13)
cat = (v21, v22, v23)
run = (w31, w32, w33)
fly = (w41, w42, w43)

Context Word: dog and run

run appears in the context of dog, so training pushes the predicted probability toward 1:

$$V_1 \cdot W_3 = v_{11}w_{31} + v_{12}w_{32} + v_{13}w_{33}$$

$$\frac{1}{1 + e^{-V_1 \cdot W_3}} \approx 1$$

Context Word: cat and run

run also appears in the context of cat:

$$V_2 \cdot W_3 = v_{21}w_{31} + v_{22}w_{32} + v_{23}w_{33}$$

$$\frac{1}{1 + e^{-V_2 \cdot W_3}} \approx 1$$

Non-context Word: dog and fly

fly does not appear in the context of dog, so training pushes the predicted probability toward 0:

$$V_1 \cdot W_4 = v_{11}w_{41} + v_{12}w_{42} + v_{13}w_{43}$$

$$\frac{1}{1 + e^{-V_1 \cdot W_4}} \approx 0$$

Non-context Word: cat and fly

$$V_2 \cdot W_4 = v_{21}w_{41} + v_{22}w_{42} + v_{23}w_{43}$$

$$\frac{1}{1 + e^{-V_2 \cdot W_4}} \approx 0$$

Result

(Figure: the trained compressed vectors: dog = (v11, v12, v13), cat = (v21, v22, v23), run = (w31, w32, w33), fly = (w41, w42, w43). Words that share contexts, like dog and cat, end up with similar vectors.)

Hands-on Semantic Computation

Hands-on Semantic Computation
https://github.com/ckmarkoh/ntc_deeplearning_tensorflow/blob/master/sec4/semantics.ipynb

Training Data

anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that used violent means to destroy the organization of society it has also been taken up as a positive label by self defined anarchists the word anarchism is derived from the greek without archons ruler chief king anarchism as a political philosophy is the belief that rulers are unnecessary and should be abolished although there are differing interpretations of what this means anarchism also refers to related social movements that advocate the elimination of authoritarian institutions particularly the state the word anarchy as most anarchists use it does not imply chaos nihilism or anomie but rather a harmonious anti authoritarian society in place of what

Preprocessing

Raw text:

anarchism originated as a term of abuse first used against early working class radicals including the diggers of the english revolution and the sans culottes of the french revolution whilst the term is still used in a pejorative way to describe any act that used violent means to destroy the organization of society it has also been taken up ….

Tokenized into a word list:

['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against', 'early', 'working', 'class', 'radicals', 'including', 'the', 'diggers', 'of', 'the', 'english', 'revolution', 'and', 'the', 'sans', 'culottes', 'of', 'the', 'french', 'revolution', 'whilst', 'the', 'term', 'is', 'still', 'used', 'in', 'a', 'pejorative', 'way', 'to', 'describe', 'any', 'act', 'that', 'used', 'violent', 'means', 'to', 'destroy', 'the'... ]

Preprocessing

Build a dictionary from word frequency:

{"UNK": 0, "the": 1, "of": 2, "and": 3, "one": 4, "in": 5, "a": 6, "to": 7, "zero": 8, "nine": 9, .... }

# dictionary size
vocabulary_size = 50000

Words outside the dictionary are replaced with UNK:

'the', 'english', 'revolution', 'and', 'the', 'sans', 'culottes', 'of', 'the', 'french', 'revolution'…
→ 'the', 'english', 'revolution', 'and', 'the', 'sans', UNK, 'of', 'the', 'french', 'revolution'…

Each word is then converted to its dictionary id:

→ 1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, …
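The notebook's preprocessing follows the standard TensorFlow word2vec example; the sketch below shows the idea (the function and variable names are illustrative, not necessarily the notebook's):

import collections

def build_dataset(words, vocabulary_size=50000):
    # Keep the (vocabulary_size - 1) most frequent words;
    # every other word maps to UNK, which gets id 0.
    count = [('UNK', 0)]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = {word: i for i, (word, _) in enumerate(count)}
    data = [dictionary.get(word, 0) for word in words]  # 0 is the id of UNK
    reverse_dictionary = {i: word for word, i in dictionary.items()}
    return data, dictionary, reverse_dictionary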

Preprocessing

Word ids: 5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156, 128, 742, 477, 10572, 134, 1, 27549, 2, 1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, …

Skip-gram (input, output) pairs fed to the word2vec network, pairing each center word with its neighbors:

input   output
3084    5239
3084    12
12      3084
12      6
6       12
6       195
195     6
195     2

Preprocessing

5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156, 128, 742, 477, 10572, 134, 1, 27549, 2, 1, 103, 855, 3, 1, 15068, 0, 2, 1, 151, 855, …

generate_batch(batch_size=8, num_skips=2, skip_window=1)

input:  3084  3084    12    12     6     6   195   195
output: 5239    12  3084     6    12   195     6     2

skip_window=1 looks one word to each side of the center word; num_skips=2 generates two (input, output) pairs per center word; batch_size=8 is the number of pairs per batch.
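The notebook's generate_batch keeps its own cursor into the data stream and samples context words randomly; the deterministic sketch below (the data and start parameters are illustrative additions) reproduces the batch shown above:

import numpy as np

def generate_batch(data, start=0, batch_size=8, num_skips=2, skip_window=1):
    # For each center word, emit num_skips (input, output) pairs taken
    # from the skip_window words on each side.
    assert num_skips <= 2 * skip_window
    inputs = np.zeros(batch_size, dtype=np.int32)
    labels = np.zeros((batch_size, 1), dtype=np.int32)
    n = 0
    center = start + skip_window
    while n < batch_size:  # assumes data is long enough for the batch
        offsets = [o for o in range(-skip_window, skip_window + 1) if o != 0]
        for o in offsets[:num_skips]:
            inputs[n], labels[n, 0] = data[center], data[center + o]
            n += 1
            if n == batch_size:
                break
        center += 1
    return inputs, labels

data = [5239, 3084, 12, 6, 195, 2, 3137]
print(generate_batch(data))  # inputs [3084 3084 12 ...], labels [[5239] [12] ...]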

Computational Graph

train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

with tf.device('/cpu:0'):
    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    embed = tf.nn.embedding_lookup(embeddings, train_inputs)
    nce_weights = tf.Variable(
        tf.truncated_normal([vocabulary_size, embedding_size],
                            stddev=1.0 / math.sqrt(embedding_size)))
    nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
    loss = tf.reduce_mean(
        tf.nn.nce_loss(nce_weights, nce_biases, embed, train_labels,
                       num_sampled, vocabulary_size))

optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

Device

with tf.device('/cpu:0')

Runs the computational graph defined inside it on the CPU. Because TensorFlow does not support running embedding_lookup on the GPU, it must be placed on the CPU.

Inputs & Outputs

train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

train_inputs: 3084, 3084, 12, 12, 6, 6, 195, 195
train_labels: 5239, 12, 3084, 6, 12, 195, 6, 2

Embedding Lookup

embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
embed = tf.nn.embedding_lookup(embeddings, train_inputs)

(Figure: for a train_inputs entry of 2, embedding_lookup selects row 2 of the embeddings matrix.)
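Embedding lookup is simply row selection; a toy numpy illustration (the numbers are made up):

import numpy as np

embeddings = np.array([[0.1, 0.2, 0.3],    # row 0
                       [0.4, 0.5, 0.6],    # row 1
                       [0.7, 0.8, 0.9]])   # row 2
train_inputs = np.array([2, 0])
print(embeddings[train_inputs])  # rows 2 and 0, the same result as embedding_lookup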

NCE Weights

•  NCE: Noise Contrastive Estimation

nce_weights = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                        stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

NCE Loss

loss = tf.reduce_mean(
    tf.nn.nce_loss(nce_weights, nce_biases, embed, train_labels,
                   num_sampled, vocabulary_size))

For a positive (context) pair the loss pushes the predicted probability toward 1, and for each sampled negative (non-context) pair toward 0:

$$\frac{1}{1 + e^{-V_2 \cdot W_3}} \approx 1 \qquad \frac{1}{1 + e^{-V_2 \cdot W_4}} \approx 0$$

Positive and Negative Cost

$$\text{cost} = \log\left(\frac{1}{1 + e^{-v_I^T w_{pos}}}\right) + \sum_{neg} \log\left(1 - \frac{1}{1 + e^{-v_I^T w_{neg}}}\right)$$

where v_I is the input word's embedding, w_pos the weights of an observed context word, and w_neg the weights of sampled non-context words.
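A simplified numpy rendering of this objective (tf.nn.nce_loss additionally handles the negative sampling and biases internally; v_i, w_pos, and w_negs here are hypothetical vectors):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def skipgram_objective(v_i, w_pos, w_negs):
    # log sigmoid(v_i . w_pos) + sum over negatives of log(1 - sigmoid(v_i . w_neg));
    # training maximizes this (equivalently, minimizes its negation as a loss).
    positive = np.log(sigmoid(np.dot(v_i, w_pos)))
    negative = sum(np.log(1.0 - sigmoid(np.dot(v_i, w))) for w in w_negs)
    return positive + negative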

Train

feed_dict = {train_inputs: batch_inputs,
             train_labels: batch_labels}
_, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)

batch_inputs: 3084, 3084, 12, 12, 6, 6, 195, 195
batch_labels: 5239, 12, 3084, 6, 12, 195, 6, 2

loss_val is the loss value for this batch.
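Putting it together, the training loop looks roughly like this (a sketch: num_steps and the reporting interval are illustrative, and generate_batch refers to the notebook's batching function shown earlier):

num_steps = 100001
with tf.Session() as session:
    tf.initialize_all_variables().run()  # variable initializer in the TF 0.x API
    average_loss = 0.0
    for step in range(num_steps):
        batch_inputs, batch_labels = generate_batch(
            batch_size=8, num_skips=2, skip_window=1)
        feed_dict = {train_inputs: batch_inputs, train_labels: batch_labels}
        _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
        average_loss += loss_val
        if step > 0 and step % 2000 == 0:
            print("step %d: average loss %.3f" % (step, average_loss / 2000))
            average_loss = 0.0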

Result

final_embeddings:

array([[-0.02782757, -0.16879494, -0.06111901, ..., -0.25700757, -0.07137159, 0.0191142 ], [-0.00155336, -0.00928817, -0.0535327 , ..., -0.23261793, -0.13980433, 0.18055709], [ 0.02576068, -0.06805354, -0.03688766, ..., -0.15378961, 0.00459271, 0.0717089 ], ..., [ 0.01061165, -0.09820389, -0.09913248, ..., 0.00818674, -0.12992384, 0.05826835], [ 0.0849214 , -0.14137401, 0.09674817, ..., 0.04111136, -0.05420518, -0.01920278], [ 0.08318492, -0.08202577, 0.11284919, ..., 0.03887166, 0.01556483, 0.12496017]], dtype=float32)

Visualization

(Figure: two-dimensional visualization of the learned embeddings, from the notebook; omitted here.)

Most Similar Words

def get_most_similar(word, top=10):
    wid = dictionary.get(word, -1)
    result = np.dot(final_embeddings[wid:wid+1, :], final_embeddings.T)
    result = result[0].argsort().tolist()
    result.reverse()
    for idx in result[:top]:
        print(reverse_dictionary[idx])

get_most_similar("one")

one
six
two
four
seven
three
...
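The same embeddings support the vector arithmetic from earlier (Woman + King - Man = Queen). A sketch, assuming (as in the notebook) that final_embeddings are length-normalized so a dot product acts as cosine similarity; the analogy helper itself is not in the slides:

import numpy as np

def analogy(a, b, c, top=5):
    # Find the words whose embedding is closest to emb(b) - emb(a) + emb(c).
    v = (final_embeddings[dictionary[b]]
         - final_embeddings[dictionary[a]]
         + final_embeddings[dictionary[c]])
    scores = np.dot(final_embeddings, v)
    best = scores.argsort()[::-1][:top]
    return [reverse_dictionary[i] for i in best]

print(analogy("man", "king", "woman"))  # ideally includes "queen"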

About the Instructor

Mark Chang

•  Email: ckmarkoh at gmail dot com
•  Blog: http://cpmarkchang.logdown.com
•  Github: https://github.com/ckmarkoh
•  Facebook: https://www.facebook.com/ckmarkoh.chang
•  Slideshare: http://www.slideshare.net/ckmarkohchang
•  Linkedin: https://www.linkedin.com/pub/mark-chang/85/25b/847
