自然言語処理のためのdeep learning

自然言語処理のための

Deep Learning

東京工業大学奥村高村研究室

D1 菊池悠太 kiyukuta

20130911

Deep Learning for Natural Language Processing

13年9月28日土曜日

2つのモチベーション　- NLPでニューラルネットを　- 言語の意味的な特徴を

NNrarr多層timesrarrpretrainingrarrbreakthrough

焦って早口過ぎてたら教えて下さい

A yet another brief introduction to neural networkshttpwwwslidesharenetyutakikuchi927a-yet-another-brief-introduction-to-neural-

networks-26023639

Neural networkベースの話RBMとか苦しい

Deep Learning

for NLP

Deep Learning

Deep Learning概要Neural NetworkふんわりDeepへの難しさPretrainingの光Stacked Autoencoder DBN

for NLP

Deep Learning

for NLP

Deep Learning

Deep LearningUnsupervised Representation Learning

生データ

特徴抽出学習器- 特徴抽出器

- 人手設計答え

答えDeep Learning

従来

Deep Learning

結論からいうと

Deep Learningとは良い初期値を（手に入れる方法を）

手に入れた

多層Neural Networkです

Deep Learning

生画像から階層毎に階層的な特徴をラベル無しデータから教師なしで学習

生画像

高次な特徴はより低次な特徴の組み合わせで表現

＝＝＝

低次レベルの特徴は共有可能

将来のタスクが未知でも起こる世界は今と同じ

Deep Learning

A yet another brief introduction to

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

生データ抽出した素性

予測

Neural Network

入力層x

隠れ層z

出力層y

例えば手書き数字認識

784次元

10次元

MNIST (2828の画像)

[005 005 005 040 005 005 015 005 015 005] 10次元の確率分布（左から入力画像が　　0である確率　　 1である確率　　　　　 9である確率）

2828=784次元の数値ベクトル

入力層x

隠れ層z

出力層y

Neuron

隠れユニットjの入力層に対する重みW1

隠れユニットj

出力ユニットk

出力ユニットkの隠れ層に対する重みW2

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

層間の重みを行列で表現

Neural Networkの処理- Forward propagation- Back propagation- Parameter update

Forward Propagation

入力層x

隠れ層z

出力層y入力に対し出力を出す

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

入力層から隠れ層への情報の伝播

非線形活性化関数f( )

tanh とかsigmoid とか

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

tanhsigmoid reLU maxout

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力力の情報を重み付きで受け取る

隠れユニットが出す出力力値が決まる

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

出力力層用の非線形活性化関数σ( )

　　　タスク依存

隠れ層から出力層への情報の伝播

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

タスク依存の出力層

解きたいタスクによってσが変わる

- 回帰- 二値分類- 多値分類- マルチラベリング

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

出力に値域はいらない

恒等写像でそのまま出力(a) = a

y = (W2z + b2)

二値分類のケース

出力層は確率

σは00~10であって欲しい

(a) = 11+exp(a)

Sigmoid関数入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

多値分類のケース

出力は確率分布

各ノード0以上総和が1

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

マルチラベリングのケース

各々が独立に二値分類

(a) = 11+exp(a)

element-wiseでSigmoid関数

[01] [01] [01] y = (W2z + b2)

ちなみに多層になった場合

出力層だけタスク依存

隠れ層はぜんぶ同じ

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

正解tNNが入力に対する出力の予測を間違えた場合

正解するように修正したい

Back Propagation

入力層x

隠れ層z

出力層y

修正対象層間の重み

z = f(W1x + b1)uarrとバイアス

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

正解t 誤差関数を最小化するよう修正

E() = 12y() t2

k=1 tk log yk

E = t log y (1 t) log(1 y)

k=1 t log y + (1 t) log(1 y)

いずれも予測と正解が違うほど大きくなる

Back Propagation

入力層x

隠れ層z

出力層y

正解t

　出力ラベルと正解の差

ノードの誤差を計算y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t ノードの誤差を計算

自分が情報を伝えた先の誤差が伝播してくる

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

ノードの誤差を計算

自分の影響で上で発生した誤差

Back Propagation

入力層x

隠れ層z

出力層y

正解t 重みの勾配を計算

自分が上に伝えた情報で発生した誤差

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

正解t 重みの更新

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

-Gradient Descent -Stochastic Gradient Descent -SGD with mini-batch　　修正するタイミングの違い

Neural Networkの処理まとめ

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

正解t 誤差と勾配を計算

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

正解t 勾配方向へ重み更新

W1 = W1 EnW1

W2 = W2 EnW2

ちなみにAutoencoderNeural Networkの特殊系

1 入力と出力の次元が同じ2 教師信号が入力そのもの

入力を圧縮1して復元

1 圧縮(隠れ層が入力層より少ない)でなくても適切に正則化すればうまくいく13年9月28日土曜日

Neural Networkの特殊系

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

マルチラベリングのケースに該当

画像の場合各画素(ユニット)ごとに明るさ(00黒 10白)を判定するため

Autoencoderの学習するもの

Denoising Autoencoder

add noise

denoise

正則化法の一つ再構築+ノイズの除去13年9月28日土曜日

Deep Learning

Deepになると

many figures from httpwwwcstorontoedu~fleetcoursescifarSchool09slidesBengiopdf

仕組み的には同じ

隠れ層が増えただけ13年9月28日土曜日

問題は初期化NNのパラメータ

初期値は乱数

多層(Deep)になってもOK13年9月28日土曜日

乱数だとうまくいかない

NNはかなり複雑な変化をする関数なので悪い局所解にいっちゃう

Learning Deep Architectures for AI (2009)

乱数だとうまくいかないNN自体が表現力高いので上位二層分のNNだけで訓練データを再現するには事足りちゃう

ただしそれは汎化能力なし過学習

inputのランダムな写像だがinputの情報は保存している

Greedy Layer-Wise Training of Deep Networks [Bengio+ 2007]

Deep Learning

2006年ブレークスルー(Hinton+2006)

Greedy Layer-wise unsupervised pretraining

層ごとにまずパラメータを更新

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

どうなるの良い初期値を

得られるようになりました

Why does Unsupervised Pre-training Help Deep Learning [Erhan+ 2010]

[Bengio+ 2007]

なぜpre-trainingが良いのか諸説あり

手に入れた1

Neural Network2

つまり

1 諸説あり Why does Unsupervised Pre-training Help Deep Learning [Erhan+ 2010]

2 stacked autoencoderの場合

Deep Learning

AutoencoderNeural Networkの特殊系

訓練データ中の本質的な情報を捉える

入力を圧縮して復元

圧縮ということは隠れ層は少なくないといけないの

そうでなくても正則化などでうまくいく

Autoencoder

これは正確にはdenoising autoencoderの図httpkiyukutagithubio20130820hello_autoencoderhtml

Stacked Autoencoder

このNNの各層をその層への入力力を再構築するAutoencoderとして事前学習

Stacked Autoencoder

Deep Learning

for NLP

画像処理のように

Deeeeeeepって感じではないNeural Network-basedくらいのつもりで

Deep Learning for

Hello world My name is Tom

(28 x 28)

28 x 28=

Input size

Image Sentence

任意の長さの文を入力力とするには単語(句句や文も)をどうやって表現する

(28 x 28)

28 x 28=

Input representation

Image Sentence

言い換えると任意の長さの文を入力力とするには単語(句句や文も)をどうやって表現する

NLPでNNを使いたい

単語の特徴をうまく捉えた表現の学習

Keywords任意の長さの文を入力力とするには単語(句句や文も)をどうやって表現する

Distributed word representation

-‐‑‒ convolutional-‐‑‒way-‐‑‒ recursive-‐‑‒way

Neural language model

phrase sentence-‐‑‒levelrepresentation

Keywords

Word representation

自然言語処理における単語の表現方法

ベクトル(Vector Space Model VSM)

単語の意味をベクトルで表現

単語 rarr ベクトル

いろいろな方法- One-hot- Distributional- Distributed 本題

Word representation

One-hot representation各単語に個別IDを割り当て表現

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

スパースすぎて訓練厳しい汎化能力なくて未知語扱えず

Distributional representation単語の意味は周りの文脈によって決まる

Standardな方法13年9月28日土曜日

Distributed representationdense low-dimensional real-valued

Neural Language Modelにより学習

= Word embedding

構文的意味的な情報を埋め込む

Distributed Word representation

Distributed Phrase representation

Distributed Sentence representation

Distributed Document representation

recursive勢の一強

さて

Neural Language Model

Distributed Word Representation

の学習

言語モデルとはP(ldquo私の耳が昨日からじんじん痛むrdquo)

P(ldquo私を耳が高くに拡散して草地rdquo) はぁ

うむ

与えられた文字列の生成確率を出力するモデル

N-gram言語モデル単語列の出現確率を N-gram ずつに分解して近似

次元の呪いを回避

N-gram言語モデルの課題1 実質的には長い文脈は活用できない　　せいぜいN=12

2 ldquo似ている単語rdquoを扱えない　　

P (house|green)

Neural Language Modelとは

Neural Networkベースの言語モデル

　- 言語モデルの学習　- Word Embeddingsの学習同時に学習する

単語そのもの

その単語のembedding

|辞書|次元の確率分布どの単語が次に出てくるかを予測

A Neural Probabilistic Language Model (bengio+ 2003)

n語の文脈が与えられた時

次にどの単語がどのくらいの確率でくるか

似ている単語に似たembeddingを与えられればNN的には似た出力を出すはず　　　語の類似度を考慮した言語モデルができる

Ranking language model[Collobert amp Weston2008]

仮名

単語列に対しスコアを出すNN

正しい単語列最後の単語をランダムに入れ替え

gtとなるように学習

他の主なアプローチ

Recurrent Neural Network [Mikolov+ 2010]

t番目の単語の入力力時に同時にt-‐‑‒1番目の内部状態を文脈として入力力

1単語ずつ入力力出力力は同じく語彙上の確率率率分布

word2vecの人

word2vechttpscodegooglecompword2vec

研究進展人生 rarr 苦悩人生恋愛研究 rarr 進展

他に

単語間の関係のoffsetを捉えている仮定

king - man + woman ≒ queen

単語の意味についてのしっかりした分析

先ほどは単語表現を学習するためのモデル(Bengiorsquos CampWrsquos Mikolovrsquos)

以降はNNで言語処理のタスクに取り組むためのモデル

（結果的に単語ベクトルは学習されるが　　おそらくタスク依存なものになっている)

Collobert amp Weston[2008]

convolutional-‐‑‒way はじめに

2008年の論文文レベルの話のとこだけ

他にMulti-task learningLanguage modelの話題がある

ここは2層Neural Network

入力

隠れ層

出力層

Neural Networkに入力するためにどうやって

固定次元に変換するか

任意の長さの文

convolutional-‐‑‒wayCollobert amp Weston[2008]

convolutional-‐‑‒way

単語をd次元ベクトルに(word embedding + α)

3単語をConvolutionしてlocalな特徴を得る

共通の重み( )

max(　　　　　　)

各次元の最大値を取得

max(　　　　　　)

次元固定のキモ

固定次元の入力

任意サイズの入力

Neural Networkで学習

Richard Socher一派recursive-‐‑‒way

Recursive Neural Network

Recursive Autoencoder

Parsing Natural Scenes and Natural Language with Recursive Neural Networks (ICML2011)

Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions (EMNLP2011)

recursive-‐‑‒wayRecursive Autoencoder

感情分布の推定

Distributed Representation

Autoencoder

2つを1つに圧縮用の共通のAutoencoderを再帰的に適用

Stacked Autoencoderは階層毎に別のものを訓練

recursive-‐‑‒wayRecursive Neural Network

Parsing

Neural Network

Parsing

recursive-‐‑‒wayRecursive Neural NetworkParsing Natural Scenes and Natural Language with Recursive Neural Networks (ICML2011)

モデル出力の最大スコア

正解構文木のスコア

Parsing

構文木中の全非終端ノードのsの総和

Socher+rsquosほかの

Compositional semanticsSemantic Compositionality through Recursive Matrix-Vector Spaces (EMNLP2012)

各単語にベクトルと行列を割り当て

意味隣接語への影響 not very

Paraphrase

入力二つの文出力二つの文が同じ意味か否か

Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection (NIPS2011)

- Unfolding RAEで文の意味- Dynamic poolingで任意の長さの２文を比べる

単語の表現方法としての密で低次元な連続値ベクトル(word embedding)

Keywords

word embeddingsを学習するためのアプローチex) Bengiorsquos recurrent ranking

文ごとに長さが異なるのを扱うアプローチ

Recursiveな方は途中のphraseやsentenceにおける単語ベクトルも保存

具体例の説明が重くなりすぎたかも

Distributed (Word|Phrase|Sentence|Document)

Representation

Recursive Autoencoder一強他の枠組みは

どうする

よりよい単語の表現Neural Language Model

意味

Compositional Semanticsというタスク自体はdeep learning

以外でも最近盛ん

既存タスクへの応用

単語類似度分類構造学習

要約翻訳推薦

- 学習された単語のembeddingを追加素性に使う他の方法は

おわり

networks-26023639

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

networks-26023639

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

生データ

答えDeep Learning

従来

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

手に入れた

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

生画像

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

＝＝＝

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Neural Networks

菊池悠太

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Neural Network

入力層x

隠れ層z

出力層y

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

予測

Neural Network

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

784次元

10次元

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

Neuron

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

隠れユニットj

出力ユニットk

入力層x

隠れ層z

出力層y

Neuron

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

行列で表現

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

input x

output y

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

z = f(W1x + b1)f( )

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

f(x0) f(x1) f(x2) f(x3)

f(x) =

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

y = (W2z + b2)

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

実数

入力層x

隠れ層z

出力層y

回帰のケース

y = (W2z + b2)

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

出力層は確率

(a) = 11+exp(a)

隠れ層z

出力層y

y = (W2z + b2)

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

Softmax関数

sum( 02 07 01 )=10

(a) = exp(a)exp(a)

y = (W2z + b2)

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力層x

隠れ層z

出力層y

(a) = 11+exp(a)

[01] [01] [01] y = (W2z + b2)

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

出力層

隠れ層1

隠れ層N

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

y = (W2z + b2)

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

E() = 12y() t2

k=1 tk log yk

k=1 t log y + (1 t) log(1 y)

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

正解t

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y t

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

正解t

y = y t

z = WT2 yf (az)

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Forward Propagation

入力層x

隠れ層z

出力層y

z = f(W1x + b1)

y = (W2z + b2)

入力から予測

input x

output y

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Back Propagation

入力層x

隠れ層z

出力層y

z = WT2 yf (az)

y = y tEnW2

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Update parameters

入力層x

隠れ層z

出力層y

W1 = W1 EnW1

W2 = W2 EnW2

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

z = f(W1x + b1)

y = (W2z + b2)

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

y = (W2z + b2)

(a) = 11+exp(a)

element-wiseで

Sigmoid関数

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

add noise

denoise

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deepになると

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

初期値は乱数

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

層ごとに学習

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

どうやって

Autoencoder

RBMも

[Bengio2007]

[Hinton2006]

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

[Bengio+ 2007]

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

手に入れた1

Neural Network2

つまり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Stacked Autoencoder

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning

for NLP

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Deep Learning for

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

(28 x 28)

28 x 28=

Input size

Image Sentence

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

(28 x 28)

28 x 28=

Image Sentence

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Word representation

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

辞書V

236237

the a of

dog sky

0 |V|1 00 000 0

1 00 000 0

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

= Word embedding

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

さて

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

の学習

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

うむ

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

P (house|green)

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

単語そのもの

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

仮名

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

word2vecの人

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

他に

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

入力

隠れ層

出力層

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

共通の重み( )

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

max(　　　　　　)

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Autoencoder

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Parsing

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Neural Network

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Parsing

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Paraphrase

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Keywords

Representation

どうする

意味

要約翻訳推薦

おわり

Representation

どうする

意味

要約翻訳推薦

おわり

Representation

どうする

意味

要約翻訳推薦

おわり

Representation

どうする

意味

要約翻訳推薦

おわり

Representation

どうする

意味

要約翻訳推薦

おわり

Representation

どうする

意味

要約翻訳推薦

おわり

要約翻訳推薦

おわり

自然言語処理のためのdeep learning

Technology

scala による自然言語処理

自然言語処理ai uipathuipath...

コロナストレスに負けないための...

米軍との自然災害対処協力...

aiを支える自然言語処理技術 - ntt.co.jp · ai...

自然言語処理 2010

自然言語処理 2013 no.13

エージェントサービスを実現する自然言語...

自然言語処理における...

“より人間に近い自然言語処理システムの構築とその応用”...

【化成処理･･･クロメート処理とその他】€化成処理...【化成処理･･･クロメート処理とその他】...

apache solrで実現する共創のエコシステム ...

deep learningと自然言語処理

gunosy における aws...

リクルート式...

perl で自然言語処理

自然言語処理のための変分ベイズ法 -...

自然言語処理 2007 natural language processing

社団法人電子情報通信学会 the institute of...

自然言語処理に適した...