TRANSCRIPT
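Input X, state S, output O, with $x_t \in \mathbb{R}^n$ and $s_t \in \mathbb{R}^m$.
$s_t = f(W x_t + U s_{t-1})$
$s_t$ depends on $s_{t-1}$ and $x_t$; $f$ is a nonlinearity such as the sigmoid $\sigma$, $\tanh$ or ReLU, and $y_t$ is the output at step $t$.
$o_t = \mathrm{Softmax}(V s_t)$
Parameters: $(U, V, W)$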
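The recurrence and output equations can be sketched in NumPy as follows. This is a minimal illustration, not the presenter's code: the names `rnn_step` and `softmax`, the dimensions, the random initialisation and the choice of `tanh` for $f$ are all assumptions made for the example.

```python
import numpy as np

def softmax(z):
    # Subtract the maximum for numerical stability; the result is unchanged.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def rnn_step(x_t, s_prev, U, V, W, f=np.tanh):
    # s_t = f(W x_t + U s_{t-1})
    s_t = f(W @ x_t + U @ s_prev)
    # o_t = Softmax(V s_t)
    o_t = softmax(V @ s_t)
    return s_t, o_t

# Hypothetical sizes: n inputs, m state units, k output classes.
rng = np.random.default_rng(0)
n, m, k = 4, 3, 5
W = rng.normal(scale=0.1, size=(m, n))
U = rng.normal(scale=0.1, size=(m, m))
V = rng.normal(scale=0.1, size=(k, m))

s = np.zeros(m)                      # initial state
for x in rng.normal(size=(6, n)):    # a sequence of 6 input vectors
    s, o = rnn_step(x, s, U, V, W)
```

The same three matrices are reused at every step, which is why the gradient with respect to each of them sums contributions over all time steps.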
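$E_t(\hat{o}_t, o_t) = -\hat{o}_t \log o_t$
Here $\hat{o}_t$ is the target at step $t$ and $o_t$ is the softmax output. The loss follows the curve $y = -\log(x)$: $y$ is close to 0 when $x$ is close to 1 and grows without bound as $x \to 0$.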
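A short sketch of the per-step cross-entropy loss, assuming a one-hot target and a predicted probability vector. The function name `cross_entropy`, the `eps` guard and the example vectors are illustrative assumptions.

```python
import numpy as np

def cross_entropy(target, predicted, eps=1e-12):
    # E_t = -sum(target * log(predicted)); eps guards against log(0).
    return float(-np.sum(target * np.log(predicted + eps)))

target = np.array([0.0, 1.0, 0.0])         # one-hot: class 1 is correct
confident = np.array([0.05, 0.90, 0.05])   # most mass on the correct class
unsure = np.array([0.40, 0.20, 0.40])      # little mass on the correct class

loss_confident = cross_entropy(target, confident)
loss_unsure = cross_entropy(target, unsure)
```

With a one-hot target only the probability assigned to the correct class matters, so the loss reduces to $-\log$ of that probability and is larger for the unsure prediction.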
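Total error $E$ and its gradient $\frac{\partial E}{\partial U}$:
$\frac{\partial E}{\partial U} = \sum_t \frac{\partial E_t}{\partial U}$
$\frac{\partial E_t}{\partial U} = \sum_{k=0}^{t} \frac{\partial E_t}{\partial y_t} \frac{\partial y_t}{\partial s_t} \left( \prod_{j=k+1}^{t} \frac{\partial s_j}{\partial s_{j-1}} \right) \frac{\partial s_k}{\partial U}$
The gradient of $E_t$ contains the Jacobians $\frac{\partial s_j}{\partial s_{j-1}}$, multiplied together as $\prod_{j=k+1}^{t} \frac{\partial s_j}{\partial s_{j-1}}$.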
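The behaviour of the product of Jacobians $\partial s_j / \partial s_{j-1}$ can be checked numerically. The sketch below assumes $f = \tanh$, so that $\partial s_j / \partial s_{j-1} = \mathrm{diag}(1 - s_j^2)\, U$, and deliberately rescales $U$ to spectral norm 0.5. Both choices, and all sizes, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, steps = 4, 3, 20

W = rng.normal(scale=0.1, size=(m, n))
U = rng.normal(size=(m, m))
U *= 0.5 / np.linalg.norm(U, 2)   # assumption: spectral norm of U set to 0.5

s = np.zeros(m)
product = np.eye(m)               # running product of Jacobians
norms = []
for x in rng.normal(size=(steps, n)):
    s = np.tanh(W @ x + U @ s)
    # For tanh, ds_j/ds_{j-1} = diag(1 - s_j^2) @ U.
    jacobian = np.diag(1.0 - s**2) @ U
    # The newest Jacobian multiplies on the left, as in the chain rule.
    product = jacobian @ product
    norms.append(np.linalg.norm(product, 2))
```

Because each factor here has spectral norm at most 0.5, the norm of the product after 20 steps is bounded by $0.5^{20}$, so contributions from distant time steps to the gradient become negligible. A recurrent matrix with a large spectral norm can produce the opposite effect.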
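The LSTM keeps a cell state $c_t$ alongside the hidden state $s_t$; $\sigma$ is the sigmoid.
$i_t = \sigma(x_t U^i + s_{t-1} W^i)$
$f_t = \sigma(x_t U^f + s_{t-1} W^f)$
$o_t = \sigma(x_t U^o + s_{t-1} W^o)$
$g_t = \tanh(x_t U^g + s_{t-1} W^g)$
$c_t = f_t * c_{t-1} + i_t * g_t$
$s_t = o_t * \tanh(c_t)$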
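The six LSTM equations translate almost line for line into NumPy. This is a minimal sketch under assumptions: the function name `lstm_step`, the dictionary layout of the weights, the sizes and the random initialisation are hypothetical, and biases are omitted because the equations above do not include them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, s_prev, c_prev, U, W):
    # Row-vector convention, matching x_t U + s_{t-1} W in the equations.
    i = sigmoid(x_t @ U["i"] + s_prev @ W["i"])   # input gate
    f = sigmoid(x_t @ U["f"] + s_prev @ W["f"])   # forget gate
    o = sigmoid(x_t @ U["o"] + s_prev @ W["o"])   # output gate
    g = np.tanh(x_t @ U["g"] + s_prev @ W["g"])   # candidate state
    c_t = f * c_prev + i * g                      # element-wise cell update
    s_t = o * np.tanh(c_t)                        # new hidden state
    return s_t, c_t

# Hypothetical sizes: n input features, m hidden units.
rng = np.random.default_rng(0)
n, m = 4, 3
U = {k: rng.normal(scale=0.1, size=(n, m)) for k in "ifog"}
W = {k: rng.normal(scale=0.1, size=(m, m)) for k in "ifog"}

s, c = np.zeros(m), np.zeros(m)
for x in rng.normal(size=(5, n)):
    s, c = lstm_step(x, s, c, U, W)
```

The `*` in the last two equations is element-wise multiplication, which is why plain `*` is used on the NumPy vectors rather than `@`.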
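$i_t$, $f_t$ and $o_t$ are gates computed with $\sigma$; $g_t$ is the candidate state and $s_t$ is the new hidden state.
Inputs to each step: $c_{t-1}$, $x_t$ and $s_{t-1}$.
Forget gate: $f_t$ is computed from $s_{t-1}$ and $x_t$ with a sigmoid and decides how much of $c_{t-1}$ to keep.
Input gate and candidate: a sigmoid gives $i_t$ and a $\tanh$ gives $g_t$.
Cell update from $c_{t-1}$ to $c_t$: scale $c_{t-1}$ by $f_t$, then add $i_t * g_t$, giving $f_t * c_{t-1} + i_t * g_t$.
Output: a sigmoid of $x_t$ and $s_{t-1}$ gives $o_t$; $o_t$ and $c_t$ together give $s_t$.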
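$W = [w_1, \dots, w_t]$, $w_t \in V$
$V$ is the vocabulary; the model assigns a probability $p(w_1, \dots, w_t)$ and predicts $w_{t+1}$.
$p(w_1, \dots, w_t) = \mathrm{Softmax}(s_t^\top e_{w_t})$
$e \in E$, $[e_{w_1}, \dots, e_{w_t}]$
$w_t$ is predicted from $w_1, \dots, w_{t-1}$, using the embedding $e_{w_t}$.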
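One way to read the softmax over $s_t^\top e_w$ is as a score for every word in the vocabulary, normalised into a distribution. The sketch below assumes that reading and also assumes the embedding dimension equals the state dimension; the function name `next_word_distribution`, the toy vocabulary and the random values are hypothetical.

```python
import numpy as np

def next_word_distribution(s_t, E):
    # One score per vocabulary word: the dot product of s_t with its embedding.
    logits = E @ s_t
    logits = logits - logits.max()   # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "down"]   # toy vocabulary
E = rng.normal(size=(len(vocab), 3))    # one embedding row per word
s_t = rng.normal(size=3)                # stand-in for the RNN state

probs = next_word_distribution(s_t, E)
predicted = vocab[int(np.argmax(probs))]
```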
$L(\theta) = \sum_t \log P(w_{t-n+1}, \dots, w_{t-1})$
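Given a word sequence $w_1, w_2, \dots, w_n$, predicting $w_n$ from $w_1, w_2, \dots, w_{n-1}$ is scored with the entropy $H$ and the perplexity $PP$.
$p(w_1, w_2, \dots, w_m)$ is the probability of the sequence $(w_1, w_2, \dots, w_m)$.
$H = -\lim_{m \to \infty} \frac{1}{m} \sum_{w_1, \dots, w_m} p(w_1, \dots, w_m) \log_2 p(w_1, \dots, w_m)$
For a single sufficiently long sequence this is estimated as
$H = -\frac{1}{m} \log_2 p(w_1, \dots, w_m)$
$PP = 2^H$
$PP = P(w_1, \dots, w_m)^{-\frac{1}{m}}$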
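Perplexity is straightforward to compute once the model's per-word probabilities are available. The sketch below assumes those are the conditional probabilities $p(w_i \mid w_1, \dots, w_{i-1})$, whose product is the sequence probability by the chain rule; the function name `perplexity` and the example values are illustrative.

```python
import math

def perplexity(token_probs):
    # token_probs[i] is assumed to be p(w_i | w_1, ..., w_{i-1}).
    m = len(token_probs)
    log2_p = sum(math.log2(p) for p in token_probs)  # log2 p(w_1, ..., w_m)
    H = -log2_p / m                                  # per-word entropy estimate
    return 2.0 ** H                                  # PP = 2^H

# A model that spreads probability evenly over 4 choices at every step.
uniform_probs = [0.25] * 8
pp_uniform = perplexity(uniform_probs)

# The same value through the direct form PP = p(w_1..w_m)^(-1/m).
pp_direct = math.prod(uniform_probs) ** (-1.0 / len(uniform_probs))
```

A uniform choice among 4 words at every step gives a perplexity of 4, which matches the usual reading of perplexity as an effective branching factor.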
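Starting from $h_1$ and $w_1$, the next word $w_2$ is drawn from $p(w_2 \mid w_1)$ with state $s_2$, and so on for each $w_t$, giving $\{w_1, \dots, w_t\}$.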
β
$\mathrm{Accuracy}(A) = \frac{TP + TN}{TP + FP + TN + FN}$
$\mathrm{Precision}(P) = \frac{\text{number of relevant items retrieved}}{\text{number of retrieved items}}$
$\mathrm{Recall}(R) = \frac{\text{number of relevant items retrieved}}{\text{number of relevant items}}$
$F_\beta = \frac{(1 + \beta^2) \cdot P \cdot R}{\beta^2 P + R}$
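$\beta$ sets the weight of recall relative to precision: $\beta > 1$ favors recall and $\beta < 1$ favors precision.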
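The four metrics can be computed directly from confusion-matrix counts. In this sketch the function name `classification_metrics` and the example counts are hypothetical, and it assumes every denominator is non-zero (at least one retrieved item and at least one relevant item).

```python
def classification_metrics(tp, fp, tn, fn, beta=1.0):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)   # relevant retrieved / all retrieved
    recall = tp / (tp + fn)      # relevant retrieved / all relevant
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return accuracy, precision, recall, f_beta

# Made-up counts for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, tn=85, fn=5)
_, _, _, f2 = classification_metrics(tp=8, fp=2, tn=85, fn=5, beta=2.0)
```

In this example precision (0.8) is higher than recall (8/13), so raising $\beta$ to 2 pulls the score down towards recall.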