connectionist temporal classification with maximum entropy ...06-09... · hu liu sheng jin...
TRANSCRIPT
Connectionist Temporal Classification with Maximum EntropyRegularization
Hu Liu Sheng Jin Changshui Zhang
Department of AutomationTsinghua University
NeurIPS, 2018
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 1 / 9
Introduction to Connectionist Temporal Classification (CTC)
P(‘dog’| )
dd_oo___gdoo____gg_____dogg
dddoooggg
= P
The most suitable path
@Lctc
@ytk
= � 1
p(l|X)ytk
X
{⇡|⇡2B�1(l),⇡t=k}p(⇡|X)
CTC lacks exploration and is prone to fall into worse local minima.Output overconfident paths (overfitting).Output paths with peaky distribution.
CTC Drawbacks:
positive feedbackerror signal
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 2 / 9
Maximum Conditional Entropy Regularization for CTC (EnCTC)
EnCTC:
Lenctc = Lctc � �H(p(⇡|l,X))
god
CTC:
expected:(High Entropy)
H(p(⇡|l,X)) = �X
⇡2B�1(l)
p(⇡|X, l) log p(⇡|X, l).
Entropy-based regularization
Better generalization and exploration.Solve peaky distribution problem.Depict ambiguous segmentation boundaries.
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 3 / 9
Equal Spacing CTC (EsCTC)
The spacing of two consecutive elements is nearly the same in many sequential tasks.
g
o
d
d o g
d o g
gd o
We adopt equal spacing as a pruning method to rule out unreasonable CTC paths.
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 4 / 9
Equal Spacing CTC (EsCTC)
z1 z2 z3
o o o o o o
d d d
_ _ _ _ _ o
_ _ _ _ _ d
d d d d d d
_ _ d _ d d
o o o
_ _ o _ o o
g g g
_ _ g _ g g
og g _ g
dg g _ g
g
o
dTheorem 3.1. Among all segmentation sequences, the equal spacing one has the maximum entropy.
Equal spacing is the best prior without any subjective assumptions.
zs ⌧T
|l|
zargmax max
pH(p(⇡|z, l, X)) = zes
Lesctc = � logX
z2C⌧,T
X
⇡2B�1z (l)
p(⇡|X)
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 5 / 9
Algorithm and Complexity Analysis
We propose dynamic programming algorithms for EnCTC, EsCTC� and EnEsCTC.
�̄(t, s) =
(�(t � 1, s) + �(t � 1, s � 1) if l0s = b or l0s�2 = l0s�(t � 1, s) + �(t � 1, s � 1) + �(t � 1, s � 2) otherwise
Q(l) = �(T, |l0|) + �(T, |l0| � 1)
EnCTC
EsCTC
EnEsCTC
p⌧ (l|X1:T ) = ↵⌧ (T, |l|) +
⌧ T|l|X
t0=1
↵⌧ (T � t0, |l|)�(T � t0 + 1, T, 0)↵⌧ (t, s) =
8<:
P⌧ T|l|
t0=1 ↵⌧ (t � t0, s � 1)�(t � t0 + 1, t, s) if ls�1 6= lsP⌧ T
|l|t0=2 ↵⌧ (t � t0, s � 1)yt�t0+1
b �(t � t0 + 2, t, s) otherwise
Q⌧ (l) =�⌧ (T, |l|) +
⌧ T|l|X
t0=1
�⌧ (T � t0, |l|)�(T � t0 + 1, T, 0)
+ ↵⌧ (T � t0, |l|)�(T � t0 + 1, T, 0) log �(T � t0 + 1, T, 0)
�⌧ (t, s) =
8>>>>>>>>><>>>>>>>>>:
P⌧ T|l|
t0=1 �⌧ (t � t0, s � 1)�(t � t0 + 1, t, s) + ↵⌧ (t � t0, s � 1)⌘(t � t0 + 1, t, s)
if ls�1 6= lsP⌧ T
|l|t0=2 �⌧ (t � t0, s � 1)yt�t0+1
b �(t � t0 + 2, t, s)+
↵⌧ (t � t0, s � 1)yt�t0+1b ⌘(t � t0 + 2, t, s)+
↵⌧ (t � t0, s � 1)yt�t0+1b log yt�t0+1
b �(t � t0 + 2, t, s)
otherwise
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 6 / 9
Qualitative Analysis
(a) (b) (c) (d)
Error Signal in Training
p a r i t y
CTCEnCTCEsCTC
EnEsCTC
Alignment Evaluation
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 7 / 9
Results on Scene Text Recognition Benchmarks
Evaluation of model generalization.Method Synth5K
CTC 38.1CTC + LS 42.9CTC + CP 44.4
EnCTC 45.5EsCTC 46.3EnEsCTC 47.2
Comparisons with the state-of-the-art methods.Method IC03 IC13 IIIT5K SVT
CRNN 89.4 86.7 78.2 80.8STAR-Net 89.9 89.1 83.3 83.6R2AM 88.7 90.0 78.4 80.7RARE 90.1 88.6 81.9 81.9
EnCTC 90.8 90.0 82.6 81.5EsCTC 92.6 87.4 81.7 81.5EnEsCTC 92.0 90.6 82.0 80.6
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 8 / 9
For more results and analyses, please come
Poster: Room 210 & 230 AB #106
https://github.com/liuhu-bigeye/enctc.crnn
Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 9 / 9