dynamic data selection in search (1/4)
Post on 19-Jan-2016
A Dynamic In-Search Data Selection Method With Its Application to Acoustic Modeling and Utterance Verification (about training)
Dynamic data selection in search (1/4)
• In continuous speech recognition, it becomes much more difficult to define the competing-token (CT) or true-token (TT) set because the unit boundaries are unknown.
• As a result, any possible segmentation of an utterance could potentially become a competing token. However, an exhaustive search is too expensive to be affordable.
Dynamic data selection in search (2/4)
• In this paper, every utterance is recognized with the Viterbi beam search algorithm.
• All partial paths that survive the beam search have relatively large likelihood values and usually compete with the true path.
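The pruning idea behind this slide can be sketched as follows. This is a minimal illustration, not the recognizer used in the paper: the function name `beam_prune`, the dictionary of path scores, and the beam width are all hypothetical.

```python
def beam_prune(log_scores, beam_width):
    """Keep the partial paths whose log score is within beam_width of the best one.

    log_scores: dict mapping a (hypothetical) path id to its log likelihood.
    """
    best = max(log_scores.values())
    return {path: s for path, s in log_scores.items() if s >= best - beam_width}


# Toy scores (made up): path_c falls outside the beam and is pruned.
active = {"path_a": -10.0, "path_b": -12.5, "path_c": -30.0}
survivors = beam_prune(active, beam_width=5.0)
```

Only the surviving paths would then be considered as sources of competing tokens.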
Dynamic data selection in search (3/4)
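• For an active word-ending path $W$ at time $t$:
$$W = \{\, a_{t_1}^{t_2}(p_1),\ a_{t_2}^{t_3}(p_2),\ \ldots,\ a_{t_{M-1}}^{t_M}(p_{M-1}) \,\}$$
• Each hypothesized segment $a_{t_m}^{t_{m+1}}(p_m)$ is compared with the reference phone segmentation (alignment).
• We calculate the max overlap rate over all reference segments with the same phone identity as the hypothesized phone:
$$\text{overlap rate} = \frac{\min(t_e, t_e') - \max(t_s, t_s') + 1}{\left[(t_e - t_s + 1) + (t_e' - t_s' + 1)\right]/2}$$
where $(t_s, t_e)$ and $(t_s', t_e')$ are the start and end frames of the hypothesized segment and of the reference segment.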
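The overlap computation on this slide can be sketched in a few lines. The sketch assumes the overlap rate is the number of shared frames divided by the mean length of the two segments, with inclusive frame indices; clamping a non-overlapping pair to 0 and the tuple layout `(phone, start, end)` are assumptions made for illustration.

```python
def overlap_rate(ts, te, ts_ref, te_ref):
    """Shared frames divided by the mean segment length (inclusive frame indices)."""
    shared = min(te, te_ref) - max(ts, ts_ref) + 1
    if shared <= 0:          # assumption: segments that do not overlap score 0
        return 0.0
    mean_len = ((te - ts + 1) + (te_ref - ts_ref + 1)) / 2
    return shared / mean_len


def max_overlap_rate(hyp, refs):
    """Max overlap over all reference segments with the same phone identity."""
    phone, ts, te = hyp
    rates = [overlap_rate(ts, te, r_start, r_end)
             for r_phone, r_start, r_end in refs if r_phone == phone]
    return max(rates, default=0.0)


# Hypothetical segments: (phone, start frame, end frame)
refs = [("a", 5, 14), ("b", 0, 9), ("a", 0, 7)]
best = max_overlap_rate(("a", 0, 9), refs)
```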
Dynamic data selection in search (4/4)
• The segment $a_{t_m}^{t_{m+1}}(p_m)$ is labeled as a
– true token of $p_m$ if its max overlap rate exceeds a first threshold;
– competing token of $p_m$ if its max overlap rate is below a second threshold, and the average likelihood per frame of $a_{t_m}^{t_{m+1}}(p_m)$ exceeds the average likelihood of the same segment based on the forced-alignment procedure.
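The labeling rule can be sketched as a small decision function. It assumes a segment is a true token when its overlap rate exceeds an upper threshold, and a competing token when the overlap is below a lower threshold and its average per-frame log likelihood is higher than that of the same segment under forced alignment. The threshold values and all names here are hypothetical.

```python
def label_token(overlap, hyp_avg_loglik, ref_avg_loglik, upper=0.8, lower=0.2):
    """Return 'true', 'competing', or None (segment not selected).

    upper / lower are placeholder thresholds, not values from the paper.
    """
    if overlap > upper:
        return "true"
    if overlap < lower and hyp_avg_loglik > ref_avg_loglik:
        return "competing"
    return None


labels = [
    label_token(0.95, -5.0, -4.0),   # high overlap
    label_token(0.05, -3.0, -4.0),   # low overlap, likelihood beats the alignment
    label_token(0.05, -6.0, -4.0),   # low overlap, likelihood too low
    label_token(0.50, -3.0, -4.0),   # overlap between the two thresholds
]
```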
Application I : acoustic modeling (1/20)
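• Given the true tokens of a model $\lambda$ => MAP
• MAP :
$$\lambda_{MAP} = \arg\max_{\lambda} p(\lambda \mid X) = \arg\max_{\lambda} \frac{p(X \mid \lambda)\, p(\lambda)}{p(X)} = \arg\max_{\lambda} p(X \mid \lambda)\, p(\lambda)$$
comparing with ML estimation:
$$\lambda_{ML} = \arg\max_{\lambda} p(X \mid \lambda)$$
• MAP combines the training observation sequences $X$ with the prior information $p(\lambda)$.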
Application I : acoustic modeling (2/20)
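E-step :
$$R(\lambda \mid \bar\lambda) = Q(\lambda \mid \bar\lambda) + \log g(\lambda)$$
$$= E\left[\log p(X, S, L \mid \lambda) \mid X, \bar\lambda\right] + \log g(\lambda)$$
$$= \sum_{S} \sum_{L} p(S, L \mid X, \bar\lambda)\, \log p(X, S, L \mid \lambda) + \log g(\lambda)$$
where
$$p(X, S, L \mid \lambda) = \pi_{s_1} w_{s_1 l_1} N(x_1 \mid m_{s_1 l_1}, r_{s_1 l_1}) \prod_{t=2}^{T} a_{s_{t-1} s_t}\, w_{s_t l_t}\, N(x_t \mid m_{s_t l_t}, r_{s_t l_t})$$
Here $\bar\lambda$ is the current estimate and $\lambda$ the new parameter set. Without the mixture labels $L$:
$$Q(\lambda \mid \bar\lambda) = E\left[\log P(X, S \mid \lambda) \mid X, \bar\lambda\right] = \sum_{S} P(S \mid X, \bar\lambda)\, \log P(X, S \mid \lambda)$$
$$= \sum_{S} \frac{P(X, S \mid \bar\lambda)}{P(X \mid \bar\lambda)}\, \log P(X, S \mid \lambda)$$
$$= \sum_{S} \frac{P(X, S \mid \bar\lambda)}{P(X \mid \bar\lambda)}\, \log\left[\pi_{s_1} b_{s_1}(x_1) \prod_{t=2}^{T} a_{s_{t-1} s_t}\, b_{s_t}(x_t)\right]$$
$$= \sum_{S} \frac{P(X, S \mid \bar\lambda)}{P(X \mid \bar\lambda)} \left[\log \pi_{s_1} + \sum_{t=2}^{T} \log a_{s_{t-1} s_t} + \sum_{t=1}^{T} \log b_{s_t}(x_t)\right]$$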
A simpler example (DHMM) :
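Two states, observation sequence $X = (v_4, v_7, v_4)$, 8 possible state sequences.
Path (state sequence) : $p(X, s \mid \lambda)$ ; $\log p(X, s \mid \lambda)$
1 (1,1,1) : $\pi_1 b_{1,4}\, a_{11} b_{1,7}\, a_{11} b_{1,4}$ ; $\log\pi_1 + \log b_{1,4} + \log a_{11} + \log b_{1,7} + \log a_{11} + \log b_{1,4}$
2 (1,1,2) : $\pi_1 b_{1,4}\, a_{11} b_{1,7}\, a_{12} b_{2,4}$ ; $\log\pi_1 + \log b_{1,4} + \log a_{11} + \log b_{1,7} + \log a_{12} + \log b_{2,4}$
3 (1,2,1) : $\pi_1 b_{1,4}\, a_{12} b_{2,7}\, a_{21} b_{1,4}$ ; $\log\pi_1 + \log b_{1,4} + \log a_{12} + \log b_{2,7} + \log a_{21} + \log b_{1,4}$
4 (1,2,2) : $\pi_1 b_{1,4}\, a_{12} b_{2,7}\, a_{22} b_{2,4}$ ; $\log\pi_1 + \log b_{1,4} + \log a_{12} + \log b_{2,7} + \log a_{22} + \log b_{2,4}$
5 (2,1,1) : $\pi_2 b_{2,4}\, a_{21} b_{1,7}\, a_{11} b_{1,4}$ ; $\log\pi_2 + \log b_{2,4} + \log a_{21} + \log b_{1,7} + \log a_{11} + \log b_{1,4}$
6 (2,1,2) : $\pi_2 b_{2,4}\, a_{21} b_{1,7}\, a_{12} b_{2,4}$ ; $\log\pi_2 + \log b_{2,4} + \log a_{21} + \log b_{1,7} + \log a_{12} + \log b_{2,4}$
7 (2,2,1) : $\pi_2 b_{2,4}\, a_{22} b_{2,7}\, a_{21} b_{1,4}$ ; $\log\pi_2 + \log b_{2,4} + \log a_{22} + \log b_{2,7} + \log a_{21} + \log b_{1,4}$
8 (2,2,2) : $\pi_2 b_{2,4}\, a_{22} b_{2,7}\, a_{22} b_{2,4}$ ; $\log\pi_2 + \log b_{2,4} + \log a_{22} + \log b_{2,7} + \log a_{22} + \log b_{2,4}$
[Figure: trellis of the two-state DHMM (start, states 1 and 2, transitions $a_{12}$, $a_{21}$, $a_{22}$) over the observations $v_4$, $v_7$, $v_4$]
A simpler example (DHMM) :
Let $P_k$ be the probability $p(X, s \mid \lambda)$ of path $k$ and $\text{all} = P_1 + P_2 + \cdots + P_8$ (all paths).
Initial probabilities:
$$\frac{P_1 + P_2 + P_3 + P_4}{\text{all}} \log \pi_1 + \frac{P_5 + P_6 + P_7 + P_8}{\text{all}} \log \pi_2 = \gamma_1(1) \log \pi_1 + \gamma_1(2) \log \pi_2$$
Transition probabilities (first fraction: $t = 1$, second fraction: $t = 2$):
$$i=1,\ j=1:\quad \left[\frac{P_1 + P_2}{\text{all}} + \frac{P_1 + P_5}{\text{all}}\right] \log a_{11}$$
$$i=1,\ j=2:\quad \left[\frac{P_3 + P_4}{\text{all}} + \frac{P_2 + P_6}{\text{all}}\right] \log a_{12}$$
$$i=2,\ j=1:\quad \left[\frac{P_5 + P_6}{\text{all}} + \frac{P_3 + P_7}{\text{all}}\right] \log a_{21}$$
$$i=2,\ j=2:\quad \left[\frac{P_7 + P_8}{\text{all}} + \frac{P_4 + P_8}{\text{all}}\right] \log a_{22}$$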
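The path enumeration in this example can be checked numerically. The sketch below uses made-up parameter values (the slides give none) for a two-state discrete HMM, enumerates the 8 state sequences for the observations v4, v7, v4 in the order 111, 112, 121, ..., 222, and accumulates the normalized path probabilities that weight log pi_i and log a_ij.

```python
import itertools

# Hypothetical parameters; index 0/1 stands for state 1/2.
pi = [0.6, 0.4]                    # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]       # A[i][j] = a_ij
B = [[0.5, 0.5], [0.1, 0.9]]       # B[j][k]; k = 0 stands for v4, k = 1 for v7
obs = [0, 1, 0]                    # observation sequence v4, v7, v4


def path_prob(path):
    """Joint probability p(X, s | lambda) of one state sequence."""
    p = pi[path[0]] * B[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    return p


paths = list(itertools.product(range(2), repeat=len(obs)))  # 8 state sequences
probs = [path_prob(s) for s in paths]
total = sum(probs)                 # "all" = p(X | lambda)

gamma1 = [0.0, 0.0]                # weight of log pi_i
xi_sum = [[0.0, 0.0], [0.0, 0.0]]  # weight of log a_ij, summed over t = 1, 2
for s, p in zip(paths, probs):
    gamma1[s[0]] += p / total
    for t in range(len(obs) - 1):
        xi_sum[s[t]][s[t + 1]] += p / total
```

With these weights, the weight of log a_11 equals (P1 + P2 + P1 + P5) / all, which is the pattern enumerated by hand on the slide.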
A simpler example (DHMM) :
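$$Q(\lambda \mid \bar\lambda) = \sum_{S} \frac{p(X, S \mid \bar\lambda)}{p(X \mid \bar\lambda)}\, \log p(X, S \mid \lambda)$$
$$= \sum_{i=1}^{N} \Pr(s_1 = i \mid X, \bar\lambda) \log \pi_i + \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{t=1}^{T-1} \Pr(s_t = i, s_{t+1} = j \mid X, \bar\lambda) \log a_{ij}$$
$$+ \sum_{j=1}^{N} \sum_{k=1}^{K} \sum_{t:\, x_t \sim v_k} \Pr(s_t = j, x_t \sim v_k \mid X, \bar\lambda) \log b_{j,k}$$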
A simpler example (DHMM) :
Application I : acoustic modeling (3/20)
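The Q function can be decomposed as:
$$Q(\lambda \mid \bar\lambda) = \sum_{i=1}^{N} \gamma_1(i) \log \pi_i + \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{t=1}^{T-1} \xi_t(i, j) \log a_{ij}$$
$$+ \sum_{i=1}^{N} \sum_{k=1}^{K} \sum_{t=1}^{T} \zeta_t(i, k) \log w_{ik} + \sum_{i=1}^{N} \sum_{k=1}^{K} \sum_{t=1}^{T} \zeta_t(i, k) \log N(x_t \mid m_{ik}, r_{ik})$$
where
$$\xi_t(i, j) = \Pr(s_t = i, s_{t+1} = j \mid X, \bar\lambda), \qquad \gamma_t(i) = \Pr(s_t = i \mid X, \bar\lambda)$$
$$\zeta_t(i, k) = \Pr(s_t = i, l_t = k \mid X, \bar\lambda)$$
and
$$\zeta_t(i, k) = \gamma_t(i)\, \frac{w_{ik}\, N(x_t \mid m_{ik}, r_{ik})}{\sum_{k'=1}^{K} w_{ik'}\, N(x_t \mid m_{ik'}, r_{ik'})}$$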
Application I : acoustic modeling (4/20)
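The prior density for $\lambda$ is assumed to be:
$$g(\lambda) = K_c \prod_{i=1}^{N} \left[ \pi_i^{\eta_i - 1} \prod_{j=1}^{N} a_{ij}^{\eta_{ij} - 1} \prod_{k=1}^{K} w_{ik}^{\nu_{ik} - 1}\, g(m_{ik}, r_{ik}) \right]$$
If $r_{ik}$ is a diagonal precision matrix, then $g(m_{ik}, r_{ik})$ is assumed as a product of normal-gamma densities:
$$g(m_{ik}, r_{ik}) \propto \prod_{d=1}^{D} r_{ikd}^{(\alpha_{ikd} - 1/2)}\, e^{-\beta_{ikd} r_{ikd}}\, e^{-\frac{\tau_{ikd} r_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2}$$
where $\tau_{ikd}, \alpha_{ikd}, \beta_{ikd} > 0$.
$$\log g(\lambda) = \text{Constant} + \sum_{i=1}^{N} (\eta_i - 1) \log \pi_i + \sum_{i=1}^{N} \sum_{j=1}^{N} (\eta_{ij} - 1) \log a_{ij}$$
$$+ \sum_{i=1}^{N} \sum_{k=1}^{K} (\nu_{ik} - 1) \log w_{ik} + \sum_{i=1}^{N} \sum_{k=1}^{K} \log g(m_{ik}, r_{ik})$$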
Application I : acoustic modeling (5/20)
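Take
$$\frac{\partial R(\lambda \mid \bar\lambda)}{\partial m_{ikd}} = 0$$
With
$$N(x_t \mid m_{ik}, r_{ik}) \propto \prod_{d=1}^{D} r_{ikd}^{1/2}\, e^{-\frac{1}{2} r_{ikd} (x_{td} - m_{ikd})^2}$$
the likelihood term gives
$$\frac{\partial \log N(x_t \mid m_{ik}, r_{ik})}{\partial m_{ikd}} = \frac{1}{N(x_t \mid m_{ik}, r_{ik})} \cdot N(x_t \mid m_{ik}, r_{ik}) \left(-\frac{1}{2} r_{ikd}\right) \left(2 (x_{td} - m_{ikd})\right)(-1) = r_{ikd} (x_{td} - m_{ikd})$$
With
$$g(m_{ik}, r_{ik}) \propto \prod_{d=1}^{D} r_{ikd}^{(\alpha_{ikd} - 1/2)}\, e^{-\beta_{ikd} r_{ikd}}\, e^{-\frac{\tau_{ikd} r_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2}$$
the prior term gives
$$\frac{\partial \log g(m_{ik}, r_{ik})}{\partial m_{ikd}} = \frac{1}{g(m_{ik}, r_{ik})} \cdot g(m_{ik}, r_{ik}) \left(-\frac{\tau_{ikd} r_{ikd}}{2}\right) \left(2 (m_{ikd} - \mu_{ikd})\right) = -\tau_{ikd} r_{ikd} (m_{ikd} - \mu_{ikd})$$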
Application I : acoustic modeling (6/20)
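With $\zeta_t(i, k)$ the posterior occupancy of mixture component $k$ of state $i$ at time $t$:
$$\sum_{t=1}^{T} \zeta_t(i, k)\, r_{ikd} (x_{td} - m_{ikd}) - \tau_{ikd} r_{ikd} (m_{ikd} - \mu_{ikd}) = 0$$
$$\Rightarrow\ \tau_{ikd}\, m_{ikd} + \sum_{t=1}^{T} \zeta_t(i, k)\, m_{ikd} = \tau_{ikd}\, \mu_{ikd} + \sum_{t=1}^{T} \zeta_t(i, k)\, x_{td}$$
$$\Rightarrow\ m_{ikd} = \frac{\tau_{ikd}\, \mu_{ikd} + \sum_{t=1}^{T} \zeta_t(i, k)\, x_{td}}{\tau_{ikd} + \sum_{t=1}^{T} \zeta_t(i, k)}$$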
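The mean update can be sketched for a single dimension. The sketch assumes the MAP mean is (tau * mu + sum_t zeta_t * x_t) / (tau + sum_t zeta_t), where mu and tau are the prior mean and its weight and zeta_t is the occupancy of the Gaussian at frame t; the function name and the toy numbers are hypothetical.

```python
def map_mean(x, occ, mu, tau):
    """MAP estimate of one mean component.

    x   : observations x_td for one dimension d
    occ : occupancies zeta_t(i, k) of the Gaussian at each frame
    mu  : prior mean, tau : prior weight (tau > 0)
    """
    num = tau * mu + sum(c * xt for c, xt in zip(occ, x))
    den = tau + sum(occ)
    return num / den


# Two frames with full occupancy; the prior pulls the ML mean (2.0) toward mu = 0.
m_map = map_mean([1.0, 3.0], [1.0, 1.0], mu=0.0, tau=2.0)
```

A small tau leaves the estimate close to the occupancy-weighted sample mean, while a large tau keeps it close to the prior mean.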
Application I : acoustic modeling (7/20)
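Take
$$\frac{\partial R(\lambda \mid \bar\lambda)}{\partial r_{ikd}} = 0$$
With
$$N(x_t \mid m_{ik}, r_{ik}) \propto \prod_{d=1}^{D} r_{ikd}^{1/2}\, e^{-\frac{1}{2} r_{ikd} (x_{td} - m_{ikd})^2}$$
the likelihood term gives
$$\frac{\partial \log N(x_t \mid m_{ik}, r_{ik})}{\partial r_{ikd}} = \frac{\partial}{\partial r_{ikd}} \left[ \frac{1}{2} \log(r_{ikd}) - \frac{1}{2} r_{ikd} (x_{td} - m_{ikd})^2 \right] = \frac{1}{2 r_{ikd}} - \frac{1}{2} (x_{td} - m_{ikd})^2$$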
Application I : acoustic modeling (8/20)
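Write the prior (for one dimension $d$) as a product of three factors:
$$g(m_{ik}, r_{ik}) \propto \underbrace{r_{ikd}^{(\alpha_{ikd} - 1/2)}}_{(1)}\ \underbrace{e^{-\beta_{ikd} r_{ikd}}}_{(2)}\ \underbrace{e^{-\frac{\tau_{ikd} r_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2}}_{(3)}$$
By the product rule,
$$\frac{\partial g}{\partial r_{ikd}} = (1)'(2)(3) + (1)(2)'(3) + (1)(2)(3)'$$
with $(1)' = (\alpha_{ikd} - 1/2)\, r_{ikd}^{(\alpha_{ikd} - 3/2)}$, $(2)' = -\beta_{ikd}\,(2)$ and $(3)' = -\frac{\tau_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2\,(3)$, so
$$\frac{\partial \log g(m_{ik}, r_{ik})}{\partial r_{ikd}} = \frac{1}{g(m_{ik}, r_{ik})} \frac{\partial g(m_{ik}, r_{ik})}{\partial r_{ikd}} = \frac{\alpha_{ikd} - 1/2}{r_{ikd}} - \beta_{ikd} - \frac{\tau_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2$$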
Application I : acoustic modeling (9/20)
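With $\zeta_t(i, k)$ the posterior occupancy of mixture component $k$ of state $i$ at time $t$:
$$\sum_{t=1}^{T} \zeta_t(i, k) \left[ \frac{1}{2 r_{ikd}} - \frac{1}{2} (x_{td} - m_{ikd})^2 \right] + \frac{\alpha_{ikd} - 1/2}{r_{ikd}} - \beta_{ikd} - \frac{\tau_{ikd}}{2} (m_{ikd} - \mu_{ikd})^2 = 0$$
$$\Rightarrow\ (2\alpha_{ikd} - 1) + \sum_{t=1}^{T} \zeta_t(i, k) = r_{ikd} \left[ 2\beta_{ikd} + \tau_{ikd} (m_{ikd} - \mu_{ikd})^2 + \sum_{t=1}^{T} \zeta_t(i, k) (x_{td} - m_{ikd})^2 \right]$$
$$\Rightarrow\ r_{ikd} = \frac{(2\alpha_{ikd} - 1) + \sum_{t=1}^{T} \zeta_t(i, k)}{2\beta_{ikd} + \tau_{ikd} (m_{ikd} - \mu_{ikd})^2 + \sum_{t=1}^{T} \zeta_t(i, k) (x_{td} - m_{ikd})^2}$$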
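The precision update can be sketched the same way for one dimension. It assumes the closed form r = [(2 * alpha - 1) + sum_t zeta_t] / [2 * beta + tau * (m - mu)^2 + sum_t zeta_t * (x_t - m)^2], with alpha, beta, tau, mu the normal-gamma hyperparameters; the function name and the toy numbers are hypothetical.

```python
def map_precision(x, occ, m, mu, tau, alpha, beta):
    """MAP estimate of one precision component r_ikd.

    x, occ : observations and occupancies for this Gaussian and dimension
    m      : (re-estimated) mean; mu, tau, alpha, beta : prior hyperparameters
    """
    num = (2.0 * alpha - 1.0) + sum(occ)
    den = (2.0 * beta + tau * (m - mu) ** 2
           + sum(c * (xt - m) ** 2 for c, xt in zip(occ, x)))
    return num / den


def stationarity(r, x, occ, m, mu, tau, alpha, beta):
    """Derivative of the auxiliary function w.r.t. r; it should vanish at the MAP value."""
    data = sum(c * (0.5 / r - 0.5 * (xt - m) ** 2) for c, xt in zip(occ, x))
    prior = (alpha - 0.5) / r - beta - 0.5 * tau * (m - mu) ** 2
    return data + prior


args = dict(x=[1.0, 3.0], occ=[1.0, 1.0], m=2.0, mu=2.0, tau=1.0, alpha=1.5, beta=1.0)
r_map = map_precision(**args)
```

The helper `stationarity` is only a sanity check that the closed form zeroes the derivative it was derived from.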
Application I : acoustic modeling (10/20)
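The initial estimate can be chosen as the mode of the prior PDF:
$$m_{ikd} = \mu_{ikd}, \qquad r_{ikd} = \frac{\alpha_{ikd} - 1/2}{\beta_{ikd}}$$
It can also be chosen as the mean of the prior PDF:
$$m_{ikd} = \mu_{ikd}, \qquad r_{ikd} = \frac{\alpha_{ikd}}{\beta_{ikd}}$$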
Application I : acoustic modeling (11/20)
• Given the true tokens of a model $\lambda$ => MCE
– Imposter word: the hypothesized word is wrong, and its likelihood exceeds the likelihood given its reference model:
$$P(X \mid W_h) > P(X \mid W_{ref}), \qquad W_h \ne W_{ref}$$
– If we can minimize the total number of imposter words, we can reduce the WER.
Application I : acoustic modeling (12/20)
MCE
• The misclassification distance measure for the wrong word $W$ is defined as
$$d_W = -\,l\left(X_{t_1}^{t_M} \mid W_{ref}(X_{t_1}^{t_M})\right) + l\left(X_{t_1}^{t_M} \mid W\right)$$
• A general form of the "smoothed" count of the imposter word is then defined as
$$L(d_W) = \frac{1}{1 + \exp(-d_W)}$$
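The smoothed count can be sketched numerically. The sketch assumes the misclassification distance is the log likelihood of the wrong word minus the log likelihood of the reference word on the same frames, and that the smoothed count is a plain sigmoid of that distance with no extra slope parameter; both function names are hypothetical.

```python
import math


def misclassification_distance(loglik_wrong, loglik_ref):
    """d_W > 0 means the wrong word scores higher than the reference (an imposter)."""
    return loglik_wrong - loglik_ref


def smoothed_count(d):
    """Sigmoid of the distance: close to 1 for clear imposters, close to 0 otherwise."""
    return 1.0 / (1.0 + math.exp(-d))


d = misclassification_distance(loglik_wrong=-120.0, loglik_ref=-125.0)
count = smoothed_count(d)
```

Summing `smoothed_count` over all hypothesized wrong words gives a differentiable stand-in for the number of imposter words.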
Application I : acoustic modeling (13/20)
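MCE
• The so-called GPD (generalized probabilistic descent) algorithm is adopted to minimize the "smoothed" count of the imposter word:
$$\lambda_{t+1} = \lambda_t - \epsilon_t\, \nabla L(\lambda) \big|_{\lambda = \lambda_t}, \qquad \text{where } L(\lambda) = \sum_{W} L(d_W)$$
$$d_W = -\,l\left(X_{t_1}^{t_M} \mid W_{ref}(X_{t_1}^{t_M})\right) + l\left(X_{t_1}^{t_M} \mid W\right)$$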
Application I : acoustic modeling (14/20)
MCE
• Given a competing token $X = \{x_1, x_2, \ldots, x_T\}$ in the CT set of phone model $\lambda$. We assume $S = \{s_1, s_2, \ldots, s_T\}$ is its corresponding optimal Viterbi path, and $L = \{l_1, l_2, \ldots, l_T\}$ is its optimal mixture component label sequence.
• The HMM parameters have to satisfy certain constraints, such as the positive definiteness of the precision, $r_{ikd} > 0$. The gradient is taken with respect to the transformed parameters, where the transformations are
$$r_{ikd} \to \tilde r_{ikd}, \qquad \tilde r_{ikd} = \log r_{ikd}$$
Application I : acoustic modeling (15/20)
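For a parameter $\lambda_{ik}$ of the model, with $s_t$ the Viterbi state of the hypothesized model and $s_t^r$ that of the reference model (superscript $r$):
$$\frac{\partial L(d_W(X))}{\partial \lambda_{ik}} = \frac{\partial L(d_W)}{\partial d_W} \cdot \frac{\partial d_W(X)}{\partial \lambda_{ik}}$$
$$= \frac{\exp(-d_W)}{(1 + \exp(-d_W))^2} \cdot \frac{\partial}{\partial \lambda_{ik}} \sum_{t=1}^{T} \left[ \log\left(a_{s_{t-1} s_t} b_{s_t}(x_t)\right) - \log\left(a^r_{s^r_{t-1} s^r_t} b^r_{s^r_t}(x_t)\right) \right]$$
$$= \frac{1}{1 + \exp(-d_W)} \cdot \frac{\exp(-d_W)}{1 + \exp(-d_W)} \cdot \sum_{t=1}^{T} \frac{\partial}{\partial \lambda_{ik}} \left[ \log\left(a_{s_{t-1} s_t} b_{s_t}(x_t)\right) - \log\left(a^r_{s^r_{t-1} s^r_t} b^r_{s^r_t}(x_t)\right) \right]$$
$$= L(d_W)\,(1 - L(d_W)) \sum_{t=1}^{T} \frac{\partial}{\partial \lambda_{ik}} \left[ \log\left(a_{s_{t-1} s_t} b_{s_t}(x_t)\right) - \log\left(a^r_{s^r_{t-1} s^r_t} b^r_{s^r_t}(x_t)\right) \right]$$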
Application I : acoustic modeling (16/20)
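Mean update:
$$m_{ikd}(n+1) = m_{ikd}(n) - \epsilon\, \frac{\partial L(d_W(X))}{\partial m_{ikd}}$$
$$\frac{\partial L(d_W(X))}{\partial m_{ikd}} = L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{\partial}{\partial m_{ikd}} \left[ \log\left(a_{s_{t-1} s_t} b_{s_t}(x_t)\right) - \log\left(a^r_{s^r_{t-1} s^r_t} b^r_{s^r_t}(x_t)\right) \right]$$
The second term belongs to the reference model and does not depend on $m_{ikd}$, so
$$= L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{\partial \log b_i(x_t)}{\partial m_{ikd}}\, \delta(s_t - i)$$
$$= L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{1}{b_i(x_t)}\, w_{ik}\, N(x_t \mid m_{ik}, r_{ik}) \left(-\frac{1}{2}\right)(2)\, r_{ikd} (x_{td} - m_{ikd})(-1)\, \delta(s_t - i)\, \delta(l_t - k)$$
With the optimal mixture label $l_t = k$, $b_i(x_t) = w_{ik}\, N(x_t \mid m_{ik}, r_{ik})$, where $N(x_t \mid m_{ik}, r_{ik}) \propto \prod_{d=1}^{D} r_{ikd}^{1/2} e^{-\frac{1}{2} r_{ikd} (x_{td} - m_{ikd})^2}$, hence
$$\frac{\partial L(d_W(X))}{\partial m_{ikd}} = L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} r_{ikd} (x_{td} - m_{ikd})\, \delta(s_t - i)\, \delta(l_t - k)$$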
Application I : acoustic modeling (17/20)
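Precision update (in the log domain):
$$\log r_{ikd}(n+1) = \log r_{ikd}(n) - \epsilon\, \frac{\partial L(d_W(X))}{\partial \log r_{ikd}}$$
$$\frac{\partial L(d_W(X))}{\partial \log r_{ikd}} = L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{\partial}{\partial \log r_{ikd}} \left[ \log\left(a_{s_{t-1} s_t} b_{s_t}(x_t)\right) - \log\left(a^r_{s^r_{t-1} s^r_t} b^r_{s^r_t}(x_t)\right) \right]$$
The second term belongs to the reference model and does not depend on $r_{ikd}$, so
$$= L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{\partial \log b_i(x_t)}{\partial \log r_{ikd}}\, \delta(s_t - i)$$
$$= L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} r_{ikd} \left[ \frac{1}{2 r_{ikd}} - \frac{1}{2} (x_{td} - m_{ikd})^2 \right] \delta(s_t - i)\, \delta(l_t - k)$$
$$= L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{1}{2} \left[ 1 - r_{ikd} (x_{td} - m_{ikd})^2 \right] \delta(s_t - i)\, \delta(l_t - k)$$
using $b_i(x_t) = w_{ik}\, N(x_t \mid m_{ik}, r_{ik})$ for the optimal mixture label $l_t = k$ and $\frac{\partial}{\partial \log r_{ikd}} = r_{ikd} \frac{\partial}{\partial r_{ikd}}$.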
Application I : acoustic modeling (18/20)
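For a model $\lambda$:
$$m_{ikd}(n+1) = m_{ikd}(n) - \epsilon\, \frac{\partial L(d_W(X))}{\partial m_{ikd}}$$
$$\frac{\partial L(d_W(X))}{\partial m_{ikd}} = L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} r_{ikd} (x_{td} - m_{ikd})\, \delta(s_t - i)\, \delta(l_t - k)$$
$$\log r_{ikd}(n+1) = \log r_{ikd}(n) - \epsilon\, \frac{\partial L(d_W(X))}{\partial \log r_{ikd}}$$
$$\frac{\partial L(d_W(X))}{\partial \log r_{ikd}} = L(d_W)(1 - L(d_W)) \sum_{t=1}^{T} \frac{1}{2} \left[ 1 - r_{ikd} (x_{td} - m_{ikd})^2 \right] \delta(s_t - i)\, \delta(l_t - k)$$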
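The two gradient formulas can be sanity-checked with a one-dimensional sketch. It assumes a single Gaussian with mean m and precision r, frames that are all aligned to it (so both delta functions equal 1), a sigmoid smoothed count L(d), and a distance d equal to the token's log likelihood under this Gaussian minus a fixed reference score. All names, the step size, and the toy numbers are hypothetical.

```python
import math


def sigmoid(d):
    return 1.0 / (1.0 + math.exp(-d))


def log_gauss(x, m, r):
    """Log density of a 1-D Gaussian with mean m and precision r."""
    return 0.5 * math.log(r / (2.0 * math.pi)) - 0.5 * r * (x - m) ** 2


def mce_gradients(frames, m, r, d):
    """Gradients of L(d) w.r.t. m and log r for frames aligned to this Gaussian."""
    scale = sigmoid(d) * (1.0 - sigmoid(d))            # L(d) (1 - L(d))
    grad_m = scale * sum(r * (x - m) for x in frames)
    grad_log_r = scale * sum(0.5 * (1.0 - r * (x - m) ** 2) for x in frames)
    return grad_m, grad_log_r


def gpd_step(m, r, grad_m, grad_log_r, step):
    """One descent step; updating log r keeps the precision positive."""
    return m - step * grad_m, math.exp(math.log(r) - step * grad_log_r)


frames, m, r, ref_loglik = [0.5, 1.5, 2.0], 1.0, 2.0, -4.0
d = sum(log_gauss(x, m, r) for x in frames) - ref_loglik
grad_m, grad_log_r = mce_gradients(frames, m, r, d)
m_new, r_new = gpd_step(m, r, grad_m, grad_log_r, step=0.1)
```

Because the update is applied to log r rather than r, the new precision stays positive for any step size, which is the reason the slides give for working with transformed parameters.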