Hidden Markov Models (HMMs) – probabilistic models for learning patterns in sequences (e.g. DNA, speech, weather, cards...)

(based on Lecture Notes for E. Alpaydın, 2010, Introduction to Machine Learning 2e, © The MIT Press, V1.0)


• Hidden Markov Models (HMMs) – probabilistic models for learning patterns in sequences (e.g. DNA, speech, weather, cards...)

(figure: 2nd-order model)

• an observable Markov model
– we directly observe the sequence of states
– p(s1, s2, ..., sn | λ) = p(s1) ∏i=2..n p(si | si-1) (a small numeric example is sketched below)
– (why I don’t like the Urn example in the book)
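A minimal sketch of that product in Python; the two-state weather chain and its numbers are invented for illustration and are not from the slides:

# Toy first-order (observable) Markov chain:
# p(s1..sn | lambda) = p(s1) * prod_{i=2..n} p(s_i | s_{i-1})
p_init = {"sunny": 0.6, "rainy": 0.4}
p_trans = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def chain_prob(states):
    """Probability of a fully observed state sequence under the chain."""
    p = p_init[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= p_trans[prev][cur]
    return p

print(chain_prob(["sunny", "sunny", "rainy"]))  # 0.6 * 0.8 * 0.2 = 0.096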

• Hidden Markov model
– we only observe the sequence of symbols generated by the states
– for each state, there is a probability distribution over a finite set of symbols (emission probabilities); a toy parameterization is sketched after this list
– example: think of a soda machine
• observations: messages on the display (“insert 20 cents more”), output of a can, giving change
• states: the coins inserted so far add up to N cents...
• state transitions are determined by the coins input
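To make λ = (π, A, B) concrete, a minimal sketch of writing the parameters down as arrays; this is a made-up two-state, three-symbol model (not the soda machine), and the later sketches reuse it:

import numpy as np

# lambda = (pi, A, B); the numbers are illustrative only.
pi = np.array([0.7, 0.3])            # pi[i]   = P(q1 = S_i)                  initial probabilities
A  = np.array([[0.9, 0.1],           # A[i, j] = P(q_{t+1} = S_j | q_t = S_i)  transition probabilities
               [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1],      # B[i, m] = P(O_t = v_m | q_t = S_i)      emission probabilities
               [0.1, 0.3, 0.6]])

# Each row is a probability distribution, so rows must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)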

• tasks
1) given a sequence, compute the probability that it came from one of a set of models (e.g. most likely phoneme) – classification: compute p(O | λ)
2) infer the most likely sequence of states underlying a sequence of symbols – find Q* such that p(Q* | O, λ) is maximized
3) train the HMM by learning the parameters (transition and emission probabilities) from a set of examples – given sequences X, find λ* such that p(X | λ*) is maximized

• given an observation sequence O = o1...oT
– if we also knew the state sequence Q = q1...qT, then we could easily calculate p(O | Q, λ)
– joint probability: p(O, Q | λ) = p(q1) · ∏i=2..T p(qi | qi-1) · ∏i=1..T p(oi | qi)
– we could calculate p(O | λ) by marginalization: p(O | λ) = ΣQ p(O, Q | λ)
– intractable: we would have to sum over all possible state sequences Q (a brute-force version is sketched below)
– the forward-backward algorithm is a recursive procedure that solves this efficiently (via dynamic programming)
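A minimal sketch of the joint probability and the brute-force marginalization, reusing the toy π, A, B above; the point is simply that the sum runs over N^T state sequences, which is exactly what the forward-backward recursion avoids:

import itertools
import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1]                       # observation sequence as symbol indices
N, T = len(pi), len(O)

def joint(O, Q):
    """p(O, Q | lambda) = p(q1) * prod_t p(q_t | q_{t-1}) * prod_t p(o_t | q_t)."""
    p = pi[Q[0]] * B[Q[0], O[0]]
    for t in range(1, T):
        p *= A[Q[t-1], Q[t]] * B[Q[t], O[t]]
    return p

# p(O | lambda) = sum over all N**T state sequences Q -- exponential in T.
p_O = sum(joint(O, Q) for Q in itertools.product(range(N), repeat=T))
print(p_O)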

• Forward variable:


– αt(i) ≡ P(O1∙∙∙Ot, qt = Si | λ): the probability of observing the prefix o1...ot and ending in state Si
– Initialization: α1(i) = πi bi(O1)
– Recursion: αt+1(j) = [ Σi=1..N αt(i) aij ] bj(Ot+1)
– p(O | λ) = Σi=1..N αT(i)
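A minimal NumPy sketch of the forward recursion, reusing the toy parameters above (the code uses 0-based time indices):

import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1]

def forward(O, pi, A, B):
    """alpha[t, i] = P(O_1..O_{t+1}, q_{t+1} = S_i | lambda), with 0-based t."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                    # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, O[t]]  # recursion
    return alpha

alpha = forward(O, pi, A, B)
print(alpha[-1].sum())   # p(O | lambda) = sum_i alpha_T(i); matches the brute-force sum above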

• Backward variable:


– βt(i) ≡ P(Ot+1∙∙∙OT | qt = Si, λ): the probability of observing the suffix ot+1...oT given that we are in state Si at time t
– Initialization: βT(i) = 1
– Recursion: βt(i) = Σj=1..N aij bj(Ot+1) βt+1(j)
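A matching sketch of the backward recursion; the last line checks that p(O | λ) can also be read off β (same toy parameters):

import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1]

def backward(O, A, B):
    """beta[t, i] = P(O_{t+2}..O_T | q_{t+1} = S_i, lambda), with 0-based t."""
    T, N = len(O), A.shape[0]
    beta = np.ones((T, N))                        # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t+1]] * beta[t+1])  # recursion
    return beta

beta = backward(O, A, B)
print((pi * B[:, O[0]] * beta[0]).sum())   # p(O | lambda) = sum_i pi_i b_i(O_1) beta_1(i)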

Forward-backward algorithm, O(N²T)
– forward pass: for each time step t = 1..T, calculate αt(i) by summing over all predecessor states j
– reverse pass: for each time step t = T..1, calculate βt(i) by summing over all successor states j

function ForwardBackward(O, S, π, A, B): returns p(O | π, A, B)
  for each state si do
    α1(i) ← πi · Bi(O1)
  end for
  for t ← 2, 3, ..., T do
    for each state sj do
      αt(j) ← Σk ( αt-1(k) · Akj · Bj(Ot) )
    end for
  end for
  // β is not needed for the output, but is often computed for other purposes
  βT(i) ← 1 for each state si
  for t ← T-1, ..., 1 do
    for each state sj do
      βt(j) ← Σk ( Ajk · Bk(Ot+1) · βt+1(k) )
    end for
  end for
  return Σi αT(i)
end function


– γt(i) ≡ P(qt = Si | O, λ) = αt(i) βt(i) / Σj=1..N αt(j) βt(j)
– to get a state sequence, choose the state that has the highest probability for each time step: qt* = argmaxi γt(i)?
– No! – the individually most likely states need not form a feasible path (some of the implied transitions may have probability zero), which is why Viterbi’s algorithm is used instead
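A minimal sketch of computing γt(i) from α and β and taking the per-step argmax; as the slide warns, the resulting sequence of states is not guaranteed to be a feasible path (toy parameters as above):

import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1]
T, N = len(O), len(pi)

# Forward and backward variables (same recursions as the sketches above).
alpha = np.zeros((T, N)); alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t-1] @ A) * B[:, O[t]]
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t+1]] * beta[t+1])

# gamma[t, i] = P(q_t = S_i | O, lambda) = alpha_t(i) beta_t(i) / sum_j alpha_t(j) beta_t(j)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)

q_star = gamma.argmax(axis=1)   # per-step most likely states -- may not be a feasible path
print(gamma)
print(q_star)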


Viterbi’s Algorithm

δt(i) ≡ maxq1q2∙∙∙qt-1 p(q1q2∙∙∙qt-1, qt = Si, O1∙∙∙Ot | λ)
• Initialization: δ1(i) = πi bi(O1), ψ1(i) = 0
• Recursion: δt(j) = maxi δt-1(i) aij bj(Ot), ψt(j) = argmaxi δt-1(i) aij – note: I think the book has the wrong formula for ψt(j)

• Termination: p* = maxi δT(i), qT* = argmaxi δT(i)
• Path backtracking: qt* = ψt+1(qt+1*), t = T-1, T-2, ..., 1


function Viterbi(O, S, π, A, B): returns state sequence q1*...qT*
  for each state si do
    δ1(i) ← πi · Bi(O1)
    ψ1(i) ← 0
  end for
  for t ← 2, 3, ..., T do
    for each state sj do
      δt(j) ← maxk ( δt-1(k) · Akj · Bj(Ot) )
      ψt(j) ← argmaxk ( δt-1(k) · Akj · Bj(Ot) )
    end for
  end for
  // traceback, extract sequence of states
  p* ← maxi δT(i)
  qT* ← argmaxi δT(i)
  for t ← T-1, T-2, ..., 1 do
    qt* ← ψt+1(qt+1*)
  end for
  return q1*...qT*
end function
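A minimal Python version of the same procedure; it works in log space to avoid numerical underflow on long sequences, which the pseudocode above does not bother with (toy parameters as before):

import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1]

def viterbi(O, pi, A, B):
    """Most likely state sequence q* and its log probability."""
    T, N = len(O), len(pi)
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    delta = np.zeros((T, N))           # delta[t, j] = best log prob of a path ending in S_j at t
    psi = np.zeros((T, N), dtype=int)  # psi[t, j]   = best predecessor of S_j at time t
    delta[0] = log_pi + log_B[:, O[0]]
    for t in range(1, T):
        scores = delta[t-1][:, None] + log_A       # scores[i, j] = delta_{t-1}(i) + log a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, O[t]]
    q = np.zeros(T, dtype=int)                     # backtracking
    q[T-1] = delta[T-1].argmax()
    for t in range(T - 2, -1, -1):
        q[t] = psi[t+1][q[t+1]]
    return q, delta[T-1].max()

print(viterbi(O, pi, A, B))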


• Baum-Welch (EM) algorithm
– ξt(i,j) ≡ P(qt = Si, qt+1 = Sj | O, λ) = αt(i) aij bj(Ot+1) βt+1(j) / Σk Σl αt(k) akl bl(Ot+1) βt+1(l)
– indicator variables: zt^i = 1 if qt = Si and 0 otherwise; zt^ij = 1 if qt = Si and qt+1 = Sj, and 0 otherwise


– learn the model parameters (transition probabilities aij and emission probabilities bj(m)) with the highest likelihood for a given set of training examples
– define ξt(i,j) as the probability of being in Si at time t and in Sj at time t+1, given the observation sequence O
– define latent variables zt^j and zt^ij as indicators of which states (and state transitions) a sequence passes through at each time step

Baum-Welch (EM)

– E-step: compute E[zt^i] = γt(i) and E[zt^ij] = ξt(i,j), given the current parameters λ
– M-step: re-estimate the parameters from these expected counts, over K training sequences (sequence k has length Tk):
  âij = Σk=1..K Σt=1..Tk-1 ξt^k(i,j) / Σk=1..K Σt=1..Tk-1 γt^k(i)
  b̂j(m) = Σk=1..K Σt=1..Tk γt^k(j) 1(Ot^k = vm) / Σk=1..K Σt=1..Tk γt^k(j)
  π̂i = ( Σk=1..K γ1^k(i) ) / K


– recall, γt(i) is the probability of being in state Si at time t, and ξt(i,j) gives the expectation of a transition from Si to Sj at time t (a one-iteration sketch follows below)
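A minimal sketch of one Baum-Welch iteration on a single training sequence (so K = 1 in the formulas above), combining the forward/backward recursions with the E-step quantities and the M-step re-estimates; the parameters and the observation sequence are the made-up toy values used throughout:

import numpy as np

pi = np.array([0.7, 0.3])
A  = np.array([[0.9, 0.1], [0.2, 0.8]])
B  = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1, 0, 0, 2]
T, N, M = len(O), len(pi), B.shape[1]

# E-step: forward/backward passes, then gamma_t(i) and xi_t(i,j).
alpha = np.zeros((T, N)); alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t-1] @ A) * B[:, O[t]]
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t+1]] * beta[t+1])

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)

xi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    x = alpha[t][:, None] * A * B[:, O[t+1]] * beta[t+1]   # numerator of xi_t(i,j)
    xi[t] = x / x.sum()

# M-step: re-estimate pi, A, B from the expected counts (single sequence, K = 1).
pi_new = gamma[0]
A_new  = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
B_new  = np.zeros((N, M))
for m in range(M):
    B_new[:, m] = gamma[np.array(O) == m].sum(axis=0) / gamma.sum(axis=0)

print(pi_new, A_new, B_new, sep="\n")

Iterating these two steps increases the likelihood and converges to a local maximum; in practice the forward/backward values are scaled or kept in log space to avoid underflow.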
