An Introduction to HMM and its Uses

Muhammad Gulraj

BS GIKI, Pakistan; MS UET Peshawar, Pakistan

Pattern Recognition

[email protected]



An Introduction to HMM and its Uses

A Hidden Markov Model (HMM) is a statistical/probabilistic model in which a sequence of observable variables X is generated by a sequence of hidden states Y. In simple terms, a Hidden Markov Model consists of hidden states, and its output is a sequence of observations. In a simple Markov model the state is directly observable, while in a Hidden Markov Model the states are not directly observable. The Hidden Markov Model is a very reliable model for probabilistic estimation. Hidden Markov Models have applications in pattern recognition, such as gesture and handwriting recognition, computational bioinformatics, speech recognition, and more.

Suppose there is a man in a room who has three coins to flip. The room is locked and no one can see what is happening inside. A display screen outside the room shows the result of each coin flip. The result can be any sequence of heads and tails, e.g. THTHHHTHHTTTTHT. We can get any sequence of heads and tails, and it is impossible to predict the specific sequence that will occur. This unpredictable outcome is called the 'observation sequence'.

1. Suppose the 3rd coin produces more heads than tails. The resulting sequence will then obviously contain more heads than tails.

2. Now suppose that the chance of flipping the 3rd coin after the 1st or 2nd coin is nearly zero. In this case the transition from the 1st or 2nd coin to the 3rd coin will be very rare, and as a result we will get very few heads if the man starts the flipping process from the 1st or 2nd coin.

3. Assume that each coin has some probability associated with it that the man will start the flipping process from that particular coin.


The first supposition is called the 'emission probability' bj(O), the second the 'transition probability' aij, and the third the 'initial probability' πi. In this example the heads/tails sequences are the observation sequences and the coins are the states.

Formally, an HMM is specified by:

Set of hidden states S1, S2, S3 … Sn

Set of observations O1, O2, O3 … Om

Initial state probabilities πi

Emission/output probabilities B: P(Ok | qi), where Ok is an observation and qi is a state

Transition probabilities A = {aij}

HMM λ = {π, A, B}
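As a concrete illustration, the three-coin example can be written down directly in this form. The following is a minimal Python/NumPy sketch; the numeric values are assumed purely for illustration (beyond making coin 3 head-biased and rarely reached, as in the example above).

```python
import numpy as np

# Hypothetical parameters for the three-coin example (all numbers assumed for illustration).
states = ["coin1", "coin2", "coin3"]
symbols = ["H", "T"]                      # observation symbols, encoded as column indices 0 and 1

pi = np.array([0.5, 0.4, 0.1])            # initial probabilities: which coin is flipped first
A = np.array([[0.60, 0.39, 0.01],         # transition probabilities a_ij: coin i -> coin j
              [0.30, 0.69, 0.01],         # (moving to coin 3 is nearly impossible, as in point 2)
              [0.40, 0.40, 0.20]])
B = np.array([[0.50, 0.50],               # emission probabilities b_j(O): P(H), P(T) for coin j
              [0.50, 0.50],
              [0.80, 0.20]])              # coin 3 is biased towards heads, as in point 1

# The HMM is fully specified by lambda = {pi, A, B}.
```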


Problems of HMM and their explanations

There are three well-known problems associated with a Hidden Markov Model (HMM).

1. Computing the probability P(O | λ) of a particular observation sequence O, given the model parameters λ. It is the summation of the probabilities of the observation O over all possible state sequences S. For a fixed state sequence S = q1, q2 … qT, the probability of the observation O is

P(O | S, λ) = bq1(O1) · bq2(O2) · … · bqT(OT)

We can find the total probability by summing over all state sequences:

P(O | λ) = Σ over all S of P(O | S, λ) · P(S | λ)

This looks quite simple, but it is computationally very expensive, because the number of possible state sequences grows exponentially with the length of the observation sequence. This is called the evaluation problem and can be solved using the forward-backward algorithm.
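To make the computational cost concrete, here is a brute-force sketch of this summation in Python, assuming the pi, A, B arrays from the sketch above and observations encoded as column indices of B (0 = heads, 1 = tails). It enumerates every possible state sequence, which is exactly why the forward-backward algorithm is needed in practice.

```python
import itertools
import numpy as np

def evaluate_brute_force(pi, A, B, obs):
    """P(O | lambda) by summing over every possible state sequence (exponential in len(obs))."""
    n_states = len(pi)
    total = 0.0
    for seq in itertools.product(range(n_states), repeat=len(obs)):
        p = pi[seq[0]] * B[seq[0], obs[0]]                   # start in seq[0] and emit obs[0]
        for t in range(1, len(obs)):
            p *= A[seq[t - 1], seq[t]] * B[seq[t], obs[t]]   # transition, then emit
        total += p
    return total

# e.g. evaluate_brute_force(pi, A, B, [0, 1, 0]) for the observation sequence H, T, H
```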


2. Computing the most likely sequence of (hidden) states S1, S2, S3 … ST that could have generated a given observation sequence. This sequence should maximize the joint probability of the state sequence and the observation, P(O, S | λ). This is called the 'decoding problem'. The decoding problem can be solved using the Viterbi algorithm or posterior decoding. To optimize the probability of each individual state we can use

γt(j) = P(qt = Sj | O, λ)

This is the probability that the state is j at time t, given the observation O and the model λ. The most likely sequence of individual states can then be found by simply combining the individually most likely states:

qt* = argmax over j of γt(j)

3. The third problem of the Hidden Markov Model is finding the set of state transition probabilities and output probabilities that best explains the observations. The given parameters are adjusted so that the probability P(O | λ) is maximized. This is called the training problem. No analytical solution exists for this problem; it can be solved using the Baum-Welch re-estimation algorithm.


Relation of HMM to Prior, Posterior, and Evidence

As discussed in the introductory example, there are basically three types of probabilities associated with a Hidden Markov Model (HMM).

1. Initial probability πi

2. Emission/output probability B: P(Ok | qi), where Ok is an observation and qi is a state

3. Transition probability A

From the example we know that the initial probability is known before the experiment is performed. A prior probability has the same property, so the initial probability plays the role of the prior.

The emission/output probability P(Ok | qi) plays the role of the posterior probability; posterior probabilities are used in the forward-backward algorithm.

The transition probability A corresponds to the evidence; aij is the probability that the next state is Sj given that the current state is Si.


Solutions to the problems of HMM and their algorithms

As discussed earlier, there are three problems of the Hidden Markov Model:

1. The evaluation problem, which can be solved using the forward-backward algorithm.

2. The decoding problem, which can be solved using the Viterbi algorithm or posterior decoding.

3. The training problem, which can be solved using the Baum-Welch re-estimation algorithm.

Forward-Backward algorithm

The forward-backward algorithm combines a forward pass and a backward pass to find the probability of every hidden state at a specific time t; repeating this for every time step t gives the most likely state at each time t. It cannot be guaranteed that the resulting sequence is a valid state sequence, because every time step is considered individually.

The forward algorithm can be stated in three steps:

Initialisation: α1(i) = πi · bi(O1)

Induction: αt+1(j) = [ Σ over i of αt(i) · aij ] · bj(Ot+1)

Termination: P(O | λ) = Σ over i of αT(i)
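Below is a minimal NumPy sketch of these three steps, using the π, A, B conventions defined earlier and observations encoded as column indices of B. It is a plain (non-scaled) version; practical implementations usually rescale or work in log space to avoid underflow on long sequences.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward pass: alpha[t, i] = P(O_1 .. O_t, q_t = S_i | lambda)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialisation
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]  # induction
    return alpha

# Termination: P(O | lambda) is the sum of the last row, e.g.
# prob = forward(pi, A, B, [0, 1, 0])[-1].sum()
```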


Similarly, we can run the backward algorithm:

Initialisation: βT(i) = 1

Induction: βt(i) = Σ over j of aij · bj(Ot+1) · βt+1(j)

Combining the two passes gives the state posteriors γt(j) used for posterior decoding.
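A matching sketch of the backward pass, under the same conventions, together with the posterior-decoding step that combines the two passes:

```python
import numpy as np

def backward(pi, A, B, obs):
    """Backward pass: beta[t, i] = P(O_{t+1} .. O_T | q_t = S_i, lambda)."""
    T, N = len(obs), len(pi)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                        # initialisation
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])    # induction
    return beta

# Posterior decoding: gamma[t, j] = P(q_t = S_j | O, lambda), then take argmax per time step.
# alpha, beta = forward(pi, A, B, obs), backward(pi, A, B, obs)
# gamma = alpha * beta / (alpha * beta).sum(axis=1, keepdims=True)
# best_individual_states = gamma.argmax(axis=1)
```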


Viterbi algorithm

The Viterbi algorithm is used to find the most likely sequence of hidden states that could have produced a given sequence of observed events.


In the first step the Viterbi algorithm initializes its variables:

δ1(i) = πi · bi(O1),  ψ1(i) = 0

In the second step the recursion is carried out for every time step t = 2 … T:

δt(j) = [ max over i of δt-1(i) · aij ] · bj(Ot),  ψt(j) = argmax over i of δt-1(i) · aij

In the third step the iteration ends (termination):

P* = max over i of δT(i),  qT* = argmax over i of δT(i)


In the fourth step we track back the best path:

qt* = ψt+1(qt+1*), for t = T-1 … 1
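The four steps translate directly into a short Python sketch (same π, A, B conventions as before; plain products are used instead of log probabilities to stay close to the formulas above):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely hidden-state path for an observation sequence."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))                # best path probability ending in state j at time t
    psi = np.zeros((T, N), dtype=int)       # back-pointers
    delta[0] = pi * B[:, obs[0]]            # step 1: initialisation
    for t in range(1, T):                   # step 2: recursion
        scores = delta[t - 1][:, None] * A  # scores[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]        # step 3: termination
    for t in range(T - 1, 0, -1):           # step 4: track back the best path
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# e.g. viterbi(pi, A, B, [0, 0, 1, 0]) returns the most likely coin sequence for H, H, T, H
```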

Baum-Welch re-estimation algorithm

The Baum-Welch re-estimation algorithm is used to compute the unknown parameters of a Hidden Markov Model. It can best be described using the following example.

Assume we collect eggs from a chicken every day. Whether the chicken has laid an egg or not depends on unknown factors. For simplicity, assume there are only two hidden states (S1 and S2) that determine whether the chicken lays an egg. Initially we know nothing about the initial, transition, and emission probabilities (the probability that the chicken lays an egg given a specific state). To guess the parameters, take the observed sequences, assume they start with S1, find the probabilities that best explain them, and then repeat for S2. Repeat these steps until the resulting probabilities converge. Mathematically, the re-estimation can be written as:

π'i = γ1(i)

a'ij = [ Σ over t of ξt(i, j) ] / [ Σ over t of γt(i) ]

b'j(k) = [ Σ over t where Ot = k of γt(j) ] / [ Σ over t of γt(j) ]

where γt(i) is the probability of being in state i at time t, and ξt(i, j) is the probability of being in state i at time t and state j at time t+1, both given the observations.
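These re-estimation formulas can be sketched as a single EM step in Python (same conventions as the earlier snippets; one observation sequence, no scaling or smoothing, so this is only a didactic sketch):

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch (EM) re-estimation step; returns updated (pi, A, B)."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)

    # Forward and backward passes (same recursions as sketched earlier).
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    prob = alpha[-1].sum()                                   # P(O | lambda)

    # gamma[t, i] = P(q_t = i | O), xi[t, i, j] = P(q_t = i, q_{t+1} = j | O)
    gamma = alpha * beta / prob
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / prob

    # Re-estimation (the three update formulas above).
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B

# Iterate until the parameters converge, e.g.
# for _ in range(50):
#     pi, A, B = baum_welch_step(pi, A, B, obs)
```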
