Multiple alignment using hidden Markov models
Multiple alignment using hidden Markov models
November 21, 2001
Kim Hye Jin
Intelligent Multimedia Lab
Outline
• Introduction
• Methods and algorithm
• Result
• Discussion
IM lab
Introduction
• Why HMM?
– Mathematically consistent description of insertions and deletions
– Theoretical insight into the difficulties of combining disparate forms of information (e.g. sequences / 3D structures)
– Possible to train models from initially unaligned sequences
Methods and algorithms
• State transitions
– the state sequence is a 1st-order Markov chain
– each state is hidden
– match / insert / delete states
• Symbol emission
[Figure: HMM architecture showing state transitions and symbol emissions across match, insert, and delete states]
• Replacing arbitrary scores with probabilities relative to consensus
• Model M consists of N states S1 … SN
• Observed sequence O consists of T symbols O1 … OT from an alphabet X
• aij : the probability of a transition from Si to Sj
• bj(x) : the probability of emitting symbol x from state Sj
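The notation above can be made concrete with a small sketch. All sizes and probabilities below are invented for illustration; only the roles of N, a, and b come from the slide:

```python
import numpy as np

# Model M with N hidden states S1..SN over an alphabet X (toy values).
N = 3
alphabet = ["A", "C", "G", "T"]

# a[i, j] = P(next state is Sj | current state is Si); each row sums to 1
a = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# b[j, x] = P(state Sj emits symbol x); each row sums to 1
b = np.array([[0.70, 0.10, 0.10, 0.10],
              [0.10, 0.70, 0.10, 0.10],
              [0.25, 0.25, 0.25, 0.25]])

# Both parameter tables must be proper (row-stochastic) distributions.
assert np.allclose(a.sum(axis=1), 1.0)
assert np.allclose(b.sum(axis=1), 1.0)
```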
• HMM model: example for the sequence ACCY
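As a toy illustration of how such a model scores ACCY (every state, transition value, and emission value below is invented, not taken from the slide's figure): the probability of one path through the model is the product of the transition and emission probabilities met along the way.

```python
# Emission probabilities b_Mj(x) at match states M1..M4 (invented numbers).
emit = {
    1: {"A": 0.80, "C": 0.10, "Y": 0.05},
    2: {"A": 0.10, "C": 0.70, "Y": 0.10},
    3: {"A": 0.10, "C": 0.70, "Y": 0.10},
    4: {"A": 0.05, "C": 0.10, "Y": 0.80},
}
# Transition probabilities along the begin -> M1 -> ... -> M4 -> end path.
trans = {(0, 1): 0.9, (1, 2): 0.9, (2, 3): 0.9, (3, 4): 0.9, (4, 5): 0.9}

def path_probability(seq):
    """Probability of emitting seq along the all-match path."""
    p = 1.0
    for j, sym in enumerate(seq, start=1):
        p *= trans[(j - 1, j)] * emit[j][sym]
    p *= trans[(len(seq), len(seq) + 1)]  # final transition to the end state
    return p

print(path_probability("ACCY"))
```

Summing this quantity over every possible path (including insert and delete detours) gives the total probability of the sequence, which is what the forward algorithm computes efficiently.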
• Forward algorithm
– computes a sum over all paths rather than a maximum
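A minimal sketch of the forward recursion on a toy two-state model (all numbers invented): alpha[j] accumulates the probability of all paths that emit the symbols seen so far and end in state Sj; summing, rather than maximizing, over predecessors is what distinguishes it from Viterbi.

```python
import numpy as np

a = np.array([[0.7, 0.3],    # a[i, j]: transition Si -> Sj (toy values)
              [0.4, 0.6]])
b = np.array([[0.9, 0.1],    # b[j, x]: emission of symbol x from Sj
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution

def forward(obs):
    """Return P(O | model) via the forward recursion (sum over all paths)."""
    alpha = pi * b[:, obs[0]]
    for x in obs[1:]:
        # alpha'[j] = sum_i alpha[i] * a[i, j], then scaled by the emission
        alpha = (alpha @ a) * b[:, x]
    return alpha.sum()

print(forward([0, 1, 0]))
```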
• Viterbi algorithm
– finds the most likely path through the model
– the path is recovered by following the back pointers
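The same toy model illustrates Viterbi (all numbers invented): delta tracks the single best path probability instead of a sum, and the back pointers stored at each step are followed in reverse to read off the most likely state path.

```python
import numpy as np

a = np.array([[0.7, 0.3],    # a[i, j]: transition Si -> Sj (toy values)
              [0.4, 0.6]])
b = np.array([[0.9, 0.1],    # b[j, x]: emission of symbol x from Sj
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution

def viterbi(obs):
    """Most likely state path, recovered via back pointers."""
    T, N = len(obs), len(pi)
    delta = pi * b[:, obs[0]]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] * a          # scores[i, j]: come from Si to Sj
        back[t] = scores.argmax(axis=0)      # best predecessor of each Sj
        delta = scores.max(axis=0) * b[:, obs[t]]
    path = [int(delta.argmax())]             # best final state
    for t in range(T - 1, 0, -1):            # follow the back pointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

print(viterbi([0, 0, 1, 1]))  # → [0, 0, 1, 1]
```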
• Baum-Welch algorithm
– a variation of the forward algorithm
– starts from a reasonable guess for the initial model, then calculates a score for each sequence in the training set using the EM algorithm
• Local optima problem:
– forward algorithm / Viterbi algorithm
– Baum-Welch algorithm
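One Baum-Welch (EM) re-estimation step can be sketched on the same kind of toy model (all numbers invented): forward and backward variables yield expected state-occupancy and transition counts, which are renormalized into new parameters. Iterating this climbs the likelihood only to a local optimum, which is the local-optima problem noted above.

```python
import numpy as np

a = np.array([[0.7, 0.3],    # toy transition matrix
              [0.4, 0.6]])
b = np.array([[0.9, 0.1],    # toy emission matrix
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # toy initial distribution

def baum_welch_step(obs, a, b, pi):
    """One EM re-estimation step; returns (new_a, new_b, new_pi)."""
    T, N = len(obs), len(pi)
    # E-step: forward (alpha) and backward (beta) variables
    alpha = np.zeros((T, N))
    alpha[0] = pi * b[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ a) * b[:, obs[t]]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = a @ (b[:, obs[t + 1]] * beta[t + 1])
    evidence = alpha[-1].sum()                 # P(O | current model)
    gamma = alpha * beta / evidence            # P(state j at time t | O)
    xi = np.zeros((N, N))                      # expected transition counts
    for t in range(T - 1):
        xi += (alpha[t][:, None] * a * b[:, obs[t + 1]] * beta[t + 1]) / evidence
    # M-step: renormalize the expected counts into new parameters
    new_a = xi / xi.sum(axis=1, keepdims=True)
    new_b = np.zeros_like(b)
    for t, x in enumerate(obs):
        new_b[:, x] += gamma[t]
    new_b /= new_b.sum(axis=1, keepdims=True)
    return new_a, new_b, gamma[0]

new_a, new_b, new_pi = baum_welch_step([0, 1, 0, 0], a, b, pi)
```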
• Simulated annealing
– helps escape local optima in search of the global optimum
– kT = 0 : the standard Viterbi training procedure
– kT is lowered gradually during training
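The temperature idea can be sketched as follows (an illustrative toy, not the exact procedure from the talk): candidate scores are chosen with probability proportional to score^(1/kT), so high kT explores broadly, while kT = 0 collapses to the deterministic argmax of standard Viterbi training.

```python
import numpy as np

rng = np.random.default_rng(0)

def anneal_choice(scores, kT):
    """Pick an index with probability proportional to score**(1/kT)."""
    scores = np.asarray(scores, dtype=float)
    if kT == 0.0:
        return int(scores.argmax())   # kT = 0: deterministic Viterbi choice
    w = scores ** (1.0 / kT)          # higher kT flattens the distribution
    return int(rng.choice(len(scores), p=w / w.sum()))

scores = [0.5, 0.3, 0.2]
print(anneal_choice(scores, kT=0.0))  # always the best-scoring option
```

Lowering kT over the course of training moves the procedure smoothly from stochastic exploration toward the standard Viterbi choice.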
ClustalW
ClustalX
Results
• len : consensus length of the alignment
• ali : the number of structurally aligned sequences
• %id : the percentage sequence identity
• Homo : the number of homologues identified in and extracted from SwissProt 30
• %id : the average percentage sequence identity in the set of homologues
Discussion
• HMM
– a consistent theory for insertion and deletion penalties
– EGF : fairly difficult alignments are done well
• ClustalW
– progressive alignment
– disparities between the sequence identity of the structures and the sequence identity of the homologues
– large non-correlation between score and quality
• The ability of HMMs to perform sensitive fold recognition is apparent