Post on 22-Dec-2015
Protein Structure Alignment
Human Myoglobin pdb:2mm1
Human Hemoglobin alpha-chain pdb:1jebA
Sequence id: 27%
Structural id: 90%
Another example:
G-Proteins: 1c1y:A, 1kk1:A6-200
Sequence id: 18%
Structural id: 72%
Transformations
Translation: $x' = x + t$
Translation and Rotation, i.e. Rigid Motion (Euclidean Trans.): $x' = Rx + t$
Translation, Rotation + Scaling: $x' = s(Rx + t)$
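The three transformation classes listed above (translation, rigid motion, rigid motion with uniform scaling) can be sketched in a few lines. This is a minimal illustration only: the helper names (`translate`, `rigid`, `similarity`, `rot_z`) are hypothetical and not part of the slides, and points are plain Python lists of three floats.

```python
import math

def mat_vec(R, x):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def translate(x, t):
    """Translation: x' = x + t."""
    return [xi + ti for xi, ti in zip(x, t)]

def rigid(x, R, t):
    """Rigid motion: x' = R x + t, with R a rotation matrix."""
    return translate(mat_vec(R, x), t)

def similarity(x, R, t, s):
    """Rigid motion followed by uniform scaling: x' = s (R x + t)."""
    return [s * c for c in rigid(x, R, t)]

def rot_z(theta):
    """Rotation by angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Rotate (1, 0, 0) by 90 degrees about z, then shift by one unit along z.
x = [1.0, 0.0, 0.0]
R = rot_z(math.pi / 2)
t = [0.0, 0.0, 1.0]
x_rigid = rigid(x, R, t)
x_sim = similarity(x, R, t, 2.0)
```

A rigid motion preserves all pairwise distances, which is why it is the natural class for superimposing two protein structures; the scaled variant does not.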
Inexact Alignment.
Simple case – two closely related proteins with the same number of amino acids.
Assume transformation T is given
Question: how to measure an alignment error?
Distance Functions
Two point sets: A = {a_i}, i = 1…n; B = {b_j}, j = 1…m
• Pairwise Correspondence: (a_k1, b_t1), (a_k2, b_t2), …, (a_kN, b_tN)
(1) Exact Matching: $\|a_{k_i} - b_{t_i}\| = 0$ for all i
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$
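As an illustration of the three distance functions, here is a small sketch that evaluates them for a given pairwise correspondence. The function names and the representation of the correspondence as a list of index pairs `(k, t)` are assumptions made for this example, not notation from the slides.

```python
import math

def pair_distances(A, B, pairs):
    """Distances ||a_k - b_t|| for every (k, t) index pair of the correspondence."""
    return [math.dist(A[k], B[t]) for k, t in pairs]

def is_exact(A, B, pairs, tol=0.0):
    """Exact matching: every paired distance is zero (or within tol)."""
    return all(d <= tol for d in pair_distances(A, B, pairs))

def bottleneck(A, B, pairs):
    """Bottleneck distance: the largest paired distance."""
    return max(pair_distances(A, B, pairs))

def rmsd(A, B, pairs):
    """Root mean square distance over the N paired points."""
    d = pair_distances(A, B, pairs)
    return math.sqrt(sum(x * x for x in d) / len(d))

# Toy data: three of the four points in B are paired with points in A.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
pairs = [(0, 0), (1, 1), (2, 2)]

worst = bottleneck(A, B, pairs)   # 2.0, set by the single displaced pair
average = rmsd(A, B, pairs)       # sqrt(4/3), the outlier is averaged down
```

The bottleneck value is dominated by the single worst pair, while RMSD spreads that error over all N pairs.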
Correspondence is Unknown
Given two configurations of points in three-dimensional space, find those rotations and translations of one of the point sets which produce “large” superimpositions of corresponding 3-D points.
Largest Common Point Set (LCP) problem
Given e>0 and two point sets A and B find a transformation T and equally sized subsets A’ (a subset of A) and B’ (a subset of B) of maximal cardinality such that dist(A’,T(B’)) ≤ e.
Bottleneck metric: optimal solution in $O(n^{32.5})$ (C. Ambuhl et al., 2000)
RMSD metric: open problem
A 3-D reference frame can be uniquely defined by the ordered vertices of a non-degenerate triangle (p1, p2, p3).
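One possible way to build such a frame from an ordered triangle is sketched below. The specific convention (origin at p1, first axis along p1→p2, third axis normal to the triangle plane) is an assumption chosen for illustration; the slides do not fix a convention, and all function names are hypothetical.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def frame_from_triangle(p1, p2, p3, eps=1e-9):
    """Return (origin, (e1, e2, e3)): a right-handed orthonormal frame.

    Assumed convention: origin at p1, e1 along p1->p2, e3 normal to the
    triangle plane, e2 = e3 x e1. Collinear points are rejected.
    """
    u = sub(p2, p1)
    w = cross(u, sub(p3, p1))          # normal to the triangle plane
    if dot(w, w) <= eps:               # zero area: no unique frame
        raise ValueError("degenerate triangle")
    e1 = normalize(u)
    e3 = normalize(w)
    e2 = cross(e3, e1)
    return p1, (e1, e2, e3)

def to_local(x, origin, axes):
    """Coordinates of point x expressed in the triangle's frame."""
    d = sub(x, origin)
    return [dot(d, e) for e in axes]

origin, axes = frame_from_triangle([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0])
local_p3 = to_local([0.0, 3.0, 0.0], origin, axes)
```

Because the vertices are ordered, two congruent triangles yield two frames, and the rigid motion that maps one frame onto the other is the candidate transformation between the two molecules.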
Structure Alignment (Straightforward Algorithm)
• For each pair of triplets, one from each molecule which define ‘almost’ congruent triangles compute the rigid transformation that superimposes them.
• Count the number of aligned point pairs. How? -> maximal bipartite matching (bottleneck metric)
• Complexity: $O(n^3 m^3) \cdot O(nm\sqrt{m+n})$.
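The counting step can be illustrated as follows: after one molecule has been transformed, connect every pair of points that lie within a threshold of each other and take a maximum bipartite matching. This sketch assumes the point set `A` has already been transformed, uses a hypothetical threshold parameter `eps`, and substitutes the simple augmenting-path (Kuhn) algorithm for brevity; it does not achieve the square-root factor in the complexity bound quoted above.

```python
import math

def count_aligned_pairs(A, B, eps):
    """Size of a maximum matching between A and B that uses only pairs
    within distance eps (bottleneck criterion)."""
    # adj[i] lists the points of B that A[i] is allowed to match.
    adj = [[j for j, b in enumerate(B) if math.dist(a, b) <= eps] for a in A]
    match_b = [-1] * len(B)   # match_b[j] = index in A matched to B[j], or -1

    def try_augment(i, seen):
        """Try to match A[i], re-routing earlier matches if necessary."""
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_b[j] == -1 or try_augment(match_b[j], seen):
                match_b[j] = i
                return True
        return False

    return sum(try_augment(i, set()) for i in range(len(A)))

# A[0] is close to both points of B, A[1] only to B[0]; a greedy
# assignment of A[0] to B[0] would block A[1], the matching re-routes it.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)]
aligned = count_aligned_pairs(A, B, eps=0.6)
```

A matching is needed rather than a nearest-neighbour count because each point may take part in at most one aligned pair.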
Can we say something about the quality of the final solution?
YES!
If there is an LCP of size L with error e, then the alignment method detects an LCP of size ≥ L with error 8e (M.T. Goodrich et al., 1994).
Superposition: best least squares (RMSD, Root Mean Square Deviation)
Given two sets of 3-D points: P = {p_i}, Q = {q_i}, i = 1,…,n;
$\mathrm{rmsd}(P,Q) = \sqrt{\sum_i |p_i - q_i|^2 / n}$
Find a 3-D rigid transformation T* such that:
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i |T(p_i) - q_i|^2 / n}$
A closed form solution exists for this task. It can be computed in O(n) time.
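The slides do not say which closed form is meant. One standard construction is the SVD-based (Kabsch) solution, outlined here as an illustration; the symbols $\bar p$, $\bar q$, $H$, $U$, $V$, $d$ are introduced for this sketch only.

```latex
% Centroids of the two point sets
\bar p = \frac{1}{n}\sum_{i=1}^{n} p_i , \qquad
\bar q = \frac{1}{n}\sum_{i=1}^{n} q_i

% 3x3 cross-covariance of the centered points (one pass over the data, O(n))
H = \sum_{i=1}^{n} (p_i - \bar p)(q_i - \bar q)^{\mathsf T}

% SVD of a 3x3 matrix (constant time)
H = U \Sigma V^{\mathsf T}

% Optimal rotation; d = \pm 1 guards against a reflection
d = \operatorname{sign}\!\left(\det(V U^{\mathsf T})\right), \qquad
R^* = V \,\mathrm{diag}(1, 1, d)\, U^{\mathsf T}

% Optimal translation and the resulting rigid transformation
t^* = \bar q - R^* \bar p , \qquad
T^*(p) = R^* p + t^*
```

Only the centroids and $H$ depend on n, and each is a single sum over the points, which is consistent with the O(n) bound stated above.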
Sequence Order Independent Alignment
(Figure: sequence-order-independent alignments between structures 2cbl:A, 1f4n, 1rhg:A and 1b3q, chains A and B, with the aligned segments labeled by residue numbers.)
The C2 domain calcium-binding motif
E. A. Nalefski and J. J. Falke, "The C2 domain calcium-binding motif: Structural and functional diversity," Protein Sci. 1996, 5: 2375-2390