phylogenetic trees tutorial 6. distance based methods upgma neighbor joining tools mega phylogeny.fr...
Post on 22-Dec-2015
Distance-based methods
• UPGMA
• Neighbor Joining
Tools
• MEGA
• phylogeny.fr
• DrawTree
Phylogenetic Trees
[Figure: example trees with nodes labeled a to f]
UPGMA: Unweighted Pair Group Method using Arithmetic Averages
Assumption: sequences diverge at a constant rate, so every leaf is at the same distance from the root.
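The slides state only the UPGMA assumption, so the following is a minimal sketch of the usual average-linkage procedure, not the author's own code. The function name `upgma`, the `frozenset`-keyed distance dictionary and the nested-tuple tree are illustrative assumptions. The distances are the five-sequence example used later for Neighbor Joining. Because each merge sits at half the distance between the two clusters, every leaf ends up equally far from the root.

```python
from itertools import combinations

def upgma(dist):
    """Cluster by average linkage.

    dist maps frozenset({a, b}) -> distance between leaves a and b.
    Returns (tree, heights): tree is a nested tuple, heights maps each
    cluster to its height (its distance to every leaf below it).
    """
    leaves = sorted(set().union(*dist))
    size = {x: 1 for x in leaves}        # number of leaves in each live cluster
    heights = {x: 0.0 for x in leaves}
    d = dict(dist)
    while len(size) > 1:
        pair = min(d, key=d.get)         # closest pair of clusters
        i, j = sorted(pair, key=str)     # fixed order so labels are reproducible
        new = (i, j)
        heights[new] = d[pair] / 2       # constant-rate assumption: split evenly
        others = [k for k in size if k not in pair]
        for k in others:
            # Arithmetic average over all leaf pairs, weighted by cluster size.
            d[frozenset((new, k))] = (
                size[i] * d[frozenset((i, k))] + size[j] * d[frozenset((j, k))]
            ) / (size[i] + size[j])
        # Drop every distance that still refers to the merged clusters.
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        size[new] = size[i] + size[j]
        del size[i], size[j]
    tree = next(iter(size))
    return tree, heights

names = list("ABCDE")
values = [22, 39, 39, 41, 41, 41, 43, 18, 20, 10]   # upper triangle, row by row
dist = {frozenset(p): v for p, v in zip(combinations(names, 2), values)}
tree, heights = upgma(dist)
print(tree)
print(round(heights[tree], 2))
```

On these distances D and E merge first (height 5), then C joins them (height 9.5), then A and B merge (height 11), and the root joins the two groups.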
Neighbor Joining (Not assuming equal divergence)
Step by step summary:
1. Calculate all pairwise distances.
2. Pick two nodes (i and j) for which the relative distance is minimal (lowest).
3. Define a new node (x).
4. Calculate $D_{i,x}$ and $D_{j,x}$, the distances of the chosen nodes i and j to the new node x, as well as the distance from x to all other nodes.
5. Continue until two nodes remain, then connect them with an edge.
[Figure: tree with leaves A, B, C, D, E]

     A    B    C    D    E
A    -   22   39   39   41
B    -    -   41   41   43
C    -    -    -   18   20
D    -    -    -    -   10
E    -    -    -    -    -
Step 1. Calculate all pairwise distances.
Measuring Distance
• Simplest measure: $D_{i,j} = f$, the fraction of sites where the residues differ.
• Problem: for unrelated sequences, f approaches the fraction of differences expected by chance, so the distance measure converges.
• Jukes-Cantor: $D_{i,j} = -\frac{3}{4}\ln\left(1 - \frac{4}{3} f\right)$
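As a small illustration of the Jukes-Cantor correction, $D = -\frac{3}{4}\ln(1 - \frac{4}{3} f)$, here is a sketch that computes f from two aligned sequences and then corrects it. The function names `p_distance` and `jukes_cantor` are made up for this example, and the two sequences are invented test data.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of aligned sites where the residues differ (f)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(seq_a, seq_b))
    return differences / len(seq_a)

def jukes_cantor(f):
    """Jukes-Cantor distance for an observed fraction of differing sites f."""
    if f >= 0.75:
        # At f = 3/4 the logarithm's argument reaches zero: the sequences look
        # as different as chance allows and the distance is undefined.
        raise ValueError("f must be below 0.75")
    return -0.75 * math.log(1 - 4 * f / 3)

f = p_distance("ACGTACGTAC", "ACGTACGTAA")   # one mismatch in ten sites
print(f, round(jukes_cantor(f), 4))
```

The corrected distance is always at least f, and the gap widens as f grows toward 0.75.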
Measuring Distance (cont)
• Euclidean Distance: given a multiple sequence alignment, calculate the square root of the sum of the squared scores at every position between two sequences.
• The score increases in proportion to the dissimilarity between the residues.

$$d_{a,b} = \sqrt{\sum_{i=1}^{n} s(a_i, b_i)^2}$$
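The Euclidean distance can be sketched as below. The slides do not say which scoring function s is used, so `mismatch_score` (0 for identical residues, 1 otherwise) is purely a placeholder assumption; any score that grows with residue dissimilarity could be passed instead.

```python
import math

def mismatch_score(x, y):
    """Placeholder score (an assumption): 0 if the residues match, else 1."""
    return 0.0 if x == y else 1.0

def euclidean_distance(seq_a, seq_b, score=mismatch_score):
    """Square root of the sum of squared per-position scores."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return math.sqrt(sum(score(x, y) ** 2 for x, y in zip(seq_a, seq_b)))

print(euclidean_distance("AAAA", "CCCC"))   # four mismatches -> sqrt(4) = 2.0
```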
Step 2. Pick two nodes (i and j) for which the relative distance is minimal (lowest).
$$M_{i,j} = D_{i,j} - (r_i + r_j)$$

$$r_i = \frac{\sum_{k} D_{i,k}}{L - 2}, \qquad r_j = \frac{\sum_{k} D_{j,k}}{L - 2}$$

$M_{i,j}$: relative distance between i and j
$D_{i,j}$: distance between i and j from the distance table
$r_i$: summed distance of i from all other sequences, divided by L - 2
$L$: number of leaves (= sequences) left in the tree
     A    B    C    D    E
A    -   22   39   39   41
B    -    -   41   41   43
C    -    -    -   18   20
D    -    -    -    -   10
E    -    -    -    -    -
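$$r_i = \frac{\sum_{j \neq i} D_{i,j}}{L - 2}$$

$$r_A = \frac{22 + 39 + 39 + 41}{5 - 2} = \frac{141}{3} = 47$$
$$r_B = \frac{22 + 41 + 41 + 43}{3} = \frac{147}{3} = 49$$
$$r_C = \frac{39 + 41 + 18 + 20}{3} = \frac{118}{3} = 39.3$$
$$r_D = \frac{39 + 41 + 18 + 10}{3} = \frac{108}{3} = 36$$
$$r_E = \frac{41 + 43 + 20 + 10}{3} = \frac{114}{3} = 38$$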
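The r values can be checked with a few lines of Python. This is only a sketch: the names `DIST`, `distance` and `net_divergence` are chosen for the example, and the numbers are the distance table from the slides.

```python
# Upper triangle of the distance table.
DIST = {
    ("A", "B"): 22, ("A", "C"): 39, ("A", "D"): 39, ("A", "E"): 41,
    ("B", "C"): 41, ("B", "D"): 41, ("B", "E"): 43,
    ("C", "D"): 18, ("C", "E"): 20,
    ("D", "E"): 10,
}

def distance(i, j):
    """Look up D(i, j) regardless of the order of i and j."""
    return DIST[(i, j)] if (i, j) in DIST else DIST[(j, i)]

def net_divergence(nodes):
    """r_i = (sum of distances from i to every other node) / (L - 2)."""
    L = len(nodes)
    return {i: sum(distance(i, k) for k in nodes if k != i) / (L - 2) for i in nodes}

r = net_divergence(["A", "B", "C", "D", "E"])
for node, value in r.items():
    print(f"r_{node} = {value:.1f}")
```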
     A      B      C      D      E
A    -    -74  -47.3    -44    -44
B    -      -  -47.3    -44    -44
C    -      -      -  -57.3  -57.3
D    -      -      -      -    -64
E    -      -      -      -      -
A,B is the pair with the minimal $M_{i,j}$ value.
The $M_{i,j}$ table is used only to choose the closest pair (lowest value), not to calculate the distances.
$$M_{AB} = D_{AB} - (r_A + r_B) = 22 - (47 + 49) = -74$$
$$M_{AC} = 39 - (47 + 39.3) = -47.3$$
Etc.
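The whole $M_{i,j}$ table and the choice of the closest pair can be reproduced with a short sketch. The function name `relative_distances` and the `frozenset`-keyed dictionary are assumptions made for this example; the distances are those from the slides.

```python
from itertools import combinations

def relative_distances(nodes, dist):
    """Return {(i, j): M_ij} with M_ij = D_ij - (r_i + r_j)."""
    L = len(nodes)
    r = {i: sum(dist[frozenset((i, k))] for k in nodes if k != i) / (L - 2)
         for i in nodes}
    return {(i, j): dist[frozenset((i, j))] - (r[i] + r[j])
            for i, j in combinations(nodes, 2)}

nodes = list("ABCDE")
values = [22, 39, 39, 41, 41, 41, 43, 18, 20, 10]   # upper triangle, row by row
dist = {frozenset(p): v for p, v in zip(combinations(nodes, 2), values)}

M = relative_distances(nodes, dist)
closest = min(M, key=M.get)          # pair with the lowest relative distance
for pair, value in M.items():
    print(pair, round(value, 1))
print("closest pair:", closest)
```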
Step 4. Calculate $D_{i,x}$ and $D_{j,x}$, the distances of the chosen nodes i and j to the new node x, as well as the distance from x to all other nodes.
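$$D_{AX} = \frac{D_{AB}}{2} + \frac{r_A - r_B}{2} = \frac{22}{2} + \frac{47 - 49}{2} = 10$$
$$D_{BX} = \frac{D_{AB}}{2} + \frac{r_B - r_A}{2} = \frac{22}{2} + \frac{49 - 47}{2} = 12$$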
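The two branch lengths follow directly from $D_{i,x} = D_{i,j}/2 + (r_i - r_j)/2$ and the mirrored formula for j. A minimal sketch, with `branch_lengths` as a made-up helper name:

```python
def branch_lengths(d_ij, r_i, r_j):
    """Distances from the joined nodes i and j to their new parent node x."""
    d_ix = d_ij / 2 + (r_i - r_j) / 2
    d_jx = d_ij / 2 + (r_j - r_i) / 2
    return d_ix, d_jx

d_ax, d_bx = branch_lengths(22, 47, 49)   # D_AB = 22, r_A = 47, r_B = 49
print(d_ax, d_bx)   # 10.0 12.0
```

The two lengths always add up to $D_{i,j}$; the node with the larger r gets the longer branch.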
     X    C    D    E
X    -   29   29   31
C    -    -   18   20
D    -    -    -   10
E    -    -    -    -
Now we’ll calculate the distance from X to all other nodes.
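$$D_{CX} = \frac{D_{AC} + D_{BC} - D_{AB}}{2} = \frac{39 + 41 - 22}{2} = 29$$
$$D_{DX} = \frac{D_{AD} + D_{BD} - D_{AB}}{2} = \frac{39 + 41 - 22}{2} = 29$$
$$D_{EX} = \frac{D_{AE} + D_{BE} - D_{AB}}{2} = \frac{41 + 43 - 22}{2} = 31$$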
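The update rule $D_{k,x} = (D_{i,k} + D_{j,k} - D_{i,j})/2$ can be applied to every remaining node at once. This is a sketch only: `distance_to_new_node`, `to_a` and `to_b` are names invented for the example, holding the A and B rows of the original table.

```python
def distance_to_new_node(d_ik, d_jk, d_ij):
    """Distance from a remaining node k to the new node x that replaces i and j."""
    return (d_ik + d_jk - d_ij) / 2

d_ab = 22
to_a = {"C": 39, "D": 39, "E": 41}   # distances from A to the remaining nodes
to_b = {"C": 41, "D": 41, "E": 43}   # distances from B to the remaining nodes

to_x = {k: distance_to_new_node(to_a[k], to_b[k], d_ab) for k in to_a}
print(to_x)   # {'C': 29.0, 'D': 29.0, 'E': 31.0}
```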
Step 5 - Continue until two nodes remain
New $M_{i,j}$ table

     X    C    D    E
X    -  -49  -44  -44
C    -    -  -44  -44
D    -    -    -  -49
E    -    -    -    -
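[Figure: tree with leaves A, B, C, D, E and new nodes X and Y]

X and C (tied with D and E at -49) are joined at a new node Y:
$$D_{YC} = 9, \qquad D_{XY} = 20$$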
New $D_{i,j}$ table

     Y    D    E
Y    -    9   11
D    -    -   10
E    -    -    -
Only three nodes (Y, D and E) are left, so they all attach to one last node, Z. Let's calculate all the distances to Z.
[Figure: final tree with leaves A, B, C, D, E and new nodes X, Y, Z]

$$D_{ZY} = 5, \qquad D_{ZE} = 6, \qquad D_{ZD} = 4$$
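Putting the steps together, the whole worked example can be replayed with one loop. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `neighbor_joining`, the edge-list output and the `frozenset`-keyed dictionary are choices made here. Ties in $M_{i,j}$ are broken by taking the first minimal pair, with each new node placed at the front of the node list; that rule is an assumption picked so the example joins X with C, as the slides do.

```python
from itertools import combinations

def neighbor_joining(names, dist, new_names):
    """Return the tree as a list of (node, node, branch_length) edges.

    dist maps frozenset({a, b}) -> distance; new_names supplies labels
    for the internal nodes in the order they are created.
    """
    nodes = list(names)
    d = dict(dist)
    labels = iter(new_names)
    edges = []
    while len(nodes) > 2:
        L = len(nodes)
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) / (L - 2)
             for i in nodes}
        # Step 2: pair with the lowest M_ij (first minimal pair wins a tie).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: d[frozenset(p)] - (r[p[0]] + r[p[1]]))
        x = next(labels)                       # Step 3: define a new node
        d_ij = d[frozenset((i, j))]
        # Step 4: branch lengths to x, then distances from x to the rest.
        edges.append((i, x, d_ij / 2 + (r[i] - r[j]) / 2))
        edges.append((j, x, d_ij / 2 + (r[j] - r[i]) / 2))
        rest = [k for k in nodes if k not in (i, j)]
        for k in rest:
            d[frozenset((x, k))] = (d[frozenset((i, k))] + d[frozenset((j, k))] - d_ij) / 2
        nodes = [x] + rest                     # new node goes first (tie-break assumption)
    a, b = nodes                               # Step 5: connect the last two nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

names = list("ABCDE")
values = [22, 39, 39, 41, 41, 41, 43, 18, 20, 10]   # upper triangle, row by row
dist = {frozenset(p): v for p, v in zip(combinations(names, 2), values)}
edges = neighbor_joining(names, dist, "XYZ")
for a, b, length in edges:
    print(f"{a}-{b}: {length}")
```

With only three nodes left, all three $M_{i,j}$ values are equal, so whichever pair is joined gives the same branch lengths to Z.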
http://www.phylogeny.fr/version2_cgi/one_task.cgi?task_type=phyml
http://www.phylodiversity.net/rree/drawtree/index.html
[Figure: tree drawn as a cladogram and as a phylogram]