Chap. 4. Distance-Based Method of Phylogenetics Distance-based method vs. Character-based method


Post on 12-Mar-2020


Page 1: Distance-based method vs. Character-based methodamborella.net/2012-Bioinformatics/Week13-Chap4.pdf · 2012-05-23 · (Protein sequencing, Immunological method) 3) DNA sequencing i)

Chap. 4. Distance-Based Method of Phylogenetics

Distance-based method vs. Character-based method


[Figure: tree relating taxa A, B, C, and D, drawn against axes of time and variation]

• Phenetic relationship

(A,(B,C)) Inferring phylogeny from phenetic relationships: Distance-based method

• Cladistic relationship

((A,B),C) Inferring phylogeny from cladistic relationships: Character-based method

Note) Chronistic (temporal) relationship: (D,(A,B,C))

• In low-rank taxa, such as infraspecific taxa or populations, the phenetic relationship is likely to match the cladistic relationship, but at higher ranks the two are likely to differ.


Newick Format

Note: polytomy


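Advantages of Molecular Phylogenetics

- Phenotypes, judged by whether organisms look alike or not, often fail to show evolutionary relationships correctly. Morphological similarity does not always reflect genetic similarity. This is because of homoplastic characters.

- Morphological characters are few in number and therefore insufficient.

• In comparing molecular data as well, homology must be considered so that only comparable things are compared. Homologous genes must be compared, and sequence alignment is also required.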


Homology

[Figure: homologous forelimb bones (humerus, radius and ulna, carpals, metacarpals, phalanges) in human, chimpanzee, lion, sea lion, and bat]


History of Molecular Phylogenetics

1) DNA-DNA hybridization

2) Restriction Fragment Length Polymorphism (RFLP)

(Protein sequencing, Immunological method)

3) DNA sequencing

i) Single gene comparison

ii) Multiple gene comparison

iii) Comparative genomics


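Molecular methods used in phylogenetics

1) DNA-DNA hybridization

- Compares the whole genomes of organisms.
- Actively studied in animal groups in the early days of DNA analysis
  ex) studies of the relationships among vultures, condors, and storks
- Now rarely used because of poor reproducibility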


2) RFLP (restriction fragment length polymorphism)


Enzyme   Source                       Recognition Sequence   Cut
EcoRI    Escherichia coli             5'GAATTC               5'---G/AATTC---3'
EcoRII   Escherichia coli             5'CCWGG                5'---/CCWGG---3'
BamHI    Bacillus amyloliquefaciens   5'GGATCC               5'---G/GATCC---3'
HindIII  Haemophilus influenzae       5'AAGCTT               5'---A/AGCTT---3'
TaqI     Thermus aquaticus            5'TCGA                 5'---T/CGA---3'
NotI     Nocardia otitidis            5'GCGGCCGC             5'---GC/GGCCGC---3'
HinfI    Haemophilus influenzae       5'GANTC                5'---G/ANTC---3'
Sau3A    Staphylococcus aureus        5'GATC                 5'---/GATC---3'
PvuII*   Proteus vulgaris             5'CAGCTG               5'---CAG/CTG---3'
SmaI*    Serratia marcescens          5'CCCGGG               5'---CCC/GGG---3'
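
To make the cut notation in the table concrete, the following Python sketch digests a linear DNA string at every occurrence of a recognition sequence and reports the fragment lengths, which is what an RFLP comparison looks at. The function name `digest`, the `cut_offset` parameter, and the example sequence are illustrative assumptions, not part of the lecture. Degenerate sites such as CCWGG or GANTC and circular DNA are not handled.

```python
def digest(seq, site, cut_offset):
    """Cut a linear DNA string at every occurrence of `site`.

    cut_offset is the number of bases of the site that lie left of the cut,
    e.g. 1 for EcoRI (G/AATTC) and 3 for SmaI (CCC/GGG).
    Returns the fragment lengths from 5' to 3'.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical sequence containing two EcoRI sites.
example = "AAGAATTCTTGAATTCGG"
print(digest(example, "GAATTC", 1))
```

A mutation that destroys one of the two sites would merge two fragments into one, and that change in the fragment-length pattern is the polymorphism that RFLP scores.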


In plants, RFLP studies of cpDNA were carried out actively, mainly from the late 1980s to the early 1990s.


(Protein sequencing, Immunological method)

3) DNA sequencing

i) Data from a single DNA region
ii) Data from multiple DNA regions
iii) Comparative genomics

• Advantages of DNA sequence analysis for phylogenetic studies:
i) Objective character states: A/C/G/T
ii) The number of phylogenetically informative characters is very large.
iii) New data are easy to add to existing data.
iv) Data can be obtained from very small amounts of sample.
v) Experimental results can be used across a wide range of taxa.

- Slowly evolving genes: phylogeny of all angiosperms
- Rapidly evolving genes or the sequences between genes (intergenic spacers): distinguishing genera or species

- Basic process for reconstructing a phylogenetic tree from sequence data

Sequencing → search → filtering → alignment → phylogenetic analyses


Phylogenetic Tree
- Node
- Terminal node
- Internal node
- Inferred ancestor

- Newick format: (((I, II), (III, IV)), V)

- Multifurcating vs. bifurcating

- Scaled tree vs. unscaled tree
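
As a small illustration of how the Newick string above encodes nodes, the following Python sketch parses it into nested tuples, where each tuple is an internal node and each string is a terminal node. The names `parse_newick` and `leaves` are hypothetical helpers written for this example. The parser is a minimal sketch that assumes labels only, with no branch lengths, so it covers unscaled trees.

```python
def parse_newick(text):
    """Parse a Newick string without branch lengths into nested tuples."""
    text = text.replace(" ", "").rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # skip "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
            return tuple(children)        # internal node
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]            # terminal node (leaf label)

    return node()

def leaves(tree):
    """List the terminal nodes of a parsed tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]

tree = parse_newick("(((I, II), (III, IV)), V)")
print(tree)
print(leaves(tree))
```

A tuple with two children is a bifurcating node. A multifurcating node (polytomy) such as `(A,B,C)` comes out as a tuple with three or more children.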


Unscaled tree

Cladogram


Scaled tree

Phylogram


Rooted vs. Unrooted Tree

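- In every tree-building method, an unrooted tree (tree network) is produced first, and a root is then set to show the direction of evolution. The root is set by designating an outgroup. The outgroup is the group that shares the earliest common ancestor with the taxa under study, that is, the group that branched off earliest from the common ancestor it shares with them.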


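Rooted tree: a tree with a single root from which all the species under study are ultimately thought to have descended.

Ancestor-descendant relationships are established only after the root is set.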


Question: Draw a ROOTED cladogram based on the following unrooted tree with the arrowed rooting position.

[Figure: unrooted tree with terminal taxa A to J; an arrow marks the rooting position]


Find the best tree in HERE

n=3

- Number of possible rooted trees (NR) and unrooted trees (NU) for n species

- For a data set of 135 human mitochondrial DNA sequences, the number of possible trees is 2.113 × 10^267!
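
The slide's own equation did not survive extraction, so the sketch below assumes the standard counting formulas for bifurcating trees: NR = (2n-3)! / (2^(n-2) (n-2)!) and NU = (2n-5)! / (2^(n-3) (n-3)!). The function names `n_rooted` and `n_unrooted` are illustrative. Python integers are exact, so the count for 135 sequences can be computed directly.

```python
from math import factorial

def n_rooted(n):
    """Number of rooted bifurcating trees for n >= 2 species."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

def n_unrooted(n):
    """Number of unrooted bifurcating trees for n >= 3 species."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

for n in (3, 4, 5, 10):
    print(n, n_rooted(n), n_unrooted(n))

# Order of magnitude for the 135-sequence data set (rooted trees).
print(len(str(n_rooted(135))) - 1)
```

The rapid growth of these numbers is the reason exhaustive evaluation of every tree is only feasible for a handful of taxa.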


[Figure: all possible unrooted and rooted trees. N=3 (taxa A, B, C): 1 unrooted tree and 3 rooted trees. N=4 (taxa A, B, C, D): 3 unrooted trees and 15 rooted trees. N=n: …]


- A species tree does not always agree with a gene tree. Homologous genes must be compared.


Homologous
- Orthologous genes: originated from speciation
- Paralogous genes: originated from gene duplications


Distance Matrix Method: the simplest way to build a phylogenetic tree

A representative distance-based method is UPGMA: Unweighted Pair Group Method with Arithmetic Mean

Pairwise Distance Matrix


- Select the pair with the smallest distance and join it into one group.
- Compute the distances between this group and the remaining taxa.

D and E show the smallest distance: join (D, E) first

Distance between (DE) and X: d(DE)X = ½ (dDX + dEX)

ex) d(DE)A = ½ (dDA + dEA) = ½ (12 + 15) = 13.5

- The smallest number is now 8: (A, C)


Final Newick format: (((A, C), B), (D, E))
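
The UPGMA steps above can be sketched in Python as follows. The distance matrix on the slide was an image and did not survive, so the values below are hypothetical except for the three that appear in the text (dDA = 12, dEA = 15, and the A-C distance of 8). The function name `upgma` is also illustrative. The sketch weights each average by cluster size, which is the usual UPGMA rule; when two single taxa are joined this reduces to the ½ (dDX + dEX) formula shown above.

```python
from itertools import combinations

def upgma(labels, pair_dist):
    """Cluster taxa by UPGMA and return the tree as a Newick string."""
    size = {name: 1 for name in labels}            # cluster -> number of leaves
    d = {frozenset(pair): float(v) for pair, v in pair_dist.items()}
    while len(size) > 1:
        # Join the two clusters separated by the smallest distance.
        u, v = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = f"({u},{v})"
        # Distance to every other cluster: average weighted by cluster size.
        for x in size:
            if x not in (u, v):
                d[frozenset((merged, x))] = (
                    size[u] * d[frozenset((u, x))] + size[v] * d[frozenset((v, x))]
                ) / (size[u] + size[v])
        size[merged] = size[u] + size[v]
        del size[u], size[v]
    return next(iter(size)) + ";"

labels = ["A", "B", "C", "D", "E"]
# Hypothetical distances, chosen to agree with the values quoted on the slide.
dist = {
    ("A", "B"): 10, ("A", "C"): 8, ("A", "D"): 12, ("A", "E"): 15,
    ("B", "C"): 10, ("B", "D"): 14, ("B", "E"): 16,
    ("C", "D"): 13, ("C", "E"): 14, ("D", "E"): 4,
}
print(upgma(labels, dist))
```

With these assumed distances the script prints `((D,E),(B,(A,C)));`, which has the same topology as the final Newick string on the slide, only with the groups written in a different order.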

Estimation of Branch Lengths

- Cladogram

IF the molecular clock is present!


Site  A B C D E
1     T T T T T
2     C A A A A
3     A A A A T
4     C C T T T
5     C C C A A
6     T G G G G

Uninformative characters: 1, 2, 3, 6
Variable characters: 2, 3, 4, 5, 6
Informative characters: 4, 5

Parsimony method: uses informative characters
Distance method: uses all characters
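
The classification of the six sites can be checked with a short Python sketch. It assumes the usual parsimony definition: a site is variable if it shows more than one state, and informative if at least two different states each occur in two or more taxa. The function name `classify_site` is illustrative.

```python
from collections import Counter

def classify_site(states):
    """Return (is_variable, is_informative) for one alignment column."""
    counts = Counter(states)
    is_variable = len(counts) > 1
    # Parsimony-informative: at least two states, each seen in two or more taxa.
    is_informative = sum(c >= 2 for c in counts.values()) >= 2
    return is_variable, is_informative

# Sites 1-6 from the slide; each string lists the states of taxa A to E.
sites = {1: "TTTTT", 2: "CAAAA", 3: "AAAAT", 4: "CCTTT", 5: "CCCAA", 6: "TGGGG"}
variable = [s for s, col in sites.items() if classify_site(col)[0]]
informative = [s for s, col in sites.items() if classify_site(col)[1]]
uninformative = [s for s in sites if s not in informative]
print(variable, informative, uninformative)
```

Under this definition, site 3 (AAAAT) is variable but uninformative, because its second state occurs in only one taxon.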

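Transformed Distance Method (Farris, 1977)
- Drawback of UPGMA: it assumes that the rate of evolution is constant in all lineages.
- TDM can carry out the calculation while allowing each lineage to have a different rate of evolution.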


d'ij: transformed distance between species i and j
dD: average distance between the outgroup and all ingroups.

# in this case: dD = (dAD + dBD + dCD)/3 = (12 + 15 + 10)/3 = 37/3

d'AB = (9 - 12 - 15)/2 + 37/3 = -9 + 37/3 = 10/3
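
The worked example implies the general form d'ij = (dij - diD - djD)/2 + dD, with D as the outgroup. The Python sketch below reproduces the arithmetic with exact fractions. The function name `transformed_distance` and the variable names are illustrative, and only the distances quoted on the slide are used.

```python
from fractions import Fraction

def transformed_distance(d_ij, d_iD, d_jD, mean_dD):
    """Transformed distance between ingroup taxa i and j, with outgroup D."""
    return Fraction(d_ij - d_iD - d_jD, 2) + mean_dD

# Distances from each ingroup taxon to the outgroup D, as given on the slide.
d_to_outgroup = {"A": 12, "B": 15, "C": 10}
mean_dD = Fraction(sum(d_to_outgroup.values()), len(d_to_outgroup))

d_AB = transformed_distance(9, d_to_outgroup["A"], d_to_outgroup["B"], mean_dD)
print(mean_dD, d_AB)
```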


Neighbors Relation Method
- neighbors: in an unrooted tree, pairs of species that are separated from each other by just one internal node.


Neighbor Joining (NJ) Method
- NJ is a method that minimizes the total branch length of the tree.

S12: any pair of species can take positions 1 and 2
N: number of species
k: accepted outgroup
Dij: distance between species

The most widely used method
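
The S12 formula itself was an image and is missing from this transcript. The sketch below assumes the standard Saitou and Nei (1987) expression, S12 = [1/(2(N-2))] Σ(D1k + D2k) + D12/2 + [1/(N-2)] Σ Dij, where k, i, and j run over the species other than 1 and 2. It covers only the pair-selection step of NJ, not the full algorithm. The function name `nj_sum` and the distances are hypothetical; the distances were derived from an assumed tree ((A,B),(C,D)) with total branch length 15.

```python
from itertools import combinations

def nj_sum(i, j, taxa, D):
    """Sum of branch lengths S_ij when taxa i and j are joined as neighbors."""
    others = [t for t in taxa if t not in (i, j)]
    n = len(taxa)
    to_pair = sum(D[i][k] + D[j][k] for k in others) / (2 * (n - 2))
    among_rest = sum(D[a][b] for a, b in combinations(others, 2)) / (n - 2)
    return to_pair + D[i][j] / 2 + among_rest

taxa = ["A", "B", "C", "D"]
# Hypothetical additive distances from the assumed tree ((A,B),(C,D)).
pairs = {("A", "B"): 3, ("A", "C"): 9, ("A", "D"): 10,
         ("B", "C"): 10, ("B", "D"): 11, ("C", "D"): 7}
D = {t: {} for t in taxa}
for (a, b), dist_ab in pairs.items():
    D[a][b] = D[b][a] = dist_ab

scores = {p: nj_sum(*p, taxa, D) for p in combinations(taxa, 2)}
best = min(scores, key=scores.get)
print(scores)
print(best)
```

The pair with the smallest S is joined first. Here the true neighbors (A, B) and (C, D) both score 15.0, the total length of the assumed tree, while the other pairings score 17.5.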

Maximum Likelihood Approaches
- Build the tree that maximizes the likelihood, combining the substitution probabilities computed at each site.
- The most widely used method in recent years
- Explained again after the character-based methods


Chap. 5. Character-based Method of Phylogenetics


Campbell Biology Chap. 26 - PARSIMONY Method

Three phylogenetic hypotheses:

[Figure: the three possible trees for Species I, Species II, and Species III]

Fig. 26-15-2

[Figure: table of sites 1 to 4 for Species I, II, III and the ancestral sequence; the change at site 1 (1/C) is mapped onto each of the three trees]

Fig. 26-15-3

[Figure: same table and trees, with the changes at sites 2 (2/T), 3 (3/A), and 4 (4/C) also mapped onto each tree]

Fig. 26-15-4

[Figure: same trees with all changes mapped; totals for the three trees: 7 events, 7 events, 6 events]


Parsimony: Parsimony is a non-parametric statistical method commonly used in computational phylogenetics for estimating phylogenies. Under parsimony, the preferred phylogenetic tree is the tree that requires the least evolutionary change to explain some observed data.
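
One common way to count "the least evolutionary change" for a single character on a given tree is Fitch's algorithm. The Python sketch below is a minimal version of it for unordered characters on a rooted binary tree; it is offered as an illustration and is not necessarily the counting procedure used in the lecture figures. The function name `fitch_changes`, the taxa, and the character states are hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree is a leaf name or a 2-tuple of subtrees; states maps leaf -> state.
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (state_set(child) for child in node)
        if left & right:
            return left & right
        changes += 1          # no shared state: one change is required here
        return left | right

    state_set(tree)
    return changes

# One hypothetical site scored on two competing trees.
site = {"I": "C", "II": "C", "III": "T", "IV": "T"}
tree_1 = (("I", "II"), ("III", "IV"))
tree_2 = (("I", "III"), ("II", "IV"))
print(fitch_changes(tree_1, site), fitch_changes(tree_2, site))
```

Summing this count over all sites gives the length of a tree in steps, and the tree with the smallest total is preferred.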

STRATEGIES FOR FASTER SEARCH

1) Exhaustive search


2) Branch-and-bound method

[Figure: branch-and-bound search tree whose partial and complete trees are labeled with their lengths, from 5 steps up to 17 steps]

Calculation of 20 taxa: about 10^21 trees


3) Heuristic search

- Build an arbitrary tree and, starting from it, search for progressively shorter trees.

- In practice, the most widely used method

Branch swapping


Character-based method: Maximum Parsimony (MP)

When the character is ordered, we can build a phylogenetic tree using the following method
- ordered: 0 1 2

An example comparing the character-based method and the distance-based method

Comparison of the two methods for phylogenetic inference with ordered characters. C.f.: “Phenetics vs. Phylogenetics” in Rordford et al.

1) Make a distance matrix

2) Choose the taxon at minimum distance from ANC

c.f.) HTU: Hypothetical Taxonomic Unit

3) Set HTU1 and find the taxon at minimum distance from HTU1

4) Reconstruct the character states of HTU1 and add HTU1 to the matrix and the distance table

5) Set HTU2, find the taxon at minimum distance again, and repeat 3) and 4).

6) Now we have a tree and can reconstruct character evolution. There are two different optimization methods:

DELTRAN (DElayed TRANsformation) optimization: character changes are placed toward the terminals of the tree.

ACCTRAN (ACCelerated TRANsformation) optimization: character changes are placed toward the base of the tree rather than the terminals.
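
Steps 1) and 2) of this procedure can be sketched in Python as follows. The slide's character matrix was an image and is not preserved, so the matrix below is hypothetical. The sketch also assumes that the distance between two taxa is the Manhattan distance (the sum of absolute state differences), which is a common choice for ordered characters; the lecture may have used a different measure. ANC is taken to be an ancestor with state 0 for every character.

```python
def manhattan(a, b):
    """Sum of absolute differences between two ordered-character vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical matrix of four ordered characters (states 0, 1, 2).
matrix = {
    "ANC": [0, 0, 0, 0],
    "X":   [1, 0, 0, 1],
    "Y":   [1, 2, 1, 0],
    "Z":   [2, 2, 1, 1],
}

# Step 1: distance of every taxon from ANC.
dist_from_anc = {
    taxon: manhattan(matrix["ANC"], row)
    for taxon, row in matrix.items()
    if taxon != "ANC"
}
# Step 2: the taxon at minimum distance from ANC is connected first.
closest = min(dist_from_anc, key=dist_from_anc.get)
print(dist_from_anc, closest)
```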


Distance-based Method: neighbor joining (NJ), UPGMA…fast and easy

1) Calculate a coefficient of overall similarity, whichever method is used.

2) Choose the most similar pair and link them


3) Combine the two taxa and name the result “P”.

4) Calculate the similarity between P and the others

5) Repeat 2), 3), and 4).

Now we have trees from both theories, phenetics and cladistics (phylogenetics). Note: the two trees may have different topologies.


Remember! Consensus Tree
- Strict consensus tree
- Semistrict consensus tree
- 50% majority rule consensus tree
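
The difference between a strict and a 50% majority rule consensus can be sketched in Python by writing each tree as the set of clades it contains. The three input trees and the function name `clade_support` are hypothetical, and the semistrict consensus is not covered by this sketch.

```python
from collections import Counter

def clade_support(trees):
    """Fraction of trees in which each clade (a frozenset of taxa) appears."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}

# Three hypothetical trees on taxa A-E, each given as its non-trivial clades.
trees = [
    {frozenset("AB"), frozenset("ABC"), frozenset("DE")},
    {frozenset("AB"), frozenset("ABC"), frozenset("DE")},
    {frozenset("AC"), frozenset("ABC"), frozenset("DE")},
]
support = clade_support(trees)

# Strict consensus keeps clades found in every tree.
strict = {c for c, f in support.items() if f == 1.0}
# 50% majority rule keeps clades found in more than half of the trees.
majority = {c for c, f in support.items() if f > 0.5}
print(sorted("".join(sorted(c)) for c in strict))
print(sorted("".join(sorted(c)) for c in majority))
```

In this example the clade (A, B) appears in two of the three trees, so it is kept by the majority rule consensus but collapses into a polytomy in the strict consensus.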


Tree confidence: values that show how accurate the clades of a tree are
1) Bootstrapping
2) Jackknifing
3) Decay analysis

Bootstrapping: making resampled pseudo-matrix
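
Building one resampled pseudo-matrix can be sketched in Python as follows: columns (sites) of the alignment are drawn at random with replacement until the pseudo-matrix has as many columns as the original. The function name `bootstrap_matrix` is illustrative, and the small alignment is assembled from the six example sites used earlier in these slides. A real analysis would repeat this many times, build a tree from each pseudo-matrix, and report how often each clade is recovered.

```python
import random

def bootstrap_matrix(alignment, rng):
    """Resample alignment columns with replacement to build one pseudo-matrix."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}

# Taxa A-E with six sites each (illustrative data).
alignment = {
    "A": "TCACCT",
    "B": "TAACCG",
    "C": "TAATCG",
    "D": "TAATAG",
    "E": "TATTAG",
}
pseudo = bootstrap_matrix(alignment, random.Random(0))
for taxon, seq in pseudo.items():
    print(taxon, seq)
```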


Tree confidence
- Bootstrapping
- Jackknifing
- Decay analysis

[Figure: strict consensus tree of Magnoliaceae taxa (Michelia, Manglietia, Kmeria, Pachylarnax, Liriodendron, and others), used to illustrate decay analysis. Number of trees by length: 256 steps: 1 tree; 257 steps: 5 trees; 258 steps: 83 trees; 259 steps: 345 trees; … Recognize the “collapsed” nodes in the strict consensus of successively longer trees: D1, D2, D3, …]


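Maximum likelihood method:
- A method grounded in statistics
- For each site, the probabilities of all the ways the base could have evolved are summed; these per-site values are combined over all sites, and the single hypothesis that maximizes the combined value is selected.
- Proposed as a method that can reduce long branch attraction.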


Long Branch Attraction (see Sangtae Kim (김상태) 2010, Basal-most angiosperm review)

[Figure: true phylogeny of A, B, and C: 21 steps, with homoplasious characters (parallelisms) marked. Tree generated by parsimony analysis: 20 steps.]


Who can minimize long-branch attraction?


• How can we get good alignment?

Good alignment → Good phylogenetic tree

1) No (or minimum) manual adjustment
2) Start from the alignment of each group
3) Hierarchical addition of subgroups using “profile alignment” in CLUSTALX
4) Re-arrange the input order of sequences based on the phylogenetic analysis
5) Repeat steps 2), 3), and 4) several times
6) Compare the results of alignments made with different gap penalty values.


http://www.youtube.com/watch?v=H6IrUUDboZo

• Tree of Life Project (ToL): an international consortium that compiles and organizes the phylogenetic relationships among the world's species (http://tolweb.org/tree/)

• KToL: Korean Tree of Life