Recombination Histories & Global Pedigrees
Acknowledgements Yun Song - Rune Lyngsø - Mike Steel
Finding Minimal Recombination Histories
[Figure: three trees, each on leaves 1 2 3 4]
Global Pedigrees
Finding Common Ancestors
[Figure: time axis label: NOW]
Basic Evolutionary Events
• Recombination
• Gene Conversion
• Coalescent/Duplication
• Mutation
Infinite site assumption ?
Hudson & Kaplan’s $R_M$
If $R_M$ is equated with the expected number of recombinations, it could be used as an estimator. Unfortunately, $R_M$ is a gross underestimate of the real number of recombinations.
0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1
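A minimal sketch of how a Hudson & Kaplan style bound can be computed from a 0/1 matrix, assuming the infinite site assumption holds. Two columns are called incompatible when all four combinations 00, 01, 10, 11 occur in them, and the bound is taken as the largest number of non-overlapping intervals between incompatible columns. The function names `incompatible` and `hudson_kaplan_rm` are illustrative and do not come from the slides.

```python
from itertools import combinations

def incompatible(col_a, col_b):
    """Four-gamete test: True if all four combinations occur in the two columns."""
    return len(set(zip(col_a, col_b))) == 4

def hudson_kaplan_rm(rows):
    """Lower bound on the number of recombinations for equal-length 0/1 rows."""
    cols = list(zip(*rows))
    # Each incompatible pair (i, j) requires a recombination between sites i and j.
    intervals = [(i, j) for i, j in combinations(range(len(cols)), 2)
                 if incompatible(cols[i], cols[j])]
    # Greedy scan by right end point yields a maximum set of disjoint intervals.
    intervals.sort(key=lambda ij: ij[1])
    count, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:  # intervals may share an end site
            count += 1
            last_end = end
    return count

print(hudson_kaplan_rm(["00", "01", "10", "11"]))
```

Each row is passed as a string (or tuple) of 0/1 characters, one row per sequence, so a matrix like the one above can be supplied as nine strings of length 28 with the spaces removed.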
Local Inference of Recombinations
0000111
0001101
Four combinations: 00, 10, 01, 11
Incompatibility:
Myers-Griffiths (2002): the number of recombinations in a sample, $N_R$, the number of types, $N_T$, and the number of mutations, $N_M$, obey:
$N_R \ge N_T - N_M - 1$
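The bound can be evaluated directly from a 0/1 matrix. The sketch below assumes the infinite site assumption, so that every polymorphic column counts as exactly one mutation, and it floors the result at zero. The name `haplotype_bound` is illustrative, not taken from the slides.

```python
def haplotype_bound(rows):
    """Return max(0, N_T - N_M - 1) for equal-length 0/1 rows given as strings."""
    n_types = len(set(rows))  # N_T: number of distinct sequences
    # N_M: one mutation per polymorphic column (infinite site assumption)
    n_mutations = sum(1 for col in zip(*rows) if len(set(col)) > 1)
    return max(0, n_types - n_mutations - 1)

# Two columns 0011 and 0101 give four types from two mutations.
print(haplotype_bound(["00", "01", "10", "11"]))
```

With four types and two mutations the bound is 4 - 2 - 1 = 1, matching the single recombination needed to explain two incompatible columns.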
0011
0101
T . . . G
T . . . C
A . . . G
A . . . C
Recoding
• At most 1 mutation per column
• 0 ancestral state, 1 derived state
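The recoding step can be sketched as follows, assuming the ancestral state of every column is known (for example from an outgroup sequence, which is an assumption of this sketch). Columns that are monomorphic, or that show more than one derived state, are dropped here because they do not fit the "at most 1 mutation per column" rule. The name `recode` is illustrative.

```python
def recode(sequences, ancestral):
    """Recode aligned sequences to 0/1 strings: 0 = ancestral, 1 = derived."""
    keep = []
    for j, anc in enumerate(ancestral):
        states = {seq[j] for seq in sequences}
        # Keep columns with exactly two states, one of which is ancestral.
        if len(states) == 2 and anc in states:
            keep.append(j)
    return ["".join("0" if seq[j] == ancestral[j] else "1" for j in keep)
            for seq in sequences]

# Hypothetical three-site alignment with T and G ancestral at the outer sites.
print(recode(["TAG", "TAC", "AAG", "AAC"], "TAG"))
```

For this toy alignment the middle column is monomorphic and is dropped, leaving the four recoded sequences 00, 01, 10 and 11.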
Minimal Number of Recombinations
Last Local Tree Algorithm:
[Figure: Last Local Tree Algorithm; labels: L21, Data, 2, n, i-1, i, 1, Trees]
The Kreitman data (1983): 11 sequences, 3200bp, 43(28) recoded, 9 different
How many neighbors?
$\frac{(2n-2)!}{2^{n-1}(n-1)!}$
$\frac{n!\,(n-1)!}{2^{n-1}}$
$3n^2 - 13n + 14$
$\sim n^3$
Bi-partitions
How many local trees?
• Unrooted
• Coalescent
Metrics on Trees based on subtree transfers.
Pretending that the easy problem (unrooted) is the real problem (age-ordered) causes violations of the triangle inequality:
Tree topologies with age ordered internal nodes
Rooted tree topologies
Unrooted tree topologies
Trees including branch lengths
Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees
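The growth rates can be compared numerically. The sketch below uses the standard counts of leaf-labelled binary trees on n leaves: $(2n-5)!!$ unrooted, $(2n-3)!!$ rooted and $n!\,(n-1)!/2^{n-1}$ age-ordered topologies, together with the unrooted subtree-transfer neighbourhood size $2(n-3)(2n-7)$ of Allen & Steel (2001). The function names are illustrative.

```python
from math import factorial

def double_factorial(k):
    """k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def tree_counts(n):
    """Number of tree topologies on n labelled leaves, by tree type."""
    return {
        "unrooted": double_factorial(2 * n - 5),
        "rooted": double_factorial(2 * n - 3),
        "age_ordered": factorial(n) * factorial(n - 1) // 2 ** (n - 1),
    }

def unrooted_neighbours(n):
    """Unit-neighbourhood size for unrooted trees (Allen & Steel, 2001)."""
    return 2 * (n - 3) * (2 * n - 7)

for n in (4, 6, 8, 11):
    print(n, tree_counts(n), unrooted_neighbours(n))
```

The neighbourhood size is a quadratic polynomial in n, while all three tree counts grow faster than exponentially.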
Song (2003+)
Due to Yun Song
Tree Combinatorics and Neighborhoods
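$(2n-3)!! = \frac{(2n-2)!}{2^{n-1}(n-1)!}$
Allen & Steel (2001): $2(n-3)(2n-7)$
$3n^2 - 13n + 14$
$4(n-2)^2 - 2\sum_{m=1}^{n-2} \lfloor \log_2(m+1) \rfloor$
$\frac{n!\,(n-1)!}{2^{n-1}}$
$\frac{1}{3}\left(2n^3 - 3n^2 - 20n + 39\right)$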
Methods: # of rec events obtained
Hudson & Kaplan (1985): 5
Myers & Griffiths (2003): 6
Song & Hein (2004). Set theory based approach: 7
Song & Hein (2003). Tree scanning using DP: 7
Lyngsø, Song & Hein (2006). Massive acceleration using branch and bound algorithm: 7
Lyngsø, Song & Hein (2006). Minimal number of gene conversions (in prep.): 5-2/6-1
The Minimal Recombination History for the Kreitman Data
- recombination 27 ACs
[Figure: labels 0 1 2 3 4 5 6 7 8; values 1, 1, 4, 2, 5, 3, 1, 5, 5 (sum 27)]
The Griffiths-Ethier-Tavaré Recursions
No recombination: Infinite Site Assumption
Ancestral State Known
History Graph: Recursions Exist
No cycles
Possible Histories without Recombination for simple data example
+ recombination $3 \times 10^8$ ACs
Counting + Branch and Bound Algorithm
[Figure labels: ?, Exact length, Lower bound, Upper bound]
0: 3
1: 91
2: 1314
3: 8618
4: 30436
5: 62794
6: 78970
7: 63049
8: 32451
9: 10467
10: 1727
Sum: 289920
[Figure labels: k-recombination neighborhood, k]
minARGs: Recombination Events & Local Trees
True ARG
Reconstructed ARG
1 2 3 4 5
1 23 4 5
((1,2),(1,2,3))
((1,3),(1,2,3))
n=7, =10, =75
Minimal ARG
True ARG
0 4 Mb
Hudson-Kaplan
Myers-Griffiths
Song-Hein
n=8, =40
n=8, =15
Mutation information on only one side
Mutation information on both sides
Reconstructing global pedigrees: Superpedigrees
Steel and Hein, 2005
The gender-labeled pedigrees for all pairs define the global pedigree
k
Gender-unlabeled pedigrees don’t!!