Post on 21-Dec-2015
Genomic Rearrangements
CS 374 – Algorithms in Biology, Fall 2006
Nandhini N S
Motivation
Rearrangements are one of the keys to evolution.
Detecting dynamics between members of the same family.
An interesting combinatorial problem!
Everybody loves the Central Limit Theorem (or a variant).
Terminology
Possible rearrangements:
• Reversals
• Translocations
• Fission
• Fusion
Most parsimonious scenario.
Genomic distance.
Synteny blocks.
Describing the problem
Basically a reversal distance problem.
Given permutations π and σ (each permutation representing a gene order), find a series of reversals ρ1, ρ2, …, ρn such that π.ρ1.ρ2. … .ρn = σ and n (the genomic distance) is minimal.
“The most parsimonious scenario”.
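The problem statement above can be made concrete with a brute-force search. This is a minimal sketch, not the method used in practice: it assumes signed permutations (a reversal flips both the order and the signs of a segment), and the names `signed_reversals` and `reversal_distance` are illustrative. Breadth-first search finds the smallest n exactly, but it is only feasible for a handful of genes.

```python
from collections import deque


def signed_reversals(perm):
    """Yield every permutation reachable from perm by one signed reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            # Reverse the segment perm[i..j] and flip the sign of each gene in it.
            flipped = tuple(-x for x in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def reversal_distance(pi, sigma):
    """Smallest number of signed reversals turning pi into sigma (BFS)."""
    start, goal = tuple(pi), tuple(sigma)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return dist[current]
        for nxt in signed_reversals(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    raise ValueError("sigma is not a signed rearrangement of pi")


if __name__ == "__main__":
    print(reversal_distance([2, 1], [1, 2]))
```

The state space grows as 2^n · n!, which is why the later slides move to the breakpoint graph instead of exhaustive search.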
Putting it all together
Local alignments -> Synteny blocks -> Breakpoint graph -> Rearrangement scenario.
From Local Alignments to Synteny Blocks
A non-trivial issue!
False orthologs.
Micro-rearrangements.
Sequence similarities in non-coding regions.
Human and Mouse Synteny Blocks
GRIMM-Synteny algorithm
Obtain the anchors (use BLAST or BLAST-like techniques).
Form an anchor graph whose vertex set is the set of anchors.
GRIMM-Synteny algorithm, contd.
Connect two vertices in the anchor graph by an edge if the distance between them is smaller than the gap size G.
Determine the connected components of the anchor graph. Each connected component is called a cluster.
GRIMM-Synteny algorithm, contd.
Delete 'small' clusters (those whose length is below the minimum cluster size C).
GRIMM-Synteny algorithm, contd.
Determine cluster order and signs for each genome.
Output the strips in the resulting cluster order as synteny blocks.
GRIMM-Synteny algorithm, contd.
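The clustering steps above (edges below gap size G, connected components, deletion of small clusters) can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: each anchor is reduced to a hypothetical (x, y) coordinate pair, one coordinate per genome; the distance is taken to be Manhattan distance; and a cluster's length is measured as its span in the first genome. The order-and-sign step is not covered.

```python
def cluster_anchors(anchors, gap, min_size):
    """Group anchors into clusters and drop the small ones.

    anchors  : list of (x, y) positions, one coordinate per genome (assumed form)
    gap      : gap size G; anchors closer than this are joined by an edge
    min_size : minimum cluster size C; clusters spanning less are deleted
    """
    parent = list(range(len(anchors)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Connect every pair of anchors whose Manhattan distance is below the gap size.
    for i, (x1, y1) in enumerate(anchors):
        for j in range(i + 1, len(anchors)):
            x2, y2 = anchors[j]
            if abs(x1 - x2) + abs(y1 - y2) < gap:
                parent[find(i)] = find(j)

    # Each connected component of the anchor graph is a cluster.
    clusters = {}
    for i, anchor in enumerate(anchors):
        clusters.setdefault(find(i), []).append(anchor)

    # Keep only clusters whose span in the first genome reaches min_size.
    kept = []
    for members in clusters.values():
        xs = [x for x, _ in members]
        if max(xs) - min(xs) >= min_size:
            kept.append(sorted(members))
    return sorted(kept)


if __name__ == "__main__":
    demo = [(0, 0), (5, 6), (12, 11), (100, 300), (500, 40), (506, 35), (515, 28)]
    for cluster in cluster_anchors(demo, gap=20, min_size=10):
        print(cluster)
```

The pairwise loop is quadratic in the number of anchors; it keeps the sketch short, whereas a real anchor set would need a spatial index or a sweep over sorted coordinates.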
From Synteny Blocks to the Breakpoint Graph
From Breakpoint Graph to Rearrangement Scenarios
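b(π) - c(π) + h(π) <= d(π) <= b(π) - c(π) + h(π) + 1
Here d(π) is the reversal distance, b(π) the number of breakpoints, c(π) the number of cycles in the breakpoint graph, and h(π) the number of hurdles.
“Efficient sorting of genomic permutations by translocation, inversion and block interchange”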
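The b(π) - c(π) part of the bound can be computed directly from a signed permutation. The sketch below is illustrative and makes two assumptions: it uses the standard doubling construction (gene +i becomes the pair 2i-1, 2i and gene -i becomes 2i, 2i-1, framed by 0 and 2n+1), and it counts all cycles, including the trivial ones formed by adjacencies, so that b(π) - c(π) equals n + 1 minus the total cycle count. Hurdles are not computed, so the result is only a lower bound on d(π). The function names are hypothetical.

```python
def count_cycles(perm):
    """Count cycles (trivial ones included) in the breakpoint graph of a signed permutation."""
    n = len(perm)
    # Doubled, unsigned representation framed by 0 and 2n + 1.
    seq = [0]
    for gene in perm:
        if gene > 0:
            seq += [2 * gene - 1, 2 * gene]
        else:
            seq += [-2 * gene, -2 * gene - 1]
    seq.append(2 * n + 1)

    # Black edges join neighbours at positions (2k, 2k + 1) of the doubled sequence.
    black = {}
    for k in range(0, len(seq), 2):
        a, b = seq[k], seq[k + 1]
        black[a], black[b] = b, a

    # Gray edges join the values 2i and 2i + 1, i.e. v and v ^ 1.
    seen = set()
    cycles = 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]      # follow the black edge
            seen.add(w)
            v = w ^ 1         # then follow the gray edge
    return cycles


def cycle_lower_bound(perm):
    """Return n + 1 - (number of cycles), which equals b - c and bounds d from below."""
    return len(perm) + 1 - count_cycles(perm)


if __name__ == "__main__":
    print(cycle_lower_bound([2, 1]))
```

For the permutation (+2, +1) this bound is 2 while the true reversal distance is 3; the gap is the hurdle term h(π) in the inequality.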
Reconstructing contiguous regions of an ancestral genome.
Reconstructing regions of an ancestral genome
Segmenting genomes based on pairwise alignments.
Nets -> Orthology Blocks -> Conserved Segments.
Nets to Orthology Blocks to Conserved Segments
First determine alignments
Then the orthology blocks
And then come the conserved segments.
Methodology
Predicting contiguous ancestral regions (CARs) from modern alignments.
Identification of small inversions.
Properties of breakpoints.
Inferring CARs.
Consider..
Sundry Details - Small Inversions.
For ambiguous cases, go with the human data (the best documented so far).
A Sanity Check
Define a genome and follow it through its evolution!
Imagine a genome π with n elements that evolves through a series of rearrangements.
Works! 90.8% of adjacencies predicted in the Boreoeutherian ancestor are correct!
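One way to score such a check is to compare the adjacencies of a predicted gene order with those of the known simulated ancestor. The sketch below is an assumption about how the comparison could be set up, not the procedure behind the 90.8% figure: genomes are signed gene orders on a single chromosome, an adjacency (a, b) is treated as identical to its reverse-strand reading (-b, -a), and the names `adjacencies` and `fraction_correct` are hypothetical.

```python
def adjacencies(order):
    """Return the set of adjacencies of a signed gene order, strand-normalised."""
    result = set()
    for a, b in zip(order, order[1:]):
        # (a, b) read on the other strand is (-b, -a); keep one canonical form.
        result.add(min((a, b), (-b, -a)))
    return result


def fraction_correct(predicted, true):
    """Fraction of predicted adjacencies that also occur in the true order."""
    pred = adjacencies(predicted)
    if not pred:
        return 1.0
    return len(pred & adjacencies(true)) / len(pred)


if __name__ == "__main__":
    true_ancestor = [1, 2, 3, 4, 5]
    predicted_ancestor = [1, 2, -4, -3, 5]
    print(fraction_correct(predicted_ancestor, true_ancestor))
```

In the usage example the inverted segment (-4, -3) keeps its internal adjacency but breaks the two adjacencies at its ends, so two of the four predicted adjacencies match.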
More realism!
Employed a realistic evolutionary tree with branch lengths based on substitution frequencies.
Rearrangements: 90% inversions, 5% translocations, 3.75% fusions, 1.25% fissions.
Modeled block length with a γ distribution, with shape and scale parameters α = 0.7 and θ = 500.
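The simulation parameters above can be sampled with the Python standard library. This is a minimal sketch of the sampling step only; the evolutionary tree and the rearrangements themselves are not modeled, and the function names and the fixed seed are illustrative choices. `random.gammavariate(alpha, beta)` takes the shape as its first argument and the scale as its second.

```python
import random

# Event mix taken from the slide (percentages sum to 100).
REARRANGEMENT_WEIGHTS = {
    "inversion": 90.0,
    "translocation": 5.0,
    "fusion": 3.75,
    "fission": 1.25,
}
GAMMA_SHAPE = 0.7    # alpha
GAMMA_SCALE = 500.0  # theta


def sample_event_types(count, rng):
    """Draw rearrangement types according to the slide's percentages."""
    kinds = list(REARRANGEMENT_WEIGHTS)
    weights = list(REARRANGEMENT_WEIGHTS.values())
    return rng.choices(kinds, weights=weights, k=count)


def sample_block_lengths(count, rng):
    """Draw block lengths from a gamma distribution with the given shape and scale."""
    return [rng.gammavariate(GAMMA_SHAPE, GAMMA_SCALE) for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run is repeatable
    events = sample_event_types(10, rng)
    lengths = sample_block_lengths(10, rng)
    print(events)
    print([round(length, 1) for length in lengths])
```

With α = 0.7 and θ = 500 the mean block length is αθ = 350, and a shape below 1 makes the distribution heavily skewed toward short blocks.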
Comparison with other reconstructions
Details
More data needed.
Looking for better-sequenced outgroups.
Require improvements in handling large duplications and deletions.
Modeling gene conversion, and the expansion and contraction of short tandem repeats caused by strand slippage.
Eventually: nucleotide resolution.
Inferring CARs
Thank you