genomic rearrangements cs 374 – algorithms in biology fall 2006 nandhini n s

Post on 21-Dec-2015


Genomic Rearrangements

CS 374 – Algorithms in Biology, Fall 2006

Nandhini N S

Motivation

• One of the keys to evolution.
• Detecting dynamics between members of the same family.
• An interesting combinatorial problem!
• Everybody loves the Central Limit Theorem (or a variant).

Terminology

Possible rearrangements:
• Reversals
• Translocations
• Fission
• Fusion

Most parsimonious scenario. Genomic distance. Synteny blocks.

Describing the problem

Basically a reversal distance problem.

Given permutations π and σ (each permutation representing a gene order), find a series of reversals ρ1, ρ2, …, ρn such that π·ρ1·ρ2·…·ρn = σ and n (the genomic distance) is minimum.

“The most parsimonious scenario”.
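The problem statement above can be made concrete with a brute-force search. The sketch below is not the method used in the lecture; it is a breadth-first search over all reversals, assuming signed permutations (a sign per gene for its orientation, which the slide does not state explicitly). The names `reversals` and `reversal_distance` are illustrative. The search space grows factorially, so this only works for very small n.

```python
from collections import deque


def reversals(perm):
    """Yield every signed permutation reachable from perm by one reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            # Reverse the segment perm[i..j] and flip the sign of each gene in it.
            flipped = tuple(-g for g in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def reversal_distance(pi, sigma):
    """Minimum number of reversals turning pi into sigma (breadth-first search)."""
    pi, sigma = tuple(pi), tuple(sigma)
    dist = {pi: 0}
    queue = deque([pi])
    while queue:
        current = queue.popleft()
        if current == sigma:
            return dist[current]
        for nxt in reversals(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None  # reached only if pi and sigma are not over the same genes
```

For example, `reversal_distance((2, 1), (1, 2))` explores the eight signed arrangements of two genes and reports that three reversals are needed.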

Putting it all together

Local Alignments -> Synteny Blocks -> Breakpoint Graph -> Rearrangement Scenario.

From Local Alignments to Synteny Blocks

A non-trivial issue!
• False orthologs.
• Micro-rearrangements.
• Sequence similarities in non-coding regions.

Human and Mouse Synteny Blocks

Grimm Synteny algorithm

Form an anchor graph whose vertex set is the set of anchors.

Obtaining the anchors: use BLAST or BLAST-like techniques.

Grimm Synteny algorithm, contd.

Connect vertices in the anchor graph by an edge if the distance between them is smaller than the gap size G.

Determine the connected components of the anchor graph. Each connected component is called a cluster.

Grimm Synteny algorithm, contd.

Delete ‘small’ clusters (shorter than the minimum cluster size C in length).

Grimm Synteny algorithm, contd.

Determine cluster order and signs for each genome.

Output the strips in the resulting cluster order as synteny blocks.
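The clustering steps above can be sketched in a few lines. This is a minimal illustration, not the GRIMM-Synteny implementation. It makes the following assumptions: each anchor is a point (x, y) giving its position in genome 1 and genome 2; distance is Manhattan distance; a cluster's length is its span along genome 1. The function names are hypothetical, and sign determination and strip merging are left out.

```python
def synteny_clusters(anchors, gap, min_size):
    """Single-linkage clustering of anchors, followed by removal of small clusters."""
    parent = list(range(len(anchors)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Connect two anchors by an edge if they are closer than the gap size G.
    for i, (x1, y1) in enumerate(anchors):
        for j in range(i + 1, len(anchors)):
            x2, y2 = anchors[j]
            if abs(x1 - x2) + abs(y1 - y2) < gap:
                parent[find(i)] = find(j)

    # Each connected component of the anchor graph is a cluster.
    clusters = {}
    for i, anchor in enumerate(anchors):
        clusters.setdefault(find(i), []).append(anchor)

    # Delete clusters shorter than the minimum cluster size C (assumed: span in genome 1).
    kept = []
    for members in clusters.values():
        xs = [x for x, _ in members]
        if max(xs) - min(xs) >= min_size:
            kept.append(sorted(members))
    return sorted(kept)


def cluster_orders(clusters):
    """Order of the clusters along genome 1 and along genome 2."""
    ids = range(len(clusters))
    order1 = sorted(ids, key=lambda k: min(x for x, _ in clusters[k]))
    order2 = sorted(ids, key=lambda k: min(y for _, y in clusters[k]))
    return order1, order2
```

The pairwise loop is quadratic in the number of anchors, which is fine for a toy example; real anchor sets would need a spatial index or a sweep over sorted coordinates.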

Grimm Synteny algorithm, contd.

From Synteny Blocks to the breakpoint graph

From Breakpoint Graph to Rearrangement Scenarios

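b(π) − c(π) + h(π) ≤ d(π) ≤ b(π) − c(π) + h(π) + 1

Here d(π) is the reversal distance, b(π) the number of breakpoints, c(π) the number of cycles in the breakpoint graph, and h(π) the number of hurdles.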
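As a rough illustration of the cycle term in this bound, the sketch below builds the breakpoint graph of a signed permutation of 1..n against the identity and returns n + 1 − c, where c counts all alternating cycles (adjacencies count as one-edge cycles, so this equals b(π) − c(π) with c restricted to the breakpoint cycles). Hurdles are not computed, so the result is only a lower bound on d(π). The function name `cycle_lower_bound` is an assumption made for this example.

```python
def cycle_lower_bound(perm):
    """Return n + 1 - c for a signed permutation of 1..n, a lower bound on d."""
    n = len(perm)
    # Replace +i by (2i-1, 2i) and -i by (2i, 2i-1); frame the result with 0 and 2n+1.
    seq = [0]
    for g in perm:
        seq += [2 * g - 1, 2 * g] if g > 0 else [-2 * g, -2 * g - 1]
    seq.append(2 * n + 1)

    # Black edges join neighbours in the permutation: positions (0,1), (2,3), ...
    black = {}
    for k in range(0, len(seq), 2):
        a, b = seq[k], seq[k + 1]
        black[a], black[b] = b, a

    # Gray edges join neighbours in the identity: values 2i and 2i+1, i.e. v and v ^ 1.
    seen = set()
    cycles = 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]   # follow the black edge
            seen.add(w)
            v = w ^ 1      # then follow the gray edge
    return n + 1 - cycles
```

For the permutation (+2 +1) this returns 2, while the true reversal distance is 3; the gap is the hurdle term h(π), which this sketch leaves out.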

“Efficient sorting of genomic permutations by translocation, inversion and block interchange”

Reconstructing contiguous regions of an ancestral genome.

Reconstructing regions of an ancestral genome

Segmenting genomes based on pair wise alignments.

Nets -> Orthology Blocks -> Conserved Segments.

Nets to Orthology Blocks to Conserved Segments

First determine alignments

Then the orthology blocks

And then come the conserved segments.

Methodology

Predicting contiguous ancestral regions (CARs) from modern alignments.

Identification of small inversions

Properties of breakpoints.

Inferring CARs.

Consider..

Sundry Details - Small Inversions.

For ambiguous cases, go with the human data (the best documented so far).

A Sanity Check

Define a genome, and follow it through its evolution!

Imagine a genome π with n elements that evolves through a series of rearrangements.

Works! 90.8% of adjacencies predicted in the Boreoeutherian ancestor are correct!
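A toy version of this kind of check can be sketched as follows. It is not the simulation used in the study: it assumes a single-chromosome genome written as a list of signed integers, applies only random reversals, and scores a prediction by the fraction of its adjacencies that also occur in the true genome. All function names are illustrative.

```python
import random


def random_reversal(genome, rng):
    """Reverse a random segment of the genome and flip the signs inside it."""
    i = rng.randrange(len(genome))
    j = rng.randrange(i, len(genome))
    return genome[:i] + [-g for g in reversed(genome[i:j + 1])] + genome[j + 1:]


def adjacencies(genome):
    """Set of signed adjacencies; (a, b) and (-b, -a) describe the same adjacency."""
    return {min((a, b), (-b, -a)) for a, b in zip(genome, genome[1:])}


def adjacency_accuracy(predicted, true):
    """Fraction of predicted adjacencies present in the true genome (needs >= 2 genes)."""
    pred = adjacencies(predicted)
    return len(pred & adjacencies(true)) / len(pred)
```

A usage sketch: start from an ancestor such as `list(range(1, 11))`, apply `random_reversal` a few times with `random.Random(0)` to obtain a descendant, and compare any reconstruction against the known ancestor with `adjacency_accuracy`.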

More realism!

Employed a realistic evolutionary tree with branch lengths based on substitution frequencies.

Rearrangements:
• 90% inversions
• 5% translocations
• 3.75% fusions
• 1.25% fissions

Modeled block length with a γ distribution, with shape and scale parameters α = 0.7 and θ = 500.
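The sampling parameters above translate directly into code. The sketch below only draws an operation type and a block length with the stated proportions and γ parameters; how the study applied them along the tree is not described on the slide. The names `OPERATION_WEIGHTS`, `sample_operation` and `sample_block_length` are assumptions for this example, and the unit of the block length is left unspecified because the slide does not give it.

```python
import random

# Rearrangement proportions from the slide, in percent.
OPERATION_WEIGHTS = {
    "inversion": 90.0,
    "translocation": 5.0,
    "fusion": 3.75,
    "fission": 1.25,
}


def sample_operation(rng):
    """Draw one rearrangement type according to the proportions above."""
    names = list(OPERATION_WEIGHTS)
    weights = [OPERATION_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def sample_block_length(rng, shape=0.7, scale=500.0):
    """Draw a block length from a gamma distribution (mean = shape * scale)."""
    return rng.gammavariate(shape, scale)
```

With α = 0.7 and θ = 500 the mean block length is 0.7 × 500 = 350, and because the shape is below 1 the distribution is heavily skewed toward short blocks.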

Comparison with other reconstructions

Details

• More data needed.
• Looking for better-sequenced outgroups.
• Require improvements in handling large duplications and deletions.
• Modeling gene conversion, and expansion and contraction of short tandem repeats caused by strand slippage.
• Eventually: nucleotide resolution.

Inferring CARs

Thank you
