sorting cancer karyotypes by elementary operations michal ozery-flato and ron shamir school of...
Post on 19-Dec-2015
Sorting Cancer Karyotypes by Elementary Operations
Michal Ozery-Flato and Ron Shamir
School of Computer Science, Tel Aviv University
Outline
• Introduction
• Modeling the evolution of cancer karyotypes
• The karyotype sorting problem
• Combinatorial Analysis
• Results
Introduction
Normal female karyotype
Source: http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
The "Philadelphia chromosome"
Breast cancer karyotype (MCF-7)
Source: http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
Chromosomal Instability
• A phenotype of most cancer cells
– Losses or gains of chromosomes result from errors during mitosis
– Chromosome rearrangements are associated with "double strand breaks"
[Figure: multi-polar mitoses]
Double Strand Breaks
• Constitute the most dangerous type of DNA damage
– A successful repair ligates two matching broken ends
– Mis-repair can result in rearrangements (e.g. translocations) or deletions
[Figure: M.C. Escher, 1953]
[Figure: double strand break]
The Challenge
Analyze the evolution of aberration events in cancer karyotypes
The Mitelman Database of Chromosome Aberrations in Cancer
• Over 55,000 cancer karyotypes, culled from over 8,000 scientific publications
• Can be parsed automatically (CyDAS parser, www.cydas.org)
• The largest current data resource on cancer genomes' organization
Modeling the Evolution of Cancer Karyotypes
The Normal Karyotype
• Band = basic unit observable in a karyotype: a unique region in the genome, identified by an integer
• Normal chromosome = an interval of bands
– Two normal chromosomes are either disjoint or equivalent
• Normal karyotype = a collection of normal chromosomes
– Usually contains two copies of each chromosome (with the possible exception of the sex chromosomes)
The Cancer Karyotype
• Fragment = a sub-interval (>1 bands) of a normal chromosome
• Chromosome =
– One fragment, or a concatenation of several fragments
– Orientation-less: [1,4]::[37,40] ≡ [40,37]::[4,1]
• Cancer karyotype = a collection of chromosomes
[Figure: a chromosome with a concatenation (breakpoint) marked]
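The orientation-less property can be illustrated with a minimal Python sketch. The representation is an assumption made for illustration only: a fragment is a pair (first band, last band), a chromosome is a tuple of fragments, and the helper names `reverse`, `canonical` and `same_chromosome` are hypothetical.

```python
from typing import Tuple

Fragment = Tuple[int, int]          # (first band, last band); first > last means reversed
Chromosome = Tuple[Fragment, ...]   # a concatenation of fragments

def reverse(chrom: Chromosome) -> Chromosome:
    """Read the chromosome from its other end: reverse fragment order and flip each fragment."""
    return tuple((b, a) for (a, b) in reversed(chrom))

def canonical(chrom: Chromosome) -> Chromosome:
    """Pick one fixed representative of the two readings of a chromosome."""
    return min(chrom, reverse(chrom))

def same_chromosome(x: Chromosome, y: Chromosome) -> bool:
    """Two chromosomes are equivalent if they agree up to orientation."""
    return canonical(x) == canonical(y)

# The slide's example: [1,4]::[37,40] and [40,37]::[4,1] are the same chromosome.
print(same_chromosome(((1, 4), (37, 40)), ((40, 37), (4, 1))))
```

Storing only the canonical form lets a karyotype be kept as a multiset of chromosomes without double counting the two readings.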
Elementary Operations
• Breakage
• Fusion
• Duplication
• Deletion
These operations can generate all known chromosomal aberrations!
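The four operations can be sketched in Python under a deliberately simplified model. This is an assumption for illustration, not the slides' fragment notation: a chromosome is written as an explicit tuple of band numbers, a karyotype is a list of chromosomes, and the constraint that a fragment spans more than one band is not enforced. All function names are hypothetical.

```python
from typing import List, Tuple

Chromosome = Tuple[int, ...]   # explicit sequence of bands (simplified model)
Karyotype = List[Chromosome]   # a collection of chromosomes

def breakage(k: Karyotype, i: int, pos: int) -> Karyotype:
    """Split chromosome i into two chromosomes, cutting before index pos."""
    chrom = k[i]
    if not 0 < pos < len(chrom):
        raise ValueError("break position must fall strictly inside the chromosome")
    return k[:i] + k[i + 1:] + [chrom[:pos], chrom[pos:]]

def fusion(k: Karyotype, i: int, j: int,
           flip_first: bool = False, flip_second: bool = False) -> Karyotype:
    """Concatenate chromosomes i and j; either may be flipped, since chromosomes are orientation-less."""
    if i == j:
        raise ValueError("fusion needs two distinct chromosomes")
    first = k[i][::-1] if flip_first else k[i]
    second = k[j][::-1] if flip_second else k[j]
    rest = [c for n, c in enumerate(k) if n not in (i, j)]
    return rest + [first + second]

def duplication(k: Karyotype, i: int) -> Karyotype:
    """Add a second copy of chromosome i."""
    return k + [k[i]]

def deletion(k: Karyotype, i: int) -> Karyotype:
    """Remove chromosome i."""
    return k[:i] + k[i + 1:]

normal = [(1, 2, 3, 4), (1, 2, 3, 4)]
step1 = breakage(normal, 0, 2)   # [(1, 2, 3, 4), (1, 2), (3, 4)]
step2 = deletion(step1, 2)       # [(1, 2, 3, 4), (1, 2)]
step3 = duplication(step2, 1)    # [(1, 2, 3, 4), (1, 2), (1, 2)]
```

Each function returns a new list, so a sequence of operations can be replayed or compared step by step.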
The Karyotype Sorting Problem
The Karyotype Sorting (KS) Problem
• Find a shortest sequence of elementary operations that transforms the normal karyotype into the given cancer karyotype
• Find the elementary distance = #operations in such a solution to KS
The Karyotype Sorting (KS) Problem (inverse formulation)
• Find a shortest sequence of inverse elementary operations that transforms the given cancer karyotype into the normal karyotype
Inverse Elementary Operations
• Breakage ↔ Fusion (each is the inverse of the other)
• Duplication, with inverse c-deletion (constrained deletion)
• Deletion, with inverse addition
Assumptions
• ~95% of the karyotypes in the Mitelman Database have no recurrent breakpoints
• Assumptions:
– The cancer karyotype contains no recurrent breakpoints
– Every added chromosome contains no breakpoints
Example: [20,39]::[12,1] has breakpoint ID = {39^0, 12^0}
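A check for recurrent breakpoints might look like the sketch below. It rests on stated assumptions rather than on the authors' definitions: a breakpoint is identified by the unordered pair of fragment ends that are joined, each end is written as (band, side) with the hypothetical side labels "lo" and "hi", and a breakpoint counts as recurrent when the same ID occurs more than once in the karyotype.

```python
from collections import Counter
from typing import List, Set, Tuple

Fragment = Tuple[int, int]          # (first band, last band); first > last means reversed
Chromosome = Tuple[Fragment, ...]
End = Tuple[int, str]               # (band, "lo" or "hi"); labels are an assumption
Breakpoint = Tuple[End, End]        # sorted pair, so the ID ignores orientation

def exit_end(frag: Fragment) -> End:
    """The band end exposed where the fragment finishes."""
    a, b = frag
    return (b, "hi" if a < b else "lo")

def entry_end(frag: Fragment) -> End:
    """The band end exposed where the fragment starts."""
    a, b = frag
    return (a, "lo" if a < b else "hi")

def breakpoints(chrom: Chromosome) -> List[Breakpoint]:
    """One breakpoint ID per concatenation point in the chromosome."""
    return [tuple(sorted((exit_end(left), entry_end(right))))
            for left, right in zip(chrom, chrom[1:])]

def recurrent_breakpoints(karyotype: List[Chromosome]) -> Set[Breakpoint]:
    """Breakpoint IDs that occur more than once across the karyotype."""
    counts = Counter(bp for chrom in karyotype for bp in breakpoints(chrom))
    return {bp for bp, n in counts.items() if n > 1}

# The slide's example chromosome has a single breakpoint joining bands 39 and 12.
print(breakpoints(((20, 39), (12, 1))))
```

Because the pair of ends is sorted, a chromosome and its reversed reading yield the same breakpoint ID.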
The Reduced Karyotype Sorting (RKS) Problem
• Assumptions ⇒ reduced problem:
– No breakpoints in the cancer karyotype (i.e. every chromosome is an interval)
– No breakpoints created by fusions / additions
– All the normal chromosomes are identical
• Operations: breakage, fusion, c-deletion, addition
[Figure: the normal karyotype (identical chromosomes) and the cancer karyotype, drawn as intervals over bands 0 to 11]
Combinatorial Analysis
(RKS Problem)
Extending the karyotypes
[Figure: the normal karyotype and the cancer karyotype, drawn over bands 0 to 11]
Parameter 1: f = #disjoint pairs of complementing interval ends
• Observation:
– Δf = -1 for fusion; Δf = 1 for breakage
– Δf ∈ {0,-1,-2} for c-deletion
– Δf ∈ {0,1,2} for addition
[Figure: example karyotype over bands 0 to 11 with f = 5]
The histogram
• Parameter 2: w = #bricks
• Observations:
– w is even
– Δw = 0 for breakage / fusion
– Δw ∈ {0,2} for addition / c-deletion
[Figure: the cancer karyotype over bands 0 to 11 and its histogram, showing a brick and a wall with 2 bricks]
Simple Bricks
• A brick is simple if:
– no lower brick (in the same wall), and
– no complementing interval ends
• Parameter 3: s = #simple bricks
• Observation:
– Δs ∈ {0,-1} for breakage
– Δs = 0 for constrained-deletion
– |Δs| ≤ 2 for addition
[Figure: simple bricks in a histogram over bands 0 to 11]
The Weighted Bipartite Graph of Bricks
• Parameter 4: m = the minimum weight of a perfect matching

weight      v-,v+: simple   v-,v+: non-simple   otherwise
v- < v+           2                 0               1
v+ < v-           0                 2               1

[Figure: positive bricks in a histogram over bands 0 to 11]
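The weight table and the parameter m can be sketched in Python as follows. Several things here are assumptions for illustration: each brick is reduced to a (position, is_simple) pair, "v- < v+" is read as a comparison of brick positions, positions are taken to be distinct, and the matching is found by brute force over all pairings, which is only practical for a handful of bricks. The function names are hypothetical.

```python
from itertools import permutations
from typing import List, Tuple

Brick = Tuple[int, bool]   # (position, is_simple); a simplified stand-in for a brick

def weight(neg: Brick, pos: Brick) -> int:
    """Edge weight between a negative brick v- and a positive brick v+, per the table."""
    (p_neg, s_neg), (p_pos, s_pos) = neg, pos
    if s_neg and s_pos:
        column = 0          # both simple
    elif not s_neg and not s_pos:
        column = 1          # both non-simple
    else:
        column = 2          # otherwise
    # Row "v- < v+" or row "v+ < v-"; equal positions are assumed not to occur.
    row = (2, 0, 1) if p_neg < p_pos else (0, 2, 1)
    return row[column]

def min_weight_perfect_matching(negs: List[Brick], poss: List[Brick]) -> int:
    """Minimum total weight over all perfect matchings (brute force)."""
    if len(negs) != len(poss):
        raise ValueError("a perfect matching needs equally many bricks on both sides")
    return min(sum(weight(n, p) for n, p in zip(negs, perm))
               for perm in permutations(poss))

m = min_weight_perfect_matching([(1, True), (5, True)], [(3, True), (0, True)])
```

For larger instances, an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the weight matrix would be the usual substitute for the brute-force search.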
Results
Main Theorem
• The elementary distance, d, satisfies:
w/2 + f + s + m - 2N ≤ d ≤ 3w/2 + f + s + m - 2N
N = #intervals in the normal karyotype
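As a small worked example of the bounds, the sketch below evaluates both sides of the inequality for given parameter values. The function name `distance_bounds` and the sample numbers are hypothetical; only the formula comes from the theorem.

```python
from typing import Tuple

def distance_bounds(w: int, f: int, s: int, m: int, n_normal: int) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the elementary distance d."""
    if w % 2:
        raise ValueError("w (the number of bricks) is expected to be even")
    shared = f + s + m - 2 * n_normal       # terms common to both bounds
    return w // 2 + shared, 3 * w // 2 + shared

# Hypothetical values: w = 2, f = 5, s = 0, m = 1, N = 2.
lower, upper = distance_bounds(2, 5, 0, 1, 2)
print(lower, upper)   # 3 5
```

The two bounds differ by exactly w, so they coincide whenever the histogram has no bricks.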
Results (2)
• Used the main theorem to devise a polynomial-time 3-approximation algorithm
• Combined with a greedy heuristic on real data (95% of Mitelman DB), optimal solutions were computed for 100% of the karyotypes
– 99.99% of cases: lower bound is achieved (hence the solution is optimal)
– 30 cases: lower bound + 2, but actually optimal (manual verification)
Summary
• A new framework for analyzing chromosomal aberrations in cancer
• A 3-approximation algorithm when there are no recurrent breakpoints
– 100% success on 57,252 karyotypes (with no recurrent breakpoints) from the Mitelman DB
• Future work: handle recurrent breakpoints
– Analyze the remaining 5% of the karyotypes in the Mitelman DB
Thank you for your attention.
Questions?