Greedy Algorithms and Genome Rearrangements
Bioinfo I (Institut Pasteur de Montevideo) Greedy Algorithms -class3- July 19th, 2011 1 / 75
Outline
Greedy example
Transforming Cabbage into Turnip
Genome Rearrangements
Sorting By Reversals
Pancake Flipping Problem
Greedy Algorithm for Sorting by Reversals
Approximation Algorithms
Introduction to Dynamic Programming
Greedy Example
Goal: Given a tree, find the longest path from the root to the leaves
[Figure: two panels, "Greedy approach" and "Actual path"]
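A minimal sketch of why the greedy choice can miss the longest path. The tree, its node names and its edge weights are hypothetical (the slide's figure is not available), and "longest" is assumed to mean the largest sum of edge weights.

```python
# Hypothetical weighted tree: each node maps to a list of (child, edge_weight).
TREE = {
    "root": [("a", 5), ("b", 3)],
    "a": [("c", 1), ("d", 2)],
    "b": [("e", 9), ("f", 4)],
    "c": [], "d": [], "e": [], "f": [],
}

def greedy_path_length(tree, node="root"):
    """Follow the heaviest outgoing edge at every node."""
    total = 0
    while tree[node]:
        node, weight = max(tree[node], key=lambda edge: edge[1])
        total += weight
    return total

def longest_path_length(tree, node="root"):
    """Explore every root-to-leaf path and keep the best total."""
    if not tree[node]:
        return 0
    return max(w + longest_path_length(tree, child) for child, w in tree[node])

print(greedy_path_length(TREE))   # 7  (root -> a -> d)
print(longest_path_length(TREE))  # 12 (root -> b -> e)
```

The greedy walk commits to the edge of weight 5 and never sees the edge of weight 9 hidden behind the lighter first step.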
Turnip vs. Cabbage: Look and Taste Different
Although cabbages and turnips share a recent common ancestor, they look and taste different
Turnip vs Cabbage: Comparing Gene Sequences Yields No Evolutionary Information
Turnip vs Cabbage: Almost Identical mtDNA Gene Sequences
In the 1980s Jeffrey Palmer studied the evolution of plant organelles by comparing the mitochondrial genomes of the cabbage and the turnip
99% similarity between genes
These almost identical gene sequences differed in gene order
This study helped pave the way to analyzing genome rearrangements in molecular evolution
Turnip vs Cabbage: Different mtDNA Gene Order
Gene order comparison:
Evolution is manifested as the divergence in gene order
Genome rearrangements
What are the similarity blocks and how to find them?
What is the architecture of the ancestral genome?
What is the evolutionary scenario for transforming one genome into the other?
Reversals
Blocks represent conserved genes
In the course of evolution or in a clinical context, blocks 1, ..., 10 could be misread as 1, 2, 3, −8, −7, −6, −5, −4, 9, 10
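The misreading above is a reversal of blocks 4 to 8 in which every reversed block also changes orientation. A minimal sketch, assuming a genome is a Python list of signed integers and that `signed_reversal` is a hypothetical helper name:

```python
def signed_reversal(blocks, i, j):
    """Reverse blocks i..j (1-based, inclusive) and flip the sign of each one."""
    segment = [-b for b in reversed(blocks[i - 1:j])]
    return blocks[:i - 1] + segment + blocks[j:]

genome = list(range(1, 11))            # blocks 1, ..., 10
print(signed_reversal(genome, 4, 8))
# [1, 2, 3, -8, -7, -6, -5, -4, 9, 10]
```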
Reversals and Breakpoints
Reversals: Example
Types of rearrangements
Comparative Genomic Architectures: Mouse vs Human Genome
Humans and mice have similar genomes, but their genes are ordered differently
245 rearrangements:
  Reversals
  Fusions
  Fissions
  Translocations
Comparative Genomic Architecture of Human and Mouse Genomes
To locate where the corresponding gene is in humans, we have to analyze the relative architecture of the human and mouse genomes
Reversals and Gene Orders
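Gene order is represented by a permutation π:
$\pi = \pi_1 \pi_2 \ldots \pi_{i-1} \pi_i \pi_{i+1} \ldots \pi_{j-1} \pi_j \ldots \pi_n$
Reversal ρ(i, j) reverses (flips) the elements from position i to position j in π:
$\pi * \rho(i, j) = \pi_1 \pi_2 \ldots \pi_{i-1} \pi_j \pi_{j-1} \ldots \pi_{i+1} \pi_i \ldots \pi_n$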
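The definition of ρ(i, j) translates almost directly into list slicing. A minimal sketch, assuming permutations are Python lists and positions are 1-based as on the slide; the function name `reversal` is illustrative:

```python
def reversal(pi, i, j):
    """Return pi * rho(i, j): a copy of pi with positions i..j (1-based) flipped."""
    return pi[:i - 1] + pi[i - 1:j][::-1] + pi[j:]

pi = [1, 2, 3, 6, 4, 5]
print(reversal(pi, 4, 5))  # [1, 2, 3, 4, 6, 5]
print(reversal(pi, 1, 6))  # [5, 4, 6, 3, 2, 1]
```

The input list is left untouched, so successive reversals can be chained or compared.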
Reversal Distance Problem
Goal: Given two permutations, find the shortest series of reversals that transforms one into the other
Input: Permutations π and σ
Output: A series of reversals ρ1, ..., ρt transforming π into σ, such that t is minimum
t: number of reversals in the series
d(π, σ): reversal distance between π and σ, i.e. the smallest possible value of t
Sorting By Reversals Problem
Goal: Given a permutation, find a shortest series of reversals that transforms it into the identity permutation (1 2 ... n)
Input: Permutation π
Output: A series of reversals ρ1, ..., ρt transforming π into the identity permutation, such that t is minimum
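For very short permutations the problem can be solved exactly by brute force. The sketch below is an assumption-laden illustration, not an algorithm from the lecture: it runs a breadth-first search over all permutations reachable by reversals, which is only feasible for small n because the search space has n! states.

```python
from collections import deque

def reversal_distance(pi):
    """Exact d(pi) by breadth-first search over all reversals (small n only)."""
    start, target = tuple(pi), tuple(sorted(pi))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        n = len(cur)
        for i in range(n):
            for j in range(i + 1, n):
                # Flip the segment cur[i..j] (0-based, inclusive).
                nxt = cur[:i] + cur[i:j + 1][::-1] + cur[j + 1:]
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)

print(reversal_distance([6, 1, 2, 3, 4, 5]))  # 2
print(reversal_distance([2, 3, 1, 4, 6, 5]))  # 3
```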
Sorting By Reversals: Example
t = d(π): reversal distance of π
Example:
π = 3 4 2 1 5 6 7 10 9 8
    4 3 2 1 5 6 7 10 9 8
    4 3 2 1 5 6 7 8 9 10
    1 2 3 4 5 6 7 8 9 10
→ So d(π) = 3
Sorting by reversals: 5 steps
→ d(π) ≤ 5
Sorting by reversals: 4 steps
→ d(π) ≤ 4
What is the reversal distance for this permutation?
Can it be sorted in 3 steps?
Pancake Flipping Problem
The chef is sloppy; he prepares an unordered stack of pancakes of different sizes
The waiter wants to rearrange them (so that the smallest winds up on top, and so on, down to the largest at the bottom)
He does it by flipping over several from the top, repeating this as many times as necessary
Pancake Flipping Problem: Formulation
Goal: Given a stack of n pancakes, what is the minimum number of flips to rearrange them into a perfect stack?
Input: Permutation π
Output: A series of prefix reversals ρ1, ..., ρt transforming π into the identity permutation, such that t is minimum
Pancake Flipping Problem: Greedy Algorithm
Greedy approach: at most 2 prefix reversals to place a pancake in its right position, at most 2n − 2 steps in total
William Gates and Christos Papadimitriou showed in the mid-1970s that this problem can be solved by at most 5/3 (n + 1) prefix reversals
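The greedy approach can be sketched as follows, assuming the stack is a Python list whose first element is the top pancake and whose values are the pancake sizes. `pancake_sort` is a hypothetical name, and this is the simple 2n − 2 strategy, not the Gates and Papadimitriou method.

```python
def pancake_sort(stack):
    """Greedy pancake sort; stack[0] is the top. Returns (sorted stack, flip sizes)."""
    stack = list(stack)
    flips = []

    def flip(k):
        """Reverse the top k pancakes (a prefix reversal)."""
        stack[:k] = stack[:k][::-1]
        flips.append(k)

    # Place the largest unsorted pancake at the bottom of the unsorted part.
    for size in range(len(stack), 1, -1):
        pos = stack.index(max(stack[:size]))
        if pos == size - 1:
            continue              # already in its final position
        if pos != 0:
            flip(pos + 1)         # first flip: bring it to the top
        flip(size)                # second flip: send it down to position `size`
    return stack, flips

print(pancake_sort([3, 1, 4, 2]))  # ([1, 2, 3, 4], [3, 4, 2, 3])
```

Each of the n − 1 loop iterations uses at most two flips, which gives the 2n − 2 bound stated above.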
Sorting By Reversals: A Greedy Algorithm
If sorting permutation π = 1 2 3 6 4 5, the first three elements are already in order, so it does not make any sense to break them.
The length of the already sorted prefix of π is denoted prefix(π) → prefix(π) = 3
This results in an idea for a greedy algorithm: increase prefix(π) at every step
Greedy Algorithm: An Example
Doing so, π can be sorted:
1 2 3 6 4 5
↓
1 2 3 4 6 5
↓
1 2 3 4 5 6
The number of steps to sort a permutation of length n is at most n − 1
Greedy Algorithm: Pseudocode
SimpleReversalSort(π)
  for i ← 1 to n − 1
    j ← position of element i in π (i.e., πj = i)
    if j ≠ i
      π ← π ∗ ρ(i, j)
      output π
    if π is the identity permutation
      return
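A minimal Python rendering of the pseudocode, assuming π is a list holding a permutation of 1..n. Instead of printing each intermediate permutation it collects them in a list, which is a convenience of this sketch and not part of the original.

```python
def simple_reversal_sort(pi):
    """Greedy sort: at step i, one reversal moves element i into position i."""
    pi = list(pi)
    steps = []
    for i in range(1, len(pi)):               # i = 1 .. n-1
        j = pi.index(i) + 1                   # 1-based position of element i
        if j != i:
            pi[i - 1:j] = pi[i - 1:j][::-1]   # apply rho(i, j)
            steps.append(list(pi))
        if pi == sorted(pi):                  # identity permutation reached
            break
    return steps

for step in simple_reversal_sort([1, 2, 3, 6, 4, 5]):
    print(step)
# [1, 2, 3, 4, 6, 5]
# [1, 2, 3, 4, 5, 6]
```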
Analyzing SimpleReversalSort
SimpleReversalSort does not guarantee the smallest number of reversals and takes five steps on π = 6 1 2 3 4 5:
Step 1: 1 6 2 3 4 5
Step 2: 1 2 6 3 4 5
Step 3: 1 2 3 6 4 5
Step 4: 1 2 3 4 6 5
Step 5: 1 2 3 4 5 6
But it can be sorted in two steps:
π = 6 1 2 3 4 5
Step 1: 5 4 3 2 1 6
Step 2: 1 2 3 4 5 6
So, SimpleReversalSort(π) is not optimal
Optimal algorithms are unknown for many problems → approximation algorithms are used
Approximation Algorithms
These algorithms find approximate solutions rather than optimal solutions
The approximation ratio of an algorithm A on input π is:
$A(\pi) / OPT(\pi)$
where
A(π): solution produced by algorithm A
OPT(π): optimal solution of the problem
Approximation Ratio/Performance Guarantee
Approximation ratio (performance guarantee) of algorithm A: the worst-case approximation ratio over all inputs of size n
For a minimization algorithm A, the performance guarantee is:
$\max_{|\pi| = n} A(\pi) / OPT(\pi)$
For a maximization algorithm:
$\min_{|\pi| = n} A(\pi) / OPT(\pi)$
Adjacencies and Breakpoints
Given $\pi = \pi_1 \pi_2 \pi_3 \ldots \pi_{n-1} \pi_n$
A pair of neighboring elements $\pi_i$ and $\pi_{i+1}$ is an adjacency (the two elements are consecutive) if
$\pi_{i+1} = \pi_i \pm 1$
For example:
π = 1 9 3 4 7 8 2 6 5
(3, 4), (7, 8) and (6, 5) are adjacencies
Breakpoints: An Example
There is a breakpoint between any pair of adjacent elements that are non-consecutive:
π = 1 9 3 4 7 8 2 6 5
Pairs (1, 9), (9, 3), (4, 7), (8, 2) and (2, 6) form breakpoints of permutation π
b(π): number of breakpoints in permutation π
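Listing breakpoints is a single pass over neighboring pairs. A minimal sketch, assuming the permutation is a Python list and no extension by 0 and n + 1 has been applied yet; the function names are illustrative.

```python
def breakpoints(pi):
    """Neighboring pairs of pi whose values are not consecutive."""
    return [(a, b) for a, b in zip(pi, pi[1:]) if abs(a - b) != 1]

def adjacencies(pi):
    """Neighboring pairs of pi whose values differ by exactly 1."""
    return [(a, b) for a, b in zip(pi, pi[1:]) if abs(a - b) == 1]

pi = [1, 9, 3, 4, 7, 8, 2, 6, 5]
print(breakpoints(pi))   # [(1, 9), (9, 3), (4, 7), (8, 2), (2, 6)]
print(adjacencies(pi))   # [(3, 4), (7, 8), (6, 5)]
```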
Adjacency & Breakpoints
An adjacency: a pair of adjacent elements that are consecutive
A breakpoint: a pair of adjacent elements that are not consecutive
Extending Permutations
We put two elements $\pi_0 = 0$ and $\pi_{n+1} = n + 1$ at the ends of π
Example:
Note: A new breakpoint was created after extending
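The effect of extending can be checked on the permutation 1 9 3 4 7 8 2 6 5 used earlier; whether the slide's missing figure used the same permutation is an assumption. The sketch counts breakpoints on the extended permutation, with `count_breakpoints` as a hypothetical helper name.

```python
def count_breakpoints(pi):
    """b(pi) for a permutation of 1..n, counted after adding 0 and n+1."""
    ext = [0] + list(pi) + [len(pi) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

pi = [1, 9, 3, 4, 7, 8, 2, 6, 5]
print(count_breakpoints(pi))  # 6
```

The five internal breakpoints remain, (0, 1) is an adjacency, and the pair (5, 10) at the right end is the new breakpoint.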
Reversal Distance and Breakpoints
Each reversal eliminates at most 2 breakpoints
→ This implies:
reversal distance ≥ #breakpoints / 2, i.e. $d(\pi) \geq b(\pi) / 2$
π = 2 3 1 4 6 5
0 | 2 3 | 1 | 4 | 6 5 | 7    b(π) = 5
0 1 | 3 2 | 4 | 6 5 | 7      b(π) = 4
0 1 2 3 4 | 6 5 | 7          b(π) = 2
0 1 2 3 4 5 6 7              b(π) = 0
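The trace above can be replayed to confirm both the breakpoint counts and the lower bound. In this sketch the three reversals ρ(1, 3), ρ(2, 3) and ρ(5, 6) are read off the rows of the example, and the helper names are illustrative.

```python
import math

def count_breakpoints(pi):
    """b(pi), counted on the permutation extended with 0 and n+1."""
    ext = [0] + list(pi) + [len(pi) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)

def reversal(pi, i, j):
    """Flip positions i..j (1-based, inclusive)."""
    return pi[:i - 1] + pi[i - 1:j][::-1] + pi[j:]

pi = [2, 3, 1, 4, 6, 5]
lower_bound = math.ceil(count_breakpoints(pi) / 2)
print(lower_bound)                       # 3
for i, j in [(1, 3), (2, 3), (5, 6)]:
    pi = reversal(pi, i, j)
    print(pi, count_breakpoints(pi))     # b drops to 4, then 2, then 0
```

Since three reversals sort π and the bound rules out fewer, d(π) = 3 for this permutation.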
Sorting By Reversals: A Better Greedy Algorithm
BreakPointReversalSort(π)
  while b(π) > 0
    Among all possible reversals, choose reversal ρ minimizing b(π ∗ ρ)
    π ← π ∗ ρ
    output π
  return
Problem: this algorithm may run forever, because when no reversal reduces b(π) the chosen ρ can leave b(π) unchanged and the loop may cycle
Strips
Strip: an interval between two consecutive breakpoints in a permutation
  Decreasing strip: strip of elements in decreasing order (e.g. 6 5 and 3 2)
  Increasing strip: strip of elements in increasing order (e.g. 7 8)
A single-element strip can be declared either increasing or decreasing. We will choose to declare them as decreasing, with the exception of the strips with 0 and n+1
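The strip decomposition can be sketched as a scan that cuts the extended permutation at every breakpoint. The permutation 1 4 6 5 7 8 3 2 is used because it contains the example strips 6 5, 3 2 and 7 8; the function name and the (elements, kind) output format are assumptions of this sketch.

```python
def strips(pi):
    """Split the extended permutation into (elements, kind) strips."""
    n = len(pi)
    ext = [0] + list(pi) + [n + 1]
    groups, current = [], [ext[0]]
    for a, b in zip(ext, ext[1:]):
        if abs(a - b) == 1:
            current.append(b)        # adjacency: the strip continues
        else:
            groups.append(current)   # breakpoint: close the strip
            current = [b]
    groups.append(current)

    result = []
    for g in groups:
        if len(g) == 1:
            # Singletons are decreasing, except the strips holding 0 or n+1.
            kind = "increasing" if g[0] in (0, n + 1) else "decreasing"
        else:
            kind = "increasing" if g[1] > g[0] else "decreasing"
        result.append((g, kind))
    return result

for elements, kind in strips([1, 4, 6, 5, 7, 8, 3, 2]):
    print(elements, kind)
```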
Reducing the Number of Breakpoints
Theorem 1:
If permutation π contains at least one decreasing strip, then there exists a reversal ρ which decreases the number of breakpoints (i.e. b(π ∗ ρ) < b(π))
Things To Consider
For π = 1 4 6 5 7 8 3 2
0 1 | 4 | 6 5 | 7 8 | 3 2 | 9    b(π) = 5
Choose the decreasing strip with the smallest element k in π (k = 2 in this case)
Find k − 1 in the permutation
Reverse the segment between k and k − 1:
0 1 | 4 | 6 5 | 7 8 | 3 2 | 9    b(π) = 5
0 1 2 3 | 8 7 | 5 6 | 4 | 9      b(π) = 4
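The three steps of the construction (pick k, find k − 1, reverse the segment between them) can be sketched in Python. This assumes π is a list holding a permutation of 1..n with at least one decreasing strip, and it uses the convention that single-element strips other than 0 and n + 1 count as decreasing; `theorem1_reversal` is a hypothetical name.

```python
def theorem1_reversal(pi):
    """Apply the breakpoint-reducing reversal of Theorem 1 and return the result."""
    n = len(pi)
    ext = [0] + list(pi) + [n + 1]

    # Collect every element that lies in a decreasing strip.
    decreasing = set()
    for p in range(1, n + 1):
        after_larger = ext[p - 1] - ext[p] == 1     # left neighbor is value + 1
        before_smaller = ext[p] - ext[p + 1] == 1   # right neighbor is value - 1
        single = abs(ext[p - 1] - ext[p]) != 1 and abs(ext[p] - ext[p + 1]) != 1
        if after_larger or before_smaller or single:
            decreasing.add(ext[p])

    k = min(decreasing)
    pos_k, pos_prev = ext.index(k), ext.index(k - 1)
    if pos_prev < pos_k:
        lo, hi = pos_prev + 1, pos_k     # bring k right after k - 1
    else:
        lo, hi = pos_k + 1, pos_prev     # bring k - 1 right after k
    ext[lo:hi + 1] = ext[lo:hi + 1][::-1]
    return ext[1:-1]

print(theorem1_reversal([1, 4, 6, 5, 7, 8, 3, 2]))
# [1, 2, 3, 8, 7, 5, 6, 4]
```

In both branches k and k − 1 end up next to each other, which removes at least one breakpoint more than the reversal can create.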
Reducing the number of breakpoints again
If there is no decreasing strip, there may be no reversal ρ that reduces the number of breakpoints (i.e. b(π ∗ ρ) ≥ b(π) for any reversal ρ)
By reversing an increasing strip (the number of breakpoints stays unchanged), we create a decreasing strip. Then the number of breakpoints will be reduced in the next step (Theorem 1).
There is no decreasing strip in π for:
ρ(6, 7) does not change the # of breakpoints
ρ(6, 7) creates a decreasing strip, thus guaranteeing that the next step will decrease the # of breakpoints.
ImprovedBreakpointReversalSort
ImprovedBreakpointReversalSort(π)
  while b(π) > 0
    if π has a decreasing strip
      Among all possible reversals, choose reversal ρ that minimizes b(π ∗ ρ)
    else
      Choose a reversal ρ that flips an increasing strip in π
    π ← π ∗ ρ
    output π
  return
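One way to turn the pseudocode into running Python is sketched below. It assumes π is a list holding a permutation of 1..n, tries all reversals exhaustively in the first branch, and in the second branch flips the first increasing strip that contains neither 0 nor n + 1; that choice of strip and all helper names are assumptions of this sketch.

```python
def extend(pi):
    return [0] + list(pi) + [len(pi) + 1]

def b(pi):
    """Number of breakpoints of the extended permutation."""
    ext = extend(pi)
    return sum(1 for x, y in zip(ext, ext[1:]) if abs(x - y) != 1)

def reversal(pi, i, j):
    """Flip positions i..j (1-based, inclusive)."""
    return pi[:i - 1] + pi[i - 1:j][::-1] + pi[j:]

def has_decreasing_strip(pi):
    ext = extend(pi)
    for p in range(1, len(pi) + 1):
        single = abs(ext[p - 1] - ext[p]) != 1 and abs(ext[p] - ext[p + 1]) != 1
        if single or ext[p] - ext[p + 1] == 1:
            return True
    return False

def first_inner_strip(pi):
    """Positions (i, j) of the first strip after the one containing 0."""
    ext = extend(pi)
    cuts = [p for p in range(len(ext) - 1) if abs(ext[p] - ext[p + 1]) != 1]
    return cuts[0] + 1, cuts[1]

def improved_breakpoint_reversal_sort(pi):
    pi = list(pi)
    steps = []
    while b(pi) > 0:
        if has_decreasing_strip(pi):
            n = len(pi)
            candidates = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]
            i, j = min(candidates, key=lambda ij: b(reversal(pi, *ij)))
        else:
            i, j = first_inner_strip(pi)   # flip an increasing strip
        pi = reversal(pi, i, j)
        steps.append(pi)
    return steps

steps = improved_breakpoint_reversal_sort([1, 4, 6, 5, 7, 8, 3, 2])
print(len(steps), steps[-1])
```

The exhaustive search over all reversals costs roughly O(n^3) per step, which is fine for illustration but not tuned for long permutations.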
ImprovedBreakpointReversalSort: Performance Guarantee
ImprovedBreakpointReversalSort is an approximation algorithm with a performance guarantee of at most 4
It eliminates at least one breakpoint in every two steps; at most 2b(π) steps
Approximation ratio: $\frac{2b(\pi)}{d(\pi)}$
Optimal algorithm eliminates at most 2 breakpoints in every step: $d(\pi) \geq \frac{b(\pi)}{2}$
Performance guarantee:
$\frac{2b(\pi)}{d(\pi)} \leq \frac{2b(\pi)}{b(\pi)/2} = 4$
Signed Permutations
Up to this point, all permutations to sort were unsigned
But genes have directions... so we should consider signed permutations
GRIMM Web Server
Real genome architectures are represented by signed permutations
Efficient algorithms to sort signed permutations have been developed
GRIMM web server computes the reversal distances between signed permutations
Dynamic Programming
Part I: Edit Distance
DNA Sequence Comparison: First Success Story
Finding sequence similarities with genes of known function is a common approach to infer a newly sequenced gene’s function
In 1984 Russell Doolittle and colleagues found similarities between a cancer-causing gene and the normal growth factor (PDGF) gene
Cystic Fibrosis
Cystic fibrosis (CF) is a chronic and frequently fatal genetic disease of the body’s mucus glands (abnormally high level of mucus in glands).
CF primarily affects the respiratory systems in children.
Mucus is a slimy material that coats many epithelial surfaces and is secreted into fluids such as saliva
Finding Similarities between the Cystic Fibrosis Gene and ATP binding proteins
In 1989 biologists found similarity between the cystic fibrosis gene and ATP binding proteins
ATP binding proteins are present on the cell membrane and act as transport channels
This suggests a plausible function for the cystic fibrosis gene, given the fact that CF involves sweat secretion with an abnormally high sodium level
Cystic Fibrosis: Finding the Gene
Cystic Fibrosis: Mutation Analysis
If a high % of cystic fibrosis (CF) patients have a certain mutation in the gene and the normal patients don’t, then that could be an indicator of a mutation that is related to CF
A certain mutation was found in 70% of CF patients, convincing evidence that it is a predominant genetic diagnostics marker for CF
Cystic Fibrosis and CFTR Gene
Cystic Fibrosis and the CFTR Protein
CFTR (Cystic Fibrosis Transmembrane conductance Regulator) protein is acting in the cell membrane of epithelial cells that secrete mucus.
These cells line the airways of the nose, lungs, the stomach wall, etc.
Mechanism of Cystic Fibrosis
The CFTR protein (1480 amino acids) regulates a chloride ion channel
Adjusts the "wateriness" of fluids secreted by the cell
Those with cystic fibrosis are missing one single amino acid in their CFTR
Mucus ends up being too thick, affecting many organs
Bring in the Bioinformaticians!!!
Gene similarities between two genes with known and unknown function alert biologists to some possibilities
Computing a similarity score between two genes tells how likely it is that they have similar functions
Dynamic programming is a technique for revealing similarities between genes
The Change Problem is a good problem to introduce the idea of dynamic programming
The Change Problem
Goal: Convert some amount of money M into given denominations, using the fewest possible number of coins
Input: An amount of money M, and an array of d denominations $c = (c_1, c_2, \ldots, c_d)$, in decreasing order of value ($c_1 > c_2 > \ldots > c_d$)
Output: A list of d integers $i_1, i_2, \ldots, i_d$ such that $c_1 i_1 + c_2 i_2 + \ldots + c_d i_d = M$ and $i_1 + i_2 + \ldots + i_d$ is minimal
Change Problem: Example
Given the denominations 1, 3, and 5, what is the minimum number of coins needed to make change for a given value?
Only one coin is needed to make change for the values 1, 3 and 5
However, two coins are needed to make change for the values 2, 4, 6, 8 and 10.
Lastly, three coins are needed to make change for the values 7 and 9
Change Problem: Recurrence
This example is expressed by the following recurrence relation:
$$\text{minNumCoins}(M) = \min \begin{cases} \text{minNumCoins}(M - 1) + 1 \\ \text{minNumCoins}(M - 3) + 1 \\ \text{minNumCoins}(M - 5) + 1 \end{cases}$$
Given the denominations c: $c_1, c_2, \ldots, c_d$, the recurrence relation is:
$$\text{minNumCoins}(M) = \min \begin{cases} \text{minNumCoins}(M - c_1) + 1 \\ \text{minNumCoins}(M - c_2) + 1 \\ \quad \vdots \\ \text{minNumCoins}(M - c_d) + 1 \end{cases}$$
Change Problem: A Recursive Algorithm
RecursiveChange(M, c, d)
  if M = 0
    return 0
  bestNumCoins ← inf
  for i ← 1 to d
    if M ≥ c_i
      numCoins ← RecursiveChange(M − c_i, c, d)
      if numCoins + 1 < bestNumCoins
        bestNumCoins ← numCoins + 1
  return bestNumCoins
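RecursiveChange Is Not Efficient
It recalculates the optimal coin combination for a given amount of money repeatedly
For example, with M = 77 and c = (1, 3, 7), the optimal coin combination for 70 cents is computed 10 times (once for each ordered way of paying out the first 7 cents), i.e. it is recomputed 9 times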
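The repeated work is easy to observe by instrumenting a direct Python version of RecursiveChange. In this sketch the parameter d is dropped because Python can iterate over c directly, the `calls` counter is an addition for illustration, and M = 20 is used instead of 77 so that the exponential recursion finishes quickly.

```python
import math
from collections import Counter

calls = Counter()   # number of times each amount is visited (illustration only)

def recursive_change(M, c):
    """Minimum number of coins with denominations c that add up to M."""
    calls[M] += 1
    if M == 0:
        return 0
    best_num_coins = math.inf
    for coin in c:
        if M >= coin:
            num_coins = recursive_change(M - coin, c)
            if num_coins + 1 < best_num_coins:
                best_num_coins = num_coins + 1
    return best_num_coins

print(recursive_change(20, (7, 3, 1)))  # 4  (7 + 7 + 3 + 3)
print(calls[13])                         # 10
```

The amount 13 = 20 − 7 is visited once for every ordered way of paying out 7 cents with coins 1, 3 and 7, and deeper amounts are visited far more often.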
We Can Do Better
We’re re-computing values in our algorithm more than once
Save results of each computation for 0 to M
This way, we can do a reference call to find an already computed value, instead of re-computing each time
Running time M ∗ d, where M is the value of money and d is the number of denominations
The Change Problem: Dynamic Programming
DPChange(M, c, d)
  bestNumCoins_0 ← 0
  for m ← 1 to M
    bestNumCoins_m ← inf
    for i ← 1 to d
      if m ≥ c_i
        if bestNumCoins_{m − c_i} + 1 < bestNumCoins_m
          bestNumCoins_m ← bestNumCoins_{m − c_i} + 1
  return bestNumCoins_M
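A direct Python version of DPChange, sketched under the assumption that a list indexed by amount plays the role of the bestNumCoins array and that the parameter d can be dropped because Python iterates over c directly.

```python
import math

def dp_change(M, c):
    """Minimum number of coins with denominations c that add up to M."""
    best_num_coins = [0] + [math.inf] * M     # best_num_coins[m] for m = 0..M
    for m in range(1, M + 1):
        for coin in c:
            if m >= coin and best_num_coins[m - coin] + 1 < best_num_coins[m]:
                best_num_coins[m] = best_num_coins[m - coin] + 1
    return best_num_coins[M]

print(dp_change(77, (7, 3, 1)))                         # 11
print([dp_change(m, (5, 3, 1)) for m in range(1, 11)])
# [1, 2, 1, 2, 1, 2, 3, 2, 3, 2]
```

Each amount from 1 to M is solved once using d comparisons, which gives the M ∗ d running time. The second line of output reproduces the coin counts for denominations 1, 3 and 5.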
DPChange: Example