CSE 326: Data Structures
Topic 18: The Algorhythmics
Luke McDowell
Summer Quarter 2003
Today’s Outline
• Greedy
• Divide & Conquer
• Dynamic Programming
Greedy Algorithms
Repeat until problem is solved:
– Consider possible next steps
– Choose best-looking alternative and commit to it
Greedy algorithms are normally fast and simple.
Sometimes appropriate as a heuristic solution or to approximate the optimal solution.
Greed in Action
The Grocery Bagging Problem
• You are an environmentally-conscious grocery bagger at QFC
• You would like to minimize the total number of bags needed to pack each customer’s items.
Items (mostly junk food): sizes s1, s2,…, sN (0 < si ≤ 1)
Grocery bags: size of each bag = 1
Optimal Grocery Bagging: An Example
• Example: Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3
– How many bags of size 1 are required?
• Can find optimal solution through exhaustive search
– Search all combinations of N items using 1 bag, 2 bags, etc.
– Runtime?
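The exhaustive search described above can be sketched as a backtracking search in C. This is an illustration only: the names `min_bags` and `fits`, the `MAX_ITEMS` limit, and the choice to measure sizes in integer tenths (so a bag holds 10) are assumptions made here, not part of the slides.

```c
#define CAPACITY 10    /* bag size 1.0 measured in tenths (assumption) */
#define MAX_ITEMS 32   /* hypothetical limit for this sketch */

/* Try to place items i..n-1 into k bags whose current loads are in load[]. */
static int fits(const int *sizes, int n, int i, int *load, int k) {
    if (i == n)
        return 1;                        /* every item has a bag */
    for (int b = 0; b < k; ++b) {
        if (load[b] + sizes[i] <= CAPACITY) {
            load[b] += sizes[i];         /* tentatively use bag b */
            if (fits(sizes, n, i + 1, load, k))
                return 1;
            load[b] -= sizes[i];         /* undo and try the next bag */
        }
    }
    return 0;
}

/* Smallest number of bags that holds all items, trying 1 bag, 2 bags, etc.
   Assumes every size is between 1 and CAPACITY. Returns -1 if n is too large. */
int min_bags(const int *sizes, int n) {
    int load[MAX_ITEMS];
    if (n > MAX_ITEMS)
        return -1;
    for (int k = 1; k <= n; ++k) {
        for (int b = 0; b < k; ++b)
            load[b] = 0;
        if (fits(sizes, n, 0, load, k))
            return k;
    }
    return n;                            /* only reached when n == 0 */
}
```

For the example items (5, 2, 7, 8, 4, 1, 3 in tenths) the sizes sum to exactly 3 bags' worth, and the search reports 3. In the worst case it tries every assignment of items to bags, so the running time grows exponentially with N.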
Bin Packing
• General problem: Given N items of sizes s1, s2,…, sN (0 < si ≤ 1), pack these items in the least number of bins of size 1.
• The general bin packing problem is NP-complete
– Reductions: All NP-problems → SAT → 3SAT → 3DM → PARTITION → Bin Packing (see Garey & Johnson, 1979)
Items: sizes s1, s2,…, sN (0 < si ≤ 1)
Bins: size of each bin = 1
Greedy Solution?
Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3
Another Greedy Solution?
Items = 0.5, 0.2, 0.7, 0.8, 0.4, 0.1, 0.3
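The slides leave the two greedy strategies open. As one possible pair, the sketch below shows "first fit" (put each item in the first bin with room) and "first fit decreasing" (sort largest-first, then first fit). Which strategies were intended in lecture is not stated, so treat both as assumptions. The function names and the use of integer tenths for sizes (bin capacity 10) are also choices made here to avoid floating-point rounding.

```c
#include <stdlib.h>
#include <string.h>

#define CAPACITY 10   /* bin size 1.0 measured in tenths (assumption) */

/* First fit: commit each item to the first open bin that still has room. */
int first_fit(const int *sizes, int n) {
    int *load = calloc((size_t)(n > 0 ? n : 1), sizeof *load);
    int bins = 0;
    if (!load)
        return -1;
    for (int i = 0; i < n; ++i) {
        int b = 0;
        while (b < bins && load[b] + sizes[i] > CAPACITY)
            ++b;                      /* skip bins that are too full */
        if (b == bins)
            ++bins;                   /* no open bin fits: open a new one */
        load[b] += sizes[i];
    }
    free(load);
    return bins;
}

/* qsort comparator for descending order. */
static int descending(const void *a, const void *b) {
    int p = *(const int *)a, q = *(const int *)b;
    return (q > p) - (q < p);
}

/* First fit decreasing: sort a copy largest-first, then run first fit. */
int first_fit_decreasing(const int *sizes, int n) {
    int *sorted = malloc((size_t)(n > 0 ? n : 1) * sizeof *sorted);
    int bins;
    if (!sorted)
        return -1;
    if (n > 0)
        memcpy(sorted, sizes, (size_t)n * sizeof *sorted);
    qsort(sorted, (size_t)n, sizeof *sorted, descending);
    bins = first_fit(sorted, n);
    free(sorted);
    return bins;
}
```

On the example items (5, 2, 7, 8, 4, 1, 3 in tenths), first fit uses 4 bins while first fit decreasing uses 3, which is optimal for this input. Neither strategy is guaranteed to be optimal in general.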
Divide & Conquer
• Divide problem into multiple smaller parts
• Solve smaller parts
– Solve base cases directly
– Otherwise, solve subproblems recursively
• Merge solutions together (Conquer!)
Often leads to elegant and simple recursive implementations.
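As an illustration of the divide, solve, and merge steps, here is a minimal merge sort sketch in C. Merge sort is not the example used on these slides (they use Fibonacci numbers next); it is chosen here only because its three phases map directly onto the bullets above. All function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted ranges a[lo..mid) and a[mid..hi) using tmp as scratch. */
static void merge(int *a, int *tmp, int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < hi)
        tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof *a);
}

static void sort_range(int *a, int *tmp, int lo, int hi) {
    if (hi - lo <= 1)
        return;                       /* base case: 0 or 1 element */
    int mid = lo + (hi - lo) / 2;     /* divide */
    sort_range(a, tmp, lo, mid);      /* solve left half recursively */
    sort_range(a, tmp, mid, hi);      /* solve right half recursively */
    merge(a, tmp, lo, mid, hi);       /* merge the two solutions */
}

/* Sorts a[0..n) in ascending order. Returns 0 on success, -1 if out of memory. */
int merge_sort(int *a, int n) {
    if (n <= 1)
        return 0;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp)
        return -1;
    sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```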
Divide & Conquer in Action
Fibonacci Numbers
F(n) = F(n - 1) + F(n - 2)
F(0) = 1
F(1) = 1
int fib(int n) {
  if (n <= 1)
    return 1;
  else
    return fib(n - 1) + fib(n - 2);
}
Divide & Conquer
[Figure: recursion tree for fib(6). F6 calls F5 and F4, F5 calls F4 and F3, and so on down to F1 and F0; subtrees such as F4, F3, and F2 appear more than once.]
Runtime: exponential in n, because the number of calls grows about as fast as F(n) itself
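One way to see the cost of the plain recursive version is to count how many times the function is called. The sketch below adds a counter to the slide's `fib`; the names `fib_counted`, `count_fib_calls`, and `calls` are hypothetical and exist only for this illustration.

```c
static long calls = 0;   /* hypothetical call counter, not on the slides */

/* Same recursion as fib() on the slide, plus one increment per call. */
int fib_counted(int n) {
    ++calls;
    if (n <= 1)
        return 1;
    return fib_counted(n - 1) + fib_counted(n - 2);
}

/* Number of calls made while computing fib(n). */
long count_fib_calls(int n) {
    calls = 0;
    fib_counted(n);
    return calls;
}
```

Computing fib(6) this way takes 25 calls, and fib(20) takes 21891. In general the count is 2·F(n) − 1, so it grows exponentially with n.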
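Fibonacci Numbers
F(n) = F(n - 1) + F(n - 2)
F(0) = 1
F(1) = 1
#define MAX_N 45  // assumed upper bound on n; F(45) still fits in a 32-bit int
int fib(int n) {
  // "static" array: zero-initialized once and kept across calls
  static int fibs[MAX_N + 1];
  if (n <= 1)
    return 1;
  if (fibs[n] == 0)  // 0 means "not computed yet"
    fibs[n] = fib(n - 1) + fib(n - 2);
  return fibs[n];
}
Memoized
Runtime: O(n), since each fibs[n] is computed only once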
Memoizing/Dynamic Programming
• Define problem in terms of smaller subproblems
• Solve and record solution for base cases
• Build solutions for subproblems up from solutions to smaller subproblems
Can improve runtime of divide & conquer algorithms that have shared subproblems with optimal substructure.
Usually involves a table of subproblem solutions.
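For the Fibonacci example, the three steps above might look like the following bottom-up sketch, which fills a table from the base cases upward instead of recursing. The name `fib_bottom_up` and the bound `FIB_MAX_N` are assumptions for this illustration; the bound is chosen so that every value fits in a 64-bit unsigned integer.

```c
#define FIB_MAX_N 90   /* hypothetical bound; keeps values within 64 bits */

/* F(0) = F(1) = 1, F(n) = F(n-1) + F(n-2), computed from the bottom up.
   Returns 0 if n is outside the range this sketch supports. */
unsigned long long fib_bottom_up(int n) {
    unsigned long long table[FIB_MAX_N + 1];
    if (n < 0 || n > FIB_MAX_N)
        return 0;
    table[0] = 1;                              /* base cases */
    table[1] = 1;
    for (int i = 2; i <= n; ++i)
        table[i] = table[i - 1] + table[i - 2];   /* reuse smaller answers */
    return table[n];
}
```

Each table entry is written once, so the loop runs in O(n) time.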
Dynamic Programming in Action
• Sequence Alignment
• Fibonacci numbers
• All pairs shortest path
• Optimal Binary Search Tree
• Matrix multiplication
Sequence Alignment
Biology 101:
DNA Sequence
• String using letters (nucleotides): A,C,G,T
For example: ACGGGCATTATCGTA
• DNA can mutate!
Change a letter: ACGGGCAT → ACGTGCAT
Insert a letter: ACGGGCAT → ACGGGCAAT
Delete a letter: ACGGGCAT → ACGGGAT
• A few mutations make sequences “different”, but “similar”
• Similar sequences often have similar functions
What is Sequence Alignment?
• Use underscores (_) or wildcards to match up 2 sequences
• The “best alignment” of 2 sequences is an alignment which minimizes the number of “underscores”
• For example: ACCCGTTT and TCCCTTT
Best alignment (3 underscores):
A_CCCGTTT
_TCCC_TTT
Solutions
Naïve solution
• Try all possible alignments
• Running time: exponential
Dynamic Programming Solution
• Create a table
• Table(x,y): # errors in best alignment for first x letters of string 1, and first y letters of string 2
• Running time: polynomial, O(xy) for strings of lengths x and y
Example Alignment
• Suppose we have already determined the best alignment for:
– First x letters of string1 with first y-1 letters of string2
– First x-1 letters of string1 with first y-1 letters of string2
– First x-1 letters of string1 with first y letters of string2
If (string1[x] == string2[y]) then TABLE[x,y] = TABLE[x-1,y-1]
Else TABLE[x,y] = min(1+TABLE[x,y-1], 1+TABLE[x-1,y])
Match ACCGTTAG with ACTGTTAA
(1) match ‘G’ with ‘_’: 1 + align(ACCGTTA,ACTGTTAA)
(2) match ‘A’ with ‘_’: 1 + align(ACCGTTAG,ACTGTTA)
Example GGCAT and TGCAA
          (empty)  G  G  C  A  T
(empty)      0     1  2  3  4  5
T            1
G            2
C            3
A            4
A            5
Example GGCAT and TGCAA
          (empty)  G  G  C  A  T
(empty)      0     1  2  3  4  5
T            1     2  3  4  5  4
G            2     1  2  3  4  5
C            3     2  3  2  3  4
A            4     3  4  3  2  3
A            5     4  5  4  3  4
T_GCAA_
_GGCA_T
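An alignment like the one above can be read off the finished table by walking back from the bottom-right corner to the top-left. The C sketch below fills the table and then performs that walk. The name `align_strings`, the flat array layout, and the tie-breaking rule are assumptions made here. When several best alignments exist, this sketch may print a different one than the slide, but with the same number of underscores.

```c
#include <stdlib.h>
#include <string.h>

/* Writes one best alignment of x and y into top and bottom (each must hold
   strlen(x) + strlen(y) + 1 bytes). Returns the number of underscores,
   or -1 if memory runs out. */
int align_strings(const char *x, const char *y, char *top, char *bottom) {
    int n = (int)strlen(x), m = (int)strlen(y);
    int w = m + 1;                       /* row width of the flat table */
    int *t = malloc((size_t)(n + 1) * (size_t)w * sizeof *t);
    if (!t)
        return -1;

    /* Row 0 and column 0 stand for the empty prefix. */
    for (int i = 0; i <= n; ++i) t[i * w] = i;
    for (int j = 0; j <= m; ++j) t[j] = j;
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            if (x[i - 1] == y[j - 1]) {
                t[i * w + j] = t[(i - 1) * w + j - 1];
            } else {
                int up = t[(i - 1) * w + j], left = t[i * w + j - 1];
                t[i * w + j] = (up < left ? up : left) + 1;
            }
        }
    }

    /* Walk back from the corner, producing the alignment in reverse. */
    int i = n, j = m, len = 0;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && x[i - 1] == y[j - 1]) {
            top[len] = x[--i]; bottom[len] = y[--j];   /* letters match */
        } else if (j == 0 || (i > 0 && t[(i - 1) * w + j] <= t[i * w + j - 1])) {
            top[len] = x[--i]; bottom[len] = '_';      /* gap in y */
        } else {
            top[len] = '_'; bottom[len] = y[--j];      /* gap in x */
        }
        ++len;
    }
    for (int a = 0, b = len - 1; a < b; ++a, --b) {    /* reverse both rows */
        char c = top[a]; top[a] = top[b]; top[b] = c;
        c = bottom[a]; bottom[a] = bottom[b]; bottom[b] = c;
    }
    top[len] = bottom[len] = '\0';

    int cost = t[n * w + m];
    free(t);
    return cost;
}
```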
Pseudocode (bottom-up)
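// x = length of X, y = length of Y; letters are numbered from 1
// Row 0 and column 0 stand for the empty prefix
int align(String X, String Y, TABLE[0..x,0..y]) {
  int i, j;
  // Initialize top row and leftmost column
  for (i = 0; i <= x; ++i) TABLE[i,0] = i;
  for (j = 0; j <= y; ++j) TABLE[0,j] = j;
  for (i = 1; i <= x; ++i) {
    for (j = 1; j <= y; ++j) {
      if (X[i] == Y[j])
        TABLE[i,j] = TABLE[i-1,j-1];
      else
        TABLE[i,j] = min(TABLE[i-1,j], TABLE[i,j-1]) + 1;
    }
  }
  return TABLE[x,y];
}
Runtime: O(xy), one constant-time step per table entry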
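Because each row of the table depends only on the row above it, a runnable C version can keep just two rows when only the final count is needed. This is a variation on the pseudocode above rather than a literal translation, and the name `align_two_rows` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Number of underscores in a best alignment of x and y, using two rows of
   the table instead of the full grid. Returns -1 if memory runs out. */
int align_two_rows(const char *x, const char *y) {
    size_t n = strlen(x), m = strlen(y);
    int *prev = malloc((m + 1) * sizeof *prev);   /* row i-1 */
    int *cur = malloc((m + 1) * sizeof *cur);     /* row i   */
    if (!prev || !cur) {
        free(prev);
        free(cur);
        return -1;
    }
    for (size_t j = 0; j <= m; ++j)
        prev[j] = (int)j;                 /* row for the empty prefix of x */
    for (size_t i = 1; i <= n; ++i) {
        cur[0] = (int)i;                  /* column for the empty prefix of y */
        for (size_t j = 1; j <= m; ++j) {
            if (x[i - 1] == y[j - 1])
                cur[j] = prev[j - 1];
            else
                cur[j] = (prev[j] < cur[j - 1] ? prev[j] : cur[j - 1]) + 1;
        }
        int *swap = prev;                 /* the finished row becomes "prev" */
        prev = cur;
        cur = swap;
    }
    int result = prev[m];
    free(prev);
    free(cur);
    return result;
}
```

The time is still proportional to the product of the two lengths, but the memory drops to two rows. The trade-off is that the alignment itself can no longer be recovered from the table.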
Pseudocode (top-down)
int align(String X, String Y, TABLE[0..x,0..y]) {
  // Base cases: an empty prefix needs one underscore per remaining letter
  if (x == 0) return y;
  if (y == 0) return x;
  Compute TABLE[x-1,y-1] if necessary
  Compute TABLE[x-1,y] if necessary
  Compute TABLE[x,y-1] if necessary
  if (X[x] == Y[y])
    TABLE[x,y] = TABLE[x-1,y-1];
  else
    TABLE[x,y] = min(TABLE[x-1,y], TABLE[x,y-1]) + 1;
  return TABLE[x,y];
}
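A memoized C version of the top-down idea might look like the sketch below. The "compute if necessary" steps become a check of a memo table in which -1 marks entries that have not been filled yet. The names `solve` and `align_top_down`, and the -1 marker, are assumptions made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Best alignment cost for the first i letters of x and first j letters of y.
   memo is a flat (n+1) x w table; -1 means "not computed yet". */
static int solve(const char *x, const char *y, int i, int j, int *memo, int w) {
    if (i == 0) return j;                 /* base cases: empty prefix */
    if (j == 0) return i;
    int *slot = &memo[i * w + j];
    if (*slot >= 0)
        return *slot;                     /* already computed: reuse it */
    if (x[i - 1] == y[j - 1]) {
        *slot = solve(x, y, i - 1, j - 1, memo, w);
    } else {
        int up = solve(x, y, i - 1, j, memo, w);
        int left = solve(x, y, i, j - 1, memo, w);
        *slot = (up < left ? up : left) + 1;
    }
    return *slot;
}

/* Returns the number of underscores in a best alignment, or -1 if memory
   runs out. */
int align_top_down(const char *x, const char *y) {
    int n = (int)strlen(x), m = (int)strlen(y);
    int w = m + 1;
    size_t cells = (size_t)(n + 1) * (size_t)w;
    int *memo = malloc(cells * sizeof *memo);
    if (!memo)
        return -1;
    for (size_t k = 0; k < cells; ++k)
        memo[k] = -1;
    int result = solve(x, y, n, m, memo, w);
    free(memo);
    return result;
}
```

Unlike the bottom-up loop, this version only fills the entries that the recursion actually reaches.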
Dynamic Programming Wrap-Up
• Re-use expensive computations
• Store optimal solutions to sub-problems in table
• Use optimal solutions to sub-problems to determine optimal solution to slightly larger problem