programming interest group comp.hkbu.hk/~chxw/pig/index.htm
Programming Interest Group http://www.comp.hkbu.edu.hk/~chxw/pig/index.htm
Tutorial Seven
Dynamic Programming
Dynamic Programming
Optimization problems: there can be many possible solutions. Each solution has a value, and we want to find a solution with the optimal value (minimum or maximum).
Dynamic programming is a technique that systematically searches all possibilities and stores the results of sub-problems in a table to avoid recomputing them.
The "programming" in the name refers to using a tabular solution.
Optimal substructure: an optimal solution to the problem contains optimal solutions to its sub-problems.
Example 1: Fibonacci Number
$F_n = F_{n-1} + F_{n-2}$; $F_0 = 0$; $F_1 = 1$
// use recursion
// an inefficient algorithm
int fib( unsigned int n ) {
    if (n == 0 || n == 1)
        return n;
    return fib(n-1) + fib(n-2);
}
Recursion tree of fib(6), one level per line; the same sub-problems appear many times:
F(6)
F(5) F(4)
F(4) F(3) F(3) F(2)
F(3) F(2) F(2) F(1) F(2) F(1) F(1) F(0)
F(2) F(1) F(1) F(0) F(1) F(0) F(1) F(0)
F(1) F(0)
Fibonacci Number
// a linear algorithm
int fib( unsigned int n ) {
    unsigned int i;
    int answer, last, nexttolast;
    if (n <= 1)            // F0 = 0, F1 = 1; the loop below never runs for these
        return n;
    nexttolast = 0;
    last = 1;
    for (i = 2; i <= n; i++) {
        answer = last + nexttolast;
        nexttolast = last;
        last = answer;
    }
    return answer;
}
[Figure: the three variables nexttolast, last and answer slide along the sequence, from F(k-3), F(k-2), F(k-1) to F(k-2), F(k-1), F(k).]
Dynamic Programming:
Store the answers to the sub-problems in a table.
Example 2: Binomial Coefficients
How many ways are there to choose k things out of n possibilities?
$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$$
Coefficients of $(a+b)^n$:
$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \cdots + \binom{n}{k}a^{n-k}b^k + \cdots + \binom{n}{n}b^n$$
Binomial Coefficients
It is difficult to calculate binomial coefficients by using the previous equation directly: with 32-bit integers, computing n! overflows for n > 12.
Pascal’s triangle (1654), or Yang Hui's triangle (杨辉三角, 1261)
            1
           1 1
          1 2 1
         1 3 3 1
        1 4 6 4 1
      1 5 10 10 5 1
    1 6 15 20 15 6 1
$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
Binomial Coefficients
#define MAXN 100

/* compute n choose m; requires 0 <= m <= n < MAXN */
long binomial_coef(int n, int m) {
    int i, j;
    long bc[MAXN][MAXN];   /* table of binomial coef */

    for (i = 0; i <= n; i++) bc[i][0] = 1;
    for (j = 0; j <= n; j++) bc[j][j] = 1;
    for (i = 1; i <= n; i++)
        for (j = 1; j < i; j++)
            bc[i][j] = bc[i-1][j-1] + bc[i-1][j];
    return (bc[n][m]);
}
Example 3: Ordering Matrix Multiplications
Matrix multiplication is not commutative, but it is associative.
The obvious way to multiply two matrices of dimensions p×q and q×r uses pqr scalar multiplications.
Given four matrices A, B, C, D of dimensions A = 50×10, B = 10×40, C = 40×30, D = 30×5, the matrix product ABCD can be evaluated in any of the following orders:
(A((BC)D)): 16,000 multiplications
(A(B(CD))): 10,500 multiplications
((AB)(CD)): 36,000 multiplications
(((AB)C)D): 87,500 multiplications
((A(BC))D): 34,500 multiplications
It is worthwhile to find the optimal ordering of matrix multiplications.
Ordering Matrix Multiplications
What is the number of possible orderings?
Given N matrices, define T(N) to be the number of possible orderings.
Assume the matrices are $A_1, A_2, \ldots, A_N$, and the last multiplication performed is $(A_1 A_2 \cdots A_i)(A_{i+1} A_{i+2} \cdots A_N)$. There are T(i) ways to compute $(A_1 A_2 \cdots A_i)$ and T(N-i) ways to compute $(A_{i+1} A_{i+2} \cdots A_N)$.
$$T(N) = \sum_{i=1}^{N-1} T(i)\,T(N-i), \qquad T(1) = 1$$
The values T(N) are the Catalan numbers, which grow exponentially.
http://en.wikipedia.org/wiki/Catalan_number
Ordering Matrix Multiplications
Exhaustive search is not practical because Catalan numbers grow exponentially.
Dynamic programming is very useful here. Some notations:
Let $C_i$ be the number of columns in matrix $A_i$, $1 \le i \le N$. Then $A_i$ has $C_{i-1}$ rows, since otherwise the multiplications are not valid.
Define $C_0$ to be the number of rows in the first matrix $A_1$.
Ordering Matrix Multiplications
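Consider the product $A_L A_{L+1} \cdots A_{R-1} A_R$.
Define $M_{L,R}$ to be the number of multiplications required in an optimal ordering; then we have the following recursion:
$$M_{L,R} = \min_{L \le i < R} \{ M_{L,i} + M_{i+1,R} + C_{L-1} C_i C_R \}$$
Here the last multiplication is $(A_L A_{L+1} \cdots A_i)(A_{i+1} \cdots A_{R-1} A_R)$: the left factor costs $M_{L,i}$, the right factor costs $M_{i+1,R}$, and multiplying the two results costs $C_{L-1} C_i C_R$.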
Ordering Matrix Multiplications
Given N matrices, there are about $N^2/2$ values of $M_{L,R}$. We can build a table to store these values.
Array C[1…N] contains the number of columns for each of the N matrices. C[0] is the number of rows in matrix 1.
TwoDimArray M is used to store the values of $M_{L,R}$.
The minimum number of multiplications is finally stored in M[1][N].
Ordering Matrix Multiplications
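The table is filled one diagonal at a time:
Loop 1: $M_{1,1}, M_{2,2}, \ldots, M_{N-2,N-2}, M_{N-1,N-1}, M_{N,N}$
Loop 2: $M_{1,2}, M_{2,3}, \ldots, M_{N-2,N-1}, M_{N-1,N}$
Loop 3: $M_{1,3}, M_{2,4}, \ldots, M_{N-2,N}$
Loop 4: $M_{1,4}, M_{2,5}, \ldots, M_{N-3,N}$
…
Loop N: $M_{1,N}$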
Ordering Matrix Multiplications
/* M is indexed starting at 1, instead of 0 */
/* TwoDimArray and Infinity are assumed to be defined elsewhere */
void OptMatrix (int C[], int N, TwoDimArray M) {
    int i, k, Left, Right, ThisM;
    for (Left = 1; Left <= N; Left++)
        M[Left][Left] = 0;
    for (k = 1; k < N; k++)                      /* k = Right - Left */
        for (Left = 1; Left <= N-k; Left++) {
            Right = Left + k;
            M[Left][Right] = Infinity;
            for (i = Left; i < Right; i++) {
                ThisM = M[Left][i] + M[i+1][Right] + C[Left-1]*C[i]*C[Right];
                if (ThisM < M[Left][Right])
                    M[Left][Right] = ThisM;
            }
        }
}
The three nested loops give a running time of O(N^3).
Example 4: Knapsack Problem
http://en.wikipedia.org/wiki/Knapsack_problem
There are N types of items of varying size and value, and a knapsack of capacity M. How can we maximize the total value of the items carried in the knapsack?
Unbounded version: there is no bound on the number of items of each type.
Knapsack Problem
Example: Given a knapsack of size 17, and the following 5 types of items
Item    A   B   C   D   E
Size    3   4   7   8   9
Value   4   5  10  11  13
Possible combinations:
5 item A’s: total value = 20
4 item B’s: total value = 20
1 D and 1 E: total value = 24
…
What’s the maximum total value?
Knapsack Problem
A simple but inefficient recursive algorithm
typedef struct { int size; int val; } Item;
Item items[N];
int knap (int cap) {
    int i, space, max, t;
    for (i = 0, max = 0; i < N; i++)
        if ((space = cap - items[i].size) >= 0)
            if ((t = knap(space) + items[i].val) > max)
                max = t;
    return max;
}
[Figure: after item i is placed in the knapsack, the remaining space has capacity cap - items[i].size.]
Knapsack Problem
Dynamic Programming
/* every maxKnown[] entry must be set to the sentinel `unknown` before the first call */
int maxKnown[M+1];     /* indexed by capacity 0..M */
Item itemKnown[M+1];   /* last item added for each capacity */
int knap (int cap) {
    int i, space, max, maxi = 0, t;
    if (maxKnown[cap] != unknown)
        return maxKnown[cap];
    for (i = 0, max = 0; i < N; i++)
        if ((space = cap - items[i].size) >= 0)
            if ((t = knap(space) + items[i].val) > max) {
                max = t;
                maxi = i;
            }
    maxKnown[cap] = max;
    itemKnown[cap] = items[maxi];
    return max;
}
Remark:
1. The item sizes must be integers.
2. The time complexity is O(MN).
3. The space complexity is O(M).
Question:How to reconstruct the contents of the knapsack after the computation?
Example 5: Subset Sum Problem
A form of the knapsack problem: http://en.wikipedia.org/wiki/Subset_sum_problem
Given a set of integers $A = \{a_1, a_2, \ldots, a_N\}$ and an integer K, is there a subset of A whose sum is exactly K?
Related problem: Contest Volumes :: Volume C, 10032 - Tug of War
Subset Sum Problem
The number of subsets is $2^N$. Brute force is only practical for small values of N.
If K is not big, we may consider dynamic programming:
Consider the first t integers, $a_1, a_2, \ldots, a_t$. If we can find all the possible values that can be the sum of a subset of $\{a_1, a_2, \ldots, a_t\}$, then the original problem can be solved.
We record the possible sum values that can be achieved by any subset of A.
Subset Sum Problem
Construct a TwoDimArray F with rows t = 0…N and columns y = 0…M, initialized with 0, where M denotes the range of possible sums. Set F[0][0] = 1, since the empty subset has sum 0.
F[t][y] represents whether y can be the sum of some subset of $\{a_1, a_2, \ldots, a_t\}$.
To evaluate F[t][y], we use the results for the set $\{a_1, a_2, \ldots, a_{t-1}\}$:
If F[t-1][y] = 1, then y is the sum of some subset of $\{a_1, a_2, \ldots, a_{t-1}\}$, hence also of $\{a_1, a_2, \ldots, a_t\}$, so F[t][y] = 1.
If $y \ge a_t$ and F[t-1][y-$a_t$] = 1, then $y - a_t$ is the sum of some subset of $\{a_1, a_2, \ldots, a_{t-1}\}$; adding $a_t$ shows that y is the sum of some subset of $\{a_1, a_2, \ldots, a_t\}$, so F[t][y] = 1.
Otherwise, F[t][y] = 0.
The time complexity is O(NM), which is practical for reasonable values of M.
Try the following problem: 562 - Dividing coins
Example 6: Longest Ascending Subsequence
Given a sequence of numbers $\{a_1, a_2, \ldots, a_n\}$, find the length of the longest ascending subsequence.
A subsequence $S = \{s_1, \ldots, s_m\}$ is said to be an ascending subsequence if $s_1 \le \cdots \le s_m$.
For example, given a sequence {0,9,3,2,5,6,8,1,4,7}, the longest ascending subsequences include {0, 2, 5, 6, 8}, {0, 3, 5, 6, 8}. So the solution is 5.
Analysis:
Denote by $d_i$ the length of the longest ascending subsequence of $\{a_1, a_2, \ldots, a_i\}$ that includes $a_i$ as its last number. Then the solution is $\max\{d_i\}$, $1 \le i \le n$.
How to find $d_i$?
$d_1 = 1$;
$d_i = \max\bigl(1,\ \max\{d_j + 1 \mid j < i,\ a_j \le a_i\}\bigr)$ for $i > 1$.
For the previous example:
$d_1 = 1$, $d_2 = 2$, $d_3 = 2$, $d_4 = 2$, $d_5 = 3$, $d_6 = 4$, $d_7 = 5$, $d_8 = 2$, $d_9 = 3$, $d_{10} = 5$
Example 7
Long long ago, country A invented a missile system to destroy the missiles of its enemy. The system can launch one single missile to destroy many enemy missiles, from near to far. Each enemy missile has a height. The system first chooses one enemy missile to destroy, then destroys a farther missile whose height is lower than that of the 1st missile, then destroys a farther missile whose height is higher than that of the 2nd missile, and so on. In short, each odd-numbered missile to destroy must be higher than the previous one; each even-numbered missile to destroy must be lower than the previous one.
Now, given the heights of the enemy missiles from near to far, find the largest number of missiles that can be destroyed by one missile launched by A.
Input:
In each test case, the first line is an integer n (0 < n ≤ 1000), which is the number of enemy missiles. Then follows one line which contains n integers (≤ $10^9$) that represent the heights of the enemy missiles, ordered by their distance to A’s missile system. The input is terminated by n = 0.
Output:
For each test case, print on one line the largest number of missiles that can be destroyed.
Example 7
Sample Input:
4
5 3 2 4
3
1 1 1
0
Sample Output:
3
1
Example 7
Denote the sequence of heights as $\{a_t\}$. Then the problem is to find the length of the longest subsequence of $\{a_t\}$, denoted as $\{b_k\}$, which satisfies the condition: $b_k > b_{k+1}$ if k is odd, and $b_k < b_{k+1}$ if k is even.
Denote by $d_i$ the maximum number of attacked missiles for the sequence $\{a_1, \ldots, a_i\}$ that includes missile $a_i$ as the last target.
Then the final solution will be: $\max\{d_i\}$, $1 \le i \le n$.
How to find the set of $\{d_i\}$, $1 \le i \le n$?
[Figure: $d_i$ is computed from the earlier values $d_1, \ldots, d_j, \ldots, d_{i-1}$.]
Example 7
The previously attacked missile could be any missile $a_j$ with j < i. $d_j$ could be either odd or even.
If $d_j$ is odd, then $a_i$ must be lower than $a_j$.
If $d_j$ is even, then $a_i$ must be higher than $a_j$.
So we have:
$d_1 = 1$;
$d_i = \max\bigl(1,\ \max\{d_j + 1 \mid j < i,\ a_j > a_i,\ d_j \text{ is odd}\},\ \max\{d_j + 1 \mid j < i,\ a_j < a_i,\ d_j \text{ is even}\}\bigr)$, for $i > 1$
Example 7
/* fragment for one test case; n has already been read */
int i, j, answer = 1;
int h[1001], d[1001];
for (i = 0; i < n; i++)
    scanf("%d", &h[i]);        // read in the heights of the missiles
memset(d, 0, sizeof(d));       // clear array d[ ]
for (i = 0; i < n; i++) {
    d[i] = 1;
    for (j = i-1; j >= 0; j--) {
        if ( h[j] < h[i] && d[j] % 2 == 0 && d[j] + 1 > d[i] )
            d[i] = d[j] + 1;
        if ( h[j] > h[i] && d[j] % 2 == 1 && d[j] + 1 > d[i] )
            d[i] = d[j] + 1;
    }
    if (answer < d[i])
        answer = d[i];
}
printf("%d\n", answer);
Example 8: String match with wildcards
Wildcard characters:
“*” matches 0 or more characters
“?” matches a single character
For example:
“?er*t” matches “herbert”
“herbert?” doesn’t match “herbert”
Example 8: String match with wildcards
Problem:
Given a string s and a string w, where w may contain wildcards, determine whether w matches s.
Analysis:
Define array m[][]. m[i][j] represents whether the first i characters of w (denoted as $w_i$) match the first j characters of s (denoted as $s_j$).
Initially, m[0][0] = true; all other entries are set to false.
Example 8: String match with wildcards
To determine m[i][j], let’s consider character w[i-1]:
Case 1: w[i-1] is not a wildcard. Then $w_i$ matches $s_j$ if and only if $w_{i-1}$ matches $s_{j-1}$ and w[i-1] == s[j-1], i.e.,
m[i][j] = m[i-1][j-1] && (w[i-1] == s[j-1]);
Case 2: w[i-1] is “?”. Then $w_i$ matches $s_j$ if and only if $w_{i-1}$ matches $s_{j-1}$, i.e.,
m[i][j] = m[i-1][j-1];
Case 3: w[i-1] is “*”. Then $w_i$ matches $s_j$ if and only if (1) $w_i$ matches $s_{j-1}$, or (2) $w_{i-1}$ matches $s_j$, i.e.,
m[i][j] = m[i][j-1] || m[i-1][j];
Example 8: String match with wildcards
bool m[MAXLEN+1][MAXLEN+1];   /* MAXLEN: assumed upper bound on both string lengths */

bool match (char *w, char *s) {
    int i, j;
    int lw = strlen(w), ls = strlen(s);
    for (i = 0; i <= lw; i++)         /* clear the whole table, including row lw and column ls */
        for (j = 0; j <= ls; j++)
            m[i][j] = false;
    m[0][0] = true;
    for (i = 1; i <= lw; i++)
        for (j = 0; j <= ls; j++)
            if (w[i-1] == '*')
                m[i][j] = (!j) ? (m[i-1][j]) : (m[i][j-1] || m[i-1][j]);
            else if (j) {
                if (w[i-1] == '?') m[i][j] = m[i-1][j-1];
                else m[i][j] = m[i-1][j-1] && (w[i-1] == s[j-1]);
            }
    return m[lw][ls];
}
Example 9: All-Pairs Shortest Path in Weighted Graph
A graph G = (V, E): V is the set of vertices and E is the set of edges.
Weighted graph: each edge is assigned a numerical value, or weight.
Data structures for a graph:
Adjacency matrix
Adjacency lists in lists (for sparse graphs)
Adjacency lists in matrices
Table of edges
Problem: find the length of the shortest path between all pairs of vertices. In this problem, we are not interested in the actual shortest paths.
Solution: Floyd’s algorithm
http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
Main Idea
The vertices are numbered from 1 to N.
Consider a function sPath(i, j, k) that returns the length of the shortest possible path from i to j using only vertices 1 to k as intermediate points along the way.
So if k is 0, sPath(i, j, 0) = weight(i, j).
Recursive formula:
sPath(i, j, k) = min{ sPath(i, j, k-1), sPath(i, k, k-1) + sPath(k, j, k-1) }
Main Idea
[Figure: a path from Vi to Vj through Vk, built from two shortest paths that use only intermediate vertices { V1, …, Vk-1 }.]
Adjacency Matrix
#define MAXV 100
#define MAXINT 1000000000   /* assumed value for "no edge"; MAXINT + MAXINT still fits in an int */

typedef struct {
    int weight[MAXV+1][MAXV+1];
    int nvertices;   /* number of vertices */
} adjacency_matrix;

void initialize_adjacency_matrix(adjacency_matrix *g)
{
    int i, j;
    g->nvertices = 0;
    for (i = 1; i <= MAXV; i++)
        for (j = 1; j <= MAXV; j++)
            g->weight[i][j] = MAXINT;   /* non-edge */
}
Read in the graph
void read_adjacency_matrix(adjacency_matrix *g, bool directed)
{
    int i, x, y, w;
    int m;   /* number of edges */

    initialize_adjacency_matrix(g);
    /* read in the number of vertices, and number of edges */
    scanf("%d %d\n", &(g->nvertices), &m);
    /* read in the m edges */
    for (i = 1; i <= m; i++) {
        scanf("%d %d %d\n", &x, &y, &w);
        g->weight[x][y] = w;
        if (!directed)
            g->weight[y][x] = w;
    }
}
All-pairs Shortest Path:Floyd’s Algorithm
void floyd(adjacency_matrix *g)
{
    int i, j, k;     /* i, j: dimension counters; k: intermediate vertex counter */
    int through_k;   /* distance through vertex k */

    for (k = 1; k <= g->nvertices; k++) {
        for (i = 1; i <= g->nvertices; i++) {
            for (j = 1; j <= g->nvertices; j++) {
                if (g->weight[i][k] == MAXINT || g->weight[k][j] == MAXINT)
                    continue;   /* no path through k; also keeps the sum from overflowing */
                through_k = g->weight[i][k] + g->weight[k][j];
                if (through_k < g->weight[i][j])
                    g->weight[i][j] = through_k;
            }
        }
    }
}
Example 10: Edit Distance
Background: inexact string matching
Very important in bioinformatics
How to measure the difference between a pair of strings? By the minimum cost of the changes that have to be made to convert one string into the other.
Three natural types of changes:
Substitution: “shot” to “spot”
Insertion: “ago” to “agao”
Deletion: “hour” to “hor”
Edit distance: the minimum number of changes needed to transform one string into another.
Or, we can assign each type of change a score, e.g., w_sub, w_ins, w_del; then the edit distance will be the minimum total score (which may be different from the minimum number of changes).
Example
The edit distance between TGCATAT and ATCCGAT is 4
TGCATAT
insert A → ATGCATAT
delete T → ATGCAAT
substitute G for A → ATGCGAT
substitute C for G → ATCCGAT
How to compute the edit distance?
[Figure: string S with a prefix s of length i, and string T with a prefix t of length j.]
Given two strings S and T, consider a prefix of S (denoted by s) and a prefix of T (denoted by t). Assume the length of prefix s is i and the length of prefix t is j, with 0 <= i <= strlen(S) and 0 <= j <= strlen(T).
We define the edit distance between s and t as dist(i, j).
So the final answer is simply dist(strlen(S), strlen(T)).
We can build a table to store all dist(i, j).
How to compute the edit distance dist(i, j)?
Consider the character at location i in s ($s_i$) and the character at location j in t ($t_j$). How is $t_j$ obtained? Three cases:
1. Match or substitution of $s_i$; sub-problem: dist(i-1, j-1)
2. Insertion; sub-problem: dist(i, j-1)
3. Deletion; sub-problem: dist(i-1, j)
So, dist(i, j) = min { dist(i-1, j-1) + match(i, j), dist(i, j-1) + w_ins, dist(i-1, j) + w_del }
where match(i, j) = 0 if $s_i = t_j$, and match(i, j) = w_sub otherwise.
Recursion Version
#define MATCH 0
#define INSERT 1
#define DELETE 2

int match(char a, char b);   /* defined below */

/* very inefficient! */
int string_compare(char *s, char *t, int i, int j) {
    int k, lowest_cost, opt[3];
    if (i == 0) return (j * w_ins);   /* we can only insert j times */
    if (j == 0) return (i * w_del);   /* we can only delete i times */

    opt[MATCH] = string_compare(s, t, i-1, j-1) + match(s[i], t[j]);
    opt[INSERT] = string_compare(s, t, i, j-1) + w_ins;
    opt[DELETE] = string_compare(s, t, i-1, j) + w_del;

    lowest_cost = opt[MATCH];
    for (k = INSERT; k <= DELETE; k++)
        if (opt[k] < lowest_cost) lowest_cost = opt[k];
    return lowest_cost;
}

int match(char a, char b) {
    if (a == b) return 0;
    else return w_sub;
}
Dynamic Programming
typedef struct {
    int cost;     /* cost of reaching this cell */
    int parent;   /* parent cell -- how we got to this location? */
} cell;

cell m[MAXLEN+1][MAXLEN+1];

int string_compare(char *s, char *t) {
    int i, j, k, opt[3];
    m[0][0].cost = 0; m[0][0].parent = -1;
    for (i = 1; i < MAXLEN; i++) {
        m[0][i].cost = i * w_ins; m[0][i].parent = INSERT;
        m[i][0].cost = i * w_del; m[i][0].parent = DELETE;
    }
    for (i = 1; i < strlen(s); i++)
        for (j = 1; j < strlen(t); j++) {
            opt[MATCH] = m[i-1][j-1].cost + match(s[i], t[j]);
            opt[INSERT] = m[i][j-1].cost + w_ins;
            opt[DELETE] = m[i-1][j].cost + w_del;
            m[i][j].cost = opt[MATCH];
            m[i][j].parent = MATCH;
            for (k = INSERT; k <= DELETE; k++)   /* find the minimum cost */
                if (opt[k] < m[i][j].cost) {
                    m[i][j].cost = opt[k];
                    m[i][j].parent = k;
                }
        }
    return m[strlen(s)-1][strlen(t)-1].cost;
}
Remark: In the sample code above, we assume s and t are padded with an initial blank character, so the first real character of string s sits in s[1]. This is why the loops stop before strlen(s) and strlen(t).
s[0] = t[0] = ' ';
scanf("%s", &(s[1]));
scanf("%s", &(t[1]));
Reconstructing the Path
void reconstruct_path(char *s, char *t, int i, int j)
{
    if (m[i][j].parent == -1) return;

    if (m[i][j].parent == MATCH) {
        reconstruct_path(s, t, i-1, j-1);
        if (s[i] == t[j]) printf("M");
        else printf("S");
        return;
    }
    if (m[i][j].parent == INSERT) {
        reconstruct_path(s, t, i, j-1);
        printf("I");
        return;
    }
    if (m[i][j].parent == DELETE) {
        reconstruct_path(s, t, i-1, j);
        printf("D");
        return;
    }
}
Example
S: “thou shalt not”
T: “you should not”
D S M M M M M I S M S M M M M
(D = delete, S = substitute, M = match, I = insert)