programming interest group comp.hkbu.hk/~chxw/pig/index.htm


Programming Interest Group
http://www.comp.hkbu.edu.hk/~chxw/pig/index.htm

Tutorial Seven

Dynamic Programming

Dynamic Programming

Optimization problems: there can be many possible solutions. Each solution has a value, and we want to find a solution with the optimal value (minimum or maximum).

Dynamic programming is a technique that systematically searches all possibilities and stores the results of sub-problems in a table to avoid recomputing them. The word “programming” here refers to the use of a tabular solution.

Optimal substructure: a problem has optimal substructure if an optimal solution to the problem contains optimal solutions to its sub-problems.

Example 1: Fibonacci Number

F(n) = F(n-1) + F(n-2); F(0) = 0; F(1) = 1

    // use recursion
    // an inefficient algorithm
    int fib( unsigned int n ) {
        if (n == 0 || n == 1)
            return n;
        return fib(n-1) + fib(n-2);
    }

[Figure: recursion tree of fib(6). F(6) calls F(5) and F(4), and the same sub-problems appear again and again: F(4) is evaluated 2 times, F(3) 3 times, and F(2) 5 times.]

Fibonacci Number

    // a linear algorithm
    int fib( unsigned int n ) {
        unsigned int i;
        int answer, last, nexttolast;

        if (n == 0 || n == 1)
            return n;      // otherwise answer would be returned uninitialized
        nexttolast = 0;
        last = 1;
        for (i = 2; i <= n; i++) {
            answer = last + nexttolast;
            nexttolast = last;
            last = answer;
        }
        return answer;
    }

[Figure: the variables nexttolast, last, and answer slide along the sequence. In one iteration they hold F(k-3), F(k-2), F(k-1); in the next they hold F(k-2), F(k-1), F(k).]

Dynamic Programming:

Store the answers to the sub-problems in a table.
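As a minimal sketch of the table idea, the function below fills an explicit array from the bottom up instead of keeping only the last two values. The name fib_table and the bound MAX_FIB are assumptions made for this illustration; they are not part of the original slides.

```c
#include <assert.h>

#define MAX_FIB 90  /* assumed bound: F(90) still fits in a signed 64-bit integer */

/* Return F(n) by filling a table of sub-problem answers from the bottom up. */
long long fib_table(unsigned int n) {
    long long table[MAX_FIB + 1];
    unsigned int i;

    assert(n <= MAX_FIB);
    table[0] = 0;
    table[1] = 1;
    for (i = 2; i <= n; i++)
        table[i] = table[i - 1] + table[i - 2];  /* both entries are already stored */
    return table[n];
}
```

Each entry is computed once, so the running time is linear in n. The linear algorithm on the previous slide is the same idea with the table shrunk to its last two entries.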


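Example 2: Binomial Coefficients

How many ways are there to choose k things out of n possibilities?

$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$$

These are the coefficients of (a+b)^n:

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \cdots + \binom{n}{k}a^{n-k}b^k + \cdots + \binom{n}{n}b^n$$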

Binomial Coefficients

It is difficult to calculate binomial coefficients directly from the previous equation: the factorials overflow a 32-bit integer for n > 12.

Pascal’s triangle (1654), also known as Yang Hui’s triangle (杨辉三角, 1261):

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1

Each entry is the sum of the two entries above it:

$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$

Binomial Coefficients

    #define MAXN 100

    /* compute n choose m; requires 0 <= m <= n < MAXN */
    long binomial_coef(int n, int m) {
        int i, j;
        long bc[MAXN][MAXN];   /* table of binomial coef */

        for (i = 0; i <= n; i++) bc[i][0] = 1;
        for (j = 0; j <= n; j++) bc[j][j] = 1;
        for (i = 1; i <= n; i++)
            for (j = 1; j < i; j++)
                bc[i][j] = bc[i-1][j-1] + bc[i-1][j];

        return (bc[n][m]);
    }

Example 3: Ordering Matrix Multiplications

Matrix multiplication is associative, but not commutative.

The obvious way to multiply two matrices of dimensions p×q and q×r uses pqr scalar multiplications.

Given four matrices A, B, C, D of dimensions A = 50×10, B = 10×40, C = 40×30, D = 30×5, the matrix product ABCD can be evaluated in any of the following orders:

(A((BC)D)): 16,000 multiplications
(A(B(CD))): 10,500 multiplications
((AB)(CD)): 36,000 multiplications
(((AB)C)D): 87,500 multiplications
((A(BC))D): 34,500 multiplications

It is worthwhile to find the optimal ordering of matrix multiplications.

Ordering Matrix Multiplications

What is the number of possible orderings?

Given N matrices, define T(N) to be the number of possible orderings.

Assume the matrices are A_1, A_2, …, A_N, and the last multiplication performed is (A_1 A_2 … A_i)(A_{i+1} A_{i+2} … A_N). There are T(i) ways to compute (A_1 A_2 … A_i) and T(N-i) ways to compute (A_{i+1} A_{i+2} … A_N). Summing over all positions i of the last multiplication:

$$T(N) = \sum_{i=1}^{N-1} T(i)\,T(N-i), \qquad T(1) = 1$$

The values T(N) are the Catalan numbers, which grow exponentially.

http://en.wikipedia.org/wiki/Catalan_number
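The recurrence for T(N) can itself be evaluated with a table. The sketch below is only an illustration; the name count_orderings and the bound MAX_MATRICES are assumptions, chosen so that the result fits in 64 bits.

```c
#include <assert.h>

#define MAX_MATRICES 30  /* assumed bound so that T(N) fits in 64 bits */

/* Number of ways to fully parenthesize a product of n matrices. */
unsigned long long count_orderings(int n) {
    unsigned long long T[MAX_MATRICES + 1];
    int m, i;

    assert(n >= 1 && n <= MAX_MATRICES);
    T[1] = 1;
    for (m = 2; m <= n; m++) {
        T[m] = 0;
        for (i = 1; i < m; i++)        /* i = size of the left factor */
            T[m] += T[i] * T[m - i];
    }
    return T[n];
}
```

For four matrices this gives T(4) = 5, which agrees with the five orderings of ABCD listed earlier.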



Exhaustive search is not practical because Catalan numbers grow exponentially.

Dynamic programming is very useful here. Some notations:

Let C_i be the number of columns in matrix A_i, 1 ≤ i ≤ N. Then A_i has C_{i-1} rows, since otherwise the multiplications are not valid.

Define C_0 to be the number of rows in the first matrix A_1.


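Consider the product A_L A_{L+1} … A_{R-1} A_R. Define M_{L,R} to be the number of multiplications required in an optimal ordering. Then M_{L,L} = 0, and for L < R we have the following recursion:

$$M_{L,R} = \min_{L \le i < R} \{\, M_{L,i} + M_{i+1,R} + C_{L-1} C_i C_R \,\}$$

If the last multiplication is (A_L A_{L+1} … A_i)(A_{i+1} … A_{R-1} A_R), the left factor costs M_{L,i}, the right factor costs M_{i+1,R}, and multiplying the two results costs C_{L-1} C_i C_R.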


Given N matrices, there are about N^2/2 values of M_{L,R}. We can build a table to store these values.

Array C[1…N] contains the number of columns for each of the N matrices. C[0] is the number of rows in matrix 1.

TwoDimArray M is used to store the values of M_{L,R}. The minimum number of multiplications is finally stored in M[1][N].


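The table is filled in order of increasing R - L, so every entry on the right-hand side of the recursion is already available when it is needed:

Loop 1: M_{1,1}, M_{2,2}, …, M_{N-2,N-2}, M_{N-1,N-1}, M_{N,N}
Loop 2: M_{1,2}, M_{2,3}, …, M_{N-2,N-1}, M_{N-1,N}
Loop 3: M_{1,3}, M_{2,4}, …, M_{N-2,N}
Loop 4: M_{1,4}, M_{2,5}, …, M_{N-3,N}
…
Loop N: M_{1,N}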

Ordering Matrix Multiplications

    /* M is indexed starting at 1, instead of 0 */
    void OptMatrix (int C[], int N, TwoDimArray M) {
        int i, k, Left, Right, ThisM;

        for (Left = 1; Left <= N; Left++)
            M[Left][Left] = 0;
        for (k = 1; k < N; k++)                 /* k is Right - Left */
            for (Left = 1; Left <= N-k; Left++) {
                Right = Left + k;
                M[Left][Right] = Infinity;
                for (i = Left; i < Right; i++) {
                    ThisM = M[Left][i] + M[i+1][Right] + C[Left-1]*C[i]*C[Right];
                    if (ThisM < M[Left][Right])
                        M[Left][Right] = ThisM;
                }
            }
    }

The three nested loops give a running time of O(N^3).
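To check the algorithm against the four-matrix example, here is a self-contained sketch of the same table fill. The name min_multiplications, the bound MAX_N, and the use of LONG_MAX in place of Infinity are assumptions made for this illustration.

```c
#include <assert.h>
#include <limits.h>

#define MAX_N 20  /* assumed upper bound on the number of matrices */

/* c[0..n] holds the dimensions: matrix i is c[i-1] x c[i], for 1 <= i <= n. */
long min_multiplications(const int c[], int n) {
    long m[MAX_N + 1][MAX_N + 1];
    int left, right, i, k;

    assert(n >= 1 && n <= MAX_N);
    for (left = 1; left <= n; left++)
        m[left][left] = 0;                 /* a single matrix needs no multiplication */
    for (k = 1; k < n; k++)                /* k = right - left */
        for (left = 1; left <= n - k; left++) {
            right = left + k;
            m[left][right] = LONG_MAX;
            for (i = left; i < right; i++) {
                long cost = m[left][i] + m[i + 1][right]
                          + (long)c[left - 1] * c[i] * c[right];
                if (cost < m[left][right])
                    m[left][right] = cost;
            }
        }
    return m[1][n];
}
```

Calling it with the dimensions {50, 10, 40, 30, 5} and n = 4 returns 10500, the cost of the ordering (A(B(CD))).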


Example 4: Knapsack Problem

http://en.wikipedia.org/wiki/Knapsack_problem

There are N types of items of varying size and value, and a knapsack of capacity M. How can we maximize the total value of the items carried in the knapsack?

Unbounded version: there is no bound on the number of copies of each item.

Knapsack Problem

Example: given a knapsack of size 17 and the following 5 types of items:

Item    A   B   C   D   E
Size    3   4   7   8   9
Value   4   5  10  11  13

Possible combinations:

5 item A’s: total value = 20
4 item B’s: total value = 20
1 D and 1 E: total value = 24
…

What is the maximum total value?

Knapsack Problem

A simple but inefficient recursive algorithm:

    typedef struct { int size; int val; } Item;

    Item items[N];   /* N is the number of item types */

    /* return the maximum value that fits into capacity cap */
    int knap (int cap) {
        int i, space, max, t;
        for (i = 0, max = 0; i < N; i++)
            if ((space = cap - items[i].size) >= 0)
                if ((t = knap(space) + items[i].val) > max)
                    max = t;
        return max;
    }

[Figure: a knapsack of capacity cap holding item i; the remaining space is a smaller knapsack with capacity cap - items[i].size.]

Knapsack Problem

Dynamic Programming:

    /* unknown is a sentinel value (for example -1) that no real answer can take;
       every entry of maxKnown must be set to it before the first call */
    int maxKnown[M+1];     /* best value for each capacity 0..M */
    Item itemKnown[M+1];   /* last item added to reach that value */

    int knap (int cap) {
        int i, space, max, maxi = 0, t;
        if (maxKnown[cap] != unknown)
            return maxKnown[cap];
        for (i = 0, max = 0; i < N; i++)
            if ((space = cap - items[i].size) >= 0)
                if ((t = knap(space) + items[i].val) > max) {
                    max = t; maxi = i;
                }
        maxKnown[cap] = max;
        itemKnown[cap] = items[maxi];
        return max;
    }

Remark:
1. The item sizes must be integers.
2. The time complexity is O(MN).
3. The space complexity is O(M).

Question: How to reconstruct the contents of the knapsack after the computation?
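One possible answer, sketched under assumptions: record for every capacity the index of the last item added, then walk backwards, subtracting that item's size each time. The code below is a bottom-up variant rather than the memoized routine above, and the names knap_table, knap_contents, max_known, item_known, and the bound MAX_CAP are hypothetical.

```c
#include <assert.h>

#define NUM_ITEMS 5
#define MAX_CAP   100  /* assumed bound on the capacity */

typedef struct { int size; int val; } Item;

/* The five item types A..E from the example. */
static const Item items[NUM_ITEMS] = {
    {3, 4}, {4, 5}, {7, 10}, {8, 11}, {9, 13}
};

static int max_known[MAX_CAP + 1];   /* best value for each capacity */
static int item_known[MAX_CAP + 1];  /* index of the last item added, or -1 */

/* Fill the tables bottom-up for capacities 0..cap and return the best value. */
int knap_table(int cap) {
    int c, i;
    assert(cap >= 0 && cap <= MAX_CAP);
    for (c = 0; c <= cap; c++) {
        max_known[c] = 0;
        item_known[c] = -1;
        for (i = 0; i < NUM_ITEMS; i++) {
            int space = c - items[i].size;
            if (space >= 0 && max_known[space] + items[i].val > max_known[c]) {
                max_known[c] = max_known[space] + items[i].val;
                item_known[c] = i;
            }
        }
    }
    return max_known[cap];
}

/* Call after knap_table(cap): count[i] receives how many copies of item i are packed. */
void knap_contents(int cap, int count[NUM_ITEMS]) {
    int i;
    for (i = 0; i < NUM_ITEMS; i++)
        count[i] = 0;
    while (item_known[cap] >= 0) {
        count[item_known[cap]]++;
        cap -= items[item_known[cap]].size;   /* move to the smaller sub-problem */
    }
}
```

For capacity 17 the best value is 24. Several packings reach it (for example 1 D and 1 E); which one is reported depends on how ties are broken in knap_table.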


Example 5: Subset Sum Problem

A form of the knapsack problem: http://en.wikipedia.org/wiki/Subset_sum_problem

Given a set of integers A = {a_1, a_2, …, a_N} and an integer K, is there a subset of A whose sum is exactly K?

Contest Volumes :: Volume C, 10032 - Tug of War

Subset Sum Problem

The number of subsets is 2^N. Brute force is only practical for small values of N.

If K is not big, we may consider dynamic programming. Consider the first t integers a_1, a_2, …, a_t. If we can find all the possible values that can be the sum of a subset of {a_1, a_2, …, a_t}, then the original problem can be solved.

We try to record the possible sum values that can be achieved by any subset of A.

Subset Sum Problem

Construct a TwoDimArray F[N+1][M+1], initialized with 0, and set F[0][0] = 1 (the empty subset has sum 0). M denotes the range of possible sums. F[t][y] represents whether y can be the sum of some subset of {a_1, a_2, …, a_t}.

To evaluate F[t][y], we use the results for the set {a_1, a_2, …, a_{t-1}}:

If F[t-1][y] = 1, then y is the sum of some subset of {a_1, a_2, …, a_{t-1}}, so y is also the sum of some subset of {a_1, a_2, …, a_t}, and F[t][y] = 1.

If y ≥ a_t and F[t-1][y-a_t] = 1, then y-a_t is the sum of some subset of {a_1, a_2, …, a_{t-1}}; adding a_t to that subset gives y, so F[t][y] = 1.

Otherwise, F[t][y] = 0.

The time complexity is O(NM), which is practical for reasonable values of M.

Try the following question: 562 - Dividing coins
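The recurrence for F can be sketched as follows. This version assumes non-negative integers and keeps only one row of the table, updated in place, which is a space-saving variant of the two-dimensional array described above. The name subset_sum and the bound MAX_SUM are assumptions for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SUM 10000  /* assumed bound on the target K */

/* Return true if some subset of the non-negative integers a[0..n-1] sums to exactly k. */
bool subset_sum(const int a[], int n, int k) {
    static bool reachable[MAX_SUM + 1];   /* reachable[y] plays the role of F[t][y] */
    int t, y;

    assert(k >= 0 && k <= MAX_SUM);
    memset(reachable, 0, sizeof reachable);
    reachable[0] = true;                  /* the empty subset has sum 0 */
    for (t = 0; t < n; t++) {
        assert(a[t] >= 0);
        for (y = k; y >= a[t]; y--)       /* downward, so a[t] is used at most once */
            if (reachable[y - a[t]])
                reachable[y] = true;
    }
    return reachable[k];
}
```

Iterating y downward matters: when reachable[y - a[t]] is read, it still describes the subsets of the first t integers, exactly as F[t-1][y-a_t] does in the two-dimensional formulation.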


Example 6: Longest Ascending Subsequence

Given a sequence of numbers {a_1, a_2, …, a_n}, find the length of the longest ascending subsequence.

A subsequence S = {s_1, ..., s_m} is said to be an ascending subsequence if s_1 < ... < s_m.

For example, given the sequence {0,9,3,2,5,6,8,1,4,7}, the longest ascending subsequences include {0, 2, 5, 6, 8} and {0, 3, 5, 6, 8}. So the solution is 5.

Analysis:

Denote d_i as the length of the longest ascending subsequence of {a_1, a_2, …, a_i} that includes a_i as its last number. Then the solution will be max{d_i}, 1 ≤ i ≤ n.

How to find d_i?

d_1 = 1;

d_i = max{ 1, max{d_j + 1 | j < i, a_j < a_i} } for i > 1.

For the previous example:

d_1 = 1, d_2 = 2, d_3 = 2, d_4 = 2, d_5 = 3, d_6 = 4, d_7 = 5, d_8 = 2, d_9 = 3, d_10 = 5
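The recurrence for d_i translates almost line by line into code. The following is a minimal O(n^2) sketch; the name longest_ascending and the bound MAX_LEN are assumptions, and arrays are 0-indexed here, so d[i] corresponds to d_{i+1} in the analysis.

```c
#include <assert.h>

#define MAX_LEN 1000  /* assumed bound on the sequence length */

/* Length of the longest strictly ascending subsequence of a[0..n-1]. */
int longest_ascending(const int a[], int n) {
    int d[MAX_LEN];   /* d[i]: longest ascending subsequence ending at a[i] */
    int i, j, best = 0;

    assert(n >= 0 && n <= MAX_LEN);
    for (i = 0; i < n; i++) {
        d[i] = 1;                                   /* a[i] on its own */
        for (j = 0; j < i; j++)
            if (a[j] < a[i] && d[j] + 1 > d[i])
                d[i] = d[j] + 1;                    /* extend the best chain ending at a[j] */
        if (d[i] > best)
            best = d[i];
    }
    return best;
}
```

On the sequence {0,9,3,2,5,6,8,1,4,7} it returns 5.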


Example 7

Long ago, country A invented a missile system to destroy the missiles of its enemy. The system can launch one single missile to destroy many enemy missiles, from near to far. Each enemy missile has a height. The system first chooses one enemy missile to destroy, then destroys a farther missile whose height is lower than the 1st missile, then destroys a farther missile whose height is higher than the 2nd missile, and so on. In short, every odd-numbered missile destroyed (after the first) must be higher than the previous one, and every even-numbered missile destroyed must be lower than the previous one.

Now, given the heights of the enemy missiles from near to far, find the largest number of missiles that can be destroyed by one missile launched by A.

Input:

In each test case, the first line is an integer n (0 < n ≤ 1000), which is the number of enemy missiles. Then follows one line which contains n integers (≤ 10^9) that represent the heights of the enemy missiles, ordered by their distance to A’s missile system. The input is terminated by n = 0.

Output:

For each test case, print on one line the largest number of missiles that can be destroyed.

Example 7

Sample Input:
4
5 3 2 4
3
1 1 1
0

Sample Output:
3
1

Example 7

Denote the sequence of heights as {a_t}. Then the problem is to find the length of the longest subsequence of {a_t}, denoted as {b_k}, which satisfies the condition: b_k > b_{k+1} if k is odd, and b_k < b_{k+1} if k is even.

Denote d_i as the maximum number of attacked missiles for the sequence {a_1, …, a_i} that includes missile a_i as the last target.

Then the final solution will be: max{d_i}, 1 ≤ i ≤ n.

How to find the set of {d_i}, 1 ≤ i ≤ n?

[Figure: d_i is computed from the earlier values d_1, …, d_j, …, d_{i-1}.]

Example 7

The previously attacked missile could be any missile a_j with j < i. Assume it is a_j. Then d_j could be either odd or even.

If d_j is odd, then a_i must be lower than a_j.

If d_j is even, then a_i must be higher than a_j.

So we have:

d_1 = 1;

d_i = max{ 1, max{d_j + 1 | j < i, a_j > a_i, d_j is odd},
           max{d_j + 1 | j < i, a_j < a_i, d_j is even} }, for i > 1

Example 7

    int i, j;
    int answer = 1;
    int h[1001], d[1001];

    /* n has already been read from the first line of the test case */
    for (i = 0; i < n; i++)
        scanf("%d", &h[i]);        // read in the heights of the missiles
    memset(d, 0, sizeof(d));       // clear array d[ ]
    for (i = 0; i < n; i++) {
        d[i] = 1;
        for (j = i-1; j >= 0; j--) {
            if ( h[j] < h[i] && d[j] % 2 == 0 && d[j] + 1 > d[i] )
                d[i] = d[j] + 1;
            if ( h[j] > h[i] && d[j] % 2 == 1 && d[j] + 1 > d[i] )
                d[i] = d[j] + 1;
        }
        if (answer < d[i] ) answer = d[i];
    }
    printf("%d\n", answer);

Example 8: String match with wildcards

Wildcard characters:

“*” matches 0 or more characters

“?” matches a single character

For example:

“?erb*” matches “herbert”

“herbert?” doesn’t match “herbert”


Problem:

Given a string s and a string w, where w may contain wildcards, determine whether w matches s or not.

Analysis:

Define an array m[][]. m[i][j] represents whether the first i characters of w (denoted as w_i) match the first j characters of s (denoted as s_j).

Initially, m[0][0] = true, and all other entries are set to false.


To determine m[i][j], let’s consider character w[i-1]:

Case 1: w[i-1] is not a wildcard. Then w_i matches s_j if and only if w_{i-1} matches s_{j-1} and w[i-1] == s[j-1], i.e.,

    m[i][j] = m[i-1][j-1] && (w[i-1] == s[j-1]);

Case 2: w[i-1] is “?”. Then w_i matches s_j if and only if w_{i-1} matches s_{j-1}, i.e.,

    m[i][j] = m[i-1][j-1];

Case 3: w[i-1] is “*”. Then w_i matches s_j if and only if (1) w_i matches s_{j-1}, or (2) w_{i-1} matches s_j, i.e.,

    m[i][j] = m[i][j-1] || m[i-1][j];

Example 8: String match with wildcards

    /* m is a global table with at least strlen(w)+1 rows and strlen(s)+1 columns */
    bool match (char *w, char *s) {
        int i, j;
        int lw = strlen(w), ls = strlen(s);

        for (i = 0; i <= lw; i++)        /* <=, so row lw and column ls are cleared too */
            for (j = 0; j <= ls; j++)
                m[i][j] = false;
        m[0][0] = true;
        for (i = 1; i <= lw; i++)
            for (j = 0; j <= ls; j++)
                if (w[i-1] == '*')
                    m[i][j] = (!j) ? (m[i-1][j]) : (m[i][j-1] || m[i-1][j]);
                else if (j) {
                    if (w[i-1] == '?') m[i][j] = m[i-1][j-1];
                    else m[i][j] = m[i-1][j-1] && (w[i-1] == s[j-1]);
                }
        return m[lw][ls];
    }

Example 9: All-Pairs Shortest Path in Weighted Graph

A graph G = (V, E): V is the set of vertices and E is the set of edges.

Weighted graph: each edge is assigned a numerical value, or weight.

Data structures for a graph:
    Adjacency matrix
    Adjacency lists in lists (for sparse graphs)
    Adjacency lists in matrices
    Table of edges

Problem: find the length of the shortest path between all pairs of vertices. In this problem, we are not interested in the actual shortest paths.

Solution: Floyd’s algorithm
http://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm

Main Idea

The vertices are numbered from 1 to N.

Consider a function sPath(i, j, k) that returns the length of the shortest possible path from i to j using only vertices 1 to k as intermediate points along the way.

So if k is 0, sPath(i, j, 0) = weight(i, j).

Recursive formula:

    sPath(i, j, k) = min{ sPath(i, j, k-1),
                          sPath(i, k, k-1) + sPath(k, j, k-1) }

Main Idea

[Figure: vertices Vi, Vk, and Vj. The path from Vi to Vj through Vk consists of two shortest paths that use only the intermediate vertices { V1, ..., Vk-1 }.]

Adjacency Matrix

    #define MAXV 100

    typedef struct {
        int weight[MAXV+1][MAXV+1];
        int nvertices;   /* number of vertices */
    } adjacency_matrix;

    void initialize_adjacency_matrix(adjacency_matrix *g) {
        int i, j;

        g->nvertices = 0;
        for (i = 1; i <= MAXV; i++)
            for (j = 1; j <= MAXV; j++)
                g->weight[i][j] = MAXINT;   /* non-edge */
    }

Read in the graph

    void read_adjacency_matrix(adjacency_matrix *g, bool directed) {
        int i, x, y, w;
        int m;   /* number of edges */

        initialize_adjacency_matrix(g);
        /* read in the number of vertices, and number of edges */
        scanf("%d %d\n", &(g->nvertices), &m);
        /* read in the m edges */
        for (i = 1; i <= m; i++) {
            scanf("%d %d %d\n", &x, &y, &w);
            g->weight[x][y] = w;
            if (!directed)
                g->weight[y][x] = w;
        }
    }

All-pairs Shortest Path: Floyd’s Algorithm

    void floyd(adjacency_matrix *g) {
        int i, j, k;     /* i, j: dimension counters; k: intermediate vertex counter */
        int through_k;   /* distance through vertex k */

        for (k = 1; k <= g->nvertices; k++)
            for (i = 1; i <= g->nvertices; i++)
                for (j = 1; j <= g->nvertices; j++) {
                    /* skip non-edges so that MAXINT + MAXINT cannot overflow */
                    if (g->weight[i][k] == MAXINT || g->weight[k][j] == MAXINT)
                        continue;
                    through_k = g->weight[i][k] + g->weight[k][j];
                    if (through_k < g->weight[i][j])
                        g->weight[i][j] = through_k;
                }
    }
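To try the algorithm on a concrete graph, here is a small self-contained sketch that works on a plain 0-indexed matrix instead of the adjacency_matrix struct. The names floyd_small, MAX_V, and INF are assumptions for this illustration; INF is chosen small enough that INF + INF still fits in an int, so no overflow guard is needed.

```c
#include <assert.h>

#define MAX_V 4          /* size of this small example (assumed) */
#define INF   1000000    /* "no edge" marker; INF + INF still fits in an int */

/* Replace dist[i][j] by the length of the shortest path from i to j. */
void floyd_small(int n, int dist[][MAX_V]) {
    int i, j, k;

    assert(n >= 0 && n <= MAX_V);
    for (k = 0; k < n; k++)              /* allow vertex k as an intermediate point */
        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++)
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
}
```

For a directed graph with edges 0→1 (weight 5), 0→3 (weight 10), 1→2 (weight 3), and 2→3 (weight 1), the entry for the pair (0, 3) drops from 10 to 9 via the path 0→1→2→3, while pairs with no connecting path keep the value INF.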

Example 10: Edit Distance

Background: inexact string matching
    Very important in bioinformatics

How to measure the difference between a pair of strings?
    The minimum cost of changes which have to be made to convert one string to another

Three natural types of changes:
    Substitution: “shot” to “spot”
    Insertion: “ago” to “agao”
    Deletion: “hour” to “hor”

Edit distance: the minimum number of changes needed to transform one string into another.
    Or, we can assign each type of change a score, e.g., w_sub, w_ins, w_del; then the edit distance will be the minimum total score (which may be different from the minimum number of changes).

Example

The edit distance between TGCATAT and ATCCGAT is 4:

TGCATAT
ATGCATAT    (insert A)
ATGCAAT     (delete T)
ATGCGAT     (substitute G for A)
ATCCGAT     (substitute C for G)

How to compute the edit distance?

[Figure: strings S and T, with a prefix s of length i marked in S and a prefix t of length j marked in T.]

Given two strings S and T, consider a prefix of S (denoted by s) and a prefix of T (denoted by t). Assume the length of prefix s is i and the length of prefix t is j, where 0 <= i <= strlen(S) and 0 <= j <= strlen(T).

We define the edit distance between s and t as dist(i, j).

So the final answer is simply dist(strlen(S), strlen(T)).

We can build a table to store all dist(i, j).

How to compute the edit distance dist(i, j)?

Consider the character at location i in s (s_i) and the character at location j in t (t_j). How is t_j obtained? Three cases:

1. Match or substitution of s_i
    • sub-problem: dist(i-1, j-1)
2. Insertion
    • sub-problem: dist(i, j-1)
3. Deletion
    • sub-problem: dist(i-1, j)

So,

    dist(i, j) = min { dist(i-1, j-1) + match(i, j),
                       dist(i, j-1) + w_ins,
                       dist(i-1, j) + w_del }

where match(i, j) = 0 if s_i == t_j, and match(i, j) = w_sub otherwise.

Recursion Version

    #define MATCH  0
    #define INSERT 1
    #define DELETE 2

    int match(char a, char b) {
        if (a == b) return 0;
        else return w_sub;
    }

    /* very inefficient! */
    int string_compare(char *s, char *t, int i, int j) {
        int k, lowest_cost, opt[3];

        if (i == 0) return (j * w_ins);   /* we can only insert j times */
        if (j == 0) return (i * w_del);   /* we can only delete i times */

        opt[MATCH]  = string_compare(s, t, i-1, j-1) + match(s[i], t[j]);
        opt[INSERT] = string_compare(s, t, i, j-1) + w_ins;
        opt[DELETE] = string_compare(s, t, i-1, j) + w_del;

        lowest_cost = opt[MATCH];
        for (k = INSERT; k <= DELETE; k++)
            if (opt[k] < lowest_cost) lowest_cost = opt[k];
        return lowest_cost;
    }

Dynamic Programming

    typedef struct {
        int cost;     /* cost of reaching this cell */
        int parent;   /* parent cell -- how we got to this location? */
    } cell;

    cell m[MAXLEN+1][MAXLEN+1];

    int string_compare(char *s, char *t) {
        int i, j, k, opt[3];

        m[0][0].cost = 0;  m[0][0].parent = -1;
        for (i = 1; i <= MAXLEN; i++) {
            m[0][i].cost = i * w_ins;  m[0][i].parent = INSERT;
            m[i][0].cost = i * w_del;  m[i][0].parent = DELETE;
        }

        for (i = 1; i < strlen(s); i++)
            for (j = 1; j < strlen(t); j++) {
                opt[MATCH]  = m[i-1][j-1].cost + match(s[i], t[j]);
                opt[INSERT] = m[i][j-1].cost + w_ins;
                opt[DELETE] = m[i-1][j].cost + w_del;
                m[i][j].cost = opt[MATCH];
                m[i][j].parent = MATCH;
                for (k = INSERT; k <= DELETE; k++)   /* find the minimum cost */
                    if (opt[k] < m[i][j].cost) {
                        m[i][j].cost = opt[k];
                        m[i][j].parent = k;
                    }
            }
        return m[strlen(s)-1][strlen(t)-1].cost;
    }

Remark: In the sample code above, we assume s and t are padded with an initial blank character, so the first real character of string s sits in s[1]. This is why the loops stop before strlen(s) and strlen(t).

    s[0] = t[0] = ' ';
    scanf("%s", &(s[1]));
    scanf("%s", &(t[1]));

Reconstructing the Path

    void reconstruct_path(char *s, char *t, int i, int j) {
        if (m[i][j].parent == -1) return;

        if (m[i][j].parent == MATCH) {
            reconstruct_path(s, t, i-1, j-1);
            if (s[i] == t[j]) printf("M");
            else printf("S");
            return;
        }
        if (m[i][j].parent == INSERT) {
            reconstruct_path(s, t, i, j-1);
            printf("I");
            return;
        }
        if (m[i][j].parent == DELETE) {
            reconstruct_path(s, t, i-1, j);
            printf("D");
            return;
        }
    }

Example

S: “thou shalt not”
T: “you should not”

D S M M M M M I S M S M M M M
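For experimenting with the recurrence without the padding convention, the following sketch computes only the cost (no parent pointers) on ordinary 0-indexed strings. It assumes unit costs, w_sub = w_ins = w_del = 1; the names edit_distance, min3, and MAX_STR are hypothetical and not part of the original code.

```c
#include <assert.h>
#include <string.h>

#define MAX_STR 100  /* assumed bound on the string lengths */

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Edit distance with unit costs; s and t are ordinary 0-indexed C strings. */
int edit_distance(const char *s, const char *t) {
    int dist[MAX_STR + 1][MAX_STR + 1];
    int n = (int)strlen(s), m = (int)strlen(t);
    int i, j;

    assert(n <= MAX_STR && m <= MAX_STR);
    for (i = 0; i <= n; i++) dist[i][0] = i;   /* delete all i characters */
    for (j = 0; j <= m; j++) dist[0][j] = j;   /* insert all j characters */
    for (i = 1; i <= n; i++)
        for (j = 1; j <= m; j++)
            dist[i][j] = min3(
                dist[i - 1][j - 1] + (s[i - 1] == t[j - 1] ? 0 : 1),  /* match or substitute */
                dist[i][j - 1] + 1,                                   /* insert */
                dist[i - 1][j] + 1);                                  /* delete */
    return dist[n][m];
}
```

With these costs, edit_distance("TGCATAT", "ATCCGAT") evaluates to 4, in agreement with the earlier example.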
