TRANSCRIPT
Fibonacci Numbers
• Fn = Fn-1 + Fn-2
• F0 = 0, F1 = 1 – 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 …
• Straightforward recursive procedure is slow!
• Why? How slow?
• Let's draw the recursion tree
Fibonacci Numbers (2)
• We keep calculating the same value over and over!
[Recursion tree for F(6) = 8: the root F(6) branches into F(5) and F(4); the subtrees for F(4), F(3), and F(2) appear repeatedly, with only F(1)s and F(0)s at the leaves]
Fibonacci Numbers (3)
• How many summations are there?
• Golden ratio: (1 + √5)/2 ≈ 1.61803…
• Thus Fn ≈ 1.6^n
• Our recursion tree has only 0s and 1s as leaves, thus we have ≈ 1.6^n summations
• Running time is exponential!
Fibonacci Numbers (4)
• We can calculate Fn in linear time by remembering solutions to the already-solved subproblems – dynamic programming
• Compute the solution in a bottom-up fashion
• Trade space for time! – In this case, only two values need to be remembered at any time (probably less than the depth of your recursion stack!)
Fibonacci(n)
  F0 ← 0
  F1 ← 1
  for i ← 2 to n do
    Fi ← Fi-1 + Fi-2
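The pseudocode keeps the whole array Fi, but as the slide notes, only the last two values are ever needed. A minimal Python sketch of that space-saving version (the function name is ours):

```python
def fibonacci(n: int) -> int:
    """Compute F(n) in O(n) time and O(1) extra space."""
    prev, curr = 0, 1  # F(0), F(1)
    for _ in range(n):
        prev, curr = curr, prev + curr  # slide the two-value window forward
    return prev
```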
Optimization Problems
• We have to choose one solution out of many – the one with the optimal (minimum or maximum) value
• A solution exhibits a structure – it consists of a string of choices that were made; what choices have to be made to arrive at an optimal solution?
• The algorithm computes the optimal value plus, if needed, the optimal solution
• Two matrices, A – an n×m matrix – and B – an m×k matrix – can be multiplied to get C with dimensions n×k, using nmk scalar multiplications
• Problem: Compute a product of many matrices efficiently
• Matrix multiplication is associative – (AB)C = A(BC)
Multiplying Matrices
c(i,j) = Σ l=1..m  a(i,l) · b(l,j)

[Figure: the row-times-column pattern for computing entry c(i,j) of C = A·B]
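The nmk count of scalar multiplications can be checked directly with the schoolbook triple loop (a sketch; the multiplication counter is ours):

```python
def matmul(A, B):
    """Multiply an n×m matrix A by an m×k matrix B with the schoolbook
    triple loop, also counting the scalar multiplications performed."""
    n, m, k = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must agree"
    C = [[0] * k for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(k):
            for l in range(m):  # c[i][j] = sum over l of a[i][l] * b[l][j]
                C[i][j] += A[i][l] * B[l][j]
                mults += 1      # one scalar multiplication per inner step
    return C, mults
```

For a 2×3 times 3×4 product the counter comes out to 2·3·4 = 24, matching the slide's nmk.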
Multiplying Matrices (2)
• The parenthesization matters
• Consider A×B×C×D, where A is 30×1, B is 1×40, C is 40×10, D is 10×25
• Costs:
  – ((AB)C)D = 1200 + 12000 + 7500 = 20700
  – (AB)(CD) = 1200 + 10000 + 30000 = 41200
  – A((BC)D) = 400 + 250 + 750 = 1400
• We need to optimally parenthesize
A1 · A2 · … · An, where Ai is a di-1×di matrix
Multiplying Matrices (3)
• Let M(i,j) be the minimum number of multiplications necessary to compute
• Key observations
  – The outermost parenthesis partitions the chain of matrices (i,j) at some k (i ≤ k < j): (Ai… Ak)(Ak+1… Aj)
  – The optimal parenthesization of matrices (i,j) has optimal parenthesizations on either side of k: for matrices (i,k) and (k+1,j)
Ai · Ai+1 · … · Aj  (= ∏ k=i..j Ak)
Multiplying Matrices (4)
• We try out all possible k. Recurrence:
• A direct recursive implementation is exponential – there is a lot of duplicated work (why?)
• But there are only Θ(n²) different subproblems (i,j), where 1 ≤ i ≤ j ≤ n
M(i,i) = 0
M(i,j) = min over i ≤ k < j of { M(i,k) + M(k+1,j) + di-1·dk·dj }

Number of subproblems: (n choose 2) + n = Θ(n²)
Multiplying Matrices (5)
• Thus, it requires only Θ(n²) space to store the optimal cost M(i,j) for each of the subproblems: half of a 2d array M[1..n,1..n]
Matrix-Chain-Order(d0…dn)
  for i ← 1 to n do
    M[i,i] ← 0
  for l ← 2 to n do
    for i ← 1 to n-l+1 do
      j ← i+l-1
      M[i,j] ← ∞
      for k ← i to j-1 do
        q ← M[i,k] + M[k+1,j] + di-1·dk·dj
        if q < M[i,j] then
          M[i,j] ← q
          c[i,j] ← k
  return M, c
Multiplying Matrices (6)
• After execution: M[1,n] contains the value of the optimal solution, and c contains the optimal subdivisions (choices of k) of every subproblem into two subsubproblems
• A simple recursive algorithm Print-Optimal-Parens(c, i, j) can be used to reconstruct an optimal parenthesization
• Let us run the algorithm on d = [10, 20, 3, 5, 30]
Multiplying Matrices (7)
• Running time
  – It is easy to see that it is O(n³)
  – It turns out, it is also Ω(n³)
• From exponential time to polynomial
Memoization
• If we still like recursion very much, we can structure our algorithm as a recursive algorithm:
  – Initialize all M elements to ∞ and call Lookup-Chain(d, i, j)
Lookup-Chain(d,i,j)
  if M[i,j] < ∞ then
    return M[i,j]
  if i = j then
    M[i,j] ← 0
  else for k ← i to j-1 do
    q ← Lookup-Chain(d,i,k) + Lookup-Chain(d,k+1,j) + di-1·dk·dj
    if q < M[i,j] then
      M[i,j] ← q
  return M[i,j]
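The same memoized idea in Python, with functools.lru_cache standing in for the explicit M table (a sketch; names are ours). On the lecture's two instances it reproduces the costs 1950 (for d = [10, 20, 3, 5, 30]) and 1400 (for the 30×1, 1×40, 40×10, 10×25 chain):

```python
from functools import lru_cache

def lookup_chain(d):
    """Top-down (memoized) matrix-chain cost; matrix A_i is d[i-1] x d[i].
    Returns the minimum number of scalar multiplications for A_1..A_n."""
    n = len(d) - 1

    @lru_cache(maxsize=None)          # the cache plays the role of table M
    def cost(i, j):
        if i == j:
            return 0                  # a single matrix needs no multiplication
        return min(
            cost(i, k) + cost(k + 1, j) + d[i - 1] * d[k] * d[j]
            for k in range(i, j)      # try every split point, as in the slide
        )

    return cost(1, n)
```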
Dynamic Programming
• In general, to apply dynamic programming, we have to address a number of issues:
  – 1. Show optimal substructure – an optimal solution to the problem contains within it optimal solutions to sub-problems
    • Solution to a problem:
      – Making a choice out of a number of possibilities (look what possible choices there can be)
      – Solving one or more sub-problems that are the result of a choice (characterize the space of sub-problems)
    • Show that solutions to sub-problems must themselves be optimal for the whole solution to be optimal (use a "cut-and-paste" argument)
Dynamic Programming (2)
– 2. Write a recurrence for the value of an optimal solution
  • Mopt = min over all choices k { (sum of Mopt of all sub-problems resulting from choice k) + (the cost associated with making the choice k) }
– Show that the number of different instances of sub-problems is bounded by a polynomial
Dynamic Programming (3)
– 3. Compute the value of an optimal solution in a bottom-up fashion, so that you always have the necessary sub-results pre-computed (or use memoization)
– See if it is possible to reduce the space requirements by "forgetting" solutions to sub-problems that will not be used any more
– 4. Construct an optimal solution from computed information (which records the sequence of choices made that leads to an optimal solution)
Longest Common Subsequence
• Two text strings are given: X and Y
• There is a need to quantify how similar they are:
  – Comparing DNA sequences in studies of evolution of different species
  – Spell checkers
• One of the measures of similarity is the length of a Longest Common Subsequence (LCS)
LCS: Definition
• Z is a subsequence of X, if it is possible to generate Z by skipping some (possibly none) characters from X
• For example: X =“ACGGTTA”, Y=“CGTAT”, LCS(X,Y) = “CGTA” or “CGTT”
• To solve LCS problem we have to find “skips” that generate LCS(X,Y) from X, and “skips” that generate LCS(X,Y) from Y
LCS: Optimal Substructure
• We make Z empty and proceed from the ends of Xm = "x1 x2 … xm" and Yn = "y1 y2 … yn"
  – If xm = yn, append this symbol to the beginning of Z, and optimally find LCS(Xm-1, Yn-1)
  – If xm ≠ yn,
    • Skip either a letter from X
    • or a letter from Y
    • Decide which by comparing LCS(Xm, Yn-1) and LCS(Xm-1, Yn)
  – "Cut-and-paste" argument
LCS: Recurrence
• The algorithm could be easily extended by allowing more “editing” operations in addition to copying and skipping (e.g., changing a letter)
• Let c[i,j] = the length of LCS(Xi, Yj)
• Observe: conditions in the problem restrict sub-problems (What is the total number of sub-problems?)
c[i,j] = 0                            if i = 0 or j = 0
c[i,j] = c[i-1,j-1] + 1               if i,j > 0 and xi = yj
c[i,j] = max{ c[i,j-1], c[i-1,j] }    if i,j > 0 and xi ≠ yj
LCS: Compute the Optimum
LCS-Length(X, Y, m, n)
  for i ← 1 to m do
    c[i,0] ← 0
  for j ← 0 to n do
    c[0,j] ← 0
  for i ← 1 to m do
    for j ← 1 to n do
      if xi = yj then
        c[i,j] ← c[i-1,j-1] + 1
        b[i,j] ← "copy"
      else if c[i-1,j] ≥ c[i,j-1] then
        c[i,j] ← c[i-1,j]
        b[i,j] ← "skipx"
      else
        c[i,j] ← c[i,j-1]
        b[i,j] ← "skipy"
  return c, b
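A Python rendering of LCS-Length, plus a recursive reconstruction of one LCS from the choice table b (a sketch; function names are ours). On the lecture's example X = "ACGGTTA", Y = "CGTAT" it finds a common subsequence of length 4:

```python
def lcs_length(x: str, y: str):
    """Bottom-up LCS table: c[i][j] = length of an LCS of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]    # row 0 / column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]  # records the choice made
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "copy"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "skipx"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "skipy"
    return c, b

def reconstruct(x, b, i, j):
    """Walk the choice table b back from (i, j) to recover one LCS."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "copy":
        return reconstruct(x, b, i - 1, j - 1) + x[i - 1]
    if b[i][j] == "skipx":
        return reconstruct(x, b, i - 1, j)
    return reconstruct(x, b, i, j - 1)
```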
LCS: Example
• Let's run: X = "ACGGTTA", Y = "CGTAT"
• How much can we reduce our space requirements if we do not need to reconstruct the LCS?
February 8, 2003
Shortest Path
• Generalize distance to the weighted setting
• Digraph G = (V,E) with weight function w: E → R (assigning real values to edges)
• Weight of path p = v1 → v2 → … → vk is
  w(p) = Σ i=1..k-1  w(vi, vi+1)
• Shortest path = a path of the minimum weight
• Applications
  – static/dynamic network routing
  – robot motion planning
  – map/route generation in traffic
Shortest-Path Problems
• Shortest-path problems
  – Single-source (single-destination). Find a shortest path from a given source (vertex s) to each of the vertices. The topic of this lecture.
  – Single-pair. Given two vertices, find a shortest path between them. A solution to the single-source problem solves this problem efficiently, too.
  – All-pairs. Find shortest paths for every pair of vertices. Dynamic programming algorithm.
  – Unweighted shortest paths – BFS.
Optimal Substructure
• Theorem: subpaths of shortest paths are shortest paths
• Proof (cut and paste)
  – if some subpath were not a shortest path, one could substitute the shorter subpath and create a shorter total path
Triangle Inequality
• Definition
  – δ(u,v) ≡ weight of a shortest path from u to v
• Theorem
  – δ(u,v) ≤ δ(u,x) + δ(x,v) for any x
• Proof
  – a shortest path u ⇝ v is no longer than any other path u ⇝ v – in particular, the path concatenating the shortest path u ⇝ x with the shortest path x ⇝ v
Negative Weights and Cycles?
• Negative edges are OK, as long as there are no negative-weight cycles (otherwise paths with arbitrarily small "lengths" would be possible)
• Shortest paths can have no cycles (otherwise we could improve them by removing cycles)
  – Any shortest path in graph G can have no more than n – 1 edges, where n is the number of vertices
Relaxation
• For each vertex in the graph, we maintain d[v], the estimate of the shortest-path weight from s, initialized to ∞ at the start
• Relaxing an edge (u,v) means testing whether we can improve the shortest path to v found so far by going through u
[Figure: two Relax(u,v) examples on an edge of weight 2: with d[u] = 5 and d[v] = 9, relaxation improves d[v] to 7; with d[u] = 5 and d[v] = 6, d[v] is already no worse and stays 6]
Relax(u,v,w)
  if d[v] > d[u] + w(u,v) then
    d[v] ← d[u] + w(u,v)
    π[v] ← u
Dijkstra's Algorithm
• Non-negative edge weights
• Greedy, similar to Prim's algorithm for MST
• Like breadth-first search (if all weights = 1, one can simply use BFS)
• Use Q, a priority queue keyed by d[v] (BFS used a FIFO queue; here we use a PQ, which is re-organized whenever some d decreases)
• Basic idea
  – maintain a set S of solved vertices
  – at each step select the "closest" vertex u, add it to S, and relax all edges from u
Dijkstra’s Pseudo Code
• Graph G, weight function w, root s

Dijkstra(G,w,s)
01 for each v ∈ V[G]
02   d[v] ← ∞
03 d[s] ← 0
04 S ← ∅
05 Q ← V[G]
06 while Q ≠ ∅ do
07   u ← Extract-Min(Q)
08   S ← S ∪ {u}
09   for each v ∈ Adj[u] do
10     if d[v] > d[u] + w(u,v) then
11       d[v] ← d[u] + w(u,v)   ⇐ relaxing edges
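Dijkstra's algorithm can be sketched in runnable Python with heapq standing in for the priority queue; since heapq has no Decrease-Key, the sketch uses lazy deletion of stale entries instead. The vertex names and edge list below follow the worked example on the next slides as far as they can be reconstructed, so they are an assumption:

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths for non-negative edge weights.
    graph: dict mapping u -> list of (v, weight). Returns distances d."""
    d = {v: float("inf") for v in graph}
    d[s] = 0
    pq = [(0, s)]
    solved = set()                      # the set S of solved vertices
    while pq:
        du, u = heapq.heappop(pq)       # Extract-Min
        if u in solved:
            continue                    # stale heap entry, skip it
        solved.add(u)
        for v, w in graph[u]:           # relax all edges leaving u
            if d[v] > du + w:
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))  # in place of Decrease-Key
    return d

# Edge list reconstructed from the worked example (an assumption):
example = {
    "s": [("u", 10), ("x", 5)],
    "u": [("v", 1), ("x", 2)],
    "v": [("y", 4)],
    "x": [("u", 3), ("v", 9), ("y", 2)],
    "y": [("v", 6), ("s", 7)],
}
```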
Dijkstra’s Example
[Figure: four snapshots of Dijkstra's algorithm on a five-vertex example graph (s, u, v, x, y); the estimates (d[u], d[v], d[x], d[y]) evolve from (∞, ∞, ∞, ∞) with d[s] = 0, to (10, ∞, 5, ∞), then (8, 14, 5, 7), then (8, 13, 5, 7)]
• Observe
  – relaxation step (lines 10-11)
  – setting d[v] updates Q (needs Decrease-Key)
  – similar to Prim's MST algorithm
Dijkstra’s Example (2)
[Figure: the final two snapshots; the algorithm terminates with d[u] = 8, d[v] = 9, d[x] = 5, d[y] = 7]
Dijkstra’s Correctness
• We will prove that whenever u is added to S, d[u] = δ(s,u), i.e., that d is minimum, and that equality is maintained thereafter
• Proof
  – Note that ∀v, d[v] ≥ δ(s,v)
  – Let u be the first vertex picked such that there is a shorter path than d[u], i.e., such that d[u] > δ(s,u)
  – We will show that this assumption leads to a contradiction
Dijkstra Correctness (2)
• Let y be the first vertex ∈ V – S on the actual shortest path from s to u; then it must be that d[y] = δ(s,y) because
  – d[x] is set correctly for y's predecessor x ∈ S on the shortest path (by choice of u as the first vertex for which d is set incorrectly)
  – when the algorithm inserted x into S, it relaxed the edge (x,y), assigning d[y] the correct value
• But d[u] > d[y] ⇒ the algorithm would have chosen y (from the PQ) to process next, not u ⇒ contradiction
• Thus d[u] = δ(s,u) at the time of insertion of u into S, and Dijkstra's algorithm is correct
Dijkstra Correctness (3)
d[u] > δ(s,u)                  (initial assumption)
     = δ(s,y) + δ(y,u)         (optimal substructure)
     = d[y] + δ(y,u)           (correctness of d[y])
     ≥ d[y]                    (no negative weights)
Dijkstra's Running Time
• Extract-Min executed |V| times
• Decrease-Key executed |E| times
• Time = |V|·T(Extract-Min) + |E|·T(Decrease-Key)
• T depends on the Q implementation:

  Q              | T(Extract-Min) | T(Decrease-Key) | Total
  array          | O(V)           | O(1)            | O(V²)
  binary heap    | O(lg V)        | O(lg V)         | O(E lg V)
  Fibonacci heap | O(lg V)        | O(1) (amort.)   | O(V lg V + E)
Bellman-Ford Algorithm
• Dijkstra's doesn't work when there are negative edges:
  – Intuition – we cannot be greedy any more on the assumption that the lengths of paths will only increase in the future
• The Bellman-Ford algorithm detects negative cycles (returns false) or returns the shortest-path tree
Bellman-Ford Algorithm
Bellman-Ford(G,w,s)
  for each v ∈ V[G]
    d[v] ← ∞
  d[s] ← 0
  π[s] ← NIL
  for i ← 1 to |V[G]|-1 do
    for each edge (u,v) ∈ E[G] do
      Relax(u,v,w)
  for each edge (u,v) ∈ E[G] do
    if d[v] > d[u] + w(u,v) then return false
  return true
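A runnable Python sketch of Bellman-Ford (names are ours). The edge list below follows the worked example on the next slides as far as it can be reconstructed, so it is an assumption; the final distances it produces match the example's last snapshot:

```python
def bellman_ford(vertices, edges, s):
    """Bellman-Ford single-source shortest paths.
    edges: list of (u, v, weight). Returns (ok, d): ok is False when a
    negative-weight cycle is reachable from s, d holds distance estimates."""
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):  # |V|-1 rounds of relaxing every edge
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:               # one more pass: anything still
        if d[u] + w < d[v]:             # relaxable means a negative cycle
            return False, d
    return True, d

# Edge list reconstructed from the worked example (an assumption):
example_edges = [
    ("s", "t", 6), ("s", "y", 7),
    ("t", "x", 5), ("t", "y", 8), ("t", "z", -4),
    ("x", "t", -2),
    ("y", "x", -3), ("y", "z", 9),
    ("z", "x", 7), ("z", "s", 2),
]
```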
Bellman-Ford Example
[Figure: four snapshots of Bellman-Ford on a five-vertex example graph (s, t, x, y, z) with some negative edge weights; the estimates (d[t], d[x], d[y], d[z]) evolve from (∞, ∞, ∞, ∞) with d[s] = 0, to (6, ∞, 7, ∞), then (6, 4, 7, 2), then (2, 4, 7, 2)]
Bellman-Ford Example
[Figure: the final snapshot; d[t] = 2, d[x] = 4, d[y] = 7, d[z] = -2]
• Bellman-Ford running time:
  – (|V|-1)|E| + |E| = Θ(VE)
Correctness of Bellman-Ford
• Let δi(s,u) denote the length of the path from s to u that is shortest among all paths that contain at most i edges
• Prove by induction that d[u] = δi(s,u) after the i-th iteration of Bellman-Ford
  – Base case (i=0) trivial
  – Inductive step (say d[u] = δi-1(s,u)):
    • Either δi(s,u) = δi-1(s,u)
    • Or δi(s,u) = δi-1(s,z) + w(z,u)
    • In an iteration we try to relax each edge ((z,u) too), so we will catch both cases, thus d[u] = δi(s,u)
Correctness of Bellman-Ford
• After n-1 iterations, d[u] = δn-1(s,u) for each vertex u
• If there is still some edge to relax in the graph, then there is a vertex u such that δn(s,u) < δn-1(s,u). But there are only n vertices in G – we have a cycle, and it must be negative
• Otherwise, d[u] = δn-1(s,u) = δ(s,u) for all u, since any shortest path has at most n-1 edges
Shortest-Path in DAG’s
• Finding shortest paths in a DAG is much easier, because it is easy to find an order in which to do relaxations – topological sorting!

DAG-Shortest-Paths(G,w,s)
  for each v ∈ V[G]
    d[v] ← ∞
  d[s] ← 0
  topologically sort V[G]
  for each vertex u, taken in topological order do
    for each vertex v ∈ Adj[u] do
      Relax(u,v,w)
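A Python sketch of the DAG variant: topologically sort via recursive depth-first post-order, then make a single relaxation pass in that order (the function names and the small example DAG are ours):

```python
def dag_shortest_paths(graph, s):
    """Shortest paths in a DAG: relax edges in topological order.
    graph: dict mapping u -> list of (v, weight). Returns distances d."""
    # Topological sort via recursive depth-first post-order.
    order, seen = [], set()
    def visit(u):
        seen.add(u)
        for v, _ in graph[u]:
            if v not in seen:
                visit(v)
        order.append(u)                 # post-order: u after its successors
    for u in graph:
        if u not in seen:
            visit(u)
    order.reverse()                     # reverse post-order = topological order

    d = {v: float("inf") for v in graph}
    d[s] = 0
    for u in order:                     # one relaxation pass suffices in a DAG
        for v, w in graph[u]:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    return d
```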