TRANSCRIPT
Fibonacci Numbers
• Fn = Fn-1 + Fn-2
• F0 = 0, F1 = 1 – 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 …
• Straightforward recursive procedure is slow!
• Why? How slow?
• Let's draw the recursion tree
Fibonacci Numbers (2)
• We keep calculating the same value over and over!
[Recursion tree for F(6) = 8: the root F(6) branches into F(5) and F(4); the subtrees for F(4), F(3), and F(2) appear repeatedly, with only F(1)s and F(0)s at the leaves]
Fibonacci Numbers (3)
• How many summations are there?
• Golden ratio: (1 + √5)/2 ≈ 1.61803…
• Thus Fn ≈ 1.6^n
• Our recursion tree has only 0s and 1s as leaves, thus we have ≈ 1.6^n summations
• Running time is exponential!
Fibonacci Numbers (4)
• We can calculate Fn in linear time by remembering solutions to the already-solved subproblems – dynamic programming
• Compute the solution in a bottom-up fashion
• Trade space for time! – In this case, only two values need to be remembered at any time (probably less than the depth of your recursion stack!)
Fibonacci(n)
  F0 ← 0
  F1 ← 1
  for i ← 2 to n do
    Fi ← Fi-1 + Fi-2
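The pseudocode keeps the whole array Fi, but as the slide notes, only the last two values are ever needed. A minimal Python sketch of that space-saving version (the function name is ours):

```python
def fibonacci(n: int) -> int:
    """Compute F(n) in O(n) time and O(1) extra space."""
    prev, curr = 0, 1  # F(0), F(1)
    for _ in range(n):
        prev, curr = curr, prev + curr  # slide the two-value window forward
    return prev
```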
Optimization Problems
• We have to choose one solution out of many – the one with the optimal (minimum or maximum) value
• A solution exhibits a structure – it consists of a string of choices that were made; what choices have to be made to arrive at an optimal solution?
• The algorithm computes the optimal value plus, if needed, the optimal solution
• Two matrices, A – an n×m matrix – and B – an m×k matrix – can be multiplied to get C with dimensions n×k, using nmk scalar multiplications
• Problem: Compute a product of many matrices efficiently
• Matrix multiplication is associative – (AB)C = A(BC)
Multiplying Matrices
c(i,j) = Σ l=1..m  a(i,l) · b(l,j)

[Figure: the row-times-column pattern for computing entry c(i,j) of C = A·B]
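The nmk count of scalar multiplications can be checked directly with the schoolbook triple loop (a sketch; the multiplication counter is ours):

```python
def matmul(A, B):
    """Multiply an n×m matrix A by an m×k matrix B with the schoolbook
    triple loop, also counting the scalar multiplications performed."""
    n, m, k = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must agree"
    C = [[0] * k for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(k):
            for l in range(m):  # c[i][j] = sum over l of a[i][l] * b[l][j]
                C[i][j] += A[i][l] * B[l][j]
                mults += 1      # one scalar multiplication per inner step
    return C, mults
```

For a 2×3 times 3×4 product the counter comes out to 2·3·4 = 24, matching the slide's nmk.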
Multiplying Matrices (2)
• The parenthesization matters
• Consider A×B×C×D, where A is 30×1, B is 1×40, C is 40×10, D is 10×25
• Costs:
  – ((AB)C)D = 1200 + 12000 + 7500 = 20700
  – (AB)(CD) = 1200 + 10000 + 30000 = 41200
  – A((BC)D) = 400 + 250 + 750 = 1400
• We need to optimally parenthesize
A1 · A2 · … · An, where Ai is a di-1×di matrix
Multiplying Matrices (3)
• Let M(i,j) be the minimum number of multiplications necessary to compute
• Key observations
  – The outermost parenthesis partitions the chain of matrices (i,j) at some k (i ≤ k < j): (Ai… Ak)(Ak+1… Aj)
  – The optimal parenthesization of matrices (i,j) has optimal parenthesizations on either side of k: for matrices (i,k) and (k+1,j)
Ai · Ai+1 · … · Aj  (= ∏ k=i..j Ak)
Multiplying Matrices (4)
• We try out all possible k. Recurrence:
• A direct recursive implementation is exponential – there is a lot of duplicated work (why?)
• But there are only Θ(n²) different subproblems (i,j), where 1 ≤ i ≤ j ≤ n
M(i,i) = 0
M(i,j) = min over i ≤ k < j of { M(i,k) + M(k+1,j) + di-1·dk·dj }

Number of subproblems: (n choose 2) + n = Θ(n²)
Multiplying Matrices (5)
• Thus, it requires only Θ(n²) space to store the optimal cost M(i,j) for each of the subproblems: half of a 2d array M[1..n,1..n]
Matrix-Chain-Order(d0…dn)
  for i ← 1 to n do
    M[i,i] ← 0
  for l ← 2 to n do
    for i ← 1 to n-l+1 do
      j ← i+l-1
      M[i,j] ← ∞
      for k ← i to j-1 do
        q ← M[i,k] + M[k+1,j] + di-1·dk·dj
        if q < M[i,j] then
          M[i,j] ← q
          c[i,j] ← k
  return M, c
Multiplying Matrices (6)
• After execution: M[1,n] contains the value of the optimal solution, and c contains the optimal subdivisions (choices of k) of every subproblem into two subsubproblems
• A simple recursive algorithm Print-Optimal-Parens(c, i, j) can be used to reconstruct an optimal parenthesization
• Let us run the algorithm on d = [10, 20, 3, 5, 30]
Multiplying Matrices (7)
• Running time
  – It is easy to see that it is O(n³)
  – It turns out, it is also Ω(n³)
• From exponential time to polynomial
Memoization
• If we still like recursion very much, we can structure our algorithm as a recursive algorithm:
  – Initialize all M elements to ∞ and call Lookup-Chain(d, i, j)
Lookup-Chain(d,i,j)
  if M[i,j] < ∞ then
    return M[i,j]
  if i = j then
    M[i,j] ← 0
  else for k ← i to j-1 do
    q ← Lookup-Chain(d,i,k) + Lookup-Chain(d,k+1,j) + di-1·dk·dj
    if q < M[i,j] then
      M[i,j] ← q
  return M[i,j]
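The same memoized idea in Python, with functools.lru_cache standing in for the explicit M table (a sketch; names are ours). On the lecture's two instances it reproduces the costs 1950 (for d = [10, 20, 3, 5, 30]) and 1400 (for the 30×1, 1×40, 40×10, 10×25 chain):

```python
from functools import lru_cache

def lookup_chain(d):
    """Top-down (memoized) matrix-chain cost; matrix A_i is d[i-1] x d[i].
    Returns the minimum number of scalar multiplications for A_1..A_n."""
    n = len(d) - 1

    @lru_cache(maxsize=None)          # the cache plays the role of table M
    def cost(i, j):
        if i == j:
            return 0                  # a single matrix needs no multiplication
        return min(
            cost(i, k) + cost(k + 1, j) + d[i - 1] * d[k] * d[j]
            for k in range(i, j)      # try every split point, as in the slide
        )

    return cost(1, n)
```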
Dynamic Programming
• In general, to apply dynamic programming, we have to address a number of issues:
  – 1. Show optimal substructure – an optimal solution to the problem contains within it optimal solutions to sub-problems
    • Solution to a problem:
      – Making a choice out of a number of possibilities (look what possible choices there can be)
      – Solving one or more sub-problems that are the result of a choice (characterize the space of sub-problems)
    • Show that solutions to sub-problems must themselves be optimal for the whole solution to be optimal (use a "cut-and-paste" argument)
Dynamic Programming (2)
– 2. Write a recurrence for the value of an optimal solution
  • Mopt = min over all choices k { (sum of Mopt of all sub-problems resulting from choice k) + (the cost associated with making the choice k) }
– Show that the number of different instances of sub-problems is bounded by a polynomial
Dynamic Programming (3)
– 3. Compute the value of an optimal solution in a bottom-up fashion, so that you always have the necessary sub-results pre-computed (or use memoization)
– See if it is possible to reduce the space requirements by "forgetting" solutions to sub-problems that will not be used any more
– 4. Construct an optimal solution from computed information (which records the sequence of choices made that leads to an optimal solution)
Longest Common Subsequence
• Two text strings are given: X and Y
• There is a need to quantify how similar they are:
  – Comparing DNA sequences in studies of evolution of different species
  – Spell checkers
• One of the measures of similarity is the length of a Longest Common Subsequence (LCS)
LCS: Definition
• Z is a subsequence of X, if it is possible to generate Z by skipping some (possibly none) characters from X
• For example: X =“ACGGTTA”, Y=“CGTAT”, LCS(X,Y) = “CGTA” or “CGTT”
• To solve LCS problem we have to find “skips” that generate LCS(X,Y) from X, and “skips” that generate LCS(X,Y) from Y
LCS: Optimal Substructure
• We make Z empty and proceed from the ends of Xm = "x1 x2 … xm" and Yn = "y1 y2 … yn"
  – If xm = yn, append this symbol to the beginning of Z, and optimally find LCS(Xm-1, Yn-1)
  – If xm ≠ yn,
    • Skip either a letter from X
    • or a letter from Y
    • Decide which by comparing LCS(Xm, Yn-1) and LCS(Xm-1, Yn)
  – "Cut-and-paste" argument
LCS: Recurrence
• The algorithm could be easily extended by allowing more “editing” operations in addition to copying and skipping (e.g., changing a letter)
• Let c[i,j] = the length of LCS(Xi, Yj)
• Observe: conditions in the problem restrict sub-problems (What is the total number of sub-problems?)
c[i,j] = 0                            if i = 0 or j = 0
c[i,j] = c[i-1,j-1] + 1               if i,j > 0 and xi = yj
c[i,j] = max{ c[i,j-1], c[i-1,j] }    if i,j > 0 and xi ≠ yj
LCS: Compute the Optimum
LCS-Length(X, Y, m, n)
  for i ← 1 to m do
    c[i,0] ← 0
  for j ← 0 to n do
    c[0,j] ← 0
  for i ← 1 to m do
    for j ← 1 to n do
      if xi = yj then
        c[i,j] ← c[i-1,j-1] + 1
        b[i,j] ← "copy"
      else if c[i-1,j] ≥ c[i,j-1] then
        c[i,j] ← c[i-1,j]
        b[i,j] ← "skipx"
      else
        c[i,j] ← c[i,j-1]
        b[i,j] ← "skipy"
  return c, b
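A Python rendering of LCS-Length, plus a recursive reconstruction of one LCS from the choice table b (a sketch; function names are ours). On the lecture's example X = "ACGGTTA", Y = "CGTAT" it finds a common subsequence of length 4:

```python
def lcs_length(x: str, y: str):
    """Bottom-up LCS table: c[i][j] = length of an LCS of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]    # row 0 / column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]  # records the choice made
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "copy"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "skipx"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "skipy"
    return c, b

def reconstruct(x, b, i, j):
    """Walk the choice table b back from (i, j) to recover one LCS."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "copy":
        return reconstruct(x, b, i - 1, j - 1) + x[i - 1]
    if b[i][j] == "skipx":
        return reconstruct(x, b, i - 1, j)
    return reconstruct(x, b, i, j - 1)
```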
LCS: Example
• Let's run: X = "ACGGTTA", Y = "CGTAT"
• How much can we reduce our space requirements if we do not need to reconstruct the LCS?
February 8, 2003
Shortest Path
• Generalize distance to the weighted setting
• Digraph G = (V,E) with weight function w: E → R (assigning real values to edges)
• Weight of path p = v1 → v2 → … → vk is
  w(p) = Σ i=1..k-1  w(vi, vi+1)
• Shortest path = a path of the minimum weight
• Applications
  – static/dynamic network routing
  – robot motion planning
  – map/route generation in traffic
Shortest-Path Problems
• Shortest-path problems
  – Single-source (single-destination). Find a shortest path from a given source (vertex s) to each of the vertices. The topic of this lecture.
  – Single-pair. Given two vertices, find a shortest path between them. A solution to the single-source problem solves this problem efficiently, too.
  – All-pairs. Find shortest paths for every pair of vertices. Dynamic programming algorithm.
  – Unweighted shortest paths – BFS.
Optimal Substructure
• Theorem: subpaths of shortest paths are shortest paths
• Proof (cut and paste)
  – if some subpath were not a shortest path, one could substitute the shorter subpath and create a shorter total path
Triangle Inequality
• Definition
  – δ(u,v) ≡ weight of a shortest path from u to v
• Theorem
  – δ(u,v) ≤ δ(u,x) + δ(x,v) for any x
• Proof
  – a shortest path u ⇝ v is no longer than any other path u ⇝ v – in particular, the path concatenating the shortest path u ⇝ x with the shortest path x ⇝ v
Negative Weights and Cycles?
• Negative edges are OK, as long as there are no negative-weight cycles (otherwise paths with arbitrarily small "lengths" would be possible)
• Shortest paths can have no cycles (otherwise we could improve them by removing cycles)
  – Any shortest path in graph G can have no more than n – 1 edges, where n is the number of vertices
Relaxation
• For each vertex in the graph, we maintain d[v], the estimate of the shortest-path weight from s, initialized to ∞ at the start
• Relaxing an edge (u,v) means testing whether we can improve the shortest path to v found so far by going through u
[Figure: two Relax(u,v) examples on an edge of weight 2: with d[u] = 5 and d[v] = 9, relaxation improves d[v] to 7; with d[u] = 5 and d[v] = 6, d[v] is already no worse and stays 6]
Relax(u,v,w)
  if d[v] > d[u] + w(u,v) then
    d[v] ← d[u] + w(u,v)
    π[v] ← u
Dijkstra's Algorithm
• Non-negative edge weights
• Greedy, similar to Prim's algorithm for MST
• Like breadth-first search (if all weights = 1, one can simply use BFS)
• Use Q, a priority queue keyed by d[v] (BFS used a FIFO queue; here we use a PQ, which is re-organized whenever some d decreases)
• Basic idea
  – maintain a set S of solved vertices
  – at each step select the "closest" vertex u, add it to S, and relax all edges from u
Dijkstra’s Pseudo Code
• Graph G, weight function w, root s

Dijkstra(G,w,s)
01 for each v ∈ V[G]
02   d[v] ← ∞
03 d[s] ← 0
04 S ← ∅
05 Q ← V[G]
06 while Q ≠ ∅ do
07   u ← Extract-Min(Q)
08   S ← S ∪ {u}
09   for each v ∈ Adj[u] do
10     if d[v] > d[u] + w(u,v) then
11       d[v] ← d[u] + w(u,v)   ⇐ relaxing edges
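Dijkstra's algorithm can be sketched in runnable Python with heapq standing in for the priority queue; since heapq has no Decrease-Key, the sketch uses lazy deletion of stale entries instead. The vertex names and edge list below follow the worked example on the next slides as far as they can be reconstructed, so they are an assumption:

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths for non-negative edge weights.
    graph: dict mapping u -> list of (v, weight). Returns distances d."""
    d = {v: float("inf") for v in graph}
    d[s] = 0
    pq = [(0, s)]
    solved = set()                      # the set S of solved vertices
    while pq:
        du, u = heapq.heappop(pq)       # Extract-Min
        if u in solved:
            continue                    # stale heap entry, skip it
        solved.add(u)
        for v, w in graph[u]:           # relax all edges leaving u
            if d[v] > du + w:
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))  # in place of Decrease-Key
    return d

# Edge list reconstructed from the worked example (an assumption):
example = {
    "s": [("u", 10), ("x", 5)],
    "u": [("v", 1), ("x", 2)],
    "v": [("y", 4)],
    "x": [("u", 3), ("v", 9), ("y", 2)],
    "y": [("v", 6), ("s", 7)],
}
```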
Dijkstra’s Example
[Figure: four snapshots of Dijkstra's algorithm on a five-vertex example graph (s, u, v, x, y); the estimates (d[u], d[v], d[x], d[y]) evolve from (∞, ∞, ∞, ∞) with d[s] = 0, to (10, ∞, 5, ∞), then (8, 14, 5, 7), then (8, 13, 5, 7)]
• Observe
  – relaxation step (lines 10-11)
  – setting d[v] updates Q (needs Decrease-Key)
  – similar to Prim's MST algorithm
Dijkstra’s Example (2)
[Figure: the final two snapshots; the algorithm terminates with d[u] = 8, d[v] = 9, d[x] = 5, d[y] = 7]
Dijkstra’s Correctness
• We will prove that whenever u is added to S, d[u] = δ(s,u), i.e., that d is minimum, and that equality is maintained thereafter
• Proof
  – Note that ∀v, d[v] ≥ δ(s,v)
  – Let u be the first vertex picked such that there is a shorter path than d[u], i.e., such that d[u] > δ(s,u)
  – We will show that this assumption leads to a contradiction
Dijkstra Correctness (2)
• Let y be the first vertex ∈ V – S on the actual shortest path from s to u; then it must be that d[y] = δ(s,y) because
  – d[x] is set correctly for y's predecessor x ∈ S on the shortest path (by choice of u as the first vertex for which d is set incorrectly)
  – when the algorithm inserted x into S, it relaxed the edge (x,y), assigning d[y] the correct value
• But d[u] > d[y] ⇒ the algorithm would have chosen y (from the PQ) to process next, not u ⇒ contradiction
• Thus d[u] = δ(s,u) at the time of insertion of u into S, and Dijkstra's algorithm is correct
Dijkstra Correctness (3)
d[u] > δ(s,u)                  (initial assumption)
     = δ(s,y) + δ(y,u)         (optimal substructure)
     = d[y] + δ(y,u)           (correctness of d[y])
     ≥ d[y]                    (no negative weights)
Dijkstra's Running Time
• Extract-Min executed |V| times
• Decrease-Key executed |E| times
• Time = |V|·T(Extract-Min) + |E|·T(Decrease-Key)
• T depends on the Q implementation:

  Q              | T(Extract-Min) | T(Decrease-Key) | Total
  array          | O(V)           | O(1)            | O(V²)
  binary heap    | O(lg V)        | O(lg V)         | O(E lg V)
  Fibonacci heap | O(lg V)        | O(1) (amort.)   | O(V lg V + E)
Bellman-Ford Algorithm
• Dijkstra's doesn't work when there are negative edges:
  – Intuition – we cannot be greedy any more on the assumption that the lengths of paths will only increase in the future
• The Bellman-Ford algorithm detects negative cycles (returns false) or returns the shortest-path tree
Bellman-Ford Algorithm
Bellman-Ford(G,w,s)
  for each v ∈ V[G]
    d[v] ← ∞
  d[s] ← 0
  π[s] ← NIL
  for i ← 1 to |V[G]|-1 do
    for each edge (u,v) ∈ E[G] do
      Relax(u,v,w)
  for each edge (u,v) ∈ E[G] do
    if d[v] > d[u] + w(u,v) then return false
  return true
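A runnable Python sketch of Bellman-Ford (names are ours). The edge list below follows the worked example on the next slides as far as it can be reconstructed, so it is an assumption; the final distances it produces match the example's last snapshot:

```python
def bellman_ford(vertices, edges, s):
    """Bellman-Ford single-source shortest paths.
    edges: list of (u, v, weight). Returns (ok, d): ok is False when a
    negative-weight cycle is reachable from s, d holds distance estimates."""
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):  # |V|-1 rounds of relaxing every edge
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:               # one more pass: anything still
        if d[u] + w < d[v]:             # relaxable means a negative cycle
            return False, d
    return True, d

# Edge list reconstructed from the worked example (an assumption):
example_edges = [
    ("s", "t", 6), ("s", "y", 7),
    ("t", "x", 5), ("t", "y", 8), ("t", "z", -4),
    ("x", "t", -2),
    ("y", "x", -3), ("y", "z", 9),
    ("z", "x", 7), ("z", "s", 2),
]
```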
Bellman-Ford Example
[Figure: four snapshots of Bellman-Ford on a five-vertex example graph (s, t, x, y, z) with some negative edge weights; the estimates (d[t], d[x], d[y], d[z]) evolve from (∞, ∞, ∞, ∞) with d[s] = 0, to (6, ∞, 7, ∞), then (6, 4, 7, 2), then (2, 4, 7, 2)]
Bellman-Ford Example
[Figure: the final snapshot; d[t] = 2, d[x] = 4, d[y] = 7, d[z] = -2]
• Bellman-Ford running time:
  – (|V|-1)|E| + |E| = Θ(VE)
Correctness of Bellman-Ford
• Let δi(s,u) denote the length of the path from s to u that is shortest among all paths that contain at most i edges
• Prove by induction that d[u] = δi(s,u) after the i-th iteration of Bellman-Ford
  – Base case (i=0) trivial
  – Inductive step (say d[u] = δi-1(s,u)):
    • Either δi(s,u) = δi-1(s,u)
    • Or δi(s,u) = δi-1(s,z) + w(z,u)
    • In an iteration we try to relax each edge ((z,u) too), so we will catch both cases, thus d[u] = δi(s,u)
Correctness of Bellman-Ford
• After n-1 iterations, d[u] = δn-1(s,u) for each vertex u
• If there is still some edge to relax in the graph, then there is a vertex u such that δn(s,u) < δn-1(s,u). But there are only n vertices in G – we have a cycle, and it must be negative
• Otherwise, d[u] = δn-1(s,u) = δ(s,u) for all u, since any shortest path has at most n-1 edges
Shortest-Path in DAG’s
• Finding shortest paths in a DAG is much easier, because it is easy to find an order in which to do relaxations – topological sorting!

DAG-Shortest-Paths(G,w,s)
  for each v ∈ V[G]
    d[v] ← ∞
  d[s] ← 0
  topologically sort V[G]
  for each vertex u, taken in topological order do
    for each vertex v ∈ Adj[u] do
      Relax(u,v,w)
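A Python sketch of the DAG variant: topologically sort via recursive depth-first post-order, then make a single relaxation pass in that order (the function names and the small example DAG are ours):

```python
def dag_shortest_paths(graph, s):
    """Shortest paths in a DAG: relax edges in topological order.
    graph: dict mapping u -> list of (v, weight). Returns distances d."""
    # Topological sort via recursive depth-first post-order.
    order, seen = [], set()
    def visit(u):
        seen.add(u)
        for v, _ in graph[u]:
            if v not in seen:
                visit(v)
        order.append(u)                 # post-order: u after its successors
    for u in graph:
        if u not in seen:
            visit(u)
    order.reverse()                     # reverse post-order = topological order

    d = {v: float("inf") for v in graph}
    d[s] = 0
    for u in order:                     # one relaxation pass suffices in a DAG
        for v, w in graph[u]:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    return d
```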