3
Historical Reasons to Care about Runtime Analysis
• Charles Babbage (1864): “As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise - By what course of calculation can these results be arrived at by the machine in the shortest time?”
[Image: Analytic Engine (schematic)]
• The earliest computational problems were physical: What’s the fastest way to reconfigure a train using two turntables?
• When first invented, computers were rare (and quite slow)
  – Users had to schedule use-time
  – This held true until about 1990
• The Apple rounded-rectangle story
1
CSC 430: Basics of Algorithm Runtime Analysis
2
What is Runtime Analysis?
• Fundamentally, it is the process of counting the number of “steps” it takes for an algorithm to terminate
• “Step” = primitive computer operation
  – Arithmetic
  – Read/Write memory
  – Make a comparison
  – …
• The number of steps needed depends on:
  – The fundamental algorithmic structure
  – The data structures you use
5
Other Types of Algorithm Analysis
• Algorithm space analysis
• Approximate-solution analysis
• Parallel algorithm analysis
• A PROBLEM’S runtime analysis
• Complexity class analysis
6
Runtime Motivates Design
• Easiest algorithm to implement for any problem: brute force.
  – Check every possible solution.
  – Typically takes 2^N time or worse for inputs of size N.
  – Unacceptable in practice.
• Usually, through insight, we can do much better
• What do we mean by “better”?
7
Worst-Case Analysis
• Best case running time:
  – Rarely used; often superfluous
• Average case running time:
  – Obtain bound on running time of algorithm on random input as a function of input size N.
  – Hard (or impossible) to accurately model real instances by random distributions.
  – Acceptable in certain, easily-justified scenarios
• Worst case running time:
  – Obtain bound on largest possible running time
  – Generally captures efficiency in practice.
  – Pessimistic view, but hard to find an effective alternative.
9
Defining Efficiency: Worst-Case Polynomial-Time
• Def. An algorithm is efficient if its worst-case running time is polynomial or better.
• Not always true, but works in practice:
  – 6.02 × 10^23 · N^20 is technically poly-time
  – Breaking through the exponential barrier of brute force typically exposes some crucial structure of the problem.
  – Most poly-time algorithms that people develop have low constants and low exponents (e.g., N^1, N^2, N^3).
• Exceptions.
  – Sometimes the problem size demands greater efficiency than poly-time
  – Some poly-time algorithms have high constants and/or exponents, and are useless in practice.
  – Some exponential-time (or worse) algorithms are widely used because the worst-case instances seem to be rare (e.g., the simplex method, Unix grep).
11
2.2 Asymptotic Order of Growth
12
What is Asymptotic Analysis?
• Instead of giving a precise runtime, we’ll show that algorithms are bounded within a constant factor of common functions
• Three types of bounds
  – Upper bounds (never greater)
  – Lower bounds (never less)
  – Tight bounds (upper and lower are the same)
13
Growth Notation
• Upper bounds (Big-Oh):
  – T(n) ∈ O(f(n)) if there exist constants c > 0 and n₀ ≥ 0 such that for all n ≥ n₀ we have T(n) ≤ c · f(n).
• Lower bounds (Omega):
  – T(n) ∈ Ω(f(n)) if there exist constants c > 0 and n₀ ≥ 0 such that for all n ≥ n₀ we have T(n) ≥ c · f(n).
• Tight bounds (Theta):
  – T(n) ∈ Θ(f(n)) if T(n) is both O(f(n)) and Ω(f(n)).
• Frequently abused notation: T(n) = O(f(n)).
  – Asymmetric:
    • f(n) = 5n³; g(n) = 3n²
    • f(n) = O(n³) = g(n)
    • but f(n) ≠ g(n).
• Better notation: T(n) ∈ O(f(n)).
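• Worked example (applying the Big-Oh definition directly): for T(n) = 3n² + 2n, choose c = 5 and n₀ = 1; then for all n ≥ 1, 3n² + 2n ≤ 3n² + 2n² = 5n², so T(n) ∈ O(n²).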
14
Use-Cases
• Big-Oh:
  – “The algorithm’s runtime is O(N lg N), a potential improvement over prior approaches, which were O(N²).”
• Omega:
  – “Their algorithm’s runtime is Ω(N²), which is worse than our algorithm’s proven O(N^1.5) runtime.”
• Theta:
  – “Our original algorithm was Ω(N lg N) and O(N²). With modifications, we were able to improve the runtime bounds to Θ(N^1.5).”
[Figure: growth of N², N^1.5, and N lg N for N from 0 to 300]
15
Properties
• Transitivity.
  – If f ∈ O(g) and g ∈ O(h) then f ∈ O(h).
  – If f ∈ Ω(g) and g ∈ Ω(h) then f ∈ Ω(h).
  – If f ∈ Θ(g) and g ∈ Θ(h) then f ∈ Θ(h).
• Additivity.
  – If f ∈ O(h) and g ∈ O(h) then f + g ∈ O(h).
  – If f ∈ Ω(h) and g ∈ Ω(h) then f + g ∈ Ω(h).
  – If f ∈ Θ(h) and g ∈ O(h) then f + g ∈ Θ(h).
  – These generalize to any number of terms
• “Symmetry”:
  – If f ∈ Θ(h) then h ∈ Θ(f)
  – If f ∈ O(h) then h ∈ Ω(f)
  – If f ∈ Ω(h) then h ∈ O(f)
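• Why additivity holds (one-line sketch): if f(n) ≤ c₁ · h(n) and g(n) ≤ c₂ · h(n) for all n past both thresholds, then f(n) + g(n) ≤ (c₁ + c₂) · h(n), so c₁ + c₂ serves as the new constant.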
16
Asymptotic Bounds for Some Common Functions
• Polynomials. a₀ + a₁n + … + a_d n^d is Θ(n^d) if a_d > 0.
• Polynomial time. Running time is O(n^d) for some constant d independent of the input size n.
• Logarithms. O(log_a n) = O(log_b n) for any constants a, b > 1, so we can avoid specifying the base.
• Logarithms. For every x > 0, log n = O(n^x): log grows slower than every polynomial.
• Exponentials. For every r > 1 and every d > 0, n^d = O(r^n): every exponential grows faster than every polynomial.
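• Why the base can be dropped (one-line derivation): log_a n = log_b n / log_b a, and 1 / log_b a is a positive constant when a, b > 1, so log_a n and log_b n differ only by a constant factor.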
17
Proving Bounds
• Use transitivity:
  – If an algorithm is at least as fast as another, then it shares that algorithm’s upper bound
• Use additivity:
  – f(n) = n³ + n² + n + 3
  – f(n) ∈ Θ(n³)
• Be direct:
  – Start with the definition and provide the parts it requires
• Take the logarithm of both functions
• Use limits:
  – If lim_{x→∞} f(x)/g(x) = c with 0 < c < ∞, then f(x) ∈ Θ(g(x))
  – If the limit is 0, then f(x) ∈ O(g(x))
  – If the limit is ∞, then f(x) ∈ Ω(g(x))
• Keep in mind that sometimes functions can’t be bounded. (But I won’t assign you any like that!)
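• Example: lim_{n→∞} (n³ + n² + n + 3) / n³ = 1, and 0 < 1 < ∞, so n³ + n² + n + 3 ∈ Θ(n³), matching the additivity result above.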
19
2.4 A Survey of Common Running Times
20
Linear Time: O(n)
• Linear time. Running time is at most a constant factor times the size of the input.
• Computing the maximum. Compute the maximum of n numbers a₁, …, aₙ.

def find_max(L):
    # track the largest value seen so far: one comparison per element
    # (renamed from the slide's max to avoid shadowing Python's built-in)
    best = L[0]
    for i in L:
        if i > best:
            best = i
    return best
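A quick usage check:

print(find_max([3, 1, 4, 1, 5]))  # 5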
21
Linear Time: O(n)
• Claim. Merging two sorted lists of size n takes O(n) time.
• Pf. After each comparison, the length of the output list increases by 1.

# the merge algorithm combines two sorted
# lists into one larger sorted list
def merge(A, B):
    i = j = 0
    M = []
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            M.append(A[i])
            i += 1
        else:
            M.append(B[j])
            j += 1
    # one list is exhausted; append the remainder of the other
    if i == len(A):
        return M + B[j:]
    return M + A[i:]
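A quick usage check:

print(merge([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]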
22
O(n log n) Time
• O(n log n) time. Arises in divide-and-conquer algorithms.
• Sorting. Mergesort and heapsort are sorting algorithms that perform O(n log n) comparisons.
• Largest empty interval. Given n time-stamps x₁, …, xₙ on which copies of a file arrive at a server, what is the largest interval of time when no copies of the file arrive?
• O(n log n) solution. Sort the time-stamps. Scan the sorted list in order, identifying the maximum gap between successive time-stamps (sketched below).
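A minimal sketch of that solution (the function name largest_empty_interval is my own; the slides give no code here):

def largest_empty_interval(stamps):
    # sorting dominates the runtime: O(n log n)
    xs = sorted(stamps)
    # a single O(n) pass over successive pairs finds the widest gap
    return max(xs[i + 1] - xs[i] for i in range(len(xs) - 1))

print(largest_empty_interval([4.0, 1.0, 9.5, 2.5]))  # 5.5, the gap between 4.0 and 9.5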
23
Quadratic Time: O(n²)
• Quadratic time. Enumerate all pairs of elements.
• Closest pair of points. Given a list of n points in the plane (x₁, y₁), …, (xₙ, yₙ), find the pair that is closest.
• Remark. Ω(n²) seems inevitable, but this is just an illusion (see chapter 5).

def closest_pair(L):
    x1 = L[0]
    x2 = L[1]
    # no need to compute the sqrt for the distance:
    # comparing squared distances gives the same answer
    min_d = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
    for i in range(len(L)):
        for j in range(i + 1, len(L)):
            x1, x2 = L[i], L[j]
            d = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
            if d < min_d:
                min_d = d
    return min_d
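A quick usage check (note the return value is a squared distance):

print(closest_pair([(0, 0), (3, 4), (1, 1)]))  # 2, from the pair (0, 0) and (1, 1)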
24
Cubic Time: O(n³)
• Cubic time. Enumerate all triples of elements.
• Set disjointness. Given n sets S₁, …, Sₙ, each of which is a subset of {1, 2, …, n}, is there some pair of these which are disjoint?

def find_disjoint_sets(S):
    for i in range(len(S)):
        for j in range(i + 1, len(S)):
            for p in S[i]:
                if p in S[j]:
                    break
            else:  # no p in S[i] belongs to S[j]
                return S[i], S[j]  # return the disjoint pair if found
    return None  # return nothing if every pair overlaps
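A quick usage check:

print(find_disjoint_sets([{1, 2}, {2, 3}, {3, 4}]))  # ({1, 2}, {3, 4})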
25
Polynomial Time: O(n^k)
• Independent set of size k. Given a graph, are there k nodes such that no two are joined by an edge? (Here k is a constant.)
• O(n^k) solution. Enumerate all subsets of k nodes.
  – Checking whether a subset S is an independent set takes O(k²).
  – Number of k-element subsets:
    C(n, k) = [n(n−1)(n−2)⋯(n−k+1)] / [k(k−1)(k−2)⋯(2)(1)] ≤ n^k / k!
  – Total: O(k² · n^k / k!) = O(n^k).
  – Poly-time for k = 17, but not practical.

def independent_k(V, E, k):
    # subset_k and independent omitted on the slide for space; see the sketch below
    for sub in subset_k(V, k):
        if independent(sub, E):
            return sub
    return None
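One possible sketch of the omitted helpers (the exact signatures and the edge representation, a set of (u, v) tuples, are my assumptions; the slides only name the functions):

from itertools import combinations

def subset_k(V, k):
    # yield every k-element subset of the vertex list V
    return combinations(V, k)

def independent(sub, E):
    # True if no two vertices in sub are joined by an edge
    for u, v in combinations(sub, 2):
        if (u, v) in E or (v, u) in E:
            return False
    return True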
26
Exponential Time
• Independent set. Given a graph, what is the maximum size of an independent set?
• O(2^n) solution. Enumerate all subsets.

def max_independent(V, E):
    # try ever-larger sizes; the first k with no independent set ends the search
    for k in range(1, len(V) + 1):
        if not independent_k(V, E, k):
            return k - 1
    return len(V)
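A small check, reusing independent_k from the previous slide and the helper sketch above (the path graph here is my own example):

V = [0, 1, 2, 3]
E = {(0, 1), (1, 2), (2, 3)}   # a path on four vertices
print(max_independent(V, E))   # 2, e.g. the set {0, 2}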