3
Historical Reasons to Care about Runtime Analysis
• Charles Babbage (1864): “As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise - By what course of calculation can these results be arrived at by the machine in the shortest time?”
[Image: Analytic Engine (schematic)]
• The earliest computational problems were physical: What’s the fastest way to reconfigure a train using two turntables?
• When first invented, computers were rare (and quite slow)
  – Users had to schedule use-time
  – This held true until about 1990
• The Apple rounded-rectangle story
1
CSC 430: Basics of Algorithm Runtime Analysis
2
What is Runtime Analysis?
• Fundamentally, it is the process of counting the number of “steps” it takes for an algorithm to terminate
• “Step” = primitive computer operation
  – Arithmetic
  – Read/Write memory
  – Make a comparison
  – …
• The number of steps needed depends on:
  – The fundamental algorithmic structure
  – The data structures you use
5
Other Types of Algorithm Analysis
• Algorithm space analysis
• Approximate-solution analysis
• Parallel algorithm analysis
• A PROBLEM’S runtime analysis
• Complexity class analysis
6
Runtime Motivates Design
• Easiest algorithm to implement for any problem: brute force.
  – Check every possible solution.
  – Typically takes 2^N time or worse for inputs of size N.
  – Unacceptable in practice.
• Usually, through insight, we can do much better
• What do we mean by “better”?
7
Worst-Case Analysis
• Best case running time:
  – Rarely used; often superfluous
• Average case running time:
  – Obtain bound on running time of algorithm on random input as a function of input size N.
  – Hard (or impossible) to accurately model real instances by random distributions.
  – Acceptable in certain, easily-justified scenarios
• Worst case running time:
  – Obtain bound on largest possible running time
  – Generally captures efficiency in practice.
  – Pessimistic view, but hard to find an effective alternative.
9
Defining Efficiency: Worst-Case Polynomial-Time
• Def. An algorithm is efficient if its worst-case running time is polynomial or better.
• Not always true, but works in practice:
  – 6.02 × 10^23 · N^20 is technically poly-time
  – Breaking through the exponential barrier of brute force typically exposes some crucial structure of the problem.
  – Most poly-time algorithms that people develop have low constants and low exponents (e.g., N^1, N^2, N^3).
• Exceptions.
  – Sometimes the problem size demands greater efficiency than poly-time
  – Some poly-time algorithms have high constants and/or exponents, and are useless in practice.
  – Some exponential-time (or worse) algorithms are widely used because the worst-case instances seem to be rare (e.g., the simplex method, Unix grep).
11
2.2 Asymptotic Order of Growth
12
What is Asymptotic Analysis?
• Instead of giving a precise runtime, we’ll show that algorithms are bounded within a constant factor of common functions
• Three types of bounds
  – Upper bounds (never greater)
  – Lower bounds (never less)
  – Tight bounds (upper and lower are the same)
13
Growth Notation
• Upper bounds (Big-Oh):
  – T(n) ∈ O(f(n)) if there exist constants c > 0 and n₀ ≥ 0 such that for all n ≥ n₀ we have T(n) ≤ c · f(n).
• Lower bounds (Omega):
  – T(n) ∈ Ω(f(n)) if there exist constants c > 0 and n₀ ≥ 0 such that for all n ≥ n₀ we have T(n) ≥ c · f(n).
• Tight bounds (Theta):
  – T(n) ∈ Θ(f(n)) if T(n) is both O(f(n)) and Ω(f(n)).
• Frequently abused notation: T(n) = O(f(n)).
  – Asymmetric:
    • f(n) = 5n³; g(n) = 3n²
    • f(n) = O(n³) = g(n)
    • but f(n) ≠ g(n).
• Better notation: T(n) ∈ O(f(n)).
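• Worked example (applying the Big-Oh definition directly): for T(n) = 3n² + 2n, choose c = 5 and n₀ = 1; then for all n ≥ 1, 3n² + 2n ≤ 3n² + 2n² = 5n², so T(n) ∈ O(n²).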
14
Use-Cases
• Big-Oh:
  – “The algorithm’s runtime is O(N lg N), a potential improvement over prior approaches, which were O(N²).”
• Omega:
  – “Their algorithm’s runtime is Ω(N²), which is worse than our algorithm’s proven O(N^1.5) runtime.”
• Theta:
  – “Our original algorithm was Ω(N lg N) and O(N²). With modifications, we were able to improve the runtime bounds to Θ(N^1.5).”
[Figure: growth of N², N^1.5, and N lg N for N from 0 to 300]
15
Properties
• Transitivity.
  – If f ∈ O(g) and g ∈ O(h) then f ∈ O(h).
  – If f ∈ Ω(g) and g ∈ Ω(h) then f ∈ Ω(h).
  – If f ∈ Θ(g) and g ∈ Θ(h) then f ∈ Θ(h).
• Additivity.
  – If f ∈ O(h) and g ∈ O(h) then f + g ∈ O(h).
  – If f ∈ Ω(h) and g ∈ Ω(h) then f + g ∈ Ω(h).
  – If f ∈ Θ(h) and g ∈ O(h) then f + g ∈ Θ(h).
  – These generalize to any number of terms
• “Symmetry”:
  – If f ∈ Θ(h) then h ∈ Θ(f)
  – If f ∈ O(h) then h ∈ Ω(f)
  – If f ∈ Ω(h) then h ∈ O(f)
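• Why additivity holds (one-line sketch): if f(n) ≤ c₁ · h(n) and g(n) ≤ c₂ · h(n) for all n past both thresholds, then f(n) + g(n) ≤ (c₁ + c₂) · h(n), so c₁ + c₂ serves as the new constant.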
16
Asymptotic Bounds for Some Common Functions
• Polynomials. a₀ + a₁n + … + a_d n^d is Θ(n^d) if a_d > 0.
• Polynomial time. Running time is O(n^d) for some constant d independent of the input size n.
• Logarithms. O(log_a n) = O(log_b n) for any constants a, b > 1, so we can avoid specifying the base.
• Logarithms. For every x > 0, log n = O(n^x): log grows slower than every polynomial.
• Exponentials. For every r > 1 and every d > 0, n^d = O(r^n): every exponential grows faster than every polynomial.
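• Why the base can be dropped (one-line derivation): log_a n = log_b n / log_b a, and 1 / log_b a is a positive constant when a, b > 1, so log_a n and log_b n differ only by a constant factor.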
17
Proving Bounds
• Use transitivity:
  – If an algorithm is at least as fast as another, then it shares that algorithm’s upper bound
• Use additivity:
  – f(n) = n³ + n² + n + 3
  – f(n) ∈ Θ(n³)
• Be direct:
  – Start with the definition and provide the parts it requires
• Take the logarithm of both functions
• Use limits:
  – If lim_{x→∞} f(x)/g(x) = c with 0 < c < ∞, then f(x) ∈ Θ(g(x))
  – If the limit is 0, then f(x) ∈ O(g(x))
  – If the limit is ∞, then f(x) ∈ Ω(g(x))
• Keep in mind that sometimes functions can’t be bounded. (But I won’t assign you any like that!)
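• Example: lim_{n→∞} (n³ + n² + n + 3) / n³ = 1, and 0 < 1 < ∞, so n³ + n² + n + 3 ∈ Θ(n³), matching the additivity result above.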
19
2.4 A Survey of Common Running Times
20
Linear Time: O(n)
• Linear time. Running time is at most a constant factor times the size of the input.
• Computing the maximum. Compute the maximum of n numbers a₁, …, aₙ.

def find_max(L):
    # track the largest value seen so far: one comparison per element
    # (renamed from the slide's max to avoid shadowing Python's built-in)
    best = L[0]
    for i in L:
        if i > best:
            best = i
    return best
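A quick usage check:

print(find_max([3, 1, 4, 1, 5]))  # 5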
21
Linear Time: O(n)
• Claim. Merging two sorted lists of size n takes O(n) time.
• Pf. After each comparison, the length of the output list increases by 1.

# the merge algorithm combines two sorted
# lists into one larger sorted list
def merge(A, B):
    i = j = 0
    M = []
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            M.append(A[i])
            i += 1
        else:
            M.append(B[j])
            j += 1
    # one list is exhausted; append the remainder of the other
    if i == len(A):
        return M + B[j:]
    return M + A[i:]
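A quick usage check:

print(merge([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]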
22
O(n log n) Time
• O(n log n) time. Arises in divide-and-conquer algorithms.
• Sorting. Mergesort and heapsort are sorting algorithms that perform O(n log n) comparisons.
• Largest empty interval. Given n time-stamps x₁, …, xₙ on which copies of a file arrive at a server, what is the largest interval of time when no copies of the file arrive?
• O(n log n) solution. Sort the time-stamps. Scan the sorted list in order, identifying the maximum gap between successive time-stamps (sketched below).
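A minimal sketch of that solution (the function name largest_empty_interval is my own; the slides give no code here):

def largest_empty_interval(stamps):
    # sorting dominates the runtime: O(n log n)
    xs = sorted(stamps)
    # a single O(n) pass over successive pairs finds the widest gap
    return max(xs[i + 1] - xs[i] for i in range(len(xs) - 1))

print(largest_empty_interval([4.0, 1.0, 9.5, 2.5]))  # 5.5, the gap between 4.0 and 9.5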
23
Quadratic Time: O(n²)
• Quadratic time. Enumerate all pairs of elements.
• Closest pair of points. Given a list of n points in the plane (x₁, y₁), …, (xₙ, yₙ), find the pair that is closest.
• Remark. Ω(n²) seems inevitable, but this is just an illusion (see chapter 5).

def closest_pair(L):
    x1 = L[0]
    x2 = L[1]
    # no need to compute the sqrt for the distance:
    # comparing squared distances gives the same answer
    min_d = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
    for i in range(len(L)):
        for j in range(i + 1, len(L)):
            x1, x2 = L[i], L[j]
            d = (x1[0] - x2[0])**2 + (x1[1] - x2[1])**2
            if d < min_d:
                min_d = d
    return min_d
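A quick usage check (note the return value is a squared distance):

print(closest_pair([(0, 0), (3, 4), (1, 1)]))  # 2, from the pair (0, 0) and (1, 1)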
24
Cubic Time: O(n³)
• Cubic time. Enumerate all triples of elements.
• Set disjointness. Given n sets S₁, …, Sₙ, each of which is a subset of {1, 2, …, n}, is there some pair of these which are disjoint?

def find_disjoint_sets(S):
    for i in range(len(S)):
        for j in range(i + 1, len(S)):
            for p in S[i]:
                if p in S[j]:
                    break
            else:  # no p in S[i] belongs to S[j]
                return S[i], S[j]  # return the disjoint pair if found
    return None  # return nothing if every pair overlaps
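A quick usage check:

print(find_disjoint_sets([{1, 2}, {2, 3}, {3, 4}]))  # ({1, 2}, {3, 4})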
25
Polynomial Time: O(n^k)
• Independent set of size k. Given a graph, are there k nodes such that no two are joined by an edge? (Here k is a constant.)
• O(n^k) solution. Enumerate all subsets of k nodes.
  – Checking whether a subset S is an independent set takes O(k²).
  – Number of k-element subsets:
    C(n, k) = [n(n−1)(n−2)⋯(n−k+1)] / [k(k−1)(k−2)⋯(2)(1)] ≤ n^k / k!
  – Total: O(k² · n^k / k!) = O(n^k).
  – Poly-time for k = 17, but not practical.

def independent_k(V, E, k):
    # subset_k and independent omitted on the slide for space; see the sketch below
    for sub in subset_k(V, k):
        if independent(sub, E):
            return sub
    return None
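One possible sketch of the omitted helpers (the exact signatures and the edge representation, a set of (u, v) tuples, are my assumptions; the slides only name the functions):

from itertools import combinations

def subset_k(V, k):
    # yield every k-element subset of the vertex list V
    return combinations(V, k)

def independent(sub, E):
    # True if no two vertices in sub are joined by an edge
    for u, v in combinations(sub, 2):
        if (u, v) in E or (v, u) in E:
            return False
    return True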
26
Exponential Time
• Independent set. Given a graph, what is the maximum size of an independent set?
• O(2^n) solution. Enumerate all subsets.

def max_independent(V, E):
    # try ever-larger sizes; the first k with no independent set ends the search
    for k in range(1, len(V) + 1):
        if not independent_k(V, E, k):
            return k - 1
    return len(V)
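A small check, reusing independent_k from the previous slide and the helper sketch above (the path graph here is my own example):

V = [0, 1, 2, 3]
E = {(0, 1), (1, 2), (2, 3)}   # a path on four vertices
print(max_independent(V, E))   # 2, e.g. the set {0, 2}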