computer science 112 fundamentals of programming ii finding faster algorithms

Computer Science 112

Fundamentals of Programming II

Finding Faster Algorithms

Bubble Sort Strategy

• Compare the first two items and if they are out of order, exchange them

• Repeat this process for the second and third items, etc.

• At the end of this process, the largest item will have bubbled down to the end of the list

• Repeat this process for the unsorted portion of the list, etc.

set n to the length of the list
while n > 1
    bubble the elements from position 0 to position n - 1
    decrement n

Formalize the Strategy

set n to the length of the list
while n > 1
    for each position i from 1 to n - 1
        if the elements at i and i - 1 are out of order
            swap them
    decrement n

Refine the Strategy

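# Helper the slide code relies on; a minimal version is assumed here
def swap(lyst, i, j):
    """Exchange the items at positions i and j."""
    lyst[i], lyst[j] = lyst[j], lyst[i]

def bubbleSort(lyst):
    n = len(lyst)
    while n > 1:                        # Do n - 1 bubbles
        for i in range(1, n):           # Start each bubble
            if lyst[i] < lyst[i - 1]:
                swap(lyst, i, i - 1)    # Swap if needed
        n -= 1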

Implement bubbleSort

Analysis:

How many iterations does the outer loop perform?

How many iterations does the inner loop perform?
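One way to explore these two questions is to instrument the sort with a counter. The sketch below is an illustration rather than part of the original slides: the name `bubble_sort_counting` and the decision to sort a copy of the list are assumptions made for the example.

```python
def bubble_sort_counting(lyst):
    """Bubble sort a copy of lyst; return (sorted_copy, comparisons)."""
    items = list(lyst)          # work on a copy so the caller's list is untouched
    comparisons = 0
    n = len(items)
    while n > 1:                # outer loop: one pass per unsorted prefix
        for i in range(1, n):   # inner loop: n - 1 comparisons in this pass
            comparisons += 1
            if items[i] < items[i - 1]:
                items[i], items[i - 1] = items[i - 1], items[i]
        n -= 1
    return items, comparisons


for size in (4, 8, 16):
    data = list(range(size, 0, -1))             # reversed input
    _, count = bubble_sort_counting(data)
    print(size, count, size * (size - 1) // 2)  # measured vs. n(n - 1)/2
```

For a list of length n, the outer loop runs n - 1 times and the inner loop performs (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons in total, whatever the input order, which is why this version is O(n^2).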

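# swap(lyst, i, j) is assumed to exchange the items at positions i and j
def bubbleSort(lyst):
    n = len(lyst)
    while n > 1:
        isSorted = True
        for i in range(1, n):           # Start at 1 so that i - 1 is valid
            if lyst[i] < lyst[i - 1]:
                swap(lyst, i, i - 1)
                isSorted = False
        if isSorted:                    # No swaps in this pass: done
            break
        n -= 1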

Improving bubbleSort

Analysis:

Best, worst, average cases?
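To get a feel for the best and worst cases, the early-exit version can be instrumented the same way. This is a sketch under assumptions: `improved_bubble_sort_counting` is a hypothetical name, and it sorts a copy so the same input can be reused.

```python
def improved_bubble_sort_counting(lyst):
    """Early-exit bubble sort on a copy; return (sorted_copy, comparisons)."""
    items = list(lyst)
    comparisons = 0
    n = len(items)
    while n > 1:
        is_sorted = True
        for i in range(1, n):
            comparisons += 1
            if items[i] < items[i - 1]:
                items[i], items[i - 1] = items[i - 1], items[i]
                is_sorted = False
        if is_sorted:           # a pass with no swaps means the list is sorted
            break
        n -= 1
    return items, comparisons


already_sorted = list(range(8))
reversed_input = list(range(8, 0, -1))
print(improved_bubble_sort_counting(already_sorted)[1])   # 7
print(improved_bubble_sort_counting(reversed_input)[1])   # 28
```

On sorted input a single pass of n - 1 comparisons suffices, so the best case is linear. On reversed input every pass swaps, so the count is still n(n - 1)/2 and the worst case stays quadratic.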

Example: Exponentiation

def ourPow(base, expo):
    if expo == 0:
        return 1
    else:
        return base * ourPow(base, expo - 1)

What is the best case performance? Worst case? Average case?

Recursive definition:

b^n = 1, when n = 0
b^n = b * b^(n-1), otherwise

Faster Exponentiation

def fastPow(base, expo):
    if expo == 0:
        return 1
    elif expo % 2 == 1:
        return base * fastPow(base, expo - 1)
    else:
        result = fastPow(base, expo // 2)
        return result * result

What is the best case performance? Worst case? Average case?

Recursive definition:

b^n = 1, when n = 0
b^n = b * b^(n-1), when n is odd
b^n = (b^(n/2))^2, when n is even
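The difference between the two recursive definitions shows up clearly when the calls are counted. The following sketch is an illustration only: `our_pow_calls` and `fast_pow_calls` are hypothetical variants that return the call count alongside the result.

```python
def our_pow_calls(base, expo):
    """Linear definition: return (base ** expo, number of calls)."""
    if expo == 0:
        return 1, 1
    value, calls = our_pow_calls(base, expo - 1)
    return base * value, calls + 1


def fast_pow_calls(base, expo):
    """Halving definition: return (base ** expo, number of calls)."""
    if expo == 0:
        return 1, 1
    if expo % 2 == 1:
        value, calls = fast_pow_calls(base, expo - 1)
        return base * value, calls + 1
    half, calls = fast_pow_calls(base, expo // 2)
    return half * half, calls + 1


print(our_pow_calls(2, 100)[1])    # 101 calls: one per unit of the exponent
print(fast_pow_calls(2, 100)[1])   # 10 calls: 100, 50, 25, 24, 12, 6, 3, 2, 1, 0
```

The first version makes expo + 1 calls, so it is O(n) in the exponent. The second halves the exponent at least once in every two calls, so it makes at most about 2 log2(n) calls, which is O(log n).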

The Fibonacci Series

def fib(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

fib(n) = 1, when n = 1 or n = 2
fib(n) = fib(n - 1) + fib(n - 2), otherwise

1 1 2 3 5 8 13 . . .

Tracing fib(5) with a Call Tree

fib(5)
    fib(4)
        fib(3)
            fib(2) -> 1
            fib(1) -> 1
        fib(2) -> 1
    fib(3)
        fib(2) -> 1
        fib(1) -> 1

Work Done – Function Calls

(Call tree for fib(5), as traced above: 9 function calls in total, 5 of which are base cases that return 1.)

Somewhere between 1^n and 2^n (exponential growth, roughly 1.6^n)
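The growth in work can be checked by counting calls directly. This is a small sketch for illustration: `fib_calls` is a hypothetical variant of fib that also returns how many calls were made.

```python
def fib_calls(n):
    """Return (fib(n), number of calls made), using the naive recursion."""
    if n == 1 or n == 2:
        return 1, 1
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    return a + b, calls_a + calls_b + 1   # + 1 counts this call itself


for n in (5, 10, 20):
    print(n, fib_calls(n))
# 5 (5, 9)
# 10 (55, 109)
# 20 (6765, 13529)
```

The number of calls works out to 2 * fib(n) - 1, so it grows at the same exponential rate as the Fibonacci numbers themselves.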

Memoization

def fib(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

Intermediate values returned by the function can be memoized, or saved in a cache, for subsequent access

Then they don’t have to be recomputed!

Memoization

def fib(n):
    cache = dict()

    def fastFib(n):
        if n == 1 or n == 2:
            return 1
        elif n in cache:
            return cache[n]
        else:
            value = fastFib(n - 1) + fastFib(n - 2)
            cache[n] = value
            return value

    return fastFib(n)

The cache is a dictionary whose keys are the arguments of fib and whose values are the values of fib at those keys
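As an alternative to managing the dictionary by hand, Python's standard library offers `functools.lru_cache`, which wraps a function with a cache keyed on its arguments. The sketch below swaps that decorator in for the hand-written cache; it is not the version from the slides.

```python
from functools import lru_cache


@lru_cache(maxsize=None)        # maxsize=None: keep every computed value
def fib(n):
    """Memoized Fibonacci: each fib(k) is computed at most once."""
    if n == 1 or n == 2:
        return 1
    return fib(n - 1) + fib(n - 2)


print(fib(50))   # 12586269025
```

With memoization, each value from fib(1) to fib(n) is computed once, so the number of calls grows linearly with n instead of exponentially.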

Improving on n^2 Sorting

• Selection sort uses a linear method within a linear method, so it’s an O(n^2) method

• Find a way of using a linear method with a method that’s better than linear

A Hint from Binary Search

• Binary search is better than linear, because we divide the problem size by 2 on each step

• Find a way of dividing the size of the sorting problem by 2 on each step, even though each step will itself be linear

• This should produce an O(n log n) algorithm
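The effect of halving the problem size can be seen by counting the steps of a binary search. This sketch is for illustration only: `binary_search_steps` is a hypothetical helper that reports how many loop iterations were needed.

```python
def binary_search_steps(sorted_list, target):
    """Return (index of target or -1, number of loop iterations)."""
    left, right = 0, len(sorted_list) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if sorted_list[mid] == target:
            return mid, steps
        elif sorted_list[mid] < target:
            left = mid + 1      # discard the left half
        else:
            right = mid - 1     # discard the right half
    return -1, steps


data = list(range(1024))
print(binary_search_steps(data, 700))
print(binary_search_steps(data, -1))    # absent target: (-1, 10)
```

For 1024 items the loop never runs more than 11 times (floor(log2 n) + 1), compared with up to 1024 steps for a linear search.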

Quick Sort

• Select a pivot element (say, the element at the midpoint)

• Shift all of the smaller values to the left of the pivot, and all of the larger values to the right of the pivot (the linear part)

• Sort the values to the left and to the right of the pivot (ideally, done log n times)

Trace of Quick Sort

Step 1: select the pivot (at the midpoint)

value: 89 56 63 72 41 34 95
index:  0  1  2  3  4  5  6
pivot: 72, at index 3

Step 2: shift the data

value: 56 63 41 34 72 89 95
index:  0  1  2  3  4  5  6
pivot: 72, now at index 4

Step 3: sort to the left of the pivot

value: 56 63 41 34 72 89 95
index:  0  1  2  3  4  5  6
left sublist to sort: 56 63 41 34 (indices 0 to 3)

Step 4: sort to the right of the pivot

value: 34 41 56 63 72 89 95
index:  0  1  2  3  4  5  6
right sublist to sort: 89 95 (indices 5 to 6)

Design of Quick Sort: First Cut

quickSort(lyst, left, right)
    if left < right
        pivotPosition = partition(lyst, left, right)
        quickSort(lyst, left, pivotPosition - 1)
        quickSort(lyst, pivotPosition + 1, right)

• This version selects the midpoint element as the pivot

• The position of the pivot might change during the shifting of data

partition(lyst, left, right)
    pivotValue = lyst[(left + right) // 2]
    shift smaller values to left of pivotValue
    shift larger values to right of pivotValue
    return pivotPosition

Implementation of Partition

# swap(lyst, i, j) is assumed to exchange the items at positions i and j
def partition(lyst, left, right):
    # Find the pivot and exchange it with the last item
    middle = (left + right) // 2
    pivot = lyst[middle]
    lyst[middle] = lyst[right]
    lyst[right] = pivot
    # Set boundary point to first position
    boundary = left
    # Move items less than pivot to the left
    for index in range(left, right):
        if lyst[index] < pivot:
            swap(lyst, index, boundary)
            boundary += 1
    # Exchange the pivot item and the boundary item
    swap(lyst, right, boundary)
    return boundary

The number of comparisons required to shift values in each sublist is equal to the size of the sublist.

# swap(lyst, i, j) is assumed to exchange the items at positions i and j
def quickSort(lyst):

    def recurse(left, right):
        if left < right:
            pivotPosition = partition(lyst, left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    def partition(lyst, left, right):
        # Find the pivot and exchange it with the last item
        middle = (left + right) // 2
        pivot = lyst[middle]
        lyst[middle] = lyst[right]
        lyst[right] = pivot
        # Set boundary point to first position
        boundary = left
        # Move items less than pivot to the left
        for index in range(left, right):
            if lyst[index] < pivot:
                swap(lyst, index, boundary)
                boundary += 1
        # Exchange the pivot item and the boundary item
        swap(lyst, right, boundary)
        return boundary

    recurse(0, len(lyst) - 1)

Complexity Analysis

The number of comparisons in the top-level call is about n

The sum of the comparisons in the two recursive calls is also about n

The sum of the comparisons in the four recursive calls beneath these is also about n, etc.

Thus, the total number of comparisons is about n * the number of times the list must be subdivided

How Many Times Must the Array Be Subdivided?

• It depends on the data and on the choice of the pivot element

• Ideally, when the pivot is the median on each call, the list is subdivided log2(n) times

• Best-case behavior is O(n log n)
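To see how the comparison count relates to n log n in practice, the midpoint-pivot quick sort can be instrumented with a counter. The sketch below is an assumption-laden illustration: `quick_sort_counting` is a hypothetical name, and tuple assignment stands in for the swap helper.

```python
import math
import random


def quick_sort_counting(lyst):
    """Sort lyst in place with a midpoint pivot; return the comparison count."""
    comparisons = 0

    def partition(left, right):
        nonlocal comparisons
        middle = (left + right) // 2
        # Move the pivot out of the way, to the last position
        lyst[middle], lyst[right] = lyst[right], lyst[middle]
        pivot = lyst[right]
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        # Put the pivot between the smaller and the larger items
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            pivot_position = partition(left, right)
            recurse(left, pivot_position - 1)
            recurse(pivot_position + 1, right)

    recurse(0, len(lyst) - 1)
    return comparisons


size = 1000
data = random.sample(range(size), size)      # shuffled input
print(quick_sort_counting(data), round(size * math.log2(size)))
```

On shuffled input the measured count is typically a small constant multiple of n log2(n). On an already sorted list of 7 items the midpoint is always the median, and the count is 6 + 2 + 2 = 10.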

Call Tree For a Best Case

[34 41 56 63 72 89 95]
[34 41 56] [72 89 95]
[34] [56] [72] [95]

We select the midpoint element as the pivot. The median element happens to be at the midpoint on each call. But the list was already sorted!

Worst Case

• What if the value at the midpoint is near the largest value on each call?

• Or near the smallest value on each call?

• Then there will be approximately n subdivisions, and quick sort will degenerate to O(n^2)
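The degenerate behavior is easy to reproduce by choosing a poor pivot on purpose. The sketch below is a hypothetical variant, `quick_sort_first_pivot_counting`, that always takes the first element as the pivot; on sorted input that element is the smallest on every call.

```python
def quick_sort_first_pivot_counting(lyst):
    """Quick sort in place using the FIRST item as pivot; return comparisons."""
    comparisons = 0

    def partition(left, right):
        nonlocal comparisons
        # Move the pivot (first item) to the last position
        lyst[left], lyst[right] = lyst[right], lyst[left]
        pivot = lyst[right]
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            pivot_position = partition(left, right)
            recurse(left, pivot_position - 1)
            recurse(pivot_position + 1, right)

    recurse(0, len(lyst) - 1)
    return comparisons


size = 200                                   # kept small: recursion depth is about n here
data = list(range(size))                     # already sorted input
print(quick_sort_first_pivot_counting(data), size * (size - 1) // 2)   # 19900 19900
```

Each call removes only the pivot from the problem, so there are n - 1 levels and (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2 comparisons, the same count as bubble sort.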

Call Tree For a Worst Case

We select the first element as the pivot. The smallest element happens to be the first one on each call. n subdivisions!

34 41 56 63 72 89 95
   41 56 63 72 89 95
      56 63 72 89 95
         63 72 89 95
            72 89 95
               89 95
                  95

Other Methods of Selecting the Pivot Element

• Pick a random element

• Pick the median of the first three elements

• Pick the median of the first, middle, and last elements

• Pick the actual median element: not practical! Finding the median is itself an O(n) algorithm
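Two of the strategies above can be sketched in a few lines. The helper names `random_pivot_index` and `median_of_three_index` are hypothetical; each returns an index that a partition function could exchange with the last item before shifting the data.

```python
import random


def random_pivot_index(left, right):
    """Pick a pivot position uniformly at random from left..right inclusive."""
    return random.randint(left, right)


def median_of_three_index(lyst, left, right):
    """Return the index holding the median of the first, middle, and last items."""
    middle = (left + right) // 2
    candidates = [(lyst[left], left), (lyst[middle], middle), (lyst[right], right)]
    candidates.sort()            # orders the three (value, index) pairs by value
    return candidates[1][1]      # index of the middle value


data = [89, 56, 63, 72, 41, 34, 95]
print(median_of_three_index(data, 0, len(data) - 1))   # 0, since 89 is the median of 89, 72, 95
```

Both choices cost constant time per call. Median-of-three avoids the worst case on already sorted or reversed input, while a random pivot makes the worst case unlikely for any particular input order.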

For Friday

Working with the Array Data Structure

Chapter 4
