Computer Science 112
Fundamentals of Programming II
Finding Faster Algorithms
Bubble Sort Strategy
• Compare the first two items and if they are out of order, exchange them
• Repeat this process for the second and third items, etc.
• At the end of this process, the largest item will have bubbled down to the end of the list
• Repeat this process for the unsorted portion of the list, etc.
Formalize the Strategy
set n to the length of the list
while n > 1
    bubble the elements from position 0 to position n - 1
    decrement n
Refine the Strategy
set n to the length of the list
while n > 1
    for each position i from 1 to n - 1
        if the elements at i and i - 1 are out of order
            swap them
    decrement n
Implement bubbleSort
def bubbleSort(lyst):
    n = len(lyst)
    while n > 1:                       # Do n - 1 bubbles
        for i in range(1, n):          # Start each bubble
            if lyst[i] < lyst[i - 1]:  # Swap if needed
                swap(lyst, i, i - 1)
        n -= 1
Analysis:
How many iterations does the outer loop perform?
How many iterations does the inner loop perform?
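One way to explore these two questions is to count the work directly. The sketch below is an instrumented variant of bubble sort; the `swap` helper and the name `countingBubbleSort` are assumptions introduced here for illustration and do not appear in the slides.

```python
def swap(lyst, i, j):
    """Exchange the items at positions i and j (assumed helper)."""
    lyst[i], lyst[j] = lyst[j], lyst[i]


def countingBubbleSort(lyst):
    """Sort lyst in place; return (outer iterations, comparisons).

    Hypothetical instrumented version of bubbleSort, for illustration only.
    """
    n = len(lyst)
    outer = 0
    comparisons = 0
    while n > 1:
        outer += 1                    # One pass of the outer loop
        for i in range(1, n):
            comparisons += 1          # One pass of the inner loop
            if lyst[i] < lyst[i - 1]:
                swap(lyst, i, i - 1)
        n -= 1
    return outer, comparisons


data = [89, 56, 63, 72, 41, 34, 95]
outer, comparisons = countingBubbleSort(data)
print(data, outer, comparisons)
```

For this 7-item list the outer loop runs 6 times (n - 1) and the inner loop performs 21 comparisons in total (n(n - 1)/2), which is why bubble sort is an O(n^2) algorithm.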
Improving bubbleSort
def bubbleSort(lyst):
    n = len(lyst)
    while n > 1:
        isSorted = True
        for i in range(1, n):          # Start at 1 so that i - 1 is a valid neighbor
            if lyst[i] < lyst[i - 1]:
                swap(lyst, i, i - 1)
                isSorted = False
        if isSorted:                   # No swaps in this pass: the list is sorted
            break
        n -= 1
Analysis:
Best, worst, average cases?
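A rough way to see the difference between the cases is to count comparisons for an already sorted list and for a list in reverse order. This is a minimal sketch; the name `improvedBubbleSort` and the comparison counter are assumptions added for illustration.

```python
def improvedBubbleSort(lyst):
    """Bubble sort with early exit; returns the number of comparisons.

    The counter is an illustrative addition, not part of the slides.
    """
    comparisons = 0
    n = len(lyst)
    while n > 1:
        isSorted = True
        for i in range(1, n):
            comparisons += 1
            if lyst[i] < lyst[i - 1]:
                lyst[i], lyst[i - 1] = lyst[i - 1], lyst[i]
                isSorted = False
        if isSorted:          # A pass with no swaps means we can stop early
            break
        n -= 1
    return comparisons


bestCase = improvedBubbleSort(list(range(1, 9)))        # already sorted
worstCase = improvedBubbleSort(list(range(8, 0, -1)))   # reverse order
print(bestCase, worstCase)
```

With 8 items, the sorted list needs a single pass of 7 comparisons (linear), while the reversed list needs all 28 comparisons (quadratic). The average case is also quadratic.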
Example: Exponentiation
def ourPow(base, expo):
    if expo == 0:
        return 1
    else:
        return base * ourPow(base, expo - 1)
What is the best case performance? Worst case? Average case?
Recursive definition:
b^n = 1, when n = 0
b^n = b * b^(n - 1), otherwise
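To explore the performance question, the recursive definition can be instrumented to count calls. In this sketch the extra `counter` parameter (a one-item list) is an assumption added for illustration; it is not part of the function shown in the slides.

```python
def ourPow(base, expo, counter):
    """Linear recursive power; counter[0] tallies the function calls."""
    counter[0] += 1          # counter is an assumed one-item list
    if expo == 0:
        return 1
    else:
        return base * ourPow(base, expo - 1, counter)


calls = [0]
result = ourPow(2, 10, calls)
print(result, calls[0])
```

Computing 2^10 takes 11 calls, that is expo + 1, no matter what the base is. The best, worst, and average cases are therefore all linear in the exponent.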
Faster Exponentiation
def fastPow(base, expo):
    if expo == 0:
        return 1
    elif expo % 2 == 1:
        return base * fastPow(base, expo - 1)
    else:
        result = fastPow(base, expo // 2)
        return result * result
What is the best case performance? Worst case? Average case?
Recursive definition:
b^n = 1, when n = 0
b^n = b * b^(n - 1), when n is odd
b^n = (b^(n/2))^2, when n is even
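The same call-counting idea suggests how much faster this definition is. As before, the `counter` parameter is an assumption added for illustration only.

```python
def fastPow(base, expo, counter):
    """Power by repeated squaring; counter[0] tallies the function calls."""
    counter[0] += 1          # counter is an assumed one-item list
    if expo == 0:
        return 1
    elif expo % 2 == 1:
        return base * fastPow(base, expo - 1, counter)
    else:
        result = fastPow(base, expo // 2, counter)
        return result * result


calls = [0]
value = fastPow(2, 1000, calls)
print(calls[0])
```

Raising 2 to the power 1000 takes 16 calls here, compared with 1001 calls for the linear version. Each even step halves the exponent and each odd step is followed by an even one, so the number of calls grows in proportion to log2(n).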
The Fibonacci Series
def fib(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)
fib(n) = 1, when n = 1 or n = 2
fib(n) = fib(n - 1) + fib(n - 2), otherwise
1 1 2 3 5 8 13 . . .
Tracing fib(5) with a Call Tree
fib(5)
    fib(4)
        fib(3)
            fib(2) returns 1
            fib(1) returns 1
        fib(2) returns 1
    fib(3)
        fib(2) returns 1
        fib(1) returns 1
Work Done – Function Calls
Somewhere between 1^n and 2^n
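The growth in function calls can be observed by counting them. This is a sketch under the assumption that a one-item list, `counter`, is threaded through the recursion; the name `countingFib` is hypothetical.

```python
def countingFib(n, counter):
    """Recursive Fibonacci; counter[0] tallies the function calls."""
    counter[0] += 1          # counter is an assumed one-item list
    if n == 1 or n == 2:
        return 1
    else:
        return countingFib(n - 1, counter) + countingFib(n - 2, counter)


for n in (5, 10, 20):
    calls = [0]
    value = countingFib(n, calls)
    print(n, value, calls[0])
```

The counts are 9 calls for fib(5), 109 for fib(10), and 13529 for fib(20). The number of calls is always 2 * fib(n) - 1, so the work grows as fast as the Fibonacci numbers themselves.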
Memoization
def fib(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)
Intermediate values returned by the function can be memoized, or saved in a cache, for subsequent access
Then they don’t have to be recomputed!
Memoization
def fib(n):
    cache = dict()

    def fastFib(n):
        if n == 1 or n == 2:
            return 1
        elif n in cache:
            return cache[n]
        else:
            value = fastFib(n - 1) + fastFib(n - 2)
            cache[n] = value
            return value

    return fastFib(n)
The cache is a dictionary whose keys are the arguments of fib and whose values are the values of fib at those keys
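Python's standard library offers a ready-made cache of this kind. The sketch below swaps the hand-written dictionary for the `functools.lru_cache` decorator; this is an alternative technique, not the approach used in the slides, and the name `cachedFib` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)      # Unbounded cache keyed on the argument n
def cachedFib(n):
    """Recursive Fibonacci whose results are memoized by lru_cache."""
    if n == 1 or n == 2:
        return 1
    else:
        return cachedFib(n - 1) + cachedFib(n - 2)


print(cachedFib(50))
print(cachedFib.cache_info().misses)
```

Each argument from 1 to 50 is computed only once (50 cache misses), so the work is linear in n rather than exponential.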
Improving on n^2 Sorting
• Selection sort uses a linear method within a linear method, so it’s an O(n^2) method
• Find a way of using a linear method with a method that’s better than linear
A Hint from Binary Search
• Binary search is better than linear, because we divide the problem size by 2 on each step
• Find a way of dividing the size of sorting problem by 2 on each step, even though each step will itself be linear
• This should produce an O(n log n) algorithm
Quick Sort
• Select a pivot element (say, the element at the midpoint)
• Shift all of the smaller values to the left of the pivot, and all of the larger values to the right of the pivot (the linear part)
• Sort the values to the left and to the right of the pivot (ideally, done log n times)
Trace of Quick Sort
Step 1: select the pivot (at the midpoint)
value   89 56 63 72 41 34 95
index    0  1  2  3  4  5  6
pivot: 72, at index 3
Step 2: shift the data
value   56 63 41 34 72 89 95
index    0  1  2  3  4  5  6
pivot: 72, now at index 4
Step 3: sort to the left of the pivot
value   56 63 41 34 72 89 95
index    0  1  2  3  4  5  6
Step 4: sort to the right of the pivot
value   34 41 56 63 72 89 95
index    0  1  2  3  4  5  6
Design of Quick Sort: First Cut
quickSort(lyst, left, right)
    if left < right
        pivotPosition = partition(lyst, left, right)
        quickSort(lyst, left, pivotPosition - 1)
        quickSort(lyst, pivotPosition + 1, right)
• This version selects the midpoint element as the pivot
• The position of the pivot might change during the shifting of data
partition(lyst, left, right)
    pivotValue = lyst[(left + right) // 2]
    shift smaller values to left of pivotValue
    shift larger values to right of pivotValue
    return pivotPosition
Implementation of Partition
def partition(lyst, left, right):
    # Find the pivot and exchange it with the last item
    middle = (left + right) // 2
    pivot = lyst[middle]
    lyst[middle] = lyst[right]
    lyst[right] = pivot
    # Set boundary point to first position
    boundary = left
    # Move items less than pivot to the left
    for index in range(left, right):
        if lyst[index] < pivot:
            swap(lyst, index, boundary)
            boundary += 1
    # Exchange the pivot item and the boundary item
    swap(lyst, right, boundary)
    return boundary
The number of comparisons required to shift values in each sublist is equal to the size of the sublist.
def quickSort(lyst):

    def recurse(left, right):
        if left < right:
            pivotPosition = partition(lyst, left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    def partition(lyst, left, right):
        # Find the pivot and exchange it with the last item
        middle = (left + right) // 2
        pivot = lyst[middle]
        lyst[middle] = lyst[right]
        lyst[right] = pivot
        # Set boundary point to first position
        boundary = left
        # Move items less than pivot to the left
        for index in range(left, right):
            if lyst[index] < pivot:
                swap(lyst, index, boundary)
                boundary += 1
        # Exchange the pivot item and the boundary item
        swap(lyst, right, boundary)
        return boundary

    recurse(0, len(lyst) - 1)
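The code above relies on a `swap` function that the slides do not show. The following self-contained sketch assumes a simple two-line `swap` helper and runs one partition step, then the full sort, on the list from the trace.

```python
def swap(lyst, i, j):
    """Exchange the items at positions i and j (assumed helper)."""
    lyst[i], lyst[j] = lyst[j], lyst[i]


def partition(lyst, left, right):
    """Move the midpoint pivot to its final position and return that position."""
    middle = (left + right) // 2
    pivot = lyst[middle]
    swap(lyst, middle, right)            # Park the pivot at the end
    boundary = left
    for index in range(left, right):
        if lyst[index] < pivot:          # Smaller items go left of the boundary
            swap(lyst, index, boundary)
            boundary += 1
    swap(lyst, right, boundary)          # Put the pivot between the two groups
    return boundary


def quickSort(lyst):
    def recurse(left, right):
        if left < right:
            pivotPosition = partition(lyst, left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)
    recurse(0, len(lyst) - 1)


data = [89, 56, 63, 72, 41, 34, 95]
pivotPosition = partition(data, 0, len(data) - 1)
afterPartition = list(data)
print(pivotPosition, afterPartition)
quickSort(data)
print(data)
```

The pivot 72 lands at index 4, as in the trace. The order of the items within each side after partitioning depends on the swaps performed, so it may differ from the order drawn in the trace.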
Complexity Analysis
The number of comparisons in the top-level call is n
The sum of the comparisons in the two recursive calls is also n
The sum of the comparisons in the four recursive calls beneath these is also n, etc.
Thus, the total number of comparisons equals n * the number of times the list must be subdivided
How Many Times Must the Array Be Subdivided?
• It depends on the data and on the choice of the pivot element
• Ideally, when the pivot is the median on each call, the list is subdivided log2(n) times
• Best-case behavior is O(n log n)
Call Tree For a Best Case
34 41 56 63 72 89 95
[34 41 56] [72 89 95]
[34] [56] [72] [95]
We select the midpoint element as the pivot. The median element happens to be at the midpoint on each call. But the list was already sorted!
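The best case can be checked by counting comparisons on sorted input. This is a minimal sketch of quick sort with a midpoint pivot; the name `countingQuickSort` and the counter are assumptions added for illustration.

```python
def countingQuickSort(lyst):
    """Sort lyst in place with a midpoint pivot; return the comparison count.

    The counter is an illustrative addition, not part of the slides.
    """
    comparisons = 0

    def partition(left, right):
        nonlocal comparisons
        middle = (left + right) // 2
        pivot = lyst[middle]
        lyst[middle], lyst[right] = lyst[right], lyst[middle]
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            pivotPosition = partition(left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    recurse(0, len(lyst) - 1)
    return comparisons


print(countingQuickSort([34, 41, 56, 63, 72, 89, 95]))   # 7 sorted items
print(countingQuickSort(list(range(1023))))              # 1023 sorted items
```

The 7-item list needs 10 comparisons and the 1023-item list needs 8194, which is close to n * log2(n) (about 10,230) and far below n^2 / 2 (about 523,000). The exact counts fall slightly under n per level because each partition compares every item except the pivot.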
Worst Case
• What if the value at the midpoint is near the largest value on each call?
• Or near the smallest value on each call?
• Then there will be approximately n subdivisions, and quick sort will degenerate to O(n^2)
Call Tree For a Worst Case
34 41 56 63 72 89 95
41 56 63 72 89 95
56 63 72 89 95
63 72 89 95
72 89 95
89 95
95
We select the first element as the pivot. The smallest element happens to be the first one on each call. n subdivisions!
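The degenerate behavior can be reproduced by choosing the first element as the pivot and sorting a list that is already in order. The sketch below is an illustration under that assumption; the name `worstCaseQuickSort` and both counters are hypothetical.

```python
def worstCaseQuickSort(lyst):
    """Quick sort with the FIRST item as pivot (assumed for this demo).

    Returns (partition calls, comparisons).
    """
    partitions = 0
    comparisons = 0

    def partition(left, right):
        nonlocal partitions, comparisons
        partitions += 1
        pivot = lyst[left]                                   # First item is the pivot
        lyst[left], lyst[right] = lyst[right], lyst[left]
        boundary = left
        for index in range(left, right):
            comparisons += 1
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            pivotPosition = partition(left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    recurse(0, len(lyst) - 1)
    return partitions, comparisons


print(worstCaseQuickSort([34, 41, 56, 63, 72, 89, 95]))
print(worstCaseQuickSort(list(range(100))))
```

For the 7-item list this gives 6 partition calls and 21 comparisons; for 100 sorted items it gives 99 partition calls and 4950 comparisons, that is n(n - 1)/2. Each call removes only the pivot from the problem, so the recursion is also about n levels deep.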
Other Methods of Selecting the Pivot Element
• Pick a random element
• Pick the median of the first three elements
• Pick the median of the first, middle, and last elements
• Pick the true median element - not!! Finding the median is itself an O(n) algorithm
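Two of these strategies can be sketched as small functions that return a pivot position, which a partition step then uses. The names `randomPivotIndex`, `medianOfThreeIndex`, and the `choosePivot` parameter are assumptions introduced for illustration, not part of the slides.

```python
import random


def randomPivotIndex(lyst, left, right):
    """Pick a random position in lyst[left..right]."""
    return random.randint(left, right)


def medianOfThreeIndex(lyst, left, right):
    """Pick the position of the median of the first, middle, and last items."""
    middle = (left + right) // 2
    candidates = sorted([(lyst[left], left),
                         (lyst[middle], middle),
                         (lyst[right], right)])
    return candidates[1][1]


def quickSort(lyst, choosePivot=medianOfThreeIndex):
    """Quick sort whose pivot position comes from the choosePivot function."""

    def partition(left, right):
        pivotIndex = choosePivot(lyst, left, right)
        pivot = lyst[pivotIndex]
        lyst[pivotIndex], lyst[right] = lyst[right], lyst[pivotIndex]
        boundary = left
        for index in range(left, right):
            if lyst[index] < pivot:
                lyst[index], lyst[boundary] = lyst[boundary], lyst[index]
                boundary += 1
        lyst[right], lyst[boundary] = lyst[boundary], lyst[right]
        return boundary

    def recurse(left, right):
        if left < right:
            pivotPosition = partition(left, right)
            recurse(left, pivotPosition - 1)
            recurse(pivotPosition + 1, right)

    recurse(0, len(lyst) - 1)


data = [89, 56, 63, 72, 41, 34, 95]
quickSort(data, randomPivotIndex)
print(data)
```

Neither strategy rules out the worst case, but both make it unlikely for input that is already sorted or nearly sorted.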