searching and sorting - tiziana ligorio · searching and sorting tiziana ligorio...
TRANSCRIPT
Today’s Plan
Recap
Searching algorithms and their analysis
Sorting algorithms and their analysis
!2
Announcements
Questions?
!3
Searching
!4
Looking for something!In this discussion we will assume
searching for an element in an array
Linear searchMost intuitiveStart at first position and keep looking until you find it
int linearSearch(int a[], int size, int value){
for (int i = 0; i < size; i++) { if (a[i] == value) { return i; } } return-1;}
!5
How long does linear search take?
If you assume value is in the array and probability of finding it at any location is uniform, on average n/2
If value is not in the array (worst case) n
Either way it’s O(n)
!6
What if you know array is sorted? Can you do better than linear search?
!7
Lecture Activity
You are given a sorted array of integers.
How would you search for 115? ( try to do it in fewer than n steps: don’t search sequentially)
You can write pseudocode or succinctly explain your algorithm
!8
We have done this before! When?
!9
!10
Look in ?
Binary Search
!11
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!12
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!13
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!14
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!15
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!16
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
!17
14 43 76 100 108 158 195 200 274 523 543 5993
Binary Search
What is happening here?
!18
Binary Search
What is happening here?
Size of search is cut in half at each step
!19
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1
!20
Simplification: assume n is a power of 2 so it can be
evenly divided in two partsThe running time
One comparison
Search lower OR upper half
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n/2) = T(n/4) +1
!21
One comparison
Search lower OR upper half of n/2
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n/2) = T(n/4) +1 T(n) = T(n/4) + 1 + 1
!22
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . .
!23
22 2
211
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k
!24
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k T(n) = T(1) + log2(n)
!25n/n = 1
The number to which I need to raise 2 to get n
And we said n = 2k
Binary Search
What is happening here?
Size of search is cut in half at each step
Let T(n) be the running time and assume n = 2kT(n) = T(n/2) + 1 T(n) = T(n/4) + 2 . . . T(n) = T(n/2k) + k T(n) = T(1) + log2(n)
!26
Binary search is O(log(n))
Sorting
!27
Rearranging a sequence into increasing (decreasing) order!
Several approaches
Can do it in may ways
What is the best way?
Let’s find out using Big-O
!28
Lecture Activity
Write pseudocode to sort an array.
!29
144376 100108158 195200 274523543 5993
There are many approaches to sorting We will look at some comparison-
based approaches here
!30
Selection Sort
!31
Selection Sort
!32
Find smallest element and move it at lowest position
Unsorted
Sorted
1st Pass
Selection Sort
!33
Find smallest element and move it at lowest position
Unsorted
Sorted
Swap
1st Pass
Selection Sort
!34
Find smallest element and move it at lowest position
Unsorted
Sorted
1st Pass
Selection Sort
!35
Find smallest element and move it at lowest position
Unsorted
Sorted
Unsorted
2nd Pass
Selection Sort
!36
Find smallest element and move it at lowest position
Unsorted
Sorted
Swap
2nd Pass
Selection Sort
!37
Find smallest element and move it at lowest position
Unsorted
Sorted
2nd Pass
Selection Sort
!38
Find smallest element and move it at lowest position
Unsorted
Sorted
3rd Pass
Selection Sort
!39
Find smallest element and move it at lowest position
Unsorted
Sorted
3rd Pass
Selection Sort
!40
Find smallest element and move it at lowest position
Unsorted
Sorted
4th Pass
Selection Sort
!41
Find smallest element and move it at lowest position
Unsorted
Sorted
4th Pass
Selection Sort
!42
Find smallest element and move it at lowest position
Unsorted
Sorted
5th Pass
Selection Sort
!43
Find smallest element and move it at lowest position
Unsorted
Sorted
Swap
5th Pass
Selection Sort
!44
Find smallest element and move it at lowest position
Unsorted
Sorted
5th Pass
Selection Sort
!45
Find smallest element and move it at lowest position
Unsorted
Sorted
6th Pass
Selection Sort
!46
Find smallest element and move it at lowest position
Unsorted
Sorted
Selection Sort
Find the smallest item and move it at position 1
Find the next-smallest item and move it at position 2
. . .
!47
Selection Sort Analysis
How much work?
Find smallest: look at n elements
!48
Selection Sort Analysis
How much work?
Find smallest: look at n elements
Find second smallest: look at n-1 elements
!49
Selection Sort Analysis
How much work?
Find smallest: look at n elements
Find second smallest: look at n-1 elements
Find third smallest: look at n-2 elements. . .
!50
Selection Sort Analysis
How much work?
Find smallest: look at n elements
Find second smallest: look at n-1 elements
Find third smallest: look at n-2 elements. . .
Total work: n + (n-1) + (n-2) + . . . +1
!51
!52
n + (n-1) + ... + 2 + 1
n
n + 1
= n(n+1) / 2
Selection Sort Analysis
T(n) = n(n+1) / 2 comparisons + n data moves = O( )?
!53
Selection Sort Analysis
T(n) = n(n+1) / 2 comparisons + n data moves = O( )?
T(n) = (n2+n) / 2 + n = O( )?
!54
Selection Sort Analysis
T(n) = n(n+1) / 2 comparisons + n data moves = O( )?
T(n) = (n2+n) / 2 + n = O( )?
!55
Ignore constant
Ignore non-dominant terms
Selection Sort Analysis
T(n) = n(n+1) / 2 comparisons + n data moves = O( )?
T(n) = (n2+n) / 2 + n = O( n2)
!56
Ignore constant
Ignore non-dominant terms
Selection Sort Analysis
T(n) = n(n+1) / 2 comparisons + n data moves = O( )?
T(n) = (n2+n) / 2 + n = O( n2)
Selection Sort run time is O( n2)
!57
template<class T>void selectionSort(T the_array[], int n) { // last = index of the last item in the subarray of items yet // to be sorted; // largest = index of the largest item found for (int last = n - 1; last >= 1; last--) { // At this point, the_array[last+1..n-1] is sorted, and its // entries are greater than those in the_array[0..last]. // Select the largest entry in the_array[0..last] int largest = findIndexOfLargest(the_array, last+1); // Swap the largest entry, the_array[largest], with // the_array[last] std::swap(the_array[largest], the_array[last]); } // end for } // end selectionSort
!58
template<class T>void selectionSort(T the_array[], int n) { // last = index of the last item in the subarray of items yet // to be sorted; // largest = index of the largest item found for (int last = n - 1; last >= 1; last--) { // At this point, the_array[last+1..n-1] is sorted, and its // entries are greater than those in the_array[0..last]. // Select the largest entry in the_array[0..last] int largest = findIndexOfLargest(the_array, last+1); // Swap the largest entry, the_array[largest], with // the_array[last] std::swap(the_array[largest], the_array[last]); } // end for } // end selectionSort
!59
PassO(n)
O(n)
O( n2)
Stability
A sorting algorithm is Stable if elements that are equal remain is same order relative to each other after sorting
!60
Selection Sort
!61
Find smallest element and move it at lowest position
Unsorted
Sorted
Selection Sort
!62
Find smallest element and move it at lowest position
Unsorted
Sorted
Selection Sort
!63
Find smallest element and move it at lowest position
Unsorted
Sorted
Swap
Selection Sort
!64
Find smallest element and move it at lowest position
Unsorted
Sorted
Unstable
Selection Sort Analysis
Execution time DOES NOT depend on initial arrangement of data => ALWAYS O( n2)
O( n2) comparisons
Good choice for small n and/or data moves are costly ( O( n ) data moves )
Unstable
!65
Understanding O(n2)
!66
14 43100 200 2743
T(n)
Understanding O(n2)
!67
14 43
76
100
108 158195
200 274
523 599
3
14 43100 200 2743
T(2n) ≈ 4T(n)
T(n)
(2n)2 = 4n2
Understanding O(n2)
!68
14 43
76
100
108 158195
200 274
523 599
3
14 43100 200 2743 11260 5642 932
T(n)
T(3n) ≈ 9T(n)
(3n)2 = 9n2
Understanding O(n2) on large input
If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)
!69
Understanding O(n2) on large input
If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)
Assume n = 100,000 and T(n) = 17 seconds Sorting 10,000,000 takes 10,000 longer
!70
Understanding O(n2) on large input
If size of input increases by factor of 100Execution time increases by factor of 10,000 T(100n) = 10,000T(n)
Assume n = 100,000 and T(n) = 17 seconds Sorting 10,000,000 takes 10,000 longer
Sorting 10,000,000 entries takes ≈ 2 days
Multiplying input by 100 to go from 17sec to 2 days!!!
!71
Raise your hand if you had Selection Sort
!72
Bubble Sort
!73
Bubble Sort
!74
Compare adjacent elements and if necessary swap them
Unsorted
Sorted
Bubble Sort
!75
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Bubble Sort
!76
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Swap
Bubble Sort
!77
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Bubble Sort
!78
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Swap
Bubble Sort
!79
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Bubble Sort
!80
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Swap
Bubble Sort
!81
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Bubble Sort
!82
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Bubble Sort
!83
Compare adjacent elements and if necessary swap them 1st Pass
Unsorted
Sorted
Swap
Bubble Sort
!84
Compare adjacent elements and if necessary swap them
End of1st Pass: Not sorted, but largest has “bubbled up” to its proper
position
Bubble Sort
!85
Compare adjacent elements and if necessary swap them
2nd Pass: Sort n-1
Bubble Sort
!86
Compare adjacent elements and if necessary swap them 2nd Pass
Unsorted
Sorted
Bubble Sort
!87
Compare adjacent elements and if necessary swap them 2nd Pass
Unsorted
Sorted
Bubble Sort
!88
Compare adjacent elements and if necessary swap them
Swap
2nd Pass
Unsorted
Sorted
Bubble Sort
!89
Compare adjacent elements and if necessary swap them 2nd Pass
Unsorted
Sorted
Bubble Sort
!90
Compare adjacent elements and if necessary swap them 2nd Pass
Unsorted
Sorted
Bubble Sort
!91
Compare adjacent elements and if necessary swap them
3rd Pass: Sort n-2
Bubble Sort
!92
Compare adjacent elements and if necessary swap them 3rd Pass
Unsorted
Sorted
Bubble Sort
!93
Compare adjacent elements and if necessary swap them 3rd Pass
SwapArray is sorted
But our algorithm doesn’t know It keeps on going
Unsorted
Sorted
Bubble Sort
!94
Compare adjacent elements and if necessary swap them 3rd Pass
Unsorted
Sorted
Bubble Sort
!95
Compare adjacent elements and if necessary swap them 3rd Pass
Unsorted
Sorted
Bubble Sort
!96
Compare adjacent elements and if necessary swap them
4th Pass: Sort n-3
Bubble Sort
!97
Compare adjacent elements and if necessary swap them 4th Pass
Unsorted
Sorted
Bubble Sort
!98
Compare adjacent elements and if necessary swap them 4th Pass
Unsorted
Sorted
Bubble Sort
!99
Compare adjacent elements and if necessary swap them
5th Pass: Sort n-4
Bubble Sort
!100
Compare adjacent elements and if necessary swap them 5th Pass
Unsorted
Sorted
Bubble Sort
!101
Compare adjacent elements and if necessary swap them Done!
Unsorted
Sorted
Bubble Sort Analysis
How much work?
First pass: n-1 comparisons and at most n-1 swaps
Second pass: n-2 comparisons and at most n-2 swaps
Third pass: n-3 comparisons and at most n-3 swaps. . .
Total work: (n-1) + (n-2) + . . . +1
!102
!103
n + (n-1) + ... + 2 + 1
n
n + 1
= n(n+1) / 2 (n-1) + (n-2) + . . . + 2 + 1 = n(n-1)/2
(n-1)
n
Bubble Sort Analysis
T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?
A swap is usually more than one operation but this simplification does not change the analysis
T(n) = 2( n(n-1) / 2 )= O( )?
!104
Bubble Sort Analysis
T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?
A swap is usually more than one operation but this simplification does not change the analysis
T(n) = 2( n(n-1) / 2 )= O( )?
T(n) = 2( (n2-n) / 2 )= O( )?
!105
Bubble Sort Analysis
T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?
A swap is usually more than one operation but this simplification does not change the analysis
T(n) = 2( n(n-1) / 2 )= O( )?
T(n) = 2( (n2-n) / 2 )= O( )?
T(n) = n2-n = O( )?
!106Ignore non-dominant terms
Bubble Sort Analysis
T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?
A swap is usually more than one operation but this simplification does not change the analysis
T(n) = 2( n(n-1) / 2 )= O( )?
T(n) = 2( (n2-n) / 2 )= O( )?
T(n) = n2-n = O( n2)
Bubble Sort run time is O(n2)!107
Optimize!
Easy to check: if there are no swaps in any given pass stop because it is sorted
!108
Bubble Sort
!109
Compare adjacent elements and if necessary swap them
Bubble Sort
!110
Compare adjacent elements and if necessary swap them
Bubble Sort
!111
Compare adjacent elements and if necessary swap them
Bubble Sort
!112
Compare adjacent elements and if necessary swap them
Bubble Sort
!113
Compare adjacent elements and if necessary swap them
Bubble Sort
!114
Compare adjacent elements and if necessary swap them
Bubble Sort
!115
Compare adjacent elements and if necessary swap them
template<class T> void bubbleSort(T the_array[], int n) { bool sorted = false; // False when swaps occur int pass = 1; while (!sorted && (pass < n)) { // At this point, the_array[n+1-pass..n-1] is sorted // and all of its entries are > the entries in the_array[0..n-pass] sorted = true; // Assume sorted for (int index = 0; index < n - pass; index++) { // At this point, all entries in the_array[0..index-1] // are <= the_array[index] int nextIndex = index + 1; if (the_array[index] > the_array[nextIndex]) { // Exchange entries std::swap(the_array[index], the_array[nextIndex]); sorted = false; // Signal exchange } // end if } // end for // Assertion: the_array[0..n-pass-1] < the_array[n-pass] pass++; } // end while } // end bubbleSort
!116
template<class T> void bubbleSort(T the_array[], int n) { bool sorted = false; // False when swaps occur int pass = 1; while (!sorted && (pass < n)) { // At this point, the_array[n+1-pass..n-1] is sorted // and all of its entries are > the entries in the_array[0..n-pass] sorted = true; // Assume sorted for (int index = 0; index < n - pass; index++) { // At this point, all entries in the_array[0..index-1] // are <= the_array[index] int nextIndex = index + 1; if (the_array[index] > the_array[nextIndex]) { // Exchange entries std::swap(the_array[index], the_array[nextIndex]); sorted = false; // Signal exchange } // end if } // end for // Assertion: the_array[0..n-pass-1] < the_array[n-pass] pass++; } // end while } // end bubbleSort
!117
PassO(n)
O(n)
O( n2)
Bubble Sort Analysis
Execution time DOES depend on initial arrangement of data
Worst case: O(n2) comparisons and data moves
Best case: O(n) comparisons and data moves
Stable
If array is already sorted bubble sort will stop after first pass and no swaps => good choice for small n and data likely somewhat sorted
!118
Raise your hand if you had Bubble Sort
!119
https://www.youtube.com/watch?v=lyZQPjUT5B4
!120
Insertion Sort
!121
Insertion Sort
!122
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
1st Pass
Insertion Sort
!123
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
1st Pass
Insertion Sort
!124
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
1st Pass
Insertion Sort
!125
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
1st Pass
Insertion Sort
!126
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!127
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
2nd Pass
Insertion Sort
!128
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
2nd Pass
Insertion Sort
!129
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
2nd Pass
Insertion Sort
!130
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
2nd Pass
Insertion Sort
!131
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!132
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
3rd Pass
Insertion Sort
!133
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
3rd Pass
Insertion Sort
!134
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
3rd Pass
Insertion Sort
!135
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
3rd Pass
Insertion Sort
!136
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
3rd Pass
Insertion Sort
!137
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
3rd Pass
Insertion Sort
!138
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
3rd Pass
Insertion Sort
!139
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!140
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
4th Pass
Insertion Sort
!141
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
4th Pass
Insertion Sort
!142
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!143
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
5th Pass
Insertion Sort
!144
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Swap
5th Pass
Insertion Sort
!145
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
5th Pass
Insertion Sort
!146
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
5th Pass
Insertion Sort
!147
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort Analysis
How much work?
First pass: 1 comparison and at most 1 swap
Second pass: at most 2 comparisons and at most 2 swaps
Third pass: at most 3 comparisons and at most 3 swaps. . .
Total work: 1 + 2 + 3 + . . . + (n-1)
!148
!149
n + (n-1) + ... + 2 + 1
n
n + 1
= n(n+1) / 2 1 + 2 + . . . (n-2) + (n-1) = n(n-1)/2
(n-1)
n
Insertion Sort Analysis
T(n) = n(n-1) / 2 comparisons + n(n-1) / 2 swaps = O( )?
T(n) = 2( (n2-n) / 2 )= O( )?
T(n) = n2-n = O( n2)
Insertion Sort run time is O(n2)
!150
Insertion Sort
!151
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!152
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!153
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!154
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!155
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!156
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!157
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!158
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!159
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!160
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort
!161
Pick first element in unsorted region and put it in right place
in sorted region
Unsorted
Sorted
Insertion Sort Analysis
Execution time DOES depend on initial arrangement of data
Worst case: O( n2) comparisons and data moves
Best case: O(n) comparisons and data moves
Stable
If array is already sorted Insertion sort will do only n comparisons and no swaps => good choice for small n and data likely somewhat sorted
!162
template<class T> void insertionSort(T the_array[], int n) { // unsorted = first index of the unsorted region, // loc = index of insertion in the sorted region, // next_item = next item in the unsorted region. // Initially, sorted region is the_array[0], // unsorted region is the_array[1..n-1]. // In general, sorted region is the_array[0..unsorted-1], // unsorted region the_array[unsorted..n-1] for (int unsorted = 1; unsorted < n; unsorted++) { // At this point, the_array[0..unsorted-1] is sorted. // Find the right position (loc) in the_array[0..unsorted] // for the_array[unsorted], which is the first entry in the // unsorted region; shift, if necessary, to make room T next_item = the_array[unsorted]; int loc = unsorted; while ((loc > 0) && (the_array[loc - 1] > next_item)) { // Shift the_array[loc - 1] to the right the_array[loc] = the_array[loc - 1]; loc--; } // end while // At this point, the_array[loc] is where next_item belongs the_array[loc] = next_item; // Insert next_item into sorted region } // end for } // end insertionSort
!163
template<class T> void insertionSort(T the_array[], int n) { // unsorted = first index of the unsorted region, // loc = index of insertion in the sorted region, // next_item = next item in the unsorted region. // Initially, sorted region is the_array[0], // unsorted region is the_array[1..n-1]. // In general, sorted region is the_array[0..unsorted-1], // unsorted region the_array[unsorted..n-1] for (int unsorted = 1; unsorted < n; unsorted++) { // At this point, the_array[0..unsorted-1] is sorted. // Find the right position (loc) in the_array[0..unsorted] // for the_array[unsorted], which is the first entry in the // unsorted region; shift, if necessary, to make room T next_item = the_array[unsorted]; int loc = unsorted; while ((loc > 0) && (the_array[loc - 1] > next_item)) { // Shift the_array[loc - 1] to the right the_array[loc] = the_array[loc - 1]; loc--; } // end while // At this point, the_array[loc] is where next_item belongs the_array[loc] = next_item; // Insert next_item into sorted region } // end for } // end insertionSort
!164
PassO(n)
O(n)
O( n2)
Raise your hand if you had Insertion Sort
!165
What we have so far
!166
Worst Case Best Case
Selection Sort O( n2 ) O( n2 )
Bubble Sort O( n2 ) O( n )
Insertion Sort O( n2 ) O( n )
Lecture Activity
Sort the array using Insertion SortShow the entire array after each comparison/swap operation and at each step mark clearly the division between the sorted and unsorted portions of the array
!167
5 8 3 4 9
5 8 3 4 9
Pick first element in unsorted region and put it in right place in sorted region
2 7
2 7
!168
5 8 3 4 9
5 8 3 4 9
2 7
2 7
5 3 8 4 9
3 5 8 4 9
2 7
2 7
3 5 4 8 9 2 7
3 4 5 8 9 2 7
3 4 5 8 9 2 7
3 4 5 8 2 9 7
3 4 5 2 8 9 7
3 4 2 5 8 9 7
3 2 4 5 8 9 7
2 3 4 5 8 9 7
2 3 4 5 8 7 9
2 3 4 5 7 8 9
!169
https://www.toptal.com/developers/sorting-algorithms
Can we do better?
!170
Can we do better?
!171
Divide and Conquer!!!
Merge Sort
!172
Understanding O(n2)
!173
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
Understanding O(n2)
!174
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
10852314 43100 200 2743 76 158195 599 11260 5642 932
Understanding O(n2)
!175
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
10852314 43100 200 2743 76 158195 599 11260 5642 932
T(1/2n) T(1/2n)
Understanding O(n2)
!176
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
10852314 43100 200 2743 76 158195 599 11260 5642 932
T(1/2n) T(1/2n)
(n/2)2 = n2/4
Understanding O(n2)
!177
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
10852314 43100 200 2743 76 158195 599 11260 5642 932
(n/2)2 = n2/4
T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)
Understanding O(n2)
!178
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
108 52314 43 100 200 2743 76 158 195 59911 2605 642 932
(n/2)2 = n2/4
T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)
Key Insight: Merge is linear
!179
2 1 34 57 68 910
Key Insight: Merge is linear
!180
2 1 34 57 68 910
Key Insight: Merge is linear
!181
2 34 57 68 910
1
Key Insight: Merge is linear
!182
2 34 57 68 910
1
Key Insight: Merge is linear
!183
34 57 68 910
21
Key Insight: Merge is linear
!184
34 57 68 910
21
Key Insight: Merge is linear
!185
4 57 68 910
21 3
Key Insight: Merge is linear
!186
4 57 68 910
21 3
Key Insight: Merge is linear
!187
57 68 910
21 3 4
Key Insight: Merge is linear
!188
57 68 910
21 3 4
Key Insight: Merge is linear
!189
7 68 910
21 3 4 5
Key Insight: Merge is linear
!190
7 68 910
21 3 4 5
Key Insight: Merge is linear
!191
7 8 910
21 3 4 5 6
Key Insight: Merge is linear
!192
7 8 910
21 3 4 5 6
Key Insight: Merge is linear
!193
8 910
21 3 4 5 76
Key Insight: Merge is linear
!194
8 910
21 3 4 5 76
Key Insight: Merge is linear
!195
910
21 3 4 5 76 8
Key Insight: Merge is linear
!196
910
21 3 4 5 76 8
Key Insight: Merge is linear
!197
10
21 3 4 5 76 8 9
Key Insight: Merge is linear
!198
10
21 3 4 5 76 8 9
Key Insight: Merge is linear
!199
21 3 4 5 76 8 9 10
Key Insight: Merge is linear
!200
21 3 4 5 76 8 9 10
Each step makes one comparison and reduces the number of elements to
be merged by 1. If there are n total elements to be
merged, merging is O(n)
Divide and Conquer
!201
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
108 52314 43 100 200 27476 158 195 59911 26064 932
T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)
Divide and Conquer
!202
76108 158195523 59914 43100 200 2743 11260 5642 932
T(n)
108 52314 43 100 200 2743 76 158 195 59911 2605 642 932
Speed up insertion sort by a factor of two by splitting in half, sorting separately and merging results!
T(1/2n) ≈ 1/4 T(n) T(1/2n) ≈ 1/4 T(n)
108 52314 43 100 200 27476 158 195 59911 26064 932
T(n) ≈ 1/2 T(n) + n
Divide and Conquer
Splitting in two gives 2x improvement.
!203
Divide and Conquer
Splitting in two gives 2x improvement.
Splitting in four gives 4x improvement.
!204
Divide and Conquer
Splitting in two gives 2x improvement.
Splitting in four gives 4x improvement.
Splitting in eight gives 8x improvement.
!205
Divide and Conquer
Splitting in two gives 2x improvement.
Splitting in four gives 4x improvement.
Splitting in eight gives 8x improvement.
What if we never stop splitting?
!206
Merge Sort
!207
76108 158195523 59914 43 200 2743 11260 642 932
7610852314 43 200 2743 158195 599 11260 642 932
14 43 2003 76108523274 158195 599 2 11260 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!208
76108 158195523 59914 43 200 2743 11260 642 932
7610852314 43 200 2743 158195 599 11260 642 932
14 43 2003 76108523274 158195 599 2 11260 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!209
76108 158195523 59914 43 200 2743 11260 642 932
7610852314 43 200 2743 158195 599 11260 642 932
14 43 2003 76108523274 158195 599 2 11260 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!210
76108 158195523 59914 43 200 2743 11260 642 932
7610852314 43 200 2743 158195 599 11260 642 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!211
76108 158195523 59914 43 200 2743 11260 642 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!212
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort
!213
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge Sort Analysis
!214
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
O(n)
O(n)
O(n)
O(n)
O(n)
Merge Sort Analysis
!215
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge how many times?
n
n/2
n/4
. . .
n/2k
Merge Sort Analysis
!216
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
Merge how may times? n/2k = 1 n = 2k
log2 n = k
n
n/2
n/4
. . .
n/2k
Merge Sort Analysis
!217
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
n
n/2
n/4
. . .
n/2k
Merge n elements log2 n times
Merge Sort Analysis
!218
7664 19510843 1582 11 14 263 523274 599200 932
5232742003 43 76 10814 262 11 195158 59964 932
3 43 20014 52327410876 1952 158 599 2611 64 932
3 14 43 200 523274 10876 195 599 2 158 2611 64 932
14 3 43 200 523274 76108 195 599 158 2 11260 64 932
O(n log n )
O(n)
O(n)
O(n)
O(n)
O(n)
Merge Sort
How would you code this?
!219
Merge Sort
How would you code this?
Hint: Divide and Conquer!!!
!220
Merge Sort
void mergeSort(array){ if array size <= 1 return //base case split array into left_array and right_array mergeSort(left_array) mergeSort(right_array) merge(left_array, right_array, sorted_array)}
!221
Merge Sort Analysis
Execution time does NOT depend on initial arrangement of data
Worst Case: O( n log n) comparisons and data moves
Best Case: O( n log n) comparisons and data moves
Stable
Best we can do with comparison-based sorting that does not rely on a data structure in the worst case => can’t beat O( n log n)
Space overhead: auxiliary array at each merge step
!222
What we have so far
!223
Worst Case Best Case
Selection Sort O( n2 ) O( n2 )
Insertion Sort O( n2 ) O( n )
Bubble Sort O( n2 ) O( n )
Merge Sort O( n log n ) O( n log n )
Quick Sort
!224
Quick Sort
!225
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!226
pivot
<= pivot
> pivot
Partition
Quick Sort
!227
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!228
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!229
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
swap
Partition
Quick Sort
!230
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!231
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!232
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!233
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
swap
Partition
Quick Sort
!234
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!235
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!236
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!237
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!238
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
swap
Partition
Quick Sort
!239
pivot
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
Partition
Quick Sort
!240
Select a pivot. Arrange other entries s.t. entries in left partition are ≤ pivot
and entries in right partition are > pivot
<= pivot
> pivot
≤ pivot > pivotquickSort( ) quickSort( )
Partition
Quick Sort Analysis
Divide and Conquer
n comparisons for each partition
How many subproblems? => Depends on pivot selection
Ideally partition divides problem into two n/2 subproblems for logn recursive calls (Best case)
Possibly (though unlikely) each partition has 1 empty subarray for n recursive calls (Worst case)
!241
template<class T> void quickSort(T the_array[], int first, int last) { if (last - first + 1 < MIN_SIZE) { insertionSort(the_array, first, last); } else { // Create the partition: S1 | Pivot | S2 int pivot_index = partition(the_array, first, last); // Sort subarrays S1 and S2 quickSort(the_array, first, pivot_index - 1); quickSort(the_array, pivotIndex + 1, last); } // end if } // end quickSort
!242
How to select pivot?
!243
How to select pivot?
Ideally median Need to sort array to find median
Other ideas?
!244
How to select pivot?
Ideally median Need to sort array to find median
Other ideas? Pick first, middle, last position and order them making middle the pivot
!245
695 13
How to select pivot?
Ideally median Need to sort array to find median
Other ideas? Pick first, middle, last position and order them making middle the pivot
!246
136 95
pivot
Quick Sort AnalysisExecution time DOES depend on initial arrangement of data AND on PIVOT SELECTION (luck?) => on random data can be faster than Merge Sort
Optimization (e.g. smart pivot selection, speed up base case, iterative instead of recursive implementation) can improve actual runtime -> fastest comparison-based sorting algorithm on average
Worst Case: O( n2) comparisons and data moves
Best Case: O( n log n) comparisons and data moves
Unstable
!247
!248
Worst Case Best Case
Selection Sort O( n2 ) O( n2 )
Insertion Sort O( n2 ) O( n )
Bubble Sort O( n2 ) O( n )
Merge Sort O( n log n ) O( n log n )
Quick Sort O( n2 ) O( n log n )
!249
https://www.toptal.com/developers/sorting-algorithms