Post on 19-Dec-2015
Chapter 5
Efficiency and Analysis
Algorithm selection
Algorithms are ordered lists of steps for solving a problem
Algorithms are also abstractions of the programming process
Often there are many different algorithms for solving the same problem
Algorithms: main considerations
How fast will it run?
How much memory will it use?
How easy is it to implement?
The choice depends on the software
– It may mean your algorithm is not the most efficient
Program efficiency
Computer scientists analyze algorithms for their efficiency
– In a machine-independent manner
– By focusing on the critical operations performed
– Like the number of comparisons (for sorting routines)
Rate of growth
The most important aspect of algorithm efficiency
Two algorithms may be just as efficient for small data sets but differ greatly for large ones.
Asymptotic Efficiency
The asymptotic efficiency of an algorithm
– describes its relative efficiency as n gets very large
Often called the “order of complexity”
Written in “Big O” notation: O()
Example: Sequential search is O(n)
Exercise
– Rank the asymptotic orders of the following functions, from highest to lowest.
(You may wish to graph some or all of them.)
x
x + log x
x log x
x^2 + 100
2^x + x^2
100
log x
1.1
Analyzing the Linear Search
Assumes a list of n elements
Assumes a search key
A linear search looks for the target key value by proceeding in sequential fashion through the list
Sample record format
key data
Linear Search algorithm
For each item in the list
    if the item’s key matches the target, stop and report “success”
Report “failure”
Linear Search
int linearSearch(int a[], int n, int target)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] == target) // key comparison
            return i;
    return -1; // use -1 to indicate failure
}
Analysis
Speed of an algorithm is measured by counting key comparisons
Best case is 1 comparison
Worst case is n comparisons
Average case is about n/2 comparisons
– if the target is in the list
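The best and worst cases above can be checked by counting key comparisons directly. The following is a minimal sketch, not the slides' own code: the function name linearSearchCounted and the comparisons out-parameter are assumptions introduced for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not from the slides): a linear search that also
// reports, through `comparisons`, how many key comparisons it made.
int linearSearchCounted(const int a[], int n, int target, int& comparisons)
{
    assert(n >= 0);                 // a list cannot have negative length
    comparisons = 0;
    for (int i = 0; i < n; i++) {
        comparisons++;              // one key comparison per element visited
        if (a[i] == target)
            return i;               // found: stop early
    }
    return -1;                      // not found: every element was compared
}
```

Searching for the first element should cost 1 comparison, and searching for the last element or for a missing key should cost n.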
Analysis (continued)
What if the target we are looking for has only a 50% chance of being in the list?
The analysis must account for both targets that can be found and targets that cannot.
For targets that can be found, the average is n/2 comparisons
For targets that cannot be found, it is n comparisons
For targets that can be found only 50% of the time, it is:
1/2 * n/2 (found) + 1/2 * n (not found)
The average number of comparisons is therefore: n/4 + n/2 = 3n/4
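The weighted average above can be written as a short derivation. The symbol p (the probability that the target is in the list) is introduced here only for illustration; the slides use p = 1/2.

```latex
\begin{align*}
C_{\text{avg}}(n) &= p \cdot \frac{n}{2} + (1 - p) \cdot n \\
                  &= \frac{1}{2} \cdot \frac{n}{2} + \frac{1}{2} \cdot n
                     \qquad \text{(with } p = \tfrac{1}{2}\text{)} \\
                  &= \frac{n}{4} + \frac{n}{2} = \frac{3n}{4}
\end{align*}
```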
Polynomial expressions
All polynomial expressions are made up of a set of terms, usually with coefficients and exponents
The fastest growing term is called the term of the “highest order”
A linear search, like the last example, has only one term to describe it: 3n/4
This can be written as (3/4)n
3/4 is the coefficient, n is the variable, and the exponent of n is 1
(3/4)n is also the highest order term
From polynomials to Big-O
A linear function is one whose growth is proportional to n
In other words, n is the highest order term
The growth of the expression is tied to the highest order term and is called the “order of magnitude”
Order of magnitude is expressed as Big-O
For linear functions, the order is n
We express this as O(n)
To find out what the order of an expression is, look for the highest order term.
Graphing Big-O
The x axis corresponds to the number of elements in the list
The y axis is the number of critical operations required by an algorithm
The order of complexity is the highest order term in the analysis because it is the one that increases the fastest
Constant multipliers and lower order terms are ignored
Example: n+100 is graphed as O(n)
For a linear search n/2 is also O(n)
Similarly, from the last example, 3n/4 is O(n)
[Figure: A graph of the average case performance of Linear Search, plotting f(n) against n]
Exercise
– For the following formulas, identify the high-order term and indicate the order of each using big-O notation.
Binary Search
Repeatedly divides the list in half and in half again
The result is a great improvement on linear searching
What is the asymptotic efficiency?
– Best case is 1 comparison
– Worst case is about log2(n) comparisons
Logarithmic complexity is much faster than linear
– Even if we use the worst case
Binary search strategy
10 14 15 20 23 25 26 27 31 32 34 37 41 42 44 45 46 49
A[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11][12] [13][14] [15][16] [17]
A[mid]
Searching for 26
Equation 5-2
The growth of the base-2 log function
n            n (as power of 2)    log2 n
16           2^4                  4
256          2^8                  8
4,096        2^12                 12
65,536       2^16                 16
1,048,576    2^20                 20
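The table can be reproduced by counting how many times n can be halved, which is what binary search does to its search range. This is a small sketch under that reading; the function name halvingSteps is an assumption and does not appear in the slides.

```cpp
#include <cassert>

// Hypothetical helper (not from the slides): counts how many times n can
// be halved (integer division) before it reaches 1. For n >= 1 this is
// floor(log2(n)), and exactly log2(n) when n is a power of 2.
int halvingSteps(long n)
{
    assert(n >= 1);     // log2 is only defined for positive n
    int steps = 0;
    while (n > 1) {
        n /= 2;         // discard half of the remaining range
        steps++;
    }
    return steps;
}
```

For the powers of two in the table, the result should match the log2 n column.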
[Figure: Graph illustrating relative growth of linear and logarithmic functions; the curves n and log n are plotted as f(n) against n]
Defective Binary Search
int binarySearch(int a[], int n, int target)
{
    // Precondition: array a is sorted in ascending order from a[0] to a[n-1]
    int first(0), last(n - 1);
    int mid;
    while (first <= last) {
        mid = (first + last)/2;
        if (target == a[mid])
            return mid;
        else if (target < a[mid])
            last = mid;     // defect: a[mid] stays inside the search range
        else // must be that target > a[mid]
            first = mid;    // defect: a[mid] stays inside the search range
    }
    return -1; // use -1 to indicate item not found
}
Binary search strategy: searching for 26
10 14 15 20 23 25 26 27 31 32 34 37 41 42 44 45 46 49
A[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17]
Step   First   Last   mid
1      0       17     8
2      0       8      4
3      4       8      6    (A[6] = 26, found)
Problem: bad loop invariant
This code works fine for finding an item that is contained in the list.
However, it does not work if the item is not in the list.
We get an infinite loop in that case.
The reason for this is the bad loop invariant.
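The infinite loop can be observed safely by running the defective update rule under an iteration cap. The sketch below is for experimentation only: the function name, the maxIterations parameter and the -2 return code are assumptions added here, not part of the slides.

```cpp
#include <cassert>

// Sketch for experimentation (not from the slides): the defective update
// rule (last = mid, first = mid) with a hypothetical iteration cap so the
// run always ends. Returns the index if found, -1 if the loop exits
// normally, and -2 if the cap was hit (the loop was stuck).
int defectiveSearchCapped(const int a[], int n, int target, int maxIterations)
{
    assert(n >= 0);
    int first = 0, last = n - 1;
    int iterations = 0;
    while (first <= last) {
        if (++iterations > maxIterations)
            return -2;              // gave up: the range stopped shrinking
        int mid = (first + last) / 2;
        if (target == a[mid])
            return mid;
        else if (target < a[mid])
            last = mid;             // defect: mid stays inside the range
        else
            first = mid;            // defect: mid stays inside the range
    }
    return -1;
}
```

Because mid is never excluded, first <= last stays true once the loop has started, so a missing target can only end the run through the cap.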
Binary search strategy: searching for 24
10 14 15 20 23 25 26 27 31 32 34 37 41 42 44 45 46 49
A[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17]
Step   First   Last   mid
1      0       17     8
2      0       8      4
3      4       8      6
4      4       6      5
5      4       5      4    WE ARE NOW STUCK HERE FOREVER!!
Illustrated Invariant for Binary Search
[Figure: array a with the region between first and last shaded; the target must lie in the shaded region]
Binary Search Invariant
a[first] <= target <= a[last]
This assumes target is in the list
The correct invariant should be
if target in a, then a[first] <= target <= a[last]
Importance of loop invariant
Proving program correctness
– Argument values (does the loop have what it needs to work correctly)
– Termination (regardless of its correctness, will it ever stop)
We violated the termination requirement by not proving that the loop would stop under all conditions
Proof of correctness
We can show that the original loop invariant is incorrect using a simple trace.
Tracing skills are an important part of a programmer’s skill set.
The next slide contains an example of a trace for the binary search program that demonstrates the problem of the infinite loop.
Trace of Code Example 5-2
Verified Binary Search
int binarySearch(int a[], int n, int target)
{
    // Precondition: array a is sorted in ascending order
    // from a[0] to a[n-1]
    int first(0);
    int last(n - 1);
    int mid;
    while (first <= last) {
        // Invariant:
        // if target in a, then a[first] <= target <= a[last]
        mid = (first + last)/2;
        if (target == a[mid])
            return mid;
        else if (target < a[mid])
            last = mid - 1;
        else // must be that target > a[mid]
            first = mid + 1;
    }
    return -1; // use -1 to indicate item not found
}
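Hand traces of the verified search can be cross-checked by having the code record its own trace. This is a sketch, assuming a small TraceRow struct and a std::vector to hold the rows; both names are introduced here and are not from the slides. The search logic itself is the verified version above.

```cpp
#include <cassert>
#include <vector>

// One row of a trace: the values of first, last and mid in one iteration.
struct TraceRow { int first, last, mid; };

// Hypothetical tracing variant (not from the slides) of the verified
// binary search: same logic, but each iteration is appended to `trace`.
int binarySearchTraced(const int a[], int n, int target,
                       std::vector<TraceRow>& trace)
{
    // Precondition: a is sorted in ascending order from a[0] to a[n-1]
    assert(n >= 0);
    int first = 0, last = n - 1;
    while (first <= last) {
        int mid = (first + last) / 2;
        trace.push_back({first, last, mid});    // record this iteration
        if (target == a[mid])
            return mid;
        else if (target < a[mid])
            last = mid - 1;     // mid is excluded, so the range shrinks
        else
            first = mid + 1;    // mid is excluded, so the range shrinks
    }
    return -1;                  // first > last: target is not in the list
}
```

Printing the rows after a call gives a table that can be compared line by line with a trace done by hand.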
Binary search strategy: searching for 24 with the verified version
10 14 15 20 23 25 26 27 31 32 34 37 41 42 44 45 46 49
A[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17]
Step   First   Last   mid
1      0       17     8
2      0       7      3
3      4       7      5
4      4       4      4
5      5       4      -    PROGRAM TERMINATES WHEN first > last
Analysis of simple sorting algorithms
Selection sort
Bubble sort
Analysis of selection sort
Assume n element list
Makes n-1 passes through the list
Each pass has a sequential search of some portion of the n elements
Analysis: n * n = O(n^2)
The maxSelect function
int maxSelect(int a[], int n)
{
    int maxPos(0), currentPos(1);
    while (currentPos < n) {
        // Invariant: a[maxPos] >= a[0] ... a[currentPos-1]
        if (a[currentPos] > a[maxPos])
            maxPos = currentPos;
        currentPos++;
    }
    return maxPos;
}
The operation of Selection Sort
[Figure: array a with the sorted items shaded, from just after last up to n-1; all items in the unshaded part are less than the shaded items]
Selection Sort
void selectionSort(int a[], int n)
{
    int last(n-1);
    int maxPos;
    while (last > 0) {
        // invariant: a[last+1] ... a[n-1] is sorted &&
        // everything in a[0] ... a[last] <= everything in a[last+1] ... a[n-1]
        maxPos = maxSelect(a, last+1); // last+1 is length from 0 to last
        swapElements(a, maxPos, last);
        last--;
    }
}
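The slides call swapElements but never show its body. The following is one plausible definition, an assumption consistent with how it is called, together with a hypothetical isSorted helper that is handy for checking the result of either sort.

```cpp
#include <cassert>

// Assumed body for swapElements (declared but not defined in the slides):
// exchanges a[i] and a[j] using a temporary, i.e. 3 assignments.
void swapElements(int a[], int i, int j)
{
    assert(i >= 0 && j >= 0);   // indices must not be negative
    int temp = a[i];
    a[i] = a[j];
    a[j] = temp;
}

// Hypothetical test helper (not from the slides): true if
// a[0] ... a[n-1] is in ascending order.
bool isSorted(const int a[], int n)
{
    for (int i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return false;       // found an out-of-order adjacent pair
    return true;
}
```

The three assignments in swapElements are one way to arrive at a figure of 3 operations per swap.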
Selection Sort example
n = 5; each row is one pass, with the (maxPos, currPos) values seen during the maxSelect scan
last   array at start   (maxPos, currPos) during scan        array after swap
4      2 7 3 5 6        (0,1) (1,1) (1,2) (1,3) (1,4)        2 6 3 5 7
3      2 6 3 5 7        (0,1) (1,1) (1,2) (1,3)              2 5 3 6 7
2      2 5 3 6 7        (0,1) (1,1) (1,2)                    2 3 5 6 7
1      2 3 5 6 7        (0,1) (1,1)                          2 3 5 6 7
0      2 3 5 6 7        loop ends
Result after each pass
               A[0] A[1] A[2] A[3] A[4]
Initial        2    7    3    5    6
After pass 1   2    6    3    5    7
After pass 2   2    5    3    6    7
After pass 3   2    3    5    6    7
After pass 4   2    3    5    6    7
Final          2    3    5    6    7
Analysis
Number of passes: n-1
– the first pass is guaranteed to place 1 item
– the second guarantees a second
– the third a third, etc.
– After n-1 passes we have them all in place
Analysis: comparisons
In the first pass we compared n-1 pairs
In the second, n-2
In the third, n-3, etc.
The actual number of comparisons made across all passes, for this example, was: 4+3+2+1 = 10
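The pass-by-pass counts above add up to a standard closed form. The derivation below is a sketch; the index k is introduced only for the summation.

```latex
\[
(n-1) + (n-2) + \dots + 2 + 1
  \;=\; \sum_{k=1}^{n-1} k
  \;=\; \frac{n(n-1)}{2}
  \;=\; \frac{n^2}{2} - \frac{n}{2},
\qquad \text{so for } n = 5:\ \frac{5 \cdot 4}{2} = 10 .
\]
```

The highest order term is n^2/2, which is why the comparison count is O(n^2).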
Order of complexity
The number of passes is O(n)
The number of comparisons in each pass is also O(n)
Therefore, the order of complexity of the sort is O(n*n) = O(n^2)
Analysis of selection sort (cont’d)
What if the list is sorted to begin with?
This selection sort is a mindless one
– it would not know that it should stop.
Best case and worst case are the same (except for the swaps)
The best, worst and average cases are all quadratic: O(n^2)
Bubble Sort
Also n-1 passes
Also n-pass comparisons in each pass (n minus the pass number)
Order of complexity is therefore n-squared
The same as the selection sort
The main difference is that swapping may occur as many as n-1 times on a single pass; with the selection sort it only occurs once, at the end of the pass.
Example of one phase of Bubble Sort
[Figure: successive “compare” steps on adjacent pairs of the list 17 9 21 6 3 32 37 41 45, with a label marking the items that are already sorted]
bubbleSortPhase
// void swapElements(int a[], int maxPos, int last);
void bubbleSortPhase(int a[], int last)
{
    // Precondition: a is an array indexed from a[0] to a[last]
    // Move the largest element between a[0] and a[last] into a[last],
    // by swapping out of order pairs
    int pos;
    // pos must reach last-1 so that the pair (a[last-1], a[last]) is compared
    for (pos = 0; pos < last; pos++)
        if (a[pos] > a[pos+1]) {
            swapElements(a, pos, pos+1);
        }
    // Postconditions: a[0] ... a[last]
    // contain the same elements, possibly reordered;
    // a[last] >= a[0] ... a[last-1]
}
Bubble Sort
void bubbleSortPhase(int a[], int last);
void bubbleSort(int a[], int n)
{
    // Precondition: a is an array indexed from a[0] to a[n-1]
    int i;
    for (i = n - 1; i > 0; i--)
        bubbleSortPhase(a, i);
    // Postcondition: a is sorted
}
Equation 5-7
Version 1 of bubble sort
This version uses n-1 passes
During each pass, n-1 pairs are compared
Every time a pair needs to be swapped, this is done before the pass can continue
Bubble Sort: difficult example
n = 5, last = 4; each arrow is one swap of the pair at Pos and Pos+1
Pass 1: 7 6 5 3 2 -> 6 7 5 3 2 -> 6 5 7 3 2 -> 6 5 3 7 2 -> 6 5 3 2 7
Pass 2: 6 5 3 2 7 -> 5 6 3 2 7 -> 5 3 6 2 7 -> 5 3 2 6 7
Pass 3: 5 3 2 6 7 -> 3 5 2 6 7 -> 3 2 5 6 7
Pass 4: 3 2 5 6 7 -> 2 3 5 6 7
Result after each pass
               A[0] A[1] A[2] A[3] A[4]
Initial        7    6    5    3    2
After pass 1   6    5    3    2    7
After pass 2   5    3    2    6    7
After pass 3   3    2    5    6    7
After pass 4   2    3    5    6    7
Final          2    3    5    6    7
Analysis
For this example, we made n-1 passes through the array
Each time we looked at n-1 pairs
– (we should have only looked at n-pass pairs. Why?)
Either way, the number of passes is O(n)
The number of pairs processed in each pass is O(n)
So, overall we get O(n*n) = O(n^2)
Selection and Bubblesort comparison
Both are O(n^2) sorts
If we count up the number of critical operations for both sorts, handling n-1, n-2, etc. pairs for each pass, using the data in the last example, we get
Selection sort: 10 comparisons
Bubble sort: 10 comparisons
What about swaps?
The same swap function can be used for both programs.
Let’s say it takes 3 operations, then the actual number of operations is
Selection sort: 10 + 3(n-1) = 22
Bubble sort: 10 + 3(10) = 40
The selection sort uses roughly half the number of operations of the bubble sort
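The totals above can be reproduced by instrumenting both sorts. This is a sketch under the slides' assumptions (3 operations per swap, and n-1, n-2, ... pairs per pass); the Counts struct and all function names are hypothetical and introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical instrumentation (not from the slides): tallies of the two
// critical operations.
struct Counts { int comparisons; int swaps; };

// Exchanges a[i] and a[j] and records that one swap was made.
void countedSwap(int a[], int i, int j, Counts& c)
{
    int temp = a[i]; a[i] = a[j]; a[j] = temp;
    c.swaps++;
}

// Selection sort as in the slides: one swap call at the end of each pass,
// even when the maximum is already in place.
Counts selectionSortCounted(int a[], int n)
{
    assert(n >= 0);
    Counts c = {0, 0};
    for (int last = n - 1; last > 0; last--) {
        int maxPos = 0;
        for (int pos = 1; pos <= last; pos++) {
            c.comparisons++;
            if (a[pos] > a[maxPos]) maxPos = pos;
        }
        countedSwap(a, maxPos, last, c);
    }
    return c;
}

// Bubble sort comparing `last` adjacent pairs per pass (n-1, n-2, ...).
Counts bubbleSortCounted(int a[], int n)
{
    assert(n >= 0);
    Counts c = {0, 0};
    for (int last = n - 1; last > 0; last--)
        for (int pos = 0; pos < last; pos++) {
            c.comparisons++;
            if (a[pos] > a[pos + 1]) countedSwap(a, pos, pos + 1, c);
        }
    return c;
}

// Total cost under the slides' assumption of 3 operations per swap.
int totalOperations(const Counts& c) { return c.comparisons + 3 * c.swaps; }
```

Running both on the list 7 6 5 3 2 and passing the results to totalOperations gives figures to compare against the two totals quoted above.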
Moral
Two sorts of the same order, O(n^2), may not be the same speed.
It depends on the data sets they are sorting.
It also depends on the way they are implemented.
Data sets
The data set we chose for the bubble sort was the worst one possible.
What happens if we run it on the data set we originally used for the selection sort?
Result after each pass
               A[0] A[1] A[2] A[3] A[4]
Initial        2    7    3    5    6
After pass 1   2    3    5    6    7
After pass 2   2    3    5    6    7
Improvement needed
Neither the selection sort nor the bubble sort knows enough to stop if the list is sorted early!
We should be able to come up with a smart version of the bubble sort that can do this.
Instead of the outer for loop, let’s use a while loop that runs until the array is sorted.
Further improvements
We will also make sure the inner loop runs the minimum number of times (n-pass)
and that we keep track of whether a swap was needed for the pass we are on.
If a swap was needed then we cannot assume the array is sorted.
If a swap was not needed, then the array is sorted and we should send that signal to the outer loop.
Bubble sort improvements
What improvements can you suggest for the bubble sort that we have seen?
Improvements
Do not look at portions of the array that are already in sorted order
Leave when the array is sorted
– this is something the insertion and selection sorts we have seen could not do.
Improvements
How much of an operational difference does the smart bubble sort have over its dumb form?
How much of an operational difference does the smart bubble sort have over the selection sort?
Do these improvements make a difference in Big-O?
Which form of sort is best?
Bubble Sort
void bubbleSort(int a[], int n)
{
    // Precondition: a is an array indexed from a[0] to a[n-1]
    bool sorted(false);
    int pass(0);
    int pos;
    while (!sorted) {
        sorted = true; // assume sorted until a swap proves otherwise
        // after `pass` passes, the last `pass` items are already in place
        for (pos = 0; pos < n - 1 - pass; pos++)
            if (a[pos] > a[pos+1]) {
                swapElements(a, pos, pos+1);
                sorted = false;
            }
        pass++;
    }
    // Postcondition: a is sorted
}
Result after each pass
               A[0] A[1] A[2] A[3] A[4]
Initial        2    7    3    5    6
After pass 1   2    3    5    6    7
After pass 2   2    3    5    6    7
Analysis
Only two passes are made
4 pairs are checked on pass 1
swap is called 3 times on pass 1
3 pairs are checked on pass 2, no swaps
Total operations: (4 + 3) comparisons + 3(3) for the swaps = 16
Better than the selection sort.
The savings would be even bigger for larger lists that were sorted early.
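The counts quoted above (two passes, 4 and then 3 pairs checked, 3 swaps) can be reproduced with an instrumented early-exit bubble sort. This is a sketch only: the PassStats struct and the function name are assumptions, and the swap is written inline so the block stands alone.

```cpp
#include <cassert>

// Hypothetical tallies (not from the slides).
struct PassStats { int passes; int comparisons; int swaps; };

// Early-exit bubble sort: stops after the first pass that makes no swap,
// and skips the items already placed by earlier passes.
PassStats smartBubbleSortCounted(int a[], int n)
{
    assert(n >= 0);
    PassStats s = {0, 0, 0};
    bool sorted = false;
    while (!sorted) {
        sorted = true;                  // assume sorted until a swap occurs
        for (int pos = 0; pos < n - 1 - s.passes; pos++) {
            s.comparisons++;
            if (a[pos] > a[pos + 1]) {
                int temp = a[pos];      // inline swap of the adjacent pair
                a[pos] = a[pos + 1];
                a[pos + 1] = temp;
                s.swaps++;
                sorted = false;
            }
        }
        s.passes++;
    }
    return s;
}
```

On a list that is already sorted, the same function makes a single pass of n-1 comparisons and no swaps.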
Bubble sort: final analysis
Not counting swapping, the best case is only n-1 operations!
Worst case is still n-squared
Order of complexity is the same as the selection sort no matter what.