Algorithm Analysis
Dr. Bernard Chen, Ph.D., University of Central Arkansas

Post on 24-Dec-2015

Outline

- Big O notation
- Two examples:
  - Search program
  - Max. Contiguous Subsequence

Program

Algorithms + Data Structures = Programs

Algorithms:
- Must be definite and unambiguous
- Must be simple enough to be carried out by a computer
- Must terminate after a finite number of operations

Why Algorithm analysis

Generally, we use a computer because we need to process a large amount of data.

When we run a program on large amounts of input, besides making sure the program is correct, we must be certain that the program terminates within a reasonable amount of time.

What is Algorithm Analysis?

Algorithm: A clearly specified finite set of instructions a computer follows to solve a problem.

Algorithm analysis: the process of determining the amount of time, memory, and other resources required to execute an algorithm.

Big O Notation

Big O notation is used to capture the most dominant term in a function, and to represent the growth rate.

Also called asymptotic upper bound.

Ex:
100n^3 + 30000n => O(n^3)
100n^3 + 2n^5 + 30000n => O(n^5)

Examples of Algorithm Running Times

- Min element in an array: O(n)
- Closest points in the plane (X-Y coordinates), i.e., the pair of points with the smallest distance: n(n-1)/2 pairs => O(n^2)
- Collinear points in the plane, i.e., 3 points on a straight line: n(n-1)(n-2)/6 triples => O(n^3)

Examples of Algorithm Running Times

In the function 10n^3 + n^2 + 40n + 80, for n = 1000, the value of the function is 10,001,040,080, of which 10,000,000,000 is due to the 10n^3 term.
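The arithmetic above can be checked with a short sketch. The function names f and cubicShare are illustrative assumptions, not part of the slides.

```cpp
#include <cstdint>

// Value of f(n) = 10n^3 + n^2 + 40n + 80, the function from the slide.
std::int64_t f(std::int64_t n)
{
    return 10 * n * n * n + n * n + 40 * n + 80;
}

// Fraction of f(n) contributed by the dominant term 10n^3.
// (Hypothetical helper, added only for illustration.)
double cubicShare(std::int64_t n)
{
    return static_cast<double>(10 * n * n * n) / static_cast<double>(f(n));
}
```

For n = 1000, f(1000) is 10,001,040,080 and the cubic term accounts for more than 99.98% of it, which is why Big O keeps only that term.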

Various growth rates

Functions in order of increasing growth rate

Static Searching problem

Static Searching Problem

Given an integer X and an array A, return the position of X in A or an indication that it is not present. If X occurs more than once, return any occurrence. The array A is never altered.

Sequential Search

- A sequential search steps through the data sequentially until a match is found.
- A sequential search is useful when the array is not sorted.
- A sequential search is linear, O(n) (i.e., proportional to the size of the input):
  - Unsuccessful search: n comparisons
  - Successful search (worst case): n comparisons
  - Successful search (average case): n/2 comparisons
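A minimal sketch of sequential search as described above. The name sequentialSearch and the NOT_FOUND sentinel of -1 are assumptions chosen to match the binary search slide.

```cpp
#include <vector>

const int NOT_FOUND = -1;  // assumed sentinel for "not present"

// Returns the index of the first occurrence of x in a, or NOT_FOUND.
// The array does not need to be sorted.
int sequentialSearch(const std::vector<int>& a, int x)
{
    for (int i = 0; i < static_cast<int>(a.size()); i++) {
        if (a[i] == x)
            return i;   // match found after i + 1 comparisons
    }
    return NOT_FOUND;   // all n elements were examined
}
```

An unsuccessful search falls through the whole loop, which is the n-comparison case listed above.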

Binary Search

If the array has been sorted, we can use binary search, which starts from the middle of the array rather than from one end.

We keep track of low_end and high_end, which delimit the portion of the array in which an item, if present, must reside.

If low_end is larger than high_end, we know the item is not present.

Cont.

Sequential search: => O(n)

Binary search (sorted data): => O(log n)

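Binary Search (three-way comparisons)

#include <vector>
using namespace std;

const int NOT_FOUND = -1;

// Binary search using three-way comparisons.
// Returns the index of x in the sorted vector a, or NOT_FOUND.
int binarySearch(const vector<int>& a, int x)
{
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;   // avoids overflow of low + high
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return NOT_FOUND;
}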
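To make the gap between O(n) and O(log n) concrete, the following sketch counts how many array elements each search examines. The helper names sequentialProbes, binaryProbes, and sortedRange are hypothetical and introduced only for this illustration.

```cpp
#include <vector>

// Number of elements examined by a sequential search for x in a.
int sequentialProbes(const std::vector<int>& a, int x)
{
    int probes = 0;
    for (int value : a) {
        probes++;
        if (value == x)
            break;
    }
    return probes;
}

// Number of middle elements examined by a binary search for x in sorted a.
int binaryProbes(const std::vector<int>& a, int x)
{
    int probes = 0;
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        probes++;
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            break;
    }
    return probes;
}

// Builds the sorted array 0, 1, ..., n-1 used for the comparison.
std::vector<int> sortedRange(int n)
{
    std::vector<int> a(n);
    for (int i = 0; i < n; i++)
        a[i] = i;
    return a;
}
```

On a sorted array of 1,000,000 elements, an unsuccessful sequential search examines all 1,000,000 of them, while binary search examines at most 20, because each probe halves the remaining range.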

The Max. Contiguous Subsequence

Given (possibly negative) integers A_1, A_2, ..., A_n, find (and identify the sequence corresponding to) the maximum value of the sum A_i + ... + A_j over all 1 <= i <= j <= n. The maximum contiguous subsequence sum is zero if all the integers are negative.

{-2, 11, -4, 13, -5, 2} => 20
{1, -3, 4, -2, -1, 6} => 7

Real-life example from http://solvealgo.blogspot.com/2009/03/dynamic-programming-1-maximum-value.html

The club X never closes. Its public entrance, a revolving door, just keeps on spinning. With each rotation some punters enter and others leave. The club’s owners would like to track this traffic. Specifically, they’d like to know the maximum increase in people entering the club over a given period.

Entry log: 0 1 2 -3 3 -1 0 -4 0 -1 -4 2 4 1 3 1

Positive = people entering the club; negative = people leaving.

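Brute Force Algorithm O(n^3)

#include <vector>
using namespace std;

// A cubic maximum contiguous subsequence sum algorithm.
// seqStart and seqEnd receive the bounds of the best subsequence;
// they are left unchanged if no positive sum exists.
int maxSubSum(const vector<int>& a, int& seqStart, int& seqEnd)
{
    int n = a.size();
    int maxSum = 0;
    for (int i = 0; i < n; i++) {          // for each possible start point
        for (int j = i; j < n; j++) {      // for each possible end point
            int thisSum = 0;
            for (int k = i; k <= j; k++)
                thisSum += a[k];           // dominant term
            if (thisSum > maxSum) {
                maxSum = thisSum;
                seqStart = i;
                seqEnd = j;
            }
        }
    }
    return maxSum;
}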

O(n^3) Algorithm Analysis

We do not need precise calculations for a Big-Oh estimate. In many cases, we can use the simple rule of multiplying the sizes of all the nested loops: here, three nested loops of size at most n give n * n * n = O(n^3).
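As a check on the multiplication rule, the sketch below counts exactly how often the innermost statement of the cubic algorithm runs and compares it with the closed form n(n+1)(n+2)/6. The function names are illustrative assumptions.

```cpp
// Counts how many times the innermost statement (thisSum += a[k]) of the
// cubic algorithm runs for an input of size n.
long long innermostCount(int n)
{
    long long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i; j < n; j++)
            for (int k = i; k <= j; k++)
                count++;
    return count;
}

// Closed form for the same count: n(n+1)(n+2)/6.
long long closedForm(long long n)
{
    return n * (n + 1) * (n + 2) / 6;
}
```

For n = 100 the exact count is 171,700, while the simple rule gives 100^3 = 1,000,000. The rule overestimates by a constant factor of about 6, which Big-Oh ignores, so both agree on O(n^3).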

O(N^2) algorithm

An improved algorithm makes use of the following fact: if we have already calculated the sum for the subsequence i, …, j-1, then we need only one more addition to get the sum for the subsequence i, …, j. However, the cubic algorithm throws away this information.

If we use this observation, we obtain an improved algorithm with running time O(N^2).

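O(N^2) Algorithm cont.

#include <vector>
using namespace std;

// A quadratic maximum contiguous subsequence sum algorithm (figure 6.5).
// seqStart and seqEnd receive the bounds of the best subsequence;
// they are left unchanged if no positive sum exists.
int maxSubsequenceSum(const vector<int>& a, int& seqStart, int& seqEnd)
{
    int n = a.size();
    int maxSum = 0;
    for (int i = 0; i < n; i++) {
        int thisSum = 0;
        for (int j = i; j < n; j++) {
            thisSum += a[j];               // extends the sum for i..j-1 to i..j
            if (thisSum > maxSum) {
                maxSum = thisSum;
                seqStart = i;
                seqEnd = j;
            }
        }
    }
    return maxSum;
}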

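O(N) Algorithm

If we remove another loop, we have a linear algorithm.

The algorithm is tricky. It uses a clever observation to step quickly over large numbers of subsequences that cannot be the best: if the running sum of the subsequence starting at i first becomes negative at position j, then no subsequence starting anywhere from i through j needs to be examined further, and the start can jump directly to j + 1.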

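O(N) Algorithm

#include <vector>
using namespace std;

// A linear maximum contiguous subsequence sum algorithm (figure 6.8).
// seqStart and seqEnd receive the bounds of the best subsequence;
// they are left unchanged if no positive sum exists.
int maxSubsequenceSum(const vector<int>& a, int& seqStart, int& seqEnd)
{
    int n = a.size();
    int thisSum = 0, maxSum = 0;
    int i = 0;                             // start of the current candidate
    for (int j = 0; j < n; j++) {
        thisSum += a[j];
        if (thisSum > maxSum) {
            maxSum = thisSum;
            seqStart = i;
            seqEnd = j;
        } else if (thisSum < 0) {
            i = j + 1;                     // a negative prefix cannot help
            thisSum = 0;
        }
    }
    return maxSum;
}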
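The following self-contained sketch restates the linear algorithm so it can be run on the examples from the earlier slides. Passing seqStart and seqEnd by reference, and initializing them to 0 and -1 to mean "empty", are assumptions made for this illustration.

```cpp
#include <vector>

// Linear-time maximum contiguous subsequence sum.
// seqStart and seqEnd receive the bounds of the best subsequence. They are
// set to 0 and -1 (an assumed "empty" convention) when no positive sum exists.
int maxSubsequenceSum(const std::vector<int>& a, int& seqStart, int& seqEnd)
{
    int maxSum = 0;
    int thisSum = 0;
    seqStart = 0;
    seqEnd = -1;
    for (int i = 0, j = 0; j < static_cast<int>(a.size()); j++) {
        thisSum += a[j];
        if (thisSum > maxSum) {
            maxSum = thisSum;
            seqStart = i;
            seqEnd = j;
        } else if (thisSum < 0) {
            i = j + 1;      // restart just past the negative prefix
            thisSum = 0;
        }
    }
    return maxSum;
}
```

On {-2, 11, -4, 13, -5, 2} this returns 20 with bounds 1 and 3, and on {1, -3, 4, -2, -1, 6} it returns 7. On the club entry log it returns 11, covering the last five entries (2 4 1 3 1).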

Checking an Algorithm Analysis

If possible, write code to test your algorithm for various large values of n.
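One way to do this is to time the algorithm on inputs of doubling size. The sketch below times a quadratic maximum contiguous subsequence sum with std::chrono. The helper names makeInput and secondsFor, the value range, and the simple pseudo-random generator are assumptions for illustration only.

```cpp
#include <chrono>
#include <vector>

// Quadratic maximum contiguous subsequence sum (sum only, no bounds).
int maxSubSumQuadratic(const std::vector<int>& a)
{
    int n = static_cast<int>(a.size());
    int maxSum = 0;
    for (int i = 0; i < n; i++) {
        int thisSum = 0;
        for (int j = i; j < n; j++) {
            thisSum += a[j];
            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    }
    return maxSum;
}

// Deterministic pseudo-random test input with values in [-50, 50].
std::vector<int> makeInput(int n)
{
    std::vector<int> a(n);
    unsigned int state = 12345u;
    for (int i = 0; i < n; i++) {
        state = state * 1103515245u + 12345u;   // linear congruential step
        a[i] = static_cast<int>((state >> 16) % 101) - 50;
    }
    return a;
}

// Wall-clock seconds taken by one run on an input of size n.
double secondsFor(int n)
{
    std::vector<int> a = makeInput(n);
    auto start = std::chrono::steady_clock::now();
    volatile int result = maxSubSumQuadratic(a);  // volatile keeps the call alive
    (void)result;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling secondsFor for n = 1000, 2000, 4000, and so on should show the time growing by roughly a factor of 4 per doubling if the O(N^2) analysis is right. Individual timings are noisy, so several runs per size give a more reliable picture.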

Limitations of Big-Oh Analysis

Big-Oh is an estimation tool for algorithm analysis. It ignores the costs of memory access, data movement, memory allocation, etc., so it is hard to obtain a precise analysis from it alone.

Ex: 2n log n vs. 1000n. Which is faster? => It depends on n.
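The comparison can be evaluated directly. The sketch below treats the two expressions as cost models and takes the logarithm base 2, which is an assumption since the slide does not state the base. The function names are hypothetical.

```cpp
#include <cmath>

// Cost models from the slide, with log taken base 2 (an assumption).
double costNLogN(double n)  { return 2.0 * n * std::log2(n); }
double costLinear(double n) { return 1000.0 * n; }

// True when the 2n log n algorithm is the cheaper one for this n.
bool nLogNIsCheaper(double n)
{
    return costNLogN(n) < costLinear(n);
}
```

Under this assumption the two costs are equal when log n = 500, i.e. at n = 2^500. For every input size that could occur in practice, the 2n log n algorithm is the cheaper one even though 1000n has the smaller growth rate.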

Common errors

For nested loops, the total time is affected by the product of the loop sizes; for consecutive loops, it is not (the times add, so the largest loop dominates).

Do not write expressions such as O(2N^2) or O(N^2 + 2). Only the dominant term, with the leading constant removed, is needed.
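The difference between nested and consecutive loops can be seen by counting iterations. The function names below are illustrative assumptions.

```cpp
// Two nested loops of size n: the body runs n * n times, so O(N^2).
long long nestedCount(int n)
{
    long long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            count++;
    return count;
}

// Two consecutive loops of size n: the body runs n + n times, so O(N).
long long consecutiveCount(int n)
{
    long long count = 0;
    for (int i = 0; i < n; i++)
        count++;
    for (int j = 0; j < n; j++)
        count++;
    return count;
}
```

For n = 1000 the nested version runs its body 1,000,000 times and the consecutive version 2,000 times. The second count is 2N, which is written O(N), not O(2N).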

Some practice

Determine the Big O( ) notation for the following: