# stacks queues lists

Post on 23-Jan-2017

31 views

Embed Size (px)

TRANSCRIPT

* Data Structures

*The logical or mathematical model of a particular organization of data is called a data structure DATA STRUCTURES

*A primitive data type holds a single piece of datae.g. in Java: int, long, char, boolean etc.Legal operations on integers: + - * / ...

A data structure structures data!Usually more than one piece of dataShould provide legal operations on the dataThe data might be joined together (e.g. in an array): a collection

DATA STRUCTURES

* Static vs. Dynamic StructuresA static data structure has a fixed size

This meaning is different than those associated with the static modifier

Arrays are static; once you define the number of elements it can hold, it doesnt change

A dynamic data structure grows and shrinks as required by the information it contains

*

An Abstract Data Type (ADT) is a data type together with the operations, whose properties are specified independently of any particular implementation. Abstract Data Type

* Abstract Data Type In computing, we view data from three perspectives: Application level View of the data within a particular problem Logical level An abstract view of the data values (the domain) and the set of operations to manipulate them Implementation level A specific representation of the structure to hold the data items and the coding of the operations in a programming language

*Problem Solving: Main StepsProblem definitionAlgorithm design / Algorithm specificationAlgorithm analysisImplementationTesting[Maintenance]

*Problem DefinitionWhat is the task to be accomplished?Calculate the average of the grades for a given student

What are the time / space / speed / performance requirements?

*. Algorithm Design / SpecificationsAlgorithm: Finite set of instructions that, if followed, accomplishes a particular task.Describe: in natural language / pseudo-code / diagrams / etc. Criteria to follow:Input: Zero or more quantities (externally produced)Output: One or more quantities Definiteness: Clarity, precision of each instructionFiniteness: The algorithm has to stop after a finite (may be very large) number of stepsEffectiveness: Each instruction has to be basic enough and feasible

* Implementation, Testing, MaintenancesImplementationDecide on the programming language to useC, C++, Lisp, Java, Perl, Prolog, assembly, etc. , etc.Write clean, well documented code

Test, test, test

Integrate feedback from users, fix bugs, ensure compatibility across different versions Maintenance

*Algorithm AnalysisSpace complexityHow much space is requiredTime complexityHow much time does it take to run the algorithm

Often, we deal with estimates!

*Space ComplexitySpace complexity = The amount of memory required by an algorithm to run to completion[Core dumps = the most often encountered cause is memory leaks the amount of memory required larger than the memory available on a given system]Some algorithms may be more efficient if data completely loaded into memory Need to look also at system limitationsE.g. Classify 2GB of text in various categories [politics, tourism, sport, natural disasters, etc.] can I afford to load the entire collection?

*Space Complexity (contd)Fixed part: The size required to store certain data/variables, that is independent of the size of the problem:

- e.g. name of the data collection- same size for classifying 2GB or 1MB of texts

Variable part: Space needed by variables, whose size is dependent on the size of the problem:

- e.g. actual text - load 2GB of text VS. load 1MB of text

- *Space Complexity (contd)S(P) = c + S(instance characteristics)c = constantExample:float sum (float* a, int n) { float s = 0; for(int i = 0; i
*Time ComplexityOften more important than space complexityspace available (for computer programs!) tends to be larger and largertime is still a problem for all of us

3-4GHz processors on the market researchers estimate that the computation of various transformations for 1 single DNA chain for one single protein on 1 TerraHZ computer would take about 1 year to run to completionAlgorithms running time is an important issue

*Running TimeProblem: prefix averagesGiven an array XCompute the array A such that A[i] is the average of elements X[0] X[i], for i=0..n-1Sol 1At each step i, compute the element X[i] by traversing the array A and determining the sum of its elements, respectively the average Sol 2 At each step i update a sum of the elements in the array ACompute the element X[i] as sum/I

*Running timeSuppose the program includes an if-then statement that may execute or not: variable running timeTypically algorithms are measured by their worst case

*Experimental ApproachWrite a program that implements the algorithmRun the program with data sets of varying size.Determine the actual running time using a system call to measure time (e.g. system (date) );

Problems?

*Experimental ApproachIt is necessary to implement and test the algorithm in order to determine its running time. Experiments can be done only on a limited set of inputs, and may not be indicative of the running time for other inputs. The same hardware and software should be used in order to compare two algorithms. condition very hard to achieve!

*Use a Theoretical ApproachBased on high-level description of the algorithms, rather than language dependent implementations

Makes possible an evaluation of the algorithms that is independent of the hardware and software environments

*Algorithm DescriptionHow to describe algorithms independent of a programming language Pseudo-Code = a description of an algorithm that is more structured than usual prose but less formal than a programming language(Or diagrams)Example: find the maximum element of an array.Algorithm arrayMax(A, n):Input: An array A storing n integers.Output: The maximum element in A.currentMax A[0]for i 1 to n -1 doif currentMax < A[i] then currentMax A[i]return currentMax

*Properties of Big-OhExpressions: use standard mathematical symbols use for assignment ( ? in C/C++)use = for the equality relationship (? in C/C++)Method Declarations: -Algorithm name(param1, param2) Programming Constructs: decision structures:if ... then ... [else ..]while-loops while ... do repeat-loops: repeat ... until ... for-loop: for ... do array indexing: A[i]Methodscalls: object method(args)returns:return value Use commentsInstructions have to be basic enough and feasible!

*Asymptotic analysis - terminologySpecial classes of algorithms:logarithmic:O(log n)linear:O(n)quadratic:O(n2)polynomial:O(nk), k 1exponential:O(an), n > 1Polynomial vs. exponential ?Logarithmic vs. polynomial ?

*Some Numbers

Sheet1

log nnn log nn2n32n

010112

122484

248166416

382464512256

41664256409665536

5321601024327684294967296

*Relatives of Big-Oh

Relatives of the Big-Oh (f(n)): Big Omega asymptotic lower bound (f(n)): Big Theta asymptotic tight boundBig-Omega think of it as the inverse of O(n)g(n) is (f(n)) if f(n) is O(g(n))Big-Theta combine both Big-Oh and Big-Omegaf(n) is (g(n)) if f(n) is O(g(n)) and g(n) is (f(n)) Make the difference: 3n+3 is O(n) and is (n)3n+3 is O(n2) but is not (n2)

*More relativesLittle-oh f(n) is o(g(n)) if for any c>0 there is n0 such that f(n) < c(g(n)) for n > n0.Little-omegaLittle-theta

2n+3 is o(n2) 2n + 3 is o(n) ?

*ExampleRemember the algorithm for computing prefix averages compute an array A starting with an array X every element A[i] is the average of all elements X[j] with j < iRemember some pseudo-code Solution 1Algorithm prefixAverages1(X):Input: An n-element array X of numbers.Output: An n -element array A of numbers such that A[i] is the average of elements X[0], ... , X[i].Let A be an array of n numbers.for i 0 to n - 1 doa 0for j 0 to i do a a + X[j] A[i] a/(i+ 1)return array A

*Example (contd)Algorithm prefixAverages2(X):Input: An n-element array X of numbers.Output: An n -element array A of numbers such that A[i] is the average of elements X[0], ... , X[i]. Let A be an array of n numbers.s 0for i 0 to n do s s + X[i] A[i] s/(i+ 1)return array A

*Back to the original questionWhich solution would you choose?O(n2) vs. O(n)

Some math properties of logarithms:logb(xy) = logbx + logbylogb (x/y) = logbx - logbylogbxa = alogbxlogba=logxa/logxbproperties of exponentials:

a(b+c) = aba cabc = (ab)cab /ac = a(b-c)b = a logabbc = a c*logab

*Important SeriesSum of squares:

Sum of exponents:

Geometric series:Special case when A = 220 + 21 + 22 + + 2N = 2N+1 - 1

*Analyzing recursive algorithmsfunction foo (param A, param B) {statement 1;statement 2;if (termination condition) {return;foo(A, B);}

*Solving recursive equations by repeated substitutionT(n) = T(n/2) + csubstitute for T(n/2)= T(n/4) + c + csubstitute for T(n/4)= T(n/8) + c + c + c= T(n/23) + 3c in more compact form= = T(n/2k) + kcinductive leap

T(n) = T(n/2logn) + clognchoose k = logn= T(n/n) + clogn= T(1) + clogn = b + clogn = (logn)

*Solving recursive equations by telescoping T(n) = T(n/2) + cinitial equation T(n/2) =T(n/4) + cso this holds T(n/4) = T(n/8) + cand this T(n/8) = T(n/16) + cand this T(4) = T(2) + ceventually T(2) = T(1) + c

Recommended