advances data structures

Upload: dan-heisenberg

Post on 04-Jun-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/14/2019 Advances Data Structures

    1/36

    Advanced Data StructuresLecture 1: Introduction

    Mircea Marin

    October 1 2013

    Mircea Marin Advanced Data Structures

    http://find/
  • 8/14/2019 Advances Data Structures

    2/36

    Organizatorial items

    Lecturer and TA: Mircea Marin

    email: [email protected]

    Course objectives:1 Become familiar with some of the advanced data structures

    and algorithms which underpin much of todays computerprogramming

    2

    Recognize the data structure and algorithms that are bestsuited to model and solve your problem3 Be able to perform a qualitative analysis of your algorithms

    time complexity analysis

    Course webpage:

    http://web.info.uvt.ro/mmarin/lectures/ADSHandouts: will be posted on the webpage of the lecture

    Grading: 50% final exam (written), 50% lab + homework

    Attendance: required

    Mircea Marin Advanced Data Structures

    http://web.info.uvt.ro/~mmarin/lectures/ADShttp://web.info.uvt.ro/~mmarin/lectures/ADShttp://find/
  • 8/14/2019 Advances Data Structures

    3/36

    Organizatorial items

    Lab work: implementation in C++ or Java of applications,

    using data structures and algorithms presented in this

    lecture

    Requirements:

    Be up-to-date with the presented material

    Prepare the programming assignments for the stateddeadlines

    Recommended textbooks

    Cormen, Leiserson, Rivest. Introduction to Algorithms. MITPress.

    Adam Drozdek. Data structures and algorithms inC++.Third edition, Thomson Course Technology, 2005.Aho, Hopcroft, Ullmann. Data structures and algorithms,Addison-Wesley, 1985.

    Mircea Marin Advanced Data Structures

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    4/36

    From problems to programs

    Mircea Marin Advanced Data Structures

    http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    5/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Mircea Marin Advanced Data Structures

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    6/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Mircea Marin Advanced Data Structures

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    7/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Mircea Marin Advanced Data Structures

    F bl

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    8/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Find a solution in the form of analgorithm, using the

    selected modelAlgorithm =

    Mircea Marin Advanced Data Structures

    F bl t

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    9/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Find a solution in the form of analgorithm, using the

    selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.

    Mircea Marin Advanced Data Structures

    F bl t

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    10/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Find a solution in the form of analgorithm, using the

    selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!

    Mircea Marin Advanced Data Structures

    F bl t

    http://find/
  • 8/14/2019 Advances Data Structures

    11/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Find a solution in the form of analgorithm, using the

    selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!

    Algorithm is language-independent: it can be written in

    pseudocode

    Mircea Marin Advanced Data Structures

    From problems to programs

    http://find/
  • 8/14/2019 Advances Data Structures

    12/36

    From problems to programs

    Initial problems have no simple, precise specifications

    Identify a good data structure to model the problem

    Familiarization with the most common data structures usedin computer science is a good prerequisite!

    Find a solution in the form of analgorithm, using the

    selected modelAlgorithm = finite sequence of instructions, each of whichhas a clear meaning and can be performed with a finiteamount of effort in a finite length of time.Do not confusealgorithmwithprogram!

    Algorithm is language-independent: it can be written in

    pseudocode

    To write a program, you need to choose a programming

    language

    Mircea Marin Advanced Data Structures

    From problems to programs

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    13/36

    From problems to programsA modeling scenario

    Problem specification:

    intersection of 5 roads: A, B, C, D, E; Roads C and E areoneway; There are 13 possible turns: AB, EC, etc.Design a light intersection system to avoid car collisions.

    Chosen model: a graph with nodes = possible turns:

    {AB, AC, AD, BA, BC, BD, DA, DB, DC, EA, EB, EC, ED} edges are between turns that can be performed

    simultaneously.

    Mircea Marin Advanced Data Structures

    From problems to programs

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    14/36

    From problems to programsA modeling scenario (contd.)

    Graph coloring: assignment of a color to each node so

    that no two vertices connected by an edge have the samecolor.

    REMARK: Our problem is the same as coloring the graph

    of incompatible turns usingas few colors as possible.

    Advantages of the chosen model:

    Graph are a well-known abstract data structure.

    Graph coloring is a well-studied problem.

    Bad news: the problem isNP complete: to find the best

    solution, we must essentially try all possibilities :-(finding optimal solution may be very inefficient.Alternative:be happy with a reasonably good solution,using a heuristic algorithm. E.g., use the following "greedy"algorithm.

    Mircea Marin Advanced Data Structures

    From problems to programs

    http://find/
  • 8/14/2019 Advances Data Structures

    15/36

    From problems to programsA modeling scenario (contd.)

    Example (Optimal coloring)

    color turns

    blue AB, AC, AD, BA, DC, EDred BC, BD, EA

    green DA, DByellow EB, EC

    Mircea Marin Advanced Data Structures

    From problems to programs

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    16/36

    From problems to programsA modeling scenario (contd.)

    Example (Optimal coloring)

    color turns

    blue AB, AC, AD, BA, DC, EDred BC, BD, EA

    green DA, DByellow EB, EC

    The "greedy" algorithm - informal description

    1 Select some uncolored node and color it with a new color.2 Scan the list of uncolored nodes. For each node,

    determine whether it has an edge to any vertex already

    colored with the new color. If there is no such edge, color

    the present vertex with the new color.Mircea Marin Advanced Data Structures

    Remarks about the greedy algorithm

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    17/36

    Remarks about the greedy algorithm

    Does not always use the minimum possible number of colors.

    We can use the theory of algorithms to evaluate the

    goodness of the solution produced.

    k-clique= set of knodes, every pair of which is connected

    by an edge.

    REMARK 1: We needkcolors to color a k-clique.

    REMARK 2: The illustrated graph of incompatible turns

    contains a 4-cliqueAC, DA, BD, EB 4 colors are

    needed.

    Mircea Marin Advanced Data Structures

    Pseudo-Language and Stepwise Refinements

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    18/36

    Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy

    Mircea Marin Advanced Data Structures

    Pseudo-Language and Stepwise Refinements

    http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    19/36

    Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy

    This algorithm can be refined to more specific

    implementation.

    Mircea Marin Advanced Data Structures

    Pseudo-Language and Stepwise Refinements

    http://find/
  • 8/14/2019 Advances Data Structures

    20/36

    Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy

    First refinement: clarify the notion of adjacency

    Mircea Marin Advanced Data Structures

    Pseudo-Language and Stepwise Refinements

    http://find/
  • 8/14/2019 Advances Data Structures

    21/36

    Pseudo Language and Stepwise RefinementsIllustrated algorithm: greedy

    Second refinement: clarify the notion of for each loop

    procedure greedy(var G:GRAPH; var newclr:LIST);bool found;int v, w;

    beginwhile v ! = null do begin

    found := falsew := first vertex in newclrif there is an edge between v and w in G then

    found:=true;w:=next vertex in newclr

    end

    if found=false do beginmark v colored;add v to newclr

    endv :=next uncolored vertex in G

    end { greedy }Mircea Marin Advanced Data Structures

    Representations of graphs

    http://find/
  • 8/14/2019 Advances Data Structures

    22/36

    Representations of graphsAdjacency matrix, etc.

    Adjacency matrix

    AB AC AD BA BC BD DA DB DC EA EB EC ED

    AB 1 1 1 1AC 1 1 1 1 1AD 1 1 1BA

    BC 1 1 1BD 1 1 1 1 1DA 1 1 1 1 1DB 1 1 1DCEA 1 1 1EB 1 1 1 1 1

    EC 1 1 1 1ED

    Mircea Marin Advanced Data Structures

    Abstract Data Types (ADT)

    http://find/
  • 8/14/2019 Advances Data Structures

    23/36

    yp ( )

    ADT= mathematical model with a collection of operations

    defined on that model.

    ADT = generalization of primitive data types (integer, float,

    char, boolean, etc.)

    Typical examples of ADT:Lists, stacks, heaps, trees, sets, hash-tables, etc.

    In OOP (e.g., C++), they can be implemented as classes.

    Benefits: encapsulation, generalization (by classextension), overloading, . . .

    Mircea Marin Advanced Data Structures

    ADTs for the greedy algorithm

    http://find/
  • 8/14/2019 Advances Data Structures

    24/36

    g y g

    For colors:List

    Desirable operations:init(colorList) - initialize the list of colors to be empty.first(colorList) - return the first element of the list, and null ifthe list is empty.next(colorList) - get the next element of the list, and null if

    there is no next memberinsert(c, colorList) - insert a new color c into the colorList

    For colored graphs:GraphDesirable operations:

    get the first uncolored node,

    test whether there is an edge between two nodes,mark a node colored,get the next uncolored node,. . .

    Mircea Marin Advanced Data Structures

    Running time of a program

    http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    25/36

    g p g

    Factors on which it depends:

    Size of the input.

    Quality of compiler-generated code.

    Nature and speed of computer instructions.Time complexity of the underlying algorithm.

    The running time of a program should be defined as a function

    T(n)of the sizenof input.

    Mircea Marin Advanced Data Structures

    The Big-Oh, Big-Omega, and Big-Theta notation

    http://find/
  • 8/14/2019 Advances Data Structures

    26/36

    g , g g , g

    Assumptions:

    T(n) :running time function, depending on input sizen N.

    "Big-Oh" notation: T(n)is O(f(n))if there arec R,c >0,andn0 N such thatT(n) c f(n)for allnn0.

    "Big-Omega" notation: T(n)is(f(n))if there arec R,c >0 such that T(n) c f(n)for infinitely many values ofn.

    "Big -Theta" notation: T(n)is(f(n))if there are

    c1, c2 R,c1, c2 >0, andn0 N such thatc1 f(n)T(n) c2 f(n)for allnn0.

    Mircea Marin Advanced Data Structures

    Desirable running times

    http://find/
  • 8/14/2019 Advances Data Structures

    27/36

    g

    Polynomial running time: T(n)isO(nk)for somek >0.

    Algorithms with polynomial running time are calledtractable.

    Question:AlgorithmA1 has running time 5 n3 andA2 has

    running time 100 n2.Which alg. has better running time?

    Mircea Marin Advanced Data Structures

    Desirable running times

    http://find/
  • 8/14/2019 Advances Data Structures

    28/36

    g

    Polynomial running time: T(n)isO(nk)for somek >0.

    Algorithms with polynomial running time are calledtractable.

    Question:AlgorithmA1 has running time 5 n3 andA2 has

    running time 100 n2.Which alg. has better running time?

    Mircea Marin Advanced Data Structures

    Desirable running times

    http://find/
  • 8/14/2019 Advances Data Structures

    29/36

    Polynomial running time: T(n)isO(nk)for somek >0.

    Algorithms with polynomial running time are called

    tractable.

    Question:AlgorithmA1 has running time 5 n3 andA2 has

    running time 100 n2.Which alg. has better running time?

    Answer: A1 ifn

  • 8/14/2019 Advances Data Structures

    30/36

    Rule for sumsfor programP=P1;. . .; Pkmade ofsequence of programsP1, . . . ,Pk.

    IfPi isO(fi(n))fori=1, . . . , k thenP isO(f(n))wheref(n) =max{fi(n)| 1 ik}.

    If there existsn0 N andg(n)such thatfi(n) g(n)for alln n0 thenPis inO(g(n)).

    Rule for products: IfT1(n)isO(f(n))andT2(n)isO(g(n))thenT1(n) T2(n)isO(f(n)g(n)).

    Mircea Marin Advanced Data Structures

    Case study: Bubble-sort

    http://find/
  • 8/14/2019 Advances Data Structures

    31/36

    Input size: n

    Question:What isT(n)?

    Mircea Marin Advanced Data Structures

    General rules for the analysis of programs

    http://find/
  • 8/14/2019 Advances Data Structures

    32/36

    ASSUMPTION: The only parameter for the running time functionTis the input size (n)

    The running time of each assignment, read, and write statementcan usually be taken to beO(1).

    The running time of a sequence of stmts. is given by sum rule.That is, the running time of the sequence is, to within a constantfactor, the largest running time of any stmt. in the sequence.

    An ifstatement has running time = time for evaluating condition(usuallyO(1)) + time for evaluating the corresponding branch.

    The running time to execute a loop is the sum, over all timesaround the loop, of the time to execute the body and the time to

    evaluate the condition for termination (usually the latter is O(1)).Often this time is, neglecting constant factors, the product of thenumber of times around the loop and the largest possible timefor one execution of the body, but we must consider each loopseparately to make sure.

    Mircea Marin Advanced Data Structures

    Exercise 1

    http://find/
  • 8/14/2019 Advances Data Structures

    33/36

    What is the running time T(n)for the computation offact(n)bythe following recursive program:

    Mircea Marin Advanced Data Structures

    Exercise 3

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    34/36

    Give, using big oh notation, the worst case running times ofthe following procedures as a function of n.

    Mircea Marin Advanced Data Structures

    From algorithms to programs

    http://goforward/http://find/http://goback/
  • 8/14/2019 Advances Data Structures

    35/36

    Object-oriented programming languages provide goodsupport for abstract data types

    Programming in C++ is encouragedProgramming in Java is also ok

    Modern OOP languages, including C++ and Java, have

    built-in implementations of several advanced datastructures (ADS) and algorithms presented in this course

    The missing ADS and algorithms are not hard to implementby yourself

    C++ provides the Standard Template Library STL to work

    with efficient implementations of data structures.

    Mircea Marin Advanced Data Structures

    References

    http://find/
  • 8/14/2019 Advances Data Structures

    36/36

    Alfred V. Ahoet al. Data Structures and Algorithms.

    Chapter 1: Design and Analysis of Algorithms.

    1983. Addison-Wesley.

    Mircea Marin Advanced Data Structures

    http://find/