CSC401 – Analysis of Algorithms, Chapter 2: Basic Data Structures

Upload: rudolph-lane

Post on 03-Jan-2016


Page 1: CSC401 – Analysis of Algorithms Chapter 2 Basic Data Structures Objectives: Introduce basic data structures, including –Stacks and Queues –Vectors, Lists,

CSC401 – Analysis of Algorithms
Chapter 2: Basic Data Structures

Objectives:
– Introduce basic data structures, including
  – Stacks and Queues
  – Vectors, Lists, and Sequences
  – Trees
  – Priority Queues and Heaps
  – Dictionaries and Hash Tables
– Analyze the performance of operations on basic data structures

Page 2: Abstract Data Types (ADTs)

An abstract data type (ADT) is an abstraction of a data structure.

An ADT specifies:
– Data stored
– Operations on the data
– Error conditions associated with operations

Example: ADT modeling a simple stock trading system
– The data stored are buy/sell orders
– The operations supported are
  order buy(stock, shares, price)
  order sell(stock, shares, price)
  void cancel(order)
– Error conditions:
  Buy/sell a nonexistent stock
  Cancel a nonexistent order
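The stock trading ADT above could be written down in Java roughly as follows. This is only a sketch: the names Order, TradingSystem and SimpleTradingSystem, the set of listed stocks, and the use of IllegalArgumentException for the two error conditions are all assumptions made for illustration, not part of the slides.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical order record: the slides only say that buy/sell orders are stored.
final class Order {
    final String stock;
    final int shares;
    final double price;
    final boolean isBuy;

    Order(String stock, int shares, double price, boolean isBuy) {
        this.stock = stock;
        this.shares = shares;
        this.price = price;
        this.isBuy = isBuy;
    }
}

// The ADT: operations only, no commitment to how orders are stored.
interface TradingSystem {
    Order buy(String stock, int shares, double price);
    Order sell(String stock, int shares, double price);
    void cancel(Order order);
}

// One possible (assumed) implementation backed by hash sets.
class SimpleTradingSystem implements TradingSystem {
    private final Set<String> listedStocks;
    private final Set<Order> openOrders = new HashSet<>();

    SimpleTradingSystem(Set<String> listedStocks) {
        this.listedStocks = new HashSet<>(listedStocks);
    }

    public Order buy(String stock, int shares, double price) {
        return place(stock, shares, price, true);
    }

    public Order sell(String stock, int shares, double price) {
        return place(stock, shares, price, false);
    }

    public void cancel(Order order) {
        // Error condition: cancel a nonexistent order.
        if (!openOrders.remove(order))
            throw new IllegalArgumentException("Unknown order");
    }

    int openOrderCount() { return openOrders.size(); }

    private Order place(String stock, int shares, double price, boolean isBuy) {
        // Error condition: buy/sell a nonexistent stock.
        if (!listedStocks.contains(stock))
            throw new IllegalArgumentException("Unknown stock: " + stock);
        Order o = new Order(stock, shares, price, isBuy);
        openOrders.add(o);
        return o;
    }
}
```

Callers depend only on TradingSystem, so the storage behind it can change without affecting them, which is the point of specifying an ADT separately from its implementation.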

Page 3: The Stack ADT

The Stack ADT stores arbitrary objects.
Insertions and deletions follow the last-in first-out scheme.
Think of a spring-loaded plate dispenser.

Main stack operations:
– push(object): inserts an element
– object pop(): removes and returns the last inserted element

Auxiliary stack operations:
– object top(): returns the last inserted element without removing it
– integer size(): returns the number of elements stored
– boolean isEmpty(): indicates whether no elements are stored

Attempting the execution of an operation of an ADT may sometimes cause an error condition, called an exception.
Exceptions are said to be "thrown" by an operation that cannot be executed.
In the Stack ADT, operations pop and top cannot be performed if the stack is empty.
Attempting the execution of pop or top on an empty stack throws an EmptyStackException.

Page 4: Applications of Stacks

Direct applications
– Page-visited history in a Web browser
– Undo sequence in a text editor
– Chain of method calls in the Java Virtual Machine

Indirect applications
– Auxiliary data structure for algorithms
– Component of other data structures

The Java Virtual Machine (JVM) keeps track of the chain of active methods with a stack.
When a method is called, the JVM pushes on the stack a frame containing
– Local variables and return value
– Program counter, keeping track of the statement being executed
When a method ends, its frame is popped from the stack and control is passed to the method on top of the stack.

main() {
  int i = 5;
  foo(i);
}

foo(int j) {
  int k;
  k = j+1;
  bar(k);
}

bar(int m) {
  …
}

Method stack while bar is running (top frame first):
bar:  PC = 1, m = 6
foo:  PC = 3, j = 5, k = 6
main: PC = 2, i = 5
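The method stack in the example can be imitated with an ordinary stack of frame objects. The sketch below is a simulation, not how the JVM is implemented: Frame and CallStackDemo are hypothetical names, and a frame here holds only the method name, program counter and a textual dump of the locals.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, much simplified stand-in for a JVM stack frame.
final class Frame {
    final String method;
    final int pc;
    final String locals;

    Frame(String method, int pc, String locals) {
        this.method = method;
        this.pc = pc;
        this.locals = locals;
    }

    @Override
    public String toString() {
        return method + " PC = " + pc + " " + locals;
    }
}

class CallStackDemo {
    // Rebuilds the stack as it looks while bar is executing.
    static Deque<Frame> snapshotInsideBar() {
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame("main", 2, "i = 5"));        // main calls foo(i)
        stack.push(new Frame("foo", 3, "j = 5 k = 6"));   // foo calls bar(k)
        stack.push(new Frame("bar", 1, "m = 6"));         // bar is now on top
        return stack;
    }
}
```

Popping the top frame models bar returning: control passes back to foo, whose frame is then on top.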

Page 5: Array-based Stack

A simple way of implementing the Stack ADT uses an array.
We add elements from left to right.
A variable t keeps track of the index of the top element.
The array storing the stack elements may become full.
A push operation will then throw a FullStackException
– Limitation of the array-based implementation
– Not intrinsic to the Stack ADT

Algorithm size()
  return t + 1

Algorithm pop()
  if isEmpty() then
    throw EmptyStackException
  else
    t ← t − 1
    return S[t + 1]

Algorithm push(o)
  if t = S.length − 1 then
    throw FullStackException
  else
    t ← t + 1
    S[t] ← o

Performance
– Let n be the number of elements in the stack
– The space used is O(n)
– Each operation runs in time O(1)

Limitations
– The fixed maximum size
– Trying to push a new element into a full stack causes an implementation-specific exception

Page 6: Stack Interface & ArrayStack in Java

Other implementations of Stack
– Extendable array-based stack
– Linked list-based stack

// EmptyStackException and FullStackException are user-defined classes that take a
// message (java.util.EmptyStackException has no such constructor). FullStackException
// must be unchecked, because push declares no exceptions.
public interface Stack {
  public int size();
  public boolean isEmpty();
  public Object top() throws EmptyStackException;
  public void push(Object o);
  public Object pop() throws EmptyStackException;
}

public class ArrayStack implements Stack {
  private Object S[];     // array holding the stack elements
  private int top = -1;   // index of the top element; -1 when the stack is empty

  public ArrayStack(int capacity) {
    S = new Object[capacity];
  }

  public int size() { return top + 1; }

  public boolean isEmpty() { return top < 0; }

  public Object top() throws EmptyStackException {
    if (isEmpty())
      throw new EmptyStackException("Empty stack: cannot read top");
    return S[top];
  }

  public void push(Object o) {
    if (size() == S.length)
      throw new FullStackException("Full stack: cannot push");
    S[++top] = o;
  }

  public Object pop() throws EmptyStackException {
    if (isEmpty())
      throw new EmptyStackException("Empty stack: cannot pop");
    Object temp = S[top];
    S[top] = null;   // clear the slot so the element can be garbage-collected
    top = top - 1;
    return temp;
  }
}
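The first of the other implementations, an extendable array-based stack, might look like the sketch below. It is a minimal illustration under stated assumptions: the class name ExtendableArrayStack is hypothetical, the array is doubled when full (one of several possible growth strategies), and IllegalStateException stands in for the user-defined EmptyStackException.

```java
import java.util.Arrays;

// Hypothetical class: an array-based stack that grows instead of throwing when full.
class ExtendableArrayStack {
    private Object[] data;
    private int top = -1;   // index of the top element; -1 when empty

    ExtendableArrayStack(int initialCapacity) {
        data = new Object[Math.max(1, initialCapacity)];
    }

    int size() { return top + 1; }

    boolean isEmpty() { return top < 0; }

    int capacity() { return data.length; }

    void push(Object o) {
        if (size() == data.length) {
            // Full: copy the elements into an array twice as large.
            data = Arrays.copyOf(data, 2 * data.length);
        }
        data[++top] = o;
    }

    Object pop() {
        if (isEmpty())
            throw new IllegalStateException("Empty stack: cannot pop");
        Object o = data[top];
        data[top--] = null;   // clear the slot so the element can be collected
        return o;
    }
}
```

With this change push no longer has an implementation-specific failure mode; an individual push may take O(n) time when it triggers a copy, but the copies are rare enough that the average cost per push stays constant.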

Page 7: The Queue ADT

The Queue ADT stores arbitrary objects.
Insertions and deletions follow the first-in first-out scheme.
Insertions are at the rear and removals at the front.

Main queue operations:
– enqueue(object): inserts an element at the end of the queue
– object dequeue(): removes and returns the element at the front

Auxiliary queue operations:
– object front(): returns the element at the front without removing it
– integer size(): returns the number of elements stored
– boolean isEmpty(): indicates whether no elements are stored

Exceptions
– Attempting the execution of dequeue or front on an empty queue throws an EmptyQueueException

Direct applications
– Waiting lists, bureaucracy
– Access to shared resources (e.g., printer)
– Multiprogramming

Indirect applications
– Auxiliary data structure for algorithms
– Component of other data structures

Page 8: Array-based Queue

Use an array of size N in a circular fashion.
Two variables keep track of the front and rear:
– f: index of the front element
– r: index immediately past the rear element
Array location r is kept empty.

[Figure: array Q (cells 0, 1, 2, …) with markers f and r, shown in the normal configuration and in the wrapped-around configuration]

Page 9: Array-based Queue Operations

We use the modulo operator (remainder of division).
Operation enqueue throws an exception if the array is full. This exception is implementation-dependent.
Operation dequeue throws an exception if the queue is empty. This exception is specified in the queue ADT.

Algorithm size()
  return (N − f + r) mod N

Algorithm isEmpty()
  return (f = r)

Algorithm enqueue(o)
  if size() = N − 1 then
    throw FullQueueException
  else
    Q[r] ← o
    r ← (r + 1) mod N

Algorithm dequeue()
  if isEmpty() then
    throw EmptyQueueException
  else
    o ← Q[f]
    f ← (f + 1) mod N
    return o
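The four algorithms above translate almost line for line into Java. The sketch below is one way to do it; the class name ArrayQueue is hypothetical, and the standard unchecked exceptions IllegalStateException and IllegalArgumentException are assumed stand-ins for FullQueueException and EmptyQueueException.

```java
// Hypothetical class: circular array-based queue following the pseudocode.
class ArrayQueue {
    private final Object[] q;   // the array Q of size N
    private int f = 0;          // index of the front element
    private int r = 0;          // index immediately past the rear element

    ArrayQueue(int n) {
        if (n < 2)
            throw new IllegalArgumentException("Array size must be at least 2");
        q = new Object[n];      // holds at most n - 1 elements: cell r stays empty
    }

    int size() { return (q.length - f + r) % q.length; }

    boolean isEmpty() { return f == r; }

    void enqueue(Object o) {
        if (size() == q.length - 1)
            throw new IllegalStateException("Queue is full");
        q[r] = o;
        r = (r + 1) % q.length;   // wrap around at the end of the array
    }

    Object dequeue() {
        if (isEmpty())
            throw new IllegalStateException("Queue is empty");
        Object o = q[f];
        q[f] = null;              // clear the slot so the element can be collected
        f = (f + 1) % q.length;
        return o;
    }
}
```

Keeping one cell empty is what lets f = r mean "empty" without ambiguity; if all N cells could be used, a full queue would also have f = r.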

Page 10: Other Implementations of Queue

– Extendable array-based queue: the enqueue operation has amortized running time
  O(n) with the incremental strategy
  O(1) with the doubling strategy
– Linked list-based queue
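The gap between the two growth strategies can be seen by counting how many element copies the resizes cause over n insertions. The helper below is a hypothetical illustration (the class GrowthCost, its method names, and the starting capacities are assumptions): it models only the copying cost, not the queue itself.

```java
// Hypothetical helper: counts element copies caused by array growth
// while n elements are appended one at a time.
class GrowthCost {
    // Incremental strategy: start with capacity c and add c cells whenever full.
    static long copiesIncremental(int n, int c) {
        long copies = 0;
        int capacity = c;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;      // every stored element moves to the new array
                capacity += c;
            }
        }
        return copies;
    }

    // Doubling strategy: start with capacity 1 and double whenever full.
    static long copiesDoubling(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;
                capacity *= 2;
            }
        }
        return copies;
    }
}
```

For n = 1024, doubling copies 1 + 2 + 4 + … + 512 = 1023 elements in total, under one copy per insertion on average. With c = 4, the incremental strategy copies 4 + 8 + … + 1020 = 130560 elements, about 127 per insertion, and that average grows linearly with n.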

Queue Interface in Java

public interface Queue {
  public int size();
  public boolean isEmpty();
  public Object front() throws EmptyQueueException;
  public void enqueue(Object o);
  public Object dequeue() throws EmptyQueueException;
}

– Java interface corresponding to our Queue ADT
– Requires the definition of class EmptyQueueException
– No corresponding built-in Java class when these slides were written; java.util.Queue (Java 5 and later) offers similar operations under different method names

Page 11: The Vector ADT

The Vector ADT extends the notion of array by storing a sequence of arbitrary objects.
An element can be accessed, inserted or removed by specifying its rank (number of elements preceding it).
An exception is thrown if an incorrect rank is specified (e.g., a negative rank).

Main vector operations:
– object elemAtRank(integer r): returns the element at rank r without removing it
– object replaceAtRank(integer r, object o): replaces the element at rank r with o and returns the old element
– insertAtRank(integer r, object o): inserts a new element o to have rank r
– object removeAtRank(integer r): removes and returns the element at rank r

Additional operations size() and isEmpty()

Direct applications
– Sorted collection of objects (elementary database)

Indirect applications
– Auxiliary data structure for algorithms
– Component of other data structures

Page 12: Array-based Vector

Use an array V of size N.
A variable n keeps track of the size of the vector (number of elements stored).
Operation elemAtRank(r) is implemented in O(1) time by returning V[r].

[Figure: array V (cells 0, 1, 2, …) with rank r and size n marked]

In operation insertAtRank(r, o), we need to make room for the new element by shifting forward the n − r elements V[r], …, V[n − 1].
In the worst case (r = 0), this takes O(n) time.

[Figure: array V before the insertion, after the shift that frees cell r, and after o is stored at rank r]

Page 13: Array-based Vector

In operation removeAtRank(r), we need to fill the hole left by the removed element by shifting backward the n − r − 1 elements V[r + 1], …, V[n − 1].
In the worst case (r = 0), this takes O(n) time.

[Figure: array V before the removal, with the hole at rank r after o is removed, and after the shift that closes the hole]

Performance
– In the array-based implementation of a Vector
  The space used by the data structure is O(n)
  size, isEmpty, elemAtRank and replaceAtRank run in O(1) time
  insertAtRank and removeAtRank run in O(n) time
– If we use the array in a circular fashion, insertAtRank(0) and removeAtRank(0) run in O(1) time
– In an insertAtRank operation, when the array is full, instead of throwing an exception, we can replace the array with a larger one (extendable array)
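The shifting described for insertAtRank and removeAtRank can be sketched in Java as follows. The class name ArrayVector and the initial capacity of 4 are assumptions, IndexOutOfBoundsException stands in for the "incorrect rank" exception, and the array is doubled when full, as the extendable-array remark suggests.

```java
import java.util.Arrays;

// Hypothetical class: array-based vector with rank-based access.
class ArrayVector {
    private Object[] v = new Object[4];   // the array V; 4 is an arbitrary start size
    private int n = 0;                    // number of elements stored

    int size() { return n; }

    boolean isEmpty() { return n == 0; }

    Object elemAtRank(int r) {
        checkRank(r, n);
        return v[r];
    }

    void insertAtRank(int r, Object o) {
        checkRank(r, n + 1);              // rank n (append at the end) is allowed
        if (n == v.length)
            v = Arrays.copyOf(v, 2 * v.length);    // extendable array
        // Shift V[r], ..., V[n - 1] one cell forward to make room.
        System.arraycopy(v, r, v, r + 1, n - r);
        v[r] = o;
        n++;
    }

    Object removeAtRank(int r) {
        checkRank(r, n);
        Object old = v[r];
        // Shift V[r + 1], ..., V[n - 1] one cell backward to fill the hole.
        System.arraycopy(v, r + 1, v, r, n - r - 1);
        v[--n] = null;                    // clear the now-unused last cell
        return old;
    }

    private static void checkRank(int r, int limit) {
        if (r < 0 || r >= limit)
            throw new IndexOutOfBoundsException("Invalid rank: " + r);
    }
}
```

System.arraycopy is safe to use with overlapping source and destination ranges, so the shift needs no explicit loop; it still moves n − r (or n − r − 1) elements, which is where the O(n) worst case comes from.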

Page 14: Singly Linked List

A singly linked list is a concrete data structure consisting of a sequence of nodes.
Each node stores
– element
– link to the next node

[Figure: a node with fields elem and next; a list of four nodes storing A, B, C, D]

Stack with singly linked list
– The top element is stored at the first node of the list
– The space used is O(n) and each operation of the Stack ADT takes O(1) time

Queue with singly linked list
– The front element is stored at the first node
– The rear element is stored at the last node
– The space used is O(n) and each operation of the Queue ADT takes O(1) time
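A queue on a singly linked list needs references to both the first and the last node to keep every operation O(1). The sketch below assumes hypothetical names (LinkedQueue, Node) and uses java.util.NoSuchElementException in place of EmptyQueueException.

```java
import java.util.NoSuchElementException;

// Hypothetical class: queue backed by a singly linked list.
class LinkedQueue {
    private static final class Node {
        final Object elem;
        Node next;

        Node(Object elem) { this.elem = elem; }
    }

    private Node head;   // first node: the front of the queue
    private Node tail;   // last node: the rear of the queue
    private int size;

    int size() { return size; }

    boolean isEmpty() { return size == 0; }

    void enqueue(Object o) {
        Node node = new Node(o);
        if (tail == null) {
            head = node;          // first element: it is both front and rear
        } else {
            tail.next = node;     // link after the current rear
        }
        tail = node;
        size++;
    }

    Object dequeue() {
        if (head == null)
            throw new NoSuchElementException("Empty queue: cannot dequeue");
        Object o = head.elem;
        head = head.next;
        if (head == null)
            tail = null;          // the queue became empty
        size--;
        return o;
    }
}
```

Removing at the head and inserting at the tail is the cheap direction for a singly linked list; removing at the tail would require walking the list to find the predecessor.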

Page 15: Position ADT & List ADT

The Position ADT
– models the notion of place within a data structure where a single object is stored
– gives a unified view of diverse ways of storing data, such as
  a cell of an array
  a node of a linked list
– Just one method:
  object element(): returns the element stored at the position

The List ADT
– models a sequence of positions storing arbitrary objects
– establishes a before/after relation between positions
– Generic methods: size(), isEmpty()
– Query methods: isFirst(p), isLast(p)
– Accessor methods: first(), last(), before(p), after(p)
– Update methods:
  replaceElement(p, o), swapElements(p, q)
  insertBefore(p, o), insertAfter(p, o)
  insertFirst(o), insertLast(o)
  remove(p)

Page 16: Doubly Linked List

A doubly linked list provides a natural implementation of the List ADT.
Nodes implement Position and store:
– element
– link to the previous node
– link to the next node
Special trailer and header nodes

[Figure: a node with fields prev, elem and next; a list of nodes/positions holding the elements, between the header and trailer nodes]

Page 17: Doubly Linked List Operations

We visualize insertAfter(p, X), which returns position q.

[Figure: list A, B, C with position p; a new node q storing X is linked in after p; resulting list A, B, X, C]

We visualize remove(p), where p = last().

[Figure: list A, B, C, D with p at D; node D is unlinked; resulting list A, B, C]

Performance
– The space used by a doubly linked list with n elements is O(n)
– The space used by each position of the list is O(1)
– All the operations of the List ADT run in O(1) time
– Operation element() of the Position ADT runs in O(1) time
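The two visualized operations come down to a handful of link updates. The sketch below shows them on a doubly linked list with header and trailer sentinels; DNodeList, DNode and the contents() helper are hypothetical names, position validation is omitted, and only a few of the List ADT methods are included.

```java
// Hypothetical class: doubly linked list with header and trailer sentinels.
class DNodeList {
    static final class DNode {
        Object elem;
        DNode prev, next;

        DNode(Object elem) { this.elem = elem; }

        Object element() { return elem; }   // the Position ADT's single method
    }

    private final DNode header = new DNode(null);
    private final DNode trailer = new DNode(null);
    private int size = 0;

    DNodeList() {
        header.next = trailer;
        trailer.prev = header;
    }

    int size() { return size; }

    DNode last() {
        if (size == 0)
            throw new IllegalStateException("List is empty");
        return trailer.prev;
    }

    // Links a new node q storing o right after p and returns q.
    DNode insertAfter(DNode p, Object o) {
        DNode q = new DNode(o);
        q.prev = p;
        q.next = p.next;
        p.next.prev = q;
        p.next = q;
        size++;
        return q;
    }

    DNode insertLast(Object o) { return insertAfter(trailer.prev, o); }

    // Unlinks p and returns its element.
    Object remove(DNode p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = null;
        p.next = null;
        size--;
        return p.elem;
    }

    // Elements from first to last, separated by spaces (for inspection only).
    String contents() {
        StringBuilder sb = new StringBuilder();
        for (DNode n = header.next; n != trailer; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.elem);
        }
        return sb.toString();
    }
}
```

Because of the sentinels, every real node always has a non-null prev and next, so insertAfter and remove need no special cases for the ends of the list and each touches a constant number of links.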

Page 18: Sequence ADT

The Sequence ADT is the union of the Vector and List ADTs.
Elements accessed by
– Rank or Position

Generic methods:
– size(), isEmpty()

Vector-based methods:
– elemAtRank(r), replaceAtRank(r, o), insertAtRank(r, o), removeAtRank(r)

List-based methods:
– first(), last(), before(p), after(p), replaceElement(p, o), swapElements(p, q), insertBefore(p, o), insertAfter(p, o), insertFirst(o), insertLast(o), remove(p)

Bridge methods:
– atRank(r), rankOf(p)

The Sequence ADT is a basic, general-purpose data structure for storing an ordered collection of elements.

Direct applications:
– Generic replacement for stack, queue, vector, or list
– Small database

Indirect applications:
– Building block of more complex data structures

Page 19: Array-based Implementation

We use a circular array storing positions.
A position object stores:
– Element
– Rank
Indices f and l keep track of first and last positions.

[Figure: circular array S (cells 0, 1, 2, 3, …) whose cells refer to position objects, which in turn refer to the elements; f and l mark the first and last positions]

Page 20: Sequence Implementations

Running time of each operation (an entry of 1 means O(1), n means O(n)):

Operation                        Array   List
size, isEmpty                    1       1
atRank, rankOf, elemAtRank       1       n
first, last, before, after       1       1
replaceElement, swapElements     1       1
replaceAtRank                    1       n
insertAtRank, removeAtRank       n       n
insertFirst, insertLast          1       1
insertAfter, insertBefore        n       1
remove                           n       1

Page 21: Design Patterns

– Adaptor
– Position
– Composition
– Iterator
– Comparator
– Locator

Page 22: Design Pattern: Iterators

An iterator abstracts the process of scanning through a collection of elements.
Methods of the ObjectIterator ADT:
– object object()
– boolean hasNext()
– object nextObject()
– reset()
Extends the concept of Position by adding a traversal capability.
Implementation with an array or singly linked list.

An iterator is typically associated with another data structure.
We can augment the Stack, Queue, Vector, List and Sequence ADTs with method:
– ObjectIterator elements()

Two notions of iterator:
– snapshot: freezes the contents of the data structure at a given time
– dynamic: follows changes to the data structure
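A snapshot iterator can be sketched by copying the elements into an array when the iterator is created. The code below is an illustration under assumptions: the slides list the ObjectIterator methods without specifying their behavior, so the semantics chosen here (object() returns the element most recently returned by nextObject(), reset() goes back to before the first element) are a guess, and SnapshotIterator is a hypothetical name.

```java
import java.util.List;
import java.util.NoSuchElementException;

// The ObjectIterator ADT from the slide, with assumed method semantics.
interface ObjectIterator {
    Object object();        // current element (assumed: last one returned by nextObject)
    boolean hasNext();
    Object nextObject();    // advances and returns the next element
    void reset();           // assumed: returns to the position before the first element
}

// Hypothetical snapshot iterator backed by an array copy.
class SnapshotIterator implements ObjectIterator {
    private final Object[] snapshot;
    private int cursor = -1;   // index of the current element; -1 before the first

    SnapshotIterator(List<?> source) {
        snapshot = source.toArray();   // freeze the contents at creation time
    }

    public Object object() {
        if (cursor < 0)
            throw new NoSuchElementException("No current element");
        return snapshot[cursor];
    }

    public boolean hasNext() { return cursor + 1 < snapshot.length; }

    public Object nextObject() {
        if (!hasNext())
            throw new NoSuchElementException("No next element");
        return snapshot[++cursor];
    }

    public void reset() { cursor = -1; }
}
```

Elements added to the source list after the iterator is constructed are not seen by it. A dynamic iterator would instead keep a reference to the live structure, at the price of having to define what happens when the structure changes mid-traversal.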

Page 23: The Tree Structure

In computer science, a tree is an abstract model of a hierarchical structure.
A tree consists of nodes with a parent-child relation.

Applications:
– Organization charts
– File systems
– Programming environments

[Figure: organization chart of Computers"R"Us; node labels: Sales, Manufacturing, R&D, US, International, Laptops, Desktops, Europe, Asia, Canada]

Page 24: Tree Terminology

– Root: node without parent (A)
– Internal node: node with at least one child (A, B, C, F)
– External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
– Ancestors of a node: parent, grandparent, great-grandparent, etc.
– Depth of a node: number of ancestors
– Height of a tree: maximum depth of any node (3)
– Descendant of a node: child, grandchild, great-grandchild, etc.
– Subtree: tree consisting of a node and its descendants

[Figure: example tree with root A and nodes B through K; one subtree is outlined and labeled "subtree"]

Page 25: Tree ADT

We use positions to abstract nodes.

Generic methods:
– integer size()
– boolean isEmpty()
– objectIterator elements()
– positionIterator positions()

Accessor methods:
– position root()
– position parent(p)
– positionIterator children(p)

Query methods:
– boolean isInternal(p)
– boolean isExternal(p)
– boolean isRoot(p)

Update methods:
– swapElements(p, q)
– object replaceElement(p, o)

Additional update methods may be defined by data structures implementing the Tree ADT.


Depth and Height
Depth -- the depth of v is the number of ancestors of v, excluding v itself
– the depth of the root is 0
– the depth of v other than the root is one plus the depth of its parent
– time efficiency is O(1+d), where d is the depth of v

Height -- the height of the subtree rooted at v is the maximum depth of its external nodes
– the height of an external node is 0
– the height of an internal node v is one plus the maximum height of its children
– time efficiency is O(n)

Algorithm depth(T, v)
  if T.isRoot(v) then
    return 0
  else
    return 1 + depth(T, T.parent(v))

Algorithm height(T, v)
  if T.isExternal(v) then
    return 0
  else
    h = 0
    for each w ∈ T.children(v) do
      h = max(h, height(T, w))
    return 1 + h
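
The two recursive definitions of depth and height can be sketched in Java roughly as follows. This is only an illustration: the Node class (with a parent link and a list of children) is a hypothetical stand-in for the position-based Tree ADT, and the class and method names are assumptions, not part of the course code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Node is a hypothetical stand-in for a Tree ADT position.
final class DepthHeightSketch {
    static final class Node {
        final String element;
        Node parent;                                   // null for the root
        final List<Node> children = new ArrayList<>();

        Node(String element) { this.element = element; }

        // Creates a child, links it in both directions, and returns it.
        Node add(String childElement) {
            Node child = new Node(childElement);
            child.parent = this;
            children.add(child);
            return child;
        }
    }

    // Number of proper ancestors of v: one step per ancestor, so O(1 + d).
    static int depth(Node v) {
        return v.parent == null ? 0 : 1 + depth(v.parent);
    }

    // 0 for an external node, otherwise 1 + the maximum child height: O(n).
    static int height(Node v) {
        if (v.children.isEmpty()) return 0;
        int h = 0;
        for (Node w : v.children) h = Math.max(h, height(w));
        return 1 + h;
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = a.add("B");
        a.add("D");
        Node k = b.add("F").add("K");
        System.out.println("depth(K) = " + depth(k) + ", height(A) = " + height(a));
    }
}
```

The depth computation touches only the ancestors of v, while the height computation has to look at every node in the subtree, which is where the O(1+d) and O(n) bounds come from.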


Preorder Traversal
A traversal visits the nodes of a tree in a systematic manner
In a preorder traversal, a node is visited before its descendants
The running time is O(n)
Application: print a structured document

Algorithm preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)

[Figure: document tree numbered in preorder: 1 Make Money Fast!; 2 1. Motivations; 3 1.1 Greed; 4 1.2 Avidity; 5 2. Methods; 6 2.1 Stock Fraud; 7 2.2 Ponzi Scheme; 8 2.3 Bank Robbery; 9 References]
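
As a sketch of the "print a structured document" application, the Java code below visits each node before its children and indents each title by its depth. The Node class and the method names are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a hypothetical document-outline node, not the course's Tree ADT.
final class PreorderSketch {
    static final class Node {
        final String title;
        final List<Node> children = new ArrayList<>();

        Node(String title) { this.title = title; }

        Node add(String childTitle) {
            Node child = new Node(childTitle);
            children.add(child);
            return child;
        }
    }

    // Visits v first (recording its title, indented by depth), then each child in order.
    static void preOrder(Node v, int depth, List<String> out) {
        out.add("  ".repeat(depth) + v.title);
        for (Node w : v.children) preOrder(w, depth + 1, out);
    }

    public static void main(String[] args) {
        Node doc = new Node("Make Money Fast!");
        Node motivations = doc.add("1. Motivations");
        motivations.add("1.1 Greed");
        motivations.add("1.2 Avidity");
        Node methods = doc.add("2. Methods");
        methods.add("2.1 Stock Fraud");
        methods.add("2.2 Ponzi Scheme");
        methods.add("2.3 Bank Robbery");
        doc.add("References");

        List<String> lines = new ArrayList<>();
        preOrder(doc, 0, lines);
        lines.forEach(System.out::println);
    }
}
```

Each node is visited exactly once, so the traversal is O(n) as long as the visit itself takes constant time.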


Postorder Traversal
In a postorder traversal, a node is visited after its descendants
The running time is O(n)
Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)

[Figure: directory tree numbered in postorder: 1 h1c.doc (3K); 2 h1nc.doc (2K); 3 homeworks/; 4 DDR.java (10K); 5 Stocks.java (25K); 6 Robot.java (20K); 7 programs/; 8 todo.txt (1K); 9 cs16/]
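
The disk-usage application can be sketched in Java as below: the total for a directory is known only after the totals of its children, which is exactly a postorder visit. The Entry class is hypothetical, and treating a directory's own size as 0 is an assumption made for this example (the figure lists sizes only for files).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Entry is a hypothetical file-system node.
final class DiskUsageSketch {
    static final class Entry {
        final String name;
        final int sizeKb;                              // own size; 0 for directories (assumption)
        final List<Entry> children = new ArrayList<>();

        Entry(String name, int sizeKb) {
            this.name = name;
            this.sizeKb = sizeKb;
        }

        Entry add(String childName, int childSizeKb) {
            Entry child = new Entry(childName, childSizeKb);
            children.add(child);
            return child;
        }
    }

    // Postorder: total the children first, then "visit" v by adding its own size.
    static int diskUsage(Entry v) {
        int total = 0;
        for (Entry w : v.children) total += diskUsage(w);
        return total + v.sizeKb;
    }

    public static void main(String[] args) {
        Entry cs16 = new Entry("cs16/", 0);
        Entry homeworks = cs16.add("homeworks/", 0);
        homeworks.add("h1c.doc", 3);
        homeworks.add("h1nc.doc", 2);
        Entry programs = cs16.add("programs/", 0);
        programs.add("DDR.java", 10);
        programs.add("Stocks.java", 25);
        programs.add("Robot.java", 20);
        cs16.add("todo.txt", 1);
        System.out.println(diskUsage(cs16) + "K");
    }
}
```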


Binary Tree
A binary tree is a tree with the following properties:
– Each internal node has two children
– The children of a node are an ordered pair

We call the children of an internal node left child and right child
Alternative recursive definition: a binary tree is either
– a tree consisting of a single node, or
– a tree whose root has an ordered pair of children, each of which is a binary tree

Applications:
– arithmetic expressions
– decision processes
– searching

[Figure: example binary tree with root A, children B and C, grandchildren D, E, F, G, and nodes H and I on the lowest level]


Binary Tree Examples
Arithmetic expression binary tree
– internal nodes: operators
– external nodes: operands
– Example: arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

[Figure: expression tree with root +; its left child × has children 2 and −, where − has children a and 1; its right child × has children 3 and b]

Decision tree
– internal nodes: questions with yes/no answer
– external nodes: decisions
– Example: dining decision

[Figure: decision tree. Want a fast meal? Yes: How about coffee? (Yes: Starbucks; No: Spike’s). No: On expense account? (Yes: Al Forno; No: Café Paragon)]


Properties of Binary Trees
Notation
– n: number of nodes
– e: number of external nodes
– i: number of internal nodes
– h: height

Properties:
– e = i + 1
– n = 2e − 1
– h ≤ i
– h ≤ (n − 1)/2
– h + 1 ≤ e ≤ 2^h
– h ≥ log2 e
– h ≥ log2 (n + 1) − 1


BinaryTree ADT
The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT

Additional methods:
– position leftChild(p)
– position rightChild(p)
– position sibling(p)

Update methods may be defined by data structures implementing the BinaryTree ADT


Inorder Traversal
In an inorder traversal a node is visited after its left subtree and before its right subtree
Time efficiency is O(n)
Application: draw a binary tree
– x(v) = inorder rank of v
– y(v) = depth of v

Algorithm inOrder(v)
  if isInternal(v)
    inOrder(leftChild(v))
  visit(v)
  if isInternal(v)
    inOrder(rightChild(v))

[Figure: binary tree whose nine nodes are numbered 1 to 9 by inorder rank]


Print Arithmetic Expressions
Specialization of an inorder traversal
– print operand or operator when visiting node
– print "(" before traversing left subtree
– print ")" after traversing right subtree

Algorithm printExpression(v)
  if isInternal(v)
    print("(")
    printExpression(leftChild(v))
  print(v.element())
  if isInternal(v)
    printExpression(rightChild(v))
    print(")")

[Figure: the expression tree for 2, a, 1, 3, b, printed as ((2 × (a − 1)) + (3 × b))]
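
A minimal Java sketch of this parenthesized inorder printing is shown below. The Node class is hypothetical (a proper binary tree node with either two children or none), and the operators are written in ASCII as *, - and +.

```java
// Sketch only: Node is a hypothetical proper-binary-tree node.
final class PrintExpressionSketch {
    static final class Node {
        final String element;
        final Node left, right;                        // both null for an external node

        Node(String element) { this(element, null, null); }

        Node(String element, Node left, Node right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }

        boolean isInternal() { return left != null; }
    }

    // Inorder traversal that wraps every internal node's subexpression in parentheses.
    static void printExpression(Node v, StringBuilder out) {
        if (v.isInternal()) {
            out.append('(');
            printExpression(v.left, out);
        }
        out.append(v.element);
        if (v.isInternal()) {
            printExpression(v.right, out);
            out.append(')');
        }
    }

    public static void main(String[] args) {
        Node left = new Node("*", new Node("2"), new Node("-", new Node("a"), new Node("1")));
        Node right = new Node("*", new Node("3"), new Node("b"));
        StringBuilder out = new StringBuilder();
        printExpression(new Node("+", left, right), out);
        System.out.println(out);
    }
}
```

Note that every recursive call goes back to printExpression itself rather than to a plain inorder traversal, so the parentheses are emitted at every level of the tree.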


Evaluate Arithmetic Expressions
Specialization of a postorder traversal
– recursive method returning the value of a subtree
– when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
  if isExternal(v)
    return v.element()
  else
    x ← evalExpr(leftChild(v))
    y ← evalExpr(rightChild(v))
    ◊ ← operator stored at v
    return x ◊ y

[Figure: expression tree with operands 2, 5, 1, 3, 2, laid out like the tree for (2 × (5 − 1)) + (3 × 2)]
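
The postorder evaluation can be sketched in Java as follows. The Node class is hypothetical: external nodes hold an int value and internal nodes hold an operator, represented here (as an assumption) by the standard java.util.function.IntBinaryOperator interface.

```java
import java.util.function.IntBinaryOperator;

// Sketch only: Node is a hypothetical expression-tree node.
final class EvalExpressionSketch {
    static final class Node {
        final int value;                  // meaningful only for external nodes
        final IntBinaryOperator op;       // null for external nodes
        final Node left, right;

        Node(int value) {
            this.value = value;
            this.op = null;
            this.left = null;
            this.right = null;
        }

        Node(IntBinaryOperator op, Node left, Node right) {
            this.value = 0;
            this.op = op;
            this.left = left;
            this.right = right;
        }

        boolean isExternal() { return op == null; }
    }

    // Postorder: evaluate both subtrees, then combine them with the operator at v.
    static int evalExpr(Node v) {
        if (v.isExternal()) return v.value;
        int x = evalExpr(v.left);
        int y = evalExpr(v.right);
        return v.op.applyAsInt(x, y);
    }

    public static void main(String[] args) {
        Node left = new Node((a, b) -> a * b, new Node(2),
                new Node((a, b) -> a - b, new Node(5), new Node(1)));
        Node right = new Node((a, b) -> a * b, new Node(3), new Node(2));
        System.out.println(evalExpr(new Node((a, b) -> a + b, left, right)));
    }
}
```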


Euler Tour Traversal
Generic traversal of a binary tree
Includes as special cases the preorder, postorder and inorder traversals
Walk around the tree and visit each node three times:
– on the left (preorder)
– from below (inorder)
– on the right (postorder)

[Figure: Euler tour around an expression tree with operands 2, 5, 1, 3, 2; the three visits to a node are marked L (left), B (below), and R (right)]


Template Method Pattern
Generic algorithm that can be specialized by redefining certain steps
Implemented by means of an abstract Java class
Visit methods that can be redefined by subclasses
Template method eulerTour
– Recursively called on the left and right children
– A Result object with fields leftResult, rightResult and finalResult keeps track of the output of the recursive calls to eulerTour

public abstract class EulerTour {
    protected BinaryTree tree;
    protected void visitExternal(Position p, Result r) { }
    protected void visitLeft(Position p, Result r) { }
    protected void visitBelow(Position p, Result r) { }
    protected void visitRight(Position p, Result r) { }
    protected Object eulerTour(Position p) {
        Result r = new Result();
        if (tree.isExternal(p)) { visitExternal(p, r); }
        else {
            visitLeft(p, r);
            r.leftResult = eulerTour(tree.leftChild(p));
            visitBelow(p, r);
            r.rightResult = eulerTour(tree.rightChild(p));
            visitRight(p, r);
        }
        // returned for external and internal nodes alike
        return r.finalResult;
    }
    …


Specializations of EulerTour
We show how to specialize class EulerTour to evaluate an arithmetic expression
Assumptions
– External nodes store Integer objects
– Internal nodes store Operator objects supporting method operation(Integer, Integer)

public class EvaluateExpression extends EulerTour {
    protected void visitExternal(Position p, Result r) {
        r.finalResult = (Integer) p.element();
    }

    protected void visitRight(Position p, Result r) {
        Operator op = (Operator) p.element();
        r.finalResult = op.operation(
            (Integer) r.leftResult,
            (Integer) r.rightResult);
    }
}


Data Structure for Trees
A node is represented by an object storing
– Element
– Parent node
– Sequence of children nodes

Node objects implement the Position ADT

[Figure: a tree with nodes A, B, C, D, E, F shown next to its linked representation, in which each node object references its element, its parent, and its sequence of children]


Data Structure for Binary Trees
A node is represented by an object storing
– Element
– Parent node
– Left child node
– Right child node

Node objects implement the Position ADT

[Figure: a binary tree with nodes A, B, C, D, E shown next to its linked representation, in which each node object references its element, parent, left child, and right child]


Vector-Based Binary Tree
Level numbering of nodes of T: p(v)
– if v is the root of T, p(v) = 1
– if v is the left child of u, p(v) = 2p(u)
– if v is the right child of u, p(v) = 2p(u) + 1

Vector S stores the nodes of T by putting the root at the second position (rank 1) and following the above level numbering

Properties: Let n be the number of nodes of T, N be the size of the vector S, and PM be the maximum value of p(v) over all the nodes of T
– N = PM + 1
– N ≤ 2^((n+1)/2), with equality in the worst case
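
The level numbering can be illustrated with a small Java sketch in which a list plays the role of the vector S. The class and method names are hypothetical; elements are plain strings, and empty slots are represented by null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: a hypothetical vector-based binary tree keyed by level number.
final class VectorBinaryTreeSketch {
    // s.get(p) holds the element with level number p; rank 0 is left unused.
    private final List<String> s = new ArrayList<>(Collections.singletonList(null));

    static int root() { return 1; }
    static int leftChild(int p) { return 2 * p; }
    static int rightChild(int p) { return 2 * p + 1; }
    static int parent(int p) { return p / 2; }

    // Stores element at level number p, growing the vector with nulls as needed.
    void set(int p, String element) {
        while (s.size() <= p) s.add(null);
        s.set(p, element);
    }

    // Returns the element at level number p, or null if that slot is empty.
    String get(int p) { return p < s.size() ? s.get(p) : null; }

    // N in the slide's notation: the size of the vector, including unused rank 0.
    int vectorSize() { return s.size(); }

    public static void main(String[] args) {
        VectorBinaryTreeSketch t = new VectorBinaryTreeSketch();
        t.set(root(), "A");
        t.set(leftChild(1), "B");
        t.set(rightChild(1), "C");
        t.set(leftChild(3), "F");
        t.set(rightChild(3), "G");
        System.out.println("N = " + t.vectorSize());
    }
}
```

The tree built in main has n = 5 nodes and leans to the right, so PM = 7 and N = 8 = 2^((5+1)/2), which is the worst case of the space bound above.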


Java Implementation
Tree interface
BinaryTree interface extending Tree
Classes implementing Tree and BinaryTree and providing
– Constructors
– Update methods
– Print methods

Examples of updates for binary trees
– expandExternal(v)
– removeAboveExternal(w)

[Figure: a small binary tree before and after expandExternal(v) and removeAboveExternal(w)]


Trees in JDSL
JDSL is the Library of Data Structures in Java
Tree interfaces in JDSL
– InspectableBinaryTree
– InspectableTree
– BinaryTree
– Tree

Inspectable versions of the interfaces do not have update methods
Tree classes in JDSL
– NodeBinaryTree
– NodeTree

JDSL was developed at Brown’s Center for Geometric Computing
See the JDSL documentation and tutorials at http://jdsl.org

[Figure: interface hierarchy relating InspectableTree, InspectableBinaryTree, Tree, and BinaryTree]


Priority Queue ADT
A priority queue stores a collection of items
An item is a pair (key, element)
Main methods of the Priority Queue ADT
– insertItem(k, o) -- inserts an item with key k and element o
– removeMin() -- removes the item with smallest key and returns its element

Additional methods
– minKey() -- returns, but does not remove, the smallest key of an item
– minElement() -- returns, but does not remove, the element of an item with smallest key
– size(), isEmpty()

Applications:
– Standby flyers
– Auctions
– Stock market


Total Order Relation
Keys in a priority queue can be arbitrary objects on which an order is defined
Two distinct items in a priority queue can have the same key
Mathematical concept of total order relation ≤
– Reflexive property: x ≤ x
– Antisymmetric property: x ≤ y ∧ y ≤ x ⇒ x = y
– Transitive property: x ≤ y ∧ y ≤ z ⇒ x ≤ z


Comparator ADT
A comparator encapsulates the action of comparing two objects according to a given total order relation
A generic priority queue uses an auxiliary comparator
The comparator is external to the keys being compared
When the priority queue needs to compare two keys, it uses its comparator
Methods of the Comparator ADT, all with Boolean return type
– isLessThan(x, y)
– isLessThanOrEqualTo(x, y)
– isEqualTo(x, y)
– isGreaterThan(x, y)
– isGreaterThanOrEqualTo(x, y)
– isComparable(x)
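
To illustrate a comparator that is external to the keys, the sketch below uses the standard java.util.Comparator and java.util.PriorityQueue classes rather than the Comparator ADT of the slides: Java's interface exposes a single compare(x, y) method instead of the six Boolean methods listed above. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch only: the same keys ordered by two different external comparators.
final class ComparatorSketch {
    // Returns the smallest key according to the supplied comparator, or null if there are no keys.
    static <K> K smallest(Iterable<K> keys, Comparator<K> c) {
        PriorityQueue<K> pq = new PriorityQueue<>(c);   // the queue consults c for every comparison
        for (K k : keys) pq.add(k);
        return pq.peek();
    }

    public static void main(String[] args) {
        List<String> keys = List.of("pear", "fig", "banana");
        Comparator<String> alphabetical = Comparator.naturalOrder();
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        System.out.println(smallest(keys, alphabetical));
        System.out.println(smallest(keys, byLength));
    }
}
```

Because the ordering lives in the comparator and not in the keys, the same priority queue code yields a different minimum depending on which comparator it is given.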


Sorting with a Priority Queue
We can use a priority queue to sort a set of comparable elements
– Insert the elements one by one with a series of insertItem(e, e) operations
– Remove the elements in sorted order with a series of removeMin() operations

The running time of this sorting method depends on the priority queue implementation

Algorithm PQ-Sort(S, C)
  Input sequence S, comparator C for the elements of S
  Output sequence S sorted in increasing order according to C
  P ← priority queue with comparator C
  while ¬S.isEmpty()
    e ← S.remove(S.first())
    P.insertItem(e, e)
  while ¬P.isEmpty()
    e ← P.removeMin()
    S.insertLast(e)
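
A Java sketch of the two phases of PQ-Sort is given below. It uses the standard java.util.PriorityQueue as the priority queue and a java.util.Deque as the sequence S; both choices are assumptions made for this example, and the class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

// Sketch only: PQ-Sort on top of the standard library priority queue.
final class PQSortSketch {
    // Empties s into a priority queue, then refills s in increasing order according to c.
    static <E> void pqSort(Deque<E> s, Comparator<? super E> c) {
        PriorityQueue<E> p = new PriorityQueue<>(c);
        while (!s.isEmpty()) p.add(s.removeFirst());   // phase 1: insert every element
        while (!p.isEmpty()) s.addLast(p.poll());      // phase 2: remove the minimum repeatedly
    }

    public static void main(String[] args) {
        Deque<Integer> s = new ArrayDeque<>(List.of(5, 4, 2, 3, 1));
        pqSort(s, Comparator.naturalOrder());
        System.out.println(s);
    }
}
```

Since java.util.PriorityQueue is implemented as a binary heap, this particular sketch behaves like heap-sort; plugging in an unsorted or a sorted sequence instead would give selection-sort or insertion-sort.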


Sequence-based Priority Queue
Implementation with an unsorted sequence
– Store the items of the priority queue in a list-based sequence, in arbitrary order

Performance:
– insertItem takes O(1) time since we can insert the item at the beginning or end of the sequence
– removeMin, minKey and minElement take O(n) time since we have to traverse the entire sequence to find the smallest key

Implementation with a sorted sequence
– Store the items of the priority queue in a sequence, sorted by key

Performance:
– insertItem takes O(n) time since we have to find the place where to insert the item
– removeMin, minKey and minElement take O(1) time since the smallest key is at the beginning of the sequence


Selection-Sort
Selection-sort is the variation of PQ-sort where the priority queue is implemented with an unsorted sequence

Running time of Selection-sort:
– Inserting the elements into the priority queue with n insertItem operations takes O(n) time
– Removing the elements in sorted order from the priority queue with n removeMin operations takes time proportional to 1 + 2 + … + n = n(n + 1)/2

Selection-sort runs in O(n^2) time


Insertion-Sort
Insertion-sort is the variation of PQ-sort where the priority queue is implemented with a sorted sequence

Running time of Insertion-sort:
– Inserting the elements into the priority queue with n insertItem operations takes time proportional to 1 + 2 + … + n = n(n + 1)/2
– Removing the elements in sorted order from the priority queue with a series of n removeMin operations takes O(n) time

Insertion-sort runs in O(n^2) time


In-place Insertion-sort
Instead of using an external data structure, we can implement selection-sort and insertion-sort in-place
A portion of the input sequence itself serves as the priority queue
For in-place insertion-sort
– We keep sorted the initial portion of the sequence
– We can use swapElements instead of modifying the sequence

5 4 2 3 1

5 4 2 3 1

4 5 2 3 1

2 4 5 3 1

2 3 4 5 1

1 2 3 4 5

1 2 3 4 5
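
The in-place variant can be sketched in Java on an int array, with an explicit swap standing in for swapElements. The class name is hypothetical, and the array type is an assumption made to keep the example short.

```java
import java.util.Arrays;

// Sketch only: in-place insertion-sort using swaps.
final class InPlaceInsertionSortSketch {
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            // a[0..i-1] is already sorted; swap a[i] leftwards until it is in place.
            for (int j = i; j > 0 && a[j - 1] > a[j]; j--) {
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 4, 2, 3, 1};
        insertionSort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

The sorted prefix a[0..i-1] plays the role of the priority queue, so no storage beyond the input array and one temporary variable is needed.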


What is a heap
A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
– Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
– Complete Binary Tree: let h be the height of the heap
  – for i = 0, … , h − 1, there are 2^i nodes of depth i
  – at depth h − 1, the internal nodes are to the left of the external nodes

The last node of a heap is the rightmost internal node of depth h − 1

[Figure: heap with root 2, children 5 and 6, and keys 9 and 7 below 5; the node storing 7 is marked as the last node]


Height of a Heap
Theorem: A heap storing n keys has height O(log n)

Proof: (we apply the complete binary tree property)
– Let h be the height of a heap storing n keys
– Since there are 2^i keys at depth i = 0, … , h − 2 and at least one key at depth h − 1, we have n ≥ 1 + 2 + 4 + … + 2^(h−2) + 1
– Thus, n ≥ 2^(h−1), i.e., h ≤ log n + 1

depth    keys
0        1
1        2
h − 2    2^(h−2)
h − 1    1


Heaps and Priority Queues
We can use a heap to implement a priority queue
We store a (key, element) item at each internal node
We keep track of the position of the last node
For simplicity, we show only the keys in the pictures

[Figure: heap of items with root (2, Sue), children (5, Pat) and (6, Mark), and (9, Jeff) and (7, Anna) below (5, Pat)]


Insertion into a Heap
Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap
The insertion algorithm consists of three steps
– Find the insertion node z (the new last node)
– Store k at z and expand z into an internal node
– Restore the heap-order property (discussed next)

[Figure: the heap 2; 5, 6; 9, 7 with the insertion node z marked, and the same heap after key 1 is stored at z]


Upheap
After the insertion of a new key k, the heap-order property may be violated
Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node
Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k
Since a heap has height O(log n), upheap runs in O(log n) time

[Figure: upheap of key 1. After the first swap the heap is 2; 5, 1; 9, 7, 6, and after the second swap it is 1; 5, 2; 9, 7, 6]


Removal from a Heap
Method removeMin of the priority queue ADT corresponds to the removal of the root key from the heap
The removal algorithm consists of three steps
– Replace the root key with the key of the last node w
– Compress w and its children into a leaf
– Restore the heap-order property (discussed next)

[Figure: the heap 2; 5, 6; 9, 7 with last node w storing 7, and the heap 7; 5, 6; 9 after the root key is replaced and w is compressed]


Downheap
After replacing the root key with the key k of the last node, the heap-order property may be violated
Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root
Downheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k
Since a heap has height O(log n), downheap runs in O(log n) time

[Figure: downheap of key 7. The heap 7; 5, 6; 9 becomes 5; 7, 6; 9]


Updating the Last Node
The insertion node can be found by traversing a path of O(log n) nodes
– Go up until a left child or the root is reached
– If a left child is reached, go to the right child
– Go down left until a leaf is reached

Similar algorithm for updating the last node after a removal


Heap-Sort
Consider a priority queue with n items implemented by means of a heap
– the space used is O(n)
– methods insertItem and removeMin take O(log n) time
– methods size, isEmpty, minKey, and minElement take O(1) time

Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time
The resulting algorithm is called heap-sort
Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort


Vector-based Heap Implementation

We can represent a heap with n keys by means of a vector of length n + 1
For the node at rank i:
– the left child is at rank 2i
– the right child is at rank 2i + 1

Links between nodes are not explicitly stored
The leaves are not represented
The cell at rank 0 is not used
Operation insertItem corresponds to inserting at rank n + 1
Operation removeMin corresponds to removing at rank n
Yields in-place heap-sort

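[Figure: a heap with root 2, children 5 and 6, and keys 9 and 7 below 5, stored in a vector:]

rank:  0   1   2   3   4   5
key:   .   2   5   6   9   7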
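
The rank arithmetic above can be tried out with a small sketch. It assumes a Python list as the vector and keeps rank 0 unused, as on the slide; the class and method names (`VectorHeap`, `insert_item`, `remove_min`) are illustrative choices, not part of the course material.

```python
class VectorHeap:
    """Min-heap stored in a list; the cell at rank 0 is unused."""

    def __init__(self):
        self._a = [None]                  # placeholder for rank 0

    def __len__(self):
        return len(self._a) - 1

    def insert_item(self, key):
        a = self._a
        a.append(key)                     # insert at rank n + 1
        i = len(a) - 1
        while i > 1 and a[i // 2] > a[i]:  # upheap: parent of rank i is i // 2
            a[i // 2], a[i] = a[i], a[i // 2]
            i //= 2

    def remove_min(self):
        a = self._a
        if len(a) == 1:
            raise IndexError("heap is empty")
        a[1], a[-1] = a[-1], a[1]         # move the last key (rank n) to the root
        smallest = a.pop()                # remove at rank n
        n, i = len(a) - 1, 1
        while 2 * i <= n:                 # downheap
            c = 2 * i                     # left child
            if c + 1 <= n and a[c + 1] < a[c]:
                c += 1                    # right child is smaller
            if a[i] <= a[c]:
                break
            a[i], a[c] = a[c], a[i]
            i = c
        return smallest
```

Inserting 2, 5, 6, 9, 7 in that order leaves the list as `[None, 2, 5, 6, 9, 7]`, matching the vector in the figure.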


Merging Two Heaps

We are given two heaps and a key k

We create a new heap with the root node storing k and with the two heaps as subtrees

We perform downheap to restore the heap-order property

[Figure: two heaps with roots 3 and 2 are joined under a new root storing key 7; after downheap, the merged heap has root 2.]
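
The merge step can be sketched with a linked tree. This is only an illustration under stated assumptions: `Node`, `downheap`, and `merge` are names invented here, and the sketch restores heap order only, without checking that the two input heaps have the shapes needed for a complete tree.

```python
class Node:
    """Binary tree node holding one key (linked heap representation)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def downheap(node):
    """Swap the key at `node` downward until heap order is restored."""
    while True:
        children = [c for c in (node.left, node.right) if c is not None]
        if not children:
            return                        # reached a leaf
        smallest = min(children, key=lambda c: c.key)
        if node.key <= smallest.key:
            return                        # both children have keys >= node.key
        node.key, smallest.key = smallest.key, node.key
        node = smallest

def merge(k, heap1, heap2):
    """Join two heaps under a new root storing k, then downheap."""
    root = Node(k, heap1, heap2)
    downheap(root)
    return root
```

For example, `merge(7, Node(3, Node(8), Node(5)), Node(2, Node(6), Node(4)))` returns a heap whose root holds 2, with 7 pushed down to the bottom level.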


Bottom-up Heap Construction

We can construct a heap storing n given keys using a bottom-up construction with log n phases

In phase i, pairs of heaps with $2^i - 1$ keys are merged into heaps with $2^{i+1} - 1$ keys

[Figure: two heaps with $2^i - 1$ keys each are merged into one heap with $2^{i+1} - 1$ keys.]


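Example

[Figure: bottom-up heap construction. The keys 15, 16, 12, 4, 7, 6, 20, 23 form the bottom level; the keys 25, 5, 11, and 27 are then placed as roots above consecutive pairs.]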


Example (contd.)

[Figure: the four three-key heaps, before and after downheap restores heap order in each.]


Example (contd.)

[Figure: the keys 7 and 8 are placed as roots above pairs of three-key heaps, and downheap is applied to each new seven-key heap.]


Example (end)

[Figure: the key 10 is placed as the root above the two seven-key heaps; after downheap, the final heap has root 4.]


Analysis

We visualize the worst-case time of a downheap with a proxy path that goes first right and then repeatedly goes left until the bottom of the heap (this path may differ from the actual downheap path)

Since each node is traversed by at most two proxy paths, the total number of nodes of the proxy paths is O(n)

Thus, bottom-up heap construction runs in O(n) time

Bottom-up heap construction is faster than n successive insertions and speeds up the first phase of heap-sort
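
A common array-based way to realize bottom-up construction is sketched below. It is an illustration rather than the slides' own code: it assumes 0-based ranks (children of rank i at 2i + 1 and 2i + 2) instead of the 1-based layout used earlier, and the name `build_heap` is chosen here.

```python
def build_heap(keys):
    """Return a list arranged as a 0-based min-heap, built bottom-up."""
    a = list(keys)
    n = len(a)
    # Visit internal nodes from the last one up to the root; each visit
    # merges two smaller heaps under a new root and applies downheap.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while 2 * i + 1 < n:
            c = 2 * i + 1                      # left child
            if c + 1 < n and a[c + 1] < a[c]:
                c += 1                         # right child is smaller
            if a[i] <= a[c]:
                break
            a[i], a[c] = a[c], a[i]
            i = c
    return a
```

Each downheap costs at most the height of its subtree, and summing those heights over all nodes gives the O(n) total argued with proxy paths above.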


Hash Functions and Hash Tables

A hash function h maps keys of a given type to integers in a fixed interval $[0, N - 1]$
– Example: $h(x) = x \bmod N$ is a hash function for integer keys
– The integer h(x) is called the hash value of key x

A hash table for a given key type consists of
– A hash function h
– An array (called table) of size N

Example
– We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
– Our hash table uses an array of size $N = 10{,}000$ and the hash function $h(x)$ = last four digits of x

cell    item
0       (empty)
1       025-612-0001
2       981-101-0002
3       (empty)
4       451-229-0004
…
9997    (empty)
9998    200-751-9998
9999    (empty)
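
The SSN example can be reproduced with a short sketch. The keys are the ones printed in the figure; the names attached to them are hypothetical placeholders, and no collision handling is attempted.

```python
N = 10_000                       # table size from the slide

def h(ssn: int) -> int:
    """Hash value = last four digits of the key."""
    return ssn % N

table = [None] * N
# Keys as printed in the figure; the names are hypothetical.
items = [(4512290004, "name A"), (9811010002, "name B"),
         (2007519998, "name C"), (256120001, "name D")]
for ssn, name in items:
    table[h(ssn)] = (ssn, name)  # a colliding key would overwrite the cell
```

Taking the last four digits is the same as `x mod 10000`, so this is an instance of the division hash shown above.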


Hash Functions

A hash function is usually specified as the composition of two functions:

Hash code map: $h_1$: keys $\to$ integers

Compression map: $h_2$: integers $\to [0, N - 1]$

The hash code map is applied first, and the compression map is applied next on the result, i.e.,

$h(x) = h_2(h_1(x))$

The goal of the hash function is to “disperse” the keys in an apparently random way


Hash Code Maps

Memory address:
– We reinterpret the memory address of the key object as an integer (default hash code of all Java objects)
– Good in general, except for numeric and string keys

Integer cast:
– We reinterpret the bits of the key as an integer
– Suitable for keys of length less than or equal to the number of bits of the integer type (e.g., byte, short, int and float in Java)

Component sum:
– We partition the bits of the key into components of fixed length (e.g., 16 or 32 bits) and we sum the components (ignoring overflows)
– Suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (e.g., long and double in Java)


Hash Code Maps (cont.)

Polynomial accumulation:
– We partition the bits of the key into a sequence of components of fixed length (e.g., 8, 16 or 32 bits): $a_0\, a_1 \ldots a_{n-1}$
– We evaluate the polynomial
  $p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_{n-1} z^{n-1}$
  at a fixed value z, ignoring overflows
– Especially suitable for strings (e.g., the choice $z = 33$ gives at most 6 collisions on a set of 50,000 English words)

Polynomial p(z) can be evaluated in O(n) time using Horner’s rule:
– The following polynomials are successively computed, each from the previous one in O(1) time
  $p_0(z) = a_{n-1}$
  $p_i(z) = a_{n-i-1} + z\, p_{i-1}(z)$  $(i = 1, 2, \ldots, n - 1)$

We have $p(z) = p_{n-1}(z)$
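
Horner's rule for a string key might look like the sketch below. Two things are assumptions made for illustration: the components $a_i$ are taken to be the character codes of the string, and "ignoring overflows" is emulated by masking to an unsigned 32-bit value. The name `poly_hash_code` is not from the slides.

```python
def poly_hash_code(s: str, z: int = 33, bits: int = 32) -> int:
    """Polynomial hash code of s, evaluated with Horner's rule."""
    mask = (1 << bits) - 1
    code = 0
    for ch in reversed(s):                    # start from a_{n-1}
        # p_i = a_{n-i-1} + z * p_{i-1}, truncated to `bits` bits
        code = (ord(ch) + z * code) & mask
    return code
```

The loop performs one multiplication and one addition per component, which gives the O(n) evaluation time stated above.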


Compression Maps

Division:
– $h_2(y) = y \bmod N$
– The size N of the hash table is usually chosen to be a prime
– The reason has to do with number theory and is beyond the scope of this course

Multiply, Add and Divide (MAD):
– $h_2(y) = (ay + b) \bmod N$
– a and b are nonnegative integers such that $a \bmod N \ne 0$
– Otherwise, every integer would map to the same value b
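
The MAD compression map is a one-liner; the sketch below adds the check on a. The function name `mad_compress` and the choice to raise `ValueError` are assumptions for illustration.

```python
def mad_compress(y: int, a: int, b: int, N: int) -> int:
    """Multiply, Add and Divide compression map: (a*y + b) mod N."""
    if a % N == 0:
        # With a mod N = 0, every y would map to the same value b mod N.
        raise ValueError("a mod N must be nonzero")
    return (a * y + b) % N
```

For instance, with a = 3, b = 5 and N = 13, the hash code 10 is compressed to (30 + 5) mod 13 = 9.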


Collision Handling

Collisions occur when different elements are mapped to the same cell

Chaining: let each cell in the table point to a linked list of elements that map there

Chaining is simple, but requires additional memory outside the table

[Figure: a table with cells 0 to 4; cell 1 points to a list holding 025-612-0001, and cell 4 points to a list holding 451-229-0004 and 981-101-0004.]
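
Chaining can be sketched as follows. As a simplification, a Python list stands in for each linked list, `None` stands in for NO_SUCH_KEY, and the class name `ChainedHashTable` is invented for this example.

```python
class ChainedHashTable:
    """Hash table with separate chaining; each cell holds a list of items."""

    def __init__(self, N, h):
        self._h = h                            # hash function: key -> [0, N-1]
        self._cells = [[] for _ in range(N)]

    def insert_item(self, key, element):
        self._cells[self._h(key)].append((key, element))

    def find_element(self, key):
        for k, e in self._cells[self._h(key)]:
            if k == key:
                return e
        return None                            # stands in for NO_SUCH_KEY

    def remove_element(self, key):
        chain = self._cells[self._h(key)]
        for idx, (k, e) in enumerate(chain):
            if k == key:
                del chain[idx]
                return e
        return None
```

With the hash function "last four digits", the two keys ending in 0004 from the figure land in the same chain and are told apart by comparing full keys.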


Linear Probing

Open addressing: the colliding item is placed in a different cell of the table
Linear probing handles collisions by placing the colliding item in the next (circularly) available table cell
Each table cell inspected is referred to as a “probe”
Colliding items lump together, causing future collisions to cause a longer sequence of probes

Example:
– $h(x) = x \bmod 13$
– Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

Resulting table (a dot marks an empty cell):

cell:  0   1   2   3   4   5   6   7   8   9   10  11  12
key:   .   .   41  .   .   18  44  59  32  22  31  73  .
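
The example can be replayed with a short sketch. The helper name `linear_probe_insert` is chosen here for illustration; it stores bare keys and does not handle deletions.

```python
def linear_probe_insert(table, key, h):
    """Place key in the first free cell at or (circularly) after h(key)."""
    N = len(table)
    i = h(key)
    for _ in range(N):                 # at most N probes
        if table[i] is None:
            table[i] = key
            return i
        i = (i + 1) % N                # next cell, wrapping around
    raise OverflowError("table is full")

table = [None] * 13
for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    linear_probe_insert(table, key, lambda x: x % 13)
```

Key 31 hashes to cell 5 and needs six probes (cells 5 through 10), which illustrates how colliding items lump together.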


Search with Linear Probing

Consider a hash table A that uses linear probing

findElement(k)
– We start at cell h(k)
– We probe consecutive locations until one of the following occurs
    An item with key k is found, or
    An empty cell is found, or
    N cells have been unsuccessfully probed

Algorithm findElement(k)
    i ← h(k)
    p ← 0
    repeat
        c ← A[i]
        if c = ∅
            return NO_SUCH_KEY
        else if c.key() = k
            return c.element()
        else
            i ← (i + 1) mod N
            p ← p + 1
    until p = N
    return NO_SUCH_KEY


Updates with Linear Probing

To handle insertions and deletions, we introduce a special object, called AVAILABLE, which replaces deleted elements

removeElement(k)
– We search for an item with key k
– If such an item (k, o) is found, we replace it with the special item AVAILABLE and we return element o
– Else, we return NO_SUCH_KEY

insertItem(k, o)
– We throw an exception if the table is full
– We start at cell h(k)
– We probe consecutive cells until one of the following occurs
    A cell i is found that is either empty or stores AVAILABLE, or
    N cells have been unsuccessfully probed
– We store item (k, o) in cell i
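
The three operations can be put together in one sketch. Several details are assumptions made for illustration: `None` marks an empty cell and also plays the role of NO_SUCH_KEY, items are stored as `(key, element)` tuples, duplicate keys are not checked, and the class name `LinearProbingTable` is invented here.

```python
AVAILABLE = object()          # marker that replaces deleted items
NO_SUCH_KEY = None            # assumption: None plays the role of NO_SUCH_KEY

class LinearProbingTable:
    def __init__(self, N, h):
        self._N, self._h = N, h
        self._A = [None] * N  # None means "empty cell"

    def find_element(self, k):
        i = self._h(k)
        for _ in range(self._N):
            c = self._A[i]
            if c is None:                       # empty cell: k is absent
                return NO_SUCH_KEY
            if c is not AVAILABLE and c[0] == k:
                return c[1]
            i = (i + 1) % self._N               # skip AVAILABLE and other keys
        return NO_SUCH_KEY

    def insert_item(self, k, o):
        i = self._h(k)
        for _ in range(self._N):
            c = self._A[i]
            if c is None or c is AVAILABLE:
                self._A[i] = (k, o)
                return
            i = (i + 1) % self._N
        raise OverflowError("table is full")

    def remove_element(self, k):
        i = self._h(k)
        for _ in range(self._N):
            c = self._A[i]
            if c is None:
                return NO_SUCH_KEY
            if c is not AVAILABLE and c[0] == k:
                self._A[i] = AVAILABLE          # keep the probe sequence intact
                return c[1]
            i = (i + 1) % self._N
        return NO_SUCH_KEY
```

The search skips over AVAILABLE cells instead of stopping at them; otherwise a removal could hide items that were placed further along the same probe sequence.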


Double Hashing

Double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series

$(i + j\, d(k)) \bmod N$ for $j = 0, 1, \ldots, N - 1$, where $i = h(k)$

The secondary hash function d(k) cannot have zero values
The table size N must be a prime to allow probing of all the cells

Common choice of compression map for the secondary hash function:

$d_2(k) = q - k \bmod q$

where $q < N$ and q is a prime

The possible values for $d_2(k)$ are 1, 2, …, q

Example
– $N = 13$
– $h(k) = k \bmod 13$
– $d(k) = 7 - k \bmod 7$
– Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

k     h(k)   d(k)   Probes
18    5      3      5
41    2      1      2
22    9      6      9
44    5      5      5, 10
59    7      4      7
32    6      3      6
31    5      4      5, 9, 0
73    8      4      8

Resulting table (a dot marks an empty cell):

cell:  0   1   2   3   4   5   6   7   8   9   10  11  12
key:   31  .   41  .   .   18  32  59  73  22  44  .   .
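
The probe sequences in the table can be checked with a short sketch. The helper name `double_hash_insert` is chosen for illustration; it returns the list of cells probed so the result can be compared with the "Probes" column.

```python
def double_hash_insert(table, k, h, d):
    """Insert k, probing cells (h(k) + j*d(k)) mod N for j = 0, 1, ..., N-1."""
    N = len(table)
    probes = []
    for j in range(N):
        i = (h(k) + j * d(k)) % N
        probes.append(i)
        if table[i] is None:
            table[i] = k
            return probes
    raise OverflowError("no free cell found")

table = [None] * 13
probes = {k: double_hash_insert(table, k,
                                lambda x: x % 13,       # h(k) = k mod 13
                                lambda x: 7 - x % 7)    # d(k) = 7 - k mod 7
          for k in [18, 41, 22, 44, 59, 32, 31, 73]}
```

Key 31 collides at cells 5 and 9 before landing in cell 0, since its step is d(31) = 4.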


Performance of Hashing

In the worst case, searches, insertions and removals on a hash table take O(n) time
The worst case occurs when all the keys inserted into the dictionary collide
The load factor $\alpha = n/N$ affects the performance of a hash table
Assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is

$1 / (1 - \alpha)$

The expected running time of all the dictionary ADT operations in a hash table is O(1)
In practice, hashing is very fast provided the load factor is not close to 100%

Applications of hash tables:
– small databases
– compilers
– browser caches


Universal Hashing

A family of hash functions is universal if, for any $0 \le j, k \le M - 1$ with $j \ne k$, $\Pr(h(j) = h(k)) \le 1/N$.

Choose p as a prime between M and 2M.

Randomly select $0 < a < p$ and $0 \le b < p$, and define h(k) = ((ak + b) mod p) mod N

Theorem: The set of all functions h, as defined here, is universal.
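
Drawing one function from this family can be sketched as below. The helper names (`is_prime`, `random_universal_hash`) and the use of trial division to find p are assumptions made for illustration; any prime in the stated range would do.

```python
import random

def is_prime(m: int) -> bool:
    """Trial-division primality test (fine for small m)."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def random_universal_hash(M: int, N: int, rng=random):
    """Pick h(k) = ((a*k + b) mod p) mod N for keys in [0, M-1]."""
    p = next(q for q in range(M, 2 * M + 1) if is_prime(q))  # prime in [M, 2M]
    a = rng.randrange(1, p)      # 0 < a < p
    b = rng.randrange(0, p)      # 0 <= b < p
    return lambda k: ((a * k + b) % p) % N
```

Each call returns a different randomly chosen function; the guarantee of the theorem is over that random choice, not over any single fixed h.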


Proof of Universality (Part 1)

Let f(k) = (ak + b) mod p

Let g(k) = k mod N

So h(k) = g(f(k)).

f causes no collisions:
– Let f(k) = f(j).
– Suppose k < j. Then

$$aj + b - \left\lfloor \frac{aj + b}{p} \right\rfloor p = ak + b - \left\lfloor \frac{ak + b}{p} \right\rfloor p$$

$$a(j - k) = \left( \left\lfloor \frac{aj + b}{p} \right\rfloor - \left\lfloor \frac{ak + b}{p} \right\rfloor \right) p$$

So a(j − k) is a multiple of p

But both a and j − k are less than p, and p is prime, so p can divide a(j − k) only if a(j − k) = 0

So a(j − k) = 0. I.e., j = k. (contradiction)

Thus, f causes no collisions.


Proof of Universality (Part 2)

If f causes no collisions, only g can make h cause collisions.
Fix a number x. Of the p integers y = f(k), different from x, the number such that g(y) = g(x) is at most

$$\left\lceil \frac{p}{N} \right\rceil - 1$$

Since there are p choices for x, the number of h’s that will cause a collision between j and k is at most

$$p \left( \left\lceil \frac{p}{N} \right\rceil - 1 \right) \le \frac{p(p - 1)}{N}$$

There are p(p − 1) functions h. So the probability of collision is at most

$$\frac{p(p - 1)/N}{p(p - 1)} = \frac{1}{N}$$

Therefore, the set of possible h functions is universal.