data structures dynamic sets heaps binary trees & sorting hashing
Post on 21-Dec-2015
Data Structures
Dynamic Sets
Heaps
Binary Trees & Sorting
Hashing
Dynamic Sets
• Data structures that hold elements indexed with (usually unique) keys
• Support some basic operations, such as:
  - Search for an element with a given key
  - Insert a new element
  - Delete an element
  - Find the minimum-key element
  - Find the maximum-key element
• Elements can have unique or non-unique keys, depending on the application
• Keys can be ordered (e.g., integers)
Some operations on dynamic sets
• SEARCH(S, k)
  Given set S and key k, return x such that key(x) = k, or NIL if not found
• INSERT(S, x)
  Augment S by adding x to it
• DELETE(S, x)
  Delete x from S
• MINIMUM(S), MAXIMUM(S)
  Return the element x in S with minimum (maximum) key(x)
• SUCCESSOR(S, x)
  Given x, return y in S with minimum key(y) > key(x), or NIL if key(x) is maximum
• PREDECESSOR(S, x)
  Given x, return y in S with maximum key(y) < key(x), or NIL if key(x) is minimum
Data Structures for Sets
• Several data structures can support sets:
  - Arrays
  - Linked Lists
  - Trees
  - Heaps
  - Hash Tables
  - etc.
• Depending on the required set of operations, and on their frequency, different data structures are preferable
Example: Linked Lists
• A list L consists of a head, head[L], and a set of <key, next_ptr> values
• Every list ends with NIL
• Lists can additionally point to objects indexed by the keys
[Figure: the list head[L] → 9 → 16 → 4 → 1 → NIL]
Variants of Linked Lists
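• Simple
• Doubly linked
• Circular
Notice the sentinel NIL
[Figure: the list 9, 16, 4, 1 drawn as a simple list from head[L] to NIL, as a doubly linked list with NIL at both ends, and as a circular list built around the sentinel nil[L]]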
Implementing Sets as Lists
HEAD(L)
    return nil[L].next
TAIL(L)
    return nil[L].prev
INSERT(L, x)
    x.next = HEAD(L)
    HEAD(L).prev = x
    nil[L].next = x
    x.prev = nil[L]
[Figure: node x being inserted at the head of a circular list with sentinel nil[L] and nodes 4, 1]
Implementing Sets as Lists
LIST-DELETE(L, x)
    x.prev.next = x.next
    x.next.prev = x.prev
LIST-SEARCH(L, k)
    x = HEAD(L)
    while x != nil[L] and x.key != k
        x = x.next
    return x
• How fast do INSERT and DELETE run?
• How fast does LIST-SEARCH run?
[Figure: the list 16, 4, 1]
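The list operations above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the class names `Node` and `SentinelList` and the helper `keys` are assumptions made for the example, and `search` returns `None` where the pseudocode returns the sentinel nil[L].

```python
class Node:
    """A list element holding a key and links to its neighbours."""

    def __init__(self, key=None):
        self.key = key
        self.prev = self
        self.next = self


class SentinelList:
    """Circular doubly linked list; self.nil plays the role of nil[L]."""

    def __init__(self):
        # Sentinel: nil.next is the head, nil.prev is the tail.
        self.nil = Node()

    def insert(self, x):
        # Splice x in between the sentinel and the current head: O(1).
        x.next = self.nil.next
        x.prev = self.nil
        self.nil.next.prev = x
        self.nil.next = x

    def delete(self, x):
        # Unlink x, given a pointer to it: O(1).
        x.prev.next = x.next
        x.next.prev = x.prev

    def search(self, k):
        # Linear scan from the head: O(N). Returns None if k is absent.
        x = self.nil.next
        while x is not self.nil and x.key != k:
            x = x.next
        return None if x is self.nil else x

    def keys(self):
        # Hypothetical helper for inspection: keys from head to tail.
        out, x = [], self.nil.next
        while x is not self.nil:
            out.append(x.key)
            x = x.next
        return out
```

Inserting 1, 4, 16, 9 in that order yields the list 9, 16, 4, 1, since every insert goes to the head. The sentinel removes the special cases for an empty list and for deleting the head or tail.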
Sorted Lists
• INSERT x: search for y such that y.key ≤ x.key ≤ y.next.key, then insert x between y and y.next
• DELETE x: same as unsorted lists
• MINIMUM: return HEAD(L)
• MAXIMUM: return TAIL(L)
• LIST-SEARCH x: same as unsorted lists
• EXTRACT-MAX: DELETE the MAXIMUM
Running Times of Basic Operations
               UNSORTED LIST   SORTED LIST
INSERT         constant        O(N)
DELETE         constant        constant
SEARCH         O(N)            O(N)
MINIMUM        O(N)            constant
MAXIMUM        O(N)            constant
EXTRACT-MAX    O(N)            constant
Heaps
• Heap: a very efficient binary tree data structure
• Heap operations
  - INSERT
  - DELETE
  - EXTRACT-MAX
• Heapsort
• A priority queue based on heaps
  - Heap operations
  - DECREASE-KEY
Heaps: Definition
• A heap is an array A[1,…,length(A)]
• heap_size(A) ≤ length(A), is the size of the heap
• A[1], …, A[heap_size(A)] are elements in the heap
[Figure: array A = 16 14 10 8 7 9 3 2 4 1, with heap_size(A) marking the end of the heap contents and length(A) the end of the array]
Heaps: Definition
• A heap is also a binary tree
PARENT(i)
    return ⌊i/2⌋
LEFT(i)
    return 2i
RIGHT(i)
    return 2i + 1
[Figure: array A = 16 14 10 8 7 9 3 2 4 1 drawn as a binary tree: root 16; second level 14, 10; third level 8, 7, 9, 3; leaves 2, 4, 1]
Heaps: Definition
The heap property:
• For every i > 1, A[PARENT(i)] >= A[i]
[Figure: the heap A = 16 14 10 8 7 9 3 2 4 1 as array and tree; no node is larger than its parent]
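A small Python sketch of the index arithmetic and of the heap property may help here. It is an illustration under one stated assumption: Python lists are 0-based, so the formulas shift to `(i - 1) // 2`, `2*i + 1`, and `2*i + 2` instead of the 1-based ⌊i/2⌋, 2i, 2i + 1 used on the slides. The function name `is_max_heap` is made up for the example.

```python
def parent(i):
    # 0-based counterpart of PARENT(i) = floor(i / 2).
    return (i - 1) // 2


def left(i):
    # 0-based counterpart of LEFT(i) = 2i.
    return 2 * i + 1


def right(i):
    # 0-based counterpart of RIGHT(i) = 2i + 1.
    return 2 * i + 2


def is_max_heap(a, heap_size=None):
    """Check the heap property a[parent(i)] >= a[i] for every non-root i."""
    n = len(a) if heap_size is None else heap_size
    return all(a[parent(i)] >= a[i] for i in range(1, n))
```

For the array on the slide, `is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1])` holds, while an array with 4 at the root and 16 below it fails the check.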
Maintaining the Heap Property
Example:
• Heap property is violated at the root
• Left and right subtrees are heaps
• HEAPIFY(A, 1) will fix this
[Figure: A = 4 16 10 14 7 9 3 2 8 1 as array and tree, with 4 at the root]
Maintaining the Heap Property
HEAPIFY
• Propagate the problem down
• At each step, swap the problem node with its larger child
[Figure: array A after each swap made by HEAPIFY(A, 1)]
  4 swapped with 16:   16 4 10 14 7 9 3 2 8 1
  4 swapped with 14:   16 14 10 4 7 9 3 2 8 1
  4 swapped with 8:    16 14 10 8 7 9 3 2 4 1
Maintaining the Heap Property
HEAPIFY(A, i)
    l = LEFT(i)
    r = RIGHT(i)
    if l <= heap_size(A) and A[l] > A[i]
        then max = l
        else max = i
    if r <= heap_size(A) and A[r] > A[max]
        then max = r
    if max != i
        then exchange(A[i], A[max])
             HEAPIFY(A, max)
[Figure: A = 4 16 10 14 7 9 3 2 8 1 as array and tree, the input to HEAPIFY(A, 1)]
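The HEAPIFY pseudocode translates almost line for line into Python. The sketch below assumes 0-based indexing and passes `heap_size` as an explicit argument rather than storing it with the array; both choices are assumptions of this example, not something the slides prescribe.

```python
def heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at index i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    l, r = 2 * i + 1, 2 * i + 2
    # Pick the largest of a[i] and its (existing) children.
    largest = l if l < heap_size and a[l] > a[i] else i
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        # The problem has moved one level down; continue there.
        heapify(a, largest, heap_size)


a = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1]
heapify(a, 0, len(a))
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The recursion depth is at most the height of the heap, so Python's default recursion limit is not a practical concern for this function.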
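Heaps are balanced
Claim: The size of each child heap is < 2/3 N, where N is the size of the parent heap
Proof:
• Let N: parent heap size; A, B: child heap sizes
• Each child heap is full down to depth k - 1 and has a (respectively b) nodes on its last level:
  $A = 1 + \dots + 2^{k-1} + a$
  $B = 1 + \dots + 2^{k-1} + b$
  $N = 1 + 2 + \dots + 2^k + a + b = A + B + 1$
• Worst case: $a = 2^k$, $b = 0$
  $B = 1 + \dots + 2^{k-1} = 2^k - 1$
  $A = 1 + \dots + 2^k = 2B + 1$
  $N = A + B + 1 = 3B + 2$
  $A/N = [2(B+1) - 1] / [3(B+1) - 1] < 2(B+1) / 3(B+1) = 2/3$
[Figure: a parent heap whose two child heaps have a and b nodes on the last level k]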
HEAPIFY runs in time O(log n)
• $T(n) \le T(2n/3) + \Theta(1)$, so $T(n) = O(\log n)$ by the Master Theorem
  [[ Case 2: $T(n) = 1 \cdot T(n / (3/2)) + c$
     $f(n) = c$; $a = 1$; $b = 3/2$; $n^{\log_b a} = n^{\log_{3/2} 1} = n^0 = 1$
     $f(n) = \Theta(n^{\log_b a})$, therefore $T(n) = \Theta(n^{\log_b a} \log n) = \Theta(\log n)$ ]]
• Alternatively, T(n) = O(h), where h is the height of the heap
Building a Heap
• Build a heap starting from an unordered array
• The leaves are already heaps of size 1
• Go up one-by-one to the root, fixing the heap property
  Do that by running HEAPIFY
• Each HEAPIFY takes time O(height)
[Figure: array A after each HEAPIFY call that changes it]
  unordered input:   4 1 3 2 16 9 10 14 8 7
  HEAPIFY(A, 4):     4 1 3 14 16 9 10 2 8 7
  HEAPIFY(A, 3):     4 1 10 14 16 9 3 2 8 7
  HEAPIFY(A, 2):     4 16 10 14 7 9 3 2 8 1
  HEAPIFY(A, 1):     16 14 10 8 7 9 3 2 4 1
Building a Heap
[Figure: the finished heap A = 16 14 10 8 7 9 3 2 4 1 as array and tree]
BUILD-HEAP(A)
    heap_size(A) = length(A)
    for i = ⌊length(A)/2⌋ downto 1
        HEAPIFY(A, i)
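A possible Python rendering of BUILD-HEAP is sketched below. It assumes 0-based indexing, so the last internal node is at index `len(a) // 2 - 1`, and it uses an iterative sift-down in place of the recursive HEAPIFY; that swap is a choice made for this example and does not change which elements are exchanged.

```python
def heapify(a, i, heap_size):
    """Iterative sift-down: restore the max-heap property below index i."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap(a):
    """Turn an arbitrary list into a max-heap in place."""
    # Leaves are already heaps of size 1, so start at the last internal node
    # and work back up to the root.
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))


a = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_heap(a)
print(a)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```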
Running Time of BUILD-HEAP
• How fast does BUILD-HEAP run?
• Here is a bound:
  - At most N = heap_size(A) calls to HEAPIFY
  - Each call takes at most O(log N) time
  - Therefore, running time is O(N log N)
• Are we done?
Two lemmas left as exercises
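• Lemma 1
  An N-element heap has height $\lfloor \lg N \rfloor$
• Lemma 2
  An N-element heap has at most $\lceil N / 2^{h+1} \rceil$ nodes of height h
[Figure: the heap 16 14 10 8 7 9 3 2 4 1, of height h = 3; node heights run from 3 at the root down to 0 at the leaves]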
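Running Time of BUILD-HEAP
• Each HEAPIFY takes time O(h)
• HEAPIFY is called at most once per node
$T(N) = \sum_{h=0}^{\lfloor \lg N \rfloor} \lceil N / 2^{h+1} \rceil \, O(h) \le O\left( N \sum_{h=0}^{\lfloor \lg N \rfloor} h / 2^h \right)$
$\sum_{h=0}^{\lfloor \lg N \rfloor} h / 2^h < \sum_{h=0}^{\infty} h \, (1/2)^h = (1/2) / (1 - 1/2)^2 = 2$, by [A.8]
$T(N) = O(2N) = O(N)$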
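The O(N) bound can also be checked numerically. The sketch below sums the heights of all nodes in an N-element heap, which bounds the total number of levels HEAPIFY can descend during BUILD-HEAP. The function names are hypothetical, and indices are 1-based here to match the slides.

```python
def node_height(i, n):
    """Height of node i (1-based) in an n-element heap.

    In a heap-shaped tree the longest downward path from any node follows
    left children, so it is enough to keep doubling the index.
    """
    h = 0
    while 2 * i <= n:
        i *= 2
        h += 1
    return h


def total_height(n):
    """Sum of the heights of all nodes in an n-element heap."""
    return sum(node_height(i, n) for i in range(1, n + 1))


print(total_height(10))  # 8
```

For the 10-element example the heights are 3, 2, 1, 1, 1 for the five internal nodes and 0 for the leaves, which adds up to 8, below N = 10.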
Heapsort
[Figure: the heap A = 16 14 10 8 7 9 3 2 4 1 as array and tree; the root 16 is exchanged with the last element 1]
HEAPSORT(A)
    BUILD-HEAP(A)
    for i = length(A) downto 2
        exchange(A[1], A[i])
        heap_size(A)--
        HEAPIFY(A, 1)
Heapsort
[Figure: array A at the end of each iteration of the HEAPSORT loop, after exchange(A[1], A[i]) and HEAPIFY(A, 1); cells to the right of | are outside the heap and in final position]
  after BUILD-HEAP:   16 14 10 8 7 9 3 2 4 1
  i = 10:             14 8 10 4 7 9 3 2 1 | 16
  i = 9:              10 8 9 4 7 1 3 2 | 14 16
  i = 8:              9 8 3 4 7 1 2 | 10 14 16
  i = 7:              8 7 3 4 2 1 | 9 10 14 16
  i = 6:              7 4 3 1 2 | 8 9 10 14 16
  i = 5:              4 2 3 1 | 7 8 9 10 14 16
  i = 4:              3 2 1 | 4 7 8 9 10 14 16
  i = 3:              2 1 | 3 4 7 8 9 10 14 16
  i = 2:              1 | 2 3 4 7 8 9 10 14 16
RUNNING TIME?
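For reference, here is a compact Python sketch of HEAPSORT, assuming 0-based indexing and an explicit `heap_size` argument (both are choices of this example). It makes the cost visible: BUILD-HEAP is O(N), and the loop performs N - 1 HEAPIFY calls of O(log N) each, for O(N log N) overall.

```python
def heapify(a, i, heap_size):
    """Sift a[i] down within a[0:heap_size] to restore the max-heap property."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    """Sort the list a in place, in ascending order."""
    n = len(a)
    # BUILD-HEAP: O(N).
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    # Move the current maximum behind the heap, shrink the heap by one,
    # and repair the root: N - 1 iterations of O(log N) each.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)


a = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heapsort(a)
print(a)  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```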
Priority Queues
A priority queue S is a data structure supporting:
• MAXIMUM(S): returns the maximum key in S
• EXTRACT-MAX(S): removes the maximum key in S, and returns it
• INCREASE-KEY(S, xptr, xnew): increases xold, stored in xptr, to xnew > xold
• INSERT(S, x): S = S ∪ {x}
Heaps as Priority Queues
• Heaps can be efficient priority queues
MAXIMUM is implemented in O(1)
MAXIMUM(A)
    return A[1]
EXTRACT-MAX is implemented in ??
EXTRACT-MAX(A)
    if heap_size(A) < 1 then error(“underflow”)
    max = A[1]
    A[1] = A[heap_size(A)]
    heap_size(A)--
    HEAPIFY(A, 1)
    return max
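EXTRACT-MAX might look like this in Python. The sketch assumes the whole list is the heap, so `heap_size` is simply `len(heap)` and shrinking the heap is a `pop()`; raising `IndexError` on underflow is likewise a choice of this example.

```python
def heapify(a, i, heap_size):
    """Sift a[i] down within a[0:heap_size] to restore the max-heap property."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def extract_max(heap):
    """Remove and return the largest key of a max-heap stored in a list."""
    if not heap:
        raise IndexError("underflow")
    top = heap[0]
    last = heap.pop()  # shrink the heap by one
    if heap:
        heap[0] = last  # move the last leaf to the root ...
        heapify(heap, 0, len(heap))  # ... and sift it down
    return top
```

The only non-constant step is the single HEAPIFY call from the root, so the cost is bounded by the height of the heap.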
Heaps as Priority Queues
INCREASE-KEY is implemented in O(log n)
INCREASE-KEY(A, i, key)
    if key < A[i] then error(“key too small”)
    A[i] = key
    while i > 1 and A[PARENT(i)] < A[i]
        exchange(A[i], A[PARENT(i)])
        i = PARENT(i)
Example of INCREASE-KEY
[Figure: INCREASE-KEY(A, 9, 15) raises the key 4 stored at A[9] to 15; array A after each step]
  start:                  16 14 10 8 7 9 3 2 4 1
  A[9] = 15:              16 14 10 8 7 9 3 2 15 1
  exchange(A[9], A[4]):   16 14 10 15 7 9 3 2 8 1
  exchange(A[4], A[2]):   16 15 10 14 7 9 3 2 8 1
INCREASE-KEY(A, i, key)
    if key < A[i] then error(“key too small”)
    A[i] = key
    while i > 1 and A[PARENT(i)] < A[i]
        exchange(A[i], A[PARENT(i)])
        i = PARENT(i)
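The same example can be reproduced with a short Python sketch of INCREASE-KEY. It assumes 0-based indexing, so the slide's A[9] is index 8 and the parent of i is `(i - 1) // 2`; raising `ValueError` for a key that is too small is a choice of this example.

```python
def increase_key(heap, i, key):
    """Raise heap[i] to key and float it up until its parent is not smaller."""
    if key < heap[i]:
        raise ValueError("key too small")
    heap[i] = key
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p


a = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
increase_key(a, 8, 15)
print(a)  # [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
```

Each exchange moves the key one level closer to the root, which is where the O(log n) bound comes from.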
Heaps as Priority Queues
INSERT is implemented in O(log n)
INSERT(A, key)
    heap_size(A)++
    A[heap_size(A)] = -Infinity
    INCREASE-KEY(A, heap_size(A), key)
[Figure: INSERT 11 into the heap 16 15 10 14 7 9 3 2 8 1. The new last leaf starts at -∞, is raised to 11, and is exchanged with its parent 7, giving 16 15 10 14 11 9 3 2 8 1 7]
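INSERT can be sketched in Python on top of INCREASE-KEY, mirroring the pseudocode. The sketch assumes 0-based indexing and uses `-math.inf` for -Infinity, which works for numeric keys only; both are assumptions of this example.

```python
import math


def increase_key(heap, i, key):
    """Raise heap[i] to key and float it up until its parent is not smaller."""
    if key < heap[i]:
        raise ValueError("key too small")
    heap[i] = key
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p


def insert(heap, key):
    """Add key to a max-heap stored in a list."""
    # Grow the heap by one leaf holding -infinity, then raise it to key.
    heap.append(-math.inf)
    increase_key(heap, len(heap) - 1, key)


a = [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
insert(a, 11)
print(a)  # [16, 15, 10, 14, 11, 9, 3, 2, 8, 1, 7]
```

In everyday Python code the standard-library `heapq` module provides a ready-made binary heap, though it is a min-heap, so keys would need to be negated to get the max-oriented behaviour shown here.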
Summary
               UNSORTED LIST   SORTED LIST   HEAP
INSERT         constant        O(N)          O(log N)
DELETE         constant        constant      O(log N)
SEARCH         O(N)            O(N)          O(N)
MAXIMUM        O(N)            constant      constant
EXTRACT-MAX    O(N)            constant      O(log N)
INCREASE-KEY   constant        O(N)          O(log N)