Exercise Session 4 – Associative Data Structures
Computer Science II, D-ITET, ETH Zurich
Program Today
1. Feedback of last exercise
2. Repetition theory
   - AVL Condition
   - AVL Insert
   - Hashing
3. Programming Task
1. Feedback of last exercise
2. Repetition theory
Comparison of binary Trees
[Figure: one example tree per column (search tree, heap, balanced tree).]

            Search trees   Heaps (Min-/Max-Heap)   Balanced trees (AVL, red-black tree)
in C++:     -              std::make_heap          std::map
Insertion   Θ(h(T))        Θ(log n)                Θ(log n)
Search      Θ(h(T))        Θ(n) (!!)               Θ(log n)
Deletion    Θ(h(T))        Search + Θ(log n)       Θ(log n)

Recall: Θ(log n) ≤ Θ(h(T)) ≤ Θ(n)
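The difference in the Search row can be made concrete with the two standard library facilities named in the table. The following is a minimal sketch; the helper names buildMaxHeap, heapContains and mapContains are hypothetical and not part of the lecture. Note that the C++ standard only guarantees logarithmic lookup for std::map; it is typically implemented as a red-black tree.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Rearranges the keys into a max-heap stored in a vector.
std::vector<int> buildMaxHeap(std::vector<int> keys) {
    std::make_heap(keys.begin(), keys.end());
    return keys;
}

// Searching a heap: siblings are not ordered relative to each other,
// so in the worst case all n elements have to be inspected.
bool heapContains(const std::vector<int>& heap, int key) {
    assert(std::is_heap(heap.begin(), heap.end()));
    return std::find(heap.begin(), heap.end(), key) != heap.end();
}

// Searching a std::map: logarithmic in the number of stored keys.
bool mapContains(const std::map<int, int>& m, int key) {
    return m.find(key) != m.end();
}
```

The maximum of the heap is available in constant time as heap.front(), whereas looking for an arbitrary key falls back to a linear scan.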
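AVL Condition
AVL Condition: for each node v of a tree, bal(v) ∈ {−1, 0, 1}, where bal(v) = h(Tr(v)) − h(Tl(v)).
[Figure: node v with left subtree Tl(v) of height h and right subtree Tr(v) of height h + 1; the tree rooted at v has height h + 2.]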
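The condition can be checked recursively. The sketch below assumes a simple pointer-based Node type and the convention that the empty tree has height 0; the names Node, height, balance and isAVL are illustrative and not taken from the lecture. It only checks the balance condition, not the search tree ordering of the keys.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Height of the subtree rooted at v (empty tree: 0).
int height(const Node* v) {
    if (v == nullptr) return 0;
    return 1 + std::max(height(v->left), height(v->right));
}

// bal(v) = h(Tr(v)) - h(Tl(v))
int balance(const Node* v) {
    assert(v != nullptr);
    return height(v->right) - height(v->left);
}

// Returns the height of the subtree, or -1 if some node in it
// violates the AVL condition. Visits every node once.
int checkedHeight(const Node* v) {
    if (v == nullptr) return 0;
    const int hl = checkedHeight(v->left);
    const int hr = checkedHeight(v->right);
    if (hl < 0 || hr < 0) return -1;
    if (std::abs(hr - hl) > 1) return -1;
    return 1 + std::max(hl, hr);
}

bool isAVL(const Node* root) {
    return checkedHeight(root) >= 0;
}
```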
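Balance at Insertion Point
Let n be the new node and p its parent.
case 1: bal(p) = +1 (n becomes the left son of p) ⇒ bal(p) = 0
case 2: bal(p) = −1 (n becomes the right son of p) ⇒ bal(p) = 0
Finished in both cases because the subtree height did not change.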
Balance at Insertion Point
case 3.1: bal(p) = 0, n inserted to the right ⇒ bal(p) = +1
case 3.2: bal(p) = 0, n inserted to the left ⇒ bal(p) = −1
Not finished in either case: the subtree rooted at p has grown. Call upin(p).
upin(p) - invariant
When upin(p) is called, it holds that the subtree rooted at p has grown and bal(p) ∈ {−1, +1}.
upin(p)
Assumption: p is the left son of pp. (If p is a right son: symmetric cases with +1 and −1 exchanged.)
case 1: bal(pp) = +1 ⇒ bal(pp) = 0, done.
case 2: bal(pp) = 0 ⇒ bal(pp) = −1, call upin(pp).
In both cases the AVL condition holds for the subtree rooted at pp.
upin(p)
Assumption: p is the left son of pp.
case 3: bal(pp) = −1
This case is problematic: adding n to the subtree rooted at pp has violated the AVL condition. Re-balance!
Two cases: bal(p) = −1, bal(p) = +1
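The balance update at the insertion point and the upward pass upin(p) can be sketched as follows. This is an illustration under assumptions, not the lecture's reference code: the Node layout with a parent pointer and the helper attachLeaf are hypothetical, and case 3 is only detected here (the node pp that needs a rotation is returned) rather than repaired.

```cpp
#include <cassert>

struct Node {
    int key;
    int bal = 0;               // bal(v) = h(right subtree) - h(left subtree)
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Invariant: the subtree rooted at p has grown and bal(p) is -1 or +1.
// Returns the node pp that needs rebalancing (case 3), or nullptr if done.
Node* upin(Node* p) {
    assert(p->bal == -1 || p->bal == +1);
    Node* pp = p->parent;
    if (pp == nullptr) return nullptr;            // p is the root: done
    const int grow = (pp->left == p) ? -1 : +1;   // growth on the left lowers bal(pp)
    if (pp->bal == -grow) {                       // case 1: heights even out
        pp->bal = 0;
        return nullptr;
    }
    if (pp->bal == 0) {                           // case 2: pp has grown as well
        pp->bal = grow;
        return upin(pp);
    }
    return pp;                                    // case 3: rotation required at pp
}

// Attaches the new leaf n below p (into an empty slot) and updates balances.
Node* attachLeaf(Node* p, Node* n, bool asLeft) {
    assert((asLeft ? p->left : p->right) == nullptr);
    n->parent = p;
    if (asLeft) p->left = n; else p->right = n;
    if (p->bal != 0) {          // cases 1 and 2: height of p unchanged
        p->bal = 0;
        return nullptr;
    }
    p->bal = asLeft ? -1 : +1;  // cases 3.1 and 3.2: p has grown
    return upin(p);
}
```

Inserting 10, then 5, 15, 3 and 1 as leaves leaves the tree balanced until the last step, where attachLeaf returns the node with key 5 as the place where a rotation is needed.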
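Rotations
case 1.1: bal(p) = −1. (If p is a right son: bal(pp) = bal(p) = +1, left rotation.)
[Figure: before the rotation, pp = y has bal −2 and its left son p = x has bal −1; x has subtrees t1 (height h) and t2 (height h − 1), y has right subtree t3 (height h − 1); the subtree rooted at y has height h + 2. After a right rotation, x is the root with left subtree t1 and right son y, which has subtrees t2 and t3; bal(p) = bal(pp) = 0 and the subtree has height h + 1.]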
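Rotations
case 1.2: bal(p) = +1. (If p is a right son: bal(pp) = +1, bal(p) = −1, double rotation right-left.)
[Figure: before the rotation, pp = z has bal −2, its left son p = x has bal +1, and the right son y of x has bal −1 or +1; x has left subtree t1 (height h − 1), y has subtrees t2 and t3 (heights h − 1 and h − 2, or h − 2 and h − 1), z has right subtree t4 (height h − 1); the subtree rooted at z has height h + 2. After a double rotation left-right, y is the root with sons x (subtrees t1, t2) and z (subtrees t3, t4); bal(y) = 0, bal(x) = 0 or −1, bal(z) = +1 or 0, and the subtree has height h + 1.]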
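Both rotation cases for a left son p can be sketched with stored balance factors. The code below is a sketch under assumptions: the Node layout and the function name rebalanceLeft are hypothetical, parent pointers are omitted, and the caller is expected to hook the returned subtree root back into the tree. The balance updates follow the values shown in the two figures; the symmetric cases for a right son are left out.

```cpp
#include <cassert>

struct Node {
    int key;
    int bal;        // bal(v) = h(right subtree) - h(left subtree)
    Node* left;
    Node* right;
    Node(int k, int b = 0, Node* l = nullptr, Node* r = nullptr)
        : key(k), bal(b), left(l), right(r) {}
};

// Plain pointer rotations; both return the new subtree root.
Node* rotateRight(Node* y) {
    Node* x = y->left;
    y->left = x->right;
    x->right = y;
    return x;
}

Node* rotateLeft(Node* x) {
    Node* y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

// Repairs upin case 3 for a left son p of pp (bal(pp) would become -2).
Node* rebalanceLeft(Node* pp) {
    Node* p = pp->left;
    assert(p != nullptr && (p->bal == -1 || p->bal == +1));
    if (p->bal == -1) {              // case 1.1: single right rotation
        p->bal = 0;
        pp->bal = 0;
        return rotateRight(pp);
    }
    Node* y = p->right;              // case 1.2: double rotation left-right
    const int by = y->bal;           // 0 only if y is the newly inserted node
    p->bal = (by == +1) ? -1 : 0;
    pp->bal = (by == -1) ? +1 : 0;
    y->bal = 0;
    pp->left = rotateLeft(p);
    return rotateRight(pp);
}
```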
Quiz
Insert key 12 into the following AVL tree and rebalance as shown in class. What does the AVL tree look like afterwards?

            30
        /        \
      10          50
     /  \        /  \
    3    17    40    60
   /    /  \
  1   14    19
Solution

            17
        /        \
      10          30
     /  \        /  \
    3    14    19    50
   /    /           /  \
  1   12          40    60
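The quiz can be cross-checked with a small program. The sketch below deliberately uses a different formulation than the upin procedure from class: a recursive insert that stores subtree heights instead of balance factors. All names (Node, insert, quizTree, destroy) are hypothetical. Inserting the keys level by level reproduces the quiz tree without triggering any rotation, and inserting 12 then causes the double rotation left-right at the root.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

struct Node {
    int key;
    int height = 1;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

int height(const Node* v) { return v ? v->height : 0; }
int balance(const Node* v) { return height(v->right) - height(v->left); }
void update(Node* v) { v->height = 1 + std::max(height(v->left), height(v->right)); }

Node* rotateRight(Node* y) {
    Node* x = y->left;
    y->left = x->right;
    x->right = y;
    update(y);
    update(x);
    return x;
}

Node* rotateLeft(Node* x) {
    Node* y = x->right;
    x->right = y->left;
    y->left = x;
    update(x);
    update(y);
    return y;
}

// Inserts key below v and returns the (possibly new) root of the subtree.
Node* insert(Node* v, int key) {
    if (v == nullptr) return new Node(key);
    if (key < v->key) v->left = insert(v->left, key);
    else if (key > v->key) v->right = insert(v->right, key);
    else return v;                                   // duplicates are ignored
    update(v);
    const int b = balance(v);
    assert(b >= -2 && b <= 2);
    if (b == -2) {                                   // left subtree too high
        if (balance(v->left) == +1) v->left = rotateLeft(v->left);
        return rotateRight(v);
    }
    if (b == +2) {                                   // right subtree too high
        if (balance(v->right) == -1) v->right = rotateRight(v->right);
        return rotateLeft(v);
    }
    return v;
}

void destroy(Node* v) {
    if (v == nullptr) return;
    destroy(v->left);
    destroy(v->right);
    delete v;
}

// Builds the quiz tree and inserts key 12.
Node* quizTree() {
    Node* root = nullptr;
    for (int key : {30, 10, 50, 3, 17, 40, 60, 1, 14, 19}) root = insert(root, key);
    return insert(root, 12);
}
```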
Hashing well-done
Useful hashing...
- distributes the keys as uniformly as possible in the hash table.
- avoids probing over long areas of used entries (e.g. primary clustering).
Hashing Examples
Insert the keys 25, 4, 17, 45 into the hash table, using the function h(k) = k mod 7 and probing to the right, (h(k) + s(j, k)) mod 7:
- linear probing, s(j, k) = j.
- double hashing, s(j, k) = j · (1 + (k mod 5)).

index            0   1   2   3   4   5   6
linear probing   -   -   -   17  25  4   45
double hashing   -   -   4   17  25  45  -
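The two probing schemes of this example can be replayed in code. This is a minimal sketch assuming non-negative keys and the value -1 as marker for an empty slot; the names kEmpty, insertKey and buildTable are hypothetical. The probe position is taken modulo the table size so that probing wraps around at the right end.

```cpp
#include <cassert>
#include <vector>

const int kEmpty = -1;  // assumption: keys are non-negative, -1 marks a free slot

// Probing functions s(j, k) from the example.
int linearProbe(int j, int /*k*/) { return j; }
int doubleHashProbe(int j, int k) { return j * (1 + k % 5); }

// Inserts key at the first free slot (h(k) + s(j, k)) mod m, j = 0, 1, ...
// Returns the slot index, or -1 if no free slot was found within m probes.
int insertKey(std::vector<int>& table, int key, int (*s)(int, int)) {
    const int m = static_cast<int>(table.size());
    assert(m > 0 && key >= 0);
    for (int j = 0; j < m; ++j) {
        const int slot = (key % m + s(j, key)) % m;
        if (table[slot] == kEmpty) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

// Inserts all keys, in order, into an empty table of size m.
std::vector<int> buildTable(const std::vector<int>& keys, int m, int (*s)(int, int)) {
    std::vector<int> table(m, kEmpty);
    for (int k : keys) insertKey(table, k, s);
    return table;
}
```

With the keys 25, 4, 17, 45 and m = 7, linear probing places 4 in slot 5 and 45 in slot 6, while double hashing places 4 in slot 2 (probe step 5, wrapping around) and 45 in slot 5.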
3. Programming Task
Finding a Sub-Array
Given: two integer arrays $A = (a_0, \dots, a_{n-1})$ and $B = (b_0, \dots, b_{k-1})$
Task: Find the position of B in A.
Naive: Loop through A, check whether the following k entries match B. ⇒ O(nk) comparison operations
Solution using hashing: Calculate the hash h(B) and compare it to $h((a_i, a_{i+1}, \dots, a_{i+k-1}))$. Avoid re-computing $h((a_i, a_{i+1}, \dots, a_{i+k-1}))$ from scratch for each i ⇒ O(n) expected
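For comparison, the naive approach can be written down in a few lines. This is a sketch with a hypothetical function name findNaive that works on std::vector and returns an index; by assumption it returns A.size() if B does not occur.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the first index i with A[i..i+k-1] == B, or A.size() if there is none.
// Worst case: O(n * k) element comparisons.
std::size_t findNaive(const std::vector<int>& A, const std::vector<int>& B) {
    const std::size_t n = A.size();
    const std::size_t k = B.size();
    if (k > n) return n;
    for (std::size_t i = 0; i + k <= n; ++i) {
        std::size_t j = 0;
        while (j < k && A[i + j] == B[j]) ++j;   // compare window entry by entry
        if (j == k) return i;
    }
    return n;
}
```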
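Sliding Window Hash
Possible hash function: sum of all elements, $\sum_{j=0}^{k-1} a_{i+j}$.
Can be updated easily: subtract $a_i$ and add $a_{i+k}$.
However: bad hash function (for example, it ignores the order of the elements).

Better:
$$H_{c,m}((a_i, \dots, a_{i+k-1})) = \left( \sum_{j=0}^{k-1} a_{i+j} \cdot c^{k-j-1} \right) \bmod m$$
c = 1021, a prime number
m = $2^{15}$: values fit in an int, no overflows in the calculations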
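The polynomial hash can be updated in constant time when the window moves one position to the right: $H' = (H \cdot c - a_i \cdot c^k + a_{i+k}) \bmod m$. The sketch below illustrates this with the constants from the slide; the function names residue, windowHash, powerMod and slide are hypothetical. All intermediate values stay below $2^{32}$, so unsigned arithmetic does not overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const unsigned M = 32768;  // m = 2^15
const unsigned C = 1021;   // c, a prime number

// Non-negative residue of a (possibly negative) element modulo M.
unsigned residue(int a) {
    const int m = static_cast<int>(M);
    return static_cast<unsigned>(((a % m) + m) % m);
}

// H_{c,m}((a_i, ..., a_{i+k-1})), evaluated with Horner's scheme.
unsigned windowHash(const std::vector<int>& a, std::size_t i, std::size_t k) {
    assert(i + k <= a.size());
    unsigned h = 0;
    for (std::size_t j = 0; j < k; ++j) h = (h * C + residue(a[i + j])) % M;
    return h;
}

// c^k mod m, to be computed once before sliding.
unsigned powerMod(std::size_t k) {
    unsigned p = 1;
    for (std::size_t j = 0; j < k; ++j) p = (p * C) % M;
    return p;
}

// Hash of the window starting at i + 1, given the hash h of the window at i.
// out = a_i leaves the window, in = a_{i+k} enters it.
unsigned slide(unsigned h, int out, int in, unsigned cPowK) {
    const unsigned removed = (residue(out) * cPowK) % M;
    return ((h * C) % M + M - removed + residue(in)) % M;  // + M keeps it non-negative
}
```

A typical use computes windowHash once for the first window and powerMod(k) once overall, then calls slide for every further position.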
Computing with Modulo
(a + b) mod m = ((a mod m) + (b mod m)) mod m
(a − b) mod m = ((a mod m) − (b mod m) + m) mod m
(a · b) mod m = ((a mod m) · (b mod m)) mod m
Exercise: Compute 12746357 mod 11
Computing Modulo
Exercise: Compute
$12746357 \bmod 11$
$= (7 + 5 \cdot 10 + 3 \cdot 10^2 + 6 \cdot 10^3 + 4 \cdot 10^4 + 7 \cdot 10^5 + 2 \cdot 10^6 + 1 \cdot 10^7) \bmod 11$
$= (7 + 50 + 3 + 60 + 4 + 70 + 2 + 10) \bmod 11$
$= (7 + 6 + 3 + 5 + 4 + 4 + 2 + 10) \bmod 11$
$= 41 \bmod 11$
$= 8$.
For the second equality we used the fact that $10^2 \bmod 11 = 1$.
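The same digit-by-digit computation can be expressed in code by applying the addition and multiplication rules at every step. This is a sketch with a hypothetical function digitsMod; it assumes the number is given as a string of decimal digits and that m is small enough (for example below $2^{16}$) for the products to fit into an unsigned.

```cpp
#include <cassert>
#include <string>

// Computes (decimal number given by digits) mod m, least significant digit first.
// Every intermediate value is reduced modulo m, so nothing grows beyond about 10 * m.
unsigned digitsMod(const std::string& digits, unsigned m) {
    assert(m > 0);
    unsigned result = 0;
    unsigned power = 1 % m;  // 10^0 mod m
    for (auto it = digits.rbegin(); it != digits.rend(); ++it) {
        assert(*it >= '0' && *it <= '9');
        const unsigned d = static_cast<unsigned>(*it - '0');
        result = (result + (d % m) * power) % m;   // addition and product rule
        power = (power * (10 % m)) % m;            // next power of 10, reduced
    }
    return result;
}
```

For the exercise, digitsMod("12746357", 11) yields 8; the powers of 10 alternate between 1 and 10 modulo 11, exactly as used in the derivation.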
Sliding Window Hash
template<typename It1, typename It2>
It1 findOccurrence(const It1 from, const It1 to,
                   const It2 begin, const It2 end)
{
  const unsigned k = end - begin;
  const unsigned M = 32768;
  const unsigned C = 1021;

  // your code here
  // ...

  // elements can be compared using std::equal:
  if (std::equal(window_left, window_right, begin, end))
    return current;

  // if no occurrence is found return end of array
  return to;
}
Sliding Window Hash
Make sure that
- the algorithm computes $c^k$ only once,
- all computations are modulo m for all values in order not to get an overflow (recall the rules of modular arithmetic), and
- the values are never negative (e.g., by adding multiples of m).
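One possible way to complete the exercise is sketched below. It is not the official solution: the local names residue, hashB, hashW, cPowK, left and right are assumptions, and the elements are assumed to be integers. It follows the three points above: $c^k$ is computed once, every intermediate value is reduced modulo M, and M is added before subtracting so that no value becomes negative. The four-iterator std::equal requires C++14.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

template <typename It1, typename It2>
It1 findOccurrence(const It1 from, const It1 to, const It2 begin, const It2 end) {
    const unsigned M = 32768;  // 2^15
    const unsigned C = 1021;
    const auto k = std::distance(begin, end);
    const auto n = std::distance(from, to);
    assert(k >= 0 && n >= 0);
    if (k > n) return to;      // B is longer than A: no occurrence

    // Non-negative residue of a (possibly negative) element modulo M.
    auto residue = [=](long long a) {
        const long long m = M;
        return static_cast<unsigned>(((a % m) + m) % m);
    };

    // Hash of B, hash of the first window of A, and c^k mod M in one pass.
    unsigned hashB = 0, hashW = 0, cPowK = 1;
    It1 right = from;
    for (It2 b = begin; b != end; ++b, ++right) {
        hashB = (hashB * C + residue(*b)) % M;
        hashW = (hashW * C + residue(*right)) % M;
        cPowK = (cPowK * C) % M;
    }

    It1 left = from;           // current window is [left, right)
    while (true) {
        // Equal hashes may be a collision, so confirm with std::equal.
        if (hashW == hashB && std::equal(left, right, begin, end)) return left;
        if (right == to) return to;   // no occurrence found
        const unsigned removed = (residue(*left) * cPowK) % M;
        hashW = ((hashW * C) % M + M - removed + residue(*right)) % M;
        ++left;
        ++right;
    }
}
```

Called with A = (4, −1, 7, 7, 2, 9) and B = (7, 2, 9), the function returns an iterator to position 3 of A.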