Dictionaries I
Data Structures and Parallelism
Warm Up
Write a recurrence to represent the running time of this code
CSE 373 SU 18 - ROBBIE WEBER 2
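int Mystery(int n) {
    // Base case: constant work.
    if (n <= 5) {
        return 1;
    }
    // Nested loops: n * n print statements.
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            System.out.println("hi");
        }
    }
    // One recursive call on half the input.
    return n * Mystery(n / 2);
}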
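One way to read the code: above the base case, each call does n * n prints and then makes a single call on n/2. That suggests a recurrence of the shape T(n) = T(n/2) + c1*n^2 + c2 for n > 5 and T(n) = c0 otherwise, where c0, c1, c2 are placeholder constants. The sketch below counts only the print statements, as a sanity check on that shape; the class and method names are hypothetical and not from the slides.

```java
// Hypothetical helper class, not part of the original slides.
class MysteryCount {
    // Number of "hi" lines Mystery(n) would print:
    // P(n) = 0 when n <= 5, otherwise n*n + P(n/2) (integer division, as in the code).
    static long prints(int n) {
        if (n <= 5) {
            return 0;
        }
        return (long) n * n + prints(n / 2);
    }

    public static void main(String[] args) {
        // Compare the print count with n^2 as n doubles.
        for (int n = 8; n <= 1024; n *= 2) {
            System.out.println(n + " -> " + prints(n) + " prints, n^2 = " + (long) n * n);
        }
    }
}
```

The count stays within a constant factor of n^2, which is what the n^2 term in the recurrence predicts.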
Announcements
Project 1 is due Friday at 11:59 PM.
-Remember both partners use the late day(s) if you submit late.
P1: for HashTrieMap, be sure you're updating the size field we gave you.
-We haven't given you tests for everything!
-Remember that the tests we give you are not exhaustive; passing 90% of the tests on IntelliJ does not mean you would get 90% on the project.
Start thinking about potential P2 partners!
Outline
One more unrolling example
Two new (old?) ADTs
-Dictionaries
-Sets
Review BSTs
Intro AVL trees
Solving Recurrences III
$$T(n) = \begin{cases} 5 & \text{when } n \le 4 \\ 3T(n/4) + cn^2 & \text{otherwise} \end{cases}$$
[Figure: recursion tree. The root does $cn^2$ work; each of its 3 children does $c(n/4)^2$; each of the 9 nodes on the next level does $c(n/16)^2$; and so on, down to leaves that each do 5.]
Answer the following questions:
1. What is the input size on level $i$?
2. Number of nodes at level $i$?
3. Work done at recursive level $i$?
4. Last level of tree?
5. Work done at base case?
6. What is the sum over all levels?
Solving Recurrences III
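$$T(n) = \begin{cases} 5 & \text{when } n \le 4 \\ 3T(n/4) + cn^2 & \text{otherwise} \end{cases}$$
Level ($i$) | Number of Nodes | Work per Node | Work per Level
0 | $1$ | $cn^2$ | $cn^2$
1 | $3$ | $c(n/4)^2$ | $\frac{3}{16}cn^2$
2 | $3^2$ | $c(n/4^2)^2$ | $\left(\frac{3}{16}\right)^2 cn^2$
$i$ | $3^i$ | $c(n/4^i)^2$ | $\left(\frac{3}{16}\right)^i cn^2$
Base $= \log_4 n - 1$ | $3^{\log_4 n - 1}$ | $5$ | $\frac{5}{3} n^{\log_4 3}$
1. Input size on level $i$? $n/4^i$
2. How many calls on level $i$? $3^i$
3. How much work on level $i$? $3^i \cdot c(n/4^i)^2 = \left(\frac{3}{16}\right)^i cn^2$
4. What is the last level? When $n/4^i = 4$, that is, $i = \log_4 n - 1$
5. A. How much work for each leaf node? $5$
B. How many base case calls? $3^{\log_4 n - 1} = \frac{3^{\log_4 n}}{3} = \frac{n^{\log_4 3}}{3}$, using the power of a log identity $x^{\log_b y} = y^{\log_b x}$
6. Combining it all together…
$$T(n) = \sum_{i=0}^{\log_4 n - 2} \left(\frac{3}{16}\right)^i cn^2 + \frac{5}{3} n^{\log_4 3}$$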
Solving Recurrences III
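$$T(n) = \begin{cases} 5 & \text{when } n \le 4 \\ 3T(n/4) + cn^2 & \text{otherwise} \end{cases}$$
Closed form:
$$T(n) = \sum_{i=0}^{\log_4 n - 2} \left(\frac{3}{16}\right)^i cn^2 + \frac{5}{3} n^{\log_4 3}$$
7. Simplify…
Factoring out a constant: $\sum_{i=a}^{b} c f(i) = c \sum_{i=a}^{b} f(i)$
$$T(n) = cn^2 \sum_{i=0}^{\log_4 n - 2} \left(\frac{3}{16}\right)^i + \frac{5}{3} n^{\log_4 3}$$
Finite geometric series: $\sum_{i=0}^{n-1} x^i = \frac{x^n - 1}{x - 1}$
$$T(n) = cn^2 \cdot \frac{\left(\frac{3}{16}\right)^{\log_4 n - 1} - 1}{\frac{3}{16} - 1} + \frac{5}{3} n^{\log_4 3}$$
If we're trying to prove upper bound…
Infinite geometric series: $\sum_{i=0}^{\infty} x^i = \frac{1}{1 - x}$ when $-1 < x < 1$
$$T(n) \le cn^2 \sum_{i=0}^{\infty} \left(\frac{3}{16}\right)^i + \frac{5}{3} n^{\log_4 3}$$
$$T(n) \le cn^2 \cdot \frac{1}{1 - \frac{3}{16}} + \frac{5}{3} n^{\log_4 3}$$
Since $\log_4 3 < 2$, the second term is also $O(n^2)$, so
$$T(n) \in O(n^2)$$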
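As a numerical sanity check on the derivation above, the recurrence can be evaluated directly and compared against the upper bound from the infinite geometric series. This is only a sketch: the class name is hypothetical, and c is assumed to be 1 because the slides leave it unspecified.

```java
// Hypothetical helper class for checking the bound numerically.
class RecurrenceCheck {
    static final double C = 1.0; // assumed value for the constant c

    // T(n) = 5 when n <= 4, and 3T(n/4) + c*n^2 otherwise.
    static double t(long n) {
        if (n <= 4) {
            return 5.0;
        }
        return 3.0 * t(n / 4) + C * n * n;
    }

    // Upper bound from the infinite geometric series:
    // c*n^2 * 1/(1 - 3/16) + (5/3) * n^(log_4 3).
    static double upperBound(long n) {
        double log4of3 = Math.log(3) / Math.log(4);
        return C * n * n * (16.0 / 13.0) + (5.0 / 3.0) * Math.pow(n, log4of3);
    }

    public static void main(String[] args) {
        // Powers of 4, so that n/4 divides evenly at every level.
        for (long n = 4; n <= (1L << 20); n *= 4) {
            System.out.printf("n=%d  T(n)=%.1f  bound=%.1f  T(n)/n^2=%.4f%n",
                    n, t(n), upperBound(n), t(n) / ((double) n * n));
        }
    }
}
```

For powers of 4 the ratio T(n)/n^2 stays bounded, which is consistent with T(n) being O(n^2).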
Our Next ADT
Dictionary ADT
state
Set of (key, value) pairs
behavior
insert(key, value) - inserts (key, value) pair. If key was already in dictionary, overwrites the previous value.
find(key) - returns the stored value associated with key.
delete(key) - removes the key and its value from the dictionary.
Real world intuition:
keys: words
values: definitions
Dictionaries are often called "maps".
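To make the three operations concrete, here is a small sketch using Java's standard java.util.Map, whose put, get, and remove roughly play the roles of insert, find, and delete. It only illustrates the ADT's behavior; it is not the implementation built in this course, and the class name and sample entries are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the keys and values are made-up sample data.
class DictionaryDemo {
    static Map<String, String> build() {
        Map<String, String> dict = new HashMap<>();
        dict.put("heap", "a kind of tree");            // insert(key, value)
        dict.put("heap", "a priority queue structure"); // repeat key: overwrites the old value
        dict.put("set", "a collection of elements");
        dict.remove("set");                             // delete(key)
        return dict;
    }

    public static void main(String[] args) {
        Map<String, String> dict = build();
        System.out.println(dict.get("heap")); // find(key)
        System.out.println(dict.get("set"));  // absent keys come back as null in java.util.Map
    }
}
```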
Our Next ADT
Set ADT
state
Set of elements
behavior
insert(element) - inserts element into the set.
find(element) - returns true if element is in the set, false otherwise.
delete(element) - removes element from the set.
Usually implemented as a dictionary with values "true" or "false".
Later in the course we'll want more complicated set operations like union(set1, set2).
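The "set as a dictionary" idea can be sketched as a thin wrapper around a map from element to a boolean flag. The class name is hypothetical, java.util.HashMap stands in for whatever dictionary implementation is available, and union is written in the simplest possible way rather than an efficient one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: a set stored as a dictionary from element to "true".
class DictBackedSet<E> {
    private final Map<E, Boolean> present = new HashMap<>();

    void insert(E element) {
        present.put(element, true); // inserting twice just overwrites true with true
    }

    boolean find(E element) {
        return present.getOrDefault(element, false); // missing keys count as false
    }

    void delete(E element) {
        present.remove(element);
    }

    // Builds a new set containing every element of a and every element of b.
    static <E> DictBackedSet<E> union(DictBackedSet<E> a, DictBackedSet<E> b) {
        DictBackedSet<E> result = new DictBackedSet<>();
        result.present.putAll(a.present);
        result.present.putAll(b.present);
        return result;
    }
}
```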
Uses of Dictionaries
Dictionaries show up all the time.
There are too many applications to really list all of them:
-Phonebooks
-Indexes
-Databases
-Operating System memory management
-The internet (DNS)
-…
Any time you want to organize information for easy retrieval.
We're going to design three completely different implementations of Dictionaries - they have that many different uses.
Simple Dictionary Implementations
                     | Insert | Find | Delete
Unsorted Linked List |        |      |
Unsorted Array       |        |      |
Sorted Linked List   |        |      |
Sorted Array         |        |      |
What are the worst case running times for each operation if you have n (key, value) pairs?
Assume the arrays do not need to be resized.
Think about what happens if a repeat key is inserted!
Simple Dictionary Implementations
                     | Insert | Find     | Delete
Unsorted Linked List | Θ(n)   | Θ(n)     | Θ(n)
Unsorted Array       | Θ(n)   | Θ(n)     | Θ(n)
Sorted Linked List   | Θ(n)   | Θ(n)     | Θ(n)
Sorted Array         | Θ(n)   | Θ(log n) | Θ(n)
What are the worst case running times for each operation if you have n (key, value) pairs?
Assume the arrays do not need to be resized.
Think about what happens if a repeat key is inserted!
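The sorted array row can be sketched as follows: find is a binary search, while insert and delete may have to shift up to n entries. This is a minimal illustration with int keys and String values; the class name is hypothetical, and, as on the slide, the array is assumed never to need resizing.

```java
// Hypothetical sketch: a dictionary stored as parallel arrays kept sorted by key.
class SortedArrayDictionary {
    private final int[] keys;
    private final String[] values;
    private int size = 0;

    SortedArrayDictionary(int capacity) { // assumes capacity is never exceeded
        keys = new int[capacity];
        values = new String[capacity];
    }

    // Binary search: index of key if present, otherwise -(insertion point) - 1.
    private int indexOf(int key) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] < key) {
                lo = mid + 1;
            } else if (keys[mid] > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    // Theta(log n): only the binary search.
    String find(int key) {
        int i = indexOf(key);
        return i >= 0 ? values[i] : null;
    }

    // Theta(n) worst case: entries after the insertion point shift right.
    void insert(int key, String value) {
        int i = indexOf(key);
        if (i >= 0) {          // repeat key: overwrite, no shifting
            values[i] = value;
            return;
        }
        int at = -(i + 1);
        System.arraycopy(keys, at, keys, at + 1, size - at);
        System.arraycopy(values, at, values, at + 1, size - at);
        keys[at] = key;
        values[at] = value;
        size++;
    }

    // Theta(n) worst case: entries after the removed slot shift left.
    boolean delete(int key) {
        int i = indexOf(key);
        if (i < 0) {
            return false;
        }
        System.arraycopy(keys, i + 1, keys, i, size - i - 1);
        System.arraycopy(values, i + 1, values, i, size - i - 1);
        size--;
        values[size] = null;
        return true;
    }

    int size() {
        return size;
    }
}
```

In the unsorted structures a repeat key is the reason insert is Θ(n): the whole structure may need to be scanned to find the old value before overwriting it.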
Aside: Lazy Deletion
Lazy Deletion: A general way to make delete() more efficient.
Don't remove the entry from the structure, just "mark" it as deleted.
Benefits:
-Much simpler to implement
-More efficient (no need to shift values on every single delete)
Drawbacks:
-Extra space:
  -For the flag
  -More drastically, data structure grows with all insertions, not with the current number of items.
-Sometimes makes other operations more complicated.
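A minimal sketch of lazy deletion on a sorted array of keys, assuming distinct int keys and leaving insert out for brevity. The class and method names are hypothetical. Delete becomes a binary search plus flipping a flag, with no shifting, at the cost of slots that are never reclaimed.

```java
import java.util.Arrays;

// Hypothetical sketch of lazy deletion; assumes the keys are sorted and distinct.
class LazySortedKeys {
    private final int[] keys;        // sorted; never shrinks
    private final boolean[] deleted; // deleted[i] marks keys[i] as removed
    private int liveCount;

    LazySortedKeys(int[] sortedKeys) {
        keys = sortedKeys.clone();
        deleted = new boolean[keys.length];
        liveCount = keys.length;
    }

    // Find has to check the flag as well as the key.
    boolean find(int key) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 && !deleted[i];
    }

    // Binary search plus a flag flip; nothing is shifted.
    boolean delete(int key) {
        int i = Arrays.binarySearch(keys, key);
        if (i < 0 || deleted[i]) {
            return false;
        }
        deleted[i] = true;
        liveCount--;
        return true;
    }

    int liveCount() {
        return liveCount;
    }

    // Slots in use: grows with everything ever inserted, not with the live items.
    int slotsUsed() {
        return keys.length;
    }
}
```

The extra check in find is a small instance of the last drawback: other operations have to know about the flag.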
Simple Dictionary Implementations
                     | Insert | Find     | Delete
Unsorted Linked List | Θ(m)   | Θ(m)     | Θ(m)
Unsorted Array       | Θ(m)   | Θ(m)     | Θ(m)
Sorted Linked List   | Θ(m)   | Θ(m)     | Θ(m)
Sorted Array         | Θ(m)   | Θ(log m) | Θ(log m)
We can do slightly better with lazy deletion. Let m be the total number of elements ever inserted (even if later lazily deleted).
Think about what happens if a repeat key is inserted!
A Better Implementation
What about BSTs?
Keys will have to be comparable…
        | Insert | Find | Delete
Average |        |      |
Worst   |        |      |
A Better Implementation
What about BSTs?
Keys will have to be comparable…
        | Insert   | Find     | Delete
Average | Θ(log n) | Θ(log n) |
Worst   | Θ(n)     | Θ(n)     |
Let's talk about how to implement delete.
Deletion from BSTs
Deleting will have three steps:
-Finding the element to delete
-Removing the element
-Restoring the BST property
Deletion β Easy Cases
What if the element to delete:
-Is a leaf?
-Has exactly one child?
[Figure: example BST containing the keys 2, 4, 6, 7, 8, 9]
Deletion β Easy Cases
What if the element to delete:
-Is a leaf?
-Has exactly one child?
[Figure: example BST containing the keys 2, 4, 6, 7, 8, 9]
Deleting a leaf:
Just get rid of it.
Delete(7)
Deletion β Easy Cases
What if the element to delete:
-Is a leaf?
-Has exactly one child?
[Figure: example BST containing the keys 2, 4, 6, 7, 8, 9]
Deleting a node with one child:
Delete the node
Connect its parent and child
Delete(4)
Deletion β The Hard Case
What happens if the node to delete has two children?
[Figure: example BST containing the keys 2, 3, 4, 5, 6, 7, 8, 9, 10, in which the node 7 has two children]
What if we try Delete(7)?
What can we replace it with?
6 or 8
The biggest thing in left subtree or smallest thing in right subtree.
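The three cases can be sketched in one recursive method. This is a minimal illustration on int keys with hypothetical class and method names, and it picks the smallest key in the right subtree as the replacement; using the biggest key in the left subtree would work equally well.

```java
// Hypothetical minimal BST of int keys, only to illustrate the deletion cases.
class BstDeleteSketch {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    static Node insert(Node root, int key) {
        if (root == null) {
            return new Node(key);
        }
        if (key < root.key) {
            root.left = insert(root.left, key);
        } else if (key > root.key) {
            root.right = insert(root.right, key);
        }
        return root;
    }

    // Returns the root of the subtree after removing key (if it was present).
    static Node delete(Node root, int key) {
        if (root == null) {
            return null;                          // key not found
        }
        if (key < root.key) {
            root.left = delete(root.left, key);
        } else if (key > root.key) {
            root.right = delete(root.right, key);
        } else if (root.left == null) {
            return root.right;                    // leaf, or only a right child
        } else if (root.right == null) {
            return root.left;                     // only a left child
        } else {
            Node successor = root.right;          // smallest key in the right subtree
            while (successor.left != null) {
                successor = successor.left;
            }
            root.key = successor.key;             // copy it into this node
            root.right = delete(root.right, successor.key); // now an easy case
        }
        return root;
    }

    // In-order listing of the keys, separated by spaces.
    static String keys(Node root) {
        StringBuilder out = new StringBuilder();
        inorder(root, out);
        return out.toString().trim();
    }

    private static void inorder(Node root, StringBuilder out) {
        if (root == null) {
            return;
        }
        inorder(root.left, out);
        out.append(root.key).append(' ');
        inorder(root.right, out);
    }
}
```

The replacement node never has a left child, so removing it from the right subtree always falls into one of the easy cases.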
A Better Implementation
What about BSTs?
Keys will have to be comparable.
        | Insert   | Find     | Delete
Average | Θ(log n) | Θ(log n) |
Worst   | Θ(n)     | Θ(n)     |
A Better Implementation
What about BSTs?
Keys will have to be comparable.
        | Insert   | Find     | Delete
Average | Θ(log n) | Θ(log n) | Θ(log n)
Worst   | Θ(n)     | Θ(n)     | Θ(n)
We're in the same position we were in for heaps.
BSTs are great on average, but we need to avoid the worst case.
Avoiding the Worst Case
Take I:
Let's require the tree to be complete.
It worked for heaps!
What goes wrong:
When we insert, we'll break the completeness property.
Insertions always add a new leaf, but you can't control where.
Can we fix it?
Not easily :/
Avoiding the Worst Case
Take II:
Here are some other requirements you might try. Could they work? If not, what can go wrong?
Root Balanced: The root must have the same number of nodes in its left and right subtrees.
Recursively Balanced: Every node must have the same number of nodes in its left and right subtrees.
Root Height Balanced: The left and right subtrees of the root must have the same height.
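One thing that can go wrong with the two root-only conditions: they say nothing about the shape below the root. The sketch below (hypothetical names) builds a BST whose root has a left chain and a right chain of equal length, so it is both root balanced and root height balanced, yet its height is about n/2.

```java
// Hypothetical sketch: a tree that satisfies both root-only conditions but is very tall.
class RootBalancedCounterexample {
    static class Node {
        int key;
        Node left;
        Node right;

        Node(int key) {
            this.key = key;
        }
    }

    // Root 0 with a left chain -1, -2, ..., -k and a right chain 1, 2, ..., k (2k + 1 nodes).
    static Node build(int k) {
        Node root = new Node(0);
        Node cur = root;
        for (int i = 1; i <= k; i++) {
            cur.left = new Node(-i);
            cur = cur.left;
        }
        cur = root;
        for (int i = 1; i <= k; i++) {
            cur.right = new Node(i);
            cur = cur.right;
        }
        return root;
    }

    static int size(Node n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    // Height of a single node is 0; the empty tree is -1.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Recursively Balanced fails for the opposite reason: it is so strict that only perfectly full trees satisfy it, so it cannot be maintained for most values of n.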
Avoiding the Worst Case
Take III:
The AVL condition
AVL condition: For every node, the height of its left subtree and right subtree differ by at most 1.
This actually works. To convince you it works, we have to check:
1. Such a tree must have height O(log n).
2. We must be able to maintain this property when inserting/deleting.
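The condition itself is easy to check with one pass over the tree. The sketch below uses hypothetical names, ignores keys entirely, and takes the height of an empty subtree to be -1 so that a single node has height 0. It only checks the condition; it does not show how to maintain it.

```java
// Hypothetical sketch: checking the AVL condition on a plain binary tree shape.
class AvlCheck {
    static class Node {
        Node left;
        Node right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    private static final int UNBALANCED = Integer.MIN_VALUE;

    // Returns the height of the subtree (-1 if empty), or UNBALANCED
    // if some node in it violates the AVL condition.
    private static int checkedHeight(Node n) {
        if (n == null) {
            return -1;
        }
        int l = checkedHeight(n.left);
        if (l == UNBALANCED) {
            return UNBALANCED;
        }
        int r = checkedHeight(n.right);
        if (r == UNBALANCED) {
            return UNBALANCED;
        }
        if (Math.abs(l - r) > 1) {
            return UNBALANCED; // this node's subtrees differ by more than 1
        }
        return 1 + Math.max(l, r);
    }

    static boolean isAvl(Node root) {
        return checkedHeight(root) != UNBALANCED;
    }
}
```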
Bounding the Height
Suppose you have a tree of height h, meeting the AVL condition.
What is the minimum number of nodes in the tree?
If h = 0, then 1 node.
If h = 1, then 2 nodes.
In general?
AVL condition: For every node, the height of its left subtree and right subtree differ by at most 1.
Bounding the Height
In general, let $N(h)$ be the minimum number of nodes in a tree of height $h$, meeting the AVL requirement.
$$N(h) = \begin{cases} 1 & \text{if } h = 0 \\ 2 & \text{if } h = 1 \\ N(h-1) + N(h-2) + 1 & \text{otherwise} \end{cases}$$
Bounding the Height
$$N(h) = \begin{cases} 1 & \text{if } h = 0 \\ 2 & \text{if } h = 1 \\ N(h-1) + N(h-2) + 1 & \text{otherwise} \end{cases}$$
We can try unrolling or recursion trees.
Bounding the Height
$$N(h) = \begin{cases} 1 & \text{if } h = 0 \\ 2 & \text{if } h = 1 \\ N(h-1) + N(h-2) + 1 & \text{otherwise} \end{cases}$$
When unrolling we'll quickly realize:
-Something with Fibonacci numbers is going on.
-It's really hard to exactly describe the pattern.
The real solution (using deep math magic beyond this course) is
$N(h) \ge \phi^h - 1$ where $\phi$ is $\frac{1 + \sqrt{5}}{2} \approx 1.62$
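The recurrence is easy to tabulate, which gives a quick check of the bound for small heights. The class name is hypothetical and this is a numerical sanity check, not a proof. Since a tree of height h with n nodes has n ≥ N(h) ≥ φ^h - 1, it follows that h ≤ log_φ(n + 1), which is O(log n).

```java
// Hypothetical helper for tabulating the recurrence.
class AvlMinNodes {
    // N(h): minimum number of nodes in a tree of height h meeting the AVL condition.
    static long minNodes(int h) {
        long prev2 = 1; // N(0)
        long prev1 = 2; // N(1)
        if (h == 0) {
            return prev2;
        }
        for (int i = 2; i <= h; i++) {
            long next = prev1 + prev2 + 1; // N(i) = N(i-1) + N(i-2) + 1
            prev2 = prev1;
            prev1 = next;
        }
        return prev1;
    }

    public static void main(String[] args) {
        double phi = (1 + Math.sqrt(5)) / 2;
        for (int h = 0; h <= 20; h++) {
            System.out.printf("h=%d  N(h)=%d  phi^h - 1=%.2f%n",
                    h, minNodes(h), Math.pow(phi, h) - 1);
        }
    }
}
```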