Midterm
• The midterm is on Wednesday next week!
• The quiz contains 5 problems = 50 min + 0 min more
– Master Theorem / Examples
– Quicksort / Mergesort
– Binary Heaps / Binary Search Trees
– Depth/Breadth First Search
– Greedy Algorithm / Prim’s algorithm for MST
• You SHOULD know the algorithms and the Master Theorem very well!
– open book
– do not spend too much time on any one problem; attack them in the order in which you can make the most progress …
– you will be graded not only on correctness but also on the clarity with which you express it.
(it is better to be there :-))
Data Structures: Lists (11.2-3/10.2-3)
• Data Structures: Dictionaries
– Search(S, k): returns a pointer to the element x with key k
– Insert(S, x): adds the new element pointed to by x
– Delete(S, x): deletes x, where x is a pointer to the element
• Linked Lists
– support all dictionary operations, but slowly
– insertion and deletion are O(1)
– searching is the drawback: O(n)
• Dictionary data structures:
– Hashing
[Figure: a record x consisting of a key and satellite data]
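The dictionary operations on a linked list can be sketched as follows. This is a minimal illustration, assuming a doubly linked list (so that Delete, given a pointer, runs in O(1)); the class names `Node` and `LinkedList` are hypothetical and not taken from the slides.

```python
class Node:
    """A record x: a key plus satellite data, linked to its neighbours."""
    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.prev = self.next = None


class LinkedList:
    """Doubly linked list supporting the three dictionary operations."""
    def __init__(self):
        self.head = None

    def search(self, k):
        # O(n): walk the list until the key is found or the list ends.
        x = self.head
        while x is not None and x.key != k:
            x = x.next
        return x

    def insert(self, x):
        # O(1): splice x in at the head of the list.
        x.prev = None
        x.next = self.head
        if self.head is not None:
            self.head.prev = x
        self.head = x

    def delete(self, x):
        # O(1): x is a pointer to the node itself, so no search is needed.
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.head = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None
```

Only `search` has to traverse the list, which is why it is the O(n) bottleneck.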
Data Structures: Hashing (12/11 -12.4/11.4)
• Idea: if the keys are distinct and lie in [0..m-1], set up an array T[0..m-1] in which T[i] = x, where key[x] = i.
• Direct-address Tables
– operations take O(1)
• Problem: the range of keys is large (e.g. ASCII strings)
• Solution: use a hash function h to map keys into 0..m-1
• Problem: two keys may hash to the same slot (a collision); how do we resolve it?
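[Figure: a hash function h maps the actual keys K = {k1, k2, k3, k4}, drawn from the universe of keys U, into table slots 0..m-1; k2 and k4 land in slots h(k2) and h(k4), while k1 and k3 collide: h(k1) = h(k3)]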
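A direct-address table is small enough to sketch in full. The snippet below is an illustration under assumptions: the `Record` type and the class name `DirectAddressTable` are hypothetical, and keys are assumed to be distinct integers in 0..m-1.

```python
from collections import namedtuple

# Hypothetical record type: a key plus satellite data.
Record = namedtuple("Record", ["key", "data"])


class DirectAddressTable:
    """Array T[0..m-1] with T[i] = x where key[x] = i; every operation is O(1)."""
    def __init__(self, m):
        self.T = [None] * m

    def search(self, k):
        return self.T[k]

    def insert(self, x):
        self.T[x.key] = x

    def delete(self, x):
        self.T[x.key] = None


table = DirectAddressTable(8)
table.insert(Record(5, "five"))
found = table.search(5)      # the record stored under key 5
missing = table.search(2)    # None: nothing stored under key 2
```

The array needs one slot per possible key, which is what becomes impractical when the key range is large.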
Collision Resolution 12.2/11.2
• Open addressing (12.4) (key = satellite data)
– good when we do not delete (e.g. spell checking)
– the table needn’t be much bigger than the # of items
– idea: to insert, if the slot is full, try another slot until a free one is found
– to search, follow the same sequence of probes
• Chaining (items that hash to the same slot are linked into a list)
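[Figure: chaining. The actual keys K, drawn from the universe of keys U, hash into table slots 0..m-1; keys that collide (e.g. k1 and k3) are linked into one list in their slot]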
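The two strategies above can be sketched side by side. This is an illustrative sketch, not the slides' own code: the class names are hypothetical, `hash(k) % m` stands in for the hash function, and linear probing is assumed as the probe sequence (the slides only say "try another slot").

```python
class ChainedHashTable:
    """Chaining: each slot holds a list of the (key, value) pairs hashed to it."""
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _h(self, k):
        return hash(k) % self.m  # assumed hash function

    def insert(self, k, v):
        chain = self.slots[self._h(k)]
        for i, (key, _) in enumerate(chain):
            if key == k:              # key already present: overwrite
                chain[i] = (k, v)
                return
        chain.append((k, v))

    def search(self, k):
        for key, v in self.slots[self._h(k)]:
            if key == k:
                return v
        return None

    def delete(self, k):
        i = self._h(k)
        self.slots[i] = [(key, v) for key, v in self.slots[i] if key != k]


class LinearProbingSet:
    """Open addressing with linear probing; stores keys only, no deletion."""
    def __init__(self, m):
        self.m = m
        self.table = [None] * m

    def _probes(self, k):
        # Probe sequence: h(k), h(k)+1, ..., wrapping around the table.
        start = hash(k) % self.m
        for j in range(self.m):
            yield (start + j) % self.m

    def insert(self, k):
        for i in self._probes(k):
            if self.table[i] is None or self.table[i] == k:
                self.table[i] = k
                return i              # slot where k ended up
        raise OverflowError("hash table is full")

    def contains(self, k):
        # Follow the same probes as insert; an empty slot means k is absent.
        for i in self._probes(k):
            if self.table[i] is None:
                return False
            if self.table[i] == k:
                return True
        return False
```

Stopping a search at the first empty slot is what makes deletion awkward under open addressing: emptying a slot could cut off keys stored further along the same probe sequence.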
Analysis of Chaining 12.2/11.2
• Assume: simple uniform hashing
– each key is equally likely to be hashed into each slot
• n keys and m slots: load factor $\alpha = n/m$
– average # of keys per slot
• Cost of successful search = $\Theta(1 + \alpha/2) = \Theta(1 + \alpha)$
– 1 to access the slot + $\alpha/2$ expected time to search the list
• Cost of unsuccessful search = $\Theta(1 + \alpha)$
– 1 to access the slot + $\alpha$ expected time to search the list
• Cost is O(1) if $\alpha = O(1)$ (i.e. n = O(m))
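A small worked example may help fix the quantities. The numbers n = 2000 and m = 1000 are assumptions chosen for illustration, and the step counts are the expected values under simple uniform hashing, ignoring constant factors:

```latex
\begin{align*}
\alpha &= \frac{n}{m} = \frac{2000}{1000} = 2, \\
\text{unsuccessful search} &\approx 1 + \alpha = 3 \text{ steps}, \\
\text{successful search} &\approx 1 + \frac{\alpha}{2} = 2 \text{ steps}.
\end{align*}
```

Doubling n with m fixed doubles $\alpha$, so the table has to grow with n to keep searches at O(1).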
Hash Functions 12.3/11.3
• Choice of hash functions
– should distribute keys uniformly into slots
– regularity in the key distribution should not affect uniformity
• h maps 1,...,100 to 1, 101,...,200 to 2, ...: may be bad if all keys are between 1 and 100
• Division for hashing (12.3.1/11.3.1)
– h(k) = k mod m, i.e. the hash function maps a key to the remainder (residue) of the division of k by m
– don’t pick m = 2^p: the hash then depends only on the last p bits of k
– pick m = a prime not too close to a power of 2 or 10
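The effect of the choice of m can be seen on a handful of keys. In this sketch the function name `h_div` and the sample keys are assumptions made for illustration; the keys differ only in their two highest-order bits.

```python
def h_div(k, m):
    """Division method: the remainder of k divided by m."""
    return k % m


# Four 8-bit keys that share their low-order six bits.
keys = [0b00010110, 0b01010110, 0b10010110, 0b11010110]  # 22, 86, 150, 214

# m = 2^4: only the last 4 bits of each key matter, so all four collide.
slots_pow2 = [h_div(k, 16) for k in keys]    # [6, 6, 6, 6]

# m = 13 (a prime): every bit of the key influences the slot.
slots_prime = [h_div(k, 13) for k in keys]   # [9, 8, 7, 6]
```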
Multiplication for Hashing (12.3.2/11.3.2)
• Constant 0 < A < 1
• m = 2^p
• multiplication: $h(k) = \lfloor m\,(kA - \lfloor kA \rfloor) \rfloor$
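A minimal sketch of the multiplication method follows. The function name `h_mul` is hypothetical, and the constant A = (√5 − 1)/2 ≈ 0.618 is an assumption (a commonly suggested value; the slides only require 0 < A < 1). Floating-point arithmetic is used for brevity; a production version would more likely use fixed-width integer arithmetic.

```python
import math

A = (math.sqrt(5) - 1) / 2  # assumed constant, about 0.618


def h_mul(k, p, a=A):
    """Multiplication method with m = 2^p: floor(m * fractional part of k*a)."""
    m = 1 << p
    frac = (k * a) % 1.0        # kA - floor(kA)
    return math.floor(m * frac)


slot = h_mul(123456, 14)        # m = 2^14 = 16384; gives slot 67
```

Here kA ≈ 76300.0041151, so the fractional part is about 0.0041151 and 16384 × 0.0041151 ≈ 67.4, which floors to 67.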
Universal Hashing 12.3.3/11.3.3
• Application: identifiers in a program.
• Problem: for any fixed hash function there is a bad input (a set of keys that all hash to the same slot)
• Idea: choose the hash function at random, independently of the keys
• Def: U - key universe, H - set of hash functions. H is universal if
$\forall x, y \in U,\ x \ne y:\ |\{h \in H : h(x) = h(y)\}| = |H|/m$
– i.e. the chance of a collision between x and y is 1/m if h is chosen at random from H
• If h is chosen randomly from H, then the expected # of collisions for any particular key x is <= n/m
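The n/m bound follows from linearity of expectation. A short derivation, assuming the table holds n keys and writing $C_x$ for the number of stored keys $y \ne x$ that collide with x (the symbol $C_x$ is introduced here for convenience):

```latex
\begin{align*}
E[C_x] &= \sum_{y \ne x} \Pr[h(x) = h(y)] \\
       &= \sum_{y \ne x} \frac{1}{m} \\
       &\le \frac{n}{m}.
\end{align*}
```

The sum has at most n terms, one per stored key other than x, and each term equals 1/m by the definition of a universal class.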
Universal Hashing 12.3.3/11.3.3
• Design of universal hashing
– decompose key x into r+1 bytes, $x = \langle x_0, x_1, \ldots, x_r \rangle$, s.t. $x_i < m$
– let $a = \langle a_0, a_1, \ldots, a_r \rangle$ be r+1 numbers chosen randomly from the set {0,1,...,m-1}
– define a corresponding hash function $h_a(x) = \sum_{i=0}^{r} a_i x_i \bmod m$
– the class $H = \bigcup_a \{h_a\}$ has $m^{r+1}$ members
• Th. The class H defined above is universal
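The construction can be sketched and checked by brute force on a tiny instance. The function names below are hypothetical, keys are assumed to be already decomposed into tuples of r+1 values smaller than m, and m is assumed to be prime (the standard proof of the theorem relies on this).

```python
import random
from itertools import product


def make_hash(a, m):
    """Return h_a for a fixed coefficient vector a = (a_0, ..., a_r)."""
    def h(x):
        return sum(ai * xi for ai, xi in zip(a, x)) % m
    return h


def random_hash(r, m, rng=random):
    """Pick h_a at random: r+1 coefficients drawn from {0, ..., m-1}."""
    a = [rng.randrange(m) for _ in range(r + 1)]
    return make_hash(a, m)


def colliding_functions(x, y, m):
    """Count the members of H under which x and y collide (brute force)."""
    count = 0
    for a in product(range(m), repeat=len(x)):
        h = make_hash(a, m)
        if h(x) == h(y):
            count += 1
    return count


# With m = 5 and r + 1 = 2 the class has 5^2 = 25 members, and two distinct
# keys should collide under exactly |H|/m = 5 of them.
collisions = colliding_functions((1, 2), (3, 4), 5)
```

Enumerating all coefficient vectors is only feasible for toy sizes; in practice a single call to `random_hash` when the table is created is all that is needed.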