CISC220 Fall 2009
James Atlas
Dec 04: Hashing and Maps
K+W Chapter 9
[Figure: Map Picture]
Map usage (STL)
• The map can be used like an array, except that the Key_Type is the index. Example:
    map<string, string> a_map;
    a_map["J"] = "Jane";
    a_map["B"] = "Bill";
    a_map["S"] = "Sam";
    a_map["B1"] = "Bob";
    a_map["B2"] = "Bill";
– If a mapping exists, assignment will replace it.
– If a mapping does not exist, a reference will create one with a default value.
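The two rules above can be checked with a short sketch. The helper names build_example_map and contains_key are hypothetical, introduced only for illustration; the replaced value "Barbara" is also made up.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: builds the map from the slide, then reassigns one key.
std::map<std::string, std::string> build_example_map() {
    std::map<std::string, std::string> a_map;
    a_map["J"] = "Jane";
    a_map["B"] = "Bill";
    a_map["S"] = "Sam";
    a_map["B1"] = "Bob";
    a_map["B2"] = "Bill";
    a_map["B"] = "Barbara";      // key "B" already exists, so its value is replaced
    assert(a_map.size() == 5);   // still five distinct keys
    return a_map;
}

// Hypothetical helper: tests for a key with find(), which never inserts.
bool contains_key(const std::map<std::string, std::string>& m,
                  const std::string& key) {
    return m.find(key) != m.end();
}
```

Reading a_map["Z"] on a map built this way would insert "Z" with an empty string as its value, so find() is the safer choice when a lookup must not modify the map.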
Hash Tables
• Goal: access an item given its key (not its position)
– We wish to avoid much searching
• Hash tables provide this capability
– Constant time in the average case: O(1)
– Linear time in the worst case: O(n)
• For comparison, searching an unsorted array is O(n) and searching a balanced BST is O(log n)
911 call center example
• Key: phone #
• Value: address
Address calculator
[Figure: a search key is passed to an address calculator, which selects one slot of an array table indexed 0 to n-1]
Address calculator
[Figure: the search key 302-831-2712 is passed to the address calculator, which selects the slot of the array table (indexed 0 to n-1) holding "101 Smith Hall, Newark, DE"]
If we had 9999999999 table entries, this would be easy!
Hash Function
• Hash function: Integer -> Integer in the range [0, n-1]
• Any ideas for our phone number?
• mod(x, n) -> y such that y is in the range [0, n-1]
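A minimal sketch of the mod-based hash function, assuming the phone number is stored as a plain 10-digit integer. The name phone_hash is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: maps a phone number to a slot in [0, table_size - 1].
// A 64-bit type is used because a 10-digit number such as 3028312712
// does not fit in a signed 32-bit int.
std::size_t phone_hash(std::uint64_t phone_number, std::size_t table_size) {
    assert(table_size > 0);  // mod by zero is undefined
    return static_cast<std::size_t>(phone_number % table_size);
}
```

With a table size of 10000, the result is just the last four digits of the number, so 302-831-2712 lands in slot 2712.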
Address calculator
[Figure: the hash function mod(10000) maps 302-831-2712 to the slot of the array table (indexed 0 to n-1) holding "101 Smith Hall, Newark, DE"; which slot does 302-737-2712 map to?]
Collision
[Figure: the hash function mod(10000) sends both 302-831-2712 and 302-737-2712 to the same slot of the array table, which already holds "101 Smith Hall, Newark, DE"]
• Two keys map to the same location
How could we resolve collisions?
• Goal is to still be able to insert, delete, and search based on key
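Open Addressing
• The data for a key can be stored in any one of several locations
• Probe sequence: the order in which those locations are checked
[Figure: both 302-831-2712 and 302-737-2712 pass through the hash function mod(10000); the array table stores 302-831-2712 (101 Smith Hall, Newark, DE) in one slot and 302-737-2712 (136 Main St.) in another]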
Linear Probing
• s = starting point (based on hash function)
• i = 0
• do
– check position (s + i) mod n
– i = i + 1
• while not found (and the position checked is not empty)
• Probe sequence: s, s+1, s+2, s+3, s+4, etc., wrapping around to 0 after position n-1
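The loop above might look like the following in C++. This is a sketch under stated assumptions, not a full implementation: the class name LinearProbeTable and its members are hypothetical, the table has a fixed size, and deletion is left out because it needs extra bookkeeping in open addressing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical fixed-size table using linear probing (no deletion, no resizing).
class LinearProbeTable {
public:
    explicit LinearProbeTable(std::size_t n) : slots_(n) { assert(n > 0); }

    // Stores key/value; returns false if every slot is taken by other keys.
    bool insert(std::uint64_t key, const std::string& value) {
        const std::size_t n = slots_.size();
        const std::size_t s = static_cast<std::size_t>(key % n);  // starting point
        for (std::size_t i = 0; i < n; ++i) {
            Slot& slot = slots_[(s + i) % n];   // s, s+1, s+2, ... with wraparound
            if (!slot.used || slot.key == key) {
                slot.used = true;
                slot.key = key;
                slot.value = value;
                return true;
            }
        }
        return false;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(std::uint64_t key) const {
        const std::size_t n = slots_.size();
        const std::size_t s = static_cast<std::size_t>(key % n);
        for (std::size_t i = 0; i < n; ++i) {
            const Slot& slot = slots_[(s + i) % n];
            if (!slot.used) return nullptr;     // an empty slot ends the probe sequence
            if (slot.key == key) return &slot.value;
        }
        return nullptr;                         // probed every slot without a match
    }

private:
    struct Slot {
        bool used = false;
        std::uint64_t key = 0;
        std::string value;
    };
    std::vector<Slot> slots_;
};
```

With a table of size 10000, inserting 3028312712 and then 3027372712 places the first key in slot 2712 and the second in slot 2713, and find() reaches the second key after two probes.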
Quadratic Probing
• s = starting point (based on hash function)
• i = 0
• do
– check position (s + i*i) mod n
– i = i + 1
• while not found (and the position checked is not empty)
• Probe sequence: s, s+1, s+4, s+9, s+16, etc. (each taken mod n)
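To make the quadratic probe sequence concrete, here is a small sketch that only lists the positions visited. The function name quadratic_probe_sequence is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the first `count` positions visited by
// quadratic probing from starting point s in a table of size n.
std::vector<std::size_t> quadratic_probe_sequence(std::size_t s,
                                                  std::size_t n,
                                                  std::size_t count) {
    assert(n > 0);
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < count; ++i) {
        positions.push_back((s + i * i) % n);  // s, s+1, s+4, s+9, ... mod n
    }
    return positions;
}
```

Starting from s = 2712 in a table of size 10000, the first five positions are 2712, 2713, 2716, 2721, and 2728. Unlike linear probing, the sequence is not guaranteed to reach every slot: with n = 8 and s = 0 the first eight probes only ever land on slots 0, 1, and 4.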
Another collision resolution: Chaining
• Each entry in the table is actually a linked list
[Figure: both 302-831-2712 and 302-737-2712 pass through the hash function mod(10000) to the same slot of the array table, whose linked list holds 302-831-2712 (101 Smith Hall, Newark, DE) followed by 302-737-2712 (136 Main St.)]
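A minimal sketch of chaining, assuming integer phone-number keys and string addresses. The class name ChainedTable and its members are hypothetical; std::list stands in for the linked list in each table entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table where each slot holds a linked list of (key, value) pairs.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t n) : buckets_(n) { assert(n > 0); }

    // Replaces the value if the key is already in its chain, else appends.
    void insert(std::uint64_t key, const std::string& value) {
        auto& chain = buckets_[key % buckets_.size()];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(std::uint64_t key) const {
        const auto& chain = buckets_[key % buckets_.size()];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    // Removes the key from its chain; returns false if it was not there.
    bool erase(std::uint64_t key) {
        auto& chain = buckets_[key % buckets_.size()];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->first == key) { chain.erase(it); return true; }
        }
        return false;
    }

    // Exposed only so the example can show how long a chain has grown.
    std::size_t chain_length(std::size_t index) const {
        return buckets_[index].size();
    }

private:
    std::vector<std::list<std::pair<std::uint64_t, std::string>>> buckets_;
};
```

With 10000 buckets, both phone numbers from the figure end up in the chain at index 2712. Deleting one of them only unlinks a list node, which is simpler than deletion under open addressing.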
Performance of Hash Tables
• Load factor L = # filled cells / table size
– Between 0 and 1 for open addressing
• Load factor has the greatest effect on performance
• Lower load factor, better performance
– Reduce collisions in sparsely populated tables
• Average expected # probes for open addressing:
– linear probing = ½(1 + 1/(1-L))
– quadratic probing = ½(1 + 1/(1-L))²
• Average expected # probes for chaining = 1 + (L/2)
– Note: Here L can be greater than 1
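The linear probing and chaining formulas can be evaluated directly. The sketch below covers only those two; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Load factor L = filled cells / table size.
double load_factor(std::size_t filled, std::size_t table_size) {
    assert(table_size > 0);
    return static_cast<double>(filled) / static_cast<double>(table_size);
}

// Average expected probes for linear probing: 1/2 * (1 + 1/(1 - L)).
double expected_probes_linear(double L) {
    assert(L >= 0.0 && L < 1.0);  // the formula diverges as L approaches 1
    return 0.5 * (1.0 + 1.0 / (1.0 - L));
}

// Average expected probes for chaining: 1 + L/2 (L may exceed 1 here).
double expected_probes_chaining(double L) {
    assert(L >= 0.0);
    return 1.0 + L / 2.0;
}
```

For a half-full table (L = 0.5) these give 1.5 probes for linear probing and 1.25 for chaining. At L = 0.75, linear probing rises to 2.5 while chaining is still below 1.4.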
Performance of Hash Tables (2)
Number of probes by load factor L:
L       Linear Probing   Chaining
0       1.00             1.00
0.25    1.17             1.13
0.5     1.50             1.25
0.75    2.50             1.38
0.85    3.83             1.43
0.9     5.50             1.45
0.95    10.50            1.48
Performance of Hash Tables (3)
• Hash table:
– Insert: average O(1)
– Search: average O(1)
• Sorted array:
– Insert: average O(n)
– Search: average O(log n)
• Binary Search Tree:
– Insert: average O(log n)
– Search: average O(log n)
• But balanced trees can guarantee worst case O(log n)
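In the STL, this trade-off shows up as the choice between std::unordered_map (a hash table with average constant-time lookup) and std::map (an ordered container with guaranteed logarithmic lookup). The sketch below uses the phone-number example; the Directory alias and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical 911-style directory: phone number -> address, hash-based.
using Directory = std::unordered_map<std::uint64_t, std::string>;

Directory build_directory() {
    Directory d;
    d[3028312712ULL] = "101 Smith Hall, Newark, DE";
    d[3027372712ULL] = "136 Main St.";
    assert(d.size() == 2);
    return d;
}

// Returns the address for a phone number, or an empty string if unknown.
std::string lookup(const Directory& d, std::uint64_t phone) {
    auto it = d.find(phone);  // average O(1), worst case O(n)
    return it == d.end() ? std::string() : it->second;
}

// Copies the entries into a std::map, which keeps keys in sorted order
// and guarantees O(log n) lookup.
std::map<std::uint64_t, std::string> sorted_copy(const Directory& d) {
    return std::map<std::uint64_t, std::string>(d.begin(), d.end());
}
```

The hash-based container is the usual choice when only lookups by key matter. The ordered one is worth its extra cost when the keys must be visited in sorted order or when a worst-case bound is required.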