CISC220 Fall 2009, James Atlas, Dec 04: Hashing and Maps (K+W Chapter 9)



Map Picture

Map usage (STL)

• The map can be used like an array, except that the Key_Type is the index. Example:

    map<string, string> a_map;
    a_map["J"] = "Jane";
    a_map["B"] = "Bill";
    a_map["S"] = "Sam";
    a_map["B1"] = "Bob";
    a_map["B2"] = "Bill";

– If a mapping exists, assignment will replace it.
– If a mapping does not exist, a reference to the key will create one with a default value.
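A minimal sketch of the two rules above, using std::map from the C++ standard library. The helper names build_example_map and demo_rules are made up for illustration and are not from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: builds the map from the slide's example.
std::map<std::string, std::string> build_example_map() {
    std::map<std::string, std::string> a_map;
    a_map["J"] = "Jane";
    a_map["B"] = "Bill";
    a_map["S"] = "Sam";
    a_map["B1"] = "Bob";
    a_map["B2"] = "Bill";
    return a_map;
}

// Rule 1: assigning through an existing key replaces the mapped value.
// Rule 2: merely referencing a missing key with operator[] inserts it
//         with a default-constructed value (an empty string here).
std::size_t demo_rules(std::map<std::string, std::string>& m) {
    m["B"] = "Barbara";       // replaces "Bill"
    std::string v = m["Z"];   // "Z" did not exist; it now maps to ""
    assert(v.empty());
    return m.size();          // one larger than before, because of "Z"
}
```

To look up a key without creating it, find() or count() can be used instead of operator[].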

Hash Tables

• Goal: access item given its key (not its position)
– we wish to avoid much searching
• Hash tables provide this capability
– Constant time in the average case! O(1)
– Linear time in the worst case O(n)
• Searching an array: O(n)
• Searching BST: O(log n)

911 call center example

• Key: phone #
• Value: address

Address calculator

[Figure: a search key goes into an address calculator, which yields a position in an array table with slots 0, 1, ..., n-1. Example: the key 302-831-2712 leads to the slot holding "101 Smith Hall, Newark, DE".]

If we had 9999999999 table entries, this would be easy!

Hash Function

Hash function: Integer -> Integer in the range [0, n-1]

Any ideas for our phone number?

mod(x, n) -> y such that y is in the range [0, n-1]
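A small sketch of the mod idea for the phone number example. The function name hash_mod and the choice to store a phone number as a 10-digit integer are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps an integer key into the range [0, n-1].
// A 10-digit phone number does not fit in a signed 32-bit integer,
// so a 64-bit unsigned type is used for the key.
std::uint64_t hash_mod(std::uint64_t key, std::uint64_t n) {
    assert(n > 0);    // mod by zero is undefined
    return key % n;   // always in [0, n-1]
}
```

With n = 10000, the key 3028312712 (302-831-2712) lands in slot 2712, that is, its last four digits.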

Address calculator

[Figure: the hash function mod(10000) maps the key 302-831-2712 to a slot in the array table (indices 0 to n-1) that holds "101 Smith Hall, Newark, DE". Where does 302-737-2712 go?]

Collision

Two keys map to the same location: 302-831-2712 and 302-737-2712 share their last four digits, so mod(10000) sends both to slot 2712.

How could we resolve collisions?

• Goal is to still be able to insert, delete, and search based on key

Open Addressing

• Data for a key can be at multiple locations
• Probe sequence

[Figure: array table under the hash function mod(10000). 302-831-2712 (101 Smith Hall, Newark, DE) occupies one slot; the colliding key 302-737-2712 (136 Main St.) is stored in a different slot found by probing.]

Linear Probing

• s = starting point (based on hash function)
• i = 0
• do
– check position s + i (wrapping around to 0 past the end of the table)
– i = i + 1;
• while not found
• s, s+1, s+2, s+3, s+4, etc.
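The loop above can be sketched as a small open-addressing table. This is an illustration under stated assumptions, not the course's implementation: the names Slot and LinearProbeTable are hypothetical, keys are phone numbers stored as integers, the hash is key mod table size, and deletion and resizing are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical slot type: `used` marks whether the cell holds an entry.
struct Slot {
    bool used = false;
    std::uint64_t key = 0;
    std::string value;
};

class LinearProbeTable {
public:
    explicit LinearProbeTable(std::size_t n) : slots_(n) { assert(n > 0); }

    // Returns false only if every slot is taken by another key.
    bool insert(std::uint64_t key, const std::string& value) {
        const std::size_t n = slots_.size();
        const std::size_t s = static_cast<std::size_t>(key % n);  // starting point
        for (std::size_t i = 0; i < n; ++i) {
            Slot& slot = slots_[(s + i) % n];   // s, s+1, s+2, ... with wrap-around
            if (!slot.used || slot.key == key) {
                slot.used = true;
                slot.key = key;
                slot.value = value;
                return true;
            }
        }
        return false;
    }

    // Returns the index of the slot holding `key`, or -1 if it is absent.
    long find_slot(std::uint64_t key) const {
        const std::size_t n = slots_.size();
        const std::size_t s = static_cast<std::size_t>(key % n);
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t pos = (s + i) % n;
            if (!slots_[pos].used) return -1;   // empty slot ends the probe sequence
            if (slots_[pos].key == key) return static_cast<long>(pos);
        }
        return -1;
    }

    const std::string& value_at(long pos) const {
        return slots_[static_cast<std::size_t>(pos)].value;
    }

private:
    std::vector<Slot> slots_;
};
```

With a table of size 10000, inserting 3028312712 and then 3027372712 places the first key in slot 2712 and the second in slot 2713.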

Quadratic Probing

• s = starting point (based on hash function)
• i = 0
• do
– check position s + i*i (wrapping around to 0 past the end of the table)
– i = i + 1;
• while not found
• s, s+1, s+4, s+9, s+16, etc.
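To make the probe sequence concrete, here is a short sketch that lists the positions quadratic probing would check. The function name quadratic_probes and its parameters are hypothetical; positions wrap around with mod n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the first `count` positions visited by quadratic
// probing, starting from s in a table of size n.
std::vector<std::size_t> quadratic_probes(std::size_t s, std::size_t n,
                                          std::size_t count) {
    assert(n > 0);
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < count; ++i) {
        positions.push_back((s + i * i) % n);   // s, s+1, s+4, s+9, ...
    }
    return positions;
}
```

Unlike linear probing, this sequence is not guaranteed to reach every slot of the table, so an insert loop built on it needs its own stopping rule.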

Another collision resolution: Chaining

• Each entry in table is actually a linked list

[Figure: array table under the hash function mod(10000). The slot shared by 302-831-2712 and 302-737-2712 holds a linked list: 302-831-2712 (101 Smith Hall, Newark, DE) followed by 302-737-2712 (136 Main St.).]
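Chaining can be sketched with a vector of linked lists. The class name ChainedTable and its methods are hypothetical, keys are phone numbers stored as integers, and the hash is again key mod table size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <string>
#include <utility>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t n) : buckets_(n) { assert(n > 0); }

    // Replaces the value if the key is already in its chain, else appends.
    void insert(std::uint64_t key, const std::string& value) {
        auto& chain = buckets_[key % buckets_.size()];
        for (auto& entry : chain) {
            if (entry.first == key) {
                entry.second = value;
                return;
            }
        }
        chain.push_back(std::make_pair(key, value));
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(std::uint64_t key) const {
        const auto& chain = buckets_[key % buckets_.size()];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    // Number of entries sharing this key's slot.
    std::size_t chain_length(std::uint64_t key) const {
        return buckets_[key % buckets_.size()].size();
    }

private:
    std::vector<std::list<std::pair<std::uint64_t, std::string>>> buckets_;
};
```

Both colliding phone numbers end up in the same chain, so a search walks that list instead of probing other slots.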

Performance of Hash Tables

• Load factor L = # filled cells / table size
– Between 0 and 1
• Load factor has greatest effect on performance
• Lower load factor, better performance
– Reduce collisions in sparsely populated tables
• Average expected # probes for open addressing:
– linear probing = ½(1 + 1/(1-L))
– quadratic probing = ½(1 + 1/(1-L))²
• For chaining = 1 + (L/2)
– Note: Here L can be greater than 1

Performance of Hash Tables (2)

Number of probes:

    L       Linear Probing   Chaining
    0       1.00             1.00
    0.25    1.17             1.13
    0.5     1.50             1.25
    0.75    2.50             1.38
    0.85    3.83             1.43
    0.9     5.50             1.45
    0.95    10.50            1.48
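The linear probing and chaining columns follow directly from the two formulas ½(1 + 1/(1-L)) and 1 + L/2. A short sketch, with hypothetical function names, that evaluates them for a given load factor L:

```cpp
#include <cassert>

// Expected probes for open addressing with linear probing.
// Only defined for a load factor L in [0, 1).
double expected_probes_linear(double L) {
    assert(L >= 0.0 && L < 1.0);
    return 0.5 * (1.0 + 1.0 / (1.0 - L));
}

// Expected probes for chaining; here L may be greater than 1.
double expected_probes_chaining(double L) {
    assert(L >= 0.0);
    return 1.0 + L / 2.0;
}
```

For L = 0.5 these give 1.50 and 1.25; for L = 0.75 they give 2.50 and 1.375, which rounds to 1.38.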

Performance of Hash Tables (3)

• Hash table:
– Insert: average O(1)
– Search: average O(1)
• Sorted array:
– Insert: average O(n)
– Search: average O(log n)
• Binary Search Tree:
– Insert: average O(log n)
– Search: average O(log n)
• But balanced trees can guarantee worst case O(log n)
