Post on 21-Jan-2016


COSC 1030 Lecture 10

Hash Table

Topics

– Table
– Hash Concept
– Hash Function
– Resolve collision
– Complexity Analysis

Table

Table
– A collection of entries
– Entry: <key, info>
– Insert, search and delete
– Update and retrieve

Array representation
– Indexed
– Maps key to index

Hash Table

Hash Table
– A table
– Key range >> table size
– Many-to-one mapping (hashing)
– Indexed – hash code as index

Tabbed Address Book
– Map names to A:Z
– Multiple names start with the same letter
– Same tab, sequential slots

Hash Table ADT

interface Hashtable {
    void insert(Item anItem);
    Item search(Key aKey);
    boolean remove(Key aKey);
    boolean isFull();
    boolean isEmpty();
}

Hash Function

Maps key to index evenly
For any n in N, hash(n) = n mod M, where M is the size of the hash table.
hash(k*M + n) = n, where n < M and k is an integer
Map to an integer first if the key is not an integer
– A:Z → 0:25
– String s → h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1)
– String s → h(s[0])*26^(n-1) + … + h(s[n-1])

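Hash Function

String s → h(s[0])*26^(n-1) + … + h(s[n-1])

// Convert a string to an integer in base 26 (Horner's rule).
// toInt(char) maps a letter to 0..25; it is not shown on the slide.
int toInt(String s) {
    assert(s != null);
    int c = 0;
    for (int i = 0; i < s.length(); i++) {
        c = c*26 + toInt(s.charAt(i));
    }
    return c;
}

// Hash a string by hashing its integer value.
int hash(String s) { return hash(toInt(s)); }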
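For long strings the base-26 value above grows past the range of an int. One common way around this, sketched here as an assumption rather than the lecture's method, is to reduce modulo the table size at every step. The class name StringHash, the parameter m, and the upper-case-only mapping are illustrative.

```java
// Sketch: Horner-style string hashing that reduces modulo the table
// size m at every step, so the running value stays below m.
// Assumes upper-case keys, with A..Z mapped to 0..25, and a modest m
// (so that c * 26 + 25 fits in an int).
class StringHash {
    static int toInt(char c) { return c - 'A'; }

    static int hash(String s, int m) {
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = (c * 26 + toInt(s.charAt(i))) % m;
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(hash("BA", 7));
    }
}
```

Because (a*26 + b) mod m depends only on a mod m, the result equals the full base-26 value taken mod m.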

Example

Table[7] – HASHTABLE_SIZE = 7
Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table
– Hash codes (key mod 7): 2, 0, 5, 4, 5

Collision
– The slot indexed by the hash code is already occupied

A simple solution
– Sequentially decrease the index until an empty slot is found or the table is full
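The example can be traced with a small sketch. It assumes the number in each entry name is its integer key (B2 has key 2, Z26 has key 26), which matches the hash codes listed; the class and method names are illustrative.

```java
// Sketch: trace the example with a 7-slot table and the
// "decrease the index until an empty slot is found" rule.
// Assumption: the number in each entry name is its integer key.
class LinearProbeExample {
    static final int HASHTABLE_SIZE = 7;

    // Inserts each key and returns the table of stored keys (null = empty).
    static Integer[] insertAll(int[] keys) {
        Integer[] table = new Integer[HASHTABLE_SIZE];
        for (int key : keys) {
            int index = key % HASHTABLE_SIZE; // hash code
            int probes = 0;
            while (table[index] != null && probes < HASHTABLE_SIZE) {
                index = (index - 1 + HASHTABLE_SIZE) % HASHTABLE_SIZE;
                probes++;
            }
            if (table[index] == null) table[index] = key; // dropped if full
        }
        return table;
    }

    public static void main(String[] args) {
        Integer[] table = insertAll(new int[] {2, 7, 12, 4, 26});
        for (int i = 0; i < table.length; i++) {
            System.out.println(i + ": " + table[i]);
        }
    }
}
```

Key 26 hashes to slot 5 (taken by 12), finds slot 4 taken by 4, and lands in slot 3.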

Collision Possibility

How often may a collision occur?
Insert 100 random numbers into a table of 200 slots
P(at least one collision) = 1 – ∏ (200 – i)/200, i = 0:99
= 1 – 6.66E-14 > 0.99999999999993

Load factor
– 100/200 = 0.5 = 50% → 0.99999999999993
– 20/200 = 0.1 = 10% → 0.63
– 10/200 = 0.05 = 5% → 0.2

Default load factor is 75% in Java Hashtable
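The probabilities above can be reproduced with a short sketch. It assumes keys are hashed uniformly and independently into the slots; the class and method names are illustrative.

```java
// Sketch: probability that at least one collision occurs when n random
// keys are hashed uniformly into m slots (a birthday-problem product).
class CollisionProbability {
    static double atLeastOneCollision(int n, int m) {
        double noCollision = 1.0;
        for (int i = 0; i < n; i++) {
            // The (i+1)-th key must avoid the i slots already taken.
            noCollision *= (double) (m - i) / m;
        }
        return 1.0 - noCollision;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 20, 10}) {
            System.out.printf("n=%d load=%.2f p=%.14f%n",
                    n, n / 200.0, atLeastOneCollision(n, 200));
        }
    }
}
```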

Primary Cluster

The biggest solid block in the hash table
Clusters join
The bigger the primary cluster is, the easier it is to grow
Distribute evenly to avoid a primary cluster

Probe Method

What can we do when a collision occurs?
– A consistent way of searching for an empty slot
– Probe

Linear probe – decrease index by 1, wrap around at 0
Double hash – use quotient to calculate decrement
– Max(1, (Key / M) % M)
Separate chaining – linked list to store colliding items
Hash tree – link to another hash table (A4)
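Separate chaining, listed above, can be sketched as follows. This is a minimal illustration that stores bare integer keys; ChainedTable and its methods are assumed names, not code from the lecture.

```java
import java.util.LinkedList;

// Sketch of separate chaining: each slot holds a linked list of the
// keys that hash to it, so a collision just lengthens one list.
class ChainedTable {
    private final LinkedList<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.floorMod(key, buckets.length); }

    void insert(int key) {
        if (!search(key)) buckets[hash(key)].add(key); // no duplicates
    }

    boolean search(int key) { return buckets[hash(key)].contains(key); }

    boolean remove(int key) {
        // Integer.valueOf selects remove(Object) rather than remove(int index).
        return buckets[hash(key)].remove(Integer.valueOf(key));
    }
}
```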

Probe sequence coverage

Ensure the probe sequence covers the whole table
– Utilizes the whole table
– Even distribution
– M and the probe decrement are relatively prime
  No common factor except 1
– Make M a prime number
  M and any decrement (< M) are relatively prime
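The effect of a common factor can be checked with a short sketch that counts the distinct slots a probe sequence visits. The class and method names are illustrative, and the decrement is assumed to lie between 1 and M.

```java
// Sketch: count how many distinct slots a probe sequence visits when
// stepping down by a fixed decrement dec in a table of size m.
// Assumes 0 <= start < m and 1 <= dec <= m.
class ProbeCoverage {
    static int slotsVisited(int m, int start, int dec) {
        boolean[] seen = new boolean[m];
        int index = start;
        int count = 0;
        while (!seen[index]) {
            seen[index] = true;
            count++;
            index -= dec;
            if (index < 0) index += m; // wrap around
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(slotsVisited(8, 0, 2)); // 8 and 2 share the factor 2
        System.out.println(slotsVisited(7, 0, 2)); // 7 is prime
    }
}
```

With M = 8 and decrement 2 the sequence only reaches half of the table; with M = 7 every decrement from 1 to 6 reaches all seven slots.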

Probe Method

void insert(Item item) {
    if (!isFull()) {
        // probe returns the index of an empty slot for this key
        int index = probe(item.key);
        assert(index >= 0 && index < M);
        table[index] = item;
        count++;
    }
}

Linear Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        do {
            index--;
            if (index < 0) index += HASHTABLE_SIZE; // wrap around
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1; // table is full
        else return index;
    }
}

Double Hash Probe Method

int probe(int key) {
    int hashcode = key % HASHTABLE_SIZE;
    if (table[hashcode] == null) {
        return hashcode;
    } else {
        int index = hashcode;
        // Decrement comes from the quotient; at least 1 so the probe moves
        int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;
        dec = Math.max(1, dec);
        do {
            index -= dec;
            if (index < 0) index += HASHTABLE_SIZE; // wrap around
        } while (index != hashcode && table[index] != null);
        if (index == hashcode) return -1; // no empty slot on the probe sequence
        else return index;
    }
}

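Search Method

Item search(int key) {
    int index = key % HASHTABLE_SIZE;
    int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
    // Follow the same probe sequence as insert; stop at an empty slot
    // or after HASHTABLE_SIZE probes (full table, key absent).
    for (int probes = 0; probes < HASHTABLE_SIZE && table[index] != null; probes++) {
        if (table[index].key == key) return table[index];
        index -= dec;
        if (index < 0) index += HASHTABLE_SIZE; // wrap around
    }
    return null;
}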

Delete Method

Difficulty with delete under open addressing
– Destroys the hash probe chain

Solution
– Set a deleted flag
– Search treats it as occupied
– Insert treats it as deleted (the slot can be reused)
– Forms primary cluster

Separate chaining
– Move one up from chained structure
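The deleted-flag idea for open addressing can be sketched as below. It uses linear probing with a decreasing index and stores bare integer keys; TombstoneTable and the three state constants are assumptions for illustration.

```java
// Sketch of lazy deletion under open addressing with linear probing.
// A DELETED slot keeps probe chains intact for search but can be
// reused by insert.
class TombstoneTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;

    TombstoneTable(int size) { keys = new int[size]; state = new int[size]; }

    // Returns the slot holding key, or -1. Only EMPTY stops the search.
    private int find(int key) {
        int m = keys.length;
        int index = Math.floorMod(key, m);
        for (int probes = 0; probes < m && state[index] != EMPTY; probes++) {
            if (state[index] == OCCUPIED && keys[index] == key) return index;
            index = (index - 1 + m) % m;
        }
        return -1;
    }

    boolean search(int key) { return find(key) >= 0; }

    boolean insert(int key) {
        if (find(key) >= 0) return true; // already present
        int m = keys.length;
        int index = Math.floorMod(key, m);
        for (int probes = 0; probes < m; probes++) {
            if (state[index] != OCCUPIED) { // EMPTY or DELETED: reuse
                keys[index] = key;
                state[index] = OCCUPIED;
                return true;
            }
            index = (index - 1 + m) % m;
        }
        return false; // no free slot
    }

    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        state[index] = DELETED; // leave a flag instead of emptying the slot
        return true;
    }
}
```

If remove emptied the slot instead, a later search for a key that had probed past it would stop early and miss the key.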

Efficiency

Successful search
– Best case – first hit, one comparison
– Average
  Half of the average length of the probe sequence
  Load factor dependent
  O(1) if load factor < 0.5
– Worst case – longest probe sequence
  Load factor dependent

Unsuccessful search
– Average – average length of the probe sequence
– Worst case – longest probe sequence

Advanced Topics

Choosing Hash Functions
– Generate hash code randomly and uniformly
– Use all bits of the key
– Assume K = b0b1b2b3
– Division
  h(k) = k % M; p(k) = max(1, (k / M) % M)
– Folding
  h(k) = (b1 ^ b3) % M; p(k) = (b0 ^ b2) % M; // ^ is XOR
– Middle squaring
  h(k) = (b1b2)^2 // squared, not XOR
– Truncating
  h(k) = b3;
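The four styles can be compared on a concrete key with the sketch below. It assumes K is a 32-bit int split into four bytes with b0 the most significant, and it reduces the middle-squaring and truncating results mod M so they can serve as indices; both choices are assumptions, as are all the names.

```java
// Sketch of the four hash-function styles for a 32-bit key viewed as
// four bytes b0 b1 b2 b3 (b0 most significant: an assumed layout).
class HashChoices {
    // Extracts byte i (0..3) of the key as a value in 0..255.
    static int b(int key, int i) { return (key >>> (8 * (3 - i))) & 0xFF; }

    static int division(int key, int m) { return Math.floorMod(key, m); }

    static int folding(int key, int m) { return (b(key, 1) ^ b(key, 3)) % m; }

    static int middleSquare(int key, int m) {
        int mid = (b(key, 1) << 8) | b(key, 2);  // the middle 16 bits b1b2
        return (int) (((long) mid * mid) % m);   // long avoids int overflow
    }

    static int truncating(int key, int m) { return b(key, 3) % m; }

    public static void main(String[] args) {
        int key = 0x12345678;
        System.out.println(division(key, 7) + " " + folding(key, 7) + " "
                + middleSquare(key, 7) + " " + truncating(key, 7));
    }
}
```

Truncating ignores three of the four bytes, which goes against the "use all bits of the key" guideline in the list above.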

Advanced Topics

Hash Tree
– Separate chained collision resolution
– Recursively hashing the key

(Diagram: a hash table whose slots link to further hash tables)

Hash Tree

void insert(int key, Item item) {
    int h = h(key);
    int k = g(key); // one-to-one mapping Key → Key
    if (table[h] == null) {
        table[h] = item;
    } else {
        // Slot taken: chain to a child hash tree and insert there with the remapped key
        if (table[h].link == null) table[h].link = new HashTree();
        table[h].link.insert(k, item);
    }
}
