Hash Discrete Mathematics and Its Applications Baojian Hua bjhua@ustc.edu.cn


Post on 21-Dec-2015


Page 1: Hash Discrete Mathematics and Its Applications Baojian Hua bjhua@ustc.edu.cn

Hash

Discrete Mathematics and Its Applications

Baojian Hua
bjhua@ustc.edu.cn

Page 2:

Searching

A dictionary-like data structure:

  contains a collection of tuple data: <k1, v1>, <k2, v2>, …
  keys are comparable and pairwise distinct
  supports these operations:
    new ()
    insert (dict, k, v)
    lookup (dict, k)
    delete (dict, k)

Page 3:

Examples

Application      Purpose       Key         Value
Phone Book       phone         name        phone No.
Bank             transaction   visa        $$$
Dictionary       lookup        word        meaning
compiler         symbol        variable    type
www.google.com   search        key words   contents
…                …             …           …

Page 4:

Summary So Far

op \ rep    array   sorted array   linked list   sorted linked list   binary search tree
lookup()    O(n)    O(lg n)        O(n)          O(n)                 O(n)
insert()    O(n)    O(n)           O(n)          O(n)                 O(n)
delete()    O(n)    O(n)           O(n)          O(n)                 O(n)

Page 5:

What’s the Problem?

For every mapping (k, v): after we insert it into the dictionary dict, we don’t know its position!

Ex: insert (d, “li”, 97), (d, “wang”, 99), (d, “zhang”, 100), … and then lookup (d, “zhang”);

[Figure: array cells holding (“li”, 97), …, (“wang”, 99), (“zhang”, 100)]
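To make the problem concrete, here is a minimal sketch of an unsorted array dictionary with string keys. The names `entry`, `array_dict`, `ad_insert` and `ad_lookup` and the fixed capacity are assumptions made for illustration; they do not come from the slides. Because a key tells us nothing about where its pair was stored, lookup has to scan the array.

```c
#include <stddef.h>
#include <string.h>

#define AD_CAPACITY 64  /* assumed fixed capacity, for illustration only */

struct entry {
    const char *key;
    int value;
};

struct array_dict {
    struct entry items[AD_CAPACITY];
    size_t count;
};

/* Append (k, v) at the next free slot; returns 0 on success, -1 if full.
   Assumes the caller never inserts the same key twice. */
int ad_insert(struct array_dict *d, const char *k, int v)
{
    if (d->count >= AD_CAPACITY)
        return -1;
    d->items[d->count].key = k;
    d->items[d->count].value = v;
    d->count++;
    return 0;
}

/* Linear scan: up to n string comparisons, because the position is unknown.
   Returns a pointer to the stored value, or NULL if the key is absent. */
const int *ad_lookup(const struct array_dict *d, const char *k)
{
    for (size_t i = 0; i < d->count; i++) {
        if (strcmp(d->items[i].key, k) == 0)
            return &d->items[i].value;
    }
    return NULL;
}
```

With the three pairs from the example inserted in order, `ad_lookup (&d, "zhang")` only succeeds after comparing against “li” and “wang” first.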

Page 6:

Basic Plan

Start from the array-based approach.
Use an array A to hold the elements (k, v).
For every key k:
  if we know its position (array index) i from k,
  then lookup, insert and delete are simple: each is one access to A[i], done in constant time O(1).

[Figure: (k, v) stored at array index i]
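The simplest instance of this plan is the case where the key itself is a small non-negative integer and can serve directly as the index. The sketch below assumes keys in the range 0 to `DA_SIZE`-1; `da_table`, `da_insert`, `da_lookup` and `da_delete` are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define DA_SIZE 128  /* assumed key range: 0 .. DA_SIZE-1 */

struct da_table {
    int present[DA_SIZE];  /* 1 if the slot holds a value */
    int value[DA_SIZE];
};

/* Store v at index k: one array access, so O(1).
   Returns -1 if k is outside the assumed range. */
int da_insert(struct da_table *t, int k, int v)
{
    if (k < 0 || k >= DA_SIZE)
        return -1;
    t->present[k] = 1;
    t->value[k] = v;
    return 0;
}

/* Return a pointer to the value stored for k, or NULL if there is none. */
const int *da_lookup(const struct da_table *t, int k)
{
    if (k < 0 || k >= DA_SIZE || !t->present[k])
        return NULL;
    return &t->value[k];
}

/* Mark the slot empty again; also O(1). */
void da_delete(struct da_table *t, int k)
{
    if (k >= 0 && k < DA_SIZE)
        t->present[k] = 0;
}
```

This only works while the key range is small and known in advance, which is why a general scheme needs a way to compute an index from an arbitrary key.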

Page 7:

Example

Ex: insert (d, “li”, 97), (d, “wang”, 99), (d, “zhang”, 100), …; and then lookup (d, “zhang”);

[Figure: (“li”, 97) to be stored at an unknown array index “?”]

Problem #1: How to calculate the index from the given key?

Page 8:

Example

Ex: insert (d, “li”, 97), (d, “wang”, 99), (d, “zhang”, 100), …; and then lookup (d, “zhang”);

[Figure: (“li”, 97) to be stored at an unknown array index “?”]

Problem #2: How long should the array be?

Page 9:

Basic Plan

Save the (k, v) pairs in an array, with index i calculated from key k.

Hash function: a method for computing an index from a given key.

[Figure: (“li”, 97) stored at index hash (“li”)]

Page 10:

Hash Function

Given any key, compute an index.
  Efficiently computable.

Ideal goals:
  for any key, the index is uniformly distributed;
  different keys map to different indexes.

However, designing such a function is a thoroughly researched problem in its own right. :-(

Next, we assume that the array is of infinite length, so the hash function has type:

int hash (key k);

To get some idea, we next perform a “case analysis” on how different key types affect “hash”.

Page 11:

Hash Function On “int”

// If the key of hash is of “int” type, the hash
// function is trivial:
int hash (int i)
{
  return i;
}

Page 12:

Hash Function On “char”

// If the key of hash is of “char” type, the hash
// function comes with type conversion:
int hash (char c)
{
  return c;
}

Page 13:

Hash Function On “float”

// Also type conversion:
int hash (float f)
{
  return (int)f;  // truncates toward zero
}
// How to deal with 0.aaa, say 0.5? Every f in (-1, 1) is converted to 0.
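One possible answer to the 0.5 question is to hash the bit pattern of the float instead of its truncated value. The sketch below is not from the slides: `hash_float_bits` is a hypothetical name, and the code assumes that `float` is a 32-bit IEEE 754 value, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <stdint.h>
#include <string.h>

/* Hash a float through its 32-bit representation.
   Assumption: sizeof(float) == 4 (IEEE 754 single precision). */
int hash_float_bits(float f)
{
    uint32_t bits;

    if (f == 0.0f)
        f = 0.0f;  /* -0.0 == 0.0, so give both the same bit pattern */

    /* memcpy is the well-defined way to reinterpret the bytes. */
    memcpy(&bits, &f, sizeof bits);

    /* Fold the upper half into the lower half so the sign bit still
       influences the result, then clear bit 31 to stay non-negative. */
    bits ^= bits >> 16;
    return (int)(bits & 0x7fffffffu);
}
```

Under these assumptions 0.5f and 0.25f get different hash codes, whereas the `(int)f` version maps both to 0.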

Page 14:

Hash Function On “string”

// Example: “BillG”
// A trivial one, but not so good:
int hash (char *s)
{
  int i=0, sum=0;
  while (s[i]) {
    sum += s[i];
    i++;
  }
  return sum;
}
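The additive hash is “not so good” because it ignores character order, so anagrams always collide. A common alternative is a polynomial hash; the sketch below uses the multiplier 31 (the constant also used by Java's `String.hashCode`) and unsigned arithmetic so that overflow wraps around instead of being undefined. The names `hash_sum` and `hash_poly` are hypothetical, and the choice of this particular scheme is an assumption, not something the slides prescribe.

```c
/* Additive hash, as on the slide, kept here for comparison. */
int hash_sum(const char *s)
{
    int sum = 0;
    for (int i = 0; s[i] != '\0'; i++)
        sum += (unsigned char)s[i];
    return sum;
}

/* Polynomial hash: h = h * 31 + c for each character c.
   The position of every character now affects the result. */
unsigned int hash_poly(const char *s)
{
    unsigned int h = 0;
    for (int i = 0; s[i] != '\0'; i++)
        h = h * 31u + (unsigned char)s[i];
    return h;
}
```

For example, `hash_sum` gives 195 for both "ab" and "ba", while `hash_poly` gives 3105 for "ab" and 3135 for "ba".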

Page 15:

Hash Function On “Point”

// Suppose we have a user-defined type:
struct Point2d
{
  int x;
  int y;
};

int hash (struct Point2d pt)
{
  // ???
}
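One way to fill in the “???” is to combine the hash codes of the fields, in the same spirit as hashing the characters of a string. This is only a sketch of one common approach: `hash_point2d` is a hypothetical name, and the constants 17 and 31 are arbitrary choices made for illustration.

```c
struct Point2d {
    int x;
    int y;
};

/* Combine both coordinates in order, so that (1, 2) and (2, 1)
   get different hash codes. Unsigned arithmetic wraps on overflow. */
unsigned int hash_point2d(struct Point2d pt)
{
    unsigned int h = 17u;
    h = h * 31u + (unsigned int)pt.x;
    h = h * 31u + (unsigned int)pt.y;
    return h;
}
```

A plain `pt.x + pt.y` would also compile, but it would make every pair of mirrored points collide.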

Page 16:

From “int” Hash to Index

Recall the type:

int hash (T data);

Problems with the “int” return type:
  at any time, the array is finite;
  there is no negative index (say -10).

Our goal: int i ==> [0, N-1]

Ok, that’s easy! It’s just:

abs(i) % N

Page 17:

Bug!

Note that the range of “int” is -2^31 ~ 2^31-1.

So abs(-2^31) = 2^31: overflow!

The key step is to wipe the sign bit off:

int t = i & 0x7fffffff;
int hc = t % N;

In summary:

hc = (i & 0x7fffffff) % N;
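The masking step can be wrapped in a small helper, sketched below. The name `to_index` is hypothetical; the code assumes a 32-bit two's-complement `int`, as the slides do, and a bucket count n > 0. Note that masking is not the same as taking the absolute value: -10 becomes 2147483638, not 10, which is fine because all we need is a non-negative number.

```c
/* Map any int hash code to a bucket index in [0, n-1].
   Clearing the sign bit avoids abs(), which overflows for -2^31.
   Assumes a 32-bit two's-complement int and n > 0. */
int to_index(int hash_code, int n)
{
    int t = hash_code & 0x7fffffff;  /* wipe the sign bit */
    return t % n;
}
```

For instance, with 8 buckets the hash code 13 maps to index 5, -10 maps to index 6, and the smallest int maps to index 0.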

Page 18:

Collision

Given two keys k1 and k2, we compute two hash codes hc1, hc2 ∈ [0, N-1].

If k1 <> k2 but hc1 == hc2, then a collision occurs.

[Figure: (k1, v1) and (k2, v2) both mapped to array index i]

Page 19:

Collision Resolution

Open Addressing
Re-hash
Chaining (Multi-map)

Page 20:

Chaining

For a collision at index i, we keep a separate linear list (chain) at index i.

[Figure: (k1, v1) and (k2, v2) both map to index i; the bucket at i points to a chain holding k1 and k2]
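A single chain can be sketched as an ordinary singly linked list. The version below uses `int` keys and values to stay short; `chain_node`, `chain_push`, `chain_find` and `chain_remove` are hypothetical names, not the linked-list module the slides rely on.

```c
#include <stddef.h>
#include <stdlib.h>

/* One node of a bucket's chain. */
struct chain_node {
    int key;
    int value;
    struct chain_node *next;
};

/* Push (k, v) at the head of the chain; returns the new head,
   or the old head unchanged if allocation fails. */
struct chain_node *chain_push(struct chain_node *head, int k, int v)
{
    struct chain_node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->key = k;
    n->value = v;
    n->next = head;
    return n;
}

/* Walk the chain until the key matches; NULL means "not in this bucket". */
struct chain_node *chain_find(struct chain_node *head, int k)
{
    for (struct chain_node *p = head; p != NULL; p = p->next) {
        if (p->key == k)
            return p;
    }
    return NULL;
}

/* Unlink and free the node holding k, if any; returns the (possibly new) head. */
struct chain_node *chain_remove(struct chain_node *head, int k)
{
    struct chain_node **link = &head;
    while (*link != NULL) {
        if ((*link)->key == k) {
            struct chain_node *dead = *link;
            *link = dead->next;
            free(dead);
            break;
        }
        link = &(*link)->next;
    }
    return head;
}
```

With N = 8, the keys 3 and 11 both land at index 3, so both nodes would live on the chain rooted at that bucket.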

Page 21:

General Scheme

[Figure: bucket array whose chains hold k1, k2, k5, k8 and k43]

Page 22:

Load Factor

loadFactor = numItems / numBuckets
defaultLoadFactor: default value of the load factor

[Figure: bucket array whose chains hold k1, k2, k5, k8 and k43]

Page 23:

“hash” ADT: interface

#ifndef HASH_H
#define HASH_H

typedef void *poly;
typedef poly key;
typedef poly value;

typedef struct hashStruct *hash;

hash newHash ();
hash newHash2 (double lf);
void insert (hash h, key k, value v);
poly lookup (hash h, key k);
void delete (hash h, key k);

#endif

Page 24:

Hash Implementation

#include "hash.h"

#define EXT_FACTOR 2
#define INIT_BUCKETS 16

struct hashStruct
{
  linkedList *buckets;
  int numBuckets;
  int numItems;
  double loadFactor;
};

Page 25:

In Figure

[Figure: h points to a struct with the fields buckets, loadFactor, numItems and numBuckets; buckets points to the array whose chains hold k1, k2, k5, k8 and k43]

Page 26:

“newHash ()”

hash newHash ()
{
  hash h = (hash)malloc (sizeof (*h));
  h->buckets = malloc (INIT_BUCKETS * sizeof (linkedList));

  for (…) // init the array

  h->numBuckets = INIT_BUCKETS;
  h->numItems = 0;
  h->loadFactor = 0.25;

  return h;
}

Page 27:

“newHash2 ()”

hash newHash2 (double lf)
{
  hash h = (hash)malloc (sizeof (*h));
  h->buckets = (linkedList *)malloc (INIT_BUCKETS * sizeof (linkedList));

  for (…) // init the array

  h->numBuckets = INIT_BUCKETS;
  h->numItems = 0;
  h->loadFactor = lf;

  return h;
}

Page 28:

“lookup (hash, key)”

value lookup (hash h, key k, compTy cmp)
{
  int i = k->hashCode (); // how to perform this?
  int hc = (i & 0x7fffffff) % (h->numBuckets);
  value t = linkedListSearch ((h->buckets)[hc], k);
  return t;
}
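Since a `void *` key cannot carry a `hashCode ()` method in C, one possible answer to “how to perform this?” is to let the creator of the table supply the hash and equality functions as function pointers. The sketch below is an assumption about how that could look, not the author's implementation: `hashFn`, `equalFn`, `struct table`, `stringHash`, `stringEquals` and `tableLookup` are all hypothetical names, and the chain is a hand-rolled list rather than the `linkedList` module used on the slides.

```c
#include <stddef.h>
#include <string.h>

typedef void *poly;

/* Hypothetical callback types supplied when the table is created. */
typedef int (*hashFn)(poly key);
typedef int (*equalFn)(poly a, poly b);  /* nonzero when the keys are equal */

struct node {
    poly key;
    poly value;
    struct node *next;
};

struct table {
    struct node **buckets;
    int numBuckets;
    hashFn hashCode;
    equalFn equals;
};

/* Example callbacks for NUL-terminated string keys. */
int stringHash(poly key)
{
    const unsigned char *s = key;
    unsigned int h = 0;
    while (*s != '\0')
        h = h * 31u + *s++;
    return (int)(h & 0x7fffffffu);
}

int stringEquals(poly a, poly b)
{
    return strcmp(a, b) == 0;
}

/* Hash the key through the callback, then search only that one chain. */
poly tableLookup(struct table *t, poly k)
{
    int hc = (t->hashCode(k) & 0x7fffffff) % t->numBuckets;
    for (struct node *p = t->buckets[hc]; p != NULL; p = p->next) {
        if (t->equals(p->key, k))
            return p->value;
    }
    return NULL;
}
```

The equality callback plays the role that the `cmp` parameter suggests in the slide's signature: the chain search needs some way to compare two opaque keys.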

Page 29:

Ex: lookup (ha, k43)

hc = (hash (k43) & 0x7fffffff) % 8;
// hc = 1

[Figure: the bucket array of ha, whose chains hold k1, k2, k5, k8 and k43]

Page 30:

Ex: lookup (ha, k43)

hc = (hash (k43) & 0x7fffffff) % 8;
// hc = 1

compare k43 with k8,

[Figure: the bucket array of ha, whose chains hold k1, k2, k5, k8 and k43]

Page 31:

Ex: lookup (ha, k43)

hc = (hash (k43) & 0x7fffffff) % 8;
// hc = 1

compare k43 with k43, found!

[Figure: the bucket array of ha, whose chains hold k1, k2, k5, k8 and k43]

Page 32:

“insert”

void insert (hash h, poly k, poly v)
{
  if (1.0 * h->numItems / h->numBuckets >= h->loadFactor) {
    // buckets extension & items re-hash
  }
  int i = k->hashCode (); // how to perform this?
  int hc = (i & 0x7fffffff) % (h->numBuckets);
  tuple t = newTuple (k, v);
  linkedListInsertHead ((h->buckets)[hc], t);
  h->numItems++; // keep the load factor up to date
  return;
}
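The “buckets extension & items re-hash” step can be sketched as follows. To stay self-contained the sketch uses `int` keys and a hand-rolled chain instead of the `poly` keys and `linkedList` module from the slides; `struct item`, `struct int_table`, `index_of`, `table_grow` and `table_insert` are hypothetical names. The important point is that every item must be re-indexed, because the index depends on the number of buckets.

```c
#include <stdlib.h>

#define EXT_FACTOR 2  /* growth factor, as defined on the slides */

struct item {
    int key;
    int value;
    struct item *next;
};

struct int_table {
    struct item **buckets;
    int numBuckets;
    int numItems;
    double loadFactor;  /* threshold that triggers growth */
};

static int index_of(int key, int numBuckets)
{
    return (key & 0x7fffffff) % numBuckets;
}

/* Allocate a larger bucket array and move every node to its new chain.
   Returns 0 on success, -1 if allocation fails (table left unchanged). */
int table_grow(struct int_table *t)
{
    int newCount = t->numBuckets * EXT_FACTOR;
    struct item **fresh = calloc((size_t)newCount, sizeof *fresh);
    if (fresh == NULL)
        return -1;
    for (int i = 0; i < t->numBuckets; i++) {
        struct item *p = t->buckets[i];
        while (p != NULL) {
            struct item *next = p->next;
            int hc = index_of(p->key, newCount);  /* index depends on N */
            p->next = fresh[hc];
            fresh[hc] = p;
            p = next;
        }
    }
    free(t->buckets);
    t->buckets = fresh;
    t->numBuckets = newCount;
    return 0;
}

/* Insert at the head of the chain, growing first if the table is too full. */
int table_insert(struct int_table *t, int key, int value)
{
    if ((double)t->numItems / t->numBuckets >= t->loadFactor) {
        if (table_grow(t) != 0)
            return -1;
    }
    struct item *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    int hc = index_of(key, t->numBuckets);
    n->key = key;
    n->value = value;
    n->next = t->buckets[hc];
    t->buckets[hc] = n;
    t->numItems++;
    return 0;
}
```

Starting with 2 buckets and a threshold of 1.0, inserting the keys 1 and 3 puts both on the chain at index 1; the third insert triggers growth to 4 buckets, after which key 3 sits alone at index 3.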

Page 33:

Ex: insert (ha, k13)

hc = (hash (k13) & 0x7fffffff) % 8;
// suppose hc==4

[Figure: the bucket array of ha before the insert, whose chains hold k1, k2, k5, k8 and k43]

Page 34:

Ex: insert (ha, k13)

hc = (hash (k13) & 0x7fffffff) % 8;
// suppose hc==4

[Figure: the bucket array of ha after the insert, whose chains now hold k13 as well as k1, k2, k5, k8 and k43]

Page 35:

Complexity

op \ rep    array   sorted array   linked list   sorted linked list   hash
lookup()    O(n)    O(lg n)        O(n)          O(n)                 O(1)
insert()    O(n)    O(n)           O(n)          O(n)                 O(1)
delete()    O(n)    O(n)           O(n)          O(n)                 O(1)

The O(1) bounds for hash hold on average, assuming a well-distributed hash function and a bounded load factor; if many keys fall into one chain, an operation on that chain can still take O(n).