COMP 103 Hashing (II), and exam tips 2014-T2 Lecture 33 Marcus Frean School of Engineering and Computer Science, Victoria University of Wellington Marcus Frean, Lindsay Groves, Peter Andreae and Thomas Kuehne, VUW




RECAP-TODAY

RECAP
• Hashing with “buckets”

TODAY
• Hashing by “probing”
• The exam


Collisions: chaining / buckets

Store a Set in each cell:
hash value → which set

[Figure: hash table in which each cell holds a set of names. Names shown: ant, fox, hen, dog, bee, kea, cow, elk, owl, pig, sow, tui, ape, bat, bug, cat, eel, gnu, jay, nit, ray, yak, cod, roe]
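
The bucket idea can be sketched in a few lines of Java. This is only an illustration, not the course's implementation: the class name BucketHashSet, the fixed number of buckets, and the use of a small list as each cell's "set" are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of hashing with buckets (chaining). All names here are hypothetical.
class BucketHashSet {
    // One small collection per cell; each list is used as a set (no duplicates).
    private final List<List<String>> buckets = new ArrayList<>();

    BucketHashSet(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The hash value decides which bucket an item belongs to.
    private List<String> bucketFor(String item) {
        int index = Math.floorMod(item.hashCode(), buckets.size());
        return buckets.get(index);
    }

    boolean add(String item) {
        List<String> bucket = bucketFor(item);
        if (bucket.contains(item)) return false;  // already present
        bucket.add(item);                          // a collision just makes the bucket longer
        return true;
    }

    boolean contains(String item) {
        return bucketFor(item).contains(item);
    }

    public static void main(String[] args) {
        BucketHashSet animals = new BucketHashSet(5);
        for (String name : new String[] {"ant", "fox", "hen", "dog", "bee", "kea"}) {
            animals.add(name);
        }
        System.out.println(animals.contains("fox"));  // true
        System.out.println(animals.contains("emu"));  // false
    }
}
```

With only 5 buckets and 6 items, at least one bucket must hold more than one name, so a lookup searches within a single bucket rather than the whole table.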


Dealing with Collisions

Two approaches:
• Use a collection at each place (“buckets” or “chaining”)
• Look for an empty place in the hashtable (“probing” or “open addressing”)

[Figure: array with indices 0 1 2 3 4 5 6 7 8 9 ⋯ 581 ⋯ N; “2001 – A Space Odyssey” and “Gravity” are each fed to HASH]


Linear Probing

Hash value tells us where to start looking.

if value.hashCode() → p
• start at index p
• if cell is used, try p+1, p+2, p+3 …
• wrap round to 0 at the end of the array.

hash = (name[0]+name[1])%7

[Figure: array with indices 0 to 6 holding Sam, Steve, Stig, Stu, Sven and Sun; hash values shown beside the names: (3), (2), (5), (4), (2), (2)]
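
A minimal sketch of the probing steps above, assuming a fixed-size table with no resizing and no removal. The class name LinearProbingSet is hypothetical, and the sketch uses String.hashCode() rather than the slide's (name[0]+name[1])%7, so the cells it picks will differ from the figure.

```java
// Sketch of a fixed-size hash set using linear probing. Names are hypothetical.
class LinearProbingSet {
    private final String[] cells;
    private int count = 0;

    LinearProbingSet(int capacity) {
        cells = new String[capacity];
    }

    // Returns the index holding item, or the first empty cell on its probe path.
    // Returns -1 if every cell was checked without finding either.
    private int findCell(String item) {
        int start = Math.floorMod(item.hashCode(), cells.length);
        for (int i = 0; i < cells.length; i++) {
            int p = (start + i) % cells.length;   // try p, p+1, p+2, ... wrapping round to 0
            if (cells[p] == null || cells[p].equals(item)) return p;
        }
        return -1;
    }

    boolean add(String item) {
        int p = findCell(item);
        if (p < 0) throw new IllegalStateException("table is full");
        if (cells[p] != null) return false;       // already present
        cells[p] = item;
        count++;
        return true;
    }

    boolean contains(String item) {
        int p = findCell(item);
        return p >= 0 && cells[p] != null;
    }

    int size() {
        return count;
    }

    public static void main(String[] args) {
        LinearProbingSet names = new LinearProbingSet(7);
        for (String name : new String[] {"Sam", "Steve", "Stig", "Stu", "Sven", "Sun"}) {
            names.add(name);
        }
        System.out.println(names.contains("Stig"));  // true
        System.out.println(names.contains("Bob"));   // false
    }
}
```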


Hash Tables and Load Factor

When is the hashTable “full”?

When the number of items is close to the array size:
• May have to probe a large number of cells to find an empty cell
  ⇒ performance becomes very slow.
• Linear probing is particularly bad!

Should not let the table get more than 70% - 80% full (maximum “load factor”).

With a low load factor, cost is O(1); with a high load factor, cost is O(N).

[Figure: hash table holding “eel”, “pig”, “cat”, “bee”, “fox”, “dog”, “owl”, “hen”, “ant”; “kea” shown separately]


ensureCapacity

If it is full, double and copy:
how do you copy?

Index depends on…
• hashCode and length (division method)!
• and it depends on previous collisions...

⇒ Have to rehash everything!

[Figure: three array layouts of the same items: “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog”]
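
One way to picture "double and rehash" is the sketch below. It is an illustration under assumptions: the class name GrowingProbingSet, the starting size of 8, and the 0.7 threshold (the lower end of the 70% - 80% range mentioned earlier) are all choices made for this example.

```java
// Sketch of ensureCapacity for a linear-probing set. Names and constants are hypothetical.
class GrowingProbingSet {
    private static final double MAX_LOAD_FACTOR = 0.7;  // assumed threshold
    private String[] cells = new String[8];             // assumed starting size
    private int count = 0;

    // Put item into table by linear probing; returns false if it was already there.
    // Assumes table always has at least one empty cell.
    private static boolean place(String[] table, String item) {
        int p = Math.floorMod(item.hashCode(), table.length);
        while (table[p] != null) {
            if (table[p].equals(item)) return false;
            p = (p + 1) % table.length;
        }
        table[p] = item;
        return true;
    }

    // Double the array before the load factor would be exceeded.
    private void ensureCapacity() {
        if (count + 1 <= MAX_LOAD_FACTOR * cells.length) return;
        String[] bigger = new String[cells.length * 2];
        for (String item : cells) {
            // Cannot copy cell-for-cell: the index depends on the new length,
            // so every item is hashed and probed again.
            if (item != null) place(bigger, item);
        }
        cells = bigger;
    }

    boolean add(String item) {
        ensureCapacity();
        if (!place(cells, item)) return false;
        count++;
        return true;
    }

    boolean contains(String item) {
        int p = Math.floorMod(item.hashCode(), cells.length);
        while (cells[p] != null) {
            if (cells[p].equals(item)) return true;
            p = (p + 1) % cells.length;
        }
        return false;
    }

    int size() { return count; }
    int capacity() { return cells.length; }

    public static void main(String[] args) {
        GrowingProbingSet animals = new GrowingProbingSet();
        for (String a : new String[] {"eel", "kea", "ant", "cat", "bee", "fox", "dog"}) {
            animals.add(a);
        }
        System.out.println(animals.size());           // 7
        System.out.println(animals.capacity());       // 16
        System.out.println(animals.contains("kea"));  // true
    }
}
```

Because the table never fills completely, the probing loops always reach an empty cell and terminate.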


Linear Probing: Runs and Clustering

Linear probing is particularly bad:
• Repeated collisions at one index create runs
• Runs → linear performance
• With linear probing, runs join up
  ⇒ they grow fast: the bigger the run, the faster it grows
• This is called “clustering”

Does it help to increase step size (p, p+d, p+2d, …)?

[Figure: array holding “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog”, annotated 3; 1,2; 5; 4; further names hen, owl, pig, gnu, emu, rat, tui]


Quadratic Probing

Make the sequence of probes have increasing steps:
runs don’t join up so fast

h, h+1, h+4, h+9, h+16, …
p=h, p+=1, p+=3, p+=5, p+=7, p+=9, …

In general, quadratic probing uses a quadratic formula:

$\mathrm{probe}_i = \mathrm{hash} + a\,i + b\,i^2 \quad (b \neq 0)$

Eg: with a=b=½, the step sizes become 1,2,3… instead of 1,3,5…

[Figure: array holding “eel”, “kea”, “ant”, “cat”, “fox”, “dog”, “hen”, “bee”, “owl”]


Quadratic Probing

Another problem, perhaps?
sequence might wrap back on itself before checking each cell:

If we choose a = b = ½, and length is a power of 2...
⇒ guaranteed not to wrap until it has checked every cell!

$\mathrm{probe}_i = \mathrm{hash} + \tfrac{1}{2}(i + i^2)$
⇒ probes are hash, hash+1, hash+3, hash+6, hash+10, hash+15, ...
⇒ step sizes are 1, 2, 3, 4, 5, …

[Figure: array holding “eel”, “dog”, “hen”]
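
The power-of-2 claim can be checked with a small experiment. The sketch below counts how many distinct cells the first `length` probes reach when the step sizes are 1, 2, 3, …; the class and method names are made up for this illustration.

```java
// Sketch: how many cells does probe_i = hash + (i + i*i)/2 reach? Names are hypothetical.
class TriangularProbeDemo {
    // Counts the distinct cells visited by the first `length` probes,
    // each taken modulo the table length.
    static int distinctCellsVisited(int hash, int length) {
        boolean[] visited = new boolean[length];
        int distinct = 0;
        int p = Math.floorMod(hash, length);
        for (int i = 1; i <= length; i++) {
            if (!visited[p]) {
                visited[p] = true;
                distinct++;
            }
            p = (p + i) % length;  // step sizes 1, 2, 3, ...
        }
        return distinct;
    }

    public static void main(String[] args) {
        System.out.println(distinctCellsVisited(0, 8));  // 8: every cell is checked
        System.out.println(distinctCellsVisited(0, 7));  // 4: the sequence wraps back on itself
    }
}
```

With length 8 the probes land on cells 0, 1, 3, 6, 2, 7, 5, 4, which is every cell. With length 7 they land on 0, 1, 3, 6, 3, 1, 0, so three cells are never checked.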


Hash Table with Probing: remove

Inserted: Stu (2), Sven (5), Sam (4), Steve (2), Sun (4)

Now remove: Sam (4)

What’s the problem?
contains(Sun) will return false! (The search for Sun starts at index 4, finds the emptied cell, and stops before reaching Sun.)

To remove, need to leave a marker (not null, not a value!)

    public void remove() {
        throw new UnsupportedOperationException();
    }

[Figure: array with indices 0 to 6 holding Sam, Steve, Stig, Stu, Sven, Sun]

insert a "tombstone" key instead
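
A sketch of removal with a tombstone marker follows. To replay the example above, the hash is passed in as a function so the slide's hash values can be supplied directly. The class name, that constructor parameter, and the choice to let add reuse tombstone cells are assumptions for this illustration.

```java
import java.util.Map;
import java.util.function.ToIntFunction;

// Sketch of a linear-probing set whose remove leaves a tombstone. Names are hypothetical.
class TombstoneProbingSet {
    // Marker for a removed item. It is compared by identity (==),
    // so it can never be mistaken for a stored value.
    private static final String TOMBSTONE = new String("<tombstone>");

    private final String[] cells;
    private final ToIntFunction<String> hash;  // assumed: the caller supplies the hash

    TombstoneProbingSet(int capacity, ToIntFunction<String> hash) {
        this.cells = new String[capacity];
        this.hash = hash;
    }

    // Follows the probe sequence for item; returns its index, or -1 if absent.
    private int indexOf(String item) {
        int start = Math.floorMod(hash.applyAsInt(item), cells.length);
        for (int i = 0; i < cells.length; i++) {
            int p = (start + i) % cells.length;
            if (cells[p] == null) return -1;  // only a truly empty cell ends the search
            if (cells[p] != TOMBSTONE && cells[p].equals(item)) return p;
        }
        return -1;
    }

    boolean contains(String item) {
        return indexOf(item) >= 0;
    }

    boolean add(String item) {
        if (contains(item)) return false;
        int start = Math.floorMod(hash.applyAsInt(item), cells.length);
        for (int i = 0; i < cells.length; i++) {
            int p = (start + i) % cells.length;
            if (cells[p] == null || cells[p] == TOMBSTONE) {  // a tombstone cell can be reused
                cells[p] = item;
                return true;
            }
        }
        throw new IllegalStateException("table is full");
    }

    boolean remove(String item) {
        int p = indexOf(item);
        if (p < 0) return false;
        cells[p] = TOMBSTONE;  // not null, not a value
        return true;
    }

    public static void main(String[] args) {
        // Hash values copied from the slide's example.
        Map<String, Integer> slideHashes =
            Map.of("Stu", 2, "Sven", 5, "Sam", 4, "Steve", 2, "Sun", 4);
        TombstoneProbingSet names = new TombstoneProbingSet(7, n -> slideHashes.get(n));
        for (String n : new String[] {"Stu", "Sven", "Sam", "Steve", "Sun"}) {
            names.add(n);
        }
        names.remove("Sam");
        System.out.println(names.contains("Sun"));  // true: the search steps over the tombstone
        System.out.println(names.contains("Sam"));  // false
    }
}
```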


Iterator?

Iterating through hash table is not so simple!
• there will be nulls to skip over
• the order that items are returned appears random (and may change when the array is doubled!)

At each call to next(), Iterator must advance the index to the next non-null cell. Could be slow!...

[Figure: array holding “eel”, “kea”, “ant”, “cat”, “bee”, “fox”, “dog” with empty cells between them]
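
A sketch of such an iterator over the underlying array, skipping null cells. The class name CellIterator is hypothetical, and the sketch ignores tombstones, which a real probing table would also have to skip.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch of an iterator over a hash table's array. Names are hypothetical.
class CellIterator implements Iterator<String> {
    private final String[] cells;
    private int index = 0;  // always rests on the next non-null cell, or cells.length

    CellIterator(String[] cells) {
        this.cells = cells;
        skipNulls();
    }

    // Advance past empty cells. In a sparse table this loop is where the time goes.
    private void skipNulls() {
        while (index < cells.length && cells[index] == null) {
            index++;
        }
    }

    @Override
    public boolean hasNext() {
        return index < cells.length;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        String item = cells[index++];
        skipNulls();
        return item;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        String[] cells = {"eel", null, "kea", null, null, "ant"};
        Iterator<String> it = new CellIterator(cells);
        while (it.hasNext()) {
            System.out.println(it.next());  // eel, kea, ant: array order, not insertion order
        }
    }
}
```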


hashing summary
• hashing gives add/find that is crazily quick
• two ideas: buckets and probing
• with the probing method, removing requires “tombstones”
• when a hashtable is too full, you need to increase its size: this requires rehashing everything
• iterating over a HashSet can be a slow process


the COMP103 final exam

The 4th of November is a Tuesday

Exam is at 2:30pm, and lasts TWO hours

You will be distributed over 5 different rooms:

ABUBAKR - BHIKHU        MYLT101
BHULA - DEIGHTON        HMLT104
DEL ROSARIO - LATEGAN   KKLT303
LAWRENCE - PEREZ        HMLT205
PHEASE - ZHU            MCLT103


preparing for the exam

the 103 homepage has a link to “Assessment archive”
http://ecs.victoria.ac.nz/Main/ExamArchiveCOMP103
1. Do your best without the answers
2. Then check against the answers

Next week: tutor-run help sessions (Jeffrey Wu)
1. Monday 20th, 12:30-3pm, in Cotton 228.
2. Wednesday 22nd, 12:30-3pm, but in AM101.
3. ALSO, VUW Science Society runs “cram session” for ECS: Friday 24th, 10am-3pm, in the Memorial Theatre Foyer

checklist – on the 103 homepage
friends... assignments... textbook... notes... videos...


The Exam

• answer all questions
• manage your time
• Dumb calculators & non-electronic dictionaries are OK


doing your best on the day
• Read the question carefully and make sure you know what is being asked.
• Write your answer clearly
• Use extra pages for rough work or for answers
• Cross out what you don’t want marked
• Say where your answer is if not on same page

For coding questions:
• There’s more than one way to skin a cat
• If it’s complicated, start with the pseudocode


best wishes!