CS4432: Database Systems II

Post on 22-Dec-2015



Index definition in SQL

• CREATE INDEX name ON rel (attr)

(Check online for index definitions in SQL)

• DROP INDEX name


Note: an attribute list gives a multi-key index,

e.g., CREATE INDEX foo ON R(A,B,C)
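The two statements above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table R with integer columns A, B, C and the helper index_names are assumptions made for illustration, and other database systems may accept slightly different index syntax.

```python
import sqlite3

def index_names(conn):
    """Return the names of all indexes currently defined in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return [name for (name,) in rows]

# Hypothetical in-memory table R(A, B, C), used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B INTEGER, C INTEGER)")

# An attribute list creates a multi-key index on (A, B, C).
conn.execute("CREATE INDEX foo ON R(A, B, C)")
print(index_names(conn))   # ['foo']

conn.execute("DROP INDEX foo")
print(index_names(conn))   # []
```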


Multi-key Index

Motivation: Find records where DEPT = “Toy” AND SAL > 50k


Strategy I:

• Use one index, say Dept.

• Get all Dept = “Toy” records and check their salary


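Strategy II:

• Use 2 indexes; manipulate pointers

[Figure: the pointers from the Dept index for “Toy” and the pointers from the Sal index for Sal > 50k are intersected to find the records that satisfy both conditions]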


Strategy III:

• Multiple Key Index

One idea:

[Figure: an index I1 whose entries point to further indexes I2, I3]


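Example

Example record: Name = Joe, DEPT = Sales, SAL = 15k

[Figure: a Dept index with entries Art, Sales, Toy; each entry points to a Salary index. The two Salary indexes shown hold 10k, 15k, 17k, 21k and 12k, 15k, 15k, 19k.]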


For which queries is this index good?

• Find RECs Dept = “Sales” AND SAL = 20k

• Find RECs Dept = “Sales” AND SAL > 20k

• Find RECs Dept = “Sales”

• Find RECs SAL = 20k
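One way to explore the question is a small in-memory sketch of a Dept index whose entries each lead to a sorted Salary index. The class MultiKeyIndex, its method names, and the sample records are hypothetical; a real system would keep both levels on disk. The first three queries start with a probe on Dept, while the last one has no Dept value and has to visit the Salary index under every Dept entry.

```python
import bisect

class MultiKeyIndex:
    """Dept index; each Dept entry leads to a Salary index kept in sorted order."""

    def __init__(self):
        # dept -> (sorted salaries, record ids in the same order)
        self.by_dept = {}

    def insert(self, dept, sal, rid):
        sals, rids = self.by_dept.setdefault(dept, ([], []))
        pos = bisect.bisect_right(sals, sal)
        sals.insert(pos, sal)
        rids.insert(pos, rid)

    def dept_eq(self, dept):
        # Good fit: a single probe of the Dept level.
        return list(self.by_dept.get(dept, ([], []))[1])

    def dept_eq_sal_eq(self, dept, sal):
        # Good fit: probe Dept, then binary-search that Salary index.
        sals, rids = self.by_dept.get(dept, ([], []))
        return rids[bisect.bisect_left(sals, sal):bisect.bisect_right(sals, sal)]

    def dept_eq_sal_gt(self, dept, sal):
        # Good fit: probe Dept, then scan the tail of its sorted Salary index.
        sals, rids = self.by_dept.get(dept, ([], []))
        return rids[bisect.bisect_right(sals, sal):]

    def sal_eq(self, sal):
        # Poor fit: no Dept value, so every Salary index must be searched.
        out = []
        for dept in self.by_dept:
            out.extend(self.dept_eq_sal_eq(dept, sal))
        return out

# Hypothetical sample data: (Dept, salary in thousands); record id = position.
idx = MultiKeyIndex()
sample = [("Sales", 15), ("Sales", 20), ("Sales", 25), ("Toy", 20), ("Art", 10)]
for rid, (dept, sal) in enumerate(sample):
    idx.insert(dept, sal, rid)

print(idx.dept_eq_sal_eq("Sales", 20))   # [1]
print(idx.dept_eq_sal_gt("Sales", 20))   # [2]
print(idx.dept_eq("Sales"))              # [0, 1, 2]
print(sorted(idx.sal_eq(20)))            # [1, 3]
```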


Many alternate methods for indexing


Hashing

[Figure: key → h(key) selects the bucket that holds <key>. Buckets are typically 1 disk block.]


One example hash function

• Key = ‘x1 x2 … xn’, an n-byte character string

• Have b buckets

• Hash function: h = (x1 + x2 + … + xn) modulo b
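The function above can be sketched in a few lines. The name h and the choice of UTF-8 bytes for the characters x1 … xn are assumptions made for illustration.

```python
def h(key: str, b: int) -> int:
    """Add the byte values x1 + x2 + ... + xn of the key, modulo b buckets."""
    return sum(key.encode("utf-8")) % b

print(h("Toy", 4))   # 84 + 111 + 121 = 316, and 316 % 4 == 0
print(h("abc", 4))   # 97 + 98 + 99 = 294, and 294 % 4 == 2
print(h("cba", 4))   # same bytes in another order, so also 2
```

Because addition ignores the order of the bytes, keys that are rearrangements of each other always land in the same bucket, which is one reason this may not be the best function.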


This may not be the best function … Read Knuth Vol. 3 if you really need to select a good function.

Good hash function: expected number of keys/bucket is the same for all buckets


Within a bucket:

• Do we keep keys sorted?

• Yes, if CPU time critical & Inserts/Deletes not too frequent


Next: example to illustrate inserts, overflows, deletes


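EXAMPLE: 2 records/bucket

INSERT: h(a) = 1, h(b) = 2, h(c) = 1, h(d) = 0

Bucket 0: d
Bucket 1: a, c
Bucket 2: b
Bucket 3: (empty)

Then insert e with h(e) = 1: bucket 1 is already full, so e goes into an overflow block chained to bucket 1.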
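The insert example can be replayed with a small sketch of a hash file with overflow blocks. The class StaticHashFile and the lookup table slide_h (which simply returns the h values given in the example) are hypothetical names; each bucket is modelled as a chain of blocks, where the first block is the primary block and any others are overflow blocks.

```python
class StaticHashFile:
    """Fixed number of buckets; full buckets grow a chain of overflow blocks."""

    def __init__(self, n_buckets, capacity, h):
        self.capacity = capacity   # records per block
        self.h = h                 # hash function: key -> bucket number
        # Each bucket starts as a chain with one empty primary block.
        self.buckets = [[[]] for _ in range(n_buckets)]

    def insert(self, key):
        chain = self.buckets[self.h(key)]
        for block in chain:
            if len(block) < self.capacity:
                block.append(key)
                return
        chain.append([key])        # every block is full: add an overflow block

    def lookup(self, key):
        return any(key in block for block in self.buckets[self.h(key)])

    def delete(self, key):
        chain = self.buckets[self.h(key)]
        for block in chain:
            if key in block:
                block.remove(key)
                # Drop overflow blocks that became empty; keep the primary block.
                chain[1:] = [blk for blk in chain[1:] if blk]
                return True
        return False

# Hash values taken from the example: h(a)=1, h(b)=2, h(c)=1, h(d)=0, h(e)=1.
slide_h = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}.__getitem__

f = StaticHashFile(n_buckets=4, capacity=2, h=slide_h)
for key in "abcde":
    f.insert(key)
print(f.buckets)   # [[['d']], [['a', 'c'], ['e']], [['b']], [[]]]

f.delete("e")      # the overflow block of bucket 1 becomes empty and is dropped
print(f.buckets)   # [[['d']], [['a', 'c']], [['b']], [[]]]
```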


EXAMPLE: deletion

[Figure: buckets 0 to 3 holding the keys a through g, with one overflow block]

Delete: e, f

Maybe move “g” up


Rule of thumb:

• Try to keep space utilization between 50% and 80%

Utilization = (# keys used) / (total # keys that fit)

• If < 50%, wasting space

• If > 80%, overflows significant (depends on how good the hash function is and on # keys/bucket)
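As a quick worked instance of the utilization formula, here is a sketch that assumes "total # keys that fit" counts only the primary buckets, not overflow blocks. The function name and the sample numbers (5 keys in 4 buckets of 2 keys each) are illustrative.

```python
def utilization(keys_used: int, n_buckets: int, keys_per_bucket: int) -> float:
    """Utilization = (# keys used) / (total # keys that fit)."""
    return keys_used / (n_buckets * keys_per_bucket)

u = utilization(keys_used=5, n_buckets=4, keys_per_bucket=2)
print(u)                  # 0.625
print(0.5 <= u <= 0.8)    # True: inside the 50% to 80% band
```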


How do we cope with growth?

• Overflows and reorganizations

• Dynamic hashing
   • Extensible hashing
   • Others …


Extensible hashing: idea 1

(a) Use i of b bits output by hash function

[Figure: h(K) outputs b bits, e.g., 00110101; only i of them are used, and i grows over time]

Note: enables future doubling of space!


Extensible hashing: idea 2

(b) Hash to directory of pointers to buckets (instead of buckets directly)

[Figure: h(K)[i] selects a directory entry, which points to a bucket]

Note: double space by doubling the directory!


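Example: h(k) is 4 bits; 2 keys/bucket

The number in parentheses is the number of bits the bucket uses.

  Directory (i = 1)      Bucket
  0                      (1) 0001
  1                      (1) 1001, 1100

Insert 1010: bucket 1 is full, so it splits on the second bit and the directory doubles.

  New directory (i = 2)  Bucket
  00, 01                 (1) 0001
  10                     (2) 1001, 1010
  11                     (2) 1100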


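Example continued

  Directory (i = 2)      Bucket
  00, 01                 (1) 0001
  10                     (2) 1001, 1010
  11                     (2) 1100

Insert: 0111, 0000

0111 fits in the bucket shared by 00 and 01. 0000 then finds that bucket full, so the bucket splits on the second bit. The directory already has i = 2, so it does not double.

  Directory (i = 2)      Bucket
  00                     (2) 0000, 0001
  01                     (2) 0111
  10                     (2) 1001, 1010
  11                     (2) 1100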


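Example continued

  Directory (i = 2)      Bucket
  00                     (2) 0000, 0001
  01                     (2) 0111
  10                     (2) 1001, 1010
  11                     (2) 1100

Insert: 1001

Bucket 10 is full and already uses 2 bits, so it splits on the third bit and the directory doubles to i = 3.

  Directory (i = 3)      Bucket
  000, 001               (2) 0000, 0001
  010, 011               (2) 0111
  100                    (3) 1001, 1001
  101                    (3) 1010
  110, 111               (2) 1100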
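The whole example can be replayed with a short sketch. It assumes the directory is indexed by the leading i bits of a 4-bit hash value, as in the example. The names Bucket, ExtensibleHash, and dump are hypothetical, hash values are inserted directly in place of keys, and a bucket that has used all b bits raises an error where a real file would fall back to an overflow block.

```python
class Bucket:
    def __init__(self, depth):
        self.depth = depth   # how many leading hash bits this bucket uses
        self.keys = []


class ExtensibleHash:
    def __init__(self, bits=4, capacity=2):
        self.bits = bits           # b: bits output by the hash function
        self.capacity = capacity   # keys per bucket
        self.i = 1                 # bits currently used by the directory
        self.directory = [Bucket(1), Bucket(1)]

    def _slot(self, h):
        # Directory entry = leading i bits of the b-bit hash value.
        return h >> (self.bits - self.i)

    def insert(self, h):
        bucket = self.directory[self._slot(h)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(h)
            return
        if bucket.depth == self.bits:
            raise OverflowError("all hash bits used; an overflow block is needed")
        if bucket.depth == self.i:
            # Double the directory: each old entry becomes two adjacent entries.
            self.directory = [b for b in self.directory for _ in range(2)]
            self.i += 1
        # Split the full bucket on its next bit.
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        for slot, b in enumerate(self.directory):
            if b is bucket and (slot >> (self.i - bucket.depth)) & 1:
                self.directory[slot] = sibling
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            self.directory[self._slot(k)].keys.append(k)
        self.insert(h)   # retry; splits again if all keys fell on one side

    def dump(self):
        # One row per directory entry: (entry bits, bucket depth, keys as bits).
        return [
            (format(slot, f"0{self.i}b"), b.depth,
             [format(k, f"0{self.bits}b") for k in b.keys])
            for slot, b in enumerate(self.directory)
        ]


table = ExtensibleHash(bits=4, capacity=2)
for h in [0b0001, 0b1001, 0b1100, 0b1010, 0b0111, 0b0000, 0b1001]:
    table.insert(h)
for row in table.dump():
    print(row)
```

After the seven inserts the directory has i = 3, and entries 100 and 101 point to the two buckets produced by the last split.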


Extensible hashing: deletion

• Merge blocks and cut directory if possible

(Reverse insert procedure)


Extensible hashing: Summary

(+) If directory fits into main memory, then access cost is 1 IO, otherwise 2 IOs

(+) Can handle growing files
   • with less wasted space
   • with no full reorganizations

(-) Indirection (not bad if directory in memory)

(-) Directory doubles in size (now it fits, now it does not)


Use what when :

• Indexing : Tree-Structures vs Hashing


Indexing vs Hashing

• Hashing good for probes given key

e.g., SELECT …
      FROM R
      WHERE R.A = 5


Indexing vs Hashing

• INDEXING (including B trees) good for range searches

e.g., SELECT …
      FROM R
      WHERE R.A > 5
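The contrast between the two query shapes can be sketched in memory. Here a Python dict stands in for the hash structure and a sorted list searched with bisect stands in for an ordered index such as the leaf level of a B tree; both stand-ins, the sample rows, and the function names are assumptions made for illustration.

```python
import bisect

# Hypothetical relation R as (R.A, record) pairs.
rows = [(9, "r9"), (1, "r1"), (5, "r5"), (7, "r7"), (3, "r3")]

# Hash-style structure: key -> records, with no ordering among keys.
hash_index = {}
for a, rec in rows:
    hash_index.setdefault(a, []).append(rec)

# Ordered structure: entries sorted by key.
sorted_rows = sorted(rows)
sorted_keys = [a for a, _ in sorted_rows]

def probe_eq(a):
    """WHERE R.A = a: one lookup in the hash structure."""
    return hash_index.get(a, [])

def range_gt_ordered(a):
    """WHERE R.A > a: binary search for the start, then read in key order."""
    start = bisect.bisect_right(sorted_keys, a)
    return [rec for _, rec in sorted_rows[start:]]

def range_gt_hashed(a):
    """WHERE R.A > a with hashing only: every entry has to be examined."""
    return [rec for k, recs in hash_index.items() if k > a for rec in recs]

print(probe_eq(5))                   # ['r5']
print(range_gt_ordered(5))           # ['r7', 'r9']
print(sorted(range_gt_hashed(5)))    # ['r7', 'r9']
```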


Reading Chapter 14

• Read – 14.3.1 and 14.3.2


The BIG picture….

• Chapters 11 & 12: Storage, records, blocks...

• Chapters 13 & 14: Access Mechanisms
   - Indexes
   - B trees
   - Hashing
   - Multi key

• Chapters 15 & 16: Query Processing (NEXT)