Storage Representations for Set-Oriented Selection Predicates Karthikeyan Ramasamy with Jeffrey F. Naughton and David Maier



Storage Representations for Set-Oriented Selection Predicates

Karthikeyan Ramasamy
with Jeffrey F. Naughton and David Maier


Object Relational DBMS

• OR-DBMS are gaining acceptance
• Market for OR-DBMS is growing
• Many vendors are working on a version of OR-DBMS
• Main features of OR-DBMS
– Type extensibility
– Collections


Set Valued Attributes

• Many semantic notions of the real world can be described by sets, e.g., a set of courses or a set of products

• Set-valued attributes provide conciseness and ease of expression


Classification of Representations

            Internal    External
Nested      Yes         Yes
Unnested    No          Yes


Nested Internal Representation

• Stored at the end of the tuple
• Requires support for handling large tuples
• Retrieval cost of a tuple increases
• Updates could reorganize the whole tuple
• Might do well when the size of the set is small


Nested Internal Representation

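[Figure: tuple layout. The tuple holds attributes A1, A2, A3; the set-valued attribute is stored with Length and Cardinality fields followed by Element 1, Element 2, ..., Element N.]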
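The layout in this figure can be illustrated with a small serialization sketch. It assumes, purely for illustration, 4-byte little-endian length and cardinality fields and fixed-width 20-byte elements; the field order, the widths, and the helper names `pack_set`, `unpack_set`, and `pack_tuple` are hypothetical and are not taken from Paradise.

```python
import struct

ELEM_SIZE = 20    # assumed fixed element width in bytes (hypothetical)
HEADER_SIZE = 8   # two assumed 4-byte fields: length and cardinality


def pack_set(elements):
    """Serialize a set as: total length in bytes, cardinality, then the elements."""
    body = b"".join(
        e.encode("ascii").ljust(ELEM_SIZE, b"\0") for e in sorted(elements)
    )
    header = struct.pack("<II", HEADER_SIZE + len(body), len(elements))
    return header + body


def unpack_set(buf):
    """Inverse of pack_set: read the header, then slice out each element."""
    length, cardinality = struct.unpack_from("<II", buf, 0)
    assert length == len(buf)
    elements = set()
    for i in range(cardinality):
        start = HEADER_SIZE + i * ELEM_SIZE
        raw = buf[start:start + ELEM_SIZE]
        elements.add(raw.rstrip(b"\0").decode("ascii"))
    return elements


def pack_tuple(fixed_attrs, elements):
    """Place the already-encoded fixed attributes first and the set at the end."""
    return fixed_attrs + pack_set(elements)
```

Under these assumptions the stored tuple grows by 20 bytes per element, which is one way to see why retrieval cost rises with set size in this representation.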


Unnested External Representation

• Set-valued attributes are stored separately in an auxiliary relation

• Set instances are unnested and each element stored as a tuple

• Uses a key - foreign key pair to connect a tuple and its set elements

• Requires a join to assemble the elements


Unnested External Representation

• Example
– Moviegoer(name, street, city, state, zip, {movies})
• Base Relation
– Moviegoer-Base(name, street, city, state, zip, id)
• Set Relation
– Moviegoer-Set(id, movie-name)
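A minimal runnable sketch of this decomposition, using Python's built-in `sqlite3` module as a stand-in engine. The table and column names use underscores instead of hyphens (an assumption, since hyphens are not valid in unquoted SQL identifiers), and the helpers `insert_moviegoer` and `movies_of` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moviegoer_base (
        id INTEGER PRIMARY KEY, name TEXT, street TEXT,
        city TEXT, state TEXT, zip TEXT
    );
    -- One row per (tuple, set element) pair.
    CREATE TABLE moviegoer_set (
        id INTEGER REFERENCES moviegoer_base(id),
        movie_name TEXT
    );
""")


def insert_moviegoer(conn, id_, name, movies):
    """Unnest the set: one base row plus one set row per element."""
    conn.execute(
        "INSERT INTO moviegoer_base (id, name) VALUES (?, ?)", (id_, name)
    )
    conn.executemany(
        "INSERT INTO moviegoer_set VALUES (?, ?)",
        [(id_, m) for m in sorted(movies)],
    )


def movies_of(conn, id_):
    """Reassemble a set with a key - foreign key join."""
    rows = conn.execute(
        "SELECT s.movie_name FROM moviegoer_base b "
        "JOIN moviegoer_set s ON b.id = s.id WHERE b.id = ?",
        (id_,),
    )
    return {r[0] for r in rows}


insert_moviegoer(conn, 1, "Ann", {"movieA50061", "movieA50062"})
```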


Nested External Representation

• Set-valued attributes are stored in an auxiliary relation
• Set instances are nested in the auxiliary relation
• Uses key - foreign key
• Number of tuples is the same as in the base relation
• Requires a join


Nested External Representation

• Example
– Moviegoer(name, street, city, state, zip, {movies})
• Base Relation
– Moviegoer-Base(name, street, city, state, zip, id)
• Set Relation
– Moviegoer-Set(id, {movies})
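For contrast with the unnested form, here is a hedged sketch of the nested external layout: the auxiliary relation holds exactly one row per base tuple, with the whole set in a single column. Serializing the set as JSON text is an assumption made only so the example runs on `sqlite3`; the underscore names and the helpers `store` and `load` are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moviegoer_base (id INTEGER PRIMARY KEY, name TEXT);
    -- Exactly one row per base tuple; the whole set lives in one column.
    CREATE TABLE moviegoer_set (
        id INTEGER PRIMARY KEY REFERENCES moviegoer_base(id),
        movies TEXT
    );
""")


def store(conn, id_, name, movies):
    """Write the base tuple and its set as one row each."""
    conn.execute("INSERT INTO moviegoer_base VALUES (?, ?)", (id_, name))
    conn.execute(
        "INSERT INTO moviegoer_set VALUES (?, ?)",
        (id_, json.dumps(sorted(movies))),
    )


def load(conn, id_):
    """A join is still needed, but it touches one set row per base tuple."""
    row = conn.execute(
        "SELECT b.name, s.movies FROM moviegoer_base b "
        "JOIN moviegoer_set s ON b.id = s.id WHERE b.id = ?",
        (id_,),
    ).fetchone()
    return row[0], set(json.loads(row[1]))


store(conn, 1, "Ann", {"movieA50061", "movieA50062"})
store(conn, 2, "Bob", {"movieA50061"})
```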


Indexed Variants

• Augmentation with indexes
• Nested representations
– Unnested and unclustered index
• Unnested representations
– Clustered index
– Unclustered index
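One way to picture an unnested, unclustered index over nested sets is an element-to-tuple-id map, sketched below as an in-memory inverted index. This is an illustrative assumption about the index shape, not the structure used in the experiments; `build_element_index`, `lookup_all`, and `lookup_any` are hypothetical names.

```python
from collections import defaultdict


def build_element_index(sets_by_id):
    """Map each set element to the ids of the tuples whose set contains it."""
    index = defaultdict(set)
    for tuple_id, elements in sets_by_id.items():
        for element in elements:
            index[element].add(tuple_id)
    return index


def lookup_all(index, wanted):
    """Ids whose set contains every wanted element (conjunctive predicate).

    An empty predicate returns an empty result here for simplicity.
    """
    postings = [index.get(e, set()) for e in wanted]
    return set.intersection(*postings) if postings else set()


def lookup_any(index, wanted):
    """Ids whose set contains at least one wanted element (disjunctive predicate)."""
    return set().union(*(index.get(e, set()) for e in wanted))
```

The index entries are per element (unnested) while the sets themselves stay nested in the tuples, so a lookup yields candidate tuple ids without scanning every set.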


Performance - Settings

• Implementation in Paradise - Set ADT
• Intel Pentium 333 MHz - Solaris 2.6
• Main memory - 128 MB
• Buffer pool size - 32 MB
• Used raw disks of size 4 GB
• Each experiment was run against a cold database


Performance - Experimental Schema

Moviegoer(name, street, city, state, zipcode, {movies})

– Average tuple size: 68 bytes
– Number of base relation tuples: 10,000
– Number of set elements: 1,000,000
– Set element size: 20 bytes
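A quick back-of-the-envelope check of what these figures imply. It assumes the 68-byte average excludes the set-valued attribute and ignores all page and record overhead, so the byte totals are rough lower bounds rather than measured sizes.

```python
base_tuples = 10_000
set_elements = 1_000_000
element_size = 20   # bytes per set element
tuple_size = 68     # bytes per base tuple, assumed to exclude the set

# Average number of set elements per base tuple.
avg_cardinality = set_elements // base_tuples

# Raw payload sizes, ignoring any storage overhead.
set_bytes = set_elements * element_size
base_bytes = base_tuples * tuple_size

print(avg_cardinality, set_bytes, base_bytes)
```

With these numbers the average cardinality comes to 100, and the raw set data (about 20 MB) is roughly thirty times larger than the raw base data (about 0.68 MB).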


Performance - Queries

• Queries run:
– Conjunctive queries
– Disjunctive queries
– Queries not referring to the set-valued attribute
• Sets not in the result
• Sets in the result


Performance - Parameters Varied

• Cardinality
• Selectivity of the predicate
• Number of elements in the predicate
• Size of each set element


Conjunctive Queries

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m
WHERE { "movieA50061", "movieA50062" } SUBSET OF m.movies

No Set in the Result

SELECT m.name, m.street, m.city, m.state, m.zipcode, m.movies
FROM Moviegoer m
WHERE { "movieA50061", "movieA50062" } SUBSET OF m.movies

Set in the Result
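The semantics of the SUBSET OF predicate on a nested set can be sketched in a few lines of Python. This models only the predicate, not the execution strategy of any engine; the function `conjunctive_select` and the sample rows are hypothetical.

```python
def conjunctive_select(rows, required, include_set=False):
    """Keep rows whose "movies" set contains every required element."""
    required = frozenset(required)
    result = []
    for row in rows:
        if required <= row["movies"]:  # required SUBSET OF row["movies"]
            # Drop the set column unless the query asks for it.
            out = {k: v for k, v in row.items() if include_set or k != "movies"}
            result.append(out)
    return result


moviegoers = [
    {"name": "Ann",
     "movies": frozenset({"movieA50061", "movieA50062", "movieA50070"})},
    {"name": "Bob", "movies": frozenset({"movieA50061"})},
]
```

The `include_set` flag mirrors the two query variants: the predicate is the same, and only the projected columns differ.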


Disjunctive Queries

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m
WHERE "movieA50061" IN m.movies OR "movieA50062" IN m.movies

No Set in the Result

SELECT m.name, m.street, m.city, m.state, m.zipcode, m.movies
FROM Moviegoer m
WHERE "movieA50061" IN m.movies OR "movieA50062" IN m.movies

Set in the Result
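The disjunctive predicate is true as soon as any one of the listed elements is found in the set. A minimal sketch of that semantics, with a hypothetical function name and sample data:

```python
def disjunctive_select(rows, wanted):
    """Names of rows whose "movies" set shares at least one element with wanted."""
    wanted = frozenset(wanted)
    return [
        row["name"] for row in rows
        if not wanted.isdisjoint(row["movies"])  # e1 IN movies OR e2 IN movies ...
    ]


moviegoers = [
    {"name": "Ann", "movies": frozenset({"movieA50061"})},
    {"name": "Bob", "movies": frozenset({"movieA50062", "movieA50070"})},
    {"name": "Cy", "movies": frozenset({"movieA50063"})},
]
```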


No Set in Result - Varying Cardinality

[Figure: response time (sec) versus cardinality (0 to 120) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% for a six-element predicate query.]


No Set in Result - Varying Selectivity

[Figure: response time (sec) versus selectivity (0.01%, 0.1%, 1%, 10%, 25%, 50%) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Six-element predicate query with cardinality of 100.]


No Set in Result - Varying Number of Elements in Predicate

[Figure: response time (sec) versus number of elements in the predicate (1, 2, 4, 6) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% with cardinality of 100.]


No Set in Result - Varying Size of Set Element

[Figure: response time (sec) versus size of set element (11, 20, 30) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% with cardinality of 100.]


Queries Not Referring to the Set-Valued Attribute

SELECT m1.name, m1.street, m1.city, m1.state, m1.zipcode
FROM Moviegoer m1, Moviegoer m2
WHERE m1.id = m2.id

Join Query

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m

Select Query


Select Query

[Figure: response time (sec) versus cardinality (10, 25, 50, 100) for Unnested External, Nested External, and Nested Internal.]


Conclusions and Future Work

• Nested representations perform better for set oriented selection predicates

• Indexes on nested representations are more effective than indexes on unnested representations

• Evaluation of these representations for nested set joins

• Specialized operators for nested representations


Unnested External Representation

• Ability to handle any cardinality
• Slides easily into an existing relational engine
• Set operations might be inefficient since elements are scattered
• Keys add overhead when set elements are small
• Cardinality explosion


No Set in Result - Cost Breakdown

[Figure: two panels, Nested Internal and Unnested External, each showing response time (sec) versus cardinality (10, 25, 50, 100), broken down into I/O cost, buffer pool cost, predicate evaluation cost, and other system cost. Selectivity of 1% for a six-element predicate query.]


Conjunctive Queries - Unnested External

SELECT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")
GROUP BY mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
HAVING count(*) = 2

No Set in the Result


Conjunctive Queries - Unnested External

SELECT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms1, Moviegoer-Set ms2
WHERE mb.set-id = ms1.set-id AND mb.set-id = ms2.set-id AND
      ms1.movie-name = "movieA50061" AND ms2.movie-name = "movieA50062"

No Set in the Result
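The two formulations of the conjunctive query on the unnested external representation (GROUP BY with HAVING, and one self-join per predicate element) can be compared on a toy database. The sketch below runs both on `sqlite3`; the underscore identifiers, the single-quoted literals, the reduced column list, and the sample rows are assumptions made so the SQL is valid in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moviegoer_base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE moviegoer_set (set_id INTEGER, movie_name TEXT);
    INSERT INTO moviegoer_base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO moviegoer_set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'),
        (2, 'movieA50061'),
        (3, 'movieA50062'), (3, 'movieA50063');
""")

# Count the matching elements per tuple; a full match has one row per
# predicate element (this relies on sets holding no duplicate elements).
GROUP_BY_FORM = """
    SELECT mb.set_id, mb.name
    FROM moviegoer_base mb, moviegoer_set ms
    WHERE mb.set_id = ms.set_id
      AND (ms.movie_name = 'movieA50061' OR ms.movie_name = 'movieA50062')
    GROUP BY mb.set_id, mb.name
    HAVING count(*) = 2
"""

# One copy of the set relation per predicate element.
SELF_JOIN_FORM = """
    SELECT mb.set_id, mb.name
    FROM moviegoer_base mb, moviegoer_set ms1, moviegoer_set ms2
    WHERE mb.set_id = ms1.set_id AND mb.set_id = ms2.set_id
      AND ms1.movie_name = 'movieA50061'
      AND ms2.movie_name = 'movieA50062'
"""

group_by_result = conn.execute(GROUP_BY_FORM).fetchall()
self_join_result = conn.execute(SELF_JOIN_FORM).fetchall()
```

On this data both forms select only the moviegoer who has seen both movies. The self-join form needs one extra join per predicate element, while the grouped form keeps a single join and changes only the constant in the HAVING clause.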


Conjunctive Queries - Unnested External

INSERT INTO temp
SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")
GROUP BY mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
HAVING count(*) = 2

SELECT t.name, t.street, t.city, t.state, t.zipcode, ms.movie-name
FROM temp t, Moviegoer-Set ms
WHERE t.set-id = ms.set-id

Set in the Result
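The two-step plan (materialize the qualifying tuples, then join back to the set relation to fetch every element) can be tried out as follows. This sketch substitutes SQLite's `CREATE TEMP TABLE ... AS` for the `INSERT INTO temp` step; the table name `temp_match`, the underscore identifiers, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moviegoer_base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE moviegoer_set (set_id INTEGER, movie_name TEXT);
    INSERT INTO moviegoer_base VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO moviegoer_set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'), (1, 'movieA50070'),
        (2, 'movieA50061');
""")

# Step 1: materialize the tuples that satisfy the conjunctive predicate.
conn.execute("""
    CREATE TEMP TABLE temp_match AS
    SELECT mb.set_id, mb.name
    FROM moviegoer_base mb, moviegoer_set ms
    WHERE mb.set_id = ms.set_id
      AND (ms.movie_name = 'movieA50061' OR ms.movie_name = 'movieA50062')
    GROUP BY mb.set_id, mb.name
    HAVING count(*) = 2
""")

# Step 2: join back to the set relation to return the complete set,
# including elements that were not named in the predicate.
rows = conn.execute("""
    SELECT t.name, ms.movie_name
    FROM temp_match t, moviegoer_set ms
    WHERE t.set_id = ms.set_id
    ORDER BY ms.movie_name
""").fetchall()
```

The second join is needed because the first step filters the set relation down to the predicate elements, so the rest of each qualifying set has to be read again.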


Disjunctive Queries - Unnested External

SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")

No Set in the Result


Disjunctive Queries - Unnested External

SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode, ms2.movie-name
FROM Moviegoer-Base mb, Moviegoer-Set ms1, Moviegoer-Set ms2
WHERE mb.set-id = ms1.set-id AND ms1.set-id = ms2.set-id AND
      (ms1.movie-name = "movieA50061" OR ms1.movie-name = "movieA50062")

Set in the Result
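A runnable sketch of this disjunctive query with the set in the result, again on `sqlite3` with underscore identifiers, a reduced column list, and made-up sample rows (all assumptions). It also counts the rows produced without DISTINCT, to show why the keyword matters when a tuple matches more than one predicate element.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE moviegoer_base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE moviegoer_set (set_id INTEGER, movie_name TEXT);
    INSERT INTO moviegoer_base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO moviegoer_set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'),
        (2, 'movieA50061'), (2, 'movieA50070'),
        (3, 'movieA50063');
""")

# ms1 tests the predicate; ms2 fetches every element of a qualifying set.
QUERY_BODY = """
    FROM moviegoer_base mb, moviegoer_set ms1, moviegoer_set ms2
    WHERE mb.set_id = ms1.set_id AND ms1.set_id = ms2.set_id
      AND (ms1.movie_name = 'movieA50061' OR ms1.movie_name = 'movieA50062')
"""

rows = conn.execute(
    "SELECT DISTINCT mb.set_id, mb.name, ms2.movie_name"
    + QUERY_BODY
    + "ORDER BY mb.set_id, ms2.movie_name"
).fetchall()

# Without DISTINCT, a tuple matching k predicate elements is repeated k times.
rows_without_distinct = conn.execute(
    "SELECT count(*)" + QUERY_BODY
).fetchone()[0]
```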