Chapter 5: Algebraic and Logical Query Languages (p. 54 added; p. 61 updated)

Post on 21-Dec-2015

Chapter 5

Algebraic and Logical Query Languages

(p. 54 added; p. 61 updated)

5.1 Relational Operations on Bags

• What is a bag?

Bags

• What is a bag?
• A bag (multiset) is a relation that may (or may not) have duplicate tuples.
• Example:

A  B
1  2
3  4
1  2
1  2

Bags (continued)

• Since the tuple (1,2) appears three times, this relation is a bag and not a set.

5.1.1 Why Bags?

Speed

• Example: Suppose I want the projection onto A and B of the following relation.

A  B  C
1  2  5
3  4  6
1  2  7
1  2  8

I simply cut attribute C and get the result:

A  B
1  2
3  4
1  2
1  2

I created a table and copied A and B to it. Simple and fast!

Now suppose I wanted a set with no duplicates.

I will have to take the first tuple and put it in the result:

A  B
1  2

I will then have to read the second tuple, (3,4), and compare it against the first.

A  B
1  2
3  4

• Since they are different, I will include this tuple in the result and get:

A  B
1  2
3  4

Now I will read the third tuple, (1,2), and compare it against the tuples already in the result.

A  B
1  2
3  4
1  2

• Since it is the same as the first tuple, I will not include it.

• The point is that I had to do a lot of work. Each new tuple has to be compared with the tuples already in the result before I can add it to the set. Hence duplicate elimination is time-consuming.
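The difference in effort can be sketched in a few lines of Python. This is only an illustration of the walk-through above, using the same four tuples; the function names and the comparison counter are assumptions introduced here, and a real system would typically use sorting or hashing rather than this naive scan.

```python
# Rows of the example relation R(A, B, C) from the "Why Bags?" slides.
R = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 8)]

def project_bag(rows):
    """Bag projection onto (A, B): one pass, no tuple is compared with another."""
    return [(a, b) for (a, b, _c) in rows]

def project_set_naive(rows):
    """Set projection onto (A, B): every new tuple is checked against the result so far."""
    result = []
    comparisons = 0
    for (a, b, _c) in rows:
        duplicate = False
        for seen in result:
            comparisons += 1
            if seen == (a, b):
                duplicate = True
                break
        if not duplicate:
            result.append((a, b))
    return result, comparisons

bag_result = project_bag(R)
set_result, comparisons = project_set_naive(R)
print(bag_result)               # duplicates kept
print(set_result, comparisons)  # duplicates removed, plus the number of comparisons made
```

On four tuples the extra work is small, but in this naive version the number of comparisons grows roughly with the square of the relation size.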

Another reason to use bags

• Suppose I would like to calculate the average of attribute A. Suppose further that A = revenue in millions of dollars.

A  B  C
1  2  5
3  4  6
1  2  7
1  2  8

Then the average over the set of A-values {1, 3} is 2, while the actual average over the bag is (1+3+1+1)/4 = 1.5. This is a substantial difference.
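The two averages can be checked directly. A minimal sketch, assuming the A column is held in a plain Python list (the variable names are illustrative):

```python
# Column A of the example relation, duplicates included (a bag).
a_values = [1, 3, 1, 1]

# Average over the bag: every occurrence counts.
bag_average = sum(a_values) / len(a_values)

# Average over the set: duplicates are collapsed first.
distinct_values = set(a_values)
set_average = sum(distinct_values) / len(distinct_values)

print(bag_average)  # 1.5
print(set_average)  # 2.0
```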

5.1 Relational Operations on Bags

• 5.1.2 Union, Intersection, and Difference of Bags

Union of bags R ∪ S

• R ∪ S is the same as the regular union, except that duplicates are kept: if the tuple t appears n times in R and m times in S, it appears n + m times in R ∪ S.

• Suppose we have two grocery bags. One has two boxes of Oreos; the second has five boxes of Oreos. I consolidate the two bags into one bag with 2 + 5 = 7 Oreo boxes.

Intersection of Bags R ∩ S

• If the tuple t appears n times in R and m times in S, then t appears min(m, n) times in the intersection.

• That is because the intersection consists of what R and S have in common, and the two relations have exactly min(m, n) copies of t in common.

The difference of R and S

• Each occurrence of t in S cancels one occurrence of t in R. The output contains the "left over" copies of t: if t appears n times in R and m times in S, it appears max(0, n - m) times in R - S.

Examples of union, intersection and difference on bags

• Let R be the relation (bag):

A  B
1  2
3  4
1  2
1  2

• Let S be the relation (bag) below:

A  B
1  2
3  4
3  4
5  6

Then the bag union R ∪ S is simply the two tables written together.

A  B
1  2
3  4
1  2
1  2
1  2
3  4
3  4
5  6

Intersection of bags R ∩ S

A  B
1  2
3  4

The difference of bags R and S, R - S

A  B
1  2
1  2
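The three bag operations on R and S can be reproduced with Python's `collections.Counter`, which stores each tuple together with its number of occurrences. This is a sketch for checking the tables above, not how a database engine implements them.

```python
from collections import Counter

# Each bag maps a tuple (A, B) to the number of times it occurs.
R = Counter({(1, 2): 3, (3, 4): 1})
S = Counter({(1, 2): 1, (3, 4): 2, (5, 6): 1})

bag_union = R + S          # counts add up: n + m
bag_intersection = R & S   # minimum of the counts: min(n, m)
bag_difference = R - S     # counts subtract, never below zero: max(0, n - m)

print(sorted(bag_union.elements()))
print(sorted(bag_intersection.elements()))
print(sorted(bag_difference.elements()))
```

Note that (3,4) disappears from R - S because its single copy in R is cancelled by one of the two copies in S.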

5.1 Relational Operations on Bags

5.1.3 Projection of Bags

It has been explained previously: simply cut the unwanted attributes and keep any duplicates.

5.1 Relational Operations on Bags

• 5.1.4 Selection on Bags

Selection on bags

• Let R be the bag

A  B  C
1  2  5
3  4  6
1  2  7
1  2  8

σC>=6(R)

σC>=6(R)

A  B  C
3  4  6
1  2  7
1  2  8

Since the result is a bag, we allow duplicates: the selection condition is applied to each tuple independently and no duplicate elimination is performed.
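A selection on a bag is just a filter applied tuple by tuple. A minimal sketch, with the hypothetical helper `select` standing in for σ:

```python
# The bag R(A, B, C) from the slide.
R = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 8)]

def select(rows, predicate):
    """Keep every tuple that satisfies the predicate; duplicates are not removed."""
    return [row for row in rows if predicate(row)]

# Selection with the condition C >= 6 (C is the third component).
selected = select(R, lambda row: row[2] >= 6)
print(selected)
```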

5.1 Relational Operations on Bags

• 5.1.5 Product of Bags

Product on bags R × S

Bag R

A  B
1  2
1  2

Bag S

B  C
2  3
4  5
4  5

Product on bags

• As we learned earlier, each row from R has to be paired with ALL rows in S.

A  R.B  S.B  C
1  2    2    3
1  2    2    3
1  2    4    5
1  2    4    5
1  2    4    5
1  2    4    5

Product on bags (continued)

• Notice that in the above bag we again used the naming convention for attributes: B appears in both relations, so we call the two columns R.B and S.B.

• Equivalent Relation
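The bag product can be checked with `itertools.product`, which pairs every row of R with every row of S. A small sketch using the two bags from the slides:

```python
from collections import Counter
from itertools import product

R = [(1, 2), (1, 2)]          # columns (A, B)
S = [(2, 3), (4, 5), (4, 5)]  # columns (B, C)

# Each result row has the columns (A, R.B, S.B, C).
bag_product = [r + s for r, s in product(R, S)]

print(len(bag_product))      # 2 rows of R times 3 rows of S = 6
print(Counter(bag_product))  # how often each combined row occurs
```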

5.1 Relational Operations on Bags

• 5.1.6 Joins of Bags

Joins of bags ⋈

• We compare each tuple of one relation with each tuple of the other, decide whether or not this pair of tuples joins successfully, and if so we put the resulting tuple in the answer. When constructing the answer we do not eliminate duplicates.

Example of joins in bags ⋈

Relation R

A  B
1  2
1  2

Relation S

B  C
2  3
4  5
4  5

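Result of R ⋈ S

A  B  C
1  2  3
1  2  3

Please notice that, unlike in the product, the shared attribute B appears only once in the result. We are "naturally" joining the relations on their common attribute. Think of it as a transitive rule: A is linked to B in R and B is linked to C in S, so A is linked to C. Each of the two copies of (1,2) in R pairs with the single tuple (2,3) in S, so (1,2,3) appears twice.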
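A natural join on bags can be sketched as a nested loop that keeps a pair whenever the shared attribute agrees. The column layout R(A, B) and S(B, C) is taken from the result table above; everything else is illustrative.

```python
R = [(1, 2), (1, 2)]          # columns (A, B)
S = [(2, 3), (4, 5), (4, 5)]  # columns (B, C)

# Pair every tuple of R with every tuple of S and keep the pairs that agree on B.
# The shared attribute B is written only once in the output.
natural_join = [(a, b, c) for (a, b) in R for (b_s, c) in S if b == b_s]

print(natural_join)
```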

Theta-join in bags

1. Find the product of the two relations.
2. Select only those tuples that comply with the condition.
3. Allow duplicates.

The symbol is ⋈ with the condition C written beneath it.

Example theta-join

Relation R

A  B
1  2
1  2

Relation S

B  C
2  3
4  5
4  5

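The theta-join of R and S with the condition R.B < S.B

Start from the product and test each row against the condition:

A  R.B  S.B  C
1  2    2    3    2 < 2 is false, hence rejected
1  2    2    3    2 < 2 is false, hence rejected
1  2    4    5    2 < 4, hence selected
1  2    4    5    2 < 4, hence selected
1  2    4    5    2 < 4, hence selected
1  2    4    5    2 < 4, hence selected

The result is the bag of the four selected rows.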
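The same theta-join can be written as "product, then filter". A minimal sketch with the example bags (the variable names are assumptions made for the illustration):

```python
R = [(1, 2), (1, 2)]          # columns (A, B)
S = [(2, 3), (4, 5), (4, 5)]  # columns (B, C)

# Form every pair (the product) and keep those satisfying R.B < S.B.
# Output columns: (A, R.B, S.B, C). Duplicates are kept.
theta_join = [(a, r_b, s_b, c) for (a, r_b) in R for (s_b, c) in S if r_b < s_b]

print(theta_join)
```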

5.1 Relational Operations on Bags

1. Exercise from previous section
• (Team 3/7) P52: upload Fig 2.20-21 into your Oracle database (submit the source code, i.e. the create and insert statements, to the grader)

5.1.7 Exercises for Section 5.1

• Ex 5.1.1 (3/7)
• Ex 5.1.4: assigned in class. List them in the algebraic law table.

5.2 Extended Operators of Relational Algebra

• 5.2.1 Duplicate Elimination

The duplicate-elimination operator δ

• Turns a bag into a set.
• Eliminates all but one copy of each tuple.

Relation R

A  B
1  2
3  4
1  2
1  2

Apply the duplicate-elimination operator to R: δ(R)

A  B
1  2
3  4

(δ is the Greek letter delta)
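In Python, one simple way to mimic δ is to pass the rows through `dict.fromkeys`, which keeps the first copy of each tuple and preserves the original order. This is only a sketch; the relational operator itself does not promise any order.

```python
R = [(1, 2), (3, 4), (1, 2), (1, 2)]

# dict keys are unique, so all but the first copy of each tuple are dropped.
delta_R = list(dict.fromkeys(R))

print(delta_R)
```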

5.2 Extended Operators of Relational Algebra

• 5.2.2 Aggregation Operators

Aggregation operators

• Aggregation operators apply to attributes (columns) of relations. Examples of aggregation operators are sums and averages.

Example of aggregation operators

Relation R

A  B
1  2
3  4
1  2
1  2

1. SUM(B) = 2+4+2+2 = 10
2. AVG(A) = (1+3+1+1)/4 = 1.5
3. MIN(A) = 1
4. MAX(B) = 4
5. COUNT(A) = 4 (the number of values in column A, duplicates included)
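The five aggregates can be verified with Python's built-in functions. A small sketch over the same relation (variable names are illustrative):

```python
R = [(1, 2), (3, 4), (1, 2), (1, 2)]  # columns (A, B)

a_values = [a for (a, _b) in R]
b_values = [b for (_a, b) in R]

sum_b = sum(b_values)                  # SUM(B)
avg_a = sum(a_values) / len(a_values)  # AVG(A)
min_a = min(a_values)                  # MIN(A)
max_b = max(b_values)                  # MAX(B)
count_a = len(a_values)                # COUNT(A), duplicates included

print(sum_b, avg_a, min_a, max_b, count_a)  # 10 1.5 1 4 4
```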

5.2 Extended Operators of Relational Algebra

• 5.2.3 Grouping

Grouping

• Grouping of tuples according to their value in one or more attributes has the effect of partitioning the tuples of a relation into groups.

Example of grouping

Studio name  Length
Disney       123
MGM          345
Century Fox  678
Century Fox  900
MGM          23

Suppose we use the aggregation SUM(length). This aggregation gives us the sum of the whole column.

Example of grouping (continued)

• But suppose we want to know the total number of minutes of movies produced by each studio.

• Then we must have sub-tables within the table. Each sub-table represents a studio.

• We will do that by grouping by studio name.

• Now we can apply the aggregation operator SUM(length) to each group.

Example of grouping (continued)

Studio name  Length
MGM          23
MGM          345
Century Fox  678
Century Fox  900
Disney       123

Now the table is grouped by studio name and we can apply the aggregation operator SUM(length) to each group: MGM 368, Century Fox 1578, Disney 123.
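Grouping followed by a per-group sum can be sketched with a dictionary keyed by studio name. The rows are those of the example table; the variable names are assumptions.

```python
movies = [
    ("Disney", 123),
    ("MGM", 345),
    ("Century Fox", 678),
    ("Century Fox", 900),
    ("MGM", 23),
]

# Group by studio name and accumulate SUM(length) within each group.
total_length = {}
for studio, length in movies:
    total_length[studio] = total_length.get(studio, 0) + length

print(total_length)
```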

5.2 Extended Operators of Relational Algebra

• 5.2.4 The Grouping Operator

The grouping operator γ

• Given the schema StarsIn(title, year, starName)

• We would like to find the starName of each star who appeared in at least three movies, and the earliest year in which they appeared.

• How can we approach this problem?

Grouping operator (continued)

• We must first group by starName. It is very intuitive: we want to partition the table by star, and then we can do all the tests for each star.

• In relational algebra we write γ starName (γ means "group by").

• Below is the table grouped by starName.

MOVIETITLE              MOVIEYEAR  STARNAME
Blood Diamond           2006       Leonardo Dicaprio
The Quick and the Dead  1995       Leonardo Dicaprio
Titanic                 1997       Leonardo Dicaprio
The Departed            2006       Leonardo Dicaprio
Body of lies            2008       Leonardo Dicaprio
Inception               2010       Leonardo Dicaprio
Somersault              2004       Samuel Henry
Macbeth                 2006       Samuel Henry
Love my Way             2006       Samuel Henry
The Great Raid          2005       Samuel Henry
Terminator Salvation    2009       Samuel Henry
Avatar                  2009       Samuel Henry
Perseus                 2010       Samuel Henry
Autumn in               2000       Vera A Farmiga
Dust                    2001       Vera A Farmiga
Mind the Gap            2004       Vera A Farmiga

• Notice that in the above table there are three groups, one for each star.

• Now for each group we are interested in the first year in which the star appeared, and we would like to know whether the star played in 3 or more movies.

• We will use the aggregations min(year) and count(title), with the test count(title) >= 3.

The grouping operator (continued)

How do these aggregations work?

• In each group separately we look for the minimum year.
• In each group we count the number of titles in this group.
• If the number of titles in a group is at least 3, then this group is sent to the output; otherwise this group is eliminated.

Final statement in the grouping operator

• γ starName, min(year)->minYear, count(title)->ctTitle (StarsIn)

Reading the operator from left to right:

• starName: group by starName
• min(year)->minYear: then find the minimum year of each group and call it minYear
• count(title)->ctTitle: count the number of titles in each group and call it ctTitle

Final statement in the grouping operator

• π starName, minYear (σ ctTitle>=3 (γ starName, min(year)->minYear, count(title)->ctTitle (StarsIn)))

See Fig 5.5 for the expression tree.
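The three stages of this query (group, filter on the count, project) can be sketched in Python as follows. The rows are a small subset of the StarsIn table shown earlier, chosen only to keep the example short, so one star falls below the threshold here even though the full table lists more movies for him. The dictionary layout and variable names are assumptions.

```python
# Subset of StarsIn(title, year, starName), for illustration only.
stars_in = [
    ("Titanic", 1997, "Leonardo Dicaprio"),
    ("Inception", 2010, "Leonardo Dicaprio"),
    ("Somersault", 2004, "Samuel Henry"),
    ("Macbeth", 2006, "Samuel Henry"),
    ("Avatar", 2009, "Samuel Henry"),
    ("Autumn in", 2000, "Vera A Farmiga"),
    ("Dust", 2001, "Vera A Farmiga"),
    ("Mind the Gap", 2004, "Vera A Farmiga"),
]

# Grouping step: one entry per starName holding min(year) and count(title).
groups = {}
for title, year, star in stars_in:
    min_year, ct_title = groups.get(star, (year, 0))
    groups[star] = (min(min_year, year), ct_title + 1)

# Selection step (ctTitle >= 3), then projection onto (starName, minYear).
answer = [(star, min_year)
          for star, (min_year, ct_title) in groups.items()
          if ct_title >= 3]

print(answer)
```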

5.2 Extended Operators of Relational Algebra

• 5.2.5 Extending the Projection Operator

Extending the projection operator

• We can include renaming and arithmetic operators in a projection.

Example: π A, B+C->X

• π: projection
• A: keep attribute A
• B+C->X: add the values in B and C, and rename the result to X

Extending the projection operator (continued)

Relation R

A  B  C
0  1  2
0  1  2
3  4  5

Relation S = π A, B+C->X (R)

A  X
0  3
0  3
3  9

X = B + C
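The extended projection is a per-tuple computation, so it maps naturally onto a list comprehension. A minimal sketch with the relation R from this slide:

```python
R = [(0, 1, 2), (0, 1, 2), (3, 4, 5)]  # columns (A, B, C)

# Extended projection: keep A, compute B + C and call the new column X.
S = [(a, b + c) for (a, b, c) in R]    # columns (A, X)

print(S)
```

The duplicate (0, 3) stays in the result because projection on bags does not eliminate duplicates.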

5.2 Extended Operators of Relational Algebra

• 5.2.6 The Sorting Operator τ

The sorting operator τ turns a relation into a list of tuples, sorted according to one or more attributes.
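As a rough illustration, sorting on several attributes corresponds to sorting with a composite key. The sketch below sorts the relation R(A, B, C) used earlier in this chapter by A and then by C; the choice of sort attributes is an assumption made only for this example.

```python
from operator import itemgetter

R = [(1, 2, 5), (3, 4, 6), (1, 2, 7), (1, 2, 8)]  # columns (A, B, C)

# Sort by A first; ties on A are broken by C.
sorted_R = sorted(R, key=itemgetter(0, 2))

print(sorted_R)
```

The result is a list, so the order of the tuples is now meaningful, unlike in a set or a bag.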

External Merge Sort

• How do you sort 4000 students using only one classroom that can hold only 40 students?

1. Fill the classroom with 40 students and let them line up alphabetically; then move the sorted group out. Repeat until every student belongs to a sorted group.
2. So we have 100 sorted groups.
3. Line up two sorted groups in front of the classroom.
4. Of the two "head" students, the one who comes first alphabetically walks into the classroom and sits at the first free seat.

External Merge Sort (continued)

1. Students continue to walk in, one head student at a time, until all seats are occupied.
2. Once the 40 seats are full, let them go out in order and start filling the room again.
3. When both groups are used up and everyone has moved out, we have one larger group of sorted students.
4. Continue merging groups in this way until only one sorted group remains.
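The classroom analogy can be sketched in Python. This is a simplified in-memory illustration under stated assumptions: `capacity` plays the role of the 40 seats, lists stand in for the groups waiting outside (on disk, in a real system), and the function names are hypothetical. It does not model the output buffer being emptied every 40 students.

```python
def make_sorted_runs(items, capacity):
    """Phase 1: sort one room-full at a time, producing sorted groups (runs)."""
    return [sorted(items[i:i + capacity]) for i in range(0, len(items), capacity)]

def merge_two(left, right):
    """Phase 2 step: repeatedly take the smaller of the two 'head' elements."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one group is exhausted; the rest is already sorted
    merged.extend(right[j:])
    return merged

def external_merge_sort(items, capacity):
    """Merge the sorted runs pairwise until a single sorted run remains."""
    runs = make_sorted_runs(items, capacity)
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                next_runs.append(merge_two(runs[k], runs[k + 1]))
            else:
                next_runs.append(runs[k])  # odd group out waits for the next round
        runs = next_runs
    return runs[0] if runs else []

students = list(range(4000, 0, -1))  # 4000 "students" in reverse order
initial_runs = make_sorted_runs(students, 40)
print(len(initial_runs))             # 100 sorted groups of 40
result = external_merge_sort(students, 40)
print(result[:5])
```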

5.2 Extended Operators of Relational Algebra

• 5.2.7 Outerjoins

Outer join

• YouTube link
• http://www.youtube.com/watch?v=L5sKDSgPt7M

Outerjoins

A  B  C

B  C  D

Simple outer join

• First find all tuples that agree on the common attributes and pair them. Notice that, unlike in the product, the matching attributes are not repeated in the result.

• Next we deal with tuples that do not agree with any tuple of the other relation. We call these dangling tuples.

• Add the dangling tuples too, and wherever a value is missing put NULL.

Example:

A  B  C
1  2  3
4  5  6
7  8  9

B  C  D
2  3  10
2  3  11
6  7  12

If I were doing a natural join, I would be done here. The only matching tuples are (1,2,3) on the left with (2,3,10) and (2,3,11) on the right.

A  B  C
1  2  3
4  5  6
7  8  9

B  C  D
2  3  10
2  3  11
6  7  12

But what about the dangling tuples, (4,5,6) and (7,8,9) on the left and (6,7,12) on the right? In the outer join we have to account for them too.

Outer joins

A     B  C  D
1     2  3  10
1     2  3  11
4     5  6  NULL
7     8  9  NULL
NULL  6  7  12
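The full outer join above can be sketched as a natural join followed by padding the dangling tuples. In this illustration `None` plays the role of NULL, and the function name and the column layout (A, B, C) and (B, C, D) are assumptions based on the example tables.

```python
def full_outer_join(left_rows, right_rows):
    """Natural join on (B, C), then add dangling tuples padded with None."""
    result = []
    matched_right = set()
    for (a, b, c) in left_rows:
        found = False
        for idx, (b_r, c_r, d) in enumerate(right_rows):
            if (b, c) == (b_r, c_r):
                result.append((a, b, c, d))
                matched_right.add(idx)
                found = True
        if not found:
            result.append((a, b, c, None))      # dangling tuple from the left
    for idx, (b_r, c_r, d) in enumerate(right_rows):
        if idx not in matched_right:
            result.append((None, b_r, c_r, d))  # dangling tuple from the right
    return result

left = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]       # columns (A, B, C)
right = [(2, 3, 10), (2, 3, 11), (6, 7, 12)]   # columns (B, C, D)

for row in full_outer_join(left, right):
    print(row)
```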

Left outer join

• The easiest way of thinking about it is that we must keep all the tuples from the left relation.

A  B  C
1  2  3
4  5  6
7  8  9

B  C  D
2  3  10
2  3  11
6  7  12

Example of left outer join

• Step one: do the natural join. That is, write all the tuples that pair correctly.

A  B  C  D
1  2  3  10
1  2  3  11

Left outer join

• Next look at the left relation and see that the second and third tuples were not used. We must use them.

A  B  C  D
1  2  3  10
1  2  3  11
4  5  6  NULL
7  8  9  NULL

I have now used the left relation fully.

A  B  C
1  2  3
4  5  6
7  8  9

B  C  D
2  3  10
2  3  11
6  7  12

Example of right outer join

• Step one: do the natural join. That is, write all the tuples that pair correctly. This is exactly the same step as in the left outer join. I actually copied and pasted it.

A  B  C  D
1  2  3  10
1  2  3  11

Example of right outer join (continued)

• Now look for the tuples that were not used in the right relation. That is the third tuple, (6,7,12). Add this tuple, padded with NULL, to complete the right outer join.

A     B  C  D
1     2  3  10
1     2  3  11
NULL  6  7  12
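Left and right outer joins differ only in which side's dangling tuples are kept. A hedged sketch with two flags makes that explicit; the function and its parameters `keep_left` and `keep_right` are hypothetical names introduced for this illustration, and `None` stands for NULL.

```python
def outer_join(left_rows, right_rows, keep_left, keep_right):
    """Natural join on (B, C); optionally keep dangling tuples from either side."""
    result = []
    matched_left, matched_right = set(), set()
    for i, (a, b, c) in enumerate(left_rows):
        for j, (b_r, c_r, d) in enumerate(right_rows):
            if (b, c) == (b_r, c_r):
                result.append((a, b, c, d))
                matched_left.add(i)
                matched_right.add(j)
    if keep_left:
        result += [(a, b, c, None)
                   for i, (a, b, c) in enumerate(left_rows) if i not in matched_left]
    if keep_right:
        result += [(None, b, c, d)
                   for j, (b, c, d) in enumerate(right_rows) if j not in matched_right]
    return result

left = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]       # columns (A, B, C)
right = [(2, 3, 10), (2, 3, 11), (6, 7, 12)]   # columns (B, C, D)

left_outer = outer_join(left, right, keep_left=True, keep_right=False)
right_outer = outer_join(left, right, keep_left=False, keep_right=True)
print(left_outer)
print(right_outer)
```

Setting both flags to True gives the full outer join, and setting both to False gives the plain natural join.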

5.2 Extended Operators of Relational Algebra

• 5.2.8 Exercises for Section 5.2

1. Show the commutative law for the Cartesian product by example (p. 25)
2. Exercise 5.2.1 (a), (b)

5.3 A Logic for Relations

• 5.3.1 Predicates and Atoms

Relational Algebra

• Thanks

5.3 A Logic for Relations

• 5.3.2 Arithmetic Atoms

5.3 A Logic for Relations

• 5.3.3 Datalog Rules and Queries

5.3 A Logic for Relations

• 5.3.4 Meaning of Datalog Rules

• 5.3.5 Extensional and Intensional Predicates

5.3 A Logic for Relations

• 5.3.6 Datalog Rules Applied to Bags

5.3 A Logic for Relations

• 5.3.7 Exercises for Section 5.3

5.4 Relational Algebra and Datalog

• 5.4.1 Boolean Operations

5.4 Relational Algebra and Datalog

• 5.4.2 Projection

5.4 Relational Algebra and Datalog

• 5.4.3 Selection

5.4 Relational Algebra and Datalog

• 5.4.4 Product

5.4 Relational Algebra and Datalog

• 5.4.5 Joins

5.4 Relational Algebra and Datalog

• 5.4.6 Simulating Multiple Operations with Datalog

5.4 Relational Algebra and Datalog

• 5.4.7 Comparison Between Datalog and Relational Algebra

5.4 Relational Algebra and Datalog

• 5.4.8 Exercises for Section 5.4

5.5 Summary of Chapter

5.6 References for Chapter 5