joins natural join is obtained by: r natural join s; example select * from moviestar natural join...

25
Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON <condition> Example SELECT * FROM MovieStar JOIN MovieExec ON moviestar.name = movieexec.name; In ORACLE it is used in FROM

Post on 21-Dec-2015

226 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Joins• Natural join is obtained by:

R NATURAL JOIN S;Example

SELECT *FROM MovieStar NATURAL JOIN MovieExec;

• Theta join is obtained by:

R JOIN S ON <condition>

Example

SELECT *

FROM MovieStar JOIN MovieExec

ON moviestar.name = movieexec.name;

In ORACLE it is used in FROM

Page 2: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

OuterjoinsSELECT *

FROM moviestar NATURAL FULL OUTER JOIN movieexec;

SELECT *

FROM moviestar NATURAL LEFT OUTER JOIN movieexec;

SELECT *

FROM moviestar NATURAL RIGHT OUTER JOIN movieexec;

One of LEFT, RIGHT, or FULL before OUTER (but not missing).

LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both.

Page 3: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Subqueries ExampleFind the name of the producer of ‘Star Wars’.

Movie(title, year, length, inColor, studioName, producerC) MovieExec(name, address, cert, netWorth)

We can do:

SELECT name

FROM Movie, MovieExec

WHERE title ='Star Wars' AND producerC =cert;

Or we can have a subquery:

SELECT name

FROM MovieExec

WHERE cert =(SELECT producerC FROM Movie WHERE title = 'Star Wars');

If we can deduce that there will be only a single value produced by the subquery, then we can use this expression, surrounded by parentheses, as if it were a constant.

Page 4: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Conditions Involving Relations

There are a number of SQL operators that we can apply to a relation R and produce a Boolean result.

Typically used in the WHERE clause.

1. EXISTS R is a condition that is true if R is not empty.

2. s IN R is true if s is equal to one of the tuples in R. Likewise, s NOT IN R is true if and only if s is equal to no tuple in R.

Page 5: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

(Continued)3. s > ALL R is true if s is greater than every value in the unary (one column) relation R.

Similarly, the > operator could be replaced by any other comparison operators with the analogous meaning. For instance, s <> ALL R is the same as s NOT IN R.

4. s > ANY R is true if s is greater than at least one value in unary relation R.

Similarly we can use any other comparison operators in place of >. For instance, s =ANY R is the same as s IN R.

EXISTS, ALL, and ANY operators can be negated by putting NOT in front of the entire expression.

Page 6: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

ExampleGive all the producers of movies in which Julia Roberts stars.

SELECT name FROM MovieExecWHERE cert IN

(SELECT producerC FROM Movie WHERE (title, year) IN

(SELECT movieTitle, movieYear FROM StarsIn WHERE starName = 'Julia Roberts'));

Page 7: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

RemarkThe previous nested query can, like many nested queries, be written as a single SELECT-FROM-WHERE expression.

SELECT nameFROM MovieExec, Movie, StarsIn

WHERE cert = producerC AND title = movieTitle AND year = movie Year AND starName = 'Julia Roberts';

Page 8: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Bag Semantics and Union, Intersection and Difference

• Although the SELECT-FROM-WHERE statement uses bag semantics, the default for union, intersection, and difference is set semantics.– That is, duplicates are eliminated as the operation is applied.

Motivation? • When doing projection in relational algebra, it is easier to avoid

eliminating duplicates.– Just work tuple-at-a-time.

• When doing intersection or difference, it is most efficient to sort the relations first.– At that point you may as well eliminate the duplicates anyway.

Page 9: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Controlling Duplicate Elimination• Force the result to be a set by

– SELECT DISTINCT . . .• Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in

– UNION ALL . . .• Only UNION ALL supported in ORACLE.

Example• Find all the different studios producing movies:

– Movie(title, year, length, inColor, studioName, producerC),

SELECT DISTINCT studioname

FROM Movie;

• Notice that without DISTINCT, a studioname would be listed as many times as there were movies from that studio.

Page 10: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Aggregations• SUM, AVG, COUNT, MIN, and MAX can be applied to a column in a SELECT

clause to produce that aggregation on the column.

Example

• Find the average length of movies from Disney.

SELECT AVG(length)

FROM Movie

WHERE studioName = 'Disney';

Remark

• We can also use COUNT(*) which counts the number of tuples in the relation constructed from the FROM and WHERE clauses of the query.

Page 11: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Eliminating Duplicates in an Aggregation• DISTINCT inside an aggregation causes duplicates to be eliminated before

the aggregation.

Example

• Find the number of different producers for Disney movies.

SELECT COUNT(DISTINCT producerc)

FROM Movie

WHERE studioname = 'Disney';

This is not the same as:

SELECT DISTINCT COUNT(producerc)

FROM Movie

WHERE studioname = 'Disney';

DISTINCT here is useless! Why?

Page 12: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Not only in COUNT…

SELECT AVG(DISTINCT length)

FROM Movie

WHERE studioname = 'Disney';

• This will produce the average of only the distinct values for length.

Page 13: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

NULL’s Ignored in Aggregation

• NULL never contributes to a sum, average, or count, and can never be the minimum or maximum of a column.

SELECT SUM(networth)

FROM moviestar NATURAL FULL OUTER JOIN movieexec;

• But if there are no non-NULL values in a column, then the result of the aggregation is NULL.

Page 14: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Example: Effect of NULL’s

SELECT count(*)

FROM Movie

WHERE studioName = 'Disney';

SELECT count(length)

FROM Movie

WHERE studioName = 'Disney';

The number of moviesfrom Disney.

The number of moviesfrom Disney with aknown length.

Page 15: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Grouping• We may follow a SELECT-FROM-WHERE expression by GROUP BY

and a list of attributes.

• The relation that results from the SELECT-FROM-WHERE – is grouped according to the values of all the listed attributes in

GROUP BY, and – any aggregation is applied only within each group.

Example• From the Movie relation, find the average length for each studio.

SELECT studioName, AVG(length)

FROM Movie

GROUP BY studioName;

Page 16: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Another Example

• From Movie and MovieExec, find the producer’s total length of film produced:

SELECT name, SUM(length)

FROM Movie, MovieExec

WHERE producerc = cert

GROUP BY name;

Computethosetuples first,then groupby name.

Page 17: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Restriction on SELECT Lists With Aggregation

• If any aggregation is used, then each element of the SELECT list must be either:

1. Aggregated, or

2. An attribute on the GROUP BY list.

Page 18: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Illegal Query Example• We might think we could find the shortest movie of Disney as:

SELECT title, MIN(length)

FROM Movie

WHERE studioName = 'Disney';• But this query is illegal in SQL. • Because title is neither aggregated nor on the GROUP BY list.

• We should do instead:

SELECT title, length

FROM Movie

WHERE studioName = 'Disney' AND length =

(SELECT MIN(length)

FROM Movie

WHERE studioName = 'Disney');

Page 19: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Or…

SELECT title, length

FROM

Movie NATURAL JOIN

(SELECT MIN(length) AS length

FROM Movie

WHERE studioName = 'Disney')

WHERE studioName = 'Disney';

…resembling Relational Algebra.

Page 20: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

HAVING Clauses• HAVING <condition> may follow a GROUP BY clause.

– If so, the condition applies to each group, and groups not satisfying the condition are eliminated.

Example• Consider again the query

SELECT name, SUM(length)FROM Movie, MovieExecWHERE producerc = cert GROUP BY name;

• Suppose we didn’t wish to include all the producers in our table of aggregated lengths. We want those producers – with networth less than 1,000,000, and– that have at least one movie before 1973.

• SolutionSELECT name, SUM(length)FROM MovieExec, MovieWHERE producerc = cert AND networth<1000000GROUP BY nameHAVING MIN(year) < 1973;

Page 21: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Requirements on HAVING Conditions

• These conditions may refer to any relation in the FROM clause.

• They may refer to attributes of those relations, as long as the attribute makes sense within a group; i.e., it is either:

1. A grouping attribute, or

2. Aggregated attribute.

Page 22: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

“Having” is a special kind of • The previous query can also be written as:

SELECT name, sumLength

FROM (

SELECT name, MIN(year) AS minYear, SUM(length) AS sumLength

FROM MovieExec, Movie

WHERE producerc = cert AND networth < 1000000

GROUP BY name)

WHERE minYear < 1973;

Page 23: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Correlated Subqueries• Suppose StarsIn relation has an additional attribute “salary”

StarsIn(movieTitle, movieYear, starName, salary)

• Now, find the stars who are paid for some movie more than the average salary for that movie.

SELECT starName, movieTitle, movieYear

FROM StarsIn X

WHERE salary >

(SELECT AVG(salary)

FROM StarsIn

WHERE movieTitle = X.movieTitle

AND movieYear=X.movieYear);

Remarks

1. Outer query cannot reference any columns in the subquery.

2. Subquery references the tuple in the outer query.

3. Value of the tuple changes by row of the outer query, so the database must rerun the subquery for each row comparison.

Page 24: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

Another Solution (Nesting in FROM)SELECT X.starName, X.movieTitle, X.movieYear

FROM StarsIn X, (SELECT movieTitle, movieYear, AVG(salary) AS avgSalary

FROM StarsIn

GROUP BY movieTitle, movieYear) Y

WHERE X.salary>Y.avgSalary AND

X.movieTitle=Y.movieTitle AND X.movieYear=Y.movieYear;

Page 25: Joins Natural join is obtained by: R NATURAL JOIN S; Example SELECT * FROM MovieStar NATURAL JOIN MovieExec; Theta join is obtained by: R JOIN S ON Example

ExerciseProduct(maker, model, type)

PC(model, speed, ram, hd, rd, price)

Laptop(model, speed, ram, hd, screen, price)

Printer(model, color, type, price)

1. Find those manufacturers that sell Laptops, but not PC's.

2. Find those hard-disk sizes that occur in two or more PC's.

3. Find those manufacturers of at least two different computers (PC or Laptops) with speed of at least 700.

4. Find the manufacturers who sell exactly three different models of PC.