infs614 - lecture week 9 1 more on sql lecture week 9 infs 614, fall 2008
TRANSCRIPT
INFS614 - Lecture week 9 2
Example Instances
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.558 rusty 10 35.0
sid sname rating age28 yuppy 9 35.031 lubber 8 55.544 guppy 5 35.058 rusty 10 35.0
sid bid day
22 101 10/10/9658 103 11/12/96
R1
S1
S2
INFS614 - Lecture week 9 3
SQL: Seen so far… Syntax of Basic SQL query:
– Relation-list (FROM)– Target-list (SELECT)– Qualification (WHERE)
Semantic: Conceptual Evaluation Strategy; String operators: LIKE, NOT LIKE; Set operators: Union (all), Intersect (all),
Except (all); Qualification involving sets: IN, EXISTS,
UNIQUE, ANY, ALL (and their negated version);
Nested queries.
INFS614 - Lecture week 9 4
Post Processing
Processing on the result of an SQL query:– Sorting: can sort the tuples in the output by any
column (even the ones not appearing in the SELECT clause)
– Duplicate removal– Example:
Aggregation operators
SELECT Distinct S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid and R.bid=103Order by S.sid asc;
INFS614 - Lecture week 9 5
Order By: Example
• For each attribute in the sort list, we can specify its order:• ASC : Ascending Order• DESC : Descending Order
• Default Order is Ascending
SELECT S.sname, S.ageFROM Sailors SWHERE S.sname LIKE ‘%U%’ORDER BY S.age,S.sname
sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.0 32 andy 8 25.5 44 guppy 5 35.0 58 rusty 10 35.0 71 zorba 10 16.0
sid sname rating age 44 guppy 5 35.0 58 rusty 10 35.0 28 yuppy 9 35.0 31 lubber 8 55.0
S.age ASC, S.sname DESC
sid sname rating age 28 yuppy 9 35.0 58 rusty 10 35.0 44 guppy 5 35.0 31 lubber 8 55.0
INFS614 - Lecture week 9 6
Aggregate Operators Significant extension of relational
algebra.
COUNT (*)COUNT ( [DISTINCT] A)SUM ( [DISTINCT] A)AVG ( [DISTINCT] A)MAX (A)MIN (A)
single column
INFS614 - Lecture week 9 7
Aggregate OperatorsCOUNT (*)COUNT ( [DISTINCT] A)SUM ( [DISTINCT] A)AVG ( [DISTINCT] A)MAX (A)MIN (A)
SELECT AVG (S.age)FROM Sailors SWHERE S.rating=10
SELECT COUNT (*)FROM Sailors S
SELECT AVG ( DISTINCT S.age)FROM Sailors SWHERE S.rating=10
SELECT S.snameFROM Sailors SWHERE S.rating= (SELECT MAX(S2.rating) FROM Sailors S2)
single column
SELECT COUNT (DISTINCT S.rating)FROM Sailors SWHERE S.sname=‘Bob’
INFS614 - Lecture week 9 8
Find name and age of the oldest sailor(s)
The first query is illegal! (We’ll look into the reason a bit later, when we discuss GROUP BY.)
The third query is equivalent to the second query, and is allowed in the SQL/92 standard, but is not supported in some systems.
SELECT S.sname, MAX (S.age)FROM Sailors S
SELECT S.sname, S.ageFROM Sailors SWHERE S.age = (SELECT MAX (S2.age) FROM Sailors S2)
SELECT S.sname, S.ageFROM Sailors SWHERE (SELECT MAX (S2.age) FROM Sailors S2) = S.age
INFS614 - Lecture week 9 9
GROUP BY and HAVING
So far, we’ve applied aggregate operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples.
Consider: Find the age of the youngest sailor for each rating level.– In general, we don’t know how many rating levels
exist, and what the rating values for these levels are!
– Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!):
SELECT MIN (S.age)FROM Sailors SWHERE S.rating = i
For i = 1, 2, ... , 10:
INFS614 - Lecture week 9 10
Queries With GROUP BY and HAVING
The target-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)).– The attribute list (i) must be a subset of grouping-list. – Intuitively, each answer tuple corresponds to a group, and
these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list.)
SELECT [DISTINCT] target-listFROM relation-listWHERE qualificationGROUP BY grouping-list[HAVING group-qualification]
INFS614 - Lecture week 9 11
Conceptual Evaluation The cross-product of relation-list is computed, tuples
that fail qualification are discarded, `unnecessary’ fields are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list.
The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!– In effect, an attribute in group-qualification that is not an
argument of an aggregate op also appears in grouping-list. (SQL does not exploit primary key semantics here!)
One answer tuple is generated per qualifying group.
INFS614 - Lecture week 9 12
Query with GROUP BY and HAVING
Query: Find the age of the youngest sailor for each rating level.
SELECT S.rating, MIN (S.age)FROM Sailors SGROUP BY S.rating
INFS614 - Lecture week 9 13
Find the age of the youngest sailor older than 18, for each rating with at least 2 such sailors older than 18
• Only S.rating and S.age are mentioned in the SELECT, GROUP BY or HAVING clauses; other attributes `unnecessary’.
• 2nd column of result is unnamed. (Use AS to name it.)
SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age > 18GROUP BY S.ratingHAVING COUNT (*) > 1
sid sname rating age 22 Dustin 7 450 31 lubber 8 55.0 32 Andy 8 25.5 71 Zorba 10 16.0 64 Horatio 7 35.0 29 brutus 1 33.0 58 rusty 10 35.0 74 Horatio 9 40.0 85 Art 3 25.5 95 Bob 3 63.5
sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.0 32 andy 8 25.5 64 horatio 7 35.0 29 brutus 1 33.0 58 rusty 10 35.0 74 horatio 9 40.0 85 art 3 25.5 95 bob 3 63.5
rating age 1 33.0 3 25.5 3 63.5 7 45.0 7 35.0 8 55.0 8 25.5 9 40.0 10 35.0
Answer relation
rating 3 25.5 7 35.0 8 25.5
INFS614 - Lecture week 9 14
For each red boat, find the number of reservations for this boat
Grouping over a join of two relations.
What do we get if we remove B.color=‘red’ from the WHERE clause and add a HAVING clause with this condition?
SELECT B.bid, COUNT (*) AS rescountFROM Boats B, Reserves RWHERE R.bid=B.bid AND B.color=‘red’GROUP BY B.bid
INFS614 - Lecture week 9 15
Find the age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age)
• HAVING clause can also contain a correlated subquery.
• Compare this with the query where we considered only ratings with 2 sailors over 18!
• What if HAVING clause is replaced by:– HAVING COUNT(*) >1
SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age > 18GROUP BY S.ratingHAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S.rating=S2.rating)
Computes total number of sailors who have rating equal to S.rating
INFS614 - Lecture week 9 16
For each sailor with more than three reservations, find the number of his reservations
SELECT R.sid, COUNT (*) AS rescountFROM Reserves RGROUP BY R.sidHAVING COUNT(*) > 3
Find the age of the youngest sailor, for each rating level > 5 with an average age > 25
SELECT S.rating, MIN(S.age) AS minAgeFROM Sailors SWHERE S.rating > 5GROUP BY S.ratingHAVING AVG(S.age) > 25
SELECT S.rating, MIN(S.age) AS minAgeFROM Sailors SGROUP BY S.ratingHAVING S.rating > 5 AND AVG(S.age) > 25
INFS614 - Lecture week 9 17
Find those ratings for which the average age of sailors is the minimum over all ratings
Aggregate operations cannot be nested! WRONG:
SELECT S.ratingFROM Sailors SWHERE AVG(S.age) = (SELECT MIN (AVG (S2.age))
FROM Sailors S2 GROUP BY S2.rating)
INFS614 - Lecture week 9 18
Continue from previous slide
Correct solution (in SQL/92):
SELECT Temp.rating, Temp.avgageFROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS TempWHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp)
INFS614 - Lecture week 9 19
Note on Oracle SQL in the labs
Do not use the keyword “AS” to name a temporary table within the FROM clause.
INFS614 - Lecture week 9 20
Note on Oracle SQL in the labs
SELECT Temp.rating, Temp.avgageFROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS TempWHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp)
INFS614 - Lecture week 9 21
Conclusions (so far…)
Post processing on the result of queries is supported.
Aggregation is the most complex “post processing”– “Group by” clause partitions the results into
groups– “Having” clause puts condition on groups
(just like Where clause on tuples).
INFS614 - Lecture week 9 22
Null Values
Field values in a tuple are sometimes unknown (e.g., a rating has not been assigned) or inapplicable (e.g., no spouse’s name).
SQL provides a special value null for such situations.
INFS614 - Lecture week 9 23
Dealing with the null value Special operators are needed to check if value
is/is not null. – Attribute_name IS NULL : TRUE if the
attribute’s value is null. FALSE otherwise– Attribute_name IS NOT NULL : TRUE if the
attribute’s value is not null. FALSE otherwise Is rating>8 true or false when rating is equal to
null?– Actually, it’s unknown.
How about: rating > 8 AND age < 40 ?– Three-valued logic.
INFS614 - Lecture week 9 24
Three Valued Logic Logical operators AND, OR and NOT using a 3-
valued logic (TRUE, FALSE and unknown).
INFS614 - Lecture week 9 25
Null Values: Some issuesThe presence of null complicates many issues:
WHERE clause eliminates tuples for which the qualification does not evaluate to TRUE.Important in nested queries involving EXISTS or UNIQUE
Two tuples in SQL are duplicates in a relation if corresponding attributes have the same value or both contains null.
SELECT S1.sname
FROM SAILORS S1
WHERE EXISTS (
SELECT * FROM SAILORS S2
WHERE S1.sname = S2.sname AND
AND S1.SID < S2.SID
AND S1.RATING <> S2.RATING)
sid sname rating age 22 dustin 7 450 31 lubber 8 55.0 64 horatio null 35.0 32 andy 8 25.5 74 horatio 9 40.0
INFS614 - Lecture week 9 26
Issues with the null value: Summary
WHERE and HAVING clause eliminates rows (groups) for which the qualification does not evaluate to true (i.e., evaluate to false or unknown).
Aggregate functions ignore null values (except count(*)).
Distinct treats all null values as the same.
The arithmetic operations +, -, *, / return null if one of the arguments is null.
INFS614 - Lecture week 9 27
Outer joinssid bid day
22 101 10/10/9658 103 11/12/96
sid sname rating age
22 dustin 7 45.0
31 lubber 8 55.558 rusty 10 35.0
(left outer-join)
= sid sname rating age bid day
22 dustin 7 45.0 101 10/10/96
31 lubber 8 55.5 Null Null 58 rusty 10 35.0 103 11/12/96
INFS614 - Lecture week 9 28
In Oracle Select *
From Sailor S, Reserves R
Where S.sid = R.sid (+);
How about:Select S.sid, count(R.bid)
From Sailor S, Reserves R
Where S.sid = R.sid (+)
Group by S.sid;
ORSelect S.sid, count(*)
From Sailor S, Reserves R
Where S.sid = R.sid (+)
Group by S.sid;
INFS614 - Lecture week 9 29
More outer joins
Left outer join+ sign on the right in Oracle:Select * from R, S where R.id=S.id(+)
Right outer join + sign on the left in Oracle:Select * from R, S where R.id(+)=S.id
Full outer join– not implemented in Oracle
INFS614 - Lecture week 9 30
More on value functions
Values can be transformed before aggregated:e.g.: SELECT sum(S.A/2) FROM S;
An interesting decode function (Oracle specific):decode(value, if1, then1, if2, then2, …, else):
SELECT sum(decode(Major, ‘INFS’, 1, 0)) as No_IS_Stu, sum(decode(Major, ‘INFS’, 0, 1)) as No_NonIS_Stu
FROM Student ;
Calculating GPA from letter grades?
INFS614 - Lecture week 9 31
Examples
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 32
Query 1
List the classes (Class-no) currently taken by students whose names start with 'T'.
SELECT distinct e.Class-noFROM Enrollment e, Student sWHERE e.Student-Ssn = s.Ssn AND s.S-Name LIKE 'T%'
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 33
Query 2
List the students (SSN) who are currently
taking exactly one class. SELECT e.Student-SsnFROM Enrollment eGROUP BY e.Student-SsnHAVING 1=count(*)
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 34
Query 3
Give the percentage of the students (among all students) who are currently taking courses offered by ISE (D-code='ISE').
SELECT count(distinct e.Student-Ssn)/count(distinct s.Ssn) as Percent
FROM Enrollment e, Class c, Student sWHERE e.Class-no=c.Class-no and c.D-code='ISE';
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 35
Query 4
List the faculty members (F-Name) who teach 2 or more classes. List these faculty members by the number of classes they teach (ascending order).
SELECT f.F-NameFROM Faculty f, Class cWHERE f.Ssn=c.Instructor-SSnGROUP BY f.Ssn, f.F-NameHAVING count(distinct c.Class-no)>=2ORDER BY count(distinct c.Class-no), F-Name;
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 36
Query 5
List the students (SSN and Name) along with the number of classes they are taking. If a student is not taking any class, the student should also be listed (with 0 as the number of classes he/she is taking). The list should be ordered by the number of classes (in an ascending order), and in case of a tie, by the SSN of the students.
SELECT s.Ssn, s.S-Name, count(distinct Class-no)FROM Student s, Enrollment eWHERE s.Ssn = e.Student-Ssn (+) GROUP BY s.Ssn, s.S-NameORDER BY count(distinct Class-no), Ssn
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 37
Query 6
List the faculty members (F-Name) who teach more than twice as many classes as Professor Smith (F-Name='Smith') is teaching. (Note that if Professor Smith is not teaching anything, then any professor who teaches at least one class will satisfy the above query.)
SELECT f.F-NameFROM Faculty f, Class cWHERE f.Ssn=c.Instructor-SSnGROUP BY f.Ssn, f.F-NameHAVING count(distinct c.Class-no) > (SELECT 2*count(distinct Class-no) FROM Faculty f, Class c WHERE f.F-Name='Smith' and f.Ssn=c.Instructor-SSn)
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 38
Query 7
Find the number of departments which do not have a chairman (Chair_ssn is 'NULL').
SELECT count(d.D-code)FROM Department dWHERE d.Chair-SSn is NULL
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 39
Query 8
For each department (i.e., student’s Major), give the number of graduate students (status='Grad') and the number of other students (status <> 'Grad'). The two numbers must be shown in the same row as the department code. Hint: use decode.
SELECT Major, sum(decode(Status, 'Grad', 1,0)) as Grad, sum(decode(Status, 'Grad', 0,1)) as NoGradFROM StudentGROUP BY Major
Department (D-code, D-Name, Chair-SSn)Course (D-code, C-no, Title, Units)Prereq (D-code, C-no, P-code, P-no)Class (Class-no, D-code, C-no, Instructor-SSn)Faculty (Ssn, F-Name, D-Code, Rank)Student (Ssn, S-Name, Major, Status)Enrollment (Class-no, Student-Ssn)Transcript (Student-Ssn, D-Code, C-no, Grade)
INFS614 - Lecture week 9 40
Query 9For each department (D-code), give the highest rank of the professors in the department along with the number of the faculty with that highest rank. The output contains one row for each department. Assume Full>Associate>Assistant, i.e., lexicographic order is fine. Note that some department may not have professors in some ranks (e.g., a department may not have full, or associate or assistant professors).
SELECT f.D-Code as dept, f.maxrank, count(e.Ssn) as numFROM (SELECT D-code, max(Rank) as maxrank FROM Faculty GROUP BY D-code) AS f, Faculty eWHERE f.D-code=e.D-code AND e.Rank=f.maxrankGROUP BY f.D-code, f.maxrankORDER BY f.D-code