lecture 2 sql - duke university

84
CompSci 516 Data Intensive Computing Systems Lecture 2 SQL Instructor: Sudeepa Roy 1 Duke CS, Fall 2016 CompSci 516: Data Intensive Computing Systems

Upload: others

Post on 10-Nov-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 2 SQL - Duke University

CompSci 516DataIntensiveComputingSystems

Lecture2SQL

Instructor:Sudeepa Roy

1DukeCS,Fall2016 CompSci 516:DataIntensiveComputingSystems

Page 2: Lecture 2 SQL - Duke University

Announcement

• Ifyouareenrolledtotheclass,buthavenotreceivedtheemailfromPiazza,pleasesendmeanemail

• Note:Monday9/5isaholiday– noofficehour• Jungwillholdanadditionalofficehouron9/6,Tuesday,1pm,NorthN303B

• GuestlecturebyJungon9/7(Wed)onMapReduceandSpark– WillbeusefultostartHW2– WorkonHW1andHW2atyourownpace,butnotethedeadlines!

DukeCS,Fall2016 CompSci 516:DataIntensiveComputingSystems 2

Page 3: Lecture 2 SQL - Duke University

Today’stopic• SQLinanutshell• Readingmaterial

– [RG]Chapters3and5– Additionalreadingforpractice[GUW]Chapter6

• TrySQLfromtoday’slectureonatoydatasetorDBLPdatasetonPostGres!

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 3

Acknowledgement:Thefollowingslideshavebeencreatedadaptingtheinstructormaterialofthe[RG]bookprovidedbytheauthorsDr.Ramakrishnan andDr.Gehrke.

Page 4: Lecture 2 SQL - Duke University

RelationalQueryLanguages

• Amajorstrengthoftherelationalmodel:supportssimple,powerfulquerying ofdata.

• Queriescanbewrittenintuitively,andtheDBMSisresponsibleforanefficientevaluation– Thekey:precisesemanticsforrelationalqueries.– Allowstheoptimizertoextensivelyre-orderoperations,andstillensurethattheanswerdoesnotchange.

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 4

Page 5: Lecture 2 SQL - Duke University

TheSQLQueryLanguage

• DevelopedbyIBM(systemR)inthe1970s• Needforastandardsinceitisusedbymanyvendors

• Standards:– SQL-86– SQL-89(minorrevision)– SQL-92(majorrevision)– SQL-99(majorextensions,currentstandard)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 5

Page 6: Lecture 2 SQL - Duke University

PurposesofSQL

• DataManipulationLanguage(DML)– Querying:SELECT-FROM-WHERE– Modifying:INSERT/DELETE/UPDATE

• DataDefinitionLanguage(DDL)– CREATE/ALTER/DROP

6DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 7: Lecture 2 SQL - Duke University

TheSQLQueryLanguage

• Tofindall18yearoldstudents,wecanwrite:

SELECT *FROM Students SWHERE S.age=18

•To find just names and logins, replace the first line:

SELECT S.name, S.login

sid name login age gpa53666 Jones jones@cs 18 3.453688 Smith smith@ee 18 3.2

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 7

allattributes

Page 8: Lecture 2 SQL - Duke University

QueryingMultipleRelations• Whatdoesthefollowing

querycompute?SELECT S.name, E.cidFROM Students S, Enrolled EWHERE S.sid=E.sid AND E.grade=“A”

sid cid grade53831 Carnatic101 C53831 Reggae203 B53650 Topology112 A53666 History105 B

Given the following instances of Enrolled and Students:

we get: ??

sid name login age gpa53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8

Students

Enrolled

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 9: Lecture 2 SQL - Duke University

QueryingMultipleRelations• Whatdoesthefollowing

querycompute?SELECT S.name, E.cidFROM Students S, Enrolled EWHERE S.sid=E.sid AND E.grade=“A”

S.name E.cid Smith Topology112

sid cid grade53831 Carnatic101 C53831 Reggae203 B53650 Topology112 A53666 History105 B

Given the following instances of Enrolled and Students:

we get:

sid name login age gpa53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8

Students

Enrolled

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 10: Lecture 2 SQL - Duke University

CreatingRelationsinSQL• Createsthe“Students”relation

– thetype(domain)ofeachfieldisspecified

– enforcedbytheDBMSwhenevertuplesareaddedormodified

• Asanotherexample,the“Enrolled”tableholdsinformationaboutcoursesthatstudentstake

CREATE TABLE Students(sid CHAR(20), name CHAR(20), login CHAR(10),age INTEGER,gpa REAL)

CREATE TABLE Enrolled(sid CHAR(20), cid CHAR(20), grade CHAR(2))

DukeCS,Fall2016 CompSci 516:DataIntensiveComputingSystems 10

sid cid grade53831 Carnatic101 C53831 Reggae203 B53650 Topology112 A53666 History105 B

sid name login age gpa53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8

StudentsEnrolled

Page 11: Lecture 2 SQL - Duke University

DestroyingandAlteringRelations

• DestroystherelationStudents– Theschemainformationand thetuplesaredeleted.

DROP TABLE Students

• The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.

ALTER TABLE Students ADD COLUMN firstYear: integer

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 11

Page 12: Lecture 2 SQL - Duke University

AddingandDeletingTuples

• Caninsertasingletupleusing:INSERT INTO Students (sid, name, login, age, gpa)VALUES (53688, ‘Smith’, ‘smith@ee’, 18, 3.2)

• Can delete all tuples satisfying some condition (e.g., name = Smith):

DELETEFROM Students SWHERE S.name = ‘Smith’

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 12

Page 13: Lecture 2 SQL - Duke University

IntegrityConstraints(ICs)

• IC: conditionthatmustbetrueforanyinstanceofthedatabase– e.g.,domainconstraints– ICsarespecifiedwhenschemaisdefined– ICsarecheckedwhenrelationsaremodified

• AlegalinstanceofarelationisonethatsatisfiesallspecifiedICs– DBMSwillnotallowillegalinstances

• IftheDBMSchecksICs,storeddataismorefaithfultoreal-worldmeaning

– Avoidsdataentryerrors,too!

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 13

Page 14: Lecture 2 SQL - Duke University

KeysinaDatabase

• Key/CandidateKey• PrimaryKey• SuperKey• ForeignKey

• Primarykeyattributesareunderlined inaschema– Person(pid,address,name)– Person2(address,name,age,job)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 14

Page 15: Lecture 2 SQL - Duke University

PrimaryKeyConstraints

• Asetoffieldsisakey forarelationif:1.Notwodistincttuplescanhavesamevaluesinallkeyfields,and

2.Thisisnottrueforanysubsetofthekey

• Part2false?Asuperkey• Ifthereare>1keysforarelation,oneofthekeysischosen(byDBA=DBadmin)tobetheprimarykey– E.g.,sidisakeyforStudents– Theset{sid,gpa}isasuperkey.

• Isthereanypossiblebenefittorefertoatupleusingprimarykey(thananykey)?

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 15

Page 16: Lecture 2 SQL - Duke University

PrimaryandCandidateKeysinSQL

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY ???)

• “For a given student and course, there is a single grade.”

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 16

• Possiblymanycandidatekeys– specifiedusingUNIQUE– oneofwhichischosenastheprimarykey.

Page 17: Lecture 2 SQL - Duke University

PrimaryandCandidateKeysinSQL

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY (sid,cid) )

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 17

• Possiblymanycandidatekeys– specifiedusingUNIQUE– oneofwhichischosenastheprimarykey.

• “For a given student and course, there is a single grade.”

Page 18: Lecture 2 SQL - Duke University

PrimaryandCandidateKeysinSQL

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY (sid,cid) )

• “For a given student and course, there is a single grade.”

• vs.

• “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY ???,UNIQUE ??? )

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 18

• Possiblymanycandidatekeys– specifiedusingUNIQUE– oneofwhichischosenastheprimarykey.

Page 19: Lecture 2 SQL - Duke University

PrimaryandCandidateKeysinSQL

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 19

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY (sid,cid) )

• “For a given student and course, there is a single grade.”

• vs.

• “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY sid,UNIQUE (cid, grade))

• Possiblymanycandidatekeys– specifiedusingUNIQUE– oneofwhichischosenastheprimarykey.

Page 20: Lecture 2 SQL - Duke University

PrimaryandCandidateKeysinSQL

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 20

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY (sid,cid) )

• “For a given student and course, there is a single grade.”

• vs.

• “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

• Used carelessly, an IC can prevent the storage of database instances that arise in practice!

CREATE TABLE Enrolled(sid CHAR(20)cid CHAR(20),grade CHAR(2),PRIMARY KEY sid,UNIQUE (cid, grade))

• Possiblymanycandidatekeys– specifiedusingUNIQUE– oneofwhichischosenastheprimarykey.

Page 21: Lecture 2 SQL - Duke University

ForeignKeys,ReferentialIntegrity

• Foreignkey:Setoffieldsinonerelationthatisusedto`refer’toatupleinanotherrelation– Mustcorrespondtoprimarykeyofthesecondrelation– Likea`logicalpointer’

• E.g.sid isaforeignkeyreferringtoStudents:– Enrolled(sid:string,cid:string,grade:string)– Ifallforeignkeyconstraintsareenforced,referentialintegrity isachieved

– i.e.,nodanglingreferences

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 21

Page 22: Lecture 2 SQL - Duke University

ForeignKeysinSQL• OnlystudentslistedintheStudentsrelationshouldbeallowedtoenrollforcourses

CREATE TABLE Enrolled(sid CHAR(20), cid CHAR(20), grade CHAR(2),PRIMARY KEY (sid,cid),FOREIGN KEY (sid) REFERENCES Students )

sid name login age gpa53666 Jones jones@cs 18 3.453688 Smith smith@eecs 18 3.253650 Smith smith@math 19 3.8

sid cid grade53666 Carnatic101 C53666 Reggae203 B53650 Topology112 A53666 History105 B

EnrolledStudents

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 22

Page 23: Lecture 2 SQL - Duke University

EnforcingReferentialIntegrity• ConsiderStudentsandEnrolled

– sidinEnrolledisaforeignkeythatreferencesStudents.

• WhatshouldbedoneifanEnrolledtuplewithanon-existentstudentidisinserted?– Rejectit!

• WhatshouldbedoneifaStudentstupleisdeleted?– ThreesemanticsallowedbySQL1. AlsodeleteallEnrolledtuplesthatrefertoit(cascadedelete)2. DisallowdeletionofaStudentstuplethatisreferredto3. SetsidinEnrolledtuplesthatrefertoittoadefaultsid4. (inadditioninSQL):SetsidinEnrolledtuplesthatrefertoittoaspecial

valuenull,denoting`unknown’or`inapplicable’

• SimilarifprimarykeyofStudentstupleisupdatedDukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 23

Page 24: Lecture 2 SQL - Duke University

ReferentialIntegrityinSQL

• SQL/92andSQL:1999supportall4optionsondeletesandupdates.– DefaultisNOACTION(delete/updateisrejected)

– CASCADE (alsodeletealltuplesthatrefertodeletedtuple)

– SETNULL/ SETDEFAULT (setsforeignkeyvalueofreferencingtuple)

CREATE TABLE Enrolled(sid CHAR(20),cid CHAR(20),grade CHAR(2),PRIMARY KEY (sid,cid),FOREIGN KEY (sid)REFERENCES StudentsON DELETE CASCADEON UPDATE SET DEFAULT )

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 25: Lecture 2 SQL - Duke University

WheredoICsComeFrom?• ICsarebaseduponthesemanticsofthereal-worldenterprise

thatisbeingdescribedinthedatabaserelations

• CanweinferICsfromaninstance?– WecancheckadatabaseinstancetoseeifanICisviolated,butwe

canNEVER inferthatanICistruebylookingataninstance.– AnICisastatementaboutallpossibleinstances!– Fromexample,weknownameisnotakey,buttheassertionthatsidis

akeyisgiventous.

• KeyandforeignkeyICsarethemostcommon;moregeneralICssupportedtoo

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 25

Page 26: Lecture 2 SQL - Duke University

ExampleInstances

• WewillusetheseinstancesoftheSailorsandReservesrelationsinourexamples

• IfthekeyfortheReservesrelationcontainedonlytheattributessidandbid,howwouldthesemanticsdiffer?

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 27: Lecture 2 SQL - Duke University

BasicSQLQuery

• relation-list Alistofrelationnames– possiblywitha“rangevariable” aftereachname

• target-list Alistofattributesofrelationsinrelation-list• qualification Comparisons

– (Attr opconst)or(Attr1opAttr2)– whereopisoneof=,<,>,<=,>=combinedusingAND,ORandNOT

• DISTINCT isanoptionalkeywordindicatingthattheanswershouldnotcontainduplicates– Defaultisthatduplicatesarenoteliminated!

SELECT [DISTINCT] <target-list>FROM <relation-list>WHERE <qualification>

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 27

Page 28: Lecture 2 SQL - Duke University

ConceptualEvaluationStrategy

• SemanticsofanSQLquerydefinedintermsofthefollowingconceptualevaluationstrategy:– Computethecross-productof<relation-list>– Discardresultingtuplesiftheyfail<qualifications>– Deleteattributesthatarenotin<target-list>– IfDISTINCT isspecified,eliminateduplicaterows

• Thisstrategyisprobablytheleastefficientwaytocomputeaquery!Anoptimizerwillfindmoreefficientstrategiestocomputethesameanswers

SELECT [DISTINCT] <target-list>FROM <relation-list>WHERE <qualification>

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 28

Page 29: Lecture 2 SQL - Duke University

ExampleofConceptualEvaluationSELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

sid sname rating age sid bid day22 dustin 7 45 22 101 10/10/9622 dustin 7 45 58 103 11/12/9631 lubber 8 55 22 101 10/10/9631 lubber 8 55 58 103 11/12/9658 rusty 10 35 22 101 10/10/9658 rusty 10 35 58 103 11/12/96

Step1:FormcrossproductofSailorandReserves

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 30: Lecture 2 SQL - Duke University

ExampleofConceptualEvaluationSELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

sid sname rating age sid bid day22 dustin 7 45 22 101 10/10/9622 dustin 7 45 58 103 11/12/9631 lubber 8 55 22 101 10/10/9631 lubber 8 55 58 103 11/12/9658 rusty 10 35 22 101 10/10/9658 rusty 10 35 58 103 11/12/96

Step2:Discardtuplesthatdonotsatisfy<qualification>

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 31: Lecture 2 SQL - Duke University

ExampleofConceptualEvaluationSELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

sid sname rating age sid bid day22 dustin 7 45 22 101 10/10/9622 dustin 7 45 58 103 11/12/9631 lubber 8 55 22 101 10/10/9631 lubber 8 55 58 103 11/12/9658 rusty 10 35 22 101 10/10/9658 rusty 10 35 58 103 11/12/96

Step3:Selectthespecifiedattribute(s)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 32: Lecture 2 SQL - Duke University

ANoteon“RangeVariables”• ReallyneededonlyifthesamerelationappearstwiceintheFROM clause– sometimesusedasashort-name

• Thepreviousquerycanalsobewrittenas:

SELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND bid=103

SELECT snameFROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid

AND bid=103

It is good style,however, to userange variablesalways!

OR

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 32

Page 33: Lecture 2 SQL - Duke University

Findsailoridswho’vereservedatleastoneboat

SELECT ????FROM Sailors S, Reserves RWHERE S.sid=R.sid

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 33

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

Page 34: Lecture 2 SQL - Duke University

• WouldaddingDISTINCTtothisquerymakeadifference?

SELECT S.sidFROM Sailors S, Reserves RWHERE S.sid=R.sid

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 34

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

Findsailoridswho’vereservedatleastoneboat

Page 35: Lecture 2 SQL - Duke University

Findsailorswho’vereservedatleastoneboat

• WouldaddingDISTINCTtothisquerymakeadifference?– Notethatiftherearemultiplebidsforthe

samesid,yougetmultipleoutputtuplesforthesamesid

– Withoutdistinct,yougetthemmultipletimes

• WhatistheeffectofreplacingS.sid byS.sname intheSELECT clause?– WouldaddingDISTINCT tothisvariantofthe

querymakeadifferenceevenifonesidreservesatmostonebid?

SELECT S.sidFROM Sailors S, Reserves RWHERE S.sid=R.sid

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 35

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Reserves

Sailor

Page 36: Lecture 2 SQL - Duke University

Joins

• Condition/Theta-Join• Equi-Join• Natural-Join• (Left/Right/Full)Outer-Join

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 36

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

Page 37: Lecture 2 SQL - Duke University

Condition/ThetaJoin

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 37

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

SELECT *FROM Sailors S, Reserves RWHERE S.sid=R.sid and age >= 40

sid sname rating age sid bid day22 dustin 7 45 22 101 10/10/9622 dustin 7 45 58 103 11/12/9631 lubber 8 55 22 101 10/10/9631 lubber 8 55 58 103 11/12/9658 rusty 10 35 22 101 10/10/9658 rusty 10 35 58 103 11/12/96

Formcrossproduct,discardrowsthatdonotsatisfythecondition

Page 38: Lecture 2 SQL - Duke University

Equi Join

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 38

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

SELECT *FROM Sailors S, Reserves RWHERE S.sid=R.sid and age = 45

sid sname rating age sid bid day22 dustin 7 45 22 101 10/10/9622 dustin 7 45 58 103 11/12/9631 lubber 8 55 22 101 10/10/9631 lubber 8 55 58 103 11/12/9658 rusty 10 35 22 101 10/10/9658 rusty 10 35 58 103 11/12/96

AspecialcaseofthetajoinJoinconditiononlyhasequalitypredicate=

Page 39: Lecture 2 SQL - Duke University

NaturalJoin

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 39

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

SELECT *FROM Sailors S NATURAL JOIN Reserves R

sid sname rating age bid day22 dustin 7 45 101 10/10/9622 dustin 7 45 103 11/12/9631 lubber 8 55 101 10/10/9631 lubber 8 55 103 11/12/9658 rusty 10 35 101 10/10/9658 rusty 10 35 103 11/12/96

Aspecialcaseofequi joinEqualityconditiononALLcommonpredicates(sid)Duplicatecolumnsareeliminated

Page 40: Lecture 2 SQL - Duke University

OuterJoin

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 40

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

sid bid day22 101 10/10/9658 103 11/12/96

SELECT S.sid, R. bidFROM Sailors S LEFT OUTER JOIN Reserves RON S.sid=R.sid

Preservesalltuplesfromthelefttablewhetherornotthereisamatchifnomatch,fillattributesfromrightwithnullSimilarlyRIGHT/FULLouterjoin

sid bid22 10131 null58 103

Page 41: Lecture 2 SQL - Duke University

ExpressionsandStrings

• Illustratesuseofarithmeticexpressionsandstringpatternmatching• Findtriples(ofagesofsailorsandtwofieldsdefinedbyexpressions)

forsailors– whosenamesbeginandendwithBandcontainatleastthreecharacters

• LIKE isusedforstringmatching.`_’standsforanyonecharacterand`%’standsfor0ormorearbitrarycharacters– Youwillneedtheseoften

SELECT S.age, age1=S.age-5, 2*S.age AS age2FROM Sailors SWHERE S.sname LIKE ‘B_%B’

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 41

Page 42: Lecture 2 SQL - Duke University

Findsid’s ofsailorswho’vereservedaredor agreenboat

• AssumeaBoatsrelation(seebook)

Sailors(sid,sname,rating,age)Reserves(sid,bid,day)Boats(bid,bname,color)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 43: Lecture 2 SQL - Duke University

Findsid’s ofsailorswho’vereservedaredor agreenboat

• AssumeaBoatsrelation

• UNION:Canbeusedtocomputetheunionofanytwounion-compatible setsoftuples– canthemselvesbetheresultof

SQLqueries

• IfwereplaceOR byAND inthefirstversion,whatdoweget?

• Alsoavailable:EXCEPT (WhatdowegetifwereplaceUNIONbyEXCEPT?)

SELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid

AND (B.color=‘red’ OR B.color=‘green’)

SELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid

AND B.color=‘red’UNIONSELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid

AND B.color=‘green’

Sailors(sid,sname,rating,age)Reserves(sid,bid,day)Boats(bid,bname,color)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 44: Lecture 2 SQL - Duke University

Findsid’sofsailorswho’vereservedaredand agreenboat

Sailors(sid,sname,rating,age)Reserves(sid,bid,day)Boats(bid,bname,color)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 45: Lecture 2 SQL - Duke University

Findsid’s ofsailorswho’vereservedaredand agreenboat

• INTERSECT:Canbeusedtocomputetheintersectionofanytwounion-compatiblesetsoftuples.– IncludedintheSQL/92

standard,butsomesystemsdon’tsupportit

SELECT S.sidFROM Sailors S, Boats B1, Reserves R1,

Boats B2, Reserves R2WHERE S.sid=R1.sid AND R1.bid=B1.bid

AND S.sid=R2.sid AND R2.bid=B2.bidAND (B1.color=‘red’ AND B2.color=‘green’)

SELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid

AND B.color=‘red’INTERSECTSELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid

AND B.color=‘green’

Key field!

Sailors(sid,sname,rating,age)Reserves(sid,bid,day)Boats(bid,bname,color)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 46: Lecture 2 SQL - Duke University

NestedQueries

• AverypowerfulfeatureofSQL:– aWHERE/FROM/HAVING clausecanitselfcontainanSQLquery

• Tofindsailorswho’venotreserved#103,useNOTIN.• Tounderstandsemanticsofnestedqueries,thinkofanestedloopsevaluation– ForeachSailorstuple,checkthequalificationbycomputingthesubquery

SELECT S.snameFROM Sailors SWHERE S.sid IN (SELECT R.sid

FROM Reserves RWHERE R.bid=103)

Find names of sailors who’ve reserved boat #103:

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 46

Sailors(sid,sname,rating,age)Reserves(sid,bid,day)Boats(bid,bname,color)

Page 47: Lecture 2 SQL - Duke University

NestedQuerieswithCorrelation

• EXISTS isanothersetcomparisonoperator,likeIN• Illustrateswhy,ingeneral,subquery mustbere-computedforeachSailorstuple

SELECT S.snameFROM Sailors SWHERE EXISTS (SELECT *

FROM Reserves RWHERE R.bid=103 AND S.sid=R.sid)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 47

Find names of sailors who’ve reserved boat #103:

Page 48: Lecture 2 SQL - Duke University

NestedQuerieswithCorrelation

• IfUNIQUE isused,and*isreplacedbyR.bid,findssailorswithatmostonereservationforboat#103– UNIQUE checksforduplicatetuples

SELECT S.snameFROM Sailors SWHERE UNIQUE (SELECT R.bid

FROM Reserves RWHERE R.bid=103 AND S.sid=R.sid)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 48

Find names of sailors who’ve reserved boat #103:

Page 49: Lecture 2 SQL - Duke University

MoreonSet-ComparisonOperators

• We’vealreadyseenIN,EXISTSandUNIQUE• CanalsouseNOTIN,NOTEXISTSandNOTUNIQUE.• Alsoavailable:op ANY,op ALL,op IN

– whereop:>,<,=,<=,>=

• FindsailorswhoseratingisgreaterthanthatofsomesailorcalledHoratio– similarlyALL SELECT *

FROM Sailors SWHERE S.rating > ANY (SELECT S2.rating

FROM Sailors S2WHERE S2.sname=‘Horatio’)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 49

Page 50: Lecture 2 SQL - Duke University

AggregateOperatorsCOUNT (*)COUNT ( [DISTINCT] A)SUM ( [DISTINCT] A)AVG ( [DISTINCT] A)MAX (A)MIN (A)

SELECT AVG (S.age)FROM Sailors SWHERE S.rating=10

SELECT COUNT (*)FROM Sailors S

SELECT AVG ( DISTINCT S.age)FROM Sailors SWHERE S.rating=10

SELECT S.snameFROM Sailors SWHERE S.rating= (SELECT MAX(S2.rating)

FROM Sailors S2)

single column

SELECT COUNT (DISTINCT S.rating)FROM Sailors SWHERE S.sname=‘Bob’

Checkyourself:Whatdothesequeriescompute?

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

ENDOFLECTURE2

Page 51: Lecture 2 SQL - Duke University

MotivationforGrouping• Sofar,we’veappliedaggregateoperatorstoall(qualifying)tuples– Sometimes,wewanttoapplythemtoeachofseveralgroupsoftuples

• Consider:Findtheageoftheyoungestsailorforeachratinglevel– Ingeneral,wedon’tknowhowmanyratinglevelsexist,andwhattheratingvaluesfortheselevelsare!

– Supposeweknowthatratingvaluesgofrom1to10;wecanwrite10queriesthatlooklikethis(needtoreplacei bynum):

SELECT MIN (S.age)FROM Sailors SWHERE S.rating = i

For i = 1, 2, ... , 10:

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 51

LECTURE4willstarthere

Page 52: Lecture 2 SQL - Duke University

QueriesWithGROUPBYandHAVING

• Thetarget-listcontains– (i)attributenames– (ii)termswithaggregateoperations(e.g.,MIN(S.age))

• Theattributelist(i)mustbeasubsetofgrouping-list– Intuitively,eachanswertuplecorrespondstoagroup,andtheseattributes

musthaveasinglevaluepergroup– Herea groupisasetoftuplesthathavethesamevalueforallattributesin

grouping-list

SELECT [DISTINCT] target-listFROM relation-listWHERE qualificationGROUP BY grouping-listHAVING group-qualification

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 52

FirstgoovertheexamplesinthefollowingslidesThencomebacktothisslideandstudyyourself

Page 53: Lecture 2 SQL - Duke University

ConceptualEvaluation• Thecross-productofrelation-list iscomputed• Tuplesthatfailqualification arediscarded• `Unnecessary’fieldsaredeleted• Theremainingtuplesarepartitionedintogroupsbythevalueof

attributesingrouping-list• Thegroup-qualification isthenappliedtoeliminatesomegroups• Expressionsingroup-qualificationmusthaveasinglevalueper

group– Ineffect,anattributeingroup-qualificationthatisnotanargumentofan

aggregateopalsoappearsingrouping-list– like“…GROUPBYbid,sid HAVINGbid=3”

• OneanswertupleisgeneratedperqualifyinggroupDukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 53

FirstgoovertheexamplesinthefollowingslidesThencomebacktothisslideandstudyyourself

Page 54: Lecture 2 SQL - Duke University

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

sid sname rating age 22 dustin 7 45.0 29 brutus 1 33.0 31 lubber 8 55.5 32 andy 8 25.5 58 rusty 10 35.0 64 horatio 7 35.0 71 zorba 10 16.0 74 horatio 9 35.0 85 art 3 25.5 95 bob 3 63.5 96 frodo 3 25.5

Answer relation:

Sailors instance:

rating minage 3 25.5 7 35.0 8 25.5

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 54

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

Page 55: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 55

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

Step1:Formthecrossproduct:FROMclause(someattributesareomittedforsimplicity)

Page 56: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 56

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

Step2:ApplyWHEREclause

Page 57: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

rating age 1 33.0 3 25.5 3 63.5 3 25.5 7 45.0 7 35.0 8 55.5 8 25.5 9 35.0 10 35.0

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 57

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

Step3:ApplyGROUPBYaccordingtothelistedattributes

Page 58: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

rating age 1 33.0 3 25.5 3 63.5 3 25.5 7 45.0 7 35.0 8 55.5 8 25.5 9 35.0 10 35.0

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 58

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

Step4:ApplyHAVINGclauseThegroup-qualification isappliedtoeliminatesomegroups

Page 59: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailors.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

rating minage 3 25.5 7 35.0 8 25.5

rating age 1 33.0 3 25.5 3 63.5 3 25.5 7 45.0 7 35.0 8 55.5 8 25.5 9 35.0 10 35.0

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 59

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

Step5:ApplySELECTclauseApplytheaggregateoperatorAttheend,onetuplepergroup

Page 60: Lecture 2 SQL - Duke University

NullValues• Fieldvaluesinatuplearesometimes

– unknown,e.g.,aratinghasnotbeenassigned,or– inapplicable, e.g.,nospouse’sname– SQLprovidesaspecialvaluenull forsuchsituations.

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 60

Page 61: Lecture 2 SQL - Duke University

StandardBoolean2-valuedlogic

• True=1,False=0• SupposeX=5

– (X<100)AND(X>=1)isT∧ T=T– (X>100)OR(X>=1)isF∨ T=T– (X>100)AND(X>=1)isF∧ T=F– NOT(X=5)is¬T=F

• Intuitively,– T=1,F=0– ForV1,V2∈ {1,0}– V1∧ V2=MIN(V1,V2)– V1∨ V2=MAX(V1,V2)– ¬(V1)=1– V1

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 61

Page 62: Lecture 2 SQL - Duke University

2-valuedlogicdoesnotworkfornulls

• Supposerating=null,X=5• Israting>8trueorfalse?• WhataboutAND,ORandNOT connectives?

– (rating>8)AND(X=5)?

• WhatifwehavesuchaconditionintheWHEREclause?

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 62

Page 63: Lecture 2 SQL - Duke University

3-ValuedLogicForNull

• TRUE(=1),FALSE(=0),UNKNOWN(=0.5)– unknownistreatedas0.5

• Nowyoucanapplyrulesfrom2-valuedlogic!– ForV1,V2∈ {1,0,0.5}– V1∧ V2=MIN(V1,V2)– V1∨ V2=MAX(V1,V2)– ¬(V1)=1– V1

• Therefore,– NOTUNKNOWN=UNKNOWN– UNKNOWNORTRUE=TRUE– UNKNOWNANDTRUE=UNKNOWN– UNKNOWNANDFALSE=FALSE– UNKNOWNORFALSE=UNKNOWN

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 63

Page 64: Lecture 2 SQL - Duke University

NewissuesforNull• Thepresenceofnull complicatesmanyissues.E.g.:

– SpecialoperatorsneededtocheckifvalueIS/ISNOTNULL– Becareful!– “WHEREX=NULL”doesnotwork!– Needtowrite“WHEREXISNULL”

• Meaningofconstructsmustbedefinedcarefully– e.g.,WHEREclauseeliminatesrowsthatdon’tevaluatetotrue– SonotonlyFALSE,butUNKNOWNsareeliminatedtoo– veryimportanttoremember!

• ButNULLallowsnewoperators(e.g.outerjoins)

• ArithmeticwithNULL– allof+,-,*,/returnnullifanyargumentisnull

• Canforce”nonulls” whilecreatingatable– snamechar(20)NOTNULL– primarykeyisalwaysnotnullDukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 64

Page 65: Lecture 2 SQL - Duke University

AggregateswithNULL

• Whatdoyougetfor• SELECTcount(*)fromR1?• SELECTcount(rating)fromR1?

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 65

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

R

Page 66: Lecture 2 SQL - Duke University

AggregateswithNULL

• Whatdoyougetfor• SELECTcount(*)fromR1?• SELECTcount(rating)fromR1?• Ans:3forboth

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 66

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

R1

Page 67: Lecture 2 SQL - Duke University

AggregateswithNULL

• Whatdoyougetfor• SELECTcount(*)fromR1?• SELECTcount(rating)fromR1?• Ans:3forboth

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 67

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

R1

• Whatdoyougetfor• SELECTcount(*)fromR2?• SELECTcount(rating)fromR2?

sid sname rating age22 dustin 7 4531 lubber null 5558 rusty 10 35

R2

Page 68: Lecture 2 SQL - Duke University

AggregateswithNULL

• Whatdoyougetfor• SELECTcount(*)fromR1?• SELECTcount(rating)fromR1?• Ans:3forboth

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 68

sid sname rating age22 dustin 7 4531 lubber 8 5558 rusty 10 35

R1

• Whatdoyougetfor• SELECTcount(*)fromR2?• SELECTcount(rating)fromR2?• Ans:First3,then2

sid sname rating age22 dustin 7 4531 lubber null 5558 rusty 10 35

R2

Page 69: Lecture 2 SQL - Duke University

AggregateswithNULL

• COUNT,SUM,AVG,MIN,MAX(withorwithoutDISTINCT)– Discardsnullvaluesfirst– Thenappliestheaggregate– Exceptcount(*)

• Ifonlyappliedtonullvalues,theresultisnull

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 69

• SELECTsum(rating)fromR2?• Ans:17

sid sname rating age22 dustin 7 4531 lubber null 5558 rusty 10 35

R2• SELECTsum(rating)fromR3?• Ans:null

sid sname rating age22 dustin null 4531 lubber null 5558 rusty null 35

R3

Page 70: Lecture 2 SQL - Duke University

Overview:GeneralConstraints• UsefulwhenmoregeneralICs

thankeysareinvolved

• TherearealsoASSERTIONS tospecifyconstraintsthatspanacrossmultipletables

• ThereareTRIGGERStoo:procedurethatstartsautomaticallyifspecifiedchangesoccurtotheDBMS– seeadditionalslidesattheend

CREATE TABLE Sailors( sid INTEGER,sname CHAR(10),rating INTEGER,age REAL,PRIMARY KEY (sid),CHECK ( rating >= 1

AND rating <= 10 )

CREATE TABLE Reserves( sname CHAR(10),bid INTEGER,day DATE,PRIMARY KEY (bid,day),CONSTRAINT noInterlakeResCHECK (`Interlake’ <>

( SELECT B.bnameFROM Boats BWHERE B.bid=bid)))DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 71: Lecture 2 SQL - Duke University

Views• Aviewisjustarelation,butwestoreadefinition,ratherthan

asetoftuplesCREATE VIEW YoungActiveStudents (name, grade)

AS SELECT S.name, E.gradeFROM Students S, Enrolled EWHERE S.sid = E.sid and S.age<21

• Views can be dropped using the DROP VIEW command

• Views and Security: Views can be used to present necessary information (or a summary), while hiding details in underlying relation(s)

• the above view hides courses “cid” from E

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 71

Page 72: Lecture 2 SQL - Duke University

Cancreateanewtablefromaqueryonothertablestoo

SELECT S.name, E.gradeINTO YoungActiveStudentsFROM Students S, Enrolled EWHERE S.sid = E.sid and S.age<21

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 72

SELECT…INTO....FROM....WHERE

Page 73: Lecture 2 SQL - Duke University

Summary

• SQLhasahugenumberofconstructsandpossibilities– Youneedtolearnandpracticeitonyourown– Givenaproblem,youshouldbeabletowriteaSQLqueryandverify

whetheragivenoneiscorrect

• PayattentiontoNULLs• Youwillfind“WITH”clauseveryuseful!

WITH Temp1 AS(SELECT ….. ..), Temp2 AS(SELECT ….. ..)

SELECT X, YFROM TEMP1, TEMP2WHERE….

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 73

Page 74: Lecture 2 SQL - Duke University

AdditionalExamplesforPractice

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 74

Checkyourself

Page 75: Lecture 2 SQL - Duke University

RewritingINTERSECT QueriesUsingIN

• Similarly,EXCEPT queriesre-writtenusingNOTIN.• Tofindnames(notsid’s)ofSailorswho’vereservedbothredandgreenboats,justreplaceS.sid byS.sname inSELECTclause

Find sid’s of sailors who’ve reserved both a red and a green boat:

SELECT S.sidFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’

AND S.sid IN (SELECT S2.sidFROM Sailors S2, Boats B2, Reserves R2WHERE S2.sid=R2.sid AND R2.bid=B2.bid

AND B2.color=‘green’)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 75

Page 76: Lecture 2 SQL - Duke University

“Division”inSQL

• Option1:• Option2:Let’sdoitthehardway,without

EXCEPT:

SELECT S.snameFROM Sailors SWHERE NOT EXISTS

((SELECT B.bidFROM Boats B)

EXCEPT(SELECT R.bidFROM Reserves RWHERE R.sid=S.sid))

SELECT S.snameFROM Sailors SWHERE NOT EXISTS (SELECT B.bid

FROM Boats B WHERE NOT EXISTS (SELECT R.bid

FROM Reserves RWHERE R.bid=B.bid

AND R.sid=S.sid))

Sailors S such that ...

there is no boat B...

…a Reserves tuple showing S reserved B

Find sailors who’ve reserved all boats.

MoreinRA

option2

option1

…without ...

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 77: Lecture 2 SQL - Duke University

Findnameandageoftheoldestsailor(s)

• Thefirstqueryisillegal!– We’lllookintothereasonabitlater,whenwediscussGROUPBY

• Thethirdqueryisequivalenttothesecondquery– andisallowedintheSQL/92standard,butisnotsupportedinsomesystems

SELECT S.sname, MAX (S.age)FROM Sailors S

SELECT S.sname, S.ageFROM Sailors SWHERE S.age =

(SELECT MAX (S2.age)FROM Sailors S2)

SELECT S.sname, S.ageFROM Sailors SWHERE (SELECT MAX (S2.age)

FROM Sailors S2)= S.age

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

Page 78: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2such sailorsandwithevery sailorunder60.

rating age 7 45.0 1 33.0 8 55.5 8 25.5 10 35.0 7 35.0 10 16.0 9 35.0 3 25.5 3 63.5 3 25.5

rating age 1 33.0 3 25.5 3 63.5 3 25.5 7 45.0 7 35.0 8 55.5 8 25.5 9 35.0 10 35.0

rating minage 7 35.0 8 25.5

WhatistheresultofchangingEVERYtoANY?

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 78

SELECT S.rating, MIN (S.age) AS minageFROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1 ANDEVERY(S.age <=60)

Page 79: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>=18,foreachratingwithatleast2sailorsbetween18and60.

SELECT S.rating, MIN (S.age) AS minage

FROM Sailors SWHERE S.age >= 18 AND S.age <= 60GROUP BY S.ratingHAVING COUNT (*) > 1

sid sname rating age 22 dustin 7 45.0 29 brutus 1 33.0 31 lubber 8 55.5 32 andy 8 25.5 58 rusty 10 35.0 64 horatio 7 35.0 71 zorba 10 16.0 74 horatio 9 35.0 85 art 3 25.5 95 bob 3 63.5 96 frodo 3 25.5

Answer relation:

Sailors instance:

rating minage 3 25.5 7 35.0 8 25.5

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 79

Page 80: Lecture 2 SQL - Duke University

Foreachredboat,findthenumberofreservationsforthisboat

• Groupingoverajoinofthreerelations.• WhatdowegetifweremoveB.color=‘red’fromthe

WHEREclauseandaddaHAVING clausewiththiscondition?

• WhatifwedropSailorsandtheconditioninvolvingS.sid?

SELECT B.bid, COUNT (*) AS scountFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’GROUP BY B.bid

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 80

Page 81: Lecture 2 SQL - Duke University

Findageoftheyoungestsailorwithage>18,foreachratingwithatleast2sailors(ofanyage)

• ShowsHAVING clausecanalsocontainasubquery.• Comparethiswiththequerywhereweconsideredonlyratings

with2sailorsover18!• WhatifHAVING clauseisreplacedby:

– HAVINGCOUNT(*)>1

SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age > 18GROUP BY S.ratingHAVING 1 < (SELECT COUNT (*)

FROM Sailors S2WHERE S.rating=S2.rating)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 81

Page 82: Lecture 2 SQL - Duke University

Findthoseratingsforwhichtheaverageageistheminimumoverallratings

• Aggregateoperationscannotbenested!WRONG:

SELECT S.ratingFROM Sailors SWHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2)

SELECT Temp.rating, Temp.avgageFROM (SELECT S.rating, AVG (S.age) AS avgage

FROM Sailors SGROUP BY S.rating) AS Temp

WHERE Temp.avgage = (SELECT MIN (Temp.avgage)FROM Temp)

• Correctsolution(inSQL/92):

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 82

Page 83: Lecture 2 SQL - Duke University

IntegrityConstraints(Review)• AnICdescribesconditionsthateverylegalinstanceofarelationmustsatisfy.– Inserts/deletes/updatesthatviolateIC’saredisallowed.– Canbeusedtoensureapplicationsemantics(e.g.,sidisakey),orpreventinconsistencies(e.g.,snamehastobeastring,agemustbe<200)

• TypesofIC’s:Domainconstraints,primarykeyconstraints,foreignkeyconstraints,generalconstraints.– Domainconstraints:Fieldvaluesmustbeofrighttype.Alwaysenforced.

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 83

Page 84: Lecture 2 SQL - Duke University

Triggers• Trigger:procedurethatstartsautomaticallyifspecified

changesoccurtotheDBMS• Threeparts:

– Event(activatesthetrigger)– Condition(testswhetherthetriggersshouldrun)– Action(whathappensifthetriggerruns)

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 84

CREATETRIGGERyoungSailorUpdateAFTERINSERTONSAILORS

REFERENCINGNEWTABLENewSailorsFOREACHSTATEMENT

INSERTINTOYoungSailors(sid,name,age,rating)SELECTsid,name,age,ratingFROMNewSailors NWHEREN.age <=18