database design and implementation - umass...

60
Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Upload: haduong

Post on 15-Jul-2018

232 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Database design and implementation CMPSCI 645

Lecture 04: SQL and Datalog

Page 2: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

What you need

2

}  InstallPostgres.}  Instruc0onsareontheassignmentspageonthewebsite.

}  Useittoprac0ce

}  RefreshyourSQL:}  h?p://sqlzoo.net

Page 3: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple SQL query

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

SELECT !* FROM !Product WHERE !category='Gadgets'!

Product

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorksSelec0on

3

�category=�Gadgets�

Page 4: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple SQL query

PName Price Manufacturer

SingleTouch $149.99 Canon

Mul0Touch $203.99 Hitachi

SELECT !pName, price, manufacturer FROM !Product WHERE !price > 100!

Product

Selec0on&Projec0on

4

�price>100

�pname,price,manufacturer

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

Page 5: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Eliminating duplicates

Category

Gadgets

Gadgets

Photography

Household

Category

Gadgets

Photography

Household

SELECT !DISTINCT category FROM !Product!

SELECT !category FROM !Product!

Setvs.Bagseman0cs

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

PowerGizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

Product

5

Page 6: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Ordering the results

}  Tiesinpricea"ributebrokenbypnamea?ribute

}  Orderingisascendingbydefault.Descending:

SELECT ! !pName, price, manufacturer!FROM ! !Product!WHERE ! !category='Gadgets' ! ! !and price > 10!ORDER BY !price, pName!

... ORDER BY price, pname DESC!

6

Page 7: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

SELECT category!FROM Product!ORDER BY pName!

SELECT DISTINCT category!FROM Product!ORDER BY category!

SELECT DISTINCT category!FROM Product!ORDER BY pName!

Product

???

7

Page 8: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Category

Gadgets

Household

Photography

Category

Gadgets

Household

Gadgets

Photography

Syntax error* 8

SELECT category!FROM Product!ORDER BY pName!

SELECT DISTINCT category!FROM Product!ORDER BY category!

SELECT DISTINCT category!FROM Product!ORDER BY pName!

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

Product

Page 9: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Joins

Q:Findallproductsunder$200manufacturedinJapan;returntheirnamesandprices!

SELECT !pName, price FROM !Product, Company WHERE !manufacturer=cName ! !and country='Japan' !and price <= 200!

JoinbetweenProductandCompany

Product(pName,price,category,manufacturer)Company(cName,stockPrice,country)

9

Page 10: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Joins

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

Product CompanyCName StockPrice Country

GizmoWorks 25 USA

Canon 65 Japan

Hitachi 15 Japan

PName Price

SingleTouch $149.99

SELECT !pName, price FROM !Product, Company WHERE !manufacturer=cName ! !and country='Japan' !and price <= 200!

10

Page 11: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Semantics are tricky…

SELECT !DISTINCT R.a!FROM !R, S, T!WHERE !R.a=S.a ! !or R.a=T.a!

IfS≠∅ andT≠∅ thenreturnsR∩ (S ∪ T) elsereturns∅

Whatdothesequeriescompute?

SELECT !DISTINCT R.a!FROM !R, S!WHERE !R.a=S.a!

ReturnsR∩ S

R(a),S(a),T(a)

11

Page 12: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Formal semantics of SQL queries

Answer={}forx1inR1doforx2inR2do…..forxninRndoifCondi0onsthenAnswer=Answer∪{(a1,…,ak)}returnAnswer

SELECT !a1, a2, …, ak!FROM !R1 as x1, R2 as x2, …, Rn as xn!WHERE !Conditions!

Conceptualevalua0onstrategy(nestedforloops):

12

Page 13: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Joins introduce duplicates

Country

USA

USA

PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

Mul0Touch $203.99 Household Hitachi

Product CompanyCName StockPrice Country

GizmoWorks 25 USA

Canon 65 Japan

Hitachi 15 Japan

RemembertouseDISTINCT13

SELECT !country!FROM !Product, Company WHERE !manufacturer = cName ! and category = 'Gadgets'!

Q:Findallcountriesthatmanufacturesomeproductinthe‘Gadgets’category!

Page 14: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Subqueries }  AsubqueryisaSQLquerynestedinsidealargerquery

}  Suchinner-outerqueriesarecallednestedqueries

}  Asubquerymayoccurin:}  ASELECTclause}  AFROMclause}  AWHEREclause

}  Ruleofthumb:avoidwri0ngnestedquerieswhenpossible;keepinmindthatsome0mesit’simpossible

14

Page 15: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

1. Subqueries in SELECT

Whathappensifthesubqueryreturnsmorethanonecity?Run0meerror

Q:Foreachproductreturnthecitywhereitismanufactured!

SELECT !P.pname, (SELECT !C.city!! ! FROM! !Company C!! ! WHERE! !C.cid = P.cid)!

FROM !Product P!

Product(pname,price,cid)Company(cid,cname,city)

15

Page 16: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

1. Subqueries in SELECT

Q:Foreachproductreturnthecitywhereitismanufactured!

SELECT !P.pname, C.city!FROM !Product P, Company C!WHERE !C.cid = P.cid!

"unnes0ngthequery" Wheneverpossible,don'tusenestedqueries

Product(pname,price,cid)Company(cid,cname,city)

16

SELECT !P.pname, (SELECT !C.city!! ! FROM! !Company C!! ! WHERE! !C.cid = P.cid)!

FROM !Product P!

Page 17: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

2. Subqueries in FROM

Q:Findallproductswhosepricesare>20and<30!

SELECT !X.pname !FROM !(SELECT! *!

! FROM ! Product as P!! WHERE! price >20 ) as X!

WHERE !X.price < 30!

SELECT !pname!FROM !Product!WHERE !price > 20 and price < 30!

unnes0ng

Product(pname,price,cid)Company(cid,cname,city)

17

Page 18: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Existen0alquan0fiers∃

UsingEXISTS:

Q:Findallcompaniesthatmakesomeproductswithprice<100!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !EXISTS !(SELECT !*!

! ! FROM ! !Product P!! ! WHERE! !C.cid = P.cid!! ! ! !and P.price < 100)!

Product(pname,price,cid)Company(cid,cname,city)

18

Page 19: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Existen0alquan0fiers∃

UsingIN:

Q:Findallcompaniesthatmakesomeproductswithprice<100!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !C.cid IN !(SELECT P.cid!

! ! FROM ! Product P!! ! WHERE! P.price < 100)!

Product(pname,price,cid)Company(cid,cname,city)

19

Page 20: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Existen0alquan0fiers∃

UsingANY:

Q:Findallcompaniesthatmakesomeproductswithprice<100!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !100 > ANY!(SELECT price!

! ! FROM !!Product P!! ! WHERE !!P.cid = C.cid)!

Product(pname,price,cid)Company(cid,cname,city)

20

Page 21: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Existen0alquan0fiers∃

Now,let'sunnest:

Q:Findallcompaniesthatmakesomeproductswithprice<100!

SELECT !DISTINCT C.cname !FROM !Company C, Product P!WHERE !C.cid = P.cid! !and P.price < 100 !

Existen0alquan0fiersareeasy!J

Product(pname,price,cid)Company(cid,cname,city)

21

Page 22: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Universalquan0fiers∀

Q:Findallcompaniesthatmakeonlyproductswithprice<100!

Q:Findallcompaniesforwhichallproductshaveprice<100!

Universalquan0fiersaremorecomplicated!L

sameas:

Product(pname,price,cid)Company(cid,cname,city)

22

Page 23: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

2. Find all companies s.t. all their products have price < 100!

1. Find the other companies: i.e. they have some product ≥ 100!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !C.cid IN (SELECT P.cid!

! ! FROM !! Product P!! ! WHERE!! P.price >= 100)!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !C.cid NOT IN (SELECT P.cid!

! ! FROM!! Product P!! ! WHERE! P.price >= 100)!

23

Page 24: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Using NOT EXISTS:

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !NOT EXISTS!(SELECT *!

! ! FROM !Product P!! ! WHERE!C.cid = P.cid!! ! !and P.price >= 100)!

Universal quantifiers ∀

Q: Find all companies that make only products with price < 100!

Product (pname, price, cid) Company (cid, cname, city)

24

Page 25: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

3. Subqueries in WHERE

Using ALL:

Universal quantifiers ∀

Q: Find all companies that make only products with price < 100!

SELECT !DISTINCT C.cname !FROM !Company C!WHERE !100 > ALL!(SELECT price!

! ! FROM Product P!! ! WHERE P.cid = C.cid)!

Product (pname, price, cid) Company (cid, cname, city)

25

Page 26: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Challenging question }  HowcanweunnesttheuniversalquanLfierquery?

26

Page 27: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Queries that must be nested

} AqueryQismonotoneif:}  Addingtuplestotheinputcannotremovetuplesfromtheoutput

}  Fact:allunnestedqueriesaremonotone}  Proof:usingthe“nestedforloops”seman0cs

}  Fact:Querywithuniversalquan0fierisnotmonotone}  Addonetupleviola0ngthecondi0on.Thennot"all"...

}  Consequence:wecannotunnestaquerywithauniversalquan0fier

27

Page 28: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

The drinkers-bars-beers example

Finddrinkersthatfrequentsomebarthatservessomebeertheylike.

Finddrinkersthatfrequentonlybarsthatservesomebeertheylike.

Finddrinkersthatfrequentonlybarsthatserveonlybeertheylike.

Finddrinkersthatfrequentsomebarthatservesonlybeerstheylike.

Likes(drinker,beer)Frequents(drinker,bar)Serves(bar,beer)

Challenge:writetheseinSQL

28

Page 29: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Aggregation

SELECT !count(*)!FROM !Product!WHERE !year > 1995!

Exceptcount,allaggrega0onsapplytoasinglea?ribute

SELECT !avg(price)!FROM !Product!WHERE !maker='Toyota'!

SQLsupportsseveralaggrega0onopera0ons:sum,count,min,max,avg

29

Page 30: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

COUNTappliestoduplicates,unlessotherwisestated:

SELECT !count (category) !FROM !Product!WHERE !year > 1995!

Weprobablywant:

SELECT !count (DISTINCT category)!FROM !Product!WHERE !year > 1995!

Aggregation: count distinct

30

Page 31: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple aggregation PurchaseProduct Price Quan0tyBagel 3 20Bagel 2 20Banana 1 50Banana 2 10Banana 4 10

3*20=602*20=40sum:100

(Nocolumnname)100

SQLcreatesa?ributename

31

SELECT !sum (price * quantity)!FROM !Purchase!WHERE !product = 'Bagel'!

canusearithme0cexpressions

Page 32: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Grouping and Aggregation

Product TotalSalesBagel 40Banana 20

Product Price Quan0tyBagel 3 20Bagel 2 20Banana 1 50Banana 2 10Banana 4 10

FindtotalquanLLesforallsalesover$1,byproduct.

32

Page 33: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

From → Where → Group By → Select

SELECT !product, sum(quantity) as TotalSales!FROM !Purchase!WHERE !price > 1!GROUP BY !product!

Product TotalSalesBagel 40Banana 20

Product Price Quan0tyBagel 3 20Bagel 2 20Banana 1 50Banana 2 10Banana 4 10

123

4

Selectcontains•  groupeda?ributes•  andaggregates

33

Page 34: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Another example

SELECT ! product,!! sum(quantity) as SumQuantity,!! max(price) as MaxPrice!

FROM ! Purchase!GROUP BY product!

Product TotalSales MaxPriceBagel 40 3Banana 70 4

Next, focus only on products with at least 50 sales 34

Page 35: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

HAVING clause

Q:Similartobefore,butonlyproductswithatleast50sales.

Product TotalSales MaxPriceBanana 70 4

35

SELECT ! product,!! sum(quantity) as SumQuantity,!! max(price) as MaxPrice!

FROM ! Purchase!GROUP BY product!HAVING ! sum(quantity) >= 50!

Page 36: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

General form of grouping and aggregation

36

S: maycontaina?ributesa1,…,akand/oranyaggregatesbutnoothera?ributes

SELECT !S!FROM !R1,…,Rn WHERE !C1!GROUP BY !a1,…,ak HAVING !C2!

Evalua0on1.  EvaluateFrom→Where,applycondi0onC12.  Groupbythea?ributesa1,…,ak3.  Applycondi0onC2toeachgroup(mayhaveaggregates)4.  ComputeaggregatesinSandreturntheresult

1234

5

C1:isanycondi0ononthea?ributesinR1,…,Rn

C2:isanycondi0ononaggregatesandona?ributesa1,…,ak

Page 37: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Finding witnesses Store(sid,sname)Product(pid,pname,price,sid)

Q:Foreachstore,finditsmostexpensiveproducts

SELECT Store.sid, max(Product.price)!FROM Store, Product!WHERE Store.sid = Product.sid!GROUP BY Store.sid!

Butwewantthe“witnesses”,i.e.theproductswithmaxprice

Findingthemaximumpriceiseasy…

37

Page 38: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Finding witnesses

}  Computemaxpriceinasubquery}  CompareitwitheachproductpriceSELECT Store.sname, Product.pname FROM Store, Product, (SELECT Store.sid as sid, !

! ! ! max(Product.price) as p! FROM Store, Product !

! WHERE Store.sid = Product.sid!! GROUP BY Store.sid) X!

WHERE Store.sid = Product.sid and Store.sid = X.sid ! and Product.price = X.p!

38

Page 39: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Finding witnesses

Thereisamoreconcisesolu0onhere:

SELECT Store.sname, x.pname!FROM Store, Product x!WHERE Store.sid = x.sid ! and x.price >= ALL (SELECT y.price FROM Product y WHERE Store.sid = y.sid)!

39

Page 40: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

NULLS in SQL }  Wheneverwedon’thaveavalue,wecanputaNULL

}  Canmeanmanythings:}  Valuedoesnotexists}  Valueexistsbutisunknown}  Valuenotapplicable}  Etc.

}  Theschemaspecifiesforeacha?ributeifitcanbeNULLornot

}  HowdoesSQLcopewithtablesthathaveNULLs?

40

Page 41: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Null values }  Ifx=NULLthen

}  Arithme0copera0onsproduceNULL.E.g:4*(3-x)/7}  Booleancondi0onsarealsoNULL.E.g:x=‘Joe’

}  InSQLtherearethreebooleanvalues:FALSE,TRUE,UNKNOWN

}  Reasoning:FALSE=0 xANDy=min(x,y)TRUE=1 xORy=max(x,y)UNKNOWN=0.5 NOTx=(1–x)

41

Page 42: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

42

SELECT *!FROM Person!WHERE (age < 25) and !

! (height > 6 or weight > 190)!

RuleinSQL:includeonlytuplesthatyieldTRUE

Age Height Weight20 NULL 200NULL 6.5 170

SELECT *!FROM Person!WHERE age < 25 or age >= 25!

Unexpectedbehavior

SELECT *!FROM Person!WHERE age < 25 or age >= 25 ! or age IS NULL!

TestNULLexplicitly

Page 43: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Outer joins

Ifwewantthenever-soldproducts,weneedan“outerjoin”:

Product(name,category)Purchase(prodName,store)

SELECT Product.name, Purchase.store!FROM Product LEFT OUTER JOIN Purchase ON! Product.name = Purchase.prodName!

Name CategoryGizmo GadgetCamera PhotoOneClick Photo

ProductProdName Store Gizmo Wiz Camera Ritz Camera Wiz

PurchaseName StoreGizmo WizCamera RitzCamera WizOneClick NULL

Result

Innerjoindoesnotproducethistuple 43

Page 44: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Example

}  Compute,foreachproduct,thetotalnumberofsalesin‘September’

What’swrong?

Product(name,category)Purchase(prodName,month,store)

SELECT !Product.name, count(*)! FROM Product, Purchase ! WHERE !Product.name = Purchase.prodName! !and Purchase.month = ‘September’! GROUP BY Product.name!

44

Page 45: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Example

}  Compute,foreachproduct,thetotalnumberofsalesin‘September’

Product(name,category)Purchase(prodName,month,store)

SELECT !Product.name, count(*)! FROM Product LEFT OUTER JOIN Purchase ON !

! Product.name = Purchase.prodName! WHERE Purchase.month = ‘September’! GROUP BY Product.name!

45

What’swrong?

Page 46: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Example

}  Compute,foreachproduct,thetotalnumberofsalesin‘September’

Product(name,category)Purchase(prodName,month,store)

SELECT !Product.name, count(month)! FROM Product LEFT OUTER JOIN Purchase ON !

! !Product.name = Purchase.prodName! WHERE Purchase.month = ‘September’! GROUP BY Product.name!

46

Weneedtousethea?ributetogetthecorrect0count.

Page 47: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

47

Datalog

Page 48: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Datalog

48

}  Friendlynota0onforqueries} Designedforrecursivequeriesinthe80s.}  Today:inacoupleofcommercialproducts,e.g.,LogicBlox,Datomic

}  Today:recursion-freedatalogwithnega0on

Page 49: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Datalog: Facts and Rules

49

Actor(34524,’Johnny’,’Depp’)Casts(34524,28756)Casts(67725,28756)Movie(28756,‘SweeneyTodd’,2007)Movie(28757,‘TheDaVinciCode’,2006)

Q1(y):-Movie(x,y,z),z=‘2007’

Facts=tuplesinthedatabase

Rules=queriesFindmoviesmadein2007

Q2(f,l):-Actor(z,f,l),Casts(z,x),Movie(x,y,’2007’)Findactorswhoactedinamoviein2007

Q3(f,l):-Actor(z,f,l),Casts(z,x1),Movie(x1,y,’2007’),Casts(z,x2),Movie(x2,y2,’2006’)

Findactorswhoactedinamoviein2007andin2006

Page 50: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

EDB and IDB

50

}  ExtensionalDatabasePredicates:EDB}  Actor,casts,movie

}  Inten0onalDatabasePredicates:IDB}  Q1,Q2,Q3

Q2(f,l):-Actor(z,f,l),Casts(z,x),Movie(x,y,’2007’)

Page 51: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Terminology

51

Q2(f,l):-Actor(z,f,l),Casts(z,x),Movie(x,y,’2007’)

head body

atom atom atom

f,l :headvariablesx,y,z :existen0alvariables

Page 52: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Datalog Program

52

B0(x):-Actor(x,’Kevin’,’Bacon’)B1(x):-Actor(x,f,l),Casts(x,z),Casts(y,z),B0(y)B2(x):-Actor(x,f,l),Casts(x,z),Casts(y,z),B1(y)Q4(x):-B1(x)Q4(x):-B2(x)union

FindactorswithBaconnumber≤2

Page 53: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple datalog programs

53

1

2 3

4

5

1 2

2 1

2 3

1 4

3 4

4 5

R=

T(x,y):-R(x,y)T(x,y):-R(x,z)T(z,y)

Whatdoesthiscompute?

Page 54: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple datalog programs

54

1

2 3

4

5

1 2

2 1

2 3

1 4

3 4

4 5

R=

T(x,y):-R(x,y)T(x,y):-R(x,z)T(z,y)

Whatdoesthiscompute?

Tisini0allyempty

Page 55: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple datalog programs

55

1

2 3

4

5

1 2

2 1

2 3

1 4

3 4

4 5

R=

T(x,y):-R(x,y)T(x,y):-R(x,z)T(z,y)

Whatdoesthiscompute?

1 2

2 1

2 3

1 4

3 4

4 5

1stitera0on

Page 56: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple datalog programs

56

1

2 3

4

5

1 2

2 1

2 3

1 4

3 4

4 5

R=

T(x,y):-R(x,y)T(x,y):-R(x,z)T(z,y)

Whatdoesthiscompute?

1 2

2 1

2 3

1 4

3 4

4 5

1stitera0on

1 2

2 1

2 3

1 4

3 4

4 5

2nditera0on

1 1

1 3

2 2

2 4

1 5

3 5

Page 57: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Simple datalog programs

57

1

2 3

4

5

1 2

2 1

2 3

1 4

3 4

4 5

R=

T(x,y):-R(x,y)T(x,y):-R(x,z)T(z,y)

Whatdoesthiscompute?

1 2

2 1

2 3

1 4

3 4

4 5

1stitera0on

1 2

2 1

2 3

1 4

3 4

4 5

2nditera0on

1 1

1 3

2 2

2 4

1 5

3 5

1 2

2 1

2 3

1 4

3 4

4 5

3rditera0on

1 1

1 3

2 2

2 4

1 5

3 5

2 5

Page 58: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Datalog with Negation

58

B0(x):-Actor(x,’Kevin’,’Bacon’)B1(x):-Actor(x,f,l),Casts(x,z),Casts(y,z),B0(y)Q5(x):-Actor(x,f,l),notB1(x),notB0(x)

FindactorswithBaconnumber≥2

Page 59: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Recursion and negation: L

59

S(x):-R(x),notT(x)T(x):-R(x),notS(x)EDB:R(a)

Thefixpointinunclear!

Page 60: Database design and implementation - UMass Amherstavid.cs.umass.edu/courses/645/s2017/lectures/04-SQL.pdf · Database design and implementation CMPSCI 645 Lecture 04: SQL and Datalog

Unsafe Datalog Rules

60

Whatisunsafeabouttheserules?

U1(x,y):-Movie(x,z,’2007’),y>‘2000’

U2(x,u):-Movie(x,z,’2007’),notCasts(u,x)

Aruleissafeifeveryvariableappearsinsomeposi0verela0onalatom