relational query languages - purdue university


Post on 29-Dec-2021

Page 1: Relational Query Languages - Purdue University

Relational Query Languages

Walid G. Aref

Page 2: Relational Query Languages - Purdue University

Query Languages For The Relational Model

• Two categories of query languages for the relational model:

1. Imperative (Procedural):
  • Specifies the steps to be taken to evaluate a user’s query
  • Relational Algebra: The mathematical foundation for all relational query engines

2. Declarative (Non-procedural):
  • Specifies the query that the user needs answered and not how to answer it
  • DBMS is responsible for query compilation and optimization for efficient evaluation
  • Needs a query optimizer to re-order the operations while guaranteeing a correct answer to the query (the answer does not change during the process of optimization)
  • Relational Calculus comes in two flavors:
    • Tuple Relational Calculus: The mathematical foundation for SQL
    • Domain Relational Calculus: The mathematical foundation for QBE (Query By Example)

Page 3: Relational Query Languages - Purdue University

Query Languages For The Relational Model

• Relational Query Languages
  • Procedural
    • Relational Algebra
  • Declarative
    • Tuple Relational Calculus
      • SQL
    • Domain Relational Calculus
      • Query By Example (QBE)

Page 5: Relational Query Languages - Purdue University

Relational Algebra

• Based on Set Theory
• Building-block operators to access relations and to retrieve and manipulate tuples in these relations
• Each operator takes as input one or multiple relations and produces an output relation → Composition
• Operators can be nested in any order to perform more complex operations

Page 6: Relational Query Languages - Purdue University

Relational Algebra

• Basic operations:
  • Select (σ): Select a subset of the tuples from the input relation.
  • Projection (π): Eliminate unwanted columns from the input relation.
  • Cross-product (×): Produce all possible tuple pairs from the two input relations.
  • Set-difference (−): Produce the tuples that belong to input relation 1 but do not belong to input relation 2.
  • Union (∪): Produce an output table that contains the tuples from the two input relations.
  • Renaming (ρ): Change the name of an attribute or a table.
• Additional non-basic operations:
  • Can be realized by composing multiple basic operations
  • Intersection, join, division
• Relational Algebra is “closed” under the relational operators.

Page 7: Relational Query Languages - Purdue University

Our Example Relational Database Schema

• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled(sid: string, cid: string, grade: string)
• Instructor(iid: string, iname: string, irank: string, isalary: real)
• Teaches(iid: string, cid: string, year: integer, semester: string)

Page 8: Relational Query Languages - Purdue University

Project (π)

• Retain the listed attributes and eliminate the unlisted ones
• May need to eliminate the resulting duplicate tuples (see Figure)
• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• π_{sid, name}(Students)
• Can also have expressions in the attribute list, e.g., π_{name, age + 1}(Students)

Students:

sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

π_{sid, name}(Students):

sid   name
0111  Bright, Mary
0222  Star, Adam
0444  Zhang, Rita
0333  Shah, Ragu
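The projection operator can be sketched in Python as follows. This is only an illustration, assuming a relation is stored as a set of tuples next to a tuple of attribute names; the helper name `project` is hypothetical, and the login values are set to None because they are obscured in the transcript.

```python
# Assumption: a relation is a set of tuples plus a schema (tuple of attribute names).
STUDENTS_SCHEMA = ("sid", "name", "login", "age", "gpa")
students = {
    ("0111", "Bright, Mary", None, 22, 4.0),  # login omitted (obscured in the source)
    ("0222", "Star, Adam", None, 21, 2.3),
    ("0444", "Zhang, Rita", None, 17, 3.3),
    ("0333", "Shah, Ragu", None, 19, 3.0),
}


def project(rows, schema, attrs):
    """Keep only the listed attributes; building a set removes duplicate tuples."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in positions) for row in rows}


sid_name = project(students, STUDENTS_SCHEMA, ("sid", "name"))

# Duplicate elimination: two input tuples collapse into one output tuple.
collapsed = project({(1, "a"), (2, "a")}, ("k", "v"), ("v",))
```

Using a set for the output is what makes duplicate elimination automatic here; a list-based version would need an explicit de-duplication step.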

Page 9: Relational Query Languages - Purdue University

Select (𝝈 )

• Given an input predicate and table, produce as output the rows that satisfy the predicate
• Schema of the output table is the same as that of the input table
• σ_{age < 20}(Students)

Students:

sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

σ_{age < 20}(Students):

sid   name          login              age  gpa
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0
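A minimal Python sketch of selection, assuming rows are named tuples and the predicate is passed as a function. The names `Student` and `select` are hypothetical, and login is set to None because it is obscured in the transcript.

```python
from collections import namedtuple

# Assumption: each row of Students is a named tuple following the example schema.
Student = namedtuple("Student", "sid name login age gpa")

students = {
    Student("0111", "Bright, Mary", None, 22, 4.0),
    Student("0222", "Star, Adam", None, 21, 2.3),
    Student("0444", "Zhang, Rita", None, 17, 3.3),
    Student("0333", "Shah, Ragu", None, 19, 3.0),
}


def select(rows, predicate):
    """Return the rows that satisfy the predicate; the schema is unchanged."""
    return {row for row in rows if predicate(row)}


under_20 = select(students, lambda s: s.age < 20)
```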

Page 10: Relational Query Languages - Purdue University

Cross Product ( X )

• Given two tables, produce all possible combinations of tuple pairs
• Enrolled(sid: string, cid: string, grade: string)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled × Courses
• The schema of the output table is the concatenation of the schemas of the two input tables (an attribute name that occurs in both inputs, such as cid, appears once per input)

Enrolled:

sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Courses:

cid    cname       credits
CS541  DB Systems  3
CS580  Algorithms  3

Enrolled × Courses:

sid   cid    grade  cid    cname       credits
0111  CS541  A+     CS541  DB Systems  3
0111  CS541  A+     CS580  Algorithms  3
0222  CS580  B-     CS541  DB Systems  3
0222  CS580  B-     CS580  Algorithms  3
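The cross product of the two example tables can be sketched in Python with `itertools.product`. The helper name `cross` is hypothetical, and relations are assumed to be sets of plain tuples.

```python
from itertools import product

enrolled = {("0111", "CS541", "A+"), ("0222", "CS580", "B-")}
courses = {("CS541", "DB Systems", 3), ("CS580", "Algorithms", 3)}


def cross(r, s):
    """Pair every tuple of r with every tuple of s by concatenating them."""
    return {a + b for a, b in product(r, s)}


pairs = cross(enrolled, courses)
```

The result has |Enrolled| × |Courses| = 4 tuples of six values each, including the pairs whose two cid values differ.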

Page 11: Relational Query Languages - Purdue University

Union ( ∪ )

• Given two tables with compatible schemas:
  • Same number of attributes
  • Same data type per matching attribute pairs
• Will eliminate duplicates
• WL_Enrolled(sid: string, cid: string, grade: string)
• Calumet_Enrolled(sid: string, cid: string, grade: string)
• WL_Enrolled ∪ Calumet_Enrolled

WL_Enrolled:

sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Calumet_Enrolled:

sid   cid    grade
0111  CS541  A+
0333  CS503  B

WL_Enrolled ∪ Calumet_Enrolled:

sid   cid    grade
0111  CS541  A+
0333  CS503  B
0222  CS580  B-

Page 12: Relational Query Languages - Purdue University

Set Difference ( - )

• Given two tables with compatible schemas:
  • Same number of attributes
  • Same data type per matching attribute pairs
• Produce the tuples in the first table that do not exist in the second table
• WL_Enrolled(sid: string, cid: string, grade: string)
• Calumet_Enrolled(sid: string, cid: string, grade: string)
• WL_Enrolled − Calumet_Enrolled
• Notice that, in contrast to Union, Set Difference is not commutative.

WL_Enrolled:

sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Calumet_Enrolled:

sid   cid    grade
0111  CS541  A+
0333  CS503  B

WL_Enrolled − Calumet_Enrolled:

sid   cid    grade
0222  CS580  B-
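Union and set difference map directly onto Python's built-in set operators. The sketch below assumes that WL_Enrolled holds the rows for students 0111 and 0222 and Calumet_Enrolled holds the rows for 0111 and 0333; that assignment is inferred from the difference result shown on the slide.

```python
# Assumption: which rows belong to which campus table is inferred from the slide's result.
wl_enrolled = {("0111", "CS541", "A+"), ("0222", "CS580", "B-")}
calumet_enrolled = {("0111", "CS541", "A+"), ("0333", "CS503", "B")}

# Union: the shared tuple (0111, CS541, A+) appears only once in the result.
union = wl_enrolled | calumet_enrolled

# Set difference is not commutative: the two orders give different results.
wl_only = wl_enrolled - calumet_enrolled
calumet_only = calumet_enrolled - wl_enrolled
```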

Page 13: Relational Query Languages - Purdue University

Renaming (𝜌)

• Allows us to assign a name to the result of a relational algebra expression
• ρ_{X(A1, A2, …, An)}(E)
  • Returns the result of expression E under the name X, with the attributes renamed to A1, A2, …, An

Page 14: Relational Query Languages - Purdue University

Composition of Operators

• Find the names of students with GPA 4.0
• π_{name}(σ_{gpa = 4.0}(Students))

Students:

sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

σ_{gpa = 4.0}(Students):

sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0

π_{name}(σ_{gpa = 4.0}(Students)):

name
Bright, Mary

Page 15: Relational Query Languages - Purdue University

Additional Relational Algebra Operators

• Can be realized by compositions of the basic Relational Algebra operators
• Intersect:
  • Given two relations with compatible schemas, find the common tuples in both input relations
  • Can be realized by the basic relational algebra operators:
    • r ∩ s = r − (r − s)
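The identity r ∩ s = r − (r − s) can be checked on a small example in Python. The two relations below are made-up illustration data, and `intersect` is a hypothetical helper that uses only set difference.

```python
# Hypothetical sample relations with compatible (sid, cid) schemas.
r = {("0111", "CS541"), ("0222", "CS580"), ("0333", "CS503")}
s = {("0222", "CS580"), ("0333", "CS503"), ("0444", "CS590")}


def intersect(r, s):
    """Intersection built only from set difference: r - (r - s)."""
    return r - (r - s)


common = intersect(r, s)
```

Here r − s removes from r everything that s also contains, so subtracting that remainder from r leaves exactly the shared tuples.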

Page 16: Relational Query Languages - Purdue University

Join

• Natural Join: r ⋈ s = π_{R ∪ S}(σ_{r.id = s.id}(r × s))
  • R ∪ S:
    • R, S = schemas of r and s
    • Means that the common attributes are repeated only once
  • r ⋈ s: Join based on equality of all the common attributes between the two tables
• Equi-Join: r ⋈_{r.a = s.b} s = σ_{r.a = s.b}(r × s)
  • Has an equality predicate
• Theta-Join: r ⋈_{r.a > s.b} s
  • Has a general predicate (θ comparator)
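The join variants can be sketched in Python as a filter over the cross product. This is an illustration under stated assumptions: rows are dicts keyed by attribute name, inputs are duplicate-free lists, and the helper names `theta_join` and `natural_join` are hypothetical.

```python
from itertools import product


def theta_join(r, s, theta):
    """Select over r x s with a general predicate; assumes r and s share no attribute names."""
    return [{**a, **b} for a, b in product(r, s) if theta(a, b)]


def natural_join(r, s):
    """Equality on all common attributes; each common attribute is kept once."""
    out = []
    for a, b in product(r, s):
        common = a.keys() & b.keys()
        if all(a[k] == b[k] for k in common):
            out.append({**a, **b})
    return out


enrolled = [
    {"sid": "0111", "cid": "CS541", "grade": "A+"},
    {"sid": "0222", "cid": "CS580", "grade": "B-"},
]
courses = [
    {"cid": "CS541", "cname": "DB Systems", "credits": 3},
    {"cid": "CS580", "cname": "Algorithms", "credits": 3},
]

joined = natural_join(enrolled, courses)

# Theta-join with a ">" comparator on hypothetical single-attribute relations.
greater = theta_join([{"a": 1}, {"a": 5}], [{"b": 3}], lambda x, y: x["a"] > y["b"])
```

An equi-join is the special case of `theta_join` whose predicate is an equality test, for example `lambda x, y: x["a"] == y["b"]`.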

Page 17: Relational Query Languages - Purdue University

Division ( ÷ )

• r ÷ s
• R ∩ S ≠ ∅ (e.g., R contains foreign keys to S)
• Find the tuples in r that join with all the tuples in s
• For example, find the students who are enrolled in all courses
• r ÷ s = π_{R−S}(r) − π_{R−S}((π_{R−S}(r) × s) − π_{R−S,S}(r))
• π_{R−S,S}(r) reorders the attributes of r
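The division formula can be traced step by step in Python. The sketch assumes r holds (sid, cid) pairs and s is a set of cid values, so no attribute reordering is needed; the data and the helper name `divide` are hypothetical.

```python
from itertools import product


def divide(r, s):
    """r / s for r over (x, y) pairs and s over y values."""
    xs = {x for x, _ in r}                    # projection of r on R-S
    all_pairs = set(product(xs, s))           # every candidate x paired with every y in s
    missing = {x for x, _ in all_pairs - r}   # x values that lack at least one y
    return xs - missing                       # x values paired with all of s


# Hypothetical data: student 0111 takes both courses, student 0222 takes only one.
enrolled = {("0111", "CS541"), ("0111", "CS580"), ("0222", "CS580")}
courses = {"CS541", "CS580"}

in_all_courses = divide(enrolled, courses)
```

The subtraction `all_pairs - r` plays the role of the inner difference in the formula: it lists the (student, course) combinations that would have to exist but do not.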

Page 18: Relational Query Languages - Purdue University

Extended Relational-Algebra Operations

• Aggregate Functions and Operations
• Outer Join

Page 19: Relational Query Languages - Purdue University

Grouping and Aggregate Functions in Relational Algebra

• _{G1, G2, …, Gn} 𝒢 _{F1(A1), F2(A2), …, Fn(An)}(E)
• 𝒢 is the grouping operator
• Takes as input:
  • A relation (or a relational algebra expression that produces a relation), and
  • An optional list (can be null) of grouping attributes G1, G2, …, Gn
• Each aggregate function Fi takes as input the values of a certain attribute Ai of the input relation and produces a scalar value as output
• Example aggregate functions:
  • count: count the number of values (or number of input tuples)
  • sum: sum of values
  • avg: average value
  • min: minimum value
  • max: maximum value
• Example: Compute the average gpa of students grouped by age: _{age} 𝒢 _{avg(gpa)}(Students)
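Grouping followed by aggregation can be sketched in Python with a dictionary of groups. The helper name `group_aggregate` is hypothetical, rows are assumed to be dicts, and the fifth student row is made up so that one age group contains more than one tuple.

```python
from collections import defaultdict
from statistics import mean


def group_aggregate(rows, group_attr, value_attr, agg):
    """Group rows by one attribute, then reduce each group's values to a scalar."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_attr]].append(row[value_attr])
    return {key: agg(values) for key, values in groups.items()}


students = [
    {"sid": "0111", "age": 22, "gpa": 4.0},
    {"sid": "0222", "age": 21, "gpa": 2.3},
    {"sid": "0444", "age": 17, "gpa": 3.3},
    {"sid": "0333", "age": 19, "gpa": 3.0},
    {"sid": "0555", "age": 19, "gpa": 3.5},  # hypothetical extra row, not in the slides
]

avg_gpa_by_age = group_aggregate(students, "age", "gpa", mean)
count_by_age = group_aggregate(students, "age", "gpa", len)
```

Passing `sum`, `min`, or `max` as `agg` gives the other aggregate functions listed above.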

Page 20: Relational Query Languages - Purdue University

Outer Join

• Compute the join regularly
• Then, add the tuples from one relation that do not join, padded with null values
• Comes in three flavors:
  • Left outer-join (⟕)
  • Right outer-join (⟖)
  • Full outer-join (⟗)

[Figure: Two tables, r and s, with the join r ⋈ s and the left, right, and full outer joins; tuples that do not join are padded with nulls.]
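A left outer join can be sketched in Python as a regular join plus null padding for the unmatched left tuples. The helper name `left_outer_join`, its parameters, and the sample rows are assumptions for illustration; None stands in for null.

```python
def left_outer_join(r, s, key, s_attrs):
    """Join r and s on `key`; unmatched rows of r are padded with None for s_attrs."""
    out = []
    for a in r:
        matches = [b for b in s if b[key] == a[key]]
        if matches:
            out.extend({**a, **b} for b in matches)
        else:
            out.append({**a, **{attr: None for attr in s_attrs}})
    return out


# Hypothetical data: student 0444 has no enrollment row.
students = [
    {"sid": "0111", "name": "Bright, Mary"},
    {"sid": "0444", "name": "Zhang, Rita"},
]
enrolled = [{"sid": "0111", "cid": "CS541"}]

result = left_outer_join(students, enrolled, "sid", ["cid"])
```

A right outer join follows by swapping the roles of the two inputs, and a full outer join adds the unmatched tuples of both sides.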

Page 21: Relational Query Languages - Purdue University

Modifying the Underlying Relations in Relational Algebra

• E is some Relational Algebra expression that returns a table
• E can be a table containing one constant tuple
• Delete tuples from a relation r: r ← r − E
• Insert tuples into a relation r: r ← r ∪ E
• Update the value of one attribute in a relation r: r ← π_{F_1, F_2, …, F_l}(r)
  • Set F_i = A_i for the attributes whose values you do not want to change
  • For the attribute you want to update, plug in its place the new value (which may depend on the old value, e.g., old value + Constant, A1 + A2, or Constant2)
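The three modification patterns can be mirrored in Python by reassigning a set. The (sid, age) rows below are hypothetical, and the update is written as a projection that copies sid and replaces age with age + 1.

```python
# Hypothetical relation over (sid, age).
r = {("0111", 22), ("0222", 21)}

# Delete: r <- r - E, where E is a table holding one constant tuple.
r = r - {("0222", 21)}

# Insert: r <- r U E.
r = r | {("0333", 19)}

# Update: r <- projection with F1 = sid (unchanged) and F2 = age + 1 (new value).
r = {(sid, age + 1) for sid, age in r}
```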

Page 24: Relational Query Languages - Purdue University

Tuple Relational Calculus

• Non-Procedural
• Query is of the form: {t | P(t)}
  • Where t is a tuple and P is a predicate (a formula in Predicate Calculus)
• t[A] or t.A: The value of attribute A in t
• t ∈ r: Tuple t is in relation r

Page 25: Relational Query Languages - Purdue University

Predicate Calculus Formula

• Set of attributes and constants
• Set of comparison operators (e.g., <, ≤, =, ≠, >, ≥)
• Set of connectives: and (∧), or (∨), not (¬)
• Implication (⇒): x ⇒ y means that if x is true, then y is true
  • x ⇒ y ≡ ¬x ∨ y
• Set of quantifiers:
  • ∃ t ∈ r (Q(t)) ≡ “there exists” a tuple t in relation r such that predicate Q(t) is true
  • ∀ t ∈ r (Q(t)) ≡ Q is true “for all” tuples t in relation r

Page 26: Relational Query Languages - Purdue University

Expressing Relational Algebra Operators in Tuple Relational Calculus

• Select σ_{A = “123”}(r): {t | t ∈ r ∧ t.A = “123”}
• Project π_{A,C}(r): {u | ∃ t ∈ r (u.A = t.A ∧ u.C = t.C)}
  • Schema for u implicitly becomes AC
• Cross-Product r × s: {ut | u ∈ r ∧ t ∈ s}
• Union r ∪ s: {t | t ∈ r ∨ t ∈ s}
• Set Difference r − s: {t | t ∈ r ∧ t ∉ s}
• Join r ⋈_{r.A = s.B} s: {ut | u ∈ r ∧ t ∈ s ∧ u.A = t.B}
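Python set comprehensions read much like tuple relational calculus: the loop variable ranges over whole tuples and the condition plays the role of the predicate. The relations and attribute names below are hypothetical, chosen so that r.A can be compared with s.B.

```python
from collections import namedtuple

# Hypothetical schemas: r(A, B, C) and s(B, D).
R = namedtuple("R", "A B C")
S = namedtuple("S", "B D")

r = {R("123", "x", 1), R("456", "y", 2)}
s = {S("123", "p"), S("789", "q")}

select_ = {t for t in r if t.A == "123"}                 # {t | t in r and t.A = "123"}
project_ = {(t.A, t.C) for t in r}                       # u.A = t.A and u.C = t.C
cross_ = {(u, t) for u in r for t in s}                  # {ut | u in r and t in s}
join_ = {(u, t) for u in r for t in s if u.A == t.B}     # adds the condition u.A = t.B
```

Because every variable here iterates over a finite stored relation, each of these expressions is safe by construction.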

Page 27: Relational Query Languages - Purdue University

Safety of Tuple Relational Calculus Expressions

• An expression {t | P(t)} in the TRC is safe if every component of t appears in one of the relations, tuples, or constants that appear in P
• Avoid writing tuple calculus expressions that generate infinitely many tuples in a relation.
• For example, {t | ¬ (t ∈ r)} is an infinite relation if the domain of any attribute of relation r is infinite
• Solution: Restrict the set of allowable TRC expressions to safe expressions.
• NOTE: this is more than just a syntax condition.
  • E.g., {t | t.A = 1 ∨ true} is not safe: it defines an infinite set with attribute values that do not appear in any relation, tuple, or constant in P.

Page 28: Relational Query Languages - Purdue University

Domain Relational Calculus

• Non-procedural (Declarative)
• Same as TRC, but variables refer to attributes, not tuples
• DRC query form: {<x1, x2, …, xn> | P(x1, x2, …, xn)}
  • x1, x2, …, xn: Domain variables
  • P: Formula, same as that in Predicate Calculus

Page 29: Relational Query Languages - Purdue University

Expressing Relational Algebra Operators in Domain Relational Calculus

• Select σ_{A = “123”}(r): {<x,y,z> | <x,y,z> ∈ r ∧ x = “123”}
• Project π_{A,C}(r): {<x,z> | ∃ y (<x,y,z> ∈ r)}
• Cross-Product r × s: {<a,b,c,d,e,f> | <a,b,c> ∈ r ∧ <d,e,f> ∈ s}
• Union r ∪ s: {<x,y,z> | <x,y,z> ∈ r ∨ <x,y,z> ∈ s}
• Set Difference r − s: {<x,y> | <x,y> ∈ r ∧ <x,y> ∉ s}
• Join r ⋈_{r.A = s.D} s: {<a,b,c,a,e,f> | <a,b,c> ∈ r ∧ <a,e,f> ∈ s}
  • Notice that the join is implicit: the same variable is plugged into multiple locations in the expression
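Domain relational calculus can be imitated in Python by unpacking each tuple into one variable per attribute. The relations below are hypothetical three-attribute tables. Python cannot reuse one variable in two patterns, so the implicit join variable is written as an explicit equality test.

```python
# Hypothetical relations r(A, B, C) and s(D, E, F) as sets of plain tuples.
r = {("123", "x", 1), ("456", "y", 2)}
s = {("123", "p", "q"), ("789", "u", "v")}

select_ = {(x, y, z) for (x, y, z) in r if x == "123"}
project_ = {(x, z) for (x, y, z) in r}   # y is bound by the pattern but not returned
cross_ = {(a, b, c, d, e, f) for (a, b, c) in r for (d, e, f) in s}

# In DRC the join reuses variable a in both patterns; here that becomes a == d.
join_ = {(a, b, c, d, e, f) for (a, b, c) in r for (d, e, f) in s if a == d}
```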
