relational algebra and query languages - github pages

Venkatesh Vinayakarao (Vv) Relational Algebra and Query Languages Venkatesh Vinayakarao [email protected] http://vvtesh.co.in Chennai Mathematical Institute


Post on 29-Dec-2021


Page 1: Relational Algebra and Query Languages - GitHub Pages

Venkatesh Vinayakarao (Vv)

Relational Algebra and Query Languages

Venkatesh Vinayakarao

[email protected]

http://vvtesh.co.in

Chennai Mathematical Institute

Page 2: Relational Algebra and Query Languages - GitHub Pages

Relational Algebra and Query Languages


Page 3: Relational Algebra and Query Languages - GitHub Pages

Query Languages

[Figure: a Query Language box connecting Relation 1 and Relation x to Relation y]

• Procedural language: Relational Algebra

• Declarative languages: Tuple Relational Calculus, Domain Relational Calculus

• Popular language: SQL

Page 4: Relational Algebra and Query Languages - GitHub Pages

Relational Algebra

[Figure: a Query Language box connecting Relation 1 and Relation x to Relation y]

• Fundamental operations: select, project, rename, set difference, Cartesian product, union

• More operations: set intersection, natural join, assignment

Page 5: Relational Algebra and Query Languages - GitHub Pages

Select Operation

• Notation: σ_p(r), where p is called the selection predicate

• Example of selection: σ_{dept_name = “Physics”}(instructor)
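
A minimal Python sketch of the select operation, assuming a relation is modeled as a list of dictionaries. The rows and the helper name `select` are illustrative assumptions, not the instructor table from the slides.

```python
# Illustrative rows only; not the instructor table from the slides.
instructor = [
    {"ID": 1, "name": "Asha", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Ravi", "dept_name": "Music", "salary": 40000},
    {"ID": 3, "name": "Meena", "dept_name": "Physics", "salary": 87000},
]


def select(predicate, relation):
    """sigma_p(r): keep the tuples of `relation` for which `predicate` holds."""
    return [t for t in relation if predicate(t)]


# sigma_{dept_name = "Physics"}(instructor)
physics = select(lambda t: t["dept_name"] == "Physics", instructor)
print([t["name"] for t in physics])
```

The predicate is passed as a function so that any Boolean condition on a tuple's attributes can play the role of p.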

Page 6: Relational Algebra and Query Languages - GitHub Pages

Select Operation

• Example of selection: σ_{dept_name = “Physics”}(instructor)

Page 7: Relational Algebra and Query Languages - GitHub Pages

Project Operation

• Notation: Π_{A1, A2, …, Ak}(r), where the Ai are attribute names

• Example of projection: Π_{ID, name, salary}(instructor)

• Duplicate rows are removed from the result, since relations are sets
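
The duplicate-removal point can be sketched in Python, again assuming a relation is a list of dictionaries. The rows and the helper name `project` are hypothetical.

```python
# Illustrative rows only; not the instructor table from the slides.
instructor = [
    {"ID": 1, "name": "Asha", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Ravi", "dept_name": "Music", "salary": 40000},
    {"ID": 3, "name": "Meena", "dept_name": "Physics", "salary": 87000},
]


def project(attributes, relation):
    """Pi_{attributes}(r): keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # relations are sets, so repeated rows collapse
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Two instructors share the Physics department, but it appears once in the result.
departments = project(["dept_name"], instructor)
print(departments)
```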

Page 8: Relational Algebra and Query Languages - GitHub Pages

Projection

Π_{ID, name, salary}(instructor)

Page 9: Relational Algebra and Query Languages - GitHub Pages

Union Operation

• Notation: r ∪ s. Defined as:

r ∪ s = {t | t ∈ r or t ∈ s}

• For r ∪ s to be valid:

  • r, s must have the same arity (same number of attributes)

  • The attribute domains must be compatible (example: the 2nd column of r deals with the same type of values as does the 2nd column of s)

• Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010 semester, or in both:

Π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) ∪ Π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))
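
The union example can be sketched with Python sets, assuming section is modeled as a set of (course_id, semester, year) triples. The course IDs and the helper `courses_in` are made up for illustration.

```python
# Illustrative (course_id, semester, year) triples; not the section table from the slides.
section = {
    ("CS-101", "Fall", 2009),
    ("CS-347", "Fall", 2009),
    ("CS-101", "Spring", 2010),
    ("FIN-201", "Spring", 2010),
    ("BIO-101", "Summer", 2009),
}


def courses_in(semester, year, relation):
    """Pi_{course_id}(sigma_{semester, year}(relation)) as a Python set."""
    return {cid for (cid, sem, yr) in relation if sem == semester and yr == year}


fall_2009 = courses_in("Fall", 2009, section)
spring_2010 = courses_in("Spring", 2010, section)

# Both operands have arity 1 and the same domain, so the union is valid.
either = fall_2009 | spring_2010
print(sorted(either))
```

CS-101 is taught in both semesters but appears once in the result, because the union of two sets is again a set.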

Page 10: Relational Algebra and Query Languages - GitHub Pages

Union, Selection and Projection

• Π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) ∪ Π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))

Page 11: Relational Algebra and Query Languages - GitHub Pages

Set Difference Operation

• Notation: r − s. Defined as:

r − s = {t | t ∈ r and t ∉ s}

• Example: to find all courses taught in the Fall 2009 semester, but not in the Spring 2010 semester:

Π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) − Π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))

Page 12: Relational Algebra and Query Languages - GitHub Pages

Set Difference, Selection and Projection

• Π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) − Π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))

Page 13: Relational Algebra and Query Languages - GitHub Pages

Set-Intersection Operation

• Notation: r ∩ s

• Defined as: r ∩ s = {t | t ∈ r and t ∈ s}

Quiz: True/False? r ∩ s = r − (r − s)
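
The quiz statement is true: r − s contains exactly the tuples of r that are not in s, so removing them from r leaves the tuples of r that are also in s. The sketch below checks the identity on randomly generated Python sets. It is a sanity check under the assumption that relations behave like sets, not a proof, and the helper name is hypothetical.

```python
import random


def intersection_via_difference(r, s):
    """Compute r ∩ s using only set difference: r - (r - s)."""
    return r - (r - s)


# Compare against Python's built-in intersection on many random pairs of sets.
random.seed(0)
for _ in range(1000):
    r = {random.randrange(10) for _ in range(random.randrange(8))}
    s = {random.randrange(10) for _ in range(random.randrange(8))}
    assert intersection_via_difference(r, s) == r & s

print(intersection_via_difference({1, 2, 3}, {2, 3, 4}))
```

This is why set intersection adds convenience but no expressive power beyond set difference.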

Page 14: Relational Algebra and Query Languages - GitHub Pages

Cartesian-Product Operation

• Notation: r × s

[Figure: the instructor relation and the teaches relation]

Page 15: Relational Algebra and Query Languages - GitHub Pages

The Fly on the Ceiling - Descartes

French mathematician René Descartes (1596-1650)

Image sources: http://sites.psu.edu/solvingproblems, https://wild.maths.org/ren%C3%A9-descartes-and-fly-ceiling, vectorstock.com

Page 16: Relational Algebra and Query Languages - GitHub Pages

The instructor × teaches table

Page 17: Relational Algebra and Query Languages - GitHub Pages

Join Operation

• The Cartesian product instructor × teaches associates every tuple of instructor with every tuple of teaches.

• Most of the resulting rows pair an instructor with a course that the instructor did NOT teach.

• To get only those tuples of instructor × teaches that pertain to instructors and the courses that they taught, we write:

σ_{instructor.id = teaches.id}(instructor × teaches)

Page 18: Relational Algebra and Query Languages - GitHub Pages

Join Operation (Cont.)

• σ_{instructor.id = teaches.id}(instructor × teaches)

Page 19: Relational Algebra and Query Languages - GitHub Pages

Join Operation (Theta Join)

• The join operation allows us to combine a select operation and a Cartesian-product operation into a single operation.

• Consider relations r(R) and s(S). Let θ be a predicate on attributes in the schema R ∪ S. The join operation r ⋈_θ s is defined as follows:

r ⋈_θ s = σ_θ(r × s)

• Thus σ_{instructor.id = teaches.id}(instructor × teaches)

• can equivalently be written as instructor ⋈_{instructor.id = teaches.id} teaches.
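
The definition of the theta join as a selection over a Cartesian product can be sketched directly in Python. The rows, the qualified attribute names used as dictionary keys, and the helper names are assumptions made for illustration.

```python
from itertools import product

# Illustrative rows; attribute names are qualified so the two id columns stay distinct.
instructor = [
    {"instructor.id": 1, "instructor.name": "Asha"},
    {"instructor.id": 2, "instructor.name": "Ravi"},
]
teaches = [
    {"teaches.id": 1, "teaches.course_id": "PHY-101"},
    {"teaches.id": 1, "teaches.course_id": "PHY-201"},
    {"teaches.id": 2, "teaches.course_id": "MUS-110"},
]


def cartesian_product(r, s):
    """r × s: pair every tuple of r with every tuple of s."""
    return [{**tr, **ts} for tr, ts in product(r, s)]


def select(predicate, relation):
    """sigma_p(r): keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


def theta_join(r, s, theta):
    """r ⋈_θ s, defined as sigma_θ(r × s)."""
    return select(theta, cartesian_product(r, s))


joined = theta_join(
    instructor, teaches, lambda t: t["instructor.id"] == t["teaches.id"]
)
print(len(cartesian_product(instructor, teaches)), len(joined))
```

With 2 instructor rows and 3 teaches rows, the product has 6 rows, of which only the 3 with matching ids survive the selection.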

Page 20: Relational Algebra and Query Languages - GitHub Pages

Natural Join

• A special case of the join in which the equality predicate is applied to all attributes that have the same name in both relations.

Note that neither the employee named Mary nor the Production department appear in the result.

Source: https://en.wikipedia.org/wiki/Relational_algebra#Natural_join_(%E2%8B%88)
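
A small Python sketch of the natural join, assuming relations are lists of dictionaries whose keys are the attribute names. The rows are illustrative and were chosen to reproduce the situation in the note above: Mary's department has no matching row, and Production has no employees.

```python
# Illustrative rows, assumed for this sketch.
employee = [
    {"Name": "Harry", "DeptName": "Finance"},
    {"Name": "Sally", "DeptName": "Sales"},
    {"Name": "Mary", "DeptName": "Human Resources"},
]
dept = [
    {"DeptName": "Finance", "Manager": "George"},
    {"DeptName": "Sales", "Manager": "Harriet"},
    {"DeptName": "Production", "Manager": "Charles"},
]


def natural_join(r, s):
    """Join r and s on equality of every attribute name they share."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()
            if all(tr[a] == ts[a] for a in common):
                # Shared attributes appear once in the merged tuple.
                result.append({**tr, **ts})
    return result


joined = natural_join(employee, dept)
print([(t["Name"], t["Manager"]) for t in joined])
```

Tuples without a partner on the shared attribute simply drop out, which is the behavior the note describes.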

Page 21: Relational Algebra and Query Languages - GitHub Pages

Rename Operation

• Allows us to refer to a relation by more than one name.

• Example: ρ_x(E) returns the expression E under the name x

• If a relational-algebra expression E has arity n, then ρ_{x(A1, A2, …, An)}(E) returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.

Page 22: Relational Algebra and Query Languages - GitHub Pages

Find the highest salary in the university

• Steps to reach the solution:

We find all salaries that are less than some existing salary.

The only salary left out in step 2 is the max salary.

[Figure: salary tables for the steps; value shown: 95000]

Page 23: Relational Algebra and Query Languages - GitHub Pages

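Find the highest salary in the university

• Compute instructor × instructor

• Compute the salaries that are less than some other salary (step 2)

• Compute the set of all salaries minus the result of step 2

We use the rename operation to distinguish the two salary attributes.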
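
The three steps can be sketched in Python using only product, select, project, rename and set difference. The rows, the prefix-based `rename` helper, and the names `i` and `d` are assumptions for illustration. The sample data is chosen so that the highest salary is 95000.

```python
from itertools import product

# Illustrative rows only; not the instructor table from the slides.
instructor = [
    {"ID": 1, "salary": 65000},
    {"ID": 2, "salary": 95000},
    {"ID": 3, "salary": 87000},
    {"ID": 4, "salary": 65000},
]


def rename(relation, name):
    """rho_name(r): qualify every attribute with the new relation name."""
    return [{f"{name}.{a}": v for a, v in t.items()} for t in relation]


# Step 1: instructor × instructor, with renaming so the two salary columns differ.
i = rename(instructor, "i")
d = rename(instructor, "d")
pairs = [{**ti, **td} for ti, td in product(i, d)]

# Step 2: salaries that are less than some other salary (select, then project).
not_highest = {t["i.salary"] for t in pairs if t["i.salary"] < t["d.salary"]}

# Step 3: all salaries minus the result of step 2.
all_salaries = {t["salary"] for t in instructor}
highest = all_salaries - not_highest
print(highest)
```

Without the rename, both copies of instructor would contribute an attribute called salary and the comparison in step 2 could not tell them apart.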

Page 24: Relational Algebra and Query Languages - GitHub Pages

Relational Algebra

• A basic expression in the relational algebra consists of one of the following:

  • A relation in the database

  • A constant relation

• Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:

  • E1 ∪ E2

  • E1 − E2

  • E1 × E2

  • σ_P(E1), where P is a predicate on attributes in E1

  • Π_S(E1), where S is a list consisting of some of the attributes in E1

  • ρ_x(E1), where x is the new name for the result of E1

Page 25: Relational Algebra and Query Languages - GitHub Pages

Problem

Consider the following relational schema.

Students(rollno: integer, sname: string)

Courses(courseno: integer, cname: string)

Registration(rollno: integer, courseno: integer, percent: real)

Is the following relational algebra expression equivalent to the English sentence given below?

"Find the distinct names of all students who score more than 90% in the course numbered 107"

[GATE 2013]

Page 26: Relational Algebra and Query Languages - GitHub Pages

Problem

Consider the relational schema given below, where eId of the relation dependent is a foreign key referring to empId of the relation employee. Assume that every employee has at least one associated dependent in the dependent relation.

employee (empId, empName, empAge)

dependent(depId, eId, depName, depAge)

Consider the following relational algebra query:

The above query evaluates to the set of empIds of employees whose age is greater than that of

(A) some dependent.

(B) all dependents.

(C) some of his/her dependents.

(D) all of his/her dependents.

[GATE 2014]


Page 28: Relational Algebra and Query Languages - GitHub Pages

Fundamental Operations

The select, project, rename, set difference, Cartesian product, and union operations are sufficient to express queries!

Page 29: Relational Algebra and Query Languages - GitHub Pages

Extended Operations

• Set Intersection

  • Find the set of all courses taught in both the Fall 2017 and the Spring 2018 semesters:

    Π_{course_id}(σ_{semester = “Fall” ∧ year = 2017}(section)) ∩ Π_{course_id}(σ_{semester = “Spring” ∧ year = 2018}(section))

• Assignment

  • Find all instructors in the “Physics” and “Music” departments:

    Physics ← σ_{dept_name = “Physics”}(instructor)

    Music ← σ_{dept_name = “Music”}(instructor)

    Result: Physics ∪ Music

Page 30: Relational Algebra and Query Languages - GitHub Pages

Aggregate Functions

• Improves ease of use

• Most common functions: sum, max, min, avg, count

• Calligraphic G notation: 𝒢_{sum(c)}(r) computes the sum of all values of attribute c on relation r.
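
A minimal Python sketch of an aggregate over one attribute, assuming the relation r is a list of dictionaries. The rows and the helper name `aggregate` are hypothetical.

```python
# Illustrative relation r with attributes a, b, c.
r = [
    {"a": "x", "b": "p", "c": 7},
    {"a": "x", "b": "q", "c": 7},
    {"a": "y", "b": "q", "c": 3},
    {"a": "y", "b": "p", "c": 10},
]


def aggregate(func, attribute, relation):
    """G_{func(attribute)}(relation): apply func to the column of attribute values.

    The column is a list, so repeated values (the two 7s here) are all counted.
    """
    return func([t[attribute] for t in relation])


def avg(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


total = aggregate(sum, "c", r)
print(total, aggregate(max, "c", r), aggregate(min, "c", r),
      aggregate(avg, "c", r), aggregate(len, "c", r))
```

Here `len` stands in for count, and the other functions map to sum, max, min and avg from the list above.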

Page 31: Relational Algebra and Query Languages - GitHub Pages

Summary

• Relational Algebra is very much like normal algebra (x – y). We use relations instead of numbers as basic expressions.

• Basis for commercial query languages such as SQL.

• Three major components:

  • Fundamental Operations

  • Extended Operations

  • Aggregate Functions

Page 32: Relational Algebra and Query Languages - GitHub Pages

Revision


How many rows will this query produce?

[GATE 2012]

Page 33: Relational Algebra and Query Languages - GitHub Pages

Revision

• 7 Rows
