
Upload: barnard-taylor

Post on 02-Jan-2016


Query Processing: Relational Algebra

[email protected]

Relational Algebra
• Basic operators
– Selection ($\sigma$): selects a subset of rows from a relation
– Projection ($\pi$): deletes unwanted columns from a relation
– Union ($\cup$): tuples in relation 1 or in relation 2
– Set difference ($-$): tuples in relation 1 but not in relation 2
– Cartesian product ($\times$): allows us to combine two relations
– Rename ($\rho$): renames relations
– Join ($\bowtie$)
– Aggregate function

• The operators take one or two relations as inputs and produce a new relation as a result.

Select Operation – Basic Concept
• Notation: $\sigma_p(r)$
• Used to select a subset of the tuples from a relation that satisfy a selection condition.
• Defined as:

$\sigma_p(r) = \{t \mid t \in r \text{ and } p(t)\}$

Where p is a formula in propositional calculus consisting of terms connected by: $\land$ (and), $\lor$ (or), $\lnot$ (not)

Each term is one of:

<attribute> op <attribute> or <attribute> op <constant>, where op is one of: =, ≠, >, ≥, <, ≤

• SQL equivalent: Select * from r where p
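The definition above can be sketched in a few lines of Python. This is only an illustration: modelling a relation as a list of dicts and the helper name `select` are assumptions, not part of the slides. The rows are taken from the countries relation used in the example that follows.

```python
# A relation is modelled as a list of dicts, one dict per tuple (an assumption
# made for illustration; relational algebra itself works on sets of tuples).
def select(relation, predicate):
    """sigma_p(r): keep the tuples t of `relation` for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

countries = [
    {"country_id": 1, "country_name": "US", "region_id": 1},
    {"country_id": 2, "country_name": "Indonesia", "region_id": 3},
    {"country_id": 3, "country_name": "Canada", "region_id": 1},
]

# p combines two terms with "and": region_id = 1 and country_name != 'US'
result = select(
    countries,
    lambda t: t["region_id"] == 1 and t["country_name"] != "US",
)
print(result)
```

The predicate plays the role of p: each comparison is one term, and Python's `and`, `or`, `not` stand in for the logical connectives.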

Select Operation – Example

• Relation r

country_id   country_name   region_id
1            US             1
2            Indonesia      3
3            Canada         1
4            Spain          2
5            England        2

• $\sigma_{\text{country\_name} = \text{'Indonesia'}}(r)$

country_id   country_name   region_id
2            Indonesia      3

Select Operation – Exercise

• List the departments located in Indonesia
• Query:
Select * from departments d, locations l, countries c where country_name = 'Indonesia' and c.country_id = l.country_id and l.location_id = d.location_id
• Rel. Algebra?

Project Operation – Basic Concept
• Notation: $\pi_{A_1, A_2, \ldots, A_k}(r)$
where $A_1, A_2, \ldots, A_k$ are attribute names and r is a relation name.
• The result is defined as the relation of k columns obtained by erasing the columns that are not listed
• Duplicate rows are removed from the result, since relations are sets
• Example: to eliminate the country_id and region_id attributes of countries:
$\pi_{\text{country\_name}}(\text{countries})$
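A minimal sketch of projection, again assuming a relation is a list of dicts (the helper name `project` is hypothetical). Returning a Python `set` makes the duplicate elimination described above explicit.

```python
def project(relation, attributes):
    """pi_{A1..Ak}(r): keep only the listed attributes, in the listed order.

    The result is a set of tuples, so duplicate rows collapse automatically.
    """
    return {tuple(t[a] for a in attributes) for t in relation}

countries = [
    {"country_id": 1, "country_name": "US", "region_id": 1},
    {"country_id": 2, "country_name": "Indonesia", "region_id": 3},
    {"country_id": 3, "country_name": "Canada", "region_id": 1},
    {"country_id": 4, "country_name": "Spain", "region_id": 2},
    {"country_id": 5, "country_name": "England", "region_id": 2},
]

names = project(countries, ["country_name"])   # five distinct names
regions = project(countries, ["region_id"])    # five rows collapse to three
print(sorted(regions))
```

Projecting on region_id shows why the result can have fewer rows than the input: the five countries fall into only three distinct regions.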

Project Operation – Example

Relation r

employee_id   first_name   last_name   email               phone_number
emp_001       Jose         Mourinho    [email protected]   +612345678
emp_002       Brendan      Rodgers     [email protected]   +612345679
emp_003       David        Moyes       [email protected]   +612345680

$\pi_{\text{last\_name}, \text{email}}(r)$

last_name   email
Mourinho    [email protected]
Rodgers     [email protected]
Moyes       [email protected]

Project Operation – Exercise

• List the names of employees and the name of the department they work in

• Query:
Select first_name, last_name, department_name from employees e, departments d where e.department_id = d.department_id
• Rel. Algebra?

Union Operation – Basic Concept
• Notation: $r \cup s$
• Defined as:

$r \cup s = \{t \mid t \in r \text{ or } t \in s\}$

• For $r \cup s$ to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column of r deals with the same type of values as does the 2nd column of s)

• Example:

$\pi_{\text{manager\_id}}(\text{departments}) \cup \pi_{\text{manager\_id}}(\text{employees})$

Union Operation – Example

• Relation r

country_name   region_id
US             America
Indonesia      Asia
Canada         America
Spain          Europe
England        Europe

• Relation s

country_name   region_id
Italy          Europe
Thailand       Asia
South Africa   Africa
Indonesia      Asia

• $r \cup s$

country_name   region_id
US             America
Indonesia      Asia
Canada         America
Spain          Europe
England        Europe
Italy          Europe
Thailand       Asia
South Africa   Africa

Type Compatibility

• Two relations are union-compatible if they have the same degree (i.e., the same number of attributes) and the corresponding attributes are defined on the same domains.

• Suppose we have these tables:
• developingCountries (c_id, c_name, region_id)
• Countries (country_id, country_name, region_id)

– These are union-compatible tables

• Union, intersection and set difference require union-compatible tables
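The compatibility requirement can be sketched as a check that runs before any of the three set operations. This is a rough illustration under stated assumptions: relations are Python sets of tuples, the helper `check_union_compatible` is hypothetical, and comparing Python value types is only a stand-in for comparing attribute domains. The data is the r and s from the union example.

```python
def check_union_compatible(r, s):
    """Raise ValueError unless all tuples share one arity and column-wise types."""
    rows = list(r) + list(s)
    if not rows:
        return
    first = rows[0]
    for row in rows[1:]:
        if len(row) != len(first):
            raise ValueError("relations differ in arity")
        if any(type(a) is not type(b) for a, b in zip(first, row)):
            raise ValueError("attribute domains are not compatible")

r = {("US", "America"), ("Indonesia", "Asia"), ("Canada", "America"),
     ("Spain", "Europe"), ("England", "Europe")}
s = {("Italy", "Europe"), ("Thailand", "Asia"),
     ("South Africa", "Africa"), ("Indonesia", "Asia")}

check_union_compatible(r, s)
union = r | s          # tuples in r or in s; the shared Indonesia row appears once
intersection = r & s   # tuples in both r and s
difference = r - s     # tuples in r but not in s
print(len(union), intersection, len(difference))
```

Note that attribute names do not have to match, which is why developingCountries and Countries above are union-compatible: only the number of columns and their domains matter.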

Intersection Operation – Basic Concept

• The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that appear in both R and S. The two operands must be "type compatible"

• Example:

Relation r

country_name   region_id
US             America
Indonesia      Asia
Canada         America
Spain          Europe
England        Europe

Relation s

country_name   region_id
Indonesia      Asia
Spain          Europe
England        Europe
Italy          Europe
Thailand       Asia
South Africa   Africa

r ∩ s

country_name   region_id
Indonesia      Asia
Spain          Europe
England        Europe

Set Difference Operation – Basic Concept

• Notation: $r - s$
• Defined as:

$r - s = \{t \mid t \in r \text{ and } t \notin s\}$

• Set differences must be taken between compatible relations.
– r and s must have the same arity
– attribute domains of r and s must be compatible

Set Difference Operation – Example

Relation r

country_name   region_id
US             America
Indonesia      Asia
Canada         America
Spain          Europe
England        Europe

Relation s

country_name   region_id
Indonesia      Asia
Spain          Europe
England        Europe
Italy          Europe
Thailand       Asia
South Africa   Africa

r – s

country_name   region_id
US             America
Canada         America

Cartesian-Product Operation – Basic Concept

• Notation: $r \times s$
• Defined as:

$r \times s = \{t\,q \mid t \in r \text{ and } q \in s\}$

• Assume that attributes of r(R) and s(S) are disjoint. (That is, $R \cap S = \emptyset$.)

• If attributes of r(R) and s(S) are not disjoint, then renaming must be used.
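A small sketch of the product, including the disjointness requirement. The representation (an attribute-name tuple plus a set of value tuples) and the function name `cartesian_product` are assumptions made for illustration. The sample rows follow the example relations of this section, with the email column left out for brevity.

```python
from itertools import product

def cartesian_product(r_attrs, r, s_attrs, s):
    """r x s: concatenate every tuple t of r with every tuple q of s."""
    if set(r_attrs) & set(s_attrs):
        # Mirrors the rule above: overlapping attribute names need a rename first.
        raise ValueError("attribute names overlap; rename one relation first")
    attrs = tuple(r_attrs) + tuple(s_attrs)
    rows = {t + q for t, q in product(r, s)}
    return attrs, rows

r_attrs = ("country_name", "region_name")
r = {("Spain", "Europe"), ("England", "Europe")}
s_attrs = ("last_name",)
s = {("Mourinho",), ("Rodgers",), ("Moyes",)}

attrs, rows = cartesian_product(r_attrs, r, s_attrs, s)
print(attrs, len(rows))
```

With 2 tuples in r and 3 in s, the result has 2 × 3 = 6 tuples, each carrying all attributes of both relations.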

Cartesian-Product Operation – Example

Cartesian product: combines information from two tables, producing every possible combination of their tuples

Relation r

country_name   region_name
Spain          Europe
England        Europe

Relation s

last_name   email
Mourinho    [email protected]
Rodgers     [email protected]
Moyes       [email protected]

r x s

country_name   region_name   last_name   email
Spain          Europe        Mourinho    [email protected]
Spain          Europe        Rodgers     [email protected]
Spain          Europe        Moyes       [email protected]
England        Europe        Mourinho    [email protected]
England        Europe        Rodgers     [email protected]
England        Europe        Moyes       [email protected]

Composition of Operations

• Can build expressions using multiple operations
• Example: $\pi_{\text{country\_name}, \text{last\_name}}(\sigma_{\text{region\_name} = \text{'Europe'}}(r \times s))$

r x s

country_name   region_name   last_name   email
Spain          Europe        Mourinho    [email protected]
Spain          Europe        Rodgers     [email protected]
Spain          Europe        Moyes       [email protected]
England        Europe        Mourinho    [email protected]
England        Europe        Rodgers     [email protected]
England        Europe        Moyes       [email protected]
…              Asia          Mourinho    [email protected]
…              Asia          Rodgers     [email protected]
…              Asia          Moyes       [email protected]

Result

country_name   last_name
Spain          Mourinho
Spain          Rodgers
Spain          Moyes
England        Mourinho
England        Rodgers
England        Moyes
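The composed expression can be traced step by step in Python: product first, then selection, then projection. This is a sketch under assumptions: relations are lists of dicts, the email column is omitted, and the Asian country "Japan" is a hypothetical placeholder because the transcript does not preserve that country's name.

```python
from itertools import product

r = [
    {"country_name": "Spain", "region_name": "Europe"},
    {"country_name": "England", "region_name": "Europe"},
    {"country_name": "Japan", "region_name": "Asia"},  # hypothetical row
]
s = [{"last_name": "Mourinho"}, {"last_name": "Rodgers"}, {"last_name": "Moyes"}]

# Step 1: r x s, merging each pair of tuples into one wider tuple.
r_x_s = [{**t, **q} for t, q in product(r, s)]

# Step 2: sigma_{region_name = 'Europe'} keeps only the European rows.
europe = [t for t in r_x_s if t["region_name"] == "Europe"]

# Step 3: pi_{country_name, last_name} drops the other columns (set removes duplicates).
result = {(t["country_name"], t["last_name"]) for t in europe}
print(sorted(result))
```

Each operator takes a relation and returns a relation, which is what makes this kind of nesting possible.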

Properties
• Notice that both union and intersection are commutative operations; that is

R ∪ S = S ∪ R, and R ∩ S = S ∩ R

• Both union and intersection can be treated as n-ary operations applicable to any number of relations as both are associative operations; that is

R ∪ (S ∪ T) = (R ∪ S) ∪ T, and

(R ∩ S) ∩ T = R ∩ (S ∩ T)

• The minus operation is not commutative; that is, in general

R – S ≠ S – R
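These properties are easy to check on small relations using Python's built-in set operators. The three relations below are made-up samples for illustration only.

```python
R = {("US", "America"), ("Spain", "Europe")}
S = {("Spain", "Europe"), ("Italy", "Europe")}
T = {("Italy", "Europe"), ("Thailand", "Asia")}

# Union and intersection: order of operands does not matter.
commutative = (R | S == S | R) and (R & S == S & R)

# Union and intersection: grouping does not matter.
associative = (R | (S | T) == (R | S) | T) and ((R & S) & T == R & (S & T))

# Difference: swapping the operands changes the result here.
difference_commutes = (R - S == S - R)

print(commutative, associative, difference_commutes)  # True True False
```

A check on sample data is not a proof, but the `False` for difference is a valid counterexample: R – S contains only the US row, while S – R contains only the Italy row.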

Rename Operation
• Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
• Example: $\rho_X(E)$ returns the expression E under the name X
• If a relational-algebra expression E has arity n, then $\rho_{X(A_1, A_2, \ldots, A_n)}(E)$ returns the result of expression E under the name X, and with the attributes renamed to $A_1, A_2, \ldots, A_n$.

• It returns a new relation with the same schema and content as the original, just with a different name (for the relation, the attributes, or both)

• The original relation is unchanged!
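A minimal sketch of attribute renaming, assuming a relation is a list of dicts; the helper `rename` and its `mapping` parameter are hypothetical. In this model the relation's new name X is simply the Python variable the result is bound to.

```python
def rename(relation, mapping):
    """rho: return a NEW relation whose attribute names are replaced per `mapping`.

    Attributes not mentioned in `mapping` keep their names.
    """
    return [
        {mapping.get(name, name): value for name, value in t.items()}
        for t in relation
    ]

countries = [
    {"country_id": 2, "country_name": "Indonesia", "region_id": 3},
    {"country_id": 4, "country_name": "Spain", "region_id": 2},
]

# rho_{c(c_id, c_name, region_id)}(countries)
c = rename(countries, {"country_id": "c_id", "country_name": "c_name"})
print(c[0])
print(countries[0])  # the original relation still has its old attribute names
```

Because `rename` builds new dicts instead of modifying the input, the original relation stays unchanged, as the last bullet requires.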

Join operation

• Natural join :

• Outer join :

Division Operation – Basic Concept
• Notation: $r \div s$
• Suited to queries that include the phrase “for all”.

• Let r and s be relations on schemas R and S respectively where
– $R = (A_1, \ldots, A_m, B_1, \ldots, B_n)$
– $S = (B_1, \ldots, B_n)$

The result of $r \div s$ is a relation on schema

$R - S = (A_1, \ldots, A_m)$

$r \div s = \{\, t \mid t \in \pi_{R-S}(r) \land \forall u \in s\ (tu \in r) \,\}$

Where tu means the concatenation of tuples t and u to produce a single tuple
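The definition translates almost literally into code. The sketch below assumes relations are sets of tuples with explicit attribute lists; the function name `divide` is hypothetical. The sample r is an illustrative assumption chosen so that only Rodgers is paired with every country in s.

```python
def divide(r_attrs, r, s_attrs, s):
    """r ÷ s: tuples t over R - S such that (t, u) is in r for EVERY u in s."""
    keep = [a for a in r_attrs if a not in s_attrs]     # schema R - S
    keep_idx = [r_attrs.index(a) for a in keep]
    s_idx = [r_attrs.index(a) for a in s_attrs]         # where S's columns sit in R
    # Split every row of r into its (t, u) parts for fast membership tests.
    pairs = {
        (tuple(row[i] for i in keep_idx), tuple(row[i] for i in s_idx))
        for row in r
    }
    candidates = {t for t, _ in pairs}                  # pi_{R-S}(r)
    result = {t for t in candidates if all((t, u) in pairs for u in s)}
    return keep, result

r_attrs = ["country_name", "last_name"]
r = {
    ("Spain", "Mourinho"),
    ("Spain", "Rodgers"),
    ("England", "Rodgers"),
    ("England", "Moyes"),
}
s_attrs = ["country_name"]
s = {("Spain",), ("England",)}

attrs, quotient = divide(r_attrs, r, s_attrs, s)
print(attrs, quotient)
```

The `all(...)` call is the "for all" in the definition: a candidate t survives only if it appears in r together with every tuple u of s.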

Division Operation – Example

Relation s

country_name
Spain
England

Relation r

country_name   last_name   email
Spain          Mourinho    [email protected]
…              Rodgers     [email protected]
…              Moyes       [email protected]
…              Mourinho    [email protected]
…              Rodgers     [email protected]
…              Moyes       [email protected]

Relation r ÷ s

last_name   email
Rodgers     [email protected]

Aggregate Functions – Basic Concept
• An aggregate function takes a collection of values and returns a single value as a result.
– avg: average value
– min: minimum value
– max: maximum value
– sum: sum of values
– count: number of values

• Aggregate operation in relational algebra:

${}_{G_1, G_2, \ldots, G_n}\,\mathcal{G}_{F_1(A_1), F_2(A_2), \ldots, F_n(A_n)}(E)$

– E is any relational-algebra expression
– $G_1, G_2, \ldots, G_n$ is a list of attributes on which to group (can be empty)
– Each $F_i$ is an aggregate function
– Each $A_i$ is an attribute name

Aggregate Functions – Example

Relation r

first_name   job_id   salary
Jose         2        25000
Brendan      2        12000
David        2        20000
Pep          2        23000
Gerard       2        21000
Carlo        2        23000
Cristiano    3        80000
Lionel       3        78000
Andres       3        75000
Di Stefano   1        150000
Frank        1        170000
Sandro       1        140000
Wayne        3        65000
Fernando     3        60000
Moratti      1        120000
Xavi         3        70000

$\mathcal{G}_{\max(\text{salary})}(r)$

max(salary)
170000

${}_{\text{job\_id}}\,\mathcal{G}_{\text{avg}(\text{salary})}(r)$

job_id   avg(salary)
1        145000
2        20666
3        71333
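The two aggregate expressions can be evaluated over relation r with a short grouping helper. This is a sketch, not a definitive implementation: the function `aggregate` and its parameters are hypothetical, and it handles a single aggregate function per call rather than the general list $F_1(A_1), \ldots, F_n(A_n)$.

```python
from collections import defaultdict

def aggregate(relation, group_by, func, attr):
    """Group `relation` on the `group_by` attributes, then apply `func` to `attr`.

    An empty `group_by` list puts every tuple into one group with key ().
    """
    groups = defaultdict(list)
    for t in relation:
        key = tuple(t[g] for g in group_by)
        groups[key].append(t[attr])
    return {key: func(values) for key, values in groups.items()}

rows = [
    ("Jose", 2, 25000), ("Brendan", 2, 12000), ("David", 2, 20000),
    ("Pep", 2, 23000), ("Gerard", 2, 21000), ("Carlo", 2, 23000),
    ("Cristiano", 3, 80000), ("Lionel", 3, 78000), ("Andres", 3, 75000),
    ("Di Stefano", 1, 150000), ("Frank", 1, 170000), ("Sandro", 1, 140000),
    ("Wayne", 3, 65000), ("Fernando", 3, 60000), ("Moratti", 1, 120000),
    ("Xavi", 3, 70000),
]
r = [dict(zip(("first_name", "job_id", "salary"), row)) for row in rows]

max_salary = aggregate(r, [], max, "salary")
avg_by_job = aggregate(r, ["job_id"], lambda v: sum(v) / len(v), "salary")
print(max_salary)
for (job_id,), avg in sorted(avg_by_job.items()):
    print(job_id, int(avg))
```

Passing an empty grouping list corresponds to the first expression (one result row for the whole relation); grouping on job_id corresponds to the second.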

a. Find the names of the employees who work for Bank Niaga.
b. Find the names and cities of residence of all employees who work for Bank Niaga.
c. Find the names, street addresses, and cities of residence of all employees who work for Bank Niaga and earn more than Rp2.000.000.
d. Find the names of all employees who live in the same city as the company for which they work.
e. Find the names of all employees who live in the same city and on the same street as their managers.
f. Find the names of all employees who do not work for Bank Niaga.
h. Assume that companies may be located in several cities. Find all companies located in every city in which Small Bank Corporation is located.

• Given the following database schema:
employee (person-name, street, city)
works (person-name, company-name, salary)
company (company-name, city)
manages (person-name, manager-name)