kjell orsborn 2015-09-14 1 uu - dis - udbl database systems - 10p course no.? a second course on...
TRANSCRIPT
Kjell Orsborn 23-04-21
1UU - DIS - UDBL
DATABASE SYSTEMS - 10pCourse No. ??
A second course on development of database systems
Kjell OrsbornUppsala Database Laboratory
Department of Information Science, Uppsala University, Uppsala, Sweden
Kjell Orsborn 23-04-21
2UU - DIS - UDBL
Introduction to Relational Model
Elmasri/Navathe ch 7
Lecture 2
Kjell Orsborn
Department of Information Science
Uppsala University, Uppsala, Sweden
Kjell Orsborn 23-04-21
3UU - DIS - UDBL
The Relational Model
• The relational model was introduced by E. F. Codd 1970.
• Many DBMS’s are based on this data model.
• It support simple declarative, but yet powerful, languages for describing operations on data.
• Operations in the relational model applies to relations (tables) and produce new relations.– This means that an operation can be applied to the result of
another operation and that several different operations can be combined.
– Operations are described in an algebraic notation that is based on relational algebra.
Kjell Orsborn 23-04-21
4UU - DIS - UDBL
Relations as mathematical objects
• In set theory, a relation is defined as a subset of the product set (cartesian produkt) of a number of domains (value sets).
• The product set of the domains D1,D2,...,Dn is written as D1D2.. Dn.• D1D2... Dn constitute the set of all ordered sets <v1,v2,...,vn> such that
vi belongs to Di for all i.• If n=2, D1={T, F} and D2={P, Q, R} one gets the product sets:
D1D2 = {<T,P>,<T,Q>,<T,R>,<F,P>,<F,Q>,<F,R>}D2D1 = {<P,T>,<P,F>,<Q,T>,<Q,F>,<R,T>,<R,F>}
• For example, we have the relations:R1 D2D1 R1= {<P,T>,<Q,T>,<R,T>}R2 D2D1 R2= {<P,T>,<P,F>}
• Members of a relation is called tuples. If the relation is of degree n, the tuples are called n-tuples.
Kjell Orsborn 23-04-21
5UU - DIS - UDBL
An example relation
• If customer-name = {Jones, Smith, Curry, Lindsay } customer-street = { Main, North, Park } customer-city = { Harrison, Rye, Pittsfield }
• Then r = {(Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)}is a relation over customer-name customer-street customer-city
Kjell Orsborn 23-04-21
6UU - DIS - UDBL
Relation schema
• A1, A2, . . ., An are attributes
• R = (A1, A2, . . ., An) is a relation schema
– Customer-schema(customer-name, customer-street, customer-city)
• r(R) is a relation on the relation schema R
– customer (Customer-schema)
Kjell Orsborn 23-04-21
7UU - DIS - UDBL
Relation instance• The current values (relation instance) of a relation are specified
by a table.
• An element t of r is a tuple - represented by a row in a table customer
customer
customer-name customer-street customer-city
Jones Main Harrison
Smith North Rye
Curry North Rye
Lindsay Park Pittsfield
Kjell Orsborn 23-04-21
8UU - DIS - UDBL
Relations as tables
A tupleA tuple
An attributeAn attributeA relationA relation
customer-name customer-street customer-city
Jones Main Harrison
Smith North Rye
Curry North Rye
Lindsay Park Pittsfield
Kjell Orsborn 23-04-21
9UU - DIS - UDBL
First Normal Form
• Only simple or atomic values are allowed in the relational model.
• Attributes is not allowed to have composite or multiple values.
• The theory for the relational model is based on these assumptions which is called:
The first normal form assumption
Kjell Orsborn 23-04-21
10UU - DIS - UDBL
Null values
• A special value, null or , can sometimes be used as an attribute value.
• Every occurence of null is unique. Thus, two occurences of null is not considered to be equal even if they are represented by the same symbol.
• null is used:– when one does not know the actual value of an attribute.
– when a certain attribute does not have a value.
– when an attribute is not applicable.
• Examples of the use of null are showed later.
Kjell Orsborn 23-04-21
11UU - DIS - UDBL
Keys
• Because relations are sets, all tuples in the relation are different.
• There is usually a subset k of the attributes in a relation R, i.e. k R, that has the characteristic that if the tuplest1, t2 R och t1 t2, då gäller att t1[k] t2[k] (hence the value of k in t1 the value of k in t2)
• Every such subset k is called a superkey for R.
Kjell Orsborn 23-04-21
12UU - DIS - UDBL
Keys - continued . . .
• A superkey k is minimal if there is no other superkey k' such that k' k.
• Every minimal superkey (OBS! ther can be more than one) is called a candidate key for R.
• The candidate key chosen by the database designer as the key for R is called R:s primary key or just key.
• In addition, term foreign key is used when a tuple is referenced, from another relation, with its key.
Kjell Orsborn 23-04-21
13UU - DIS - UDBL
Key examples
• Example superkey:– {customer-name, customer- street} and {customer- name} are both
superkeys of Customer, if no two customers can possibly have the same name.
• Example candidate key:– {customer- name} is a candidate key for Customer , since it is a
superkey (assuming no two customers can possibly have the same name), and no subset of it is a superkey.
Kjell Orsborn 23-04-21
14UU - DIS - UDBL
Determining keys from E-R types
• Strong entity type. The primary key of the entity type becomes the primary key of the relation.
• Weak entity type. The primary key of the relation consists of the union of the primary key of the strong entity type and the discriminator of the weak entity type.
• Relationship type. The union of the primary keys of the related entity types becomes a super key of the relation.– For binary many-to-many relationship types, above super key is also the
primary key.
– For binary many-to-one relationship types, the primary key of the “many” entity type becomes the relation’s primary key.
– For one-to-one relationship types, the relation’s primary key can be that of either entity type.
Kjell Orsborn 23-04-21
15UU - DIS - UDBL
Integrity constraintsfor a relational database schema
• 1. Domain constraint– attribute values for attribute A shall be atomic values from dom(A)
• 2. Key constraint– candidate keys for a relation must be unique
• 3. Entity integrity constraint– no primary key is allowed to have a null value
• 4. Referential integrity constraint – a tuple that refers to another tuple in another relation must refer to an
existing tuple
• 5. Semantic integrity constraint – e.g. “an employee’s total work time per week can not exceed 40 hours
for all projects taken all together”
Kjell Orsborn 23-04-21
16UU - DIS - UDBL
From E-R to relational model• The basic procedure defines a set of relational schemas that
represent entity and relationship types in the E-R model. This model should further with integrity constraints.– Primary keys allow entity types and relationship types to be expressed uniformly
as tables which represent the contents of the database.
– A database which conforms to an E-R diagram can be represented by a collection of tables.
– For each entity type and relationship type there is a unique table which is assigned the name of the corresponding entity type or relationship type.
– Each table has a number of columns (generally corresponding to attributes), which have unique names.
– Converting an E-R diagram to a table format is the basis for deriving a relational database design from an E-R diagram.
Kjell Orsborn 23-04-21
17UU - DIS - UDBL
Steps in translation from E-R model
to relational model• Translation of entity types and their attributes
– Step 1) Entity types
– Step 2) Weak entity types
• Translation of relationships– Step 3) 1-1 Relationship
– Step 4) 1-N Relationship
– Step 5) M-N Relationship
• Translation of multivalued attributes and relationships– Step 6) Multivalued attributes
– Step 7) Multivalued relationships
Kjell Orsborn 23-04-21
18UU - DIS - UDBL
Translating entity types and their attributes
• Step 1: Entity types - a strong entity type reduces to a table with the same attributes.– Key attributes (primary key - pk) is made the primary key column(s) for
the table. Each attribute gets their own column.
– Composite attributes are normally represented by their simple components.
– Example customer schema and table:
Customer(social-security, customer-name, c-street, c-city)pkpk
Kjell Orsborn 23-04-21
19UU - DIS - UDBL
pk a1 pk k a2
R
Translating entity types cont. . .
E2
a1a1
pkpk
a2a2
kk
1 N
• Step 2: Weak entity types - a weak entity type becomes a table that includes a column for the primary key of the identifying strong entity type .
E1E1
Kjell Orsborn 23-04-21
20UU - DIS - UDBL
Translating entity types cont. . .• The table corresponding to a relationship type linking a weak entity type to
its identifying strong entity type is redundant.
• Example of the payment schema and table:– The payment table already contains the information that would appear in the
loan-payment table (i.e., the columns loan-number and payment-no).
Payment(loan-number, payment-no, pay-date, amount)
loan-number payment-no pay-date amount
L-17 5 10 May 1996 50
L-23 11 17 May 1996 75
L-15 22 23 May 1996 300
Kjell Orsborn 23-04-21
21UU - DIS - UDBL
Translating relationship types• Step 3: 1-1 Relationship types
– The foreign key column (fk) is a copy of the other entity’s primary key column (pk). The values in a fk-column point to unique row in the other table, and thus implement the relationship.
pk1 a1 pk2 a2 f k1Alt 1:
Alt 2:pk2 a2pk1 a1 f k2
R E2
a1a1
pk1pk1
a2a2
pk2pk2
1 1
E1E1
Kjell Orsborn 23-04-21
22UU - DIS - UDBL
Translating 1-1 relationship types cont. . .
pk1 a1Alt 3:
Alt 4:
f k1 f k2 pk2 a2
E1 R E2
pk1 a1
E1
pk2 a2
E2
Kjell Orsborn 23-04-21
23UU - DIS - UDBL
Translating relationship . . . cont. . .• Step 4: 1-N Relationship types
– Include the primary key of the “1-side” as a foreign key on the “N-side”, (i.e. the foreign key column is placed on the entity on the N-side).
– Alternatively, an extra table (R) is created whose primary key is composed of the two foreign keys.
pk1 a1 pk2 a2 f k1Alt 1:
Alt 2:pk1 a1 f k1 f k2 pk2 a2
R E2
a1a1
pk1pk1
a2a2
pk2pk2
1 N
E1E1
Kjell Orsborn 23-04-21
24UU - DIS - UDBL
Translating relationship . . . cont. . .
• Step 5: M-N Relationship types– Always a separate table with columns for the primary keys of the two
participating entity types, and any descriptive attributes of the relationship type.
R E2
a1a1
pk1pk1
a2a2
pk2pk2
M N
E1E1
pk1 a1 f k1 f k2 pk2 a2
Kjell Orsborn 23-04-21
25UU - DIS - UDBL
Translating relationship . . . cont. . .
• Step 6: Multivalued attributes– A separate table is created for the multivalued attribute. Its primary key
is composed of the owning entity’s primary key, and the attribute value itself.
aa
pkpk EE
pk a pk mva
mva
E E-MVA
Kjell Orsborn 23-04-21
26UU - DIS - UDBL
Translating relationship . . . cont. . .
• Step 7: Multivalued relationship types– First try to remove multivalued relationships on the E-R model level by
model transformation.
– A separate table is created, with foreign keys to all tables that are included in the relationship. Its primary key is composed of all foreign keys.
R E2pk1pk1
aa
pk2pk2
N N
E1E1
pk3pk3
N
E3E3
f k1 f k2
R
f k3 a
Kjell Orsborn 23-04-21
27UU - DIS - UDBL
Translating relationship . . . cont. . .
• Step 7: Multivalued relationship types continued– In the case where R is 1-N-N, the primary key on R shall be set on the fk
for the table with cardinality 1.
R E2pk1pk1
aa
pk2pk2
1 N
E1E1
pk3pk3
N
E3E3
f k1 f k2
R
f k3 a
Kjell Orsborn 23-04-21
28UU - DIS - UDBL
Summary• Entity types and their attributes
– Step 1) Entity types• Each entity gets a corresponding table, with the primary key column set to
its key attribute.
– Step 2) Weak entity types• The primary key of a weak entity type table has the primary key of the
owner table as a component.
• Relationships– Step 3) 1-1 Relationship
• 4 alternatives: fk in E1 or E2, separate R table, common table for E1 & E2
– Step 4) 1-N Relationship• fk i entity on the N-side, separate R table
– Step 5) M-N Relationship• separate R table
Kjell Orsborn 23-04-21
29UU - DIS - UDBL
Summary cont. . .
• Multivalued attributes and relationships– Step 6) Multivalued attributes
• Separate table for the attribute with its pk composed of the owner pk and the value column.
– Step 7) Multivalued relationships• Separate R table. N-N-N: pk composed of all fk’s. 1-N-N: pk is fk to
the E1-table.
Kjell Orsborn 23-04-21
30UU - DIS - UDBL
Example E-R to relational model translation
Kjell Orsborn 23-04-21
32UU - DIS - UDBL
Short summary E-R -> RE-R concept Relational concept
entity type relation
1:1 relationship type include one of the primary keys as a foreignkey of the other ”entity relation”
1:N relationship type include the ”1-side” primary key as a foreignkey at the ”n-side”
M:N relationship type relation with two foreign keys
n-ary relationship type (degree > 2) relation with n foreign keys
simple attribute attribute
composite attribute simple attribute components
multivalued attribute relation anf foreign key
value set domain
key attribute primary (or secondary key)
Kjell Orsborn 23-04-21
33UU - DIS - UDBL
Introduction to Relational Algebra
Elmasri/Navathe ch 7
Lecture 2
Kjell Orsborn
Department of Information Science
Uppsala University, Uppsala, Sweden
Kjell Orsborn 23-04-21
34UU - DIS - UDBL
Query languages
• Languages where users can express what information to retrieve from the database.
• Categories of query languages:– Procedural– Non-procedural (declarative)
• Formal (“pure”) languages:– Relational algebra– Relational calculus
• Tuple-relational calculus• Domain-relational calculus
– Formal languages form underlying basis of query languages that people use.
Kjell Orsborn 23-04-21
35UU - DIS - UDBL
Relational algebra• Relational algebra is a procedural langaue
• Operations in relational algebra takes two or more relations as arguments and return a new relation.
• Relational algebraic operations:– Operations from set theory:
• Union, Intersection, Difference, Cartesian product
– Operations specifically introduced for the relational data model:• Select, Project, Join
• It have been shown that the select, project, union, difference, and cartesian product operations form a complete set. That is any other relational algebra operation can be expressed in these.
Kjell Orsborn 23-04-21
36UU - DIS - UDBL
Operations from set theory
• Relations are required to be union compatible to be able to take part in the union, intersection and difference operations.
• Two relations R1 and R2 is said to be union-compatible if:
R1 D1xD2x...xDn andR2 D1xD2x...xDn
i.e. if they have the same degree and the same domains.
Kjell Orsborn 23-04-21
37UU - DIS - UDBL
Union operation
• The union of two union-compatible relations R and S is the set of all tuples that either occur in R, S, or in both.
• Notation: R S
• Defined as: R S = {t | t R or t S}
• For example:
R SA B
abb
121
A B
ab
23
= A B
aabb
1213
Kjell Orsborn 23-04-21
38UU - DIS - UDBL
Difference operation• The difference between two union-compatible sets R and S is
the set of all tuples that occur in R but not in S.
• Notation: R S• Defined as: R S = {t | t R and t S}
• For example:
R SA B
abb
121
A B
ab
23
= A B
ab
11
Kjell Orsborn 23-04-21
39UU - DIS - UDBL
Intersection• The intersection of two union-compatible sets R and S, is the
set of all tuples that occur in both R and S.
• Notation: R S
• Defined as: R S = {t | t R and t S}
• For example:
R SA B
aab
121
A B
ab
23
= A B
a 2
Kjell Orsborn 23-04-21
40UU - DIS - UDBL
Cartesian product• Let R and S be relations with k1 and k2 arities resp. The
cartesian product of R and S is the set of all possible k1+k2 tuples where the first k1 components constitute a tuple in R and the last k2 components a tuple in S.
• Notation: R S• Defined as: R S = {t q | t R and q S}• Assume that attributes of r( R) and s( S) are disjoint. (i.e. R S
= ). If attributes of r( R) and s( S) are not disjoint, then renaming must be used.
X =
Kjell Orsborn 23-04-21
41UU - DIS - UDBL
Cartesian product example
A B
ab
12
C D
abbc
5565
= A B
aaaabbbb
11112222
C D
abbcabbc
55655565
Kjell Orsborn 23-04-21
42UU - DIS - UDBL
Selection operation
• The selection operator, , selects a specific set of tuples from a relation according to a selection condition P.
• Notation: p( R)
• Defined as: p( R) = {t | t R and P( t) }
• Where P is a logical expression built up by:– attribute names
– aritmetic operators: =, <, >, ≤, ≥, and
– logical operators (connectors): (and), (or), ¬ (not).
• V(R) is the set of tuples in R that fulfill the condition P. Example: SALARY>30000(EMPLOYEE)
Kjell Orsborn 23-04-21
43UU - DIS - UDBL
Selection example
= A B
aabb
abbb
C D
1524
7739
R
A=B D > 5 (R) = A B
ab
ab
C D
14
79
Kjell Orsborn 23-04-21
44UU - DIS - UDBL
Projection operation
• The projection operator, , picks out (or projects) listed columns from a relation and creates a new relation consisting of these columns.
• Notation: A1,A2,...,Ak (R)
where A1, A2 are attribute names and R is a relation name.
• The result is a new relation of k columns.
• Duplicate rows removed from result, since relations are sets.
Exemple: LNAME,FNAME,SALARY(EMPLOYEE)
Kjell Orsborn 23-04-21
45UU - DIS - UDBL
Projection example
= A B
aabb
1234
C
1112
R
A=C (R) = A C
aabb
1112
= A C
abb
112
Kjell Orsborn 23-04-21
46UU - DIS - UDBL
Join operator
• The join operator, (almost), creates a new relation by joining related tuples from two relations.
• Notation: R C SC is the join condition which has the form Ar As , where is one of {=, <, >, ≤, ≥, ≠}. Several terms can be connected as C1 C2 ...Ck.
• A join operation with this kind of general join condition is called “Theta join”.
Kjell Orsborn 23-04-21
47UU - DIS - UDBL
Example Theta join
A<D = A B
11169
22277
C B
33388
22277
A B
169
277
C
388
B C
277
338
D
459
R A<D S
C
33388
D
45999
R S
Kjell Orsborn 23-04-21
48UU - DIS - UDBL
Equijoin
• The same as join but it is required that attribute Ar and attribute As should have the same value.
• Notation: R C SC is the join condition which has the form Ar As. Several terms can be connected as C1 C2 ...Ck.
Kjell Orsborn 23-04-21
49UU - DIS - UDBL
Example Equijoin
B=C = A B
aa
24
C D
24
dd
A B
aa
24
C D
249
ddd
E
eee
R B=C S
E
ee
R S
Kjell Orsborn 23-04-21
50UU - DIS - UDBL
Natural join
• Natural join is equivalent with the application of join to R and S with the equality condition Ar = As (i.e. an equijoin) and then removing the redundant column As in the result.
• Notation: R *Ar,As SAr,As are attribute pairs that should fulfil the join condi-tion which has the form Ar As. Several terms can be connected as C1 C2 ...Ck.
Kjell Orsborn 23-04-21
51UU - DIS - UDBL
Example Natural join
B=C = A B
aa
24
D
dd
A B
aa
24
C D
249
ddd
E
eee
R B=C S
E
ee
R S
Kjell Orsborn 23-04-21
52UU - DIS - UDBL
Semijoin
• The Semijoin operation is an equijoin betweeen R and S followed by a projection of R:s attribute.
• Notation: R semi S = R (R S)
• That is, the answer of the query: “Which tuples in R are joinable with S?”
Kjell Orsborn 23-04-21
53UU - DIS - UDBL
Example Semijoin
semi = A B
143
221
C
334
A B
1423
2221
C
3364
B C
221
334
D
452
R semiSR S
Kjell Orsborn 23-04-21
54UU - DIS - UDBL
Composition of operations
• Expressions can be built by composing multiple operations
• Example: A=C (R S)
A=C (R S) =
A B
ab
12
C D
abbc
5565
= A B
aaaabbbb
11112222
C D
abbcabbc
55655565
A B
abb
122
C D
abb
556
R S =
Kjell Orsborn 23-04-21
55UU - DIS - UDBL
Additional relational operations
• Assignment and Rename
• Division
• Outer join and outer union
• Aggregate functions (presented together with SQL)
• Update operations (presented together with SQL)– (not part of pure query language)
Kjell Orsborn 23-04-21
56UU - DIS - UDBL
Assignment operation
• The assignment operation () makes it possible to assign the result of an expression to a temporary relation variable.
• Example:
• temp dno = 5(EMPLOYEE)
• result = ∏fname,lname,salary (temp)
• The result to the right of the is assigned to the relation variable on the left of the .
• The variable may use variable in subsequent expressions.
Kjell Orsborn 23-04-21
57UU - DIS - UDBL
Renaming relations and attribute
• The assignment operation can also be used to rename relations and attributes.
• Example: NEWEMP dno = 5(EMPLOYEE)R(FIRSTNAME,LASTNAME,SALARY) =
∏fname,lname,salary (NEWEMP)
Kjell Orsborn 23-04-21
58UU - DIS - UDBL
Division operation
• Suited to queries that include the phrase “for all”.
• Let R and S be relations on schemas R and S respectively, where R = ( A1,...,A m ,B1,...,Bn)
S = ( B1,...,Bn)
• The result of R ÷ S is a relation on schemaR - S =( A1 ,...,A m)
• R ÷ S = {t | t R-S (R) u S tu R}
Kjell Orsborn 23-04-21
59UU - DIS - UDBL
Example Division operation
÷ = A
ae
A B
aaabcddddee
12311134612
B
12
R ÷SR S
Kjell Orsborn 23-04-21
60UU - DIS - UDBL
Outer join/union operation
• An extension of the avoids loss of information.
• Computes the join/union and then adds tuples from one relation that do not match tuples in the other relation to the result of the join.
• Fills out with null values:– null signifies that the value is unknown or does not exist.
– All comparisons involving null are false by definition.
Kjell Orsborn 23-04-21
61UU - DIS - UDBL
Example Outer join
• Relation loan
• Relation borrower
branch-name
DowntownRedwoodPerryridge
loan-number
1-170L-230L-260
amount
300040001700
customer-name
JonesSmithHayes
loan-number
1-170L-230L-155
Kjell Orsborn 23-04-21
62UU - DIS - UDBL
Example Outer join cont...
• loan * borrower (natural join)
• loan left borrower (left outer join)
branch-name
DowntownRedwood
loan-number
1-170L-230
amount
30004000
customer-name
JonesSmithnull
loan-number
1-170L-230null
branch-name
DowntownRedwoodPerryridge
loan-number
1-170L-230L-260
amount
300040001700
customer-name
JonesSmith
Kjell Orsborn 23-04-21
63UU - DIS - UDBL
Example Outer join cont...
• loan right borrower (right outer join)
• loan full borrower (full outer join)customer-name
JonesSmithnull
Hayes
branch-name
DowntownRedwoodPerryridge
null
loan-number
L-170L-230L-260L-155
amount
300040001700null
customer-name
JonesSmithHayes
branch-name
DowntownRedwood
null
loan-number
L-170L-230L-155
amount
30004000null
Kjell Orsborn 23-04-21
64UU - DIS - UDBL
Aggregation operations
• Presented together with SQL later
• Examples of aggregation operations– avg
– min
– max
– sum
– count
Kjell Orsborn 23-04-21
65UU - DIS - UDBL
Update operations
• Presented together with SQL later
• Operations for database updates are normally part of the DML– insert (of new tuples)
– update (of attribute values)
– delete (of tuples)
• Can be expressed by means of the assignment operator
Kjell Orsborn 23-04-21
68UU - DIS - UDBL
Introduction to Relational Calculus
Elmasri/Navathe ch 9
Lecture 2
Kjell Orsborn
Department of Information Science
Uppsala University, Uppsala, Sweden
Kjell Orsborn 23-04-21
69UU - DIS - UDBL
Relation calculus?
• Relation calculus is, in similarity with relation algebra, a formal query language for the relational data model.
• Relation calculus is based on one branch of mathematical logic that is called predicate logic.
• Relation calculus is divided into two subcategories:– Tuple calculus (e.g. QUEL and SQL is related with tuple calculus)
– Domain calculus (e.g. QBE is related with domain calculus)
• Relation calculus is declarative (or non-procedural) in contrast to relation algebra which is a procedural langauge.
Kjell Orsborn 23-04-21
70UU - DIS - UDBL
Comparison of relation calculus and relation algebra
• Relation calculus and relation algebra have an identical basic level of expressability.
• A relational query language with a corresponding expressability is called relational complete.
• Relation algebra is oriented towards relations but relation calculus is oriented towards tuples (tuple calculus) or domain values of the attributes (domain calculus).
Kjell Orsborn 23-04-21
71UU - DIS - UDBL
Tuple relational calculus
• A nonprocedural query language, where each query is of the form:{t | P (t) }
• It is the set of all tuples t such that predicate P is true for t.• t is a tuple variable; t[ A] denotes the value of tuple t on
attribute A.• t r denotes that tuple t is in relation r.• P is a formula similar to that of the predicate calculus.
Kjell Orsborn 23-04-21
72UU - DIS - UDBL
Predicate calculus formula
• 1. Set of attributes and constants
• 2. Set of comparison operators: (e.g., <,≤, =, ≠, >, ≥)
• 3. Set of connectives: and (), or (), not (¬)
• 4. Logical implication (): x y, if x is true, then y is true (x y x y).
• 5. Set of quantifiers:– t r (Q( t)) “there exists” a tuple t in relation r such that
predicate Q( t)is true
– t r (Q( t)) Q is true “for all” tuples t in relation r
Kjell Orsborn 23-04-21
73UU - DIS - UDBL
Banking example
• branch (branch-name,branch-city,assets)
• customer (customer-name,customer-street,customer-city)
• account (branch-name,account-number,balance)
• loan (branch-name,loan-number,amount)
• depositor (customer-name,account-number)
• borrower (customer-name,loan-number)
Kjell Orsborn 23-04-21
74UU - DIS - UDBL
Example queries of t. c.(Elmasri/Navathe syntax)
• Find the branch-name, loan-number, and amount for loans of over $1200.{t | loan(t) t.amount 1200 }
• Find the loan number for each loan of an amount greater than $1200.{t.loan-number | loan(t) t.amount > 1200) }
Kjell Orsborn 23-04-21
75UU - DIS - UDBL
Example queries of t. c.(Elmasri/Navathe syntax)
• Find the names of all customers who have a loan and an account at the bank.{t.customer-name | borrower(t) ((d) depositor(d) t.customer-name = d.customer-name)}
• Find the names of all customers having a loan, an account, or both at the bank.{t.customer-name | ((b) borrower(b) t.customer-name=b.customer-name) ((d) depositor(d) t.customer-name=d.customer-name)}
Kjell Orsborn 23-04-21
76UU - DIS - UDBL
Example queries of t. c.(Elmasri/Navathe syntax ver.2)
• Find the names of all customers who have a loan and an account at the bank.{ t.customer-name | customer(t) (((b) borrower(b) t.customer-name = b.customer-name) ((d) depositor(d) t.customer-name = d.customer-name))}
• Find the names of all customers having a loan, an account, or both at the bank.{ t.customer-name | customer(t) (((b) borrower(b) t.customer-name = b.customer-name) ((d) depositor(d) t.customer-name = d.customer-name))}
Kjell Orsborn 23-04-21
77UU - DIS - UDBL
Example queries of t. c.(Elmasri/Navathe syntax)
• Find the names of all customers who have an account at all branches located in Brooklyn.
{t | customer(t) (b) (branch(b) b.branch-city = “Brooklyn”) (a) (account(a) a.branch-name = b.branch-name) (d) (depositor(d) d.customer-name = t.customer-name
d.account-number = a.account-number)}
Kjell Orsborn 23-04-21
83UU - DIS - UDBL
Safety of expressions
• It is possible to write tuple calculus expressions that generate infinite relations.
• For example, {t | t r} results in an infinite relation if the domain of any attribute of relation r is infinite.
• As for instance {t | loan(t)) }
• To guard against the problem, we restrict the set of allowable expressions to safe expressions.
• An expression {t | P(t) } in the tuple relational calculus is safe if every component of t appears in one of the relations, tuples, or constants that appear in P.
Kjell Orsborn 23-04-21
84UU - DIS - UDBL
Domain relational calculus
• A nonprocedural query language equivalent in power to the tuple relational calculus.
• Each query is an expression of the form:– { <x1, x2, ..., xn> | P(x1, x2, ..., xn) }
– x1, x2, ..., xn represent domain variables
– P represents a formula similar to that of the predicate calculus
Kjell Orsborn 23-04-21
85UU - DIS - UDBL
Example queries of d. c.(Elmasri/Navathe syntax)
• Find the branch-name, loan-number, and amount for loans of over $1200.{ b,l,a | loan(b,l,a) a > 1200 }
• Find the names of all customers who have a loan of over $1200.{ c | (b) (l) (a) (borrower(c,l) loan(b,l,a) a > 1200)}
• Find the names of all customers who have a loan from the Perryridge branch and the loan amount.{ c,a | (l) (borrower(c,l) (b) (loan(b,l,a)
b = “Perryridge”))}
Kjell Orsborn 23-04-21
88UU - DIS - UDBL
Safety of expressions
• { <x1, x2, ..., xn> | P(x1, x2, ..., xn) } is safe if all of the following hold:
• 1. All values that appear in tuples of the expression are values from dom(P) (that is, the values appear either in P or in a tuple of a relation mentioned in P).
• 2. For every “there exists” subformula of the form x (P1( x)), the subformula is true if and only if there is a value x in dom(P1) such that P1(x) is true.
• 3. For every “for all” subformula of the form x (P1( x)), the subformula is true if and only if P1( x) is true for all values x from dom(P1).
Kjell Orsborn 23-04-21
89UU - DIS - UDBL
The QBE query language(Query-By-Example)
• QBE is a query language based on domain calculus.
• Develped by IBM’s research center at Yorktown Heights for DB2 and others.
• Interactive and graphically oriented:– The user asks for a set of templates
– The system “shows” the templates on screen.
– The user fills in the templates with information that retrieved tuples should match and marks the attributes that should be showed in the result.
– The system “fills” the templates with the information that was sought.
Kjell Orsborn 23-04-21
90UU - DIS - UDBL
Templates - example
ORDERS O# DATE CUST
INCLUDES O# ITEM QUANTITY
SUPPLIES NAME ITEM PRICE
Kjell Orsborn 23-04-21
91UU - DIS - UDBL
A query example
ORDERS O# DATE CUST
INCLUDES O# ITEM QUANTITY
SUPPLIES NAME ITEM PRICE
_x
_x
“Print the name of suppliers that supply items to Steve Stone”
Steve Stone
_item
_itemP.
Kjell Orsborn 23-04-21
92UU - DIS - UDBL
Another query example
ORDERS O# DATE CUST
CUSTOMERS NAME CADDR BALANCE
_x
_cust
“Print customer name, address, orderno and date for all orders”
_cust
_address
P._xP._cust
_date
P._dateP._address
_x_cust _date_addressP.
Alt. 1
Alt. 2
Kjell Orsborn 23-04-21
93UU - DIS - UDBL
How does QBE work• The system creates a tupel variable for each row in every
template.– E.g., in the previous example two tupel variables t1 (for ORDERS) and
t2 (for CUSTOMERS) are created.
• For k nos of variables k nested loops are created where each loop iterates over tuples in respective relation. When all tupel variables have gotten their values, it is checked if the domain variables can be assigned consistent values.– E.g. each time t1[CUST] = t2[NAME] are fulfilled we have a possible
value for _cust.
– Each time the conditions are fulfilled the commands that have been selected are executed (e.g. P.).
Kjell Orsborn 23-04-21
94UU - DIS - UDBL
QBE - operations
– P. Print
– I. Insert
– U. Update
– D. Delete
– UN. Unique (to get unique values)
– ALL. Applied to all
– SUM. Sum
– AVG. Average
– MAX. Maximal value
– MIN. Minimal value
– CNT Count
Kjell Orsborn 23-04-21
95UU - DIS - UDBL
QBE - aggregation example
• The aggregation commands include:– SUM., AVG., MAX., MIN., CNT.
“Print the average value of the BALANCE for all customers”
CUSTOMERS NAME CADDR BALANCE
P.AVG.ALL._x
Kjell Orsborn 23-04-21
96UU - DIS - UDBL
QBE - update example
• I., D., U.– When you get a new customer:
– When Walmart provide 20% discount on everything:
CUSTOMERS NAME CADDR BALANCE
100000I. Lisa Lazy Boston
SUPPLIES NAME ITEM PRICE
0.9*_xprisI. _item
_xprisWalmart _item
Walmart