2. relational model
DESCRIPTION
2. Relational Model. Attribute Types. Each attribute of a relation has a name The set of allowed values for each attribute is called the domain of the attribute Attribute values are (normally) required to be atomic ; that is, indivisible - PowerPoint PPT PresentationTRANSCRIPT
2.
Relational Model
Attribute Types
• Each attribute of a relation has a name
• The set of allowed values for each attribute is called the domain of the attribute
• Attribute values are (normally) required to be atomic; that is, indivisible
– E.g. the value of an attribute can be an account number, but cannot be a set of account numbers
• Domain is said to be atomic if all its members are atomic
• The special value null is a member of every domain
• The null value causes complications in the definition of many operations
Relation Schema
• Formally, given domains D1, D2, …. Dn a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai Di
• Schema of a relation consists of
– attribute definitions
• name
• type/domain
– integrity constraints
• The database schema , it is a logical design of the database, and the database instance, which is a snapshot of the database at a given instant in time.
• Lower case is written for relation and upper case is written for relation schema.
• We use Account_schema to denote the relation schema for relation account.
• Account_schema=(account_number,branch_name,balance)
• We denote the face that account is a relation on Account_schema
• account(Account_schema)
Database schema
Relation Instance
• The current values (relation instance) of a relation are specified by a table
• An element t of r is a tuple, represented by a row in a table
• Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
JonesSmithCurry
Lindsay
customer_name
MainNorthNorthPark
customer_street
HarrisonRyeRye
Pittsfield
customer_city
customer
attributes(or columns)
tuples(or rows)
• Consider the branch relation and the schema of that relation is:
• Branch_schema(Branch_name, Branch_city,assests)
Example of the relation instance
Branch_name Branch_city assests
Nehrunagar Ahmedabad 1300000
Satellite Ahmedabad 2334444
Sector-12 G’nagar 32222211
Kalawad Rajkot 122000
Sector-25 G’nagar 100000
Why Split Information Across Relations?
• Storing all information as a single relation such as bank(account_number, balance, customer_name, ..)results in
– repetition of information
• e.g.,if two customers own an account (What gets repeated?)
– the need for null values
• e.g., to represent a customer without an account
• Normalization theory (Chapter) deals with how to design relational schemas
Keys
• Let K R
• K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
– by “possible r ” we mean a relation r that could exist in the enterprise we are modeling.
– Example: {customer_name, customer_street} and
{customer_name}
are both superkeys of Customer, if no two customers can possibly
have the same name
• In real life, an attribute such as customer_id would be used
instead of customer_name to uniquely identify customers, but
we omit it to keep our examples small, and instead assume
customer names are unique.
Keys (Cont.)
• K is a candidate key if K is minimal
Example: {customer_name} is a candidate key for Customer,
since it is a superkey and no subset of it is a superkey.
• Primary key: a candidate key chosen as the principal means of
identifying tuples within a relation
– Should choose an attribute whose value never, or very rarely,
changes.
– E.g. email address is unique, but may change
• Super Keys : Super key stands for superset of a key.A Super Key is a set of one or more attributes that are taken collectively and can identify all other attributes uniquely. Candidate KeysCandidate Keys are super keys for which no proper subset is a super key. In other words candidate keys are minimal super keys.
• Primary Key:It is a candidate key that is chosen by the database designer to identify entities with in an entity set. Primary key is the minimal super keys. In the ER diagram primary key is represented by underlining the primary key attribute. Ideally a primary key is composed of only a single attribute. But it is possible to have a primary key composed of more than one attribute. Composite KeyComposite key consists of more than one attributes.
Example: Consider a Relation or Table R1. Let A,B,C,D,E are the attributes of this relation. R(A,B,C,D,E)A→BCDE This means the attribute 'A' uniquely determines the other attributes B,C,D,E.BC→ADE This means the attributes 'BC' jointly determines all the other attributes A,D,E in the relation. Primary Key :ACandidate Keys :A, BCSuper Keys : A,BC,ABC,AD ABC,AD are not Candidate Keys since both are not minimal super keys.
Schema diagram
Query Languages
• Language in which user requests information from the database.
• Categories of languages
– Procedural
– Non-procedural, or declarative
• “Pure” languages:
– Relational algebra
– Tuple relational calculus
– Domain relational calculus
• Pure languages form underlying basis of query languages that people use.
Many to many relationships• In general, need a new table
Enrolled(cid, grade, studid)
Studid is foreign key that references sid in Student table
cid grade studid
Carnatic101 C 53831
Reggae203 B 53832
Topology112 A 53650
History 105 B 53666
sid name login
50000 Dave dave@cs
53666 Jones jones@cs
53688 Smith smith@ee
53650 Smith smith@math
53831 Madayan madayan@music
53832 Guldu guldu@music
EnrolledStudent
Foreignkey
Relational Algebra
• Collection of operators for specifying queries• Query describes step-by-step procedure for
computing answer (i.e., operational)• Each operator accepts one or two relations as
input and returns a relation as output• Relational algebra expression composed of
multiple operators
Relational Algebra Summary
• Algebras are useful to manipulate data types (relations in this case)
• Set-oriented• Brings some clarity to what needs to be done• Opportunities for optimization
– May have different expressions that do same thing
• We will see examples of algebras for other types of data in this course
Select,project & rename are unary operations as they
operate on 1 relation.
• Basic operations:– Selection ( ) return rows that meet some condition.– Projection ( ) selects columns from relation.– Cross-product ( ) Allows us to combine two relations.– Set-difference ( ) Tuples in reln. 1, but not in reln. 2.– Union ( ) Tuples in reln. 1 and in reln. 2.
• Additional operations:
– Set Intersection, natural join, division, assignment
Relational Algebra
• The select,project,and rename are called unary operations because they operate on one relation.
Loan_number Branch_name amount
L-11 Round-hill 900
L-14 Downtown 1500
L-15 Perryridge 1300
L-16 Downtown 1000
L-17 Redwood 2000
L-93 Mianus 500
L-23 Perryridge 1300
select operation (select operation ())
Select operation select the tuples from the table with satisfy given condition or predicate
select * from select * from customercustomer where where idid=1234;=1234;
select <attributes> from <table name> where <condition>;select <attributes> from <table name> where <condition>;
conditioncondition(Relation)(Relation)
sid name login age gpa
50000 Dave dave@cs 19 3.3
53666 Jones jones@cs 18 3.4
53688 Smith smith@ee 18 3.2
53650 Smith smith@math 19 3.8
53831 Madayan madayan@music 11 1.8
53832 Guldu guldu@music 12 2.0
Selection
Select students with gpa higher than 3.3 from S1:
σgpa>3.3(S1)
S1
sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
sid name gpa
53666 Jones 3.4
53650 Smith 3.8
The select operation selects tuples that satisfy given predicate.It is denoted by (σ).
• Thus , to select those tuples of the loan relation where the branch is”Perryridge”we write,
• σ branch_name=“Perryridge”(loan)
The select operation
Loan_number Branch_name amount
L-23 Perryridge 1300
L-15 Perryridge 1300
• We can find all tuples in which the amount lent is more than $1200 by writing
• σ amount>1200(loan)• In general we allow comparisions using
<,<=,>,>= in the selection predicate,further we can combine several predicates by using the connections and(^ ),or(v) and not(-)
• σ branch_name=“perryridge” ^amount>1200(loan)
project operation (project operation ())
Selects the particular attributes or columns from the table
attributesattributes (Relation) (Relation)
select select id, nameid, name from from customercustomer;;
select <attributes> from <table name>;select <attributes> from <table name>;
Projection
Project name and gpa of all students in S1:
name, gpa(S1)
S1
Sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
name gpa
Dave 3.3
Jones 3.4
Smith 3.2
Smith 3.8
Madayan 1.8
Guldu 2.0
• Suppose we want to list out all loan numbers and the amount of the loans,but donot care about the branch name. the project operations allows us to produce this relation. Since a relation is a set , any duplicate rows are eliminated.
• Projection is denoted by the upper case π
Projection operation
Loan_number
L-11
L-14
L-15
L-16
L-17
L-93
L-23
Write a query to list all loan numbers and the amount of loan
asπ loan_number,amount(loan)
amount
900
1500
1300
1000
2000
500
1300
Composition of Relational OperationsComposition of Relational Operations
Composition provide the information of the table with specified columns with satisfy conditions or predictions
id,nameid,name ( (account>1000account>1000 (customer)) (customer))
select <attributes> from <table name> where <predictions>;select <attributes> from <table name> where <predictions>;
select id,name from customer where account>1000;
Combine Selection and Projection
• Project name and gpa of students in S1 with gpa higher than 3.3:
name,gpa(σgpa>3.3(S1))
Sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
name gpa
Jones 3.4
Smith 3.8
• “Find those customers who live in ahmedabad”• Πcustomer_name (σ
customer_city=“ahmedabad”(customer))
Composition of relational operations
Set Operations• These are binary operations; that is, each is applied to
two sets.• Binary operations form mathematical set theory:
UNION: R1 U R2,INTERSECTION: R1 n R2,SET DIFFERENCE: R1 - R2,CARTESIAN PRODUCT: R1 X R2.
Rename:
• When these operations are adapted to relational databases, the two relations on which any of the above three operations are applied must have the same type of tuples; this condition is called Union Compatible if they have the same degree n.
• This means that the two relations have the same number of attributes and that each pair of corresponding attributes have the same domain.
UNION:• The result of this operation, denoted by R S is a
relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION:
• The result of this operation, denoted by R n S, is a relation that includes all tuples that are in both R and S.
SET DIFFERENCE:• The result of this operation denoted by R – S, is a
relation that includes all tuples that are in R but not in S.CARTESIAN PRODUCT:• It is also known as CROSS PRODUCT or CROSS JOIN
denoted by X, which is also a binary set operation, but the relations on which it is applied do not have to be union compatible. This operation is used to combine tuples from two relations in a combinatorial fashion.
Union Operation (Union Operation ())
Union of the two entity set
id,nameid,name (borrower) (borrower) id,nameid,name (depositor) (depositor)
(select (select id,nameid,name from from borrowerborrower))unionunion
(select (select id,nameid,name from from depositordepositor););
Conditions: The relations r and s must be of the same arity The domains of the ith attribute of r and the ith attribute of s must be
the same for all i
Option: union all
• Relations r, s:
Union Operation – Example
r s:
A B
1
2
1
A B
2
3
rs
A B
1
2
1
3
Set Intersection Operation (Set Intersection Operation ())
id,nameid,name (borrower) (borrower) id,nameid,name (depositor) (depositor)
(select (select id,nameid,name from from borrowerborrower))intersectionintersection
(select (select id,nameid,name from from depositordepositor););
Conditions: The relations r and s must be of the same arity The domains of the ith attribute of r and the ith attribute of s must be
the same for all i
Option: intersection all
Intersection of the two entity set
Example: Intersection
sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
sid name gpa
53666 Jones 3.4
53688 Smith 3.2
53700 Tom 3.5
53777 Jerry 2.8
53832 Guldu 2.0
S1 S2
S1 S2 =
sid name gpa
53666 Jones 3.4
53688 Smith 3.2
53832 Guldu 2.0
Intersect Operation – Example
• Relations r, s:
r n s:
A B
1
2
1
A B
2
3
rs
A B 2
Set Difference Operation (-) Set Difference Operation (-)
id,nameid,name (borrower) – (borrower) – id,nameid,name (depositor) (depositor)
(select (select id,nameid,name from from borrowerborrower))exceptexcept
(select (select id,nameid,name from from depositordepositor););
Conditions: The relations r and s must be of the same arity The domains of the ith attribute of r and the ith attribute of s must be
the same for all i
Option: except all
Give tuples which are in relation r but not in s
• Relations r, s:
Set Difference Operation – Example
r – s:
A B
1
2
1
A B
2
3
rs
A B
1
1
• It is denoted by -, allows us to find tuples that are in one relation but not in another.
• We can find the customer who have an account but not a loan by:
• Πcustomer_name (depositor) -Πcustomer_name (borrower)
The set –difference operation
Customer_name
Johnson
Lindsay
turner
balancebalance (account) – (account) –
account,balanceaccount,balance ( (account.balance < d.balanceaccount.balance < d.balance (account x (account x dd(account)))(account)))
Rename Operation (Rename Operation () ->Rename either the ) ->Rename either the relation or attributes or bothrelation or attributes or both
x (E) Returns the result of expression E under the name ‘x’
x(A1,A2,…,An) (E) With the attributes rename to A1, A2, …, An
e.g. find largest balance in account entity set
(select balance from account) except(select balance from account) except(select account,balance from account,account (select account,balance from account,account asas d dwhere account.balance < d.balance)where account.balance < d.balance)
Formal Definition of Relational Algebra
For E1 and E2 Relational-Algebra Expressions:
EE11 E E22 Union OperationUnion Operation
EE11 – E – E22 Set Intersection OperationSet Intersection Operation
EE11 E E22 Set Difference OperationSet Difference Operation
EE11 E E22 Cartesian ProductCartesian Product
PP (E (E11)) Select: P is a predicateSelect: P is a predicate
SS (E (E11)) Project: S is set of attributesProject: S is set of attributes
xx (E (E11)) Rename: x is a new name for result of ERename: x is a new name for result of E11
Cartesian-Product Operation (Cartesian-Product Operation ())
Allow us to combine information from any two relations
Cartesian product of relations r1 & r2 is r1 r2
customer customer loan loan
(customer.id, customer.name, customer.street, customer.city, loan.number, loan.amount)
If relations r1 has n rows and r2 has m rows thanr1 x r2 is size of n x m.
select select id,name,number,amount id,name,number,amount from from customer,loancustomer,loan
Cartesian - Product Operation – Example
Relations r, s:
r x s:
A B
1
2
A B
11112222
C D
1010201010102010
E
aabbaabb
C D
10102010
Eaabb
r
s
• It is denoted by a cross(× ) allows us to combine information from any two relations. We write as r1×r2.
• For example relation schema for r=borrower×loan is
• (borrower.customer_name, borrower.loan_num, loan. loan_number, loan.branch_name, loan.amount)
• We shall drop the relation-name prefix.This simplification doesnot lead to any ambiguity.
• We can write relation schema for r as (customer_name, borrower.loan_num,loan.loan_number,branch_name,amount)
Cartesian product operation
The loan relation
The borrower relation
Borrower * loan
• Suppose we want to find the names of all customers who have a loan at perryridge branch.
• σ branch_name=“perryridge”(borrower×loan))
This is the table of Query: σ branch_name=“Perryridge” (borrower ×loan)
`• Since cartesian product operates every tuple of loan with
every tuple of borrower …
• σ borrower.loan_num=loan.loan_number
(σ branch_name=“perryridge”(borrower×loan))
• Now if I want to find the names of customers,then we do projection:
• ∏customer_name
(σ borrower.loan_num=loan.loan_number
(σ branch_name=“perryridge”(borrower×loan)))
Customer_name
Adams
Hayes
• Emp( eno, fname, lname, gender, dob, address, salary, supno,dno)
• Dependent(eeno,depname,gender,bdate,relationship)
• Retrive names of each female employee’s dependent
• Emp( eno, fname, lname, gender, dob, address, salary, supno,dno)
• Dependent(eeno,depname,dob,relationship)
Retrive names of each female employee’s dependent
• Π fname,depname(σeno=eeno (σ gender = ‘f’ (emp*dependent)))
Assignment operations
• Its denoted by <-.its r-s• For eg:- temp1 <- ∏R-S(r1)
• temp2<- ∏R-S(r2)
• result-=temp1-temp2
• The result of the expression to the right of the <- is assigned to the relation variable on the left of the <-
• Temp1 Πeno,fname (σ gender=‘f’(emp))• Temp2 temp1 * dependent• Temp3 Π fname,depname(σeno=eeno(temp2))
OR• Π fname,depname(σeno=eeno (σ gender = ‘f’
(emp*dependent)))
select select name,number,amountname,number,amount from borrower from borrower natural joinnatural join loan; loan;
name,number,amount (borrower loan)name,number,amount (borrower loan)
(INNER join) Join Operation ( )(INNER join) Join Operation ( )
select <attributes> from <table name>;select <attributes> from <table name>;
Takes the Cartesian product of an Entity Sets and apply the selection forcing equality on those attributes that appear in both relation schemas and finally removes duplicate attributes
If r(R) and s(S) be relations without any attributes in common,That is R S = .Then r s = r x s
Joins• Combine information from two or more tables• Example: students enrolled in courses:
S1 S1.sid=E.studidE
Sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
cid grade studid
Carnatic101 C 53831
Reggae203 B 53832
Topology112 A 53650
History 105 B 53666
S1E
Joins
sid name gpa cid grade studid
53666 Jones 3.4 History105 B 53666
53650 Smith 3.8 Topology112 A 53650
53831 Madayan 1.8 Carnatic101 C 53831
53832 Guldu 2.0 Reggae203 B 53832
Sid name gpa
50000 Dave 3.3
53666 Jones 3.4
53688 Smith 3.2
53650 Smith 3.8
53831 Madayan 1.8
53832 Guldu 2.0
cid grade studid
Carnatic101 C 53831
Reggae203 B 53832
Topology112 A 53650
History 105 B 53666
S1E
ExtendedRelational Algebra Operation
Generalisation projection
∏customer_name,limit - credit_bal(credit_info)
Table: credit info
∏customer_name(limit - credit_bal)as credit_available(credit_info)
Cust_name Limit Credit_bal
Curry 2000 1750
Hayes 1500 1500
Jones 6000 700
Smith 2000 400
Aggregate FunctionsAggregate Functions
Functions take a collection of values and return a single value as a result
sum Returns sum of the valuesavg Returns the average of the valuescount Returns the number of elements in the collectionmin Returns minimum values in the collectionmax Returns maximum values in the collection
Multisets:Multisets:The collections on which aggregate functions operate can have multiple occurrences of a value; the order in which the valued appear is not relevant. Such collections are called multisetsmultisets
Aggregate Functions ExampleAggregate Functions Example
e.g. Take a set…valueset(num) = { 1, 1, 3, 5, 2, 5, 8 }
sum 25
avg 3.5714286
count 7
min 1
max 8
Gsum(num) valueset select sum(num) from valueset;
Gavg(num) valueset select avg(num) from valueset;
Gcount(num) valueset select count(num) from valueset;
Gmin(num) valueset select min(num) from valueset;
Gmax(num) valueset select max(num) from valueset;
Aggregate Operation – Example
• Relation r:A B
C
7
7
3
10
G sum(c) (r) sum(c )
27
Aggregate Operation – Example
• Relation (pt_works)
*G sum(balance) (pt_works)
* Branch_nameg sum(salary) (pt_works)
employee_name Branch_name balanceAdamsBrownGopal
JohnsonLorrena
PetersonRaosato
PerryridgePerryridgePerryridgedowntownDowntowndowntown
austinaustin
15001300530015001300250015001600
branch_name sum(balance)Austin
Downtownperryridge
310053008100
* Branch_nameG sum(salary) (pt_works)
Aggregate Functions (Cont.)
• Suppose we want to find the maximum salary for part-time emp at each branch
branch_name G sum(balance) , G max(salary)(pt_works)
Branch_name Sum_salary Max_salary
Austin 3100 1600
Downtown 5300 2500
Perryridge 8100 5300
Joins
• An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.
• The most common type of join is: SQL INNER JOIN (simple join). An SQL INNER JOIN return all rows from multiple tables where the join condition is met.
• Also called as Natural Join
Inner Join
return all of the records in the left table (table A) thathave a matching record in the right table (table B).
SELECT <select_list> FROM Table_A A INNER JOIN Table_B B ON A.Key = B.Key
Example – Inner Join
• SELECT orders.date, customers.name, products.name FROM orders, customers, products WHERE orders.customer = customers.id AND orders.product = products.id;
Selecting without Where Clause
After WHERE Clause
Outer Join
• This Join can also be referred to as a OUTER JOIN or a FULL JOIN.
• This query will return all of the records from both tables, joining records from the left table (table A) that match records from the right table (table B).
• This Join is written as follows:
• Outer join operation is an extension of the join operation to deal with the missing information.
Outer join
Emp_name Street City
Coyote Toon Hollywood
Rabbit Tunnel Carrotville
Smith Revovler Death valley
Williams Seaview seattle
Emp_name Branch_name salary
Coyote Mesa 1500
Rabbit Mesa 1300
gates Redmond 5300
Williams Redmond 1500
Table:EMP
Table:ft_works
Emp_name Street City
Coyote Toon Hollywood
Rabbit Tunnel Carrotville
Williams Seaview seattle
The result of emp ft_works
Branch_name salary
Mesa 1500
Mesa 1300
Redmond 1500
SELECT <select_list> FROM Table_A AFULL OUTER JOIN
Table_B B ON A.Key = B.KeyWHERE A.Key IS NULL OR B.Key IS NULL
• There is loss of information of street and city information about smith. Since smith is absent from the ft_works relation and similary we have lost the information about Gates,
• We can use outer join to avoid loss of information, three forms of operations:
• 1) left outer join • 2) right outer join • 3) full outer join
Emp_name Street City Branch_name
salary
Coyote Toon Hollywood Mesa 1500
Rabbit Tunnel Carrotville Mesa 1300
Williams Seaview seattle Redmond 1500
Smith Revovler Death valley NULL NULL
LEFT OUTER JOIN
left outer join - takes all tuples in the left relation that didnot match with any tuple in the right relation, pads the Tuples with null values for all other attributes from the right relation and adds them to the result of the natural
join.
Emp_name Street City Branch_name
salary
Coyote Toon Hollywood Mesa 1500
Rabbit Tunnel Carrotville Mesa 1300
Williams Seaview seattle Redmond 1500
Gates Null Null Redmond 5300
Right outer join
Right outer join – it pads tuples from the right relation that did not match with any tuple from the left relation with nulls and adds them to the result of the natural join.
Emp_name Street City Branch_name
salary
Coyote Toon Hollywood Mesa 1500
Rabbit Tunnel Carrotville Mesa 1300
Williams Seaview seattle Redmond 1500
Gates NULL NULL Redmond 5300
Smith Revovler Death valley NULL NULL
FULL outer join
full outer join - pads tuples from the left relation that did not match with any tuple from the right relation, as well as tuples from the right relation that did not match with any tuple from the left relation and adds them to the resultant relation.
Full Outer Join
SELECT <select_list>FROM Table_A A
FULL OUTER JOIN Table_B BON A.Key = B.Key
Division Operation (Division Operation ())
customer customer custadd custadd
Let r(R) and s(S) be relations, and let S R,Then relation r s is defined on R-S.A tuple t is in r s if two conditions are satisfied:1. t is in R-S (r)2. For every tuple ts in s, there is tuple tr in r satisfying both
of the following:a. tr [S] = ts [S]b. tr [R – S] = t
where customer = {(id,name,street,city)}custadd = {(name,street,city)}
Division Operation
• Notation: • Suited to queries that include the phrase “for all”.
• Let r and s be relations on schemas R and S respectively where
– R = (A1, …, Am , B1, …, Bn )
– S = (B1, …, Bn)
The result of r s is a relation on schema
R – S = (A1, …, Am)
r s = { t | t R-S (r) u s ( tu r ) }
Where tu means the concatenation of
tuples t and u to produce a single tuple
r s
EXAMPLES OF DIVISION
DIVISIONInterpretation of the division operation A/B:
- Divide the attributes of A into 2 sets: A1 and A2.
- Divide the attributes of B into 2 sets: B2 and B3.
- Where the sets A2 and B2 have the same attributes.
- For each set of values in B2:
- Search in A2 for the sets of rows (having the same A1 values) whose A2 values (taken together) form a set which is the same as the set of B2’s.
- For all the set of rows in A which satisfy the above search, pick out their A1 values and put them in the answer.
Division Operation – Example
Relations r, s:
r s: A
B
1
2
A B
12311134612 r
s
Another Division Example
A B
aaaaaaaa
C D
aabababb
E
11113111
Relations r, s:
r s:
D
ab
E
11
A B
aa
C
r
s
Bank Example Queries
• R1----branch_name (branch_city = “Brooklyn” (branch))
• R2--customer_name, branch_name (depositor account)
• Ans R1 R2
Cust_name
Johnson
• Find all customers who have an account at all branches located in Brooklyn
city.
• R1----branch_name (branch_city = “Brooklyn” (branch))
• R2--customer_name, branch_name (depositor account)
•
• Ans R1 R2
Bank Example Queries
customer_name, branch_name (depositor account)
branch_name (branch_city = “Brooklyn” (branch))
Division Operation (Cont.)
• Property – Let q = r s– Then q is the largest relation satisfying q x s r
• Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S R
r s = R-S (r ) – R-S ( ( R-S (r ) x s ) – R-S,S(r ))
To see why
R-S,S (r) simply reorders attributes of r
R-S (R-S (r ) x s ) – R-S,S(r) ) gives those tuples t in
R-S (r ) such that for some tuple u s, tu r.
Exercise time
• It is useful in special kind of queries that include the phrase “for all”.
X Y
X1 Y1
X1 Y3
X1 Y2
X4 Y3
X4 Y5
X2 Y3
X3 Y4
X4 Y1
X4 Y4
X5 Y5
X5 Y1
Y
Y1
Y3
Y
Y1
Y5
Y
Y5
Y2
Y4
A B1 B2B3
XXX
AA B2
A B1 AA B3
Modification of the Database
• The content of the database may be modified using the following operations:
– Deletion
– Insertion
– Updating
• All these operations are expressed using the assignment operator.
Deletion
• A delete request is expressed similarly to a query, except instead of displaying tuples to the user, the selected tuples are removed from the database.
• Can delete only whole tuples; cannot delete values on only particular attributes
• A deletion is expressed in relational algebra by:
r r – E
where r is a relation and E is a relational algebra query.
Deletion Examples
• Delete all account records in the Perryridge branch.
Delete all accounts at branches located in Needham.
r1 branch_city = “Needham” (account branch )
r2 account_number, branch_name, balance (r1)
r3 customer_name, account_number (r2 depositor)
account account – r2
depositor depositor – r3
Delete all loan records with amount in the range of 0 to 50
loan loan – amount 0and amount 50 (loan)
account account – branch_name = “Perryridge” (account )
Delete all data of customer named “smith” from depositor
depositor depositor - σ cust_name=“smith”(depositor)
Delete all loans with amount range 0 to 50
Loan loan – σ amount >= 0 ^ amount <=50(loan)
Deletion…
Insertion
• To insert data into a relation, we either:
– specify a tuple to be inserted
– write a query whose result is a set of tuples to be inserted
• in relational algebra, an insertion is expressed by:
r r E
where r is a relation and E is a relational algebra expression.
• The insertion of a single tuple is expressed by letting E be a constant relation containing one tuple.
Insertion Examples
• Insert information in the database specifying that Smith has $1200 in account A-973 at the Perryridge branch.
Provide as a gift for all loan customers in the Perryridge branch, a $200 savings account. Let the loan number serve
as the account number for the new savings account.
account account {(“A-973”, “Perryridge”, 1200)}
depositor depositor {(“Smith”, “A-973”)}
r1 (branch_name = “Perryridge” (borrowe loan))
account account loan_number, branch_name, 200 (r1)
depositor depositor customer_name, loan_number (r1)
Updating
• A mechanism to change a value in a tuple without charging all values in the tuple
• Use the generalized projection operator to do this task
• Each Fi is either
– the I th attribute of r, if the I th attribute is not updated, or,
– if the attribute is to be updated Fi is an expression, involving only constants and the attributes of r, which gives the new value for the attribute
)(,,,, 21rr
lFFF
Update Examples
• Make interest payments by increasing all balances by 5 percent.
Pay all accounts with balances over $10,000 6 percent interest and pay all others 5 percent
account account_number, branch_name, balance * 1.06 ( BAL 10000 (account ))
account_number, branch_name, balance * 1.05 (BAL 10000 (account))
account account_number, branch_name, balance * 1.05 (account)
Null Values
• It is possible for tuples to have a null value, denoted by null, for
some of their attributes
• null signifies an unknown value or that a value does not exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values (as in SQL)
• For duplicate elimination and grouping, null is treated like any
other value, and two nulls are assumed to be the same (as in
SQL)
Null Values
• Comparisons with null values return the special truth value: unknown
– If false was used instead of unknown, then not (A < 5) would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
– OR: (unknown or true) = true, (unknown or false) = unknown (unknown or unknown) = unknown
– AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown
– NOT: (not unknown) = unknown
– In SQL “P is unknown” evaluates to true if predicate P evaluates to unknown
• Result of select predicate is treated as false if it evaluates to unknown