Lecture 9: Functional Dependencies and Normalization for Relational Databases
Post on 05-Jan-2016
Prepared by L. Nouf Almujally
Ref. Chapters 14-15
How to produce a good relation schema?
STEPS:
1. Start with a set of relations.
2. Define the functional dependencies for each relation to specify the PK.
3. Transform the relations to normal form.
Functional Dependencies
• Describes the relationship between attributes in a relation.
• If A and B are attributes of relation R, B is functionally dependent on A, denoted by A -> B, if each value of A is associated with exactly one value of B. A value of B may be associated with several values of A.
• In A -> B, A is the determinant and B is the dependent: B is functionally dependent on A.
Functional Dependencies
• X -> Y holds if, whenever two tuples have the same value for X, they must have the same value for Y.
• For any two tuples t and u in any relation instance r(R): if t[X] = u[X], then t[Y] = u[Y].
[Figure: tuples t and u over columns X and Y. If t and u agree on X, then they must agree on Y.]
Functional Dependencies
Example
• StaffNo -> position: position is functionally dependent on StaffNo (SL21 maps to Manager only). This is a 1:1 or M:1 relationship between attributes in a relation.
• position -> StaffNo does NOT hold: StaffNo is not functionally dependent on position (Manager maps to both SL21 and SG5). This is a 1:M relationship between attributes in a relation.
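The tuple-based definition can be checked mechanically against a relation instance. The sketch below is illustrative only: the function name fd_holds is hypothetical, the first two rows use the StaffNo and position values from the example, and the third row is invented. An instance can refute a dependency, but it cannot prove that the dependency holds for all time.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first row with a given X value fixes the Y value;
        # any later row with the same X must agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


staff = [
    {"StaffNo": "SL21", "position": "Manager"},
    {"StaffNo": "SG5", "position": "Manager"},
    {"StaffNo": "SG37", "position": "Assistant"},  # invented row
]

print(fd_holds(staff, ["StaffNo"], ["position"]))  # True
print(fd_holds(staff, ["position"], ["StaffNo"]))  # False
```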
Examples of FD constraints
• Social security number determines employee name
  • SSN -> ENAME
• Project number determines project name and location
  • PNUMBER -> {PNAME, PLOCATION}
• Employee SSN and project number determine the hours per week that the employee works on the project
  • {SSN, PNUMBER} -> HOURS
Identifying the PK
• Purpose of functional dependencies: to specify the set of integrity constraints that must hold on a relation.
• The determinant attribute(s) are a candidate key of the relation if:
  • there is a 1:1 relationship between determinant and dependent;
  • no subset of the determinant attribute(s) is a determinant (nontrivial). If (A, B) -> C, then NOT A -> B and NOT B -> A; otherwise A or B alone would already determine C.
• All attributes that are not part of the CK should be functionally dependent on the key: CK -> all attributes of R.
• The dependencies must hold for all time.
• The PK is the candidate key with the minimal set of functional dependencies.
Identifying the PK
• If a relation schema has more than one key, each is called a candidate key.
• One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.
• A prime attribute must be a member of some candidate key.
• A nonprime attribute is not a prime attribute; that is, it is not a member of any candidate key.
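Candidate keys and prime attributes can be derived from a set of FDs by computing attribute closures. The sketch below is a brute-force illustration, not a production algorithm. The function names are hypothetical, and it assumes a single relation made up of exactly the attributes that appear in the FD examples earlier in the lecture.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


# Assumed relation: the attributes used in the lecture's FD examples.
R = {"SSN", "ENAME", "PNUMBER", "PNAME", "PLOCATION", "HOURS"}
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

keys = candidate_keys(R, fds)
prime = set().union(*keys)
nonprime = R - prime
print(keys)
print(sorted(prime), sorted(nonprime))
```

Under these assumptions the only candidate key is {SSN, PNUMBER}, so SSN and PNUMBER are prime and the remaining attributes are nonprime. The enumeration is exponential in the number of attributes, which is acceptable only for classroom-sized schemas.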
The Purpose of Normalization
• Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes. It is performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form.
• Purpose:
  - Guarantees no redundancy due to FDs
  - Guarantees no update anomalies
• Normal Forms:
  • First Normal Form (1NF)
  • Second Normal Form (2NF)
  • Third Normal Form (3NF)
  • Boyce-Codd Normal Form (BCNF)
Normal Forms Defined Informally
• 1st normal form
  • All attributes depend on the key
• 2nd normal form
  • All attributes depend on the whole key
• 3rd normal form
  • All attributes depend on nothing but the key
First Normal Form (1NF)
Unnormalized form (UNF): A relation that contains one or more repeating groups.
First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value.
1NF disallows:
• composite attributes
• multivalued attributes
• nested relations; attributes whose values for an individual tuple are non-atomic
First Normal Form (1NF)
CLIENT_PROPERTY (unnormalized form, UNF)
ClientNo  Name           PropertyNo
CR76      John Key       PG4, PG16
CR56      Aline Stewart  PG4, PG36, PG16
Not in 1NF because the table contains a multivalued attribute (PropertyNo).
UNF -> 1NF: Approach 1
• Expand the key so that there is a separate tuple in the original relation for each value of the repeated attribute(s).
• The primary key becomes the combination of the original primary key and the multivalued attribute.
• Disadvantage: introduces redundancy in the relation.
CLIENT_PROPERTY (1NF relation)
ClientNo  PropertyNo  Name
CR76      PG4         John Key
CR76      PG16        John Key
CR56      PG4         Aline Stewart
CR56      PG36        Aline Stewart
CR56      PG16        Aline Stewart
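Approach 1 amounts to emitting one tuple per value of the repeating attribute. A minimal sketch, assuming the unnormalized rows are held as Python dicts with a list for PropertyNo (the function name flatten is hypothetical):

```python
unf = [
    {"ClientNo": "CR76", "Name": "John Key", "PropertyNo": ["PG4", "PG16"]},
    {"ClientNo": "CR56", "Name": "Aline Stewart",
     "PropertyNo": ["PG4", "PG36", "PG16"]},
]


def flatten(rows, repeating):
    """Emit one atomic row per value of the repeating attribute."""
    flat_rows = []
    for row in rows:
        for value in row[repeating]:
            flat = dict(row)          # copy the non-repeating attributes
            flat[repeating] = value   # replace the list with one atomic value
            flat_rows.append(flat)
    return flat_rows


first_nf = flatten(unf, "PropertyNo")
for row in first_nf:
    print(row["ClientNo"], row["PropertyNo"], row["Name"])
```

The client name is now repeated on every row for that client, which is the redundancy this approach introduces. The pair (ClientNo, PropertyNo) is unique across the result and can serve as the expanded key.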
UNF -> 1NF: Approach 2
• If the maximum number of values is known for the attribute, replace the repeated attribute (PropertyNo) with a number of atomic attributes (PropertyNo1, PropertyNo2, PropertyNo3).
• Disadvantage: introduces NULL values in the relation.
CLIENT_PROPERTY (1NF relation)
ClientNo  Name           PropertyNo1  PropertyNo2  PropertyNo3
CR76      John Key       PG4          PG16         NULL
CR56      Aline Stewart  PG4          PG36         PG16
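Approach 2 can be sketched the same way: the repeating attribute is spread over a fixed number of columns, and unused slots are padded. The function name widen is hypothetical, and Python's None stands in for NULL.

```python
unf = [
    {"ClientNo": "CR76", "Name": "John Key", "PropertyNo": ["PG4", "PG16"]},
    {"ClientNo": "CR56", "Name": "Aline Stewart",
     "PropertyNo": ["PG4", "PG36", "PG16"]},
]


def widen(rows, repeating, max_values):
    """Replace the repeating attribute with max_values numbered columns."""
    wide_rows = []
    for row in rows:
        values = row[repeating]
        if len(values) > max_values:
            raise ValueError("more values than the assumed maximum")
        wide = {k: v for k, v in row.items() if k != repeating}
        for i in range(max_values):
            # Pad with None (NULL) when the client has fewer properties.
            wide[f"{repeating}{i + 1}"] = values[i] if i < len(values) else None
        wide_rows.append(wide)
    return wide_rows


first_nf = widen(unf, "PropertyNo", 3)
for row in first_nf:
    print(row)
```

The ValueError branch shows the limit of this approach: it only works when the maximum number of values really is known in advance.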
Summary: First Normal Form
• A relation is in 1NF if all attribute values are atomic: no repeating groups and no composite attributes.
UNF (multivalued) -> 1NF
UNF (nested relations) -> 1NF
Example: First Normal Form (1NF)
The following table is not in 1NF because it contains nested relations:
DPT_NO  MG_NO  EMP_NO  EMP_NM
D101    12345  20000   Carl Sagan
               20001   Mag James
               20002   Larry Bird
D102    13456  30000   Jim Carter
               30001   Paul Simon
Table in 1NF
• All attribute values are atomic because there are no repeating groups and no composite attributes.
DPT_NO  MG_NO  EMP_NO  EMP_NM
D101    12345  20000   Carl Sagan
D101    12345  20001   Mag James
D101    12345  20002   Larry Bird
D102    13456  30000   Jim Carter
D102    13456  30001   Paul Simon
Second Normal Form
• Uses the concepts of FDs and the primary key
• Definitions:
  • Prime attribute: an attribute that is a member of the primary key K
  • Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD no longer holds
• Examples:
  • {SSN, PNUMBER} -> HOURS is a full FD, since neither SSN -> HOURS nor PNUMBER -> HOURS holds
  • {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency), since SSN -> ENAME also holds
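Partial dependencies can be spotted by looking for declared FDs whose determinant is a proper subset of the key. The sketch below makes a simplifying assumption: it inspects only the FDs as written and does not infer additional ones. The function name is hypothetical.

```python
def partial_dependencies(key, fds):
    """Declared FDs whose left side is a proper subset of the key
    and whose right side includes at least one non-key attribute."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and rhs - key]


key = {"SSN", "PNUMBER"}
fds = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

for lhs, rhs in partial_dependencies(key, fds):
    print(sorted(lhs), "->", sorted(rhs))
```

With these FDs, SSN -> ENAME and PNUMBER -> {PNAME, PLOCATION} are reported as partial, while {SSN, PNUMBER} -> HOURS is not, because its determinant is the whole key.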
Second Normal Form
• Second normal form (2NF) further addresses the removal of duplicated data.
• A relation R is in 2NF if:
  1. R is in 1NF, and
  2. all non-prime attributes are fully dependent on the candidate keys.
• Moving a relation to 2NF creates new tables, which are related to their predecessors through the use of foreign keys.
• A prime attribute appears in a candidate key.
• There is no partial dependency in 2NF.
Summary: Second Normal Form (2NF)
1) Meet all the requirements of 1NF.
2) Remove columns that are not fully dependent upon the primary key.
Example 1: 1NF -> 2NF
Remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants.
Example 2: Second Normal Form (2NF)
Inventory(Description, Supplier, Cost, Supplier Address)
The PK is (Description, Supplier). There are two non-key fields, Cost and Supplier Address. So, here are the questions for Cost:
• If I know just Description, can I find out Cost? No, because we have more than one supplier for the same product.
• If I know just Supplier, can I find out Cost? No, because I need to know what the item is as well.
Therefore, Cost is fully functionally dependent upon the ENTIRE PK (Description, Supplier) for its existence.
Example 2: Second Normal Form (2NF), continued
Inventory(Description, Supplier, Cost, Supplier Address)
• If I know just Description, can I find out Supplier Address? No, because we have more than one supplier for the same product.
• If I know just Supplier, can I find out Supplier Address? Yes. The address does not depend upon the description of the item.
Therefore, Supplier Address is NOT fully functionally dependent upon the ENTIRE PK (Description, Supplier) for its existence. It depends on Supplier alone, so it moves to a new relation:
Supplier(Name, Supplier Address)
Inventory(Description, Supplier, Cost)
Supplier(Name, Supplier Address)
The above relations are now in 2NF.
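The decomposition is a pair of projections with duplicate rows removed. The sketch below uses invented sample rows, since the lecture gives only the column names. The helper name project is hypothetical, and the supplier's key column keeps the name Supplier where the lecture's Supplier relation calls it Name.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Invented sample data for the Inventory relation.
inventory = [
    {"Description": "Widget", "Supplier": "Acme", "Cost": 10,
     "Supplier Address": "1 Main St"},
    {"Description": "Widget", "Supplier": "Bolt Co", "Cost": 12,
     "Supplier Address": "9 High St"},
    {"Description": "Gadget", "Supplier": "Acme", "Cost": 7,
     "Supplier Address": "1 Main St"},
]

inventory_2nf = project(inventory, ["Description", "Supplier", "Cost"])
supplier = project(inventory, ["Supplier", "Supplier Address"])

print(inventory_2nf)
print(supplier)
```

In this sample the address of Acme was stored twice before the split and once after it, which is the duplication that 2NF removes.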
Third Normal Form (1)
• Transitive functional dependency
X, Y, Z are attributes of a relation, such that:
• If X -> Y and Y -> Z, then Z is transitively dependent on X via Y.
• Provided X is NOT functionally dependent on Y or Z (nontrivial FD).
• Examples:
  • SSN -> DMGRSSN is a transitive FD
    • Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold
  • SSN -> ENAME is non-transitive
    • Since there is no set of attributes X where SSN -> X and X -> ENAME
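A transitive dependency shows up as a two-step chain key -> via -> target in the FD list. The sketch below is a simplified illustration: it looks only at the declared FDs, and it folds the lecture's SSN -> ENAME and SSN -> DNUMBER into one FD for brevity. The function name is hypothetical.

```python
def transitive_chains(key, fds):
    """List (via, target) pairs where key -> via -> target among declared FDs."""
    direct = set()
    for lhs, rhs in fds:
        if lhs == key:
            direct |= rhs  # attributes the key determines directly
    chains = []
    for lhs, rhs in fds:
        # A non-key determinant that is itself determined by the key.
        if lhs != key and lhs <= direct:
            for target in sorted(rhs - key - lhs):
                chains.append((tuple(sorted(lhs)), target))
    return chains


fds = [
    ({"SSN"}, {"ENAME", "DNUMBER"}),
    ({"DNUMBER"}, {"DMGRSSN"}),
]

print(transitive_chains({"SSN"}, fds))  # [(('DNUMBER',), 'DMGRSSN')]
```

ENAME does not appear in the result because no intermediate attribute set determines it, matching the non-transitive case in the examples.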
Third Normal Form (2)
• A relation schema R is in third normal form (3NF) if:
  1. R is in 2NF, and
  2. no non-prime attribute A in R is transitively dependent on the primary key.
• R can be decomposed into 3NF relations via the process of 3NF normalization.
• NOTE:
  • In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.
  • When Y is a candidate key, there is no problem with the transitive dependency.
  • E.g., consider EMP(SSN, Emp#, Salary).
    • Here, SSN -> Emp# -> Salary and Emp# is a candidate key.
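The candidate-key exception in the note can be checked with the general formulation of 3NF: every nontrivial FD X -> A must have X as a superkey or A as a prime attribute. The sketch below assumes the candidate keys are supplied by hand and, as a simplification, examines only the declared FDs. The function name is hypothetical, and the second FD set reuses the SSN, DNUMBER, DMGRSSN example from the previous slide.

```python
def violations_3nf(fds, candidate_keys):
    """Declared FDs X -> A where X is not a superkey and A is nonprime."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        is_superkey = any(key <= lhs for key in candidate_keys)
        nonprime_targets = rhs - lhs - prime
        if not is_superkey and nonprime_targets:
            bad.append((lhs, nonprime_targets))
    return bad


# EMP(SSN, Emp#, Salary): Emp# is a candidate key, so the chain is harmless.
emp_fds = [({"SSN"}, {"Emp#"}), ({"Emp#"}, {"Salary"})]
print(violations_3nf(emp_fds, [{"SSN"}, {"Emp#"}]))  # []

# DNUMBER is not a candidate key, so DNUMBER -> DMGRSSN violates 3NF.
dept_fds = [({"SSN"}, {"ENAME", "DNUMBER"}), ({"DNUMBER"}, {"DMGRSSN"})]
print(violations_3nf(dept_fds, [{"SSN"}]))  # [({'DNUMBER'}, {'DMGRSSN'})]
```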
Summary: Third Normal Form (3NF)
1) Meet all the requirements of 1NF.
2) Meet all the requirements of 2NF.
3) Remove columns that are not dependent upon the primary key.
Example: 2NF -> 3NF
If transitive dependencies exist, place transitively dependent attributes in a new relation along with a copy of their determinants.
Example: Third Normal Form (3NF)
• The relation describes parcels of land for sale in various counties of a state. Suppose that there are two candidate keys: Property_id# and {County_name, Lot#}.
  • Lot# values are unique only within each county.
  • Property_id# numbers are unique across counties for the entire state.
Example: Third Normal Form (3NF)
Books(Name, Author's Name, Author's Nom de Plume, # of Pages)
• If I know # of Pages, can I find out Author's Name? No. Can I find out Author's Nom de Plume? No.
• If I know Author's Name, can I find out # of Pages? No. Can I find out Author's Nom de Plume? YES.
Therefore, Author's Nom de Plume is functionally dependent upon Author's Name, not the PK, for its existence. The relation is split into:
Books(Name, Author's Name, # of Pages)
Author(Name, Nom de Plume)
Review Example
STAFF_PROPERTY_INSPECTION (unnormalized relation)
Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
                            22-Apr-01  09:00  Good order        SG14     David  M53HDR
                            1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
                            24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR
UNF -> 1NF
STAFF_PROPERTY_INSPECTION (1NF)
Pno   pAddress              iDate      iTime  comments          StaffNo  sName  CarReg
PG4   Lawrence St, Glasgow  18-Oct-00  10:00  Replace crockery  SG37     Ann    M23JGR
PG4   Lawrence St, Glasgow  22-Apr-01  09:00  Good order        SG14     David  M53HDR
PG4   Lawrence St, Glasgow  1-Oct-01   12:00  Damp rot          SG14     David  N72HFR
PG16  5 Novar Dr., Glasgow  22-Apr-01  13:00  Replace carpet    SG14     David  M53HDR
PG16  5 Novar Dr., Glasgow  24-Oct-01  14:00  Good condition    SG37     Ann    N72HFR
1NF -> 2NF
STAFF_PROPERTY_INSPECTION(Pno, pAddress, iDate, iTime, comments, StaffNo, sName, CarReg)
Partial dependency: Pno -> pAddress
1NF -> 2NF
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, sName, CarReg) (2NF)
PROPERTY(Pno, pAddress) (2NF)
Transitive dependency: StaffNo -> sName
2NF -> 3NF
All three relations are now in 3NF:
PROPERTY(Pno, pAddress)
STAFF(StaffNo, sName)
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg)
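One way to sanity-check the final decomposition is to project the flat relation onto the three 3NF schemas and join them back. The sketch below uses the rows of the review example and assumes each Pno has a single pAddress, as the dependency Pno -> pAddress requires. The helper name project and the tuple layout are illustrative choices.

```python
COLS = ("Pno", "pAddress", "iDate", "iTime", "comments",
        "StaffNo", "sName", "CarReg")
ROWS = [
    ("PG4", "Lawrence St, Glasgow", "18-Oct-00", "10:00",
     "Replace crockery", "SG37", "Ann", "M23JGR"),
    ("PG4", "Lawrence St, Glasgow", "22-Apr-01", "09:00",
     "Good order", "SG14", "David", "M53HDR"),
    ("PG4", "Lawrence St, Glasgow", "1-Oct-01", "12:00",
     "Damp rot", "SG14", "David", "N72HFR"),
    ("PG16", "5 Novar Dr., Glasgow", "22-Apr-01", "13:00",
     "Replace carpet", "SG14", "David", "M53HDR"),
    ("PG16", "5 Novar Dr., Glasgow", "24-Oct-01", "14:00",
     "Good condition", "SG37", "Ann", "N72HFR"),
]
INDEX = {col: i for i, col in enumerate(COLS)}


def project(attrs):
    """Project ROWS onto attrs; the set removes duplicate tuples."""
    return {tuple(row[INDEX[a]] for a in attrs) for row in ROWS}


# Two-column relations become lookup tables keyed by their determinant.
PROPERTY = dict(project(["Pno", "pAddress"]))
STAFF = dict(project(["StaffNo", "sName"]))
INSPECTION = project(["Pno", "iDate", "iTime", "comments", "StaffNo", "CarReg"])

# Natural join: follow Pno into PROPERTY and StaffNo into STAFF.
rebuilt = {
    (pno, PROPERTY[pno], i_date, i_time, comments, staff_no, STAFF[staff_no], car)
    for (pno, i_date, i_time, comments, staff_no, car) in INSPECTION
}

print(rebuilt == set(ROWS))  # True
```

The join reproduces exactly the five original rows, so no information was lost, while each property address and staff name is now stored once.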
References
• “Database Systems: A Practical Approach to Design, Implementation and Management.” Thomas Connolly, Carolyn Begg. 5th Edition, Addison-Wesley, 2009.