Properties of Table
• When we design a relational DB,
  o It is a set of relations.
  o Relations can be derived from a UML diagram.
• But NOT all relations are correct.
  o We should carefully observe the properties of a table:
  o Functional Dependency
  o Key
  o Decomposition of Table
Definition of Functional Dependency
• FD (Functional Dependency) on a Relation R
  o A1 A2 A3 … An → B holds iff any two tuples of R that agree on A1, A2, A3, …, An also agree on B,
    where A1, A2, A3, …, An, B are attributes of R
  o The set of attributes A1 A2 A3 … An functionally determines B
  o More than one B:
      A1 A2 A3 … An → B1
      A1 A2 A3 … An → B2
      …
      A1 A2 A3 … An → Bk
    can be written as
      A1 A2 A3 … An → B1 B2 … Bk
Functional Dependency: Example
• A Relation
  o Movies (title, year, length, filmType, studioName, starName)
      (title year) → length
      (title year) → filmType
      (title year) → studioName
      (title year) → length filmType studioName
      ? (title year) → starName : does not hold, since there can be more than one star in a film
• It is important to discover the FDs in a relation
  o It helps to decide the correctness of the relation design.
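A small sketch of how an FD can be tested against sample data. The helper name `fd_holds` and the dictionary layout are illustrative assumptions; the rows are the Movies rows shown in the "Bad Design: Anomalies" slide. Note that sample data can only refute an FD; it cannot prove that the FD holds for every legal instance of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on `lhs` also agree on `rhs`."""
    seen = {}  # lhs values -> rhs values first seen with them
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs values, different rhs values
    return True


COLUMNS = ("title", "year", "length", "filmType", "studioName", "starName")
MOVIES = [dict(zip(COLUMNS, r)) for r in [
    ("Star Wars", 1977, 124, "Color", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Harrison Ford"),
    ("Mighty Ducks", 1991, 104, "Color", "Disney", "Emilio Estevez"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Mike Meyers"),
]]

# (title year) -> length filmType studioName is not contradicted by the data
print(fd_holds(MOVIES, ["title", "year"], ["length", "filmType", "studioName"]))  # True
# (title year) -> starName is refuted: Star Wars (1977) has three stars
print(fd_holds(MOVIES, ["title", "year"], ["starName"]))  # False
```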
Key
• Given a relation R
  o A set of one or more attributes {A1, A2, A3, …, An} is a KEY iff
      the set functionally determines all other attributes, and
      no proper subset of {A1, A2, A3, …, An} functionally determines all other attributes (Minimal)
  o Primary Key:
      If a relation has more than one key, one of the keys is designated as the primary key
  o Super Key
      A set of attributes containing a key
      No minimality condition
• Example
  o Movies (title, year, length, filmType, studioName, starName)
  o What are the keys?
How to discover keys
• From E-R Diagram: Underlined Attributes
  o It means that keys are defined based on the understanding of the real world
  o Example: Movies (title, year, length, filmType, studioName, starName)
      (year, starName) is not a key if a star can make more than one film per year
      (year, starName) is a key if a star is allowed to make only one film per year
• Relation (A1, A2, B) for a relationship between R1 and R2
  o One-One
  o One-Many
  o Many-One
  o Many-Many
Rules about Functional Dependencies
• Functional Dependency
  o An important property of a Relation (or Table)
  o Some interesting properties or rules of FD
• Transitive Rule
  o If A → B and B → C, then A → C
• Splitting/Combining Rule
  o A1 A2 A3 … An → B1, A1 A2 A3 … An → B2, …, A1 A2 A3 … An → Bk
    iff A1 A2 A3 … An → B1 B2 … Bk
• Trivial FD Rule: Given an FD A1 A2 A3 … An → B
  o The FD is trivial if B is one of {A1 A2 A3 … An}: really trivial
  o The FD is completely non-trivial if B is not in {A1 A2 A3 … An}
Rules about Functional Dependencies
• Trivial Dependency Rule
  o A1 A2 … An → B1 B2 … Bm is equivalent to A1 A2 … An → C1 C2 … Ck
    if {C1 C2 … Ck} ⊆ {B1 B2 … Bm},
    for any C ∈ {C1 C2 … Ck}, C ∉ {A1 A2 … An}, and
    every B that is not in {A1 A2 … An} is one of the C's
  o Example:
      (year, title) → (studioName, year) is equivalent to (year, title) → studioName
      The year on the right side is unnecessary
Armstrong's Axioms
• Reflexivity: (Trivial FD)
  If {C1 C2 … Ck} ⊆ {B1 B2 … Bm}, then B1 B2 … Bm → C1 C2 … Ck
• Augmentation:
  If A1 A2 … An → B1 B2 … Bm, then A1 A2 … An C1 C2 … Ck → B1 B2 … Bm C1 C2 … Ck
• Transitivity:
  If A1 A2 … An → B1 B2 … Bm and B1 B2 … Bm → C1 C2 … Ck, then A1 A2 … An → C1 C2 … Ck
Closure of Attributes
• Closure: {A1, A2, …, An}+
  o {A1 A2 … An} is a set of attributes and S is a set of FDs
  o Closure of {A1 A2 … An} under the FDs in S: the set of attributes B such that A1 A2 … An → B
  o That is, if under all the functional dependencies in S the FDs that we can derive are
      A1 A2 … An → B1
      A1 A2 … An → B2
      . . .
      A1 A2 … An → Bk
    then {A1 A2 … An}+ = {B1, B2, …, Bk}
Algorithm to Find Closure
• Input: Set of attributes {A1, A2, …, An}, and a set S of FDs
• Output: {A1, A2, …, An}+
• Process
  1. Split the FDs so that each FD has a single attribute on the right.
     e.g. if A1 A2 → B C, then split it into A1 A2 → B and A1 A2 → C
  2. Initialize X = {A1, A2, …, An}
  3. Search for some FD B1 B2 ... Bm → C such that B1, B2, ..., Bm are in X but C is not in X, and add C to X
  4. Repeat 3 until there is no more attribute to add to X
• Example
  o Given attributes A, B, C, D, E, and F
  o S: A B → C, B C → A D, D → E, and C F → B
      What is {A, B}+ ?
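The four steps can be sketched in Python as follows. The function name `closure` and the encoding of each FD as a pair of strings (left side, right side) with single-letter attribute names are assumptions made for this illustration only.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is a (lhs, rhs) pair of strings."""
    # Step 1: split every FD so that it has a single attribute on the right.
    split = [(frozenset(lhs), b) for lhs, rhs in fds for b in rhs]
    # Step 2: start from the given attributes.
    x = set(attrs)
    # Steps 3 and 4: keep adding right sides whose left side is already inside X.
    changed = True
    while changed:
        changed = False
        for lhs, b in split:
            if lhs <= x and b not in x:
                x.add(b)
                changed = True
    return x


# S: A B -> C, B C -> A D, D -> E, C F -> B
S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", S)))  # ['A', 'B', 'C', 'D', 'E']
```

F is never added because no FD has F on its right side, so C F → B cannot fire when starting from {A, B}.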
Closure and Key
• If {A1, A2, …, An}+ is the set of all attributes of relation R, then {A1, A2, …, An} is a super key
  o Example: R (A, B, C, D, E) and S: A B → C, B C → A D, D → E
      then {A, B}+ = {A, B, C, D, E}: all attributes of R.
      {A, B} is a super key of R.
• If no attribute can be removed while still covering all the attributes, then it is a key.
  o Example:
      if we remove B from {A, B}, then {A}+ = {A} is not {A, B, C, D, E}; likewise {B}+ = {B}.
      Therefore {A, B} is a key
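A hedged sketch of the super key and key tests built on attribute closure. The names `is_superkey` and `is_key` and the string encoding of FDs are illustrative assumptions. Because closure can only grow when attributes are added, it is enough to try removing one attribute at a time when testing minimality.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`; each FD is a (lhs, rhs) pair of strings."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def is_superkey(attrs, all_attrs, fds):
    """A super key is a set whose closure covers every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)


def is_key(attrs, all_attrs, fds):
    """A key is a super key from which no single attribute can be removed."""
    attrs = set(attrs)
    if not is_superkey(attrs, all_attrs, fds):
        return False
    return not any(is_superkey(attrs - {a}, all_attrs, fds) for a in attrs)


# R (A, B, C, D, E) with S: A B -> C, B C -> A D, D -> E
R = "ABCDE"
S = [("AB", "C"), ("BC", "AD"), ("D", "E")]
print(is_superkey("AB", R, S))  # True
print(is_key("AB", R, S))       # True
print(is_key("ABC", R, S))      # False: a super key, but C can be removed
```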
Closing Set of Functional Dependencies
• Closing Set of FD set S:
  o Basis T of S: If we can derive S from T, then T is a basis of S.
  o Remove all duplicated FDs
  o A Minimal Basis B satisfies three conditions
      All the FDs in B have one attribute on the right side
      If any FD is removed from B, then some FD of S can no longer be derived (the result is no longer a basis).
      If for any FD in B, we remove one or more attributes from the left side, then the result is no longer a basis
• Example
  o For S = {A→B, A→C, B→A, B→C, C→A, C→B}, what is the minimal basis of S?
      {AB→C, AC→B, BC→A}?
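The three conditions can be checked mechanically with attribute closure. This is only a sketch: the helper names are hypothetical, FDs are encoded as (left, right) string pairs, "basis" is taken to mean that the two FD sets derive each other, and empty left sides are not considered. The candidate {A→B, B→C, C→A} is an extra illustration that is not listed on the slide.

```python
def closure(attrs, fds):
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def follows(fd, fds):
    """True if the FD can be derived from `fds`."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)


def is_basis(t, s):
    """Assumed meaning: t and s derive each other."""
    return all(follows(fd, t) for fd in s) and all(follows(fd, s) for fd in t)


def is_minimal_basis(b, s):
    if any(len(rhs) != 1 for _, rhs in b) or not is_basis(b, s):
        return False
    for i, (lhs, rhs) in enumerate(b):
        rest = b[:i] + b[i + 1:]
        if is_basis(rest, s):  # condition 2: this FD is redundant
            return False
        for a in lhs:  # condition 3: try dropping one left-side attribute
            shorter = lhs.replace(a, "")
            if shorter and is_basis(rest + [(shorter, rhs)], s):
                return False
    return True


S = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B")]
print(is_minimal_basis([("A", "B"), ("B", "C"), ("C", "A")], S))     # True
print(is_minimal_basis([("AB", "C"), ("AC", "B"), ("BC", "A")], S))  # False
print(is_minimal_basis(S, S))                                        # False
```

The second candidate fails because A → B cannot be derived from it: under those three FDs, {A}+ is just {A}.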
Bad Design: Anomalies
• Bad Design: Example
  o Redundancy
  o Update Anomaly
  o Deletion Anomaly

Title         | Year | Length | Film Type | StudioName | StarName
--------------|------|--------|-----------|------------|---------------
Star Wars     | 1977 | 124    | Color     | Fox        | Carrie Fisher
Star Wars     | 1977 | 124    | Color     | Fox        | Mark Hamill
Star Wars     | 1977 | 124    | Color     | Fox        | Harrison Ford
Mighty Ducks  | 1991 | 104    | Color     | Disney     | Emilio Estevez
Wayne’s World | 1992 | 95     | Color     | Paramount  | Dana Carvey
Wayne’s World | 1992 | 95     | Color     | Paramount  | Mike Meyers
Decomposing Relations
• Decomposition of Bad Relation
  o A good way to remove the problem of bad relations
• Decomposition: Lossless Decomposition
  o {A1 A2 … An} is decomposed into {B1 B2 … Bm} and {C1 C2 … Ck} such that
      {B1 B2 … Bm} ∪ {C1 C2 … Ck} = {A1 A2 … An} and
      {B1 B2 … Bm} ∩ {C1 C2 … Ck} ≠ {}
Decomposing Relations: Example
• R = {title, year, length, filmType, studioName, starName} is decomposed into
  {title, year, length, filmType, studioName} (=R1) and {title, year, starName} (=R2)
  o Redundancy
  o Update Anomaly
  o Deletion Anomaly

R1:
Title         | Year | Length | Film Type | StudioName
--------------|------|--------|-----------|-----------
Star Wars     | 1977 | 124    | Color     | Fox
Mighty Ducks  | 1991 | 104    | Color     | Disney
Wayne’s World | 1992 | 95     | Color     | Paramount

R2:
Title         | Year | StarName
--------------|------|---------------
Star Wars     | 1977 | Carrie Fisher
Star Wars     | 1977 | Mark Hamill
Star Wars     | 1977 | Harrison Ford
Mighty Ducks  | 1991 | Emilio Estevez
Wayne’s World | 1992 | Dana Carvey
Wayne’s World | 1992 | Mike Meyers
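To illustrate what "lossless" means for this example, the sketch below projects the original Movies rows onto R1 and R2 and checks that their natural join gives back exactly the original rows. The helpers `project`, `natural_join`, and `same_rows` are hypothetical names written for this illustration. The last split, which shares only filmType, is an added contrast that is not in the slides: its attribute sets overlap, yet the join produces spurious rows.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dictionaries."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in distinct]


def natural_join(left, right):
    """Join rows that agree on all attributes the two relations share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


def same_rows(a, b):
    as_set = lambda rows: {frozenset(r.items()) for r in rows}
    return as_set(a) == as_set(b)


COLUMNS = ("title", "year", "length", "filmType", "studioName", "starName")
MOVIES = [dict(zip(COLUMNS, r)) for r in [
    ("Star Wars", 1977, 124, "Color", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "Color", "Fox", "Harrison Ford"),
    ("Mighty Ducks", 1991, 104, "Color", "Disney", "Emilio Estevez"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "Color", "Paramount", "Mike Meyers"),
]]

R1 = project(MOVIES, ["title", "year", "length", "filmType", "studioName"])
R2 = project(MOVIES, ["title", "year", "starName"])
print(len(R1), len(R2))                         # 3 6
print(same_rows(natural_join(R1, R2), MOVIES))  # True: nothing lost, nothing invented

# Contrast: a split that shares only filmType joins every film with every star.
BAD = natural_join(R1, project(MOVIES, ["filmType", "starName"]))
print(len(BAD), same_rows(BAD, MOVIES))         # 18 False
```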
Normal Form: Conditions for Good Relation
• 1st Normal Form (1NF)
• 2nd Normal Form (2NF)
• 3rd Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
1st Normal Form
• 1NF: Every component of a relation should be ATOMIC
  o No table in a component
  o No set
  o No list, etc.
2nd Normal Form
• 2NF
  o 1NF and
  o None of the non-prime attributes of the relation is functionally dependent on a part of a candidate key
      (such an FD is a partial dependency of a non-prime attribute)
• Example
  o Player (Team, Number, TeamAddress, Name, Position)
  o 1NF but not 2NF
Example
• Player (Team, Number, TeamAddress, Name, Position)
  o FD1: Team, Name → Number, Position
  o FD2: Team → TeamAddress
  o Key: {Team, Name}+ = {Team, Number, TeamAddress, Name, Position}
  o In FD2, TeamAddress (non-prime attribute) is dependent on {Team}, which is a proper subset of the key
  o 2NF violation
• Should be decomposed
  o R1(Team, Number, Name, Position) and R2(Team, TeamAddress)
  o R1 ⋈ R2 = R (natural join)
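A sketch of how the 2NF violation in the Player example could be found automatically. The function names are hypothetical, and FD1 is read here as Team, Name → Number, Position, which is an assumption: it is the reading under which the stated closure of {Team, Name} covers all five attributes. Candidate keys are found by brute force, which is only reasonable for slide-sized relations.

```python
from itertools import combinations


def closure(attrs, fds):
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def candidate_keys(all_attrs, fds):
    """Brute force: smallest attribute sets whose closure is everything."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(k <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(frozenset(combo))
    return keys


def partial_dependencies(all_attrs, fds):
    """(part of a key, non-prime attribute) pairs that violate 2NF."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in closure(part, fds) - set(part) - prime:
                    found.add((part, attr))
    return sorted(found)


PLAYER = ("Team", "Number", "TeamAddress", "Name", "Position")
FD1 = (("Team", "Name"), ("Number", "Position"))  # assumed reading of FD1
FD2 = (("Team",), ("TeamAddress",))

print(partial_dependencies(PLAYER, [FD1, FD2]))  # [(('Team',), 'TeamAddress')]
# After the decomposition, neither relation has a partial dependency.
print(partial_dependencies(("Team", "Number", "Name", "Position"), [FD1]))  # []
print(partial_dependencies(("Team", "TeamAddress"), [FD2]))                 # []
```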
Example

Employee | Skill          | Current Work Location
---------|----------------|----------------------
Jones    | Typing         | 114 Main Street
Jones    | Shorthand      | 114 Main Street
Jones    | Whittling      | 114 Main Street
Roberts  | Light Cleaning | 73 Industrial Way
Ellis    | Alchemy        | 73 Industrial Way
Ellis    | Juggling       | 73 Industrial Way
Harrison | Light Cleaning | 73 Industrial Way

• Candidate Key: {Employee, Skill}
• Not 2NF
  o Partial FD: Employee → Current Work Location
• Should be decomposed
  o (Employee, Skill), (Employee, Current Work Location)
3rd Normal Form
• 3NF: 2NF, and every non-prime attribute of the relation must be non-transitively dependent on every candidate key
• Example
  o Team (TeamName, Address, ManagerID, ManagerHireDate)
  o FD:
      TeamName → Address, TeamName → ManagerID
      (TeamName →) ManagerID → ManagerHireDate
      Key: {TeamName}
      2NF but not 3NF
  o To be decomposed
      (TeamName, Address, ManagerID), (ManagerID, ManagerHireDate)
Example: 2NF but NOT 3NF

Tournament           | Year | Winner         | Winner Date of Birth
---------------------|------|----------------|---------------------
Indiana Invitational | 1998 | Al Fredrickson | 21 July 1975
Cleveland Open       | 1999 | Bob Albertson  | 28 September 1968
Des Moines Masters   | 1999 | Al Fredrickson | 21 July 1975
Indiana Invitational | 1999 | Chip Masterson | 14 March 1977

• Candidate Key: {Tournament, Year}
• 2NF: No partial dependency
• Not 3NF
  o Transitive functional dependency: {Tournament, Year} → Winner → Winner Date of Birth
• Should be decomposed
  o (Tournament, Year, Winner), (Player, Birth date)
Boyce-Codd Normal Form (BCNF)
• BCNF: For every one of its non-trivial functional dependencies X → Y, X is a super key
  o Remember: nontrivial means Y is not contained in X.
  o Remember: a superkey is any superset of a key (not necessarily a proper superset)
• BCNF is slightly stronger than 3NF
Example: 3NF but NOT BCNF

Prof. ID | Prof. SS ID | Student ID
---------|-------------|-----------
1078     | 088-51-0074 | 31850
1078     | 088-51-0074 | 37921
1293     | 096-77-4146 | 46224
1480     | 072-21-2223 | 31850

A table to show the assignment of students

• Candidate Keys: {Prof. ID, Student ID}, {Prof. SS ID, Student ID}
• 1NF
• 2NF: no partial FD of a non-prime attribute on a candidate key
• 3NF: no transitive FD
• NOT BCNF:
  o Prof. ID → Prof. SS ID is a functional dependency, but Prof. ID is not a super key
• Should be decomposed
  o (Prof. ID, Student ID), (Prof. ID, Prof. SS ID)
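The difference between 3NF and BCNF in this example can be sketched in code. Two assumptions are made and should be read as such: 3NF is tested with the common textbook formulation (for every non-trivial FD X → A, X is a super key or A is a prime attribute), and the reverse FD Prof. SS ID → Prof. ID is added because the slide lists both columns as parts of candidate keys. All function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def candidate_keys(all_attrs, fds):
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(k <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(all_attrs):
                keys.append(frozenset(combo))
    return keys


def violations(all_attrs, fds, allow_prime_rhs):
    """Non-trivial FDs whose left side is not a super key."""
    prime = set().union(*candidate_keys(all_attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(all_attrs):
            continue  # left side is a super key
        for attr in rhs:
            if attr in lhs:
                continue  # trivial part of the FD
            if allow_prime_rhs and attr in prime:
                continue  # tolerated by 3NF, not by BCNF
            bad.append((tuple(lhs), attr))
    return bad


def is_3nf(all_attrs, fds):
    return not violations(all_attrs, fds, allow_prime_rhs=True)


def is_bcnf(all_attrs, fds):
    return not violations(all_attrs, fds, allow_prime_rhs=False)


ATTRS = ("ProfID", "ProfSSID", "StudentID")
FDS = [(("ProfID",), ("ProfSSID",)), (("ProfSSID",), ("ProfID",))]
print(is_3nf(ATTRS, FDS), is_bcnf(ATTRS, FDS))  # True False

# Both relations of the suggested decomposition are in BCNF.
print(is_bcnf(("ProfID", "StudentID"), []))     # True
print(is_bcnf(("ProfID", "ProfSSID"), FDS))     # True
```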
Decomposition
• Three Conditions
  o Elimination of Anomalies
      Update
      Redundancy
      Deletion
  o Lossless Decomposition
      The original relation can be recovered by natural join
  o Preservation of Dependencies
• Relation with two attributes: Always in BCNF (why?)
BCNF Decomposition Algorithm
• Algorithm
  o Input: Relation R0 and set S0 of FDs
  o Output: R1, R2, …, Rn such that R0 = R1 ⋈ R2 ⋈ … ⋈ Rn
  o Process
      1. Check whether R0 is in BCNF; if so, return R0
      2. If there is any BCNF violation X → Y, then compute X+. Then R1 = X+, and R2 has X and the rest of the attributes (those not in X+)
      3. Decompose the FD set S0 into S1 and S2.
      4. Repeat 1-3 on R1 and R2 until there is no more BCNF violation.
• Example
  o Team (TeamName, Address, ManagerID, ManagerHireDate)
  o FD:
      TeamName → Address, TeamName → ManagerID
      ManagerID → ManagerHireDate
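The algorithm can be sketched recursively as below, applied to the Team example. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, step 3 (splitting the FD set) is done by brute-force projection of the FDs onto each sub-relation, which is exponential and only suitable for small examples, and the first violation found is the one used for splitting.

```python
from itertools import combinations


def closure(attrs, fds):
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


def project_fds(attrs, fds):
    """Step 3: the non-trivial FDs that hold inside the sub-relation `attrs`."""
    attrs = sorted(attrs)
    projected = []
    for size in range(1, len(attrs)):
        for lhs in combinations(attrs, size):
            rhs = (closure(lhs, fds) & set(attrs)) - set(lhs)
            if rhs:
                projected.append((lhs, tuple(sorted(rhs))))
    return projected


def bcnf_decompose(attrs, fds):
    attrs = set(attrs)
    for lhs, rhs in fds:
        x_plus = closure(lhs, fds) & attrs
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and x_plus != attrs:  # step 2: X -> Y violates BCNF
            r1 = x_plus                         # R1 = X+
            r2 = set(lhs) | (attrs - x_plus)    # R2 = X plus the remaining attributes
            return (bcnf_decompose(r1, project_fds(r1, fds))
                    + bcnf_decompose(r2, project_fds(r2, fds)))
    return [frozenset(attrs)]                   # step 1: already in BCNF


TEAM = ("TeamName", "Address", "ManagerID", "ManagerHireDate")
FDS = [
    (("TeamName",), ("Address",)),
    (("TeamName",), ("ManagerID",)),
    (("ManagerID",), ("ManagerHireDate",)),
]
for relation in bcnf_decompose(TEAM, FDS):
    print(sorted(relation))
# ['ManagerHireDate', 'ManagerID']
# ['Address', 'ManagerID', 'TeamName']
```

The only violation is ManagerID → ManagerHireDate, since {ManagerID}+ does not cover all of Team, so the relation is split into (ManagerID, ManagerHireDate) and (TeamName, Address, ManagerID).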