lecture 09: functional dependencies
DESCRIPTION
Lecture 09: Functional Dependencies. Outline. Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6). Relational Schema Design. Conceptual Model:. name. buys. Product. Person. Relational Model: plus FD’s. price. name. ssn. date. Normalization: - PowerPoint PPT PresentationTRANSCRIPT
Lecture 09: Functional Dependencies
Outline
• Functional dependencies (3.4)• Rules about FDs (3.5)• Design of a Relational schema (3.6)
Relational Schema Design
Conceptual Model:
Relational Model:plus FD’s
Normalization:Eliminates anomalies
PersonbuysProduct
name
pricename ssndate
Functional Dependencies
Definition: A1, ..., Am → B1, ..., Bn holds in R if:
t, t’ R, (t.A1=t’.A1 ... t.Am=t’.Am t.B1=t’.B1 ... t.Bm=t’.Bm )
A1 ... Am B1 ... Bm
if t, t’ agree here then t, t’ agree here
t
t’
R
Important Point!
• Functional dependencies are part of the schema!
• They constrain the possible legal data instances.
• At any point in time, the actual database may satisfy additional FD’s.
Examples
• EmpID → Name, Phone, Position• Position → Phone• but Phone → Position
EmpID Name Phone PositionE0045 John Smith 2376 HRE1847 Zheng Li 2376 QAE3123 Lucy Larsen 1264 DeveloperE9923 Zheng Li 1264 Developer
Formal definition of a key
• A key is a set of attributes A1, ..., An s.t. for any other attribute B, A1, ..., An → B
• A minimal key is a set of attributes which is a key and for which no subset is a key
• Note: book calls them superkey and key
Examples of Keys• Product(name, price, category, color)
name, category → pricecategory → color
Keys are: {name, category} and all supersets
• Enrollment(student, address, course, room, time)student → addressroom, time → coursestudent, course → room, time
Keys are: [in class]
Inference Rules for FD’s
A1 , A2 , … An→B1, B2, … Bm
Is equivalent to
A1 , A2 , … An →B1
A1 , A2 , … An →B2
A1 , A2 , … An →Bm
…
Splitting rule and Combing rule
A1 ... Am B1 ... Bm
AND
Inference Rules for FD’s(continued)
Trivial Rule
Why ?A1 ... Am
where i = 1, 2, ..., n
A1 , A2 , … An → Ai
Inference Rules for FD’s(continued)
Transitive Closure Rule
If
and
then
Why?
A1 , A2 , … An → B1, B2, … Bm
B1, B2, … Bm → C1 , C2 , … Ck
A1 , A2 , … An → C1 , C2 , … Ck
A1 ... Am B1 ... Bm C1 ... Ck
• Enrollment(student, major, course, room, time)student → majormajor, course → roomcourse → time
What else can we infer ? [in class]
Closure of a set of Attributes
Given a set of attributes {A1, …, An} and a set of dependencies S.Problem: find all attributes B such that:
any relation which satisfies S also satisfies: A1 , A2 , … An → B
The closure of {A1, …, An}, denoted {A1, …, An}+,is the set of all such attributes B
Closure AlgorithmStart with X={A1, …, An}.
Until X doesn’t change do:
if B1, B2, … Bm → C is in S, and
B1, B2, … Bm are all in X and
C is not in X
thenadd C to X
ExampleThe schema - R(A,B,C,D,E,F)
A, B → CA,D → EB → DA,F → B
Closure of {A,B}: X = {A, B, }
Closure of {A, F}: X = {A, F, }
Why Is the Algorithm Correct ?• Show the following by induction:
– For every B in X:• A1 , A2 , … An → B
• Initially X = {A1 , A2 , … An } holds• Induction step: B1, …, Bm in X
– Implies A1 , A2 , … An → B1, B2, … Bm
– We also have B1, B2, … Bm → C– By transitivity we have A1 , A2 , … An → C
• This shows that the algorithm is sound; need to show it is complete
Relational Schema Design(or Logical Design)
Main idea:• Start with some relational schema• Find out its FD’s• Use them to design a better relational
schema
Relational Schema Design
Anomalies:• Redundancy = repeated data• Update anomalies = Fred moves to “Bellevue”• Deletion anomalies = Fred drops all phone numbers:
what is his city ?
Recall set attributes (persons with several phones):
SSN → Name, City, but not SSN → PhoneNumber
Name SSN PhoneNumber CityFred 123-45-6789 206-555-1234 SeattleFred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 WestfieldJoe 987-65-4321 908-555-1234 Westfield
Relation DecompositionBreak the relation into two:
Name SSN CityFred 123-45-6789 SeattleJoe 987-65-4321 Westfield
SSN PhoneNumber123-45-6789 206-555-1234123-45-6789 206-555-6543
987-65-4321 908-555-2121987-65-4321 908-555-1234
Relational Schema Design
Conceptual Model:
Relational Model:plus FD’s
Normalization:Eliminates anomalies
PersonbuysProduct
name
pricename ssndate
Decompositions in GeneralR(A1, ..., An)
Create two relations R1(B1, ..., Bm) and R2(C1, ..., Ck)
such that: B1, ..., Bm C1, ..., Ck = A1, ..., An
and:R1 = projection of R on B1, ..., Bm
R2 = projection of R on C1, ..., Ck
Incorrect Decomposition
• Sometimes it is incorrect:
Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
DoubleClick 29.99 Camera
Decompose on : Name, Category and Price, Category
Incorrect Decomposition
Name Category
Gizmo Gadget
OneClick Camera
DoubleClick Camera
Price Category
19.99 Gadget
24.99 Camera
29.99 Camera
Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
OneClick 29.99 Camera
DoubleClick 24.99 Camera
DoubleClick 29.99 Camera
When we put it back:
Cannot recover information
Normal Forms
First Normal Form = all attributes are atomic
Second Normal Form (2NF) = old and obsolete
Third Normal Form (3NF) = this lecture
Boyce Codd Normal Form (BCNF) = this lecture
Others...
Boyce-Codd Normal FormA simple condition for removing anomalies from relations:
In English:
Whenever a set of attributes of R determines one more attribute, it should determine all the attributes of R.
A relation R is in BCNF if:
Whenever there is a nontrivial dependency A1, ..., An → B
in R , {A1, ..., An} is a key for R
Including superkeys
Example
What are the dependencies?SSN → Name, City
What are the keys?{SSN, PhoneNumber}
Is it in BCNF?
Name SSN PhoneNumber CityFred 123-45-6789 206-555-1234 SeattleFred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 WestfieldJoe 987-65-4321 908-555-1234 Westfield
Decompose it into BCNF
Name SSN CityFred 123-45-6789 SeattleJoe 987-65-4321 Westfield
SSN PhoneNumber123-45-6789 206-555-1234123-45-6789 206-555-6543987-65-4321 908-555-2121987-65-4321 908-555-1234
SSN → Name, City
Summary of BCNF Decomposition
Find a dependency that violates the BCNF condition:
A1 , A2 , … An → B1, B2, … Bm
A’sOthers B’s
R1 R2
Heuristics: choose B1, B2, … Bm “as large as possible”
Decompose:
Exercise: Is there a 2-attribute relation that is not in BCNF ?
Continue untilthere are noBCNF violationsleft.
Example Decomposition Person(name, SSN, age, hairColor, phoneNumber)
SSN → name, ageage → hairColor
Decompose in BCNF (in class):
Step 1: find all keys
Step 2: now decompose
Other Example
• R(A,B,C,D) A → B, B → C• Key: A, D• Violations of BCNF: A → B, B → C, A → C,
A → B,C• Pick A → B,C:
split R into R1(A,B,C), R2(A,D)• What happens if we pick A → B first ?
Correct Decompositions A decomposition is lossless if we can recover: R(A,B,C)
R1(A,B) R2(A,C)
R’(A,B,C) should be the same as R(A,B,C)
R’ is in general larger than R. Must ensure R’ = R
Decompose
Recover
Correct Decompositions
• Given R(A,B,C) s.t. A→B, the decomposition into R1(A,B), R2(A,C) is lossless
3NF: A Problem with BCNF
FD’s: Character Film; Film, Actor CharacterSo, there is a BCNF violation, and we decompose.
Character Film
No FDs
Film Actor Character
Actor Character
Film Character
So What’s the Problem?
No problem so far. All local FD’s are satisfied.Let’s put all the data back into a single table again:
Violates the dependency: Film, Actor Character!
Film CharacterAustin Powers Austin PowersAustin Powers Dr. Evil
Actor CharacterMike Myers Austin PowersMike Myers Dr. Evil
Film Actor CharacterAustin Powers Mike Myers Austin PowersAustin Powers Mike Myers Dr. Evil
What happened
here?
Solution: 3rd Normal Form (3NF)
A simple condition for removing anomalies from relations:
3NF has a dependency-preserving decomposition
A relation R is in 3rd normal form if :
Whenever there is a nontrivial dependency A1, A2, ..., An Bfor R , then {A1, A2, ..., An } a key for R, or B is part of a key.
Including superkeys
Decomposition into 3NF
• Easy when we know BCNF decomposition (how?)
• Not-so-easy: if we want a dependency-preserving decomposition… In the book (3.5)