Post on 16-Jan-2016


Copyright © 2003-2012 Curt Hill

Schema Refinement III

4th NF and 5th NF

Now what?

• An example

• Consider a table that contains courses, instructors and textbooks

• There may be multiple instructors for multiple sections of the class

• There may be multiple textbooks as well

• Both instructors and textbooks come from a set of possibilities

Course/Instructor/Book

Dept  Number  Instructor  Book
CIS   385     221         Smith & Boss
CIS   385     221         Noble
CIS   385     403         Smith & Boss
CIS   385     403         Noble

• Key is entire tuple

• Each instructor uses two books for the course

• There is a redundancy

Commentary

• There is redundancy that we should deal with

• The table is in BCNF
  – No examination of FDs will help us

• The two instructors and two textbooks are both determined by the course department and number

• This is an example of a MultiValued Dependency

Commentary Again

• First normal form disallows repeating groups

• A repeating group is often a set

• A MultiValued Dependency is a set depending on an item

• Examples:
  – People working on many projects
  – Each of these has many dependents

Examples

• In this example the course determines a set of instructors

• The course also determines a set of textbooks

• These two sets are independent

• If the sets are large we get plenty of redundancy and yet are still in BCNF
  – This happens when every book is connected to every instructor of the course
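
The growth in redundancy can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes one course with m instructors and n textbooks, and the third instructor ID (517) and third title ("Third Text") are invented here to enlarge the sets. It compares the row count of the single table with the row count needed when each set is stored on its own.

```python
from itertools import product

# Instructors and books for one course. 221, 403, "Smith & Boss" and "Noble"
# come from the slides; 517 and "Third Text" are invented for illustration.
instructors = [221, 403, 517]
books = ["Smith & Boss", "Noble", "Third Text"]

# Single BCNF table: every instructor is paired with every book.
single_table = [("CIS", 385, i, b) for i, b in product(instructors, books)]

# Each set stored on its own: one row per instructor, one row per book.
instructor_table = [("CIS", 385, i) for i in instructors]
book_table = [("CIS", 385, b) for b in books]

rows_single = len(single_table)                        # m * n rows
rows_split = len(instructor_table) + len(book_table)   # m + n rows
print(rows_single, rows_split)
```

With three instructors and three books the single table needs 9 rows where the separate sets need 6, and the gap widens as either set grows.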

MultiValued Dependency

• An MVD determines a set of values rather than a single value

• Notation is two arrows

• Dept,Number →→ Instructor and

• Dept,Number →→ Book

• The correct decomposition is splitting teacher from book

Course/Instructor/Book

Dept  Number  Instructor  Book
CIS   385     221         Smith & Boss
CIS   385     221         Noble
CIS   385     403         Smith & Boss
CIS   385     403         Noble

Project into

Dept  Num  Instruct
CIS   385  221
CIS   385  403

Dept  Num  Book
CIS   385  Smith & Boss
CIS   385  Noble
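
The projection can be checked mechanically. The following is a minimal sketch, assuming the relation is held as a Python set of tuples; the names `teaches` and `uses` are hypothetical labels for the two projected tables and do not appear in the slides. It projects the course table onto the two smaller schemas and confirms that joining them on Dept and Number gives back exactly the original rows.

```python
# The course table from the slide, as (Dept, Number, Instructor, Book) tuples.
course = {
    ("CIS", 385, 221, "Smith & Boss"),
    ("CIS", 385, 221, "Noble"),
    ("CIS", 385, 403, "Smith & Boss"),
    ("CIS", 385, 403, "Noble"),
}

# Projections; building sets removes the duplicate rows automatically.
teaches = {(d, n, i) for (d, n, i, b) in course}   # Dept, Num, Instruct
uses = {(d, n, b) for (d, n, i, b) in course}      # Dept, Num, Book

# Natural join of the two projections on (Dept, Number).
rejoined = {
    (d, n, i, b)
    for (d, n, i) in teaches
    for (d2, n2, b) in uses
    if (d, n) == (d2, n2)
}

print(len(teaches), len(uses), rejoined == course)
```

Each projection holds two rows, and the join reproduces the four original rows with nothing added, so the decomposition is lossless for this data.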

Fourth Normal Form

• The above two tables are in 4th NF

• A table is in 4th NF if and only if
  – The table is in BCNF
  – Every non-trivial MVD is also an FD

• If there are no MVDs then BCNF is also 4NF

Another View of 4th NF

• If a relation is in 4th NF then for each MVD X →→ A, one of the following must hold

• The MVD is trivial
  – A is part of X or
  – XA is the whole relation

• X is a superkey

Is this 4th NF?

Dept  Number  Instructor  Book
CIS   385     221         Smith & Boss
CIS   385     221         Noble
CIS   385     403         Smith & Boss
CIS   385     403         Noble

• There are two MVDs
  – Dept,Number →→ Instructor
  – Dept,Number →→ Book

• Trivial MVDs? - No

• Dept,Number superkey? - No

• So the table is not in 4th NF
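
The same test can be sketched in code. This is an illustration under stated assumptions rather than a general normalizer: it works on one instance of the data (so it can only report what holds for these rows, not for every legal state of the table), it assumes X and Y are disjoint attribute lists, and all function names are hypothetical. An MVD X →→ Y is taken to hold when the table equals the join of its projections onto XY and onto X plus the remaining attributes.

```python
def project(rows, schema, attrs):
    """Project rows (tuples laid out as in schema) onto attrs, dropping duplicates."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(r[i] for i in idx) for r in rows}

def mvd_holds(rows, schema, x, y):
    """X ->> Y holds on this instance iff R = pi_XY(R) joined with pi_X,rest(R)."""
    rest = [a for a in schema if a not in x and a not in y]
    left = project(rows, schema, x + y)
    right = project(rows, schema, x + rest)
    nx = len(x)
    joined = set()
    for l in left:
        for r in right:
            if l[:nx] == r[:nx]:  # rows agree on X
                combined = dict(zip(x + y + rest, l + r[nx:]))
                joined.add(tuple(combined[a] for a in schema))
    return joined == set(rows)

def is_trivial(schema, x, y):
    """Trivial if Y is part of X, or X and Y together are the whole relation."""
    return set(y) <= set(x) or set(x) | set(y) == set(schema)

def is_superkey(rows, schema, x):
    """Instance-level check: no two rows share the same X value."""
    return len(project(rows, schema, x)) == len(set(rows))

def violates_4nf(rows, schema, x, y):
    return (mvd_holds(rows, schema, x, y)
            and not is_trivial(schema, x, y)
            and not is_superkey(rows, schema, x))

schema = ["Dept", "Number", "Instructor", "Book"]
course = {
    ("CIS", 385, 221, "Smith & Boss"), ("CIS", 385, 221, "Noble"),
    ("CIS", 385, 403, "Smith & Boss"), ("CIS", 385, 403, "Noble"),
}
key = ["Dept", "Number"]
print(violates_4nf(course, schema, key, ["Instructor"]))
print(violates_4nf(course, schema, key, ["Book"]))

# The projected instructor table: the MVD is trivial, so there is no violation.
schema2 = ["Dept", "Num", "Instruct"]
teaches = {("CIS", 385, 221), ("CIS", 385, 403)}
print(violates_4nf(teaches, schema2, ["Dept", "Num"], ["Instruct"]))
```

Both MVDs on the four-column table are reported as violations, while the projected table passes because its only MVD covers the whole relation.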

Is this 4th NF?

Dept  Num  Instruct
CIS   385  221
CIS   385  403

• There is one MVD
  – Dept,Num →→ Instructor

• Trivial MVD?
  – Yes, this is whole relation

Decomposability

• A strange thing happens:

• There are relations that cannot be lossless join decomposed into two relations

• But they can be decomposed into a larger number of relations

• The following example shows a relation that can be decomposed into three but not two

Example

Table A

S  P  J
1  1  2
1  2  1
2  1  1
1  1  1

What about this?

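• What is the key?
  – Entire tuple
  – So the table is in BCNF, and here it is in 4th NF as well

• What MVDs?
  – Candidates are S →→ P, S →→ J and P →→ J, among others
  – None of them holds on this data: splitting the table in two along any of them gives a join with an extra row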

Decomposition

• In the next slide we will see the table decomposed into tables of two fields

• However, no two of them can be joined into the original without extra rows

• All three of them can be joined into the original

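Example Decomposed

Table A

S  P  J
1  1  2
1  2  1
2  1  1
1  1  1

Projections of A onto two fields (labelled B, C and D)

S  J
1  2
1  1
2  1

S  P
1  1
1  2
2  1

P  J
1  2
2  1
1  1

Join of the S J and S P projections (one extra row, 1 2 2)

S  P  J
1  1  2
1  2  2
1  2  1
2  1  1
1  1  1

Join of all three projections (the original A)

S  P  J
1  1  2
1  2  1
2  1  1
1  1  1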

What Just Happened?

• A could not be lossless join decomposed into any two of {B, C, D}
  – A lossless split into just two tables would require an MVD, and no non-trivial MVD holds on A

• It could be lossless join decomposed into all three

• There is a join dependency between A and {B, C, D}

• There is no join dependency between any of
  – A and {B, C}
  – A and {B, D}
  – A and {C, D}
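
These claims are easy to verify by brute force. The sketch below assumes table A is stored as a set of (S, P, J) tuples and names the projections `SP`, `SJ` and `PJ` after their columns; those variable names are chosen here for readability and are not the slide labels. It builds each pairwise join, then the three-way join, and compares each result with A.

```python
# Table A from the slides, as (S, P, J) tuples.
A = {(1, 1, 2), (1, 2, 1), (2, 1, 1), (1, 1, 1)}

# The three two-column projections.
SP = {(s, p) for (s, p, j) in A}
SJ = {(s, j) for (s, p, j) in A}
PJ = {(p, j) for (s, p, j) in A}

# Every pairwise natural join, each on the single shared column.
sp_sj = {(s, p, j) for (s, p) in SP for (s2, j) in SJ if s == s2}
sp_pj = {(s, p, j) for (s, p) in SP for (p2, j) in PJ if p == p2}
sj_pj = {(s, p, j) for (s, j) in SJ for (p, j2) in PJ if j == j2}

# Three-way join: keep rows of SP join SJ whose (P, J) pair is also in PJ.
all_three = {(s, p, j) for (s, p, j) in sp_sj if (p, j) in PJ}

print(sp_sj - A)       # spurious rows from joining SP with SJ
print(sp_pj - A)       # spurious rows from joining SP with PJ
print(sj_pj - A)       # spurious rows from joining SJ with PJ
print(all_three == A)  # the three-way join is lossless
```

Each pairwise join contains one row that is not in A, and only the join of all three projections removes it.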

Join Dependencies

• A Join Dependency ⋈ {R1, R2, … RN} holds over R if R1, R2, … RN is a lossless join decomposition of R
  – In other words, joining R1, R2, … RN gives R

• Notation: ⋈ {R1, R2, … RN}

• A JD is a generalization of MVDs
  – An MVD is a join dependency with exactly two components

• In the previous example, the constraint linking the S P, S J and P J projections cannot be stated as MVDs, but it may be expressed as the join dependency ⋈ {B, C, D}

Trivial Join Decompositions

• The join dependency {R1,R2,…RN} on R is trivial iff
  – At least one of R1,R2,…RN is the set of all attributes of R
  – In other words, there is a relation equivalent to R in the decomposition

• Joining R with any of its projections reproduces R, so a trivial join dependency always holds

Implied Join Dependencies

• Suppose the join dependency {R1,R2,…RN} holds on R

• This Join Dependency is implied by the Candidate Key(s) iff
  – Each of R1,R2,…RN is a superkey for R

Fifth Normal Form

• 5th NF is also known as: Projection Join Normal Form (PJNF)

• A relation R is in 5th NF if and only if every non-trivial join dependency that is satisfied by R is implied by the candidate key(s) of R

Is this in 5th NF?

S  P  J
1  1  2
1  2  1
2  1  1
1  1  1

• There is a non-trivial join decomposition, {B,C,D}
  – None of these are A

• This decomposition is not implied by the only candidate key, SPJ
  – None of these contain SPJ

• No – not in 5NF
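
The two questions asked on this slide can be written as set tests. The sketch below is a simplification that follows the criterion used in these slides (a join dependency is implied by the candidate keys when every component is a superkey); it assumes the join dependency is already known to hold on the relation, and the function names are hypothetical.

```python
def jd_is_trivial(schema, components):
    """Trivial if some component is the full attribute set of the relation."""
    return any(set(c) == set(schema) for c in components)

def jd_implied_by_keys(components, candidate_keys):
    """Slide criterion: every component contains some candidate key."""
    return all(
        any(set(k) <= set(c) for k in candidate_keys)
        for c in components
    )

def jd_violates_5nf(schema, components, candidate_keys):
    """A holding JD breaks 5th NF if it is non-trivial and not implied by the keys."""
    return (not jd_is_trivial(schema, components)
            and not jd_implied_by_keys(components, candidate_keys))

schema = {"S", "P", "J"}
candidate_keys = [{"S", "P", "J"}]           # the entire tuple is the only key
jd = [{"S", "J"}, {"S", "P"}, {"P", "J"}]    # the three two-field projections

print(jd_is_trivial(schema, jd))
print(jd_implied_by_keys(jd, candidate_keys))
print(jd_violates_5nf(schema, jd, candidate_keys))

# For contrast, a JD that includes the whole relation is trivial.
print(jd_violates_5nf(schema, [{"S", "P", "J"}, {"S", "P"}], candidate_keys))
```

For the S P J table the dependency is non-trivial and no component contains the key, so the check reports a 5th NF violation; the contrasting dependency is trivial and reports none.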

Is 5th NF the Ultimate?

• It is the ultimate that can be obtained with just projections
  – The guaranteed best in terms of a lack of anomalies that can be removed by projections

• Hence the name Projection Join Normal Form

• However, there may be some anomalies that cannot be eliminated with just projections

JDs and FDs

• FDs and MVDs have a set of inference rules
  – This allows us to reason about them

• JDs lack this set

• Thus finding JDs and using them to move to 5th NF has its problems

• We do have one tool

3NF and 5NF

• If a relation is in 3rd NF and each of its keys is atomic (a single attribute) then the relation is also in 5th NF
  – The same may be said of BCNF

• There may be 5th NF relations that do not have atomic keys

• When we can apply this we can determine the table is in 5th NF without any consideration of JDs

Denormalization

• The argument against making everything 5th NF:
  – Lots of separate relations
  – These relations become separate files
  – This means lots of I/O

• Since SQL cannot separate a relation from a file, the argument has some merit

Conclusion

• MVDs are much less common than FDs

• Thus tables that are in BCNF are very often in 4NF and 5NF as well, because there are no non-trivial MVDs or JDs

• MVDs are also harder to observe and reason about

• Thus 3NF and BCNF are the most common normal forms
