international computer institute, izmir, turkey database design and normal forms asst.prof.dr.İlker...

56
International Computer Institute, Izmir, Turkey International Computer Institute, Izmir, Turkey Database Design and Normal Database Design and Normal Forms Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at http://ube.ege.edu.tr/~ikocabas

Upload: dana-bridges

Post on 25-Dec-2015

223 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

International Computer Institute, Izmir, TurkeyInternational Computer Institute, Izmir, Turkey

Database Design and Normal FormsDatabase Design and Normal Forms

Asst.Prof.Dr.İlker Kocabaş

UBİ502 at http://ube.ege.edu.tr/~ikocabas

Page 2: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.2 of 56

UBI 502

Database Management Systems

Database Design and Normal FormsDatabase Design and Normal Forms

First Normal Form

Functional Dependencies

Decomposition

Boyce-Codd Normal Form

Database Design Process

Page 3: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.3 of 56

UBI 502

Database Management Systems

First Normal FormFirst Normal Form

A domain is atomic if its elements are considered to be indivisible units Examples of non-atomic domains:

• Set-valued attributes, composite attributes

• Identifiers like UBİ502 that can be broken up into parts

A relational schema R is in first normal form if the domains of all attributes of R are atomic

Non-atomic values complicate storage

encourage redundancy

interpretation of non-atomic values built into application programs

• $cid = substring( $result [ “course-id” ], 1, 3 );

Page 4: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.4 of 56

UBI 502

Database Management Systems

First Normal Form (cont)First Normal Form (cont)

Atomicity: not an intrinsic property of the elements of the domain

Atomicity is a property of how the elements of the domain are used E.g. strings containing a possible delimiter (here: space)

• cities = “Melbourne Sydney” (non-atomic: space separated list)

• surname = “Fortescue Smythe” (atomic: compound surname)

E.g. strings encoding two separate fields

• student_id = CS1234

• If the first two characters are extracted to find the department, the domain of student identifiers is not atomic

• leads to encoding of information in application program rather than in the database

Page 5: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.5 of 56

UBI 502

Database Management Systems

PitfallsPitfalls (Traps) (Traps) in Relational Database Design in Relational Database Design

Relational database design requires that we find a “good” collection of relation schemas

A bad design may lead to redundant information difficulty in representing certain information difficulty in checking integrity constraints

Design Goals: Avoid redundant data Ensure that relationships among attributes are

represented Facilitate the checking of updates for violation of

integrity constraints

Page 6: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.6 of 56

UBI 502

Database Management Systems

Example of Bad DesignExample of Bad Design Consider the relation schema:

Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

Redundant Information: Data for branch-name, branch-city, assets are repeated for each loan that

a branch makes

Wastes space and complicates updates, introducing possibility of inconsistency of assets value

Difficulty representing certain information: Cannot store information about a branch if no loans exist

Can use null values, but they are difficult to handle

Page 7: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.7 of 56

UBI 502

Database Management Systems

Solution: DecompositionSolution: Decomposition

Break up such redundant tables into multiple tables this operation is called decomposition

E.g. consider Lending-schema again:

Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

now decompose as follows:

Branch-schema = (branch-name, branch-city,assets)

Loan-info-schema = (customer-name, loan-number, branch-name, amount)

Want to ensure that the original data is recoverable

1. all attributes of the original schema (R) must appear in the decomposition (R1, R2), i.e. R = R1 R2

2. decomposition must be a lossless-join decomposition

Page 8: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.8 of 56

UBI 502

Database Management Systems

Lossless-Join Decomposition: DefinitionLossless-Join Decomposition: Definition

Let R, R1, R2 be schemas and where R = R1 R2

R1, R2 is a lossless-join decomposition of R

if, for all possible relations r(R)

r = R1 ( r ) ⋈ R2 ( r )

Here “possible” means “meaningful in the context of the particular database design” we will formalize this notion using functional

dependencies

Page 9: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.9 of 56

UBI 502

Database Management Systems

Lossless-Join Decomposition: ExampleLossless-Join Decomposition: Example

Example of Non Lossless-Join Decomposition

Decomposition of R = (A, B)R2 = (A) R2 = (B)

A B

121

A

B

12

rA ( r )

B ( r )

A ( r ) ⋈ B ( r )

A B

1212

Thus, r is different to A (r) ⋈ B (r) and so A,B is not a lossless-join decomposition of R.

Page 10: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.10 of 56

UBI 502

Database Management Systems

Goal — Formalize the notion of good designGoal — Formalize the notion of good design

Process: Decide whether a particular relation R is in “good” form.

In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that

• each relation is in good form

• the decomposition is a lossless-join decomposition

Our theory is based on functional dependencies Constraints on the set of legal relations

Require that the value for a certain set of attributes determines uniquely the value for another set of attributes

generalizes the notion of a key

Functional dependencies allow us to formalize good database design

Page 11: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.11 of 56

UBI 502

Database Management Systems

Functional Dependencies: DefinitionFunctional Dependencies: Definition Let R be a relation schema

R and R The functional dependency (FD) holds on R iff

for any legal relations r(R)

whenever any two tuples t1 and t2 of r agree on the attributes they also agree on the attributes

i.e. ( t1 ) = ( t2 ) ( t1 ) = ( t2 )

Example: Consider r(A,B) with the following instance of r:

On this instance, A B does NOT hold, but B A does hold

1 41 53 7

Page 12: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.12 of 56

UBI 502

Database Management Systems

Functional Dependency : Functional Dependency : Another Another DefinitionDefinition

A functional dependency occurs when the value of one (set of) attribute(s) determines the value of a second (set of) attribute(s):

StudentID StudentName

StudentID (DormName, DormRoom, Fee)

The attribute on the left side of the functional dependency is called the determinant.

Functional dependencies may be based on equations:

ExtendedPrice = Quantity X UnitPrice

(Quantity, UnitPrice) ExtendedPrice

Function dependencies are not equations!

Page 13: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.13 of 56

UBI 502

Database Management Systems

Composite DeterminantsComposite Determinants

Composite determinant = a determinant of a functional dependency that consists of more than one attribute

(StudentName, ClassName) (Grade)

Functional Dependency RulesFunctional Dependency Rules

If A (B, C), then A B and A C.

If (A,B) C, then neither A nor B determines C by itself.

Page 14: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.14 of 56

UBI 502

Database Management Systems

Functional Dependencies: VisualizationFunctional Dependencies: Visualization

t

u

General form of a FD: A1...An B1...Bm

A1...An

if t and u agree here

B1...Bm

then they must also agree here

Page 15: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.15 of 56

UBI 502

Database Management Systems

Functional Dependencies vs KeysFunctional Dependencies vs Keys

FDs can express the same constraints we could express using keys:

Superkeys: K is a superkey for relation schema R if and only if K R

Candidate keys: K is a candidate key for R if and only if

• K R, and

• there is no K’ K such that K’ R

However,FDs are more general i.e. we can express constraints that cannot be expressed using

keys

Page 16: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.16 of 56

UBI 502

Database Management Systems

Functional Dependencies vs Keys (cont)Functional Dependencies vs Keys (cont)

Example of FDs that can’t be represented using keys:

Consider the following Loan-info-schema:

Loan-info-schema = (customer-name, loan-number, branch-name, amount).

We expect these FDs to hold:

loan-number amountloan-number branch-name

We could try to express this by making loan-number the key, however the following FD does not hold:

loan-number customer-name

Page 17: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.17 of 56

UBI 502

Database Management Systems

Functional Dependencies (cont)Functional Dependencies (cont)

Movies(title, year, length, studioName, starName)title year length studioName starName

Star Wars 1977 124 Fox Carrie Fisher

Star Wars 1977 124 Fox Harrison Ford

Mighty Ducks 1991 104 Disney Emilio Estevez

Wayne’s World 1992 95 Paramount Dana Carvey

Wayne’s World 1992 95 Paramount Mike Meyers

FD: title, year length, studioName

not an FD: title, year starName

candidate key, a minimal K such that K R propose: K = {title, year, starName}

check: does K functionally determine R?

to answer this question we’ll need to look at closures

Page 18: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.18 of 56

UBI 502

Database Management Systems

Functional Dependencies (cont)Functional Dependencies (cont)

An FD is an assertion about a schema, not an instance

If we only consider an instance, we can’t tell if an FD holds e.g. inspecting the movies relation, we might suggest that

length title, since no two films in the table have the same length

However, we cannot assert this FD for the movies relation, since we know it is not true of the domain in general

Thus, identifying FDs is part of the data modelling process

Page 19: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.19 of 56

UBI 502

Database Management Systems

Modification AnomaliesModification Anomalies

Deletion anomaly

Insertion anomaly

Update anomaly

Movies(title, year, length, studioName, starName)

title year length studioName starName

Star Wars 1977 124 Fox Carrie Fisher

Star Wars 1977 124 Fox Harrison Ford

Mighty Ducks 1991 104 Disney Emilio Estevez

Wayne’s World 1992 95 Paramount Dana Carvey

Wayne’s World 1992 95 Paramount Mike Meyers

Update lenght on Row-1 is an anomaly, two different lenghts are recorded.

Page 20: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.20 of 56

UBI 502

Database Management Systems

Normal FormsNormal Forms

Relations are categorized as a normal form based on which modification anomalies or other problems they are subject to:

Page 21: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.21 of 56

UBI 502

Database Management Systems

Normal FormsNormal Forms

1NF—a table that qualifies as a relation is in 1NF. 2NF—a relation is in 2NF if all of its non-key attributes

are dependent on all of the primary keys. 3NF—a relation is in 3NF if it is in 2NF and has no

determinants except the primary key. Boyce-Codd Normal Form (BCNF)—a relation is in

BCNF if every determinant is a candidate key.

“I swear to construct my tables so that all non-keycolumns are dependent on the key, the whole key and nothing but the key, so help me Codd.”

Page 22: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.22 of 56

UBI 502

Database Management Systems

Eliminating Modification Anomalies from Eliminating Modification Anomalies from Functional Dependencies in Relations:Functional Dependencies in Relations:

Put All Relations into BCNFPut All Relations into BCNF

Page 23: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.23 of 56

UBI 502

Database Management Systems

Functional Dependencies: UsesFunctional Dependencies: Uses

We use FDs to: test relations to see if they are legal under a given set of FDs

• If a relation r is legal under a set F of FDs, we say that r satisfies F

specify constraints on the set of legal relations

• We say that F holds on R if all legal relations on R satisfy the set of FDs F

Note: A specific instance of a relation schema may satisfy an FD even if the FD does not hold on all legal instances. For example, a specific instance of Loan-schema may, by

chance, satisfy loan-number customer-name

Page 24: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.24 of 56

UBI 502

Database Management Systems

Aside: Trivial Functional DependenciesAside: Trivial Functional Dependencies

An FD is trivial if it is satisfied by all instances of a relation E.g.

• customer-name, loan-number customer-name

• customer-name customer-name

In general, is trivial if

Permitting such FDs makes certain definitions and algorithms easier to state

Page 25: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.25 of 56

UBI 502

Database Management Systems

FD Closure: DefinitionFD Closure: Definition

Given a set F of fds, there are other FDs logically implied by F E.g. If A B and B C, then we can infer that A C

The set of all FDs implied by F is the closure of F, written F+

We can find all of F+ by applying Armstrong’s Axioms: if , then (reflexivity) if , then (augmentation) if , and , then (transitivity)

Additional rules (derivable from Armstrong’s Axioms): If holds and holds, then holds (union) If holds, then holds and holds (decomposition) If holds and holds, then holds

(pseudotransitivity)

Page 26: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.26 of 56

UBI 502

Database Management Systems

FD Closure: ExampleFD Closure: Example R = (A, B, C, G, H, I)

F = { A B A CCG HCG I B H}

some members of F+

A H

• by transitivity from A B and B H

AG I

• by augmenting A C with G, to get AG CG and then transitivity with CG I

CG HI

• by union rule with CG H and CG I

Page 27: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.27 of 56

UBI 502

Database Management Systems

Computing FD ClosureComputing FD Closure

To compute the closure of a set of FDs F:

F+ = Frepeat

for each FD f in F+

apply reflexivity and augmentation rules on f add the resulting FDs to F+

for each pair of FDs f1and f2 in F+

if f1 and f2 can be combined using transitivity

then add the resulting FD to F+

until F+ does not change any further

(NOTE: More efficient algorithms exist)

Page 28: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.28 of 56

UBI 502

Database Management Systems

Minimal Cover of an FD SetMinimal Cover of an FD Set

The opposite of closure: what is the “minimal” set of FDs equivalent to F, having no redundant FDs (or extraneous attributes)

Sets of FDs may have redundant FDs that can be inferred from the others Eg: A C is redundant in: {A B, B C, A C}

Parts of an FD may be redundant

• E.g. on RHS: {A B, B C, A CD} can be simplified to {A B, B C, A D}

• E.g. on LHS: {A B, B C, AC D} can be simplified to {A B, B C, A D}

(We’ll cover these later under the heading of extraneous attributes)

(NB Textbook calls this “canonical” cover, though there is no guarantee of uniqueness.)

Page 29: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.29 of 56

UBI 502

Database Management Systems

Closure of Attribute SetsClosure of Attribute Sets

Given a set of attributes define the closure of under F (denoted by +) as the set of attributes that are functionally determined by under F:

is in F+ +

Algorithm to compute +, the closure of under Fresult := ;while (changes to result) do

for each in F dobegin

if result then result := result end

Page 30: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.30 of 56

UBI 502

Database Management Systems

Closure of Attribute Sets: ExampleClosure of Attribute Sets: Example R = (A, B, C, G, H, I) F = {A B

A C CG HCG IB H}

(AG)+

1. result = AG

2. result = ABCG (A C and A B)

3. result = ABCGH (CG H and CG AGBC)

4. result = ABCGHI (CG I and CG AGBCH)

Is AG a candidate key? 1. Is AG a superkey?

1. Does AG R? == Is (AG)+ R

2. Is any subset of AG a superkey?

1. Does A R? == Is (A)+ R

2. Does G R? == Is (G)+ R

Page 31: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.31 of 56

UBI 502

Database Management Systems

Testing for superkey: To test if is a superkey, we compute +, and check if +

contains all attributes of R

Testing FDs

To check if a FD holds (or, in other words, is in F+), just check if +

i.e. compute + by using attribute closure, and then check if it contains

Is a simple and cheap test, and very useful

Closure of Attribute Sets: UsesClosure of Attribute Sets: Uses

Page 32: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.32 of 56

UBI 502

Database Management Systems

Extraneous AttributesExtraneous Attributes Recall that we could have redundant FDs. Parts of FDs can

also be redundant

Consider a set F of FDs and the FD in F. Attribute A is extraneous in if A

and F logically implies (F – { }) {( – A) }.

Attribute A is extraneous in if A and the set of functional dependencies (F – { }) { ( – A)} logically implies F.

Example: Given F = {A C, AB C } B is extraneous in AB C because {A C, AB C} logically

implies A C (I.e. the result of dropping B from AB C).

Example: Given F = {A C, AB CD} C is extraneous in AB CD since AB C can be inferred

even after deleting C

Page 33: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.33 of 56

UBI 502

Database Management Systems

DecompositionDecomposition

Decompose the relation schema Lending-schema into:

Branch-schema = (branch-name, branch-city, assets)

Loan-info-schema = (customer-name, loan-number, branch-name, amount)

All attributes of an original schema (R) must appear in the decomposition (R1, R2):

R = R1 R2

Lossless-join decomposition.For all possible relations r on schema R

r = R1 (r) ⋈ R2 (r)

A decomposition of R into R1 and R2 is lossless-join if and only if at least one of the following dependencies is in F+: R1 R2 R1

R1 R2 R2

Page 34: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.34 of 56

UBI 502

Database Management Systems

Decomposition Using Functional DependenciesDecomposition Using Functional Dependencies

When we decompose a relation schema R with a set of FDs F into R1, R2,.., Rn we want

1. Lossless-join decomposition: Otherwise decomposition would result in information loss

2. No redundancy: The relations Ri should be in BCNF

3. Dependency preservation: Let Fi be the set of FDs F+ that include only attributes in Ri

Preferably the decomposition should be dependency preserving, that is, (F1 F2 … Fn)

+ = F+

Otherwise, checking updates for violation of FDs may require computing joins, which is expensive

Page 35: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.35 of 56

UBI 502

Database Management Systems

ExampleExample

R = (A, B, C)F = {A B, B C) Can be decomposed in two different ways

R1 = (A, B), R2 = (B, C)

Lossless-join decomposition:

R1 R2 = {B} and B BC

Dependency preserving

R1 = (A, B), R2 = (A, C)

Lossless-join decomposition:

R1 R2 = {A} and A AB

Not dependency preserving (cannot check B C without computing R1 ⋈ R2)

Page 36: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.36 of 56

UBI 502

Database Management Systems

SummarySummary

First Normal Form

Functional Dependencies

Decomposition to eliminate redundancy

lossless-join

dependency preserving

Next Up:

Boyce-Codd Normal Form

Database Design Process

Page 37: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.37 of 56

UBI 502

Database Management Systems

Boyce-Codd Normal Form (BCNF)Boyce-Codd Normal Form (BCNF)

A relation schema R is in BCNF with respect to a set F of FDs if

for all FDs in F+ of the form where R and R

at least one of the following holds:

is trivial (i.e., ), or

is a superkey for R

Page 38: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.38 of 56

UBI 502

Database Management Systems

ExampleExample

R = (A, B, C)F = {A B

B C}Key = {A}

R is not in BCNF

Decomposition R1 = (A, B), R2 = (B, C)

R1 and R2 in BCNF

Lossless-join decomposition

Dependency preserving

Question: How do we decompose a schema to get BCNF schemas in the general case?

Page 39: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.39 of 56

UBI 502

Database Management Systems

BCNF DecompositionBCNF Decomposition

First, we need a method to check if a non-trivial dependency on R violates BCNF

1. compute + (the attribute closure of ), and

2. verify that it includes all attributes of R ie. + is a superkey of R

3. if not, then violates BCNF

Page 40: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.40 of 56

UBI 502

Database Management Systems

BCNF Decomposition AlgorithmBCNF Decomposition Algorithm

result := {R};done := false;compute F+;while (not done) do

if (there is a schema Ri in result that is not in BCNF)then begin

let be a nontrivial functionaldependency that holds on Ri

such that Ri is not in F+, and = ;

result := (result – Ri ) (Ri – ) (, ); end

else done := true;

Note: each Ri is in BCNF, and decomposition is lossless-join

Page 41: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.41 of 56

UBI 502

Database Management Systems

Example of BCNF DecompositionExample of BCNF Decomposition

R = (branch-name, branch-city, assets,

customer-name, loan-number, amount)

F = {branch-name assets branch-city

loan-number amount branch-name}

Key = {loan-number, customer-name}

Is R in BCNF? Are there non-trivial FDs in which the LHS is not a superkey?

FD: branch-name assets branch-city

• Is branch-name a superkey? (no)

FD: loan-number amount branch-name

• Is loan-number a superkey? (no)

Page 42: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.42 of 56

UBI 502

Database Management Systems

Example of BCNF Decomposition (cont)Example of BCNF Decomposition (cont)

R = (branch-name, branch-city, assets,

customer-name, loan-number, amount)

F = {branch-name assets branch-city

loan-number amount branch-name}

BCNF Decomposition consider FD branch-name assets branch-city

= branch-name, = assets branch-city

• result := (result – Ri ) (Ri – ) (, );

Replace R with and R-• R1: = (branch-name, assets, branch-city)

• R2: R- = (branch-name, customer-name, loan-number, amount)

Page 43: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.43 of 56

UBI 502

Database Management Systems

Example of BCNF Decomposition (cont)Example of BCNF Decomposition (cont)

R1 = (branch-name, assets, branch-city)R2 = (branch-name, customer-name, loan-number, amount)

F = {branch-name assets branch-city

loan-number amount branch-name}

R1 is in BCNF, R2 is not in BCNF

BCNF Decomposition consider FD loan-number amount branch-name

= loan-number, = amount branch-name

Replace R2 with and R2-

• R3: = (branch-name, loan-number, amount)

• R4: R- = (customer-name, loan-number)

Page 44: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.44 of 56

UBI 502

Database Management Systems

Example of BCNF Decomposition (cont)Example of BCNF Decomposition (cont)

R1 = (branch-name, assets, branch-city)R3 = (branch-name, loan-number, amount)R4 = (customer-name, loan-number)

F = {branch-name assets branch-city

loan-number amount branch-name}

All relations are now BCNF!

Why does it work – i.e. why is this a lossless-join decomposition?

Page 45: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.45 of 56

UBI 502

Database Management Systems

Why are BCNF DecompositionsWhy are BCNF Decompositionslossless-join?lossless-join?

A1...An B1...Bm

A’sB’s others

A’sB’s

others

For every combination of A’s with others, we repeat the B’s

So put the B’s in a separate table R1, for which the A’s are keys, and put the remainder in R2

R1 R2

R

Page 46: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.46 of 56

UBI 502

Database Management Systems

Why are BCNF DecompositionsWhy are BCNF Decompositionslossless-join? (cont)lossless-join? (cont)

r = R1 (r) ⋈ R2 (r) ? Consider R = (A,B,C), FD B C not in BCNF

BCNF decomposition gives us: R1 = (B, C), R2 = (A, B)

Do we lose any tuples in R1 (r) ⋈ R2 (r) ? Let t = (a,b,c) be a tuple in r

t projects as (b,c) for R1 and (a,b) for R2

joining these tuples gives us t back again

thus, we don’t lose any tuples, and so r is contained in R1 (r) ⋈ R2 (r)

Do we gain any tuples in R1 (r) ⋈ R2 (r) ? Let t = (a,b,c) and u = (d,b,e) be tuples in r By projecting and joining them, can we create (a,b,e) or (d,b,c)? Since B C we know that c=e So we can’t create any tuple we didn’t already have

Thus, the FD ensures r contains R1 (r) ⋈ R2 (r)

Therefore r = R1 (r) ⋈ R2 (r)

Page 47: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.47 of 56

UBI 502

Database Management Systems

BCNF and Dependency PreservationBCNF and Dependency Preservation

R = (J, K, L)F = {JK L

L K}Two candidate keys = JK and JL

R is not in BCNF

Any decomposition of R will fail to preserve

JK L

Two solutions: test FDs across relations

use Third Normal Form

It is not always possible to get a BCNF decomposition that is dependency preserving

Page 48: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.48 of 56

UBI 502

Database Management Systems

Testing for FDs Across RelationsTesting for FDs Across Relations

Suppose that is a dependency not preserved in a decomposition

Create a new materialized view for The materialized view is defined as a projection on of the

join of the relations in the decomposition

Many database systems support materialized views

No extra coding effort for programmer

Declare as a candidate key on the materialized view

Checking for candidate key is cheaper than checking

The down-side: Space overhead: for storing the materialized view

Time overhead: Need to keep materialized view up to date

Database system may not support key declarations on materialized views

Page 49: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.49 of 56

UBI 502

Database Management Systems

Aside 1: Third Normal FormAside 1: Third Normal Form

There are some situations where BCNF is not dependency preserving, and

efficient checking for FD violations is important

Solution: define a weaker normal form, called Third Normal Form. Allows some redundancy

FDs can be checked on individual relations without computing any joins

There is always a lossless-join, dependency-preserving decomposition into 3NF

Details are beyond the scope of this course

Page 50: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.50 of 56

UBI 502

Database Management Systems

Aside 2: SQL Support for FDsAside 2: SQL Support for FDs

SQL does not provide a direct way of specifying functional dependencies other than superkeys

Can specify FDs using assertions assertions must express the following type of constraint

(t1) = (t2) (t1) = (t2)

these are expensive to test (especially if LHS of FD not a key)

Page 51: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.51 of 56

UBI 502

Database Management Systems

Design GoalsDesign Goals

Goal for a relational database design is: eliminate redundancies by decomposing relations

must be able to recover original data using lossless joins

BCNF: no redundancies

no guarantee of dependency preservation

(3NF: dependency preservation, but redundancies)

Page 52: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.52 of 56

UBI 502

Database Management Systems

Overall Database Design ProcessOverall Database Design Process

We have assumed schema R is given R could have been generated when converting E-R diagram to a

set of tables.

R could have been a single relation containing all attributes that are of interest (called universal relation).

Normalization breaks R into smaller relations.

R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.

Page 53: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.53 of 56

UBI 502

Database Management Systems

E-R Model and NormalizationE-R Model and Normalization

When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization

However, in a real (imperfect) design there can be FDs from non-key attributes of an entity to other attributes of the entity

The keys identified in our E-R diagram might not be minimal (only FDs force us to identify minimal keys)

E.g. employee entity with attributes department-number and department-address, and an FD department-number department-address Good design would have made department an entity

FDs from non-key attributes of a relationship set are possible, but rare

Page 54: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.54 of 56

UBI 502

Database Management Systems

May want to use non-normalized schema for performance

E.g. displaying customer-name along with account-number and balance requires join of account with depositor

Alternative 1: Use denormalized relation containing attributes of account as well as depositor with all above attributes faster lookup

extra space and extra execution time for updates

extra coding work for programmer and possibility of error in extra code

Alternative 2: use a materialized view defined as account ⋈ depositor benefits and drawbacks same as above, except no extra coding

work for programmer and avoids possible errors

Denormalization for PerformanceDenormalization for Performance

Page 55: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.55 of 56

UBI 502

Database Management Systems

Other Design IssuesOther Design Issues

Some aspects of database design are not caught by normalization

Examples of bad database design, to be avoided: E.g suppose that, instead of earnings(company-id, year,

amount), we used: earnings-2000, earnings-2001, earnings-2002, etc., all on the

schema (company-id, earnings)

• all are BCNF, but make querying across years difficult

• needs a new table each year company-year(company-id, earnings-2000,

earnings-2001, earnings-2002)

• in BCNF, but makes querying across years difficult

• requires new attribute each year

Page 56: International Computer Institute, Izmir, Turkey Database Design and Normal Forms Asst.Prof.Dr.İlker Kocabaş UBİ502 at ikocabas

©Silberschatz, Korth and Sudarshan

Modifications & additions by Cengiz Güngör8.56 of 56

UBI 502

Database Management Systems

SummarySummary

Functional Dependencies and Decomposition help us achieve our design goals: Avoid redundant data

Ensure that relationships among attributes are represented

Facilitate the checking of updates for violation of integrity constraints