7 Copyright © 2006, Oracle. All rights reserved. Normalization of Relational Tables (Part I)

Upload: christine-mckinney

Post on 25-Dec-2015


Page 1:

Normalization of Relational Tables (Part I)

Page 2:

Outline

• Modification anomalies
• Functional dependencies
• Major normal forms
• Practical concerns

Page 3:

Modification Anomalies (anomalous behavior that occurs when data is modified)

• Definition
  – Unexpected side effects that occur when changing the data in a table designed with excessive redundancy.
• Result of the side effects
  – More data is inserted, updated, or deleted than intended.
• Types
  – Insertion anomaly
  – Update anomaly
  – Deletion anomaly

Page 4:

Example of a Poor Table Design (Big University Database)

• PK design: combination of StdSSN and OfferNo
• Pros
  – Easier to query (no join is needed), for example: enrollments of student S1 or S2; students of offering O2; students or offerings of course C2
• Cons
  – The table has obvious redundancies (shown by colored blocks on the original slide)
  – Result: more difficult to change

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB

Page 5:

Insertion Anomaly

Definition:
• In an insertion, extra data beyond the desired data must be added to the database.

Example:
• A new student cannot be inserted without being enrolled in an offering, because OfferNo is part of the PK.
  – More column data must be inserted than desired.
• Other examples?
• Why? Each row denotes a student, an offering, a course, and an enrollment, while the PK consists of StdSSN (denoting the student) and OfferNo (denoting the offering).

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
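The insertion anomaly can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The table name UnivTable, the column types, and the new student (S3, Tacoma) are assumptions made for illustration; the slide only gives the column names and the PK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UnivTable (
        StdSSN   TEXT NOT NULL,
        StdCity  TEXT,
        StdClass TEXT,
        OfferNo  TEXT NOT NULL,   -- part of the PK, so it may not be NULL
        OffTerm  TEXT,
        OffYear  INTEGER,
        EnrGrade REAL,
        CourseNo TEXT,
        CrsDesc  TEXT,
        PRIMARY KEY (StdSSN, OfferNo)
    )
""")


def try_insert_student_only(ssn, city, std_class):
    """Try to store a student who has not enrolled in any offering."""
    try:
        conn.execute(
            "INSERT INTO UnivTable (StdSSN, StdCity, StdClass) VALUES (?, ?, ?)",
            (ssn, city, std_class),
        )
        return "inserted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Hypothetical new student with no enrollment yet.
result = try_insert_student_only("S3", "Tacoma", "JUN")
print(result)
```

The insert is rejected because no OfferNo is supplied; the only way to record the student in this design is to also supply offering data.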

Page 6:

Update Anomaly

Definition:
• To modify a single fact, it may be necessary to change multiple rows.

Example:
• Changing a course description requires changing every enrollment of that course.
  – Try changing the course description of C2: both C2 rows must be updated.
• Other examples?

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
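As a rough illustration of the update anomaly, the sketch below loads four of the sample columns into an in-memory SQLite table and counts how many rows one logical change touches. The table name UnivTable and the new description "Visual Basic" are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UnivTable (
        StdSSN TEXT NOT NULL, OfferNo TEXT NOT NULL,
        CourseNo TEXT, CrsDesc TEXT,
        PRIMARY KEY (StdSSN, OfferNo))
""")
conn.executemany("INSERT INTO UnivTable VALUES (?, ?, ?, ?)", [
    ("S1", "O1", "C1", "DB"),
    ("S1", "O2", "C2", "VB"),
    ("S2", "O3", "C3", "OO"),
    ("S2", "O2", "C2", "VB"),
])

# One fact changes (the description of course C2) ...
cursor = conn.execute(
    "UPDATE UnivTable SET CrsDesc = ? WHERE CourseNo = ?",
    ("Visual Basic", "C2"),  # hypothetical new description
)
# ... but every enrollment row of that course has to be rewritten.
rows_changed = cursor.rowcount
print(rows_changed)
```

If the WHERE clause missed one of those rows, the table would hold two different descriptions for the same course.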

Page 7:

Deletion Anomaly

Definition:
• Deleting a row may inadvertently cause other data to be deleted.

Example:
• Removing the enrollment of student S2 in offering O3 also loses the information about offering O3 and course C3.
• Other examples?

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
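The deletion anomaly can be checked the same way. This is a sketch under assumptions: the table is named UnivTable, the student columns are left out for brevity, and count_rows is a helper invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UnivTable (
        StdSSN TEXT NOT NULL, OfferNo TEXT NOT NULL, OffTerm TEXT,
        OffYear INTEGER, EnrGrade REAL, CourseNo TEXT, CrsDesc TEXT,
        PRIMARY KEY (StdSSN, OfferNo))
""")
conn.executemany("INSERT INTO UnivTable VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("S1", "O1", "Fall", 2006, 3.5, "C1", "DB"),
    ("S1", "O2", "Fall", 2006, 3.3, "C2", "VB"),
    ("S2", "O3", "Spring", 2007, 3.1, "C3", "OO"),
    ("S2", "O2", "Fall", 2006, 3.4, "C2", "VB"),
])


def count_rows(column, value):
    """Count rows matching one column; `column` is only ever a literal below."""
    query = f"SELECT COUNT(*) FROM UnivTable WHERE {column} = ?"
    return conn.execute(query, (value,)).fetchone()[0]


before = (count_rows("OfferNo", "O3"), count_rows("CourseNo", "C3"))

# Intended change: drop a single enrollment.
conn.execute("DELETE FROM UnivTable WHERE StdSSN = ? AND OfferNo = ?", ("S2", "O3"))

after = (count_rows("OfferNo", "O3"), count_rows("CourseNo", "C3"))
print(before, after)
```

After the delete, nothing in the table records that offering O3 or course C3 ever existed.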

Page 8:

Example of a Better Table Design (Big University Database: 4 Tables Denoting 4 Objects, plus FKs)

Table Name: Student

StdSSN  StdLastName  StdClass
S1      WELLS        JUN
S2      NORBERT      JUN
S3      KENDALL      JUN

Table Name: Offering

OfferNo  OffYear  CourseNo
O1       MW       C1
O2       MW       C2
O3       MW       C3

Table Name: Enrollment

StdSSN  OfferNo  EnrGrade
S1      O1       3.5
S1      O2       3.3
S2      O3       3.1
S2      O2       3.4

Table Name: Course

CourseNo  CrsDesc
C1        DB
C2        VB
C3        OO

Check each table: insert anomaly? update anomaly? delete anomaly?
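One way to answer the three questions is to try the same operations against the four-table design. The sketch below uses sqlite3 with assumed column types, omits the Offering table's OffYear column for brevity, and uses "Visual Basic" as a hypothetical new description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE Student  (StdSSN TEXT PRIMARY KEY, StdLastName TEXT, StdClass TEXT);
    CREATE TABLE Course   (CourseNo TEXT PRIMARY KEY, CrsDesc TEXT);
    CREATE TABLE Offering (OfferNo TEXT PRIMARY KEY,
                           CourseNo TEXT REFERENCES Course (CourseNo));
    CREATE TABLE Enrollment (
        StdSSN   TEXT REFERENCES Student (StdSSN),
        OfferNo  TEXT REFERENCES Offering (OfferNo),
        EnrGrade REAL,
        PRIMARY KEY (StdSSN, OfferNo));
    INSERT INTO Student  VALUES ('S1', 'WELLS', 'JUN'), ('S2', 'NORBERT', 'JUN');
    INSERT INTO Course   VALUES ('C1', 'DB'), ('C2', 'VB'), ('C3', 'OO');
    INSERT INTO Offering VALUES ('O1', 'C1'), ('O2', 'C2'), ('O3', 'C3');
    INSERT INTO Enrollment VALUES
        ('S1', 'O1', 3.5), ('S1', 'O2', 3.3), ('S2', 'O3', 3.1), ('S2', 'O2', 3.4);
""")

# Insert: a student can be stored without any enrollment.
conn.execute("INSERT INTO Student VALUES ('S3', 'KENDALL', 'JUN')")
student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

# Update: a course description lives in exactly one row.
rows_changed = conn.execute(
    "UPDATE Course SET CrsDesc = 'Visual Basic' WHERE CourseNo = 'C2'").rowcount

# Delete: removing an enrollment leaves the offering and course in place.
conn.execute("DELETE FROM Enrollment WHERE StdSSN = 'S2' AND OfferNo = 'O3'")
c3_rows = conn.execute("SELECT COUNT(*) FROM Course WHERE CourseNo = 'C3'").fetchone()[0]
o3_rows = conn.execute("SELECT COUNT(*) FROM Offering WHERE OfferNo = 'O3'").fetchone()[0]
print(student_count, rows_changed, c3_rows, o3_rows)
```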

Page 9:

Normalization

• A good database design ensures that users can change the contents of a database without unexpected side effects (modification anomalies).
  – The remedy is to modify the table design to remove the redundancies that cause the anomalies.
• Normalization
  – The process of removing redundancies in a table so that the table is easier to modify.

Page 10:

Constraints on Database Content

• Value-based constraints: a comparison of a column to a constant
  – Example: Age >= 21
• Value-neutral constraints: a comparison of columns (column to column)
  – PK (entity integrity constraint): a constraint about the PK column of one or more rows
  – FK (referential integrity constraint): a constraint about the parent PK and child FK of one or more rows
  – Functional dependency: a constraint about two or more columns of a table

Page 11:

Functional Dependency

"X determines Y" is denoted as X → Y

• For each X value, there is at most one Y value
• X: left-hand side (LHS) or determinant
• Y: right-hand side (RHS)
• Like a mathematical function: Y = f(X)
  – f: like a table
  – X: like the key of a table
  – Y: like a column of a table
• Examples:
  StdSSN → StdName
  StdSSN → StdClass
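The "at most one Y value for each X value" condition can be written formally. The notation below is an assumption, not taken from the slides: r stands for any legal state of the table, t_1 and t_2 for rows, and t[X] for the values of row t in the columns X.

```latex
X \rightarrow Y \text{ holds}
\iff
\forall\, t_1, t_2 \in r:\quad
t_1[X] = t_2[X] \;\Longrightarrow\; t_1[Y] = t_2[Y]
```

Read this way, an FD is a statement about every state the table is allowed to take, not about one particular set of rows.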

Page 12:

Functional Dependency (FD)

• Think of functional dependencies as identifying potential candidate keys.
• X → Y denotes an FD between columns X and Y
  – If X and Y are placed together in a table without other columns, X is a candidate key.

Page 13:

Functional Dependency Diagram and List

Table scheme:
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc

[Functional dependency diagram: arrows drawn over the table scheme; figure not reproduced]

List of functional dependencies:
StdSSN, OfferNo → EnrGrade
StdSSN → StdCity, StdClass
OfferNo → OffTerm, OffYear, CourseNo, CrsDesc
CourseNo → CrsDesc

Page 14:

How to Identify Functional Dependencies

• Deriving from uniqueness statements
• Deriving from 1-M relationships
• Considering minimalism of an FD's LHS (determinant)

Page 15:

How to Identify Functional Dependencies

• Deriving from uniqueness statements

Example:
  – A user may state that each course offering has a unique offering number along with the year and term of the offering: OfferNo → OffYear, OffTerm

Page 16:

How to Identify Functional Dependencies

Deriving from 1-M relationships

• For a 1-M relationship, an FD exists in
  – the child-to-parent direction (not the parent-to-child direction),
  – because each LHS value of an FD can be associated with at most one RHS value.
• Example: a faculty member teaches many offerings, but an offering is taught by one faculty member: OfferNo → FacNo

Page 17:

How to Identify Functional Dependencies

Minimalism of an FD's LHS (Determinant)

• The determinant of an FD (the columns appearing on the LHS of the FD)
  – Must be minimal (cannot contain extra columns)
• One column vs. a combination of columns
  – An FD whose LHS contains more than one column may represent an M-N relationship.
  – Example: OrdNo, ProdNo → OrdQty
    Order quantity depends on the combination of order number and product number.
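One common way to test whether a determinant carries extra columns is the attribute-closure computation; the slides do not name a method, so this is a swapped-in technique shown as a sketch. The function names closure and extra_columns, and the FD sets passed to them, are assumptions for the example.

```python
def closure(columns, fds):
    """Return every column determined by `columns` under the FDs in `fds`."""
    result = set(columns)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole LHS is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extra_columns(lhs, rhs, fds):
    """List LHS columns that can be dropped while the rest still determines `rhs`."""
    return [c for c in lhs if set(rhs) <= closure(set(lhs) - {c}, fds)]


univ_fds = [
    (("StdSSN",), ("StdCity", "StdClass")),
    (("StdSSN", "OfferNo"), ("EnrGrade",)),
]
# StdSSN, OfferNo -> StdCity is not minimal: StdSSN alone determines StdCity.
not_minimal = extra_columns(("StdSSN", "OfferNo"), ("StdCity",), univ_fds)

order_fds = [(("OrdNo", "ProdNo"), ("OrdQty",))]
# OrdNo, ProdNo -> OrdQty is minimal: neither column can be dropped.
minimal = extra_columns(("OrdNo", "ProdNo"), ("OrdQty",), order_fds)
print(not_minimal, minimal)
```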

Page 18:

Eliminating FDs Using Sample Data

• An FD cannot exist if
  – two rows of a table have the same value for the LHS but different values for the RHS of the FD.
• An FD cannot be proven to exist merely by examining the rows of a table.
• However, you can falsify an FD by examining the contents of a table.
  – Use sample data to eliminate potential FDs.

Page 19:

Eliminating FDs Using Sample Data

Disprove X → Y
• Find two rows that have the same X value but different Y values
• Examples:
  OfferNo → StdSSN (?)
  StdSSN → OffYear (?)

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
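The search for a disproving pair of rows can be sketched in a few lines of Python over the sample table above. The function name find_counterexample is an assumption; a result of None means only that the sample fails to falsify the FD, not that the FD holds.

```python
from itertools import combinations

COLUMNS = ("StdSSN", "StdClass", "OfferNo", "OffYear", "EnrGrade", "CourseNo", "CrsDesc")
SAMPLE = [dict(zip(COLUMNS, values)) for values in [
    ("S1", "JUN", "O1", 2006, 3.5, "C1", "DB"),
    ("S1", "JUN", "O2", 2006, 3.3, "C2", "VB"),
    ("S2", "JUN", "O3", 2007, 3.1, "C3", "OO"),
    ("S2", "JUN", "O2", 2006, 3.4, "C2", "VB"),
]]


def find_counterexample(rows, lhs, rhs):
    """Return two rows that agree on `lhs` but differ on `rhs`, or None."""
    for a, b in combinations(rows, 2):
        same_lhs = all(a[c] == b[c] for c in lhs)
        same_rhs = all(a[c] == b[c] for c in rhs)
        if same_lhs and not same_rhs:
            return a, b
    return None


offer_to_student = find_counterexample(SAMPLE, ["OfferNo"], ["StdSSN"])
student_to_year = find_counterexample(SAMPLE, ["StdSSN"], ["OffYear"])
student_to_class = find_counterexample(SAMPLE, ["StdSSN"], ["StdClass"])
print(offer_to_student is not None, student_to_year is not None, student_to_class)
```

Both questioned FDs are falsified: offering O2 appears with students S1 and S2, and student S2 appears with years 2007 and 2006.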

Page 20:

Normal Forms

• Normalization: the process of removing redundancies in a table so that the table is easier to modify.
• A normal form is a rule about allowable FDs in tables.
• Each normal form removes certain kinds of redundancies.
• First normal form (1NF) is the starting point.
• Second normal form (2NF) is stronger (stricter) than 1NF.
  – Only a subset of the 1NF tables is in 2NF.
• 3NF/BCNF is the most important in practice, because normal forms higher than 3NF/BCNF involve other kinds of dependencies that are less common and more difficult to understand.

Page 21:

Relationships of Normal Forms

[Diagram: nested regions, from outermost to innermost: 1NF, 2NF, 3NF/BCNF, 4NF, 5NF, DKNF]

Page 22:

First Normal Form (1NF)

• 1NF prohibits nesting or repeating data groups in a table.
• Starting point of normalization for most relational DBMSs
  – Most commercial DBMSs use 1NF tables
• A table not in 1NF is unnormalized or nonnormalized.

Page 23:

First Normal Form

The table below is not normalized (not in 1NF).

• The table has 2 rows, each containing a repeating group (nested columns).
  – The S1 row has 5 nested columns (OfferNo, OffYear, …)
  – The S2 row has 5 nested columns (OfferNo, OffYear, …)

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
                  O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
                  O2       2006     3.4       C2        VB

Page 24:

Convert to First Normal Form

• Replace each repeating group with a row
• In each new row, copy the nonrepeating columns
  – (S1, JUN) for row two with (O2, 2006, 3.3, C2, VB)
  – (S2, JUN) for row four with (O2, 2006, 3.4, C2, VB)
• Redefine the PK if necessary

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
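The conversion steps can be sketched in Python by modelling each unnormalized row as a dictionary with a nested list. The key name "Enrollments" and the function name to_1nf are assumptions made for the example.

```python
GROUP_COLUMNS = ("OfferNo", "OffYear", "EnrGrade", "CourseNo", "CrsDesc")

# Two unnormalized rows, each holding a repeating group of enrollments.
nested = [
    {"StdSSN": "S1", "StdClass": "JUN", "Enrollments": [
        ("O1", 2006, 3.5, "C1", "DB"),
        ("O2", 2006, 3.3, "C2", "VB"),
    ]},
    {"StdSSN": "S2", "StdClass": "JUN", "Enrollments": [
        ("O3", 2007, 3.1, "C3", "OO"),
        ("O2", 2006, 3.4, "C2", "VB"),
    ]},
]


def to_1nf(nested_rows):
    """Emit one flat row per repeating group, copying the nonrepeating columns."""
    flat = []
    for student in nested_rows:
        for group in student["Enrollments"]:
            row = {"StdSSN": student["StdSSN"], "StdClass": student["StdClass"]}
            row.update(zip(GROUP_COLUMNS, group))
            flat.append(row)
    return flat


flat_rows = to_1nf(nested)

# StdSSN alone no longer identifies a row, so the PK is redefined
# as the combination (StdSSN, OfferNo).
pk_values = {(row["StdSSN"], row["OfferNo"]) for row in flat_rows}
print(len(flat_rows), len(pk_values))
```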

Page 25:

Second Normal Form (2NF)

• Goal of 2NF and 3NF: produce tables in which every key determines the other columns.
• The definitions of 2NF and 3NF distinguish between key and nonkey columns.
  – A column is a key column if it is a candidate key or part of a candidate key.
  – A nonkey column is any other column.

Page 26:

Second Normal Form (2NF)

• Partial dependency
  – A nonkey column depends on a proper subset of the columns in a candidate key
    (part of a compound key → a nonkey column)
• A table is in 2NF (no partial dependency exists) if
  – the key contains only one column, or
  – each nonkey column depends on all of the columns in any candidate key, not on a proper subset of those columns.

Page 27:

Second Normal Form

• Violation of 2NF (a partial dependency exists)
  – Part of a compound key → a nonkey column
  – Only compound keys need to be checked
• A key containing only one column cannot violate 2NF
  (a table with a single-column key cannot violate 2NF)
• Steps for converting to 2NF
  1. Analyze the FDs
  2. Find the FDs that violate 2NF (FD1 and FD2 on the next slide)
  3. Split the original table into smaller tables that satisfy the 2NF definition
     (move the columns of every violating FD into a new table)

Page 28:

Convert to Second Normal Form (Analyze FDs)

[Functional dependency diagram over the table scheme below, with arrows labeled FD 1, FD 2, FD 3, and FD 4; figure not reproduced]

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB

PK = ?

Any partial dependency? FD1, FD2, FD3, FD4?
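The partial-dependency check can be sketched as a set comparison. The sketch assumes (StdSSN, OfferNo) is the only candidate key, as in the earlier PK design, and uses the FD list given for this table; the order of the list is arbitrary and does not correspond to the FD 1 to FD 4 labels in the diagram.

```python
KEY = {"StdSSN", "OfferNo"}  # assumed to be the only candidate key

FDS = [
    ({"StdSSN", "OfferNo"}, {"EnrGrade"}),
    ({"StdSSN"}, {"StdCity", "StdClass"}),
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]


def partial_dependencies(key, fds):
    """Return FDs whose LHS is a proper subset of the key and whose RHS
    contains at least one nonkey column."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and rhs - key]


violations = partial_dependencies(KEY, FDS)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

CourseNo → CrsDesc is not reported here because CourseNo is not part of the key; it matters for 3NF rather than 2NF.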

Page 29:

Convert to Second Normal Form (Splitting the Original Table)

• Split the original table into smaller tables that satisfy the 2NF definition
  – In each smaller table, the entire primary key should determine the nonkey columns.
  – The original table should be recoverable by applying natural join operations to the smaller tables.
  – The FDs in the original table should be derivable from the FDs in the smaller tables.
• The splitting process uses the project operator of relational algebra.
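The recoverability requirement can be checked on the sample data with a sketch in sqlite3: projection is written as SELECT DISTINCT, and the smaller tables are joined back with NATURAL JOIN. The names UnivTable and UnivTable1 to UnivTable3 and the untyped columns are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UnivTable (StdSSN, StdCity, StdClass, OfferNo, OffTerm,
                            OffYear, EnrGrade, CourseNo, CrsDesc);
    INSERT INTO UnivTable VALUES
        ('S1', 'Seattle', 'JUN', 'O1', 'Fall',   2006, 3.5, 'C1', 'DB'),
        ('S1', 'Seattle', 'JUN', 'O2', 'Fall',   2006, 3.3, 'C2', 'VB'),
        ('S2', 'Bothell', 'JUN', 'O3', 'Spring', 2007, 3.1, 'C3', 'OO'),
        ('S2', 'Bothell', 'JUN', 'O2', 'Fall',   2006, 3.4, 'C2', 'VB');

    -- Project: keep a subset of the columns and drop duplicate rows.
    CREATE TABLE UnivTable1 AS
        SELECT DISTINCT StdSSN, StdCity, StdClass FROM UnivTable;
    CREATE TABLE UnivTable2 AS
        SELECT DISTINCT OfferNo, OffTerm, OffYear, CourseNo, CrsDesc FROM UnivTable;
    CREATE TABLE UnivTable3 AS
        SELECT DISTINCT StdSSN, OfferNo, EnrGrade FROM UnivTable;
""")

COLUMNS = ("StdSSN, StdCity, StdClass, OfferNo, OffTerm, "
           "OffYear, EnrGrade, CourseNo, CrsDesc")

original = set(conn.execute(f"SELECT {COLUMNS} FROM UnivTable"))
# Natural joins match on the shared column names (StdSSN, then OfferNo).
rejoined = set(conn.execute(
    f"SELECT {COLUMNS} FROM UnivTable3 "
    "NATURAL JOIN UnivTable1 NATURAL JOIN UnivTable2"))

offering_rows = conn.execute("SELECT COUNT(*) FROM UnivTable2").fetchone()[0]
print(original == rejoined, offering_rows)
```

The offering data for O2 is now stored once instead of twice, and the join still reproduces every original row.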

Page 30:

Convert to Second Normal Form

After splitting, add referential integrity constraints to connect the tables.

UnivTable1 (StdSSN, StdCity, StdClass)
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)
UnivTable3 (StdSSN, OfferNo, EnrGrade)
  FOREIGN KEY (StdSSN) REFERENCES UnivTable1
  FOREIGN KEY (OfferNo) REFERENCES UnivTable2

[Functional dependency diagram over the original table scheme: StdSSN, StdCity, StdClass, OfferNo, OffTerm, OffYear, EnrGrade, CourseNo, CrsDesc; figure not reproduced]

Page 31:

Third Normal Form (3NF)

• A table is in 3NF if
  – it is in 2NF (no partial dependency), and
  – each nonkey column depends only on candidate keys, not on other nonkey columns (no transitive dependency).
• Transitive dependency
  – A nonkey column depends on other nonkey columns.
  – If A → B and B → C, then A → C. So A → C is a transitive dependency, and B → C causes a violation of 3NF.
• Example: OfferNo → CourseNo and CourseNo → CrsDesc, therefore OfferNo → CrsDesc
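The A → B, B → C pattern can be searched for mechanically. The sketch below assumes a table with a single candidate key and applies the idea to the offering columns from the example; the function name transitive_chains is invented for illustration.

```python
KEY = {"OfferNo"}  # assumed to be the only candidate key

FDS = [
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]


def transitive_chains(key, fds):
    """Find (A, B, C) where A is the key, A -> B, B -> C, and B consists
    only of nonkey columns. B -> C is the FD that violates 3NF."""
    chains = []
    for lhs1, rhs1 in fds:
        for lhs2, rhs2 in fds:
            if lhs1 == key and lhs2 <= rhs1 and not (lhs2 & key):
                chains.append((lhs1, lhs2, rhs2 - key))
    return chains


chains = transitive_chains(KEY, FDS)
for a, b, c in chains:
    print(sorted(a), "->", sorted(b), "->", sorted(c))
```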

Page 32:

Convert to Third Normal Form

Consider UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)

OfferNo → CourseNo
CourseNo → CrsDesc
OfferNo → CrsDesc

CourseNo → CrsDesc causes a violation of 3NF in UnivTable2.

[Functional dependency diagram over the scheme OfferNo, OffTerm, OffYear, CourseNo, CrsDesc; figure not reproduced]

Page 33:

Convert to Third Normal Form

Steps for converting to 3NF
1. Find the FDs that violate 3NF
2. Split the original table into smaller tables that satisfy the 3NF definition
   (move the columns of every violating FD into a new table)

Before:
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)

After:
UnivTable2-1 (CourseNo, CrsDesc)
UnivTable2-2 (OfferNo, OffTerm, OffYear, CourseNo)
  FOREIGN KEY (CourseNo) REFERENCES UnivTable2-1

Page 34:

Self-Practice Homework

HW: Chapter 7, page 239, Questions 1, 2, 3, 14, 15, 24