Normalization
© 2007 by Prentice Hall (Hoffer, Prescott & McFadden)


Upload: darrell-mckenzie

Post on 13-Jan-2016


Page 1:

Normalization
© 2007 by Prentice Hall (Hoffer, Prescott & McFadden)

Page 2:

Chapter 5

Data Normalization

- Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
- The process of decomposing relations with anomalies to produce smaller, well-structured relations

Page 3:

Well-Structured Relations

- A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
- Goal is to avoid anomalies
  - Insertion Anomaly–adding new rows forces user to create duplicate data
  - Deletion Anomaly–deleting rows may cause a loss of data that would be needed for other future rows
  - Modification Anomaly–changing data in a row forces changes to other rows because of duplication

General rule of thumb: A table should not pertain to more than one entity type

Page 4:

Example–Figure 5-2b

Question–Is this a relation? Answer–Yes: Unique rows and no multivalued attributes

Question–What’s the primary key? Answer–Composite: Emp_ID, Course_Title

Page 5:

Anomalies in this Table

- Insertion–can’t enter a new employee without having the employee take a class
- Deletion–if we remove employee 140, we lose information about the existence of a Tax Acc class
- Modification–giving a salary increase to employee 100 forces us to update multiple records

Why do these anomalies exist?

Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities
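
The three anomalies can be reproduced on a small in-memory copy of such a table. This is only a sketch: Figure 5-2b is not reproduced in this transcript, so the rows, the Name and Salary values, and the helper function names are assumptions. Only Emp_ID, Course_Title, employees 100 and 140, and the Tax Acc class come from the slides.

```python
# Hypothetical rows for a table keyed on (Emp_ID, Course_Title).
# Names, salaries and course titles other than "Tax Acc" are made up.
employee2 = [
    {"Emp_ID": 100, "Name": "A. Example", "Salary": 48000, "Course_Title": "SPSS"},
    {"Emp_ID": 100, "Name": "A. Example", "Salary": 48000, "Course_Title": "Surveys"},
    {"Emp_ID": 140, "Name": "B. Example", "Salary": 52000, "Course_Title": "Tax Acc"},
]


def rows_to_update_for_raise(rows, emp_id):
    """Modification anomaly: count the rows that repeat one employee's salary."""
    return sum(1 for r in rows if r["Emp_ID"] == emp_id)


def courses_after_deleting(rows, emp_id):
    """Deletion anomaly: course titles still known once an employee's rows are removed."""
    return {r["Course_Title"] for r in rows if r["Emp_ID"] != emp_id}


def can_insert(row):
    """Insertion anomaly: every part of the composite key must be present."""
    return row.get("Emp_ID") is not None and row.get("Course_Title") is not None


raise_rows = rows_to_update_for_raise(employee2, 100)
remaining_courses = courses_after_deleting(employee2, 140)
new_hire_ok = can_insert(
    {"Emp_ID": 150, "Name": "C. Example", "Salary": 40000, "Course_Title": None}
)

print(raise_rows)                 # rows touched by one salary change
print(sorted(remaining_courses))  # "Tax Acc" is no longer recorded anywhere
print(new_hire_ok)                # a new hire without a class cannot be stored
```

Splitting employee facts and course-completion facts into separate tables would leave the salary in exactly one row and let a course or an employee exist independently of the other.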

Page 6:

Functional Dependencies and Keys

- Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute
- Candidate Key:
  - A unique identifier. One of the candidate keys will become the primary key
  - E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys
- Each non-key field is functionally dependent on every candidate key

Page 7:

Figure 5.22 Steps in normalization

Page 8:

First Normal Form

- No multivalued attributes
- Every attribute value is atomic
- Fig. 5-25 is not in 1st Normal Form (multivalued attributes), so it is not a relation
- Fig. 5-26 is in 1st Normal Form
- All relations are in 1st Normal Form
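
Moving a table into first normal form amounts to repeating the single-valued columns once for each entry of the multivalued group. The following sketch assumes an order record that carries a nested list of products; the data values and the function name are hypothetical, since Figures 5-25 and 5-26 are not reproduced in this transcript.

```python
# Hypothetical orders with a repeating group of products; values are made up.
orders = [
    {"Order_ID": 1, "Customer_ID": 10,
     "Products": [{"Product_ID": 7, "Quantity": 2},
                  {"Product_ID": 5, "Quantity": 1}]},
    {"Order_ID": 2, "Customer_ID": 11,
     "Products": [{"Product_ID": 7, "Quantity": 4}]},
]


def to_first_normal_form(orders):
    """Emit one flat row per (order, product) pair so every value is atomic."""
    flat = []
    for order in orders:
        for product in order["Products"]:
            flat.append({
                "Order_ID": order["Order_ID"],
                "Customer_ID": order["Customer_ID"],
                "Product_ID": product["Product_ID"],
                "Quantity": product["Quantity"],
            })
    return flat


flat_rows = to_first_normal_form(orders)
for row in flat_rows:
    print(row)
```

The flat rows are unique on (Order_ID, Product_ID), but Customer_ID is now repeated for every product on an order, which is the redundancy the later normal forms address.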

Page 9:

Table with multivalued attributes, not in 1st normal form

Note: this is NOT a relation

Page 10:

Table with no multivalued attributes and unique rows, in 1st normal form

Note: this is a relation, but not a well-structured one

Page 11:

Anomalies in this Table

- Insertion–if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication
- Deletion–if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price
- Update–changing the price of product ID 4 requires update in several records

Why do these anomalies exist? Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities

Page 12:

Second Normal Form

- 1NF PLUS every non-key attribute is fully functionally dependent on the ENTIRE primary key
  - Every non-key attribute must be defined by the entire key, not by only part of the key
  - No partial functional dependencies

Page 13:

Order_ID → Order_Date, Customer_ID, Customer_Name, Customer_Address
Customer_ID → Customer_Name, Customer_Address
Product_ID → Product_Description, Product_Finish, Unit_Price
Order_ID, Product_ID → Order_Quantity

Therefore, NOT in 2nd Normal Form (Order_ID and Product_ID each determine non-key attributes on their own, which are partial dependencies on the composite key)

Figure 5-27 Functional dependency diagram for INVOICE
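
The dependencies listed for INVOICE can be checked mechanically. The sketch below encodes them as Python sets, computes the attribute closure of the composite key, and lists the dependencies whose determinant is only part of that key. The function names and the set-based encoding are assumptions made for illustration; the attribute names are those on the slide.

```python
# Functional dependencies as listed for INVOICE: (determinant, dependents).
FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
    ({"Product_ID"}, {"Product_Description", "Product_Finish", "Unit_Price"}),
    ({"Order_ID", "Product_ID"}, {"Order_Quantity"}),
]
PRIMARY_KEY = {"Order_ID", "Product_ID"}


def closure(attributes, fds):
    """Return every attribute determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result


def partial_dependencies(key, fds):
    """Return the FDs whose determinant is a proper subset of the key."""
    return [(left, right - key) for left, right in fds if left < key and right - key]


all_attributes = set().union(*(left | right for left, right in FDS))
key_closure = closure(PRIMARY_KEY, FDS)
partials = partial_dependencies(PRIMARY_KEY, FDS)

print(key_closure == all_attributes)  # the composite key determines every attribute
for left, right in partials:
    print(sorted(left), "->", sorted(right))
```

Two partial dependencies are reported, one on Order_ID and one on Product_ID, which is why the relation fails second normal form.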

Page 14:

Getting it into Second Normal Form

Partial dependencies are removed, but there are still transitive dependencies

Figure 5-28 Removing partial dependencies
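
The figure itself is not reproduced in this transcript. As a rough sketch of the step it describes, the code below groups the INVOICE attributes by the part of the key that determines them, using the functional dependencies listed for INVOICE. The function name and the dictionary-of-sets result are assumptions for illustration, and the resulting relations are left unnamed.

```python
# Functional dependencies for INVOICE: (determinant, dependents).
FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
    ({"Product_ID"}, {"Product_Description", "Product_Finish", "Unit_Price"}),
    ({"Order_ID", "Product_ID"}, {"Order_Quantity"}),
]
PRIMARY_KEY = frozenset({"Order_ID", "Product_ID"})


def remove_partial_dependencies(key, fds):
    """Build one relation per determinant that is the key or a part of it.

    Returns {determinant: attributes of the relation keyed by that determinant}.
    """
    relations = {}
    for left, right in fds:
        if left <= key:
            relations.setdefault(frozenset(left), set(left)).update(right)
    # The full key always keeps a relation of its own.
    relations.setdefault(frozenset(key), set(key))
    return relations


relations_2nf = remove_partial_dependencies(PRIMARY_KEY, FDS)
for determinant, attributes in relations_2nf.items():
    print(sorted(determinant), "->", sorted(attributes))
```

Three relations result. The one keyed on Order_ID still carries Customer_Name and Customer_Address, which depend on Customer_ID rather than directly on the key, so a transitive dependency remains.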

Page 15:

Third Normal Form

- 2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes)
- Note: This is called transitive, because the primary key is a determinant for another attribute, which in turn is a determinant for a third
- Solution: A non-key determinant with transitive dependencies goes into a new table; the non-key determinant becomes the primary key in the new table and stays as a foreign key in the old table
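
The solution can be sketched for the order data, where Order_ID determines Customer_ID and Customer_ID in turn determines the customer's name and address. The attribute names follow the functional dependencies given for INVOICE; the function name and the set-based representation are assumptions for illustration.

```python
# A 2NF relation keyed on Order_ID that still has a transitive dependency.
ORDER_KEY = {"Order_ID"}
ORDER_ATTRIBUTES = {"Order_ID", "Order_Date", "Customer_ID",
                    "Customer_Name", "Customer_Address"}
FDS = [
    ({"Order_ID"}, {"Order_Date", "Customer_ID", "Customer_Name", "Customer_Address"}),
    ({"Customer_ID"}, {"Customer_Name", "Customer_Address"}),
]


def remove_transitive_dependencies(key, attributes, fds):
    """Move attributes determined by a non-key determinant into new relations.

    Returns (attributes left in the old relation, {determinant: new relation}).
    """
    remaining = set(attributes)
    new_relations = {}
    for left, right in fds:
        is_non_key_determinant = left <= attributes and left.isdisjoint(key)
        if is_non_key_determinant:
            # The determinant becomes the primary key of the new relation.
            new_relations[frozenset(left)] = set(left) | right
            # It stays behind in the old relation as a foreign key.
            remaining -= (right - left)
    return remaining, new_relations


order_3nf, new_relations = remove_transitive_dependencies(
    ORDER_KEY, ORDER_ATTRIBUTES, FDS
)
print(sorted(order_3nf))
for determinant, attributes in new_relations.items():
    print(sorted(determinant), "->", sorted(attributes))
```

Customer_ID appears in both results: as the primary key of the new customer relation and as a foreign key in the order relation.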

Page 16:

Getting it into Third Normal Form

Transitive dependencies are removed

Figure 5-28 Removing partial dependencies

Page 17:

Merging Relations

- View Integration–Combining entities from multiple ER models into common relations
- Issues to watch out for when merging entities from different ER models:
  - Synonyms–two or more attributes with different names but same meaning
  - Homonyms–attributes with same name but different meanings
  - Transitive dependencies–even if relations are in 3NF prior to merging, they may not be after merging
  - Supertype/subtype relationships–may be hidden prior to merging
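
Synonym handling during view integration can be sketched as a renaming step followed by a union of attributes. Everything in the example is hypothetical: the two views, their attribute names, and the synonym mapping are invented for illustration, and the mapping stands in for decisions that would be agreed with the users.

```python
# Hypothetical views of the same entity taken from two ER models.
view_a = {"key": ("Employee_ID",), "attributes": {"Employee_ID", "Name", "Address"}}
view_b = {"key": ("Emp_No",), "attributes": {"Emp_No", "Name", "Phone"}}

# Assumed synonym decision: Emp_No means the same thing as Employee_ID.
SYNONYMS = {"Emp_No": "Employee_ID"}


def rename(view, synonyms):
    """Apply the agreed standard names to a view's key and attributes."""
    return {
        "key": tuple(synonyms.get(a, a) for a in view["key"]),
        "attributes": {synonyms.get(a, a) for a in view["attributes"]},
    }


def merge_views(a, b, synonyms):
    """Merge two views that share a primary key once synonyms are resolved."""
    a, b = rename(a, synonyms), rename(b, synonyms)
    if a["key"] != b["key"]:
        raise ValueError("views have different keys; not merged")
    return {"key": a["key"], "attributes": a["attributes"] | b["attributes"]}


merged = merge_views(view_a, view_b, SYNONYMS)
print(merged["key"], sorted(merged["attributes"]))
```

The union silently treats Name as one attribute in both views. If the two views meant different things by it, that would be a homonym, which cannot be detected from the names alone, and the merged relation should be rechecked for transitive dependencies afterward.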