relational database systems higher information systems
TRANSCRIPT
Relational Database Systems
Higher Information Systems
The Relational Model
data is grouped into entities which are related, in order to minimise data duplication and achieve data integrity
many-to-many relationships between entities are removed and replaced with one-to-many relationships
Entity-Occurrence Modelling
Entity-Occurrence Modelling Lines indicate how
the instances ofeach entity arelinked
E.g. Member 1034 has rented DVDs 002 and 015
DVD 003 has been rented by members 1012 1056
Entity-Occurrence Modelling
Each DVD can berented by manyMembers
Each Member canrent many DVDs
So there is a many-to-many relationship between Member and DVD
Entity-Occurrence Modelling
This method isonly as good asthe available data
Make up “dummy”data if necessary tofill in the gaps
More about keys An atomic key consists of one attribute
MEMBER(Member Number, Name, Telephone Number)
A compound key consists of two or more attributes
MEMBER(Member Number, Name, Telephone Number)
A surrogate key is a made up attribute designed to identify a record Member Number is a surrogate key
Choosing a key An atomic key is better than a
compound key A numeric attribute is better than a
text attribute KISS = Keep It Short and Simple A key must have a value—it cannot
be blank (or “null”) A key should not change over time
The flat file revisited…
DVD Code Title Cost Date Out Date Due
Member Number Name
Telephone Number
002 Finding Nemo £2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie £2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie £2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist £2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
What is a suitable key? DVD Code? Member Number? (DVD Code, Member Number)?
Update Anomalies
DVD Code Title Cost Date Out Date Due
Member Number Name
Telephone Number
002 Finding Nemo £2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie £2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie £2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist £2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
There is no way of storing the details of a member who hasn’t rented any DVDs
A value must be provided for both DVD Code and Member Number for the key
This is called an “insertion anomaly”
Update Anomalies
DVD Code Title Cost Date Out Date Due
Member Number Name
Telephone Number
002 Finding Nemo £2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie £2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie £2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist £2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
If a member’s details have to be amended, this must be done in each record with those details
This can lead to data inconsistency if there is an error or omission in making the change
This is called a “modification anomaly”
Update Anomalies
DVD Code Title Cost Date Out Date Due
Member Number Name
Telephone Number
002 Finding Nemo £2.50 03/09/04 04/09/04 1034 John Silver 142536
003 American Pie £2.50 27/08/04 28/08/04 1056 Fred Flintstone 817263
003 American Pie £2.50 01/09/04 02/09/04 1012 Isobel Ringer 293847
008 The Pianist £2.50 04/09/04 06/09/04 1097 Annette Kirton 384756
If a DVD is removed from the database, then it may also remove the only record of a member’s details
This is called a “deletion anomaly”
Update Anomalies
Insertion anomalies Modification anomalies Deletion anomalies These are characteristics of poorly
designed databases The solution is to use a relational
database We use normalisation to help work out
what tables are required and which data items should be stored in each table
Normalisation
Un-normalised Form (UNF) Identify an entity List all the
attributes Identify a key
ORDER
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone NumberItem CodeDescriptionUnit CostQuantity)
Un-normalised Form (UNF)
Identify repeating data items
Un-normalised Form (UNF) Identify repeating
data items
ORDER (Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number
Repeating itemsItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF) Remove
repeating data items to form a new entity
Take the key with you!
ORDER (Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number
Repeating itemsItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF) Remove
repeating data items to form a new entity
Take the key with you!
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(Order NumberItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF) Identify a key for
the new entity It will be a
compound key Use the original
key and add to it
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(Order NumberItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF) Identify a key for
the new entity It will be a
compound key Use the original
key and add to it Label the foreign
key Order Number is
both part of the compound primary key and also a foreign key.
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order NumberItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF) A data model is
in 1NF if it has no multi-valued attributes
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order NumberItem CodeDescriptionUnit CostQuantity)
First Normal Form (1NF)
First Normal Form (1NF)
But what if there were lots of orders for large deluxe red widgets…?
There are still update anomalies
Second Normal Form (2NF) Examine any entity
with a compound key (in this case ORDER_ITEM)
See if any attributes are dependent on just one part of the compound key
These are called partial dependencies
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order NumberItem CodeDescriptionUnit CostQuantity)
Second Normal Form (2NF) Order Number is part
of the key Item Code is part of
the key Description is
dependent on the Item Code
Unit Cost is dependent on the Item Code
Quantity is dependent on both Order Number and Item Code.
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order NumberItem CodeDescriptionUnit CostQuantity)
Second Normal Form (2NF) Description and Unit
Cost are partial dependencies
They are dependent on Item Code
Remove these attributes to a new entity
Take a copy of the attribute they are dependent on
ORDER
ORDER_ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order NumberItem CodeDescriptionUnit CostQuantity)
Second Normal Form (2NF) Item Code
becomes the key of the new entity
And becomes a foreign key in ORDER-ITEM
ORDER
ORDER_ITEM
ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Second Normal Form (2NF) A data model is in
2NF if it is in 1NF and there are no partial dependencies
ORDER
ORDER_ITEM
ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Second Normal Form (2NF)
We can add an item to the Item table without it having to be on an order
We can delete an order in the Order table without deleting details of the items on the order
We can update item details once in the Item table without affecting the orders for that item in the Order-Item table
Second Normal Form (2NF) But there are still
update anomalies with the Order entity
ORDER (Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
Third Normal Form (3NF) Examine all the
entities produced so far
See if there are any non-key attributes which are dependent on any other non-key attributes
These are called non-key dependencies
ORDER
ORDER_ITEM
ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Third Normal Form (3NF) In the ORDER
entity, Customer Name, Address, Post Code and Telephone Number are all dependent on Customer Number
ORDER
ORDER_ITEM
ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Third Normal Form (3NF) Remove these
attributes to a new entity
ORDER
ORDER_ITEM
ITEM
(Order NumberOrder DateCustomer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Third Normal Form (3NF) Remove these
attributes to a new entity
Customer Number is the key of the new entity
Leave Customer Number behind as a foreign key
ORDER
CUSTOMER
ORDER_ITEM
ITEM
(Order NumberOrder Date*Customer Number)
(Customer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Third Normal Form (3NF) A data model is in
3NF if it is in 2NF and there are no non-key dependencies
ORDER
CUSTOMER
ORDER_ITEM
ITEM
(Order NumberOrder Date*Customer Number)
(Customer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
Third Normal Form (3NF)
We can add a customer to the Customer table without the customer having to place an order
We can delete an order in the Order table without deleting details of the customer who placed the order
We can update a customer’s details once in the Customer table without affecting the orders placed by that customer in the Order table
Memory Aid
In 3NF, each attribute is dependent on
the key the whole key and nothing but the key
Entity-Relationship DiagramORDER
CUSTOMER
ORDER_ITEM
ITEM
(Order NumberOrder Date*Customer Number)
(Customer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
CUSTOMER
ITEMORDER
ORDER_ITEM
CUSTOMER
ITEMORDER
ORDER_ITEM
Entity-Relationship DiagramORDER
CUSTOMER
ORDER_ITEM
ITEM
(Order NumberOrder Date*Customer Number)
(Customer NumberCustomer NameAddressPost CodeTelephone Number)
(*Order Number*Item CodeQuantity)
(Item CodeDescriptionUnit Cost)
CUSTOMER
ITEMORDER
ORDER_ITEM
CUSTOMER
ITEMORDER
ORDER_ITEM
The foreign key is always at the “many” end of the relationship
Source documents
Source documents List all the
attributes which must be stored in the database
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NoDVD CodeTitleCostDate HiredDate DueMember)
Source documents List all the
attributes which must be stored in the database
Identify a key
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NoDVD CodeTitleCostDate HiredDate DueMember)
Source documents There are two
attributes called Title
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NoDVD CodeTitleCostDate HiredDate DueMember)
Source documents There are two
attributes called Title
Member Number is the same as Member
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NoDVD CodeTitleCostDate HiredDate DueMember)
Source documents There are two
attributes called Title
Member Number is the same as Member
Number or No?
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NoDVD CodeTitleCostDate HiredDate DueMember)
Source documents Tidy up UNF Carry on as
before to 3NF
DVD_RENTAL (Member NumberTitleForenameSurnameTelephone NumberDVD CodeDVD TitleCostDate HiredDate Due)
Database Design
For each attribute you must decide its name its data type its properties
Database Design
For each attribute you must decide its name
Choose sensible and meaningful field names
Be consistent! e.g. Number/Num/No/#
Database Design For each attribute you must decide
its name its data type
text (alphanumeric, string) numeric (integer, real, currency) date or time Boolean (yes or no) link object (e.g. picture, sound, file)
Data Types Text
“Smith” “John Smith” Alphanumeric
“IV99 9ZZ” “01234 567890” “10 Downing Street” “10”
Free text: “The cat sat on the mat, etc…”
Data Types
Numeric Integer: 3, 1246, 0, -5 Real/floating point: 3.14, 1246.0, 0, -
5.2 Currency: 3.14, 1246.00, 0.00, -5.20
Note that the currency symbol is not stored
Data Types
Date “Short” date: 1/1/2006 “Long” date: 29 February 2004 “Medium” date: 29 Feb 2004 dd/mm/yyyy indicates format Watch out for US dates: mm/dd/yyyy
Database Design Names are usually stored as 3 or 4
fields Title (Mr/Mrs/Miss/Ms) Forename Initials/Other Names Surname
Database Design
Addresses are usually stored as 3 or 4 fields Address1 (Street Address) Address2 (Town) Address3 (District) Post Code Sometimes the house number is stored
separately from the Street Name
Database Design Telephone Numbers are always text Numbers are usually text if they are not
used in calculations, e.g. House Number Other “numbers” are also stored as text
ISBNs Vehicle Registration “numbers”
Use integers for whole numbers
Database Design
For each attribute you must decide its name its data type its properties
Primary key/foreign key PK/FK Validation (presence, range, restricted
choice) Default value Format
Database Design
For each attribute you must decide its name its data type its properties
Store this information in a Data Dictionary
Data DictionaryEntity Attribute Key Data Type Required Unique Format Validation
DVD Code PK Integer Y Y
Film Code FK Integer Y N Lookup value from FILM table
DVD
Cost Currency Y N >=1 and <=3
Member Number PK Integer Y Y
Title Text Y N Choice of Mr/Mrs/Miss/Ms/Dr
Forename Text (15) Y N
Surname Text (20) Y N
Address 1 Text (20) Y N
Address 2 Text (20) N N
Address 3 Text (20) N N
Post Code Text (7) Y N A?09 0AA
MEMBER
Telephone Number Text (11) N N (99999) 000000
Film Code PK Integer Y Y FILM
Title Text (30) Y N
Member Number PK Integer Y N Lookup value from MEMBER table
DVD Code PK Integer Y N Lookup value from DVD table
Date Hired PK Date Y N dd/mm/yy
LOAN
Date Due Date Y N dd/mm/yy
Data Dictionary
Data DictionaryEntity Attribute Key Data Type Required Unique Format Validation
Member Number PK Integer Y Y
Title Text Y N Choice of Mr/Mrs/Miss/Ms/Dr
Forename Text (15) Y N
Surname Text (20) Y N
Address 1 Text (20) Y N
Address 2 Text (20) N N
Address 3 Text (20) N N
Post Code Text (7) Y N A?09 0AA
MEMBER
Telephone Number Text (11) N N (99999) 000000
Data DictionaryEntity Attribute Key Data Type Required Unique Format Validation
Film Code PK Integer Y Y FILM
Title Text (30) Y N
Data DictionaryEntity Attribute Key Data Type Required Unique Format Validation
Member Number PK Integer Y N Lookup value from MEMBER table
DVD Code PK Integer Y N Lookup value from DVD table
Date Hired PK Date Y N dd/mm/yy
LOAN
Date Due Date Y N dd/mm/yy