Lecture 9: Data Design CSC 470 Spring 2010




Entity Relationship Model

Used to capture the conceptual design of a database
– It represents the logical structure of a database

Three basic elements in an ER Model:
– Entity types
– Relationship types
– Attributes

We use UML class notation for the ER model
– Only the first two sections are used


Entity Type

Entity type
– Defines a group of objects with the same properties or attributes.
– Examples: EMPLOYEE, CUSTOMER, SUPPLIER entity types.

Entity occurrence (or entity instance)
– Uniquely identifiable object of an entity type.
– Examples: Employee John Doe, Customer Mike Jordan, Supplier Office Depot.

Entity set is the set of entity occurrences.
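
The distinction between entity type, entity occurrence, and entity set can be sketched in Python. This is only an illustration: the `Employee` class and its `emp_id` attribute are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass

# Entity type: the class describes the properties every EMPLOYEE shares.
# frozen=True makes occurrences hashable so they can be stored in a set.
@dataclass(frozen=True)
class Employee:
    emp_id: int   # hypothetical identifying attribute
    name: str

# Entity occurrences: individual, uniquely identifiable objects.
john = Employee(emp_id=1, name="John Doe")
jane = Employee(emp_id=2, name="Jane Roe")

# Entity set: the collection of occurrences that currently exist.
# Adding an occurrence with identical attribute values does not create a new member.
employees = {john, jane, Employee(emp_id=1, name="John Doe")}
```

Here `employees` ends up with two members, because the third object is indistinguishable from `john`.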


ER diagram of Staff and Branch entity types

An entity type is displayed in a rectangular box.


Relationship Types

Relationship type
– Defines a set of meaningful associations among entity types.
– Examples: A Branch HAS some Staff, an Employee WORKS ON a Project.

Relationship occurrence
– Uniquely identifiable association, which includes one occurrence from each participating entity type.

Relationship set is the set of relationship occurrences (current state of a relationship).


Binary and ternary relationships

[Figure: a binary relationship and a ternary relationship]


Relationship Types

Recursive Relationship
– A relationship type where the same entity type participates more than once in different roles.
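
A recursive relationship can be sketched as follows. The `Supervises` relationship and the staff names are hypothetical, chosen only to show one entity type filling two different roles.

```python
from typing import NamedTuple

class Staff(NamedTuple):
    staff_no: str
    name: str

# Both participants are Staff; the field names record the two roles.
class Supervises(NamedTuple):
    supervisor: Staff
    supervisee: Staff

ann = Staff("S1", "Ann")
bob = Staff("S2", "Bob")
cat = Staff("S3", "Cat")

# Relationship set: the current relationship occurrences.
supervises = {Supervises(ann, bob), Supervises(ann, cat)}

def supervisees_of(person):
    """Names of the staff members that `person` supervises."""
    return sorted(s.supervisee.name for s in supervises if s.supervisor == person)
```

The same `Staff` object may appear as a supervisor in one occurrence and as a supervisee in another, which is what makes the relationship recursive.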


Attributes

Attribute
– Property of an entity or a relationship type.

Attribute Domain
– Set of allowable values for one or more attributes.

Simple Attribute (or single-valued Attribute)
– An entity has a single atomic value for the attribute.
– Example: Employee with SSN 123-45-6789

Composite Attribute
– Attribute composed of multiple components.
– Examples: Address, Name.
– It can be nested.


Attributes

Multi-valued Attribute
– Attribute that holds multiple values for each occurrence of an entity type.
– Example: Previous jobs for an employee.
– Identify them with a (*) after the attribute name

Derived Attribute
– Attribute that represents a value that is derivable from the value of a related attribute, or set of attributes, not necessarily in the same entity type.
– Example: Age derived from DOB and today’s date.
– Precede the attribute name with a ‘/’ character
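
A minimal sketch of a multi-valued and a derived attribute, assuming a hypothetical `Employee` class. The age is never stored; it is recomputed from the stored DOB and a supplied date.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Employee:
    name: str
    dob: date                                          # stored, single-valued
    previous_jobs: list = field(default_factory=list)  # multi-valued: previousJobs(*)

    def age(self, today: date) -> int:
        """Derived attribute /age: computed from dob and today's date."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

emp = Employee("John Doe", date(1980, 6, 15), ["Clerk", "Analyst"])
age_before_birthday = emp.age(date(2010, 6, 14))
age_on_birthday = emp.age(date(2010, 6, 15))
```

Keeping the derived value out of storage avoids the stored age going stale as time passes.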


Keys

Candidate Key
– Minimal set of attributes that uniquely identifies each occurrence of an entity type.

Primary Key
– Candidate key selected to uniquely identify each occurrence of an entity type.

Composite Key
– A candidate key that consists of two or more attributes.


ER diagram of Staff and Branch entities and their attributes


Relationship called Advertises with its own attributes


Structural Constraints

Main type of constraint on relationships is called multiplicity.
– Indicates how many instances of an entity type are related to an instance of another entity type.

Binary relationships are generally referred to as being:
– one-to-one (1:1)
– one-to-many (1:* or 1:M)
– many-to-many (*:* or M:N)
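
The three cases can be illustrated by inspecting a relationship set. Note the limits of this sketch: multiplicity is a constraint on every valid state of the database, while the hypothetical helper `observed_multiplicity` only reports what one sample state exhibits. The identifiers in the sample data are made up.

```python
from collections import Counter

def observed_multiplicity(pairs):
    """Classify a set of (left, right) relationship occurrences as 1:1, 1:*, *:1 or *:*."""
    pairs = set(pairs)
    rights_per_left = Counter(left for left, _ in pairs)
    lefts_per_right = Counter(right for _, right in pairs)
    left_side = "*" if any(n > 1 for n in lefts_per_right.values()) else "1"
    right_side = "*" if any(n > 1 for n in rights_per_left.values()) else "1"
    return f"{left_side}:{right_side}"

# Staff Manages Branch
manages = [("SG5", "B003"), ("SL21", "B005")]
# Staff Oversees PropertyForRent
oversees = [("SG37", "PG21"), ("SG37", "PG36"), ("SG14", "PG16")]
# Newspaper Advertises PropertyForRent
advertises = [("Daily", "PG21"), ("Daily", "PG36"), ("Post", "PG21")]
```

With this data the three relationship sets are reported as `1:1`, `1:*`, and `*:*` respectively.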


Multiplicity of Staff Manages Branch (1:1) relationship



Multiplicity of Staff Oversees PropertyForRent (1:*) relationship type



Multiplicity of Newspaper Advertises PropertyForRent (*:*) relationship



Multiplicity of ternary Registers relationship

A staff/branch value pair registers several clients



Example of a complete ER diagram

Student Registration System

[Figure: ER diagram with four entity types and three relationships]

Student: StudentID, Name, Address
Course: CourseCode, Name, CreditNum
Faculty: FacultyID, Name, Address
Section: SectionNum, ClassTime, Room

Relationships: enrolls, part of, teaches


Database

Shared collection of logically related data designed to meet the information needs of an organization.
– Data are known facts that can be recorded and have an implicit meaning.

Database Management System (DBMS)
– A software system designed to store and manage databases easily and efficiently.

Database Schema:
– The description of a database.
– Includes descriptions of the database structure, data types, and the constraints on the database.


Example of a database schema


Example of a database instance


The Relational Model

Proposed by Dr. E. Codd in the early 1970s
– He worked at IBM Research Lab

Model based on the notion of a relation (table)
It has a solid theoretical foundation
Easy to understand, powerful query languages
Most widely used model
– Vendors: IBM (DB2), Oracle (Oracle10g), Microsoft (SQL Server 2005), Sybase, Informix, etc.


Relational Model Terminology

A relation is a mathematical concept based on the ideas of sets.
– Informally, it is a table (relation) with columns (attributes) and rows (tuples).

A relation has a schema.
– Denoted by R(A1, A2, …, An)
– R is the name of the relation, Ai’s are its attributes

A relational database schema is the set of relation schemas, each with a distinct name, that belong to the same database.


Mathematical definition of a relation

The Cartesian product of n sets (D1, D2, . . ., Dn) is:
D1 × D2 × . . . × Dn = {<d1, d2, . . . , dn> | d1 ∈ D1, d2 ∈ D2, . . . , dn ∈ Dn}

Any set of n-tuples from this Cartesian product is a relation instance on the n sets.

Consider two sets, D1 & D2, where D1 = {2, 4} and D2 = {1, 3, 5}.

R = {<2, 1>, <4, 1>} is a relation.
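
The example can be checked with a few lines of Python, using `itertools.product` to enumerate the Cartesian product. The variable names are chosen here for illustration.

```python
from itertools import product

D1 = {2, 4}
D2 = {1, 3, 5}

# D1 x D2: every pair <d1, d2> with d1 in D1 and d2 in D2.
cartesian = set(product(D1, D2))

# Any subset of the Cartesian product is a relation on D1 and D2.
R = {(2, 1), (4, 1)}
is_relation = R <= cartesian

# A pair whose second component is not in D2 cannot belong to any such relation.
not_a_member = (2, 2) in cartesian
```

The product has 2 × 3 = 6 pairs, and `R` selects two of them.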



Characteristics of a relation

Rows contain data about an entity.
Columns contain data about attributes of the entity.
All entries in a column are of the same kind.
Each column has a unique name.
Cells of a table hold a single value.
The order of the columns is unimportant.
The order of the rows is unimportant.
Each row is distinct. There are no duplicate rows.
A special NULL value represents unknown values.


Relational Keys

Superkey (SK)
– An attribute, or a set of attributes (composite key), that uniquely identifies a tuple within a relation.

Candidate Key (CK)
– Superkey (K) such that no proper subset is a superkey within the relation.
– In each tuple of R, values of K uniquely identify that tuple (uniqueness).
– No proper subset of K has the uniqueness property (irreducibility).
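
Uniqueness and irreducibility can be tested against a sample relation instance, as sketched below. Keys are properties of the schema, so a check on one instance can refute a proposed key but cannot prove it. The helper names and the sample rows are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """Uniqueness: no two rows agree on every attribute in attrs."""
    projected = {tuple(row[a] for a in attrs) for row in rows}
    return len(projected) == len(rows)

def is_candidate_key(rows, attrs):
    """Irreducibility: a superkey none of whose non-empty proper subsets is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

staff = [
    {"staffNo": 1, "name": "Ann", "branchNo": "B1"},
    {"staffNo": 2, "name": "Bob", "branchNo": "B1"},
    {"staffNo": 3, "name": "Ann", "branchNo": "B2"},
]
```

On this instance `("staffNo", "name")` is a superkey but not a candidate key, because `("staffNo",)` alone already identifies every row.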


Relational Keys

Primary Key (PK)
– Candidate key selected to identify tuples uniquely within a relation.
– Primary keys are underlined in a relational DB schema.

Foreign Key (FK)
– Attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation.
– Displayed as a directed arc from the foreign key to the referenced table (or its primary key).

Surrogate Key (SuK: used as a primary key)
– A unique attribute introduced in the relation which is not a natural property of the entity the relation describes.
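
The following sketch shows a surrogate primary key and an enforced foreign key using Python's built-in `sqlite3` module. The table layout, column names, and sample values are assumptions made for illustration; SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY, city TEXT)")
conn.execute("""
    CREATE TABLE Staff (
        staffId  INTEGER PRIMARY KEY,               -- surrogate key, assigned by SQLite
        name     TEXT NOT NULL,
        branchNo TEXT REFERENCES Branch(branchNo)   -- foreign key
    )
""")

conn.execute("INSERT INTO Branch VALUES ('B003', 'Glasgow')")
conn.execute("INSERT INTO Staff (name, branchNo) VALUES ('Ann', 'B003')")

# A staff row that points at a branch that does not exist is rejected.
try:
    conn.execute("INSERT INTO Staff (name, branchNo) VALUES ('Bob', 'B999')")
    fk_violation = False
except sqlite3.IntegrityError:
    fk_violation = True

staff_ids = conn.execute("SELECT staffId FROM Staff").fetchall()
```

Only the valid row survives, and its `staffId` was generated rather than supplied.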


Example of COMPANY Database Schema


Relational Integrity Constraints

Integrity constraints (IC)
– Conditions that must be true for any instance of the database.
– Defined during creation of the database.
– Checked when relations are modified.

Null
– Represents a value for an attribute that is currently unknown or not applicable for a tuple.
– Deals with incomplete or exceptional data.
– Represents the absence of a value and is not the same as zero or spaces, which are values.
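
A short `sqlite3` sketch of both points: a constraint declared at creation time is checked when the relation is modified, and NULL behaves differently from zero. The `Staff` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        staffNo INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,   -- integrity constraint defined at creation
        salary  REAL             -- may be NULL (unknown)
    )
""")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [(1, "Ann", 0.0), (2, "Bob", None)],  # None is stored as NULL
)

zero_salary = conn.execute("SELECT name FROM Staff WHERE salary = 0").fetchall()
null_salary = conn.execute("SELECT name FROM Staff WHERE salary IS NULL").fetchall()
# Comparing with NULL using '=' is never true, so this matches no rows.
equals_null = conn.execute("SELECT name FROM Staff WHERE salary = NULL").fetchall()

# The NOT NULL constraint is checked when the relation is modified.
try:
    conn.execute("INSERT INTO Staff VALUES (3, NULL, 100.0)")
    constraint_violated = False
except sqlite3.IntegrityError:
    constraint_violated = True
```

Ann has a known salary of zero, while Bob's salary is unknown; the two are retrieved by different predicates.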


Mapping ERD to RDB Schema

Map each entity to a DB relation

Map each binary M:M relationship to a DB relation
– New relation includes the PK of each related entity
– These PKs act as FKs to the “parent” relations
– PK of the new relation is usually these two FKs

Map each binary 1:M relationship
– Copy the PK of the “1-side” entity (“parent”) into the “M-side” entity (“child”) of the relationship (acting as FK)

Map each binary 1:1 relationship
– Copy the PK of one of the entities into the entity (“child”) that participates the most in the relationship
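
The 1:M and M:M rules can be sketched with `sqlite3`, reusing the Staff Oversees PropertyForRent and Newspaper Advertises PropertyForRent relationships from the earlier slides. The column names and sample values are assumptions; the slides do not give the attributes of these entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Staff     (staffNo TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Newspaper (newspaperName TEXT PRIMARY KEY);

    -- 1:M Oversees: the PK of the 1-side (Staff) is copied into the M-side as an FK.
    CREATE TABLE PropertyForRent (
        propertyNo TEXT PRIMARY KEY,
        staffNo    TEXT REFERENCES Staff(staffNo)
    );

    -- M:M Advertises: a new relation whose PK is the pair of FKs.
    CREATE TABLE Advertises (
        newspaperName TEXT REFERENCES Newspaper(newspaperName),
        propertyNo    TEXT REFERENCES PropertyForRent(propertyNo),
        PRIMARY KEY (newspaperName, propertyNo)
    );
""")

conn.execute("INSERT INTO Staff VALUES ('SG37', 'Ann')")
conn.execute("INSERT INTO Newspaper VALUES ('Daily News')")
conn.execute("INSERT INTO PropertyForRent VALUES ('PG21', 'SG37')")
conn.execute("INSERT INTO PropertyForRent VALUES ('PG36', 'SG37')")
conn.execute("INSERT INTO Advertises VALUES ('Daily News', 'PG21')")

overseen = [row[0] for row in conn.execute(
    "SELECT propertyNo FROM PropertyForRent WHERE staffNo = 'SG37' ORDER BY propertyNo"
)]

# The composite PK rejects a repeated (newspaper, property) pair.
try:
    conn.execute("INSERT INTO Advertises VALUES ('Daily News', 'PG21')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

One staff member can appear in many `PropertyForRent` rows, while each property row holds exactly one `staffNo`, which is the 1:M constraint expressed in the schema.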


Mapping ERD to RDB Schema (cont.)

Map more complex relationships to a DB relation
– New relation includes the PK of each related entity
– These PKs become FKs in the new relation

Relationships with their own attributes
– Place them in the new relation or in the “child” relation

Mapping multivalued attributes (MVAs)
– For each MVA, create a new DB relation
– Place a copy of the PK of the owner entity into the new relation (acting as a FK)
– Place the MVA in the new relation
– PK of the new relation is these two attributes


Mapping Example: ERD

[Figure: ERD with the relationships schedAt, applies, reserves, and worksAt]


Mapping result: RDB schema

Student(studID, studName, studPhone, studEmail, studURL, major, minor, gpa)

Company(compID, compName, compURL)

LaborDeptPosition(posID, title, description)

TimeBlock(blockID, date, startTime, endTime)

Opening(openID, salary, tentStartDate, specificDesc, posID, compID)

FK: posID references LaborDeptPosition(posID)

FK: compID references Company(compID)

CityOpening(openID, city, state) // implements the multivalued attribute city in the Opening entity

FK: openID references Opening(openID)

Reservation(blockID, compID, room, building) // implements the M:M relationship called reserves

FK: blockID references TimeBlock(blockID)

FK: compID references Company(compID)

Interviewer(empID, empName, empPhone, empEmail, dateHired, compID)

FK: compID references Company(compID)

Interview(intviewID, blockID, time, room, building, empID)

FK: blockID references TimeBlock(blockID)

FK: empID references Interviewer(empID)

Application(studID, intviewID, openID, applDate) // implements the complex relationship called applies

FK: studID references Student(studID)

FK: intviewID references Interview(intviewID)

FK: openID references Opening(openID)
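
Part of this schema can be turned into executable DDL, as sketched below with `sqlite3`. Only the relation names, attribute names, and FKs come from the schema above; the column types, the composite PK chosen for CityOpening, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Company (compID INTEGER PRIMARY KEY, compName TEXT, compURL TEXT);
    CREATE TABLE LaborDeptPosition (posID INTEGER PRIMARY KEY, title TEXT, description TEXT);

    CREATE TABLE Opening (
        openID        INTEGER PRIMARY KEY,
        salary        REAL,
        tentStartDate TEXT,
        specificDesc  TEXT,
        posID         INTEGER REFERENCES LaborDeptPosition(posID),
        compID        INTEGER REFERENCES Company(compID)
    );

    -- Multivalued attribute city of Opening, mapped to its own relation.
    CREATE TABLE CityOpening (
        openID INTEGER REFERENCES Opening(openID),
        city   TEXT,
        state  TEXT,
        PRIMARY KEY (openID, city, state)   -- assumed: owner PK plus the MVA
    );
""")

# Made-up sample data.
conn.execute("INSERT INTO Company VALUES (1, 'Acme', 'http://example.com')")
conn.execute("INSERT INTO LaborDeptPosition VALUES (10, 'Analyst', 'Entry-level analyst')")
conn.execute("INSERT INTO Opening VALUES (100, 50000, '2010-06-01', 'Data team', 10, 1)")
conn.executemany(
    "INSERT INTO CityOpening VALUES (?, ?, ?)",
    [(100, "Austin", "TX"), (100, "Boston", "MA")],
)

cities = conn.execute("""
    SELECT c.compName, co.city
    FROM Opening AS o
    JOIN Company AS c ON c.compID = o.compID
    JOIN CityOpening AS co ON co.openID = o.openID
    ORDER BY co.city
""").fetchall()
```

A single opening is linked to several cities through `CityOpening`, and the FKs let a join recover which company each city belongs to.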
