database - microsoft azureclasses.eastus.cloudapp.azure.com/~barr/classes/... · definition...

database Comp 205 advanced web programming 1


Post on 12-Jan-2020


Page 1: database - Microsoft Azureclasses.eastus.cloudapp.azure.com/~barr/classes/... · Definition Database - a collection of structured information for one or more specific purposes Relational

database

Comp 205 advanced web programming

Definition

Database - a collection of structured information for one or more specific purposes
– often represented by a cylinder in diagrams

Relational Database - information is stored in a set of related tables

Table – organizational structure used to represent an entity
– Columns are attributes of the entity, also called fields
  • each has a Name & Data Type
– Rows are instances of the entity, also called records
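As a minimal sketch of these terms, the following uses Python's built-in sqlite3 module with an in-memory database. The Song table and its column types are assumptions chosen for illustration (they mirror the music example used later in the deck); the slides do not prescribe a particular engine.

```python
import sqlite3

# In-memory database, nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Each column (attribute/field) has a name and a data type.
conn.execute("""
    CREATE TABLE Song (
        Name   TEXT,
        Artist TEXT,
        Album  TEXT,
        Year   INTEGER
    )
""")

# Each row (instance/record) is one song.
conn.execute(
    "INSERT INTO Song VALUES (?, ?, ?, ?)",
    ("Teenage Dream", "Katy Perry", "Teenage Dream", 2010),
)

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Song)")]
records = conn.execute("SELECT * FROM Song").fetchall()

print(columns)
print(records)
```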

Music Database - 1st Attempt

We can store our data in a delimited text file
– Represents a One-Table Solution
– Problems?

Name            Artist       Album                       Year
Teenage Dream   Katy Perry   Teenage Dream               2010
Viva la Vida    Coldplay     Death and All His Friends   2009
Stronger        Kanye West   Graduation                  2007

(columns are Attributes, rows are Instances)

Teenage Dream, Katy Perry, Teenage Dream, 2010
Viva la Vida, Coldplay, Death and All His Friends, 2009
Stronger, Kanye West, Graduation, 2007
…
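One possible answer to "Problems?" can be sketched in Python. The snippet below parses the delimited text with the standard csv module. The first three rows come from the slide; the "Firework" row is an added, illustrative row (not in the original) so that duplicated album data becomes visible.

```python
import csv
import io

# First three rows are from the slide; the last row is an added example.
DATA = """\
Teenage Dream, Katy Perry, Teenage Dream, 2010
Viva la Vida, Coldplay, Death and All His Friends, 2009
Stronger, Kanye West, Graduation, 2007
Firework, Katy Perry, Teenage Dream, 2010
"""

FIELDS = ["name", "artist", "album", "year"]


def load_songs(text):
    """Parse the delimited text into one dict per song (a one-table view)."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELDS, row)) for row in reader if row]


songs = load_songs(DATA)

# Album facts are repeated on every song line, so correcting the year of
# one album means editing every line that mentions it.
album_rows = [s for s in songs if s["album"] == "Teenage Dream"]
print(len(album_rows), "lines repeat the same album and year")
```

A song with several artists is a second problem: the single artist field would have to hold a list, which the flat file cannot express cleanly.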

Normal Forms

Normalization - process to develop a clean DB design

Normal Forms
– incremental set of DB designs
– increases the number of tables & attributes
– try to use the simplest form possible
– goal is to have the greatest access to all data with the fewest operations

There are three main reasons to normalize a database:

1. to minimize duplicate data,
2. to minimize or avoid data modification issues, and
3. to simplify queries.

Reasons for Normalization

The first thing to notice is that this table serves many purposes, including:

1. Identifying the organization’s salespeople
2. Listing the sales offices and phone numbers
3. Associating a salesperson with a sales office
4. Showing each salesperson’s customers

Reasons for Normalization

1. Insert Anomaly. We cannot record a new sales office until we also know the salesperson.
   a. In order to create the record, we need to provide a primary key. In our case this is the EmployeeID.

2. Update Anomaly. The same information is recorded in multiple rows.
   a. If the office number changes, then multiple updates need to be made across all rows.

3. Deletion Anomaly. Deletion of a row can cause more than one set of facts to be removed.
   a. If John Hunt retires, then deleting that row causes us to lose information about the New York office.
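The three anomalies can be reproduced in a small sketch with Python's sqlite3 module. The table layout below is an assumption (the original slide showed the table as an image), and only "John Hunt" and "New York" come from the slides; the other names and all phone numbers are invented. Here the insert anomaly is modelled with a NOT NULL constraint on SalesPerson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SalesStaffInformation (
        EmployeeID   INTEGER PRIMARY KEY,
        SalesPerson  TEXT NOT NULL,
        SalesOffice  TEXT NOT NULL,
        OfficeNumber TEXT NOT NULL
    )
""")
# Invented sample rows, except for John Hunt / New York.
conn.executemany(
    "INSERT INTO SalesStaffInformation VALUES (?, ?, ?, ?)",
    [
        (1, "John Hunt", "New York", "555-0100"),
        (2, "Example Person A", "Chicago", "555-0200"),
        (3, "Example Person B", "Chicago", "555-0200"),
    ],
)

# Insert anomaly: an office cannot be stored without a salesperson.
try:
    conn.execute(
        "INSERT INTO SalesStaffInformation (SalesOffice, OfficeNumber) "
        "VALUES ('Boston', '555-0300')"
    )
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True

# Update anomaly: one changed office number touches several rows.
cur = conn.execute(
    "UPDATE SalesStaffInformation SET OfficeNumber = '555-0299' "
    "WHERE SalesOffice = 'Chicago'"
)
rows_touched = cur.rowcount

# Deletion anomaly: removing John Hunt also removes the New York office.
conn.execute("DELETE FROM SalesStaffInformation WHERE SalesPerson = 'John Hunt'")
offices_left = [
    r[0] for r in conn.execute("SELECT DISTINCT SalesOffice FROM SalesStaffInformation")
]

print(insert_blocked, rows_touched, offices_left)
```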

First Normal Form (1NF)

1. All attributes are “single-valued”
2. All instances have a unique identifier

The repeating groups of columns now become separate rows in the Customer table, linked by the EmployeeID foreign key. A foreign key is a value which matches back to another table’s primary key.
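A rough sketch of that conversion, using Python's sqlite3 module: the column names Customer1 to Customer3, the customer names, and the simplified Customer key are assumptions for illustration, since the original table was shown only as an image.

```python
import sqlite3

# Hypothetical un-normalized rows with a repeating group of customer columns.
flat_rows = [
    {"EmployeeID": 1, "SalesPerson": "John Hunt",
     "Customer1": "Customer A", "Customer2": "Customer B", "Customer3": None},
    {"EmployeeID": 2, "SalesPerson": "Example Person",
     "Customer1": "Customer C", "Customer2": None, "Customer3": None},
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE SalesStaffInformation (
        EmployeeID  INTEGER PRIMARY KEY,
        SalesPerson TEXT NOT NULL
    );
    CREATE TABLE Customer (
        CustomerID   INTEGER PRIMARY KEY,
        EmployeeID   INTEGER NOT NULL REFERENCES SalesStaffInformation(EmployeeID),
        CustomerName TEXT NOT NULL
    );
""")

# Each filled customer column becomes its own row, linked by EmployeeID.
for row in flat_rows:
    conn.execute(
        "INSERT INTO SalesStaffInformation VALUES (?, ?)",
        (row["EmployeeID"], row["SalesPerson"]),
    )
    for col in ("Customer1", "Customer2", "Customer3"):
        if row[col] is not None:
            conn.execute(
                "INSERT INTO Customer (EmployeeID, CustomerName) VALUES (?, ?)",
                (row["EmployeeID"], row[col]),
            )

# Sorting by customer is now a plain ORDER BY.
customers_sorted = [
    r[0] for r in conn.execute("SELECT CustomerName FROM Customer ORDER BY CustomerName")
]
hunt_count = conn.execute(
    "SELECT COUNT(*) FROM Customer WHERE EmployeeID = 1"
).fetchone()[0]
print(customers_sorted, hunt_count)
```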

This design is superior to our original table in several ways:

The original design limited each SalesStaffInformation entry to three customers. In the new design, the number of customers associated with each entry is practically unlimited.

It was nearly impossible to sort the original data by Customer. Now, it is simple to sort customers.

The insert and deletion anomalies for Customer have been eliminated. You can delete all the customers for a SalesPerson without having to delete the entire SalesStaffInformation row.

First Normal Form (1NF)

1. All attributes are “single-valued”
2. All instances have a unique identifier

Does this 1NF work for our Music DB?
– No, collaborations between artists

Song (Name, Artist, Album, Year, Genre, RecordLabel)

Multiple Tables

Multiple-value attributes should be removed by adding multiple tables

Song (Name, Album, Year, Genre, RecordLabel)
Artist (Name, Country, CountryAbbr)

Unique Identifiers

We want a way to uniquely identify each song
– covers, remakes, songs with the same name

Solution: create an artificial ID for each instance in each table
– auto-incrementing integer

Song (ID, Name, Album, Year, Genre, RecordLabel)
Artist (ID, Name, Country, CountryAbbr)

Turnbull-CS205-Topic11
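A minimal sketch of an auto-incrementing ID, assuming SQLite through Python's sqlite3 module (other engines spell this differently). The duplicate song name is deliberate: the ID keeps the two rows distinguishable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# In SQLite, an INTEGER PRIMARY KEY column is filled in automatically
# when no value is supplied; AUTOINCREMENT additionally prevents reuse.
conn.execute("""
    CREATE TABLE Song (
        ID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Name TEXT NOT NULL
    )
""")

# Two songs may share a name (a cover or remake); no ID is passed in.
for name in ["Stronger", "Stronger", "Viva la Vida"]:
    conn.execute("INSERT INTO Song (Name) VALUES (?)", (name,))

rows = conn.execute("SELECT ID, Name FROM Song ORDER BY ID").fetchall()
print(rows)
```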

Relationships

Three types of relationships
– one-to-one - can usually merge the two tables
– one-to-many - most common
– many-to-many - most complex

What is the relationship between the song and artist tables?

Song (ID, Name, Album, Year, Genre, RecordLabel)
Artist (ID, Name, Country, CountryAbbr)

Song <-- M2M --> Artist

2nd Normal Form (2NF)

• Everything from 1NF
• Non-identifying attributes should be moved
• Idea: if the same value appears multiple times for an attribute, it should be another entity

All the non-key columns are dependent on the table’s primary key.

The primary key uniquely identifies each row in a table.

All columns must depend on the primary key: in order to find a particular value, such as the color of Kris’ hair, you would first have to know the primary key, such as an EmployeeID, to look up the answer.

Once you identify a table’s purpose, look at each of the table’s columns and ask yourself,

“Does this column serve to describe what the primary key identifies?”

If you answer “yes,” then the column is dependent on the primary key and belongs in the table.

If you answer “no,” then the column should be moved to a different table.

When a table is in second normal form, it has a single purpose, such as storing employee information.

The first issue is that the SalesStaffInformation table has two columns which aren’t dependent on the EmployeeID.

The second issue is that there are several attributes which don’t completely rely on the entire Customer table primary key.

Since the columns identified in red aren’t completely dependent on the table’s primary key, they belong elsewhere. In both cases, the columns are moved to new tables.

In the case of SalesOffice and OfficeNumber, a SalesOffice table was created. A foreign key was then added to SalesStaffInformation so we can still describe in which office a salesperson is based.

The changes to make Customer a second normal form table are trickier.

Rather than move the offending columns CustomerName, CustomerCity, and CustomerPostalCode to a new table, recognize that the issue is EmployeeID! The three columns don’t depend on this part of the key.

So remove EmployeeID from the table.

Now create a table named SalesStaffCustomer to describe which customers a salesperson calls upon.

This table has two columns, CustomerID and EmployeeID.

Together, they form a primary key.

Separately, they are foreign keys to the Customer and SalesStaffInformation tables respectively.
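The resulting 2NF layout might be sketched as follows with Python's sqlite3 module. Table and column names follow the slides where they are given; SalesOfficeID and all sample values other than "John Hunt" and "New York" are assumptions. The two failed inserts show what the composite primary key and the foreign keys each protect against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE SalesOffice (
        SalesOfficeID INTEGER PRIMARY KEY,
        SalesOffice   TEXT NOT NULL,
        OfficeNumber  TEXT NOT NULL
    );
    CREATE TABLE SalesStaffInformation (
        EmployeeID    INTEGER PRIMARY KEY,
        SalesPerson   TEXT NOT NULL,
        SalesOfficeID INTEGER REFERENCES SalesOffice(SalesOfficeID)
    );
    CREATE TABLE Customer (
        CustomerID         INTEGER PRIMARY KEY,
        CustomerName       TEXT NOT NULL,
        CustomerCity       TEXT,
        CustomerPostalCode TEXT
    );
    CREATE TABLE SalesStaffCustomer (
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
        EmployeeID INTEGER NOT NULL REFERENCES SalesStaffInformation(EmployeeID),
        PRIMARY KEY (CustomerID, EmployeeID)
    );
""")

conn.execute("INSERT INTO SalesOffice VALUES (1, 'New York', '555-0100')")
conn.execute("INSERT INTO SalesStaffInformation VALUES (1, 'John Hunt', 1)")
conn.execute("INSERT INTO Customer VALUES (10, 'Customer A', 'Example City', '00000')")
conn.execute("INSERT INTO SalesStaffCustomer VALUES (10, 1)")


def try_link(customer_id, employee_id):
    """Return True if the (customer, employee) link could be inserted."""
    try:
        conn.execute(
            "INSERT INTO SalesStaffCustomer VALUES (?, ?)",
            (customer_id, employee_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_link(10, 1)   # same pair again: violates the composite primary key
dangling_ok = try_link(10, 99)   # employee 99 does not exist: violates the foreign key
print(duplicate_ok, dangling_ok)
```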


You can now eliminate all the sales people, yet retain customer records. Also, if all the SalesOffices close, it doesn’t mean you have to delete the records containing sales people.

The SalesStaffCustomer table is all keys!

This type of table is called an intersection table. An intersection table is useful when you need to model a many-to-many relationship.

2nd Normal Form (2NF)

• Everything from 1NF
• Non-identifying attributes should be moved
• Idea: if the same value appears multiple times for an attribute, it should be another entity

Song (ID, Name, Album, Year, Genre)
Artist (ID, Name, Country, CountryAbbr)
Album (ID, Name, Year)
Genre (ID, Name)

Song <-- M2M --> Artist
Album <-- O2M --> Song
Genre <-- O2M --> Song

With the album and genre attributes moved out of Song:

Song (ID, Name)
Album (ID, Name, Year)
Genre (ID, Name)
Artist (ID, Name, Country, CountryAbbr)

Song <-- M2M --> Artist
Album <-- O2M --> Song
Genre <-- O2M --> Song

3rd Normal Form (3NF)

• Everything from 2NF
• No Attribute Dependencies
• Idea: don’t allow bad data entry to corrupt the DB

Song (ID, Name)
Album (ID, Name, Year)
Genre (ID, Name)
Artist (ID, Name, Country, CountryAbbr)
Country (ID, Name, Abbr)

Song <-- M2M --> Artist
Album <-- O2M --> Song
Genre <-- O2M --> Song

3rd Normal Form (3NF), with the country attributes moved out of Artist:

Song (ID, Name)
Album (ID, Name, Year)
Genre (ID, Name)
Artist (ID, Name)
Country (ID, Name, Abbr)

Song <-- M2M --> Artist
Album <-- O2M --> Song
Genre <-- O2M --> Song
Country <-- O2M --> Artist

Transitive dependence: a column’s value relies upon another column through a second intermediate column.

see https://www.essentialsql.com/get-ready-to-learn-sql-11-database-third-normal-form-explained-in-simple-english/
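In the music example, CountryAbbr depends on Country, which in turn depends on the artist, so it is a transitive dependence. The sketch below (Python's sqlite3 module, with illustrative sample rows and an assumed CountryID column on Artist) shows the effect of moving the country data into its own table: an abbreviation is stored once and corrected once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (
        ID   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        Abbr TEXT NOT NULL
    );
    CREATE TABLE Artist (
        ID        INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        CountryID INTEGER REFERENCES Country(ID)
    );
    INSERT INTO Country VALUES (1, 'United States', 'USA'), (2, 'United Kingdom', 'UK');
    INSERT INTO Artist VALUES (1, 'Katy Perry', 1), (2, 'Kanye West', 1), (3, 'Coldplay', 2);
""")

# One UPDATE changes the abbreviation for every artist from that country,
# so two artists can never disagree about it.
conn.execute("UPDATE Country SET Abbr = 'US' WHERE ID = 1")

rows = conn.execute("""
    SELECT Artist.Name, Country.Abbr
    FROM Artist
    JOIN Country ON Artist.CountryID = Country.ID
    ORDER BY Artist.ID
""").fetchall()
print(rows)
```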

Six Important Concepts

1. Entities are tables
2. Attributes or Fields are columns of tables
3. Each attribute has a data type (int, string, date)
4. Instances or Records are rows of a table
5. The unique ID for an instance is called the “primary key”
6. Relationships are encoded as “foreign keys”

Foreign Keys

For a one-to-many relationship, we add a “foreign key” to the “many” table.

Without the foreign key:

Song (ID, Name)
Album (ID, Name, Year)
Album <-- O2M --> Song

With the foreign key:

Song (ID, Name, AlbumID)
Album (ID, Name, Year)
Album <-- O2M --> Song
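As a hedged illustration of the one-to-many case, the sketch below stores AlbumID on Song (the "many" side) and joins back to Album, using Python's sqlite3 module. The "Firework" row is an added example that is not on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Album (
        ID   INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        Year INTEGER
    );
    CREATE TABLE Song (
        ID      INTEGER PRIMARY KEY,
        Name    TEXT NOT NULL,
        AlbumID INTEGER REFERENCES Album(ID)  -- foreign key on the "many" side
    );
    INSERT INTO Album VALUES (1, 'Teenage Dream', 2010), (2, 'Graduation', 2007);
    INSERT INTO Song VALUES (1, 'Teenage Dream', 1), (2, 'Firework', 1), (3, 'Stronger', 2);
""")


def songs_on_album(album_name):
    """Return the names of all songs whose AlbumID points at the given album."""
    rows = conn.execute(
        """
        SELECT Song.Name
        FROM Song
        JOIN Album ON Song.AlbumID = Album.ID
        WHERE Album.Name = ?
        ORDER BY Song.ID
        """,
        (album_name,),
    )
    return [r[0] for r in rows]


print(songs_on_album("Teenage Dream"))
print(songs_on_album("Graduation"))
```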

Many-To-Many

We can implement M2M by adding “join tables”
– sometimes called junctions
– Idea: M2M ≈ M2O + O2M

Song (ID, Name) <-- M2M --> Artist (ID, Name)

becomes

Song (ID, Name) <-- O2M --> SongToArtist (ID, SongID, ArtistID) <-- O2M --> Artist (ID, Name)
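A small sketch of the join table in use, again with Python's sqlite3 module. "Example Duet" is a made-up collaboration added purely to show one song linked to two artists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Song   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Artist (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE SongToArtist (
        ID       INTEGER PRIMARY KEY,
        SongID   INTEGER NOT NULL REFERENCES Song(ID),
        ArtistID INTEGER NOT NULL REFERENCES Artist(ID)
    );
    INSERT INTO Song   VALUES (1, 'Stronger'), (2, 'Example Duet');
    INSERT INTO Artist VALUES (1, 'Kanye West'), (2, 'Katy Perry');
    -- one row per (song, artist) pair
    INSERT INTO SongToArtist (SongID, ArtistID) VALUES (1, 1), (2, 1), (2, 2);
""")


def artists_for(song_name):
    """All artists credited on a song, found by walking through the join table."""
    rows = conn.execute(
        """
        SELECT Artist.Name
        FROM Artist
        JOIN SongToArtist ON SongToArtist.ArtistID = Artist.ID
        JOIN Song ON SongToArtist.SongID = Song.ID
        WHERE Song.Name = ?
        ORDER BY Artist.Name
        """,
        (song_name,),
    )
    return [r[0] for r in rows]


def songs_for(artist_name):
    """All songs an artist is credited on, using the same join table in reverse."""
    rows = conn.execute(
        """
        SELECT Song.Name
        FROM Song
        JOIN SongToArtist ON SongToArtist.SongID = Song.ID
        JOIN Artist ON SongToArtist.ArtistID = Artist.ID
        WHERE Artist.Name = ?
        ORDER BY Song.Name
        """,
        (artist_name,),
    )
    return [r[0] for r in rows]


print(artists_for("Example Duet"))
print(songs_for("Kanye West"))
```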

Putting it all together

Album (ID, Name, Year)
Song (ID, Name, AlbumID, GenreID)
Genre (ID, Name)
Artist (ID, Name, CountryID)
Country (ID, Name, Abbr)
SongToArtist (ID, SongID, ArtistID)

Summary: DB Schema Creation Algorithm

1. Identify Major Entities
   – draw a box for each table

2. Figure out attributes for each entity
   – add integer id
   – name & data type

3. Figure out the relationship between each pair of entities
   – O2O – combine entities
   – O2M – add foreign key to the “many” table
   – M2M – create a new join table

Exercise

Design a database schema for keeping track of class rosters (e.g., Homer):

Hints:
– Consider students, courses and professors
– Assume each course has at most one professor

Next time

We will introduce you to SQL
– Structured Query Language
– Designed to directly encode semantics of DB
  • “Select all songs by Kanye West from 2007”
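As a preview, the quoted request could be expressed roughly as follows against the six-table music schema from "Putting it all together". This is a sketch using Python's sqlite3 module; the sample rows and genre names are illustrative, and the year is taken from the song's album because Year lives on the Album table in that schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Country (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Abbr TEXT NOT NULL);
CREATE TABLE Genre   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Album   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Year INTEGER);
CREATE TABLE Artist  (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                      CountryID INTEGER REFERENCES Country(ID));
CREATE TABLE Song    (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                      AlbumID INTEGER REFERENCES Album(ID),
                      GenreID INTEGER REFERENCES Genre(ID));
CREATE TABLE SongToArtist (ID INTEGER PRIMARY KEY,
                      SongID   INTEGER NOT NULL REFERENCES Song(ID),
                      ArtistID INTEGER NOT NULL REFERENCES Artist(ID));
"""

# Illustrative rows only; the genre names in particular are invented.
DATA = """
INSERT INTO Country VALUES (1, 'United States', 'US');
INSERT INTO Genre   VALUES (1, 'Genre A'), (2, 'Genre B');
INSERT INTO Album   VALUES (1, 'Graduation', 2007), (2, 'Teenage Dream', 2010);
INSERT INTO Artist  VALUES (1, 'Kanye West', 1), (2, 'Katy Perry', 1);
INSERT INTO Song    VALUES (1, 'Stronger', 1, 1), (2, 'Teenage Dream', 2, 2);
INSERT INTO SongToArtist (SongID, ArtistID) VALUES (1, 1), (2, 2);
"""

# "Select all songs by Kanye West from 2007"
QUERY = """
SELECT Song.Name
FROM Song
JOIN Album        ON Song.AlbumID = Album.ID
JOIN SongToArtist ON SongToArtist.SongID = Song.ID
JOIN Artist       ON SongToArtist.ArtistID = Artist.ID
WHERE Artist.Name = ? AND Album.Year = ?
ORDER BY Song.Name
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + DATA)

result = [r[0] for r in conn.execute(QUERY, ("Kanye West", 2007))]
print(result)
```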


sql

COMP 205 advanced web programming


Pop Quiz

Design a database schema for keeping track of class rosters (e.g., Homer):

Hints:
– Consider students, courses and professors
– Assume each course has at most one professor

IC Database Schema

Student (id, firstname, lastname, gpa, email, major)
Course (id, courseNumber, daysTime, room, instructorID)
StudentToCourse (id, studentID, courseID)
Instructor (id, firstname, lastname, email)

Rules:
1) Tables are Capitalized Camelback
2) Properties are (lowercase) Camelback
3) First Attribute is always the “id”
4) Join Tables are called “Table1ToTable2”
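The schema and naming rules above could be written out as follows. This is a sketch using Python's sqlite3 module; the column data types and every sample row (student names, room, meeting time) are assumptions, since the slide lists only table and attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Instructor (
        id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, email TEXT
    );
    CREATE TABLE Student (
        id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT,
        gpa REAL, email TEXT, major TEXT
    );
    CREATE TABLE Course (
        id INTEGER PRIMARY KEY, courseNumber TEXT, daysTime TEXT, room TEXT,
        instructorID INTEGER REFERENCES Instructor(id)  -- at most one professor per course
    );
    CREATE TABLE StudentToCourse (
        id INTEGER PRIMARY KEY,
        studentID INTEGER NOT NULL REFERENCES Student(id),
        courseID  INTEGER NOT NULL REFERENCES Course(id)
    );

    -- Invented sample data.
    INSERT INTO Instructor VALUES (1, 'Example', 'Instructor', 'instructor@example.edu');
    INSERT INTO Student VALUES
        (1, 'Ada', 'Alpha', 3.5, 'ada@example.edu', 'Computer Science'),
        (2, 'Ben', 'Beta', 3.2, 'ben@example.edu', 'Mathematics');
    INSERT INTO Course VALUES (1, 'COMP 205', 'MWF 10:00', 'Room 101', 1);
    INSERT INTO StudentToCourse (studentID, courseID) VALUES (1, 1), (2, 1);
""")


def roster(course_number):
    """Return the full names of students enrolled in the given course."""
    rows = conn.execute(
        """
        SELECT Student.firstname || ' ' || Student.lastname
        FROM Student
        JOIN StudentToCourse ON StudentToCourse.studentID = Student.id
        JOIN Course ON StudentToCourse.courseID = Course.id
        WHERE Course.courseNumber = ?
        ORDER BY Student.lastname
        """,
        (course_number,),
    )
    return [r[0] for r in rows]


print(roster("COMP 205"))
```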

Why Databases?

Make it easy to relate, store, and retrieve data

[Figure: a client exchanges requests and responses with a server; the server comprises the web server, the server-side program, and the database]

SQL

Structured Query Language

standard for most DBs
– mysql, sqlite3, postgres

Uses:
– create database “schema”
– insert, update, delete data
– “query” the database for information
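The uses listed above map onto a handful of SQL statements. A minimal sketch, run here through Python's sqlite3 module against an in-memory database with an illustrative single-column Song table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# create database "schema"
conn.execute("CREATE TABLE Song (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")

# insert data (with a deliberate typo to fix next)
conn.execute("INSERT INTO Song (Name) VALUES (?)", ("Strong",))

# update data
conn.execute("UPDATE Song SET Name = ? WHERE Name = ?", ("Stronger", "Strong"))

# "query" the database
after_update = conn.execute("SELECT Name FROM Song").fetchall()

# delete data
conn.execute("DELETE FROM Song WHERE Name = ?", ("Stronger",))
after_delete = conn.execute("SELECT COUNT(*) FROM Song").fetchone()[0]

print(after_update, after_delete)
```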


Learn by doing…
