it – som værktøj

48
IT – som værktøj Bent Thomsen Institut for Datalogi Aalborg Universitet

Upload: hayes-decker

Post on 31-Dec-2015

24 views

Category:

Documents


2 download

DESCRIPTION

IT – som værktøj. Bent Thomsen Institut for Datalogi Aalborg Universitet. Introduction to databases. Bent Thomsen. Who uses databases?. Everybody! Every time you use your credit card, you (indirectly) use a database When you book a flight you use a database - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: IT – som værktøj

IT – som værktøj

Bent Thomsen

Institut for Datalogi

Aalborg Universitet

Page 2: IT – som værktøj

Introduction to databases

Bent Thomsen

Page 3: IT – som værktøj

Who uses databases?

• Everybody!– Every time you use your credit card, you

(indirectly) use a database– When you book a flight you use a database– When you take out insurance or a mortgage,

you use a database– When you contact public services– When you buy something online

Page 4: IT – som værktøj

Classes of database users

• Workers on the scene– End user: The person who makes queries,

updates and extracts reports– Database administrator: The person

responsible for managing the database system– Database designer– Application programmer: Design and

implement transactions for end-users

Page 5: IT – som værktøj

Database users (cont.)

• Workers behind the scene– Database designers and implementers– Tool developers– Operators and maintenance personnel

• Run and maintain the hardware and software

Page 6: IT – som værktøj

Types of databases

• Large database systems– Oracle, DB2, Informix, Ingres, MS SQLServer– Huge amounts of data – high throughput

• Small(ish) database systems– Excell DataLists– MySQL– MS Access– Small companies, clubs, scientific data

Page 7: IT – som værktøj

Concepts of Database– A database is a collection of records that are organized for

a particular purpose – A student record book with addresses and grades– An inventory with software, books, and hardware

– A field is a basic fact (or data element).• Name, address,…

– A record is a set of fields.• A student, a book

– A table is a set of records.– A database consists of one or more tables.– A primary key is a field that identify a single record.

• In the primary key field, every record has a unique number. (Student ID Number)

Page 8: IT – som værktøj

Parts of a databaseRecord

Attribute/Field

Tables

• Records become “rows”• Attributes/fields become “columns”• Rules determine the relationship between the tables and tie the data together to form a database

Page 9: IT – som værktøj

Creating a database

• What information are we trying to store?• How do we describe the information?• Phone Book/Contact entries

– Name– Address– Company– Phone Number– URL/Web Page– Age– Height (in meters)– Birthday– When we added the entry

Page 10: IT – som værktøj

Data Types

• Binary– Database specific binary objects– Pictures, digital signatures, etc.

• Boolean– True/False values

• Character– Fixed width or variable size

• Numeric– Integer, Real (floating decimal point), Money

• Temporal– Time, Date, Timestamp

Page 11: IT – som værktøj

Phone Book/Contact Record

Name CharacterAddress CharacterCompany CharacterPhone Number CharacterURL/Web Page CharacterAge IntegerHeight Real (float)Birthday DateWhen we added the entry Timestamp

Page 12: IT – som værktøj

An example

ContactsName Company Address Phone1 Phone2 Phone3 ZipCode

Joe ABC 123 5532 2234 3211 12345

Jane XYZ 456 3421 14454

Chris PDQ 789 2341 6655 14423

Page 13: IT – som værktøj

Example (cont.)

Contacts

Id Name Company Address Phone ZipCode

1 Joe ABC 123 5532 12345

1 Joe ABC 123 2234 12345

1 Joe ABC 123 3211 12345

2 Jane XYZ 456 3421 14454

3 Chris PDQ 789 2341 14423

3 Chris PDQ 789 6655 14423

Could be a DataList table in Excell!

Page 14: IT – som værktøj

What is a relational database?

• Originally developed by E.F. Codd in 1970

• Organizes data into tables where each item is a row and the attributes of the item are in columns.

• Different from “flat file” databases because you can define “relationships” between items in different tables.

Page 15: IT – som værktøj

Normal forms

• E. F. Codd in 1972 wrote a paper on “Further Normalization of the Data Base Relational Model”

• Normal forms reduce the amount of redundancy and inconsistent dependency within databases.

• Codd proposed three normal forms and through the years two more have been added.

Summarized from Barry Wise’s article on Database Normalizationhttp://www.phpbuilder.com/columns/barry20000731.php3?page=1

Page 16: IT – som værktøj

The Zero Form• No rules have

been applied

• Where most people start (and stop)

• No room for growth

• Usually wastes space

ContactsName Company Address Phone1 Phone2 Phone3 ZipCode

Joe ABC 123 5532 2234 3211 12345

Jane XYZ 456 3421 14454

Chris PDQ 789 2341 6655 14423

Page 17: IT – som værktøj

First Normal Form• Eliminate repeating

columns in each table

• Create a separate table for each set of related data

• Identify each set of related data with a primary key

ContactsId Name Company Address Phone ZipCode

1 Joe ABC 123 5532 12345

1 Joe ABC 123 2234 12345

1 Joe ABC 123 3211 12345

2 Jane XYZ 456 3421 14454

3 Chris PDQ 789 2341 14423

3 Chris PDQ 789 6655 14423

Benefits: Now we can have infinite phone numbers or company addresses for each contact.

Drawback: Now we have to type in everything over and over again. This leads to inconsistency, redundancy and wasting space. Thus, the second normal form…

Page 18: IT – som værktøj

Second Normal Form

• Create separate tables for sets of values that apply to multiple records

• Relate these tables with a “foreign key”.

People

Id Name Company Address Zip

1 Joe ABC 123 12345

2 Jane XYZ 456 14454

3 Chris PDQ 789 14423

PhoneNumbers

PhoneID Id Phone

1 1 5532

2 1 2234

3 1 3211

4 2 3421

5 3 2341

6 3 6655

Page 19: IT – som værktøj

Third Normal Form

• Eliminate fields that do not depend on the primary key.

People

Id Name AddressID

1 Joe 1

2 Jane 2

3 Chris 3

Address

AddressID Company Address Zip

1 ABC 123 12345

2 XYZ 456 14454

3 PDQ 789 14423

PhoneNumbers

PhoneID Id Phone

1 1 5532

2 1 2234

3 1 3211

4 2 3421

5 3 2341

6 3 6655 Is this enough? Codd thought so…What about “many to many”?

Page 20: IT – som værktøj

Kinds of Relationships

• “One to One”– One row of a table matches exactly to another

• One person, one id number, one address

• “One to Many”– One row of a table matches many of another

• One person, many phone numbers

• “Many to Many”– One row may match many of another or many

rows match one row of another

Page 21: IT – som værktøj

Fourth Normal Form• In a “many to many” relationship, independent entities cannot be stored in the same table.

PhoneNumbersPhoneID Phone

1 5532

2 2234

3 3211

4 3421

5 2341

6 6655PhoneRelationsPhoneRelID Id PhoneID

1 1 1

2 1 2

3 1 3

4 2 4

5 3 5

6 3 6

People

Id Name AddressID

1 Joe 1

2 Jane 2

3 Chris 3

Address

AddressID Company Address Zip

1 ABC 123 12345

2 XYZ 456 14454

3 PDQ 789 14423

Page 22: IT – som værktøj

Fifth Normal Form

• The “very esoteric” one that is probably not required to get the most out of your database.

• “The original table must be reconstructed from the tables into which it has been broken down.”

• The rule ensures that you have not created any extraneous columns and all the tables are only as large as they need to be.

Page 23: IT – som værktøj

The Normal Forms

• First Form– Eliminate replicated data in tables– Create separate tables for each set of related data– Identify each set of related data with a primary key

• Second Form– Create separate tables for sets of values that apply to multiple records– Relate the tables with a foreign key

• Third Form– Eliminate fields that do not depend on the primary key

• Fourth Form– In many-to-many relationships, independent entities cannot be stored

in the same table

Page 24: IT – som værktøj

Why normalize?

• Increases the integrity of the data

• Reduces redundancy

• Improves efficiency

• Although normalization can be hard, it is worth it in the long run.

Page 25: IT – som værktøj

What do I need to remember?

• Keep normalization in mind.

• Don’t replicate data in a table.

• If you break the rules, know why you are breaking the rules and do it for a good reason.

Page 26: IT – som værktøj

All you need to know about SQL in 30 minutes (or less)

SQL = Structured Query Language

Page 27: IT – som værktøj

Basic SQL Commands

• Creating tables with CREATE

• Adding data with INSERT

• Viewing data with SELECT

• Removing data with DELETE

• Modifying data with UPDATE

• Destroying tables with DROP

Page 28: IT – som værktøj

Creating tables with CREATE

• Generic form

CREATE TABLE tablename (

column_name data_type attributes…,

column_name data_type attributes…,

)

• Table and column names can’t have spaces or be “reserved words” like TABLE, CREATE, etc.

Page 29: IT – som værktøj

Phone Book/Contact Record

Name CharacterAddress CharacterCompany CharacterPhone Number CharacterURL/Web Page CharacterAge IntegerHeight Real (float)Birthday DateWhen we added the entry Timestamp

Page 30: IT – som værktøj

Phone Book/Contact Table

CREATE TABLE contacts (Name VARCHAR(40),Address VARCHAR(60),Company VARCHAR(60),Phone VARCHAR(11),URL VARCHAR(80),Age INT,Height FLOAT,Birthday DATE,WhenEntered TIMESTAMP

);Plan your tables very carefully!

Once created, they are difficult to change!

Page 31: IT – som værktøj

Phone Book/Contact Table

CREATE TABLE contacts (ContactID INT PRIMARY KEY,Name VARCHAR(40),Address VARCHAR(60),Company VARCHAR(60),Phone VARCHAR(11),URL VARCHAR(80),Age INT,Height FLOAT,Birthday DATE,WhenEntered TIMESTAMP

);If you are going to use the relational nature of a database,don’t forget you need to have a unique way to access records!

There is a way to make the key automatically increment,so you don’t have to worry about which one is next.

Page 32: IT – som værktøj

Adding data with INSERT

• Generic Form

INSERT INTO tablename (column_name,…)

VALUES (value,…)

Page 33: IT – som værktøj

Inserting a record into ‘contacts’

INSERT INTO contacts (contactid,name,address,company,phone,url,age,height,birthday,whenentered)

VALUES

(1,‘Joe’,’123 Any St.’,’ABC’,

’800-555-1212’,‘http://abc.com’,30,1.9,’6/14/1972’,now());

Page 34: IT – som værktøj

Inserting a partial record

INSERT INTO contacts (contactid,name,phone)

VALUES (2,’Jane’,’212-555-1212’);

Page 35: IT – som værktøj

Viewing data with SELECT

• Generic FormSELECT column,… FROM table,…

WHERE condition GROUP BY group_by_expressionHAVING condition ORDER BY order_expression

• The most used command• Probably the most complicated also• If used improperly, can cause very long waits

because complex computations

Page 36: IT – som værktøj

A few simple SELECTs

• SELECT * FROM contacts;– Display all records in the ‘contacts’ table

• SELECT contactid,name FROM contacts;– Display only the record number and names

• SELECT DISTINCT url FROM contacts;– Display only one entry for every value of URL.

Page 37: IT – som værktøj

Refining selections with WHERE

• The WHERE “subclause” allows you to select records based on a condition.

• SELECT * FROM contactsWHERE age<10;

– Display records from contacts where age<10

• SELECT * FROM contactsWHERE age BETWEEN 18 AND 35;

– Display records where age is 18-35

Page 38: IT – som værktøj

Additional selections

• The “LIKE” condition– Allows you to look at strings that are alike

• SELECT * FROM contactsWHERE name LIKE ‘J%’;

– Display records where the name starts with ‘J’

• SELECT * FROM contactsWHERE url LIKE ‘%.com’;

– Display records where url ends in “.com”

Page 39: IT – som værktøj

Removing data with DELETE

• Generic Form

DELETE FROM table WHERE condition;

DELETE FROM contacts WHERE age<13;

Page 40: IT – som værktøj

Modifying data with UPDATE

• Generic Form

UPDATE table SET column=expression

WHERE condition;

UPDATE contacts SET company=‘AOL’

WHERE company=‘Time Warner’;

Page 41: IT – som værktøj

Destroying tables with DROP

• Generic Form

DROP TABLE tablename;

DROP TABLE contacts;

Page 42: IT – som værktøj

More about SELECT

Page 43: IT – som værktøj

“Normal Forms” and SELECT

• Good database design using the normal forms requires data to be separated into different tables

• SELECT allows us to join the data back together

• We can use “views” to create virtual tables

Page 44: IT – som værktøj

Joining together tables

• SELECT name,phone,zip FROM people, phonenumbers, address WHERE people.addressid=address.addressid AND people.id=phonenumbers.id;

People

Id Name AddressID

1 Joe 1

2 Jane 2

3 Chris 3

Address

AddressID Company Address Zip

1 ABC 123 12345

2 XYZ 456 14454

3 PDQ 789 14423

PhoneNumbers

PhoneID Id Phone

1 1 5532

2 1 2234

3 1 3211

4 2 3421

5 3 2341

6 3 6655

Page 45: IT – som værktøj

General form of SELECT/JOINSELECT columns,…

FROM left_table

join_type JOIN right_table ON condition;

SELECT name,phone FROM people

JOIN phonenumbers ON people.id=phonenumbers.id;

Page 46: IT – som værktøj

ORDER BY

• The “ORDER BY” clause allows you to sort the results returned by SELECT.

SELECT * FROM contactsORDER BY company;

SELECT * FROM contactsORDER BY company, name;

Page 47: IT – som værktøj

Views

• You can use “CREATE VIEW” to create a virtual table from a SELECT statement.

CREATE VIEW contactview AS

(SELECT name,phone,zip FROM

people,phonenumbers,address

WHERE people.id=phonenumbers.id AND people.addressid=address.addressid);

Page 48: IT – som værktøj

Finishing off

• SQL is a large language – You have seen some basic commands– Takes time and effort to learn– Mainly needed for powerusers– Lots of stuff we haven’t talked about

• E.g. security, multiple users, program accessing DB, …

– Many database systems now use graphical user interfaces instead of SQL directly