

Information Management DIG 3563

Lecture 5: Databases

J. Michael Moshell

University of Central Florida

Original image* by Moshell et al.

Imagery is from Wikimedia except where marked with *. Licensing is listed.

Why is a Database not just a File System?

• Data needs to be searchable

... and sortable and mergeable

• Dumb techniques lead to slow results.

• Example: Sorting n documents.

• A story about nametags ....

(flashback to Lecture on Sorting)

Linear search vs. binary search on a sorted index (A to Z): each comparison cuts the search space in half, so binary search takes O(log2 k) steps for k items.

Linear Search     Binary Search
1000 items        10 steps
1 million items   20 steps
1 billion items   30 steps
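The step counts above can be checked with a small sketch. This is only an illustration, not code from the lecture: the function name binary_search_steps is hypothetical, and a Python range stands in for a sorted index.

```python
def binary_search_steps(sorted_items, target):
    """Return (index, steps); index is None if target is absent."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1                      # one comparison round
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1               # discard the lower half
        else:
            high = mid - 1              # discard the upper half
    return None, steps


for n in (1000, 1_000_000):
    # range(n) is already sorted and supports len() and indexing
    _, steps = binary_search_steps(range(n), n - 1)
    print(n, "items:", steps, "steps")
```

Searching for the last item is close to the worst case, so the printed counts should stay near the 10 and 20 steps quoted on the slide.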

Sorting 1000 nametags by BFI (brute force and ignorance)

... takes about (1000)^2 = 1 million operations.

Sorting them INTELLIGENTLY takes about (1000) log2 1000 = 1000 * 10 = 10,000 operations.

How? Divide into small piles; sort. Then MERGE them.

Similar well-thought-out techniques are used in databases to greatly speed up searching and inserting, compared to directly manipulating simple data files.
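To make the nametag comparison concrete, here is a rough sketch that counts comparisons for a brute-force sort and for a divide-and-merge sort. The choice of selection sort as the "BFI" method is an assumption (the slide does not name one), and the function names are hypothetical.

```python
import random


def selection_sort(items):
    """Brute-force sort; returns (sorted list, number of comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


def merge_sort(items):
    """Split into piles, sort each pile, then merge; returns (sorted list, comparisons)."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, c_left = merge_sort(items[:mid])
    right, c_right = merge_sort(items[mid:])
    merged, comparisons = [], c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged, comparisons


random.seed(0)
nametags = [random.random() for _ in range(1000)]
brute_sorted, brute = selection_sort(nametags)
smart_sorted, smart = merge_sort(nametags)
print("brute force:", brute, "comparisons")
print("merge sort: ", smart, "comparisons")
```

Selection sort makes about n^2/2 comparisons, the same order of growth as the slide's n^2 estimate, while the merge sort stays below n log2 n.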

History of Databases

First generation (1950-1980): Hierarchical Databases

(Don't worry about how they worked.)

Second generation (1980-present): Relational Databases

We will learn a lot about these, today.

Third generation (1990-present): Object Oriented Databases

You might encounter one someday, in a specialized use.

We will say nothing more about them in this course.

A relational DB consists of tables:

Database: "winestore"

Table: "winery"

Winery ID   Winery Name      Address         Region ID    <Meaning of the field
wineryID    wineryName       address         regionID     <Field name
1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1

Table: "region"

Region ID   Region Name      State                <Meaning of the field
regionID    regionName       state                <Field name
1           Barossa Valley   South Australia
2           Yarra Valley     Victoria
3           Margaret River   Western Australia

One record, or "tuple" (a ROW)

One field, or "attribute" (a COLUMN)

The value of an attribute is selected from a DOMAIN (e.g. integers; strings; e-mail addresses).

A DOMAIN is the set of allowable contents for an attribute or field in the DB.
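One way to see a domain at work is to let the database refuse a value outside it. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module rather than the MySQL server used in the lecture, and the CHECK constraint is just one possible way to express "regionID must be an integer".

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("""
    CREATE TABLE winery (
        wineryID   INTEGER PRIMARY KEY,
        wineryName TEXT NOT NULL,
        address    TEXT,
        regionID   INTEGER CHECK (typeof(regionID) = 'integer')
    )
""")

# A value inside the domain is accepted.
conn.execute("INSERT INTO winery VALUES (1, 'Moss Brothers', 'Smith Rd.', 3)")

# A value outside the domain (text instead of an integer) is refused.
try:
    conn.execute("INSERT INTO winery VALUES (2, 'Hardy Brothers', 'Jones St.', 'one')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("out-of-domain value rejected:", rejected)
```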

Queries: How we interact with a DB:

Example query:

SELECT wineryName FROM winery WHERE regionID=3

Result: Moss Brothers

Practice with MySQL Queries

Example query:

SELECT wineryName FROM winery WHERE regionID=3

Task: Make a query to find the names of wineries in region 2. Write your query on paper and show it to your neighbor. (Also write the result!)

Answer:

SELECT wineryName FROM winery WHERE regionID=2

Result: Lindemans

Task: Find the names of the wineries in region 1.

SELECT wineryName FROM winery WHERE regionID=1

But ... it will return MULTIPLE ROWS! <<This is GOOD>>
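The region 1 query can be tried out with a small script. This is a sketch under assumptions: it uses Python's sqlite3 module in place of MySQL, loads the five winery rows from the table shown earlier, and adds an ORDER BY clause (not in the slide's query) so the row order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery (wineryID INTEGER, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)

# Each matching record comes back as its own row.
rows = conn.execute(
    "SELECT wineryName FROM winery WHERE regionID=1 ORDER BY wineryID"
).fetchall()
names = [row[0] for row in rows]
print(names)
```

Three wineries share regionID 1, so the query returns three rows rather than one.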

Task: Find the name of the State where Yarra Valley is located.

SELECT state FROM region WHERE regionName="Yarra Valley"

Your next query practice

1) Think up a question, based on ONE of the tables.

2) Write it down in English. Example:

"What is Penfolds' address?"

3) Hand it to your neighbor.

4) Write down the answer to the question you just received.

5) Check your neighbor's answer.

LET ME KNOW when you have a correct answer.

Practice with MySQL Queries

Task: Make up a question; give it to your neighbor; design a query to answer the question you received. Check your neighbor's answer.

Example:

SELECT address FROM winery WHERE wineryName='Penfolds'

More things queries can do:

Example:

SELECT wineryName,address FROM winery WHERE regionID=3

Result: Moss Brothers, Smith Rd.

Example:

SELECT * FROM winery WHERE regionID=3

Result: 1, Moss Brothers, Smith Rd., 3

Example:

SELECT * FROM winery WHERE (regionID=3) OR (regionID=2)

Result:

1, Moss Brothers, Smith Rd., 3

4, Lindemans, Smith Ave., 2

Example:

SELECT * FROM winery

Prints out the entire table.
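The query variations in this part of the lecture (several columns, all columns with *, an OR condition, no WHERE clause at all) can be run side by side. As before, this is only a sketch: sqlite3 stands in for MySQL, the helper name run is hypothetical, and ORDER BY is added where more than one row is expected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery (wineryID INTEGER, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)


def run(sql):
    """Hypothetical helper: execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


two_columns = run("SELECT wineryName, address FROM winery WHERE regionID=3")
either_region = run(
    "SELECT * FROM winery WHERE (regionID=3) OR (regionID=2) ORDER BY wineryID"
)
whole_table = run("SELECT * FROM winery ORDER BY wineryID")

print(two_columns)
print(either_region)
print(len(whole_table), "rows in the whole table")
```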

Joins: using two tables at once

Problem: I want the NAME of the REGION where Moss Brothers is.
- but: winery names are in the winery table, and
- region names are in the region table.

How can we join them together?

Dumb way:

(1) Look up Moss Brothers; it's in region 3.

(2) Look up region 3; its name is Margaret River.

(Dumb? It works!) But the computer can do the joining.

Smart way: A JOIN query. Ask the computer to hook the tables together.

Here's how to make a Join query. (Note: there are other ways … this is my favorite.)

Problem: I want the NAME of the REGION where Moss Brothers is.

SELECT regionName FROM winery, region
WHERE winery.wineryName="Moss Brothers"
AND winery.regionID = region.regionID
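The join for the Moss Brothers problem can be tried against both tables. This is a sketch under assumptions: sqlite3 stands in for MySQL, the rows are the ones from the tables shown earlier, and the field is taken to be wineryName as in the winery table. The second query uses the explicit JOIN ... ON spelling, one of the "other ways" to write the same join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery (wineryID INTEGER, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.execute("CREATE TABLE region (regionID INTEGER, regionName TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)
conn.executemany(
    "INSERT INTO region VALUES (?, ?, ?)",
    [
        (1, "Barossa Valley", "South Australia"),
        (2, "Yarra Valley", "Victoria"),
        (3, "Margaret River", "Western Australia"),
    ],
)

# Comma-style join: list both tables, then match them on regionID.
comma_join = conn.execute(
    """
    SELECT regionName FROM winery, region
    WHERE winery.wineryName = 'Moss Brothers'
      AND winery.regionID = region.regionID
    """
).fetchall()

# The same join written with explicit JOIN ... ON syntax.
explicit_join = conn.execute(
    """
    SELECT regionName FROM winery
    JOIN region ON winery.regionID = region.regionID
    WHERE winery.wineryName = 'Moss Brothers'
    """
).fetchall()

print(comma_join, explicit_join)
```

Both spellings give the same answer as the two-step manual lookup: Margaret River.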

Homework Problems (mostly using Joins)

1) Produce the names of the wineries in Barossa Valley

2) Produce the winery addresses in Yarra Valley

3) Produce the address and state of Lindemans

4) Produce a list of all the states producing wine.

Exam warning

There WILL be questions about relational database tables and queries on the Midterm Exam! Practice these concepts!

Now ... how does it WORK?

How can a dumb computer look something up in a table? E.g.

SELECT wineryID FROM winery WHERE wineryName='Penfolds'

Input: wineryName='Penfolds'. Output: wineryID=3

We study 3 ways:

1) sequential search

2) Indexed search

3) Hash based search

Winery ID   Winery Name      Address         Region ID    <Meaning of the field
wineryID    wineryName       address         regionID     <Field name
1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1
6           Johnson's        South Street    4
etc.

3000 records, let's say.

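Sequential search: BFI

$found = false;
for ($w = 1; $w <= 3000; $w++) {   // look at every record in turn
    if (wineryname($w) == $searchvalue) { $found = true; break; }   // stop at the first match
}
echo $found ? "report success" : "report failure";

Dumb, eh?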

Indexed search: build index tree

                    Lindemans:4
                   /           \
           Johnsons:6           Orlando:5
           /                    /        \
  Hardy Brothers:2          Moss:1     Penfolds:3

  etc etc

Build: O(n log2 n)

How would it actually be built? Don't worry about it ... you can imagine a BIG piece of paper, if you want, or a wall.

(We do it with Data Structures: arrays, lists...)

How to search a tree

Search: O(log2 n)

Looking for Moss: Start at the root. Is Moss > Lindemans? Yes.

So follow the right-hand branch. Is Moss > Orlando? No.

So follow the left-hand branch. Quickly find the "Moss node".
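The walk from the root down to the "Moss node" can be sketched as a small binary search tree. The class and function names below are hypothetical, the insertion order is chosen so that Lindemans ends up at the root as on the slide, and the winery IDs follow the winery table.

```python
class Node:
    """One index entry: a winery name plus the ID of its record."""

    def __init__(self, name, winery_id):
        self.name = name
        self.winery_id = winery_id
        self.left = None    # names that sort before this one
        self.right = None   # names that sort after this one


def insert(root, name, winery_id):
    """Insert an entry and return the (possibly new) root of the subtree."""
    if root is None:
        return Node(name, winery_id)
    if name > root.name:
        root.right = insert(root.right, name, winery_id)
    else:
        root.left = insert(root.left, name, winery_id)
    return root


def search(root, name):
    """Return (winery_id, names visited); winery_id is None if the name is absent."""
    path = []
    node = root
    while node is not None:
        path.append(node.name)
        if name == node.name:
            return node.winery_id, path
        node = node.right if name > node.name else node.left
    return None, path


root = None
for name, winery_id in [
    ("Lindemans", 4), ("Johnsons", 6), ("Orlando", 5),
    ("Hardy Brothers", 2), ("Moss", 1), ("Penfolds", 3),
]:
    root = insert(root, name, winery_id)

found_id, visited = search(root, "Moss")
print(found_id, visited)
```

The visited list shows the same route as the slide: right at Lindemans, left at Orlando, then Moss.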

How quickly? Each layer of the tree doubles the number of records. So 2 layers -> 4 records (on the "frontier"); 3 layers -> 8 records; 4 layers -> 16 records, etc.

So: indexed search is quite fast compared to sequential search.

To search an indexed database of n records takes O(log2 n) time. For 3000 records, log2 3000 is about 12, so a search needs roughly 12 comparisons instead of up to 3000.

Hash based search: the weirdest idea

example: h($s) = 242*val($s[1]) + 17*val($s[2]) ... etc

A "hash function" is a rule that takes a string as input (like "Michael") and computes a number. The function may look ridiculous or random, but it has special mathematical properties.

In the above example, val($s[1]) means the ASCII value of the first letter in the string variable $s="Michael".

So, h("Michael") yields a very different value from h("Mike"). This is important.
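A toy hash function can make the idea concrete. The slide's formula is elided after its first two terms, so the sketch below does not reproduce it; it swaps in a common polynomial hash, and the multiplier 31 and the table size 100003 are assumptions chosen only for illustration.

```python
def h(s, table_size=100_003):
    """Hypothetical string hash: mix in each character code, keep the result below table_size."""
    total = 0
    for ch in s:
        # ord(ch) is the character's code (its ASCII value for plain letters)
        total = (total * 31 + ord(ch)) % table_size
    return total


print(h("Michael"), h("Mike"))
```

The two names start with the same letters, yet their hash values differ, and the same input always produces the same number.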

Hash function:
• similar inputs, different outputs
• fast to compute

Johnsons -> 24612
Johnston -> 98307

Location   Contents
10001
10002
...
24612      Johnsons: 6
...
98307      Johnston: 91
...

To fetch the data, just compute the hash and look in that-numbered location.

Imagine a library with the books "all over the place!" But if you know the author's name, you can compute its shelf number by finding h($authorname).

Page 46: 1 Information Management DIG 3563 Lecture 5: Databases J. Michael Moshell University of Central Florida Original image* by Moshell et al. Imagery is fromWikimedia

-46 -

Hash based search: the wierdest idea

Johnsons

Hash function:•similar inputs, different outputs•fast to compute

24612

Johnston 98307

10001

10002

...

24612 Johnsons: 6

....

98307 Johnston: 91

....

To fetch the data, just compute the hashof the search term, and look there.

BUT what if two words COLLIDE? (that is, are mapped to same place?)

Wilsons 24612

example: h($s)

=242*val($s[1])+17*val($s[2]) ... etc


-47 -

Hash-based search: the weirdest idea

BUT what if two words COLLIDE?

Wilsons -> 24612 (already occupied by Johnsons)

Just move down to the first available space, and store Wilsons there!

Storage locations:
  10001
  10002
  ...
  24612   Johnsons: 6
  24613   Wilsons: 37
  ...
  98307   Johnston: 91
  ...
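The "move down to the first available space" rule is usually called linear probing. The Python sketch below shows one way it could look. The class name ProbingTable, the table size of 10, and the deliberately weak last_letter_hash are all assumptions made up for this example; the weak hash is used only so that Johnsons and Wilsons are guaranteed to collide.

```python
class ProbingTable:
    """A tiny hash table that resolves collisions by moving to the next slot."""

    def __init__(self, size, hash_fn):
        self.size = size
        self.hash_fn = hash_fn
        self.slots = [None] * size  # each slot holds (key, value) or None

    def _probe(self, key):
        # Start at the hashed slot, then walk forward, wrapping around the end.
        start = self.hash_fn(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i  # report where the record ended up
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                break  # an empty slot means the key was never stored
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)


def last_letter_hash(key):
    # Deliberately poor hash (assumption for the demo): forces a collision.
    return ord(key[-1])


table = ProbingTable(10, last_letter_hash)
johnsons_slot = table.put("Johnsons", 6)
wilsons_slot = table.put("Wilsons", 37)   # collides, lands one slot further down
johnston_slot = table.put("Johnston", 91)
print(johnsons_slot, wilsons_slot, table.get("Wilsons"))
```

Note that the lookup follows the same path as the insert: it starts at the hashed slot and walks down until it finds the key or an empty slot.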


-48 -

Hash based search: when to use it

Hashing works best when the size of the database is fixed or unlikely to grow. It needs at least twice as much memory as the expected maximum content.

Example: string-indexed arrays in PHP: $father['Michael']

Database management systems (DBMS) use a mixture of tree-based and hash-based indexing systems.

Facts to remember:

(1) an INDEX takes time to build;

(2) an INDEX must be updated when records are added/deleted;

(3) the INDEX's input is a key. Its output is the location where the associated information is stored.
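The three facts above can be sketched in a few lines of Python, using a dictionary as a stand-in for a hash-based index. The function name build_index and the record layout are assumptions for illustration; a real DBMS stores its indexes quite differently.

```python
# The "table": a plain list, where a record's location is its list position.
records = [("Johnsons", 6), ("Johnston", 91)]


def build_index(rows):
    """Fact 1: building the index means visiting every record once."""
    index = {}
    for location, (name, _value) in enumerate(rows):
        index[name] = location
    return index


name_index = build_index(records)

# Fact 2: adding a record means the index must be updated too.
records.append(("Wilsons", 37))
name_index["Wilsons"] = len(records) - 1

# Fact 3: the input is a key, the output is a location.
location = name_index["Wilsons"]
print(location, records[location])
```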


-49 -

What's a key?

Winery ID   Winery Name      Address         Region ID   <Meaning of the field
wineryID    wineryName       address         regionID    <Field name

1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1
6           Johnson's        South Street    4
etc.
3000 records, let's say.

If you designate one FIELD of a TABLE as the primary key, the DBMS will build an efficient index for that field.

The KEY field must be unique in each row ... no two rows will have the same key value.

Then: Gimme the key, I find the row in O(log2 n) time.
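To see where O(log2 n) comes from, here is a minimal Python sketch of binary search over 3000 sorted key values, counting the steps. It only illustrates the arithmetic; it is an assumption that a sorted list stands in for the index, since a real DBMS would typically use a tree structure. The name binary_search and the sample key 2718 are made up for the example.

```python
def binary_search(sorted_keys, target):
    """Return (position, steps); position is None if target is absent."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return mid, steps
        if sorted_keys[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return None, steps


winery_ids = list(range(1, 3001))          # 3000 unique key values
position, steps = binary_search(winery_ids, 2718)

# Try every key and record the largest number of steps ever needed.
worst_case = max(binary_search(winery_ids, k)[1] for k in winery_ids)
print(steps, worst_case)
```

For 3000 keys the worst case never exceeds 12 steps, which is log2(3000) rounded up.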


-50 -


What's a key?

Searching the same table on a non-key field (like "Address"):

-The DBMS has only one option: SEQUENTIAL SEARCH.

-Chug, chug, chug ... it'll eventually find the matches ... tomorrow?

-O(n) time: searching 10 million records takes 10 million accesses.
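A sequential search can be sketched in Python as below, using the six sample rows from the winery table. The function name sequential_search and the tuple layout are assumptions for illustration. Because any row might match, every row has to be examined.

```python
# (wineryID, wineryName, address, regionID), as in the sample table.
wineries = [
    (1, "Moss Brothers", "Smith Rd.", 3),
    (2, "Hardy Brothers", "Jones St.", 1),
    (3, "Penfolds", "Arthurton Rd.", 1),
    (4, "Lindemans", "Smith Ave.", 2),
    (5, "Orlando", "Jones St.", 1),
    (6, "Johnson's", "South Street", 4),
]


def sequential_search(rows, column, target):
    """Check every row in turn; return the matches and the number of accesses."""
    matches = []
    accesses = 0
    for row in rows:
        accesses += 1
        if row[column] == target:
            matches.append(row)
    return matches, accesses


matches, accesses = sequential_search(wineries, 2, "Jones St.")
print([row[1] for row in matches], accesses)
```

With 6 rows that is 6 accesses; with 10 million rows it would be 10 million.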


-51 -


What's an index?

You CAN ask the DBMS to produce indexes on OTHER fields, even if they aren't unique.

Now we could search for an Address, just as fast as for a WineryID.
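As a hedged example of asking for such an index, the Python sketch below uses the built-in sqlite3 module with an in-memory database. SQLite is just one convenient DBMS to try this on; the index name idx_winery_address is an assumption, and other systems may phrase things slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery ("
    "wineryID INTEGER PRIMARY KEY, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
        (6, "Johnson's", "South Street", 4),
    ],
)

# Ask the DBMS for an index on a non-key, non-unique field.
conn.execute("CREATE INDEX idx_winery_address ON winery (address)")

names = [
    row[0]
    for row in conn.execute(
        "SELECT wineryName FROM winery WHERE address = ? ORDER BY wineryID",
        ("Jones St.",),
    )
]

# The query plan describes how SQLite intends to run the search.
plan = [
    row[-1]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT wineryName FROM winery WHERE address = ?",
        ("Jones St.",),
    )
]
print(names)
print(plan)
```

Two wineries share the address, so the index returns both; the plan text should mention the index rather than a full scan of the table.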


-52 -

Joins and Keys

Table: "winery" (3000 wineries)

Winery ID   Winery Name      Address         Region ID   <Meaning of the field
wineryID    wineryName       address         regionID    <Field name

1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1

Table: "region" (600 regions)

Region ID   Region Name      State               <Meaning of the field
regionID    regionName       state               <Field name

1           Barossa Valley   South Australia
2           Yarra Valley     Victoria
3           Margaret River   Western Australia

• SELECT regionName FROM winery, region WHERE winery.wineryName="Moss Brothers" AND winery.regionID = region.regionID

BFI: Sequentially search 3000 records for "Moss Brothers" and get the regionIDs. Then sequentially search 600 records for matches.


-53 -


Joins and Keys

Smart: with an index, it takes about log2(3000) ≈ 12 steps to find Moss Bros. and get the regionIDs, then about log2(600) ≈ 10 steps to find the matches.
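The join can be tried out with Python's built-in sqlite3 module, as sketched below on the sample rows. The column is written as wineryName, following the field names in the table; the index name idx_winery_name is an assumption, and SQLite is just one convenient DBMS for the experiment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery ("
    "wineryID INTEGER PRIMARY KEY, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.execute(
    "CREATE TABLE region ("
    "regionID INTEGER PRIMARY KEY, regionName TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)
conn.executemany(
    "INSERT INTO region VALUES (?, ?, ?)",
    [
        (1, "Barossa Valley", "South Australia"),
        (2, "Yarra Valley", "Victoria"),
        (3, "Margaret River", "Western Australia"),
    ],
)

# region.regionID is a primary key, so it is already indexed.
# An extra index on wineryName lets the first lookup avoid a sequential search.
conn.execute("CREATE INDEX idx_winery_name ON winery (wineryName)")

result = conn.execute(
    "SELECT regionName FROM winery, region "
    "WHERE winery.wineryName = ? AND winery.regionID = region.regionID",
    ("Moss Brothers",),
).fetchall()
print(result)
```

Moss Brothers has regionID 3, and region 3 is Margaret River, so that is the one row the join returns.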


-54 -

Efficient use of tables

One big table: Easy to understand, but wasteful.
Example: TV programs as a database.

Thursday:
  Commercial 1a
  Program Segment 1
  Commercial 2
  Program Segment 2
  Commercial 1b (rerun of 1a)
  Program Segment 3

Friday:
  Commercial 1a
  Program Segment 4
  Commercial 3
  Program Segment 5
  Commercial 2
  Program Segment 6


-55 -

Efficient use of tables

Three tables: More efficient, easier to modify

Commercials:
  C1   Commercial 1
  C2   Commercial 2
  C3   Commercial 3

Programs:
  P1   Program Segment 1
  P2   Program Segment 2
  P3   Program Segment 3
  P4   Program Segment 4
  P5   Program Segment 5
  P6   Program Segment 6

Program plan:
  Thurs:    C1, P1, C2, P2, C1, P3
  Friday:   C1, P4, C3, P5, C2, P6
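The three-table layout can be sketched in Python with dictionaries standing in for the tables. The function name expand and the replacement title "Commercial 1 (new cut)" are assumptions made up for the example. The point is that the plan stores only short IDs, so each commercial is stored once and can be changed in one place.

```python
commercials = {"C1": "Commercial 1", "C2": "Commercial 2", "C3": "Commercial 3"}
programs = {f"P{n}": f"Program Segment {n}" for n in range(1, 7)}

# The plan refers to items by ID instead of repeating their content.
program_plan = {
    "Thursday": ["C1", "P1", "C2", "P2", "C1", "P3"],
    "Friday": ["C1", "P4", "C3", "P5", "C2", "P6"],
}


def expand(day):
    """Look up each ID in the right table to rebuild the full schedule."""
    titles = []
    for item_id in program_plan[day]:
        table = commercials if item_id.startswith("C") else programs
        titles.append(table[item_id])
    return titles


thursday = expand("Thursday")

# Easier to modify: change the commercial once, every airing picks it up.
commercials["C1"] = "Commercial 1 (new cut)"
thursday_after_edit = expand("Thursday")
print(thursday)
print(thursday_after_edit)
```

In the one-big-table version, the same change would have to be made at every place the commercial appears.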


-56 -

What must I know how to do with regard to Databases?

1. Construct queries on one table.

2. Understand queries (on one table) that I give you, and show what the result would be, if I give you the table data.

3. Understand Join queries (that I give you) on two tables, and produce the results, from data tables I give you.

-57 -

What else will we study about databases?

Next week we will discuss how to set up tables for a particular application, and practice creating tables.

NOTE: The "innards" of every CMS consist of a large set (maybe hundreds) of database tables.

So, to understand CMS, you must understand database queries and tables.