-1 -
Information Management DIG 3563
Lecture 5: Databases
J. Michael Moshell
University of Central Florida
Original image* by Moshell et al.
Imagery is from Wikimedia except where marked with *. Licensing is listed.
-2 -
Why is a Database not just a File System?
• Data needs to be searchable
... and sortable and mergeable
• Dumb techniques lead to slow results.
• Example: Sorting n documents.
• A story about nametags ....
-3 -
(Flashback to the lecture on sorting)
Linear Search vs. Binary Search
Binary search on a sorted index (A to Z): each comparison cuts the search space in half.
1000 items: 10 steps
1 million items: 20 steps
1 billion items: 30 steps
Cost: O(log2 k)
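The halving idea can be sketched in a few lines. This is a minimal illustration in Python (not the language used in class); the function name binary_search and the step counter are assumptions added here to show where the "10 steps for 1000 items" figure comes from.

```python
def binary_search(sorted_items, target):
    """Return (index, steps); index is -1 when target is absent."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1      # discard the lower half
        else:
            high = mid - 1     # discard the upper half
    return -1, steps


items = list(range(1000))          # 1000 sorted items
index, steps = binary_search(items, 999)
print(index, steps)                # never more than 10 steps for 1000 items
```

A linear search over the same list could need up to 1000 comparisons.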
-4 -
Sorting 1000 nametags by BFI (brute force and ignorance)
... takes about (1000)^2 = 1 million operations.
Sorting them INTELLIGENTLY takes about 1000 * log2(1000) = 1000 * 10 = 10,000 operations.
How? Divide into small piles; sort each pile. Then MERGE them.
Similar well-thought-out techniques are used in databases
to greatly speed up searching and inserting,
compared to directly manipulating simple data files.
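The "divide into piles, then merge" idea is usually called merge sort. A minimal Python sketch follows; the names merge and merge_sort and the sample nametags are assumptions for illustration, and a real DBMS uses more elaborate variants.

```python
def merge(left, right):
    """Merge two already-sorted piles into one sorted pile."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # one pile is empty; append the rest of the other
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split into two piles, sort each pile, then merge them."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


nametags = ["Orlando", "Moss", "Penfolds", "Hardy", "Lindemans"]
sorted_tags = merge_sort(nametags)
print(sorted_tags)
```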
-5 -
History of Databases
First generation (1950-1980): Hierarchical Databases
(Don't worry about how they worked.)
Second generation (1980-present): Relational Databases
We will learn a lot about these, today.
Third generation (1990-present): Object Oriented Databases
You might encounter one someday, in a specialized use.
We will say nothing more about them in this course.
-6 -
A relational DB consists of tables:
Database: "winestore"
Table: "winery"
Winery ID   Winery Name      Address         Region ID    <Meaning of the field
wineryID    wineryName       address         regionID     <Field name
1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1
Table: "region"
Region ID   Region Name      State               <Meaning of the field
regionID    regionName       state               <Field name
1           Barossa Valley   South Australia
2           Yarra Valley     Victoria
3           Margaret River   Western Australia
-7 -
A relational DB consists of tables:
One record, or "tuple", is a ROW of a table.
-8 -
A relational DB consists of tables:
One field, or "attribute", is a COLUMN of a table.
-9 -
A relational DB consists of tables:
An attribute is selected from a DOMAIN
(e.g. integers; strings; e-mail addresses ...)
A DOMAIN is the set of allowable contents
for an attribute or field in the DB.
-10 -
Queries: How we interact with a DB:
Example Query:
SELECT wineryName FROM winery WHERE regionID=3
-11 -
Queries: How we interact with a DB:
Example Query:
SELECT wineryName FROM winery WHERE regionID=3
Result: Moss Brothers
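The example query can be tried without a MySQL server. Here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the in-memory database and the column types are assumptions, while the table contents are the "winery" rows from the slides.

```python
import sqlite3

# Build the "winery" table in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery ("
    "wineryID INTEGER PRIMARY KEY, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)

# The example query from the slide.
rows = conn.execute("SELECT wineryName FROM winery WHERE regionID = 3").fetchall()
print(rows)   # [('Moss Brothers',)]
```

Each result row comes back as a tuple, even when only one field was requested.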
-12 -
Practice with MySQL Queries:
Example Query:
SELECT wineryName FROM winery WHERE regionID=3
Task: Make a query to find the names of wineries in region 2.
-13 -
Practice with MySQL Queries:
Example Query:
SELECT wineryName FROM winery WHERE regionID=3
Task: Make a query to find the names of wineries in region 2.
Write your query on paper, show it to your neighbor.
(Also write the result!)
-14 -
Practice with MySQL Queries:
Example:
SELECT wineryName FROM winery WHERE regionID=3
Task: Make a query to find the names of wineries in region 2.
SELECT wineryName FROM winery WHERE regionID=2
Result: Lindemans
-15 -
Practice with MySQL Queries:
Example:
SELECT wineryName FROM winery WHERE regionID=3
Task: Find the names of wineries in region 1.
-16 -
Practice with MySQL Queries:
Example:
SELECT wineryName FROM winery WHERE regionID=3
Task: Find the names of wineries in region 1.
SELECT wineryName FROM winery WHERE regionID=1
But ... it will return MULTIPLE ROWS (Hardy Brothers, Penfolds, Orlando)! <<This is GOOD>>
-17 -
Practice with MySQL Queries:
Example:
SELECT wineryName FROM winery WHERE regionID=3
Task: Find the name of the State where Yarra Valley is located.
-18 -
Practice with MySQL Queries:
Example:
SELECT wineryName FROM winery WHERE regionID=3
Task: Find the name of the State where Yarra Valley is located.
SELECT state FROM region
WHERE regionName="Yarra Valley"
-19 -
Your next query practice
1) Think up a question, based on ONE of the tables.
2) Write it down in English. Example:
"What is Penfolds' address?"
3) Hand it to your neighbor.
4) Write down the answer to the question you just received.
5) Check your neighbor's answer.
LET ME KNOW when you have a correct answer.
-20 -
Practice with MySQL Queries:
Example:
SELECT address FROM winery WHERE wineryName='Penfolds'
Task: Make up a question; give it to your neighbor;
design a query to answer the question you received.
Check your neighbor's answer.
-21 -
More things queries can do:
Example:
SELECT wineryName,address FROM winery WHERE regionID=3
-22 -
More things queries can do:
Example:
SELECT wineryName,address FROM winery WHERE regionID=3
Result: Moss Brothers, Smith Rd.
-23 -
More things queries can do:
Example:
SELECT * FROM winery WHERE regionID=3
-24 -
More things queries can do:
Example:
SELECT * FROM winery WHERE regionID=3
Result: 1, Moss Brothers, Smith Rd., 3
-25 -
More things queries can do:
Example:
SELECT * FROM winery WHERE (regionID=3)
OR (regionID=2)
-26 -
More things queries can do:
Example:
SELECT * FROM winery WHERE (regionID=3)
OR (regionID=2)
Result:
1, Moss Brothers, Smith Rd., 3
4, Lindemans, Smith Ave., 2
-27 -
More things queries can do:
Example:
SELECT * FROM winery
-28 -
More things queries can do:
Example:
SELECT * FROM winery
Prints out the entire table.
-29 -
Joins: using two tables at once
Problem: I want the NAME of the REGION where Moss Brothers is.
- but: winery names are in the winery table, and
- region names are in the region table.
How can we join them together?
-30 -
Joins: using two tables at once
Problem: I want the NAME of the REGION where Moss Brothers is.
Dumb way: (1) Look up Moss Brothers; it's in region 3.
(2) Look up region 3; its name is Margaret River.
-31 -
Joins: using two tables at once
Problem: I want the NAME of the REGION where Moss Brothers is.
Dumb way: (1) Look up Moss Brothers; it's in region 3.
(2) Look up region 3; its name is Margaret River.
(Dumb? It works!) But the computer can do the joining.
-32 -
Joins: using two tables at once
Smart way: A JOIN query. Ask the computer to hook the tables together.
Here's how to make a Join query.
(Note: there are other ways ... this is my favorite.)
-33 -
Joins: using two tables at once
Problem: I want the NAME of the REGION where Moss Brothers is.
SELECT regionName FROM winery, region WHERE winery.wineryName="Moss Brothers"
AND winery.regionID = region.regionID
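To see the join run, here is a minimal sketch that again assumes Python's built-in sqlite3 module in place of MySQL. The column types are assumptions; the rows are the ones shown in the "winery" and "region" tables, and the field is referred to by its name in the table, wineryName.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery ("
    "wineryID INTEGER PRIMARY KEY, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.execute(
    "CREATE TABLE region (regionID INTEGER PRIMARY KEY, regionName TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)
conn.executemany(
    "INSERT INTO region VALUES (?, ?, ?)",
    [
        (1, "Barossa Valley", "South Australia"),
        (2, "Yarra Valley", "Victoria"),
        (3, "Margaret River", "Western Australia"),
    ],
)

# Two tables in FROM; the second WHERE condition hooks them together.
join_rows = conn.execute(
    "SELECT regionName FROM winery, region "
    "WHERE winery.wineryName = 'Moss Brothers' "
    "AND winery.regionID = region.regionID"
).fetchall()
print(join_rows)   # [('Margaret River',)]
```

Without the second condition, every winery row would be paired with every region row.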
-34 -
Joins: using two tables at once
Homework Problems (mostly using Joins)
1) Produce the names of the wineries in Barossa Valley.
2) Produce the winery addresses in Yarra Valley.
3) Produce the address and state of Lindemans.
4) Produce a list of all the states producing wine.
-35 -
Joins: using two tables at once
Exam warning
There WILL be questions about relational database
tables and queries on the Midterm Exam! Practice these
concepts!
-36 -
Now ... how does it WORK?
How can a dumb computer look something up in a table? e.g.
SELECT wineryID FROM winery WHERE wineryName="Penfolds"
Input: wineryName="Penfolds". Output: wineryID=3
We study 3 ways:
1) sequential search
2) Indexed search
3) Hash based search
-37 -
Table: "winery"
Winery ID   Winery Name      Address         Region ID    <Meaning of the field
wineryID    wineryName       address         regionID     <Field name
1           Moss Brothers    Smith Rd.       3
2           Hardy Brothers   Jones St.       1
3           Penfolds         Arthurton Rd.   1
4           Lindemans        Smith Ave.      2
5           Orlando          Jones St.       1
6           Johnson's        South Street    4
etc. 3000 records, let's say.
Sequential search: BFI
$found = false;
for ($w = 1; $w <= 3000; $w++) {        // look at every record in turn
    if (wineryname($w) == $searchvalue) {
        $found = true;                  // report success
        break;
    }
}
if (!$found) echo "failure";            // report failure only after checking all records
Dumb, eh?
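The same brute-force idea as a runnable Python sketch. The function name sequential_search and the list-of-pairs layout are assumptions for illustration; only the six sample rows are used instead of 3000.

```python
def sequential_search(records, search_value):
    """Check every (wineryID, wineryName) pair in order; O(n) comparisons."""
    for winery_id, winery_name in records:
        if winery_name == search_value:
            return winery_id      # success: stop at the first match
    return None                   # failure: every record was checked


records = [
    (1, "Moss Brothers"),
    (2, "Hardy Brothers"),
    (3, "Penfolds"),
    (4, "Lindemans"),
    (5, "Orlando"),
    (6, "Johnson's"),
]
print(sequential_search(records, "Penfolds"))   # 3
```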
-38 -
Table: "winery": 3000 records, let's say.
Indexed search: build index tree
Build: O(n log2 n)
                  Lindemans:4
                /             \
       Johnson's:6             Orlando:5
        /                      /       \
Hardy Brothers:2           Moss:1     Penfolds:3
etc. etc.
-39 -
Indexed search: build index tree
Build: O(n log2 n)
How would it actually be built? Don't worry about it ... you can imagine
a BIG piece of paper, if you want, or a wall.
(We do it with Data Structures: arrays, lists ...)
-40 -
How to search a tree
Search: O(log2 n)
Tree: Lindemans at the root; Johnson's (with Hardy Brothers below it) on the left; Orlando (with Moss and Penfolds below it) on the right; etc.
Looking for Moss:
Start at the root. Is Moss > Lindemans? Yes.
So follow the right-hand branch. Is Moss > Orlando? No.
So follow the left-hand branch. Quickly find the "Moss node".
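The walk down the tree can be sketched in Python. The Node class, the tree_search function and the exact tree shape are assumptions made for illustration; the record numbers follow the winery table (Lindemans is record 4, Penfolds is record 3).

```python
class Node:
    """One index entry: a name, its record number, and two branches."""
    def __init__(self, name, winery_id, left=None, right=None):
        self.name = name
        self.winery_id = winery_id
        self.left = left
        self.right = right


# Assumed shape of the small index tree from the slide.
root = Node(
    "Lindemans", 4,
    left=Node("Johnson's", 6, left=Node("Hardy Brothers", 2)),
    right=Node("Orlando", 5, left=Node("Moss", 1), right=Node("Penfolds", 3)),
)


def tree_search(node, name):
    """Return (record number or None, list of names visited on the way)."""
    path = []
    while node is not None:
        path.append(node.name)
        if name == node.name:
            return node.winery_id, path
        # Bigger names are on the right, smaller names on the left.
        node = node.right if name > node.name else node.left
    return None, path


print(tree_search(root, "Moss"))   # (1, ['Lindemans', 'Orlando', 'Moss'])
```

Three nodes are visited to find Moss; the other three nodes are never looked at.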
-41 -
Indexed search: build index tree
Search: O(log2 n)
How quickly?
Each layer of the tree doubles the number of records.
So 2 layers -> 4 records (on the "frontier"),
3 layers -> 8 records, 4 layers -> 16 records, etc.
-42 -
Indexed search: build index tree
Search: O(log2 n)
So: indexed search is quite fast compared to sequential search.
To search an indexed database of n records takes O(log2 n) time.
-43 -
Hash based search: the weirdest idea
example: h($s) = 242*val($s[1]) + 17*val($s[2]) ... etc
A "hash function" is a rule that takes a string as input (like "Michael") and computes a number. The function may look ridiculous or random, but it has special mathematical properties.
In the above example, val($s[1]) means the ASCII value of the first letter in the string variable $s="Michael".
So, h("Michael") yields a very different value from h("Mike"). This is important.
-44 -
Hash based search: the weirdest idea
example: h($s) = 242*val($s[1]) + 17*val($s[2]) ... etc
Hash function:
• similar inputs, different outputs
• fast to compute
Johnsons -> 24612
Johnston -> 98307
Location   Contents
10001
10002
...
24612      Johnsons: 6
...
98307      Johnston: 91
...
To fetch the data, just compute the hash and look in that-numbered location.
-45 -
Hash based search: the weirdest idea
To fetch the data, just compute the hash and look there.
Imagine a library with the books "all over the place!" But if you
know the author's name, you can compute its shelf number
by finding h($authorname).
-46 -
Hash based search: the weirdest idea
To fetch the data, just compute the hash of the search term, and look there.
BUT what if two words COLLIDE? (that is, are mapped to the same place?)
Johnsons -> 24612
Wilsons -> 24612
-47 -
Hash based search: the weirdest idea
BUT what if two words COLLIDE?
Johnsons -> 24612
Wilsons -> 24612
Just move down to the first available space, and store Wilsons there!
Location   Contents
24612      Johnsons: 6
24613      Wilsons: 37
...
98307      Johnston: 91
...
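"Move down to the first available space" is commonly called linear probing. Below is a minimal Python sketch of it. The toy_hash function (sum of character codes) and the table size of 11 are assumptions chosen so that "Johnsons" and "Wilsons" collide in this small example; they are not the h($s) formula from the slide, whose full definition is not given.

```python
TABLE_SIZE = 11   # hypothetical small table


def toy_hash(key):
    """Hypothetical hash: sum of character codes, wrapped to the table size."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


def insert(table, key, value):
    """Store (key, value) at the hashed slot, or the next free slot after it."""
    slot = toy_hash(key)
    for _ in range(TABLE_SIZE):
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
        slot = (slot + 1) % TABLE_SIZE    # collision: move down one slot
    raise RuntimeError("hash table is full")


def lookup(table, key):
    """Start at the hashed slot and move down until the key or a gap is found."""
    slot = toy_hash(key)
    for _ in range(TABLE_SIZE):
        if table[slot] is None:
            return None
        if table[slot][0] == key:
            return table[slot][1]
        slot = (slot + 1) % TABLE_SIZE
    return None


table = [None] * TABLE_SIZE
for name, winery_id in [("Johnsons", 6), ("Wilsons", 37), ("Johnston", 91)]:
    insert(table, name, winery_id)

print(toy_hash("Johnsons") == toy_hash("Wilsons"))   # True: a collision
print(lookup(table, "Wilsons"))                      # 37
```

Note that lookup has to repeat the same "move down" walk, because the record may not be in the slot its hash points to.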
-48 -
Hash based search: when to use it
Hashing works best when the size of the database is fixed or unlikely to grow. It needs at least twice as much memory as the expected maximum content.
Example: string-indexed arrays in PHP: $father['Michael']
Database management systems (DBMS) use a mixture of
tree-based and hash-based indexing systems.
Facts to remember: (1) an INDEX takes time to build;
(2) an INDEX must be updated when records are added/deleted;
(3) The INDEX's input is a key. Its output is the location
where the associated information is stored.
-49 -
Table: "winery": 3000 records, let's say.
What's a key?
If you designate one FIELD of a TABLE as the primary key,
the DBMS will build an efficient index for that field.
The KEY field must be unique in each row ...
no two rows will have the same key value.
Then: Gimme the key, I find the row in O(log2 n) time.
-50 -
Table: "winery": 3000 records, let's say.
What's a key?
Searching the same table on a non-key field (like "Address"):
- The DBMS has only one option: SEQUENTIAL SEARCH.
- Chug, chug, chug ... it'll eventually find the matches ... tomorrow?
- O(n) time: 10 million records can take 10 million accesses.
-51 -
Table: "winery": 3000 records, let's say.
What's an index?
You CAN ask the DBMS to produce indexes on OTHER fields,
even if they aren't unique.
Now we could search for an Address, just as fast as for a WineryID.
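Asking for an extra index looks like this in practice. This is a minimal sketch assuming Python's built-in sqlite3 module rather than MySQL; the index name idx_winery_address is made up for the example, and the wording of the query plan text differs between database systems and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE winery ("
    "wineryID INTEGER PRIMARY KEY, wineryName TEXT, address TEXT, regionID INTEGER)"
)
conn.executemany(
    "INSERT INTO winery VALUES (?, ?, ?, ?)",
    [
        (1, "Moss Brothers", "Smith Rd.", 3),
        (2, "Hardy Brothers", "Jones St.", 1),
        (3, "Penfolds", "Arthurton Rd.", 1),
        (4, "Lindemans", "Smith Ave.", 2),
        (5, "Orlando", "Jones St.", 1),
    ],
)

# Ask the DBMS for an index on a non-key, non-unique field.
conn.execute("CREATE INDEX idx_winery_address ON winery (address)")

query = "SELECT wineryName FROM winery WHERE address = 'Jones St.'"
address_rows = conn.execute(query).fetchall()

# Ask the DBMS how it intends to run the query.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

print(address_rows)   # two wineries share this address
print(plan_text)      # should mention idx_winery_address
```

Two wineries share the address "Jones St.", so the index returns both rows; an index does not require unique values.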
-52 -
Joins and Keys
Table "winery": 3000 wineries. Table "region": 600 regions.
SELECT regionName FROM winery, region WHERE winery.wineryName="Moss Brothers" AND winery.regionID = region.regionID
BFI: Sequential search of 3000 records for "Moss Brothers";
get the regionID. Then sequential search of 600 records for the match.
-53 -
Joins and Keys
Table "winery": 3000 wineries. Table "region": 600 regions.
SELECT regionName FROM winery, region WHERE winery.wineryName="Moss Brothers" AND winery.regionID = region.regionID
Smart: use an index. About log2(3000) = 12 steps to find Moss Brothers;
get the regionID. About log2(600) = 10 steps to find the match.
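The step counts can be checked with a few lines of arithmetic. A small Python sketch, where the helper name index_steps is an assumption and the logarithm is rounded up to a whole number of steps:

```python
import math


def index_steps(n):
    """Approximate steps to find one record among n using a tree index."""
    return math.ceil(math.log2(n))


winery_steps = index_steps(3000)    # 12
region_steps = index_steps(600)     # 10
sequential_worst_case = 3000 + 600  # BFI: scan both tables completely

print(winery_steps + region_steps, "steps with indexes")
print(sequential_worst_case, "steps in the worst case without")
```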
-54 -
Efficient use of tables
One big table: Easy to understand, but wasteful.
Example: TV programs as a database.
Thursday:
Commercial 1a
Program Segment 1
Commercial 2
Program Segment 2
Commercial 1b (rerun of 1a)
Program Segment 3
Friday:
Commercial 1a
Program Segment 4
Commercial 3
Program Segment 5
Commercial 2
Program Segment 6
-55 -
Efficient use of tables
Three tables: More efficient, easier to modify
Commercials:
C1 Commercial 1
C2 Commercial 2
C3 Commercial 3
Programs:
P1 Program Segment 1
P2 Program Segment 2
P3 Program Segment 3
P4 Program Segment 4
P5 Program Segment 5
P6 Program Segment 6
Program plan:
Thursday: C1 P1 C2 P2 C1 P3
Friday: C1 P4 C3 P5 C2 P6
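Why three tables are easier to modify can be shown with a small Python sketch. Dictionaries stand in for database tables here, and the function name expand is an assumption; the codes and the schedule are taken from the slide.

```python
# Each commercial and program segment is stored exactly once.
commercials = {"C1": "Commercial 1", "C2": "Commercial 2", "C3": "Commercial 3"}
programs = {
    "P1": "Program Segment 1", "P2": "Program Segment 2", "P3": "Program Segment 3",
    "P4": "Program Segment 4", "P5": "Program Segment 5", "P6": "Program Segment 6",
}

# The plan holds only short codes that point into the other two tables.
program_plan = {
    "Thursday": ["C1", "P1", "C2", "P2", "C1", "P3"],
    "Friday": ["C1", "P4", "C3", "P5", "C2", "P6"],
}


def expand(day):
    """Replace each code in a day's plan with the item it refers to."""
    lookup = {**commercials, **programs}
    return [lookup[code] for code in program_plan[day]]


print(expand("Thursday"))

# Replacing a commercial is a single change, seen everywhere it is scheduled.
commercials["C1"] = "Commercial 1 (new version)"
print(expand("Friday")[0])
```

In the one-big-table layout, the same replacement would have to be made in every place Commercial 1 appears.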
-56 -
What must I know how to do with regard to Databases?
1. Construct queries on one table.
2. Understand queries (on one table) that I give you, and show what the result would be, if I give you the table data.
3. Understand Join queries (that I give you) on two tables, and produce the results, from data tables I give you.
-57 -
What else will we study about databases?
Next week we will discuss how to set up tables for a particular application, and practice creating tables.
NOTE: The "innards" of every CMS consist of a large set (maybe hundreds) of database tables.
So, to understand CMS, you must understand database queries and tables.