database management systems chapter 3 the relational data model (ii) instructor: li ma department of...

25
Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston September, 2007

Upload: charlotte-maxwell

Post on 17-Dec-2015

221 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Database Management Systems

Chapter 3 The Relational Data Model (II)

Instructor: Li Ma

Department of Computer ScienceTexas Southern University, Houston

September, 2007

Page 2: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Functional Dependencies

Meaning of FD’sKeys and Superkeys

Rules about FD’s

Page 3: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 3

Functional Dependencies

X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they must also agree on the attribute A. Say “X -> A holds in R.” Convention: …, X, Y, Z represent sets of

attributes; A, B, C,… represent single attributes.

Convention: no set formers in sets of attributes, just ABC, rather than {A,B,C }.

Page 4: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 4

Example

Drinkers(name, addr, beersLiked, manf, favBeer)

Reasonable FD’s to assert:1. name -> addr2. name -> favBeer3. beersLiked -> manf

Page 5: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 5

Example Data

name addr beersLiked manf favBeerJaneway Voyager Bud A.B. WickedAleJaneway Voyager WickedAle Pete’s WickedAleSpock Enterprise Bud A.B. Bud

Because name -> addr Because name -> favBeer

Because beersLiked -> manf

Page 6: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 6

FD’s With Multiple Attributes

No need for FD’s with > 1 attribute on right. But sometimes convenient to combine

FD’s as a shorthand. Example: name -> addr and

name -> favBeer become name -> addr favBeer

> 1 attribute on left may be essential. Example: bar beer -> price

Page 7: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 7

Keys of Relations

K is a superkey for relation R if K functionally determines all of R.

K is a key for R if K is a superkey, but no proper subset of K is a superkey.

Page 8: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 8

Example

Drinkers(name, addr, beersLiked, manf, favBeer)

{name, beersLiked} is a superkey because together these attributes determine all the other attributes. name -> addr favBeer beersLiked -> manf

Page 9: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 9

Example, Cont.

{name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey. name doesn’t -> manf; beersLiked

doesn’t -> addr. There are no other keys, but lots of

superkeys. Any superset of {name, beersLiked}.

Page 10: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 10

E/R and Relational Keys

Keys in E/R concern entities. Keys in relations concern tuples. Usually, one tuple corresponds to

one entity, so the ideas are the same.

But --- in poor relational designs, one entity can become several tuples, so E/R keys and Relational keys are different.

Page 11: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 11

Example Data

name addr beersLiked manf favBeerJaneway Voyager Bud A.B. WickedAleJaneway Voyager WickedAle Pete’s WickedAleSpock Enterprise Bud A.B. Bud

Relational key = {name beersLiked}But in E/R, name is a key for Drinkers, and beersLiked is a key

for Beers.Note: 2 tuples for Janeway entity and 2 tuples for Bud entity.

Page 12: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 12

Where Do Keys Come From?

1. Just assert a key K. The only FD’s are K -> A for all

attributes A.

2. Assert FD’s and deduce the keys by systematic exploration.

E/R model gives us FD’s from entity-set keys and from many-one relationships.

Page 13: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 13

More FD’s From “Physics”

Example: “no two courses can meet in the same room at the same time” tells us: hour room -> course.

Page 14: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 14

Inferring FD’s

We are given FD’s X1 -> A1, X2 -> A2,…, Xn -> An , and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s. Example: If A -> B and B -> C hold, surely

A -> C holds, even if we don’t say so. Important for design of good relation

schemas.

Page 15: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 15

About Two FD’s Set

Two sets of FD’s are equivalent: they are satisfied by exactly the same set of relation instances

If every relation instance that satisfies all the FD’s in set T also satisfies all the FD’s in set S, then we say S follows from T

Page 16: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 16

Splitting/Combining Rule

X -> B1B2...Bn is the shorthand for the set of FD’s

X -> B1

X -> B2

…X -> Bn

Page 17: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 17

Trivial-Dependency Rule

A FD X -> B is trivial if B is in the attribute set X Example: AB -> A

We can always remove from the right side of a FD those attributes that appear on the left ABC –> AD is equivalent to ABC -> D

Page 18: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 18

Transitive Rule

If X -> Y and Y -> Z hold in relation R, then X -> Z also holds in R Similar as the transitive rule in

Arithmetic We can use either inference test or

closure test to prove this rule

Page 19: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 19

Inference Test

To test if Y -> B, start by assuming two tuples agree in all attributes of Y.

Y B0000000. . . 0000000? . . . ?

Page 20: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 20

Example Given a relation R with attributes A, B,

and C satisfies the FD’s A -> B and B -> C, test if R also satisfies the FD A -> C Assume two tuples agree on A: (a,b1,c1) and

(a,b2,c2) Since R satisfies A -> B, they also agree on

B: (a,b,c1) and (a,b,c2) Similarly, since R satisfies B -> C, they also

agree on C: (a,b,c) and (a,b,c) So we proved any two tuples of R that agree

on A also agree on C

Page 21: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 21

Closure of Attribute Set

The closure of an attribute set X under the FD’s in S is the attribute set Y such that every relation that satisfies all the FD’s in S also satisfies X -> Y X -> Y follows from the FD’s in S We denote X+ = Y

Page 22: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 22

Closure Algorithm

An easier way to test is to compute the closure of Y, denoted Y +.

Basis: Y + = Y. Induction: Look for an FD’s left side

X that is a subset of the current Y +. If the FD is X -> A, add A to Y +.

Page 23: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 23

Y+new Y+

X A

Page 24: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 24

Closure Algorithm (2)

Repeat the induction step until no more attribute can be added to Y +. Y + is the closure of Y.

The closure algorithm claims only true FD’s

The closure algorithm discovers all true FD’s

Page 25: Database Management Systems Chapter 3 The Relational Data Model (II) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston

Jeffrey Ullman 25

Closure Test

For the transitive rule If X -> Y and Y -> Z hold in relation R,

then X -> Z also holds in R If we can compute the closure of X

(X+) and Z is a subset of X+, then the rule is proved Need to use the trivial-dependency

rule