Page 1: Temple University – CIS Dept. CIS616– Principles of Data Management

V. Megalooikonomou

Functional Dependencies

(based on notes by Silberschatz, Korth, and Sudarshan and notes by C. Faloutsos at CMU)

Page 2: General Overview

Formal query languages
    relational algebra and calculi
Commercial query languages
    SQL
    QBE, (QUEL)
Integrity constraints
Functional Dependencies
Normalization - ‘good’ DB design

Page 3: Overview

Domain; Ref. Integrity constraints
Assertions and Triggers
Security
Functional dependencies
    why
    definition
    Armstrong’s “axioms”
    closure and cover

Page 4: Functional dependencies

motivation: ‘good’ tables

takes1 (ssn, c-id, grade, name, address)

‘good’ or ‘bad’?

Page 5: Functional dependencies

takes1 (ssn, c-id, grade, name, address)

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 6: Functional dependencies

‘Bad’ - why?

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 7: Functional Dependencies

Redundancy
    space
    inconsistencies
    insertion/deletion anomalies (later…)

What caused the problem?

Page 8: Functional dependencies

… ‘name’ depends on ‘ssn’
define ‘depends’

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 9: Functional dependencies

Definition: ‘a’ functionally determines ‘b’, written $a \rightarrow b$

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 10: Functional dependencies

Informally: ‘if you know ‘a’, there is only one ‘b’ to match’

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 11: Functional dependencies

formally:

$X \rightarrow Y \Rightarrow (t_1[x] = t_2[x] \Rightarrow t_1[y] = t_2[y])$

if two tuples agree on the ‘X’ attribute, they *must* agree on the ‘Y’ attribute, too (e.g., if ssn is the same, so should address)

… a functional dependency is a generalization of the notion of a key
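The formal definition can be tried out on a concrete table. The following is a minimal sketch, assuming a relation instance is stored as a list of dictionaries; the function name `fd_holds` and the variable `takes1` are illustrative choices, not part of the slides. Note that a table instance can only refute an FD; it cannot prove that the FD holds for every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The takes1 instance shown on the earlier slides.
takes1 = [
    {"ssn": 123, "c-id": 413, "grade": "A", "name": "smith", "address": "Main"},
    {"ssn": 123, "c-id": 415, "grade": "B", "name": "smith", "address": "Main"},
    {"ssn": 123, "c-id": 211, "grade": "A", "name": "smith", "address": "Main"},
]

print(fd_holds(takes1, ["ssn"], ["name", "address"]))  # True
print(fd_holds(takes1, ["ssn"], ["grade"]))            # False
```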

Page 12: Functional dependencies

‘X’, ‘Y’ can be sets of attributes
other examples??

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 13: Functional dependencies

ssn -> name, address
ssn, c-id -> grade

Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Page 14: Functional dependencies

K is a superkey for relation R iff K -> R

K is a candidate key for relation R iff:
    K -> R
    for no $a \subset K$, a -> R

Page 15: Functional dependencies

Closure of a set of FDs: all implied FDs – e.g.:
    ssn -> name, address
    ssn, c-id -> grade
imply
    ssn, c-id -> grade, name, address
    ssn, c-id -> ssn

Page 16: FDs - Armstrong’s axioms

Closure of a set of FDs: all implied FDs – e.g.:
    ssn -> name, address
    ssn, c-id -> grade

how to find all the implied ones, systematically?

Page 17: FDs - Armstrong’s axioms

“Armstrong’s axioms” guarantee soundness and completeness:

Reflexivity: $Y \subseteq X \Rightarrow X \rightarrow Y$
    e.g., ssn, name -> ssn
Augmentation: $X \rightarrow Y \Rightarrow XW \rightarrow YW$
    e.g., ssn -> name then ssn, grade -> name, grade

Page 18: FDs - Armstrong’s axioms

Transitivity: $X \rightarrow Y,\ Y \rightarrow Z \Rightarrow X \rightarrow Z$

    ssn -> address
    address -> county-tax-rate
THEN:
    ssn -> county-tax-rate

Page 19: FDs - Armstrong’s axioms

Reflexivity: $Y \subseteq X \Rightarrow X \rightarrow Y$
Augmentation: $X \rightarrow Y \Rightarrow XW \rightarrow YW$
Transitivity: $X \rightarrow Y,\ Y \rightarrow Z \Rightarrow X \rightarrow Z$

‘sound’ and ‘complete’
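The three axioms can be read as small functions on FDs. The sketch below assumes an FD is represented as a pair of frozensets (left-hand side, right-hand side); the helper names `fd`, `reflexivity`, `augmentation` and `transitivity` are hypothetical and chosen only for illustration.

```python
def fd(lhs, rhs):
    """Build an FD as a pair of frozensets from space-separated attribute names."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))


def reflexivity(x, y):
    """If Y is a subset of X, return X -> Y; otherwise the rule does not apply."""
    return (x, y) if y <= x else None


def augmentation(f, w):
    """From X -> Y, return XW -> YW."""
    x, y = f
    return (x | w, y | w)


def transitivity(f, g):
    """From X -> Y and Y -> Z, return X -> Z; None if the middle sets differ."""
    (x, y), (y2, z) = f, g
    return (x, z) if y == y2 else None


# Each line prints True.
print(reflexivity(frozenset({"ssn", "name"}), frozenset({"ssn"})) == fd("ssn name", "ssn"))
print(augmentation(fd("ssn", "name"), frozenset({"grade"})) == fd("ssn grade", "name grade"))
print(transitivity(fd("ssn", "address"), fd("address", "county-tax-rate")) == fd("ssn", "county-tax-rate"))
```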

Page 20: FDs – finding the closure F+

F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
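The procedure above can be sketched as a brute-force fixpoint computation. This is an illustration only, assuming FDs are pairs of frozensets over a small schema; the names `powerset` and `fd_closure` are hypothetical. Reflexivity is applied once up front because it does not depend on the FDs found so far, and the running time is exponential in the number of attributes, so it is only usable on toy schemas.

```python
from itertools import combinations


def powerset(attrs):
    """All subsets of attrs, as frozensets."""
    items = sorted(attrs)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def fd_closure(schema, fds):
    """Naive F+: apply Armstrong's axioms exhaustively until nothing new appears."""
    subsets = powerset(schema)
    closure = set(fds)
    # Reflexivity: X -> Y for every Y that is a subset of X.
    closure |= {(x, y) for x in subsets for y in subsets if y <= x}
    changed = True
    while changed:
        changed = False
        new = set()
        for x, y in closure:
            # Augmentation: X -> Y gives XW -> YW for every W.
            for w in subsets:
                new.add((x | w, y | w))
            # Transitivity: X -> Y and Y -> Z give X -> Z.
            for y2, z in closure:
                if y == y2:
                    new.add((x, z))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


A, B, C = "A", "B", "C"
F = {(frozenset({A}), frozenset({B})), (frozenset({B}), frozenset({C}))}
f_plus = fd_closure({A, B, C}, F)
print((frozenset({A}), frozenset({C})) in f_plus)  # True
```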

We can further simplify manual computation of F+ by using the following additional rules

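Page 21: FDs - Armstrong’s axioms

Additional rules:

Union: $X \rightarrow Y,\ X \rightarrow Z \Rightarrow X \rightarrow YZ$
Decomposition: $X \rightarrow YZ \Rightarrow X \rightarrow Y,\ X \rightarrow Z$
Pseudo-transitivity: $X \rightarrow Y,\ YW \rightarrow Z \Rightarrow XW \rightarrow Z$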

Page 22: FDs - Armstrong’s axioms

Prove ‘Union’ from the three axioms:

$X \rightarrow Y,\ X \rightarrow Z \Rightarrow X \rightarrow YZ$ ?

Page 23: FDs - Armstrong’s axioms

Prove ‘Union’ from the three axioms:

(1) $X \rightarrow Y$
(2) $X \rightarrow Z$
(3) from (1), augm. w/ Z: $XZ \rightarrow YZ$
(4) from (2), augm. w/ X: $XX \rightarrow XZ$
but XX is X; thus (4) reads $X \rightarrow XZ$, and (4), (3) and transitivity give $X \rightarrow YZ$

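Page 24: FDs - Armstrong’s axioms

Prove Pseudo-transitivity:

$X \rightarrow Y,\ YW \rightarrow Z \Rightarrow XW \rightarrow Z$ ?

Reflexivity: $Y \subseteq X \Rightarrow X \rightarrow Y$
Augmentation: $X \rightarrow Y \Rightarrow XW \rightarrow YW$
Transitivity: $X \rightarrow Y,\ Y \rightarrow Z \Rightarrow X \rightarrow Z$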
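One possible derivation of pseudo-transitivity, sketched here using only augmentation and transitivity (the step numbering is an illustrative choice, not taken from the slides):

```latex
\begin{align*}
&(1)\quad X \rightarrow Y && \text{given} \\
&(2)\quad YW \rightarrow Z && \text{given} \\
&(3)\quad XW \rightarrow YW && \text{(1), augmentation with } W \\
&(4)\quad XW \rightarrow Z && \text{(3), (2), transitivity}
\end{align*}
```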

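Page 25: FDs - Armstrong’s axioms

Prove Decomposition:

$X \rightarrow YZ \Rightarrow X \rightarrow Y,\ X \rightarrow Z$ ?

Reflexivity: $Y \subseteq X \Rightarrow X \rightarrow Y$
Augmentation: $X \rightarrow Y \Rightarrow XW \rightarrow YW$
Transitivity: $X \rightarrow Y,\ Y \rightarrow Z \Rightarrow X \rightarrow Z$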
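One possible derivation of decomposition, sketched here using only reflexivity and transitivity (the step numbering is an illustrative choice, not taken from the slides):

```latex
\begin{align*}
&(1)\quad X \rightarrow YZ && \text{given} \\
&(2)\quad YZ \rightarrow Y && \text{reflexivity, since } Y \subseteq YZ \\
&(3)\quad X \rightarrow Y && \text{(1), (2), transitivity} \\
&(4)\quad YZ \rightarrow Z && \text{reflexivity, since } Z \subseteq YZ \\
&(5)\quad X \rightarrow Z && \text{(1), (4), transitivity}
\end{align*}
```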

Page 26: FDs - Closure F+

Given a set F of FDs (on a schema), F+ is the set of all implied FDs. E.g.,

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address

Page 27: FDs - Closure F+

F+:
    ssn, c-id -> grade
    ssn -> name, address
    ssn -> ssn
    ssn, c-id -> address
    c-id, address -> c-id
    ...

Page 28: FDs - Closure F+

R = (A, B, C, G, H, I)

F = { A->B, A->C, CG->H, CG->I, B->H }

Some members of F+: A->H, AG->I, CG->HI
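Each of these three members can be derived from F with the axioms and the union rule. One possible derivation, given as a sketch:

```latex
\begin{align*}
A \rightarrow B,\ B \rightarrow H &\Rightarrow A \rightarrow H && \text{transitivity} \\
A \rightarrow C &\Rightarrow AG \rightarrow CG && \text{augmentation with } G \\
AG \rightarrow CG,\ CG \rightarrow I &\Rightarrow AG \rightarrow I && \text{transitivity} \\
CG \rightarrow H,\ CG \rightarrow I &\Rightarrow CG \rightarrow HI && \text{union}
\end{align*}
```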

Page 29: FDs - Closure A+

Given a set F of FDs (on a schema), A+ is the set of all attributes determined by A:

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address

{ssn}+ = ??

Page 30: FDs - Closure A+

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address

{ssn}+ = {ssn, name, address}

Page 31: FDs - Closure A+

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address

{c-id}+ = ??

Page 32: FDs - Closure A+

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address

{c-id, ssn}+ = ??

Page 33: FDs - Closure A+

if A+ = {all attributes of table} then ‘A’ is a superkey (and a candidate key if, in addition, no proper subset of ‘A’ has this property)

Page 34: FDs - Closure A+

Algorithm to compute $\alpha^+$, the closure of $\alpha$ under F

result := $\alpha$;
while (changes to result) do
    for each $\beta \rightarrow \gamma$ in F do
    begin
        if $\beta \subseteq$ result then result := result $\cup\ \gamma$
    end
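The attribute-closure algorithm translates almost line by line into code. The following is a minimal sketch, assuming each FD is a pair of Python sets (left-hand side, right-hand side); the function name `attribute_closure` is an illustrative choice. It is run on the takes example from the earlier slides.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already covered, absorb the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


F = [
    ({"ssn", "c-id"}, {"grade"}),
    ({"ssn"}, {"name", "address"}),
]

print(sorted(attribute_closure({"ssn"}, F)))          # ['address', 'name', 'ssn']
print(sorted(attribute_closure({"c-id"}, F)))         # ['c-id']
print(sorted(attribute_closure({"c-id", "ssn"}, F)))  # ['address', 'c-id', 'grade', 'name', 'ssn']
```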

Page 35: FDs - Closure A+ (example)

R = (A, B, C, G, H, I)
F = {A -> B, A -> C, CG -> H, CG -> I, B -> H}

(AG)+
1. result = AG
2. result = ABCG (A -> C and A -> B)
3. result = ABCGH (CG -> H and $CG \subseteq AGBC$)
4. result = ABCGHI (CG -> I and $CG \subseteq AGBCH$)

Is AG a candidate key?
1. Is AG a superkey?
    1. Does AG -> R?
2. Is any subset of AG a superkey?
    1. Does A+ contain all of R?
    2. Does G+ contain all of R?
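The candidate-key test for AG can be sketched with the same closure routine. This is an illustration under the assumption that FDs are pairs of Python sets; `attribute_closure`, `is_superkey` and `is_candidate_key` are hypothetical helper names.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey with no proper subset that is a superkey."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    # Any superset of a superkey is a superkey, so it is enough to
    # check the subsets obtained by dropping a single attribute.
    return not any(is_superkey(attrs - {a}, schema, fds) for a in attrs)


R = set("ABCGHI")
F = [
    (set("A"), set("B")),
    (set("A"), set("C")),
    (set("CG"), set("H")),
    (set("CG"), set("I")),
    (set("B"), set("H")),
]

print(sorted(attribute_closure(set("AG"), F)))  # ['A', 'B', 'C', 'G', 'H', 'I']
print(is_candidate_key(set("AG"), R, F))        # True
```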

Page 36: FDs - A+ closure

Diagrams

AB->C (1)
A->BC (2)
B->C (3)
A->B (4)

[Diagram over the attributes A, B and C]

Page 37: FDs - ‘canonical cover’ Fc

Given a set F of FDs (on a schema), Fc is a minimal set of equivalent FDs. E.g.,

takes(ssn, c-id, grade, name, address)

F:
    ssn, c-id -> grade
    ssn -> name, address
    ssn, name -> name, address
    ssn, c-id -> grade, name

Page 38: FDs - ‘canonical cover’ Fc

F:
    ssn, c-id -> grade
    ssn -> name, address
    ssn, name -> name, address
    ssn, c-id -> grade, name

Fc:
    ssn, c-id -> grade
    ssn -> name, address

Page 39: FDs - ‘canonical cover’ Fc

why do we need it?
define it properly
compute it efficiently

Page 40: FDs - ‘canonical cover’ Fc

why do we need it?
    easier to compute candidate keys
define it properly
compute it efficiently

Page 41: FDs - ‘canonical cover’ Fc

define it properly - three properties
    every FD a->b has no extraneous attributes on the RHS
    same for the LHS
    all LHS parts are unique

Page 42: FDs - ‘canonical cover’ Fc

‘extraneous’ attribute:
    if the closure is the same, before and after its elimination
    or if F-before implies F-after and vice-versa

Page 43: FDs - ‘canonical cover’ Fc

F:
    ssn, c-id -> grade
    ssn -> name, address
    ssn, name -> name, address
    ssn, c-id -> grade, name

Page 44: FDs - ‘canonical cover’ Fc

Algorithm:
    examine each FD; drop extraneous LHS or RHS attributes
    merge FDs with same LHS
    repeat until no change

Page 45: FDs - ‘canonical cover’ Fc

Trace algo for
    AB->C (1)
    A->BC (2)
    B->C (3)
    A->B (4)

Page 46: FDs - ‘canonical cover’ Fc

Trace algo for
    AB->C (1)
    A->BC (2)
    B->C (3)
    A->B (4)

(4) and (2) merge:
    AB->C (1)
    A->BC (2)
    B->C (3)

Page 47: FDs - ‘canonical cover’ Fc

    AB->C (1)
    A->BC (2)
    B->C (3)

in (2): ‘C’ is extraneous
    AB->C (1)
    A->B (2’)
    B->C (3)

Page 48: FDs - ‘canonical cover’ Fc

    AB->C (1)
    A->B (2’)
    B->C (3)

in (1): ‘A’ is extraneous
    B->C (1’)
    A->B (2’)
    B->C (3)

Page 49: FDs - ‘canonical cover’ Fc

    B->C (1’)
    A->B (2’)
    B->C (3)

(1’) and (3) merge
    A->B (2’)
    B->C (3)

nothing is extraneous: ‘canonical cover’

Page 50: FDs - ‘canonical cover’ Fc

BEFORE
    AB->C (1)
    A->BC (2)
    B->C (3)
    A->B (4)

AFTER
    A->B (2’)
    B->C (3)
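The canonical-cover procedure can be sketched in code as follows. This is one possible implementation under stated assumptions, not the slides' own: FDs are pairs of Python sets, the names `attribute_closure` and `canonical_cover` are hypothetical, and attributes are tested for extraneousness with the closure test. The order in which attributes are dropped may differ from the hand trace, but on this input the result is the same cover.

```python
def attribute_closure(attrs, fds):
    """Closure of attrs under fds, an iterable of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    """Return a canonical cover as a dict mapping each LHS to its merged RHS."""
    # Merge FDs with the same LHS (union rule).
    cover = {}
    for lhs, rhs in fds:
        cover.setdefault(frozenset(lhs), set()).update(rhs)

    def simplify_once():
        """Drop one extraneous attribute; return False when none is left."""
        for lhs, rhs in cover.items():
            for a in sorted(rhs):
                # RHS attribute a is extraneous if lhs still determines a
                # after a is removed from this FD.
                reduced = [(l, r - {a} if l == lhs else r) for l, r in cover.items()]
                if a in attribute_closure(lhs, reduced):
                    rhs.discard(a)
                    if not rhs:
                        del cover[lhs]
                    return True
            for a in sorted(lhs):
                # LHS attribute a is extraneous if lhs without a already determines rhs.
                smaller = lhs - {a}
                if rhs <= attribute_closure(smaller, cover.items()):
                    del cover[lhs]
                    cover.setdefault(smaller, set()).update(rhs)
                    return True
        return False

    while simplify_once():
        pass
    return cover


F = [(set("AB"), set("C")), (set("A"), set("BC")), (set("B"), set("C")), (set("A"), set("B"))]
fc = canonical_cover(F)
for lhs, rhs in sorted(fc.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# B -> C
```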

Page 51: Overview - conclusions

Domain; Ref. Integrity constraints
Assertions and Triggers
Functional dependencies
    why
    definition
    Armstrong’s “axioms”
    closure and cover