TRANSCRIPT
Temple University – CIS Dept. CIS661 – Principles of Data Management
V. Megalooikonomou
Query Optimization
(based on slides by C. Faloutsos at CMU)
General Overview: Relational model - SQL; Functional Dependencies & Normalization; Physical Design; Indexing; Query optimization; Transaction processing
Overview of a DBMS
[architecture diagram: DBA, casual user, and naïve user feed the DDL parser, DML parser, and DML precompiler; below them sit the transaction manager and buffer manager, over the catalog and data files]
Overview - detailed: Why q-opt? Equivalence of expressions; Cost estimation; Cost of indices; Join strategies
Why Q-opt? SQL is ~declarative; a good q-opt makes a big difference
e.g., seq. scan vs. B-tree index, on P=1,000 pages
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic
q-opt) generate alternative plans estimate cost; pick best
Q-opt - example
select name
from STUDENT, TAKES
where c-id='CIS661' and
STUDENT.ssn=TAKES.ssn
[two query trees over STUDENT JOIN TAKES: the initial parse tree, and its canonical form with the selection pushed down]

Q-opt - example
[annotated query tree over STUDENT JOIN TAKES: the selection can use an index or a seq. scan; the join can use hash join, merge join, or nested loops]
Overview - detailed Why q-opt? Equivalence of expressions Cost estimation Cost of indices Join strategies
Equivalence of expressions A.k.a.: syntactic q-opt In short: perform selections and
projections early More details:
see transformation rules in text
Equivalence of expressions Q: How to prove a transformation rule?
A: use TRC, to show that LHS = RHS, e.g.:
σ_P(R1 ∪ R2) =? σ_P(R1) ∪ σ_P(R2)
Equivalence of expressions
σ_P(R1 ∪ R2) =? σ_P(R1) ∪ σ_P(R2)
t ∈ LHS ⟺ t ∈ σ_P(R1 ∪ R2)
⟺ t ∈ (R1 ∪ R2) and P(t)
⟺ (t ∈ R1 or t ∈ R2) and P(t)
⟺ (t ∈ R1 and P(t)) or (t ∈ R2 and P(t))
Equivalence of expressions
t ∈ RHS ⟺ t ∈ σ_P(R1) ∪ σ_P(R2)
⟺ t ∈ σ_P(R1) or t ∈ σ_P(R2)
⟺ (t ∈ R1 and P(t)) or (t ∈ R2 and P(t))
⟺ t ∈ LHS ... QED
Equivalence of expressions Q: how to disprove a rule??
A: give a counterexample, e.g., for
π_A(R1 − R2) =? π_A(R1) − π_A(R2)
Equivalence of expressions Selections
perform them early
break a complex predicate, and push:
σ_{p1 ∧ p2 ∧ … ∧ pn}(R) = σ_{p1}(σ_{p2}(…(σ_{pn}(R))…))
simplify a complex predicate: ('X=Y and Y=3') -> 'X=3 and Y=3'
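The two selection rules above can be checked mechanically on toy relations (the relation contents and predicates below are made up for illustration):

```python
# Relations as sets of tuples; sigma as a Python filter.
def select(pred, rel):
    """sigma_pred(rel): keep the tuples satisfying pred."""
    return {t for t in rel if pred(t)}

R1 = {(1, 'a'), (2, 'b'), (3, 'a')}
R2 = {(2, 'b'), (4, 'a')}

p1 = lambda t: t[1] == 'a'    # predicate on the second attribute
p2 = lambda t: t[0] >= 2      # predicate on the first attribute

# Rule: sigma_{p1 and p2}(R) = sigma_p1(sigma_p2(R))
lhs = select(lambda t: p1(t) and p2(t), R1)
rhs = select(p1, select(p2, R1))
assert lhs == rhs

# Rule (proved via TRC above): sigma_p(R1 U R2) = sigma_p(R1) U sigma_p(R2)
assert select(p1, R1 | R2) == select(p1, R1) | select(p1, R2)
```

Such spot checks can disprove a rule (one counterexample suffices) but of course never prove one.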
Equivalence of expressions Projections
perform them early (but carefully…)
smaller tuples; fewer tuples (if duplicates are eliminated)
project out all attributes except the ones requested or required (e.g., joining attr.)
Equivalence of expressions Joins
commutative, associative:
R ⋈ S = S ⋈ R
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
Q: n-way join - how many diff. orderings? … Exhaustive enumeration too slow…
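A quick way to see why exhaustive enumeration is too slow: the standard count of distinct binary join trees over n relations is n! leaf orderings times the Catalan number C(n-1) of tree shapes, which simplifies to (2(n-1))!/(n-1)!. A one-liner to tabulate it:

```python
from math import factorial

def num_join_trees(n):
    # Distinct binary join trees over n relations:
    # n! * Catalan(n-1) = (2(n-1))! / (n-1)!
    return factorial(2 * (n - 1)) // factorial(n - 1)

# Growth is explosive: n=3 -> 12, n=5 -> 1680, n=10 -> 17,643,225,600
```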
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic
q-opt) generate alt. plans estimate cost; pick best
Cost estimation Eg., find ssn's of students with an 'A' in CIS661 (using seq. scanning). How long will a query take?
CPU (but: small cost; decreasing; tough to estimate)
Disk (mainly, # block transfers)
How many tuples will qualify? (what statistics do we need to keep?)
Cost estimation
Statistics: for each relation 'r' we keep
nr : # tuples; Sr : size of tuple in bytes …
[diagram: relation 'r' as records #1 … #nr, each Sr bytes]
Cost estimation Statistics: for each relation 'r' we keep …
V(A,r): number of distinct values of attr. 'A'
(recently, histograms, too)
Derivable statistics
fr: blocking factor = max # records/block (= ?? )
br: # blocks (= ?? )
SC(A,r) = selection cardinality = avg # of records with A=given (= ?? )
[diagram: relation 'r' stored as blocks #1 … #br, fr records each]
Derivable statistics
fr: blocking factor = max # records/block (= B/Sr ; B: block size in bytes)
br: # blocks (= nr / fr )
SC(A,r) = selection cardinality = avg # of records with A=given (= nr / V(A,r) ) (assumes uniformity...)
eg.: 30,000 students, 10 colleges - how many students in CST? ~3,000
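The derivable statistics translate directly into code (a minimal sketch; the B=4096 block size used in the comments is an assumed value, not from the slides):

```python
import math

def blocking_factor(B, Sr):
    """fr: max # records per block = floor(B / Sr)."""
    return B // Sr

def num_blocks(nr, fr):
    """br: # blocks needed = ceil(nr / fr)."""
    return math.ceil(nr / fr)

def selection_cardinality(nr, V):
    """SC(A,r) = nr / V(A,r), under the uniformity assumption."""
    return nr / V

# eg., B = 4096 bytes, Sr = 200 bytes -> fr = 20 records/block
# nr = 10,000 tuples -> br = 500 blocks
# the slide's example: 30,000 students over 10 colleges -> ~3,000 in CST
```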
Additional quantities we need:
For index 'i':
fi: average fanout - degree (~50-100)
HTi: # levels of index 'i' (~2-3), ~ log(#entries)/log(fi)
LBi: # blocks at leaf level
Statistics Where do we store them? How often do we update them?
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic q-
opt) generate alt. plans
selections; sorting; projections joins
estimate cost; pick best
Cost estimation + plan generation Selections - eg.,
select * from TAKES where grade = 'A'
Plans? …
Cost estimation + plan generation Plans?
seq. scan
binary search (if sorted & consecutive)
index search, if an index exists
Cost estimation + plan generation
seq. scan - cost? br (worst case); br/2 (average, if we search for a primary key)
Cost estimation + plan generation
binary search - cost? if sorted and consecutive:
~ log2(br) + SC(A,r)/fr − 1
(SC(A,r)/fr = # blocks spanned by the qualifying tuples)
Cost estimation + plan generation
estimation of selection cardinalities SC(A,r): non-trivial - details later
Cost estimation + plan generation
method#3: index - cost? levels of index + blocks w/ qual. tuples
case#1: primary key
case#2: sec. key - clustering index
case#3: sec. key - non-clust. index
Cost estimation + plan generation
method#3: index - cost? levels of index + blocks w/ qual. tuples
case#1: primary key - cost: HTi + 1
Cost estimation + plan generation
method#3: index - cost? levels of index + blocks w/ qual. tuples
case#2: sec. key - clustering index OR prim. index on non-key
… retrieve multiple records - cost: HTi + SC(A,r)/fr
Cost estimation + plan generation
method#3: index - cost? levels of index + blocks w/ qual. tuples
case#3: sec. key - non-clust. index - cost: HTi + SC(A,r) (actually, pessimistic...)
Cost estimation - arithmetic examples
find accounts with branch-name = 'Perryridge'
account(branch-name, balance, ...)

Arithm. examples - cont'd
n-account = 10,000 tuples
f-account = 20 tuples/block
V(balance, account) = 500 distinct values
V(branch-name, account) = 50 distinct values
for branch-index: fanout fi = 20
Arithm. examples Q1: cost of seq. scan? A1: 500 disk accesses
Q2: assume a clustering index on branch-name - cost?
Cost estimation + plan generation (reminder)
method#3: index - cost? levels of index + blocks w/ qual. tuples
case#2: sec. key - clustering index - cost: HTi + SC(A,r)/fr
Arithm. examples A2:
HTi + SC(branch-name, account)/f-account
HTi: 50 values, with index fanout 20 -> HT = 2 levels (log(50)/log(20) = 1+)
SC(..) = # qualified records = nr/V(A,r) = 10,000/50 = 200 tuples
SC/f: spanning 200/20 = 10 blocks
Arithm. examples A2 final answer: 2 + 10 = 12 block accesses (vs. 500 block accesses for the seq. scan)
footnote, in all fairness:
sequential disk accesses: ~2 msec or less; random disk accesses: ~10 msec
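The arithmetic of this example can be replayed from the formulas on the preceding slides (seq. scan cost = br; clustering-index cost = HTi + SC/fr):

```python
import math

# Statistics from the example (account relation)
n_account = 10_000     # tuples
f_account = 20         # tuples/block
V_branch  = 50         # distinct branch names
fanout    = 20         # index fanout fi

br = n_account // f_account                  # 500 blocks
seq_scan_cost = br                           # A1: 500 disk accesses

HT = math.ceil(math.log(V_branch, fanout))   # log(50)/log(20) = 1+ -> 2 levels
SC = n_account // V_branch                   # 200 qualifying tuples
index_cost = HT + math.ceil(SC / f_account)  # A2: 2 + 10 = 12 disk accesses
```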
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic q-
opt) generate alternative plans
selections; sorting; projections joins
estimate cost; pick best
General Overview - rel. model Relational model - SQL Functional Dependencies &
Normalization Physical Design Indexing Query optimization Transaction processing
Q-opt steps bring query in internal form (eg., parse
tree) … into ‘canonical form’ (syntactic q-opt) generate alt. plans
selections (simple; complex predicates) sorting; projections joins
estimate cost; pick best
Reminder - statistics: for each relation 'r' we keep
nr : # tuples; Sr : size of tuple in bytes
V(A,r): number of distinct values of attr. 'A'
fr: blocking factor; br: number of blocks
SC(A,r): selection cardinality (avg # of records with A=given)
Selections we saw simple predicates
(A=constant; eg., ‘name=Smith’) how about more complex predicates,
like ‘salary > 10K’ ‘age = 30 and job-code=“analyst” ’
what is their selectivity?
Selections - complex predicates
selectivity sel(P) of predicate P := fraction of tuples that qualify; sel(P) = SC(P) / nr
Selections - complex predicates
eg., assume that V(grade, TAKES) = 5 distinct values
simple predicate P: A=constant: sel(A=constant) = 1/V(A,r); eg., sel(grade='B') = 1/5
(what if V(A,r) is unknown??)
[histogram: count by grade, A through F]
Selections - complex predicates
range query: sel(grade >= 'C')
sel(A > a) = (Amax − a) / (Amax − Amin)
[histogram: count by grade, A through F]
Selections - complex predicates
negation: sel(grade != 'C')
sel(not P) = 1 − sel(P)
(Observation: selectivity ≈ probability)
[histogram: count by grade, with the region satisfying 'P' excluded]
Selections - complex predicates
conjunction: sel(grade = 'C' and course = 'CIS661')
sel(P1 and P2) = sel(P1) * sel(P2) (INDEPENDENCE ASSUMPTION)
[Venn diagram: overlapping regions P1, P2]
Selections - complex predicates
disjunction: sel(grade = 'C' or course = 'CIS661')
sel(P1 or P2) = sel(P1) + sel(P2) − sel(P1 and P2) = sel(P1) + sel(P2) − sel(P1)*sel(P2) (INDEPENDENCE ASSUMPTION, again)
[Venn diagram: overlapping regions P1, P2]
Selections - complex predicates
disjunction, in general:
sel(P1 or P2 or … or Pn) = 1 − (1 − sel(P1)) * (1 − sel(P2)) * … * (1 − sel(Pn))
[Venn diagram: overlapping regions P1, P2]
Selections - summary
sel(A=constant) = 1/V(A,r)
sel(A > a) = (Amax − a) / (Amax − Amin)
sel(not P) = 1 − sel(P)
sel(P1 and P2) = sel(P1) * sel(P2)
sel(P1 or P2) = sel(P1) + sel(P2) − sel(P1)*sel(P2)
UNIFORMITY and INDEPENDENCE ASSUMPTIONS
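The summary formulas translate one-for-one into code (a sketch under the same uniformity and independence assumptions):

```python
def sel_eq(V):                 # sel(A = constant) = 1 / V(A,r)
    return 1 / V

def sel_gt(a, amin, amax):     # sel(A > a) = (Amax - a) / (Amax - Amin)
    return (amax - a) / (amax - amin)

def sel_not(p):                # sel(not P) = 1 - sel(P)
    return 1 - p

def sel_and(p1, p2):           # independence assumption
    return p1 * p2

def sel_or(p1, p2):            # inclusion-exclusion + independence
    return p1 + p2 - p1 * p2

# eg., V(grade, TAKES) = 5 -> sel(grade = 'B') = 1/5 = 0.2
```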
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic q-
opt) generate alt. plans
selections (simple; complex predicates) sorting; projections joins
estimate cost; pick best
Sorting Assume br blocks of rel. 'r', and only M (< br) buffers in main memory
Q1: how to sort ('external sorting')? Q2: cost?
[diagram: M in-memory buffers vs. the br blocks of 'r' on disk]
Sorting Q1: how to sort ('external sorting')? A1:
create sorted runs of size M; merge
Sorting
create sorted runs of size M (how many?)
merge them (how?)
Sorting
create sorted runs of size M
merge the first M−1 runs into a sorted run of (M−1)*M blocks, ...
Sorting How many steps do we need? 'i', where M*(M−1)^i > br
How many reads/writes per step? br + br
(each step reads every block once and writes it once)
Sorting In short, excluding the final 'write', we need
ceil(log(br/M) / log(M−1)) * 2 * br + br
block transfers
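The cost formula packages into a small calculator (the br=10,000 / M=101 figures in the comment are illustrative values, not from the slides):

```python
import math

def external_sort_cost(br, M):
    """Block transfers to sort br blocks with M in-memory buffers,
    excluding the final write (the slide's formula): run creation
    costs 2*br, each (M-1)-way merge pass costs 2*br, minus the
    final write of br."""
    merge_passes = math.ceil(math.log(br / M) / math.log(M - 1))
    return merge_passes * 2 * br + br

# eg., br = 10,000 blocks, M = 101 buffers:
# one merge pass suffices -> 2*10,000 + 10,000 = 30,000 block transfers
```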
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic q-
opt) generate alt. plans
selections (simple; complex predicates) sorting; projections, aggregations joins
estimate cost; pick best
Projection - dupl. elimination
eg., select distinct c-id from TAKES
How? Pros and cons?
Set operations
eg., select * from REGULAR-STUDENT union select * from SPECIAL-STUDENT
How? Pros and cons?
Aggregations
eg., select ssn, avg(grade) from TAKES group by ssn
How?
Q-opt steps bring query in internal form (eg., parse
tree) … into ‘canonical form’ (syntactic q-opt) generate alt. plans
selections; sorting; projections, aggregations joins
2-way joins n-way joins
estimate cost; pick best
2-way joins output size estimation: r JOIN s; nr, ns tuples each
case#1: cartesian product (R, S have no common attribute)
# of output tuples = ?? (nr * ns)
2-way joins output size estimation: r JOIN s
case#2: r(A,B), s(A,C,D), A is cand. key for 'r'
# of output tuples = ?? (<= ns)
2-way joins output size estimation: r JOIN s
case#3: r(A,B), s(A,C,D), A is cand. key for neither (is it possible??)
# of output tuples = ??
2-way joins
# of output tuples = nr * ns / V(A,s) or ns * nr / V(A,r) (whichever is less)
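The case#3 estimate as a function (the example cardinalities in the comment are made up for illustration):

```python
def join_size_estimate(nr, ns, V_A_r, V_A_s):
    """Estimated # of output tuples of r JOIN s on common attribute A,
    when A is a key for neither: take the smaller of the two estimates."""
    return min(nr * ns // V_A_s, ns * nr // V_A_r)

# eg., nr = 10,000, ns = 1,000, V(A,r) = 2,500, V(A,s) = 500:
# min(10,000*1,000/500, 1,000*10,000/2,500) = min(20,000, 4,000) = 4,000
```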
Q-opt steps bring query in internal form (eg., parse
tree) … into ‘canonical form’ (syntactic q-opt) generate alt. plans
selections; sorting; projections, aggregations joins
2-way joins - output size estimation; algorithms n-way joins
estimate cost; pick best
2-way joins algorithm(s) for r JOIN s? nr, ns tuples each
2-way joins Algorithm #0: (naive) nested loop (SLOW!)
for each tuple tr of r
  for each tuple ts of s
    print, if they match
2-way joins Algorithm #0: why is it bad? how many disk accesses ('br' and 'bs' are the number of blocks for 'r' and 's')?
nr * bs + br
2-way joins Algorithm #1: Blocked nested-loop join
read in a block of r
  read in a block of s
    print matching tuples
cost: br + br * bs
2-way joins Arithmetic example:
nr = 10,000 tuples, br = 1,000 blocks
ns = 1,000 tuples, bs = 200 blocks
alg#0: 2,001,000 disk accesses
alg#1: 201,000 disk accesses
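The two disk-access counts can be re-derived from the formulas on the previous slides:

```python
# Disk-access counts for the slide's example
nr, br = 10_000, 1_000    # tuples / blocks of r
ns, bs = 1_000, 200       # tuples / blocks of s

alg0 = nr * bs + br       # tuple-at-a-time nested loop: 2,001,000
alg1 = br + br * bs       # blocked nested loop:           201,000
```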
2-way joins Observation1: Algo#1 is asymmetric:
cost: br + br * bs; reverse the roles: cost = bs + bs * br
Best choice? smallest relation in the outer loop
2-way joins Observation2 [NOT IN BOOK]: what if we have 'k' buffers available?
read in 'k−1' blocks of 'r'
  read in a block of 's'
    print matching tuples
2-way joins Cost?
br + br/(k−1) * bs
what if br = k−1? what if we assign the k−1 blocks to the inner loop?
2-way joins Observation3: can we get rid of the 'br' term in cost: br + br * bs?
A: read the inner relation backwards half of the time! Q: cons?
2-way joins Other algorithm(s) for r JOIN s? nr, ns tuples each
2-way joins - other algo's: sort-merge
sort 'r'; sort 's'; merge the sorted versions
(good, if one or both are already sorted)
2-way joins - other algo's: sort-merge - cost: ~ 2*br*log(br) + 2*bs*log(bs) + br + bs
needs temporary space (for the sorted versions); gives output in sorted order
2-way joins - other algo's: index join
use an existing index, or even build one on the fly
cost: br + nr * c (c: look-up cost)
2-way joins - other algo's: hash join
hash 'r' into (0, 1, ..., 'max') buckets
hash 's' into buckets (same hash function)
join each pair of matching buckets
how to join each pair of partitions Hr-i, Hs-i?
A: build another hash table for Hs-i, and probe (look up) it with each tuple of Hr-i
2-way joins - hash join details
[diagram: partitions Hr-0 … Hr-max of 'r' paired with Hs-0 … Hs-max of 's']
what if Hs-i is too large to fit in main memory?
A: recursive partitioning
more details (overflows, hybrid hash joins): in book
cost of hash join? (under certain assumptions): 2(br + bs) + (br + bs) + 4*max
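The partition-then-probe structure can be sketched in memory (an illustrative stand-in for the disk-based algorithm; the 4 partitions and tuple-keyed relations below are assumptions for the sketch):

```python
from collections import defaultdict

def hash_join(r, s, key_r, key_s):
    """Hash join sketch: partition both inputs on the join attribute with
    the same hash function, then build a hash table per s-partition and
    probe it with the matching r-partition."""
    MAX = 4                                   # number of partitions (0..max)
    h = lambda v: hash(v) % MAX
    Hr, Hs = defaultdict(list), defaultdict(list)
    for t in r:
        Hr[h(key_r(t))].append(t)             # hash 'r' into buckets
    for t in s:
        Hs[h(key_s(t))].append(t)             # same hash function for 's'
    out = []
    for i in range(MAX):                      # join matching bucket pairs
        table = defaultdict(list)             # build on Hs-i ...
        for t in Hs[i]:
            table[key_s(t)].append(t)
        for t in Hr[i]:                       # ... probe with each tuple of Hr-i
            for u in table[key_r(t)]:
                out.append(t + u)
    return out
```

A real implementation writes the partitions to disk first, which is where the 2(br+bs) partitioning cost in the formula above comes from.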
Q-opt steps bring query in internal form (eg., parse
tree) … into ‘canonical form’ (syntactic q-opt) generate alt. plans
selections; sorting; projections, aggregations joins
2-way joins - output size estimation; algorithms n-way joins
estimate cost; pick best
n-way joins
r1 JOIN r2 JOIN ... JOIN rn: typically, break the problem into 2-way joins

System R: break the query into query blocks
simple queries (ie., no joins): look at the stats
n-way joins: left-deep join trees; ie., only one intermediate result at a time
pros: smaller search space; pipelining; cons: may miss the optimal plan
2-way joins: nested-loop and sort-merge
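Restricting to left-deep trees shrinks the search space to the n! orderings, which can even be scanned exhaustively for small n. A toy sketch (this is not System R's cost model: it assumes a fixed join selectivity and scores an order by the sum of its intermediate-result sizes):

```python
from itertools import permutations

def best_left_deep(sizes, selectivity=0.01):
    """Exhaustively score left-deep join orders. 'sizes' maps relation
    name -> estimated cardinality; each join's output is estimated as
    |left| * |right| * selectivity, and an order's cost is the sum of
    its intermediate-result sizes (only one intermediate at a time)."""
    best, best_cost = None, float('inf')
    for order in permutations(sizes):
        inter, cost = sizes[order[0]], 0
        for nxt in order[1:]:
            inter = int(inter * sizes[nxt] * selectivity)  # next intermediate
            cost += inter                                  # pay for producing it
        if cost < best_cost:
            best, best_cost = order, cost
    return best, best_cost
```

On three hypothetical relations it prefers starting from the small ones, which keeps every intermediate result small.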
Structure of query optimizers:
[left-deep join tree: ((r1 JOIN r2) JOIN r3) JOIN r4]
More heuristics by Oracle, Sybase and Starburst (-> DB2): in book
In general: q-opt is very important for large databases.
('explain select <sql-statement>' gives the plan)
Q-opt steps bring query in internal form (eg.,
parse tree) … into ‘canonical form’ (syntactic q-
opt) generate alt. plans
selections (simple; complex predicates) sorting; projections joins
estimate cost; pick best