070326 fragmentation
8/2/2019 070326 Fragmentation
1/17
Institut für Scientific Computing, Universität Wien, P. Brezany
Fragmentation
Univ.-Prof. Dr. Peter Brezany
Institut für Scientific Computing
Universität Wien
Tel. 4277 39425
Office hours: Tue, 13:00-14:00
Course portal: www.par.univie.ac.at/~brezany/teach/gckfk/300658.html
-
Introduction
We have already presented the various fragmentation strategies:
horizontal
vertical
hybrid (nesting horizontal and vertical fragments)
-
Horizontal Fragmentation
Primary horizontal fragmentation of a relation is performed using predicates that are defined on that relation.
Derived horizontal fragmentation is the partitioning of a relation that results from predicates being defined on another relation.
Information requirements of horizontal fragmentation:
Database information
Application information
-
Database Example
-
Database Information
It concerns the global conceptual schema. In this context it is important to note how the database relations are connected to one another, especially with joins.
Example:
Given link L1 of the above figure, the owner and member functions have the following values: owner(L1) = PAY; member(L1) = EMP.
The quantitative information required about the database is the cardinality of each relation R, denoted card(R).
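The owner/member functions and card(R) can be sketched in a few lines of Python. This is only an illustration: the `Link` class, the `relations` dictionary, and the tuples in it are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    """A link between two relations of the schema (hypothetical helper)."""
    name: str
    owner: str   # relation returned by owner(L)
    member: str  # relation returned by member(L)

# Hypothetical relation contents; only the tuple counts matter here.
relations = {
    "PAY": [("Elect. Eng.", 40000), ("Programmer", 24000)],
    "EMP": [("E1", "J. Doe", "Elect. Eng."),
            ("E2", "M. Smith", "Programmer"),
            ("E3", "A. Lee", "Programmer")],
}

L1 = Link(name="L1", owner="PAY", member="EMP")

def card(relation_name: str) -> int:
    """Cardinality of a relation: the number of tuples it holds."""
    return len(relations[relation_name])

print(L1.owner, L1.member)        # PAY EMP
print(card("PAY"), card("EMP"))   # 2 3
```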
-
Application Information
Two kinds of information are required:
qualitative information, which guides the fragmentation activity
quantitative information, which is incorporated primarily into the allocation models
The fundamental qualitative information consists of the predicates used in user queries. It is not possible to analyze all of the user applications to determine these predicates, so one should at least investigate the most important ones. A rule of thumb: the most active 20% of user queries account for 80% of the total data access.
Simple predicates: Given a relation R(A1, A2, ..., An), where Ai is an attribute defined over domain Di, a simple predicate pj defined on R has the form
$p_j : A_i \; \theta \; \text{Value}$
where $\theta \in \{=, <, \neq, \leq, >, \geq\}$ and $\text{Value} \in D_i$. We use Pri to denote the set of all simple predicates defined on a relation Ri. The members of Pri are denoted by pij.
Example: for the relation instance PROJ, the following are simple predicates:
$\text{PNAME} = \text{"Maintenance"}$
$\text{BUDGET} \leq 200000$
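As a minimal sketch, a simple predicate can be represented as an (attribute, comparison operator, value) triple and evaluated against tuples. The `SimplePredicate` class, the `OPS` table, and the PROJ tuples below are assumptions for illustration; the slides do not list the PROJ instance.

```python
import operator
from typing import Any, NamedTuple

# The six comparison operators a simple predicate may use.
OPS = {"=": operator.eq, "<": operator.lt, "!=": operator.ne,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

class SimplePredicate(NamedTuple):
    attribute: str
    theta: str
    value: Any

    def holds(self, row: dict) -> bool:
        """Evaluate 'attribute theta value' on one tuple."""
        return OPS[self.theta](row[self.attribute], self.value)

# Hypothetical PROJ instance (illustrative values only).
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

p1 = SimplePredicate("PNAME", "=", "Maintenance")
p2 = SimplePredicate("BUDGET", "<=", 200000)

print([t["PNO"] for t in PROJ if p1.holds(t)])  # ['P4']
print([t["PNO"] for t in PROJ if p2.holds(t)])  # ['P1', 'P2']
```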
-
Application Information (cont.)
User queries often include more complicated predicates, which are Boolean combinations of simple predicates.
One important combination is the minterm predicate: a conjunction of simple predicates.
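Given a set $Pr_i = \{p_{i1}, p_{i2}, \ldots, p_{im}\}$ of simple predicates for relation $R_i$, the set of minterm predicates $M_i = \{m_{i1}, m_{i2}, \ldots, m_{iz}\}$ is defined as
$M_i = \{\, m_{ij} \mid m_{ij} = \bigwedge_{p_{ik} \in Pr_i} p_{ik}^{*} \,\}, \quad 1 \leq k \leq m, \; 1 \leq j \leq z$
where $p_{ik}^{*} = p_{ik}$ or $p_{ik}^{*} = \neg p_{ik}$.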
So each simple predicate can occur in a minterm predicate either in its natural form or in its negated form.
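The definition can be illustrated by enumerating every combination of natural and negated forms. The sketch below treats predicates as plain strings; the helper names `minterms` and `show` are hypothetical. With m simple predicates it produces 2^m candidate minterms, some of which may be contradictory in practice.

```python
from itertools import product

def minterms(simple_predicates):
    """Return every conjunction in which each simple predicate appears
    exactly once, either in natural form (True) or negated (False)."""
    result = []
    for signs in product([True, False], repeat=len(simple_predicates)):
        result.append(list(zip(simple_predicates, signs)))
    return result

def show(minterm):
    """Render a minterm as readable text."""
    return " AND ".join(p if natural else f"NOT({p})" for p, natural in minterm)

Pr = ['PNAME = "Maintenance"', "BUDGET <= 200000"]
M = minterms(Pr)
print(len(M))  # 4
for m in M:
    print(show(m))
```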
-
Application Information (cont.)
Example:
-
Application Information (cont.)
In terms of quantitative information about the user applications, we need to have 2 sets of data:
1. Minterm selectivity: number of tuples of the relation that would be accessed by a user query specified according to a given minterm predicate. E.g., in the previous example, sel(m1) = 0 since there are no tuples in PAY that satisfy the minterm predicate; sel(m2) = 1.
2. Access frequency: frequency with which user applications access data. If Q = {q1, q2, ..., qq} is a set of user queries, acc(qi) indicates the access frequency of query qi in a given period.
The minterm access frequencies can be determined from the query frequencies. acc(mi) denotes the access frequency of a minterm mi.
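Minterm selectivity is simply a tuple count, as the following sketch shows. The PAY tuples and the two minterms m1 and m2 are assumptions chosen so that the counts match the values quoted above; the original example figure is not part of this transcript.

```python
# Hypothetical PAY instance (TITLE, SAL); values are illustrative only.
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]

# Assumed minterm predicates over PAY.
def m1(t):
    return t["TITLE"] == "Elect. Eng." and t["SAL"] <= 30000

def m2(t):
    return t["TITLE"] == "Elect. Eng." and t["SAL"] > 30000

def sel(minterm, relation):
    """Minterm selectivity: number of tuples satisfying the minterm."""
    return sum(1 for t in relation if minterm(t))

print(sel(m1, PAY))  # 0
print(sel(m2, PAY))  # 1
```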
-
Primary Horizontal Fragmentation
It is defined by a selection operation on the owner relations of a database schema.
Given a relation R, its horizontal fragments are given by
$R_i = \sigma_{F_i}(R), \quad 1 \leq i \leq w$
where Fi is the selection formula used to obtain fragment Ri.
Example: PROJ is decomposed into PROJ1 and PROJ2:
$\text{PROJ}_1 = \sigma_{\text{BUDGET} \leq 200000}(\text{PROJ})$
$\text{PROJ}_2 = \sigma_{\text{BUDGET} > 200000}(\text{PROJ})$
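The PROJ example can be reproduced with a small selection helper. This is a sketch under assumptions: the `select` function and the PROJ tuples are made up for illustration.

```python
def select(relation, formula):
    """Relational selection: keep the tuples that satisfy the formula."""
    return [t for t in relation if formula(t)]

# Hypothetical PROJ instance (illustrative values only).
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

PROJ1 = select(PROJ, lambda t: t["BUDGET"] <= 200000)
PROJ2 = select(PROJ, lambda t: t["BUDGET"] > 200000)

print([t["PNO"] for t in PROJ1])  # ['P1', 'P2']
print([t["PNO"] for t in PROJ2])  # ['P3', 'P4']

# The two formulas are complementary, so every tuple lands in exactly one fragment.
assert len(PROJ1) + len(PROJ2) == len(PROJ)
```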
-
Primary Horizontal Fragmentation (cont.)
Example:
-
Primary Horizontal Fragmentation (cont.)
A more formal definition of a horizontal fragment:
A horizontal fragment Ri of relation R consists of all the tuples of R that satisfy a minterm predicate mi.
Hence, given a set of minterm predicates M, there are as many horizontal fragments of R as there are minterm predicates. These fragments are also called minterm fragments.
An important aspect of simple predicates is their completeness; another is their minimality.
A set of simple predicates Pr is said to be complete if and only if there is an equal probability of access by every application to any tuple belonging to any minterm fragment that is defined according to Pr.
-
Primary Horizontal Fragmentation (cont.)
Example: Consider the fragmentation of PROJ in the last example. If the only application that accesses PROJ wants to access the tuples according to the location, the set is complete since each tuple of each fragment PROJi has the same probability of being accessed.
If there is a second application which accesses only those project tuples where the budget is less than or equal to $200,000, then Pr is not complete. Some of the tuples within each PROJi have a higher probability of being accessed due to this second application.
To make the set of predicates complete, we need to add ($\text{BUDGET} \leq 200000$, $\text{BUDGET} > 200000$) to Pr:
$Pr = \{\text{LOC} = \text{"Montreal"}, \; \text{LOC} = \text{"New York"}, \; \text{LOC} = \text{"Paris"}, \; \text{BUDGET} \leq 200000, \; \text{BUDGET} > 200000\}$
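To see what the extended Pr does, the sketch below enumerates all minterms over the five predicates and keeps only the non-empty minterm fragments. The PROJ tuples are hypothetical, and representing predicates as lambdas is an assumption of this example.

```python
from itertools import product

# Hypothetical PROJ instance (illustrative values only).
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

# The extended set Pr: label -> test on a tuple.
Pr = {
    'LOC = "Montreal"': lambda t: t["LOC"] == "Montreal",
    'LOC = "New York"': lambda t: t["LOC"] == "New York",
    'LOC = "Paris"': lambda t: t["LOC"] == "Paris",
    "BUDGET <= 200000": lambda t: t["BUDGET"] <= 200000,
    "BUDGET > 200000": lambda t: t["BUDGET"] > 200000,
}
names = list(Pr)

# Evaluate all 2^5 minterms and keep the ones that select at least one tuple.
fragments = {}
for signs in product([True, False], repeat=len(names)):
    rows = [t["PNO"] for t in PROJ
            if all(Pr[n](t) == s for n, s in zip(names, signs))]
    if rows:
        label = " AND ".join(n if s else f"NOT({n})" for n, s in zip(names, signs))
        fragments[label] = rows

print(len(fragments))  # 4
for label, rows in fragments.items():
    print(rows, "<-", label)
```

On this data, only four of the 32 minterms yield a fragment; the rest are contradictory or select nothing. Within each remaining fragment, the budget-based application reads either all tuples or none, which is the uniform-access condition that completeness asks for.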
-
Primary Horizontal Fragmentation (cont.)
The second desirable property of the set of predicates, according to which minterm predicates and, in turn, fragments are to be defined, is minimality.
If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), there should be at least one application that accesses fi and fj differently. In other words, the simple predicate should be relevant in determining a fragmentation.
If all the predicates of a set Pr are relevant, Pr is minimal.
-
Primary Horizontal Fragmentation (cont.)
Example: The set Pr defined in the previous example is complete and minimal. If, however, we were to add the predicate
$\text{PNAME} = \text{"Instrumentation"}$
to Pr, the resulting set would not be minimal since the new predicate is not relevant with respect to Pr. There is no application that would access the resulting fragments any differently.
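One possible way to test relevance mechanically is sketched below. It rests on an assumption not stated in the slides: an application "accesses fi and fj differently" when the share of tuples it reads differs between the two parts. The PROJ tuples (P5 and P6 in particular), the `apps` list, and all helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical PROJ instance; P5 and P6 are added so that fragments hold several tuples.
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop.", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "Testing", "BUDGET": 120000, "LOC": "Montreal"},
    {"PNO": "P6", "PNAME": "Research", "BUDGET": 250000, "LOC": "Montreal"},
]

# Applications modeled by the tuples they read (an assumption of this sketch).
apps = [
    lambda t: t["LOC"] == "Montreal",
    lambda t: t["LOC"] == "New York",
    lambda t: t["LOC"] == "Paris",
    lambda t: t["BUDGET"] <= 200000,
]

def ratio(app, fragment):
    """Share of the fragment's tuples that the application reads."""
    return Fraction(sum(1 for t in fragment if app(t)), len(fragment))

def is_relevant(pred, fragments):
    """True if pred splits some fragment into two non-empty parts that
    at least one application reads with different ratios."""
    for f in fragments:
        fi = [t for t in f if pred(t)]
        fj = [t for t in f if not pred(t)]
        if fi and fj and any(ratio(a, fi) != ratio(a, fj) for a in apps):
            return True
    return False

def group(relation, key):
    """Partition a relation into fragments by a key function."""
    groups = {}
    for t in relation:
        groups.setdefault(key(t), []).append(t)
    return list(groups.values())

by_loc = group(PROJ, lambda t: t["LOC"])
by_loc_budget = group(PROJ, lambda t: (t["LOC"], t["BUDGET"] <= 200000))

budget_relevant = is_relevant(lambda t: t["BUDGET"] <= 200000, by_loc)
pname_relevant = is_relevant(lambda t: t["PNAME"] == "Instrumentation", by_loc_budget)
print(budget_relevant)  # True
print(pname_relevant)   # False
```

Here the budget predicate is relevant with respect to the location-only fragments, while the PNAME predicate splits a fragment without any application noticing the difference.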
-
Derived Horizontal Fragmentation
A derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on its owner.
Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as
$R_i = R \ltimes S_i, \quad 1 \leq i \leq w$
where w is the maximum number of fragments that will be defined on R, and $S_i = \sigma_{F_i}(S)$, where Fi is the formula according to which the primary horizontal fragment Si is defined.
-
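Derived Horizontal Fragmentation (cont.)
Example: Consider link L1, where owner(L1) = PAY and member(L1) = EMP. Then we can group engineers into 2 groups according to their salary: those earning at most 30,000 dollars and those earning more than 30,000 dollars. The 2 fragments EMP1 and EMP2 are defined:
$\text{EMP}_1 = \text{EMP} \ltimes \text{PAY}_1$
$\text{EMP}_2 = \text{EMP} \ltimes \text{PAY}_2$
where
$\text{PAY}_1 = \sigma_{\text{SAL} \leq 30000}(\text{PAY})$
$\text{PAY}_2 = \sigma_{\text{SAL} > 30000}(\text{PAY})$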
Derived horizontal fragmentation of EMP
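The EMP example can be sketched with a selection on the owner followed by a semijoin on the member. The tuples below and the choice of TITLE as the join attribute are assumptions for illustration; `select` and `semijoin` are hypothetical helpers.

```python
# Hypothetical owner and member instances (illustrative values only).
PAY = [
    {"TITLE": "Elect. Eng.", "SAL": 40000},
    {"TITLE": "Syst. Anal.", "SAL": 34000},
    {"TITLE": "Mech. Eng.", "SAL": 27000},
    {"TITLE": "Programmer", "SAL": 24000},
]
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng."},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Syst. Anal."},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "Mech. Eng."},
    {"ENO": "E4", "ENAME": "J. Miller", "TITLE": "Programmer"},
    {"ENO": "E5", "ENAME": "B. Casey", "TITLE": "Syst. Anal."},
]

def select(relation, formula):
    """Relational selection: keep the tuples that satisfy the formula."""
    return [t for t in relation if formula(t)]

def semijoin(r, s, attr):
    """Semijoin of r with s on attr: tuples of r that have a match in s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

# Primary fragments of the owner relation.
PAY1 = select(PAY, lambda t: t["SAL"] <= 30000)
PAY2 = select(PAY, lambda t: t["SAL"] > 30000)

# Derived fragments of the member relation (TITLE is the assumed join attribute).
EMP1 = semijoin(EMP, PAY1, "TITLE")
EMP2 = semijoin(EMP, PAY2, "TITLE")

print([t["ENO"] for t in EMP1])  # ['E3', 'E4']
print([t["ENO"] for t in EMP2])  # ['E1', 'E2', 'E5']
```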