itis 5160 indexing. indexing datacubes objective: speed queries up. traditional databases (oltp):...

17
ITIS 5160 Indexing

Post on 21-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

ITIS 5160

Indexing

Page 2: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Indexing datacubes

Objective: speed queries up.

Traditional databases (OLTP): B-Trees

• Time and space logarithmic to the amount of indexed keys.

• Dynamic, stable and exhibit good performance under updates. (But OLAP is not about updates….)

Bitmaps:

• Space efficient

• Difficult to update (but we don’t care in DW).

• Can effectively prune searches before looking at data.

Page 3: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

BitmapsR = (…., A,….., M)

R (A) B8 B7 B6 B5 B4 B3 B2 B1 B0

3 0 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 1 0 0 8 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 7 0 1 0 0 0 0 0 0 0 5 0 0 0 1 0 0 0 0 0 6 0 0 1 0 0 0 0 0 0 4 0 0 0 0 1 0 0 0 0

Page 4: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Query optimization

Consider a high-selectivity-factor query with predicates on two attributes.

Query optimizer: builds plans(P1) Full relation scan (filter as you go).(P2) Index scan on the predicate with lower selectivity

factor, followed by temporary relation scan, to filter out non-qualifying tuples, using the other predicate. (Works well if data is clustered on the first index key).

(P3) Index scan for each predicate (separately), followed by merge of RID.

Page 5: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Query optimization (continued)

When using bitmap indexes (P3) can be an easy winner!

CPU operations in bitmaps (AND, OR, XOR, etc.) are more efficient than regular RID merges: just apply the binary operations to the bitmaps

(In B-trees, you would have to scan the two lists and select tuples in both -- merge operation--)

Of course, you can build B-trees on the compound key, butwe would need one for every compound predicate (exponential number of trees…).

Page 6: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Bitmaps and predicates

A = a1 AND B = b2

Bitmap for a1 Bitmap for b2

AND =

Bitmap for a1 and b2

Page 7: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Tradeoffs

Dimension cardinality small dense bitmaps

Dimension cardinality large sparse bitmaps

Compression

(decompression)

Page 8: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Star-Joins

Select F.S, D1.A1, D2.A2, …. Dn.An

from F,D1,D2,Dn where F.A1 = D1.A1

F.A2 = D2.A2 … F.An = Dn.An

and D1.B1 = ‘c1’ D2.B2 = ‘p2’ ….

Likely strategy:

For each Di find suitable values of Ai such that Di.Bi = ‘xi’ (unless you have a bitmap index for Bi). Use bitmap index on Ai’ values to form a bitmap for related rows of F (OR-ing the bitmaps).

At this stage, you have n such bitmaps, the result can be found AND-ing them.

Page 9: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

BitmapsR = (…., A,….., M) value-list index

R (A) B8 B7 B6 B5 B4 B3 B2 B1 B0

3 0 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 1 0 0 8 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 7 0 1 0 0 0 0 0 0 0 5 0 0 0 1 0 0 0 0 0 6 0 0 1 0 0 0 0 0 0 4 0 0 0 0 1 0 0 0 0

Page 10: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Examplesequence <3,3> value-list index (equality)

R (A) B22

B12

B02 B2

1 B11 B0

1

3 (1x3+0) 0 1 0 0 0 1 2 0 0 1 1 0 0 1 0 0 1 0 1 0 2 0 0 1 1 0 0 8 1 0 0 1 0 0 2 0 0 1 1 0 0 2 0 0 1 1 0 0 0 0 0 1 0 0 1 7 1 0 0 0 1 0 5 0 1 0 1 0 0 6 1 0 0 0 0 1 4 0 1 0 0 1 0

Page 11: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Encoding scheme

Equality encoding: all bits to 0 except the one that corresponds to the value

Range Encoding: the vi rightmost bits to 0, the remaining to 1

Page 12: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Range encodingsingle component, base-9

R (A) B8 B7 B6 B5 B4 B3 B2 B1 B0

3 1 1 1 1 1 1 0 0 0 2 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0 8 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 7 1 1 0 0 0 0 0 0 0 5 1 1 1 1 0 0 0 0 0 6 1 1 1 0 0 0 0 0 0 4 1 1 1 1 1 0 0 0 0

Page 13: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

RangeEval

Evaluates each range predicate by computing two bitmaps: BEQ bitmap and either BGT or BLT

RangeEval-Opt uses only <=

A < v is the same as A <= v-1

A > v is the same as Not( A <= v)

A >= v is the same as Not (A <= v-1)

Page 14: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Example (revisited)sequence <3,3> value-list index(Equality)

R (A) B22

B12

B02 B2

1 B11 B0

1

3 (1x3+0) 0 1 0 0 0 1 2 0 0 1 1 0 0 1 0 0 1 0 1 0 2 0 0 1 1 0 0 8 1 0 0 1 0 0 2 0 0 1 1 0 0 2 0 0 1 1 0 0 0 0 0 1 0 0 1 7 1 0 0 0 1 0 5 0 1 0 1 0 0 6 1 0 0 0 0 1 4 0 1 0 0 1 0

Page 15: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

Examplesequence <3,3> range-encoded index

R (A) B12

B02 B1

1 B01

3 1 0 1 1 2 1 1 0 0 1 1 1 1 0 2 1 1 0 0 8 0 0 0 0 2 1 1 0 0 2 1 1 0 0 0 1 1 1 1 7 0 0 1 0 5 1 0 0 0 6 0 0 1 1 4 1 0 1 0

Page 16: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of

RangeEval-OPT

Page 17: ITIS 5160 Indexing. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of