Online REORG DB2 1GHS 4/19/23

A Method for Online Reorganization of a Database

Gary Sockut

(joint work with Thomas Beavin & Chung-Chia Chang)

Work performed at

IBM Silicon Valley Laboratory, San Jose, CA

Introduction: reorganization

Reorganization: Δ way arranged (logical / physical).

Any DBMS & database might need:

• Δ logical database definition: generalize 1-many -► many-many, split column.

• Δ physical database definition: construct index, split partition.

• Restore physical arrangement of instances without Δ definition:
  * Compact space.
  * Collect garbage.
  * Restore clustering (store near each other; criteria). ▼ I/O.
  Writing degrades; reorganization restores. This method.

Introduction: offline vs. online reorganization

Traditional: reorganize offline:

[Figure: timeline (TIME): USAGE, then BATCH WINDOW (offline OK) containing REORGANIZATION, then USAGE.]

2 categories: unacceptable:

• highly available (24 x 7) (web commerce, armed forces)

• very large

Reorganize online (concurrently with usage or incrementally within users’ transactions). Importance ▲.

Introduction: online reorganization

Method for restoring clustering online. IBM’s Database 2 (DB2) relational DBMS.

Rest of presentation:

• Storage structures

• Overview (unload, reload, process log); problem; solution

• Steps of reorganization & more details

• Comparison: previous research

Storage structures

• Rows of tables in data records in file pages:

[Figure: file page: LRSN | data (data record) | data (data record) | * * (ID map)]

• Header of page: position in log (log record sequence number (LRSN)) current when most recently written.

• Not unique key. File pages, indexes, entries in log use record identifier (RID) to identify records:
  – Page # & # within page. ID map.
  – Δ only during reorganization.

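Storage structures

• Variable-length: grow: regular -► pointer & overflow:

[Figure: two file pages. First file page: LRSN | data (regular data record) | RID (pointer data record) | * * (ID map). Second file page: LRSN | data (regular data record) | data (overflow data record) | * * (ID map). The pointer's RID identifies the overflow data record on the other page: bad!]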

Storage structures: writing

Users write rows -► DBMS writes data records (& log entries).

Usually, Insert / Update / Delete row -► 1 regular data record.

Exceptions on Update:

• Has regular & new data too large for regular's page -► I overflow & U regular to pointer.

• Has overflow & new data fits on overflow's page -► U overflow to overflow.

• Has overflow & new data too large for overflow's page -►
  * Now room on pointer's page -► U pointer to regular & D overflow.
  * Still no room on pointer's page -► I overflow, D overflow, & U pointer to pointer.

Exception on Delete: has overflow -► D pointer & overflow.

Backout (undo if failure) reverses.
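
The writing rules on this slide can be read as a small decision table. The following is a minimal sketch of that table, not DB2's implementation; the function name `physical_ops` and its parameters are hypothetical, and the returned strings simply reuse the slide's I / U / D shorthand.

```python
def physical_ops(op, has_overflow=False, fits_current_page=True,
                 room_on_pointer_page=False):
    """Return the data-record operations for one row-level write.

    op: "I", "U", or "D" (row-level Insert / Update / Delete).
    has_overflow: the row is currently stored as pointer + overflow.
    fits_current_page: the new data fits on the page that holds the row's
        data (the regular's page, or the overflow's page if has_overflow).
    room_on_pointer_page: the new data now fits on the pointer's page.
    All names here are illustrative assumptions, not DB2 identifiers.
    """
    if op == "I":
        return ["I regular"]
    if op == "D":
        # Exception on Delete: a row with overflow loses both records.
        return ["D pointer", "D overflow"] if has_overflow else ["D regular"]
    if op != "U":
        raise ValueError("op must be 'I', 'U', or 'D'")
    if not has_overflow:
        if fits_current_page:
            return ["U regular to regular"]
        return ["I overflow", "U regular to pointer"]
    if fits_current_page:
        return ["U overflow to overflow"]
    if room_on_pointer_page:
        return ["U pointer to regular", "D overflow"]
    return ["I overflow", "D overflow", "U pointer to pointer"]
```

For example, `physical_ops("U", has_overflow=False, fits_current_page=False)` yields the "I overflow & U regular to pointer" case from the first exception.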

Storage structures: reorganization

• Database administrator invokes reorganization:
  * Restores clustering (clustering key: not necessarily unique).
  * Removes overflow.
  * Distributes free space.
  Improves performance.

• Area being reorganized: set of 1 or more tables or set of 1 or more partitions of 1 table (parameter).

• Offline:

1. Read-only (R/O): unload data, sort.

2. No access: reload.

• Online: Read/write (R/W): most. --►

Overview of method for online reorganization

An inspiration: Fuzzy dumping: unload data (backup) while letting users write:

A. Dumping: (1) record current LRSN for log; (2) unload.

B. Recovery: (1) reload; (2) bring up to date: apply log entries (start: recorded LRSN). Ignore log entry whose LRSN ≤ LRSN of indicated page.

[Figure: (1) database -► backup; (2) backup & log -► database.]

-► Fuzzy reorganization.
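
The recovery half of fuzzy dumping can be illustrated with a toy model. This is a sketch under stated assumptions, not DB2's recovery code: pages are plain Python objects, the `Page`, `LogEntry`, and `recover` names are hypothetical, and a log entry carries the whole new value of one record. It shows only the rule that a log entry is ignored when its LRSN ≤ the LRSN of the page it indicates.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    lrsn: int = 0                         # LRSN current when page was last written
    records: dict = field(default_factory=dict)


@dataclass
class LogEntry:
    lrsn: int
    page_no: int
    slot: int
    value: object                         # None stands for a deletion


def recover(backup, log, recorded_lrsn):
    """Reload the backup, then apply log entries starting at recorded_lrsn."""
    db = {n: Page(p.lrsn, dict(p.records)) for n, p in backup.items()}
    skipped = 0
    for e in sorted(log, key=lambda e: e.lrsn):
        if e.lrsn < recorded_lrsn:
            continue                      # precedes the recorded starting point
        page = db.setdefault(e.page_no, Page())
        if e.lrsn <= page.lrsn:
            skipped += 1                  # dumped page already reflects this write
            continue
        if e.value is None:
            page.records.pop(e.slot, None)
        else:
            page.records[e.slot] = e.value
        page.lrsn = e.lrsn
    return db, skipped
```

In this model a page that was unloaded after a user's write already has that write's LRSN in its header, so the matching log entry is skipped rather than applied twice.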

Overview of method for online reorganization

A) Record current LRSN; unload, sort, reload (reorganize). (R/W):

[Figure: users READ and WRITE data in old copy of area; writing also goes to log; UNLOAD, SORT, RELOAD from old copy of area into new copy of area.]

B) Process log (read; apply). (n iterations R/W; 1 R/O):

[Figure: users READ and WRITE data in old copy of area; writing also goes to log; PROCESS log into new copy of area.]

C) Switch users' accessing to new. (brief offline; R/W):

[Figure: users READ and WRITE data in new copy of area; writing also goes to log; names (name 0, name 1) of old and new copies exchanged.]

Problem & solution

Problem: reorganization Δ RIDs:

• Log entries: old RIDs.

• Apply: identify data record in new copy (new RID).

Solution:

• Temporary table maps old & new RIDs, stores LRSNs.

• Translate log entries before applying.

Novelty: interaction: maintain table & process log.

Main steps of reorganization (more detail)

1. Record current LRSN for log.

2. Unload data, sort, reload into new (includes indexes), create backup copy. Users R/W old. Table maps: old & new RIDs. Record current LRSN.

3. Process log: iterate:
   * Read subset of log between recorded LRSNs (users' writing): translate; apply to new. R/W.
   * Record current LRSN. Iterate or next step (criteria).
   * Last iteration: append changed pages to backup.

4. Quiesce writing (block; wait until finished). Record LRSN.

5. Process log. Append changed pages to backup. R/O.

6. Quiesce all access.

7. Switch future accesses: exchange names of files (data, indexes). Users: no access.

8. Start R/W new.

9. Delete old.

--►

Unloading, sorting, & reloading (R/W)

[Figure: old copy of area -► (1. UNLOAD) -► sort file -► (2. SORT) -► (3. RELOAD) -► new copy of area; 1. ADD ENTRY and 3. ADD ENTRY go to mapping table (columns: old RID, new RID).]

1. Unload: scan sequentially:
   * Regular or overflow: unload data, old RID, & LRSN of page.
   * Pointer: add entry to mapping table.

2. Sort by clustering key.

3. Reload: also add entry to mapping table.
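
The three steps above can be sketched as a toy in-memory routine. Everything here is an illustrative assumption: the `MapEntry` and `unload_sort_reload` names, the page representation, the fixed `records_per_page`, and the choice to give a pointer's mapping entry no new RID and the LRSN of its page. The slide does not specify those details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MapEntry:
    new_rid: Optional[tuple]   # None: no new RID (entry added for a pointer)
    lrsn: int                  # LRSN of the old page when it was unloaded


def unload_sort_reload(old_pages, records_per_page=2):
    """old_pages: {page_no: (page_lrsn, [(kind, key, data), ...])},
    where kind is "regular", "overflow", or "pointer".
    Returns (new_pages, mapping) with mapping: {old_rid: MapEntry}."""
    mapping = {}
    sort_file = []
    # 1. Unload: scan sequentially.
    for page_no in sorted(old_pages):
        page_lrsn, records = old_pages[page_no]
        for slot, (kind, key, data) in enumerate(records):
            old_rid = (page_no, slot)
            if kind == "pointer":
                mapping[old_rid] = MapEntry(None, page_lrsn)
            else:
                sort_file.append((key, data, old_rid, page_lrsn))
    # 2. Sort by clustering key (not necessarily unique; sort is stable).
    sort_file.sort(key=lambda rec: rec[0])
    # 3. Reload into the new copy; also add an entry to the mapping table.
    new_pages = {}
    for i, (key, data, old_rid, page_lrsn) in enumerate(sort_file):
        new_rid = divmod(i, records_per_page)   # (page #, # within page)
        new_pages.setdefault(new_rid[0], []).append((key, data))
        mapping[old_rid] = MapEntry(new_rid, page_lrsn)
    return new_pages, mapping
```

A real reload would also leave free space on each page and build indexes; the sketch omits both.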

Processing of log (n R/W & 1 R/O): 5 phases

1. COPY: log -► copy of log (buffer). Pointers: sort: LRSN.

2. SORT BY OLD RID (speed access to mapping table). Pointers: sort: old RID.

3. TRANSLATE RIDS (insertion: estimate, for sorting). Uses mapping table. Pointers: sort: old RID.

4. SORT BY NEW RID (speed access to new copy of area). Pointers: sort: new RID.

5. APPLY (insertion: actual). Uses mapping table; writes new copy of area. Pointers: sort: new RID.

3, 5, control of iterations --►
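
A minimal sketch of how the five phases fit together, assuming a heavily simplified model: the log subset holds only updates to regular records, the mapping table is a plain dictionary from old RID to new RID, and the "pointers" are list indexes into the buffered copy of the log. The `process_log` name and the data layout are hypothetical. The point illustrated is that only the pointers are re-sorted between phases, while the buffered log entries stay in place.

```python
def process_log(log_subset, mapping, new_copy):
    """log_subset: iterable of (lrsn, old_rid, data) update entries.
    mapping: {old_rid: new_rid}.  new_copy: {new_rid: data}, changed in place.
    Returns the new RIDs in the order in which entries were applied."""
    # 1. COPY: buffer the log subset; pointers start out in LRSN order.
    buffer = [{"lrsn": lrsn, "old_rid": old, "new_rid": None, "data": data}
              for lrsn, old, data in sorted(log_subset, key=lambda t: t[0])]
    pointers = list(range(len(buffer)))
    # 2. SORT BY OLD RID: mapping-table lookups then proceed in RID order.
    pointers.sort(key=lambda i: buffer[i]["old_rid"])
    # 3. TRANSLATE RIDS: store the new RID in the buffered log entry;
    #    entries with no mapping-table entry are omitted.
    kept = []
    for i in pointers:
        new_rid = mapping.get(buffer[i]["old_rid"])
        if new_rid is not None:
            buffer[i]["new_rid"] = new_rid
            kept.append(i)
    # 4. SORT BY NEW RID: accesses to the new copy proceed in page order;
    #    ties keep LRSN order so the latest update wins.
    kept.sort(key=lambda i: (buffer[i]["new_rid"], buffer[i]["lrsn"]))
    # 5. APPLY to the new copy of the area.
    for i in kept:
        new_copy[buffer[i]["new_rid"]] = buffer[i]["data"]
    return [buffer[i]["new_rid"] for i in kept]
```

Insertions, deletions, pointer and overflow records, and the LRSN comparison against the mapping table are left out of this sketch.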

Phase 3: translation of RIDs

DBMS's log application (recovery) ignores log entry if LRSN ≤ page’s. Translation ignores log entry if LRSN ≤ mapping table’s.

Insertion:

• R or O: estimate new RID; store in log entry (buffer); insert entry in mapping table (old, estimated new).

• P: delete log entry; insert entry in mapping table (old, no new).

Update:

• R to R or O to O: store new RID in log entry.

• P to P: delete log entry.

• P to R: ~I: estimate new RID. Log entry: store estimated new RID; U -► I. Mapping table: store estimated new RID.

• R to P: ~D: log entry: store new RID; U -► D.

Deletion:

• R or O: store new RID in log entry; delete mapping table’s entry.

• P: delete log entry; delete mapping table’s entry.
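
The translation cases above can be sketched as one function over a buffered log entry. This is an illustration under assumptions, not the authors' code: the `LogEntry`, `MapEntry`, and `translate` names are hypothetical, record kinds are abbreviated "R", "O", "P" as on the slide, `estimate_new_rid` is a caller-supplied stand-in for the estimation step, and a new mapping entry is assumed to store the log entry's LRSN. Returning `None` stands for "delete (or ignore) the log entry".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    lrsn: int
    op: str                          # "I", "U", or "D"
    old_rid: tuple
    before: Optional[str] = None     # record kind before the write: "R", "O", "P"
    after: Optional[str] = None      # record kind after the write
    new_rid: Optional[tuple] = None  # filled in by translation


@dataclass
class MapEntry:
    new_rid: Optional[tuple]
    lrsn: int


def translate(entry, mapping, estimate_new_rid):
    """Translate one buffered log entry in place; None means it is dropped."""
    m = mapping.get(entry.old_rid)
    if m is not None and entry.lrsn <= m.lrsn:
        return None                            # LRSN <= mapping table's
    if entry.op == "I":
        if entry.after == "P":
            mapping[entry.old_rid] = MapEntry(None, entry.lrsn)
            return None
        entry.new_rid = estimate_new_rid(entry)
        mapping[entry.old_rid] = MapEntry(entry.new_rid, entry.lrsn)
        return entry
    if m is None:
        return None                            # assumption: nothing to translate
    if entry.op == "U":
        if entry.before == "P" and entry.after == "P":
            return None
        if entry.before == "P":                # P to R: treat like insertion
            entry.new_rid = estimate_new_rid(entry)
            entry.op = "I"
            m.new_rid = entry.new_rid
            return entry
        entry.new_rid = m.new_rid              # R to R, O to O, or R to P
        if entry.after == "P":                 # R to P: treat like deletion
            entry.op = "D"
        return entry
    if entry.op == "D":
        del mapping[entry.old_rid]
        if entry.before == "P":
            return None
        entry.new_rid = m.new_rid
        return entry
    raise ValueError("op must be 'I', 'U', or 'D'")
```

The slide does not say whether the R to P case changes the mapping table, so the sketch leaves the entry untouched there.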

Phase 5: application

Scan set of pointers to log entries (sorted by new RID). For each RID value:

1. Find all pointers (contiguous).

2. ≥ 1 D log entry -► delete certain log entries:
   * 1st entry is I -► delete last D & all preceding.
   * 1st entry is D or U -► keep last D; delete all preceding.
   Omit log entries for which no entries in mapping table; omit I U U D.

3. Apply sequentially:
   I:
   * Insert in new copy; obtain actual new RID.
   * In mapping table & in log entries for current RID, estimated new RID -► actual.
   U or D: treat like DBMS's handling of user's.
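
The rule for deleting log entries when a RID's group contains at least one D can be sketched as a small function. The `prune` name and the representation of a group as a list of operation letters (in LRSN order, all for one new RID value) are assumptions for illustration.

```python
def prune(ops):
    """ops: operations ("I", "U", "D") for one RID value, in log order.
    Returns the operations that still need to be applied."""
    if "D" not in ops:
        return list(ops)
    last_d = len(ops) - 1 - ops[::-1].index("D")   # position of the last D
    if ops[0] == "I":
        # The record did not exist before this group: the last D and
        # everything preceding it cancel out.
        return list(ops[last_d + 1:])
    # 1st entry is D or U: the record existed before this group, so the
    # last D must still be applied; everything preceding it is redundant.
    return list(ops[last_d:])
```

For instance, `prune(["I", "U", "U", "D"])` returns an empty list, matching the slide's "omit I U U D", while `prune(["U", "U", "D", "I", "U"])` keeps `["D", "I", "U"]`.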

Control of iterations of log processing

Iterate (allowing R/W) until:

1. Estimated time for next ≤ parameter for maximum R/O -► next is last (R/O). Parameter: trade-off. --►

2. Estimated completion for next > deadline (parameter) -► cancel reorganization.

3. Amount of log for next not sufficiently < current (not catching up) -► send message to operator: After delay (parameter), DBMS will continue, quiesce writing, or cancel (parameter). During delay, database administrator can Δ parameters, adjust priorities, quiesce writing, cancel, or let DBMS take action.

Display status.

[Figure: graph of time (axis: HI to LO) vs. iteration.]
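
The three stopping criteria can be sketched as a decision made before each iteration. This is a hedged illustration: the `next_action` name, its parameters, the returned labels, and especially `catch_up_ratio` (a stand-in for "sufficiently <") are assumptions, not names of DB2 parameters.

```python
def next_action(est_next_seconds, max_read_only_seconds,
                est_completion, deadline,
                next_log_amount, current_log_amount,
                catch_up_ratio=0.9):
    """Decide what to do before the next iteration of log processing.

    est_completion and deadline are times on the same scale (for example,
    seconds since the reorganization started). catch_up_ratio is a
    hypothetical threshold for 'not sufficiently less than current'.
    """
    # 1. Next iteration is short enough to run R/O: make it the last one.
    if est_next_seconds <= max_read_only_seconds:
        return "last_iteration_read_only"
    # 2. Next iteration would finish after the deadline: cancel.
    if est_completion > deadline:
        return "cancel"
    # 3. Log to process is not shrinking enough: tell the operator, who can
    #    change parameters, adjust priorities, quiesce writing, or cancel.
    if next_log_amount > catch_up_ratio * current_log_amount:
        return "notify_operator"
    return "iterate_read_write"
```

The order of the checks follows the order on the slide; the slide itself does not state a precedence among the three criteria.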

Scheduling of online reorganization

1. High tolerance of delay -► R/O & offline tolerable.

2. Low rate of writing -► easy to catch up to log.

3. No long-running transactions -► quick quiescing.

Trade-off.

Comparison with previous research

Calculation of clustering: not novel; sort by clustering key (index); assign to pages.

Compare mapping tables, fuzzy reorganization:

1) Mapping tables:

• Omiecinski et al.: reorganization in place; map RIDs to translate entries in leaves of indexes.

• Wiener & Naughton: loading data into object database; map surrogate object identifiers (source file) into object IDs.

2) Several authors mention fuzzy reorganization (no detail). Unique identifier does not Δ. No mapping table. Many customers dislike requirement. Fuzzy reorganization when RIDs Δ --►

Comparison with previous research

O'Toole et al.: fuzzy garbage collection for persistent data. Forwarding field:

[Figure: old copy: data | RID (forwarding field) -► new copy: data.]

In database context (not O'Toole), mapping table’s advantages:

1. Reorganization copies, user deletes, DBMS reuses space (not O'Toole). Mapping table safe; forwarding field gone.

2. Mapping table entry < data record. Fewer pages. Less I/O to read/write forwarding information.

3. Less locking for old (unload old, reload/map, process log): Mapping table: (R, no, no). Forwarding field: (R, W, R/W?).

4. Avoid extra space (permanent) for forwarding field (O'Toole: space already existed).

Forwarding field’s advantages:

1. One field (already existed: O'Toole), not two (temporary).

2. Seems simpler.

Possible (but rejected) alternatives

1. Reorganize only index online: support, but not enough.

2. Reorganize partition (index) (support): fine-grained; reorganize offline:
   * Correct other indexes.
   * Slow routing, increase space for partition descriptors.
   * Less uniform growth/shrinkage -► increase total free space:
     • coarse: | high growth shrinkage low growth |
     • fine: | high growth | shrinkage | low growth |

3. Reorganize online in place (not by copying): inaccuracy. --►
   Complexity and/or slowness, or low-concurrency locking.

[Figure: reorganization Δ record’s position relative to user’s position in scan.]

Publications

Web site: Google GARY SOCKUT (http://alum.mit.edu/www/ghs); click “publications”: -► pdf.

1. This work: G. H. Sockut, T. A. Beavin, & C.-C. Chang, “A Method for On-line Reorganization of a Database,” IBM Systems Journal, Vol. 36, No. 3, 1997, pp. 411-436; erratum in Vol. 37, No. 1, 1998, p. 152. Web site: slides.

2. Survey on online reorganization (not just IBM, not just clustering, not just fuzzy): G. H. Sockut & B. R. Iyer, “Online Reorganization of Databases,” Computing Surveys, Vol. 41, No. 3, ACM, Article 14, July 2009, 136 pages. Web site: table of contents.

Summary

Reorganize online:

• Copy data from old to new in reorganized form; R/W. Table maps between old & new RIDs.

• Apply log to new; map.

• Switch users' accessing to new.

Appendices

Appendix: effect of fine-grained partitioning

Population densities in the area covered by the 50 states:

[Figure: population density (people / square mile) vs. granularity of partitioning: coarse (state), fine (square mile), finer. Coarse: from Alaska, ~1 (census), to New Jersey, ~1K (census). Fine: from uninhabited area, 0, to urban area with many high-rise apartment buildings (Manhattan Community District 8: ~ Upper East Side), ~100K (demographia.com).]
