select operation strategies and indexing (chapter 8)

Select Operation Strategies

And Indexing

(Chapter 8)

Disk access

• DBs traditionally stored on disk

• Cheaper to store on disk than in memory

• Seek time, latency, data transfer time

• disk access is page oriented

• 2 - 4 KB page size

Access time

• Time to randomly access a page – 12-20 ms – 50-83 I/O's per second

• large disparity between disk access and memory access (10-200 ns)

• hash disk page address and look in lookaside table to see if page in memory buffer

• In memory DBs the future?
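The lookaside-table idea above can be sketched as a hash table keyed by disk page address. This is only an illustration: the class name `BufferPool`, the `read_page_from_disk` callback, and the least-recently-used eviction policy are assumptions, not something the slides specify.

```python
from collections import OrderedDict

class BufferPool:
    """Toy page buffer: a hash table keyed by disk page address."""

    def __init__(self, capacity, read_page_from_disk):
        self.capacity = capacity
        self.read_page_from_disk = read_page_from_disk  # hypothetical disk-read callback
        self.frames = OrderedDict()                     # page address -> page contents
        self.disk_reads = 0

    def get_page(self, page_addr):
        if page_addr in self.frames:            # hash lookup: page already in memory
            self.frames.move_to_end(page_addr)  # mark as most recently used
            return self.frames[page_addr]
        self.disk_reads += 1                    # miss: pay for a disk I/O
        page = self.read_page_from_disk(page_addr)
        self.frames[page_addr] = page
        if len(self.frames) > self.capacity:    # assumed policy: evict least recently used
            self.frames.popitem(last=False)
        return page

pool = BufferPool(capacity=2, read_page_from_disk=lambda addr: f"page-{addr}")
for addr in [7, 7, 9, 7, 11, 9]:
    pool.get_page(addr)
print(pool.disk_reads)
```

Only the misses cost a disk access; repeated requests for a buffered page are answered at memory speed.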

Table scan

• Linear search - all data rows read in – I/O parallelism can be used

• multiple I/O read requests satisfied at the same time

• stripe the data across different disks

– Problems with parallelism?

• must balance disk arm load to gain maximum parallelism

• requires the same total number of random I/O's, but using devices for a shorter time
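A rough sketch of the striping point: the total number of I/O's does not change, only the elapsed time. Round-robin placement (page p on disk p mod n) and the 0.0125 s per random I/O figure are assumptions chosen for illustration.

```python
def striped_scan(num_pages, num_disks, seconds_per_io):
    """Round-robin striping: page p lives on disk p % num_disks."""
    pages_per_disk = [0] * num_disks
    for p in range(num_pages):
        pages_per_disk[p % num_disks] += 1
    total_ios = sum(pages_per_disk)                 # unchanged by striping
    elapsed = max(pages_per_disk) * seconds_per_io  # disks work in parallel
    return total_ios, elapsed

ios_1, t_1 = striped_scan(1000, 1, 0.0125)
ios_4, t_4 = striped_scan(1000, 4, 0.0125)
print(ios_1, t_1)
print(ios_4, t_4)
```

The elapsed time is governed by the busiest disk, which is why an unbalanced load loses most of the benefit.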

Sequential prefetch I/O

• retrieve one disk page after another (on same track) - typically 32

• seek time no longer a problem

• must know in advance to read 32 successive pages

• speed up of I/O by a factor of 10 (500 I/O's per second vs. 70)

Access time

• Seek time – 10-15ms

• Latency time – 2-5 ms

• Data transfer time – about 0.5-1.5 ms per page
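As a quick check, the seek and latency ranges above add up to the 12-20 ms random access time quoted earlier, which gives 50-83 I/O's per second. The function name is illustrative, and the comparatively small transfer time is ignored here.

```python
def ios_per_second(seek_s, latency_s, transfer_s=0.0):
    """Random page reads per second for one disk arm."""
    return 1.0 / (seek_s + latency_s + transfer_s)

best = ios_per_second(0.010, 0.002)   # 12 ms per access
worst = ios_per_second(0.015, 0.005)  # 20 ms per access
print(round(worst), round(best))
```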

Access time for fast I/O

                                   RIO            Seq. Prefetch
Seek - disk arm to cylinder        .010           .010
Latency - platter to sector        .002           .002
Data transfer - Page               .0015          .048
Total                              .0135          .060
                                   (1 page)       (32 pages)

For 32 pages for both:             .43 seconds    .060 seconds

Textbook access time

                                   RIO            Seq. Prefetch
Seek - disk arm to cylinder        .008           .008
Latency - platter to sector        .004           .004
Data transfer - Page               .0005          .016
Total                              .0125          .028
                                   (1 page)       (32 pages)

For 32 pages for both:             .40 seconds    .028 seconds
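The two access-time tables can be reproduced with a few lines of arithmetic. The sketch below assumes that random I/O pays seek, latency and transfer for every page, while sequential prefetch pays seek and latency once; the function names are illustrative.

```python
def random_io_time(seek, latency, transfer_per_page, pages):
    """Every page pays seek + latency + transfer."""
    return pages * (seek + latency + transfer_per_page)

def prefetch_time(seek, latency, transfer_per_page, pages):
    """One seek and one rotational delay, then consecutive page transfers."""
    return seek + latency + pages * transfer_per_page

fast = (0.010, 0.002, 0.0015)      # seek, latency, transfer (seconds)
textbook = (0.008, 0.004, 0.0005)

for name, disk in [("fast", fast), ("textbook", textbook)]:
    rio = random_io_time(*disk, 32)
    seq = prefetch_time(*disk, 32)
    print(name, round(rio, 3), round(seq, 3))
```

For 32 pages this gives about .43 vs. .060 seconds for the fast disk and .40 vs. .028 seconds for the textbook disk, matching the tables.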

Disk allocation

• Disk Resource Allocation for Databases (control DBA has)

• No standard SQL approach, but general way to deal with allocation

• Some OS allow specification of size of file and disk device

• contiguous sectors on disk - want close together as possible to minimize seek time

Tablespace

• Allocation medium for tables and indexes for ORACLE, DB2

• usually files (relations) cannot span disk devices

• can put >1 table in table space if accessed together

• corresponds to 1 or more OS files and can span disk devices

Query Language

• ORACLE DB's contain several tablespaces, including one called system - data description + indexes + user-defined tables

Create tablespace tspace1 datafile 'fname1', 'fname2';

• default tablespace given to each user

• if multiple tablespaces - better control over load balancing

• can take some disk space off-line

Extent

• extent - contiguous storage on disk

• when data segment or index segment first created, given an initial extent from tablespace, 10KB (5 pages)

• if need more space, given next contiguous extent

• can increase the size by a positive % (cannot decrease)

– initial n - size of initial extent

– next n - size of next extent

– max extents - maximum number of extents

– min extents - number of extents initially allocated

– pct increase n - % by which next extent grows over previous one
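A small sketch of how extent sizes grow under these parameters. It follows the description above literally (the extent after `next` grows by the given percentage each time) and ignores any rounding to whole pages that a real system would apply; the function name and the sample values are assumptions.

```python
def extent_sizes(initial_kb, next_kb, pct_increase, count):
    """Sizes of the first `count` extents (count >= 1), without page rounding."""
    sizes = [initial_kb]
    size = next_kb
    while len(sizes) < count:
        sizes.append(size)
        size = size * (1 + pct_increase / 100)  # each later extent grows by pct_increase %
    return sizes

sizes = extent_sizes(initial_kb=10, next_kb=10, pct_increase=50, count=4)
print(sizes)
```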

Create table

• Create table statement - can specify tablespace, no. of extents

• can override parameters for extent allocation

• pctfree - determines how much of each page is held back from inserts of new rows

• if 10%, inserts stop when page is 90% full

• pctused - where new inserts start again

• inserts resume if used space falls below this percentage of the total, default = 40%

• pctfree + pctused < 100
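The interplay of pctfree and pctused can be sketched as a small state machine for a single page. This is a simplified model under stated assumptions (a 2 KB page, byte counts only, no row directory overhead); the class and attribute names are illustrative.

```python
class Page:
    """Toy model of one page's insert eligibility under pctfree / pctused."""

    def __init__(self, size=2048, pctfree=10, pctused=40):
        assert pctfree + pctused < 100
        self.size, self.pctfree, self.pctused = size, pctfree, pctused
        self.used = 0
        self.accepting_inserts = True

    def insert(self, row_bytes):
        limit = self.size * (100 - self.pctfree) / 100
        if not self.accepting_inserts or self.used + row_bytes > limit:
            self.accepting_inserts = False  # page has reached its pctfree limit
            return False
        self.used += row_bytes
        return True

    def delete(self, row_bytes):
        self.used -= row_bytes
        if 100 * self.used / self.size < self.pctused:
            self.accepting_inserts = True   # fell below pctused: inserts resume

page = Page()
inserted = 0
while page.insert(100):
    inserted += 1
for _ in range(5):
    page.delete(100)
still_closed = not page.insert(100)  # about 63% used, still above pctused
for _ in range(5):
    page.delete(100)
reopened = page.insert(100)          # about 39% used, below pctused
print(inserted, still_closed, reopened)
```

The gap between the two thresholds keeps a page from flipping between "full" and "available" on every delete.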

Rows

• Row layout on each disk page (see figure)

• Row directory - page byte offset

• can have rows from multiple tables on same page (requires more info)

• in index, pointer to a row is its RID – page #, slot #

• RID can be retrieved in ORACLE but not DB2 (violates relational model rule)

– in ORACLE, rows can be split between pages (row record fragmentation)

– in DB2, entire row moved to new page, need forwarding pointer
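A minimal sketch of a page with a row directory, where a RID is a (page #, slot #) pair and the directory maps the slot to a byte offset. The layout is simplified (rows are appended front to back, and the directory is a Python list rather than bytes on the page); `RID` and `SlottedPage` are illustrative names.

```python
from typing import NamedTuple

class RID(NamedTuple):
    page_no: int
    slot_no: int

class SlottedPage:
    def __init__(self, page_no):
        self.page_no = page_no
        self.data = bytearray()
        self.directory = []  # slot number -> (byte offset, length)

    def insert(self, row: bytes) -> RID:
        self.directory.append((len(self.data), len(row)))
        self.data += row
        return RID(self.page_no, len(self.directory) - 1)

    def fetch(self, rid: RID) -> bytes:
        offset, length = self.directory[rid.slot_no]  # directory gives the byte offset
        return bytes(self.data[offset:offset + length])

page = SlottedPage(page_no=17)
rid = page.insert(b"Smith,Birmingham")
page.insert(b"Jones,Mobile")
print(rid, page.fetch(rid))
```

Because an index stores the slot number rather than the byte offset, a row can move within its page without any index entry changing.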

Binary Search

• Binary search on disk

– optimal for comparisons - not optimal for disk-based look-up

– must keep data in order

– may be reading values from same page at different times

• Instead use B+-tree index
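To see why binary search is not optimal for disk-based look-up, the sketch below records which page each probe touches. It assumes a sorted file in which a row's key equals its position, with 10 M rows at 20 rows per page (figures borrowed from the customer example later in these slides).

```python
def binary_search_page_reads(num_rows, rows_per_page, target):
    """Binary search over row positions; returns the page number of every probe."""
    lo, hi = 0, num_rows - 1
    pages = []
    while lo <= hi:
        mid = (lo + hi) // 2
        pages.append(mid // rows_per_page)  # page that holds the probed row
        if mid == target:
            break
        if mid < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return pages

probes = binary_search_page_reads(10_000_000, 20, target=0)
distinct_pages = len(set(probes))
print(len(probes), distinct_pages)
```

Only the last few probes land on a page already read; the earlier ones each touch a different page, so the look-up costs far more page reads than the 3 or so needed to descend a B+-tree.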

Indexing

• Keyed access retrieval method

• index is a sorted file - sorted by index key

• index entries: (index key, pointer)

• pointer is RID

• index resides on disk, partially memory resident when accessed

B+-tree

• Most commonly used index structure type in DBs today

• Based on B-tree

• Used to minimize disk I/O

• available in DB2 and ORACLE; ORACLE also has hash cluster; Ingres has heap structure, B-tree, ISAM (chain together new nodes)

[Figure: example B+-tree]

B+-tree

• leaf level pointers to data (RID)

• the remaining are directory (index) nodes that point to other index nodes

• assume number of entries in each index node fits on one page - one node is one page

• if tree with depth of 3, 3 I/Os to get pointer to data
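The depth of the tree, and therefore the number of I/Os to reach a data pointer, follows from the fan-out. A minimal sketch, assuming every node is one full page; the figures of 10 M entries and 1000 entries per node are illustrative values, not a statement about any particular system.

```python
def btree_levels(num_entries, entries_per_node):
    """Number of levels when every node (one page) is completely full."""
    levels = 1
    nodes = -(-num_entries // entries_per_node)  # leaf pages (ceiling division)
    while nodes > 1:
        nodes = -(-nodes // entries_per_node)    # nodes needed one level up
        levels += 1
    return levels

depth = btree_levels(10_000_000, 1000)
print(depth)
```

With these numbers there are 10,000 leaf pages, 10 pages above them, and a single root, so 3 I/Os reach the RID.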

B+-tree

• B+-tree structured to get most out of every disk page read

• Read in index node, can make multiple probes to same page if remains in memory

• likely, since upper-level nodes of actively used B+-trees are accessed frequently

• search for leftmost index entry Si such that

X <= Si
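Within a node that is already in memory, finding the leftmost entry Si with X <= Si is a binary search over the sorted entries. A sketch using Python's standard `bisect` module; the node contents are made-up values.

```python
from bisect import bisect_left

def find_entry(separators, x):
    """Index of the leftmost S_i with x <= S_i, or None if x exceeds them all."""
    i = bisect_left(separators, x)
    return i if i < len(separators) else None

node = [10, 20, 30, 40]
print(find_entry(node, 20), find_entry(node, 21), find_entry(node, 41))
```

The probes inside one node cost no extra I/O, which is why the comparison count matters much less than the number of pages read.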

B+-tree

• Index has a directory structure that allows retrieval of a range of values efficiently

• Index entries always placed in sequence by value - can use sequential prefetch on index

• Index entries shorter than data rows and require proportionately less I/O

B+-tree

• Balancing of B+-trees - insert, delete

• nodes usually not full

• utilities to reorganize to lower disk I/O

• most systems allow nodes to become depopulated - no automatic algorithm to rebalance

• average node below root level 71% full in active growing B+-trees

Duplicate key values

• Duplicate key values in index

• leaf nodes have sibling pointers

• but a delete of a row that has a heavily duplicated key entails a long search through the leaf-level of the B+-tree

• Index compression - with multiple duplicates

| header info | PrX keyval RID RID ... RID | PrX keyval RID ... RID |

where PrX is count of RID values
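The compressed leaf layout can be sketched by grouping sorted (keyval, RID) entries so each key value is stored once, preceded by its RID count. The sample entries and the function name are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def compress_leaf(entries):
    """Collapse sorted (keyval, rid) pairs into (count, keyval, [rids]) blocks."""
    blocks = []
    for keyval, group in groupby(entries, key=itemgetter(0)):
        rids = [rid for _, rid in group]
        blocks.append((len(rids), keyval, rids))  # count plays the role of PrX
    return blocks

entries = [
    ("Birmingham", (3, 1)),
    ("Birmingham", (3, 7)),
    ("Birmingham", (9, 2)),
    ("Mobile", (4, 0)),
]
blocks = compress_leaf(entries)
print(blocks)
```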

Create Index

Options:

• multiple columns

• tablespace

• storage - initial extents, etc.

• percent free - default = 10, % of each page left unfilled

• free page - 1 free page for every n index pages

Can control % of B+-tree node pages left unfilled when index created; refers to initial creation only

Clustering

• Placing rows on disk in order by some common index key value (remember the index itself is always sorted)

• clustered (clustering) index - index with rows in the same order as the key values

• efficiency advantage - read in a page, get all of the rows with the same value

• clustering is useful for range queries, e.g. between keyval1 and keyval2

Clustering

• can only cluster table by 1 clustering index at a time

• In DB2

– if the table is empty, rows sorted as placed on disk

– subsequent insertions not clustered, must use REORG

Indexes vs. table scan

• To illustrate the difference between table scan, secondary index (non clustered) and clustered index

Assume 10 M customers, 200 cities, 2KB/page, row = 100 bytes, 20 rows/page

Select *
From Customers
Where city = 'Birmingham'

If assume selectivity = 1/200: 1/200 * 10M = 50,000 customers in a city

Table Scan

Table Scan - read entire table

10,000,000/20 = 500,000 pages

If use prefetch?

500000/32 * .? =
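The slide leaves the per-I/O time open. As one possible way to fill it in, the sketch below plugs in the textbook figures from the access-time tables (.0125 s per random page read, .028 s per 32-page prefetch); these timing choices are assumptions, and other disks will give different totals.

```python
ROWS = 10_000_000
ROWS_PER_PAGE = 20
RANDOM_IO_S = 0.0125    # assumed: one random page read
PREFETCH_32_S = 0.028   # assumed: one 32-page sequential prefetch

table_pages = ROWS // ROWS_PER_PAGE                   # 500,000 pages
scan_random_s = table_pages * RANDOM_IO_S             # reading one page at a time
scan_prefetch_s = (table_pages / 32) * PREFETCH_32_S  # reading 32 pages per request

print(table_pages, round(scan_random_s, 1), round(scan_prefetch_s, 1))
```

Under these assumptions the scan drops from about 6,250 seconds to about 437.5 seconds when prefetch is used.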

Secondary Index

Secondary Index–

• In the worst case 1 entry for B'ham per page

• 50,000 pages (10M/200)

• 3 upper nodes of the tree

• Assume 1000 index entries per leaf node, read 50000/1000 = 50 index pages

• (3 + 50 + 50,000)*?=

Clustering Index

• Clustering Index –

• All entries for B'ham clustered on same pages

• 50,000/20 = 2500 pages (with 20 rows per page)

• (3 + 50 + 2500)*?=
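The secondary-index and clustering-index page counts can be compared directly. As before the per-I/O time is left open on the slides; the sketch assumes .0125 s per random page read (the textbook access-time figure) for every page, which is an assumption made only to get concrete numbers.

```python
RANDOM_IO_S = 0.0125                        # assumed: one random page read

matching_rows = 10_000_000 // 200           # 50,000 customers in the city
upper_nodes = 3                             # index nodes above the leaf entries
leaf_pages = matching_rows // 1000          # 50 leaf pages at 1000 entries each

secondary_data_pages = matching_rows        # worst case: one matching row per page
clustered_data_pages = matching_rows // 20  # 2,500 pages at 20 rows per page

secondary_s = (upper_nodes + leaf_pages + secondary_data_pages) * RANDOM_IO_S
clustered_s = (upper_nodes + leaf_pages + clustered_data_pages) * RANDOM_IO_S

print(round(secondary_s, 1), round(clustered_s, 1))
```

Under these assumptions the secondary index needs roughly 626 seconds and the clustering index roughly 32 seconds, a factor of about 20 that comes almost entirely from the data pages.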

% Free

• Redo the previous calculations assuming relations created with 50% free option specified.

Multiple Indexes

• More than one index on a relation – e.g. class - one index, gender - one index

Composite Index

• One index based on more than one attribute

Create Index index_name on Table (col1, col2, ... coln)

• Composite index entry - values for each attribute

• e.g. for class, gender the entry in index is: C1, C2, RID

• What would B+ tree look like?
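One way to picture the leaf level of a composite index is as a list of (class, gender, RID) entries sorted by class first and gender second. The sketch below uses made-up entries and a hypothetical helper `lookup_prefix`; it rebuilds the key list on every call for brevity, which a real index would not do.

```python
from bisect import bisect_left, bisect_right

# (class, gender, RID) entries kept in composite-key order
index = sorted([
    ("senior", "M", (5, 1)),
    ("junior", "F", (2, 4)),
    ("senior", "F", (7, 0)),
    ("junior", "M", (3, 2)),
    ("senior", "F", (1, 6)),
])

def lookup_prefix(index, *prefix):
    """RIDs of all entries whose leading columns equal `prefix`."""
    keys = [entry[:len(prefix)] for entry in index]
    lo = bisect_left(keys, prefix)
    hi = bisect_right(keys, prefix)
    return [entry[-1] for entry in index[lo:hi]]

print(lookup_prefix(index, "senior"))
print(lookup_prefix(index, "senior", "F"))
```

Entries matching a leading-column prefix are contiguous, so a search on class alone or on class and gender can use the index order; a search on gender alone cannot, because those entries are scattered across every class.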

Creating Indexes

When determining what indexes to create consider:

• workload - mix of queries and frequencies of requests

• e.g. 20% of requests are updates, etc.

Can create lots of indexes but:

• cost to create

• cost of insertions

• initial load time high if a large table

• index entries can become longer and longer as multiple columns included
