select operation strategies and indexing (chapter 8)
![Page 1: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/1.jpg)
Select Operation Strategies
And Indexing
(Chapter 8)
![Page 2: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/2.jpg)
Disk access
• DBs traditionally stored on disk
• Cheaper to store on disk than in memory
• Seek time, latency, data transfer time
• disk access is page oriented
• 2 - 4 KB page size
![Page 3: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/3.jpg)
Access time
• Time to randomly access a page: 12-20 ms, i.e. 50-83 I/Os per second
• large disparity between disk access and memory access (10-200 ns)
• hash disk page address and look in lookaside table to see if page in memory buffer
• In memory DBs the future?
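The lookaside check above can be sketched as a hash table keyed by disk page address. This is only an illustration, not how any particular DBMS implements its buffer manager: the `BufferPool` class, the `read_page_from_disk` callback, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool: a hash table maps disk page addresses to buffered pages."""

    def __init__(self, capacity, read_page_from_disk):
        self.capacity = capacity                        # pages that fit in memory
        self.read_page_from_disk = read_page_from_disk  # hypothetical disk reader
        self.pages = OrderedDict()                      # lookaside table: address -> contents
        self.disk_reads = 0

    def get_page(self, page_addr):
        if page_addr in self.pages:                     # hash lookup: page already buffered
            self.pages.move_to_end(page_addr)           # mark as most recently used
            return self.pages[page_addr]
        self.disk_reads += 1                            # miss: pay for a disk I/O
        data = self.read_page_from_disk(page_addr)
        if len(self.pages) >= self.capacity:            # evict least recently used (assumed policy)
            self.pages.popitem(last=False)
        self.pages[page_addr] = data
        return data

pool = BufferPool(capacity=2, read_page_from_disk=lambda addr: f"page-{addr}")
for addr in [7, 7, 9, 7, 11, 9]:
    pool.get_page(addr)
print(pool.disk_reads)  # 4 disk reads for 6 page requests
```

Each hit costs a memory access instead of a 12-20 ms disk access, which is why the lookup is worth doing on every page request.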
![Page 4: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/4.jpg)
Table scan
• Linear search - all data rows read in
 – I/O parallelism can be used
  • multiple I/O read requests satisfied at the same time
  • stripe the data across different disks
 – Problems with parallelism?
  • must balance disk arm load to gain maximum parallelism
  • requires the same total number of random I/O's, but using devices for a shorter time
![Page 5: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/5.jpg)
Sequential prefetch I/O
• retrieve one disk page after another (on same track) - typically 32
• seek time no longer a problem
• must know in advance to read 32 successive pages
• speed up of I/O by a factor of 10 (500 I/O's per second vs. 70)
![Page 6: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/6.jpg)
Access time
• Seek time – 10-15ms
• Latency time – 2-5 ms
• Data transfer time – 10-200 ns
![Page 7: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/7.jpg)
Access time for fast I/O
| RIO | Seq. Prefetch | Component (times in seconds) |
|---|---|---|
| .010 | .010 | Seek - disk arm to cylinder |
| .002 | .002 | Latency - platter to sector |
| .0015 | .048 | Data transfer - Page |
| .0135 | .060 | 1 page vs. 32 pages |
| .43 seconds | .060 seconds | for 32 pages for both |
![Page 8: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/8.jpg)
Textbook access time
| RIO | Seq. Prefetch | Component (times in seconds) |
|---|---|---|
| .008 | .008 | Seek - disk arm to cylinder |
| .004 | .004 | Latency - platter to sector |
| .0005 | .016 | Data transfer - Page |
| .0125 | .028 | 1 page vs. 32 pages |
| .40 seconds | .028 seconds | for 32 pages for both |
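The arithmetic behind both access-time comparisons can be reproduced with a short sketch. The function name `access_times` is made up for illustration. The model it encodes is the one the figures imply: random I/O pays seek and latency for every page, while sequential prefetch pays them once for the whole 32-page run.

```python
def access_times(seek, latency, transfer_per_page, pages=32):
    """Return (random I/O total, sequential prefetch total) in seconds."""
    one_random_io = seek + latency + transfer_per_page
    random_total = pages * one_random_io                          # seek + latency per page
    prefetch_total = seek + latency + pages * transfer_per_page   # seek + latency once
    return random_total, prefetch_total

fast = access_times(seek=0.010, latency=0.002, transfer_per_page=0.0015)
textbook = access_times(seek=0.008, latency=0.004, transfer_per_page=0.0005)
print([round(t, 3) for t in fast])      # [0.432, 0.06]
print([round(t, 3) for t in textbook])  # [0.4, 0.028]
```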
![Page 9: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/9.jpg)
Disk allocation
• Disk Resource Allocation for Databases (control DBA has)
• No standard SQL approach, but general way to deal with allocation
• Some OS allow specification of size of file and disk device
• contiguous sectors on disk - want close together as possible to minimize seek time
![Page 10: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/10.jpg)
Tablespace
• Allocation medium for tables and indexes for ORACLE, DB2
• usually files (relations) cannot span disk devices
• can put >1 table in table space if accessed together
• corresponds to 1 or more OS files and can span disk devices
![Page 11: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/11.jpg)
Query Language
• ORACLE DB's contain several tablespaces, including one called system - data description + indexes + user-defined tables
  Create tablespace tspace1 datafile 'fname1', 'fname2';
• default tablespace given to each user
• if multiple tablespaces - better control over load balancing
• can take some disk space off-line
![Page 12: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/12.jpg)
Extent
• extent - contiguous storage on disk
• when data segment or index segment first created, given an initial extent from tablespace, 10KB (5 pages)
• if need more space given next contiguous extent
• can increase the size by a positive % (cannot decrease)
 – initial n - size of initial extent
 – next n - size of next extent
 – max extents - maximum number of extents
 – min extents - number of extents initially allocated
 – pct increase n - % by which next extent grows over previous one
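How the extent parameters interact can be sketched with a small calculation. This is a simplified model under stated assumptions: the helper name `extent_sizes` is hypothetical, the first extent uses the initial size, the second uses the next size, and each later extent grows by pct increase. It ignores any rounding to whole pages that a real system may apply.

```python
def extent_sizes(initial_kb, next_kb, pct_increase, count):
    """Sizes (KB) of the first `count` extents under the simplified growth rule."""
    sizes = [initial_kb]
    size = next_kb
    while len(sizes) < count:
        sizes.append(size)
        size = size * (1 + pct_increase / 100)   # later extents grow by pct increase
    return sizes

print(extent_sizes(initial_kb=10, next_kb=10, pct_increase=50, count=4))
# [10, 10, 15.0, 22.5]
```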
![Page 13: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/13.jpg)
Create table
• Create table statement - can specify tablespace, no. of extents
• can override parameters for extent allocation
• pctfree - determines how much of each page is kept free, limiting the space that can be used for inserts of new rows
 – if 10%, inserts stop when page is 90% full
• pctused - where new inserts start again
 – if fall below certain percentage of total, default = 40%
• pctfree + pctused < 100
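The interplay of pctfree and pctused can be sketched as a small state machine. The `Page` class below is a toy model written for this illustration, not ORACLE's actual free-list logic, and it measures row sizes as a percentage of the page for simplicity.

```python
class Page:
    """Toy page that tracks its fill level as a percentage of capacity."""

    def __init__(self, pctfree=10, pctused=40):
        assert pctfree + pctused < 100
        self.pctfree, self.pctused = pctfree, pctused
        self.fill = 0                  # percent of the page currently occupied
        self.accepts_inserts = True

    def insert(self, row_pct):
        if not self.accepts_inserts:
            return False
        if self.fill + row_pct > 100 - self.pctfree:   # would eat into reserved free space
            self.accepts_inserts = False
            return False
        self.fill += row_pct
        return True

    def delete(self, row_pct):
        self.fill = max(0, self.fill - row_pct)
        if self.fill < self.pctused:                   # below pctused: inserts start again
            self.accepts_inserts = True

page = Page()
inserted = sum(page.insert(10) for _ in range(12))
print(inserted, page.fill)       # 9 90: inserts stop at 90% full
page.delete(10)
print(page.insert(10))           # False: 80% is still above pctused
page.delete(50)
print(page.insert(10))           # True: fill dropped to 30%, below pctused
```

The gap between the two thresholds keeps a page from flipping between open and closed on every insert and delete.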
![Page 14: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/14.jpg)
Rows
• Row layout on each disk page (see figure)
• Row directory - page byte offset
• can have rows from multiple tables on same page, more info
• in index, pointer to row is the RID (row ID)
 – page #, slot #
• RID can be retrieved in ORACLE but not DB2 (violates relational model rule)
 – in ORACLE, rows can be split between pages (row record fragmentation)
 – in DB2, entire row moved to new page, need forwarding pointer
![Page 15: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/15.jpg)
Binary Search
• Binary search on disk
 – optimal for comparisons - not optimal for disk-based look-up
 – must keep data in order
 – may be reading values from same page at different times
• Instead use B+-tree index
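A rough count of page reads shows why the B+-tree wins on disk. The figures used here (500,000 data pages, 10 million index entries, about 1000 entries per index node) are assumptions borrowed from the worked example later in these slides. Both function names are made up for this sketch.

```python
def binary_search_page_reads(num_pages):
    """Worst-case page reads when binary searching a sorted file of pages."""
    return num_pages.bit_length()        # floor(log2(n)) + 1 probes for n >= 1

def btree_levels(num_entries, fanout):
    """Levels needed so that fanout ** levels covers all entries (one read per level)."""
    levels, capacity = 1, fanout
    while capacity < num_entries:
        capacity *= fanout
        levels += 1
    return levels

print(binary_search_page_reads(500_000))   # 19 page reads
print(btree_levels(10_000_000, 1000))      # 3 page reads
```

Binary search halves the range with each page read, while each B+-tree node read narrows the search by a factor equal to the node's fanout.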
![Page 16: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/16.jpg)
Indexing
• Keyed access retrieval method
• index is a sorted file - sorted by index key
• index entries: index key, pointer (RID)
• pointer is RID
• index resides on disk, partially memory resident when accessed
![Page 17: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/17.jpg)
B+-tree
• Most commonly used index structure type in DBs today
• Based on B-tree
• Used to minimize disk I/O
• available in DB2; ORACLE also has hash cluster; Ingres has heap structure, B-tree, ISAM (chain together new nodes)
Example
![Page 18: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/18.jpg)
B+-tree
• leaf level pointers to data (RID)
• the remaining are directory (index) nodes that point to other index nodes
• assume number of entries in each index node fits on one page - one node is one page
• if tree with depth of 3, 3 I/Os to get pointer to data
![Page 19: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/19.jpg)
B+-tree
• B+-tree structured to get most out of every disk page read
• Read in index node, can make multiple probes to same page if remains in memory
• likely since frequent access to upper-level nodes of actively used B+-trees
• search for leftmost index entry Si such that X <= Si
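The search rule within one node, finding the leftmost entry Si with X <= Si, is a binary search over the node's sorted keys once the page is in memory. A minimal sketch using Python's standard `bisect` module follows. The helper name `find_entry` and the sample keys are illustrative only.

```python
from bisect import bisect_left

def find_entry(separators, x):
    """Index of the leftmost separator Si with x <= Si, or None if x exceeds them all."""
    i = bisect_left(separators, x)       # first position whose key is >= x
    return i if i < len(separators) else None

node = [10, 20, 30, 40]                  # sorted keys in one index node
print(find_entry(node, 20))              # 1  (S1 = 20)
print(find_entry(node, 21))              # 2  (S2 = 30)
print(find_entry(node, 45))              # None
```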
![Page 20: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/20.jpg)
B+-tree
• Index has a directory structure that allows retrieval of a range of values efficiently
• Index entries always placed in sequence by value - can use sequential prefetch on index
• Index entries shorter than data rows and require proportionately less I/O
![Page 21: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/21.jpg)
B+-tree
• Balancing of B+-trees - insert, delete
• nodes usually not full
• utilities to reorganize to lower disk I/O
• most systems allow nodes to become depopulated - no automatic algorithm to balance
• average node below root level 71% full in active growing B+-trees
![Page 22: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/22.jpg)
Duplicate key values
• Duplicate key values in index
• leaf nodes have sibling pointers
• but a delete of a row that has a heavily duplicated key entails a long search through the leaf-level of the B+-tree
• Index compression - with multiple duplicates
  | header info | PrX keyval RID RID ... RID | PrX keyval RID…RID|
  where PrX is count of RID values
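The compressed leaf layout can be sketched by grouping sorted (keyval, RID) entries so that each key value is stored once, preceded by its RID count. The function name `compress_leaf` and the sample entries are made up for illustration. RIDs are shown as (page #, slot #) pairs.

```python
from itertools import groupby

def compress_leaf(entries):
    """Collapse sorted (keyval, RID) pairs into (PrX, keyval, [RIDs]) blocks."""
    blocks = []
    for keyval, group in groupby(entries, key=lambda e: e[0]):
        rids = [rid for _, rid in group]
        blocks.append((len(rids), keyval, rids))   # PrX = count of RID values
    return blocks

entries = [("Boston", (4, 1)), ("Boston", (9, 3)), ("Boston", (12, 0)), ("Denver", (2, 5))]
print(compress_leaf(entries))
# [(3, 'Boston', [(4, 1), (9, 3), (12, 0)]), (1, 'Denver', [(2, 5)])]
```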
![Page 23: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/23.jpg)
Create Index
Options:
• multiple columns
• tablespace
• storage - initial extents, etc.
• percent free, default = 10 - % of each page left unfilled
• free page (1 free page for every n index pages)
Can control % of B+-tree node pages left unfilled when index created, refers to initial creation
![Page 24: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/24.jpg)
Clustering
• Placing rows on disk in order by some common index key value (remember the index itself is always sorted)
• clustered (clustering) index - index with rows in the same order as the key values
• efficiency advantage - read in a page, get all of the rows with the same value
• clustering is useful for range queries, e.g. between keyval1 and keyval2
![Page 25: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/25.jpg)
Clustering
• can only cluster table by 1 clustering index at a time
• In DB2
 – if the table is empty, rows sorted as placed on disk
 – subsequent insertions not clustered, must use REORG
![Page 26: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/26.jpg)
Indexes vs. table scan
• To illustrate the difference between table scan, secondary index (non clustered) and clustered index
Assume 10 M customers, 200 cities
2KB/page, row = 100 bytes, 20 rows/page
  Select *
  From Customers
  Where city = 'Birmingham'
If assume selectivity = 1/200: 1/200 * 10M = 50,000 customers in a city
![Page 27: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/27.jpg)
Table Scan
Table Scan - read entire table
10,000,000/20 = 500,000 pages
If use prefetch?
500000/32 * .? =
![Page 28: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/28.jpg)
Secondary Index
Secondary Index
• In the worst case 1 entry for B'ham per page
• 50,000 pages (10M/200)
• 3 upper nodes of the tree
• Assume 1000 index entries per leaf node, read 50000/1000 index pages
• (3 + 50 + 50,000)*?=
![Page 29: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/29.jpg)
Clustering Index
• Clustering Index –
• All entries for B'ham clustered on same pages
• 50,000/20 = 2500 pages (with 20 rows per page)
• (3 + 50 + 2500)*?=
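The three access strategies can be compared with a short calculation. The slides leave the per-I/O time open, so the timings below are assumptions, taken from the "Textbook access time" figures: 0.0125 s per random page read and 0.028 s per 32-page sequential prefetch. The sketch also assumes every index and data page read through an index is a random I/O.

```python
ROWS = 10_000_000
ROWS_PER_PAGE = 20
MATCHING_ROWS = ROWS // 200            # selectivity 1/200 -> 50,000 rows
ENTRIES_PER_LEAF = 1000
UPPER_NODES = 3

RANDOM_IO = 0.0125                     # assumed seconds per random page read
PREFETCH_32 = 0.028                    # assumed seconds per 32-page prefetch

table_pages = ROWS // ROWS_PER_PAGE                  # 500,000 pages
leaf_pages = MATCHING_ROWS // ENTRIES_PER_LEAF       # 50 index leaf pages
clustered_pages = MATCHING_ROWS // ROWS_PER_PAGE     # 2,500 data pages

table_scan = table_pages / 32 * PREFETCH_32
secondary = (UPPER_NODES + leaf_pages + MATCHING_ROWS) * RANDOM_IO
clustered = (UPPER_NODES + leaf_pages + clustered_pages) * RANDOM_IO

print(f"table scan with prefetch: {table_scan:.1f} s")   # 437.5 s
print(f"secondary index:          {secondary:.1f} s")    # 625.7 s
print(f"clustering index:         {clustered:.1f} s")    # 31.9 s
```

Under these assumed timings the worst-case secondary index is slower than a prefetching table scan, while the clustering index is more than ten times faster than either.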
![Page 30: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/30.jpg)
% Free
• Redo the previous calculations assuming relations created with 50% free option specified.
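One way to approach this exercise is to recompute the page counts with half-full pages. The sketch below rests on assumptions the slide does not state: that the free percentage applies to both data pages and index leaf pages, and that rows stay at the load-time fill level. The function name `page_counts` is hypothetical.

```python
def ceil_div(a, b):
    return -(-a // b)

def page_counts(pct_free, rows=10_000_000, rows_per_full_page=20,
                entries_per_full_leaf=1000, cities=200):
    """Pages touched by each strategy when pages are loaded with pct_free % left empty."""
    matching = rows // cities
    rows_per_page = rows_per_full_page * (100 - pct_free) // 100
    entries_per_leaf = entries_per_full_leaf * (100 - pct_free) // 100
    return {
        "table_scan_pages": ceil_div(rows, rows_per_page),
        "index_leaf_pages": ceil_div(matching, entries_per_leaf),
        "secondary_data_pages": matching,              # worst case: one page per row
        "clustered_data_pages": ceil_div(matching, rows_per_page),
    }

print(page_counts(pct_free=0))
print(page_counts(pct_free=50))
```

With 50% free, the table scan and the clustered read touch twice as many pages, while the worst-case secondary index still reads one data page per matching row.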
![Page 31: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/31.jpg)
Multiple Indexes
• More than one index on a relation
 – e.g. class - one index, gender - one index
![Page 32: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/32.jpg)
Composite Index
• One index based on more than one attribute
  Create Index index_name on Table (col1, col2,... coln)
• Composite index entry - values for each attribute
 – e.g. class, gender - entry in index is: C1, C2, RID
• What would B+ tree look like?
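As a hint toward the question above, the leaf level of a composite index can be modelled as entries sorted by (class, gender). The sketch below uses a sorted Python list in place of a real B+-tree. The sample data and the `lookup` helper are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Composite index entries (class, gender, RID), kept sorted like B+-tree leaf entries.
index = sorted([
    ("junior", "F", 17), ("senior", "M", 4), ("junior", "M", 8),
    ("senior", "F", 21), ("junior", "F", 3),
])

def lookup(index, cls, gender=None):
    """RIDs matching class, optionally narrowed by gender (a leading-column prefix search)."""
    prefix = (cls,) if gender is None else (cls, gender)
    keys = [entry[:len(prefix)] for entry in index]    # leading columns only
    lo, hi = bisect_left(keys, prefix), bisect_right(keys, prefix)
    return [rid for *_, rid in index[lo:hi]]

print(lookup(index, "junior"))         # [3, 17, 8]
print(lookup(index, "junior", "F"))    # [3, 17]
```

Because entries are ordered by class first, a search on class alone or on class plus gender finds one contiguous run of entries. A search on gender alone cannot use the sort order, because matching entries are scattered across the classes.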
![Page 33: Select Operation Strategies And Indexing (Chapter 8)](https://reader035.vdocuments.mx/reader035/viewer/2022062808/5681525d550346895dc09100/html5/thumbnails/33.jpg)
Creating Indexes
When determining what indexes to create consider:
• workload - mix of queries and frequencies of requests
 – 20% of requests are updates, etc.
• can create lots of indexes but:
 – cost to create
 – insertions
 – initial load time high if a large table
 – index entries can become longer and longer as multiple columns included