Chapter 6
External Memory Structures
Acknowledgements
• Disk storage, Wikipedia. https://en.wikipedia.org/wiki/Disk_storage
• B-tree, Wikipedia. https://en.wikipedia.org/wiki/B-tree
• R-tree, Wikipedia. https://en.wikipedia.org/wiki/R-tree
• Multimedia Databases and Data Mining. Primary key indexing – B-trees. Christos Faloutsos – CMU
• Spatial Access Methods, Chapter 26 of book. Dr Eamonn Keogh, Computer Science & Engineering Department, University of California – Riverside, Riverside, CA 92521
Chapter Outline
• External Disk Storage
• Working with External Data
• B-tree
• B+-tree
• R-tree
External Storage
Computer Architecture
Types of External Memory
• Magnetic Tape
• Optical
  • CD (Compact Disc)
  • CD-ROM
  • CD-R
  • CD-RW
  • VCD (Video Compact Disc)
  • DVD (Digital Video/Versatile Disc)
• Magnetic Disk
  • RAID
Magnetic Disk
• Metal or plastic disk coated, on one or both sides, with magnetizable material
• Data read and written through a magnetic head (coil) by means of induction
Disk Data Layout
Data Organization and Formatting
• Concentric rings or tracks
  • Gaps between tracks; reduce gap to increase capacity
• Same number of bits per track (variable density)
• Constant angular velocity
• Tracks divided into sectors
• Data read/written in blocks
  • Minimum block size is one sector
  • May have more than one sector per block
Finding Sectors
• Must be able to identify start of track and sector
• Format disk
  • Additional information not available to user
  • Marks tracks and sectors
Multiple Platters
• One head per side
• Heads are joined and aligned
• Aligned tracks on each platter form cylinders
• Data is striped by cylinder
  • reduces head movement
  • increases speed (transfer rate)
Speed
• Seek time
  • Moving head to the right track
• (Rotational) latency
  • Waiting for data to rotate under head
• Access time = Seek + Latency
• Transfer rate: speed of copying bytes from disk
• Total time = Access time + Transfer time
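A rough sketch of how these components add up when reading one block. The drive parameters (7200 rpm, 9 ms average seek, 100 MB/s transfer rate, 4 KB block) are illustrative assumptions, not figures from the slides, and the average rotational latency is assumed to be half a revolution.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency, assumed to be half a revolution (in ms)."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def total_time_ms(seek_ms: float, rpm: float,
                  block_bytes: int, transfer_bytes_per_s: float) -> float:
    """Total time = access time (seek + latency) + transfer time, in ms."""
    access_ms = seek_ms + rotational_latency_ms(rpm)
    transfer_ms = 1000.0 * block_bytes / transfer_bytes_per_s
    return access_ms + transfer_ms


# Hypothetical drive: 7200 rpm, 9 ms average seek, 100 MB/s, one 4 KB block.
print(round(rotational_latency_ms(7200), 2))             # about 4.17 ms
print(round(total_time_ms(9.0, 7200, 4096, 100e6), 2))   # about 13.21 ms
```

Under these assumed numbers the access time dominates and the transfer itself is negligible, which is why the index structures later in the chapter try to minimise the number of disk accesses.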
Look at External Storage
• A file partitioned into blocks of records
RAID
• Redundant Arrays of Independent Disks
B-tree
B-trees
E.g., B-tree of order 3:
[Figure: tree T0. The root holds keys 6 and 9; its three children are the leaves (1, 3) for keys < 6, (7) for keys > 6 and < 9, and (13) for keys > 9.]
B-tree properties:
• each node, in a B-tree of order n:
  • Key order
  • at most n pointers
  • at least ⌈n/2⌉ pointers (except root)
  • all leaves at the same level
  • if number of pointers is k, then node has exactly k-1 keys
  • (leaves are empty)
[Figure: node layout with keys v1, v2, …, vn-1 and pointers p1, …, pn]
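Queries
• Algorithm for exact match query? (e.g., ssn=8?)
[Figure: the search for 8 on tree T0 (root keys 6 and 9; leaves (1, 3), (7), (13)) follows a single path from the root down to a leaf.]
H steps (= disk accesses), H being the height of the tree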
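A minimal sketch of the exact match search, assuming an in-memory `Node` class (hypothetical, not from the slides) in which a node without children plays the role of a leaf. Each visited node stands for one disk access.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def search(root: Node, key: int) -> Tuple[bool, int]:
    """Return (found, nodes_visited); one visited node models one disk access."""
    node, visited = root, 0
    while True:
        visited += 1
        i = bisect_left(node.keys, key)      # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        if not node.children:                # reached a leaf without a match
            return False, visited
        node = node.children[i]              # descend into the i-th subtree


# Tree T0 from the slides: root (6, 9) over leaves (1, 3), (7), (13).
t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(search(t0, 8))   # (False, 2): 8 is not stored in T0
print(search(t0, 7))   # (True, 2)
```

The number of visited nodes never exceeds the number of levels, which is the "H steps" bound.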
Queries
• what about range queries? (e.g., 5<salary<8)
• Proximity/nearest neighbor searches? (e.g., salary ~ 8)
[Figure: tree T0. Root keys 6 and 9; leaves (1, 3), (7), (13).]
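One possible answer to the range query question, sketched as a pruned in-order traversal. The `Node` class and the function name are assumptions made for illustration; the slides only pose the question.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def range_query(node: Node, lo: int, hi: int) -> Iterator[int]:
    """Yield, in ascending order, every key k with lo < k < hi."""
    for i, key in enumerate(node.keys):
        # Subtree i only holds keys below `key`; skip it if they are all <= lo.
        if node.children and key > lo:
            yield from range_query(node.children[i], lo, hi)
        if lo < key < hi:
            yield key
        if key >= hi:
            return                      # everything further right is >= hi
    if node.children:
        yield from range_query(node.children[-1], lo, hi)


t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
print(list(range_query(t0, 5, 8)))      # [6, 7]
```

Note that the answer is spread over several levels of the tree (6 sits in the root, 7 in a leaf), so the traversal has to move up and down between nodes.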
B-trees: Insertion
• Insert in leaf; on overflow, push middle up (recursively)
• split: preserves B-tree properties
B-trees: Insertion
Easy case: Tree T0; insert ‘8’
[Figure: T0 before the insert. Root keys 6 and 9; leaves (1, 3), (7), (13).]
[Figure: T0 after the insert. 8 joins the middle leaf, giving leaves (1, 3), (7, 8), (13); no overflow.]
B-trees: Insertion
Hardest case: Tree T0; insert ‘2’
[Figure, step 1: 2 belongs in the leaf (1, 3), which then holds 1, 2, 3 and overflows.]
[Figure, step 2: push middle up. The leaf splits into (1) and (3) and its middle key 2 moves to the root, which then holds 2, 6, 9.]
[Figure, step 3: Ovf; push middle. The root overflows in turn and its middle key 6 is pushed up.]
[Figure, final state: new root 6 with children (2) and (9); leaves (1), (3), (7), (13).]
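The "split and push middle up" rule can be sketched as a recursive insert. This is a minimal in-memory illustration, assuming distinct keys, order 3 and a hypothetical `Node` class; it is not the slides' own code.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

ORDER = 3  # at most ORDER pointers, hence at most ORDER - 1 keys per node


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def _insert(node: Node, key: int) -> Optional[Tuple[int, Node]]:
    """Insert key below node; on overflow return (middle key, new right node)."""
    i = bisect_left(node.keys, key)
    if not node.children:
        node.keys.insert(i, key)             # insert in leaf
    else:
        pushed = _insert(node.children[i], key)
        if pushed is not None:               # child split: absorb its middle key
            mid_key, right = pushed
            node.keys.insert(i, mid_key)
            node.children.insert(i + 1, right)
    if len(node.keys) <= ORDER - 1:
        return None                          # no overflow
    mid = len(node.keys) // 2                # overflow: split around the middle
    mid_key = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return mid_key, right


def insert(root: Node, key: int) -> Node:
    """Insert key and return the (possibly new) root."""
    pushed = _insert(root, key)
    if pushed is None:
        return root
    mid_key, right = pushed
    return Node([mid_key], [root, right])    # the tree grows by one level


t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
root = insert(t0, 2)
print(root.keys, [c.keys for c in root.children])   # [6] [[2], [9]]
```

The tree only grows at the root, which is what keeps all leaves at the same level.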
B-trees: Deletion
• Case1: delete a key at a leaf – no underflow
• Case2: delete non-leaf key – no underflow
• Case3: delete leaf-key; underflow, and ‘rich sibling’
• Case4: delete leaf-key; underflow, and ‘poor sibling’
B-trees: Deletion
Easiest case: Tree T0; delete ‘3’
[Figure: T0 before the delete. Root keys 6 and 9; leaves (1, 3), (7), (13).]
[Figure: T0 after the delete. Leaves (1), (7), (13); no underflow.]
15-826 Copyright: C. Faloutsos (2012)
B-trees: Deletion
• Case2: delete a key at a non-leaf – no underflow (e.g., delete 6 from T0)
[Figure: T0 before the delete. Root keys 6 and 9; leaves (1, 3), (7), (13).]
[Figure: delete & promote. 6 is removed from the root and 3 is promoted from the left leaf to take its place.]
[Figure: FINAL TREE. Root keys 3 and 9; leaves (1), (7), (13); the branches now read < 3, > 3 and < 9, > 9.]
B-trees: Deletion
• Case2: delete a key at a non-leaf – no underflow (e.g., delete 6 from T0)
• Q: How to promote?
• A: pick the largest key from the left sub-tree (or the smallest from the right sub-tree)
• Observation: every deletion eventually becomes a deletion of a leaf key
B-trees: Deletion
• Case3: underflow & ‘rich sibling’ (e.g., delete 7 from T0)
  • ‘rich’ = can give a key, without underflowing
  • ‘borrowing’ a key: THROUGH the PARENT!
[Figure: T0 before the delete. Root keys 6 and 9; leaves (1, 3), (7), (13).]
[Figure: after deleting 7 the middle leaf is empty (underflow); its left neighbour (1, 3) is the rich sibling.]
[Figure: NO!! Marks the wrong move: taking a key straight from the rich sibling instead of going through the parent.]
[Figure: delete & borrow, through the parent. 6 moves down from the root into the empty leaf and 3 moves up from the sibling into the root.]
[Figure: FINAL TREE. Root keys 3 and 9; leaves (1), (6), (13); the branches now read < 3, > 3 and < 9, > 9.]
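Borrowing through the parent amounts to a small rotation. The sketch below assumes a hypothetical in-memory `Node` class and only covers borrowing from the left sibling; the mirror case for a right sibling is analogous.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def borrow_from_left(parent: Node, i: int) -> None:
    """Refill underflowing child i from its left sibling, through the parent."""
    left, child = parent.children[i - 1], parent.children[i]
    child.keys.insert(0, parent.keys[i - 1])   # separator key moves down
    parent.keys[i - 1] = left.keys.pop()       # sibling's largest key moves up
    if left.children:                          # internal nodes: move the subtree too
        child.children.insert(0, left.children.pop())


t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
t0.children[1].keys.remove(7)      # delete 7: the middle leaf underflows
borrow_from_left(t0, 1)
print(t0.keys, [c.keys for c in t0.children])   # [3, 9] [[1], [6], [13]]
```

Moving 3 directly into the empty leaf would leave it to the right of the separator 6, breaking the key order; routing the keys through the parent avoids that.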
B-trees: Deletion
• Case4: underflow & ‘poor sibling’ (e.g., delete 13 from T0)
  • Merge, by pulling a key from the parent
  • exact reversal from insertion: ‘split and push up’, vs. ‘merge and pull down’
[Figure: T0 before the delete. Root keys 6 and 9; leaves (1, 3), (7), (13).]
[Figure: after deleting 13 the rightmost leaf is empty (underflow); its sibling (7) is poor, i.e. it cannot give a key without underflowing.]
[Figure: A: merge with the ‘poor’ sibling, pulling 9 down from the root.]
[Figure: FINAL TREE. Root key 6; leaves (1, 3) and (7, 9); the branches read < 6 and > 6.]
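"Merge and pull down" can be sketched in a few lines. The `Node` class and function name are hypothetical, and the sketch stops after one merge: in a deeper tree the parent may underflow in turn and would need the same treatment recursively.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def merge_with_left(parent: Node, i: int) -> None:
    """Merge underflowing child i into its left sibling, pulling a key down."""
    left, child = parent.children[i - 1], parent.children[i]
    left.keys.append(parent.keys.pop(i - 1))   # separator key is pulled down
    left.keys.extend(child.keys)
    left.children.extend(child.children)
    del parent.children[i]                     # the emptied node disappears


t0 = Node([6, 9], [Node([1, 3]), Node([7]), Node([13])])
t0.children[2].keys.remove(13)     # delete 13: the rightmost leaf underflows
merge_with_left(t0, 2)
print(t0.keys, [c.keys for c in t0.children])   # [6] [[1, 3], [7, 9]]
```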
B+-Tree
B+-trees: Motivation
If we want to store the whole record with the key –> problems (what?)
[Figure: tree T0. Root keys 6 and 9; leaves (1, 3), (7), (13).]
Solution: B+-trees
• They string all leaf nodes together
• AND
• replicate keys from non-leaf nodes, to make sure every key appears at the leaf level
B+ trees
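[Figure: B+-tree version of T0. Root keys 6 and 9; leaves (1, 3), (6, 7), (9, 13), strung together left to right; the branches read < 6, >= 6 and < 9, >= 9.]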
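With every key present at the leaf level and the leaves strung together, a range query becomes one descent plus a walk along the leaf chain. The sketch below assumes hypothetical `Inner` and `Leaf` classes and an inclusive range; it is an illustration, not the slides' implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterator, List, Optional, Union


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None        # link to the right neighbour


@dataclass
class Inner:
    keys: List[int]
    children: List[Union["Inner", Leaf]]


def find_leaf(node: Union[Inner, Leaf], key: int) -> Leaf:
    """Descend to the leaf that would hold key (keys >= separator go right)."""
    while isinstance(node, Inner):
        node = node.children[bisect_right(node.keys, key)]
    return node


def range_scan(root: Union[Inner, Leaf], lo: int, hi: int) -> Iterator[int]:
    """Yield keys k with lo <= k <= hi by walking the leaf chain."""
    leaf: Optional[Leaf] = find_leaf(root, lo)
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return
            if k >= lo:
                yield k
        leaf = leaf.next


# The B+-tree from the slide: leaves (1, 3) -> (6, 7) -> (9, 13).
l3 = Leaf([9, 13])
l2 = Leaf([6, 7], l3)
l1 = Leaf([1, 3], l2)
root = Inner([6, 9], [l1, l2, l3])
print(list(range_scan(root, 5, 8)))      # [6, 7]
```

Unlike the plain B-tree, the scan never has to climb back up to an internal node once it has reached the leaf level.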
R-Tree
Spatial Data
Name           ID  Type  Phone     Location  Grade
Marios Pizza   1   ITA   888-1212  244,365   D
Joes Bugers    2   US    848-1298  34,764    A
Tinas Mexican  3   MEX   878-1333  123,32    A
Sues Pasta     4   ITA   878-1342  876,65    B
• Given such a database we can easily answer queries by using SQL, such as
  • List all Mexican restaurants.
  • List all Grade A restaurants.
• However, classic databases do not allow queries such as
  • List all Mexican restaurants within five miles of UCR
  • List the pizza restaurant nearest to 91 and 60.
• These kinds of queries are called spatial queries
  • Nearest neighbor queries
  • Range queries
  • Spatial joins
Indexing Spatial Data
• So, we can always index 1-dimensional data (if you can sort it, you can index it), such that we can answer 1-nearest neighbor queries by accessing just O(log(n)) of the database, where n is the number of items in the database (i.e. the B-tree).
• But we cannot sort 2-dimensional data…
• Solution: R-Tree
  • introduced by Guttman at the 1984 SIGMOD conference.
R-Trees
• R-trees are an N-dimensional extension of B+-trees, useful for indexing sets of rectangles and other polygons.
• Supported in many modern database systems, along with variants like R+-trees and R*-trees.
• Basic idea: generalize the notion of a one-dimensional interval associated with each B+-tree node to an N-dimensional interval, that is, an N-dimensional rectangle.
• Will consider only the two-dimensional case (N = 2)
  • generalization for N > 2 is straightforward, although R-trees work well only for relatively small N
R-Trees
• A rectangular bounding box is associated with each tree node.
  • Bounding box of a leaf node is a minimum sized rectangle that contains all the rectangles/polygons associated with the leaf node.
  • The bounding box associated with a non-leaf node contains the bounding boxes associated with all its children.
  • Bounding box of a node serves as its key in its parent node (if any)
  • Bounding boxes of children of a node are allowed to overlap
• A polygon is stored only in one node, and the bounding box of the node must contain the polygon
  • The storage efficiency of R-trees is better than that of k-d trees or quadtrees since a polygon is stored only once
MBR
• Suppose we have a cluster of points in 2-D space...
• We can build a “box” around points. The smallest box (which is axis parallel) that contains all the points is called a Minimum Bounding Rectangle (MBR)
MBR = {(L.x, L.y), (U.x, U.y)}
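Computing an MBR only needs the coordinate-wise minima and maxima. A minimal sketch, with hypothetical type aliases; the sample points are the four locations that a later slide stores under leaf R1.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
MBR = Tuple[Point, Point]   # (lower-left corner L, upper-right corner U)


def mbr(points: Iterable[Point]) -> MBR:
    """Smallest axis-parallel rectangle containing all points (non-empty input)."""
    xs, ys = zip(*points)
    return (min(xs), min(ys)), (max(xs), max(ys))


print(mbr([(3, 4), (1, 3), (2, 3), (5, 4)]))   # ((1, 3), (5, 4))
```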
MINDIST
• The formula for the distance between a point and the closest possible point within an MBR
MBR = {(L.x, L.y), (U.x, U.y)}
Q = (x, y)
MINDIST(Q, MBR):
if L.x < x < U.x and L.y < y < U.y then 0
elseif L.x < x < U.x then min( (L.y - y)^2 , (U.y - y)^2 )
elseif ….
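One compact way to cover all the cases at once is to clamp the per-axis gap at zero. This is a sketch, not the slide's formula: it assumes Euclidean distance and returns the distance itself, whereas the case shown above works with squared differences; points on the boundary are treated as inside.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
MBR = Tuple[Point, Point]   # (lower-left corner L, upper-right corner U)


def mindist(q: Point, box: MBR) -> float:
    """Euclidean distance from q to the closest point of the rectangle."""
    (lx, ly), (ux, uy) = box
    x, y = q
    dx = max(lx - x, 0.0, x - ux)   # 0 when lx <= x <= ux
    dy = max(ly - y, 0.0, y - uy)   # 0 when ly <= y <= uy
    return math.hypot(dx, dy)


box = ((6, 1), (8, 4))              # hypothetical rectangle for illustration
print(mindist((7, 2), box))         # 0.0: the point lies inside the box
print(mindist((7, 9), box))         # 5.0: directly above the box
```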
MINDIST Example
[Figure: two point/MBR pairs. In the first the point lies outside the MBR and MINDIST(point, MBR) = 5; in the second MINDIST(point, MBR) = 0.]
MINDIST Example
• Suppose we have a query point Q and one known point R. Could any of the points in the MBR be closer to Q than R is?
MBR = {(6,1),(8,4)}
Q = (3,5)
R = (1,7)
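Working the numbers through, assuming Euclidean distance: Q lies to the left of and above the MBR, so the closest possible point in the MBR is its upper-left corner (6, 4).

```latex
\begin{align*}
d(Q, R) &= \sqrt{(3-1)^2 + (5-7)^2} = \sqrt{8} \approx 2.83 \\
\mathrm{MINDIST}(Q, \mathrm{MBR}) &= \sqrt{(6-3)^2 + (5-4)^2} = \sqrt{10} \approx 3.16
\end{align*}
```

Since MINDIST(Q, MBR) > d(Q, R), no point in the MBR can be closer to Q than R is, so the whole MBR can be skipped without looking inside it.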
Constructing MBR
• Each MBR can be represented with just two points: the lower left corner and the upper right corner.
• We can further recursively group MBRs into larger MBRs….
[Figure: MBRs R1 to R9 grouped into the larger MBRs R10, R11 and R12.]
Constructing R-Tree
• …these nested MBRs are organized as a tree (called a spatial access tree or a multidimensional tree).
[Figure: tree with R10, R11, R12 at the top level and R1 to R9 below them as data nodes containing points]
![Page 73: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/73.jpg)
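Constructing R-Tree
• At the leaf nodes we have the location and a pointer to the record in question
• At the internal nodes, we just have MBR information
Example (internal node R10 with leaf nodes R1, R2, R3; each leaf entry is a location followed by a record pointer):
R10: {(1,0),(7,9)}
R1: {(1,3),(5,4)}, entries (3,4) 77, (1,3) 88, (2,3) 22, (5,4) 13
R2: {(2,0),(7,9)}, entries (2,2) 47, (3,0) 86, (7,9) 52
R3: {(1,1),(7,8)}, entries (5,1) 32, (1,4) 45, (5,6) 27, (7,8) 73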
![Page 74: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/74.jpg)
Search in R-Tree
• To find data items (rectangles/polygons) intersecting (overlapping) a given query point/region, do the following, starting from the root node:
  • If the node is a leaf node, output the data items whose keys intersect the given query point/region.
  • Else, for each child of the current node whose bounding box overlaps the query point/region, recursively search the child.
• Can be very inefficient in the worst case, since multiple paths may need to be searched
  • but works acceptably in practice.
• Simple extensions of the search procedure handle the predicates contained-in and contains
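The recursive search can be sketched as follows. The `Node` class, its field names, and the representation of points as zero-area rectangles are assumptions made for this illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[Tuple[float, float], Tuple[float, float]]  # ((L.x, L.y), (U.x, U.y))


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-parallel rectangles share at least one point."""
    (alx, aly), (aux, auy) = a
    (blx, bly), (bux, buy) = b
    return alx <= bux and blx <= aux and aly <= buy and bly <= auy


@dataclass
class Node:  # hypothetical node layout
    mbr: Rect
    children: List["Node"] = field(default_factory=list)           # internal node
    entries: List[Tuple[Rect, int]] = field(default_factory=list)  # leaf: (key, record id)


def search(node: Node, query: Rect) -> List[int]:
    """Return record ids of all data items whose key overlaps the query."""
    if not node.children:  # leaf node
        return [rid for key, rid in node.entries if overlaps(key, query)]
    hits: List[int] = []
    for child in node.children:  # several children may qualify
        if overlaps(child.mbr, query):
            hits.extend(search(child, query))
    return hits


def pt(x, y):
    return ((x, y), (x, y))  # a point as a degenerate rectangle


r1 = Node(((1, 3), (5, 4)), entries=[(pt(3, 4), 77), (pt(1, 3), 88), (pt(2, 3), 22), (pt(5, 4), 13)])
r2 = Node(((2, 0), (7, 9)), entries=[(pt(2, 2), 47), (pt(3, 0), 86), (pt(7, 9), 52)])
root = Node(((1, 0), (7, 9)), children=[r1, r2])
print(search(root, ((2, 2), (3, 4))))  # [77, 22, 47]
```

Both leaves are visited for this query because both bounding boxes overlap the query region, which illustrates why several paths may need to be searched.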
![Page 75: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/75.jpg)
Insertion in R-Tree
• To insert a data item:
  • Find a leaf to store it, and add it to the leaf
    • To find the leaf, follow a child (if any) whose bounding box contains the bounding box of the data item, else the child whose overlap with the data item's bounding box is maximum
  • Handle overflows by splits (as in B+-trees)
    • The split procedure is different, though (see below)
  • Adjust bounding boxes starting from the leaf upwards
• Split procedure:
  • Goal: divide the entries of an overfull node into two sets such that the bounding boxes have minimum total area
    • This is a heuristic. Alternatives like minimum overlap are possible
  • Finding the “best” split is expensive, so use heuristics instead
    • See next slide
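The child-selection rule and the bounding-box adjustment described above might look like this in code. This is a sketch that follows the wording of the slide; the helper names (`contains`, `overlap_area`, `choose_child`, `enlarge`) are hypothetical.

```python
def contains(outer, inner):
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    (olx, oly), (oux, ouy) = outer
    (ilx, ily), (iux, iuy) = inner
    return olx <= ilx and oly <= ily and iux <= oux and iuy <= ouy


def overlap_area(a, b):
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    w = min(a[1][0], b[1][0]) - max(a[0][0], b[0][0])
    h = min(a[1][1], b[1][1]) - max(a[0][1], b[0][1])
    return max(w, 0) * max(h, 0)


def choose_child(child_mbrs, item):
    """Index of the child to descend into when inserting `item`."""
    for i, mbr in enumerate(child_mbrs):
        if contains(mbr, item):
            return i  # some child already covers the item
    # Otherwise take the child with maximum overlap, as the slide states.
    return max(range(len(child_mbrs)), key=lambda i: overlap_area(child_mbrs[i], item))


def enlarge(mbr, item):
    """Smallest rectangle covering both `mbr` and `item` (bounding-box adjustment)."""
    return ((min(mbr[0][0], item[0][0]), min(mbr[0][1], item[0][1])),
            (max(mbr[1][0], item[1][0]), max(mbr[1][1], item[1][1])))


children = [((1, 3), (5, 4)), ((2, 0), (7, 9)), ((1, 1), (7, 8))]
print(choose_child(children, ((2, 3), (3, 4))))   # 0: fully contained in the first child
print(choose_child(children, ((6, 8), (8, 10))))  # 1: no child contains it; largest overlap
print(enlarge(children[1], ((6, 8), (8, 10))))    # ((2, 0), (8, 10))
```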
![Page 76: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/76.jpg)
Splitting an R-Tree Node
• Quadratic split divides the entries in a node into two new nodes as follows:
  1. Find the pair of entries with “maximum separation”
     • that is, the pair whose joint bounding box has the maximum wasted space (area of the bounding box minus the sum of the areas of the two entries)
  2. Place these entries in two new nodes
  3. Repeatedly find the entry with “maximum preference” for one of the two new nodes, and assign the entry to that node
     • The preference of an entry for a node is the increase in area of the bounding box if the entry is added to the other node
  4. Stop when half the entries have been added to one node
     • Then assign the remaining entries to the other node
• The cheaper linear split heuristic runs in time linear in the number of entries,
  • but generates slightly worse splits.
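The four steps of the quadratic split can be sketched as below. This follows the slide's definitions literally (including its definition of preference); the function names, the rounding of "half" upwards, and the tie-breaking order are assumptions of this sketch.

```python
from itertools import combinations


def area(r):
    (lx, ly), (ux, uy) = r
    return (ux - lx) * (uy - ly)


def union(a, b):
    """Bounding box of two rectangles."""
    return ((min(a[0][0], b[0][0]), min(a[0][1], b[0][1])),
            (max(a[1][0], b[1][0]), max(a[1][1], b[1][1])))


def quadratic_split(entries):
    """Split a list of at least two rectangles into two groups."""
    # Step 1: seed pair with maximum wasted space.
    i, j = max(combinations(range(len(entries)), 2),
               key=lambda p: area(union(entries[p[0]], entries[p[1]]))
               - area(entries[p[0]]) - area(entries[p[1]]))
    # Step 2: each seed starts its own new node.
    groups = [[entries[i]], [entries[j]]]
    boxes = [entries[i], entries[j]]
    rest = [e for k, e in enumerate(entries) if k not in (i, j)]
    half = (len(entries) + 1) // 2  # assumption: "half" rounded up

    def growth(e, g):
        """Increase in area of node g's bounding box if e were added to it."""
        return area(union(boxes[g], e)) - area(boxes[g])

    # Steps 3 and 4: assign by maximum preference until one node is half full.
    while rest and len(groups[0]) < half and len(groups[1]) < half:
        # Preference of e for node g = growth of the OTHER node.
        e, g = max(((e, g) for e in rest for g in (0, 1)),
                   key=lambda eg: growth(eg[0], 1 - eg[1]))
        groups[g].append(e)
        boxes[g] = union(boxes[g], e)
        rest.remove(e)
    # Remaining entries go to the node that did not reach half.
    other = 1 if len(groups[0]) >= half else 0
    groups[other].extend(rest)
    return groups


a, b, e = ((0, 0), (1, 1)), ((1, 0), (2, 1)), ((0, 1), (1, 2))
c, d = ((10, 10), (11, 11)), ((11, 10), (12, 11))
for group in quadratic_split([a, b, c, d, e]):
    print(group)
```

On this input the two far-apart clusters end up in separate nodes: a, b and e in one, c and d in the other.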
![Page 77: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/77.jpg)
Deleting in R-Trees
• Deletion of an entry in an R-tree is done much like a B+-tree deletion.
  • In case of an underfull node, borrow entries from a sibling if possible, else merge sibling nodes
  • An alternative approach removes all entries from the underfull node, deletes the node, then reinserts all entries
• As always, deletion tends to be rarer than insertion for many real-world databases.
![Page 78: Chapter 6 External Memory Structuresedijason.github.io/courses/DBA1718/handouts/ch6... · Chapter Outline •External Disk Storage •Working with External Data •B-tree •B+-tree](https://reader030.vdocuments.mx/reader030/viewer/2022040406/5ea4f6d55a15aa62cf07b1cf/html5/thumbnails/78.jpg)
End of Chapter 6