chapter 12 b+ trees cs 157b spring 2003 by: miriam sy

25
Chapter 12 Chapter 12 B+ Trees B+ Trees CS 157B CS 157B Spring 2003 Spring 2003 By: Miriam Sy By: Miriam Sy

Upload: suzan-riley

Post on 04-Jan-2016

231 views

Category:

Documents


6 download

TRANSCRIPT

Page 1: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Chapter 12Chapter 12B+ TreesB+ Trees

CS 157BCS 157B

Spring 2003Spring 2003

By: Miriam SyBy: Miriam Sy

Page 2: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B+ Tree B+ Tree

The B+ Tree index structure is the most The B+ Tree index structure is the most widely used of several index structures widely used of several index structures that that maintain their efficiencymaintain their efficiency despite despite insertion and deletion of data.insertion and deletion of data.

It takes the form of a It takes the form of a balanced treebalanced tree in in which every path from the root of the tree which every path from the root of the tree to a leaf is of the same length. to a leaf is of the same length.

Page 3: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B Trees & B+ TreesB Trees & B+ Trees

B Trees:B Trees: Multi-way treesMulti-way trees Dynamic growthDynamic growth Contains only data Contains only data

pagespages

B+ Trees:B+ Trees: Contains features Contains features

from B Treesfrom B Trees Contains index and Contains index and

data pagesdata pages Dynamic growthDynamic growth

Page 4: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Fill FactorFill Factor B+ Trees use a B+ Trees use a “fill “fill

factor”factor” to control the to control the growth and the growth and the shrinkage.shrinkage.

50% 50% fill factorfill factor is the is the minimum for a B+ minimum for a B+ Tree. Tree.

For n=4, the For n=4, the following guidelines following guidelines must be met:must be met:

Number of Number of Keys/PageKeys/Page

44

Number of Number of Pointers/PagePointers/Page

55

Fill FactorFill Factor 50%50%

Minimum Keys Minimum Keys in each pagein each page

22

Page 5: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B+ TreesB+ Trees

B+ Tree with n=4B+ Tree with n=4

Page 6: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Inserting RecordsInserting Records

The key value determines a record’s The key value determines a record’s placement in a B+ Treeplacement in a B+ Tree

The leaf pages are maintained in The leaf pages are maintained in sequential order AND a sequential order AND a doubly linked doubly linked listlist (usually not shown) ,which connects (usually not shown) ,which connects each leaf page with its sibling page(s). each leaf page with its sibling page(s). This doubly linked list speeds data This doubly linked list speeds data movement as the pages grow and movement as the pages grow and contract.contract.

Page 7: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Insertion AlgorithmInsertion AlgorithmLeaf Page Leaf Page Full ?Full ?

Index Page Index Page Full ?Full ?

ActionAction

NoNo NoNo Place the record in sorted position Place the record in sorted position in the appropriate leaf pagein the appropriate leaf page

YesYes NoNo

1. Split the leaf page1. Split the leaf page

2. Place Middle Key in the index 2. Place Middle Key in the index page in sorted order.page in sorted order.

3. Left leaf page contains records 3. Left leaf page contains records with keys below the middle key.with keys below the middle key.

4. Right leaf page contains records 4. Right leaf page contains records with keys = > than the middle with keys = > than the middle key.key.

Page 8: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Insertion Algorithm Insertion Algorithm cont…cont…

Leaf Page Leaf Page Full?Full?

Index Page Index Page Full?Full?

ActionAction

YesYes YesYes

1.Split the leaf page1.Split the leaf page

2. Records with keys < middle key go to the 2. Records with keys < middle key go to the left leaf pageleft leaf page

3. Records with keys > = middle key go to the 3. Records with keys > = middle key go to the right leaf page.right leaf page.

4. Split the index page4. Split the index page

5. Keys < middle key go to the left index page.5. Keys < middle key go to the left index page.

6. Keys > middle key go to the right index 6. Keys > middle key go to the right index page.page.

7. The middle key goes to the next (higher) 7. The middle key goes to the next (higher) level. If the next level index page is full, level. If the next level index page is full, continue splitting the index pages.continue splitting the index pages.

Page 9: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Insertion 1Insertion 1stst Case: Case: leaf page & index page NOT leaf page & index page NOT

FULLFULL Original Table:Original Table:

Insert 28:Insert 28:

Page 10: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Insertion 2Insertion 2ndnd Case: leaf page Case: leaf page is Full but index page is Not is Full but index page is Not

FullFull Next, we’re going to insert a record with a key Next, we’re going to insert a record with a key

value of 70. This record should go in the leaf value of 70. This record should go in the leaf page with 50, 55, 60, and 65. Unfortunately, page with 50, 55, 60, and 65. Unfortunately, this page is already full so we need to split this page is already full so we need to split nodes into the following:nodes into the following:

The middle key (60) is placed in the index page The middle key (60) is placed in the index page between 50 and 75.between 50 and 75.

Left leaf pageLeft leaf page Right leaf pageRight leaf page

50 5550 55 60 65 7060 65 70

Page 11: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B+ Tree after insertion of B+ Tree after insertion of 7070

Page 12: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Insertion 3Insertion 3rdrd Case: leaf page Case: leaf page & index page is FULL& index page is FULL

For our last example, we need to add 95For our last example, we need to add 95

This record belongs in the page containing 75, 80, 85, and 90. This record belongs in the page containing 75, 80, 85, and 90. Unfortunately, this is full so we need to split it into two pages:Unfortunately, this is full so we need to split it into two pages:

The middle key, 85, rises to the index page. Unfortunately, the The middle key, 85, rises to the index page. Unfortunately, the index page is also full so we must split it up:index page is also full so we must split it up:

Left leaf pageLeft leaf page Right leaf pageRight leaf page

75 8075 80 85 90 9585 90 95

Left index pageLeft index page Right index pageRight index page New index pageNew index page

25 5025 50 75 8575 85 6060

Page 13: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B+ Tree after insertion of B+ Tree after insertion of 9595

Page 14: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

RotationRotation

B+ Trees can incorporate B+ Trees can incorporate rotationrotation to to reduce the number of page splits.reduce the number of page splits.

A rotation occurs when a leaf page is full, A rotation occurs when a leaf page is full, but one of its siblings pages is not full. but one of its siblings pages is not full. Rather than splitting the leaf page, we Rather than splitting the leaf page, we move a record to its sibling, adjusting move a record to its sibling, adjusting indices as necessary.indices as necessary.

The The leftleft sibling is usually checked first, sibling is usually checked first, then the then the right right siblingsibling ..

Page 15: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

RotationRotation For example, consider the B+ Tree before the addition of 70. For example, consider the B+ Tree before the addition of 70.

As previously stated, this record belongs in the leaf node As previously stated, this record belongs in the leaf node containing 50 55 60 65. Notice how this node is full but it’s left containing 50 55 60 65. Notice how this node is full but it’s left sibling is not.sibling is not.

Using rotation, we shift the record with the lowest key to its Using rotation, we shift the record with the lowest key to its sibling. We must also modify the index page since this key sibling. We must also modify the index page since this key also appears there.also appears there.

Page 16: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

RotationRotation

This is what the new B+ Tree looks like This is what the new B+ Tree looks like with the rotation:with the rotation:

Page 17: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion AlgorithmDeletion Algorithm Like the insertion algorithm, there are three scenarios Like the insertion algorithm, there are three scenarios

we must consider when deleting a record from a B+ we must consider when deleting a record from a B+ Tree.Tree.

First Scenario:First Scenario:

Leaf Page Leaf Page Below Fill Factor Below Fill Factor

Index Page Below Index Page Below Fill FactorFill Factor

ActionAction

NoNo NoNo

Delete the record from Delete the record from the leaf page. Arrange the leaf page. Arrange keys in ascending order keys in ascending order to fill void. If the key of to fill void. If the key of the deleted record the deleted record appears in the index appears in the index page, use the next key page, use the next key to replace it.to replace it.

Page 18: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion Algorithm Deletion Algorithm cont…cont…

Leaf Page Leaf Page Below Fill Below Fill Factor ?Factor ?

Index Page Index Page Below Fill Below Fill Factor ?Factor ?

ActionAction

YesYes NoNo

Combine the leaf page and its sibling. Combine the leaf page and its sibling. Change the index page to reflect the Change the index page to reflect the other change.other change.

YesYes YesYes

1. Combine the leaf page and its sibling.1. Combine the leaf page and its sibling.

2. Adjust the index page to reflect the other 2. Adjust the index page to reflect the other change.change.

3. Combine the index page with its sibling.3. Combine the index page with its sibling.

Continue combining index pages until you Continue combining index pages until you reach a page with the correct fill factor or reach a page with the correct fill factor or you reach the root page.you reach the root page.

Page 19: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

DeletionDeletion To refresh our memories, here’s our B+ To refresh our memories, here’s our B+

Tree right after the insertion of 95:Tree right after the insertion of 95:

Page 20: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion (1Deletion (1stst case) case) Delete record with Key 70Delete record with Key 70

Page 21: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion (2Deletion (2ndnd case) case) Delete Delete 25 25 from the B+ Treefrom the B+ Tree Because Because 2525 appears in the index page, when we delete appears in the index page, when we delete 2525, ,

we must replace it with we must replace it with 2828 in the index page. in the index page.

Page 22: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion (3Deletion (3rdrd case) case)

Deleting 60 from the B+ TreeDeleting 60 from the B+ Tree The leaf page containing 60 (60 65) will be below The leaf page containing 60 (60 65) will be below

the fill factor after the deletion. Thus, we must the fill factor after the deletion. Thus, we must combine leaf pages.combine leaf pages.

With recombined pages, the index page will be With recombined pages, the index page will be reduced by one key. Hence, it will also fall below reduced by one key. Hence, it will also fall below the fill factor. Thus, we must combine index the fill factor. Thus, we must combine index pages.pages.

60 appears as the only key in the root index 60 appears as the only key in the root index page. Obviously, it will be removed with the page. Obviously, it will be removed with the deletion.deletion.

Page 23: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Deletion (3Deletion (3rdrd case) case) B+ Tree after deletion of 60B+ Tree after deletion of 60

Page 24: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

B+ Trees in PracticeB+ Trees in Practice

Typical Order: Typical Order: 100100 Typical Fill Factor: Typical Fill Factor: 67%67% Typical Capacities:Typical Capacities:

Height 4: 133^4 = Height 4: 133^4 = 312,900,700312,900,700 records records Height 3: 133^3 = Height 3: 133^3 = 2,352,6372,352,637 records records

Page 25: Chapter 12 B+ Trees CS 157B Spring 2003 By: Miriam Sy

Concluding RemarksConcluding Remarks

Tree structured indexes are ideal for Tree structured indexes are ideal for range-searches, also good for equality range-searches, also good for equality searchessearches

B+ Tree is a dynamic structureB+ Tree is a dynamic structure Insertions and deletions leave tree height Insertions and deletions leave tree height

balancedbalanced High fanout means depth usually just 3 or 4High fanout means depth usually just 3 or 4 Almost always better than maintaining a Almost always better than maintaining a

sorted filesorted file