File Organization and Storage Structures Chapter 5

Upload: kerrie-ellis

Post on 03-Jan-2016

TRANSCRIPT

Page 1: File Organization and Storage Structures Chapter 5

File Organization and Storage Structures

Chapter 5

Page 2: File Organization and Storage Structures Chapter 5
Page 3: File Organization and Storage Structures Chapter 5

Basic Concepts

The database on secondary storage is organized into one or more files, where each file consists of a number of records.

Each record consists of one or more fields.

Typically, a record corresponds to an entity and a field to an attribute.

The physical record is the unit of transfer between disk and primary storage, and vice versa.

A physical record, sometimes called a block or page, usually contains several logical records, depending on the size of the records.
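How many logical records fit in one physical record follows from the two sizes. A minimal sketch, assuming fixed-length records that never span two blocks; the function names and the sizes used (a 4096-byte block, 100-byte records) are illustrative assumptions, not values from the slides:

```python
import math

def records_per_block(block_size: int, record_size: int) -> int:
    """Number of whole fixed-length records that fit in one block."""
    return block_size // record_size

def blocks_needed(num_records: int, block_size: int, record_size: int) -> int:
    """Blocks required to store num_records when records do not span blocks."""
    return math.ceil(num_records / records_per_block(block_size, record_size))

print(records_per_block(4096, 100))    # 40
print(blocks_needed(1000, 4096, 100))  # 25
```

Reading one record therefore always transfers a whole block, which brings its neighbouring records into primary storage as well.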

Page 4: File Organization and Storage Structures Chapter 5

List structures

Elementary list

Singular list

Circular list

Symmetric list

Symmetric circular list
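These list variants differ only in which pointers each record carries. A minimal sketch, assuming that "symmetric" means every record points both forward and backward (a doubly linked list) and that "circular" means the last record points back to the first; the class and function names are hypothetical:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None   # forward pointer (all variants)
        self.prev = None   # backward pointer (symmetric variants only)

def build(values, circular=False, symmetric=False):
    """Link values into a list and return the first node (None if empty)."""
    nodes = [Node(v) for v in values]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
        if symmetric:
            b.prev = a
    if circular and nodes:
        nodes[-1].next = nodes[0]          # close the ring
        if symmetric:
            nodes[0].prev = nodes[-1]
    return nodes[0] if nodes else None

head = build(["X1", "X2", "X3"], circular=True, symmetric=True)
print(head.next.next.next.value)  # X1: the ring closes on the first record
print(head.prev.value)            # X3: backward pointer from the first record
```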

Page 5: File Organization and Storage Structures Chapter 5

Sequential insertion

Before insertion: X(1), X(2), X(3), X(4), free zone

After insertion of Y:

X’(1) = X(1)
X’(2) = Y
X’(3) = X(2)
X’(4) = X(3)
X’(5) = X(4)
free zone
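The slide's picture of sequential insertion can be sketched as follows: every record behind the insertion point moves one slot toward the free zone before Y is written. This is an illustrative sketch; the function name and the use of a Python list for the storage area are assumptions:

```python
def insert_sequential(records, position, new_record):
    """Insert in place by shifting later records one slot toward the free zone."""
    records.append(None)                         # claim one slot from the free zone
    for i in range(len(records) - 1, position, -1):
        records[i] = records[i - 1]              # shift right, starting from the end
    records[position] = new_record
    return records

print(insert_sequential(["X1", "X2", "X3", "X4"], 1, "Y"))
# ['X1', 'Y', 'X2', 'X3', 'X4']
```

The cost is proportional to the number of records behind the insertion point, since each of them has to be moved.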

Page 6: File Organization and Storage Structures Chapter 5

Insertion with pointer technique

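Physical order of the records, with the logical position each one takes after Y is inserted:

X(1)    X’(1) = X(1)
X(3)    X’(4) = X(3)
X(2)    X’(3) = X(2)
X(4)    X’(5) = X(4)
Y       X’(2) = Y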
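With the pointer technique no record moves: Y is written to a free slot and only two pointers change. A minimal sketch, assuming each slot stores a value plus the slot number of its logical successor, with -1 marking the end of the chain; the slot layout and function names are hypothetical:

```python
# Physical slots hold [value, next_slot]; -1 marks the end of the chain.
slots = [["X1", 2], ["X3", 3], ["X2", 1], ["X4", -1]]
head = 0

def insert_after(slots, slot, value):
    """Store value in a new slot and link it in behind the given slot."""
    slots.append([value, slots[slot][1]])   # new record inherits the old successor
    slots[slot][1] = len(slots) - 1         # predecessor now points at the new record

def logical_order(slots, head):
    """Follow the pointers from head and return the values in logical order."""
    out, i = [], head
    while i != -1:
        out.append(slots[i][0])
        i = slots[i][1]
    return out

insert_after(slots, 0, "Y")
print([s[0] for s in slots])       # physical order: ['X1', 'X3', 'X2', 'X4', 'Y']
print(logical_order(slots, head))  # logical order:  ['X1', 'Y', 'X2', 'X3', 'X4']
```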

Page 7: File Organization and Storage Structures Chapter 5

Multi-list structure

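[Figure: multi-list structure. Each record carries a pointer; the record length is 10. The entries list1, list2 and "list empty places" are the list heads. The records shown are A, B, K and L, the address and pointer values shown are 2000, 2010, 2020, 2030, 2040, 2050, 2060 and 3000, and -1 appears as a pointer value.]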

Page 8: File Organization and Storage Structures Chapter 5

Insertion at beginning of list 2

[Figure: the multi-list structure after record M has been inserted at the beginning of list 2. The records shown are A, B, K, L and M.]

List1: A B

List2: M K L
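The multi-list idea, several lists plus a list of empty places sharing one storage area, can be sketched as below. This is an illustrative sketch only: it uses slot numbers instead of the byte addresses in the figure, treats -1 as the end-of-list marker, and all class and method names are assumptions:

```python
NIL = -1

class MultiList:
    """Several linked lists plus a free list sharing one array of slots."""

    def __init__(self, capacity):
        self.value = [None] * capacity
        # Initially every slot is chained into the list of empty places.
        self.next = [i + 1 for i in range(capacity - 1)] + [NIL]
        self.free = 0 if capacity else NIL
        self.heads = {}

    def insert_front(self, list_name, value):
        slot = self.free
        if slot == NIL:
            raise MemoryError("no empty places left")
        self.free = self.next[slot]                       # unlink from the free list
        self.value[slot] = value
        self.next[slot] = self.heads.get(list_name, NIL)  # point at the old head
        self.heads[list_name] = slot                      # new record becomes the head

    def items(self, list_name):
        out, slot = [], self.heads.get(list_name, NIL)
        while slot != NIL:
            out.append(self.value[slot])
            slot = self.next[slot]
        return out

ml = MultiList(8)
for v in ["B", "A"]:
    ml.insert_front("list1", v)
for v in ["L", "K", "M"]:
    ml.insert_front("list2", v)
print(ml.items("list1"))  # ['A', 'B']
print(ml.items("list2"))  # ['M', 'K', 'L']
```

Inserting at the beginning of a list touches only the list head, the new record's pointer, and the head of the list of empty places.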

Page 9: File Organization and Storage Structures Chapter 5

General tree structure

A

B C

D E F H J K L

M N P Q R

Page 10: File Organization and Storage Structures Chapter 5

Equivalent binary tree structure

A

B C

D E F

H J K L

Q R

M N P
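A general tree can be turned into a binary tree by giving every node just two pointers. A minimal sketch, assuming the usual first-child / next-sibling correspondence; the parent-child edges of the slide's tree cannot be recovered from the transcript, so the small tree below is a hypothetical example that only borrows some of its node labels:

```python
def to_binary(tree, root):
    """Map each node to (first_child, next_sibling); None means no pointer."""
    binary = {}

    def visit(node, sibling):
        children = tree.get(node, [])
        binary[node] = (children[0] if children else None, sibling)
        for i, child in enumerate(children):
            visit(child, children[i + 1] if i + 1 < len(children) else None)

    visit(root, None)
    return binary

# Hypothetical general tree: node -> ordered list of children.
general = {"A": ["B", "C"], "B": ["D", "E", "F"], "C": ["H"]}
binary = to_binary(general, "A")
print(binary["A"])  # ('B', None)
print(binary["B"])  # ('D', 'C')
print(binary["E"])  # (None, 'F')
```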

Page 11: File Organization and Storage Structures Chapter 5

Pointer Implementation

A

B C

D E F

H J K L

Q R

M N P

[In the figure, unused pointer fields hold -1.]

Page 12: File Organization and Storage Structures Chapter 5

Bi-directional tree

X

Y R S

Z U T

Entry

-1 X

Y -1 R -1 S

-1 Z -1 U -1 T

Pointers:
- first lower
- higher
- next

Page 13: File Organization and Storage Structures Chapter 5

Ring structure

X

Y Z U

V T R

Entry

X

Y Z U

V T R

Page 14: File Organization and Storage Structures Chapter 5

File Organization

File Organization: the physical arrangement of data into records and pages on secondary storage.

Main types

• Heap or unordered

• Sorted

• Hash

Access method: the steps involved in storing and retrieving records from a file.

Page 15: File Organization and Storage Structures Chapter 5

Sample Data

SUPPLIER file

SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

Page 16: File Organization and Storage Structures Chapter 5

Hash Files

[Figure: hash file with storage positions numbered 0 to 12. The supplier records appear in this order:]

S300  Blanchart  30  Paris
S200  Janssens   10  Paris
S500  Adams      30  Athens
S100  De Smet    20  London
S400  Clark      20  London

Hashing techniques

Duplicate handling
- open addressing
- unchained overflow
- chained overflow
- multiple hashing

Hashing algorithms
- folding
- mid-square
- division by prime number

Limitations:
- inappropriate for value ranges
- retrieval on the non-hash fields
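Division by a prime number combined with chained overflow can be sketched as follows. The divisor 13 is an assumption: it matches the 13 positions (0 to 12) in the figure but is not stated on the slide, and the function names are hypothetical:

```python
def hash_address(snum: str, table_size: int = 13) -> int:
    """Division method: numeric part of the key modulo a prime."""
    return int(snum.lstrip("S")) % table_size

def build_hash_file(records, table_size=13):
    """Place each record in the chain of its hash address."""
    buckets = [[] for _ in range(table_size)]   # each bucket is an overflow chain
    for rec in records:
        buckets[hash_address(rec[0], table_size)].append(rec)
    return buckets

def lookup(buckets, snum):
    """Direct access: hash the key, then scan only that one chain."""
    for rec in buckets[hash_address(snum, len(buckets))]:
        if rec[0] == snum:
            return rec
    return None

suppliers = [
    ("S100", "De Smet", 20, "London"),
    ("S200", "Janssens", 10, "Paris"),
    ("S300", "Blanchart", 30, "Paris"),
    ("S400", "Clark", 20, "London"),
    ("S500", "Adams", 30, "Athens"),
]
buckets = build_hash_file(suppliers)
print({rec[0]: hash_address(rec[0]) for rec in suppliers})
# {'S100': 9, 'S200': 5, 'S300': 1, 'S400': 10, 'S500': 6}
print(lookup(buckets, "S300"))  # ('S300', 'Blanchart', 30, 'Paris')
```

Under this assumption the addresses come out in the order S300, S200, S500, S100, S400, which is the order in which the records appear in the figure. A query such as "all suppliers in Paris" gets no help from the hash and has to scan every bucket, which is the second limitation listed.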

Page 17: File Organization and Storage Structures Chapter 5

An Index

An index provides an ACCESS PATH to the file it is indexing

a file may have several associated indexes

the sequential access path is always available

an index imposes an ordering on the file it is indexing

it can be used for direct access

it speeds up retrieval and slows down updating

it is not the same thing as a key

can be built on combinations of fields

can be SRA or symbolic

Page 18: File Organization and Storage Structures Chapter 5

Sample Data

SUPPLIER file

SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

Page 19: File Organization and Storage Structures Chapter 5

Supplier file with index on city

Supplier file

SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

City-index (each entry points to one supplier record)

Athens  -> S5
London  -> S1
London  -> S4
Paris   -> S2
Paris   -> S3
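A dense index like the city index holds one entry per record, kept in order of the indexed field. A minimal sketch, assuming the index is an in-memory sorted list of (value, record position) pairs; the function names are hypothetical:

```python
from bisect import bisect_left

suppliers = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]

def build_index(records, field):
    """Dense index: one (value, record position) entry per record, sorted by value."""
    return sorted((rec[field], pos) for pos, rec in enumerate(records))

def find(records, index, value):
    """Binary search in the index, then follow the pointers to the records."""
    i = bisect_left(index, (value,))   # first entry whose value is >= the search value
    hits = []
    while i < len(index) and index[i][0] == value:
        hits.append(records[index[i][1]])
        i += 1
    return hits

city_index = build_index(suppliers, 3)
print([city for city, _ in city_index])
# ['Athens', 'London', 'London', 'Paris', 'Paris']
print([rec[0] for rec in find(suppliers, city_index, "London")])
# ['S1', 'S4']
```

The file itself stays in SNUM order; only the index is sorted by city, so a second index on another field can be added without moving any record.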

Page 20: File Organization and Storage Structures Chapter 5

Supplier file with two indexes

Supplier file

SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

City-index

Athens  -> S5
London  -> S1
London  -> S4
Paris   -> S2
Paris   -> S3

Status-index

10  -> S2
20  -> S1
20  -> S4
30  -> S3
30  -> S5

Page 21: File Organization and Storage Structures Chapter 5

Non-dense index

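SNUM-index (one entry per block, holding the highest SNUM in that block)

S2  -> block 1
S4  -> block 2
S5  -> block 3

Supplier file

SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London     (block 1)
S2    Janssens   10      Paris      (block 1)
S3    Blanchart  30      Paris      (block 2)
S4    Clark      20      London     (block 2)
S5    Adams      30      Athens     (block 3)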
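A non-dense index needs only one entry per block, provided the file is stored in key order. A minimal sketch, assuming two records per block and index entries that hold the highest SNUM of each block (which matches the entries S2, S4, S5 on the slide); the function name is hypothetical, and plain string comparison works here only because all keys have the same length:

```python
from bisect import bisect_left

blocks = [
    [("S1", "De Smet", 20, "London"), ("S2", "Janssens", 10, "Paris")],
    [("S3", "Blanchart", 30, "Paris"), ("S4", "Clark", 20, "London")],
    [("S5", "Adams", 30, "Athens")],
]
snum_index = [block[-1][0] for block in blocks]   # ['S2', 'S4', 'S5']

def find(snum):
    """Locate the block via the index, then scan that block sequentially."""
    b = bisect_left(snum_index, snum)   # first block whose highest key is >= snum
    if b == len(blocks):
        return None
    for rec in blocks[b]:
        if rec[0] == snum:
            return rec
    return None

print(snum_index)   # ['S2', 'S4', 'S5']
print(find("S3"))   # ('S3', 'Blanchart', 30, 'Paris')
print(find("S6"))   # None
```

The index alone cannot answer whether S3 exists; the block has to be read, which is the price paid for the smaller index.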

Page 22: File Organization and Storage Structures Chapter 5

Factoring out a field

Supplier file

SNUM  SNAME      STATUS  CITY-pointer
S1    De Smet    20      -> London
S2    Janssens   10      -> Paris
S3    Blanchart  30      -> Paris
S4    Clark      20      -> London
S5    Adams      30      -> Athens

CITY-file

CITY
Athens
London
Paris

Page 23: File Organization and Storage Structures Chapter 5

Combining Indexing and factoring out

Supplier records

S1    De Smet    20
S2    Janssens   10
S3    Blanchart  30
S4    Clark      20
S5    Adams      30

City entries

Athens
London
Paris

Page 24: File Organization and Storage Structures Chapter 5

Parent - Child structure

CITY file

Athens
London
Paris

SUPPLIER file

S1    De Smet    20
S2    Janssens   10
S3    Blanchart  30
S4    Clark      20
S5    Adams      30

Page 25: File Organization and Storage Structures Chapter 5

Fully inverted file

SNAME-index        STATUS-index       CITY-index             Supplier-file

De Smet    S1->    10  S2->           Athens  S5->           S1
Janssens   S2->    20  S1->,S4->      London  S1->,S4->      S2
Blanchart  S3->    30  S3->,S5->      Paris   S2->,S3->      S3
Clark      S4->                                              S4
Adams      S5->                                              S5
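In a fully inverted file every non-key field has its own index, so a query on field values can be answered from the indexes before any supplier record is read. A minimal sketch, assuming each index maps a field value to a list of SNUMs; the function and variable names are hypothetical:

```python
from collections import defaultdict

FIELDS = ("SNUM", "SNAME", "STATUS", "CITY")
suppliers = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]

def invert(records, fields=FIELDS, key="SNUM"):
    """Build one index per non-key field: field value -> list of keys."""
    indexes = {f: defaultdict(list) for f in fields if f != key}
    for rec in records:
        row = dict(zip(fields, rec))
        for field, index in indexes.items():
            index[row[field]].append(row[key])
    return indexes

indexes = invert(suppliers)
print(indexes["STATUS"][10])        # ['S2']
print(indexes["CITY"]["London"])    # ['S1', 'S4']
# Suppliers with status 30 located in Paris, found by intersecting two indexes:
print(sorted(set(indexes["STATUS"][30]) & set(indexes["CITY"]["Paris"])))  # ['S3']
```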

Page 26: File Organization and Storage Structures Chapter 5

File organization: Indexed-sequential

Multi-level index blocks

Top level:   [Behr, Dooms, Fagin]
Next level:  [Adams, Albert, Behr]  [Bodoo, Claes, Codd, Dooms]  [Ernest, Fagin]

Data blocks

[Ace, Adamo, Adams]  [Ademar, Aerts, Alan, Albert]  [Alois, Ball, Behr]  [Bens, Bodoo]

Parameters:
- index block size
- data block size
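A lookup in an indexed-sequential file walks down the index levels and then scans one data block. A minimal sketch using the names from the slide, assuming every index entry is the highest key of the block it points to; only the four data blocks visible on the slide are included, and the dictionary layout and function names are hypothetical:

```python
from bisect import bisect_left

index_top = ["Behr", "Dooms", "Fagin"]
index_low = {
    "Behr": ["Adams", "Albert", "Behr"],
    "Dooms": ["Bodoo", "Claes", "Codd", "Dooms"],
    "Fagin": ["Ernest", "Fagin"],
}
# Data blocks keyed by their highest name; blocks not shown on the slide are absent.
data_blocks = {
    "Adams": ["Ace", "Adamo", "Adams"],
    "Albert": ["Ademar", "Aerts", "Alan", "Albert"],
    "Behr": ["Alois", "Ball", "Behr"],
    "Bodoo": ["Bens", "Bodoo"],
}

def pick(keys, name):
    """Return the first index entry that is >= name, or None if there is none."""
    i = bisect_left(keys, name)
    return keys[i] if i < len(keys) else None

def find(name):
    """Return the path (top entry, low entry, data block) or None if not found."""
    top = pick(index_top, name)
    if top is None:
        return None
    low = pick(index_low[top], name)
    block = data_blocks.get(low)
    if block is None or name not in block:
        return None
    return top, low, block

print(find("Aerts"))  # ('Behr', 'Albert', ['Ademar', 'Aerts', 'Alan', 'Albert'])
print(find("Bens"))   # ('Dooms', 'Bodoo', ['Bens', 'Bodoo'])
print(find("Zed"))    # None
```

The two parameters on the slide, index block size and data block size, determine how many entries each of these lists may hold and therefore how many index levels are needed.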

Page 27: File Organization and Storage Structures Chapter 5

B-tree concept

BALANCED tree

Non-dense index (upper levels):

[25 144]
[9 -]  [64 100]  [196 -]

Dense index (lowest level):

[1 4 -]  [9 16 -]  [25 36 49]  [64 81 -]  [100 121 -]  [144 169 -]  [196 225 256]
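Searching this tree means choosing one pointer per level. A minimal sketch over the tree on the slide, assuming that a key equal to an index entry is found in the subtree to the right of that entry and that the last key of the lowest level is 256, as on the later slides; the tuple layout and function name are hypothetical:

```python
from bisect import bisect_right

# Index node: (keys, children). Lowest-level node: a plain list of keys.
tree = ([25, 144], [
    ([9], [[1, 4], [9, 16]]),
    ([64, 100], [[25, 36, 49], [64, 81], [100, 121]]),
    ([196], [[144, 169], [196, 225, 256]]),
])

def search(node, key):
    """Return (found, path), where path lists the nodes visited from the root down."""
    path = []
    while isinstance(node, tuple):
        keys, children = node
        path.append(keys)
        node = children[bisect_right(keys, key)]   # keys >= an entry go to its right
    return key in node, path + [node]

print(search(tree, 49))   # (True, [[25, 144], [64, 100], [25, 36, 49]])
print(search(tree, 50))   # (False, [[25, 144], [64, 100], [25, 36, 49]])
```

Because the tree is balanced, every search visits the same number of nodes, here three.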

Page 28: File Organization and Storage Structures Chapter 5

B-tree insertion

Same B-tree after insertion of record 32

Non-dense index (upper levels):

[64 -]
[25 -]  [144 -]
[9 -]  [36 -]  [100 -]  [196 -]

Dense index (lowest level):

[1 4 -]  [9 16 -]  [25 32 -]  [36 49 -]  [64 81 -]  [100 121 -]  [144 169 -]  [196 225 256]
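The change from the previous tree can be reproduced as a chain of node splits. The sketch below is not a full B-tree implementation; it only replays the three overflows caused by inserting 32, assuming lowest-level nodes hold at most three keys and index nodes at most two (the capacities suggested by the slides). The function names are hypothetical:

```python
def split_leaf(keys):
    """Split an overflowing lowest-level node; the separator is copied upward."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right[0], right

def split_index(keys):
    """Split an overflowing index node; the middle key moves upward."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

# 1. The node [25 36 49] is full, so adding 32 overflows it.
left, up, right = split_leaf(sorted([25, 36, 49] + [32]))
print(left, up, right)   # [25, 32] 36 [36, 49]

# 2. 36 goes into the parent [64 100], which overflows in turn.
left, up, right = split_index(sorted([64, 100] + [up]))
print(left, up, right)   # [36] 64 [100]

# 3. 64 goes into the root [25 144], which also overflows; 64 becomes the new root.
left, up, right = split_index(sorted([25, 144] + [up]))
print(left, up, right)   # [25] 64 [144]
```

The last split adds a level at the top, which is how the tree grows while staying balanced.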

Page 29: File Organization and Storage Structures Chapter 5

B-tree deletion

Deletion of 64

Non-dense index (upper levels):

[25 81]
[9 -]  [36 -]  [144 196]

Lowest level:

[1 4 -]  [9 16 -]  [25 32 -]  [36 49 -]  [81 100 121]  [144 169 -]  [196 225 256]