E. Bertino, L. Martino, Object-Oriented Database Systems. Chapter 8: Storage Management and Indexing Techniques. Seoul National University, Department of Computer Engineering, OOPSLA Lab.

Upload: anthony-greene

Post on 29-Dec-2015


E. Bertino, L. Martino
Object-Oriented Database Systems

Chapter 8. Storage Management and Indexing Techniques

Seoul National University

Department of Computer Engineering

OOPSLA Lab.

Table of Contents

• Storage Techniques for Relational DBMS
• Storage Techniques for Objects
• Clustering Techniques
• Indexing Techniques for OODBMS
• Object Identifiers
• Swizzling

Storage Techniques for Relational DBMS

• Disk Organization
• Storing Records in RDBMS
• Addressing Records with a Slot Vector

Disk Organization

• Disk → partitions → segments → pages/blocks
• Disk header
    • number of partitions
    • the address and the size of each partition
    • log for recovery in case of a system crash
• Page addresses for each segment are stored in tables
• Page = page header + offsets of objects + objects

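[Figure: disk organization. The disk holds a header, partitions 1 to N, and a log. Each partition is divided into segments 1 to m, and each segment into pages 1 to i. A page consists of a header, an array of offsets, the stored objects, and free space (adjacent free space and total free space).]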

Storing Records in RDBMS

• Fixed-length records
    • normally stored contiguously on the disk
    • all the records of a relation can be stored in a single file
• Variable-length records
    • stored directly on the disk with an ID
    • the structure of the ID has a strong effect on retrieval speed
• Structure of the ID in System R
    • high-order bits identify the segment and the page of the file
    • low-order bits identify a record within a page
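The bit layout described above can be sketched as follows. This is a minimal illustration, assuming field widths of 8 bits for the segment, 16 for the page, and 8 for the record within the page; the slide does not give System R's actual widths, and the function names are hypothetical.

```python
# Assumed field widths, chosen only for illustration.
SEGMENT_BITS = 8
PAGE_BITS = 16
SLOT_BITS = 8


def pack_rid(segment, page, slot):
    """Pack (segment, page, slot) into one integer, segment in the highest bits."""
    assert 0 <= segment < (1 << SEGMENT_BITS)
    assert 0 <= page < (1 << PAGE_BITS)
    assert 0 <= slot < (1 << SLOT_BITS)
    return (segment << (PAGE_BITS + SLOT_BITS)) | (page << SLOT_BITS) | slot


def unpack_rid(rid):
    """Recover (segment, page, slot) from a packed record ID."""
    slot = rid & ((1 << SLOT_BITS) - 1)
    page = (rid >> SLOT_BITS) & ((1 << PAGE_BITS) - 1)
    segment = rid >> (PAGE_BITS + SLOT_BITS)
    return segment, page, slot


rid = pack_rid(segment=3, page=1025, slot=7)
```

Because the page address sits in the high-order bits, the page holding a record can be located from the ID alone, without consulting a separate mapping table.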

Addressing Records with a Slot Vector

• Advantages
    • as fast as using the complete address of a record
    • the length of records can be changed
    • the records can be relocated
    • often faster than using a purely logical ID

[Figure: a slot in the slot vector pointing to a record]
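A minimal in-memory sketch of a page addressed through a slot vector follows. The class and method names are hypothetical, and the page is simplified (no page header, no on-disk format); it only illustrates why records can grow and be relocated while their slot numbers stay valid.

```python
class SlottedPage:
    """A page whose records are reached through a slot vector."""

    def __init__(self, size=256):
        self.data = bytearray(size)
        self.slots = []        # slot number -> (offset, length), or None for a hole
        self.free_start = 0    # first byte of the adjacent free space

    def _append(self, record):
        if self.free_start + len(record) > len(self.data):
            self.compact()     # reclaim holes before giving up
        if self.free_start + len(record) > len(self.data):
            raise ValueError("page full")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        return offset

    def insert(self, record):
        offset = self._append(record)
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1   # the slot number is the stable part of the ID

    def read(self, slot):
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def update(self, slot, record):
        old = self.slots[slot]
        live = sum(e[1] for e in self.slots if e is not None) - old[1]
        if live + len(record) > len(self.data):
            raise ValueError("page full")
        self.slots[slot] = None            # the old copy becomes a hole
        offset = self._append(record)      # may compact and relocate other records
        self.slots[slot] = (offset, len(record))

    def compact(self):
        """Relocate live records to remove holes; slot numbers do not change."""
        new = bytearray(len(self.data))
        pos = 0
        for i, entry in enumerate(self.slots):
            if entry is None:
                continue
            offset, length = entry
            new[pos:pos + length] = self.data[offset:offset + length]
            self.slots[i] = (pos, length)
            pos += length
        self.data = new
        self.free_start = pos


page = SlottedPage(size=32)
a = page.insert(b"alpha")
b = page.insert(b"beta")
page.update(a, b"alpha-extended")       # record grows and moves within the page
page.update(b, b"beta-longer-value")    # forces a compaction; slots a and b stay valid
```

Callers only ever hold slot numbers, so relocation inside the page never invalidates an identifier.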

Storage Techniques for Objects

• Structure of Objects
• Access Patterns to Objects
• Approaches to Storage Organization for Objects
• Storage and Variable Length and Large Attributes
• Storage and Inheritance Hierarchy

Structure of Objects

• Storage/memory organization must support
    • objects with both atomic and complex attributes
    • objects with multi-valued attributes
    • objects with variant attributes
    • objects with long-field attributes such as multimedia information, texts, images, voice, etc.
• Efficiency of a storage organization depends on
    • the structure of objects and their relations
    • the access pattern, which is the way in which the application programs access the objects

Categories of Access Patterns

• Access based on the whole object
    • for applications which execute complex manipulations of objects by means of specialized programs
    • the whole object is copied into the application's memory
    • direct model
• Access based on the attributes of the object
    • appropriate when large objects need to be accessed
    • used to retrieve attributes of objects along the aggregation hierarchy
    • normalized model

Direct Model of Storage Organization(1)

• Objects are stored in the same way in which they are defined in the conceptual schema
    • storage unit = semantic unit
    • objects of the same class are stored in the same file
• Advantages
    • the simplest model, and the same as the one used in RDBMS
    • transferring a whole object is very efficient
• Disadvantages
    • access to a set of attributes of an object can be very expensive

Direct Model of Storage Organization(2)

• Situations where the direct model is inefficient
    • variable-length attributes
    • new attributes
    • the majority of attributes have the null value

Normalized Model of Storage Organization

• Decompose an object into atomic components
• Each component is stored in a different file
• Relations between the components are maintained by OIDs

Intermediate Approach

• Complex objects are decomposed
• Components are grouped together according to access patterns, to be stored in the same file
• Problem
    • efficiency depends on having prior knowledge of the exact access patterns of the applications

Variable-length and Large Attributes

• Normalized method
• Property list method
• Stream (or demand-page) mechanism
    • portions of the object can be transferred in increments

Property list(1)

• Sequence of triples <identifier, size, value>
    • identifier: which attribute of the object is stored
    • size: number of bytes stored
    • value: the value (of varying size) of the attribute

Property List(2)

• Advantages
    • variable-length attributes
    • different sets of attributes
    • sparse attributes
    • attributes can be stored in different physical locations
• Disadvantages
    • the whole property list may have to be scanned to find the desired attribute
    • the property list must be transformed into the proper format for the application programming language
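The triple layout and the scanning cost can be sketched as follows. This is a minimal illustration, assuming a 2-byte attribute identifier and a 4-byte size field; those widths and the function names are not from the slides.

```python
import struct

# Assumed header layout: attribute identifier (2 bytes), value size (4 bytes).
HEADER = struct.Struct("<HI")


def encode(attrs):
    """Serialize {identifier: value bytes} as a sequence of <identifier, size, value>."""
    out = bytearray()
    for ident, value in attrs.items():
        out += HEADER.pack(ident, len(value))
        out += value
    return bytes(out)


def lookup(plist, wanted):
    """Scan the property list from the start until the wanted attribute is found."""
    pos = 0
    while pos < len(plist):
        ident, size = HEADER.unpack_from(plist, pos)
        pos += HEADER.size
        if ident == wanted:
            return plist[pos:pos + size]
        pos += size              # skip this value and move to the next triple
    return None                  # absent (sparse) attributes occupy no space


plist = encode({1: b"Jones", 7: b"x" * 10})
```

Attributes that are null are simply omitted, which is what makes the format attractive for sparse objects; the price is the linear scan in `lookup`.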

Storage and Inheritance Hierarchy

• Attributes of the superclass should be stored
• Single inheritance
    • the attributes of the superclass are stored first, then those of the subclass
    • variable-length attributes are kept alongside, using the property list
• Multiple inheritance
    • property lists
    • storing objects separately
    • each of the above contains the fields for the superclass, and they are linked to one another

Clustering Techniques

• Clustering in DBMS
• Clustering in RDBMS
• Clustering in OODBMS
• Static Clustering
• Dynamic Clustering
• Clustering for Multiple Relations

Clustering in DBMS

• Focus
    • partitioning objects in the database
    • placing these partitions on disk
• Aim
    • reduce the number of I/O operations on disk
• Considerations
    • structure of the objects
    • access patterns of applications

Clustering Techniques for RDBMS

• Tuples of a relation in the same page segment
    • on the basis of the value of an attribute or of a combination of attributes in a relation
• Tuples of more than one relation in the same segment
    • tuples that have one or more attributes in common with the same values
    • efficient for processing queries with join operations

Clustering Techniques for OODBMS

• New considerations compared with RDBMS
    • complex objects
    • single or multiple inheritance
    • methods
• Linear clustering sequence for complex objects
    • all the descendant nodes of each node p in the hierarchy are stored immediately after p in depth-first order
    • efficient for retrieving an object and all its descendants
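The linear clustering sequence is a depth-first (pre-order) walk of the complex object. A minimal sketch, using a hypothetical vehicle object whose component names are invented for illustration:

```python
def linear_cluster(root, children):
    """Return the storage order: every node is followed by all its descendants."""
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is placed first.
        stack.extend(reversed(children.get(node, [])))
    return order


# Hypothetical complex object: node -> component objects.
children = {
    "vehicle": ["body", "drivetrain"],
    "body": ["door", "chassis"],
    "drivetrain": ["engine", "transmission"],
}
order = linear_cluster("vehicle", children)
```

In the resulting sequence any object and its descendants occupy one contiguous run, so retrieving a subtree reads neighbouring positions only.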

Basic Options for Clustering for OODBMS

• Proposed by Won Kim in 1990
    • both clustering techniques used in RDBMS
    • clustering all the instances of classes which belong to an aggregation hierarchy
    • clustering all the instances of classes which belong to an inheritance hierarchy
    • a combination of the two previous strategies
• The clustering strategies above are static

Static Clustering

• Unchangeable at run-time
• Problems
    • no consideration of the dynamic evolution of objects
    • objects can be shared among several objects
    • the clustering scheme is based on a single access pattern

Dynamic Clustering

• The sequence in which objects are created may NOT be the same as the desired clustering sequence
• Reorganizing and recompacting pages in a cluster
• Types of file reorganization
    • on-line: finding the optimal reorganization is an NP-complete problem
    • off-line: when should the reorganization be done?
• On-line reorganization technique by Chen and Hurson
    • chunks (sets of pages) as the unit of clustering
    • cost model
    • ratio between the read and write operations

Clustering for Multiple Relations

• Certain relationships can be used more frequently than others
• Directed graph
    • nodes for objects
    • arcs for relationships
    • weights for ordering relationships
• Clustering algorithm with levels by Chen and Hurson
    • arranges all the nodes of the graph in a linear sequence
    • nodes connected by heavier arcs are placed nearer to each other
    • access time is around half that for randomly placed objects

Indexing Techniques for OODBMS

• Indexing Techniques for Aggregation Hierarchy
• Index Structures and Operations
• Comparison of Index Organization
• Indexing Techniques for Inheritance Hierarchy
• Precomputing and Caching

Preliminary Definitions

• Path
    • a branch in an aggregation hierarchy
• Path instantiation
    • a sequence of objects obtained by instantiating the path
• Nested index
    • an index for a direct connection between the starting object and the ending object of the path instantiation
• Path index
    • an index for storing the instantiations of a path
    • same index key as the nested index

Index Key

Example of Aggregation Hierarchy

[Figure: aggregation hierarchy with the classes Project, Company, Division, and Person]

Definition of Path

• Given an aggregation hierarchy $H$, a path $P$ is defined as $C_1.A_1.A_2.\ldots.A_n$ $(n \ge 1)$, where
    • $C_1$ is a class in $H$
    • $A_1$ is an attribute of class $C_1$
    • $A_i$ is an attribute of a class $C_i$ in $H$, such that $C_i$ is the domain of the attribute $A_{i-1}$ of class $C_{i-1}$ $(1 < i \le n)$
• length(P): the length of the path
• classes(P): the set of classes along the path
• dom(P): the domain of attribute $A_n$ of class $C_n$

Examples of Path

• P1: Project.main_contracting_company.divisions.head.name
    • length(P1) = 4
    • classes(P1) = {Project, Company, Division, Person}
    • dom(P1) = STRING
• P2: Company.divisions.city
    • length(P2) = 2
    • classes(P2) = {Company, Division}
    • dom(P2) = STRING
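The three functions length, classes, and dom can be computed mechanically from a schema. The sketch below assumes a schema dictionary reconstructed from the attribute names that appear in the examples of this chapter; the dictionary layout and the function name `analyze` are hypothetical.

```python
# Assumed schema: class -> {attribute: domain}, reconstructed from the examples.
SCHEMA = {
    "Project": {"main_contracting_company": "Company"},
    "Company": {"divisions": "Division"},
    "Division": {"head": "Person", "city": "STRING"},
    "Person": {"name": "STRING"},
}


def analyze(path):
    """Return length, classes, and dom for a path written as C1.A1.A2...An."""
    first, *attrs = path.split(".")
    classes, current = [], first
    for attr in attrs:
        classes.append(current)
        current = SCHEMA[current][attr]   # the domain of this attribute
    return {"length": len(attrs), "classes": classes, "dom": current}


p1 = analyze("Project.main_contracting_company.divisions.head.name")
p2 = analyze("Company.divisions.city")
```

A path whose attribute does not exist in the class reached so far raises a `KeyError`, which is a simple way to check that a path is well formed against the schema.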

Definition of Complete Instantiation

• A complete instantiation (CI) is a sequence of objects along a path
• Given the path $P = C_1.A_1.A_2.\ldots.A_n$, a CI is denoted as $O_1.O_2.\ldots.O_{n+1}$, where
    • $O_1$ is an instance of class $C_1$
    • $O_i$ is the value of the attribute $A_{i-1}$ of object $O_{i-1}$
        • $O_i = O_{i-1}.A_{i-1}$ or $O_i \in O_{i-1}.A_{i-1}$ $(1 < i \le n+1)$
• Examples of CI, where the path is given as P1
    • Project[i].Company[k].Division[k].Person[x].Jones
    • Project[j].Company[i].Division[h].Person[y].Smith

Definition of Partial Instantiation

• A partial instantiation (PI) is a part of a CI that ends at the last object of the CI
• Given a path $P = C_1.A_1.A_2.\ldots.A_n$, a PI is denoted as $O_1.O_2.\ldots.O_j$ $(j < n+1)$, where
    • $O_1$ is an instance of a class $C_k$ in classes(P) such that $k + j - 1 = n + 1$
    • $O_i$ is the value of the attribute $A_{i-1}$ of an object $O_{i-1}$
• Examples of PI, where the path is given as P1
    • Division[k].Person[x].Jones
    • Division[h].Person[y].Smith

Definition of Redundancy

• A PI $O_1.O_2.\ldots.O_j$ is non-redundant if there is no CI or PI $O'_1.O'_2.\ldots.O'_k$ with $k > j$ such that $O_i = O'_{k-j+i}$ $(i = 1, \ldots, j)$, that is, if it is not the tail of a longer instantiation
• Examples of redundant PIs
    • Division[k].Person[x].Jones is redundant with respect to Project[i].Company[k].Division[k].Person[x].Jones
    • Division[h].Person[y].Smith is redundant with respect to Project[j].Company[i].Division[h].Person[y].Smith

Definition of Projection of Path

• A projection is the part of a CI or PI that begins at its first object
• $\langle m \rangle(p)$ denotes the projection of $p$ with length $m$
    • let $P = C_1.A_1.A_2.\ldots.A_n$
    • let $p = O_1.O_2.O_3.\ldots.O_j$ $(j \le n+1)$ be a PI (or CI) of $P$
    • then $\langle m \rangle(p) = O_1.O_2.\ldots.O_m$ $(m < j)$
• Example
    • $\langle 2 \rangle$(Project[i].Company[k].Division[k].Person[x].Jones) = Project[i].Company[k]

Multi-index

• An index is allocated on each of the classes constituting the path
• Given a path $P = C_1.A_1.A_2.\ldots.A_n$, a multi-index is a set of $n$ simple indices $I_1, I_2, \ldots, I_n$
    • $I_i$ is an index defined on $C_i.A_i$ $(1 \le i \le n)$
• Solving a nested predicate scans $n$ indices
    • the last index $I_n$ on the path is scanned first
    • the results of the scan using $I_i$ are used as keys for $I_{i-1}$
• Only for reverse traversal scanning strategies
• Low updating cost

Examples of Multi-index

• First index I1 on Project.main_contracting_company
    • (Company[k], {Project[i]})
    • (Company[i], {Project[j], Project[l]})
• Second index I2 on Company.divisions
    • (Division[h], {Company[i]})
    • (Division[i], {Company[i]})
    • (Division[k], {Company[k]})
• Third index I3 on Division.city
    • (Boston, {Division[h]})
    • (New York, {Division[i]})
    • (Los Angeles, {Division[k]})

Example of Using Multi-index

• Query: select all the projects with a main contracting company which has a division in Los Angeles
    • scan index I3 with the key value Los Angeles → {Division[k]}
    • scan index I2 with the key value Division[k] → {Company[k]}
    • scan index I1 with the key value Company[k] → {Project[i]}
• Result: {Project[i]}
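The reverse traversal over the three indices can be sketched with plain dictionaries. The index contents are the ones listed in the multi-index example; the function name `solve` is hypothetical.

```python
I1 = {"Company[k]": {"Project[i]"}, "Company[i]": {"Project[j]", "Project[l]"}}
I2 = {
    "Division[h]": {"Company[i]"},
    "Division[i]": {"Company[i]"},
    "Division[k]": {"Company[k]"},
}
I3 = {
    "Boston": {"Division[h]"},
    "New York": {"Division[i]"},
    "Los Angeles": {"Division[k]"},
}


def solve(indices, key):
    """Scan the last index first; each result set becomes the keys of the previous index."""
    keys = {key}
    for index in reversed(indices):
        result = set()
        for k in keys:
            result |= index.get(k, set())
        keys = result
    return keys


projects = solve([I1, I2, I3], "Los Angeles")
```

Every step is one lookup per intermediate key, so the cost of a query grows with the path length, while an update to one class touches only the single index defined on that class.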

Join Index

• Introduced to perform joins efficiently in the relational model
• Binary join index (BJI) for a binary relation (r, s)
    • one index clustered on r
    • the other index clustered on s
• A BJI can be used in a multi-index organization
    • reverse traversal
    • faster forward traversal when access costs to objects are high, since no database access to the objects is needed
    • more suitable for complex queries

Nested Index

• Direct association between the ending object and the starting object of a path
• Given a path $P = C_1.A_1.A_2.\ldots.A_n$, a nested index on $P$ is defined as a set of pairs $(O, S)$, where
    • $S = \{O' \mid$ there is a CI $O_1.O_2.\ldots.O_{n+1}$ with $O' = O_1$ and $O = O_{n+1}\}$
• Examples
    • (Boston, {Project[j]})
    • (New York, {Project[j], Project[k], Project[l]})
    • (Los Angeles, {Project[i]})

Properties of Nested Index

• Retrieval is quite fast, since only one index is scanned
• Problem with update operations
    • access to several objects is required
    • forward traversal to determine the value of the indexed attribute
    • reverse traversal to determine all instances at the beginning of the path ==> inverse references

Path Index

• Given a key, all the path instantiations are stored
• Given a path $P = C_1.A_1.A_2.\ldots.A_n$, a path index on $P$ is defined as a set of pairs $(O, S)$, where
    • $S = \{\langle j-1 \rangle(p_i) \mid p_i = O_1.O_2.\ldots.O_j\ (1 \le j \le n+1)$ is a CI or non-redundant PI of $P$, and $O_j = O\}$
• Examples
    • (Boston, {Project[j].Company[i].Division[h]})
    • (New York, {Project[j].Company[i].Division[i], Project[k].Company[m].Division[j], Project[l].Company[i].Division[i]})

Properties of Path Index

• Can be used for nested predicates on all classes along the path
• Updates of a path index
    • only forward traversals are required
• Identical to the nested index when n = 1

Access Relations

• Similar to path indices
    • all instantiations along a path are stored in a relation
• Examples
    • <Project[i], Company[k], Division[k], Los Angeles>
    • <Project[j], Company[i], Division[h], Boston>
    • <Project[j], Company[i], Division[i], New York>
    • <Project[k], Company[m], Division[j], New York>
    • <Project[l], Company[i], Division[h], Boston>
    • <Project[l], Company[i], Division[i], New York>
• Several subpaths can be assigned to different relations
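To contrast the nested index and the path index, the sketch below derives both from the six tuples of the access relation above, under the assumption that those tuples are the complete set of instantiations. The variable and function names are hypothetical.

```python
from collections import defaultdict

# The six instantiations of the access relation, assumed to be complete.
ROWS = [
    ("Project[i]", "Company[k]", "Division[k]", "Los Angeles"),
    ("Project[j]", "Company[i]", "Division[h]", "Boston"),
    ("Project[j]", "Company[i]", "Division[i]", "New York"),
    ("Project[k]", "Company[m]", "Division[j]", "New York"),
    ("Project[l]", "Company[i]", "Division[h]", "Boston"),
    ("Project[l]", "Company[i]", "Division[i]", "New York"),
]

nested_index = defaultdict(set)   # key -> starting objects only
path_index = defaultdict(set)     # key -> whole instantiations, without the key
for *objects, key in ROWS:
    nested_index[key].add(objects[0])
    path_index[key].add(tuple(objects))


def lookup(key, position):
    """Objects at a position of the path (0 = Project, 1 = Company, 2 = Division)."""
    return {inst[position] for inst in path_index.get(key, set())}


companies = lookup("New York", 1)
```

The nested index can only answer queries on the first class of the path, while the path index answers them for any class by picking a different position, at the price of storing longer entries.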

Index Structures using B+tree

• Structure of the internal node
    • n records of <key-length, key, pointer>
• A record of a leaf node in a nested index
    • record-length
    • key-length, key-value
    • number of OIDs associated with the key
    • list of OIDs
• A record of a leaf node in a path index
    • record-length
    • key-length, key-value
    • number of the path instantiations associated with the key
    • list of path instantiations
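As an illustration of the nested-index leaf record, the sketch below serializes the four fields in the order listed above. The field widths (4-byte record length, 2-byte key length, 4-byte OID count, 8-byte OIDs) and the function names are assumptions; the slides do not specify them.

```python
import struct


def pack_leaf_record(key, oids):
    """Build record-length | key-length | key-value | number of OIDs | list of OIDs."""
    body = struct.pack("<H", len(key)) + key
    body += struct.pack("<I", len(oids))
    body += struct.pack(f"<{len(oids)}Q", *oids)
    return struct.pack("<I", 4 + len(body)) + body   # the length counts its own field


def unpack_leaf_record(buf):
    """Return (record length, key, OIDs) from a packed leaf record."""
    (rec_len,) = struct.unpack_from("<I", buf, 0)
    (key_len,) = struct.unpack_from("<H", buf, 4)
    key = buf[6:6 + key_len]
    (count,) = struct.unpack_from("<I", buf, 6 + key_len)
    oids = list(struct.unpack_from(f"<{count}Q", buf, 10 + key_len))
    return rec_len, key, oids


record = pack_leaf_record(b"Boston", [11, 12])
```

A path-index leaf record would follow the same pattern, with each list element holding a whole path instantiation (several OIDs) instead of a single OID.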

Operations with Nested Index

• To solve a predicate against a nested attribute $A_n$ of class $C_1$
    • a single index scan
    • same cost as solving the predicate on a simple attribute of $C_1$
• For an update operation
    • one forward traversal to find the old key value
    • another forward traversal to find the new key value
    • one reverse traversal to find the OID of the associated object

Operations with Path Index

• To solve a predicate against the nested attribute $A_n$ of class $C_i$ $(1 \le i \le n)$
    • one index scan
    • determine the PIs or CIs associated with the key value
    • extract the OIDs occupying the i-th position in them
• For an update operation
    • one forward traversal to find the old path instantiation
    • another forward traversal to find the new path instantiation

Comparisons of Index Organizations(1)

• Degree of reference sharing
    • important in evaluating an index organization
    • a reference is shared when two or more objects refer to the same object
• Retrieval operation
    • the nested index has the lowest cost
    • the path index has a lower cost than the multi-index
    • the nested index has better performance than the path index
    • the path index allows predicates to be solved for all the classes along a path, but the nested index does not

Comparisons of Index Organizations(2)

• Update operation
    • the multi-index has the lowest cost
    • for paths of length 2, the nested index has a slightly lower cost than the path index
    • for paths of length greater than 2, the nested index has a slightly lower cost than the path index if the updates are executed on the first two classes
    • in other cases the nested index involves a significantly higher cost

Indexing Techniques for Inheritance Hierarchies

• Scope of a query
    • only a given class C
    • the class C and the inheritance hierarchy rooted in C
• Solution based on conventional indices
    • construct an index on an attribute for each of the classes of the subgraph
    • scan all these indices
    • perform the union of their results

Inherited Index

• By Won Kim et al. in 1989
    • direct support for queries on an inheritance subgraph
    • one index on the common attributes for all classes
    • an index entry contains the identifiers of all the classes in the hierarchy
• A leaf node of an inherited index
    • record length | key length | key value | class directory | # of OIDs (OID1, ..., OIDn) | ...
    • class directory: # of classes, class1 offset, ..., classn offset
• More efficient for all queries whose access scope involves a significant subset of the classes in the hierarchy
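The effect of the class directory can be sketched with nested dictionaries: one index entry per key value, with the OIDs grouped by class. The class hierarchy, the OID strings, and all names below are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical inheritance hierarchy: class -> direct subclasses.
SUBCLASSES = {"Person": ["Employee", "Student"], "Employee": ["Manager"]}


def hierarchy(root):
    """Return the root class followed by all classes of the hierarchy rooted in it."""
    found = [root]
    for sub in SUBCLASSES.get(root, []):
        found += hierarchy(sub)
    return found


class InheritedIndex:
    def __init__(self):
        # key value -> class directory -> OIDs of the instances of that class
        self.entries = defaultdict(lambda: defaultdict(list))

    def insert(self, key, cls, oid):
        self.entries[key][cls].append(oid)

    def search(self, key, root, whole_hierarchy=True):
        """One index scan answers both query scopes: class only, or class plus subclasses."""
        scope = hierarchy(root) if whole_hierarchy else [root]
        directory = self.entries.get(key, {})
        return [oid for cls in scope for oid in directory.get(cls, [])]


index = InheritedIndex()
index.insert(30, "Person", "p1")
index.insert(30, "Manager", "m1")
index.insert(30, "Student", "s1")
index.insert(41, "Employee", "e1")
```

With conventional indices the same query over the whole hierarchy would scan one index per class and take the union of the results.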

Precomputing and Caching

• Index on attributes
• Index on methods
    • precomputing (caching) the results of method invocations
• How to detect when the computed method results are no longer valid?
• Dependency information
    • keeps track of which objects and attributes have been used to compute a given method
    • when an object is modified, all the precomputed results of the methods which have used it are invalidated

Solutions for Dependency Information

• A relation, by Kemper et al. in 1991
    • record <oidi, method-name, <oid1, ..., oidn>>
    • oidi was used to compute the method method-name with input parameters <oid1, ..., oidn>
• By Bertino and Quarati in 1992
    • for local methods, all the dependency information is stored in the object itself
    • for other methods, it is stored in a special object
        • all the objects whose attributes are used in the precomputation of the method have a reference to the special object

Identification of Attributes used in Precomputing

• Static approach
    • inspection of the method implementation determines all attributes that can possibly be used in its execution
    • the system keeps the list of attributes used in the method
    • on modification of an attribute, the system invalidates a method only if it uses the modified attribute
    • the same method precomputed on different objects may use different sets of attributes
• Dynamic approach
    • attributes are determined only when the method is actually precomputed
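The dynamic approach can be sketched by recording every attribute read while a method result is being precomputed. This is a minimal illustration, not one of the cited designs; the class `MethodCache`, the `read` callback, and the sample objects are all hypothetical.

```python
class MethodCache:
    def __init__(self):
        self.results = {}      # (method name, oid) -> precomputed result
        self.dependents = {}   # (oid, attribute) -> cache keys that read it

    def call(self, name, func, oid, objects):
        key = (name, oid)
        if key not in self.results:
            used = set()

            def read(target, attr):
                used.add((target, attr))     # dependency noted at precompute time
                return objects[target][attr]

            self.results[key] = func(oid, read)
            for dep in used:
                self.dependents.setdefault(dep, set()).add(key)
        return self.results[key]

    def write(self, oid, attr, value, objects):
        objects[oid][attr] = value
        # Invalidate only the results that actually read this attribute.
        for key in self.dependents.pop((oid, attr), set()):
            self.results.pop(key, None)


objects = {"d1": {"budget": 100, "head": "p1"}, "p1": {"salary": 40}}
calls = []


def total_cost(oid, read):
    calls.append(oid)
    return read(oid, "budget") + read(read(oid, "head"), "salary")


cache = MethodCache()
first = cache.call("total_cost", total_cost, "d1", objects)    # computed
second = cache.call("total_cost", total_cost, "d1", objects)   # served from the cache
cache.write("p1", "salary", 50, objects)                       # invalidates the result
third = cache.call("total_cost", total_cost, "d1", objects)    # recomputed
```

Tracking at the attribute level avoids invalidating a result when an unrelated attribute of the same object changes; a static approach would instead derive the dependency set from the method's source code.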

Object Identifiers

• An OID is used to
    • refer to an object
    • represent relations between objects
• Physical OID
    • actual address of the object
• Logical OID
    • index from which the address of the object is obtained
• The kind of OID influences the performance of an OODBMS

Types of OID(1)

• Physical address
    • rarely used in OODBMS
• Structured address
    • (segment number, page number), (logical slot number)
    • an object is retrieved with a single page access
    • after an object has been moved to another page, two disk reads are required to retrieve it

Types of OID(2)

• Surrogate OID
    • purely logical
    • not very efficient in object retrieval
    • transformed into an address by a hash function
    • GemStone, POSTGRES
• Typed surrogate OID
    • (Type_ID, OID)
    • similar performance to that of the surrogate OID
    • more difficult to change the type of an object
    • ORION, ITASCA

Length of the OIDs

• Another factor which affects the performance
• 32- to 48-bit OIDs
    • affect the overall size of a DB
    • 32-bit OIDs can identify about four thousand million objects ($2^{32} \approx 4.3 \times 10^9$)
• 64-bit OIDs in the following situations
    • the OID must be unique for the entire life of the object
    • surrogates are generated by a monotonically increasing function
    • distributed environment

Swizzling

• Transformation of an OID into a memory address when the object is retrieved from disk into main memory
• Advantage
    • increases the speed of navigation between objects using OIDs
• Disadvantages
    • costly process
    • not the best solution for infrequently referenced objects

Alternatives to Swizzling(1)

• Tables mapping OIDs to object memory addresses
    • when objects will be swapped out with high probability
    • when the references are not used frequently
    • Objectivity/DB
• Combining swizzling with disk imaging
    • the main memory address is physically written over the field of the object which contains the OID
    • before writing the object back to disk, all the swizzled OIDs must be transformed back into OIDs

Alternatives to Swizzling(2)

• Maintenance of the OIDs in the swizzled format
    • at the creation of the object
        • it is assigned a fixed address in adjacent segments of virtual memory (VM)
    • at the loading of the object into main memory
        • map the object to the same virtual memory address
        • if impossible, the object on the page is transformed to be placed at another VM address
    • limits the total number of objects in the database to the maximum size of the VM
    • ObjectStore

When to Execute Swizzling

• The first time an application retrieves an object from disk
• The first time a reference has to be followed
• On application request, by an explicit call to the OODBMS at run-time
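The second option, swizzling the first time a reference is followed, can be sketched as follows. The classes `Ref` and `Session`, the dictionary standing in for the disk, and the OID string are hypothetical; a real system would overwrite a pointer field rather than set a Python attribute.

```python
class Ref:
    """A reference that holds an OID until it is swizzled into a direct pointer."""

    def __init__(self, oid):
        self.oid = oid
        self.target = None       # the in-memory object once swizzled


class Session:
    def __init__(self, disk):
        self.disk = disk         # oid -> stored fields (stand-in for the database)
        self.resident = {}       # oid -> in-memory object (the OID mapping table)
        self.lookups = 0         # counts OID-to-object translations

    def load(self, oid):
        self.lookups += 1
        if oid not in self.resident:
            self.resident[oid] = dict(self.disk[oid])   # fetch a copy from "disk"
        return self.resident[oid]

    def deref(self, ref):
        if ref.target is None:                # swizzle on first use
            ref.target = self.load(ref.oid)
        return ref.target                     # later calls skip the table lookup

    def unswizzle(self, ref):
        ref.target = None                     # only the OID goes back to disk


disk = {"oid:7": {"name": "Jones"}}
session = Session(disk)
head = Ref("oid:7")
name = session.deref(head)["name"]    # first traversal pays for the translation
again = session.deref(head)["name"]   # second traversal follows the pointer directly
```

References that are never followed are never translated, which addresses the cost concern for infrequently referenced objects, at the price of one check on every dereference.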