m.frank lhcb/cern - in behalf of the lhcb gaudi team data persistency solution for lhcb ã...
TRANSCRIPT
M.Frank LHCb/CERN - In behalf of the LHCb GAUDI team
Data Persistency Solution for LHCb
Motivation Data access Generic model Experience & Conclusions
M.Frank LHCb/CERNGAUDI
Motivation
Physics software should be independent of the underlying data storage technology
Data of different nature has to be accessed Event data, detector data, statistical data, … The access patterns differ The data set size varies from several Mbytes to 1 Pbyte Legacy data was written in ZEBRA format
It is unclear how these data will be stored Locking into one technology may be a disadvantage
M.Frank LHCb/CERNGAUDI
Strategy
Transient data representation is separated from the persistent data representation Each representation can be optimized separately Transient representation can be used to convert to any
other representation
Minimize coupling between algorithms and the transient data Algorithms see only transient data Transient data items are not intelligent Algorithms post and retrieve transient data from a
“black-board”, the data store
M.Frank LHCb/CERNGAUDI
How are Data Accessed?
Event DataService
Transient Event Store
Detec. DataService
Transient Detector
Store
HistogramService
TransientHistogram
Store
AlgorithmAlgorithm
Algorithm
M.Frank LHCb/CERNGAUDI
Functionality of Data Stores
Manage data objects like a librarian Clients store objects Other clients pick up objects when needed Retrieve object collections
Manage ownership Does cleanup
Manage objects of similar lifetime
M.Frank LHCb/CERNGAUDI
Structure of the Data Store
Tree - similar to file system
Tree nodehas data members (payload)contains other node objects
(directory structure)
Identification by logical addresses: ”/Event/Mc/MCParticles”
Browse capability
M.Frank LHCb/CERNGAUDI
Symbolic linksC
D
/node2
/node2/D
/node2/C
Data Members
Data Object
Layout of the Data Object
/node1
/node1/B
/node1/A
Data Members Node objects/node1/A/A2
/node1/A/A1
Data Object
M.Frank LHCb/CERNGAUDI
Store Dynamics: Loading
(5) Register
DataService
Algorithm
(1) retrieveObject(…)
Try to accessan object data
(2) Search in Store
Data Store
Will be unsuccessful,requested object is
not present
ConverterConverterConverterConverter
(4) Request creation
(3) Request load
PersistencyService
ConversionService
Request dispatcher
Objy, SICB, ROOT,..
M.Frank LHCb/CERNGAUDI
Store Dynamics: Storing
ConverterConverterConverter
ConversionService
(2) Create persistent representations
PersistencyService
Output Stream
Criteria
(1) Collect references
Data Store
M.Frank LHCb/CERNGAUDI
Generic Persistent Model
Objects & pointers Objects, object IDs, collections & DBs
Transient PersistentC++ pointer >> object ID
M.Frank LHCb/CERNGAUDI
Database Technologies
Identify commonalties and differencesNecessary knowledge when reading/writing
Generic ZEBRA ROOT RDBMS ObjyDatabase File File Database DatabaseCollection Bank Tree/Branch Table Container
W
rite
I tem ID Record # Event # Prim.Key
DatabaseCollection
Read
I tem IDAs for writing OID
RDBMS: More or less traditional Objy is different: How to match ?
M.Frank LHCb/CERNGAUDI
Generic Model: Assumptions
Data Store ConversionSvc
DiskStoragedatabasedatabaseDatabase
ContainerContainerContainer
ObjectsObjectsObjectsObjectsObjectsObjectsObjectsObjects
Class ID,Object path
Storage type
DB name
Cont.nameItem ID
M.Frank LHCb/CERNGAUDI
Storage TypeClass IDEntry IDLink ID
Converter
Technology
XID
(2)(3)(4)
(1)
Link ID Link Info
DB/Cont.name,...... ...
<number>
Lookup table
Generic Model: References
M.Frank LHCb/CERNGAUDI
Generic Model: Extended Object ID
Storage Type Class ID
Storage Type dependent part
OR
OID
Link ID Record ID
Objectivity
ZEBRAROOTRDBMS
M.Frank LHCb/CERNGAUDI
Experience & Conclusions
It is possible to write physics data without knowledge of the underlying store technology
Our approach can adopt any technology based on database files, collections and objects within collectionsZEBRA, ROOT, Objy and RDBMSWe are able to choose technologies according to needs
Overhead of transient-persistent separation looks manageable
http://lhcb.cern.ch/computing/Components/html/GaudiMain.html