© 2007 Open Grid Forum
GGF19, 1st February 2007
OGSA Data Architecture Services
Dave Berry & Allen Luniewski
OGF IPR Policies Apply
• “I acknowledge that participation in this meeting is subject to the OGF Intellectual Property Policy.”
• Intellectual Property Notices Note Well: All statements related to the activities of the OGF and addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to the OGF and its participants certain licenses and rights in such statements. Such statements include verbal statements in OGF meetings, as well as written and electronic communications made at any time or place, which are addressed to:
  • the OGF plenary session,
  • any OGF working group or portion thereof,
  • the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
  • the ADCOM, or any member thereof on behalf of the ADCOM,
  • any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
  • the OGF Editor or the document authoring and review process
• Statements made outside of an OGF meeting, mailing list or other function, that are clearly not intended to be input to an OGF activity, group or function, are not subject to these provisions.
• Excerpt from Appendix B of GFD-C.1: “Where the OGF knows of rights, or claimed rights, the OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance that upon approval by the GFSG of the relevant OGF document(s), any party will be able to obtain the right to implement, use and distribute the technology or works when implementing, using or distributing technology based upon the specific specification(s) under openly specified, reasonable, non-discriminatory terms. The working group or research group proposing the use of the technology with respect to which the proprietary rights are claimed may assist the OGF secretariat in this effort. The results of this procedure shall not affect advancement of document, except that the GFSG may defer approval where a delay may facilitate the obtaining of such assurances. The results will, however, be recorded by the OGF Secretariat, and made available. The GFSG may also direct that a summary of the results be included in any GFD published containing the specification.”
• OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that support the Internet Standards Process.
Contents
• Current Status
• Architecture Document
• Scenarios Document
• Storage Management Discussion
Two Informational Documents
• OGSA Data Architecture
  • 70+ pages
  • Describes services and their interfaces
• OGSA Data Scenarios
  • 50+ pages
  • Describes how the services can be composed to address particular scenarios
Progress since GGF18
• Slow progress
• Focus on adding interfaces to Arch.
  • Mostly complete; review/editing in progress
• Scenarios Document
  • Scenario refinement
  • Integration w/ Architecture document
  • Preliminary work on EMS Scenario
  • Many thanks to Stephen Davey
Current State
• Architecture Document
  • Technical content substantially complete
  • Review of interface work in progress
• Scenarios Document
  • Some scenarios still need work
  • Adding EMS scenario (next session)
• Dates for public comment missed
Architecture Document
Current Scope
• Files and databases (& storage)
  • Not streams, sessions, …
• Services and interfaces
  • Storage, Access, Transfer
  • Replication, Caching, Federation, Metadata catalogues
• Cross-cutting themes
  • Security, Policies, …
• Part of the bigger OGSA picture
  • E.g. Naming, Workflow, Transactions, Scheduling, Provisioning, …
Architecture Document
• Services
  • Data Transfer
  • Data Access
  • Storage Resource Management
  • Data Cache
  • Data Replication
  • Data Federation
  • Metadata Catalogues
• Appendices
  • Specifications referenced
  • Mappings to specifications
    • DAIS
    • ByteIO
    • SRM
    • DMI
    • …
  • Glossary
Data Transfer
[Diagram. Components: User, Transfer Factory, Transfer Instance, Source, Sink. Arrow labels: Create, Control, EPR. Transport between Source and Sink: e.g. GridFTP.]
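The factory arrangement named on this slide can be illustrated with a small sketch. It is not an interface taken from the OGSA Data Architecture document: the class names `TransferFactory` and `TransferInstance`, the `create`, `resolve` and `control` methods, the `urn:example:` string standing in for an EPR, and the in-memory copy standing in for a protocol such as GridFTP are all assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass


@dataclass
class TransferInstance:
    """One transfer between a source and a sink (hypothetical model)."""
    source: bytearray
    sink: bytearray
    state: str = "created"

    def control(self, command: str) -> str:
        # Two illustrative commands only; a real service would offer more.
        if command == "start" and self.state == "created":
            self.sink[:] = self.source  # stand-in for e.g. a GridFTP transfer
            self.state = "done"
        elif command == "cancel" and self.state == "created":
            self.state = "cancelled"
        else:
            raise ValueError(f"cannot {command!r} in state {self.state!r}")
        return self.state


class TransferFactory:
    """Creates transfer instances and hands back a reference to each one."""

    def __init__(self) -> None:
        self._instances = {}
        self._ids = itertools.count(1)

    def create(self, source: bytearray, sink: bytearray) -> str:
        # The returned string plays the role of the EPR (hypothetical format).
        epr = f"urn:example:transfer:{next(self._ids)}"
        self._instances[epr] = TransferInstance(source, sink)
        return epr

    def resolve(self, epr: str) -> TransferInstance:
        return self._instances[epr]


# The user creates a transfer, then controls it through the reference.
factory = TransferFactory()
source, sink = bytearray(b"payload"), bytearray()
epr = factory.create(source, sink)
final_state = factory.resolve(epr).control("start")
```

The point of the split is that the user only ever holds a reference: the factory owns creation, and the instance owns the lifetime of one transfer.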
Replication
• Factory
  • CreateReplica()
• Replica
  • Management Process + Targets
  • ModifyReplicatedContents()
  • SynchroniseReplica()
  • ValidateReplica()
  • DestroyReplica()
• Replica Catalogue
  • Maps name to list of (replica, target) pairs
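The catalogue mapping described above (a name to a list of (replica, target) pairs) can be sketched as a small in-memory structure. This is an assumption-laden illustration rather than the catalogue interface from the architecture document: the class `ReplicaCatalogue`, its `register`, `lookup` and `remove` methods, and the example names and URLs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (replica identifier, target location)


class ReplicaCatalogue:
    """Maps a logical name to a list of (replica, target) pairs."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Pair]] = defaultdict(list)

    def register(self, name: str, replica: str, target: str) -> None:
        pair = (replica, target)
        if pair not in self._entries[name]:  # ignore duplicate registrations
            self._entries[name].append(pair)

    def lookup(self, name: str) -> List[Pair]:
        # Return a copy so callers cannot alter the catalogue by accident.
        return list(self._entries.get(name, []))

    def remove(self, name: str, replica: str) -> None:
        kept = [p for p in self._entries.get(name, []) if p[0] != replica]
        self._entries[name] = kept


catalogue = ReplicaCatalogue()
catalogue.register("survey-2006", "replica-a", "gsiftp://site1.example/data")
catalogue.register("survey-2006", "replica-b", "gsiftp://site2.example/data")
pairs = catalogue.lookup("survey-2006")
catalogue.remove("survey-2006", "replica-a")
remaining = catalogue.lookup("survey-2006")
```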
Gaps: Need people to work on
• Information model for data resources
  • For management, resource reservation, …
• File metadata
• URI Registries
• Security extensions
• Integration of access and transfer
  • 3rd-party delivery
• Policies
  • Replication coherency, caching coherency, catalogue consistency, etc.
• Sessions
• Provisioning, etc.
• Review!
Scenarios
Scenarios document
• Example scenarios of a generic nature
• Illustrates how the services and interfaces described in the OGSA Data Architecture document can be put together in a selection of typical data scenarios.
• Not a use case document generating requirements.
Scenarios
• Data replication
• Data pipelining
• Data integration
• Personal data service
• Peer to peer data discovery
• Data storage
• Data provenance
• Grid file system
• Data transfer
• Data staging
Work to be Done
• Add EMS Scenario (next session)
• Review & update scenarios
• Needs an editor
Storage Management Discussion
Storage Management
• Current section derived from SRM spec.
• It has concepts of:
  • Storage (e.g., disks, volumes)
  • Files
  • Directories
  • File spaces & systems
• Is this appropriate for the architecture?
Storage Management (1)
• Directory Management Functions
  • Synchronous & asynchronous
  • List Files
  • Release File Locks
  • Remove Files
  • Copy Files
  • Move Files
  • Make Directory
  • Delete Directory
Storage Management (2)
• Space Management Functions:
  • Reserve Space()
  • Get Space()
  • Release Space()
  • Set Quota()
• Sink / Source
  • Covered by Transfer
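To make the four space management functions easier to discuss, here is a minimal sketch of how they might fit together. The slide gives only the function names, so the semantics below are assumptions: reservations are identified by a token, `get_space` returns the owner and size of a reservation, quotas are per owner, and the class name `SpaceManager` is hypothetical. None of this is taken from the SRM specification.

```python
import itertools
from typing import Dict, Optional, Tuple


class SpaceManager:
    """Toy space manager for one storage resource (assumed semantics)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._quotas: Dict[str, int] = {}
        self._reservations: Dict[str, Tuple[str, int]] = {}
        self._tokens = itertools.count(1)

    def _used(self, owner: Optional[str] = None) -> int:
        # Total reserved space, optionally restricted to one owner.
        return sum(size for who, size in self._reservations.values()
                   if owner is None or who == owner)

    def set_quota(self, owner: str, limit: int) -> None:
        self._quotas[owner] = limit

    def reserve_space(self, owner: str, size: int) -> str:
        if self._used() + size > self.capacity:
            raise RuntimeError("storage is full")
        quota = self._quotas.get(owner)
        if quota is not None and self._used(owner) + size > quota:
            raise RuntimeError("quota exceeded")
        token = f"space-{next(self._tokens)}"
        self._reservations[token] = (owner, size)
        return token

    def get_space(self, token: str) -> Tuple[str, int]:
        return self._reservations[token]

    def release_space(self, token: str) -> None:
        del self._reservations[token]


manager = SpaceManager(capacity=100)
manager.set_quota("alice", 60)
token = manager.reserve_space("alice", 40)
owner, size = manager.get_space(token)
manager.release_space(token)
```

Whether tokens, per-owner quotas, or something else entirely belongs in the architecture is the question the slide raises; the sketch only shows one possible reading.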
Discussion
Questions?
Full Copyright Notice
Copyright (C) Open Grid Forum 2006. All Rights Reserved.
This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
The limited permissions granted above are perpetual and will not be revoked by the OGF or its successors or assignees.
Backup
Data Pipelining
[Diagram. Components: Customer1, Customer2, Rendering Service, Visualisation Service, Data Access Service, Data Transfer Service. Data: Completed Animations.]
1. Submit job.
2. Store results.
3. Transfer results.
4. Return results.
Data Storage – Bringing data online
[Diagram. Components: Customer, Data Storage Service, Transfer Service, Storage Devices (Nearline Storage, Online Storage).]
1. Make files online.
2. Transfer files.
3. Retire to nearline.
Data Replication – 1
[Diagram. Components: Customer1, Customer2, Registry Service, Replication Service, Data Transfer Service, Data Storage 1, Data Storage 2, Data Access Service 1, Data Access Service 2.]
1a. Register data
1b. Publish
2. Transfer copies
3. Find data
4. Access data
5. Notify
6. Update
Data Replication – 2
[Diagram. Components: Customer1, Customer2, Data Service, Replication Service, Replica Catalogue Service, Data Transfer Service, Data Storage 1, Data Storage 2, Data Access Service 1, Data Access Service 2.]
1. Register
2. Transfer copies
3. Find data
4. Access data
5. Notify
6. Update
Joint OGSA Data + EMS Scenario
• The steps of this simple scenario are as follows:
1. Submit job to BES container. (JSDL contains execution & data staging info).
2. Use data transfer service to do the required data staging.
3. Run the executable on the BES container with the input data.
4. Stage result output data back to Data Service 1.
5. Delete staged input data at BES container.
6. Delete staged output data at BES container.
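The six steps above can be traced in a short sketch. It is a simplified model under stated assumptions, not an implementation of BES, JSDL or the data transfer service: plain dictionaries stand in for the data services and the BES container, the function arguments stand in for the staging information a JSDL document would carry, and the name `run_staged_job` is hypothetical.

```python
from typing import Callable, Dict

Store = Dict[str, bytes]  # a named collection of data items


def run_staged_job(
    stage_in: Dict[str, str],    # item name -> data service holding it
    stage_out: Dict[str, str],   # item name -> data service to receive it
    executable: Callable[[Store], Store],
    data_services: Dict[str, Store],
    container: Store,
) -> None:
    # Step 1: submitting the job is modelled by this call; the arguments
    # stand in for the execution and data staging info in the JSDL.
    # Step 2: stage each input from its data service into the container.
    for item, service in stage_in.items():
        container[item] = data_services[service][item]
    # Step 3: run the executable on the staged inputs and keep its outputs.
    outputs = executable({item: container[item] for item in stage_in})
    container.update(outputs)
    # Step 4: stage the outputs back to the requested data service.
    for item, service in stage_out.items():
        data_services[service][item] = container[item]
    # Step 5: delete the staged input copies at the container.
    for item in stage_in:
        container.pop(item, None)
    # Step 6: delete the staged output data at the container.
    for item in stage_out:
        container.pop(item, None)


services = {"data-service-1": {"input.txt": b"hello"}}
container: Store = {}
run_staged_job(
    stage_in={"input.txt": "data-service-1"},
    stage_out={"output.txt": "data-service-1"},
    executable=lambda inputs: {"output.txt": inputs["input.txt"].upper()},
    data_services=services,
    container=container,
)
```

After the call, the result sits in the data service and the container holds no staged data, which is the end state the scenario describes.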
Data Staging
[Diagram. Components: Client, Data Transfer Service, BES Container, Data Service 1, Data Service 2. Data: Input Data, Input Data (copy), Output Data, Output Data (copy).]
1. Submit JSDL script.
2a. Stage input data.
2b. Transfer input data.
3. BES Container: run executable & save resulting output data.
4a. Stage output data.
4b. Transfer output data.
5. Delete input data (copy).
6. Delete output data.
Personal Data Service
[Diagram. Components: Customer 1 (site 1), Customer 1 (site 2), Personal Data Service, Registry Service, Global Name Resolver Service, Local Cache Service 1, Local Cache Service 2, and Data Service 1, Data Service 2, Data Service 3, each with an Index.]
1. Locate data.
2. Create named space.
3. Name collection.
4. Use named space.
5. Update.
6. Use named space.
7. Update.