transaction-based grid data replication using ogsa-dai presented by yin chen february 2007
Transaction-based Grid Data Replication Using OGSA-DAI
Presented by Yin Chen
February 2007
What is replication?
• Initial copying of data & synchronization of updates
• Is not caching
– Caching is a client-side phenomenon
– Caching only improves response time
• Is not a backup
– A backup is not automatically overwritten when the original data is modified
– A backup normally cannot be accessed directly
Why do we need it?
• Data consolidation (central audit & analysis)
• Data distribution (for branch labs)
• Performance
– Access efficiency (moving data near processing)
– Load balancing (distributing access load)
– Security (data protection)
– Availability (off-line access)
– Reliability (disaster recovery, avoiding a single point of failure)
Challenges of Grid database replication
• How to copy large data sets among heterogeneous DBs
• How to maintain the consistency of data in a highly distributed network environment
• How to discover & self-repair the dead parts
Problems of existing technologies
• Existing Grid “replication” systems
– E.g. the EDG replica manager / the Globus data replication service / SRB
– Support copying of large datasets
– Yet, deal only with files
– Too simple (e.g. no support for updating, database replication, etc.)
– Not consistent
• Relational database replication tools
– E.g. Oracle / Sybase / DB2 / MySQL replication
– Very flexible (e.g. partial copy, bi-directional update)
– Yet, not suited to virtual organizations (e.g. can’t copy large data / difficult to search for replicas)
Architecture
[Figure: architecture diagram. Labelled components: Metadata Catalogue, Relational Database Replication Mechanism, Replication Control Service, Transfer Service, Data Resource, Data Replica. Arrows show data flow directions.]
Replication control workflow
[Figure: workflow diagram. Labelled components: Request, Replication Control Service, Metadata Search Engine, Metadata Register, Initiator, Selector, Starter, Metadata Catalogue, Relational Database Replication Mechanism, Transfer Service, Data Resource, Replication Target.]
OGSA-DAI activities (ongoing)
• High-level APIs to interact with relational replication mechanisms:
– CreateReplicaDatabase()
– DropReplicaDatabase()
– ConfigReplication()
– CleanUp() -- to clean up replication configuration
– StartReplication()
– StopReplication()
– MonitorReplication() -- to check the status of each process
• Control the workflow of data replication, i.e.
    sequence.addChild(createDB2RelicaDB);
    sequence.addChild(configDB2Replication);
    sequence.addChild(startDB2Replication);
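The sequencing idea in the last bullet can be illustrated with a small, self-contained sketch. It is not OGSA-DAI code: `Activity`, `NamedActivity`, `Sequence` and `ReplicationWorkflowSketch` are hypothetical stand-ins written for this example, and the step names are simply borrowed from the API list above. It only shows that children added to a sequence run in the order they were added.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the OGSA-DAI classes.
interface Activity {
    void run(List<String> log);
}

// A named step that records its own name when it is executed.
class NamedActivity implements Activity {
    private final String name;

    NamedActivity(String name) {
        this.name = name;
    }

    @Override
    public void run(List<String> log) {
        log.add(name);
    }
}

// A composite activity that runs its children strictly in insertion order.
class Sequence implements Activity {
    private final List<Activity> children = new ArrayList<>();

    void addChild(Activity child) {
        children.add(child);
    }

    @Override
    public void run(List<String> log) {
        for (Activity child : children) {
            child.run(log);
        }
    }
}

class ReplicationWorkflowSketch {
    // Builds the three-step workflow from the slide and returns the execution order.
    static List<String> runWorkflow() {
        Sequence sequence = new Sequence();
        sequence.addChild(new NamedActivity("CreateReplicaDatabase"));
        sequence.addChild(new NamedActivity("ConfigReplication"));
        sequence.addChild(new NamedActivity("StartReplication"));

        List<String> log = new ArrayList<>();
        sequence.run(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runWorkflow());
    }
}
```

In a real deployment each step would call the underlying replication mechanism rather than append to a log; the sketch leaves that part out on purpose.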
IBM DB2 SQL Replication
• Admin: creates the replication criteria, stored in control tables
• Capture: uses the log or triggers to capture changes into a temp table
• Apply: on a schedule, applies the accumulated transactions to the target DB
• Alert Monitor: monitors and notifies users
• Supports: after-image copy / before-image copy (can roll back)
• Allows copying of subsets, simple views, and complex joins & unions
• Asynchronous replication, allows specifying a schedule
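The Capture and Apply stages above amount to asynchronous, staged replication. The sketch below models that idea in memory, under loud assumptions: it does not use or mirror any DB2 API, the class `CaptureApplySketch` and its members are invented for this example, and plain maps stand in for the source table, the temp table, and the target DB.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model only; it does not use or mirror any DB2 API.
class CaptureApplySketch {
    // One captured change, holding both the before-image and the after-image.
    static final class Change {
        final String key;
        final String before; // null if the key did not exist yet
        final String after;

        Change(String key, String before, String after) {
            this.key = key;
            this.before = before;
            this.after = after;
        }
    }

    final Map<String, String> source = new LinkedHashMap<>();
    final Map<String, String> target = new LinkedHashMap<>();
    final List<Change> staging = new ArrayList<>(); // stands in for the temp table

    // Capture: every write to the source also records a change in staging.
    void write(String key, String value) {
        String before = source.put(key, value);
        staging.add(new Change(key, before, value));
    }

    // Apply: meant to be called on a schedule; replays the accumulated
    // changes in order against the target, then empties the staging area.
    int applyPending() {
        int applied = staging.size();
        for (Change change : staging) {
            target.put(change.key, change.after);
        }
        staging.clear();
        return applied;
    }

    public static void main(String[] args) {
        CaptureApplySketch replication = new CaptureApplySketch();
        replication.write("a", "1");
        replication.write("a", "2");
        System.out.println("target before apply: " + replication.target);
        replication.applyPending();
        System.out.println("target after apply: " + replication.target);
    }
}
```

The target lags behind the source until `applyPending()` runs, which is the asynchronous behaviour the slide describes. The before-image is stored but unused here; a fuller model could use it to undo a change.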
Features
• Combine relational database replication with Grid technologies, to gain benefits from both
– Keep the features of relational database replication
– Support more scalable, secure, high-performance data access
• Explore the abilities of OGSA-DAI to control workflows
Information
• Project members:
– Dave Berry (NeSC, UK)
– Patrick Dantressangle (IBM, Hursley)
– Yin Chen (NeSC, UK)
– Simon Laws (IBM, Hursley)
• Project website:
– http://www.aiai.ed.ac.uk/~ychen/ibm_ogsadai/ibm-ogsadai-index.html