8/25/2005 IEEE PacRim 2005 1
The Design Concept and Initial Implementation of AgentTeamwork Grid Computing Middleware
Munehiro Fukuda
Computing & Software Systems, University of Washington, Bothell
Koichi Kashiwagi, Shinya Kobayashi
Computer Science, Ehime University
Funded by
Background
Most grid-computing systems
    Centralized resource/job management
    Two drawbacks
        A powerful central server is essential to manage all slave computing nodes.
        Applications are based on the master-slave or parameter-sweep model.
Mobile agents
    An execution model previously highlighted as a prospective infrastructure of distributed systems.
    No more than an alternative approach to centralized grid middleware implementation.
Our motivation
    Decentralized job distribution and coordination
    Decentralized fault tolerance
    Applications based on a variety of communication models
Objective
A mobile agent execution platform fitted to grid computing
    Allowing an agent to identify which MPI rank to handle and which agent to send a job snapshot to.
A fault-tolerant inter-process communication
    Recovering lost messages.
    Allowing over-gateway connections.
Agent-collaborative algorithms for job coordination
    Allocating computing nodes in a distributed manner.
    Implementing decentralized snapshot maintenance and job recovery.
System Overview
[Figure: system overview diagram. Labels: FTP server; User A and User B; commander agents; sentinel agents; resource agents; bookkeeper agents; User A's processes and User B's process, each running inside a user program wrapper with snapshot methods and GridTcp; TCP communication between processes; snapshots; results.]
Execution Layer
[Figure: layered execution stack, listed here from top to bottom.]
    Java user applications
    mpiJava API
    mpiJava-A | mpiJava-S
    GridTcp | Java socket
    User program wrapper
    Commander, resource, sentinel, and bookkeeper agents
    UWAgents mobile agent execution platform
    Operating systems
UWAgents Execution Platform
An agent domain is created for each submission from the Unix shell.
The number of children each agent can spawn is given upon the initial submission.
No name server
Messages are forwarded through an agent tree.
A user job is scheduled as a thread, using suspend/resume.
[Figure: a user runs UWInject, which submits a new agent from the shell to a UWPlace. Two agent domains are shown: (time = 3:30pm, 8/25/05, ip = medusa.uwb.edu, name = fukuda) and (time = 3:31pm, 8/25/05, ip = perseus.uwb.edu, name = fukuda). In the domain submitted with -m 4, agent id 0 has children id 1, id 2, and id 3, below which ids 4 to 7, 8 to 11, and 12 appear. In the domain submitted with -m 3, agent id 0 has children id 1 and id 2. One agent is shown carrying a user job.]
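The agent ids in the figure are consistent with a simple numbering rule, which is one way agents could locate each other without a name server. The sketch below is an assumption inferred from the ids shown on these slides, not a rule stated in the talk: with at most m children per agent, the root (id 0) spawns ids 1 to m-1, and any other agent i spawns ids i*m to i*m+m-1. The function names are hypothetical.

```python
from __future__ import annotations


def child_ids(agent_id: int, m: int) -> list[int]:
    """Ids an agent may give its children, under the inferred rule."""
    if m < 2:
        raise ValueError("m must be at least 2")
    if agent_id == 0:
        # For the root, 0 * m + 0 would collide with id 0 itself, so start at 1.
        return list(range(1, m))
    return list(range(agent_id * m, agent_id * m + m))


def parent_id(agent_id: int, m: int) -> int:
    """Inverse of child_ids: integer division recovers the parent id."""
    if agent_id == 0:
        raise ValueError("the root agent has no parent")
    return agent_id // m


def path_to_root(agent_id: int, m: int) -> list[int]:
    """Chain of agent ids from an agent up the tree to the root (id 0)."""
    path = [agent_id]
    while path[-1] != 0:
        path.append(parent_id(path[-1], m))
    return path


if __name__ == "__main__":
    print(child_ids(0, 4))      # children of the root with -m 4
    print(child_ids(0, 3))      # children of the root with -m 3
    print(path_to_root(34, 4))  # route a message would climb from id 34
```

Under this assumption, an agent can compute its parent and children from its own id and m alone, which fits the slide's point that messages travel through the agent tree rather than through a name server.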
Job Distribution
[Figure: tree of agents created for one job. Labels: user; job submission; XML query; spawn; snapshot. Legend: id = agent id, rank = MPI rank. The agents shown are listed below.]
    Agent        id    rank
    Commander     0    (none)
    Resource      1    (none; shown with eXist)
    Sentinel      2    0
    Sentinel      8    1
    Sentinel      9    2
    Sentinel     10    3
    Sentinel     11    4
    Sentinel     32    5
    Sentinel     33    6
    Sentinel     34    7
    Bookkeeper    3    0
    Bookkeeper   12    1
    Bookkeeper   13    2
    Bookkeeper   14    3
    Bookkeeper   15    4
    Bookkeeper   48    5
    Bookkeeper   49    6
    Bookkeeper   50    7
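The id/rank pairs on this slide suggest how an agent could work out which MPI rank to handle from its own id. The sketch below is an inference from those pairs only: it assumes ranks are assigned breadth-first within the sentinel subtree (rooted at id 2) and the bookkeeper subtree (rooted at id 3), with m = 4 children per agent. The function name `rank_of` is hypothetical.

```python
def rank_of(agent_id: int, root_id: int, m: int) -> int:
    """Breadth-first rank of agent_id in the subtree rooted at root_id."""
    if m < 2 or root_id < 1:
        raise ValueError("need m >= 2 and a subtree root other than id 0")
    level_first = root_id  # smallest id on the current tree level
    level_size = 1         # how many ids the current tree level holds
    ranks_before = 0       # ranks consumed by all shallower levels
    # Descend one level at a time until agent_id falls at or below this level.
    while agent_id >= level_first + level_size:
        ranks_before += level_size
        level_first *= m
        level_size *= m
    if agent_id < level_first:
        raise ValueError("agent id does not belong to this subtree")
    return ranks_before + (agent_id - level_first)


if __name__ == "__main__":
    for sentinel in (2, 8, 9, 10, 11, 32, 33, 34):
        print("sentinel", sentinel, "rank", rank_of(sentinel, root_id=2, m=4))
    for bookkeeper in (3, 12, 13, 14, 15, 48, 49, 50):
        print("bookkeeper", bookkeeper, "rank", rank_of(bookkeeper, root_id=3, m=4))
```

With this mapping a sentinel and the bookkeeper holding its snapshots share the same rank, so each side can derive the other's id without any lookup service. That property matches the slide's pairs, but whether AgentTeamwork computes it this way is not confirmed here.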
Resource Allocation
[Figure: resource allocation. Labels: user; job submission; commander (id 0); resource agent (id 1) with eXist; spawn; sentinels (ranks 0 and 1) and bookkeepers placed on the allocated nodes.]
An XML query specifies: CPU architecture, OS, memory, disk, total nodes, and multiplier.
The resource agent returns a list of available nodes (total nodes x multiplier).
Case 1: total nodes = 2, multiplier = 1.5 (nodes 0 to 2 shown)
Case 2: total nodes = 2, multiplier = 3 (nodes 0 to 5 shown)
Nodes not hosting an agent are marked "Future use".
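The two cases above reduce to a small calculation: the number of nodes requested is total nodes times the multiplier. The sketch below illustrates that arithmetic and a possible shape for the XML query. The element names, the megabyte units, and rounding up for non-integer products are all assumptions made for illustration; the slide gives only the list of query fields.

```python
from __future__ import annotations

import math
import xml.etree.ElementTree as ET


def nodes_to_request(total_nodes: int, multiplier: float) -> int:
    """Total nodes x multiplier, rounded up (rounding is an assumption)."""
    if total_nodes < 1 or multiplier < 1:
        raise ValueError("total_nodes and multiplier must both be at least 1")
    return math.ceil(total_nodes * multiplier)


def build_query(cpu_architecture: str, os_name: str, memory_mb: int,
                disk_mb: int, total_nodes: int, multiplier: float) -> str:
    """Serialize the six query fields named on the slide as XML text."""
    query = ET.Element("resourceQuery")  # hypothetical element name
    fields = {
        "cpuArchitecture": cpu_architecture,  # hypothetical tag names below
        "os": os_name,
        "memory": memory_mb,
        "disk": disk_mb,
        "totalNodes": total_nodes,
        "multiplier": multiplier,
    }
    for tag, value in fields.items():
        ET.SubElement(query, tag).text = str(value)
    return ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    print(nodes_to_request(2, 1.5))  # Case 1 on the slide
    print(nodes_to_request(2, 3))    # Case 2 on the slide
    print(build_query("x86", "linux", 512, 1024, 2, 1.5))
```

Requesting more nodes than the job strictly needs leaves spares, which is consistent with the "Future use" labels in the figure.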
Job Resumption by a Parent Sentinel
(0) Send a new snapshot periodically
(1) Detect a ping error
(2) Search for the latest snapshot
(3) Retrieve the snapshot
(4) Send a new agent
(5) Restart a user program
[Figure: sentinel id 2 (rank 0) with child sentinels id 8 (rank 1), id 9 (rank 2), id 10 (rank 3), and id 11 (rank 4), linked by MPI connections; bookkeeper id 15 (rank 4); a new sentinel id 11 (rank 4) replaces the failed one.]
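Steps (0) to (5) can be sketched as a small in-memory simulation. Everything here is illustrative: the `Bookkeeper` and `Sentinel` classes, the sequence numbers on snapshots, and the fallback to an empty state when no snapshot exists are assumptions, not the AgentTeamwork implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Bookkeeper:
    """Holds the snapshots it has received, keyed by MPI rank (step 0)."""
    snapshots: dict[int, list[tuple[int, bytes]]] = field(default_factory=dict)

    def store(self, rank: int, seq: int, state: bytes) -> None:
        self.snapshots.setdefault(rank, []).append((seq, state))

    def latest(self, rank: int) -> tuple[int, bytes] | None:
        """Steps 2 and 3: find and return the newest snapshot, if any."""
        entries = self.snapshots.get(rank)
        if not entries:
            return None
        return max(entries, key=lambda entry: entry[0])


@dataclass
class Sentinel:
    agent_id: int
    rank: int
    alive: bool = True
    state: bytes = b""

    def ping(self) -> bool:
        return self.alive


def check_and_resume(child: Sentinel, bookkeeper: Bookkeeper) -> Sentinel:
    """Run by the parent: replace the child if it no longer answers pings."""
    if child.ping():
        return child                          # step 1: no ping error
    snapshot = bookkeeper.latest(child.rank)  # steps 2 and 3
    state = snapshot[1] if snapshot else b""  # assumption: else start over
    # Steps 4 and 5: a new agent takes over the same id and rank and
    # restarts the user program from the retrieved state.
    return Sentinel(child.agent_id, child.rank, alive=True, state=state)


if __name__ == "__main__":
    keeper = Bookkeeper()
    keeper.store(rank=4, seq=1, state=b"iteration-10")
    keeper.store(rank=4, seq=2, state=b"iteration-20")
    crashed = Sentinel(agent_id=11, rank=4, alive=False)
    print(check_and_resume(crashed, keeper))
```

The replacement keeps the id and rank of the failed sentinel, as in the figure, where the new agent is again labeled id 11, rank 4.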
Job Resumption by a Child Sentinel
(1) No pings for 8 * 5 (= 40 sec)
    (also shown, unnumbered: no pings for 12 * 5 (= 60 sec))
(2) Search for the latest snapshot
(3) Search for the latest snapshot
(4) Retrieve the snapshot
(5) Send a new agent
(6) No pings for 2 * 5 (= 10 sec)
(7) Search for the latest snapshot
(8) Search for the latest snapshot
(9) Retrieve the snapshot
(10) Send a new agent
(11) Detect a ping error
(12) Restart a new resource agent from its beginning
(13) Detect a ping error and follow the same child resumption procedure as on slide 9 (Job Resumption by a Parent Sentinel).
[Figure: commander id 0; resource agent id 1; sentinel id 2 (rank 0); bookkeeper id 3 (rank 0); sentinel id 8 (rank 1); bookkeeper id 12 (rank 1). Agents marked "New": sentinel id 2 (rank 0), commander id 0, and resource agent id 1.]
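The waiting periods on this slide (8 * 5 = 40 sec, 12 * 5 = 60 sec, 2 * 5 = 10 sec) each equal an agent id multiplied by 5 seconds. The sketch below assumes that pattern is the general rule, which the slide does not state outright, and shows its consequence: among agents waiting for pings, the one with the smallest id times out first. The names are hypothetical.

```python
from __future__ import annotations

PING_UNIT_SEC = 5  # the factor 5 in "8 * 5", "12 * 5", and "2 * 5"


def ping_timeout_sec(agent_id: int) -> int:
    """Seconds an agent waits without pings before acting (inferred rule)."""
    if agent_id < 1:
        raise ValueError("the root agent (id 0) has no parent pinging it")
    return agent_id * PING_UNIT_SEC


def first_to_react(waiting_ids: list[int]) -> int:
    """Agent whose timeout expires first among those waiting for pings."""
    if not waiting_ids:
        raise ValueError("no agents are waiting")
    return min(waiting_ids, key=ping_timeout_sec)


if __name__ == "__main__":
    for agent_id in (2, 8, 12):
        print("id", agent_id, "waits", ping_timeout_sec(agent_id), "sec")
    print("first to react:", first_to_react([12, 8]))
```

Staggering the timeouts by id would keep two agents from starting a recovery at the same moment, though that motivation is a reading of the slide rather than something it says.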
Computational Granularity 1: Master-slave computation
[Figure: time (sec), with axis ticks 0, 1, 10, 100, 1000, versus size (doubles) / # floating-point divides, for 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 divides. Series: 1 CPU, 8 CPUs, 16 CPUs, 24 CPUs.]
Computational Granularity 2: Heartbeat
[Figure: time (sec), with axis ticks 0, 1, 10, 100, 1000, versus size (doubles) / # floating-point divisions, for 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 divisions. Series: 1 CPU, 8 CPUs, 16 CPUs, 24 CPUs.]
Computational Granularity 3: Broadcast
[Figure: time (sec), with axis ticks 0, 1, 10, 100, 1000, versus size (doubles) / # floating-point divides, for 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 divides. Series: 1 CPU, 8 CPUs, 16 CPUs, 24 CPUs.]
Performance Evaluation - Series
[Figure: time (sec), 0 to 350, versus # CPUs (1, 4, 8, 12, 16, 24). Components: agent deployment, disk operations, snapshot, application.]
Performance Evaluation - RayTracer
[Figure: time (sec), 0 to 350, versus # CPUs (1, 4, 8, 12, 16, 24). Components: agent deployment, disk operations, snapshot, application.]
Performance Evaluation - MolDyn
[Figure: time (sec), 0 to 350, versus # CPUs (1, 2, 4, 8). Components: agent deployment, snapshot, disk operations, GridTcp overhead, Java application.]
Conclusions
Our focus
    A decentralized job execution and fault-tolerant environment
    Applications not restricted to the master-slave or parameter-sweeping model
Applications
    40,000 doubles x 10,000 floating-point operations
    Moderate data transfer combined with massive/collective communication
    At least three times larger than its computational granularity
Future work
    UWAgents enhancement: over-gateway deployment and security
    Programming support: preprocessor implementation
    Job scheduling algorithms: priority-based agent migration