
8/25/2005 IEEE PacRim 2005 1

The Design Concept and Initial Implementation of AgentTeamwork Grid Computing Middleware

Munehiro Fukuda, Computing & Software Systems, University of Washington, Bothell

Koichi Kashiwagi and Shinya Kobayashi, Computer Science, Ehime University

Funded by


Background

- Most grid-computing systems use centralized resource/job management, which has two drawbacks:
  - A powerful central server is essential to manage all slave computing nodes.
  - Applications are based on the master-slave or parameter-sweep model.
- Mobile agents:
  - An execution model previously highlighted as a prospective infrastructure for distributed systems.
  - So far no more than an alternative approach to a centralized grid middleware implementation.
- Our motivation:
  - Decentralized job distribution and coordination
  - Decentralized fault tolerance
  - Applications based on a variety of communication models


Objective

- A mobile agent execution platform fitted to grid computing:
  - Allowing an agent to identify which MPI rank to handle and which agent to send a job snapshot to.
- Fault-tolerant inter-process communication:
  - Recovering lost messages.
  - Allowing over-gateway connections.
- Agent-collaborative algorithms for job coordination:
  - Allocating computing nodes in a distributed manner.
  - Implementing decentralized snapshot maintenance and job recovery.
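The "recovering lost messages" objective can be sketched as a sender-side buffer that keeps every unacknowledged message for replay after a reconnection. This is a generic sketch of the idea, not GridTcp's actual API; the class and method names are hypothetical:

```java
import java.util.Map;
import java.util.TreeMap;

// Generic loss-recovery sketch: the sender keeps each unacknowledged message
// keyed by sequence number, so it can retransmit after a link is restored.
public class RecoveryBuffer {
    private final Map<Integer, byte[]> unacked = new TreeMap<>();
    private int nextSeq = 0;

    // Buffer a message before transmission; returns its sequence number.
    public int send(byte[] payload) {
        unacked.put(nextSeq, payload.clone());
        return nextSeq++;
    }

    // Peer confirmed delivery up to seq (cumulative acknowledgment).
    public void acknowledge(int seq) {
        unacked.keySet().removeIf(s -> s <= seq);
    }

    // Messages to replay after a connection is re-established.
    public Map<Integer, byte[]> toRetransmit() {
        return unacked;
    }

    public static void main(String[] args) {
        RecoveryBuffer buf = new RecoveryBuffer();
        buf.send("a".getBytes());
        buf.send("b".getBytes());
        buf.acknowledge(0);
        System.out.println(buf.toRetransmit().size()); // one message left to replay
    }
}
```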


System Overview

[Diagram: users A and B each submit jobs. Every user process runs inside a user program wrapper consisting of snapshot methods and GridTCP. Commander, sentinel, resource, and bookkeeper agents manage the processes over TCP communication; sentinels take snapshots of the user processes, bookkeepers keep the snapshots (with copies on an FTP server), and results are returned to the users.]


Execution Layer (bottom to top):

- Operating systems
- UWAgents mobile agent execution platform
  - Commander, resource, sentinel, and bookkeeper agents
- User program wrapper
  - GridTcp / Java socket
  - mpiJava-A / mpiJava-S
  - mpiJava API
- Java user applications


UWAgents Execution Platform

- An agent domain is created per submission from the Unix shell; UWInject submits a new agent from the shell (e.g., domains on medusa.uwb.edu and perseus.uwb.edu, each tagged with a time stamp and user name).
- The number of children each agent can spawn (-m) is given upon the initial submission.
- No name server: messages are forwarded through the agent tree.
- A user job is scheduled as a thread within UWPlace, using suspend/resume.

[Diagram: with -m 4, agent id 0 spawns ids 1-3, id 1 spawns ids 4-7, id 2 spawns ids 8-11, and id 3 spawns id 12; with -m 3, id 0 spawns ids 1-2.]
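The id layout in the diagram suggests a simple arithmetic naming scheme: with at most m children per agent, id 0's children are 1 .. m-1, and agent i's children are i*m .. i*m+m-1. A minimal sketch of that scheme (the class and method names are hypothetical, not part of UWAgents):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the agent-id tree implied by the slide:
// with -m 4, id 0 spawns ids 1..3, id 1 spawns 4..7, id 2 spawns 8..11, ...
public class AgentTree {
    // Ids of the children an agent with the given id may spawn.
    public static List<Integer> children(int id, int m) {
        List<Integer> kids = new ArrayList<>();
        int first = (id == 0) ? 1 : id * m;          // root uses ids 1..m-1
        int last  = (id == 0) ? m - 1 : id * m + m - 1;
        for (int c = first; c <= last; c++) kids.add(c);
        return kids;
    }

    // Parent of an agent; messages are forwarded along these links.
    public static int parent(int id, int m) {
        if (id == 0) throw new IllegalArgumentException("root has no parent");
        return (id < m) ? 0 : id / m;
    }

    public static void main(String[] args) {
        System.out.println(children(2, 4)); // [8, 9, 10, 11], as in the diagram
        System.out.println(parent(12, 4));  // 3
    }
}
```

With this scheme an agent can route a message toward any id without a name server, by walking up via parent() and down via children().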


Job Distribution

[Diagram: the user submits a job to the commander (id 0). The commander sends an XML query to the resource agent (id 1), which searches an eXist database, and spawns a sentinel (id 2) and a bookkeeper (id 3), both in charge of MPI rank 0 (id: agent id, rank: MPI rank). Sentinel id 2 spawns child sentinels ids 8-11 for ranks 1-4, and sentinel id 8 spawns ids 32-34 for ranks 5-7; likewise, bookkeeper id 3 spawns ids 12-15 for ranks 1-4, and bookkeeper id 12 spawns ids 48-50 for ranks 5-7. Each sentinel sends its snapshots to the bookkeeper in charge of the same rank.]
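The ids and ranks in the diagram are consistent with MPI ranks being assigned in breadth-first order over each agent subtree (sentinels under id 2, bookkeepers under id 3). A sketch of that correspondence, assuming agent i's children are i*m .. i*m+m-1 with m = 4 (the class and method names are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: enumerate agent ids in breadth-first order within a
// subtree, which reproduces the rank-to-id assignment shown on the slide.
// (Child-id formula i*m .. i*m+m-1 is valid for nonzero roots only.)
public class RankMap {
    // Agent ids for ranks 0..n-1 in the subtree rooted at `root`.
    public static List<Integer> idsByRank(int root, int m, int n) {
        List<Integer> ids = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(root);
        while (ids.size() < n) {
            int id = queue.remove();
            ids.add(id);
            for (int c = id * m; c < id * m + m; c++) queue.add(c);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sentinel tree (root id 2): ranks 0..7 -> ids 2, 8-11, 32-34
        System.out.println(idsByRank(2, 4, 8));
        // Bookkeeper tree (root id 3): ranks 0..7 -> ids 3, 12-15, 48-50
        System.out.println(idsByRank(3, 4, 8));
    }
}
```

Because both trees use the same breadth-first order, the sentinel and bookkeeper for a given rank can locate each other from the rank alone.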


Resource Allocation

The user submits a job to the commander (id 0). The commander sends an XML query specifying CPU architecture, OS, memory, disk, total nodes, and a multiplier to the resource agent (id 1), which searches an eXist XML database and returns a list of available nodes of size total nodes x multiplier. The commander then spawns a sentinel (id 2, rank 0) and a bookkeeper (id 3, rank 0), whose descendants (e.g., sentinel id 8 / rank 1 and bookkeeper id 12) are deployed over the allocated nodes; nodes left over are kept for future use.

- Case 1: total nodes = 2, multiplier = 1.5 -> 3 nodes allocated (nodes 0-2).
- Case 2: total nodes = 2, multiplier = 3 -> 6 nodes allocated (nodes 0-5).
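The total nodes x multiplier sizing above can be sketched as follows. The helper name is hypothetical, and rounding up is an assumption so that a fractional multiplier such as 1.5 still yields spare nodes:

```java
// Hypothetical sketch of the node-allocation arithmetic: the resource agent
// returns total nodes x multiplier candidate nodes, so a job can fail over to
// the spare ("future use") nodes without a new resource query.
public class Allocation {
    public static int nodesToReserve(int totalNodes, double multiplier) {
        // Round up so a fractional multiplier still produces whole spare nodes.
        return (int) Math.ceil(totalNodes * multiplier);
    }

    public static void main(String[] args) {
        System.out.println(nodesToReserve(2, 1.5)); // Case 1: 3 nodes
        System.out.println(nodesToReserve(2, 3.0)); // Case 2: 6 nodes
    }
}
```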


Job Resumption by a Parent Sentinel

(0) Each sentinel periodically sends a new snapshot to the bookkeeper in charge of the same rank.
(1) The parent sentinel (id 2, rank 0) detects a ping error at a child sentinel (id 11, rank 4).
(2) It searches for the latest snapshot.
(3) It retrieves the snapshot from the bookkeeper (id 15, rank 4).
(4) It sends a new agent: a new sentinel id 11, rank 4.
(5) The new sentinel restarts the user program, which re-establishes its MPI connections.


Job Resumption by a Child Sentinel

(1) The child sentinel (id 8, rank 1) receives no pings from its parent for 8 * 5 (= 40) sec; the bookkeeper (id 12, rank 1) waits for 12 * 5 (= 60) sec.
(2)(3) The child sentinel searches for the latest snapshot.
(4) It retrieves the snapshot.
(5) It sends a new agent: a new sentinel id 2, rank 0.
(6) The sentinel (id 2, rank 0) receives no pings from the commander for 2 * 5 (= 10) sec.
(7)(8) It searches for the latest snapshot.
(9) It retrieves the snapshot.
(10) It sends a new agent: a new commander id 0.
(11) The new commander detects a ping error and (12) restarts a new resource agent (id 1) from its beginning.
(13) It detects a ping error and follows the same child resumption procedure as in p. 9.
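The timeout values on this slide (8 * 5 sec at sentinel id 8, 12 * 5 sec at bookkeeper id 12, 2 * 5 sec at sentinel id 2) all equal 5 seconds times the detecting agent's id, which staggers detection so that the surviving agent with the smallest id acts first. A minimal sketch of that rule; the class name and the exact rule are inferred from the slide's numbers, not stated in the paper:

```java
// Hypothetical sketch: each agent tolerates (own id x 5 sec) of silence before
// starting job resumption, so agents with smaller ids always act first and
// concurrent recoveries of the same failed agent are avoided.
public class FailureDetector {
    static final long BASE_MILLIS = 5_000; // ping period implied by the slide

    // How long agent `ownId` waits for missing pings before resuming its peer.
    public static long timeoutMillis(int ownId) {
        return BASE_MILLIS * ownId;
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(2) / 1000);  // sentinel id 2: 10 sec
        System.out.println(timeoutMillis(8) / 1000);  // sentinel id 8: 40 sec
        System.out.println(timeoutMillis(12) / 1000); // bookkeeper id 12: 60 sec
    }
}
```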


Computational Granularity 1: Master-Slave Computation

[Chart: execution time (sec, log scale up to 1,000) for problem sizes of 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 floating-point divides. X axis: size (doubles) / # floating-point divides; Y axis: time (sec); series: 1, 8, 16, and 24 CPUs.]


Computational Granularity 2: Heartbeat

[Chart: execution time (sec, log scale up to 1,000) for problem sizes of 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 floating-point divisions. X axis: size (doubles) / # floating-point divisions; Y axis: time (sec); series: 1, 8, 16, and 24 CPUs.]


Computational Granularity 3: Broadcast

[Chart: execution time (sec, log scale up to 1,000) for problem sizes of 10,000, 20,000, and 40,000 doubles, each with 1,000, 10,000, and 100,000 floating-point divides. X axis: size (doubles) / # floating-point divides; Y axis: time (sec); series: 1, 8, 16, and 24 CPUs.]


Performance Evaluation - Series

[Bar chart: execution time (sec, 0-350) on 1, 4, 8, 12, 16, and 24 CPUs, broken down into agent deployment, disk operations, snapshot, and application time.]


Performance Evaluation - RayTracer

[Bar chart: execution time (sec, 0-350) on 1, 4, 8, 12, 16, and 24 CPUs, broken down into agent deployment, disk operations, snapshot, and application time.]


Performance Evaluation – MolDyn

[Bar chart: execution time (sec, 0-350) on 1, 2, 4, and 8 CPUs, broken down into agent deployment, snapshot, disk operations, GridTcp overhead, and Java application time.]


Overhead of Job Resumption


Conclusions

- Our focus:
  - A decentralized job execution and fault-tolerant environment
  - Applications not restricted to the master-slave or parameter-sweeping model
- Applications:
  - 40,000 doubles x 10,000 floating-point operations
  - Moderate data transfer combined with massive/collective communication at least three times larger than its computational granularity
- Future work:
  - UWAgents enhancement: over-gateway deployment and security
  - Programming support: preprocessor implementation
  - Job scheduling algorithms: priority-based agent migration