GridFlow: Workflow Management for Grid Computing Kavita Shinde

Post on 17-Mar-2016

Page 1: GridFlow: Workflow Management for Grid Computing

GridFlow: Workflow Management for Grid Computing

Kavita Shinde

Page 2: GridFlow: Workflow Management for Grid Computing

Outline
  Introduction
  Grid Resource Management
  Grid Workflow Management
  An Example Scenario
  Conclusion

Page 3: GridFlow: Workflow Management for Grid Computing

Introduction
  GridFlow: given a set of workflow tasks and a set of resources, how do we map the tasks to Grid resources?
  a workflow management system developed at the University of Warwick
  developed on top of an agent-based resource management system for Grid computing (ARMS)
  focus is on service-level scheduling and workflow management

Page 4: GridFlow: Workflow Management for Grid Computing

Grid Resource Management
Three layers of resource management within the GridFlow system
Grid Resource
  high-end computing or storage resource accessed remotely
  multiprocessors, or clusters of workstations or PCs with large disk storage space
Local Grid
  multiple grid resources that belong to one organization
  resources are connected with high-speed networks
Global Grid
  consists of all local Grids

Page 5: GridFlow: Workflow Management for Grid Computing

Grid Resource Management
PACE
  a toolset for resource performance and usage analysis
  takes separate resource and application models as inputs and predicts the execution time of a task prior to run time
  scalability (execution time vs. level of parallelism) can be determined
  helps prevent over-occupying of resources
  useful when trying to interleave sub-workflows as much as possible

Page 6: GridFlow: Workflow Management for Grid Computing

Grid Resource Management
Titan
  grid resource manager
  locates a suitable resource set and passes the sub-workflow to a local scheduler
  utilizes free processors to minimize idle time and improve throughput
  supported by the PACE performance prediction data

Page 7: GridFlow: Workflow Management for Grid Computing

Grid Resource Management
ARMS
  main component: agent
  agent: representative of a local grid at a global level of grid resource management
  agents cooperate with each other to find the available resources and their characteristics
  agents dispatch requests that cannot be satisfied locally to neighboring agents
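
The agent cooperation described above can be sketched roughly as follows. This is only an illustration, not the ARMS implementation: the `Agent` class, its `free_processors` field, the `discover` method and the hop limit are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an ARMS agent representing one local grid."""
    name: str
    free_processors: int
    neighbors: list = field(default_factory=list)

    def discover(self, needed, max_hops=3, visited=None):
        """Return the name of an agent that can satisfy the request, or None."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if self.free_processors >= needed:   # request satisfied locally
            return self.name
        if max_hops == 0:                    # stop forwarding (assumed limit)
            return None
        for neighbor in self.neighbors:      # otherwise dispatch to neighbors
            if neighbor.name not in visited:
                found = neighbor.discover(needed, max_hops - 1, visited)
                if found is not None:
                    return found
        return None


# Three made-up local grids: L1 <-> L2 -> L3.
a, b, c = Agent("L1", 4), Agent("L2", 8), Agent("L3", 32)
a.neighbors, b.neighbors = [b], [a, c]
result = a.discover(needed=16)
print(result)
```

Here the request for 16 processors cannot be met by L1 or L2, so it is forwarded until L3 accepts it.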

Page 8: GridFlow: Workflow Management for Grid Computing

Grid Workflow Management
The implementation of grid workflow management is carried out at multiple layers
Tasks
  basic building block of an application
  e.g. MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) jobs running on multiple processors
Sub-workflows
  a flow of closely related tasks that is to be executed in a predefined sequence on grid resources of a local grid
  usually involves significant communication between tasks; resource conflicts may occur when multiple sub-workflows require the same resource simultaneously
Workflows
  a flow of several different sub-workflows

Page 9: GridFlow: Workflow Management for Grid Computing

GridFlow user portal: provides a graphical user interface to compose workflow elements and access additional grid services

LGSS: handles conflicts, since scheduled sub-workflows may belong to different workflows

ARMS: represents a local Grid at a global level of Grid resource management, and conducts local Grid sub-workflow scheduling

Globus MDS: provides information about the available resources on the Grid and their status

Titan: utilizes performance data obtained from PACE for resource scheduling

Page 10: GridFlow: Workflow Management for Grid Computing

Grid Workflow Management
GGWM (Global Grid Workflow Management)
Simulation
  takes place before a grid workflow is actually executed and produces a workflow schedule
  returns simulation results to the GridFlow portal for user agreement
Execution
  the workflow is executed according to the simulated schedule
  the actual execution may differ because of the dynamic nature of the grid
  delays are sent back to the simulation engine and the workflow is rescheduled
Monitoring
  provides access to real-time status reports of task or sub-workflow execution

Page 11: GridFlow: Workflow Management for Grid Computing

Global Grid Workflow Management

Scheduling Algorithm
  initialize all properties of each sub-workflow to null
  look for a schedulable sub-workflow
    ensure its pre-sub-workflows have all been scheduled
  configure the start time of the chosen sub-workflow to be the latest end time of its pre-sub-workflows
  submit the start time and the sub-workflow to a grid-level agent (ARMS)
  the agent finds a suitable local grid using LGSS
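
The steps above can be sketched as a short loop. This is a simplified illustration under stated assumptions, not the GridFlow code: times are plain numbers rather than fuzzy time functions, and `estimate_end` is a hypothetical callback standing in for the ARMS agent and its LGSS simulation.

```python
def schedule_workflow(pre, estimate_end):
    """Schedule sub-workflows in dependency order.

    pre: dict mapping each sub-workflow name to the list of its pre-sub-workflows.
    estimate_end: hypothetical callback (name, start) -> (local_grid, end_time),
        standing in for the grid-level agent and its LGSS simulation.
    """
    # Initialize all properties of each sub-workflow to null.
    plan = {s: {"start": None, "end": None, "grid": None} for s in pre}
    scheduled = set()
    while len(scheduled) < len(pre):
        # Look for a sub-workflow whose pre-sub-workflows are all scheduled.
        ready = [s for s in pre
                 if s not in scheduled and all(p in scheduled for p in pre[s])]
        if not ready:
            raise ValueError("dependency cycle: no schedulable sub-workflow")
        s = ready[0]
        # Start time is the latest end time of the pre-sub-workflows.
        start = max((plan[p]["end"] for p in pre[s]), default=0)
        grid, end = estimate_end(s, start)
        plan[s] = {"start": start, "end": end, "grid": grid}
        scheduled.add(s)
    return plan


# Made-up durations and a single local grid, for illustration only.
durations = {"S1": 3, "S2": 5, "S3": 2, "S4": 4}
pre = {"S1": [], "S2": ["S1"], "S3": ["S1"], "S4": ["S2", "S3"]}
plan = schedule_workflow(pre, lambda s, start: ("L1", start + durations[s]))
print(plan["S4"])
```

In this toy workflow S4 waits for both S2 and S3, so it starts at the later of their two end times.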

Page 12: GridFlow: Workflow Management for Grid Computing

Global Grid Workflow Management

  ARMS reschedules the less critical sub-workflows
  the algorithm relies heavily on the simulation results of LGSS

Page 13: GridFlow: Workflow Management for Grid Computing

Workflow W: a set of sub-workflows $S_i$ $(i = 1, \dots, n)$; $S_1$ and $S_n$ are the starting and ending points

$p_i$: number of pre-sub-workflows of $S_i$

$q_i$: number of post-sub-workflows of $S_i$

G: global grid, a set of local grids $L_j$ $(j = 1, \dots, m)$

k: true if the sub-workflow is scheduled, else false

Page 14: GridFlow: Workflow Management for Grid Computing

Local Grid Sub-Workflow Scheduling
Scheduling Algorithm
  very similar to GGWM
  has to deal with multiple tasks that may belong to different workflows
  the start time of the chosen task can't be set directly to the latest end time of its pre-tasks, because of resource conflicts
  executes the task with the higher priority first
  gives higher priority to a possibly earlier enabled task
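
The priority rule above can be illustrated with a small sketch. It is an assumption-laden simplification rather than the LGSS algorithm itself: times are crisp numbers, a single shared resource is assumed, and `resolve_conflict` and the task values are made up for the example.

```python
def resolve_conflict(tasks):
    """Order tasks that compete for one resource.

    tasks: list of (name, enabling_time, duration) tuples with crisp times;
    a fuzzy-time version would compare possibility distributions instead.
    Returns a list of (name, start, end) tuples.
    """
    schedule = []
    resource_free_at = 0
    # An earlier enabled task gets the higher priority and runs first.
    for name, enabled, duration in sorted(tasks, key=lambda t: t[1]):
        # A task cannot start before it is enabled or before the resource is free.
        start = max(enabled, resource_free_at)
        end = start + duration
        schedule.append((name, start, end))
        resource_free_at = end
    return schedule


# Illustrative values only: A4 is enabled before A3, so it runs first.
order = resolve_conflict([("A3", 5, 6), ("A4", 3, 12)])
print(order)
```

The delayed task starts when the resource is released, which is why its start time cannot simply be the latest end time of its pre-tasks.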

Page 15: GridFlow: Workflow Management for Grid Computing

Fuzzy Time Operations
  LGSS and GGWM algorithms are implemented using fuzzy timing techniques
  fuzzy time function $\pi(\tau)$
    gives a numerical estimate of the possibility that an event arrives at time $\tau$
  advantages:
    can be computed very fast
    suitable for scheduling time-critical applications
  limitation: they do not necessarily provide the best scheduling solution

Page 16: GridFlow: Workflow Management for Grid Computing

$\pi_1(\tau) = 0.5(0,2,6,7)$

$\pi_2(\tau) = (2,4,4,6)$

a: possibility distributions of $\pi_1$ and $\pi_2$

b: latest arrival distribution of $\pi_1$ and $\pi_2$

c: earliest enabling time

d: operator min, the intersection of $\pi_1$ and $\pi_2$

e: operator max, the union of $\pi_1$ and $\pi_2$

f: sum of $\pi_1$ and $\pi_2$

$\min(0.5, 1)(0+2,\ 2+4,\ 6+4,\ 7+6) = 0.5(2, 6, 10, 13)$
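
The sum operation on this slide can be written out as a small sketch. The `Fuzzy` tuple and `fuzzy_sum` function are names invented for this example; only the rule itself (minimum of the heights, breakpoints added pairwise) comes from the slide.

```python
from typing import NamedTuple


class Fuzzy(NamedTuple):
    """Trapezoidal possibility distribution h(a, b, c, d)."""
    h: float
    a: float
    b: float
    c: float
    d: float


def fuzzy_sum(x: Fuzzy, y: Fuzzy) -> Fuzzy:
    """Sum of two distributions: minimum height, breakpoints added pairwise."""
    return Fuzzy(min(x.h, y.h), x.a + y.a, x.b + y.b, x.c + y.c, x.d + y.d)


pi1 = Fuzzy(0.5, 0, 2, 6, 7)   # 0.5(0,2,6,7)
pi2 = Fuzzy(1.0, 2, 4, 4, 6)   # (2,4,4,6), height 1 is implicit on the slide
total = fuzzy_sum(pi1, pi2)
print(total)
```

The result has height 0.5 and breakpoints (2, 6, 10, 13), matching the calculation shown for panel f.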

Page 17: GridFlow: Workflow Management for Grid Computing

An Example Scenario
  W1, W2: workflows
  L1, L2: local grids
  task A2 of sub-workflow S3 from W1 is being executed
  S3 from W2 is to be scheduled
  resource conflict between A3 and A4
  the schedule aims to find $e_5(\tau)$

Page 18: GridFlow: Workflow Management for Grid Computing

An Example Scenario

task enabling times: from pre-task end times
task execution times: from the Titan system, supported by PACE functions

$a_3(\tau) = (3,5,5,7)$; $d_3(\tau) = (5,6,7,8)$;

$a_4(\tau) = (0,3,3,5)$; $d_4(\tau) = (10,12,14,16)$;

$d_5(\tau) = (2,5,6,9)$;

Page 19: GridFlow: Workflow Management for Grid Computing

An Example Scenario
using LGSS

$s_3(\tau) = \min\{(3,5,5,7),\ \mathrm{earliest}\{(3,5,5,7),(0,3,3,5)\}\} = \min\{(3,5,5,7),(0,3,3,5)\} = 0.5(3,4,4,5)$

$s_4(\tau) = \min\{(0,3,3,5),\ \mathrm{earliest}\{(3,5,5,7),(0,3,3,5)\}\} = \min\{(0,3,3,5),(0,3,3,5)\} = (0,3,3,5)$

$e^1_3(\tau) = \mathrm{sum}\{0.5(3,4,4,5),\ (5,6,7,8)\} = 0.5(8,10,11,13)$

Page 20: GridFlow: Workflow Management for Grid Computing

An Example Scenario

$e^1_4(\tau) = \mathrm{sum}\{\mathrm{latest}\{0.5(8,10,11,13),\ (0,3,3,5)\},\ (10,12,14,16)\} = \mathrm{sum}\{0.5(8,10,11,13),\ (10,12,14,16)\} = 0.5(18,22,25,29)$

$e^2_4(\tau) = \mathrm{sum}\{(0,3,3,5),\ (10,12,14,16)\} = (10,15,17,21)$

$e^2_3(\tau) = \mathrm{sum}\{\mathrm{latest}\{(10,15,17,21),\ 0.5(3,4,4,5)\},\ (5,6,7,8)\} = \mathrm{sum}\{0.5(10,12.5,19,21),\ (5,6,7,8)\} = 0.5(15,18.5,26,29)$

$e_4(\tau) = \max\{0.5(18,22,25,29),\ (10,15,17,21)\} = (10,15,17,29)$

Page 21: GridFlow: Workflow Management for Grid Computing

An Example Scenario

$e_5(\tau) = \mathrm{sum}\{(10,15,17,29),\ (2,5,6,9)\} = (12,20,23,38)$

so S3 from W2 will complete on local grid L1 most likely between 20 and 23

this result is submitted to GGWM, which decides whether local grid L1 should be allocated the sub-workflow S3 from W2
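
To see why the completion time is read as "most likely between 20 and 23", the trapezoid (12,20,23,38) can be evaluated at a few times. The `possibility` helper below is a hypothetical function written for this illustration; it assumes the usual piecewise-linear reading of h(a, b, c, d) with height 1 when no factor is given.

```python
def possibility(t, h, a, b, c, d):
    """Value at time t of the trapezoidal distribution h(a, b, c, d)."""
    if t < a or t > d:
        return 0.0
    if t < b:                      # rising edge between a and b
        return h * (t - a) / (b - a)
    if t <= c:                     # plateau between b and c
        return h
    return h * (d - t) / (d - c)   # falling edge between c and d


e5 = (1.0, 12, 20, 23, 38)         # (12,20,23,38) with implicit height 1
for t in (12, 16, 20, 23, 38):
    print(t, possibility(t, *e5))
```

The possibility reaches its maximum only on the plateau from 20 to 23 and falls off on both sides, with a long tail out to 38.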

Page 22: GridFlow: Workflow Management for Grid Computing

Conclusion
  the fuzzy timing technique provides a good solution to the conflict-solving problem arising in grid workflow management
  results indicate that local and global grid workflow management can coordinate with each other to optimize workflow execution time and solve conflicts of interest
  useful in highly dynamic grid environments, where large network latencies exist and application performance is difficult to predict accurately
  needs more flexible cooperation among different grid services and components, which poses security challenges