Resource Manager for Grid with global job queue and with planning based on local schedules
V.N.Kovalenko, E.I.Kovalenko, D.A.Koryagin, E.Z.Ljubimskii, A.V.Orlov, E.V.Huhlaev
{kvn,kei,koryagin,ljubimsk,ao,huh}@keldysh.ru
Keldysh Institute of Applied Mathematics
Russian Academy of Sciences
Job submission in the Globus system
Job submission by means of a Broker
[Diagram; surviving label: Broker]
Resource Brokers
- GRID Resource Broker (GRB) - HPC lab, University of Lecce, Italy and CACR, California Institute of Technology. http://sara.unile.It/grb/
- EZ-Grid - Department of Computer Science, University of Houston. http://www.cs.uh.edu/~ezgrid/
- MetaDispatcher - Keldysh Institute of Applied Mathematics, Moscow
Architecture of MetaDispatcher
[Diagram. Labels: Client; client utilities (submit, status, cancel, clean, get-output); Metadispatcher; Gatekeeper; Jobmanager-meta; command interpreter (submit, status, cancel, get-output, clean); jobs spool; scheduler (reacts to events); job monitor; proxy interception; "Deleg. Proxy" and "Deleg.-2 Proxy"; local copies of executable and stdin; buffered stdout and stderr; request (RSL) and status; start (jobid), cancel, tail, cleanup; target Gatekeeper and GRAM job manager; MDS with GRIS and GIIS (statics, dynamics); "Proving Of Job".]
Problem of scheduling
The scheduling problem is solved over two sets: 1) the set of jobs and 2) the set of computing elements.
Scheduling results:
- The dispatch time for each job
- The place where the job should be directed and executed
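The two scheduling results listed above can be pictured as one small record per job. The following is only a sketch: the type and field names (`Job`, `ComputingElement`, `Assignment`) and the time unit are assumptions made for illustration and are not taken from MetaDispatcher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str


@dataclass(frozen=True)
class ComputingElement:
    name: str


@dataclass(frozen=True)
class Assignment:
    """One scheduling result: when and where a job is dispatched."""
    job: Job
    target: ComputingElement   # the place where the job is directed and executed
    dispatch_time: float       # seconds from now (assumed unit)


def make_schedule(triples):
    """Build {job_id: Assignment} from (job, target, dispatch_time) triples."""
    return {job.job_id: Assignment(job, target, t) for job, target, t in triples}


schedule = make_schedule([
    (Job("j1"), ComputingElement("ce-a"), 0.0),
    (Job("j2"), ComputingElement("ce-b"), 120.0),
])
```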
Two management levels, local and global, each having its own objects: job, queue, and management system (Local Resource Monitor (LRM) and MetaDispatcher).
[Diagram. Global level: MetaDispatcher, global queue, jobs. Local level: LRM, local queue, jobs. Further labels: Config. file.]
Question 1: In What Order Should the Global Jobs Be Served?
The order in which the scheduler serves the job queue should differ from FIFO.
The user should have management facilities for placing a job at any position in the global queue.
To achieve that:
- A limited budget is allocated to each user.
- Within the budget limits, the user prices his jobs.
- The function GP evaluates the global priority of the job:
GP = GP(price, required resources, run time)
[Diagram: a new job entering a queue of jobs]
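A minimal sketch of a budget-limited, priority-ordered global queue follows. The slides do not give the form of GP, so the formula used here (price per requested CPU-second) is a hypothetical choice, as are the class and parameter names.

```python
import heapq
import itertools


def gp(price, cpus, run_time):
    """Hypothetical global priority: price per requested CPU-second."""
    return price / (cpus * run_time)


class GlobalQueue:
    """Serves jobs by descending GP instead of FIFO."""

    def __init__(self, budgets):
        self.budgets = dict(budgets)          # remaining budget per user
        self._heap = []
        self._counter = itertools.count()     # tie-breaker: earlier submission first

    def submit(self, user, job_id, price, cpus, run_time):
        if price > self.budgets.get(user, 0):
            raise ValueError(f"{user} cannot afford price {price}")
        self.budgets[user] -= price
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(
            self._heap, (-gp(price, cpus, run_time), next(self._counter), job_id)
        )

    def pop(self):
        """Return the id of the job with the highest global priority."""
        return heapq.heappop(self._heap)[2]


queue = GlobalQueue({"alice": 100, "bob": 100})
queue.submit("alice", "a1", price=10, cpus=2, run_time=100)   # GP = 0.05
queue.submit("bob", "b1", price=40, cpus=2, run_time=100)     # GP = 0.2
queue.submit("alice", "a2", price=90, cpus=1, run_time=100)   # GP = 0.9
order = [queue.pop() for _ in range(3)]   # ["a2", "b1", "a1"]
```

By pricing a job higher, a user moves it toward the front of the queue, but only until the budget runs out.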
Question 2: When to Forward a Job to a Target Computing Element?
If the destination of a job is determined at the moment it enters the global queue, and the job is immediately routed to a local queue, it may be delayed there because of local job arrivals. At the same time, resources of other computing elements may become free and idle.
Conclusion: it is more reasonable to keep global jobs in the global queue as long as possible, ideally up to the moment of start.
[Diagram: a new job, the global queue, and local queues of jobs]
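The conclusion above amounts to late binding: a target is chosen only when the job can start at once. The sketch below assumes a simple CPU-count model and hypothetical names; it also lets a smaller job overtake a blocked one, which a real scheduler would have to decide on as a matter of policy.

```python
from collections import deque


def dispatch_ready(global_queue, free_cpus):
    """Late binding: pick a target only for jobs that can start right now.

    global_queue: deque of (job_id, cpus) in service order.
    free_cpus: dict mapping computing-element name to currently free CPUs.
    Returns the (job_id, element) pairs started in this pass. Jobs that
    cannot start stay in the global queue rather than in a local one.
    """
    started = []
    waiting = deque()
    while global_queue:
        job_id, cpus = global_queue.popleft()
        target = next((ce for ce, free in free_cpus.items() if free >= cpus), None)
        if target is None:
            waiting.append((job_id, cpus))     # keep it global, decide later
        else:
            free_cpus[target] -= cpus
            started.append((job_id, target))
    global_queue.extend(waiting)
    return started


jobs = deque([("g1", 4), ("g2", 8), ("g3", 2)])
free = {"ce-a": 4, "ce-b": 2}
first_pass = dispatch_ready(jobs, free)    # g1 and g3 start, g2 waits globally
free["ce-b"] = 8                           # resources on ce-b are released
second_pass = dispatch_ready(jobs, free)   # g2 goes wherever capacity appeared
```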
Question 3: To Which Computing Elements Should a Job Be Passed?
The scheduling model of a computing installation:
- A set of resources
- Resource description:
- Static attributes: OS type, CPU time, memory volume
- Dynamic attributes: free/busy, resource amount
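Matching a job against this model can be sketched as a filter over static and dynamic attributes. The field names below are assumptions chosen to mirror the attributes on the slide; only a subset of them is modelled.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    os_type: str       # static attribute
    memory_mb: int     # static attribute
    busy: bool         # dynamic attribute
    free_cpus: int     # dynamic attribute ("resource amount")


@dataclass
class Request:
    os_type: str
    memory_mb: int
    cpus: int


def candidates(request, resources):
    """Return the names of resources whose attributes fit the request."""
    return [
        r.name
        for r in resources
        if r.os_type == request.os_type          # static match
        and r.memory_mb >= request.memory_mb     # static match
        and not r.busy                           # dynamic match
        and r.free_cpus >= request.cpus          # dynamic match
    ]


pool = [
    Resource("ce-a", "linux", 4096, busy=False, free_cpus=8),
    Resource("ce-b", "linux", 1024, busy=False, free_cpus=16),
    Resource("ce-c", "linux", 8192, busy=True, free_cpus=0),
    Resource("ce-d", "solaris", 8192, busy=False, free_cpus=8),
]
matches = candidates(Request("linux", 2048, 4), pool)   # ["ce-a"]
```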
Resource Release Time
Busy resources have an additional attribute, the release time, estimated from the request of the running job. Knowing the release time, the scheduler can plan the future usage of a busy resource.
However, the scheduler must have a guarantee that a planned global job will really start and will not be left waiting in a local queue.
[Chart: resource versus time, showing running jobs]
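With release times known, the earliest moment at which enough resources become available can be computed directly. The sketch below assumes resources are counted in CPUs and that each running job's release time is derived from its requested run time; the function name is hypothetical.

```python
def earliest_start(now, needed_cpus, free_cpus, running):
    """Earliest time at which `needed_cpus` CPUs are available.

    running: list of (release_time, cpus) for jobs occupying resources.
    Returns None if the installation never has enough CPUs.
    """
    available = free_cpus
    if available >= needed_cpus:
        return now
    # Walk through releases in time order, accumulating freed CPUs.
    for release_time, cpus in sorted(running):
        available += cpus
        if available >= needed_cpus:
            return max(now, release_time)
    return None


running = [(300, 4), (100, 2), (200, 2)]
t = earliest_start(now=0, needed_cpus=6, free_cpus=2, running=running)   # 200
```

This estimate ignores local jobs that may claim the released resources first, which is exactly the guarantee problem stated above.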
Question 4: How Should the Interaction of the Global Scheduler and Local Resource Monitor Be Organized?
Autonomy of a computing element: each computing element of the Grid belongs to an owner who may restrict access for external jobs completely or partly.
If two jobs, local and global, ask for free resources, which one should be preferred?
If global and local jobs demand the same resources, their priorities are compared. For this purpose, each computing element i defines a function LPi() that calculates the local priority of a global job. This function depends on the job's price, consumable resources, and run time:
LPi = LPi(price, consumable resources, run time)
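Because each owner defines LPi independently, the same global job can win on one computing element and lose on another. The sketch below illustrates this with a hypothetical LPi form (a weighted price per CPU-second minus a bias toward local jobs); the slides do not specify the actual function.

```python
def make_lp(price_weight, local_bias):
    """Build a hypothetical LP_i for one computing element."""
    def lp(price, cpus, run_time):
        return price_weight * price / (cpus * run_time) - local_bias
    return lp


# Each owner configures its own function (names and values are assumed).
lp_by_element = {
    "ce-open": make_lp(price_weight=1.0, local_bias=0.0),
    "ce-guarded": make_lp(price_weight=0.5, local_bias=0.1),
}


def global_job_wins(element, global_job, local_priority):
    """True if the global job outranks a competing local job on `element`."""
    price, cpus, run_time = global_job
    return lp_by_element[element](price, cpus, run_time) > local_priority


job = (20, 2, 50)   # price, CPUs, run time
on_open = global_job_wins("ce-open", job, local_priority=0.15)        # True
on_guarded = global_job_wins("ce-guarded", job, local_priority=0.15)  # False
```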
The global scheduler should distribute its jobs so that global jobs do not hold back the start of any more "expensive" local job.
[Diagram: resource versus time with running jobs. A global job jobG from the global queue has priority PG = LP(jobG); a local job jobL in the local queue has priority PL; the case shown is PG < PL.]
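The rule above can be checked per resource before a global job is started. The sketch assumes a single shared resource and that the planned start times of waiting local jobs are known; the function and parameter names are hypothetical.

```python
def may_start(global_start, global_run_time, p_global, planned_local):
    """Check that a global job does not hold back a more expensive local job.

    planned_local: list of (planned_start, priority) for local jobs that
    need the same resource and are planned at or after `global_start`.
    The global job occupies the resource during
    [global_start, global_start + global_run_time).
    """
    global_end = global_start + global_run_time
    for planned_start, p_local in planned_local:
        collides = planned_start < global_end
        if collides and p_global < p_local:
            return False       # P_G < P_L: the local job must not be delayed
    return True


short_job = may_start(0, 100, 0.2, [(150, 0.9)])      # True: done before 150
long_job = may_start(0, 200, 0.2, [(150, 0.9)])       # False: delays local job
pricey_job = may_start(0, 200, 0.95, [(150, 0.9)])    # True: P_G >= P_L
```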
Schedule
The local schedule is the plan of resource occupation by local jobs for some period of time in the future.
Local schedule: for each local job
{priority, assigned resources, occupation and release time}
[Chart: resource versus future time, showing running jobs followed by planned jobs labeled priority1 to priority4]
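A local schedule entry as described above maps naturally onto a small record, and the resources left free in a time window follow from the entries. This is a sketch under the assumption that resources are named nodes; all identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduleEntry:
    job_id: str
    priority: float
    resources: frozenset      # assigned resources, e.g. node names
    occupation_time: float
    release_time: float


def free_resources(all_resources, schedule, start, end):
    """Resources not claimed by any local job during [start, end)."""
    taken = set()
    for entry in schedule:
        # Two half-open intervals overlap if each starts before the other ends.
        if entry.occupation_time < end and start < entry.release_time:
            taken |= entry.resources
    return set(all_resources) - taken


nodes = {"n1", "n2", "n3"}
local_schedule = [
    ScheduleEntry("l1", 0.5, frozenset({"n1"}), 0, 100),
    ScheduleEntry("l2", 0.3, frozenset({"n2", "n3"}), 50, 150),
]
window = free_resources(nodes, local_schedule, 100, 150)   # {"n1"}
```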
The local schedule is drawn up by special agents of the global scheduler. These agents, working on each computing installation, arrange the schedule in precise conformity with the scheduling strategy and configuration parameters of the local monitor.
The current state of all local schedules is delivered to the information base of the global scheduler, which thus has information about the usage plan of all virtual organization resources.
On the basis of this aggregate schedule, the scheduler can lay out the allocation of global jobs to resources.
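Placement on the aggregate schedule can be sketched as a search for the earliest feasible start across all computing elements. The data layout below (CPU totals plus `(start, end, cpus)` entries per element) and all names are assumptions; the slides do not describe the actual algorithm.

```python
def usage_at(entries, t):
    """CPUs planned as occupied at time t."""
    return sum(cpus for start, end, cpus in entries if start <= t < end)


def fits(entries, total_cpus, t, run_time, cpus):
    """True if `cpus` CPUs stay free during the whole window [t, t + run_time)."""
    # Usage only rises at entry starts, so checking t and those points suffices.
    points = [t] + [s for s, _, _ in entries if t < s < t + run_time]
    return all(usage_at(entries, p) + cpus <= total_cpus for p in points)


def place(aggregate, now, run_time, cpus):
    """Return (start_time, element) with the earliest feasible start, or None.

    aggregate: {element: (total_cpus, [(start, end, cpus), ...])}
    """
    best = None
    for element, (total_cpus, entries) in aggregate.items():
        # A feasible start is either `now` or a moment when some entry ends.
        starts = sorted({now} | {end for _, end, _ in entries if end > now})
        for t in starts:
            if fits(entries, total_cpus, t, run_time, cpus):
                if best is None or t < best[0]:
                    best = (t, element)
                break
    return best


aggregate = {
    "ce-a": (4, [(0, 100, 4)]),
    "ce-b": (8, [(0, 50, 6), (50, 200, 4)]),
}
slot = place(aggregate, now=0, run_time=60, cpus=4)   # (50, "ce-b")
```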
Program architecture of scheduling
[Diagram. Labels: Scheduler; global queue of jobs; Data Base; Agents, each paired with an LRM; Queue.]
The global scheduler, implementing a certain scheduling strategy, makes up the global schedule.
The information base resides next to the scheduler and stores the aggregate schedule. For data management, a distributed system like Spitfire of the DataGrid project, with a relational database as its core, is considered.
The local agents of the scheduler work on each computing element. Interacting with the local resource monitor, an agent arranges the local schedule of its computing element and transfers updates to the global scheduler. The proposed implementation is based on the Maui scheduler.
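The agent-to-information-base path can be illustrated with a relational table that holds one row per planned local job. The sketch uses Python's built-in `sqlite3` purely as a stand-in for a relational core; it is not Spitfire or Maui, and the table layout and function names are assumptions.

```python
import sqlite3


def open_information_base():
    """Create an in-memory stand-in for the scheduler's information base."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE schedule ("
        " element TEXT, job_id TEXT, priority REAL,"
        " cpus INTEGER, occupation_time REAL, release_time REAL,"
        " PRIMARY KEY (element, job_id))"
    )
    return db


def push_update(db, element, entries):
    """Agent side: replace the stored local schedule of one computing element.

    entries: list of (job_id, priority, cpus, occupation_time, release_time).
    """
    with db:   # one transaction per update
        db.execute("DELETE FROM schedule WHERE element = ?", (element,))
        db.executemany(
            "INSERT INTO schedule VALUES (?, ?, ?, ?, ?, ?)",
            [(element, *entry) for entry in entries],
        )


def busy_cpus(db, element, t):
    """Scheduler side: CPUs planned as occupied on `element` at time t."""
    row = db.execute(
        "SELECT COALESCE(SUM(cpus), 0) FROM schedule"
        " WHERE element = ? AND occupation_time <= ? AND ? < release_time",
        (element, t, t),
    ).fetchone()
    return row[0]


db = open_information_base()
push_update(db, "ce-a", [("l1", 0.5, 2, 0, 100), ("l2", 0.3, 4, 50, 150)])
before = busy_cpus(db, "ce-a", 60)   # 6
push_update(db, "ce-a", [("l2", 0.3, 4, 50, 150)])   # l1 left the schedule
after = busy_cpus(db, "ce-a", 60)    # 4
```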
Future directions:
- Backfill algorithm implementation at the global level to avoid blocking of jobs.
- Advanced resource reservation for distributed multiprocessor jobs.
- Economic model of the virtual organization as applied to scheduling.
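The first direction, backfilling, can be sketched in its simplest form: jobs behind a blocked head job may start only if they fit into the free CPUs and finish before the head job's planned start. This is a simplified illustration with hypothetical names, not the authors' planned implementation.

```python
def backfill(queue, free_cpus, now, head_start):
    """Pick jobs that can run now without delaying the blocked head job.

    queue: list of (job_id, cpus, run_time) in priority order; queue[0] is
    the head job, which cannot start before `head_start`.
    """
    started = []
    for job_id, cpus, run_time in queue[1:]:
        if cpus <= free_cpus and now + run_time <= head_start:
            free_cpus -= cpus
            started.append(job_id)
    return started


waiting = [("big", 8, 500), ("s1", 2, 50), ("s2", 2, 200), ("s3", 1, 100)]
filled = backfill(waiting, free_cpus=3, now=0, head_start=100)   # ["s1", "s3"]
```

Here `s2` is skipped because only one CPU remains after `s1` starts, and it would in any case run past the head job's start.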