
Online Scheduling in Grids

1 Uwe Schwiegelshohn, 2 Andrei Tchernykh, 1 Ramin Yahyapour

1 Technische Universität Dortmund, Germany
[email protected], [email protected]

2 CICESE Research Center, Ensenada, Baja California, Mexico
CICESE Parallel Computing Laboratory
[email protected]

CIRM-Marseille-Luminy, May 12 - 16, 2008

Computational Grid

[Figure: illustration of a computational Grid, by Christophe Jacquet]

Grid Model


• An encompassing and precise representation of a Grid is usually too complex to address various problems occurring in Grids.

• We therefore apply a suitable model that considers the important properties of a Grid.

Important properties | Unconsidered properties
Heterogeneity of Grid (machines with different numbers of identical processors) | Heterogeneity of processors
Fixed job parallelism (rigid jobs) | Variable parallelism
Non-clairvoyant scheduling | Estimation of processing times
Online operation | Advance reservation, multi-site allocation
Utilization (makespan) | User priorities

Grid Model


• The Grid contains m machines. Machine Mi has size mi if it comprises mi processors. All processors in the Grid are identical.

• Each job Jj is described by a triple {rj, sizej, pj}:
– release date rj ≥ 0,
– size 1 ≤ sizej ≤ mm (degree of parallelism),
– execution time pj on sizej processors.

• Job Jj must be executed on sizej processors on one machine without interruption (space sharing mode).

GPm | sizej | Cmax

Scheduling on a single parallel machine, Pm | rj, sizej | Cmax, is referred to as PS, while scheduling on a set of parallel machines, GPm | rj, sizej | Cmax, is referred to as MPS.
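The job and Grid model above can be written down as a small data structure. The following is a minimal sketch, not taken from the slides: the names `Job` and `makespan_lower_bound` are hypothetical, and the lower bound shown (total work divided by total processors, and the latest release date plus execution time) is a standard observation added only to illustrate how the triple is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    release: int   # release date r_j >= 0
    size: int      # degree of parallelism size_j (rigid job)
    runtime: int   # execution time p_j on size_j processors


def makespan_lower_bound(jobs, machine_sizes):
    """Simple lower bound on the optimal makespan Cmax* (illustrative only)."""
    largest = max(machine_sizes)
    for job in jobs:
        # A job must fit on one machine: 1 <= size_j <= size of largest machine.
        if not 1 <= job.size <= largest:
            raise ValueError(f"job {job} does not fit on any machine")
    # Area bound: all work must be packed into the available processors.
    total_work = sum(job.size * job.runtime for job in jobs)
    area_bound = total_work / sum(machine_sizes)
    # Length bound: no job can finish before r_j + p_j.
    length_bound = max(job.release + job.runtime for job in jobs)
    return max(area_bound, length_bound)
```

For example, two jobs `Job(0, 2, 3)` and `Job(1, 4, 2)` on machines of sizes 2 and 4 cannot finish before time 3.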

List Scheduling

[Figure: two schedules of the same jobs under non-clairvoyant scheduling (axes: processors, time). The list schedule has Cmax(LIST) = 17; the optimal schedule has Cmax* = 9.]


List Scheduling

Cmax(LIST)/Cmax* ≤ 2 − 1/m

– All jobs are sequential and have release date 0.
  • Graham 1966
– Jobs have release date 0 and may be parallel.
  • Garey, Graham 1975
– Jobs are parallel and submitted over time (online scheduling).
  • Naroska, Schwiegelshohn 2002
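To make the list scheduling rule concrete, here is a minimal sketch for rigid jobs on a single parallel machine with m processors, all released at time 0. It is an illustration under stated assumptions, not the authors' implementation: the function name `list_schedule` and the `(size, runtime)` tuple layout are hypothetical. The scheduler is non-clairvoyant in the sense that the runtime is only used to simulate when a job completes, never to decide which job to start.

```python
import heapq


def list_schedule(jobs, m):
    """Greedy list scheduling of rigid jobs on one machine with m processors.

    jobs: list of (size, runtime) tuples in list order, all released at time 0.
    Returns the makespan of the resulting schedule.
    """
    waiting = list(jobs)
    running = []          # min-heap of (finish_time, size)
    free = m
    now = 0
    makespan = 0
    while waiting or running:
        # Scan the list in order and start every job that currently fits.
        still_waiting = []
        for size, runtime in waiting:
            if size <= free:
                free -= size
                heapq.heappush(running, (now + runtime, size))
                makespan = max(makespan, now + runtime)
            else:
                still_waiting.append((size, runtime))
        waiting = still_waiting
        if not running:
            # Nothing runs and nothing could be started: a job exceeds m.
            raise ValueError("a job requires more than m processors")
        # Advance time to the next completion and release its processors.
        now, size = heapq.heappop(running)
        free += size
    return makespan
```

Because the scan starts any job that fits, no processors stay idle while a fitting job is waiting, which is the property the 2 − 1/m analysis relies on.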

Does the same bound hold for Grids as well?


List Scheduling on Parallel Processors


Applicability to Grids

There is no polynomial time algorithm that always produces schedules S with

Cmax(S)/Cmax* < 2

for GPm | sizej | Cmax and all input data unless P = NP.


• 2 machines with m processors each
• All jobs have processing time 1 and different degrees of parallelism
– Total requirement of all jobs: 2m processors
• Consider an arbitrary algorithm A.

[Figure: schedules on machine 1 and machine 2 for the three cases below.]

– Cmax(A) = 1 and Cmax* = 1: A produces an optimal solution.
– Cmax(A) = 2 and Cmax* = 2: A produces an optimal solution.
– Cmax(A) = 2 and Cmax* = 1: A misses the optimal solution by a factor of 2.


How do we know whether Cmax* = 2 applies?

– Partition: NP-hard
– There is no algorithm A with polynomial time complexity guaranteeing Cmax(A)/Cmax* < 2, unless P = NP.

Scheduling in Grids is inherently more difficult than scheduling on a single parallel processor.
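The link to Partition can be illustrated with a small sketch. For the instance above (two machines with m processors each, unit-time jobs with total size 2m), Cmax* = 1 holds exactly when some subset of the job sizes sums to m. The hypothetical helper `fits_in_one_step` below checks this with a subset-sum bitset; its running time grows with the numeric value of m, so it is pseudo-polynomial and does not contradict the NP-hardness argument.

```python
def fits_in_one_step(sizes, m):
    """Return True if unit-time jobs with the given sizes (total 2*m) can all
    run simultaneously on two machines with m processors each."""
    if sum(sizes) != 2 * m:
        raise ValueError("total requirement must be exactly 2*m processors")
    # Bit k of `reachable` is set if some subset of the jobs seen so far
    # has total size k.
    reachable = 1
    for size in sizes:
        reachable |= reachable << size
    # A subset of total size m fills machine 1; the rest fills machine 2.
    return (reachable >> m) & 1 == 1
```

With sizes 3, 3, 2, 2 and m = 5 the jobs split into 3 + 2 per machine, so one time step suffices; with sizes 4, 4, 2 no subset sums to 5 and at least two time steps are needed.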



List Scheduling in the Grid

[Figure: two schedules of the same jobs on machines with different numbers of processors, plotted over time. The list schedule has Cmax(LIST) = 4; the optimal schedule has Cmax* = 2.]

• Cmax(LIST)/Cmax* = (k+1)/2

• Analysis of the problem
– Jobs with little parallelism occupy large machines which are not available for highly parallel jobs.
– In case of few highly parallel jobs it is inefficient to prevent jobs with little parallelism from using these large machines.
• Simple approach
– Increased priority for highly parallel jobs
– Arranging jobs in descending order of their parallelism
• Fairness is neglected.


Problems of List Scheduling

Sorting in Order of Parallelism

[Figure: schedule with jobs sorted in order of parallelism (axes: processors, time), annotated "Predominantly execution of sequential jobs" and "Few available processors for parallel jobs".]


Does Ordering the Jobs Help?

• We are interested in an algorithm that does not use a single list of jobs.
– Some machines are blocked from executing some jobs under certain circumstances.


Online Job Stealing Scheduling in Grids


• We assume a machine indexing such that mi−1 ≤ mi holds.

Three sets of jobs are considered:

o Set Ai contains all jobs that cannot execute on the previous (next smaller) machine and require more than 50% of the processors of machine Mi.

o Set Bi contains all jobs that cannot execute on the previous machine but require at most 50% of the processors of machine Mi.

o Set Hi contains all jobs that require more than 50% of the processors of machine Mi but can also be executed on the previous machine.
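The three sets can be expressed as a small classification function. This is a sketch under assumptions: the name `classify` is hypothetical, machines are indexed from 0 in ascending order of size, and the machine before the first one is treated as having 0 processors, which the slides do not state explicitly.

```python
def classify(size, i, machine_sizes):
    """Classify a job of the given size relative to machine i (0-based).

    machine_sizes must be sorted in ascending order.
    Returns 'A', 'B', 'H', or None if the job belongs to none of the sets.
    """
    m_i = machine_sizes[i]
    # Assumption: the machine before the first one has no processors.
    m_prev = machine_sizes[i - 1] if i > 0 else 0
    if size > m_i:
        return None                      # does not fit on machine i at all
    fits_previous = size <= m_prev
    more_than_half = 2 * size > m_i      # strictly more than 50% of m_i
    if not fits_previous:
        return "A" if more_than_half else "B"
    return "H" if more_than_half else None
```

A job that fits on the previous machine and needs at most half of machine Mi falls into none of the three sets for that machine.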


Grid Scheduling Algorithm

1. The machines are arranged in ascending order of processor numbers.

2. A job is assigned to the first machine that can execute it.
   Group A: >= half of the processors on this machine are required.
   Group B: < half of the processors on this machine are required.

3. Any machine applies a priority order when selecting jobs for execution:
   – Jobs of its group A
   – Jobs of its group B
   – Jobs that are enabled for execution on its previous machine.
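The selection rule of step 3 could be sketched as follows. This is only an illustration, not the authors' implementation: `first_machine` and `select_job` are hypothetical names, the group threshold follows the wording of this slide (at least half of the processors), and "enabled for execution on its previous machine" is interpreted here as any job whose first machine is a smaller one.

```python
def first_machine(size, machine_sizes):
    """Index of the first (smallest) machine that can execute the job."""
    for i, m_i in enumerate(machine_sizes):
        if size <= m_i:
            return i
    raise ValueError("job does not fit on any machine")


def select_job(i, free, waiting, machine_sizes):
    """Pick the position in `waiting` of the job machine i should start next.

    waiting: list of job sizes; free: idle processors on machine i.
    Returns None if no waiting job fits into the idle processors.
    """
    best = None  # (rank, position)
    for pos, size in enumerate(waiting):
        if size > free:
            continue
        if first_machine(size, machine_sizes) == i:
            # Own jobs: group A (rank 0) before group B (rank 1).
            rank = 0 if 2 * size >= machine_sizes[i] else 1
        else:
            # Job assigned to a smaller machine (interpretation, see lead-in).
            rank = 2
        if best is None or rank < best[0]:
            best = (rank, pos)
    return None if best is None else best[1]
```

Ties within a rank are broken by list position, so the earliest submitted job of the highest-priority group is started first.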


Grid Scheduling Algorithm

• Theoretical evaluation

– Cmax(LIST)/Cmax* < 3 in the offline case

– Cmax(LIST)/Cmax* < 5 in the online case

U. Schwiegelshohn, A. Tchernykh, R. Yahyapour.
Online Scheduling in Grids. IEEE, IPDPS’08, 2008


Performance of the Algorithm

Conclusion

• Common list scheduling does not work well in Grids.
• Jobs should receive priority on the machines that provide the right amount of parallelism.
• Jobs with less parallelism are only executed on these machines if better suited jobs are not available.
• The presented algorithm has a constant worst case bound and relatively small gap.


Adaptive Admissible Allocation

Two Level Grid Model


[Figure: two-level Grid model. The Grid workload enters a broker that performs the allocation; each node has its own local queue and local scheduler.]

We regard MPS as two-stage (two-layer) scheduling: MPS = MPS_Allocation + PS.

Allocation


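For each job Jj:

Let first(Jj) be the minimum i such that node i is able to execute job Jj, and let last(Jj) be the maximum such i.

The set of nodes first(Jj), first(Jj)+1, . . . , last(Jj) is the set M-available.

[Figure: machines m1, m2, m3, m4, m5, …, mm with first(Jj) = 2 and last(Jj) = m; the range from first(Jj) to last(Jj) is marked M-available.]

[Figure: the same machines with first(Jj) = 2; M-available extends to last(Jj) = m, while M-admissible ends at last(Jj) = 5.]

If last(Jj) is the minimum r such that

$$\sum_{i=first(J_j)}^{r} m_i \;\ge\; a \sum_{i=first(J_j)}^{m} m_i$$

the set of nodes first(Jj), . . . , last(Jj) is the set M-admissible.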
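The admissible range can be computed directly from the machine sizes. The sketch below assumes the condition reads "the smallest r for which machines first..r hold at least a fraction a of the processors of machines first..m"; the function name `admissible_range` and the 0-based indexing are choices made for this illustration only.

```python
def admissible_range(size, machine_sizes, a):
    """Return (first, last) as 0-based indices of the admissible machines.

    machine_sizes must be sorted in ascending order; a is the admissible
    factor with 0 < a <= 1.
    """
    if not 0 < a <= 1:
        raise ValueError("a must satisfy 0 < a <= 1")
    # first: smallest machine able to execute the job.
    first = None
    for i, m_i in enumerate(machine_sizes):
        if size <= m_i:
            first = i
            break
    if first is None:
        raise ValueError("job does not fit on any machine")
    # Processors of all available machines first..m.
    available = sum(machine_sizes[first:])
    # last: minimum r whose prefix sum reaches the fraction a of `available`.
    accumulated = 0
    for r in range(first, len(machine_sizes)):
        accumulated += machine_sizes[r]
        if accumulated >= a * available:
            return first, r
    # Only reachable through floating-point rounding; fall back to all machines.
    return first, len(machine_sizes) - 1
```

With a = 1 the admissible set equals the available set; smaller values of a restrict a job to the machines closest in size to its own requirement.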

Allocation


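[Figure: machine index line with marks 1, f0, f, l0, l, m. The range from f to m is split into a*m(f,m) and (1-a)*m(f,m); the range from f0 to m is split into a*m(f0,m) and (1-a)*m(f0,m).]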


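For a set of machines with identical processors, and for a set of rigid jobs with admissible range $0 < a \le 1$, the competitive factor of Min_LB-a + Best_PS is

$$
\rho_{\mathrm{Min\_LB\text{-}}a + \mathrm{Best\_PS}} \le
\begin{cases}
1 + \dfrac{1}{a^2} - \dfrac{1}{m} & \text{for } a \le \dfrac{m(f,m)}{m(f_0,m)} \\[2ex]
1 + \dfrac{1}{a(1-a)} - \dfrac{1}{m} & \text{for } a > \dfrac{m(f,m)}{m(f_0,m)}
\end{cases}
$$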


[Figures: two plots titled "Competitive factor", each showing the competitive factor as a function of a (axis ticks 10 to 100 and 2 to 20).]

A. Tchernykh, U. Schwiegelshohn, R. Yahyapour, N. Kuzurin.
Online Hierarchical Job Scheduling in Grids. IEEE, CoreGrid’08, EuroPar, 2008

Thank you
