
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 4, July 2010

LDCP+: An Optimal Algorithm for Static Task Scheduling in Grid Systems

Negin Rzavi
Islamic Azad University, Science and Research Branch, Tehran, Iran

Safieh Siadat
Tehran, Iran

Amir Masoud Rahmani
Tehran, Iran

Abstract: After a computational job is designed and realized as a set of tasks, an optimal assignment of these tasks to the processing elements of a given architecture needs to be determined. In a Grid system, with heterogeneous processing elements and data transfer time between them, determining an assignment of tasks to processing elements that optimizes performance and efficiency is very important. In this paper a heuristic algorithm named LDCP+ is presented, which improves on the Longest Dynamic Critical Path (LDCP) algorithm presented by Mohammad I. Daoud and Nawwaf Kharma in 2007. LDCP+ is a list-based algorithm: it assigns each task a priority for its execution. Its basic features are task duplication, the use of idle time on processing elements, and an optimized version of the priority assignment method used in LDCP. Because LDCP is executable only under the assumption that the computation costs of tasks are monotonic, the algorithm presented in this paper frees the scheduling algorithm from this restriction. In the case of non-monotonic computation costs, LDCP+ achieves the minimum total finish time in comparison with other scheduling algorithms such as HEFT and CPOP.

Keywords: Grid; Static task scheduling; Longest Dynamic Critical Path.

I. INTRODUCTION

A Grid system is a group of connected computers that is able to execute parallel programs via a high-speed interconnection. The efficiency of program parallelism in Grid systems depends on the methods used to schedule tasks on the available processing elements. The interconnection of processing elements in a Grid causes an overhead when two tasks assigned to processing elements of distinct computers transfer data. In fact, task scheduling in distributed heterogeneous systems is more complex, because each task can have a different execution time on different processing elements. Scheduling algorithms for a Grid system should therefore consider the execution time of each task on the different processing elements; even one incorrect decision can restrict the system performance to that of the slowest processing element [2].

There are two kinds of scheduling algorithms: static and dynamic. In static scheduling algorithms, all information needed for scheduling, such as the structure of the parallel application, the execution times of individual tasks and the communication costs between tasks, must be known in advance. In contrast, this information is unknown in dynamic task scheduling algorithms.

Among the different scheduling algorithms, HEFT is a scheduling algorithm for heterogeneous distributed computing systems that consists of two phases: first, cost computation for each task and task selection; second, processor selection. In the task selection phase the algorithm sets the computation costs of tasks to their mean values, which may limit the ability of the scheduling algorithm to compute task priorities precisely. The CPOP algorithm has the same two phases as HEFT, but uses different strategies for assigning priorities to tasks and for selecting processors. These two algorithms have been reported as among the best algorithms with respect to total finish time.

In this paper we present a heuristic list-based algorithm called LDCP+ (an optimized version of the Longest Dynamic Critical Path algorithm) for static task scheduling in Grid systems with a limited number of processors, and we compare our scheduling results with those of other algorithms such as CPOP, HEFT and LDCP for performance evaluation.

II. RELATED WORKS

Static task scheduling for Grid systems is, in general, known to be an NP-complete problem [4, 7, 9], and most scheduling algorithms for it are heuristic [1, 2, 3, 4, 7]. One of the most important classes of heuristic algorithms is list-based algorithms [6]. In such algorithms each task is assigned a priority, and three steps (task selection, processor selection and status update) are repeated until all tasks are scheduled. In the task selection phase, the unscheduled task with the highest priority is selected. In the processor selection phase, the selected task is assigned to the processor that minimizes a predefined cost criterion, such as the schedule length. Finally, in the status update phase, the status of the system is updated. Examples of list-based algorithms are Heterogeneous Earliest Finish Time (HEFT) [9], Critical Path on a Processor (CPOP) [9], Critical Path on a Cluster (CPOC) [5], Dynamic Level Scheduling (DLS) [8], Modified Critical Path (MCP) [10], Mapping Heuristic (MH) [3], Dynamic Critical Path (DCP), and Longest Dynamic Critical Path (LDCP) [2].
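The three repeated steps of a list-based scheduler can be illustrated with a minimal Python sketch. It is not any of the cited algorithms: the priorities are assumed to be given, the cost criterion is assumed to be the finish time of the selected task, and the function name `list_schedule` and its argument layout are hypothetical.

```python
from typing import Dict, List, Tuple

def list_schedule(priority: Dict[str, float],
                  parents: Dict[str, List[str]],
                  cost: Dict[str, List[float]],
                  comm: Dict[Tuple[str, str], float]) -> Dict[str, Tuple[int, float, float]]:
    """Generic list scheduling; returns task -> (processor, start, finish)."""
    num_procs = len(next(iter(cost.values())))
    proc_free = [0.0] * num_procs            # time at which each processor becomes free
    placed: Dict[str, Tuple[int, float, float]] = {}
    unscheduled = set(priority)
    while unscheduled:
        # Task selection: highest-priority task whose parents are all scheduled.
        ready = [t for t in unscheduled
                 if all(p in placed for p in parents.get(t, []))]
        task = max(ready, key=lambda t: priority[t])
        # Processor selection: minimise the finish time of the selected task.
        best = None
        for k in range(num_procs):
            data_ready = 0.0
            for p in parents.get(task, []):
                p_proc, _, p_finish = placed[p]
                delay = 0.0 if p_proc == k else comm.get((p, task), 0.0)
                data_ready = max(data_ready, p_finish + delay)
            start = max(proc_free[k], data_ready)
            finish = start + cost[task][k]
            if best is None or finish < best[2]:
                best = (k, start, finish)
        # Status update: record the placement and advance the processor clock.
        placed[task] = best
        proc_free[best[0]] = best[2]
        unscheduled.remove(task)
    return placed
```

The sketch appends tasks at the end of each processor's timeline only; idle-space insertion and duplication, which LDCP+ adds, are left out.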

III. PROBLEM DEFINITION

In static task scheduling in a Grid system, the execution precedence between tasks is represented by a Directed Acyclic Graph (DAG). Each DAG is described by a tuple $(T, E)$, where $T$ is a set of $n$ tasks and $E$ is a set of $e$ edges. Each $t_i \in T$ represents a task, and each edge $e_{i,j} = (t_i, t_j) \in E$ represents the execution precedence between the two tasks connected by the edge $e_{i,j}$.

If $(t_i, t_j) \in E$, then the execution of task $t_j \in T$ cannot start before task $t_i$ finishes its execution. For the edge $(t_i, t_j)$, the source task $t_i$ is the parent of the sink task $t_j$, while $t_j$ is a child of $t_i$. A task with no parents is called an entry task and a task with no children is called an exit task. Associated with each edge $(t_i, t_j)$ is a value $d_{i,j}$ that represents the amount of data to be transmitted from task $t_i$ to task $t_j$; in some cases it also represents the minimum time that task $t_j$ must wait before starting after task $t_i$ finishes its execution.

A Grid system is represented by a set $P$ of $m$ processors, a set $T$ of $n$ tasks and an $n \times m$ computation cost matrix $W_{n \times m}$. Each element $w_{i,k} \in W$, with $i$ ranging over the $n$ tasks and $k$ over the $m$ processors, represents the execution time (computation cost) of task $t_i$ on processor $p_k$. We make the same assumption as LDCP that all processors are fully connected and that communications between processors occur via independent communication units [2], so task execution and data transfer can proceed in parallel. As in LDCP, the data transfer rate between any two processors on the network is assumed to be fixed and constant. The communication cost between tasks is represented by an $n \times n$ matrix $D_{n \times n}$. An element $d_{i,j} \in D$ is zero if the two tasks $t_i$ and $t_j$ are scheduled on the same processor, and is equal to the (non-zero) communication cost otherwise. A task can start its execution on a processor only when all data from its parents have become available to that processor. The goal of our algorithm is to assign tasks to processors in a way that minimizes the total finish time, or the schedule length.
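The problem model above can be captured in a small data structure. The following Python sketch is only an illustration: the class name `SchedulingProblem` and its methods are hypothetical, `W` follows the cost matrix of Fig. 1(b) as far as it can be read, and the edge costs in `D` are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class SchedulingProblem:
    W: List[List[float]]               # W[i][k]: cost of task t_i on processor p_k
    D: Dict[Tuple[int, int], float]    # D[(i, j)]: communication cost of edge (t_i, t_j)

    def parents(self, j: int) -> List[int]:
        return sorted(i for (i, jj) in self.D if jj == j)

    def children(self, i: int) -> List[int]:
        return sorted(j for (ii, j) in self.D if ii == i)

    def entry_tasks(self) -> List[int]:
        # Tasks with no parents.
        return [t for t in range(len(self.W)) if not self.parents(t)]

    def exit_tasks(self) -> List[int]:
        # Tasks with no children.
        return [t for t in range(len(self.W)) if not self.children(t)]

    def comm_cost(self, i: int, j: int, proc_i: int, proc_j: int) -> float:
        # The cost is zero when both tasks run on the same processor.
        return 0.0 if proc_i == proc_j else float(self.D[(i, j)])

# Four tasks on two processors; edge costs are assumed values.
problem = SchedulingProblem(
    W=[[6, 5], [6, 4], [9, 8], [7, 3]],
    D={(0, 1): 3, (0, 2): 2, (1, 3): 1, (2, 3): 5},
)
```

Storing only the existing edges in `D` keeps the sparse structure of the DAG explicit, while the same-processor rule is applied when a cost is looked up.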

A. Definition 1

Schedule Length: The maximum execution time over the processors, that is, the finish time of the final task after task scheduling, is called the schedule length. A DAG and a computation cost matrix for two processors are shown in Fig. 1. The schedule length is computed in Fig. 2 and is equal to 23.
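Definition 1 amounts to taking the latest finish time over all scheduled tasks. In the Python sketch below, the function name and the schedule layout (task mapped to processor, start and finish) are assumptions, and the example schedule is hypothetical; it was chosen so that its length equals the value 23 quoted above.

```python
def schedule_length(schedule):
    """schedule: task -> (processor, start, finish); returns the latest finish time."""
    return max(finish for (_proc, _start, finish) in schedule.values())

# Hypothetical two-processor schedule of four tasks.
example = {
    "t0": (0, 0, 6),
    "t2": (0, 6, 15),
    "t1": (1, 9, 13),
    "t3": (1, 20, 23),
}
length = schedule_length(example)
```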

Figure 1. An example of (a) a DAG and (b) a computation cost matrix

Figure 2. Schedule length of the DAG presented in Fig. 1 on two processors

Figure 3. Task computation times on each processor, obtained from the cost matrix in Fig. 1

Assigning task priorities: In a Grid system, the efficiency of list-based scheduling algorithms depends on the methods that assign priorities to tasks.

In our suggested algorithm LDCP+, if selecting a task in one step of scheduling leads to the minimum schedule length, we assign a high priority to that task. LDCP+ builds on some basic definitions used in the LDCP algorithm; because LDCP+ is the result of optimizing LDCP, we present this basic knowledge as well.

B. Definition 2

Critical Path: For a given DAG, the Critical Path (CP) is defined as the path from an entry task to an exit task for which the sum of the computation costs of tasks and the communication costs of edges is maximal.
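The critical path of Definition 2 can be found with a longest-path search over the DAG. The Python sketch below assumes one computation cost per task (for instance its cost on one processor); the function name `critical_path` is hypothetical, and the edge costs in the example are assumed values.

```python
from functools import lru_cache

def critical_path(cost, edges):
    """cost[i]: computation cost of task i; edges[(i, j)]: communication cost.

    Returns (length, path) of the longest entry-to-exit path.
    """
    succ = {}
    for (i, j), c in edges.items():
        succ.setdefault(i, []).append((j, c))
    has_parent = {j for (_, j) in edges}

    @lru_cache(maxsize=None)
    def longest_from(i):
        # Longest path that starts at task i and ends at an exit task.
        best_len, best_path = cost[i], (i,)
        for j, c in succ.get(i, []):
            sub_len, sub_path = longest_from(j)
            if cost[i] + c + sub_len > best_len:
                best_len, best_path = cost[i] + c + sub_len, (i,) + sub_path
        return best_len, best_path

    entries = [i for i in range(len(cost)) if i not in has_parent]
    return max(longest_from(i) for i in entries)

# Costs of t0..t3 on one processor; edge costs are assumptions.
length, path = critical_path(
    [6, 6, 9, 7],
    {(0, 1): 3, (0, 2): 2, (1, 3): 1, (2, 3): 5},
)
```

With these assumed values the longest path is t0, t2, t3 with length 29.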

IV. LDCP: LONGEST DYNAMIC CRITICAL PATH

A. Definition 3

Longest Dynamic Critical Path: Given a DAG with $n$ tasks and $e$ edges and a Grid system with $m$ processors, a Dynamic Critical Path (DCP) during a particular scheduling step is a path of tasks and edges from an entry task to an exit task.

The LDCP is the longest DCP, considering that communication costs between tasks scheduled on the same processor are assumed to be zero and that the execution constraints are preserved. Fig. 3 represents two dynamic critical paths. The first path, in Fig. 3(a), is composed of tasks $t_0$, $t_2$ and $t_3$ scheduled on processor $p_0$ and has a length of 29. The second DCP, in Fig. 3(b), is composed of tasks $t_0$, $t_2$ and $t_3$ scheduled on processor $p_1$ and has a length of 23. So, at the first step of scheduling, the LDCP is composed of tasks $t_0$, $t_2$ and $t_3$ and has a length of 29.

Computation cost matrix of Fig. 1(b):

Task   p0   p1
t0      6    5
t1      6    4
t2      9    8
t3      7    3

V. LDCP+: THE PROPOSED ALGORITHM

In the LDCP+ algorithm, each scheduling iteration includes the three phases below:

1. Task selection phase

2. Processor selection phase

3. Status update phase

These three phases are carried out for each task until the last input task has been selected for scheduling.

A. Task Selection Phase

LDCP+ selects a set of tasks that play the main role in determining the schedule length.

In the first step of this phase, the DAG of each processor is required for scheduling.

1) Definition 4

Directed Graph: Given a DAG with $n$ tasks and $e$ edges and a Grid system with $m$ processors ($p_0, p_1, \ldots, p_{m-1}$), $DAGP_k$ is the directed acyclic graph that corresponds to processor $p_k$. The computation cost of each task on processor $p_k$ is represented by a number on the related node of $DAGP_k$.

$DAGP_0$ is shown in Fig. 3(a) and $DAGP_1$ is shown in Fig. 3(b). These figures correspond to the DAG and the Grid system represented in Fig. 1. Throughout this paper, $t_i$ refers to the $i$-th task in the directed acyclic graph, and the node $n_i$ identifies task $t_i$ in $DAGP_k$. The number associated with this node represents the computation cost of task $t_i$ on processor $p_k$. In each $DAGP_k$, all nodes are assigned a number named UpwardRank (URank). URanks are used to determine task priorities in $DAGP_k$.

2) Definition 5

URank: The UpwardRank of the $i$-th node ($n_i$) in $DAGP_k$ is defined recursively as

$$URank_k(n_i) = w_{i,k} + \max_{n_l \in succ_k(n_i)} \{ c_k(n_i, n_l) + URank_k(n_l) \} \qquad (1)$$

where $succ_k(n_i)$ is the set of immediate successors of node $n_i$, $c_k(n_i, n_l)$ is the communication cost between nodes $n_i$ and $n_l$ in $DAGP_k$, and $w_{i,k}$ is the computation cost of $t_i$ on processor $p_k$.
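Eq. (1) can be evaluated with a memoised recursion from the exit nodes upward. The Python sketch below makes two assumptions that are not stated in the text: the maximum over an empty successor set is taken as zero, and the communication costs are the same in every $DAGP_k$ (as at the first scheduling step, before any task has been placed). The function name `uranks` is hypothetical, `W` follows the cost matrix of Fig. 1(b) as far as it can be read, and the edge costs are assumed values.

```python
def uranks(W, edges, k):
    """Return [URank_k(n_0), ..., URank_k(n_{n-1})] for DAGP_k.

    W[i][k]: computation cost of t_i on p_k; edges[(i, l)]: communication cost.
    """
    n = len(W)
    succ = {i: [] for i in range(n)}
    for (i, l), c in edges.items():
        succ[i].append((l, c))
    memo = {}

    def urank(i):
        if i not in memo:
            # Exit nodes have no successors, so the max term defaults to 0.
            tail = max((c + urank(l) for l, c in succ[i]), default=0.0)
            memo[i] = W[i][k] + tail
        return memo[i]

    return [urank(i) for i in range(n)]

W = [[6, 5], [6, 4], [9, 8], [7, 3]]
edges = {(0, 1): 3, (0, 2): 2, (1, 3): 1, (2, 3): 5}   # assumed edge costs
ranks_p0 = uranks(W, edges, 0)
ranks_p1 = uranks(W, edges, 1)
```

With these assumed edge costs the URank of the entry node is 29 in $DAGP_0$ and 23 in $DAGP_1$, which agrees with the two path lengths quoted for Fig. 3.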

3) Definition 6

URankSet: Each element of URankSet is defined as

$$\mathrm{Max} \sum_{k=0}^{m-1} \{ URank_k(n_i) \} \qquad (2)$$

where $URank_k(n_i)$ is $URank(n_i)$ in $DAGP_k$.

4) Definition 7

KeyNode: The KeyNode is the node that has the maximum URank in URankSet. The task corresponding to this node is the task selected by the scheduling algorithm.
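The selection of the KeyNode can be sketched as follows. The sketch assumes that Eq. (2) is read as taking, for each node, the largest URank over all $DAGP_k$; if a sum over the processors is intended instead, `max` in `build_urank_set` would be replaced by `sum`. Both function names are hypothetical, and the URank values in the example are illustrative.

```python
def build_urank_set(urank_per_proc, unscheduled):
    """urank_per_proc[k][i] holds URank_k(n_i); returns node -> URankSet element."""
    return {i: max(ranks[i] for ranks in urank_per_proc) for i in unscheduled}

def key_node(urank_set):
    """Node with the maximum URank in URankSet."""
    return max(urank_set, key=urank_set.get)

# Illustrative URanks of four nodes on two processors.
urank_per_proc = [[29, 14, 21, 7], [23, 8, 16, 3]]
urank_set = build_urank_set(urank_per_proc, {0, 1, 2, 3})
first = key_node(urank_set)
del urank_set[first]        # status update: remove the scheduled node
second = key_node(urank_set)
```

Here the entry node is selected first and the node with the next-largest URank is selected in the following iteration.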

5) Definition 8

KeyNodeSet: This set includes the KeyNodes that are selected for scheduling. In the first scheduling iteration it can include several tasks, but in the other iterations it has only one task.

6) Definition 9

Least Execution Time (LET): The Least Execution Time is defined as

$$\min \{ processTime(p_k) + w_{i,k} + d_{i,j} \} \qquad (3)$$

where $processTime(p_k)$ is the time at which the final scheduled task on processor $p_k$ finishes its execution, $w_{i,k}$ is the computation time of task $t_i$ on processor $p_k$, and $d_{i,j}$ is the communication time between $t_i$ and $t_j$. If both $t_i$ and $t_j$ are scheduled on processor $p_k$, then the communication cost between them is assumed to be zero.

After URankSet has been computed, the task chosen for scheduling is the task corresponding to the KeyNode in URankSet. In the first iteration, to obtain the minimum execution time on the available processing elements, if the number of entry tasks is less than or equal to the number of processors, all entry tasks are considered KeyNodes; otherwise, as many tasks as there are processors, namely those with the maximum URanks, are selected as KeyNodes and placed in KeyNodeSet. In the subsequent iterations, KeyNodeSet includes only one KeyNode (a set with one member).
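Eq. (3) can be evaluated by trying every processor in turn. The Python sketch below follows the equation literally for a task with at most one already scheduled neighbouring task; the function name `least_execution_time` and the `(processor, cost)` encoding of that neighbour are assumptions made for illustration.

```python
def least_execution_time(i, W, proc_time, neighbour=None):
    """Return (processor, LET) for task i.

    proc_time[k]: finish time of the last task scheduled on p_k.
    neighbour: optional (processor, d) pair for an already scheduled task t_j
    that exchanges data with t_i; d is the communication time d_{i,j}.
    """
    best_k, best = None, float("inf")
    for k in range(len(proc_time)):
        d = 0.0
        if neighbour is not None and neighbour[0] != k:
            d = neighbour[1]          # cost is zero on the same processor
        value = proc_time[k] + W[i][k] + d
        if value < best:
            best_k, best = k, value
    return best_k, best

W = [[7, 9]]                          # one task, two processors (illustrative)
choice_with_parent = least_execution_time(0, W, [4, 0], neighbour=(0, 3))
choice_without = least_execution_time(0, W, [4, 0])
```

In the example, the communication cost of 3 makes the busier processor the better choice; without it the idle processor wins.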

B. Processor Selection Phase

In this phase, the selected task is assigned to a processor so as to obtain the minimum schedule length in each scheduling iteration. In LDCP+, this proceeds as follows. As mentioned above, in the first scheduling iteration KeyNodeSet can contain more than one KeyNode. To improve on the LDCP algorithm, LDCP+ computes the distinct permutations of the tasks whose KeyNodes are in KeyNodeSet over the different processors, and the permutation with the minimum average execution time on the processors becomes the first assignment of tasks to processors. This average execution time is obtained from

$$\min \left\{ \frac{\sum_{k=0}^{m-1} w_{i,k}}{m} \right\} \qquad (4)$$

where $i$ indexes the tasks corresponding to the KeyNodes, $w_{i,k} \in W$ and $m$ is the number of processors. In the subsequent iterations, the only KeyNode available in KeyNodeSet is selected to be scheduled.
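The first-iteration assignment can be sketched by enumerating all placements of the key tasks on distinct processors and keeping the one with the smallest average execution time. The sketch is one possible reading of Eq. (4): the sum runs over the chosen (task, processor) pairs and is divided by $m$. The function name `best_first_assignment` and the cost values are hypothetical.

```python
from itertools import permutations

def best_first_assignment(key_tasks, W):
    """Return (assignment, average) where assignment maps task -> processor.

    Assumes len(key_tasks) <= number of processors, as in the first iteration.
    """
    m = len(W[0])
    best, best_avg = None, float("inf")
    # Each ordered selection of processors places the key tasks on distinct processors.
    for procs in permutations(range(m), len(key_tasks)):
        avg = sum(W[t][k] for t, k in zip(key_tasks, procs)) / m
        if avg < best_avg:
            best, best_avg = dict(zip(key_tasks, procs)), avg
    return best, best_avg

W = [[4, 6, 9], [3, 8, 2]]            # two entry tasks, three processors
assignment, average = best_first_assignment([0, 1], W)
```

Enumerating permutations is affordable here because the number of key tasks in the first iteration is bounded by the number of processors.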

1) Definition 10

Idle Space: On a processor, when there is a gap between the start time of a task and the end time of the previous task, that time interval is called an idle space.



2) Definition 11

Replacement Ability: A task can be placed in an idle space when the parents of that task have finished before the start time of the task. If any of its parents has been scheduled on a different processor, the time required for transferring data between processors must be taken into account.

If there is a processor with an idle space and the selected task can be placed in that space (replacement ability), that processor is selected. At the end of this phase, the LDCP+ algorithm uses the duplication process to decrease the schedule length: after the processor has been selected, if the selected task has a parent scheduled on a different processor and the selected processor has an idle space before the start time of the selected task, then the duplication process is applied in the idle space (subject to the replacement ability).
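Idle spaces and the replacement ability can be checked with two small helpers. In this Python sketch, both function names are hypothetical, the time before the first task on a processor is also counted as an idle space (an assumption), and `data_ready` stands for the time at which all parent data, including any inter-processor transfer time, is available.

```python
def idle_spaces(busy):
    """busy: (start, finish) intervals of the tasks on one processor.

    Returns the gaps between consecutive tasks as (gap_start, gap_end) pairs.
    """
    gaps = []
    prev_end = 0.0
    for start, finish in sorted(busy):
        if start > prev_end:
            gaps.append((prev_end, start))
        prev_end = max(prev_end, finish)
    return gaps

def fits_in_idle_space(gap, duration, data_ready):
    """Earliest start of the task inside the gap, or None if it does not fit."""
    start = max(gap[0], data_ready)
    return start if start + duration <= gap[1] else None

gaps = idle_spaces([(0, 6), (15, 20)])           # one gap, from 6 to 15
start = fits_in_idle_space(gaps[0], 4, 8)        # data ready at time 8
too_late = fits_in_idle_space(gaps[0], 4, 12)    # would end after the gap
```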

3) Definition 12

Duplication Process: The duplication process repeats the execution of a task on other processors.
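Whether duplicating a parent pays off can be decided by comparing two data-ready times. The Python sketch below is a simplified illustration, not the decision rule of LDCP+: it assumes that the duplicated parent can start at the beginning of the idle space (for example because it is an entry task), and the function name `data_ready_with_duplication` is hypothetical.

```python
def data_ready_with_duplication(parent_finish, comm, parent_cost_here, gap):
    """Return (data_ready_time, duplicated).

    parent_finish: finish time of the parent on its original processor.
    comm: communication cost from the parent to the selected task.
    parent_cost_here: computation cost of the parent on the selected processor.
    gap: (start, end) of an idle space on the selected processor.
    """
    remote_ready = parent_finish + comm      # wait for the data transfer
    gap_start, gap_end = gap
    dup_finish = gap_start + parent_cost_here
    # Duplicate only if the copy fits in the gap and delivers the data earlier.
    if dup_finish <= gap_end and dup_finish < remote_ready:
        return dup_finish, True
    return remote_ready, False

with_dup = data_ready_with_duplication(6, 5, 7, (0, 9))
without_dup = data_ready_with_duplication(6, 5, 7, (0, 6))
```

In the first call the copy finishes at time 7, earlier than the remote data at time 11; in the second call the copy does not fit in the gap, so the task waits for the transfer.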

C. Status Update Phase

After the task has been selected and assigned to a processor, the URank corresponding to the selected task is deleted from URankSet. The finish time of the selected processor is updated after the task has been assigned to it. The selected task is deleted from the list of unscheduled tasks. The LDCP+ algorithm is given in Fig. 4.

VI. CASE STUDY

In this section, the execution results of the CPOP, HEFT and LDCP+ algorithms are compared for a non-monotonic computation cost matrix. A Grid system composed of three single-processor computers (m=3), ten tasks (n=10), a non-monotonic computation cost matrix and a DAG with communication costs assigned to the graph edges are shown in Fig. 5, which also presents the scheduling results for this DAG produced by the HEFT, CPOP and LDCP+ algorithms. The execution results of LDCP and LDCP+ are compared for a monotonic computation cost matrix. A Grid system with the parameters m=2 and n=11, a monotonic computation cost matrix and a DAG with communication costs assigned to the graph edges are shown in Fig. 6. Fig. 6 also shows the scheduling results for this DAG produced by the LDCP and LDCP+ scheduling algorithms.

Establish DAGP_k for all processors in the system, where 0 ≤ k ≤ m − 1
Calculate URanks for all DAGP_k
Compute the URankSet
While there are unscheduled tasks in the task list do
    Find the KeyNode(s) in the URankSet
    Put the KeyNode(s) in KeyNodeSet
    If (it is the first step of scheduling) then
        Choose the processors which have the minimum permutation;
    Else
        If (there is any processor with idle time and the task has the replacement ability) then
            Select that processor;
        Else
            Compute the finish time of the selected task on every processor;
            Find and select the processor that minimizes the finish time of the selected task;
        End if
        Duplicate the parent(s) of the selected task if needed;
    End if
    Assign the selected task to the selected processor;
    Update the selected processor time;
    Update the URankSet;
    Update the unscheduled task list;
End while

Figure 4. LDCP+ algorithm

Computation cost matrix of Fig. 5(b):

Task   p1   p2   p3
1      14   16    9
2      13   19   18
3      11   13   19
4      13    8   17
5      12   13   10
6      13   16    9
7       7   15   11
8       5   11   14
9      18   12   20
10     21    7   16

Fig. 5(c) and 5(d): Gantt charts of the HEFT and CPOP schedules of nodes n1 to n10 on processors p1, p2 and p3 (time axis from 0 to 90).

Figure 5. Scheduling results for the HEFT, CPOP and LDCP+ algorithms. (a) A graph with 10 tasks. (b) Graph cost matrix. (c) HEFT scheduling algorithm, with a schedule length of 80. (d) CPOP algorithm, with a schedule length of 89. (e) LDCP+ algorithm, with a schedule length of 68. (f) Task execution sequence in the LDCP+ algorithm. Duplicated tasks:

Task execution sequence of Fig. 5(f):

Step   Selected task   Selected processor
1      n1              p1
2      n4              p2
3      n3              p1
4      n2              p3
5      n5              p2
6      n6              p3
7      n9              p2
8      n7              p1
9      n8              p1
10     n10             p2

Figure 6. Scheduling results for the LDCP and LDCP+ algorithms. (a) A graph with 11 tasks. (b) Graph cost matrix. (c) Task execution sequence in the LDCP algorithm. (d) LDCP algorithm, with a schedule length of 64. (e) Task execution sequence in the LDCP+ algorithm. (f) LDCP+ algorithm, with a schedule length of 61.5.

Computation cost matrix of Fig. 6(b):

Task   p0   p1
t1      4    6
t2     15   22.5
t3      4    6
t4     13   19.5
t5     10   15
t6      7   10.5
t7      8   12
t8      4    6
t9     12   18
t10     6    9
t11     9   13.5

Task execution sequence tables of Fig. 6(c) and Fig. 6(e), listed in source order (selected task and selected processor per step); the two tables differ only in steps 10 and 11:

Step   First table   Second table
1      t2 on p0      t2 on p0
2      t1 on p1      t1 on p1
3      t4 on p1      t4 on p1
4      t9 on p0      t9 on p0
5      t5 on p0      t5 on p0
6      t3 on p0      t3 on p0
7      t7 on p1      t7 on p1
8      t6 on p1      t6 on p1
9      t8 on p0      t8 on p0
10     t11 on p0     t10 on p0
11     t10 on p1     t11 on p0

VII. CONCLUSION AND FUTURE WORK

In Grid systems, task scheduling is an important problem in the domain of optimizing heterogeneous distributed systems. In this paper a new heuristic scheduling algorithm, named LDCP+, is proposed. This algorithm improves on the LDCP algorithm and attains better schedule lengths by improving three phases: the task selection phase, the processor selection phase and the status update phase. LDCP+ can schedule tasks in Grid systems with both monotonic and non-monotonic cost matrices. Using the duplication process, optimizing the assignment of priorities to tasks, and using the idle spaces of processors result in a better schedule length than that of other scheduling algorithms. In real-time environments, the assignment of resources such as processors at a specific time is very important. More work can be done to develop algorithms with lower computation cost for such environments.

REFERENCES

[1] S. Bansal, P. Kumar, and K. Singh. An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems. IEEE Transactions on Parallel and Distributed Systems 14(6), pages 533-544, 2003.

[2] M. I. Daoud and N. N. Kharma. A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing 68(4), pages 399-409, 2008.

[3] H. El-Rewini and T. G. Lewis. Scheduling parallel program tasks onto arbitrary target machines. Journal of Parallel and Distributed Computing 9(2), pages 138-153, 1990.

[4] E. Ilavarasan, P. Thambidurai, and R. Mahilmannan. Performance effective task scheduling algorithm for heterogeneous computing system. 4th International Symposium on Parallel and Distributed Computing, pages 28-38, 2005.

[5] J. Kim, J. Rho, J.-O. Lee, and M.-C. Ko. CPOC: Effective static task scheduling for grid computing. In Proceedings of the 2005 International Conference on High Performance Computing and Communications, pages 477-486, 2005.

[6] Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys 31(4), pages 406-471, 1999.

[7] Y.-K. Kwok and I. Ahmad. Dynamic critical-path scheduling: An effective technique for allocating task graphs to multiprocessors. IEEE Transactions on Parallel and Distributed Systems 7(5), pages 506-521, 1996.

[8] G. C. Sih and E. A. Lee. A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures. IEEE Transactions on Parallel and Distributed Systems 4(2), pages 175-187, 1993.

[9] H. Topcuoglu, S. Hariri, and M.-Y. Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 13(3), pages 260-274, 2002.


http://sites.google.com/site/ijcsis/ ISSN 1947-5500