(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 8, No. 4, July 2010
LDCP+: An Optimal Algorithm for Static Task
Scheduling in Grid Systems
Negin Rzavi
Islamic Azad University,
Science and Research Branch,
Tehran, Iran
Safieh Siadat
Islamic Azad University,
Science and Research Branch,
Tehran, Iran
Amir Masoud Rahmani
Islamic Azad University,
Science and Research Branch,
Tehran, Iran
Abstract- After a computational job is designed and realized as a set of tasks, an optimal assignment of these tasks to the processing elements of a given architecture needs to be determined. In a Grid system, with heterogeneous processing elements and data transfer time between them, determining an assignment of tasks to processing elements that optimizes performance and efficiency is very important. In this paper a heuristic algorithm named LDCP+ is presented, which improves the Longest Dynamic Critical Path (LDCP) algorithm presented by Mohammad I. Daoud and Nawwaf Kharma in 2007. It is a list-based algorithm: it assigns each task a priority for its execution. The basic features of LDCP+ are task duplication, the use of idle processing element time, and an optimized version of the priority assignment method used in LDCP. The LDCP algorithm is executable only under the assumption that the computation costs of tasks are monotonic; the algorithm presented in this paper frees scheduling from this restriction. In the case of non-monotonic computation costs, LDCP+ achieves the minimum total finish time in comparison with other scheduling algorithms such as HEFT and CPOP.
Keywords- Grid; Static task scheduling; Longest Dynamic
Critical Path.
I. INTRODUCTION
A Grid system is a group of connected computers that can execute parallel programs via a high-speed interconnection. The efficiency of program parallelism in Grid systems depends on the methods used to schedule tasks on the available processing elements. The interconnection of processing elements in a Grid causes an overhead when two tasks assigned to processing elements of distinct computers transfer data. Task scheduling in distributed heterogeneous systems is more complex because each task can have a different execution time on different processing elements. Scheduling algorithms for a Grid system should therefore consider the execution time of each task on each processing element, since even one incorrect decision can restrict the system performance to that of the slowest processing element [2].
There are two kinds of scheduling algorithms: static and dynamic. In static scheduling algorithms, all information needed for scheduling, such as the structure of the parallel application, the execution time of individual tasks and the communication costs between tasks, must be known in advance. In contrast, this information is unknown in dynamic task scheduling algorithms.
Among the different scheduling algorithms, HEFT is a scheduling algorithm for heterogeneous distributed computing systems that consists of two phases: first, cost computation for each task and task selection; second, processor selection. In the task selection phase the algorithm sets the computation costs of tasks to their mean values, which may limit its ability to compute task priorities precisely. The CPOP algorithm has the same two phases as HEFT but uses different strategies for assigning priorities to tasks and for processor selection. These two algorithms have been described as optimal with respect to total finish time.
In this paper we present a heuristic list-based algorithm called LDCP+ (an optimized version of the Longest Dynamic Critical Path algorithm) for static task scheduling in Grid systems with a limited number of processors. For performance evaluation, we compare our scheduling results with those of other algorithms such as CPOP, HEFT and LDCP.
II. RELATED WORKS
Static task scheduling for Grid systems is, in general, known to be an NP-complete problem [4, 7, 9], and most scheduling algorithms are heuristic [1, 2, 3, 4, 7]. One of the most important classes of heuristic algorithms is list-based algorithms [6]. In such algorithms each task is assigned a priority, and three steps (task selection, processor selection and status update) are repeated until all tasks are scheduled. In the task selection phase, the unscheduled task with the highest priority is selected. In the processor selection phase, the selected task is assigned to the processor that minimizes a predefined cost criterion, such as the schedule length. Finally, in the status update phase, the status of the system is updated. Examples of list-based algorithms are Heterogeneous Earliest Finish Time (HEFT) [9], Critical Path on a Processor (CPOP) [9], Critical Path on a Cluster (CPOC) [5], Dynamic
335 http://sites.google.com/site/ijcsis/ISSN 1947-5500
Level Scheduling (DLS) [8], Modified Critical Path (MCP) [10], Mapping Heuristic (MH) [3], Dynamic Critical Path (DCP), and Longest Dynamic Critical Path (LDCP) [2].
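The three repeated steps of a list-based scheduler can be sketched as follows. This is a minimal illustration, not any of the cited algorithms: the priorities are supplied by the caller, communication costs are ignored, and all names (`list_schedule`, `priority`, `parents`, `cost`) are assumptions made for the example.

```python
def list_schedule(priority, parents, cost):
    """Generic list scheduling loop (communication costs ignored).

    priority: dict task -> number (higher is scheduled first)
    parents:  dict task -> iterable of parent tasks
    cost:     dict task -> list of execution times, one per processor
    """
    m = len(next(iter(cost.values())))
    proc_ready = [0.0] * m          # time at which each processor becomes free
    finish, placement = {}, {}
    unscheduled = set(priority)
    while unscheduled:
        # Task selection: highest-priority task whose parents are all scheduled.
        ready = [t for t in unscheduled
                 if all(p in finish for p in parents.get(t, ()))]
        task = max(ready, key=lambda t: priority[t])
        data_ready = max((finish[p] for p in parents.get(task, ())), default=0.0)
        # Processor selection: the processor that minimizes the finish time.
        k = min(range(m),
                key=lambda j: max(proc_ready[j], data_ready) + cost[task][j])
        # Status update.
        finish[task] = max(proc_ready[k], data_ready) + cost[task][k]
        proc_ready[k] = finish[task]
        placement[task] = k
        unscheduled.remove(task)
    return placement, finish


placement, finish = list_schedule(
    priority={"a": 3, "b": 2, "c": 1},
    parents={"b": ["a"], "c": ["a"]},
    cost={"a": [2, 3], "b": [4, 2], "c": [3, 3]},
)
print(placement, max(finish.values()))
```

The algorithms listed above differ mainly in how the priorities are computed and in the cost criterion used in the processor selection step.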
III. PROBLEM DEFINITION
In static task scheduling in a Grid system, the execution precedence between tasks is represented by a Directed Acyclic Graph (DAG). Each DAG is described by a tuple (T, E), where T is a set of n tasks and E is a set of e edges. Each $t_i \in T$ represents a task and each $e_{i,j} = (t_i, t_j) \in E$ represents the execution precedence between the two tasks connected by the edge $e_{i,j}$. If $(t_i, t_j) \in E$, then the execution of task $t_j \in T$ cannot be started before task $t_i$ finishes its execution. For the edge $(t_i, t_j)$, the source task $t_i$ is the parent of the sink task $t_j$, while $t_j$ is a child of $t_i$. A task with no parents is called an entry task and a task with no children is called an exit task. Associated with each edge $(t_i, t_j)$ is a value $d_{i,j}$ that represents the amount of data to be transmitted from task $t_i$ to task $t_j$; in some cases it also represents the minimum time that task $t_j$ needs to wait before starting after task $t_i$ finishes its execution.

A Grid system is represented by a set P of m processors, a set T of n tasks and an n × m computation cost matrix $W_{n \times m}$. Each element $w_{i,k} \in W$, $1 \le i \le n$, $1 \le k \le m$, represents the execution time (computation cost) of task $t_i$ on processor $p_k$. We make the same assumption as LDCP that all processors are fully connected and that communications between processors occur via independent communication units [2], so task execution and data transfer can proceed in parallel. As in LDCP, the data transfer rate between any two processors on the network is also assumed to be fixed and constant. The communication costs are represented by an n × n matrix $D_{n \times n}$. An element $d_{i,j} \in D$ is zero if the two tasks $t_i$ and $t_j$ are scheduled on the same processor and is equal to the (nonzero) communication cost otherwise. A task can start its execution on a processor only when all data from its parents have become available to that processor. The goal of our algorithm is to assign tasks to processors in a way that minimizes the total finish time, or schedule length.
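The rule that a task may start only when all parent data is available, with the communication cost set to zero for parents on the same processor, can be sketched as below. The function name, the dictionary layout and the numbers are assumptions chosen for illustration.

```python
def data_ready_time(task, proc, parents, finish, placement, d):
    """Earliest time at which all parent data of `task` is available on `proc`.

    parents:   dict task -> iterable of parent tasks
    finish:    dict task -> finish time of an already scheduled task
    placement: dict task -> processor index of an already scheduled task
    d:         dict (parent, child) -> communication cost between processors
    """
    ready = 0.0
    for p in parents.get(task, ()):
        # Communication is free when parent and child share a processor.
        comm = 0.0 if placement[p] == proc else d[(p, task)]
        ready = max(ready, finish[p] + comm)
    return ready


# Hypothetical state: t0 finished at 6 on processor 0, t1 at 4 on processor 1.
parents = {"t2": ["t0", "t1"]}
finish = {"t0": 6, "t1": 4}
placement = {"t0": 0, "t1": 1}
d = {("t0", "t2"): 3, ("t1", "t2"): 1}
for proc in (0, 1):
    print(proc, data_ready_time("t2", proc, parents, finish, placement, d))
```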
A. Definition 1
Schedule Length: The maximum finish time over all processors, that is, the finish time of the final task after scheduling, is called the schedule length. A DAG and a computation cost matrix for two processors are shown in Fig. 1. The schedule length is computed in Fig. 2 and is equal to 23.
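As a small sketch of this definition, the schedule length is the largest finish time over all processors. The schedule below is an assumed example whose values were chosen to give a length of 23; it is not read from Fig. 2.

```python
def schedule_length(schedule):
    """Largest finish time over all processors.

    schedule: dict processor -> list of (task, start, finish) tuples
    """
    return max(finish
               for slots in schedule.values()
               for (_task, _start, finish) in slots)


# Assumed two-processor schedule, for illustration only.
schedule = {
    "p0": [("t0", 0, 6), ("t2", 6, 15)],
    "p1": [("t1", 0, 4), ("t3", 20, 23)],
}
print(schedule_length(schedule))
```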
Figure 1. An example of (a) a DAG and (b) a computation cost matrix
Figure 2. Schedule length of the DAG presented in Fig. 1 on two processors
Figure 3. Task computation times on each processor, obtained from the cost matrix in Fig. 1: (a) processor p0, (b) processor p1
Assigning task priorities: In a Grid system, the efficiency of list-based scheduling algorithms depends on the methods that assign priorities to tasks.
In our suggested algorithm, LDCP+, a task is assigned a high priority if selecting it in a scheduling step yields the minimum schedule length. Because LDCP+ is the result of optimizing LDCP, we also present the basic definitions used in the LDCP algorithm.
B. Definition 2
Critical Path: For a given DAG, the Critical Path (CP) is defined as the path from an entry task to an exit task for which the sum of the computation costs of tasks and the communication costs of edges is maximal.
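A critical path in this sense is a longest path through the DAG when both node and edge costs are counted. The sketch below computes it by memoized recursion; the graph, the costs and the function name are assumptions for illustration, and each task is given a single computation cost.

```python
from functools import lru_cache


def critical_path(cost, edges):
    """Return (length, path) of the longest entry-to-exit path.

    cost:  dict task -> computation cost
    edges: dict (parent, child) -> communication cost
    """
    succ = {}
    for (u, v), c in edges.items():
        succ.setdefault(u, []).append((v, c))

    @lru_cache(maxsize=None)
    def longest_from(u):
        best_len, best_path = cost[u], (u,)
        for v, c in succ.get(u, []):
            sub_len, sub_path = longest_from(v)
            if cost[u] + c + sub_len > best_len:
                best_len, best_path = cost[u] + c + sub_len, (u,) + sub_path
        return best_len, best_path

    has_parent = {v for (_u, v) in edges}
    entries = [t for t in cost if t not in has_parent]
    return max((longest_from(t) for t in entries), key=lambda r: r[0])


cost = {"a": 2, "b": 3, "c": 4, "d": 1}
edges = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 5, ("c", "d"): 1}
print(critical_path(cost, edges))  # (12, ('a', 'b', 'd'))
```

Here the path a, b, d costs 2 + 1 + 3 + 5 + 1 = 12, while a, c, d costs 2 + 2 + 4 + 1 + 1 = 10.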
IV. LDCP: LONGEST DYNAMIC CRITICAL PATH
A. Definition 3
Longest Dynamic Critical Path: Given a DAG with n tasks and e edges and a Grid system with m processors, a Dynamic Critical Path (DCP) during a particular scheduling step is a path of tasks and edges from an entry task to an exit task.
The LDCP is the longest DCP, considering that communication costs between tasks scheduled on the same processor are assumed to be zero and that the execution constraints are preserved. Fig. 3 represents two dynamic critical paths. The first path, in Fig. 3(a), is composed of tasks $t_0$, $t_2$ and $t_3$ scheduled on processor $p_0$ and has a length of 29. The second DCP, in Fig. 3(b), is composed of tasks $t_0$, $t_2$ and $t_3$ scheduled on processor $p_1$ and has a length of 23. So, at the first step of scheduling, the LDCP is composed of tasks $t_0$, $t_2$ and $t_3$ and has a length of 29.
Computation cost matrix of Fig. 1(b):

Task   p0   p1
t0     6    5
t1     6    4
t2     9    8
t3     7    3
V. LDCP+: THE PROPOSED ALGORITHM
In the LDCP+ algorithm, each scheduling iteration includes the three phases below:
1. Task selection phase
2. Processor selection phase
3. Status update phase
These three phases are carried out for each task until the last task has been selected for scheduling.
A. Task Selection Phase
LDCP+ selects a set of tasks that play the main role in determining the schedule length.
In the first step of this phase, the DAG of each processor is required for scheduling.
1) Definition 4
Directed Graph: Given a DAG with n tasks and e edges and a Grid system with m processors ($p_0, p_1, \ldots, p_{m-1}$), $DAGP_k$ is the directed acyclic graph that corresponds to processor $p_k$. The computation cost of each task on processor $p_k$ is represented by a number on the related node of $DAGP_k$.
$DAGP_0$ is shown in Fig. 3(a) and $DAGP_1$ is shown in Fig. 3(b). These figures relate to the DAG and the Grid system represented in Fig. 1. Throughout this paper, $t_i$ refers to the i-th task in the directed acyclic graph, and the node $n_i$ identifies task $t_i$ in $DAGP_k$. The number associated with this node represents the computation cost of task $t_i$ on processor $p_k$. In each $DAGP_k$, every node is assigned a number named UpwardRank (URank). URanks are used to determine task priorities in $DAGP_k$.
2) Definition 5
URank: The UpwardRank of the i-th node ($n_i$) in $DAGP_k$ is defined recursively as
$$URank_k(n_i) = w_{i,k} + \max_{n_l \in succ_k(n_i)} \{ c_k(n_i, n_l) + URank_k(n_l) \} \qquad (1)$$
where $succ_k(n_i)$ is the set of immediate successors of node $n_i$, $c_k(n_i, n_l)$ is the communication cost between nodes $n_i$ and $n_l$ in $DAGP_k$, and $w_{i,k}$ is the computation cost of $t_i$ on processor $p_k$.
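The recursion in (1) can be sketched directly with memoization. In this illustration the maximum over an empty successor set is taken to be zero for exit nodes, and the function name, the three-task graph and its costs are all assumptions.

```python
from functools import lru_cache


def uranks(k, w, succ, comm):
    """URank of every node in DAGP_k, following equation (1).

    w:    dict task -> list of computation costs, one per processor
    succ: dict task -> list of immediate successors
    comm: dict (task, successor) -> communication cost used in DAGP_k
    """
    @lru_cache(maxsize=None)
    def rank(i):
        # Exit nodes have no successors, so the max term defaults to 0.
        tail = max((comm[(i, l)] + rank(l) for l in succ.get(i, [])), default=0)
        return w[i][k] + tail

    return {i: rank(i) for i in w}


w = {"a": [2, 5], "b": [3, 1], "c": [4, 2]}
succ = {"a": ["b", "c"]}
comm = {("a", "b"): 1, ("a", "c"): 2}
print(uranks(0, w, succ, comm))  # {'a': 8, 'b': 3, 'c': 4}
print(uranks(1, w, succ, comm))  # {'a': 9, 'b': 1, 'c': 2}
```

The URank of an entry node equals the length of the longest path that starts at it in $DAGP_k$.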
3) Definition 6
URankSet: Each element of URankSet is defined as
$$\mathrm{Max}_{k=0}^{m-1} \{ URank_k(n_i) \} \qquad (2)$$
where $URank_k(n_i)$ is $URank(n_i)$ in $DAGP_k$.
4) Definition 7
KeyNode: The KeyNode is the node that has the maximum URank in URankSet. The task corresponding to this node is the task selected for scheduling.
5) Definition 8
KeyNodeSet: This set includes the KeyNodes that are selected for scheduling. In the first scheduling iteration it can include several tasks, but in the other iterations it contains only one task.
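A possible reading of the URankSet and KeyNode definitions is sketched below. It assumes that each URankSet element is the largest URank of a node over all processor graphs, and that the KeyNode is the unscheduled task with the largest element. The function names and the numbers are illustrative assumptions.

```python
def urank_set(uranks_per_proc):
    """One value per task: its largest URank over all DAGP_k (assumed reading).

    uranks_per_proc: list indexed by processor k of dicts task -> URank_k
    """
    tasks = uranks_per_proc[0].keys()
    return {i: max(r[i] for r in uranks_per_proc) for i in tasks}


def key_node(urank_values, unscheduled):
    """Unscheduled task with the maximum URankSet value."""
    return max(unscheduled, key=lambda i: urank_values[i])


per_proc = [{"a": 8, "b": 3, "c": 4}, {"a": 9, "b": 1, "c": 2}]
values = urank_set(per_proc)
print(values)                              # {'a': 9, 'b': 3, 'c': 4}
print(key_node(values, {"a", "b", "c"}))   # a
print(key_node(values, {"b", "c"}))        # c
```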
6) Definition 9
Least Execution Time (LET): The Least Execution Time is defined as
$$\min \{ processTime(p_k) + w_{i,k} + d_{i,j} \} \qquad (3)$$
where $processTime(p_k)$ is the time at which the last task scheduled on processor $p_k$ finishes its execution, $w_{i,k}$ is the computation time of task $t_i$ on processor $p_k$, and $d_{i,j}$ is the communication time between $t_i$ and $t_j$. If both $t_i$ and $t_j$ are scheduled on processor $p_k$, the communication cost between them is assumed to be zero.
After URankSet is computed, the task chosen for scheduling is the task corresponding to the KeyNode in URankSet. In the first iteration, to obtain the minimum execution time on the available processing elements, if the number of entry tasks is less than or equal to the number of processors, all entry tasks are considered KeyNodes. Otherwise, as many tasks as there are processors, those with the maximum URanks, are selected as KeyNodes and placed in KeyNodeSet. In the following iterations, KeyNodeSet includes only one KeyNode (a set with one member).
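The first-iteration rule for KeyNodeSet and the quantity in (3) can be sketched as follows. For brevity the LET sketch considers a single communicating task on a known processor, which is a simplifying assumption; all names and numbers are hypothetical.

```python
def initial_key_nodes(entry_tasks, urank_values, m):
    """KeyNodeSet for the first iteration: all entry tasks if they fit on the
    m processors, otherwise the m entry tasks with the largest URanks."""
    if len(entry_tasks) <= m:
        return list(entry_tasks)
    ranked = sorted(entry_tasks, key=lambda i: urank_values[i], reverse=True)
    return ranked[:m]


def least_execution_time(task, w, process_time, other_proc, d):
    """Minimum over processors of processTime(p_k) + w[task][k] + d, where d
    is dropped when the communicating task sits on the same processor.

    Returns (value, processor index)."""
    best = None
    for k, ready in enumerate(process_time):
        comm = 0 if other_proc == k else d
        total = ready + w[task][k] + comm
        if best is None or total < best[0]:
            best = (total, k)
    return best


print(initial_key_nodes(["a", "b", "c"], {"a": 5, "b": 9, "c": 7}, 2))
print(least_execution_time("t", {"t": [6, 5]}, [10, 4], other_proc=0, d=3))
```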
B. Processor Selection Phase
In this phase, the selected task is assigned to a processor so as to obtain the minimum schedule length in each scheduling iteration. LDCP+ proceeds as follows. As mentioned above, in the first scheduling iteration KeyNodeSet can contain more than one KeyNode. To improve on the LDCP algorithm, LDCP+ computes the distinct permutations, over the different processors, of the tasks whose KeyNodes are in KeyNodeSet. The permutation with the minimum average execution time on the processors becomes the first assignment of tasks to processors. This average execution time is obtained from
$$\min \left\{ \frac{\sum_{k=0}^{m-1} w_{i,k}}{m} \right\} \qquad (4)$$
where i is the index of a task corresponding to a KeyNode, $w_{i,k} \in W$, and m is the number of processors. In the following iterations, the only KeyNode in KeyNodeSet is selected to be scheduled.
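One way to read the first-iteration step is sketched below: every assignment of the KeyNode tasks to distinct processors is enumerated, and the one with the smallest average execution time is kept. The interpretation of (4) as "sum of the assigned costs divided by m", the function name and the data are assumptions.

```python
from itertools import permutations


def best_first_assignment(key_nodes, w, m):
    """Return (average execution time, {task: processor}) for the best
    assignment of the KeyNode tasks to distinct processors."""
    best = None
    for procs in permutations(range(m), len(key_nodes)):
        avg = sum(w[t][k] for t, k in zip(key_nodes, procs)) / m
        if best is None or avg < best[0]:
            best = (avg, dict(zip(key_nodes, procs)))
    return best


w = {"a": [6, 5], "b": [6, 4]}
print(best_first_assignment(["a", "b"], w, m=2))  # (5.0, {'a': 0, 'b': 1})
```

With these costs, placing a on processor 0 and b on processor 1 gives (6 + 4) / 2 = 5.0, against (5 + 6) / 2 = 5.5 for the opposite placement.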
1) Definition 10
Idle Space: When, on a processor, there is a gap between the end time of one task and the start time of the next task, that time interval is called an idle space.
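Idle spaces on one processor can be listed by scanning its tasks in start-time order, as in the sketch below. Treating the interval before the first task as an idle space is an assumption of this example, as are the function name and the data.

```python
def idle_spaces(slots):
    """Return the gaps (start, end) between consecutive tasks on one processor.

    slots: list of (start, finish) pairs of the tasks already scheduled
    """
    gaps = []
    prev_end = 0
    for start, finish in sorted(slots):
        if start > prev_end:
            gaps.append((prev_end, start))
        prev_end = max(prev_end, finish)
    return gaps


print(idle_spaces([(0, 4), (20, 23)]))          # [(4, 20)]
print(idle_spaces([(3, 5), (5, 8), (10, 12)]))  # [(0, 3), (8, 10)]
```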
2) Definition 11
Replacement Ability: A task can be placed in an idle space when all parents of that task have finished before the start time of the task. If any of its parents has been scheduled on a different processor, the time required for transferring data between the processors must also be taken into account.
If there is a processor with an idle space and the selected task can be placed in that space (replacement ability), that processor is selected. At the end of this phase, the LDCP+ algorithm uses the duplication process to decrease the schedule length: after the processor is selected, if the selected task has a parent scheduled on a different processor and the selected processor has an idle space before the start time of the selected task, the parent is duplicated in the idle space (subject to the replacement ability).
3) Definition 12
Duplication Process: The duplication process repeats the execution of a task on other processors.
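The replacement ability check can be sketched as a test of whether a task fits into a given idle space once its parents' data has arrived. The requirement that the task must also finish before the gap ends, the function name and the numbers are assumptions of this illustration.

```python
def earliest_start_in_gap(gap, exec_time, parent_arrivals):
    """Return the start time of the task inside the gap, or None if it
    does not fit.

    gap:             (start, end) of the idle space
    exec_time:       computation cost of the task on this processor
    parent_arrivals: times at which each parent's data reaches this processor
                     (parent finish time plus transfer time, if any)
    """
    gap_start, gap_end = gap
    start = max([gap_start, *parent_arrivals])
    return start if start + exec_time <= gap_end else None


# A parent finishing at 6 on another processor with transfer time 3 arrives at 9.
print(earliest_start_in_gap((4, 20), exec_time=5, parent_arrivals=[6 + 3]))
print(earliest_start_in_gap((4, 20), exec_time=12, parent_arrivals=[6 + 3]))
```

The same check can be applied to a parent task when deciding whether it can be duplicated in an idle space of the selected processor.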
C. Status Update Phase
After the task has been selected and assigned to a processor, the URank of the selected task is deleted from URankSet. The finish time of the selected processor is updated after the task has been assigned to it. The selected task is deleted from the list of unscheduled tasks. The LDCP+ algorithm is presented in Fig. 4.
VI. CASE STUDY
In this section, the execution results of the CPOP, HEFT and LDCP+ algorithms are compared for a non-monotonic computation cost matrix. A Grid system composed of three single-processor computers (m=3), ten tasks (n=10), a non-monotonic computation cost matrix and a DAG with communication costs assigned to the graph edges are shown in Fig. 5, which also presents the scheduling results for this DAG produced by the HEFT, CPOP and LDCP+ algorithms. The execution results of LDCP and LDCP+ are compared for a monotonic computation cost matrix. A Grid system with the parameters m=2 and n=11, a monotonic computation cost matrix and a DAG with communication costs assigned to the graph edges are shown in Fig. 6. Fig. 6 also shows the scheduling results for this DAG produced by the LDCP and LDCP+ scheduling algorithms.
Establish DAGP_k for all processors in the system, where 0 ≤ k ≤ m−1
Calculate URanks for all DAGP_k
Compute the URankSet
While there are unscheduled tasks in the task list do
    Find the KeyNode(s) in the URankSet
    Put the KeyNode(s) in KeyNodeSet
    If (it is the first step of scheduling) then
        Choose the processors which have the minimum permutation;
    Else
        If (there is any processor with idle time and the task has the replacement ability) then
            Select the processor;
        Else
            Compute the finish time of the selected task on every processor;
            Find and select the processor that minimizes the finish time of the selected task;
        End if
        Duplicate the parent(s) of the selected task if needed;
    End if
    Assign the selected task to the selected processor;
    Update the selected processor time;
    Update the URankSet;
    Update the unscheduled task list;
End while
Figure 4. LDCP+ algorithm
Computation cost matrix of Fig. 5(b):

Task   p1   p2   p3
1      14   16   9
2      13   19   18
3      11   13   19
4      13   8    17
5      12   13   10
6      13   16   9
7      7    15   11
8      5    11   14
9      18   12   20
10     21   7    16
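The cost matrix of Fig. 5(b) can be inspected with a few lines of code. The sketch below computes the mean computation cost of each task, which is the quantity HEFT uses in its task selection phase as described in the Introduction, and the fastest processor for each task. The variable and function names are assumptions; the numbers are those of the matrix.

```python
# Computation costs from Fig. 5(b): task -> (p1, p2, p3)
W = {
    1: (14, 16, 9), 2: (13, 19, 18), 3: (11, 13, 19), 4: (13, 8, 17),
    5: (12, 13, 10), 6: (13, 16, 9), 7: (7, 15, 11), 8: (5, 11, 14),
    9: (18, 12, 20), 10: (21, 7, 16),
}
PROCESSORS = ("p1", "p2", "p3")


def mean_costs(w):
    """Mean computation cost of each task over all processors."""
    return {task: sum(costs) / len(costs) for task, costs in w.items()}


def fastest_processor(w):
    """Processor with the smallest computation cost for each task."""
    return {task: PROCESSORS[costs.index(min(costs))] for task, costs in w.items()}


print(mean_costs(W)[1])      # 13.0
print(fastest_processor(W))
```

No single processor is the fastest for every task in this matrix, so replacing each row by its mean hides which processor suits which task.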
Figure 5. Scheduling results for the HEFT, CPOP and LDCP+ algorithms. (a) A graph with 10 tasks. (b) Graph cost matrix. (c) HEFT algorithm with schedule length of 80. (d) CPOP algorithm with schedule length of 89. (e) LDCP+ algorithm with schedule length of 68. (f) Task execution sequence in the LDCP+ algorithm.
Figure 6. Scheduling results for the LDCP and LDCP+ algorithms. (a) A graph with 11 tasks. (b) Graph cost matrix. (c) Task execution sequence in the LDCP algorithm. (d) LDCP algorithm with schedule length of 64. (e) Task execution sequence in the LDCP+ algorithm. (f) LDCP+ algorithm with schedule length of 61.5.
VII. CONCLUSION AND FUTURE WORK
In Grid systems, task scheduling is an important problem in the domain of optimizing heterogeneous distributed systems. In this paper a new heuristic scheduling algorithm, named LDCP+, is proposed. This algorithm improves the LDCP algorithm, attaining better schedule lengths by improving three phases: the task selection phase, the processor selection phase and the status update phase. LDCP+ can schedule tasks in Grid systems with both monotonic and non-monotonic cost matrices. Using the duplication process, optimizing the assignment of priorities to tasks, and using the idle spaces of processors result in a better schedule length than that of other scheduling algorithms. In real-time environments, the assignment of resources such as processors at a specific time is very important. More work can be done to improve algorithms with lower computation cost for such environments.
Task execution sequence of Fig. 5(f):

Step   Selected task   Selected processor
1      n1              p1
2      n4              p2
3      n3              p1
4      n2              p3
5      n5              p2
6      n6              p3
7      n9              p2
8      n7              p1
9      n8              p1
10     n10             p2

Computation cost matrix of Fig. 6(b):

Task   p0   p1
t1     4    6
t2     15   22.5
t3     4    6
t4     13   19.5
t5     10   15
t6     7    10.5
t7     8    12
t8     4    6
t9     12   18
t10    6    9
t11    9    13.5

Task execution sequences of Fig. 6 (panels (c) and (e)):

Step   Selected task   Selected processor
1      t2              p0
2      t1              p1
3      t4              p1
4      t9              p0
5      t5              p0
6      t3              p0
7      t7              p1
8      t6              p1
9      t8              p0
10     t11             p0
11     t10             p1

Step   Selected task   Selected processor
1      t2              p0
2      t1              p1
3      t4              p1
4      t9              p0
5      t5              p0
6      t3              p0
7      t7              p1
8      t6              p1
9      t8              p0
10     t10             p0
11     t11             p0

REFERENCES
[1] S. Bansal, P. Kumar, and K. Singh. An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems. IEEE Transactions on Parallel and Distributed Systems 14(6), pages 533-544, 2003.
[2] M. I. Daoud and N. N. Kharma. A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing 68(4), pages 399-409, 2008.
[3] H. El-Rewini and T. G. Lewis. Scheduling parallel program tasks onto arbitrary target machines. Journal of Parallel and Distributed Computing 9(2), pages 138-153, 1990.
[4] E. Ilavarasan, P. Thambidurai, and R. Mahilmannan. Performance effective task scheduling algorithm for heterogeneous computing system. 4th International Symposium on Parallel and Distributed Computing, pages 28-38, 2005.
[5] J. Kim, J. Rho, J.-O. Lee, and M.-C. Ko. CPOC: Effective static task scheduling for grid computing. In Proceedings of the 2005 International Conference on High Performance Computing and Communications, pages 477-486, 2005.
[6] Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys 31(4), pages 406-471, 1999.
[7] Y.-K. Kwok and I. Ahmad. Dynamic critical-path scheduling: An effective technique for allocating task graphs to multiprocessors. IEEE Transactions on Parallel and Distributed Systems 7(5), pages 506-521, 1996.
[8] G. C. Sih and E. A. Lee. A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures. IEEE Transactions on Parallel and Distributed Systems 4(2), pages 175-187, 1993.
[9] H. Topcuoglu, S. Hariri, and M.-Y. Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 13(3), pages 260-274, 2002.