
On the Design of an Optimal Multiprocessor Real-Time Scheduling Algorithm under Practical Considerations (Extended Version)

Shelby Funk, Vincent Nelis¹, Joel Goossens, Dragomir Milojevic, Geoffrey Nelissen¹

January 27, 2010

Abstract

This research addresses the multiprocessor scheduling problem of hard real-time systems and focuses on optimal, global schedulers when practical constraints are taken into account. First, we propose an improvement of the optimal algorithm BF. We formally prove that our adaptation is (i) optimal, i.e., it always generates a feasible schedule as long as such a schedule exists, and (ii) valid, i.e., it complies with all the requirements. We also show that it outperforms BF by providing a computing complexity of O(n), where n is the number of tasks to be scheduled. Next, we propose a schedulability analysis which indicates a priori whether the real-time application can be scheduled by our improvement of BF without missing any deadline. This analysis is, to the best of our knowledge, the first such test for multiprocessors that takes into account all the main overheads generated by the Operating System.

1 Introduction

The first global and optimal algorithm for scheduling hard real-time systems on multiprocessor platforms was proposed by Baruah et al. in [1], where the tasks are assumed to be periodic and to have implicit deadlines. The approach proposed in [1] is named PF, for Proportionate Fair scheduling. The idea is to divide time into time slots and to schedule every task in such a way that each one receives a number of time slots proportional to its utilization. This was named the fairness property in [1]. Different adaptations of PF have been published in the last few years; among the most interesting ones, one can cite PD [2] and PD2 [3]. These adaptations follow the same idea as PF but considerably reduce the computing complexity of the scheduling decisions. The worst drawback of all Pfair-like approaches is that a scheduling decision must be taken at each time unit. That is, although the theoretical correctness of the algorithms PF, PD and PD2 is undeniable, their application in the real world is limited due to the high number of overheads (especially preemption overheads) generated during the execution of the system. Notice that, during an overhead on any processor (say $P_k$), only the Operating System is running while the tasks supposed to be executed on $P_k$ are waiting for execution. Therefore, there exist some intervals of time in the actual schedule where a task is not executed although it was supposed to be executed in the theoretical schedule. Since the existing schedulability analyses do not take overheads into account, a task can sometimes miss its deadline even if the system was said to be feasible in theory.

More recently, Zhu et al. investigated the problem of overheads and designed a new class of algorithms named Boundary fair schedulers (i.e., Bfair-like algorithms). Their idea is to ensure fairness only at boundaries, i.e., some particular events occurring during the system execution, resulting in a considerable reduction of the number of decisions taken by the scheduler, as well as of the cost of the overheads, compared to Pfair-like algorithms. Among the most interesting studies in this area, one can cite [4], [5] and [6]. Basically, these Bfair-like schedulers can be divided into two categories: the continuous-time Bfair schedulers, in which time is assumed to be continuous, and the quantum-based Bfair schedulers, which divide time into time slots. To the best of our knowledge, the most recent approach belonging to the continuous-time category is LRE-TL [6]. This algorithm was proved to be optimal but, like every scheduler that belongs to this category, it suffers from another drawback. Indeed, such schedulers can sometimes preempt a task $\tau_i$ by another task $\tau_j$ only to execute an extremely small portion of $\tau_j$. That is, the generated schedule can undergo a preemption overhead only to execute an insignificant portion of a task, hence producing schedules in which the time actually used to execute the tasks is limited. On the other hand, quantum-based techniques have been designed in order to ensure that the Operating System only preempts a task $\tau_i$ by another task $\tau_j$ if $\tau_j$ has to execute for at least one time quantum. In our opinion, the most interesting study in this category is [4]. The proposed scheduler BF is optimal and it considerably reduces the number of preemptions compared to continuous-time Bfair and Pfair-like schedulers. However, the complexity of BF is $O(n\,T_{\max})$, where $T_{\max}$ is the largest period of the tasks²; since $T_{\max}$ could be quite large, the overhead due to the execution of the scheduler can be large as well.

¹ Supported by the Belgian National Science Foundation (F.N.R.S.) under a F.R.I.A. grant.

arXiv:1001.4115v2 [cs.OS] 27 Jan 2010

Contribution of this work. Our first contribution is an improvement of the algorithm BF that confirms that its complexity can actually be reduced to O(n). Our second contribution is a schedulability test that takes into account all the main overheads generated by the Operating System (i.e., preemption costs, migration costs, the time needed to handle task arrivals, to execute the scheduling algorithm, etc.).

Paper organization. In Section 2, we introduce our system model and give some preliminary definitions. In Section 3, we describe our improvement of BF (that we name BF2) and we define the notations used throughout the paper. In Section 4, we prove both the optimality and the validity of BF2. In Section 5, we identify all the main overheads that occur in the schedules produced while running BF2 on our multiprocessor architecture. In Section 6, we model all these overheads and we include them in our schedulability analysis. Finally, in Section 7, we introduce our future work and we conclude.

2 System model and preliminaries

We consider multiprocessor platforms composed of a known and fixed number $m$ of identical processors $\{P_1, P_2, \ldots, P_m\}$ upon which a real-time system is executed. Identical means that all the processors have the same profile (in terms of consumption, computational capabilities, etc.) and are interchangeable.

A periodic real-time system $\tau$ is a set of $n$ functionalities denoted by $\{\tau_1, \tau_2, \ldots, \tau_n\}$. Every functionality $\tau_i$ is modeled by a periodic implicit-deadline task characterized by two parameters $(C_i, T_i)$: a Worst-Case Execution Time (WCET) $C_i$ and an inter-arrival delay $T_i$, with the interpretation that the task $\tau_i$ generates successive jobs $\tau_{i,j}$ (with $j = 1, \ldots, \infty$) arriving at times $a_{i,j}$ such that $a_{i,j} = a_{i,j-1} + T_i$ (with $a_{i,1} = 0$). Each such job has an execution time of at most $C_i$ time units and must be completed at (or before) the next task release $a_{i,j+1}$. During the scheduling of the tasks, we assume that time is divided into time slots (that we name time units hereafter) and both parameters $C_i$ and $T_i$ of every task $\tau_i$ are expressed as integer multiples of the time unit. We assume that any time unit on any CPU can only be allocated to one job and that a job can only execute on one CPU at any time (i.e., job parallelism is forbidden).

The utilization $U_i$ of any task $\tau_i$ is defined as $U_i \overset{\mathrm{def}}{=} C_i / T_i$, with $0 < U_i \le 1$, and the total utilization $U_{sum}$ of the system is defined as the sum of the utilizations of all tasks, i.e., $U_{sum} \overset{\mathrm{def}}{=} \sum_i U_i$. In this work, we assume with no loss of generality that $U_{sum} = m$. Indeed, if the total utilization is initially not equal to $m$, the WCET of the tasks can be overestimated (hence increasing their individual utilization) so that a total utilization of $m$ is reached. We also assume that all the tasks are independent, i.e., there is no communication, no precedence constraint and no shared resource (except the processors) between them. Although this assumption is highly unrealistic, it is a necessary first step toward a practical implementation and we will reconsider it in our future work.

² Although the authors claimed that the sub-routine responsible for this $T_{\max}$ factor can be replaced by another one running in O(1) (leading to a complexity of O(n) for BF), the proof of the equivalence between these two sub-routines does not appear in the literature and, in our opinion, this equivalence is not obvious at all.
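The task model above can be illustrated with a minimal Python sketch. The names `Task`, `utilization`, `total_utilization` and `fits_model`, as well as the example task set, are assumptions introduced here for illustration and do not appear in the paper; exact rational arithmetic is used so that the test $U_{sum} = m$ is not affected by rounding.

```python
from fractions import Fraction
from typing import NamedTuple, Sequence


class Task(NamedTuple):
    wcet: int    # C_i, in time units
    period: int  # T_i, in time units (implicit deadline = period)


def utilization(task: Task) -> Fraction:
    """U_i = C_i / T_i, kept exact with rational arithmetic."""
    return Fraction(task.wcet, task.period)


def total_utilization(tasks: Sequence[Task]) -> Fraction:
    """U_sum = sum of all U_i."""
    return sum((utilization(t) for t in tasks), Fraction(0))


def fits_model(tasks: Sequence[Task], m: int) -> bool:
    """True if 0 < U_i <= 1 for every task and U_sum == m."""
    return (all(0 < utilization(t) <= 1 for t in tasks)
            and total_utilization(tasks) == m)


# Hypothetical task set: three tasks with C = 2 and T = 3 on m = 2 processors.
example_tasks = [Task(2, 3), Task(2, 3), Task(2, 3)]
print(total_utilization(example_tasks))  # U_sum as an exact fraction
print(fits_model(example_tasks, 2))
```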

Definition 1 (Boundary) A boundary is a time instant which corresponds to a task release. Notice that boundaries are indexed from $b_0 = 0$, at which every task releases its first job according to our task model.

Definition 2 (Time slice) A time slice $TS^k \overset{\mathrm{def}}{=} [b_{k-1}, b_k)$ is a time interval between two consecutive and distinct task arrivals in the schedule, i.e., two consecutive and distinct boundaries. The length of the $k$th time slice $TS^k$ (with $k = 1, \ldots, \infty$) is denoted by $L^k \overset{\mathrm{def}}{=} b_k - b_{k-1}$.
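As a small illustration of Definitions 1 and 2, the sketch below enumerates the boundaries and time-slice lengths of a synchronous periodic task set. Restricting the enumeration to one hyperperiod (the least common multiple of the periods) is an assumption made here to keep the output finite; the function names and the example periods are hypothetical.

```python
from math import lcm


def boundaries(periods):
    """Sorted distinct release instants b_0 < b_1 < ... up to one hyperperiod."""
    horizon = lcm(*periods)
    instants = {t for p in periods for t in range(0, horizon + 1, p)}
    return sorted(instants)


def slice_lengths(bounds):
    """L^k = b_k - b_{k-1} for consecutive boundaries."""
    return [later - earlier for earlier, later in zip(bounds, bounds[1:])]


bounds = boundaries([2, 3])  # hypothetical periods
print(bounds, slice_lengths(bounds))
```

Using a set removes simultaneous releases, which matches the requirement that the two boundaries delimiting a time slice be distinct.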

Definition 3 (Fluid schedule) A schedule is said to be fluid if and only if at any time $t \ge 0$, every task $\tau_i$ has executed exactly $U_i \cdot t$ execution units. Notice that, unfortunately, such a schedule is impossible to actually implement.

Since we target systems in which time is divided into time units, the scheduler cannot allocate a fraction of a time unit to the tasks. We therefore define the lag as follows.

Definition 4 (Lag) The lag of a task $\tau_i$ at any time $t$ is the difference between $U_i \cdot t$ (i.e., the number of execution units executed in the fluid schedule at time $t$) and the number of time units that $\tau_i$ has already executed at time $t$ in the actual schedule.

Definition 5 (Boundary fair schedule [4]) A periodic schedule is said to be boundary fair if and only if the absolute value of the lag of every task $\tau_i$ at any boundary $b_k$ is strictly less than one time unit.
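Definitions 4 and 5 translate directly into a check on a given schedule. The following sketch assumes the schedule is summarized by the cumulative number of units each task has executed at each boundary; the names `lag`, `is_boundary_fair` and the argument layout are illustrative choices, not notation from the paper.

```python
from fractions import Fraction


def lag(utilization: Fraction, t: int, executed: int) -> Fraction:
    """Lag at time t: fluid amount U_i * t minus the units actually executed."""
    return utilization * t - executed


def is_boundary_fair(utilizations, bounds, executed_at) -> bool:
    """executed_at[k][i] is the number of units task i has executed by bounds[k]."""
    return all(abs(lag(u, b, done[i])) < 1
               for b, done in zip(bounds, executed_at)
               for i, u in enumerate(utilizations))


# Two hypothetical tasks with U = 1/2 each on one processor, boundaries 0, 2, 4.
utils = [Fraction(1, 2), Fraction(1, 2)]
print(is_boundary_fair(utils, [0, 2, 4], [[0, 0], [1, 1], [2, 2]]))
print(is_boundary_fair(utils, [0, 2, 4], [[0, 0], [2, 0], [2, 2]]))
```

In the second call the first task is a full unit ahead of the fluid schedule at time 2, so its lag is exactly -1 and the strict inequality of Definition 5 fails.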

Lemma 1 (from [4]) For the multiprocessor periodic scheduling problem, if the utilization of every task $\tau_i$ is lower than 1 and the total utilization $U_{sum}$ is not larger than the number $m$ of processors, a boundary fair schedule exists.

3 The BF2 algorithm

3.1 Description of the algorithm

The BF2 algorithm proposed in this paper is an improvement of the BF algorithm proposed in [4]. BF2 benefits from a lower computing complexity, thereby reducing the time overheads due to the execution of the scheduler. The main idea of the BF algorithm is that it schedules tasks in a manner that remains reasonably close to the fluid schedule. BF schedules the tasks so that, at every boundary, the amount of time each task has executed differs from the fluid schedule by less than 1 time unit. At each boundary, all the time units between the current and the next boundary are allocated to tasks. That is, each task first receives a mandatory integer number of time units in such a way that the fairness is guaranteed at the next boundary. Then, if some units remain unallocated, an additional method (using dynamic priorities) selects the tasks that will receive these units. In BF, this additional method has a computing complexity of $O(n\,T_{\max})$, where $T_{\max} \overset{\mathrm{def}}{=} \max_i\{T_i\}$, whereas the complexity is $O(n)$ in BF2. Algorithm 1 gives the pseudo-code of BF2. The only difference with BF resides between lines 8 and 11 (included), i.e., the method for allocating the remaining unallocated units.

In the following, we define and describe the notations used in Algorithm 1. The lag of a task $\tau_i$ at the end of the time slice $TS^k$, denoted $\mathrm{lag}_i^k$, corresponds to the lag of $\tau_i$ at boundary $b_k$. Any task $\tau_i$ with a positive, null or negative lag is said to be behind, punctual or ahead, respectively. Notice that, at the beginning of the first time slice $TS^1$, we have $\mathrm{lag}_i^0 = 0\ \forall i$ since every task releases its first job at time 0 and a job can only be punctual at its release time. The mandatory time units $\mathrm{mand}_i^k$ allocated to $\tau_i$ at line 4 is the number of time units that $\tau_i$ must execute in $TS^k$ so that its lag at the next boundary $b_k$ is strictly less than 1. The lag upper bound of a task $\tau_i$ in a time slice $TS^k$, denoted $\mathrm{lub}_i^k$, is computed at line 5 and corresponds to the lag at boundary $b_k$ if the scheduler only assigns the mandatory units to $\tau_i$. Once all the tasks have received their mandatory time units, there may be some remaining units that are not allocated to the tasks due to the use of the floor operator at line 4. These so-called remaining units are denoted $RU^k$ and are computed at line 7. From lines 8 to 11, the $RU^k$ remaining units are allocated to the tasks as optional units (details are given below) and finally, the lags are updated accordingly at line 12. Notice that, according


Figure 2: The upper schedule is produced by BF whereas the other one is generated by BF2. In the schedule of BF2, the local schedule in the time slice $TS^{k+1}$ has been horizontally flipped compared to the local schedule in the previous time slice $TS^k$, thus avoiding 3 preemptions and 2 migrations.

part of box) that overflows from $TS^k$ are packed to the next processor and this process continues until all the tasks are allocated.

The complexity of the whole algorithm BF2 can easily be proved to be O(n). Indeed, the complexity is clearly O(n) from line 2 to line 8 (included). Using any linear-time selection algorithm (e.g., [7]) provides a linear complexity for line 9, and lines 10, 11 and 12 are clearly in O(n). Finally, the function GenerateSchedule() runs in O(n), leading to a total complexity of O(n) for BF2.
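The per-slice allocation described in this section can be sketched as follows. This is a reconstruction from the prose only, under stated assumptions: the mandatory units are taken to be $\max(0, \lfloor \mathrm{lag}_i^{k-1} + L^k U_i \rfloor)$, a task is taken to be eligible when its lag upper bound is positive and it has fewer than $L^k$ mandatory units, and the selection key (largest lag upper bound first) is an assumption made here, since the selection criterion is not stated in the text above. For brevity the sketch sorts the eligible tasks, which costs O(n log n); a linear-time selection algorithm would be needed to reach the O(n) bound discussed above. The function name `allocate_slice` is hypothetical.

```python
from fractions import Fraction
from math import floor


def allocate_slice(lags, utils, length, m):
    """Allocate one time slice of `length` units on m processors.

    Returns (mand, opt, new_lags). The comments refer to the line numbers
    quoted in the prose; the code itself is only an illustrative sketch.
    """
    x = [lag + length * u for lag, u in zip(lags, utils)]
    mand = [max(0, floor(v)) for v in x]               # "line 4"
    lub = [v - units for v, units in zip(x, mand)]     # "line 5"
    ru = m * length - sum(mand)                        # "line 7"
    eligible = [i for i in range(len(utils))
                if lub[i] > 0 and mand[i] < length]
    if ru < 0 or ru > len(eligible):
        raise ValueError("remaining units cannot be distributed")
    # ASSUMPTION: prefer the largest lag upper bounds (criterion not in the text).
    chosen = sorted(eligible, key=lambda i: lub[i], reverse=True)[:ru]
    opt = [0] * len(utils)
    for i in chosen:
        opt[i] = 1
    new_lags = [b - o for b, o in zip(lub, opt)]       # "line 12"
    return mand, opt, new_lags


# Hypothetical task set with U = 1/2, 1/3, 1/6 on one processor, first slice of length 2.
utils = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
print(allocate_slice([Fraction(0)] * 3, utils, 2, 1))
```

Carrying the returned lags from one slice to the next reproduces the update loop of the algorithm over successive time slices.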

3.2 Differences between BF and BF2

The main difference between BF and the adaptation BF2 that we propose here resides at line 9 of Algorithm 1, where both algorithms determine which tasks will receive an optional time unit. In BF, the authors drew inspiration from the approach PF proposed in [1] and their resulting procedure that selects the tasks runs in $O(n\,T_{\max})$, where $T_{\max}$ denotes the largest period of the tasks. However, in Section 3 of their paper (see [4]), the authors claim that this selection procedure could be improved in order to run in O(n) and they briefly explain the idea behind their intuition. Although they provided all the needed tools to start a deeper investigation, they never completed the proofs that would confirm their intuition. In this paper, we prove with our selection procedure used at line 9 of BF2 that such an improvement is possible since it actually runs in O(n).

The second difference between BF and BF2 resides in the procedure GenerateSchedule() at line 13 of Algorithm 1. Indeed, this procedure receives a third argument in BF2, i.e., (k mod 2), that alternates between 0 and 1 between two consecutive time slices. This binary value indicates whether the schedule generated in the current time slice must be horizontally flipped (see Figure 2). This flipping operation considerably reduces the number of preemptions and/or migrations that can occur at both extremities of every time slice. However, we did not determine in this study the minimal and maximal number of preemptions and/or migrations that this technique avoids.
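The effect of the flip can be illustrated with a small sketch. It assumes that the local schedule of a time slice is built by wrap-around packing (tasks are laid out one after another and whatever overflows a processor continues on the next one), which is how the surviving description of GenerateSchedule() reads; the signature, the list-of-rows representation and the example allocation are assumptions made here, not the paper's procedure.

```python
def generate_schedule(alloc, length, m, flip):
    """Pack (task, units) pairs into m rows of `length` slots, optionally mirrored.

    alloc: list of (task_name, units) with units <= length for every task.
    Returns one list per processor; None marks an idle time unit.
    """
    if any(units > length for _, units in alloc):
        raise ValueError("a task may not receive more than `length` units")
    stream = [name for name, units in alloc for _ in range(units)]
    if len(stream) > m * length:
        raise ValueError("more units allocated than available")
    stream += [None] * (m * length - len(stream))
    rows = [stream[p * length:(p + 1) * length] for p in range(m)]
    # Flipping mirrors every row in time; tasks stay on the same processor.
    return [row[::-1] for row in rows] if flip else rows


alloc = [("A", 2), ("B", 2), ("C", 2)]            # hypothetical allocation
even = generate_schedule(alloc, 3, 2, flip=0)     # time slice with k mod 2 == 0
odd = generate_schedule(alloc, 3, 2, flip=1)      # next time slice, flipped
print(even, odd)
```

With this example, the tasks running at the end of the unflipped slice are exactly those running at the start of the flipped one, on the same processors, so no preemption or migration occurs at that boundary.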

4 Formal proofs

4.1 Optimality of BF2

Lemma 2 Any discrete-time boundary fair schedule is feasible.



Proof The proof is a direct consequence of the fact that every job in the schedule can only execute an integer number of time units within any interval of time. As a result, since the absolute value of the lag of any job $\tau_{i,j}$ is strictly less than 1 at its deadline (which corresponds to a boundary), we know that $\tau_{i,j}$ executed at least $C_i$ time units. Otherwise, its lag (i.e., $\mathrm{lag}_i^k$) would be greater than or equal to 1.
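The integrality argument can be spelled out as a short derivation. It is a sketch under the task model of Section 2: the deadline of job $\tau_{i,j}$ is the boundary $b_k = j \cdot T_i$ (because $a_{i,1} = 0$), and $e_i(t)$, a notation introduced here only for this derivation, is the number of time units $\tau_i$ has executed by time $t$.

```latex
\begin{align*}
\mathrm{lag}_i^k &= U_i \, b_k - e_i(b_k)
                  = \frac{C_i}{T_i} \, j \, T_i - e_i(b_k)
                  = j \, C_i - e_i(b_k) \in \mathbb{Z}, \\
|\mathrm{lag}_i^k| < 1 \;&\Longrightarrow\; \mathrm{lag}_i^k = 0
                  \;\Longrightarrow\; e_i(b_k) = j \, C_i .
\end{align*}
```

Applying the same reasoning at the previous deadline $(j-1) \cdot T_i$ gives $e_i = (j-1) \cdot C_i$ there, so the job $\tau_{i,j}$ receives exactly $C_i$ time units between its release and its deadline.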

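Lemma 3 For any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $U_{sum} = m$ and $U_i \le 1\ \forall i$, Algorithm 1 generates boundary fair schedules (i.e., $|\mathrm{lag}_i^k| < 1\ \forall i, \forall TS^k$) and it also holds $\forall i, \forall TS^k$ that $|\mathrm{lub}_i^k| < 1$.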

Proof The proof is obtained by induction on $k$.

Basis: At time 0, we know that the lag $\mathrm{lag}_i^0$ and the lag upper bound $\mathrm{lub}_i^0$ of every task $\tau_i$ are 0. Therefore we have $|\mathrm{lag}_i^0| < 1$ and $|\mathrm{lub}_i^0| < 1\ \forall i$.

Inductive step: Assuming that $|\mathrm{lag}_i^{k-1}| < 1$ and $|\mathrm{lub}_i^{k-1}| < 1\ \forall i$, we prove in this step that $|\mathrm{lag}_i^k| < 1$ and $|\mathrm{lub}_i^k| < 1\ \forall i$. According to Algorithm 1, two cases may occur at line 4: $\mathrm{mand}_i^k$ is set either to 0 or to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$. We prove in the following that it provides $|\mathrm{lub}_i^k| < 1\ \forall i$ in both cases, resulting in $|\mathrm{lag}_i^k| < 1\ \forall i$. Let $\tau_i$ be any task in $\tau$:

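Case 1. $\mathrm{mand}_i^k$ is set to 0. Thus we know from line 4 of Algorithm 1 that

$$\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor \le 0$$

and it holds that

$$\mathrm{lag}_i^{k-1} + L^k \cdot U_i < 1$$

Since by assumption $|\mathrm{lag}_i^{k-1}| < 1$ and $(L^k \cdot U_i)$ is positive,

$$\mathrm{lag}_i^{k-1} + L^k \cdot U_i > -1$$

leading to

$$|\mathrm{lag}_i^{k-1} + L^k \cdot U_i| < 1 \qquad (1)$$

The calculation of the lag upper bounds $\mathrm{lub}_i^k$ at line 5 of Algorithm 1 provides $\forall i$

$$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k$$

And since $\mathrm{mand}_i^k$ has been set to 0 in this case, the previous equality can be rewritten as

$$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i$$

leading to

$$|\mathrm{lub}_i^k| < 1$$

according to Inequality 1.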

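Case 2. $\mathrm{mand}_i^k$ is set to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$. Thus we know from line 4 of Algorithm 1 that

$$\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor > 0 \qquad (2)$$

The calculation of the lag upper bounds $\mathrm{lub}_i^k$ at line 5 of Algorithm 1 provides $\forall i$

$$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k$$

And since $\mathrm{mand}_i^k$ has been set to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$ in this case, the previous equality can be rewritten as

$$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$$

Since we know that $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor > 0$ according to Inequality 2, we get

$$0 \le \mathrm{lub}_i^k < 1$$

And so

$$|\mathrm{lub}_i^k| < 1$$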



The two cases presented above prove that after line 5 of Algorithm 1, we have $\forall i$

$$|\mathrm{lub}_i^k| < 1 \qquad (3)$$

Then, the calculation of $\mathrm{lag}_i^k$ at line 12 provides $\forall i$

$$\mathrm{lag}_i^k = \mathrm{lub}_i^k - \mathrm{opt}_i^k$$

And since $\mathrm{opt}_i^k$ can only be set to 1 if $\mathrm{lub}_i^k > 0$, it holds $\forall i$ that

$$|\mathrm{lag}_i^k| < 1 \qquad (4)$$

The lemma follows from Inequalities 3 and 4.
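The two bounds of the inductive step can also be checked numerically on a grid of values. The sketch below is not a proof and assumes the same reading of lines 4, 5 and 12 as the text above (mandatory units $\max(0, \lfloor \cdot \rfloor)$, an optional unit only when the lag upper bound is positive); the grid resolution and the function name `check_lemma3_bounds` are arbitrary choices made here.

```python
from fractions import Fraction
from math import floor


def check_lemma3_bounds(denominator=12, max_length=5):
    """Check |lub| < 1 and |lag| < 1 for every grid point; return the number checked."""
    checked = 0
    for p in range(-denominator + 1, denominator):      # previous lag in (-1, 1)
        prev_lag = Fraction(p, denominator)
        for q in range(1, denominator + 1):              # utilization in (0, 1]
            util = Fraction(q, denominator)
            for length in range(1, max_length + 1):      # slice length L^k
                x = prev_lag + length * util
                mand = max(0, floor(x))
                lub = x - mand
                assert abs(lub) < 1
                # An optional unit may only be granted when lub > 0.
                for opt in ((0, 1) if lub > 0 else (0,)):
                    assert abs(lub - opt) < 1
                checked += 1
    return checked


print(check_lemma3_bounds())
```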

Corollary 1 For any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $U_{sum} = m$ and $U_i \le 1\ \forall i$, Algorithm BF2 generates a feasible schedule.

Proof The proof is a direct consequence of Lemmas 2 and 3.

4.2 Validity of BF2

In this section, we prove that for any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $\sum_i U_i = m$ and $U_i \le 1\ \forall i$, Algorithm BF2 is valid in the sense that it ensures the four following properties:

1. In any time slice $TS^k$, there are at least $RU^k$ eligible tasks (see Lemma 6), i.e.,

$$\sum_i \mathrm{opt}_i^k = RU^k$$

2. In any time slice $TS^k$, the allocated mandatory work can be executed within the time slice (see Lemma 7), i.e.,

$$m \cdot L^k - \sum_i \mathrm{mand}_i^k \ge 0$$

3. In any time slice $TS^k$, the total number of time units allocated to the tasks is never larger than the number of available time units (see Lemma 9), i.e., at any time $b_{k-1}$ of any time slice $TS^k \overset{\mathrm{def}}{=} [b_{k-1}, b_k)$, it holds that

$$\sum_i (\mathrm{mand}_i^k + \mathrm{opt}_i^k) \le m \cdot L^k$$

4. No task is allocated more than $L^k$ time units during any time slice $TS^k$, so that job parallelism is avoided (see Lemma 10), i.e.,

$$\mathrm{mand}_i^k + \mathrm{opt}_i^k \le L^k$$
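The four properties can be read as assertions on the allocation of a single time slice. The sketch below checks them for given vectors of mandatory and optional units; the function name `check_validity`, the dictionary keys and the sample values are assumptions introduced for illustration.

```python
def check_validity(mand, opt, length, m):
    """Evaluate the four validity properties for one time slice."""
    ru = m * length - sum(mand)  # remaining units RU^k
    return {
        "optional_units_match_RU": sum(opt) == ru,                         # property 1
        "mandatory_fits": ru >= 0,                                         # property 2
        "no_overallocation": sum(mand) + sum(opt) <= m * length,           # property 3
        "no_parallelism": all(a + b <= length for a, b in zip(mand, opt)), # property 4
    }


# Hypothetical slice of length 2 on one processor.
print(check_validity([1, 0, 0], [0, 1, 0], 2, 1))
```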

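Lemma 4 In any time slice $TS^k$, if $\sum_i \mathrm{lag}_i^{k-1} = 0$ then applying Algorithm 1 provides $RU^k = \sum_i \mathrm{lub}_i^k$.

Proof Assuming that $\sum_i \mathrm{lag}_i^{k-1} = 0$, line 5 of Algorithm 1 provides

$$\mathrm{lub}_i^k = (\mathrm{lag}_i^{k-1} + L^k \cdot U_i) - \mathrm{mand}_i^k.$$

And therefore

$$\begin{aligned}
\sum_i \mathrm{lub}_i^k &= \sum_i (\mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k) \\
&= \sum_i \mathrm{lag}_i^{k-1} + \sum_i (L^k \cdot U_i) - \sum_i \mathrm{mand}_i^k \\
&= L^k \cdot m - \sum_i \mathrm{mand}_i^k \\
&= RU^k
\end{aligned}$$

which proves the lemma.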
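A quick numerical instance of Lemma 4 may help. The sketch below computes the remaining units both directly and as the sum of the lag upper bounds, assuming $U_{sum} = m$ and lags that sum to zero; the function name and the sample values are hypothetical.

```python
from fractions import Fraction
from math import floor


def remaining_units_two_ways(prev_lags, utils, length, m):
    """Return (m * L - sum of mand, sum of lub) for one time slice."""
    x = [lag + length * u for lag, u in zip(prev_lags, utils)]
    mand = [max(0, floor(v)) for v in x]
    lub = [v - units for v, units in zip(x, mand)]
    return m * length - sum(mand), sum(lub)


utils = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]       # U_sum = 1 = m
prev_lags = [Fraction(-1, 2), Fraction(0), Fraction(1, 2)]     # lags sum to 0
print(remaining_units_two_ways(prev_lags, utils, 1, 1))
```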

In the following proofs, we will use the notation $\mathrm{lub}_i^k(S)$ to refer to the lag upper bound of the task $\tau_i$ in the time slice $TS^k$ in the schedule $S$. The notations $\mathrm{lag}_i^k(S)$, $\mathrm{mand}_i^k(S)$, $\mathrm{opt}_i^k(S)$, ... are similarly defined.

Lemma 5 Let $S_{BF}$ and $S_{BF2}$ denote the schedules produced by BF and BF2, respectively, while running any real-time system providing $U_{sum} = m$ and $U_i \le 1\ \forall i$ on $m$ processors. At any boundary $b_k$ with $k \ge 0$, it holds $\forall i$ that $\mathrm{lag}_i^k(S_{BF2})$ is equal to either $\mathrm{lag}_i^k(S_{BF})$ or $\mathrm{lag}_i^k(S_{BF}) + 1$ or $\mathrm{lag}_i^k(S_{BF}) - 1$. Hereafter, we denote this by

$$\mathrm{lag}_i^k(S_{BF2}) = \mathrm{lag}_i^k(S_{BF})\{\pm 1\}$$

Proof From Lemma 3, we know that BF2 is boundary fair and algorithm BF has been proved to be also boundary fair in [4]. This means that both algorithms have a lag strictly less than 1 time unit and, by definition of both algorithms, for every task $\tau_i$ in every time slice $TS^k$ it holds that

$$\sum_{\ell=1}^{k} (\mathrm{mand}_i^{\ell} + \mathrm{opt}_i^{\ell}) = \lfloor U_i \cdot b_k \rfloor \quad \text{or} \quad \sum_{\ell=1}^{k} (\mathrm{mand}_i^{\ell} + \mathrm{opt}_i^{\ell}) = \lceil U_i \cdot b_k \rceil$$

which proves the lemma.

Lemma 6 In any time slice $TS^k$, there are at least $RU^k$ eligible tasks, i.e.,

$$\sum_i \mathrm{opt}_i^k = RU^k$$

Proof The proof is not easy to understand and Figure 3 has been drawn in order to help the reader. This picture depicts two schedules produced for the given set of tasks. The upper one, denoted $S_{BF}$, is the schedule produced by the optimal algorithm BF whereas the other one, denoted $S_{BF2}$, is produced by our algorithm BF2.

First time slice: At time $b_0 = 0$, we have $\mathrm{lag}_i^0(S_{BF2}) = 0\ \forall i$. Therefore, $\mathrm{mand}_i^1(S_{BF2}) = \lfloor L^1 \cdot U_i \rfloor\ \forall i$ (from line 4), $\mathrm{lub}_i^1(S_{BF2}) \ge 0\ \forall i$ (from line 5) and $RU^1(S_{BF2}) = \sum_i \mathrm{lub}_i^1(S_{BF2})$ (from Lemma 4). Because every lag upper bound is strictly less than 1 (Lemma 3) and $\mathrm{lub}_i^1(S_{BF2}) = 0$ for any task $\tau_i$ with $\mathrm{mand}_i^1(S_{BF2}) = L^1$, there are at least $RU^1(S_{BF2})$ tasks $\tau_j$ such that $\mathrm{lub}_j^1(S_{BF2}) > 0$ and $\mathrm{mand}_j^1(S_{BF2}) < L^1$. As a result, all the remaining units are allocated to the tasks in the first time slice.

Next time slices: Let $TS^k$ be any time slice such that $k > 1$ and suppose that in $S_{BF2}$ there is no unallocated time unit within $[b_0, b_{k-1})$. Then, by contradiction, suppose that in $TS^k$ in $S_{BF2}$, there is at least one remaining unit that cannot be allocated to any task, i.e., there are fewer than $RU^k$ eligible tasks. This situation is illustrated in Figure 3, where an unallocated unit (henceforth called a hole) is depicted in time slice $TS^k$. Below, we prove that the hole can be shifted in $S_{BF2}$ from one time slice to another but that it always stays confined between $TS^1$ (i.e., the first time slice) and $TS^k$ (the first time slice from $TS^1$ in which it appears). Therefore, if $S_{BF2}$ has a hole in any time slice between $TS^1$ and $TS^k$ then $S_{BF}$ must also have a hole between these two time slices, hence contradicting its optimality.

Figure 3: Illustration of the schedules $S_{BF}$ and $S_{BF2}$ produced by algorithms BF and BF2, respectively, when a hole appears in $TS^k$ in $S_{BF2}$.

Since there is an unallocated time unit in $TS^k$ (i.e., a hole), we know that there exists at least one task (say $A$) that executes longer in the time slice $TS^k$ in $S_{BF}$ than in $S_{BF2}$. By Lemma 5, we know $\mathrm{lag}_A^{k-1}(S_{BF}) - \mathrm{lag}_A^{k-1}(S_{BF2}) \in \{-1, 0, 1\}$. Below, we show that this difference cannot be 0 or -1. Hence,

$$\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2}) + 1 \qquad (5)$$

meaning that BF schedules $A$ to be further behind the ideal schedule at time $b_{k-1}$ than BF2 does. Indeed, we show in the following that the two other possible cases (from Lemma 5) lead to a contradiction.

Case 1. Suppose that $\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2})$. Line 4 of Algorithm 1 yields $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2})$ and from line 5,

$$\mathrm{lub}_A^k(S_{BF}) = \mathrm{lub}_A^k(S_{BF2})$$

In order to execute longer in $S_{BF}$, $A$ must receive an optional unit in $TS^k$ in $S_{BF}$ and not in $S_{BF2}$. However, if $A$ receives an optional unit in $TS^k$ in $S_{BF}$ then it holds that $\mathrm{lub}_A^k(S_{BF}) > 0$ and $\mathrm{mand}_A^k(S_{BF}) < L^k$ and therefore, it also holds that $\mathrm{lub}_A^k(S_{BF2}) > 0$ and $\mathrm{mand}_A^k(S_{BF2}) < L^k$ since the lag upper bounds are equal in both schedules. It means that $A$ is eligible in $TS^k$ in $S_{BF2}$ and does not receive any optional unit, leading to a contradiction with the fact that there is a remaining unit that cannot be allocated in $TS^k$ in $S_{BF2}$.

Case 2. Suppose that $\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2}) - 1$. Line 4 of Algorithm 1 yields either $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) - 1$ or $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) = 0$.

1. If $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) - 1$, then even if $A$ receives an optional unit in $TS^k$ in $S_{BF}$ it cannot execute longer than in $S_{BF2}$.

2. If $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) = 0$, then the lag upper bounds are positive in both schedules $S_{BF}$ and $S_{BF2}$, and it leads to a contradiction for the same reason as that explained in Case 1.

According to Equality 5, we know from the definition of the lags that there exists a time slice $TS^j$ with $1 \le j < k$ such that $A$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$. Let $A$ also denote any time unit of $A$ in $TS^j$ in $S_{BF2}$ and suppose that we swap the hole with $A$ (see Figure 3). Since we assumed that $\forall \ell \in [1, k)$

$$\sum_i (\mathrm{mand}_i^{\ell}(S_{BF2}) + \mathrm{opt}_i^{\ell}(S_{BF2})) = m \cdot L^{\ell}$$

it held (before moving $A$) that in $TS^j$

$$\sum_i (\mathrm{mand}_i^j(S_{BF2}) + \mathrm{opt}_i^j(S_{BF2})) = m \cdot L^j$$

Since moving $A$ generated a hole in $TS^j$ in $S_{BF2}$, from the above equality there exists a task $B$ such that

$$\mathrm{mand}_B^j(S_{BF2}) + \mathrm{opt}_B^j(S_{BF2}) < \mathrm{mand}_B^j(S_{BF}) + \mathrm{opt}_B^j(S_{BF})$$

Two cases may occur concerning $B$: either $\mathrm{lag}_B^j(S_{BF2}) \le 0$ or $\mathrm{lag}_B^j(S_{BF2}) > 0$. First, suppose $\mathrm{lag}_B^j(S_{BF2}) \le 0$. This condition allows us to perform a backward move defined below.

Backward move: A backward move shifts the hole from the current time slice (here, $TS^j$) to a former time slice. In the remainder of this part, $TS^j$ is assumed to be the current time slice and $B$ is the task that executes longer in $TS^j$ in $S_{BF}$ than in $S_{BF2}$. Furthermore, we know by assumption that $\mathrm{lag}_B^j(S_{BF2}) \le 0$ and we prove in the following that it yields $\mathrm{lag}_B^{j-1}(S_{BF2}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF}) > 0$ and, from the definition of the lags, there must exist a time slice $TS^q$ with $1 \le q < j$ such that $B$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$ (see Figure 3). By Lemmas 3 and 5, one of the following must hold.

$$\mathrm{lag}_B^{j-1}(S_{BF2}) = \mathrm{lag}_B^{j-1}(S_{BF})$$

$$\text{or } \mathrm{lag}_B^{j-1}(S_{BF2}) > 0 > \mathrm{lag}_B^{j-1}(S_{BF})$$

$$\text{or } \mathrm{lag}_B^{j-1}(S_{BF2}) < 0 < \mathrm{lag}_B^{j-1}(S_{BF})$$

Suppose $\mathrm{lag}_B^{j-1}(S_{BF2}) = \mathrm{lag}_B^{j-1}(S_{BF})$. Therefore, it holds from line 4 of Algorithm 1 that $\mathrm{mand}_B^j(S_{BF2}) = \mathrm{mand}_B^j(S_{BF})$ and from line 5,

$$\mathrm{lub}_B^j(S_{BF2}) = \mathrm{lub}_B^j(S_{BF}) \qquad (6)$$

Since $B$ executes longer in $S_{BF}$ than in $S_{BF2}$, we must have $\mathrm{opt}_B^j(S_{BF}) = 1$ and $\mathrm{opt}_B^j(S_{BF2}) = 0$. However, having $\mathrm{opt}_B^j(S_{BF}) = 1$ is possible only if $\mathrm{lub}_B^j(S_{BF}) > 0$ and since from Equality 6 the lag upper bounds of $B$ are equal in both schedules, it also holds that $\mathrm{lub}_B^j(S_{BF2}) > 0$. Finally, since $\mathrm{opt}_B^j(S_{BF2}) = 0$, we know from line 12 of Algorithm 1 that $\mathrm{lag}_B^j(S_{BF2}) > 0$, leading to a contradiction with our assumption.

Suppose $\mathrm{lag}_B^{j-1}(S_{BF2}) > 0 > \mathrm{lag}_B^{j-1}(S_{BF})$. Therefore we know from Lemma 5 that $\mathrm{lag}_B^{j-1}(S_{BF}) = \mathrm{lag}_B^{j-1}(S_{BF2}) - 1$. If $\mathrm{lag}_B^{j-1}(S_{BF}) < 0$ then it holds from line 4 of Algorithm 1 that $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2})\{+1\}$.

1. Suppose $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2})$. Since $\mathrm{lag}_B^{j-1}(S_{BF}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF2}) > 0$, we know that if the amounts of mandatory units are equal in both schedules then they are both equal to 0 and it holds that $\mathrm{lag}_B^{j-1}(S_{BF2}) < 1 - L^j \cdot U_B$ and $\mathrm{lag}_B^{j-1}(S_{BF}) < -L^j \cdot U_B$. Therefore, $\mathrm{lub}_B^j(S_{BF}) < 0$ from line 5 of Algorithm 1 and $\mathrm{opt}_B^j(S_{BF}) = 0$ from line 9. Consequently, $B$ does not execute longer in $S_{BF}$ than in $S_{BF2}$.

2. Suppose $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2}) + 1$. In this case, even if $\mathrm{opt}_B^j(S_{BF}) = 0$, $B$ cannot execute longer in $S_{BF}$ than in $S_{BF2}$.

Therefore, if $\mathrm{lag}_B^j(S_{BF2}) \le 0$, we conclude that $\mathrm{lag}_B^{j-1}(S_{BF2}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF}) > 0$. From the definition of the lags, there exists a time slice $TS^q$ with $1 \le q < j$ such that $B$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$ (see Figure 3). Informally, in any time slice $TS^j$ to which the hole is shifted, there will always exist a task (say $\tau_i$) that executes longer in $S_{BF}$ than in $S_{BF2}$ and a backward move can be performed as long as $\mathrm{lag}_i^j(S_{BF2}) \le 0$. Otherwise, if $\mathrm{lag}_i^j(S_{BF2}) > 0$, we can perform a forward move defined below.

Notice that, by performing successive backward moves, the hole will finally be moved to the first time slice $TS^1$. In this case, we know that there is a task (say $D$) that executes longer in $S_{BF}$ than in $S_{BF2}$ (see Figure 3) and we know that $D$ receives the same number of mandatory units in $TS^1$ in both schedules. Therefore, it must have received an optional unit in $S_{BF}$ and not in $S_{BF2}$ and it yields $\mathrm{lag}_D^1(S_{BF2}) > 0$. A forward move can therefore be performed.

Forward move: We have seen that when $\mathrm{lag}_B^j(S_{BF2}) \le 0$, we can perform a backward move. We now consider the case where $\mathrm{lag}_B^j(S_{BF2}) > 0$. In this case, we can perform a forward move, which shifts the hole from the current time slice (say $TS^j$) to another time slice (say $TS^p$) such that $j < p \le k$, where $TS^k$ is the time slice in which the hole initially appears (see Figure 3). In the following, we prove by contradiction that $B$ must execute for at least one time unit in $S_{BF2}$ within $[b_j, b_k)$. Indeed, suppose that $(\mathrm{mand}_B^{\ell}(S_{BF2}) + \mathrm{opt}_B^{\ell}(S_{BF2})) = 0$ for every time slice $TS^{\ell}$ within $(b_j, b_k]$. Then,

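$$\begin{aligned}
\mathrm{lub}_B^k(S_{BF2}) &= \mathrm{lag}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
&= \mathrm{lub}_B^{k-1}(S_{BF2}) - \mathrm{opt}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
&= \mathrm{lag}_B^{k-2}(S_{BF2}) + L^{k-1} \cdot U_B - \mathrm{mand}_B^{k-1}(S_{BF2}) - \mathrm{opt}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
&= \mathrm{lag}_B^{k-2}(S_{BF2}) + U_B \cdot (L^{k-1} + L^k) - \sum_{\ell=k-1}^{k} \mathrm{mand}_B^{\ell}(S_{BF2}) - \sum_{\ell=k-1}^{k-1} \mathrm{opt}_B^{\ell}(S_{BF2}) \\
&= \ldots \\
&= \mathrm{lag}_B^j(S_{BF2}) + U_B \cdot \sum_{\ell=j+1}^{k} L^{\ell} - \sum_{\ell=j+1}^{k} \mathrm{mand}_B^{\ell}(S_{BF2}) - \sum_{\ell=j+1}^{k-1} \mathrm{opt}_B^{\ell}(S_{BF2}) \\
&= \mathrm{lag}_B^j(S_{BF2}) + (b_k - b_j) \cdot U_B
\end{aligned}$$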

Since $\mathrm{lag}_B^j(S_{BF2}) > 0$ and $(b_k - b_j) \cdot U_B > 0$, it holds that $\mathrm{lub}_B^k(S_{BF2}) > 0$ and by assumption $\mathrm{mand}_B^k(S_{BF2}) = 0$. Therefore, $B$ is eligible for an optional time unit in $TS^k$, leading to a contradiction with the fact that there is at least one time unit that cannot be allocated in $TS^k$ in $S_{BF2}$. From the above result, we know that there exists a time slice $TS^p$ with $j < p \le k$ such that $B$ is executed for at least one time unit and the forward move consists in filling the hole in the current time slice $TS^j$ with any time unit of $B$ in $TS^p$ (thus shifting the hole to $TS^p$). Again, after having swapped the hole with any time unit of $B$, we know that there is at least one task, say $C$ ($C$ could be the same task as $B$), that is executed for less time in $TS^p$ in $S_{BF2}$ than in $S_{BF}$. We are therefore in the same situation as in $TS^j$.

Each time we move the hole, whether forward or backward, we are making $S_{BF2}$ more closely resemble $S_{BF}$. In each case, the hole remains in some time slice $TS^{\ell}$, where $\ell \le k$. Therefore, there must also be a hole in the schedule $S_{BF}$, which contradicts its optimality proved in [4].

Lemma 7 In any time slice $TS^k$, the allocated mandatory work can be executed within the time slice, i.e.,

$$m \cdot L^k - \sum_i \mathrm{mand}_i^k \ge 0$$

as long as all the time units are allocated to the tasks in every time slice $TS^{\ell}$ with $1 \le \ell < k$.

Proof The proof is not easy to understand and Figure 4 has been drawn in order to help the reader. This picture depicts two schedules produced for the given set of tasks. The upper one, denoted $S_{BF}$, is the schedule produced by the optimal algorithm BF whereas the other one, denoted $S_{BF2}$, is produced by our algorithm BF2. Hereafter, we will use the notation $\mathrm{lub}_i^k(S)$ to refer to the lag upper bound of the task $\tau_i$ in the time slice $TS^k$ in the schedule $S$. The notations $\mathrm{lag}_i^k(S)$, $\mathrm{mand}_i^k(S)$, $\mathrm{opt}_i^k(S)$, ... are similarly defined.

First time slice: In the first time slice $TS^1$ it holds that $RU^1(S_{BF2}) \ge 0$ within $[b_0, b_1)$. Indeed at time $b_0 = 0$, we have $\mathrm{lag}_i^0(S_{BF2}) = 0\ \forall i$. Therefore, it holds that $\mathrm{mand}_i^1(S_{BF2}) = \lfloor L^1 \cdot U_i \rfloor\ \forall i$ (from line 4), $\mathrm{lub}_i^1(S_{BF2}) \ge 0\ \forall i$ (from line 5) and $RU^1(S_{BF2}) \ge 0$ (from Lemma 4). Notice that, since the difference between algorithms BF and BF2 is the method for allocating the optional time units, it holds in $TS^1$ that

$$\forall i: \mathrm{mand}_i^1(S_{BF}) = \mathrm{mand}_i^1(S_{BF2}) \qquad (7)$$

Figure 4: Illustration of the schedules $S_{BF}$ and $S_{BF2}$ produced by algorithms BF and BF2, respectively, when an overflow appears in $TS^k$ in $S_{BF2}$.

Next time slices: Let $TS^k \overset{\mathrm{def}}{=} [b_{k-1}, b_k)$ be any time slice such that $k > 1$ and suppose that in $S_{BF2}$ there is no unallocated time unit within $[b_0, b_{k-1})$. Then, suppose by contradiction that in $TS^k$,

$$\sum_i \mathrm{mand}_i^k(S_{BF2}) > m \cdot L^k \qquad (8)$$

In the following, we prove that if Inequality 8 holds then there cannot exist a valid schedule from time 0 to time $b_k$, thus leading to a contradiction with the optimality and validity of Algorithm BF.

Above, we assumed that there is an overflow in the time slice $TS^k$ in $S_{BF2}$. In other words, there exists at least one task (say $A$) that executes longer in the time slice $TS^k$ in $S_{BF2}$ than in $S_{BF}$. Due to the overflow, at least 1 unit of $A$ will be unable to execute during $TS^k$. We consider a single such time unit and we say this time unit is ejected from $TS^k$.

By Inequality 8, we know that $RU^k < 0$ and every task (including $A$) only executes its mandatory part in $TS^k$ in $S_{BF2}$. Therefore,

$$\mathrm{mand}_A^k(S_{BF2}) > \mathrm{mand}_A^k(S_{BF}) + \mathrm{opt}_A^k(S_{BF})$$

Since $\mathrm{opt}_A^k(S_{BF})$ can be only 1 or 0, it holds that

$$\mathrm{mand}_A^k(S_{BF2}) > \mathrm{mand}_A^k(S_{BF})$$

and so (from line 4)

$$\mathrm{lag}_A^{k-1}(S_{BF2}) > \mathrm{lag}_A^{k-1}(S_{BF})$$

According to Lemma 5, the above inequality can be rewritten as

$$\mathrm{lag}_A^{k-1}(S_{BF2}) = \mathrm{lag}_A^{k-1}(S_{BF}) + 1 \qquad (9)$$

Since the time unit ejected from $TS^k$ is a mandatory unit, we can perform the backward move defined below.

Backward move: As a consequence of the above inequality, we know from the definition of the lags that there exists a time slice before $TS^k$ such that the task A executes at least one time unit more in $S_{BF}$ than in $S_{BF2}$. Hereafter, A and $TS^k$ will be said to be the reference task and the reference time slice of this backward move. Remember that the backward move can only be performed if the time unit ejected from the overflowed time slice (here, $TS^k$) is a mandatory unit. Let $TS^j$ (with $1 \le j < k$) denote the latest time slice (from $TS^1$) such that A executes at least one time unit more in $S_{BF}$ than in $S_{BF2}$, and consider this extra time unit of A (see Figure 4). By definition of $TS^j$ and from Lemma 5, we know that

$$\mathrm{lag}_A^j(S_{BF2}) = \mathrm{lag}_A^j(S_{BF})\,\{+1\} \tag{10}$$

For the correctness of the remainder of this proof, we need to be sure that A does not execute any optional unit in $S_{BF2}$ within $[b_{j-1}, b_k)$. This is proved in Subproofs B.1, B.2 and B.3 below.

Subproof B.1: Suppose that there exists a time slice $TS^\ell$ with $j < \ell < k$ such that A executes one time unit more in BF than in BF2. This directly results in a contradiction with the definition of $TS^j$.

Subproof B.2: Suppose that there exists a time slice $TS^\ell$ with $j < \ell < k$ such that A executes one time unit more in BF2 than in BF. From Expression 10 we have

$$\mathrm{lag}_A^{\ell-1}(S_{BF2}) = \mathrm{lag}_A^{\ell-1}(S_{BF})\,\{+1\}$$

and it therefore holds that

$$\mathrm{lag}_A^{\ell}(S_{BF2}) = \mathrm{lag}_A^{\ell}(S_{BF})\,\{-1\}$$

However, since from Subproof B.1 there does not exist a time slice $TS^o$ with $\ell < o < k$ such that A executes one time unit more in BF than in BF2, we get at boundary $b_{k-1}$

$$\mathrm{lag}_A^{k-1}(S_{BF2}) = \mathrm{lag}_A^{k-1}(S_{BF})\,\{-1\}$$

leading to a contradiction with Equality 9. Notice that, according to Subproofs B.1 and B.2, we know that BF and BF2 must execute A for the same amount of time within each time slice in $[b_j, b_{k-1})$. As a consequence, the difference between the lags of A in BF and BF2 from boundary $b_j$ to boundary $b_{k-1}$ remains constant and, according to Equality 9, we have at boundary $b_j$

$$\mathrm{lag}_A^j(S_{BF2}) = \mathrm{lag}_A^j(S_{BF}) + 1 \tag{11}$$

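Subproof B.3: From Subproofs B.1 and B.2 we know that in every time slice $TS^\ell$ such that $j < \ell < k$, A executes the same amount of time in both BF2 and BF. From Equality 11 we know that $\mathrm{lag}_A^j(S_{BF2}) > 0$ and $\mathrm{lag}_A^j(S_{BF}) < 0$, and from Subproofs B.1 and B.2 it holds that

$$\mathrm{lag}_A^{\ell-1}(S_{BF2}) = \mathrm{lag}_A^{\ell-1}(S_{BF}) + 1 \tag{12}$$

From line 4 of Algorithm 1 we have in $TS^\ell$ in both $S_{BF}$ and $S_{BF2}$

$$\mathrm{mand}_A^\ell = \max\left(0, \left\lfloor \mathrm{lag}_A^{\ell-1} + U_A L^\ell \right\rfloor\right)$$

and two cases may occur.

1. $\mathrm{mand}_A^\ell(S_{BF2}) = 0$. Therefore it holds in $S_{BF2}$ that

$$\left\lfloor \mathrm{lag}_A^{\ell-1}(S_{BF2}) + U_A L^\ell \right\rfloor \le 0$$

and according to Equality 12, it holds in $S_{BF}$ that

$$\left\lfloor \mathrm{lag}_A^{\ell-1}(S_{BF}) + U_A L^\ell \right\rfloor \le -1$$

leading to $\mathrm{mand}_A^\ell(S_{BF}) = 0$. This means that A does not execute any mandatory unit in either $S_{BF}$ or $S_{BF2}$. Since $\mathrm{mand}_A^\ell(S_{BF}) = 0$, we have

$$\mathrm{lub}_A^\ell(S_{BF}) = \mathrm{lag}_A^{\ell-1}(S_{BF}) + L^\ell U_A$$

and from Equality 12,

$$\mathrm{lub}_A^\ell(S_{BF}) = \mathrm{lag}_A^{\ell-1}(S_{BF2}) - 1 + L^\ell U_A = \mathrm{lub}_A^\ell(S_{BF2}) - 1$$

As a consequence, we know from Lemma 3 that $\mathrm{lub}_A^\ell(S_{BF}) > 0$ and $\mathrm{lub}_A^\ell(S_{BF2}) < 0$. Hence, A cannot receive an optional unit in $TS^\ell$ in BF2 and, from Subproofs B.1 and B.2, it cannot receive an optional unit in $S_{BF}$ either.

2. $\left\lfloor \mathrm{lag}_A^{\ell-1}(S_{BF2}) + U_A L^\ell \right\rfloor > 0$, which leads to

$$\mathrm{mand}_A^\ell(S_{BF2}) = \mathrm{mand}_A^\ell(S_{BF}) + 1$$

As a result, having $\mathrm{mand}_A^\ell(S_{BF2}) + \mathrm{opt}_A^\ell(S_{BF2}) = \mathrm{mand}_A^\ell(S_{BF}) + \mathrm{opt}_A^\ell(S_{BF})$ is possible if and only if $\mathrm{opt}_A^\ell(S_{BF2}) = 0$ and $\mathrm{opt}_A^\ell(S_{BF}) = 1$.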
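The two cases of Subproof B.3 can be checked numerically. The sketch below assumes the mandatory-unit rule $\mathrm{mand} = \max(0, \lfloor \mathrm{lag} + U\,L \rfloor)$ and a lag in $S_{BF2}$ that is exactly one unit larger than in $S_{BF}$ (Equality 12). The utilization, slice length and sweep of lag values are hypothetical.

```python
from fractions import Fraction
from math import floor


def mand(lag_prev, utilization, slice_length):
    """Mandatory units, assuming mand = max(0, floor(lag_prev + U * L))."""
    return max(0, floor(lag_prev + utilization * slice_length))


U, L = Fraction(1, 5), 2  # hypothetical utilization and slice length

results = []
# Sweep lag values of S_BF in (-1, 0); S_BF2 is one unit ahead (Equality 12).
for n in range(1, 20):
    lag_bf = Fraction(-n, 20)
    lag_bf2 = lag_bf + 1
    m_bf, m_bf2 = mand(lag_bf, U, L), mand(lag_bf2, U, L)
    if floor(lag_bf2 + U * L) <= 0:
        # Case 1: no mandatory unit in either schedule.
        results.append((m_bf2, m_bf) == (0, 0))
    else:
        # Case 2: S_BF2 executes exactly one more mandatory unit.
        results.append(m_bf2 == m_bf + 1)

print(all(results))
```

With these values both cases are exercised during the sweep, and the relation between the two mandatory parts holds in each of them.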

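Subproofs B.1, B.2 and B.3 state that A does not execute any optional time unit in $S_{BF2}$ within $[b_{j-1}, b_k)$. Now, we move the ejected time unit of A from $TS^k$ to $TS^j$ (see Figure 4). Since we assumed that $\forall \ell \in [1, k)$

$$\sum_i \left(\mathrm{mand}_i^\ell(S_{BF2}) + \mathrm{opt}_i^\ell(S_{BF2})\right) = m\,L^\ell$$

we have in $TS^j$

$$\sum_i \left(\mathrm{mand}_i^j(S_{BF2}) + \mathrm{opt}_i^j(S_{BF2})\right) = m\,L^j$$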

Consequently, placing this time unit of A in $TS^j$ in $S_{BF2}$ obviously generates an overflow of one time unit. Suppose that we place it in $TS^j$ anyway in order to imitate the optimal schedule $S_{BF}$. Since there is now an overflow in $TS^j$ in $S_{BF2}$, there must exist a task B such that

$$\mathrm{mand}_B^j(S_{BF2}) + \mathrm{opt}_B^j(S_{BF2}) > \mathrm{mand}_B^j(S_{BF}) + \mathrm{opt}_B^j(S_{BF})$$

Consider the time unit of B which is ejected from $TS^j$ in $S_{BF2}$ (see Figure 4). Two cases may occur: this unit belongs to the mandatory part of B (i.e., $\mathrm{opt}_B^j(S_{BF2}) = 0$), or it was allocated as an optional time unit (i.e., $\mathrm{opt}_B^j(S_{BF2}) = 1$).

Case 1. The ejected time unit belongs to the mandatory part of a task (here, B), i.e., $\mathrm{opt}_B^j(S_{BF2}) = 0$. Following the same development as for A, there must exist a time slice $TS^q$ with $1 \le q < j$ such that B executes one unit more in $S_{BF}$ than in $S_{BF2}$. As a result, we can perform the backward move with B as reference task and $TS^j$ as reference time slice. The backward move can be repeated as long as the ejected unit is a mandatory one, and it ends when we reach one of the two following situations: either we reach a time slice $TS^q$ with $1 < q < j$ where an optional unit is ejected, or we reach $TS^q = TS^1$ (see Figure 4). Since in $TS^1$ the amount of mandatory units of every task is identical in both $S_{BF}$ and $S_{BF2}$ (according to Equality 7), we know that the time unit ejected from $TS^1$ is an optional one. In conclusion, the backward move stops when the time unit ejected from a time slice is optional, and we know that this must happen if $TS^1$ is reached, if not beforehand.

Case 2. The ejected time unit of B was allocated as an optional time unit (i.e., $\mathrm{opt}_B^j(S_{BF2}) = 1$). Therefore, we perform a forward move, defined below.



Forward move: The forward move is performed when the time unit ejected from a time slice is an optional one. A forward move is characterized by an origin, a reference task and a destination with the following interpretation: the origin is the time slice from which the optional time unit is ejected (here, $TS^j$), the reference task is the one responsible for the ejection of the first optional time unit (here, A) and the destination is the time slice in which an extra time unit (compared to $S_{BF}$) will be finally generated (here, $TS^k$). In the following, we prove that

F.1 withdrawing the optional unit from B in the origin time slice $TS^j$ in $S_{BF2}$ ensures that B executes at least one mandatory unit in a time slice $TS^p$ in $S_{BF2}$ such that $j < p \le k$, where $TS^j$ and $TS^k$ are the origin and destination time slices of the forward move, respectively.

F.2 at least one of the mandatory units of B executed within $[b_j, b_k)$ has been generated by withdrawing its optional time unit in $TS^j$ (see Figure 4). That is, withdrawing the optional unit from B in $TS^j$ in $S_{BF2}$ generates an overflow of one time unit in another time slice $TS^p$ (with $j < p \le k$).

Subproof F.1: Here, we prove that withdrawing the optional time unit from B in $TS^j$ in $S_{BF2}$ ensures that B executes at least one mandatory unit in a time slice $TS^p$ in $S_{BF2}$ such that $j < p \le k$. Within this subproof, which only considers the schedule $S_{BF2}$, we will use the notations $\mathrm{lag}_i^k$, $\mathrm{lub}_i^k$, ... instead of $\mathrm{lag}_i^k(S_{BF2})$, $\mathrm{lub}_i^k(S_{BF2})$, etc.

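Since A has been responsible for an overflow of a mandatory unit in $TS^k$ in $S_{BF2}$, it holds in $S_{BF2}$ that

$$\begin{aligned}
1 &\le \left\lfloor \mathrm{lag}_A^{k-1} + L^k U_A \right\rfloor \\
  &\le \mathrm{lag}_A^{k-1} + L^k U_A \\
  &\le \mathrm{lub}_A^{k-1} - \mathrm{opt}_A^{k-1} + L^k U_A
\end{aligned}$$

Since $\mathrm{opt}_A^{k-1} \ge 0$ we have

$$\begin{aligned}
1 &\le \mathrm{lub}_A^{k-1} + L^k U_A \\
  &\le \mathrm{lag}_A^{k-2} + L^{k-1} U_A - \mathrm{mand}_A^{k-1} + L^k U_A
\end{aligned}$$

Since $\mathrm{mand}_A^{k-1} \ge 0$, it holds that

$$\begin{aligned}
1 &\le \mathrm{lag}_A^{k-2} + L^{k-1} U_A + L^k U_A \\
  &\le \mathrm{lag}_A^{k-2} + (L^{k-1} + L^k)\, U_A \\
  &\le \dots \\
  &\le \mathrm{lag}_A^{j} + (L^{j+1} + \dots + L^{k-1} + L^k)\, U_A \\
  &\le \mathrm{lag}_A^{j} + (b_k - b_j)\, U_A
\end{aligned}$$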

From line 12 of Algorithm 1 we know that $\mathrm{lag}_A^j = \mathrm{lub}_A^j - \mathrm{opt}_A^j$ with $\mathrm{opt}_A^j = 0$ (we proved in Subproofs B.1, B.2 and B.3 that A did not receive an optional time unit in $TS^j$ in $S_{BF2}$). Consequently, the above inequality can be rewritten as

$$\mathrm{lub}_A^j + (b_k - b_j)\, U_A \ge 1$$

And so

$$(b_k - b_j) \ge \frac{1 - \mathrm{lub}_A^j}{U_A} \tag{13}$$
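As a small numeric illustration of Inequality 13, the following sketch evaluates the lower bound on the distance $b_k - b_j$ for a hypothetical task. The function name and the values of the lag upper bound and utilization are assumptions made only for this example.

```python
from fractions import Fraction


def min_span(lub_j, utilization):
    """Right-hand side of Inequality 13: (1 - lub_A^j) / U_A."""
    return (1 - lub_j) / utilization


# Hypothetical values: lub_A^j = 1/4 and U_A = 1/2.
lub_j, U_A = Fraction(1, 4), Fraction(1, 2)
bound = min_span(lub_j, U_A)
print(bound)

# At the bound, lub_A^j + (b_k - b_j) * U_A reaches exactly 1, which is the
# inequality from which (13) is derived.
reaches_one = lub_j + bound * U_A == 1
```

For these values the bound is 3/2, so with integer boundaries at least two time units must separate $b_j$ from $b_k$.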

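Now, suppose that the optional unit is withdrawn from B in $TS^j$ in $S_{BF2}$ (i.e., $\mathrm{opt}_B^j = 0$) and suppose (by contradiction) that B does not execute in any time slice $TS^p$ such that $j < p \le k$. Therefore, in any such $TS^p$ we have

$$\mathrm{lag}_B^{p-1} + L^p U_B < 1$$

And in particular (since $p \le k$) we have in $TS^k$ in $S_{BF2}$

$$\begin{aligned}
1 &> \mathrm{lag}_B^{k-1} + L^k U_B \\
  &= \mathrm{lub}_B^{k-1} - \mathrm{opt}_B^{k-1} + L^k U_B
\end{aligned}$$

Since $\mathrm{opt}_B^{k-1} = 0$ we have

$$\begin{aligned}
1 &> \mathrm{lub}_B^{k-1} + L^k U_B \\
  &= \mathrm{lag}_B^{k-2} + L^{k-1} U_B - \mathrm{mand}_B^{k-1} + L^k U_B
\end{aligned}$$

Since $\mathrm{mand}_B^{k-1} = 0$, it holds that

$$\begin{aligned}
1 &> \mathrm{lag}_B^{k-2} + L^{k-1} U_B + L^k U_B \\
  &= \mathrm{lag}_B^{k-2} + (L^{k-1} + L^k)\, U_B \\
  &= \dots \\
  &= \mathrm{lag}_B^{j} + (L^{j+1} + \dots + L^{k-1} + L^k)\, U_B \\
  &= \mathrm{lag}_B^{j} + (b_k - b_j)\, U_B
\end{aligned}$$

From line 12 of Algorithm 1 we know that $\mathrm{lag}_B^j = \mathrm{lub}_B^j - \mathrm{opt}_B^j$ and, since the optional unit has been withdrawn from B, we have $\mathrm{opt}_B^j = 0$. Consequently, the above inequality can be rewritten as

$$\mathrm{lub}_B^j + (b_k - b_j)\, U_B < 1$$

which implies

$(b_k - b_j)$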
