
    On the Design of an Optimal Multiprocessor Real-Time Scheduling Algorithm under Practical Considerations (Extended Version)

    Shelby Funk, Vincent Nelis¹, Joel Goossens, Dragomir Milojevic, Geoffrey Nelissen¹

    January 27, 2010

    Abstract

    This research addresses the multiprocessor scheduling problem of hard real-time systems, and it especially focuses on optimal and global schedulers when practical constraints are taken into account. First, we propose an improvement of the optimal algorithm BF. We formally prove that our adaptation is (i) optimal, i.e., it always generates a feasible schedule as long as such a schedule exists, and (ii) valid, i.e., it complies with all the requirements. We also show that it outperforms BF by providing a computing complexity of O(n), where n is the number of tasks to be scheduled. Next, we propose a schedulability analysis which indicates a priori whether the real-time application can be scheduled by our improvement of BF without missing any deadline. This analysis is, to the best of our knowledge, the first such test for multiprocessors that takes into account all the main overheads generated by the Operating System.

    1 Introduction

    The first global and optimal algorithm for scheduling hard real-time systems on multiprocessor platforms was proposed by Baruah et al. in [1], where the tasks are assumed to be periodic and have implicit deadlines. The approach proposed in [1] is named PF, for Proportionate Fair scheduling. The idea is to divide the time into time slots and to schedule every task in such a way that each one receives an amount of time slots proportional to its utilization. This was named the fairness property in [1]. Different adaptations of PF have been published in the last few years and, among the most interesting ones, one can cite PD [2] and PD2 [3]. These adaptations follow the same idea as PF but succeed in considerably reducing the computing complexity of the scheduling decisions. The main drawback of all Pfair-like approaches resides in the fact that a scheduling decision must be taken at each time unit. That is, although the theoretical correctness of the algorithms PF, PD and PD2 is undeniable, their application in the real world is limited due to the large number of overheads (especially preemption overheads) that are generated during the execution of the system. Notice that, during an overhead on any processor (say $P_k$), only the Operating System is running while the tasks supposed to be executed on $P_k$ are waiting for execution. Therefore, there exist some intervals of time in the actual schedule where a task is not executed while it was supposed to be executed in the theoretical schedule. Since the existing schedulability analyses do not take overheads into account, a task can sometimes miss its deadline even if the system was said to be feasible in theory.

    More recently, Zhu et al. investigated the problem of the overheads and designed a new class of algorithms named Boundary fair schedulers (i.e., Bfair-like algorithms). Their idea is to ensure the fairness only at boundaries, i.e., some particular events occurring during the system execution, resulting in a considerable reduction of the number of decisions taken by the scheduler, as well as the cost of the overheads, compared to Pfair-like algorithms. Among the most interesting studies in this area, one can cite [4], [5] and [6]. Basically, these Bfair-like schedulers can be divided into two categories: the continuous-time Bfair schedulers, i.e., the time is assumed to be continuous, and the quantum-based Bfair schedulers that divide the time into time slots. To the best of our knowledge, the most recent approach belonging to the continuous-time category is LRE-TL [6]. This algorithm was proved to be optimal but, like every scheduler that belongs to this category, it suffers from another drawback. Indeed, such schedulers can sometimes preempt a task $\tau_i$ by another task $\tau_j$ only to execute an extremely small portion of $\tau_j$. That is, the generated schedule can undergo a preemption overhead only to execute an insignificant portion of a task, hence producing schedules in which the time actually used to execute the tasks is limited. On the other hand, quantum-based techniques have been designed in order to ensure that the Operating System only preempts a task $\tau_i$ by another task $\tau_j$ if $\tau_j$ has to execute for at least one time quantum. In our opinion, the most interesting study in this category is [4]. The proposed scheduler BF is optimal and it considerably reduces the number of preemptions compared to continuous-time Bfair and Pfair-like schedulers. However, the complexity of BF is $O(n \cdot T_{\max})$ where $T_{\max}$ is the largest period of the tasks², and since $T_{\max}$ could be quite large, the overhead due to the execution of the scheduler can be large as well.

    ¹ Supported by the Belgian National Science Foundation (F.N.R.S.) under a F.R.I.A. grant.

    arXiv:1001.4115v2 [cs.OS] 27 Jan 2010

    Contribution of this work. Our first contribution is an improvement of the algorithm BF that confirms that its complexity can actually be reduced to O(n). Our second contribution is a schedulability test that takes into account all the main overheads generated by the Operating System (i.e., preemption costs, migration costs, the time needed to handle task arrivals, to execute the scheduling algorithm, etc.).

    Paper organization. In Section 2, we introduce our system model and give some preliminary definitions. In Section 3, we describe our improvement of BF (that we name BF2) and we define the notations used throughout the paper. In Section 4, we prove both the optimality and the validity of BF2. In Section 5, we identify all the main overheads that occur in the schedules produced while running BF2 on our multiprocessor architecture. In Section 6, we model all these overheads and we include them into our schedulability analysis. Finally, in Section 7, we introduce our future work and we conclude.

    2 System model and preliminaries

    We consider multiprocessor platforms composed of a known and fixed number $m$ of identical processors $\{P_1, P_2, \ldots, P_m\}$ upon which a real-time system is executed. Identical means that all the processors have the same profile (in terms of consumption, computational capabilities, etc.) and are interchangeable.

    A periodic real-time system $\tau$ is a set of $n$ functionalities denoted by $\{\tau_1, \tau_2, \ldots, \tau_n\}$. Every functionality $\tau_i$ is modeled by a periodic implicit-deadline task characterized by two parameters $(C_i, T_i)$: a Worst-Case Execution Time (WCET) $C_i$ and an inter-arrival delay $T_i$, with the interpretation that the task $\tau_i$ generates successive jobs $\tau_{i,j}$ (with $j = 1, \ldots, \infty$) arriving at times $a_{i,j}$ such that $a_{i,j} = a_{i,j-1} + T_i$ (with $a_{i,1} = 0$). Each such job has an execution time of at most $C_i$ time units and must be completed at (or before) the next task release $a_{i,j+1}$. During the scheduling of the tasks, we assume that the time is divided into time slots (that we name time units hereafter) and both parameters $C_i$ and $T_i$ of every task $\tau_i$ are expressed as an integer multiple of the time unit. We assume that any time unit on any CPU can only be allocated to one job and a job can only execute on one CPU at any time (i.e., job parallelism is forbidden).

    The utilization $U_i$ of any task $\tau_i$ is defined as $U_i \stackrel{\mathrm{def}}{=} C_i / T_i$ with $0 < U_i \le 1$, and the total utilization $U_{sum}$ of the system is defined as the sum of the utilization of every task, i.e., $U_{sum} \stackrel{\mathrm{def}}{=} \sum_i U_i$. In this work, we assume with no loss of generality that $U_{sum} = m$. Indeed, if the total utilization is initially not equal to $m$, the WCET of the tasks can be overestimated (hence increasing their individual utilization) so that a total utilization of $m$ is reached. We also assume that all the tasks are independent, i.e., there is no communication, no precedence constraint and no shared resource (except the processors) between them. Although this assumption is highly unrealistic, it is a necessary first step toward a practical implementation and we will reconsider it in our future work.

    ² Although the authors claimed that the sub-routine responsible for this $T_{\max}$ factor can be replaced by another one running in O(1) (leading to a complexity of O(n) for BF), the proof of the equivalence between these two sub-routines does not appear in the literature and, in our opinion, this equivalence is not obvious at all.
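    As an illustration of the utilization definitions of this section, the following minimal sketch computes $U_i$ and $U_{sum}$ exactly with rational arithmetic. The task set, the processor count and the function names are hypothetical and chosen only so that $U_{sum} = m$ holds.

```python
from fractions import Fraction

def utilizations(tasks):
    """Exact utilization U_i = C_i / T_i for each (C_i, T_i) pair."""
    return [Fraction(c, t) for c, t in tasks]

def total_utilization(tasks):
    """U_sum, the sum of all task utilizations."""
    return sum(utilizations(tasks), Fraction(0))

# Hypothetical task set: (WCET, period) pairs in integer time units.
example_tasks = [(1, 2), (2, 3), (5, 6)]
example_m = 2  # hypothetical number of processors

# Model assumptions of this section: 0 < U_i <= 1 and U_sum = m.
print(all(0 < u <= 1 for u in utilizations(example_tasks)))
print(total_utilization(example_tasks) == example_m)
```

    Using `Fraction` rather than floating point keeps the equality test $U_{sum} = m$ exact.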

    Definition 1 (Boundary) A boundary is a time-instant which corresponds to a task release. Notice that boundaries

    are indexed from b0 = 0 at which every task releases its first job according to our task model.

    Definition 2 (Time slice) A time slice $TS^k \stackrel{\mathrm{def}}{=} [b_{k-1}, b_k)$ is a time interval between two consecutive and distinct task arrivals in the schedule, i.e., two consecutive and distinct boundaries. The length of the $k$th time slice $TS^k$ (with $k = 1, \ldots, \infty$) is denoted by $L^k \stackrel{\mathrm{def}}{=} b_k - b_{k-1}$.
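    Definitions 1 and 2 can be illustrated with a short sketch that enumerates the boundaries of a synchronous periodic task set up to a horizon and derives the time-slice lengths. The function names and the example periods are hypothetical.

```python
def boundaries(periods, horizon):
    """Sorted distinct release instants in [0, horizon] (b_0 = 0 first)."""
    return sorted({k * t for t in periods for k in range(horizon // t + 1)})

def slice_lengths(bs):
    """L^k = b_k - b_{k-1} for consecutive boundaries."""
    return [b - a for a, b in zip(bs, bs[1:])]

# Hypothetical periods; the horizon is their least common multiple.
example_boundaries = boundaries([2, 3, 6], 6)
example_lengths = slice_lengths(example_boundaries)
print(example_boundaries, example_lengths)
```

    Duplicate release instants (several tasks released at the same time) collapse into a single boundary because a set is used before sorting.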

    Definition 3 (Fluid schedule) A schedule is said to be fluid if and only if at any time $t \ge 0$, every task $\tau_i$ has executed exactly $U_i \cdot t$ execution units. Notice that, unfortunately, such a schedule is impossible to actually implement.

    Targeting systems in which the time is divided into time units, the scheduler cannot allocate a fraction of a time unit to the tasks. Thereby, we define the lag as follows.

    Definition 4 (Lag) The lag of a task $\tau_i$ at any time $t$ is the difference between $U_i \cdot t$ (i.e., the amount of execution units executed in the fluid schedule at time $t$) and the amount of time units that $\tau_i$ has already executed at time $t$ in the actual schedule.

    Definition 5 (Boundary fair schedule [4]) A periodic schedule is said to be boundary fair if and only if the absolute value of the lag of every task $\tau_i$ at any boundary $b_k$ is strictly less than one time unit.
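    The lag and the boundary-fairness condition can be checked mechanically. The sketch below is only an illustration: the function names, the data layout (`executed_at[k][i]` is the cumulative number of units executed by task i up to boundary `bs[k]`) and the example values are assumptions.

```python
from fractions import Fraction

def lag(u, t, executed):
    """Lag at time t: fluid amount U_i * t minus units actually executed."""
    return u * t - executed

def is_boundary_fair(utils, bs, executed_at):
    """True if |lag| < 1 for every task at every listed boundary."""
    return all(
        abs(lag(u, b, row[i])) < 1
        for b, row in zip(bs, executed_at)
        for i, u in enumerate(utils)
    )

# Hypothetical utilizations and cumulative execution at boundaries 0, 2, 3.
example_utils = [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6)]
fair = is_boundary_fair(example_utils, [0, 2, 3],
                        [[0, 0, 0], [1, 1, 2], [2, 2, 2]])
unfair = is_boundary_fair(example_utils, [0, 2], [[0, 0, 0], [2, 2, 0]])
print(fair, unfair)
```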

    Lemma 1 (from [4]) For the multiprocessor periodic scheduling problem, if the utilization of every task $\tau_i$ is lower than 1 and the total utilization $U_{sum}$ is not larger than the number $m$ of processors, a boundary fair schedule exists.

    3 The BF2 algorithm

    3.1 Description of the algorithm

    The BF2 algorithm proposed in this paper is an improvement of the BF algorithm proposed in [4]. BF2 benefits from a lower computing complexity, thereby reducing the time overheads due to the execution of the scheduler. The main idea of the BF algorithm is that it schedules tasks in a manner that remains reasonably close to the fluid schedule. BF schedules the tasks so that, at every boundary, the amount of time each task has executed differs from the fluid schedule by less than 1 time unit. At each boundary, all the time units between the current and the next boundary are allocated to tasks. That is, each task first receives a mandatory integer amount of time units in such a way that the fairness is guaranteed at the next boundary. Then, if some units remain unallocated, an additional method (using dynamic priorities) selects the tasks that will receive these units. In BF, this additional method has a computing complexity of $O(n \cdot T_{\max})$, where $T_{\max} \stackrel{\mathrm{def}}{=} \max_i\{T_i\}$, whereas the complexity is O(n) in BF2. Algorithm 1 gives the pseudo-code of BF2. The only difference with BF resides between lines 8 and 11 (included), i.e., the method for allocating the remaining unallocated units.

    In the following, we define and describe the notations used in Algorithm 1. The lag of a task $\tau_i$ at the end of the time slice $TS^k$, denoted $\mathrm{lag}_i^k$, corresponds to the lag of $\tau_i$ at boundary $b_k$. Any task $\tau_i$ with a positive, null or negative lag is said to be behind, punctual or ahead, respectively. Notice that, at the beginning of the first time slice $TS^1$, we have $\mathrm{lag}_i^0 = 0\ \forall i$ since every task releases its first job at time 0 and a job can only be punctual at its release time. The mandatory time units $\mathrm{mand}_i^k$ allocated to $\tau_i$ at line 4 is the amount of time units that $\tau_i$ must execute in $TS^k$ so that its lag at the next boundary $b_k$ is strictly less than 1. The lag upper bound of a task $\tau_i$ in a time slice $TS^k$, denoted $\mathrm{lub}_i^k$, is computed at line 5 and corresponds to the lag at boundary $b_k$ if the scheduler only assigns the mandatory units to $\tau_i$. Once all the tasks have received their mandatory time units, there may be some remaining units that are not allocated to the tasks due to the use of the floor operator at line 4. These so-called remaining units are denoted $RU^k$ and are computed at line 7. From lines 8 to 11, the $RU^k$ remaining units are allocated to the tasks as optional units (details are given below) and finally, the lags are updated accordingly at line 12. Notice that, according


    Figure 2: The upper schedule is produced by BF whereas the other one is generated by BF2. In the schedule of BF2, the local schedule in the time slice $TS^{k+1}$ has been horizontally flipped compared to the local schedule in the previous time slice $TS^k$, thus avoiding 3 preemptions and 2 migrations.

    part of box) that overflows from $TS^k$ are packed to the next processor and this process continues until all the tasks are allocated.

    The complexity of the whole algorithm BF2 can be easily proved to be O(n). Indeed, the complexity is obviously O(n) from line 2 to line 8 (included). Using any linear-comparison selection algorithm (e.g., [7]) provides a linear complexity to line 9, and lines 10, 11 and 12 are obviously in O(n). Finally, the function GenerateSchedule() clearly runs in O(n), leading to a total complexity of O(n) for BF2.
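    To make the selection step concrete, here is a minimal sketch of picking the r items with the largest keys without sorting. It uses a random-pivot quickselect, which runs in expected linear time only; a worst-case linear bound, as required by the argument above, needs a deterministic pivot rule such as median of medians. This is not claimed to be the algorithm of [7], and the keys stand for hypothetical task priorities.

```python
import random

def top_r(keys, r):
    """Indices of the r largest keys (any order), expected O(n) time."""
    idx = list(range(len(keys)))
    chosen = []
    while r > 0 and idx:
        pivot = keys[random.choice(idx)]
        greater = [i for i in idx if keys[i] > pivot]
        equal = [i for i in idx if keys[i] == pivot]
        smaller = [i for i in idx if keys[i] < pivot]
        if r <= len(greater):
            # All winners lie among the strictly greater keys.
            idx = greater
        elif r <= len(greater) + len(equal):
            # Take every greater key, complete with keys equal to the pivot.
            chosen += greater + equal[:r - len(greater)]
            r = 0
        else:
            # Take both groups and keep searching among the smaller keys.
            chosen += greater + equal
            r -= len(greater) + len(equal)
            idx = smaller
    return chosen

# Hypothetical priorities of five tasks; select the three largest.
print(sorted(top_r([0.2, 0.9, 0.5, 0.9, 0.1], 3)))
```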

    3.2 Differences between BF and BF2

    The main difference between BF and the adaptation BF2 that we propose here resides at line 9 of Algorithm 1, where both algorithms determine which tasks will receive an optional time unit. In BF, the authors drew inspiration from the approach PF proposed in [1] and their resulting procedure that selects the tasks runs in $O(n \cdot T_{\max})$, where $T_{\max}$ denotes the largest period of the tasks. However, in Section 3 of their paper (see [4]), the authors claim that this selection procedure could be improved in order to run in O(n) and they briefly explain the idea behind their intuition. Although they provided all the tools needed to start a deeper investigation, they never provided the proofs that would confirm their intuition. In this paper, we prove with our selection procedure used at line 9 of BF2 that such an improvement is possible since it actually runs in O(n).

    The second difference between BF and BF2 resides in the procedure GenerateSchedule() at line 13 of Algorithm 1. Indeed, this procedure receives a third argument in BF2, i.e., (k mod 2), that alternates between 0 and 1 between two consecutive time slices. This binary value indicates whether the schedule generated in the current time slice must be horizontally flipped (see Figure 2). This flipping operation considerably reduces the number of preemptions and/or migrations that can occur at both extremities of every time slice. However, we did not determine in this study the minimal and maximal number of preemptions and/or migrations that this technique avoids.
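    The effect of the flipping argument can be sketched as follows. The body of GenerateSchedule() is not reproduced in this text, so the packing rule below (tasks laid out one after the other, overflowing onto the next processor) and all names are assumptions made for illustration only.

```python
def generate_schedule(alloc, m, length, flip):
    """Pack alloc[i] units of task i into an m x length grid.

    Assumed rule: wrap-around packing, then an optional horizontal flip.
    """
    if sum(alloc) > m * length or any(u > length for u in alloc):
        raise ValueError("allocation does not fit in the time slice")
    grid = [[None] * length for _ in range(m)]
    slot = 0
    for task, units in enumerate(alloc):
        for _ in range(units):
            proc, t = divmod(slot, length)  # overflow goes to next processor
            grid[proc][t] = task
            slot += 1
    if flip:
        grid = [row[::-1] for row in grid]  # mirror the slice in time
    return grid

# Hypothetical allocation reused in two consecutive slices of length 2.
slice_k = generate_schedule([1, 1, 2], 2, 2, flip=False)
slice_k1 = generate_schedule([1, 1, 2], 2, 2, flip=True)
print(slice_k, slice_k1)
```

    In this example the task running last on each processor in the first slice is also the one running first in the flipped slice, so no preemption occurs at the boundary between them.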

    4 Formal proofs

    4.1 Optimality of BF2

    Lemma 2 Any discrete-time boundary fair schedule is feasible.


    Proof The proof is a direct consequence of the fact that every job in the schedule can only execute an integer amount of time units within any interval of time. As a result, since the absolute value of the lag of any job $\tau_{i,j}$ is strictly less than 1 at its deadline (which corresponds to a boundary), we know that $\tau_{i,j}$ executed at least $C_i$ time units. Otherwise, its lag (i.e., $\mathrm{lag}_i^k$) would be greater than or equal to 1.

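    Lemma 3 For any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $U_{sum} = m$ and $U_i \le 1\ \forall i$, Algorithm 1 generates boundary fair schedules (i.e., $|\mathrm{lag}_i^k| < 1\ \forall i, \forall TS^k$) and it also holds $\forall i, \forall TS^k$ that $|\mathrm{lub}_i^k| < 1$.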

    Proof The proof is obtained by induction on k.

    Basis: At time 0, we know that the lag $\mathrm{lag}_i^0$ and the lag upper bound $\mathrm{lub}_i^0$ of every task $\tau_i$ are 0. Therefore we obviously have $|\mathrm{lag}_i^0| < 1$ and $|\mathrm{lub}_i^0| < 1\ \forall i$.

    Inductive step: Assuming that $|\mathrm{lag}_i^{k-1}| < 1$ and $|\mathrm{lub}_i^{k-1}| < 1\ \forall i$, we prove in this step that $|\mathrm{lag}_i^k| < 1$ and $|\mathrm{lub}_i^k| < 1\ \forall i$. According to Algorithm 1, two cases may occur at line 4: $\mathrm{mand}_i^k$ is set either to 0 or to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$. We prove in the following that it provides $|\mathrm{lub}_i^k| < 1\ \forall i$ in both cases, resulting in $|\mathrm{lag}_i^k| < 1\ \forall i$. Let $\tau_i$ be any task in $\tau$:

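    Case 1. $\mathrm{mand}_i^k$ is set to 0. Thus we know from line 4 of Algorithm 1 that

    $$\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor \le 0$$

    and it holds that

    $$\mathrm{lag}_i^{k-1} + L^k \cdot U_i < 1$$

    Since by assumption $|\mathrm{lag}_i^{k-1}| < 1$ and $(L^k \cdot U_i)$ is positive,

    $$\mathrm{lag}_i^{k-1} + L^k \cdot U_i > -1$$

    leading to

    $$|\mathrm{lag}_i^{k-1} + L^k \cdot U_i| < 1 \qquad (1)$$

    The calculation of the lag upper bounds $\mathrm{lub}_i^k$ at line 5 of Algorithm 1 provides $\forall i$

    $$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k$$

    And since $\mathrm{mand}_i^k$ has been set to 0 in this case, the previous equality can be rewritten as

    $$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i$$

    leading to

    $$|\mathrm{lub}_i^k| < 1$$

    according to Inequality 1.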

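    Case 2. $\mathrm{mand}_i^k$ is set to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$. Thus we know from line 4 of Algorithm 1 that

    $$\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor > 0 \qquad (2)$$

    The calculation of the lag upper bounds $\mathrm{lub}_i^k$ at line 5 of Algorithm 1 provides $\forall i$

    $$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k$$

    And since $\mathrm{mand}_i^k$ has been set to $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$ in this case, the previous equality can be rewritten as

    $$\mathrm{lub}_i^k = \mathrm{lag}_i^{k-1} + L^k \cdot U_i - \lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor$$

    Since we know that $\lfloor \mathrm{lag}_i^{k-1} + L^k \cdot U_i \rfloor > 0$ according to Inequality 2, and since a number minus its floor always lies in $[0, 1)$, we get

    $$0 \le \mathrm{lub}_i^k < 1$$

    And so

    $$|\mathrm{lub}_i^k| < 1$$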


    The two cases presented above prove that after line 5 of Algorithm 1, we have $\forall i$

    $$|\mathrm{lub}_i^k| < 1 \qquad (3)$$

    Then, the calculation of $\mathrm{lag}_i^k$ at line 12 provides $\forall i$

    $$\mathrm{lag}_i^k = \mathrm{lub}_i^k - \mathrm{opt}_i^k$$

    And since $\mathrm{opt}_i^k$ can only be set to 1 if $\mathrm{lub}_i^k > 0$, it holds $\forall i$ that

    $$|\mathrm{lag}_i^k| < 1 \qquad (4)$$

    The lemma follows from Inequalities 3 and 4.
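    The bounds of Lemma 3 can be checked numerically on a small example. The sketch below applies the formulas that the proofs attribute to lines 4, 5, 7 and 12 of Algorithm 1. The rule used to pick which eligible tasks receive an optional unit (largest lag upper bound first, ties broken by index) is a placeholder assumption and not the selection rule of BF2; the task set and names are hypothetical.

```python
import math
from fractions import Fraction

def bf2_like_step(lags, utils, length, m):
    """One time slice: returns (mand, opt, lub, new_lags)."""
    x = [lag + length * u for lag, u in zip(lags, utils)]
    mand = [max(0, math.floor(v)) for v in x]       # line 4
    lub = [v - d for v, d in zip(x, mand)]          # line 5
    ru = m * length - sum(mand)                     # line 7
    eligible = [i for i in range(len(utils))
                if lub[i] > 0 and mand[i] < length]
    # Placeholder selection rule (assumption, not the rule of BF2).
    chosen = sorted(eligible, key=lambda i: (-lub[i], i))[:max(ru, 0)]
    opt = [1 if i in chosen else 0 for i in range(len(utils))]
    new_lags = [b - o for b, o in zip(lub, opt)]    # line 12
    return mand, opt, lub, new_lags

def simulate(utils, lengths, m):
    """Run successive slices, checking |lub| < 1 and |lag| < 1 each time."""
    lags = [Fraction(0)] * len(utils)
    for length in lengths:
        mand, opt, lub, lags = bf2_like_step(lags, utils, length, m)
        assert all(abs(v) < 1 for v in lub)
        assert all(abs(v) < 1 for v in lags)
    return lags

# Hypothetical task set with U_sum = 2 on m = 2 processors.
example_u = [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6)]
final_lags = simulate(example_u, [2, 1, 1, 2], 2)
print(final_lags)
```

    The two assertions inside `simulate` correspond to Inequalities 3 and 4; they do not depend on the placeholder selection rule, since an optional unit is only ever given to a task with a positive lag upper bound.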

    Corollary 1 For any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $U_{sum} = m$ and $U_i \le 1\ \forall i$, Algorithm BF2 generates a feasible schedule.

    Proof The proof is a direct consequence of Lemmas 2 and 3.

    4.2 Validity of BF2

    In this section, we prove that for any system $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$ provided $\sum_i U_i = m$ and $U_i \le 1\ \forall i$, Algorithm BF2 is valid in the sense that it ensures the four following properties:

    1. In any time slice $TS^k$, there are at least $RU^k$ eligible tasks (see Lemma 6), i.e.,

       $$\sum_i \mathrm{opt}_i^k = RU^k$$

    2. In any time slice $TS^k$, the allocated mandatory work can be executed within the time slice (see Lemma 7), i.e.,

       $$m \cdot L^k - \sum_i \mathrm{mand}_i^k \ge 0$$

    3. In any time slice $TS^k$, the total amount of time units allocated to the tasks is never larger than the amount of available time units (see Lemma 9), i.e., at the boundary $b_{k-1}$ of any time slice $TS^k \stackrel{\mathrm{def}}{=} [b_{k-1}, b_k)$, it holds that

       $$\sum_i (\mathrm{mand}_i^k + \mathrm{opt}_i^k) \le m \cdot L^k$$

    4. No task is allocated more than $L^k$ time units during any time slice $TS^k$, so job parallelism is avoided (see Lemma 10), i.e.,

       $$\mathrm{mand}_i^k + \mathrm{opt}_i^k \le L^k$$

    Lemma 4 In any time slice $TS^k$, if $\sum_i \mathrm{lag}_i^{k-1} = 0$ then applying Algorithm 1 provides $RU^k = \sum_i \mathrm{lub}_i^k$.

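    Proof Assuming that $\sum_i \mathrm{lag}_i^{k-1} = 0$, line 5 of Algorithm 1 provides

    $$\mathrm{lub}_i^k = (\mathrm{lag}_i^{k-1} + L^k \cdot U_i) - \mathrm{mand}_i^k.$$

    And therefore, using $\sum_i U_i = m$,

    $$\begin{aligned}
    \sum_i \mathrm{lub}_i^k &= \sum_i (\mathrm{lag}_i^{k-1} + L^k \cdot U_i - \mathrm{mand}_i^k) \\
    &= \sum_i \mathrm{lag}_i^{k-1} + \sum_i (L^k \cdot U_i) - \sum_i \mathrm{mand}_i^k \\
    &= L^k \cdot m - \sum_i \mathrm{mand}_i^k \\
    &= RU^k
    \end{aligned}$$

    which states the lemma.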

    In the following proofs, we will use the notation $\mathrm{lub}_i^k(S)$ to refer to the lag upper bound of the task $\tau_i$ in the time slice $TS^k$ in the schedule $S$. The notations $\mathrm{lag}_i^k(S)$, $\mathrm{mand}_i^k(S)$, $\mathrm{opt}_i^k(S)$, ... are similarly defined.

    Lemma 5 Let $S_{BF}$ and $S_{BF2}$ denote the schedules produced by BF and BF2, respectively, while running any real-time system $\tau$ providing $U_{sum} = m$ and $U_i \le 1\ \forall i$ on $m$ processors. At any boundary $b_k$ with $k \ge 0$, it holds $\forall i$ that $\mathrm{lag}_i^k(S_{BF2})$ is equal to either $\mathrm{lag}_i^k(S_{BF})$ or $\mathrm{lag}_i^k(S_{BF}) + 1$ or $\mathrm{lag}_i^k(S_{BF}) - 1$. Hereafter, we denote this by

    $$\mathrm{lag}_i^k(S_{BF2}) = \mathrm{lag}_i^k(S_{BF})\,\{\pm 1\}$$

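    Proof From Lemma 3, we know that BF2 is boundary fair, and algorithm BF has also been proved to be boundary fair in [4]. This means that both algorithms have a lag strictly less than 1 time unit in absolute value and, by definition of both algorithms, for every task $\tau_i$ in every time slice $TS^k$ it holds in each schedule that

    $$\sum_{\ell=1}^{k} (\mathrm{mand}_i^\ell + \mathrm{opt}_i^\ell) = \lfloor U_i \cdot b_k \rfloor \quad \text{or} \quad \sum_{\ell=1}^{k} (\mathrm{mand}_i^\ell + \mathrm{opt}_i^\ell) = \lceil U_i \cdot b_k \rceil$$

    which states the lemma.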
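    The step used in this proof, namely that an integer execution total whose lag is below 1 in absolute value can only be the floor or the ceiling of $U_i \cdot b_k$, can be checked by brute force. The function name and the example values are hypothetical.

```python
import math
from fractions import Fraction

def boundary_fair_amounts(u, b):
    """Integer execution totals e in [0, b] with |u*b - e| < 1."""
    fluid = u * b
    return [e for e in range(0, b + 1) if abs(fluid - e) < 1]

def only_floor_or_ceil(u, b):
    """True if every admissible total is floor(u*b) or ceil(u*b)."""
    fluid = u * b
    allowed = {math.floor(fluid), math.ceil(fluid)}
    return set(boundary_fair_amounts(u, b)) <= allowed

# Hypothetical utilization 2/3 checked at boundaries 1..12.
print(all(only_floor_or_ceil(Fraction(2, 3), b) for b in range(1, 13)))
```

    Two schedules that each pick one of these two totals therefore differ by at most one unit, which is the statement of Lemma 5.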

    Lemma 6 In any time slice $TS^k$, there are at least $RU^k$ eligible tasks, i.e.,

    $$\sum_i \mathrm{opt}_i^k = RU^k$$

    Proof The proof is intricate, so Figure 3 is provided to help the reader. This picture depicts two schedules produced for the given set of tasks. The upper one, denoted $S_{BF}$, is the schedule produced by the optimal algorithm BF whereas the other one, denoted $S_{BF2}$, is produced by our algorithm BF2.

    First time slice: At time $b_0 = 0$, we have $\mathrm{lag}_i^0(S_{BF2}) = 0\ \forall i$. Therefore, $\mathrm{mand}_i^1(S_{BF2}) = \lfloor L^1 \cdot U_i \rfloor\ \forall i$ (from line 4), $\mathrm{lub}_i^1(S_{BF2}) \ge 0\ \forall i$ (from line 5) and $RU^1(S_{BF2}) = \sum_i \mathrm{lub}_i^1(S_{BF2})$ (from Lemma 4). Because $\mathrm{lub}_i^1(S_{BF2}) = 0$ for any task $\tau_i$ with $\mathrm{mand}_i^1(S_{BF2}) = L^1$, there are at least $RU^1(S_{BF2})$ tasks $\tau_j$ such that $\mathrm{lub}_j^1(S_{BF2}) > 0$ and $\mathrm{mand}_j^1(S_{BF2}) < L^1$. As a result, all the remaining units are allocated to the tasks in the first time slice.

    Next time slices: Let $TS^k$ be any time slice such that $k > 1$ and suppose that in $S_{BF2}$ there is no unallocated time unit within $[b_0, b_{k-1})$. Then, by contradiction, suppose that in $TS^k$ in $S_{BF2}$, there is at least one remaining unit that cannot be allocated to any task, i.e., there are fewer than $RU^k$ eligible tasks. This situation is illustrated in Figure 3, where an unallocated unit (henceforth called hole) is depicted in time slice $TS^k$. Below, we prove that the hole can be shifted in $S_{BF2}$ from one time slice to another but it will always stay confined between $TS^1$ (i.e., the first time slice) and $TS^k$ (the first time slice from $TS^1$ in which it appears). Therefore, if $S_{BF2}$ has a hole in any time slice within $TS^1$ and $TS^k$ then $S_{BF}$ must also have a hole between these two time slices, hence contradicting its optimality.

    Figure 3: Illustration of the schedules $S_{BF}$ and $S_{BF2}$ produced by algorithms BF and BF2, respectively, when a hole appears in $TS^k$ in $S_{BF2}$.

    Since there is an unallocated time unit in $TS^k$ (i.e., a hole), we know that there exists at least one task (say $\tau_A$) that executes longer in the time slice $TS^k$ in $S_{BF}$ than in $S_{BF2}$. By Lemma 5, we know that $\mathrm{lag}_A^{k-1}(S_{BF}) - \mathrm{lag}_A^{k-1}(S_{BF2}) \in \{-1, 0, 1\}$. Below, we show that this difference cannot be 0 or -1. Hence,

    $$\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2}) + 1 \qquad (5)$$

    meaning that BF schedules $\tau_A$ to be further behind the ideal schedule at time $b_{k-1}$ than BF2 does. Indeed, we show in the following that the two other possible cases (from Lemma 5) lead to a contradiction.

    Case 1. Suppose that $\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2})$. Line 4 of Algorithm 1 yields $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2})$ and from line 5,

    $$\mathrm{lub}_A^k(S_{BF}) = \mathrm{lub}_A^k(S_{BF2})$$

    In order to execute longer in $S_{BF}$, $\tau_A$ must receive an optional unit in $TS^k$ in $S_{BF}$ and not in $S_{BF2}$. However, if $\tau_A$ receives an optional unit in $TS^k$ in $S_{BF}$ then it holds that $\mathrm{lub}_A^k(S_{BF}) > 0$ and $\mathrm{mand}_A^k(S_{BF}) < L^k$ and therefore, it also holds that $\mathrm{lub}_A^k(S_{BF2}) > 0$ and $\mathrm{mand}_A^k(S_{BF2}) < L^k$ since the lag upper bounds are equal in both schedules. It means that $\tau_A$ is eligible in $TS^k$ in $S_{BF2}$ and does not receive any optional unit, leading to a contradiction with the fact that there is a remaining unit that cannot be allocated in $TS^k$ in $S_{BF2}$.

    Case 2. Suppose that $\mathrm{lag}_A^{k-1}(S_{BF}) = \mathrm{lag}_A^{k-1}(S_{BF2}) - 1$. Line 4 of Algorithm 1 yields either $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) - 1$ or $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) = 0$.

    1. If $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) - 1$, then even if $\tau_A$ receives an optional unit in $TS^k$ in $S_{BF}$ it cannot execute longer than in $S_{BF2}$.

    2. If $\mathrm{mand}_A^k(S_{BF}) = \mathrm{mand}_A^k(S_{BF2}) = 0$, then the lag upper bounds are positive in both schedules $S_{BF}$ and $S_{BF2}$, and it leads to a contradiction for the same reason as that explained in Case 1.

    According to Equality 5, we know from the definition of the lags that there exists a time slice $TS^j$ with $1 \le j < k$ such that $\tau_A$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$. Let $A$ denote any time unit of $\tau_A$ in $TS^j$ in $S_{BF2}$ and suppose that we swap the hole with $A$ (see Figure 3). Since we assumed that $\forall \ell \in [1, k)$

    $$\sum_i (\mathrm{mand}_i^\ell(S_{BF2}) + \mathrm{opt}_i^\ell(S_{BF2})) = m \cdot L^\ell$$

    it held (before moving $A$) that in $TS^j$

    $$\sum_i (\mathrm{mand}_i^j(S_{BF2}) + \mathrm{opt}_i^j(S_{BF2})) = m \cdot L^j$$

    Since moving $A$ generated a hole in $TS^j$ in $S_{BF2}$, from the above equality there exists a task $\tau_B$ such that

    $$\mathrm{mand}_B^j(S_{BF2}) + \mathrm{opt}_B^j(S_{BF2}) < \mathrm{mand}_B^j(S_{BF}) + \mathrm{opt}_B^j(S_{BF})$$

    Two cases may occur concerning $\tau_B$: either $\mathrm{lag}_B^j(S_{BF2}) \le 0$ or $\mathrm{lag}_B^j(S_{BF2}) > 0$. First, suppose $\mathrm{lag}_B^j(S_{BF2}) \le 0$. This condition allows us to perform a backward move defined below.

    Backward move: A backward move shifts the hole from the current time slice (here, $TS^j$) to a former time slice. In the remainder of this part, $TS^j$ is assumed to be the current time slice and $\tau_B$ is the task that executes longer in $TS^j$ in $S_{BF}$ than in $S_{BF2}$. Furthermore, we know by assumption that $\mathrm{lag}_B^j(S_{BF2}) \le 0$ and we prove in the following that it yields $\mathrm{lag}_B^{j-1}(S_{BF2}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF}) > 0$ and, from the definition of the lags, there must exist a time slice $TS^q$ with $1 \le q < j$ such that $\tau_B$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$ (see Figure 3). By Lemmas 3 and 5, one of the following must hold:

    $$\mathrm{lag}_B^{j-1}(S_{BF2}) = \mathrm{lag}_B^{j-1}(S_{BF})$$

    $$\text{or} \quad \mathrm{lag}_B^{j-1}(S_{BF2}) > 0 > \mathrm{lag}_B^{j-1}(S_{BF})$$

    $$\text{or} \quad \mathrm{lag}_B^{j-1}(S_{BF2}) < 0 < \mathrm{lag}_B^{j-1}(S_{BF})$$

    Suppose $\mathrm{lag}_B^{j-1}(S_{BF2}) = \mathrm{lag}_B^{j-1}(S_{BF})$. Therefore, it holds from line 4 of Algorithm 1 that $\mathrm{mand}_B^j(S_{BF2}) = \mathrm{mand}_B^j(S_{BF})$ and from line 5,

    $$\mathrm{lub}_B^j(S_{BF2}) = \mathrm{lub}_B^j(S_{BF}) \qquad (6)$$

    Since $\tau_B$ executes longer in $S_{BF}$ than in $S_{BF2}$, we must have $\mathrm{opt}_B^j(S_{BF}) = 1$ and $\mathrm{opt}_B^j(S_{BF2}) = 0$. However, having $\mathrm{opt}_B^j(S_{BF}) = 1$ is possible only if $\mathrm{lub}_B^j(S_{BF}) > 0$ and since from Equality 6 the lag upper bounds of $\tau_B$ are equal in both schedules, it also holds that $\mathrm{lub}_B^j(S_{BF2}) > 0$. Finally, since $\mathrm{opt}_B^j(S_{BF2}) = 0$, we know from line 12 of Algorithm 1 that $\mathrm{lag}_B^j(S_{BF2}) > 0$, leading to a contradiction with our assumption.

    Suppose $\mathrm{lag}_B^{j-1}(S_{BF2}) > 0 > \mathrm{lag}_B^{j-1}(S_{BF})$. Therefore we know from Lemma 5 that $\mathrm{lag}_B^{j-1}(S_{BF}) = \mathrm{lag}_B^{j-1}(S_{BF2}) - 1$. If $\mathrm{lag}_B^{j-1}(S_{BF}) < 0$ then it holds from line 4 of Algorithm 1 that $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2})\,\{+1\}$.

    1. Suppose $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2})$. Since $\mathrm{lag}_B^{j-1}(S_{BF}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF2}) > 0$, we know that if the amounts of mandatory units are equal in both schedules then they are both equal to 0 and it holds that $\mathrm{lag}_B^{j-1}(S_{BF2}) < 1 - L^j \cdot U_B$ and $\mathrm{lag}_B^{j-1}(S_{BF}) < -L^j \cdot U_B$. Therefore, $\mathrm{lub}_B^j(S_{BF}) < 0$ from line 5 of Algorithm 1 and $\mathrm{opt}_B^j(S_{BF}) = 0$ from line 9. Consequently, $\tau_B$ does not execute longer in $S_{BF}$ than in $S_{BF2}$.

    2. Suppose $\mathrm{mand}_B^j(S_{BF}) = \mathrm{mand}_B^j(S_{BF2}) + 1$. In this case, even if $\mathrm{opt}_B^j(S_{BF}) = 0$, $\tau_B$ cannot execute longer in $S_{BF}$ than in $S_{BF2}$.

    Therefore, if $\mathrm{lag}_B^j(S_{BF2}) \le 0$, we conclude that $\mathrm{lag}_B^{j-1}(S_{BF2}) < 0$ and $\mathrm{lag}_B^{j-1}(S_{BF}) > 0$. From the definition of the lags, there exists a time slice $TS^q$ with $1 \le q < j$ such that $\tau_B$ executes one time unit more in $S_{BF2}$ than in $S_{BF}$ (see Figure 3). Informally, in any time slice $TS^\ell$ where the hole is shifted to, there will always exist a task (say $\tau_i$) that executes longer in $S_{BF}$ than in $S_{BF2}$ and a backward move can be performed as long as $\mathrm{lag}_i^\ell(S_{BF2}) \le 0$. Otherwise, if $\mathrm{lag}_i^\ell(S_{BF2}) > 0$, we can perform a forward move defined below.

    Notice that, by performing successive backward moves, the hole will finally be moved to the first time slice $TS^1$. In this case, we know that there is a task (say $\tau_D$) that executes longer in $S_{BF}$ than in $S_{BF2}$ (see Figure 3) and we know that $\tau_D$ receives the same amount of mandatory units in $TS^1$ in both schedules. Therefore, it must have received an optional unit in $S_{BF}$ and not in $S_{BF2}$ and it yields $\mathrm{lag}_D^1(S_{BF2}) > 0$. A forward move can therefore be performed.

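    Forward move: We have seen that when $\mathrm{lag}_B^j(S_{BF2}) \le 0$, we can perform a backward move. We now consider the case where $\mathrm{lag}_B^j(S_{BF2}) > 0$. In this case, we can perform a forward move, which shifts the hole from the current time slice (say $TS^j$) to another time slice (say $TS^p$) such that $j < p \le k$, where $TS^k$ is the time slice in which the hole initially appears (see Figure 3). In the following, we prove by contradiction that $\tau_B$ must execute for at least one time unit in $S_{BF2}$ within $[b_j, b_k)$. Indeed, suppose that $(\mathrm{mand}_B^\ell(S_{BF2}) + \mathrm{opt}_B^\ell(S_{BF2})) = 0$ for every time slice $TS^\ell$ within $(b_j, b_k]$. Then,

    $$\begin{aligned}
    \mathrm{lub}_B^k(S_{BF2}) &= \mathrm{lag}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
    &= \mathrm{lub}_B^{k-1}(S_{BF2}) - \mathrm{opt}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
    &= \mathrm{lag}_B^{k-2}(S_{BF2}) + L^{k-1} \cdot U_B - \mathrm{mand}_B^{k-1}(S_{BF2}) - \mathrm{opt}_B^{k-1}(S_{BF2}) + L^k \cdot U_B - \mathrm{mand}_B^k(S_{BF2}) \\
    &= \mathrm{lag}_B^{k-2}(S_{BF2}) + U_B \cdot (L^{k-1} + L^k) - \sum_{\ell=k-1}^{k} \mathrm{mand}_B^\ell(S_{BF2}) - \sum_{\ell=k-1}^{k-1} \mathrm{opt}_B^\ell(S_{BF2}) \\
    &= \ldots \\
    &= \mathrm{lag}_B^j(S_{BF2}) + U_B \cdot \sum_{\ell=j+1}^{k} L^\ell - \sum_{\ell=j+1}^{k} \mathrm{mand}_B^\ell(S_{BF2}) - \sum_{\ell=j+1}^{k-1} \mathrm{opt}_B^\ell(S_{BF2}) \\
    &= \mathrm{lag}_B^j(S_{BF2}) + (b_k - b_j) \cdot U_B
    \end{aligned}$$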

    Since $\mathrm{lag}_B^j(S_{BF2}) > 0$ and $(b_k - b_j) \cdot U_B > 0$, it holds that $\mathrm{lub}_B^k(S_{BF2}) > 0$ and by assumption $\mathrm{mand}_B^k(S_{BF2}) = 0$. Therefore, $\tau_B$ is eligible for an optional time unit in $TS^k$, leading to a contradiction with the fact that there is at least one time unit that cannot be allocated in $TS^k$ in $S_{BF2}$. From the above result, we know that there exists a time slice $TS^p$ with $j < p \le k$ such that $\tau_B$ is executed for at least one time unit, and the forward move consists in filling the hole in the current time slice $TS^j$ with any time unit of $\tau_B$ in $TS^p$ (thus shifting the hole to $TS^p$). Again, after having swapped the hole with any time unit of $\tau_B$, we know that there is at least one task, say $\tau_C$ ($\tau_C$ could be the same task as $\tau_B$), that is executed for less time in $TS^p$ in $S_{BF2}$ than in $S_{BF}$. We are therefore in the same situation as in $TS^j$.

    Each time we move the hole, whether forward or backward, we are making $S_{BF2}$ more closely resemble $S_{BF}$. In each case, the hole remains in some time slice $TS^\ell$, where $\ell \le k$. Therefore, there must also be a hole in the schedule $S_{BF}$, which contradicts its optimality proved in [4].

    Lemma 7 In any time slice $TS^k$, the allocated mandatory work can be executed within the time slice, i.e.,

    $$m \cdot L^k - \sum_i \mathrm{mand}_i^k \ge 0$$

    as long as all the time units are allocated to the tasks in every time slice $TS^\ell$ with $1 \le \ell < k$.

    Proof The proof is intricate, and Figure 4 is provided to help the reader follow it. The figure depicts two schedules produced for the same set of tasks. The upper one, denoted $S_{BF}$, is the schedule produced by the optimal algorithm BF, whereas the other one, denoted $S_{BF^2}$, is produced by our algorithm BF$^2$. Hereafter, we use the notation $lub^k_i(S)$ to refer to the lag upper bound of task $i$ in the time slice $TS^k$ in the schedule $S$. The notations $lag^k_i(S)$, $mand^k_i(S)$, $opt^k_i(S)$, ... are defined similarly.

    First time slice: In the first time slice $TS^1$ it holds that $RU^1(S_{BF^2}) \ge 0$ within $[b_0, b_1)$. Indeed, at time $b_0 = 0$, we have $lag^0_i(S_{BF^2}) = 0 \; \forall i$. Therefore, it holds that $mand^1_i(S_{BF^2}) = \lfloor L^1 \cdot U_i \rfloor \; \forall i$ (from line 4), $lub^1_i(S_{BF^2}) \ge 0 \; \forall i$


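    [Figure 4: two timelines, $S_{BF}$ (top) and $S_{BF^2}$ (bottom), showing time units of tasks A, B and C across the time slices $TS^1$, $TS^q$, $TS^j$, $TS^p$, $TS^v$ and $TS^k$, with numbered markers 1 to 7.]

    Figure 4: Illustration of the schedules $S_{BF}$ and $S_{BF^2}$ produced by algorithms BF and BF$^2$, respectively, when an overflow appears in $TS^k$ in $S_{BF^2}$.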

    (from line 5) and $RU^1(S_{BF^2}) \ge 0$ (from Lemma 4). Notice that, since the difference between algorithms BF and BF$^2$ is the method for allocating the optional time units, it holds in $TS^1$ that

    \[ \forall i: \; mand^1_i(S_{BF}) = mand^1_i(S_{BF^2}) \tag{7} \]
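The first-time-slice argument can be illustrated on a small example. This is a minimal Python sketch, assuming that line 4 of Algorithm 1 computes mand = max(0, floor(lag + U * L)) and that the total utilization does not exceed the number of processors m; the task set and the function name are hypothetical.

```python
from fractions import Fraction
from math import floor


def first_slice_mandatory(utilizations, length):
    """Mandatory units in TS^1, where every lag is 0 at b_0 = 0.

    Assumed rule (line 4 of Algorithm 1): mand = max(0, floor(lag + U * L)).
    """
    return [max(0, floor(u * length)) for u in utilizations]


m = 2                                                            # processors
utilizations = [Fraction(2, 3), Fraction(3, 4), Fraction(1, 2)]  # sum = 23/12 <= m
length = 3                                                       # L^1 = b_1 - b_0

mand = first_slice_mandatory(utilizations, length)
remaining_units = m * length - sum(mand)  # capacity left for optional units
print(mand, remaining_units)
```

Because each floor is at most U_i * L^1 and the utilizations sum to at most m, the mandatory units always fit in the first time slice; the rule depends only on the lags, so BF and BF2 allocate the same mandatory units there.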

    Next time slices: Let $TS^k \stackrel{def}{=} [b_{k-1}, b_k)$ be any time slice such that $k > 1$ and suppose that in $S_{BF^2}$ there is no unallocated time unit within $[b_0, b_{k-1})$. Then, suppose by contradiction that in $TS^k$,

    \[ \sum_i mand^k_i(S_{BF^2}) > m \cdot L^k \tag{8} \]

    In the following, we prove that if Inequality 8 holds then there cannot exist a valid schedule from time 0 to time $b_k$, which contradicts the optimality and validity of Algorithm BF.

    Above, we assumed that there is an overflow in the time slice $TS^k$ in $S_{BF^2}$. In other words, there exists at least one task (say A) that executes longer in the time slice $TS^k$ in $S_{BF^2}$ than in $S_{BF}$. Due to the overflow, at least one time unit of A will be unable to execute during $TS^k$. We consider a single such time unit and we say that this time unit is ejected from $TS^k$.

    By Inequality 8, we know that $RU^k < 0$ and every task (including A) only executes its mandatory part in $TS^k$ in $S_{BF^2}$. Therefore,

    \[ mand^k_A(S_{BF^2}) > mand^k_A(S_{BF}) + opt^k_A(S_{BF}) \]

    Since $opt^k_A(S_{BF})$ can only be 1 or 0, it holds that

    \[ mand^k_A(S_{BF^2}) > mand^k_A(S_{BF}) \]

    and so (from line 4)

    \[ lag^{k-1}_A(S_{BF^2}) > lag^{k-1}_A(S_{BF}) \]

    According to Lemma 5, the above inequality can be rewritten as

    \[ lag^{k-1}_A(S_{BF^2}) = lag^{k-1}_A(S_{BF}) + 1 \tag{9} \]

    Since the time unit ejected from $TS^k$ is a mandatory unit, we can perform the Backward move defined below.

    Backward move: As a consequence of the above inequality, we know from the definition of the lags that there exists a time slice before $TS^k$ such that the task A executes at least one time unit more in $S_{BF}$ than in $S_{BF^2}$. Hereafter, A and $TS^k$ will be said to be the reference task and the reference time slice of this backward move. Remember that the backward move can only be performed if the time unit ejected from the overflowed time slice (here, $TS^k$) is a mandatory unit. Let $TS^j$ (with $1 \le j < k$) denote the latest time slice (from $TS^1$) such that A executes at least one time unit more in $S_{BF}$ than in $S_{BF^2}$, and consider this time unit of A (see Figure 4). By definition of $TS^j$ and from Lemma 5, we know that

    \[ lag^j_A(S_{BF^2}) = lag^j_A(S_{BF}) \{+1\} \tag{10} \]

    For the correctness of the remainder of this proof, we need to be sure that A does not execute any optional unit in $S_{BF^2}$ within $[b_{j-1}, b_k)$. This is proved in Subproofs B.1, B.2 and B.3 below.

    Subproof B.1: Suppose that there exists a time slice $TS^\ell$ with $j < \ell < k$ such that A executes one time unit more in BF than in BF$^2$. This directly contradicts the definition of $TS^j$.

    Subproof B.2: Suppose that there exists a time slice $TS^\ell$ with $j < \ell < k$ such that A executes one time unit more in BF$^2$ than in BF. From Expression 10 we have

    \[ lag^{\ell-1}_A(S_{BF^2}) = lag^{\ell-1}_A(S_{BF}) \{+1\} \]

    and it therefore holds that

    \[ lag^\ell_A(S_{BF^2}) = lag^\ell_A(S_{BF}) \{-1\} \]

    However, since from Subproof B.1 there does not exist a time slice $TS^o$ with $\ell < o < k$ such that A executes one time unit more in BF than in BF$^2$, we get at boundary $b_{k-1}$

    \[ lag^{k-1}_A(S_{BF^2}) = lag^{k-1}_A(S_{BF}) \{-1\} \]

    which contradicts Equality 9. Notice that, according to Subproofs B.1 and B.2, we know that BF and BF$^2$ must execute A for the same amount of time within each time slice in $[b_j, b_{k-1})$. As a consequence, the difference between the lags of A in BF and BF$^2$ remains constant from boundary $b_j$ to boundary $b_{k-1}$ and, according to Equality 9, we have at boundary $b_j$

    \[ lag^j_A(S_{BF^2}) = lag^j_A(S_{BF}) + 1 \tag{11} \]

    Subproof B.3: From Subproofs B.1 and B.2 we know that in every time slice $TS^\ell$ such that $j < \ell < k$, A executes the same amount of time in both BF$^2$ and BF. From Equality 11 we know that $lag^j_A(S_{BF^2}) > 0$ and $lag^j_A(S_{BF}) < 0$, and from Subproofs B.1 and B.2 it holds that

    \[ lag^{\ell-1}_A(S_{BF^2}) = lag^{\ell-1}_A(S_{BF}) + 1 \tag{12} \]

    From line 4 of Algorithm 1 we have in $TS^\ell$ in both $S_{BF}$ and $S_{BF^2}$

    \[ mand^\ell_A = \max(0, \lfloor lag^{\ell-1}_A + U_A \cdot L^\ell \rfloor) \]

    and two cases may occur.

    1. $mand^\ell_A(S_{BF^2}) = 0$. Therefore it holds in $S_{BF^2}$ that

    \[ \lfloor lag^{\ell-1}_A(S_{BF^2}) + U_A \cdot L^\ell \rfloor \le 0 \]

    and according to Equality 12, it holds in $S_{BF}$ that

    \[ \lfloor lag^{\ell-1}_A(S_{BF}) + U_A \cdot L^\ell \rfloor \le -1 \]

    leading to $mand^\ell_A(S_{BF}) = 0$. This means that A does not execute any mandatory unit in either $S_{BF}$ or $S_{BF^2}$. Since $mand^\ell_A(S_{BF}) = 0$, we have

    \[ lub^\ell_A(S_{BF}) = lag^{\ell-1}_A(S_{BF}) + L^\ell \cdot U_A \]

    and from Equality 12,

    \[
    \begin{aligned}
    lub^\ell_A(S_{BF}) &= lag^{\ell-1}_A(S_{BF^2}) - 1 + L^\ell \cdot U_A \\
    &= lub^\ell_A(S_{BF^2}) - 1
    \end{aligned}
    \]

    As a consequence, we know from Lemma 3 that $lub^\ell_A(S_{BF}) > 0$ and $lub^\ell_A(S_{BF^2}) < 0$. Hence, A cannot receive an optional unit in $TS^\ell$ in BF$^2$ and, from Subproofs B.1 and B.2, it cannot receive an optional unit in $S_{BF}$ either.

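    2. $\lfloor lag^{\ell-1}_A(S_{BF^2}) + U_A \cdot L^\ell \rfloor > 0$, which leads to

    \[ mand^\ell_A(S_{BF^2}) = mand^\ell_A(S_{BF}) + 1 \]

    As a result, having $mand^\ell_A(S_{BF^2}) + opt^\ell_A(S_{BF^2}) = mand^\ell_A(S_{BF}) + opt^\ell_A(S_{BF})$ is possible if and only if $opt^\ell_A(S_{BF^2}) = 0$ and $opt^\ell_A(S_{BF}) = 1$.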
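The two cases of Subproof B.3 can be reproduced with a short computation. This is a minimal Python sketch, assuming the mandatory rule mand = max(0, floor(lag + U * L)) and a lag that is exactly one unit larger in the BF2 schedule than in the BF schedule (Equality 12); the function name and the numeric values are hypothetical.

```python
from fractions import Fraction
from math import floor


def mand(lag_prev, utilization, length):
    """Assumed mandatory rule: max(0, floor(lag_prev + U * L))."""
    return max(0, floor(lag_prev + utilization * length))


lag_bf = Fraction(-1, 2)  # lag of A at the previous boundary in the BF schedule
lag_bf2 = lag_bf + 1      # Equality 12: the lag is one unit larger in BF2

# Case 1: short slice, no mandatory unit in either schedule.
case1 = (mand(lag_bf, Fraction(1, 5), 2), mand(lag_bf2, Fraction(1, 5), 2))

# Case 2: the BF2 schedule owes exactly one more mandatory unit.
case2 = (mand(lag_bf, Fraction(3, 5), 4), mand(lag_bf2, Fraction(3, 5), 4))

print(case1, case2)
```

Shifting the argument of a floor by 1 shifts its value by exactly 1, which is why the mandatory parts are either both zero or differ by exactly one unit.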

    Subproofs B.1, B.2 and B.3 state that A does not execute any optional time unit in $S_{BF^2}$ within $[b_{j-1}, b_k)$. Now, we move the ejected time unit of A from $TS^k$ to $TS^j$ (see Figure 4). Since we assumed that $\forall \ell \in [1, k)$

    \[ \sum_i \left( mand^\ell_i(S_{BF^2}) + opt^\ell_i(S_{BF^2}) \right) = m \cdot L^\ell \]

    we have in $TS^j$

    \[ \sum_i \left( mand^j_i(S_{BF^2}) + opt^j_i(S_{BF^2}) \right) = m \cdot L^j \]

    Consequently, placing this time unit of A in $TS^j$ in $S_{BF^2}$ generates an overflow of one time unit. Suppose that we place it in $TS^j$ anyway in order to imitate the optimal schedule $S_{BF}$. Since there is now an overflow in $TS^j$ in $S_{BF^2}$, there must exist a task B such that

    \[ mand^j_B(S_{BF^2}) + opt^j_B(S_{BF^2}) > mand^j_B(S_{BF}) + opt^j_B(S_{BF}) \]

    Consider the time unit of B which is ejected from $TS^j$ in $S_{BF^2}$ (see Figure 4). Two cases may occur: this time unit belongs to the mandatory part of B (i.e., $opt^j_B(S_{BF^2}) = 0$) or it was allocated as an optional time unit (i.e., $opt^j_B(S_{BF^2}) = 1$).

    Case 1. The ejected time unit belongs to the mandatory part of a task (here, B), i.e., $opt^j_B(S_{BF^2}) = 0$. Following the same development as for A, there must exist a time slice $TS^q$ with $1 \le q < j$ such that B executes one unit more in $S_{BF}$ than in $S_{BF^2}$. As a result, we can perform the backward move with B as reference task and $TS^j$ as reference time slice. The backward move can be repeated as long as the ejected unit is a mandatory one, and it ends when we reach one of the two following situations: either we reach a time slice $TS^q$ with $1 < q < j$ where an optional unit is ejected, or we reach $TS^q = TS^1$ (see Figure 4). Since in $TS^1$ the amount of mandatory units of every task is identical in both $S_{BF}$ and $S_{BF^2}$ (according to Equality 7), we know that the time unit ejected from $TS^1$ is an optional one. In conclusion, the backward move stops when the time unit ejected from a time slice is optional, and we know that this must happen if $TS^1$ is reached, if not beforehand.

    Case 2. The ejected time unit of B was allocated as an optional time unit (i.e., $opt^j_B(S_{BF^2}) = 1$). Therefore, we perform a Forward move defined below.


    Forward move: The forward move is performed when the time unit ejected from a time slice is an optional one. A forward move is characterized by an origin, a reference task and a destination with the following interpretation: the origin is the time slice from which the optional time unit is ejected (here, $TS^j$), the reference task is the one responsible for the ejection of the first optional time unit (here, A) and the destination is the time slice in which an extra time unit (compared to $S_{BF}$) will finally be generated (here, $TS^k$). In the following, we prove that

    F.1 withdrawing the optional unit from B in the origin time slice $TS^j$ in $S_{BF^2}$ ensures that B executes at least one mandatory unit in a time slice $TS^p$ in $S_{BF^2}$ such that $j < p \le k$, where $TS^j$ and $TS^k$ are the origin and destination time slices of the forward move, respectively.

    F.2 at least one of the mandatory units of B executed within $[b_j, b_k)$ has been generated by withdrawing its optional time unit in $TS^j$ (see Figure 4). That is, withdrawing the optional unit from B in $TS^j$ in $S_{BF^2}$ generates an overflow of one time unit in another time slice $TS^p$ (with $j < p \le k$).

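    Subproof F.1 Here, we prove that withdrawing the optional time unit from B in $TS^j$ in $S_{BF^2}$ ensures that B executes at least one mandatory unit in a time slice $TS^p$ in $S_{BF^2}$ such that $j < p \le k$. Within this sub-proof, which only considers the schedule $S_{BF^2}$, we will use the notations $lag^k_i$, $lub^k_i$, ... instead of $lag^k_i(S_{BF^2})$, $lub^k_i(S_{BF^2})$, etc.

    Since A has been responsible for an overflow of a mandatory unit in $TS^k$ in $S_{BF^2}$, it holds in $S_{BF^2}$ that

    \[
    \begin{aligned}
    1 &\le \lfloor lag^{k-1}_A + L^k \cdot U_A \rfloor \\
    &\le lag^{k-1}_A + L^k \cdot U_A \\
    &\le lub^{k-1}_A - opt^{k-1}_A + L^k \cdot U_A
    \end{aligned}
    \]

    Since $opt^{k-1}_A \ge 0$ we have

    \[
    \begin{aligned}
    1 &\le lub^{k-1}_A + L^k \cdot U_A \\
    &\le lag^{k-2}_A + L^{k-1} \cdot U_A - mand^{k-1}_A + L^k \cdot U_A
    \end{aligned}
    \]

    Since $mand^{k-1}_A \ge 0$, it holds that

    \[
    \begin{aligned}
    1 &\le lag^{k-2}_A + L^{k-1} \cdot U_A + L^k \cdot U_A \\
    &\le lag^{k-2}_A + (L^{k-1} + L^k) \cdot U_A \\
    &\le \ldots \\
    &\le lag^j_A + (L^{j+1} + \ldots + L^{k-1} + L^k) \cdot U_A \\
    &\le lag^j_A + (b_k - b_j) \cdot U_A
    \end{aligned}
    \]

    From line 12 of Algorithm 1 we know that $lag^j_A = lub^j_A - opt^j_A$ with $opt^j_A = 0$ (we proved in Subproofs B.1, B.2 and B.3 that A did not receive an optional time unit in $TS^j$ in $S_{BF^2}$). Consequently, the above inequality can be rewritten as

    \[ lub^j_A + (b_k - b_j) \cdot U_A \ge 1 \]

    And so

    \[ (b_k - b_j) \ge \frac{1 - lub^j_A}{U_A} \tag{13} \]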
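Inequality 13 can be read as a lower bound on the distance between the boundaries $b_j$ and $b_k$. This is a minimal Python sketch of that reading, assuming only the inequality lub + (b_k - b_j) * U >= 1 derived above; the function names and the sample values are hypothetical.

```python
from fractions import Fraction


def min_span_for_mandatory_unit(lub_j, utilization):
    """Right-hand side of Inequality 13: (1 - lub_j) / U."""
    return (1 - lub_j) / utilization


def reaches_one(lub_j, utilization, span):
    """True when lub_j + span * U >= 1, i.e. a full unit of work is owed."""
    return lub_j + span * utilization >= 1


lub_j = Fraction(1, 4)
u_a = Fraction(1, 2)

bound = min_span_for_mandatory_unit(lub_j, u_a)
print(bound)                          # smallest admissible b_k - b_j
print(reaches_one(lub_j, u_a, 1))     # span below the bound
print(reaches_one(lub_j, u_a, 2))     # span above the bound
```

Any span shorter than the bound leaves the accumulated work below one unit, so no mandatory unit can be due; the proof compares this bound for A with the analogous strict bound obtained for B.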

    Now, suppose that the optional unit is withdrawn from B in $TS^j$ in $S_{BF^2}$ (i.e., $opt^j_B = 0$) and suppose (by contradiction) that B does not execute in any time slice $TS^p$ such that $j < p \le k$. Therefore, in any such $TS^p$ we have

    \[ lag^{p-1}_B + L^p \cdot U_B < 1 \]

    And in particular (since $p \le k$) we have in $TS^k$ in $S_{BF^2}$

    \[
    \begin{aligned}
    1 &> lag^{k-1}_B + L^k \cdot U_B \\
    &= lub^{k-1}_B - opt^{k-1}_B + L^k \cdot U_B
    \end{aligned}
    \]

    Since $opt^{k-1}_B = 0$ we have

    \[
    \begin{aligned}
    1 &> lub^{k-1}_B + L^k \cdot U_B \\
    &= lag^{k-2}_B + L^{k-1} \cdot U_B - mand^{k-1}_B + L^k \cdot U_B
    \end{aligned}
    \]

    Since $mand^{k-1}_B = 0$, it holds that

    \[
    \begin{aligned}
    1 &> lag^{k-2}_B + L^{k-1} \cdot U_B + L^k \cdot U_B \\
    &= lag^{k-2}_B + (L^{k-1} + L^k) \cdot U_B \\
    &= \ldots \\
    &= lag^j_B + (L^{j+1} + \ldots + L^{k-1} + L^k) \cdot U_B \\
    &= lag^j_B + (b_k - b_j) \cdot U_B
    \end{aligned}
    \]

    From line 12 of Algorithm 1 we know that $lag^j_B = lub^j_B - opt^j_B$, and since the optional unit has been withdrawn from B we have $opt^j_B = 0$. Consequently, the above inequality can be rewritten as

    \[ lub^j_B + (b_k - b_j) \cdot U_B < 1 \]

    which implies

    (bk bj)