CmpE 516: Fault Tolerant Scheduling in Multiprocessor Systems, Betül Demiröz, 5 May 2005


5 May 2005

CmpE 516

Fault Tolerant Scheduling in Multiprocessor Systems

Betül Demiröz


Outline

- General concepts about tasks and scheduling
- Real time systems
- Fault tolerant scheduling
- Basic approaches used in fault tolerant scheduling
- Algorithms and their execution details


Task

- Deadline: the time by which the task should be finished
- Preemptive tasks: can be stopped during execution and restarted
- Nonpreemptive tasks: cannot be restarted or interrupted during execution


Task Properties

- Periodic
- Aperiodic
  - activated only when certain events occur
  - arrival times are not known
  - scheduled dynamically
- Dependent
- Independent


Task Scheduling

- Distribution of tasks to the processors according to a given policy.
- Major goals of task scheduling:
  - distribute the system load
  - reduce total execution time


Static & Dynamic Scheduling

- Static Scheduling
  - compile time scheduling
  - an accurate weight estimation is needed
  - schedules of all tasks are predetermined
- Dynamic Scheduling
  - scheduling at run time
  - uses actual values of execution times of processes and communication times


Real Time Systems

- Hard Real Time
  - correctness depends on
    - logical results
    - the result production time
  - missing a deadline may be catastrophic
  - mission-critical or life-critical applications
  - fault tolerance is extremely important
- Soft Real Time


Processors In The System

- Uniprocessor: there is a single processor
- Multiprocessor
  - there are n processors in the system
  - can be identical (homogeneous)
  - can have different properties (heterogeneous)


Hard Real Time Systems

- Use multiprocessors
- Advantages
  - more reliable
    - unless a processor failure causes the whole system to fail
    - can happen if no fault-tolerant capability is provided
  - one processor failure does not cause the whole system to fail
  - more computational power
- Disadvantage
  - the probability of processor failure is higher


Fault Tolerant System

- The system should produce correct results even in the presence of faults
- Important for most real time applications
- Tasks can have deadlines, and should be finished before the deadline
  - fault tolerance required
  - hard real time systems


Error Detection in Fault Tolerant Scheduling

- Fail-Signal: notify other processors of a detected fault
- Alarms or watchdogs: detection of timing failures
- Signatures: detection of HW/SW faults
- Acceptance Tests: test results for HW/SW faults


Fault Tolerance In Multiprocessor Systems

- Multiple copies of tasks scheduled on different processors
- Aim: the task completes before its deadline


Fault Tolerant Scheduling In Multiprocessor Systems (Cont.)

- Multiple copies of tasks are scheduled to different processors
- One or more copies can run to ensure task completion before deadline
- PB (Primary/Backup Approach)
- TMR (Triple Modular Redundancy)
  - Error checking is done by comparing results
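
To illustrate the comparison step in TMR, the sketch below runs three copies of a task and takes a majority vote over their results. The name `tmr_vote` and the use of plain Python callables as stand-ins for copies running on different processors are assumptions made for this example; the slides do not describe a specific voter.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def tmr_vote(copies: Sequence[Callable[[], Hashable]]) -> Hashable:
    """Run three copies of a task and return the majority result.

    Hypothetical helper: each callable stands in for one copy of the
    task on its own processor. Raises if no two copies agree.
    """
    if len(copies) != 3:
        raise ValueError("TMR expects exactly three copies")
    results = [copy() for copy in copies]
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority: all three results differ")
    return value


# One faulty copy (returning 99) is outvoted by the two that agree.
voted = tmr_vote([lambda: 42, lambda: 99, lambda: 42])
```

A single faulty copy is masked without any re-execution, at the cost of always running all three copies.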


PB (Primary/Backup Approach)

- If incorrect results are generated by the primary processor, the backup processor is activated
- Small HW resource requirements
- Tasks are nonpreemptive, aperiodic, real-time
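
A minimal sketch of the primary/backup idea follows, assuming that an acceptance test decides whether the primary's result is correct. The function `run_primary_backup` and its parameters are hypothetical names for illustration only.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_primary_backup(
    primary: Callable[[], T],
    backup: Callable[[], T],
    acceptance_test: Callable[[T], bool],
) -> Tuple[T, str]:
    """Return (result, name of the copy that produced it).

    Hypothetical sketch: the backup runs only if the primary raises an
    exception or its result fails the acceptance test.
    """
    try:
        result = primary()
        if acceptance_test(result):
            return result, "primary"
    except Exception:
        pass  # a crash of the primary is treated like a failed test
    result = backup()
    if not acceptance_test(result):
        raise RuntimeError("backup result also failed the acceptance test")
    return result, "backup"


# The primary returns a negative value, which the test rejects.
outcome = run_primary_backup(lambda: -1, lambda: 7, lambda r: r >= 0)
```

In the fault-free case only the primary executes, which is one way to read the small hardware requirement noted above.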


An Algorithm For Real Time Fault Tolerant Scheduling in Multiprocessor Systems

- N periodic tasks are scheduled on a number of processors
- For each task i, there is a primary copy Pi and a backup copy Bi
  - If primary copy fails, backup copy is activated
- Enough time needed to execute backup copies
- Static scheduling of tasks


Scheduling Requirements

- Each task is executed by one processor at a time
- All tasks should meet their deadlines
- Maximize the number of processor failures to be tolerated
- Pi and Bi are each assigned to exactly one processor, and the two processors are different
- Tasks are preemptive
- The number of processors used should be minimized


Scheduling Algorithm

- Primary tasks are arranged in order of decreasing computation times
- Primary copies are scheduled (m processors are used)
  - assign each copy to existing processors
- Primary schedule is duplicated for the backup copies (m processors are used)
- No pair of primary and backup copies should overlap


An Example Distribution

S={T1, T2, T3, T4, T5}

C={5, 4, 4, 3, 2}

T1 -> P1

T2 -> P2

T3 -> P1

T4 -> P2

T5 -> P2
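
The distribution above can be checked with a short sketch. It reads P1 and P2 as processors, copies the computation times and the primary assignment from the example, and mirrors the primary schedule onto a second group of processors for the backup copies. The names P3 and P4 and the helper `loads` are assumptions for illustration; the slides do not name the backup processors or state the rule that produced this particular assignment.

```python
from collections import defaultdict

# Computation times and primary assignment copied from the example.
computation = {"T1": 5, "T2": 4, "T3": 4, "T4": 3, "T5": 2}
primary = {"T1": "P1", "T2": "P2", "T3": "P1", "T4": "P2", "T5": "P2"}

# Assumption: backups reuse the primary schedule on a second group of
# processors, here called P3 and P4 (names not in the slides).
mirror = {"P1": "P3", "P2": "P4"}
backup = {task: mirror[proc] for task, proc in primary.items()}


def loads(assignment):
    """Total computation time placed on each processor."""
    total = defaultdict(int)
    for task, proc in assignment.items():
        total[proc] += computation[task]
    return dict(total)


primary_loads = loads(primary)
backup_loads = loads(backup)
# No task has its primary and backup copy on the same processor.
separated = all(primary[t] != backup[t] for t in computation)
```

With these numbers each primary processor carries 9 units of computation, so the example distribution is balanced across P1 and P2.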


Example Cont.


Another Algorithm

- Two copies of a task are allowed to start execution at different times
- Improves schedulability of tasks
- N identical processors and a scheduling processor are used
- Dynamic scheduling


System Model

- A task is scheduled if
  - the previously scheduled tasks and the arrived task meet their deadlines
- Otherwise
  - the task is rejected because its deadline cannot be guaranteed in the presence of a fault


Techniques Used

- Backup copies are activated only when a fault occurs on the processor executing the primary copy
- Backup Overloading
  - overlapping multiple slots for backups
- Backup De-allocation
  - release the slot for a backup copy when its primary copy is completed successfully
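
The two techniques can be sketched as follows. The overlap rule used here is an assumption, not something the slides state: with at most one processor failing at a time, two backup slots on the same processor may share time only when their primaries run on different processors. `BackupSlot`, `can_overload` and `deallocate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupSlot:
    task: str
    start: int
    end: int            # exclusive
    primary_proc: int   # processor that runs the primary copy


def can_overload(new: BackupSlot, existing: List[BackupSlot]) -> bool:
    """Check whether `new` may share a processor with `existing` slots.

    Assumed rule (single processor fault at a time): slots may overlap
    in time only if their primaries run on different processors, since
    then at most one of the overlapping backups is ever activated.
    """
    for slot in existing:
        overlaps = new.start < slot.end and slot.start < new.end
        if overlaps and slot.primary_proc == new.primary_proc:
            return False
    return True


def deallocate(task: str, slots: List[BackupSlot]) -> List[BackupSlot]:
    """Release the backup slot of `task` after its primary succeeded."""
    return [s for s in slots if s.task != task]


# Backup slots held on one processor; the primaries run elsewhere.
held = [BackupSlot("T1", 10, 15, primary_proc=1)]
ok = can_overload(BackupSlot("T2", 12, 17, primary_proc=2), held)
clash = can_overload(BackupSlot("T3", 12, 17, primary_proc=1), held)
after = deallocate("T1", held)
```

Here the backup of T2 may overlap the slot of T1 because their primaries sit on different processors, while T3 may not.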


Backup Overloading


Backup Deallocation


Proposed Technique

- The primary copy and backup copy are scheduled and executed in parallel
- The backup copy is divided into
  - preceding part executed together with primary copy (redundant part)
  - remaining part executed after the primary copy is completed (backup part)
- Backup overloading and backup deallocation are used
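
As a small worked example of this division, the sketch below splits a backup copy at the finish time of its primary. It assumes integer time units and half-open intervals; `split_backup` is a hypothetical helper, not a function from the presented work.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), end exclusive


def split_backup(primary: Interval, backup: Interval) -> Tuple[int, int]:
    """Return (redundant_length, backup_length) for one task.

    Assumed model: the portion of the backup copy scheduled before the
    primary finishes is the redundant part; the rest is the backup part.
    """
    p_start, p_end = primary
    b_start, b_end = backup
    if p_end <= p_start or b_end <= b_start:
        raise ValueError("intervals must have positive length")
    # Clamp the primary's finish time into the backup interval.
    cut = min(max(p_end, b_start), b_end)
    return cut - b_start, b_end - cut


# Primary runs over [0, 5); the backup copy runs over [3, 8).
parts = split_backup((0, 5), (3, 8))
```

In this example the backup copy overlaps the primary for 2 time units (redundant part) and has 3 units left after the primary finishes (backup part).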


Proposed Technique (Cont.)


Scheduling Algorithm

- Schedule primary copy
  - try to find a free slot between arrival time and deadline
- Schedule backup copy
  - schedule both redundant and backup parts
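
The slot search can be sketched as below, assuming each processor keeps a list of reserved half-open intervals. As a simplification, the backup copy is placed as one block on a different processor rather than as separate redundant and backup parts. `find_free_slot` and `place_copy` are hypothetical names.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), end exclusive


def find_free_slot(
    busy: List[Interval], arrival: int, deadline: int, length: int
) -> Optional[Interval]:
    """Earliest gap of `length` within [arrival, deadline] on one processor."""
    start = arrival
    for b_start, b_end in sorted(busy):
        if start + length <= b_start:
            break  # the gap before this reservation is large enough
        start = max(start, b_end)
    if start + length <= deadline:
        return (start, start + length)
    return None


def place_copy(timelines, arrival, deadline, length, exclude=None):
    """Reserve a slot on the first processor that has one.

    Returns (processor_index, slot), or None when the copy must be
    rejected. `exclude` skips the processor holding the other copy.
    """
    for proc, busy in enumerate(timelines):
        if proc == exclude:
            continue
        slot = find_free_slot(busy, arrival, deadline, length)
        if slot is not None:
            busy.append(slot)
            return proc, slot
    return None


timelines = [[(0, 4), (6, 9)], [(0, 3)]]  # reservations on two processors
primary = place_copy(timelines, arrival=2, deadline=12, length=2)
backup = place_copy(timelines, 2, 12, 2, exclude=primary[0])
```

A task whose primary or backup copy gets `None` would be rejected under this sketch.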


System Overview


Experiments

- Basic parameters used in experiments
  - system load
  - number of processors and tasks used
  - computation time
  - window size
- Analysing results
  - rejection rate


Experimental Results


Thank You

ANY QUESTIONS?