CPU Scheduling
◦ Basic Concepts
◦ Scheduling Criteria
◦ Scheduling Algorithms
◦ Multiple-Processor Scheduling
◦ Algorithm Evaluation
Maximum CPU utilization is obtained with multiprogramming.
CPU–I/O Burst Cycle – Process execution consists of a cycle of CPU execution and I/O wait.
CPU burst distribution
Basic Concepts
CPU And I/O Bursts
Histogram of CPU-burst Times
Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them.
CPU scheduling decisions may take place when a process:
1. switches from running to waiting state
2. switches from running to ready state
3. switches from waiting to ready state
4. terminates
Scheduling under 1 and 4 is nonpreemptive.
All other scheduling is preemptive.
CPU Scheduler
Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves:
◦ switching context
◦ switching to user mode
◦ jumping to the proper location in the user program to restart that program
Dispatch latency – the time it takes for the dispatcher to stop one process and start another running.
Dispatcher
CPU utilization – keep the CPU as busy as possible
Throughput – # of processes that complete their execution per time unit
Turnaround time – amount of time to execute a particular process
Scheduling Criteria
Waiting time – amount of time a process has been waiting in the ready queue
Response time – amount of time it takes from when a request was submitted until the first response is produced, not output (for time-sharing environment)
Scheduling Criteria
◦ Max CPU utilization
◦ Max throughput
◦ Min turnaround time
◦ Min waiting time
◦ Min response time
Optimization Criteria
Process   Burst Time
P1        24
P2        3
P3        3
Suppose that the processes arrive in the order: P1 , P2 , P3 . The Gantt Chart for the schedule is:
Waiting time for P1 = 0; P2 = 24; P3 = 27
Average waiting time: (0 + 24 + 27)/3 = 17
First-Come, First-Served (FCFS)
| P1 | P2 | P3 |
0    24   27   30
Suppose that the processes arrive in the order P2 , P3 , P1
The Gantt chart for the schedule is:
Waiting time for P1 = 6; P2 = 0; P3 = 3
Average waiting time: (6 + 0 + 3)/3 = 3
Much better than the previous case.
Convoy effect: short processes stuck behind a long process.
FCFS Scheduling (Cont.)
| P2 | P3 | P1 |
0    3    6    30
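The two FCFS cases above can be checked with a minimal sketch (not part of the original slides): with all processes arriving at time 0, each process's waiting time is simply the sum of the bursts ahead of it.

```python
# Minimal FCFS sketch: waiting time = start time, since all
# processes are assumed to arrive at time 0.
def fcfs_waiting_times(bursts):
    waits, clock = [], 0
    for b in bursts:
        waits.append(clock)   # process waits until the CPU frees up
        clock += b
    return waits

# Order P1, P2, P3 (bursts 24, 3, 3):
print(fcfs_waiting_times([24, 3, 3]))   # [0, 24, 27], average 17
# Order P2, P3, P1:
print(fcfs_waiting_times([3, 3, 24]))   # [0, 3, 6], average 3
```

Running the long burst first makes every later process absorb its delay – the convoy effect in miniature.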
Associate with each process the length of its next CPU burst. Use these lengths to schedule the process with the shortest time.
Two schemes:
◦ nonpreemptive – once the CPU is given to the process, it cannot be preempted until it completes its CPU burst.
Shortest-Job-First (SJF) Scheduling
◦ preemptive – if a new process arrives with a CPU burst length less than the remaining time of the currently executing process, preempt. This scheme is known as Shortest-Remaining-Time-First (SRTF).
SJF is optimal – gives minimum average waiting time for a given set of processes.
Shortest-Job-First (SJF) Scheduling
Process   Arrival Time   Burst Time
P1        0.0            7
P2        2.0            4
P3        4.0            1
P4        5.0            4
SJF (non-preemptive)
Average waiting time = (0 + 6 + 3 + 7)/4 = 4
Example of Non-Preemptive SJF
| P1 | P3 | P2 | P4 |
0    7    8    12   16
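The non-preemptive schedule above can be reproduced with a short sketch (mine, not the slides'): at each completion, pick the shortest burst among the processes that have already arrived.

```python
# Non-preemptive SJF sketch. procs: list of (name, arrival, burst).
def sjf(procs):
    pending = sorted(procs, key=lambda p: p[1])   # order by arrival
    clock, waits = 0, {}
    while pending:
        ready = [p for p in pending if p[1] <= clock]
        if not ready:                 # CPU idle until the next arrival
            clock = pending[0][1]
            continue
        name, arr, burst = min(ready, key=lambda p: p[2])
        waits[name] = clock - arr     # time spent in the ready queue
        clock += burst                # runs to completion (no preemption)
        pending.remove((name, arr, burst))
    return waits

w = sjf([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
print(w, sum(w.values()) / 4)   # average waiting time 4.0
```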
Process   Arrival Time   Burst Time
P1        0.0            7
P2        2.0            4
P3        4.0            1
P4        5.0            4
SJF (preemptive)
Average waiting time = (9 + 1 + 0 + 2)/4 = 3
Example of Preemptive SJF
| P1 | P2 | P3 | P2 | P4 | P1 |
0    2    4    5    7    11   16
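The preemptive schedule can be checked tick by tick with a sketch of my own (one time unit per loop iteration, ties broken by insertion order): always run the arrived process with the least remaining time, and compute waiting time as completion − arrival − burst.

```python
# SRTF sketch. procs: list of (name, arrival, burst).
def srtf(procs):
    remaining = {name: b for name, _, b in procs}
    arrival = {name: a for name, a, _ in procs}
    burst = dict(remaining)
    done, clock = {}, 0
    while remaining:
        ready = [n for n in remaining if arrival[n] <= clock]
        if not ready:                 # nothing has arrived yet
            clock += 1
            continue
        n = min(ready, key=lambda x: remaining[x])   # least remaining time
        remaining[n] -= 1             # run for one time unit
        clock += 1
        if remaining[n] == 0:
            del remaining[n]
            done[n] = clock - arrival[n] - burst[n]  # waiting time
    return done

w = srtf([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
print(w, sum(w.values()) / 4)   # average waiting time 3.0
```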
The length of the next CPU burst can only be estimated.
The estimate can be computed from the lengths of previous CPU bursts, using exponential averaging.
Predicting Next CPU Burst
Define:
1. t_n = actual length of the n-th CPU burst
2. τ_{n+1} = predicted value for the next CPU burst
3. α, 0 ≤ α ≤ 1
4. τ_{n+1} = α t_n + (1 − α) τ_n
Next CPU Burst
α = 0:
◦ τ_{n+1} = τ_n
◦ Recent history does not count.
α = 1:
◦ τ_{n+1} = t_n
◦ Only the actual last CPU burst counts.
Exponential Averaging
If we expand the formula, we get:
τ_{n+1} = α t_n + (1 − α) α t_{n−1} + … + (1 − α)^j α t_{n−j} + … + (1 − α)^{n+1} τ_0
Since both α and (1 − α) are less than or equal to 1, each successive term has less weight than its predecessor.
Exponential Averaging
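The recurrence τ_{n+1} = α t_n + (1 − α) τ_n is easy to run by hand or in code. The sketch below assumes α = 1/2 and an initial guess τ_0 = 10 (the values commonly used in the textbook example; the burst sequence 6, 4, 6, 4 is likewise an assumed illustration).

```python
# Exponential-averaging sketch: each new prediction blends the last
# actual burst with the previous prediction.
def predict(actual_bursts, tau0, alpha=0.5):
    tau = tau0
    preds = [tau]                      # preds[i] = prediction before burst i
    for t in actual_bursts:
        tau = alpha * t + (1 - alpha) * tau
        preds.append(tau)
    return preds

print(predict([6, 4, 6, 4], 10))   # [10, 8.0, 6.0, 6.0, 5.0]
```

Each prediction reacts to the latest burst but never forgets history entirely, which is exactly the weighting the expanded formula shows.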
A priority number (integer) is associated with each process
The CPU is allocated to the process with the highest priority (smallest integer ≡ highest priority).
◦ preemptive
◦ nonpreemptive
Priority Scheduling
SJF is a priority scheduling where priority is the predicted next CPU burst time.
Problem: Starvation – low-priority processes may never execute.
Solution: Aging – as time progresses, increase the priority of the process.
Priority Scheduling
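Aging can be sketched in a few lines. The mechanics below (bump a waiting process's priority one level every 10 ticks) are my own illustrative assumption, not a policy from the slides; smaller numbers mean higher priority, as above.

```python
# Aging sketch: each entry is [name, priority_number, ticks_waited].
# Hypothetical rule: every 10 ticks waited, raise priority one level.
def age(ready):
    for p in ready:
        p[2] += 1                           # one more tick in the ready queue
        if p[2] % 10 == 0 and p[1] > 0:
            p[1] -= 1                       # smaller number = higher priority
    return ready

queue = [["low", 20, 9]]
print(age(queue))   # [['low', 19, 10]] – the waiting process moves up
```

Run long enough, even the lowest-priority process reaches the top, which is what prevents starvation.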
Each process gets a small unit of CPU time (time quantum), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue.
If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. No process waits more than (n-1)q time units.
Performance:
◦ q large ⇒ behaves like FIFO
◦ q small ⇒ q must be large with respect to context-switch time, otherwise overhead is too high.
Round Robin (RR)
Process   Burst Time
P1        53
P2        17
P3        68
P4        24
The Gantt chart is:
Typically, higher average turnaround than SJF, but better response.
RR with Time Quantum = 20
| P1 | P2 | P3 | P4 | P1 | P3 | P4 | P1 | P3 | P3 |
0   20   37   57   77   97  117  121  134  154  162
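The RR schedule above can be reproduced with a short simulation of my own (all processes assumed to arrive at time 0):

```python
from collections import deque

# Round-Robin sketch: each process runs at most q units, then goes
# to the back of the ready queue.
def round_robin(procs, q):   # procs: list of (name, burst)
    queue = deque(procs)
    clock, gantt = 0, []
    while queue:
        name, rem = queue.popleft()
        run = min(q, rem)                     # never run past the burst
        gantt.append((name, clock, clock + run))
        clock += run
        if rem > run:                         # unfinished: requeue at the back
            queue.append((name, rem - run))
    return gantt

chart = round_robin([("P1", 53), ("P2", 17), ("P3", 68), ("P4", 24)], 20)
print(chart[-1])   # ('P3', 154, 162) – the final segment of the Gantt chart
```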
Time Quantum vs Context Switch Time
Turnaround Time Varies With The Time Quantum
Ready queue is partitioned into separate queues:
◦ foreground (interactive)
◦ background (batch)
Each queue has its own scheduling algorithm:
◦ foreground – RR
◦ background – FCFS
Multilevel Queue
Scheduling must be done between the queues.
◦ Fixed priority scheduling; (i.e., serve all from foreground then from background). Possibility of starvation.
◦ Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its processes; e.g., 80% to foreground in RR, 20% to background in FCFS.
Multilevel Queue
Multilevel Queue Scheduling
A process can move between the various queues; aging can be implemented this way.
Multilevel-feedback-queue scheduler is defined by the following parameters:
◦ number of queues
◦ scheduling algorithm for each queue
◦ method used to determine when to upgrade a process
◦ method used to determine when to demote a process
◦ method used to determine which queue a process will enter when that process needs service
Multilevel Feedback Queue
Three queues:
◦ Q0 – time quantum 8 milliseconds
◦ Q1 – time quantum 16 milliseconds
◦ Q2 – FCFS
Scheduling:
◦ A new job enters queue Q0, which is served in FCFS order. When it gains the CPU, the job receives 8 milliseconds. If it does not finish in 8 milliseconds, it is moved to queue Q1.
◦ At Q1 the job is again served FCFS and receives 16 additional milliseconds. If it still does not complete, it is preempted and moved to queue Q2.
Multilevel Feedback Queue
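As a toy illustration of these rules (my own sketch: all jobs assumed to arrive together, no new arrivals mid-run, and Q2's FCFS stage modeled as an unbounded quantum):

```python
from collections import deque

# Three-queue MLFQ sketch matching the slide: Q0 (q=8), Q1 (q=16), Q2 (FCFS).
def mlfq(jobs):   # jobs: list of (name, burst)
    queues = [deque(jobs), deque(), deque()]
    quanta = [8, 16, float("inf")]        # Q2 runs jobs to completion
    clock, log = 0, []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)  # highest non-empty
        name, rem = queues[level].popleft()
        run = min(quanta[level], rem)
        clock += run
        log.append((name, level, clock))  # (job, queue level, finish of slice)
        if rem > run:                     # quantum expired: demote
            queues[min(level + 1, 2)].append((name, rem - run))
    return log

print(mlfq([("A", 30), ("B", 5)]))
```

Job A (30 ms) burns its 8 ms in Q0, its 16 ms in Q1, and finishes in Q2; job B (5 ms) completes inside its first Q0 quantum, so short interactive jobs stay at high priority.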
Multilevel Feedback Queues
Advantages:
◦ Flexibility – can be used to implement many different policies
◦ Good balance of priority and fairness (and it's tweakable)
Disadvantages:
◦ Complexity to code
◦ Run-time efficiency
Multilevel Feedback Queues
CPU scheduling gets more complex when multiple CPUs are available.
Assume homogeneous processors within a multiprocessor.◦ Note: Future processors may not fit this model!
Schedules can be made either:
◦ symmetrically – each processor schedules itself from a shared queue
◦ asymmetrically – only one processor runs in kernel mode and makes scheduling decisions
Multiple-Processor Scheduling
Processor Affinity:
◦ When a process has been running on CPU a and is migrated to CPU b, there is a penalty for the cache/TLB/memory flush and re-load.
Load Balancing:
◦ Processes should be scheduled fairly evenly across multiple processors.
◦ Avoids wasting CPU cycles when there is something that could be running.
These two are both important, but they tend to fight each other.
Affinity vs. Balance
To avoid flushing the caches, each process will "prefer" a CPU (or set of CPUs):
◦ "Soft" affinity – can be moved if needed
◦ "Hard" affinity – locked to the current CPU
Implementation varies – depends on:
◦ system architecture and available resources
◦ load-balancing needs
Processor Affinity
Solutions:
Run with a single ready queue
◦ If a CPU becomes idle, it just pulls the next process.
◦ Not used in practice (SMP is common).
Push migration
◦ An OS task examines the load every so often and, if needed, pushes some processes from heavily loaded CPUs to lightly loaded ones.
Pull migration
◦ An idle CPU pulls a process from another CPU's queue.
Load Balancing
Most multi-core processors today also implement hardware simultaneous multithreading
◦ e.g., Intel i7 (Nehalem) "Hyperthreading"
CPU is represented to the OS as multiple logical CPUs – one for each "hardware thread"
◦ e.g., UltraSPARC T1: 8 cores, each with 4 threads ⇒ 32 logical processors!
Multi-CORE Scheduling
OS must schedule the processes onto the "logical processors".
◦ Typical scheduling algorithms (?)
CPU decides which "hardware thread" to run on each core
◦ Coarse-grained parallelism – rotate threads when a long-latency instruction happens, like a load that misses the cache
◦ Fine-grained parallelism – "round-robin" the hardware threads, often every clock cycle. This is very rare in practice.
Multi-CORE Scheduling
Deterministic modeling:
◦ takes a particular predetermined workload
◦ determines the performance of each algorithm on that workload
What you did in 265
Algorithm Evaluation
Queueing models – Little's law:
◦ N = λ × W
  N = average queue length
  λ = average arrival rate
  W = average waiting time in queue
◦ e.g., if N = 8 and λ = 2, then W = 4
Algorithm Evaluation
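Little's law relates three quantities, so any one follows from the other two. A trivial helper (mine, for illustration):

```python
# Little's law sketch: N = lambda * W. Pass any two of the three
# quantities; the missing one is computed.
def littles_law(n=None, lam=None, w=None):
    if n is None:
        return lam * w      # N = lambda * W
    if lam is None:
        return n / w        # lambda = N / W
    return n / lam          # W = N / lambda

print(littles_law(n=8, lam=2))   # W = 4.0, the slide's example
```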
Simulations:
◦ Clock
◦ Random generation of:
  – processes
  – burst times
  – arrival & departure rates
Algorithm Evaluation
Based on probability distributions:
◦ Uniform
◦ Exponential
◦ Poisson
◦ Empirical
Implementation
Algorithm Evaluation
CPU Scheduler Evaluation by Simulation
UNIX schedulers
UNIX schedulers
Bands of Priorities
Priority Formula
Solaris Scheduling
Solaris Dispatch Table
Linux pre-2.6 O(n) Scheduler
O(n) scheduling – picking a task takes O(n) time, where n = # of tasks
One runqueue for all processors in a symmetric multiprocessor system
- Task can be scheduled on any CPU – good for load balancing
- Bad for memory-cache movements, e.g., a task previously on CPU1 is run on CPU2 ⇒ cache contents must move from cache1 to cache2
- Single runqueue ⇒ CPUs had to contend for a shared lock
- No preemption allowed ⇒ a higher-priority process may have to wait
Linux O(1) 2.6 Scheduler
- Each CPU has its own runqueue (priorities 1 to 140)
- Tasks are scheduled RR in a multilevel-feedback paradigm
- Tasks whose time quantum expires are moved to the Expired runqueue (priorities are recalculated)
SMP Tasks Indexed on Priorities
Linux O(1) 2.6 Scheduler
Why O(1)?
- The bitmap of priorities is read (each priority level points to a process list)
- Since the size of the bitmap is fixed at 140, selecting a process does not depend on the number of processes in the runqueue
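The constant-time pick can be sketched as a find-first-set-bit over the priority bitmap. The sketch below is illustrative, not kernel code: `runqueues` is a hypothetical mapping from priority level to a task list, and lower numbers mean higher priority.

```python
# O(1)-selection sketch: find the first set bit in the priority
# bitmap, then take the head of that priority's task list.
def pick_next(bitmap, runqueues):   # bitmap: int, runqueues: dict[int, list]
    if bitmap == 0:
        return None                               # no runnable task
    prio = (bitmap & -bitmap).bit_length() - 1    # index of lowest set bit
    return runqueues[prio][0]

bitmap = (1 << 5) | (1 << 100)         # tasks queued at priorities 5 and 100
runqueues = {5: ["taskA"], 100: ["taskB"]}
print(pick_next(bitmap, runqueues))    # taskA – priority 5 wins
```

The cost is independent of how many tasks are queued: only the fixed-width bitmap is scanned.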
Active runqueue pointer
- When the active runqueue is empty, the pointer is set to the expired runqueue, i.e., the expired runqueue becomes the active runqueue and vice versa
- Each CPU (in an SMP system) sets locks on its own runqueues ⇒ all CPUs can schedule without contention from other CPUs
Linux O(1) 2.6 Scheduler
Dynamic Priority Assignment
- CPU-bound processes are penalized (priority # increased by up to 5 levels)
- I/O-bound processes are rewarded (priority # decreased by up to 5 levels)
- An I/O-bound process uses the CPU only to set up I/O and is then suspended ⇒ giving other processes a chance to execute ⇒ "altruistic"
Heuristic for the I/O-bound or CPU-bound category?
- Interactivity heuristic based on: time the task executes compared with time it is suspended
- I/O-bound tasks' sleep time is large ⇒ increase in interactivity metric ⇒ rewarded
- Priority adjustments are only applied to user processes
Priority-based, preemptive. The dispatcher handles scheduling. The currently running thread is interrupted when:
◦ (a) it terminates, or
◦ (b) a higher-priority thread arrives, or
◦ (c) its time quantum expires, or
◦ (d) it does I/O
Gives real-time threads priority.
Windows XP Scheduling
Dispatcher uses multi-level priority scheme
Priorities are divided into two classes:◦ (a) variable (1-15)◦ (b) real-time (16-31)
A thread running at priority 0 is used for memory management.
Processes are rewarded/penalized for sleeping/using up CPU
Windows XP Scheduling
Windows XP Scheduling
Priority Class
Relative Priority