cello a disk scheduling framework for next generation operating systems – prashant j. shenoy and...

Cello
A Disk Scheduling Framework for Next Generation Operating Systems – Prashant J. Shenoy and Harrick M. Vin

Presented by Evan Clark
September 8, 2003

Contents

Disk Scheduler Basics
Cello Architecture
Performance Characteristics
Conclusions

Disk Scheduler Basics

What is a Disk Scheduler?

A part of the OS’s disk subsystem responsible for reordering disk requests to optimize disk access
Architecturally, a disk scheduler is either part of the file system, or it sits above it
Typically, a single disk scheduler manages one physical device, regardless of the number of logical partitions

Disk Performance

Transfer rate
Positional performance
– Head seek time
– Rotational latency
Performance is a function of where the head is and where the data resides
Access time is measured in milliseconds
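The components listed above can be combined into a rough per-request estimate. The following is a minimal sketch, not a model taken from the Cello paper: the function name `access_time_ms` and all drive parameters (seek time, RPM, transfer rate) are hypothetical values chosen for illustration, and the rotational latency is assumed to average half a revolution.

```python
def access_time_ms(seek_ms: float, rpm: float,
                   request_kb: float, transfer_mb_per_s: float) -> float:
    """Estimate the time to service one disk request, in milliseconds.

    Assumption: access time = seek time + average rotational latency
    + transfer time. All inputs are illustrative, not measured values.
    """
    # One revolution takes 60,000 / rpm milliseconds; on average the
    # target sector is half a revolution away from the head.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    # Transfer time, taking 1 MB as 1000 KB:
    # seconds = KB / (MB/s * 1000 KB/MB); then convert to milliseconds.
    transfer_ms = request_kb / (transfer_mb_per_s * 1000.0) * 1000.0
    return seek_ms + rotational_latency_ms + transfer_ms


if __name__ == "__main__":
    # Hypothetical drive: 8 ms seek, 7200 RPM, 50 MB/s, 64 KB request.
    print(f"{access_time_ms(8.0, 7200, 64, 50):.2f} ms")
```

With these assumed numbers the positional terms (seek plus rotation) dominate the transfer term, which is why the scheduling policies that follow focus on reordering requests rather than on transfer rate.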

Scheduling Policies

Seek-Reducing Policies
– Good overall throughput
Deadline-Oriented Scheduling Policies
– Can provide response-time guarantees

Seek-Reducing Policies

FCFS – First Come First Served
– Only efficient for very light, low-concurrency workloads
SPTF – Shortest Positioning-Time First
– Optimizes overall disk throughput
– Large worst-case response times (starvation)
– Needs intimate knowledge of the physical device
LOOK/SCAN
– Disk head services requests while sweeping in one direction at a time
– Good throughput
– Avoids starvation
– Large worst-case response times
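The throughput differences between these policies can be illustrated by comparing total head movement on one request list. This is a sketch under stated assumptions: the cylinder numbers are a made-up textbook-style example, the function names are hypothetical, `shortest_seek_first` uses seek distance alone as a stand-in for SPTF (true SPTF also accounts for rotational position), and `look` assumes the head starts out moving toward higher cylinders.

```python
def total_seek(start, order):
    """Total cylinders travelled when servicing `order` from `start`."""
    position, moved = start, 0
    for cylinder in order:
        moved += abs(cylinder - position)
        position = cylinder
    return moved


def fcfs(start, requests):
    """First Come First Served: service in arrival order."""
    return list(requests)


def shortest_seek_first(start, requests):
    """Greedy nearest-cylinder-first; a seek-only approximation of SPTF."""
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def look(start, requests):
    """LOOK: sweep upward to the last request, then reverse and sweep down."""
    upward = sorted(c for c in requests if c >= start)
    downward = sorted((c for c in requests if c < start), reverse=True)
    return upward + downward


if __name__ == "__main__":
    head = 53
    queue = [98, 183, 37, 122, 14, 124, 65, 67]
    for policy in (fcfs, shortest_seek_first, look):
        print(policy.__name__, total_seek(head, policy(head, queue)))
```

On this particular list the greedy policy travels least, FCFS travels most, and LOOK falls in between while bounding how long any request can be bypassed.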

Deadline-Oriented Scheduling Policies

Time Slicing
– Tradeoff between throughput and average response time
– Not work-conserving
EDF – Earliest Deadline First
– Poor throughput due to high positional latency
– Good at servicing requests in order of importance
FD-SCAN – Feasible Deadline SCAN
– Scan in the direction of the request with the next deadline
– Good statistical performance
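The difference between EDF and FD-SCAN can be sketched as follows. This is an illustration only: the `Request` type, the function names, and the linear seek-time estimate `ms_per_cylinder` are assumptions introduced here, and a request is treated as feasible when the head could reach its cylinder before its deadline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    cylinder: int
    deadline_ms: float


def edf_order(requests):
    """EDF: service strictly by deadline, ignoring head position."""
    return sorted(requests, key=lambda r: r.deadline_ms)


def fd_scan_direction(head, now_ms, requests, ms_per_cylinder=0.1):
    """Pick the sweep direction for FD-SCAN.

    Returns +1 (toward higher cylinders), -1 (toward lower cylinders),
    or 0 when no request has a deadline that can still be met.
    The seek estimate (distance * ms_per_cylinder) is a hypothetical model.
    """
    feasible = [
        r for r in requests
        if now_ms + abs(r.cylinder - head) * ms_per_cylinder <= r.deadline_ms
    ]
    if not feasible:
        return 0
    target = min(feasible, key=lambda r: r.deadline_ms)
    if target.cylinder == head:
        return 0
    return 1 if target.cylinder > head else -1


if __name__ == "__main__":
    queue = [Request(180, 5.0), Request(20, 30.0), Request(90, 50.0)]
    print([r.cylinder for r in edf_order(queue)])
    print(fd_scan_direction(head=50, now_ms=0.0, requests=queue))
```

In this example EDF would first chase the request at cylinder 180 even though its deadline cannot be met, whereas the FD-SCAN sketch skips it and sweeps toward cylinder 20, the earliest deadline that is still reachable.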

Application Classes

Real-time applications
– Hard: require worst-case response time guarantees
– Soft: require average response time guarantees
Best-effort applications
– Interactive: require low average response times
– Throughput-intensive: require overall system performance

Policy Suitability

Real-time applications
– Hard: EDF, etc.
– Soft: FD-SCAN, etc.
Best-effort applications
– Interactive: SCAN, etc.
– Throughput-intensive: SPTF, etc.

Similarities with Thread Scheduling

Performance requirements with respect to application classes
Coexistence – you must protect one application class from another

Cello Architecture

Structure

Hierarchy of queues
Requests move from one queue to another by a combined effort of the class-specific scheduler and the class-independent scheduler

Each application class has:
– A weight, w_i, which is used to partition bandwidth
– A pending queue
– A class-specific scheduler

Class-specific schedulers are responsible for:
– Ordering their pending queues
– Placing requests of their class into the scheduled queue

The class-independent scheduler is responsible for:
– Regulating the entry of requests into the scheduled queue
– Giving the class-specific schedulers information about the time requirements of requests in the scheduled queue

The Life Cycle of a Request

1. A new request is handed to the scheduler
2. Its class-specific scheduler, S_i, places it in its pending queue, where it waits its turn
3. The class-independent scheduler, C, asks S_i for a request
4. C gives the request back to S_i and asks it to place it in the scheduled queue
5. C tells S_i whether or not it rejects the placement
6. Once the scheduled queue is as full as it will get, each request is sent to the file system in order
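The six steps above can be sketched as a small two-level scheduler. This is an illustrative toy, not Cello's implementation: the class names, method names, the FIFO ordering inside each class, and the per-round budget counted in request slots (rather than in disk time) are all assumptions made for brevity.

```python
from collections import deque


class FifoClassScheduler:
    """Hypothetical class-specific scheduler S_i with a FIFO pending queue."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()

    def enqueue(self, request):
        # Step 2: the new request waits in this class's pending queue.
        self.pending.append(request)

    def next_request(self):
        # Step 3: offer the request this class wants serviced next.
        return self.pending[0] if self.pending else None

    def propose_position(self, request, scheduled):
        # Step 4: choose where the request goes in the scheduled queue.
        # This toy policy simply appends at the tail.
        return len(scheduled)

    def placed(self, request):
        # Step 5 (accepted): the request leaves the pending queue.
        self.pending.popleft()


class ClassIndependentScheduler:
    """Hypothetical class-independent scheduler C."""

    def __init__(self, schedulers, slots_per_class):
        self.schedulers = list(schedulers)
        self.by_name = {s.name: s for s in self.schedulers}
        self.slots = dict(slots_per_class)  # assumed budget: requests per round

    def submit(self, class_name, request):
        # Step 1: a new request is handed to the scheduler.
        self.by_name[class_name].enqueue(request)

    def fill_round(self):
        scheduled = []
        for s in self.schedulers:
            used = 0
            while True:
                request = s.next_request()                         # step 3
                if request is None:
                    break
                position = s.propose_position(request, scheduled)  # step 4
                accepted = used < self.slots[s.name]               # step 5
                if not accepted:
                    break  # rejected: the request stays pending
                scheduled.insert(position, request)
                s.placed(request)
                used += 1
        return scheduled  # step 6: dispatched in this order


if __name__ == "__main__":
    rt, be = FifoClassScheduler("realtime"), FifoClassScheduler("besteffort")
    c = ClassIndependentScheduler([rt, be], {"realtime": 2, "besteffort": 1})
    for r in ("a", "b", "c"):
        c.submit("realtime", r)
    for r in ("x", "y"):
        c.submit("besteffort", r)
    print(c.fill_round())
    print(c.fill_round())
```

Keeping the placement decision in the class-specific scheduler and the accept/reject decision in the class-independent scheduler mirrors the division of responsibilities described on the Structure slides.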

Proportionate Time-Allocation

Each application class is given a piece of the interval, P, proportional to its weight, w_i
C asks S_i for additional requests until that application class’s allotment is filled
Unused time is divided among classes with pending requests
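The allocation rule above can be sketched numerically. The slide does not say how unused time is split, so this sketch assumes it is redistributed in proportion to the weights of the classes that still have pending work; the function name `allocate_interval`, the class names, and the demand figures are hypothetical, and weights are assumed to be positive.

```python
def allocate_interval(interval_ms, weights, demand_ms):
    """Split an interval P among classes in proportion to their weights w_i.

    Time a class cannot use (its demand is smaller than its share) is
    handed to classes that still have pending requests, again in
    proportion to their weights. This redistribution rule is an assumption.
    """
    alloc = {c: 0.0 for c in weights}
    remaining = float(interval_ms)
    active = {c for c in weights if demand_ms.get(c, 0.0) > 0.0}
    while active and remaining > 1e-9:
        total_weight = sum(weights[c] for c in active)
        shares = {c: remaining * weights[c] / total_weight for c in active}
        satisfied = set()
        for c in active:
            # Grant the weighted share, capped at what the class still needs.
            alloc[c] += min(shares[c], demand_ms[c] - alloc[c])
            if demand_ms[c] - alloc[c] <= 1e-9:
                satisfied.add(c)
        remaining = interval_ms - sum(alloc.values())
        active -= satisfied
    return alloc


if __name__ == "__main__":
    weights = {"realtime": 2, "interactive": 1, "throughput": 1}
    demand = {"realtime": 100.0, "interactive": 600.0, "throughput": 900.0}
    print(allocate_interval(1000.0, weights, demand))
```

In the example the real-time class is entitled to half the interval but needs only 100 ms, so the 400 ms it leaves unused is split between the two best-effort classes.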

Proportionate Byte-Allocation

In each interval, an application class is allowed to process an amount of data proportional to its weight, w_i

Performance Characteristics

Utilization

Very good
Cello is a work-conserving scheduler
Periodic requests are handled at whatever frequency is defined by the application
Best-effort requests fill in the gaps left by periodic requests

Response Time

Can be tuned as needed by the application class
Mixed request types are interleaved
Computational overhead is acceptable

Conclusions

Questions/Concerns

“The two-level framework cleanly separates class-independent mechanisms from class-specific scheduling policies”
Deceptive idleness – a phenomenon in work-conserving schedulers in which a requester is assumed to be idle because it has no requests pending at decision points
– How does high utilization (>80%) affect mixed-use performance?
– Wouldn’t using a 1000 ms interval cause problems at high utilization?

A Step Back

Good utilization
Mixed-use response time is very good
In the presence of homogeneous requests, the scheduling is optimal for that application class
Used in QLinux version 2.4.x, but is reportedly “not stable yet”
Applications must classify themselves among the available classes:
– Interactive best effort
– Throughput-intensive best effort
– Real time
