![Page 1: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/1.jpg)
Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
Tal Ben-Nun, Scl. Eng & CS, Hebrew University
Yoav Etsion, CS Dept, Barcelona SC Ctr
Dror Feitelson, Scl. Eng & CS, Hebrew University
Supported by the Israel Science Foundation, grant no. 28/09
![Page 6: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/6.jpg)
Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher
• Goal is to control the share of resources, not to optimize performance (important in virtualization)
• Same module used for diverse resources
• Mechanism used: dispatch the most deserving client at each instant
• Selection of the deserving client uses the virtual time formalism
• Implemented and measured in Linux
![Page 7: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/7.jpg)
Motivation
Context: VMM for server consolidation
• Multiple legacy servers share a physical platform
• Improved utilization and easier maintenance
• Flexibility in allocating resources to virtual machines
• Virtual machines typically run a single application (“appliances”)
![Page 8: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/8.jpg)
Motivation
Assumed goal: enforce a predefined allocation of resources to different virtual machines (“fair share” scheduling)
• Based on importance / SLA
• Can change with time or due to external events
Problem: what is “30% of the resources” when there are many different resources and diverse requirements?
![Page 9: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/9.jpg)
Global Scheduling
• “Fair share” is usually applied to a single resource
– But what if this resource is not a bottleneck?
• Global scheduling idea:
1) Identify the system bottleneck resource
2) Apply fair share scheduling on this resource
3) This induces appropriate allocations on other resources
• This paper: how to apply fair-share scheduling on any resource in the system
![Page 10: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/10.jpg)
Previous Work I: Virtual Time
• Accounting is inversely proportional to allocation
• Schedule the client that is farthest behind
![Page 11: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/11.jpg)
Previous Work II: Traffic Shaping
• Leaky bucket
– Variable requests
– Constant rate transmission
– Bucket represents buffer
• Token bucket
– Variable requests
– Constant allocations
– Bucket represents stored capacity
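As a rough illustration of the token bucket described on this slide, the following Python sketch accrues capacity at a constant rate and serves variable requests from the stored capacity. The class name, method names, and units are hypothetical and are not taken from the talk.

```python
class TokenBucket:
    """Token bucket sketch: capacity accrues at a constant rate, up to a cap."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per unit time (hypothetical unit)
        self.capacity = capacity  # maximum stored capacity
        self.tokens = capacity    # assumption: the bucket starts full

    def tick(self, elapsed):
        # Constant allocations: add tokens for the elapsed time, capped.
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed)

    def request(self, amount):
        # Variable requests: grant only if enough stored capacity exists.
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False
```

A leaky bucket differs in that the bucket buffers the requests themselves and drains them at a constant rate.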
![Page 12: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/12.jpg)
Putting them Together: RSVT
• “Resource sharing”: all clients make progress continuously
– Generalization of processor sharing
• Each job has its ideal resource sharing progress
– This is considered to be the allocation $a_i$
– Grows at constant rate
• Each job has its actual consumption $c_i$
– Grows only when job runs
• Scheduling priority is the difference: $p_i = a_i - c_i$
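The priority rule can be sketched in a few lines of Python. The dictionary layout and the client names below are made up for illustration; only the rule "dispatch the client with the largest allocation minus consumption" comes from the slide.

```python
def most_deserving(clients):
    """Return the name of the client with the largest p_i = a_i - c_i."""
    return max(clients, key=lambda name: clients[name]["a"] - clients[name]["c"])


# Hypothetical snapshot of three clients (a = allocation, c = consumption).
clients = {
    "vm1": {"a": 50.0, "c": 45.0},  # p = 5
    "vm2": {"a": 30.0, "c": 20.0},  # p = 10
    "vm3": {"a": 20.0, "c": 18.0},  # p = 2
}
```

Here `most_deserving(clients)` selects `vm2`, the client that is farthest behind its ideal progress.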
![Page 13: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/13.jpg)
Example
• Three clients
• Allocations roughly 50%, 30%, 20%
• Consumption always occurs in resource time
[Figure: consumed resource time versus wallclock time]
![Page 14: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/14.jpg)
Bookkeeping
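• The set of active jobs is $A$
• The relative allocation of job $i$ is $r_i$
• During an interval $T$ job $k$ has run
• Update allocations:
$$a_i \leftarrow a_i + \frac{r_i}{\sum_{j \in A} r_j}\, T$$
• Update consumptions:
$$c_i \leftarrow c_i + \begin{cases} T & i = k \\ 0 & \text{otherwise} \end{cases}$$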
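The bookkeeping can be tried out with a small simulation. This is a minimal sketch, assuming every job stays active, each interval has the same length, and the dispatcher picks the job with the largest allocation minus consumption at the start of each interval; the function names and job names are hypothetical.

```python
def step(alloc, cons, rel, active, T):
    """Dispatch for one interval of length T, then update a_i and c_i."""
    # The most deserving job has the largest priority a_i - c_i.
    k = max(active, key=lambda i: alloc[i] - cons[i])
    total = sum(rel[j] for j in active)
    for i in active:
        alloc[i] += rel[i] / total * T  # allocation update for every active job
    cons[k] += T                        # consumption update: only job k ran
    return k


def simulate(rel, steps, T=1.0):
    """Run `steps` intervals with every job active throughout."""
    alloc = {i: 0.0 for i in rel}
    cons = {i: 0.0 for i in rel}
    for _ in range(steps):
        step(alloc, cons, rel, list(rel), T)
    return alloc, cons


# Relative allocations 5:3:2, i.e. shares of 50%, 30% and 20%.
alloc, cons = simulate({"A": 5, "B": 3, "C": 2}, steps=1000)
```

In this run each job's consumption stays within a couple of intervals of its allocation, so the long-run shares approach 50%, 30% and 20%.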
![Page 15: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/15.jpg)
The Active Set
• Active jobs (the set A) are those that can use the resource now
• Allocations are relative to the active set
• The active set may change
– New job arrives
– Job terminates
– Job stops using resource temporarily
– Job resumes use of resource
![Page 16: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/16.jpg)
Grace Period
• Intermittent activity: process data / send packet
• Such jobs should retain their allocations even when inactive
• Thus ai continues to grow during a grace period after the job becomes inactive
• Grace period reflects a notion of continuity
• Sub-second time scale
![Page 17: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/17.jpg)
Rebirth
• Resumption after a very long inactive period should be treated as a new arrival
• Due to the grace period, a job that becomes inactive accrues extra allocation
• Forget this extra allocation after the rebirth period (set ai = ci)
• The rebirth period is two orders of magnitude larger than the grace period
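The interplay of the grace period and the rebirth period can be sketched as follows. The constants are placeholder values chosen only to respect "sub-second" and "two orders of magnitude larger"; the class and method names are hypothetical and do not reflect the actual module.

```python
GRACE = 0.1     # seconds; hypothetical value (sub-second)
REBIRTH = 10.0  # seconds; hypothetical value (100x the grace period)


class Client:
    def __init__(self):
        self.a = 0.0            # allocation
        self.c = 0.0            # consumption
        self.idle_since = None  # time at which the client stopped using the resource

    def accrues_allocation(self, now):
        """True if the allocation should keep growing at time `now`."""
        if self.idle_since is None:
            return True                         # still using the resource
        return now - self.idle_since <= GRACE   # idle, but within the grace period

    def check_rebirth(self, now):
        """After a long idle period, forget the extra allocation (a = c)."""
        if self.idle_since is not None and now - self.idle_since >= REBIRTH:
            self.a = self.c
```

A client that resumes shortly after going idle keeps the allocation it accrued, while one that returns after the rebirth period starts with zero priority, like a new arrival.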
![Page 18: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/18.jpg)
Implementation
• Kernel module with generic functionality
– Create / destroy module
– Create / destroy client
– Make request / set active / set inactive
– Make allocations
– Dispatch
– Check-in (note resource usage)
• Glue code for specific subsystems
– Currently networking and CPU
– Plan to add disk I/O
![Page 19: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/19.jpg)
Networking Glue Code
Use the Linux QoS framework: create RSVT queueing discipline
[Figure: network stack layers App, TCP, IP, QoS (queueing discipline), NIC]
![Page 20: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/20.jpg)
Networking Glue Code
Non-RSVT traffic has priority (e.g. NFS traffic) and is counted as dead time
[Figure: packet path through App, TCP, IP and NIC. RSVT traffic? No: send immediately. Yes: enqueue, then select and send.]
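The classification step on this slide can be sketched as below. This is an illustration in user-space Python, not the Linux queueing-discipline API: the packet representation, the use of a missing `client` field to mark non-RSVT traffic, and accounting dead time in bytes are all assumptions. The "select and send" step is left out.

```python
from collections import deque


class RsvtQdiscSketch:
    def __init__(self):
        self.queues = {}     # client name -> deque of pending packets
        self.sent = []       # packets passed straight through to the NIC
        self.dead_bytes = 0  # traffic outside RSVT control, counted as dead time

    def handle(self, packet):
        client = packet.get("client")  # assumption: None marks non-RSVT traffic
        if client is None:
            # Non-RSVT traffic has priority: send immediately, account as dead time.
            self.sent.append(packet)
            self.dead_bytes += packet["size"]
        else:
            # RSVT traffic waits in its client's queue until it is selected.
            self.queues.setdefault(client, deque()).append(packet)
```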
![Page 21: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/21.jpg)
CPU Scheduling Glue Code
• Use Linux modular scheduling core
• Add an RSVT scheduling policy
– RSVT module essentially replaces the policy runqueue
– Initial implementation only for uniprocessors
• CFS and possibly other policies also exist and have higher priority
– When they run, this is considered dead time
![Page 22: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/22.jpg)
Timer Interrupts
• Linux employs timer interrupts (250 Hz)
• Allocations are done at these times
– Translate time into microseconds
– Subtract known dead time (unavailable to us)
– Divide among active clients according to relative allocations
– Bound divergence of allocation from consumption
• Also handling of grace period (mark as inactive)
• Also handling of rebirth (set ai = ci)
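The per-tick allocation steps can be sketched as follows. At 250 Hz a tick is 4000 microseconds. The function name, the `max_lead_us` parameter, and the particular way divergence is bounded (capping allocation at consumption plus a fixed lead) are assumptions made for illustration; the slides do not specify how the bound is applied.

```python
HZ = 250
TICK_US = 1_000_000 // HZ  # 4000 microseconds per timer tick


def allocate_tick(alloc, cons, rel, active, dead_us, max_lead_us):
    """Distribute one tick of resource time among the active clients."""
    # Subtract known dead time, which is unavailable to RSVT clients.
    available = max(0, TICK_US - dead_us)
    total = sum(rel[i] for i in active)
    for i in active:
        # Divide according to relative allocations.
        alloc[i] += available * rel[i] / total
        # Assumed bound: allocation may lead consumption by at most max_lead_us.
        alloc[i] = min(alloc[i], cons[i] + max_lead_us)
```

For example, with relative allocations 3 and 1, 1000 microseconds of dead time and a 2000 microsecond bound, the first client would be credited 2250 microseconds but is capped at 2000, and the second receives 750.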
![Page 23: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/23.jpg)
Multi-Queue
• At dispatch, need to find client with highest priority
• But priorities change at different rates
• Solution: allow only a limited discrete set of relative priorities
• Each priority has a separate queue
• Maintain all clients in each queue in priority order
• Only need to check the first in each queue to find the maximum
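One way to realize the multi-queue idea is sketched below. Clients with the same relative allocation gain allocation at the same rate, so their order within a queue is unaffected by allocation updates. The sketch exploits this by storing each priority relative to a per-queue growth counter; that offset trick, the use of `heapq`, and all names are assumptions, not the authors' implementation.

```python
import heapq


class MultiQueue:
    def __init__(self, rates):
        self.growth = {r: 0.0 for r in rates}  # allocation accrued per client, per queue
        self.heaps = {r: [] for r in rates}    # min-heaps of (growth_at_add - priority, name)

    def add(self, rate, name, priority=0.0):
        heapq.heappush(self.heaps[rate], (self.growth[rate] - priority, name))

    def advance(self, T):
        """Credit an interval T of allocation to all queued clients at once."""
        total = sum(r * len(h) for r, h in self.heaps.items())
        if total:
            for r in self.growth:
                self.growth[r] += r / total * T

    def peek_max(self):
        """Check only the head of each queue; return (priority, rate, name) or None."""
        heads = [(self.growth[r] - h[0][0], r, h[0][1])
                 for r, h in self.heaps.items() if h]
        return max(heads) if heads else None

    def charge(self, rate, T):
        """Charge T of consumption to the head of a queue and reinsert it."""
        key, name = heapq.heappop(self.heaps[rate])
        heapq.heappush(self.heaps[rate], (key + T, name))
```

A dispatch step would call `peek_max()`, run the returned client, then call `charge()` on its queue and `advance()` for the elapsed interval. Finding the maximum costs one comparison per queue rather than one per client.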
![Page 24: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/24.jpg)
Experiment – Basic Allocations
rate   bandwidth
1      30.89 ± 0.05
2      61.41 ± 0.02
![Page 25: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/25.jpg)
Experiment – Basic Allocations
rate   bandwidth
1      15.69 ± 0.11
2      30.81 ± 0.03
3      46.10 ± 0.03
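As a quick arithmetic check, the measured bandwidths on this slide can be compared with the shares implied by the rates 1:2:3. The sketch reads the mean values as 15.69, 30.81 and 46.10 (units as in the slide, which does not state them); the variable names are made up.

```python
rates = [1, 2, 3]
bandwidth = [15.69, 30.81, 46.10]  # mean values read from the slide

# Share each client should get, and the share it actually measured.
ideal = [r / sum(rates) for r in rates]
measured = [b / sum(bandwidth) for b in bandwidth]

# Largest absolute deviation between measured and ideal share.
max_dev = max(abs(m - i) for m, i in zip(measured, ideal))
```

Under this reading, every measured share is within half a percentage point of its ideal share of 1/6, 1/3 or 1/2.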
![Page 26: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/26.jpg)
Experiment – Active Set
![Page 27: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/27.jpg)
Experiment – Grace Period
![Page 28: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/28.jpg)
Experiment – Rebirth
![Page 29: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/29.jpg)
Experiment – Throttling
• Two competing MPlayers
• The one with higher allocation does not need all of it
– Allocation tracks consumption
![Page 30: Design and Implementation of a Generic Resource-Sharing Virtual-Time Dispatcher](https://reader034.vdocuments.mx/reader034/viewer/2022042616/56813a81550346895da27d96/html5/thumbnails/30.jpg)
Conclusions
• Demonstrated generic virtual-time based resource sharing dispatcher
• Need to complete implementation
– Support for I/O scheduling
– More details, e.g. SMP support
• Building block of global scheduling vision