dios
DESCRIPTION
Presentation for an OS class on DIOS, our scheduling system that took real-time attributes from running systems to change scheduling behavior.
TRANSCRIPT
DIOS: Dynamic Instrumentation for (not so) Outstanding Scheduling
Blake Sutton & Chris Sosa
Motivation
Scheduling jobs on a group of machines (cluster / distributed operating system)
Don’t know what to expect at submission time!
Memory contention
Migrate processes away to a better place...
Approach: Adaptive Distributed Scheduler
Monitor machines and processes to inform migration decisions.
Gather application-specific info and feed to local schedulers.
Global scheduler collects local schedulers’ observations and uses information on all machines and all applications to make decisions. Migrate? Which one? Where? Pause? Which one? How long?
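A minimal sketch of the global scheduler's decision step described above. This is a hypothetical Python stand-in, not DIOS's actual code: the observation shape, the machine names, and the pick-the-heaviest-process heuristic are all assumptions for illustration.

```python
def choose_migration(observations):
    """Given each local scheduler's observations, pick a process to
    migrate: take the heaviest memory user on the most memory-pressured
    machine and send it to the machine with the most free memory.
    observations: {machine: {"free_mem": fraction, "procs": {pid: mem_op_rate}}}
    Returns (pid, source_machine, target_machine) or None.
    """
    src = min(observations, key=lambda m: observations[m]["free_mem"])
    dst = max(observations, key=lambda m: observations[m]["free_mem"])
    if src == dst or not observations[src]["procs"]:
        return None  # nowhere better to go, or nothing to move
    victim = max(observations[src]["procs"],
                 key=observations[src]["procs"].get)
    return victim, src, dst


obs = {
    "realitytv13": {"free_mem": 0.05, "procs": {101: 9.2, 102: 3.1}},
    "realitytv14": {"free_mem": 0.60, "procs": {201: 1.0}},
}
choose_migration(obs)  # -> (101, "realitytv13", "realitytv14")
```

Answering "Pause? How long?" would need extra state (e.g. how long pressure has persisted), which this sketch omits.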
Dynamic Instrumentation with Pin
Insert new code into apps on the fly: no recompile, operates on a copy, code cache
Our Pintool: routine-level and instruction-level instrumentation
Application-Specific Information
Want to capture memory behavior over time
We gathered: Ratio of malloc to free calls
Wall-clock time to execute 10,000,000 insns
Number of memory ops in last 2,000,000 insns
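The three metrics above can be sketched as a single tracker. This is a hypothetical Python stand-in (the real tool is a Pin plugin); the class name and callback names are invented, and the window sizes are made configurable only so the sketch is testable — the slide's values are the defaults.

```python
import time


class MemoryBehaviorTracker:
    """Tracks the per-process metrics the Pintool gathers:
    malloc/free ratio, wall-clock time per 10,000,000 instructions,
    and memory ops in each 2,000,000-instruction window."""

    def __init__(self, timing_window=10_000_000, memop_window=2_000_000):
        self.timing_window = timing_window
        self.memop_window = memop_window
        self.mallocs = 0
        self.frees = 0
        self.insns = 0
        self._memops = 0
        self._t0 = time.perf_counter()
        self.wall_clock_samples = []  # seconds per timing_window insns
        self.memop_samples = []       # memory ops per memop_window insns

    def on_malloc(self):
        self.mallocs += 1

    def on_free(self):
        self.frees += 1

    def on_instruction(self, is_mem_op=False):
        self.insns += 1
        if is_mem_op:
            self._memops += 1
        if self.insns % self.timing_window == 0:
            now = time.perf_counter()
            self.wall_clock_samples.append(now - self._t0)
            self._t0 = now
        if self.insns % self.memop_window == 0:
            self.memop_samples.append(self._memops)
            self._memops = 0

    def malloc_free_ratio(self):
        return self.mallocs / max(self.frees, 1)
```

In the real Pintool these callbacks would be registered via Pin's routine- and instruction-level instrumentation hooks rather than called from Python.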
Evaluation
Distributed scheduler: Rhino on realitytv16, Hare on realitytv13-16
Looks for % memory free and restarts the youngest job (heatedplate) with modified parameters
Baseline: queue balancing
Pintool: 2 applications from SPLASH-2 (ocean, lu), plus heatedplate
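The restart policy in the evaluation can be sketched as follows. This is a hedged illustration, not Rhino's actual code: the 10% free-memory threshold and the (job_id, start_time) job shape are assumptions; the slides only say it "looks for % memory free and restarts the youngest job."

```python
def job_to_restart(jobs, free_mem_fraction, threshold=0.10):
    """When the fraction of free memory drops below a threshold,
    pick the youngest job to restart with modified parameters.
    jobs: list of (job_id, start_time); youngest = latest start.
    Returns a job_id, or None if memory pressure is acceptable."""
    if free_mem_fraction >= threshold:
        return None  # enough memory free; leave the queue alone
    return max(jobs, key=lambda job: job[1])[0]


jobs = [("hp-1", 100.0), ("hp-2", 250.0), ("hp-3", 180.0)]
job_to_restart(jobs, 0.05)  # -> "hp-2" (started most recently)
job_to_restart(jobs, 0.50)  # -> None
```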
The Good
Potential for improvement
Lower total runtime with simple policy
Restart youngest
The Bad
Overhead from Pintool is too high to realize gains
Pin isn't designed for on-the-fly analysis: couldn't attach/detach, and code caching can't save it
Slowdown relative to native (native = 1.00):

application    native   only pin   count   malloc/free   # mems   latency
heatedplate    1.00     1.88       2.65    5.43          7.45     7.26
ocean          1.00     1.48       2.87    7.84          6.04     5.81
lu             1.00     1.25       6.27    14.51         7.90     7.64
The “Interesting”
Pintool does capture intriguing info…
Conclusion: the Future of DIOS
Overhead is prohibitive – for now
Add attach/detach; use a lighter instrumentation framework
But instrumentation can capture aspects of application-specific behavior!
Marty was right.
Find out the final answer: 9am 5/9, MEC215.
Questions?
Wait…hasn’t this been solved?
Condor: popular user-space distributed scheduler with process migration; tries to keep queues balanced
but jobs behave differently over time and from one another
LSF (Load Sharing Facility): monitors the system, moves processes around based on what they need; must input static job information (requires profiling etc. beforehand)
What if something about your job isn't captured by your input? What if the margins you give are too large? Too small? Unnecessary inefficiencies? It's not exactly hassle-free...
Hardware feedback (PAPI): still not very portable (invasive kernel patch required to install)
Wouldn't it be nice if the scheduler could just..."do the right thing"?