![Page 1: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/1.jpg)
http://www.istc-cc.cmu.edu/
Charles Reiss*, Alexey Tumanov†,Gregory R. Ganger†, Randy H. Katz*,Michael A. Kozuch‡
* UC Berkeley † CMU ‡ Intel Labs
Heterogeneity and Dynamicityof Clouds at Scale:Google Trace Analysis
![Page 2: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/2.jpg)
2
Lessons for scheduler designers
Challenges resulting from cluster consolidation
Our goal
![Page 3: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/3.jpg)
3
Run all workloads on one cluster
Increased efficiency-Fill in “gaps” in interactive workload-Delay batch if interactive demand
spikes Increased flexibility
-Share data between batch and interactive
Cluster consolidation
![Page 4: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/4.jpg)
4
Released Nov 2011
“make visible many of the scheduling complexities that affect Google's workload”
Challenges motivating second system [Omega]:
-Scale-Flexibility-Complexity
The Google trace
![Page 5: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/5.jpg)
5
Mixed types of workloadWould be separate clusters
elsewhere
What cluster scheduler sees
Over ten thousand machines, one month
No comparable public trace in scale and variety
The Google trace
![Page 6: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/6.jpg)
6
High-performance/throughput computing:large, long-lived jobs; often gang-scheduled; CPU and/or memory intensive
DAG of Tasks (e.g. MapReduce):jobs of similar small, independent tasks
Interactive services (e.g. web serving):indefinite-length ‘jobs’; variable demand;pre-placed servers
Background: Common workloads
![Page 7: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/7.jpg)
7
Units of work are interchangeable(in space or time; to a scheduler)
Scheduler acts infrequently (or simply)
Tasks will indicate what resources they require
Machines are interchangeable
Assumptions this trace breaks
![Page 8: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/8.jpg)
8
tasks (25M): ‘run a program somewhere once’
-more like MapReduce worker than MR task
-Linux containers (shared kernel; isolation)
-may fail and be retried (still same task)
jobs (650k): collections of related tasks-no formal coscheduling requirement
machines (12.5k): real machines
Terminology and sizes
![Page 9: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/9.jpg)
9
Units of work are interchangeable(in space or time; to a scheduler)
Scheduler acts infrequently (or acts simply)
Tasks will indicate what resources they require
Machines are interchangeable
Assumptions this trace breaks
![Page 10: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/10.jpg)
10
Mixed workload:Task sizes
~ order of magnitude
no fixed “slots”
![Page 11: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/11.jpg)
11
Mixed workload:Job durations
Length of trace
Median duration: ~3 min
Long tail of hours-long and days-long jobs
![Page 12: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/12.jpg)
12
Mixed workload:Long-running tasks are most usage
<4% of CPU-time
<2% of RAM-time
sub-hour jobs (90% of jobs)
![Page 13: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/13.jpg)
13
Units of work are interchangeable(in space or time; to a scheduler)
Scheduler acts infrequently (or acts simply)
Tasks will indicate what resources they require
Machines are interchangeable
Assumptions this trace breaks
![Page 14: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/14.jpg)
14
Fast-moving workload:100k+ of decisions per hour
![Page 15: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/15.jpg)
15
Fast-moving workload:100k+ decisions per hour
![Page 16: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/16.jpg)
16
Fast-moving workload:Crash-loops
![Page 17: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/17.jpg)
17
Fast-moving workload:Evictions
![Page 18: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/18.jpg)
18
Most evictions for higher-priority tasks:-Coincide with those tasks starting-0.04 evictions/task-hour for lowest
priority
A few for machine downtime:-40% of machines down once in the
month-Upgrades, repairs, failures
Fast-moving workload:Evictions
![Page 19: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/19.jpg)
19
Units of work are interchangeable(in space or time; to a scheduler)
Scheduler acts infrequently (or acts simply)
Tasks will indicate what resources they require
Machines are interchangeable
Assumptions this trace breaks
![Page 20: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/20.jpg)
20
[utilization graph]
How busy is the cluster?from requests; split by priorities (red=high)
"production"(higher prio)
lower prio
![Page 21: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/21.jpg)
21
How busy is the cluster really?what was usedwhat was asked for (and run)
Lower priorities
Higher priorities
![Page 22: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/22.jpg)
22
Requests estimate worst-case usage
~60% of request/usage difference from difference between worst/average usage:
-Average task versus worst task in job
-Average usage versus worst usage in task
But not enough to explain request/usage gap
Request accuracy:Maximum versus Average
![Page 23: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/23.jpg)
23
Request accuracy:Requests are people
![Page 24: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/24.jpg)
24
Units of work are interchangeable(in space or time; to a scheduler)
Scheduler acts infrequently (or acts simply)
Tasks will indicate what resources they require
Machines are interchangeable
Assumptions this trace breaks
![Page 25: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/25.jpg)
25
Not all machines are equal:Machine types
Count Platform CPU Memory6732 B 0.50 0.50
3863 B 0.50 0.25
1001 B 0.50 0.75
795 C 1.00 1.00
126 A 0.25 0.25
<100 B and C (various) (various)
Three micro-architectures
Factor of 4
![Page 26: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/26.jpg)
26
Tasks can restrict acceptable machines(for reasons other than resources)
Used by ~6% of tasks
Examples:-Some jobs require each task to be
on a different machine
-Some tasks avoid 142 marked machines
Not all machines are equal:Task constraints
![Page 27: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/27.jpg)
27
New scheduling challenges for mixed workloads:
Complex task requests-Order of magnitude range of
resources-Extra constraints-Matched against variety of machines
Rapid scheduling decisions-Short tasks (with little utilization)-Restarted tasks
Users’ requests not enough for high utilization
Conclusion
![Page 28: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/28.jpg)
28
![Page 29: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/29.jpg)
29
[Backup/Discarded Slides]
![Page 30: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/30.jpg)
30
Resource requests = worst-case usage
Estimate: "ideal" request ≈ high percentile of usage within each job
Imperfect:Opportunistic usageAssumes outliers are spuriousDoesn't account for peaksExtra capacity needed for failover,
etc.
Request accuracy:Evaluating
![Page 31: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/31.jpg)
31
Heterogenous: Machines, type of work varies
Some scheduling strategies won't work
Space on a machine variesDynamic: Work comes fast
Not just initial submissionsOnly a small amount matters for
utilizationResource requests are suboptimal
Users do not make good requestsResource requirements may vary
Conclusion
![Page 32: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/32.jpg)
32
Request accuracy:Not very
= ideal requests
excessive requests
insufficient requests
two-thirds of RAM-time requested "accurately"
![Page 33: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/33.jpg)
33
Predictable usage:Usage stability
![Page 34: Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](https://reader036.vdocuments.mx/reader036/viewer/2022062222/56816711550346895ddb7912/html5/thumbnails/34.jpg)
34
Mixed workload:Daily patterns
Interactive services?
Batch+development?