challenges in optimizing job scheduling on mesos
![Page 1: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/1.jpg)
Challenges in Optimizing Job Scheduling on Mesos
Alex Gaudio
![Page 2: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/2.jpg)
![Page 3: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/3.jpg)
● Data Scientist and Engineer at Sailthru
● Mesos User
● Creator of Relay.Mesos
Who Am I?
![Page 4: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/4.jpg)
● Data Scientist and Engineer at Sailthru
  ○ Distributed Computation and Machine Learning
● Mesos User
  ○ 1 year
● Creator of Relay.Mesos
  ○ intelligently auto-scale Mesos tasks
Who Am I?
![Page 5: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/5.jpg)
![Page 6: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/6.jpg)
What are the goals of this talk?
1. Understand the problem of job scheduling using basic principles
![Page 7: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/7.jpg)
What are the goals of this talk?
1. Understand the problem of job scheduling using basic principles
2. Learn ways to think about, use or develop Mesos more effectively
![Page 8: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/8.jpg)
What are the goals of this talk?
1. Understand the problem of job scheduling using basic principles
2. Learn how to think about and use or develop Mesos more effectively
3. Have some fun along the way!
![Page 9: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/9.jpg)
![Page 10: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/10.jpg)
Contents
- The Problem of Utilization
- How does Mesos do (or not do) Job Scheduling?
![Page 11: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/11.jpg)
The Problem of Utilization
Here’s a Box
![Page 12: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/12.jpg)
The Problem of Utilization
Length
Width
Height
It has 3 dimensions
![Page 13: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/13.jpg)
What can you do with a box that has 3 dimensions?
![Page 14: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/14.jpg)
What does this mean?!
![Page 15: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/15.jpg)
The Problem of Utilization
Stuff the box
![Page 16: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/16.jpg)
The Problem of Utilization
Unpack the box
![Page 17: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/17.jpg)
The Problem of Utilization
Box in a box
![Page 18: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/18.jpg)
The Problem of Utilization
Carry the box
![Page 19: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/19.jpg)
The Problem of Utilization
![Page 20: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/20.jpg)
The Problem of Utilization
Is really …
All about the box!
![Page 21: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/21.jpg)
The Problem of Utilization
By Example:
Please efficiently pack these stolen boxes into my get-away car!
![Page 22: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/22.jpg)
The Problem of Utilization
![Page 23: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/23.jpg)
The Problem of Utilization
Box Computer
A Computer is really just a Box
![Page 24: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/24.jpg)
The Problem of Utilization
Height
Length
Width
We can represent a box with 3 dimensions
![Page 25: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/25.jpg)
The Problem of Utilization
RAM
CPU
Disk
… If we relabel the dimensions
![Page 26: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/26.jpg)
The Problem of Utilization
RAM
CPU
Disk
A computer, like a box, is a multi-dimensional object.
![Page 27: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/27.jpg)
The Problem of Utilization
RAM
CPU
Disk
A computer is just a collection of resources.
![Page 28: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/28.jpg)
If we put things in boxes,
What can we put in our computer?
The Problem of Utilization
![Page 29: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/29.jpg)
What can we put in our computer?
The Problem of Utilization
Processes!
![Page 30: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/30.jpg)
Output of a computer’s Process Tree
$ pstree
![Page 31: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/31.jpg)
This is an interesting slide!
$ pstree
![Page 32: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/32.jpg)
Why is the pstree slide interesting?
1. It introduces the concept of a process.
A process is an instance of code that accesses resources over time.
![Page 33: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/33.jpg)
Why is the pstree slide interesting?
1. It introduces the concept of a process.
A process may use, share, steal, lock or release resources
![Page 34: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/34.jpg)
Why is the pstree slide interesting?
2. It shows a computer with multiple processes running on it.
![Page 35: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/35.jpg)
Why is the pstree slide interesting?
2. It shows a computer with multiple processes running on it.
- The processes access the same pool of resources.
![Page 36: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/36.jpg)
Why is the pstree slide interesting?
2. It shows a computer with multiple processes running on it.
- Shared access to same pool of resources.
- Processes are categorized into a hierarchical structure.
![Page 37: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/37.jpg)
At this point, we can ask a couple great questions!
![Page 38: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/38.jpg)
At this point, we can ask a couple great questions!
● Why don't computers just have 1 process per box?
![Page 39: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/39.jpg)
At this point, we can ask a couple great questions!
● Why don't computers just have 1 process per box?
● Is it inefficient to have so many processes on one box?
![Page 40: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/40.jpg)
At this point, we can ask a couple great questions!
● Why don't computers just have 1 process per box?
● Is it inefficient to have so many processes on one box?
● Aren’t processes just another kind of box?
![Page 41: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/41.jpg)
The Problem of Utilization
Let’s try to answer these questions!
![Page 42: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/42.jpg)
The Problem of Utilization
[Figure: axes labeled CPU and RAM]
Imagine a computer with only 2 resources.
![Page 43: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/43.jpg)
The Problem of Utilization
Process 1
Process 2
Process 3
[Figure: axes labeled CPU Time and RAM]
Imagine a computer with only 2 resources.
Only 3 distinct process types run on this computer
![Page 44: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/44.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
There is a fixed number of ways we can use up the computer’s resources.
Process 2
![Page 45: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/45.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
There is a fixed number of ways we can use up the computer’s resources.
Process 2
1 process at a time.
Could be great if all processes were the size of the computer.
![Page 46: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/46.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
There is a fixed number of ways we can use up the computer’s resources.
Process 2
2+ processes sharing resources
New Concept: Shared State
Process 3
![Page 47: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/47.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
Different Utilization Strategies
Process 3
Process 1
Process 2
Maximum Variation
Under-utilized
![Page 48: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/48.jpg)
The Problem of Utilization
Process 1
[Figure: axes labeled CPU Time and RAM]
There is a fixed number of ways we can use up the computer’s resources.
Process 1
Process 1
Process 1
Process 1
Process 1
Maximum Utilization
No Variation
![Page 49: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/49.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
There is a fixed number of ways we can use up the computer’s resources.
Process 3
Process 3
Process 3
Process 3
Over-provisioned and Under-utilized
![Page 50: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/50.jpg)
The Problem of Utilization
[Figure: axes labeled CPU Time and RAM]
Competing for shared resources. Unclear consequences.
Process 3
Process 3
Process 3
Process 3
Over-provisioned and Under-utilized
![Page 51: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/51.jpg)
A multi-dimensional problem!
And very complicated!
The Problem of Utilization
![Page 52: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/52.jpg)
Many ways we can use a computer’s resources.
Many different factors inform how we choose to utilize a set of resources.
Take-Aways
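The packing slides can be made concrete with a small sketch. It assumes a hypothetical machine with 8 CPU units and 16 GB of RAM and two made-up process shapes; none of these numbers come from the talk. Processes are admitted first-fit, and the function reports how much of each resource ends up used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    name: str
    cpu: float  # CPU units requested (hypothetical)
    ram: float  # GB of RAM requested (hypothetical)


def pack(capacity_cpu, capacity_ram, procs):
    """First-fit: admit a process only if BOTH resources still fit."""
    used_cpu = used_ram = 0.0
    admitted = []
    for p in procs:
        if used_cpu + p.cpu <= capacity_cpu and used_ram + p.ram <= capacity_ram:
            admitted.append(p.name)
            used_cpu += p.cpu
            used_ram += p.ram
    return admitted, used_cpu / capacity_cpu, used_ram / capacity_ram


# Two illustrative process shapes on an 8-CPU / 16 GB machine.
balanced = Proc("process-1", cpu=2, ram=4)   # same shape as the machine
ram_heavy = Proc("process-3", cpu=1, ram=6)  # runs out of RAM long before CPU

uniform = pack(8, 16, [balanced] * 4)
skewed = pack(8, 16, [ram_heavy] * 4)
print(uniform)  # all four fit, both resources fully used
print(skewed)   # only two fit, most of the CPU is stranded
```

A process whose shape matches the machine fills it completely, while a lopsided process exhausts one dimension and strands the other. That is the multi-dimensional part of the problem.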
![Page 53: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/53.jpg)
Benefits of Shared State
● increased utilization
● flexibility to do different things simultaneously
● exposes a lot of interesting problems to solve
![Page 54: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/54.jpg)
Drawbacks of Shared State
● resource competition
  ○ network and io congestion
  ○ context switching
  ○ out of memory errors
● less predictable
  ○ constantly changing dynamic systems
  ○ non-deterministic waiting
  ○ feedback loops
![Page 55: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/55.jpg)
One machine, a host of problems
● Operating systems are complicated!
● Your laptop’s kernel solves these scheduling problems well.
![Page 56: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/56.jpg)
![Page 57: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/57.jpg)
● Thus far, we’ve discussed resource utilization on 1 machine.
● Is 1 machine enough?
● And what about Mesos?
The Problem of Utilization
![Page 58: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/58.jpg)
Obviously, 1 machine isn’t enough
● Problems of scale:
  ○ Too much data
  ○ Not enough compute power
  ○ Everything can’t connect to 1 node
● Problems of reliability and availability:
  ○ 1 machine is a Single Point of Failure
  ○ No redundancy
![Page 59: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/59.jpg)
Many machines, then?
![Page 60: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/60.jpg)
Mesos!
![Page 61: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/61.jpg)
Recall the Box...
Box Computer
A Computer is really just a Box
![Page 62: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/62.jpg)
Mesos is really just a box, too
![Page 63: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/63.jpg)
AND Mesos is just a Computer
Double Analogy
![Page 64: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/64.jpg)
Mesos is a Distributed Computer
RAM
CPU
![Page 65: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/65.jpg)
Mesos is a Distributed Computer
RAM
CPU
● a lot of machines
● all solving similar problems
![Page 66: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/66.jpg)
Mesos is a Distributed Computer
RAM
CPU
● a lot of machines
● all solving similar problems
● We need ways to tell each machine what to do.
![Page 67: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/67.jpg)
Must rebuild all elements of an operating system in context of a distributed system!
![Page 68: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/68.jpg)
Must rebuild all elements of an operating system in context of a distributed system!
Same old problems
Awesome new technology
![Page 69: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/69.jpg)
Part 2:
![Page 70: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/70.jpg)
Part 2: How does Mesos do Job Scheduling?
![Page 71: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/71.jpg)
How Mesos does Job Scheduling
A very big box
Let’s call it “Grid”
![Page 72: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/72.jpg)
How Mesos does Job Scheduling
Mesos Slaves (aka computers or boxes)
The “Grid” holds a lot of smaller boxes.
The little boxes are “Slaves”
![Page 73: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/73.jpg)
How Mesos does Job Scheduling
Mesos Slaves
Each slave is a partitioned pool of resources
RAM
CPU
![Page 74: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/74.jpg)
How Mesos does Job Scheduling
Mesos Slaves
Mesos Master
● Slaves advertise resources to Master
● Master packages resources into resource offers.
![Page 75: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/75.jpg)
How Mesos does Job Scheduling
Mesos Slaves
Mesos Master
Frameworks
Master offers resources to frameworks
![Page 76: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/76.jpg)
How Mesos does Job Scheduling
Mesos Slaves
Mesos Master
Frameworks
Frameworks accept or reject resource offers.
![Page 77: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/77.jpg)
How Mesos does Job Scheduling
Mesos Slaves
Mesos Master
Frameworks
Accepted offers result in tasks that do useful work.
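The offer cycle on these slides can be sketched as a toy simulation. This is not the Mesos API: the `Slave`, `Framework`, and `offer_round` names, the resource numbers, and the simplification that leftover resources go straight to the next framework in the same round are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slave:
    name: str
    cpu: int
    ram: int


@dataclass
class Framework:
    name: str
    task_cpu: int
    task_ram: int
    pending: int  # tasks still waiting to be launched
    launched: list = field(default_factory=list)

    def on_offer(self, slave):
        """Return how many tasks to launch on this offer; 0 means reject."""
        if self.pending == 0:
            return 0
        fit = min(slave.cpu // self.task_cpu, slave.ram // self.task_ram)
        return min(fit, self.pending)


def offer_round(slaves, frameworks):
    """Toy master: offer each slave's free resources to one framework at a time."""
    for slave in slaves:
        for fw in frameworks:
            n = fw.on_offer(slave)
            if n == 0:
                continue  # rejected: the offer moves on to the next framework
            slave.cpu -= n * fw.task_cpu
            slave.ram -= n * fw.task_ram
            fw.pending -= n
            fw.launched.extend([slave.name] * n)


slaves = [Slave("s1", cpu=4, ram=8), Slave("s2", cpu=4, ram=8)]
web = Framework("web", task_cpu=1, task_ram=2, pending=3)
batch = Framework("batch", task_cpu=2, task_ram=2, pending=2)
offer_round(slaves, [web, batch])
print(web.launched, batch.launched)
```

Here `web` accepts the first offer and places all three tasks on `s1`; `batch` rejects what is left on `s1` because it is too small, then accepts the offer from `s2`.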
![Page 78: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/78.jpg)
3 Types of Scheduling Architectures (aka 3 Types of Distributed Kernels)
Mesos has a two-level architecture.
![Page 79: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/79.jpg)
3 Types of Scheduling Architectures
from the Google Omega Whitepaper
Mesos Master
(manage resource and framework state)
Mesos Frameworks
(manage task state)
![Page 80: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/80.jpg)
3 Types of Scheduling Architectures
from the Google Omega Whitepaper
![Page 81: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/81.jpg)
3 Types of Scheduling Architectures (aka 3 Types of Distributed Kernels)
Goal
![Page 82: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/82.jpg)
3 Types of Scheduling Architectures (aka 3 Types of Distributed Kernels)
![Page 83: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/83.jpg)
3 Types of Scheduling Architectures (aka 3 Types of Distributed Kernels)
Borg (Google)
![Page 84: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/84.jpg)
Remainder of this talk...
Point out weaknesses with Mesos that
1. Prevent it from being a shared state kernel.
2. Can make Mesos challenging to use.
![Page 85: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/85.jpg)
Remainder of this talk...
1. Optimistic Vs Pessimistic Offers
2. DRF Algorithm and Framework Sorters
3. Missing APIs / Enhancements
![Page 86: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/86.jpg)
Optimistic Vs Pessimistic Offers
We Trust Everyone!
![Page 87: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/87.jpg)
Optimistic Vs Pessimistic Offers
Everyone promised not to take my spot
Protect my spot from thieves!
![Page 88: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/88.jpg)
Optimistic Vs Pessimistic Offers
![Page 89: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/89.jpg)
Optimistic Vs Pessimistic Offers
● 2 frameworks sharing the same resources is not safe
![Page 90: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/90.jpg)
Optimistic Vs Pessimistic Offers
● 2 frameworks sharing the same resources is not safe
● A chunk of resources is only offered to a single framework scheduler at a time.
![Page 91: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/91.jpg)
Why is this a problem?
When a Framework receives resource offers, it has 2 options:
Make an immediate decision
Hold onto the offer forever in a state of indecision
![Page 92: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/92.jpg)
Why is this a problem?
When a Framework receives resource offers, it has 2 options:
Make an immediate decision
Hold onto the offer forever in a state of indecision
![Page 93: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/93.jpg)
Why is this a problem?
Under-utilization
If the framework holds the offer forever, those resources can’t be used.
… or eaten!
![Page 94: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/94.jpg)
Why is this a problem?
Under-utilization
Can be hard to schedule large tasks
![Page 95: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/95.jpg)
Why is this a problem?
Gaming the System
If it’s hard to schedule large tasks, a framework might hold onto tons of offers until it can schedule its huge task.
![Page 96: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/96.jpg)
Why is this a problem?
Gaming the System:
One could create many instances of a framework to trick Mesos into letting it hoard more offers!
![Page 97: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/97.jpg)
Workarounds / Solutions
● --offer_timeout: Set short timeouts to penalize slow frameworks
● MESOS-1607: Wait for optimistic offers!
  ○ Submit one offer to multiple frameworks, but rescind the offer when necessary.
  ○ Encourages more sophisticated allocation algorithms
![Page 98: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/98.jpg)
Remainder of this talk...
1. Optimistic Vs Pessimistic Offers
2. DRF Algorithm and Framework Sorter
3. Missing APIs / Enhancements
![Page 99: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/99.jpg)
DRF and Framework Sorter
![Page 100: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/100.jpg)
DRF and Framework Sorter
Mesos Master must choose which Frameworks to give offers to first.
![Page 101: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/101.jpg)
DRF and Framework Sorter
Mesos Master must choose which Frameworks to give offers to first.
In a pessimistic system, this is very important!
![Page 102: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/102.jpg)
What is DRF?
“Dominant Resource Fairness” Algorithm
![Page 103: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/103.jpg)
What is DRF?
“Dominant Resource Fairness” Algorithm
● A method for prioritizing which frameworks to give a resource offer to first.
![Page 104: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/104.jpg)
What is DRF?
“Dominant Resource Fairness” Algorithm
Framework XYZ Resource Usage
12% 30% 3% 7%
We can represent a framework by how many resources it uses.
![Page 105: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/105.jpg)
What is DRF?
“Dominant Resource Fairness” Algorithm
Framework XYZ Resource Usage
12% 30% 3% 7%
We can represent a framework by how many resources it uses.
For example:
- 30% of total RAM
- 12% of total CPU
![Page 106: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/106.jpg)
What is DRF?
“Dominant Resource Fairness” Algorithm
Framework XYZ Resource Usage
12% 30% 3% 7%
Framework XYZ’s Dominant Resource is the 30% RAM
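Picking the dominant resource amounts to taking the largest share. A minimal sketch follows, using the four percentages from the slide. The slide names only RAM (30%) and CPU (12%), so the `disk` and `network` labels for the other two values are assumptions.

```python
def dominant_resource(usage):
    """Return (resource, share) for the resource with the largest share."""
    resource = max(usage, key=usage.get)
    return resource, usage[resource]


# Shares of the cluster total used by Framework XYZ.
# "disk" and "network" are assumed labels; the slide only names RAM and CPU.
framework_xyz = {"cpu": 0.12, "ram": 0.30, "disk": 0.03, "network": 0.07}

print(dominant_resource(framework_xyz))
```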
![Page 107: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/107.jpg)
How does DRF work?
“Dominant Resource Fairness” Algorithm
F1: 30%
F2: 10%
F3: 20%
Identify all frameworks by their dominant resource
![Page 108: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/108.jpg)
How does DRF work?
“Dominant Resource Fairness” Algorithm
F1: 30%
F2: 10%
F3: 20%
Out of all frameworks (F1, F2 and F3), F2 has the minimum dominant share of resources.
![Page 109: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/109.jpg)
How does DRF work?
“Dominant Resource Fairness” Algorithm
DRF says that as long as resources are available, Mesos should offer resources to F2 first, F3 second, and F1 last.
Offer order: F2, F3, F1
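The ordering rule can be sketched as a sort by dominant share, smallest first. The shares are the ones shown for F1, F2 and F3; the `drf_order` helper is a hypothetical name, not part of Mesos.

```python
def drf_order(dominant_shares):
    """Frameworks sorted by ascending dominant share: the least-served goes first."""
    return sorted(dominant_shares, key=dominant_shares.get)


shares = {"F1": 0.30, "F2": 0.10, "F3": 0.20}
order = drf_order(shares)
print(order)
```

In a running system the shares change after every launched task, so the order would be recomputed before each round of offers.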
![Page 110: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/110.jpg)
How does DRF work?
Weighted DRF
F1: 30%
F2: 10%
F3: 20%
Per-framework weights, if defined, adjust the dominant share for each framework.
F1 F2 F3
![Page 111: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/111.jpg)
How does DRF work?
Weighted DRF
F1: 30%
F2: 10%
F3: 20%
Per-framework weights, if defined, adjust the dominant share for each framework.
Weighting informs Mesos that it should generally prefer some Frameworks over others.
F1 F2 F3
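One way to picture weighting is to divide each dominant share by the framework's weight before sorting, so a heavier weight makes a framework look less served than it is. The sketch below assumes that division rule and uses made-up weights; the talk does not give actual weight values.

```python
def weighted_drf_order(dominant_shares, weights):
    """Sort frameworks by dominant share divided by weight (default weight 1)."""
    def adjusted(name):
        return dominant_shares[name] / weights.get(name, 1.0)
    return sorted(dominant_shares, key=adjusted)


shares = {"F1": 0.30, "F2": 0.10, "F3": 0.20}
weights = {"F1": 4.0}  # hypothetical: F1 is entitled to 4x the share of the others

weighted_order = weighted_drf_order(shares, weights)
print(weighted_order)
```

With these assumed weights F1's adjusted share drops to 7.5%, so it moves from last place to first even though it holds the most resources.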
![Page 112: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/112.jpg)
DRF is great if...
![Page 113: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/113.jpg)
DRF is great if...
● All frameworks have work to do
![Page 114: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/114.jpg)
DRF is great if...
● All frameworks have work to do
● A framework’s “hunger” for more resources does not change over its lifetime
![Page 115: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/115.jpg)
DRF is great if...
● All frameworks have work to do
● A framework’s “hunger” for more resources does not change over its lifetime
● You know a priori that specific frameworks will use more or fewer resources
![Page 116: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/116.jpg)
DRF is bad if...
![Page 117: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/117.jpg)
DRF is bad if...
● Some frameworks don’t want any more tasks, while others do.
![Page 118: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/118.jpg)
DRF is bad if...
● Some frameworks don’t want any more tasks, while others do.
● The framework's "hunger" for resources changes over its lifetime (perhaps based on queue size or pending web requests)
![Page 119: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/119.jpg)
DRF Examples
Framework 1
1 task
Framework 2
6 tasks
Framework 4
30 tasks
Framework 6
50 tasks
Framework 3
1 task
Framework 5
1 task
Framework 4 always wants 30 tasks
![Page 120: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/120.jpg)
DRF Examples
Framework 1
1 task
Framework 2
6 tasks
Framework 4
30 tasks
Framework 6
50 tasks
Framework 3
1 task
Framework 5
1 task
DRF with weights is great IF these expected ratios never change.
Framework 4 always wants 30 tasks
![Page 121: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/121.jpg)
DRF Examples
Framework 1
0 tasks
Framework 2
0 tasks
Framework 4
0 tasks
Framework 6
50 tasks
Framework 3
0 tasks
Framework 5
1 task
Sometimes frameworks don’t want to do work
![Page 122: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/122.jpg)
DRF Examples
Framework 1
0 tasks
Framework 2
0 tasks
Framework 4
0 tasks
Framework 6
50 tasks
Framework 3
0 tasks
Framework 5
1 task
Sometimes frameworks don’t want to do work
● DRF gives preference to the “0 tasks” frameworks.
● Framework 6 gets starved for resources!
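The starvation effect can be illustrated with a toy model. The sketch assumes all tasks are the same size, so the running task count stands in for dominant share, and that an offer goes to one framework at a time in DRF order. The `wants_more` flags are assumptions for illustration, and `first_taker` is a hypothetical helper, not Mesos behavior verbatim.

```python
def first_taker(running, wants_more):
    """Walk frameworks in ascending task count; return who accepts and who declined."""
    declined = []
    for fw in sorted(running, key=running.get):
        if wants_more.get(fw, False):
            return fw, declined
        declined.append(fw)  # an idle framework sees the offer first and declines it
    return None, declined


# Task counts from the slide.
running = {
    "Framework 1": 0,
    "Framework 2": 0,
    "Framework 3": 0,
    "Framework 4": 0,
    "Framework 5": 1,
    "Framework 6": 50,
}

# Assumption: only Framework 6 currently has more work to do.
wants_more = {"Framework 6": True}

accepted, declined = first_taker(running, wants_more)
print(accepted, len(declined))
```

In this model every idle framework is consulted before Framework 6, so the one framework with real demand waits behind five offers that nobody wants.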
![Page 123: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/123.jpg)
![Page 124: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/124.jpg)
Real-world Examples of Bad DRF
Any framework that declines usable offers suggests DRF isn’t working well
● Consumer Framework that consumes an occasionally empty queue
● Web Server Framework that sometimes doesn’t get a lot of requests
● Database Framework that doesn’t have a lot to do sometimes
![Page 125: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/125.jpg)
Workarounds / Solutions
● Ensure all your frameworks always want more tasks
  ○ Can be very hard, perhaps impossible, to do.
  ○ e.g. What if a framework just maintains N services?
  ○ Might encourage sloppy or inefficient frameworks.
![Page 126: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/126.jpg)
Workarounds / Solutions
● Write your own allocation algorithm!
  ○ See Li Jin’s 11:50 talk, "Preemptive Task Scheduling in Mesos Framework"
  ○ Maybe other talks?
![Page 127: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/127.jpg)
Workarounds / Solutions
● Wait for optimistic offers to make this less of an issue
● Allow frameworks to periodically restart themselves and define a different DRF weighting every time they restart
![Page 128: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/128.jpg)
DRF Speculation
● A really good dynamic weighting algorithm would benefit from knowing the current distribution of weights across the other frameworks in the system.
  ○ Frameworks could compete with each other based on this information
  ○ Makes Mesos more like a shared-state scheduler
![Page 129: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/129.jpg)
Remainder of this talk...
1. Optimistic Vs Pessimistic Offers
2. DRF Algorithm and Framework Sorter
3. Missing APIs / Enhancements
![Page 130: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/130.jpg)
![Page 131: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/131.jpg)
These are my opinions
Not sure whether others will agree
If you have opinions too, let’s get beers tonight!
![Page 132: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/132.jpg)
Missing APIs / Enhancements
● In my opinion, different framework sorter algorithms, and even optimistic offers, will only take us so far.
![Page 133: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/133.jpg)
Missing APIs / Enhancements
● Frameworks should more actively leverage statistics about resource utilization to tell the Mesos master how resources should be allocated.
![Page 134: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/134.jpg)
Missing APIs / Enhancements
● Frameworks should more actively leverage statistics about resource utilization to tell the Mesos master how resources should be allocated.
  ○ Frameworks know their resource needs better than the Master.
  ○ Some frameworks can make simple decisions
  ○ Others can be smart in how they wish to populate the grid
![Page 135: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/135.jpg)
Missing APIs / Enhancements
● Frameworks should be able to tell Mesos what they will want in the future (and how badly they want it)
  ○ Let the framework developer community play the game to “optimize this scheduling problem”
● The DRF algorithm, or the hierarchical allocator in general, should leverage historical data.
![Page 136: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/136.jpg)
For more about our story, check out this talk at 4:50!
![Page 137: Challenges in Optimizing Job Scheduling on Mesos](https://reader034.vdocuments.mx/reader034/viewer/2022052705/58f2c8851a28abc8278b457d/html5/thumbnails/137.jpg)