Mesos: A Platform for Fine-Grained Resource Sharing in Data Centers (I)
UC BERKELEY
Anthony D. Joseph
LASER Summer School September 2013
My Talks at LASER 2013
1. AMP Lab introduction
2. The Datacenter Needs an Operating System
3. Mesos, part one
4. Dominant Resource Fairness
5. Mesos, part two
6. Spark
Collaborators
• Matei Zaharia
• Benjamin Hindman
• Andy Konwinski
• Ali Ghodsi
• Randy Katz
• Scott Shenker
• Ion Stoica
Modern Data Center Paradigm
Commodity machines (100’s – 10,000’s of machines)
» Attached storage devices
Data distributed and replicated across nodes
» Data locality to computation matters
Solution: Use a datacenter computing framework
» Divide jobs into smaller tasks, so that jobs can take turns accessing each node and ideally locally accessing data
» Tasks are fine-grained in both time (short) and space (use a fraction of a machine)
Datacenter Scheduling Problem
Rapid innovation in datacenter computing frameworks
No single framework optimal for all applications
Want to run multiple frameworks in a single datacenter
» …to maximize utilization
» …to share data between frameworks
[Figure: framework logos: Pig, Dryad, Pregel, Percolator, CIEL]
Where We Want to Go
[Figure: "Today: static partitioning" (separate Hadoop, Pregel, and MPI partitions) versus "Dynamic sharing" (one shared cluster)]
Solution: Apache Mesos
[Figure: Hadoop and Pregel each on their own nodes, versus Hadoop, Pregel, … running on top of Mesos across a shared set of nodes]
Mesos is a common resource sharing layer over which diverse frameworks can run
Run multiple instances of the same framework
» Isolate production and experimental jobs
» Run multiple versions of the framework concurrently
Build specialized frameworks targeting particular problem domains
» Better performance than general-purpose abstractions
http://mesos.apache.org/
Mesos Goals
High utilization of resources
Support diverse frameworks (current & future)
Scalability to 10,000’s of nodes
Reliability in face of failures
Resulting design: Small microkernel-like core that pushes scheduling logic to frameworks
Previous Approaches
Locality less important (HPC & grid computing)
» Expensive, dedicated storage (SANs, Parallel FS)
» Expensive, high speed networks (Infiniband)
Fine-grained task model infeasible
» Ad-hoc programs (many barriers, tight message passing)
» Legacy programs (Fortran77)
Approach taken: coarse-grained sharing
» Job specifies number of machines and amount of time needed
» Scheduler queues job and allocates all machines at the same time
Mesos Design Elements
Fine-grained sharing:
» Allocation at the level of tasks within a job
» Improves utilization, latency, and data locality
Resource offers:
» Simple, scalable application-controlled scheduling mechanism
Element 1: Fine-Grained Sharing
[Figure: placement of Frameworks 1-3 on nodes under Coarse-Grained Sharing (HPC) versus Fine-Grained Sharing (Mesos), each over a Storage System (e.g. HDFS)]
+ Improved utilization, responsiveness, data locality
Element 2: Resource Offers
Option: Global scheduler
» Frameworks express needs in a specification language, global scheduler matches them to resources
+ Can make optimal decisions
– Complex: language must support all framework needs
– Difficult to scale and to make robust
– Future frameworks may have unanticipated needs
Element 2: Resource Offers
Mesos: Resource offers
» Offer available resources to frameworks, let them pick which resources to use and which tasks to launch
+ Keeps Mesos simple, lets it support future frameworks
– Decentralized decisions might not be optimal
Machines Make datacenter a real computer!
[Figure: software stack. Bottom: Node OS on each machine (e.g. Linux, Windows). Middle: Datacenter “OS” (e.g., Apache Mesos). Top: Spark, SCADS, Hadoop, MPI, Hypertable, Cassandra, Hive, PIQL, …; legend: AMP stack, Existing stack]
Support interactive and iterative data analysis (e.g., ML algorithms)
Consistency adjustable data store
Predictive & insightful query language
Allocation Policies
Mesos controls how many resources each framework can get, but not which resources
Allocation policies are pluggable to suit organization needs
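One way to picture "how many, but not which" with a pluggable policy is the sketch below. It is illustrative Python, not the Mesos allocation module interface: `Policy`, `pick_fair_share`, `make_priority_policy`, and `next_framework` are hypothetical names, and strict priority is only an assumed second example of a policy. The policy decides which framework gets the next offer; it never decides which machines that framework ends up using.

```python
from typing import Callable, Dict

# Hypothetical policy signature: given each framework's current share of the
# cluster (0.0 to 1.0), return the name of the framework to offer to next.
Policy = Callable[[Dict[str, float]], str]


def pick_fair_share(shares: Dict[str, float]) -> str:
    """Fair sharing: offer to the framework holding the smallest share."""
    return min(shares, key=lambda name: (shares[name], name))


def make_priority_policy(priority: Dict[str, int]) -> Policy:
    """Strict priority: offer to the framework with the lowest priority number."""
    def pick(shares: Dict[str, float]) -> str:
        return min(shares, key=lambda name: (priority[name], name))
    return pick


def next_framework(policy: Policy, shares: Dict[str, float]) -> str:
    """The master only asks the plugged-in policy whom to offer to."""
    return policy(shares)


shares = {"hadoop": 0.5, "mpi": 0.2, "spark": 0.3}
fair_choice = next_framework(pick_fair_share, shares)
priority_choice = next_framework(
    make_priority_policy({"hadoop": 0, "mpi": 2, "spark": 1}), shares
)
print(fair_choice, priority_choice)
```

Swapping the policy changes who is offered resources next without touching the rest of the master, which is the point of keeping it pluggable.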
Example: Hierarchical Fair Sharing
[Figure: Cluster Share Policy tree for Facebook.com, with the Spam Dept. and Ads Dept. at the first level, User 1 and User 2 below, and Jobs 1-4 at the leaves; percentage labels include 80%/20%, 70%/30%, 14%, and 6%. An accompanying plot shows Cluster Utilization (0%-100%) over Time (0-3), with the current time marked.]
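The arithmetic behind a hierarchical policy can be sketched as multiplying shares down the tree. The example below is a minimal sketch under assumptions: the figure does not survive well enough to say which users sit under which department, so the tree here (Ads 80%, Spam 20%, with Spam split 70/30 between two users) is a guess chosen because 20% × 70% = 14% and 20% × 30% = 6% match the 14% and 6% labels in the figure. `effective_shares` is a hypothetical helper name.

```python
def effective_shares(tree, parent_share=1.0, prefix=""):
    """Multiply weights down the tree to get each leaf's share of the cluster.

    `tree` maps a child name to (weight, subtree); subtree is None for a leaf.
    """
    total = sum(weight for weight, _ in tree.values())
    result = {}
    for name, (weight, subtree) in tree.items():
        share = parent_share * weight / total
        path = f"{prefix}/{name}" if prefix else name
        if subtree:
            result.update(effective_shares(subtree, share, path))
        else:
            result[path] = share
    return result


# Assumed policy tree (see the lead-in): departments first, then users.
policy = {
    "ads": (80, None),
    "spam": (20, {"user1": (70, None), "user2": (30, None)}),
}

shares = effective_shares(policy)
for path, share in shares.items():
    print(f"{path}: {share:.0%}")
```

When a node in the tree has no demand, a fair-sharing policy would typically redistribute its share among its siblings; this sketch leaves that step out.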
Mesos Architecture
[Figure: Mesos Master (with Allocation Module) between the framework schedulers (Hadoop JobTracker, MPI Scheduler) and Slaves 1-3, which run Hadoop and MPI executors with their tasks; arrows labeled "Status", "Resource offer", and "Launch Hadoop task 2"]
Slaves send status updates about available resources
Pluggable policy picks which framework to offer resources to
Framework scheduler selects resources and provides tasks
Framework executors run tasks and may persist across tasks
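The four annotated steps of the architecture (status update, policy picks a framework, framework scheduler selects resources, tasks are launched) can be walked through with a toy simulation. This is a sketch under assumptions, not the Mesos API: `Task`, `Framework`, `Master`, `on_offer`, and `offer_round` are hypothetical names, and the stand-in policy simply offers to the framework with the fewest running tasks.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Task:
    name: str
    cpus: int
    mem_gb: int


@dataclass
class Framework:
    """Hypothetical framework scheduler holding a queue of pending tasks."""
    name: str
    pending: List[Task]
    running: int = 0

    def on_offer(self, slave: str, cpus: int, mem_gb: int) -> List[Task]:
        """Step 3: select the pending tasks that fit into the offered resources."""
        accepted = []
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_gb <= mem_gb:
                accepted.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem_gb -= task.mem_gb
        return accepted


class Master:
    def __init__(self, frameworks: List[Framework]):
        self.frameworks = frameworks
        self.free: Dict[str, Tuple[int, int]] = {}      # slave -> (cpus, mem_gb)
        self.launched: List[Tuple[str, str, str]] = []  # (slave, framework, task)

    def status_update(self, slave: str, cpus: int, mem_gb: int) -> None:
        """Step 1: a slave reports its available resources."""
        self.free[slave] = (cpus, mem_gb)

    def offer_round(self) -> None:
        for slave in sorted(self.free):
            cpus, mem_gb = self.free[slave]
            # Step 2: stand-in policy, offer to the framework running the fewest tasks.
            fw = min(self.frameworks, key=lambda f: (f.running, f.name))
            for task in fw.on_offer(slave, cpus, mem_gb):
                # Step 4: the accepted task is launched on the slave's executor.
                cpus -= task.cpus
                mem_gb -= task.mem_gb
                fw.running += 1
                self.launched.append((slave, fw.name, task.name))
            self.free[slave] = (cpus, mem_gb)


hadoop = Framework("hadoop", [Task("map-1", 1, 1), Task("map-2", 1, 1)])
mpi = Framework("mpi", [Task("rank-0", 2, 4)])
master = Master([hadoop, mpi])
master.status_update("slave1", 4, 8)
master.status_update("slave2", 2, 4)
master.offer_round()
print(master.launched)
```

The sketch offers each slave to one framework per round and does not re-offer declined resources, which a real allocator would do.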
Resource Offer Details
A resource offer is a set of machine-resource tuples
» { [m1, 1 CPU, 1GB], [m2, 4 CPU, 16GB] }
Resource offers count towards a framework's share
» Rescinded after a timeout (incentive to reply fast)
Optimizations
» Frameworks indicate interest to get offers
» Frameworks can set filters to automatically filter out certain nodes or nodes with too few resources
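The offer tuples and the filter optimization can be sketched as plain data. The names below (`Resources`, `OfferFilter`, `apply_filter`) are hypothetical and only mirror the slide's description of filtering out certain nodes or nodes with too few resources; they are not the Mesos filter API.

```python
from typing import List, NamedTuple


class Resources(NamedTuple):
    """One machine-resource tuple from an offer, e.g. [m2, 4 CPU, 16GB]."""
    machine: str
    cpus: int
    mem_gb: int


class OfferFilter(NamedTuple):
    """Hypothetical filter: skip listed machines and machines that are too small."""
    excluded_machines: frozenset = frozenset()
    min_cpus: int = 0
    min_mem_gb: int = 0

    def allows(self, r: Resources) -> bool:
        return (
            r.machine not in self.excluded_machines
            and r.cpus >= self.min_cpus
            and r.mem_gb >= self.min_mem_gb
        )


def apply_filter(offer: List[Resources], flt: OfferFilter) -> List[Resources]:
    """Drop the tuples the framework said it never wants to be offered."""
    return [r for r in offer if flt.allows(r)]


# The example offer from the slide: { [m1, 1 CPU, 1GB], [m2, 4 CPU, 16GB] }
offer = [Resources("m1", 1, 1), Resources("m2", 4, 16)]
flt = OfferFilter(min_cpus=2, min_mem_gb=4)
filtered = apply_filter(offer, flt)
print(filtered)
```

Applying the filter on the master side saves a round trip: the framework is never sent an offer it would decline anyway.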
Dynamic Resource Sharing
[Figure: Cluster Utilization (0 to 1) versus Time (seconds, 1 to 1001) for Torque and Hadoop Instances 1-3]
Which Offers to Accept?
Delay scheduling
» Initially only accept preferred (e.g., local) resources
» Accept any resource after timeout (1-5 seconds)
Can achieve near optimal locality
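The accept/decline rule of delay scheduling can be sketched from the framework's side. This is a simplified variant under assumptions: `DelayScheduler` and `should_accept` are hypothetical names, the wait clock starts at the first non-preferred offer, and time is passed in explicitly so the example is deterministic.

```python
from typing import Optional, Set


class DelayScheduler:
    """Hypothetical per-job helper deciding whether to accept an offered node."""

    def __init__(self, preferred_nodes: Set[str], max_wait_s: float):
        self.preferred_nodes = preferred_nodes
        self.max_wait_s = max_wait_s
        self.waiting_since: Optional[float] = None

    def should_accept(self, node: str, now: float) -> bool:
        if node in self.preferred_nodes:
            self.waiting_since = None  # got a local node, reset the wait clock
            return True
        if self.waiting_since is None:
            self.waiting_since = now   # first non-local offer starts the clock
        if now - self.waiting_since >= self.max_wait_s:
            self.waiting_since = None  # waited long enough, take any node
            return True
        return False                   # decline and wait for a local offer


sched = DelayScheduler(preferred_nodes={"m2"}, max_wait_s=5.0)
decisions = [
    sched.should_accept("m1", now=0.0),  # non-local, clock starts
    sched.should_accept("m3", now=2.0),  # non-local, still waiting
    sched.should_accept("m2", now=3.0),  # local node
    sched.should_accept("m1", now=4.0),  # non-local, clock restarts
    sched.should_accept("m1", now=9.0),  # timeout reached
]
print(decisions)
```

Because tasks are short, a local slot tends to free up within the wait window, which is why a small timeout is usually enough to recover most of the locality.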
Multiple Hadoops Experiment
[Figure: placement of Hadoop 1, Hadoop 2, and Hadoop 3 across nodes in the two configurations compared, each over a Storage System (e.g. HDFS)]
Data Locality on Mesos
16 Hadoop MapReduce instances over shared file system
[Figure: two bar charts comparing static partitioning, Mesos with no delay scheduling, Mesos with 1s delay scheduling, and Mesos with 5s delay scheduling. Left: Local Map Tasks (%), 0%-100%. Right: Job Running Time (s), 0-600.]
Some Related Datacenter Resource Managers
Hadoop YARN
» Open-source follow-on to Hadoop with pluggable allocation policies
» Primary focus is Hadoop jobs
Google’s Omega resource manager
» Closed-source follow-on to original resource manager
» Framework-specific schedulers use optimistic concurrency model: all compete simultaneously to select resources
In Mesos Part II Lecture
Implementation Details and Supported Frameworks
Isolation
Handling Mesos Master Failure
Resource Revocation
Scalability
Results and Macrobenchmarks
Summary (Part One)
Mesos is a platform for sharing data centers among diverse cluster computing frameworks
» Enables efficient fine-grained sharing
» Gives frameworks control over scheduling
» Supports current and future frameworks
» Achieves high utilization
My Talks at LASER 2013
1. AMP Lab introduction
2. The Datacenter Needs an Operating System
3. Mesos, part one
4. Dominant Resource Fairness
5. Mesos, part two
6. Spark