Meson: Building a Machine Learning Orchestration Framework on Mesos
![Page 1: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/1.jpg)
Building a Machine Learning Orchestration Framework on Mesos
Antony Arokiasamy | Kedar Sadekar | Personalization Infrastructure
![Page 2: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/2.jpg)
Help members find content to watch and enjoy to maximize member satisfaction and retention
![Page 3: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/3.jpg)
Everything is a Recommendation
Recommendations are driven by Machine Learning
Ranking
Rows
![Page 4: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/4.jpg)
Machine Learning Pipeline
User Selection
Feature Generation
Model Validation
Publish Model
Model Training
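The five stages on this slide can be read as a small dependency graph. The sketch below is illustrative only: the slide names the stages but not their order, so the dependencies here (train before validate, validate before publish) are an assumption, and `PIPELINE` and `execution_order` are hypothetical names, not part of Meson.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each step lists the steps it waits for.
# The ordering is an assumption; the slide only names the five stages.
PIPELINE = {
    "User Selection": set(),
    "Feature Generation": {"User Selection"},
    "Model Training": {"Feature Generation"},
    "Model Validation": {"Model Training"},
    "Publish Model": {"Model Validation"},
}


def execution_order(dag):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(PIPELINE)
```

An orchestrator would walk `order` and hand each step to whatever executes it, which is the separation of orchestration and execution that the deck goes on to describe.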
![Page 5: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/5.jpg)
Machine Learning Pipeline Challenges
• Innovation
  • Heterogeneous Environments
• Spark
  • Native Support
• Separate Orchestration and Execution
• Multi Tenancy
• ML Constructs
  • Parameter Sweep – 30k Dockers
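A parameter sweep fans one training step out over every combination of hyperparameter values, which is how a single workflow can reach tens of thousands of containers. A minimal sketch, assuming a plain grid; the parameter names and values are hypothetical and do not come from the talk.

```python
from itertools import product


def parameter_sweep(grid):
    """Yield one dict of parameter values per point in the grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical hyperparameters; the talk does not list the real ones.
grid = {
    "learning_rate": [0.01, 0.1],
    "num_trees": [50, 100, 200],
}

# Each entry would become one task (for example, one Docker container).
tasks = list(parameter_sweep(grid))
```

The number of tasks is the product of the list lengths, so a few parameters with a few dozen values each are enough to reach the scale quoted on the slide.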
![Page 6: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/6.jpg)
Meson Workflow System in 30 seconds
• General Purpose Workflow Orchestration and Scheduling framework
  • Delegates execution to resource managers like Mesos
• Optimized for Machine Learning Pipelines and Visualization
• Check out the blog
  • bit.ly/mesonws or techblog.netflix.com
![Page 7: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/7.jpg)
Meson Architecture
![Page 8: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/8.jpg)
Mesos Usage
• Executors
  • Custom Executor
  • Executor Caching
  • Executor Cleanup
• Framework Messages
• Resource Attributes
  • Multi Tenancy
  • Cluster Management
![Page 9: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/9.jpg)
Custom Executors
• Reuse Executor Process
  • e.g. Spark
  • Executor Id = <unique id>
• Two Way Communication
![Page 10: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/10.jpg)
Executor Caching
![Page 11: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/11.jpg)
Executor Caching
• Executor Id = hash(<something unique for the class of executors>)
  • E.g. Executor Id = hash(classpath)
• Match with Executor Id in Offer
(Diagram labels: offers, accept)
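The caching idea above can be sketched as follows: derive the executor id from something shared by a class of tasks (here the classpath), then prefer an offer from an agent that already runs an executor with that id. This is a sketch under stated assumptions, not Meson's code. Offers are modeled as plain dicts, the `executor_ids` key mirrors the field of the same name on a Mesos `Offer`, and `executor_id_for` and `pick_offer` are hypothetical helpers.

```python
import hashlib


def executor_id_for(classpath):
    """Derive a stable id so tasks with the same classpath share an executor."""
    return hashlib.sha1(classpath.encode("utf-8")).hexdigest()[:16]


def pick_offer(offers, executor_id):
    """Return (offer, reused).

    Prefer an offer whose agent already runs the executor; otherwise
    fall back to the first offer, where a new executor would be launched.
    """
    for offer in offers:
        if executor_id in offer["executor_ids"]:
            return offer, True
    if offers:
        return offers[0], False
    return None, False


# Hypothetical offers; real ones come from the Mesos scheduler callback.
wanted = executor_id_for("/opt/app/lib/a.jar:/opt/app/lib/b.jar")
offers = [
    {"agent": "agent-1", "executor_ids": []},
    {"agent": "agent-2", "executor_ids": [wanted]},
]
chosen, reused = pick_offer(offers, wanted)
```

Hashing the classpath means two tasks that need the same jars map to the same executor, so the second task skips the start-up cost.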
![Page 12: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/12.jpg)
Executor Cleanup
• Expiration
• Explicitly keep track of Executors
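Cached executors have to be reclaimed at some point. One way to combine the two bullets is a registry that records when each executor was last used and reports the ones idle past a time-to-live. This is a hypothetical sketch; `ExecutorRegistry`, its methods, and the TTL value are assumptions, not Meson's API.

```python
class ExecutorRegistry:
    """Track cached executors and report the ones idle past a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_used = {}  # executor id -> time of last use, in seconds

    def touch(self, executor_id, now):
        """Record that an executor ran a task at time `now`."""
        self.last_used[executor_id] = now

    def expired(self, now):
        """Return ids of executors idle for at least the TTL, sorted."""
        return sorted(
            executor_id
            for executor_id, used_at in self.last_used.items()
            if now - used_at >= self.ttl
        )

    def remove(self, executor_id):
        """Forget an executor after it has been shut down."""
        self.last_used.pop(executor_id, None)


# Times are passed in explicitly so the example is deterministic.
registry = ExecutorRegistry(ttl_seconds=600)
registry.touch("exec-a", now=0)
registry.touch("exec-b", now=500)
stale = registry.expired(now=700)
```

A scheduler would call `expired` periodically, ask each stale executor to shut down, and then call `remove`.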
![Page 13: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/13.jpg)
Framework Messages
![Page 14: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/14.jpg)
Multi Tenancy
• Resource Attributes
  • spark.mesos.constraints
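Spark's `spark.mesos.constraints` setting takes semicolon-separated `attribute:value` pairs and restricts Spark to offers whose agent attributes match. The sketch below shows that matching in a simplified form: it handles only text equality and attribute presence, not the scalar, range, or set semantics Spark also supports. The attribute names (`team`, `gpu`) and both function names are hypothetical.

```python
def parse_constraints(spec):
    """Parse 'key:value;key2' into a dict; a missing value becomes None."""
    constraints = {}
    for part in spec.split(";"):
        if not part:
            continue
        key, _, value = part.partition(":")
        constraints[key] = value or None
    return constraints


def offer_matches(attributes, constraints):
    """Simplified check: text equality, or mere presence when no value is given."""
    for key, wanted in constraints.items():
        if key not in attributes:
            return False
        if wanted is not None and attributes[key] != wanted:
            return False
    return True


# Hypothetical tenant tagging; the talk does not name the real attributes.
constraints = parse_constraints("team:personalization;gpu")
ok = offer_matches({"team": "personalization", "gpu": "true"}, constraints)
rejected = offer_matches({"team": "search", "gpu": "true"}, constraints)
```

Tagging agents with a tenant attribute and giving each tenant a matching constraint keeps one team's jobs off another team's machines while all of them share a single cluster.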
![Page 15: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/15.jpg)
Cluster Management
• Red-Black software updates
• Scale up/Scale down
![Page 16: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/16.jpg)
Mesos Cluster
• 100s of Concurrent Jobs
• 700 Nodes
• 5000 Cores
• 25 TB Memory
• Apps: Meson Workflow System, Spark and Dockers
• Few smaller clusters
![Page 17: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/17.jpg)
What's Next
• Fenzo Scheduler - https://github.com/Netflix/Fenzo
  • Bin Packing, Auto Scaling, Host Attributes/Constraints, Groups, etc.
• Cook Scheduler - https://github.com/twosigma/Cook
  • Multi tenant Spark Scheduler
• Open Source Meson Workflow System
![Page 18: Meson: Building a Machine Learning Orchestration Framework on Mesos](https://reader036.vdocuments.mx/reader036/viewer/2022062820/58a9ab051a28ab9c758b56b7/html5/thumbnails/18.jpg)
Antony Arokiasamy
Kedar Sadekar
@aasamy
/aasamy
@kedar_sadekar
/kedar-sadekar