distributed scheduler hell (microxchg 2017 berlin)

37
Distributed scheduler hell

Upload: matthew-campbell

Post on 12-Apr-2017

29 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: Distributed scheduler hell (MicroXChg 2017 Berlin)

Distributed scheduler hell

Page 2: Distributed scheduler hell (MicroXChg 2017 Berlin)

Story of ..

How we moved 100(s) of virtual machines, onto containers

Page 3: Distributed scheduler hell (MicroXChg 2017 Berlin)

What is a distributed scheduler

Your cloud provider: Digital Ocean, AWS, Google.

For containers: Kubernetes, Nomad, Mesos

Page 4: Distributed scheduler hell (MicroXChg 2017 Berlin)

Why do we care?

Microservices...

Page 5: Distributed scheduler hell (MicroXChg 2017 Berlin)

Vulcan

Distributed timeseries database.

open source https://github.com/digitalocean/vulcan

Page 6: Distributed scheduler hell (MicroXChg 2017 Berlin)

Requirements

4 3 Gbits a second of Network traffic

4 20 TB Storage

4 Sub 100ms Read times

4 100k writes a second

Page 7: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 8: Distributed scheduler hell (MicroXChg 2017 Berlin)

Single process

Page 9: Distributed scheduler hell (MicroXChg 2017 Berlin)

HypervisorLinux Scheduler

Page 10: Distributed scheduler hell (MicroXChg 2017 Berlin)

Kernel Provides

4 Virtual Memory

4 Process Isolation

4 Disk storage

4 Networking

4 CPU scheduling

Page 11: Distributed scheduler hell (MicroXChg 2017 Berlin)

Distributed Scheduler Provides

4 Container Deploy

4 Virtual Machine Deploy

4 Memory Quota

4 Disk storage

4 Networking

4 CPU scheduling

4 Scaling Instances

Page 12: Distributed scheduler hell (MicroXChg 2017 Berlin)

Mesos

Page 13: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 14: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 15: Distributed scheduler hell (MicroXChg 2017 Berlin)

Deploying our App On Mesos

Page 16: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 17: Distributed scheduler hell (MicroXChg 2017 Berlin)

Marathon Architecture

Page 18: Distributed scheduler hell (MicroXChg 2017 Berlin)

Kafka

Custom Scheduler

Page 19: Distributed scheduler hell (MicroXChg 2017 Berlin)

Cassandra

Pinning to specific Mesos nodes

Page 20: Distributed scheduler hell (MicroXChg 2017 Berlin)

Mesos Failure Modes

Network Partition FAIL

Page 21: Distributed scheduler hell (MicroXChg 2017 Berlin)

Counter Example: Nomad

Page 22: Distributed scheduler hell (MicroXChg 2017 Berlin)

Nomad Architecture

Page 23: Distributed scheduler hell (MicroXChg 2017 Berlin)

Kubernetes

Page 24: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 25: Distributed scheduler hell (MicroXChg 2017 Berlin)

Custom Json Language{ "application": { "name": "timeseries-ingestor", "scale": 1, "ingresses": { "timeseries-ingestor-health": { "scheme": "http", "container_port": 8001 } }, "containers": { "timeseries-ingestor": { "image": "docker.com/timeseries/ingestor", "image_tag": "fcb24ca", "ports": [ 8001, 9090 ], "env": { "CASSANDRA_ADDRESS_LIST": "" }, "resources": { "cpu": "4", "memory": "4000" } } }, "metrics": { "port": 8001, "path": "/debug/metrics" } }, "maintainer": "[email protected]"}

Page 26: Distributed scheduler hell (MicroXChg 2017 Berlin)

Command Line Deploys

docc --contexts dev_env deploy timeseries-microservice1.json

Docc is our internal Kubernetes tool

Page 27: Distributed scheduler hell (MicroXChg 2017 Berlin)

Deployment tool with diffs to Kubernetes

"env": {+++ "CASSANDRA_CQL_VERSION": "3.1.7",--- "CASSANDRA_KEYSPACE": "staging_vulcan",+++ "CASSANDRA_KEYSPACE": "staging_vulcan_123",

Simliar to KubeDiff

Page 28: Distributed scheduler hell (MicroXChg 2017 Berlin)

Final Architecture

Page 29: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 30: Distributed scheduler hell (MicroXChg 2017 Berlin)

Load balancing

Pushing 3 Gbs to kubernetes using Flannel

Page 31: Distributed scheduler hell (MicroXChg 2017 Berlin)

Metrics

Page 32: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 33: Distributed scheduler hell (MicroXChg 2017 Berlin)

Logging

Page 34: Distributed scheduler hell (MicroXChg 2017 Berlin)
Page 35: Distributed scheduler hell (MicroXChg 2017 Berlin)

Upsides to Distributed Schedulers

Page 36: Distributed scheduler hell (MicroXChg 2017 Berlin)

How to choice your abstractation

Page 37: Distributed scheduler hell (MicroXChg 2017 Berlin)

Alles hat ein Ende nur die Wurst hat zwei

('everything has one end, only the sausage has two')