The State of Linux Containers

Upload: insidehpc

Post on 07-Jan-2017

Gaikai

PS Now announcement at CES 2014

Gaikai

- caching

+ controller feedback

Agenda

1. “Linux Container” / “Docker Ecosystem” in a Nutshell

2. Confusion about Ecosystem / Vision to tackle it

3. Docker -> SWARM -> SLURM -> BigData

4. Discussion of Opportunities and Problems

The Bits and Pieces…

Linux Containers

Traditional Virtualisation: SERVER, HOST KERNEL, HYPERVISOR, then per guest a KERNEL, a Userland (OS) and a SERVICE

Containerisation: SERVER, HOST KERNEL, then per container a Userland (OS) and a SERVICE

Example userlands: Ubuntu:14.04, Ubuntu:15.10, RHEL7.2, TinyCoreLinux

Containers do not spin up a distinct kernel

all containers & the host share the same kernel

user-lands are independent

they are separated by Kernel Namespaces

Kernel Namespaces

Containers are ‘grouped processes’ isolated by Kernel Namespaces

resource restrictions applicable through CGroups (disk/net IO)

Namespaces: PID, Network, Mount, IPC, UTS

[Diagram: a HOST running container1 (bash, ls -l), container2 (apache), container3 (mysqld) and container4 (slurmd, ssh); three consul instances are shown among the containers.]
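The namespace types named on this slide can be inspected from the host through /proc. A minimal sketch, assuming a Linux host with /proc mounted; the helper names are made up for illustration, and the function simply returns an empty result where /proc is not available.

```python
import os
import re

# The five namespace types named on the slide, as they appear under /proc/<pid>/ns.
SLIDE_NAMESPACES = ("pid", "net", "mnt", "ipc", "uts")


def parse_ns_link(link):
    """Split a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return match.group(1), int(match.group(2))


def namespaces_of(pid="self"):
    """Return {namespace type: inode} for a process, or {} where /proc is unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or the process is not visible
    for name in SLIDE_NAMESPACES:
        try:
            ns_type, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # namespace type missing or unreadable on this kernel
        result[ns_type] = inode
    return result


def shared_namespaces(a, b):
    """Namespace types for which two processes report the same inode."""
    return sorted(t for t in a if t in b and a[t] == b[t])
```

Two processes inside the same container should report identical inodes for each type, while a process on the host and one inside a container would typically differ, e.g. `shared_namespaces(namespaces_of(1), namespaces_of(some_pid))` with a PID of your choosing.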

Docker Engine

Container Runtime Daemon: creates/…/removes containers, exposes REST API

handles Namespaces, CGroups, bind-mounts, etc.

IP connectivity by default via ‘host-only’ network bridge

[Diagram: a SERVER with eth0; the Docker-Engine connects container1 and container2 to the docker0 bridge.]

Docker Compose

Describes a stack of container configurations: instead of writing a small bash script…

… it holds the runtime configuration as a YAML file.
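To make the "bash script vs. declarative description" point concrete, here is a small sketch that turns a stack description into the `docker run` commands one would otherwise script by hand. The stack itself (service names, images, port mapping, password) is a made-up example, and the dictionary only mirrors the kind of data a Compose YAML file would hold; it is not Compose's own implementation.

```python
# Hypothetical stack description; with Compose this data would live in a YAML file.
STACK = {
    "db": {
        "image": "mysql",
        "environment": {"MYSQL_ROOT_PASSWORD": "example"},
    },
    "web": {
        "image": "httpd",
        "ports": ["8080:80"],
    },
}


def docker_run_args(name, service):
    """Build the `docker run` argument list equivalent to one service entry."""
    args = ["docker", "run", "-d", "--name", name]
    for mapping in service.get("ports", []):
        args += ["-p", mapping]
    for key, value in sorted(service.get("environment", {}).items()):
        args += ["-e", f"{key}={value}"]
    args.append(service["image"])
    return args


def stack_commands(stack):
    """One command per service, in name order so the output is deterministic."""
    return [docker_run_args(name, stack[name]) for name in sorted(stack)]
```

Printing `" ".join(cmd)` for each entry of `stack_commands(STACK)` gives the lines a hand-written bash script would contain; the declarative form keeps that knowledge in data instead.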

Docker Networking

Docker Networking spans networks across engines

KV-store to synchronise (Zookeeper, etcd, Consul)

VXLAN to pass messages along

[Diagram: SERVER0, SERVER1 … SERVER<n>, each running a Docker-Engine and a Consul agent; the Consul agents form a Consul DC; a network labelled ‘global’ connects container0, container1 … containerN across the servers.]
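A rough sketch of the commands involved in such a setup, assuming the Docker 1.9/1.10-era daemon flags `--cluster-store` and `--cluster-advertise` and the built-in `overlay` driver. The Consul host name, the interface and the container and image names are placeholders, not values from the talk; only the network name ‘global’ is taken from the diagram.

```python
def engine_options(kv_host, kv_port=8500, advertise="eth0:2376"):
    """Daemon options pointing an engine at a Consul KV-store (Docker 1.9/1.10-era flags)."""
    return [
        f"--cluster-store=consul://{kv_host}:{kv_port}",
        f"--cluster-advertise={advertise}",
    ]


def create_overlay(name):
    """Command creating a multi-host network with the VXLAN-based overlay driver."""
    return ["docker", "network", "create", "-d", "overlay", name]


def run_on_network(network, container, image):
    """Command starting a container attached to the named network."""
    return ["docker", "run", "-d", "--net", network, "--name", container, image]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(engine_options("consul.example")))
    print(" ".join(create_overlay("global")))
    print(" ".join(run_on_network("global", "container0", "alpine")))
```

Every engine would be started with the same KV-store options; the network then only has to be created once and becomes visible on all engines that share the store.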

Docker Swarm

Docker Swarm proxies docker-engines: serves an API endpoint in front of multiple docker-engines

does placement decisions.

[Diagram: SERVER0, SERVER1 … SERVER<n>, each with a Docker-Engine and a swarm-client, plus one swarm-master; port labels :2376 (three times) and :2375 (once); container1 is started with -e constraint:node==SERVER0.]
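What a "placement decision" with a constraint such as `constraint:node==SERVER0` amounts to can be illustrated with a toy scheduler. This is not Swarm's code: the node list, the container counts and the fewest-containers rule are assumptions chosen to keep the sketch small, and only the `==` operator is handled.

```python
def parse_constraint(expr):
    """Parse 'constraint:node==SERVER0' into ('node', 'SERVER0'). Only '==' is handled."""
    prefix, _, rest = expr.partition(":")
    if prefix != "constraint" or "==" not in rest:
        raise ValueError(f"unsupported expression: {expr!r}")
    key, _, value = rest.partition("==")
    return key, value


def place(nodes, constraints=()):
    """Pick a node: filter by constraints, then prefer the node with the fewest containers."""
    candidates = list(nodes)
    for expr in constraints:
        key, value = parse_constraint(expr)
        candidates = [n for n in candidates if n.get(key) == value]
    if not candidates:
        raise LookupError("no node satisfies the constraints")
    # Ties are broken by node name so the result is deterministic.
    return min(candidates, key=lambda n: (n["containers"], n["node"]))["node"]


# Hypothetical cluster state.
NODES = [
    {"node": "SERVER0", "containers": 3},
    {"node": "SERVER1", "containers": 1},
    {"node": "SERVER2", "containers": 2},
]
```

Without a constraint, `place(NODES)` picks the least loaded node; with `["constraint:node==SERVER0"]` the filter leaves only SERVER0, regardless of its load.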

Docker Swarm [cont]

query docker-engine

query docker-swarm

Introduce new Technologies

Introducing new Tech

Self-perception when introducing new tech…

… not always the same as the perception of others.

credit: TF2 - Meet the Pyro

Docker Buzzword Chaos!

Distributions, Solutions, Auto-Scaling, On-Premise & OverSpill, Orchestration, self-healing, production-ready, enterprise-grade

Vision

1. No special distributions: they are useful for certain use-cases, such as elasticity and green-field deployment

not so much for an on-premise datacenter w/ legacy in it.

2. Leverage existing processes/resources: install workflow, syslog, monitoring

security (ssh infrastructure), user auth.

3. Keep up with the docker ecosystem: incorporate new features of engine, swarm, compose

networking, volumes, user-namespaces

Reduce to the max!

Testbed

Hardware (courtesy of ): 8x Sun Fire x2250 (2x 4core XEON, 32GB, Mellanox ConnectX-2)

Software: Base installation

CentOS 7.2 base installation (updated from 7-alpha)

Ansible

consul, sensu

docker v1.10, docker-compose

docker SWARM

Docker Networking

Synchronised by Consul

[Diagram: node1, node2 … node8, each running a Docker-Engine and a Consul agent; the Consul agents form one Consul DC.]

Docker SWARM

Docker SWARM: synchronised by Consul KV-store

[Diagram: node1, node2 … node8, each running a Docker-Engine and a Consul agent (one Consul DC); swarm agents and a swarm master form the SWARM cluster on top of the engines.]

SLURM Cluster

SLURM within SWARM

[Diagram: node1, node2 … node8 with Docker-Engine, Consul agent (one Consul DC) and the SWARM cluster (swarm agents, swarm master); a SLURM cluster with one slurmctld and three slurmd containers runs inside SWARM.]

SLURM Cluster [cont]

SLURM within SWARM

slurmd within app-container

pre-stage containers

[Diagram: node1, node2 … node8 with Docker-Engine, Consul agent (one Consul DC) and the SWARM cluster; the SLURM cluster (slurmctld, three slurmd) now comes with three hpcg application containers.]

MPI Benchmark

http://qnib.org/mpi

http://qnib.org/mpi-paper

SLURM Cluster [cont]

SLURM within SWARM

slurmd within app-container

pre-stage containers

[Diagram: node1, node2 … node8 with Docker-Engine, Consul agent (one Consul DC) and the SWARM cluster; the SLURM cluster (slurmctld, three slurmd) with three hpcg and three OpenFOAM application containers.]

OpenFOAM Benchmark

http://qnib.org/immutable

http://qnib.org/immutable-paper

Samza Cluster

Distributed Samza

Zookeeper and Kafka cluster

Samza instances to run jobs

[Diagram: node1, node2 … node8 with Docker-Engine, Consul agent (one Consul DC) and the SWARM cluster (swarm agents, swarm master); three zookeeper, three kafka and three samza containers run inside SWARM.]

$ cat test.log | awk '{print $1}' | sed -e 's/HPC/BigData/g' | tee out.log

To Be Explored

Small vs. Big

1. What to base images on?

Ubuntu/Fedora: ~200MB

Debian: ~100MB

Alpine Linux: 5MB (musl-libc)

2. Trim the images down at all cost? How about debugging tools?

Possibility to run tools on the host and ‘inspect’ namespaced processes inside of a container.

If PID-sharing arrives, carving out (e.g.) monitoring could be a thing.

One vs. Many Processes

1. In an ideal world… a container only runs one process, e.g. the HPC solver.

2. In reality… MPI wants to connect to an sshd within the job-peers

monitoring, syslog, service discovery should be present as well.

3. How fast / aggressively to break with traditional approaches?

Docker Network

Plugin System

VXLAN

MACVLAN

How about IPoIB?

Reproducibility / Downscaling

Running OpenFOAM on a small scale is cumbersome:

manually install OpenFOAM on a workstation

be confident that the installation works correctly

A containerised OpenFOAM installation tackles both

http://qnib.org/immutable

http://qnib.org/immutable-paper

Service Discovery

1. Since the environments are rather dynamic… how do the containers discover services?

external registry as part of the framework?

discovery service as part of the container stacks?
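One way the "discovery service as part of the container stack" option could look, sketched against Consul's HTTP API (`/v1/agent/service/register` and `/v1/catalog/service/<name>`). The agent address, the service name and the tag are assumptions for illustration; the requests are only constructed here, not sent.

```python
import json
from urllib.request import Request

CONSUL = "http://127.0.0.1:8500"  # assumption: a local Consul agent on its default HTTP port


def register_request(name, port, tags=()):
    """Request a container could send at start-up to announce its service."""
    body = json.dumps({"Name": name, "Port": port, "Tags": list(tags)}).encode()
    return Request(
        f"{CONSUL}/v1/agent/service/register",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def lookup_request(name):
    """Request a peer could send to find all instances of a service."""
    return Request(f"{CONSUL}/v1/catalog/service/{name}", method="GET")


def endpoints(catalog_entries):
    """Extract (address, port) pairs from a decoded catalog response.

    An empty ServiceAddress means the service uses the node's own Address.
    """
    return [(e["ServiceAddress"] or e["Address"], e["ServicePort"]) for e in catalog_entries]
```

Passing either request to `urllib.request.urlopen` would perform the call against a running agent; a slurmd container could, for example, register itself with `register_request("slurmd", 6818, ["hpc"])`.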

Orchestration Frameworks

With Docker Swarm it is rather easy to spin up a Kubernetes or Mesos cluster within Swarm.

[Diagram: SERVER0, SERVER1 … SERVER<n>, each with a Docker-Engine and a swarm-client, plus a swarm-master; three etcd and three kubelet containers, together with a scheduler and an apiserver, run on top.]

Immutable vs. Config Mgmt

1. Containers should be controlled via ENV or flags

External access/change of a running container is discouraged

2. Configuration management

Downgraded to bootstrap a host?
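A minimal sketch of what "controlled via ENV or flags" could mean for a container entrypoint: all settings are resolved once at start-up and nothing is changed from the outside afterwards. The setting names (`LISTEN_PORT`, `LOG_LEVEL`) and the precedence order are assumptions for illustration.

```python
import argparse
import os


def load_config(argv, environ=None):
    """Resolve settings with precedence: flag > environment variable > default."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="hypothetical containerised service")
    parser.add_argument("--listen-port", type=int,
                        default=int(environ.get("LISTEN_PORT", "8080")))
    parser.add_argument("--log-level",
                        default=environ.get("LOG_LEVEL", "info"))
    args = parser.parse_args(argv)
    return {"listen_port": args.listen_port, "log_level": args.log_level}
```

An entrypoint would call `load_config(sys.argv[1:])`; the same image then runs unchanged in development and production, differing only in `docker run -e LISTEN_PORT=9000 …` or in the flags appended to the command.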

Continuous Dev./Integration

If containers are immutable within the pipeline:

testing/deployment should be automated

developers should have a production replica

Docker Momentum

[Chart: axes ‘Software Dev’ and ‘Datacenter Ops’; plotted areas: IT Tinkering (Hello World), Continuous Dev/Int/Dep, Microservices / hyper scale, Big Data, High Performance Computing (HPC).]

Disclaimer: subjective exaggeration

Docker in Software Development

Spinning up a production-like environment is great:

MongoDB, PostgreSQL, memcached as separate containers

python2.7, python3.4

Like python’s virtualenv on steroids, iteration speedup through reproducibility

Docker in HPC development

Spinning up a production-like environment is… …not that easy

focus more on the engineer/scientist, not the software developer

1. For development it might work close to non-HPC software dev

2. But is that the iteration focus? Rather job settings / input data?

Separation of Concerns?

Split input iteration / development from operation:

non-distributed stays vanilla

transition to HPC cluster using tech to foster operation

http://gmkurtzer.github.io/singularity

Input/Dev

containerd Integration

Docker-Engine 1.11 will not be the parent of containers

runC usage under the hood

Recap aka. IMHO

1. Separate Dev and Ops: don’t block the momentum fostering iteration speed in Development

2. Using vanilla docker-tech: keep up with the ecosystem and prevent vendor/ecosystem lock-in

3. 80/20 rule: have caveats on the radar but don’t bother too much

everything is so fast moving - it’s hard to predict

Q&A

https://github.com/qnib/hpcac-cluster2016

http://qnib.org

eGalea Workshop (Pisa) <plz ping me if you are interested>

23.06.2016