container & kubernetes

Container & Kubernetes Written by Ted Jung ([email protected]) (Cloud Native Engineer)

Upload: ted-jung

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Container & kubernetes

Container & Kubernetes

Written by Ted Jung ([email protected])(Cloud Native Engineer)

Page 2: Container & kubernetes

I. Base Techs (container)
• FS
• CGroups
• Namespaces
• COW

II. Kubernetes (service networking)

Page 3: Container & kubernetes

What is a Container?

A lightweight VM? Not quite. Unlike a VM, a container:

1. Uses the host kernel
2. Does not need to boot a different OS
3. Does not have its own modules
4. Does not need init as PID 1

It’s just normal processes on a host machine

Page 4: Container & kubernetes

What is a Container?

Containers wrap a piece of software in a complete filesystem that contains everything it needs to run:
• Code
• Runtime
• System tools
• System libraries
(anything you can install on a server)

This guarantees that it will always run the same, regardless of the environment it is running in.

Page 5: Container & kubernetes

VM vs. Container

[Figure: two stacks side by side, bottom to top]
VM:        Infrastructure → Operating system → Hypervisor → Guest OS (one per app) → Bins/Libs → App1, App2, App3
Container: Infrastructure → Operating system → Docker Engine → Bins/Libs → App1, App2, App3

• Share the kernel with other containers
• Running as isolated processes in user space
• Docker containers are not tied to any specific infrastructure

Page 6: Container & kubernetes

What is Docker?

lmctfy, openvz, zone, libcontainer, lxc, rkt

Page 7: Container & kubernetes

Why Docker?

• Easy to use: simple and accessible tooling
• High degree of reuse and extensibility: stackable file system

Page 8: Container & kubernetes

Before going further...

• FS
• Cgroups
• Namespaces

Page 9: Container & kubernetes

Base tech of container(AUFS)

An ordered group of branches:
- a branch (= a single directory)
- is stored in a directory on the host
- at least a single read-only branch, plus many read-write branches

[Figure: stacked branches labeled Read-only, Read-write, Read-write, Read-write]

Page 10: Container & kubernetes

Base tech of container(AUFS)

Mount point

With AUFS, the mount point of a container is:
/var/lib/docker/aufs/mnt/$CONTAINER_ID/

It is only mounted when the container is running

AUFS branches (read-only & read-write) are in:
/var/lib/docker/aufs/diff/$CONTAINER_OR_IMAGE_ID

Page 11: Container & kubernetes

Base tech of container(AUFS)

e.g. Create Container

On the host:
/proc/mounts
/sys/fs/aufs/si_XXXX/br*
/var/lib/docker/aufs/diff/XXX

Container = a group of branches

[Figure panels: host, container]
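The branch list of a mounted container can be read back from the sysfs files named above. The following is a minimal sketch in Python, assuming each br<N> file under /sys/fs/aufs/si_XXXX/ holds one line of the form <directory>=<mode> (for example a path ending in =rw or =ro+wh). The function name read_aufs_branches and its sysfs_root parameter are made up for illustration.

```python
import os
import re

BRANCH_FILE = re.compile(r"^br(\d+)$")  # br0, br1, ...: the number is the branch index


def read_aufs_branches(sysfs_root="/sys/fs/aufs"):
    """Return {si_dir_name: [(branch_dir, mode), ...]} ordered by branch index."""
    mounts = {}
    if not os.path.isdir(sysfs_root):
        return mounts  # AUFS is not in use on this host
    for si_name in sorted(os.listdir(sysfs_root)):
        si_path = os.path.join(sysfs_root, si_name)
        if not (si_name.startswith("si_") and os.path.isdir(si_path)):
            continue
        indexed = []
        for entry in os.listdir(si_path):
            match = BRANCH_FILE.match(entry)
            if not match:
                continue  # skip files that are not branch entries
            with open(os.path.join(si_path, entry)) as handle:
                line = handle.read().strip()
            # Assumed format "<directory>=<mode>": split on the last "=".
            branch_dir, _, mode = line.rpartition("=")
            indexed.append((int(match.group(1)), branch_dir, mode))
        mounts[si_name] = [(d, m) for _, d, m in sorted(indexed)]
    return mounts
```

Sorting on the numeric index rather than the file name keeps br10 after br2, so the returned list follows the branch order.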

Page 12: Container & kubernetes

Base tech of container(AUFS)

A file (container / host)

[Figure: the same file seen from the container and from the Host, and the "Delete container" step]

Page 13: Container & kubernetes

Base tech of container(AUFS)

Docker v1.10: content addressable storage model

Ubuntu 15.04 image:

Layer            Size        Kind
C84bfc126a2      188 MB      Image layer (R/O)
D14bfc54ea1      194.5 KB    Image layer (R/O)
c80179960767     1.895 KB    Image layer (R/O)
6d45a3841788     0 B         Image layer (R/O)
Thin R/W layer               Container layer

- The Docker storage driver:
  • enables and manages both the image layers and the container layer
  • stacks layers, providing a single unified view
- Location: /var/lib/docker/

Layer IDs: random UUIDs are replaced by cryptographic content hashes
• Security
• Avoid ID collisions
• Guarantees data integrity
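The idea behind content addressing can be sketched in a few lines of Python. This is an illustration only, not Docker's actual ID computation: the helper names content_id and verify are hypothetical, and SHA-256 is assumed as the hash.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an ID from the bytes themselves (SHA-256, hex-encoded)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_id: str) -> bool:
    """Data integrity check: recompute the hash and compare it with the ID."""
    return content_id(data) == expected_id


layer_a = b"contents of a layer"
layer_b = b"contents of a layer"      # identical bytes produced elsewhere
tampered = b"contents of a layer!"

assert content_id(layer_a) == content_id(layer_b)  # same content, same ID
assert not verify(tampered, content_id(layer_a))   # any change is detected
```

Because the ID is a function of the content, two parties cannot end up with the same ID for different data, which a random UUID cannot guarantee.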

Page 14: Container & kubernetes

Storage drivers: AUFS, Btrfs, Device Mapper, OverlayFS, ZFS

File modification (create, delete, update) steps:

1. Search through the image layers, top-down, for the file
2. Perform a “copy-up” operation, which copies the file to the thin writable layer
3. Modify the copy of the file

[Figure: the Ubuntu 15.04 layer stack (C84bfc126a2 188 MB, D14bfc54ea1 194.5 KB, c80179960767 1.895 KB, 6d45a3841788 0 B, thin R/W layer) before and after a change. A 2 B modification on 6d~ triggers a copy-up into the thin R/W layer, where the modification is applied.]
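The three modification steps can be sketched with plain dictionaries standing in for layers. This is a toy model under stated assumptions, not a storage driver: the Container class and its methods are hypothetical, each layer is a dict mapping a path to bytes, and only updates of existing files are covered.

```python
class Container:
    """Read-only image layers shared by reference, plus one private writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # top-most layer first; never modified
        self.rw_layer = {}                # the thin R/W layer of this container

    def read(self, path):
        if path in self.rw_layer:
            return self.rw_layer[path]
        for layer in self.image_layers:   # step 1: search the layers top-down
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, modify):
        if path not in self.rw_layer:
            self.rw_layer[path] = self.read(path)            # step 2: copy-up
        self.rw_layer[path] = modify(self.rw_layer[path])    # step 3: modify the copy


image = [{"/etc/motd": b"hi"}, {"/bin/sh": b"\x7fELF", "/etc/motd": b"old"}]
first, second = Container(image), Container(image)
first.write("/etc/motd", lambda data: data + b"!!")
```

After the write, only the first container sees the changed file; the second container and the image layers still hold the original bytes.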

Page 15: Container & kubernetes

Base tech of container(CGroups)

Started by Google engineers Paul Menage and Rohit Seth in 2006 under the name “Process Containers”
A kernel capability to limit, account for (meter), and isolate resources: CPU, memory, disk I/O, network

Cgroup controllers:
• Memory controller
• CPUset controller
• CPU accounting controller
• CPU scheduler controller
• Devices controller
• I/O controller for block devices
• Freezer
• Network class controller

Goal: reducing resource contention and increasing predictability in performance

Page 16: Container & kubernetes

Base tech of container(CGroups)

Controller   Description
memory       Allows setting limits on RAM and resource usage, and querying the cumulative usage of all processes in the group
cpuset       Binds processes within a group to a set of CPUs and controls migration between CPUs
cpuacct      Reports CPU usage for a group of processes
cpu          Controls the prioritization of processes in the group
devices      Access control lists on character and block devices

Page 17: Container & kubernetes

Base tech of container(CGroups)

Cgroups (control groups)
• A ‘cgroup’ associates a set of tasks with a set of parameters for one or more subsystems.
• A ‘subsystem’ is a module that makes use of the task grouping facilities provided by cgroups to treat groups of tasks in particular ways. It is typically a “resource controller” that schedules a resource and applies per-cgroup limits.
• A ‘hierarchy’ is a set of cgroups arranged in a tree, such that every task in the system is in exactly one of the cgroups in the hierarchy, and a set of subsystems; each subsystem has system-specific state attached to each cgroup in the hierarchy. Each hierarchy has an instance of the cgroup virtual filesystem associated with it.

Cgroup subsystems
- Isolation and special controls: cpuset, namespace, freezer, device, checkpoint/restart
- Resource control: cpu (scheduler), memory, disk I/O, network
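Since each hierarchy is exposed as a virtual filesystem, a cgroup can be driven with ordinary file operations. The sketch below assumes a cgroup v1 memory hierarchy mounted at /sys/fs/cgroup/memory and root privileges; on cgroup v2 hosts the corresponding files are memory.max and cgroup.procs. The function name create_memory_cgroup and its parameters are hypothetical.

```python
import os


def create_memory_cgroup(name, limit_bytes, pid, root="/sys/fs/cgroup/memory"):
    """Create a cgroup, cap its memory, and move one task into it."""
    group = os.path.join(root, name)
    os.makedirs(group, exist_ok=True)      # on cgroupfs, mkdir creates the cgroup
    with open(os.path.join(group, "memory.limit_in_bytes"), "w") as handle:
        handle.write(str(limit_bytes))     # per-cgroup limit applied by the memory subsystem
    with open(os.path.join(group, "tasks"), "w") as handle:
        handle.write(str(pid))             # associate the task with this cgroup
    return group
```

A call such as create_memory_cgroup("demo", 256 * 1024 * 1024, os.getpid()) would cap the calling process at 256 MiB on a host that matches the assumptions above.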

Page 18: Container & kubernetes

Base tech of container(Namespace)

Namespaces handle the six items in the table below:

Namespace   Isolates
PID         Processes (process IDs)
NET         Network interfaces, iptables, routing tables, sockets
MNT         Root file system
UTS         Hostname
IPC         Inter-process communication
USER        UID/GID, security improvement

Page 19: Container & kubernetes

Base tech of container(Namespace)

Namespaces are created with the system call “clone()”
Namespaces are materialized by pseudo-files in /proc/<pid>/ns
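The pseudo-files can be inspected directly. The sketch below assumes a Linux host where each entry in /proc/<pid>/ns is a symbolic link whose target looks like pid:[4026531836]; two processes are in the same namespace when the targets are equal. The helper names list_namespaces and shares_namespace are made up for illustration.

```python
import os


def list_namespaces(pid="self", proc_root="/proc"):
    """Map each namespace name (pid, net, mnt, ...) to the target of its link."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def shares_namespace(pid_a, pid_b, kind, proc_root="/proc"):
    """True when both processes point at the same namespace of the given kind."""
    first = list_namespaces(pid_a, proc_root)
    second = list_namespaces(pid_b, proc_root)
    return first[kind] == second[kind]
```

On such a host, list_namespaces() with no arguments describes the calling process, and comparing a container process with PID 1 of the host would show which namespaces the container has of its own.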

Page 20: Container & kubernetes

Base tech of container(Summary)

Why do we need CGroups?
• SLA management: reduce resource contention and increase predictability in performance
• Large virtual consolidation: prevent a single virtual machine, or a group of them, from monopolizing resources or impacting other environments

Cgroups: limit the use of resources
Namespaces: limit which resources can be seen; they provide processes with their own view of the system

[Figure: Docker on top of libcontainer, which uses the namespaces and cgroups of the Linux kernel]

Page 21: Container & kubernetes

Base tech of container(COW)

Everyone has a single shared copy of the same data until it’s overwritten, and then a copy is made.

Docker uses COW (copy-on-write), which essentially means that every instance of your Docker image uses the same files until one of them needs to change a file.

Page 22: Container & kubernetes

K8S terms

ReplicationControllers
• Dynamically manage (create, kill, etc.) the lifecycle of pods (scaling up/down, rolling updates)

Clusters
• Pool of Kubernetes resources (servers)

Services
• an abstraction
• a REST object
• a logical set of pods & a policy

Pods
• a collocated group of Docker containers with shared volumes
• pods are mortal: each one is born and dies
• the deployable unit: created, scheduled, managed

[Figure: servers in a cluster running pods of containers; the pods are grouped into services, with an iptables rule in front]

Page 23: Container & kubernetes

K8S terms

    {
      "kind": "Service",
      "apiVersion": "v1",
      "metadata": { "name": "my-service" },
      "spec": {
        "selector": { "app": "MyApp" },
        "ports": [
          { "protocol": "TCP", "port": 80, "targetPort": 9376 }
        ]
      }
    }

[Figure: the Service proxy forwards traffic for the cluster IP of my-service to targetPort 9376 on the endpoints, which are the pods matched by the selector “app: MyApp”]
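How a selector turns into a list of endpoints can be sketched as follows. This is an illustration of the matching rule only, not the Kubernetes implementation: the pod dictionaries (with "ip" and "labels" keys) and the function names are assumptions made for the example.

```python
def matches(selector, labels):
    """A pod matches when it carries every key/value pair of the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(service, pods):
    """Return "ip:targetPort" for each pod selected by the service."""
    selector = service["spec"]["selector"]
    target_port = service["spec"]["ports"][0]["targetPort"]
    return [f'{pod["ip"]}:{target_port}'
            for pod in pods if matches(selector, pod["labels"])]


service = {
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {"name": "my-service"},
    "spec": {
        "selector": {"app": "MyApp"},
        "ports": [{"protocol": "TCP", "port": 80, "targetPort": 9376}],
    },
}
pods = [
    {"ip": "192.168.1.4", "labels": {"app": "MyApp", "tier": "web"}},
    {"ip": "192.168.2.9", "labels": {"app": "Other"}},
    {"ip": "192.168.3.7", "labels": {"app": "MyApp"}},
]
selected = endpoints_for(service, pods)
```

Extra labels on a pod do not prevent a match; only the pairs named in the selector are compared.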

Page 24: Container & kubernetes

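K8S terms (routing mode of service traffic)

mode: userspace
[Figure: kube-proxy watches the master and installs an iptables rule for the service; traffic to the service is redirected to kube-proxy, which forwards it to one of the endpoints (pods)]

mode: iptables
[Figure: kube-proxy watches the master and installs iptables rules for the service; traffic is redirected by iptables directly to one of the endpoints (pods)]
• Fast
• Reliable
But:
• No retry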
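The "no retry" trade-off can be made concrete with a small simulation. This is a simplified model, not kube-proxy code: the function names are hypothetical, is_healthy stands in for "the connection succeeds", and the random pick is only an approximation of how iptables rules spread connections.

```python
import random


def forward_userspace(endpoints, is_healthy):
    """Proxy process in the data path: it can try the next backend on failure."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None  # every backend refused the connection


def forward_iptables(endpoints, is_healthy, rng=random):
    """Kernel rule picks one backend; if that one is down, the connection fails."""
    endpoint = rng.choice(endpoints)
    return endpoint if is_healthy(endpoint) else None


endpoints = ["192.168.1.4", "192.168.2.9", "192.168.3.7"]
only_last_up = lambda endpoint: endpoint == "192.168.3.7"
```

With only one healthy backend, forward_userspace always reaches it, while forward_iptables succeeds only when the pick happens to land on it.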

Page 25: Container & kubernetes

How K8S works

Kubernetes Master
• API server: validates and configures Pod, Service, and Replication controller objects; serves REST operations
• ETCD: the master’s status is stored here
• Scheduler and Kubernetes controller manager server: schedule pods to worker nodes, synchronize pod status

Worker Node
• kubelet: takes a container manifest (YAML description of a pod)
• kube-proxy: Services
• pods

[Figure labels: ports 8080 and 4001]

Page 26: Container & kubernetes

K8S Service Traffic Flows

[Figure: a request passes through Service 1 (front-end), Service 2 (…), and Service 3 (back-end). Each service is fronted by kube-proxy and backed by pods of containers managed by replication controllers (rc:3, rc:1, rc:2). skydns sits in the ClusterDomain; the pods sit in the ClusterPool.]

Cluster-domain: 10.100.0.10 (Service_Cluster_IP_Range, virtual IP)
Cluster-pool: 192.168.0.0/16
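The two address ranges on this slide can be told apart with the standard ipaddress module. In the sketch below the pod range comes from the slide, while the service range 10.100.0.0/16 is an assumption: the slide only shows the single virtual IP 10.100.0.10. The function name classify is hypothetical.

```python
import ipaddress

CLUSTER_POOL = ipaddress.ip_network("192.168.0.0/16")   # pod addresses (from the slide)
SERVICE_RANGE = ipaddress.ip_network("10.100.0.0/16")   # ASSUMED range around 10.100.0.10


def classify(address):
    """Say whether an address belongs to a pod, a service VIP, or neither."""
    ip = ipaddress.ip_address(address)
    if ip in CLUSTER_POOL:
        return "pod"
    if ip in SERVICE_RANGE:
        return "service"
    return "external"
```

Under these assumptions, classify("10.100.0.10") reports a service address and classify("192.168.3.7") reports a pod address.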

Page 27: Container & kubernetes

K8S Service Traffic Flows (e.g.)

Page 28: Container & kubernetes

Then, what is Kube-proxy?

[Figure: kube-proxy runs on Node #1 and Node #2 next to the pods and their containers, and maintains the iptables rules]

Kube-proxy:
• Watches the Kubernetes master for the addition and removal of objects:
  - Service
  - Endpoints
• Can do simple TCP and UDP stream forwarding
• Can do round-robin TCP and UDP forwarding
• Manages the VIP
• Watches all services
• Updates iptables after the backends change
• Translates the Service IP to a Pod IP

Master / ETCD cluster
• API Server
• ETCD: cluster status, current configuration
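The round-robin translation from a Service IP to a Pod IP can be sketched as a small lookup table. This is a toy model under stated assumptions, not kube-proxy itself: the ServiceTable class and its methods are hypothetical, and update stands in for the reaction to a watched Endpoints change.

```python
import itertools


class ServiceTable:
    """Maps a service VIP to its backends and hands them out in turn."""

    def __init__(self):
        self._backends = {}
        self._cursors = {}

    def update(self, vip, pod_ips):
        """Called when the watched Service or Endpoints objects change."""
        self._backends[vip] = list(pod_ips)
        self._cursors[vip] = itertools.cycle(self._backends[vip])

    def translate(self, vip):
        """Return the next Pod IP for this Service IP (round robin)."""
        if not self._backends.get(vip):
            raise LookupError(f"no endpoints for {vip}")
        return next(self._cursors[vip])


table = ServiceTable()
table.update("10.100.0.20", ["192.168.1.4", "192.168.2.9"])
picks = [table.translate("10.100.0.20") for _ in range(3)]
```

Replacing the whole backend list on every update keeps the sketch short; it also restarts the rotation, which a real proxy might prefer to avoid.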

Page 29: Container & kubernetes

SkyDNS

SkyDNS in Kubernetes?
• Kubernetes offers a DNS cluster add-on, which most of the supported environments enable by default.
• SkyDNS is a DNS service, with some custom logic to slave it to the Kubernetes API server.

When a Service is created:
• A DNS name is mapped to the service
• A virtual IP address is assigned to the service

kubelet --v=5 \
  --address=0.0.0.0 \
  --port=10250 \
  --hostname_override=105.144.47.24 \
  --api_servers=105.*.*.23:8080 \
  --healthz_bind_address=0.0.0.0 \
  --healthz_port=10248 \
  --network_plugin=calico \
  --cluster-domain=cluster.local \
  --cluster-dns=10.100.0.10 \
  --logtostderr=true

Page 30: Container & kubernetes

SkyDNS (cont.)

Components:
• ETCD in pod (DNS records)
• SkyDNS in pod (DNS server)
• Kube2SKY in pod (bridging between Kubernetes and ETCD)
• Kubernetes (kubelet), with the running pods
• Kubernetes (Master)

Flow:
• Service info is pulled from the master into SkyDNS, e.g. which services have changed? (Query, Update)
• Service info is published/written into etcd; SkyDNS is then able to retrieve the name of a service (Retrieve, Search)
• The kubelet presents itself to the pods as a DNS server
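The name-to-record mapping can be sketched as two small helpers. Both conventions used here are assumptions rather than facts taken from the slides: service names of the form <service>.<namespace>.svc.<cluster-domain>, and SkyDNS records stored in etcd under /skydns with the domain labels reversed. The function names are hypothetical.

```python
def service_fqdn(service, namespace="default", cluster_domain="cluster.local"):
    """Build the DNS name under which a service would be published."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def skydns_etcd_key(fqdn, prefix="/skydns"):
    """Turn a DNS name into an etcd key by reversing its labels."""
    labels = fqdn.strip(".").split(".")
    return prefix + "/" + "/".join(reversed(labels))


name = service_fqdn("my-service")
key = skydns_etcd_key(name)
```

Under these assumptions the bridge would write the virtual IP of my-service at the key computed above, and SkyDNS would answer lookups for the name by reading that key.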

Page 31: Container & kubernetes

Thank You