criu: time and space travel for linux containers

Upload: kirill-kolyshkin

Post on 12-Apr-2017

CRIU:
time and space travel
for Linux containers

Kirill Kolyshkin

ContainerDays NYC, 30 Oct 2015

Agenda

Why would we want to migrate containers

Why wouldn't we want to migrate containers

How complex it is to migrate containers

This talk is not about CRIU per se: I could talk about it for a whole day, and you are probably not interested. It is about one of its applications, live migration of containers. I'm going to explain why and when it is useful, why it sometimes isn't, and what obstacles you will face if you decide to do it.

Live migration at a glance

Save the state

Transfer the state

Restore the state

What is live migration? It is very well described in science fiction, where it is called teleportation. An object is analyzed, information about its bits and pieces is communicated to the other side, and it is reassembled at the destination. It's pretty much the same for containers, except that it's already implemented.
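The three steps on this slide can be sketched as three commands. This is a minimal illustration, not a working migration tool: it only builds the argument lists and runs nothing. The PID, the image directory, the host name, and the use of rsync and ssh for the transfer are assumptions made for the example; `criu dump` and `criu restore` with `--tree` and `--images-dir` are the CRIU subcommands and options it relies on.

```python
import shlex


def migration_commands(pid, images_dir, dest_host):
    """Return the save / transfer / restore steps as argv lists.

    Nothing is executed here. A real migration also needs root
    privileges and a matching environment on the destination.
    """
    # Save the state: dump the process tree rooted at `pid` into image files.
    save = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]
    # Transfer the state: copy the image files (rsync is an assumption).
    transfer = ["rsync", "-a", images_dir + "/", f"{dest_host}:{images_dir}/"]
    # Restore the state: recreate the processes from the images, remotely.
    restore = ["ssh", dest_host, "criu", "restore", "--images-dir", images_dir]
    return [save, transfer, restore]


if __name__ == "__main__":
    # Hypothetical PID, directory, and host, for illustration only.
    for argv in migration_commands(1234, "/tmp/ckpt", "dest.example"):
        print(shlex.join(argv))
```

The container is down for the whole duration of these three steps, which is the problem the later slides on pre-copy and post-copy address.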

Container live migration

It has been implemented in OpenVZ for about 10 years, in the kernel, as a set of kernel modules. For the last 4 years we have been re-implementing the feature with a different engine: the functionality of analyzing, decomposing, and then recomposing the processes is developed not as kernel modules but as a user-space application.

Why would we want to migrate containers?

It's awesome!

Load balancing in a cluster

Kernel upgrade

Can be done without migration

Hardware upgrade

Why would we want to migrate containers? First, it looks awesome, totally mind-blowing. If you take an inexperienced user and show them a set of processes, with all the bells and whistles, being moved from one physical server to another without being stopped, it looks cool! Live migration can also be used to balance load between a few machines.

Why wouldn't we want to live migrate containers?

Of course, live migration is a complex technology. It is error-prone, and people are afraid of using it because of its various possible side effects, good or bad. So there are ways to avoid live migration.

How to avoid live migrating containers

Incoming traffic load balancing

Microservices

Crash-driven upgrades

Scheduled downtimes

One method is to balance not the processes that use the resources, but the cause of that resource usage. For example, incoming network traffic can be load-balanced by a frontend, if your architecture allows it.

Another method is microservices: you run services that don't have much context or state, so you can stop any of them and start it on a different machine quickly and without losing anything. Again, if your architecture allows it. This is the paradigm of OpenStack, Docker, and Docker-based projects such as Kubernetes.

The third option is somewhat peculiar, but it is still in use: you wait until there's a major problem with the machine, and then you reboot and upgrade.

The obvious option is to plan a downtime.

How to make live migration really live?

Need to get rid of migrating memory while the container is frozen

Two ways:

Pre-copy the memory

Post-copy the memory

Still, live migration is also a way to go, and once we start using it we see that a lot of the migration time is spent moving memory over the network. To make the migration really live, with truly uninterrupted service, the memory transfer has to be moved out of the period when the container is frozen. There are two options for that.

The first is to copy all or most of the memory before freezing the container (pre-copy).

The second is not to migrate the memory up front at all, but to fetch it over the network after the container is restored (post-copy).

Live migration in more detail

Pre-copy: collect and transfer the memory (might be iterative)

Freeze the container

Save its state

Copy the state

Restore

Unfreeze

Post-copy: swap in the memory over the network

Once we take into account the need to pre-copy or post-copy the memory, live migration becomes more complicated.
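Why the pre-copy step "might be iterative" can be seen from a small model. The sketch below is a simulation under simplifying assumptions, not CRIU code: a constant dirtying rate, a constant network bandwidth, and a hypothetical threshold below which we stop iterating and freeze the container. All names and numbers are made up for the example.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth,
                   max_rounds=10, stop_threshold=100):
    """Simulate iterative pre-copy of memory.

    total_pages    -- pages of memory in the container
    dirty_rate     -- pages the running container dirties per second
    bandwidth      -- pages the network can transfer per second
    stop_threshold -- freeze once this few pages remain (assumed policy)

    Returns (pages sent in each live round, pages left for the frozen phase).
    """
    to_send = total_pages
    sent_per_round = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break
        sent_per_round.append(to_send)
        # While this round was on the wire (to_send / bandwidth seconds),
        # the container kept running and dirtied more pages.
        to_send = min(total_pages, dirty_rate * to_send // bandwidth)
    return sent_per_round, to_send


if __name__ == "__main__":
    rounds, frozen_part = precopy_rounds(100_000, dirty_rate=1_000, bandwidth=10_000)
    print("sent while running:", rounds)
    print("left to copy while frozen:", frozen_part)
```

In this model each round shrinks the remainder by the ratio of dirty rate to bandwidth, so pre-copy only converges when the container dirties memory more slowly than the network can move it. Otherwise the loop hits `max_rounds` with most of the memory still to copy.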

Obstacles, booby traps, and rakes

VS

There are some specifics in implementing such a technology for containers. Live migration for VMs has existed for a while, whereas for containers it is relatively new. So, to better understand the details, let's compare containers and VMs, step by step.

What do we need to migrate

Virtual Machine

Environment (i.e. virtual hardware)

CPU state

Memory

Container

Environment (cgroups, namespaces)

Processes and stuff

Memory

For a VM, that is all the virtual hardware a hypervisor gives to the guest OS, plus the virtual CPU state and the memory state.

It's more or less the same for CTs, just named differently. Instead of virtual hardware we have cgroups and namespaces; instead of CPUs we have processes.

Collect and copy the memory

Virtual Machine

All memory is at hand

Container

Memory is spread through the processes

Different types of memory (shared/private, backed by a file or not)

Need to collect the processes first

Only then collect the memory

This is not a problem for a VM, as the hypervisor manages the VM's memory and knows everything about it.

For CTs, there are many different types of memory: shared or private, backed by a file or not, and so on.
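The memory types mentioned here are visible from user space in /proc/<pid>/maps, one line per mapping. The sketch below classifies a single line along the two axes from the slide. It is a simplified illustration, not how CRIU itself collects memory, and the sample lines are invented for the example.

```python
def classify_mapping(maps_line):
    """Classify one /proc/<pid>/maps line.

    Line format: address perms offset dev inode [pathname]
    Returns (sharing, backing, pathname).
    """
    fields = maps_line.split(None, 5)
    perms, inode = fields[1], int(fields[4])
    path = fields[5].strip() if len(fields) > 5 else ""
    # The fourth permission character is 's' for shared, 'p' for private.
    sharing = "shared" if perms[3] == "s" else "private"
    # Simplification: treat a non-zero inode as "backed by a file".
    backing = "file-backed" if inode != 0 else "anonymous"
    return sharing, backing, path


if __name__ == "__main__":
    # Invented sample lines in the /proc/<pid>/maps format.
    samples = [
        "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat",
        "7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]",
        "7f0000000000-7f0000001000 rw-s 00000000 00:05 42 /dev/shm/seg",
    ]
    for line in samples:
        print(classify_mapping(line))
```

Each combination needs different handling at dump time. For instance, the untouched pages of a private file-backed mapping can be re-read from the file on the destination, while anonymous memory has to be copied.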

Freezing

Virtual Machine

Suspend all CPUs

Container

Walk the tree (/proc), catch the processes and freeze those

Freezer cgroup helps a bit

There are two ways to catch the processes. The first follows the steps of the ps utility: get the processes one by one and stop them, keeping in mind that the ones we haven't stopped yet might fork, and their children might fork too.

The second option is to use the freezer cgroup. If you put processes inside such a cgroup, you can later say "freeze!" and it will. In that case the freezing is done by the kernel, which is good at it.
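Both approaches boil down to a few file operations. The sketch below assumes the cgroup v1 freezer controller, where writing FROZEN or THAWED to the freezer.state file of a cgroup freezes or thaws its tasks. The demo runs against a temporary directory that stands in for /proc and for a cgroup directory, so it touches no real processes. The helper names are made up for the example.

```python
import os
import tempfile


def list_pids(proc_root="/proc"):
    """The 'ps' approach: every all-digit directory under /proc is a process."""
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdigit())


def set_frozen(cgroup_dir, frozen):
    """The cgroup approach: ask the kernel to freeze or thaw the whole group.

    Assumes a cgroup v1 freezer directory containing a freezer.state file.
    """
    state = "FROZEN" if frozen else "THAWED"
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)
    return state


def demo():
    """Exercise both helpers on a fake directory (no real processes involved)."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("1", "42", "self"):
            os.mkdir(os.path.join(root, name))
        pids = list_pids(root)
        set_frozen(root, True)
        with open(os.path.join(root, "freezer.state")) as f:
            state = f.read()
    return pids, state


if __name__ == "__main__":
    print(demo())
```

The PID list from the first helper is only a snapshot: a process that forks after the listing is missed, which is the race described above. With a real freezer cgroup the kernel handles that race, although the state file may read FREEZING for a while before it reports FROZEN.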

Saving the state

Virtual Machine

Hardware state, tree, 300K, ~70 objects

Container

State of all objects, graph, 160K, ~1000 objects

Not all objects have a decent API to get the state

For a VM running a fresh install of, say, Fedora Linux, the state excluding memory is about 300K of data and fewer than 100 objects.

For a CT, this is far more fine-grained: open files, sockets, and everything else those processes might have used. Plus, some of those objects, such as files, might be shared, so we have a graph rather than a tree. It takes somewhat less space (comparable to a VM), but the number of objects is more than an order of magnitude greater.

The second problem is not a fundamental one, but rather a specific of the CRIU implementation. If we did the checkpoint from the kernel, we would know everything, the state of every object. But as we are doing it from user space, we need some API to get that state.

Copying the state

Virtual Machine

Can read and copy at once, easy to serialize

Container

Not easy to serialize as it's a graph, not a tree

For containers, the receiving side can't simply restore objects as they arrive from a socket, since some objects may depend on objects that have not been copied yet.
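The ordering problem on the receiving side can be illustrated with a toy receiver that holds objects back until everything they depend on has arrived. This is a sketch of the general idea only. The object names and the (id, dependencies) stream format are hypothetical and do not reflect CRIU's image format.

```python
def receive(stream):
    """Consume (obj_id, deps) pairs in arrival order.

    Returns (order in which objects became usable, ids still blocked).
    An object becomes usable only once all of its dependencies are usable.
    """
    ready = set()
    waiting = {}
    order = []
    for obj_id, deps in stream:
        waiting[obj_id] = set(deps)
        # Keep releasing objects until a full pass makes no progress.
        progressed = True
        while progressed:
            progressed = False
            for oid in list(waiting):
                if waiting[oid] <= ready:
                    del waiting[oid]
                    ready.add(oid)
                    order.append(oid)
                    progressed = True
    return order, sorted(waiting)


if __name__ == "__main__":
    # Hypothetical objects: a task uses a descriptor that refers to a file.
    arrivals = [("fd-3", ["file-A"]), ("task-1", ["fd-3"]), ("file-A", [])]
    print(receive(arrivals))
```

In this example nothing can be used until the last object arrives, so the receiver has to buffer the whole stream. That is the difference from a VM, whose state can be applied as it is read.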

Restoring the state

VM: recreate the memory, state of CPUs and virtual hardware

Containers

In-kernel: create a myriad of small objects

In CRIU: same, but there might not be a convenient API

Over 1000 syscalls

Need to sort it all out

For CTs, we have a set of objects to be restored, the relations between those objects (a graph), and some rules and restrictions on how these objects and their relations can be created. It's not as if we can create an object and then tie it to some other objects afterwards. We also have a target state that we want to reach. So we need to solve this task: figure out a sequence of steps that recreates all of this.
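One textbook way to "sort it all out" is a topological sort of the dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9 or later). It illustrates the kind of problem being solved and is not CRIU's actual restore algorithm. The object names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def restore_order(depends_on):
    """Return one valid creation order for the given dependency graph.

    depends_on maps each object to the set of objects that must exist
    before it can be created. Raises CycleError if no such order exists.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical objects: a file shared by two tasks, and a child task
    # that can only be created after its parent.
    graph = {
        "task-child": {"task-init", "shared-file"},
        "task-init": {"shared-file"},
        "shared-file": set(),
    }
    print(restore_order(graph))
    try:
        restore_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("circular dependency: no simple creation order exists")
```

The real task is harder than this sketch suggests, because the creation rules constrain not only the order but also which process has to perform each step.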

Unfreeze

VM: resume the virtual CPUs

Container

Either SIGCONT through the tree

Or unfreeze the cgroup

Problem: need to wake processes in the proper order

Post-memory migration: network swap device

Not yet ready for either VMs or CTs

userfaultfd by Andrea Arcangeli of Red Hat

a file descriptor to be notified about a page fault and to supply the memory back

merged into 4.3 kernel

work in progress to use it for KVM/QEMU

Container

userfaultfd is not sufficient for the CRIU case

If a page is missing, the kernel doesn't kill the process but sends a special message over that file descriptor, so the listening process can fetch the memory and give it to the kernel.

userfaultfd does not work as is for CRIU, for a few reasons:
- with QEMU, the same process initiates and handles the page fault; with CRIU these are different processes
- not all memory types are currently supported
- an app can remap its memory, which is currently unsupported
- fork() is not supported: the child will have pages filled with zeroes
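The post-copy idea itself, fetching a page only when it is first touched, can be modelled in a few lines. The sketch below is a pure user-space simulation and does not call userfaultfd. The class name, the fetch callback, and the in-memory "source host" are assumptions made for the example.

```python
PAGE_SIZE = 4096


class LazyMemory:
    """Memory whose pages are pulled from the source on first access."""

    def __init__(self, fetch_page):
        # fetch_page(page_no) -> bytes of length PAGE_SIZE; it stands in
        # for a request to the source host over the network.
        self._fetch_page = fetch_page
        self._pages = {}
        self.faults = 0

    def read(self, address):
        page_no, offset = divmod(address, PAGE_SIZE)
        if page_no not in self._pages:
            # The simulated page fault: the page is missing locally,
            # so fetch it and install it before completing the access.
            self.faults += 1
            self._pages[page_no] = self._fetch_page(page_no)
        return self._pages[page_no][offset]


if __name__ == "__main__":
    # A dict plays the role of the memory left behind on the source host.
    source = {0: bytes([7]) * PAGE_SIZE, 1: bytes([9]) * PAGE_SIZE}
    mem = LazyMemory(source.__getitem__)
    print(mem.read(0), mem.read(10), mem.read(PAGE_SIZE), mem.faults)
```

Only the first access to each page pays the network round trip. The price is that the restored container keeps depending on the source host until all of its pages have been fetched.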

Implementation

https://criu.org

[email protected]

plus.google.com/+CriuOrg

@__criu__

github: xemul/criu

Vibrant community; version 1.7.2 was released this week. Mostly driven by Odin, but also Google, Canonical, Red Hat, SUSE, Debian, Samsung, Huawei, Docker.

Integrated with OpenVZ (future version), LXC, LXD, Docker/Rocket libcontainer.

Linux kernel developers are aware and helpful.

CRIU uses beyond the live migration

HPC jobs: periodic checkpoints

Slow boot services speed up

That magical SAVE button e.g. in games

Software testing speed up

Reverse debugging

For slow boot, we tried starting the Eclipse GUI: it took 30s to start and 1.5s to restore.

Live migration

P.Haul

Process hauler

http://criu.org/P.Haul

Uses CRIU for c/r

Project logo is the little humpbacked horse (a magic pony)

That's all Folks!

Kirill [email protected]
