criu texas-linux-fest-2014

Upload: kirill-kolyshkin

Post on 16-Apr-2017



CRIU:

Time and Space Travel Service

for Linux Applications

Kir Kolyshkin

Texas Linux Fest, 14 Jun 2014

Agenda

What is CRIU?

Project history and state

Usage scenarios

Live migration

Reboot-less kernel upgrade

Slow services startup

Advanced debugging and testing

and more...

What is CRIU?

Checkpoint Restore In Userspace

Checkpoint
or
Dump

Restore
or
Restart

Full
info
about
state
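As a rough illustration of the two operations, the sketch below builds the command lines for a dump and a restore with the `criu` tool. The PID `1234` and the image directory `/tmp/ckpt` are made-up values, and the helper names `dump_cmd` and `restore_cmd` are hypothetical; only the `dump`, `restore`, `-t`, `-D` and `--shell-job` arguments come from the CRIU command line. The helpers only build argument lists and do not run anything.

```python
import shlex


def dump_cmd(pid, images_dir, shell_job=False):
    """Build argv for checkpointing the process tree rooted at pid."""
    argv = ["criu", "dump", "-t", str(pid), "-D", images_dir]
    if shell_job:
        # Needed when the process was started from an interactive shell.
        argv.append("--shell-job")
    return argv


def restore_cmd(images_dir, shell_job=False):
    """Build argv for restoring the process tree saved in images_dir."""
    argv = ["criu", "restore", "-D", images_dir]
    if shell_job:
        argv.append("--shell-job")
    return argv


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(shlex.join(dump_cmd(1234, "/tmp/ckpt", shell_job=True)))
    print(shlex.join(restore_cmd("/tmp/ckpt", shell_job=True)))
```

Actually running these commands requires root privileges and a kernel with checkpoint-restore support, so the sketch stops at printing them.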

CRIU pre-history

OpenVZ project

Containers live migration feature

Containers → Upstream Linux

1500+ kernel patches from us

Kernel-level checkpoint-restore merge failed

User-level checkpoint-restore ...

Why in userspace?

Kernel

User-space

Dump:
- ptrace
- /proc
- netlink
- syscalls

Restore:
- syscalls

Process

kmod

C/R API
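To give a feel for the /proc part of the dump side, here is a minimal Linux-only sketch that reads a little of a process' state from user space. It is not how CRIU itself is implemented (CRIU is written in C and also uses ptrace, netlink and syscalls); the function names are hypothetical, and the sketch inspects its own process by default.

```python
import os


def read_status(pid="self"):
    """Parse /proc/<pid>/status into a dict of field name -> value string."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def read_maps(pid="self"):
    """Return (start, end, perms, path) for each line of /proc/<pid>/maps."""
    mappings = []
    with open(f"/proc/{pid}/maps") as f:
        for line in f:
            parts = line.split(None, 5)
            start, end = (int(x, 16) for x in parts[0].split("-"))
            path = parts[5].strip() if len(parts) > 5 else ""
            mappings.append((start, end, parts[1], path))
    return mappings


def list_fds(pid="self"):
    """Map each open fd number to the target of its /proc/<pid>/fd symlink."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            pass  # the fd was closed between listdir() and readlink()
    return result


if __name__ == "__main__":
    print("pid:", read_status()["Pid"])
    print("mappings:", len(read_maps()))
    print("open fds:", sorted(list_fds()))
```

A real dump needs much more than this (registers, memory contents, sockets, timers and so on), which is where the other mechanisms on the slide come in.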

Some history

Project started almost 3 years ago

an RFC on kernel memory API extension

small command line tool

minimal dump of process' internals

First release v0.1 -- 23 Jul 2012 (x86 and basic stuff)

Since then

Kernel part completed a year ago (150+ kernel patches: new APIs for reading and setting process' state)

Current project state

The latest release: v1.3rc1

supports x86_64, ARM, and AArch64

supports features that typical apps use

works on unmodified linux-3.11+

Included in Debian, Fedora, Ubuntu, Arch, SUSE, Gentoo, CoreOS...

Explicitly checked

Apache, nginx, Oracle*, mysql, mongodb

ssh/sshd, openvpn, cron, sendmail

Java, gcc, make

VNC + { gimp, mplayer, blender, supertux }

Screen + { bash, top, tcpdump, tar/bz2 }

* some kernel tweaks required

Some vitals

- 55K lines of code

- 150+ kernel patches

- contribs from Google, Huawei, Samsung, Canonical

Usage scenarios

Live migration

incl. Docker, LXC, OpenVZ containers

Kernel upgrade w/o reboot

Slow services startup

Periodic snapshots (HPC)

Advanced debugging and testing

Live migration

Host A

Host B

Shared FS

Pre-migrate memory

with memory tracker

http://criu.org/P.Haul
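The idea behind pre-migrating memory can be shown with a toy model: copy all memory while the application keeps running, then repeatedly copy only the pages the memory tracker reports as changed, and freeze the application just for the last, small copy. The sketch below is a simulation only, not P.Haul code. The function name, the page counts and the assumption that the application dirties a fixed percentage of the pages copied in each round are all made up for illustration.

```python
def simulate_precopy(total_pages, dirty_percent, max_rounds, stop_threshold):
    """Toy model of iterative memory pre-migration.

    Each live round copies `to_copy` pages while the app keeps running.
    Assumption: during a round the app dirties `dirty_percent` percent of
    the pages being copied, and those must be copied again next round.
    Returns (pages copied in each live round, pages copied while frozen).
    """
    live_rounds = []
    to_copy = total_pages
    for _ in range(max_rounds):
        if to_copy <= stop_threshold:
            break  # small enough: freeze the app and do the final copy
        live_rounds.append(to_copy)
        to_copy = to_copy * dirty_percent // 100
    return live_rounds, to_copy


if __name__ == "__main__":
    live, frozen = simulate_precopy(100_000, 10, 10, 500)
    print("live rounds:", live)
    print("pages copied while frozen:", frozen)
```

With these made-up numbers the application is frozen for 100 pages instead of 100,000, which is the reason for tracking memory changes between rounds.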

Load balancing on cluster

Host A

Host C

Host B

Power saving on cluster

Host A

Host C

Host B

Node maintenance

Host A

Host B

Kernel upgrade w/o reboot

Host

Kernel A

Kexec

Kernel B

Slow services startup

time

# service foo start

Service readiness

Spawn process

Load config

Top-up caches

Initialize resource pools

Ready

T

100%

Slow services startup

time

T

t < T

Ready

Spawn process

100%

Service readiness

# service foo restore

Periodic snapshots

time

Memory tracker helps
to keep images smaller

HPC

time

Power
failure

0%

20%

40%

60%

60%
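The HPC picture can be read as simple arithmetic: with periodic snapshots, a failure only costs the work done since the last snapshot. The sketch below assumes snapshots at fixed progress intervals, starting at 0%. The function names and the failure point of 75% are hypothetical; the slide shows only the snapshot marks.

```python
def snapshots_before(failure_at, interval):
    """Progress marks (in percent) at which a snapshot was taken before the failure."""
    return list(range(0, failure_at + 1, interval))


def resume_point(failure_at, interval):
    """Progress (in percent) the job restarts from after the failure."""
    return snapshots_before(failure_at, interval)[-1]


if __name__ == "__main__":
    failure_at, interval = 75, 20  # made-up values
    print("snapshots:", snapshots_before(failure_at, interval))
    print("resume from:", resume_point(failure_at, interval))
    print("work lost:", failure_at - resume_point(failure_at, interval))
```

Without snapshots the job would restart from 0% and lose everything done before the failure.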

Advanced debugging

Production Host

Application
in trouble

Developer Host

Debugger

Advanced testing

...

New test
or
new hardware

?

More (funny) use cases

Forgot to launch your program in screen

Live-migrate it there

Playing a game without the save button

Snapshot it

[Put your own use case here]

http://criu.org/Usage_scenarios

Recap

Started as containers live-migration tool

General tool to dump/restore apps state

v1.2 + Linux-3.11+ can do the trick

A lot of interesting technologies

Memory tracker

Migration of TCP connections

Injecting your code into a running application

Detecting kernel objects sharing

etc.

Resources

http://criu.org - main site, documentation

http://git.criu.org - git repo with tool sources

http://plus.google.com/+CRIU - page

[email protected] - mailing list

Kir Kolyshkin - that's me

Thank you!
