Open WG talk #2: Everything you wanted to know about CRIU (but were afraid to ask)
TRANSCRIPT
Agenda
● CRIU and use-cases
● History
● Current state
● Under the hood
● Kernel impact
● How to integrate with/into CRIU
● P.haul
● Questions
History
● Berkeley Lab Checkpoint/Restart (BLCR) (2003)
– Load a kernel module and link with a library
● DMTCP: Distributed MultiThreaded CheckPointing (2004-2006)
– Preload a library
● OpenVZ (2005)
– OpenVZ kernel
● Linux Checkpoint/Restart by Oren Laadan (2008)
– A non-mainline kernel
● CRIU (2011)
[Timeline figure: BLCR 2003, OpenVZ 2005, DMTCP 2007, Linux C/R 2008, CRIU 2011]
What is C/R and how can it be used?
C/R is the ability to save the state of processes and to restore it later.
Usage scenarios:
– Failure recovery
– Live migration
– RKU (seamless kernel update)
– Rollback to the previous state
– Speed up of slow-boot services
– HPC issues
How does this work?
[Diagram: crtools sits between the process tree with its kernel objects (name-spaces, files, sockets, pipes) and the image files]
Dump
● Parasite code
– Receive file descriptors
– Dump memory content
– prctl() settings, sigaction, pending signals, timers, etc.
● Ptrace
– Freeze processes
– Inject parasite code
● Netlink
– Get information about sockets, netns
● Procfs
– /proc/PID/maps, /proc/PID/map_files/, /proc/PID/status, /proc/PID/mountinfo
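As an illustration of the procfs part of a dump, here is a minimal sketch that parses /proc/PID/maps into structured records. The `Mapping` type and the function names are made up for this example; CRIU itself is written in C and does this differently.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    """One line of /proc/PID/maps (type name is hypothetical)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: Optional[str]


def parse_maps_line(line: str) -> Mapping:
    """Parse a single maps line: range, perms, offset, dev, inode, optional path."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    # Anonymous mappings have no sixth field.
    path = fields[5].strip() if len(fields) > 5 else None
    return Mapping(start, end, fields[1], int(fields[2], 16),
                   fields[3], int(fields[4]), path)


def read_maps(pid: str = "self") -> List[Mapping]:
    """Read all mappings of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]
```

Calling `read_maps()` on a Linux box lists the mappings of the current process; a dumper would walk such a list to decide which memory areas to save.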
Restore
● Collect shared objects
● Restore name-spaces
● Create a process tree
– Restore SID, PGID
– Restore objects that should be inherited
● Files, sockets, pipes, ...
● Restore per-task properties.
● Restore memory
● Sim! Sala bim!
● Awesome
New features in a kernel
● Parasite code injection (by Tejun Heo)
– Read task state that a task can currently retrieve only about itself
● The kcmp() system call
– Helps check which kernel objects are shared between processes
● Proc map_files directory
– Find out exactly which file is mapped
– Mapping-sharing information
● A bunch of prctl extensions
– Set various private fields on task/mm objects (a C/R-only feature)
● Last-pid sysctl
– Restore a task with the desired PID value
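The last-pid sysctl can be sketched as follows, assuming it is exposed as /proc/sys/kernel/ns_last_pid and that the kernel hands out "last pid + 1" on the next fork. The function names are hypothetical, writing the sysctl needs privileges, and the result is not guaranteed (the PID may be taken, or another process may fork in between), so the caller has to check what it got.

```python
import fcntl
import os

NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"


def request_next_pid(desired_pid: int, path: str = NS_LAST_PID) -> None:
    """Store desired_pid - 1, so the next fork should receive desired_pid."""
    if desired_pid < 2:
        raise ValueError("desired_pid must be at least 2")
    with open(path, "w") as f:
        f.write(str(desired_pid - 1))


def fork_with_pid(desired_pid: int, path: str = NS_LAST_PID) -> bool:
    """Try to fork a child with desired_pid; return True if the kernel granted it."""
    with open(path, "w") as f:
        # The lock only serializes cooperating restorers, not arbitrary forks.
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write(str(desired_pid - 1))
        f.flush()
        child = os.fork()
        if child == 0:
            os._exit(0)  # the child does nothing in this sketch
    os.waitpid(child, 0)
    return child == desired_pid
```

The `path` parameter exists only so the write can be tried against an ordinary file; a real restorer would fork into the restored task rather than an empty child.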
New features in a kernel (continued)
● Sockets information dumping via netlink (sock_diag)
– Extensible engine for retrieving socket state
● TCP repair mode
– Read the internal state of a TCP connection and reconstruct it from scratch on a freshly created socket
● Virtual net device indexes
– Allow restoring network devices in a namespace
● Socket peeking offset
– Allows peeking at socket queues (reading data without removing it from the queue)
● Task memory tracking
– Incremental snapshots, online migration
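To make "reading without removing data from the queue" concrete, here is a small sketch using the plain MSG_PEEK flag on a UNIX socket pair. It does not use the peek-offset option from the slide; that option adds an offset so that successive peeks can walk through the queue. The helper name is made up for this example.

```python
import socket


def peek_then_read(payload: bytes):
    """Send one datagram, peek at it, then read it for real."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        a.send(payload)
        # MSG_PEEK returns the data but leaves it in the receive queue.
        peeked = b.recv(len(payload), socket.MSG_PEEK)
        # A normal recv() then still finds the same datagram.
        consumed = b.recv(len(payload))
        return peeked, consumed
    finally:
        a.close()
        b.close()
```

Both returned values equal the payload, which is the property a dumper relies on: the queue content can be saved while the socket stays usable.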
How to integrate with CRIU
● Action scripts
– block/unblock network
– set up namespaces
– post-dump and post-restore
● RPC, shared library
● Plugins
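An action script could look roughly like the sketch below. It assumes that CRIU passes the current stage in the CRTOOLS_SCRIPT_ACTION environment variable with names such as network-lock, network-unlock, setup-namespaces, post-dump and post-restore; check the CRIU documentation for the exact list in your version. The handler texts are placeholders, not real actions.

```python
import os

# Stage name -> what a real script would do there (placeholders only).
ACTIONS = {
    "network-lock": "block network traffic for the dumped tasks",
    "network-unlock": "unblock network traffic",
    "setup-namespaces": "configure the freshly created namespaces",
    "post-dump": "clean up after a finished dump",
    "post-restore": "finish setup after a restore",
}


def handle_action(action: str) -> str:
    """Return the description of the handler, or 'ignored' for unknown stages."""
    return ACTIONS.get(action, "ignored")


def main(environ=os.environ) -> int:
    """Dispatch on the stage name; 0 is used here to signal success."""
    action = environ.get("CRTOOLS_SCRIPT_ACTION", "")
    print(f"{action or '<none>'}: {handle_action(action)}")
    return 0
```

A standalone script would finish with `raise SystemExit(main())`. Ignoring unknown stages keeps the script working if newer CRIU versions add more of them.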
RPC and libcriu.so
● Easy to use from other languages
– The protocol is based on protobuf messages
● Allows unprivileged processes to use CRIU
– CRIU itself still requires root privileges to run
– UNIX domain sockets support passing credentials, so the service can identify the caller
● Self-dump
– A process can request to dump itself
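The credential passing mentioned above can be illustrated with SO_PEERCRED on a UNIX socket (Linux only). This is only a sketch of how a privileged service might learn who is on the other end; it is not CRIU's actual code, and the function name is hypothetical.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(sock: socket.socket):
    """Return (pid, uid, gid) of the process on the other end of a UNIX socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(UCRED_FORMAT))
    return struct.unpack(UCRED_FORMAT, raw)
```

For a socket pair created in one process, the peer is the process itself, so `peer_credentials(socket.socketpair()[0])` returns its own PID and effective UID/GID. A service would call it on an accepted connection and decide whether the caller may dump the requested task.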
Plugins
● Unknown file types
● External dependencies
– Unix sockets (dbus, journald, rsyslog, etc.)
– Unknown character and block devices
– External bind-mounts
– External net devices
– External something else
In a Nutshell, CRIU...
... has had 4,375 commits made by 36 contributors representing 58,688 lines of code
... is mostly written in C with a very low number of source code comments
... has a young, but established codebase maintained by a large development team with stable Y-O-Y commits
... estimated cost $787,432
https://www.ohloh.net/p/criu#
P.haul (process hauler) - Live migration using CRIU
● Iterative
● Optimal
● Customizable
# ./p.haul ovz 100 10.30.25.213
Migration succeeded
total time is ~2.86 sec
frozen time is ~1.99 sec ( ['0.27', '0.18', '1.55'] )
restore time is ~0.86 sec
img sync time is ~0.32 sec