Putting your users in a Box
Greg Thain
Condor Week 2013

Outline
› Why put a job in a box?
› Old boxes that work everywhere*
  » *Everywhere that isn’t Windows
› New shiny boxes

3 Protections
1) Protect the machine from the job.
2) Protect the job from the machine.
3) Protect one job from another.

The perfect box
› Allows nesting
› Does not require root
› Can’t be broken out of
› Portable to all OSes
› Allows full management:
  Creation / Destruction
  Monitoring
  Limiting

A Job ain’t nothing but work
› Resources a job can (ab)use
  CPU
  Memory
  Disk
  Signals
  Network

Previous Solutions
› HTCondor Preempt expression
  PREEMPT = TARGET.MemoryUsage > threshold
    • ProportionalSetSizeKb > threshold
› setrlimit call
  USER_JOB_WRAPPER
  STARTER_RLIMIT_AS
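
The setrlimit approach can be sketched in Python. This is a minimal illustration, assuming a Linux host: it caps the virtual address space (RLIMIT_AS) of a child process just before exec, which is roughly what a job wrapper would do. The helper name `run_with_address_space_limit` and the 4 GiB / 8 GiB figures are made up for the example and are not HTCondor's implementation.

```python
import resource
import subprocess
import sys


def run_with_address_space_limit(cmd, limit_bytes):
    """Run cmd in a child whose virtual address space is capped at limit_bytes."""

    def apply_limit():
        # Runs in the child between fork() and exec(), so only the job is limited.
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        new_soft = limit_bytes
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot raise its soft limit above the hard one.
            new_soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))

    return subprocess.run(cmd, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    four_gib = 4 * 1024 ** 3
    # Hypothetical job: tries to allocate 8 GiB under a 4 GiB cap.
    greedy_job = [sys.executable, "-c", "bytearray(8 * 1024 ** 3)"]
    result = run_with_address_space_limit(greedy_job, four_gib)
    print("exit code:", result.returncode)
    print(result.stderr.strip().splitlines()[-1:])
```

The limit only makes allocations fail inside the job; it does not tell the scheduler why the job died, which is one reason the later slides move on to cgroups.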

From here on out…
› Newish stuff

The Big Hammer
› Some people see this problem, and say
› “I know, we’ll use a Virtual Machine”

Problems with VMs
› Might need hypervisor installed
  The right hypervisor (the right version…)
› Need to keep full OS image maintained
› Difficult to debug
› Hard to federate
› Just too heavyweight

Containers, not VMs
› Want opaque box
› Much LXC work applicable here
› Work with Best feature of HTCondor ever?

CPU AFFINITY
› ASSIGN_CPU_AFFINITY=true
› Now works with dynamic slots
› Need not be root
› Any Linux version
  Only limits the job
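
For readers unfamiliar with CPU affinity, here is a small Python sketch of the underlying Linux mechanism, assuming Python 3.3+ on Linux. It pins the calling process to a single core and then restores the original set. The helper name `pin_to_cpus` is invented for the example; how HTCondor applies affinity internally is not shown in the slides.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict pid (0 = the calling process) to the given CPU ids.

    Returns the affinity set the kernel reports afterwards.
    """
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    first = min(allowed)
    print("before:", sorted(allowed))
    print("after: ", sorted(pin_to_cpus({first})))
    # Undo the pinning so the rest of the process is unaffected.
    os.sched_setaffinity(0, allowed)
```

No root is needed because a process may always shrink its own affinity set, which matches the "Need not be root" bullet.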

PID namespaces
› You can’t kill what you can’t see
› Requirements:
  HTCondor 7.9.4+
  RHEL 6
  USE_PID_NAMESPACES = true
    • (off by default)
  Doesn’t work with privsep
  Must be root

PID Namespaces
Init (1)
  Master (pid 15)
    Startd (pid 26)
      Starter (pid 39)
        Job (pid 1)
      Starter (pid 73)
        Job (pid 1)

Named Chroots
› “Lock the kids in their room”
› Startd advertises set
  NAMED_CHROOT = /foo/R1,/foo/R2
› Job picks one:
  +RequestedChroot = "/foo/R1"
› Make sure path is secure!

Control Groups, aka “cgroups”
› Two basic kernel abstractions:
  1) nested groups of processes
  2) “controllers” which limit resources

Control Cgroup setup
› Implemented as filesystem
  Mounted on /sys/fs/cgroup, or /cgroup or …
› User-space tools in flux
  Systemd
  Cgservice
› /proc/self/cgroup

Cgroup controllers
› cpu
› memory
› freezer

Enabling cgroups
› Requires:
  RHEL6
  HTCondor 7.9.5+
  Rootly condor
  No privsep
  BASE_CGROUP=htcondor
  And… cgroup fs mounted…

Cgroups
› Starter puts each job into own cgroup
  Named exec_dir + job id
› Procd monitors
  Procd freezes and kills atomically
› MEMORY attr into memory controller
› CGROUP_MEMORY_LIMIT_POLICY
  Hard or soft
  Job goes on hold with specific message
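
To make the hard/soft distinction concrete, here is a hedged Python sketch of how a memory value could be written into a cgroup v1 memory controller. The file names `memory.limit_in_bytes` and `memory.soft_limit_in_bytes` are the kernel's cgroup v1 names; the mapping from policy to file, the MB unit, and the helper `apply_memory_limit` are assumptions for illustration, not a description of what the starter actually does.

```python
import os
import tempfile

# cgroup v1 memory-controller files. Which file a given policy maps to
# is an assumption made for this sketch.
LIMIT_FILES = {
    "hard": "memory.limit_in_bytes",
    "soft": "memory.soft_limit_in_bytes",
}


def apply_memory_limit(cgroup_dir, memory_mb, policy):
    """Write a memory amount (assumed to be in MB) into the chosen limit file."""
    if policy not in LIMIT_FILES:
        raise ValueError("policy must be 'hard' or 'soft', got %r" % (policy,))
    path = os.path.join(cgroup_dir, LIMIT_FILES[policy])
    limit_bytes = memory_mb * 1024 * 1024
    with open(path, "w") as f:
        f.write(str(limit_bytes))
    return path, limit_bytes


if __name__ == "__main__":
    # A temporary directory stands in for a real cgroup directory,
    # so the sketch runs without root and without a mounted cgroup fs.
    with tempfile.TemporaryDirectory() as fake_cgroup:
        print(apply_memory_limit(fake_cgroup, 512, "soft"))
```

With a hard limit the kernel refuses allocations beyond the cap; with a soft limit the job may exceed it until the machine comes under memory pressure.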

Cgroup artifacts
StarterLog:
04/22/13 11:39:08 Requesting cgroup htcondor/condor_exec_slot1@localhost for job
…
ProcLog:
…
cgroup to htcondor/condor_exec_slot1@localhost for ProcFamily 2727.
04/22/13 11:39:13 : PROC_FAMILY_GET_USAGE
04/22/13 11:39:13 : gathering usage data for family with root pid 2724
04/22/13 11:39:17 : PROC_FAMILY_GET_USAGE
04/22/13 11:39:17 : gathering usage

$ condor_q
-- Submitter: localhost : <127.0.0.1:58873> : localhost
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
2.0 gthain 4/22 11:36 0+00:00:02 R 0 0.0 sleep 3600
$ ps ax | grep 3600
gthain 2727 4268 4880 condor_exec.exe 3600

A process with Cgroups
$ cat /proc/2727/cgroup
3:freezer:/htcondor/condor_exec_slot1@localhost
2:memory:/htcondor/condor_exec_slot1@localhost
1:cpuacct,cpu:/htcondor/condor_exec_slot1@localhost
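
The `/proc/<pid>/cgroup` format shown on this slide is easy to parse. The sketch below, in Python, turns each `id:controllers:path` line into a controller-to-path mapping. The function name `parse_proc_cgroup` is made up for the example, and the sample text is copied from the slide rather than read from a live system.

```python
def parse_proc_cgroup(text):
    """Map each controller name to its cgroup path.

    Expects cgroup v1 style lines of the form 'id:controllers:path',
    where controllers is a comma-separated list.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path may contain others.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            result[controller] = path
    return result


SAMPLE = """\
3:freezer:/htcondor/condor_exec_slot1@localhost
2:memory:/htcondor/condor_exec_slot1@localhost
1:cpuacct,cpu:/htcondor/condor_exec_slot1@localhost
"""

if __name__ == "__main__":
    for controller, path in sorted(parse_proc_cgroup(SAMPLE).items()):
        print(controller, path)
```

On a live machine the same function could be fed the contents of `/proc/self/cgroup` to see which cgroup the current process belongs to.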

$ cd /sys/fs/cgroup/memory/htcondor/condor_exec_slot1@localhost/
$ cat memory.usage_in_bytes
258048
$ cat tasks
2727
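
The same inspection can be scripted. This is a minimal Python sketch that reads `memory.usage_in_bytes` and `tasks` from a cgroup v1 memory-controller directory. The helper name `read_cgroup_memory` is hypothetical, and the demo writes the slide's values into a temporary directory so that it runs without a mounted cgroup filesystem.

```python
import os
import tempfile


def read_cgroup_memory(cgroup_dir):
    """Return (usage_in_bytes, [pids]) from a cgroup v1 memory directory."""
    with open(os.path.join(cgroup_dir, "memory.usage_in_bytes")) as f:
        usage = int(f.read().strip())
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        # The tasks file lists one id per line.
        pids = [int(tok) for tok in f.read().split()]
    return usage, pids


if __name__ == "__main__":
    # Stand-in directory populated with the values shown on the slide.
    with tempfile.TemporaryDirectory() as fake_cgroup:
        with open(os.path.join(fake_cgroup, "memory.usage_in_bytes"), "w") as f:
            f.write("258048\n")
        with open(os.path.join(fake_cgroup, "tasks"), "w") as f:
            f.write("2727\n")
        print(read_cgroup_memory(fake_cgroup))
```

On a real execute node the argument would be the job's directory under the memory controller mount, such as the path used in the `cd` command on the slide.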

MOUNT_UNDER_SCRATCH
› Or, “Shared subtrees”
› Goal: protect /tmp from shared jobs
› Requires
  HTCondor 7.9.4+
  RHEL 5
  Doesn’t work with privsep
  HTCondor must be running as root
  MOUNT_UNDER_SCRATCH = /tmp,/var/tmp

MOUNT_UNDER_SCRATCH
MOUNT_UNDER_SCRATCH=/tmp,/var/tmp
Each job sees private /tmp, /var/tmp
Downsides:
  No sharing of files in /tmp

Future work
› Per job FUSE and other mounts?
› non-root namespaces

Not covered in this talk
› Prevent jobs from messing with everyone on the network:
› See Lark and SDN talks Thursday at 11

Conclusion
› Questions?
› See cgroup reference material in kernel doc
  • https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
› LWN article about shared subtree mounts:
  • http://lwn.net/Articles/159077/