Understanding and building your own Docker

Post on 11-Jan-2017

Building blocks of Linux Containers

Motiejus Jakstys, motiejus@uber.com

@mo_kelione

2016-11-18

© 2016. Uber Technologies Inc. All rights reserved.

Table of Contents

Introduction
    Why me
    A container in Linux is...

Namespaces
    Isolation in Linux
    What did we just do

File systems and COW

What did we forget?
    Leftover elephants in the room

The End

Conclusion!

Devil hides in the details.

- Many use Docker.
- We lack time to understand.
- You need to understand infra to successfully troubleshoot infra.
- There are trade-offs in the configuration.
- Make container engine in 30 minutes.
- Details!

→ You will still pick existing tools.

Why me

My resume: on-call experience.

- 2009-2012: Telecom (Dev + Ops).
- 2012-2014: Online Gaming (Dev + Ops).
- 2014-2016: Amazon (Dev + Ops).
- 2016-now: Uber (Dev + Ops):
    - From 2016.02: Dev.
    - From 2016.11: SRE.

I had to understand exactly how infrastructure works.

A container in Linux is ...

Fork/exec with bells & whistles:

- Fancy tarball for distribution.
- COW filesystem to make it start fast.
- Cgroups for fairness.
- Namespaces for isolation.

We will cover

- User namespaces.
- PID namespaces.
- Mount namespaces.
- Network namespaces.
- There are more, but not today.

User namespace

Become container-local root.

    unshare --map-root-user
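As a rough illustration (a sketch, assuming util-linux `unshare` and a kernel that allows unprivileged user namespaces), the snippet below compares the user ID outside and inside a new user namespace. The variable names and the fallback message are illustrative and not part of the talk.

```shell
#!/bin/sh
# Sketch: become root inside a new user namespace, without being root outside.
# Assumption: unprivileged user namespaces are enabled on this kernel.
outside_uid=$(id -u)

# --map-root-user implies --user and maps our UID to 0 inside the namespace.
if inside_uid=$(unshare --map-root-user id -u 2>/dev/null); then
    echo "uid outside: $outside_uid, uid inside: $inside_uid"
else
    inside_uid=unavailable
    echo "user namespaces are not available here"
fi
```

The "root" seen inside is only root with respect to the namespace; outside it, the process keeps the original user's privileges.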

Mount namespace

Hide container mounts.

    unshare --mount
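One way to see the effect, sketched under the assumption that unprivileged user namespaces are available (so `--map-root-user` is added here to avoid needing real root): mount a tmpfs inside a new mount namespace and check that the host's mount table does not list it. The temporary directory and variable names are illustrative.

```shell
#!/bin/sh
# Sketch: a mount made inside a new mount namespace is invisible outside it.
# Assumption: --map-root-user is added so the demo runs without real root.
mnt=$(mktemp -d)

# Inside: mount a tmpfs on $mnt and count its entries in the mount table.
inside=$(unshare --map-root-user --mount sh -c \
    'mount -t tmpfs tmpfs "$1" && grep -cF " $1 " /proc/self/mounts' \
    sh "$mnt" 2>/dev/null) || inside=unavailable

# Outside: the same path does not appear in our mount table.
outside=$(grep -cF " $mnt " /proc/self/mounts) || true

echo "mounts seen inside: $inside, outside: $outside"
rmdir "$mnt"
```

The tmpfs disappears together with the namespace, which is why the final `rmdir` can remove the directory.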

PID namespace

Hide other PIDs.

    unshare --pid --mount-proc --fork
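A minimal sketch of what this looks like from the inside, assuming unprivileged user namespaces are enabled: the shell started by `unshare` reports itself as PID 1. The `--map-root-user` flag is an addition for this example; the command as shown on the slide would need root.

```shell
#!/bin/sh
# Sketch: the first process in a new PID namespace sees itself as PID 1.
# Assumption: --map-root-user is added so this works without real root;
# the slide's command omits it and would need root privileges.
pid=$(unshare --map-root-user --pid --mount-proc --fork \
    sh -c 'echo $$' 2>/dev/null) || pid=unavailable

echo "shell PID inside the namespace: $pid"
```

`--mount-proc` mounts a fresh /proc, so tools such as `ps` inside the namespace list only the namespace's own processes.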

Network namespace

Demonstrate this:

- Create namespace.
- Activate loopback (lo).
- Create pair of devices veth1a and veth1b:
    - veth1b will go to the namespace.
    - veth1a will stay in default.
- Add IP addresses.
- curl and ping.
- lsof, bind on ports separately.

Ever wanted to run tcpdump on an application?
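The steps above can be sketched with iproute2 roughly as follows. This is not the exact demo from the talk: the namespace name t1 and the 10.0.0.x addresses are taken from the talk's diagram slide, while the /24 prefix length, the `run` helper and the `DRY_RUN` switch are assumptions made for this example. By default the script only prints the commands, since running them requires root.

```shell
#!/bin/sh
# Sketch of the demo steps. The namespace name t1 and the 10.0.0.x addresses
# come from the talk's diagram; the /24 prefix length is an assumption.
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 runs it (needs root).
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

netns_demo() {
    run ip netns add t1                                  # create namespace
    run ip netns exec t1 ip link set lo up               # activate loopback
    run ip link add veth1a type veth peer name veth1b    # create the pair
    run ip link set veth1b netns t1                      # veth1b goes inside
    run ip addr add 10.0.0.1/24 dev veth1a               # veth1a stays in default
    run ip link set veth1a up
    run ip netns exec t1 ip addr add 10.0.0.2/24 dev veth1b
    run ip netns exec t1 ip link set veth1b up
    run ping -c 1 10.0.0.2                               # default -> t1
    run ip netns exec t1 ping -c 1 10.0.0.1              # t1 -> default
}

netns_demo
```

Once the namespace exists, prefixing a command with `ip netns exec t1` runs it inside the namespace; for example, `ip netns exec t1 tcpdump -i veth1b` would capture only the traffic of whatever runs there. `ip netns del t1` removes the namespace again.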

Network namespace

[Diagram: two network namespaces connected by a veth pair.]

default: lo 127.0.0.1, eth0 192.0.2.1, veth1a 10.0.0.1
t1: lo 127.0.0.1, veth1b 10.0.0.2
veth1a (default) <-> veth1b (t1)

What did we just do

Created a container:

- User namespace: apt-get, iptables, mount, etc.
- Isolated PIDs: no nobody, isolate from each other.
- Isolated mounts: e.g. for /tmp.
- Isolated network: safely bind to :80.

An improvement over "run and hope it doesn't affect anything else".
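Putting the four namespaces together can be sketched in a single `unshare` invocation. This assumes unprivileged user namespaces are enabled; the `report` variable and the message format are illustrative only.

```shell
#!/bin/sh
# Sketch: user, mount, PID and network namespaces combined in one command.
# Assumption: unprivileged user namespaces are enabled; otherwise a
# placeholder is reported.
report=$(unshare --map-root-user --mount --pid --fork --net \
    sh -c 'echo "uid=$(id -u) pid=$$"' 2>/dev/null) || report=unavailable

echo "inside the container: $report"
```

The process inside is root in its own user namespace, is PID 1, has a private mount table, and sees only its own (initially down) loopback interface.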

File systems and COW

A container:

- Needs a file system.
- Starts quickly regardless of size.

Do not want to copy 1 GB with every startup.

Copy On Write!

LVM? ZFS? Btrfs?

A quick demo

- Create tank/images/debian@latest
- Create tank/containers/t1 from @latest
- unshare --mount --pid --fork chroot . bash
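The demo steps might look roughly like this with ZFS. Several details are assumptions not stated on the slide: that the pool `tank` is mounted at /tank, that the image is populated with `debootstrap`, and the `run` helper with its `DRY_RUN` switch. By default the script only prints the commands, because running them needs root and an existing pool.

```shell
#!/bin/sh
# Sketch of the ZFS steps. Assumptions: a pool named tank mounted at /tank,
# and debootstrap as the way to populate the image (the slides do not say how).
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 runs it (needs root).
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

cow_demo() {
    run zfs create -p tank/images/debian                 # dataset for the image
    run debootstrap stable /tank/images/debian           # fill it with Debian
    run zfs snapshot tank/images/debian@latest           # read-only snapshot
    run zfs clone -p tank/images/debian@latest tank/containers/t1   # COW clone
    run unshare --mount --pid --fork chroot /tank/containers/t1 bash
}

cow_demo
```

The clone shares all unchanged blocks with the snapshot, so creating tank/containers/t1 takes about the same time whether the image is small or large.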

Leftover elephants in the room

- Trivial to escape this "container".
- Sec: no leftover file descriptors.
- Resource fairness.
- Sec/DoS: shared kernel resources.
- Supervision, daemonization and cleanup.
- Logging.
- Collect zombie processes.
- Image management.

Should someone else do it?

We almost have a container engine

- But look at my conclusions again.
- Devil hides in the details.
- Tooling companies (Docker, CoreOS, etc.) raised > $10^8.

To recap

- Easy to understand kernel facilities.
- Devil hides in the details.
- Either spend a lot of time and headache, or re-use existing tools.

We’re hiring!

Uber SRE locations: SF, NYC, Seattle, Vilnius.

- Check out join.uber.com
- Also, contact me at motiejus@uber.com
