Copyright©2017 NTT Corp. All Rights Reserved.
Akihiro Suda <[email protected]>
NTT Software Innovation Center
Parallelizing CI using Docker Swarm-Mode
Open Source Summit Japan (June 1, 2017) Last update: May 17, 2017
https://github.com/AkihiroSuda
• Software Engineer at NTT Corporation
• Several talks in the FLOSS community
• FOSDEM 2016
• ApacheCon Core North America 2016, etc.
• Docker Moby Project core maintainer
Who am I
Docker transitioned into the Moby Project (April 2017)
A problem in Docker/Moby project: CI is slow
https://jenkins.dockerproject.org/job/Docker-PRs/buildTimeTrend
capture: March 3, 2017
120 min
red valley = test failed immediately
How about other FLOSS projects?
capture: March 3, 2017
• https://builds.apache.org/view/All/job/PreCommit-HDFS-Build/buildTimeTrend
• https://builds.apache.org/view/All/job/PreCommit-YARN-Build/buildTimeTrend
• https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/buildTimeTrend
200 min / 260 min / 240 min
• https://grpc-testing.appspot.com/job/gRPC_pull_requests_linux/buildTimeTrend
550 min
(for ease of visualization, picked up some projects that use Jenkins from various categories)
•Blocker for reviewing/merging patches
•Discourages developers from writing tests
•Discourages developers from enabling additional testing features
• e.g. the race detector (go test -race)
Why does slow CI matter?
Why does slow CI matter?
Poor implementation quality & slow development cycle
•go test -parallel N
•mvn --threads N
•parallel
• ...
Solution: Parallelize CI ?
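The first option above, `go test -parallel N`, caps how many tests that call t.Parallel() run at once inside one test binary. A conceptual model of that cap (not the real `go test` scheduler; `runBounded` and the task list are made up for illustration) is a buffered channel used as a counting semaphore:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runBounded runs all tasks concurrently but allows at most n of them
// to execute at the same time, mimicking the effect of `-parallel N`.
// It returns the peak concurrency actually observed.
func runBounded(tasks []func(), n int) int32 {
	sem := make(chan struct{}, n) // counting semaphore with n slots
	var wg sync.WaitGroup
	var cur, peak int32
	for _, task := range tasks {
		wg.Add(1)
		go func(task func()) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			c := atomic.AddInt32(&cur, 1)
			// record the highest concurrency seen so far
			for {
				p := atomic.LoadInt32(&peak)
				if c <= p || atomic.CompareAndSwapInt32(&peak, p, c) {
					break
				}
			}
			task()
			atomic.AddInt32(&cur, -1)
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 8)
	for i := range tasks {
		tasks[i] = func() {} // stand-in for a test function
	}
	fmt.Println("peak concurrency:", runBounded(tasks, 4))
}
```

The next slide explains why this kind of intra-process parallelism alone is not enough.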
But just doing parallelization is not enough
•No isolation
• Concurrent test tasks may race for shared resources (e.g. files under `/tmp`, TCP ports, ...)
•Poor scalability
• CPU/RAM resource limitation
• I/O parallelism limitation
Solution: Parallelize CI ?
•Docker provides isolation
•Swarm-mode provides scalability
Solution: Docker (in Swarm-mode)
Distribute across Swarm-mode
Parallelize & Isolate using Docker containers
✓Isolation ✓Scalability
• For ideal isolation, each of the test functions should be encapsulated into independent containers
• But this is not optimal in practice, due to redundant setup/teardown code
Challenge 1: Redundant setup/teardown
func TestMain(m *testing.M) {
setUp()
m.Run()
tearDown()
}
func TestFoo(t *testing.T)
func TestBar(t *testing.T)
redundantly executed:
  container 1: setUp(); testFoo.Run(); tearDown()
  container 2: setUp(); testBar.Run(); tearDown()
• Solution: execute a chunk of multiple test functions sequentially in a single container
Optimization 1: Chunking

Before (one container per test function):
  container 1: setUp(); testFoo.Run(); tearDown()
  container 2: setUp(); testBar.Run(); tearDown()

After (one container per chunk):
  container: setUp(); testFoo.Run(); testBar.Run(); tearDown()
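A chunk can be handed to `go test` by selecting its members with an anchored -run regexp. A minimal sketch of the chunking step (the helper names chunkNames and runRegexp are mine, not from the actual implementation):

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// chunkNames groups the sorted test names into fixed-size chunks; each
// container then runs one chunk sequentially, so setUp()/tearDown() is
// amortized across the whole chunk instead of paid per test function.
func chunkNames(names []string, size int) [][]string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	var chunks [][]string
	for i := 0; i < len(sorted); i += size {
		end := i + size
		if end > len(sorted) {
			end = len(sorted)
		}
		chunks = append(chunks, sorted[i:end])
	}
	return chunks
}

// runRegexp builds a `go test -run`-style anchored regexp that selects
// exactly the functions in one chunk.
func runRegexp(chunk []string) string {
	escaped := make([]string, len(chunk))
	for i, n := range chunk {
		escaped[i] = regexp.QuoteMeta(n)
	}
	return "^(" + strings.Join(escaped, "|") + ")$"
}

func main() {
	names := []string{"TestBar", "TestFoo", "TestBaz"}
	for _, c := range chunkNames(names, 2) {
		fmt.Println(runRegexp(c))
	}
	// prints:
	// ^(TestBar|TestBaz)$
	// ^(TestFoo)$
}
```

(Docker's integration tests actually go through go-check's test filter rather than plain `go test -run`, but the chunk-to-regexp idea is the same.)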
• The test sequence is executed in lexicographic order (ABC...)
• Observation: long-running test functions concentrate in a certain portion of the sequence
• because they test similar scenarios
Challenge 2: Makespan non-uniformity
Execution order:
TestApple1, TestApple2, TestBanana1, TestBanana2, TestCarrot1, TestCarrot2
Challenge 2: Makespan non-uniformity
Example: Docker itself
[Graph: makespan in seconds (0 to 100) of each test function vs. execution order (0 to 1500); long makespans cluster in DockerSuite.TestBuild* and DockerSwarmSuite.Test*]
Makespans of the chunks are likely to be non-uniform
Challenge 2: Makespan non-uniformity

Sequence: Test1, Test2, Test3, Test4
Chunks for containers:
  container 1: Test1, Test2
  container 2: Test3, Test4
Speed-up over the sequence, but non-uniform makespans (wasted idle time for container 1)
Solution: shuffle the chunks
• No guarantee of an optimal schedule though
Optimization 2: Shuffling

Sequence: Test1, Test2, Test3, Test4
Chunks: {Test1, Test2}, {Test3, Test4}
Chunks (shuffled): the test names are shuffled before chunking, spreading the slow, lexicographically clustered tests across containers
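The shuffling step can be sketched as follows (the function name and explicit seeding are my own choices, not from the actual implementation); sharing a fixed seed makes the chunk order reproducible across runs and retries:

```go
package main

import (
	"fmt"
	"math/rand"
)

// shuffleChunks returns a randomly permuted copy of the chunk list.
// Shuffling is only a heuristic: it spreads the slow, clustered tests
// across workers, but gives no guarantee of an optimal schedule.
func shuffleChunks(chunks [][]string, seed int64) [][]string {
	out := append([][]string(nil), chunks...)
	r := rand.New(rand.NewSource(seed))
	r.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	chunks := [][]string{
		{"TestApple1", "TestApple2"},
		{"TestBanana1", "TestBanana2"},
		{"TestCarrot1", "TestCarrot2"},
	}
	for _, c := range shuffleChunks(chunks, 42) {
		fmt.Println(c)
	}
}
```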
• Based on Funker (github.com/bfirsh/funker)
• FaaS-like architecture
• Loads are automatically balanced via Docker's built-in LB
• No explicit task queue
• Could easily be ported to other container orchestrators
Implementation
[Diagram: a client (typically on a laptop) invokes Funker on the master; the master, running in a Docker Swarm-mode cluster (typically on cloud), dispatches calls to worker.1, worker.2, worker.3 through the built-in LB]
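The master/worker exchange can be sketched as follows. This is a hypothetical HTTP-based stand-in, not bfirsh/funker's actual API: each worker replica exposes an endpoint that runs a chunk, and the master posts a chunk and reads back the result. In a Swarm-mode cluster the master would dial the *service name* (e.g. "http://worker:8080") and Docker's built-in virtual-IP load balancer would pick a replica, which is why no explicit task queue is needed; here the worker listens on localhost to keep the sketch self-contained:

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"strings"
)

// startWorker starts a worker replica serving one FaaS-like endpoint.
func startWorker() (addr string, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/run", func(w http.ResponseWriter, r *http.Request) {
		chunk, _ := io.ReadAll(r.Body)
		// A real worker would exec the test binary with this chunk's
		// filter regexp here and stream back the result.
		fmt.Fprintf(w, "ran chunk: %s", chunk)
	})
	go http.Serve(ln, mux)
	return ln.Addr().String(), nil
}

// runChunk is what the master does per chunk: POST it to the worker
// service and collect the result. Under Swarm, addr would be the
// service name and the built-in LB would balance across replicas.
func runChunk(addr, chunk string) (string, error) {
	resp, err := http.Post("http://"+addr+"/run", "text/plain", strings.NewReader(chunk))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	addr, err := startWorker()
	if err != nil {
		panic(err)
	}
	out, err := runChunk(addr, "^(TestFoo|TestBar)$")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints: ran chunk: ^(TestFoo|TestBar)$
}
```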
•Evaluated my hack against the CI of Docker itself
• Of course, this hack is applicable to the CI of other software as well
•Target: Docker 16.04-dev (git:7fb83eb7)
• Contains 1,648 test functions
•Machine: Google Compute Engine, n1-standard-4 instances (4 vCPUs, 15 GB RAM)
Experimental result
Experimental result
• 1 node, traditional testing: 1h22m7s
• 1 node: 15m3s (cost x1, speed-up x5.5)
• 10 nodes: 4m10s (cost x10, speed-up x19.7)
•20 times faster at maximum with 10 nodes
• But effective even with a single node!
Detailed result

Containers running in parallel (chunk size in parentheses):

Nodes   1 (1648)                10 (165)   30 (55)   50 (33)   70 (24)
1       1h22m7s (=traditional)  15m3s      N/A (more than 30m)
2                               12m7s      10m12s    11m25s    13m57s
5                               10m16s     6m18s     5m46s     6m28s
10                              8m26s      4m31s     4m10s     4m20s

• Fastest configuration: 10 nodes, 50 containers: 4m10s (speed-up x19.7, BCR x2.0)
• Best BCR (benefit-cost ratio): 1 node, 10 containers: 15m3s (speed-up x5.5, BCR x5.5)
• Also: 2 nodes, 30 containers: x8.1 (BCR x4.0); 5 nodes, 50 containers: x14.2 (BCR x2.8)

Time: average of 5 runs; graph driver: overlay2
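The speed-up and BCR figures in the table follow from simple arithmetic: speed-up is the single-node sequential time divided by the parallel wall time, and BCR is the speed-up divided by the node count. A small sketch (the helper names are mine) using the fastest configuration, 1h22m7s = 4927 s vs 4m10s = 250 s on 10 nodes:

```go
package main

import "fmt"

// speedup is baseline (sequential) time over parallel wall time.
func speedup(baselineSec, parallelSec float64) float64 {
	return baselineSec / parallelSec
}

// bcr (benefit-cost ratio) is speed-up per node spent.
func bcr(s, nodes float64) float64 {
	return s / nodes
}

func main() {
	s := speedup(4927, 250) // 1h22m7s vs 4m10s
	fmt.Printf("speed-up x%.1f, BCR x%.1f\n", s, bcr(s, 10))
	// prints: speed-up x19.7, BCR x2.0
}
```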
What if no optimization techniques?

Nodes   Parallelize     Parallelize + Chunking   Parallelize + Chunking + Shuffling (= previous slide)
1       more than 30m   14m58s                   15m3s
2                       10m1s                    10m12s
5                       7m32s                    5m46s
10                      6m9s                     4m10s

Chunking: significantly faster; Shuffling: x1.5 faster
See the previous slide for the number of containers running in parallel
Scale-out vs Scale-up?
(with both chunking and shuffling)

            Nodes   Total vCPUs   Total RAM   Containers   Result
Scale-out   10      40            150 GB      50           4m10s
Scale-up    1       40            150 GB      50           19m17s

Scale-out wins
• Better I/O parallelism, less mutex contention, etc.
•Record and use the past execution history to optimize the schedule, rather than just shuffling
•Investigate in more depth why scale-out wins
Future work
PR (merged): https://github.com/docker/docker/pull/29775
The code is available!
$ cd $GOPATH/src/github.com/docker/docker
$ make build-integration-cli-on-swarm
$ ./hack/integration-cli-on-swarm/integration-cli-on-swarm \
    -replicas 50 \
    -shuffle \
    -push-worker-image your-docker-registry/worker:latest
•Yes, with just a few lines of modification
•Planning to split this hack into an independent generic test tool
Is it applicable to other software as well?
•Docker Swarm-mode is effective for parallelizing CI jobs
•Introduced some techniques for optimal scheduling
Conclusion
• 1 node, traditional testing: 1h22m7s
• 1 node: 15m3s (cost x1, speed-up x5.5)
• 10 nodes: 4m10s (cost x10, speed-up x19.7)