Scale your Jenkins pipeline FOSDEM 2013 Anders Nickelsen, PhD QA engineer @ Tradeshift @anickelsen, [email protected]


DESCRIPTION

As your product grows, the number of tests grows. At Tradeshift, we develop code in a test-driven manner, so every change, be it a new feature or a bug fix, includes a set of functional tests driven by Selenium. At one point, it started taking too long to run all of our tests in sequence, even when we scaled the capacity of the test servers to make the individual tests run faster. When tests are slow, developers don't get immediate feedback when they commit a change, and more time is wasted switching between tasks and different parts of the code. To solve this problem, we built parallelization into our Jenkins build pipeline, so we only need to add servers to make our test suites run faster and keep overall test time down. In this talk, I'll cover how we did it and what came out of it in terms of pros and cons.

TRANSCRIPT

Page 1: Scaling your Jenkins CI pipeline

Scale your Jenkins pipeline

FOSDEM 2013, Anders Nickelsen, PhD

QA engineer @ Tradeshift (@anickelsen, [email protected])

Page 2: Scaling your Jenkins CI pipeline
Page 3: Scaling your Jenkins CI pipeline
Page 4: Scaling your Jenkins CI pipeline

Tradeshift

Page 5: Scaling your Jenkins CI pipeline

Tradeshift

Platform for your business interactions; core services are free

~3 years old

One product – one production environment

~20 developers in Copenhagen, DK and San Francisco, CA

Feb 2, 2013

Page 6: Scaling your Jenkins CI pipeline
Page 7: Scaling your Jenkins CI pipeline

Slow test is slow

Page 8: Scaling your Jenkins CI pipeline

Why scale? Fast feedback

Integration tests (IT): Selenium 2 from Geb (Java/Groovy), run by Jenkins on every change

Takes 2 hours to run all in sequence

10+ team branches, each triggering IT; verified merges to master also trigger IT

The pipeline is a congestion point; feedback is not fast enough

Page 9: Scaling your Jenkins CI pipeline

Release pipeline

(Diagram) Any branch (unit tested) → 10 team branches (= 10 projects) → integration tests on changes → master → production

Page 10: Scaling your Jenkins CI pipeline

The swarm

Page 11: Scaling your Jenkins CI pipeline

© Blizzard 2013

Page 12: Scaling your Jenkins CI pipeline

What?

12 Jenkins slaves
– New slaves join the swarm on boot
– Orchestrated by ec2-collective
– Configured by Puppet at boot

Tests distributed into sets by longest processing time, using test run-times from the last stable build

1 set per online slave (dynamic)
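The talk doesn't show the distribution code itself; below is a minimal Python sketch of the "longest processing time" heuristic the slides describe: sort tests by their last-stable-build runtime and always hand the next test to the currently lightest set. The test names and runtimes are hypothetical.

```python
import heapq

def partition_tests(runtimes, num_slaves):
    """Greedy longest-processing-time-first partitioning: take tests
    in descending order of last-build runtime and assign each one to
    the set with the smallest total runtime so far."""
    # Min-heap of (total_runtime_so_far, set_index)
    heap = [(0.0, i) for i in range(num_slaves)]
    heapq.heapify(heap)
    sets = [[] for _ in range(num_slaves)]
    for name, secs in sorted(runtimes.items(), key=lambda kv: -kv[1]):
        total, idx = heapq.heappop(heap)
        sets[idx].append(name)
        heapq.heappush(heap, (total + secs, idx))
    return sets

# Hypothetical Selenium/Geb test classes with runtimes in seconds
runtimes = {"LoginIT": 300, "InvoiceIT": 540, "SignupIT": 120, "ApiIT": 480}
print(partition_tests(runtimes, 2))
# [['InvoiceIT', 'SignupIT'], ['ApiIT', 'LoginIT']]
```

Using the longest-job-first order keeps the set totals close to balanced, so the slowest slave (which determines wall-clock time) finishes only slightly after the others.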

Page 13: Scaling your Jenkins CI pipeline
Page 14: Scaling your Jenkins CI pipeline

Fast tests!

All sets are put into the Jenkins build queue and picked up by any slave
– throttled to one per slave

12 slaves => 20 minutes
– Tests: 10 min
– Overhead: 10 min

24 slaves => 15 minutes
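The timing numbers on this slide follow directly from the ~2-hour sequential suite plus a fixed ~10 minutes of parallelization overhead; a quick check of the arithmetic:

```python
# Numbers from the talk: ~120 min of sequential test time, plus
# ~10 min of fixed per-run overhead (swarm setup, result collection)
# that does not shrink as slaves are added.
SEQUENTIAL_MIN = 120
OVERHEAD_MIN = 10

def wall_clock(slaves):
    # Ideal even split of test time across slaves, plus fixed overhead.
    return SEQUENTIAL_MIN / slaves + OVERHEAD_MIN

print(wall_clock(12))  # 20.0 -> matches "12 slaves => 20 minutes"
print(wall_clock(24))  # 15.0 -> matches "24 slaves => 15 minutes"
```

This also shows why doubling the swarm only saved 5 minutes: past a certain size, the fixed overhead dominates, which is what the later "Optimizations" slide is about.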

Page 15: Scaling your Jenkins CI pipeline

Post processing

Join when all sets complete
– Triggered builds are blocking

Collect test results
– JUnit, HTML, screenshots, videos, console logs = 500 MB/run
– Curl request to Jenkins API

Process test results
– JVM outputs, slice console logs, archive artifacts
– Custom Groovy script
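The custom Groovy script that slices console logs isn't shown in the talk; here is a rough Python sketch of the idea, assuming (hypothetically) a Surefire-style "Running &lt;TestClass&gt;" marker line between tests:

```python
import re

def slice_console_log(log_text):
    """Split one combined console log into per-test chunks so each
    failure can be read in isolation. Assumes the runner prints
    'Running <TestClass>' before each test class (a hypothetical
    marker format; adjust the regex to your runner's output)."""
    chunks = {}
    current, buf = None, []
    for line in log_text.splitlines():
        m = re.match(r"Running (\S+)", line)
        if m:
            if current is not None:
                chunks[current] = "\n".join(buf)
            current, buf = m.group(1), []
        elif current is not None:
            buf.append(line)
    if current is not None:
        chunks[current] = "\n".join(buf)
    return chunks

log = "Running LoginIT\nok\nRunning InvoiceIT\nboom"
print(slice_console_log(log))
# {'LoginIT': 'ok', 'InvoiceIT': 'boom'}
```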

Page 16: Scaling your Jenkins CI pipeline

Optimizations

Page 17: Scaling your Jenkins CI pipeline

Optimizations

Test sets are built only at the ITCase file level (by file scan)

Splitting only at the spec level means the minimum achievable time = the longest-running spec

Parallelization only scales the test processing time – 10 minutes today

Page 18: Scaling your Jenkins CI pipeline

Lessons learned

Page 19: Scaling your Jenkins CI pipeline

Parallelization overhead

Swarm setup and tear down
– node initialization and test result collection

Tests break a lot when run in parallel; fixing the tests and the code hurts

Page 20: Scaling your Jenkins CI pipeline

‘Random failures’

Failures appear probabilistic / random
– Order dependencies
– Timing issues
– Rebuild ‘fixes’ the tests

Slave failures
– Weakest link breaks the pipeline
– 24 slaves => 1 filled up due to a slow mount => pipeline broken
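The slides list order dependencies among the random failures but don't say how to hunt them down; one common diagnostic (not from the talk) is to run tests in a seeded shuffled order, so that a failing order can be replayed exactly:

```python
import random

def shuffled_order(tests, seed):
    """Return the tests in a shuffled but reproducible order.
    Order-dependent failures show up as runs that pass with one
    seed and fail with another; logging the seed lets you replay
    the exact failing order locally."""
    rng = random.Random(seed)
    order = list(tests)
    rng.shuffle(order)
    return order

# Hypothetical test classes
tests = ["LoginIT", "InvoiceIT", "SignupIT", "ApiIT"]
print(shuffled_order(tests, seed=42))
```

The same seed always yields the same order, so "rebuild with seed N" reproduces the failure instead of randomly ‘fixing’ it.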

Page 21: Scaling your Jenkins CI pipeline
Page 22: Scaling your Jenkins CI pipeline

Cost optimization

AWS EC2 instances:
– 1/2 price gave 3/4 performance
– 4/3 swarm size gave 1/4 price reduction at equivalent performance

Swarm is offline when not used
– Kept online for 1 hour after use

Page 23: Scaling your Jenkins CI pipeline
Page 24: Scaling your Jenkins CI pipeline

Credits

puppetlabs.com
github.com/andersdyekjaerhansen/ec2_collective

Jenkins and plugins
– Swarm plugin, parameterized trigger, envinject, rebuild, join, throttle concurrent builds, conditional build step

More details of our setup
– tradeshift.com/blog/just-add-servers/

We’re hiring!
– tradeshift.com/jobs