continuous deployment: the dirty details

114
Mike Brittain @mikebrittain [email protected] ENGINEERING DIRECTOR CONTINUOUS DEPLOYMENT The Dirty Details

Upload: mike-brittain

Post on 08-May-2015

18.274 views

Category:

Technology


1 download

DESCRIPTION

Presented at ALM Summit 3 in Redmond, WA. January 2013. Like what you've read? We're frequently hiring for a variety of engineering roles at Etsy. If you're interested, drop me a line or send me your resume: [email protected]. http://www.etsy.com/careers

TRANSCRIPT

Page 1: Continuous Deployment: The Dirty Details

Mike Brittain@[email protected]

ENGINEERING DIRECTOR

CONTINUOUS DEPLOYMENT

The Dirty Details

Page 2: Continuous Deployment: The Dirty Details

CONTINUOUS DEPLOYMENT

The Dirty Details

“OK, sounds cool. But I have some questions...”

Page 3: Continuous Deployment: The Dirty Details

CD 100- & 200-levels

- CI environment for automated tests- Committing to trunk- Branching in code- Config flags (a.k.a. feature flags)- DevOps mentality- Metrics and alerting- Automated deploy scripts

credit: photobookgirl (flickr)

Page 4: Continuous Deployment: The Dirty Details

CD 100- & 200-levels

- CI environment for automated tests- Committing to trunk- Branching in code- Config flags (a.k.a. feature flags)- DevOps mentality- Metrics and alerting- Automated deploy scripts

- Deploys vs. releases- Decoupled systems, schema changes- How we work: Arch. & Process- Integration and Operations

CD 300 level

credit: photobookgirl (flickr)

Page 5: Continuous Deployment: The Dirty Details
Page 6: Continuous Deployment: The Dirty Details

www. .com

Page 7: Continuous Deployment: The Dirty Details

GROSS MERCHANDISE SALES

http://www.etsy.com/blog/news/2013/notes-from-chad-2012-year-in-review/

Page 8: Continuous Deployment: The Dirty Details

DECEMBER 20121.5 Billion page views$117 Million of goods sold6 Million items sold

http://www.etsy.com/blog/news/2013/etsy-statistics-december-2012-weather-report/Items by anjaysdesigns, betwixxt, OneStarLeatherGoods, mediumcontrol, TheDesignPallet

Page 9: Continuous Deployment: The Dirty Details

175+ Committers, everyone deploys

credit: martin_heigan (flickr)

Page 10: Continuous Deployment: The Dirty Details

Very end of 2009Today

DEPLOYMENTS PER DAY

Page 11: Continuous Deployment: The Dirty Details

Continuous delivery is a pattern language in growing use in software development to improve the process of software delivery.

~wikipedia

Techniques such as automated testing, continuous integration, and continuous deployment allow software to be developed to a high standard and easily packaged and deployed to test environments, resulting in the ability to rapidly, reliably and repeatedly push out enhancements and bug fixes to customers at low risk and with minimal manual overhead.

credit: Stewart, redgen (flickr)

Page 12: Continuous Deployment: The Dirty Details

Linux, Apache, MySQL, PHPArchitecture Stack

Memcache, Gearman, Postgresql, Solr, Java, Traffic Server, Hadoop, HBase

credit: Saire Elizabeth (flickr)

Git, Jenkins

Page 13: Continuous Deployment: The Dirty Details
Page 14: Continuous Deployment: The Dirty Details

2010-today2009

Then Now

Just before we started using CD

Page 15: Continuous Deployment: The Dirty Details

15 mins6-14 hours1 person“Deployment Army”

Then Now

Part of everydayworkflow

Special event andhighly orchestrated

Page 16: Continuous Deployment: The Dirty Details

Blocked for15 minutes.

15 minutes toredeploy

Blocked for6-14 hours.

6+ hours toredeploy

Then Now

Page 17: Continuous Deployment: The Dirty Details

Mainline,minimal linking

and building,rsync,site up

Release branch,database schemas,

data transforms,packaging,

rolling restarts,cache purging,

scheduled downtime

Then Now

Page 18: Continuous Deployment: The Dirty Details

1!" #$%Put your face on Etsy.com.

Page 19: Continuous Deployment: The Dirty Details

2&# #$%Complete tax, insurance, and

benefits forms.

credit: ktpupp (flickr)

Page 20: Continuous Deployment: The Dirty Details
Page 21: Continuous Deployment: The Dirty Details

WARNING

Page 22: Continuous Deployment: The Dirty Details

GROSS MERCHANDISE SALES

Page 23: Continuous Deployment: The Dirty Details

Continuous DeploymentSmall, frequent changes.

Constantly integrating into production.30+ deploys per day.

Page 24: Continuous Deployment: The Dirty Details

“Wow... 30 deploys a day.How do you build features so quickly?”

Page 25: Continuous Deployment: The Dirty Details

Software Deploy ≠ Product Launch

Page 26: Continuous Deployment: The Dirty Details

Deploys frequently gated by config flags(“dark” releases)

Page 27: Continuous Deployment: The Dirty Details

$cfg[‘new_search’] = array('enabled' => 'off');$cfg[‘sign_in’] = array('enabled' => 'on');$cfg[‘checkout’] = array('enabled' => 'on');$cfg[‘homepage’] = array('enabled' => 'on');

Page 28: Continuous Deployment: The Dirty Details

$cfg[‘new_search’] = array('enabled' => 'off');

Page 29: Continuous Deployment: The Dirty Details

$cfg[‘new_search’] = array('enabled' => 'off');

// Meanwhile...

# old and boring search$results = do_grep();

Page 30: Continuous Deployment: The Dirty Details

$cfg[‘new_search’] = array('enabled' => 'off');

// Meanwhile...

if ($cfg[‘new_search’] == ‘on’) { # New and fancy search $results = do_solr();} else { # old and boring search $results = do_grep();}

Page 31: Continuous Deployment: The Dirty Details

$cfg[‘new_search’] = array('enabled' => 'on');

// or...

$cfg[‘new_search’] = array('enabled' => 'staff');

// or...

$cfg[‘new_search’] = array('enabled' => '1%');

// or...

$cfg[‘new_search’] = array('enabled' => 'users', 'user_list' => 'mike,john,kellan');

Page 32: Continuous Deployment: The Dirty Details

Validate in production, hidden from public.

Page 33: Continuous Deployment: The Dirty Details

Small incremental changes to the applicationNew classes, methods, controllersGraphics, stylesheets, templatesCopy/content changes

Turning flags on, off, or % ramp up

What’s in a deploy?

Page 34: Continuous Deployment: The Dirty Details

Latent bugs and security holesTraffic management, load sheddingAdding and removing infrastructure

Tweaking config flags or releasing patches.

Low MTTR (response times)

Page 35: Continuous Deployment: The Dirty Details

http://www.flickr.com/photos/flyforfun/2694158656/

Page 36: Continuous Deployment: The Dirty Details

http://www.flickr.com/photos/flyforfun/2694158656/

OperatorConfig flags

Metrics

Page 37: Continuous Deployment: The Dirty Details

Favorites“on”

Page 38: Continuous Deployment: The Dirty Details

Favorites“off”

Page 39: Continuous Deployment: The Dirty Details

Many deploys eventually lead to a product launch.

Page 40: Continuous Deployment: The Dirty Details

“How do you continuously deploy database schema changes?”

Page 41: Continuous Deployment: The Dirty Details

Code deploysSchema changes

~15-20 minutes

Page 42: Continuous Deployment: The Dirty Details

Code deploysSchema changes

~15-20 minutesTHURSDAYS!

Page 43: Continuous Deployment: The Dirty Details

Our web application is largely monolithic.

Etsy.com, Support & Back-office tools, Developer API, Gearman (async work)

Page 44: Continuous Deployment: The Dirty Details

Our web application is largely monolithic.

Etsy.com, Support & Back-office tools, Developer API, Gearman (async work)

PHP, Apache, Memcache

Page 45: Continuous Deployment: The Dirty Details

External “services” are not deployed with the main application.

e.g. Databases, Search, Photo storage, Payments

Page 46: Continuous Deployment: The Dirty Details

External “services” are not deployed with the main application.

e.g. Databases, Search, Photo storage, Payments

MYSQL(schema changes)

SOLR, JVM(rolling restarts)

PROXY CACHE, FILERS, AMAZON S3

(specialized infra.)

PCI(controlled access)

Page 47: Continuous Deployment: The Dirty Details

For every config flag, there are two stateswe can support — present and future.

Page 48: Continuous Deployment: The Dirty Details

For every config flag, there are two stateswe can support — present and future.

... or past and present.

Page 49: Continuous Deployment: The Dirty Details

“Non-Breaking Expansions”Expose new version in a service interface;

support multiple versions in the consumer.

Page 50: Continuous Deployment: The Dirty Details

Example: Changing a Database SchemaMerging “users” and “users_prefs”

Page 51: Continuous Deployment: The Dirty Details

RULE OF THUMB:Prefer ADDs over ALTERs (non-breaking expansion)C

Page 52: Continuous Deployment: The Dirty Details

1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version

Page 53: Continuous Deployment: The Dirty Details

0. Add new version to schema1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version

Page 54: Continuous Deployment: The Dirty Details

0. Add new version to schemaSchema change to add prefs columns to “users” table.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “off”“read_prefs_from_users_table” => “off”

Page 55: Continuous Deployment: The Dirty Details

1. Write to both versionsWrite code for writing prefs to the “users” table.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “off”

Page 56: Continuous Deployment: The Dirty Details

2. Backfill historical dataOffline process to sync existing data from “user_prefs”to new columns in “users”

Page 57: Continuous Deployment: The Dirty Details

3. Read from new versionData validation tests. Ensure consistency both internallyand in production.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “staff”

Page 58: Continuous Deployment: The Dirty Details

3. Read from new versionData validation tests. Ensure consistency both internallyand in production.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “1%”

Page 59: Continuous Deployment: The Dirty Details

3. Read from new versionData validation tests. Ensure consistency both internallyand in production.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “5%”

Page 60: Continuous Deployment: The Dirty Details

3. Read from new versionData validation tests. Ensure consistency both internallyand in production.

“write_prefs_to_user_prefs_table” => “on”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “on”

(“on” == “100%”)

Page 61: Continuous Deployment: The Dirty Details

4. Cut-off writes to old versionAfter running on the new table for a significant amount of time, we can cut off writes to the old table.

“write_prefs_to_user_prefs_table” => “off”“write_prefs_to_users_table” => “on”“read_prefs_from_users_table” => “on”

Page 62: Continuous Deployment: The Dirty Details

“Branch by Astraction”

Controller Controller

Users Model

“users” (old) “user_prefs” “users”

old schema new schema

(Abstraction)

http://paulhammant.com/blog/branch_by_abstraction.htmlhttp://continuousdelivery.com/2011/05/make-large-scale-changes-incrementally-with-branch-by-abstraction/

Page 63: Continuous Deployment: The Dirty Details

1. Write to both versions2. Backfill historical data3. Read from new version4. Cut-off writes to old version

“The Migration 4-Step”

Page 64: Continuous Deployment: The Dirty Details

“When do you clean up all of those config flags?

Page 65: Continuous Deployment: The Dirty Details

We might remove config flags for the old version when...

It is no longer valid for the business.It is no longer stable, maintained, or trusted.It has poor performance characteristics.The code is a mess, or difficult to read.We can afford to spend time on it.

Page 66: Continuous Deployment: The Dirty Details

Promote “dev flags” to “feature flags”

Page 67: Continuous Deployment: The Dirty Details

// Feature flag$cfg[‘mobilized_pages’] = array('enabled' => 'on');

// Dev flags$cfg[‘mobile_templates_seller_tools’] = array('enabled' => 'on');$cfg[‘mobile_templates_account_tools’] = array('enabled' => 'on');$cfg[‘mobile_templates_member_profile’] = array('enabled' => 'on');$cfg[‘mobile_templates_search’] = array('enabled' => 'off');$cfg[‘mobile_templates_activity_feed’] = array('enabled' => 'off');

...

if ($cfg[‘mobilized_pages’] == ‘on’ && $cfg[‘mobile_templates_search’] == ‘on’) { // ... // ...}

Page 68: Continuous Deployment: The Dirty Details

// Feature flags$cfg[‘search’] = array('enabled' => 'on');$cfg[‘developer_api’] = array('enabled' => 'on');$cfg[‘seller_tools’] = array('enabled' => 'on');

$cfg[‘the_entire_web_site’] = array('enabled' => 'on');

Page 69: Continuous Deployment: The Dirty Details
Page 70: Continuous Deployment: The Dirty Details

// Feature flags$cfg[‘search’] = array('enabled' => 'on');$cfg[‘developer_api’] = array('enabled' => 'on');$cfg[‘seller_tools’] = array('enabled' => 'on');

$cfg[‘the_entire_web_site’] = array('enabled' => 'on');$cfg[‘the_entire_web_site_no_really_i_mean_it’] = array('enabled' => 'on');

Page 71: Continuous Deployment: The Dirty Details

Architecture and Process

Page 72: Continuous Deployment: The Dirty Details

Deploying is cheap.

Page 73: Continuous Deployment: The Dirty Details

Releasing is cheap.

Page 74: Continuous Deployment: The Dirty Details

Some philosophies on product development...

Page 75: Continuous Deployment: The Dirty Details

Gathering data should be cheap, too.staff, opt-in prototypes, 1%

Page 76: Continuous Deployment: The Dirty Details

Treat first iterations as experiments.

Page 77: Continuous Deployment: The Dirty Details

Get into code as quickly as possible.

Page 78: Continuous Deployment: The Dirty Details

“Where a new system concept or new technology is used, one has to build a system to throw away, for even the best planning is not so omniscient as to get it right the first time. Hence plan to throw one away; you will, anyhow.”

~ Fred Brooks, The Mythical Man-Month

Page 79: Continuous Deployment: The Dirty Details

Architecture largely doesn’t matter.

Page 80: Continuous Deployment: The Dirty Details

Kill things that don’t work.

Page 81: Continuous Deployment: The Dirty Details

Your assumptions will be wrongonce you’ve scaled 10x.

Page 82: Continuous Deployment: The Dirty Details

“We don’t optimize for being right. We optimize for quickly detecting when we’re wrong.”

~Kellan Elliott-McCrea, CTO

Page 83: Continuous Deployment: The Dirty Details

Become really good at changingyour architecture.

Page 84: Continuous Deployment: The Dirty Details

Invest time in architecture by the2nd or 3rd iteration.

Page 85: Continuous Deployment: The Dirty Details

Integration and Operations

Page 86: Continuous Deployment: The Dirty Details

WARNING

REMEMBER THIS?

Page 87: Continuous Deployment: The Dirty Details

Continuous DeploymentSmall, frequent changes.

Constantly integrating into production.30 deploys per day.

Page 88: Continuous Deployment: The Dirty Details

Code review before commit

Page 89: Continuous Deployment: The Dirty Details

Automated tests before deploy

Page 90: Continuous Deployment: The Dirty Details

Why Integrate with Production?

Page 91: Continuous Deployment: The Dirty Details

Dev ≠ Prod

Page 92: Continuous Deployment: The Dirty Details

Verify frequently and in small batches.

Page 93: Continuous Deployment: The Dirty Details

Integrating with production is a test in itself.We do this frequently and in small batches.

Page 94: Continuous Deployment: The Dirty Details

More database servers in prod.Bigger database hardware in prod.More web servers.Various replication schemes.Different versions of server and OS software.Schema changes applied at different times.Physical hardware in prod.More data in prod.Legacy data (7 years of odd user states).More traffic in prod.Wait, I mean MUCH more traffic in prod.Fewer elves.Faster disks (SSDs) in prod.

Page 95: Continuous Deployment: The Dirty Details

Using a MySQL database in dev for an application that will be runningon Oracle in production: Priceless

Page 96: Continuous Deployment: The Dirty Details

Verify frequently and in small batches.

Page 97: Continuous Deployment: The Dirty Details

Dev ⇾ QA ⇾ Staging ⇾ Prod

Page 98: Continuous Deployment: The Dirty Details

Dev ⇾ QA ⇾ Staging ⇾ Prod

Page 99: Continuous Deployment: The Dirty Details

Dev ⇾ Pre-Prod ⇾ Prod

Page 100: Continuous Deployment: The Dirty Details

Test and integrate where you’ll see value.

Page 101: Continuous Deployment: The Dirty Details

Config flags (again)

off, on, staff, opt-in prototypes, user list, 0-100%

Page 102: Continuous Deployment: The Dirty Details

Config flags (again)

off, on, staff, opt-in prototypes, user list, 0-100%

“canary pools”

Page 103: Continuous Deployment: The Dirty Details

Automated tests after deploy

Page 104: Continuous Deployment: The Dirty Details

Real-time metrics and dashboardsNetwork & Servers, Application, Business

Page 105: Continuous Deployment: The Dirty Details
Page 106: Continuous Deployment: The Dirty Details

SERVER METRICS

Apache requests/sec, Busy processes,CPU utilization, Script exec time (med. & 95th)

APPLICATION METRICS

Logins, Registrations, Checkouts, Listings created, Forum posts

Time and event correlated.

Page 107: Continuous Deployment: The Dirty Details
Page 108: Continuous Deployment: The Dirty Details

Real humans reporting trouble!

Page 109: Continuous Deployment: The Dirty Details

“Theoretical” vs. “Practical” Engineering

Page 110: Continuous Deployment: The Dirty Details

http://www.flickr.com/photos/flyforfun/2694158656/

OperatorConfig flags

Metrics

Page 111: Continuous Deployment: The Dirty Details

Thanksgiving, “Black Friday,” “Cyber Monday” ➔ Christmas(~30 days)

Managing risk during Holiday Shopping season

Code Freeze?

Page 112: Continuous Deployment: The Dirty Details

“Code Slush”

DEPLOYMENTS PER DAY

Page 113: Continuous Deployment: The Dirty Details

Tighten your feedback cyclesIntegrate with production and validate early in cycle.Use tools that allow you to detect issues early.Optimize for quick response times.

Applied to both feature development and operability.

Page 114: Continuous Deployment: The Dirty Details

Thank you

Mike Brittain

... and questions?

These slides will be available later today at http://mikebrittain.com/talks

@[email protected]

ENGINEERING DIRECTOR