Puppet Camp Silicon Valley 2015: How TubeMogul reached 10,000 Puppet Deployment in one year

How TubeMogul reached 10,000 Puppet deployment in one year
May 26th, 2015
Nicolas Brousse | Sr. Director Of Operations Engineering
Julien Fabre | Site Reliability Engineer

Upload: nicolas-brousse

Post on 27-Jul-2015


Category: Engineering



TRANSCRIPT

Page 1: Puppet Camp Silicon Valley 2015: How TubeMogul reached 10,000 Puppet Deployment in one year

How TubeMogul reached 10,000 Puppet deployment in one year

May 26th, 2015

Nicolas Brousse | Sr. Director Of Operations Engineering

Julien Fabre | Site Reliability Engineer

Page 2

Who are we?

TubeMogul
● Enterprise software company for digital branding
● Over 27 billion ads served in 2014
● Over 30 billion ad auctions per day
● Bids processed in less than 50 ms
● Bids served in less than 80 ms (including network round trip)
● 5 PB of monthly video traffic served
● 1.3 EB of data stored

Page 3

Who are we?

Operations Engineering
● Ensure the smooth day-to-day operation of the platform infrastructure
● Provide a cost-effective and cutting-edge infrastructure
● Team composed of SREs, SEs and DBAs
● Managing over 2,500 servers (virtual and physical)

Page 4

Our Infrastructure

Public Cloud On Premises

Multiple locations with a mix of Public Cloud and On Premises

Page 5

Technology Hoarders

● Java (a lot!)
● MySQL
● Couchbase
● Vertica
● Kafka
● Storm
● Zookeeper, Exhibitor
● Hadoop, HBase, Hive
● Terracotta
● ElasticSearch, Kibana
● LogStash
● PHP, Python, Ruby, Go...
● Apache httpd
● Nagios
● Ganglia
● Graphite
● Memcached
● Puppet
● HAproxy
● OpenStack
● Git and Gerrit
● Gor
● ActiveMQ
● OpenLDAP
● Redis
● Blackbox
● Jenkins, Sonar
● Tomcat
● Jetty (embedded)
● AWS DynamoDB, EC2, S3...

Page 6

Five Years Of Puppet!

● 2008 - 2010: Use SVN, Bash scripts and custom templates.
● 2010: Managing about 250 instances. Start looking at Puppet.
● 2011: Puppet 0.25 then 2.7 by EOY on 400 servers with 2 contributors.
● 2012: 800 servers managed by Puppet. 4 contributors.
● 2013: 1,000 servers managed by Puppet. 6 contributors.
● 2014: 1,500 servers managed by Puppet. Introduced Continuous Delivery Workflow. 9 contributors. Start 3.7 migration.
● 2015: 2,000 servers managed by Puppet. 13 contributors.

Page 7

Puppet Stats

● 2,000 nodes
● 225 unique node definitions
● 1 puppetmaster
● 112 Puppet modules

Page 8

Where and how do we use Puppet ?

● Virtual and Physical Servers Configuration : Master mode
● Building AWS AMI with Packer : Master mode
● Local development environment with Vagrant : Master mode
● OpenStack deployment : Masterless mode

Page 9

Code Review?

Page 10

A Powerful Gerrit Integration

● Gerrit, an industry standard : Eclipse, Google, Chromium, OpenStack, WikiMedia, LibreOffice, Spotify, GlusterFS, etc.
● Fine-grained permission rules
● Plugged to LDAP
● Code review per commit
● Stream events
● Use GitBlit
● Integrated with Jenkins and Jira
● Managing about 600 Git repositories

Page 11

Gerrit in Action

Verify -1 when the commit has no ticket # or doesn't pass Jenkins code validation
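The rule above can be sketched in a few lines of Ruby. This is only an illustration: the ticket pattern (`OPS-1234` style) and the +1 vote on success are assumptions, since the slide only states when a -1 is given.

```ruby
# Hypothetical ticket pattern (e.g. "OPS-1234"); the real convention is not in the slides.
TICKET_PATTERN = /\b[A-Z][A-Z0-9]+-\d+\b/

# Returns the Verified vote a CI job could report back to Gerrit:
# -1 when the commit message has no ticket number or Jenkins validation failed,
# +1 otherwise (the positive vote is an assumption).
def verified_vote(commit_message, jenkins_passed)
  has_ticket = commit_message.match?(TICKET_PATTERN)
  (has_ticket && jenkins_passed) ? 1 : -1
end

puts verified_vote('OPS-42: tune ntp servers', true)
puts verified_vote('tune ntp servers', true)
```

Keeping the check as a pure function makes it easy to unit test before wiring it into a Jenkins job.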

Page 12

Continuous Delivery with Jenkins

● 1 job per module
● 1 job for the manifests and hiera data
● 1 job for the Puppet fileserver
● 1 job to deploy

Global Jenkins stats for the past year
● ~10,000 Puppet deployments
● Over 8,500 production app deployments

Page 13

Jenkins job DSL : code your Jenkins jobs

Plugin : github.com/jenkinsci/job-dsl-plugin

● Automate job creation
● Ensure a standard across all the jobs
● Version the configuration
● Apply changes to all your jobs without pain
● Test your configuration changes

Page 14

Team Awareness: HipChat Integration with Hubot

Page 15

Puppet Continuous Delivery Workflow: The Vision

Infrastructure As Code
● Follow standard development lifecycle
● Repeatable and consistent server provisioning

Continuous Delivery
● Iterate quickly
● Automated code review to improve code quality

Reliability
● Improve Production Stability
● Enforce Better Security Practices

Page 16

The Workflow

Page 17

The Workflow : Puppet code logic

Puppet environments
● Dedicated node manifests (*.pp)
● Modules deployed by branch with Git submodules

All the data in Hiera
● Try to avoid params.pp class
● Store everything : module parameters, classes, keys, passwords, ...

Page 18

Puppet Code Hierarchy

/etc/puppet
├── puppet.conf, hiera.yaml, *.conf
├── hiera
└── environments
    ├── dev
    │   ├── manifests
    │   │   ├── nodes/*.pp
    │   │   └── site.pp
    │   └── modules
    │       ├── activemq
    │       ...
    │       └── zookeeper
    └── production
        ├── manifests
        │   ├── nodes/*.pp
        │   └── site.pp
        └── modules
            ├── activemq
            …
            └── zookeeper

Git submodules, branch dev

Git submodules, branch production

Page 19

Hiera Configuration

$ cat /etc/puppet/hiera.yaml
---
:backends:
  - eyaml
  - yaml
:yaml:
  :datadir: /etc/puppet/hiera
:eyaml:
  :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
  :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
:hierarchy:
  - fqdn/%{::fqdn}
  - "%{::zone}/%{::vpc}/%{::hostgroup}"
  - "%{::zone}/%{::vpc}/all"
  - "%{::zone}/%{::hostgroup}"
  - "%{::zone}/all"
  - hostname/%{::hostname}
  - hostgroup/%{::hostgroup}
  - environment/%{::environment}
  - common
:merge_behavior: deeper
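To make the hierarchy easier to reason about, here is a minimal Ruby sketch of how such a lookup could resolve: each entry is expanded with the node's facts, sources are read from most to least specific, and hashes are merged recursively with the more specific value winning. It is a simplified illustration, not Hiera's implementation (the real `deeper` behavior also merges arrays), and the fact values and data below are hypothetical.

```ruby
# Simplified illustration of hierarchy resolution; NOT Hiera's actual code.
HIERARCHY = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::vpc}/%{::hostgroup}',
  '%{::zone}/%{::vpc}/all',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'hostname/%{::hostname}',
  'hostgroup/%{::hostgroup}',
  'environment/%{::environment}',
  'common'
].freeze

# Replace each %{::fact} placeholder with the fact's value ('' when unknown).
def expand(entry, facts)
  entry.gsub(/%\{::(\w+)\}/) { facts.fetch(Regexp.last_match(1), '') }
end

# Recursively merge two hashes; on conflicts the higher-priority value wins.
def deep_merge(high, low)
  low.merge(high) do |_key, low_val, high_val|
    if high_val.is_a?(Hash) && low_val.is_a?(Hash)
      deep_merge(high_val, low_val)
    else
      high_val
    end
  end
end

# data maps a source name (e.g. "common") to its parsed YAML content.
def lookup_hash(key, facts, data)
  HIERARCHY.map { |entry| expand(entry, facts) }
           .map { |source| data.dig(source, key) }
           .compact
           .reduce({}) { |merged, value| deep_merge(merged, value) }
end

# Hypothetical facts and data, for illustration only.
facts = { 'zone' => 'us-east-1', 'vpc' => 'prod', 'hostgroup' => 'bidder' }
data  = {
  'us-east-1/all' => { 'ntp' => { 'servers' => ['ntp.us-east-1.example.com'] } },
  'common'        => { 'ntp' => { 'servers' => ['pool.ntp.org'], 'enabled' => true } }
}
p lookup_hash('ntp', facts, data)
```

In this example the zone-level file overrides `servers`, while `enabled` is inherited from `common`.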

Page 20

Encrypt Your Secrets

Hiera eyaml : github.com/TomPoulton/hiera-eyaml

● Hiera backend
● Easy to use
● Powerful CLI : eyaml edit /etc/puppet/hiera/secrets.yaml

$ cat secret.yaml
---
ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s+Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]

Page 21

Encrypt Files

Blackbox : github.com/StackExchange/blackbox

● Use GPG to encrypt secret files
● Easy to add/delete team members
● No need to change your Puppet code !

# modules/${module_name}/files/credentials.yaml.gpg

file { '/etc/app/credentials.yaml':
  ensure => 'file',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  # Double quotes so that ${module_name} is interpolated
  source => "puppet:///modules/${module_name}/credentials.yaml",
}

Page 22

The Workflow

Page 23

The Workflow : bottlenecks

● Only Ops team members can commit (SRE, SE)

● Review and validation are done only by an SRE

● Jenkins will verify the code but will not validate the commit

● Static Puppet environments

● Rely a lot on server hostnames

Page 24

Puppet Workflow Reloaded!

Flexibility : R10K github.com/adrienthebo/r10k !

● Dynamic environments
● No Git submodules anymore ! :-)
● Easy to reproduce any environment
● Can use private and forge Puppet modules
● Can use branches and tags
● Based on Puppetfile

Page 25

R10K

$ cat Puppetfile
forge "https://forgeapi.puppetlabs.com"

# Forge modules
mod 'pdxcat/collectd'
mod 'puppetlabs/rabbitmq'
mod 'arioch/redis'
mod 'maestrodev/wget'
mod 'puppetlabs/apt'
mod 'puppetlabs/stdlib'

# Tubemogul modules
mod "hosts",
  :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
  :branch => 'dev'
mod "timezone",
  :git    => 'ssh://<gerrit_host>/puppet/modules/timezone',
  :branch => 'dev'

...
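A Puppetfile is plain Ruby, which is part of why it is easy to reproduce an environment from it. The sketch below shows one way a tool could read such a file by evaluating it against an object that defines `forge` and `mod`. The class name `PuppetfileSketch` is hypothetical and this is not r10k's actual loader, which handles many more cases.

```ruby
# Minimal Puppetfile reader, for illustration only (not r10k's real loader).
class PuppetfileSketch
  attr_reader :forge_url, :modules

  def initialize
    @forge_url = nil
    @modules = []
  end

  # Called by the `forge "..."` line.
  def forge(url)
    @forge_url = url
  end

  # Called by each `mod ...` line; args is nil for plain Forge modules
  # and a hash (:git, :branch, ...) for Git-hosted modules.
  def mod(name, args = nil)
    from_git = args.is_a?(Hash) && args.key?(:git)
    @modules << { name: name, source: from_git ? :git : :forge, args: args }
  end

  # Evaluate the Puppetfile text in the context of a fresh instance.
  def self.parse(text)
    sketch = new
    sketch.instance_eval(text)
    sketch
  end
end

PUPPETFILE = <<~'EOS'
  forge "https://forgeapi.puppetlabs.com"
  mod 'puppetlabs/stdlib'
  mod "hosts",
    :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
    :branch => 'dev'
EOS

parsed = PuppetfileSketch.parse(PUPPETFILE)
parsed.modules.each { |m| puts "#{m[:name]} (#{m[:source]})" }
```

Because the file is evaluated as code, a Puppetfile should only ever come from a trusted, reviewed repository.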

Page 26

Puppet Workflow Reloaded!

Better code organization : Roles and Profiles

● Represent the business logic : Roles
  o Highest abstraction layer
  o Use Profiles for implementation

● Implement the applications : Profiles
  o Remove potential code duplication
  o Use modules and other Puppet resources

Page 27

Roles/Profiles Pattern

class role::logs {
  include profile::base
  include profile::logstash::server
  include profile::elasticsearch
}

# Named profile::logstash::server to match the include above and the Hiera keys below
class profile::logstash::server {
  $version    = hiera('profile::logstash::server::version', '1.4.2')
  $es_host    = hiera('profile::logstash::server::es_host', 'es01')
  $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

  class { 'logstash':
    package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
    java_install => true,
  }

  logstash::configfile { 'input_redis':
    content => template('logstash/configfile/logstash.input_redis.conf.erb'),
    order   => 10,
  }

  logstash::configfile { 'output_es':
    content => template('logstash/configfile/logstash.output_es.conf.erb'),
    order   => 30,
  }
}

Page 28

Puppet Workflow Reloaded!

Do not rely on hostname : nodeless approach

● Facts to guide Puppet
● No node myawesomeserver { } anymore
● Enforce a cluster vision
● site.pp gives the configuration logic

# /etc/puppet/manifests/site.pp

node default {
  if $::ec2_tag_tm_role {
    notify { "Using role : ${::ec2_tag_tm_role}": }
    include "role::${::ec2_tag_tm_role}"
  } else {
    fail('No role found. Nothing to configure.')
  }
}

Page 29

AWS EC2 tags

● Specify tags during the provisioning
● Retrieve tags with AWS Ruby SDK and create facts
● New hierarchy

$ facter -p | grep ec2_tag
ec2_tag_cluster => rtb-bidder
ec2_tag_nagios_host => mgmt01
ec2_tag_name => bidder
ec2_tag_pupenv => production
ec2_tag_tm_role => rtb::bidder

:hierarchy:
  - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
  - "%{::zone}/%{::ec2_tag_vpc}/all"
  - "%{::zone}/all"
  - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
  - vpc/%{::ec2_tag_vpc}/all
  - environment/%{::environment}
  - common
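The tag-to-fact step can be sketched without the AWS SDK by treating the tags as a plain hash. The naming rule below (prefix `ec2_tag_`, lowercase, other characters replaced by `_`) is an assumption inferred from the `facter` output above, and both helper names are hypothetical. The second helper mirrors the nodeless site.pp logic: the role class comes from a fact, or the run fails.

```ruby
# Turn EC2 tag key/value pairs into Facter-style fact names.
# The naming rule is an assumption based on the facter output in the slide.
def tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    fact_name = "ec2_tag_#{key.downcase.gsub(/[^a-z0-9_]/, '_')}"
    facts[fact_name] = value
  end
end

# Mirror of the site.pp logic: derive the role class from a fact, or fail.
def role_class_for(facts)
  role = facts['ec2_tag_tm_role'].to_s
  raise ArgumentError, 'No role found. Nothing to configure.' if role.empty?

  "role::#{role}"
end

# Tags as they might be set at provisioning time (values taken from the slide).
tags  = { 'Name' => 'bidder', 'cluster' => 'rtb-bidder', 'tm_role' => 'rtb::bidder' }
facts = tags_to_facts(tags)
puts role_class_for(facts)
```

In a real custom fact, the tags would be fetched from the EC2 API and each pair registered with `Facter.add`; those parts are left out here to keep the sketch self-contained.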

Page 30

New merging and reviewing rules

● Everyone can commit Puppet code

● Allow everyone to review a Puppet change (+1)

● Allow SE and SRE to validate a Puppet change (+2)

● Auto validation/merging in dev if at least 80% of tests pass (+2)

Page 31

Next improvements

● Acceptance testing with Beaker and Docker

● Full test provisioning with ServerSpec

● PuppetDB to improve the reporting

● Dedicated Puppet Masters

Page 32

Nicolas Brousse
@orieg

Julien Fabre
@julien_fabre