smart platform infrastructure with aws

Post on 15-Feb-2017


Smart Platform Infrastructure

How we are learning to let our team sleep at night

James Huston

DevOps Days Charlotte

February 2017

whoami

• James Huston - Director of Platform Engineering @ Red Ventures

• Over the last 20 years I have been on teams that:

• Tried a lot of things; some worked, some didn’t

• Learned a lot of do’s and don’ts

The Team

Thomas Hopkins, Ryan Ruscett, Alfonso Cabrera, Garrett Johnson, Mike Guthrie

So what do I have to share?

• Sleep

• Operations -vs- Platform Ops

• Infrastructure (AWS)

• Monitoring and Alerting

• Security

• Workflows

• Documentation

• Docker

Sleep

• Our jobs are 24/7/365

• Small teams

• Resource bound

• To be successful, we need sleep

Operations -vs- Platform Ops

Operations:

• Deeper knowledge

• Correct -vs- Fast

• Snowflakes?

Platform Ops:

• Wide breadth of knowledge

• Fast turnaround, or self service

• Automate all the things

Platform Ops

Platform enables developers to safely and consistently perform their own operations and build resilient and secure applications.

Infrastructure

• Traditional Operations - Healthy Infrastructure

• Linux in your datacenter

• Apps on top of that

• Platform Ops - Healthy Applications

• AWS/Azure/Google

• Managed services

• Apps on top of that

Monitoring and Alerting

• You are likely underestimating its importance

• Integrate them from the beginning, don’t bolt them on.

• Make sure your alerts go to the correct people

• Don’t create alerts that you are going to ignore!
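
A minimal sketch of the last two points, written in Python: every alert is routed to the team that owns the service, and alerts nobody can act on are dropped instead of being sent and ignored. The `Alert` fields, severity names, service names, and rotation names are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str      # application or component that fired the alert
    severity: str     # "page", "ticket", or "info" (hypothetical levels)
    actionable: bool  # can a human actually do something about it?


# Hypothetical ownership map: service name -> on-call rotation.
OWNERS = {
    "checkout-api": "payments-oncall",
    "vpc-network": "platform-oncall",
}

# Assumed fallback so that no actionable alert is left without an owner.
DEFAULT_OWNER = "platform-oncall"


def route(alert):
    """Return the rotation to notify, or None if nobody should be notified."""
    # Alerts that would only be ignored belong on a dashboard, not in a pager.
    if not alert.actionable or alert.severity == "info":
        return None
    return OWNERS.get(alert.service, DEFAULT_OWNER)
```

Returning `None` makes the "do not send" decision explicit, so a review of the routing table also surfaces the alerts that should probably be deleted.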

Infrastructure Layout

[Diagram: Staging and Production environments]

Our Infrastructure

Infrastructure - Why is it Important

• Take advantage of Autoscaling for scale and auto healing

• Design to be secure from the start

• Design with monitoring and alerting built in

• Build your infrastructure in a standard, documented, reproducible way

Immutable Infrastructure

• First line of debugging: remove the machine and let it get replaced

• Avoid snowflakes/unicorns as much as possible

• Replace for security reasons

• Easy to implement (in the cloud anyhow)

• Salt/Chef/Puppet - use it for initial config, don’t push changes
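
The "remove it and let it get replaced" idea can be sketched as a small selection function. This is an illustration under stated assumptions, not the team's tooling: the instance dictionaries, the `max_age_days` threshold, and the reason strings are hypothetical, and the actual termination (after which an autoscaling group would launch a fresh machine) is left out.

```python
def instances_to_replace(instances, max_age_days=30):
    """Pick instances to terminate so the autoscaling group replaces them.

    `instances` is a list of dicts with hypothetical keys:
    "id", "healthy" (bool) and "age_days" (int).
    """
    doomed = []
    for inst in instances:
        if not inst["healthy"]:
            # First line of debugging: do not repair in place, replace.
            doomed.append((inst["id"], "unhealthy"))
        elif inst["age_days"] > max_age_days:
            # Old machines are rotated out so they pick up a patched image.
            doomed.append((inst["id"], "stale image"))
    return doomed
```

Because nothing is ever patched by hand, a replaced machine comes up identical to its neighbours, which is what keeps snowflakes from forming.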

Program and Automate

• Reproduce repeatable infrastructures

• Team review of changes before they are made

• Pull requests

• Easy Rollback

• Shareable and reusable modules

• https://github.com/segmentio/stack
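
One way to picture "review of changes before they are made": infrastructure-as-code tools compare the configuration you want with the state they last recorded and show the difference before applying anything. The Python function below is a deliberately simplified toy version of that comparison. It is not how any particular tool is implemented, and the resource names and attributes in the usage note are made up.

```python
def plan(desired, state):
    """Return the changes needed to move recorded state to the desired config.

    Both arguments map a resource name to a dict of its settings.
    """
    changes = {"create": [], "update": [], "delete": []}
    for name, config in desired.items():
        if name not in state:
            changes["create"].append(name)
        elif state[name] != config:
            changes["update"].append(name)
    for name in state:
        if name not in desired:
            changes["delete"].append(name)
    # Sort so the output is stable and easy to read in a pull request.
    for names in changes.values():
        names.sort()
    return changes
```

With `desired = {"web": {"size": "large"}, "db": {"size": "large"}}` and `state = {"web": {"size": "small"}, "old": {"size": "small"}}`, the plan lists `db` to create, `web` to update, and `old` to delete. Rolling back is the same operation with the previous configuration as `desired`.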

Terraform

• Plays nice with Most of the Things

• Multiple cloud providers, VMware, OpenStack

• Grafana, DataDog, New Relic, PagerDuty, Logentries

• MySQL, PostgreSQL

• Program all the things - Except Snowflakes

Terraform -vs- CloudFormation

Terraform:

• State

• Fast

• Admin Access

CloudFormation:

• No State

• Not so fast

• AWS Service Catalog

Security - SSO

• Don’t underestimate the power of the dark side OR your need to use Single Sign On (SSO)

• Active Directory, LDAP, Okta for AWS/Apps

• JumpCloud or LDAP for EC2 instances

• Avoid tools that don’t support SSO (GitHub.com) in favor of tools that do (GitHub Enterprise)

Security

• Don’t share SSH keys among your team(s). Ever.

• 0.0.0.0/0 on a security group that is not a public ELB? That’s likely bad.

• e.g. future VPN or DirectConnect
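
The 0.0.0.0/0 rule of thumb is easy to check mechanically. Below is a hedged sketch of such an audit in Python. The rule format (`group`, `cidr`, `port`) is a simplified, hypothetical shape and not the AWS API response format, and the set of public ELB groups is something you would have to supply yourself.

```python
import ipaddress


def open_to_world(rules, public_elb_groups=frozenset()):
    """Return (group, port) pairs that allow traffic from any address.

    `rules` is a list of dicts with hypothetical keys "group", "cidr", "port".
    Groups listed in `public_elb_groups` are expected to be open and skipped.
    """
    findings = []
    for rule in rules:
        if rule["group"] in public_elb_groups:
            continue
        network = ipaddress.ip_network(rule["cidr"])
        # A prefix length of 0 matches every address (0.0.0.0/0 or ::/0).
        if network.prefixlen == 0:
            findings.append((rule["group"], rule["port"]))
    return findings
```

Running a check like this in CI turns "that's likely bad" into a review comment before the rule reaches production.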

Developer Workflows

• Automation is key

• Use standard tooling (Makefile, shell scripts, etc.)

• Bamboo -vs- Jenkins

• Centralization

• Provide guardrails and let teams with the expertise control their own destiny

• Documentation of workflows is critically important

Documentation

• README.MD - keep docs with your projects

• Centralize infrastructure, CI/CD, and other core docs

• Make it mandatory in governance

• Set a good example!
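
"Make it mandatory in governance" can be as simple as a check that fails the build when a project has no README. A minimal Python sketch, assuming a hypothetical layout with one directory per project under a common root:

```python
from pathlib import Path


def projects_missing_readme(root):
    """Return the names of project directories under `root` without a README.md.

    The file name is matched case-insensitively, so README.MD also counts.
    """
    missing = []
    for project in sorted(Path(root).iterdir()):
        if not project.is_dir():
            continue
        has_readme = any(
            child.name.lower() == "readme.md" for child in project.iterdir()
        )
        if not has_readme:
            missing.append(project.name)
    return missing
```

A CI job could call this and exit non-zero when the returned list is not empty.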

Docker

Security info à la Jérôme Petazzoni (https://jpetazzo.github.io/): http://bit.ly/1t1DG3Q

Docker

• Don’t run things as root

• Update often!

• For real security, run all filesystems read-only

• Use small (Alpine, Debian) base images

• Use only approved images

• Update them often

• Windows? All of the above.
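
Several of these rules can be enforced by whatever wrapper launches containers. The Python sketch below builds a `docker run` command line that refuses unapproved images, runs as a non-root user via `--user`, and mounts the root filesystem read-only via `--read-only`. The allowlist contents and the default UID/GID are hypothetical placeholders.

```python
# Hypothetical allowlist of approved, small base images.
APPROVED_IMAGES = {"alpine:3.5", "debian:jessie-slim"}


def docker_run_args(image, command, uid=1000, gid=1000):
    """Build a `docker run` argument list that follows the rules above."""
    if image not in APPROVED_IMAGES:
        raise ValueError(f"image {image!r} is not on the approved list")
    return [
        "docker", "run", "--rm",
        "--read-only",              # container root filesystem is read-only
        "--user", f"{uid}:{gid}",   # do not run the process as root
        image, *command,
    ]
```

The resulting list can be passed to `subprocess.run`. Applications that need scratch space would additionally need a writable volume or tmpfs, which this sketch leaves out.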

Docker

• KISS - Keep It Simple, Stupid!

Drumroll Please

The “Cloud” makes Platform Ops a reality. We can now program and automate “all the things” and we have the tools to make our infrastructure and applications maintain and heal themselves …

And we get to sleep at night

411

James Huston

Director of Platform Engineering @ Red Ventures

james@jameshuston.net

@hustonjs
