Pie on AWS
About me
• Kuan-Yen Heng (Chris)
• Software Engineer at Pie (pie.co)
• Primarily backend and DevOps
• [email protected]
• https://github.com/gigablah
• @gigablah
About Pie
• Chat for work
• Multi device sync (web, iOS, Android)
• Rich media integration
• We build Pie using Pie!
Requirements
• Realtime websocket messaging
• Horizontal scalability with autoscaling
• Load balancing across availability zones
• Job queue / background worker system
• Rapid develop-build-test-deploy cycle
• Zero downtime deployment with rollback
Infrastructure as code
• We use Terraform to define and manage our staging and production clusters
• AWS resources (VPC, Security Groups, Launch Configurations, Autoscaling Groups, Instances, Load Balancers) configured in HCL
• Version control for your infrastructure
• Separate planning and execution phases
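As a sketch of what this looks like in practice (resource names, AMI ID, sizes and zones below are placeholders, not Pie's actual configuration), a launch configuration and autoscaling group spanning availability zones can be defined in HCL like so:

```hcl
# Hypothetical sketch: names, AMI and instance sizes are placeholders
resource "aws_launch_configuration" "api" {
  name_prefix     = "api-"
  image_id        = "ami-xxxxxxxx"        # e.g. a CoreOS AMI
  instance_type   = "m3.medium"
  security_groups = ["${aws_security_group.api.id}"]
  user_data       = "${file("cloud-config.yml")}"
}

resource "aws_autoscaling_group" "api" {
  name                 = "api"
  launch_configuration = "${aws_launch_configuration.api.name}"
  availability_zones   = ["us-east-1a", "us-east-1b", "us-east-1c"]
  min_size             = 3
  max_size             = 9
  load_balancers       = ["${aws_elb.api.name}"]
}
```

Running `terraform plan` shows the pending changes without touching AWS; `terraform apply` executes them, which is the separation of planning and execution phases mentioned above.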
Cons
• Terraform is not yet mature
• Not all AWS resources and parameters supported
• Currently not possible to port in existing infrastructure
• Considering AWS CloudFormation
Docker workflow
[Diagram: Docker workflow]

Dockerfile → docker build / docker tag → image → docker run → container → docker commit → image
image → docker push → registry → docker pull → image

FROM debian:wheezy
MAINTAINER blah <[email protected]>
RUN apt-get update && apt-get install -y rabbitmq-server
EXPOSE 5672 15672
ENTRYPOINT ["/bin/bash", "-c"]
CMD ["/usr/sbin/rabbitmq-server"]
Why containers?
• Lightweight, fast startup compared to VMs
• Repeatable, consistent builds
• Dependency isolation
• Pristine host OS; only Docker installed
• Homogenous hosts, easier management
• “Servers as cattle”
Scheduling units
• Basically writing systemd units
• Fleet specific metadata [X-Fleet]
• Schedule global units, specify constraints and dependencies, restart policies
• Deploy units based on machine fleet metadata, e.g. role=api and role=worker
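A minimal fleet unit along these lines might look as follows; the unit name, registry URL and image name are illustrative, not Pie's actual files. It is an ordinary systemd service plus the fleet-specific [X-Fleet] section:

```ini
# [email protected] -- illustrative sketch
[Unit]
Description=Pie API (%i)
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
# "-" prefix: ignore failure if no old container exists
ExecStartPre=-/usr/bin/docker kill api-%i
ExecStartPre=-/usr/bin/docker rm api-%i
ExecStartPre=/usr/bin/docker pull registry.example.com/api:%i
ExecStart=/usr/bin/docker run --name api-%i registry.example.com/api:%i
ExecStop=/usr/bin/docker stop api-%i

[X-Fleet]
# Run one instance on every machine tagged role=api
Global=true
MachineMetadata=role=api
```

Setting Global=true schedules the unit on every machine matching the metadata constraint, which is how one unit file ends up running across all three availability zones in the listing below.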
> fleetctl list-units
UNIT                              MACHINE                  ACTIVE  SUB
api-discovery@master_123.service  75e1c8bd.../10.0.10.xxx  active  running
api-discovery@master_123.service  f54a4d78.../10.0.11.xxx  active  running
api-discovery@master_123.service  320af1d0.../10.0.12.xxx  active  running
api-proxy.service                 75e1c8bd.../10.0.10.xxx  active  running
api-proxy.service                 f54a4d78.../10.0.11.xxx  active  running
api-proxy.service                 320af1d0.../10.0.12.xxx  active  running
api@master_123.service            75e1c8bd.../10.0.10.xxx  active  running
api@master_123.service            f54a4d78.../10.0.11.xxx  active  running
api@master_123.service            320af1d0.../10.0.12.xxx  active  running
logspout.service                  75e1c8bd.../10.0.10.xxx  active  running
logspout.service                  17291bf6.../10.0.11.xxx  active  running
logspout.service                  320af1d0.../10.0.12.xxx  active  running
logspout.service                  e1c8ca4c.../10.0.10.xxx  active  running
logspout.service                  f54a4d78.../10.0.11.xxx  active  running
logspout.service                  d28b5a20.../10.0.12.xxx  active  running
logspout.service                  db206400.../10.0.10.xxx  active  running
rabbitmq.service                  e1c8ca4c.../10.0.10.xxx  active  running
rabbitmq.service                  17291bf6.../10.0.11.xxx  active  running
rabbitmq.service                  d28b5a20.../10.0.12.xxx  active  running
[email protected]            e1c8ca4c.../10.0.10.xxx  active  running
[email protected]            17291bf6.../10.0.11.xxx  active  running
[email protected]            d28b5a20.../10.0.12.xxx  active  running
[Diagram: deployment flow]

“hubot deploy api:master” → hubot on the bastion host pulls the deploy container from the registry → cluster machines (docker / fleet / etcd) pull the api container
monitoring / metrics container
• etsy/statsd
• datadog/docker-dd-agent
• scoutapp/docker-scout
• logentries/docker-logentries
Amazon CloudWatch
Amazon ECS
• CoreOS: too many moving parts?
• fleet and etcd still evolving
• Problems with btrfs
• Studying a move to Amazon Linux with ECS
ECS parallels
• The ECS agent container takes the place of fleetd
• Cluster and task management through the AWS CLI
• Task definitions in JSON
• ECS handles container lifecycle; in fleet unit files you still have to manage your containers
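For comparison with the fleet unit above, an ECS task definition is plain JSON; the family, image and port values here are made up for illustration:

```json
{
  "family": "api",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "registry.example.com/api:master_123",
      "cpu": 512,
      "memory": 512,
      "essential": true,
      "portMappings": [
        { "containerPort": 8080, "hostPort": 8080 }
      ]
    }
  ]
}
```

It would be registered with `aws ecs register-task-definition --cli-input-json file://api.json` and started with `aws ecs run-task`; unlike the fleet unit, there are no docker pull/run/stop commands to write, since the ECS agent manages the container lifecycle itself.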