how we dockerized a startup? #meetup #docker
TRANSCRIPT
#startup #newyork #marseille #r&d #frenchtech #ecommerce #bpi #shake #network #bigdata
#dockerlover #opensourcelover #hashtaglover
PRODUCTION
Web BigData
- 2 physical servers- Multiple VMs- Chef & Chef server
- 2 physical servers- Stack Cloudera (Hadoop/Spark/...)
A lot of technos joined Yuzu during the MVP
Welcome to nodejs, scala, elasticsearch, hbase, redis, kafka, couchbase,...
Be kind, rewindSpring cleaningMicroservices / 12 factorsResources isolationImproved securityContinuous deploymentStart-up compliant workflowOrchestration & supervision
{ "container": { "type": "DOCKER", "docker": { "image": "redis:2.6.17", "network": "BRIDGE", "portMappings": [ { "containerPort": 6379, "hostPort": 0, "protocol": "tcp" } ] } }, "id": "redis", "instances": 1, "cpus": 0.5, "mem": 1024, "healthChecks": [ { "protocol": "TCP", "portIndex": 0 } ]}
MARATHONwants JSON
Collect everything collectd / fluentd / rsyslog
Convert to async messages kafka
Filter and keep valuable data riemann / elasticsearch / influxdb
Generate realtime alerts riemann
Addictive dashboards kibana / grafana / riemann-dash
use data containers for valuable data
storage layer may crash / changeeasy to forget when cleaning unused
containers / images
DON’T
use shared storage (glusterfs, nfs, ...) for your codebase, working dirs, config files or sessions
sloooooooooowgenerate scary lock errors and timeouts
fscache crash full systems easily
DON’T
tag your custom images with the VCS commit hash
● makes your workflow better (same version tag in VCS and images)
● prevent useless image rebuilds ( speedup deploys )● easy way to know exactly which code version is running,
even if the tag was changed / deleted in the VCS
DO
avoid shared filesystems
● put all versioned data in containers● use external object storage for user files (S3, ceph, swift ,...)● use database / memcached / couchbase for sessions● use templates to generate local config files (consul-template ,
confd)
DO
Use docker independant storage for critical data
● lvm is your friend (unless you use and saturate thin volume metadata, use thin volumes with care)
● use storage plugins with docker >= 1.9.0 (convoy ?)● redundant backups saves lives
DO
- we tried to put the chicken in the egg
- generate config files to shared storage
- dns for discovery
- use mongo replicaset without sharding
- ...
Switch from registrator to mesos-consul
Remove the lasts SPOFs
Move user files to S3
Use the docker storage/network plugins
XDCC
Improve logs & metrics filtering and alerting
PHP/Symfony ReactJS Scala Go Clojure NodeJS Spark Langages/technos
MongoDB Redis CouchBase Hdfs/HBase ElasticSearch Databases
InfluxDB Sentry Statsd Grafana Riemann Collectd Kibana FluentdMonitoring
Gearman KafkaQueue
Jenkins DoployContinuous Integration & Delivery
Route53 S3 CloudFrontAWS
Debian Docker Mesos Registrator Zookeeper Consul Marathon ChronosSYSTEM
Mesos and the yellow elephant- yarn, mesos, or both ? (myriad)
- hdfs/hbase on mesos, not so simple (still testing)
- kafka is awesome (we run it with docker)
- spark on mesos
- elasticsearch, born to cluster, also in docker
2015Other mesos/docker stories
IPROFS A large scale php/drupal worldwide social app for “Institut français”
ARTE.TV We are migrating all their vod and svod services to mesos/docker (java apps)
VIXNSA mesos/docker cluster is collecting all logs and metrics from hundreds of servers