Scaling Open edX with Kubernetes
DevOpsDays Boston
9.15.2015
Who we are
Nate Aune
Morgan Robertson
What we’ll cover
● Background -- Open edX
● Introducing Kubernetes
● Kubernetes concepts
● Scaling + resiliency
● Open edX on Kubernetes
Open edX background
● edX: non-profit founded by MIT and Harvard
● 500+ courses, 5M students learning on edX.org
● edX released Open edX in June 2013
● Stanford, MongoDB, Salesforce, Google, Microsoft,
McKinsey, Johnson & Johnson, Smithsonian
Open edX - a catalyst for innovation
212 Contributors
One of the fastest-growing open source projects on GitHub
Technical components
LMS/CMS (Django/Python)
Forum (Sinatra/Ruby)
User DB (MySQL)
Course DB (MongoDB)
Tasks (Celery/RabbitMQ)
Caching (Memcached)
Proxy (Nginx)
Search (Elasticsearch)
MapReduce (Hadoop)
Hosting infrastructure
S3 for serving:
● static assets
● grade downloads
● certificate downloads
● videos (for mobile)
● Load balancer
● Application server(s)
● Database server(s)
● Search server
● Utility server (tasks)
● Caching server
● Hadoop cluster
Typical scalable deployment of Open edX on AWS
Introducing Kubernetes
● Scheduling + orchestration layer for containerized applications
● Abstracts your infrastructure
● Open source project by Google
● Production-ready as of July 2015
Kubernetes architecture
Kubernetes vs. the Docker triad
Kubernetes Swarm Compose Machine
Scheduling ✔ ✔
Service discovery ✔ ✅
Container scaling ✔ ✔
Machine provisioning ✅ ✔
Health checking ✔
Secret management ✔
Production-ready ✔
Kubernetes core concepts
● Pods
● Services
● Replication controllers
Pods
● Group of containers + volumes scheduled together
● Smallest deployable unit
● Containers share certain resources including network stack
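As a rough illustration of the pod concept, the sketch below builds a v1 Pod manifest as a Python dict and prints it as JSON, which Kubernetes accepts alongside YAML. The container names, image names, port, and mount path are hypothetical placeholders, not taken from the actual Open edX deployment.

```python
import json


def make_pod_manifest(name, labels):
    """Build a v1 Pod manifest with two containers that share one volume."""
    shared_mount = {"name": "shared-data", "mountPath": "/data"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            # An emptyDir volume lives as long as the pod does.
            "volumes": [{"name": "shared-data", "emptyDir": {}}],
            "containers": [
                {
                    "name": "lms",
                    "image": "example/edxapp-lms:latest",  # hypothetical image
                    "ports": [{"containerPort": 8000}],    # hypothetical port
                    "volumeMounts": [dict(shared_mount)],
                },
                {
                    "name": "log-shipper",
                    "image": "example/log-shipper:latest",  # hypothetical image
                    "volumeMounts": [dict(shared_mount)],
                },
            ],
        },
    }


if __name__ == "__main__":
    pod = make_pod_manifest("lms", {"app": "edxapp", "tier": "lms"})
    print(json.dumps(pod, indent=2))
```

Both containers are scheduled together onto the same node, mount the same volume, and can reach each other over localhost because they share the pod's network stack.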
Services
● Endpoint for a set of pods
● IP address, port, and label selectors
● Use round-robin routing to direct traffic to backend pods
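The idea of label selectors plus round-robin routing can be modeled in a few lines of plain Python. This is only a toy model of the concept: in a real cluster the routing happens at the network level, and the pod names and labels here are made up for illustration.

```python
from itertools import cycle


def matches(selector, labels):
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


class Service:
    """Toy model of a service: one stable endpoint in front of matching pods."""

    def __init__(self, name, selector, pods):
        self.name = name
        self.selector = selector
        self.backends = [p["name"] for p in pods if matches(selector, p["labels"])]
        if not self.backends:
            raise ValueError("no pods match selector %r" % (selector,))
        self._next_backend = cycle(self.backends)

    def route(self):
        """Return the backend pod that receives the next request."""
        return next(self._next_backend)


if __name__ == "__main__":
    pods = [
        {"name": "lms-1", "labels": {"app": "lms"}},
        {"name": "lms-2", "labels": {"app": "lms"}},
        {"name": "forum-1", "labels": {"app": "forum"}},
    ]
    lms = Service("lms", {"app": "lms"}, pods)
    print([lms.route() for _ in range(4)])
```

Clients only ever talk to the service; which pod answers is decided per request, so pods can come and go without clients noticing.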
Services + Pods
Replication Controllers
● Manage pod lifecycles for a number of replicas
● Provide scaling + fault tolerance
● Use label selectors
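A replication controller ties a replica count to a label selector and a pod template. The sketch below assembles such a v1 manifest as a Python dict; the controller name, image, and port passed in the example are hypothetical.

```python
import json


def make_replication_controller(name, replicas, labels, image, port):
    """Build a v1 ReplicationController manifest.

    The selector and the pod template labels must agree, otherwise the
    controller would not recognize the pods it creates.
    """
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": dict(labels),
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    rc = make_replication_controller(
        "lms", 3, {"app": "lms"}, "example/edxapp-lms:latest", 8000  # hypothetical values
    )
    print(json.dumps(rc, indent=2))
```

Scaling up or down then amounts to changing the `replicas` value; the pod template stays the same.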
Pods + Services + Replication Controllers
Scaling with Kubernetes
● Replication controllers scale pods
● Services provide a single endpoint for a group of pods
● The Kubernetes master schedules pods across nodes
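The interplay between replica counts and scheduling can be sketched as two small functions. This is a deliberately simplified model: the "least-loaded node" rule is an assumption made for illustration, and the real scheduler weighs resource requests and other constraints. All pod and node names are hypothetical.

```python
def reconcile(desired, running):
    """Compare desired replicas with running pods.

    Returns (number_of_pods_to_start, list_of_pods_to_stop).
    """
    diff = desired - len(running)
    if diff >= 0:
        return diff, []
    # Too many pods: stop the surplus ones at the end of the list.
    return 0, running[diff:]


def schedule(new_pods, node_load):
    """Place each new pod on the node that currently runs the fewest pods.

    Ties are broken by node name so the result is deterministic.
    """
    load = dict(node_load)
    placement = {}
    for pod in new_pods:
        node = min(load, key=lambda n: (load[n], n))
        placement[pod] = node
        load[node] += 1
    return placement


if __name__ == "__main__":
    to_start, to_stop = reconcile(desired=4, running=["lms-1"])
    new_pods = ["lms-%d" % i for i in range(2, 2 + to_start)]
    print(schedule(new_pods, {"node-1": 1, "node-2": 0}))
```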
Resiliency with Kubernetes
● Replication controllers ensure a number of pods are running
● Services provide load balancing
● Health checks allow bad pods to be ignored/removed
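How health checks keep traffic away from bad pods can be modeled as a small tracker that counts consecutive probe failures. The failure threshold and pod names are assumptions of this sketch, not values from the talk.

```python
class HealthTracker:
    """Track consecutive probe failures and filter out unhealthy pods."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = {}

    def record(self, pod, ok):
        """Record one probe result; a success resets the failure count."""
        self.failures[pod] = 0 if ok else self.failures.get(pod, 0) + 1

    def is_healthy(self, pod):
        return self.failures.get(pod, 0) < self.failure_threshold

    def endpoints(self, pods):
        """Return only the pods that should keep receiving traffic."""
        return [pod for pod in pods if self.is_healthy(pod)]


if __name__ == "__main__":
    tracker = HealthTracker(failure_threshold=2)
    pods = ["lms-1", "lms-2"]
    tracker.record("lms-2", ok=False)
    tracker.record("lms-2", ok=False)
    print(tracker.endpoints(pods))  # lms-2 is skipped after two failed probes
```

A pod dropped this way leaves the service's backend list, and the replication controller can start a replacement so the replica count is restored.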
Open edX on Kubernetes
● Goals:
○ Multi-tenant
○ Scalable + resilient
The challenge
Architecture
Monitoring with Sysdig
Sysdig drill-down
Lessons learned
● Containers should be stateless
● Put initialization tasks into separate pods that run once
● Services can be used to abstract non-containerized components
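Two of these lessons map onto small manifests, sketched here as Python dicts. A service defined without a selector, paired with an Endpoints object of the same name, can point at a component running outside the cluster; a pod with `restartPolicy` set to `Never` runs its task once. The names, IP address, port, image, and command below are hypothetical.

```python
import json


def make_external_service(name, ip, port):
    """Build a selector-less Service plus a matching Endpoints object.

    With no selector, Kubernetes does not manage the endpoints itself,
    so they are supplied by hand and can point outside the cluster.
    """
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"ports": [{"port": port, "targetPort": port}]},
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name},  # must match the Service name
        "subsets": [{"addresses": [{"ip": ip}], "ports": [{"port": port}]}],
    }
    return service, endpoints


def make_run_once_pod(name, image, command):
    """Build a Pod for a one-off initialization task (never restarted)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{"name": name, "image": image, "command": list(command)}],
        },
    }


if __name__ == "__main__":
    svc, eps = make_external_service("mysql", "10.0.0.50", 3306)  # hypothetical host
    init = make_run_once_pod(
        "lms-migrate", "example/edxapp-lms:latest", ["./manage.py", "migrate"]  # hypothetical
    )
    for manifest in (svc, eps, init):
        print(json.dumps(manifest, indent=2))
```

Application pods can then reach the database through the service name, whether the database is containerized or not.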
Conclusion
● We’re still learning, but...
○ Kubernetes is a promising technology for providing both scalability and resiliency
More info
Open edX - http://open.edx.org
Kubernetes - http://kubernetes.io
Google Container Engine - http://cloud.google.com/container-engine
Thank you for your time!
Questions?
Slides: http://bit.ly/open-edx-kubernetes
[email protected]@appsembler.com
@appsembler