running cassandra on amazon's ecs

Upload: anirvan-chakraborty

Post on 06-Jan-2017

TRANSCRIPT

Page 1: Running Cassandra on Amazon's ECS

Running Cassandra on Amazon’s ECS

Anirvan Chakraborty

@anirvan_c

Page 2: Running Cassandra on Amazon's ECS

© 2016 Anirvan Chakraborty CC BY-NC-SA 4.0

Agenda

• Motivation
• Docker
• Cassandra
• Cassandra on Docker best practices
• EC2 Container Service (ECS)
• Cassandra on ECS

Page 3: Running Cassandra on Amazon's ECS

Motivation

Page 4: Running Cassandra on Amazon's ECS

Motivation

• Ease of development
• Support polyglot languages, frameworks and components
• Operational simplicity
• Quick feedback loop

Page 5: Running Cassandra on Amazon's ECS

Docker

Page 6: Running Cassandra on Amazon's ECS

Docker history

• Came out of dotCloud, a PaaS company
• Was originally written in Python
• Was rewritten in Go in February 2013
• Docker 0.1 was released in March 2013
• Docker 1.10 is the latest release (as of this talk)

Page 7: Running Cassandra on Amazon's ECS

Docker tag line

Build, ship and run any app, anywhere

Page 8: Running Cassandra on Amazon's ECS

Docker tag line

• Build: package your application in a container
• Ship: move it between machines
• Run: execute that container with your application
• Any application: as long as it runs on Linux
• Anywhere: local VM, bare metal, cloud instances

Page 9: Running Cassandra on Amazon's ECS

Why Docker?

• Deploy reliably and consistently
• Execution is fast and lightweight
• Simplicity
• Developer-friendly workflow
• Fantastic community

Page 10: Running Cassandra on Amazon's ECS

Apache Cassandra

Page 11: Running Cassandra on Amazon's ECS

What is Apache Cassandra?

• Fast distributed database
• High availability
• Linear scalability
• Predictable performance
• No single point of failure
• Multi-DC
• Easy to manage
• Can use commodity hardware
• Not a drop-in replacement for an RDBMS

Page 12: Running Cassandra on Amazon's ECS

Hash ring

• Data is partitioned around the ring
• Location of data in the ring is determined by the partition key
• Data is replicated to N servers based on the replication factor (RF)
• All nodes hold data and can answer read or write queries

source: https://academy.datastax.com/courses/ds101-introduction-cassandra/introduction-cassandra-overview
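The ring placement described on this slide can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: it assumes one token per node, uses MD5 as a stand-in hash (Cassandra's default partitioner is Murmur3, and real clusters typically use many virtual nodes per host), and the `Ring` class and node names are hypothetical.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a position on the ring (MD5 is an illustrative stand-in)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy hash ring: one token per node, replicas chosen clockwise."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication factor must be between 1 and the node count")
        self.rf = replication_factor
        # Sort nodes by token so the list order is the clockwise ring order.
        self._ring = sorted((token_for(node), node) for node in nodes)
        self._tokens = [token for token, _ in self._ring]

    def replicas(self, partition_key: str):
        """Return the RF nodes holding the key.

        The first replica is the node with the smallest token at or after the
        key's token (wrapping around); the rest are the next nodes clockwise.
        """
        start = bisect.bisect_left(self._tokens, token_for(partition_key))
        size = len(self._ring)
        return [self._ring[(start + i) % size][1] for i in range(self.rf)]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    print(ring.replicas("user:42"))
```

Because every node can compute the same placement from the partition key alone, any node can act as coordinator for a read or write.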

Page 13: Running Cassandra on Amazon's ECS

CAP Tradeoff

• During a network partition it is impossible to be both consistent and highly available

• Latency between data centres also makes consistency impractical

• Cassandra chooses Availability & Partition tolerance over Consistency

Page 14: Running Cassandra on Amazon's ECS

Replication

• Choose a “replication factor” (RF)
• Data is always replicated to each replica
• If a node is down, missing data is replayed via hinted handoff

source: https://academy.datastax.com/courses/ds101-introduction-cassandra/introduction-cassandra-overview

Page 15: Running Cassandra on Amazon's ECS

Consistency level

source: https://academy.datastax.com/courses/ds101-introduction-cassandra/introduction-cassandra-overview

• Per-query consistency
• ALL, QUORUM, ONE
• How many replicas must respond OK for a query to succeed
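As a worked example of the three levels above, the number of replica acknowledgements each one needs follows directly from the replication factor. The sketch below is illustrative only: the function names are hypothetical, and it covers just the three levels listed on the slide. A quorum is a majority of replicas, i.e. floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond OK for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return levels[level.upper()]


def read_overlaps_write(read_level: str, write_level: str, replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    reads = required_acks(read_level, replication_factor)
    writes = required_acks(write_level, replication_factor)
    return reads + writes > replication_factor


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, required_acks(level, 3))
    print(read_overlaps_write("QUORUM", "QUORUM", 3))
```

With RF = 3, QUORUM reads and QUORUM writes each need two replicas, so any read is guaranteed to touch at least one replica that saw the latest write; ONE/ONE gives no such overlap.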

Page 16: Running Cassandra on Amazon's ECS

Cassandra on Docker

Page 17: Running Cassandra on Amazon's ECS

Dockerize C* Dev Environment

• Make it run as slow, but as stable as possible!
• Super low memory settings in cassandra-env.sh
  • MAX_HEAP_SIZE="128M"
  • HEAP_NEWSIZE="24M"
• Remove caches in dev mode in cassandra.yaml
  • key_cache_size_in_mb: 0
  • reduce_cache_sizes_at: 0
  • reduce_cache_capacity_to: 0

Page 18: Running Cassandra on Amazon's ECS

Dockerize C* Production

• Use host networking (--net=host) for better network performance
• Put data, commitlog and saved_caches in volume-mounted folders on the underlying host
• Run Cassandra in the foreground (-f)
• Tune the JVM heap for optimal size
• Tune the JVM garbage collector for your workload
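The first three recommendations above translate into a `docker run` invocation roughly like the one assembled below. This is a sketch under assumptions: the image name, the host directory and the container-side `/var/lib/cassandra/...` paths are illustrative (they match Cassandra's usual package defaults, but check your own image), and it assumes the image either has no entry point or one that executes the command it is given.

```python
import shlex


def cassandra_run_command(image: str, host_root: str, name: str = "cassandra"):
    """Build a docker run argv that applies the slide's production advice."""
    command = [
        "docker", "run", "--detach",
        "--name", name,
        "--net=host",  # share the host network stack instead of the Docker bridge
    ]
    # Keep data, commit log and saved caches on the host so they outlive the container.
    for directory in ("data", "commitlog", "saved_caches"):
        command += ["--volume", f"{host_root}/{directory}:/var/lib/cassandra/{directory}"]
    # -f keeps Cassandra in the foreground so Docker can supervise the process.
    command += [image, "cassandra", "-f"]
    return command


if __name__ == "__main__":
    argv = cassandra_run_command("cassandra", "/mnt/cassandra")
    print(shlex.join(argv))
```

Building the command as a list rather than a string avoids shell-quoting surprises if it is later passed to `subprocess.run`.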

Page 19: Running Cassandra on Amazon's ECS

Amazon EC2 Container Service

Page 20: Running Cassandra on Amazon's ECS

What is ECS?

…is a highly scalable, fast, container management service that makes it easy to run, stop, and manage Docker containers on a cluster of Amazon EC2 instances.

http://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html

Page 21: Running Cassandra on Amazon's ECS

What is ECS?

Amazon Docker as a Service

https://www.expeditedssl.com/aws-in-plain-english

Page 22: Running Cassandra on Amazon's ECS

How does ECS work?

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 23: Running Cassandra on Amazon's ECS

Cluster

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 24: Running Cassandra on Amazon's ECS

Container Instance

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 25: Running Cassandra on Amazon's ECS

ECS Agent

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 26: Running Cassandra on Amazon's ECS

Task

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 27: Running Cassandra on Amazon's ECS

Task

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Task definition

{
  "containerDefinitions": [
    {
      "name": "wordpress",
      "links": ["mysql"],
      "image": "wordpress",
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 80 }
      ],
      "memory": 500,
      "cpu": 10
    },
    {
      "name": "mysql",
      "image": "mysql",
      "cpu": 10,
      "memory": 500,
      "essential": true
    }
  ],
  "family": "hello_world"
}

Docker Container

Docker Container

Page 28: Running Cassandra on Amazon's ECS

Service

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 29: Running Cassandra on Amazon's ECS

ECS Service

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

Page 30: Running Cassandra on Amazon's ECS

Typical ECS workflow

• Build Docker image using whatever you want
• Push image to registry
• Create JSON file describing your task definition
• Register this task definition with ECS
• Make sure that your cluster has enough resources
• Start a new task from the task definition
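The last three steps of this workflow can be sketched against the ECS API. The function below is a hypothetical helper, not part of any library; it takes an ECS client as a parameter (for example `boto3.client("ecs")`, which is assumed rather than imported here) and uses the `register_task_definition` and `run_task` operations. A non-empty `failures` list in the `run_task` response is how a lack of cluster resources typically shows up.

```python
def register_and_run(ecs, cluster, family, container_definitions, count=1):
    """Register a task definition, then start tasks from it on the cluster.

    `ecs` is assumed to be a boto3 ECS client or any object exposing the same
    register_task_definition / run_task methods.
    """
    registered = ecs.register_task_definition(
        family=family,
        containerDefinitions=container_definitions,
    )
    task_definition_arn = registered["taskDefinition"]["taskDefinitionArn"]

    started = ecs.run_task(
        cluster=cluster,
        taskDefinition=task_definition_arn,
        count=count,
    )
    # ECS reports placement problems (e.g. not enough memory) in "failures".
    failures = started.get("failures") or []
    if failures:
        raise RuntimeError(f"ECS could not place all tasks: {failures}")
    return [task["taskArn"] for task in started["tasks"]]


# Example call (requires AWS credentials and boto3; names are illustrative):
#   import boto3
#   arns = register_and_run(
#       boto3.client("ecs"), "demo-cluster", "cassandra",
#       [{"name": "cassandra", "image": "cassandra", "memory": 500, "essential": True}],
#   )
```

Passing the client in keeps the helper testable without AWS access.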

Page 31: Running Cassandra on Amazon's ECS

Cassandra on ECS

Page 32: Running Cassandra on Amazon's ECS

Dockerize C* on ECS

• Simple service discovery using the ECS API
• Custom Dockerfile and entry-point script to control Cassandra configuration
• Clean up downed nodes and repair the cluster on node startup

Page 33: Running Cassandra on Amazon's ECS

Service Discovery

Using

• AWS EC2 API
• AWS ECS API
• AWS Metadata URL

1. Find all running tasks for a cluster
2. Find EC2 instance ids for the tasks
3. Find private IP addresses for the container instances
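The three numbered steps above map onto a short chain of ECS and EC2 API calls. The sketch below is one possible reading, not the talk's actual code: `private_ips_for_cluster` is a hypothetical helper, the two clients are assumed to be boto3 clients (`boto3.client("ecs")` and `boto3.client("ec2")`) passed in by the caller, and pagination is ignored, so only the first page of tasks is considered. The metadata URL mentioned on the slide is not covered here.

```python
def private_ips_for_cluster(ecs, ec2, cluster):
    """Return the private IPs of the EC2 instances running tasks in the cluster."""
    # 1. Find all running tasks for the cluster.
    task_arns = ecs.list_tasks(cluster=cluster, desiredStatus="RUNNING")["taskArns"]
    if not task_arns:
        return []

    # 2. Map tasks to container instances, then to EC2 instance ids.
    tasks = ecs.describe_tasks(cluster=cluster, tasks=task_arns)["tasks"]
    container_instance_arns = sorted({task["containerInstanceArn"] for task in tasks})
    container_instances = ecs.describe_container_instances(
        cluster=cluster,
        containerInstances=container_instance_arns,
    )["containerInstances"]
    instance_ids = [ci["ec2InstanceId"] for ci in container_instances]

    # 3. Look up the private IP address of each EC2 instance.
    reservations = ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]
    return sorted(
        instance["PrivateIpAddress"]
        for reservation in reservations
        for instance in reservation["Instances"]
    )
```

The resulting list could then serve as the seed list handed to a new Cassandra node.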

Page 34: Running Cassandra on Amazon's ECS

Custom Dockerfile

• Expose Cassandra configuration parameters as Docker environment variables

• Update entry-point script to use Docker environment variables to configure Cassandra
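One way an entry-point script might apply such environment variables is to rewrite the matching top-level keys in cassandra.yaml before starting Cassandra. The sketch below is an assumption-laden illustration: the variable names and the `ENV_TO_YAML_KEY` mapping are hypothetical, only simple top-level `key: value` lines are handled, and nested settings such as the seed list would need a real YAML parser.

```python
import re

# Hypothetical mapping from environment variable to top-level cassandra.yaml key.
ENV_TO_YAML_KEY = {
    "CASSANDRA_CLUSTER_NAME": "cluster_name",
    "CASSANDRA_LISTEN_ADDRESS": "listen_address",
    "CASSANDRA_RPC_ADDRESS": "rpc_address",
}


def apply_env(yaml_text: str, env: dict) -> str:
    """Overwrite top-level keys in the YAML text with values from env.

    Matches both active ("key: value") and commented-out ("# key: value")
    lines; keys without a corresponding environment variable are untouched.
    """
    for variable, key in ENV_TO_YAML_KEY.items():
        if variable not in env:
            continue
        pattern = re.compile(rf"^(?:# ?)?{re.escape(key)}:.*$", re.MULTILINE)
        replacement = f"{key}: {env[variable]}"
        # A function replacement avoids backslash interpretation in the value.
        yaml_text = pattern.sub(lambda _match, text=replacement: text, yaml_text)
    return yaml_text


if __name__ == "__main__":
    sample = "cluster_name: 'Test Cluster'\n# listen_address: localhost\nnum_tokens: 256\n"
    print(apply_env(sample, {"CASSANDRA_CLUSTER_NAME": "ecs-demo",
                             "CASSANDRA_LISTEN_ADDRESS": "10.0.0.5"}))
```

In a container, `env` would simply be `os.environ`, and the script would finish by exec-ing Cassandra in the foreground.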

Page 35: Running Cassandra on Amazon's ECS

Node Startup

1. Find the IPs of all running seed nodes
2. Check and identify any downed nodes in the cluster: nodetool status
3. Remove the downed nodes: nodetool removenode
4. Repair the cluster: nodetool repair
5. Start a new Cassandra node
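Steps 2 and 3 can be sketched as a small parser over `nodetool status` output. This is an illustration under assumptions: the helper names are hypothetical, the sample output is made up to resemble the usual layout (a two-letter status/state code followed by address, load, tokens, ownership, host ID and rack), and the removal subcommand used is `nodetool removenode <host-id>`. Host IDs are matched by their UUID shape rather than by column position, since the load column can vary in width.

```python
import re

HOST_ID = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b")

# Illustrative sample only; real output depends on the Cassandra version.
SAMPLE_STATUS = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  10.0.0.11  105.3 KB  256     ?     5f1c2a3e-0b7d-4c1e-9a55-1d2e3f4a5b6c  rack1
DN  10.0.0.12  98.7 KB   256     ?     8a9b0c1d-2e3f-4a5b-8c7d-6e5f4a3b2c1d  rack1
"""


def down_host_ids(status_output: str):
    """Return the host IDs of nodes whose status letter is D (down)."""
    host_ids = []
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a two-letter code: U/D, then N/L/J/M.
        if not fields or len(fields[0]) != 2:
            continue
        status, state = fields[0]
        if status != "D" or state not in "NLJM":
            continue
        match = HOST_ID.search(line)
        if match:
            host_ids.append(match.group(0))
    return host_ids


def removal_commands(status_output: str):
    """Build one `nodetool removenode` argv per downed node."""
    return [["nodetool", "removenode", host_id] for host_id in down_host_ids(status_output)]


if __name__ == "__main__":
    for command in removal_commands(SAMPLE_STATUS):
        print(" ".join(command))
```

The commands are returned rather than executed, so the caller can log or review them before running `nodetool repair` and starting the new node.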

Page 36: Running Cassandra on Amazon's ECS

Work in progress

Page 37: Running Cassandra on Amazon's ECS

Where to next?

• Open source C* Dockerfile & entry-point script
• Use Consul for Service Discovery in ECS
• Consider Flocker for managing volumes with Dockerized C*
• Consider Weave for forming C* cluster across network domains

Page 38: Running Cassandra on Amazon's ECS

Thanks!

Twitter: @cakesolutions
Tel: 0845 617 1200
Email: [email protected]
Jobs: http://www.cakesolutions.net/careers

We are Hiring!

Page 39: Running Cassandra on Amazon's ECS

Resources

https://academy.datastax.com/courses/ds101-introduction-cassandra

http://www.allthingsdistributed.com/2015/07/under-the-hood-of-the-amazon-ec2-container-service.html

http://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html

https://www.expeditedssl.com/aws-in-plain-english

https://github.com/aws/amazon-ecs-agent

https://blog.docker.com/2015/03/why-i-love-docker-and-why-youll-love-it-too/

http://www.datastax.com/resources/whitepapers/best-practices-running-datastax-enterprise-within-docker