cassandra@coursera: aws deploy and mysql transition
Touches on what Coursera aims to get out of Cassandra, what goes into a good deployment, and our experience so far transitioning off MySQL.

Cassandra @ Coursera
Deploying in AWS, MySQL Transition
Daniel Chia @DanielJHChia
Software Engineer, Infrastructure

Overview
• Why Cassandra
• What goes into a good deployment
• MySQL → Cassandra transition experience


110 partners
698 courses
8.5 million learners

A Coursera Course



Your Final Project
This is your chance to apply the course concepts to real-world situations


Identity Verified Certificates

Technical
• 100% hosted on AWS
• Service-oriented architecture
• Mix of MySQL and Cassandra for persistence

What do we care about?

We care about…
• Availability
• Scalability
• Operational Ease
• Latency
• (Bonus) Multi-region writes

Availability matters


EBS Outage (2012)
Master us-east-1a
Slave us-east-1c

Scalability

Sharded by class
• Machine 1: class1, class2, class3, class4, class5
• Machine 2: class6, class7, class8, class9, class10
• Machine 3: class11, class12, class13, class14, class15

New use-case
Uh-oh… doesn’t fit in existing sharding
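To make the sharding problem concrete, here is a minimal sketch of class-based shard routing. The layout (five classes per machine) follows the slide; the function names are hypothetical, and the "query not scoped to a class" is an assumed example of a use-case that does not fit, since the slides do not say what the new use-case was.

```python
NUM_MACHINES = 3
CLASSES_PER_MACHINE = 5  # as on the slide: class1-5, class6-10, class11-15


def machine_for_class(class_id: int) -> int:
    """Map a class number (1-15) to the machine (1-3) holding its shard."""
    return (class_id - 1) // CLASSES_PER_MACHINE + 1


def machines_for_query(class_id=None):
    """Return the machines a query has to touch.

    A query scoped to one class hits exactly one machine. A query with no
    class in it (assumed example: everything for one learner) has no shard
    key to route on, so it fans out to every machine.
    """
    if class_id is not None:
        return [machine_for_class(class_id)]
    return list(range(1, NUM_MACHINES + 1))


print(machines_for_query(class_id=7))  # [2]
print(machines_for_query())            # [1, 2, 3]
```

The scatter-gather case is what makes a new access pattern painful under a fixed sharding scheme: every machine takes part in every such query.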

We care about…
• Availability
• Scalability
• Operational Ease
• Performance
• (Bonus) Multi-region

So we decided to…
Try Cassandra!

Cassandra ≠ [database XYZ]

“But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.”
–Albert Einstein

Time to deploy Cassandra!
sudo apt-get install dse-full

A good deployment
• Machine-level
• Cluster-level

Picking a machine
• Disk
• IOPS… IOPS… IOPS
• Latency
Author: D-Kuru/Wikimedia Commons Licence: CC-BY-SA-3.0-AT

Picking a machine
• CPU
Author: Mark Sze Licence: CC BY-NC-ND 2.0

Picking a machine
• Memory
• Save some for page cache!
Author: brutalSoCal Licence: CC BY-NC-ND 2.0

On AWS
• Ephemeral disks.
• Please don’t use EBS. Really.
• IOPS usually the problem
• Instance sizes:
• spinning disk: m1.large, m1.xlarge, m2.4xlarge
• ssd: m3.xlarge, c3.2xlarge, i2.*

Set up the machine
• Lots of documentation / talks about this
• Recommended reading: Datastax guide [1]
[1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

Cluster configuration
[Diagram: ring of three nodes A, B, C]

Priam
Care and feeding of Cassandra on AWS
https://github.com/Netflix/Priam

Cluster Topology
• We use RF=3
• Ring balanced within datacenter
• Nodes alternate racks (or AZs)

Cluster Topology (Priam)
• Token assignments stored in a database
• Can take over a token in the event of node failure

Cluster Topology (Priam)
• Priam assigns tokens evenly per region
• Alternates AZs within region
[Diagram: six-node ring, two nodes in each of az1, az2, az3]
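The topology bullets can be illustrated with a small sketch: evenly spaced tokens, with availability zones assigned round-robin around the ring. This is not Priam's actual code. The token range assumes the Murmur3Partitioner (-2^63 to 2^63-1), the function names are hypothetical, and "replicas are the next RF nodes on the ring" is a simplification of what Cassandra's replication strategies do.

```python
MIN_TOKEN = -(2 ** 63)  # assumed: Murmur3Partitioner token range
TOKEN_SPAN = 2 ** 64


def assign_ring(num_nodes, azs):
    """Return [(token, az), ...] with evenly spaced tokens and round-robin AZs."""
    step = TOKEN_SPAN // num_nodes
    return [(MIN_TOKEN + i * step, azs[i % len(azs)]) for i in range(num_nodes)]


def replica_azs(ring, start, rf=3):
    """AZs of the rf consecutive nodes starting at ring position `start`."""
    return [ring[(start + k) % len(ring)][1] for k in range(rf)]


ring = assign_ring(6, ["az1", "az2", "az3"])
for token, az in ring:
    print(token, az)

# With AZs alternating and RF=3, every run of 3 consecutive nodes covers
# all 3 AZs, so losing one AZ costs at most one replica of any range.
assert all(len(set(replica_azs(ring, i))) == 3 for i in range(len(ring)))
```

The property only holds at the wrap-around point when the node count is a multiple of the number of AZs, which is one reason to grow such a cluster in multiples of three.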

Autoscaling groups
• Recover from lost instance
• We don't use them for scaling with traffic

Important: Need one ASG per AZ
Single ASG, size 9:
• east-1a: 3 nodes
• east-1b: 3 nodes
• east-1c: 3 nodes

Important: Need one ASG per AZ
Single ASG, size 9, after a lost east-1c node is replaced in east-1b:
• east-1a: 3 nodes
• east-1b: 4 nodes
• east-1c: 2 nodes

Important: Need one ASG per AZ
• ASG-1a, size 3: east-1a, east-1a, east-1a
• ASG-1b, size 3: east-1b, east-1b, east-1b
• ASG-1c, size 3: east-1c, east-1c, east-1c (the lost node is replaced in east-1c)
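The difference between the two setups can be sketched as a toy model. It assumes, as a simplification, that a single ASG spanning all three AZs may launch a replacement in any of them; real placement depends on AWS's own balancing and capacity rules. The function names are hypothetical.

```python
from collections import Counter

AZS = ["east-1a", "east-1b", "east-1c"]
BALANCED = {"east-1a": 3, "east-1b": 3, "east-1c": 3}


def outcomes_single_asg(counts, lost_az):
    """All layouts possible when one ASG spanning every AZ replaces a lost node.

    Assumption: the replacement may land in any AZ the group spans.
    """
    after_loss = Counter(counts)
    after_loss[lost_az] -= 1
    results = []
    for az in AZS:
        layout = Counter(after_loss)
        layout[az] += 1
        results.append(dict(layout))
    return results


def outcome_per_az_asgs(counts, lost_az):
    """With one ASG per AZ, only the group that lost a node relaunches one."""
    layout = Counter(counts)
    layout[lost_az] -= 1  # the node dies
    layout[lost_az] += 1  # that AZ's ASG restores its own size of 3
    return dict(layout)


for layout in outcomes_single_asg(BALANCED, "east-1c"):
    print("single ASG :", layout)
print("ASG per AZ :", outcome_per_az_asgs(BALANCED, "east-1c"))
```

Only the per-AZ arrangement guarantees that the replacement lands in the AZ, and therefore the rack, that the lost node's token range expects.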

Backups
• Data on ephemeral disks
• Guard against application errors
• SSTables are immutable → ship to S3
• Priam does this
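Why immutability makes backups simple can be shown with a short sketch: a file that has already been shipped never changes, so each backup pass only needs to copy files it has not seen before. This is an illustration of the idea, not how Priam is implemented. A local directory stands in for the S3 bucket, and the function and file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def ship_new_sstables(data_dir: Path, backup_dir: Path) -> list:
    """Copy files not yet present in backup_dir; return the names copied.

    SSTable files are never modified after being written, so a name that
    already exists in the backup does not need to be uploaded again.
    backup_dir is a local stand-in for an S3 bucket.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for sstable in sorted(data_dir.iterdir()):
        target = backup_dir / sstable.name
        if sstable.is_file() and not target.exists():
            shutil.copy2(sstable, target)
            copied.append(sstable.name)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    data, backup = Path(tmp) / "data", Path(tmp) / "backup"
    data.mkdir()
    (data / "sstable-1-Data.db").write_text("a")
    first = ship_new_sstables(data, backup)
    (data / "sstable-2-Data.db").write_text("b")  # a later flush
    second = ship_new_sstables(data, backup)
    print(first, second)
```

The second pass ships only the newly flushed file, which is what keeps incremental backups cheap.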

Restore
• Have to be able to use your backup
• Also useful for QA / test
• Priam handles this rather nicely

Deployed!
Time to chill?
https://www.flickr.com/photos/spunkinator/2394514059 Creative Commons

Monitoring
Working / not working doesn’t count.

We have our own custom reporter agent for Datadog.
There’s pluggable reporter support in 2.0.2 now.

JVM GC woes

JVM GC woes
All happy now

SSTables Read Histogram

Questions?
Before we carry on


Transition takes
• time
• mindset shift
• expertise
• (some) risk

Our experience
• Pick one feature first
• Mindset shift
• Data modeling consulting
• Libraries / Patterns / Data-as-a-service

Pick one feature
• Don’t go all in on Cassandra with something important right away
• Work closely with that team

You probably will make mistakes
Oops!

Mindset shift
• Everyone knows SQL
• Not everyone knows Cassandra / NoSQL
• Need to know queries beforehand

Enrollment Example
• Learners enroll into a course
• learner (many-to-many) course
• Need to keep track of this membership

MySQL Model
CREATE TABLE `courses_learners` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `course_id` INT(11) NOT NULL,
  `learner_id` INT(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c_l` (`learner_id`, `course_id`),
  -- The REFERENCES clauses are elided on the slide
  CONSTRAINT `ref1` FOREIGN KEY (`course_id`),
  CONSTRAINT `ref2` FOREIGN KEY (`learner_id`)
)

Cassandra Style
CREATE TABLE courses_by_learner (
  learner_id uuid,
  course_id uuid,
  PRIMARY KEY (learner_id, course_id)
);
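The "need to know queries beforehand" point can be seen in a toy in-memory model of this table. It is only a sketch of how a primary key of (learner_id, course_id) shapes access: learner_id acts as the partition key and course_id as the clustering column. The class and method names are hypothetical, and plain strings stand in for the uuid values.

```python
from collections import defaultdict


class CoursesByLearner:
    """Toy model of a table with PRIMARY KEY (learner_id, course_id)."""

    def __init__(self):
        self._partitions = defaultdict(set)  # learner_id -> {course_id}

    def enroll(self, learner_id, course_id):
        # Writing the same key twice leaves one row, like a Cassandra upsert.
        self._partitions[learner_id].add(course_id)

    def courses_for(self, learner_id):
        """Single-partition read; rows come back in clustering order."""
        return sorted(self._partitions.get(learner_id, ()))

    def learners_in(self, course_id):
        """No partition key in the query: every partition must be scanned."""
        return sorted(l for l, cs in self._partitions.items() if course_id in cs)


table = CoursesByLearner()
table.enroll("learner-1", "course-b")
table.enroll("learner-1", "course-a")
table.enroll("learner-1", "course-a")  # duplicate enrollment is a no-op
table.enroll("learner-2", "course-a")

print(table.courses_for("learner-1"))  # ['course-a', 'course-b']
print(table.learners_in("course-a"))   # ['learner-1', 'learner-2']
```

The table answers "which courses is this learner in?" cheaply. The reverse question is the expensive one, which is why a Cassandra model usually adds a second table keyed the other way (for example a hypothetical learners_by_course) instead of relying on a secondary unique key as the MySQL model does.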

Data modeling consulting
• Build core team proficient at C* data modeling
• Available to consult for trickier use cases

Libraries / Patterns
• Abstract away simple (but common) use-cases
• Key-value storage
• Simple time series
• Maybe every developer won’t need deep C* knowledge?
• More radical: data as a service (e.g. STAASH)
STAASH: https://github.com/Netflix/staash

It’s a long road
but we’ll get there…
Author: Carissa Rogers License: CC BY 2.0

Conclusion
• Know Cassandra
• Know what makes a good deployment
• Know that new skills have to be acquired