Building a Key-Value Store with Cassandra

Kiwi PyCon 2010

Aaron Morton @aaronmorton

Weta Digital

Why Cassandra?

• Part of a larger project started earlier this year to build new systems for code running on the render farm of 35,000 cores

• Larger project goals were Scalability, Reliability, Flexible Schema

How about MySQL?

• It works. But...

• Schema changes

• Write redundancy

• Query language mismatch

• So went looking for the right tool for the job

Redis?

• Fast, flexible. But...

• Single core limit

• Replication, but no cluster (it's coming)

• Limited support options

CouchDB?

• Schema free, scalable (sort of), redundant (sort of). But...

• Single write thread limit

• Replication, but no cluster (it's coming)

• Low consistency with asynchronous replication

Cassandra?

• Just right, perhaps. Let's see...

• Highly available

• Tuneable synchronous replication

• Scalable writes and reads

• Schema free (sort of)

• Lots of new mistakes to be made

Availability

• Row data is kept together and replicated around the cluster

• Replication Factor is configurable

• Partitioner determines the position of a row key in the distributed hash table

• Replication Strategy determines where in the cluster to place the replicas
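The partitioner and replication strategy bullets above can be sketched in a few lines. This is a simplified illustration, not Cassandra's implementation: it assumes an MD5 based token (in the spirit of a random partitioner), a hypothetical four node ring with evenly spaced tokens, and a simple strategy that places replicas on the next nodes clockwise around the ring. The names `token`, `replicas` and `ring` are made up for this example.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 128  # MD5 produces 128-bit values


def token(row_key):
    """Map a row key to a position on the ring (assumed MD5 partitioner)."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


def replicas(row_key, ring, replication_factor):
    """Return the nodes that hold a row.

    ring is a list of (token, node_name) pairs sorted by token. The first
    replica is the first node whose token is >= the key's token, wrapping
    around at the end; the rest are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token(row_key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical cluster: four nodes with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, "node%d" % i) for i in range(4)]

print(replicas("Fred", ring, 3))
```

The same key always maps to the same nodes, so any client can work out where a row lives without a central lookup.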

Consistency

• Each read or write request specifies a Consistency Level

• Individual nodes may be inconsistent with respect to others

• Reads may give consistent results while some nodes have inconsistent values

• The entire cluster will eventually move to a state where there is one version of each value

Consistency

• R + W > N

• R = Read Consistency

• W = Write Consistency

• N = Replication Factor
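The R + W > N rule is simple arithmetic: if the number of replicas read plus the number written exceeds the replication factor, every read must overlap the latest write on at least one replica. A minimal sketch follows; the helper names are made up for illustration, and the level names ONE, QUORUM and ALL are used here only as labels for replica counts.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def reads_see_latest_write(r, w, n):
    """True when any r replicas read must overlap any w replicas written."""
    return r + w > n


N = 3  # replication factor
levels = [
    ("read ONE, write ONE", 1, 1),
    ("read QUORUM, write QUORUM", quorum(N), quorum(N)),
    ("read ONE, write ALL", 1, N),
]
for label, r, w in levels:
    print(label, "->", reads_see_latest_write(r, w, N))
```

With N = 3, reading and writing at ONE gives 1 + 1 = 2, which does not exceed 3, so a read may miss the latest write. Quorum reads and writes give 2 + 2 = 4, which does.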

• Distributed hash table

• Scale throughput and capacity with more nodes, more disk, more memory

• Adding or removing nodes is an online operation

• Gossip based protocol for discovery

Data Model

• Column orientated

• Denormalise

• Cassandra is an index building machine

• Simple explanation: a row has a key and stores an ordered hash in one or more Column Families

Data Model

• Keyspace

• Row / Key

• Column Family or Super Column Family

• Column

Data Model

            User CF              Posts SCF

Fred        email: fred@...      post_1: {title: foo, body: bar}
            dob: 04/03

Bob         email: bob           post_100: {title: monkeys, body: naughty}
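One way to picture the example above is as nested dictionaries. This is only a mental model, assuming plain Python dicts stand in for the keyspace: a standard Column Family maps row key to columns, and a Super Column Family adds one more level. The `sorted_columns` helper is hypothetical and shows that columns within a row come back ordered by name.

```python
keyspace = {
    # Standard Column Family: row key -> {column name: value}
    "User": {
        "Fred": {"email": "fred@...", "dob": "04/03"},
        "Bob": {"email": "bob"},
    },
    # Super Column Family: row key -> {super column: {column name: value}}
    "Posts": {
        "Fred": {"post_1": {"title": "foo", "body": "bar"}},
        "Bob": {"post_100": {"title": "monkeys", "body": "naughty"}},
    },
}


def sorted_columns(column_family, row_key):
    """Return a row's (name, value) pairs ordered by column name."""
    row = keyspace[column_family].get(row_key, {})
    return sorted(row.items())


print(sorted_columns("User", "Fred"))
print(keyspace["Posts"]["Bob"]["post_100"]["title"])
```

Rows do not need the same columns: Bob has no dob column, and nothing has to be declared for that to work.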

API

• Thrift

• Avro (beta)

• Auto generated bindings for many languages

• Stateful connections

• Python wrappers: pycassa, Telephus (Twisted)

• Client supplied time stamp for all mutations

• Client supplied Consistency Level for all mutations and reads

• insert(key, column_family, super_column, column, value)

• get(key, column_family, super_column, column)

• remove(key, column_family, super_column, column)
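The client supplied timestamp matters because it decides which mutation wins when two writes touch the same column. The toy class below is a single node stand-in for one column family, not the Thrift API or pycassa: the class and method signatures are invented for this sketch, and it assumes the strictly newer timestamp wins and that a delete is stored as a marker (a tombstone) rather than removing the column outright.

```python
class ToyColumnFamily:
    """In-memory model of timestamp based conflict resolution."""

    def __init__(self):
        # row key -> {column name: (value, timestamp)}; value None = deleted
        self.rows = {}

    def insert(self, key, column, value, timestamp):
        row = self.rows.setdefault(key, {})
        current = row.get(column)
        # Keep the mutation only if it is newer than what is stored.
        if current is None or timestamp > current[1]:
            row[column] = (value, timestamp)

    def remove(self, key, column, timestamp):
        # A delete is just a write of a deletion marker.
        self.insert(key, column, None, timestamp)

    def get(self, key, column):
        value, _ = self.rows.get(key, {}).get(column, (None, None))
        return value


users = ToyColumnFamily()
users.insert("Fred", "email", "new@example.com", timestamp=2)
users.insert("Fred", "email", "old@example.com", timestamp=1)  # stale, ignored
print(users.get("Fred", "email"))

users.remove("Fred", "email", timestamp=3)
users.insert("Fred", "email", "late@example.com", timestamp=2)  # older than delete
print(users.get("Fred", "email"))
```

The second example shows why clients need reasonably synchronised clocks: a write stamped earlier than a delete stays deleted, no matter when it arrives.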

API

• Slicing columns or super columns

• list of names

• start, finish, count, reversed

• get_slice() to slice one row

• multiget_slice() to slice multiple rows

• get_range_slices() to slice rows and columns

• Slicing keys

• start key, finish key, count

• Partitioner affects key order

• get_range_slices() to slice rows and columns
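The start, finish, count and reversed parameters are easier to follow with a small model. The function below imitates slicing the columns of one row held in a plain dict; it is not the Thrift `get_slice()` call. It assumes both bounds are inclusive, that an empty string means "no bound", and that in reversed order `start` is the higher name. The parameter is called `reverse` here to avoid shadowing Python's built-in `reversed`.

```python
def slice_columns(row, start="", finish="", count=100, reverse=False):
    """Return up to count (name, value) pairs from row, ordered by name."""
    names = sorted(row, reverse=reverse)

    def in_range(name):
        if reverse:
            # Walking from high names to low names.
            return (start == "" or name <= start) and (finish == "" or name >= finish)
        return (start == "" or name >= start) and (finish == "" or name <= finish)

    selected = [name for name in names if in_range(name)][:count]
    return [(name, row[name]) for name in selected]


row = {"a": 1, "b": 2, "c": 3, "d": 4}
print(slice_columns(row, start="b", finish="c"))
print(slice_columns(row, count=2))
print(slice_columns(row, start="c", count=2, reverse=True))
```

Paging through a wide row follows from this: ask for a fixed count, then use the last column name returned as the next start.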

• batch_mutate()

• multiple rows and CFs

• delete or insert / update

• Individual mutations are atomic

• Request is not atomic, no rollback
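The last two bullets are worth seeing in action: each mutation either applies or not, but a failure part way through a batch leaves the earlier mutations in place. The function below is a toy model over plain dicts, not the Thrift `batch_mutate()` call, and the "unknown column family" failure is a hypothetical way to interrupt the batch.

```python
def toy_batch_mutate(store, mutations):
    """Apply (column_family, row_key, column, value) tuples in order.

    A value of None deletes the column. There is no rollback: if a
    mutation fails, everything applied before it stays applied.
    """
    applied = 0
    for column_family, row_key, column, value in mutations:
        if column_family not in store:
            raise KeyError("unknown column family: %s" % column_family)
        row = store[column_family].setdefault(row_key, {})
        if value is None:
            row.pop(column, None)
        else:
            row[column] = value
        applied += 1
    return applied


store = {"Object": {}, "ObjectIndex": {}}
mutations = [
    ("Object", "bucket/key1", "body", "hello"),
    ("NoSuchCF", "bucket/key1", "body", "boom"),  # hypothetical failure
    ("ObjectIndex", "bucket", "key1", ""),
]
try:
    toy_batch_mutate(store, mutations)
except KeyError as error:
    print("batch failed:", error)

print(store)  # the first mutation is still there, the third never ran
```

A client that gets an error back has to be prepared to retry the whole batch, which is safe when the mutations are idempotent.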

Our Application

[Architecture diagram: Varnish, Tornado, Cassandra, Rabbit MQ]

Our Application

• Similar to Amazon S3.

• REST API.

• Databases, Buckets, Keys+Values.

Our Column Families

• Database (super)

• Bucket (super)

• Bucket Index

• Object

• Object Index (super)

Our API

http://db_name.wetafx.co.nz/bucket/key
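A request URL of this shape carries three pieces of information: the database in the host name, then the bucket and key in the path. A minimal sketch of pulling them apart with the standard library follows; the function name is made up, and it assumes the database is always the first label of the host name.

```python
from urllib.parse import urlsplit


def parse_request_url(url):
    """Split a URL into (db_name, bucket, key)."""
    parts = urlsplit(url)
    db_name = parts.hostname.split(".")[0]
    # Everything after the first path segment is treated as the key.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    return db_name, bucket, key


print(parse_request_url("http://db_name.wetafx.co.nz/bucket/key"))
```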

PUT Object

• /bucket/object

• batch_mutate()

• one row in Object CF with columns for meta and the body

• one column in ObjectIndex CF row for the bucket
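The PUT steps above amount to building one batch that touches two column families. The sketch below builds that batch as nested dicts (column family, then row key, then columns). The row key format "bucket/key", the "meta:" column prefix and the empty index value are assumptions made for illustration; the slides do not say how the real application names them, and the super column layout of the index is not modelled.

```python
def put_object_mutations(bucket, key, body, meta):
    """Build the two writes for a PUT as {column family: {row key: columns}}."""
    object_row_key = "%s/%s" % (bucket, key)  # hypothetical row key format
    columns = {"meta:" + name: value for name, value in meta.items()}
    columns["body"] = body
    return {
        # One row holding the object's metadata and body.
        "Object": {object_row_key: columns},
        # One column, named after the object key, in the bucket's index row.
        "ObjectIndex": {bucket: {key: ""}},
    }


batch = put_object_mutations(
    "photos", "kiwi.jpg", b"...", {"content-type": "image/jpeg"}
)
print(sorted(batch))
print(batch["ObjectIndex"])
```

Sending both writes in one batch keeps the index and the object one request apart at most, though a failed batch can still leave one without the other.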

List Objects

• /bucket_name?start=foo

• get_slice()

• for the bucket row in ObjectIndex CF

• if needed, multiget_slice() to “join” to the Object CF
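Listing is a slice of the bucket's index row, followed by an optional second lookup that plays the role of a join. The sketch below runs against plain dicts rather than Cassandra; the store contents, the "bucket/key" row key format and the function name are hypothetical.

```python
store = {
    "ObjectIndex": {"photos": {"apple.jpg": "", "kiwi.jpg": "", "zebra.jpg": ""}},
    "Object": {
        "photos/apple.jpg": {"body": "a"},
        "photos/kiwi.jpg": {"body": "k"},
        "photos/zebra.jpg": {"body": "z"},
    },
}


def list_objects(store, bucket, start="", count=100, with_objects=False):
    """List object keys in a bucket, optionally joined to their rows."""
    # Step 1: slice the bucket's row in the index (stands in for get_slice()).
    index_row = store["ObjectIndex"].get(bucket, {})
    keys = [key for key in sorted(index_row) if key >= start][:count]
    if not with_objects:
        return keys
    # Step 2: fetch each object row (stands in for multiget_slice()).
    return [(key, store["Object"].get("%s/%s" % (bucket, key))) for key in keys]


print(list_objects(store, "photos", start="k"))
print(list_objects(store, "photos", count=1, with_objects=True))
```

Because the index row is ordered by column name, ?start=foo maps directly onto the slice start and no sorting is needed at request time.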

Delete Bucket

• /bucket_name

• get_slice() on ObjectIndex CF

• batch_mutate() to delete Object CF and ObjectIndex CF

• delete Bucket CF row
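The delete steps above can be walked through against the same kind of toy store. This is a sketch over plain dicts, with a hypothetical "bucket/key" row key format; in the real system the reads and deletes would be `get_slice()` and `batch_mutate()` calls, and a very large bucket would need to be processed in pages.

```python
def delete_bucket(store, bucket):
    """Remove a bucket, its index row and all of its objects."""
    # Step 1: read the object keys from the bucket's index row.
    keys = list(store["ObjectIndex"].get(bucket, {}))
    # Step 2: delete each object row, then the index row itself.
    for key in keys:
        store["Object"].pop("%s/%s" % (bucket, key), None)
    store["ObjectIndex"].pop(bucket, None)
    # Step 3: delete the bucket's own row.
    store["Bucket"].pop(bucket, None)
    return len(keys)


store = {
    "Bucket": {"photos": {"owner": "fred"}, "docs": {"owner": "bob"}},
    "ObjectIndex": {"photos": {"kiwi.jpg": ""}, "docs": {"a.txt": ""}},
    "Object": {"photos/kiwi.jpg": {"body": "k"}, "docs/a.txt": {"body": "a"}},
}
print(delete_bucket(store, "photos"))
print(store)
```

Deleting the bucket row last means a failure part way through leaves the bucket visible, so the delete can simply be retried.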

Thanks

• http://wetafx.co.nz

• http://cassandra.apache.org/
