Transcript
Page 1: Cassandra concepts, patterns and anti-patterns

Cassandra concepts, patterns and anti-patterns - ApacheCon EU 2012

Cassandra concepts, patterns and anti-patterns

Dave Gardner (@davegardnerisme)

ApacheCon EU 2012

Page 2: Cassandra concepts, patterns and anti-patterns

Agenda

• Choosing NoSQL
• Cassandra concepts (Dynamo and Big Table)
• Patterns and anti-patterns of use

Page 3: Cassandra concepts, patterns and anti-patterns

Choosing NoSQL...

Page 4: Cassandra concepts, patterns and anti-patterns

1. Find data store that doesn’t use SQL
2. Anything
3. Cram all the things into it
4. Triumphantly blog this success
5. Complain a month later when it bursts into flames

http://www.slideshare.net/rbranson/how-do-i-cassandra/4

Page 5: Cassandra concepts, patterns and anti-patterns

“NoSQL DBs trade off traditional features to better support new and emerging use cases”

http://www.slideshare.net/argv0/riak-use-cases-dissecting-the-solutions-to-hard-problems

Page 6: Cassandra concepts, patterns and anti-patterns

More widely used, tested and documented software... (MySQL first OS release 1998)

... for a relatively immature product (Cassandra first open-sourced in 2008)

Page 7: Cassandra concepts, patterns and anti-patterns

Ad-hoc querying... (SQL join, group by, having, order)

... for a rich data model with limited ad-hoc querying ability (Cassandra makes you denormalise)

Page 8: Cassandra concepts, patterns and anti-patterns

What do we get in return?

Page 9: Cassandra concepts, patterns and anti-patterns

Proven horizontal scalability

Cassandra scales reads and writes linearly as new nodes are added

Page 10: Cassandra concepts, patterns and anti-patterns

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

Page 11: Cassandra concepts, patterns and anti-patterns

High availability

Cassandra is fault-resistant with tunable consistency levels

Page 12: Cassandra concepts, patterns and anti-patterns

Low latency, solid performance

Cassandra has very good write performance

Page 13: Cassandra concepts, patterns and anti-patterns

http://blog.cubrid.org/dev-platform/nosql-benchmarking/

* Add pinch of salt

Page 14: Cassandra concepts, patterns and anti-patterns

Operational simplicity

Homogeneous cluster, no “master” node, no SPOF

Page 15: Cassandra concepts, patterns and anti-patterns

Rich data model

Cassandra is more than simple key-value – columns, composites, counters, secondary indexes

Page 16: Cassandra concepts, patterns and anti-patterns

Choosing NoSQL...

Page 17: Cassandra concepts, patterns and anti-patterns

“they say … I can’t decide between this project and this project even though they look nothing like each other. And the fact that you can’t decide indicates that you don’t actually have a problem that requires them.”

http://nosqltapes.com/video/benjamin-black-on-nosql-cloud-computing-and-fast_ip (at 30:15)

Page 18: Cassandra concepts, patterns and anti-patterns

Or you haven’t learned enough about them..

Page 19: Cassandra concepts, patterns and anti-patterns

• What tradeoffs are you making?
• How is it designed?
• What algorithms does it use?
• Are the fundamental design decisions sane?

http://www.alberton.info/nosql_databases_what_when_why_phpuk2011.html

Page 20: Cassandra concepts, patterns and anti-patterns

Concepts...

Page 21: Cassandra concepts, patterns and anti-patterns

Amazon Dynamo + Google Big Table

Amazon Dynamo:
• Consistent hashing
• Vector clocks *
• Gossip protocol
• Hinted handoff
• Read repair

http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

Google Big Table:
• Columnar
• SSTable storage
• Append-only
• Memtable
• Compaction

http://labs.google.com/papers/bigtable-osdi06.pdf

* not in Cassandra

Page 22: Cassandra concepts, patterns and anti-patterns

Distributed Hash Table (DHT)

[Diagram: a ring of six nodes, numbered 1 to 6, and a client.]

tokens are integers from 0 to 2^127

Page 23: Cassandra concepts, patterns and anti-patterns

[Diagram: the ring of six nodes, numbered 1 to 6. A client connects to a coordinator node. Label: consistent hashing.]

Page 24: Cassandra concepts, patterns and anti-patterns

[Diagram: the ring of six nodes, numbered 1 to 6. A client connects to a coordinator node. Label: replication factor (RF) 3.]
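The ring diagrams above can be made concrete with a small sketch. This is a toy model, not Cassandra's implementation: the node names, the evenly spaced tokens and the helper names (`token_for`, `build_ring`, `replicas_for`) are assumptions for illustration, and MD5 reduced into the 0 to 2^127 token range only approximates what a hash-based partitioner does. Replica placement simply walks clockwise around the ring.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 127  # the slides give the token range as 0 to 2^127


def token_for(key: str) -> int:
    """Hash a row key onto the ring (MD5 reduced into the token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def build_ring(node_names):
    """Assign each node one evenly spaced token; return sorted (token, node) pairs."""
    step = TOKEN_SPACE // len(node_names)
    return sorted((i * step, name) for i, name in enumerate(node_names))


def replicas_for(ring, key, replication_factor=3):
    """Walk clockwise from the key's token and collect RF consecutive nodes.

    Assumes replication_factor <= number of nodes.
    """
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping past the end.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


ring = build_ring(["node1", "node2", "node3", "node4", "node5", "node6"])
print(replicas_for(ring, "USERID1234"))
```

Any node can act as coordinator because every node can compute the same answer from the key alone; no lookup table or master is needed.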

Page 25: Cassandra concepts, patterns and anti-patterns

Consistency Level (CL)

How many replicas must respond to declare success?

Page 26: Cassandra concepts, patterns and anti-patterns

For read operations

Level         Description
ONE           1st response
QUORUM        N/2 + 1 replicas
LOCAL_QUORUM  N/2 + 1 replicas in local data centre
EACH_QUORUM   N/2 + 1 replicas in each data centre
ALL           All replicas

http://wiki.apache.org/cassandra/API#Read

Page 27: Cassandra concepts, patterns and anti-patterns

For write operations

Level         Description
ANY           One node, including hinted handoff
ONE           One node
QUORUM        N/2 + 1 replicas
LOCAL_QUORUM  N/2 + 1 replicas in local data centre
EACH_QUORUM   N/2 + 1 replicas in each data centre
ALL           All replicas

http://wiki.apache.org/cassandra/API#Write
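As a worked example of the "N/2 + 1" rule in the tables above, the sketch below computes the quorum size for a few replication factors, taking N to be the replication factor and using integer division. The overlap check (reads plus writes greater than N) is the standard argument for why a QUORUM read sees a QUORUM write; the function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """N/2 + 1 with integer division, where N is the replication factor."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_acks > replication_factor


for rf in (1, 2, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: quorum={q}, "
          f"QUORUM/QUORUM overlaps={read_overlaps_write(rf, q, q)}, "
          f"ONE/ONE overlaps={read_overlaps_write(rf, 1, 1)}")
```

With RF = 3 a quorum is two replicas, so one replica can be down and QUORUM reads and writes still succeed.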

Page 28: Cassandra concepts, patterns and anti-patterns

[Diagram: the ring of six nodes, numbered 1 to 6. A client connects to a coordinator node. RF = 3, CL = Quorum.]

Page 29: Cassandra concepts, patterns and anti-patterns

Hinted Handoff

A hint is written to the coordinator node when a replica is down

http://wiki.apache.org/cassandra/HintedHandoff

Page 30: Cassandra concepts, patterns and anti-patterns

[Diagram: the ring of six nodes, numbered 1 to 6. A client connects to a coordinator node. RF = 3, CL = Quorum. Labels: node offline, hint.]
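A rough simulation of the hinted handoff scenario may help. Everything here is a toy under stated assumptions: the `Replica` and `Coordinator` classes are hypothetical, hints live in a plain list, and replay is triggered by hand rather than by failure detection. The point is only that a write at QUORUM can succeed with one of three replicas offline, and the missed write is delivered later.

```python
class Replica:
    """A toy replica: a name, an up/down flag and a key-value store."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """A toy coordinator that stores hints for replicas that are down."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # (replica, key, value) tuples awaiting delivery

    def write(self, key, value, required_acks):
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                # Remember the write so it can be replayed later.
                self.hints.append((replica, key, value))
        return acks >= required_acks

    def replay_hints(self):
        """Deliver stored hints to any replica that has come back up."""
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining


replicas = [Replica("node3"), Replica("node4"), Replica("node5")]
coordinator = Coordinator(replicas)
replicas[2].up = False                                          # node offline
ok = coordinator.write("USERID1234", "Dave", required_acks=2)   # RF = 3, CL = Quorum
replicas[2].up = True
coordinator.replay_hints()
print(ok, replicas[2].data)
```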

Page 31: Cassandra concepts, patterns and anti-patterns

Read Repair

Background digest query on-read to find and update out-of-date replicas*

http://wiki.apache.org/cassandra/ReadRepair

* carried out in the background unless CL:ALL

Page 32: Cassandra concepts, patterns and anti-patterns

[Diagram: the ring of six nodes, numbered 1 to 6. A client connects to a coordinator node. RF = 3, CL = One. Label: background digest query, then update out-of-date replicas.]
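The read repair flow can be sketched in a few lines. This is a simplified model rather than Cassandra's code: replicas are plain dictionaries mapping a key to a (value, timestamp) pair, `digest` and `read_with_repair` are hypothetical helpers, and the "background" step runs inline for clarity. A read at CL ONE returns what the first replica holds, while the digest comparison brings stale replicas up to date for later reads.

```python
import hashlib


def digest(cell):
    """A cheap fingerprint of a cell, standing in for a digest query."""
    return hashlib.md5(repr(cell).encode("utf-8")).hexdigest()


def read_with_repair(replicas, key):
    """Read at CL ONE from the first replica, then repair the others.

    replicas: list of dicts mapping key -> (value, timestamp).
    """
    result = replicas[0].get(key)

    # Digest query: do all replicas agree on this key?
    digests = {digest(replica.get(key)) for replica in replicas}
    if len(digests) > 1:
        # Pick the cell with the highest timestamp and push it everywhere.
        newest = max((r[key] for r in replicas if key in r), key=lambda cell: cell[1])
        for replica in replicas:
            if replica.get(key) != newest:
                replica[key] = newest
    return result


replicas = [
    {"job": ("Developer", 1)},   # out of date
    {"job": ("Cruft", 2)},       # latest write
    {},                          # missed the write entirely
]
first = read_with_repair(replicas, "job")
second = read_with_repair(replicas, "job")
print(first, second)
```

Note that the first read can still return the stale value; only subsequent reads benefit from the repair.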

Page 33: Cassandra concepts, patterns and anti-patterns

Big Table...

Page 34: Cassandra concepts, patterns and anti-patterns

• Sparse column based data model
• SSTable disk storage
• Append-only commit log
• Memtable (buffer and sort)
• Immutable SSTable files
• Compaction

http://research.google.com/archive/bigtable-osdi06.pdf
http://www.slideshare.net/geminimobile/bigtable-4820829

Page 35: Cassandra concepts, patterns and anti-patterns

[Diagram: a Column is a Name and a Value, plus a timestamp.]

Timestamp used for conflict resolution (last write wins)

Page 36: Cassandra concepts, patterns and anti-patterns

[Diagram: a sequence of Columns, each a Name and a Value.]

we can have millions of columns *

* theoretically up to 2 billion

Page 37: Cassandra concepts, patterns and anti-patterns

[Diagram: a Row is a Row Key followed by a sequence of Columns, each a Name and a Value.]

Page 38: Cassandra concepts, patterns and anti-patterns

Column Family

[Diagram: a Column Family is a set of rows, each a Row Key followed by its Columns.]

we can have billions of rows

Page 39: Cassandra concepts, patterns and anti-patterns

Write path

[Diagram: a write goes to the Commit Log (on disk) and to the Memtable (in memory); the Memtable is flushed to SSTables on disk.]

• Memtable: buffer writes and sort data
• Flush on time or size trigger
• SSTables: immutable
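The write path can be modelled with a very small in-memory sketch. It is an illustration under assumptions, not Cassandra's storage engine: the `ColumnStore` class is hypothetical, "disk" is just a Python list, the flush trigger is a simple size threshold, and clearing the commit log on flush stands in for recycling log segments.

```python
class ColumnStore:
    """Toy write path: commit log + memtable, flushed to immutable sorted runs."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory buffer of recent writes
        self.sstables = []     # immutable, sorted tuples standing in for files
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # sequential append for durability
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Sort the memtable and write it out as a new immutable SSTable."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log = []   # flushed writes no longer need replaying

    def read(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def compact(self):
        """Merge all SSTables into one, keeping the newest value per key."""
        merged = {}
        for sstable in self.sstables:  # oldest first, so newer values overwrite
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]


store = ColumnStore(flush_threshold=3)
for key, value in [("c", 1), ("a", 2), ("b", 3), ("a", 4)]:
    store.write(key, value)
print(store.sstables, store.memtable, store.read("a"))
```

Writes never modify an existing SSTable, which is why they are cheap; the cost is deferred to reads (which may consult several SSTables) and to compaction.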

Page 40: Cassandra concepts, patterns and anti-patterns

Sorted data written to disk in blocks

Each “query” can be answered from a single slice of disk

Therefore start from your queries and work backwards

Page 41: Cassandra concepts, patterns and anti-patterns

Patterns and anti-patterns...

Page 42: Cassandra concepts, patterns and anti-patterns

Page 43: Cassandra concepts, patterns and anti-patterns

Pattern: Storing entities as individual columns under one row

Page 44: Cassandra concepts, patterns and anti-patterns

Pattern

row: USERID1234
name: Dave
email: [email protected]
job: Developer

one row per user

we can use C* secondary indexes to fetch all users with job=developer

Page 45: Cassandra concepts, patterns and anti-patterns

Anti-pattern: Storing whole entity as single column blob

Page 46: Cassandra concepts, patterns and anti-patterns

Anti-pattern

row: USERID1234
data: {"name":"Dave", "email":"[email protected]", "job":"Developer"}

one row per user

now we can’t use secondary indexes, nor can we easily update the entity safely

Page 47: Cassandra concepts, patterns and anti-patterns

Pattern: Mutate just the changes to entities, make use of C* conflict resolution

Page 48: Cassandra concepts, patterns and anti-patterns

Pattern

$userCf->insert("USER1234", array("job" => "Cruft"));

we only update the “job” column, avoiding the race condition of reading all properties and then writing them all back when only one has changed
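To see why per-column mutation avoids the race, here is a small simulation of last-write-wins applied per column versus per blob. It is a sketch under assumptions: `apply_write` is a hypothetical helper, timestamps are small integers, ties simply favour the write applied later, and the name "David" is an invented value. Two clients change different properties at about the same time.

```python
import json


def apply_write(row, column, value, timestamp):
    """Keep the cell with the newer timestamp for a column (last write wins)."""
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)


# Pattern: one column per property. Each client mutates only what changed.
row = {}
apply_write(row, "name", "Dave", 1)
apply_write(row, "job", "Developer", 1)
apply_write(row, "job", "Cruft", 2)      # client A changes the job
apply_write(row, "name", "David", 3)     # client B changes the name
columns = {name: cell[0] for name, cell in row.items()}

# Anti-pattern: the whole entity in one blob column. Both clients read the
# same snapshot, change one property each and write the whole blob back.
blob_row = {}
original = {"name": "Dave", "job": "Developer"}
apply_write(blob_row, "data", json.dumps(original), 1)
client_a = dict(original, job="Cruft")
client_b = dict(original, name="David")
apply_write(blob_row, "data", json.dumps(client_a), 2)
apply_write(blob_row, "data", json.dumps(client_b), 3)
blob = json.loads(blob_row["data"][0])

print(columns)  # both changes survive
print(blob)     # client A's change to "job" has been overwritten
```

Conflict resolution works at the granularity of a column, so the finer the columns, the less one writer can clobber another.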

Page 49: Cassandra concepts, patterns and anti-patterns

Anti-pattern: Lock, read, update

Page 50: Cassandra concepts, patterns and anti-patterns

Pattern: Don’t overwrite anything; store as time series data

Page 51: Cassandra concepts, patterns and anti-patterns

Pattern

row: USERID1234
a384cff0-26c1-11e2-81c1-0800200c9a66: {"action":"create", "name":"Dave"}
10dc4c40-26c2-11e2-81c1-0800200c9a66: {"action":"update", "name":"foo"}

column name is a type 1 UUID (time based)
http://www.famkruithof.net/guid-uuid-timebased.html

one row per user; many columns (wide row)
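The two column names on this slide can be inspected with Python's standard `uuid` module. The sketch below shows that a type 1 UUID carries a timestamp, and that ordering columns by that timestamp differs from ordering the UUIDs as text, which is why a time-aware ordering of column names matters for time series rows. The helper `uuid1_time` is an assumption for illustration.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Type 1 UUID timestamps count 100 ns intervals from this date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_time(u: uuid.UUID) -> datetime:
    """Convert the embedded timestamp of a type 1 UUID to a datetime."""
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


# The two events from the slide, keyed by column name.
events = {
    uuid.UUID("10dc4c40-26c2-11e2-81c1-0800200c9a66"): {"action": "update", "name": "foo"},
    uuid.UUID("a384cff0-26c1-11e2-81c1-0800200c9a66"): {"action": "create", "name": "Dave"},
}

by_time = sorted(events, key=lambda u: u.time)   # chronological order
by_text = sorted(events, key=str)                # plain string order

for column_name in by_time:
    print(uuid1_time(column_name).isoformat(), events[column_name])
```

Sorted by embedded time, the "create" event comes before the "update"; sorted as strings, the order is reversed.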

Page 52: Cassandra concepts, patterns and anti-patterns

Pattern: We can store all sorts of stuff as time series

http://rubyscale.com/2011/basic-time-series-with-cassandra/

Page 53: Cassandra concepts, patterns and anti-patterns

Anti-pattern: Order Preserving Partitioner (OPP)

http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/

Page 54: Cassandra concepts, patterns and anti-patterns

Pattern: Distributed counters

Page 55: Cassandra concepts, patterns and anti-patterns

Anti-pattern: Super Columns (a trap for the unwary)

http://rubyscale.com/2010/beware-the-supercolumn-its-a-trap-for-the-unwary/

Page 56: Cassandra concepts, patterns and anti-patterns

In conclusion...

Page 57: Cassandra concepts, patterns and anti-patterns

Cassandra is founded on sound design principles

Page 58: Cassandra concepts, patterns and anti-patterns

The data model is incredibly powerful

Page 59: Cassandra concepts, patterns and anti-patterns

CQL and a new breed of clients are making it easier to use

Page 60: Cassandra concepts, patterns and anti-patterns

Lots of tools and integrations exist to expand the feature set

Page 61: Cassandra concepts, patterns and anti-patterns

There is a strong community and multiple companies offering professional support

Page 62: Cassandra concepts, patterns and anti-patterns

Thanks

Learn more about Cassandra (if you’re ever in London)
meetup.com/Cassandra-London

Learn more about the fundamentals
http://nosqlsummer.org/

Watch videos from Cassandra SF 2011
http://www.datastax.com/events/cassandrasf2011/presentations

looking for a job?

Page 63: Cassandra concepts, patterns and anti-patterns

Extending functionality

Search via Apache Solr and DataStax Enterprise
http://www.datastax.com/technologies/solr

Batch processing via Apache Hadoop and DataStax Enterprise
http://www.datastax.com/technologies/hadoop

Real-time analytics via Acunu Reflex
http://www.acunu.com/acunu-analytics.html

