Cassandra Core Concepts and Design Internals

New Delhi Cassandra Users Meetup – November

By: Salil Kalia

We’re going to talk about:

1. What is Cassandra?

2. High Level Architecture

3. Data Modeling

4. Write Path

5. Read Path

6. Tools

7. Q/A

A Database:

✓ Highly available

✓ Fully distributed, with no single point of failure

✓ Free & open source, with deep developer support

✓ Highly performing with near-linear horizontal scaling

✓ Replicated & durable

What is Cassandra?

Elastic Scalability

Distributed

Decentralized

Fault Tolerant

Column Oriented

Tunable Consistency

Highly available

KEY FEATURES

Open Source

Cassandra – Features

Google Bigtable

Amazon Dynamo

[Facebook] Cassandra

Cassandra Evolution

✓ Ring based data distribution

✓ Only one type of Server

✓ Highly distributed

✓ All nodes hold data

✓ All nodes answer queries

✓ All nodes are replicas

✓ In-built Multi DC

✓ In-built Snitch feature

High Level Architecture

✓ Nodes and Virtual nodes

✓ Primary & Secondary range

✓ Partition Key (Hash)

✓ Partitioner

✓ Client & Coordinator

✓ Replication Factor (RF)

✓ Consistency Level (CL)

Few Common Terms
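To make the terms above concrete, here is a minimal Python sketch of how a partitioner could map a partition key to a token and how a coordinator could pick RF replicas by walking the ring clockwise. It is an illustration under simplifying assumptions, not Cassandra's actual code: the `Ring` class, the node names and the use of MD5 are all hypothetical choices for the example, there are no virtual nodes, and replica placement ignores racks and data centers.

```python
import bisect
import hashlib


def token(partition_key):
    """Hash a partition key to an integer token (MD5 chosen only for illustration)."""
    return int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: each node owns the range ending at its own token."""

    def __init__(self, node_tokens):
        # node_tokens maps a (hypothetical) node name to the token it owns.
        self.entries = sorted((t, name) for name, t in node_tokens.items())
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key, rf):
        """Return the nodes that would hold the key for replication factor rf."""
        n = len(self.entries)
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % n
        # The remaining replicas are the next nodes clockwise.
        return [self.entries[(start + i) % n][1] for i in range(min(rf, n))]


ring = Ring({
    "node-a": 0,
    "node-b": 2**126,
    "node-c": 2**127,
    "node-d": 3 * 2**126,
})
print(ring.replicas("Vishal", rf=3))
```

Any node can act as coordinator for a client request, because every node can compute the same replica list from the key alone.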

Magic Formula

Write CL + Read CL > RF

Immediate Consistency
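The formula can be checked with a few lines of Python. This is a sketch in which consistency levels are expressed as the number of replicas that must acknowledge a request; the function names are made up for the example.

```python
def quorum(rf):
    """Number of replicas in a majority for a given replication factor."""
    return rf // 2 + 1


def is_immediately_consistent(write_cl, read_cl, rf):
    """True when every read set must overlap every write set in at least one replica."""
    return write_cl + read_cl > rf


rf = 3
print(is_immediately_consistent(quorum(rf), quorum(rf), rf))  # QUORUM + QUORUM: 2 + 2 > 3 -> True
print(is_immediately_consistent(1, 1, rf))                    # ONE + ONE: 1 + 1 > 3 -> False
print(is_immediately_consistent(rf, 1, rf))                   # ALL + ONE: 3 + 1 > 3 -> True
```

When the sum exceeds RF, at least one replica contacted by the read has also acknowledged the latest write, which is why the read sees it.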

Keyspace

Partition

Column

Data Modeling

✓ Like an RDBMS, Cassandra uses a Table to store data

✓ Partitions within tables

✓ Rows within partitions (or a single row)

✓ CQL to create tables & query data

✓ Partition keys determine where a partition is found

✓ Clustering keys determine ordering of rows within a partition

Data Modeling

name    age  occupation
Salil   32   Tech Lead
Vishal  25   Software Engineer
Akshay  45   Actor
Sheri   29   Singer

cqlsh:demo> create table user (name text primary key, age int, occupation text);

cqlsh:demo> select * from user WHERE name = 'Vishal';

Example: Single Row Partition

✓ User identified by name (PK)

✓ Single row per partition

✓ RDBMS-like structure

video_id  comment_id  comment
5         1           Nice pic
5         2           Which place?
5         3           lol
6         4           Great!

cqlsh:demo> create table comment (video_id int, comment_id int, comment text, primary key (video_id, comment_id));

cqlsh:demo> select * from comment WHERE video_id = 5;

Example: Multiple Rows Partition

• video_id – partition key

• comment_id – clustering key

* In the real world, use UUIDs instead of int for the primary key
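One way to picture a multi-row partition is as a map from partition key to a list of rows kept sorted by clustering key. The Python sketch below models the comment table that way; the `CommentTable` class is a hypothetical in-memory stand-in for illustration, not how Cassandra stores data on disk.

```python
import bisect
from collections import defaultdict


class CommentTable:
    """Toy model: video_id is the partition key, comment_id the clustering key."""

    def __init__(self):
        # video_id -> list of (comment_id, comment), kept sorted by comment_id
        self.partitions = defaultdict(list)

    def insert(self, video_id, comment_id, comment):
        rows = self.partitions[video_id]
        ids = [cid for cid, _ in rows]
        i = bisect.bisect_left(ids, comment_id)
        if i < len(rows) and rows[i][0] == comment_id:
            rows[i] = (comment_id, comment)       # same primary key: overwrite (upsert)
        else:
            rows.insert(i, (comment_id, comment))  # keep clustering order

    def select(self, video_id):
        """All rows of one partition, already in clustering-key order."""
        return list(self.partitions.get(video_id, []))


table = CommentTable()
table.insert(5, 3, "lol")
table.insert(5, 1, "Nice pic")
table.insert(6, 4, "Great!")
table.insert(5, 2, "Which place?")
print(table.select(5))
```

Rows come back ordered by comment_id regardless of insertion order, and the query for video_id 5 touches a single partition.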

Query before data modeling

Denormalize the data

Create multiple views into your data

Cassandra is built for faster writes

Better – as few reads as possible

Data Modeling – Best practices
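As a rough illustration of "denormalize" and "multiple views", the Python sketch below writes each comment into two tables, one per query. The `user` column and the `comments_by_user` table are hypothetical additions for this example only; plain dictionaries stand in for Cassandra tables.

```python
from collections import defaultdict

# One table per query. Both hold the same data, partitioned differently.
comments_by_video = defaultdict(list)  # partition key: video_id
comments_by_user = defaultdict(list)   # partition key: user (hypothetical column)


def add_comment(video_id, user, comment):
    """Pay with an extra write so that each read hits exactly one partition."""
    row = {"video_id": video_id, "user": user, "comment": comment}
    comments_by_video[video_id].append(row)
    comments_by_user[user].append(row)


add_comment(5, "Salil", "Nice pic")
add_comment(5, "Vishal", "Which place?")
add_comment(6, "Salil", "Great!")

print(len(comments_by_video[5]))       # comments on video 5
print(len(comments_by_user["Salil"]))  # comments written by Salil
```

Each query is answered from a single partition of the table designed for it, which trades cheap writes for fewer reads.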

CommitLog – append only logs

Memtables – In memory table

SSTables – created after the data flushes to disk

Compaction – process to merge SSTables

Key components of the Write Path
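The four components can be tied together in a short Python sketch. It is a simplified model under stated assumptions: the `Node` class and its `memtable_limit` parameter are invented for the example, SSTables are plain dictionaries, and "newer SSTable wins" stands in for the per-cell timestamps that real compaction compares.

```python
class Node:
    """Toy write path: commit log -> memtable -> SSTable flush -> compaction."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory table
        self.sstables = []     # immutable tables on "disk", oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # logged first, for durability
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as a new SSTable, sorted by key.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replay

    def compact(self):
        # Merge all SSTables into one; later (newer) tables overwrite older values.
        merged = {}
        for sstable in self.sstables:
            merged.update(sstable)
        self.sstables = [dict(sorted(merged.items()))]


node = Node(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    node.write(key, value)
node.compact()
print(node.sstables)
```

Writes never modify an existing SSTable, which is what keeps the write path append-only; obsolete values are discarded later, during compaction.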

✓ Memtables – in-memory tables

✓ Row Cache – in-memory cache that stores recently read rows

✓ Bloom Filters – report whether a partition key may be found in the corresponding SSTable

✓ Key Caches – in memory (on heap)

✓ Partition Summaries – in memory (on heap)

✓ Partition Indexes – on disk

✓ SSTables – on disk

Key components of the Read Path
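The role of the Bloom filter in the read path can be sketched in Python as follows. This is a simplified model, not Cassandra's implementation: the `BloomFilter` class, its sizes and the SHA-256 based hashing are assumptions for the example, caches and indexes are left out, and the `read` function returns the first hit instead of merging results by timestamp as a real read would.

```python
import hashlib


class BloomFilter:
    """Answers "definitely not present" or "may be present" for a key."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = True

    def might_contain(self, key):
        # False means the key was never added; True may be a false positive.
        return all(self.bits[p] for p in self._positions(key))


def read(key, memtable, sstables):
    """Check the memtable, then each SSTable (newest first) whose filter may hold the key."""
    if key in memtable:
        return memtable[key]
    for bloom, data in sstables:
        if bloom.might_contain(key) and key in data:  # skip tables that cannot hold the key
            return data[key]
    return None


old_data = {"Salil": 32, "Vishal": 25}
old_bloom = BloomFilter()
for name in old_data:
    old_bloom.add(name)

print(read("Vishal", memtable={"Akshay": 45}, sstables=[(old_bloom, old_data)]))
```

Because the filter never gives a false negative, skipping an SSTable on a negative answer is always safe and saves a disk lookup.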
