Introduction to Apache Cassandra


Upload: open-it

Post on 08-Jul-2015


Category: Engineering



DESCRIPTION

Saratov Open-IT tech talk. Damir Yaraev: Introduction to Apache Cassandra (In this presentation, Damir explains when and why it makes sense to move from time-tested relational databases to the NoSQL-based solutions that have recently become fashionable. He uses the column-oriented NoSQL database Apache Cassandra as an example.)

TRANSCRIPT

Page 1: Introduction to Apache Cassandra

© 2014 Grid Dynamics

Created for BigData Community by Dmitry Yaraev

Apache Cassandra: When and Why

Page 2: Introduction to Apache Cassandra

Agenda

1. When RDBMS Becomes a Bottleneck
2. Concepts of NoSQL Paradigm
3. Variety of NoSQL Databases
4. Why Apache Cassandra?
5. Essential Use Cases of Cassandra
6. Bad Usage Patterns

Page 3: Introduction to Apache Cassandra

What Is Offered by RDBMS

● Mature technology with common standards
● Easy migration from one engine to another
● Data model corresponds to the real world
● Structured Query Language (SQL)
● ACID transactions

Page 4: Introduction to Apache Cassandra

Bottlenecked by RDBMS

● Horizontal scalability
● Schema support and migration
● Server and maintenance cost

Page 5: Introduction to Apache Cassandra

NoSQL :: History

● First mention in 1998
● Class of distributed databases
● Not Only SQL

Page 6: Introduction to Apache Cassandra

NoSQL :: Features

● Simple schema without relations
● Good horizontal scalability
● Combination of two of the following:
  ○ Consistency
  ○ Availability
  ○ Partition Tolerance

Page 7: Introduction to Apache Cassandra

NoSQL :: CAP Theorem

Page 8: Introduction to Apache Cassandra

NoSQL :: Storage Types

Page 9: Introduction to Apache Cassandra

Questions?

Page 10: Introduction to Apache Cassandra

Cassandra :: What Is It?

● Wide-column distributed data store
● The latest version at the time of this talk is 2.1.2 (released the same month)
● Proven in production (Instagram, Spotify, eBay, and many other big players in the IT market)

Page 11: Introduction to Apache Cassandra

Cassandra :: Origin

● Originally created at Facebook
● Open-sourced in 2008
● Apache Incubator project in early 2009
● Top-level Apache project in February 2010

Page 12: Introduction to Apache Cassandra

Cassandra :: Features

● High scalability
● Tunable consistency
● Cross-datacenter replication
● Query language (CQL)
● Drivers for a variety of languages
● Lightweight transactions
● Indexing

Page 13: Introduction to Apache Cassandra

Cassandra :: Data Types

● Primitive types
● Arbitrary bytes (blob)
● Collections (list, map, set)
● Tuples (tuple)
● User-defined types

Page 14: Introduction to Apache Cassandra

Cassandra :: Data Model

● Keyspace
● ColumnFamily
● Row
● Column
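One way to picture the Keyspace, ColumnFamily, Row, Column hierarchy is as nested maps. The sketch below is plain Python, not Cassandra's actual storage format, and the names `music`, `songs`, `song-1`, and the column names are invented for illustration.

```python
# Illustrative only: nested dicts standing in for
# keyspace -> column family -> row (by row key) -> column (name -> value).
# All names below are hypothetical.
from collections import defaultdict


def make_keyspace():
    """Return an empty keyspace: column family -> row key -> {column: value}."""
    return defaultdict(lambda: defaultdict(dict))


def put(keyspace, column_family, row_key, column, value):
    """Write a single column; rows in one column family may hold different columns."""
    keyspace[column_family][row_key][column] = value


def get_row(keyspace, column_family, row_key):
    """Read every column stored for one row key."""
    return dict(keyspace[column_family][row_key])


music = make_keyspace()
put(music, "songs", "song-1", "title", "One")
put(music, "songs", "song-1", "artist", "Metallica")
put(music, "songs", "song-2", "title", "Untitled")  # this row has no artist column

print(get_row(music, "songs", "song-1"))
print(get_row(music, "songs", "song-2"))
```

The second row simply lacks the `artist` column, which is the sense in which rows in a column family need not share the same set of columns.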

Page 15: Introduction to Apache Cassandra

Cassandra :: Data Model

Page 16: Introduction to Apache Cassandra

Cassandra :: ColumnFamily

Page 17: Introduction to Apache Cassandra

Cassandra :: CQL3

● SQL-like syntax
● Three types of statements:
  ○ data definition statements
  ○ data manipulation statements
  ○ data lookup statements
● Prepared statements

Page 18: Introduction to Apache Cassandra

Cassandra :: Example Queries

CREATE TABLE songs (id uuid PRIMARY KEY, title text, album text, artist text, data blob);

SELECT * FROM songs WHERE artist = 'Metallica'; -- returns an error: artist is not indexed

CREATE INDEX ON songs(artist);

SELECT * FROM songs WHERE artist = 'Metallica'; -- works once the index exists
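The reason the first SELECT fails and the second succeeds can be sketched in plain Python. This is a toy model, not Cassandra internals: rows are addressable only by primary key, and a filter on another column is refused until an index maps that column's values back to primary keys. The class `SongsTable`, its method names, and the sample rows are all hypothetical.

```python
# Illustrative model, not Cassandra internals: a table addressed by primary key,
# plus an optional secondary index from a column value to primary keys.
import uuid


class SongsTable:
    def __init__(self):
        self.rows = {}            # primary key (id) -> row dict
        self.artist_index = None  # stays None until create_artist_index() is called

    def insert(self, title, album, artist):
        row_id = uuid.uuid4()
        self.rows[row_id] = {"id": row_id, "title": title, "album": album, "artist": artist}
        if self.artist_index is not None:
            # Keep the index in step with new writes.
            self.artist_index.setdefault(artist, set()).add(row_id)
        return row_id

    def create_artist_index(self):
        """Build the index over existing rows, in the spirit of CREATE INDEX ON songs(artist)."""
        self.artist_index = {}
        for row_id, row in self.rows.items():
            self.artist_index.setdefault(row["artist"], set()).add(row_id)

    def select_by_artist(self, artist):
        """Refuse to filter on a non-key column unless an index exists."""
        if self.artist_index is None:
            raise LookupError("no index on 'artist'; query rejected")
        return [self.rows[row_id] for row_id in self.artist_index.get(artist, ())]


songs = SongsTable()
songs.insert("One", "...And Justice for All", "Metallica")  # sample data
songs.insert("Yesterday", "Help!", "The Beatles")           # sample data

try:
    songs.select_by_artist("Metallica")
except LookupError as err:
    print("error:", err)

songs.create_artist_index()
print([row["title"] for row in songs.select_by_artist("Metallica")])
```

Rejecting the unindexed query outright, rather than silently scanning every row, mirrors the behaviour shown on the slide.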

Page 19: Introduction to Apache Cassandra

Cassandra :: Data Distribution
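Data distribution in Cassandra is based on hashing each partition key to a token and assigning token ranges to nodes arranged in a ring. The following is a minimal sketch of that idea in plain Python. The node names, the one-token-per-node layout, and the use of MD5 (chosen only because it ships with the standard library) are assumptions of this sketch, not a description of a real cluster's configuration.

```python
# Illustrative token ring: each node owns the range of hash values that ends at
# its token. Node names and the hash function are assumptions for this sketch.
import bisect
import hashlib


def token_for(key: str) -> int:
    """Hash a partition key (or node name) to an integer token."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class TokenRing:
    def __init__(self, nodes):
        # One token per node, derived from the node name, kept in sorted order.
        self.ring = sorted((token_for(node), node) for node in nodes)
        self.tokens = [token for token, _ in self.ring]

    def _start_index(self, key: str) -> int:
        # First node whose token is >= the key's token, wrapping past the end.
        return bisect.bisect_left(self.tokens, token_for(key)) % len(self.ring)

    def owner(self, key: str) -> str:
        """Return the node responsible for this key."""
        return self.ring[self._start_index(key)][1]

    def replicas(self, key: str, replication_factor: int):
        """Owner plus the next nodes clockwise (a simple, rack-unaware placement)."""
        start = self._start_index(key)
        count = min(replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]


ring = TokenRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.owner("song-1"))
print(ring.replicas("song-1", 3))
```

Because the owner depends only on the key's hash, any node can compute where a row lives without consulting a central coordinator.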

Page 20: Introduction to Apache Cassandra

Cassandra :: Replication

Page 21: Introduction to Apache Cassandra

Cassandra :: Eventual Consistency

Page 22: Introduction to Apache Cassandra

Cassandra :: Tunable Consistency

Page 23: Introduction to Apache Cassandra

Cassandra :: Consistency Levels

● Defines the condition for a successful read/write operation
● Multiple options:
  ○ ONE
  ○ ALL
  ○ QUORUM
  ○ LOCAL_QUORUM
  ○ SERIAL
  ○ …
● Can be specified per request
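The effect of choosing a consistency level per request can be shown with a little arithmetic. The sketch below is plain Python, not a driver API. It models only ONE, QUORUM, and ALL, and the function names are invented for illustration. The rule it checks is that a read is guaranteed to overlap the latest successful write when the number of replicas read plus the number of replicas written exceeds the replication factor.

```python
# Illustrative arithmetic for consistency levels; not a driver API.

def quorum(replication_factor: int) -> int:
    """A quorum is a strict majority of the replicas."""
    return replication_factor // 2 + 1


def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for the operation to succeed."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not modelled in this sketch: {level}")


def reads_see_latest_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    r = required_acks(read_level, replication_factor)
    w = required_acks(write_level, replication_factor)
    return r + w > replication_factor


for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
    print(write_level, read_level, reads_see_latest_write(write_level, read_level, 3))
```

With a replication factor of 3, writing and reading at ONE gives no overlap guarantee (1 + 1 is not greater than 3), while QUORUM with QUORUM (2 + 2) and ALL with ONE (3 + 1) both do.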

Page 24: Introduction to Apache Cassandra

Cassandra :: Consistency (Quorum)

Page 25: Introduction to Apache Cassandra

Cassandra :: Consistency (ONE)

Page 26: Introduction to Apache Cassandra

Cassandra :: Consistency (ONE)

Page 27: Introduction to Apache Cassandra

Cassandra :: Use Cases

● Large data sets and simple scaling
● A good fit for semi-structured data
● Fault tolerance (no SPoF)
● High write throughput

Page 28: Introduction to Apache Cassandra

Cassandra :: Limitations

● Not suitable for large blobs (> 64 MB)
● Read-heavy workloads where low read latency is critical
● Workloads that need ACID transactions

Page 29: Introduction to Apache Cassandra

Thanks!