c* summit eu 2013: using cassandra in a telco storage system

Post on 10-May-2015

667 Views

Category:

Technology

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Speaker: Pavel Pontryagin, Senior Engineer at Peter-Service Video: http://www.youtube.com/watch?v=Sf81x5V8xKY&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=7 Data volume grows and in telecommunication area it is painful to support and scale RDBMS systems. This presentation shows how we switched from SQL to NoSQL. This will be an overview of aspects: * how we model schema for call data using NoSQL vs SQL. * what hardware architecture we use * NoSQL vs SQL insert-select performance * how we store graph data using C*.

TRANSCRIPT

#CASSANDRAEU

CASSANDRASUMMITEU

Using Cassandra in a Telco Storage System

Pavel Pontryagin, senior developer

#CASSANDRAEU

CASSANDRASUMMITEU

* What hardware architecture we use

* How we model schema for call, client and graph data using NoSQL vs SQL

* NoSQL vs SQL insert-select performance

What we will discuss today is how we switched from SQL to NoSQL

#CASSANDRAEU

CASSANDRASUMMITEU

SQL

What and how much is Telco data

#CASSANDRAEU

CASSANDRASUMMITEU

* What is CDR? Under the hood.

#CASSANDRAEU

CASSANDRASUMMITEU

500 millions/day 100 millions/day

Scaling data in a traditional schema

#CASSANDRAEU

CASSANDRASUMMITEU

* Speed up processing *  Increase reliability * Decrease costs * Unify units  * Add new functionality

Unanswered questions. How to:

Solution for us is

Switch to Cassandra

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL

How to load, model and query data

#CASSANDRAEU

CASSANDRASUMMITEU

The schema is

Local Storage

SSTable Loader

Local Storage

SSTable Loader

Local Storage

SSTable Loader

CDR files CDR files CDR files

Thrift Thrift Thrift 62Gb RAM 12 x CPU 2.50GHz 10Tb array

#CASSANDRAEU

CASSANDRASUMMITEU

SQL data Query

SELECT started, numA, numB FROM events_1003_main WHERE started > sysdate -1 and numA = ‘765-23-14’;

Data modeling. Calls

started numA numB Event type

Lac Cell IMSI IMEI

20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL data “raw” data CALLS_201301 index cf CALLS_MSISDN_201301 Query

SliceQuery<String, Composite, String> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_MSISDN_201301”); //define CF

sliceQuery.setKey(“201301.765-23-18”); //Query given msidn for given date Composite start = new Composite(); start.addComponent(0, “201301”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “201303”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, String>> qr = sliceQuery.execute(); //get the result

uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140

201301.765-­‐23-­‐14   {2013-­‐10-­‐15  21/04/58:uuid1}  numB  

201301.765-­‐23-­‐18   {2013-­‐10-­‐15  21/04/58:uuid1}  numA.IMSI1.IMEI1  

#CASSANDRAEU

CASSANDRASUMMITEU

… where num like ‘%123’ Using SolR

Clone Keep using

+SolR indexing

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL data “raw” data CALLS_201301 index cf CALLS_LAC_CELL_201301 Query

SliceQuery<String, Composite, Date> sliceQuery; //define query sliceQuery.setColumnFamily(“CALLS_LAC_CELL_201301”); //define CF

sliceQuery.setKey(“2013.05.01 15:15”); //Query given time slice Composite start = new Composite(); start.addComponent(0, “2901”, AbstractComposite.ComponentEquality.EQUAL); // from date Composite end = new Composite(); end.addComponent(0, “2901”, AbstractComposite.ComponentEquality.GREATER_THAN_EQUAL); // to date sliceQuery.setRange(start, end, false, 1000); QueryResult<ColumnSlice<Composite, Date >> qr = sliceQuery.execute(); //get the result

uuid1 20131015210458,765-23-14,765-23-18,VOICE,2901,35140

2013.05.01 15:15 {2901:35140: uuid1} 20131015210458

#CASSANDRAEU

CASSANDRASUMMITEU

SQL data Query SELECT distinct dn.msisdn, tl.name, t.full FROM SUBS_DATA.DCT_NUMS dn, SUBS_DATA.SUBS_NUMS_HIST snh, subs_data.clnt_attrs_hist cah, SUBS_DATA.DCT_NAMES t, dicts.telcos tl Where (dn.num_id = subn_num_id) and (cah.clnt_clnt_id = snh.clnt_clnt_id) and (t.name_id = cah.reg_name_id) and (tl.telco_id = dn.tlco_tlco_id) and (snh.started > to_date('10.09.2012', 'DD.MM.YYYY')) and (snh.started < to_date('17.09.2013', 'DD.MM.YYYY')) order by dn.msisdn;

Data modeling. Subscribers

started numA numB Event type

Lac Cell IMSI IMEI

20131015210458 765-23-14 765-23-18 VOICE 2901 35140 IMSI1 IMEI1

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL “raw” subscribers data first slice update index cf SUBS_NAME

uuid1 Pavel, 765-23-14, address1, document1, 2, 21-01-2013, 1(active_status)

uuid2 Vadim, 765-23-16, address2, document2, 2, 21-01-2013, 1(active_status)

Pavel {21-01-2013: document1: 1} {21-01-2014: document1: 0} uuid1 uuid3

Vadim {21-01-2013: document2: 1}

uuid2

uuid3 Pavel, 765-23-14, address1, document1, 2, 21-01-2014, 0(active_status)

#CASSANDRAEU

CASSANDRASUMMITEU

Data modeling. Graph data

#CASSANDRAEU

CASSANDRASUMMITEU

noSQL graph data First record from the first switch Second record from the second switch CALLS_EDGES column family

2013/01/12.msisdn1 {VOICE: msisdn2: 1:uuid1} {VOICE: msisdn2: 2:uuid2} OUT OUT

2013/01/12.msisdn2 {VOICE: msisdn1: 1:uuid1} {VOICE: msisdn1: 2:uuid2} IN IN

uuid started numA numB Event type

subsciber

uuid1 2013/01/12 msisdn1 msisdn2 VOICE 1

uuid started numA numB Event type

subsciber

uuid2 2013/01/12 msisdn1 msisdn2 VOICE 1

#CASSANDRAEU

CASSANDRASUMMITEU

*  Loading speed *  Query speed *  Fun

Summary

Write,  rec/sec  

Oracle 10 000 Cassandra 5 000 per node

seconds  

Oracle > 1, depends on plans, statistics etc. Cassandra < 1, stable

#CASSANDRAEU

CASSANDRASUMMITEU

#CASSANDRAEU

CASSANDRASUMMITEU

*  How to model "traditional" data *  How to add full-text search to intensive data load *  A couple of hardware issues

What we discussed today…

#CASSANDRAEU

CASSANDRASUMMITEU

THANK YOU

pavel.pontryagin@billing.ru

Pavel Pontryagin, senior developer

top related