Transcript
Page 1: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Performance tuning - A key to successful Cassandra migration

Page 2: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

1.0 Abstract

2.0 Dominance of traditional RDBMS and Adoption of NoSQL

3.0 DataStax Cassandra – ‘The Visionary’

4.1 Our journey through Cassandra optimization: Data Model

4.2 Our journey through Cassandra optimization: Integration

4.3 Our journey through Cassandra optimization: DB Parameters

5.0 The only thing constant is change

6.0 Performance tuning - Key to success

© 2015. All Rights Reserved.

Page 3: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Abstract

In the last few years, the dominance of traditional RDBMS databases has shifted across different domains. The rapid adoption of NoSQL databases, especially Cassandra, raises the question of which major challenges arise when implementing Cassandra and how to mitigate them. We often conclude that a migration or POC (proof of concept) was not successful; however, the real flaw might lie in the data modeling, the hardware configuration, the database parameters, the chosen consistency level, and so on. There is no single model or configuration that fits all use cases and all applications. Performance tuning an application is truly an art and requires perseverance. This paper delves into the performance tuning considerations and anti-patterns that need to be addressed during a Cassandra migration / implementation, so that we can reap the benefits that make Cassandra a ‘Visionary’ in Gartner’s 2014 Magic Quadrant for Operational Database Management Systems.

Page 4: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Dominance of RDBMS and NoSQL adoption

- Storage of high volume data
- Transaction control
- Security management
- Common key concepts
- Evolved over a period
- Common construct for querying

Why don’t I check whether these databases can offer more?

- Support for clusters
- Cost
- Impedance mismatch
- Adaptability to newer workloads

Page 5: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

DataStax Cassandra – ‘The Visionary’ ……

- As per Gartner’s Magic Quadrant, DataStax Cassandra is listed as a ‘Visionary’
- The Magic Quadrant clearly calls out the differentiating factors:
  - High performance
  - In-memory options
  - Search capabilities
  - Integration with Spark and Hadoop
  - Experience in doing business with the vendor

Source: www.gartner.com

Page 6: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

…… But

- One of the major challenges listed in the Gartner Magic Quadrant analysis is poor performance during POCs

Two major pitfalls:

- POCs are conducted quick and dirty
  - No capacity planning
  - Performance tuning
- Moving to production without enough performance testing

Page 7: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Don’t be in the dark…

Have you tried all possible tuning techniques before drawing conclusions from the results?

- Data model
- Integration best practices
- Database parameters

Page 8: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Performance tuning - Key to success

- For a successful migration / implementation, due diligence needs to be done on all the different aspects:

Data Model
• Distribution
• De-normalization
• Indexing
• Query patterns

Integration
• ‘Batch’ statements
• Consistency levels
• Load balancing
• Tombstones

DB Parameters
• Hidden data
• Compaction
• Cache

Page 9: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Our journey through Cassandra optimization..

Data Model
• Distribution
• De-normalization
• Indexing
• Query patterns

Integration
• ‘Batch’ statements
• Consistency levels
• Load balancing
• Tombstones

DB Parameters
• Hidden data
• Compaction
• Cache

Page 10: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Data model

- Equal distribution of data across partitions
- De-normalization
- Redundancy of data is acceptable to cater to different read use cases
- Reduce client-side joins

Think outside the (RDBMS) box!
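As a rough sketch of de-normalization and even distribution, the same order data could be stored twice, once per read use case. Every table and column name below is hypothetical and not taken from the deck; the point is only that each query is served by one table and one partition.

```cql
-- Hypothetical schema, for illustration only.

-- Read use case 1: all orders of one customer, newest first.
CREATE TABLE orders_by_customer (
    cust_id   text,
    order_id  timeuuid,
    store_id  text,
    total     decimal,
    PRIMARY KEY (cust_id, order_id)
) WITH CLUSTERING ORDER BY (order_id DESC);

-- Read use case 2: all orders of one store on one day.
-- The composite partition key (store_id, order_day) keeps a busy store
-- from growing a single, ever-larger partition.
CREATE TABLE orders_by_store_day (
    store_id  text,
    order_day text,       -- e.g. '2015-06-30'
    order_id  timeuuid,
    cust_id   text,
    total     decimal,
    PRIMARY KEY ((store_id, order_day), order_id)
);
```

With this layout the application writes each order to both tables. The redundancy is the price for reads that need no client-side join.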

Page 11: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Data model contd..

- Limit secondary indexes
- Cluster the data based on the read pattern

CREATE TABLE cust_interaction (cust_id text, intr_id timeuuid, intr_tx text, PRIMARY KEY (cust_id, intr_id)) WITH CLUSTERING ORDER BY (intr_id DESC);

A table / CF (column family) that supports reads of the most recent customer interactions
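A read against this table might look like the sketch below. The customer id 'C1001' is a made-up value. Because the rows are already clustered by intr_id in descending order, the query needs no ORDER BY to return the newest interactions first.

```cql
-- Ten most recent interactions of one customer ('C1001' is a made-up id).
-- The partition key restricts the read to a single partition, and the
-- DESC clustering order means the first rows on disk are the newest ones.
SELECT intr_id, intr_tx
FROM cust_interaction
WHERE cust_id = 'C1001'
LIMIT 10;
```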

Page 12: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Our journey through Cassandra optimization..

Data Model
• Distribution
• De-normalization
• Indexing
• Query patterns

Integration
• ‘Batch’ statements
• Consistency levels
• Load balancing
• Tombstones

DB Parameters
• Hidden data
• Compaction
• Cache

Page 13: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

‘Batch’ is not for performance improvement

- Batching statements can really harm performance
- Use individual inserts wherever possible

[Figure: two diagrams of nodes N1 to N6, labeled "Individual Inserts" and "Batch Inserts"]
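The difference can be sketched with the cust_interaction table from the data model section. The customer ids and texts are invented values. The batch below spans three partitions, so a single coordinator has to fan the writes out to the replicas of each one. The individual inserts can each be routed by the driver to a replica that owns the partition.

```cql
-- Anti-pattern: one batch touching several partitions (three cust_id values).
BEGIN UNLOGGED BATCH
  INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C1001', now(), 'call');
  INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C2002', now(), 'email');
  INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C3003', now(), 'chat');
APPLY BATCH;

-- Preferred: independent statements, typically sent asynchronously by the client.
INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C1001', now(), 'call');
INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C2002', now(), 'email');
INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C3003', now(), 'chat');
```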

Page 14: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Consistency levels

- Decide consistency levels based on
  - Workload
  - Need for immediate consistency

Considered RF = 3 (RC: read consistency, WC: write consistency)

Level | Read heavy | Write heavy | Mixed workload
High consistency (immediate) | RC: ONE; WC: ALL | RC: ALL; WC: ONE | RC: QUORUM; WC: QUORUM
Relaxed consistency | RC: ONE; WC: ONE, TWO | RC: ONE, TWO; WC: ONE | RC: ONE, TWO; WC: ONE, TWO
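The usual rule of thumb behind the high-consistency combinations is that the number of replicas read plus the number of replicas written must exceed the replication factor. The sketch below works through that arithmetic for RF = 3 in a cqlsh session. It reuses the cust_interaction table from the data model section, and the id 'C1001' is an invented value.

```cql
-- CONSISTENCY is a cqlsh shell command (not a CQL statement); it applies to
-- the requests that follow in the same session.
-- With RF = 3, QUORUM = floor(3 / 2) + 1 = 2 replicas.
CONSISTENCY QUORUM;

-- Write acknowledged by 2 replicas, read answered by 2 replicas:
-- 2 + 2 = 4 > 3, so every read overlaps at least one replica with the latest write.
INSERT INTO cust_interaction (cust_id, intr_id, intr_tx) VALUES ('C1001', now(), 'call');
SELECT intr_id, intr_tx FROM cust_interaction WHERE cust_id = 'C1001' LIMIT 1;

-- Same check for other combinations (ONE = 1, TWO = 2, ALL = 3 replicas):
--   RC ONE + WC ALL = 1 + 3 = 4 > 3   -> immediate consistency
--   RC ONE + WC ONE = 1 + 1 = 2 <= 3  -> a read may miss the latest write
--   RC ONE + WC TWO = 1 + 2 = 3 <= 3  -> a read may miss the latest write
```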

Page 15: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Load balancing strategy

- Consider topology
- Be aware of the distribution of clients / users

- TokenAwarePolicy acts as a wrapper around another policy
- With multiple data centers, the preferred approach is DCAwareRoundRobinPolicy wrapped in TokenAwarePolicy
- For single data center installations, RoundRobinPolicy wrapped in TokenAwarePolicy can be considered

Page 16: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Beware of Tombstones

- Querying data that has columns with tombstones set can bring down performance
- A tombstone is a marker in a row that indicates a delete
- Compaction removes tombstones once gc_grace_seconds has passed
- Do not insert NULL into Cassandra
- Set IGNORE_NULLS to TRUE

Image Source: www.datastax.com
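The NULL point can be shown with two writes to the cust_interaction table from the data model section. The id 'C1001' is an invented value. An explicit null is stored as a delete marker, while an omitted column is simply not written.

```cql
-- Writes a tombstone for intr_tx, even though no earlier value existed.
INSERT INTO cust_interaction (cust_id, intr_id, intr_tx)
VALUES ('C1001', now(), null);

-- Leaves intr_tx unset: nothing is stored for the column and no tombstone is created.
INSERT INTO cust_interaction (cust_id, intr_id)
VALUES ('C1001', now());
```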

Page 17: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Our journey through Cassandra optimization..

Data Model
• Distribution
• De-normalization
• Indexing
• Query patterns

Integration
• ‘Batch’ statements
• Consistency levels
• Load balancing
• Tombstones

DB Parameters
• Hidden data
• Compaction
• Cache

Page 18: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Watch for hidden data

- TTL and gc_grace_seconds go hand in hand
- Even after data is deleted (a tombstone is set), it still occupies space until gc_grace_seconds has passed
- Direct impact on storage and performance
- Default gc_grace_seconds is 10 days (864000 seconds)

Image Source: www.datastax.com
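A minimal sketch of how the two settings interact, using a hypothetical table for short-lived session data. The table name, columns and values are assumptions chosen for illustration, not recommendations.

```cql
-- Hypothetical table for short-lived data.
CREATE TABLE user_session (
    session_id uuid PRIMARY KEY,
    user_id    text,
    payload    text
) WITH default_time_to_live = 86400   -- rows expire 1 day after they are written
  AND gc_grace_seconds = 172800;      -- expired or deleted data is kept 2 more days
                                      -- before compaction may purge it (default: 864000)

-- A per-write TTL overrides the table default for this row only.
INSERT INTO user_session (session_id, user_id, payload)
VALUES (uuid(), 'U42', 'cart=3') USING TTL 3600;
```

With these values a row written with the default TTL can occupy disk space for up to three days, plus the wait for the next compaction. Lowering gc_grace_seconds is only safe when repairs complete more often than that window. Otherwise deleted data can reappear.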

Page 19: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Compaction

- Size Tiered Compaction
- Leveled Compaction
- Date Tiered Compaction

- Full replacement is the default
- Incremental replacement
- Anti-compaction
- Clients can read data directly from the new SSTable even before it finishes writing
- Reduce compaction I/O contention

Image Source: www.datastax.com

Page 20: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Compaction Cont...

- Default is size-tiered
- Alter the column family to change the compaction type

Image Source: www.datastax.com
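Changing the strategy is a table-level option. The statement below is a sketch against the cust_interaction table from the data model section. Choosing leveled compaction here is only an example, not a recommendation for that table.

```cql
-- Switch one table from the default size-tiered strategy to leveled compaction.
-- Existing SSTables are reorganized by later compactions, which costs I/O.
ALTER TABLE cust_interaction
WITH compaction = { 'class': 'LeveledCompactionStrategy' };

-- Switch back to the default.
ALTER TABLE cust_interaction
WITH compaction = { 'class': 'SizeTieredCompactionStrategy' };
```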

Page 21: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Compaction Cont...

- Handle time series-like data

DateTiered Compaction Strategy

Image Source: www.datastax.com
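For a time series table the strategy is set the same way as any other compaction option. The table, its columns and the option values below are assumptions for illustration. Later Cassandra releases deprecated this strategy in favor of TimeWindowCompactionStrategy, so check what the target version supports.

```cql
-- Hypothetical time series table, partitioned per sensor and day.
CREATE TABLE sensor_reading (
    sensor_id text,
    day       text,        -- e.g. '2015-06-30'
    read_at   timestamp,
    value     double,
    PRIMARY KEY ((sensor_id, day), read_at)
) WITH CLUSTERING ORDER BY (read_at DESC)
  AND compaction = {
    'class': 'DateTieredCompactionStrategy',
    'base_time_seconds': '3600',      -- size of the first time window
    'max_sstable_age_days': '30'      -- SSTables older than this are no longer compacted
  };
```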

Page 22: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Cache what you need

Cassandra read path = a lot of in-memory components. Be optimal.

Image Source: https://academy.datastax.com/

Row cache hit

- Row cache: turned OFF by default
  - Caches the complete data
  - Earlier versions used to load the whole partition
  - From 2.1, the number of rows cached per partition is configurable
  - Optimal for low volume data that is frequently accessed
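From 2.1 onward the per-partition row limit is a table option. The sketch below applies it to the cust_interaction table from the data model section. The value 10 is an arbitrary example.

```cql
-- Cassandra 2.1+ syntax: cache the partition keys and the first 10 rows of each
-- partition. With CLUSTERING ORDER BY (intr_id DESC) these are the 10 most
-- recent interactions of a customer.
ALTER TABLE cust_interaction
WITH caching = { 'keys': 'ALL', 'rows_per_partition': '10' };

-- The table option alone is not enough: the node-wide row cache also needs a
-- non-zero size (row_cache_size_in_mb in cassandra.yaml, 0 by default).
```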

Page 23: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Cache what you need contd..

Image Source: https://academy.datastax.com/

Key cache hit

- Key cache: turned ON by default
  - Caches just the key
  - Turning it OFF → increases the response time for retrieves
  - Place frequently read and sparsely read data in different CFs

No one configuration fits all. Tuning has to be iterative.
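One way to read the last point is that separate tables can carry separate cache settings. The two tables below are hypothetical and use the 2.1+ caching syntax. Whether disabling the key cache for rarely read data pays off is something to measure, not assume.

```cql
-- Hypothetical tables with the same shape but different access frequency.

-- Frequently read: keep partition keys in the key cache.
CREATE TABLE product_current (
    product_id text PRIMARY KEY,
    name       text,
    price      decimal
) WITH caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' };

-- Sparsely read: keep it out of the caches so it does not evict hot entries.
CREATE TABLE product_archive (
    product_id text PRIMARY KEY,
    name       text,
    price      decimal
) WITH caching = { 'keys': 'NONE', 'rows_per_partition': 'NONE' };
```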

Page 24: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

The only thing constant is change

2011 - 2012
- Secondary indexes
- Online schema changes
- Introduction of CQL
- Zero-downtime upgrade
- Leveled compaction

2013 - 2014
- Virtual nodes
- Inter-node communication
- Lightweight transactions
- Triggers
- Change in data and log location
- User-defined data types

2015
- Commit log compression
- JSON support
- Role-based authorization
- User-defined functions
- Windows support
- Monthly versions

Keep up with the pace. Changes can impact performance a lot.

Page 25: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Performance tuning - Key to success

Traditional DBMS world: DBA, Developer, Sys Admin
NoSQL world: Database Engineer

The boundary between different roles has blurred.

The onus is on ‘us’ to tune, tune and tune the system to make the Cassandra implementation successful!

Page 26: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Questions & Answers

Page 27: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Authors

Tiju Francis, Principal Technology Architect, Infosys Ltd

https://www.linkedin.com/in/tijufrancis

Ramkumar Nottath, Technology Architect, Infosys Ltd

https://www.linkedin.com/in/ramnottath

Arunshankar Arjunan, Technology Architect, Infosys Ltd

https://www.linkedin.com/in/arunshankararjunan

Page 28: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Thanks..

- Thanks to all the great minds who contributed towards this presentation.
  - Srivas J, Infosys Ltd
  - Srivas G, Infosys Ltd
  - Lakshman G, Infosys Ltd
  - Kiran N G, Infosys Ltd
  - Sivaram K, Infosys Ltd
  - Chethan Danivas, Infosys Ltd
  - Badrinath Narayanan, Infosys Ltd
  - Gautam Tiwari, Infosys Ltd
  - Shailesh Janrao Barde, Infosys Ltd

Page 29: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

References

- NoSQL Distilled by Pramod J. Sadalage and Martin Fowler
- https://academy.datastax.com/courses
- http://www.gartner.com/
- Mastering Apache Cassandra
- http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/
- http://www.planetcassandra.org/cassandra/
- http://jonathanhui.com/cassandra-performance-tuning-and-monitoring

Source: www.gartner.com

Page 30: Infosys Ltd: Performance Tuning - A Key to Successful Cassandra Migration

Thank you

