Cassandra Summit 2014: Performance Tuning Cassandra in AWS

© 2014 by Intellectual Reserve, Inc. All rights reserved. Performance Tuning Cassandra In AWS. Cassandra Summit 2014. Michael Nelson.

Upload: planet-cassandra

Post on 06-Dec-2014

DESCRIPTION

Presenter: Michael Nelson, Development Manager at FamilySearch. A recent research project at FamilySearch.org pushed Cassandra to very high scale and performance limits in AWS using a real application. Come see how we achieved 250K reads/sec with latencies under 5 milliseconds on a 400-core cluster holding 6 TB of data while maintaining transactional consistency for users. We'll cover tuning of Cassandra's caches, other server-side settings, the client driver, AWS cluster placement and instance types, and the tradeoffs between regular and SSD storage.

TRANSCRIPT

Page 1: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

© 2014 by Intellectual Reserve, Inc. All rights reserved.

Performance Tuning Cassandra In AWS

Cassandra Summit 2014
Michael Nelson

Page 2: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Outline

• The App: FamilySearch Family Tree
• The Test: Borland Silk Performer
• The Findings:
    • Row Cache
    • Token Aware Driver
    • Networking Issues
    • Etc.

Page 3: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

What Is FamilySearch?

• Familysearch.org Website
• Very Large Single Pedigree (Family Tree)
• Largest Collection of Free Genealogical Records
• Largest Genealogical Library
• The Church of Jesus Christ of Latter-day Saints (Mormons)

Page 4: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Why does FamilySearch exist?

Visit http://mormon.org/family-history/

Page 5: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Family Tree Data

Family Tree:
• 900M+ Person Records, Open-Edit
• 500M+ Relationships, Open-Edit
• 8.4B Change Log Entries, ~1M / day
• 7TB in Cassandra (13TB in Oracle)

• Dynamic OLTP system
• Data-dependent performance issues

Page 6: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Family Tree: Example 9 Gen Pedigree

• Up to 511 person slots
• Dynamic content!
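
The 511 figure follows from the shape of a pedigree: every person has two parents, so a chart covering g generations is a full binary tree with 2^g - 1 slots. A minimal Java sketch of that arithmetic, with class and method names that are illustrative and not taken from the talk:

```java
/** Counts person slots in a full pedigree chart (illustrative helper, not from the talk). */
final class PedigreeSlots {

    /** A full binary tree with the given number of levels has 2^levels - 1 nodes. */
    static long slots(int generations) {
        if (generations < 0 || generations > 62) {
            throw new IllegalArgumentException("generations must be between 0 and 62");
        }
        return (1L << generations) - 1;
    }

    public static void main(String[] args) {
        System.out.println(slots(9)); // prints 511
    }
}
```

Because every slot can hold an open-edit person record, a single nine-generation view may need up to 511 person reads, which helps explain why read throughput dominates the rest of the deck.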

Page 7: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Family Tree: Example Pedigree App

• 31+ persons per section
• Dynamic content!

Page 8: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Family Tree: Example Ancestor Page

• 10+ persons in families
• 100-1000+ changes
• Dynamic content!

Page 9: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Cassandra Reimplementation

• Event-Sourced Data Model – journal / views
• New Data Model – no indexes
• New Consistency Model – satisfies consistency

[Diagram labels: JE #8; P1, P1 Views (A, B); JE #6; P2, P2 Views (A, B)]

Page 10: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

77% Reads / 23% Writes

Reads:
• LOCAL_ONE
• Simple Queries

Writes:
• LOCAL_QUORUM
• Atomic Batches
• Multiple Tables
• Multiple Rows
• Business Logic
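
To make the two consistency levels concrete: a quorum is floor(RF / 2) + 1 replicas, and a read is only guaranteed to see the latest write from the consistency levels alone when read replicas + write replicas > RF. The sketch below works through that arithmetic. The replication factor of 3 is an assumption, since the slides do not state it, and the class and method names are illustrative.

```java
/** Replica-count arithmetic for consistency levels (illustrative, not from the talk). */
final class ConsistencyMath {

    /** Replicas that must acknowledge for a quorum: floor(rf / 2) + 1. */
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be at least 1");
        }
        return replicationFactor / 2 + 1;
    }

    /** True when any set of read replicas must share a node with any set of write replicas. */
    static boolean readsOverlapWrites(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;                     // ASSUMPTION: per-datacenter RF is not given in the slides
        int writeReplicas = quorum(rf); // LOCAL_QUORUM
        int readReplicas = 1;           // LOCAL_ONE
        System.out.println("write replicas = " + writeReplicas);
        System.out.println("overlap guaranteed = "
                + readsOverlapWrites(readReplicas, writeReplicas, rf));
    }
}
```

Under the assumed RF of 3, LOCAL_QUORUM writes wait for 2 replicas and LOCAL_ONE reads touch 1, so overlap is not guaranteed by the consistency levels alone. That is at least consistent with the deck attributing consistency to its new data and consistency model rather than to the read level.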

Page 11: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

A Little Optimization Goes A Long Way

28 Node Cluster
• 250,000 op/sec
• Optimized App

8 Node Cluster
• 200,000 op/sec
• Optimized App
• Row Cache
• Token Aware Driver
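
The comparison is easier to read per node. The sketch below divides the cluster-wide figures from this slide by the node counts. It assumes both clusters used comparable instances, which the slide does not say, and the names are illustrative.

```java
import java.util.Locale;

/** Per-node throughput from the cluster-wide figures on this slide (illustrative). */
final class PerNodeThroughput {

    static double opsPerNode(long clusterOpsPerSec, int nodes) {
        if (nodes <= 0) {
            throw new IllegalArgumentException("nodes must be positive");
        }
        return (double) clusterOpsPerSec / nodes;
    }

    public static void main(String[] args) {
        double before = opsPerNode(250_000, 28); // optimized app only
        double after = opsPerNode(200_000, 8);   // plus row cache and token-aware driver
        System.out.println(String.format(Locale.ROOT,
                "%.0f -> %.0f ops/sec per node (%.1fx)", before, after, after / before));
    }
}
```

That works out to roughly 8,900 versus 25,000 ops/sec per node, or about 2.8 times the work per node on less than a third of the hardware.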

Page 12: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Test System

• Cassandra (Community Ed. 2.0.5): 8 hi1.4xlarge, each with 16 CPU, 61 GB RAM, 2 TB SSD, 10 Gb net
• Family Tree App Servers (Datastax 2.0.0): 60 m2.2xlarge, each with 4 CPU, 34 GB RAM, “moderate” net
• Silk Performer Load Agents: 25 m2.xlarge, each with 2 CPU, 17 GB RAM, “moderate” net

Page 13: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

2x Throughput Increase

[Bar chart: reads and writes in op/sec (axis 0 to 200,000) for four configurations: Defaults, Row Cache, Token Aware, concurrent_reads]

Page 14: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Row Cache = 35% More Throughput

Default Key Cache:
• Cached Disk Location
• Data From Disk Cache
• ~11ms Reads

Row Cache:
• Cached Row Contents
• ~7ms Reads

Page 15: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Configuring Row Cache

cassandra.yaml:
# Maximum size of the row cache in memory.
# Default value is 0, to disable row caching.
row_cache_size_in_mb: 32768

Enable For Each Table Explicitly:
ALTER TABLE person_view WITH caching = 'ALL';

Page 16: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

90% Row Cache Hit Rate
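
Hit rate here means hits / (hits + misses): the share of reads answered from cached row contents without going to disk. The sketch below is a toy least-recently-used cache that tracks that ratio. It is a conceptual illustration only and not how Cassandra implements its row cache. The class name, the capacity, and the `loadFromDisk` callback are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy LRU cache that tracks its hit rate (conceptual only, not Cassandra's implementation). */
final class LruRowCache<K, V> {
    private final Map<K, V> rows;
    private long hits;
    private long misses;

    LruRowCache(final int capacity) {
        // accessOrder = true keeps the least recently used entry first, so it is evicted first.
        this.rows = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached row, or loads it through the callback (a stand-in for a disk read). */
    V get(K key, Function<K, V> loadFromDisk) {
        V row = rows.get(key);
        if (row != null) {
            hits++;
            return row;
        }
        misses++;
        row = loadFromDisk.apply(key);
        rows.put(key, row);
        return row;
    }

    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        LruRowCache<String, String> cache = new LruRowCache<>(2);
        for (String key : new String[] {"a", "b", "a", "c", "b"}) {
            cache.get(key, k -> "row-" + k);
        }
        System.out.println(cache.hitRate()); // one hit in five reads
    }
}
```

A hot working set that fits in the cache pushes the ratio up, which is the situation a 90% hit rate suggests for this workload.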

Page 17: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Token Aware = 50% More Throughput

Default Round Robin:
• Coordinator Middleman
• Adds Network Hops
• Load On Multiple Nodes
• ~7ms

Token Aware:
• Reads From Replicas
• No Network Hops
• ~2ms
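
The difference between the two policies can be sketched with a small simulation that counts how often the coordinator holds no replica of the requested key and so must forward the read. This is a simplified model, not the driver's code: it places a key's replicas on `key % nodes` and the following nodes instead of using real token ranges, and it assumes a replication factor of 3, which the slides do not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Counts reads that need an extra coordinator-to-replica hop (simplified model). */
final class RoutingHops {

    /** Replica node indexes for a key: a primary node plus the next rf - 1 nodes on the ring. */
    static List<Integer> replicas(int key, int nodes, int rf) {
        List<Integer> owners = new ArrayList<>();
        int primary = Math.floorMod(key, nodes);
        for (int i = 0; i < rf; i++) {
            owners.add((primary + i) % nodes);
        }
        return owners;
    }

    /** Number of requests whose coordinator is not a replica and therefore forwards the read. */
    static int forwarded(int requests, int nodes, int rf, boolean tokenAware, long seed) {
        Random rng = new Random(seed);
        int forwarded = 0;
        for (int i = 0; i < requests; i++) {
            int key = rng.nextInt(Integer.MAX_VALUE);
            List<Integer> owners = replicas(key, nodes, rf);
            // Token aware: rotate among the key's replicas. Round robin: rotate among all nodes.
            int coordinator = tokenAware ? owners.get(i % rf) : i % nodes;
            if (!owners.contains(coordinator)) {
                forwarded++;
            }
        }
        return forwarded;
    }

    public static void main(String[] args) {
        int requests = 100_000;
        int nodes = 8; // matches the 8-node test cluster
        int rf = 3;    // ASSUMPTION: replication factor is not given in the slides
        System.out.println("round robin forwarded: " + forwarded(requests, nodes, rf, false, 42L));
        System.out.println("token aware forwarded: " + forwarded(requests, nodes, rf, true, 42L));
    }
}
```

With 8 nodes and the assumed RF of 3, a round-robin coordinator is a non-replica for about 5 in 8 requests, while the token-aware choice never is. That removed hop is the "Coordinator Middleman" cost on this slide.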

Page 18: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Configuring Token Aware

Default Load Balancing Policy:
new RoundRobinPolicy()

Better:
new TokenAwarePolicy(new RoundRobinPolicy())

Page 19: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

concurrent_reads = 5% More Throughput

Defaults:
concurrent_reads: 32
concurrent_writes: 32
native_transport_max_threads: 128

Improved:
concurrent_reads: 256
concurrent_writes: 256
native_transport_max_threads: 256

Page 20: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Now Where’s The Bottleneck?

• 181,000 reads/sec; 21,000 writes/sec
• CPU = 80%
• Network = 10%
• Disk < 5%

Page 21: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Network Mystery: C* ≤ 800Mb

C* Never Exceeded 800Mb On 10Gb Network!

Page 22: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Network Mystery: Cyclic Net Queues

• About 5 Second Cycle of Net Queues Backing Up
• Client Machines Seemed OK
• Tweaking Network Stack Had No Impact:
    • net.core.wmem_max
    • net.core.rmem_max
    • net.ipv4.tcp_wmem
    • net.ipv4.tcp_rmem
    • net.core.somaxconn
    • net.core.netdev_max_backlog
    • net.ipv4.tcp_tw_recycle
    • net.ipv4.tcp_max_syn_backlog
    • net.ipv4.ip_local_port_range
    • txqueuelen

Page 23: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Network Mystery: Cyclic Net Queues

Send-Qs Back Up

Page 24: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Network Mystery: Cyclic Net Queues

Recv-Qs Back Up

Page 25: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Network Mystery: Cyclic Net Queues

Somewhat Normal – Then Starts Again

Page 26: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

2x Throughput Increase

[Bar chart: reads and writes in op/sec (axis 0 to 200,000) for four configurations: Defaults, Row Cache, Token Aware, concurrent_reads]

Page 27: Cassandra Summit 2014: Performance Tuning Cassandra in AWS

Contact Info

Michael Nelson
Development Manager
[email protected]

Thanks to FamilySearch team!
Thanks to the awesome presenters & organizers at #CassandraSummit