Always On: Building Highly Available Applications on Cassandra
Robbie Strickland

Posted on 27-Jan-2017
TRANSCRIPT

Page 1: Always On: Building Highly Available Applications on Cassandra

Always On: Building Highly Available Applications on Cassandra

Robbie Strickland

Page 2:

Who Am I?

Robbie Strickland
VP, Software Engineering
[email protected]
@rs_atl
An IBM Business

Page 3:

Who Am I?
• Contributor to C* community since 2010
• DataStax MVP 2014/15/16
• Author, Cassandra High Availability & Cassandra 3.x High Availability
• Founder, ATL Cassandra User Group

Page 4:

What is HA?
• Five nines – 99.999% uptime?
– That's only about 5 minutes of down time per year
– Even three nines (99.9%) is roughly 9 hours per year… a full work day of down time!
• Can we do better?

Page 5:

Cassandra + HA
• No SPOF
• Multi-DC replication
• Incremental backups
• Client-side failure handling
• Server-side failure handling
• Lots of JMX stats

Page 6:

HA by Design (it's not an add-on)
• Properly designed topology
• Data model that respects C* architecture
• Application that handles failure
• Monitoring strategy with early warning
• DevOps mentality

Page 7:

Table Stakes
• NetworkTopologyStrategy
• GossipingPropertyFileSnitch
– Or [YourCloud]Snitch
• At least 5 nodes
• RF=3
• No load balancer

Page 8:

HA Topology

Page 9:

Consistency Basics
• Start with LOCAL_QUORUM reads & writes
– Balances performance & availability, and provides single-DC full consistency
– Experiment with eventual consistency (e.g. CL=ONE) in a controlled environment
• Avoid non-local CLs in multi-DC environments
– Otherwise it's a crap shoot
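The quorum arithmetic behind this advice can be sketched in a few lines (plain Java, independent of any driver; the helper names are mine, not from the deck):

```java
public class QuorumMath {
    // A quorum is a majority of replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // How many replicas can be down while LOCAL_QUORUM still succeeds.
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        // With the recommended RF=3, quorum is 2, so one replica may fail.
        System.out.println("RF=3: quorum=" + quorum(3)
                + ", tolerates " + tolerableFailures(3) + " down replica(s)");
    }
}
```

With RF=3 this is why LOCAL_QUORUM survives one down replica per DC, but not two.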

Page 10:

Rack Failure
• Don't put all your nodes in one rack!
• Use rack awareness
– Places replicas in different racks
• But don't use RackAwareSnitch

Page 11:

Rack Awareness

[Diagram: replicas R1, R2, R3 spread across Rack A and Rack B]

Page 12:

Rack Awareness

[Diagram: replicas R1, R2, R3 across Rack A and Rack B, each node configured via GossipingPropertyFileSnitch]

cassandra-rackdc.properties (Rack A node):
dc=dc1
rack=a

cassandra-rackdc.properties (Rack B node):
dc=dc1
rack=b

Page 13:

Rack Awareness (Cloud Edition)

[Diagram: replicas R1, R2, R3 spread across Availability Zone A and Availability Zone B]

[YourCloud]Snitch (it's automagic!)

Page 14:

Data Center Replication

[Diagram: replication between dc=us-1 and dc=eu-1]

Page 15:

Data Center Replication

CREATE KEYSPACE myKeyspace
WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'us-1': 3,
  'eu-1': 3
};

Page 16:

Multi-DC Consistency?

[Diagram: dc=us-1 and dc=eu-1; assumption: LOCAL_QUORUM]

Page 17:

Multi-DC Consistency?

[Diagram: under LOCAL_QUORUM, each DC is individually fully consistent]

Page 18:

Multi-DC Consistency?

[Diagram: each DC fully consistent – but what about between the DCs?]

Page 19:

Multi-DC Consistency?

[Diagram: fully consistent within each DC, eventually consistent across DCs]

Page 20:

Multi-DC Routing with LOCAL CL

[Diagram: each client app routes only to its local DC – us-1 or eu-1]


Page 22:

Multi-DC Routing with non-LOCAL CL

[Diagram: client app requests cross between us-1 and eu-1]


Page 24:

Multi-DC Routing
• Use DCAwareRoundRobinPolicy wrapped by TokenAwarePolicy
– This is the default
– Prefers local DC – chosen based on host distance and seed list
– BUT this can fail for logical DCs that are physically co-located, or for improperly defined seed lists!

Page 25:

Multi-DC Routing

Pro tip – be explicit!

val localDC = // get from config
val dcPolicy =
  new TokenAwarePolicy(
    DCAwareRoundRobinPolicy.builder()
      .withLocalDc(localDC)
      .build())

Page 26:

Handling DC Failure
• Make sure backup DC has sufficient capacity
– Don't try to add capacity on the fly!
• Try to limit updates
– Avoids potential consistency issues on recovery
• Be careful with retry logic
– Isolate it to a single point in the stack
– Don't DDoS yourself with retries!
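One way to keep retries from becoming a self-inflicted DDoS is to cap attempts and back off exponentially, in a single shared layer of the stack. A minimal sketch (plain Java; the method names and delay values are illustrative, not from the deck):

```java
import java.util.concurrent.Callable;

public class BoundedRetry {
    // Retry a query a fixed number of times with exponential backoff.
    // Capping attempts (and centralizing this in one layer) avoids
    // retry storms hammering an already-degraded cluster.
    static <T> T withRetries(Callable<T> query, int maxAttempts, long baseDelayMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return query.call();
            } catch (Exception e) {
                last = e;
                Thread.sleep(baseDelayMs << attempt); // e.g. 100ms, 200ms, 400ms…
            }
        }
        throw last; // give up and surface the failure instead of retrying forever
    }
}
```

Whatever the exact policy, the point is that it lives in exactly one place, so a DC failover doesn't multiply retries at every layer.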

Page 27:

Topology Lessons
• Leverage rack awareness
• Use LOCAL_QUORUM
– Full local consistency
– Eventual consistency across DCs
• Run incremental repairs to maintain inter-DC consistency
• Explicitly route local app to local C* DC
• Plan for DC failure

Page 28:

Data Modeling

Page 29:

Quick Primer
• C* is a distributed hash table
– Partition key (first field in PK declaration) determines placement in the cluster
– Efficient queries MUST know the key!
• Data for a given partition is naturally sorted based on clustering columns
• Column range scans are efficient

Page 30:

Quick Primer
• All writes are immutable
– Deletes create tombstones
– Updates do not immediately purge old data
– Compaction has to sort all this out

Page 31:

Who Cares?
• Bad performance = application downtime & lost users
• Lagging compaction is an operations nightmare
• Some models & query patterns create serious availability problems

Page 32:

Do
• Choose a partition key that distributes evenly
• Model your data based on common read patterns
• Denormalize using collections & materialized views
• Use efficient single-partition range queries

Page 33:

Don't
• Create hot spots in either data or traffic patterns
• Build a relational data model
• Create an application-side join
• Run multi-node queries
• Use batches to group unrelated writes

Page 34:

Problem Case #1

SELECT *
FROM contacts
WHERE id IN (1,3,5,7,9)

Page 35:

Problem Case #1

SELECT *
FROM contacts
WHERE id IN (1,3,5,7)

[Diagram: keys 1, 3, 5 and 7 replicated across a 6-node cluster]

Must ask 4 out of 6 nodes in the cluster to satisfy quorum!

Page 36:

Problem Case #1

SELECT *
FROM contacts
WHERE id IN (1,3,5,7)

[Diagram: same cluster with two nodes down]

"Not enough replicas available for query at consistency LOCAL_QUORUM"

1, 3, and 5 all have sufficient replicas, yet the entire query fails because of 7

Page 37:

Solution #1
• Option 1: Be optimistic and run it anyway
– If it fails, you can fall back to option 2
• Option 2: Run parallel queries for each key
– Return the results that are available
– Fall back to CL ONE for failed keys
– Client token awareness means coordinator does less work
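Option 2 might look like the following sketch: one query per key run in parallel, keeping whatever succeeds. The lookup here is a stand-in map rather than a real driver call (in a real app it would be an async SELECT … WHERE id = ? at LOCAL_QUORUM), so the shape is the point, not the API:

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelKeyFetch {
    // Hypothetical per-key lookup: try LOCAL_QUORUM data first,
    // then fall back to what a CL=ONE read would return.
    static String fetchOne(int id, Map<Integer, String> quorumData,
                           Map<Integer, String> oneData) {
        String v = quorumData.get(id);           // LOCAL_QUORUM attempt
        return v != null ? v : oneData.get(id);  // fall back to CL ONE
    }

    // Run one query per key in parallel and return whatever is available,
    // instead of letting a single unavailable key fail the whole IN query.
    static Map<Integer, String> fetchAll(List<Integer> ids,
                                         Map<Integer, String> quorumData,
                                         Map<Integer, String> oneData) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Map<Integer, Future<String>> futures = new LinkedHashMap<>();
            for (int id : ids) {
                futures.put(id, pool.submit(() -> fetchOne(id, quorumData, oneData)));
            }
            Map<Integer, String> results = new LinkedHashMap<>();
            futures.forEach((id, f) -> {
                try {
                    String v = f.get();
                    if (v != null) results.put(id, v);  // keep what succeeded
                } catch (Exception dropped) {
                    // this key is unavailable; return the rest anyway
                }
            });
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a token-aware client, each of these single-key queries also goes straight to a replica, so the coordinator does less work than with the original IN query.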

Page 38:

Problem Case #2

CREATE INDEX ON contacts(birth_year)

SELECT *
FROM contacts
WHERE birth_year=1975

Page 39:

Problem Case #2

SELECT *
FROM contacts
WHERE birth_year=1975

[Diagram: 1975 index entries (Jim, Sue, Sam, Tim) scattered across the cluster]

The index lives with the source data… so 5 nodes must be queried!

Page 40:

Problem Case #2

SELECT *
FROM contacts
WHERE birth_year=1975

[Diagram: same cluster with two nodes down]

"Not enough replicas available for query at consistency LOCAL_QUORUM"

The index lives with the source data… so 5 nodes must be queried!

Page 41:

Solution #2
• Option 1: Build your own index
– App has to maintain the index
• Option 2: Use a materialized view
– Not available before 3.0
• Option 3: Run it anyway
– OK for small amounts of data (think 10s to 100s of rows) that can live in memory
– Good for parallel analytics jobs (Spark, Hadoop, etc.)
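For option 2, a materialized view inverting the contacts lookup could look roughly like this (a sketch, assuming Cassandra 3.0+ and that contacts has partition key id; the view name is mine, not from the deck):

```sql
CREATE MATERIALIZED VIEW contacts_by_birth_year AS
  SELECT * FROM contacts
  WHERE birth_year IS NOT NULL AND id IS NOT NULL
  PRIMARY KEY (birth_year, id);
```

The view is maintained server-side, so a read by birth_year hits a single partition instead of fanning out to every node the way the secondary index does.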

Page 42:

Problem Case #3

CREATE TABLE sensor_readings (
  sensorID uuid,
  timestamp int,
  reading decimal,
  PRIMARY KEY (sensorID, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

Page 43:

Problem Case #3
• Partition will grow unbounded
– i.e. it creates wide rows
• Unsustainable number of columns in each partition
• No way to archive off old data

Page 44:

Solution #3

CREATE TABLE sensor_readings (
  sensorID uuid,
  time_bucket int,
  timestamp int,
  reading decimal,
  PRIMARY KEY ((sensorID, time_bucket), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
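The time_bucket column is just a deterministic function of the timestamp, computed the same way on every read and write. A sketch of a day-granularity bucket (plain Java; the bucket size is an assumption to tune so partitions stay bounded):

```java
public class TimeBucket {
    static final int SECONDS_PER_DAY = 86_400;

    // Map an epoch-seconds timestamp to a day number, so each
    // (sensorID, time_bucket) partition holds at most one day of readings.
    static int dayBucket(int epochSeconds) {
        return epochSeconds / SECONDS_PER_DAY;
    }
}
```

Because both reads and writes derive the same bucket, queries still hit a single partition, and old buckets can be archived or dropped wholesale.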

Page 45:

Monitoring

Page 46:

Monitoring Basics
• Enable remote JMX
• Connect a stats collector (jmxtrans, collectd, etc.)
• Use nodetool for quick single-node queries
• C* tells you pretty much everything via JMX

Page 47:

Thread Pools
• C* is a SEDA architecture
– Essentially message queues feeding thread pools
– nodetool tpstats
• Pending messages are bad:

Pool Name             Active  Pending   Completed  Blocked  All time blocked
CounterMutationStage  0       0         0          0        0
ReadStage             0       0         103        0        0
RequestResponseStage  0       0         0          0        0
MutationStage         0       13234794  0          0        0

Page 48:

Lagging Compaction
• Lagging compaction is the reason for many performance issues
• Reads can grind to a halt in the worst case
• Use nodetool tablestats/cfstats & compactionstats

Page 49:

Lagging Compaction
• Size-Tiered: watch for high SSTable counts:

Keyspace: my_keyspace
  Read Count: 11207
  Read Latency: 0.047931114482020164 ms.
  Write Count: 17598
  Write Latency: 0.053502954881236506 ms.
  Pending Flushes: 0
  Table: my_table
    SSTable count: 84

Page 50:

Lagging Compaction
• Leveled: watch for SSTables remaining in L0:

Keyspace: my_keyspace
  Read Count: 11207
  Read Latency: 0.047931114482020164 ms.
  Write Count: 17598
  Write Latency: 0.053502954881236506 ms.
  Pending Flushes: 0
  Table: my_table
    SSTable count: 70
    SSTables in each level: [50/4, 15/10, 5/100]

50 in L0 (should be 4)

Page 51:

Lagging Compaction Solution
• Triage:
– Check stats history to see if it's a trend or a blip
– Increase compaction throughput using nodetool setcompactionthroughput
– Temporarily switch to SizeTiered
• Do some digging:
– I/O problem?
– Add nodes?

Page 52:

Wide Rows / Hotspots
• Only takes one to wreak havoc
• It's a data model problem
• Early detection is key!
• Watch partition max bytes
– Make sure it doesn't grow unbounded
– … or become significantly larger than mean bytes

Page 53:

Wide Rows / Hotspots
• Use nodetool toppartitions to sample reads/writes and find the offending partition
• Take action early to avoid OOM issues with:
– Compaction
– Streaming
– Reads

Page 54:

For More Info…

(shameless book plug)

Page 55:

Thanks!

Robbie Strickland
[email protected]
@rs_atl
An IBM Business