building antifragile applications with apache cassandra

37
@PatrickMcFadin Patrick McFadin Solutions Architect, DataStax Building Antifragile Applications with Apache Cassandra 1 Wednesday, August 21, 13

Upload: patrick-mcfadin

Post on 06-May-2015

2.326 views

Category:

Technology


6 download

DESCRIPTION

Even with the best infrastructure, failures will occur without warning and are almost guaranteed. Building applications that can resist this fact of life can be both art and science. In this talk, I'll try to eliminate the art portion and focus more on the science. Starting at high level architecture decisions, I will take you through each layer and finally down to actual application code. Using Cassandra as the back end database, we can build layers of fault tolerance that will leave end users completely unaware of the underlying chaos that could be occurring. With a little planning, we can say goodbye to the Fail Whale and the fragility of the traditional RDBMS. Topics will include: - Application strategies to utilize active-active, diverse, datacenters - Replicating data with the highest integrity and maximum resilience - Utilizing Cassandra's built-in fault tolerance - Architecture of private, cloud or hybrid based applications - Application driver techniques when using Cassandra

TRANSCRIPT

Page 1: Building Antifragile Applications with Apache Cassandra

@PatrickMcFadin

Patrick McFadinSolutions Architect, DataStax

Building Antifragile Applications with Apache Cassandra

1Wednesday, August 21, 13

Page 2: Building Antifragile Applications with Apache Cassandra

Who I am

2

• Patrick McFadin• Solution Architect at DataStax• Cassandra MVP• User for years• Follow me for more:

I talk about Cassandra and building scalable, resilient apps ALL THE TIME!

@PatrickMcFadin

Wednesday, August 21, 13

Page 3: Building Antifragile Applications with Apache Cassandra

Background - Why are we doing this?

•We live in an always-on society• Data driven applications rule the day (for now)• Failure as reality

3

•Mike Christian-Yahoo! Director of Engineering, Infrastructure Resilience• Frying Squirrels and unspun gyros

Wednesday, August 21, 13

Page 4: Building Antifragile Applications with Apache Cassandra

Background - Antifragile as a practice• Antifragile: Things That Gain From Disorder• Nassim Nicholas Taleb• Things that get better with a little chaos

4

• Jesse Robbins• Master of Disaster at Amazon

• Bringer of “Game Day”

Wednesday, August 21, 13

Page 5: Building Antifragile Applications with Apache Cassandra

Background - Distributed in a global economy

• Closer to your users == happy users• Latency is just physics• Best chance of light in fibre from US East to US West? 20ms

5

From To Latency

New York London 75.07

New York Rio De Janeiro 110.28

San Francisco Tokyo 97.33

Singapore Los Angeles 183.19

Tokyo London 242.88

Wednesday, August 21, 13

Page 6: Building Antifragile Applications with Apache Cassandra

Challenges - When it was just HTTP

• Roaring 90s. Just static HTTP • Some of it was data driven, but most not• One awesome web server

6Wednesday, August 21, 13

Page 7: Building Antifragile Applications with Apache Cassandra

Challenges - More than one web server?

• Pentium 2 web server wasn’t going to do it.•More than one web server? Wow• Distribute the HTML and then...?

7Wednesday, August 21, 13

Page 8: Building Antifragile Applications with Apache Cassandra

Challenges - Spreading the load

• Clients have to find your now spread content• Round Robin DNS to the rescue!• Crazy router hacks• Hardware based load balancers• Resilient to failure

8Wednesday, August 21, 13

Page 9: Building Antifragile Applications with Apache Cassandra

Challenges - Content Distribution Networks

• Cat pictures suck bandwidth• Cat videos suck even MORE bandwidth• User experience depends on response time• Content closer to users? Sweet

9Source: http://www.paulund.co.uk/content-delivery-network-review

Wednesday, August 21, 13

Page 10: Building Antifragile Applications with Apache Cassandra

Challenges - What about data?

• Databases have been designed as singletons (ACID)•Master - Slave replication dominates• Sharding - No more joins• Distributed replication is hard if not impossible

10

Panic!!

Wednesday, August 21, 13

Page 11: Building Antifragile Applications with Apache Cassandra

Challenges - Facebook and MySQL

• Heavily sharded but in one DC• Needed a second data center• 2007 opted for slave DBs• Re-wrote query parser and “hijacked the MySQL replication stream”• Transparent to application

11

Ref: https://www.facebook.com/note.php?note_id=23844338919&id=9445547199&index=0

Wednesday, August 21, 13

Page 12: Building Antifragile Applications with Apache Cassandra

Challenges - Finally some science!

• 2007 - Amazon and the Dynamo paper• Attempted to answer:• How do we distribute our data?• How do we maximize uptime?• How can we keep our data safe?

• Consolidated distributed database science• 24 research papers cited• Almost 30 years of distributed computing thought

12

http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

Wednesday, August 21, 13

Page 13: Building Antifragile Applications with Apache Cassandra

So now what?

13

Let’s put it all together

•More than one server - Uptime and Scaling• Get closer to users - Reduce latency.• Transparent - Make it a natural part of the system.

Wednesday, August 21, 13

Page 14: Building Antifragile Applications with Apache Cassandra

Techniques - Step one

• Control your traffic

14Wednesday, August 21, 13

Page 15: Building Antifragile Applications with Apache Cassandra

Techniques - Hardware traffic control

• F5 GTM - Global Traffic Manager• A10 Thunder• Citrix Netscaler• Cisco ACE• Barracuda ADC

15

Source: http://www.f5.com

Wednesday, August 21, 13

Page 16: Building Antifragile Applications with Apache Cassandra

Techniques - Service based control

• DYN - Active failover service• Akamai - Global Traffic Management• Amazon- Route 53

16Wednesday, August 21, 13

Page 17: Building Antifragile Applications with Apache Cassandra

Techniques - DIY control

• Client level awareness• Client chooses which path to take

17

http://imaginethefutur.blogspot.com/2012/09/javascript-load-balancer-using-cookies.html

A good example

Wednesday, August 21, 13

Page 18: Building Antifragile Applications with Apache Cassandra

Techniques - Step 2

•Make your app ok with being part of a team• Fail fast - Short circuits• Transactions are expensive• Partial or eventual data is ok

18

Great discussion on this topic:

http://www.planetcassandra.org/blog/post/a-netflix-experiment-eventual-consistency-hopeful-consistency-by-christos-kalantzis

Wednesday, August 21, 13

Page 19: Building Antifragile Applications with Apache Cassandra

Techniques - Application Architecture

• Embrace the pod (or application units)• Each stands alone

19

East Coast10000 Customers

West coast10000 Customers

Next deployable unit10000 Customers

Wednesday, August 21, 13

Page 20: Building Antifragile Applications with Apache Cassandra

Techniques - State management

• State in the browser?• State in the data layer?• State in both?

20

No!

Wednesday, August 21, 13

Page 21: Building Antifragile Applications with Apache Cassandra

Techniques - Step 3

•Make your persistence layer resilient• If your app layer can fail, then why not the DB?•Master-Master - Less complexity• Of course I’m talking about Cassandra

21

Same data. Fully replicated.

Wednesday, August 21, 13

Page 22: Building Antifragile Applications with Apache Cassandra

Techniques - Step 4

• Test!! Test!! Test!!

22

Know these guys?

• Constantly breaking things• Chaos Monkey - Shut down random services• Always be failing so it’s normal• Read this: http://queue.acm.org/detail.cfm?

id=2499552

Wednesday, August 21, 13

Page 23: Building Antifragile Applications with Apache Cassandra

Cassandra - Intro

• Based on Amazon Dynamo and Google BigTable paper• Shared nothing• Data safe as possible• Predictable scaling

23

Dynamo

BigTable

Wednesday, August 21, 13

Page 24: Building Antifragile Applications with Apache Cassandra

Cassandra - More than one server

• All nodes participate in a cluster• Shared nothing• Add or remove as needed•More capacity? Add a server

24Wednesday, August 21, 13

Page 25: Building Antifragile Applications with Apache Cassandra

Cassandra - Locally Distributed

• Client writes to any node• Node coordinates with others• Data replicated in parallel• Replication factor: How many

copies of your data?• RF = 3 here

25Wednesday, August 21, 13

Page 26: Building Antifragile Applications with Apache Cassandra

Cassandra - Geographically Distributed

• Client writes local• Data syncs across WAN• Replication Factor per DC

26Wednesday, August 21, 13

Page 27: Building Antifragile Applications with Apache Cassandra

Cassandra - Consistency

• Consistency Level (CL)• Client specifies per read or write

27

• ALL = All replicas ack• QUORUM = > 51% of replicas ack• LOCAL_QUORUM = > 51% in local DC ack• ONE = Only one replica acks

Wednesday, August 21, 13

Page 28: Building Antifragile Applications with Apache Cassandra

Cassandra - Transparent to the application

• A single node failure shouldn’t bring failure• Replication Factor + Consistency Level = Success• This example:• RF = 3• CL = QUORUM

28

>51% Ack so we are good!

Wednesday, August 21, 13

Page 29: Building Antifragile Applications with Apache Cassandra

Cassandra Applications - Drivers

• DataStax Drives for Cassandra• Java• C#• Python•more on the way

29Wednesday, August 21, 13

Page 30: Building Antifragile Applications with Apache Cassandra

Cassandra Applications - Connecting

• Create a pool of local servers• Client just uses session to interact with Cassandra

30

contactPoints = {“10.0.0.1”,”10.0.0.2”}

keyspace = “videodb”

public VideoDbBasicImpl(List<String> contactPoints, String keyspace) {

cluster = Cluster .builder() .addContactPoints(! contactPoints.toArray(new String[contactPoints.size()])) .withLoadBalancingPolicy(Policies.defaultLoadBalancingPolicy()) .withRetryPolicy(Policies.defaultRetryPolicy()) .build();

session = cluster.connect(keyspace); }

Wednesday, August 21, 13

Page 31: Building Antifragile Applications with Apache Cassandra

Cassandra Applications - Load balancing• Token aware - Request sent to primary node with data• Calls can be asynchronous and in parallel

31

1

23

45

6Client

Thread

Node

Node

Node

ClientThread

ClientThread

Node

Driver

Wednesday, August 21, 13

Page 32: Building Antifragile Applications with Apache Cassandra

Cassandra Applications - Fault tolerance

• Try first with a Consistency Level of QUORUM• If fails, retry with Consistency Level ONE

32

Client Node

Node Replica

Replica

NodeReplica

Wednesday, August 21, 13

Page 33: Building Antifragile Applications with Apache Cassandra

Application Example - Layout

• Active-Active• Service based DNS routing

33

Cassandra Replication

Wednesday, August 21, 13

Page 34: Building Antifragile Applications with Apache Cassandra

Application Example - Uptime

34

• Normal server maintenance• Application is unaware

Cassandra Replication

Wednesday, August 21, 13

Page 35: Building Antifragile Applications with Apache Cassandra

Application Example - Failure

35

• Data center failure• Data is safe. Route traffic.

33

Another happy user!

Wednesday, August 21, 13

Page 36: Building Antifragile Applications with Apache Cassandra

Conclusion

36

• Cassandra is THE BEST persistence tier for your application• Plan for chaos. Inject your own. • Now go write your app.

1. The Data Model is Dead, Long Live the Data Model

http://www.youtube.com/watch?v=px6U2n74q3g

2. Become a Super Modeler

http://www.youtube.com/watch?v=qphhxujn5Es

3. The World's Next Top Data Model

http://www.youtube.com/watch?v=HdJlsOZVGwM

Data Modeling!

https://github.com/pmcfadin/cql3-videodb-example

Example code

Wednesday, August 21, 13

Page 37: Building Antifragile Applications with Apache Cassandra

Thank you!

37

CALL FOR PAPERSSPONSORSHIP 30+ SessionsTWO DAYS TRAINING DAYCALL FOR PAPERS

SPONSORSHIP OPPORTUNITY

TWO DAYS30+ SESSIONS

TRAINING DAY

Oh yeah. We are hiring.

Q&A?

Wednesday, August 21, 13