Cassandra Drivers


Upload: tyler-hobbs

Post on 25-Jun-2015


Category:

Technology


TRANSCRIPT

1. Cassandra Native Protocol Drivers
   Tyler Hobbs, C* and C* driver engineer

2. About Me
   • Thrift drivers: pycassa, phpcassa, telephus, and others
   • DataStax python driver (native protocol)
   • Cassandra engineer

3. Thrift Drivers
   • RPC framework, machine generated

4. Thrift Drivers: Problems?
   • Backwards & forwards compatibility
   • Too many connections
   • No standard interface
   • Thrift overhead
   • Cluster state must be polled

5. Problems & Solutions: Backwards/Forwards Compatibility
   • Possible with Thrift, but easier with a query language (CQL)
   • Separately versioned query language and protocol

6. Problems & Solutions: Too Many Connections
   • Operation pipelining with the native protocol

7. Problems & Solutions: Operation Pipelining
   • 127 in-flight ops per connection
   • Improves throughput (not latency)
   • Out-of-order processing
   • Async, event-loop driven

8. Problems & Solutions: No Standard Interface
   • Query language
   • Standard policies for load balancing, connection management, and retries/failure handling
   • More similar to standard RDBMS drivers

9. Problems & Solutions: Thrift Overhead
   • Custom protocol, prepared statements

10. Problems & Solutions: Cluster State Must be Polled
   • Control connection, register for pushed notifications

11. New Driver API: Sync/Async Operations
      result = session.execute("SELECT * FROM foo")

12. New Driver API: Sync/Async Operations
      future = session.execute_async("SELECT * FROM foo")
      result = future.result()

13. New Driver API: Sync/Async Operations
      session.execute_async(query).add_callbacks(
          callback=process_data, errback=log_error
      )

14. New Driver Architecture: Connection Pooling
   • Min/max conns per remote, local nodes
   • Use least busy conn
   • Open and close conns as needed

15. New Driver Architecture: What happens during Operations?
   • Nodes to query are picked by LoadBalancingPolicy
   • Failures are handled by RetryPolicy
   • On errors, nodes are marked down by ConvictionPolicy

16. New Driver Architecture: Load Balancing Policies
   • RoundRobin
   • DcAwareRoundRobin
   • TokenAware wrapper
   • Custom

17. New Driver Architecture: What happens during Operations?
   • Nodes to query are picked by LoadBalancingPolicy
   • Failures are handled by RetryPolicy
   • On errors, nodes are marked down by ConvictionPolicy

18. New Driver Architecture: Retry Policies
   Inputs to the decision:
   • Operation type
   • Consistency level
   • Number (and type) of responses
   • Type of failure
   Possible outcomes:
   • Retry, raise error, or ignore error

19. New Driver Architecture: What happens during Operations?
   • Nodes to query are picked by LoadBalancingPolicy
   • Failures are handled by RetryPolicy
   • On errors, nodes are marked down by ConvictionPolicy, reconnect with ReconnectionPolicy

20. New Driver Architecture: Reconnection Policy
   • Schedule for attempting reconnects to down nodes
   • Constant and Exponential backoff

21. New Driver Architecture: Policy Defaults
   • RoundRobin load balancing (not token or DC aware)
   • Retry at most once (in a small number of cases)
   • Mark node down after one failure
   • Exponential backoff on reconnection attempts

22. New Driver Architecture: Prepared Statements
   • Prepared against all nodes
   • Cache
   • Re-preparation
      prepared = session.prepare("SELECT foo FROM bar WHERE id=?")
      result = session.execute(prepared, [user_id1])

23. New Driver Architecture: Control Connection
   • Listens for pushed updates to cluster state and schema
   • Marks nodes up and down
   • Auto discovers nodes in cluster
   • Updates schema metadata

24. New Driver Architecture: Metrics
   • Count timeouts, connection errors, and other errors
   • Open connection stats
   • Operation latency histogram

25. New Driver Architecture: Cursors
   • No more manual paging over large queries
   • Works across multiple nodes
   • Paging state provided by client

26. New Driver Architecture: Quick Python Benchmark
   Setup: 3 nodes (local, ccm), one conn per host; 50k individual inserts, single threaded
   • Pycassa: ~1200 ops/sec
   • DataStax python driver (sync, blocking): ~950 ops/sec
   • DataStax python driver (future batching): ~3000 ops/sec
   • DataStax python driver (callback chaining): ~7300 ops/sec

27. New Driver Architecture: Languages Supported
   • Java: 1.0 released in Spring 2013; simple object mapper under development
   • C#: 1.0 released in Summer 2013; LINQ integration
   • Python: beta since Summer 2013, 1.0 coming soon; basic mapper available through cqlengine
   • C++: currently in alpha state
   • Ruby, JS, PHP: planned, but no development so far

28. New Driver Architecture: Languages Supported
   • github.com/datastax

29. Questions?
   • @tylhobbs
   • thobbs on #cassandra, #datastaxdrivers
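
Editor's sketch for slides 16, 20, and 21: the slides name RoundRobin load balancing and exponential backoff on reconnection but do not show what either does. The following is a minimal, standard-library-only illustration of both ideas. The function names (`round_robin_plan`, `exponential_schedule`), the host addresses, and the delay values are all hypothetical and chosen for illustration; this is not the DataStax driver's API, which ships its own policy classes.

```python
from itertools import islice


def round_robin_plan(hosts, start):
    """Yield every host exactly once, beginning at offset `start`.

    Hypothetical helper: a query plan is the ordered list of nodes
    the client would try for one operation.
    """
    n = len(hosts)
    for i in range(n):
        yield hosts[(start + i) % n]


def exponential_schedule(base_delay, max_delay):
    """Yield reconnect delays in seconds: base, 2*base, 4*base, ...

    Each delay is capped at `max_delay`. Hypothetical helper.
    """
    delay = base_delay
    while True:
        yield min(delay, max_delay)
        delay = min(delay * 2, max_delay)


hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # made-up addresses

# Successive operations start from successive offsets, so the first
# node tried rotates across the cluster.
plans = [list(round_robin_plan(hosts, start)) for start in range(3)]

# First six reconnect delays with a 1 s base and a 30 s cap
# (illustrative values, not driver defaults).
delays = list(islice(exponential_schedule(1.0, 30.0), 6))
```

A constant-backoff schedule, the other option on slide 20, would simply yield the same delay on every attempt.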