Cassandra in Industry
DESCRIPTION
Quick Cassandra overview given to the Dublin User Group on the 20th of January. Includes some highlight use cases.
TRANSCRIPT
Cassandra In Industry
Niall Milton, CTO, DigBigData
Overview
1. What is Apache Cassandra?
2. Who uses Apache Cassandra?
3. Real Use Cases
4. Questions
© DigBigData 2014
What is Apache Cassandra?
- Massively scalable open source NoSQL database
- Dynamic SQL-like data modeling and querying
- Master-less architecture, no single point of failure (SPOF)
- Linear scalability, high availability / reliability
- Tuneable consistency
- New core value: ease of use!
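Tuneable consistency is commonly reasoned about with one rule of thumb: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below illustrates that arithmetic only. The helper names (`quorum`, `replicas_required`, `reads_see_latest_write`) are hypothetical and are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at a given consistency level.

    Only three levels are modelled here; this is an illustrative subset.
    """
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM reads and QUORUM writes overlap (2 + 2 > 3), whereas ONE and ONE do not (1 + 1 is not greater than 3), which trades stronger consistency for lower latency.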
Who Uses Apache Cassandra?
And…
Drivers…
- Java
- .NET
- Python
- Ruby & PHP on the way
- Feeling energetic? Use the native protocol to roll your own client.
Use Case : Netflix - The Business
- 33 million members in 40 countries
- Almost 100% deployed in Amazon Cloud
- Biggest single source of internet traffic in terms of volume
- More than 1 billion videos delivered each month
- Core data services served from Cassandra since 2010
- 100s of nodes split into isolated clusters per service
- Managed and deployed via Netflix OSS
Use Case : Netflix - Why Cassandra?
- Low latency & low latency variance
- Linear scaling of reads and writes
- Each cluster uses nodes from different availability zones
- Ring is self-repairing after outages
- Supports node backups & snapshots
- They found consistency level (CL) ONE suited their needs for most use cases
- They run massive simulations in Amazon to test their assumptions re. latency, data growth etc.
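To illustrate why spreading replicas across availability zones pairs well with CL ONE, here is a deliberately simplified sketch. It is not Cassandra's actual placement algorithm and not Netflix's configuration: the node names, zone names and helper functions are all hypothetical, and the ring simply walks clockwise from a key's hash and picks the first node found in each zone not yet used.

```python
import hashlib
from bisect import bisect_right


def token(value: str) -> int:
    """Map a string to a position on the ring (illustrative hash choice)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(nodes: dict) -> list:
    """nodes maps node name -> zone; returns (token, name, zone) sorted by token."""
    return sorted((token(name), name, zone) for name, zone in nodes.items())


def replicas_for(key: str, ring: list, rf: int) -> list:
    """Walk the ring clockwise, taking one node per unused zone.

    Returns fewer than rf replicas if there are fewer than rf zones.
    """
    tokens = [t for t, _, _ in ring]
    start = bisect_right(tokens, token(key)) % len(ring)
    chosen, zones_used = [], set()
    for step in range(len(ring)):
        _, name, zone = ring[(start + step) % len(ring)]
        if zone not in zones_used:
            chosen.append((name, zone))
            zones_used.add(zone)
        if len(chosen) == rf:
            break
    return chosen


def can_serve(replicas: list, down_zones: set, required_acks: int) -> bool:
    """True if enough replicas live outside the failed zones."""
    live = [name for name, zone in replicas if zone not in down_zones]
    return len(live) >= required_acks


# Hypothetical six-node cluster spread over three zones.
NODES = {
    "node-1": "zone-a", "node-2": "zone-b", "node-3": "zone-c",
    "node-4": "zone-a", "node-5": "zone-b", "node-6": "zone-c",
}
RING = build_ring(NODES)
```

Under these assumptions, a key replicated three times lands in three different zones, so a request needing a single acknowledgement can still be served with two whole zones offline.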
Use Case : eBay - The Business
- 75 billion dollars in goods sold per year
- 112 million ACTIVE users
- 400+ million items for sale
- Billions of page views / day
- Running 1000s of servers
- Processing TB per second
- We get it, eBay is big!
Use Case : eBay - Architecture
- Mixed data architecture, also using MySQL, Oracle, MongoDB & Hadoop
- Over 100 Cassandra nodes deployed with 9 billion writes / day & 5 billion reads / day
- Ethos is to use the right tool for a particular job
- Cassandra is good for sparse data, big data, flexible schemas & real-time analytics
- Many use cases don’t require an RDBMS
Use Case : eBay - Why Cassandra?
- Multi-DC, active-active configuration
- Less waste, no dark nodes
- Always available
- Easily scaled up and down
- High write throughput
- Distributed counters
- Hadoop support
Use Case : eBay - But what does it do?
- Time series data, real-time insights and immediate actions
- Anti-fraud
- Order and shipping insights
- Server metrics collection for monitoring and alerting
- Personalization & taste graphing
- Runtime quality click pricing for affiliates
- Mobile notification logging and tracking
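Time series workloads such as server metrics are often modelled by bucketing rows into partitions keyed on the source and a time window, with rows ordered by timestamp inside each partition. The in-memory sketch below illustrates that general pattern only. It is not eBay's schema, and `TimeSeriesStore` and `partition_key` are hypothetical names invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(source_id: str, ts: datetime) -> tuple:
    """Group readings by source and UTC day so each partition stays bounded.

    Expects a timezone-aware datetime.
    """
    return (source_id, ts.astimezone(timezone.utc).strftime("%Y-%m-%d"))


class TimeSeriesStore:
    """Toy stand-in for a table partitioned by (source, day)."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, source_id: str, ts: datetime, value: float) -> None:
        # Appends are cheap; ordering is applied when the partition is read.
        self._partitions[partition_key(source_id, ts)].append((ts, value))

    def read_day(self, source_id: str, day: str) -> list:
        """Return one day's readings for a source, ordered by timestamp."""
        rows = self._partitions.get((source_id, day), [])
        return sorted(rows, key=lambda row: row[0])
```

Reading a single day for a single source touches exactly one partition, which is the property that keeps this kind of query fast as total data volume grows.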
Use Case : Hailo - The Business
- World’s highest rated taxi app
- Over 500,000 registered passengers
- A Hailo e-hail is accepted every 4 seconds globally
- Operating in 10 cities after just 18 months of operation (as of 2013)
- Processes over 100 million dollars in customer & driver transactions
Use Case : Hailo - Why Cassandra?
- Historically using MySQL in AWS
- Rapid global expansion required higher scalability
- Global replication desired
- Growth forecasts indicated a high growth rate
- Cassandra easily scaled to meet this demand
- Some prior engineering experience and confidence in the technology
- Using Acunu for some analytics work. It uses Cassandra under the hood.
Conclusion
- Numerous examples of companies adopting Cassandra to answer the demands of high volume workloads
- Easily supports mixed workloads and is highly tuneable to favour reads, writes or both
- Supports rapid service rollouts (Netflix has built an entire development culture around this)
- Where ease of scale, flexible schema design, high availability and hurricane survival are required, Cassandra meets the need.
Questions?