predictable performance for big data in real...
TRANSCRIPT
© 2012 Aerospike. All rights reserved. Confidential | Corporate Overview | Pg. 2
Aerospike aer·o·spike [air-oh-spahyk] noun, 1. tip of a rocket that enhances speed and stability
1. Know whom the interaction is with
§ 200 M US consumers, 5 Billion mobile devices
2. Anticipate intent based on current context
§ Page views, search terms, ads served, game state, last move, friends list, location info, pre-computed data like audience segments, location patterns
3. Respond fast
§ Display the most relevant advertisement
§ Deliver the richest gaming experience
§ Detect the latest attack vector
§ Recommend the best product
§ Treat special customers like VIPs…
4. NEVER go down!
Interactions - faster & better decisions
Aerospike fuels AdTech; Ads fund the Internet
➤ #2 Ad Network
➤ #1 Ad Agency
➤ #1 Indep Ad Exchange
➤ #1 Video Ad Network
➤ #1 Data Aggregator
➤ #2 Mobile Ad Platform
➤ #2, #6 Publisher Network
➤ #1 Search Syndicator
➤ #1 Recommendation Engine - ATT, Tesco, Ticketmaster,
➤ #1 DSP - Canada
➤ #1 Ad server - China
➤ #1 Mobile Ads - Asia
➤ #1 Pub Net -SE Asia
➤ #3 ISP - Japan
Why Aerospike?
➤ Fast
§ Predictable performance: 99.9% in less than 1 ms
§ For balanced read/write transactions
§ Even with synchronous replication for immediate consistency
➤ Scales
§ Manages 100+ Billion objects, 10+ Terabytes of data
§ Processes 500k+ TPS per node; 50k+ TPS for writes
§ Scales out linearly on commodity hardware
➤ Never Fails
§ Reliably stores data with immediate consistency, replication
§ Cross data center multi-master replication ensures business continuity and geographic proximity
§ No performance degradation, even during re-balancing/data migration, rolling software/hardware upgrades and background backups/restores!
Built by experts in Databases & Distributed systems
➤ Donald J. Haderle, “father of DB2”
➤ Srini V. Srinivasan – Database expert
§ Responsible for Yahoo! Mobile’s global operations serving Millions of users 24x7
§ M.S. and Ph.D Computer Science (in Databases), University of Wisconsin – Madison and B.Tech Computer Science, IIT Chennai
➤ Russell Sullivan – High Performance expert
§ Founder of AlchemyDB, “Performance Man” for Redis, 20+ years experience in web scale systems at Lycos, 24/7 Real Media, top European dating site BE2
§ B.S. Computer Science, Michigan University
➤ Brian Bulkowski – Networking expert
§ 20+ years developing web scale infrastructures at Aggregate Knowledge, Liberate and Novell
§ B.S. Mathematics/Computer Science, Brown University
➤ Roger Sippl, founder of Informix
Zero downtime in 2+ yrs
➤ Real-Time Bidding Platform for…
§ $22B market by 2015
➤ 27 Billion auctions per day
§ Doubling every year
➤ 1 Million TPS
➤ 12 TB
➤ 140 servers in 3 data centers
“Aerospike has operated without interruptions and easily scaled to meet our performance demands.” – Mike Nolet, CTO, AppNexus
2 Billion objects, 8TB data
➤ Ad Serving platform for Yahoo!, MSN, AOL, comscore sites
§ Yahoo! User data for 76% of the U.S. population
§ Yahoo! Search data for 300 million+ searches per day
§ 50,000+ user attributes from 25+ data providers
§ “Genome takes in more data from more sources than other solutions.” - Peter Foster, Yahoo’s GM, Audience & Performance Advertising
2 Trillion Transactions per month
➤ BlueKai - largest data management platform on the Internet
§ 2 Trillion Transactions per month
§ 100,000 attributes and user profiles for e-commerce, recommendation engines, video traffic and ad targeting
➤ eXelate - 16 TB of data on 400 million consumers
§ 60 Billion Transactions per month
§ 4 data centers across US and Europe for geographic proximity & high availability
§ “Scale. Real-time performance. Real-time replication at each of our four datacenters. Aerospike delivered on all of these requirements.” - Elad Efraim, CTO, eXelate
Shared-Nothing Architecture
Data Center 1
Data Center 2 Data Center 3
Every cluster node is Identical and handles both transactions and long running tasks
Replication supported with immediate consistency
Fast Key Value Store
➤ Taking advantage of modern commodity servers
§ New multi-processor, multi-core machines
§ Lower DRAM and SSD price points
➤ High Throughput, Elastic Scaling & ACID
Vertical Scaling: Maximizes TPS; handles traffic spikes, ensures predictable performance
Horizontal (Elastic) Scaling: Maximizes Data Volumes; ensures 100% Uptime
Intelligent Client API Shields Your Applications from the Complexity of the Cluster
➤ Implements Aerospike API
§ Easy primary key pattern
§ Row with typed columns
§ Optimistic row locking
➤ Optimized binary protocol
➤ Cluster tracking
§ Client / server gossip protocol
§ Continually learn cluster changes
§ Learn and update data partition map
➤ Transaction semantics
§ Global transaction ID
§ Retransmit and timeout
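One way to read the "optimistic row locking" bullet is as a check-and-set on a per-row generation counter: a write succeeds only if the row has not changed since it was read. The sketch below is a minimal in-memory illustration under that assumption. `Store`, `Row` and `GenerationMismatch` are hypothetical names invented here, not part of the Aerospike client API.

```python
from dataclasses import dataclass, field


class GenerationMismatch(Exception):
    """Raised when a row changed between the read and the write."""


@dataclass
class Row:
    bins: dict = field(default_factory=dict)  # the row's columns, keyed by name
    generation: int = 0                       # bumped on every successful write


class Store:
    """Hypothetical in-memory key-value store, for illustration only."""

    def __init__(self):
        self._rows = {}

    def get(self, key):
        # Return a copy of the columns plus the generation seen at read time.
        row = self._rows.setdefault(key, Row())
        return dict(row.bins), row.generation

    def put(self, key, bins, expected_generation):
        # Optimistic check: refuse the write if someone else wrote first.
        row = self._rows.setdefault(key, Row())
        if row.generation != expected_generation:
            raise GenerationMismatch(key)
        row.bins.update(bins)
        row.generation += 1
        return row.generation
```

A caller reads a row, computes its update, and retries from the read if `put` raises `GenerationMismatch`; no lock is held between the read and the write.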
No Sharding! Data is Distributed Randomly, using Hash technology
➤ Every key is hashed into a 20 byte (fixed length) string using the RIPEMD160 hash function
➤ This hash + additional data (fixed 64 bytes) are stored in RAM in the index
➤ 4 bytes of this hash are used to compute the partition id
➤ There are 4096 partitions
➤ Partition id maps to node id based on cluster membership
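The key-to-partition steps above can be sketched in a few lines. The slide does not say which 4 bytes of the digest are used or how they are reduced to one of 4096 partitions, so the choice below (first 4 bytes, little-endian, modulo 4096) is an assumption made only for illustration. Some Python builds do not expose RIPEMD-160 through `hashlib`; in that case the sketch falls back to SHA-1, which is also 20 bytes, purely so that it still runs.

```python
import hashlib

N_PARTITIONS = 4096  # partition count stated on the slide


def digest20(key: bytes) -> bytes:
    """Return a fixed 20-byte digest of the key.

    RIPEMD-160 is used when the local OpenSSL build provides it; otherwise
    SHA-1 (also 20 bytes) is substituted as a stand-in.
    """
    try:
        return hashlib.new("ripemd160", key).digest()
    except ValueError:
        return hashlib.sha1(key).digest()


def partition_id(digest: bytes) -> int:
    # Assumption: use the first 4 bytes as a little-endian integer and
    # reduce it modulo the partition count.
    return int.from_bytes(digest[:4], "little") % N_PARTITIONS


if __name__ == "__main__":
    d = digest20(b"cookie-abcdefg-12345678")
    print(d.hex(), partition_id(d))
```

Because the digest is effectively uniform, keys spread evenly over the partitions without any application-level sharding scheme.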
Key: cookie-abcdefg-12345678
Hash: 182023kh15hh3kahdjsh
Partition ID   Master node   Replica node
…              1             4
1820           2             3
1821           3             2
4096           4             1
Cross Data Center Replication (XDR)
➤ Asynchronous replication for long link delays and outages
➤ Namespace is configured to replicate to a destination cluster – master / slave, including star and ring
➤ Replication process
§ Transaction journal on partition master and replica
§ XDR process writes batches to destination
§ Transmission state shared with source replica
§ Retransmission in case of network fault
§ When data arrives back at originating cluster, transaction ID matching prevents subsequent application and forwarding
➤ In master / master replication, conflict resolution via multiple versions, or timestamp
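Two of the XDR points above lend themselves to a small sketch: transaction ID matching, which stops a write from circulating forever in a ring, and timestamp-based conflict resolution. The code below is a simplified model under stated assumptions. `Cluster` and `Write` are hypothetical names, forwarding is synchronous rather than batched and asynchronous, and "newest timestamp wins" is only one of the two resolution options the slide mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    txn_id: str       # global transaction ID
    key: str
    value: str
    timestamp: float  # time of the original update


class Cluster:
    """Hypothetical stand-in for one data center's cluster."""

    def __init__(self, name):
        self.name = name
        self.data = {}     # key -> latest accepted Write
        self.seen = set()  # transaction IDs already applied here
        self.peers = []    # destination clusters this one ships to

    def apply(self, w: Write) -> bool:
        if w.txn_id in self.seen:
            return False  # came back around the ring: do not re-apply or forward
        self.seen.add(w.txn_id)
        current = self.data.get(w.key)
        # Timestamp conflict resolution: keep whichever write is newer.
        if current is None or w.timestamp >= current.timestamp:
            self.data[w.key] = w
        for peer in self.peers:
            peer.apply(w)  # real XDR would ship this later, in batches
        return True
```

With three clusters wired as a ring (a to b, b to c, c to a), a write applied at `a` reaches `b` and `c` once and is dropped when it returns to `a`.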
SSD-optimized Storage Layer
➤ Direct device access, i.e. raw, bypassing file system
§ Data written in SSD optimal large block patterns
§ All indexes in RAM for low wear
§ Continuous background defragmentation
§ Clean restart through shared memory
➤ Random distribution using hash does not require RAID hardware
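The combination of "large block" writes and "all indexes in RAM" can be pictured with a toy log-structured store: records accumulate in a write buffer, the buffer is flushed as one large block, and an in-memory index remembers where each record lives. This is a sketch of the general technique, not Aerospike's storage engine. `BlockLog` is a hypothetical name, a Python list stands in for the raw device, and the 128 KiB block size is an assumption.

```python
class BlockLog:
    """Toy log-structured store with a RAM index (illustration only)."""

    def __init__(self, block_size=128 * 1024):
        self.block_size = block_size  # assumed write-block size
        self.blocks = []              # stands in for the raw SSD device
        self.buffer = bytearray()     # current, not yet flushed block
        self.index = {}               # key -> (block_no, offset, length), in RAM

    def put(self, key, value: bytes):
        # Flush first if this record would overflow the current block.
        if self.buffer and len(self.buffer) + len(value) > self.block_size:
            self.flush()
        self.index[key] = (len(self.blocks), len(self.buffer), len(value))
        self.buffer += value

    def flush(self):
        # One large sequential write instead of many small random ones.
        self.blocks.append(bytes(self.buffer))
        self.buffer = bytearray()

    def get(self, key) -> bytes:
        block_no, offset, length = self.index[key]
        in_buffer = block_no == len(self.blocks)
        source = self.buffer if in_buffer else self.blocks[block_no]
        return bytes(source[offset:offset + length])
```

Overwriting a key only updates the index and leaves the old bytes behind as dead space, which is the kind of space the slide's continuous background defragmentation would reclaim.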
…
SSD performance varies widely
• Aerospike has a certified hardware list
• Free SSD certification tool, CIO, is also available
Self-configuring Clusters!
➤ Automatic multicast gossip protocol for node discovery
➤ Paxos consensus algorithm determines nodes in cluster
➤ Ordered list of nodes determines data location
➤ Data partitions balanced for minimal data motion
➤ Vote initiated and terminated in 100 milliseconds
Adding a new node
1. Cluster discovers new node via gossip protocol
2. Paxos vote determines new data organization
3. Partition migrations scheduled
4. When a partition migration starts, write journal starts on destination
5. Partition moves atomically
6. Journal is applied and source data deleted
transactions continue
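Steps 4 to 6 above describe how a partition can move while writes keep arriving: the destination journals incoming writes during the copy, takes the partition as one unit, then replays the journal. The sketch below models that sequence in a single thread. `PartitionCopy` and `migrate` are hypothetical names, and the writes that arrive mid-migration are passed in as a list rather than coming from concurrent clients.

```python
class PartitionCopy:
    """One node's copy of a partition (illustration only)."""

    def __init__(self, records=None):
        self.records = dict(records or {})
        self.journal = []  # writes received while a migration is in flight


def migrate(source, dest, writes_during_move):
    snapshot = dict(source.records)        # contents when the migration starts
    for key, value in writes_during_move:  # step 4: journal on the destination
        dest.journal.append((key, value))  # transactions continue meanwhile
    dest.records = snapshot                # step 5: partition moves as one unit
    for key, value in dest.journal:        # step 6: replay the journal in order
        dest.records[key] = value
    dest.journal.clear()
    source.records.clear()                 # step 6: source data deleted
```

Replaying the journal after the bulk copy is what keeps a write that arrived mid-move from being overwritten by the older snapshot.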
Consistency: Writing data safely
1. Write sent to row master
2. Latch against simultaneous writes
3. Apply write to master memory
4. Apply write synchronously to replica(s) memory
5. Queue operations to disk
6. Signal completed transaction (optional storage commit wait)
7. Master applies conflict resolution policy – rollback / rollforward
master replica
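The write path listed above can be sketched as a single function that follows steps 2 through 6 in order. This is a simplified model under stated assumptions: `Node` and `write` are hypothetical names, one lock per node stands in for the per-row latch, replicas are updated by a direct call rather than over the network, and the conflict resolution in step 7 is left out.

```python
import threading
from collections import deque


class Node:
    """Hypothetical cluster node holding data in memory (illustration only)."""

    def __init__(self):
        self.memory = {}
        self.disk_queue = deque()      # operations waiting to reach storage
        self.latch = threading.Lock()  # coarse stand-in for a per-row latch


def write(master, replicas, key, value) -> bool:
    with master.latch:                        # step 2: block simultaneous writes
        master.memory[key] = value            # step 3: master memory
        for replica in replicas:              # step 4: replicas, synchronously
            replica.memory[key] = value
        for node in (master, *replicas):      # step 5: queue operations to disk
            node.disk_queue.append((key, value))
    return True                               # step 6: signal completion
```

The caller is acknowledged only after every replica holds the value in memory, which is what the slides mean by immediate consistency; the disk flush happens later from the queue unless the optional storage commit wait is used.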
Per Node Optimization
➤ Right Architecture
§ Shared nothing
§ In-memory (or multiple SSDs)
§ Tight code loop
§ Lock free isolation
➤ OS, Programming Language, Libraries
§ Modern Linux kernel
§ C language
§ Use epoll
➤ Tweaks
§ Pin threads to processor cores
§ IRQ affinity settings for NIC
§ CPU Socket Isolation via pairing of CPU to NIC
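The "pin threads to processor cores" tweak maps onto the Linux `sched_setaffinity` system call. The slides name C as the implementation language; the sketch below uses Python's standard-library wrapper for the same call only to show its shape. `pin_to_core` is a hypothetical helper, and it returns `False` on platforms that do not provide the call.

```python
import os


def pin_to_core(core: int) -> bool:
    """Pin the calling thread to a single CPU core.

    Returns False where the operating system does not expose
    sched_setaffinity (for example macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means "the caller"
    return True


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)
        print("pinned:", pin_to_core(min(allowed)), os.sched_getaffinity(0))
```

Pinning keeps a worker on one core so its cache stays warm, which is the usual motivation for pairing it with IRQ affinity on the NIC queue that feeds that worker.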
Russ’s 10 Ingredient Recipe for Making 1 Million TPS on $5K Hardware
Fast, Scales, Never Fails
➤ Cluster-aware Client Layer (linear scale, avoids hot spots)
§ Tracks nodes, ensures 1 hop transactions by routing transactions directly to the node with the data
§ Accelerates transactions with TCP/IP connection pooling
§ No need to restart clients when nodes go up or down
➤ Self-managing Distribution Layer (100% uptime, immediate consistency, real-time prioritization)
§ Reliably stores Terabytes of data with immediate consistency, automatic fail-over and replication
§ No cluster master, no SPOF, no sharding
§ Paxos-like voting algorithm dynamically detects when nodes go up/down
§ Automatic partitioning (hash) algorithm assigns R/W masters and replicas
§ Intelligent re-balancing and data migration
§ Cross data center synchronization with complex ring/star topologies
➤ SSD-optimized Storage Layer (low latency, linear scale, low TCO)
§ Memory efficient Index in DRAM
§ 100 Million keys of any size require only 6.4GB
§ Native, multi-threaded, multi-core SSD I/O
§ Log structured file system
§ Built-in smart evictor and defragmenter
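The 6.4GB figure follows from the fixed 64-byte index entry quoted on the hashing slide, since the index stores a fixed-length digest rather than the key itself. A quick check of the arithmetic, using decimal gigabytes:

```python
INDEX_ENTRY_BYTES = 64  # fixed per-key index size, from the hashing slide
keys = 100_000_000      # 100 Million keys

index_bytes = keys * INDEX_ENTRY_BYTES
index_gb = index_bytes / 1e9  # decimal gigabytes

print(f"{index_gb:.1f} GB")  # prints "6.4 GB"
```

The key length does not appear anywhere in the calculation, which is why the slide can say "keys of any size".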
Want: 1) Faster & better decisions on Hot Data 2) Unified Operations & Analytics
STRUCTURED DATA
➤ TRANSACTIONS (OLTP): Response time: Seconds; Gigabytes of data; Balanced Reads/Writes
➤ ANALYTICS (OLAP): Response time: Seconds; Terabytes of data; Read Intensive
UNSTRUCTURED DATA
➤ BIG DATA ANALYTICS: Response time: Hours, Weeks; TB to PB; Read Intensive
➤ REAL-TIME BIG DATA: Real-time Transactions; Response time: < 10 ms; 1-20 TB; Balanced Reads/Writes; 24x7x365 Availability
Interactics: Focus on Velocity and $$$ 1) Faster & better decisions on hot data 2) Unified Operations & Analytics
[Figure: storage tiers Fast – Flash, Expensive – DRAM, Slow – HDD; products Mongo, Couch, Riak, VoltDB, Hana, Cassandra, Hadoop/HBase; axes Velocity, Volume, # Apps - Variety, Mission Critical - $$$, Research, Flexibility]
Transactions - Reads - Variety - Flex Data
Interactics - Reads & Writes - Velocity - Hot Data
Analytics - Writes - Volume - Historical Data
Customers moving to Aerospike from…
➤ Mongo - adMarketplace, Sitescout
§ Too hard to scale, tune, make reliable
§ Poor SSD support; not multicore
➤ Couch - Brilig, Chango, x+1, adMeta
§ Low performance, repartition unacceptable
§ SSD support lacking
➤ Cassandra - Acxiom, BlueKai, EQAds
§ Low performance, fragility in production
§ Java is not realtime; glitches and uncertainty
§ No support from DataStax on core function
Faster: Independent Benchmarks
• YCSB++ • Preliminary
Cheaper: 17x lower TCO
“…data-in-DRAM implementations such as HANA from SAP ..should be bypassed… the current leading data-in-flash database for transactional analytic applications is Aerospike.” - David Floyer, Founder & CTO, Wikibon
$$$
10x Better TCO*
➤ SSD-optimized Architecture requires fewer servers
                                        Aerospike             Other
Storage type                            SSD                   DRAM
Storage per server                      1.2 TB (4 x 300 GB)   80 GB (on 96 GB server)
Cost per server                         $7,000 USD            $15,000 USD
# Servers for 1.5 TB (2x Replication)   3                     16
Total costs (USD)                       $21,000               $240,000
+ No Manual Operations                  0                     + $200,000 at least per year
  No need to re-configure, restart servers when adding or taking down nodes
+ No DIY development                    0                     + $200,000 at least per year
  No caching, sharding, replication code to write; Developers write business logic, not middleware
*Actual results calculated by customer
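The hardware totals in the table can be reproduced from the server counts and per-server costs it lists. The short calculation below uses only those figures; the recurring operations and development costs are left out because the table gives them only as lower bounds.

```python
# Figures copied from the table above.
aerospike = {"servers": 3, "cost_per_server": 7_000}
other = {"servers": 16, "cost_per_server": 15_000}


def hardware_cost(cfg: dict) -> int:
    """Total server cost in USD for one configuration."""
    return cfg["servers"] * cfg["cost_per_server"]


ratio = hardware_cost(other) / hardware_cost(aerospike)
print(hardware_cost(aerospike), hardware_cost(other), round(ratio, 1))
```

On hardware alone the ratio comes out a little above 11, which is consistent with the slide's "10x" headline.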
Comparing NoSQL Databases
                       Aerospike      MemBase        MongoDB          Cassandra
APIs                   Simple (KVS)   Simple (KVS)   Rich (JSON)      Medium (Column)
Read & Write           ✔              ✔              Read optimized   Write optimized
Latency                < 1 ms         < 2 ms         5 ms ~ 20 ms     10 ms ~ 30 ms
TPS / node             250K           30K            50K              50K
Optimized for SSD      ✔              ✗              ✗                ✗
Automatic Clustering   ✔              ✗              Complicated      Inconsistent
MemCache support       ✔              ✔              ✗                ✗
No other DB exists for the Internet of Things
                                                                       RDBMS   NoSQL         Aerospike
Variety: Flex Schema                                                   ✗       ✓             ✓
Volume: Web Scale                                                      ✗       ✓             ✓
Velocity: Predictable Performance with Zero Downtime                   ✗       ✗             ✓
Transactions – Single Row ACID                                         ✓       ✗             ✓
Velocity: Faster and Better Decisions with QMR                         ✗       ✗             ✓
Event Driven Architectures: Pub Sub, Streams                           ✓       ✗             Roadmap
Transactions - Multi Row ACID Serializable updates, Read Committed     ✓       ✗             Roadmap
Security                                                               ✓       ✗             Roadmap
Multi-Tenancy                                                          ✗       (only Riak)   Roadmap
Fueling
Dave Pickles, Founder & CTO, The Trade Desk
“Aerospike handled all challenges smoothly! Large datasets.. millisecond response times.. an ever increasing load, node outages caused by unauthorized upgrades by a managed data center provider, changes to the underlying data structure… This is real software, purpose built, lean and mean.”
First true Demand Side Platform (DSP)
Fueling
Dag Liodden, Co-founder & CTO, Tapad
“We looked at a lot of open source products and eventually we went with a commercial product because it has much better predictability and low latency… Aerospike took a lot of the jitter in our performance just right out of the equation… So, very, very simple yet very capable NoSQL solution that performs insanely well…the throughput is awesome.”
First digital advertising solution for real-time mobile audience buying and cross-device targeting
Fueling
Andrei Duncan, CTO, Liverail
“We liked the performance. Everything worked as advertised... With Aerospike, we’ve been adding new services, like auditing and reporting, that are enabling us to land deals we wouldn’t have otherwise. That’s the most important metric of all.”
➤ Predictable (99%) response times under 5ms
➤ 3 Billion impressions per month
➤ 25% of all video advertising
➤ 2 data centers
Video advertising platform with ad serving and real-time bidding
Dag Liodden, Co-founder & CTO, Tapad
“A very, very simple yet very capable NoSQL solution that performs insanely well.”
Elad Efraim CTO, eXelate
“Scale, real-time performance, real-time replication across 4 datacenters. Aerospike delivered.”