![Page 1: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/1.jpg)
Making Non-Distributed Databases, Distributed
Ioannis Papapanagiotou, PhD
Shailesh Birari
![Page 2: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/2.jpg)
Dynomite Ecosystem
● Dynomite - Proxy layer
● Dyno - Client
● Dynomite-manager - Ecosystem orchestrator
● Dynomite-explorer - UI
![Page 3: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/3.jpg)
Problems & Observations
● Needed a data store:
  o Scalable & highly available
  o High throughput, low latency
  o Netflix use case is active-active
● Master-slave storage engines:
  o Do not support bi-directional replication
  o Cannot withstand a Monkey attack
  o Cannot easily perform maintenance
![Page 4: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/4.jpg)
What is Dynomite?
A framework that makes non-distributed data stores distributed. It can be used with many key-value storage engines.
Features: highly available, automatic failover, node warmup, tunable consistency, backups/restores
![Page 5: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/5.jpg)
Dynomite @ Netflix
● Running around 2.5 years in PROD
● 70 clusters
● ~1000 nodes used by internal microservices
● Microservices based on Java, Python, NodeJS
![Page 6: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/6.jpg)
Pluggable Storage Engines
RESP
● Layer on top of a non-distributed key-value data store
  ○ Peer-to-peer, Shared Nothing
  ○ Auto-Sharding
  ○ Multi-datacenter
  ○ Linear scale
  ○ Replication
  ○ Gossiping
RESP
![Page 7: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/7.jpg)
Topology
● Each rack contains one copy of data, partitioned across multiple nodes in that rack
● Multiple Racks == Higher Availability (HA)
![Page 8: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/8.jpg)
Replication
● A client can connect to any node in the Dynomite cluster when sending requests.
  o If the node owns the data:
    ▪ the data is written to the local data store and asynchronously replicated.
  o If the node does not own the data:
    ▪ the node acts as a coordinator: it sends the data to the owning node in the same rack and replicates it to the corresponding nodes in other racks and DCs.
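The write path above can be sketched in a few lines of Python. This is an illustrative model only, not Dynomite's implementation: the `Node` class, `owner_in_rack`, `handle_write`, and the "first token at or after the key's token, wrapping around" ownership rule are all assumptions made for the example, and the asynchronous sends are modelled as plain assignments.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    rack: str
    token: int                                  # hypothetical ring position
    store: dict = field(default_factory=dict)   # stands in for the local data store


def owner_in_rack(nodes, rack, key_token):
    """Return the node in `rack` that owns `key_token` (assumed ownership rule)."""
    rack_nodes = sorted((n for n in nodes if n.rack == rack), key=lambda n: n.token)
    for node in rack_nodes:
        if node.token >= key_token:
            return node
    return rack_nodes[0]  # wrap around the ring


def handle_write(nodes, receiving, key, value, key_token):
    """Model a write arriving at `receiving`; returns (local_owner, remote_replicas)."""
    # If `receiving` owns the token this is a local write; otherwise it acts as
    # coordinator and forwards to the owner inside its own rack.
    local_owner = owner_in_rack(nodes, receiving.rack, key_token)
    local_owner.store[key] = value
    # One replica per other rack; a real system would send these asynchronously.
    other_racks = sorted({n.rack for n in nodes} - {receiving.rack})
    replicas = [owner_in_rack(nodes, rack, key_token) for rack in other_racks]
    for peer in replicas:
        peer.store[key] = value
    return local_owner, replicas
```

In this model a write that lands on a non-owning node still ends up on exactly one node per rack, which matches the "one copy of data per rack" topology.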
![Page 9: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/9.jpg)
Dyno Client - Java API
● Connection Pooling
● Load Balancing
● Effective failover
● Pipelining
● Scatter/Gather
● Metrics, e.g. Netflix Insights
![Page 10: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/10.jpg)
Dyno Load Balancing
● Dyno client employs token aware load balancing.
● Dyno client is aware of the Dynomite cluster topology within the region and can write to a specific node using consistent hashing.
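A minimal sketch of token-aware selection with consistent hashing, assuming a ring where each host owns the tokens up to and including its own. The class name `TokenAwareSelector`, the MD5-based `key_hash`, and the 32-bit ring size are illustrative choices for this example and are not taken from the Dyno client.

```python
import bisect
import hashlib


class TokenAwareSelector:
    """Maps a key to the host that owns its token (illustrative only)."""

    def __init__(self, token_to_host):
        self._tokens = sorted(token_to_host)
        self._hosts = dict(token_to_host)

    @staticmethod
    def key_hash(key, ring_size=2**32):
        # Any stable hash works for the sketch; MD5 is used only for determinism.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % ring_size

    def host_for_token(self, token):
        # First host token at or after `token`, wrapping to the start of the ring.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0
        return self._hosts[self._tokens[idx]]

    def host_for_key(self, key):
        return self.host_for_token(self.key_hash(key))
```

Because the client computes the owner itself, a request can go straight to the owning node instead of passing through a coordinator hop.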
![Page 11: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/11.jpg)
Dyno Failover
● Dyno will route requests to different racks in failure scenarios.
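A sketch of rack failover under simple assumptions: every rack holds a full copy of the data, so the node owning the key's token in another rack can serve the request when the local one is down. The function `pick_host`, the `owners_by_rack` mapping, and the `is_up` health callback are hypothetical names for this example.

```python
def pick_host(owners_by_rack, local_rack, is_up):
    """Prefer the token owner in the local rack, then fall back to other racks.

    owners_by_rack: dict mapping rack name -> host owning the key's token there.
    is_up: callable(host) -> bool, a stand-in for the client's health tracking.
    """
    fallback = sorted(r for r in owners_by_rack if r != local_rack)
    for rack in [local_rack] + fallback:
        host = owners_by_rack.get(rack)
        if host is not None and is_up(host):
            return host
    raise RuntimeError("no healthy replica available in any rack")
```

The fallback order here is alphabetical for determinism; a real client might order remote racks differently.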
![Page 12: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/12.jpg)
Dynomite on the Cloud
RESP
![Page 13: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/13.jpg)
Moving across engines
Rack A Rack B
![Page 14: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/14.jpg)
Dynomite-manager: Warm up
1. Dynomite-manager identifies which node has the same token in the same DC
2. Leverage master/slave replication
3. Checks for peer syncing
   a. difference between master and slave offset
4. Once master and slave are in sync, Dynomite is set to allow write only
5. Dynomite is set back to normal state
6. Checks for health of the node - Done!
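The warm-up steps above can be modelled roughly as follows. This is a simplified sketch, not Dynomite-manager code: nodes are plain dicts, `read_offsets` is a hypothetical callback returning the current `(master_offset, slave_offset)` pair, and the state names are made up for the example. The final health check is left out.

```python
def find_warmup_peer(nodes, new_node):
    """Step 1: find another node with the same token in the same DC."""
    for node in nodes:
        if (node is not new_node
                and node["dc"] == new_node["dc"]
                and node["token"] == new_node["token"]):
            return node
    return None


def warm_up(read_offsets, max_lag=0, max_polls=100):
    """Steps 2-5: replicate, poll the offset gap, then switch states.

    Returns the list of states the new node passed through.
    """
    states = ["replicating"]
    for _ in range(max_polls):
        master_offset, slave_offset = read_offsets()
        if master_offset - slave_offset <= max_lag:
            # In sync: allow writes only, then return to normal operation.
            states += ["write_only", "normal"]
            return states
    states.append("timed_out")
    return states
```

With Redis as the storage engine, the two offsets could for example be read from the replication section of `INFO` on the master and the replica.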
![Page 15: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/15.jpg)
Dynomite-Explorer (UI)
• Node.js web app with a Polymer-based user interface
• Support Redis’ rich data types
• Avoid operations that can negatively impact Redis server performance
• Extended for Dynomite awareness
• Allow extension of the server to integrate with the Netflix ecosystem
![Page 16: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/16.jpg)
Dynomite-Explorer
![Page 17: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/17.jpg)
Roadmap
● Data reconciliation & repair v2
● Optimizations of RocksDB configuration
● Optimizing backups through SST
● Others…
![Page 18: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/18.jpg)
More information
• Netflix OSS:
  • https://github.com/Netflix/dynomite
  • https://github.com/Netflix/dyno
  • https://github.com/Netflix/dynomite-manager
• Chat: https://gitter.im/Netflix/dynomite
![Page 19: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/19.jpg)
![Page 20: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/20.jpg)
![Page 21: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/21.jpg)
Dynomite: S3 backups/restores
● Why?
  o Disaster recovery
  o Data corruption
● How?
  o Storage dumps data on the instance drive
  o Dynomite-manager sends data to S3 buckets
● Data per node are not large, so there is no need for incrementals.
● Use case:
  o Clusters that use Dynomite as a storage layer
  o Not enabled in clusters that have short TTL or use Dynomite as a cache
![Page 22: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/22.jpg)
Dynomite-manager
● Token management for multi-region deployments
● Support AWS environment
● Automated security group update in multi-region environment
● Monitoring of Dynomite and the underlying storage engine
● Node cold bootstrap (warm up)
● S3 backups and restores
● REST API
![Page 23: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/23.jpg)
Performance Setup
● Instance Type:
  ○ Dynomite: i3.2xlarge with NVMe
  ○ NDBench: m2.2xls (typical of an app @ Netflix)
● Replication factor: 3
  ○ Deployed Dynomite in 3 zones in us-east-1
  ○ Every zone had the same number of servers
● Demo app used simple key/value workloads
  ○ Redis: GET and SET
● Payload
  ○ Size: 1024 bytes
  ○ 80% reads / 20% writes
![Page 24: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/24.jpg)
Throughput
![Page 25: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/25.jpg)
Latencies
![Page 26: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/26.jpg)
Consistency
● DC_ONE
  o Reads and writes are propagated synchronously only to the node in the local rack and replicated asynchronously to other racks and data centers.
● DC_QUORUM
  o Reads and writes are propagated synchronously to a quorum of nodes in the local data center and asynchronously to the rest. The quorum size is calculated and then rounded down to a whole number. If all responses differ, the first response the coordinator received is returned.
● DC_SAFE_QUORUM
  o Similar to DC_QUORUM, but the operation succeeds only if the read/write succeeded on a quorum of nodes and the data checksums match. If the quorum is not achieved, Dynomite returns an error response.
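The difference between DC_QUORUM and DC_SAFE_QUORUM can be illustrated with a small resolver. Several things here are assumptions for the sake of the example: the quorum formula `replicas // 2 + 1` (the slides only say the value is rounded down), the names `quorum_size`, `resolve`, and `QuorumError`, and comparing response values directly as a stand-in for checksum comparison.

```python
from collections import Counter


class QuorumError(Exception):
    """Raised when a safe-quorum operation cannot be satisfied."""


def quorum_size(replicas):
    # Assumed formula: half the replicas, rounded down, plus one.
    return replicas // 2 + 1


def resolve(responses, replicas, safe):
    """Pick the reply to return to the client.

    responses: replies in the order the coordinator received them.
    safe: True models DC_SAFE_QUORUM, False models DC_QUORUM.
    """
    if not responses:
        raise QuorumError("no responses received")
    value, votes = Counter(responses).most_common(1)[0]
    if votes >= quorum_size(replicas):
        return value
    if safe:
        raise QuorumError("quorum not achieved")
    # DC_QUORUM falls back to the first response received.
    return responses[0]
```

With three replicas the quorum is two, so two matching replies are enough in either mode; only DC_SAFE_QUORUM turns a disagreement into an error.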
![Page 27: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/27.jpg)
Deploying Dynomite in PROD
● Unit testing in GitHub
● Building EC2 AMI in “experimental”
● Pipelines for performance analysis
● Promotion to “candidate”
● Beta Testing
● Promotion to “release”
![Page 28: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/28.jpg)
Reconciliation
● Reconciliation is based on timestamps (newest wins) and is performed by a Spark cluster

● Jenkins job to avoid clock skew
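Newest-wins reconciliation can be sketched without Spark. The example below is plain Python over in-memory dicts and only illustrates the merge rule; the data layout (`key -> (timestamp, value)`) and the function names `reconcile` and `repairs_for` are assumptions made for the example.

```python
def reconcile(replicas):
    """Merge replica snapshots, keeping the entry with the newest timestamp.

    replicas: list of dicts mapping key -> (timestamp, value).
    """
    merged = {}
    for replica in replicas:
        for key, (ts, value) in replica.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged


def repairs_for(replica, merged):
    """Entries that must be written back to `replica` to match the merged view."""
    return {k: v for k, v in merged.items() if replica.get(k) != v}
```

The rule is only as good as the timestamps, which is why clock skew between nodes has to be kept small.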
![Page 29: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/29.jpg)
Reconciliation: Design Principles
We prefer to take the reconciliation processing load off each node in the cluster and offload it to a high-performance, in-memory compute cluster based on Spark.
![Page 30: Making Non-Distributed Databases, Distributed · Making Non-Distributed Databases, Distributed ... Dyno client is aware of the ... Promotion to “release](https://reader031.vdocuments.mx/reader031/viewer/2022020204/5ae37af17f8b9ad47c8e5398/html5/thumbnails/30.jpg)
Reconciliation: Architecture
● Forcing Redis (or any other storage engine) to dump data to the disk
● Encrypted communication between Dynomite and Spark cluster
● Chunking the data - retry in case of a failure.
● Bandwidth Throttler
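The "chunk and retry" idea can be sketched as below. The `send` callback, the chunk size, and the retry limit are hypothetical; encryption and bandwidth throttling are deliberately left out of the example.

```python
def chunks(data, size):
    """Yield consecutive slices of `data`, each at most `size` bytes long."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def send_with_retry(data, send, chunk_size=4, max_attempts=3):
    """Send `data` chunk by chunk, retrying only the chunk that failed.

    send: callable(chunk) that raises IOError on failure (assumed interface).
    Returns the number of chunks delivered.
    """
    delivered = 0
    for chunk in chunks(data, chunk_size):
        for attempt in range(1, max_attempts + 1):
            try:
                send(chunk)
                break
            except IOError:
                if attempt == max_attempts:
                    raise  # give up after the last attempt
        delivered += 1
    return delivered
```

Retrying per chunk means a transient failure costs one chunk of re-transfer rather than the whole dump.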