f4: Facebook’s Warm BLOB storage systems
Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, Sanjeev Kumar
* Some material borrowed from the f4 OSDI slides
Problem
• Facebook has to deal with many immutable objects
  • Large in size
  • Immutable binary data (BLOBs)
  • Photos, videos, attached files (Feb. 2014: 400 billion photos)
  • Creations, reads, deletions – NO modifications
• Hot and warm: temperature zones exist!
  • New => “hot”
  • Cools over time (rapidly)
• Requirements:
  • Low latency
  • Storage efficiency (lower effective replication factor)
BLOB Storage System in Facebook architecture
TAO
BLOB storage system
• Creation (C)
  • C1. Request goes to RT
  • C2. RT directs request to the storage system (Haystack)
• Read (R)
  • R1. Read from cache. If found, return
  • R2. Cache miss, go to TT
  • R3. TT redirects request to RT
  • R4. RT directs request to the storage system (Haystack/f4)
• Deletion (D)
  • D1. Request goes to RT
  • D2. RT directs request to the storage system
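The three request paths above can be traced with a toy model. This is only a sketch: the class `BlobFrontEnd`, its dictionaries, and the `trace` list are hypothetical names for illustration, and filling the cache on a miss and dropping the cached copy on deletion are assumptions that the slides do not state.

```python
class BlobFrontEnd:
    """Toy model of the create, read, and delete paths (names are hypothetical)."""

    def __init__(self):
        self.cache = {}    # stands in for the caching layer
        self.storage = {}  # stands in for the storage system (Haystack/f4)
        self.trace = []    # records which labelled steps were taken

    def create(self, blob_id, data):
        self.trace.append("C1")  # request goes to RT
        self.trace.append("C2")  # RT directs the request to storage
        self.storage[blob_id] = data

    def read(self, blob_id):
        self.trace.append("R1")  # try the cache first
        if blob_id in self.cache:
            return self.cache[blob_id]
        # Cache miss: go to TT (R2), TT redirects to RT (R3),
        # RT directs the request to storage (R4).
        self.trace.extend(["R2", "R3", "R4"])
        data = self.storage[blob_id]
        self.cache[blob_id] = data  # assumption: cache is filled on a miss
        return data

    def delete(self, blob_id):
        self.trace.extend(["D1", "D2"])  # request goes to RT, then storage
        self.storage.pop(blob_id, None)
        self.cache.pop(blob_id, None)    # assumption: cached copy is dropped


front_end = BlobFrontEnd()
front_end.create("photo-1", b"jpeg bytes")
first = front_end.read("photo-1")   # miss: R1, R2, R3, R4
second = front_end.read("photo-1")  # hit: R1 only
```

Only the first read reaches the storage system; the second is served from the cache, so fewer requests reach storage for recently read BLOBs.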
f4 Design
• Two main goals:
  • Storage efficiency
  • Fault tolerance
• f4 cell:
  • Resides within one data center
  • Only stores locked volumes
  • The data and index files are read-only.
  • Journal files are not present.
f4: Fault tolerance
• Within a data center:
  • Reed-Solomon encoding
  • (k, v): k data blocks, v parity blocks
  • Tolerates up to v block failures
f4: Fault tolerance
• Between data centers:
  • XOR encoding
0101 XOR 0011 = 0110
0101 XOR 0110 = 0011
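The two XOR lines above show that the parity of two blocks, combined with either block, gives back the other. A minimal sketch of that property on byte strings follows; the helper `xor_blocks` and the variable names are hypothetical, and this is not f4's actual implementation.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# The bit patterns from the slide, each stored in one byte.
block_a = bytes([0b0101])
block_b = bytes([0b0011])

# The parity block would be kept in a third location.
parity = xor_blocks(block_a, block_b)

# If block_b is lost, it is rebuilt from block_a and the parity.
recovered_b = xor_blocks(block_a, parity)
```

Because XOR is its own inverse, the same function serves for both encoding and recovery.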
Effective replication factor
• How much physical storage is required per unit of data stored?
• Haystack: 3.6x
  • To store one bit, 3.6 physical bits are needed
  • RAID-6: 1.2x
  • Replicated three times (3 × 1.2 = 3.6)
• f4: 2.8x
  • Replicate the cell between two data centers (2 × 1.4 = 2.8)
• f4: 2.1x
  • Use a third cell for XOR encoding
  • Reed-Solomon encoding (10,4): 1.4x
  • XOR encoding: 1.5x (1.4 × 1.5 = 2.1)
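The figures above follow from multiplying the overhead of each layer. The sketch below reproduces that arithmetic with exact fractions; the function `rs_overhead` and the variable names are hypothetical, and the 1.2x, 1.5x, and (10,4) inputs are taken from the slide.

```python
from fractions import Fraction


def rs_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Physical blocks stored per data block under Reed-Solomon coding."""
    return Fraction(data_blocks + parity_blocks, data_blocks)


raid6 = Fraction(12, 10)          # 1.2x, as given on the slide
haystack = 3 * raid6              # three replicas, each on RAID-6

rs_10_4 = rs_overhead(10, 4)      # (10 + 4) / 10 = 1.4x
f4_two_copies = 2 * rs_10_4       # cell replicated in two data centers

xor_factor = Fraction(3, 2)       # 1.5x, as given on the slide
f4_with_xor = rs_10_4 * xor_factor

for name, value in [("Haystack", haystack),
                    ("f4, two copies", f4_two_copies),
                    ("f4, XOR across data centers", f4_with_xor)]:
    print(f"{name}: {float(value):.1f}x")
```

The 1.5x XOR factor is consistent with storing one parity block for every two data blocks, that is, three blocks stored for two blocks of data.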
Read – local cell failure
Read – data center failure
Evaluation
• What and how much data is “warm”?
• How efficient is f4 in terms of throughput and latency?
Hot and warm divide
Evaluation
Takeaways
• f4, a warm storage system, together with Haystack provides the storage layer for BLOBs.
• “One-size-fits-all” no longer holds: different types of data should be handled differently.
• BLOBs in social networks, or social content in general, are initially hot and cool rapidly over time.
• f4 reduces the effective replication factor from 3.6x (Haystack) to 2.1x and is still resilient to failures (disks, hosts, racks, datacenters).