f4: Facebook’s Warm BLOB storage systems · wlloyd/classes/599s15/slides/23_f4.pdf

f4: Facebook’s Warm BLOB storage systems
Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, Sanjeev Kumar
* Some material borrowed from the f4 OSDI slides



Problem

• Facebook has to deal with many immutable objects
  • Large in size
  • Immutable binary data (BLOBs)
  • Photos, videos, attached files (as of Feb. 2014: 400 billion photos)
  • Creations, reads, deletions – NO modifications
• Hot and warm temperature zones exist!
  • New => “hot”
  • Cools over time (rapidly)
• Requirements:
  • Low latency
  • Storage efficiency (lower effective replication factor)

BLOB storage system in Facebook architecture

[Figure: architecture diagram; the only label that survives extraction is “TAO”]

BLOB storage system

• Creation (C)
  • C1. Request goes to the RT (router tier)
  • C2. RT directs the request to the storage system (Haystack)
• Read (R)
  • R1. Read from cache; if found, return
  • R2. On a cache miss, go to the TT (transformer tier)
  • R3. TT redirects the request to the RT
  • R4. RT directs the request to the storage system (Haystack/f4)
• Deletion (D)
  • D1. Request goes to the RT
  • D2. RT directs the request to the storage system
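
The creation, read, and deletion paths above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `BlobStore`, `RouterTier`, `TransformerTier`, and `read_blob` are hypothetical, the stores are in-memory dicts, the router simply tries Haystack first and then f4, and filling the cache on a miss is assumed rather than stated on the slide.

```python
class BlobStore:
    """Hypothetical in-memory stand-in for a storage system (Haystack or f4)."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob_id, data):
        self._blobs[blob_id] = data

    def get(self, blob_id):
        return self._blobs.get(blob_id)

    def delete(self, blob_id):
        self._blobs.pop(blob_id, None)


class RouterTier:
    """RT: directs each request to the storage system."""

    def __init__(self, haystack, f4):
        self.haystack = haystack
        self.f4 = f4

    def create(self, blob_id, data):
        # C2: new BLOBs are written to Haystack.
        self.haystack.put(blob_id, data)

    def read(self, blob_id):
        # R4: assumption for this sketch: try Haystack first, then f4.
        data = self.haystack.get(blob_id)
        return data if data is not None else self.f4.get(blob_id)

    def delete(self, blob_id):
        # D2: assumption for this sketch: forward the deletion to both stores.
        self.haystack.delete(blob_id)
        self.f4.delete(blob_id)


class TransformerTier:
    """TT: in this sketch it only redirects reads to the RT (R3)."""

    def __init__(self, router):
        self.router = router

    def read(self, blob_id):
        return self.router.read(blob_id)


def read_blob(cache, transformer, blob_id):
    # R1: serve from the cache when possible.
    if blob_id in cache:
        return cache[blob_id]
    # R2: cache miss, go to the TT.
    data = transformer.read(blob_id)
    if data is not None:
        cache[blob_id] = data  # assumption: fill the cache on a miss
    return data
```

A caller would build one `RouterTier` over two `BlobStore` instances, wrap it in a `TransformerTier`, and pass a plain dict as the cache. Cache invalidation on deletion is outside what the slide describes and is not modeled here.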

f4 Design

• Two main goals:
  • Storage efficiency
  • Fault tolerance
• f4 cell:
  • Resides within one data center
  • Only stores locked volumes
    • The data and index files are read-only.
    • Journal files are not present.

f4: Fault tolerance

• Within a data center:
  • Reed-Solomon encoding
    • (k, v): k data blocks, v parity blocks
    • Tolerates up to v block failures
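
The (k, v) property can be illustrated with a toy systematic erasure code in Python. This is a sketch under stated assumptions, not f4's encoder: it works over the prime field GF(257) using polynomial interpolation (production Reed-Solomon codes typically use GF(2^8)), and the names `encode`, `recover`, and `_interpolate` are hypothetical. It requires Python 3.8+ for the modular inverse via `pow`.

```python
P = 257  # small prime field, chosen only for illustration


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, v):
    """Return the k data symbols followed by v parity symbols.

    The data symbols are treated as the polynomial's values at x = 0..k-1;
    parity symbols are its values at x = k..k+v-1. Parity values may reach
    256, so they do not fit in one byte in this toy field.
    """
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [_interpolate(points, x) for x in range(k, k + v)]


def recover(blocks, k):
    """Rebuild the k data symbols from a stripe where lost blocks are None."""
    survivors = [(x, y) for x, y in enumerate(blocks) if y is not None]
    if len(survivors) < k:
        raise ValueError("more than v blocks lost; data is unrecoverable")
    points = survivors[:k]  # any k surviving blocks determine the polynomial
    return [_interpolate(points, x) for x in range(k)]
```

For example, with `stripe = encode([5, 17, 42, 200], v=2)`, setting any two entries of `stripe` to `None` still lets `recover(stripe, k=4)` return the original four symbols, while losing three raises an error.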

f4: Fault tolerance

• Between data centers:
  • XOR encoding
    0101 XOR 0011 = 0110
    0101 XOR 0110 = 0011
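
The two XOR lines above show why a third copy holding only the XOR of two blocks is enough to rebuild either one. A minimal Python sketch, assuming equal-sized blocks and using the hypothetical helper name `xor_blocks`:

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# The slide's example, stored as one-byte blocks.
block_a = bytes([0b0101])  # held in the first data center
block_b = bytes([0b0011])  # held in the second data center

# The third data center stores only the XOR of the two blocks.
parity = xor_blocks(block_a, block_b)      # 0b0110

# If the second data center is lost, its block is rebuilt from the other two.
recovered_b = xor_blocks(block_a, parity)  # 0b0011
```

The same call rebuilds `block_a` from `block_b` and `parity`, since XOR is its own inverse.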

Effective replication factor

• How much physical storage is required per unit of data stored?
• Haystack: 3.6x – storing one bit takes 3.6 physical bits
  • RAID-6: 1.2x
  • Replicated three times
• f4: 2.8x
  • Replicate each cell between two data centers
• f4: 2.1x
  • Use a third cell for XOR encoding
  • Reed-Solomon encoding (10,4): 1.4x
  • XOR encoding: 1.5x
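
The factors above multiply, which a short Python check makes explicit. This is a sketch: the helper name `rs_overhead` is hypothetical, and the 1.5x XOR factor is read as three physical blocks (two data blocks plus their XOR) per two logical blocks. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def rs_overhead(k: int, v: int) -> Fraction:
    """Physical blocks per logical block for a (k, v) Reed-Solomon code."""
    return Fraction(k + v, k)


# Haystack: RAID-6 (1.2x) on each of three replicas.
haystack = Fraction(6, 5) * 3                      # 18/5 = 3.6

# f4 with each cell replicated in two data centers.
f4_replicated = rs_overhead(10, 4) * 2             # 14/5 = 2.8

# f4 with a third cell holding the XOR of two cells: 3 blocks per 2 stored.
f4_xor = rs_overhead(10, 4) * Fraction(3, 2)       # 21/10 = 2.1
```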

Read – local cell failure

Read – data center failure

Evaluation

• What data is “warm”, and how much of it is there?
• How efficient is f4 in terms of throughput and latency?

Hot and warm divide

Evaluation

Takeaways

• f4, a warm storage system, together with Haystack provides the storage layer for BLOBs.
• “One-size-fits-all” no longer holds: different types of data should be handled differently.
• BLOBs in social networks, or social content in general, are initially hot and cool rapidly over time.
• f4 reduces the effective replication factor from 3.6x (Haystack) to 2.1x and is still resilient to failures (disks, hosts, racks, datacenters).