Finding a Needle in Haystack: Facebook’s Photo Storage
![Page 1: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/1.jpg)
Finding a needle in Haystack Facebook’s Photo Storage
Shakthi Bachala
![Page 2: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/2.jpg)
Outline
• Scenario
• Goal
• Problem
• Previous Approach
• Current Approach
• Evaluation
• Advantages
• Critique
• Conclusion
![Page 3: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/3.jpg)
Scenario

| | April 2009 | October 2011 |
|---|---|---|
| Total | 15 billion photos (4 × 15 billion = 60 billion images), 1.5 petabytes of data | 65 billion photos (4 × 65 billion = 260 billion images), 20 petabytes of data |
| Upload rate | 220 million photos / week, 25 terabytes of data | 1 billion photos / week, 60 terabytes of data |
| Serving rate | 550,000 images / sec | 1 million images / sec |
![Page 4: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/4.jpg)
Goal
• High throughput and low latency
• Fault-tolerant
• Cost-effective
• Simple
![Page 5: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/5.jpg)
Previous Approach : Typical design for Photo Sharing
![Page 6: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/6.jpg)
Previous Approach : NFS-based design for Photo Sharing at Facebook
![Page 7: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/7.jpg)
Previous Approach – NFS-based design
• Traditional file system architectures perform poorly under Facebook's kind of workload
• NFS-based design: the CDN effectively serves the hottest photos (profile pictures and recently uploaded photos), but Facebook also generates a lot of requests for less popular images (long-tail images). These requests miss in the CDN and have to be served by the backing store
• A typical website had a CDN hit rate of about 99%, whereas Facebook's was around 80%
![Page 8: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/8.jpg)
Long Tail Issue
![Page 9: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/9.jpg)
Previous Approach cont..
Problems with that approach were:
• Wasted storage capacity due to metadata
– Large metadata per file
– Each image stored as a separate file
• Large number of disk operations for reads
– Caused by large directories (directories containing thousands of files)
– Restructuring from large directories to small directories brought the disk I/O operations (iops) per read down from approximately 10 to 2.5-3.0
![Page 10: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/10.jpg)
Current Approach – Haystack Architecture
![Page 11: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/11.jpg)
Current Approach- Haystack Components
The main components of the Haystack architecture are:
1. Haystack Directory
2. Haystack Cache
3. Haystack Store
![Page 12: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/12.jpg)
Current Approach- Haystack Directory
The main goals of the Directory are:
• Map logical volumes to physical volumes
– 3 physical volumes (on 3 nodes) per logical volume
• Load balance
– Writes across logical volumes
– Reads across physical volumes (any of the 3 stores)
• Caching strategy: decide whether a photo request should be handled by the CDN or by the Cache
– URL generation
http://<CDN>/<Cache>/<Node>/<Logical volume id, Image id>
• Identify the logical volumes that are read-only, either for operational reasons or because those volumes have reached their storage capacity
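The Directory's responsibilities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the random load-balancing policy, and the exact URL formatting (volume id and image id joined by a comma) are hypothetical and are not taken from Facebook's implementation.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LogicalVolume:
    volume_id: int
    nodes: list            # hostnames of the 3 Store nodes holding the physical volumes
    writable: bool = True  # False once the volume is full or taken out of rotation


@dataclass
class Directory:
    volumes: dict = field(default_factory=dict)

    def add_volume(self, vol):
        self.volumes[vol.volume_id] = vol

    def mark_read_only(self, volume_id):
        self.volumes[volume_id].writable = False

    def pick_volume_for_write(self, rng=random):
        """Balance writes across the logical volumes that are still write-enabled."""
        candidates = [v for v in self.volumes.values() if v.writable]
        if not candidates:
            raise RuntimeError("no write-enabled logical volume available")
        return rng.choice(candidates)

    def photo_url(self, volume_id, image_id, cdn, cache, use_cdn=True, rng=random):
        """Build a read URL; reads are balanced across the physical replicas."""
        node = rng.choice(self.volumes[volume_id].nodes)
        path = f"{cache}/{node}/{volume_id},{image_id}"
        # Dropping the CDN part sends the request straight to the Cache.
        return f"http://{cdn}/{path}" if use_cdn else f"http://{path}"
```

Random choice is used here only because it is the simplest policy that spreads load; the slides do not say which balancing policy the real Directory uses.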
![Page 13: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/13.jpg)
Current Approach- Haystack Cache
• Approach:
– The Cache receives HTTP requests for photos from browsers or CDNs
– It is a distributed hash table with the photo id as the key to locate the cached data
– If the photo id is missing from the Cache, the Cache fetches the data from the photo server and returns it to the browser or the CDN, depending on where the request came from
![Page 14: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/14.jpg)
Current Approach- Haystack Cache
The Cache caches a photo only if both of the following conditions hold:
• The request comes directly from a user and not from the CDN
– Facebook’s experience with the NFS-based design showed that post-CDN caching is ineffective, as a request that misses in the CDN is unlikely to hit in the internal cache
• The photo is fetched from a write-enabled Store
– Photos are most heavily accessed soon after they are uploaded
– File systems generally work better when doing either reads or writes, but not both
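The two-condition admission rule can be written down as a small sketch. Everything here is an assumption made for illustration: a plain dictionary stands in for the distributed hash table, and the class name, the `fetch_from_store` callback, and the two boolean parameters are hypothetical.

```python
class HaystackCacheSketch:
    def __init__(self, fetch_from_store):
        self._data = {}                  # photo id -> bytes (stand-in for the DHT)
        self._fetch = fetch_from_store   # callable: photo id -> bytes

    def get(self, photo_id, from_cdn, store_is_write_enabled):
        """Return the photo, caching it only when both conditions hold."""
        if photo_id in self._data:
            return self._data[photo_id]
        photo = self._fetch(photo_id)
        # Condition 1: the request came directly from a user, not from the CDN.
        # Condition 2: the photo lives on a write-enabled Store.
        if not from_cdn and store_is_write_enabled:
            self._data[photo_id] = photo
        return photo

    def is_cached(self, photo_id):
        return photo_id in self._data
```

A photo requested through the CDN is still served, it is just not retained, which matches the observation that a CDN miss rarely becomes an internal cache hit.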
![Page 15: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/15.jpg)
Current Approach- Haystack Cache Hit Rate
![Page 16: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/16.jpg)
Current Approach : Haystack Store
• Replaces the storage and photo server layer in NFS based Design with this structure:
![Page 17: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/17.jpg)
Current Approach : Haystack Store
• Storage:
– 12 x 1 TB SATA, RAID6
• Filesystem:
– Single approx. 10 TB XFS filesystem
• Haystack:
– Log-structured, append-only object store containing needles as object abstractions
– 100 haystacks per node, each 100 GB in size
![Page 18: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/18.jpg)
Current Approach: Haystack Store File
![Page 19: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/19.jpg)
Current Approach: Operations in Haystack
• Photo Read
– Look up the offset / size of the image in the in-core index
– Read the data (approx. 1 iop)
• Photo Write
– Asynchronously append images one by one to the haystack file
– Move to the next haystack file when the current one becomes full
– Asynchronously append index records to the index file
– Flush the index file if there are too many dirty index records
– Update the in-core index
![Page 20: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/20.jpg)
Current Approach: Operations in Haystack
• Photo Delete
– Look up the offset of the image in the in-core index
– Mark the image needle flag as “DELETED”
– Update the in-core index
• Index File:
– Provides the minimum metadata needed to locate the needle in the Haystack store
– Subset of the needle header metadata
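The read, write, and delete steps can be illustrated with a minimal in-memory sketch. The binary layouts (`HEADER`, `INDEX_RECORD`), the flag value, and the class name are hypothetical and far simpler than the real needle format. The sketch also appends synchronously and uses a single haystack file, whereas the slides describe asynchronous appends and a rollover to the next file when one fills up.

```python
import io
import struct

# Hypothetical needle header: key (8 bytes), flags (1 byte), payload size (4 bytes).
HEADER = struct.Struct("<QBI")
# Hypothetical index record: key, offset, size (a subset of the header metadata).
INDEX_RECORD = struct.Struct("<QQI")
FLAG_DELETED = 1


class HaystackSketch:
    def __init__(self, data_file=None, index_file=None):
        self.data = io.BytesIO() if data_file is None else data_file
        self.index_file = io.BytesIO() if index_file is None else index_file
        self.incore = {}   # key -> (offset, size), the in-core index

    def write(self, key, payload):
        offset = self.data.seek(0, io.SEEK_END)          # append only
        self.data.write(HEADER.pack(key, 0, len(payload)) + payload)
        self.index_file.seek(0, io.SEEK_END)
        self.index_file.write(INDEX_RECORD.pack(key, offset, len(payload)))
        self.incore[key] = (offset, len(payload))

    def read(self, key):
        entry = self.incore.get(key)
        if entry is None:
            return None
        offset, size = entry
        self.data.seek(offset)
        needle = self.data.read(HEADER.size + size)      # one read: header + payload
        _key, flags, _size = HEADER.unpack_from(needle)
        if flags & FLAG_DELETED:
            return None
        return needle[HEADER.size:]

    def delete(self, key):
        entry = self.incore.pop(key, None)
        if entry is None:
            return False
        offset, _size = entry
        self.data.seek(offset + 8)                       # flags byte follows the 8-byte key
        self.data.write(bytes([FLAG_DELETED]))           # mark in place, do not reclaim space
        return True
```

Because the in-core index already holds the offset and size, a read needs no directory or inode lookups, which is where the "approx. 1 iop" figure comes from.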
![Page 21: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/21.jpg)
Current Approach: Haystack Index File
![Page 22: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/22.jpg)
Haystack Based Design - Photo Upload
![Page 23: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/23.jpg)
Haystack Based Design - Photo Download
![Page 24: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/24.jpg)
Current Approach: Operations in Haystack
• Filesystem:
– Haystack uses XFS, an extent-based file system
• It has two main advantages:
– The block maps for several contiguous large files can be small enough to be stored in main memory
– XFS provides efficient file preallocation, mitigating fragmentation and reining in how large block maps can grow
![Page 25: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/25.jpg)
Current Approach: Haystack Optimization
• Compaction:
– Infrequent online operation
– Creates a copy of the haystack, skipping duplicates and deleted photos
– Deletes follow a pattern similar to photo views: young photos are a lot more likely to be deleted
– In the last year, about 25% of the photos were deleted
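Compaction amounts to copying the log while dropping dead entries. The sketch below assumes a haystack can be modeled as an ordered list of needles in which a later needle with the same key supersedes an earlier one; the `Needle` type and `compact` function are hypothetical names for illustration.

```python
from typing import NamedTuple


class Needle(NamedTuple):
    key: int
    deleted: bool
    data: bytes


def compact(needles):
    """Return a new needle list without deleted or superseded entries."""
    latest = {}
    for position, needle in enumerate(needles):
        latest[needle.key] = position     # a later append supersedes earlier ones
    return [
        needle
        for position, needle in enumerate(needles)
        if latest[needle.key] == position and not needle.deleted
    ]
```

For example, compacting a log holding key 1, a deleted key 2, a newer copy of key 1, and key 3 keeps only the newer copy of key 1 followed by key 3.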
![Page 26: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/26.jpg)
Current Approach: Haystack Optimization
• Saving More Memory:
– With the following two techniques, Store machines reduced their main memory footprints by 20%
– Eliminate the need for an in-memory representation of flags by setting the offset to 0 for deleted photos
– Store machines do not keep track of cookie values in main memory and instead check the supplied cookie after reading the needle from disk
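Both memory-saving techniques can be shown together in a short sketch. The header layout, the `SUPERBLOCK` placeholder (assumed here so that no live needle ever sits at offset 0), and the class name are hypothetical; a `bytearray` stands in for the on-disk volume.

```python
import struct

# Hypothetical on-disk needle header: 8-byte cookie, 4-byte payload size.
HEADER = struct.Struct("<QI")
# Assumed placeholder at the start of the volume, so offset 0 never holds a needle.
SUPERBLOCK = b"HAYSTACK"


class CompactIndexStore:
    def __init__(self):
        self.volume = bytearray(SUPERBLOCK)
        self.index = {}    # key -> (offset, size); no flags and no cookie in memory

    def write(self, key, cookie, payload):
        offset = len(self.volume)
        self.volume += HEADER.pack(cookie, len(payload)) + payload
        self.index[key] = (offset, len(payload))

    def delete(self, key):
        if key in self.index:
            self.index[key] = (0, 0)     # offset 0 doubles as the "deleted" marker

    def read(self, key, cookie):
        offset, size = self.index.get(key, (0, 0))
        if offset == 0:
            return None                  # unknown or deleted
        stored_cookie, _size = HEADER.unpack_from(self.volume, offset)
        if stored_cookie != cookie:      # cookie is verified only after the read
            return None
        start = offset + HEADER.size
        return bytes(self.volume[start:start + size])
```

The trade-off is that a request with a wrong cookie costs one read before it is rejected, in exchange for a smaller per-photo index entry.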
![Page 27: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/27.jpg)
Current Approach: Haystack Optimization
• Batch Uploads:
– Disks perform better with large sequential writes than with small random writes, so Facebook batches uploads whenever possible
– Many users upload entire albums to Facebook rather than single pictures, which gives an opportunity to batch the uploads
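As a rough illustration of batching, the sketch below packs all photos of an album into one buffer and issues a single append. The header layout and the `append_batch` helper are hypothetical, and any seekable binary file object can play the role of the volume.

```python
import io
import struct

HEADER = struct.Struct("<QI")   # hypothetical needle header: key, payload size


def append_batch(volume, photos):
    """Append many (key, payload) pairs with one write; return key -> (offset, size)."""
    base = volume.seek(0, io.SEEK_END)
    buffer = bytearray()
    locations = {}
    for key, payload in photos:
        locations[key] = (base + len(buffer), len(payload))
        buffer += HEADER.pack(key, len(payload)) + payload
    volume.write(buffer)        # one sequential write for the whole album
    return locations
```

The returned offsets are what would be recorded in the in-core index, so batching changes how the bytes reach the disk but not how photos are later looked up.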
![Page 28: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/28.jpg)
Evaluation -Data
![Page 29: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/29.jpg)
Evaluation – Production Workload
![Page 30: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/30.jpg)
Advantages
• Simple design
• Decreases the number of disk operations by reducing the average metadata per photo
• Robust enough to handle a very large amount of data
• Fault tolerant
![Page 31: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/31.jpg)
Critique
• I think this approach is very Facebook-specific.
• Any others?
![Page 32: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/32.jpg)
Conclusion
• Built a simple but robust storage mechanism for Facebook photos that accommodates the long tail of photo requests, which previous approaches could not handle
![Page 33: Finding a needle in Haystack Facebook’s Photo Storage](https://reader036.vdocuments.mx/reader036/viewer/2022062222/5681616a550346895dd0f66c/html5/thumbnails/33.jpg)
References
1. http://www.usenix.org/event/osdi10/tech/
2. http://www.cs.uiuc.edu/class/sp11/cs525/sched.htm