Photo storage and delivery architecture at Badoo / Artem Denisov (Badoo)

Upload: ontico

Post on 06-Jan-2017

TRANSCRIPT

Badoo

330 million users. 3 PB of photos. 3.5 million uploads per day. 80k rps.

Photo servers and the place_id ranges they hold:

photos1: place_id 1..5
photos2: place_id 6..11
photosN: place_id m..n
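A minimal sketch of how a place_id range table like the one on this slide could be looked up. The function name and the table layout are hypothetical, and the start of the last range (12) is made up for illustration, since the slide only gives it as m..n.

```python
import bisect

# Hypothetical range table: (first place_id in the range, server name).
# 1..5 and 6..11 come from the slide; the start of the last range is made up.
RANGES = [(1, "photos1"), (6, "photos2"), (12, "photosN")]
STARTS = [start for start, _ in RANGES]


def server_for_place(place_id: int) -> str:
    """Return the server whose range contains place_id."""
    if place_id < STARTS[0]:
        raise ValueError(f"place_id {place_id} is below the first range")
    # Index of the last range whose start is <= place_id.
    idx = bisect.bisect_right(STARTS, place_id) - 1
    return RANGES[idx][1]
```

With this table, `server_for_place(6)` resolves to photos2. A static table like this is simple to reason about, but every change to the ranges has to be rolled out to all clients.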

bphotos1, bphotos2 ... bphotosN
Storage Area Network (SAN)

! $/GB
! (>500 rps per host)

7*10^9 reads / 3.5*10^6 writes per day

dataset, LRU
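Reading the slide's figures as 7*10^9 reads and 3.5*10^6 writes per day, the arithmetic below turns them into average per-second rates. These are averages only (peak load will be higher), and the variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

reads_per_day = 7 * 10**9
writes_per_day = 35 * 10**5  # 3.5 * 10^6

# Average request rates, ignoring daily peaks.
avg_read_rps = reads_per_day / SECONDS_PER_DAY
avg_write_rps = writes_per_day / SECONDS_PER_DAY

# How many reads happen for every write.
reads_per_write = reads_per_day // writes_per_day

print(f"reads:  ~{avg_read_rps:.0f} rps")
print(f"writes: ~{avg_write_rps:.1f} rps")
print(f"reads per write: {reads_per_write}")
```

The read rate comes out at roughly 81 thousand requests per second, in line with the 80k figure quoted elsewhere in the deck. With about 2000 reads per write, the workload is dominated by reads, which is why a cache layer pays off.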

bphotos1

bphotos2

bphotosN

Storage Area Network (SAN)

photoscache1

photoscacheN

Local cache

Buffer
Hot cache
Cold cache
Access log
proxy_pass, proxy_store from bphotos

Cache manager daemon

-> Hot cache
-> Cold cache
Cold cache
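The slide only names the pieces (Buffer, Hot cache, Cold cache, Access log, Cache manager daemon) and the direction of the moves. As a rough sketch of what one pass of such a daemon might decide, the code below counts requests from the access log and plans the moves. The promotion threshold, the cold cache limit, the eviction order and the function name are all assumptions for illustration, not details taken from the talk.

```python
from collections import Counter


def plan_moves(access_log, buffer, hot, cold, hot_threshold=2, cold_limit=2):
    """Plan one pass of a hypothetical cache manager.

    access_log: iterable of requested paths (stand-in for the nginx access log)
    buffer, hot, cold: sets of paths currently stored in each tier
    Returns a list of (action, path) tuples.
    """
    hits = Counter(access_log)
    moves = []

    # Buffer -> Hot cache: files requested often enough (assumed rule).
    for path in sorted(buffer):
        if hits[path] >= hot_threshold:
            moves.append(("buffer->hot", path))

    # Hot cache -> Cold cache: files with no requests in this log window.
    for path in sorted(hot):
        if hits[path] == 0:
            moves.append(("hot->cold", path))

    # Cold cache: evict the least requested files when over the limit.
    overflow = len(cold) - cold_limit
    if overflow > 0:
        for path in sorted(cold, key=lambda p: (hits[p], p))[:overflow]:
            moves.append(("evict", path))

    return moves
```

The sketch deliberately leaves out what happens to buffer files that never reach the threshold, because the slides do not say.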

How to choose a server?

photoscache1
photoscache2
photoscache3

Round-robin?
Hash % count?
hash(example_url) = 5
server_idx0 = 5 % 3 = 2
server_idx1 = 5 % 4 = 1
Consistent hashing?
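The hash(example_url) = 5 example shows one URL changing servers when a fourth host is added. A small sketch can show how widespread that effect is under "hash % count". Plain integers stand in for real hash values, and both function names are made up for illustration.

```python
def server_index(hash_value: int, server_count: int) -> int:
    """Pick a server with the 'hash % count' rule."""
    return hash_value % server_count


def moved_fraction(hash_values, old_count: int, new_count: int) -> float:
    """Share of keys that land on a different server after resizing."""
    moved = sum(
        1
        for h in hash_values
        if server_index(h, old_count) != server_index(h, new_count)
    )
    return moved / len(hash_values)


# The slide's example: the same URL moves from server 2 to server 1.
print(server_index(5, 3), server_index(5, 4))

# Over a full cycle of 12 hash values, going from 3 to 4 servers
# leaves only 3 keys in place.
print(moved_fraction(range(12), 3, 4))
```

Three quarters of the keys change servers, so most of the cache would have to be refilled after adding one host. This is the motivation for considering consistent hashing.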

Consistent hashing

Ring starting at 0; each request is placed on it at hash(sharding_key).
Servers A, B and C are points on the same ring.
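A minimal consistent hashing ring, as a sketch of the idea on these slides rather than Badoo's implementation. The use of MD5, the number of virtual points per server (`replicas`) and the class and method names are assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hashing ring with virtual points per server."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._points = []  # sorted positions on the ring
        self._owner = {}   # position -> server name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node: str) -> None:
        """Place `replicas` points for the server on the ring."""
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, sharding_key: str) -> str:
        """Return the server that owns the first point after the key."""
        point = self._hash(sharding_key)
        idx = bisect.bisect_right(self._points, point) % len(self._points)
        return self._owner[self._points[idx]]
```

When a server is added, the only keys that change owner are the ones the new server takes over. All other keys stay where they were, unlike with "hash % count".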

Load balancer

photoscache1
photoscache2
photoscache3
photoscache4 (reserve)

bphotos

Hit rate 98%: of 80k rps, 1600 rps reach bphotos

3 ( , , )

webp, progressive jpeg, resize/crop, (blur, pixelize)
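The three numbers on the hit rate slide fit together: the requests that miss the cache are the ones that reach bphotos. A short check, using integer arithmetic to avoid rounding noise (the function name is made up for illustration):

```python
def backend_rps(total_rps: int, hit_rate_percent: int) -> int:
    """Requests per second that miss the cache and reach the storage hosts."""
    return total_rps * (100 - hit_rate_percent) // 100


print(backend_rps(80_000, 98))  # 1600
```

Each extra percentage point of hit rate at this traffic level removes about 800 rps from the storage layer.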

CDN?

CDN

2013

bphotos1, bphotos2 ... bphotosN
Storage Area Network (SAN)
photoscache1 ... photoscacheN

80 x bphotos = ~560 TB
40 x photoscache
x 2

2013

bphotos1, bphotos2 ... bphotosN
Storage Area Network (SAN)
photoscache1 ... photoscacheN

3 x POINT OF FAILURE
! DATA LOSS
! MAINTENANCE

v.1

Async queue
bphotos
Local FS
Main partition
Backup partition
Buffer partition
Fiber -> Storage Area Network (x2)

! NO DATA LOSS
! POINT OF FAILURE
! MAINTENANCE

Dphotos

Async queue
bphotos
Local FS
Main partition
Backup partition
Buffer partition
Fiber -> Storage Area Network (x2)

Dphotos

Async queue

bphotos

Local FS

Main partition

Backup partition

Buffer partition

Dphotos

dphotosN: Buffer partition, Main partition
Async queue
dphotosN+1: Buffer partition, Main partition

Dphotos. Upload

Async queue

dphotosN

dphotosN+1

Round robin + health checks

Load balancer
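"Round robin + health checks" can be sketched as a balancer that cycles through the upload hosts and skips any host that fails its check. The class name and the `is_healthy` callback are hypothetical. A real balancer would probe hosts in the background instead of calling a function on every pick.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through hosts, skipping the ones that fail the health check."""

    def __init__(self, hosts, is_healthy):
        self._hosts = list(hosts)
        self._is_healthy = is_healthy  # callable(host) -> bool, assumed
        self._cycle = itertools.cycle(self._hosts)

    def pick(self) -> str:
        # Try each host at most once per call.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            if self._is_healthy(host):
                return host
        raise RuntimeError("no healthy upload hosts")


down = set()
balancer = RoundRobinBalancer(["dphotosN", "dphotosN+1"], lambda h: h not in down)
print(balancer.pick(), balancer.pick())  # alternates between the two hosts
```

If one host of the pair is marked down, every upload goes to the other host, and the upload path only fails when both hosts are down.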

Dphotos.

photoscache
Round robin + health checks
dphotosN
Async queue
dphotosN+1

HIT
MISS
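The HIT and MISS labels suggest that photoscache may not find a file on the first dphotos host it asks, for example when the async queue has not yet copied a fresh upload to it. One plausible reading, stated here as an assumption rather than a detail confirmed by the slides, is that a MISS is retried on the other host of the pair. The function name and the in-memory `stores` dictionary are stand-ins for real HTTP requests to the hosts.

```python
def fetch(path, hosts, stores):
    """Try each host of the pair in order and return (host, data) on the first HIT.

    hosts: host names in the order chosen by the balancer
    stores: host -> {path: bytes}, an in-memory stand-in for the hosts' disks
    """
    for host in hosts:
        data = stores[host].get(path)
        if data is not None:
            return host, data  # HIT
        # MISS: the file may not be replicated here yet, try the next host.
    raise FileNotFoundError(path)


stores = {
    "dphotosN": {"/a.jpg": b"A"},
    "dphotosN+1": {"/a.jpg": b"A", "/new.jpg": b"NEW"},
}
print(fetch("/new.jpg", ["dphotosN", "dphotosN+1"], stores))
```

In this example `/new.jpg` is a MISS on dphotosN and a HIT on dphotosN+1, so the request succeeds even though replication is still pending.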

Dphotos.

dphotosN

Async queue

Buffer partition

Main partition

dphotosN+1

Buffer partition

Main partition

Dphotos. ? 1.5 , SAN

CDN
photoscache
dphotos

Storage layer
Local drives
Storage Area Network

http://pinba.org
Immutable
Resize