Workload Analysis of a Large-Scale Key-Value Store

Post on 01-Jan-2016


DESCRIPTION

Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny. Workload Analysis of a Large-Scale Key-Value Store. Analyzes Memcached at Facebook: more than 284,000,000,000 requests across 5 different use cases, covering workload characteristics, locality, and cache effectiveness.

TRANSCRIPT

Workload Analysis of a Large-Scale Key-Value Store

Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny

2

Analyze Memcached at Facebook

+284,000,000,000 requests

5 different use cases

Workload characteristics, locality, cache effectiveness

3

Why Is Caching Important?

Cache Servers

Web Servers

Database

4

Motivation

Understand workload characteristics

Identify factors affecting performance

Provide a benchmark for future studies

5

Memcached

Distributed memory caching system

Key-value store for small objects

[Diagram: a key is passed through a hash function to select one of the Memcached servers]
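The key-to-server mapping on this slide can be sketched as follows. This is a minimal illustration, not Facebook's actual scheme: the server names are hypothetical, and plain modulo hashing with MD5 is an assumption chosen for brevity (production clients often use consistent hashing instead).

```python
import hashlib

def pick_server(key: str, servers: list[str]) -> str:
    """Map a key to one server by hashing the key modulo the server count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical server names, for illustration only.
SERVERS = ["mc-0", "mc-1", "mc-2", "mc-3"]

if __name__ == "__main__":
    for key in ("user:42", "user:43", "app:settings"):
        print(key, "->", pick_server(key, SERVERS))
```

The same key always lands on the same server, so every web server can locate a cached object without coordination.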

6

Tracing Methodology

Capture traces through a Linux Kernel Module (LKM)

Process traces with Hive

Memcached

Transport (TCP/UDP)

Network

Ethernet

LKM

7

Facebook Deployment

Pool  Size      Description
USR   Few       User-account status information
APP   Dozens    Object metadata of a popular application
SYS   Few       System data on service location
VAR   Dozens    Server-side browser information
ETC   Hundreds  Nonspecific, general purpose

Contains server related information

Anything that doesn’t belong to a specific pool goes to ETC

8

Analysis

Workload Characteristics

Locality, Cache Behavior

9

Request Composition

> 99.8% GET

GET:UPDATE = 30:1

10

Key Size Distribution

90% of VAR keys are 31B

USR keys are 16B or 21B

ETC is heterogeneous

11

Value Size Distribution

USR values are only 2B

90% of values are smaller than 500B

12

Value Size Dist. By Overall Weight

In every pool except ETC, 90% of data is generated by values of 500B or smaller

In ETC, 90% of data is generated by values of 10KB or smaller

13

Request Rate Over Time

All pools show diurnal pattern except SYS

14

Request Rate Over Time (ETC)

Night time in Western Hemisphere

North America starts its day

15

Analysis

Workload Characteristics

Locality, Cache Behavior

16

Repeating Keys

0.0003% of keys in 10% of requests in ETC

1% of keys in 55% of requests in ETC

Least frequent 50% of keys in 1% of requests in ETC
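Skew figures like the ones above can be computed from a request trace roughly as follows. This is a sketch under assumptions: the trace is taken to be a simple in-memory list of requested keys, and the function name `top_key_share` is hypothetical rather than anything from the study's tooling.

```python
from collections import Counter
import math

def top_key_share(requested_keys, key_fraction):
    """Share of requests going to the most popular `key_fraction` of distinct keys."""
    if not requested_keys:
        raise ValueError("trace is empty")
    counts = Counter(requested_keys)
    # Request counts per key, most popular first.
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, math.ceil(len(ranked) * key_fraction))
    return sum(ranked[:top_n]) / len(requested_keys)

if __name__ == "__main__":
    # Toy trace: one hot key and three cold ones.
    trace = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
    print(top_key_share(trace, 0.25))  # share of requests hitting the top 25% of keys
```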

17

Locality Over Time

[Figure: % of unique keys out of total in unit time (y-axis, 0 to 100) for pools USR, APP, ETC, VAR, SYS, at 5min and 60min intervals]
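The metric plotted on this slide, the percentage of unique keys out of all keys requested in a time unit, can be sketched as below. The trace layout (pairs of timestamp in seconds and key) and the function name `unique_key_ratio` are assumptions made for illustration.

```python
from collections import defaultdict

def unique_key_ratio(trace, window_seconds):
    """Return {window_index: percent of unique keys among all requests in that window}.

    `trace` is an iterable of (timestamp_seconds, key) pairs.
    """
    keys_per_window = defaultdict(list)
    for timestamp, key in trace:
        keys_per_window[int(timestamp // window_seconds)].append(key)
    return {
        window: 100.0 * len(set(keys)) / len(keys)
        for window, keys in sorted(keys_per_window.items())
    }

if __name__ == "__main__":
    trace = [(0, "a"), (10, "a"), (20, "b"), (310, "c"), (320, "c")]
    print(unique_key_ratio(trace, 300))   # 5-minute windows
    print(unique_key_ratio(trace, 3600))  # 60-minute windows
```

A lower percentage means more repeated keys within the window, that is, stronger temporal locality.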

18

Reuse Period of Keys

99.9% of SYS keys are reused in 1hr

88.5% of ETC keys are reused in 1hr

96.4% of ETC keys are reused in 6hr

19

Hit Rate

USR 98.2%, APP 92.9%, ETC 81.4%, VAR 93.7%, SYS 98.7%

Why?

20

Causes of ETC Cache Misses

Cause         Share of misses  Share of all requests
Compulsory    70%              13%
Capacity      22%              4%
Invalidation  8%               2%

Hits account for the remaining 81% of requests.
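One way to attribute misses to the three causes named on this slide is to replay a trace against a simulated cache. The sketch below is not the authors' method: it assumes a single LRU cache with a fixed item capacity, a trace of (operation, key) pairs, and that clients refill the cache after every miss. The function name `classify_requests` is hypothetical.

```python
from collections import Counter, OrderedDict

def classify_requests(trace, capacity):
    """Label each GET as hit, compulsory, capacity, or invalidation miss.

    `trace` is an iterable of (op, key) with op in {"GET", "SET", "DELETE"}.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    seen = set()           # keys that have been cached at some point
    deleted = set()        # keys last removed by an explicit DELETE
    outcome = Counter()

    def store(key):
        cache[key] = None
        cache.move_to_end(key)
        seen.add(key)
        deleted.discard(key)
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used key

    for op, key in trace:
        if op == "GET":
            if key in cache:
                cache.move_to_end(key)
                outcome["hit"] += 1
                continue
            if key not in seen:
                outcome["compulsory"] += 1    # first reference to this key
            elif key in deleted:
                outcome["invalidation"] += 1  # removed by a DELETE
            else:
                outcome["capacity"] += 1      # evicted to make room
            store(key)  # assumption: the client fills the cache after a miss
        elif op == "SET":
            store(key)
        elif op == "DELETE" and key in cache:
            del cache[key]
            deleted.add(key)
    return outcome

if __name__ == "__main__":
    trace = [("GET", "a"), ("GET", "a"), ("GET", "b"), ("GET", "c"),
             ("GET", "a"), ("DELETE", "c"), ("GET", "c")]
    print(classify_requests(trace, capacity=2))
```

Dividing each miss count by the total number of misses gives the share of misses per cause, and dividing by the total number of GETs gives the share of all requests.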

21

Conclusion

Analyzed 5 different memcached use cases

Different applications of memcached have extreme variations in access patterns

Answered pertinent questions to improve Facebook’s memcached usage

22

Thank You

Questions?
