Java Caching, Turbo Charged

JavaDevRoom, FOSDEM 2015

Jens Wilke, headissue GmbH

twitter.com/cruftex

github.com/cruftex

http://cache2k.org



cache2k Overview

● Started in 2000 as an in-house product and evolving ever since

● Focus on in-memory (in-heap) caching (persistence and off-heap storage are on the way)

● Research on optimized performance / modern eviction policies

● Open sourced 2013

● Contains features not found in (all) cache products, e.g.:

– On time expiry

– Extensive statistics

– Support for exceptions and nulls

– Blocking fetch for multiple requests on the same key (read through configuration)
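The blocking fetch from the list above can be illustrated with plain JDK classes. This is a minimal sketch and not the cache2k implementation: the class name `BlockingReadThroughCache`, the `loader` function and the load counter are assumptions made up for the example. It relies on `ConcurrentHashMap.computeIfAbsent`, which applies the mapping function at most once per key and lets other threads asking for the same key wait for the result.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration, not the cache2k API.
final class BlockingReadThroughCache<K, V> {
  private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
  private final Function<K, V> loader;               // the "cache source"
  private final AtomicInteger loadCount = new AtomicInteger();

  BlockingReadThroughCache(Function<K, V> loader) {
    this.loader = loader;
  }

  // Read through: a miss calls the loader. Concurrent callers asking for
  // the same key wait inside computeIfAbsent instead of loading twice.
  V get(K key) {
    return map.computeIfAbsent(key, k -> {
      loadCount.incrementAndGet();
      return loader.apply(k);
    });
  }

  // Number of times the loader was actually invoked.
  int getLoadCount() {
    return loadCount.get();
  }
}
```

The sketch has no eviction or expiry, and `computeIfAbsent` stores neither null results nor exceptions, so it does not cover the other features in the list.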

Eviction Algorithms

flickr:alexanderkafka

LRU

[Figure: list of entries 1 to 7 with the LRU entry marked at one end. On a cache access the entry (4 in the figure) moves to the front.]
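A minimal LRU sketch, assuming the JDK's `LinkedHashMap` in access order is an acceptable stand-in for the list on the slide. The class name `LruCache` is made up for the example; it is not how any of the benchmarked caches implement LRU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of LRU using only JDK classes.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;

  LruCache(int capacity) {
    // accessOrder = true: every get() or put() moves the entry to the
    // most recently used end of the internal linked list
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  // Called by LinkedHashMap after each insert; returning true evicts
  // the least recently used entry.
  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    return size() > capacity;
  }
}
```

Every `get()` reorders the list, so the map is modified even on a hit and concurrent use needs exclusive access. That cost on each access is what the clock-based policies avoid.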

CLOCK

[Figure: entries arranged in a circle, each marked 1=hit or 0=no hit, with the clock hand pointing at one entry.]
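A minimal sketch of the CLOCK policy with one hit bit per entry. It is not the cache2k/CLOCK implementation; the names `ClockCache` and `Slot` are assumptions for illustration, and the sketch is single-threaded.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of CLOCK, not taken from cache2k.
final class ClockCache<K, V> {
  private static final class Slot<K, V> {
    K key;
    V value;
    boolean hit;                       // 1=hit, 0=no hit
  }

  private final int capacity;          // must be at least 1
  private final List<Slot<K, V>> slots = new ArrayList<>();
  private final Map<K, Slot<K, V>> index = new HashMap<>();
  private int hand = 0;

  ClockCache(int capacity) {
    this.capacity = capacity;
  }

  // On access only the hit bit is set, nothing is moved.
  V get(K key) {
    Slot<K, V> s = index.get(key);
    if (s == null) {
      return null;
    }
    s.hit = true;
    return s.value;
  }

  void put(K key, V value) {
    Slot<K, V> s = index.get(key);
    if (s != null) {
      s.value = value;
      return;
    }
    if (slots.size() < capacity) {
      s = new Slot<>();
      slots.add(s);
    } else {
      s = evict();                     // reuse the slot of the victim
    }
    s.key = key;
    s.value = value;
    s.hit = false;
    index.put(key, s);
  }

  // Advance the hand: entries with a hit get their bit cleared and a
  // second chance, the first entry without a hit is evicted.
  private Slot<K, V> evict() {
    while (true) {
      Slot<K, V> s = slots.get(hand);
      hand = (hand + 1) % capacity;
      if (s.hit) {
        s.hit = false;
      } else {
        index.remove(s.key);
        return s;
      }
    }
  }
}
```

The work is shifted from the access to the eviction: a hit only writes one flag, while the eviction may have to scan several entries before it finds one without a hit.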

Improving on LRU... protect the working set

● For completeness: Least frequently used

– LFU

– LRFU

– …

● Split set of entries into cold and hot, to protect the working set

– 2Q

– LIRS

– ARC – Adaptive Replacement Cache

● Nimrod Megiddo and Dharmendra S. Modha (Usenix 2003) – patented by IBM

– Clock-Pro

● Song Jiang, Feng Chen and Xiaodong Zhang (Usenix 2005)

[Figure: the entries split into a cold set and a hot set.]

Improving on LRU... history of seen entries

● Keep an LRU list of the evicted keys

● If seen again, insert directly into hot set

[Figure: cold set and hot set, plus a ghost set that holds only keys.]
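The idea of the ghost set can be sketched as follows. This is a deliberately simplified illustration and none of the named algorithms (2Q, LIRS, ARC, Clock-Pro): the class `GhostListCache`, its three fixed capacities and the rule that entries reach the hot set only via the ghost set are assumptions made for the example.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;

// Hypothetical illustration; none of the names come from cache2k.
final class GhostListCache<K, V> {
  private final int coldCapacity;
  private final int hotCapacity;
  private final int ghostCapacity;
  // access-ordered maps: iteration starts at the least recently used entry
  private final LinkedHashMap<K, V> cold = new LinkedHashMap<>(16, 0.75f, true);
  private final LinkedHashMap<K, V> hot = new LinkedHashMap<>(16, 0.75f, true);
  // evicted keys in eviction order, no values are kept
  private final LinkedHashSet<K> ghost = new LinkedHashSet<>();

  GhostListCache(int coldCapacity, int hotCapacity, int ghostCapacity) {
    this.coldCapacity = coldCapacity;
    this.hotCapacity = hotCapacity;
    this.ghostCapacity = ghostCapacity;
  }

  V get(K key) {
    V v = hot.get(key);
    return v != null ? v : cold.get(key);
  }

  boolean isHot(K key) {
    return hot.containsKey(key);
  }

  void put(K key, V value) {
    if (hot.containsKey(key)) { hot.put(key, value); return; }
    if (cold.containsKey(key)) { cold.put(key, value); return; }
    if (ghost.remove(key)) {
      // seen before and evicted: insert directly into the hot set
      hot.put(key, value);
      if (hot.size() > hotCapacity) {
        removeFirst(hot.keySet());
      }
    } else {
      // first sighting: goes to the cold set
      cold.put(key, value);
      if (cold.size() > coldCapacity) {
        ghost.add(removeFirst(cold.keySet()));   // remember only the key
        if (ghost.size() > ghostCapacity) {
          removeFirst(ghost);
        }
      }
    }
  }

  // Removes and returns the first (oldest) element of the collection.
  private static <T> T removeFirst(Iterable<T> items) {
    Iterator<T> it = items.iterator();
    T first = it.next();
    it.remove();
    return first;
  }
}
```

A scan over many keys that are each requested once only churns the cold set, so the entries in the hot set, the working set, stay protected.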

Clock-Pro+

[Figure: Clock-Pro+ with two clock hands, handHot and handCold, moving over entries that carry hit counters (0 hits to 5 hits).]

Clock-Pro+ Evaluation

– Only an inexpensive operation on access, no exclusive access needed

– Better efficiency than LRU for most analyzed workloads

– Downside

● Eviction overhead increases when the possible hitrates get high (e.g. 3 entries scanned per eviction at 50% hitrate, 10 entries scanned at 95%)

● High complexity, no straightforward implementation by the book, lots of tuning needed (and possible)

– Still missing:

● Optimal selection of cold / hot space sizes

Benchmarks

flickr:bantam10

Benchmark Setup

● Cache implementations:

– cache2k Version 0.21 (to be released next week)

– EHCache Version 2.9.0

– Guava 18

– Infinispan 7.1.0.CR2

● Oracle JRE 1.8-25

● Hardware

– Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Test workload

– Keys and values are integers

– Read through configuration, the cache source just returns the key

– Not practical: the emphasis is on the caching overhead

// run the benchmark
Integer[] trace = …;
for (Integer v : trace) {
  cache.get(v);
}

// Implementation of cache source
public Integer get(Integer o) {
  incrementMissCount();
  return o;
}
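The two fragments above can be combined into a self-contained program. This is a sketch under assumptions: the cache is a stand-in built from `LinkedHashMap` in LRU order, not one of the benchmarked products, and the names `ReadThroughBenchmark`, `CountingSource`, `ReadThroughLru` and `hitrate` are made up for the example. Because the source counts every call, the hitrate follows from the miss count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical harness, not the cache2k-benchmarks code.
final class ReadThroughBenchmark {

  // Cache source as on the slide: count the miss, return the key.
  static final class CountingSource {
    long missCount;

    Integer get(Integer o) {
      missCount++;
      return o;
    }
  }

  // Stand-in read through cache with LRU eviction.
  static final class ReadThroughLru {
    private final CountingSource source;
    private final LinkedHashMap<Integer, Integer> map;

    ReadThroughLru(final int capacity, CountingSource source) {
      this.source = source;
      this.map = new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
          return size() > capacity;
        }
      };
    }

    Integer get(Integer key) {
      Integer v = map.get(key);
      if (v == null) {                 // miss: read through to the source
        v = source.get(key);
        map.put(key, v);
      }
      return v;
    }
  }

  // Runs the trace and derives the hitrate from the miss counter.
  static double hitrate(Integer[] trace, int capacity) {
    CountingSource source = new CountingSource();
    ReadThroughLru cache = new ReadThroughLru(capacity, source);
    for (Integer v : trace) {
      cache.get(v);
    }
    return 1.0 - (double) source.missCount / trace.length;
  }

  public static void main(String[] args) {
    Integer[] trace = {1, 2, 1, 3, 1, 2};
    long start = System.nanoTime();
    double rate = hitrate(trace, 2);
    long micros = (System.nanoTime() - start) / 1_000;
    System.out.println("hitrate=" + rate + " runtime=" + micros + "us");
  }
}
```

A single timed pass like the one in `main` says little on a JIT-compiled JVM; a real measurement needs warm-up runs and repetitions.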

Runtime for artificial traces

3 million requests on a cache with a capacity of 500

Except Hits2000: cache with a capacity of 2000

Hits: repeat 500 different values

Random: random selection from 1000 values

Eff90 / Eff95: random trace with approx. 90% and 95% hitrate on LRU
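The Hits and Random traces could be generated along these lines. The slides only give the one-line descriptions, so the cyclic order for Hits, the uniform distribution for Random, the seed parameter and the name `ArtificialTraces` are all assumptions. Eff90 and Eff95 are left out because their construction is not described.

```java
import java.util.Random;

// Hypothetical trace generators; the actual generators may differ.
final class ArtificialTraces {

  // "Hits": cycles through `distinct` different values, so a cache that
  // holds all of them only misses during the first pass.
  static Integer[] hits(int length, int distinct) {
    Integer[] trace = new Integer[length];
    for (int i = 0; i < length; i++) {
      trace[i] = i % distinct;
    }
    return trace;
  }

  // "Random": uniform random selection from `range` values. The fixed
  // seed makes the trace repeatable across cache implementations.
  static Integer[] random(int length, int range, long seed) {
    Random rnd = new Random(seed);
    Integer[] trace = new Integer[length];
    for (int i = 0; i < length; i++) {
      trace[i] = rnd.nextInt(range);
    }
    return trace;
  }

  public static void main(String[] args) {
    Integer[] hits = hits(3_000_000, 500);
    Integer[] random = random(3_000_000, 1000, 1802L);
    System.out.println(hits.length + " / " + random.length + " requests");
  }
}
```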

[Chart: Runtime of 3 million cache requests. Y axis: runtime in seconds (0 to 6). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]

Runtime for mostly hits

[Chart: Runtime of 3 million cache hits. Y axis: runtime in seconds (0 to 2). Series: HashMap+Counter, cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]

Runtimes of the first four implementations for Hits: 20 ms, 50 ms, 50 ms, 70 ms

Runtime with two threads

[Chart: 3 million cache requests Eff95 per thread count. Y axis: runtime in seconds (0 to 2.5). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]

Some CPU-consuming computation is done on each cache miss

Eff95Threads2: the same trace is executed in a separate thread with an index offset

Hitrate comparison - Artificial traces

[Chart: Hitrate of 3 million cache requests. Y axis: 0 to 100 (labelled "runtime in seconds" on the slide). Series: cache2k/CLOCK, cache2k/CP+, cache2k/ARC, EHCache, Infinispan, Guava]

Hitrate comparison - Multi2 trace

[Chart: Hitrates for Multi2 trace. Y axis: 0 to 80. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]

Hitrate comparison - Web12 trace

[Chart: Hitrates for Web12 trace. Y axis: 0 to 90. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]

Hitrate comparison - Sprite trace

[Chart: Hitrates for Sprite trace. Y axis: 0 to 100. Series: OPT, LRU, CLOCK, CP+, ARC, EHCache, Infinispan, Guava, RAND]

Take away

● The goal:

– Eviction algorithm doing better than LRU

– Self-tuning / adaptive

– Minimal overhead on cache access

Clock-Pro+ is almost there

Get involved...

● Try it: cache2k is on Maven Central

● Source on GitHub:

– http://github.com/headissue/cache2k

– http://github.com/headissue/cache2k-benchmarks

● Ask questions on Stack Overflow!

Thanks & Enjoy Life!

http://cruftex.net

http://cache2k.org