TRANSCRIPT
ARC: A Self-Tuning, Low Overhead Replacement Cache
Nimrod Megiddo and Dharmendra S. Modha
Presented by Gim Tcaesvk
Cache
(Diagram: cache sitting in front of slower auxiliary memory.)
Cache is expensive.
The replacement policy is the only algorithm of interest.
Cache management problem
Maximize the hit rate.
Minimize the overhead (computation and space).
Belady’s MIN
Replaces the page whose next reference is furthest in the future.
Optimal in every case; provides an upper bound on the hit ratio.
But who knows the future?
LRU (Least Recently Used)
Replaces the least recently used page.
Optimal policy for SDD (Stack Depth Distribution).
Captures recency but not frequency.
(Slide animation: the reference string 0 1 2 3 2 3 2 1 stepped through an LRU stack; each access moves the page to the top, and on a miss the bottom page is evicted.)
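The LRU policy can be sketched in a few lines of Python (a hypothetical minimal cache, not code from the paper), replaying the slide's reference string with an assumed capacity of 3:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used page on a miss."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # iteration order = LRU ... MRU

    def access(self, page):
        """Return True on a hit, False on a miss."""
        if page in self.pages:
            self.pages.move_to_end(page)  # refresh recency on a hit
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the LRU page
        self.pages[page] = None
        return False

cache = LRUCache(3)
hits = sum(cache.access(p) for p in [0, 1, 2, 3, 2, 3, 2, 1])
# hits == 4 for this trace: 2, 3, 2, and 1 are still resident when re-referenced
```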
LFU (Least Frequently Used)
Replaces the least frequently used page.
Optimal policy for IRM (Independent Reference Model).
Captures frequency but not recency.
(Slide animation: the reference string 0 0 1 1 2 3 stepped through LFU, showing per-page reference counts.)
Drawbacks of LFU
Logarithmic complexity in cache size.
A frequently used page is hard to page out, even when it is no longer useful.
LRU-2
Replaces the page whose second-most-recent reference is least recent.
Under IRM, the optimal online policy among those that know at most the 2 most recent references to each page.
Limitations of LRU-2
Logarithmic complexity, due to the priority queue.
A tunable parameter, CIP (Correlated Information Period).
(Slide example: pages annotated with their two most recent reference times, e.g. 2(2,4), 1(1,3), 0(-∞,0).)
2Q (modified LRU-2)
Uses simple LRU list instead of the priority queue.
Two parameters, Kin and Kout:
Kin = Correlated Information Period; Kout = Retained Information Period.
LIRS (Low Inter-reference Recency Set)
Maintains 2 LRU stacks of different sizes.
Llirs holds LRU pages seen at least twice recently.
Lhirs holds LRU pages seen only once recently.
Works well for IRM, but not for SDD.
FBR (Frequency-Based Replacement)
Divides the LRU list into 3 sections: new, middle, and old.
Reference counts are not incremented in the new section.
Replaces the page in the old section with the smallest reference count.
LRFU (Least Recently/Frequently Used)
Assigns every page x a CRF (Combined Recency and Frequency) value C(x), updated at each time step t:
C(x) = 1 + 2^(-λ) C(x) if x is referenced at time t; C(x) = 2^(-λ) C(x) otherwise.
λ close to 1 behaves like LRU; λ close to 0 behaves like LFU.
Exponential smoothing. λ is hard to tune.
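The CRF update above can be written out directly (a minimal sketch; `update_crf` is my own helper name, not from the LRFU paper):

```python
def update_crf(crf, referenced, lam=0.5):
    """One LRFU time step: decay the CRF value by 2**(-lam),
    and add 1 if the page was referenced at this step."""
    decayed = 2 ** (-lam) * crf
    return decayed + 1 if referenced else decayed

# lam = 0: no decay, CRF becomes a plain reference count (LFU-like).
# lam near 1: old references decay quickly, recency dominates (LRU-like).
```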
MQ (Multi-Queue)
Uses m LRU queues Q0, Q1, ..., Qm-1.
Qi contains pages that have been seen [2^i, 2^(i+1)) times recently.
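The queue index follows directly from the reference count; a sketch (the function name and signature are my own, not from the MQ paper):

```python
def mq_queue_index(ref_count, m):
    """Return i such that 2**i <= ref_count < 2**(i+1), capped at m-1.
    Assumes ref_count >= 1 (a page is counted once when first seen)."""
    return min(ref_count.bit_length() - 1, m - 1)
```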
Introducing a Class of Replacement Policies
DBL(2c): DouBLe
Maintains 2 variable-sized LRU lists holding up to 2c pages in total:
L1 contains pages seen exactly once recently; L2 contains pages seen at least twice recently.
Sizes satisfy 0 ≤ |L1| ≤ c and 0 ≤ |L1| + |L2| ≤ 2c.
DBL(2c) flow
On a request for page x:
If x ∈ L1 ∪ L2 (hit): move x to the MRU position of L2.
Otherwise (miss): if |L1| = c, delete the LRU page of L1; else if |L1| + |L2| = 2c, delete the LRU page of L2. Then insert x at the MRU position of L1.
The c most recently requested pages are always contained.
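The DBL(2c) flow can be sketched in Python (my own illustrative code, not the authors'; OrderedDict stands in for each LRU list):

```python
from collections import OrderedDict

class DBL:
    """DBL(2c): L1 holds pages seen once recently, L2 pages seen at
    least twice recently; |L1| <= c and |L1| + |L2| <= 2c."""
    def __init__(self, c):
        self.c = c
        self.l1 = OrderedDict()  # iteration order = LRU ... MRU
        self.l2 = OrderedDict()

    def access(self, x):
        """Return True on a hit, False on a miss."""
        if x in self.l1:                 # second recent reference
            del self.l1[x]
            self.l2[x] = None            # move to MRU of L2
            return True
        if x in self.l2:                 # repeat reference
            self.l2.move_to_end(x)       # move to MRU of L2
            return True
        # miss: make room, then insert at MRU of L1
        if len(self.l1) == self.c:
            self.l1.popitem(last=False)              # delete LRU of L1
        elif len(self.l1) + len(self.l2) == 2 * self.c:
            self.l2.popitem(last=False)              # delete LRU of L2
        self.l1[x] = None
        return False
```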
Π(c): A new class of policies
Splits each Li into a top Ti (pages in cache) and a bottom Bi (history-only "ghost" pages):
L1 = T1 ∪ B1 and L2 = T2 ∪ B2.
|T1| + |T2| ≤ c; if |L1| + |L2| ≥ c, then |T1| + |T2| = c.
If |L1| + |L2| < c, then B1 and B2 are empty.
Every page in Ti is more recent than the MRU page of Bi.
FRCp (Fixed Replacement Cache)
FRCp(c) is a policy in Π(c) tuned by a fixed parameter p, with 0 ≤ p ≤ c.
It attempts to keep |T1| ≈ p and |T2| ≈ c - p, with |T1| + |T2| = c.
FRCp(c) flow
On a request for page x:
Hit in T1 ∪ T2: move x to the MRU position of T2.
Hit in B1 ∪ B2: REPLACE, then move x from the ghost list to the MRU position of T2.
Miss with |L1| = c: if |T1| < c, delete the LRU page of B1 and REPLACE; otherwise delete the LRU page of T1. Insert x at the MRU position of T1.
Miss with |L1| < c and |L1| + |L2| ≥ c: delete the LRU page of B2 if |L1| + |L2| = 2c, then REPLACE. Insert x at the MRU position of T1.
REPLACE: if |T1| > p (or x ∈ B2 and |T1| = p), move the LRU page of T1 to the MRU of B1; otherwise move the LRU page of T2 to the MRU of B2.
ARC (Adaptive Replacement Cache)
US Patent 20040098541; Assignee: IBM Corp.
ARC (Adaptive RC)
Adaptation parameter p ∈ [0, c].
For a given (fixed) value of p, behaves exactly like FRCp. But it learns:
ARC continuously adapts and tunes p.
Learning
A hit in ghost list B1 suggests T1 should be larger; a hit in B2 suggests T2 should be larger. So:
On a hit in B1: increase p by max(|B2| / |B1|, 1), capped at c.
On a hit in B2: decrease p by max(|B1| / |B2|, 1), floored at 0.
ARC flow
Identical to the FRCp(c) flow, except that p is adapted on ghost hits:
Hit in T1 ∪ T2: move x to the MRU position of T2.
Hit in B1: p ← min(p + max(|B2| / |B1|, 1), c); REPLACE; move x to the MRU position of T2.
Hit in B2: p ← max(p - max(|B1| / |B2|, 1), 0); REPLACE; move x to the MRU position of T2.
Miss with |L1| = c: if |T1| < c, delete the LRU page of B1 and REPLACE; otherwise delete the LRU page of T1. Insert x at the MRU position of T1.
Miss with |L1| < c and |L1| + |L2| ≥ c: delete the LRU page of B2 if |L1| + |L2| = 2c, then REPLACE. Insert x at the MRU position of T1.
REPLACE: if |T1| > p (or x ∈ B2 and |T1| = p), move the LRU page of T1 to the MRU of B1; otherwise move the LRU page of T2 to the MRU of B2.
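The full ARC flow, including REPLACE and the adaptation of p, can be sketched in Python (an illustrative reimplementation, not the authors' code; p is kept as an integer here, whereas the paper allows fractional increments):

```python
from collections import OrderedDict

class ARC:
    """Sketch of ARC(c): T1/T2 hold cached pages, B1/B2 are
    history-only ghost lists; p is the adaptive target for |T1|."""
    def __init__(self, c):
        self.c = c
        self.p = 0
        self.t1, self.b1 = OrderedDict(), OrderedDict()
        self.t2, self.b2 = OrderedDict(), OrderedDict()

    def _replace(self, x):
        """Evict from T1 if it exceeds the target p, else from T2."""
        if self.t1 and (len(self.t1) > self.p or
                        (x in self.b2 and len(self.t1) == self.p)):
            lru, _ = self.t1.popitem(last=False)
            self.b1[lru] = None          # demote to ghost list B1
        else:
            lru, _ = self.t2.popitem(last=False)
            self.b2[lru] = None          # demote to ghost list B2

    def access(self, x):
        """Return True on a cache hit, False otherwise."""
        if x in self.t1:                 # hit: promote to MRU of T2
            del self.t1[x]; self.t2[x] = None
            return True
        if x in self.t2:                 # hit: refresh in T2
            self.t2.move_to_end(x)
            return True
        if x in self.b1:                 # ghost hit: grow T1's target
            self.p = min(self.p + max(len(self.b2) // len(self.b1), 1), self.c)
            self._replace(x)
            del self.b1[x]; self.t2[x] = None
            return False
        if x in self.b2:                 # ghost hit: shrink T1's target
            self.p = max(self.p - max(len(self.b1) // len(self.b2), 1), 0)
            self._replace(x)
            del self.b2[x]; self.t2[x] = None
            return False
        # complete miss
        if len(self.t1) + len(self.b1) == self.c:      # |L1| = c
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(x)
            else:                                      # B1 empty, T1 full
                self.t1.popitem(last=False)
        else:                                          # |L1| < c
            total = len(self.t1) + len(self.t2) + len(self.b1) + len(self.b2)
            if total >= self.c:
                if total == 2 * self.c:
                    self.b2.popitem(last=False)
                self._replace(x)
        self.t1[x] = None                # insert at MRU of T1
        return False
```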
Scan-Resistant
A long sequence of 1-time-only reads will pass through L1 without flushing L2.
Such scans produce fewer hits in B1 than in B2.
Hits in B2 decrease p, so T2 grows by learning and frequently used pages survive the scan.
ARC example
(Slide animation: a reference string stepped through T1, B1, T2, and B2, with the adaptation parameter moving p = 2 → p = 3 → p = 2 as ghost hits occur.)
Experiment
Various traces used in this paper.
Experiment: Hit ratio of OLTP
ARC outperforms algorithms with online parameters, and performs as well as algorithms with offline (pre-tuned) parameters.
Experiment: Hit ratio (cont.)
MQ outperforms LRU, while ARC outperforms all.
Experiment: Hit ratio (cont.)
Comparison across various workloads shows that ARC outperforms the alternatives.
Experiment: Cache size of P1
ARC performs better than LRU at every cache size.
Experiment: parameter p
“Dances to the tune being played by workload.”
Summary
ARC:
Is online and self-tuning.
Is robust across a wide range of workloads.
Has low overhead.
Is scan-resistant.
Outperforms LRU.