Cache Organization for Memory Speculation
Slide 1
Sakai Laboratory, Department of Electronic Engineering, Faculty of Engineering, The University of Tokyo
20422 豊島 隆志
Slide 2
Outline
• Introduction
  – Background ~Speculative Multi-threading~
  – Our Baseline Model – NEKO –
• Cache Coherency Protocols
  – Overview
  – Cache Directories
  – Events
  – Conditions
  – State Diagrams
• Evaluation
  – Environment
  – Results
• Conclusions
  – Conclusions
  – Future Works
Slide 3
Introduction
Background ~Speculative Multi-threading~
[Figure: a single-threaded application is turned into parallel execution]
Dependencies (which interrupt parallel execution)
• Control dependencies
• Data dependencies
  – Register-level dependencies
  – Memory-level dependencies: addressed by Memory Speculation
Objectives
• Propose cache coherency protocols that support Memory Speculation on Speculative Multi-threading
• Compare these protocols (invalidate-based vs. update-based, etc.)
Slide 4
Our Baseline Model – NEKO –
[Figure: NEKO block diagram. Components shown: PU 0 ... PU 3, each with a Superscalar Core, a Register Sync. Unit, an ICache, a DCache, and a Controller and Speculation support block; a Secondary Cache; a Thread Control Unit; a Thread Prediction Unit; a Thread Validation and Retire Unit.]
Slide 6
Cache Coherency Protocols
Overview
Two major types of cache coherency protocols

                                   Invalidate-based   Update-based
Cache miss in multi-tasking        × Bad              ○ Good
Traffic (especially on data bus)   ○ Little           × Heavy
Design complexity                  ○ Simple           × Complex
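The trade-off in the table above can be illustrated with a toy model. The sketch below is not the MONE or NENEKO protocol: it assumes two caches sharing a single word, where one processor writes the word several times and the other reads it once afterwards. The function name `simulate`, its parameters, and the counter names are hypothetical.

```python
def simulate(protocol: str, rounds: int, writes_per_read: int) -> dict:
    """Toy model: P0 writes one shared word `writes_per_read` times, then P1 reads it.

    Both caches start with a valid copy. Illustrative assumption only.
    """
    if protocol not in ("invalidate", "update"):
        raise ValueError(f"unknown protocol: {protocol}")
    p1_valid = True  # does P1 currently hold a valid copy?
    stats = {"p1_misses": 0, "data_bus_transfers": 0, "address_only_messages": 0}
    for _ in range(rounds):
        for _ in range(writes_per_read):
            if protocol == "invalidate":
                if p1_valid:
                    # The first write invalidates P1's copy; no data is sent.
                    stats["address_only_messages"] += 1
                    p1_valid = False
            else:
                # Every write broadcasts the new value on the data bus.
                stats["data_bus_transfers"] += 1
        if not p1_valid:
            # P1 misses and refills the line over the data bus.
            stats["p1_misses"] += 1
            stats["data_bus_transfers"] += 1
            p1_valid = True
    return stats


inv = simulate("invalidate", rounds=10, writes_per_read=4)
upd = simulate("update", rounds=10, writes_per_read=4)
print(inv)
print(upd)
```

In this scenario the invalidate-based variant misses on every read but moves little data, while the update-based variant never misses but puts every write on the data bus.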
Slide 7
Cache Directories
Invalidate-based Protocol -MONE-
The directory contains conditions in addition to the common information. They are used for:
• Violation Detection
• Version Management

Line = Tag | State | Conditions | Word 7 … Word 0
  State                   2bit
  Conditions (per line)   Obsolete 1bit, Speculative 1bit
  Word n (n = 7 … 0)      Data 64bit; Conditions: Loaded 1bit, Stored 1bit
Slide 8
Cache Directories
Update-based Protocol -NENEKO-
The directory contains conditions in addition to the common information. They are used for:
• Violation Detection
• Version Management

Line = Tag | State/Conditions | Word 7 … Word 0
  State/Conditions (per line)   Invalid 1bit, Shared 1bit, Modified 1bit, Obsolete 1bit
  Word n (n = 7 … 0)            Data 64bit; Conditions: Store 4bit, Loaded 1bit
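The bit widths of the two directory layouts can be added up to compare their storage overhead per cache line. The sketch below only does that arithmetic, using the field widths listed for MONE and NENEKO. The tag is left out because its width is not given, and the names `metadata_bits` and `overhead_ratio` are hypothetical helpers.

```python
WORDS_PER_LINE = 8        # Word 7 ... Word 0
DATA_BITS_PER_WORD = 64

# Field widths in bits, as listed for each directory (tag excluded: width not given).
MONE = {
    "line": {"State": 2, "Obsolete": 1, "Speculative": 1},
    "word": {"Loaded": 1, "Stored": 1},
}
NENEKO = {
    "line": {"Invalid": 1, "Shared": 1, "Modified": 1, "Obsolete": 1},
    "word": {"Store": 4, "Loaded": 1},
}


def metadata_bits(layout: dict) -> int:
    """State and condition bits per cache line, excluding tag and data."""
    per_line = sum(layout["line"].values())
    per_word = sum(layout["word"].values())
    return per_line + WORDS_PER_LINE * per_word


def overhead_ratio(layout: dict) -> float:
    """Metadata bits divided by data bits in one line."""
    return metadata_bits(layout) / (WORDS_PER_LINE * DATA_BITS_PER_WORD)


for name, layout in (("MONE", MONE), ("NENEKO", NENEKO)):
    print(name, metadata_bits(layout), f"{overhead_ratio(layout):.2%}")
```

With these widths MONE needs 20 bits and NENEKO 44 bits of state and conditions per 512-bit line.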
Slide 9
Events

Event       Meaning
PrRd        Load operation from Processor
PrWr        Store operation from Processor
BusRd       Load operation via Bus
BusInv      Invalidate operation via Bus
BusUpd      Update (Store) operation via Bus

Thread-control Events (unique to Speculative Multi-threading)
PrSquash    Processor is squashed
PrCommit    Processor is committed
PrvCommit   The immediately preceding processor is committed
BusCommit   Other processors are committed

Legend: Invalidate-based only / Update-based only
Slide 10
Conditions

Kind      Condition     Meaning
bus       shared        shared with other processors
bus       forwarded     forwarded from another processor
dynamic   delayed       sent from successor processors
dynamic   masked        masked by prior events
static    loaded        loaded by the processor
static    stored        modified by the processor
static    obsolete      becomes invalid when the next thread is assigned
static    speculative   has been forwarded

Legend: Invalidate-based only / Update-based only
Slide 11
State Diagrams
[Figure: two state diagrams; each transition is labeled Input/ Output]

Invalidate-based Protocol -MONE-
  States: Clean, Shared, Modified, Invalid
  Transition labels: PrRd/BusRd; PrWr/BusRd; PrWr/BusRd,BusInv; PrWr/-; BusRd/-; BusInv/-; PrSquash/-; PrCommit/-; PrCommit/BusFlush

Update-based Protocol -NENEKO-
  States: Clean, SharedClean, SharedModified, Modified, Invalid
  Transition labels: PrRd/BusRd; PrRd/BusRd,BusUpd; PrWr/-; PrWr/BusUpd; BusRd/Forward; BusUpd/-; PrSquash/-; PrCommit/BusCommit; PrCommit/BusCommit,BusFlush
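The Input/ Output labels read as "on this input, emit these bus outputs and move to the next state". A table-driven sketch of that notation follows. Only the state names and the label texts come from the slide; the source and destination states attached to each label are assumptions made for illustration, because the arrows of the diagram are not recoverable from this transcript. This is therefore not the MONE state diagram itself, and `TRANSITIONS`, `step`, and `run` are hypothetical names.

```python
from enum import Enum


class State(Enum):
    CLEAN = "Clean"
    SHARED = "Shared"
    MODIFIED = "Modified"
    INVALID = "Invalid"


# (current state, input) -> (next state, outputs).
# ASSUMPTION: the state pairs below are illustrative guesses, not the slide's arrows.
TRANSITIONS = {
    (State.INVALID, "PrRd"): (State.CLEAN, ("BusRd",)),
    (State.INVALID, "PrWr"): (State.MODIFIED, ("BusRd", "BusInv")),
    (State.CLEAN, "BusRd"): (State.SHARED, ()),
    (State.SHARED, "BusInv"): (State.INVALID, ()),
    (State.MODIFIED, "PrWr"): (State.MODIFIED, ()),
    (State.MODIFIED, "PrSquash"): (State.INVALID, ()),
    (State.MODIFIED, "PrCommit"): (State.CLEAN, ("BusFlush",)),
}


def step(state: State, event: str):
    """Apply one input; unlisted (state, input) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, ()))


def run(events, state: State = State.INVALID):
    """Feed a sequence of inputs and collect every bus output in order."""
    outputs = []
    for event in events:
        state, out = step(state, event)
        outputs.extend(out)
    return state, outputs


print(run(["PrWr", "PrWr", "PrCommit"]))
print(run(["PrWr", "PrSquash"]))
```

Keeping the protocol in a table makes it easy to swap in the real arrows once they are known, and to build a second table for the update-based protocol.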
Slide 13
Evaluations
Environment
Simulator
• CPON2: MONE/NENEKO Protocol Simulator
  – Processors' access patterns are obtained from SPECint95 trace data produced on the NEKO Simulator
Environment
• Processors: 4
• Cache size: 256 kBytes
• Line size: 64 Bytes (= 8 Words)
• Associativity: 2
• Protocols: MONE/NENEKO
  – with per-Word (W) / per-Line (L) basis violation detection
  – MONE/W, MONE/L, NENEKO/W, NENEKO/L
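The cache parameters above determine the number of lines and sets. The sketch below works that arithmetic out, assuming 1 kByte = 1024 bytes, power-of-two sizes, and 8-byte words (64 bytes / 8 words). Whether 256 kBytes is the size of each processor's cache is not stated in the slide. `cache_geometry` is a hypothetical helper.

```python
def cache_geometry(cache_bytes: int, line_bytes: int, ways: int, word_bytes: int = 8) -> dict:
    """Derive line count, set count, and address field widths for a set-associative cache."""
    lines = cache_bytes // line_bytes
    sets = lines // ways
    return {
        "lines": lines,
        "sets": sets,
        "words_per_line": line_bytes // word_bytes,
        # For power-of-two sizes, (n - 1).bit_length() equals log2(n).
        "offset_bits": (line_bytes - 1).bit_length(),
        "index_bits": (sets - 1).bit_length(),
    }


geometry = cache_geometry(cache_bytes=256 * 1024, line_bytes=64, ways=2)
print(geometry)
```

Under these assumptions the configuration has 4096 lines in 2048 sets, with 6 offset bits and 11 index bits.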
Slide 14
Results
1. Block time ratio
[Figure: two bar charts, one for Invalidate-based Protocol -MONE- and one for Update-based Protocol -NENEKO-. x-axis: associativity (1, 2, 4, 8); y-axis: block time ratio (scales up to 0.001 and 0.0008). Series: the worst case, the average, and the best case of SPECint95 (min/max). Annotations on the charts: "zero", "zero", "2.4×10^-4".]
Slide 15
Results
2. Violation Frequency
[Figure: bar chart of violation frequency for MONE/W, MONE/L, NENEKO/W, NENEKO/L. y-axis: 0 to 0.0014. Series: the worst case, the average, and the best case of SPECint95 (min/max). The bars are marked ○ or ×.]
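The difference between per-word (/W) and per-line (/L) violation detection can be sketched with a small trace checker. This is a simplified illustration and not the MONE or NENEKO mechanism: it assumes a violation is counted whenever a thread stores to a granule that a later, more speculative thread has already loaded, and it uses the 8-byte words and 8-word lines of the evaluation environment. `granule` and `count_violations` are hypothetical names.

```python
WORD_BYTES = 8
WORDS_PER_LINE = 8


def granule(addr: int, per_line: bool) -> int:
    """Map a byte address to the unit tracked by the detector (word or line)."""
    word = addr // WORD_BYTES
    return word // WORDS_PER_LINE if per_line else word


def count_violations(accesses, per_line: bool) -> int:
    """Count violations in a trace of (thread_id, op, addr) tuples in execution order.

    A lower thread_id is earlier in program order. ASSUMPTION: a store violates
    every later thread that has already loaded the same granule.
    """
    loaded = {}  # granule -> ids of threads that have loaded it
    violations = 0
    for tid, op, addr in accesses:
        key = granule(addr, per_line)
        if op == "load":
            loaded.setdefault(key, set()).add(tid)
        elif op == "store":
            readers = loaded.get(key, set())
            squashed = {t for t in readers if t > tid}
            violations += len(squashed)
            readers -= squashed  # loads of squashed threads are discarded
    return violations


# Thread 1 loads word 33; thread 0 then stores to word 32 of the same line.
trace = [(1, "load", 0x108), (0, "store", 0x100)]
print(count_violations(trace, per_line=False))
print(count_violations(trace, per_line=True))
```

In this trace the per-word detector reports nothing, while the per-line detector reports a violation even though the two threads touched different words.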
Slide 16
Results
3. Cache miss ratio
[Figure, left: bar chart of cache miss ratio (0 to 0.0025) for MONE/W, MONE/L, NENEKO/W, NENEKO/L, broken down by miss reason: Capacity or Conflict, Thread Control, Invalidation.]
[Figure, right: breakdown of invalidations: ignored 73.78%, delayed 12.89%, effective 12.15%, masked 1.19%.]
Only 12.15% of invalidations are effective and appear as a cache miss reason in the left graph.
Slide 17
Results
4. Bus event ratio
[Figure: two bar charts of events per cycle for MONE/W, MONE/L, NENEKO/W, NENEKO/L. Address Bus (0 to 0.035): invalidate, update, flush, load. Data Bus (0 to 0.03): update, flush, load.]
Slide 19
Conclusions
• Proposal of cache coherency protocols supporting Memory Speculation
• Evaluation of these protocols

             MONE/W   MONE/L   NENEKO/W   NENEKO/L
block        ○        ○        ○          ○
violation    ○        ×        ○          ×
cache miss   △        △        △          △
bus event    -        -        -          -

The result is different from what was expected.
Slide 20
Future Works
• Mechanisms for avoiding the effects of thread control
• Comparison between these protocols and other approaches to Memory Speculation