Introduction Tagless Hit Instruction Cache (TH-IC) Experimental Evaluation Summary

Guaranteeing Hits to Improve the Efficiency of a Small Instruction Cache

Stephen Hines, David Whalley, and Gary Tyson

Department of Computer Science, Florida State University

December 5, 2007

L0/Filter Instruction Caches

Programs spend a great deal of time executing small loops

Small, direct-mapped → Better energy/access

Low hit rate → Worse execution times (4–8%)

Power/Performance tradeoff in embedded systems

Prediction can reduce some of this cycle penalty ...

But guarantees will give us the best of both worlds

Instruction Fetch

Types of Instruction Fetch

Sequential fetches

Non-sequential (branching) fetches: direct branches and indirect branches

Intra-line Sequential Fetch

Traditional Line Buffers

Perform tag comparison early→ disable L1-IC on hit

May lengthen cycle time for fetch

Tagless Hit Line Buffer (TH-LB)

Keep track of previous cycle’s branch status/prediction

If not-taken and intra-line (not last inst), access TH-LB; otherwise fetch from the L1-IC and write into TH-LB

Eliminates the need for TH-LB tag comparison or storage

Hit and miss behavior is identical to a traditional LB, and cycle time is the same as with just the L1-IC
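The TH-LB rule above can be sketched as a single decision made from the previous cycle's state. This is a minimal illustration, not the authors' hardware: the function name, its parameters, and the 4-instruction line used in the usage note are assumptions introduced here.

```python
def fetch_source(prev_taken: bool, prev_offset: int, line_insts: int) -> str:
    """Pick where the current fetch comes from, using only last cycle's state.

    prev_taken  -- previous instruction was a taken (or predicted-taken) branch
    prev_offset -- position of the previous instruction within its line
    line_insts  -- instructions per line (hypothetical parameter)
    """
    last_in_line = prev_offset == line_insts - 1
    if not prev_taken and not last_in_line:
        # Next instruction sits in the same line: guaranteed hit, no tag needed.
        return "TH-LB"
    # Otherwise fetch from the L1-IC; the fetched line is written into the TH-LB.
    return "L1-IC"
```

With 4 instructions per line, `fetch_source(False, 1, 4)` selects the TH-LB, while a taken branch or a fetch that follows the last instruction of a line falls back to the L1-IC.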

Guaranteeing Tagless Hits

A TH-LB makes guarantees about intra-line sequential fetch behavior

Extend this principle to handle other regular fetches

Tagless Hit Instruction Cache (TH-IC)

Avoid tag comparisons for most accesses

Recognize/capture regularity inherent to fetch

Employ more useful metadata to handle links between instructions in the cache

Tagless Hits

Fetching Sequential Lines

Next Sequential bit (NS) associated with each TH-IC line

Sequential fetches that cross a line boundary examine NS to determine the next cycle’s fetch

NS is not set – fetch from L1-IC and set the NS bit

NS is set – guaranteed hit in TH-IC (no L1-IC access, no tag check)

When a line is replaced, we clear its NS bit as well as the previous line’s NS bit

Essentially we now have a multiple line buffer
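The NS rules can be sketched as follows, with the replacement step included, since that is what keeps the guarantee sound. This is a simplified model under stated assumptions: the class name `NSLines` and its methods are hypothetical, the cache is direct-mapped with at least two lines, and the line being left is assumed to be resident.

```python
class NSLines:
    """Direct-mapped lines with one Next Sequential (NS) bit each."""

    def __init__(self, num_lines):
        self.resident = [None] * num_lines   # line address held by each slot
        self.ns = [False] * num_lines        # NS bit per slot

    def replace(self, line_addr):
        """Install a line and clear the links that may now be stale."""
        n = len(self.ns)
        slot = line_addr % n
        self.resident[slot] = line_addr
        self.ns[slot] = False                # new line: successor not yet verified
        self.ns[(slot - 1) % n] = False      # previous line's link pointed at the old contents

    def cross_line(self, from_line):
        """Sequential fetch leaving from_line. True means a guaranteed hit."""
        n = len(self.ns)
        slot = from_line % n
        if self.ns[slot]:
            return True                      # no L1-IC access, no tag check
        nxt = from_line + 1
        if self.resident[nxt % n] != nxt:
            self.replace(nxt)                # true miss: fill from the L1-IC
        self.ns[slot] = True                 # the link is now verified
        return False
```

The first crossing from a line pays for an L1-IC access and sets NS; later crossings are guaranteed hits until a replacement clears the bit again.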

Tagless Hits

Handling Direct Branches

Next Target bit (NT) associated with each TH-IC instruction

Direct branches that are predicted taken examine NT to determine next cycle’s fetch

NT is not set – fetch from L1-IC and set the NT bit

NT is set – guaranteed hit in TH-IC (no L1-IC access, no tag check)

On line replacement, we will need to invalidate some of the NT bits contained in the TH-IC

Tagless Hits

TH-IC Guaranteed Hit and False Miss Rates

Tagless Hits

TH-IC Line and State Information

Example

Fetching a Loop

[Figure: instructions inst 1 through inst 8 laid out across TH-IC Line 1 and Line 2, with per-instruction NT bits and per-line NS bits]

Fetched      Result       Metadata Set          L1/ITLB?
inst 1       miss         set line 0 NS bit     X
inst 5       miss         set inst 1 NT bit     X
insts 6,7    hits
inst 2       false miss   set inst 7 NT bit     X
insts 3,4    hits
inst 5       false miss   set line 1 NS bit     X
insts 6,7    hits
inst 2       hit
insts 3,4    hits
inst 5       hit
insts 6,7    hits
inst 8       hit
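The trace above can be reproduced with a small simulation that combines the NS and NT rules. It is a sketch under assumptions that the slides do not state outright: inst 1 through inst 4 share one line and inst 5 through inst 8 the next (inferred from the "set line 0 NS bit" and "set line 1 NS bit" entries), every non-sequential fetch is a taken direct branch, and on a true miss all NT bits are cleared, which is the most conservative invalidation choice. The function name and the address mapping are hypothetical.

```python
LINE_INSTS = 4   # instructions per line (the 16x4 configuration)
NUM_LINES = 16   # direct-mapped TH-IC lines

def simulate(fetches, entry_prev):
    """Classify each fetched instruction address as 'hit', 'false miss' or 'miss'."""
    resident = [None] * NUM_LINES                          # line held by each slot
    ns = [False] * NUM_LINES                               # one NS bit per line
    nt = [[False] * LINE_INSTS for _ in range(NUM_LINES)]  # one NT bit per instruction
    # Assumption: the line holding the instruction before the trace is resident.
    resident[(entry_prev // LINE_INSTS) % NUM_LINES] = entry_prev // LINE_INSTS

    trace, prev = [], entry_prev
    for addr in fetches:
        p_line, p_off = divmod(prev, LINE_INSTS)
        line = addr // LINE_INSTS
        p_slot, slot = p_line % NUM_LINES, line % NUM_LINES
        sequential = addr == prev + 1      # anything else: taken direct branch

        if sequential and line == p_line:
            guaranteed = True              # intra-line sequential fetch
        elif sequential:
            guaranteed = ns[p_slot]        # crossing a line boundary
        else:
            guaranteed = nt[p_slot][p_off]

        if guaranteed:
            result = "hit"
        else:
            if resident[slot] == line:
                result = "false miss"      # line was present, link was not set
            else:
                result = "miss"
                resident[slot] = line
                ns[slot] = False
                ns[(slot - 1) % NUM_LINES] = False
                for bits in nt:            # conservative: drop every NT link
                    bits[:] = [False] * LINE_INSTS
            if resident[p_slot] == p_line: # only link from a line that is still there
                if sequential:
                    ns[p_slot] = True
                else:
                    nt[p_slot][p_off] = True
        trace.append((addr, result))
        prev = addr
    return trace
```

Mapping inst k to address k + 3 places inst 1 at the start of line 1. Calling `simulate` on the fetch order 1, 5, 6, 7, 2, 3, 4, 5, 6, 7, 2, 3, 4, 5, 6, 7, 8 with `entry_prev=3` gives the same sequence of misses, false misses, and hits as the table: only the four rows marked X need the L1-IC and ITLB.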

Metadata Management

Tag Comparisons in TH-IC

Guaranteed Hits

Tag comparison is completely unnecessary

ITLB access can also be skipped

Potential Misses

Access L1-IC and TH-IC

Determine if line is actually in TH-IC (false miss)

TH-IC contents are always present in the L1-IC (inclusion), so we only need to check whether the TH-IC line points to the L1-IC line to verify a false miss
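One way to picture the inclusion check is to store, per TH-IC line, a short ID that locates the line inside the L1-IC instead of a full address tag. The sketch below is an assumption made for illustration, not the encoding from the slides: it supposes equal line sizes, power-of-two geometries, and a TH-IC indexed by the low-order bits of the L1-IC set index. All function names and the cache sizes in the usage note are hypothetical.

```python
def id_bits(l1_sets, l1_ways, th_lines):
    """Bits needed to name an L1-IC line from within a given TH-IC slot."""
    log2 = lambda n: n.bit_length() - 1      # exact for powers of two
    # High-order set-index bits not implied by the TH-IC slot, plus the way.
    return log2(l1_sets) - log2(th_lines) + log2(l1_ways)

def line_id(l1_set, l1_way, th_lines):
    """ID stored with a TH-IC line: where its copy lives in the L1-IC."""
    return (l1_set // th_lines, l1_way)

def is_false_miss(stored_id, l1_set, l1_way, th_lines):
    """The L1-IC access hit at (l1_set, l1_way); was that line already in the TH-IC?"""
    return stored_id == line_id(l1_set, l1_way, th_lines)
```

For a hypothetical 256-set, 4-way L1-IC and a 16-line TH-IC, `id_bits(256, 4, 16)` is 6, far fewer bits than a full address tag would need.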

Metadata Management

Reducing TH-IC Tag Size

Metadata Management

Metadata Invalidation

TH-IC line replacements can corrupt NT and NS links

NS: Clear previous line’s NS (easy)

NT: Clear any NT that points to replaced line (harder)

Too much metadata – inefficient energy usage due to extra tracking bits

Too little metadata – overly aggressive invalidation kicks out links that are still valid (i.e. point to other lines)
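The tradeoff can be illustrated by comparing the two extremes. The sketch is illustrative only and does not reproduce the specific policies evaluated in the talk: NT links are modeled as a dictionary from a hypothetical (source line, source offset) pair to the target line, which is more bookkeeping than real hardware would keep.

```python
def invalidate_all(links, replaced_line):
    """No tracking metadata: every NT link is dropped on any replacement."""
    return {}

def invalidate_precise(links, replaced_line):
    """Full target tracking: drop only links that leave or enter the replaced line."""
    return {
        src: tgt
        for src, tgt in links.items()
        if src[0] != replaced_line and tgt != replaced_line
    }
```

With links `{(1, 0): 2, (2, 2): 1, (3, 1): 4}`, replacing line 2 under the precise policy keeps `(3, 1): 4`, whereas the cheap policy discards it even though it is still valid; that link then has to be re-established through a false miss.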

Metadata Management

Invalidation Policies

Configuration

SimpleScalar PISA with Wattch extensions for power/energy modeling

StrongARM-derived configuration (in-order, 1-issue, . . . )

L0-IC (128B – 512B)

TH-IC (128B – 512B) x (TN, TT, TL, TI) + TH-LB

Slides only show 256B (16x4) configurations

VPO optimized MiBench benchmarks

Benchmarks run to completion

Total Processor Energy

Fetch Efficiency Across Different Application Domains

Fetch Statistics

                        MiBench Average        176.gcc (SPECInt2k)
                        L0_16x4    TL_16x4     L0_16x4    TL_16x4
Execution Cycles        106.05%    100.00%     104.10%    100.00%
Total Energy             75.17%     68.58%      83.81%     79.03%
Small Cache Hit Rate     87.63%     84.96%      77.86%     73.57%
Fetch Power              43.81%     35.47%      63.72%     56.07%
Energy-Delay Squared     84.57%     68.58%      90.82%     79.03%

TH-IC is beneficial even for applications with more diverse instruction and data access behavior like 176.gcc
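The Energy-Delay Squared row can be sanity-checked from the two rows above it, assuming the usual definition (energy multiplied by the square of delay) and assuming that Execution Cycles and Total Energy are normalized to the same baseline. The helper name is hypothetical.

```python
def energy_delay_squared(energy, delay):
    """Normalized ED^2 from normalized energy and normalized execution cycles."""
    return energy * delay ** 2

# (Total Energy, Execution Cycles) as fractions of the baseline, from the table.
gcc_l0 = energy_delay_squared(0.8381, 1.0410)       # table reports 90.82%
gcc_tl = energy_delay_squared(0.7903, 1.0000)       # table reports 79.03%
mibench_l0 = energy_delay_squared(0.7517, 1.0605)   # table reports 84.57%
```

The 176.gcc columns reproduce the reported values. The MiBench L0 figure comes out a few hundredths of a percentage point lower than the table, which is expected if the reported number averages per-benchmark ED² values rather than combining the averaged energy and cycle counts. Because TH-IC adds no cycles, its ED² equals its total energy.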

Related Work

L0/Filter Cache Prediction

Cannot eliminate performance penalty completely

Still using full size tags along with prediction metadata and there are still tag checks for hits

Way Memoization

Utilizes NT/NS concept to avoid 64-way tag comparisons in L1-IC for many fetches

Large amount of metadata required and expensive invalidation mitigates much of the energy consumption benefit

Conclusions

TH-IC simultaneously eliminates performance penalty of small caches, and further reduces fetch energy

Eliminates majority of tag checks and ITLB accesses

Reduces size of tag/ID check on miss due to inclusion

Make guarantees, not predictions, for regularly behaved pipeline features like instruction fetch

Suitable for high-performance computing due to energy reductions and lack of performance penalty

Ease of integration with nearly any processor design

Questions???

Backup Slides Graphs

Execution Time

Average Fetch Power

Energy-Delay²