
Page 1: Kyushu University ESA’07 @ Las Vegas, June 2007 The Effect of Nanometer-Scale Technologies on the Cache Size Selection for Low Energy Embedded Systems

Kyushu University ESA’07 @ Las Vegas, June 2007

The Effect of Nanometer-Scale Technologies on the Cache Size Selection for Low Energy Embedded Systems

Hamid Noori, Maziar Goudarzi, Koji Inoue, and Kazuaki Murakami

Kyushu University

Page 2: Outline

Motivations and Observations

Energy Evaluation

Problem Definition

Experimental Results

Conclusion

Page 3: Outline

Motivations and Observations

Problem Formulation

Energy Evaluation Model

Experimental Results

Conclusion

Page 4: Motivations and Observations (1/2)

Caches contribute a large portion of energy consumption in embedded systems

Leakage power is increasing in new nanometer-scale technologies

Page 5: Motivations and Observations (2/2)

[Figure: dynamic energy (nJ) versus technology (180nm, 100nm, 70nm) for cache sizes 32K, 16K, 8K, 4K, 2K, 1K]

[Figure: leakage power (mW) versus technology (180nm, 100nm, 70nm) for cache sizes 32K, 16K, 8K, 4K, 2K, 1K]

4-way set-associative cache with 16-byte block size

Dynamic energy: 180nm is about 4x that of 100nm and 9x that of 70nm (CACTI 4.1)

Static (leakage) power: 70nm is about 400x that of 180nm and 5x that of 100nm (CACTI 4.1)

Page 6: Goal

The effect of different nanometer-scale technologies on cache configuration selection in low-energy embedded systems

Page 7: Outline

Energy Evaluation

Page 8: Energy Evaluation (1/3)

Static

Dynamic

energy_memory(Config, Tech) = energy_dynamic(Config, Tech) + energy_static(Config, Tech)

Page 9: Energy Evaluation (2/3)

energy_dynamic(Config, Tech) = cache_accesses(Config) * energy_cache_access(Config, Tech) + cache_misses(Config) * energy_miss(Config, Tech)

energy_miss(Config, Tech) = energy_off_chip_access + energy_cache_block_refill(Config, Tech)

energy_static(Config, Tech) = executed_clock_cycles(Config) * clock_period * leakage_power(Config, Tech)

Page 10: Energy Evaluation (3/3)

SimpleScalar
– cache_accesses
– cache_misses
– executed_clock_cycles

CACTI 4.1
– energy_cache_access
– energy_cache_block_refill
– leakage_power

energy_off_chip_access = 20 nJ

Clock freq = 200MHz
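The energy model on the preceding slides can be sketched in a few lines of Python. This is only an illustration: the `CacheProfile` container and the function names are hypothetical, the per-configuration numbers would come from SimpleScalar and CACTI 4.1 as listed above, and the units (nJ for energies, mW for leakage power) are an assumption made here for the sake of a runnable example. Only the 20 nJ off-chip access energy and the 200 MHz clock are taken from the slide.

```python
from dataclasses import dataclass

ENERGY_OFF_CHIP_ACCESS_NJ = 20.0  # from the slide: 20 nJ per off-chip access
CLOCK_FREQ_HZ = 200e6             # from the slide: 200 MHz


@dataclass
class CacheProfile:
    """Inputs for one (Config, Tech) pair; a hypothetical container."""
    cache_accesses: int                  # simulator output
    cache_misses: int                    # simulator output
    executed_clock_cycles: int           # simulator output
    energy_cache_access_nj: float        # cache model output (assumed nJ)
    energy_cache_block_refill_nj: float  # cache model output (assumed nJ)
    leakage_power_mw: float              # cache model output (assumed mW)


def energy_miss_nj(p: CacheProfile) -> float:
    # energy_miss = energy_off_chip_access + energy_cache_block_refill
    return ENERGY_OFF_CHIP_ACCESS_NJ + p.energy_cache_block_refill_nj


def energy_dynamic_nj(p: CacheProfile) -> float:
    # accesses * per-access energy + misses * per-miss energy
    return (p.cache_accesses * p.energy_cache_access_nj
            + p.cache_misses * energy_miss_nj(p))


def energy_static_nj(p: CacheProfile) -> float:
    # cycles * clock period * leakage power; mW * s = mJ, and 1 mJ = 1e6 nJ
    clock_period_s = 1.0 / CLOCK_FREQ_HZ
    return p.executed_clock_cycles * clock_period_s * p.leakage_power_mw * 1e6


def energy_memory_nj(p: CacheProfile) -> float:
    # total = dynamic + static
    return energy_dynamic_nj(p) + energy_static_nj(p)
```

With made-up inputs such as `CacheProfile(1000, 10, 2000, 0.1, 1.0, 50.0)`, the dynamic part is 1000 * 0.1 + 10 * 21 = 310 nJ and the static part is 2000 cycles * 5 ns * 50 mW = 500 nJ.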

Page 11: Outline

Problem Definition

Page 12: Problem Definition

“For a given application, processor architecture, technology, and instruction- and data-cache organization (i.e. the cache associativity and line size), find the cache size that results in minimum energy consumption (i.e. minimizes Equation 1 for a given technology) over the entire application run.”
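Read as a search problem, the definition above amounts to picking the candidate size with the lowest total energy once everything else is fixed. A minimal sketch, assuming the total energy of each candidate size has already been computed: the function name `min_energy_cache_size`, the tie-breaking rule (prefer the smaller cache), and the energy values are all illustrative and not taken from the slides.

```python
def min_energy_cache_size(energy_by_size):
    """Return the cache size (in KB) with the lowest total energy.

    energy_by_size maps a cache size in KB to the total energy of one
    application run, for a fixed technology, associativity and line size.
    Ties are broken in favour of the smaller cache (an assumption).
    """
    if not energy_by_size:
        raise ValueError("no candidate cache sizes given")
    return min(energy_by_size, key=lambda size: (energy_by_size[size], size))


# Made-up total energies (mJ), only to exercise the function.
example_energy_mj = {64: 900.0, 32: 700.0, 16: 650.0, 8: 800.0,
                     4: 1200.0, 2: 2000.0, 1: 3500.0}
best_size_kb = min_energy_cache_size(example_energy_mj)
print(best_size_kb)
```

Because only seven sizes (64K down to 1K) are considered, an exhaustive comparison like this is sufficient; no heuristic search is needed.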

Page 13: Outline

Experimental Results

Page 14: Experimental Results

Applications from MiBench

SimpleScalar

CACTI 4.1
– Three technologies: 180nm, 100nm, and 70nm

Page 15: Instruction Cache

[Figure: clock cycles (M) versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K)]

Page 16: Energy Evaluation for three different technologies - qsort

[Figure: three plots of total energy (mJ), split into static and dynamic components, versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K), one each for 180nm, 100nm, and 70nm]

Page 17: Energy Saving

There are two different minimum-energy cache sizes: 64K (180nm) and 16K (100nm and 70nm).

Total energy is reduced by 38% and 55% in the 100nm and 70nm processes, respectively, when selecting a 16KB instruction cache instead of 64KB.

In this application (qsort), this saving comes at a performance penalty of 37%.

We also note that energy is reduced by 50% in the 180nm process when employing a 64KB cache instead of 16KB; i.e., a bigger cache used to result in less energy. But as shown above, this trend is reversed in nanometer technologies.

Page 18: Other Applications (instruction cache)

Saving: energy saving (%). Penalty: performance penalty (%).

Application  Cache size             100nm               70nm
             180nm  100nm  70nm     Saving   Penalty    Saving   Penalty
basicmath    32K    32K    32K      0.0      0.0        0.0      0.0
bitcounts    2K     2K     2K       0.0      0.0        0.0      0.0
cjpeg        16K    16K    4K       0.0      0.0        3.38     123.88
djpeg        16K    16K    4K       0.0      0.0        28.12    79.27
lame         32K    8K     8K       30.02    36.39      55.54    36.39
dijkstra     16K    16K    1K       0.0      0.0        14.41    211.07
patricia     32K    32K    32K      0.0      0.0        0.0      0.0
blowfish     32K    32K    8K       0.0      0.0        40.70    80.40
rijndael     32K    32K    16K      0.0      0.0        8.62     61.02
average                             3.33     4.04       16.75    65.78
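The "average" row appears to be a plain arithmetic mean over the nine listed benchmarks, including those with zero saving. The short check below reproduces the 70nm averages from the per-benchmark values in the table; the variable names are illustrative.

```python
from statistics import mean

# (energy saving %, performance penalty %) at 70nm, copied from the table.
instr_cache_70nm = {
    "basicmath": (0.0, 0.0),
    "bitcounts": (0.0, 0.0),
    "cjpeg": (3.38, 123.88),
    "djpeg": (28.12, 79.27),
    "lame": (55.54, 36.39),
    "dijkstra": (14.41, 211.07),
    "patricia": (0.0, 0.0),
    "blowfish": (40.70, 80.40),
    "rijndael": (8.62, 61.02),
}

# Unweighted mean over all nine benchmarks, rounded as in the table.
avg_saving = round(mean(s for s, _ in instr_cache_70nm.values()), 2)
avg_penalty = round(mean(p for _, p in instr_cache_70nm.values()), 2)
print(avg_saving, avg_penalty)  # 16.75 65.78
```

Note that the averages are pulled down by the benchmarks whose minimum-energy size does not change, so the typical saving for an application that does change size is considerably higher than the average suggests.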

Page 19: Data Cache

[Figure: clock cycles (K) versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K); vertical axis from 750000 to 1000000]

Page 20: Energy Evaluation for three different technologies - qsort

[Figure: three plots of total energy (mJ), split into static and dynamic components, versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K), one each for 180nm, 100nm, and 70nm]

Page 21: Energy Saving

According to the results, 32K, 2K, and 1K are the minimum-energy data cache sizes for 180nm, 100nm, and 70nm, respectively.

The minimum-energy caches for the 100nm (2KB) and 70nm (1KB) technologies consume 88% and 56% less energy, respectively, than the minimum-energy cache of the 180nm process (i.e. 32KB).

The corresponding performance penalties are only 9% and 14%, respectively.

In 180nm technology, the optimal cache size (32KB) consumes 28% and 40% less energy than the 2KB and 1KB caches, but this relation is reversed, with increasing significance, in the 100nm and 70nm technologies.

Page 22: Other Applications (data cache)

Saving: energy saving (%). Penalty: performance penalty (%).

Application  Cache size             100nm               70nm
             180nm  100nm  70nm     Saving   Penalty    Saving   Penalty
basicmath    4K     2K     2K       28.15    2.73       43.02    2.73
susan        8K     2K     2K       34.84    10.08      62.20    10.08
cjpeg        32K    8K     8K       48.13    12.21      66.22    12.21
djpeg        32K    8K     8K       25.46    25.96      58.71    25.96
lame         32K    16K    8K       21.93    12.97      47.52    53.85
dijkstra     32K    8K     8K       34.44    35.87      58.77    35.87
patricia     32K    8K     8K       57.04    9.85       77.69    24.79
blowfish     32K    8K     4K       57.91    11.43      69.28    52.10
rijndael     32K    16K    8K       36.61    9.00       59.98    33.89
sha          32K    1K     1K       74.53    13.7       91.34    13.72
average                             41.09    14.38      63.47    26.52

Page 23: The effect of miss rate on optimal cache size for different technologies

[Figure: number of misses (K) versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K) for 1-way and 2-way caches]

Page 24: Energy Evaluation

[Figure: two plots of total energy (mJ) versus cache size (64K, 32K, 16K, 8K, 4K, 2K, 1K), each with series for 180nm, 100nm, and 70nm]

Page 25: Results

For the direct-mapped cache, the minimum-energy cache size is 32K for all three technologies.

For the 2-way cache, the minimum-energy sizes are 32K, 16K, and 16K for 180nm, 100nm, and 70nm, respectively.

When the slope of the miss rate is very sharp, dynamic energy dominates static energy, and therefore we arrive at the same cache size for any technology.

However, when a 2-way set-associative cache is used, the miss-rate curve flattens and the static energy again becomes more important. That is why, in the 2-way cache, 100nm and 70nm have a different optimal point from 180nm.

Thus, as the miss-ratio variations become softer, the optimal cache sizes for different technologies move farther apart.

For the instruction cache, where execution clock cycles change from 800 M to 17000 M (~21 times more), the optimal cache sizes are 64K, 16K, and 16K, whereas for the data cache, with softer variation from 800 M to 1000 M (only 1.2 times more), the minimum-energy cache sizes are 32K, 2K, and 1K.

In the case of the 2-way cache, the optimal cache size for the 100nm and 70nm processes (16KB in both) consumes 9% and 29% less energy, respectively, than the 180nm optimal cache (32KB), with a 25% performance loss.

Page 26: Conclusions

The results show that when a low-energy embedded system is re-implemented in a new technology, the cache may need to be re-selected.

Our study showed that the sharper the slope of the miss rate across cache sizes, the less the optimal cache size varies across technologies.

The experiments showed that in all cases, the optimal cache size decreases in finer technologies despite the increase in misses and dynamic energy. This is due to the high impact of static energy in future technologies and confirms that, unlike in micrometer-scale technologies, simply adding more cache does not reduce total system energy; cache size must be reduced to minimize total system energy in future nanometer technologies.

In the data cache, this effect is magnified because it sees fewer cache accesses (less dynamic energy) than the instruction cache.

Since smaller caches are more suitable for low-energy systems in finer technologies, finding a cache configuration that simultaneously optimizes performance and energy will become increasingly difficult.

Page 27: Thank you for your attention

Page 28: Energy Saving & Performance Penalty

Energy Saving = (energy_cache180_NTech - energy_cacheNTech) / energy_cache180_NTech

Performance Penalty = (exec_time_cacheNTech - exec_time_cache180) / exec_time_cache180
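The two ratios above translate directly into code. In the sketch below the argument names mirror the slide's symbols; the reading that `energy_cache180_NTech` is the energy of the 180nm-optimal cache size when built in the new technology, and `energy_cacheNTech` the energy of the size chosen for that technology, is an interpretation rather than something the slide states. The sample inputs are made up.

```python
def energy_saving(energy_cache180_ntech, energy_cache_ntech):
    """Fraction of energy saved by the re-selected cache size."""
    return (energy_cache180_ntech - energy_cache_ntech) / energy_cache180_ntech


def performance_penalty(exec_time_cache_ntech, exec_time_cache180):
    """Fractional slowdown caused by the re-selected cache size."""
    return (exec_time_cache_ntech - exec_time_cache180) / exec_time_cache180


# Made-up inputs: energy drops from 200 to 150, run time grows from 100 to 120.
saving = energy_saving(200.0, 150.0)
penalty = performance_penalty(120.0, 100.0)
print(f"saving {saving:.0%}, penalty {penalty:.0%}")
```

Both functions return fractions; multiply by 100 to compare against the percentages reported in the tables and plots.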

Page 29: Instruction Cache – Energy Saving

[Figure: energy saving for 100nm (%) per benchmark at 20°C, 60°C, and 100°C]

[Figure: energy saving for 70nm (%) per benchmark (basicmath, bitcounts, qsort, cjpeg, djpeg, lame, dijkstra, patricia, blowfish, average) at 20°C, 60°C, and 100°C]

100nm: 8%, 27%, and 41% for 20°C, 60°C, 100°C (max: 65%)

70nm: 1%, 6%, and 16% for 20°C, 60°C, 100°C (max: 45%)

Page 30: Instruction Cache – Performance Penalty

[Figure: performance penalty for 100nm (%) per benchmark (basicmath, bitcounts, qsort, cjpeg, djpeg, lame, dijkstra, patricia, blowfish, average) at 20°C, 60°C, and 100°C]

[Figure: performance penalty for 70nm (%) per benchmark (same benchmarks) at 20°C, 60°C, and 100°C]

100nm: 1%, 1.2%, and 2.2% for 20°C, 60°C, 100°C

70nm: 0.6%, 2.3%, and 16% for 20°C, 60°C, 100°C

Page 31: Data Cache – Energy Saving

[Figure: energy saving for 100nm (%) per benchmark at 20°C, 60°C, and 100°C]

[Figure: energy saving for 70nm (%) per benchmark (basicmath, susan, qsort, cjpeg, djpeg, lame, dijkstra, patricia, blowfish, average) at 20°C, 60°C, and 100°C]

100nm: 3.3%, 25%, and 47% for 20°C, 60°C, 100°C (max: 75%)

70nm: 7%, 22%, and 33% for 20°C, 60°C, 100°C (max: 65%)

Page 32: Data Cache – Performance Penalty

[Figure: performance penalty for 100nm (%) at 20°C, 60°C, and 100°C]

[Figure: performance penalty for 70nm (%) at 20°C, 60°C, and 100°C]

100nm: 0.8%, 5.3%, and 8% for 20°C, 60°C, 100°C

70nm: 3.6%, 10%, and 20% for 20°C, 60°C, 100°C

Page 33: Architecture and Reconfiguration Flow for a Temperature-Aware Configurable Cache

Configurable Cache +

– Hardware: thermal sensor; accessible read port

– Software: a table in the Operating System (OS) for recording temperature ranges and their suitable cache configurations

Page 34: Flow of configuring Temperature-Aware Configurable Cache

Evaluation phase (offline)
– Static and dynamic power for different cache configurations and temperatures for the target technology
– Execution time, number of hits and misses for different cache configurations obtained through running the application on an ISS
– Determining the lowest energy cache configuration for different target temperatures
– Fill the lookup table of the configurable cache with proper configuration for each temperature

Reconfiguration phase (online)
– Detect the current temperature
– Use the lookup table and load the proper configuration for the current temperature
– Execute the application
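The two phases of this flow can be sketched as a table build followed by a lookup. Everything here is a hypothetical illustration: the function names, the choice to round a sensor reading up to the next evaluated temperature, and the energy figures are assumptions, not details given in the slides.

```python
from bisect import bisect_left


def build_lookup_table(energy_by_temp_and_size):
    """Offline phase: keep the lowest-energy cache size per temperature.

    energy_by_temp_and_size maps temperature (deg C) to a dict of
    cache size (KB) -> total energy for the application.
    Returns a list of (temperature, best size) sorted by temperature.
    """
    table = []
    for temp_c in sorted(energy_by_temp_and_size):
        energies = energy_by_temp_and_size[temp_c]
        best_size = min(energies, key=energies.get)
        table.append((temp_c, best_size))
    return table


def select_configuration(table, current_temp_c):
    """Online phase: use the entry for the nearest evaluated temperature
    at or above the reading (assumed policy), clamped to the hottest entry."""
    temps = [t for t, _ in table]
    idx = min(bisect_left(temps, current_temp_c), len(table) - 1)
    return table[idx][1]


# Made-up energies (mJ) for two candidate sizes at four temperatures.
evaluated = {
    0: {64: 100.0, 32: 120.0},
    40: {64: 150.0, 32: 160.0},
    80: {64: 260.0, 32: 250.0},
    100: {64: 400.0, 32: 330.0},
}
table = build_lookup_table(evaluated)
print(table)                            # [(0, 64), (40, 64), (80, 32), (100, 32)]
print(select_configuration(table, 25))  # 64
print(select_configuration(table, 90))  # 32
```

Rounding up to the hotter entry is a conservative choice, since leakage grows with temperature; a real design could equally store explicit temperature ranges, as the software table described on the previous slide suggests.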

Page 35: Temperature measurement accuracy (1/2)

Tj = Ta + θJA · P

– Tj: Junction Temperature

– Ta: Ambient Temperature

– P: Power

– θJA: Junction-to-Ambient Thermal Resistance

Page 36: Temperature measurement accuracy (2/2)

                             ARM7TDMI    ARM966E-S
180nm   Power consumption    24.15 mW    140 mW
        Frequency            115 MHz     200 MHz
130nm   Power consumption    7.98 mW     62.5 mW
        Frequency            133 MHz     250 MHz
90nm    Power consumption    7.08 mW     51.7 mW
        Frequency            236 MHz     470 MHz

θJA: 7°C/W ~ 35°C/W

ΔT = (Tj - Ta) ~ 5°C
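As a quick check of the ΔT figure, the relation Tj = Ta + θJA · P can be evaluated for the largest power in the table (140 mW, ARM966E-S at 180nm) and the upper end of the θJA range (35°C/W). The function name is illustrative, and pairing these two worst-case values is an assumption about how the roughly 5°C bound was obtained.

```python
def junction_temperature_c(ambient_c, theta_ja_c_per_w, power_w):
    """Tj = Ta + theta_JA * P, with P in watts and theta_JA in deg C per watt."""
    return ambient_c + theta_ja_c_per_w * power_w


# Worst case from the table: 140 mW with theta_JA = 35 C/W.
delta_worst = junction_temperature_c(0.0, 35.0, 0.140)
# Mildest case: 7.08 mW (ARM7TDMI, 90nm) with theta_JA = 7 C/W.
delta_best = junction_temperature_c(0.0, 7.0, 0.00708)
print(round(delta_worst, 2), round(delta_best, 3))  # 4.9 0.05
```

The worst-case rise of about 4.9°C is in line with the ΔT of roughly 5°C quoted on the slide, which suggests the junction temperature stays close to ambient for processors of this class.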

Page 37: Conclusions

Our results show that up to 66% and 45% of energy consumption can be saved for the instruction cache in 100nm and 70nm, respectively, when the temperature changes from 0°C to 100°C.

Due to the increase of the leakage effect in finer technologies and at higher temperatures, smaller caches will be more energy efficient for future low-energy systems.

Since smaller caches are more suitable for low-energy systems in finer technologies and at higher temperatures, finding a cache configuration that simultaneously optimizes performance and energy will become increasingly difficult, especially at high temperatures.

Since the data cache is accessed less often than the instruction cache, the data cache is more easily affected by temperature and technology than the instruction cache. By using a configurable data cache, up to 74% and 64% of energy can be saved for 100nm and 70nm, respectively.

Page 38: Thank you for your attention

Questions?

Page 39: Motivations and Observations (3/4)

BSIM3 equation for subthreshold leakage

Page 40: Experimental Results (1/)

Applications from MiBench

SimpleScalar

CACTI 4.1
– Three technologies: 180nm, 100nm, and 70nm
– Six temperatures: 0°C, 20°C, 40°C, 60°C, 80°C, 100°C

Configurable Cache
– Size: 64KB~1KB

Page 41: Qsort-Instruction Cache

[Figure: number of execution clock cycles (K) versus instruction cache size (128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K), qsort]

[Figure: dynamic energy (mJ) at 100nm versus instruction cache size (128K to 1K), qsort]

Page 42: Qsort-Instruction Cache

[Figure: static energy (mJ) at 100nm versus instruction cache size (128K to 1K), qsort, at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C]

[Figure: total energy (mJ) at 100nm versus instruction cache size (128K to 1K), qsort, at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C]

• {0°C ~ 80°C}: 64KB, {80°C ~ 100°C}: 32KB

• 17% energy saving and 19.6% performance penalty

Page 43: Qsort-Data Cache

[Figure: number of execution clock cycles (K) versus data cache size (512K, 256K, 128K, 64K, 32K, 16K, 8K, 4K, 2K, 1K), qsort]

[Figure: dynamic energy (mJ) at 100nm versus data cache size (512K to 1K), qsort]

2-way set-associative, 16-byte line size, 100nm.

Page 44: Qsort-Data Cache

[Figure: static energy (mJ) at 100nm versus data cache size (512K to 1K), qsort, at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C]

Fig. 12. Static energy for different data cache sizes (100nm).

[Figure: total energy (mJ) at 100nm versus data cache size (128K to 1K), qsort, at 0°C, 20°C, 40°C, 60°C, 80°C, and 100°C]