krisztián flautner - [email protected] automatic performance setting for dynamic voltage...
Krisztián Flautner - [email protected] Performance Setting for Dynamic Voltage Scaling 1
Automatic Performance Setting for Dynamic Voltage Scaling
Krisztián Flautner
Steve Reinhardt
Trevor Mudge
Overview
• A mechanism for quantifying the user experience.
  – Metric: response time.
  – Automatic, no user program modifications required.
  – Run-time feedback to the kernel.
• Guiding performance setting of DVS processors.
  – For interactive episodes: slow down processor to save energy when response times are fast enough.
  – For periodic events: track periodicity, utilization and inter-task communication to establish necessary performance.
• Simulated and experimental results.
Dynamic Voltage Scaling
• Voltage is proportional to the frequency.
• Reduce f and v to match performance demands.
• Reduced frequency implies longer execution time.
Power = Capacitance × voltage² × frequency
Energy ~ voltage²
Execute only as fast as necessary to meet deadlines.
Running fast and idling is not energy efficient.
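As a rough illustration of why running fast and then idling wastes energy, the sketch below applies the power formula to a fixed amount of work. It assumes an idealized processor whose voltage scales exactly in proportion to frequency and which consumes nothing while idle; the constant `volts_per_hz`, the cycle count, and the function names are hypothetical.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def energy_for_work(cycles, frequency, volts_per_hz, capacitance=1.0):
    """Energy to execute a fixed number of cycles at a given frequency.

    Assumption: voltage is exactly proportional to frequency and idle
    time costs nothing, which is an idealization.
    """
    voltage = volts_per_hz * frequency
    run_time = cycles / frequency          # lower frequency means longer run
    return dynamic_power(capacitance, voltage, frequency) * run_time


# Same work executed at full speed and at half speed (hypothetical values).
full = energy_for_work(cycles=1e9, frequency=1.0e9, volts_per_hz=1.5e-9)
half = energy_for_work(cycles=1e9, frequency=0.5e9, volts_per_hz=1.5e-9)
ratio = half / full   # energy depends on voltage squared, not on run time
```

Under these assumptions, halving the frequency doubles the run time but cuts the energy for the same work to about a quarter, because the run-time and frequency terms cancel and only the squared voltage remains.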
Why bother?
[Figure: maximum power in Watts (log scale, 1 to 100) for the 386, 486, Pentium, Pentium MMX, Pentium Pro, and Pentium II, with a "?" marking what comes next. Source: Intel.]
Higher performance = increased power consumption.
Power Density!
[Figure: power density in Watts/cm² (log scale, 1 to 1000), with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and "Sun's surface?". Source: Intel.]
Small performance reduction = big energy savings
20% performance reduction = 32% energy reduction
40% performance reduction = 55% energy reduction
[Figure: voltage (V, 0 to 2) and energy factor (0 to 1) versus frequency (MHz, 0 to 1200). Graph based on Intel XScale data.]
Processors supporting DVS
                  lpARM     Intel SA-1100   Transmeta Crusoe 5600   Intel XScale   Intel XScale Demo
Min. frequency    8MHz      59MHz           500MHz                  150MHz         150MHz
Min. voltage      1.1V      0.79V           1.2V                    0.75V          0.75V
Min. power        1.8mW     106mW           ~1W                     40mW           40mW
Max. frequency    100MHz    251MHz          700MHz                  800MHz         1000MHz
Max. voltage      3.3V      1.65V           1.6V                    1.5V           1.75V
Max. power        220mW     964mW           ~2W                     900mW          1.45W
Process (µm)      0.6       0.35            0.18                    0.18           0.18
Max/min energy    9         4.4             1.8                     4              5.4
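The max/min energy row appears to follow from the Energy ~ voltage² relation: each entry matches the squared ratio of the maximum to the minimum voltage for that processor. The short check below reproduces the row from the voltages listed on this slide; the dictionary and function names are illustrative only.

```python
# (minimum voltage, maximum voltage) in volts, as listed on this slide.
VOLTAGE_RANGES = {
    "lpARM": (1.1, 3.3),
    "Intel SA-1100": (0.79, 1.65),
    "Transmeta Crusoe 5600": (1.2, 1.6),
    "Intel XScale": (0.75, 1.5),
    "Intel XScale Demo": (0.75, 1.75),
}


def max_min_energy_ratio(v_min, v_max):
    """Energy per operation scales with voltage squared."""
    return (v_max / v_min) ** 2


ratios = {
    name: round(max_min_energy_ratio(v_min, v_max), 1)
    for name, (v_min, v_max) in VOLTAGE_RANGES.items()
}
```

A wider voltage range gives the performance-setting algorithm more room to save energy, which is why the lpARM and the XScale demo part stand out in this comparison.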
Some recent desktop processors
              Intel Pentium IV   Intel Pentium III   AMD Athlon Model 4   MPC 7450
Core          1.4GHz @ 1.7V      500MHz @ 1.35V      650MHz @ 1.75V       533MHz @ 1.8V
                                 733MHz @ 1.65V      1.2GHz @ 1.75V       667MHz @ 1.8V
I/O           400MHz             100MHz, 133MHz      200MHz, 266MHz       133MHz
                                 3.3V                1.6V                 1.8V-2.5V
Process (µm)  0.18               0.18                0.18                 0.18
Max. power    66.3W              12W                 38W                  17W
                                 19.1W               66W                  19.1W
Performance setting algorithms
• Programmer specified
  – Works well but requires explicit specification of deadlines.
• Interval based algorithms
  – Use the ratio of idle to busy time to guide DVS.
  – Only work well if processor utilization is regular.
  – No service quality guarantees.
• Ours: episode classification based
  – Find important execution episodes – predict their performance.
  – Works with existing user programs.
  – Works well with irregular workloads.
  – Uses information in kernel to derive deadlines automatically.
  – Impact on response time is automatically quantified.
    • Performance can be adapted to the user’s preference.
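For contrast, a generic interval-based policy of the kind listed above might look like the sketch below. It is not taken from any particular system: the utilization thresholds, step size, and speed limits are all assumed values chosen for illustration.

```python
def next_speed(busy_ms, idle_ms, speed, low=0.5, high=0.7,
               step=0.2, min_speed=0.15, max_speed=1.0):
    """Choose the speed for the next interval from the last interval's
    busy/idle split. All thresholds here are hypothetical."""
    total_ms = busy_ms + idle_ms
    if total_ms <= 0:
        return speed                      # nothing measured, keep the speed
    utilization = busy_ms / total_ms
    if utilization > high:
        speed += step                     # mostly busy: speed up
    elif utilization < low:
        speed -= step                     # mostly idle: slow down
    return max(min_speed, min(max_speed, speed))
```

Because the decision depends only on the previous interval, a burst of work after a quiet period starts at a low speed, which is one way to see why such policies offer no service quality guarantees on irregular workloads.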
Episode classification
• Interactive episodes
  – When the user is waiting for the computer to respond.
• Periodic episodes
  – Producer (e.g. MP3 player).
  – Consumer (e.g. sound daemon).
A utilization trace
Each horizontal quantum is a millisecond; its height corresponds to the utilization in that quantum.
Episode classification
Interactive (Acrobat Reader), Producer (MP3 playback), and Consumer (esd sound daemon) episodes.
Mouse movement
X server updates screen every ~10ms. Update takes ~0.25ms.
Interactive episodes
Interactive episodes can include idle time
Waiting for data from the network during a run of Netscape. Page rendering starts after 250ms.
Finding interactive episodes
• One way: mouse click indicates start, idle time indicates end.
  – Inaccurate, latency in finding the end of the episode.
• Our approach: track inter-task communication.
  – Start of an interactive episode:
    • X server sends a message to another task.
  – During interactive episode:
    • Keep track of communicating tasks (episode’s task set).
    • Compute desired metrics.
  – Conditions for ending the episode (applied to tasks in task set):
    • No tasks are executing.
    • Data written by the tasks have been consumed.
    • No task was preempted the last time it ran.
    • No tasks are blocked on I/O.
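The four end-of-episode conditions can be read as a single predicate over the episode's task set. The sketch below is one possible reading, assuming all four conditions must hold at once; the `TaskState` record and its field names are hypothetical and do not correspond to actual kernel data structures.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical per-task bookkeeping for one interactive episode."""
    executing: bool = False
    unconsumed_output: bool = False    # data written but not yet read
    preempted_last_run: bool = False
    blocked_on_io: bool = False


def episode_has_ended(task_set):
    """True when every task in the set satisfies all four conditions."""
    return all(
        not task.executing
        and not task.unconsumed_output
        and not task.preempted_last_run
        and not task.blocked_on_io
        for task in task_set
    )
```

Under this reading, a task waiting on the network keeps the episode open even though the processor is idle, which matches the observation that interactive episodes can include idle time.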
Characteristics of Interactive Episodes
• Faster is not necessarily better.
  – Human perception has finite resolution.
  – Perception threshold is ~50ms.
  – The goal is to run fast enough to meet the perception threshold; there is no point in running any faster.
• Many interactive episodes are already fast enough.
• More will be imperceptible in the near future.
  – Using a 200ms perception threshold today estimates the work that will fit in 50ms 3 years from now.
Slow down the processor!
Time above the perception threshold
[Figure: time above the perception threshold (0% to 100%) versus perception threshold (50ms to 300ms) for Acrobat Reader, FrameMaker, Ghostview, GIMP, and Netscape.]
Time above the perception threshold is given as a percentage of time spent in all interactive episodes.
The key: performance-setting algorithm
• Use episode detection and classification.
  – Interactive episodes.
  – Periodic episodes (producer and consumer).
• Performance-setting on a per episode basis.
• Stretch episodes to their deadlines.
  – Interactive episode: perception threshold.
  – Stretch producer to consumer.
No modification of existing programs needed.
Works with irregular processor utilization and multiprogramming.
Cumulative interactive episode length distribution (FrameMaker)
[Figure: cumulative number and cumulative time of episodes (0% to 100%) versus episode length (sec, log scale from 1e-05 to 1), with markers at 10ms and 50ms. Regions are labeled "Minimum performance level sufficient" and "Max. performance".]
Performance-setting strategy for interactive episodes
• Predict the performance factor that would be correct most of the time (not for most events).
  – Based on past optimal performance factors.
• Limit worst case impact on response time.
  – Run at full performance after PanicThreshold is reached.
Performance-setting for interactive episodes
At the beginning of the episode
• Wait 5ms before transition to ignore short episodes.
• Switch to predicted performance level.
During the episode
• If episode duration reaches PanicThreshold, switch to maximum performance.
At the end of the episode
• Estimate full performance episode duration.
• Compute optimum performance level for past episode.
• Compute new prediction based on optimum settings.
PanicThreshold = PerceptionThreshold × (1 + PerformanceFactor)
Predicted PerformanceFactor is the average of past optimum settings, weighted by the corresponding episode lengths.
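A minimal sketch of the two formulas on this slide follows. The PanicThreshold expression and the length-weighted average come from the slide; how the optimum performance level for a past episode is computed is not spelled out here, so `optimum_performance_factor` is an assumption (the slowest factor that would have stretched the episode exactly to the perception threshold), as are the function names and the 0.15 floor.

```python
def panic_threshold(perception_threshold_ms, performance_factor):
    """PanicThreshold = PerceptionThreshold * (1 + PerformanceFactor)."""
    return perception_threshold_ms * (1 + performance_factor)


def optimum_performance_factor(full_speed_duration_ms,
                               perception_threshold_ms, min_factor=0.15):
    """Assumed definition: slowest factor that still finishes the episode
    by the perception threshold, clamped to the available range."""
    factor = full_speed_duration_ms / perception_threshold_ms
    return max(min_factor, min(1.0, factor))


def predict_performance_factor(history):
    """Average of past optimum factors, weighted by episode length.

    history: list of (optimum_factor, episode_length_ms) pairs.
    """
    total_length = sum(length for _, length in history)
    if total_length == 0:
        return 1.0                      # no data yet: stay at full speed
    weighted = sum(factor * length for factor, length in history)
    return weighted / total_length
```

Weighting by episode length makes the prediction correct for most of the time spent in episodes rather than for most episodes, so many very short episodes do not drag the prediction down.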
Performance-setting algorithm
Periodic activity detected
• Enter period-sampling mode.
• Switch to maximum performance.
• Establish base performance level.
• Exit period-sampling mode.
Start of interactive episode
• If not in period-sampling mode, apply interactive episode performance-setting policy.
End of interactive episode
• Update interactive episode statistics.
• Switch to base performance level, if there is periodic activity on the machine.
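The three event handlers on this slide can be sketched as a small state machine. This is only one way to arrange them: the class, its method names, the fixed `predicted_interactive_level`, and the choice to split period sampling into a start and a finish step are all assumptions made for illustration.

```python
class PerformanceSetter:
    """Hypothetical event-driven wrapper around the policy on this slide."""

    def __init__(self, predicted_interactive_level=0.5):
        self.level = 1.0                 # current performance level
        self.base_level = None           # None until periodic activity is seen
        self.period_sampling = False
        self.predicted_interactive_level = predicted_interactive_level
        self.episode_history = []        # interactive episode statistics

    def start_period_sampling(self):
        # Periodic activity detected: sample at maximum performance.
        self.period_sampling = True
        self.level = 1.0

    def finish_period_sampling(self, base_level):
        # Base level established: leave period-sampling mode.
        self.base_level = base_level
        self.period_sampling = False

    def on_interactive_episode_start(self):
        if not self.period_sampling:
            self.level = self.predicted_interactive_level

    def on_interactive_episode_end(self, optimum_level):
        self.episode_history.append(optimum_level)
        if self.base_level is not None:  # periodic activity on the machine
            self.level = self.base_level
```

For example, an interactive episode that starts while period sampling is in progress leaves the level at maximum, and once a base level is known each episode end returns the processor to that base level.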
Performance-setting during the Acrobat Reader benchmark (200ms p.t.)
[Figure: performance factor (0 to 1) versus time (sec).]
Transitions to maximum performance level are due to reaching the PanicThreshold
Performance-setting during the Acrobat Reader + MP3 benchmark (200ms p.t.)
[Figure: performance factor (0 to 1) versus time (sec, 0 to 20).]
Transitions due to PanicThreshold
Full performance for periodic activity.
Hardware assumptions
Minimum performance: 150MHz @ 0.75V
Maximum performance: 1000MHz @ 1.75V
PLL resynch time (stalls execution): 0.02ms
Voltage transition time: 1ms
Assumptions based on Intel XScale.
We assume that processor switches to sleep mode when it is not executing an episode.
Energy factors (no MP3)
[Figure: energy factor (0% to 100%) versus perception threshold (50ms to 300ms) for Acroread, FrameMaker, Ghostview, GIMP, Netscape, and Xemacs.]
Energy factors with MP3 playback
[Figure: energy factor (0% to 100%) versus perception threshold (50ms to 300ms) for Acroread, FrameMaker, Ghostview, GIMP, Netscape, and Xemacs.]
Changes in cumulative episode lengths as the result of performance scaling (Xemacs 50ms p.t. )
[Figure: cumulative percentage of time (0% to 100%) versus episode length (sec, log scale from 1e-05 to 1), before and after performance scaling, with markers at 10ms and 50ms.]
Vertigo
• A DVS implementation for Linux 2.4 kernel.
• Currently runs on Transmeta Crusoe.
  – Test machine: Sony PictureBook (PCG-C1VN) using TM5600 processor (300MHz-600MHz).
Goals:
• Robust implementation.
• Evaluate our algorithms on computers with DVS.
• Contrast with conventional DVS algorithm (LongRun).
Vertigo vs. LongRun
• LongRun: implemented as part of the processor.
  – Interval based algorithm (guided by busy vs. idle time).
  – Min. and max. range is controllable in software.
• Vertigo: implemented in OS kernel.
  – Classification based algorithm.
  – Distinguishes important from unimportant parts of execution.
  – Takes the quality of the user experience into account.
• Qualitative comparison on following graphs.
  – The two runs of the benchmarks are close but not identical.
    • Human repeated the runs of the benchmark.
  – Transitions to sleep are not shown.
  – Same perceived interactive performance.
No user activity
[Figure: performance level versus time (s), shown for LongRun and for Vertigo.]
Frequency range of the TM5600 processor:
50% = 300MHz @ 1.3V
100% = 600MHz @ 1.6V
Max. energy savings that should be expected on this processor is ~34%.
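The ~34% figure is consistent with the Energy ~ voltage² relation applied to the two TM5600 operating points. The check below assumes that relation is the basis for the estimate; the function name is illustrative.

```python
def max_energy_savings(v_min, v_max):
    """Largest fraction of energy saved by running at v_min instead of v_max,
    assuming energy per operation scales with voltage squared."""
    return 1 - (v_min / v_max) ** 2


# TM5600 voltages from this slide: 1.3V at 300MHz, 1.6V at 600MHz.
tm5600_savings = max_energy_savings(v_min=1.3, v_max=1.6)
```

The narrow voltage range of this processor, rather than its 2x frequency range, is what bounds the achievable savings.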
Emacs
[Figure: performance level versus time (s), shown for LongRun and for Vertigo.]
Acrobat Reader
[Figure: performance level versus time (s), shown for LongRun and for Vertigo.]
Acrobat Reader with sleep transitions
[Figure: performance level versus time (s), shown for LongRun and for Vertigo.]
Frequent transitions to/from sleep mode. Longer durations without sleeping.
Desired improvements
• Processor parameters are good enough.
  – Faster voltage transitions would help a little.
  – As peak performance gets higher, lower minimum performance is desirable.
• More sophisticated prediction algorithms.
  – Distinguish between episode instances, not just episode types.
• Larger performance range for DVS processor.
  – Puts more pressure on performance-setting algorithm.
  – More opportunity for energy savings.
Conclusions
• Many interactive episodes are already fast enough.
  – More will be fast enough in the near future.
  – Use Dynamic Voltage Scaling to save energy.
• Episode classification based on inter-task communication.
  – Fast, accurate, no user program modifications required.
• Performance-setting based on episode classification.
  – Works well with multiprogramming, irregular processor utilization.
  – Ensures high quality interactive performance.
  – Significant energy savings (10%-80%).
Future work
• Evaluate our algorithms on real hardware.
  – Processors are slowly becoming available.
  – Impact on interactive performance.
• An API to specify episodes.
  – Light-weight: specify hints, not complete information.
  – Works in concert with existing detection mechanism.
• Apply episode detection to other problems.
  – Scheduler: can real-time deadlines be detected automatically?
fin.
Response time
• Faster is not always better.
  – Fundamental limit to what is perceptible to humans.
    • Movies: 20-30 frames per second.
    • Perceptual causality: 50ms-100ms.
    • Dragging objects on screen: 200ms.
    • Non-continuous operation: 1-2sec.
The time it takes for the computer to respond to user initiated events.
The goal is to run fast enough to meet the perception threshold, no point to running any faster.
The performance gap
[Figure: desired performance and available performance (log scale, 1 to 100000) versus time (years, 0 to 9), annotated with four points.]
(A) Available performance starts accommodating requirements.
(B) All performance requirements are met.
(C) Slowest available performance exceeds minimum requirements.
(D) Available performance is higher than required.
Cumulative interactive episode length distribution (Xemacs)
[Figure: cumulative number and cumulative time of episodes (0% to 100%) versus episode length (sec, log scale from 1e-05 to 1), with markers at 10ms and 50ms. Regions are labeled "Minimum performance level sufficient" and "Max. performance".]
Communication between tasks
[Figure: two diagrams of tasks scheduled on CPU 0 and CPU 1, with read (R) and write (W) events marked between tasks.]
Producer and consumer episodes
• Example: MP3 playback through esd sound daemon.
• Monitor communications to/from sound daemon.
• Distance between producer and consumer episodes determines necessary performance level.
[Figure: diagram of the MP3 player, sound daemon, and HW sound device.]
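One way to read "distance determines necessary performance level" is that the producer can be stretched until it finishes just before the consumer needs its data. The sketch below encodes that reading; the formula, the function name, and the 0.15 floor are assumptions, not details given on the slide.

```python
def producer_performance_factor(producer_busy_ms, gap_to_consumer_ms,
                                min_factor=0.15):
    """Assumed model: the producer ran for producer_busy_ms at full speed
    and then sat idle for gap_to_consumer_ms before the consumer ran.
    Stretching it across the whole window needs busy / (busy + gap)."""
    if gap_to_consumer_ms <= 0:
        return 1.0                       # no slack: stay at full performance
    window_ms = producer_busy_ms + gap_to_consumer_ms
    factor = producer_busy_ms / window_ms
    return max(min_factor, min(1.0, factor))
```

For instance, a decoder that needs 10ms of full-speed work and then waits 30ms for the sound daemon could, under this model, run at a quarter of full performance.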