Variational Path Profiling
Erez Perelman*, Trishul Chilimbi†, Brad Calder*
*University of California, San Diego
†Microsoft Research, Redmond
Observation: Variation in Paths Exists
• Goal: find the paths to focus on for optimization
• What is a path?
– Acyclic control flow trace through the binary (e.g., a loop body)
• Variation in path performance is optimization potential
What is variation?
• Performance between iterations of a path is not constant
– Underlying architecture effects (e.g., cache misses) can cause variation
• Example of the amount of variation seen
– One common path in gzip was observed to execute in as few as 48,409 cycles and as many as 4,004,226 cycles
Goal: Optimize Away Variation
• Hypothesis:
– Every execution of a path can take the minimum time (if architecture effects are ignored)
• Want: reduce the variation of a path to improve program performance
– Ideal time = the fastest execution of a path
– Optimize the path to execute near its ideal time every time
• Result
– Balanced path execution time (smaller net variation for a path)
How to Find the Variation
• Sample path executions and measure performance variations
– Rank the top varying paths in the program
• Highly optimized paths won’t have much variation
– Traditional hot path profilers won’t find the variation
• Optimized paths still execute the same number of times
– VPP focuses on good optimization points that have not been exploited
Outline
• Variational Path Profiling
– Profiling
– Analysis
– Measuring Stability
• Optimizations
– Apply simple optimizations on top paths
– Speedup results
– Comparison to other path profiling techniques
• Future Work
– Discovering structure in variation and its implications
VPP: Profiling
• Sample execution of acyclic paths with Bursty Tracing
– Measure time in path
– Unique path signature
• Entry PC and branch history, e.g., 0x0040211F-110
• Accurate measurement of performance is essential
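The signature format above can be illustrated with a minimal sketch. It assumes the history part of "0x0040211F-110" is one character per branch outcome (1 = taken, 0 = not taken), oldest branch first; the talk does not spell this out, and the helper name format_path_signature is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from the talk): writes "<entry PC>-<history>" into out.
 * taken[i] is nonzero if the i-th branch on the path was taken (assumed encoding).
 * Returns 0 on success, -1 if the buffer is too small. */
int format_path_signature(uint32_t entry_pc, const int *taken, size_t n_branches,
                          char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    assert(taken != NULL || n_branches == 0);

    /* Entry PC as eight hex digits, followed by the separator. */
    int written = snprintf(out, out_size, "0x%08lX-", (unsigned long)entry_pc);
    if (written < 0 || (size_t)written + n_branches + 1 > out_size)
        return -1;

    /* One character per branch outcome, oldest branch first. */
    for (size_t i = 0; i < n_branches; i++)
        out[(size_t)written + i] = taken[i] ? '1' : '0';
    out[(size_t)written + n_branches] = '\0';
    return 0;
}
```

Under these assumptions, an entry PC of 0x0040211F with outcomes taken, taken, not taken yields the string shown on the slide.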
Bursty Tracing
[Figure: the original procedure with blocks A and B, next to the modified procedure (Bursty Tracing) with blocks A’ and B’ alongside A and B]
Sampling Overhead
• Accuracy is critical for time measurement of a path
– Bursty Tracing has less than 5% instrumentation overhead
– Timing of a path has even lower overhead
• Don’t measure the time of instrumentation code
• A small bias exists, but it is consistent and can be accounted for
VPP: Analysis
• Compute net variation time for each path
– Basetime(i) = fastest execution time of path i
– Net variation(i) = Total time(i) - [Frequency(i) x Basetime(i)]
• Rank paths according to net variation
– Top few paths dominate all program variation
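The net variation formula and the ranking step can be sketched as follows. This is an illustrative sketch, not the authors' tool: the struct layout and function names are assumptions, and times are taken to be cycle counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-path accumulator; must start zero-initialized. Names are hypothetical. */
typedef struct {
    uint64_t total_cycles; /* Total time(i) */
    uint64_t base_cycles;  /* Basetime(i): fastest sample seen so far */
    uint64_t frequency;    /* Frequency(i): number of samples */
} path_stats;

/* Record one sampled execution time of the path. */
void path_stats_add_sample(path_stats *p, uint64_t cycles)
{
    assert(p != NULL);
    if (p->frequency == 0 || cycles < p->base_cycles)
        p->base_cycles = cycles;
    p->total_cycles += cycles;
    p->frequency += 1;
}

/* Net variation(i) = Total time(i) - Frequency(i) x Basetime(i). */
uint64_t path_net_variation(const path_stats *p)
{
    assert(p != NULL);
    return p->total_cycles - p->frequency * p->base_cycles;
}

/* qsort comparator: larger net variation sorts first. */
static int by_net_variation_desc(const void *a, const void *b)
{
    uint64_t va = path_net_variation(a);
    uint64_t vb = path_net_variation(b);
    return (va < vb) - (va > vb);
}

/* Reorder paths so that the top varying paths come first. */
void rank_paths_by_net_variation(path_stats *paths, size_t n)
{
    qsort(paths, n, sizeof paths[0], by_net_variation_desc);
}
```

For example, a path sampled at 100, 150, and 400 cycles has a total of 650, a base time of 100, and a net variation of 650 - 3 x 100 = 350 cycles, while a path that always takes 200 cycles has a net variation of 0 regardless of how often it runs.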
Structure within Variation
[Figure: Bzip2 top 5 varying paths. x-axis: time variations relative to fastest path execution; y-axis: number of occurrences]
VPP: Top 10 Paths
[Figure: y-axis: % execution time; benchmarks: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg]
Stability
• Do top varying paths change when system load or program input is changed?
– System load measures resource utilization (processor, memory, buses, etc.)
• Measure stability of top paths across system loads
– Heavy system load vs. light system load
• Across program inputs
– Program execution varies; how does it affect the top paths?
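One simple way to quantify this kind of stability is to count how many paths appear in the top-N lists of both runs (for example, light load vs. heavy load). The sketch below assumes each path is identified by a numeric ID that is unique within a list; the function name and the ID type are hypothetical, and the talk's own metric may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the path IDs that appear in both top-N lists.
 * Assumes IDs are unique within each list. */
size_t count_common_paths(const uint32_t *top_a, size_t n_a,
                          const uint32_t *top_b, size_t n_b)
{
    assert(top_a != NULL || n_a == 0);
    assert(top_b != NULL || n_b == 0);

    size_t common = 0;
    for (size_t i = 0; i < n_a; i++) {
        for (size_t j = 0; j < n_b; j++) {
            if (top_a[i] == top_b[j]) {
                common++;
                break; /* each ID in top_a is counted at most once */
            }
        }
    }
    return common;
}
```

The quadratic scan is acceptable here because the lists are short (ten entries in the talk's experiments).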
Stability: System Load
[Figure: y-axis: % execution time; benchmarks: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg; series: low load top 10, low load w/ top in both, high load top 10, high load w/ top in both]
Stability: Input
[Figure: y-axis: % execution time; benchmarks: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg; series: top 10 self trained, top 10 cross trained]
VPP: Optimize Top Paths
• Simple optimization strategy for top paths to show optimization potential
– Prefetch loads in the path one or two iterations ahead of the loop
– Check loop bounds so prefetches stay within the bounds of data accesses
• After optimization, paths lost 41% of net variation on average
• More elaborate optimizations can reduce more variation
Optimization Example: VPR
    while (ito < heap_tail) {
        if (heap[ito+1]->cost < heap[ito]->cost)
            ito++;
        if (heap[ito]->cost > heap[ifrom]->cost)
            break;
        /* Lines marked ** on the slide: prefetch a later heap entry,
           guarded by a bounds check */
        if (ito*8 < heap_tail)
            _mm_prefetch((char*)&heap[ito*8]->cost, 1);
        temp_ptr = heap[ito];
        heap[ito] = heap[ifrom];
        heap[ifrom] = temp_ptr;
        ifrom = ito;
        ito = 2*ifrom;
    }
• This optimization results in a 9% speedup!
VPP: Spec 2K Speedup
[Figure: y-axis: % speedup; benchmarks: ammp, art, bzip, equake, gcc, mcf, parser, twolf, vortex, vpr, avg]
Comparing to other Profiling Techniques
• Path profiling techniques often base hotness on frequency
– Most executed paths are considered hot
– Once these are optimized
• Still hot based on frequency
• Lower variation, so ranking goes down with VPP
• VPP dynamically ranks paths
– Once optimized, path ranking can change
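The contrast between the two rankings can be sketched with a small example. The struct, function names, and numbers are made up for illustration; rank 0 means "hottest" under the given criterion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path profile summary. */
typedef struct {
    uint64_t frequency;     /* how many times the path executed */
    uint64_t net_variation; /* total time minus frequency x fastest time */
} path_profile;

/* Rank of path i by execution count: number of paths that ran more often. */
size_t rank_by_frequency(const path_profile *paths, size_t n, size_t i)
{
    assert(paths != NULL && i < n);
    size_t rank = 0;
    for (size_t j = 0; j < n; j++)
        if (paths[j].frequency > paths[i].frequency)
            rank++;
    return rank;
}

/* Rank of path i by net variation: number of paths that vary more. */
size_t rank_by_net_variation(const path_profile *paths, size_t n, size_t i)
{
    assert(paths != NULL && i < n);
    size_t rank = 0;
    for (size_t j = 0; j < n; j++)
        if (paths[j].net_variation > paths[i].net_variation)
            rank++;
    return rank;
}
```

With profiles {1000, 50}, {10, 5000}, and {500, 20}, the second path is last by frequency but first by net variation. If an optimization cuts its net variation to 10, its frequency rank is unchanged while its net variation rank drops to last, which is the re-ranking behavior described above.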
Comparing to other Profiling Techniques
[Figure: y-axis: difference in top 10 variational paths; benchmarks: art, equake, ammp, gzip, vpr, gcc, mcf, crafty, parser, gap, vortex, bzip2, twolf, foxpro, pc game, multimedia, avg; series: hot path ranking (frequency of path), time ranking (net execution time of path)]
Observation: Variation Structure
• Is there a pattern in variation?
– If we plot the variation over time, we can see interesting structure
• Future work:
– Does the context leading up to a path correlate with the path's performance?
– Can specific hardware structures be identified as causes of variation?
– Can specific optimizations be recommended based on variation structure?
Structure within Variation
[Figure: Bzip2 top 5 varying paths. x-axis: time variations relative to fastest path execution; y-axis: number of occurrences]
Conclusion
• VPP finds the top varying paths with good optimization potential
– A few top paths account for the majority of variation
– Top variational paths are stable
• Applying simple optimizations yields an 8.5% speedup on average for Spec 2K on a P4
• VPP finds hot paths that are not found with other techniques
– Once a path is optimized, its variation (the _hotness_ in VPP) is reduced