Uncovering Hidden Loop Level Parallelism in … (cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf)
TRANSCRIPT
University of Michigan, Electrical Engineering and Computer Science
Uncovering Hidden Loop Level Parallelism in Sequential Applications
Hongtao Zhong, Mojtaba Mehrara, Steve Lieberman, Scott Mahlke
Advanced Computer Architecture Lab, University of Michigan
CMP Architectures
• Multiple cores on a chip
– Higher throughput
– Reduced complexity (per core)
– More power/heat friendly
• Multithreaded applications
[Pictured: Intel Core 2 Duo, AMD Quad-core (Barcelona), Sun Niagara 2]
How About Single Thread?
[Source: Bridges et al., MICRO '07]
Loop Level Parallelization
DOALL loop: i = 0-39
Core 0: i = 0-19
Core 1: i = 20-39
Loop Level Parallelization
Speculative DOALL loop: i = 0-39, split into loop chunks
Core 0: i = 0-9, i = 20-29
Core 1: i = 10-19, i = 30-39
Bad news: limited number of parallel loops in general purpose applications
– 1.3x speedup for SpecINT2000 on 4 cores
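The chunk-to-core mapping on these slides can be sketched as a small C helper. This is an illustration only: the function name `core_for_iteration` and the round-robin assignment of chunks to cores are assumptions read off the slide layout, not code from the talk.

```c
#include <assert.h>

/* Hypothetical helper (not from the talk): returns the core that runs
 * iteration i when the iteration space is cut into chunks of chunk_size
 * iterations and the chunks are handed to the cores round-robin. */
int core_for_iteration(int i, int chunk_size, int num_cores)
{
    assert(i >= 0 && chunk_size > 0 && num_cores > 0);
    return (i / chunk_size) % num_cores;
}
```

With 40 iterations and 2 cores, a chunk size of 20 gives the plain DOALL split (0-19 on core 0, 20-39 on core 1), and a chunk size of 10 gives the interleaved chunks of the speculative DOALL slide.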
Contributions
• Code generation framework
– Speculative parallelization of uncounted loops
• Compiler transformations
– Speculative loop fission
– Isolation of infrequent dependences
– Speculative prematerialization
[Diagram blocks: Spawn, Initialization, Consolidation, Abort Handler. Code shown in the diagram:]
    IS = ...; IE = ...;
    XBEGIN
    if (global_brk_flag) break;
    for (i = IS; i < IE; i++) {
      ......
      if (brk_cond) {
        local_brk_flag = 1;
        break;
      }
    }
    perm = RECV(THREADj-1)
    XCOMMIT
    if (local_brk_flag) {
      global_brk_flag = 1;
      kill_other_threads;
    }
    else if (IE < n)
      SEND(perm, THREADj+1)
Target Architecture
[Diagram: Core 0, Core 1, Core 2, and Core 3 with L2 caches, connected by a scalar operand network, with hardware transactional memory]
Code Generation Framework
    for (i = 0; i < n; i++)
      // original loop code
Code Generation Framework
• Supports counted and uncounted loops
– Software managed control speculation
• Iteration chunking
• Enforce transaction ordering
• Handles live-in, live-out & accumulator registers

    Spawn
    while (...) {
      IS += ...; IE += ...;
      XBEGIN
      if (globalBrk) break;
      for (i = IS; i < IE; i++) {
        // original loop code
        if (brkCond) {
          localBrk = 1;
          break;
        }
      }
      RECV(THREADj-1)
      XCOMMIT
      if (localBrk) {
        globalBrk = 1;
        abortOtherTXs;
      }
      SEND(THREADj+1)
    }
    Consolidation
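As a rough illustration of the chunking and break handling above, the following C sketch emulates the scheme in a single thread. It is not the generated code from the talk: the example loop (finding the first negative element), the function names, and the sequential emulation of XBEGIN, XCOMMIT, RECV, and SEND are all assumptions made for the sake of a runnable example.

```c
#include <assert.h>

/* Original uncounted loop: index of the first negative element,
 * or n if there is none. */
int find_first_negative(const int *a, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (a[i] < 0)
            break;
    return i;
}

/* Chunked version. Each chunk [is, ie) stands for one transaction;
 * chunks are visited in order here, which plays the role of the
 * ordered commit (RECV, XCOMMIT, SEND) on the slide. */
int find_first_negative_chunked(const int *a, int n, int chunk_size)
{
    int global_brk = 0;
    int result = n;

    assert(chunk_size > 0);
    for (int is = 0; is < n && !global_brk; is += chunk_size) {
        int ie = (is + chunk_size < n) ? is + chunk_size : n;
        int local_brk = 0;
        int i;

        /* XBEGIN would start the transaction here. */
        for (i = is; i < ie; i++) {
            if (a[i] < 0) {
                local_brk = 1;
                break;
            }
        }
        /* XCOMMIT point: a chunk that hit the break publishes it,
         * so no later chunk is started (abortOtherTXs on the slide). */
        if (local_brk) {
            global_brk = 1;
            result = i;
        }
    }
    return result;
}
```

For any positive chunk size the chunked function returns the same index as the original loop, which is the property the ordered commit is meant to preserve.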
DOALL Coverage – Provable and Profiled
[Bar chart. Y-axis: fraction of sequential execution (0 to 1). Legend: Provable DOALL. X-axis: benchmarks grouped as SPEC FP, SPEC INT, Mediabench, and Utilities: 052.alvinn, 056.ear, 171.swim, 172.mgrid, 177.mesa, 179.art, 183.equake, 188.ammp, 008.espresso, 023.eqntott, 026.compress, 072.sc, 099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, 164.gzip, 175.vpr, 181.mcf, 197.parser, 256.bzip2, 300.twolf, cjpeg, djpeg, epic, g721decode, g721encode, gsmdecode, gsmencode, mpeg2dec, mpeg2enc, pegwitdec, pegwitenc, rawcaudio, rawdaudio, unepic, grep, lex, yacc, average.]
DOALL Coverage – Provable and Profiled
[Bar chart. Y-axis: fraction of sequential execution (0 to 1). Legend: Profiled DOALL, Provable DOALL. X-axis: the same benchmarks, grouped as SPEC FP, SPEC INT, Mediabench, and Utilities, plus the average.]
Still not good enough! Few dependences hinder parallelization in many loops.
Compiler can help:
• Speculative fission
• Isolation of infrequent paths
• Speculative prematerialization
Speculative Loop Fission

Original loop:
    1:  while (node) {
    2:    work(node);
    3:    node = node->next;
        }

Sequential loop that records the node pointers:
    1:  while (node) {
    4:    node_array[count++] = node;
    3:    node = node->next;
        }

Parallel loop body, run per chunk:
        XBEGIN
    5:  node = node_array[IS];
        i = 0;
    1': while (node && i++ < CS) {
    2:    work(node);
    3':   node = node->next;
        }
        RECV(THREADj-1)
        XCOMMIT
        if (node != node_array[IS+CS]) {
          update_node_array;
          kill_other_threads();
        }
        SEND(THREADj+1)
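The two loops produced by fission can be sketched in plain C as follows. This is a single-threaded illustration under stated assumptions: `struct node`, the stand-in `work` (which doubles a value), the helper names, and the `capacity` parameter are hypothetical, and the misspeculation check against `node_array[IS+CS]` is left out because this `work` never changes the list.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Stand-in for work(node) on the slide; it does not modify the links. */
static void work(struct node *n)
{
    n->value *= 2;
}

/* First loop after fission: walk the list once and record each pointer. */
size_t collect_nodes(struct node *head, struct node **node_array,
                     size_t capacity)
{
    size_t count = 0;
    for (struct node *n = head; n != NULL; n = n->next) {
        assert(count < capacity);
        node_array[count++] = n;
    }
    return count;
}

/* Second loop, one chunk: start at node_array[is] and process at most
 * chunk_size nodes. Chunks only read node_array, so they are independent. */
void process_chunk(struct node **node_array, size_t count, size_t is,
                   size_t chunk_size)
{
    assert(is < count);
    struct node *n = node_array[is];
    size_t i = 0;
    while (n != NULL && i++ < chunk_size) {
        work(n);
        n = n->next;
    }
}

/* Driver: here the chunks run one after another; on the target machine
 * each chunk would run as a transaction on its own core. */
void process_list_fissioned(struct node *head, struct node **node_array,
                            size_t capacity, size_t chunk_size)
{
    assert(chunk_size > 0);
    size_t count = collect_nodes(head, node_array, capacity);
    for (size_t is = 0; is < count; is += chunk_size)
        process_chunk(node_array, count, is, chunk_size);
}
```

Each node is processed exactly once as long as the recorded pointers stay valid, which is the condition the slide's comparison with `node_array[IS+CS]` checks at commit time.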
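Infrequent Dependence Isolation
[Figure: control flow graph of a loop with blocks A, B, and C, whose branch edges are taken 99% and 1% of the time, shown next to the transformed graph with blocks A, B, C, and C’, in which the 1% edge becomes a break.]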
Infrequent Dependence Isolation
Sample loop from yacc benchmark

Original loop:
    for (j = 0; j <= nstate; ++j) {
      if (tystate[j] == 0) continue;
      if (tystate[j] == best) continue;
      count = 0;
      cbest = tystate[j];
      for (k = j; k <= nstate; ++k)
        if (tystate[k] == cbest) ++count;
      if (count > times) {        // 1%
        best = cbest;
        times = count;
      }
    }

Transformed loop:
    j = 0;
    while (j <= nstate) {
      for ( ; j <= nstate; ++j) {
        if (tystate[j] == 0) continue;
        if (tystate[j] == best) continue;
        count = 0;
        cbest = tystate[j];
        for (k = j; k <= nstate; ++k)
          if (tystate[k] == cbest) ++count;
        if (count > times)        // 1%
          break;
      }
      if (count > times) {        // 1%
        best = cbest;
        times = count;
        j++;
      }
    }
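To check that the transformed loop computes the same result as the original, both versions can be wrapped in functions and compared. The wrappers below are a sketch: the function names, the initial values (`best = -1`, `times = 0`, `count = 0`), and the convention that `tystate` holds `nstate + 1` non-negative entries are assumptions, not details given on the slide.

```c
#include <assert.h>

/* Original loop: the rarely taken update of best/times is the only
 * cross-iteration dependence. */
int best_state_original(const int *tystate, int nstate)
{
    int best = -1, times = 0, count = 0, cbest = 0;
    for (int j = 0; j <= nstate; ++j) {
        if (tystate[j] == 0) continue;
        if (tystate[j] == best) continue;
        count = 0;
        cbest = tystate[j];
        for (int k = j; k <= nstate; ++k)
            if (tystate[k] == cbest) ++count;
        if (count > times) {
            best = cbest;
            times = count;
        }
    }
    return best;
}

/* Transformed loop: the inner for loop only reads best and times, and
 * leaves through break on the infrequent path; the outer loop applies
 * the update and resumes after the iteration that broke out. */
int best_state_isolated(const int *tystate, int nstate)
{
    int best = -1, times = 0, count = 0, cbest = 0;
    int j = 0;
    while (j <= nstate) {
        for (; j <= nstate; ++j) {
            if (tystate[j] == 0) continue;
            if (tystate[j] == best) continue;
            count = 0;
            cbest = tystate[j];
            for (int k = j; k <= nstate; ++k)
                if (tystate[k] == cbest) ++count;
            if (count > times)
                break;
        }
        if (count > times) {
            best = cbest;
            times = count;
            j++;
        }
    }
    return best;
}
```

The inner loop of the second function is the part that becomes a candidate for speculative DOALL execution, since the writes to `best` and `times` have been moved out of it.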
DOALL Coverage – Profiled and Transformed
[Two bar charts. Y-axis: fraction of sequential execution (0 to 1). X-axis: the same benchmarks, grouped as SPEC FP, SPEC INT, Mediabench, and Utilities, plus the average. Legend of the first chart: profiled + provable. Legend of the second chart: profiled + provable, transformations.]
Coverage Breakdown
[Bar chart. Y-axis: fraction of sequential execution (axis from 0 to 70). X-axis: SpecINT, MediaBench, Utilities. Legend: DOALL loops; Control speculation for uncounted loops; Speculative fission; Speculative prematerialization; Infrequent dependence isolation; DOALL loops after transformations.]
![Page 47: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/47.jpg)
47 University of MichiganElectrical Engineering and Computer Science
Coverage Breakdown
0
10
20
30
40
50
60
70
SpecINT MediaBench Utilities
Fra
cti
on
of
se
qu
en
tia
l e
xe
cu
tio
n
DOALL loops Control speculation for uncounted loops
Speculative fission Speculative prematerialization
Infrequent dependence isolation DOALL loops after transformations
![Page 48: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/48.jpg)
Experimental Setup
• OpenIMPACT compiler
• Multicore simulator
– Simulates up to 8 ARM9-like processors
– Models scalar operand network
– Assumes perfect memory system
– Uses STM library to emulate HTM functionality
![Page 50: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/50.jpg)
Speedup
[Bar chart. Y-axis: speedup, 1 to 5. X-axis, grouped under SPEC FP, SPEC INT, Mediabench, Utilities: 052.alvinn, 056.ear, 171.swim, 172.mgrid, 177.mesa, 179.art, 183.equake, 188.ammp, 008.espresso, 023.eqntott, 026.compress, 072.sc, 099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, 164.gzip, 175.vpr, 181.mcf, 197.parser, 256.bzip2, 300.twolf, cjpeg, djpeg, epic, g721decode, g721encode, gsmdecode, gsmencode, mpeg2dec, mpeg2enc, pegwitdec, pegwitenc, rawcaudio, rawdaudio, unepic, grep, lex, yacc, average. Legend: With transformations; Without transformations; 2 core, 4 core, 8 core. Data labels above the axis limit: 7.89, 7.37, 7.87, 6.44.]
1.36x, 1.84x and 2.34x speedup on 2, 4, and 8 cores
![Page 51: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/51.jpg)
Conclusion
• Figure out ways to use available resources for legacy applications
– Code such as error handlers and linked-list and tree traversals limits parallelism
• Compiler analysis and optimization look promising
• 1.84x speedup on 4 cores after transformations, compared to 1.41x
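As a rough reading of the two 4-core figures above, Amdahl's law can be inverted to estimate the fraction of execution that would have to run in parallel to produce a given speedup. The sketch below is not from the slides: the function name `implied_parallel_fraction` is hypothetical, and the model is idealized (no speculation or communication overhead, perfect load balance), so the results are only a back-of-the-envelope estimate.

```c
#include <assert.h>

/*
 * Amdahl's law: S = 1 / ((1 - p) + p / n), where p is the parallel
 * fraction and n the number of cores. Solving for p gives
 *     p = (1 - 1/S) / (1 - 1/n).
 * Hypothetical helper, assuming an ideal machine with no overheads.
 */
double implied_parallel_fraction(double speedup, int cores) {
    assert(speedup >= 1.0 && cores > 1);
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / (double)cores);
}
```

Under these assumptions, `implied_parallel_fraction(1.84, 4)` is about 0.61 and `implied_parallel_fraction(1.41, 4)` is about 0.39.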
![Page 52: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/52.jpg)
Questions?
Thank you!
![Page 53: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/53.jpg)
SpecDSWP vs. Speculative Fission
[Figure: diagram with nodes A, B, and C.]
![Page 54: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/54.jpg)
SpecDSWP vs. Speculative Fission
[Figure: two schedules placing A0 to A3, B0 to B3, and C0 to C3 on Core 0, Core 1, Core 2, and Core 3.]
![Page 55: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/55.jpg)
Speculative Prematerialization
for (...) {
  1: current = ...;
  2: work(last);
  3: last = current;
}
![Page 56: Uncovering Hidden Loop Level Parallelism in …cccp.eecs.umich.edu/slides/mehrara-hpca08.pdf0.9 1 052.alvinn 056.ear 171.swim 172.mgrid 177.mesa 179.art 183.equake 188.ammp 008.espresso](https://reader034.vdocuments.mx/reader034/viewer/2022042922/5f6ced1b223e9e31f3123958/html5/thumbnails/56.jpg)
Speculative Prematerialization
for (...) {
  1: current = ...;
  2: work(last);
  3: last = current;
}

XBEGIN
1’: current =
3’: last =
for (...) {
  1: current = ...;
  2: work(last);
  3: last = current;
}
XCOMMIT
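The transformation on this slide can be sketched in plain C as follows. This is a minimal illustration under assumptions, not the authors' implementation: `compute_current` and `work` are hypothetical stand-ins for the elided statements 1 and 2, the chunks run one after another rather than on separate cores, and the XBEGIN/XCOMMIT boundaries appear only as comments because no transactional memory is used. It also assumes that statement 1 for an iteration does not depend on earlier iterations, so each chunk can recompute its incoming `last` by running 1' and 3' for the iteration just before its first one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for statement 1 ("current = ..."): the value
   for iteration i depends only on i and the read-only input. */
static int compute_current(const int *in, size_t i) {
    return in[i] * 2 + 1;
}

/* Hypothetical stand-in for statement 2 ("work(last)"). */
static void work(int last, int *out, size_t i) {
    out[i] = last;
}

/* Original loop: `last` carries a value from one iteration to the next. */
void run_sequential(const int *in, int *out, size_t n, int initial_last) {
    int last = initial_last;
    for (size_t i = 0; i < n; i++) {
        int current = compute_current(in, i);  /* 1 */
        work(last, out, i);                    /* 2 */
        last = current;                        /* 3 */
    }
}

/* One chunk [begin, end) with prematerialization code in front. */
void run_chunk(const int *in, int *out, size_t begin, size_t end,
               int initial_last) {
    assert(begin <= end);
    /* XBEGIN would start the transaction here. */
    int last = initial_last;
    if (begin > 0) {
        int current = compute_current(in, begin - 1);  /* 1' */
        last = current;                                /* 3' */
    }
    for (size_t i = begin; i < end; i++) {
        int current = compute_current(in, i);  /* 1 */
        work(last, out, i);                    /* 2 */
        last = current;                        /* 3 */
    }
    /* XCOMMIT would end the transaction here. */
}

/* Runs every chunk; no chunk reads state left behind by another,
   so a real system could dispatch them to different cores. */
void run_chunked(const int *in, int *out, size_t n, size_t chunks,
                 int initial_last) {
    assert(chunks > 0);
    for (size_t c = 0; c < chunks; c++) {
        run_chunk(in, out, n * c / chunks, n * (c + 1) / chunks,
                  initial_last);
    }
}
```

For any input, `run_chunked` should fill `out` with the same values as `run_sequential`; that equivalence is what allows the chunks to start without waiting for the preceding iterations.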