Octet: Capturing and Controlling Cross-Thread Dependences Efficiently
DESCRIPTION
Michael Bond, Milind Kulkarni, Man Cao, Minjia Zhang, Meisam Fathi Salmi, Swarnendu Biswas, Aritra Sengupta, Jipeng Huang (Ohio State, Purdue). Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. Parallel programming is mainstream. Shared memory with locks. Challenge:
TRANSCRIPT
![Page 1: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/1.jpg)
Octet: Capturing and Controlling Cross-Thread Dependences Efficiently
Michael Bond, Milind Kulkarni, Man Cao, Minjia Zhang, Meisam Fathi Salmi, Swarnendu Biswas, Aritra Sengupta, Jipeng Huang
Ohio State
Purdue
![Page 2: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/2.jpg)
Parallel programming is mainstream
Shared memory with locks
Challenge: performance & correctness
![Page 3: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/3.jpg)
• Help express parallelism better
• Eliminate concurrency errors
• Diagnose production bugs
• Deal with nondeterminism
Need practical runtime support
![Page 4: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/4.jpg)
• Atomicity checking
• Data race detection
• Record & replay
• Transactional memory
• DRF/SC enforcement
• Deterministic execution
Need practical runtime support
![Page 7: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/7.jpg)
Track dependences:
• Atomicity checking
• Data race detection
• Record & replay
Control dependences:
• Transactional memory
• DRF/SC enforcement
• Deterministic execution
Need practical runtime support
o.f = …
… = o.f
Commodity (software-only) approaches slow programs by several times
![Page 9: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/9.jpg)
Track dependences:
• Atomicity checking
• Data race detection
• Record & replay
Control dependences:
• Transactional memory
• DRF/SC enforcement
• Deterministic execution
Need practical runtime support
o.f = …   (instrumented with a check)
… = o.f   (instrumented with a check)
Any access could race → add synchronization at every access
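To make the cost concrete, here is a minimal sketch of what "synchronization at every access" can look like in software. It is not code from the talk: `TrackedObject`, `PessimisticBarriers`, the `metaLock` field, and the explicit thread-id parameter are all hypothetical names chosen for illustration, and the "analysis" is reduced to remembering the last accessor.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracked object: the program field o.f plus analysis metadata. */
final class TrackedObject {
    int f;                               // the field accessed on the slide (o.f)
    int lastAccessor = -1;               // illustrative analysis metadata
    final Object metaLock = new Object();
}

/** Pessimistic barriers: every check takes a lock, whether or not threads conflict. */
final class PessimisticBarriers {
    /** Counts how many synchronized checks ran (for illustration only). */
    static final AtomicLong syncOps = new AtomicLong();

    static void write(TrackedObject o, int value, int threadId) {
        synchronized (o.metaLock) {      // check and access are atomic together
            syncOps.incrementAndGet();
            o.lastAccessor = threadId;   // analysis step
            o.f = value;                 // program access
        }
    }

    static int read(TrackedObject o, int threadId) {
        synchronized (o.metaLock) {
            syncOps.incrementAndGet();
            o.lastAccessor = threadId;
            return o.f;
        }
    }
}
```

Even a thread that is the only one ever touching `o` pays for a lock acquire and release on each access, which is the overhead the slide attributes to commodity approaches.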
![Page 11: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/11.jpg)
Octet
Framework for runtime support
• Happens-before (HB) edges imply all dependences
• Atomicity of analysis & access
Concurrency control mechanism
• Synchronization only at cross-thread dependences
• Qualitative performance improvement
Proofs!
![Page 12: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/12.jpg)
Octet tracks ownership
Each object’s state ∈ { WrEx_T, RdEx_T, RdSh_c } (write-exclusive for thread T, read-exclusive for thread T, read-shared with counter c)
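One possible encoding of these three states is sketched below. The class name `OctetState`, its factory methods, and the two fast-path predicates are assumptions made for illustration and are not taken from the Jikes RVM implementation; the read-shared case also ignores any per-thread counter comparison a full protocol may perform.

```java
/** Illustrative per-object ownership state; all names here are hypothetical. */
final class OctetState {
    enum Kind { WR_EX, RD_EX, RD_SH }

    final Kind kind;
    final int owner;     // thread id T for WR_EX / RD_EX; -1 for RD_SH
    final long counter;  // counter value c for RD_SH; 0 otherwise

    private OctetState(Kind kind, int owner, long counter) {
        this.kind = kind;
        this.owner = owner;
        this.counter = counter;
    }

    static OctetState wrEx(int t)  { return new OctetState(Kind.WR_EX, t, 0); }
    static OctetState rdEx(int t)  { return new OctetState(Kind.RD_EX, t, 0); }
    static OctetState rdSh(long c) { return new OctetState(Kind.RD_SH, -1, c); }

    /** True if thread t can write without any state change. */
    boolean writeIsFastPath(int t) {
        return kind == Kind.WR_EX && owner == t;
    }

    /** True if thread t can read without any state change (RD_SH counter check omitted). */
    boolean readIsFastPath(int t) {
        return kind == Kind.RD_SH || owner == t;
    }

    @Override
    public String toString() {
        switch (kind) {
            case WR_EX: return "WrEx_T" + owner;
            case RD_EX: return "RdEx_T" + owner;
            default:    return "RdSh_" + counter;
        }
    }
}
```

The point of the encoding is that the common case, a thread re-accessing an object it already owns, is decided by a plain comparison with no lock or atomic operation.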
![Page 26: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/26.jpg)
Example: accesses to o.f by threads T1 to T4 (time flows downward)
• T1: write check, then wr o.f; o’s state = WrEx_T1
• T2: read check; T1 responds at a safe point (possibly an implicit safe point); o’s state changes from WrEx_T1 to RdEx_T2; T2: rd o.f
• T3: read check; o’s state changes from RdEx_T2 to RdSh_c; T3: rd o.f
• T4: read check; o’s state stays RdSh_c; T4: rd o.f
Sharing detection: [von Praun & Gross ’01] (comparison in our paper)
Distributed shared memory: Shasta [Scales et al. ’96]
Biased locking: [Kawachiya et al. ’02], [Russell & Detlefs ’06], [Hindman & Grossman ’06]
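The T1 to T4 example on the preceding slides can be replayed with a small single-threaded simulation. This is a sketch under stated assumptions, not the real barriers: `OctetSim` and its `Transition` labels are hypothetical, the safe-point coordination between threads is reduced to a returned `CONFLICT` label, and transitions the slides do not show (for example, a write to a read-shared object) are filled in as assumptions.

```java
/** Single-threaded simulation of one object's ownership state (hypothetical names). */
final class OctetSim {
    enum Kind { WR_EX, RD_EX, RD_SH }
    /** NONE: fast path; UPGRADE: no coordination; CONFLICT: old owner must respond at a safe point. */
    enum Transition { NONE, UPGRADE, CONFLICT }

    Kind kind;
    int owner;                  // meaningful for WR_EX / RD_EX only
    long counter;               // meaningful for RD_SH only
    private long nextCounter = 0;

    /** Assumption: a new object starts write-exclusive for its creating thread. */
    OctetSim(int creator) {
        kind = Kind.WR_EX;
        owner = creator;
    }

    /** Read check executed by thread t before "rd o.f". */
    Transition read(int t) {
        if (kind == Kind.RD_SH || owner == t) {
            return Transition.NONE;            // already readable by t
        }
        if (kind == Kind.WR_EX) {              // WrEx_T1 -> RdEx_T2
            kind = Kind.RD_EX;
            owner = t;
            return Transition.CONFLICT;
        }
        kind = Kind.RD_SH;                     // RdEx_T2 -> RdSh_c
        owner = -1;
        counter = ++nextCounter;
        return Transition.UPGRADE;
    }

    /** Write check executed by thread t before "wr o.f". */
    Transition write(int t) {
        if (kind == Kind.WR_EX && owner == t) {
            return Transition.NONE;
        }
        if (kind == Kind.RD_EX && owner == t) { // assumed: own RdEx upgrades to WrEx
            kind = Kind.WR_EX;
            return Transition.UPGRADE;
        }
        kind = Kind.WR_EX;                     // assumed: any other case conflicts
        owner = t;
        return Transition.CONFLICT;
    }
}
```

Replaying the slides means calling `write(1)`, `read(2)`, `read(3)`, and `read(4)` in that order on an object created by thread 1; only the second call requires coordination with another thread.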
![Page 27: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/27.jpg)
Practical runtime support
Track dependences:
• Atomicity checking
• Data race detection
• Record & replay
Control dependences:
• Transactional memory
• DRF/SC enforcement
• Deterministic execution
Octet: framework for runtime support; concurrency control mechanism
![Page 28: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/28.jpg)
Dependence recorder: records happens-before edges
(Same example: T1 performs wr o.f after its write check; T2, T3, and T4 each perform rd o.f after a read check; T1 responds at a safe point.)
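A recorder of this kind can be pictured as a hook that runs whenever an object's state changes hands and logs an edge from the previous owner to the accessing thread. The sketch below is an assumption-laden illustration rather than the recorder evaluated in the talk: `HbEdge`, `DependenceRecorder`, `onTransition`, and the per-thread logical clocks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One happens-before edge between two points in two threads (hypothetical type). */
final class HbEdge {
    final int srcThread;
    final long srcTime;
    final int dstThread;
    final long dstTime;

    HbEdge(int srcThread, long srcTime, int dstThread, long dstTime) {
        this.srcThread = srcThread;
        this.srcTime = srcTime;
        this.dstThread = dstThread;
        this.dstTime = dstTime;
    }

    @Override
    public String toString() {
        return "T" + srcThread + "@" + srcTime + " -> T" + dstThread + "@" + dstTime;
    }
}

/** Logs an edge whenever a state transition orders one thread before another. */
final class DependenceRecorder {
    private final Map<Integer, Long> clocks = new HashMap<>();
    private final List<HbEdge> edges = new ArrayList<>();

    /** Advances and returns thread t's logical time (one tick per hook call). */
    private long tick(int t) {
        return clocks.merge(t, 1L, Long::sum);
    }

    /** Hook: dstThread's access now depends on an earlier access by srcThread. */
    synchronized void onTransition(int srcThread, int dstThread) {
        long s = tick(srcThread);
        long d = tick(dstThread);
        edges.add(new HbEdge(srcThread, s, dstThread, d));
    }

    /** Returns a copy of the recorded edges in recording order. */
    synchronized List<HbEdge> edges() {
        return new ArrayList<>(edges);
    }
}
```

For the example, the hook would run as `onTransition(1, 2)` when o moves from WrEx_T1 to RdEx_T2 and as `onTransition(2, 3)` when it moves to RdSh_c. How reads of an already read-shared object are ordered is left out of this sketch.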
![Page 29: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/29.jpg)
Implementation in Jikes RVM
• Publicly available: http://jikesrvm.org/Research+Archive
Parallel programs
• DaCapo Benchmarks 2006 & 2009
• SPEC JBB 2000 & 2005
Parallel platform
• 32 cores (AMD Opteron 6272)
![Page 30: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/30.jpg)
[Bar chart. Y-axis: Overhead (%), 0 to 1000. X-axis: eclipse6, hsqldb6, lusea..., xalan6, avrora9, jython9, luindex9, lusea..., pmd9, sunflow9, xalan9, jbb2000, jbb2005, geo. Series: Pessimi... Two off-scale bars are labeled 34,600% and 3,000%.]
![Page 33: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/33.jpg)
[Bar chart. Y-axis: Overhead (%), 0 to 120. X-axis: eclipse6, hsqldb6, lusea..., xalan6, avrora9, jython9, luindex9, lusea..., pmd9, sunflow9, xalan9, jbb2000, jbb2005, geo. Series: Recorder, Octet, Octet w/o coord.]
![Page 34: Octet: Capturing and Controlling Cross-Thread Dependences Efficiently](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56816363550346895dd43515/html5/thumbnails/34.jpg)
Octet helps enable practical runtime support for reliable, scalable concurrency
Framework for runtime support
• Happens-before (HB) edges imply all dependences
• Atomicity of analysis & access
Concurrency control mechanism
• Synchronization only at cross-thread dependences
• Qualitative performance improvement