man cao minjia zhang michael d. bond
DESCRIPTION
Drinking from Both Glasses : Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support. Man Cao Minjia Zhang Michael D. Bond. Dynamic Analyses for Parallel Programs. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/1.jpg)
1
Drinking from Both Glasses: Adaptively Combining Pessimistic and
Optimistic Synchronization for Efficient Parallel Runtime Support
Man CaoMinjia Zhang
Michael D. Bond
![Page 2: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/2.jpg)
2
Dynamic Analyses for Parallel Programs
• Data Race Detector, Record & Replay, Transactional Memory, Deterministic Execution, etc.
• Performance is usually bad!– several times slower
• Fundamental difficulties?
![Page 3: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/3.jpg)
3
Cross-thread dependences
• Crucial for dynamic analyses and systems• Capturing cross-thread dependences– Detecting
e.g. data race detector, dependence recorder
– Controllinge.g. transactional memory, deterministic execution
o.f = …
… = o.f
T1 T2
![Page 4: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/4.jpg)
4
Typical approach
• Per-object metadata (state)– E.g. last writer/reader thread
• At each object access:– Check current state– Analysis-specific action– Update state if needed– Perform the access
Atomically
![Page 5: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/5.jpg)
5
Typical approach
• Per-object metadata (state)– E.g. last writer/reader thread
• At each object access:– Check current state– Analysis-specific action– Update state if needed– Perform the access
Atomically
How to guarantee?
![Page 6: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/6.jpg)
6
Pessimistic Synchronization
• Used by most existing work– Data Race Detector• [FastTrack, Flanagan & Freund, 2009]
– Atomicity Violation Detector• [Velodrome, Flanagan et al., 2008]
– Record & Replay• [Instant Replay, LeBlanc et al., 1987]• [Chimera, Lee et al., 2012]
![Page 7: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/7.jpg)
7
Pessimistic Synchronization
LockMetadata()
![Page 8: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/8.jpg)
8
Pessimistic Synchronization
LockMetadata()
Check and compute new metadata
![Page 9: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/9.jpg)
9
Pessimistic Synchronization
LockMetadata()
Check and compute new metadata
Analysis-specific actions
![Page 10: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/10.jpg)
10
Pessimistic Synchronization
LockMetadata()
Check and compute new metadata
Program access
Analysis-specific actions
![Page 11: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/11.jpg)
11
Pessimistic Synchronization
LockMetadata()
Check and compute new metadata
Program access
UnlockAndUpdateMetadata()
Analysis-specific actions
![Page 12: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/12.jpg)
12
Pessimistic Synchronization
• Synchronization on every access• 6X slowdown on average
LockMetadata()
Check and compute new metadata
Program access
UnlockAndUpdateMetadata()
Analysis-specific actions
![Page 13: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/13.jpg)
13
Optimistic Synchronization
• Used to improve performance– Biased Locking• [Lock Reservation, Kawachiya et al., 2002]• [Bulk Rebiasing , Russell & Detlefs, 2006]
– Distributed Memory System• [Shasta, Scales et al. 1996]
– Framework Support• [Octet, Bond et al. 2013]
![Page 14: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/14.jpg)
14
Optimistic Synchronization
• Avoid synchronization for non-conflicting accesses
• Heavyweight coordination for conflicting accesses
![Page 15: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/15.jpg)
15
Optimistic Synchronization (Cont.)T1 T2
wr o.f
write check
wr o.f
write check
![Page 16: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/16.jpg)
16
Optimistic Synchronization (Cont.)T1 T2
wr o.f
write check
read checkwr o.f
write check
![Page 17: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/17.jpg)
17
Optimistic Synchronization (Cont.)T1 T2
wr o.f
safe point
write check
read check
Analysis-specific action
wr o.f
write check
![Page 18: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/18.jpg)
18
Optimistic Synchronization (Cont.)T1 T2
wr o.f
safe point
write check
read check
Analysis-specific action
change metadata
wr o.f
write check
rd o.f
![Page 19: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/19.jpg)
19
Optimistic Synchronization (Cont.)T1 T2
wr o.f
safe point
write check
read check
Analysis-specific action
change metadata
wr o.f
write check
rd o.f• 26% on average with outliers– Expensive if there are many conflicting accesses
![Page 20: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/20.jpg)
20
Optimistic synchronization performs best if there are few conflicting
accesses.
![Page 21: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/21.jpg)
21
Pessimistic synchronization is cheaper for conflicting accesses.
![Page 22: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/22.jpg)
22
Drink from both glasses?
• Goal:– Optimistic sync. for most non-conflicting accesses– Pessimistic sync. for most conflicting accesses
• Our approach:– Hybrid state model– Adaptive policy– Support for detecting and controlling cross-thread
dependences
![Page 23: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/23.jpg)
23
Adaptive Policy
• Decides when to perform Pess → Opt and Opt → Pess transitions
• Cost—Benefit model– Formulates the problem
• Online profiling– Efficiently collects information and approximates
the Cost-Benefit model
![Page 24: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/24.jpg)
24
Cost—Benefit model
• Compares total time spent in transitions if an object were optimistic or pessimistic– Whichever takes less time is beneficial
• Only relies on numbers (or just the ratio) of non-conflicting and conflicting transitions
![Page 25: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/25.jpg)
25
Evaluation
• Implementation– Jikes RVM 3.1.3
• Parallel programs– DaCapo Benchmarks 2006 & 2009– SPEC JBB 2000 & 2005
• Platform– 32 cores (AMD Opteron 6272)
![Page 26: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/26.jpg)
26
Performance
eclipse
6
hsqldb6
lusearc
h6xa
lan6
avro
ra9
jython9
luindex9
lusearc
h9pmd9
sunflow9
xalan
9
jbb2000
jbb2005
geomean
0
10
20
30
40
50
Pure Pessimistic
Pure Optimistic
Adaptive
Ove
rhea
d (%
)
770 270 326 650 95 270 430 170 100 29,000 2,400 210 120 470
![Page 27: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/27.jpg)
27
Performance
eclipse
6
hsqldb6
lusearc
h6xa
lan6
avro
ra9
jython9
luindex9
lusearc
h9pmd9
sunflow9
xalan
9
jbb2000
jbb2005
geomean
0
10
20
30
40
50
Pure Pessimistic
Pure Optimistic
Adaptive
Ove
rhea
d (%
)
770 270 326 650 95 270 430 170 100 29,000 2,400 210 120 470
![Page 28: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/28.jpg)
28
Performance
eclipse
6
hsqldb6
lusearc
h6xa
lan6
avro
ra9
jython9
luindex9
lusearc
h9pmd9
sunflow9
xalan
9
jbb2000
jbb2005
geomean
0
10
20
30
40
50
Pure Pessimistic
Pure Optimistic
Adaptive
Ove
rhea
d (%
)
770 270 326 650 95 270 430 170 100 29,000 24,00 210 120 470
![Page 29: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/29.jpg)
29
Framework support
• Detecting cross-thread dependences– dependence recorder
• Key challenge– Identify the source location of a happens-before
edge for a pessimistic conflicting transition– Current solution requires acquiring a lock and
writing to remote thread’s log
![Page 30: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/30.jpg)
30
Framework support (Cont.)
• Controlling cross-thread dependences– enforcing Region Serializability (in progress)
• Key challenge – Need to keep locking pessimistic objects until the end
of a region• Possible solution– Defer unlocking of pessimistic objects until program
lock releases• Helps dependence recorder• Simplifies instrumentation
![Page 31: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/31.jpg)
31
Framework support (Cont.)
• Controlling cross-thread dependences– enforcing Region Serializability (in progress)
• Key challenge – Need to keep locking pessimistic objects until the end
of a region• Possible solution– Defer unlocking of pessimistic objects until program
lock releases• Helps dependence recorder• Simplifies instrumentation
![Page 32: Man Cao Minjia Zhang Michael D. Bond](https://reader035.vdocuments.mx/reader035/viewer/2022062718/56812dbd550346895d92fd64/html5/thumbnails/32.jpg)
32
Conclusion & Future work
• Hybrid, adaptive synchronization achieves better performance– never significantly degrades performance– sometimes improves performance substantially
• Future directions– Explore different adaptive policies (e.g. aggregate
profiling)– Reduce instrumentation cost by deferring unlock
operations of pessimistic synchronization– Apply to control cross-thread dependences