Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses
Michael D. Bond
University of Texas at Austin
Graham Z. Baker, Tufts / MIT Lincoln Laboratory
We don't make a lot of the bug detectors you use. We make a lot of the bug detectors you use better.
Samuel Z. Guyer, Tufts University
Example: dynamic data race detector
Thread A: write x (T@A)
Thread A: unlock m
Thread B: lock m
Thread B: write x (T’@B)
Thread A: read x (T’’@A): race!
How is this race reported?
Reporting a race
Thread A: write x (T@A, loc1)
Thread A: unlock m
Thread B: lock m
Thread B: write x (T’@B, loc2)
Thread A: read x (T’’@A, loc3): race!
Locations reported for the two racing accesses:
AbstractDataTreeNode.indexOfChild():426
AbstractDataTreeNode.storeStrings():536
Problem: not much information
Full stack traces
(race between write x at T’@B and read x at T’’@A, as above)

AbstractDataTreeNode.indexOfChild():426
AbstractDataTreeNode.childAtOrNull():212
DeltaDataTree.lookup():666
ElementTree.includes():528
Workspace.getResourceInfo():1135
Resource.getResourceInfo():973
Project.hasNature():479
JavaProject.hasJavaNature():224
JavaProject.computeExpandedClasspath():430
JavaProject.getExpandedClasspath():1444
...
EclipseStarter.run():376
...

AbstractDataTreeNode.storeStrings():536
DataTreeNode.storeStrings():343
AbstractDataTreeNode.storeStrings():541
DataTreeNode.storeStrings():343
...
ElementTree.shareStrings():706
SaveManager.shareStrings():1154
...
StringPoolJob.shareStrings():124
...
Worker.run():76
...
Context sensitivity
Big impact on static analysis
Better information
Better precision
Critical in modern software:
Intensive code reuse (e.g., frameworks)
Many small methods
Highly dynamic behavior
What about dynamic analysis?
How hard is this?
Stack trace at the access where the race is discovered: EASY
Stack trace for previously recorded information (the earlier access): HARD
Challenge
Many events might need context information
e.g., race detector: every read and write (!)
Existing approaches
Walk the stack: up to 100X slowdown
Build calling context tree: 2-3X, plus space
[Figure: a long sequence of events, each labeled "Context"; one is labeled "BUG"]
Goal
Compact representation of calling contexts
Fast correct execution
Print out stack trace when bug detected
Efficient context sensitivity for dynamic bug detectors
Starting point
Probabilistic Calling Context (PCC), Bond and McKinley, OOPSLA 07
✓ Represent a calling context in 1 word: the PCC value
✓ Computed online, low overhead: <5%
✘ BUT, no way to decode a PCC value
With PCC: analysis is context sensitive
Thread A: write x (T@A, pcc1)
Thread A: unlock m
Thread B: lock m
Thread B: write x (T’@B, pcc2)
Thread A: read x (T’’@A, pcc3): race!
PCC values shown: 0xFE9A651B, 0x59C2DF08
How PCC works
At each call site…
Caller m(): … k(); …
Callee k(): … j(); h(); …
current PCC: p
callsite ID: c
new PCC: p’ = f(p, c) = (3p + c) mod 2^32
p = 0 in main()
p = f(…f(f(f(0, c0), c1), c2)…, cn)
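The update rule on this slide can be sketched in a few lines of Python. This is only an illustration of the formula p’ = (3p + c) mod 2^32, not the JikesRVM instrumentation; the names `f` and `pcc_of`, and the small integer call site IDs, are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF  # keep values in one 32-bit word, i.e. arithmetic mod 2^32


def f(p, c):
    """One PCC update: combine the current value p with call site ID c."""
    return (3 * p + c) & MASK32


def pcc_of(callsite_ids):
    """Fold f over a sequence of call site IDs, outermost call first.

    Starts from p = 0, the value in main().
    """
    p = 0
    for c in callsite_ids:
        p = f(p, c)
    return p


# Hypothetical call site IDs; the order of the calls changes the result.
print(hex(pcc_of([1, 2])), hex(pcc_of([2, 1])))
```

Because each step multiplies by 3 before adding the call site ID, the same call sites in a different order give a different value, which is what lets one word distinguish calling contexts.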
Breadcrumbs
Problem: decode PCC value p
Find a sequence of callsite IDs such that
p = f(…f(f(f(0, c0), c1), c2)…, cn)
i.e., invert the hash function
Key: f is invertible (3 and 2^32 are relatively prime)
Given p’ and c, there is a unique p such that p’ = f(p, c)
“Peel off” callsites until we reach 0 (main)
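A minimal sketch of the inversion, assuming the hash f(p, c) = (3p + c) mod 2^32 from the earlier slide. Since 3 and 2^32 are relatively prime, 3 has a multiplicative inverse mod 2^32, so p = (p’ - c) * 3^-1 mod 2^32. The names `f_inv` and `peel` and the call site IDs are hypothetical.

```python
MOD = 1 << 32
INV3 = pow(3, -1, MOD)  # modular inverse of 3 mod 2^32 (needs Python 3.8+)


def f(p, c):
    """Forward PCC update."""
    return (3 * p + c) % MOD


def f_inv(p_new, c):
    """The unique p such that f(p, c) == p_new."""
    return ((p_new - c) * INV3) % MOD


def peel(p, callsite_ids_innermost_first):
    """Peel off call sites one by one, innermost call first."""
    for c in callsite_ids_innermost_first:
        p = f_inv(p, c)
    return p


# Hypothetical call site IDs 7, 11, 13, with 7 being the call in main().
p = f(f(f(0, 7), 11), 13)
print(peel(p, [13, 11, 7]))  # the correct sequence peels back to 0
```

Peeling with a wrong call site ID still produces some 32-bit value, just not the right one, which is why inversion alone does not tell the decoder which caller to pick.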
Decode stack trace bottom-up
Start at bottom of call stack
Use static call graph to determine callers and apply f^-1
Continue until main() and p = 0
g():2: PCC value = 0x5A93CF09
d():9: PCC value = 0x089C3A02 = f^-1(0x5A93CF09, g():2)
a():5: PCC value = 0x59C2DF08 = f^-1(0x089C3A02, d():9)
…
main():44: PCC value = 0x0
Problem: blind search of call graph
[Figure: static call graph with call sites main():44 (PCC value = 0x0), a():5, b():3, c():8, d():9, h():3, j():8, e():5, f():4, g():2, …]
PCC value = 0x5A93CF09
Statically possible contexts >> 2^64
Need more information
Idea: record per-callsite PCC values
Add hash table at each call site
[Figure: the same call graph, with a hash table of PCC values (e.g., 0x089C3A02) at each call site]
Very easy search
[Figure: the same call graph, decoding PCC value = 0x5A93CF09 back to main() (PCC value = 0x0)]
Which caller is the right one?
Invert f to find p: f^-1(0x5A93CF09, g():2) = 0x089C3A02
New p value will be in caller’s hash table (✓ for the right caller, ✘ for the others)
And continue… f^-1(0x089C3A02, d():9) = 0x59C2DF08
Not really searching at all
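The decoding loop with per-callsite sets might look like the sketch below. Everything in it is an assumption made for illustration: the tiny program, the numeric call site IDs, the names `SITES`, `CALLERS`, `RECORDED`, `run`, and `decode`, and the choice to store at each call site the PCC values computed there. It is not the JikesRVM implementation.

```python
MOD = 1 << 32
INV3 = pow(3, -1, MOD)  # modular inverse of 3 mod 2^32 (needs Python 3.8+)


def f(p, c):
    return (3 * p + c) % MOD


def f_inv(p_new, c):
    return ((p_new - c) * INV3) % MOD


# Hypothetical program: call site -> (numeric ID, enclosing method, callee).
SITES = {
    "main():44": (101, "main", "a"),
    "main():45": (102, "main", "b"),
    "a():5": (103, "a", "d"),
    "b():3": (104, "b", "d"),
    "d():9": (105, "d", "g"),
    "g():2": (106, "g", None),  # the program point the bug detector reports
}

# Static call graph: method -> call sites that can call it.
CALLERS = {}
for site, (_, _, callee) in SITES.items():
    if callee is not None:
        CALLERS.setdefault(callee, []).append(site)

# Per-callsite sets: PCC values computed at each site (assumed layout).
RECORDED = {site: set() for site in SITES}


def run(path):
    """Simulate one execution path, outermost call site first."""
    p = 0
    for site in path:
        p = f(p, SITES[site][0])
        RECORDED[site].add(p)
    return p


def decode(p, site):
    """Rebuild the stack trace for PCC value p observed at `site`."""
    trace = [site]
    while True:
        p = f_inv(p, SITES[site][0])       # value on entry to the method
        method = SITES[site][1]
        if method == "main":
            return trace if p == 0 else None
        # Keep only the callers whose set contains the peeled value.
        candidates = [s for s in CALLERS.get(method, []) if p in RECORDED[s]]
        if len(candidates) != 1:
            return None                    # unknown or ambiguous context
        site = candidates[0]
        trace.append(site)


p1 = run(["main():44", "a():5", "d():9", "g():2"])
p2 = run(["main():45", "b():3", "d():9", "g():2"])
print(decode(p1, "g():2"))
print(decode(p2, "g():2"))
```

At `d():9` both `a():5` and `b():3` are statically possible callers, but only one of their sets holds the peeled value, so the decoder follows a single path instead of searching.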
[Chart: % overhead (0 to 160) on JikesRVM, DaCapo benchmarks; bars for antlr, chart, eclipse, fop, hsqldb, jython, luindex, pmd, xalan, jbb, geomean; series: PCC only, With per-callsite sets]
# set ops: antlr 528m, chart 201m, eclipse 857m, fop 21m, hsqldb 158m, jython 3,624m, luindex 217m, pmd 270m, xalan 738m, jbb 137m
Observation
A few call sites account for a huge fraction of cost
Idea: stop tracking hot call sites
Throw out hash table and instrumentation
[Figure: the same call graph, with the hot call sites no longer tracked]
[Chart: % overhead (0 to 160) for antlr, chart, eclipse, fop, hsqldb, jython, luindex, pmd, xalan, jbb, geomean; series: PCC only, t = 100, t = 1,000, t = 10,000, t = 100,000, No threshold]
Tunable “hotness” threshold t
Is it enough information?
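One way to picture the threshold is a per-callsite counter that disables tracking once it passes t. The sketch below assumes hotness is measured by the number of set operations at the call site; the slides do not spell out the exact metric, and the class and field names are hypothetical.

```python
class CallSiteTracker:
    """Per-callsite set of PCC values with a hotness cutoff (illustrative)."""

    def __init__(self, threshold):
        self.threshold = threshold  # the tunable t from the chart
        self.values = set()         # PCC values seen at this call site
        self.ops = 0                # set operations performed so far
        self.hot = False

    def record(self, p):
        if self.hot:
            return                  # instrumentation is effectively gone
        self.ops += 1
        if self.ops > self.threshold:
            # Too hot: throw out the hash table and stop tracking.
            self.hot = True
            self.values = None
            return
        self.values.add(p)


site = CallSiteTracker(threshold=3)
for p in (10, 20, 30, 40, 50):
    site.record(p)
print(site.hot, site.values)
```

A hot call site contributes no information at decode time, so a lower t trades decoding power for lower overhead.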
Decoding: hybrid search
[Figure: the same call graph, with some call sites untracked]
PCC value = 0x5A93CF09
Which caller is the right one?
✓ f^-1(0x5A93CF09, g():2) = 0x089C3A02
No information: must explore both paths
f^-1(0x089C3A02, d():9) = 0x59C2DF08
Heuristic search (see paper)
Sometimes fails to decode a context
[Figure: the same call graph and PCC values as on the previous slide]
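The hybrid idea can be sketched as a plain depth-first search with backtracking: tracked call sites prune by set membership, untracked (hot) call sites are explored blindly. This is not the heuristic search from the paper; the program, the call site IDs, the depth bound, and all names below are assumptions for illustration.

```python
MOD = 1 << 32
INV3 = pow(3, -1, MOD)  # modular inverse of 3 mod 2^32 (needs Python 3.8+)


def f_inv(p_new, c):
    return ((p_new - c) * INV3) % MOD


# Hypothetical program (same shape as a small call graph with two callers of d).
SITE_ID = {"main():44": 101, "main():45": 102, "a():5": 103,
           "b():3": 104, "d():9": 105, "g():2": 106}
ENCLOSING = {"main():44": "main", "main():45": "main", "a():5": "a",
             "b():3": "b", "d():9": "d", "g():2": "g"}
CALLERS = {"a": ["main():44"], "b": ["main():45"],
           "d": ["a():5", "b():3"], "g": ["d():9"]}

# Recorded sets for tracked call sites only; a():5 and b():3 are "hot"
# here, so they have no entry at all.
RECORDED = {"main():44": {101}, "main():45": {102}, "d():9": {1323, 1335}}


def decode(p, site, depth=0, max_depth=64):
    """Return a stack trace (innermost first) or None if decoding fails."""
    if depth > max_depth:
        return None
    p = f_inv(p, SITE_ID[site])
    method = ENCLOSING[site]
    if method == "main":
        return [site] if p == 0 else None
    for caller in CALLERS.get(method, []):
        seen = RECORDED.get(caller)        # None means untracked (hot)
        if seen is not None and p not in seen:
            continue                       # tracked, value absent: prune
        rest = decode(p, caller, depth + 1, max_depth)
        if rest is not None:
            return [site] + rest
    return None


print(decode(4075, "g():2"))
print(decode(4111, "g():2"))
```

In the second call the search first tries `a():5`, finds that the peeled value is not in the set at `main():44`, backtracks, and succeeds through `b():3`. With long runs of hot call sites the number of paths to explore grows quickly, which is where a bounded search can fail to decode a context.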
Race detection results (go to Pacer talk tomorrow!)
[Chart: % overhead (0 to 160) for antlr, chart, eclipse, fop, hsqldb, jython, luindex, pmd, xalan, jbb, geomean; series: t = 100, t = 1,000, t = 10,000, t = 100,000, No threshold]
Percentage labels on the chart, in extraction order:
100% 100% 100% 100% 100%
47% 47% 47% 82% 95%
100% 100% 100% 100% 100%
89% 95% 95% 97% 97%
Summary
Make any dynamic bug detector context sensitive
More in the paper:
Description of search algorithm
What kinds of bug detectors will benefit
Results for two real bug detectors (both quantitative and qualitative)
Available as patch to JikesRVM
Related work
Reconstruct contexts from PC and SP [Mytkowicz et al. 2009] [Inoue and Nakatani 2009]
Very low overhead, but little entropy in these values
Path profiling approach [Sumner et al. 2010]
Uses multiple integers to represent calling context exactly
Both require offline training, pre-computed info
Challenge for complex, highly dynamic software
Thank You
Questions?
Goals
Represent calling context compactly
Easily take place of static program locations
Fast correct execution
For deployed or field-testing environment
Decode back into stack trace when needed
Could be expensive, but cost paid offline
Calling context representation
Started with Probabilistic Calling Context, Bond and McKinley, OOPSLA 07
✓ Calling context stored in 1 word: the PCC value
Essentially a hash of sequence of call site IDs
✓ Computed online, low overhead: <5%
PCC values computed incrementally, at each call site
✘ BUT, no way to decode a PCC value
Can distinguish, but not identify calling contexts
Summary
Make any dynamic bug detector context sensitive
Tunable overhead/precision tradeoff
Sweet spot: 10% to 20% overhead at threshold 1,000 to 10,000
Challenges
Long sequences of hot callsites
Deep recursion
Available as patch to JikesRVM
Tradeoff: cost vs decoding
[Chart: % overhead (0 to 160) for antlr, chart, eclipse, fop, hsqldb, jython, luindex, pmd, xalan, jbb, geomean; series: PCC only, t = 100, t = 1,000, t = 10,000, t = 100,000, No threshold]
Percentage labels on the chart, in extraction order:
100% 100% 100% 100% 100%
47% 47% 47% 82% 95%
100% 100% 100% 100% 100%
89% 95% 95% 97% 97%