Transcript
Page 1: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Learning From Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics

Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou
Appeared in ASPLOS’08

Presented by Michelle Goodstein
LBA Reading Group 3/27/08

Page 2: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Introduction
Multi-core computers are common
More programmers are having to write concurrent programs
Concurrent programs have different bugs than sequential programs
However, without a study, hard to know what those bugs are
First real-world study of concurrency bugs

Page 3: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Introduction
Knowing the types of concurrent bugs that actually occur in software will:
◦ Help create better bug detection schemes
◦ Inform the testing process software goes through
◦ Provide information to programming language designers

Page 4: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Introduction
Current state of affairs
◦ Reproducing concurrent bugs is difficult
◦ Test cases are critical to being able to diagnose a bug
◦ Most detection research focuses on:
    • data races
    • deadlock bugs
    • some new work on detecting atomicity violations
Few studies on real world concurrency bugs
◦ Most use programs that were buggy by design for the study
Most studies on bug characteristics focus on non-concurrent bugs

Page 5: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Methodology
4 representative open-source applications:
◦ MySQL
◦ Apache
◦ Mozilla
◦ OpenOffice
Each application has
◦ 9-13 years of development history
◦ 1-4 million lines of code

Page 6: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Methodology
Randomly selected bugs from bug databases that contained at least one keyword related to concurrency (e.g. “race”, “concurrency”, “deadlock”, “synchronization”)
From these, randomly choose 500 bugs that have
◦ Root causes explained well and in detail
◦ Source code available
◦ Bug fix info available
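The keyword filtering step can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: `matches_keyword` is a hypothetical helper, the `reports` strings are invented placeholders, and only the four keywords quoted on the slide are used.

```python
# Keywords quoted on the slide; the study's full keyword list is not given here.
KEYWORDS = ("race", "concurrency", "deadlock", "synchronization")

def matches_keyword(report_text, keywords=KEYWORDS):
    """Return True if the report mentions at least one keyword (case-insensitive)."""
    text = report_text.lower()
    return any(k in text for k in keywords)

# Hypothetical report summaries, invented purely for illustration.
reports = [
    "Crash caused by a data race in cache cleanup",
    "Typo in installer dialog",
    "Server hangs: possible DEADLOCK between two mutexes",
]
candidates = [r for r in reports if matches_keyword(r)]
```

A plain substring match like this would also accept unrelated words such as "trace", which is consistent with the methodology needing a later manual pass over the candidates.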

Page 7: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Methodology
Remove any bugs not truly caused by concurrency
Result: 105 concurrency bugs
Separate study of deadlock and non-deadlock bugs

Page 8: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Methodology
Evaluated bugs in 3 dimensions
◦ Bug pattern: {atomicity-violation, order-violation, other}
◦ Manifestation: required conditions for bug to occur, # threads involved, # variables, # accesses
◦ Bug fix strategy: Look at final patch, mistakes in intermediate patches, and whether transactional memory (TM) can help
Results organized as a collection of findings

Page 9: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Motivation
34/105 concurrency bugs cause program crashes
37/105 concurrency bugs cause programs to hang
Concurrency bugs are important

Page 10: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Bug Patterns

Page 11: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Patterns
Atomicity Violation
Order Violation
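The two patterns named on this slide can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions, not an example taken from the paper: threads are modeled as named steps run in an explicit order, and every name (`run`, `t1_check`, `t2_clear`, and so on) is hypothetical.

```python
def run(schedule, steps, state):
    """Apply the named steps to a shared state dict in the given order."""
    for name in schedule:
        steps[name](state)
    return state

# Atomicity violation: thread 1 intends check-then-use to execute as one unit.
def t1_check(s):
    s["checked_ok"] = s["ptr"] is not None

def t1_use(s):
    # Using a None pointer after a successful check models the failure.
    s["failed"] = s["checked_ok"] and s["ptr"] is None

def t2_clear(s):
    s["ptr"] = None

atomicity_steps = {"t1_check": t1_check, "t1_use": t1_use, "t2_clear": t2_clear}
good = run(["t1_check", "t1_use", "t2_clear"], atomicity_steps, {"ptr": "buf"})
bad = run(["t1_check", "t2_clear", "t1_use"], atomicity_steps, {"ptr": "buf"})

# Order violation: thread 2 assumes thread 1's initialization already ran.
def t1_init(s):
    s["handle"] = "initialized"

def t2_read(s):
    s["seen"] = s.get("handle")

order_steps = {"t1_init": t1_init, "t2_read": t2_read}
intended = run(["t1_init", "t2_read"], order_steps, {})
violated = run(["t2_read", "t1_init"], order_steps, {})
```

In the atomicity case the bug appears only when another thread's write lands between the check and the use. In the order case it appears whenever the read runs before the initialization, regardless of whether the individual accesses are atomic.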

Page 12: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Patterns
Most (72/74) of the examined non-deadlock concurrency bugs are either atomicity-violations or order-violations
Focusing on atomicity and order-violations should detect most non-deadlock concurrency bugs
In fact, 24/74 are order violations
Since current tools don’t address order-violation, new tools must be developed

Page 13: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Bug Manifestations

Page 14: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Manifestations
Most (101/105) bugs involved ≤ 2 threads
• Most communication among a small number of threads
• Enforcing certain partial orderings among a small number of threads can expose bugs
• Heavy workloads can increase competition for resources, and make it more likely to observe a partial ordering that causes a bug
Pairwise Testing can find many bugs

Page 15: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Manifestations
Some (7/31) deadlock bugs involve only 1 thread!
Easy to detect/avoid
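A single-thread deadlock typically means a thread waits on a resource it already holds. The sketch below shows this with Python's standard `threading` module; it is an illustration rather than a case from the study, and `second_acquire_succeeds` is a hypothetical helper. A non-blocking second acquire is used so the example reports the problem instead of hanging.

```python
import threading

def second_acquire_succeeds(lock):
    """Acquire `lock`, then try to acquire it again from the same thread.

    Returns False where a blocking second acquire would wait forever.
    """
    lock.acquire()
    try:
        got = lock.acquire(blocking=False)  # non-blocking probe instead of hanging
        if got:
            lock.release()                  # undo the nested acquire
        return got
    finally:
        lock.release()

plain_result = second_acquire_succeeds(threading.Lock())       # non-reentrant lock
reentrant_result = second_acquire_succeeds(threading.RLock())  # reentrant lock
```

A reentrant lock is one common way to avoid this particular self-deadlock; the slide itself does not say which remedy the studied applications used.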

Page 16: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Manifestations
Many (49/74) non-deadlock bugs involve 1 variable. However, 34% involve ≥ 2 variables
Focusing on 1 variable is a good simplification
However, new tools also necessary to discover multivariable concurrency bugs
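A multi-variable bug can be sketched as two related variables that must change together. The example below is an assumption-laden toy, not drawn from the studied applications: a buffer and its length are updated in two steps, and a reader may observe them in between. All names (`run`, `w_buf`, `w_len`, `consistent`) are hypothetical.

```python
def run(schedule):
    """Execute steps in order over two related variables: a buffer and its length."""
    state = {"buf": "abc", "length": 3, "observed": None}
    steps = {
        "w_buf": lambda s: s.update(buf="abcdef"),  # writer updates the data...
        "w_len": lambda s: s.update(length=6),      # ...then its length
        "read": lambda s: s.update(observed=(s["buf"], s["length"])),
    }
    for name in schedule:
        steps[name](state)
    return state["observed"]

def consistent(observed):
    """The invariant tying the two variables together."""
    buf, length = observed
    return len(buf) == length

before = run(["read", "w_buf", "w_len"])
after = run(["w_buf", "w_len", "read"])
between = run(["w_buf", "read", "w_len"])
```

Each variable taken alone sees just one write and one read here, so a check that reasons about a single variable at a time has nothing suspicious to flag. The problem shows up only in the relationship between the two.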

Page 17: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Manifestations
Most (30/31) deadlock bugs involved ≤ 2 resources
Pairwise testing of order among obtained and released resources should help reveal deadlocks
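One way to read "pairwise testing of order" is to compare, for every pair of resources, the order in which different threads acquire them. The sketch below assumes recorded per-thread traces of acquire and release events. The trace format and the helper names are hypothetical, and this is not the paper's tool.

```python
from itertools import combinations

def held_before_pairs(events):
    """Return (held, requested) pairs for one thread's ("acq"|"rel", name) events."""
    held, pairs = [], set()
    for op, name in events:
        if op == "acq":
            pairs.update((h, name) for h in held)  # name requested while h is held
            held.append(name)
        else:
            held.remove(name)
    return pairs

def conflicting_pairs(traces):
    """Find resource pairs that two threads acquire in opposite orders."""
    per_thread = {t: held_before_pairs(ev) for t, ev in traces.items()}
    conflicts = set()
    for (_, p1), (_, p2) in combinations(per_thread.items(), 2):
        for a, b in p1:
            if (b, a) in p2:
                conflicts.add(frozenset((a, b)))
    return conflicts

risky = conflicting_pairs({
    "T1": [("acq", "A"), ("acq", "B"), ("rel", "B"), ("rel", "A")],
    "T2": [("acq", "B"), ("acq", "A"), ("rel", "A"), ("rel", "B")],
})
safe = conflicting_pairs({
    "T1": [("acq", "A"), ("acq", "B"), ("rel", "B"), ("rel", "A")],
    "T2": [("acq", "A"), ("acq", "B"), ("rel", "B"), ("rel", "A")],
})
```

An opposite-order pair indicates a potential deadlock, not a guaranteed one, since the two threads must also overlap in time for it to manifest.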

Page 18: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Manifestations
Most (92%) bugs manifest if certain partial orderings among ≤ 4 memory accesses are enforced
Testing small groups of accesses will be polynomial time and expose most bugs
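To make the "small groups of accesses" idea concrete, the sketch below enumerates every ordering of four accesses (a read and a write from each of two threads) and checks which orderings lose an update. It is a toy model chosen for illustration. The counter example and the names `interleavings` and `run` are assumptions, not the study's test harness.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's internal order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Each thread reads the shared counter, then writes back its value plus one."""
    counter, local = 0, {}
    for tid, op in schedule:
        if op == "read":
            local[tid] = counter
        else:
            counter = local[tid] + 1
    return counter

t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
all_schedules = list(interleavings(t1, t2))
lost_updates = [s for s in all_schedules if run(s) != 2]
```

With the group size fixed at four accesses there are only six orderings to try per group, so the overall cost depends on how many such groups a tester chooses to examine.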

Page 19: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Bug Fixes

Page 20: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Fixes
Adding/changing locks is the fix for only a minority (20/74) of non-deadlock concurrency bugs
Locks aren’t enough to fix all concurrency bugs.
Locks don’t promise ordering, just atomicity
Addition of locks can hurt performance or create new deadlock bugs
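The point that locks give atomicity but not ordering can be sketched with Python's standard `threading` module. This is an illustrative example, not a fix taken from the studied patches. The names `produce`, `consume_locked_only` and `consume_ordered` are hypothetical, and `threading.Event` stands in for whatever ordering mechanism a real fix would use.

```python
import threading

lock = threading.Lock()
ready = threading.Event()
shared = {"value": None}

def produce():
    with lock:
        shared["value"] = 42
    ready.set()  # signal that initialization has happened

def consume_locked_only(out):
    with lock:  # mutual exclusion only; says nothing about who goes first
        out.append(shared["value"])

def consume_ordered(out):
    ready.wait()  # blocks until produce() has signalled
    with lock:
        out.append(shared["value"])

# Lock only: if the consumer happens to run first, it reads the uninitialized value.
unordered = []
consume_locked_only(unordered)

# Lock plus event: the consumer is started first but still waits for the producer.
ordered = []
consumer = threading.Thread(target=consume_ordered, args=(ordered,))
producer = threading.Thread(target=produce)
consumer.start()
producer.start()
consumer.join()
producer.join()
```

Both consumers take the lock correctly. Only the second one also enforces that the write happens before the read.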

Page 21: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug Fixes
Most common fix (19/31) to deadlock bugs lets 1 thread give up acquiring a resource, such as a lock
This may get rid of deadlock bugs, but create other non-deadlock bugs
Code may no longer be correct

Page 22: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Bug fixes: Buggy Patches
17/57 Mozilla bugs have ≥ 1 buggy patch
On average, release 0.4 buggy patches for every final correct patch
Of 23 distinct buggy patches for the 17 bugs:
◦ 6 decrease probability of occurrence but do not eliminate original bug
◦ 5 create new concurrency bugs
◦ 12 create new non-concurrency bugs
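The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that each of the 57 Mozilla bugs has exactly one final correct patch, which would make the per-patch average simply 23 divided by 57. That reading is an inference from the slide, not something it states.

```python
# Breakdown of the 23 distinct buggy patches, as listed on the slide.
buggy_by_kind = {
    "reduces probability only": 6,
    "creates new concurrency bug": 5,
    "creates new non-concurrency bug": 12,
}
total_buggy = sum(buggy_by_kind.values())

# Assumption: one final correct patch per examined Mozilla bug.
bugs_examined = 57
buggy_per_correct_patch = total_buggy / bugs_examined

# Share of examined bugs that had at least one buggy patch (17 of 57).
share_with_buggy_patch = 17 / bugs_examined
```

Under that assumption the three categories sum to the 23 patches stated, and the ratio rounds to the reported average of roughly 0.4.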

Page 23: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug fixes
In many (41/105) cases, TM can help avoid concurrency bugs

Page 24: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug fixes
Also in many cases (44/105), TM might be able to help with concurrency bugs
◦ Need to allow long regions, rollback of I/O, strange “nature” of the code

Page 25: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Findings: Bug fixes
In 20/105 cases, TM provides little help
◦ TM cannot help with many order-violation bugs
While TM could be useful in preventing concurrency bugs, it will not fix all of them

Page 26: Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Conclusion
First real-world concurrent bug study
Multiple findings on
◦ Type of concurrency bugs
◦ Conditions for manifestation
◦ Techniques for fixing concurrent bugs
Several heuristics proposed for:
◦ Bug detection
◦ Testing
◦ Language Design (i.e., TM)
Future work can focus on detecting common types of errors
◦ Multi-variable bugs
◦ Order violation bugs
◦ Multiple-access bugs

