Learning From Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics


Post on 30-Dec-2015







  • Learning From Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics
      • Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou
      • Appeared in ASPLOS '08
      • Presented by Michelle Goodstein, LBA Reading Group, 3/27/08

  • Introduction
      • Multi-core computers are common
      • More programmers are having to write concurrent programs
      • Concurrent programs have different bugs than sequential programs
      • However, without a study, it is hard to know what those bugs are
      • First real-world study of concurrency bugs

  • Introduction
      • Knowing the types of concurrent bugs that actually occur in software will:
          • Help create better bug detection schemes
          • Inform the testing process software goes through
          • Provide information to programming language designers

  • Introduction: current state of affairs
      • Reproducing concurrent bugs is difficult
      • Test cases are critical to being able to diagnose a bug
      • Most detection research focuses on:
          • Data races
          • Deadlock bugs
          • Some new work on detecting atomicity violations
      • Few studies on real world concurrency bugs
          • Most use programs that were buggy by design for the study
      • Most studies on bug characteristics focus on non-concurrent bugs

  • Methodology
      • 4 representative open-source applications:
          • MySQL
          • Apache
          • Mozilla
          • OpenOffice
      • Each application has 9-13 years of development history and 1-4 million lines of code

  • Methodology
      • Randomly selected bugs from bug databases that contained at least one keyword related to concurrency (e.g., race, concurrency, deadlock, synchronization)
      • From these, randomly chose 500 bugs that have:
          • Root causes explained well and in detail
          • Source code available
          • Bug fix info available

  • Methodology
      • Remove any bugs not truly caused by concurrency
      • Result: 105 concurrency bugs
      • Separate study of deadlock and non-deadlock bugs

  • Methodology
      • Evaluated bugs in 3 dimensions:
          • Bug pattern: {atomicity-violation, order-violation, other}
          • Manifestation: required conditions for bug to occur, # threads involved, # variables, # accesses
          • Bug fix strategy: look at final patch, mistakes in intermediate patches, and whether transactional memory (TM) can help
      • Results organized as a collection of findings

  • Motivation
      • 34/105 concurrency bugs cause program crashes
      • 37/105 concurrency bugs cause programs to hang
      • Concurrency bugs are important

  • Bug Patterns

  • Findings: Bug Patterns
      • Atomicity Violation
      • Order Violation
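To make the first pattern concrete, here is a minimal sketch of an atomicity violation, assuming a check-then-use sequence that the programmer intended to run without interruption. The field name `info` and the step names are hypothetical, and the interleaving is replayed from a fixed schedule instead of using real threads so the outcome is repeatable.

```python
def replay_check_then_use(schedule):
    """Replay a fixed interleaving of a checking thread and a clearing thread."""
    shared = {"info": "running"}     # hypothetical shared field
    passed_check = False
    for step in schedule:
        if step == "check":          # thread 1: if info is not None
            passed_check = shared["info"] is not None
        elif step == "clear":        # thread 2: info = None
            shared["info"] = None
        elif step == "use":          # thread 1: use info, assuming the check still holds
            if not passed_check:
                return "skipped"
            if shared["info"] is None:
                return "crash: info became None between check and use"
            return shared["info"].upper()
    return "skipped"


print(replay_check_then_use(["check", "use", "clear"]))   # RUNNING
print(replay_check_then_use(["check", "clear", "use"]))   # crash: ...
```

The second schedule breaks the assumption that nothing runs between the check and the use, which is what makes it an atomicity violation.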

  • Findings: Bug Patterns
      • Most (72/74) of the examined non-deadlock concurrency bugs are either atomicity-violations or order-violations
          • Focusing on atomicity and order-violations should detect most non-deadlock concurrency bugs
      • In fact, 24/74 are order violations
          • Since current tools don't address order-violations, new tools must be developed
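An order violation can be sketched the same way, assuming one thread is expected to initialize a shared object before another thread uses it. The step names `init` and `use` are illustrative only, and the schedule is replayed sequentially to keep the result deterministic.

```python
def replay_init_use(schedule):
    """Replay a fixed ordering of an initializing thread and a using thread."""
    table = None
    for step in schedule:
        if step == "init":           # thread A: create the shared table
            table = {}
        elif step == "use":          # thread B: assumes thread A already ran
            if table is None:
                return "crash: table used before initialization"
            table["key"] = 1
    return "ok"


print(replay_init_use(["init", "use"]))   # ok
print(replay_init_use(["use", "init"]))   # crash: table used before initialization
```

No two accesses here need to be atomic together; the bug is only that the intended order between the two threads is not enforced.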

  • Bug Manifestations

  • Findings: Bug Manifestations
      • Most (101/105) bugs involved at most 2 threads
          • Most communication happens among a small number of threads
          • Enforcing certain partial orderings among a small number of threads can expose bugs
      • Heavy workloads can increase competition for resources, and make it more likely to observe a partial ordering that causes a bug
      • Pairwise testing can find many bugs
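A minimal sketch of what pairwise testing could look like, assuming each thread is modeled as a short list of steps: enumerate every interleaving of two threads and keep the ones that break an expected result. The unsynchronized counter increment and all names are hypothetical and not taken from the studied applications.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def final_counter(schedule):
    """Each thread reads the counter, then writes back the value it read plus 1."""
    shared = 0
    seen = {}
    for thread, step in schedule:
        if step == "read":
            seen[thread] = shared
        else:                        # "write"
            shared = seen[thread] + 1
    return shared


t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
schedules = list(interleavings(t1, t2))
failing = [s for s in schedules if final_counter(s) != 2]
print(len(schedules), len(failing))   # 6 4
```

With only two threads and two steps each, the whole space is six schedules, so every failing order can be checked exhaustively.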

  • Findings: Bug Manifestations
      • Some (7/31) deadlock bugs occur with only 1 thread!
          • Easy to detect/avoid
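A single-thread deadlock can be sketched as one thread trying to re-acquire a non-reentrant lock it already holds. This uses Python's standard `threading` locks as a stand-in; the helper `reacquire` is hypothetical, and the timeout only replaces the hang so the sketch terminates.

```python
import threading


def reacquire(lock):
    """Take the lock, then try to take it again from the same thread."""
    lock.acquire()
    try:
        # With a plain Lock this second acquire would block forever;
        # the timeout turns the hang into a False return value.
        got_it = lock.acquire(timeout=0.1)
        if got_it:
            lock.release()
        return got_it
    finally:
        lock.release()


print(reacquire(threading.Lock()))    # False: would deadlock without the timeout
print(reacquire(threading.RLock()))   # True: a reentrant lock allows it
```

This is one reason such bugs are easy to detect: the offending acquire can be spotted from a single thread's own lock history.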

  • Findings: Bug Manifestations
      • Many (49/74) non-deadlock bugs involve only 1 variable. However, 34% involve at least 2 variables
          • Focusing on 1 variable is a good simplification
          • However, new tools are also necessary to discover multi-variable concurrency bugs
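A minimal sketch of a multi-variable bug, assuming two hypothetical variables (`items` and `count`) that must change together. Each single access is harmless on its own; the problem only shows up when another thread observes the pair between the two updates.

```python
def replay_pair_update(schedule):
    """Replay a writer that updates two related variables and a reader that checks them."""
    state = {"items": [], "count": 0}    # invariant: count == len(items)
    observed = []
    for step in schedule:
        if step == "append":             # writer, first half of the update
            state["items"].append("x")
        elif step == "bump":             # writer, second half of the update
            state["count"] += 1
        elif step == "check":            # reader: does the invariant hold right now?
            observed.append(state["count"] == len(state["items"]))
    return observed


print(replay_pair_update(["append", "bump", "check"]))   # [True]
print(replay_pair_update(["append", "check", "bump"]))   # [False]
```

A detector that watches one variable at a time would see nothing wrong in either schedule, which is why multi-variable bugs call for different tools.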

  • Findings: Bug Manifestations
      • Most (30/31) deadlock bugs involved at most 2 resources
          • Pairwise testing of the order in which resources are obtained and released should help reveal deadlocks
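One way to sketch the pairwise idea for deadlocks is to record the order in which each thread acquires resources and flag any pair taken in opposite orders by different threads. The thread and lock names are hypothetical, and the sketch assumes each thread still holds its earlier resources when it takes a later one.

```python
from itertools import combinations


def conflicting_pairs(acquisition_orders):
    """Return resource pairs that one thread takes as (a, b) and another as (b, a)."""
    held_before = set()
    for order in acquisition_orders.values():
        # combinations keeps input order, so each pair is (earlier, later)
        held_before.update(combinations(order, 2))
    return sorted((a, b) for (a, b) in held_before
                  if a < b and (b, a) in held_before)


risky = {"T1": ["A", "B"], "T2": ["B", "A"]}
safe = {"T1": ["A", "B"], "T2": ["A", "B"]}
print(conflicting_pairs(risky))   # [('A', 'B')]
print(conflicting_pairs(safe))    # []
```

Because the check only looks at pairs of resources, it matches the observation that nearly all studied deadlocks involve very few resources.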

  • Findings: Bug Manifestations
      • Most (92%) bugs manifested if certain partial orderings were enforced among at most 4 memory accesses
          • Testing small groups of accesses takes polynomial time and exposes most bugs
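A rough count shows why small groups keep testing tractable. The sketch assumes each thread performs a fixed sequence of accesses; the function names are illustrative. Exploring every full interleaving grows combinatorially, while ordering every group of 4 accesses out of n is bounded by n(n-1)(n-2)(n-3), which is at most n^4.

```python
from math import comb, factorial


def full_interleavings(threads, accesses_per_thread):
    """Number of ways to interleave the threads' fixed access sequences."""
    total = threads * accesses_per_thread
    return factorial(total) // factorial(accesses_per_thread) ** threads


def small_group_orders(total_accesses, group_size=4):
    """Upper bound: pick a group of accesses, then try every order of that group."""
    return comb(total_accesses, group_size) * factorial(group_size)


# Two threads with 20 accesses each, so 40 accesses in total.
print(full_interleavings(2, 20))
print(small_group_orders(40))   # 2193360
```

The second number is an upper bound, since many of those orders are ruled out by each thread's own program order.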

  • Bug Fixes

  • Findings: Bug Fixes
      • Adding/changing locks accounts for only a minority (20/74) of non-deadlock concurrency bug fixes
          • Locks aren't enough to fix all concurrency bugs
          • Locks don't promise ordering, just atomicity
          • Adding locks can hurt performance or create new deadlock bugs
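To illustrate why ordering needs something other than a lock, here is a minimal sketch that uses a `threading.Event` to force "initialize before use". A lock around each access would make each access atomic but would still let the user thread run first. The names `config`, `user` and `initializer` are hypothetical.

```python
import threading


def run_with_event():
    state = {"config": None}
    ready = threading.Event()
    result = []

    def user():
        ready.wait()                          # block until initialization is done
        result.append(state["config"]["mode"])

    def initializer():
        state["config"] = {"mode": "on"}
        ready.set()                           # signal that config is safe to read

    t_user = threading.Thread(target=user)
    t_init = threading.Thread(target=initializer)
    t_user.start()   # started first on purpose: the event, not start order, enforces ordering
    t_init.start()
    t_user.join()
    t_init.join()
    return result


print(run_with_event())   # ['on']
```

This is only one possible ordering primitive; condition variables or joins would serve the same purpose.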

  • Findings: Bug Fixes
      • The most common fix (19/31) for deadlock bugs lets 1 thread give up acquiring a resource, like a lock
          • This may get rid of deadlock bugs, but create other non-deadlock bugs
          • Code may no longer be correct
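The "give up" style of fix can be sketched with a non-blocking acquire, assuming two hypothetical locks `lock_a` and `lock_b`. The thread never waits while holding `lock_a`, so it cannot take part in a deadlock cycle, but the work it meant to do is silently skipped when `lock_b` is busy.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()


def update_or_give_up():
    """Take lock_a, then try lock_b without blocking; back off if it is busy."""
    with lock_a:
        if not lock_b.acquire(blocking=False):
            return False        # gave up: no deadlock, but the update did not happen
        try:
            return True         # both resources held: the real work would go here
        finally:
            lock_b.release()


print(update_or_give_up())      # True: lock_b was free
lock_b.acquire()                # simulate another holder of lock_b
print(update_or_give_up())      # False: backed off instead of waiting
lock_b.release()
```

Whether skipping the work is acceptable depends on the caller, which is the correctness risk the slide points out.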

  • Bug Fixes: Buggy Patches
      • 17/57 Mozilla bugs have at least 1 buggy patch
      • On average, 0.4 buggy patches are released for every final correct patch
      • Of 23 distinct buggy patches for the 17 bugs:
          • 6 decrease probability of occurrence but do not eliminate the original bug
          • 5 create new concurrency bugs
          • 12 create new non-concurrency bugs
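The numbers on this slide can be cross-checked with a little arithmetic. One reading of the 0.4 figure, assumed here and not stated on the slide, is the 23 buggy patches divided by the 57 Mozilla bugs. The dictionary keys are just labels for the three categories.

```python
buggy_patches = {
    "reduces_but_keeps_original_bug": 6,
    "creates_new_concurrency_bug": 5,
    "creates_new_non_concurrency_bug": 12,
}

total_buggy = sum(buggy_patches.values())
ratio = total_buggy / 57          # assumed: buggy patches per Mozilla bug

print(total_buggy)                # 23
print(round(ratio, 2))            # 0.4
```

The categories add up to the 23 patches quoted, and 23 patches spread over 17 bugs means some bugs received more than one buggy patch.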

  • Findings: Bug Fixes
      • In many (41/105) cases, TM can help avoid concurrency bugs

  • Findings: Bug Fixes
      • Also in many cases (44/105), TM might be able to help with concurrency bugs
          • TM would need to allow long regions and rollback of I/O, and cope with the strange nature of the code

  • Findings: Bug Fixes
      • In 20/105 cases, TM provides little help
          • TM cannot help with many order-violation bugs
      • While TM could be useful in preventing concurrency bugs, it will not fix all of them

  • Conclusion
      • First real-world concurrent bug study
      • Multiple findings on:
          • Types of concurrency bugs
          • Conditions for manifestation
          • Techniques for fixing concurrent bugs
      • Several heuristics proposed for:
          • Bug detection
          • Testing
          • Language design (i.e., TM)
      • Future work can focus on detecting common types of errors:
          • Multi-variable bugs
          • Order-violation bugs
          • Multiple-access bugs


