ali kheradmand, baris kasikci, george candea lockout: efficient testing for deadlock bugs 1

20
i Kheradmand, Baris Kasikci , George Candea Lockout: Efficient Testing for Deadlock Bugs 1

Upload: branden-norman

Post on 17-Jan-2016

223 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

Ali Kheradmand, Baris Kasikci, George Candea

Lockout: Efficient Testing for Deadlock Bugs

1

Page 2: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

2

Deadlock

• Set of threads • Each holding a lock needed by another thread• Waiting for another lock to be released by some

other thread

Page 3: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

3

Why Do Deadlocks Matter?

• Common in modern software• Hard to detect manually• Occur rarely during execution

Chrome Apache Eclipse LimeWire

Page 4: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

4

Deadlock Detection

• Traditional testing– Deadlocks manifest rarely ✗

• Static detection– Fast (run offline) ✓– Few false negatives ✓– Many false positives ✗

• Dynamic detection– Slow (high runtime overhead) ✗– Many false negatives ✗– Few false positives ✓

Page 5: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

5

Best of Two Worlds

• Normal tests can’t discover enough deadlocks• Deadlock avoidance or fixing tools tools

(Dimmunix [OSDI’08]) take a long time• Need to find the schedules that lead to a deadlock

• How to increase the probability of encountering a deadlock?

Steer the program towards schedules that are likely to cause a deadlock

Page 6: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

6

Lockout

• Systematic deadlock testing• Increases deadlock probability• By steering the scheduling

• Leverages past program executions

Trigger more deadlocks with the same test suite

Page 7: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

7

Outline

• Lockout architecture• Deadlock triggering algorithm• Preliminary results• Summary and future work

Page 8: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

8

Lockout Architecture

cout << “Hello, World!\n”;

Instrumentation01001001010101010101001001101010101

Static Phase

Source Code

Instrumented Binary

StaticAnalysis

RaceMob [SOSP’13]

Page 9: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

Lockout ArchitectureDynamic Phase

Runtime Previous executions info

Subsequent executions

InstrumentedProgram

9

End of execution

Scheduleperturbation

Events

DeadlockFuzzer [PLDI’09]

Page 10: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

10

Outline

• Lockout architecture• Deadlock triggering algorithm• Preliminary results• Summary and future work

Page 11: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

11

Runtime Lock Order Graph (RLG)

a

b

c

locklock(a)…lock(b)…lock(c)

Potential deadlock

Thread 1

Lock acquisition order

Page 12: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

12

Deadlock Triggering

• Selects a directed cycle in RLG• Delays threads accordingly• Improves simple preemption (CHESS [OSDI’08])• Preemption before each lock

d a

bc

lock(a)

lock(b)

Page 13: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

13

Runtime

Deadlock Triggering

Thread 1lock(a)lock(b)…unlock(b)unlock(a)

Thread 2

lock(b)lock(c)…unlock(c)unlock(b)

Time Thread 3

lock(c)lock(a)…unlock(a)unlock(c)

a

c b

Page 14: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

14

RT()lock(b)RT()

lock(c)

RT()lock(c)RT()

lock(a)

Before loc

k (a)

Thread 1RT()lock(a)RT()

lock(b)

Thread 2Time Thread 3

a

c b

Runtime

Before lock (a)

ContinueBefore lock (b)

Delay Before lock (c)

Continue

Delay

Deadlock Triggering

Before lock (b

)

ContinueBefo

re lock

(c)

Delay

Page 15: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

15

Race Dependent Deadlocks

• Preempt before memory accesses• Ideally only shared memory accesses

• Can be approximated by preempting after locks

• Can be improved using static analysis• Can be improved using data race detection

Page 16: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

16

Outline

• Lockout architecture• Deadlock triggering algorithm• Preliminary results• Summary and future work

Page 17: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

17

Lockout EffectivenessProgram Fraction of executions resulting in deadlock (%)

Native Simple preemption+Post-lockpreemption

Deadlock triggering+ Pre-memory accesspreemption

Deadlock triggering+Post-lockpreemption

Microbench 0.00066 % 0.5 % 50 % 50 %

SQLite 3.3.0 0.00064 % 4 % 50 % 50 %

HawkNL 1.6b3 23 % 64 % 50 % 50 %

Pbzip2 1.1.6 0 % 0 % 0 % 0 %

Httpd 2.0.65 0 % 0 % 0 % 0 %

Fraction of executions with deadlocksincreased up to three orders of magnitude

Page 18: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

18

Outline

• Lockout architecture• Deadlock triggering algorithm• Preliminary results• Summary and future work

Page 19: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

19

Lockout

• Increases deadlock probability• Leverages past program executions• Effective• Up to 3 orders of magnitude more deadlock prone

• Open source: • https://github.com/dslab-epfl/lockout

Page 20: Ali Kheradmand, Baris Kasikci, George Candea Lockout: Efficient Testing for Deadlock Bugs 1

20

Future Work

• Increasing effectiveness with low overhead• Static analysis (data races, shared variables)

• Lockout + Automatic failure fixing/avoidance (Dimmunix [OSDI’08], CFix [OSDI’12], Aviso [ASPLOS’13])• In production• For testing

• Crowdsourcing (Aviso [ASPLOS’13], RaceMob [SOSP’13])