tolerating and correcting memory errors in c and c++

50
Tolerating and Correcting Memory Errors in C and C++ Ben Zorn Microsoft Research In collaboration with: Emery Berger and Gene Novark, Umass - Amherst Karthik Pattabiraman, UIUC Vinod Grover and Ted Hart, Microsoft Research Ben Zorn, Microsoft Research 1 Tolerating and Correcting Memory Errors in C and C++

Upload: bruce-medina

Post on 03-Jan-2016

52 views

Category:

Documents


5 download

DESCRIPTION

Tolerating and Correcting Memory Errors in C and C++. Ben Zorn Microsoft Research In collaboration with: Emery Berger and Gene Novark, Umass - Amherst Karthik Pattabiraman, UIUC Vinod Grover and Ted Hart, Microsoft Research. Focus on Heap Memory Errors. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Tolerating and Correcting Memory Errors in C and C++

Tolerating and Correcting Memory Errors in C and C++

Ben ZornMicrosoft Research

In collaboration with:Emery Berger and Gene Novark, Umass - Amherst

Karthik Pattabiraman, UIUCVinod Grover and Ted Hart, Microsoft Research

Ben Zorn, Microsoft Research 1

Tolerating and Correcting Memory Errors in C and C++

Page 2: Tolerating and Correcting Memory Errors in C and C++

Buffer overflow

char *c = malloc(100);c[101] = ‘a’;

Dangling reference

char *p1 = malloc(100);char *p2 = p1;

free(p1);p2[0] = ‘x’;

a

Focus on Heap Memory Errors

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 2

c

0 99

p1

0 99

p2

x

Page 3: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Approaches to Memory Corruptions Rewrite in a safe language

Static analysis / safe subset of C or C++ SAFECode [Adve], etc.

Runtime detection, fail fast Jones & Lin, CRED [Lam], CCured [Necula], etc. Debugging possible, security advantage?

Runtime toleration Failure oblivious [Rinard] (unsound) Rx, Boundless Memory Blocks, ECC memory

DieHard / Exterminator, Samurai

3Tolerating and Correcting Memory Errors

in C and C++

Page 4: Tolerating and Correcting Memory Errors in C and C++

Fault Tolerance and Platforms Platforms necessary in computing ecosystem

Extensible frameworks provide lattice for 3rd parties Tremendously successful business model Examples numerous: Window, iPod, browser, etc.

Platform power derives from extensibility Tension between isolation for fault tolerance,

integration for functionality Platform only as reliable as weakest plug-in Tolerating bad plug-ins necessary by design

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 4

Page 5: Tolerating and Correcting Memory Errors in C and C++

Research Vision Increase robustness of installed code base

Potentially improve millions of lines of code Minimize effort – ideally no source mods, no

recompilation Reduce requirement to patch

Patches are expensive (detect, write, deploy) Patches may introduce new errors

Enable trading resources for robustness E.g., more memory implies higher reliability

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 5

Page 6: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Outline Motivation Exterminator

Collaboration with Emery Berger, Gene Novark Automatically corrects memory errors Suitable for large scale deployment

Critical Memory / Samurai Collaboration with Karthik Pattabiraman, Vinod Grover New memory semantics Source changes to explicitly identify and protect

critical data Conclusion

6Tolerating and Correcting Memory Errors

in C and C++

Page 7: Tolerating and Correcting Memory Errors in C and C++

Tolerating and Correcting Memory Errors in C and C++

DieHard Allocator in a Nutshell Joint work with Emery Berger Existing heaps are packed

tightly to minimize space Tight packing increases likelihood

of corruption Predictable layout is easier for

attacker to exploit We randomize and

overprovision the heap Expansion factor determines how

much empty space Semantics are identical

Replication increases benefits Enables analytic reasoning

7

Normal Heap

DieHard Heap

Ben Zorn, Microsoft Research

Page 8: Tolerating and Correcting Memory Errors in C and C++

DieHard in Practice DieHard (non-replicated)

Windows, Linux version implemented by Emery Berger Try it right now! (http://www.diehard-software.org/) Adaptive, automatically sizes heap Mechanism automatically redirects malloc calls to DieHard DLL

Application: Firefox & Mozilla Known buffer in version 1.7.3 overflow crashes browser

Experience Usable in practice – no perceived slowdown Roughly doubles memory consumption

20.3 Mbytes vs. 44.3 Mbytes with DieHard

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 8

Page 9: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Caveats Primary focus is on protecting heap

Techniques applicable to stack data, but requires recompilation and format changes

DieHard trades space, extra processors for memory safety Not applicable to applications with large footprint Applicability to server apps likely to increase

DieHard requires non-deterministic behavior to be made deterministic (on input, gettimeofday(), etc.)

DieHard is a brute force approach Improvements possible (efficiency, safety, coverage, etc.)

9Tolerating and Correcting Memory Errors

in C and C++

Page 10: Tolerating and Correcting Memory Errors in C and C++

Exterminator Motivation DieHard limitations

Tolerates errors probabilistically, doesn’t fix them Memory and CPU overhead Provides no information about source of errors

“Ideal” solution addresses the limitations Program automatically detects and fixes memory errors Corrected program has no memory, CPU overhead Sources of errors are pinpointed, easier for human to fix

Exterminator = correcting allocator Joint work with Emery Berger, Gene Novark Random allocation => isolates bugs while tolerating them

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 10

Page 11: Tolerating and Correcting Memory Errors in C and C++

Exterminator Components Architecture of Exterminator dictated by solving

specific problems How to detect heap corruptions effectively?

DieFast allocator How to isolate the cause of a heap corruption

precisely? Heap differencing algorithms

How to automatically fix buggy C code without breaking it? Correcting allocator + hot allocator patches

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 11

Page 12: Tolerating and Correcting Memory Errors in C and C++

DieFast Allocator Randomized, over-provisioned heap

Canary = random bit pattern fixed at startup Leverage extra free space by inserting canaries

Inserting canaries Initialization – all cells have canaries On allocation – no new canaries On free – put canary in the freed object with prob. P Remember where canaries are (bitmap)

Checking canaries On allocation – check cell returned On free – check adjacent cells

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 12

100101011110

Page 13: Tolerating and Correcting Memory Errors in C and C++

1 2

Installing and Checking Canaries

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 13

Allocate Allocate

Install canarieswith probability PCheck canary Check canary

Free

Initially, heap full of canaries

1

Page 14: Tolerating and Correcting Memory Errors in C and C++

Heap Differencing Strategy

Run program multiple times with different randomized heaps

If detect canary corruption, dump contents of heap Identify objects across runs using allocation order

Key insight: Relation between corruption and object causing corruption is invariant across heaps Detect invariant across random heaps More heaps => higher confidence of invariant

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 14

Page 15: Tolerating and Correcting Memory Errors in C and C++

1 2 X

Attributing Buffer Overflows

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 15

One candidate!

4 3

corrupted canary

Which object caused?

delta is constant but unknown?

12 X4 3

Run 2

Run 1

Now only 2 candidates

2 4

41 X3

Run 3

2 44

Precision increases exponentially with number of runs

Page 16: Tolerating and Correcting Memory Errors in C and C++

Detecting Dangling Pointers (2 cases) Dangling pointer read/written (easy)

Invariant = canary in freed object X has same corruption in all runs

Dangling pointer only read (harder) Sketch of approach (paper explains details)

Only fill freed object X with canary with probability P Requires multiple trials: ≈ log2(number of callsites) Look for correlations, i.e., X filled with canary => crash Establish conditional probabilities

Have: P(callsite X filled with canary | program crashes) Need: P(crash | filled with canary), guess “prior” to compute

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 16

Page 17: Tolerating and Correcting Memory Errors in C and C++

Correcting Allocator Group objects by allocation site Patch object groups at allocate/free time Associate patches with group

Buffer overrun => add padding to size request malloc(32) becomes malloc(32 + delta)

Dangling pointer => defer free free(p) becomes defer_free(p, delta_allocations)

Fixes preserve semantics, no new bugs created Correcting allocation may != DieFast or DieHard

Correction allocator can be space, CPU efficient “Patches” created separately, installed on-the-fly

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 17

Page 18: Tolerating and Correcting Memory Errors in C and C++

Deploying Exterminator Exterminator can be deployed in different modes Iterative – suitable for test environment

Different random heaps, identical inputs Complements automatic methods that cause crashes

Replicated mode Suitable in a multi/many core environment Like DieHard replication, except auto-corrects, hot patches

Cumulative mode – partial or complete deployment Aggregates results across different inputs Enables automatic root cause analysis from Watson dumps Suitable for wide deployment, perfect for beta release Likely to catch many bugs not seen in testing lab

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 18

Page 19: Tolerating and Correcting Memory Errors in C and C++

cfra

c

espr

esso

linds

ay p2c

robo

op

164.

gzip

175.

vpr

176.

gcc

181.

mcf

186.

craf

ty

197.

pars

er

252.

eon

253.

perlb

mk

254.

gap

255.

vortex

256.

bzip

2

300.

twolf

Geom

etric

mea

n

0

0.5

1

1.5

2

2.5

GNU libc Exterminator

Norm

alized

Execu

tion

Tim

e

allocation-intensive SPECint2000

DieFast Overhead

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 19

Page 20: Tolerating and Correcting Memory Errors in C and C++

Exterminator Effectiveness Squid web cache buffer overflow

Crashes glibc 2.8.0 malloc 3 runs sufficient to isolate 6-byte overflow

Mozilla 1.7.3 buffer overflow (recall demo) Testing scenario - repeated load of buggy page

23 runs to isolate overflow Deployed scenario – bug happens in middle of

different browsing sessions 34 runs to isolate overflow

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 20

Page 21: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Outline Motivation Exterminator

Collaboration with Emery Berger, Gene Novark Automatically corrects memory errors Suitable for large scale deployment

Critical Memory / Samurai Collaboration with Karthik Pattabiraman, Vinod Grover New memory semantics Source changes to explicitly identify and protect

critical data Conclusion

21Tolerating and Correcting Memory Errors

in C and C++

Page 22: Tolerating and Correcting Memory Errors in C and C++

Critical Memory Motivation C/C++ programs vulnerable to memory errors

Software errors: buffer overflows, etc. Hardware transient errors: bit flips, etc.

Increasingly a problem due to process shrinking, power

Critical memory goals: Harden programs from both SW and HW errors Allow local reasoning about memory state Allow selective, incremental hardening of apps Provide compatibility with existing libraries, apps

Ben Zorn, Microsoft Research 22

Tolerating and Correcting Memory Errors in C and C++

Page 23: Tolerating and Correcting Memory Errors in C and C++

balance += 100;if (balance < 0) { chargeCredit();} else { x += 10; y += 10;}

Main Idea: Data-centric Robustness Critical memory

Some data is more important than other data Selectively protect that data from corruption

Examples Account data, document contents are critical

// UI data is not Game score information, player stats, critical

// rendering data structures are not

balance

Data Code

x, y

critical data

code thatreferencescritical data

Ben Zorn, Microsoft Research 23

Tolerating and Correcting Memory Errors in C and C++

Page 24: Tolerating and Correcting Memory Errors in C and C++

Critical Memory Semantics Conceptually, critical memory is parallel and

independent of normal memory Critical memory requires special allocate/deallocate

and read/write operations critical_store (cstore) – only way to consistently update

critical memory critical_load (cload) – only way to consistently read critical

memory Critical load/store have priority over normal

load/store Normal loads still see the value of critical memory

Ben Zorn, Microsoft Research 24

Tolerating and Correcting Memory Errors in C and C++

Page 25: Tolerating and Correcting Memory Errors in C and C++

Example

cstore balance, 100…cload balance returns 100load balance returns 100

100

100normalmemory

criticalmemory

cstore100

cstore balance, 100store balance, 10000 (applications should not do this)…load balance returns 10000 (compatibility semantics)

cload balance returns 100 (possibly triggers exception)

100

10000normalmemory

criticalmemory

cstore 100store 10000

cloadload

load

Ben Zorn, Microsoft Research 25

Tolerating and Correcting Memory Errors in C and C++

Page 26: Tolerating and Correcting Memory Errors in C and C++

int x, y, buffer[10];critical int balance = 100;

third_party_lib(&x, &y);buffer[10] = 10000; // Bug!

// balance still == 100

if (balance < 0) { chargeCredit();} else { x += 10; y += 10;}

Critical Memory Deployment Define “critical” type specifier:

Like “const” Easy to use, minimal source

mods Allows local reasoning

External libraries, code cannot modify critical data

Tolerates memory errors Non-critical overflows cannot

corrupt critical values Alllows static analysis of program

subset Critical subset of program can be

statically checked independently Additional checking on critical

data possible

Ben Zorn, Microsoft Research 26

Tolerating and Correcting Memory Errors in C and C++

Page 27: Tolerating and Correcting Memory Errors in C and C++

Third-party Libraries/Untrusted Code Library code does not

need to be critical memory aware If library does not mod

critical data, no changes required

If library needs to modify critical data Allow normal stores to

critical memory in library Explicitly “promote” on

return Copy-in, copy-out semantics

critical int balance = 100;…library_foo(&balance);promote balance;…__________________

// arg is not critical int *void library_foo(int *arg){ *arg = 10000; return;}

Ben Zorn, Microsoft Research 27

Tolerating and Correcting Memory Errors in C and C++

Page 28: Tolerating and Correcting Memory Errors in C and C++

Samurai: SCM Prototype Software critical memory for heap objects

Critical objects allocated with crit_malloc, crit_free Approach

Replication – base copy + 2 shadow copies Redundant metadata

Stored with base copy, copy in hash table Checksum, size data for overflow detection

Robust allocator as foundation DieHard, unreplicated Randomizes locations of shadow copies

Ben Zorn, Microsoft Research 28

Tolerating and Correcting Memory Errors in C and C++

Page 29: Tolerating and Correcting Memory Errors in C and C++

Samurai Implementation

cstore balance, 100…cload balance returns 100load balance returns 100

100

100basecopy

shadowcopies

cstore100

cstore balance, 100store balance, 10000…load balance returns 10000cload balance returns 100

100

cload

metadata

cs

=?

100

10000basecopy

shadowcopies

100

metadata

cs

load=?

cload

=?store 10000

Ben Zorn, Microsoft Research 29

Tolerating and Correcting Memory Errors in C and C++

Page 30: Tolerating and Correcting Memory Errors in C and C++

Samurai Experimental Results Prototype implementation of critical memory

Fault-tolerant runtime system for C/C++ Applied to heap objects Automated Phoenix compiler pass

Identified critical data for five SPECint applications Low overheads for most applications (less than 10%)

Conducted fault-injection experiments Fault tolerance significantly improved over based code Low probability of fault-propagation from non-critical data to

critical data for most applications No new assertions or consistency checks added

Ben Zorn, Microsoft Research 30

Tolerating and Correcting Memory Errors in C and C++

Page 31: Tolerating and Correcting Memory Errors in C and C++

Performance ResultsPerformance Overhead

1.03 1.08 1.01 1.08

2.73

0

0.5

1

1.5

2

2.5

3

vpr crafty parser rayshade gzip

Benchmark

Slo

wd

ow

n

Baseline Samurai

Ben Zorn, Microsoft Research 31

Tolerating and Correcting Memory Errors in C and C++

Page 32: Tolerating and Correcting Memory Errors in C and C++

Fault Injection into Critical Data (vpr)Fault Injections into vpr (with Samurai)

0%

20%

40%

60%

80%

100%

1000

0

2000

0

3000

0

4000

0

5000

0

6000

0

7000

0

8000

0

9000

0

1E+

06

Fault Period (number of accesses)

Per

cen

tag

e o

f T

rial

s

Successes Failures False-Positives

Fault Injections into vpr (without Samurai )

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

1000

00

2000

00

3000

00

4000

00

5000

00

6000

00

7000

00

8000

00

9000

00

1000

000

Fault Period (number of accesses)

Per

cen

tag

e o

f T

rial

s

Successes Failures False-Positives

Ben Zorn, Microsoft Research 32

Tolerating and Correcting Memory Errors in C and C++

Page 33: Tolerating and Correcting Memory Errors in C and C++

Fault Injection into Non-Critical DataApp Number

of TrialsControl Errors

Data Errors

Pointer Errors

Assertion Violations

Total Errors

vpr 550 (199) 0 203 (0) 1 (0) 2 (2) 203 (0)

crafty 55 (18) 12 (7) 9 (3) 4 (3) 0 25 (13)

parser 500 (380) 0 3 (1) 0 0 3 (1)

rayshade 500 (68) 0 5 (1) 0 1 (1) 5 (1)

gzip 500 (239) 0 1 (1) 2 (2) 157 (157) 3 (3)

Ben Zorn, Microsoft Research 33

Tolerating and Correcting Memory Errors in C and C++

Page 34: Tolerating and Correcting Memory Errors in C and C++

Experience with CM for Components Hardened two libraries with critical memory

CM-STL – hardened List Class data and metadata CM-malloc – hardened allocator metadata

Effort Relatively small changes to existing code

Performance impact CM-STL in Web Server thread list => 10% slower CM-malloc in cfrac benchmark => 22% slower

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 34

Page 35: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Conclusion Programs written in C / C++ can execute safely

and correctly despite memory errors Research vision

Improve existing code without source modifications Reduce human generated patches required Increase reliability, security by order of magnitude

Current projects DieHard / Exterminator: automatically detect and

correct memory errors (with high probability) Critical Memory / Samurai: enable local reasoning,

allow selective hardening, compatibility ToleRace: replication to hide data races

35Tolerating and Correcting Memory Errors

in C and C++

Page 36: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Hardware Trends (1) Reliability Hardware transient faults are increasing

Even type-safe programs can be subverted in presence of HW errors Academic demonstrations in Java, OCaml

Soft error workshop (SELSE) conclusions Intel, AMD now more carefully measuring “Not practical to protect everything” Faults need to be handled at all levels from HW up the

software stack Measurement is difficult

How to determine soft HW error vs. software error? Early measurement papers appearing

36Tolerating and Correcting Memory Errors

in C and C++

Page 37: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Hardware Trends (2) Multicore DRAM prices dropping

2Gb, Dual Channel PC 6400 DDR2 800 MHz $85

Multicore CPUs Quad-core Intel Core 2 Quad, AMD

Quad-core Opteron

Eight core Intel by 2008?

Challenge: How should we use all this hardware?

37Tolerating and Correcting Memory Errors

in C and C++

Page 38: Tolerating and Correcting Memory Errors in C and C++

Additional Information Web sites:

Ben Zorn: http://research.microsoft.com/~zorn DieHard: http://www.diehard-software.org/ Exterminator: http://www.cs.umass.edu/~gnovark/

Publications Emery D. Berger and Benjamin G. Zorn,  "DieHard:

Probabilistic Memory Safety for Unsafe Languages", PLDI’06. Karthik Pattabiraman, Vinod Grover, and Benjamin G. Zorn,

"Samurai - Protecting Critical Data in Unsafe Languages", Microsoft Research, MSR-TR-2006-127, September 2006.

Gene Novark, Emery D. Berger and Benjamin G. Zorn,  “Exterminator: Correcting Memory Errors with High Probability", PLDI’07.

Ben Zorn, Microsoft Research 38

Tolerating and Correcting Memory Errors in C and C++

Page 39: Tolerating and Correcting Memory Errors in C and C++

Backup Slides

Ben Zorn, Microsoft Research 39

Tolerating and Correcting Memory Errors in C and C++

Page 40: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

DieHard: Probabilistic Memory Safety Collaboration with Emery Berger Plug-compatible replacement for malloc/free in C lib We define “infinite heap semantics”

Programs execute as if each object allocated with unbounded memory

All frees ignored Approximating infinite heaps – 3 key ideas

Overprovisioning Randomization Replication

Allows analytic reasoning about safety

40Tolerating and Correcting Memory Errors

in C and C++

Page 41: Tolerating and Correcting Memory Errors in C and C++

Overprovisioning, Randomization

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 41

Expand size requests by a factor of M (e.g., M=2)

1 2 3 4 5

1 2 3 4 5

Randomize object placement

12 34 5

Pr(write corrupts) = ½ ?

Pr(write corrupts) = ½ !

Page 42: Tolerating and Correcting Memory Errors in C and C++

Replication (optional)

Ben Zorn, Microsoft Research

Tolerating and Correcting Memory Errors in C and C++ 42

Replicate process with different randomization seeds

1 234 5

P2

12 345

P3

input

Broadcast input to all replicas

Compare outputs of replicas, kill when replica disagrees

1 23 45

P1

Voter

Page 43: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

DieHard Implementation Details Multiply allocated memory by factor of M Allocation

Segregate objects by size (log2), bitmap allocator Within size class, place objects randomly in address

space Randomly re-probe if conflicts (expansion limits probing)

Separate metadata from user data Fill objects with random values – for detecting uninit reads

Deallocation Expansion factor => frees deferred Extra checks for illegal free

43Tolerating and Correcting Memory Errors

in C and C++

Page 44: Tolerating and Correcting Memory Errors in C and C++

Segregated size classes

- Static strategy pre-allocates size classes- Adaptive strategy grows each size class incrementally

Ben Zorn, Microsoft Research

Over-provisioned, Randomized Heap

2

H = max heap size, class i

L = max live size ≤ H/2

F = free = H-L

4 5 3 1 6

object size = 16

object size = 8

44Tolerating and Correcting Memory Errors

in C and C++

Page 45: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Randomness enables Analytic ReasoningExample: Buffer Overflows

k = # of replicas, Obj = size of overflow With no replication, Obj = 1, heap no more

than 1/8 full:Pr(Mask buffer overflow), = 87.5%

3 replicas: Pr(ibid) = 99.8%

45Tolerating and Correcting Memory Errors

in C and C++

Page 46: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

DieHard CPU Performance (no replication) Runtime on Windows

0

0.2

0.4

0.6

0.8

1

1.2

1.4

cfrac espresso lindsay p2c roboop Geo. Mean

Norm

aliz

ed runtim

e

malloc DieHard

46Tolerating and Correcting Memory Errors in

C and C++

Page 47: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

DieHard CPU Performance (Linux)

47Tolerating and Correcting Memory Errors

in C and C++

cfra

c

esp

ress

o

lind

say

rob

oo

p

Ge

o. M

ea

n

16

4.g

zip

17

5.v

pr

17

6.g

cc

18

1.m

cf

18

6.c

rafty

19

7.p

ars

er

25

2.e

on

25

3.p

erl

bm

k

25

4.g

ap

25

5.v

ort

ex

25

6.b

zip

2

30

0.tw

olf

Ge

o. M

ea

n

0

0.5

1

1.5

2

2.5

malloc GC DieHard (static) DieHard (adaptive)

No

rma

lize

d r

un

tim

e

alloc-intensive general-purpose

Page 48: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Correctness Results Tolerates high rate of synthetically injected

errors in SPEC programs Detected two previously unreported benign

bugs (197.parser and espresso) Successfully hides buffer overflow error in

Squid web cache server (v 2.3s5) But don’t take my word for it…

48Tolerating and Correcting Memory Errors

in C and C++

Page 49: Tolerating and Correcting Memory Errors in C and C++

Experiments / Benchmarks vpr: Does place and route on FPGAs from netlist

Made routing-resource graph critical crafty: Plays a game of chess with the user

Made cache of previously-seen board positions critical gzip: Compress/Decompresses a file

Made Huffman decoding table critical parser: Checks syntactic correctness of English

sentences based on a dictionary Made the dictionary data structures critical

rayshade: Renders a scene file Made the list of objects to be rendered critical

Ben Zorn, Microsoft Research 49

Tolerating and Correcting Memory Errors in C and C++

Page 50: Tolerating and Correcting Memory Errors in C and C++

Ben Zorn, Microsoft Research

Related Work Conservative GC (Boehm / Demers / Weiser)

Time-space tradeoff (typically >3X) Provably avoids certain errors

Safe-C compilers Jones & Kelley, Necula, Lam, Rinard, Adve, … Often built on BDW GC Up to 10X performance hit

N-version programming Replicas truly statistically independent

Address space randomization (as in Vista) Failure-oblivious computing [Rinard]

Hope that program will continue after memory error with no untoward effects

50Tolerating and Correcting Memory Errors

in C and C++