UW-Madison Computer Sciences Multifacet Group © 2011 Karma: Scalable Deterministic Record- Replay Arkaprava Basu Jayaram Bobba Mark D. Hill Work done at University of Wisconsin- Madison



UW-Madison Computer Sciences Multifacet Group © 2011

Karma: Scalable Deterministic Record-Replay

Arkaprava Basu
Jayaram Bobba
Mark D. Hill

Work done at University of Wisconsin-Madison


Executive summary

• Applications of deterministic record-replay
  – Debugging
  – Fault tolerance
  – Security

• Existing hardware record-replayer
  – Fast record but
  – Slow replay or
  – Requires major hardware changes

• Karma: Faster Replay with nearly-conventional h/w
  – Extends Rerun
  – Records more parallelism


Outline

• Background & Motivation
• Rerun Overview
• Karma Insights
• Karma Implementation
• Evaluation
• Conclusion


Deterministic Record-Replay

• Multi-threaded execution non-deterministic
• Deterministic record-replay to reincarnate past execution
• Record:
  – Record selective events in a log
• Replay:
  – Use the log to reincarnate past execution
• Key Challenge: Memory races


Record-Replay Motivation

• Debugging
  – Ensures bugs faithfully reappear (no heisenbugs)

• Fault-Tolerance
  – Enable hot backup for primary server to shadow primary & take over on failure

• Security
  – Real time intrusion detection & attack analysis

Replay speed matters


Previous work

• Record Dependence
  – Wisconsin Flight Data Recorder [ISCA’03, etc.]: Too much state
  – UCSD Strata [ASPLOS’06]: Log size grows rapidly w/ #cores

• Record Independence
  – UIUC DeLorean [ISCA’08]: Non-conventional BulkSC H/W
  – Wisconsin Rerun [ISCA’08]: Sequential replay
  – Intel MRR [MICRO’09]: Only for snoop based systems
  – Timetraveler [ISCA’10]: Extends Rerun to lower log size

• Our Goal
  – Retain Rerun’s near-conventional hardware
  – Enable Faster Replay




Rerun’s Recording

• Most code executes without races
  – Use race-free regions for ordering

• Episodes: independent execution regions
  – Defined per thread

Figure: load (LD) and store (ST) instruction streams of threads T0, T1, and T2, divided into episodes.

Partially adapted from the ISCA’08 talk


Rerun’s Recording (Contd.)

• Capturing causality:
  – Timestamp via Lamport scalar clock [Lamport ’78]

• Replay in timestamp order
  – Episodes with same timestamp can be replayed in parallel

Figure: episodes of T0, T1, and T2 labeled with their scalar timestamps (22, 23, 43, 44, 45, 60, 61, 62).
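The timestamp scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Rerun's hardware logic: `lamport_timestamp` and `replay_order` are hypothetical helper names, and the assignment of timestamps to threads in `log` is assumed for the example rather than read off the figure.

```python
from collections import defaultdict

def lamport_timestamp(local_ts, incoming_ts):
    """Lamport scalar-clock update: move past every timestamp seen so far."""
    return max(local_ts, incoming_ts) + 1

def replay_order(episodes):
    """Group (thread, timestamp) episodes by ascending timestamp.

    Episodes inside one group share a timestamp and may replay in parallel;
    the groups themselves replay one after another.
    """
    groups = defaultdict(list)
    for thread, ts in episodes:
        groups[ts].append(thread)
    return [(ts, sorted(groups[ts])) for ts in sorted(groups)]

# Hypothetical log: the thread-to-timestamp mapping is an assumption.
log = [("T0", 43), ("T0", 44), ("T1", 22), ("T1", 44), ("T1", 45),
       ("T2", 60), ("T2", 61)]
schedule = replay_order(log)
for ts, threads in schedule:
    print(f"TS={ts}: {threads}")
```

Only the two episodes that share timestamp 44 end up in the same group, which is why a scalar clock tends to serialize replay.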


Rerun’s Replay

Figure: Rerun replays the episodes of T0, T1, and T2 in timestamp order (TS=22, 43, 44, 45, 60, 61); only the two episodes with TS=44 share a timestamp.




Karma’s Insight 1:

• Capture order with DAG (not scalar clock)

Recording: DAG captured with episode predecessor & successor sets

Figure: the recorded episodes of T0, T1, and T2 from the Rerun example.


Karma’s Insight 1:

Figure: Rerun’s replay and Karma’s replay of the same episodes on T0, T1, and T2, side by side.


Karma’s Insight 1: (Contd.)

• Naïve approach: DAG arcs point to episodes
  – Episode represented by integers
  – Too much log size overhead!

• Our approach: DAG arcs point to cores
  – Recording: Only one “active” episode per core
  – Replay: Send wakeup message(s) to core(s) of successor episode(s)
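A small software simulation can illustrate replay with arcs that name cores instead of episodes. It is a sketch under assumptions: `dag_replay`, the tuple layout `(name, pred_cores, succ_cores)`, and the example log are hypothetical, and real hardware would exchange wakeup messages over the interconnect rather than run in lockstep waves.

```python
from collections import deque

def dag_replay(per_core_log):
    """Simulate replay where DAG arcs name cores, not episodes.

    per_core_log[c] lists core c's episodes in program order; each episode
    is (name, pred_cores, succ_cores). Returns the waves of episodes that
    were able to run concurrently.
    """
    n = len(per_core_log)
    queues = [deque(log) for log in per_core_log]
    # wakeups[dst][src]: wakeup messages from src not yet consumed by dst
    wakeups = [[0] * n for _ in range(n)]
    waves = []
    while any(queues):
        # A core's next episode is ready once every predecessor core has
        # sent it a wakeup.
        ready = [c for c in range(n) if queues[c]
                 and all(wakeups[c][p] > 0 for p in queues[c][0][1])]
        if not ready:
            raise ValueError("log cannot be ordered (cycle or missing arc)")
        wave, sends = [], []
        for c in ready:
            name, preds, succs = queues[c].popleft()
            for p in preds:
                wakeups[c][p] -= 1          # consume one wakeup per predecessor
            wave.append(name)
            sends.extend((s, c) for s in succs)
        for dst, src in sends:              # deliver wakeups after the wave
            wakeups[dst][src] += 1
        waves.append(wave)
    return waves

# Hypothetical three-core log: A0 -> B0 -> C1; A1 and C0 are unordered.
example = [
    [("A0", (), (1,)), ("A1", (), ())],
    [("B0", (0,), (2,))],
    [("C0", (), ()), ("C1", (1,), ())],
]
waves = dag_replay(example)
print(waves)
```

Because each core has only one active episode at a time, a wakeup addressed to a core is enough to identify the episode it unblocks, so the log never needs to store episode identifiers.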


Karma’s Insight 1:

Figure: Karma’s replay of the episodes on T0, T1, and T2, with one log entry expanded.

Anatomy of a log entry: 84 0|0|1 0|0|1
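One plausible reading of the entry above is an episode size followed by two per-core bit vectors. The sketch below parses it that way; the field order (size, predecessors, successors), the left-to-right core numbering, and the 16-bit size field in `entry_bits` are all assumptions, not details stated on the slide.

```python
def bits_to_cores(bits):
    """Turn '0|0|1' into the list of core indices whose bit is set."""
    return [i for i, b in enumerate(bits.split("|")) if b == "1"]

def parse_entry(text):
    """Parse '<size> <pred bits> <succ bits>' (assumed field order)."""
    refs, pred, succ = text.split()
    return {"refs": int(refs),
            "pred": bits_to_cores(pred),
            "succ": bits_to_cores(succ)}

def entry_bits(num_cores, refs_bits=16):
    """Entry size when each arc set is one bit per core (refs_bits assumed)."""
    return refs_bits + 2 * num_cores

entry = parse_entry("84 0|0|1 0|0|1")
print(entry)
print(entry_bits(16), "bits per entry on 16 cores under these assumptions")
```

The point of the bit-vector form is that the entry size depends only on the core count, not on how many arcs an episode has.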


Karma Insight 2:

• Not necessary to end the episode on every conflict:
  – As long as the episodes can be ordered during replay

Figure: the load/store streams of T0, T1, and T2 from the Rerun recording example, with episodes extended past conflicts.
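"Can be ordered during replay" amounts to the recorded arcs staying acyclic. The sketch below shows that condition in software: a conflict adds an arc and lets both episodes continue unless the arc would close a cycle. The names `reaches` and `on_conflict` and the episode labels are hypothetical, program-order arcs within a thread are left out for brevity, and the slides do not say how the hardware detects this case.

```python
def reaches(arcs, start, goal):
    """Depth-first search: is goal reachable from start along recorded arcs?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(arcs.get(node, ()))
    return False

def on_conflict(arcs, earlier, later):
    """Try to record that `earlier` must replay before `later`.

    Returns True if the arc was added and both episodes may continue,
    False if the arc would create a cycle, so an episode has to end first.
    """
    if reaches(arcs, later, earlier):
        return False
    arcs.setdefault(earlier, set()).add(later)
    return True

arcs = {}
first = on_conflict(arcs, "T0.e1", "T1.e1")    # new arc, episodes continue
second = on_conflict(arcs, "T1.e1", "T0.e1")   # reverse arc would form a cycle
third = on_conflict(arcs, "T1.e1", "T0.e2")    # fine once T0 starts a new episode
print(first, second, third)
```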




Karma Hardware

Figure: Base System block diagram showing Core 0 … Core 15, Coherence Controller, L1 I, L2 0 … L2 15, Interconnect, and DRAM.

Per-core state:
• Address Filter (FLT)
• Reference (REFS)
• Predecessor (PRED)
• Successor (SUCC)
• Timestamp (TS)

Total State: 148 bytes/core






Evaluation:

• Were we able to speed up the replay?

Figure: four bar charts (Apache, Jbb, Oltp, Zeus) comparing Base, Rerun Replay, and Karma Replay. Y-axis: speedup normalized to "Base" of the corresponding configuration (0 to 1.2). X-axis: number of cores-L2 cache size (4core-4MB, 8core-8MB, 16core-16MB).

On Average ~4X improvement in replay speed over Rerun


Evaluation

• Did we blow up the log size?

Figure: Karma log size normalized to Rerun's log size (y-axis, 0 to 1.4) against maximum allowable episode size (x-axis: 128, 256, 512, 1024, 2048, 4096, 8192, Unbounded), for Apache, Zeus, Oltp, and Jbb.

On average, Karma does not increase the log size; it reduces it by as much as 40% as larger episodes are allowed.


Conclusion

• Applications of deterministic replay
  – Debugging
  – Fault tolerance
  – Security

• Existing hardware record-replayer
  – Slow replay or
  – Requires major hardware changes

• Karma: Faster Replay with nearly-conventional h/w
  – Extends Rerun
  – Uses DAG instead of Scalar clock
  – Extend episodes past conflicts

• Widen Application + Lower Cost → More Attractive


Questions?