stm in the small: trading generality for performance in ......stm overheads •book-keeping done in...

109
STM in the Small: Trading Generality for Performance in Software Transactional Memory Aleksandar Dragojević

Upload: others

Post on 14-Sep-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM in the Small: Trading Generality for Performance in Software Transactional Memory Aleksandar Dragojević

Page 2: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Our motivation

• Concurrent data structures

• Hash-tables, skip-lists

– In-memory database indices2

– Key-value stores

[2] Larson et al. “High-performance concurrency control mechanisms for main-memory databases” VLDB Dec. 2011

2

Page 3: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Software Transactional Memory

• Simplifies data structures

– Ensures correct implementations

• Enables use of complex data structures

– More complex than when using CAS directly

3

Page 4: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

4

Page 5: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

Graph from: Cascaval et al. “Software transactional memory: why is it only a research toy” ACMQ September 2008

5

Page 6: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

10

Spe

ed

up

1

2

4

8

16

6

Page 7: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

10

Spe

ed

up

1

2

4

8

16

7

Page 8: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

10

Spe

ed

up

1

2

4

8

16

8

Page 9: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

10

Spe

ed

up

1

2

4

8

16

9

Page 10: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

10

Spe

ed

up

1

2

4

8

16

10

Page 11: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

0 5 10 15

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list

STM

CAS

11

Page 12: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

What is STM performance like?

0

1

2

3

4

5

6

7

8

9

0 5 10 15

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list

STM

71%

78%

12

CAS

Page 13: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

SpecTM: Specialized STM for Concurrent Data Structures

Ease of programming easy hard

Perf

orm

ance

fa

st

slo

w

STM

CAS

13

Page 14: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

SpecTM: Specialized STM for Concurrent Data Structures

easy hard

Perf

orm

ance

fa

st

slo

w

SpecTM

Ease of programming

14

CAS

STM

Page 15: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

This talk

0

1

2

3

4

5

6

7

8

9

0 5 10 15

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list

STM

71%

15

CAS

78%

Page 16: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

This talk

0

1

2

3

4

5

6

7

8

9

0 5 10 15

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list SpecTM 0%

16

CAS

0%

Page 17: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

17

Page 18: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

18

Page 19: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM overheads

• Book-keeping done in software

• Memory accesses become STM calls

• Significant overheads3

– With a single thread 50% overheads

– Requires 4 threads to outperform sequential code in 75% of cases

[3] Dragojević et al. “Why STM Can Be more than a Research Toy” CACM Apr. 2011

19

Page 20: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM API

void StartTx();

void CommitTx();

word_t ReadWord(word_t *addr);

void WriteWord(word_t *addr, word_t val);

20

Page 21: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read

map(addr)

21

Page 22: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read

map(addr)

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

22

Page 23: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read

map(addr)

check RaW

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

23

Page 24: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read

map(addr)

check RaW

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

[orec idx]

write-set

address-value

orecs

index

… …

24

Page 25: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read

map(addr)

check RaW

o1 == o2 && !locked

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set read orec

read val

read orec

address-value

orecs

… …

25

index

[orec idx]

Page 26: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read

26

Page 27: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read head curr

count=2

read-set

27

Page 28: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read head curr

count=3

read-set

28

Page 29: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read

orec <= validts

head curr

count=3

read-set

29

Page 30: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read

return val

orec <= validts

head curr

count=3

read-set

30

Page 31: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read

validate

return val abort

valid

orec <= validts

head curr

count=3

read-set

31

Page 32: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read (cont’d)

log read

validate

return val abort

valid

orec <= validts

head curr

count=3 read-set

read-set

32

Page 33: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM read fast-path

• Significantly more expensive than CPU read instructions

– >10 instructions

• Writes have comparable costs

– Incurs a compare-and-swap at commit time

33

Page 34: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

34

Page 35: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

35

Page 36: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transactions

• Short read-write transactions

• Short read-only transactions

• Single-access transactions

– Read, Write, CAS

36

Page 37: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short read-write transactions

37

word_t TxRWRead_1(word_t *addr_1);

word_t TxRWRead_2(word_t *addr_2);

word_t TxRWRead_3(word_t *addr_3);

...

void TxRWCommit_1(word_t val);

void TxRWCommit_2(word_t v1, word_t v2);

void TxRWCommit_3(word_t v1,

word_t v2,

word_t v3);

...

Page 38: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short read-write transactions

word_t TxRWRead_1(word_t *addr_1);

word_t TxRWRead_2(word_t *addr_2);

word_t TxRWRead_3(word_t *addr_3);

...

void TxRWCommit_1(word_t val);

void TxRWCommit_2(word_t v1, word_t v2);

void TxRWCommit_3(word_t v1,

word_t v2,

word_t v3);

...

38

Page 39: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

word_t TxRWRead_1(word_t *addr_1);

word_t TxRWRead_2(word_t *addr_2);

word_t TxRWRead_3(word_t *addr_3);

...

void TxRWCommit_1(word_t val);

void TxRWCommit_2(word_t v1, word_t v2);

void TxRWCommit_3(word_t v1,

word_t v2,

word_t v3);

...

Short read-write transactions

39

Page 40: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

word_t TxRWRead_1(word_t *addr_1);

word_t TxRWRead_2(word_t *addr_2);

word_t TxRWRead_3(word_t *addr_3);

...

void TxRWCommit_1(word_t val);

void TxRWCommit_2(word_t v1, word_t v2);

void TxRWCommit_3(word_t v1,

word_t v2,

word_t v3);

...

Short read-write transactions

40

Page 41: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

41

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

Page 42: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

42

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

Page 43: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

43

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

Page 44: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

44

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

Page 45: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

45

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

Page 46: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

46

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

Page 47: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations (cont’d)

log read

validate

return val abort

valid

orec <= validts

head curr

count=3 read-set

read-set

47

Page 48: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations (cont’d)

log read

validate

return val abort

valid

orec <= validts

head curr

count=3 read-set

read-set

merged with write-set

48

static

Page 49: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Short transaction optimizations (cont’d)

log read

validate

return val abort

valid

orec <= validts

head curr

count=3 read-set

read-set

49

merged with write-set

static

Page 50: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Using short transactions

Tx

With “traditional” transactions the whole operation is a single transaction

50

Page 51: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Using short transactions

Tx1 Tx2 Tx3 Tx4 Tx5 Tx6

Split operation into a sequence of short transactions

51

Page 52: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Using short transactions

Tx1 Tx2 Tx3 Tx4 Tx5 Tx6

“Glue” them together using techniques similar to lock-free algorithms (e.g. pointer marking)

52

Page 53: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Using short transactions

Tx1 Tx2 Tx3 Tx4 Tx5 Tx6

“Glue” them together using techniques similar to lock-free algorithms (e.g. pointer marking)

Is this simpler than using CAS directly?

53

Page 54: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Using short transactions

Tx1 Tx2 Tx3 Tx4 Tx5 Tx6

“Glue” them together using techniques similar to lock-free algorithms (e.g. pointer marking)

Is this simpler than using CAS directly? Yes: transactions eliminate tricky races

54

Page 55: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Mix short and ordinary transactions

Tx1 Tx2 Tx3 Tx4 Tx5 Tx6

Tx

Implement corner cases using ordinary transactions

55

Page 56: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

56

Page 57: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

TxSingleRead

57

Page 58: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

TxSingleRead

58

Page 59: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

TxSingleRead

59

Page 60: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

60

Short read-write transaction to mark the node and unlink it

Page 61: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

61

Short read-write transaction to mark the node and unlink it

Page 62: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

This becomes trickier when using CAS directly

62

Page 63: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

We have to use several CASes

63

CAS

Page 64: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

64

We have to use several CASes

CAS

Page 65: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

65

CAS

We have to use several CASes

Page 66: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

66

We have to use several CASes

CAS

Page 67: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

67

CAS

We have to use several CASes

Page 68: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

68

We have to use several CASes CAS

Page 69: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

69

CAS

We have to use several CASes

Page 70: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

70

We have to use several CASes

CAS

Page 71: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

71

There might be a race at each of these steps

CAS

Page 72: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

72

We need to make sure other operations clean-up for us

CAS

Page 73: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

73

All interactions between operations have to be handled correctly

CAS

Page 74: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Example

10 20 30 40

10

10

20 40

40

Remove 20

74

E.g. removal of an element that is still being added to the list

CAS

Page 75: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

75

Page 76: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

“Traditional” STM data layout

O1

O2

O3

O4

W1

W2

W3

W4

application data

orecs

76

Page 77: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

“Traditional” STM data layout

O1

O2

O3

O4

W1

W2

W3

W4

application data

General False conflicts Cache misses

orecs

77

Page 78: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Collocate data and meta-data

W1

O1

W3

O3

W2

O2

W4

O4

Simplify mapping No false conflicts Same cache line

78

Page 79: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Collocated meta-data optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

79

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

Page 80: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Collocated meta-data optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

80

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

simplified mapping

Page 81: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Collocated meta-data optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

81

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

simplified mapping

better cache locality

Page 82: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based validation

W1

W2

W3

W4

… Restrict values Restrict access patterns Single-word operations

82

Page 83: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Value-based optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

83

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

simplified mapping

better cache locality

Page 84: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Value-based optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

84

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

simplified mapping

better cache locality

Page 85: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Value-based optimizations

address

orec table

0 0 c 1 2 a b 0

[c1a2aa]

[c1a2ab]

[c1a2ac]

write-set

address-value

orecs

… …

static

write-set

85

index

[orec idx]

map(addr)

check RaW

read orec

read val

read orec

o1 == o2 && !locked

CAS (from write)

simplified mapping

better cache locality

Page 86: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

API changes

86

Page 87: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

API changes

word_t ReadWord(TVar *addr);

void WriteWord(TVar *addr, word_t val);

87

word_t TVarToWord(TVar addr);

TVar WordToTVar(word_t val);

Page 88: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

API changes

word_t ReadWord(TVar *addr);

void WriteWord(TVar *addr, word_t val);

88

Transactions access TVar, not word_t

word_t TVarToWord(TVar addr);

TVar WordToTVar(word_t val);

Page 89: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

API changes

89

Transactions access TVar, not word_t

Calls to update TVar non-transactionally

word_t ReadWord(TVar *addr);

void WriteWord(TVar *addr, word_t val);

word_t TVarToWord(TVar addr);

TVar WordToTVar(word_t val);

Page 90: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

read val

90

Page 91: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

!locked

91

read val

Page 92: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

CAS

!locked

92

read val

Page 93: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

abort

!locked

93

read val

CAS

Page 94: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

log

abort

success

!locked

94

read val

CAS

Page 95: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

write-set

abort

!locked

95

static

read val

log

success

CAS

Page 96: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

return val

abort

!locked

96

read val

static

write-set

log

success

CAS

Page 97: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Pure value-based short read-write

return val

abort

!locked

97

read val

static

write-set

log

success

CAS

Page 98: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

STM vs. SpecTM

98

STM SpecTM

Page 99: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Overview

• STM overheads

• SpecTM

– Short-transaction API

– Collocating data and meta-data

– In-word meta-data

• Evaluation

99

Page 100: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Evaluation

0

10

20

30

40

50

60

70

80

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list (90% lookups)

STM

100

CAS

Page 101: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Evaluation

0

10

20

30

40

50

60

70

80

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list (90% lookups)

STM

240%

101

CAS

Page 102: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

0

10

20

30

40

50

60

70

80

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list (90% lookups)

Evaluation

48%

STM

SpecTM-Short

102

CAS

Page 103: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

0

10

20

30

40

50

60

70

80

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list (90% lookups)

Evaluation

12%

SpecTM-Short-Colloc

SpecTM-Short

STM

103

CAS

Page 104: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

0

10

20

30

40

50

60

70

80

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Skip-list (90% lookups)

Evaluation

2% SpecTM-Short-Val

STM

SpecTM-Short-Colloc

SpecTM-Short

104

CAS

Page 105: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Evaluation

105

0

50

100

150

200

250

300

350

400

450

0 20 40 60 80 100 120 140

Thro

ugh

pu

t [M

op

s/s]

Threads

Hash-table (90% lookups) CAS

SpecTM-Short-Val

SpecTM-Short-Colloc

STM

SpecTM-Short

0%

Page 106: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Evaluation

106

0

10

20

30

40

50

60

70

80

0 5 10 15

Thro

ugh

pu

t [M

op

s/s]

Threads

Hash-table (90% lookups)

STM

SpecTM-Short

SpecTM-Short-Colloc

CAS SpecTM-Short-Val 0%

Page 107: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

Conclusions

• Trade generality for performance in SpecTM

– Restrict STM API

– Control data layout

• STM-based data structures perform as well as the ones built directly from CAS

• Do we need SpecTM with upcoming Intel TSX?

– Best-effort limited-size transactions

107

Page 108: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

SpecTM: Specialized STM for Concurrent Data Structures

easy hard

Perf

orm

ance

fa

st

slo

w

SpecTM

Ease of programming

108

CAS

STM

Page 109: STM in the Small: Trading Generality for Performance in ......STM overheads •Book-keeping done in software •Memory accesses become STM calls •Significant overheads3 –With a

©2011 Microsoft Corporation. All rights reserved.