Local Linearizability - FORSYTE (forsyte.at/download/frida15/15sokolova-local-lineariziability)

Local Linearizability
Ana Sokolova, joint work with: Andreas Haas, Michael Lippautz, Christoph Kirsch, Tom Henzinger, Andreas Holzer, Ali Sezgin, Helmut Veith, Hannes Payer


Page 1

Local Linearizability
Ana Sokolova

joint work with: Andreas Haas, Michael Lippautz, Christoph Kirsch, Tom Henzinger, Andreas Holzer, Ali Sezgin, Helmut Veith, Hannes Payer

Page 2

Semantics of concurrent data structures

• Sequential specification = set of legal sequences


• Consistency condition = e.g. linearizability / sequential consistency

Ana Sokolova FRIDA DisCoTec 5.6.15

e.g. pools, queues, stacks

e.g. queue legal sequence: enq(1) enq(2) deq(1) deq(2)
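A legal queue sequence can be recognized by replaying it against a FIFO queue. A minimal sketch; the tuple encoding of operations is ours, not from the slides:

```python
from collections import deque

def is_legal_queue_sequence(ops):
    """Replay ops against a FIFO queue; ops is a list of
    ("enq", v) / ("deq", v) pairs (a hypothetical encoding)."""
    q = deque()
    for op, v in ops:
        if op == "enq":
            q.append(v)
        elif not q or q.popleft() != v:  # deq must return the oldest element
            return False
    return True

print(is_legal_queue_sequence([("enq", 1), ("enq", 2), ("deq", 1), ("deq", 2)]))  # True
print(is_legal_queue_sequence([("enq", 1), ("enq", 2), ("deq", 2), ("deq", 1)]))  # False
```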

e.g. linearizable queue concurrent history:
t1: enq(2) deq(1)
t2: enq(1) deq(2)

Page 3

Consistency conditions


Linearizability [Herlihy, Wing '90]: there exists a sequential witness that preserves precedence.

t1: enq(2) deq(1)
t2: enq(1) deq(2)
(witness: enq(1) enq(2) deq(1) deq(2))

Sequential Consistency [Lamport '79]: there exists a sequential witness that preserves per-thread precedence (program order).

t1: enq(1) deq(2)
t2: deq(1) enq(2)
(witness: enq(1) deq(1) enq(2) deq(2))
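For histories as tiny as these, both conditions can be checked by brute force: enumerate candidate sequential witnesses and test queue legality. A sketch of the sequential-consistency check; the history encoding and helper names are ours, and real checkers avoid the factorial enumeration:

```python
from collections import deque
from itertools import permutations

# An operation is (thread, kind, value); the per-thread order of
# operations in the list is program order. Encoding is ours.
def is_queue_legal(seq):
    q = deque()
    for _, kind, v in seq:
        if kind == "enq":
            q.append(v)
        elif not q or q.popleft() != v:
            return False
    return True

def preserves_program_order(witness, history):
    threads = {t for t, _, _ in history}
    return all([e for e in witness if e[0] == t] ==
               [e for e in history if e[0] == t] for t in threads)

def sequentially_consistent(history):
    """Brute force: some permutation is legal and keeps program order.
    Exponential, but fine for slide-sized histories."""
    return any(is_queue_legal(w) and preserves_program_order(list(w), history)
               for w in permutations(history))

# Slide example: t1: enq(1) deq(2); t2: deq(1) enq(2)
h = [("t1", "enq", 1), ("t1", "deq", 2), ("t2", "deq", 1), ("t2", "enq", 2)]
print(sequentially_consistent(h))  # True: witness enq(1) deq(1) enq(2) deq(2)
```

Linearizability would additionally constrain the witness by real-time precedence between non-overlapping operations, which this simplified encoding does not carry.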

Page 4

Performance and scalability


[Sketch: throughput vs. number of threads / cores, with curves ranging from scalable (:-))) to non-scalable (:-()]

Page 5

Relaxations allow trading correctness for performance: they provide the potential for better-performing implementations.

Page 6

Relaxing the Semantics

• Sequential specification = set of legal sequences

• Consistency condition = e.g. linearizability / sequential consistency


Quantitative relaxations [Henzinger, Kirsch, Payer, Sezgin, Sokolova, POPL '13]

Local linearizability - in this talk for queues only (feel free to ask for more)

not “sequentially correct”

too weak

Page 7

Local Linearizability main idea

• Partition a history into a set of local histories

• Require linearizability per local history


Already present in some shared-memory consistency conditions

(not in our form of choice)

Local sequential consistency… is also possible

no global witness

Page 8

Local Linearizability (queue) example


(sequential) history: enq(1) by t1, enq(2) by t2, then deq(2), deq(1) - not linearizable as a queue

t1-induced history (enq(1) … deq(1)) - linearizable

t2-induced history (enq(2) deq(2)) - linearizable

⇒ locally linearizable

Page 9

Local Linearizability (queue) definition


Queue signature ∑ = {enq(x) | x ∈ V} ∪ {deq(x) | x ∈ V} ∪ {deq(empty)}

For a history h with n threads, we put:

In_h(i) = {enq(x)_i ∈ h | x ∈ V} - the in-methods of thread i: the enqueues performed by thread i

Out_h(i) = {deq(x)_j ∈ h | enq(x)_i ∈ In_h(i)} ∪ {deq(empty)} - the out-methods of thread i: the dequeues (performed by any thread) corresponding to enqueues that are in-methods

h is locally linearizable iff every thread-induced history h_i = h | (In_h(i) ∪ Out_h(i)) is linearizable.

Page 10

Generalizations of Local Linearizability


Signature ∑

For a history h with n threads, choose:

In_h(i) - the in-methods of thread i, the methods that go into h_i

Out_h(i) - the out-methods of thread i, the methods dependent on the methods in In_h(i) (performed by any thread)

h is locally linearizable iff every thread-induced history h_i = h | (In_h(i) ∪ Out_h(i)) is linearizable.

By increasing the in-methods, LL gradually moves to linearizability.

Page 11

Where do we stand?


In general

Linearizability

Sequential Consistency

Local Linearizability

Page 12

Where do we stand?


For queues (and all pool-like data structures)

Linearizability

Sequential Consistency

Local Linearizability

Page 13

Where do we stand?


For queues

Linearizability

Sequential Consistency

Local Linearizability & Pool sequential consistency

Page 14

Properties


Local linearizability is compositional

h (over multiple objects) is locally linearizable iff each per-object subhistory of h is locally linearizable

like linearizability, unlike sequential consistency

Local linearizability is modular / “decompositional”

uses decomposition into smaller histories, by definition

allows for modular verification

Page 15

Verification (queue)


Queue sequential specification (axiomatic)

s is a legal queue sequence iff
1. s is a legal pool sequence, and
2. enq(x) <_s enq(y) ⋀ deq(y) ∈ s ⇒ deq(x) ∈ s ⋀ deq(x) <_s deq(y)

Queue linearizability (axiomatic)

h is queue linearizable iff
1. h is pool linearizable, and
2. enq(x) <_h enq(y) ⋀ deq(y) ∈ h ⇒ deq(x) ∈ h ⋀ deq(y) ≮_h deq(x)
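The two axioms translate almost verbatim into a checker for sequential histories. A sketch under our own encoding, assuming unique values and omitting deq(empty):

```python
def pool_legal(seq):
    """Axiom 1: every deq(x) is preceded by enq(x) and no value is
    dequeued twice. seq is a list of ("enq"/"deq", value) pairs."""
    enq_pos = {v: i for i, (k, v) in enumerate(seq) if k == "enq"}
    deqs = [(i, v) for i, (k, v) in enumerate(seq) if k == "deq"]
    return (len(deqs) == len({v for _, v in deqs}) and
            all(v in enq_pos and enq_pos[v] < i for i, v in deqs))

def queue_legal(seq):
    """Axiom 2: enq(x) <_s enq(y) and deq(y) in s imply deq(x) in s
    and deq(x) <_s deq(y)."""
    if not pool_legal(seq):
        return False
    enq_pos = {v: i for i, (k, v) in enumerate(seq) if k == "enq"}
    deq_pos = {v: i for i, (k, v) in enumerate(seq) if k == "deq"}
    for x in enq_pos:
        for y in deq_pos:
            if y in enq_pos and enq_pos[x] < enq_pos[y]:
                if x not in deq_pos or deq_pos[x] > deq_pos[y]:
                    return False
    return True

print(queue_legal([("enq", 1), ("enq", 2), ("deq", 1), ("deq", 2)]))  # True
print(queue_legal([("enq", 1), ("enq", 2), ("deq", 2), ("deq", 1)]))  # False
```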

Page 16

Verification (queue)


Queue sequential specification (axiomatic)

s is a legal queue sequence iff
1. s is a legal pool sequence, and
2. enq(x) <_s enq(y) ⋀ deq(y) ∈ s ⇒ deq(x) ∈ s ⋀ deq(x) <_s deq(y)

Queue local linearizability (axiomatic)

h is queue locally linearizable iff
1. h is pool locally linearizable, and
2. enq(x) <_{h_i} enq(y) ⋀ deq(y) ∈ h ⇒ deq(x) ∈ h ⋀ deq(y) ≮_h deq(x)

Page 17

Generic Implementations


Your favorite linearizable data structure implementation Φ turns into a locally linearizable implementation LLD Φ by keeping one instance of Φ per thread, in a segment of dynamic size (n):

t1 → Φ, t2 → Φ, …, tn → Φ

local inserts / global (randomly distributed) removes

LLD Φ is pool linearizable & locally linearizable.
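A minimal sketch of this construction, with a lock-protected deque standing in for the linearizable backend Φ. The class name, the random-start scan for removes, and returning None on empty are our illustration choices, not the slides':

```python
import random
import threading
from collections import deque

class LLDQueue:
    """Sketch of the LLD transformation. One instance of Φ per thread
    (here Φ is a lock-protected deque standing in for your favorite
    linearizable queue). Inserts go to the caller's own segment; removes
    start at a random segment and scan onward."""

    def __init__(self, num_threads):
        self.segments = [deque() for _ in range(num_threads)]
        self.locks = [threading.Lock() for _ in range(num_threads)]

    def enq(self, thread_id, x):
        # local insert: only thread_id's own segment is touched
        with self.locks[thread_id]:
            self.segments[thread_id].append(x)

    def deq(self):
        # global, randomly distributed remove
        n = len(self.segments)
        start = random.randrange(n)
        for k in range(n):
            i = (start + k) % n
            with self.locks[i]:
                if self.segments[i]:
                    return self.segments[i].popleft()
        return None  # all segments empty

q = LLDQueue(num_threads=2)
q.enq(0, 1); q.enq(0, 2); q.enq(1, 3)
print(q.deq() in {1, 3})  # per-segment FIFO: first deq yields 1 or 3
```

Each segment preserves its own FIFO order, which is what makes the result locally linearizable; the global order across segments is deliberately relaxed to pool behavior.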

Page 18

Performance


[Plot: million operations per sec (more is better) vs. number of threads (2-80), for MS, LCRQ, k-FIFO (k=80), LL k-FIFO (k=80), static LL DQ (p=40), dynamic LL DQ, 1-RA DQ (p=80)]

(a) FIFO queues and “queue-like” pools

[Plot: million operations per sec (more is better) vs. number of threads (2-80), for Treiber, TS-interval Stack, k-Stack (k=80), LL k-Stack (k=80), static LL DS (p=40), dynamic LL DS, 1-RA DS (p=80)]

(b) Stacks and “stack-like” pools

Figure 9: Performance and scalability of producer-consumer microbenchmarks with an increasing number of threads on a 40-core (2 hyperthreads per core) machine

The speedup of the fastest locally linearizable implementation (dynamic LL) over the fastest linearizable queue (LCRQ) and stack (TS-interval Stack) implementation at 80 threads is 1.93 and 1.79, respectively.

Note that the performance degradation for LCRQ between 30 and 70 threads aligns with the performance of fetch-and-inc, the CPU instruction that atomically retrieves and modifies the contents of a memory location, on the benchmarking machine, which differs from the original benchmarking machine [31]. LCRQ uses fetch-and-inc as its key atomic instruction.

7. Conclusions & Future Work

Local linearizability utilizes the idea of decomposition, yielding an intuitive and verifiable consistency condition for concurrent objects that enables implementations with superior performance and scalability compared to linearizable and relaxed implementations. There are at least two directions for future work: (1) further implications of decomposition for correctness; (2) program-aware correctness, i.e., identifying classes of programs that are (in)sensitive to notions of (relaxed) semantics.

References

[1] Y. Afek, G. Korland, and E. Yanovsky. Quasi-Linearizability: Relaxed Consistency for Improved Concurrency. In Proc. Conference on Principles of Distributed Systems (OPODIS), LNCS, pages 395–410. Springer, 2010.

[2] M. Ahamad, R. Bazzi, R. John, P. Kohli, and G. Neiger. The power of processor consistency. In Proc. Symposium on Parallel Algorithms and Architectures (SPAA), pages 251–260. ACM, 1993.

[3] M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causal memory: definitions, implementation, and programming. Distributed Computing, 9(1):37–49, 1995. ISSN 0178-2770.

[4] D. Alistarh, J. Kopinsky, J. Li, and N. Shavit. The SprayList: A scalable relaxed priority queue. In Proc. Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 11–20. ACM, 2015.

[5] J. Aspnes, M. Herlihy, and N. Shavit. Counting networks. J. ACM, 41(5):1020–1048, Sept. 1994. ISSN 0004-5411.

[6] M. Batty, M. Dodds, and A. Gotsman. Library abstraction for C/C++ concurrency. In Proc. Symposium on Principles of Programming Languages (POPL), pages 235–248. ACM, 2013.

[7] A. Bouajjani, M. Emmi, C. Enea, and J. Hamza. Tractable refinement checking for concurrent objects. In Proc. Symposium on Principles of Programming Languages (POPL), pages 651–662. ACM, 2015.

[8] S. Burckhardt, A. Gotsman, H. Yang, and M. Zawirski. Replicated data types: Specification, verification, optimality. In Proc. Symposium on Principles of Programming Languages (POPL), pages 271–284. ACM, 2014.

[9] POPL 2015 Artifact Evaluation Committee. POPL 2015 artifact evaluation. URL http://popl15-aec.cs.umass.edu/home/. Accessed on 01/14/2015.

[10] Computational Systems Group, University of Salzburg. Scal: High-performance multicore-scalable computing. URL http://scal.cs.uni-salzburg.at.

[11] J. Derrick, B. Dongol, G. Schellhorn, B. Tofan, O. Travkin, and H. Wehrheim. Quiescent Consistency: Defining and Verifying Relaxed Linearizability. In Proc. International Symposium on Formal Methods (FM), LNCS, pages 200–214. Springer, 2014.

[12] M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stamped stack. In Proc. Symposium on Principles of Programming Languages (POPL), pages 233–246. ACM, 2015.

[13] J. Goodman. Cache consistency and sequential consistency. University of Wisconsin-Madison, Computer Sciences Department, 1991.

[14] A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova. Distributed queues in shared memory: Multicore performance and scalability through quantitative relaxation. In Proc. International Conference on Computing Frontiers (CF). ACM, 2013.

[15] A. Heddaya and H. Sinha. Coherence, non-coherence and local consistency in distributed shared memory for parallel computing. Technical report, Computer Science Department, Boston University, 1992.

[16] S. Heller, M. Herlihy, V. Luchangco, M. Moir, W. Scherer, and N. Shavit. A lazy concurrent list-based set algorithm. In Proc. Conference on Principles of Distributed Systems (OPODIS), LNCS. Springer, 2005.

[17] J. Hennessy and D. Patterson. Computer Architecture, Fifth Edition: A Quantitative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 5th edition, 2011. ISBN 9780123838728.

[18] T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quantitative relaxation of concurrent data structures. In Proc. Symposium on Principles of Programming Languages (POPL), pages 317–328. ACM, 2013.

[19] T. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented linearizability proofs. In Proc. International Conference on Concurrency Theory (CONCUR), LNCS, pages 242–256. Springer, 2013.


MS queue vs. LLD MS queue: LLD MS queue performs significantly better than MS queue.

Page 19

Performance


MS queue vs. LLD MS queue: LLD Φ performs significantly better than Φ.

Page 20

Performance


MS queue vs. LLD MS queue: LLD MS queue performs better than the best known pools.

Thank You!