local linearizability - forsyteforsyte.at/download/frida15/15sokolova-local-lineariziability... ·...
TRANSCRIPT
Local LinearizabilityAna Sokolova
joint work with:
Andreas Haas
Michael LippautzChristoph KirschTom Henzinger
Andreas Holzer
Ali Sezgin Helmut VeithHannes Payer
Semantics of concurrent data structures
• Sequential specification = set of legal sequences
!
• Consistency condition = e.g. linearizability / sequential consistency
Ana Sokolova FRIDA DisCoTec 5.6.15
e.g. pools, queues, stacks
e.g. queue legal sequence enq(1)enq(2)deq(1)deq(2)
e.g. linearizable queue concurrent history
t1: enq(2) deq(1)
enq(1) deq(2)t2:
Consistency conditions
Ana Sokolova FRIDA DisCoTec 5.6.15
Linearizability
Sequential Consistency
there exists a sequential witness that preserves
precedence
t1: enq(2) deq(1)
enq(1) deq(2)t2: 1
2 3
4
there exists a sequential witness that preserves per-
thread precedence (program order)
t1: enq(1) deq(2)
deq(1) enq(2)t2:1
2 3
4
[Herlihy,Wing ’90]
[Lamport’79]
Performance and scalability
Ana Sokolova FRIDA DisCoTec 5.6.15
throughput
# of threads / cores
:-)))
:-)
:-(:-\
Relaxations allow trading !
correctness for
performance
Ana Sokolova FRIDA DisCoTec 5.6.15
provide the potential for better-performing
implementations
Relaxing the Semantics
• Sequential specification = set of legal sequences
• Consistency condition = e.g. linearizability / sequential consistency
Ana Sokolova FRIDA DisCoTec 5.6.15
Quantitative relaxations Henzinger, Kirsch, Payer, Sezgin,S. POPL13
Local linearizability in this talk for queues only
(feel free to ask for more)
not “sequentially
correct”
too weak
Local Linearizability main idea
• Partition a history into a set of local histories
• Require linearizability per local history
Ana Sokolova FRIDA DisCoTec 5.6.15
Already present in some shared-memory consistency conditions
(not in our form of choice)
Local sequential consistency… is also possible
no global witness
Local Linearizability (queue) example
Ana Sokolova FRIDA DisCoTec 5.6.15
t1: enq(1)
deq(1)enq(2)
deq(2)
t2:
(sequential) history not linearizable
t1-induced history, linearizable
t2-induced history, linearizable
locally linearizable
Local Linearizability (queue) definition
Ana Sokolova FRIDA DisCoTec 5.6.15
Queue signature ∑ = {enq(x) | x ∈ V} ∪ {deq(x) | x ∈ V} ∪ {deq(empty)}
For a history h with n threads, we put
Inh(i) = {enq(x)i ∈ h | x ∈ V}
Outh(i) = {deq(x)j ∈ h | enq(x)i ∈ Inh(i)} ∪ {deq(empty)}
in-methods of thread i enqueues performed
by thread i
out-methods of thread i dequeues
(performed by any thread) corresponding to enqueues that
are in-methods h is locally linearizable iff every thread-induced history hi = h | (Inh(i) ∪ Outh(i)) is linearizable.
Generalizations of Local Linearizability
Ana Sokolova FRIDA DisCoTec 5.6.15
Signature ∑
For a history h with n threads, choose
Inh(i)
Outh(i)
in-methods of thread i, methods that go in hi
out-methods of thread i, dependent methods
on the methods in Inh(i) (performed by any thread)
h is locally linearizable iff every thread-induced history hi = h | (Inh(i) ∪ Outh(i)) is linearizable.
by increasing the in-methods,
LL gradually moves to linearizability
Where do we stand?
Ana Sokolova FRIDA DisCoTec 5.6.15
In general
Linearizability
Sequential Consistency
Local Linearizability
Where do we stand?
Ana Sokolova FRIDA DisCoTec 5.6.15
For queues (and all pool-like data structures)
Linearizability
Sequential Consistency
Local Linearizability
Where do we stand?
Ana Sokolova FRIDA DisCoTec 5.6.15
C: For queues
Linearizability
Sequential Consistency
Local Linearizability& Pool-seq.cons.
Properties
Ana Sokolova FRIDA DisCoTec 5.6.15
Local linearizability is compositional
h (over multiple objects) is locally linearizable iff each per-object subhistory of h is locally linearizable
like linearizability unlike sequential consistency
Local linearizability is modular / “decompositional”
uses decomposition into smaller histories, by definition
allows for modular verification
Verification (queue)
Ana Sokolova FRIDA DisCoTec 5.6.15
Queue sequential specification (axiomatic)
s is a legal queue sequence iff 1. s is a legal pool sequence, and 2. enq(x) <s enq(y) ⋀ deq(y) ∈ s ⇒ deq(x) ∈ s ⋀ deq(x) <s deq(y)
Queue linearizability (axiomatic)
h is queue linearizable iff 1. h is pool linearizable, and 2. enq(x) <h enq(y) ⋀ deq(y) ∈ h ⇒ deq(x) ∈ h ⋀ deq(y) ≮h deq(x)
Verification (queue)
Ana Sokolova FRIDA DisCoTec 5.6.15
Queue sequential specification (axiomatic)
s is a legal queue sequence iff 1. s is a legal pool sequence, and 2. enq(x) <s enq(y) ⋀ deq(y) ∈ s ⇒ deq(x) ∈ s ⋀ deq(x) <s deq(y)
Queue local linearizability (axiomatic)
h is queue locally linearizable iff 1. h is pool locally linearizable, and 2. enq(x) <hi enq(y) ⋀ deq(y) ∈ h ⇒ deq(x) ∈ h ⋀ deq(y) ≮h deq(x)
Generic Implementations
Ana Sokolova FRIDA DisCoTec 5.6.15
Your favorite linearizable data structure implementationΦ
turns into a locally linearizable implementation by:
segment of dynamic size (n)
t2t1 tn…
Φ Φ Φ
local inserts / global (randomly distributed) removes
LLD Φ pool linearizable
& locally
linearizable
Performance
Ana Sokolova FRIDA DisCoTec 5.6.15
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
MSLCRQ
k-FIFO (k=80)
LL k-FIFO (k=80)static LL DQ (p=40)
dynamic LL DQ
1-RA DQ (p=80)
(a) FIFO queues and “queue-like” pools
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
TreiberTS-interval Stackk-Stack (k=80)
LL k-Stack (k=80)static LL DS (p=40)
dynamic LL DS
1-RA DS (p=80)
(b) Stacks and “stack-like” pools
Figure 9: Performance and scalability of producer-consumer mi-crobenchmarks with an increasing number of threads on a 40-core(2 hyperthreads per core) machine
of the fastest locally linearizable implementation (dynamic LL) tothe fastest linearizable queue (LCRQ) and stack (TS-interval Stack)implementation at 80 threads is 1.93 and 1.79, respectively.
Note that the performance degradation for LCRQ between 30and 70 threads aligns with the performance of fetch-and-inc—the CPU instruction that atomically retrieves and modifies the con-tents of a memory location—on the benchmarking machine, whichis different on the original benchmarking machine [31]. LCRQ usesfetch-and-inc as its key atomic instruction.
7. Conclusions & Future WorkLocal linearizability utilizes the idea of decomposition yielding anintuitive and verifiable consistency condition for concurrent objectsthat enables implementations with superior performance and scala-bility compared to linearizable and relaxed implementations. Thereare at least two directions for future work: (1) Further implicationsof decomposition to correctness; (2) Program-aware correctness,i.e., identifying classes of programs that are (in)sensitive to notionsof (relaxed) semantics.
References[1] Y. Afek, G. Korland, and E. Yanovsky. Quasi-Linearizability: Relaxed
Consistency for Improved Concurrency. In Proc. Conference onPrinciples of Distributed Systems (OPODIS), LNCS, pages 395–410.Springer, 2010. .
[2] M. Ahamad, R. Bazzi, R. John, P. Kohli, and G. Neiger. The powerof processor consistency. In Proc. Symposium on Parallel Algorithmsand Architectures (SPAA), pages 251–260. ACM, 1993. .
[3] M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causalmemory: definitions, implementation, and programming. DistributedComputing, 9(1):37–49, 1995. ISSN 0178-2770. .
[4] D. Alistarh, J. Kopinsky, J. Li, and N. Shavit. The spraylist: A scalablerelaxed priority queue. In Proc. Symposium on Principles and Practiceof Parallel Programming (PPoPP), pages 11–20. ACM, 2015. .
[5] J. Aspnes, M. Herlihy, and N. Shavit. Counting networks. J. ACM, 41(5):1020–1048, Sept. 1994. ISSN 0004-5411. .
[6] M. Batty, M. Dodds, and A. Gotsman. Library abstraction for c/c++concurrency. In Proc. Symposium on Principles of Programming Lan-guages (POPL), pages 235–248, New York, NY, USA, 2013. ACM..
[7] A. Bouajjani, M. Emmi, C. Enea, and J. Hamza. Tractable refinementchecking for concurrent objects. In Proc. Symposium on Principles ofProgramming Languages (POPL), pages 651–662. ACM, 2015. .
[8] S. Burckhardt, A. Gotsman, H. Yang, and M. Zawirski. Replicateddata types: Specification, verification, optimality. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 271–284,New York, NY, USA, 2014. ACM. .
[9] P. . A. E. Committee. Popl 2015 artifact evaluation. URL http://popl15-aec.cs.umass.edu/home/. Accessed on 01/14/2015.
[10] Computational Systems Group, University of Salzburg. Scal: High-performance multicore-scalable computing. URL http://scal.cs.uni-salzburg.at.
[11] J. Derrick, B. Dongol, G. Schellhorn, B. Tofan, O. Travkin, andH. Wehrheim. Quiescent Consistency: Defining and Verifying Re-laxed Linearizability. In Proc. International Symposium on FormalMethods (FM), LNCS, pages 200–214. Springer, 2014. .
[12] M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stampedstack. In Proc. Symposium on Principles of Programming Languages(POPL), pages 233–246. ACM, 2015. .
[13] J. Goodman. Cache consistency and sequential consistency. Univer-sity of Wisconsin-Madison, Computer Sciences Department, 1991.
[14] A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin,and A. Sokolova. Distributed queues in shared memory: Multicoreperformance and scalability through quantitative relaxation. In Proc.International Conference on Computing Frontiers (CF). ACM, 2013..
[15] A. Heddaya and H. Sinha. Coherence, non-coherence and local consis-tency in distributed shared memory for parallel computing. Technicalreport, Computer Science Department, Boston University, 1992.
[16] S. Heller, M. Herlihy, V. Luchangco, M. Moir, W. Scherer, andN. Shavit. A lazy concurrent list-based set algorithm. In Proc.Conference on Principles of Distributed Systems (OPODIS), LNCS.Springer, 2005. .
[17] J. Hennessy and D. Patterson. Computer Architecture, Fifth Edi-tion: A Quantitative Approach. Morgan Kaufmann Publishers Inc.,San Francisco, CA, USA, 5th edition, 2011. ISBN 012383872X,9780123838728.
[18] T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quan-titative relaxation of concurrent data structures. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 317–328.ACM, 2013. .
[19] T. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented lineariz-ability proofs. In Proc. Internatonal Conference on Concurrency The-ory (CONCUR), LNCS, pages 242–256. Springer, 2013. .
short description of paper 8 2015/5/21
MS queue
LLD MS queue
LLD MS queue performs
significantly better than
MS queue
Performance
Ana Sokolova FRIDA DisCoTec 5.6.15
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
MSLCRQ
k-FIFO (k=80)
LL k-FIFO (k=80)static LL DQ (p=40)
dynamic LL DQ
1-RA DQ (p=80)
(a) FIFO queues and “queue-like” pools
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
TreiberTS-interval Stackk-Stack (k=80)
LL k-Stack (k=80)static LL DS (p=40)
dynamic LL DS
1-RA DS (p=80)
(b) Stacks and “stack-like” pools
Figure 9: Performance and scalability of producer-consumer mi-crobenchmarks with an increasing number of threads on a 40-core(2 hyperthreads per core) machine
of the fastest locally linearizable implementation (dynamic LL) tothe fastest linearizable queue (LCRQ) and stack (TS-interval Stack)implementation at 80 threads is 1.93 and 1.79, respectively.
Note that the performance degradation for LCRQ between 30and 70 threads aligns with the performance of fetch-and-inc—the CPU instruction that atomically retrieves and modifies the con-tents of a memory location—on the benchmarking machine, whichis different on the original benchmarking machine [31]. LCRQ usesfetch-and-inc as its key atomic instruction.
7. Conclusions & Future WorkLocal linearizability utilizes the idea of decomposition yielding anintuitive and verifiable consistency condition for concurrent objectsthat enables implementations with superior performance and scala-bility compared to linearizable and relaxed implementations. Thereare at least two directions for future work: (1) Further implicationsof decomposition to correctness; (2) Program-aware correctness,i.e., identifying classes of programs that are (in)sensitive to notionsof (relaxed) semantics.
References[1] Y. Afek, G. Korland, and E. Yanovsky. Quasi-Linearizability: Relaxed
Consistency for Improved Concurrency. In Proc. Conference onPrinciples of Distributed Systems (OPODIS), LNCS, pages 395–410.Springer, 2010. .
[2] M. Ahamad, R. Bazzi, R. John, P. Kohli, and G. Neiger. The powerof processor consistency. In Proc. Symposium on Parallel Algorithmsand Architectures (SPAA), pages 251–260. ACM, 1993. .
[3] M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causalmemory: definitions, implementation, and programming. DistributedComputing, 9(1):37–49, 1995. ISSN 0178-2770. .
[4] D. Alistarh, J. Kopinsky, J. Li, and N. Shavit. The spraylist: A scalablerelaxed priority queue. In Proc. Symposium on Principles and Practiceof Parallel Programming (PPoPP), pages 11–20. ACM, 2015. .
[5] J. Aspnes, M. Herlihy, and N. Shavit. Counting networks. J. ACM, 41(5):1020–1048, Sept. 1994. ISSN 0004-5411. .
[6] M. Batty, M. Dodds, and A. Gotsman. Library abstraction for c/c++concurrency. In Proc. Symposium on Principles of Programming Lan-guages (POPL), pages 235–248, New York, NY, USA, 2013. ACM..
[7] A. Bouajjani, M. Emmi, C. Enea, and J. Hamza. Tractable refinementchecking for concurrent objects. In Proc. Symposium on Principles ofProgramming Languages (POPL), pages 651–662. ACM, 2015. .
[8] S. Burckhardt, A. Gotsman, H. Yang, and M. Zawirski. Replicateddata types: Specification, verification, optimality. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 271–284,New York, NY, USA, 2014. ACM. .
[9] P. . A. E. Committee. Popl 2015 artifact evaluation. URL http://popl15-aec.cs.umass.edu/home/. Accessed on 01/14/2015.
[10] Computational Systems Group, University of Salzburg. Scal: High-performance multicore-scalable computing. URL http://scal.cs.uni-salzburg.at.
[11] J. Derrick, B. Dongol, G. Schellhorn, B. Tofan, O. Travkin, andH. Wehrheim. Quiescent Consistency: Defining and Verifying Re-laxed Linearizability. In Proc. International Symposium on FormalMethods (FM), LNCS, pages 200–214. Springer, 2014. .
[12] M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stampedstack. In Proc. Symposium on Principles of Programming Languages(POPL), pages 233–246. ACM, 2015. .
[13] J. Goodman. Cache consistency and sequential consistency. Univer-sity of Wisconsin-Madison, Computer Sciences Department, 1991.
[14] A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin,and A. Sokolova. Distributed queues in shared memory: Multicoreperformance and scalability through quantitative relaxation. In Proc.International Conference on Computing Frontiers (CF). ACM, 2013..
[15] A. Heddaya and H. Sinha. Coherence, non-coherence and local consis-tency in distributed shared memory for parallel computing. Technicalreport, Computer Science Department, Boston University, 1992.
[16] S. Heller, M. Herlihy, V. Luchangco, M. Moir, W. Scherer, andN. Shavit. A lazy concurrent list-based set algorithm. In Proc.Conference on Principles of Distributed Systems (OPODIS), LNCS.Springer, 2005. .
[17] J. Hennessy and D. Patterson. Computer Architecture, Fifth Edi-tion: A Quantitative Approach. Morgan Kaufmann Publishers Inc.,San Francisco, CA, USA, 5th edition, 2011. ISBN 012383872X,9780123838728.
[18] T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quan-titative relaxation of concurrent data structures. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 317–328.ACM, 2013. .
[19] T. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented lineariz-ability proofs. In Proc. Internatonal Conference on Concurrency The-ory (CONCUR), LNCS, pages 242–256. Springer, 2013. .
short description of paper 8 2015/5/21
MS queue
LLD MS queue
LLD Φ performs
significantly better than Φ
Performance
Ana Sokolova FRIDA DisCoTec 5.6.15
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
MSLCRQ
k-FIFO (k=80)
LL k-FIFO (k=80)static LL DQ (p=40)
dynamic LL DQ
1-RA DQ (p=80)
(a) FIFO queues and “queue-like” pools
0
2
4
6
8
10
12
14
16
18
2 10 20 30 40 50 60 70 80
mill
ion
oper
atio
nspe
rsec
(mor
eis
bette
r)
number of threads
TreiberTS-interval Stackk-Stack (k=80)
LL k-Stack (k=80)static LL DS (p=40)
dynamic LL DS
1-RA DS (p=80)
(b) Stacks and “stack-like” pools
Figure 9: Performance and scalability of producer-consumer mi-crobenchmarks with an increasing number of threads on a 40-core(2 hyperthreads per core) machine
of the fastest locally linearizable implementation (dynamic LL) tothe fastest linearizable queue (LCRQ) and stack (TS-interval Stack)implementation at 80 threads is 1.93 and 1.79, respectively.
Note that the performance degradation for LCRQ between 30and 70 threads aligns with the performance of fetch-and-inc—the CPU instruction that atomically retrieves and modifies the con-tents of a memory location—on the benchmarking machine, whichis different on the original benchmarking machine [31]. LCRQ usesfetch-and-inc as its key atomic instruction.
7. Conclusions & Future WorkLocal linearizability utilizes the idea of decomposition yielding anintuitive and verifiable consistency condition for concurrent objectsthat enables implementations with superior performance and scala-bility compared to linearizable and relaxed implementations. Thereare at least two directions for future work: (1) Further implicationsof decomposition to correctness; (2) Program-aware correctness,i.e., identifying classes of programs that are (in)sensitive to notionsof (relaxed) semantics.
References[1] Y. Afek, G. Korland, and E. Yanovsky. Quasi-Linearizability: Relaxed
Consistency for Improved Concurrency. In Proc. Conference onPrinciples of Distributed Systems (OPODIS), LNCS, pages 395–410.Springer, 2010. .
[2] M. Ahamad, R. Bazzi, R. John, P. Kohli, and G. Neiger. The powerof processor consistency. In Proc. Symposium on Parallel Algorithmsand Architectures (SPAA), pages 251–260. ACM, 1993. .
[3] M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causalmemory: definitions, implementation, and programming. DistributedComputing, 9(1):37–49, 1995. ISSN 0178-2770. .
[4] D. Alistarh, J. Kopinsky, J. Li, and N. Shavit. The spraylist: A scalablerelaxed priority queue. In Proc. Symposium on Principles and Practiceof Parallel Programming (PPoPP), pages 11–20. ACM, 2015. .
[5] J. Aspnes, M. Herlihy, and N. Shavit. Counting networks. J. ACM, 41(5):1020–1048, Sept. 1994. ISSN 0004-5411. .
[6] M. Batty, M. Dodds, and A. Gotsman. Library abstraction for c/c++concurrency. In Proc. Symposium on Principles of Programming Lan-guages (POPL), pages 235–248, New York, NY, USA, 2013. ACM..
[7] A. Bouajjani, M. Emmi, C. Enea, and J. Hamza. Tractable refinementchecking for concurrent objects. In Proc. Symposium on Principles ofProgramming Languages (POPL), pages 651–662. ACM, 2015. .
[8] S. Burckhardt, A. Gotsman, H. Yang, and M. Zawirski. Replicateddata types: Specification, verification, optimality. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 271–284,New York, NY, USA, 2014. ACM. .
[9] P. . A. E. Committee. Popl 2015 artifact evaluation. URL http://popl15-aec.cs.umass.edu/home/. Accessed on 01/14/2015.
[10] Computational Systems Group, University of Salzburg. Scal: High-performance multicore-scalable computing. URL http://scal.cs.uni-salzburg.at.
[11] J. Derrick, B. Dongol, G. Schellhorn, B. Tofan, O. Travkin, andH. Wehrheim. Quiescent Consistency: Defining and Verifying Re-laxed Linearizability. In Proc. International Symposium on FormalMethods (FM), LNCS, pages 200–214. Springer, 2014. .
[12] M. Dodds, A. Haas, and C. Kirsch. A scalable, correct time-stampedstack. In Proc. Symposium on Principles of Programming Languages(POPL), pages 233–246. ACM, 2015. .
[13] J. Goodman. Cache consistency and sequential consistency. Univer-sity of Wisconsin-Madison, Computer Sciences Department, 1991.
[14] A. Haas, T. Henzinger, C. Kirsch, M. Lippautz, H. Payer, A. Sezgin,and A. Sokolova. Distributed queues in shared memory: Multicoreperformance and scalability through quantitative relaxation. In Proc.International Conference on Computing Frontiers (CF). ACM, 2013..
[15] A. Heddaya and H. Sinha. Coherence, non-coherence and local consis-tency in distributed shared memory for parallel computing. Technicalreport, Computer Science Department, Boston University, 1992.
[16] S. Heller, M. Herlihy, V. Luchangco, M. Moir, W. Scherer, andN. Shavit. A lazy concurrent list-based set algorithm. In Proc.Conference on Principles of Distributed Systems (OPODIS), LNCS.Springer, 2005. .
[17] J. Hennessy and D. Patterson. Computer Architecture, Fifth Edi-tion: A Quantitative Approach. Morgan Kaufmann Publishers Inc.,San Francisco, CA, USA, 5th edition, 2011. ISBN 012383872X,9780123838728.
[18] T. Henzinger, C. Kirsch, H. Payer, A. Sezgin, and A. Sokolova. Quan-titative relaxation of concurrent data structures. In Proc. Symposiumon Principles of Programming Languages (POPL), pages 317–328.ACM, 2013. .
[19] T. Henzinger, A. Sezgin, and V. Vafeiadis. Aspect-oriented lineariz-ability proofs. In Proc. Internatonal Conference on Concurrency The-ory (CONCUR), LNCS, pages 242–256. Springer, 2013. .
short description of paper 8 2015/5/21
MS queue
LLD MS queue
LLD MS queue performs better
than the best known
pools
Thank You!