Early Scheduling in Parallel State Machine Replication
Eduardo Alchieri, Fernando Dotti, and Fernando Pedone
Universidade de Brasília, Pontifícia Universidade Católica do Rio Grande do Sul, and University of Lugano
State Machine Replication (SMR)
Fundamental approach to fault tolerance
Google Spanner
Apache Zookeeper
Windows Azure Storage
MySQL Group Replication
Galera Cluster
Blockchain, …
SMR is intuitive and simple
[Figure: clients send requests R0, R1, R2 to the servers; every server receives the requests in the same order and executes them deterministically.]
Key observation
Independent requests can execute concurrently
Conflicting requests must be serialized and executed in the same order by the replicas
Two requests conflict if they access common state and at least one of them updates the state
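The conflict definition can be made concrete with a small sketch. The `Request` type and its `reads` and `writes` sets are hypothetical names introduced here for illustration only; they do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Request:
    """Hypothetical request: the state items it reads and the items it updates."""
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflict(a: Request, b: Request) -> bool:
    """True if a and b access common state and at least one of them updates it."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


r0 = Request("R0", writes=frozenset({"x"}))
r1 = Request("R1", writes=frozenset({"y"}))
r2 = Request("R2", reads=frozenset({"x"}))
r3 = Request("R3", reads=frozenset({"x"}))

print(conflict(r0, r1))  # False: disjoint state
print(conflict(r0, r2))  # True: common state x, and R0 updates it
print(conflict(r2, r3))  # False: common state x, but neither updates it
```

Under this definition, R0 and R1 are independent and may run concurrently, while R0 and R2 must be executed in the same order at every replica.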
Parallel State Machine Replication
[Figure: two replicas, each with a scheduler and two workers. Worker tx executes the requests on x: R0(x), R2(x), R3(x), R5(x). Worker ty executes the requests on y: R1(y), R4(y).]
Late scheduling
Scheduling happens after requests are ordered
Early scheduling
Scheduling decisions happen before requests are ordered
E.g., worker tx executes requests on X, worker ty executes requests on Y
Scheduling tradeoff
[Figure: concurrency (low to high) plotted against synchronization overhead (low to high). Points shown: Classic SMR, Late Scheduling, Early Scheduling, Early Scheduling (this paper), and Ideal.]
Our contributions
Generalization of Early Scheduling
Classes of requests: expressing application concurrency
How to automatically map classes to worker threads
How the resulting technique compares to late scheduling
Classes of requests
Readers and writers
Class CR: read requests
Class CW: write requests
[Figure: classes CR and CW. CW has an internal conflict; CR and CW have an external conflict.]
Mapping classes to workers
Define workers that execute requests in the class
Define class type
Sequential: one request at a time
Concurrent: requests executed concurrently
[Figure: CR is concurrent and CW is sequential. Worker sets shown: t0, …, tk and tk, …, tn.]
Early Scheduling execution model
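[Figure: inside a replica, the scheduler takes the ordered requests R1, R2, … in class C and uses the class ➝ workers mapping to place them in the queues of worker t0 and worker t1; requests R1 to R7 are split between the two queues.]
Class C is CONCURRENT: each request is assigned to t0 OR t1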
Early Scheduling execution model
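[Figure: inside a replica, the scheduler takes the ordered requests R1, R2, … in class C and uses the class ➝ workers mapping to place R1 and R2 in the queues of both worker t0 and worker t1; the workers synchronize at a barrier.]
Class C is SEQUENTIAL: each request is assigned to t0 AND t1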
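The two execution modes can be sketched with threads and queues. This is a minimal illustration under stated assumptions, not the authors' prototype: the `Replica` class, round-robin choice of worker for concurrent classes, and the choice of the first listed worker as the one that executes a sequential request are all hypothetical details introduced here.

```python
import queue
import threading
from itertools import cycle

CONCURRENT, SEQUENTIAL = "concurrent", "sequential"


class Replica:
    """Hypothetical replica: the caller acts as scheduler, plus worker threads."""

    def __init__(self, n_workers, classes, execute):
        # classes: name -> (type, list of worker ids); assumed to be a valid mapping
        self.classes = classes
        self.execute = execute
        self.queues = [queue.Queue() for _ in range(n_workers)]
        self.next_worker = {c: cycle(ws) for c, (_, ws) in classes.items()}
        self.threads = [threading.Thread(target=self._work, args=(i,))
                        for i in range(n_workers)]
        for t in self.threads:
            t.start()

    def schedule(self, cls, request):
        """Called in delivery order; only consults the static class -> workers mapping."""
        kind, workers = self.classes[cls]
        if kind == CONCURRENT:
            # Concurrent class: the request goes to ONE worker of the class.
            self.queues[next(self.next_worker[cls])].put((request, None, False))
        else:
            # Sequential class: the request goes to ALL workers of the class.
            barrier = threading.Barrier(len(workers))
            for w in workers:
                self.queues[w].put((request, barrier, w == workers[0]))

    def _work(self, wid):
        while True:
            item = self.queues[wid].get()
            if item is None:
                return
            request, barrier, is_executor = item
            if barrier is None:
                self.execute(request)
                continue
            barrier.wait()          # wait until every worker of the class arrives
            if is_executor:
                self.execute(request)
            barrier.wait()          # the others resume only after execution

    def shutdown(self):
        for q in self.queues:
            q.put(None)
        for t in self.threads:
            t.join()


state, log = {"x": 0}, []

def execute(req):
    op, value = req
    if op == "write":
        state["x"] = value
    else:
        log.append(state["x"])

replica = Replica(2, {"CR": (CONCURRENT, [0, 1]), "CW": (SEQUENTIAL, [0, 1])}, execute)
for cls, req in [("CW", ("write", 1)), ("CR", ("read", None)), ("CR", ("read", None)),
                 ("CW", ("write", 2)), ("CR", ("read", None))]:
    replica.schedule(cls, req)
replica.shutdown()
print(sorted(log))  # [1, 1, 2]
```

Because the scheduler enqueues each sequential request on all of its workers before handling the next request, every worker meets the barriers in the same order, so reads scheduled between two writes always observe the first write and never the second.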
Mapping classes to workers
Rule #1
Every class must have at least one worker thread
Example: C1 ➝ t0, t1, t2; C2 ➝ t3
Rule #2
If C has internal conflicts, then it must be sequential
Example: CW is sequential
Rule #3
If C1 and C2 conflict, at least one must be sequential
Example: C1 is sequential or C2 is sequential
Rule #4
If C1 and C2 conflict, C1 is concurrent, and C2 is sequential, workers of C1 are workers of C2
Example: C1 (concurrent) ➝ t0, t1; C2 (sequential) ➝ t0, t1, t2
Rule #5
If C1 and C2 conflict, and are sequential, then C1 and C2 must have one worker in common
Example: C1 (sequential) ➝ t0, t1, t2; C2 (sequential) ➝ t2, t3, t4
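The five rules lend themselves to a mechanical check. The sketch below is one possible encoding, assuming conflicts are given as a set of `frozenset`s, where a singleton stands for an internal conflict and a pair for an external one; the function name `check_mapping` and this representation are hypothetical.

```python
CONCURRENT, SEQUENTIAL = "concurrent", "sequential"


def check_mapping(classes, conflicts):
    """Return the list of violated rules for a class -> workers mapping.

    classes:   name -> (type, set of worker ids)
    conflicts: set of frozensets; {C} is an internal conflict, {C1, C2} an external one
    """
    violations = []
    for name, (kind, workers) in classes.items():
        if not workers:
            violations.append(f"Rule 1: {name} has no worker")
        if frozenset({name}) in conflicts and kind != SEQUENTIAL:
            violations.append(f"Rule 2: {name} has internal conflicts but is concurrent")
    for pair in conflicts:
        if len(pair) != 2:
            continue  # internal conflicts were handled above
        c1, c2 = sorted(pair)
        (k1, w1), (k2, w2) = classes[c1], classes[c2]
        if k1 == CONCURRENT and k2 == CONCURRENT:
            violations.append(f"Rule 3: {c1} and {c2} conflict but both are concurrent")
        elif k1 == SEQUENTIAL and k2 == SEQUENTIAL:
            if not (w1 & w2):
                violations.append(f"Rule 5: {c1} and {c2} share no worker")
        else:
            conc, seq = (c1, c2) if k1 == CONCURRENT else (c2, c1)
            if not classes[conc][1] <= classes[seq][1]:
                violations.append(f"Rule 4: workers of {conc} are not all workers of {seq}")
    return violations


# Readers and writers: CW conflicts with itself and with CR.
conflicts = {frozenset({"CW"}), frozenset({"CR", "CW"})}
ok = {"CR": (CONCURRENT, {0, 1}), "CW": (SEQUENTIAL, {0, 1, 2})}
bad = {"CR": (CONCURRENT, {0, 1, 3}), "CW": (SEQUENTIAL, {0, 1, 2})}

print(check_mapping(ok, conflicts))   # []
print(check_mapping(bad, conflicts))  # one Rule 4 violation
```

In the `bad` mapping, worker 3 could execute a read while workers 0, 1, and 2 are synchronized on a write, which is exactly what Rule #4 rules out.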
Mapping classes to workers
Local reads are the most common requests
Workers: t0, t1, t2, t3
[Figure: classes CR1, CW1, CR2, CW2, CRg, CWg, each labeled sequential or concurrent and mapped to a subset of the workers, with synchronized classes marked. Worker sets shown: t0, t1, t2; t0, t1, t2, t3; t0, t2, t3; t0, t1; t2, t3; t0, t2.]
Optimizing scheduling
O1a: Minimize workers in sequential classes
O1b: Maximize workers in concurrent classes
O2: Assign workers to concurrent classes in proportion to class weight (i.e., more work, more workers)
O3: Minimize unnecessary synchronization among classes
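As a rough illustration of O2 alone, the sketch below splits a pool of workers among concurrent classes in proportion to their weights. It is a simple largest-remainder heuristic, named plainly as such; it is not the optimization model from the talk, and the function name and the example weights are hypothetical.

```python
def allocate_workers(weights, n_workers):
    """Split n_workers among concurrent classes roughly in proportion to weight.

    Largest-remainder heuristic; every class gets at least one worker.
    Assumes positive weights and n_workers >= number of classes.
    """
    if n_workers < len(weights):
        raise ValueError("need at least one worker per class")
    total = sum(weights.values())
    spare = n_workers - len(weights)  # workers left after giving one to each class
    shares = {c: spare * w / total for c, w in weights.items()}
    alloc = {c: 1 + int(s) for c, s in shares.items()}
    leftover = n_workers - sum(alloc.values())
    # Hand out the remaining workers to the largest fractional shares.
    by_fraction = sorted(shares, key=lambda c: shares[c] - int(shares[c]), reverse=True)
    for c in by_fraction[:leftover]:
        alloc[c] += 1
    return alloc


print(allocate_workers({"CR1": 5, "CR2": 3, "CRg": 2}, 7))
# {'CR1': 3, 'CR2': 2, 'CRg': 2}
```

A full treatment would also have to respect the mapping rules and objectives O1 and O3, which is why the talk formulates the problem as an optimization model rather than a per-class heuristic.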
Optimization model
. . .
Described in AMPL
Solved with Knitro
Naive vs Optimized mapping
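Local reads are the most common requests
Workers: t0, t1, t2, t3
[Figure: naive and optimized mappings for classes CR1, CW1, CR2, CW2, CRg, CWg, with classes that can run in parallel marked. Assignments shown: Concurrent t0, t1; Concurrent t2, t3; Sequential t0, t1, t2, t3; Sequential t0, t1; Sequential t2, t3; Sequential t0, t2.]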
Experimental evaluation
Prototype in BFT-SMaRt environment
Early scheduling and late scheduling
Configured for crash failures (not BFT)
Linked-list application
Single- and multi-shard deployments
Light, moderate, and heavy execution costs
Uniform and skewed workloads
Single-shard, reads, moderate
Multi-shard, mixed, moderate
http://www.inf.usi.ch/faculty/pedone/