Byzantine Vector Consensus in Complete Graphs
Paper by:Nitin H. Vaidya - University of Illinois
Vijay K. Garg - University of Texas
Presented by: Dima Ogurtsov
In This Presentation We Will
• Briefly introduce the Byzantine scalar consensus problem
• Introduce the Byzantine vector consensus (BVC) problem
• Present the geometric and communication primitives that will be used in the BVC algorithms
• Provide a necessary and sufficient condition for achieving exact BVC in a synchronous system (complete graph)
• Provide a necessary and sufficient condition for achieving approximate BVC in an asynchronous system (complete graph)
• Provide algorithms for both versions of BVC
Agenda
• Introduction
– The Byzantine Generals problem
– Fun facts about scalar Byzantine consensus
– Introducing Byzantine vector consensus (BVC)
– Why it is a non-trivial problem even given a scalar consensus solution
• Communication primitives
– Reliable broadcast
– Witness technique
• Geometric primitives
– Tverberg’s theorem
– Linear programming
• Exact BVC in synchronous systems
– Necessary condition
– Algorithm and sufficient condition
• Approximate BVC in asynchronous systems
– Necessary condition
– Algorithm and sufficient condition
• Wrap up
Introduction
The Byzantine Generals Problem
• Imagine several divisions of the Byzantine army are camped outside of an enemy city
• Each division is commanded by its own general
• Generals communicate with each other by messages
• Generals must decide on a common plan of action
• However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement
• We want to give an algorithm which guarantees a good solution
What is a Good Solution
• It depends on the specific problem
• In general, consensus problems can be divided into two types:
– Exact consensus: all generals* get the same value
– Approximate consensus: all generals* get values which are close enough (allowing some error margin)
• In both cases we may want to put a constraint on the value that every general gets, such that a solution will be “good”. We will call this constraint a validity condition
• We will discuss the problem in d-dimensional real space, and we will choose a convexity constraint as the validity condition
• * All loyal generals, for Byzantine systems
Exact Byzantine Consensus Problem
• Assume a system of n processes
• f out of the n processes are faulty and can behave arbitrarily
• Each process Pi has a scalar input Xi
• Find a decision value that satisfies the following:
– Agreement: The decision value of all non-faulty processes is identical
– Validity: The decision value of all non-faulty processes is in the convex hull of the input values of all non-faulty processes
– Termination: Each non-faulty process must terminate within a finite amount of time
Approximate Byzantine Consensus Problem
• Assume a system of n processes
• f out of the n processes are faulty and can behave arbitrarily
• Each process Pi has a scalar input Xi
• Find a decision value that satisfies the following:
– ε-Agreement: The decision values of any two non-faulty processes must be within ε of each other, where ε > 0 is a predefined constant
– Validity: The decision value of all non-faulty processes is in the convex hull of the input values of all non-faulty processes
– Termination: Each non-faulty process must terminate within a finite amount of time
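The three conditions above are mechanical to check for scalar values, since for scalars the convex hull is just the [min, max] interval. A minimal Python sketch (all function names and values here are ours, not from the paper):

```python
# Check the conditions of scalar Byzantine consensus, restricted to the
# non-faulty processes. Illustrative names and values only.

def check_validity(decisions, honest_inputs):
    """Every non-faulty decision lies in the convex hull of the
    non-faulty inputs (for scalars: the [min, max] interval)."""
    lo, hi = min(honest_inputs), max(honest_inputs)
    return all(lo <= d <= hi for d in decisions)

def check_exact_agreement(decisions):
    """Exact consensus: all non-faulty decisions are identical."""
    return len(set(decisions)) == 1

def check_eps_agreement(decisions, eps):
    """Approximate consensus: decisions are within eps of each other."""
    return max(decisions) - min(decisions) <= eps

honest_inputs = [0.0, 1.0, 4.0]        # inputs of non-faulty processes
decisions = [1.0, 1.1, 0.9]            # hypothetical decision values
print(check_validity(decisions, honest_inputs))   # True
print(check_exact_agreement(decisions))           # False
print(check_eps_agreement(decisions, eps=0.5))    # True
```

The same checks generalize to vectors, except that validity then requires a genuine convex-hull membership test, which is exactly why the vector problem is harder.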
Synchronous and Asynchronous Systems
• The problem needs to be solved separately for synchronous and asynchronous systems
– The asynchronous solution will work for a synchronous system too, but may be non-optimal
• The asynchronous system introduces several difficulties that can influence the solution
– Arbitrary delays of messages
– No synchronized clocks
– Cannot distinguish between a crashed process and a merely slow one
Fun Facts About Scalar Byzantine Consensus
• Necessary and sufficient condition for exact Byzantine consensus in a synchronous system:
– L. Lamport, R. Shostak, and M. Pease. The Byzantine Generals Problem, 1982
• Exact Byzantine consensus cannot be achieved in an asynchronous system:
– M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of Distributed Consensus with One Faulty Process, 1985
• Necessary and sufficient condition for approximate Byzantine consensus in an asynchronous system:
– I. Abraham, Y. Amit, D. Dolev. Optimal Resilience Asynchronous Approximate Agreement, 2004
Introducing BVC Problem
• Assume a system of n processes
• f out of the n processes are faulty or Byzantine and can behave arbitrarily
• Each process Pi has a d-dimensional input vector Xi in ℝ^d
• Find a decision vector that satisfies the following:
• Validity: The decision vector at each non-faulty process must be in the convex hull of the input vectors at the non-faulty processes.
• Termination: Each non-faulty process must terminate within a finite amount of time
• For exact BVC:
– Agreement: The decision vector at all the non-faulty processes must be identical.
• For approximate BVC:
– ε-Agreement: For 1 ≤ l ≤ d, the l-th elements of the decision vectors at any two non-faulty processes must be within ε of each other, where ε > 0 is a pre-defined constant.
Why not scalar Byzantine consensus on each dimension
• One might think that the BVC problem can be solved by simply performing scalar consensus on each dimension of the input vectors independently
• But in reality, even if the validity condition for scalar consensus is satisfied for each dimension of the vector separately, the validity condition for the decision vector may not be satisfied
Why not scalar Byzantine consensus on each dimension (2)
• For example let’s take n=4, f=1, d=3. Processes p1, p2 and p3 are not faulty and their input vectors are [1,0,0], [0,1,0], and [0,0,1] respectively. p4 is faulty.
• If we perform Byzantine scalar consensus on each dimension of the vector separately, then the processes may possibly agree on the decision vector [0,0,0] which satisfies scalar validity condition along each dimension separately
• However, the decision vector is not in the convex hull of the non-faulty inputs: the inputs are probability vectors (coordinates summing to 1), while the decision vector is not.
• So the decision vector is not valid.
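The counterexample can be checked numerically: every convex combination of the three unit-vector inputs has coordinates summing to 1, so the per-dimension decision (0, 0, 0), whose coordinates sum to 0, is unreachable. A small illustrative sketch:

```python
# Non-faulty inputs from the example: the three unit vectors in R^3.
inputs = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def convex_combination(points, weights):
    """Weighted sum of points; weights must be non-negative and sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9 and all(w >= 0 for w in weights)
    return tuple(sum(w * p[k] for w, p in zip(weights, points))
                 for k in range(len(points[0])))

# Sweep a grid of weights (w1 + w2 + w3 = 1) and confirm the invariant:
# the coordinate sum of every convex combination stays equal to 1.
for a in range(11):
    for b in range(11 - a):
        point = convex_combination(inputs, [a / 10, b / 10, (10 - a - b) / 10])
        assert abs(sum(point) - 1.0) < 1e-9

print("all convex combinations sum to 1; (0,0,0) is unreachable")
```
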
Why not scalar Byzantine consensus on each dimension – visual example
Communication Primitives
In this section, we present distributed algorithms and their properties which we
will use as primitive building blocks for communication in our BVC algorithms
Reliable Broadcast
• Two procedures: Reliable-Broadcast(m, r) and Reliable-Accept(p, m, r).
• Guarantees the following properties:
– Correctness. If a non-faulty process p with a message m on round r performs Reliable-Broadcast(m, r) then all non-faulty processes will eventually Reliable-Accept(p, m, r).
– Non-forgeability. If a non-faulty process p does not perform at round r Reliable-Broadcast(m, r) then no non-faulty process will ever perform Reliable-Accept(p, m, r).
– Uniqueness. If a non-faulty process performs Reliable-Accept(p, m, r) and another non-faulty process performs Reliable-Accept(p, m’, r) then m = m’;
Witness Technique
• Assuming n > 3f
• A witness for p is a process whose first n − f accepted values were also accepted by p
• A non-faulty process waits for n − f witnesses for each value
• Every pair of non-faulty processes has at least n − 2f > f common witnesses
• We will use this technique together with Reliable-Broadcast
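The counting behind the common-witness claim can be verified exhaustively for small n; a sketch with illustrative parameters:

```python
from itertools import combinations

# Counting argument behind the witness technique: each non-faulty
# process collects a witness set of size n - f; any two such sets drawn
# from n processes overlap in at least n - 2f elements, which exceeds f
# when n > 3f -- so the overlap contains at least one non-faulty process.
n, f = 4, 1
witness_sets = list(combinations(range(n), n - f))
min_overlap = min(len(set(a) & set(b))
                  for a in witness_sets for b in witness_sets)
print(min_overlap)       # 2, i.e. n - 2f
print(min_overlap > f)   # True: some common witness must be non-faulty
```
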
AAD and AAD-broadcast
• Algorithm for approximate scalar agreement proposed in: I. Abraham, Y. Amit, D. Dolev. Optimal Resilience Asynchronous Approximate Agreement, 2004
• We will refer to it as AAD
• AAD works in asynchronous rounds
• Each round, a combination of reliable broadcast and the witness technique is used for communication
• We will call this combined algorithm AAD-broadcast
AAD-broadcast properties
• At the end of AAD-broadcast in round r the following properties hold:
– Common knowledge. Any two non-faulty processes learn at least n − f identical tuples
– Uniqueness. A process p cannot receive two tuples (q, msg1, r), (q′, msg2, r) such that q = q′ and msg1 ≠ msg2
– Non-forgeability. If process p gets message m from q, then q indeed broadcast m.
And now for something completely different
Geometric Primitives
In this section, we present geometric theorems, algorithms, definitions and ideas
that we will use as primitive building blocks for decision vector computation in BVC algorithms
Tverberg’s Theorem: Informal
• Let’s assume we have n points in ℝ^d
• We want to find a good partition of our points into several subsets
• Good partition: we want all these subsets to have something in common
• The convex hull of all points in a subset is a region of ℝ^d
• We want all these regions to have at least one common point
• In other words, all these regions must have a non-empty intersection
• We argue that for n large enough such a partition exists, independently of the actual points
Example: Tverberg’s Theorem
n = 7, d = 2, f = 2. We want f+1=3 subsets
Tverberg’s Theorem: formal
• For any integer f ≥ 1
• For every multiset Y containing at least (d+1)f + 1 points in ℝ^d
• There exists a partition Y1, …, Yf+1 of Y into f+1 non-empty multisets such that
– ∩(l=1..f+1) H(Yl) ≠ ∅
• H(X) is the convex hull of all points in X
• Such a partition is called a Tverberg partition
• All points in the common intersection are called Tverberg points
• Proof: http://gilkalai.wordpress.com/2008/11/26/sarkarias-proof-of-tverbergs-theorem-2
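In dimension d = 1 the theorem is easy to see directly: any (d+1)f + 1 = 2f+1 reals admit such a partition, with the median as a common point. A sketch of this special case (the construction is ours, not the general proof):

```python
# Tverberg's theorem in dimension d = 1: any 2f+1 reals can be split
# into f+1 multisets whose convex hulls (intervals) all contain the
# median. Construction: pair the i-th smallest with the i-th largest
# point, and keep the median as a singleton part.

def tverberg_1d(points, f):
    assert len(points) >= 2 * f + 1
    ys = sorted(points)[: 2 * f + 1]         # (d+1)f + 1 points suffice
    parts = [[ys[i], ys[-1 - i]] for i in range(f)]
    parts.append([ys[f]])                     # the median: a common point
    return parts, ys[f]

parts, common = tverberg_1d([5, 1, 9, 2, 7], f=2)
# Every part's interval [min, max] contains the common (Tverberg) point.
print(parts)                                            # [[1, 9], [2, 7], [5]]
print(all(min(p) <= common <= max(p) for p in parts))   # True
```
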
The Γ function
• Assume a multiset Y of n points in ℝ^d
• Assume an integer f < n
• Define Γ(Y) = ∩ { H(T) : T ⊆ Y, |T| = n − f }
• In other words:
– Take the convex hull of each (n−f)-size subset of Y
– Intersect all of them
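For d = 1 the convex hulls are intervals, so Γ can be computed directly from the definition by brute force; a sketch (the function name is ours):

```python
from itertools import combinations

# The Gamma function in dimension d = 1, straight from the definition:
# intersect the convex hulls (intervals) of all (|Y| - f)-size
# sub-multisets of Y.

def gamma_1d(Y, f):
    m = len(Y) - f
    lo = max(min(T) for T in combinations(Y, m))   # largest left endpoint
    hi = min(max(T) for T in combinations(Y, m))   # smallest right endpoint
    return (lo, hi) if lo <= hi else None          # None = empty intersection

# n = 5 points, f = 1: Gamma is the interval between the 2nd-smallest
# and the 2nd-largest point -- the f extreme points on each side are
# effectively discarded.
print(gamma_1d([0, 1, 4, 6, 10], f=1))   # (1, 6)
```
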
Example: n = 5, f = 1, d = 2
Example: n = 4, f = 1, d = 3
• No common intersection of 4 faces – Gamma is empty
Non-empty lemma
• Lemma: For any multiset Y containing at least (d+1)f + 1 points in ℝ^d, Γ(Y) ≠ ∅
• Proof:
• Consider a Tverberg partition of Y into f+1 subsets. Since |Y| ≥ (d+1)f + 1, such a partition exists. Let’s call it Q
• Reminder: Γ(Y) = ∩ { H(T) : T ⊆ Y, |T| = |Y| − f }
• For every such T: there are f+1 non-empty subsets in partition Q, and T excludes at most f elements of Y, so T excludes elements from at most f of these subsets
• Thus every T fully contains one of the subsets from Q
Non-empty lemma (cont.)
• Every T fully contains one of the subsets Yl from Q
• Because Q is a Tverberg partition, ∩(l) H(Yl) ≠ ∅; pick a point z in this intersection
• So H(T), for every T, fully contains some H(Yl), and therefore contains z
• Thus z ∈ Γ(Y)
• We have proven: For any multiset Y containing at least (d+1)f + 1 points in ℝ^d, Γ(Y) ≠ ∅
Linear Programming
• A technique for the optimization of a linear function, subject to linear equality and linear inequality constraints
• Allows solution of problems of the following type: Maximize f(X) subject to C1(X), C2(X), …, Ck(X)
– f(X) is a linear function
– C1(X), …, Ck(X) are k linear constraints on the value of X
• Time complexity: different implementations yield different results
– Polynomial in d and n
– Or linear in n, exponential in d
Rest for a minute…
Exact BVC in a Synchronous System
In this section, we derive necessary and sufficient conditions for exact BVC in a synchronous system with up to f faulty processes. The discussion in the rest of this paper assumes that the network is a complete graph, even if this is not stated explicitly.
Necessary Condition for Exact BVC
• We will prove that n ≥ max(3f + 1, (d+1)f + 1) is necessary for exact BVC
– n is the number of processes
– f is the number of faulty processes
– d is the dimension of the input vectors
Necessary Condition for Exact BVC
• The necessary condition for scalar Byzantine consensus is n ≥ 3f + 1
• Basic reduction from the scalar to the vector problem:
– Assume a solution exists for inputs in ℝ^d
– Assume Xi is the scalar input of process Pi
– Define Yi = [Xi, Xi, …, Xi] (vector in ℝ^d, all entries equal to Xi)
– Because of the validity condition the decision vector is of the form [Di, Di, …, Di] (vector in ℝ^d, all entries equal to Di)
– The first component of the decision vector is a correct solution of the original scalar problem:
• Agreement holds: all processes agree on the same decision vector
• Termination holds: the BVC algorithm terminates
• Validity holds: the decision is in the convex hull of the Yi’s, so Di is between the max and min of the Xi’s
• So n ≥ 3f + 1 is necessary for exact BVC too
Necessary Condition for Exact BVC
• Assume f = 1, n = d+1
• For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 1 (the i-th unit vector)
• Xd+1 is the all-zero vector
• For 1 ≤ i ≤ d+1, let Hi denote the convex hull of the inputs of all processes except Pi
• Define I = ∩(i=1..d+1) Hi
• Claim: I = ∅
Necessary Condition for Exact BVC
• Hi is the convex hull of the inputs of all processes except Pi
• Claim: I = ∩(i=1..d+1) Hi = ∅
• Proof:
• For 1 ≤ i ≤ d, only Pi has a non-zero i-th component in its input vector. Thus all points in Hi must have a zero i-th component
• Applying the above for all 1 ≤ i ≤ d: every point in I must be the all-zero vector 0
• But Hd+1 is the hull of the d unit vectors, and 0 cannot be expressed as a convex combination of independent unit vectors
• Thus I = ∅
Necessary Condition for Exact BVC, I = ∅
• Claim: the decision vector must be in I
• Explanation:
• From the point of view of Pi, every other process Pj, j ≠ i, can be faulty
• To satisfy validity, the decision vector must be in Hj for each j ≠ i
• So the decision vector must be in ∩(j≠i) Hj
• But for another process Pk, k ≠ i, the decision vector must be in ∩(j≠k) Hj
• Because these two processes must agree on the same decision vector, it must be in ∩(all j) Hj = I
Necessary Condition for Exact BVC, I = ∅, the decision vector is in I
• For f=1, n=d+1 we have shown:
– The decision vector must be in I
– I = ∅
• Thus there is no legal decision vector under the given assumptions
– These assumptions: f = 1, n = d+1
• So for f = 1, n ≥ d+2 = (d+1)f + 1 is necessary
Necessary Condition for Exact BVC
• Reduction from the problem with 1 faulty process:
– Assume a solution exists for f > 1
– Take the original n processes and duplicate each one f times. We end up with f·n processes, f of which are Byzantine
– Solve the problem with the f·n processes
– The solution is a legal solution of the original problem
• We have proven: the necessary condition for exact BVC is n ≥ max(3f + 1, (d+1)f + 1)
Rest for a minute…
Algorithm for finding exact BVC
• Assume the necessary condition n ≥ max(3f + 1, (d+1)f + 1) holds
• The algorithm:
– Reliable-Broadcast your input vector
– Let Si denote the multiset of input values that Pi has received after the Reliable-Broadcast finished
– Each process chooses a decision vector from Γ(Si) using a deterministic function
• Claim: this decision vector is a legal solution of the exact BVC problem.
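A toy run of the algorithm for scalar inputs (d = 1), with reliable broadcast idealized so that every process ends up holding the same multiset S; all values are illustrative:

```python
from itertools import combinations

# One synchronous round of the exact algorithm, simulated for d = 1,
# where Gamma can be computed by intersecting intervals. Reliable
# broadcast is idealized: all processes hold the same multiset S.

def gamma_1d(Y, f):
    m = len(Y) - f
    lo = max(min(T) for T in combinations(Y, m))
    hi = min(max(T) for T in combinations(Y, m))
    return lo, hi

f = 1                           # n = 4 processes, one Byzantine
S = [2.0, 3.0, 5.0, 100.0]      # 100.0 was broadcast by the faulty process

lo, hi = gamma_1d(S, f)
decision = lo                   # deterministic choice: the left endpoint
print((lo, hi))                 # (3.0, 5.0)
print(decision)                 # 3.0 -- inside [2.0, 5.0], the honest hull
```

Since every process applies the same deterministic choice to the same Γ(S), agreement is immediate, and the Byzantine outlier 100.0 is clipped away by the intersection.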
Algorithm for finding exact BVC, proof: termination
• The algorithm is synchronous. The first part (Reliable-Broadcast) terminates after some pre-defined time period.
• The second part does not involve communication and thus terminates
• So the BVC algorithm is guaranteed to terminate
Algorithm for finding exact BVC, proof: agreement
• One of the properties of synchronous Reliable-Broadcast is that by the end of the broadcast all non-faulty processes have received the same set of values
• So Si is the same for all non-faulty processes; call it S
• Because n ≥ (d+1)f + 1, by the non-empty Gamma lemma Γ(S) ≠ ∅
• Because the decision vector is chosen from Γ(S) using a deterministic function, all the processes acquire the same decision vector
• So agreement holds.
Algorithm for finding exact BVC, proof: validity
• Denote by V the multiset of inputs of only the non-faulty processes
• Reminder: Γ(S) = ∩ { H(T) : T ⊆ S, |T| = n − f }
• There are at most f faulty processes, so at least one of the (n−f)-size subsets T above contains inputs of only non-faulty processes. Denote it by T′
• The decision vector is in Γ(S) ⊆ H(T′) ⊆ H(V)
• So validity holds.
• We have proven: a sufficient condition for exact BVC is n ≥ max(3f + 1, (d+1)f + 1)
Using linear programming to get the decision vector
• Given: our feasible region is Γ(Si)
• Define some linear ordering f() of points in ℝ^d
• For example, the lexicographic ordering of points
• Linear program formulation:
• Minimize f(x) under the following constraints:
– x ∈ H(T) for every T ⊆ Si with |T| = n − f
Using linear programming to get the decision vector (cont.)
• There is a total of C(n, f) hull-membership constraints, one for each (n−f)-size subset of Si
Rest for a minute…
Necessary Condition for Approximate Asynchronous BVC
• We will prove that n ≥ (d+2)f + 1 is necessary for approximate asynchronous BVC
• As opposed to n ≥ max(3f + 1, (d+1)f + 1) in the synchronous exact case
– n is the number of processes
– f is the number of faulty processes
– d is the dimension of the input vectors
Necessary Condition for Approximate BVC
• Assume f = 1, n = d+2
• For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 2ε
• Xd+1, Xd+2 are all-zero vectors
Necessary Condition for Approximate BVC, f = 1, n = d+2
• The algorithm must tolerate a single failure
• So P1 must terminate in all executions where Pd+2 does not take any steps until all the other processes terminate
• When P1 terminates, it cannot distinguish between the following d+1 scenarios:
– Pd+2 is Byzantine and has crashed
– Pd+2 is not Byzantine but just slow, and process Pj is Byzantine (2 ≤ j ≤ d+1)
Necessary Condition for Approximate BVC, f = 1, n = d+2
• Pd+2 is Byzantine and has crashed:
– the decision of P1 must be in H({X1, …, Xd+1}), the hull of the non-faulty inputs
• Pd+2 is not Byzantine but just slow, and process Pj is Byzantine (2 ≤ j ≤ d+1):
– Pd+2 has not taken any steps, so P1 cannot use its input
– P1 cannot trust the input of Pj
– the decision of P1 must be in Dj, the hull of {X1, …, Xd+1} without Xj
• Combining all scenarios: the decision of P1 must be in ∩(j=2..d+1) Dj
Necessary Condition for Approximate BVC, f = 1, n = d+2, the decision of P1 must be in ∩(j=2..d+1) Dj
• Reminder: Dj is the hull of {X1, …, Xd+1} without Xj
• Claim: the only possible decision of P1 is X1
• Explanation:
• The decision of P1 must be in ∩(j=2..d+1) Dj, so in particular, for every 2 ≤ j ≤ d, the decision of P1 must be in Dj
• Pj is the only process whose input has a non-zero j-th component, and Dj does not contain Xj
• Thus all vectors in Dj have a zero j-th component, for every 2 ≤ j ≤ d
• The decision of P1 must also be in Dd+1, so it should be a convex combination of the d non-zero independent input vectors X1, …, Xd
Necessary Condition for Approximate BVC, f = 1, n = d+2. Claim: the decision of P1 is X1
1. All vectors in Dj have a zero j-th component, for every 2 ≤ j ≤ d
2. The decision of P1 must also be in Dd+1, so it should be a convex combination of the d non-zero independent input vectors X1, …, Xd
3. Because of (1) and the way we chose the inputs, the coefficients of X2, …, Xd must be zero
4. Because of (3) and the fact that the decision of P1 is a convex combination of X1, …, Xd, the coefficient of X1 must be 1
5. So the decision of P1 must be X1
Necessary Condition for Approximate BVC, the only possible decision of P1 is X1
• Reminder:
– For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 2ε
– Xd+1, Xd+2 are all-zero vectors
• So the first component of the decision of P1 equals 2ε
• By the symmetric argument, the first component of the decision of every other process equals 0
• So it is impossible to achieve ε-agreement: the first components differ by 2ε > ε.
Necessary Condition for Approximate BVC
• The same reduction as in the synchronous case works here
– Simulate each process f times
• We have proven: the necessary condition for approximate BVC is n ≥ (d+2)f + 1
Rest for a minute…
Algorithm for finding approximate BVC: assumptions
• Assume the necessary condition n ≥ (d+2)f + 1 holds
• W.l.o.g. assume m processes P1..Pm are non-faulty, where n − f ≤ m ≤ n, and the remaining n − m processes are faulty
• Assume an upper bound U and a lower bound ν on the values of the d elements in the input vectors at non-faulty processes
– This assumption simplifies the BVC algorithm; it can work without it by agreeing on the values of U, ν in an additional stage of the algorithm
• Assume process Pi has input vector Xi
• Pi maintains a current value Vi, initialized to Xi
• Pi maintains multisets Bi, B’i and Zi, initialized to the empty set
Algorithm for finding approximate BVC: round notation
• The algorithm works in rounds
– Round number t, 1 ≤ t ≤ numrounds
• A variable denoted X[t] means the value of variable X at the end of round t
– For instance, Zi[t] means the multiset Zi of process i at the end of round t
– The value of X at the beginning of round t is X[t−1]
– X[0] means the initial value of X
• We will omit the round notation where it is not necessary
Algorithm for finding approximate BVC: additional definitions
• A point r at round t is said to be valid if there exists a representation of r as a convex combination of the Vi[t−1]’s of the non-faulty processes
– The coefficient of Vi[t−1] in that combination is said to be the weight of Vi[t−1]
– Note that if a point can be expressed as a convex combination of valid points, it is also valid.
• At any time, every valid point has all its coordinates within [ν, U]
– Assuming n > 1, n > f:
• numrounds = ⌈log((U − ν)/ε) / log(1/ρ)⌉, where ρ < 1 is the per-round contraction factor of the span established in the ε-agreement proof
Algorithm for finding approximate BVC
• Run numrounds rounds
• In the t-th round:
1. Send Vi[t−1] using AAD-broadcast
a. All tuples received by process Pi are stored in B’i, the respective messages in Bi
2. After obtaining Bi[t], compute Vi[t] as follows:
a. Initialize Zi to empty
b. For each multiset C ⊆ Bi such that |C| = n − f:
1. add to Zi one deterministically chosen point from Γ(C)
c. Set Vi[t] to the average of the points in Zi
• Claim: Vi[numrounds] is a legal decision vector of approximate BVC
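A heavily simplified, idealized simulation of the iteration for scalar values (d = 1) shows the span of the non-faulty values shrinking round after round. This sketch replaces the AAD machinery with a fixed message-delivery pattern and is not the paper's algorithm; all parameters are illustrative:

```python
from itertools import combinations

# Idealized rounds for d = 1 with n = 5, f = 1: four honest processes
# and one Byzantine sender. Each round, every honest process gathers
# n - f values (it misses one slow honest process and receives the
# Byzantine value instead), clips the f extremes on each side by
# intersecting the hulls of the (|B| - f)-size subsets, and takes the
# midpoint of the resulting interval as its new value.

def gamma_lo_hi(Y, f):
    m = len(Y) - f
    return (max(min(T) for T in combinations(Y, m)),
            min(max(T) for T in combinations(Y, m)))

def round_update(honest, byz_value, f):
    new = []
    for i in range(len(honest)):
        # process i misses honest process (i+1) mod m this round
        B = [v for j, v in enumerate(honest) if j != (i + 1) % len(honest)]
        B.append(byz_value)
        lo, hi = gamma_lo_hi(B, f)
        new.append((lo + hi) / 2)
    return new

vals = [0.0, 1.0, 4.0, 9.0]               # honest values, span 9
spans = []
for t in range(6):
    vals = round_update(vals, byz_value=1000.0, f=1)
    spans.append(max(vals) - min(vals))
print(spans)        # the span shrinks every round; 1000.0 never leaks in
```

Note how the extreme Byzantine value is always discarded by the interval intersection, so the honest values both converge and stay inside the original honest hull [0, 9].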
Algorithm for finding approximate BVC, proof: termination
• AAD-broadcast is the only communication primitive that we use
• It is guaranteed to terminate under our assumptions.
• So the BVC algorithm is guaranteed to terminate
Algorithm for finding approximate BVC, proof: validity
• At any round t, at any non-faulty process Pi, consider any C in step 2.b of the algorithm
• By the non-empty lemma: |C| = n − f ≥ (d+1)f + 1, so Γ(C) ≠ ∅, and Zi will contain a point from Γ(C) for each C.
• There are at most f faulty processes, and |C| = n − f. So at least one of the (n−2f)-size subsets of C contains values of only non-faulty processes
• Therefore all points in Γ(C) are valid: Γ(C) is inside the hull of that all-non-faulty subset
• So all points in Zi are valid
• Because Vi[t] is an average of the points in Zi, it is also valid
• Therefore Vi[numrounds] can be expressed as a convex combination of the Vj[0]’s of the non-faulty processes
• So validity holds
Algorithm for finding approximate BVC, proof: ε-agreement
• Consider two non-faulty processes Pi and Pj
• We have shown that all points in Zi and Zj are valid
• By the Common knowledge property of AAD-broadcast, Pi and Pj learn at least n − f identical tuples
• So Zi and Zj contain at least one common point. Call it z
• It is valid, so z is a convex combination of the Vg[t−1]’s of the non-faulty processes
• There exists some non-faulty process Pg such that the weight of Vg[t−1] in z is at least 1/n (the weights sum to 1 over at most n points)
Algorithm for finding approximate BVC, proof: ε-agreement (cont.)
• Reminder: Zi and Zj contain a common valid point z; there exists Pg whose value Vg[t−1] has weight at least 1/n in z
• Reminder: for each C such that |C| = n − f we add one value to Zi
• Because of the Uniqueness property of AAD-broadcast, we know that the values Pi and Pj associate with common tuples are identical
• Because of these facts, Vi[t] and Vj[t] each place a positive constant weight on the common point z, which pulls their components toward each other
Algorithm for finding approximate BVC, proof: ε-agreement (cont.)
• Reminder: numrounds rounds are executed
• Define span(L)[t]: the distance between the maximal and minimal L-th components of all the Vi[t]
• Claim: for each L: span(L)[t] ≤ ρ · span(L)[t−1], for some constant ρ < 1
• Proof: in Appendix 1. For now assume the claim is correct
• At each round the span reduces by a factor of ρ; thus by repeated application numrounds times, the initial maximal span (U − ν) is decreased below ε
• Recall our definition of numrounds: numrounds = ⌈log((U − ν)/ε) / log(1/ρ)⌉
• So ε-agreement holds.
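Given a per-round contraction factor ρ, the round count follows by solving (U − ν)·ρ^r ≤ ε for the smallest integer r. A sketch (ρ is an assumed parameter here; the paper's analysis derives its concrete value):

```python
import math

# Number of rounds needed so that the initial span U - nu shrinks
# below eps, given a per-round span-contraction factor rho < 1.
# rho is an assumed input, not a value taken from the paper.

def numrounds(U, nu, eps, rho):
    span0 = U - nu                     # worst-case initial span
    if span0 <= eps:
        return 0
    # smallest integer r with span0 * rho**r <= eps
    return math.ceil(math.log(span0 / eps) / math.log(1 / rho))

r = numrounds(U=1.0, nu=0.0, eps=0.01, rho=0.5)
print(r)                               # 7, since 0.5**7 < 0.01 <= 0.5**6
```
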
Wrap Up
• We have met the Byzantine consensus problem
• We have defined the Byzantine vector consensus problem
• We have met Tverberg’s theorem
• We have proven: the necessary and sufficient condition for approximate BVC in an asynchronous system is n ≥ (d+2)f + 1
• We have proven: the necessary and sufficient condition for exact BVC in a synchronous system is n ≥ max(3f + 1, (d+1)f + 1)
• We have learned an algorithm for exact BVC in a synchronous system
• We have learned an algorithm for approximate BVC in an asynchronous system
Questions?
Appendix 1: proof of the span-reduction claim in the ε-agreement proof
• Define span(L)[t]: the distance between the maximal and minimal L-th components of all the Vi[t]
• Claim: for each L: span(L)[t] ≤ ρ · span(L)[t−1], for some constant ρ < 1
• Proof:
Appendix 1: proof of the span-reduction claim (cont.)
• There exist constants such that:
• Focusing on operations on the l-th component (max):
• Focusing on operations on the l-th component (min):