Byzantine Vector Consensus in Complete Graphs
Paper by:Nitin H. Vaidya - University of Illinois
Vijay K. Garg - University of Texas
Presented by: Dima Ogurtsov
In This Presentation We Will
• Briefly introduce the Byzantine scalar consensus problem
• Introduce the Byzantine vector consensus (BVC) problem
• Present the geometric and communication primitives that will be used in the BVC algorithms
• Provide a necessary and sufficient condition for achieving exact BVC in a synchronous system (complete graph)
• Provide a necessary and sufficient condition for achieving approximate BVC in an asynchronous system (complete graph)
• Provide algorithms for both versions of BVC
Agenda
• Introduction
– The Byzantine Generals problem
– Fun facts about scalar Byzantine consensus
– Introducing Byzantine vector consensus (BVC)
– Why it is a non-trivial problem even given a scalar consensus solution
• Communication primitives
– Reliable broadcast
– Witness technique
• Geometric primitives
– Tverberg’s theorem
– Linear programming
• Exact BVC in synchronous systems
– Necessary condition
– Algorithm and sufficient condition
• Approximate BVC in asynchronous systems
– Necessary condition
– Algorithm and sufficient condition
• Wrap up
Introduction
The Byzantine Generals Problem
• Imagine several divisions of the Byzantine army are camped outside of an enemy city
• Each division is commanded by its own general
• Generals communicate with each other by messages
• Generals must decide on a common plan of action
• However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement
• We want to give an algorithm which guarantees a good solution
What is a Good Solution
• It depends on the specific problem
• In general, consensus problems can be divided into two types:
– Exact consensus: all generals* get the same value
– Approximate consensus: all generals* get values which are close enough (allowing some error margin)
• In both cases we may want to put a constraint on the value that every general gets, such that a solution will be “good”. We will call this constraint a validity condition
• We will discuss the problem in d-dimensional real space, and we will choose a convexity constraint as the validity condition
• * All loyal generals, for Byzantine systems
Exact Byzantine Consensus Problem
• Assume a system of n processes
• f out of the n processes are faulty and can behave arbitrarily
• Each process Pi has a scalar input Xi
• Find a decision value that satisfies the following:
– Agreement: The decision value of all non-faulty processes is identical
– Validity: The decision value of all non-faulty processes is in the convex hull of the input values of all non-faulty processes
– Termination: Each non-faulty process must terminate within a finite amount of time
Approximate Byzantine Consensus Problem
• Assume a system of n processes
• f out of the n processes are faulty and can behave arbitrarily
• Each process Pi has a scalar input Xi
• Find a decision value that satisfies the following:
– ε-Agreement: The decision values of any two non-faulty processes must be within ε of each other, where ε > 0 is a predefined constant
– Validity: The decision value of all non-faulty processes is in the convex hull of the input values of all non-faulty processes
– Termination: Each non-faulty process must terminate within a finite amount of time
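The three conditions above are mechanical to check for scalar values, since for scalars the convex hull is just the [min, max] interval. A minimal Python sketch (all function names and values here are ours, not from the paper):

```python
# Check the conditions of scalar Byzantine consensus, restricted to the
# non-faulty processes. Illustrative names and values only.

def check_validity(decisions, honest_inputs):
    """Every non-faulty decision lies in the convex hull of the
    non-faulty inputs (for scalars: the [min, max] interval)."""
    lo, hi = min(honest_inputs), max(honest_inputs)
    return all(lo <= d <= hi for d in decisions)

def check_exact_agreement(decisions):
    """Exact consensus: all non-faulty decisions are identical."""
    return len(set(decisions)) == 1

def check_eps_agreement(decisions, eps):
    """Approximate consensus: decisions are within eps of each other."""
    return max(decisions) - min(decisions) <= eps

honest_inputs = [0.0, 1.0, 4.0]        # inputs of non-faulty processes
decisions = [1.0, 1.1, 0.9]            # hypothetical decision values
print(check_validity(decisions, honest_inputs))   # True
print(check_exact_agreement(decisions))           # False
print(check_eps_agreement(decisions, eps=0.5))    # True
```

The same checks generalize to vectors, except that validity then requires a genuine convex-hull membership test, which is exactly why the vector problem is harder.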
Synchronous and Asynchronous Systems
• The problem needs to be solved separately for synchronous and asynchronous systems
– The asynchronous solution will work for a synchronous system too, but may be non-optimal
• The asynchronous system introduces several difficulties that can influence the solution
– Arbitrary delays of messages
– No synchronized clocks
– Cannot distinguish between a crashed process and a merely slow one
Fun Facts About Scalar Byzantine Consensus
• Necessary and sufficient condition for exact Byzantine consensus in a synchronous system:
– L. Lamport, R. Shostak, and M. Pease. The Byzantine Generals Problem, 1982
• Exact Byzantine consensus cannot be achieved in an asynchronous system:
– M. J. Fischer, N. A. Lynch, and M. S. Paterson. Impossibility of Distributed Consensus with One Faulty Process, 1985
• Necessary and sufficient condition for approximate Byzantine consensus in an asynchronous system:
– I. Abraham, Y. Amit, D. Dolev. Optimal Resilience Asynchronous Approximate Agreement, 2004
Introducing BVC Problem
• Assume a system of n processes
• f out of the n processes are faulty or Byzantine and can behave arbitrarily
• Each process Pi has a d-dimensional input vector Xi in ℝ^d
• Find a decision vector that satisfies the following:
• Validity: The decision vector at each non-faulty process must be in the convex hull of the input vectors at the non-faulty processes.
• Termination: Each non-faulty process must terminate within a finite amount of time
• For exact BVC:
– Agreement: The decision vector at all the non-faulty processes must be identical.
• For approximate BVC:
– ε-Agreement: For 1 ≤ l ≤ d, the l-th elements of the decision vectors at any two non-faulty processes must be within ε of each other, where ε > 0 is a pre-defined constant.
Why not scalar Byzantine consensus on each dimension
• One might think that the BVC problem can be solved by simply performing scalar consensus on each dimension of the input vectors independently
• But in reality, even if the validity condition for scalar consensus is satisfied for each dimension of the vector separately, the validity condition for the decision vector may not be satisfied
Why not scalar Byzantine consensus on each dimension (2)
• For example let’s take n=4, f=1, d=3. Processes p1, p2 and p3 are not faulty and their input vectors are [1,0,0], [0,1,0], and [0,0,1] respectively. p4 is faulty.
• If we perform Byzantine scalar consensus on each dimension of the vector separately, then the processes may possibly agree on the decision vector [0,0,0] which satisfies scalar validity condition along each dimension separately
• However, the decision vector is not in the convex hull of the non-faulty inputs: the inputs are probability vectors (coordinates summing to 1), while the decision vector is not.
• So the decision vector is not valid.
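The counterexample can be checked numerically: every convex combination of the three unit-vector inputs has coordinates summing to 1, so the per-dimension decision (0, 0, 0), whose coordinates sum to 0, is unreachable. A small illustrative sketch:

```python
# Non-faulty inputs from the example: the three unit vectors in R^3.
inputs = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def convex_combination(points, weights):
    """Weighted sum of points; weights must be non-negative and sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9 and all(w >= 0 for w in weights)
    return tuple(sum(w * p[k] for w, p in zip(weights, points))
                 for k in range(len(points[0])))

# Sweep a grid of weights (w1 + w2 + w3 = 1) and confirm the invariant:
# the coordinate sum of every convex combination stays equal to 1.
for a in range(11):
    for b in range(11 - a):
        point = convex_combination(inputs, [a / 10, b / 10, (10 - a - b) / 10])
        assert abs(sum(point) - 1.0) < 1e-9

print("all convex combinations sum to 1; (0,0,0) is unreachable")
```
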
Why not scalar Byzantine consensus on each dimension – visual example
Communication Primitives
In this section, we present distributed algorithms and their properties which we
will use as primitive building blocks for communication in our BVC algorithms
Reliable Broadcast
• Two procedures: Reliable-Broadcast(m, r) and Reliable-Accept(p, m, r).
• Guarantees the following properties:
– Correctness. If a non-faulty process p with a message m on round r performs Reliable-Broadcast(m, r) then all non-faulty processes will eventually Reliable-Accept(p, m, r).
– Non-forgeability. If a non-faulty process p does not perform at round r Reliable-Broadcast(m, r) then no non-faulty process will ever perform Reliable-Accept(p, m, r).
– Uniqueness. If a non-faulty process performs Reliable-Accept(p, m, r) and another non-faulty process performs Reliable-Accept(p, m’, r) then m = m’;
Witness Technique
• Assuming n > 3f
• A witness for p is a process whose first n − f accepted values were also accepted by p
• A non-faulty process waits for n − f witnesses for each value
• Every pair of non-faulty processes has at least n − 2f > f common witnesses
• We will use this technique together with Reliable-Broadcast
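The counting behind the common-witness claim can be verified exhaustively for small n; a sketch with illustrative parameters:

```python
from itertools import combinations

# Counting argument behind the witness technique: each non-faulty
# process collects a witness set of size n - f; any two such sets drawn
# from n processes overlap in at least n - 2f elements, which exceeds f
# when n > 3f -- so the overlap contains at least one non-faulty process.
n, f = 4, 1
witness_sets = list(combinations(range(n), n - f))
min_overlap = min(len(set(a) & set(b))
                  for a in witness_sets for b in witness_sets)
print(min_overlap)       # 2, i.e. n - 2f
print(min_overlap > f)   # True: some common witness must be non-faulty
```
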
AAD and AAD-broadcast
• Algorithm for approximate scalar agreement proposed in: I. Abraham, Y. Amit, D. Dolev. Optimal Resilience Asynchronous Approximate Agreement, 2004
• We will refer to it as AAD
• AAD works in asynchronous rounds
• Each round, a combination of reliable broadcast and the witness technique is used for communication
• We will call this combined algorithm AAD-broadcast
AAD-broadcast properties
• At the end of AAD-broadcast in round r the following properties hold:
– Common knowledge. Any two non-faulty processes learn at least n − f identical tuples
– Uniqueness. A process p cannot receive two tuples (q, msg1, r), (q′, msg2, r) such that q = q′ and msg1 ≠ msg2
– Non-forgeability. If process p gets message m from q, then q indeed broadcast m.
And now for something completely different
Geometric Primitives
In this section, we present geometric theorems, algorithms, definitions and ideas
that we will use as primitive building blocks for decision vector computation in BVC algorithms
Tverberg’s Theorem: Informal
• Let’s assume we have n points in ℝ^d
• We want to find a good partition of our points into several subsets
• Good partition: we want all these subsets to have something in common
• The convex hull of all points in a subset is a region of ℝ^d
• We want all these regions to have at least one common point
• In other words, all these regions must have a non-empty intersection
• We argue that for n large enough such a partition exists, independently of the actual points
Example: Tverberg’s Theorem
n = 7, d = 2, f = 2. We want f+1=3 subsets
Tverberg’s Theorem: formal
• For any integer f ≥ 1
• For every multiset Y containing at least (d+1)f + 1 points in ℝ^d
• There exists a partition Y1, …, Yf+1 of Y into f+1 non-empty multisets such that
– ∩(l=1..f+1) H(Yl) ≠ ∅
• H(X) is the convex hull of all points in X
• Such a partition is called a Tverberg partition
• All points in the common intersection are called Tverberg points
• Proof: http://gilkalai.wordpress.com/2008/11/26/sarkarias-proof-of-tverbergs-theorem-2
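In dimension d = 1 the theorem is easy to see directly: any (d+1)f + 1 = 2f+1 reals admit such a partition, with the median as a common point. A sketch of this special case (the construction is ours, not the general proof):

```python
# Tverberg's theorem in dimension d = 1: any 2f+1 reals can be split
# into f+1 multisets whose convex hulls (intervals) all contain the
# median. Construction: pair the i-th smallest with the i-th largest
# point, and keep the median as a singleton part.

def tverberg_1d(points, f):
    assert len(points) >= 2 * f + 1
    ys = sorted(points)[: 2 * f + 1]         # (d+1)f + 1 points suffice
    parts = [[ys[i], ys[-1 - i]] for i in range(f)]
    parts.append([ys[f]])                     # the median: a common point
    return parts, ys[f]

parts, common = tverberg_1d([5, 1, 9, 2, 7], f=2)
# Every part's interval [min, max] contains the common (Tverberg) point.
print(parts)                                            # [[1, 9], [2, 7], [5]]
print(all(min(p) <= common <= max(p) for p in parts))   # True
```
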
The Γ function
• Assume a multiset Y of n points in ℝ^d
• Assume an integer f < n
• Define Γ(Y) = ∩ { H(T) : T ⊆ Y, |T| = n − f }
• In other words:
– Take the convex hull of each (n−f)-size subset of Y
– Intersect all of them
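For d = 1 the convex hulls are intervals, so Γ can be computed directly from the definition by brute force; a sketch (the function name is ours):

```python
from itertools import combinations

# The Gamma function in dimension d = 1, straight from the definition:
# intersect the convex hulls (intervals) of all (|Y| - f)-size
# sub-multisets of Y.

def gamma_1d(Y, f):
    m = len(Y) - f
    lo = max(min(T) for T in combinations(Y, m))   # largest left endpoint
    hi = min(max(T) for T in combinations(Y, m))   # smallest right endpoint
    return (lo, hi) if lo <= hi else None          # None = empty intersection

# n = 5 points, f = 1: Gamma is the interval between the 2nd-smallest
# and the 2nd-largest point -- the f extreme points on each side are
# effectively discarded.
print(gamma_1d([0, 1, 4, 6, 10], f=1))   # (1, 6)
```
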
Example: n = 5, f = 1, d = 2
Example: n = 4, f = 1, d = 3
• No common intersection of 4 faces – Gamma is empty
Non-empty lemma
• Lemma: For any multiset Y containing at least (d+1)f + 1 points in ℝ^d, Γ(Y) ≠ ∅
• Proof:
• Consider a Tverberg partition of Y into f+1 subsets. Since |Y| ≥ (d+1)f + 1, such a partition exists. Let’s call it Q
• Reminder: Γ(Y) = ∩ { H(T) : T ⊆ Y, |T| = |Y| − f }
• For every such T: there are f+1 non-empty subsets in partition Q, and T excludes at most f elements of Y, so T excludes elements from at most f of these subsets
• Thus every T fully contains one of the subsets from Q
Non-empty lemma (cont.)
• Every T fully contains one of the subsets Yl from Q
• Because Q is a Tverberg partition, ∩(l) H(Yl) ≠ ∅; pick a point z in this intersection
• So H(T), for every T, fully contains some H(Yl), and therefore contains z
• Thus z ∈ Γ(Y)
• We have proven: For any multiset Y containing at least (d+1)f + 1 points in ℝ^d, Γ(Y) ≠ ∅
Linear Programming
• A technique for the optimization of a linear function, subject to linear equality and linear inequality constraints
• Allows solution of problems of the following type: Maximize f(X) subject to C1(X), C2(X), …, Ck(X)
– f(X) is a linear function
– C1(X), …, Ck(X) are k linear constraints on the value of X
• Time complexity: different implementations yield different results
– Polynomial in d and n
– Or linear in n, exponential in d
Rest for a minute…
Exact BVC in a Synchronous System
In this section, we derive necessary and sufficient conditions for exact BVC in a synchronous system with up to f faulty processes. The discussion in the rest of this paper assumes that the network is a complete graph, even if this is not stated explicitly.
Necessary Condition for Exact BVC
• We will prove that n ≥ max(3f + 1, (d+1)f + 1) is necessary for exact BVC
– n is the number of processes
– f is the number of faulty processes
– d is the dimension of the input vectors
Necessary Condition for Exact BVC
• The necessary condition for scalar Byzantine consensus is n ≥ 3f + 1
• Basic reduction from the scalar to the vector problem:
– Assume a solution exists for inputs in ℝ^d
– Assume Xi is the scalar input of process Pi
– Define Yi = [Xi, Xi, …, Xi] (vector in ℝ^d, all entries equal to Xi)
– Because of the validity condition the decision vector is of the form [Di, Di, …, Di] (vector in ℝ^d, all entries equal to Di)
– The first component of the decision vector is a correct solution of the original scalar problem:
• Agreement holds: all processes agree on the same decision vector
• Termination holds: the BVC algorithm terminates
• Validity holds: the decision is in the convex hull of the Yi’s, so Di is between the max and min of the Xi’s
• So n ≥ 3f + 1 is necessary for exact BVC too
Necessary Condition for Exact BVC
• Assume f = 1, n = d+1
• For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 1 (the i-th unit vector)
• Xd+1 is the all-zero vector
• For 1 ≤ i ≤ d+1, let Hi denote the convex hull of the inputs of all processes except Pi
• Define I = ∩(i=1..d+1) Hi
• Claim: I = ∅
Necessary Condition for Exact BVC
• Hi is the convex hull of the inputs of all processes except Pi
• Claim: I = ∩(i=1..d+1) Hi = ∅
• Proof:
• For 1 ≤ i ≤ d, only Pi has a non-zero i-th component in its input vector. Thus all points in Hi must have a zero i-th component
• Applying the above for all 1 ≤ i ≤ d: every point in I must be the all-zero vector 0
• But Hd+1 is the hull of the d unit vectors, and 0 cannot be expressed as a convex combination of independent unit vectors
• Thus I = ∅
Necessary Condition for Exact BVC, I = ∅
• Claim: the decision vector must be in I
• Explanation:
• From the point of view of Pi, every other process Pj, j ≠ i, can be faulty
• To satisfy validity, the decision vector must be in Hj for each j ≠ i
• So the decision vector must be in ∩(j≠i) Hj
• But for another process Pk, k ≠ i, the decision vector must be in ∩(j≠k) Hj
• Because these two processes must agree on the same decision vector, it must be in ∩(all j) Hj = I
Necessary Condition for Exact BVC, I = ∅, the decision vector is in I
• For f=1, n=d+1 we have shown:
– The decision vector must be in I
– I = ∅
• Thus there is no legal decision vector under the given assumptions
– These assumptions: f = 1, n = d+1
• So for f = 1, n ≥ d+2 = (d+1)f + 1 is necessary
Necessary Condition for Exact BVC
• Reduction from the problem with 1 faulty process:
– Assume a solution exists for f > 1
– Take the original n processes and duplicate each one f times. We end up with f·n processes, f of which are Byzantine
– Solve the problem with the f·n processes
– The solution is a legal solution of the original problem
• We have proven: the necessary condition for exact BVC is n ≥ max(3f + 1, (d+1)f + 1)
Rest for a minute…
Algorithm for finding exact BVC
• Assume the necessary condition n ≥ max(3f + 1, (d+1)f + 1) holds
• The algorithm:
– Reliable-Broadcast your input vector
– Let Si denote the multiset of input values that Pi has received after the Reliable-Broadcast finished
– Each process chooses a decision vector from Γ(Si) using a deterministic function
• Claim: this decision vector is a legal solution of the exact BVC problem.
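A toy run of the algorithm for scalar inputs (d = 1), with reliable broadcast idealized so that every process ends up holding the same multiset S; all values are illustrative:

```python
from itertools import combinations

# One synchronous round of the exact algorithm, simulated for d = 1,
# where Gamma can be computed by intersecting intervals. Reliable
# broadcast is idealized: all processes hold the same multiset S.

def gamma_1d(Y, f):
    m = len(Y) - f
    lo = max(min(T) for T in combinations(Y, m))
    hi = min(max(T) for T in combinations(Y, m))
    return lo, hi

f = 1                           # n = 4 processes, one Byzantine
S = [2.0, 3.0, 5.0, 100.0]      # 100.0 was broadcast by the faulty process

lo, hi = gamma_1d(S, f)
decision = lo                   # deterministic choice: the left endpoint
print((lo, hi))                 # (3.0, 5.0)
print(decision)                 # 3.0 -- inside [2.0, 5.0], the honest hull
```

Since every process applies the same deterministic choice to the same Γ(S), agreement is immediate, and the Byzantine outlier 100.0 is clipped away by the intersection.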
Algorithm for finding exact BVC, proof: termination
• The algorithm is synchronous. The first part (Reliable-Broadcast) terminates after some pre-defined time period.
• The second part does not involve communication and thus terminates
• So the BVC algorithm is guaranteed to terminate
Algorithm for finding exact BVC, proof: agreement
• One of the properties of synchronous Reliable-Broadcast is that by the end of the broadcast all non-faulty processes have received the same set of values
• So Si is the same for all non-faulty processes; call it S
• Because n ≥ (d+1)f + 1, by the non-empty Gamma lemma Γ(S) ≠ ∅
• Because the decision vector is chosen from Γ(S) using a deterministic function, all the processes acquire the same decision vector
• So agreement holds.
Algorithm for finding exact BVC, proof: validity
• Denote by V the multiset of inputs of only the non-faulty processes
• Reminder: Γ(S) = ∩ { H(T) : T ⊆ S, |T| = n − f }
• There are at most f faulty processes, so at least one of the (n−f)-size subsets T above contains inputs of only non-faulty processes. Denote it by T′
• The decision vector is in Γ(S) ⊆ H(T′) ⊆ H(V)
• So validity holds.
• We have proven: a sufficient condition for exact BVC is n ≥ max(3f + 1, (d+1)f + 1)
Using linear programming to get the decision vector
• Given: our feasible region is Γ(Si)
• Define some linear ordering f() of points in ℝ^d
• For example, the lexicographic ordering of points
• Linear program formulation:
• Minimize f(x) under the following constraints:
– x ∈ H(T) for every T ⊆ Si with |T| = n − f
Using linear programming to get the decision vector (cont.)
• There is a total of C(n, f) hull-membership constraints, one for each (n−f)-size subset of Si
Rest for a minute…
Necessary Condition for Approximate Asynchronous BVC
• We will prove that n ≥ (d+2)f + 1 is necessary for approximate asynchronous BVC
• As opposed to n ≥ max(3f + 1, (d+1)f + 1) in the synchronous exact case
– n is the number of processes
– f is the number of faulty processes
– d is the dimension of the input vectors
Necessary Condition for Approximate BVC
• Assume f = 1, n = d+2
• For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 2ε
• Xd+1, Xd+2 are all-zero vectors
Necessary Condition for Approximate BVC, f = 1, n = d+2
• The algorithm must tolerate a single failure
• So P1 must terminate in all executions where Pd+2 does not take any steps until all the other processes terminate
• When P1 terminates, it cannot distinguish between the following d+1 scenarios:
– Pd+2 is Byzantine and has crashed
– Pd+2 is not Byzantine but just slow, and process Pj is Byzantine (2 ≤ j ≤ d+1)
Necessary Condition for Approximate BVC, f = 1, n = d+2
• Pd+2 is Byzantine and has crashed:
– the decision of P1 must be in H({X1, …, Xd+1}), the hull of the non-faulty inputs
• Pd+2 is not Byzantine but just slow, and process Pj is Byzantine (2 ≤ j ≤ d+1):
– Pd+2 has not taken any steps, so P1 cannot use its input
– P1 cannot trust the input of Pj
– the decision of P1 must be in Dj, the hull of {X1, …, Xd+1} without Xj
• Combining all scenarios: the decision of P1 must be in ∩(j=2..d+1) Dj
Necessary Condition for Approximate BVC, f = 1, n = d+2, the decision of P1 must be in ∩(j=2..d+1) Dj
• Reminder: Dj is the hull of {X1, …, Xd+1} without Xj
• Claim: the only possible decision of P1 is X1
• Explanation:
• The decision of P1 must be in ∩(j=2..d+1) Dj, so in particular, for every 2 ≤ j ≤ d, the decision of P1 must be in Dj
• Pj is the only process whose input has a non-zero j-th component, and Dj does not contain Xj
• Thus all vectors in Dj have a zero j-th component, for every 2 ≤ j ≤ d
• The decision of P1 must also be in Dd+1, so it should be a convex combination of the d non-zero independent input vectors X1, …, Xd
Necessary Condition for Approximate BVC, f = 1, n = d+2. Claim: the decision of P1 is X1
1. All vectors in Dj have a zero j-th component, for every 2 ≤ j ≤ d
2. The decision of P1 must also be in Dd+1, so it should be a convex combination of the d non-zero independent input vectors X1, …, Xd
3. Because of (1) and the way we chose the inputs, the coefficients of X2, …, Xd must be zero
4. Because of (3) and the fact that the decision of P1 is a convex combination of X1, …, Xd, the coefficient of X1 must be 1
5. So the decision of P1 must be X1
Necessary Condition for Approximate BVC, the only possible decision of P1 is X1
• Reminder:
– For 1 ≤ i ≤ d, the input Xi of process Pi is the all-zero vector except for the i-th component, which equals 2ε
– Xd+1, Xd+2 are all-zero vectors
• So the first component of the decision of P1 equals 2ε
• By the symmetric argument, the first component of the decision of every other process equals 0
• So it is impossible to achieve ε-agreement: the first components differ by 2ε > ε.
Necessary Condition for Approximate BVC
• The same reduction as in the synchronous case works here
– Simulate each process f times
• We have proven: the necessary condition for approximate BVC is n ≥ (d+2)f + 1
Rest for a minute…
Algorithm for finding approximate BVC: assumptions
• Assume the necessary condition n ≥ (d+2)f + 1 holds
• W.l.o.g. assume m processes P1..Pm are non-faulty, where n − f ≤ m ≤ n, and the remaining n − m processes are faulty
• Assume an upper bound U and a lower bound ν on the values of the d elements in the input vectors at non-faulty processes
– This assumption simplifies the BVC algorithm; it can work without it by agreeing on the values of U, ν in an additional stage of the algorithm
• Assume process Pi has input vector Xi
• Pi maintains a current value Vi, initialized to Xi
• Pi maintains multisets Bi, B’i and Zi, initialized to the empty set
Algorithm for finding approximate BVC: round notation
• The algorithm works in rounds
– Round number t, 1 ≤ t ≤ numrounds
• A variable denoted X[t] means the value of variable X at the end of round t
– For instance, Zi[t] means the multiset Zi of process i at the end of round t
– The value of X at the beginning of round t is X[t−1]
– X[0] means the initial value of X
• We will omit the round notation where it is not necessary
Algorithm for finding approximate BVC: additional definitions
• A point r at round t is said to be valid if there exists a representation of r as a convex combination of the Vi[t−1]’s of the non-faulty processes
– The coefficient of Vi[t−1] in that combination is said to be the weight of Vi[t−1]
– Note that if a point can be expressed as a convex combination of valid points, it is also valid.
• At any time, every valid point has all its coordinates within [ν, U]
– Assuming n > 1, n > f:
• numrounds = ⌈log((U − ν)/ε) / log(1/ρ)⌉, where ρ < 1 is the per-round contraction factor of the span established in the ε-agreement proof
Algorithm for finding approximate BVC
• Run numrounds rounds
• In the t-th round:
1. Send Vi[t−1] using AAD-broadcast
a. All tuples received by process Pi are stored in B’i, the respective messages in Bi
2. After obtaining Bi[t], compute Vi[t] as follows:
a. Initialize Zi to empty
b. For each multiset C ⊆ Bi such that |C| = n − f:
1. add to Zi one deterministically chosen point from Γ(C)
c. Set Vi[t] to the average of the points in Zi
• Claim: Vi[numrounds] is a legal decision vector of approximate BVC
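A heavily simplified, idealized simulation of the iteration for scalar values (d = 1) shows the span of the non-faulty values shrinking round after round. This sketch replaces the AAD machinery with a fixed message-delivery pattern and is not the paper's algorithm; all parameters are illustrative:

```python
from itertools import combinations

# Idealized rounds for d = 1 with n = 5, f = 1: four honest processes
# and one Byzantine sender. Each round, every honest process gathers
# n - f values (it misses one slow honest process and receives the
# Byzantine value instead), clips the f extremes on each side by
# intersecting the hulls of the (|B| - f)-size subsets, and takes the
# midpoint of the resulting interval as its new value.

def gamma_lo_hi(Y, f):
    m = len(Y) - f
    return (max(min(T) for T in combinations(Y, m)),
            min(max(T) for T in combinations(Y, m)))

def round_update(honest, byz_value, f):
    new = []
    for i in range(len(honest)):
        # process i misses honest process (i+1) mod m this round
        B = [v for j, v in enumerate(honest) if j != (i + 1) % len(honest)]
        B.append(byz_value)
        lo, hi = gamma_lo_hi(B, f)
        new.append((lo + hi) / 2)
    return new

vals = [0.0, 1.0, 4.0, 9.0]               # honest values, span 9
spans = []
for t in range(6):
    vals = round_update(vals, byz_value=1000.0, f=1)
    spans.append(max(vals) - min(vals))
print(spans)        # the span shrinks every round; 1000.0 never leaks in
```

Note how the extreme Byzantine value is always discarded by the interval intersection, so the honest values both converge and stay inside the original honest hull [0, 9].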
Algorithm for finding approximate BVC, proof: termination
• AAD-broadcast is the only communication primitive that we use
• It is guaranteed to terminate under our assumptions.
• So the BVC algorithm is guaranteed to terminate
Algorithm for finding approximate BVC, proof: validity
• At any round t, at any non-faulty process Pi, consider any C in step 2.b of the algorithm
• By the non-empty lemma: |C| = n − f ≥ (d+1)f + 1, so Γ(C) ≠ ∅, and Zi will contain a point from Γ(C) for each C.
• There are at most f faulty processes, and |C| = n − f. So at least one of the (n−2f)-size subsets of C contains values of only non-faulty processes
• Therefore all points in Γ(C) are valid: Γ(C) is inside the hull of that all-non-faulty subset
• So all points in Zi are valid
• Because Vi[t] is an average of the points in Zi, it is also valid
• Therefore Vi[numrounds] can be expressed as a convex combination of the Vj[0]’s of the non-faulty processes
• So validity holds
Algorithm for finding approximate BVC, proof: ε-agreement
• Consider two non-faulty processes Pi and Pj
• We have shown that all points in Zi and Zj are valid
• By the Common knowledge property of AAD-broadcast, Pi and Pj learn at least n − f identical tuples
• So Zi and Zj contain at least one common point. Call it z
• It is valid, so z is a convex combination of the Vg[t−1]’s of the non-faulty processes
• There exists some non-faulty process Pg such that the weight of Vg[t−1] in z is at least 1/n (the weights sum to 1 over at most n points)
Algorithm for finding approximate BVC, proof: ε-agreement (cont.)
• Reminder: Zi and Zj contain a common valid point z; there exists Pg whose value Vg[t−1] has weight at least 1/n in z
• Reminder: for each C such that |C| = n − f we add one value to Zi
• Because of the Uniqueness property of AAD-broadcast, we know that the values Pi and Pj associate with common tuples are identical
• Because of these facts, Vi[t] and Vj[t] each place a positive constant weight on the common point z, which pulls their components toward each other
Algorithm for finding approximate BVC, proof: ε-agreement (cont.)
• Reminder: numrounds rounds are executed
• Define span(L)[t]: the distance between the maximal and minimal L-th components of all the Vi[t]
• Claim: for each L: span(L)[t] ≤ ρ · span(L)[t−1], for some constant ρ < 1
• Proof: in Appendix 1. For now assume the claim is correct
• At each round the span reduces by a factor of ρ; thus by repeated application numrounds times, the initial maximal span (U − ν) is decreased below ε
• Recall our definition of numrounds: numrounds = ⌈log((U − ν)/ε) / log(1/ρ)⌉
• So ε-agreement holds.
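Given a per-round contraction factor ρ, the round count follows by solving (U − ν)·ρ^r ≤ ε for the smallest integer r. A sketch (ρ is an assumed parameter here; the paper's analysis derives its concrete value):

```python
import math

# Number of rounds needed so that the initial span U - nu shrinks
# below eps, given a per-round span-contraction factor rho < 1.
# rho is an assumed input, not a value taken from the paper.

def numrounds(U, nu, eps, rho):
    span0 = U - nu                     # worst-case initial span
    if span0 <= eps:
        return 0
    # smallest integer r with span0 * rho**r <= eps
    return math.ceil(math.log(span0 / eps) / math.log(1 / rho))

r = numrounds(U=1.0, nu=0.0, eps=0.01, rho=0.5)
print(r)                               # 7, since 0.5**7 < 0.01 <= 0.5**6
```
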
Wrap Up
• We have met the Byzantine consensus problem
• We have defined the Byzantine vector consensus problem
• We have met Tverberg’s theorem
• We have proven: the necessary and sufficient condition for approximate BVC in an asynchronous system is n ≥ (d+2)f + 1
• We have proven: the necessary and sufficient condition for exact BVC in a synchronous system is n ≥ max(3f + 1, (d+1)f + 1)
• We have learned an algorithm for exact BVC in a synchronous system
• We have learned an algorithm for approximate BVC in an asynchronous system
Questions?
Appendix 1: proof of the span-reduction claim in the ε-agreement proof
• Define span(L)[t]: the distance between the maximal and minimal L-th components of all the Vi[t]
• Claim: for each L: span(L)[t] ≤ ρ · span(L)[t−1], for some constant ρ < 1
• Proof:
Appendix 1: proof of the span-reduction claim (cont.)
• There exist constants such that:
• Focusing on operations on the l-th component (max):
• Focusing on operations on the l-th component (min):