Asynchronous Consensus Ken Birman


Post on 02-Feb-2016


Page 1: Asynchronous Consensus

Asynchronous Consensus

Ken Birman

Page 2: Asynchronous Consensus

Outline of talk

Reminder about models
Asynchronous consensus: impossibility result
Solution to the problem
  With an “oracle” that detects failures
  Without oracles, using timeout
Big issues?
  Revisit from Byzantine agreement
  Is this model realistic? In what ways is it “legitimate”?
  Should we focus on impossibility, or “possibility”?
  Asynchronous consensus in real world systems

Page 3: Asynchronous Consensus

Distributed Computing Models

Recall that we had two models
To reason about networks and applications we need to be precise about the setting in which our protocols run
But “real world” networks are very complex
  They can drop packets, or reorder them
  Intruders might be able to intercept and modify data
  Timing is totally unpredictable

Page 4: Asynchronous Consensus

Asynchronous network model

Asynchronous because we lack clocks:
  Network can arbitrarily delay a message
  But we assume that messages are sequenced and retransmitted (arbitrary numbers of times), so they eventually get through.
“Free” to say: lossless, ordered
No value to assumptions about process speed
Failures in asynchronous model?
  Usually, limited to process “crash” faults
  If detectable, we call this “fail-stop” – but how to detect?

Page 5: Asynchronous Consensus

An asynchronous network

Not causal!

Page 6: Asynchronous Consensus

An asynchronous network

Time shrinks…

Page 7: Asynchronous Consensus

An asynchronous network

Time shrinks…

Time stretches…

Page 8: Asynchronous Consensus

Justification?

If we can do something in the asynchronous model, we can probably do it even better in a real network
  Clocks, a-priori knowledge can only help…

But today we will focus on an impossibility result

By definition, impossibility in this model means “xxx can’t always be done”

Page 9: Asynchronous Consensus

Paradigms

Fundamental problems, the solution of which yields general insight into a broad class of questions

In distributed systems:
  Agreement (on value proposed by a leader)
  Consensus (everyone proposes a value… pick one)
  Electing a leader
  Atomic broadcast/multicast (send a message, reliably, to everyone who isn’t faulty, such that concurrent messages are delivered in the same order everywhere)
  Deadlock detection, clock or process synchronization, taking a snapshot (“picture”) of the system state…

Page 10: Asynchronous Consensus

Consensus problem

Models distributed agreement
Comes in various forms (with subtle differences in the associated results)!
  With a leader: leader gives an order, like “attack”, and non-faulty participants either attack or do nothing, despite some limited number of failures: Byzantine Agreement
  Without a leader: participants have an initial vote; protocol runs and eventually all non-faulty participants choose the same outcome, and it is one of the initial votes (typically, 0 or 1): Fault-tolerant Consensus

Page 11: Asynchronous Consensus

Consensus problem

[Figure: inputs P=0, Q=0, R=1; outcomes P=1, Q=1, R=1]

Page 12: Asynchronous Consensus

Fault-tolerance

Goal: an algorithm tolerant of one failure
Failure: process crashes but this is not detectable
So the algorithm must work both in the face of arbitrary message delay caused by the network, and in the event of a single failure

Page 13: Asynchronous Consensus

If some process stays up…

Suppose we knew that P won’t fail
Then P could simply broadcast its input
All would “decide” upon this value
Solves the problem

Page 14: Asynchronous Consensus

If one process stays up

Indeed, suppose that P stays up only long enough to send one message
But there is only one failure
And we knew that P would “lead”
Then we can relay P’s message, using an all-to-all broadcast

Page 15: Asynchronous Consensus

Algorithm

P: broadcast my input
Q ≠ P: on receiving P’s message for first time, broadcast a copy
Tolerates anything except failure of P in the first step, but we need to agree upon “P” before starting (i.e. P is the least ranked process, using alphabetic ranking)
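
The relay idea can be sketched as a toy in-memory simulation. The function `relay_broadcast` and its parameters are illustrative names, not from the talk, and message delivery is simplified to a FIFO queue, so this only shows who ends up with the value, not the effect of arbitrary delays.

```python
from collections import deque

def relay_broadcast(processes, leader, value, reached_before_crash=None):
    """Simulate a leader broadcast with all-to-all relay.

    reached_before_crash: if not None, the leader crashes after its
    message has gone out to only these processes (hypothetical knob).
    Returns a dict mapping each process that decides to its value.
    """
    crashed = reached_before_crash is not None
    if crashed:
        first_targets = list(reached_before_crash)
        decided = {}
    else:
        first_targets = [p for p in processes if p != leader]
        decided = {leader: value}
    queue = deque(first_targets)
    while queue:
        q = queue.popleft()
        if q in decided:
            continue  # relay only on first receipt
        decided[q] = value
        # q broadcasts a copy to every other non-leader process
        queue.extend(p for p in processes if p != q and p != leader)
    return decided

print(relay_broadcast(["P", "Q", "R"], "P", 1))          # leader stays up
print(relay_broadcast(["P", "Q", "R"], "P", 1, ["Q"]))   # crash after one send
print(relay_broadcast(["P", "Q", "R"], "P", 1, []))      # crash in the first step
```

In the last call nobody ever learns the value, which is the one case the slide says the algorithm cannot tolerate.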

Page 16: Asynchronous Consensus

Another algorithm

All processes start by broadcasting own value to all other processes

If we know that there is always exactly one failure, could wait until n-1 messages received, then decide using any deterministic rule

But doesn’t work if sometimes we have one failure, sometimes none
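
A small sketch of why the "wait for n-1" rule breaks when the number of failures is not known in advance. It assumes each process counts its own value toward the n-1, that a crashed process sends nothing, and that the deterministic rule is "take the minimum"; `decide_after_n_minus_1` and the arrival orders are made up for illustration.

```python
def decide_after_n_minus_1(values, arrival_order):
    """values: process -> initial value.
    arrival_order: process -> senders in the order their messages arrive
    at that process (crashed senders never appear).
    Each process stops waiting once it holds n-1 values, then decides min.
    """
    n = len(values)
    decisions = {}
    for p, senders in arrival_order.items():
        seen = {p: values[p]}
        for s in senders:
            if len(seen) == n - 1:
                break  # n-1 values in hand: stop waiting
            seen[s] = values[s]
        if len(seen) == n - 1:
            decisions[p] = min(seen.values())
    return decisions

vals = {"P": 0, "Q": 1, "R": 1}

# Exactly one failure (R crashed): both survivors see the same set.
one_failure = decide_after_n_minus_1(vals, {"P": ["Q"], "Q": ["P"]})

# No failure: each process ignores a different late message.
no_failure = decide_after_n_minus_1(
    vals, {"P": ["Q", "R"], "Q": ["R", "P"], "R": ["Q", "P"]}
)
print(one_failure)
print(no_failure)
```

With no failure, P decides 0 while Q and R decide 1, so agreement is violated.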

Page 17: Asynchronous Consensus

FLP result

Considers general case
Assumes an algorithm that can decide with zero or one failures
Proves that this algorithm can be prevented from reaching decision, indefinitely

Page 18: Asynchronous Consensus

Basic idea

Think of system state as a “configuration”
Configuration is v-valent if decision to pick v has become inevitable: all runs lead to v
If not 0-valent or 1-valent, configuration is bivalent
Initial configurations include
  At least one 0-valent: {0,0,0,…,0}
  At least one 1-valent: {1,1,1,…,1}
  At least one bivalent: {0,0,…,1,1}

Page 19: Asynchronous Consensus

Basic idea

[Figure: the space of configurations, partitioned into 0-valent, 1-valent and bivalent configurations]

Page 20: Asynchronous Consensus

Transitions between configurations

Configuration is a set of processes and messages
Applying a message to a process changes its state, hence it moves us to a new configuration
Because the system is asynchronous, can’t predict which of a set of concurrent messages will be delivered “next”
But because processes only communicate by messages, this is unimportant

Page 21: Asynchronous Consensus

Basic Lemma

Suppose that from some configuration C, the schedules σ1, σ2 lead to configurations C1 and C2, respectively.

If the sets of processes taking actions in σ1 and σ2, respectively, are disjoint, then σ2 can be applied to C1 and σ1 to C2, and both lead to the same configuration C3
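
The lemma can be checked on a toy model. In this sketch a configuration is a pair (local states, multiset of messages in transit), an event delivers one pending message, and the transition rule (add the payload, forward the total to a fixed neighbour) is an arbitrary deterministic choice made up for the example.

```python
from collections import Counter

NEIGHBOUR = {"p": "r", "q": "r", "r": "p"}  # hypothetical topology

def apply_event(config, event):
    """Deliver one pending (destination, payload) message."""
    states, pending = config
    dest, payload = event
    assert pending[event] > 0, "event not applicable"
    new_states = dict(states)
    new_pending = Counter(pending)
    new_pending[event] -= 1
    # Toy deterministic step: accumulate, then send the total onward.
    new_states[dest] = states[dest] + payload
    new_pending[(NEIGHBOUR[dest], new_states[dest])] += 1
    return new_states, +new_pending  # unary + drops zero counts

def apply_schedule(config, schedule):
    for event in schedule:
        config = apply_event(config, event)
    return config

C = ({"p": 0, "q": 0, "r": 0}, Counter({("p", 1): 1, ("q", 2): 1}))
sigma1 = [("p", 1)]  # only p takes steps
sigma2 = [("q", 2)]  # only q takes steps

C1 = apply_schedule(C, sigma1)
C2 = apply_schedule(C, sigma2)
C3_via_C1 = apply_schedule(C1, sigma2)
C3_via_C2 = apply_schedule(C2, sigma1)
print(C3_via_C1 == C3_via_C2)
```

Because the two schedules touch disjoint processes, neither changes the state or the pending messages the other depends on, so both orders reach the same configuration.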

Page 22: Asynchronous Consensus

Basic Lemma

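[Figure: commutativity diamond. From C, schedule 1 leads to C1 and schedule 2 leads to C2; applying schedule 2 to C1 and schedule 1 to C2 both reach C3]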

Page 23: Asynchronous Consensus

Main result

No consensus protocol is totally correct in spite of one fault

Note: Uses total in formal sense (guarantee of termination)

Page 24: Asynchronous Consensus

Basic FLP theorem

Suppose we are in a bivalent configuration now and later will enter a univalent configuration

We can draw a form of frontier, such that a single message to a single process triggers the transition from bivalent to univalent

Page 25: Asynchronous Consensus

Basic FLP theorem

[Figure: frontier between bivalent and univalent configurations; configurations C, C1, D0, D1 connected by events e and e’]

Page 26: Asynchronous Consensus

Single step decides

They prove that any run that goes from a bivalent state to a univalent state has a single decision step, e

They show that it is always possible to schedule events so as to block such steps

Eventually, e can be scheduled but in a state where it no longer triggers a decision

Page 27: Asynchronous Consensus

Basic FLP theorem

They show that we can delay this “magic message” and cause the system to take at least one step, remaining in a new bivalent configuration
  Uses the diamond-relation seen earlier
But this implies that in a bivalent state there are runs of indefinite length that remain bivalent
Proves the impossibility of fault-tolerant consensus

Page 28: Asynchronous Consensus

Notes on FLP

No failures actually occur in this run, just delayed messages

Result is purely abstract. What does it “mean”?

Says nothing about how probable this adversarial run might be, only that at least one such run exists

Page 29: Asynchronous Consensus

FLP intuition

Suppose that we start a system up with n processes
Run for a while… close to picking value associated with process “p”
Someone will do this for the first time, presumably on receiving some message from q
If we delay that message, and yet our protocol is “fault-tolerant”, it will somehow reconfigure
Now allow the delayed message to get through but delay some other message

Page 30: Asynchronous Consensus

Key insight

FLP is about forcing a system to attempt a form of reconfiguration

This takes time
Each “unfortunate” suspected failure causes such a reconfiguration

Page 31: Asynchronous Consensus

FLP and our first algorithm

P is the leader and is supposed to send its input to Q
Q “times out” and
  Tells everyone that P has apparently failed
  Then can disseminate its own value
  If P wakes up, we re-admit it to the system but it is no longer considered least ranked
One can make such algorithms work…
But they can be attacked by delaying first P, then Q, then R, etc
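
The attack can be sketched with a deliberately simplified round model: in round k the leader is the k-th process in rotation, and if the adversary delays exactly that process past the timeout, the others suspect it and move to the next round. The function `first_decision_round` and both schedulers are hypothetical names for illustration only.

```python
def first_decision_round(processes, delayed, max_rounds):
    """Return the first round in which the current leader is not delayed
    (so its value gets through), or None if max_rounds pass without one."""
    n = len(processes)
    for k in range(max_rounds):
        leader = processes[k % n]
        if delayed(k) == leader:
            continue  # leader suspected: reconfigure and try the next one
        return k
    return None

procs = ["P", "Q", "R"]

def adversary(k):
    # Always delays whoever is leader in round k: first P, then Q, then R...
    return procs[k % len(procs)]

def benign(k):
    # Delays P once, then leaves the system alone.
    return "P" if k == 0 else None

print(first_decision_round(procs, adversary, 1000))
print(first_decision_round(procs, benign, 1000))
```

No process ever crashes in the adversarial run; every round is lost to a delayed message alone.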

Page 32: Asynchronous Consensus

FLP in the real world

Real systems are subject to this impossibility result
But in fact often are subject to even more severe limitations, such as inability to tolerate network partition failures
Also, asynchronous consensus may be too slow for our taste
And FLP attack is not probable in a real system
  Requires a very smart adversary!

Page 33: Asynchronous Consensus

Chandra/Toueg

Showed that FLP applies to many problems, not just consensus
  In particular, they show that FLP applies to group membership, reliable multicast
  So these practical problems are impossible in asynchronous systems, in formal sense
But they also look at the weakest condition under which consensus can be solved

Page 34: Asynchronous Consensus

Chandra/Toueg Idea

Separate problem into
  The consensus algorithm itself
  A “failure detector”: a form of oracle that announces suspected failure
    But it can change its mind
Question: what is the weakest oracle for which consensus is always solvable?

Page 35: Asynchronous Consensus

Sample properties

Completeness: detection of every crash
  Strong completeness: Eventually, every process that crashes is permanently suspected by every correct process
  Weak completeness: Eventually, every process that crashes is permanently suspected by some correct process

Page 36: Asynchronous Consensus

Sample properties

Accuracy: does it make mistakes?
  Strong accuracy: No process is suspected before it crashes.
  Weak accuracy: Some correct process is never suspected
  Eventual strong accuracy: there is a time after which correct processes are not suspected by any correct process
  Eventual weak accuracy: there is a time after which some correct process is not suspected by any correct process

Page 37: Asynchronous Consensus

A sampling of failure detectors

Rows: completeness. Columns: accuracy.

Completeness   Strong       Weak       Eventually Strong        Eventually Weak
Strong         Perfect P    Strong S   Eventually Perfect ◇P    Eventually Strong ◇S
Weak           Q            Weak W     ◇Q                       Eventually Weak ◇W

Page 38: Asynchronous Consensus

Perfect Detector?

Named Perfect, written P
Strong completeness and strong accuracy
Immediately detects all failures
Never makes mistakes

Page 39: Asynchronous Consensus

Example of a failure detector

The detector they call W: “eventually weak”
More commonly: ◇W: “diamond-W”
Defined by two properties:
  There is a time after which every process that crashes is suspected by some correct process
  There is a time after which some correct process is never suspected by any correct process
Think: “we can eventually agree upon a leader.” If it crashes, “we eventually, accurately detect the crash”

Page 40: Asynchronous Consensus

◇W: Weakest failure detector

They show that ◇W is the weakest failure detector for which consensus is guaranteed to be achieved

Algorithm is pretty simple
  Rotate a token around a ring of processes
  Decision can occur once token makes it around once without a change in failure-suspicion status for any process
  Subsequently, as token is passed, each recipient learns the decision outcome

Page 41: Asynchronous Consensus

Rotating a token versus 2-phase commit

[Figure: 2-phase commit message flow, with each exchange marked as a “phase”: Propose v… ack… Decide v]

Page 42: Asynchronous Consensus

Rotating a token versus 2-phase commit

Their protocol is basically a 2-phase commit
But with n processes, 2PC requires 2(n-1) messages in its first phase (propose and ack) and n-1 in its second (decide), 3(n-1) total
Passing a token only requires n messages per phase, for 2n total (when nothing fails)
Tolerates f < n/2 failures
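
The message counts on this slide are easy to tabulate. The helper names below are illustrative; the formulas are the slide's failure-free counts, and `max_tolerated_failures` is simply the largest integer f with f < n/2.

```python
def two_phase_commit_messages(n):
    # propose, ack, decide: each is n-1 messages between coordinator and the rest
    return 3 * (n - 1)

def token_ring_messages(n):
    # two full rotations of the token around n processes
    return 2 * n

def max_tolerated_failures(n):
    # largest f such that f < n/2
    return (n - 1) // 2

for n in (3, 5, 10):
    print(n, two_phase_commit_messages(n), token_ring_messages(n),
          max_tolerated_failures(n))
```

Under these counts the token ring sends fewer messages only when n > 3; at n = 3 both send 6.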

Page 43: Asynchronous Consensus

Set of problems solvable in:

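[Figure: nested regions, from the most powerful model to the least]
Synchronous systems: clock synchronization
Asynchronous using P: TRB, non-blocking atomic commit
Asynchronous using ◇W: consensus, atomic broadcast
Asynchronous: reliable broadcast

TRB (terminating reliable broadcast): Byzantine Generals with only crash failures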

Page 44: Asynchronous Consensus

Building systems with ◇W

Unfortunately, this failure detector is not implementable

Using timeouts we can make mistakes at arbitrary times

But with long enough timeouts, could produce a close approximation to ◇W
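
A minimal sketch of a timeout-based approximation, assuming processes send periodic heartbeats and the caller supplies timestamps. The class `TimeoutDetector` and the doubling policy are assumptions made for illustration: each false suspicion lengthens that process's timeout, so mistakes become rarer but, in a truly asynchronous network, can never be ruled out.

```python
class TimeoutDetector:
    """Suspects a process when no heartbeat arrives within its timeout.
    On a false suspicion it changes its mind and doubles that timeout."""

    def __init__(self, processes, initial_timeout=1.0):
        self.timeout = {p: initial_timeout for p in processes}
        self.last_heard = {p: 0.0 for p in processes}
        self.suspected = set()

    def heartbeat(self, p, now):
        self.last_heard[p] = now
        if p in self.suspected:
            self.suspected.discard(p)  # the oracle changes its mind
            self.timeout[p] *= 2       # and becomes more patient with p

    def check(self, now):
        for p, t in self.last_heard.items():
            if now - t > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)

d = TimeoutDetector(["p", "q"])
d.heartbeat("p", 0.5)
d.heartbeat("q", 0.5)
print(d.check(1.0))   # nobody suspected yet
print(d.check(2.0))   # both silent for 1.5 > 1.0
d.heartbeat("q", 2.1) # q was only slow: suspicion retracted
print(d.check(3.0))   # only p remains suspected
```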

Page 45: Asynchronous Consensus

Would we want to?

Question: are we solving the right problem?
Pros and cons of asynchronous consensus
Think about an air traffic control application
  Find one problem for which asynchronous consensus is a good match
  Find one problem for which the match is poor

Page 46: Asynchronous Consensus

French ATC system (simplified)

[Figure: French ATC system components: Controllers, Air Traffic Database (flight plans, etc), X.500 Directory, Radar, Onboard]

Page 47: Asynchronous Consensus

Potential applications

Maintaining replicated state within console clusters
Distributing radar data to participants
Distributing data over wide-area links within large geographic scale
Management and control (administration) of the overall system
Distributing security keys to prevent unauthorized action
Agreement when flight control handoffs occur

Page 48: Asynchronous Consensus

Broad conclusions?

The protocol seems unsuitable for high availability applications
  If the core of the system must make progress, the agreement property itself is too strong
  If a process becomes unresponsive, we might not want to wait for it to recover
Also, since we can’t implement any of these failure detectors, the whole issue is abstract…
Hence real systems don’t try to solve consensus as defined and used in these kinds of protocols!

Page 49: Asynchronous Consensus

Value of FLP/Consensus

A clear and elegant problem statement
Highlights limitations
  Perhaps with clocks we can overcome them
  More likely, we need a different notion of failure
  “Crash failure” is too narrow, “unreachable” also treated as failure in many real systems
Caused much debate about real systems

Page 50: Asynchronous Consensus

Nature of debate

We’ll see many practical systems soon
Do they
  Evade FLP in some way?
  Are they subject to FLP? If so, what problem do they “solve”, given that consensus (and most problems reduce to consensus) is impossible to solve?
  Or are they subject to even more stringent limitations?
Is fault-tolerant consensus even an issue in real systems?