Election Algorithms. Topics: Issues, Detecting Failures, Bully Algorithm, Ring Algorithm

Election Algorithms

Upload: hortense-malone

Post on 23-Dec-2015


Page 1

Election Algorithms

Page 2

Topics

• Issues
• Detecting Failures
• Bully algorithm
• Ring algorithm

Page 3

Readings

• Van Steen and Tanenbaum: 5.4
• Coulouris: 11.3

Page 4

Election Algorithms

Recall that Lamport clocks can be used to obtain a total order.

Can you think of another way to do this? It turns out that you can use a sequencer.
• All operations go to a sequencer.
• The sequencer assigns a number to each message before the message goes to each replica.

What if the sequencer goes down?
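The sequencer idea can be sketched in a few lines. This is only an illustration under stated assumptions: the `Sequencer` and `Replica` classes and their method names are hypothetical, everything runs in one process, and no failure handling is shown (which is exactly the gap an election algorithm has to fill).

```python
import itertools


class Sequencer:
    """Hypothetical single sequencer: hands out consecutive numbers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, message):
        # Every operation passes through here and gets the next number.
        return (next(self._counter), message)


class Replica:
    """Hypothetical replica that may receive stamped messages in any order."""

    def __init__(self):
        self.received = []

    def deliver(self, stamped):
        self.received.append(stamped)

    def ordered(self):
        # Sorting by sequence number gives every replica the same order.
        return [msg for _, msg in sorted(self.received)]


seq = Sequencer()
a, b, c = (seq.stamp(m) for m in ["write x", "write y", "write z"])

r1, r2 = Replica(), Replica()
for stamped in (a, b, c):        # arrives in order at replica 1
    r1.deliver(stamped)
for stamped in (c, a, b):        # arrives shuffled at replica 2
    r2.deliver(stamped)

print(r1.ordered() == r2.ordered())
```

Both replicas end up applying the operations in the same order even though the network delivered them differently. If the one `Sequencer` object disappears, nothing can be ordered any more.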

Page 5

Election Algorithms

Many distributed algorithms require a process to act as a coordinator.

The coordinator can be any process that organizes actions of other processes.

A coordinator may fail. How is a new coordinator chosen or elected?

Page 6

Election Algorithms

Assumptions
• Each process has a unique number that distinguishes it from the others.
• There is one process per machine (which suggests that an IP address can serve as the unique identifier).
• Processes know each other’s process numbers.
• Processes do not know which processes are currently up and which are down.

General Approach
• Locate the process with the highest process number and designate it as the coordinator.
• Election algorithms differ in how they do this.

Page 7

Issues in Dealing with Coordinator Failure

Detecting Failure
• Any node might detect the failure first.
• Multiple processes might detect the failure at once.

Election
• Must run without coordination.
• Must deal with arbitrary process failures.
• All nodes must agree on when the election is over and who the new coordinator is.

Page 8

Detecting Failures

Timeouts are used to detect failures:

$T = 2T_{\text{trans}} + T_{\text{process}}$

• where $T_{\text{trans}}$ is the maximum transmission delay and $T_{\text{process}}$ is the maximum delay for processing a message.

If a process fails to respond to a request message within $T$ seconds, an election is initiated.
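As a rough sketch of the timeout rule above: the helper names `detection_timeout` and `coordinator_alive` are hypothetical, and the reply channel is modelled with a plain in-process `queue.Queue` rather than a real network socket.

```python
import queue


def detection_timeout(t_trans, t_process):
    """T = 2 * T_trans + T_process: request out, processing, reply back."""
    return 2 * t_trans + t_process


def coordinator_alive(replies, timeout):
    """Wait up to `timeout` seconds for a reply from the coordinator.

    `replies` stands in for whatever channel carries the coordinator's
    answer (an assumption of this sketch).
    """
    try:
        replies.get(timeout=timeout)
        return True
    except queue.Empty:
        return False          # no answer within T: start an election


T = detection_timeout(t_trans=0.05, t_process=0.1)   # illustrative values

replies = queue.Queue()
print(coordinator_alive(replies, T))   # nothing ever arrives

replies.put("reply")
print(coordinator_alive(replies, T))   # a reply is already waiting
```

The first call blocks for the full T seconds and reports a failure; the second returns immediately because a reply is available.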

Page 9

Bully Algorithm

When a process, P, notices that the coordinator is no longer responding to requests, it initiates an election:
• P sends an ELECTION message to all processes with higher numbers.
• If no one responds, P wins the election and becomes the coordinator.
• If one of the higher-ups answers, it takes over. P’s job is done.

Page 10

Bully Algorithm

When a process gets an ELECTION message from one of its lower-numbered colleagues:
• The receiver sends an OK message back to the sender to indicate that it is alive and will take over.
• The receiver holds an election, unless it is already holding one.

Eventually, all processes give up but one, and that one is the new coordinator.

The new coordinator announces its victory by sending all processes a message telling them that, starting immediately, it is the new coordinator.

Page 11

Bully Algorithm

If a process that was previously down comes back up:
• It holds an election.
• If it happens to be the highest-numbered process currently running, it will win the election and take over the coordinator’s job.

The “biggest guy” always wins, hence the name “bully” algorithm.

Page 12

The Bully Algorithm (Example)

The bully election algorithm:
a) Process 4 holds an election
b) Processes 5 and 6 respond, telling 4 to stop
c) Now 5 and 6 each hold an election

Page 13

The Bully Algorithm (Example)

d) Process 6 tells 5 to stop
e) Process 6 wins and tells everyone
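The example can be replayed with a small simulation. This is a sketch, not a real implementation: the function `bully_election` is hypothetical, messages are delivered instantly, a dead process simply never answers, and the setup of eight processes numbered 0 to 7 with process 7 as the failed coordinator is an assumption, since the figure is not reproduced here.

```python
def bully_election(initiator, all_ids, alive):
    """Simulate one bully election and return (coordinator, message_log)."""
    log = []                  # (sender, receiver, message type)
    started = set()           # processes that have already held an election
    pending = [initiator]     # processes that still have to hold one
    coordinator = None

    while pending:
        p = pending.pop(0)
        if p in started:      # a process holds at most one election
            continue
        started.add(p)
        answered = []
        for q in sorted(x for x in all_ids if x > p):
            log.append((p, q, "ELECTION"))
            if q in alive:
                log.append((q, p, "OK"))
                answered.append(q)
        if answered:
            pending.extend(answered)   # the higher-ups take over
        else:
            coordinator = p            # nobody higher answered: p wins

    # Assumption: the winner announces itself to every other live process.
    for q in sorted(alive):
        if q != coordinator:
            log.append((coordinator, q, "COORDINATOR"))
    return coordinator, log


ids = range(8)                   # hypothetical processes 0..7
live = set(range(7))             # process 7, the old coordinator, is down
winner, messages = bully_election(4, ids, live)
print(winner, len(messages))
```

With process 4 as the initiator, the log shows 4 contacting 5, 6 and 7, then 5 and 6 holding their own elections, and finally 6 announcing itself to everyone else that is still up.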

Page 14

Bully Algorithm Analysis

Best case
• The node with the second-highest identifier detects the failure.
• Total messages = N-2: one message to each of the other processes, indicating that the process with the second-highest identifier is the new coordinator. (This count leaves out the single unanswered ELECTION message sent to the failed coordinator.)

Worst case
• The node with the lowest identifier detects the failure.
• This causes N-1 processes to initiate the election algorithm, each sending messages to the processes with higher identifiers.
• Total messages = $O(N^2)$
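To make the two cases concrete, the following sketch counts messages directly. It assumes processes numbered 1 to N, that only the coordinator (process N) has failed, that each live process holds at most one election, and that the unanswered ELECTION messages sent to the failed coordinator are counted too. The function name `bully_message_count` is hypothetical.

```python
def bully_message_count(n, initiator):
    """Messages exchanged when `initiator` notices that process n is down."""
    election = ok = 0
    # Every live process from the initiator upwards ends up holding
    # exactly one election.
    for p in range(initiator, n):
        election += n - p          # ELECTION to every higher id, dead or not
        ok += (n - 1) - p          # OK from every live higher process
    coordinator = n - 2            # winner n-1 informs the other live processes
    return election + ok + coordinator


n = 8
print(bully_message_count(n, initiator=n - 1))   # best case
print(bully_message_count(n, initiator=1))       # worst case
```

In the best case the result is the N-2 announcements plus one unanswered ELECTION message. In the worst case the loop sums to (N-1)^2, which is where the quadratic growth comes from.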

Page 15

Bully Algorithm Discussion

How many processes are used to detect a coordinator failure?
• As many as you want. You could have all other processes check on the coordinator.

It is impossible for two processes to be elected at the same time.

Page 16

Ring Algorithm

Use a ring (processes are physically or logically ordered, so that each process knows who its successor is).

Algorithm
When a process notices that the coordinator is not functioning, it:
• Builds an ELECTION message (containing its own process number).
• Sends the message to its successor (if the successor is down, the sender skips over it and goes to the next member along the ring, or the one after that, until a running process is located).
• At each step, the sender adds its own process number to the list in the message.

Page 17

Ring Algorithm

Algorithm (continued)
When the message gets back to the process that started it all:
• The process recognizes the message as one that contains its own process number.
• It changes the message type to COORDINATOR.
• It circulates the message once again to inform everyone else who the new coordinator is (the list member with the highest number) and who the members of the new ring are.
• When the message has circulated once, it is removed.
• Even if two ELECTIONs are started at once, everyone will pick the same leader, since the node with the highest identifier is picked.

Page 18

Ring Algorithm

Initiation:
1. Process 4 sends an ELECTION message, containing its ID, to its successor (or the next live process).

Page 19

Ring Algorithm

Initiation:
2. Each process adds its own ID and forwards the ELECTION message.

Page 20

Ring Algorithm contd…

Leader Election:
3. The message comes back to the initiator; here the initiator is 4.
4. The initiator announces the winner by sending another message around the ring.
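Steps 1 to 4 can be traced with a short simulation. It is a sketch under assumptions: the function `ring_election` is hypothetical, the ring holds processes 0 to 7 in numeric order with process 7 down (the figure is not reproduced here, so this layout is a guess), and attempts to contact a dead successor are not counted as messages.

```python
def ring_election(ring, alive, initiator):
    """Return (coordinator, members, message_count) for one ring election."""
    n = len(ring)

    def next_alive(i):
        # Skip over dead successors until a running process is found.
        for step in range(1, n + 1):
            j = (i + step) % n
            if ring[j] in alive:
                return j
        raise RuntimeError("no live process on the ring")

    members = [initiator]            # the list carried in the ELECTION message
    messages = 1                     # initiator -> first live successor
    i = next_alive(ring.index(initiator))
    while ring[i] != initiator:
        members.append(ring[i])      # each process adds its own number
        i = next_alive(i)            # and forwards the message
        messages += 1

    coordinator = max(members)       # list member with the highest number
    messages += len(members)         # COORDINATOR message circulates once
    return coordinator, members, messages


ring = list(range(8))                # hypothetical ring 0 -> 1 -> ... -> 7 -> 0
live = set(range(7))                 # process 7 is down
leader, members, count = ring_election(ring, live, initiator=4)
print(leader, members, count)
```

The returned member list doubles as the membership of the new ring, which is the second piece of information the COORDINATOR message carries.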

Page 21

Ring Algorithm Analysis

• At best, 2(N-1) messages are passed: one round for the ELECTION message and one round for the COORDINATOR message.
• This assumes that only a single process starts an election.
• Multiple elections cause an increase in messages, but no real harm is done.
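A quick way to see the effect of several simultaneous initiators is to count each election separately. This sketch assumes that concurrent elections circulate independently and are never merged or suppressed; the helper names `circulate` and `concurrent_elections` and the ring layout are hypothetical.

```python
def circulate(ring, alive, initiator):
    """Live ids collected by one ELECTION message on its trip around the ring."""
    n = len(ring)
    start = ring.index(initiator)
    trip = (ring[(start + k) % n] for k in range(n))
    return [pid for pid in trip if pid in alive]


def concurrent_elections(ring, alive, initiators):
    """Return ({initiator: chosen coordinator}, total message count)."""
    chosen = {}
    messages = 0
    for p in initiators:
        members = circulate(ring, alive, p)
        chosen[p] = max(members)
        # One ELECTION round plus one COORDINATOR round, one hop per live process.
        messages += 2 * len(members)
    return chosen, messages


ring = list(range(8))                 # hypothetical ring of 8 processes
live = set(range(7))                  # process 7 (old coordinator) is down
print(concurrent_elections(ring, live, [4]))
print(concurrent_elections(ring, live, [2, 5]))
```

Two initiators double the traffic, yet both elections name the same coordinator, because each one simply picks the highest live identifier.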

Page 22

Summary

Synchronization between processes often requires that one process act as a coordinator.

The coordinator is not fixed.

Election algorithms determine the coordinator.