Distributed Systems CS 15-440, Programming Models - Part I, Lecture 13, Oct 13, 2014, Mohammad Hammoud


Distributed Systems

CS 15-440

Programming Models- Part I

Lecture 13, Oct 13, 2014

Mohammad Hammoud


Today…

Part 1 of the Session:
  Election Algorithms
  Midterm Overview

Part 2 of the Session:
  Programming Models – Part I

Announcements:
  PS3 is due today by 11:59PM
  Midterm exam will be on Wednesday, Oct 15th (during class time): open book, open notes
  P2 is due on Oct 23, 2014 by midnight
  PS2 grades are out


Election in Distributed Systems

Many distributed algorithms require one process to act as a coordinator

Typically, it does not matter which process is elected as the coordinator

[Figure: examples of coordinator roles: the coordinator in a centralized mutual exclusion algorithm; the time server in the Berkeley clock synchronization algorithm; home node selection in naming; root node selection in multicasting]


Election Process

Any process Pi in the DS can initiate the election algorithm that elects a new coordinator

At the termination of the election algorithm, the elected coordinator process should be unique

Every process may know the process ID of every other process, but it does not know which processes have crashed

Generally, we require that the coordinator is the process with the largest process ID

The idea can be extended to elect the best coordinator

Example: Election of a coordinator with the least computational load

If the computational load of process Pi is denoted by load_i, then the coordinator is the process with the highest 1/load_i. Ties are broken by process ID.
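The load-based rule can be sketched as a small selection function. This is a minimal sketch, not code from the lecture: the function name `elect_by_load` and the sample loads are hypothetical, and it assumes that a tie goes to the larger process ID (the slide only says that ties are broken using process IDs).

```python
from fractions import Fraction


def elect_by_load(loads):
    """Return the ID of the process with the highest 1/load.

    `loads` maps process ID -> computational load (assumed > 0).
    Assumption: ties go to the larger process ID.
    """
    if not loads:
        raise ValueError("no live processes to elect from")
    # Rank by (1/load, process ID); Fraction keeps equal loads exactly equal.
    return max(loads, key=lambda pid: (Fraction(1) / Fraction(loads[pid]), pid))


# Hypothetical loads: processes 2 and 4 tie on load, so the ID decides.
loads = {1: 4, 2: 2, 3: 8, 4: 2}
coordinator = elect_by_load(loads)
print("coordinator:", coordinator)
```

Ranking on a tuple keeps the tie-break in one place, so switching to "smaller ID wins" would only require negating the second element.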


Election Algorithms

We will study two election algorithms

1. Bully Algorithm

2. Ring Algorithm


1. Bully Algorithm

A process initiates the election algorithm when it notices that the existing coordinator is not responding

Process Pi calls for an election as follows:

1. Pi sends an “Election” message to all processes with higher process IDs

2. When a process Pj with j > i receives the message, it responds with a “Take-over” message, and Pi drops out of the election

i. Process Pj then initiates its own election, and steps 1 and 2 repeat

3. If no one responds, Pi wins the election and sends a “Coordinator” message to every process
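The three steps can be traced with a small single-threaded simulation. This is a sketch under simplifying assumptions, not the lecture's implementation: message delivery is instantaneous, a crashed process simply never answers (real systems detect this with timeouts), and each process calls at most one election. The function name `bully_election` and the 8-process example are hypothetical.

```python
def bully_election(initiator, all_ids, alive):
    """Simulate one Bully election.

    `all_ids` holds every process ID; `alive` is the set of live IDs.
    Returns (coordinator, messages), where `messages` is a list of
    (sender, receiver, kind) tuples in the order they are sent.
    """
    if initiator not in alive:
        raise ValueError("the initiator must be a live process")
    ids = sorted(all_ids)
    messages = []
    pending = [initiator]   # processes that still have to call an election
    started = set()         # assumption: one election per process
    coordinator = None
    while pending:
        p = pending.pop(0)
        if p in started:
            continue
        started.add(p)
        higher = [q for q in ids if q > p]
        # Step 1: send "Election" to every higher ID (crashed or not).
        for q in higher:
            messages.append((p, q, "Election"))
        # Step 2: each live higher process answers "Take-over" and
        # then calls its own election.
        responders = [q for q in higher if q in alive]
        for q in responders:
            messages.append((q, p, "Take-over"))
            pending.append(q)
        # Step 3: nobody answered, so p wins and announces itself.
        if not responders:
            coordinator = p
            for q in ids:
                if q != p:
                    messages.append((p, q, "Coordinator"))
    return coordinator, messages


# Hypothetical run: IDs 0..7, process 7 has crashed, process 4 notices.
coordinator, messages = bully_election(4, range(8), set(range(7)))
print("coordinator:", coordinator, "messages sent:", len(messages))
```

Only the highest live process ever finds that nobody responds, which is why exactly one coordinator is announced.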

[Figure: example run of the Bully Algorithm, showing Election, Take-over, and Coordinator messages]


2. Ring Algorithm

This algorithm is generally used in a ring topology

When a process Pi detects that the coordinator has crashed, it initiates an election algorithm

1. Pi builds an “Election” message (E), inserts its own ID into it, and sends it to its next node

2. When process Pj receives the message, it appends its ID and forwards the message

i. If the next node has crashed, Pj finds the next alive node

3. When the message gets back to the process that started the election:

i. it elects the process with the highest ID as coordinator, and

ii. changes the message type to “Coordination” message (C) and circulates it in the ring
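The ring steps can be sketched as a short simulation. This is an illustrative sketch with assumed names (`ring_election`, `next_alive`): it assumes every process knows the full ring order and can tell immediately whether a successor has crashed, which real systems would approximate with timeouts.

```python
def ring_election(initiator, ring, alive):
    """Simulate one Ring election.

    `ring` lists process IDs in ring order; `alive` is the set of live IDs.
    Returns (coordinator, election_list, message_count).
    """
    if initiator not in alive:
        raise ValueError("the initiator must be a live process")

    def next_alive(pid):
        # Step 2.i: skip crashed successors until a live node is found.
        i = ring.index(pid)
        for step in range(1, len(ring) + 1):
            candidate = ring[(i + step) % len(ring)]
            if candidate in alive:
                return candidate

    # Step 1: the initiator puts its own ID into the Election message.
    election = [initiator]
    hops = 1                          # initiator -> its next live node
    current = next_alive(initiator)
    # Step 2: each live node appends its ID and forwards the message.
    while current != initiator:
        election.append(current)
        current = next_alive(current)
        hops += 1
    # Step 3: back at the initiator, elect the highest ID and send the
    # Coordination message once around the live nodes.
    coordinator = max(election)
    message_count = hops + len(election)
    return coordinator, election, message_count


# Hypothetical run: ring 0..7, process 7 has crashed, process 5 initiates.
coordinator, election, count = ring_election(5, list(range(8)), set(range(7)))
print(coordinator, election, count)
```

With 7 live processes the run uses 7 Election hops and 7 Coordination hops, which is the 2n message count for n live nodes.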

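[Figure: example run of the Ring Algorithm. The Election message grows as it circulates: E: 5; E: 5,6; E: 5,6,0; E: 5,6,0,1; E: 5,6,0,1,2; E: 5,6,0,1,2,3; E: 5,6,0,1,2,3,4. The Coordination message C: 6 is then circulated to every node in the ring.]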


Comparison of Election Algorithms

Assume that: n = number of processes in the distributed system

Algorithm         Number of Messages for Electing a Coordinator   Problems
Bully Algorithm   O(n^2)                                          Large message overhead
Ring Algorithm    2n                                              An overlay ring topology is necessary
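To make the gap between O(n^2) and 2n concrete, the counts can be worked out under one simple model. This is a sketch under stated assumptions, not a figure from the lecture: it counts only messages exchanged among n live processes, assumes the lowest-ID process starts the Bully election, and assumes every process calls exactly one election. The function names are hypothetical.

```python
def bully_messages_worst_case(n):
    """Messages in a Bully election started by the lowest of n live processes.

    Assumed model: the process with rank i (0-based) sends one "Election"
    to each of the n-1-i higher processes, every such message is answered
    with a "Take-over", and the winner sends n-1 "Coordinator" messages.
    """
    elections = n * (n - 1) // 2
    take_overs = elections
    coordinator = n - 1
    return elections + take_overs + coordinator


def ring_messages(n):
    """One Election lap plus one Coordination lap around n live processes."""
    return 2 * n


for n in (4, 16, 64, 256):
    print(n, bully_messages_worst_case(n), ring_messages(n))
```

Under this model the Bully total simplifies to n^2 - 1, so it grows quadratically while the ring count grows linearly.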


Summary of Election Algorithms

Election algorithms are used for choosing a unique process that will coordinate certain activities

At the end of the election algorithm, all nodes should uniquely identify the coordinator

We studied two algorithms for election:

Bully algorithm: processes communicate in a distributed manner to elect a coordinator

Ring algorithm: processes in a ring topology circulate election messages to choose a coordinator


Election in Large-Scale Networks

Bully Algorithm and Ring Algorithm scale poorly with the size of the network

Bully Algorithm needs O(n^2) messages

Ring Algorithm requires maintaining a ring topology and requires 2n messages to elect a leader

In large networks, these approaches do not scale well

We discuss a scalable election algorithm for large-scale peer-to-peer networks


Election in Large-Scale Peer-to-Peer Networks

Many P2P networks have a hierarchical architecture for balancing the advantages between centralized and distributed networks

Typically, P2P networks are neither completely unstructured nor completely centralized

Centralized networks are efficient, and they make it easy to locate entities and data

Flat unstructured peer-to-peer networks are robust and autonomous, and they balance load among all peers


Super-peers

In large unstructured Peer-to-Peer Networks, the network is organized into peers and super-peers

A super-peer is an entity that not only participates as a peer but also takes on the additional role of leader for a set of peers

Super-peer acts as a server for a set of client peers

All communication from and to a regular peer proceeds through a super-peer

It is expected that super-peers are long-lived nodes with high-availability

[Figure: Super-Peer Network, showing regular peers and super-peers]


Super-Peers – Election Requirements

In a hierarchical P2P network, several nodes have to be selected as super-peers

Traditionally, only one node is selected as a coordinator

Requirements for a node being elected as a super-peer:

Super-peers should be evenly distributed across the overlay network

There should be a predefined proportion of super-peers relative to the number of regular peers

Each super-peer should not need to serve more than a fixed number of regular peers


Election of Super-peers in a DHT-based system

[Figure: an m-bit node identifier whose first k bits are reserved for identifying super-peers]

Recall: In a DHT-based system each node receives a random, uniformly assigned m-bit identifier

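We reserve the first k bits to identify super-peers

E.g., let m = 8 and k = 3

A lookup for key p is first routed to the node responsible for p AND 11100000, which is then treated as the super-peer

Proportion of super-peers

If we need N super-peers, then we reserve k = ⌈log2(N)⌉ bits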
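The bit-masking step can be sketched in a few lines. This is a minimal sketch with hypothetical helper names (`superpeer_key`, `bits_for_superpeers`); it covers only the key arithmetic and assumes the DHT itself resolves the masked key to a node.

```python
def superpeer_key(p, m, k):
    """Keep the first (most significant) k bits of the m-bit key p."""
    if not 0 <= k <= m:
        raise ValueError("k must be between 0 and m")
    if not 0 <= p < (1 << m):
        raise ValueError("p must be an m-bit key")
    mask = ((1 << k) - 1) << (m - k)   # m = 8, k = 3 gives 0b11100000
    return p & mask


def bits_for_superpeers(n_superpeers):
    """Smallest k with 2**k >= n_superpeers, i.e. ceil(log2(N))."""
    if n_superpeers < 1:
        raise ValueError("need at least one super-peer")
    return (n_superpeers - 1).bit_length()


# Hypothetical key with m = 8 and k = 3, as in the slide's example.
p = 0b10110101
print(format(superpeer_key(p, 8, 3), "08b"))
print(bits_for_superpeers(8))
```

Using integer bit operations instead of a floating-point logarithm avoids rounding problems when N is an exact power of two.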


Midterm

A Quick Overview


Course Objectives

The course aims at providing an in-depth and hands-on understanding of:

Principles on which distributed systems are based

Principles on which distributed systems are optimized

Distributed system programming models and analytics engines

How modern distributed systems meet the demands of contemporary distributed applications


List of Topics

1. Architectures and Communications

2. Naming

3. Synchronization

4. Programming Models

5. Consistency and Replication

6. Fault Tolerance

7. Distributed File Systems

8. Virtualization

Considered: A reasonably critical and comprehensive understanding.

Thoughtful: A fluent, flexible and efficient understanding.

Masterful: A powerful and illuminating understanding.

Included in the Midterm


Course Content

Course Overview and Introduction (2 Lectures):
  Why distributed systems?
  Defining distributed systems
  Course overview and intended learning outcomes
  Trends in distributed systems: high-performance platforms, mobile and ubiquitous computing, cloud computing, etc.
  Challenges in designing distributed systems: heterogeneity, openness, security, scalability, reliability, concurrency, transparency and quality of service


Course Content

Architectural Models (1 Lecture):
  Client-server, peer-to-peer, tiered and layered architectures

Networking (1 Lecture):
  Types of networks
  Networking principles:
    Network classification
    Network layers (physical, data-link, network and transport layers)
    Congestion control


Course Content

Communication Paradigms (1 Lecture):
  Socket communication: TCP and UDP sockets
  Remote invocation: RPC and RMI
  Indirect communication: message-queuing, publish-subscribe, and group communication systems


Course Content

Naming (2 Lectures):
  Flat naming: broadcasting, forwarding pointers, home-based naming, and distributed hash tables
  Structured naming: hierarchical name spaces, name resolution, linking and mounting
  Attribute-based naming: LDAP and RDF


Course Content

Synchronization (3 Lectures):
  Time synchronization:
    Physical clocks (UTC, Cristian and Berkeley algorithms, and Network Time Protocol)
    Logical clocks (Lamport and vector clocks)
  Distributed mutual exclusion: permission-based and token-based
  Election algorithms: Bully and Ring algorithms