chapter1 dos

7/30/2019 Chapter1 Dos


Fundamentals of DCS


DISTRIBUTED SYSTEMS

What is a distributed system?

A distributed system is a collection of independent computers that appears to its users as a single coherent system.

A weaker definition of a distributed system:

A distributed system is a collection of independent computers that are used jointly to perform a single task or to provide a single service.


Multiple processors

Computer architectures with multiple interconnected processors are basically of two types:

1. Tightly coupled systems
2. Loosely coupled systems

Figure: (a) A tightly coupled multiprocessor system. (b) A loosely coupled multiprocessor system.


1. Tightly coupled systems. In these systems, there is a single system-wide primary memory (address space) that is shared by all the processors.

If any processor writes, for example, the value 100 to the memory location x, any other processor subsequently reading from location x will get the value 100.

Therefore, in these systems, any communication between the processors usually takes place through the shared memory.

Tightly coupled systems are referred to as parallel processing systems.


2. Loosely coupled systems. In these systems, the processors do not share memory, and each processor has its own local memory.

If a processor writes the value 100 to the memory location x, this write operation will only change the contents of its local memory and will not affect the memory of any other processor.

In these systems, all communication between the processors is done by passing messages across the network.

Loosely coupled systems are referred to as distributed computing systems.
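
The contrast between the two kinds of coupling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a thread stands in for a processor, a dictionary for primary memory, and a queue for the network link. The names `Node`, `send`, and `handle_next_message` are hypothetical.

```python
import queue
import threading

# Tightly coupled: two "processors" (threads) share one address space.
shared_memory = {"x": 0}

def writer():
    shared_memory["x"] = 100  # the write is visible to every thread

t = threading.Thread(target=writer)
t.start()
t.join()
value_seen_by_reader = shared_memory["x"]  # another processor reads location x

# Loosely coupled: each node has private memory and a mailbox. The only
# way to affect another node is to send it a message.
class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {"x": 0}        # local memory, never shared
        self.mailbox = queue.Queue()  # stands in for the network link

    def send(self, other, message):
        other.mailbox.put(message)

    def handle_next_message(self):
        location, value = self.mailbox.get()
        self.memory[location] = value

a, b = Node("A"), Node("B")
a.memory["x"] = 100        # changes only A's local memory
b_before = b.memory["x"]   # B is unaffected by A's write
a.send(b, ("x", 100))      # explicit message over the "network"
b.handle_next_message()
b_after = b.memory["x"]    # B applied the update carried by the message
```

In the first half the reader needs no cooperation from the writer. In the second half node B learns the new value only because A sent it and B chose to process the message.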


In short, a distributed computing system (DCS) is basically a collection of processors interconnected by a communication network in which each processor has its own local memory and other peripherals, and the communication between any two processors of the system takes place by message passing over the communication network.

For a particular processor, its own resources are local, whereas the other processors and their resources are remote. Together, a processor and its resources are usually referred to as a node, site, or machine of the distributed computing system.


DCS Models

The system architecture types are:

- Minicomputer Model - processor/user
- Workstation Model - processor/user
- Workstation-Server Model - processor/user
- Processor-Pool Model - processor/user
- Hybrid Model


Minicomputer Model

The minicomputer model is a simple extension of the centralized time-sharing system.

A DCS based on this model consists of a few minicomputers (they may be large supercomputers as well) interconnected by a communication network.

Each minicomputer usually has multiple users simultaneously logged on to it. For this, several interactive terminals are connected to each minicomputer.

Each user is logged on to one specific minicomputer, with remote access to other minicomputers.

The network allows a user to access remote resources that are available on some machine other than the one the user is currently logged on to.


Workstation Model

A DCS based on the workstation model consists of several workstations interconnected by a communication network.

In an environment such as a company's office or a university department, it has often been found that many workstations are idle at night, resulting in the waste of large amounts of CPU time.

The idea of the workstation model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to process jobs of users who are logged on to other workstations and do not have sufficient processing power at their own workstations to get their jobs processed efficiently.

The system transfers one or more of the processes from the user's workstation to some other workstation that is currently idle and gets the processes executed there. Finally, the result of execution is returned to the user's workstation.
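
The transfer step described above can be sketched as follows. This is a toy model, not a real process-migration facility: the `Workstation` class and the `run_on_idle_workstation` function are hypothetical names, and calling a Python function stands in for executing the migrated process on the chosen machine.

```python
from dataclasses import dataclass, field

@dataclass
class Workstation:
    name: str
    idle: bool = True
    results: dict = field(default_factory=dict)

def run_on_idle_workstation(job_name, job, home, workstations):
    """Pick an idle workstation for `job` and return the result to `home`."""
    # Prefer an idle workstation other than the user's own; fall back to home.
    target = next((w for w in workstations if w.idle and w is not home), home)
    result = job()                   # stands in for running the process on `target`
    home.results[job_name] = result  # the result goes back to the home workstation
    return target.name

stations = [
    Workstation("ws1", idle=False),  # the user's own, busy workstation
    Workstation("ws2", idle=False),
    Workstation("ws3"),              # currently idle
]
executed_on = run_on_idle_workstation(
    "sum", lambda: sum(range(10)), home=stations[0], workstations=stations
)
```

A real system must also decide how to find idle workstations, how to move the process transparently, and what to do when the owner of the idle workstation returns. None of that is modelled here.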


Workstation-Server Model

A workstation with its own local disk is usually called a diskful workstation, and a workstation without a local disk is called a diskless workstation.

With the invention of high-speed networks, diskless workstations (i.e., the workstation-server model) have become popular.

Normal computation activities are carried out on the user's workstation, whereas requests for services provided by special servers (such as a file server or a database server) are sent to a server providing that type of service. The server performs the user's requested activity and returns the result of request processing to the user's workstation.

Here, the user's processes need not be migrated to the server machines to get the work done by those machines.


Comparison

The workstation-server model has several advantages:

It is cheaper to use a few servers that are accessed over the network than a large number of diskful workstations, each having a small, slow disk.

Diskless workstations are preferred from a system maintenance point of view. Backup and hardware maintenance are easier to perform with a few large disks (on the servers). Furthermore, installing new releases of software (such as a file server with new functionalities) is easier.


In the workstation-server model, since the file servers manage all files, users have the flexibility to use any workstation and access their files in the same manner irrespective of which workstation they are currently logged on to.

In the workstation-server model, the request-response protocol is mainly used to access the services of the server machines. Therefore, this model does not need a process migration facility, which is difficult to implement.

A user has a guaranteed response time because workstations are not used for executing remote processes. However, the model does not utilize the processing capability of idle workstations.


Processor-Pool Model

A user needs a very large amount of computing power only once in a while, and for a short time.

Therefore, in the processor-pool model the processors are pooled together to be shared by the users as needed.

The pool of processors consists of a large number of microcomputers and minicomputers attached to the network.

Each processor in the pool has its own memory to load and run a system program or an application program of the distributed computing system.


Hybrid Model

To combine the advantages of both the workstation-server and processor-pool models, a hybrid model may be used to build a distributed computing system.

This model is based on the workstation-server model but with the addition of a pool of processors.

The processors in the pool can be allocated dynamically for computations that are too large for workstations or that require several computers concurrently for efficient execution.

In addition to efficient execution of computation-intensive jobs, the hybrid model gives guaranteed response to interactive jobs by allowing them to be processed on the local workstations of the users.

However, the hybrid model is more expensive to implement than the workstation-server model or the processor-pool model.


Advantages of Distributed Systems

Inherently distributed applications
- Examples: banks, reservation systems, multinational companies.

Low price/high performance
- The power of processors is increasing rapidly while their cost falls, and network speeds are increasing.

Improved reliability and availability
- Reliability is the degree of tolerance against errors and component failures.
- A reliable system prevents loss of information even in the event of failures. This is done by keeping multiple copies of critical information within the system.
- Availability refers to the fraction of the total time for which the system is available for use.
- Failure of one component need not stop the remaining components from doing the job.
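
The effect of keeping multiple copies can be estimated with a short calculation. The sketch below assumes that replicas fail independently and that the information is available whenever at least one copy is. Neither assumption comes from the slides, and the function name is hypothetical.

```python
def availability_with_replicas(single_copy_availability, replicas):
    """Fraction of time at least one of `replicas` independent copies is up."""
    # All copies must be down at once for the information to be unavailable.
    all_down = (1.0 - single_copy_availability) ** replicas
    return 1.0 - all_down

one_copy = availability_with_replicas(0.9, 1)      # about 0.9
three_copies = availability_with_replicas(0.9, 3)  # about 0.999
```

Under these assumptions, three copies that are each available 90% of the time leave the information unavailable only about 0.1% of the time. Correlated failures (for example, a shared power supply) would make the real figure worse.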


Modular expandability
- An open distributed system allows additional resources to be added as and when required.

Resource sharing
- Hardware resources (printers, plotters, storage devices) can be shared among multiple computers.
- Software resources such as libraries and databases can also be shared.

Information sharing
- Information generated by one user can be easily and efficiently shared by users working on other nodes that are geographically far from each other.
- Example: Computer Supported Cooperative Work (CSCW), or groupware.


Shorter response times and higher throughput
- Because of the multiplicity of processors, a distributed system can achieve shorter response times and higher throughput.
- A computation can be partitioned into a number of sub-computations that run on different processors simultaneously.

Better flexibility
- The system can contain a pool of different types of computers, each suitable for different types of computations.
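
Partitioning a computation into sub-computations can be sketched as below. The example uses threads from Python's `concurrent.futures` as stand-ins for separate processors, so it shows the structure of the technique rather than a real speed-up. The function names and the choice of computation (a sum of squares) are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, parts):
    """Split `data` into at most `parts` contiguous chunks of similar size."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]

def parallel_sum_of_squares(data, workers=4):
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker computes the partial result for one chunk.
        partial_sums = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        # The partial results are combined into the final answer.
        return sum(partial_sums)

total = parallel_sum_of_squares(list(range(1000)))
```

In a distributed system each chunk would be shipped to a different node and the partial sums returned as messages. The split, compute, and combine steps stay the same.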


Distributed Operating Systems

Issues:

- Transparency
- Reliability
- Flexibility
- Performance
- Scalability
- Heterogeneity
- Security
- Emulation of existing OSs


DOS: Design Issues

Transparency

Each form of transparency is listed with its description:

- Access: hide differences in data representation and in how a resource is accessed. Remote and local resources are accessed in the same way.
- Location (name transparency, user mobility): hide where a resource is located. Resource names are unique system-wide, and a user can log on to any machine and access a resource with the same name.
- Migration: hide that a resource may move to another location.
- Scaling: allow expansion of resources without disturbing the activities of the users.
- Replication: hide that a resource is replicated, that is, that replicas of files and other resources exist.
- Concurrency: hide that a resource may be shared by several competing users. This requires the properties of event ordering, mutual exclusion, no starvation, and no deadlock.
- Failure: hide the failure and recovery of a resource.
- Performance: hide the load of the processors, which varies dynamically, by migrating jobs from heavily loaded processors to lightly loaded processors.


Reliability

A fault is a mechanical or algorithmic defect that may generate an error.

A system may fail in two ways:

- Fail-stop: the system stops functioning after changing to a state in which its failure can be detected.
- Byzantine (for example, undetected software bugs): the system continues to function but produces wrong results.

Faults can be avoided, tolerated, or detected and recovered from where possible.


Fault avoidance
- Deals with designing the components of the system in such a way that the occurrence of faults is minimized.

Fault tolerance
- The ability of the system to continue functioning in the event of partial failure of the system.
- This can be achieved by:
  - Redundancy techniques: replicating critical hardware and software components, with consistency maintained among the replicas.
  - Distributed control: algorithms and protocols must have distributed control.

Fault detection and recovery
- Uses hardware and software mechanisms to determine the occurrence of a failure and to correct the system.
- Techniques include:
  - Atomic transactions: either all operations are performed successfully or none of them is.
  - Stateless servers: after a failure, the server process can be restarted without any details of previous operations.
  - Acknowledgements and timeout-based retransmission of messages: lost messages are detected by means of timeouts and acknowledgements.
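
The last technique, acknowledgements with timeout-based retransmission, can be sketched against a simulated channel. `LossyChannel` is a hypothetical test double that deterministically drops its first few transmissions, and a `False` return value stands in for a timeout expiring with no acknowledgement. A real implementation would use timers and a network socket.

```python
class LossyChannel:
    """Hypothetical channel that drops the first `drops` transmissions."""

    def __init__(self, drops):
        self.drops = drops
        self.delivered = []

    def transmit(self, message):
        """Return True if an acknowledgement came back, False on timeout."""
        if self.drops > 0:
            self.drops -= 1
            return False  # message lost: no acknowledgement before the timeout
        self.delivered.append(message)
        return True

def send_reliably(channel, message, max_attempts=5):
    """Retransmit until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel.transmit(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")

channel = LossyChannel(drops=2)
attempts_needed = send_reliably(channel, "update balance")
```

Here the message gets through on the third attempt. If the acknowledgement rather than the message had been lost, the receiver would see the same message twice, which is why retransmission is normally paired with duplicate detection.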

Flexibility

Flexibility is required for:

- Ease of modification: some parts of the design may need to be modified or replaced, because of a bug or new requirements, with minimum interruption to the users.
- Ease of enhancement.

Kernel-level design:

- Monolithic kernel: includes almost all OS services and shared resources, hence no message passing and no context switching are needed, which gives faster performance.
- Microkernel: kept as small as possible and includes only minimal facilities, hence it is easy to design, implement, and install, with a penalty in performance.


Performance

- Batch if possible: transfer data across the network in large chunks rather than as individual pages, and piggyback acknowledgements on subsequent messages.
- Cache whenever possible: cache frequently used data to reduce computation time and network bandwidth usage and to avoid contention.
- Minimize copying of data: the copying of data between the sender's stack, the message buffer in the sender's address space, and the kernel address space should be minimized.
- Minimize network traffic: cluster the processes that communicate with each other on a single node rather than migrating them.
- Use fine-grained parallelism for multiprocessing: servers can be structured as a group of threads that execute concurrently.
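
The "cache whenever possible" principle can be illustrated with a client-side cache in front of a remote fetch. This is a minimal sketch: `CachingClient` is a hypothetical name, a dictionary lookup stands in for the network request, and cache invalidation (what to do when the server's copy changes) is deliberately left out.

```python
class CachingClient:
    """Keeps a local copy of every item it has fetched from the server."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # function that performs the remote request
        self._cache = {}
        self.remote_requests = 0

    def read(self, name):
        if name not in self._cache:
            # Cache miss: one request crosses the network.
            self.remote_requests += 1
            self._cache[name] = self._fetch_remote(name)
        return self._cache[name]  # cache hit: no network traffic

server_files = {"/etc/motd": "hello"}
client = CachingClient(server_files.__getitem__)
first = client.read("/etc/motd")
second = client.read("/etc/motd")
```

Both reads return the same data, but only the first one reaches the server. The saving in bandwidth and server contention is the point of the technique, and keeping cached copies consistent is its main cost.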

Scalability

Scalability is the ability of the system to adapt to increased service load.

- Avoid centralized entities, for example by replicating resources.
- Avoid centralized algorithms, especially scheduling algorithms.
- Perform most operations on client workstations, which is done by caching the required data.

Heterogeneity

- A heterogeneous system is built from dissimilar components: networks, computer hardware, operating systems, and programming languages.
- Supporting heterogeneity provides portability and interoperability.
- Disadvantage: extra resources are required, such as mobile code and adaptability software (applets, agents, translators).


Security

Distributed systems should allow communication between programs, users, and resources on different computers. Security requirements include:

- Authentication: protection against disclosure to unauthorized persons.
- Integrity: protection against alteration and corruption.
- Authorizability: resources must be kept accessible, and their appropriate use by different users has to be guaranteed.


Message Passing

Interprocess communication (IPC) basically requires information sharing among two or more processes. Two basic methods for information sharing are as follows:

- original sharing, or the shared-data approach;
- copy sharing, or the message-passing approach.


In the shared-data approach, the information to be shared is placed in a common memory area that is accessible to all processes involved in an IPC.

In the message-passing approach, the information to be shared is physically copied from the sender process's address space to the address space of all the receiver processes, and this is done by transmitting the data to be copied in the form of messages (a message is a block of information).


Message-Passing System

A message-passing system is a subsystem of a distributed operating system that provides a set of message-based IPC protocols. It does so by shielding programmers from the details of complex network protocols and multiple heterogeneous platforms.

It enables processes to communicate by exchanging messages and allows programs to be written using simple communication primitives, such as send and receive.

It also serves as a basis for building other higher-level IPC systems, such as remote procedure call (RPC) and distributed shared memory (DSM).
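
A rough sketch of what the send and receive primitives look like to a programmer is given below. It runs inside a single Python process, with one `queue.Queue` per registered process standing in for the network and the kernel buffers. The class and method names are hypothetical, not the interface of any particular operating system.

```python
import queue

class MessagePassingSystem:
    """Toy message-passing layer: one mailbox per registered process."""

    def __init__(self):
        self._mailboxes = {}

    def register(self, process_id):
        self._mailboxes[process_id] = queue.Queue()

    def send(self, destination, message):
        # The caller names the destination and never sees network details.
        self._mailboxes[destination].put(message)

    def receive(self, process_id, timeout=None):
        # Blocks until a message for `process_id` is available.
        return self._mailboxes[process_id].get(timeout=timeout)

mps = MessagePassingSystem()
mps.register("client")
mps.register("server")

# The client sends a request; the server receives it and replies.
mps.send("server", {"from": "client", "op": "add", "args": (2, 3)})
request = mps.receive("server")
mps.send(request["from"], sum(request["args"]))
reply = mps.receive("client")
```

The request-reply exchange at the end is the pattern on which RPC is built: the client's call is turned into a message, and the return value comes back as another message.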

Features of a Good Message-Passing System

Simplicity
- The system should be simple and easy to use.
- It must be straightforward to construct new applications and to communicate with existing ones.
- It should be possible to communicate with old and new applications, and with different modules, without the need to worry about the system and network aspects.

Other features: uniform semantics, efficiency, correctness, reliability, security, flexibility, and portability.


Uniform semantics

Uniform semantics make the system easy to use. In a distributed system, a message-passing system may be used for the following two types of interprocess communication:

- local communication: the communicating processes are on the same node;
- remote communication: the communicating processes are on different nodes.

The semantics of remote communication should be as close as possible to those of local communication.

Efficiency

Efficiency is a critical issue. It can be gained by reducing the number of message exchanges. Some optimizations adopted for efficiency are:

- avoiding the cost of establishing and terminating a connection between the same pair of processes for every message exchange;
- minimizing the cost of maintaining the connections;
- piggybacking the acknowledgement of previous messages on the next message during a connection that involves several message exchanges between a sender and a receiver.

Correctness

Correctness is a feature related to group communication. Issues related to correctness are as follows:

- Atomicity: ensures that every message sent to a group of receivers will be delivered to either all of them or none of them.
- Ordered delivery: ensures that messages arrive at all receivers in an order acceptable to the application.
- Survivability: guarantees that messages will be correctly delivered despite partial failures of processes, machines, or communication links.
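
One common way to obtain ordered delivery is to number messages at the sender and hold back early arrivals at the receiver. The sketch below assumes a single sender and that "acceptable order" means the order in which the messages were sent. Other orderings (causal, total across several senders) need more machinery. `OrderedReceiver` is a hypothetical name.

```python
class OrderedReceiver:
    """Delivers messages to the application in sequence-number order."""

    def __init__(self):
        self.next_expected = 0
        self.held_back = {}  # messages that arrived too early
        self.delivered = []  # what the application has seen, in order

    def on_arrival(self, seq, payload):
        self.held_back[seq] = payload
        # Deliver every message that is now next in line.
        while self.next_expected in self.held_back:
            self.delivered.append(self.held_back.pop(self.next_expected))
            self.next_expected += 1

receiver = OrderedReceiver()
# The network delivers message 1 before message 0.
for seq, payload in [(1, "b"), (0, "a"), (2, "c")]:
    receiver.on_arrival(seq, payload)
```

Message 1 arrives first but is held until message 0 has been delivered, so the application sees "a", "b", "c" regardless of arrival order.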

Reliability

- The system should cope with failures and guarantee the delivery of messages.
- Lost messages are handled through acknowledgements and retransmissions.
- Duplicate messages are detected and handled by assigning sequence numbers to messages.
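
Duplicate detection with sequence numbers can be sketched as a small filter on the receiving side. The sketch assumes each sender numbers its own messages, so a (sender, sequence number) pair identifies a message. `DuplicateFilter` is a hypothetical name, and the set of seen identifiers grows without bound here, which a real system would have to limit.

```python
class DuplicateFilter:
    """Accepts each (sender, sequence number) pair at most once."""

    def __init__(self):
        self._seen = set()
        self.accepted = []

    def on_message(self, sender, seq, payload):
        key = (sender, seq)
        if key in self._seen:
            return False  # a retransmitted copy: discard it
        self._seen.add(key)
        self.accepted.append(payload)
        return True

dup_filter = DuplicateFilter()
outcomes = [
    dup_filter.on_message("A", 1, "debit 10"),
    dup_filter.on_message("A", 1, "debit 10"),  # retransmission of the same message
    dup_filter.on_message("B", 1, "credit 5"),  # same number, different sender
]
```

The retransmitted copy is rejected, so the "debit 10" operation is applied once even though it was received twice.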

Security

Authentication of sender and receiver

Encryption of messages before sending

Flexibility

- Users should be able to choose and specify the types and levels of reliability and correctness required by their applications.
- The system should permit any kind of control flow between the cooperating processes, including synchronous and asynchronous send and receive.

Portability

- The message-passing system itself should be portable: it should be possible to construct a new IPC facility on another system by reusing the existing basic design.
- Applications written using the message-passing system should be portable, which calls for an external data representation format.


Issues in IPC by Message Passing

A message is a block of information formatted by a sending process in such a manner that it is meaningful to the receiving process.

It consists of a fixed-length header and a variable-size collection of typed data objects.

The header usually consists of the following elements:

- Address
- Sequence number
- Structural information


Structural information

This element also has two parts.

The type part specifies whether the data to be passed on to the receiver is included within the message or whether the message only contains a pointer to the data, which is stored somewhere outside the contiguous portion of the message.

The second part specifies the length of the variable-size message data.
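
A fixed-length header followed by variable-size data can be laid out with Python's `struct` module. The field widths, the numeric type codes, and the split of the address into separate sender and receiver identifiers are all assumptions made for this sketch. The slides specify only which elements the header contains.

```python
import struct

# Assumed layout, network byte order:
# sender id (2 bytes), receiver id (2 bytes), sequence number (4 bytes),
# type (1 byte), length of the data (4 bytes).
HEADER = struct.Struct("!HHIBI")

INLINE_DATA = 0      # the data follows the header inside the message
POINTER_TO_DATA = 1  # the message only carries a reference to the data

def pack_message(sender, receiver, seq, data):
    header = HEADER.pack(sender, receiver, seq, INLINE_DATA, len(data))
    return header + data

def unpack_message(raw):
    sender, receiver, seq, kind, length = HEADER.unpack_from(raw)
    data = raw[HEADER.size:HEADER.size + length]
    return sender, receiver, seq, kind, data

raw = pack_message(sender=1, receiver=2, seq=7, data=b"hello")
fields = unpack_message(raw)
```

Because the header has a fixed size, the receiver can always read it first and learn from the length field how many more bytes belong to the message.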

Synchronization

A central issue in the communication structure is the synchronization imposed on the communicating processes by the communication primitives.

The semantics used for synchronization may be broadly classified as blocking and nonblocking types.

A primitive is said to have nonblocking semantics if its invocation does not block the execution of its invoker (the control returns almost immediately to the invoker); otherwise, a primitive is said to be of the blocking type.

- Blocking send primitive: after execution of the send statement, the sending process is blocked until it receives an acknowledgement from the receiver.
- Nonblocking send primitive: after execution of the send statement, the sending process is allowed to proceed with its execution as soon as the message has been copied to a buffer.
- Blocking receive primitive: after execution of the receive statement, the receiving process is blocked until it receives a message.
- Nonblocking receive primitive: the receiving process proceeds with its execution after execution of the receive statement, which returns control almost immediately, just after telling the kernel where the message buffer is.
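
The difference between the blocking and nonblocking send, and the behaviour of a blocking receive, can be sketched with a queue and an event. This is a single-process illustration under assumptions: a `queue.Queue` plays the message buffer, a `threading.Event` plays the acknowledgement, and all function names are hypothetical.

```python
import queue
import threading

def nonblocking_send(buf, message):
    # Returns as soon as the message has been copied into the buffer.
    buf.put((message, None))

def blocking_send(buf, message):
    # Returns only after the receiver has acknowledged the message.
    ack = threading.Event()
    buf.put((message, ack))
    ack.wait()

def blocking_receive(buf):
    # Blocks until a message is available, then acknowledges it if asked to.
    message, ack = buf.get()
    if ack is not None:
        ack.set()
    return message

buf = queue.Queue()
received = []
receiver = threading.Thread(
    target=lambda: received.extend(blocking_receive(buf) for _ in range(2))
)
receiver.start()

nonblocking_send(buf, "m1")  # the sender continues immediately
blocking_send(buf, "m2")     # the sender waits here for the acknowledgement
receiver.join()
```

When `blocking_send` returns, the sender knows the receiver has taken "m2". After `nonblocking_send` it knows only that "m1" was placed in the buffer.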

An important issue with a nonblocking receive primitive is how the receiving process knows that the message has arrived in the message buffer. One of the following two methods is used:

- Polling: the receiver uses a test primitive to periodically poll the kernel to check the buffer status, that is, whether the message is already available in the buffer.
- Interrupt: when the message has been placed in the buffer and is ready for use by the receiver, a software interrupt is used to notify the receiving process.

A variant of the nonblocking receive is the conditional receive primitive, which also returns control to the invoking process almost immediately, either with a message or with an indicator that no message is available.

When both the send and receive primitives of a communication between two processes use blocking semantics, the communication is said to be synchronous; otherwise, it is asynchronous.

The main drawback of synchronous communication is that it limits concurrency and is subject to communication deadlocks.

Figure: Synchronous mode of communication, with both send and receive primitives having blocking-type semantics.

Synchronous communication is more reliable and easier to implement.


Buffering

In the standard message-passing model, messages can be copied many times, hence buffering has to be managed.

The four types of buffering strategies used in interprocess communication are:

- Null buffer (no buffering)
- Single-message buffer
- Unbounded-capacity buffer
- Finite-bound (multiple-message) buffer


Null Buffer (No Buffering)

With a null buffer there is no place to temporarily store the message. Hence one of the following implementation strategies is used:

- The message remains in the sender process's address space and the execution of the send is delayed until the receiver executes the corresponding receive.
- The message is simply discarded and a timeout mechanism is used to resend the message after a timeout period. The sender may have to try several times before succeeding.

Single-Message Buffer

- A buffer with the capacity to store a single message is kept on the receiver's node.
- This strategy is usually used for synchronous communication, in which an application may have at most one message outstanding at a time.

Unbounded-Capacity Buffer

- In the asynchronous mode of communication, since a sender does not wait for the receiver to be ready, there may be several pending messages that have not yet been accepted by the receiver.
- Therefore, a message buffer that can store all unreceived messages is needed to support asynchronous communication with the assurance that all the messages sent to the receiver will be delivered.

Finite-Bound Buffer

- Finite-bound buffers are also known as multiple-message buffers. A message is first copied from the sending process's memory into the receiving process's mailbox and then copied from the mailbox to the receiver's memory when the receiver calls for the message.
- A strategy is needed for handling buffer overflow. Overflow can be dealt with in one of the following two ways:
  - Unsuccessful communication: message transfers simply fail and an error is returned.
  - Flow-controlled communication: the sender is blocked until the receiver accepts some messages, thus creating space in the buffer for new messages.
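
The two overflow strategies map directly onto the two ways of putting an item into a bounded `queue.Queue` in Python. The sketch below uses a mailbox with room for two messages. The function names are hypothetical, and the queue stands in for the receiver's mailbox.

```python
import queue

def send_or_fail(mailbox, message):
    """Unsuccessful communication: report an error if the buffer is full."""
    try:
        mailbox.put_nowait(message)
        return True
    except queue.Full:
        return False

def send_flow_controlled(mailbox, message, timeout=None):
    """Flow-controlled communication: block until there is space."""
    mailbox.put(message, timeout=timeout)

mailbox = queue.Queue(maxsize=2)  # finite-bound buffer with two slots
results = [send_or_fail(mailbox, m) for m in ("m1", "m2", "m3")]

taken = mailbox.get()                 # the receiver accepts one message
send_flow_controlled(mailbox, "m3")   # space is available, so this returns at once
```

The third `send_or_fail` is rejected because both slots are occupied. Once the receiver has removed "m1", the flow-controlled send of "m3" goes through. Had the buffer still been full, that call would have blocked the sender instead of failing.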

Failure Handling

During interprocess communication, partial failures such as a node crash or a communication link failure may lead to the following problems:

- Loss of request message. This may happen either due to the failure of the communication link between the sender and receiver or because the receiver's node is down at the time the request message reaches there.
- Loss of response message. This may happen either due to the failure of the communication link between the sender and receiver or because the sender's node is down at the time the response message reaches there.
- Unsuccessful execution of the request. This may happen due to the receiver's node crashing while the request is being processed.

