chap-02v2 modified by pratit

7/28/2019 Chap-02v2 Modified by Pratit


DISTRIBUTED SYSTEMS

Principles and Paradigms

Second Edition

ANDREW S. TANENBAUM

MAARTEN VAN STEEN

Chapter 2

ARCHITECTURES

Modified by Pratit Santiprabhob



Architecture

(Software) architecture: the logical organization of a distributed system into software components.

Designing or adopting a good architecture is crucial for the success of the system being developed.

Architectural style is important; a style is formulated in terms of components, the way the components are connected to each other, the data exchanged between components, and how these elements are jointly configured into a system.

A component is a modular unit with well-defined (required and provided) interfaces that is replaceable within its environment.

A connector is a mechanism that mediates communication, coordination, or cooperation among components.



Architectural Styles (1)

Important styles of architecture for distributed systems are:

Layered architectures
  e.g. networking

Object-based architectures
  e.g. client/server

Data-centered architectures
  e.g. shared distributed file systems, shared Web-based data services

Event-based architectures
  e.g. publish/subscribe systems; processes are loosely coupled (referentially decoupled)
  Data may be sent along with events

Shared data spaces
  A combination of event-based and data-centered architectures
  Processes are also decoupled in time, sharing through a shared persistent data space



Architectural Styles (2)

Figure 2-1. (a) The layered architectural style.




Figure 2-1. (b) The object-based architectural style.




Figure 2-2. (a) The event-based architectural style.




Figure 2-2. (b) The shared data-space architectural style.



System Architectures

Software components and their interaction dictate the software architecture as well as the system architecture.

Major system architectures include:

Centralized architectures
  Client/server architectures

Decentralized architectures
  Structured peer-to-peer architectures
  Unstructured peer-to-peer architectures
  Superpeers

Hybrid architectures
  Edge-server systems
  Collaborative distributed systems



Centralized Architectures

Many researchers and practitioners in the distributed systems field agree that thinking in terms of clients requesting services from servers, and servers providing services to clients, helps in understanding and designing distributed systems.

The client/server model also helps us manage the complexity of distributed systems.



Client/Server Model (1)

The client/server model consists of:

  A server, which is a process implementing a specific service and waiting for requests from clients

  A client, which is a process requesting a service from a server by sending a request and waiting for a response

In a connectionless environment, requests and responses may be lost during transmission. This raises the question of what a client should do when it receives no response.

An idempotent operation is one that can be repeated multiple times without harm (repeating it leaves no additional effect), so the client can simply resend the request.

Connection-oriented communication is an alternative, at the cost of additional overhead.
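
The difference between idempotent and non-idempotent requests can be illustrated with a minimal sketch. The toy `Server` class, its `replies_to_drop` parameter, and the operation names `set_balance` and `deposit` are all assumptions made for illustration; a reply that is "lost" is modelled by returning `None` even though the request was executed.

```python
class Server:
    """Toy server (hypothetical) that executes every request but drops some replies."""

    def __init__(self, replies_to_drop):
        self.balance = 100
        self.replies_to_drop = replies_to_drop

    def handle(self, op, amount):
        if op == "set_balance":      # idempotent: repeating it leaves the same state
            self.balance = amount
        elif op == "deposit":        # not idempotent: every repetition adds again
            self.balance += amount
        # The request has been executed; the reply may still be lost on the way back.
        if self.replies_to_drop > 0:
            self.replies_to_drop -= 1
            return None
        return self.balance


def call_with_retry(server, op, amount, max_tries=10):
    """Client side: resend the request until a reply arrives."""
    for _ in range(max_tries):
        reply = server.handle(op, amount)
        if reply is not None:
            return reply
    raise TimeoutError("no response from server")
```

With two lost replies, retrying `set_balance` still ends with the intended balance, while retrying `deposit` applies the deposit three times. This is why blind retransmission is only safe for idempotent operations.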



Client/Server Model (2)

Figure 2-3. General interaction between a client and a server.



Application Layering (1)

Typical client/server applications can be divided into three layers:

The user-interface level
  Can range from a simple text-based interface to a full-fledged object-based windowing interface

The processing level
  This part varies considerably from one application to another
  Could be anything from calculation, to searching, to generating HTML pages

The data level
  Maintains the actual data on which the applications operate
  Data is often persistent
  Data must be kept consistent across different applications
  Depending on the nature of the data, a relational or an object-oriented data model may be used



Application Layering (2)

Figure 2-4. The simplified organization of an Internet search

engine into three different layers.



Multitiered Architectures (1)

The simplest organization is to have only

two types of machines (two-tiered):

A client machine containing only the

programs implementing (part of) the user-interface level

A server machine containing the rest, the

programs implementing the processing and

data levels



Multitiered Architectures (2)

However, within this two-tiered architecture, the division of work between client and server can vary (as shown on the next slide).

The thin-client arrangement, in which most processing is done on the server side, is normally preferred to the fat-client arrangement, in which more of the processing is done on the client side:

  It depends less on the client platform, which is usually less stable than the server platform

  Letting the client do only a few things reduces the chance of errors in client software

  Management is much simpler, especially for software upgrades and maintenance, since more of the software resides on the server




Figure 2-5. Alternative client-server organizations (a)-(e).



Multitiered Architectures (4)

With a lot of work to be done on the server side, the server processing can itself be further distributed.

This leads to 3-tiered and 4-tiered architectures, e.g.

  3-tiered: client, TP monitor, (possibly many) database server(s)

  3-tiered: client, application server, database server

  4-tiered: web browser, web server, application server, database server




Figure 2-6. An example of a server acting as client.



Decentralized Architectures

The multitiered centralized architectures place logically different components on different machines; this is called vertical distribution.

  In vertical distribution, functions are logically and physically split across multiple machines

In horizontal distribution, on the other hand, a client or a server may be physically split up into logically equivalent parts; each part operates on its own share of the complete data set, thus balancing the load.

Horizontal distribution is represented by peer-to-peer systems.



Peer-to-Peer Systems

The processes that constitute a peer-to-peer system are all basically equal; the functions that need to be carried out are represented by every process.

Thus, much of the interaction between processes is symmetric: each process acts as a client and a server at the same time (acting as a servent).

An overlay network is a network in which the nodes are formed by the processes and the links represent the possible communication channels. There are two types of overlay network:

  Structured

  Unstructured



Structured Peer-to-Peer Architectures (1)

Here, the overlay network is constructed using a deterministic procedure.

The typical arrangement is through the use of a Distributed Hash Table (DHT):

  Data items/objects are assigned a random key from a large identifier space, e.g. a 128-bit or 160-bit space

  Nodes are also assigned a random identifier from the same identifier space

  Each data item is then associated with a node, based on some distance metric between keys and node identifiers

Two examples are the Chord system and the Content Addressable Network (CAN).

Membership management will now be examined.



Structured Peer-to-Peer Architectures (2)

In the Chord system:

  Nodes and data items are organized in a ring

  Each data item with key k is mapped to the node with the smallest identifier id >= k (the successor of k)

  A node joining the ring must contact its successor and predecessor to insert itself; certain data items are then transferred to it from its successor

  A node leaving the ring also needs to inform its successor and predecessor, and transfers all data items associated with it to its successor
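
The mapping rule above (key k goes to the node with the smallest identifier id >= k, wrapping around the ring) can be sketched as follows. This is only an illustration of the lookup rule, not of Chord's routing; the function name `successor`, the parameter `m` (number of identifier bits), and the node identifiers in the usage note are assumptions.

```python
from bisect import bisect_left


def successor(node_ids, key, m):
    """Return the node responsible for `key` in an m-bit identifier ring:
    the smallest node id >= key, wrapping to the smallest id overall."""
    ring = sorted(node_ids)
    key %= 2 ** m
    i = bisect_left(ring, key)      # index of the first id >= key
    return ring[i % len(ring)]      # wrap around past the largest id


def assign(node_ids, keys, m):
    """Group keys by the node responsible for them."""
    mapping = {n: [] for n in node_ids}
    for k in keys:
        mapping[successor(node_ids, k, m)].append(k)
    return mapping
```

For example, with hypothetical nodes 1, 4, 7 and 12 in a 4-bit space, key 5 is stored at node 7, and key 13 wraps around to node 1. When a node with identifier 15 later joins, it takes over keys 13 to 15 from its successor, node 1.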



Structured Peer-to-Peer Architectures (3)

Figure 2-7. The mapping of data items onto nodes in Chord.



Structured Peer-to-Peer Architectures (4)

In a Content Addressable Network (CAN):

  A d-dimensional Cartesian coordinate space is used to hold the data items; it is completely partitioned among all the nodes participating in the system

  Each node maintains knowledge of its neighbors

  When a node wants to join the system, it picks an arbitrary point in the space, looks up which node is responsible for that point, and asks that node to split its region in half; the joining node is given one half together with the associated data items

  When a node wants to leave the system, its region is assigned to one of its neighbors

  After a while (as nodes leave), the partitioning becomes less symmetric; the coordinate space should therefore be repartitioned periodically in the background
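
The join step can be sketched for the two-dimensional case. This is a minimal sketch under stated assumptions: the `Region` class and `join` function are hypothetical names, regions are axis-aligned rectangles, the longer side is the one that is split, and the joining node receives the upper half. None of these choices is prescribed by the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Rectangle [x0, x1) x [y0, y1) owned by one node, plus its data items."""
    x0: float
    y0: float
    x1: float
    y1: float
    items: dict = field(default_factory=dict)   # (x, y) point -> value

    def contains(self, p):
        return self.x0 <= p[0] < self.x1 and self.y0 <= p[1] < self.y1


def join(regions, new_node, point):
    """Split the region containing `point` and give one half to `new_node`."""
    owner = next(n for n, r in regions.items() if r.contains(point))
    r = regions[owner]
    if (r.x1 - r.x0) >= (r.y1 - r.y0):          # assumption: split the longer side
        mid = (r.x0 + r.x1) / 2
        new = Region(mid, r.y0, r.x1, r.y1)
        r.x1 = mid
    else:
        mid = (r.y0 + r.y1) / 2
        new = Region(r.x0, mid, r.x1, r.y1)
        r.y1 = mid
    # Hand over the data items that now fall into the new node's half.
    for p in [p for p in r.items if new.contains(p)]:
        new.items[p] = r.items.pop(p)
    regions[new_node] = new
    return owner
```

Starting from a single node owning the unit square, a join at point (0.7, 0.7) splits the square at x = 0.5 and moves every item with x >= 0.5 to the new node.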




Figure 2-8. (a) The mapping

of data items onto nodes

in CAN.




Figure 2-8. (b) Splitting a region when a node joins.


Unstructured Peer-to-Peer Architectures (1)

Unstructured peer-to-peer systems rely on randomized algorithms for constructing an overlay network:

  Each node maintains a list of neighbors

  The neighbor list is constructed in a more or less random way

  Data items are assumed to be placed randomly on nodes

  Basically, when a node needs to locate a specific data item, it has to flood the network with a search query

Membership management

  We want to construct an overlay network that resembles a random graph

  Each node maintains a list of c neighbors, each a randomly chosen live node: a partial view

  One framework for constructing and maintaining a partial view is shown on the next slides
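
A heavily simplified sketch of such a partial-view exchange is given below. It is not the algorithm of Figure 2-9 (in particular it ignores entry ages); it only illustrates the idea that two nodes swap some entries and then each keeps at most c of them. The names `exchange` and `merge`, and the choice to send about c/2 entries, are assumptions.

```python
import random


def merge(view, received, owner, c, rng):
    """Combine old and received entries, drop the owner itself, keep at most c."""
    candidates = (set(view) | set(received)) - {owner}
    return set(rng.sample(sorted(candidates), min(c, len(candidates))))


def exchange(views, p, c, rng):
    """One push-pull exchange started by node p with a random neighbor q."""
    q = rng.choice(sorted(views[p]))
    half = max(1, c // 2)
    others_p = sorted(views[p] - {q})
    others_q = sorted(views[q] - {p})
    # Push: p sends a reference to itself plus some entries of its view.
    to_q = rng.sample(others_p, min(half - 1, len(others_p))) + [p]
    # Pull: q answers with some entries of its own view.
    to_p = rng.sample(others_q, min(half, len(others_q)))
    views[q] = merge(views[q], to_q, q, c, rng)
    views[p] = merge(views[p], to_p, p, c, rng)
    return q
```

Repeating `exchange` for randomly chosen nodes keeps every view at size c or smaller and continuously mixes the entries, which is what lets the overlay approximate a random graph.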




Unstructured Peer-to-Peer

Architectures (2)

Figure 2-9. (a) The steps taken by the active thread.


Unstructured Peer-to-Peer

Architectures (3)

Figure 2-9. (b) The steps taken by the passive thread.




Unstructured Peer-to-Peer Architectures (4)

A joining node can contact an arbitrary other node (possibly from a list of well-known access points, i.e. highly available nodes) to start its association with the network.

Protocols that use only push mode or only pull mode may lead to undesirable, disconnected overlays.

  Both modes should be employed so that nodes actually exchange entries

A leaving node may simply depart without informing anyone.

  Assuming that partial views are exchanged regularly among nodes, a node that has departed will no longer respond and will be removed from its peers' partial views

Note, however, that for a given node P, the higher its indegree, the more often it will be contacted, and hence the higher its workload.


Topology Management of Overlay Networks (1)

By carefully exchanging and selecting entries from partial views, it is possible to construct and maintain specific topologies of overlay networks.

Two-layered approach:

  The lower layer is an unstructured peer-to-peer system in which nodes periodically exchange entries of their partial views, with the aim of maintaining an accurate random graph (partial views filled with randomly selected live nodes)

  The partial view is passed to the upper layer, where an additional selection of entries takes place

  A ranking function can be used to order nodes according to some criterion relative to a given node, e.g. by distance or by semantic proximity
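
The upper-layer selection step can be sketched as keeping the c best-ranked candidates. The function names and the ring distance used as the ranking criterion are assumptions chosen for illustration; any ranking function with the same signature could be plugged in.

```python
def select_neighbors(node, candidates, c, distance):
    """Keep the c candidates ranked closest to `node` (ties broken by id)."""
    others = (n for n in set(candidates) if n != node)
    ranked = sorted(others, key=lambda n: (distance(node, n), n))
    return ranked[:c]


def ring_distance(size):
    """Ranking function for nodes numbered 0..size-1 placed on a ring."""
    def distance(a, b):
        d = abs(a - b) % size
        return min(d, size - d)
    return distance
```

If each node repeatedly feeds the entries it receives from the lower layer through `select_neighbors` with `ring_distance(10)`, node 0 ends up with neighbors 1 and 9, so the selected links form a ring.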





Topology Management of Overlay Networks (2)

Figure 2-10. A two-layered approach for constructing and

maintaining specific overlay topologies using techniques from

unstructured peer-to-peer systems.





Topology Management of Overlay Networks (3)

Figure 2-11. Generating a specific overlay network using a two-layered unstructured peer-to-peer system [adapted with

permission from Jelasity and Babaoglu (2005)].



Superpeers (1)

In some cases, it may be wise to move away from the symmetric nature of pure peer-to-peer networks.

Note that locating relevant data items in an unstructured peer-to-peer system can become a real problem, especially as the network grows.

  E.g. in a collaborative Content Delivery Network (CDN), nodes may offer storage for hosting copies of Web pages, allowing Web clients to access pages near them

Using a node that maintains an index or acts as a broker, a superpeer, can be helpful in this case; one such node can keep track of the resources of a number of associated nodes.

The relationship between a client (regular peer) and its superpeer can be fixed, or it can change over time as clients discover better superpeers to associate with.



Superpeers (2)

Figure 2-12. A hierarchical organization of nodes into a

superpeer network.



Hybrid Architectures

In the real world, there are not only client/server and peer-to-peer architectures, but also hybrid architectures, e.g. superpeer networks.

Such distributed systems combine different architectural features.

There are specific classes of distributed systems in which client/server (centralized) architectures are combined with decentralized architectures.



Edge-Server Systems (1)

Edge-server systems are deployed on the Internet, with servers placed at the edge of the network.

  The edge is the boundary between enterprise networks and the actual Internet, e.g. as provided by an ISP

  An ISP can also be considered as residing at the edge when home users connect to the Internet through it

Clients connect to the Internet by means of an edge server.

An edge server's main purpose is to serve content.

A collection of edge servers can be used to optimize content and application distribution: one edge server acts as an origin server from which all content originates, and other servers may then replicate the content.



Edge-Server Systems (2)

Figure 2-13. Viewing the Internet as consisting of a

collection of edge servers.



Collaborative Distributed Systems (1)

To get a node started, a traditional client/server scheme is usually used.

Afterward, once the node has (fully) joined the system, a decentralized scheme is used for collaboration.

A good example is BitTorrent:

  To download a file, a user first needs to access a global directory

  The directory contains references to what are called .torrent files

  Each .torrent file refers to a tracker

  A tracker is a server that keeps an accurate account of the active nodes that hold chunks of the requested file

  The downloading node also needs to share its chunks with other nodes; otherwise, other nodes may choose to decrease the rate at which they send data to it



Collaborative Distributed Systems (2)

Figure 2-14. The principal working of BitTorrent [adapted with

permission from Pouwelse et al. (2004)].



Collaborative Distributed Systems (3)

In the Globule collaborative content distribution network, users/organizations voluntarily provide enhanced Web servers that are capable of collaborating in the replication of Web pages.

Components of each such server include:

  A component that can redirect client requests to other servers

  A component for analyzing access patterns

  A component for managing the replication of Web pages

A centralized component in Globule is the broker (server), which is responsible for registering servers and making them known to others.

Note that the broker itself can also be replicated for reliability.


Architectures Versus Middleware

Middleware forms a layer between applications and distributed platforms in order to provide a degree of distribution transparency: hiding the distribution of data, processing, and control from applications.

In practice, a middleware system follows a specific architectural style, e.g.

  CORBA for the object-based style

  TIB/Rendezvous for the event-based style

This makes designing and developing applications simpler; however, the chosen style may not be optimal for every set of requirements.

Adaptation is desirable in many cases, and can be achieved by:

  Providing multiple versions of a middleware system, or

  Making the middleware easy to configure, adapt, and customize

Middleware systems are now being developed with a stricter separation between policies and mechanisms.


Interceptors (1)

An interceptor is a software construct that breaks the usual flow of control and allows other (application-specific) code to be executed.

Two approaches to building an interceptor:

  Generic, but possibly rather complex and difficult to build

  Simple, but with restricted applicability

As shown in the example on the next slide:

  A request-level interceptor may hide certain aspects of distribution from the caller, e.g. when object B has multiple replicas and the invocation needs to be sent to all of them

  A message-level interceptor may, together with the local OS, handle fragmentation of data for communication, transparently to the application (caller)
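
A request-level interceptor for the replication case can be sketched as a proxy that looks like a single object B to the caller but forwards each invocation to every replica. The class names, the method `do_something`, and the decision to return the first replica's result are assumptions for illustration only; real middleware would hook into the invocation path differently.

```python
class Replica:
    """Hypothetical replica of object B that records what it was asked to do."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def do_something(self, value):
        self.log.append(value)
        return f"{self.name} handled {value}"


class RequestLevelInterceptor:
    """Stands in for object B and fans each invocation out to all replicas."""

    def __init__(self, replicas):
        self._replicas = replicas

    def __getattr__(self, method_name):
        # Called only for attributes not found normally, i.e. B's methods.
        def invoke_on_all(*args, **kwargs):
            results = [getattr(r, method_name)(*args, **kwargs)
                       for r in self._replicas]
            return results[0]   # assumption: the first reply is handed back
        return invoke_on_all
```

The caller simply writes `b.do_something(42)` as if `b` were one object; it never learns that two replicas executed the call.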



Interceptors (2)

Figure 2-15. Using interceptors to handle

remote-object invocations.


General Approaches to Adaptive Software

Interceptors offer a means to adapt the middleware.

  This adaptation responds to the fact that the environment in which distributed applications execute changes continuously

Three basic approaches to adaptive software:

Separation of concerns
  Modularizing systems
  Aspect-oriented software development

Computational reflection
  The ability of a program to inspect itself and, if needed, adapt its behavior

Component-based design
  Supporting adaptation through composition; this may even be done at run time


Discussion

Trying to achieve distribution transparency often leads to bulky and complex middleware.

  The conflicting requirements of generality and specialization often result in middleware solutions that are highly flexible, but complex

Nevertheless, the need for adaptive software is there; software needs to adapt as its environment changes.

  Many distributed systems cannot be shut down

  Components must be replaced or upgraded on the fly

It is also desirable that distributed systems can react to changes in their environment.

  The challenge is to let this reactive behavior take place without human intervention


Self-Management in Distributed Systems

Distributed systems and their associated middleware need to help shield applications from undesirable features of the network in order to support as many applications as possible (distribution transparency).

However, as distribution transparency comes with complexity, full distribution transparency is not always what is wanted.

In many cases, what we want is for distributed systems to be adaptive, particularly when it comes to adapting their execution behavior rather than their software components.

Adaptation needs to be done automatically:

  Components are needed that monitor and adjust the system

  It must be decided where the processes handling adaptation are to be executed

This is known as autonomic computing or self-* (self-star) systems (self-managing, self-healing, self-configuring, etc.).


The Feedback Control Model (1)

This is a well-established engineering principle.

When a system operates in its environment, there are always some disturbances or uncontrollable parameters that cause the output and/or performance of the system to deviate from what is expected.

The logical organization of a feedback control loop consists of three main elements:

  Metric estimation component: uses observed output measurements, e.g. round-trip time, to estimate relevant parameters of the system's output/performance, e.g. latency

  Feedback analysis component: forms the heart of the control loop; it contains the algorithms that decide on possible adaptations




The Feedback Control Model (2)

  Mechanisms to directly influence the behavior of the system:

    There can be many different mechanisms, e.g. placing replicas, redirecting requests to different servers, changing routing, changing scheduling priorities

    The analysis component needs to be aware of the mechanisms actually available in order to choose appropriate ones for adjustment

This feedback control model provides a means for self-management of a distributed system; it also fits manual management of such a system, in which case the analysis component is replaced by a human operator.

Note that what is discussed here is a logical organization; in the actual physical organization, the components discussed may be, and usually are, physically distributed over different parts of the system.
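
The three elements of the loop can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the class name `FeedbackLoop`, the use of an exponentially weighted moving average as the estimator, the threshold rule in the analysis step, and replica placement as the only adjustment mechanism.

```python
class FeedbackLoop:
    def __init__(self, target_latency, alpha=0.5):
        self.target = target_latency
        self.alpha = alpha          # weight of the newest measurement
        self.estimate = None
        self.replicas = 1

    def estimate_metric(self, rtt):
        """Metric estimation: smooth raw round-trip times into a latency estimate."""
        if self.estimate is None:
            self.estimate = rtt
        else:
            self.estimate = self.alpha * rtt + (1 - self.alpha) * self.estimate
        return self.estimate

    def analyze(self):
        """Feedback analysis: decide which of the available mechanisms to use."""
        if self.estimate > self.target:
            return "add_replica"
        if self.estimate < self.target / 2 and self.replicas > 1:
            return "remove_replica"
        return "none"

    def adjust(self, action):
        """Mechanism: directly influence the system's behavior."""
        if action == "add_replica":
            self.replicas += 1
        elif action == "remove_replica":
            self.replicas -= 1

    def step(self, rtt):
        self.estimate_metric(rtt)
        action = self.analyze()
        self.adjust(action)
        return action
```

Replacing `analyze` with a prompt to a human operator gives the manual-management variant mentioned above, with the estimation and adjustment parts unchanged.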




The Feedback Control Model (3)

Figure 2-16. The logical organization of a

feedback control system.


Example: Systems Monitoring with Astrolabe (1)

Astrolabe is a general tool/scheme for monitoring system behavior; its output can be used by an analysis component to decide on corrective actions.

The system organizes a large collection of nodes/hosts into a hierarchy of zones:

  Each lowest-level zone consists of a single host

  Going up the hierarchy, zones are grouped into zones of increasing size

  The top-level zone contains all the hosts

Each host runs an Astrolabe process, called an agent, that collects information on the zones in which the host is contained.

The agents also communicate with one another to spread zone information across the entire system.





Each host maintains a set of attributes for collecting local information; only these attributes are writable.

Astrolabe organizes the collected information into records of database tables, which can easily be manipulated further by means of SQL and some enhancements to SQL.

E.g. each machine in a zone may collect information such as CPU load, free memory, and number of active processes; the averages of these can then be computed for the zone.

The agent on each host is responsible for collecting its local information and for computing parts of the tables of its associated zones.

A gossiping protocol is used to exchange information among agents, so that agents that need to assist in obtaining some aggregated information will see the same result (if no changes occur in the meantime).
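
The per-zone averaging can be sketched with a small recursive function. This shows only the aggregation arithmetic, not Astrolabe's SQL interface or its gossiping; the data layout (a host as a dictionary of attributes, a zone as a list of child zones) and the attribute name `load` are assumptions. Host counts are carried upward so that a zone's average is weighted correctly when its children differ in size.

```python
def aggregate(zone):
    """Return (number_of_hosts, {attribute: average}) for a zone.

    A host is a dict of numeric attributes; a higher-level zone is a
    non-empty list of child zones. All hosts are assumed to report the
    same attributes.
    """
    if isinstance(zone, dict):              # lowest-level zone: a single host
        return 1, dict(zone)
    total, sums = 0, {}
    for child in zone:
        n, averages = aggregate(child)
        total += n
        for attr, value in averages.items():
            # Weight each child's average by the number of hosts behind it.
            sums[attr] = sums.get(attr, 0.0) + value * n
    return total, {attr: s / total for attr, s in sums.items()}
```

For a top-level zone made of one zone with two hosts (loads 1.0 and 3.0) and one zone with a single host (load 5.0), the result is an average load of 3.0 over three hosts, rather than the 3.5 that a naive average of the two child averages would give.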





Figure 2-17. Data collection and information

aggregation in Astrolabe.


Example: Differentiating Replication Strategies in Globule (1)

Globule is a collaborative content distribution network:

  End-user servers are placed in the Internet

  These servers collaborate to optimize performance through replication of Web pages

Globule assumes that the Internet can be viewed as an edge-server system; requests can always be passed through an appropriate edge server.

An origin server can observe what would happen if a replica were placed on a specific edge server, and periodically determine an appropriate replication policy.

  Determining an appropriate replication policy requires analyzing various performance metrics for each of the policies being evaluated, e.g. the aggregated delay between clients and the replica server, and the bandwidth needed between the origin server and the replica server





Figure 2-18. The edge-server model assumed by Globule.





The cost of each replication policy can be expressed as a simple linear weighted function, in which the weights are configurable to represent the importance of each performance metric.

The evaluation is done per page, against a few tens of replication policies, by means of trace-driven simulation; it is repeated periodically, once enough requests for the page have been collected.

It is interesting to observe that:

  The error in predicting the best policy goes up if the trace is not long enough (not enough information)

  The error also goes up when the trace gets too long (too many changes in the access pattern are captured, which may be irrelevant to recent behavior)
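
The linear weighted cost function and the choice of the cheapest policy can be sketched as follows. The metric names (`delay`, `bandwidth`), the policy names, and all numbers are hypothetical; in Globule the metric values would come from the trace-driven simulation of each policy.

```python
def cost(metrics, weights):
    """Linear weighted cost: sum of weight * metric over all weighted metrics."""
    return sum(weights[name] * metrics[name] for name in weights)


def best_policy(simulated, weights):
    """Pick the policy whose simulated metrics give the lowest cost."""
    return min(simulated, key=lambda policy: cost(simulated[policy], weights))
```

For instance, with weights `{"delay": 1.0, "bandwidth": 0.5}`, a policy with delay 100 and bandwidth 10 costs 105, while one with delay 40 and bandwidth 80 costs 80, so the second is selected. Raising the bandwidth weight to 2.0 reverses the choice (120 versus 200), which is how the weights express the relative importance of each metric.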





Figure 2-19. The dependency between prediction

accuracy and trace length.


Example: Automatic Component Repair Management in Jade (1)

When a cluster consists of servers that are built using a component-based approach, component failures can be detected and the failed components replaced automatically; this is what Jade does.

Jade is built on the Fractal component model, a Java implementation of a framework that allows components to be added and removed at run time.

Each component has two types of interfaces:

  A server interface, which is used to call methods implemented by that component

  A client interface, which is used by the component to call other components

Components are connected to each other by binding their respective client and server interfaces.




Example: Automatic Component Repair Management in Jade (2)

Jade has the notion of a repair management domain, in which each node represents a server together with the components executed by that server; there is a separate node manager.

Each node is equipped with failure detectors, which report any detected failures to the node manager.

An example of a repair procedure initiated by the node manager may have the following steps:

  Terminate every binding between a component on a non-faulty node and a component on the node that just failed.

  Request the node manager to start and add a new node to the domain.

  Configure the new node with exactly the same components as those on the crashed node.

  Re-establish all the bindings that were previously terminated.
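
The steps above can be sketched on a toy model of a repair management domain. This does not use the Fractal or Jade APIs; the data layout (a dictionary from node name to component names, and bindings as pairs of `(node, component)` endpoints) and the function name `repair` are assumptions for illustration.

```python
def repair(domain, bindings, failed, new_node):
    """Replace `failed` by `new_node`, updating `domain` and `bindings` in place.

    domain:   dict mapping node name -> list of component names
    bindings: set of ((client_node, client_comp), (server_node, server_comp))
    """
    # Step 1: terminate every binding that involves the failed node.
    broken = {b for b in bindings if failed in (b[0][0], b[1][0])}
    bindings -= broken

    # Steps 2 and 3: add a new node configured with the same components.
    domain[new_node] = list(domain.pop(failed))

    # Step 4: re-establish the terminated bindings, now pointing at the new node.
    def moved(endpoint):
        node, component = endpoint
        return (new_node, component) if node == failed else endpoint

    bindings |= {(moved(client), moved(server)) for client, server in broken}
```

As in the procedure it mirrors, this sketch only recreates components and their bindings; it does nothing to recover any state the failed components were holding.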




Example: Automatic Component Repair Management in Jade (3)

Note that the above repair procedure only works when the crashed component/node is stateless, so that no crucial data is lost.

Further observe that a repair procedure is driven by a repair policy, which in turn depends on the detected failure.

All of this is managed by the node manager, based on the failure reports from the nodes in a given repair management domain.

Observe also that turning a legacy, non-component-based application into a self-managing system with the approach described here is impossible.


Summary

For distributed systems, we can make a distinction between:

  Software architecture: the logical organization of software components and their interactions, and

  System architecture: where the components are physically placed

Important architectural styles include layering, object orientation, event orientation, and data-space orientation.

In centralized architectures, client/server is the main concept of interaction.

In decentralized architectures, processes play more or less equal roles, hence a peer-to-peer system in an overlay network:

  Structured: a deterministic scheme for routing messages

  Unstructured: a search algorithm is needed for locating data/processes

In self-managing distributed systems, the concept of feedback
