

Management of Multi-Channel Multi-Container Application Servers

    Kareim Sobhe

    The American University in Cairo

    P.O.Box 2511, Cairo, Egypt

    [email protected]

Abstract: We propose a Multi-Channel Multi-Container Web application server that allows the division of a single web request into independent portions to be executed in parallel over different communication channels. To achieve this, the underlying communication infrastructure of traditional web environments is changed from the stateful TCP to the stateless UDP communication protocol. Such an architectural change provides an environment suitable for parallelization and distribution and enhances other existing characteristics of current web environments such as fault tolerance and web caching. In a previous paper we presented the detailed architecture and results of a rapid prototype. In this paper, performance is further enhanced as part of the framework upon which web applications will run by introducing a number of changes. The proposed architectural changes are transparent to web clients, keeping their interfaces unchanged. The target applications that would benefit from the multi-channel environment are web applications that require intensive processing with respect to data transfer and that are decomposable by nature; such application profiles demonstrate a high gain in performance under the multi-channel architecture over the traditional one. The importance of the Cluster Manager, which we will refer to as the CLM throughout this paper, resides in the fact that it takes the new web environment/architecture from a clustered, cascaded environment of replicated web application server nodes that run independently to serve dispatched requests, to a collaborative cluster whose nodes work in a tightly coupled fashion to provide distributed resources for serving a single request by passing different kinds of messages between nodes through the CLM. The CLM provides many services, all designed for cluster-based functionality. Some of these services are informative, providing information about cluster nodes to other nodes within the same cluster as well as to HPAs (for example, the discovery service); others are transactional, carrying out actions taken by one or more nodes in the cluster, such as service state replication and migration. Thus, as we will see, the CLM in the multi-channel multi-container environment has two main sets of tasks: client tasks and server tasks. The CLM should be able to give the HPA cluster-based information that enables the HPA's embedded dispatcher to initiate requests to the least loaded nodes, provide the HPA with the information needed for channel reservation and initialization, and assist the HPA in takeover situations resulting from failures. The CLM also has server-side duties, such as keeping all cluster nodes in sync, providing mechanisms for running services to replicate their state between nodes over the distributed replicated shared memory cluster engine, and carrying out intermediate tasks between the Deployment Manager and the service factories of the different nodes to enable cross-cluster service deployment.

Keywords- Multi-Channel, Web Application Server, Clustering, High Availability, Service State Migration, High Performance Computing, High Performance Agent, Skeleton Caching

    Ahmed Sameh

    Prince Sultan University

    P.O.Box 66833, Riyadh, Saudi Arabia

    [email protected]

    I- INTRODUCTION

The architecture of the proposed web application server [1] is made up of two main components: the Container and the High Performance Agent (HPA). The Container is a normal application deployment server that can load application component instances in the form of services, as well as provide the resources required for them to execute and function. The Container supports a UDP-based communication layer through which all communication between any Container and its clients runs over a stateful communication protocol built on top of UDP. An important question pops up: why build a stateful communication protocol over UDP while the TCP protocol exists? The answer can be summarized in three points:

1. The TCP protocol is too general, carrying a lot of overhead to accommodate general features designed to serve any kind of communication sequence between any two entities. A minimized version of TCP can be implemented that removes all of this overhead and is specific to web transactions on a request/reply basis only (a hypothetical packet layout is sketched after this list).

2. As the tests presented below show, more web transactions can be handled over UDP than over the normal TCP used in current web servers, which makes it worthwhile to utilize concurrent channels to serve one web transaction from different container nodes.

3. A deviation from the TCP protocol is needed to be able to change the source of the data stream at any point in time. A container that is sending a web transaction reply to a specific client must be able, at any point in time, to delegate the execution of that web transaction to another container located physically on another container node, which will resume the sending of the data stream and hence the whole web transaction. This capability provides an infrastructure for fault tolerance through service takeover.
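As an illustration of this request/reply-only design, the sketch below shows one possible wire format for such a minimized protocol. It is a hypothetical example rather than the paper's actual packet layout: the field names, sizes, and flag set are our assumptions.

    // Hypothetical packet header for a minimal stateful request/reply
    // protocol over UDP (a sketch; the actual wire format used by the
    // Container is not specified at this level of detail).
    #include <cstdint>

    enum PacketFlags : uint8_t {
        PKT_REQUEST  = 0x01,  // HTTP-like request from the HPA
        PKT_REPLY    = 0x02,  // reply data from a container
        PKT_ACK      = 0x04,  // acknowledgment, enabling timeout/resend
        PKT_FIN      = 0x08,  // last packet of the web transaction
        PKT_TAKEOVER = 0x10   // stream source moved to another node
    };

    struct ChannelPacket {
        uint32_t transaction_id;  // one web transaction = one request/reply
        uint32_t sequence;        // serial number; lets a takeover node resume
        uint8_t  flags;           // combination of PacketFlags
        uint16_t payload_len;     // bytes used in payload
        char     payload[1400];   // kept under a typical MTU so control and
                                  // state messages fit in a single packet
    };

The sequence field anticipates the takeover behavior described later (Section V.3), where the serial number of the last packet sent allows a takeover node to resume the stream without disturbing flow control.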

Since a Container will not be able to communicate except through a proprietary protocol based on UDP, and since normal web clients communicate with web servers using HTTP over TCP, an intermediate translator is necessary to narrow the gap and enable the web client to transparently send its requests to the container. Thus the High Performance Agent component is introduced, referred to throughout this paper as the HPA. Acting as a reverse proxy, the HPA is located physically on the machine from which the web client initiates its web requests. Unlike a normal proxy, the HPA provides proxy operations between a web client and a Container over different communication protocols: the HPA communicates with the web client through normal HTTP over TCP and translates those client requests to the container through an extended HTTP protocol over UDP. The HPA is designed to be a reverse proxy because, unlike normal proxies, a reverse proxy serves a specific destination or a number of destinations.

    Figure 1: The Proposed Multi-Channel Web Environment based on UDP

In a realistic situation, the HPA is not considered an overhead, as it is located on the client machine, very tightly coupled with the web client, and serves only the normal load of a single user's web transactions. Figure 1 shows the proposed new architecture.

I.1 Container

The proposed Container is a web application server deployment container with all the subsystems needed to carry out the basic functionalities of a normal web application server deployment container: loading application business logic components in the form of loadable service components, and providing them with the necessary resources to operate and function. The Container has a class hierarchy that any service needs to extend in order to be deployed in the Container. Services should be developed and implemented in the development technology that a container supports; in this case, the proposed environment will support hybrid development and runtime technology types of containers, which will all be replicas in architecture and provide the same communication interface, so there will be C++ Containers and Java Containers. Maybe in the future there will be PERL containers, PHP Containers, Python Containers, etc., where the responsibility of each container type is to host services that are developed with its supported technology in mind; for example, the C++ container will host services developed in C++. As will be seen in the next sections, a web request can be broken down into portions that may run on container nodes of different development technologies, and those hybrid services can exchange messages.

A container node has a multi-threaded communication layer with pre-allocated communication sockets to communicate concurrently with different clients. A service factory is required to load service instances in ready-to-execute threads to assign to service requests coming from the clients. The service factory loads services that are defined in the container configuration files; thus a configuration manager subsystem is needed to parse and load the configuration files that define the settings the container should have, such as the communication port range the container should acquire, the maximum number of communication threads, the names of the services the container should load, the number of instances to be instantiated from each service type, the location of the multi-channel server side scripts called skeletons, etc. The container node has a dispatcher which dispatches incoming web transactions to the correct services to handle the request, and also a communication buffer manager to assign and manage communication buffers allocated for dispatched services. A sample of what such a configuration might look like is sketched below.
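The following is a hypothetical container configuration. The element names are illustrative assumptions; the paper states only that configuration sources are XML-based and lists the kinds of settings they hold.

    <!-- Hypothetical container configuration (illustrative only; the
         actual element names used by the Configuration Manager are not
         given in the paper). -->
    <container type="cpp">
      <communication>
        <port-range start="20000" end="20255"/>
        <max-threads>64</max-threads>
      </communication>
      <services>
        <service name="FileSpooler" instances="8"/>
        <service name="ReportBuilder" instances="4"/>
      </services>
      <skeletons path="/opt/container/skeletons"/>
    </container>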

    Figure 2: Container Node Architecture

As can be seen, many resources are allocated by a container node, such as communication threads, communication buffer memory, and thread resources for instantiated service instances; therefore a garbage collector is needed for environment housekeeping, so that expired resources can be reinitialized and reused for subsequent requests. Each component will have its own garbage collection module. For example, the factory will be able to clean and reacquire terminated service instances after they finish execution. The communication layer will be able to clean up finished communication channels and reinitialize them for further reuse. The communication buffer manager will be able to deallocate expired, unused communication buffers. Figure 2 shows the internal architecture of a container node irrespective of its supported development and runtime technology.

So far, the architecture presented serves single-container functionality, so a cluster management subsystem will be added to enable message exchange between different container nodes, which will support the proposed multi-channel mechanisms; through it, service state migration, discussed later, will provide a better infrastructure for fault tolerance. To deploy services easily, a deployment manager subsystem will work closely with the cluster management subsystem to enable the clustered deployment of services, including the replication of service images on clustered container nodes. In fact, the deployment manager will use the cluster management subsystem's APIs and interfaces to carry out cross-cluster deployment operations.

Each Container type has an embedded class hierarchy, all of which follow the same design and functionality as much as possible. For a service to be deployed in a specific container, it should extend an Ancestor class which is provided by all container types. The Ancestor class basically has two sets of methods. The first set consists of basic virtual methods for services to extend and overload, such as the main method, which is called by the service factory when a service is dispatched to serve a web request. The other set of methods encapsulates functionality that the container carries out on behalf of services, such as reading and parsing the HTTP request header and posted data, as well as composing the HTTP reply header. It is very important that the service developer be aware of the container class hierarchy and its interfaces in order to utilize the container's functionality and internal infrastructure. A minimal sketch of such an Ancestor class is shown below.
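The following C++ sketch illustrates the shape such an Ancestor class could take. The class and method names are assumptions for illustration; the paper specifies the two sets of methods but not their signatures. The serialize/de-serialize pair anticipates the state migration overloads described in Section V.

    // Hypothetical Ancestor class for C++ services (a sketch; only the
    // two method sets are described in the paper, not exact signatures).
    #include <map>
    #include <string>

    class Service {  // the Ancestor class every deployable service extends
    public:
        virtual ~Service() = default;

        // Set 1: virtual methods the service developer extends/overloads.
        // main() is called by the service factory on dispatch.
        virtual void main() = 0;

        // Overloaded for service state migration (see Sections V.3/V.4).
        virtual std::string serialize() { return ""; }
        virtual void deserialize(const std::string& /*state*/) {}

    protected:
        // Set 2: functionality carried out by the container on behalf of
        // services: parsing the HTTP request and composing the reply.
        std::string requestHeader(const std::string& field) const {
            auto it = headers_.find(field);
            return it == headers_.end() ? "" : it->second;
        }
        const std::string& postedData() const { return body_; }
        void composeReplyHeader(int status, const std::string& contentType);

    private:
        std::map<std::string, std::string> headers_;  // parsed by container
        std::string body_;                            // posted data
    };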


The container can serve two types of services, which are designed to enable the developer of application components to develop applications in a decomposable way that allows the concurrent execution of services and the delivery of their results over multiple communication channels.

1- Single Channel Services: The first type of service is the Single Channel Service, which we define as the smallest executable entity that can run independently. A Single Channel Service is considered the indivisible building block of an application component which can be used to build up more complex services, providing reusability and extensibility. As the name indicates, the most important architectural feature of a Single Channel Service is that it communicates over a single communication channel, which is based on UDP communication. The direct client of a Single Channel Service is the HPA, which acts as an interface agent between the service and the web client. A Single Channel Service can be visualized as a Java Servlet which runs in the application server environment and delivers web results to the client.

2- Skeleton Services: Since the Single Channel Service does not differ in concept from a normal web application component, a way is needed to group those independent basic components, the Single Channel Services, to build services of more complex functionality that can run those components in parallel to improve performance. A Skeleton Service is basically a server side in-line script which follows the normal structure of regular web server side in-line scripts such as PHP or ASP. Some features are added to the Skeleton to achieve multi-channel and parallel execution, such as a parallelization construct on each in-line code section in the skeleton, as well as a type construct defining the development environment of each in-line code section. The developer writes the skeleton source file, which is a hybrid of static content and in-line code sections defining the dynamic parts. The deployment manager then takes the source of the Skeleton as input to generate the skeleton map and to add an independent single channel service for each concurrent in-line script section. The skeleton map is a map that will be used by the HPA to identify each concurrent service that needs to be requested from the container in parallel. A hypothetical skeleton source is sketched below.
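The following is an invented skeleton source, shown only to make the structure concrete. The delimiters and the construct names (the parallelization construct and the type construct) are assumptions; the paper describes the constructs but not their syntax.

    <!-- Hypothetical skeleton source (.skel). Static content surrounds two
         in-line code sections; each carries a parallelization construct and
         a type construct, so the deployment manager can extract each one
         into an independent single channel service. Syntax is assumed. -->
    <html><body>
      <h1>Quarterly Report</h1>
      <% parallel type="cpp" %>
        renderSalesChart();      // becomes a C++ single channel service
      <% end %>
      <% parallel type="java" %>
        renderInventoryTable();  // becomes a Java single channel service
      <% end %>
    </body></html>

From this source the deployment manager would generate one single channel service per section, compile each with the compiler matching its type construct, and emit the skeleton map that points the HPA at the resulting services.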

The communication layer of the Container is based on a special stateful protocol built on top of UDP, sufficient to serve the web application communication needs of a single request/reply communication sequence. The communication layer consists of multi-threaded components that allow the container to handle multiple communication channels simultaneously and service multiple requests concurrently. The container does not perceive the relation between different channels; rather, from the container perspective, each communication channel is assigned to a service which either serves a normal service or transfers a skeleton map to the HPA, both of which require a single channel. It is the HPA which initiates multiple communication channels to different containers to serve a complex service defined by a skeleton map. When a request arrives from the HPA, the container starts by validating the client.

    Figure 3: Service Factory

On successful validation, the communication layer passes the HTTP request to the service dispatcher, which evaluates the HTTP request; with the help of the service factory, a communication channel is assigned to a service to serve the requested web transaction. After the transaction finishes, the communication layer subsystem is responsible for cleaning up the communication channel and re-initializing it to serve future requests. When the HPA initially tries to communicate with a container node, it does so on a default administrative port, through which it is assigned a range of service ports over which it can request services from the container. The HPA will be able to communicate with any container node in the cluster over the same range of communication ports. The communication layer, with the help of the cluster management subsystem, will assign the HPA a free range of ports and replicate this assignment to all container nodes in the cluster. After a specific idle time from a specific client, the port range assignment is cleared and the HPA client will need to reclaim a port range again.

The Service Manager subsystem is composed mainly of the Service Manager and the Service Dispatcher, which are concerned with the service status in all stages of operation. First, a service is loaded by the service factory, at which point it is in the stage of being ready to serve requests. When a request arrives and the service dispatcher decides on the type of service that should serve the specific request, it asks the service factory to make a service instance available for the request; at this point the service is assigned by the dispatcher to the communication channel as well as to a communication buffer, its status is changed to operational and dispatched, and it resides in the active service pool. When the service finishes serving the request, it is garbage collected by the service factory, returned to the ready-to-use status, and transferred to the ready-to-execute service pool. Figure 3 gives an abstract view of the Service Manager and how the Service Dispatcher interacts with the Service Factory. The Configuration Manager subsystem is responsible for reading configuration information from configuration sources, which are all based on XML format and require XML parsing, and for storing it in internal structures ready for use by different subsystems; examples are the range of ports to be used by the communication layer and the number of instances for a specific service type. With the


help of the Cluster Management System, the Configuration Manager is capable of distributing configuration structures over different container nodes in the web cluster. The Administration Manager is an interface layer between the human administrator of the container web cluster and the container nodes. It enables the administrator to pass administration commands to the container nodes; with the help of the Cluster Management System, the commands issued by the administrator can run transparently on multiple container nodes, providing a single system image (SSI) for the whole container web cluster. The Deployment Manager is responsible for deploying the services provided by the application developers and replicating the deployment over different cluster container nodes with the help of the Cluster Management System. The deployment manager can deploy single channel services as back-end components as well as multi-channel services represented in server side in-line scripts.

    The developer will provide the multichannel in-line scripts.

    The deployment manager will then parse the script and extract

    each piece of code defined as a separate thread and generate

    the single channel service source code for it. The deploymentmanager will then compile the services, generate whatever

    error or warning messages apply and send them to the

    deployment administrator. The deployment manager will

    choose the correct compiler for each code section according to

    its type, meaning that sections written in C++ will be

    compiled with GCC for example, and sections written in

    JAVA will be compiled with an appropriate JAVA compiler.

On successful compilation of the services constructed from the in-line script definitions, the deployment agent will deploy those services across the container cluster nodes according to their types: C++ single channel services will be replicated over C++ containers, and JAVA services will be replicated over JAVA containers. It is important to state that replication constructs and rules can be applied to the service replication. The default replication may be an equal distribution of the services, but there might be another deployment scheme which takes into consideration the amount of memory and the speed of the CPU of each container node. After the single channel services are compiled and deployed successfully, the deployment manager will generate a skeleton map for the in-line script and replicate it over the cluster nodes. The skeleton map will contain pointers to the target single channel services, indicating their primary and secondary locations in case of failures. A service pointer is composed of an HTTP-like header of the request for the single channel service, with a little room for adding extra information about the service, such as alternative service locations. The Cluster Management System is the subsystem responsible for the exchange of information between different containers. The cluster management system enables the deployment manager to distribute newly deployed services as well as modified ones. The Cluster Management System is also responsible for transparently executing administration commands issued by the environment administrator over all the nodes of the cluster, which eases the administration of the web cluster and makes it appear as a single system to the administrator. Moreover, the Cluster Management Subsystem is responsible for all the communication necessary to carry out service state migration.

Figure 4: Communication flow between web client and container with HPA in the middle

    I.2 HPA

The proposed High Performance Agent is the agent that the whole system depends on. The HPA acts as a multi-protocol reverse proxy between the Container and the web client. The HPA acts as a web server for the web client and as the agent which understands the constructs sent by the container to split the communication stream into multiple channels, which enables the parallelization of delays from which the enhanced performance should come. How the gears work together can be seen in the work flow section.

The communication layer of the HPA is a multi-protocol, double-edged communication layer. It can be viewed as two separate communication layers that communicate with each other. The first is a standard multi-threaded TCP communication layer that can handle multiple web transactions concurrently. The second, UDP-based communication layer is responsible for communicating with the back end containers. A request is initiated by a web client through an HTTP over TCP connection. When the request arrives at the HPA, the HPA will use one of the already established UDP connections with the container environment, and a discovery request will be initiated to identify the node that this request will be served from. A cache of discovery results in the HPA will be updated to eliminate unnecessary communication. Finally, the request will be served from the container to the HPA over UDP, and the data stream will be transferred onward to the web client over TCP, all of which takes place transparently to the web client. The TCP and UDP communication will run in two different threads to avoid dependent communication blocking; hence a buffering mechanism is needed between the two threads to enable data storage, which helps postpone mutual communication blocking between the two communication threads. Of course, when the communication buffer is fully occupied, the UDP receiver will wait until the TCP sender starts spooling data from the buffer, and vice versa. A minimal sketch of such a buffer is given below.
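To make the buffering mechanism concrete, here is a minimal bounded-buffer sketch in C++ between a UDP-receiver thread and a TCP-sender thread. It is an illustrative assumption about the implementation, not the HPA's actual code.

    // Minimal bounded buffer between the HPA's UDP receiver thread
    // (producer) and TCP sender thread (consumer). A sketch only; the
    // real HPA buffer manager is not specified at this level.
    #include <condition_variable>
    #include <deque>
    #include <mutex>
    #include <string>

    class ChannelBuffer {
    public:
        explicit ChannelBuffer(size_t capacity) : capacity_(capacity) {}

        // UDP receiver blocks here when the buffer is fully occupied.
        void put(std::string chunk) {
            std::unique_lock<std::mutex> lock(m_);
            not_full_.wait(lock, [&] { return q_.size() < capacity_; });
            q_.push_back(std::move(chunk));
            not_empty_.notify_one();
        }

        // TCP sender spools data out, unblocking the receiver.
        std::string get() {
            std::unique_lock<std::mutex> lock(m_);
            not_empty_.wait(lock, [&] { return !q_.empty(); });
            std::string chunk = std::move(q_.front());
            q_.pop_front();
            not_full_.notify_one();
            return chunk;
        }

    private:
        size_t capacity_;
        std::deque<std::string> q_;
        std::mutex m_;
        std::condition_variable not_empty_, not_full_;
    };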


    Figure 4 gives an overall view of the communication

    mechanism between the web client and the container with the

    intermediate agent HPA in the middle.

    Figure 5: Multi-channel Scenario Work Flow

Obviously a server side in-line script will contain some static content, and every time a server side script is requested by the client, the skeleton map for that script has to be fetched from the container for the HPA to continue and establish the single channel requests required to serve the server side script request. The connection required to fetch the skeleton map is an overhead, hence adding a cache module to the HPA to keep unmodified versions of skeleton maps achieves two things: 1) it eliminates the extra connection needed for skeleton fetching, and 2) it caches some of the static content that is embedded in the dynamic content generated by the back end services. All of the script's static sections will be cached by the HPA, and the impact of that will depend on the size of the cacheable areas. Of course, in current modern scripting environments such caching is not possible, as the client has no clue which parts of the UI (e.g. HTML) are static and which parts are generated by a backend business logic engine; moreover the client (the HPA, in our case) has no access to the business logic source code. The Discovery client is the module that is responsible for advising the HPA of the locations of services through communication with the Container discovery service. Caching will be applied to eliminate unneeded communication as much as possible.

    II- WORK FLOW

The work flow of requests and mechanisms is discussed to highlight how the gears really move, and how the bits and pieces of the whole environment cooperate and collaborate to serve web requests using the multi-channel mechanism. A file spooler is used as an example to clarify the three scenarios presented: the Single Channel scenario, the Multi-Channel scenario, and the Service State Migration scenario. Work flow figures provide a visualization of each scenario.

II.1 Single Channel Scenario

The proposed Single Channel Scenario is the basic building block upon which the multi-channel scenario is built. A special case of Figure 5 with one container illustrates the work flow of the single channel scenario. The scenario starts with a web client using the HPA, installed on the same machine and operating on the loopback address, to initiate a single channel request to a container node. The request is in normal URI structure, containing the name of the container node that the requested service resides on and the name of the service to be executed. The request is sent to the HPA over TCP. The HPA evaluates the request and identifies it as a single channel request. The HPA then opens a UDP connection to the container node specified in the URI and passes the request to it. The container then dispatches the request to the correct service instance to serve the request. The stream returned by the service to the HPA over UDP is sent onward to the client over TCP. As can be seen from the figure, the UDP communication is carried out in parallel with the TCP communication, which allows a pipelining communication mechanism that eliminates overhead and increases speed.

    II.2 Multi-channel Scenario

The proposed multi-channel scenario is based on the single channel scenario: a web transaction is broken down into a number of single channel services that are distributed, executed concurrently, and serve their content over parallel communication channels. Figure 5 illustrates the work flow of the multi-channel scenario.

The request reaches the HPA over TCP as usual, exactly as in the previous scenario. The HPA evaluates the request and identifies it as a multi-channel request by the service name extension .skel. The HPA then makes any necessary updates to its skeleton cache, fetches the skeleton data structure from its cache, and identifies the different single channel requests needed. The HPA then spawns a thread for each single channel request to different container nodes according to the information in the skeleton map of the multi-channel service. The HPA returns the replies of the channels to the web client over TCP as they arrive, according to their chronological order, which entails some buffering and blocking techniques. For example, if the second channel finishes before the first channel, the second channel content must be buffered on the HPA side until the first channel finishes, during which time the second communication channel is blocked. The sketch below illustrates this ordered flush.
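The following C++ fragment sketches the chronological-order flush described above, under our assumption (for illustration) that each channel's reply is buffered whole until all earlier channels have completed.

    // Ordered flush of multi-channel replies (a sketch). Channels must be
    // delivered to the web client in skeleton order, so a completed later
    // channel stays buffered until every earlier channel has finished.
    #include <string>
    #include <vector>

    struct Channel {
        bool finished = false;   // set by the channel's receiver thread
        bool flushed  = false;   // already sent on to the web client
        std::string buffer;      // reply bytes accumulated so far
    };

    // Returns the bytes that may be flushed to the client now: the
    // longest finished prefix in chronological (skeleton) order.
    std::string flushReady(std::vector<Channel>& channels) {
        std::string out;
        for (auto& ch : channels) {
            if (!ch.finished)
                break;           // later finished channels stay buffered
            if (!ch.flushed) {
                out += ch.buffer;
                ch.buffer.clear();
                ch.flushed = true;
            }
        }
        return out;
    }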

    III- CLUSTER MANAGER (CLM)

The proposed CLM provides many services, all designed for cluster-based functionality. Some of these services are informative, providing information about cluster nodes to other nodes within the same cluster as well as to HPAs (such as the discovery service); others are transactional, carrying out actions taken by one or more nodes in the cluster, such as service state replication and migration. Thus the CLM is considered the backbone infrastructure for all the communication, in the form of message passing, between different cluster nodes and, in some cases, between HPAs and cluster nodes; consequently, the set of services that the CLM needs to carry out imposes its architecture, which in turn reflects on the whole web environment architecture. Throughout this paper we will demonstrate the architecture of the newly proposed web environment with respect to the CLM, the services the CLM provides, and the detailed mechanisms by which those services operate. All current midrange web environments that support clusters, such as WebSphere, WebLogic, Apache/Tomcat, etc., are based on the dispatcher model. The dispatcher in this case carries out two main tasks: the scheduling task, which decides which node in the cluster should serve the next incoming request, and the physical dispatching mechanism, which is based in most cases on either layer-4 or layer-7 switching [2]. Many scheduling techniques are available, but the problem is that for such scheduling techniques to operate efficiently, they need to


have some sort of feedback about the status of the nodes, and more precisely the status of the services that serve web requests on each node of the cluster, as the state of the physical node machine might not be a good indication. Many techniques have been invented for this; one of the most famous is the IBM WebSphere Edge Server Advisor component, which feeds back to the layer-4 dispatcher advisory information about the load of the cluster node it is running on; a layer-7 caching reverse proxy is also provided within the WebSphere suite that enables layer-7 dispatching based on the advisory component. But as can be seen, these dispatching mechanisms work on the transaction as the smallest unit of distribution, so a transaction will need at least two physical TCP connections and will, in the end, reside on one node that serves it.

The same goes for the Grid, though plug-ins and wrappers have been invented for HPC tools, such as MPI, OpenMP, and PVM, to be integrated into grid environments to solve this problem. A grid will have a dispatcher and a scheduler to assign tasks to the best fitting candidate grid node as a first level of dispatching. The second level of dispatching is handled by the integrated HPC tool, which breaks the task down into smaller tasks and executes them in a distributed fashion; many nice features are inherited in this model, enabling the collaborative execution of a single task and the exchange of data over the HPC infrastructure. Some attempts have been made to integrate web services technology with such environments to serve on the web [5]. In the proposed Multi-Channel Multi-Container environment, we need to utilize the HPC methods and features, but in a way that integrates tightly with the environment instead of building interfaces to already existing libraries. So the model provides a breakdown of a web server page (web service) that the programmer of the service imposes and defines, in the same way that in MPI and PVM the programmer provides the segmentation of the program and the conditions according to which each portion should run. Also, the CLM in the proposed multi-channel environment is designed for the web, which eliminates a lot of layered integration overhead; unlike HPC applications, web applications are not long-running, rather they are database intensive applications that might need some processing power, but not as much as an N-Queens MPI solver would need. Thus, as we will see shortly, the CLM in the multi-channel environment has two main sets of tasks, which can be divided into client tasks and server tasks. On the client side, the CLM should be able to give the HPA cluster-based information that enables the HPA's embedded dispatcher to initiate requests to the least loaded nodes, provide the HPA with the information needed for channel reservation and initialization, and assist the HPA in takeover situations resulting from failures. On the server side, the CLM has duties such as keeping all cluster nodes in sync, providing mechanisms for running services to replicate their state between nodes over the distributed replicated shared memory cluster engine, and finally carrying out intermediate tasks between the Deployment Manager and the service factories of the different nodes to enable cross-cluster service deployment.

The CLM is basically an engine that resides right above the communication layer of the container node, and it is designed to control the service factory, the reserved channels queue, and the shared memory segment. Since the newly proposed web environment is designed to be a clustered one, it should appear to an outsider as a single entity, and all the internal communication of the cluster should be done transparently, so that all nodes are synchronized together without publishing the details of the internal structure of the environment to its clients. Hence some of the nodes of the cluster will have extra, cluster-related functionality to manage this kind of node synchronization process and to act as leaders for the environment, deciding which node does what, and when. This cluster-based functionality could be handled by one node of the cluster, but since we intend the word cluster to carry its full meaning, the high availability feature is a must and needs to be treated as a priority; so a number of nodes are nominated to carry out such cluster-based functionality and to be replicas of each other, so that when one fails, the others will still be able to function and fulfill the desired role. By this definition, nodes with this type of cluster-based functionality will not only have the role of synchronization between different cluster component nodes, but more importantly will act as a database that represents all the cluster attributes and all the node states and capabilities at any point in time. This database needs to be updated frequently and efficiently to avoid overhead and discrepancies between the databases of different manager nodes, and it gives such a group of nodes the capability of acting on behalf of the whole cluster in some cases, which leads to the desired SSI (Single System Image) functionality, making the cluster appear as a single entity capable of encapsulating, controlling, and hiding all its internal conflicts and operations from its clients. Obviously, the member nodes of our new environment will be of two different kinds, namely Compute Nodes and Management Nodes. Before we go into the details of the CLM architecture and how it works, the following is a description of the role and functionality of each type.

    III.1 Compute Node

A compute node is a normal container node that, besides being able to serve channel requests, has a minimal-function CLM version that allows it to work as a managed node controlled by the Management nodes of the cluster and to synchronize with the other nodes through communication with the Management nodes of the cluster. The CLM of a compute node acts as a client agent to the CLM of the Management nodes of the cluster. The CLM of a compute node has the following subsystems:

III.1.1 Reservation Service

The reservation service is a network communication service based on UDP which listens on a specific port provided by the Configuration Manager [3]. The service waits for requests from the HPA asking to reserve a channel so that the HPA can initiate requests to this node. The CLM, on receiving a reservation request, will attempt to locate available channels that are either not already reserved by HPAs, or that have been idle for a predefined duration and are marked as expired and ready for reuse; on success in locating candidate channels, they are reserved for this HPA and will reject any service requests coming from a source other than that HPA until they expire after being idle. The port numbers of the assigned channels are sent back to the HPA as a reply to its request in the case of a successful reservation, or a failure notification otherwise. Because we are using the stateless UDP communication protocol, if an HPA initiates a reservation request to a compute node that already has channels assigned to this HPA, the CLM will reply to the HPA with the ports of the already reserved channels and will timestamp the channels with the current time of registration to avoid quick expiration.


Also, because the reservation request from the HPA and the reservation reply from the container node are small enough, they can each be encapsulated in a single UDP packet, which makes the reservation communication very simple: it can be based on timeout/resend mechanisms, avoiding the need to develop any flow control mechanism for this communication. More precisely, the Reservation service can be considered the channel manager, as it is the service in control of the communication channels in terms of assigning them to HPAs and resetting their status to available when idle.

    III.1.2 Shared Memory Segment

The Shared Memory Segment is basically a memory segment that keeps shared data of services running on different nodes. For service state migration purposes, each service has the capability of storing its state into an XML stream and forwarding it to the CLM of the node it is running on, asking for it to be replicated to the cluster nodes; consequently, and transparently to the service, the CLM is responsible for replicating this service state to all nodes of the cluster, where it is stored in their shared memory segments. Each service on the cluster has a unique ID that identifies it, composed of the cluster node ID and the location of the service in the service queue; the state of the service is stored under its service ID on all nodes of the cluster. Hence if a node fails, service state migration takes place and a node is chosen to continue the execution of the failing service; that node uses the state of the service stored in its Shared Memory Segment to resume the execution and serve the client HPA from the nearest point to the failure, without the need to re-execute the whole service from the beginning. We will describe the Service State Migration mechanism shortly and will illustrate how the Shared Memory Segment is crucial for Service State Migration to function. Each time the state of a service is generated for replication, the sender CLM timestamps the state, so that when it is stored on remote nodes the timestamp defines the freshness of the stored state. This is very important: when UDP packets sent to a remote CLM fail to arrive at their destination, this timestamp lets the management nodes differentiate between up-to-date and lagging nodes when searching for a takeover node once a service state migration is needed and initiated. It is very important to highlight that the Shared Memory Segment replication works on top of a UDP communication scheme, and for this replication to operate as quickly and reliably as possible, we impose one important constraint: the state of any service must fit into one UDP packet. This enables the communication to be built on top of timeout/resend mechanisms, avoiding the need for a flow control mechanism, which would slow down the transfer and might cause a situation called a Traffic Storm, leading to congestion in the internal network of the cluster. A sketch of such a state record is given below.
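As an illustration of the replicated state record and the one-packet constraint, the following C++ sketch shows what such a record could look like. The field layout and size bound are our assumptions; the unique service ID, the freshness timestamp, the last-packet serial number, and the one-UDP-packet constraint come from the text.

    // Hypothetical replicated service-state record (a sketch; the exact
    // layout is an assumption, the fields come from the paper).
    #include <cstdint>
    #include <ctime>
    #include <string>

    constexpr size_t MAX_UDP_STATE = 1400;  // fit one typical UDP packet

    struct ServiceState {
        uint32_t    node_id;      // cluster node ID ...
        uint32_t    queue_slot;   // ... plus service queue location;
                                  // together they form the unique service ID
        time_t      timestamp;    // freshness of this state version
        uint32_t    last_packet;  // serial of the last reply packet sent, so
                                  // a takeover node can resume the stream
                                  // without disturbing flow control
        std::string xml_state;    // service state serialized as XML
    };

    // The replication path would refuse states breaking the constraint.
    bool fitsOnePacket(const ServiceState& s) {
        return s.xml_state.size() + 16 <= MAX_UDP_STATE;  // 16: header est.
    }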

III.1.3 Load Replicator

As all the nodes of the cluster need to be synchronized, aiming for the cluster to appear as a Single System Image, which most clustered environments in different domains aim at, each node of the cluster needs to send its state, represented mainly by processor idle time, free memory, network bandwidth used, and the number of channels being served, to the other nodes of the cluster. This kind of data is very small and can be encapsulated in one UDP packet and sent at a fixed interval which, given the nature of the data, is very short: on average a couple of seconds. The node load data is stored on all nodes of the cluster, but practically, in the current version of our architecture, this data is used only by the management nodes, as we will describe shortly in the next section. We nevertheless replicate it on the other compute nodes, as it might be used in the future as the basis of decisions such as excluding nodes that are very busy or marked as failed from receiving shared memory data, so that when a service is trying to replicate its state, such nodes would not be included in the list of nodes the state is sent to.
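A minimal sketch of such a one-packet load report, assuming (our assumption) this particular set of fields and a binary encoding:

    // Hypothetical one-UDP-packet load report broadcast by each node
    // (a sketch; the paper names the metrics but not the encoding).
    #include <cstdint>

    struct NodeLoad {
        uint32_t node_id;
        uint8_t  cpu_idle_percent;     // processor idle time
        uint32_t free_memory_kb;       // free memory
        uint32_t bandwidth_used_kbps;  // network bandwidth in use
        uint16_t channels_serving;     // number of channels being served
        uint32_t timestamp;            // report time; sent every ~2 seconds
    };
    static_assert(sizeof(NodeLoad) <= 1400, "must fit in one UDP packet");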

    Figure 6: Cluster Management Architecture

    III.2 Management Node

The Management Node is a normal compute node with some extra features and capabilities that enable it to carry out cluster-wide management tasks. There are two main management tasks that the CLM of a Management Node carries out: Container Management and HPA Management. Two services are started on any Management Container Node, listening on two UDP ports defined in the container configuration and provided to the CLM by the Configuration Manager [3].

III.2.1 Management Service

The Management Service is the service which carries out container-related cluster management tasks, mainly service state migration initiation: a fairly complex operation based on communicating with different nodes and resulting in the choice of a candidate node to take over the execution of an already running service that has failed to complete its execution.

III.2.2 HPA Management Service

The HPA management service is responsible for responding to HPA requests, providing cluster-related information which can be summarized in the following points:

1. Discovery requests at HPA startup for the cluster nodes, their types, and their service languages.

2. Frequent discovery requests initiated by HPAs requesting information about cluster node load, so as to choose the least loaded nodes to forward service requests to.

3. Takeover requests initiated by HPAs timing out on a failing service.

    IV. CLUSTER MANAGER ARCHITECTURE

    IV.1 Overview

Figure 6 gives an overview of the CLM of the different node types, compute and management nodes, and how they communicate through the CLM communication layer. As is obvious from the diagram, and from the description of


the cluster services in the previous section, the cluster services on different nodes communicate on a predefined port number, and the communication unit is a UDP packet which in most cases needs to land on all nodes in the cluster; hence a kind of multicast communication scheme is needed. For example, in the case of the shared memory segment, the state of one service on one node needs to be sent to all or some nodes of the cluster to be replicated in their shared memory segments for future use in case of service state migration. The cluster services also run in parallel threads within the CLM, which makes them independent, yet they communicate with each other through internal APIs, in the form of object method invocations coupled with some locking mechanisms for handling critical sections.

    IV.2 Communication Layer

The communication layer used in the CLM is an extended version of the normal UDP-based communication layer used for servicing HTTP channel requests between a normal container and an HPA requesting a service execution. The communication layer was extended with an emulated multicast capability that allows one node, at any point in time, to send a packet to all or some nodes of the cluster. The emulated multicast functionality acts, transparently from the caller's point of view, as a normal multicast function, but it physically connects to each node in the recipients queue and sends it a packet containing the data to be transmitted. As described above, one important constraint holds for all cluster services: all the data of any cluster transaction must be encapsulated in one packet. A sketch of the emulated multicast loop is given below.
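A minimal sketch of the emulated multicast loop, assuming POSIX UDP sockets; the recipient-queue representation is our assumption.

    // Emulated multicast (a sketch): the packet is physically sent to each
    // node in the recipients queue in turn, rather than once to a native
    // multicast group address.
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <cstddef>
    #include <vector>

    // One CLM transaction is at most the size of a single UDP datagram.
    bool emulatedMulticast(int sock, const char* data, size_t len,
                           const std::vector<sockaddr_in>& recipients) {
        bool ok = true;
        for (const sockaddr_in& node : recipients) {
            if (sendto(sock, data, len, 0,
                       reinterpret_cast<const sockaddr*>(&node),
                       sizeof(node)) < 0)
                ok = false;  // the caller retries via timeout/resend
        }
        return ok;
    }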

IV.3 Obstacles and Work Around

Two main problems arise from the above architecture, and each needs some sort of work-around, or at least justification. The first is a justification of why we used emulated multicasting instead of the native multicasting available with UDP; the other is a work-around for what is called a Traffic Storm, which arises from the use of emulated multicasting, since packets are physically sent individually to each node in the recipient queue instead of being sent once to a multicast address, which would seem to save a lot of traffic on the cluster's internal network.

IV.3.1 Native Multicast vs. Emulated Multicast

Although the UDP socket library provides packet multicasting, which saves a lot of traffic when a single packet needs to be sent to a group of nodes on the network (which is exactly what the CLM needs), two constraints pushed us to implement an emulated multicast functionality in the container communication layer instead:

1. Multicast is not standard in all environments and needs specific configuration on the network devices of the container cluster network; one of the most important features we are after is being able to have a hybrid environment in terms of Operating Systems and Container Software.

2. In most multicast libraries, de-fragmentation is not available for multicast packets, which leaves very limited space per packet to transfer data; since we have the constraint of encapsulating the data of any cluster transaction in one UDP packet, native multicast would represent a major obstacle to the CLM's capabilities. Some mechanisms, such as compression, could be used to send more data within the same limited packet size, yet the performance drawback would be high, as most of the cluster transactions are performed frequently and continuously.

    Figure 7: HPA Startup

    IV.3.2 Traffic Storm

Because the emulated multicasting functionality will be used by the container nodes' CLMs, more bandwidth will be needed by the CLM communication layer to operate. Moreover, due to the use of a stateless protocol and the timeout/resend mechanisms that are adopted, a symptom called a Traffic Storm may occur at peaks, especially when the number of services and/or the number of nodes in a cluster increases. Practically speaking this is unavoidable, yet we can make some decisions that will help avoid reaching such a situation frequently. The suggested work-around is that for some cluster services, such as service state replication, the replication occurs on a subset of nodes and not all the nodes in the cluster. Practically, shared memory replication is the service requiring the most data transfer, compared with the load replicator and discovery services, and in fact a failing service needs only one service to take over from it, so the state does not need to be replicated on all nodes of the cluster. Thus, from the shared memory replication perspective, we might have sub-clusters, defined by the administrator, that replicate with each other.

    V. CLUSTER MANAGER SCENARIOS

    V.1 Discovery Scenario

The discovery scenario is initiated at HPA startup. Figure 7 and the following steps describe this scenario.

1. The HPA reads its configuration file at startup to identify the primary management node that it will connect to.

2. The HPA connects to the HPA Management port of the cluster it needs service from.

3. The HPA sends the management node a request for information about the cluster.

4. The management node sends back to the HPA an XML stream whose records represent all the nodes of the cluster, their types (management or compute node), and the language of the services each runs (a hypothetical example follows this list). The reserver port on each node is also sent to the HPA for future reference. It is up to the primary management node to send all or only some of the node information to the HPA, meaning that this can act as a form of load balancing, very close to the DNS rotation mechanism, that distributes load at peak times.

5. The HPA parses and stores the cluster information in its internal buffer for channel reservation and for load information amendments.

6. If the HPA receives no response from the management node, it keeps resending the discovery request over a predefined timeout for a predefined number of retries before it reports failure and exits.
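To make step 4 concrete, a hypothetical discovery reply could look like the following; the element and attribute names are our assumptions, as the paper specifies only the content of the records.

    <!-- Hypothetical discovery reply (illustrative only). One record per
         cluster node: its type, service language, and reserver port. -->
    <cluster name="web-cluster-1">
      <node id="1" type="management" language="cpp"  reserver-port="21000"/>
      <node id="2" type="compute"    language="cpp"  reserver-port="21000"/>
      <node id="3" type="compute"    language="java" reserver-port="21000"/>
    </cluster>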

Figure 8: Shared Memory Segment Replication and Service State Migration

    V.2 Channel Reservation Scenario

Following the same figure, the HPA iterates over each node in the discovery information sent to it and initiates a reservation request to it. The reservation process follows these steps:

1. The HPA sends a reservation request to a specific container node.

2. The container node searches for channels already reserved by the requesting HPA; if it finds any, it stamps them with the current time and sends the port numbers of the channels back to the HPA.

3. If not, the reserver looks for available channels to assign to the HPA. Any that are found are reserved to the IP of the HPA, so no requests will be accepted on those channels except from the IP address of the requesting HPA. The channels are then stamped with the current time of the reservation to start calculating the idle time. Each time the HPA communicates over a specific channel with a container, the container updates the time stamp of the channel with the current time.

4. If not, the reserver starts looking for channels whose time stamps are too old, which means they are no longer reserved; these are cleared and reserved for the requesting HPA.

5. If not, the HPA receives a failure message, which indicates that all the channels within the current container are occupied.
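The reservation policy of steps 2-5 amounts to three lookups in order: channels already owned by the HPA, free channels, then stale ones. The following is a minimal sketch under assumed names (Channel, ChannelReserver) and an assumed one-minute idle limit; the real reserver interface is not shown in this paper.

```java
// Hedged sketch of the container-side reservation policy described above.
import java.util.ArrayList;
import java.util.List;

class Channel {
    int port;
    String ownerIp;       // IP of the HPA holding the reservation, or null if free
    long lastUsedMillis;  // reservation / idle-time stamp
}

class ChannelReserver {
    private final List<Channel> channels = new ArrayList<>();
    private final long idleLimitMillis = 60_000; // assumed threshold for "too old" stamps

    /** Returns ports reserved for the HPA; an empty list maps to the failure message of step 5. */
    synchronized List<Integer> reserve(String hpaIp, int wanted) {
        long now = System.currentTimeMillis();
        List<Integer> ports = new ArrayList<>();
        // Step 2: channels already reserved by this HPA get their stamps refreshed.
        for (Channel c : channels)
            if (hpaIp.equals(c.ownerIp)) { c.lastUsedMillis = now; ports.add(c.port); }
        // Step 3: free channels are bound to the HPA's IP and stamped.
        for (Channel c : channels) {
            if (ports.size() >= wanted) break;
            if (c.ownerIp == null) { c.ownerIp = hpaIp; c.lastUsedMillis = now; ports.add(c.port); }
        }
        // Step 4: channels whose stamp is too old are no longer reserved and are reclaimed.
        for (Channel c : channels) {
            if (ports.size() >= wanted) break;
            if (c.ownerIp != null && !hpaIp.equals(c.ownerIp)
                    && now - c.lastUsedMillis > idleLimitMillis) {
                c.ownerIp = hpaIp; c.lastUsedMillis = now; ports.add(c.port);
            }
        }
        return ports;
    }
}
```

Reclaiming by time stamp is what keeps a crashed or departed HPA from leaking channels: its reservations simply age out and become available to the next requester.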

    V.3 Shared Memory Segment Replication Scenario

The Shared Memory Segment Replication is illustrated in Figure 8 and the steps below. As the figure appears complex, follow the red bold arrows:

1. A running service should call its serialize routine periodically; this is a method that the service developer overloads.

2. The overloaded serialize routine should contain the code necessary to produce a string stream that represents the state of the service.

3. At the exit of the routine, the CLM is invoked with the state stream.

4. The CLM sends the state to all its neighbors defined in the cluster queue (which might not be all nodes in the cluster) to be stored in their Shared Memory Segments. The state stream is coupled with the unique ID of the service and a time stamp representing the freshness of the state.

5. One important attribute that is also coupled with the state is the serial number of the last packet sent, which helps a take-over service resume data transfer without disturbing the flow control of the already established communication.
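To make the contract concrete, here is a minimal sketch of the serialize hook and the record the CLM replicates. The Service and ClusterManager types, the method names, and the broadcast signature are assumptions for illustration; the paper does not expose the actual CLM interface.

```java
// Hedged sketch of the serialize hook and the replicated state record.
abstract class Service {
    final String serviceId; // unique ID coupled with every replicated state
    Service(String id) { this.serviceId = id; }

    /** Overloaded by the service developer: encodes the service state as a string stream. */
    protected abstract String serialize();

    /** Called periodically by the running service (e.g., every N packets sent). */
    void replicateState(ClusterManager clm, long lastPacketSerial) {
        String state = serialize();
        // Couple the state with the service ID, a freshness time stamp,
        // and the serial number of the last packet sent.
        clm.broadcastState(serviceId, System.currentTimeMillis(), lastPacketSerial, state);
    }
}

interface ClusterManager {
    // Sends the record to the neighbors in the cluster queue
    // (possibly a sub-cluster, not every node in the cluster).
    void broadcastState(String serviceId, long timestamp, long lastPacketSerial, String state);
}
```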

    V.4 Service State Migration Scenario

Following the same diagram, service state migration is initiated by the HPA reporting a delay in service delivery. Follow the green bold arrows in the diagram as well as the steps below:

1. When an HPA fails to receive the reply from a running service that it initiated, it sends a help request to one of the management node CLMs.

2. The management node CLM acknowledges receipt of the request; otherwise, the HPA keeps resending the request over a timeout period for a predefined number of retries.

3. The management node CLM asks all the nodes for the version of the state stored in their shared memory segments for the requested service.

4. On reply from the nodes, the management node CLM picks the node that has the freshest copy of the state as well as enough resources to handle the failing request (see next section), and notifies it, through its CLM, that it is the take-over candidate.

5. The candidate node CLM contacts its service factory and the Channel Reserver to make available a channel and a service of the same type (as defined by the configuration manager in service.xml [3]).

6. The candidate node CLM then calls the de-serialize routine of the chosen service with the state in its Shared Memory Segment. The de-serialize routine is a method overloaded by the developer of the service.

7. The de-serialize routine should contain the code necessary to read the state and set the internal attributes of the service object to the state represented in the state stream.

8. The CLM then calls the main routine of the service to fire its execution.

9. Since we use UDP, the HPA can still receive the recovered stream on the same UDP socket handler.

10. The HPA discards any packets that it already received from the original service, as failure may happen at any point between two consecutive state replications.
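Steps 3 and 4 reduce to a selection over the nodes' replies: filter out nodes without enough resources, then take the freshest state. A hedged sketch, with hypothetical types, follows.

```java
// Hedged sketch of the take-over decision on the management node CLM.
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class StateReport {
    String nodeName;
    long stateTimestamp;  // freshness of the replicated state copy
    int freeCapacity;     // load-replicator value; higher means more available
}

class TakeoverSelector {
    /** Picks the node with the freshest state among those with enough resources. */
    static Optional<StateReport> pickCandidate(List<StateReport> replies, int minCapacity) {
        return replies.stream()
                .filter(r -> r.freeCapacity >= minCapacity)          // enough resources (step 4)
                .max(Comparator.comparingLong(r -> r.stateTimestamp)); // freshest copy wins
    }
}
```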

    V.5 Cluster Nodes Load Replicator

The load replicator is a means for a container node to tell the others about its free load. This is used in two main cases: finding a candidate for a service state migration, and choosing the container nodes an HPA uses to establish the channels required to serve a server page. Thus we have two scenarios: Container-to-Container load replication and Container-to-HPA load replication.

    V.6 Container to Container Scenario

The Container-to-Container load replication scenario works in push mode, where each node periodically sends an XML stream with its state to all other nodes. The management node primarily stores the load data of all nodes so that it can make service state migration decisions and respond to load inquiries by the HPA. If the management node does not hear from a specific node for a predefined duration of time, that node is marked disabled by the management node and is not considered as a candidate in service state migration until it starts sending its state again.
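A minimal sketch of the management node's bookkeeping for this scenario follows; the names and the 30-second silence limit are assumptions, not values from the paper, and the XML payload is omitted for brevity.

```java
// Hedged sketch of the push-mode load table and the disabling rule.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LoadTable {
    private final Map<String, Long> lastHeard = new ConcurrentHashMap<>();
    private final Map<String, Integer> freeLoad = new ConcurrentHashMap<>();
    private final long silenceLimitMillis = 30_000; // assumed "predefined duration"

    /** Invoked whenever a node's periodic load push arrives. */
    void onLoadReport(String node, int load) {
        freeLoad.put(node, load);
        lastHeard.put(node, System.currentTimeMillis());
    }

    /** Nodes not heard from recently are excluded from migration decisions. */
    boolean isEnabled(String node) {
        Long t = lastHeard.get(node);
        return t != null && System.currentTimeMillis() - t <= silenceLimitMillis;
    }
}
```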

    V.7 Container to HPA


The Container-to-HPA load replication scenario works in pull mode, where the HPA periodically requests load information from the management node, and the management node responds with an XML stream. Each record in the stream refers to a node through its name and an integer priority representing the availability of the node. A node with priority 0 no longer exists, and the higher the priority value, the stronger the node. On every load pull, the HPA updates its discovery buffer with the loads so it can decide which nodes should be used when requesting service channels.
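Assuming a simple XML shape such as <nodes><node name="n1" priority="3"/></nodes> (the paper does not give the actual schema), the HPA-side update of the discovery buffer might look like the following sketch.

```java
// Hedged sketch of folding a pulled load reply into the HPA discovery buffer.
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DiscoveryBuffer {
    final Map<String, Integer> priority = new HashMap<>();

    void applyLoadReply(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("node");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            int p = Integer.parseInt(e.getAttribute("priority"));
            if (p == 0) priority.remove(e.getAttribute("name")); // node no longer exists
            else priority.put(e.getAttribute("name"), p);        // higher = stronger node
        }
    }
}
```

The priority-0 removal mirrors the liveness rule of the push scenario: a vanished node simply drops out of the HPA's channel-selection decisions on the next pull.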

VI. EXPERIMENTS

The cluster-related illustrative experiments showing startups of the HPA and Containers are introduced in [2, 3]. This section illustrates an example of the Shared Memory Replication and Service State Migration scenarios, carried out across different container types, C++ and JAVA. The example used in this section is based on a service that reads a file and sends it to the browser; the file name is passed to the service in the URL query string as a normal GET request parameter, and we call this service the File Service. This service has two implementations, one in C++ and the other in JAVA, and each is deployed on a separate container. The service implementations overload the serialize and de-serialize routines; thus a running instance of the service periodically multicasts its state to be replicated over different nodes of the cluster. The initial HTTP request is directed to the C++ service hosted on the C++ container, which is at the same time the management node of the cluster. The C++ implementation is designed to fail after sending a portion of the requested file. The expected behavior is that the running service starts sending the file over UDP packets and issues a State Replication to the CLM periodically, every given number of UDP packets sent. The replicated state contains the file name being served, the last location of the file being sent, and the sequence number of the last UDP packet sent to the HPA. On failure of the service, where the HPA fails to receive the service reply successfully, the HPA reports the failure to the management node CLM; the management node should then transparently initiate the service state migration process, and the HPA should be able to resume receiving the rest of the reply transparently on the same socket handler. The console snapshots taken while running the experiment, which tests the shared memory replication and service state migration functionalities, show how this process takes place. A sketch of the state such a File Service replicates is given after this paragraph.
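The replicated state described above carries exactly three items: the file name, the last file offset sent, and the last UDP sequence number. It could be encoded as simply as the following sketch; the field names and the delimiter format are hypothetical, not the paper's actual encoding.

```java
// Hedged sketch of the File Service state and its string-stream encoding.
class FileServiceState {
    String fileName;     // file being served, from the GET query string
    long fileOffset;     // last location of the file already sent
    long lastUdpSerial;  // lets the take-over service resume without breaking flow control

    String serialize() {
        return fileName + "|" + fileOffset + "|" + lastUdpSerial;
    }

    static FileServiceState deserialize(String stream) {
        String[] parts = stream.split("\\|");
        FileServiceState s = new FileServiceState();
        s.fileName = parts[0];
        s.fileOffset = Long.parseLong(parts[1]);
        s.lastUdpSerial = Long.parseLong(parts[2]);
        return s;
    }
}
```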

As can be seen from the console snapshots, the HPA starts receiving the reply from the C++ service; meanwhile, the C++ and JAVA consoles clearly show that the service on the C++ container is replicating its state and that the JAVA container is receiving this state and storing it in its Shared Memory Segment. When the failure happens, the HPA starts reporting a timeout situation and, after a number of retries, reports failure and asks the management node for a take-over. As the C++ node is the management node, its console shows it starting the take-over process to find a take-over candidate. On the JAVA console, the JAVA container receives the take-over request and sends a take-over reply with the freshness of the service state stored in its shared memory segment; the management node accepts and fires the approval to the JAVA node, and the JAVA console clearly shows the JAVA container taking over, reserving a temporary channel for this service, and resuming the service from the last state it received.

VI.1 Case Study: Web Usage Statistics

Web usage statistical analysis applications are important applications in the web domain. For large web portals, the usage data is very large and arrives in high volume and frequency, which makes the statistical calculations processing intensive. Two approaches can be used to generate web usage statistical reports:

    VI.1.1 Online Batch Reports Processing


In this option, the data of a specific period is processed and rolled up into a small database that holds the results of the usage reports, eliminating many details that in some cases might be needed. This approach is the more widespread, as viewing usage reports on-line is a business-dependent feature that is not in great demand among portal owners. The kinds of reports to be generated are predefined and cannot be altered at run time, since the details from which the reports are generated are not included in the report database.

    VI.1.2 Online Usage Report Generation

In this option, the raw usage data is stored in a database and mined to generate up-to-date online reports over any duration, not limited to the predefined roll-up periods of the previous approach. This approach provides great flexibility for generating or even customizing reports with no limits; however, such reports take more time than those generated by the batch approach. Of interest here is the second approach, Online Usage Report Generation. An attempt is made to enhance its execution and obtain better performance by deploying it on the multichannel environment. One of the most processing-intensive parts of the usage report is page clustering. Generating usage statistics per portal page is easy and quick; however, portal owners are usually not interested in viewing the statistics of individual pages, as a portal will usually contain many pages and there is no real benefit from such a report.

Rather, a portal owner or manager is more interested in calculating usage for parts of the web portal, meaning that groups of pages are clustered and defined as belonging together under specific sections of the portal, in which case usage is presented per cluster of pages, a very processing-intensive task. This case study shows how dividing the task into smaller sub-tasks, as sketched below, affects performance with respect to the traditional way of executing such a task.
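The decomposition itself can be pictured as one independent sub-task per page cluster, each dispatched over its own channel. The sketch below is only an analogy: a plain ExecutorService stands in for the multichannel dispatch, and computeClusterUsage is a placeholder for the per-cluster aggregation over the usage database, neither of which is the paper's actual implementation.

```java
// Hedged sketch: one independent usage sub-task per page cluster.
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

class ClusterUsageReport {
    static Map<String, Long> build(Map<String, List<String>> pageClusters,
                                   ExecutorService channels) throws Exception {
        // Each page cluster becomes an independent portion of the request,
        // analogous to one channel's share of the divided web request.
        Map<String, Future<Long>> parts = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : pageClusters.entrySet())
            parts.put(entry.getKey(),
                      channels.submit(() -> computeClusterUsage(entry.getValue())));
        // Collect the partial results into the final per-cluster report.
        Map<String, Long> report = new LinkedHashMap<>();
        for (Map.Entry<String, Future<Long>> e : parts.entrySet())
            report.put(e.getKey(), e.getValue().get());
        return report;
    }

    static long computeClusterUsage(List<String> pages) {
        // Placeholder for the hit aggregation over the raw usage data.
        return pages.size();
    }
}
```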

VI.1.3 Experiment Setup

The following experiments were run against real live data from a very high-traffic on-line portal. The usage database in our experiments holds 2 months of data with over 750,000 hits, 200,000 visits, and 150,000 visitors. The portal comprises 650 unique pages, all dynamic. These pages are categorized into 25 clusters with an average of 26 pages per cluster and a maximum of 42 pages per cluster.

The experiments were run on both environments, the traditional and the multi-channel. Two variables were changed throughout the experiment runs: the number of concurrent requests and the number of nodes in the serving cluster.

    VI.1.4 Results

First, the experiments were run on the multichannel and the traditional web environments against one server node. The results show a boost in performance for the runs on the multichannel environment, as shown in Figure 9. The chart presents the average percentage gain in performance over all channels per run, and in every run the number of concurrent requests is increased by 2. A new cluster node was introduced in the second experiment to see the effect of clustering and how the multichannel environment benefits from adding new nodes. Figures 9-13 display the average percentage of performance gain when augmenting the number of nodes in the back-end cluster. The same pattern is repeated as more nodes are added, with the only difference that the concurrent-requests threshold increases as the number of nodes in the back-end cluster increases. Figure 13 compares the four experiments, showing that the multichannel environment makes use of each added node. At the end of each experiment a new node was added, and a difference in performance can be observed clearly in the final run of the chart.

The following are the most important observations from the above results:

1. The multichannel environment provided a high performance gain of around 110% on average.

2. The gain in performance was directly proportional to the increase in the number of nodes in the cluster.

3. As the number of concurrent requests increased, the percentage gain in performance decreased.

4. The decrease in the average percentage of performance gain is not linear relative to the increase in the number of concurrent connections. For example, in the third experiment, with 3 server nodes, comparing the first and last runs reveals that the average gain in performance (gain in performance per request) decreased by 20%, yet the number of concurrent requests increased 15 times, and the number of concurrent database queries increased 15 times as well.

5. The rate of decrease in performance gain relative to the increase in the number of concurrent requests is almost linear.

6. It was noticed during the experiments that the container CPU usage during the multichannel experiments was higher than during the traditional-environment experiments, which indicates that the multichannel environment utilized more resources to provide better performance.

VII. CONCLUSION

In conclusion, we end up with a web environment that is designed entirely for serving web transactions and able to perform real distributed execution within a transaction request, providing high performance computing features not through integration with HPC tools, but within the framework of the web environment itself, inherited entirely from common web concepts such as server-side scripting standards and an extension of the HTTP protocol to serve such means. The fruit of this is a number of features, such as transparent fault tolerance without full re-execution of the failing service, and the ability of heterogeneous services to exchange service state messages and emulate a unified shared memory across all cluster nodes whose content can be used by services written in different technologies. Moreover, the CLM contributes to the presentation of the Multi-Channel cluster as an SSI (Single System Image). With this kind of CLM functionality, Multi-Channel environments can be situated between web application servers and grid environments: they provide almost all the functionality of a normal web application environment and a subset of the functionality provided by a grid environment.

REFERENCES

[1] A. Sameh, K. Sobh, "Multi-Channel Clustered Web Application Servers - Architecture", working paper.
[2] A. Sameh, K. Sobh, "Multi-Channel Clustered Web Application Servers - Configuration Manager", working paper.
[3] A. Sameh, K. Sobh, "Multi-Channel Clustered Web Application Servers - Deployment Manager", working paper.


[4] A. Sameh, K. Sobh, "Multi-Channel Clustered Web Application Servers - Cluster Manager", working paper.
[5] J. Crowcroft, I. Phillips, TCP/IP and Linux Protocol Implementation: Systems Code for the Linux Internet, John Wiley & Sons, Dec. 2001. ISBN-10: 0471408824, ISBN-13: 978-0471408826.

Figure 9: Web Usage Statistics, Average Percentage Gain in Performance: 1 Container

Figure 10: Web Usage Statistics, Average Percentage Gain in Performance: 2 Containers

Figure 11: Web Usage Statistics, Average Percentage Gain in Performance: 3 Containers

Figure 12: Web Usage Statistics, Average Percentage Gain in Performance: 4 Containers

Figure 13: Performance Comparison with Respect to the Number of Nodes
