

On Remote Procedure Call*

Patricia Gomes Soares†

Distributed Computing and Communications (DCC) Lab., Computer Science Dept., Columbia University

New York City, NY 10027

Abstract

The Remote Procedure Call (RPC) paradigm is reviewed. The concept is described, along with the backbone structure of the mechanisms that support it. An overview of work on supporting these mechanisms is discussed. Extensions to the paradigm that have been proposed to enlarge its suitability are studied. The main contributions of this paper are a standard view and classification of RPC mechanisms according to different perspectives, and a snapshot of the paradigm in use today and of goals for the future of RPC.

1 Introduction

Over time, operating-system support as well as language-level support have been designed to ease the development of distributed applications. A distributed application is a set of related programs that execute concurrently or in parallel, on the same or distinct autonomous processing nodes of a distributed system, interchanging information.

The decentralized configuration provided by a distributed system brought many benefits. The interconnection of these processing units

*The IBM contact for this paper is Jan Pachl, Centre for Advanced Studies, IBM Canada Ltd., Dept. B2/894, 895 Don Mills Road, North York, Ontario, M3C 1W3.

This work was partly supported by the Brazilian Research Council (CNPq) under Grant 204543/89.4.

makes resources available to a wider community. Both data and processes¹ can be replicated, thereby achieving greater performance through parallelism, increased reliability and availability, and higher fault tolerance. The remote accessibility of specialized units provides better performance for applications with specific requirements.

Initially, only operating-system-level primitives supported interprocess communication. Unfortunately, the details of communication were not transparent to application programmers. To transmit complex data structures, it was necessary to explicitly convert them into or from byte streams. The programmer needed to be aware of heterogeneity at the machine level, as well as at the language level, between the sending and the receiving processes. No compile-time support was offered by the languages, and debugging became an extremely difficult task. A higher level of abstraction was needed to model interprocess communication. A variety of language-level constructors and operators were then proposed to address this problem. They improved readability and portability, and supported static type checking of distributed applications. A major achievement in this area was the Remote Procedure Call (RPC).

The procedure mechanism, which is provided by most imperative nondistributed languages, was initially proposed to support code modularization and to be an abstraction facility. RPC extends this mechanism to a distributed environment, whereby a process can call a procedure that belongs to another process. Interprocess communication is then given the syntax and semantics of a well-accepted, strongly typed language abstraction. Significant results have been achieved in the effort to better accommodate in the model problems related not only to interprocess communication issues, but also to distributed environment configurations.

¹A running program unit is a process.

Computing systems continue evolving at an astonishing speed, and parameters change quickly. A high-speed network is capable of continuously broadcasting a database, thus providing for almost zero access time to data. An operating system today plays the role of a decentralized manager that searches for facilities to provide applications with necessary resources. Users are close to a declarative environment where they specify what their needs are, and the distributed system is supposed to select, based on the instantaneous configuration of the system, how to provide it.

Much demand is placed on mechanisms that isolate architecture-dependent characteristics from the user. RPC is an abstraction which can make these dependencies transparent to the user. A system supporting RPCs may provide for location transparency, where the user is unaware of where the call is executed.

RPC also constitutes a sound basis for moving existing applications to the distributed system environment. It supports software reusability, which can be one of the most important medicines in attacking the software crisis [69].

In this paper, the whole RPC concept is explored and surveyed. In Section 2, the underlying architecture is described, and in Section 2.1 the concept of RPC is introduced. Section 3 presents the major challenges in supporting the paradigm. The standard design of an RPC system is described in Section 4, as well as its major components, their function, and how they interact. Sections 5, 6, and 7 survey each of those major components, analyzing the problems they are supposed to deal with, and describing several systems and their distinct approaches. In Section 8, extensions and software support added to the RPC paradigm are discussed. Finally, Section 9 describes measurements that can be used to evaluate the performance of RPC systems.

2 General Concepts

Before exploring the concept of RPC, the underlying environment is described here. The major environment requirements that RPC aims to support are introduced. Other approaches that address these features are surveyed.

A distributed computing system consists of a collection of autonomous processing nodes interconnected through a communication network. An autonomous processing node comprises one or more processors, one or more levels of memory, and any number of external devices [8, 47]. No homogeneity is assumed among the nodes, neither at the processor, memory, nor device levels. Autonomous nodes cooperate by sending messages over the network. The communication network may be local area (short haul), wide area (long haul), or any combination of local and wide ones connected by gateways. Neither the network nor any node need be reliable. In general, it is assumed that the communication delay is long enough to make access to nonlocal data significantly more expensive than access to local primary memory. It is arguable, though, whether or not nonlocal access is significantly more expensive than local access when secondary storage is involved [59].

A distributed program consists of multiple program units that may execute concurrently on the nodes of a distributed system, and interact by exchanging messages². A running program unit is a process. Distributed programming consists of implementing distributed applications. Distributed languages are programming languages that support the development of distributed applications. There are three important features that distributed

²In shared-memory distributed systems, program units can also interact through shared memory. These systems will not be considered in this paper, and some assumptions and assertions made may not be valid for those environments.



languages must deal with that are not present in sequential languages [8]:

• Multiple program units

• Communication among the program units

• Partial failure.

A distributed language must provide support for the specification of a program comprising multiple program units. The programmer should also be able to perform process management; that is, to specify the location in which each program unit is to be executed, the mode of instantiation (heavy-weight or light-weight), and when the instantiation should start. The operating system allocates a complete address space for the execution of heavy-weight processes. The address space contains the code portion of the process, the data portion with the program data, the heap for dynamic allocation of memory, and the stack, which contains the return-address linkage for each function call and also the data elements required by a function. The creation of a light-weight process causes the operating system to accommodate the process into the address space of another one. Both processes have the same code and data portions and share common heap and stack spaces. Process management can be used to minimize the execution time of a distributed program, where the parallel units cooperate rather than compete. A full analysis of the limitations of synchronous communication in languages that do not support process management is found in [46]. Processes can be created either implicitly by their declaration, or explicitly by a create construct. The total number of processes may have to be fixed at compile time, or may increase dynamically.
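As a rough illustration of the two instantiation modes described above, the following Python sketch contrasts a light-weight thread, which shares the creator's data portion, with a heavy-weight child process, which gets its own address space. Python and the variable names are illustrative choices, not part of the paper.

```python
import subprocess
import sys
import threading

counter = 0  # lives in the data portion of this process

def bump():
    global counter
    counter += 1

# Light-weight instantiation: the thread shares this address space,
# so its update to the shared variable is visible here.
t = threading.Thread(target=bump)
t.start()
t.join()

# Heavy-weight instantiation: a separate OS process gets its own
# address space; its variables are invisible here, and it can only
# report results by sending a message (its standard output).
out = subprocess.run(
    [sys.executable, "-c", "counter = 0\ncounter += 1\nprint(counter)"],
    capture_output=True, text=True,
).stdout.strip()

print(counter, out)  # → 1 1  (parent saw only the thread's update)
```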

Language support is also needed to specify the communication and synchronization of these processes. Communication refers to the exchange of information, and synchronization refers to the point at which to communicate.

For applications being executed on a single node, a node failure causes the shutdown of the whole application. Distributed systems are potentially more reliable. The nodes being autonomous, a failure in one does not affect the correct functioning of the others. The failure of only a subset of the nodes in an application's execution, a partial failure, still leaves the opportunity for detection and recovery of the failure, avoiding the application's failure. A considerable increase in reliability can also be achieved by simply replicating the functions or data of the application on several nodes.

Distributed languages differ in modeling program units, communication and synchronization, and recoverability. Some languages model program units as objects and are distributed object-based. An extensive survey of distributed object-based programming systems is given in [16]. Others model the program units as processes and are named process-oriented (Hermes [75], and NIL [76]). Yet others model program units into tasks, as well as processes (Ada [20]), where a task is a less expensive instantiation of program units than the other two methods.

Many languages model communication as the sending and reception of messages, referred to as message-based languages, simulating datagram mailing (CSP [35], NIL [74]). A process P willing to communicate with a process Q sends a message to Q and can continue executing. Process Q, on the other hand, decides when it is willing to receive messages and then checks its mailbox. Process Q can, if necessary, send messages back to P in response to its message. P and Q are peer processes, that is, processes that interact to provide a service or compute a result. The application user is responsible for the correct pairing and sequencing of messages.
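The message-based style just described can be sketched as follows, with threads standing in for processes P and Q and queues standing in for their mailboxes; all names are illustrative, and real message-based systems would use the network rather than in-memory queues.

```python
import queue
import threading

# Each peer owns a mailbox; sending is asynchronous (datagram-like).
mailbox_p = queue.Queue()
mailbox_q = queue.Queue()

def process_q():
    msg = mailbox_q.get()          # Q decides when to check its mailbox
    mailbox_p.put(("reply", msg))  # Q may send a message back to P

t = threading.Thread(target=process_q)
t.start()

mailbox_q.put("request")  # P sends its message and can continue executing
# ... P is free to do other work here ...
tag, body = mailbox_p.get()  # P must itself pair the reply with its request
t.join()
print(tag, body)  # → reply request
```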

Several languages require the processes to synchronize in order to exchange information. This synchronization, or meeting, is a rendezvous, and is supported by languages such as Ada [20], Concurrent C [27], and XMS [26]. The rendezvous concept unifies the concepts of synchronization and communication between processes. A simple rendezvous allows only unidirectional information transfer, while an extended rendezvous, also referred to as a transaction, allows bidirectional information transfer.

The Remote Procedure Call (RPC) mechanism merges the rendezvous communication and synchronization model with the procedure model. The communication details are handled by the mechanisms that support such abstraction, and are transparent to the programmer. Languages that support RPC are referred to as RPC-based languages³ (Mesa [13], HRPC [55], Concert C [90, 7]). For the synchronization issue, RPC-based languages are classified as the ones that block during a call until a reply is received, while message-based languages unblock after sending a message or when the message has been received.

A distributed language may provide exception-handling mechanisms that allow the user to treat failure situations. The underlying operating system may also provide some fault tolerance. Projects such as Argus [44] concentrate on techniques for taking advantage of the potential increase in reliability and availability.

A survey of programming-language paradigms for distributed computing is found in [8], and an analysis of several paradigms for process interaction in distributed programs is in [5].

2.1 RPC: The Concept

This paper assumes familiarity with the operational semantics of the procedure call binding. Section A of the Appendix presents a quick review of the concept.

RPC extends the procedure call concept to express data and control scope transfer across threads of execution in a distributed environment. The idea is that a thread (caller thread) can call a procedure in another thread (callee thread), possibly being executed on another machine. In the conventional procedure call mechanism, it is assumed that the caller and the callee belong to the same program unit, that is, thread of execution, even if the language supports several threads of execution in the same operating-system address space, as shown in Figure 1. In RPC-based distributed applications, this assumption is relaxed, and interthread calls are allowed, as shown in Figure 2. Throughout this paper, local or conventional call refers to calls in which the caller and callee are inside the same thread, and remote call or RPC refers to calls in which the caller

³There exists a common agreement in the research community on the viability of the use of RPC for distributed systems, especially in a heterogeneous environment [56].

Figure 1: Conventional Procedure Call Scenario (PU: program unit; Pi: procedure i; PCi: procedure call i; arrows indicate control transfer)

and callee are inside distinct threads. Section B of the Appendix presents an extension of the operational semantics of the procedure call presented in Section A to handle RPCs.

Distributed applications using RPC are then conceptually simplified, communicating through a synchronized, type-safe, and well-defined concept. A standard RPC works as follows. Thread P, wanting to communicate with a thread Q, calls one of Q's procedures, passing the necessary parameters for the call. As in the conventional case, P blocks until the called procedure returns with the results, if any. At some point of its execution, thread Q executes the called procedure and returns the results. Thread P then unblocks and continues its execution.
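The standard RPC exchange just described can be sketched as follows: threads stand in for P and Q, queues stand in for the call and result messages, and `rpc_call` is an invented name for the blocking caller-side wrapper. This is a minimal sketch of the blocking semantics only, not of any real RPC system.

```python
import queue
import threading

call_msgs = queue.Queue()    # call messages travel from caller to callee
result_msgs = queue.Queue()  # result messages travel back to the caller

def remote(x, y):            # the procedure owned by the callee thread Q
    return x + y

def callee_q():
    _name, args = call_msgs.get()   # Q receives a call message ...
    result_msgs.put(remote(*args))  # ... executes the procedure, returns results

def rpc_call(name, *args):
    # Caller-side view: looks like an ordinary procedure call, and the
    # caller P blocks until the result message arrives.
    call_msgs.put((name, args))
    return result_msgs.get()

q = threading.Thread(target=callee_q)
q.start()
answer = rpc_call("remote", 2, 3)   # P blocks here until Q replies
q.join()
print(answer)  # → 5
```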

The communication mechanism has a clean, general, and comprehensible semantics. At the application level, a programmer designs a distributed program using the same abstraction as well-engineered software in a nondistributed application. A program is divided into modules that perform a well-defined set of operations (procedures), expressed through the interface of the module, and modules interact through



Figure 3: Possible configurations of RPCs

Figure 2: Remote Procedure Call Scenario (PU: program unit; Pi: procedure i; PCi: procedure call i; arrows indicate control transfer)

procedure calls. Information hiding, where as much information as possible is hidden within design components, and largely decoupled components are provided. For the application user, it is transparent where these modules are executed, and all the communication details are hidden. The implementation framework that supports the concept is an RPC system or RPC mechanism.

The application programmer or the RPC system designer is responsible for mapping the application modules onto the underlying architecture where the program is to be executed. The RPC system designer must be prepared to deal with three possible configurations of an RPC from a system-level point of view, as shown in Figure 3, where threads are abbreviated by the letter T, operating-system processes by P, and machines by M. An RPC can be performed between two threads executing inside a process (RPC1 in the figure), between two processes running on the same machine (RPC2), or between two processes running on different machines (RPC3). For each situation, the communication media used may vary, according to RPC system designers, environment availability, and possibly the wants of the application programmer. RPC1 could use shared memory, RPC2 could use sockets [72], and RPC3 could use a high-speed network. Communication traffic on the same machine (such as for RPC1 and RPC2) is cross-domain traffic [9], and between domains on separate machines is cross-machine traffic (as for RPC3).

3 Challenges of an RPC Mechanism

The ideal RPC mechanism is one that provides the application user with the same syntax and semantics for all procedure calls, independently of being a local call or a remote one in any of the possible configurations. The underlying system supporting the concept proceeds differently for each type of call, and is in charge of hiding all the details from the application user. The system must be capable of dealing with the following differences in the source of the calls, which require development and execution-time support:

• Artificialities may occur in extending the conventional procedure call model, independent of the RPC configuration. Some aspects to be considered are:

– With the relaxation of the single-thread environment for the caller and the callee, thread identifications and addressing must be incorporated into the model.

– In the conventional model, the failure of the callee thread means the failure of the caller thread. In the RPC model, this is not the situation, and new types of exceptions and recovery are required.

– Caller and callee threads share the same address space in the conventional case, and therefore have the



same access rights to the system resources. The implementation of the information exchange protocol is simple. Global variables are freely accessed, and parameters, return values, and their pointers are manipulated. In the RPC model, the RPC mechanism has to move data across address spaces, and pointer-based accesses require special handling. Addresses need to be mapped across address spaces, and memory needs to be allocated to store a copy of the contents of positions aliased by pointers. By using pointers, cyclic structures may be created, and the mechanism must overcome infinite loops during the copying of data.

• The presence of a communication network between caller and callee can cause differences. The RPC mechanism has to handle transport-level details, such as flow control, connection establishment, loss of packets, connection termination, and so on.

• The distributed environment configuration can cause differences. Issues such as availability, recovery, locality transparency of processes, and so on, can arise. Some of the problems of distributed environments result from heterogeneity at the machine level, operating-system level, implementation-language level, or communication-media level. Distinct RPC mechanisms support heterogeneity at distinct levels, the ultimate goal being to support heterogeneity at all levels in a way completely transparent to the application programmer.
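The pointer-copying hazard noted in the first item (cyclic structures created through pointers) can be sketched with a minimal copier that keeps a table of already-visited nodes, so that aliased and cyclic data are copied without looping forever. Here Python lists stand in for pointer-linked records, and `copy_graph` is an invented name, not part of any RPC system.

```python
def copy_graph(node, seen=None):
    """Copy a pointer-linked structure for transmission, remembering
    visited nodes by identity so cycles do not cause infinite loops."""
    if seen is None:
        seen = {}
    if id(node) in seen:        # alias or cycle: reuse the copy already made
        return seen[id(node)]
    if isinstance(node, list):  # lists stand in for pointer-linked records
        dup = []
        seen[id(node)] = dup    # register the copy before descending
        dup.extend(copy_graph(item, seen) for item in node)
        return dup
    return node                 # immutable leaf: safe to share

a = [1, 2]
a.append(a)                     # a now points to itself: a cyclic structure
b = copy_graph(a)
print(b is not a, b[2] is b)    # → True True  (the copy preserves the cycle)
```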

Some mechanisms accommodate these requirements by exposing the application programmer to different syntaxes or semantics, depending on the configuration of the call being issued. Nelson [53] describes and analyzes the major concerns of an RPC mechanism (and their complexity), which can be summarized as follows:

• Essential properties of an RPC mechanism:

– Uniform call semantics

– Powerful binding and configuration

– Strong type checking

– Excellent parameter functionality

– Standard concurrency and exception handling.

• Pleasant properties of an RPC mechanism:

– Good performance of remote calls

– Sound remote interface design

– Atomic transactions, respect for autonomy

– Type translation

– Remote debugging.

In Section 8 these issues are considered.

If a remote call preserves exactly the same syntax as a local one, the RPC mechanism is syntactically transparent. If a remote call preserves exactly the same semantics as a local call, the mechanism is semantically transparent. The term remote/local transparency refers to both syntax and semantics transparency, or either of them (when there is no ambiguity).

To support syntax and semantics transparency, even in the presence of machine or communication failures, the RPC mechanism must be very robust. Nevertheless, an RPC must be executed efficiently; otherwise (experienced) programmers will avoid using it, losing transparency.

RPC mechanisms should provide for the easy integration of software systems, thereby facilitating the porting of nondistributed applications to the distributed environment. The procedure model can limit the degree of dependency, or coupling, between components of a program, increasing the potential for software reusability [69]. In languages supporting RPCs, the interaction between modules is specified through a collection of typed procedures, the interface, which can be statically checked [29]. Interfaces, being well-defined and documented, give an object-oriented view of the application.

An RPC mechanism must not allow cross-domain or cross-machine access by unauthorized RPC calls. This provides for a large-grained protection model [9], because the protection must be divided by thread domains.

Criticism of the paradigm exists [82]. Issues such as the following are still undergoing analysis:

• Program structuring in RPC, mainly difficulties in expressing pipelines and multicasting.

• Major sources of transparency loss, such as pointers and reference parameters, global variables, and so on.

• Problems introduced by broken networks and process failures.

• Performance issues around schemes to transparently or gracefully set up distributed applications based on RPC in such a way that performance is not degraded.

A survey of some features provided by a set of major RPC implementations (such as Cedar RPC, Xerox Courier RPC, Sun⁴ ONC/RPC, Apollo NCA/RPC, and so on) can be found in [83]. The implementations are mainly compared considering issues such as design objectives, call semantics, orphan treatment, binding, transport protocols supported, security and authentication, data representation, and application programmer interface.

4 Design of an RPC System

The standard architecture of an RPC system, which is based on the concept of stubs, is presented here. It was first proposed in [53], based on communication among processes separated by a network, and it has been extended and used since then.

Stubs play the role of the standard procedure prologue and epilogue normally produced by a compiler for a local call. The prologue puts the arguments, or a reference to them, as well as the return address, in the stack structure of the process in memory, or in registers, and jumps to the procedure body. By the end of the procedure, the epilogue puts the values returned, if any, either in the stack of the process in memory or in registers, and jumps back to execute the instruction pointed to by the return address. Depending on the RPC configuration, stubs may simulate this behavior in a multithread application.

⁴SUN is a trademark of Sun Microsystems, Inc.

Suppose two threads want to communicate, as shown in Figure 4: the caller thread issues the call to the remote procedure, and the callee thread executes the call. When the caller thread calls a remote procedure, for example, procedure remote, it is making a local call, as shown in (1) in Figure 4, into a stub procedure or caller stub. The caller stub has the same name and signature as the remote procedure, and is linked with the caller process code. In contrast with the usual prologue, the stub packs the arguments of the call into a call message format (a process called marshaling), and passes (2) this call message to the RPC runtime support, which sends (3) it to the callee thread. The RPC runtime support is a library designed by the RPC system to handle the communication details during run time. After sending the call message to the callee thread, the runtime support waits for a result message (10) that brings the results of the execution of the call. On the callee side, the runtime support is prepared to receive (4) call messages to that particular procedure, and then to call (5) a callee stub. Similar to the caller stub, the callee stub unpacks the arguments of the call, in a process referred to as demarshaling. After that, the callee stub makes a local call (6) to the real procedure intended to be called in the first place (procedure remote, in our example). When this local call ends, control returns (7) to the callee stub, which then packs (that is, marshals) the result into a result message, and passes (8) this message to the runtime support, which sends (9) it back to the caller thread. On the caller side, the runtime support that is waiting for the result message receives (10) it and passes (11) it to the caller stub. The caller stub then unpacks (that is, demarshals) the result and returns (12) it as the result of the first local call made by the caller thread. In the majority of RPC mechanisms, the stubs are generated almost entirely by an automatic compiler program, the stub generator.
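The caller-stub and callee-stub roles in this flow can be condensed into a sketch of the kind of code a stub generator might emit. JSON is assumed here as a stand-in wire format (real systems use binary representations), and all function names are invented for illustration; the `transport` function stands in for the RPC runtime support and the communication media.

```python
import json

def remote(x, y):                      # the real procedure on the callee side
    return x * y

# --- caller stub: same name and signature as the remote procedure ---
def remote_stub(x, y):
    call_msg = json.dumps({"proc": "remote", "args": [x, y]}).encode()  # marshal
    result_msg = transport(call_msg)   # hand the call message to the runtime
    return json.loads(result_msg.decode())["result"]                    # demarshal

# --- callee stub: unpacks the call message and invokes the real procedure ---
def remote_callee_stub(call_msg):
    call = json.loads(call_msg.decode())            # demarshal the arguments
    result = remote(*call["args"])                  # local call to the real procedure
    return json.dumps({"result": result}).encode()  # marshal the result

def transport(call_msg):
    # Stand-in for the RPC runtime support: a real system would send these
    # bytes over shared memory, sockets, or a network, per the configuration.
    return remote_callee_stub(call_msg)

answer = remote_stub(6, 7)  # to the caller, this looks like a local call
print(answer)  # → 42
```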



Figure 4: Basic RPC components and interactions (LPC: local procedure call; numbered arrows (1)-(12) follow the call and return path through the caller stub, the RPC runtime support, the communication media, and the callee stub)

4.1 RPC System Components

RPC systems must interact with a variety of computer system components, as well as support a wide range of functionality, exemplified as follows. The system or the source language must provide minimal process management that allows the design and execution of the threads (heavy-weight or light-weight) that constitute the distributed application. To issue an RPC, a process should acquire an identifier for the respective procedure (name binding) so that it can refer to it. RPC systems must provide means for processes to exchange bindings, using a binding discipline. The runtime support module must interact with the transport layer to transport the data among the communicating processes should a cross-machine communication occur.

The major components of an RPC mechanism can be divided into three categories:

• Application-development time components: components used by the application user during development time.

• Process modules: components that are mainly dependent on the specific RPC mechanism. Depending on the particular mechanism, they can be application-development components, execution-time components, or even components predefined and available in the operating system.

• Execution-time components: components used only during run time.

RPC facilities can differ in any of these components and in several dimensions. For instance, the binding discipline (a process module component) may differ from one RPC system to another in the interface that it offers to the application user, in the time at which a process should perform the binding (compilation or run time), in the implementation details of how this binding is performed, in whether the binding deals with heterogeneity and how, and so forth. While these choices should be orthogonal, they are usually hard-coded and intertwined. As a result, RPC mechanisms usually cannot interact and are extremely difficult to modify to make such interaction possible.

In the following sections, the functionality of each of the components is defined precisely, the dimensions of differences in RPC implementations are analyzed, and real systems and their approaches are explored. A brief historical survey of the systems referred to is presented in Appendix C.

5 Application-Development Time Components

The interface description language (IDL) and the stub generator are two major application-development time components. In designing a program unit, the user specifies the procedures that the process can call remotely (imported procedures), and the procedures that other processes can call (exported procedures). The set of imported and exported procedures of a process is its interface, and is expressed through the interface description language (defined in Section 5.1) of an RPC facility.

For the imported and exported procedures of a process, the stub generator creates the caller and callee stubs, respectively. During run time, these stubs are responsible for the marshaling of arguments, the interaction with the communication media, and the demarshaling of result arguments, if any.

5.1 Interface Description Language (IDL)

An interface description (ID), or module specification, consists of an abstract specification of a software component (program unit) that describes the interfaces (functions and supported data types) defined by the component. It can be interpreted as a contract among modules. An interface description language (IDL) is a notation for describing interfaces. It may consist of an entirely new special-purpose language, or may be an extension of a general-purpose language, such as C.

The concept of an ID is orthogonal to the concept of RPC. Its use promotes information hiding, because modules interact based on their interfaces, and the implementation details are not apparent. Nevertheless, IDs are of major significance for RPC mechanisms, which use them for generating caller and callee stubs. In general, RPC protocols extend IDLs to support extra features inherited solely from the RPC mechanism.

In specifying a function, an ID contains its signature, that is, the number and type of its arguments, as well as the type of its return value, if any. More elaborate IDLs also allow for the specification of other important options:

• Parameter passing semantics

• Call semantics

• Interface identification.

It is not always possible to determine the directionality of the arguments in a function call based only on its ID, because of the diversity of parameter passing semantics (PPS) supported by high-level languages. By directionality is meant the direction of the information flow, that is, whether the caller is passing values to the callee, or vice versa, or both. The PPS specification explicitly determines how the arguments of a function are affected by a call. Concert [90, 7] defines a complete set of attributes. As directional attributes, Concert contains in, in (shape), out, and out (value). It makes a distinction between the value and the shape of an argument. The value refers to the contents of the storage, while the shape refers to the amount of storage allocated, as well as the data format. The in attribute means that the information is passed from the caller to the callee, and because of that must be initialized before the call. The in (shape) attribute means that only the shape information of the data is passed from the caller to the callee. The out attribute means that the information is passed from the callee to the caller, and the callee can change the value of the parameter, as well as its shape information. The out (value) attribute is similar to out, but the shape information does not change; only the value of the parameter changes.
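The directional attributes above can be restated as a small table. The sketch below is an illustrative restatement in Python, not Concert code; the attribute names come from the text, but the table layout and the `wire_traffic` function are inventions of this example.

```python
# Illustrative restatement (not Concert code) of the directional attributes:
# for each attribute, what information crosses from caller to callee, and
# what comes back. "value" is the storage contents; "shape" is the amount
# of storage allocated plus the data format.

TO_CALLEE = {
    "in":          {"value", "shape"},  # must be initialized before the call
    "in (shape)":  {"shape"},           # only allocation/format information
    "out":         set(),
    "out (value)": set(),
}
TO_CALLER = {
    "in":          set(),
    "in (shape)":  set(),
    "out":         {"value", "shape"},  # callee may change value and shape
    "out (value)": {"value"},           # shape is fixed; only the value changes
}

def wire_traffic(attribute):
    """Return (information sent to the callee, information returned)."""
    return TO_CALLEE[attribute], TO_CALLER[attribute]
```

A marshaling layer built on such a table would transmit only the sets named here, rather than copying every argument in both directions.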

Concert also defines the following value-retention attributes: discard, discard (callee), discard (caller), keep, keep (callee), and keep (caller). These attributes specify whether copies of storage generated by the runtime system during the procedure's execution should remain or be deallocated (discarded). These allocations are generated as part of the interprocess operation. The keep attribute and the discard attribute specify that both caller and callee can keep or discard the storage allocated.

To deal with pointers across operating-system address spaces or across machine boundaries, Concert defines the following pointer attributes: optional, required, aliased, and unique. A pointer to a required type is a pointer that cannot be NULL, whereas a pointer to an optional type can be NULL, simply indicating the absence of data. A pointer to a unique type is a pointer to data that cannot be pointed to by any other pointer used in the same call. In other words, it does not constitute an alias to any data being passed during the call. Pointers to aliased types, on the other hand, are pointers to data that may also be pointed to by other pointers in the same call. This property is expressed in the transmission format, and reproduced on the other side, so that complex data structures, such as graphs, can be transmitted.

The call semantics, on the other hand, is an IDL feature inherited from the RPC model. In conventional procedure call mechanisms, a thread failure causes the caller and callee procedures to fail. A failure is a processor failure, or abnormal process termination. If the thread does not fail, the called procedure was executed just once, in what is called exactly-once semantics. If the thread fails, the runtime system, in general, restarts the thread, the call is eventually repeated, and the process continues until it succeeds. Just as procedures may cause side effects on permanent storage, earlier (abandoned) calls may have caused side effects that survived the failure. The results of the last executed call are the results of the call. These semantics are called last-one semantics, last-of-many semantics, or at-least-once semantics.

In the RPC model, caller and callee may be in different operating system address spaces, as well as on different machines. If one thread is terminated, the other may continue execution. A failed caller thread may still have outstanding calls to other nonterminated callee threads. A callee, on the other hand, may fail before receiving the call, after having participated in the call, or even after finishing the call. For callers and callees on different machines, the network can also cause problems. A call message can be lost or duplicated, causing the RPC to be issued zero or several times. If the result message is lost, the RPC mechanism may trigger the duplication of the call.

The RPC mechanism must be powerful enough to detect all, or most of, the abnormal situations, and provide the application user with call semantics as close as possible to the conventional procedure call semantics. This is only a small part of the semantics transparency of an RPC mechanism, which is discussed thoroughly in Section 8. Nevertheless, it is very expensive for an RPC system to support exactly-once semantics, because the run time would have to recover from every failure without the user being aware. Weaker and stronger semantics for the RPC model were, therefore, proposed, and some systems extended the IDL with notations that allow the user to choose the call semantics that the user is willing to pay for in a particular procedure invocation. The possibly, or maybe, semantics does not guarantee that the call is performed. The at-most-once semantics ensures that the call is not executed more than once. In addition, calls that do not terminate normally should not produce any side effects, which requires undoing any side effect that may have been produced so far.
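To make at-most-once concrete, here is a minimal sketch (in Python, tied to no particular RPC system) of the usual duplicate-suppression technique: each call carries a unique identifier, and the callee side replays the saved reply for a retransmitted request instead of re-executing the procedure.

```python
# Minimal sketch of at-most-once call semantics: the server remembers the
# reply for every call identifier it has executed, so a duplicated call
# message (e.g. a retransmission after a lost reply) is answered from the
# table, and the procedure body runs at most once per call identifier.

class AtMostOnceServer:
    def __init__(self, procedure):
        self.procedure = procedure
        self.completed = {}            # call id -> saved reply

    def handle(self, call_id, *args):
        if call_id in self.completed:  # duplicate: replay, do not re-execute
            return self.completed[call_id]
        reply = self.procedure(*args)
        self.completed[call_id] = reply
        return reply
```

Note that the reply table must itself survive callee failures for the guarantee to hold across crashes, which is part of what makes these semantics expensive to support.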

On the other hand, a procedure that does not produce any side effect can be executed several times and yet produce the same results without causing any harm to the system. Such procedures are idempotent procedures, and, for them, at-least-once semantics is equivalent to exactly-once semantics. Another way of approaching the problem is to extend the IDL so that the user can express properties such as idempotency of a procedure, and let the system do the work.

Some RPC mechanisms extend IDLs to support the declaration of an interface identifier, which may be just a name, or a number, or both, or it may include such additional information as the version number of the module that implements it. This identifier can be particularly useful for RPC mechanisms that support the binding of sets of procedures, rather than just one procedure at a time. The binding can be done through this identifier.

The Polygen system [14], within the UNIX 5 environment, extends the functionality of interfaces a little further. In addition to a module's interface, the interface describes properties that the module should satisfy, and rules to compose other modules into an application. A site administrator provides a characterization of the interconnection compatibilities of target execution environments, and an abstract description of the compatibility of several programming languages with these environments. This characterization is in terms of rules. Applying these rules, Polygen generates, for a particular interface, interface software that can be used to compose components in a particular environment. Besides the usual stubs, the stub generator creates commands to compose and create the executables, including commands to compile, link, invoke, and interconnect them. The source programs, the generated stubs, and the generated commands are brought together in a package. The activity of analyzing program interfaces and generating packages is called packing. This scheme allows components to be composed and reused in different applications and environments without being modified explicitly by the software developer.

In Cedar [13], Mesa [51] is the language used as IDL and as the major host language for the programs. No extensions were added. The programmer is not allowed to specify arguments or results that are incompatible with the lack of shared memory.

5 UNIX is a trademark of UNIX System Laboratories, Inc.

NIDL [22], the IDL of NCA/RPC, is a network interface description language, because it is used not only to define the constants, types, and operations of an interface, but also to enforce constraints and restrictions inherent in a distributed computation. It is a purely declarative language with no executable constructs. An interface in NIDL has a header, constant and type definitions, and function descriptions. The header comprises a UUID (Universal Unique Identifier), a name, and a version number that together identify an interface implementation. NIDL allows scalar types and type constructors (structures, unions, top-level pointers, and arrays) to be used as arguments and results. All functions declared in NIDL are strongly typed and have a required handle as their first argument. The handle specifies the object (process) and machine that receives the remote call. A single global variable can be assigned to be the handle argument for all functions in an interface, making the handle argument implicit and permitting the binding of all functions of an interface to a remote process. This scheme somewhat disturbs the transparency and semantic issues of RPCs. The required argument reminds the user that the procedure being called is a remote one, losing transparency. Yet the flexibility of making it implicit increases the chance of confusing a user who is used to local call interfaces and who may not expect to be calling a remote function. Consequently, the user may not be prepared to handle the exception cases that can arise. In the implicit situation, the application user is responsible for implementing the import function that is automatically called from the stubs when the remote call is made.

Procedures declared in an interface can optionally be declared as idempotent. The NIDL compiler generates the caller and callee stubs written in C, and translates NIDL definitions into declarations in implementation languages (C and Pascal are the languages supported so far). This scheme makes an interface in NIDL general enough to be used by any programming language if the NIDL compiler can do the conversion. These files must be included in the caller and callee programs that use the interface. In this interface, the user has the flexibility of declaring transmissible types, which the NIDL compiler cannot marshal. The user is responsible for designing a set of procedures that convert the data to or from a transmissible type to or from types that NIDL can marshal. This allows, for instance, the user to pass nested pointers as arguments to remote calls.

SUN RPC [79] defines a new interface description language, RPC Language (RPCL). In the specification file, each program has a number, a version, and the declaration of its procedures associated with it. Each procedure has a number, an argument, and a result. Multiple arguments or multiple result values must be packed by the application user into a structure type. The compiler converts the names of the procedures into remote procedure names by converting the letters in the names to lower case and appending an underscore and the version number. Every remote program must define a procedure numbered 0, a null procedure, which does not accept any arguments and does not return any value. Its function is to allow a caller to check whether the particular program and version exist; it also allows the client to calculate the round-trip time.
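The naming rule can be stated in a few lines. The function below is a hypothetical restatement of it for illustration, not part of the SUN RPC toolchain:

```python
# Hypothetical restatement of the RPCL naming rule described above: the
# remote procedure name is the declared name in lower case, followed by
# an underscore and the version number.

def remote_procedure_name(declared_name, version):
    return "%s_%d" % (declared_name.lower(), version)
```

For example, a procedure declared as GETBALANCE in version 1 of a program would be called through getbalance_1.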

In the Modula-2 extension [1] to support RPC, the old Modula-2 external module definitions were used as the interface description, with only a few restrictions and extensions added. The idea was to make as few modifications as possible, allowing any external module to be executed remotely. As restrictions, remote external modules cannot export variables, and formal parameters of pointer types do not have the same semantics as when called locally. As extensions to the syntax, a procedure can be defined as being idempotent, and a variable can be defined as of type result (out parameter) or segment. The segment attribute allows the user to directly control the Modula/V system to move exactly one parameter as a large block, and to move the other data needed in the call as a small message.

5.2 Stub Generator

Figure 5: Stub Generator Global Overview

During runtime, the stubs are the components, generated by the RPC mechanism, that give the user the illusion of making an RPC like a local call. The main function of stubs is to hide the implementation details from the application designer and the user. They are usually generated at application development time by the RPC mechanism component called the stub generator. From the interface description of an interface, the stub generator generates a callee header file, a caller header file, a caller stub, and a callee stub for each of the functions declared in the interface, as shown in Figure 5. If the caller and callee are written in the same language, the header file can be the same for both the caller and callee. Some stub generators [29] also generate import and export routines. Such routines perform the binding of the respective interface or procedure.
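As a concrete, deliberately simplified illustration of what a generated stub pair does, the sketch below hand-writes the caller and callee stubs for a single procedure `add` taking two 32-bit integers. The procedure name and wire format are invented for the example, and the transport is just a Python function standing in for the communication media.

```python
import struct

def add(a, b):                              # the "remote" procedure itself
    return a + b

def caller_stub_add(transport, a, b):
    request = struct.pack("!ii", a, b)      # marshal the two arguments
    reply = transport(request)              # hand the bytes to the media
    (result,) = struct.unpack("!i", reply)  # demarshal the result
    return result

def callee_stub_add(request):
    a, b = struct.unpack("!ii", request)    # demarshal the arguments
    return struct.pack("!i", add(a, b))     # dispatch, then marshal result
```

Wiring the callee stub in directly as the transport gives a loopback call: caller_stub_add(callee_stub_add, 2, 3) returns 5. In a real system the transport would carry the request and reply across address spaces or machines.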

Stubs are the RPC mechanism components responsible for the transmission of abstract data types expressed in (high-level) programming languages over a communication media. A data representation protocol defines the mapping between typed high-level values and byte streams. Marshaling is the process of converting typed data into the transmitted representation (wire representation or standardized representation). Demarshaling is the reverse process.

The data representation protocol is responsible for defining the mapping at two levels [73]: the language level and the machine level. Caller and callee threads can be implemented in the same language or in different ones, which constitutes homogeneity or heterogeneity at the language level, or at the abstract data type level. Different languages, such as FORTRAN, LISP, COBOL, and C, have different abstract data types, and data type conversion may be needed for the communication. Existing compilers for nondistributed languages already provide transformations on the same machine, in the same or different languages, and in the configuration of different data types. These conversions are implicit conversion, coercion, promotion, widening, cast, or explicit conversion. An example of implicit conversion is a function that accepts only arguments of float type being called with integer arguments.

Stubs are also responsible for giving semantics to parameter passing by reference, or pointer-based data in general, across a communication media. Several implementations have been proposed. One possible solution would be to pass the pointer (address) only. Any future data access through the pointer sends a message back to the caller to read or write the data [82]. Besides inefficiency issues, this implementation requires that the compiler on the callee side treat local pointers differently from remote ones. This violates one of the RPC principles: the compiler should not be affected by the presence of RPCs.

In some other systems [41], a dereference swaps a memory page via the communication media, and the data is accessed locally. This scheme, however, requires homogeneous machines and operating systems, and must ensure that only one copy of a page exists within the system.

The threads can be executed on homogeneous machines, presenting homogeneity at the machine level, or on heterogeneous ones, presenting heterogeneity at the machine level. Different machines have different primitive data types. In a message-passing system, some conversion is required when sending data to a dissimilar machine. In a shared-memory system, a transformation is needed when the shared memory is accessed by a processor that represents data in a different way than the shared memory.

At the machine level, the data representation protocol must account for three possible dimensions of heterogeneity between machines: (1) the number of bits used to represent a primitive data type, (2) the order of bits, and (3) the coding method. Order of bits refers to choices such as big endian, little endian, and so on. Coding method refers to choices such as 2's complement, sign magnitude, exponent bias, hidden bit, and so forth.
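The order-of-bits dimension is easy to demonstrate. The sketch below fixes big-endian as the wire order for an unsigned 32-bit value; the choice of big-endian is this example's assumption, not a requirement of any system discussed here.

```python
import struct

# The same 32-bit value has different byte-level representations under
# big- and little-endian orderings, so a wire format must commit to one.

def to_wire(value):
    return struct.pack(">I", value)      # big-endian on the wire

def from_wire(data):
    return struct.unpack(">I", data)[0]
```

For any value without a palindromic byte pattern, the big-endian and little-endian encodings differ, which is precisely the conversion a stub must perform when sender and receiver disagree on byte order.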

A hardware representation transformation scheme for primitive data types in a heterogeneous distributed computer system is suggested in [73], which also performs an extensive analysis of the complexity of the several possible architecture models to implement it.

As observed in [73], the primitive data type transformation can be local, central, or distributed through the system. In the local model, the machine or a transformation unit attached to it performs the transformation of the data into a common format. This data in a common format is sent to the destination, and the transformation unit there transforms the data into the destination format. This approach always requires two transformations, even when dealing with two homogeneous machines. It also requires the specification of a powerful common format that can represent all data types that may be transferred.

In the central approach, a central unit performs all the transformations. The central unit receives the data in the source format, transforms it to the destination format, and sends it to the destination. This central unit keeps information about all the interconnected machines, and each piece of data to be transformed needs to have attached to it its source machine, its data type in the source machine, and the destination machine type.

In the distributed approach, specialized transformation units scattered throughout the system perform the transformations. A specialized transformation unit can only perform transformations from one source type into one destination type. If no specialized unit exists to transform type A into type C, but there are specialized units to transform type A to type B, and type B to type C, a transformation from A to C can be accomplished via A to B, and B to C. The number of transformation units may be reduced by using a partially connected matrix, and by rotating the data through several specialized transformation units.

At the software level, the most common approaches to performing these transformations are the local method and a degeneration of the distributed one. In the local method, the caller stub converts the data from the source representation into a standard one. The callee stub then does the conversion back to the original format. This scheme requires that both caller and callee agree upon a standard representation, and that two transformations always be performed. The use of a standard data representation to bridge heterogeneity between components of a distributed system is not restricted to the communication level. One example is the approach used in [84] for process migration in heterogeneous systems by recompilation. The strategy used is to suspend a running process on one machine at migration points to which the states of the abstract and physical machines correspond. Then the state of the binary machine program is represented by a source program that describes the corresponding abstract machine state. This source program is transferred to the remote machine, recompiled there, and execution continues there. The compiler intermediate language is used to communicate between a language-dependent front-end compiler and a machine-dependent code generator.

In the degenerated distributed approach, either the caller stub (sender-makes-it-right method) or the callee stub (receiver-makes-it-right method) does the transformation from the source directly into the destination representation. Only one transformation is needed. Nevertheless, the side performing the conversion needs to know all the representation details used by the other side to properly interpret the data. The caller packs the data in its favorite data format if the callee side recognizes it. The callee needs only to convert the data if the format used to transmit the data is not the one used locally. This avoids intermediary conversions when homogeneous machines are communicating, and makes the process extensible, because the data format can be extended to specify other features such as new encodings or packing disciplines. In the local approach, the connection of a new machine or language into the system requires only that the stubs for this new component convert from its representation to the standard one. In the distributed method, all the old components must be updated with the information about this new component, or the new component must be updated about all the old ones.
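A minimal sketch of the receiver-makes-it-right discipline for a single 32-bit value (the one-byte order tag is this example's invention): the sender transmits in its native byte order and labels the message; the receiver converts only when the label differs from its own order.

```python
import struct
import sys

def pack_native(value):
    # Send in our native order, preceded by a tag naming that order.
    tag = b"B" if sys.byteorder == "big" else b"L"
    return tag + struct.pack("=I", value)

def unpack_anywhere(message):
    # Interpret the payload according to the sender's tag; on a pair of
    # homogeneous machines no byte swapping ever actually happens.
    tag, payload = message[:1], message[1:]
    order = ">" if tag == b"B" else "<"
    return struct.unpack(order + "I", payload)[0]
```

This is what makes the degenerated approach cheap in the homogeneous case: the conversion cost is paid only when the two sides genuinely disagree.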

When using the local approach, the standardized representation can be either self-describing (tagged) or untagged. Self-describing schemes have benefits such as enhanced reliability in data transfer, the potential for programs to handle values whose types are not completely predetermined, and extensibility. Untagged representations make more efficient use of the bandwidth and require less processing time to marshal and demarshal. The caller and callee must nevertheless agree on the standard representation.
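The trade-off can be seen by encoding the same pair (a 32-bit integer and a short string) both ways. The tag values and field layout below are invented for the illustration:

```python
import struct

INT32, STRING = 1, 2          # invented tag values for this illustration

def encode_tagged(n, s):
    # Self-describing: every field carries a (type, length) prefix.
    raw = s.encode("ascii")
    return (struct.pack("!BI", INT32, 4) + struct.pack("!i", n)
            + struct.pack("!BI", STRING, len(raw)) + raw)

def decode_tagged(data):
    # The receiver can walk the fields without prior knowledge of the layout.
    fields, i = [], 0
    while i < len(data):
        tag, length = struct.unpack_from("!BI", data, i)
        i += struct.calcsize("!BI")
        payload = data[i:i + length]
        i += length
        fields.append(struct.unpack("!i", payload)[0] if tag == INT32
                      else payload.decode("ascii"))
    return fields

def encode_untagged(n, s):
    # Compact, but both sides must agree on the layout out of band.
    raw = s.encode("ascii")
    return struct.pack("!i", n) + raw
```

The tagged form is decodable on its own but strictly larger; the untagged form wins on bandwidth and marshaling time, at the cost of a fixed, pre-agreed layout.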

Heterogeneity at the representation level, or more specifically, at the mapping level between abstract data types and machine bytes, has already been studied in a nondistributed environment. A mechanism for communicating abstract data types between regions of a system that use distinct representations for the abstract data values is presented in [34]. A local approach is used, in which the main concern is modularity. The implementation of each data abstraction includes the transmission implementation details of instances of that particular abstraction. Each data type implementation must then define how to translate between its internal representation and the canonical representation, and the reverse, which is done through the definition of an external representation type for that type. The system supports the transmission of a set of transmissible types, which are high-level data types, including shared and cyclic data structures. The external representation type of an abstract type T is any convenient transmissible type XT, which can be another abstract type if wanted.

Stubs are responsible for the conversion of data at the language level as well as at the machine level. Over the years, distinct approaches have been developed to attack the conversion problem. Some of the approaches concentrate on solving only the machine-level conversion problems, assuming homogeneity at the language level. Others tried to solve only the language-level conversions, assuming that communication protocols at a lower level would be responsible for performing the machine-level conversions. More ambitious approaches tried to deal with both problems. In the following section, distinct approaches are studied.

5.2 .1 Stub Generator Designs

The Cedar [13] system designed the first stub generator, called Lupine [13]. The stub code was responsible for packing and unpacking arguments, and for dispatching to the correct procedure for incoming calls on the server side. The Mesa interface language was used without any modifications or extensions. It assumed homogeneity at the machine level and the language level.

Matchmaker [39, 38] was one of the earliest efforts toward connecting programs written in different languages. The purpose was to deal with heterogeneity at the language level. The same idea was later explored by the MLP [33] system. On the other hand, HORUS [28], developed almost concurrently with MLP, aimed at also dealing with heterogeneity at the machine level. These three systems shared the same approach of designing their own interface description language, and generating the stubs based on the interfaces.

HORUS [28] was one of the primary efforts in designing a stub generator for RPC that would deal with heterogeneity at the data representation level and at the language level. All interfaces are written (or rewritten) in a specific interface specification language. The aim was to provide a clean way to restrict the RPC system to data structures and interfaces that could be reasonably supported in all target languages. This simplifies the stub generator and provides uniform documentation of all remote interfaces. The system used a single external data representation that defined a standard representation of common-language and common-machine data types over the wire. Data types were then converted from the source language into the common language, and later from the common language into the target language. This simplified the stub generator and stub design, because only one representation was allowed. Nevertheless, the external data representation may not efficiently represent some data values from a source or target language. On the other hand, new source languages and machine types can be added to the system without requiring an update of the entire network.

HORUS defined precisely the classes of knowledge needed for a marshaling system, and this classification can be considered one of its major achievements. The information was classified as language dependent, machine dependent, and language and machine independent. It also defined a format for this knowledge that was easy to manipulate, general enough to contain all the information required, and self-documenting. All the language-dependent knowledge required for the marshaling is stored as language specifications. They contain entries for each common-language data type, and additional (data-type-independent) specifications needed to correctly generate the caller and callee stubs in the particular language. The machine-dependent information, including the primitive data type representations, order of bits, and coding method of all possible target machines, is stored as machine specifications. Entries for both machine- and language-dependent information are expressed in an internal HORUS command-driven language. The language specification entries contain a set of code pieces for each pair of basic data types (or type constructors) in the common language and a possible target language. The set completely specifies all information required by the stub generator to marshal that type for that target language. It contains a piece of code for generating a declaration of the type in the target language, a piece of code to marshal an argument of this type, and a piece of code to demarshal it.

HORUS works as a compile-time interpreter, as shown in Figure 6. It generates the stubs and header files by parsing the interface written in the common language and, for each declaration, consulting the corresponding table and entry and executing the appropriate piece of code associated with that entry, generating the stubs and headers. HORUS provides IN, OUT, and INOUT parameter types to optimize passing data by value-result. HORUS provides for the marshaling of pointer-based data types, as well as arbitrary graph structures. A pointer argument is marshaled into an offset and the value it references. The offset indicates where in the byte array this value is stored. Pointers are chased depth first until the entire graph structure is marshaled. HORUS constructs a runtime table of objects and byte array addresses to detect shared and repeated objects.

Figure 6: Structure of HORUS Stub Generator
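The offset-plus-table idea can be sketched for a list-like structure. Nodes are Python dicts here purely for brevity (HORUS itself worked on language data types): pointers become offsets into a flat record array, and a table of already-visited nodes makes shared or cyclic structures marshal correctly.

```python
# Loose sketch of offset-based pointer marshaling with a visited table.
# Nodes have the form {"value": v, "next": node-or-None}; each marshals
# into a [value, next_offset] record, and -1 encodes the null pointer.

def marshal_graph(root):
    records, seen = [], {}

    def visit(node):
        if node is None:
            return -1
        if id(node) in seen:          # shared or repeated object: reuse it
            return seen[id(node)]
        offset = len(records)
        seen[id(node)] = offset       # record before recursing, so cycles work
        records.append([node["value"], None])
        records[offset][1] = visit(node["next"])   # depth-first pointer chase
        return offset

    return records, visit(root)
```

A two-node cycle a → b → a marshals into two records that reference each other by offset, something a naive depth-first copy would loop on forever.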

The Universal Type System [33] (UTS) lan-guage, another effort towards the support ofheterogeneity at the language level was devel-oped concurrently with HORUS . Unlike HO-RUS, it did not deal with heterogeneity atthe machine level. The language was designedto specify interfaces in the Mixed Languag eProgramming System (MLP System), where aMixed Language Program was written in twoor more programming languages . A programconsisted of several program components, eachwritten in a host language . The UTS, a setof types, and a type constructor contain a lan-guage for constructing type expressions or sig-natures that express sets of types . The lan-guage consists of the alternation operator or ,the symbol — that indicates one unspecifie dbound in an array or string, the symbol * tha tindicates an arbitrary number of arguments o runspecified bounds, and the symbol ?, which i s


the universal specification symbol, which denotes the set of all types. The representation of a value of a UTS type is tagged, usually indicating the type and length of the data, and is stored in a machine- and language-independent way.

For each host language in the MLP system, a language binding is defined, which is a set of conventions specifying the mapping between the host language types and the types of the UTS language. The mapping is divided into three parts: mappings for types on which the two languages agree, meaning that each type has a direct equivalent in the other type system; mappings for types that do not have a direct counterpart, but that are representable in the other language; and mappings for UTS types that do not fall into the above categories. For each language binding, there is an agent process that performs the mapping, written in a combination of that language and a systems programming language. This use of multiple languages may be necessary, because the agents need to perform low-level data transformations, and yet be callable from the host language.

In the MLP system, there is a distinction between the interface that a module exports (exported interface) and the (possibly several) interfaces that other modules import (imported interfaces). These interfaces are specified in the UTS language as signatures that describe the number and UTS type of the arguments and returned values of a procedure. Signatures that make use of the additional symbols in UTS are underspecified, because the data types may not map into only one host-language-level type, but into a set of legal types. The MLP linker, therefore, takes a list of program components that constitute a mixed-language program and performs static type checking of the arguments and parameters of the intercomponent calls. In UTS, a match between an exported signature and an imported one has a more formal notion than simply determining whether the signatures are equivalent. As defined in [33]:

An import signature matches, or unifies, with the corresponding export signature if and only if the set denoted by the former is a subset of the set denoted by the latter.

This rule guarantees that the conversion can be done. When a signature does not unify, MLP delays the decoding of the UTS value and transfers the problem to the user to handle. This is done through the use of a representative for a UTS value as the argument value. MLP provides the user with a library of routines that can be called with a representative value as an argument to inquire about the UTS type of the corresponding value, or to perform explicit conversion of types. Valid operations for the host language types are not allowed to be applied to the representative itself, which is maintained by the agent. Representatives are similar to tickets that are presented to agents: they provide access to the data.
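The subset rule above can be modeled in a few lines. Here the set of types a signature denotes is represented as a bit mask, one bit per concrete type; this representation and the names are illustrative only, not UTS's actual machinery.

```c
/* Toy model of UTS unification: an import signature unifies with an
 * export signature iff the set of types it denotes (a bit mask here,
 * one bit per concrete type) is a subset of the export's set. */
typedef unsigned uts_set;

static int uts_unifies(uts_set import_sig, uts_set export_sig)
{
    /* Subset test: no type denoted by the import may be missing
     * from the export. */
    return (import_sig & ~export_sig) == 0;
}
```

An export written with the alternation operator (say, int or real) denotes a larger set, so a plain int import unifies with it, but not vice versa.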

Matchmaker [39, 38], an interface specification language for distributed processing, was one of the earliest efforts toward connecting programs written in different languages. It provides a language for specifying the interfaces between processes being executed within the SPICE network, and a multitargeted compiler that converts these specifications into code for each of the major (target) languages used within the SPICE environment. Matchmaker was intended to be a language rich enough to express any data structure that could be efficiently represented over the network, and reasonably represented in all target languages. The major target languages were C, PERQ Pascal, Common Lisp, and Ada.

Each remote or local procedure call contains an object port parameter and a list of in and out parameters. The object port specifies the object on which the operation is performed. All parameters but the object port are passed by value. Pointers, variable-sized arrays, and unions can occur only in top-level remote declarations, and may not be used when constructing other types. Time-out values can be specified, and the reply wait can be made asynchronous as well.

The Matchmaker language is used to mask language differences, such as language syntax, type representations, record field layout, procedure call semantics, and exception handling mechanisms. The representation for arguments over the network is chosen by the Matchmaker compiler. A Matchmaker specification includes complete descriptions of the types of every argument that is passed, and the compiler generates the target-language type declarations to be imported into the generated code. Matchmaker assigns a unique identification (id) to each message, which is used at run time to identify messages for an interface. After the procedure (message) is identified, the types of all fields within it are also known, because messages are strongly typed at compile time. The compiler is internally structured to allow the addition of code generators for other languages as they are added to the SPICE environment. This allows a callee to be automatically accessible to callers written in any of the supported languages, regardless of the language in which the callee is written.

The Abstract Syntax Notation One [37] (ASN.1), a more general-purpose interface description language, was designed later, and has been widely used in international standard specifications. ASN.1 interfaces are compiled into a transfer syntax based on the Basic Encoding Rules (BER), which is used as the external data representation. Each data type in ASN.1 has a rule for translating it into its corresponding transfer syntax. The ED [54] library is an implementation of the ASN.1 BER. The library includes encoding and decoding routines that can be used as primitive functions to compose encoders and decoders for arbitrarily complicated ASN.1 data types. It also contains routines that translate data values directly between C and the BER transfer syntax without converting them to an intermediate format. From the ED library, CASN1, an ASN.1 compiler, was designed and implemented for translating ASN.1 protocol specifications into C, and for generating the BER encoders and decoders for the protocol-defined data types.
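As a concrete example of a transfer syntax, a BER encoder for one ASN.1 type, a small nonnegative INTEGER, fits in a few lines: universal tag 0x02, short-form length, then a big-endian two's-complement body. This is a minimal sketch, not the ED library's interface, and it ignores negative values and long-form lengths.

```c
/* Minimal BER encoder for a small nonnegative ASN.1 INTEGER:
 * tag 0x02, short-form length, big-endian two's-complement body.
 * Returns the number of bytes written to 'out'. */
static int ber_encode_uint(unsigned char *out, long v)
{
    unsigned char body[8];
    int n = 0;
    do {                                   /* little-endian scratch copy */
        body[n++] = (unsigned char)(v & 0xff);
        v >>= 8;
    } while (v > 0);
    if (body[n - 1] & 0x80)
        body[n++] = 0;                     /* pad so the value stays nonnegative */
    out[0] = 0x02;                         /* universal tag: INTEGER */
    out[1] = (unsigned char)n;             /* short-form length */
    for (int i = 0; i < n; i++)            /* body is emitted big-endian */
        out[2 + i] = body[n - 1 - i];
    return 2 + n;
}
```

For example, 42 encodes as the three bytes 02 01 2A, and 128 needs a leading zero octet: 02 02 00 80.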

Because of the difficulty introduced by parameter passing by reference, or pointer-based data in general, in an RPC, some systems simply do not allow arguments of such types in the declaration of functions that can be called remotely. An example is the SUN RPC [79] implementation, which disallows them and restricts an RPC to a single argument and a single return value. If the user wants to pass more than one argument, or receive more than one value, the data must be grouped into a structure data type. The

user then ends up being responsible for most marshaling and demarshaling of data. rpcgen [79], the SUN stub generator, generates, from an RPC specification file, the caller and callee stubs, as well as a header file. rpcgen generates the remote procedure names by converting all letters in the names to lowercase, and appending an underscore and the program version number to the end of the names. SUN defined and used a canonical representation for data over the wire, the eXternal Data Representation [78] (XDR) protocol.
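In practice, the single-argument restriction means bundling values into structures on both ends. The procedure and structure names below are hypothetical; a real interface would be declared in an rpcgen specification file and marshaled through XDR routines.

```c
/* SUN RPC permits one argument and one result per call, so multiple
 * values are grouped into structures.  All names here are hypothetical. */
struct add_args   { int x; int y; };     /* the single call argument  */
struct add_result { int sum; int ok; };  /* the single returned value */

/* What the callee-side procedure body might look like; the XDR
 * marshaling that would surround it is omitted. */
static struct add_result add_proc(const struct add_args *a)
{
    struct add_result r;
    r.sum = a->x + a->y;
    r.ok = 1;
    return r;
}
```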

To avoid constraining the user, another approach was to allow the user to implement special routines for the marshaling and demarshaling of complex data structures. This approach was used, for instance, by the Network Interface Definition Language (NIDL) of the NCA/RPC [22] system and by HRPC [10].

In NIDL, the routines convert the transmissible types defined by the user to or from types that the NIDL compiler can marshal. Heterogeneity at the data representation level was dealt with by NCA through a Network Data Representation protocol (NDR). A set of values crossing the communication media is represented by a format label that defines the scalar value representation, and a byte stream with the values. The NDR does not specify the representation of the information in packets, however. The NIDL compiler and the NCA/RPC system encode the format label in packet headers, fragment the byte stream into packet-sized pieces, and pack the fragments in packet bodies. This implementation is very flexible, following the receiver-makes-it-right approach to heterogeneity at the data representation level.

In HRPC [10], each primitive type or type constructor defined by the IDL is translated from the specific data type in the host machine representation to or from a standard over-the-wire format. No direct support is provided for the marshaling of data types involving pointers, although the user can provide marshaling routines for complicated data types.

The main concern of other systems during the design of the stub generator is the maintenance of location transparency. The Eden system [2] consists of objects called Ejects that communicate via invocations. It is transparent to the user whether the invocation is for a


local Eject or a remote one. The system transparently does the packing and stub generation when required.

6 Process Modules

The binding disciplines, language support, and exception handling are the main components of the process modules. The binding disciplines define a way of identifying and addressing threads, and of binding procedure names to threads. The language support of the RPC mechanism allows the user to make these mappings, and to issue remote calls in situations where syntax transparency is not supported. The exception handling mechanism supports the handling of abnormal situations.

6.1 Binding Disciplines

In conventional procedure call mechanisms, a binding usually consists of finding the address of a piece of code, by using the name of a procedure. In the RPC model, a binding discipline also has to identify the thread in which the call will be executed, and several approaches may be taken. First of all, there may be several threads that can execute, or are willing to execute, a specific procedure. These threads are usually server threads (server processes), because they give the illusion that they exist to serve calls to these procedures. Some implementations require the caller to identify a specific thread to execute its call. Some require only the name of the machine on which the caller wants the call to be executed, independently of the particular thread. Finally, some implementations require that the caller identify only the procedure to be called, and the runtime support dynamically and transparently locates the most appropriate thread to execute the call (Cedar [13]). In this way, an RPC binding can change during the caller's execution, and a caller can be virtually bound to several similar callees simultaneously.

The binding can be static, performed during compilation or linkage time, or dynamic, performed during runtime. Static bindings are useful if the application configuration is static or seldom changes. They can also be used in situations where the extra cost of binding immediately before the call may significantly affect performance, as may be the case when raising alarms in a process control application.

Two phases are involved in acquiring a binding that fully connects a caller and a callee: naming and port determination. Naming is the process of translating the caller-specified callee name into the network address of the host on which the callee resides. Port determination identifies a callee thread on one host machine. The network address produced during naming is not enough for addressing a callee, because multiple callees may be running on a single host. So, each callee has its own communication port and network address, which uniquely identify the callee thread. In the conventional situation, the procedure's name is enough, because everything else is known by default.
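The two phases can be sketched as two table lookups: naming yields a host address, and port determination yields a port on that host; together they make up the binding handler. The hosts, ports, and names below are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the two binding phases (all entries invented). */
struct binding { const char *host_addr; int port; };

struct name_entry { const char *callee; const char *host_addr; };
struct port_entry { const char *callee; int port; };

static const struct name_entry name_db[] = {
    { "mailer",  "10.0.0.7" },
    { "printer", "10.0.0.9" },
};
static const struct port_entry port_db[] = {
    { "mailer",  5100 },
    { "printer", 5200 },
};

/* Returns 0 and fills *b on success, -1 if either phase fails. */
static int bind_callee(const char *callee, struct binding *b)
{
    b->host_addr = NULL;
    b->port = -1;
    for (size_t i = 0; i < sizeof name_db / sizeof name_db[0]; i++)
        if (strcmp(name_db[i].callee, callee) == 0)
            b->host_addr = name_db[i].host_addr;   /* phase 1: naming */
    for (size_t i = 0; i < sizeof port_db / sizeof port_db[0]; i++)
        if (strcmp(port_db[i].callee, callee) == 0)
            b->port = port_db[i].port;             /* phase 2: port determination */
    return (b->host_addr != NULL && b->port > 0) ? 0 : -1;
}
```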

Some systems allow the threads to export additional attributes associated with a binding, providing callers with more information on which to base the choice of thread, or with several ways of identifying a thread besides its name. For instance, a thread may export the version number of its implementation, so a caller may choose the newest version, or a particular version of interest. A thread may also publish the data objects that it updates, so a caller might choose a thread that does, or does not, modify a piece of data. The binding handler is all the information needed to identify a port.

The binding process works as follows. A thread willing to accept RPCs exports (publishes) its port address and the procedures that it is willing to execute. A thread willing to issue RPCs imports the port information of a thread, or set of threads, it thinks adequate to execute its calls. The exported or imported address information can connect the caller directly to the callee, or can connect the caller to the RPC runtime on the callee machine, which acts as a bridge to the callee. The main goal is to make this process work as similarly as possible to the conventional procedure scheme, hiding the import and export processes from the application user. Concert [90, 7] is the system that supports the most conventional and flexible binding mechanism.

To import and export procedures, some systems (Cedar RPC [13], NCA RPC [22],


HRPC [10, 55], and so on) manage a global name space, that is, a set of names whose associated data can be accessed from anywhere in the environment. Export corresponds to writing or updating information in this global name space, whereas import corresponds to reading this information. Some other systems use well-known ports, where each callee is assigned a specific port number. Callers usually consult a widely published document to get the well-known port numbers. The handler consists of this port number and the name of the machine on which the callee is supposed to run. Because the global name service usually involves permanent memory, systems should detect when a callee no longer exists, and unexport its associated information. Other systems use a Binding Daemon (SUN RPC [79]), a process that resides at a well-known port on each host and manages the port numbers assigned to callees running on that host. Another way would be to have a server-spawning daemon process, a process that also resides at a well-known port on each host, but that serves import requests by creating new ports (through forking other processes) and passing the new port number to the callee.

Some systems provide thread facilities to import or export a single procedure, whereas others provide only for the import or export of a set of procedures (an interface). In some situations the interfaces represent a functionally related grouping of functions that typically operate on shared data structures. Making the interface the unit of binding is sometimes justified as a means of ensuring that all the functions in that interface are executed by the same server. This is a step toward an object-oriented approach, where program objects can be accessed only through a specific set of procedures.

Another aspect of the binding discipline is the interface exposed to the user. The most adequate interface would be one that provides the user with the same call interface (syntax transparency), independently of whether the user is making an RPC or a local call. Most systems, however, require that the RPC handler be an argument of any RPC call (NCA/RPC [22], SUN RPC [79]). This RPC handler must also be the product of calling an import function upon an interface or procedure. Some other systems allow for the declaration of an interface-wide global variable that keeps the address of the handler. Although the user interface is then similar to a local one, the side effects of the call can be unfortunate if the user is not careful enough to update the handler when needed.

6.1.1 Different Binding Discipline Designs

In the Cedar RPC Facility [13], the unit exported or imported is an interface, and it is identified by its type and instance. The type specifies, at some level of abstraction, the interface functionality, and the instance the implementor. For each instance, the Grapevine distributed database is used to store the network address of the machine where it is running. From the application user's point of view, for each interface imported or exported there is a related function to import or export the binding (network address) of that interface. The export function gets the type and instance name of a particular instance of an interface as arguments and updates the database with this information and the machine's network address. The set of users that can export particular interfaces is restricted by the access controls that restrict updates to the database. The import function retrieves this information from the database and issues a call to the RPC runtime package running on that machine, asking for the binding information associated with that type and instance. This binding information is then accessible to the caller stubs of that interface. If there is any problem on the callee side by the time the binding is requested, the caller side is notified accordingly and the application user is supposed to handle the exception. There are several choices for the binding time, depending on the information the application user specifies in the import calls. If the user gives only the type of the interface, the instance is chosen dynamically by the runtime support at the time the call is made. On the other hand, the user can specify the network address of a particular machine as the instance identification, for binding at compile time.

In NCA/RPC [22], as mentioned previously,


RPCs require a first argument, a handle, that specifies the remote object (process) that receives and executes the call. The primary goal of this scheme is to support location transparency, that is, to allow dynamic determination of the location of an object. Local calls have the same syntax. Handles are obtained by calling the runtime support and passing the object (process) identification, the UUID (Universal Unique Identification), and the network location as input arguments. One flexibility is the implicit semantics. The user declares a global variable for each interface that stores the information necessary to acquire the handle, and designs a procedure that converts this variable into a handle. The handle argument is not passed in the call, and the conversion procedure is called automatically by the stubs during run time. To import and export interfaces in NCA, the NCA/LB (Location Broker) was designed, which retrieves location information based on UUIDs. A callee-side process exports some functionality by registering its location and the objects, types of objects, or interfaces that it wants to export as a global (to the entire network) or a local (limited users) service. The LB is composed of the Global Location Database, a replicated object that contains the registration information of all globally registered processes, and the Local Location Broker, which contains the information on the local exportations. The Global Database provides a weakly consistent replicated package, meaning that replicas can be inconsistent at any time, but, without updates, all replicas converge to a consistent state in a finite amount of time. A caller process can inquire about the information of a specific node, or can make an RPC by providing an incomplete binding handle, specifying, for instance, only an interface and not a specific process, and the runtime system asks the Local Location Broker for the remainder of the information.

In HRPC [10, 55], to support heterogeneity in the data representation protocol, transport protocol, and runtime protocol, the binding is much richer, containing not only the location, but also a separate set of procedure pointers for each of these components. Calls to the routines of these components are made indirectly through these pointers.

Initially, the HRPC binding subsystem queries the name service to retrieve a binding descriptor. A binding descriptor consists of a machine-independent description of the information needed to construct the machine-specific and address-space-specific binding. It indicates the control component, data representation component, and transport component that the callee uses, as well as a network address, a program number, a port number, and a flag indicating whether the binding protocol for this callee involves indirection through a binding agent. The HCS Name Service (HNS) applies the concept of direct-access naming, where each HCS system has a naming mechanism and the HNS accesses the information in these name services directly, rather than reregistering all the information from all the existing name services. This design allows closed systems, called insular systems, to evolve without interfering with the HNS, and makes newly added insular services immediately available, eliminating consistency and scalability problems. The HCS name service (HNS) can be viewed as a mapping between the global name of an object and the name of that object in its local system. The local system is the one responsible for the final name-to-data mapping. Each HNS name then consists of two pieces: the context, which determines the specific name service used to store the data associated with the name, and the individual, or local, name. Each name service connected to the HNS has a set of name semantics managers (NSMs). These routines convert any HNS query call to the interface, format, and data semantics of a local call to the name service. These routines are remote routines and can be added to the system dynamically, without recompilation of any existing code. For a new system to be added to the HNS, NSMs for that system are built and registered with the HNS; the data stored in them becomes available through the HNS immediately.

Using the context part of the name and the query type, the HNS returns a binding for the NSM in charge of handling this query type for this context. The caller uses this binding to call the NSM, passing it the name and query type parameters, as well as any query-type-specific parameters. The NSM then obtains the


information, usually by interrogating the name service in which it is stored.
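The HNS dispatch can be reduced to a toy: the context part of a name selects an NSM (here, a function pointer), which resolves the local part against its own name service. The "!" separator, names, and addresses below are all invented for illustration; HNS's real syntax and interfaces differ.

```c
#include <stddef.h>
#include <string.h>

/* Toy HNS: a name is "context!local"; the context selects the NSM
 * that resolves the local part.  Everything here is invented. */
typedef const char *(*nsm_fn)(const char *local);

static const char *toy_nsm(const char *local)
{
    /* Stand-in for interrogating one local name service. */
    return strcmp(local, "fred") == 0 ? "host-a:1234" : NULL;
}

struct nsm_entry { const char *context; nsm_fn resolve; };
static const struct nsm_entry nsms[] = {
    { "toyservice", toy_nsm },
};

/* Split "context!local", find the NSM for the context, delegate. */
static const char *hns_lookup(const char *name)
{
    const char *bang = strchr(name, '!');
    if (bang == NULL)
        return NULL;
    size_t clen = (size_t)(bang - name);
    for (size_t i = 0; i < sizeof nsms / sizeof nsms[0]; i++)
        if (strncmp(nsms[i].context, name, clen) == 0
            && nsms[i].context[clen] == '\0')
            return nsms[i].resolve(bang + 1);   /* NSM does the final mapping */
    return NULL;
}
```

New name services would be added by registering new `nsm_entry` rows, mirroring how NSMs are registered with the HNS without recompiling existing code.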

SUN RPC [79] has no strongly expressed binding model. Some procedures take the IP host name of the machine with which the caller wants to communicate. Others simply take file descriptors for open sockets that are assumed to already be connected to the correct machine. Each callee's host machine has a port mapper daemon that can be contacted to locate a specific program. The caller has to make an explicit call to the port mapper to acquire an RPC handler. The caller specifies the callee's host machine, program number, version number, and the TCP or UDP protocol to be used. The RPC handler is the second argument of the RPC. The port mapper also provides a more flexible way of locating possible callees. The caller sends a broadcast request for a specified remote program, version, and procedure to a particular port mapper's well-known port. If the specified procedure is registered with the port mapper, the port mapper passes the request to the procedure. When the procedure is finished, it returns the response to the port mapper instead, which forwards it to the caller. The response also carries the port number of the procedure, so that the caller can send later requests directly to the remote program.
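The port mapper's core job, mapping a (program, version) pair to a port, reduces to a table lookup. The table entries below are invented; returning 0 for an unregistered program mirrors the SUN RPC convention that 0 is never a valid port.

```c
#include <stddef.h>

/* Toy port-mapper table: (program, version) -> port, in the spirit of
 * the SUN RPC port mapper daemon.  Entries are invented. */
struct pm_entry { long prog; long vers; int port; };

static const struct pm_entry pm_table[] = {
    { 300001, 1, 5600 },
    { 300002, 2, 5700 },
};

/* Look up the port registered for (prog, vers); 0 means unregistered. */
static int pm_getport(long prog, long vers)
{
    for (size_t i = 0; i < sizeof pm_table / sizeof pm_table[0]; i++)
        if (pm_table[i].prog == prog && pm_table[i].vers == vers)
            return pm_table[i].port;
    return 0;
}
```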

In the Template-Based model [68] for distributed application design, an application consists of modules that communicate with each other via RPC. Each module can have only one exportable procedure, but it can call any of its local (internal to the module) procedures, as well as any procedure exported by other modules. The modules' interaction is expressed through a set of attribute bindings known as template attachments. A template is a set of characteristics that can be used to specify the scheduling and synchronization structure of a module in an application. It describes the interaction of one module with others. Every module can have up to three templates to specify its behavior: input, which describes the correct scheduling and synchronization of incoming calls; output, which is similar to input but for calls originated by the module; and body, which describes attributes that can modify the module's execution behavior in the distributed environment. In this model, the unit of binding is

a module, which can have only one exportable procedure. The binding is done at call time.

Concert [90, 7] expresses a binding in the most transparent way: as a function pointer. Languages like C, which already have function pointers, need no extensions. Concert simply treats functions as closures. A closure identifies a code body and a process environment in which the body is executed. In Concert, when a process assigns the address of a function body to a function pointer, not only is the function body address implicitly assigned, but also the encompassing process environment. Any function pointer passed to another process preserves this information, exporting a function pointer, or binding.
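Since a plain C function pointer carries no process environment, a Concert-style binding can be modeled as a closure record pairing the code body with the environment it runs in. The pairing below is a sketch of the idea, not Concert's actual representation.

```c
/* Model of a Concert-style binding as a closure: a code body plus the
 * process environment it executes in.  Names are illustrative. */
struct env { int server_id; };            /* stand-in process environment */

struct rpc_closure {
    int (*body)(struct env *, int);       /* code body */
    struct env *environment;              /* environment it executes in */
};

static int sample_body(struct env *e, int x)
{
    return e->server_id + 2 * x;          /* result depends on the environment */
}

/* Invoking through the closure uses both the body and its environment,
 * as calling through an exported Concert binding would. */
static int call_closure(struct rpc_closure c, int arg)
{
    return c.body(c.environment, arg);
}
```

Passing the `rpc_closure` value to another process would carry the environment along, which is what makes the function pointer itself act as the exported binding.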

6.2 Language Support

In the traditional procedure call model, the callee process, being the same as the caller process, is by default always active. In the RPC model, an issue exists as to when to activate the process in which the callee is supposed to run. This is called the activation problem, the callee management problem, or the callee semantics problem. In some systems, the application user is responsible for this activation. In others, callee processes are activated by the underlying process model. There is a callee manager that can create callee processes on demand, spawn new processes, or simply act as a resource manager that chooses an idle process from a pool of processes created earlier. Such callees can remain in existence indefinitely, and can retain state between successive procedure calls; or they may last a session, a set of procedure calls from a specific client; or they may last only one call, an instance-per-call strategy. When the activation process is controlled by the runtime system, several instances of a callee can be installed on the same or different machines, transparently to the user. This may provide load balancing, or some measure of resilience to failure. The binding mechanisms would allow one of these instances to be selected at import time.

Some languages allow the user to start processes that will be either callers or callees from the operating-system level, and also offer language support to start new processes during run time. In the latter situation, the interface should be powerful enough to allow passing arguments between the parent and the children, as pointed out in [89]. Supporting parameter passing for newly created processes allows for compile-time protection, more readability, and structuring. Concert [90] allows the passing of arguments between parent and children, and these arguments can be remote procedure references as well.

In Athena RPC [70], callees are characterized by their lifetime, by whether they are shared among clients or are private, and by the means through which they are instantiated. In the interface definition, the user specifies these characteristics through attributes that are associated with the callee's stubs. The system is very flexible, allowing callee processes to be instantiated either manually or automatically in response to a request from a caller, or to be long-lived, belonging to a pool and potentially answering multiple callers. The instantiation mode is transparent to the caller.

RPC systems also differ in the way callees respond to calls. Syntactically, the callee's receive can be explicit or implicit. In explicit mode, the language provides a function that can be called by an already active process just like any other. In implicit mode, the language provides a mechanism for handling calls or messages. The receive is not performed by any language-level active process. The callees are dedicated processes that perform only remote calls. This mode may look more like the behavior in a sequential procedure environment, where the arrival of the call triggers the execution of an appropriate body of code, much as an interrupt triggers its handler (see [65] for further discussion). In essence, a classification can be made [65] in which explicit receive is a message-based mode, whereas implicit receipt is a procedure-based one. The issue of synchronization, though, is orthogonal to the issue of the syntax with which the callee process responds.

In Cedar [13], for instance, there is no user-level control over the service of calls on the callee side. There is a pool of callee-side processes in the idle state waiting to handle incoming calls. In this way, a call can be handled without

the cost of process creation and initialization of state. In peak situations, additional processes can be created to help, and terminated after the situation returns to normal. This scheme is very efficient, because it reduces the number of process switches in a call, but the user has no control during run time over the callee-side processes.

In explicit mode, one of the most common statements is the accept statement, which takes a procedure name, or a set of procedures, as input and blocks until a call to one of the specified procedures is made. The select statement allows the callee to express Boolean expressions (guards) that must be satisfied before the callee accepts a call for the corresponding procedure. Most of the languages that do have a select construct on the callee side, such as Ada [20], do not allow RPC calls to be part of the Boolean expressions, that is, to act as guards. A discussion of situations in which the absence of such a construct causes loss of expressiveness, of robustness, and even of the wanted semantics, is presented in [25]. An expressive extension of Ada's select construct that solves the problem in an elegant way is also described there. Having RPC calls as guards in select constructs is an advantage in situations such as conjunctive transmission (broadcasting), disjunctive transmission, pipelining, transmission stopped by a signal, and so on. Broadcasting occurs when the RPC caller has some information that it wants to broadcast to a specific set of processes. It knows the calls it is to make, but the order in which the callees will be ready to answer is unknown. The caller does not want to block waiting for a busy callee to answer while it could be calling other callees and trying the busy one again afterwards. Disjunctive transmission occurs when a caller wants to call any one of its potential callees. Pipelining happens when processes P1, ..., Pn form a pipeline, where process Pi receives items from Pi-1 and sends them to Pi+1. Each Pi is simultaneously ready either to accept an item from Pi-1 or to send an item to Pi+1, and chooses according to the availability of its neighbors. Transmission stopped by a signal occurs when mixed calls and accepts are necessary, for example when a process should continuously make specific calls until another process calls it, at which point it is supposed to accept and stop calling.
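The accept/select behaviour described above can be sketched in a few lines of Python. The class and method names are illustrative, not any language's actual construct, and the guards here are plain callables standing in for the Boolean expressions of a select statement:

```python
import queue

class Callee:
    """Minimal sketch of an explicit-mode callee with guarded accepts.

    Each entry maps a procedure name to (guard, handler); a call is
    accepted only when its guard evaluates to True, mimicking the
    Boolean guards of a select statement.
    """
    def __init__(self):
        self.entries = {}          # name -> (guard, handler)
        self.pending = queue.Queue()

    def register(self, name, handler, guard=lambda: True):
        self.entries[name] = (guard, handler)

    def select_once(self):
        """Block until one acceptable call arrives, then execute it."""
        deferred = []
        while True:
            name, args, reply = self.pending.get()
            guard, handler = self.entries[name]
            if guard():
                reply.append(handler(*args))
                # re-queue calls skipped while their guards were false
                for item in deferred:
                    self.pending.put(item)
                return name
            deferred.append((name, args, reply))
```

Allowing RPC calls themselves inside the guards, as the Ada extension discussed above does, would amount to letting `guard` issue remote calls before deciding whether to accept.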

Concert [90] presents a set of language supports for the handling of RPCs in the explicit mode, which is discussed in Section C of the Appendix.

6.3 Exception Handling

In a single thread of execution, there are two modes of operation: the whole program (including caller and callee) works, or the whole program fails. With RPC, different failure modes are introduced: the callee or the caller can malfunction, or the network can fail. Even with good exception-raising schemes, the user is forced to check for errors or exceptions where logically none should be possible, making the RPC code different from the local code and violating transparency.

Transparency is lost either because RPC has exceptions not present in the local case, or because exceptions common to both cases are treated differently in the RPC and local cases. The programming convention in single-machine programs is that, if an exception is to be notified to the caller, it should be defined in the interface module, while the others should be handled only by a debugger [13]. Based on this assertion, Cedar RPC [13] emulated the semantics of local calls for all RPC exceptions defined in the exported interfaces. The exception for call failure is raised by the runtime support if any communication difficulty occurs.

To prevent a call from tying up a valuable resource on the callee side, the callee should be notified of the caller's failure [86]. On a caller's failure, or any other abnormal termination, the resource should be freed automatically. The general problem of garbage collection in a distributed system can be very expensive. NCA/RPC [36] exposes to the client exceptions that occur during the execution of RPCs. It also propagates interrupts occurring in the caller process all the way to the callee process executing on the caller's behalf.
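The Cedar convention above (interface-defined exceptions behave as in a local call, while any communication difficulty surfaces as a single call-failure exception) can be sketched as follows. `invoke` and the exception names are illustrative, not Cedar's actual API:

```python
class CallFailed(Exception):
    """Raised by the runtime for any communication difficulty
    (the single call-failure exception described in the text)."""

def invoke(stub, *args):
    """Sketch: exceptions raised by the remote procedure itself
    propagate as in a local call; transport-level trouble surfaces
    only as CallFailed."""
    try:
        return stub(*args)
    except (ConnectionError, TimeoutError) as e:
        raise CallFailed(str(e)) from e
```

The point of the wrapper is that caller code handles exactly one extra exception relative to the local case, which is the smallest possible loss of transparency.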

7 Execution-Time Components

The runtime support and the communication media support constitute the execution-time components of RPC. The runtime support consists of a set of libraries provided by the RPC system. It is a software layer between the other RPC components and the underlying architecture. It receives and enqueues calls to the callee stubs, sends the call messages generated by the caller stubs, guarantees that calls are executed in the correct order, and so on. The communication media support is the transport medium that carries the messages. At the operating-system level, there are several possible configurations for an RPC call, as shown previously in Figure 3. Depending on the particular system and the underlying architecture, the communication media chosen for each particular configuration varies. Nevertheless, when the caller and callee threads reside on different machines, network transport support is needed. This support has been the subject matter of several RPC systems, and it is discussed in the following section.

7.1 Network Transport Support

For the network transport support, the primary question is which protocol is the most suitable for the RPC paradigm.

For better performance in the extended-exchange case, TCP could be chosen, since it provides better flow and error control, with negligible processing overhead after the connection is established. It also provides the predictability required for RPC. On the other hand, it is typically expensive in terms of space and time. In space, it requires the maintenance of connection and state information; in time, connection establishment usually requires three messages to set up, and three more to tear down. RPC systems usually have short periods of information exchange between caller and callee (the duration of a call), and a callee must be able to easily handle calls from hundreds of callers. Network implementations are usually strict about the number of concurrently open connections. Even without this restriction, it would be unacceptable for a callee to maintain information about such a large number of connections.

Datagram-oriented network services (such as UDP) do not provide all the security an RPC mechanism needs, but they are inexpensive and available in any network.

In general, RPC-based systems make use of the datagram service provided by the network, and they implement a transport protocol on top of the datagram service that meets RPC requirements. This protocol usually performs the flow and error control of TCP protocols, but its connection establishment mechanism is lightweight.

The designers of the Cedar RPC Facility [13] found that designing and implementing a transport protocol specifically for RPC would be a great source of performance gains. First, the elapsed real time between initiating a call and getting results should be minimized. Adequacy for bulk data transfer, where most of the time is spent transferring data, was not a major concern in the project. Second, the amount of state information per connection, and the handshaking protocol to establish one, should be minimized. Finally, at-most-once semantics should be used.

The scheme adopted bypasses the software layers that correspond to the normal layers of a protocol. The communication cost is minimized when all the arguments of a call fit in a single packet. The protocol works as follows. To issue a call, the caller sends a packet containing a call identifier, a procedure identifier, and the arguments. The call identifier consists of the calling machine identifier, a machine-relative identifier of the calling process, and a sequence number. The pair [machine identifier, process] is named an activity. Each activity has at most one outstanding remote call at any time. The sequence number must be monotonic for each activity. The call identifier allows the caller to identify the result packet, and the callee to eliminate duplicate call packets. Subsequent packets are used for implicit acknowledgment of previous packets. Connection state information is simply the information shared between an activity on a calling machine and the RPC runtime support on the callee machine, making lightweight connection management possible. When the connection is idle, the caller has only a counter, the sequence number of its next call, and the callee has the call identifier of the last call. When the arguments or results are too large to fit in a single packet, multiple packets are sent alternately by the caller and callee. The caller sends data packets and the callee responds with acknowledgments during argument transmission, and the reverse during result transmission. This allows the implementation to use only one packet buffer at each end for a call, and avoids the buffering and flow control strategies of normal bulk data transfer protocols. In this way, the costs of establishing and terminating connections are avoided, and the cost of maintenance is minimized.
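The duplicate-elimination side of this scheme can be sketched as follows. The class and field names are illustrative, not Cedar's:

```python
class CalleeRuntime:
    """Sketch of Cedar-style duplicate elimination on the callee.

    An activity is the pair (machine id, process id); its sequence
    numbers must be monotonic, so the callee keeps only the last
    accepted sequence number per activity and drops any call packet
    that does not advance it.
    """
    def __init__(self):
        self.last_seq = {}   # activity -> last accepted sequence number

    def accept(self, machine, process, seq):
        activity = (machine, process)
        if seq <= self.last_seq.get(activity, -1):
            return False     # duplicate (retransmitted) call packet
        self.last_seq[activity] = seq
        return True
```

Because each activity has at most one outstanding call, one integer per activity is all the connection state the callee needs while a connection is idle.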

The NCA/RPC [22] protocol is robust enough to be built on top of an unreliable network service and yet deal with the lost, duplicated, out-of-order, and long-delayed messages that cause callee-side processes to fail. It also guarantees at-most-once semantics when necessary. An approach similar to the Cedar project's is used. The connection information kept between processes is, basically, a last-call sequence number. This number can be queried from both ends for synchronization purposes through ping packets. The runtime support, though, uses the socket abstraction to hide the details of several protocols, allowing protocol-independent code. A pool of sockets is maintained for each process: a socket is allocated when a remote call is issued, and returned to the pool when the call is over. A similar end-to-end mechanism is implemented by the Athena RPC [70] and by NCS/RPC [22].

SUN RPC [79] allows the user to choose the transport protocol based on the user's needs. SUN RPCs require an RPC handle as the second argument. This handle is obtained by calling the port mapper and passing the transport protocol type specified by the user, TCP or UDP. When using TCP, there is no limit on the size of the arguments or result values. An integer at the beginning of every record determines the number of bytes in the record. With UDP, the total size of the arguments must not generate a UDP packet that exceeds 8,192 bytes in length. The same holds for the returned values.
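The record scheme just described, an integer prefix giving the byte count of each record, can be sketched as simple length-prefixed framing. This is a simplification: actual SUN RPC record marking also reserves a bit of the header as a last-fragment flag, which is omitted here:

```python
import struct

def write_record(stream, payload: bytes):
    """Prefix each record with its byte count as a big-endian
    32-bit integer, so record boundaries survive the TCP byte
    stream (fragmenting omitted in this sketch)."""
    stream.write(struct.pack(">I", len(payload)) + payload)

def read_record(stream) -> bytes:
    """Read one length-prefixed record back off the stream."""
    (length,) = struct.unpack(">I", stream.read(4))
    return stream.read(length)
```

With this framing, arbitrarily large arguments can be carried over TCP, whereas the UDP transport is bounded by the 8,192-byte packet limit mentioned above.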

ASTRA [3] developed RDTP/IP, a reliable datagram transport protocol built on UDP. It provides low-cost reliable transport, where calls and replies are guaranteed through a connectionless protocol. For flow and error control, it uses a stop-and-wait protocol. It also provides an urgent datagram option that causes RDTP to push the packet out to the network immediately. On the callee side, the urgent datagram is stored in a special buffer, and a signal is raised to the user. A user can receive urgent datagrams by specifying the urgent-datagram option in receiving statements.
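The stop-and-wait discipline RDTP uses for flow and error control can be sketched as follows. The `channel.send(seq, data)` interface is hypothetical; it is assumed to return True when the matching acknowledgment arrives in time and False on a timeout:

```python
def stop_and_wait_send(packets, channel, max_attempts=3):
    """Sketch of stop-and-wait: send one packet, block for its ack,
    retransmit on loss, and only then move to the next packet."""
    for seq, data in enumerate(packets):
        for attempt in range(max_attempts):
            if channel.send(seq, data):
                break            # ack received, next packet
        else:
            raise TimeoutError(f"no ack for packet {seq}")
```

Stop-and-wait keeps at most one packet in flight, which is cheap in buffering and state but limits throughput, a reasonable trade-off for the short exchanges typical of RPC.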

The synchronized clock message protocol (SCMP) [48] is a new protocol that guarantees at-most-once delivery of messages at low cost and is based on synchronized clocks over a distributed system. It assumes that every node has a clock and that the clocks of the nodes are loosely synchronized with some skew ε of less than a hundred milliseconds. Nodes run a clock synchronization protocol that ensures such precision. This synchronization protocol is already used for other purposes, such as authentication and expiring capabilities.

The protocol can be summarized as follows: every thread of control T has a current time, T.time, that is read from the clock of the node on which T is running. Every message m has a timestamp, m.ts, which is the T.time of the sending thread at the time m is created. Each message contains a connection identifier, m.conn, chosen by the client without consulting the server. Each server maintains a connection table, T.CT, that maps connection identifiers to connection information. One piece of that information is the timestamp of the last message accepted on that connection. Not all connections have an entry in T.CT. Any T is free to remove an entry T.CT[C] for a connection C from its table if T.CT[C].ts < T.time − ρ, where ρ is the interval during which connection information must be retained. A callee also keeps an upper bound, T.upper, on the timestamps that have been removed from the table. Because only old timestamps are removed from the table, T.upper < T.time − ρ. The mechanism then distinguishes new from duplicated (old) messages based on the notion of the bound of a particular connection. On the callee side, the bound of a connection is the timestamp of the most recent previously accepted message, if the message's connection has an entry in the table. Otherwise, the global bound T.upper is used: because no information is in the table for this connection, the last message on this connection, if there was any, had a timestamp ≤ T.upper. Provided that ρ is sufficiently large, and given that T.upper < T.time − ρ, the chance of mistakenly treating a new message as a duplicate is very low.
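The accept decision above can be sketched as a small function. The symbols follow the text (the table maps connections to last-accepted timestamps, `upper` is T.upper, `rho` is the retention interval ρ); the function signature and table-trimming policy are illustrative, not the paper's exact algorithm:

```python
def scmp_accept(table, upper, ts, conn, rho, now):
    """Sketch of SCMP duplicate detection on the callee.

    Returns (accepted, table, upper).  A message is accepted only if
    its timestamp exceeds the bound of its connection: the last
    accepted timestamp if the connection is in the table, otherwise
    the global bound T.upper.
    """
    bound = table.get(conn, upper)
    if ts <= bound:
        return False, table, upper          # duplicate or too old
    table = dict(table)
    table[conn] = ts                        # accept; remember timestamp
    # entries older than now - rho may be dropped, raising T.upper
    for c, t in list(table.items()):
        if t < now - rho and c != conn:
            upper = max(upper, t)
            del table[c]
    return True, table, upper
```

Because T.upper only ever tracks timestamps that were old enough to drop, a genuinely new message with a fresh clock reading is almost never rejected, which is the low-probability error the text refers to.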

The above protocol is nevertheless for callees that do not survive failures. For resilient callees, it is necessary to determine, at the time of a message's arrival, whether the message is new or a duplicate of a message that arrived before the failure. An estimate is kept of the timestamp of the latest message that may have been received before the failure occurred. The system operates as if every message with a timestamp no greater than the estimate may have been received before the failure, and therefore only messages newer than the estimate are treated as new. An upper bound on the timestamps of accepted messages is enforced, named T.latest. After a failure, T.latest is used to initialize T.upper. Roughly speaking, a server periodically writes T.time to stable storage; after a failure, T.latest is the most recent value written to stable storage. SCMP can be used, for example, by higher-level protocols to establish connections. It can also be used as a security reinforcement mechanism that passes only new messages to the higher-level mechanisms that use it. Performance measurements [48] show that at-most-once semantics for callers that call callees occasionally is achieved at about the same cost as UDP-based SUN RPC, and with significantly better performance than TCP-based SUN RPC. If the caller makes several calls to the same callee, no major overhead is added over UDP, and it performs better than the TCP-based RPC.

When an RPC involves processes connected through a wide-area network, the protocol must support location services and transparent communication among the distant connected processes. A protocol adequate for local-area networks would be inappropriate for wide-area networks, because the latter have much higher latency. The Amoeba distributed operating system [85] introduced a session-layer gateway to support the interconnection of local-area networks. Through a publish function, target servers (callees) export their port and wide-area network address to other Amoeba sites. On receiving this information, each site installs a server agent. The server agent acts as a virtual server (callee) for all calls to that service, listening locally for requests to that port from that site. When issuing an RPC, the caller calls the server agent, which forwards the request across the wide-area network using the published address. When the request arrives at the remote Amoeba site, a client agent process is created there that acts as the virtual client (caller) of the target server and starts a local call to it. After execution, the target server sends the results to the client agent, which forwards them through the wide-area network to the server agent. The server agent then completes the call, returning the results to the real client. Amoeba provides full protocol transparency for the user. For wide-area communication, it uses whatever protocol is available, without the client and server processes knowing about it. For local communication, it uses protocols optimized for local networks. Error recovery is very powerful in this model, because the client agent notifies the server agent, and vice versa, when a shutdown of the client occurs.

REX [57] is a remote execution protocol that extends the RPC concept to a wider range of remote process-to-process interactions for an object-oriented distributed processing architecture.

8 Supporting and Extending the RPC Mechanism

This section discusses several implementation techniques that enhance the RPC mechanism, and extensions added to the concept. The extensions try to remodel the RPC paradigm so that it is more efficient, or more suitable, for particular applications.

8.1 Process Management

Of foremost concern in designing and implementing an RPC mechanism are three factors: the language that is extended to support RPC, the operating system in which the mechanism is built, and the networking latency of the underlying implementation.

An operating system should be capable of supporting several threads of control [82]. Single-threaded environments do not provide the performance and the concurrent access to shared information required for the implementation of the mechanism. An attempt at implementing an RPC prototype as part of MIT's Project Athena, around 1986, on top of 4.2BSD, is described in [70]. There was no support for more than one thread of execution within a single address space. The communication protocol was a user-mode request and response layered on UDP (UDP-RR). 4.2BSD does not provide the low latency required; its process scheduling algorithm is nonpreemptive, which tends to increase call latency.

One of the most useful properties of a language that is to be extended with RPC, or any synchronous mechanism, is support for process management. This topic is discussed in detail in [46], where the limitations of synchronous communication with a static process structure are presented. Two situations in which a process could block are identified: local delay and remote delay. When the current activity needs a local resource that is unavailable, the activity is said to be blocking because of a local delay. A callee in this situation should suspend the execution of this call and turn its attention to requests from other callers. If the current activity makes a remote call to another activity and is delayed, the activity is blocked because of a remote delay. A remote delay can actually be a communication delay, which can be large in some networks. It can also happen because the called activity is busy with another request, or must perform considerable computing in response to the request. The caller process, while waiting, should be able to work on something else. The user should be able to design several threads of execution inside a process. This provides the operating system with alternative functions to execute while that call, on the callee or caller side, is blocked.

Several alternative ways to avoid or overcome blocking in the absence of dynamic process creation are described, and the deficiencies inherent in them are analyzed, in [46]. Examples of some solutions are:

• Refusal: a callee that accepts a request and is unable to execute it, because of a delay, returns an exception to the caller to indicate that the caller should try again later.

• Nested accept: a callee that encounters a local delay accepts another request in a nested accept statement. This scheme requires that the new call be completed before the old call is completed, which may result in an ordering that does not correspond to the needs of applications.

• Task families: instead of using a single process to implement a callee, a family of identical processes is used. These processes together constitute the callee, and they synchronize with one another in their use of common data.

• Early reply: simulate asynchronous primitives with synchronous primitives. A callee that accepts a call from a caller simply records the request and returns an immediate acknowledgment.

• Dynamic task creation in the higher-level callee: the higher-level server uses early reply to communicate with callers. When it receives a request, it creates a subsidiary task (or allocates an existing one) and returns the name of the task to the caller. This handles remote delay.
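The early-reply scheme among the alternatives above can be sketched in a few lines. The queue-based interface is illustrative, standing in for the synchronous primitives of the host language:

```python
import queue

def early_reply_callee(requests: "queue.Queue", acks: "queue.Queue"):
    """Sketch of early reply: the callee records each request and
    immediately acknowledges it, doing the real work afterwards,
    which simulates an asynchronous call over synchronous primitives.
    A None request ends the loop and returns the recorded work."""
    recorded = []
    while True:
        req = requests.get()
        if req is None:
            return recorded
        recorded.append(req)          # record the request
        acks.put(("ack", req))        # immediate acknowledgment
```

The caller unblocks as soon as the acknowledgment arrives, so the callee's actual processing time no longer contributes to the caller's remote delay.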

A synchronization mechanism may provide adequate expressive power to handle local delays only if it permits scheduling decisions to be made on the basis of the name of the called operation, the order in which requests are received, the arguments of the call, and the state of the resource [46].

NIL [76] was one of the first efforts to make processes the main program-structuring construct at the language level; processes are both units of independent activity and units of data ownership. Processes can be dynamically created and destroyed in groups called components, and each component accepts a list of creation-time parameters. These parameters are used to pass initial data to an initialization routine within each process being created. The choice of processes to load into a component is made at run time. Hermes [75] and Concert [90] are languages that came afterwards, also based on the process model.

8.2 Fault Tolerance

Fault tolerance and recoverability are primary issues in distributed environments. A distributed environment provides a decentralized computing environment. The failure of a single node in such an environment represents only a partial failure; that is, the damage (loss) is confined to the node that failed. The rest of the system should be able to continue processing. A system is fault tolerant [8] if it continues functioning properly in the face of processor (machine) failures, allowing distributed applications to proceed with their execution and allowing users to continue using the system. Recoverability refers to the ability to recover from a failure, partial or not.

There are three sources of RPC failures:

• Network failure: The network is down, or there is a network partition. The caller and callee cannot send or receive any data.

• Caller site failure: The caller process or the caller host fails.

• Callee site failure: The callee process or the callee host fails. This might cause the caller to be suspended indefinitely while waiting for a response.

The system may hide the distributed environment, or other underlying details, from the user, thus providing transparency. The system can provide location transparency, meaning that the user does not have to be aware of machine boundaries or of the physical locations of a process to make an invocation on it. On the other hand, the simplest approach to handling failures is to ignore them, or to let the user take care of them.

The fault model for node failures is as follows: either a node works according to its specifications, or it stops working (fails). After a failure, a node is repaired within a finite amount of time and made active again.

Usually, the receipt of a reply message from the callee process constitutes normal termination of a call. A normal termination is possible if [58]:

• No communication or node failures occur during the call.

• The RPC mechanism can handle a fixed, finite number of communication failures.

• The RPC mechanism can handle a fixed number of communication failures and callee node failures.

• The RPC mechanism can handle a fixed number of communication failures and callee node failures, and there is tolerance of a fixed, finite number of caller node failures.

The appropriate choice of fault tolerance capabilities, and of the semantics under normal and abnormal situations, are among the most important decisions to be taken in an RPC design [58]. A correctness criterion for an RPC implementation is described in [58]. Let Ci denote a call, and Wi the corresponding computation invoked at the callee side. Let Ci and Cj be any two calls such that: (1) Cj happens after Ci (denoted by Ci then Cj), and (2) computations Wi and Wj share some data, and Wi and/or Wj modify the shared data. The correctness criterion (CR) that an RPC implementation must meet in the presence of failures then establishes that:

CR: Ci then Cj implies Wi then Wj

A call is terminated abnormally when no reply message is received from the callee. Network protocols typically employ time-outs to prevent a process that is waiting for a message from being held up indefinitely.

If the call is terminated abnormally (the time-out expires), there are four mutually exclusive situations to consider:

• The callee did not receive the request, but it was made.

• The caller did not receive the reply message, but it was sent.

• The callee failed during the execution of the call. The callee either resumes the execution after failure recovery, or does not resume the execution even after restarting.

• The callee is still executing the call, but the time-out was issued too soon.

There are two situations in which abnormal termination can cause the caller to issue the same call twice. If the time-out is issued too soon for a particular call, the caller may erroneously assume that the call could not be executed, and might decide to call another callee process while the former one is still processing the call. If the two executions share some data, the computations can interfere with each other. The other situation occurs when the caller recovers from a failure but does not know whether the call was already issued, and makes the call again. Such abandoned executions are orphans, and many schemes have been devised for detecting and treating them [40, 53].

Network failures can be classified [58] as follows:

• A message transmitted from a node does not reach its intended destination (a communication failure).

• Messages are not received in the same order as they are sent.

• A message is corrupted during its transmission.

• A message is replicated during its transmission.

There are well-known mechanisms (based on checksums and sequence numbers) that a receiver can use to treat messages that arrive out of order, are corrupted, or are copies of previously received messages. Therefore, the treatment of network failures reduces to the treatment of communication failures only. If the message-handling facility is such that messages occasionally get lost, a caller would be justified in resending a message when it suspects a loss. This can sometimes result in more than one execution at the callee side.

A process that survives failures of its node is a resilient process. Resilience might require maintaining information in a log on a nonvolatile storage medium.

Because of the difficulties in achieving conventional call semantics for remote calls, owing to problems with argument passing and with failures, some works [31] argue that remote procedures should be treated differently from the start, resulting in a completely nontransparent RPC mechanism.

8.2.1 Atomicity

A major design decision in any RPC mechanism is the choice of call semantics for an RPC in the presence of failures. Some systems adopt the notion that RPC should have only last-of-many semantics when failures occur, to model procedure calls in a centralized system [53]. In Argus [45], RPC has at-most-once semantics, and malfunctions are treated by exception handling. The Cedar [13] implementation enforces at-most-once semantics by sending probe messages and by using Mesa's powerful exception-handling mechanism. Other systems [67, 80] follow the idea that RPC should have exactly-once semantics to be useful.

An implementation of an atomic RPC mechanism, which maintains totality and serializability, is presented in [42]. Serializability is guaranteed through concurrency control, by attaching a call-graph path identifier to each message representing a procedure call. Each procedure keeps the message path of the last accepted call. This path is compared with incoming message paths, and only calls that can be serialized are accepted. In this system, procedures are separate entities. A procedure's data can be either static or automatic. Automatic data is deleted after a procedure's activation, while static data is retained. The states associated with static variables are saved in backup processors to survive failures.

Because of concurrency control, and to maintain atomicity, some calls to procedures with static data can be rejected. For instance, suppose a procedure A concurrently invokes calls on procedures B and C, both of which call procedure D, which has static data (Figure 7). If B calls D first, C's call to D should be rejected until B returns to A. If C's call is accepted, and if B calls D again later, B's and C's effects on D cannot be serialized, violating the atomicity requirement.

Figure 7: Concurrent Call of D by B and C
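The rejection rule in the example can be illustrated with a deliberately simplified model. This collapses the call-graph path identifiers of [42] down to a single "which branch is currently being served" tag, which is enough to reproduce the B/C scenario but not the general mechanism:

```python
class SharedProcedure:
    """Very simplified model of the serializability check at D.

    D remembers which branch of the call graph (B's or C's) it is
    currently serving, and rejects calls arriving from a different
    branch until that branch's call has returned to the common
    ancestor A."""
    def __init__(self):
        self.active_branch = None

    def call(self, branch):
        if self.active_branch is not None and branch != self.active_branch:
            return False              # rejected: cannot be serialized yet
        self.active_branch = branch
        return True

    def ret(self):
        """The active branch has returned to the common ancestor."""
        self.active_branch = None
```

In the full mechanism the comparison is between complete call-graph paths rather than a single tag, so repeated calls from the same branch (B calling D twice) are ordered correctly as well.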

Procedure calls can be atomic, that is, total and serializable, or nonatomic. Exactly-once semantics is guaranteed for atomic calls, while only at-most-once semantics is assured for nonatomic ones.

Calls invoked by the callees of an atomic call are also executed as atomic calls. The totality property requires that when a caller malfunctions and recovers to an old state, all the states of its callees must also be restored. If a callee fails and recovers, its caller is not affected, but the call must be repeated. The algorithm recovers a failed procedure's data state to its state after the last time it finished an atomic call. A procedure that was called by a failed atomic procedure is recovered to the state before the failed atomic procedure made its first direct or indirect call to the procedure. The backup states for a procedure are Associated States (AS). An AS is the procedure's data state (static variables and process identification), tagged with a version number that is increased by one every time a new AS is created. An AS is created on procedure initiation and updated every time the procedure returns from an atomic call.

A transactional RPC design and implementation is presented in [23]. The design consists of a preprocessor, a small extra runtime library, and the usual RPC components. The preprocessor accepts interfaces written in DCE IDL and generates, for each procedure, a shadow client stub and a shadow manager stub (shadow server stub). It also generates an internal interface, which contains a new version of the procedures declared in the original interface, where an extra in-out argument is inserted. This extra argument contains the data needed by the transaction system. The internal interface is the input to the stub generator.

When issuing the RPC, the user calls the shadow client stub, which communicates with the transaction subsystem. The shadow client registers the call, obtains some registration data (piggyback data), and updates the log file. After that, it calls the shadow manager stub generated for the internal interface and passes the piggyback data as the extra argument. On the other side, the manager stub receives the call. It recovers the piggyback data argument and interacts with the transaction system to communicate the acceptance of the request. Thereafter, it calls the real manager stub with the remainder of the arguments. A similar protocol is used when the call finishes. The shadow server interacts with the transaction system, communicating the completion of the call and obtaining some more piggyback data. It then passes the piggyback data and the procedure results back to the client shadow. The client shadow, once again, interacts with the transaction system by passing this piggyback data, and then returns the results of the procedure to the original caller. The user interfaces are not modified, and users with transactional applications benefit in a completely transparent manner.

A reliable RPC mechanism that adopts the exactly-once semantics is described in [67]. At the RPC system language level, the user can specify additional information that enables the underlying process to manage the receipt and sending of messages. Orphans are not treated at this level. The level immediately above this requires that all programs be atomic actions with the all-or-nothing property. Executing callees in an atomic way guarantees that the repeated execution at the callee side is performed in a logically serial order, with orphan actions terminating without producing any results. The syntax transparency is lost, and an RPC call requires, at least, the name of the callee's host, a status argument, and a time-out argument. The time-out specifies how long the caller is willing to wait for a response to his request. The status indicates whether the call was executed without problems, whether the call was not done, or whether the callee was absent or could not execute the call. The system assigns sequence numbers (SNs) to messages. SNs are unique over the entire system. A callee maintains only the largest SN received, in failure-proof storage. All retry messages are sent by a sender with the same SN as the original message. A callee accepts only messages whose SN is greater than the current value of the last largest SN. A similar approach is used at the caller's side. For the generation of network-wide unique sequence numbers, the loosely synchronized clock approach is used. Each node is equipped with a clock and is also assigned a unique node number. A sequence number at each node is the current clock value concatenated with the node number.
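The sequence-number scheme above can be sketched as follows. This is a minimal illustration, not the system of [67]: the class names, the use of a simple counter as the "clock", and the 16-bit packing of the node number are all assumptions.

```python
import itertools

class SequenceNumberGenerator:
    """Network-wide unique SNs: clock value concatenated with node number.
    Hypothetical sketch; a plain counter stands in for the loosely
    synchronized clock."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = itertools.count(1)

    def next_sn(self):
        # Concatenate the clock value (high bits) with the node number (low bits).
        return (next(self.clock) << 16) | self.node_id

class Callee:
    """Accepts only messages whose SN exceeds the largest SN seen so far."""
    def __init__(self):
        self.largest_sn = 0  # in the real system this lives in failure-proof storage

    def accept(self, sn):
        if sn <= self.largest_sn:
            return False  # duplicate or retry of an already-handled message
        self.largest_sn = sn
        return True
```

Since retries carry the original SN, a retried message that was already delivered is rejected by the `accept` test, which is what yields the exactly-once behavior on the callee side.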

8.2.2 Replication

Some systems use replication to support fault tolerance against hardware failure. Schemes that use replication can be classified in two categories: primary standby and modular redundancy. In the primary standby approach, one copy is the active primary copy and the others are passive secondary copies. The primary copy receives all the requests and is responsible for providing the responses. It also periodically updates the checkpoints of the passive secondary copies, so that they can assume the role of primary copy if needed. If the primary copy fails, a prespecified secondary copy assumes its role. The major drawback of such a scheme is the considerable system support required for checkpointing and message recovery. In the modular redundancy scheme, all the replicated copies receive the requests, perform the operation, and send the results. There is no distinction made between the copies. The caller usually waits for all the results to be returned, and may employ voting to decide the correct result. The major disadvantage of this scheme is the high overhead of issuing so many requests and so many responses. If there are m requests and n copies of the requested service, the number of messages needed to complete the requests is O(m x n).
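The modular redundancy scheme with majority voting can be sketched as follows. This is a hypothetical illustration in which each replica is modeled as a plain function; a real system would send n messages and collect n responses over the network.

```python
from collections import Counter

def replicated_call(copies, request):
    """Modular redundancy sketch: send the request to every copy,
    wait for all results, and vote on the correct one."""
    # One logical request to n copies costs n request and n response messages.
    results = [copy(request) for copy in copies]
    winner, _ = Counter(results).most_common(1)[0]
    return winner
```

With one faulty replica out of three, voting still recovers the correct result, which is the fault-tolerance benefit bought at the O(m x n) message cost noted above.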

Figure 8: Replicated Procedure Call (a caller troupe making one-to-many calls to a callee troupe)

The Replicated Procedure Call [18] is an example of a modular redundancy-based mechanism for RPC-based environments. The set of replicas (instances) of a program module is a troupe. A replicated procedure call is made from a caller troupe to a callee troupe. Each member of the caller troupe makes a one-to-many call to the callee troupe, and each member of the callee troupe handles a many-to-one call from the caller troupe, as shown in Figure 8. Each member of the callee troupe performs the requested procedure only once, while each member of the caller troupe receives several results. To the programmer this looks like a normal procedure call. Distributed applications continue to function normally despite failures if at least one member of each troupe survives. The degree of replication can be adjusted to achieve varying levels of reliability. Increasing the number of nodes spanned by a troupe increases its resilience to failures.

To issue a replicated procedure call, the caller's side of the protocol sends the same call message, with the same sequence number, to each member of the callee troupe. On the callee's side, a callee member receives call messages from each caller troupe member, performs the procedure once, and sends the same result to each member of the caller's troupe. For this call, each troupe has a unique troupe id, assigned by a binding agent, and each message contains the caller's troupe id. When a request arrives, the callee checks the members of the caller's troupe, and determines from whom to expect call messages as part of the many-to-one call. The callee also needs a way of grouping the related requests, that is, of deciding whether the requests are unrelated or are part of the same replicated call, because the requests may not be synchronized or identical. The system does not require complete determinism, that is, that each member of a troupe reach the same stable state through the execution of the same procedures with the same arguments in the same order. The system is based on the notion of a state equivalence relation between modules, and parameter equivalence and result equivalence among procedure calls. Procedure executions in a module have an equivalence relation induced by these relations. Two executions are equivalent if: (1) the results are result equivalent, (2) the new states are state equivalent, and (3) the traces are identical up to parameter equivalence of the remote procedures called. A module is deterministic up to equivalence if: (1) all executions of a procedure with a specific set of arguments in a specific state are equivalent, or (2) whenever a parameter is equivalent to another parameter, and a state is also equivalent to another state, all executions of a specific procedure in the former state with the former arguments are equivalent to the executions of the same procedure with the second arguments in the second state.

To assure the deterministic-up-to-equivalence relation, the notion of a call stack is extended to a distributed call stack, which consists of the sequence of contexts, possibly across machine boundaries, that were entered and not yet returned from. At the base of the stack is the top-level context that originated the chain of calls. The distributed call stack is, once again, extended to a replicated call stack, in which an entry for an active procedure consists of a set of contexts, one from each member of the troupe implementing that procedure. Given an active replicated procedure call, the replicated call stack can be traced back, and a unique troupe can be found at the base level. This is the root troupe of the replicated procedure call.

In this way, a root id is added to each call message. A root id contains the troupe id of the root troupe of the call, and the sequence number of the root troupe's original replicated call. The root id is a transaction identifier, and whenever a callee makes a replicated call on behalf of a caller, it propagates the root id of the call that it is currently performing. Root ids have the essential property that two or more call messages arriving at a callee have the same root id if they are part of the same replicated call. This allows a callee to handle many-to-one calls.
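The grouping of incoming call messages by root id can be illustrated as follows. The sketch is hypothetical: messages are modeled as dictionaries, and a root id is modeled as a (root troupe id, sequence number) pair, following the description above.

```python
from collections import defaultdict

def group_by_root_id(messages):
    """Sketch: a callee groups incoming call messages by root id.
    Messages sharing a root id belong to the same replicated call,
    so each group forms one many-to-one call."""
    calls = defaultdict(list)
    for msg in messages:
        calls[msg["root_id"]].append(msg)
    return calls
```

Even when the replicated requests arrive unsynchronized and interleaved with unrelated calls, the root id alone is enough to reassemble each many-to-one call.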

A combination of the modular redundancy approach and the primary standby approach is described in [88], which provides an RPC mechanism tolerant to hardware failures. Whereas all the replicated copies execute each call concurrently, as in the modular scheme, the caller only makes one call, which is then transmitted to the others. The mechanism exposes to the user a cluster abstraction, which constitutes a group of replicated incarnations of a procedure. The incarnations are identical, but reside on different nodes. An RPC call is issued by a caller cluster, and the callee cluster is the one that executes the call. The caller process, the process that starts the call, is not replicated, and is a special instance of a caller cluster with only one instance. In recursive and nested calls, a cluster can function both as a caller and as a callee.

There are primary and secondary incarnations of a procedure. At cluster creation time, an incarnation is designated primary, and is responsible for handling the communication between clusters. All others are considered secondary. If the primary incarnation fails, a secondary incarnation assumes its function and becomes the primary incarnation. A secondary incarnation can only interact with other clusters' processes if all its superior incarnations have failed and it has become primary. The incarnation hierarchy in a cluster is determined when the service is created. All incarnations play an active role: all the incarnations in the callee cluster execute a call, and every incarnation in a caller cluster receives a copy of the result. In the absence of failures, though, the caller makes a single call and only one copy of the result is sent to the caller cluster. The system assumes that a cluster is deterministic: when receiving the same call, each of its incarnations produces the same result and has the same side-effects. The callee procedures must be idempotent.

8.2.3 Orphan Detection

Rajdoot [58] uses three built-in mechanisms to cope with orphans in an efficient way. It adopts exactly-once semantics. In case of callee failure, the call is guaranteed to be terminated abnormally. There are three orphan handling mechanisms:

• (M1): If a client call is terminated abnormally, any computations that the call may have generated are also terminated.

• (M2): Consider a node that fails and makes a remote call to node C after recovery. If C has any orphans because of the caller's failure, they are terminated before the execution of the call starts at C.

• (M3): If the node remains down, or if after recovery it never makes calls to C, any orphans on C are detected and removed within a finite amount of time.

To ensure M1, every call has a deadline (time-out) argument that indicates to the callee the maximum time available for execution. When the deadline expires, the callee terminates the execution and the call is terminated abnormally. To ensure M2, every node maintains a crashcount in permanent storage. The crashcount is a counter that is incremented immediately after a node recovers from a failure. Also in stable storage, a node maintains a table of crashcount values for callers that made calls to it. Every call request contains the caller's crashcount value. If the value in a request is greater than the value in the callee's table for that caller, there may be orphans at the callee. These orphans are terminated before the callee proceeds with the call. To ensure M3, every node runs a terminator process that occasionally checks the crashcount values of other nodes, by sending messages to them and receiving replies, and terminates any orphans when it detects failures.
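The crashcount check of mechanism M2 can be sketched as follows. The names are hypothetical, and a dictionary stands in for the table that Rajdoot keeps in stable storage; actually terminating orphan processes is abstracted to recording the event.

```python
class CalleeNode:
    """Rajdoot-style crashcount check (sketch): if a caller's crashcount in a
    request exceeds the value remembered for that caller, earlier calls from
    it may be orphans and are terminated before the new call proceeds."""
    def __init__(self):
        self.known_crashcounts = {}  # caller -> last seen crashcount (stable storage)
        self.killed_orphans = []     # record of orphan-termination events

    def handle_call(self, caller, crashcount, work):
        if crashcount > self.known_crashcounts.get(caller, 0):
            # The caller has recovered from a failure since its last call:
            # terminate its orphans, then remember the new crashcount.
            self.killed_orphans.append(caller)
            self.known_crashcounts[caller] = crashcount
        return work()
```

The same comparison serves mechanism M3: the terminator process periodically fetches remote crashcounts and performs the identical check without waiting for a new call.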

The RPC call syntax is as follows, where parameters and results are passed by value:

RPC(server: ...; call: ...;
    time-out: ...; retry: ...;
    var reply: ...; var rpc_status: ...);

The first parameter specifies the callee's host. The second parameter is the name of the function and the arguments. The retry parameter indicates the number of times that the call is to be retried (the default value being zero). If, after issuing a call, no reply is received within the duration specified by time-out, the call is reissued. The process is repeated a maximum of retry times. The manager processes, which run at each node and are accessed via a well-known address, are responsible for creating callee processes that execute the remote calls of a client. All remote calls of the client are directed to the callee process.
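The time-out and retry behavior of this call can be sketched as follows. This is a simplified model, not Rajdoot's implementation: `transport` is a hypothetical function that delivers the call and returns a reply, or None when no reply arrives within the time-out.

```python
def rpc(server, call, time_out, retry, transport):
    """Sketch of the call discipline above: reissue the call when no reply
    arrives within time_out, at most `retry` extra times (default zero),
    and report the outcome through an rpc_status-like second result."""
    for attempt in range(retry + 1):
        reply = transport(server, call, time_out)
        if reply is not None:
            return reply, "ok"
    return None, "failed"
```

Note that the retries are what make the deadline argument of mechanism M1 important: a reissued call must not leave the earlier attempt running as an orphan past its deadline.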

SUN RPC does not specify the call semantics to be supported and has no provision for orphan treatment. The transport protocol chosen at the time that the RPC handle is acquired determines the call semantics. Similarly, Xerox Courier RPC [87] appears to support exactly-once semantics, but its description is not precise about its fault tolerance capabilities, and no support for orphan treatment is provided.

8.3 Debugging and Monitoring RPC Systems

Concurrency and communication among the concurrent parts of a distributed program greatly increase the difficulty associated with program debugging. For this reason, a system for prototyping and monitoring RPCs was designed [91]. From programmer-written callee definition files, the system can prototype callee programs, caller driver programs, interface description files written in NIDL for each callee, stubs, and prototype programs of RPC procedures for each callee.

The system consists of two parts: a prototyping generator and a monitor. The prototyping generator takes the callees' definition files and generates the above-mentioned files. It consists of a set of specialized generators that provide for the gradual development of the components of a distributed environment. The server program generator generates a callee stub and a callee program for each callee definition file. The sequential client program generator generates, for each callee, a caller stub and a caller driver program. The parallel client program generator generates, for each callee, a driver program that can make use of any number of remote procedures provided by the callee concurrently. For a group of callees, the distributed client program generator generates a driver program that can make use of any number of remote procedures provided by these callees.

The produced programs are monitored by the monitor when the debug option is on, and during prototyping generation. The monitor has a controller and a group of managing servers. The controller offers a user interface and has a filter. Each host that has monitored program parts on it has a managing server, consisting of a server (MS) and an event database.

The monitoring is done in three steps. During the monitoring step, all events that occur on one host are monitored by the local MS and recorded into the local event database. An event is an RPC call and execution, a process creation (through a fork) and termination, or any combined event defined by the user through the Event Definition File (EDF). During the second step, the ordering step, the events saved in the local event databases are ordered through messages interchanged among the MSs. The controller can be invoked from any host. The last step, the replaying step, consists of combining the results in all related event databases. The filter presents the execution trace of the distributed program to the user. After the program is debugged, the user needs only to program the RPC bodies and the application interface.

8.4 Asymmetry

Another aspect of the RPC mechanism that might not be appropriate for certain applications is the asymmetry inherent in the RPC model. In the RPC model, the caller can choose the particular callee, or at least the callee's host machine. Seldom can the callee make any restrictions about the caller, because the callee cannot distinguish among callers to provide them with different levels of service or to extend to them different levels of trust. Callees must deal with callers that they do not understand, and certainly cannot trust [64].

This issue is referred to as naming (or addressing) of the parties involved in an interaction [8]. A communication scheme based on direct naming is symmetric if both the sender and receiver can name each other. In an asymmetric scheme, only the sender names the receiver. When the interprocess communication mechanism provides for explicit message receipt, the receiver has more control over the acceptance. The receiver can be in many different states and accept different types of messages in each state. Indirect naming involves an intermediate object, usually called a mailbox, to which the sender directs its messages and to which the receiver listens.

In LYNX [64], a link is provided as a built-in data type, and is used to represent a resource. Although the semantics of a call over a link are the same as those of an RPC, both ends of the link can be moved. Callees not only can specify the set of callers that may be connected to the other side of the link, but also are free to rearrange their interconnections to meet the needs of a changing user community, and to control access to the resources they provide.

Also related to symmetry is the issue of nondeterminism. A process may want to wait for information from any of a group of other processes, rather than from one specific process. In some situations, it is not known in advance which member (or members) of the group will have its information available first. Such behavior is nondeterministic.

At the other extreme, communication through shared data is anonymous. A process accessing data does not know or care who sent the data.

Another criticism of the RPC model is the passiveness of the callee [82]. Except for the results, a callee is not able to initiate action to signal the caller. The point is that the procedure call model is asymmetric, and the callee has no elegant way of communicating with or awaking the callers. One example [17] where the callee should call the caller is when the network server (callee) needs to signal to an upper layer in a protocol, or when a window manager server needs to respond to user input.

The procedure mechanism provides a synchronous interface to call downward through successive layers of abstraction. RPCs extend the mechanism and allow the layers to reside in different address spaces. The distributed UpCalls mechanism [17] is a facility that allows a lower level of abstraction to pass information to a higher level of abstraction in an elegant way. It is implemented as part of a server structuring system called CLAM. Users can layer abstractions in caller processes (statically bound) or dynamically load layers into the callee process. Consider two abstractions P and Q in different layers, with P in a higher layer than Q. P passes to Q pointers to functions on which P is willing to accept UpCalls. To pass the pointers, P calls a Q registering function, with the function pointers as arguments, which is similar to exporting in normal RPC. When an event occurs that requires an UpCall to be made, Q decides the higher level of abstraction that should receive the call, which could be P. P and Q can be in the same address space (local UpCalls) or in different ones (remote UpCalls). This is a different way of expressing, for instance, exception handling.

8.5 Lack of Parallelism

The inherent lack of parallelism between the caller and the callee in the RPC model is another feature that has received a lot of attention from the scientific community. With RPC, while the callee is active, the caller is always idle, waiting for the response [82]; parallelism is not possible. To gain performance, the caller should continue computing while the callee is working. Related to the above issue is the fact that there is no way of interleaving execution between the caller and the callee [82].

A caller can only have one outstanding call at a time. The RPC paradigm is inherently a two-party interaction [82]. With the large number of processors available in a distributed environment, the caller is actually forbidden by the mechanism to exploit such parallelism. Many distributed applications could benefit from concurrent access to multiple callees [86].

A caller may want to make several calls simultaneously; bulk data transfer is one such situation in an RPC environment.

One solution would be to build RPC-based applications with several threads, so that, while waiting for a call, only that call's thread blocks and the others proceed with execution. This solution does not scale well [3], since the cumulative cost of thread creation, context switching, and thread destruction can be too great in large distributed environments, where the number of RPC calls grows and shrinks dynamically. Moreover, threads are not universally supported.

The interactions between a caller and a callee can be classified as follows [3]:

• Intermittent exchange: the caller makes a few intermittent request-response type calls to the callee.

• Extended exchange: the caller is either involved in bulk data transfer, or makes many request-response type calls to a callee.

Several solutions have been proposed to extend the RPC concept, so that it is both more efficient for bulk data transfer and allows for asynchronous communication. All the solutions try to maintain as much as possible the semantics of a procedure call.

The call buffering approach [30] is a mechanism that provides a call buffering server, where requests and replies are stored (buffered). An RPC call results in a call to the buffering server, which stores the request, the caller's name, and the callee's name. After that, the caller makes periodic requests to the call buffering server to check whether the call has been executed. In the meantime, callees poll the call buffering server to check whether there are any calls waiting. If so, the arguments are retrieved, the call is executed, and the callee calls the call buffering server back to store the results. The next time the caller polls the buffer, the results can be retrieved.
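A minimal sketch of such a call buffering server follows. The interface is hypothetical, and the whole exchange happens in-process; a real server would be a separate node holding the pending calls and results durably.

```python
class CallBufferServer:
    """Call-buffering sketch: callers deposit requests and poll for results;
    callees poll for pending work and deposit results."""
    def __init__(self):
        self.pending = {}  # call_id -> (caller, callee, args)
        self.results = {}  # call_id -> result

    def submit(self, call_id, caller, callee, args):
        self.pending[call_id] = (caller, callee, args)

    def poll_work(self, callee):
        # A callee polls for any call addressed to it.
        for cid, (_, target, args) in list(self.pending.items()):
            if target == callee:
                del self.pending[cid]
                return cid, args
        return None

    def store_result(self, call_id, result):
        self.results[call_id] = result

    def poll_result(self, call_id):
        # The caller polls periodically; None means "not executed yet".
        return self.results.pop(call_id, None)
```

The design decouples caller and callee entirely: neither ever contacts the other directly, at the price of polling latency on both sides.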

In the Mercury system [45], a mechanism is proposed that generalizes and unifies the RPC and byte-stream communication mechanisms. The mechanism supports efficient communication, is language independent, and is called call stream, or simply stream. To add the mechanism to a particular language, a few extensions may be added to the language. Such extensions are a language veneer. Veneers for three languages have already been provided: Argus, C, and Lisp.

A call stream connects two components of a distributed program. The component sender makes calls to the component receiver. The next call can be made before the reply to the previous call is received. The mechanism delivers the calls in the order they are made, and delivers the replies to the sender in the order of the corresponding calls. The purpose of the proposed mechanism is to achieve high throughput: calls are buffered and flushed when convenient. Low latency can be achieved by explicitly flushing the calls. The call stream relies solely on a specific, reliable byte-stream transport such as TCP, making it more suitable for bulk data transfer. The use of TCP leads to higher overhead for most transactional applications, in which a request-response protocol is more appropriate.

The mechanism offers three types of calls:

• Ordinary RPCs: the sender can make another call only after receiving the reply.

• Stream calls: the sender can make more calls before receiving the reply. To the user, the sequence of calls has the same effect as if the user had waited to receive the nth reply before making the (n + 1)th call. The system delivers calls and replies in the correct order.

• Sends: the sender is not interested in the reply.

The receiver provides a single interface, and does not distinguish among the three kinds of calls. The underlying system takes care of buffering messages where appropriate, and of delivering calls in the correct order. At the moment of the call, the senders can independently choose the particular type of call desired.

To minimize the delay for the ordinary RPC mode, requests and replies are sent over the network immediately, that is, as soon as they are issued or available. On the other hand, stream calls and sends are buffered and sent when convenient.
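The difference between buffered stream calls and immediately flushed ordinary calls can be sketched as follows. The names are hypothetical and do not reproduce the Mercury interface; `send` stands for the underlying byte-stream transport.

```python
class CallStream:
    """Call-stream sketch: stream calls are buffered for throughput and
    sent in order when flushed; ordinary calls flush at once for latency."""
    def __init__(self, send):
        self.send = send  # transport function (hypothetical)
        self.buffer = []

    def stream_call(self, call):
        self.buffer.append(call)  # buffered; flushed when convenient

    def ordinary_call(self, call):
        self.stream_call(call)
        self.flush()              # sent immediately for low latency

    def flush(self):
        for call in self.buffer:  # calls go on the wire in the order made
            self.send(call)
        self.buffer.clear()
```

Because an ordinary call flushes the buffer, calls of different types issued on the same stream still reach the receiver in the order they were made.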

The paradigm also introduces the abstraction of entities. An entity has a set of port groups and activities (processes). The port groups group ports for sequencing purposes. On the sender side, entities have agents and a port group that define the sending end of a stream. In this way, the user and the system establish the sequencing in which calls from ports of the same group, calls from different streams but from the same entity, or calls from different streams and different entities must be executed.


In the CRONUS system [3], Future is designed only for low latency, and yet the order of execution in Future may vary from the order in which the calls were made. Stream and Future do not optimize for intramachine calls.

Another extension to the RPC paradigm was the new data type introduced in the work described in [43], the promise. The goal was to preserve the semantics of an RPC call, yet prevent the caller from blocking until the moment the caller really needs the values returned by the call. An RPC is declared as returning a promise, a placeholder for a value that will exist in the future. The caller issues the call and continues executing in parallel with the callee. When the callee returns the results, they are stored in the promise, where they can be claimed by the caller when needed. The system guarantees that the calls are executed in the order that they were made, even if the result of call i + 1 is claimed before the result of call i, if this result is ever claimed. The programmer explicitly claims the result, and the introduction of this data type shows the user the distinction between call time and claiming time. Functions are restricted to passing arguments only by value and returning only one value, which constitutes the promise value.
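The distinction between call time and claiming time can be sketched with a small placeholder type. This is a single-threaded illustration of the idea, not the mechanism of [43]: a real implementation would block the claiming caller until the callee's reply arrives, whereas this sketch raises instead.

```python
class Promise:
    """Sketch of a promise: a placeholder for a value that will exist in
    the future. The caller continues in parallel with the callee and
    pays for the call only when it claims the result."""
    _UNSET = object()

    def __init__(self):
        self._value = Promise._UNSET

    def resolve(self, value):
        # Invoked when the callee's reply arrives.
        self._value = value

    def claim(self):
        # The caller explicitly claims the result (claiming time).
        if self._value is Promise._UNSET:
            raise RuntimeError("would block: reply not yet available")
        return self._value
```

Between `resolve` never being called and `claim` never being called, the type also captures the case of a result that is never claimed at all.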

The extension is language independent, and the abstraction can be incorporated in any language. An RPC mechanism could implement the idea behind the mechanism in a transparent way, confining the problem to the implementation level rather than the language level. The system could do some data flow analysis of the program and allow the parallel execution of caller and callees, blocking the caller only when values that are expected to be affected by RPC calls are claimed. The complication here is to take care of pointer-based arguments that are modified as a side-effect of the remote execution.

ASTRA [3] tries to combine low-latency and high-throughput communication in a single asynchronous RPC model, and provides the user with the flexibility of choosing among different transport mechanisms at bind time. It follows the philosophy that the programmer knows best, letting the programmer decide whether a specific call requires low latency or high throughput. The system does all the necessary optimizations. The caller does not block during a call, and the replies can be claimed when needed, similar to promises. All calls are executed in the same order as they were made. If the reply is not available when the caller claims it, the caller can unblock the receive operation by specifying a no-delay option. ASTRA also provides optimized intramachine calls, bypassing the data conversion and network communication. For binding, it uses a superserver daemon that must reside on every machine that runs a callee process; the daemon also detects whether a particular server is still active. The major drawback of such a design is the loss of transparency between local and remote procedure calls.

Another new data type, pipes [29], proposes to combine the advantages of RPC with the efficient transfer of bulk data. It tries to combine low-latency and high-throughput communication in a single framework, allowing bulk data and incremental results to be efficiently passed in a type-safe manner. RPCs are first-class values, that is, they have the same status as any other values. They can be the value of an expression, or they can be passed as an argument. Therefore, they can be freely exchanged among nodes. Unlike procedure calls, pipe calls do not return values and do not block the caller. Pipes are optimized for maximum throughput, whereas RPCs are optimized for minimum latency. RPCs and pipes are channels, used in the same manner as local procedures. The node that makes the call is the source node, and the node that processes calls made on a channel is the channel's sink node. For timing and sequencing among calls, channels can be collected into a channel group. A channel group is a set of pipes and procedures that share the same sink node and that observe a sequential ordering constraint for a source process. This ordering constraint guarantees that calls made by a process on the members of a channel group are processed in the order that they were made. The user makes the grouping.

A survey of asynchronous remote procedure calls can be found in [4].


8.6 Security

A major problem introduced by an external communication network is providing data integrity and security in such an open communication network. Security in an RPC mechanism [71] involves several issues:

• Authentication: To verify the identity of each caller.

• Availability: To ensure that callee access cannot be maliciously interrupted.

• Secrecy: To ensure that callee information is disclosed only to authorized callers.

• Integrity: To ensure that callee information is not destroyed.

Systems treat these problems differently, giving them more attention or priority according to their most suitable applications.

The Cedar RPC facility [13] uses the Grapevine [12, 62] distributed database as an authentication service, or key distribution center, for data encryption. It treats the RPC transport protocol as the level of abstraction at which to apply end-to-end authentication and secure transmission measures [61]. The Andrew system [61] independently chose the same approach.

Andrew's RPC mechanism [61] supports security. When a caller wants to communicate with a callee, it calls a bind operation. The binding sets up a logical connection at one of the four levels offered by the system:

• OpenKimono: the information is neither authenticated nor encrypted.

• AuthOnly: the information is authenticated, but not encrypted.

• HeadersOnly: the information is authenticated, and the RPC packet headers, but not the bodies, are encrypted.

• Secure: the information is authenticated, and each RPC packet is fully encrypted.

When establishing the connection, the caller also specifies the kind of encryption to be used. The callee rejects the kinds of encryption that it cannot handle. The bind operation involves a 3-phase handshake between the caller and the callee. By the end of the operation, the caller and the callee share one handshake key. The caller is assured that the callee can derive this key from its identification, and the callee is assured that the caller possesses the correct handshake key. The RPC mechanism does not have access to the format of a caller's identification, nor to the way in which the key can be derived from this identification. All this functionality is hidden behind function pointers with which the security mechanism supplies the RPC runtime system. The code for the details of the authentication handshake is small and self-contained, and can be treated as a black box, allowing alternative mutual authentication techniques to be substituted with relative ease. A call to an unbind operation terminates the connection, and destroys all state associated with that connection.
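The four protection levels can be illustrated with a small sketch of which parts of a packet each level encrypts. The enum and `encrypt_packet` function are hypothetical, not Andrew's interface, and a string prefix stands in for actual encryption.

```python
from enum import Enum

class SecurityLevel(Enum):
    """The four connection levels described above (names from the text)."""
    OPEN_KIMONO = 0   # neither authenticated nor encrypted
    AUTH_ONLY = 1     # authenticated, but not encrypted
    HEADERS_ONLY = 2  # authenticated; packet headers encrypted, bodies not
    SECURE = 3        # authenticated; each packet fully encrypted

def encrypt_packet(packet, level):
    """Hypothetical sketch: apply the level's protection to a (header, body)
    pair. 'enc:' marks the parts a real system would encrypt."""
    header, body = packet
    if level is SecurityLevel.SECURE:
        return ("enc:" + header, "enc:" + body)
    if level is SecurityLevel.HEADERS_ONLY:
        return ("enc:" + header, body)
    return (header, body)  # OpenKimono and AuthOnly leave the packet in the clear
```

Authentication is orthogonal to this function: in the real system every level except OpenKimono authenticates the connection during the bind handshake, regardless of what is encrypted afterwards.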

One way of guaranteeing security in RPC systems is to enhance the transport layer protocol at the implementation level and to extend the call interface at the user level, as in [11]. A new abstract data type, conversations, is introduced, through which users interact with the security facilities. To create a conversation, a caller passes its name, its private key, and the name of the other principal to the RPC runtime system. That conversation is then used as an argument of an RPC, and the runtime system ensures that the call is performed securely, using a conversation key known only to the two principals.

The scheme is based on the use of private keys and built upon an authentication service (or key distribution center). Each principal has a private key known only to the principal and the authentication service. A caller and a callee that want to communicate negotiate with the authentication service to obtain a shared conversation key. This key is used to encrypt subsequent communication between the two principals. The federal Data Encryption Standard [52] (DES) is used for encryption, because very fast and inexpensive hardware implementations are broadly available.

For example, if principal A wants to communicate securely with principal B, A's program includes a call on the RPC runtime system in the form:


conv <- RPC.CreateConv[from: nameOfA,
                       to: nameOfB,
                       key: privateKeyOfA]

Principal A can then make remote calls to procedure P.Q implemented by B, such as:

x <- P.Q[thisConv: conv, arg: y]

Inside the implementation of P.Q, principal B can find the identity of the caller by a call on the RPC runtime system in the form:

caller <- RPC.GetCaller[thisConv]

The concept of conversation is orthogonal to the other abstractions in a call. Multiple processes can participate in a conversation, and multiple simultaneous calls may be involved in a conversation.

8.7 Broadcasting and Multicasting

It is argued in the research community that there are applications for which RPC seems to be an inappropriate paradigm. Several extensions to this model have therefore been proposed. One such extension addresses applications that require multicasting or broadcasting.

Programming language support for multicast communication in distributed systems that can be built on RPC is presented in [19]. The callee interface requires no modifications at all. The caller has additional binding disciplines, which provide for binding a call to a group of callees, and additional linguistic support. Callers have to deal with several replies, instead of just one, and may have to block only until the first n replies are received.
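Collecting only the first n replies from a group of callees can be sketched as follows, using a thread pool as a stand-in for the RPC transport (all names are illustrative):

```python
import concurrent.futures

def multicast_call(callees, arg, n):
    """Issue the same call to a group of callees and block only until
    the first n replies arrive; remaining replies are ignored."""
    replies = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(callees)) as pool:
        futures = [pool.submit(c, arg) for c in callees]
        for fut in concurrent.futures.as_completed(futures):
            replies.append(fut.result())
            if len(replies) == n:
                break
    return replies

# Three replicated "callees" exporting the same procedure.
callees = [lambda x: x * 2, lambda x: x * 2, lambda x: x * 2]
first_two = multicast_call(callees, 21, n=2)
```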

8.8 Remote Evaluation

One of the main motivations behind interprocess communication and distributed environments is that specialized processors can be interconnected. This kind of environment seems like a single, multitarget system able to execute, possibly efficiently, several specialized computations. In an RPC model, the specialized processors are generally mapped to callee processes that offer (export) a certain interface.

Because of the wide range of applications to which these processors may be applicable, there may be no ideal set of remote procedures that a callee should provide [71]. In the worst situation, the RPC model requires as many customized interfaces from a set of callee processes as there are applications that may make use of it. Without an appropriate RPC interface, a distributed application can suffer degraded performance because of the communications overhead. It is claimed in [71] that, in the RPC model, a program has a single, fixed partition into fragments for local and remote execution.

Remote Evaluation (REV) is the ability to evaluate a program expression at a remote computer [71]. A computer that supports REV makes a set of procedures available to other computers by exporting them. The set of procedures exported by a computer is its interface. When a computer (the client computer) sends a program expression to another computer (the server computer), this is a REV request. The server evaluates the program expression and returns the results (if any) to the client.
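A toy version of the REV round trip, assuming the program expression is shipped as source text and the server's exported interface is a small table of procedures (all names here are invented for illustration, not part of the REV system):

```python
# Procedures the "server" exports; its interface in REV terms.
SERVER_INTERFACE = {"total": sum, "top": max}

def rev_request(expression: str):
    """Server side: evaluate the shipped expression using only the
    exported procedures (a crude stand-in for real sandboxing)."""
    return eval(expression, {"__builtins__": {}}, SERVER_INTERFACE)

# Client side: compose exported procedures into a new computation,
# something a fixed RPC interface would need a custom procedure for.
result = rev_request("total([1, 2, 3]) + top([10, 20])")
```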

Before sending the program expression, the client prepares the code portion for transmission, in the encoding process. The end result of this encoding is a self-contained sequence of bits that represents the code portion of the request in a form that can be used by any server supporting the corresponding service. It may consist of compiled code, source code, or another program representation. This choice affects mainly the run-time performance and server security.

The use of precompiled REV requests can improve run-time performance, because the servers directly execute the REV requests instead of using an interpreter. It may not be very useful in a heterogeneous computing environment, and has the potential for compromising server security. Dynamic compilation of REV requests at servers is certainly possible, but the net effect on performance depends on the REV requests. The performance of an RPC mechanism is bounded by the overhead of internode communication; REV may eliminate such overhead. Because the trade-off is between generality and performance when RPCs are used, service designers typically choose performance over generality. When REV is available, however, programmers can construct distributed applications that need both. While in the RPC model a callee process exports a fixed set of procedures, a REV environment allows the application programmer to compose server procedures to create new procedures that have equal standing with the exported ones.

The relationship between RPC and REV is compared with the relationship between built-in data types and new data type definitions. REV is claimed to be a generalization of, and an alternative to, RPC.

One of the drawbacks of REV is that it does not allow functions to be passed as arguments of REV program expressions. The Network Command Language [24] (NCL) defines a new programming model that is symmetric, that is, not restricted to caller and callee network architectures. Both callers and callees are allowed to use NCL to send expressions that specify an algorithm for an operation.

To communicate with a remote host, the caller has to establish a session with it, which is equivalent to logging into a server by passing appropriate credentials. If the caller fulfills the authorization requirements, it enters an interactive LISP read-eval-print loop with the host's evaluator (server). The server connects back with the caller in the same way. The caller and server exchange NCL expressions representing requests and responses over this session as if each were a user typing to an interactive evaluator. A caller or server can vary the programming dynamically to suit changing requirements within a heterogeneous distributed system. In this way, a server can perform all of the operations necessary for a collection of heterogeneous callers without previous definitions, linkages, or interface modules for every combination of primitives. As a result, the ultimate point of compatibility for distributed systems shifts to the NCL syntax, libraries, and canonical data formats.
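The LISP-like exchange can be approximated in a few lines: a tiny parser and evaluator of prefix expressions of the kind each side's evaluator would run (the operator set is invented for illustration, not NCL's):

```python
def parse(tokens):
    """Recursively build a nested list from a token stream."""
    tok = tokens.pop(0)
    if tok == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    # Numbers become floats; anything else is an operator symbol.
    return float(tok) if tok.lstrip("-").replace(".", "", 1).isdigit() else tok

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def evaluate(expr):
    if isinstance(expr, list):
        op, *args = expr
        return OPS[op](*[evaluate(a) for a in args])
    return expr

def read_eval(text):
    """One turn of the read-eval loop over a received expression."""
    return evaluate(parse(text.replace("(", " ( ").replace(")", " ) ").split()))

answer = read_eval("(+ 1 (* 2 3))")
```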

The primary advantage of this scheme is that many application programs can be built that translate tasks from the language the client is used to into expressions to be evaluated remotely. This advantage makes the software highly extensible.

To transmit expressions, LISP-like representations are used that are very economical. On the other hand, it can be rather inefficient to have these expressions interpreted on the server's machine. Integration with RPC-based systems is also not easy.

9 Performance Analysis

It is a common practice for experienced programmers to use small procedures instead of in-line code, because it is more modular and does not affect performance too much [82]. If such procedures are run remotely, however, instead of locally, the performance may be severely degraded.

RPCs incur an overhead between two and three orders of magnitude greater than local calls [86]. The cost of making an RPC in the absence of network errors can be classified as follows [86]:

call time = parameter packing
          + transmission queuing
          + network transmission
          + callee queuing and scheduling
          + parameter unpacking
          + execution
          + results packing
          + transmission queuing
          + network transmission
          + caller scheduling
          + results unpacking

It can be rewritten as:

call time = parameter transformations
          + network transmission
          + execution
          + operating-system delays
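The compacted formula lends itself to a back-of-the-envelope model. The numbers below are invented for illustration only, chosen to reflect the two-to-three-orders-of-magnitude gap reported above:

```python
def rpc_call_time(param_xform, network, execution, os_delays):
    """Compacted cost model: call time = parameter transformations
    + network transmission + execution + operating-system delays."""
    return param_xform + network + execution + os_delays

def local_call_time(execution, call_overhead=0.001):
    """A local call pays only the procedure body plus a tiny call cost."""
    return execution + call_overhead

# Invented example numbers (milliseconds): even a cheap procedure body
# pays far more when invoked remotely than locally.
remote = rpc_call_time(param_xform=0.3, network=0.8, execution=0.01, os_delays=0.4)
local = local_call_time(execution=0.01)
```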

Programs that issue RPCs can be modeled, and a cost analysis can be developed based on simple measurements, as shown in [50]. RPCs can be classified according to the size of the input, the size of the output, and the amount of computation performed by the procedure. Application programmers are provided with some guidelines, such as how often the user can afford to call an expensive RPC, considering the degradation of performance caused by remote instead of local execution; the cases in which the remote host can perform the computation faster, and the speedup factor; or, more generally, the expected execution time of a program with a certain mix of RPCs.

An extensive analysis of several RPC protocols and implementations developed before 1984 was presented in [53], and the first measurements of a real implementation were given in [13]. The measurements were made for remote calls between two Dorados connected by an Ethernet with a raw data rate of 2.94 megabits per second, lightly loaded. Table 1 shows the time spent on average, and the fraction of that time that was spent in transmission and in handling of local calls in the process. These measurements are made without using any encryption facility, from the moment an RPC is requested to the moment at which the result is at the final destination.

In the Modula-2 extension [1] to support RPC, it was also observed that RPC timings are directly affected by the number and size of parameters, as shown in Table 2. The performance was also much better for out arguments than for in-out arguments, and for idempotent procedures than for non-idempotent ones.

In distributed (RPC- or message-passing-based) operating systems, a large percentage (94% to 99%) of the communication is only cross-domain, not cross-machine [9]. Effort has been made to optimize the cross-domain implementation of RPCs to bring considerable performance improvements. Lightweight RPC [9] (LRPC) consisted of a design and implementation effort to optimize the cross-domain case for the TAOS operating system of the DEC SRC Firefly multiprocessor workstation. The improvements are the result of three simple aspects:

• Simple control transfer: the caller's thread executes the requested procedure in the callee's domain.

• Simple data transfer: a shared memory region exists between the caller and the callee, and the arguments are copied only once in all situations.

• Simple stubs: these are a consequence of the simple model of control and data transfer between threads.

The system adopts a concurrency-based design that avoids shared-data-structure bottlenecks. It also caches domain contexts on idle processors, avoiding context switches. A performance of 157 μs is achieved for the null cross-domain call, and of 227 μs for a call with a 200-byte in-out argument.

Procedure          Average Time    Trans. Time    Local Call Time
no args/results    1097            131            9
10 args/results    1278            239            17
240-word array     2926            1219           98

Table 1: RPC Performance in the Cedar Project (in μs).

Number of bytes    Time for Local Calls    Time for RPCs
1                  1.05                    3.51
25                 1.79                    4.17
1016               2.99                    8.22
3000               6.66                    19.19

Table 2: RPC Performance in Extended Modula-2 (in ms).
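The gain from simple control transfer can be imitated in a toy comparison: a direct call in the caller's own thread versus a hand-off to a separate server thread, which adds queuing and scheduling to the call path. This is a sketch of the idea, not the LRPC implementation:

```python
import queue
import threading

def procedure(x):
    # The requested procedure body (trivial on purpose).
    return x + 1

# LRPC-style control transfer: the caller's own thread executes the
# procedure in the callee's domain; nothing is scheduled on the way.
def lrpc_call(x):
    return procedure(x)

# Conventional cross-domain RPC: hand the request to a server thread
# and block until it is scheduled, runs, and replies.
requests, replies = queue.Queue(), queue.Queue()

def server_loop():
    while (x := requests.get()) is not None:
        replies.put(procedure(x))

server = threading.Thread(target=server_loop)
server.start()

def rpc_call(x):
    requests.put(x)
    return replies.get()

direct = lrpc_call(41)
queued = rpc_call(41)
requests.put(None)  # shut the toy server down
server.join()
```

Both paths compute the same result; the second pays two queue operations and a thread wakeup per call, which is the overhead LRPC removes.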

Another approach used to improve performance was to carefully analyze the steps taken by an RPC and to precisely identify the bottlenecks [63]. An optimization of the RPC mechanism for the Firefly multiprocessor computer is described in [63]. The architecture is based on multiprocessors in the same machine sharing only one I/O bus, connected to CPU 0. This feature could affect RPC latency on other processors if a high load on this particular CPU occurs. Great performance increases are achieved through the identification of the major bottlenecks in the mechanism and the optimization of the process at those particular points. During marshaling, the stubs use customized code that directly copies arguments to and from the call or result packet, instead of calling library routines or an interpreter. The RPC and transport mechanisms communicate through a buffer pool that resides in a shared memory accessible to all user address spaces and to the Nub, and which is also permanently mapped into the I/O space. In this way, all these components can read and


write packet buffers in the same address space with no need for copying or extra address-mapping operations. Nevertheless, no protection has been added, and this strategy can lead to leaks into the system. Efficient buffer management is done by making the server stub reuse the call packet for the results, and by having the receiver interrupt handler immediately replace the buffer used by an arriving call or result packet, checking for additional packets to process before termination. A speedup by a factor of almost three was obtained by rewriting a particular path in assembly language. The demultiplexing of RPC packets is done in the interrupt routine, instead of the usual approach of awakening an operating-system thread to demultiplex the incoming packet. Firefly RPC offers the choice of different transport mechanisms at bind time. The integration of Firefly RPC into a heterogeneous environment, where RPC can be especially beneficial, is not discussed. Heterogeneity is also a potential source of performance optimization problems, because differences in machine architecture and information representation can make time-consuming conversion operations unavoidable. Portions of the software are implemented in assembly language, which may cause portability and maintenance problems. Possible additional hardware and software speedup mechanisms are discussed, offering estimated speedups of 4% to 50%.
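The buffer-reuse discipline can be sketched as a toy packet pool in which the server stub writes its results back into the buffer that carried the call, so no second buffer is allocated and nothing is copied between components (illustrative only):

```python
class PacketPool:
    """Toy shared buffer pool: buffers are acquired for a call, reused
    in place for the results, and returned to the pool afterwards."""
    def __init__(self, size, buf_len):
        self.free = [bytearray(buf_len) for _ in range(size)]

    def acquire(self):
        return self.free.pop()

    def release(self, buf):
        self.free.append(buf)

pool = PacketPool(size=4, buf_len=64)
packet = pool.acquire()
packet[:4] = b"call"      # caller marshals arguments in place
packet[:6] = b"result"    # server stub reuses the same buffer for results
pool.release(packet)      # buffer immediately available for the next call
```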

An effort was also made to move some of the communication operations in distributed systems to the hardware level. Hardware support for message passing at the level of the operating system primitives was proposed in [60]. A message coprocessor is introduced that interacts with the host and network interface through shared memory, and provides concurrent message processing while the host executes other functions. In control of the network interface, the coprocessor takes care of the communication process. This involves checking the validity of the call, addressing and manipulating control blocks, kernel buffering, short-term scheduling decisions, sending network packets when necessary, and so on.

10 Conclusion

Distributed systems provide for the interconnection of numerous and diverse processing units. Software applications must exploit such broad functionality, the inherent parallelism available, and the reliability and fault tolerance mechanisms that can be provided. Several language-support paradigms have been proposed for the modeling of applications in distributed environments. An inappropriate choice of language abstractions and data type constructors may only add to the ever-increasing software development crisis: it may hinder software maintenance, portability, and reusability. Today's conception of a distributed system is far behind the desirable model of computation. It is mandatory that new software paradigms be as independent as possible of the intrinsic characteristics of a specific architecture and, consequently, be more abstract.

In this work, we studied RPC, a new programming language paradigm. A survey of the major RPC-based systems and a discussion of the major techniques used to support the concept were presented. The importance of RPC lies in its conservative view toward already developed software systems. The goal of RPC development is to provide easy interconnection of new and old software systems, regardless of their language, operating system, or machine dependency. To achieve this goal, RPC is founded upon a strong language constructor, the procedure call mechanism, which exists in almost any modern language.

This work may inspire important future work. Syntax and semantic transparency are ever-challenging research topics, and the suitability of the RPC paradigm for new distributed systems applications, such as multimedia, is also an important topic to be studied.

Acknowledgements

I am thankful to Shaula Yemini for suggesting the writing of this survey, and for giving significant comments on its overall organization. I am also thankful to Prof. Yechiam Yemini, my advisor, for carefully reviewing earlier drafts of this work. Finally, I would like to thank the


referees for their careful reading of this paper and for their valuable comments.

References

[1] Guy T. Almes. The impact of language and system on remote procedure call design. In The 6th International Conference on Distributed Computing Systems, pages 414-421, Cambridge, Massachusetts, May 1986. IEEE Computer Society Press.

[2] Guy T. Almes, Andrew P. Black, Edward D. Lazowska, and Jerre D. Noe. The Eden system: A technical review. IEEE Transactions on Software Engineering, 11(1):43-58, January 1985.

[3] A. L. Ananda, B. H. Tay, and E. K. Koh. ASTRA - an asynchronous remote procedure call facility. In The 11th International Conference on Distributed Computing Systems, pages 172-179, Arlington, Texas, May 1991. IEEE Computer Society Press.

[4] A. L. Ananda, B. H. Tay, and E. K. Koh. A survey of asynchronous remote procedure calls. ACM Operating Systems Review, 26(2):92-109, April 1992.

[5] Gregory R. Andrews. Paradigms for process interaction in distributed programs. ACM Computing Surveys, 23(1):49-90, March 1991.

[6] Gregory R. Andrews and Ronald A. Olsson. The evolution of the SR language. Distributed Computing, 1(3):133-149, 1986.

[7] Joshua Auerbach. Concert/C specification - under development. Technical report, IBM T. J. Watson Research Center, April 1992.

[8] Henry E. Bal, Jennifer G. Steiner, and Andrew S. Tanenbaum. Programming languages for distributed computing systems. ACM Computing Surveys, 21(3):261-322, September 1989.

[9] Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. Lightweight remote procedure call. ACM Transactions on Computer Systems, 8(1):37-55, February 1990.

[10] Brian N. Bershad, Dennis T. Ching, Edward D. Lazowska, Jan Sanislo, and Michael Schwartz. A remote procedure call facility for interconnecting heterogeneous computer systems. IEEE Transactions on Software Engineering, 13(8):880-894, August 1987.

[11] Andrew D. Birrell. Secure communication using remote procedure calls. ACM Transactions on Computer Systems, 3(1):1-14, February 1985.

[12] Andrew D. Birrell, R. Levin, R. M. Needham, and M. D. Schroeder. Grapevine: An exercise in distributed computing. Communications of the ACM, 25(4):260-274, April 1982.

[13] Andrew D. Birrell and Bruce Jay Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems, 2(1):39-59, February 1984.

[14] John R. Callahan and James M. Purtilo. A packaging system for heterogeneous execution environments. IEEE Transactions on Software Engineering, 17(6):626-635, June 1991.

[15] B. E. Carpenter and R. Cailliau. Experience with remote procedure calls in a real-time control system. Software Practice and Experience, 14(9):901-907, September 1984.

[16] Roger S. Chin and Samuel T. Chanson. Distributed object-based programming systems. ACM Computing Surveys, 23(1):91-124, March 1991.

[17] David L. Cohrs, Barton P. Miller, and Lisa A. Call. Distributed upcalls: A mechanism for layering asynchronous abstractions. In The 8th International Conference on Distributed Computing Systems, pages 55-62, San Jose, California, June 1988.

[18] E. C. Cooper. Replicated procedure call. In 3rd ACM Symposium on Principles of Distributed Computing, pages 220-232, Vancouver, BC, 1984.

[19] Eric C. Cooper. Programming language support for multicast communication in distributed systems. In The 10th International Conference on Distributed Computing Systems, pages 450-457, Paris, France, June 1990. IEEE Computer Society Press.

[20] Department of Defense, Ada Joint Program Office, Washington, DC. Reference Manual for the Ada Programming Language, July 1982 edition.

[21] L. P. Deutsch and E. A. Taft. Requirements for an exceptional programming environment. Technical Report CSL-80-10, Xerox Palo Alto Research Center, Palo Alto, California, 1980.

[22] Terence H. Dineen, Paul J. Leach, Nathaniel W. Mishkin, Joseph N. Pato, and Geoffrey L. Wyant. The network computing architecture and system: An environment for developing distributed applications. In USENIX Conference, pages 385-398, Phoenix, Arizona, June 1987.

[23] Jeffrey L. Eppinger, Naveen Saxena, and Alfred Z. Spector. Transactional RPC. In Proceedings of Silicon Valley Networking Conference, April 1991.

[24] Joseph R. Falcone. A programmable interface language for heterogeneous distributed systems. ACM Transactions on Computer Systems, 5(4):330-351, November 1987.

[25] Nissim Francez and Shaula A. Yemini. Symmetric intertask communication. ACM Transactions on Programming Languages and Systems, 7(4):622-636, October 1985.

[26] Neil Gammage and Liam Casey. XMS: A rendezvous-based distributed system software architecture. IEEE Software, 2(3):9-19, May 1985.

[27] Narain H. Gehani and William D. Roome. Rendezvous facilities: Concurrent C and the Ada language. IEEE Transactions on Software Engineering, 14(11):1546-1553, November 1988.

[28] Phillip B. Gibbons. A stub generator for multi-language RPC in heterogeneous environments. IEEE Transactions on Software Engineering, pages 77-87, 1987.

[29] David K. Gifford and Nathan Glasser. Remote pipes and procedures for efficient distributed communication. ACM Transactions on Computer Systems, 6(3):258-283, August 1988.

[30] R. Gimson. Call buffering service. Technical Report 19, Programming Research Group, Oxford University, Oxford, England, 1985.

[31] K. G. Hamilton. A Remote Procedure Call System. PhD thesis, Computing Lab., University of Cambridge, Cambridge, England, 1984.

[32] Per Brinch Hansen. Distributed processes: A concurrent programming concept. Communications of the ACM, 21(11):934-941, November 1978.

[33] Roger Hayes and Richard D. Schlichting. Facilitating mixed language programming in distributed systems. IEEE Transactions on Software Engineering, 13(12):1254-1264, December 1987.

[34] M. Herlihy and B. Liskov. A value transmission method for abstract data types. ACM Transactions on Programming Languages and Systems, 4(4):527-551, October 1982.

[35] C. A. R. Hoare. Communicating sequential processes. Communications of the ACM, 21(8):666-677, August 1978.

[36] Apollo Computer Inc. Apollo NCA and Sun ONC: A comparison. In 1987 Summer USENIX Conference, Phoenix, Arizona, June 1987.

[37] International Standard for Information Processing Systems. Open Systems Interconnection - Specification of Abstract Syntax Notation One (ASN.1), 1987. ISO Final 8824.


[38] Michael B. Jones and Richard F. Rashid. Mach and Matchmaker: Kernel and language support for object-oriented distributed systems. In OOPSLA '86 Proceedings, pages 67-77, Portland, Oregon, November 1986. SIGPLAN Notices.

[39] Michael B. Jones, Richard F. Rashid, and Mary R. Thompson. Matchmaker: An interface specification language for distributed processing. In 12th ACM Symposium on Principles of Programming Languages, pages 225-235, New Orleans, LA, January 1985.

[40] B. Lampson. Remote procedure calls. Lecture Notes in Computer Science, 105:365-370, 1981.

[41] P. J. Leach, P. H. Levine, B. P. Douros, J. A. Hamilton, D. L. Nelson, and B. L. Stumpf. The architecture of an integrated local area network. IEEE Journal on Selected Areas in Communications, pages 842-857, 1983.

[42] Kwei-Jay Lin and John D. Gannon. Atomic remote procedure call. IEEE Transactions on Software Engineering, 11(10):1126-1135, October 1985.

[43] Barbara Liskov and Liuba Shrira. Promises: Linguistic support for efficient asynchronous procedure calls in distributed systems. In SIGPLAN '88 Conference on Programming Language Design and Implementation, pages 260-267, Atlanta, Georgia, June 1988.

[44] Barbara Liskov. Distributed programming in Argus. Communications of the ACM, 31(3):300-312, March 1988.

[45] Barbara Liskov, Toby Bloom, David Gifford, Robert Scheifler, and William Weihl. Communication in the Mercury system. In 21st Annual Hawaii International Conference on System Sciences, pages 178-187, January 1988.

[46] Barbara Liskov, Maurice Herlihy, and Lucy Gilbert. Limitations of synchronous communication with static process structure in languages for distributed computing. In 13th ACM Symposium on Principles of Programming Languages, pages 150-159, St. Petersburg, Florida, 1986.

[47] Barbara Liskov and R. Scheifler. Guardians and actions: Linguistic support for robust, distributed programs. ACM Transactions on Programming Languages and Systems, 5:381-404, 1983.

[48] Barbara Liskov, Liuba Shrira, and John Wroclawski. Efficient at-most-once messages based on synchronized clocks. ACM Transactions on Computer Systems, 9(2):125-142, May 1991.

[49] Klaus-Peter Lohr, Joachim Muller, and Lutz Nentwig. Support for distributed applications programming in heterogeneous computer networks. In The 8th International Conference on Distributed Computing Systems, pages 63-71, San Jose, California, June 1988.

[50] Dan C. Marinescu. Modeling of programs with remote procedures. In R. Popescu-Zeletin, G. Le Lann, and K. H. Kim, editors, The 7th International Conference on Distributed Computing Systems, pages 98-104, Berlin, West Germany, September 1987.

[51] J. G. Mitchell, W. Maybury, and R. Sweet. Mesa language manual (version 5.0). Technical Report CSL-79-3, Xerox Palo Alto Research Center, Palo Alto, California, 1979.

[52] National Bureau of Standards (NBS), Department of Commerce, Washington, D.C. Data Encryption Standard, 1977.

[53] B. J. Nelson. Remote Procedure Call. PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, 1981.

[54] Gerald W. Neufeld and Yueli Yang. The design and implementation of an ASN.1-C compiler. IEEE Transactions on Software Engineering, 16(10):1209-1220, October 1990.

[55] David Notkin, Andrew P. Black, Edward D. Lazowska, Henry M. Levy, Jan Sanislo, and John Zahorjan. Interconnecting heterogeneous computer systems. Communications of the ACM, 31(3):258-273, March 1988.

[56] David Notkin, Norman Hutchinson, Jan Sanislo, and Michael Schwartz. Heterogeneous computing environments: Report on the ACM SIGOPS workshop on accommodating heterogeneity. Communications of the ACM, 30(2):132-140, February 1987.

[57] Dave Otway and Ed Oskiewicz. REX: A remote execution protocol for object-oriented distributed applications. In R. Popescu-Zeletin, G. Le Lann, and K. H. Kim, editors, The 7th International Conference on Distributed Computing Systems, pages 113-118, Berlin, West Germany, September 1987.

[58] Fabio Panzieri and Santosh K. Shrivastava. Rajdoot: A remote procedure call mechanism supporting orphan detection and killing. IEEE Transactions on Software Engineering, 14(1):30-37, January 1988.

[59] C. Pu, D. Florissi, P. G. Soares, P. S. Yu, and K. Wu. Performance comparison of sender-active and receiver-active mutual data serving. In HPDC-1, Symposium on High Performance Distributed Computing, Syracuse, New York, September 1992.

[60] Umakishore Ramachandran, Marvin Solomon, and Mary K. Vernon. Hardware support for interprocess communication. IEEE Transactions on Parallel and Distributed Systems, 1(3):318-329, July 1990.

[61] M. Satyanarayanan. Integrating security in a large distributed system. ACM Transactions on Computer Systems, 7(3):247-280, August 1989.

[62] Michael D. Schroeder, Andrew D. Birrell, and Roger M. Needham. Experience with Grapevine: The growth of a distributed system. ACM Transactions on Computer Systems, 2(1):3-23, February 1984.

[63] Michael D. Schroeder and Michael Burrows. Performance of Firefly RPC. ACM Transactions on Computer Systems, 8(1):1-17, February 1990.

[64] M. L. Scott. Language support for loosely coupled distributed programs. IEEE Transactions on Software Engineering, 13(1):88-103, January 1987.

[65] Michael Lee Scott. Messages vs. remote procedures is a false dichotomy. SIGPLAN Notices, 18(5):57-62, May 1983.

[66] Robert Seliger. Extending C++ to support remote procedure call, concurrency, exception handling, and garbage collection. In Proceedings of USENIX C++ Conference, 1990.

[67] S. K. Shrivastava and F. Panzieri. The design of a reliable remote procedure call mechanism. IEEE Transactions on Computers, 31:692-697, July 1982.

[68] Ajit Singh, Jonathan Schaeffer, and Mark Green. A template-based approach to the generation of distributed applications using a network of workstations. IEEE Transactions on Parallel and Distributed Systems, 2(1):52-67, January 1991.

[69] Ian Sommerville. Software Engineering. Addison-Wesley Publishing Company, third edition, 1989.

[70] Robert J. Souza and Steven P. Miller. UNIX and remote procedure calls: A peaceful coexistence? In 6th International Conference on Distributed Computing Systems, pages 268-277, Cambridge, Massachusetts, May 1986.

[71] James W. Stamos and David K. Gifford. Remote evaluation. ACM Transactions on Programming Languages and Systems, 12(4):537-565, October 1990.

[72] W. Richard Stevens. UNIX Network Programming. Prentice Hall, Englewood Cliffs, New Jersey, 1990.

[73] Michael W. Strevell and Harvey G. Cragon. High-speed transformation of primitive data types in a heterogeneous distributed computer system. In The 8th International Conference on Distributed Computing Systems, pages 41-46, San Jose, California, June 1988.

[74] Robert Strom and Nagui Halim. A new programming methodology for long-lived software systems. IBM Journal of Research and Development, 28(1):52-59, January 1984.

[75] Robert E. Strom, David F. Bacon, Arthur P. Goldberg, Andy Lowry, Daniel M. Yellin, and Shaula Alexander Yemini. Hermes: A Language for Distributed Computing. Prentice Hall, 1991.

[76] Robert E. Strom and Shaula Yemini. NIL: An integrated language and system for distributed programming. In Proceedings of the SIGPLAN Symposium on Programming Language Issues in Software Systems, pages 73-82. ACM, June 1983.

[77] Mark Sullivan and David Anderson. Marionette: A system for parallel distributed programming using a master/slave model. In The 9th International Conference on Distributed Computing Systems, pages 181-188, Southern California, 1989. IEEE Computer Society Press.

[78] Sun Microsystems, Network Information Center, SRI International. XDR: External Data Representation Standard (RFC 1014), June 1987. Internet Network Working Group Requests for Comments, No. 1014.

[79] Sun Microsystems, Network Information Center, SRI International. RPC: Remote Procedure Call, Protocol Specification, Version 2 (RFC 1057), June 1988. Internet Network Working Group Requests for Comments, No. 1057.

[80] L. Svobodova. Resilient distributed computing. IEEE Transactions on Software Engineering, 10:257-268, May 1984.

[81] D. C. Swinehart, P. T. Zellweger, and R. B. Hagmann. The structure of Cedar. SIGPLAN Notices, 20(7):230-244, July 1985.

[82] Andrew S. Tanenbaum and Robbert van Renesse. A critique of the remote procedure call paradigm. In R. Speth, editor, Proceedings of the EUTECO 88 Conference, pages 775-783, Vienna, Austria, April 1988. Elsevier Science Publishers B.V. (North-Holland).

[83] B. H. Tay and A. L. Ananda. A survey of remote procedure calls. ACM Operating Systems Review, 24(3):68-79, July 1990.

[84] Marvin M. Theimer and Barry Hayes. Heterogeneous process migration by recompilation. In 11th International Conference on Distributed Computing Systems, pages 18-25, Arlington, Texas, May 1991. IEEE Computer Society Press.

[85] Robbert van Renesse and Jane Hall. Connecting RPC-based distributed systems using wide-area networks. In R. Popescu-Zeletin, G. Le Lann, and K. H. Kim, editors, The 7th International Conference on Distributed Computing Systems, pages 28-34, Berlin, West Germany, September 1987.

[86] Steve Wilbur and Ben Bacarisse. Building distributed systems with remote procedure call. Software Engineering Journal, pages 148-159, September 1987.

[87] Xerox Corporation, Xerox OPD. Courier: The Remote Procedure Call Protocol, December 1981. Xerox System Integration Standard 038112.

[88] M. Yap, P. Jalote, and S. K. Tripathi. Fault tolerant remote procedure call. In The 8th International Conference on Distributed Computing Systems, pages 48-54, San Jose, California, June 1988. IEEE Computer Society Press.

[89] Shaula Yemini. On the suitability of Ada multitasking for expressing parallel algorithms. In Proceedings of the AdaTEC Conference on Ada, pages 91-97, Arlington, VA, October 1982. ACM.

[90] Shaula A. Yemini, German S. Goldszmidt, Alexander D. Stoyenko, and Yi-Hsiu Wei. Concert: A high level language approach to heterogeneous distributed systems. In 9th International Conference on Distributed Computing Systems, pages 162-171, Newport Beach, CA, June 1989.

[91] Wanlei Zhou. PM: A system for prototyping and monitoring remote procedure call programs. ACM Software Engineering Notes, 15(1):59-63, January 1990.

Appendix

A Operational Semantics of Procedure Call Binding

Programming languages, together with the operating system, provide the programmer with an abstract view of the underlying physical architecture. From the user's point of view, a program manipulates entities such as variables, statements, abstract data types, and procedures, which are identified through their names. To perform the appropriate mapping between these entities and the physical machine, language designers usually maintain a list of attributes (properties) associated with each entity. For instance, an entity of type procedure must have attributes that represent the names and types of its formal parameters, its result's name and type (if any), and its parameter-passing conventions. It also maintains an instruction pointer (ip) and an environment pointer (ep, or access environment). The ip allows for the localization of the procedure's code inside the current address space. The ep provides access to the context for resolving free variables (global variables) in the procedure body. In this way, when the programmer refers to a specific entity through its name, the system can access the list of attributes and perform the correct mapping. The attributes must be specified before an entity is processed at run time. A binding is the specification of the precise value of an attribute of a particular entity. This binding information is usually kept in a data structure called the descriptor of the entity. Some attributes can be bound at compilation time, others at run time (so that they can change dynamically), and some are defined by the language designers (the user cannot change them). Programming languages differ (1) in the kinds of entities that they support, (2) in the kinds of attributes associated with each entity type, and (3) in the binding time for each attribute.

In conventional procedure call mechanisms, a binding usually consists of finding the address of the corresponding piece of code (ip) by using the name of the procedure. The context in which free variables are to be resolved is the context of the caller procedure at the moment of the call, accessed through the ep. This binding can be done at compilation time, if the body of the function is defined in the same module as the call; at link-editing time, if the procedure's body is in a module different from the caller's module; or at run time, if the procedure's name is stored in variables (pointers in C) whose values can only be determined dynamically. In the last case, the compiler produces a mapping between a procedure's name and its address that can be accessed indirectly during run time.

B Operational Semantics of RPC Binding

An RPC is an extension of the conventional procedure call mechanism to which a little binding information must be added. In the conventional situation, it is assumed that the caller and callee belong to the same program unit, that is, the same thread of execution, even though the language may support several threads of execution in the same operating-system address space or in distributed programs, as shown in Figure 1. In other words, there are no interthread calls, only intrathread calls, where the binding limits its search space to the program unit in which the procedure is called. The same happens with the execution environment when the called procedure is being identified. The callee also inherits the caller's context.

In RPC-based distributed applications, the same-thread assumption is relaxed and interthread calls are allowed, as shown in Figure 2. The binding information must then contain additional attributes that allow the identification of the particular thread in which the callee procedure is to be executed (the callee thread). The callee thread now has its own context, perhaps completely separate from the caller's context, and the threads exchange information only through the parameters passed. Some additional semantics is needed, though, to define the behavior of parameter passing by reference and of pointer-based memory access.

C Brief Historical Survey

The idea of extending the concept of procedure calls to distributed environments was first introduced in Distributed Processes [32]. In Distributed Processes, a distributed application consists of a fixed number of sequential processes that can be executed simultaneously. A process has variables that are not directly accessible to other processes, an initial statement, and some common procedures. A process can call local common procedures or the common procedures of other processes (external-request RPC). An external request call has the syntax:

call <process_name>.<proc_name>(<expressions>, <variables>)

where <process_name> identifies the process in which the common procedure <proc_name> is to be executed. Before the procedure is executed, the <expressions> values of the call are assigned to the input parameters. After execution, the output parameters are assigned to the <variables> of the call. Distributed Processes also provided guarded commands and guarded regions to allow nondeterministic execution of statements. These allow a process to make arbitrary choices among several statements based on the current state of its variables.

One of the major pioneers of RPC-based distributed environments was the Cedar project [21, 81], developed around 1984. Its RPC facility was mostly based on the concepts described in [53]. The interface description language used was Mesa [51] with no extensions, and the user could import and export only interfaces, not procedures. The Grapevine [12, 62] distributed database was used to store the exported information. The user was responsible for exporting and importing the interfaces before making the corresponding RPC. There was flexibility in the binding time, which could be at compilation time or at run time. Only a fixed, restricted set of data types, not including pointers, could be passed as arguments or returned in RPCs. The environment assumed was completely homogeneous. The callee processes were started and terminated by the runtime support, acting as permanent servers of their interfaces, executing no other computation and leaving no control to the user application to manage the service of the calls. An independent transport protocol was developed from scratch, concerned mainly with efficient establishment and cancellation of connections and with their lightweight management. For system malfunctions, the Cedar mechanism did not use time-outs to prevent indefinite waiting by a client, but made use of special probe messages to detect whether the called server was still running.

An extension of P+, a language for applications in control systems, to support RPC was proposed in [15]. The control system on which P+ was implemented already had a remote execution protocol: any programmer could request the remote execution of any part of the program, written in a specialized syntax. This part was transmitted in ASCII code to an interpreter (server) on the remote machine. The server executed the transmitted source code interpretively and returned the results to the calling program, using the specialized syntax. To extend P+, a routine named REM was introduced. REM accepted a runtime call descriptor, which includes the remote procedure name, the remote computer name, and, for each parameter, a tag that indicates the parameter type and mode (read-only, read-write, write-only), the size of an array, and the address of the parameter. REM encoded the descriptor in a datagram format written in the specialized syntax (containing the necessary interpretive code in ASCII) along with the binary values of the parameters. Making use of the remote protocol already present in the system, REM sent the datagram to the server and waited for the results. The server executed the received code interpretively; it consisted of a single procedure call, which was unaffected by having been issued through REM.


ARGUS [47] is an integrated programming language and system for distributed program development. The fundamental concept in the language is atomicity. The logical units of distribution are guardians, and an atomic activity is an action. An action can be completed either by committing, when all involved objects take on their new states, or by aborting, in which case the effect is as if the action had never started. A distributed application consists of a set of guardians that control access to a group of resources, available to users only through functions named handlers (local and possibly remote functions). A guardian contains processes that execute background tasks and handlers. Each call to a handler triggers the spawning of a separate process inside that guardian, which can thus execute many calls concurrently. A guardian can be created dynamically, and the user need only specify the node on which it must be created. Guardian and handler names are first-class elements: they can be exchanged through handler calls, which avoids the explicit importing and exporting of functions. Handler calls are location independent, and continue to work even if the corresponding guardian changes its location. They have at-most-once semantics (defined in Section 5.1) and are based on the termination model of exceptions, terminating either normally or in one of several user-defined exception conditions.

In the Eden system [2], a distributed application is a collection of Eden objects, or Ejects. Each Eject has a concrete Edentype, the piece of code executed by the particular Eject, and an abstract Edentype, which describes its behavior. Several concrete Edentypes can implement the same abstract Edentype. Ejects communicate via invocation procedures from one Eject to another. To issue an invocation, an Eject must have a capability for the callee Eject; each Eject has one capability to which invocations can be addressed. Invocation procedures have CallersRights as a parameter, allowing the callee Eject to check the caller's rights. The Eden Programming Language (EPL) provides facilities to control concurrency within Ejects. If the code fragments appear in different Ejects, it is the job of the EPL to ensure parameter packing and stub generation. On the callee side, the translator builds a version of CallInvocationProcedure that is tailored to the invocation procedure it defines. On the caller side, the usual stub is built. At the EPL level, the language provides ReceiveOperation, ReceiveSpecific, and ReceiveAny, which are used directly by the programmer. ReceiveOperation takes a set of operations and returns only when one of them is called; ReceiveSpecific and ReceiveAny are provided for the cases where singleton or universal sets are wanted.

Modula-2 was also extended to support RPC [1], on top of the V System implementation for SUN workstations. The goal was to allow entire external modules, as defined in Modula-2, to be remote; that is, a callee would be defined by a definition module and implemented (specified) by an implementation module. A major effort was made to produce as few dissimilarities as possible between a remote external module and a local module. Only a few attributes and constraints were added to the definition module; a stub generator was implemented that, for a particular definition module, generated the callee and caller stubs as well as headers; and a small runtime module was developed that supported the paradigm during execution time. The binding was done automatically the first time a remote function was called, and this binding lasted for the entire execution time. Null RPCs, with no arguments, executed in around 3.32 milliseconds.

SR [6] is a distributed programming language concerned mainly with expressiveness (making it possible to solve relevant problems in a straightforward way), efficiency in compiling and executing, and simplicity in understanding and using the language. The problem domain was distributed programming. SR provides primitives which, when combined in a variety of ways, support semaphores, rendezvous, asynchronous message passing, and local and remote procedure calls. SR's main component is the resource, which has a specification part and an implementation part. The specification part defines the operations provided by the resource, which is implemented by possibly several processes running on the same processor. All resources and processes are created dynamically and explicitly by executable statements. To call operations, SR provides mainly the call and send statements. The call statement has the semantics of a procedure call, either local or remote, while send has the semantics of semi-asynchronous message passing: a send invocation returns after the call message is delivered to the remote machine. On the callee side, SR provides the proc and in statements to determine when operations execute. An operation declared as a proc triggers a separate process to execute each call received. The in statement provides for the execution of a selected operation using the value of its associated Boolean expression. A receive statement waits for a call of a specified operation and then saves the arguments of the call, but the call is not executed. To allow the early termination of an operation, the return statement terminates the smallest enclosing in or proc statement. Similarly, the reply statement terminates the respective statements without terminating the process; it provides for conversations, in which caller and callee continuously exchange information.

LYNX [64] supports RPC calls on top of links, which are built-in data types. A link is a virtual circuit abstraction that represents a resource. Callee processes can specify the processes that can be bound to the other end of the link. In this way, LYNX provides some symmetry between caller and callee, because both can choose with whom to communicate. Processes can exchange link bindings, and once a binding is exchanged, the previous owner loses access to the link. Procedures are called through connect statements that use a specified link to send a call through it. This operation blocks until the results are returned, in a way similar to procedure calls. The process bound to the other end of the link accepts the call, executes it, and replies with the results, unblocking the caller process. A process can contain a single thread of control that receives requests explicitly, or a process can receive them implicitly, where each appropriate request creates its own thread automatically. A link end can be bound to more than one operation, which leads to a type-security check on a message-by-message basis.

Around 1987, Apollo presented the Network Computing Architecture (NCA) [22], an object-oriented environment framework for the development of distributed applications, together with the Network Computing System (NCS), a portable NCA implementation. Heterogeneity was then included in RPC designers' concerns, but mainly at the level of machine data representation. To interconnect heterogeneous distributed systems, NCA defined an RPC facility called NCA/RPC, a network interface definition language called NIDL, and a network data representation called NDR. NCA/RPC is a transport-independent RPC system built on multiple threads of execution in a single address space. A replicated object location broker is used for location purposes. The concept of object-oriented programming was extended to abstract the location (where execution takes place) from the user, making objects the unit of distribution, reconfiguration, and reliability. The unit to be exported or imported was an object, an object type, or an interface. Location transparency was enforced by the flexibility of importing objects or functionality independently of their location. The replication aspect of the system as a whole brought failure transparency, because several objects (processes) could export the same functions. One important extension was the flexibility of having user-designed procedures to marshal and demarshal complex data structures, thereby widening the usability of the paradigm. It also provided the possibility of attaching binding procedures to data types defined in the interface, allowing the development of even more customized code. An extensive analysis of the major advantages of Apollo NCA over the SUN/ONC (Open Network Computing) system is presented in [36].

The SUN RPC system [79] consists of a stub generator called rpcgen, the eXternal Data Representation (XDR) [78], and a runtime library. From an RPC specification file, rpcgen generates the caller and callee stubs, and a header file that is included by both. Each callee's host must have a daemon that works as a port mapper, which can be contacted to locate a specific program and version. The caller has to specify the host's name and the transport protocol to acquire an RPC handle. The RPC handle is used as the second argument of an RPC call. SUN supports either TCP or UDP (explained in Section 7.1). An RPC call returns a NULL value in case any error occurs. An effort is made to provide at-most-once semantics. Each RPC handle has a unique transaction identifier (id). Each distinct request using the RPC handle has a different value, obtained by slightly modifying the id. On the caller side, the protocol tests this id for a match before returning an RPC value. This matching guarantees that the response is for the correct request.

If the protocol chosen is UDP, the UDP callee has the option of recording all of the callers' requests that it receives. It maintains a cache of the result values obtained, indexed by transaction id, program number, version number, procedure number, and caller's UDP address. Before executing a request, the callee checks this cache; if the call is a duplicate, the result value saved in the cache is simply returned, on the assumption that the previous response was lost or damaged.

For security, SUN provides three forms of authentication. Null authentication is the default, where no authentication occurs. Unix authentication transmits with every RPC request a timestamp, the caller's host name, the caller's user id, the caller's group id, and a list of all other group ids to which the caller belongs. The callee decides, based on this information, whether to grant the caller's request. DES authentication uses the SUN RPC protocol.

HCS [55, 10] was designed mainly with heterogeneity in mind, seeking to accommodate old and new distributed systems in a single environment. Its major component was the HCS Remote Procedure Call (HRPC) facility, designed on a plug-replacement basis, where the basic components could be independently built and plugged together. The goal of HRPC was to identify (factor out) the major components of an RPC system and to define their interfaces precisely. In this way, components could be replaced easily, as long as the new component supported the interface and functionality of the former one and hid the implementation details. The interconnection of closed systems (systems in which changes were neither possible nor wanted) was accomplished by simply building components that emulated the behavior of those systems. This made possible the interconnection of different RPC mechanisms that were heterogeneous on several levels, but mainly on the system level. There was considerable and broad reusability of components already built. HRPC divided an RPC system into five major components: the compile-time support (programming language, IDL, and stub generator), the binding protocol, the data representation, the transport protocol, and the control protocol (basically a runtime support). Localized translation was another basic principle adopted by HRPC: if there exists an expert to do the job, let it do it, instead of concentrating a narrower knowledge in a more general-purpose component. For instance, one way of implementing a name server to deal with heterogeneous databases is to build a centralized database capable of doing all the different searches and conversions between systems; HRPC's name service (NSM) is instead a simple bridge between the several homogeneous name services that are connected to the RPC mechanism. Lazy decision making, or procrastination, was also adopted, because in dealing with heterogeneous systems an as-late-as-possible commitment is the most flexible one.

Marionette [77] is a software package based on the master/slave model for distributed environments. A distributed application in this model is one master process and many slave processes, all of them sequential. The interaction is done either through shared data structures created and updated by the master process, or through operation invocation, which can be a worker operation or a context operation. Context operations modify the slave processes' state and are executed by all slaves. Worker operations are each executed in a single slave, are not specified by the master, and are invoked through asynchronous RPC; later, the master explicitly accepts the operation's result. To deal with heterogeneity at the data representation level, all data exchanged between machines is translated into SUN External Data Representation (XDR) [78]. The conversion is done through user-defined routines. The import and export of functions is avoided by having all programs specify linkage arrays. One linkage array contains descriptors for worker operation functions, another contains descriptors for context operation functions, and another contains descriptors for types of shared data structures. An index into one of the arrays works as a function identifier.

C++ was also extended to support RPC, concurrency, exception handling, and garbage collection, forming a superset of C++ named Extended-C++ [66]. A translator was designed to compile Extended-C++ code into C++ code that, along with a runtime library, supported the extensions. To support RPC, attributes were added to the syntax that allowed classes to be declared remotable: their member functions can be called remotely. A class was instantiated only for executing the callee side of RPCs. In this situation, only the class declaration was required in the caller's code, avoiding linking the code with the definitions of the functions that can be made remote. There is an explicit distinction between a pointer to a local value and a pointer to a remote value, and the user must be careful to dereference correctly, or else a compilation error occurs. A pointer is declared as being remotable, and the syntax for dereferencing it is different from the syntax for dereferencing a local pointer. Only a subset of the types can be used as the types of remotable function arguments or returned values; a type in this set is a transmissible type. The fundamental data types and remotable pointers are transmissible. A local pointer is not transmissible: a user-defined encoder and decoder are needed for each class to make it a transmissible class. For encoding and decoding complex structures, such as circular linked lists, the degree of sharing of remotable objects can be controlled. An object in Extended-C++ can be encoded and decoded as a shared object or as a value object. During a remotable call, a shared object can be encoded (decoded) at most once, whereas a value object has no limit on the number of times it can be encoded (decoded). This avoids looping during the encoding (decoding) of circular lists, for instance, through the correct specification of an object's encoder (decoder) function.

DAPHNE [49] consists of language-independent tools and runtime support that allow programs to be divided into parts for distributed execution on the nodes of a heterogeneous computer network. A callee process is not shared between concurrent callers belonging to different users; it is private and created on demand. A caller process can issue the parallel execution of several callees and resumes control only when it receives a reply from all calls. Each machine has a spawner process that supports the creation of callee processes on demand; it is reached via RPC using a well-known transport address.

In Concert [90, 7], a distributed process model and an interface description language are designed. The ultimate goal of Concert is to allow interprocess communication among processes that support its process model and IDL. This interoperability is possible even in the presence of heterogeneity at the language level, the operating-system level, or the machine level.

The process model is integrated at the language level, which gives the concept greater portability. Languages usually need to be only minimally extended to support the model. Extensions encompass process dynamics, rendezvous, RPC, and asynchronous RPC. The model defines additional operations and three data abstractions: processes, ports, and bindings. A process consists of a thread of control and some encapsulated data local to the process. Processes can be dynamically created and terminated at the language level, and may not correspond to operating-system processes. Interprocess communication is achieved via directed typed channels, whose receiving ends are represented by ports. A binding represents the capability to call a port or to send a typed message to it. The operations allow the manipulation of these abstract types: the dynamic creation of processes, ports, and bindings; querying whether ports have messages queued; waiting for messages to arrive on specific ports; and processing these messages and replying if needed.

Ports introduce a new storage class, with queues associated with them to hold calls. Functions can be declared to be in the port storage class, can be passed to other processes, and can then be used for interprocess communication. On the callee side, the function accept() receives a collection of port storage-class functions and executes a rendezvous on one of the ports. If no calls are pending on any of the port functions, the accept function waits for a call to become pending on at least one of the ports.

Otherwise, it selects one of the port functions nondeterministically. The accept function demarshals the arguments, invokes the respective function, and transmits the results to the caller after the function has finished.

Concert is an ambitious design in which not only RPC but also asynchronous communication and process dynamics are provided at the language level. To support asynchronous communication, Concert adds two new data types and five new operators. The receiveport operator specifies ports for one-way messages; the cmsg (call message) operator specifies a two-way message that has been received but not yet replied to; the send operator sends a message in a non-blocking manner, that is, the caller is unblocked right after the message is sent, probably before it is received and processed; the poll operator allows a process to check whether messages are present at any of a collection of ports, without processing the messages; a blocking counterpart of poll waits, if no call is pending, until some other process makes a call; the receive operator receives (dequeues) a message of any kind from any kind of port; and the reply operator replies to a two-way message that was received and represented as a cmsg.

Hermes [75] was the first language of the Concert family to be implemented. Hermes is a process-oriented language that fully supports the Concert process model in a type-safe manner.

About the Author

Patricia Gomes Soares received the B.Sc. and M.Sc. degrees in Computer Science from Universidade Federal de Pernambuco (UFPE), Recife, Brazil, in 1987 and 1989, respectively. Since 1989 she has been a Ph.D. candidate in the Computer Science Department, Columbia University, New York. Her research interests include languages, compilers, and software systems in general.

She can be reached at the Computer Science Department, Columbia University, 450 Computer Science Building, New York, NY 10027. Her Internet address is [email protected].
