distributed processing under unix



Distributed processing under Unix

by John Dobson

Abstract: The Newcastle Connection is a software subsystem that creates a distributed system which appears to the user to be a single Unix system while preserving the rights and responsibilities of the system administrators for each of the individual systems from which it is composed. Because the physical communications medium (LAN or WAN) is transparent, users can act freely in adapting to the perceived performance limitations rather than have their working practices constrained by the type of services offered.

Keywords: Unix, distributed systems, transparent networking, system administration.

John Dobson is technical director at Mari Advanced Microelectronics Limited.

The Newcastle Connection is the name given to a software subsystem developed at the University of Newcastle upon Tyne, in the UK. The subsystem is added to a set of standard Unix systems to connect them together over any communications medium. The resulting distributed system, which can use a variety and multiplicity of local and wide area networks, is functionally indistinguishable from a conventional centralized Unix system.

Thus, all issues concerning network protocols and interprocessor communication are completely hidden from the user. Instead, all the standard Unix conventions, e.g., for naming, accessing and protecting files and devices, for executing commands, and for input/output redirection, are applicable, without apparent change, to the distributed system as a whole. This is done without any modification, other than recompilation, to any existing source code of either the Unix operating system or any user programs. As far as the user is concerned, he/she has at his/her disposal a single large Unix system which is composed from a number of constituent Unix systems, each on a separate machine.

In what follows, I shall call the resulting large Unix system, spanning several machines, Unix United. The software mechanism which implements the Unix United architecture is called the Newcastle Connection.

It is important to realize that the Newcastle Connection is primarily a technique for managing distribution, not for managing networking. That is, the emphasis has been placed on transparent extensions of Unix to embrace multiple systems rather than on considerations of network protocols and optimization of communications bandwidth. Clearly, such considerations are important, and close attention has been paid to them during the design and implementation of the Newcastle Connection, but the primary objective has been the preservation of Unix functionality.

There is a close analogy here with the use of paging mechanisms to implement a virtual memory system. From a VM user’s point of view, his/her requirement is for a large homogeneous address space. The exact combination of page turning algorithm and optimizing disc driver needed to satisfy the user’s requirement is something he/she may neither know nor care about. In Unix United, the designers’ goal has been a large (and potentially infinite) homogeneous name space transparently distributed over a large (and potentially infinite) set of machines. The Newcastle Connection corresponds to a paging supervisor, and is merely the mechanism through which the goal is achieved.

0011-684X/84/090010-03$03.00 © 1984 Butterworth & Co (Publishers) Ltd. data processing

System administrator

A Unix United system is thus composed out of a (possibly large) number of interlinked standard Unix systems, each with its own storage and peripheral devices, accredited set of users and system administrator. Each constituent system has the responsibility for authenticating any user who attempts to log-in to that system.

The system administrator for a particular system also has responsibility for maintaining a table of recognized remote user identifiers, and the privileges such remote users are allowed, on his/her local machine. If the system administrator so wishes, rather than refuse all access to unrecognized remote users, he/she can give them a default ‘guest’ status, presumably one which enjoys very limited access rights.
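The table of recognized remote users, with its optional ‘guest’ fallback, might be pictured roughly as follows. This is only an illustrative sketch: the table layout, the privilege names and the function `privileges_for` are assumptions made for the example, not the data structures actually used by the Newcastle Connection.

```python
from typing import Optional, Set

# Hypothetical table kept by the local administrator:
# (remote system, remote user identifier) -> privileges allowed locally.
REMOTE_USERS = {
    ("UNIX2", "brian"): {"read", "write"},
    ("UNIX2", "tom"): {"read"},
}

# Hypothetical default for unrecognized remote users.
# None would mean "refuse all access" instead of granting guest status.
GUEST_PRIVILEGES: Optional[Set[str]] = {"read"}


def privileges_for(system: str, user: str,
                   guest: Optional[Set[str]] = GUEST_PRIVILEGES) -> Set[str]:
    """Return the local privileges granted to a remote user."""
    key = (system, user)
    if key in REMOTE_USERS:
        return REMOTE_USERS[key]
    if guest is None:
        # The administrator has chosen not to offer guest status.
        raise PermissionError(f"unrecognized remote user {user} on {system}")
    return guest
```

The point of the sketch is only that the decision stays with the local administrator: both the table and the choice between refusal and guest status are local policy.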

From an individual user’s point of view, therefore, though he/she might have needed to negotiate not with one, but with several system administrators for usage rights beforehand, access to the whole Unix United system is via a single conventional log-in.

Subject to the rights allocated by the various system administrators, the user will then be governed by, and be able to make normal use of, the standard Unix procedures in accessing the entire distributed system. In particular, there is no need for the user to log-in, or to provide passwords, to any of the remote systems that his/her commands or programs happen to use. The user is, therefore, presented with the appearance of a single system, without the abrogation of the rights and responsibilities of the individual system administrators.

Unix United as a distributed system

The term ‘distributed system’ can be applied to a large spectrum of computer systems. At one end of the spectrum are tightly coupled multiprocessor systems in which all the processors share a common main memory. At the other end lie loosely coupled systems tied together by a serial link, for example, a home microcomputer acting as a dumb terminal so that a user can access a remote larger computer over a telephone line.

Figure 1. A Unix name tree.

In general, the more tightly coupled the system, the more similar the component computers, and the more each computer knows or assumes about the operation of the others. From a managerial viewpoint, tightly coupled systems are likely to have single authorities performing all the management functions, to have a single user community and to be owned by a single organization. Conversely, loosely coupled systems may well be administered by several authorities, have disjoint user communities and comprise separately owned computers and communications media.

Thus, both Unix United as an architecture, and the Newcastle Connection as its implementation, lie towards the loosely coupled end of the spectrum of distributed systems. An important part of the design was to allow each component Unix system to have its own administrator, user community and access policy.

But while allowing this possibility, Unix United at the same time allows each user to have an integrated single- system view of the distributed system as a whole.

When a Unix system is considered in isolation, there is little need for a precise definition of what constitutes the system. But in a Unix United system, where each component system retains its own identity while being a part of the whole, a more precise definition is essential if the implementation, i.e., the Newcastle Connection, is to make a consistent extension of Unix semantics from the local to the global environment. To explain the Newcastle Connection notion of a system, we need to consider how objects (files, devices, commands) are named in Unix.

Naming in Unix United

In standard Unix, the names of files and devices are regarded as belonging to a tree structure. Figure 1 shows (part of) a typical Unix name tree. The full name of a file is the concatenation of the names of all the nodes in the path from the root (conventionally called ‘/’) to the file, for example /user/tom/filed.

Each of the intermediate nodes is a directory of the nodes below it, for example, /user/brian is a directory. A directory may contain entries for both files and for lower level directories, each such entry being marked with its access limitations, e.g. read-only. Directories are, however, treated and implemented as ordinary files for all intents and purposes, and are accessed via the normal file access methods.

A device is a special sort of file. It is the device driver, in effect, and appears on the name tree just as if it were a file. The root of the tree (‘/’) is a specially marked directory, but is otherwise no different in principle from any other directory.

When a user wishes to name a file, he/she can do so either by its full name, i.e., relative to ‘/’, or by a shortened name relative to the user’s current working directory (which can be positioned wherever the user wishes). Thus, in Figure 1, if user ‘brian’ has set his current working directory to /user/brian, he can refer to his files as ‘filea’ or ‘filec’. If user ‘tom’ has set his current working directory to /user, he can refer to jed/filea or brian/filea. Note that with the working directory at /user, if ‘tom’ were to refer to ‘fileb’, it would be treated as a reference to /user/fileb, which may or may not exist. (It doesn’t in Figure 1.)
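The rule for turning a shortened name into a full name can be illustrated with a few lines of Python. This is a sketch of the naming rule only, using the standard `posixpath` module; the helper name `full_name` is introduced here for illustration, and the code performs no file access or permission checks.

```python
import posixpath


def full_name(cwd: str, name: str) -> str:
    """Resolve a file name against the current working directory."""
    # posixpath.join ignores cwd when name is already a full name
    # (one that starts with "/"), matching the rule in the text.
    return posixpath.normpath(posixpath.join(cwd, name))


print(full_name("/user/brian", "filea"))      # /user/brian/filea
print(full_name("/user", "jed/filea"))        # /user/jed/filea
print(full_name("/user", "fileb"))            # /user/fileb
print(full_name("/user", "/user/tom/filed"))  # /user/tom/filed
```

Whether the resulting full name denotes an existing file is a separate question, as the /user/fileb case shows.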

Thus, the standard Unix name tree consists of files, or devices, which are treated as a special sort of file, directories and a distinguished ‘root’ directory. It is the root which defines a Unix system, since a number of important Unix concepts are tied to the root. These include a set of user identifiers, an administrator and his/her administrative apparatus, e.g., the password file, a set of active process identifiers, and some standard conventional files and devices. It is these objects, which are named relative to the local root, that constitute the environment seen by the user and within which context his/her processes are executed.

The most concise way of explaining the Newcastle Connection notion of a system is to say that a component system is equivalent to a ‘root’ directory, which is potentially one of many such root directories which may occur anywhere in a Unix United naming tree. (Note that we now are distinguishing between a root directory, which provides a context for a user, and the ‘base’ of the tree, which is that directory without a parent in the tree.)

Thus, the overall Unix United naming tree consists of files, directories and systems (or roots). When a user logs in to a Unix United system, his/her root context becomes the root directory of the component system to which he/she is logging in. Subsequently the user can move to a different context simply by naming it in the usual Unix manner of naming and the usual Unix access checks will be applied just as if that part of name space to which the user is attempting to move was potentially available to him/her on his/her own Unix system.

Figure 2. A Unix United name tree. UNIX1 and UNIX2 are marked as systems, i.e., can serve as local roots.

vol 26 no 9 november 1984

In Figure 2, a user on UNIX1 can refer to a file on UNIX2 as /../UNIX2/user/brian/filea, using the standard Unix facility that ‘..’ is an anonymous reference to the parent of a particular directory - hence ‘/..’ refers to the parent of ‘/’, i.e. the base in Figure 2. At the time the file is named, the access rights to the named file will be checked by the UNIX2 system to see if the requesting user (on UNIX1) enjoys the privilege of access.
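The way a name such as /../UNIX2/user/brian/filea selects a component system can be sketched as below. This is a simplified model, not the Newcastle Connection's implementation: it assumes, as in Figure 2, that every system is a direct child of the base, it handles full names only, and the function name `resolve` is hypothetical.

```python
from typing import List, Optional, Tuple


def resolve(current_system: str, path: str) -> Tuple[Optional[str], str]:
    """Map a Unix United name to (component system, name local to it).

    Assumes every system sits directly below the base, as in Figure 2.
    """
    if not path.startswith("/"):
        raise ValueError("this sketch handles full names only")
    # Position in the global tree, as a list of components below the base.
    # A full name starts at the local root, i.e. at the current system.
    position: List[str] = [current_system]
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if position:  # the base has no parent, so stay at the base
                position.pop()
        else:
            position.append(part)
    if not position:
        return None, "/"  # the name denotes the base itself
    system, rest = position[0], position[1:]
    return system, "/" + "/".join(rest)


print(resolve("UNIX1", "/../UNIX2/user/brian/filea"))
print(resolve("UNIX1", "/user/tom/filed"))
```

In this model the first call yields the pair `('UNIX2', '/user/brian/filea')` and the second stays on UNIX1. The access check described in the text would then be made by the system named in the first element of the pair; the sketch does not model it.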

Unix United and Aspect

As a particular example of a Unix United environment, I shall consider the work being carried out by the Aspect consortium, which is funded under the UK Alvey programme. Aspect is charged with the provision of an Integrated Project Support Environment (IPSE) for the distributed development of distributed systems.


The projected environment is assumed to consist of a number of geographically dispersed teams, each working on a different facet of some overall project. In practice, there will be a fairly tight information coupling within one location, which may comprise several machines interconnected typically by a local area network (LAN), but there will be a much lesser degree of information coupling between separate locations, which will typically communicate using a wide area network (WAN).

In general, current models of geographically distributed interworking assume that the loose information coupling between remote sites takes the form of a few specific application types, usually mail, file transfer, and remote job execution.

However, the differences in type between intra- and interlocation information exchange are dictated partly by purely geographical and project management considerations and partly by the fact that with current technology, LANs are faster and more reliable than WANs. It is the effect of these factors separately that the Aspect project wishes to investigate as one of its areas of research.

In particular, Aspect does not wish to constrain in advance the type of interworking that is subject to the performance limitations of WAN technology. Rather it wishes to discover from the users how they choose to work in the light of those performance limitations. It is quite possible that the kind of information exchange that users evolve for themselves in a distributed development environment turns out to be quite different from the preconceptions of the designers of application layer software.

The Unix United architecture is ideally suited for that kind of project and that investigation. Since the structure of name space can be made completely independent of the underlying physical topology, and is under direct control of the project administration, it can be made to reflect the logical structure of the project rather than the structure of the physical location of the various project teams.

Obviously, individuals will become aware of differences in response time in accessing various components on which they are operating, and will adjust their working habits accordingly, but they will have freedom to adapt to the perceived environment rather than have the environment dictated to them, since their conceptual image of the environment will be a unified system which they can manipulate rather than a disjoint and inflexible set of services.

One final point is perhaps worth mentioning. The administrative mechanisms and facilities required to control the use of a centralized multiprocessing operating system have gradually become better understood over several years’ experience. What has become clear as a result of the work on Unix United is that this understanding can also be extended to distributed systems. The additional problems and opportunities that face the administrator of a distributed system should not be allowed to obscure the continued relevance of established practice in user control in centralized systems.

Mari Advanced Microelectronics Ltd, 32 Grainger Park Rd, Newcastle upon Tyne NE4 8RY, UK.
