
Network Operating System _____________________________________

A network operating system (NOS) is the software that runs on a server and enables the server to manage data, users, groups, security, applications, and other networking functions. Unlike operating systems, such as Windows, that are designed for a single user controlling one computer, network operating systems coordinate the activities of multiple computers across a network. The network operating system acts as a director to keep the network running smoothly. It is designed to allow shared file and printer access among multiple computers in a network, typically a local area network (LAN) or private network, and to other networks. Normally, it is a complete operating system with file, task and job management; however, with some earlier products it was a separate component that ran under the OS; for example, LAN Server required OS/2, and LANtastic required DOS. The most popular network operating systems are Microsoft Windows Server 2003, Microsoft Windows Server 2008, UNIX, Linux, Mac OS X, Novell NetWare, and BSD. Such products may also include a Web server, directory services, a messaging system, network management and multiprotocol routing capabilities.

An operating system is low-level software controlling the inner workings of a machine. Typical functions performed by an operating system include managing the CPU among many concurrently executing tasks, managing memory allocation to the tasks, handling input and output and controlling all the peripherals. Application programs, and often the human user, are unaware of the features of operating systems, as these features are embedded and hidden below many layers of software; hence the term low-level software. Operating systems have been developed, in many forms, since the early 1960s and matured in the 1970s. The emergence of networking in the 1970s and its explosive growth since the early 1980s have had a significant impact on the networking services provided by an operating system. As more network management features moved into operating systems, network operating systems evolved. Like regular operating systems, network operating systems provide services to the programs that run on top of them. However, the type of services and the manner in which the services are provided are quite different. The services tend to be much more complex than those provided by regular operating systems. In addition, the implementation of these services requires the use of multiple machines, message passing and server processes.

The control program of any home or corporate computer network, a network operating system (NOS) controls the utilities, users, and devices that make up a network. The NOS is similar to a computer operating system like Windows or Linux, but contains an array of utilities and programs that ensure data is transmitted to the correct user, computer, or other device. Network operating systems share common characteristics, no matter what platform the operating system is running on.

Network Operating Systems extend the facilities and services provided by computer operating systems to support a set of computers connected by a network. The environment managed by a network operating system consists of an interconnected group of machines that are loosely connected. By loosely connected, we mean that such computers possess no hardware connections at the CPU-memory bus level, but are connected by external interfaces that run under the control of software. Each computer in this group runs an autonomous operating system, yet cooperates with the others to allow a variety of facilities including file sharing, data sharing, peripheral sharing, remote execution and cooperative computation. Network operating systems are autonomous operating systems that support such cooperation. The group of machines comprising the management domain of the network operating system is called a distributed system. A close cousin of the network operating system is the distributed operating system. A distributed operating system is an extension of the network operating system that supports even higher levels of cooperation and integration of the machines on the network (features include task migration, dynamic resource location, and so on).

The network operating system is an extensible operating system. It provides mechanisms to easily add and remove services, to reconfigure the resources, and to support multiple services of the same kind (for example, two kinds of file systems). Such features make network operating systems indispensable in large networked environments. In the early 1980s network operating systems were mainly research projects. Many network and distributed operating systems were built, including Amoeba, Argus, Berkeley Unix, Choices, Clouds, Cronus, Eden, Mach, Newcastle Connection, Sprite, and the V-System. Many of the ideas developed by these research projects have since moved into commercial products. The commonly available network operating systems include Linux (freeware), Novell NetWare, SunOS/Solaris, Unix and Windows NT. In addition to the software technology that goes into networked systems, theoretical foundations of distributed (or networked) systems have been developed. Such theory includes topics such as distributed algorithms, concurrency control, state management, deadlock handling and so on.

The set of typical services provided by a network operating system includes (but is not limited to):

1. Remote logon and file transfer

2. Transparent, remote file service

3. Directory and naming service

4. Remote procedure call service

5. Object and brokerage service

6. Time and synchronization service

7. Remote memory service

In order to transmit signals across a network, a computer must communicate with its modem or Network Interface Card (NIC). Network Operating Systems (NOS) provide the protocols necessary to achieve this goal, but each type of modem or NIC needs to be able to communicate with the particular NOS. It is therefore necessary to install the special software that comes with the interface device. This software is often referred to as a driver. Computers made today usually come with both the interface and the necessary drivers installed. Occasionally, you must install the modem or NIC yourself, and you must then install the correct driver for that interface device. Failure to install the driver means that the device will be unable to communicate over the network or with the computer it is installed in. Network operating systems not only allow communication across a network; they also allow a network administrator to organize resources, control access, and ensure that the network is operating efficiently.

Characteristics of Network Operating Systems __________________

Provides Support

The first characteristic of network operating systems is the support component. Network operating systems provide support for the multiple processors, applications, and hardware devices that make up a network. The systems support the users accessing the network and process requests for specific documents and for the use of hardware. NOSs also service protocol requests from users, including requests using Transmission Control Protocol/Internet Protocol (TCP/IP) and other protocols.

Controls Security

The second characteristic of network operating systems is the security component. NOSs manage the authorization and authentication of users, computer workstations, and other devices accessing a network. If an intruder tries to access a network, the NOS blocks the unauthorized user/computer and logs the intrusion attempt within its log files. The NOS also manages software and hardware installations to keep users from installing unauthorized software and devices.

User Setup

The third characteristic of network operating systems is the user setup component. Network operating systems create user accounts and manage the users logging into and out of the network. The systems also manage what file and directory services a specific user has access to, users accessing the network remotely, and how the network's graphical interface looks to specific users.

Printing and File Services

The fourth characteristic of network operating systems is the printing and file services component. Like other operating systems, network operating systems manage all printing, storage, backup, and duplication services for computers and users accessing a network. The systems also control access to the Internet, local-area networks (LANs) and wide-area networks (WANs), port routing, and internal web services, known as an intranet. An NOS routes documents and files through the printing and file services as they arrive; more than one user can send documents and files through the network for printing, backup, or other types of processing.

Email Services

The last characteristic of network operating systems is the email component. An NOS manages electronic mail (email) for the entire network, including users accessing the NOS remotely and from the Internet. The NOS blocks spam and other problematic email, sends and receives email, and lets users create additional email accounts.

Features of Network Operating Systems __________________

Basic Operating Features

Network operating systems support the basic underlying operating features of networks. These include support for processors and for the various protocols that allow computers to share data. Many network operating systems can detect hardware within the system, allowing for asset discovery within the network. Network operating systems also support the processing of other software applications that run both on individual computers and within the network.

Security Features

Network operating systems support a number of security features that control access to the network. These include authorization and permission for access to the network, with specific control of features such as user management, log-on controls and passwords. Systems also provide access control for features such as remote access and network monitoring.

Networking

A network operating system is the platform on which computer networking takes place. Basic features allow for file, print and Internet connections. Data backup and replication functions are controlled through the network operating system. Connective systems for local and wide area networks (LANs and WANs), such as routers, switches and ports, are configured and managed through network operating system features.

Administrative Interface

One of the features of a network operating system is an administrative interface that allows a network administrator to monitor and maintain the system. This interface has a menu that allows the administrator to perform functions such as formatting hard drives and setting up security protocols for both the system and individual users. The administrator can also configure security and data backup requirements for individual computers or for the network as a whole.

History________________________________________________

The emergence and subsequent popularity of networking prompted the advent of network operating systems. The first networks supported some basic network protocol and allowed computers to exchange data. Specific application programs running on these machines controlled the exchange of data and used the network to share data for specific purposes. Soon it was apparent that uniform and global networking support within the operating system would be necessary to use the underlying network effectively. A particularly successful thrust at integrating networking extensions into an operating system resulted in Berkeley Unix (known as BSD). Unix was an operating system created at Bell Labs, and was licensed to the University of California at Berkeley for enhancements and then licensed quite freely to most universities and research facilities. The major innovation in Berkeley's version was support for TCP/IP networking.

In the early 1980s TCP/IP (Transmission Control Protocol/Internet Protocol) was an emerging networking protocol, developed by a team of research institutions for a U.S. Government funded project called the ARPANET. Specialized machines were connected to the ARPANET, and these machines ran TCP/IP. Berkeley made the groundbreaking decision to integrate the TCP/IP protocol into the Unix operating system, suddenly allowing all processes on a general-purpose Unix machine to communicate with processes on any machine connected to the network. Then came the now ubiquitous programs that run on top of the TCP/IP protocol, including telnet, ftp and e-mail. The telnet program (as well as its cousins rlogin and rsh) allows a user on one machine to transparently access another machine. Similarly, ftp allowed transmission of files between machines with ease. E-mail opened a new mode of communication. While these facilities are very basic and taken for granted today, they were considered revolutionary when they first appeared. However, as the number of networked computers increased dramatically, it became apparent that these services were simply not enough for an effective work environment. For example, assume a department (in 1985) has about 40 users assigned to 10 machines. This assignment immediately led to a whole slew of problems; we outline some below:

· A user can only use the machines on which he or she has an account. Soon users started wanting accounts on many, if not all, machines.

· A user wanting to send mail to a colleague not only had to know the recipient's name (acceptable) but also which machines the recipient used; in fact, the sender needed to know the recipient's favorite machine.

· Two users working together, but having different machine assignments, had to use ftp to move files back and forth in order to accomplish joint work. Not only did this require that they know each other's passwords, they also had to track the versions of the files manually.

Suddenly the boon of networking caused segregation of the workplace and became more of a bother than an enabling technology. At this point the system designers realized the need for far tighter integration of networking and operating systems, and the idea of a network operating system was born. The first popular commercial network operating system was SunOS from Sun Microsystems. SunOS is a derivative of the popular Berkeley Unix (BSD). Two major innovations present in SunOS are Sun-NFS and Yellow Pages. Sun-NFS is a network file system. Sun-NFS allows a file that exists on one machine to be transparently visible from other machines. Proper use of Sun-NFS can eliminate file location and movement problems. Yellow Pages, which was later renamed NIS (Network Information System), is a directory service. This service allowed, among other things, user accounts created on one central administrative machine to be propagated to the machines the user needs to use.

The addition of better, global services to the base operating system is the basic concept that propelled the emergence of network operating systems. Current operating systems provide a rather large number of such services built at the kernel layer or at higher layers to provide application programs with a unified view of the network. In fact, the goal of network operating systems is network transparency, that is, the network becomes invisible.

Services for Network Operating Systems____________________

System-wide services are the main facility a network operating system provides. These services come in many flavors and types. Services are functions provided by the operating system and form a substrate used by applications that need to interact beyond the boundaries imposed by the process concept.

A service is provided by a server and accessed by clients. A server is a process or task that continuously monitors incoming service requests (similar to telephone operators). When a service request comes in, the server process reacts to the request, performs the task requested and then returns a response to the requestor. Often, one or more such server processes run on a computer, and the computer itself is called a server. However, a server process does not have to run on a server machine, and the two terms are often, confusingly, used interchangeably.

What is a service? In regular operating systems, the system call interface, or API (Application Programming Interface), defines the set of services provided by the operating system. For example, operating system services include process creation facilities, file manipulation facilities and so on. These services (or system calls) are predefined and static. This is not the case in a network operating system. Network operating systems do provide a set of static, predefined services, or system calls, like a regular operating system, but in addition provide a much larger, richer set of dynamically creatable and configurable services. Additional services are added to the network operating system through server processes and associated libraries.

Any process making a request to a server process is called a client. A client makes a request by sending a message to a server containing the details of the request, and awaits a response. For each server, there is a well-defined protocol defining the requests that can be made to that server and the responses that are expected. In addition, any process can make a request; that is, any process can become a client, even temporarily. For example, a server process can obtain services from yet another server process, and while it is doing so, it can be termed a temporary client.

Services provided by a network operating system include file service, name service, object service, time service, memory service and so on.

Peripheral Sharing Service

Peripherals connected to one computer are often shared by other computers through peripheral sharing services. These services go by many names, such as remote device access, printer sharing, shared disks and so on. A computer having a peripheral device makes it available by exporting it. Other computers can connect to the exported peripheral. After a connection is made, the shared peripheral appears, to a user on the connected machine, to be local (that is, connected to the user's machine). The sharing service is the most basic service provided by a network operating system.

File Service

The most common service that a network operating system provides is file service. File services allow users of a set of computers to access files and other persistent storage objects from any computer connected to the network. The files are stored on one or more machines called the file server(s). The machines that use these files, often called workstations, have transparent access to them.

Not only is the file service a common service, it is also the most important service in the network operating system. Consequently, it is the most heavily studied and optimized service. There are many different, often non-interoperable, protocols for providing file service.

The first full-fledged implementation of a file service system was done by Sun Microsystems and is called the Sun Network File System (Sun-NFS). Sun-NFS has become an industry standard network file system for computers running the Unix operating system. Sun-NFS can also be used from computers running Windows (all varieties) and MacOS but with some limitations.

Under Sun-NFS a machine on a network can export a file system tree (i.e. a directory and all its contents and subdirectories). A machine that exports one or more directories is called a file server. After a directory has been exported, any machine connected to the file server (possibly over the Internet) can import, or mount, that file tree. Mounting is a process by which the exported directory, all its contents, and all its subdirectories appear to be a local directory on the machine that mounted it. Mounting is a common method used in Unix systems to build unified file systems from a set of disk partitions. The mounting of an exported directory from one machine onto a local directory of another machine via Sun-NFS is termed remote mounting.

While many different configurations are possible through the innovative use of remote mounting, the configuration shown in Figure 1 is quite commonly used. This is called the dataless workstation configuration. In such a setup, all files, data and critical applications are kept on the file servers and mounted on the workstations. The local disks of the workstations contain only the operating system, some heavily used applications and swap space.

Sun-NFS works by using a protocol defined for remote file service. When an application program makes a request to read (or write) a file, it makes a local system call to the operating system. The operating system then consults its mounting tables to determine whether the file is local or remote. If the file is local, the conventional file access mechanisms handle the task. If the file is remote, the operating system creates a request packet conforming to the NFS protocol and sends the packet to the machine holding the file.
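The local-versus-remote decision just described can be pictured with a short sketch. This is purely illustrative: the names MOUNT_TABLE, resolve and nfs_request are invented for the example and are not part of a real NFS client.

    # Sketch: deciding whether a file access is local or remote, in the
    # spirit of the mount-table consultation described above.
    MOUNT_TABLE = {
        "/home": ("fileserver.example.com", "/export/home"),  # a remote mount
    }

    def resolve(path):
        """Return (server, remote_path) for a remotely mounted path, else None."""
        for mount_point, (server, remote_root) in MOUNT_TABLE.items():
            if path == mount_point or path.startswith(mount_point + "/"):
                return server, remote_root + path[len(mount_point):]
        return None

    def nfs_request(server, op, remote_path):
        # Placeholder: a real client builds an NFS protocol packet here
        # and sends it to the nfsd daemon on the file server.
        return b""

    def read_file(path):
        target = resolve(path)
        if target is None:
            with open(path, "rb") as f:        # conventional local access
                return f.read()
        server, remote_path = target
        return nfs_request(server, "READ", remote_path)

    print(resolve("/home/alice/notes.txt"))
    # ('fileserver.example.com', '/export/home/alice/notes.txt')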

The remote machine runs a server process, also called a daemon, named nfsd. Nfsd receives the request and reads (or writes) the file, as requested by the application, and returns a confirmation to the requesting machine. The requesting machine then informs the application of the success of the operation. Of course, the application does not know whether the execution of the file operation was local or remote. Similar to Sun-NFS, there are several other protocols for file service. These include AppleShare for Macintosh computers, the SMB protocol for Windows 95/NT and the DFS protocol used in the Andrew file system. Of these, the Andrew file system is the most innovative. Andrew, developed at CMU in the late 1980s, is a scalable file system. Andrew is designed to handle hundreds of file servers and many thousands of workstations without degrading file service performance. Degraded performance in other file systems is the result of bottlenecks at file servers and network access points. The key feature that makes Andrew a scalable system is its use of innovative file caching strategies. The Andrew file system is also available commercially, as DFS (Distributed File System).

In Andrew/DFS, when an application accesses a file, the entire file is transmitted from the server to the workstation, or to a special intermediate file storage system closer to the workstation. The application then uses the file, in a manner similar to NFS. After the user running the application logs out of the workstation, the file is sent back to the server. Such a system, however, has the potential of suffering from file inconsistencies if the same user uses two workstations at two locations. In order to keep a file consistent when it is used concurrently, the file server uses a callback protocol. The server can recall the file in use by a workstation if another workstation uses it simultaneously. Under the callback scheme, the server stores the file and both workstations access it remotely. Performance suffers, but consistency is retained. Since concurrent access to a file is rare, the callback protocol is used very infrequently and thus does not hamper the scalability of the system.

Directory or Name Service

A network of computers managed by a network operating system can get rather large. A particular problem in large networks is the maintenance of information about the availability of services and their physical locations. For example, suppose a particular client needs access to a database. There are many different database services running on the network. How would the client know whether the particular service it is interested in is available, and if so, on which server?

Directory services, sometimes called name services, address such problems.

Directory services are the mainstay of large network operating systems. When a client application needs to access a server process, it contacts the directory server and requests the address of the service. The directory server identifies the service by its name (all services have unique names) and informs the client of the address of the service; the address contains the name of the server. The directory server is responsible for knowing the current locations and availability of all services, and hence can inform the client of the unique network address (somewhat like a telephone number) of the service.

The directory service is thus a database of service names and service addresses. All servers register themselves with the directory service upon startup. Clients find server addresses upon startup. A client can retain the results of a directory lookup for the duration of its life, or can store them in a file and thus retain them potentially forever. Retaining addresses of services is termed address caching. Address caching improves performance and reduces the load on the directory server. Caching also has disadvantages. If the system is reconfigured and a service address changes, then the cached data is wrong and can indeed cause serious disruptions if some other service is assigned that address. Thus, when caching is used, clients and servers have to verify the accuracy of cached information.
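As a minimal sketch, the registration, lookup and caching just described can be expressed as follows. All the names and addresses are invented for illustration; a real directory service is a separate (often replicated) network server, not an in-process object.

    # Sketch of a directory (name) service: servers register name -> address
    # bindings; clients look names up and may cache the results.
    class DirectoryService:
        def __init__(self):
            self._table = {}                  # service name -> network address

        def register(self, name, address):
            self._table[name] = address       # done by servers at startup

        def lookup(self, name):
            return self._table.get(name)

    directory = DirectoryService()
    directory.register("movie-db", ("10.0.0.7", 6000))

    address_cache = {}                        # client-side address cache

    def find_service(name):
        if name in address_cache:             # cached entries must be verified:
            return address_cache[name]        # they go stale on reconfiguration
        address = directory.lookup(name)
        if address is not None:
            address_cache[name] = address
        return address

    print(find_service("movie-db"))           # ('10.0.0.7', 6000)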

The directory service is just like any other service, i.e. it is provided by a service process. So there are two problems:

1. How does the client find the address of the directory service?

2. What happens if the directory service process crashes?

Making the address of the directory service a well-known constant solves the first problem. Different systems have different techniques for doing this, but a client always has enough information to contact the directory service.

To ensure the directory service is robust and not dependent on one machine, it is often replicated, or mirrored. That is, there are several independent directory servers, all of which contain (hopefully) the same information. A client is aware of all these servers and can contact any one. As long as one directory server is reachable, the client gets the information it seeks. However, keeping the directory servers consistent, i.e. holding the same information, is not a simple task. This is generally done using one of many replication control protocols.

The directory service has subsequently been expanded to handle not just service addresses, but higher-level information such as user information, object information, web information and so on. A standard for worldwide directory services over large networks such as the Internet has been developed and is known as the X.500 directory service. However, the deployment of X.500 has been low, and its importance has thus eroded. As of now, a simpler directory service called LDAP (Lightweight Directory Access Protocol) is gaining momentum, and most network operating systems provide support for this protocol.

RPC Service

A particular mechanism for implementing the services in a network operating system is called Remote Procedure Calls, or RPC. The RPC mechanism is discussed later in the section entitled Mechanisms for Network Operating Systems. The RPC mechanism needs an RPC server to be accessible by an RPC client. However, a particular system may contain tens if not hundreds or even thousands of RPC servers. In order to avoid conflicts and divergent communication protocols, the network operating system provides support for building, managing and accessing RPC servers.

Each RPC service is an application-defined service. However, the operating system also provides an RPC service: a meta-service that allows the application-specific RPC services to be used in a uniform manner. This service provides several features:

1. Management of unique identifiers (or addresses) for each RPC server.

2. Tools for building client and server stubs for packing and unpacking (also known as marshalling and unmarshalling) of arguments between clients and servers.

3. A per-machine RPC listening service.

The RPC service defines a set of unique numbers that can be used by all RPC servers on the network. Each specific RPC server is assigned one of these numbers (addresses). The operating system manages the creation and assignment of these identifiers. The operating system also provides tools that allow the programmers of RPC services to build a consistent client-server interface. This is done by the use of language processing tools and stub generators, which embed routines in the client and server code. These routines package the data sent from the client to the server (and vice versa) in some predefined, machine-independent format.

When a client uses the number to contact the service, it looks up the directory and finds the name of the physical machine hosting the service. Then it sends an RPC request to the RPC listener on that machine. The RPC listener is an operating-system-provided service that redirects RPC calls to the actual RPC server process that should handle the call. RPC services are available in all network operating systems. The three most common types of RPC systems are Sun RPC, DCE RPC and Microsoft RPC.
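As a rough illustration of the RPC pattern, the sketch below uses Python's standard xmlrpc modules as a stand-in. This is not Sun, DCE or Microsoft RPC, and the port number and procedure name are arbitrary choices.

    # A toy RPC server exporting one procedure, and a client calling it.
    from xmlrpc.server import SimpleXMLRPCServer
    from xmlrpc.client import ServerProxy
    import threading, time

    def add(a, b):                  # the remote procedure
        return a + b

    def serve():
        server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
        server.register_function(add, "add")
        server.serve_forever()

    threading.Thread(target=serve, daemon=True).start()
    time.sleep(0.5)                 # give the server a moment to start

    # The proxy plays the role of the client stub: the call looks local,
    # but the arguments are marshalled and sent over the network.
    client = ServerProxy("http://localhost:8000")
    print(client.add(2, 3))         # 5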

Object and Brokerage Service

The success and popularity of RPC services, coupled with the object-orientation frenzy of the mid-1980s, led to the development of Object Services and then to Brokerage Services. The concept of object services is as follows.

Services in networked environments can be thought of as basic services and composite services. Each basic service is implemented by an object. An object is an instance of a class, while a class is inherited from one or more base or composite classes. The object is a persistent entity that stores data in a structured form, and may contain other objects. The object has an external interface, visible from clients and is defined by the public methods the object supports.

Composite services are composed of multiple objects (basic and composite) which can be embedded or linked. Thus we can build a highly structured service infrastructure that is flexible, modular and has unlimited growth potential.

In order to achieve the above concept, the network operating systems started providing uniform methods of describing, implementing and supporting objects (similar to the support for RPC).

While the concept sounds very attractive in theory, there are some practical problems.

These are:

1. How does a client access a service?

2. How does a client know of the available services and the interfaces they offer?

3. How does one actually build objects (or services)?

We discuss the questions in reverse order. The services or objects are built using a language that allows the specification of objects, classes, and methods, and allows for inheritance and overloading. While C++ seems a natural choice, it does not provide features for defining external service interfaces and does not have the power of remote linking. Therefore, languages based on C++ have been defined that provide such features.

The client knows of the object interface due to the predefined type of the object providing the service. The programming language provides and enforces the type information. Hence, at compile time, the client can be configured by the compiler to use the correct interface, based on the class of object the client is using. However, such a scheme makes the client use a static interface. That is, once a client has been compiled, the service cannot be updated with new features that change the interface. This need for dynamic interface management leads to the need for Brokerage Services.

After the client knows of the existence of the service and the interface it offers, the client accesses the service using two key mechanisms: the client stub and the ORB (Object Request Broker). The client stub transforms a method invocation into a transmittable service request. Embedded in the service request is ample information about the type of service requested, the arguments (and their types) and the type of the expected results. The client stub then sends a message to the ORB handling requests of this type.

The ORB is just one of the many services a brokerage system provides. The ORB is responsible for handling client requests and is an intermediary between the client and the object. Thus, the ORB is a server-side stub that receives incoming service requests, converts them to the correct formats, and sends them to the appropriate objects.

The Brokerage Service is a significantly more complex entity. It is responsible for handling:

1. Names and types of objects, and their locations.

2. Controlling the concurrency of method invocations on objects, when they happen concurrently.

3. Event notification and error handling.

4. Managing the creation, deletion and update of objects as they happen, dynamically.

5. Handling the persistence and consistency of objects. Some critical objects may need transaction management.

6. Handling queries about object capabilities and interfaces.

7. Handling reliability and replication.

8. Providing Trader Services.

The Trader Service mentioned above is interesting. The main power of object services is unleashed when clients can pick and choose services dynamically. For example, suppose a client wants access to a database object containing movies. Many such services may exist on the network, offering different or even similar features. The client can first contact the trader, get information about the services (including quality, price, range of offerings and so on) and then decide to use one of them. This is, of course, based on the successful, real-world business model. Trader services thus offer viable and useful methods of interfacing clients and objects on a large network.

The object and brokerage services depend heavily upon standards, as all programs running on a network have to conform to the same standard in order to interoperate. As of this writing, the OSF-DCE (Open Software Foundation Distributed Computing Environment) is the oldest multi-platform standard, but has limited features (it does not support inheritance, dynamic interfaces and so on). The CORBA (Common Object Request Broker Architecture) standard is gaining importance as a much better standard and is being deployed quite aggressively. Its competition, the DCOM (Distributed Component Object Model) standard, is also gaining momentum, but its availability seems to be currently limited to the Windows family of operating systems.

Group Communication Service

Group communication is an extension of multicasting for communicating process groups. When the recipients of a message are a set of processes, the message is called a multicast message (a single-recipient message is a unicast; when all processes are recipients, it is a broadcast). A process group is a set of processes whose membership may change over time. If a process sends a multicast message to a process group, all processes that are members of the group will receive this message. Simple implementations of multicasting do not work for group communications for a variety of reasons, such as the following:

1. A process may leave the group and then get messages sent to the group from a process who is not yet aware of the membership change.

2. Process P1 sends a multicast. In response to the multicast, process P2 sends another multicast. However, P2’s message arrives at P3 before P1’s message. This is causally inconsistent.

3. Some processes, which are members of the group, may not receive a multicast due to message loss or corruption.

Group communication protocols solve such problems by providing several important multicasting primitives. These include reliable multicasting, atomic multicasting, causally-related multicasting as well as dynamic group membership maintenance protocols.

The main provision in a group communication system is the provision of multicasting primitives. Some of the important ones are:

Reliable Multicast: The multicast is sent to all processes and then retransmitted to processes that did not get the message, until all processes get the multicast. Reliable multicasts may not deliver all messages if network problems arise.

Atomic Multicast: Similar to the reliable multicast, but guarantees that all processes will receive the message. If it is not possible for all processes to receive the message, then no process will receive the message.

Totally Ordered Multicast: All the multicasts are ordered strictly; that is, all the receivers get all the messages in exactly the same order. Totally ordered multicasting is expensive to implement and is not necessary in most cases. Causal multicasting is powerful enough for use by applications that need ordered multicasting.

Causally Ordered Multicast: If two multicast messages are causally related in some way, then all recipients of these multicasts will get them in the correct order.

Implicit in the notion of multicasting is the notion of dynamic process groups. A multicast is sent to a process group, and all current members of that group receive the message. The sender does not have to belong to the group.
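One common way to implement causally ordered delivery is with vector clocks. The sketch below is deliberately simplified (fixed group size, no failures, an in-memory stand-in for the network) and illustrates the technique rather than any particular system's protocol.

    # Causal delivery with vector clocks: a message is delivered only when
    # every message that causally precedes it has already been delivered.
    N = 3                                     # fixed group of N processes

    class Process:
        def __init__(self, pid):
            self.pid = pid
            self.vc = [0] * N                 # count of deliveries per sender
            self.pending = []                 # buffered, not-yet-deliverable

        def send(self, text):
            self.vc[self.pid] += 1
            return (self.pid, list(self.vc), text)

        def _deliverable(self, msg):
            sender, mvc, _ = msg
            if mvc[sender] != self.vc[sender] + 1:   # next one from sender?
                return False
            return all(mvc[i] <= self.vc[i] for i in range(N) if i != sender)

        def receive(self, msg):
            self.pending.append(msg)
            delivered = True
            while delivered:                  # drain everything now ready
                delivered = False
                for m in list(self.pending):
                    if self._deliverable(m):
                        self.pending.remove(m)
                        self.vc[m[0]] = m[1][m[0]]
                        print(f"P{self.pid} delivers {m[2]!r}")
                        delivered = True

    p0, p1, p2 = Process(0), Process(1), Process(2)
    m1 = p0.send("question")
    p1.receive(m1)
    m2 = p1.send("answer")                    # causally after m1
    p2.receive(m2)                            # arrives early: buffered
    p2.receive(m1)                            # now both deliver, in order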

Group communication is especially useful in building fault-tolerant services. For example, a set of separate servers providing the same service is assigned to a group, and all service requests are sent via causally ordered multicasting. All the servers then do exactly the same thing, and if one server fails, it can be removed from the group. This approach is used in the ISIS system.

Time, Memory and Locking Services

Managing time on a distributed system is conceptually difficult. Each machine runs its own clock, and these clocks drift independently. In fact, there is no method to even "initially" synchronize the clocks. Time servers provide a notion of time to any program interested in time, based on one of many clock algorithms (see the section on theoretical foundations). Time services have two functions: providing consistent time information to all processes on the system, and providing a clock synchronization method that ensures all clocks on all systems appear to be logically synchronized.
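As a sketch of what one such synchronization method can look like, the following implements a Cristian-style exchange with a time server, assuming symmetric network delay; time_server here is an in-process stand-in for a real remote request.

    # Cristian-style synchronization: ask the time server for the time and
    # compensate for network delay with half the round-trip time.
    import time

    def time_server():
        return time.time()              # pretend this runs on the server

    def synchronized_time():
        t0 = time.monotonic()
        server_time = time_server()     # one round trip to the server
        rtt = time.monotonic() - t0
        return server_time + rtt / 2    # assumes symmetric delays

    offset = synchronized_time() - time.time()
    print(f"estimated offset from the server clock: {offset:+.6f} s")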

Memory services provide a logically shared memory segment to processes not running on the same machine. The method used for this service is described later. A shared memory server provides the service, and processes can attach to a shared memory segment that is automatically kept consistent by the server.

A process often needs to lock a resource on the network. This is especially true in systems using shared memory. While locking is quite common and simple on single computers, it is not so easy on a network. Thus, networks use a locking service. A locking service is typically a single server process that tracks all locked resources. When a process asks for a lock on a resource, the server grants the lock if the resource is not currently locked; otherwise it makes the requesting process wait until the lock is released.
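A minimal sketch of such a lock server follows. Here threads stand in for network clients; in a real system the acquire and release calls would arrive as messages.

    # A centralized lock server: one table of who holds which resource;
    # requesters block until the resource is released.
    import threading

    class LockServer:
        def __init__(self):
            self._cond = threading.Condition()
            self._held = {}                       # resource -> owner

        def acquire(self, resource, owner):
            with self._cond:
                while resource in self._held:     # lock in use: wait
                    self._cond.wait()
                self._held[resource] = owner      # grant the lock

        def release(self, resource, owner):
            with self._cond:
                if self._held.get(resource) == owner:
                    del self._held[resource]
                    self._cond.notify_all()       # wake waiting requesters

    server = LockServer()
    server.acquire("/shared/segment0", owner="client-A")
    server.release("/shared/segment0", owner="client-A")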

Other Services

A plethora of other services exist in network operating systems. These services can be loosely divided into two classes: (1) services provided by the core network operating system and (2) services provided by applications.

Services provided by the operating system are generally low-level services used by the operating system itself or by applications. These services, of course, vary from one operating system to another. The following is a brief overview of services provided by most operating systems that use the TCP/IP protocol for network communications:

1. Logon services: These include telnet, rlogin, ftp, rsh and other authentication services that allow users on one machine to access facilities of other machines.

2. Mail services: These include SMTP (Simple Mail Transfer Protocol), POP (Post Office Protocol), and IMAP (Internet Message Access Protocol). These services provide the underlying framework for transmitting and accessing electronic mail. The mail application provides a nicer interface to the end user, but uses several of these low-level protocols to actually transmit and receive mail messages (see the sketch after this list).

3. User Services: These include finger, rwho, whois and talk.

4. Publishing services: These include HTTP (Hyper Text Transfer Protocol), NNTP (Network News Transfer Protocol), Gopher and WAIS. These protocols provide the backbone of the Internet information services such as the WWW and the news network.
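To illustrate the mail services of item 2, here is a minimal sketch that hands a message to an SMTP service using Python's standard smtplib. The host, port and addresses are assumptions; it only works if an SMTP server is actually listening there.

    # Handing a message to SMTP for transfer; POP or IMAP would later be
    # used by the recipient's mail application to fetch it.
    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "alice@example.com"         # assumed addresses
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Meeting"
    msg.set_content("Can we meet at 3pm?")

    with smtplib.SMTP("mailhost.example.com", 25) as smtp:   # assumed host
        smtp.send_message(msg)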

Application-defined services, on the other hand, are used by specific applications that run on the network operating system. One of the major attributes of a network operating system is that it can provide support for distributed applications. These application programs span machine boundaries and user boundaries. That is, these applications use resources (both hardware and software) of multiple machines and input from multiple users to perform a complex task. Examples include parallel processing and CSCW (Computer Supported Cooperative Work).

Such distributed applications use the RPC services or object services provided by the underlying system to build services specific to the type of computation being performed. Parallel processing systems use the message passing and RPC mechanisms to provide remote job spawning and distribution of computational workload among all available machines on the network. CSCW applications provide services such as whiteboards and shared workspaces, which can be used by multiple persons at different locations on the network.

A particular, easy-to-understand application is a calendaring program. In calendaring applications, a server maintains information about the appointments and free periods of a set of people. All individuals set up their own schedules using a front-end program, which uploads the data to a server. If a person wants to set up a meeting, he or she can query the server for a list of free periods for a specified set of people. After the server provides some alternatives, the person schedules a particular time and informs all the participants. While the scheduling decision is pending, the server marks the appointment time temporarily unavailable on the calendars of all participating members. Thus, the calendaring application provides its own unique service: the calendar server.

Mechanisms for Network Operating Systems________________

Network operating systems provide three basic mechanisms that are used to support the services provided by the operating system and applications. These mechanisms are (1) message passing, (2) remote procedure calls and (3) distributed shared memory. These mechanisms support a feature called Inter-Process Communication, or IPC. While all the above mechanisms are suitable for all kinds of inter-process communication, RPC and DSM are favored over message passing by programmers.

Message Passing

Message passing is the most basic mechanism provided by the operating system. This mechanism allows a process on one machine to send a packet of raw, uninterpreted bytes to another process.

In order to use the message passing system, a process wanting to receive messages (the receiving process) creates a port (or mailbox). A port is an abstraction for a buffer in which incoming messages are stored. Each port has a unique system-wide address, which is assigned when the port is created. A port is created by the operating system upon a request from the receiving process, and is created at the machine where the receiving process executes. The receiving process may then choose to register the port address with a directory service.

After a port is created, the receiving process can request the operating system to retrieve a message from the port and provide the received data to the process. This is done via a receive system call. If there are no messages in the port, the process is blocked by the operating system until a message arrives. When a message arrives, the process is woken up and is allowed to access the message.

A message arrives at a port, after a process sends a message to that port. The sending process creates the data to be sent and packages the data in a packet. Then it requests the operating system to deliver this message to the particular port, using the address of the port. The port can be on the same machine as the sender, or a machine connected to the same network.
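The port abstraction can be sketched as follows; queue.Queue stands in for the kernel's message buffer, and a real port address would be system-wide rather than an in-process object.

    # A port (mailbox): a buffer of incoming messages with a blocking receive.
    import queue

    class Port:
        def __init__(self, address):
            self.address = address            # unique system-wide address
            self._buffer = queue.Queue()

        def send(self, data: bytes):
            self._buffer.put(data)            # deliver a message to the port

        def receive(self) -> bytes:
            return self._buffer.get()         # blocks until a message arrives

    port = Port(address=("10.0.0.5", 4000))
    port.send(b"hello")
    print(port.receive())                     # b'hello'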

When a message is sent to a port that is not on the same machine as the sender (the most common case), the message traverses a network. The actual transmission of the message uses a networking protocol that provides routing, reliability, accuracy and safe delivery. The most common networking protocol is TCP/IP. Other protocols include IPX/SPX, AppleTalk, NetBEUI, PPTP and so on. Network protocols use techniques such as packetizing, checksums, acknowledgements, gatewaying, routing and flow control to ensure that messages are received correctly and in the order they were sent.

Message passing is the basic building block of distributed systems. Network operating systems use message passing for inter-kernel as well as inter-process communication. Inter-kernel communication is necessary because the operating system on one machine needs to cooperate with the operating systems on other machines to authenticate users, manage files, handle replication and so on.

Programming using message passing is achieved by using the send/receive system calls and the port creation and registration facilities. These facilities are part of the message passing API provided by the operating system. However, programming using message passing is considered a low-level technique that is error-prone and best avoided. This is due to the unstructured nature of message passing: there are no structural restrictions on its usage. Any process can send a message to any port. A process may send messages to a process that is not expecting any. A process may wait for messages from another process when no message will ever originate from that process. Such situations can lead to bugs that are very difficult to detect. Sometimes timeouts are used to escape blocked receive calls when no messages arrive, but a message may actually arrive just after the timeout fires.

Even worse, the messages contain raw data. Suppose a sender sends three integers to a receiver that is expecting one floating-point value. This will cause very strange and often undetected behavior in the programs. Such errors occur frequently due to the complex nature of message passing programs, and hence better mechanisms have been developed for programs that need to cooperate.

Even so, a majority of the software developed for providing services and applications in networked environments uses message passing. Errors are minimized by strictly adhering to a programming style called the client-server programming paradigm. In this paradigm, some processes are pre-designated as servers. A server process consists of an infinite loop. Inside the loop is a receive statement that waits for messages to arrive at a port called the service port. When a message arrives, the server performs the task requested by the message, executes a send call to return results to the requestor, and goes back to listening for new messages.

The other processes are clients. These processes send a message to a server and then wait for a response using a receive. In other words, every send in a client process must be followed by a receive, and every receive at a server process must be followed by a send. Following this scheme significantly reduces timing-related bugs.
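The paradigm can be sketched with TCP sockets as follows. The port number is arbitrary and a production server would add error handling; note how the server is one receive/serve/send loop and the client pairs its send with a receive.

    # Client-server paradigm over TCP: a server loop on a service port.
    import socket, threading, time

    def server():
        with socket.socket() as srv:
            srv.bind(("localhost", 5000))
            srv.listen()
            while True:                             # the server loop
                conn, _ = srv.accept()
                with conn:
                    request = conn.recv(1024)       # receive a request
                    conn.sendall(b"echo: " + request)   # send the response

    threading.Thread(target=server, daemon=True).start()
    time.sleep(0.2)                                 # let the server start

    with socket.socket() as cli:                    # the client: send, receive
        cli.connect(("localhost", 5000))
        cli.sendall(b"ping")
        print(cli.recv(1024))                       # b'echo: ping'

A multi-threaded variant would hand each accepted connection to a worker thread; that is the optimization described in the next paragraph.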

The performance of client-server based programs is, however, poorer than what can be achieved by other, nastier coding techniques. To alleviate this, a multi-threaded server is often used. In a multi-threaded server, several parallel threads can listen on the same port for incoming messages and perform requests in parallel. This yields quicker service response times.

Two better inter-process communication techniques are RPC and DSM, described below.

Remote Procedure Calls (RPC)

Remote Procedure Calls, or RPC, is a method of performing inter-process communication with a familiar, procedure-call-like mechanism. In this scheme, to access remote services, a client makes a procedure call, just like a regular procedure call, but the procedure executes within the context of a different process, possibly on a different machine. The RPC mechanism is similar to the client-server programming style used in message passing. However, unlike message passing, where the programmer is responsible for writing all the communication code, in RPC a compiler automates much of the intricate detail of the communication.

In concept, RPC works as follows. A client process wishes to get service from a server, so it makes a remote procedure call on a procedure defined in the server. In order to do this, the client sends a message to the RPC listening service on the machine where the remote procedure is stored. In the message, the client sends all the parameters needed to perform the task. The RPC listener then activates the procedure in the proper context, lets it run, and returns the results generated by the procedure to the client program.

However, much of this task is automated and not under programmer control. An RPC service is created by a programmer who (let us assume) writes the server program as well as the client program. In order to do this, he or she first writes an interface description using a special language called the Interface Description Language (IDL). All RPC systems provide an IDL definition and an IDL compiler. The interface specification of a server documents all the procedures available in the server, the types of arguments they take and the results they provide.

The IDL compiler compiles this specification into two files, one containing C code to be used for writing the server program and the other containing code used to write the client program. The part for the server contains the definitions (or prototypes) of the procedures supported by the server. It also contains some code called the server loop. To this template, the programmer adds the global variables, private functions and the implementation of the procedures supported by the interface. When the resulting program is compiled, a server is generated. The server loop inserted by the IDL compiler contains code to:

1. Register the service with a name server.

2. Listen for incoming requests (could be via the listening service provided by the operating system).

3. Parse the incoming request and call the appropriate procedure using the supplied parameters. This step requires the extraction of the parameters from the message sent by the client. The extraction process is called unmarshalling. During unmarshalling, some type-checking can also be performed.

4. After the procedure returns, package the return results into a message (marshalling) and send a reply message to the client.

Note that all of the above functionality is automatically inserted into the RPC server by the IDL compiler; the programmer does not have to write any of it.

Then the programmer writes the client. In the client program, the programmer #include's the header file for clients generated by the IDL compiler. This file has the definitions and pseudo-implementations (or proxies) of the procedures that are actually in the server. The client program is written as if the calls to the remote procedures were in fact local procedure calls. When the client program is run, the stubs inserted via the header files play an important role in the execution of the RPCs. When the client process makes a call to a remote procedure, it actually calls a local procedure, which is a proxy for the remote procedure. This proxy procedure (or stub) gets all the arguments passed to it and packages them in some predefined format. This packaging is called marshalling. After the arguments are marshalled, they are sent to the RPC server that handles requests for this procedure. Of course, as described above, the RPC server unmarshals the arguments, runs the procedure and marshals the results. The results flow back to the client, and the proxy procedure gets them. It unmarshals the results and returns control to the calling statement, just like a regular local procedure.
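Marshalling itself can be illustrated with a tiny sketch: a hypothetical add(a, b) call encoded as two 32-bit integers in network byte order. Real IDL-generated stubs handle arbitrary argument types, but the idea is the same.

    # The client stub packs arguments into a machine-independent wire format;
    # the server stub unpacks (unmarshals) them before calling the procedure.
    import struct

    def marshal_add(a: int, b: int) -> bytes:
        return struct.pack("!ii", a, b)     # "!" = network byte order

    def unmarshal_add(payload: bytes):
        return struct.unpack("!ii", payload)

    wire = marshal_add(2, 3)        # what the client stub would transmit
    a, b = unmarshal_add(wire)      # what the server loop recovers
    print(a + b)                    # 5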

One problem remains: how does the client know the address of the server handling a particular procedure call? This function is automated too. The IDL compiler, when compiling an interface definition, obtains a unique number from the operating system and inserts it into both the client stub and the server stub as a constant. The server registers this number, along with its address, with the name service. The client uses this number to look up the server's address in the name service.

The net effect is that a programmer can write a set of server routines which can be used from multiple client processes running on a network of machines. Writing these routines takes minimal effort, and calling them from remote processes is not difficult either. There is no need to write communication routines, or routines to manage arguments and handle type checking. This automation greatly reduces the chance of bugs, and it has led to the acceptance of RPC as the preferred distributed programming tool.

Distributed Shared Memory (DSM)

While message passing and RPC are the mainstays of distributed programming, and are available on all network operating systems, Distributed Shared Memory (DSM) is not at all ubiquitous. On a distributed system, DSM provides a logical equivalent to (real) shared memory, which is normally available only on multiprocessor systems.

Multiprocessor systems have the ability to provide the same physical memory to multiple processors. This is a very useful feature and has been utilized heavily for parallel processing and inter-process communication in multiprocessor machines. While RPC and message passing are also possible on multiprocessor systems, using shared memory for communication and data sharing is more natural and is preferred by most programmers.

While shared memory is naturally available in multiprocessors, due to the physical design of the computer, it is neither available on a distributed system nor was it thought to be possible there. However, the DSM concept has proven that a logical version of shared memory, which works just like the physical version, albeit at reduced performance, is both possible and quite useful.

DSM is a feature by which two or more processes on two or more machines can map a single shared memory segment to their address spaces. This shared segment behaves like real shared memory; that is, any change made by any process to any byte in the shared segment is seen by all the processes that map the segment. Of course, the segment cannot physically be at all the machines at the same time, and updates cannot be propagated instantaneously, due to the limitations of network speed.


DSM is implemented by having a DSM server that stores the shared segment; that is, it holds the data contained in the shared segment. The segment is an integral number of pages. When a process maps the segment to its address space, the operating system reserves the address range in memory and marks the virtual addresses of the mapped pages as inaccessible (via the page table). If the process accesses any page in the shared segment, a page fault is caused. The DSM client is the page fault handler of the process.

The workings of DSM are rather complex due to the enormous number of cases the algorithm has to handle. Modern DSM systems provide intricate optimizations that make the system run faster but are hard to understand. In this section, we discuss a simple, un-optimized DSM system – which if implemented would work, but would be rather inefficient.

DSM works with memory by organizing it as pages (similar to virtual memory systems). The mapped segment is a set of pages. The protection attributes of these pages are set to inaccessible, read-only or read-write:

1. Inaccessible: This denotes that the current version of the page is not available on this machine and the server needs to be contacted before the page can be read or written.

2. Read-only: This denotes that the most recent version of the page is available on this machine, i.e. the process on this machine holds the page in read mode. Other processes may also have the page in read-only mode, but no process has it in write mode. This page can be freely read, but not updated without informing the DSM server.

3. Read-write: This denotes that this machine has the sole, latest version of the page, i.e. the process on this machine holds the page in write mode. No other process has a copy of this page. It can be freely read or updated. However, if this page is needed anywhere else, the DSM server may revoke these privileges by invalidating the page.

The DSM client or page fault handler is activated whenever there is a page fault. When activated, the DSM client first determines whether the page fault was due to a read access or a write access. The two cases are different and are described separately, below:

Read Access Fault: On a read access fault, the DSM client contacts the DSM server and asks for the page in read mode. If there are no clients that have already requested the page in write mode, the server sends the page to the DSM client. After getting the page, the DSM client copies it into the memory of the process, at the correct address, and sets the protection of the page as read-only. It then restarts the process that caused the page fault.

If there is one client already holding the page in write mode (there can be at most one client in write mode) then the server first asks the client to relinquish the page. This is called invalidation. The client relinquishes the page by sending it back to the server and marking the page as inaccessible. After the invalidation is done, the server sends the page to the requesting client, as before.


Write Access Fault: On a write access fault, the DSM client contacts the server and requests the page in write mode. If the page is not currently used in read or write mode by any other process, the server provides a copy of the page to the client. The client then copies the page to memory, sets the protection to read-write and restarts the process.

If the page is currently held by some processes in read or write mode, the server invalidates all these copies of the page. Then it sends the page to the requesting client, which installs it and sets the protection to read-write.
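
The following toy simulation, run in a single process, sketches the page-state transitions of this unoptimized protocol for one page shared by three clients. The names are illustrative; a real DSM client runs inside the page fault handler, and the server runs as a separate network process.

    /* Toy single-process simulation of the unoptimized DSM protocol. */
    #include <stdio.h>

    #define CLIENTS 3
    enum mode { INACCESSIBLE, READ_ONLY, READ_WRITE };
    static enum mode page[CLIENTS];        /* each client's view of one page */

    static void invalidate(int c) {
        page[c] = INACCESSIBLE;            /* client relinquishes its copy */
        printf("  invalidate client %d\n", c);
    }

    /* Read fault: revoke any writer, then grant the page read-only. */
    static void read_fault(int c) {
        for (int i = 0; i < CLIENTS; i++)
            if (page[i] == READ_WRITE) invalidate(i);
        page[c] = READ_ONLY;
        printf("client %d gets page read-only\n", c);
    }

    /* Write fault: revoke every other copy, then grant read-write. */
    static void write_fault(int c) {
        for (int i = 0; i < CLIENTS; i++)
            if (i != c && page[i] != INACCESSIBLE) invalidate(i);
        page[c] = READ_WRITE;
        printf("client %d gets page read-write\n", c);
    }

    int main(void) {
        read_fault(0);     /* clients 0 and 1 share the page in read mode  */
        read_fault(1);
        write_fault(2);    /* client 2's write invalidates both copies     */
        read_fault(0);     /* the page shuttles back when client 0 reads   */
        return 0;
    }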

The net effects of the above algorithm are as follows:

1. Only pages that are used by a process on a machine migrate to that machine.

2. Pages that are read by several processes migrate to the machines those processes are running on; each machine has a copy.

3. Pages that are being updated migrate to the machines they are updated on; however, there is at most one updatable copy of a page at any point in time. If a page is being simultaneously read and updated by two or more machines, it shuttles back and forth between those machines.

Page shuttling is a serious problem in DSM systems, and many algorithms have been devised to prevent it. Effective prevention is achieved by relaxing the memory coherence requirements, for example through release consistency. Careful design of applications can also minimize page shuttling.

The first system to incorporate DSM was Ivy (5). Several DSM packages are available; these include TreadMarks, Quarks, Avalanche and Calypso.

Kernel Architectures_____________________________________

Operating systems have always been constructed (and often still are) using the monolithic kernel approach. The monolithic kernel is a large piece of protected software that implements all the services the operating system has to offer via a system call interface (or API). This approach has some significant disadvantages. The kernel, unlike application programs, is not a sequential program; it is an interrupt-driven program. That is, different parts of the kernel are triggered and made to execute at different (and unpredictable) points in time, due to interrupts. In fact, the entire kernel is interrupt driven.

The net effect of this structure is that:

1. The kernel is hard to program. The dependencies among the independently interrupt-triggered parts are hard to keep track of.

2. The kernel is hard to debug. There is no way of systematically running and testing the kernel; when a kernel is deployed, its parts start executing quite unpredictably.


3. The kernel is crucial. A bug in the kernel causes applications to crash, often mysteriously.

4. The kernel is very timing dependent. Timing errors are very hard to catch because they are not repeatable, and the kernel often contains many such glitches that go undetected.

The emergence of network operating systems saw a sudden, drastic increase in the size of kernels, due to the addition of a whole slew of facilities to the kernel: message passing, protocol handling, network device handling, network file systems, naming systems, RPC handling, time management and so on. It soon became apparent that this bloat led to kernel implementations that were unwieldy, buggy and doomed to fail.

This rise in complexity resulted in the development of an innovative kernel architecture, targeted at network operating systems, called the microkernel architecture. A true microkernel places in the kernel only those features that positively have to be there. These include low-level services such as CPU scheduling, memory management, device drivers and network drivers. A low-level message passing interface is then placed in the kernel, and the user-level API is essentially just the message passing routines.

All other services are built outside the kernel, using server processes. It has been shown that almost every API service and all networking services can be placed outside the kernel. This architecture has some significant benefits, a few of which are listed below:

1. Services can be programmed and tested separately. Changes to a service do not require recompiling the microkernel.

2. All services are insulated from each other; bugs in one service do not affect another service. This is not only a good feature in itself, but it also makes debugging significantly easier.

3. Adding, updating and reconfiguring services are trivial.

4. Many different implementations of the same service can co-exist.

Microkernel operating systems that proved successful include Amoeba (10), Mach (12) and the V-System. A commercial microkernel operating system called Chorus is marketed by Chorus Systems (France).

The advantages of microkernels come at a price, namely performance. The performance of an operating system is an all-important feature that can make or break the usage of the system, especially for commercial systems. Hence, commercial systems typically shun the pure microkernel approach and choose a compromise called the hybrid kernel. A hybrid kernel is a microkernel in spirit, but a monolithic kernel in reality. The Chorus operating system pioneered the hybrid kernel; Windows NT is also a hybrid system. A hybrid system starts as a microkernel. Then, as services are developed and debugged, they are migrated into the kernel. This retains some of the advantages of the microkernel, while the migration of services into the kernel significantly improves performance.


Theoretical Foundations__________________________________

The theoretical study of autonomous but networked computing systems was propelled by the need for algorithms for use in networked environments. This active field of research has produced some interesting and seminal results. Much of the foundational work has resulted in the development of distributed algorithms (20). These algorithms are designed to allow a set of independent processes, running on independent computers (or machines, or nodes), to cooperate and interact to achieve a common goal. Many such algorithms are used for application programming. Some of the algorithms, however, are relevant to the management of distributed systems and are used in network operating systems. In the following sections we present a few algorithms which form the theoretical foundations of network and distributed operating systems. These cover time management, mutual exclusion, checkpointing, deadlock detection, concurrency control, consensus and replication control.

Distributed Clocks

Each physical machine on a network has its own clock, which is a hardware counter. This clock runs freely and cannot be physically synchronized with other clocks. This makes the notion of time on a distributed system hard to define and obtain. The first clock synchronization algorithm provided a method of logically synchronizing clocks such that no application running on the system could ever detect any drift among the physical clocks (even though the clocks do drift). Clocks synchronized using this technique are called Lamport Clocks, after the inventor of the algorithm.

The Lamport Clock algorithm works by stamping a time on every message outgoing from any machine. When the operating system on system Si sends out a message, it stamps it with the time Ti, where Ti is the time according to the physical clock on Si.

Suppose the message is received by the operating system on system Sj. The operating system on Sj checks the timestamp in the message with the time according to the local clock on Sj, i.e. Tj:

· If Ti < Tj then no action is needed.
· If Ti > Tj then the clock on Sj is incremented to Ti + 1.
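
A minimal sketch of this clock-update rule in C follows, with the message transport left out. Note that Lamport's full scheme also ticks the clock on every local event, a detail omitted here to stay close to the description above; the names are illustrative.

    /* Lamport clock bookkeeping for one machine. */
    #include <stdio.h>

    static long local_clock = 10;              /* this machine's clock */

    /* Outgoing messages are stamped with the local time. */
    long stamp_outgoing(void) { return local_clock; }

    /* On receipt, a slow clock is pulled forward past the sender's stamp. */
    void on_receive(long ti) {
        if (ti > local_clock)
            local_clock = ti + 1;
    }

    int main(void) {
        printf("stamp: %ld\n", stamp_outgoing());   /* 10                    */
        on_receive(41);                             /* a faster sender       */
        printf("clock now: %ld\n", local_clock);    /* 42                    */
        on_receive(7);                              /* older stamp: no action */
        printf("clock now: %ld\n", local_clock);    /* still 42              */
        return 0;
    }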

This clock-update rule, at the least, ensures that no messages are received "before" they are sent. However, it also has some interesting side effects. These are:

· All clocks follow the fastest clock.
· The clocks are not physically synchronized, but they are logically synchronized; that is, to all applications running on the systems, the clocks appear completely synchronized.
· If two events on two different machines are transitively related, that is, there is a chain of events from the occurrence of event i to the occurrence of event j, then the time of occurrence of i will always be lower than the time of occurrence of j, even if i and j happened on two different machines with two different clocks.


The Lamport Clock is a very simple algorithm which produces properly synchronized (logical) distributed clocks. However, it has the shortcoming that clocks cannot be "set back", and hence real-time clocks cannot use this method; in fact, setting back a clock will just cause it to race ahead again to catch up with the fastest clock. This problem is solved by the use of vector clocks.

In the vector clock scheme, each system clock is independent and is never updated by the clock algorithm. Every system maintains its own time, and information about the time on other systems. That is, there is a local clock on each system, as well as registers containing some approximation of the time on the sibling systems.

The time is maintained as an n-tuple (or vector), where n is the number of systems on the network, and each machine maintains such an n-tuple. On machine Si, the time vector is Ti. Ti has n fields, and Ti[i] is the local clock time. The other fields are updated in accordance with the following algorithm.

When a message is sent from Si to Sj, the value of Ti is sent along with the message. When Sj receives the message, it updates its time vector Tj by updating each field in Tj to the larger of the values contained in the corresponding fields of Ti and Tj.

Now it can be shown that any two timestamps can be compared using vector clock algebra. Suppose we want to compare two timestamps Ta and Tb. Each has n fields, Ta[0] to Ta[n-1]. The comparison operators are defined below.

Equal: For all i, Ta[i] is equal to Tb[i].
Not equal: For some i, Ta[i] is not equal to Tb[i].
Less than or equal: For all i, Ta[i] is less than or equal to Tb[i].
Not less than or equal: For some i, Ta[i] is not less than or equal to Tb[i].
Less than: (Ta is less than or equal to Tb) and (Ta is not equal to Tb).
Concurrent: not (Ta less than Tb) and not (Tb less than Ta).
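
These operators translate directly into code. Here is a small self-contained sketch, with the vector length fixed at three machines for illustration; the names are illustrative.

    /* Vector clock merge and comparison for N machines. */
    #include <stdio.h>
    #define N 3

    /* On receipt, take the field-wise maximum of the two vectors. */
    void merge(long local[N], const long incoming[N]) {
        for (int i = 0; i < N; i++)
            if (incoming[i] > local[i]) local[i] = incoming[i];
    }

    int leq(const long a[N], const long b[N]) {     /* all fields <=  */
        for (int i = 0; i < N; i++) if (a[i] > b[i]) return 0;
        return 1;
    }
    int eq(const long a[N], const long b[N]) {      /* all fields ==  */
        for (int i = 0; i < N; i++) if (a[i] != b[i]) return 0;
        return 1;
    }
    int less(const long a[N], const long b[N]) { return leq(a, b) && !eq(a, b); }
    int concurrent(const long a[N], const long b[N]) {
        return !less(a, b) && !less(b, a);
    }

    int main(void) {
        long ta[N] = {2, 0, 1}, tb[N] = {1, 3, 1};
        printf("concurrent: %d\n", concurrent(ta, tb));   /* 1               */
        merge(ta, tb);                                    /* ta is now 2,3,1 */
        printf("tb < ta: %d\n", less(tb, ta));            /* 1               */
        return 0;
    }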

The vector clock thus provides all the functions of Lamport Clocks as far as timestamps and event ordering are concerned. It is just as simple to implement, but the time on one machine can be adjusted without affecting the time on other machines.

Distributed Mutual Exclusion

Distributed Mutual Exclusion (DME) is a classic problem in distributed computing. There are n processes executing on n sites. Each process is an infinite loop and has a critical section inside the loop. How do you ensure that at most one process executes within its critical section at any given time?

The easy solution is to use a lock server. Each process asks the lock server for permission to enter. The lock server permits only one process at a time. When a process leaves the critical section, it informs the lock server and the lock server can now allow another process to enter. This solution is called the centralized solution to the DME problem.


The above solution is called centralized because all decisions are made at one site. In a problem such as DME, we can define two sets for each site: site i has a request set Qi and a response set Ri. Qi is the set of sites that i will contact when it wants to enter the critical section; Ri is the set of sites that contact i when they want to enter the critical section.

In order for a mutual exclusion algorithm to be distributed, two rules must apply. These are:

Equal Responsibility Rule: For all i, j, |Qi| = |Qj|.
Equal Effort Rule: For all i, j, |Ri| = |Rj|.

In the centralized case, Qi for every site is just the lock server site, while Ri is empty for every site except the lock server, whose response set contains all sites. Thus, the centralized solution fails the equal effort rule. Many different DME algorithms do meet both rules; Lamport proposed the first such solution. The Lamport algorithm has three steps:

Step 1: When a process wants to enter the critical section, it sends a request message, along with a timestamp, to all processes, including itself. Upon receiving such a message, each process queues the request in timestamp order in a local request queue and sends an acknowledgment. The requesting process waits for all acknowledgments before proceeding.

Step 2: A process can enter when it notices that its own request is the first request in its own local request queue.

Step 3: Whenever a process exits the critical section, it informs all processes, and they remove the exiting process's request from their local request queues.
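
The queue manipulation behind these three steps can be sketched as follows. The simulation runs at a single site and leaves out the request/acknowledgment/release message exchange; all names are illustrative.

    /* Toy sketch of Lamport's DME request-queue logic at one site. */
    #include <stdio.h>
    #include <string.h>

    struct req { long ts; int pid; };
    static struct req queue[16];
    static int qlen = 0;

    /* Step 1: insert a request in (timestamp, pid) order. */
    void enqueue(struct req r) {
        int i = qlen++;
        while (i > 0 && (queue[i-1].ts > r.ts ||
              (queue[i-1].ts == r.ts && queue[i-1].pid > r.pid))) {
            queue[i] = queue[i-1];
            i--;
        }
        queue[i] = r;
    }

    /* Step 2: a process may enter when its request heads the queue. */
    int may_enter(int pid) { return qlen > 0 && queue[0].pid == pid; }

    /* Step 3: on exit, drop the exiting process's request. */
    void release(int pid) {
        for (int i = 0; i < qlen; i++)
            if (queue[i].pid == pid) {
                memmove(&queue[i], &queue[i+1],
                        (qlen - i - 1) * sizeof(struct req));
                qlen--;
                return;
            }
    }

    int main(void) {
        enqueue((struct req){ .ts = 5, .pid = 2 });
        enqueue((struct req){ .ts = 3, .pid = 1 });   /* earlier stamp wins */
        printf("P1 may enter: %d\n", may_enter(1));   /* 1 */
        release(1);
        printf("P2 may enter: %d\n", may_enter(2));   /* 1 */
        return 0;
    }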

The above algorithm meets the equal responsibility and equal effort rules. It uses about 3n messages per entry into the critical section. The number of messages can be reduced to O(sqrt(n)) by using a type of algorithm first proposed by Maekawa. There are now a large number of algorithms, each having some advantage over the others.

Note that in most practical situations the centralized algorithm works better and uses the fewest messages (just two per entry). Thus, it is the most commonly used algorithm.

Distributed Checkpoints

Checkpointing is a method used to restart or debug computations. On a centralized operating system, checkpointing is easy: the process to be checkpointed is stopped and its memory contents are written to a file, after which the process can continue execution. The checkpoint can later be used to restart the process (in case of failure) or to analyze its execution (in case of debugging).

In a networked or distributed system this technique does not work. Consider two processes P1 and P2, where P1 sends messages to P2. We ask both P1 and P2 to stop and checkpoint themselves. P1 does so, continues, and then sends a message to P2. P2 receives the message from P1, then receives the checkpoint notification, and then checkpoints itself. Now if we compare the checkpoints of P1 and P2, we find that P2 has received a message that P1 has not yet sent. This is called an inconsistent checkpoint.

The classic consistent checkpoint algorithm was proposed by Chandy and Lamport and is called the snapshot algorithm. To initiate a checkpoint, a marker message is sent to any one process. When a process gets a marker message for the first time, it checkpoints itself and then sends out marker messages to all the processes it communicates with. If a process receives a marker message after its first one, it ignores the message. It can be shown that the markers eventually disappear, and when they do, all processes have recorded a consistent set of checkpoints.
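
A sketch of the marker rule at one process follows, with the state saving and marker sending stubbed out by printfs. The full Chandy-Lamport algorithm also records the messages in transit on each channel, a detail omitted here as in the simplified description above; the names are illustrative.

    /* Marker handling at one process of the snapshot algorithm. */
    #include <stdio.h>

    static int checkpointed = 0;
    static void save_local_state(void)         { printf("state saved\n");  }
    static void send_marker_to_all_peers(void) { printf("markers sent\n"); }

    /* When a marker arrives: checkpoint on the first one, ignore the rest. */
    void on_marker(void) {
        if (checkpointed) return;
        checkpointed = 1;
        save_local_state();
        send_marker_to_all_peers();
    }

    int main(void) {
        on_marker();   /* first marker: checkpoint and propagate */
        on_marker();   /* duplicate marker: ignored              */
        return 0;
    }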

Many other checkpointing algorithms have, of course, been proposed since then, with characteristics and features beyond those of the basic algorithm outlined above.

Distributed Deadlocks

Resource management in operating systems can lead to deadlocks. A resource is any entity such as a file, a peripheral, memory and so on. Deadlocks occur, for instance, when processes acquire locks on resources. For example, suppose a process P1 locks resource x and then a process P2 locks resource y. Thereafter, P1 requests a lock on y and P2 requests a lock on x. Now neither P1 nor P2 can progress any further, and both must wait forever. This situation is called a deadlock, and it needs to be detected and then resolved by terminating one of the processes. Deadlock detection on centralized systems is easier than deadlock detection on distributed systems.

Consider the following situation, similar to the deadlock described above but in the context of a distributed system. A process P1 requests and obtains a lock on resource x, which is located on machine Mx and hence controlled by a lock server running on Mx. A process P2 then requests and obtains a lock on resource y, which is located on machine My and controlled by a lock server on My. Then P1 requests a lock on y and P2 requests a lock on x.

The above situation is a deadlock; however, the lock servers cannot detect it by themselves. At the lock server on machine Mx, one process (P1) holds a lock on x and another process (P2) has requested a lock on x, which is a perfectly normal, legal situation that is not a deadlock. Similarly, there is no deadlock at machine My. Yet a global or distributed deadlock exists, involving the two lock servers.

In a system consisting of a large number of lock servers and large numbers of processes and resources, the detection of deadlocks becomes a serious issue. Most early distributed deadlock detection algorithms tried to consolidate the resource allocation data from multiple lock servers in order to find deadlocks. Such algorithms proved to be complicated, computationally expensive and prone to detecting deadlocks where none exist (a phenomenon called false deadlocks).

A distributed deadlock detection algorithm by Chandy and Misra was a breakthrough that solved the deadlock problem in a simple fashion. The solution is called the probe algorithm. In this scheme, a process waiting for a resource sends a probe message to the lock server handling the resource. The lock server forwards the probe to the process currently holding the resource. When a process receives a probe and is not itself waiting for a resource, it ignores the probe. If it is waiting for a resource, it forwards the probe to the lock server that controls that resource. If the originator of the probe gets the probe back, there is a deadlock. A careful implementation of this protocol can be shown to be free from the detection of false deadlocks.
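
A toy, single-process simulation of probe forwarding is sketched below. holder[r] names the process holding resource r, and waiting_for[p] names the resource process p waits on; both tables and all names are illustrative.

    /* Toy simulation of the probe algorithm's forwarding rule. */
    #include <stdio.h>

    #define P 3                       /* processes              */
    #define R 3                       /* resources              */
    static int holder[R]      = { 0, 1, 2 };  /* r held by process r      */
    static int waiting_for[P] = { 1, 2, 0 };  /* a cycle: P0->r1, P1->r2,
                                                 P2->r0 (-1 = not waiting) */

    /* Send a probe on behalf of `origin`, starting at the resource it
       awaits. Returns 1 if the probe comes back to the origin: deadlock. */
    int probe(int origin) {
        int r = waiting_for[origin];
        while (r != -1) {
            int p = holder[r];        /* lock server forwards to holder   */
            if (p == origin) return 1;/* probe returned: deadlock         */
            r = waiting_for[p];       /* holder forwards if it waits too  */
        }
        return 0;                     /* probe died: no deadlock          */
    }

    int main(void) {
        printf("deadlock involving P0: %d\n", probe(0));   /* 1 */
        waiting_for[2] = -1;          /* break the cycle */
        printf("deadlock involving P0: %d\n", probe(0));   /* 0 */
        return 0;
    }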

Distributed Concurrency Control

Concurrency control is a mechanism by which the integrity of data is preserved in spite of concurrent access by multiple processes. Concurrency control is necessary in both single-computer systems and distributed systems. In distributed systems the issues are somewhat more complicated, as the data may be stored at many different sites.

Concurrency control ensures serializability. Serializability is the property that the concurrent execution of a set of processes has results equivalent to some serial execution of the same set of processes. It is an important property for any system that handles persistent, interrelated data. Serializability can be provided by many techniques; the two best known are two-phase locking and timestamping.

In the two-phase locking scheme, a process that reads or writes data has to obtain a lock on each data item before it accesses that item, and it may release the lock after the access is over. If multiple data items are accessed, then no lock can be released until all locks have been acquired. This ensures serializable updates to the data.

In the timestamp scheme, all data items bear two timestamps, a read-timestamp and a write-timestamp. All processes (or transactions) also bear timestamps; the process timestamp is the time at which the process was created. The read-timestamp on a data item is the largest of the process timestamps of the processes that have read the item. The write-timestamp is equal to the process timestamp of the process that last wrote the item.

The timestamp protocol works as follows. Suppose a process bearing a timestamp pt wants to read a data item with read-timestamp rt and write-timestamp wt. If pt < wt then the process is aborted or restarted; otherwise it is allowed to read the item, and if pt > rt then the read-timestamp of the item is updated to pt. If the process tries to write a new value to the data item, then pt must be higher than both rt and wt (else the process is aborted). After the write, both the read- and write-timestamps of the data item are set to pt. The timestamp protocol is termed an optimistic protocol, as it has no locking delays: all operations are processed immediately or aborted.
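
The read and write checks of this protocol are small enough to sketch directly; the struct layout and function names below are illustrative.

    /* Timestamp-ordering checks. Returns 1 on success, 0 when the
       process must be aborted or restarted. */
    #include <stdio.h>

    struct item { long rt, wt; };        /* read- and write-timestamps */

    int ts_read(struct item *d, long pt) {
        if (pt < d->wt) return 0;        /* would read an overwritten value */
        if (pt > d->rt) d->rt = pt;      /* remember the newest reader      */
        return 1;
    }

    int ts_write(struct item *d, long pt) {
        if (pt <= d->rt || pt <= d->wt)  /* pt must exceed both timestamps  */
            return 0;
        d->rt = d->wt = pt;
        return 1;
    }

    int main(void) {
        struct item x = { 0, 0 };
        printf("read  by pt=5: %d\n", ts_read(&x, 5));   /* 1; rt becomes 5 */
        printf("write by pt=3: %d\n", ts_write(&x, 3));  /* 0: aborted      */
        printf("write by pt=7: %d\n", ts_write(&x, 7));  /* 1; rt = wt = 7  */
        return 0;
    }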

The two-phase locking and timestamp protocols can be adapted to distributed systems. To implement two-phase locking, one or more distributed lock servers have to be provided. If multiple lock servers are provided, then distributed deadlock detection has to be added. In addition, the two-phase commit protocol may have to be used for consensus.

To make timestamping work in a distributed system, there needs to be a mechanism that provides system-wide unique timestamps. This is possible by using vector clocks as timestamps. Even Lamport clocks can be used, but to ensure uniqueness the identifier of the site that assigns the timestamp is appended to the end of the timestamp.

Distributed Consensus

Consensus is a problem unique to distributed systems. The reason is that distributed systems are composed of separate autonomous systems that need to cooperate, and when they cooperate there is often a need to agree on something. Suppose there is a file containing the value 0 (zero) on three machines, and a process wants to update the value to 1 on all three. It tells servers on all three machines to do it. The servers now want to ensure that either all of them do it or none of them do it (to preserve consistency). So they need to agree (or arrive at a consensus) to either perform the operation (flip the 0 to 1) or abort the operation (leave it as 0).

In theory, it can be shown that consensus in a distributed system is impossible to achieve if there is any chance of losing messages on the network. The proof is quite involved, but consider the following conversation:

Machine 1 to Machine 2: Flip the bit from 0 to 1, and tell me when you are done so that I will flip it too.

Machine 2 to Machine 1: OK, I have flipped it. But, please acknowledge this message, or else I will think you did not get my reply and you chose not to flip – in which case I will flip mine back to 0.

Machine 1 to Machine 2: Everything is fine. Got your message. But, please acknowledge this message, as I need to know that you got this message, or you may flip the bit back.

Machine 2 to Machine 1: Got it. But now I need another acknowledgment, to ensure......

As is obvious, this bickering continues forever. It can be shown that there is no finite sequence of messages that achieves consensus, even if no messages are actually lost, as long as there is a possibility of a message getting lost.

In reality, however, there is a need for consensus, and the impossibility result is not a deterrent. Many systems just assume messages are not lost and thus implement consensus trivially (machine 1 tells machine 2 to flip the bit and assumes it will be done). In more critical applications, the two-phase commit protocol is used.

The two-phase commit protocol works as follows. A machine is selected as the leader (e.g. the one that started the process that made the updates) and the rest of the machines are cohorts. The leader tells all the cohorts to "flip the bit". Each of them flips it, retains a copy of the old value, and sends an OK to the leader. This is called the pre-commit phase; at this point, all the cohorts have both the old value and the new value. After all the OKs are received, the leader sends a commit message, which causes all the cohorts to install the new (flipped) value. If some OKs are not received, the leader tells all the cohorts to abort, that is, to reinstall the old value. It can be shown that this protocol (with some extensions for failure handling) works in most cases of message loss and machine failure.
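
A failure-free run of the protocol can be sketched as a single-process simulation; real cohorts are separate machines exchanging messages, and all names here are illustrative.

    /* Toy, failure-free two-phase commit over three simulated cohorts. */
    #include <stdio.h>

    #define COHORTS 3
    static int value[COHORTS];           /* the replicated bit         */
    static int shadow[COHORTS];          /* old value kept for abort   */

    static int precommit(int c, int v) {
        shadow[c] = value[c];            /* retain the old value       */
        value[c] = v;                    /* tentatively flip the bit   */
        return 1;                        /* "OK" back to the leader    */
    }

    void two_phase_commit(int v) {
        int oks = 0;
        for (int c = 0; c < COHORTS; c++)        /* phase 1: pre-commit */
            oks += precommit(c, v);
        if (oks == COHORTS)                      /* phase 2: commit...  */
            printf("commit: all cohorts keep %d\n", v);
        else                                     /* ...or roll back     */
            for (int c = 0; c < COHORTS; c++) value[c] = shadow[c];
    }

    int main(void) { two_phase_commit(1); return 0; }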

Replication Control

In distributed systems, data is often replicated; that is, multiple copies of the same data are stored on multiple sites. This is done for reliability, performance, or both. Performance is enhanced if regularly accessed data is scattered over the network rather than kept in one place, since this evens out the access load. In addition, if one site holding the data fails, the data is still available from the other sites. Replication works very well for read-only data; but to be useful, replication should also work with read-write data. Replication control protocols ensure that replicated read-write data remains consistent in spite of failures. There are many such protocols; a few are outlined below.

Read one, write all: In this scheme, a reader can read from any copy, but a writer has to update all copies. If not all copies are available, the writer cannot update. This is the most commonly used scheme.

Primary copy: A variant of the above; read from any copy, but write to the primary copy. The machine holding the primary copy then propagates the update.

Read majority, write majority: If there are N copies, then read N/2+1 copies and take the value from the most recent of them. Writing to any N/2+1 copies is good enough.

Voting: Each copy has a certain number of votes, and the total number of votes is v. Choose a read quorum r and a write quorum w such that r + w = v + 1. Now, to access the data, find enough copies that their total vote is equal to (or greater than) r for reading, or w for writing.
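
A sketch of the quorum arithmetic, assuming one vote per copy unless weighted otherwise; the function name is illustrative.

    /* Quorum check for voting-based replication control. */
    #include <stdio.h>

    /* Any read quorum must intersect any write quorum: r + w must exceed v. */
    int quorums_ok(int v, int r, int w) { return r + w > v; }

    int main(void) {
        int v = 5;                                     /* five copies, one
                                                          vote each        */
        printf("r=2, w=4: %d\n", quorums_ok(v, 2, 4)); /* 1: valid choice  */
        printf("r=2, w=3: %d\n", quorums_ok(v, 2, 3)); /* 0: a read could
                                                          miss the latest
                                                          write            */
        return 0;
    }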

Depending on the read traffic, the write traffic, and the failure probabilities, one of the above protocols is chosen. Note that voting is a general protocol: setting the votes of each copy to 1, r to 1, and w to N makes it the read-one-write-all protocol, and it can similarly mimic the majority protocol. There are other protocols that are more general than voting (such as quorum consensus).

Use In Routers__________________________________________

Network operating systems are also embedded in routers and hardware firewalls, where they operate the functions of the network layer (layer 3) of the OSI model. Examples:

JUNOS, used in routers and switches from Juniper Networks.

Cisco IOS (formerly "Cisco Internetwork Operating System").

TiMOS, used in routers from Alcatel-Lucent.


Huawei VRP (Versatile Routing Platform), used in routers from Huawei.

MikroTik RouterOS™, a router operating system and software which turns a regular Intel PC or MikroTik RouterBOARD™ hardware into a dedicated router.

ZyNOS, used in network devices made by ZyXEL.

ExtremeXOS (also called EXOS), used in network devices made by Extreme Networks.

Types of Network Operating Systems_______________________

Nearly all modern networks are a combination of the two types described below, peer-to-peer and client/server. The networking design can be considered independently of the servers and workstations that will share it.

Peer-to-Peer

Peer-to-peer network operating systems allow users to share resources and files located on their computers and to access shared resources found on other computers. All systems act as both users of resources and providers of resources, but no one system is dedicated to a single function.

Peer-to-peer networks are generally best suited to small networks and are usually less expensive than client/server networks. However, they do not have a file server or a centralized management source. In a peer-to-peer network, all computers are considered equal; they all have the same ability to use the resources available on the network. Peer-to-peer networks are designed primarily for small to medium local area networks. Nearly all modern desktop operating systems, such as Mac OS X, Linux, and Windows, can function as peer-to-peer network operating systems. With operating systems such as Windows 98, XP, Vista or Windows 7, each client machine can also be a server. The OS might still be considered a network operating system, but it is more lightweight than a full-blown NOS.

Examples:

AppleShare, used for networks connecting Apple products.


Windows for Workgroups, used for networking peer-to-peer Windows computers.

Advantages:
· Ease of setup.
· Less hardware needed; no server needs to be purchased.

Disadvantages:
· No central location for storage.
· Lacks the security that a client/server type offers.

Client/Server

Client/server network operating systems allow the network to centralize functions and applications in one or more dedicated file servers. The file servers become the heart of the system, providing access to resources and providing security. Individual workstations (clients) have access to the resources available on the file servers; each machine is either a user of network resources or a provider of resources. Client/server networks are typically more expensive and more robust than peer-to-peer networks and generally support the building of larger networks. The network operating system provides the mechanism to integrate all the components of the network and allows multiple users to simultaneously share the same resources irrespective of physical location. UNIX/Linux and the Microsoft family of Windows Servers are examples of client/server network operating systems.

Examples: Novell NetWare, Windows Server.


[Figure: The Software in a Network Client. Shows the various software components that reside in a user's client workstation in a network.]

[Figure: The Software in a Network Server. Shows the network operating system and various system software components in a network server.]


Advantages:
· Centralized servers are more stable.
· Security is provided through the server.
· New technology and hardware can be easily integrated into the system.
· Servers can be accessed remotely from different locations and types of systems.

Disadvantages:
· The cost of buying and running a server is high.
· Dependence on a central location for operation.
· Requires regular maintenance and updates.

Security Issues Involved In Using Client/Server

In a client/server network, security issues may arise at three different locations: the client, the network, and the server. All three points need to be monitored for unauthorized activity and secured against hackers and eavesdroppers.

The Client

The client is the end user of the network and needs to be secured the most. The client end usually exposes data through the screen of the computer. Client connections to the server should be secured through passwords, and upon leaving their workstations users should make sure that their connection to the server is securely cut off, so that no hackers or intruders are able to reach the server data. Securing not only the workstation's connection to the server but also the files on the workstation (client) is important, as it ensures that no hackers are able to reach the system. Another possibility is that of introducing a virus or running unauthorized software on the client workstation, thus threatening the entire information bank at the server (Exforsys Inc., 2007).

The users themselves can also be a security threat if they purposely leave their IDs logged in or use weak IDs and passwords that invite hacking. Users may also share their passwords, giving hackers access to confidential data (Wilson, Lin, & Craske, 1999). This can be countered by giving a password to each client, regularly asking users to change their passwords, and checking passwords for guessability, strength and uniqueness.

The Network


The network allows transmission of data from the clients to the server. There are several points on the network where a hacker could eavesdrop or steal important packets of information. These packets may contain important confidential data such as passwords or company details. It is important that these networks are secured properly to keep unauthorized parties away from all the data stored on the server. This can be done by encrypting important data being sent on the network. However, encryption may not be the only possible way of protecting networks, as hackers can work their way around encryption. Another method is conducting security audits regularly and ensuring identification and authorization of individuals at all points along the network; this should discourage potential hackers (Wilson, Lin, & Craske, 1999). Making the entire environment difficult to impersonate also ensures that the clients are reaching the true files and applications on the server and that the server is providing information to authorized personnel only.

The Server

The server can be secured by placing all the data in a secure, centralized location that is protected by permitting access to authorized personnel only. Virus protection should also be installed on server computers, as vast amounts of data can be infected. Regular upgrades should be applied to the servers, as the software and the applications need to be kept up to date. Even the entire body of data on a server can be encrypted to ensure that reaching the data would require excessive time and effort (Wilson, Lin, & Craske, 1999).

Divisions of Network Operating Systems____________________

Local Area Network

The local area network (LAN) is the type of computer network that covers a small area, such as a home, an office, a school, or a group of buildings. The crucial characteristics of LANs are high data transfer rates, a small area covered, and no need for leased telecommunication lines. Wi-Fi and Ethernet (over unshielded twisted-pair cable) are the two most widespread technologies in use at present, but Token Ring, ARCNET and many others were used in the past.

The local area network was first put into practice in 1964 at the Livermore Laboratory to support research on atomic weapons. The use of local area networks in corporations increased at the end of the 1970s, when they were brought into play to create high-speed connections between a number of large central computers at one site.

Features:

Even though Ethernet is at present the most widely used data link layer protocol, and IP the most common network layer protocol, many diverse alternatives have also been in use, and a number of them continue to be accepted in a good number of places.


Smaller local area networks usually consist of one or more switches connected to one another, with one of them in many cases connected to a cable modem, DSL modem, or router for Internet access.

Larger local area networks are distinguished by the use of redundant links, with controls such as the spanning tree protocol to prevent loops; by the ability to separate traffic via VLANs; and by the ability to manage diverse traffic types via quality of service.

It is possible to connect one LAN to another with the help of leased services, leased lines, or "tunnels" across the Internet using VPN technology.

Wide Area Network

A wide area network (WAN) is a type of computer network that covers a broad area (i.e., any network whose communication links cross metropolitan, regional, or national boundaries).

Features:

Less formally, it is a type of network that uses public communication links and routers.

In contrast to campus area networks (CANs), local area networks (LANs), personal area networks (PANs) and metropolitan area networks (MANs), which are more often restricted to a campus, a building or room, or a precise metropolitan area (e.g., a state capital) respectively, WANs cover a much larger area.

The Internet is the biggest and best-known example of a WAN. LANs and other types of networks can be connected together with the help of WANs, so that computers and users in one area can communicate with computers and users in other areas.

A large number of WANs are designed for one specific organization and are the private property of that organization.

Other WANs are built by Internet service providers to provide connections from an organization's LAN to the Internet. WANs are often built on the platform of leased lines. At each end of the leased line, a router is placed to connect to the LAN on one side and to a hub within the WAN on the other.

These leased lines can be very costly, so in place of leased lines, the less expensive packet switching or circuit switching techniques can also be adopted to build WANs.

Transport and addressing functions are delivered by network protocols such as TCP/IP.

To improve performance, WAN optimization or wide area file services (WAFS) are also used in certain cases.

System Features of Network Operating Systems_______________

Amoeba


Amoeba, developed at Vrije University, is an operating system using a microkernel design, supporting very fast message passing, and designed to utilize processor farms. A processor farm is a set of rack-mounted single-board computers connected by regular networking (Ethernet). Amoeba makes the collection of machines look like one fast timesharing system. It also provides support for threads, RPC, group communication, and all other facilities needed for networking. Amoeba supports a parallel programming language called Orca.

Clouds

Clouds, developed at Georgia Tech, is a system designed to support large-grained persistent objects. Each object is an address space that is backed up on disk and hence is persistent. The system paradigm uses a thread-object model, where threads are distributed and can access objects via a modified RPC mechanism: an object invocation causes the thread to move between address spaces rather than use a server for processing the RPC request. The entire system is supported on top of a low-level Distributed Shared Memory mechanism, thus making all objects available at all computers. Services are built into objects and can be accessed using the RPC mechanism. Message passing is not supported at the API level. Clouds has been used for research in reliability, transaction processing, replication and distributed debugging.

Mach

Mach, developed at Carnegie-Mellon University, is a Unix compatible operating system that is built on a microkernel. The microkernel supports message passing, tasks and threads. Mach supports an innovative user-level external paging system that causes messages to be sent to a paging process whenever there is a page-fault generated by a user process. These external pagers allowed Mach to support a variety of emulation features. The Unix operating system is supported on top of Mach as a user-level process, providing the Unix service. Mach is also heavily customizable, making it an ideal platform for research with operating systems.

MAC OS AppleShare

AppleShare was a product from Apple Computer which implemented various network services. Its main purpose was acting as a file server, using the AFP protocol. Other network services implemented in later versions of AppleShare included a print server using the Printer Access Protocol (PAP), web server, electronic mail server, and SMB / CIFS server to support file sharing to Microsoft Windows clients.

Earlier versions of AppleShare supported only the AppleTalk network transport protocol but later versions, sold under the name AppleShare IP, allowed use of the TCP/IP protocol stack, as used on most modern networks. AppleShare provided three different protocols for application-layer services: AppleShare File Server, AppleShare Print Server and AppleShare PC.


AppleShare would operate with any physical network medium. Early installations used mainly LocalTalk, and more recently Ethernet, but any physical medium could be used which could be directly or indirectly connected to an AppleShare server system.

Equivalent third party server products include the open-source Netatalk suite on Unix-like systems, and Services for Macintosh on Microsoft Windows NT and 2000. Versions of Mac OS from System 7 onwards included Personal File Sharing, which is a more limited AFP implementation. The most obvious difference between Personal File Sharing and AppleShare is that the former supports only a small number of concurrent remote users.

All versions of Mac OS are capable of acting as a client to an AppleShare server (via AFP and later SMB) over AppleTalk and TCP/IP protocols, although more recent versions of OS X have gradually removed support for AppleTalk in favor of the standard TCP/IP. Third-party vendors created client software such as PC MACLAN (discontinued) and DAVE to implement client functionality on Windows systems. Other developers offered server software that provided similar functionality on Windows servers, such as GroupLogic ExtremeZ-IP and Cyan Software MacServerIP, and Netatalk on Linux. Later versions of AppleShare also implemented the SMB and CIFS protocols, which are the native file sharing protocols on Windows machines.

Apple discontinued the AppleShare product line following the release of OS X Server which provides equivalent functionality.

AppleShare provides network services for the Mac OS operating systems. AppleShare supports file and printer sharing over several types of physical networks by using one of the AppleTalk transport protocols: LocalTalk, EtherTalk, TokenTalk or FDDITalk. As the Internet has grown in popularity, TCP/IP software has been developed for Apple Macintosh computers, along with the standard TCP/IP client applications like Web browsers and FTP file transfer clients. AppleShare's peer-to-peer networking is used in small or moderately sized workgroup settings, but it has not been adopted widely in large-scale corporate LAN environments.

Novell Netware

NetWare is a computer network operating system developed by Novell, Inc. It initially used cooperative multitasking to run various services on a personal computer, with network protocols based on the archetypal Xerox Network Systems stack.

The original NetWare product in 1983 supported clients running both CP/M and MS-DOS, ran over a proprietary star network topology and was based on a Novell-built file server using the Motorola 68000 processor. The company soon moved away from building its own hardware, and NetWare became hardware-independent, running on any suitable Intel-based IBM PC compatible system and a wide range of network cards. From the beginning, NetWare implemented a number of features inspired by mainframe and minicomputer systems that were not available in its competitors.

In the early 1990s, Novell introduced separate, cheaper networking products unrelated to classic NetWare. These were NetWare Lite 1.0 (NWL) and, later, Personal NetWare 1.0 (PNW) in 1993.

In 1993 the main product line took a dramatic turn when Version 4 introduced NetWare Directory Services (NDS), a global directory service broadly similar to Microsoft's Active Directory, released seven years later. This, along with a new e-mail system (GroupWise), an application configuration suite (ZENworks) and a security product (BorderManager), was targeted at the needs of large enterprises.

By 2000, however, Microsoft was making increasing inroads into Novell's customer base, and Novell increasingly looked to a future based on a Linux kernel.

The successor product to NetWare, Open Enterprise Server, was released in March 2005. OES offers all the services previously hosted by NetWare v6.5, and added the choice of delivering those services using either a NetWare v6.5 or SUSE Linux Enterprise Server v9 kernel.

Novell NetWare 4.X features NetWare Directory Services (NDS), which allows a user to log on from anywhere on the network and access the same resources regardless of where the user logs on. This product has become Novell's migration strategy for the future. Novell NetWare does not actually provide a computer operating system for client workstations; it supplies an operating system for a dedicated server computer, plus networking support for existing client workstation operating systems.

Sprite

Sprite, developed at University of California – Berkeley (13), is an operating system that provides a single system image to a cluster of workstations. Much of the focus of research with Sprite has been directed at improving file system performance. As a result, Sprite provides a very high performance file system through client and server caching. It has process migration to take advantage of idle machines. It was used as a testbed for research in log-structured file systems, striped file systems, crash recovery, and RAID file systems.

Unix/Linux

Unix is a commercial product of Unix Systems Laboratories. Various other companies sell variants of Unix under other trade names, the most well known being SunOS/Solaris. SunOS was the first system to provide a commercial, robust, full-featured network file system (NFS). Linux is a free Unix-compatible operating system. The kernel of Unix is monolithic, and most network-based services are added as separate user processes. Unix is an older operating system, adapted for network use. Due to the prevalence of Unix in research institutions, services developed for networking are developed on Unix platforms first. Hence, everything is available for Unix, though not necessarily from the commercial providers of Unix. Unix is the mainstay of network operating systems in the academic and research communities.

Most UNIX networking services are also available for Microsoft Windows operating systems, with a few exceptions:

1. Network File System (NFS) - NFS provides hard disk sharing over TCP/IP networks. It is the primary file and directory sharing protocol used in UNIX. NFS clients are available for Microsoft Windows operating systems, but they are generally used for interoperation with UNIX hosts.

2. Remote login services (RLOGIN and TELNET) - UNIX has always been a truly multi-user, multi-session operating system. Multitasking of applications and processes is extremely robust, much more so than under Windows NT. In addition, many users can run private sessions simultaneously under UNIX by logging in to a host using a character-based terminal or emulated terminal over a serial or network connection. The RLOGIN and TELNET protocols support logging in over network connections.

3. Graphical user interface windowing system (X Windows) - X Windows is a completely distributed graphical user interface system. Using X Windows, a user can execute an application on one computer and let that application interact with a user on a different computer over a network connection. X Windows allows computers to share their video displays, keyboards and pointing devices with applications running on other computers. There is no requirement that the application execution and display hosts have the same type of hardware or run the same operating system, as long as the client application and the display server support the X Windows protocol. Windows NT has recently included support for graphical session-based terminal servers that offer some functionality similar to X Windows, but with much less flexibility. In addition, X Windows server software is available for Microsoft Windows operating systems, allowing applications running on UNIX hosts to be displayed on computers running Microsoft Windows.

V-System

The V-System, developed at Stanford University, is a microkernel operating system with support for fast message passing. Services are added to V by running user-level servers. The innovative use of low latency protocols for inter-machine messaging provides V with excellent performance on a networked environment. Also innovative is the uniform support for input-output, a capability based naming scheme and the clean design of the kernel.

Windows for Workgroups/Windows 95/Windows NT Server

Windows for Workgroups/Windows 95

Windows for Workgroups, introduced in the early 90s, and Windows 95, introduced in 1995, are considered peer-to-peer networking systems and do not have the capabilities of true internetworking operating systems. They are, however, inexpensive and more than adequate for small workgroups wanting to share resources, use email, and connect to the Internet. Windows for Workgroups and Windows 95 both offer peer-to-peer network protocols (Windows NT will be discussed later). The protocols used by these operating systems allow users to share files and devices over LANs. Both offer NetBEUI, Microsoft's small network protocol. They also offer the TCP/IP and IPX/SPX protocols to access the network through either a dial-up connection/modem or directly through a NIC.

Windows 95

Windows 95 is a consumer-oriented graphical user interface-based operating system. It was released on August 24, 1995 by Microsoft, and was a significant progression from the company's previous Windows products. During development, it was referred to as Windows 4.0 or by the internal codename Chicago.

Windows 95 integrated Microsoft's formerly separate MS-DOS and Windows products. It featured significant improvements over its predecessor, Windows 3.1, most notably in the graphical user interface (GUI) and in its relatively simplified "plug-n-play" features. There were also major changes made at lower levels of the operating system, such as moving from a mainly 16-bit architecture to a pre-emptively multitasked 32-bit architecture.

Accompanied by an extensive marketing campaign, Windows 95 was a major success in the marketplace at launch and shortly became the most popular desktop operating system. It also introduced numerous functions and features that carried over to later Windows versions, such as the taskbar, the Start button, and the way the user navigates. It was also suggested that Windows 95 had the effect of driving other major players (including OS/2) out of business, something which would later be used in court against Microsoft.

Some three years after its introduction, Windows 95 was succeeded by Windows 98.

NetBEUI protocols, while not routable, are more than adequate to meet small LAN needs. NetBEUI protocols are easy to use and do not require in-depth networking knowledge. NetBEUI software identifies computer devices by name, and it is certainly easier to remember that a computer's name is Juanita or Justin than 141.252.20.2 or 141.252.10.1. Each device name in a network must be unique. NetBEUI software installed on each of the networked computers is all that is necessary to configure devices in order to share resources and create a network. If a small company does want to connect to the Internet, the necessary software and protocols are available with these operating systems.

Shared resources on Windows for Workgroups/95 networks are protected by a single password per resource, and there is only one level of access: either you have access or you don’t. Per-user passwords are not part of these protocols unless Windows NT is present. This means that anyone connected to the network who knows a resource's password has access to that resource, which creates security issues: there is no way to revoke a user's access once he or she knows the password. As the network grows, it usually becomes more difficult to keep resource passwords secure.
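To make the all-or-nothing model concrete, here is a minimal Python sketch (the share names and passwords are invented for illustration): access is decided purely by knowing the share's password, with no notion of which user is asking.

```python
# A minimal sketch of share-level security as in Windows for Workgroups/95:
# one password per resource, and access is all or nothing. Names invented.
share_passwords = {"LASERJET": "print123", "SALESDOCS": "qw3rty"}

def can_access(share: str, password: str) -> bool:
    # Whoever knows the share's password gets in; the system never asks
    # *who* is connecting, so per-user control is impossible.
    return share_passwords.get(share) == password

print(can_access("SALESDOCS", "qw3rty"))   # True, for any user on the LAN
print(can_access("SALESDOCS", "guess"))    # False
```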

Since there is no central control, managing these peer-to-peer networks becomes an issue when the network becomes too large.

Windows NT Server

Windows NT is a family of operating systems produced by Microsoft, the first version of which was released in July 1993. It was a powerful high-level-language-based, processor-independent, multiprocessing, multiuser operating system with features comparable to Unix. It was intended to complement consumer versions of Windows that were based on MS-DOS. NT was the first fully 32-bit version of Windows, whereas its consumer-oriented counterparts, Windows 3.1x and Windows 9x, were 16-bit/32-bit hybrids. Windows 2000, Windows XP, Windows Server 2003, Windows Vista, Windows Home Server, Windows Server 2008, Windows 7, Windows Server 2008 R2, Windows 8, Windows Phone 8, Windows RT and Windows Server 2012 are members of the Windows NT family, although they are not branded using the name "Windows NT".

Windows NT Server was first introduced in the mid-90s and is capable of managing workgroups just like Windows for Workgroups/95. It is a client/server network operating system that uses routable protocols, making it a true internetworking operating system that enables network administrators to connect LANs to WANs. Windows NT Server also provides services for OS/2 and Novell NetWare clients and is able to run on various microprocessors, including Intel Pentium, DEC Alpha, MIPS, and PowerPC based systems. This operating system has a hybrid kernel; that is, the inner core of the operating system follows microkernel technology, but the services do not run at user level. Services are added to Windows NT as modules called DLLs (dynamic-link libraries). The operating system is extensible and allows for a variety of pluggable modules at the level of device drivers and kernel extensions, as well as services at the user level. Windows NT provides many of the services described in this article in a commercial product and competes with the various forms of Unix in the marketplace. Windows NT can also run applications written for DOS, Windows 3.1, and Windows 95, all of which are quite different operating systems. For network use, Windows NT provides file service, name service, replication service, RPC service, and messaging using several protocols.
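As a small illustration of the DLL-as-module idea, the Python sketch below (Windows-only, using the standard ctypes module) loads a system DLL by name and calls one of its exported functions; NT's own services are loaded into the operating system in a conceptually similar way.

```python
# A minimal sketch (Windows-only): system facilities live in DLLs that are
# loaded by name and expose callable entry points.
import ctypes

kernel32 = ctypes.WinDLL("kernel32")             # load a system DLL by name
kernel32.GetTickCount.restype = ctypes.c_uint32  # declare the export's return type
print("ms since boot:", kernel32.GetTickCount()) # call one of its exports
```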

Windows NT Server has all of the advantages mentioned for the other Windows operating systems, plus several features that make it more robust. Security on Windows NT allows a network administrator not only to set passwords on resources but also to grant access to individual users or groups. This operating system does require a more powerful server computer whose sole function is to administer the NOS. Having one server where all access data is stored makes managing a Windows NT Server network efficient.

Windows NT Server has more than one level of security beyond access/no access. This NOS offers (see the sketch after the list):

· no access.
· access that restricts the user to read-only capabilities.
· access that allows read and write usage.
· access that allows the user to change access permissions for network users.
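Here is a minimal sketch of these graded permissions, with invented user and resource names: each (resource, user) pair maps to one of the four levels, and a request succeeds only if the granted level covers the needed one.

```python
# A minimal sketch of user-level security with graded access, as in
# Windows NT Server. Users, resources, and grants are illustrative.
from enum import IntEnum

class Access(IntEnum):
    NONE = 0          # no access
    READ = 1          # read-only
    READ_WRITE = 2    # read and write
    FULL_CONTROL = 3  # may also change other users' permissions

grants = {
    ("SALESDOCS", "juanita"): Access.FULL_CONTROL,
    ("SALESDOCS", "justin"):  Access.READ,
}

def allowed(resource: str, user: str, needed: Access) -> bool:
    return grants.get((resource, user), Access.NONE) >= needed

print(allowed("SALESDOCS", "justin", Access.READ))        # True
print(allowed("SALESDOCS", "justin", Access.READ_WRITE))  # False
```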

Each user who wishes to access services on the network must have a password and a user account set up within the domain. A domain is a security model where the database of user accounts is stored on one or more computers known as domain controllers. What this does is centralize control of the network. The network administrator creates, deletes, and manages these accounts and passwords using the User Manager for Domains program that comes with Windows NT Server.

For security reasons, companies often have two servers capable of authenticating passwords, one that acts as the Primary Domain Controller (PDC) and the other as the Backup Domain Controller (BDC). The PDC is a computer on the network that maintains a database of users and security policies for the domain. This database contains the names of users and the resources they may access. User access may be defined for individual files or for entire directories. Note that although the PDC is often a server computer, this is not necessary. Network services can be implemented on one computer and the PDC functions on another.

A BDC maintains a copy of the PDC database. This backup database prevents network disruption in the event of failure of the PDC. If a failure occurs, the backup domain controller can easily be promoted to act as the PDC.
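The sketch below models the replication-and-promotion idea in a few lines of Python (the class and account data are invented): the BDC keeps a copy of the PDC's account database, so logons can continue when the primary fails.

```python
# A minimal sketch of the PDC/BDC relationship. All names are illustrative.
class DomainController:
    def __init__(self, role: str):
        self.role = role
        self.accounts: dict[str, str] = {}   # user -> credentials

    def replicate_to(self, backup: "DomainController") -> None:
        backup.accounts = dict(self.accounts)  # BDC holds a copy of the database

pdc = DomainController("PDC")
pdc.accounts["juanita"] = "<password hash>"
bdc = DomainController("BDC")
pdc.replicate_to(bdc)

# If the PDC fails, the backup is promoted and authentication continues.
bdc.role = "PDC"
print(bdc.role, list(bdc.accounts))
```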

With Windows NT Server, networks can connect to other domains. There are four basic domain models:

Single Domain Model - In the single domain model, there is only one defined set of security and user accounts. This means that all management functions are centralized.

Master Domain Model - In this model, there is one master domain server that has the defined set of security and user account data of all other domain servers. However, each of these has only the specific security data for one domain. A user on one domain cannot get permissions not found on her/his domain server. This improves on the single domain model since it allows segmentation of networks, which increases security measures, yet still allows centralized management of network user accounts and accessibility.

Multiple Master Domain Model - The Multiple Master Domain Model has several master domain servers, each with their own specific domains. In this model, network management becomes somewhat decentralized, and network administrators cannot maintain the entire network from one central domain server.

Multiple Trust Domain Model - This model is really a peer-to-peer relationship among domain servers; therefore it becomes decentralized, and security is essentially the same as with Windows for Workgroups and Windows 95.

Related Topics__________________________________________

Distributed Operating Systems

Distributed operating systems are network operating systems with significantly more integration between the autonomous operating systems running on each machine. A distributed operating system is hence able to provide services that are beyond the capability of network operating systems. A few of the additional facilities are summarized below:

Dynamic Distributed Data Placement: A data item or file is located close to where it is used, and its location changes dynamically as its usage pattern changes. The logical location (such as a file being in one particular directory) is not an indicator of its physical location. For example, a directory may contain three files, but at some point in time the files may be located on three different machines.

Process Scheduling: When a process is started, it is not necessarily started on the same machine as its parent; the process scheduler decides where to start it. The chosen machine may be the machine with the lightest load, or a machine that is close to the data the process will be accessing (a placement sketch follows this list).

Process Migration: Processes may move from machine to machine automatically, depending upon their data access patterns or resource needs, or simply for load balancing.

Fault Tolerance: Failures of sites do not affect computations. Failed computations are automatically restarted, and inaccessible data is made available through replicated copies. Users connected to the failed machine are transparently relocated.
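As promised above, here is a minimal placement sketch in Python (machine names and load figures are invented): a new process is simply started on whichever machine currently reports the lightest load.

```python
# A minimal sketch of load-based process placement in a distributed OS.
# In a real system the loads would be reported by the machines themselves.
loads = {"node-a": 0.82, "node-b": 0.15, "node-c": 0.47}  # current load per machine

def choose_host() -> str:
    return min(loads, key=loads.get)   # the lightest-loaded machine wins

print("start the new process on:", choose_host())  # node-b
```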

Distributed Parallel Processing Systems

The bastion of parallel processing used to be large, expensive machines called parallel processors. The advent of network operating systems has shifted the focus of parallel processing platforms to cheaper hardware: a network of smaller machines. Parallel processing involves splitting a large task into smaller units, each of which can be executed on a separate processor, concurrently. This method uses more hardware, but it makes the task run faster and complete sooner. Parallel processing is essential in applications such as weather forecasting, space exploration, image processing, large database handling, and many scientific computations.
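Here is a minimal Python sketch of the split-and-run-concurrently idea, using the standard multiprocessing module: a large summation is divided into chunks, each chunk runs on a separate processor, and the partial results are combined.

```python
# A minimal sketch: split one large task into chunks, run them on separate
# processors, then combine the partial results.
from multiprocessing import Pool

def partial_sum(chunk: range) -> int:
    return sum(chunk)   # the work done by one processor

if __name__ == "__main__":
    chunks = [range(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
    with Pool(processes=4) as pool:
        total = sum(pool.map(partial_sum, chunks))  # run the chunks concurrently
    print(total)  # same answer as sum(range(1_000_000)), computed in parallel
```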

Parallel processing on network operating systems uses toolkits, also known as middleware, which sit between the application and the operating system and manage the control flow and the data flow. A particularly popular package is PVM (Parallel Virtual Machine). PVM augments the message-passing system provided by the operating system with simpler-to-use primitives that allow control of spawning processes on remote machines, transmission of data to those machines, and collection of the results of the computations. Another package with similar characteristics is MPI. An interesting system that takes a radically different approach to parallel processing is Linda. Linda integrates the notions of work and data into a unified concept called the tuple-space. The tuple-space contains work tuples and data tuples. Processes called workers run on many machines and access the tuple-space to get work, to get input, and to store results. Some recent parallel processing systems use distributed shared memory to hold the data, mimicking the facilities available on large parallel processors. Such systems are easier to program, as they insulate the programmer from the idiosyncrasies of data placement and data transmission. TreadMarks is a product that provides a high-performance distributed shared memory system using a method called release consistency. Calypso is another system that supports easy-to-program parallel processing and also provides load balancing and fault tolerance at no additional cost. Calypso uses a manager-worker model that creates a logical parallel processor and can dynamically change the number of workers depending upon physical network characteristics. Other systems in use include Amber, Avalanche, GLU, P4, Piranha, and Quarks.
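To give the tuple-space idea some shape, here is a single-process Python sketch (a real Linda system distributes the space across machines and lets workers on many hosts draw from it): workers withdraw work tuples and deposit result tuples.

```python
# A minimal, single-process sketch of Linda's tuple-space model. A real
# system distributes the space over the network; this only shows the
# in()/out() pattern.
from queue import Queue

work: Queue = Queue()      # work tuples
results: Queue = Queue()   # result tuples

for n in range(5):
    work.put(("work", n))                  # out(): deposit a work tuple

def worker() -> None:
    while not work.empty():
        _tag, n = work.get()               # in(): withdraw a work tuple
        results.put(("result", n, n * n))  # out(): deposit the result

worker()
print([results.get() for _ in range(results.qsize())])
```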
