# MCA Assignment (Semester 2 + 3 Full) Sikkim Manipal University, SMU

Post on 03-Dec-2014

113 views

Category:

## Documents

TRANSCRIPT

Master of Computer Application (MCA) Semester 2MC0066 OOPS using C++ 4 Credits (Book ID: B0681 & B0715)

Assignment Set 1MC0066 OOPS using C++ (Book ID: B0681 & B0715)

1. Write a program in C++ for matrix multiplication. The program should accept the dimensions of both the matrices to be multiplied and check for compatibility with appropriate messages and give the output. void main() { int m1[10][10],i,j,k,m2[10][10],add[10][10], printf("Enter number of rows and columns of first matrix MAX 10\n"); scanf("%d%d",&r1,&c1); printf("Enter number of rows and columns of second matrix MAX 10\n"); scanf("%d%d",&r2,&c2); if(r2==c1) { printf("Enter rows and columns of First matrix \n"); printf("Row wise\n"); for(i=0;ilink=temp; Retrun temp;}

Insert a node at the front end: Consider the list contains 4 nodes and a pointer variable last contains address of the last node. 20 45 10 80

temp itself as the last node and establish a link between the first node and the last node. Repeatedly insert the items using the above procedure to create a list. Function to insert an item at the front end of the list. NODE insert_front(int item, NODE last) { NODE temp; Temp=getnode(); Temp->info=item; If(last==NULL) Last=temp; Else temp->link=last->link; Last->link=temp; Return last; } 3. a). Compare and contrast DFS and BFS and DFS+ID approaches b). Discuss how Splay Tree differs from a Binary Tree?Justify your answer with ex. a) Once youve identified a problem as a search problem, its important to choose the right type of search. Here are some information which may be useful while taking the decision. Search Time Space When to use Must search tree anyway, know the level the DFS O(c^k) O(k) answers are on, or you arent looking for the shallowest number. Know answers are very near top of tree, or BFS O(c^d) O(c^d) want shallowest answer. Want to do BFS, dont have enough space, DFS+ID O(c^d) O(d) and can spare the time. d is the depth of the answer k is the depth searched d = 0}

Constraint in UML is usually written in curly braces using any computer or natural language including English, so that it might look as: {account balance is positive} UML prefers OCL as constraint language. 5. What are the important factors that need to be considered to model a system from different views. Modeling Different Views of a System:When you model a system from different views, you are in effect constructing your system simultaneously from multiple dimensions. By choosing the right set of views, you set up a process that forces you to ask good questions about your system and to expose risks that need to be attacked. If you do a poor job of choosing these views or if you focus on one view at the expense of all others, you run the risk of hiding issues and deferring problems that will eventually destroy any chance of success. To model u system from different views, Decide which views you need to best express the architecture of your system and to expose the technical risks to your project. The five views of an architecture described earlier are a good starting point. For each of these views, decide which artifacts you need to create to capture the essential details of that view. For the most part, these artifacts will consist of various UML diagrams. As part of your process planning, decide which of these diagrams youll want to put under some sort of formal or semi formal control. These are the diagrams for which youll want to schedule reviews and to preserve as documentation for the project. Allow room for diagrams that are thrown away. Such transitory diagrams are still useful for exploring the implications of your decisions and for experimenting with changes. For example, if you are modeling a simple monolithic application that runs on a single machine, you might need only the following handful of diagrams. Use case view Use ease diagrams Design view Class diagrams (for structural modeling) Interaction diagrams (for behavioral modeling) Process view None required Implementation view None required Deployment view None required If yours is a reactive system or if it focuses on process flow, youll probably want to include statechart diagrams and activity diagrams, respectively, to model your systems behavior.

Figure 6-3 illustrates a simple class diagram specifying an instantiation of the chain of responsibility pattern. This particular instantiation involves three classes: Client, EventHandler, and GUIEventHandler. Client and EventHandler are shown as abstract classes, whereas GUI EventHandler is concrete. EventHandler has the usual operation expected of this pattern (handleRequest), although two private attributes have been added for this instantiation.

Figure : Forward Engineering

Master of Computer Application (MCA) Semester 2 MC0070 Operating Systems with Unix 4 Credits

Assignment Set 1MC0070 Operating Systems with Unix (Book ID: B0682 & B0683)1. Describe the concept of process control in Operating systems. :A process can be simply defined as a program in execution. Process along with program code, comprises of program counter value, Processor register contents, values of variables, stack and program data. A process is created and terminated, and it follows some or all of the states of process transition; such as New, Ready, Running, Waiting, and Exit. Process is not the same as program. A process is more than a program code. A process is an active entity as oppose to program which considered being a passive entity. As we all know that a program is an algorithm expressed in some programming language. Being a passive, a program is only a part of process. Process, on the other hand, includes: Current value of Program Counter (PC) Contents of the processors registers Value of the variables The process stack, which typically contains temporary data such as subroutine parameter, return address, and temporary variables. A data section that contains global variables. A process is the unit of work in a system. A process has certain attributes that directly affect execution, these include: PID - The PID stands for the process identification. This is a unique number that defines the process within the kernel. PPID - This is the processes Parent PID, the creator of the process. UID - The User ID number of the user that owns this process. EUID - The effective User ID of the process. GID - The Group ID of the user that owns this process. EGID - The effective Group User ID that owns this process. Priority - The priority that this process runs at. To view a process you use the ps command. # ps -l F S UID PID PPID C PRI NI P SZ:RSS WCHAN TTY TIME COMD 30 S 0 11660 145 1 26 20 * 66:20 88249f10 ttyq 6 0:00 rlogind The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory.

The S field: The S field is the state of the process, the two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available. PID field: The PID shows the Process ID of each process. This value should be unique. Generally PID are allocated lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process such as the KILL signal. PRI field: This stands for priority field. The lower the value the higher the value. This refers to the process NICE value. It will range from 0 to 39. The default is 20, as a process uses the CPU the system will raise the nice value. P flag: This is the processor flag. On the SGI this refers to the processor the process is running on. SZ field: This refers to the SIZE field. This is the total number of pages in the process. Each page is 4096 bytes. TTY field: This is the terminal assigned to your process. Time field: The cumulative execution time of the process in minutes and seconds. COMD field: The command that was executed. In Process model, all software on the computer is organized into a number of sequential processes. A process includes PC, registers, and variables. Conceptually, each process has its own virtual CPU. In reality, the CPU switches back and forth among processes. 2. Describe the following: A.) Layered Approach B.) Micro Kernels C.) Virtual Machines A) Layered Approach: With proper hardware support, operating systems can be broken into pieces that are smaller and more appropriate than those allowed by the original MS-DOS or UNIX systems. The operating system can then retain much greater control over the computer and over the applications that make use of that computer. Implementers have more freedom in changing the inner workings of the system and in creating modular operating systems. Under the top-down approach, the overall functionality and features are determined and the separated into components. Information hiding is also important, because it leaves programmers free to implement the low-level routines as they see fit, provided that the external interface of the routine stays unchanged and that the routine itself performs the advertised task. A system can be made modular in many ways. One method is the layered approach, in which the operating system is broken up into a number of layers (levels). The bottom layer (layer 0) id the hardware; the highest (layer N) is the user interface.

Users File Systems Inter-process Communication I/O and Device Management Virtual Memory Primitive Process Management Hardware B) Micro-kernels: We have already seen that as UNIX expanded, the kernel became large and difficult to manage. In the mid-1980s, researches at Carnegie Mellon University developed an operating system called Mach that modularized the kernel using the microkernel approach. This method structures the operating system by removing all nonessential components from the kernel and implementing then as system and user-level programs. The result is a smaller kernel. There is little consensus regarding which services should remain in the kernel and which should be implemented in user space. Typically, however, micro-kernels provide minimal process and memory management, in addition to a communication facility. Device File Client . Virtual Drivers Server Process Memory Microkernel Hardware C) Virtual Machine: The layered approach of operating systems is taken to its logical conclusion in the concept of virtual machine. The fundamental idea behind a virtual machine is to abstract the hardware of a single computer (the CPU, Memory, Disk drives, Network Interface Cards, and so forth) into several different execution environments and thereby creating the illusion that each separate execution environment is running its own private computer. By using CPU Scheduling and Virtual Memory techniques, an operating system can create the illusion that a process has its own processor with its own (virtual) memory. Normally a process has additional features, such as system calls and a file system, which are not provided by the hardware. The Virtual machine approach does not provide any such additional functionality but rather an interface that is identical to the underlying bare hardware. Each process is provided with a (virtual) copy of the underlying computer.

3. Memory management is important in operating systems. Discuss the main problems that can occur if memory is managed poorly. The part of the operating system which handles this responsibility is called the memory manager. Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system. Virtual memory refers to the technology in which some space in hard disk is used as an extension of main memory so that a user program need not worry if its size extends the size of the main memory. For paging memory management, each process is associated with a page table. Each entry in the table contains the frame number of the corresponding page in the virtual address space of the process. This same page table is also the central data structure for virtual memory mechanism based on paging, although more facilities are needed. It covers the Control bits, Multi-level page table etc. Segmentation is another popular method for both memory management and virtual memory Basic Cache Structure : The idea of cache memories is similar to virtual memory in that some active portion of a low-speed memory is stored in duplicate in a higher-speed cache memory. When a memory request is generated, the request is first presented to the cache memory, and if the cache cannot respond, the request is then presented to main memory. Content-Addressable Memory (CAM) is a special type of computer memory used in certain very high speed searching applications. It is also known as associative memory, associative storage, or associative array, although the last term is more often used for a programming data structure. In addition to the responsibility of managing processes, the operating system must efficiently manage the primary memory of the computer. The part of the operating system which handles this responsibility is called the memory manager. Since every process must have some amount of primary memory in order to execute, the performance of the memory manager is crucial to the performance of the entire system. Nutt explains: The memory manager is responsible for allocating primary memory to processes and for assisting the programmer in loading and storing the contents of the primary memory. Managing the sharing of primary memory and minimizing memory access time are the basic goals of the memory manager.

The real challenge of efficiently managing memory is seen in the case of a system which has multiple processes running at the same time. Since primary memory can be space-multiplexed, the memory manager can allocate a portion of primary memory to each process for its own use. However, the memory manager must keep track of which processes are running in which memory locations, and it must also determine how to allocate and de-allocate available memory when new processes are created and when old processes complete execution. While various different strategies are used to allocate space to processes competing for memory, three of the most popular are Best fit, Worst fit, and First fit. Each of these strategies are described below: Best fit: The allocator places a process in the smallest block of unallocated memory in which it will fit. For example, suppose a process requests 12KB of memory and the memory manager currently has a list of unallocated blocks of 6KB, 14KB, 19KB, 11KB, and 13KB blocks. The best-fit strategy will allocate 12KB of the 13KB block to the process. Worst fit: The memory manager places a process in the largest block of unallocated memory available. The idea is that this placement will create the largest hold after the allocations, thus increasing the possibility that, compared to best fit, another process can use the remaining space. Using the same example as above, worst fit will allocate 12KB of the 19KB block to the process, leaving a 7KB block for future use. First fit: There may be many holes in the memory, so the operating system, to reduce the amount of time it spends analyzing the available spaces, begins at the start of primary memory and allocates memory from the first hole it encounters large enough to satisfy the request. Using the same example as above, first fit will allocate 12KB of the 14KB block to the process.

Notice in the diagram above that the Best fit and First fit strategies both leave a tiny segment of memory unallocated just beyond the new process. Since the amount of memory is small, it is not likely that any new processes can be loaded here. This condition of splitting primary memory into segments as the memory is allocated and deallocated is known as fragmentation. The Worst fit strategy attempts to reduce the problem of fragmentation by allocating the largest fragments to new processes. Thus, a larger amount of space will be left as seen in the diagram above.

Another way in which the memory manager enhances the ability of the operating system to support multiple process running simultaneously is by the use of virtual memory. According the Nutt, virtual memory strategies allow a process to use the CPU when only part of its address space is loaded in the primary memory. In this approach, each processs address space is partitioned into parts that can be loaded into primary memory when they are needed and written back to secondary memory otherwise. Another consequence of this approach is that the system can run programs which are actually larger than the primary memory of the system, hence the idea of virtual memory. Brookshear explains how this is accomplished: Suppose, for example, that a main memory of 64 megabytes is required but only 32 megabytes is actually available. To create the illusion of the larger memory space, the memory manager would divide the required space into units called pages and store the contents of these pages in mass storage. A typical page size is no more than four kilobytes. As different pages are actually required in main memory, the memory manager would exchange them for pages that are no longer required, and thus the other software units could execute as though there were actually 64 megabytes of main memory in the machine. In order for this system to work, the memory manager must keep track of all the pages that are currently loaded into the primary memory. This information is stored in a page table maintained by the memory manager. A page fault occurs whenever a process requests a page that is not currently loaded into primary memory. To handle page faults, the memory manager takes the following steps: The memory manager locates the missing page in secondary memory. The page is loaded into primary memory, usually causing another page to be unloaded. The page table in the memory manager is adjusted to reflect the new state of the memory. The processor re-executes the instructions which caused the page fault.

4. Discuss the following: A) File Substitution B) I/O Control A) File Substitution and I/O Control: It is important to understand how file substitution actually works. In the previous examples, the ls command doesnt do the work of file substitution the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example, \$ echo p*

p10 p101 p11 When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command. The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command \$ ls LINES.* PAGES.* as \$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files! Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the \$ ls .* command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is \$ ls . .. .profile which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually \$ ls * However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is \$ ls .

5. Discuss the concept of File substitution with respect to managing data files in UNIX. It is important to understand how file substitution actually works. In the previous examples, the ls command doesnt do the work of file substitution the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example, \$ echo p* p10 p101 p11 When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command. The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command \$ ls LINES.* PAGES.* as \$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files! Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the \$ ls .* command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is

\$ ls . .. .profile which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually \$ ls * However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is \$ ls .

Assignment Set 2MC0070 Operating Systems with Unix Book ID: B0682 & B0683

this is conceptually easy but often hard to implement in practice because it assumes that a process knows what resources it will need in advance. 2. Discuss the following with respect to I/O in Operating Systems: a. I/O Control Strategies b. Program - controlled I/O c. Interrupt - controlled I/O a. I/O Control Strategies: Several I/O strategies are used between the computer system and I/O devices, depending on the relative speeds of the computer system and the I/O devices. The simplest strategy is to use the processor itself as the I/O controller, and to require that the device follow a strict order of events under direct program control, with the processor waiting for the I/O device at each step. Another strategy is to allow the processor to be ``interrupted'' by the I/O devices, and to have a (possibly different) ``interrupt handling routine'' for each device. This allows for more flexible scheduling of I/O events, as well as more efficient use of the processor. (Interrupt handling is an important component of the operating system.) A third general I/O strategy is to allow the I/O device, or the controller for the device, access to the main memory. The device would write a block of information in main memory, without intervention from the CPU, and then inform the CPU in some way that that block of memory had been overwritten or read. This might be done by leaving a message in memory, or by interrupting the processor. (This is generally the I/O strategy used by the highest speed devices -- hard disks and the video controller.) b. Program - controlled I/O: One common I/O strategy is program-controlled I/O, (often called polled I/O). Here all I/O is performed under control of an ``I/O handling procedure,'' and input or output is initiated by this procedure. The I/O handling procedure will require some status information (handshaking information) from the I/O device (e.g., whether the device is ready to receive data). This information is usually obtained through a second input from the device; a single bit is usually sufficient, so one input ``port'' can be used to collect status, or handshake, information from several I/O devices. (A port is the name given to a connection to an I/O device; e.g., to the memory location into which an I/O device is mapped). An I/O port is usually implemented as a register (possibly

a set of D flip flops) which also acts as a buffer between the CPU and the actual I/O device. The word portis often used to refer to the buffer itself. Typically, there will be several I/O devices connected to the processor; the processor checks the ``status'' input port periodically, under program control by the I/O handling procedure. If an I/O device requires service, it will signal this need by altering its input to the ``status'' port. When the I/O control program detects that this has occurred (by reading the status port) then the appropriate operation will be performed on the I/O device which requested the service. A typical configuration might look somewhat as shown in Figure . The outputs labeled ``handshake in'' would be connected to bits in the ``status'' port. The input labeled ``handshake in'' would typically be generated by the appropriate decode logic when the I/O port corresponding to the device was addressed.

c. Interrupt - controlled I/O:

Interrupt-controlled I/O reduces the severity of the two problems mentioned for program-controlled I/O by allowing the I/O device itself to initiate the device

implementation (i.e., a separate routine for each device) that this method is almost universally used on modern computer systems. Some processors have a number of special inputs for vectored interrupts (each acting much like the simple interrupt described earlier). Others require that the interrupting device itself provide the interrupt address as part of the process of interrupting the processor.

In order for deadlock to occur, four conditions must be true. Mutual exclusion Each resource is either currently allocated to exactly one process or it is available. (Two processes cannot simultaneously control the same resource or be in their critical section). Hold and Wait processes currently holding resources can request new resources No preemption Once a process holds a resource, it cannot be taken away by another process or the kernel. Circular wait Each process is waiting to obtain a resource which is held by another process.

The dining philosophers problem discussed in an earlier section is a classic example of deadlock. Each philosopher picks up his or her left fork and waits for the right fork to become available, but it never does. Deadlock can be modeled with a directed graph. In a deadlock graph, vertices represent either processes (circles) or resources (squares). A process which has acquired a resource is show with an arrow (edge) from the resource to the process. A process which has requested a resource which has not yet been assigned to it is modeled with an arrow from the process to the resource. If these create a cycle, there is deadlock.The deadlock situation in the above code can be modeled like this.

This graph shows an extremely simple deadlock situation, but it is also possible for a more complex situation to create deadlock. Here is an example of deadlock with four processes and four resources.

There are a number of ways that deadlock can occur in an operating situation. We have seen some examples, here are two more. Two processes need to lock two files, the first process locks one file the second process locks the other, and each waits for the other to free up the locked file. Two processes want to write a file to a print spool area at the same time and both start writing. However, the print spool area is of fixed size, and it fills up before either process finishes writing its file, so both wait for more space to become available 5. What do you mean by a Process? What are the various possible states of Process? Discuss. A process under unix consists of an address space and a set of data structures in the kernel to keep track of that process. The address space is a section of memory that contains the code to execute as well as the process stack. The kernel must keep track of the following data for each process on the system:

the address space map, the current status of the process, the execution priority of the process, the resource usage of the process, the current signal mask, the owner of the process. A process has certain attributes that directly affect execution, these include: PID - The PID stands for the process identification. This is a unique number that defines the process within the kernel. PPID - This is the processes Parent PID, the creator of the process. UID - The User ID number of the user that owns this process. EUID - The effective User ID of the process. GID - The Group ID of the user that owns this process. EGID - The effective Group User ID that owns this process. riority - The priority that this process runs at. To view a process you use the ps command. # ps -l F S UID PID PPID C PRI NI P SZ:RSS WCHAN TTY TIME COMD 30 S 0 11660 145 1 26 20 * 66:20 88249f10 ttyq 6 0:00 rlogind The F field: This is the flag field. It uses hexadecimal values which are added to show the value of the flag bits for the process. For a normal user process this will be 30, meaning it is loaded into memory. The S field: The S field is the state of the process, the two most common values are S for Sleeping and R for Running. An important value to look for is X, which means the process is waiting for memory to become available. PID field: The PID shows the Process ID of each process. This value should be unique. Generally PID are allocated lowest to highest, but wrap at some point. This value is necessary for you to send a signal to a process such as the KILL signal. PRI field: This stands for priority field. The lower the value the higher the value. This refers to the process NICE value. It will range form 0 to 39. The default is 20, as a process uses the CPU the system will raise the nice value. P flag: This is the processor flag. On the SGI this refers to the processor the process is running on. SZ field: This refers to the SIZE field. This is the total number of pages in the process. Each page is 4096 bytes. TTY field: This is the terminal assigned to your process. Time field: The cumulative execution time of the process in minutes and seconds. COMD field: The command that was executed. The fork() System Call The fork() system call is the basic way to create a new process. It is also a very unique system call, since it returns twice(!) to the caller. This system call causes the current process to be split into two processes - a parent process, and a child

process. All of the memory pages used by the original process get duplicated during the fork() call, so both parent and child process see the exact same image. The only distinction is when the call returns. When it returns in the parent process, its return value is the process ID (PID) of the child process. When it returns inside the child process, its return value is 0. If for some reason this call failed (not enough memory, too many processes, etc.), no new process is created, and the return value of the call is -1. In case the process was created successfully, both child process and parent process continue from the same place in the code where the fork() call was used. #include /* defines fork(), and pid_t. */ #include /* defines the wait() system call. */ //storage place for the pid of the child process, and its exit status.

spid_t child_pid; int child_status; child_pid = fork(); /* lets fork off a child process */ switch (child_pid) /* check what the fork() call actually did */ { case -1: /* fork() failed */ perror("fork"); /* print a system-defined error message */ exit(1); case 0: /* fork() succeeded, were inside the child process */ printf("hello worldn"); exit(0); //here the CHILD process exits, not the parent. default: /* fork() succeeded, were inside the parent process */ wait(&child_status); /* wait till the child process exits */ } /* parents process code may continue here */

6. Explain the working of file substitution in UNIX. Also describe the usage of pipes in UNIX Operating system. It is important to understand how file substitution actually works. In the previous examples, the ls command doesnt do the work of file substitution the shell does. Even though all the previous examples employ the ls command, any command that accepts filenames on the command line can use file substitution. In fact, using the simple echo command is a good way to experiment with file substitution without having to worry about unexpected results. For example, \$ echo p* p10 p101 p11 When a metacharacter is encountered in a UNIX command, the shell looks for patterns in filenames that match the metacharacter. When a match is found, the shell substitutes the actual filename in place of the string containing the metacharacter so that the command sees only a list of valid filenames. If the shell finds no filenames that match the pattern, it passes an empty string to the command.

The shell can expand more than one pattern on a single line. Therefore, the shell interprets the command \$ ls LINES.* PAGES.* as \$ ls LINES.dat LINES.idx PAGES.dat PAGES.idx There are file substitution situations that you should be wary of. You should be careful about the use of whitespace (extra blanks) in a command line. If you enter the following command, for example, the results might surprise you:

What has happened is that the shell interpreted the first parameter as the filename LINES. with no metacharacters and passed it directly on to ls. Next, the shell saw the single asterisk (*), and matched it to any character string, which matches every file in the directory. This is not a big problem if you are simply listing the files, but it could mean disaster if you were using the command to delete data files! Unusual results can also occur if you use the period (.) in a shell command. Suppose that you are using the \$ ls .* command to view the hidden files. What the shell would see after it finishes interpreting the metacharacter is \$ ls . .. .profile which gives you a complete directory listing of both the current and parent directories. When you think about how filename substitution works, you might assume that the default form of the ls command is actually \$ ls * However, in this case the shell passes to ls the names of directories, which causes ls to list all the files in the subdirectories. The actual form of the default ls command is \$ ls . The find Command One of the wonderful things about UNIX is its unlimited path names. A directory can have a subdirectory that itself has a subdirectory, and so on. This provides great flexibility in organizing your data. Unlimited path names have a drawback, though. To perform any operation on a file that is not in your current working directory, you must have its complete path name. Disk files are a lot like flashlights: You store them in what seem to be perfectly logical places, but when you need them again, you cant remember where you put them. Fortunately, UNIX has the find command.

The find command begins at a specified point on a directory tree and searches all lower branches for files that meet some criteria. Since find searches by path name, the search crosses file systems, including those residing on a network, unless you specifically instruct it otherwise. Once it finds a file, find can perform operations on it. Suppose you have a file named urgent.todo, but you cannot remember the directory where you stored it. You can use the find command to locate the file. \$ find / -name urgent.todo -print /usr/home/stuff/urgent.todo The syntax of the find command is a little different, but the remainder of this section should clear up any questions. The find command is different from most UNIX commands in that each of the argument expressions following the beginning path name is considered a Boolean expression. At any given stop along a branch, the entire expression is true file found if all of the expressions are true; or false file not found if any one of the expressions is false. In other words, a file is found only if all the search criteria are met. For example, \$ find /usr/home -user marsha -size +50 is true for every file beginning at /usr/home that is owned by Marsha and is larger than 50 blocks. It is not true for Marshas files that are 50 or fewer blocks long, nor is it true for large files owned by someone else. An important point to remember is that expressions are evaluated from left to right. Since the entire expression is false if any one expression is false, the program stops evaluating a file as soon as it fails to pass a test. In the previous example, a file that is not owned by Marsha is not evaluated for its size. If the order of the expressions is reversed, each file is evaluated first for size, and then for ownership. Another unusual thing about the find command is that it has no natural output. In the previous example, find dutifully searches all the paths and finds all of Marshas large files, but it takes no action. For the find command to be useful, you must specify an expression that causes an action to be taken. For example, \$ find /usr/home -user me -size +50 -print /usr/home/stuff/bigfile /usr/home/trash/bigfile.old first finds all the files beginning at /usr/home that are owned by me and are larger than 50 blocks. Then it prints the full path name. The argument expressions for the find command fall into three categories: Search criteria Action expressions Search qualifiers Although the three types of expressions have different functions, each is still considered a Boolean expression and must be found to be true before any further evaluation of the entire expression can take place. (The significance of this is discussed later.) Typically, a find operation consists of one or more search criteria, a single action expression, and perhaps a search qualifier. In other words, it finds a file and takes some action, even if that action is simply to print

the path name. The rest of this section describes each of the categories of the find options. Search Criteria The first task of the find command is to locate files according to some userspecified criteria. You can search for files by name, file size, file ownership, and several other characteristics. Finding Files with a Specific Name: -name fname Often, the one thing that you know about a file for which youre searching is its name. Suppose that you wanted to locateand possibly take some action on all the files named core. You might use the following command: \$ find / -name core -print This locates all the files on the system that exactly match the name core, and it prints their complete path names. The -name option makes filename substitutions. The command \$ find /usr/home -name "*.tmp" -print prints the names of all the files that end in .tmp. Notice that when filename substitutions are used, the substitution string is enclosed in quotation marks. This is because the UNIX shell attempts to make filename substitutions before it invokes the command. If the quotation marks were omitted from "*.tmp" and if the working directory contained more than one *.tmp file, the actual argument passed to the find command might look like this: \$ find /usr/home -name a.tmp b.tmp c.tmp -print This would cause a syntax error to occur.

Master of Computer Application (MCA) Semester 3 Software Engineering 4 Credits 0071

Assignment Set 1MC0071 - Software Engineering (Book ID: B0808 & B0809)

1. What is the importance of Software Validation, in testing?

Software Validation Also known as software quality control. Validation checks that the product design satisfies or fits the intended usage (high-level checking) i.e., you built the right product. This is done through dynamic testing and other forms of review. According to the Capability Maturity Model (CMMI-SW v1.1),

Verification: The process of evaluating software to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase. [IEEE-STD-610].

Validation: The process of evaluating software during or at the end of the development process to determine whether it satisfies specified requirements. [IEEE-STD-610] In other words, validation ensures that the product actually meets the users needs, and that the specifications were correct in the first place, while verification is ensuring that the product has been built according to the requirements and design specifications. Validation ensures that you built the right thing. Verification ensures that you built it right. Validation confirms that the product, as provided, will fulfill its intended use.

From testing perspective:

Fault wrong or missing function in the code. Failure the manifestation of a fault during execution. Malfunction according to its specification the system does not meet its specified functionality. Within the modeling and simulation community, the definitions of validation, verification and accreditation are similar:

Validation is the process of determining the degree to which a model, simulation, or federation of models and simulations, and their associated data are accurate representations of the real world from the perspective of the intended use(s). Accreditation is the formal certification that a model or simulation is acceptable to be used for a specific purpose. Verification is the process of determining that a computer model, simulation, or federation of models and simulations implementations and their associated data accurately represents the developers conceptual description and specifications. 2. Explain the following concepts with respect to Software Reliability: A) Software Reliability Metrics B) Programming for Reliability Software reliability Metrics: Metrics which have been used for software reliability specification are shown in Figure 3.1 shown below .The choice of which metric should be used depends on the type of system to which it applies and the requirements of the application domain. For some systems, it may be appropriate to use different reliability metrics for different sub-systems.

Reliability matrix In some cases, system users are most concerned about how often the system will fail, perhaps because there is a significant cost in restarting the system. In those cases, a metric based on a rate of failure occurrence (ROCOF) or the mean time to failure should be used. In other cases, it is essential that a system should always meet a request for service because there is some cost in failing to deliver the service. The number of failures in some time period is less important. In those cases, a metric based on the probability of failure on demand (POFOD) should be used. Finally, users or system operators may be mostly concerned that the system is available when a request for service is made. They will incur some loss if the system is unavailable. Availability (AVAIL). Which takes into account repair or restart time, is then the most appropriate metric. There are three kinds of measurement, which can be made when assessing the reliability of a system: 1) The number of system failures given a number of systems inputs. This is used to measure the POFOD. 2) The time (or number of transaction) between system failures. This is used to measure ROCOF and MTTF. 3) The elapsed repair or restart time when a system failure occurs. Given that the system must be continuously available, this is used to measure AVAIL. Time is a factor in all of this reliability metrics. It is essential that the appropriate time units should be chosen if measurements are to be meaningful. Time units, which may be used, are calendar time, processor time or may be some discrete unit such as number of transactions.

Programming for Reliability: There is a general requirement for more reliable systems in all application domains. Customers expect their software to operate without failures and to be available when it is required. Improved programming techniques, better programming languages and better quality management have led to very significant improvements in reliability for most software. However, for some systems, such as those, which control unattended machinery, these normal techniques may not be enough to achieve the level of reliability required. In these cases, special programming techniques may be necessary to achieve the required reliability. Some of these techniques are discussed in this chapter. Reliability in a software system can be achieved using three strategies: Fault avoidance: This is the most important strategy, which is applicable to all types of system. The design and implementation process should be organized with the objective of producing fault-free systems. Fault tolerance: This strategy assumes that residual faults remain in the system. Facilities are provided in the software to allow operation to continue when these faults cause system failures. Fault detection: Faults are detected before the software is put into operation. The software validation process uses static and dynamic methods to discover any faults, which remain in a system after implementation.

3. Suggest six reasons why software reliability is important. Using an example, explain the difficulties of describing what software reliability means. Six reasons why software reliability is important : 1) Computers are now cheap and fast: There is little need to maximize equipment usage. Paradoxically, however, faster equipment leads to increasing expectations on the part of the user so efficiency considerations cannot be completely ignored. 2) Unreliable software is liable to be discarded by users: If a company attains a reputation for unreliability because of single unreliable product, it is likely to affect future sales of all of that companys products.

3) System failure costs may be enormous: For some applications, such a reactor control system or an aircraft navigation system, the cost of system failure is orders of magnitude greater than the cost of the control system. 4) Unreliable systems are difficult to improve: It is usually possible to tune an inefficient system because most execution time is spent in small program sections. An unreliable system is more difficult to improve as unreliability tends to be distributed throughout the system. 5) Inefficiency is predictable: Programs take a long time to execute and users can adjust their work to take this into account. Unreliability, by contrast, usually surprises the user. Software that is unreliable can have hidden errors which can violate system and user data without warning and whose consequences are not immediately obvious. For example, a fault in a CAD program used to design aircraft might not be discovered until several plane crashers occurs. 6) Unreliable systems may cause information loss: Information is very expensive to collect and maintains; it may sometimes be worth more than the computer system on which it is processed. A great deal of effort and money is spent duplicating valuable data to guard against data corruption caused by unreliable software.

4. What are the essential skills and traits necessary for effective project managers in successfully handling projects? Project management can be defined as a set of principles, methods, tools, and techniques for planning, organizing, staffing, directing, and controlling projectrelated activities in order to achieve project objectives within time and under cost and performance constraints. The effectiveness of the project manager is critical to project success. The qualities that a project manager must possess include an understanding of negotiation techniques, communication and analytical skills, and requisite project knowledge. Control variables that are decisive in predicting the effectiveness of a project manager include the managers competence as a communicator, skill as a negotiator, and leadership excellence, and whether he or she is a good team worker and has interdisciplinary skills. Project mangers are responsible for directing project resources and developing plans, and must be able to ensure that a project will be completed in a given period of time. They play the essential role of coordinating between and interfacing with customers and management. Project mangers must be able to: Optimize the likelihood of overall project success

Apply the experiences and concepts learned from recent projects to new projects Manage the projects priorities Resolve conflicts Identify weaknesses in the development process and in the solution Identify process strengths upon completion of the project Expeditiously engage team members to become informed about and involved in the project Studies of project management in Mateyaschuk (1888), Sauer, Johnston, and Liu (1888), and Posner (1887) identify common skills and traits deemed essential for effective project managers, including: Leadership Strong planning and organizational skills Team-building ability Coping skills The ability to identify risks and create contingency plans The ability to produce reports that can be understood by business managers The ability to evaluate information from specialists Flexibility and willingness to try new approaches Feeny and Willcocks (1888) claim that the two main indicators of a project managers likely effectiveness are prior successful project experience and the managers credibility with stakeholders. The underlying rationale for this is that such conditions, taken together, help ensure that the project manager has the necessary skills to execute a project and see it through to completion and that the business stakeholders will continue to support the project; see also Mateyaschuk (1888) and Weston & Stedman (1888a,b). Research also suggests that the intangibility, complexity, and volatility of project requirements have a critical impact on the success of software project managers. 5. Which are the four phases of development according to Rational Unified

Process? Rational Unified Process Model (RUP): The RUP constitutes a complete framework for software development. The elements of the RUP (not of the problem being modeled) are the workers who implement the development, each working on some cohesive set of development activities and responsible for creating specific development artifacts. A worker is like a role a member plays and the worker can play many roles (wear many hats) during the development. For example, a designer is a worker and the artifact that the designer creates may be a class definition. An artifact supplied to a customer as part of the product is a deliverable. The artifacts are maintained in the Rational Rose tools, not as separate paper documents. A workflow is defined as a meaningful sequence of activities that produce some valuable result (Krutchen 2003). The development process has nine core workflows: business modeling; requirements; analysis and design; implementation; test; deployment; configuration and change management; project management; and environment. Other RUP elements, such as tool mentors, simplify training in the use of the Rational Rose system. These core workflows are spread out over the four phases of development: The inception phase defines the vision of the actual user end-product and the scope of the project. The elaboration phase plans activities and specifies the architecture. The construction phase builds the product, modifying the vision and the plan as it proceeds. The transition phase transitions the product to the user (delivery, training, support, maintenance). In a typical two-year project, the inception and transition might take a total of five months, with a year required for the construction phase and the rest of the time for elaboration. It is important to remember that the development process is iterative, so the core workflows are repeatedly executed during each iterative visitation to a phase. Although particular workflows will predominate during a particular type of phase (such as the planning and requirements workflows during inception), they will also be executed during the other phases. For example, the implementation workflow will peak during construction, but it is also a workflow during elaboration and transition. The goals and activities for each phase will be examined in some detail. The purpose of the inception phase is achieving concurrence among all stakeholders on the objectives for the project. This includes the project boundary

and its acceptance criteria. Especially important is identifying the essential use cases of the system, which are defined as the primary scenarios of behavior that will drive the systems functionality. Based on the usual spiral model expectation, the developers must also identify a candidate or potential architecture as well as demonstrate its feasibility on the most important use cases. Finally, cost estimation, planning, and risk estimation must be done. Artifacts produced during this phase include the vision statement for the product; the business case for development; a preliminary description of the basic use cases; business criteria for success such as revenues expected from the product; the plan; and an overall risk assessment with risks rated by likelihood and impact. A throw-away prototype may be developed for demonstration purposes but not for architectural purposes. The following elaboration phase ensures that the architecture, requirements, and plans are stable enough, and the risks are sufficiently mitigated, that [one] can reliably determine the costs and schedule for the project. The outcomes for this phase include an 80 percent complete use case model, nonfunctional performance requirements, and an executable architectural prototype. The components of the architecture must be understood in sufficient detail to allow a decision to make, buy, or reuse components, and to estimate the schedule and costs with a reasonable degree of confidence. Krutchen observes that a robust architecture and an understandable plan are highly correlated[so] one of the critical qualities of the architecture is its ease of construction. Prototyping entails integrating the selected architectural components and testing them against the primary use case scenarios. The construction phase leads to a product that is ready to be deployed to the users. The transition phase deploys a usable subset of the system at an acceptable quality to the users, including beta testing of the product, possible parallel operation with a legacy system that is being replaced, and software staff and user training.

Assignment Set 2MC0071 - Software Engineering (Book ID: B0808 & B0809)

1. Explain the following with respect to Configuration Management: A) Change Management B) Version and Release Management A) Change Management: The change management process should come into effects when the software or associated documentation is put under the control of the configuration management team. Change management procedures should be designed to ensure that the costs and benefits of change are properly analyzed and that changes to a system are made in a controlled way. Change management processes involve technical change analysis, cost benefit analysis and change tracking. The pseudo-code, shown in table below defines a process, which may be used to manage software system changes: The first stage in the change management process is to complete a change request form (CRF). This is a formal document where the requester sets out the change required to the system. As well as recording the change required, the CRF records the recommendations regarding the change, the estimated costs of the change and the dates when the change was requested, approved, implemented and validated. It may also include a section where the maintenance engineer outlines how the change is to be implemented. The information provided in the change request form is recorded in the CM database.

Once a change request form has been submitted, it is analyzed to check that the change is valid. Some change requests may be due to user misunderstandings rather than system faults; others may refer to already known faults. If the analysis process discovers that a change request is invalid duplicated or has already been considered the change should be rejected. The reason for the rejection should be returned to the person who submitted the change request. For valid changes, the next stage of the process is change assessment and costing. The impact of the change on the rest of the system must be checked. A technical analysis must be made of how to implement the change. The cost of making the change and possibly changing other system components to accommodate the change is then estimated. This should be recorded on the change request form. This assessment process may use the configuration database where component interrelation is recorded. The impact of the change on other components may then be assessed. Unless the change involves simple correction of minor errors on screen displays or in documents, it should then be submitted to a change control board (CCB) who decide whether or not the change should be accepted. The change control board considers the impact of the change from a strategic and organizational rather than a technical point of view. It decides if the change is economically justified and if there are good organizational reasons to accept the change. The term change control board sounds very formal. It implies a rather grand group which makes change decisions. Formally structured change control boards which include senior client and contractor staff are a requirement of military projects. For small or medium-sized projects, however, the change control board may simply consist of a project manager plus one or two engineers who are not directly involved in the software development. In some cases, there may only be a single change reviewer who gives advice on whether or not changes are justifiable.

When a set of changes has been approved, the software is handed over to the development of maintenance team for implementation. Once these have been completed, the revised software must be revalidated to check that these changes have been correctly implemented. The CM team, rather than the system developers, is responsible for building a new version or release of the software. Change requests are themselves configuration items. They should be registered in the configuration database. It should be possible to use this database to discover the status of change requests and the change requests, which are associated with specific software components. As software components are changed, a record of the changes made to each component should be maintained. This is sometimes called the derivation history of a component. One way to maintain such a record is in a standardized comment prologue kept at the beginning of the component. This should reference the change request associated with the software change. The change management process is very procedural. Each person involved in the process is responsible for some activity. They complete this activity then pass on the forms and associated configuration items to someone else. The procedural nature of this process means that a change process model can be designed and integrated with a version management system. This model may then be interpreted so that the right documents are passed to the right people at the right time.

B) Version and Release Management : Version management are the processes of identifying and keeping track of different versions and releases of a system. Version managers must devise procedures to ensure that different versions of a system may be retrieved when required and are not accidentally changed. They may also work with customer liaison staff to plan when new releases of a system should be distributed. A system version is an instance of a system that differs, in some way, from other instances. New versions of the system may have different functionality, performance or may repair system faults. Some versions may be functionally equivalent but designed for different hardware or software configurations. If there are only small differences between versions, one of these is sometimes called a variant of the other. A system release is a version that is distributed to customers. Each system release should either include new functionality or should be intended for a different hardware platform. Normally, there are more versions of a system than

releases. Some versions may never be released to customers. For example, versions may be created within an organization for internal development or for testing. A release is not just an executable program or set of programs. It usually includes: (1) Configuration files defining how the release should be configured for particular installations. (2) Data files which are needed for successful system operation. (3) An installation program which is used to help install the system on target hardware. (4) Electronic and paper documentation describing the system. All this information must be made available on some medium, which can be read by customers for that software. For large systems, this may be magnetic tape. For smaller systems, floppy disks may be used. Increasingly, however, releases are distributed on CD-ROM disks because of their large storage capacity. When a system release is produced, it is important to record the versions of the operating system, libraries, compilers and other tools used to build the software. If it has to be rebuilt at some later date, it may be necessary to reproduce the exact platform configuration. In some cases, copies of the platform software and tools may also be placed under version management. Some automated tool almost always supports version management. This tool is responsible for managing the storage of each system version.

2. Discuss the Control models in details. The models for structuring a system are concerned with how a system is decomposed into sub-systems. To work as a system, sub-systems must be controlled so that their services are delivered to the right place at the right time. Structural models do not (and should not) include control information. Rather, the architect should organize the sub-systems according to some control model, which supplements the structure model is used. Control models at the architectural level are concerned with the control flow between sub-systems. Two general approaches to control can be identified:

(1) Centralized control: One sub-system has overall responsibility for control and starts and stops other sub-systems. It may also devolve control to another sub-system but will expect to have this control responsibility returned to it. (2) Event-based control: Rather than control information being embedded in a sub-system, each sub-system can respond to externally generated events. These events might come from other sub-systems or from the environment of the system. Control models supplement structural models. All the above structural models may be implemented using either centralized or event-based control. Centralized control In a centralized control model, one sub-system is designated as the system controller and has responsibility for managing the execution of other subsystems. Event-driven systems In centralized control models, control decisions are usually determined by the values of some system state variables. By contrast, event-driven control models are driven by externally generated events. The distinction between an event and a simple input is that the timing of the event is outside the control of the process which handless that event. A sub-system may need to access state information to handle these events but this state information does not usually determine the flow of control. There are two event-driven control models: (1) Broadcast models: In these models, an event is, in principle, broadcast to all sub-systems. Any sub-system, which is designed to handle that event, responds to it. (2) Interrupt-driven models: These are exclusively used in real-time systems where an interrupt handler detects external interrupts. They are then passed to some other component for processing. Broadcast models are effective in integrating sub-systems distributed across different computers on a network. Interrupt-driven models are used in real-time systems with stringent timing requirements. The advantage of this approach to control is that it allows very fast responses to events to be implemented. Its disadvantages are that it is complex to program and difficult to validate.

3. Using examples describe how data flow diagram may be used to document a system design. What are the advantages of using this type of design model? Data-flow models: Data-flow model is a way of showing how data is processed by a system. At the analysis level, they should be used to model the way in which data is processed in the existing system. The notations used in these models represents functional processing, data stores and data movements between functions. Data-flow models are used to show how data flows through a sequence of processing steps. The data is transformed at each step before moving on to the next stage. These processing steps or transformations are program functions when data-flow diagrams are used to document a software design. Figure shows the steps involved in processing an order for goods (such as computer equipment) in an organization.

Data flow diagrams of Order processing The model shows how the order for the goods moves from process to process. It also shows the data stores that are involved in this process. There are various notations used for data-flow diagrams. In figure rounded rectangles represent processing steps, arrow annotated with the data name represent flows and rectangles represent data stores (data sources). Data-flow diagrams have the advantage that, unlike some other modelling notations, they are simple and intuitive. These diagrams are not a good way to describe subsystem with complex interfaces.

The advantages of this architecture are:

(1) It supports the reuse of transformations. (2) It is intuitive in that many people think of their work in terms of input and output processing. (3) Evolving system by adding new transformations is usually straightforward. (4) It is simple to implement either as a concurrent or a sequential system.

4. Describe the Classic Invalid assumptions with respect to Assessment of Process Life Cycle Models. Classic Invalid Assumptions Four unspoken assumptions that have played an important role in the history of software development are considered next. First Assumption: Internal or External Drivers The first unspoken assumption is that software problems are primarily driven by internal software factors. Granted this supposition, the focus of problem solving will necessarily be narrowed to the software context, thereby reducing the role of people, money, knowledge, etc. in terms of their potential to influence the solution of problems. Excluding the people factor reduces the impact of disciplines such as management (people as managers); marketing (people as customers); and psychology (people as perceivers). Excluding the money factor reduces the impact of disciplines such as economics (software in terms of business value cost and benefit); financial management (software in terms of risk and return); and portfolio management (software in terms of options and alternatives). Excluding the knowledge factor reduces the impact of engineering; social studies; politics; language arts; communication sciences; mathematics; statistics; and application area knowledge (accounting, manufacturing, World Wide Web, government, etc). It has even been argued that the entire discipline of software engineering emerged as a reaction against this assumption and represented an attempt to view software development from a broader perspective. Examples range from the emergence of requirements engineering to the spiral model to human computer interaction (HCI). Nonetheless, these developments still viewed nonsoftware-focused factors such as ancillary or external drivers and failed to place software development in a comprehensive, interdisciplinary context. Because software development problems are highly interdisciplinary in nature, they can only be understood using interdisciplinary analysis and capabilities. In fact, no purely technical software problems or products exist because every software

assumptions, this assumption overlooks the role of business in the software development process. Fourth Assumption: Process Centered or Architecture Centered There are currently two broad approaches in software engineering; one is process centered and the other is architecture centered. In process-centered software engineering, the quality of the product is seen as emerging from the quality of the process. This approach reflects the concerns and interests of industrial engineering, management, and standardized or systematic quality assurance approaches such as the Capability Maturity Model and ISO. The viewpoint is that obtaining quality in a product requires adopting and implementing a correct problem-solving approach. If a product contains an error, one should be able to attribute and trace it to an error that occurred somewhere during the application of the process by carefully examining each phase or step in the process. In contrast, in architecture-centered software engineering, the quality of the software product is viewed as determined by the characteristics of the software design. Studies have shown that 60 to 70 percent of the faults detected in software projects are specification or design faults. Because these faults constitute such a large percentage of all faults within the final product, it is critical to implement design-quality metrics. Implementing design-quality assurance in software systems and adopting proper design metrics have become key to the development process because of their potential to provide timely feedback. This allows developers to reduce costs and development time by ensuring that the correct measurements are taken from the very beginning of the project before actual coding commences. Decisions about the architecture of the design have a major impact on the behavior of the resulting software particularly the extent of development required; reliability; reusability; understandability; modi-fiability; and maintainability of the final product, characteristics that play a key role in assessing overall design quality. However, an architecture-centered approach has several drawbacks. In the first place, one only arrives at the design phase after a systematic process. The act or product of design is not just a model or design architecture or pattern, but a solution to a problem that must be at least reasonably well defined. For example, establishing a functional design can be done by defining architectural structure charts, which in turn are based on previously determined data flow diagrams, after which a transformational or transitional method can be used to convert the data flow diagrams into structure charts. The data flow diagrams are outcomes of requirements analysis process based on a preliminary inspection of project feasibility. Similarly, designing object-oriented architectures in UML requires first building use-case scenarios and static object models prior to moving to the design phase.

A further point is that the design phase is a process involving architectural, interface, component, data structure, and database design (logical and physical). The design phase cannot be validated or verified without correlating or matching its outputs to the inputs of the software development process. Without a process design, one could end up building a model, pattern, or architecture that was irrelevant or at least ambivalent because of the lack of metrics for evaluating whether the design was adequate. In a comprehensive process model, such metrics are extracted from predesign and postdesign phases. Finally, a process is not merely a set of documents, but a problem-solving strategy encompassing every step needed to achieve a reliable software product that creates business value. A process has no value unless it designs quality solutions.

5. Describe the concept of Software technology as a limited business tool.

Software Technology as a Limited Business Tool What Computers Cannot Do? Software technology enables business to solve problems more efficiently than otherwise; however, as with any tool, it has its limitations. Solving business problems involves many considerations that transcend hardware or software capabilities; thus, software solutions can only become effective when they are placed in the context of a more general problem-solving strategy. Software solutions should be seen as essential tools in problem solving that are to be combined with other interdisciplinary tools and capabilities. This kind of interoperation can be achieved by integrating such tools with the software development process. Additionally, the software development process can also be used as a part of a larger problem-solving process that analyzes business problems and designs and generates working solutions with maximum business value. Some examples of this are discussed in the following sections. People have different needs that change over time Software technology is limited in its ability to recognize the application or cognitive stylistic differences of individuals or to adapt to the variety of individual needs and requirements. These differences among individuals have multiple causes and include: Use of different cognitive styles when approaching problem solving Variations in background, experience, levels and kinds of education, and, even more broadly, diversity in culture, values, attitudes, ethical standards, and religions

Different goals, ambitions, and risk-management strategies Assorted levels of involvement and responsibilities in the business organizations process A software system is designed once to work with the entire business environment all the time. However, organizational needs are not stable and can change for many reasons even over short periods of time due to changes in personnel, task requirements, educational or training level, or experience. Designing a software system that can adjust, customize, or personalize to such a diversity of needs and variety of cognitive styles in different organizations and dispersed locations is an immense challenge. It entails building a customizable software system and also necessitates a continuous development process to adapt to ongoing changes in the nature of the environment. Most Users Do not Understand Computer Languages A software solution can only be considered relevant and effective after one has understood the actual user problems. The people who write the source code for computer applications use technical languages to express the solution and, in some cases, they do not thoroughly investigate whether their final product reflects what users asked for. The final product is expected to convert or transform the users language and expectations in a way that realizes the systems requirements. Otherwise, the system will be a failure in terms of meeting its stated goals appropriately and will fail its validation and verification criteria. In a utopian environment, end-users could become sufficiently knowledgeable in software development environments and languages so that they could write their software to ensure systems were designed with their own real needs in mind. Of course, by the very nature of the division of expertise, this could rarely happen and so the distance in functional intention between user languages and their translation into programming languages is often considerable. This creates a barrier between software solutions reaching their intended market and users and customers finding reliable solutions. In many ways, the ideal scenario, in which one approached system design and development from a user point of view, was one of the driving rationales behind the original development of the software engineering discipline. Software engineering was intended as a problem-solving framework that could bridge the gap between user languages (requirements) and computer languages (the final product or source code). In software engineering, the users linguistic formulation of a problem is first understood and then specified naturally, grammatically, diagrammatically, mathematically, or even automatically; then, it is translated into a preliminary software architecture that can be coded in a programming

language. Thus, the underlying objective in software engineering is that the development solutions be truly reflective of user or customer needs. Decisions and Problems Complex and Ill Structured The existence of a negative correlation between organizational complexity and the impact of technical change (Keen 1981) is disputed. More complex organizations have more ill-structured problems (Mitroff & Turoff 1963). Consequently, their technical requirements in terms of information systems become harder to address. On the other hand, information technology may allow a complex organization to redesign its business processes so that it can manage complexity more effectively (Davenport & Stoddard 1994). On balance, a negative correlation is likely in complex organizations for many reasons. First, the complexity of an organization increases the degree of ambiguity and equivocality in its operations (Daft & Lengel 1986). Many organizations will not invest resources sufficient to carry out an adequately representative analysis of a problem. Therefore, requirement specifications tend to become less accurate and concise. Implementing a system based on a poor systems analysis increases the likelihood of failure as well as the likelihood of a lack of compatibility with the organizations diverse or competing needs. A demand for careful analysis and feasibility studies to allow a thorough determination of requirements might bring another dimension of complexity to the original problem. Second, technology faces more people-based resistance in complex organizations (Markus 1983). This can occur because a newly introduced system has not been well engineered according to accurate requirements in the first place, as well as because of the combination of social, psychological, and political factors found in complex organizations. One further factor complicating the effective delivery of computerized systems in large projects is the time that it takes to get key people involved. Finally, there are obvious differences in the rate of growth for complex organizations and information technology. Although information technology advances rapidly, complex organizations are subject to greater inertia and thus may change relatively slowly. Subsequently, incorporating or synthesizing technical change into an organization becomes a real challenge for individuals and departments and is affected by factors such as adaptability, training, the ability to upgrade, and maintainability. For such reasons, one expects a negative correlation between organizational complexity and the impact of technical change in terms of applying software technology and achieving intended organizational outcomes. Business View Software Technology as a Black Box for Creating Economic Value

decades. Business value has always been embedded implicitly or explicitly in almost every progress in software process modeling. Minimization of time was behind the Rapid Application Development (RAD) and prototyping models. Risk control and reduction were major issues behind spiral models. The efficient use of human resources lies behind the dynamic models. The impact of user involvement in software process models reflects the importance of customer influence. Achieving competitive advantage in software systems is a key business value related to users and customers. However, little empirical examination of the affect of the different problem solving strategies adopted in software process models takes place. The interdependencies between the software process and business performance must be a key issue. The former is driven by the need for business value, and the latter in turn depends more than ever on software. This encompasses users, analysts, project managers, software engineers, customers, programmers, and other stakeholders. Computer systems are human inventions and do not function or interact without human input. Some manifestations of this dependency are: Software applications are produced by people and are based on people needs. Software applications that do not create value will not survive in the marketplace. Computers cannot elastically adjust to real situations (they work with preexisting code and prescribed user inputs). Computers do not think; in terms of expertise, they reflect ifthen inputs or stored knowledge-based experiences. The main goal of software technology is to solve the problems of people. This dependency on the human environment makes the automation that computers facilitate meaningless without human involvement and underscores the limits of computer systems. It also highlights the central role that people play in making software technology an effective tool for producing desired outcomes.

6. Describe the round-trip problem solving approach. Round-Trip Problem-Solving Approach

The software engineering process represents a round-trip framework for problem solving in a business context in several senses. The software engineering process is a problem-solving process entailing that software engineering should incorporate or utilize the problem-solving literature regardless of its interdisciplinary sources. The value of software engineering derives from its success in solving business and human problems. This entails establishing strong relationships between the software process and the business metrics used to evaluate business processes in general. The software engineering process is a round-trip approach. It has a bidirectional character, which frequently requires adopting forward and reverse engineering strategies to restructure and reengi-neer information systems. It uses feedback control loops to ensure that specifications are accurately maintained across multiple process phases; reflective quality assurance is a critical metric for the process in general. The nonterminating, continuing character of the software development process is necessary to respond to ongoing changes in customer requirements and environmental pressures.

Master of Computer Application (MCA) Semester 3 MC0072 Computer Graphics 4 Credits

Assignment Set 1MC0072 Computer Graphics (Book ID: B0810)

1. Write a short note on: a) Replicating pixels b) Moving pen c) Filling area between boundaries d) Approximation by thick polyline e) Line style ad pen style.

a ) Replicating pixels : A quick extension to the scan-conversation ineer loop to write multiple pixel at each computed pixel works resonably well fo lines: Here , pixel are duplicated in columns for lines with 1 < slope < 1 and in rows for all other lines . The effect, however , is that the line ends are always vertical or horizontal , which is not pleasing for rather thick lines as shown in figure

Furthermore , lines that are horizontal and vertical have different thickness for lines at an angle , where the thickness of the primitive is defined as the distence between the primitives boundaries parpendicular to its tangent. Thus if the thickness paramater is t , a horizontal or horizontal or verticalline has thickness t , whereas one drawn at 45o has an average thickness of This is another result of having fewer pixels in the line at an angle, as first noted in Section ; it descreses the brightness contrast with horizontal and vertical lines of the same thickness. Still another problem with pixel replication is the generic problem of even-numbered width: We cannot center the duplicated column or row about the selected pixel, so we must choose a side of the primitive to have

an extra pixel. Altogether, pixel replication is an efficient but crude approximation that works best for primitives that are not very thick. b) Moving pen : Choosing a rectangular pen whose center or corner travels along the single-pixel outline of the primitive works reasonably ell for lines; it produces the line shown in figure below

Notice that this line ios similar to that produced by pixel replication but is thicker at the endpoints. As with pixel replication, because the pen stays vertically aligned, the perceived thickness of the primitive varies as a function of the primitives angle, but in the opposite way:

The width is thinnest for horizontal segments and thickest for segments with slope of . An ellipse are, for example , varies in thickness alone its entire trajectory, being of the specified thickness when the tangent is nearly horizontal or vertical, and thickened by a factor of around . this problem would be eliminated if the square turned to follow the path, but it is much better to use a circular cross-section so that the thickness is angle-independent. Now lets look at how to implement the moving-pen algorithm for the simple case of an upright rectangular or circular cross-section ( also called footprint ) so that its center or corner is at the chosen pixel; for a circular footprint and a pattern drawn in opaque mode, we must in ddition mask off the bits outside the circular region, which is not as easy task unless our low-level copy pixel has a write mask for the destination region. The bure-force copy solution writes pixels more than

once , since the pens footprints overlap at adjacent pixels. A better technique that also handle the circular-cross-section problem is to use the snaps of the footprint to compute spans for successive footprints at adjacent pixels. As in the filling d) Approximation by thick polyline We can do piecewise-linear approximation of any primitive by computing points on the boundary ( with floating-point coordinates ) , then connecting these points with line segments to from a polyline. The advantage of this approach is that the algorithms for both line clipping and line scan conversion ( for thin primitives ) , and for polygon clipping and polygon scan conversion ( for thick primitives ) , are efficient. Naturall, the segnates must be quite short in places where the primitive changes direction rapidly. Ellipse ares cane be represented as ratios of parametric polynomials, which lend themselves readily to such piecewise-linear approximation. The individual line segments are then drawn as rectangles with the specified thickness. To make the thick approximation look nice, however, we must solve the problem of making thick lines join smoothly . e) Line style ad pen style. SRGPs line-style atribute can affect any outline primitive. In general, we must use conditional logic to test whether or not to write a pixel, writing only for 1s. we store the pattern write mask as a string of 16 booleans (e.g., a 16-bit integer ); it should therefore repeat every 16 pixels. We modify the unconditional WritePixel statemet in the line scan-conversion algorithm to handle this.There is a drawback to this technique, however. Since each bit in the mask corresponds to an iteration of the loop, and not to a unit distance along the line . the length of dashes varies with the angle of the line; a dash at an angle is longer than is a horizontal or vertical dash. For engineering drawings, this variation is unacceptable, and the dashes must be calculated and scan-coverted as individual line segments of length invariant with angle. Thick line are created as sequences of alternating solid and transparent rectangles whose vertices are calculated exactly as a function of the line style selected. The rectangles are then scan-converted individually; for horizontal and vertical line , the program may be able to copypixel the rectangle. Line style and pen style interact in thick outline primitive. The line style is used to calculate the rectangle for each dash, and each rectangle is filled with the selected pen pattern.

2. What is DDA line drawing algorithm explain it with the suitable example? discuss the merit and demerit of the algorithm.

DDA Line Algorithm 1: Read the line end points (X1,Y1) and (X2,Y2) such that they are equal . [ If equal then plot that point and end ] 2: x = 3. If Length= Else Length= end if 4. = (x2-x1)/length = (y2-y1)/length This makes either or equal to 1 because the length is either| x2-x1| or |y2-y1|, the incremental value for either x or y is 1. 5. x = x1+0.5 * sign ( y = y1+0.5*sign( ) ) then and y =

[Here the sign function makes the algorithm worl in all quadrant. It returns 1, 0,1 depending on whether its argument is 0 respectively. The factor 0.5 makes it possible to round the

values in the integer function rather than truncating them] 6. i=1 [begins the loop, in this loop points are plotted] 7. while(i length) { Plot (Integer(x), Integer(y)) x= x+x y= y+y i=i+1 } 8. stop Let us see few examples to illustrate this algorithm. Ex.1: Consider the line from (0,0) to (4,6). Use the simple DDA algorithm to rasterize this line. Sol: Evaluating steps 1 to 5 in the DDA algorithm we have x1=0, x2= 4, y1=0, y2=6 therefor Length = x = y= =6

/ Length = 4/6 /Length = 6/6-1

Initial values for x= 0+0.5*sign ( 4/6 ) = 0.5 y= 0+ 0.5 * sign(1)=0.5 Tabulating the results of each iteration in the step 6 we get,

The results are plotted as shown in the It shows that the rasterized line lies to both sides of the actual line, i.e. the algorithm is orientation dependent

Result for s simple DDA Ex. 2 : Consider the line from (0,0) to (-6,-6). Use the simple DDA algorithm to rasterize this line. Sol : x1=0, x2= -6,y1=0, y2=-6 Therefore Therefore Lenth = | X2 -- X1 | = | Y2-Y1| = 6 Ax = Ay = -1 Initial values for x= 0+0.5*sign (-1) = -0.5 y= 0+ 0.5 * sign(-1) = -0.5 Tabulating the results of each iteration in the step 6 we get,

The results are plotted as shown in the It shows that the rasterized line lies on the actual line and it is 450 line. Advantages of DDA Algorithm: 1. It is the simplest algorithm and it does not require special skills for implementation. 2. It is a faster method for calculating pixel positions than the direct use of equation y=mx + b. It eliminates the multiplication in the equation by making use of raster characteristics, so that appropriate increments are applied in the x or y direction to find the pixel positions along the line path

Disadvantages of DDA Algorithm 1. Floating point arithmetic in DDA algorithm is still time-consuming. 2. The algorithm is orientation dependent. Hence end point accuracy is poor.

3. Write a short note on: A) Reflection B) Sheer C) Rotation about an arbitrary axis

A ) Reflection : A reflection is a transformation that produces a mirror image of an object relative to an axis of reflection. We can choose an axis of reflection in the xy plane or perpendicular to the xy plane. The table below gives examples of some common reflection.

B) Shear : A transformation that slants the shape of an objects is called the shear transformation. Two common shearing transformations are used. One shifts x coordinate values and other shifts y coordinate values. However, in both the

cases only one coordinate (x or y) changes its coordinates and other preserves its values. 1 X shear : The x shear preserves they coordinates, but changes the x values which causes vertical lines to tilt right or left as shown in the fig. 6.7. The transformation matrix for x shear is given as

2 Y Shear: The y shear preserves the x coordinates, but changes the y values which causes horizontal lines to transform into lines which slope up or down, as shown in the

The transformation matrix for y shear is given as

C) Rotation about an arbitrary axis: A rotation matrix for any axis that does not coincide with a coordinate axis can be set up as a composite transformation involving combinations of translation and the coordinate-axes rotations. In a special case where an object is to be rotated about an axis that is parallel to one of the coordinate axes we can obtain the resultant coordinates with the following transformation sequence. 1. Translate the object so that the rotation axis coincides with the parallel coordinate axis 2. Perform the specified rotation about that axis. 3. Translate the object so that the rotation axis is moved back to its original position When an object is to be rotated about an axis that is not parallel to one of the coordinate axes, we have to perform some additional transformations. The sequence of these transformations is given below. 1. Translate the object so that rotation axis specified by unit vector u passes through the coordinate origin. 2. Rotate the object so that the axis of rotation coincides with one of the coordinate axes. Usually the z axis is preferred. To coincide the axis of rotation to z axis we have to first perform rotation of unit vector u about x axis to bring it into xz plane and then perform rotation about y axis to coincide it with z axis. 3. Perform the desired rotation about the z axis 4. Apply the inverse rotation about y axis and then about x axis to bring the rotation axis back to its original orientation. 5. Apply the inverse translation to move the rotation axis back to its original position. As shown in the Fig. the rotation axis is defined with two coordinate points P1 and P2 and unit vector u is defined along the rotation of axis as

Where V is the axis vector defined by two points P1 and P2 as V = P2 P1= (x2 x1, y2 y1, z2 z1) The components a, b and c of unit vector us are the direction cosines for the rotation axis and they can be defined as

4. Describe the following: A) Basic Concepts in Line Drawing C) Bresenhams Line Drawing Algorithm A) Basic Concepts in Line Drawing : B) Digital Differential Analyzer

A line can be represented by the equation: y = m * x + b; where 'y' and 'x' are the co-ordinates; 'm' is the slope i.e. a quantity which indicates how much 'y' increases when 'x' increases by one unit. 'b' is the

intercept of line on 'y' axis (However we can safely ignore it for now). So what's the trick behind the equation. Would you believe it, we already have the algorithm developed ! Pay close attention to the definition of 'm'. It states A quantity that indicates by how much 'y' changes when 'x' changes by one unit. So, instead of determining 'y' for every value of 'x' we will let 'x' have certain discrete values and determine 'y' for those values. Moreover since we have the quantity 'm'; we will increase 'x' by one and add 'm' to 'y' each time. This way we can easily plot the line. The only thing we are left with now, is the actual calculations. Let us say we have two endpoints (xa,ya) and (xb,yb). Let us further assume that (xb-xa) is greater than (yb-ya) in magnitude and that (xa < xb). This means we will move from (xa) to the right towards (xb) finding 'y' at each point. The first thing however that we need to do is to find the slope 'm'. This can be done using the formulae: m = (yb - ya) / (xb - xa) Now we can follow the following algorithm to draw our line.

Let R represent the row and C the column Set C = Round(xa) Let F = Round(xb) Let H = ya Find the slope m Set R = Round(H) Plot the point at R,C on the screen Increment C {C+1} If C S1) or move back to the same state (S0 -> S0)). A DFA has a start state (denoted graphically by an arrow coming in from nowhere) where computations begin, and a set of accept states (denoted graphically by a double circle) which help define when a computation is successful. DFAs recognize exactly the set of regular languages which are, among other things, useful for doing lexical analysis and pattern matching. A DFA can be used in either an accepting mode to verify that an input string is indeed part of the language it represents, or a generating mode to create a list of all the strings in the language. A DFA is defined as an abstract mathematical concept, but due to the deterministic nature of a DFA, it is implementable in hardware and software for solving various specific problems. For example, a software state machine that decides whether or not online user-input such as phone numbers and email addresses are valid can be modeled as a DFA. Another example in hardware is the digital logic circuitry that controls whether an automatic door is open or closed, using input from motion sensors or pressure pads to decide whether or not to perform a state transition .

Formal definition A deterministic finite automaton M is a 5-tuple, (Q, , , q0, F), consisting of

a finite set of states (Q) a finite set of input symbols called the alphabet () a transition function ( : Q Q) a start state (q0 Q) a set of accept states (F Q)

Let w = a1a2 ... an be a string over the alphabet . The automaton M accepts the string w if a sequence of states, r0,r1, ..., rn, exists in Q with the following conditions: 1. 2. 3. r0 = q 0 ri+1 = (ri, ai+1), for i = 0, ..., n1 rn F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function . The last condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automatonrejects the string. The set of strings M accepts is the language recognized by M and this language is denoted by L(M). A deterministic finite automaton without accept states and without a starting state is known as a transition system or semiautomaton. For more comprehensive introduction of the formal definition see automata theory.

DFAs can be built from nondeterministic finite automata through the powerset construction.

An example of a Deterministic Finite Automaton that accepts only binary numbers that are multiples of 3. The state S0 is both the start state and an accept state.

Example The following example is of a DFA M, with a binary alphabet, which requires that the input contains an even number of 0s.

The state diagram for M M = (Q, , , q0, F) where

Q = {S1, S2}, = {0, 1}, q0 = S1, F = {S1}, and is defined by the following state transition table: 0 1 S1 S2 S1 S2 S1 S2

The state S1 represents that there has been an even number of 0s in the input so far, while S2 signifies an odd number. A 1 in the input does not change the state of the automaton. When the input ends, the state will show whether the input contained an even number of 0s or not. If the input did contain an even number of 0s, M will finish in state S1, an accepting state, so the input string will be accepted. The language recognized by M is the regular language given by the regular expression 1*( 0 (1*) 0 (1*) )*, where "*" is the Kleene star, e.g., 1* denotes any non-negative number (possibly zero) of symbols "1".

Nondeterministic finite automaton (NFA) or nondeterministic finite state machine is a finite state machine where from each state and a given input symbol the automaton may jump into several possible next states. This distinguishes it from the deterministic finite automaton (DFA), where the next possible state is uniquely determined. Although the DFA and NFA have distinct definitions, a NFA can be translated to equivalent DFA using powerset construction, i.e., the constructed DFA and the NFA recognize the same formal language. Both types of automata recognize only regular languages. Nondeterministic finite automata were introduced in 1959 by Michael O. Rabin and Dana Scott,[1] who also showed their equivalence to deterministic finite automata. Non-deterministic finite state machines are sometimes studied by the name subshifts of finite type. Non-deterministic finite state machines are generalized by probabilistic automata, which assign a probability to each state transition. Formal definition An NFA is represented formally by a 5-tuple, (Q, , , q0, F), consisting of

a finite set of states Q a finite set of input symbols a transition function : Q P(Q). an initial (or start) state q0 Q a set of states F distinguished as accepting (or final) states F Q.

Here, P(Q) denotes the power set of Q. Let w = a1a2 ... an be a word over the alphabet . The automaton M accepts the word w if a sequence of states, r0,r1, ..., rn, exists in Q with the following conditions: 1. 2. 3. r0 = q 0 ri+1 (ri, ai+1), for i = 0, ..., n1 rn F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition relation . The last

condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automatonrejects the string. The set of strings M accepts is the language recognized by M and this language is denoted by L(M). For more comprehensive introduction of the formal definition see automata theory. NFA- The NFA- (also sometimes called NFA- or NFA with epsilon moves) replaces the transition function with one that allows the empty string as a possible input, so that one has instead : Q ( {}) P(Q). It can be shown that ordinary NFA and NFA- are equivalent, in that, given either one, one can construct the other, which recognizes the same language. Example

The state diagram for M Let M be a NFA-, with a binary alphabet, that determines if the input contains an even number of 0s or an even number of 1s. Note that 0 occurrences is an even number of occurrences as well. In formal notation, let M = ({s0, s1, s2, s3, s4}, {0, 1}, , s0, {s1, s3}) where the transition relation T can be defined by this state transition table: 0 S0 {} S1 {S2} S2 {S1} 1 {} {S1, S3} {S1} {} {S2} {}

S3 {S3} {S4} S4 {S4} {S3}

{} {}

M can be viewed as the union of two DFAs: one with states {S1, S2} and the other with states {S3, S4}. The language of M can be described by theregular language given by this regular expression (1*(01*01*)*) (0*(10*10*)*). We define M using -moves but M can be defined without using -moves.

3. Write a short note on: A) C Preprocessor for GCC version 2 B) Conditional Assembly A) The C Preprocessor for GCC version 2 : The C preprocessor is a macro processor that is used automatically by the C compiler to transform your program before actual compilation. It is called a macro processor because it allows you to define macros, which are brief abbreviations for longer constructs. The C preprocessor provides four separate facilities that you can use as you see fit: Inclusion of header files. These are files of declarations that can be substituted into your program. Macro expansion. You can define macros, which are abbreviations for arbitrary fragments of C code, and then the C preprocessor will replace the macros with their definitions throughout the program. Conditional compilation. Using special preprocessing directives, you can include or exclude parts of the program according to various conditions. Line control. If you use a program to combine or rearrange source files into an intermediate file which is then compiled, you can use line control to inform the compiler of where each source line originally came from. ANSI Standard C requires the rejection of many harmless constructs commonly used by todays C programs. Such incompatibility would be inconvenient for users, so the GNU C preprocessor is configured to accept these constructs by default. Strictly speaking, to get ANSI Standard C, you must use the options `trigraphs, `-undef and `-pedantic, but in practice the consequences of having strict ANSI Standard C make it undesirable to do this.

Conditional Assembly : Means that some sections of the program may be optional, either included or not in the final program, dependent upon specified conditions. A reasonable use of conditional assembly would be to combine two versions of a program, one that prints debugging information during test executions for the developer, another version for production operation that displays only results of interest for the average user. A program fragment that assembles the instructions to print the Ax register only if Debug is true is given below. Note that true is any non-zero value.

Here is a conditional statements in C programming, the following statements tests the expression `BUFSIZE == 1020, where `BUFSIZE must be a macro. #if BUFSIZE == 1020 printf ("Large buffers!n"); #endif /* BUFSIZE is large */

4. Write about different Phases of Compilation. Phases of Compilation: 1. Lexical analysis (scanning): the source text is broken into tokens. 2. Syntactic analysis (parsing): tokens are combined to form syntactic structures, typically represented by a parse tree. The parser may be replaced by a syntax-directed editor, which directly generates a parse tree as a product of editing. 3. Semantic analysis: intermediate code is generated for each syntactic structure.

Type checking is performed in this phase. Complicated features such as generic declarations and operator overloading (as in Ada and C++) are also processed. 4. Machine-independent optimization: intermediate code is optimized to improve efficiency. 5. Code generation: intermediate code is translated to relocatable object code for the target machine. 6. Machine-dependent optimization: the machine code is optimized.

5. What is MACRO? Discuss its use. A macro is a rule or pattern that specifies how a certain input sequence (often a sequence of characters) should be mapped to an output sequence (also often a sequence of characters) according to a defined procedure. The mapping process that instantiates (transforms) a macro into a specific output sequence is known as macro expansion. The term is used to make available to the programmer, a sequence of computing instructions as a single program statement, making the programming task less tedious and less error-prone. (Thus, they are called "macros" because a big block of code can be expanded from a small sequence of characters). Macros often allow positional or keyword parameters that dictate what the conditional assembler program generates and have been used to create entire programs or program suites according to such variables as operating system, platform or other factors. 6. What is compiler? Explain the compiler process. A compiler is a computer program that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language or machine code). If the compiled program can run on a computer whose CPU or operating system is different from the one on which

the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low level language to a higher level one is a de-compiler. A program that translates between high-level languages is usually called a language translator, source to source translator, or language converter.

A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (Syntax-directed translation), code generation, and code optimization. Program faults caused by incorrect compiler behavior can be very difficult to track down and work around; therefore, compiler implementers invest a lot of time ensuring the correctness of their software. The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.

A diagram of the operation of a typical multi-language, multi-target compiler

Basic process of Compiler: Compilers bridge source programs in high-level languages with the underlying hardware. A compiler requires 1) determining the correctness of the syntax of programs, 2) generating correct and efficient object code, 3) run-time organization, and 4) formatting output according to assembler and/or linker conventions. A compiler consists of three main parts: the frontend, the middleend, and the backend. The front end checks whether the program is correctly written in terms of the programming language syntax and semantics. Here legal and illegal programs are recognized. Errors are reported, if any, in a useful way. Type checking is also performed by collecting type information. The frontend then generates an intermediate representation or IR of the source code for processing by the middle-end. The middle end is where optimization takes place. Typical transformations for optimization are removal of useless or unreachable code, discovery and propagation of constant values, relocation of computation to a less frequently executed place (e.g., out of a loop), or specialization of computation based on the context. The middle-end generates another IR for the following backend. Most optimization efforts are focused on this part. The back end is responsible for translating the IR from the middle-end into assembly code. The target instruction(s) are chosen for each IR instruction. Register allocation assigns processor registersfor the program variables where possible. The backend utilizes the hardware by figuring out how to keep parallel execution units busy, filling delay slots, and so on. Although most algorithms for optimization are in NP, heuristic techniques are well-developed.

Master of Computer Application (MCA) Semester 3 MC0074 Statistical and Numerical methods using C++ 4 Credits

Assignment Set 1MC0074 Statistical and Numerical methods using C++ (Book ID: B0812)2. Discuss and define the Correlation coefficient with the suitable example.

Correlation coefficient Correlation is one of the most widely used statistical techniques. Whenever two variable are so related that a change in one variable result in a direct or inverse change in the other and also greater the magnitude of change in one variable corresponds to greater the magnitude of change in the other, then the variable are said to be correlated or the relationship between the variables is known as correlation. We have been concerned with associating parameters such as E(x) and V(X) with the distribution of one-dimensional random variable. If we have a twodimensional random variable (X,Y), an analogous problem is encountered. Definition Let (X, Y) be a two-dimensional random variable. We define xy, the correlation coefficient, between X and Y, as follows:

xy = The numerator of , is called the covariance of X and Y.

Example Suppose that the two-dimensional random variable (X, Y) is uniformly distributed over the triangular region R = {(x, y) | 0 < x < y < 1} The pdf is given as f(x, y) = 2, (x, y) R,

= 0, elsewhere. Thus the marginal pdfs of X and of Y are

g(x) = 2 (1 x), 0 x 1

h(y) = = 2y, 0 y 1 Therefore

E(X) =

, E(Y) =

E(X2) =

, E(Y2) =

V(X) = E(X2) (E(X))2 =

V(Y) = E(Y2) (E(Y))2 =

E(XY) = Hence

xy =

=

3. If x is normally distributed with zero mean and unit variance, find the expectation and variance of 2 . The equation of the normal curve is

If mean is zero and variance is unit, then putting m=0

and

,the above

equation reduced to

Expectation of x2 i.e.

=

=

(i)

=

Integrating by parts taking x as first function and remembering that

=

Putting

Hence

=1

(ii)

Integrating by parts taking x3 as first function

=

=

=3

=3(1)

with the help of (ii)

Variance of x2 = = 3-(1)2 =2

4. The sales in a particular department store for the last five years is given in the following table Years Sales (in lakhs) 1974 40 1976 43 1978 48 1980 52 1982 57

Estimate the sales for the year 1979. Newtons backward difference table is

We have

p= yn = 5,2

yn = 1,

3

yn = 2,

4

yn = 5

Newtons interpolati0on formula gives y1979 = 57 + (-1.5) 5 + = 57 7.5 + 0.375 + 0.125 + 0.1172 y1979 = 50.1172

5. Find out the geometric mean of the following series Class Frequency 0-10 17 10-20 10 20-30 11 30-40 15 40-50 8

Here we have Class 0-10 10-20 20-30 Frequency(f) 17 10 11 Mid value(x) 5 15 25 Log x .6990 1.1761 1.3979 f.(Logx) 11.883 11.761 15.3769

30-40 40-50

15 8 N=61

35 45

1.5441 1.6532

23.1615 13.2256 Sum=75.408

If F be the required geometric mean, then

Log G = = 1/61(75.408) = 1.236197 G = antilog 1.23 = 16.28

6. Find the equation of regression line of x on y from the following data x y 0 1 2 3 4 10 12 27 10 30

sum(X) = 0+1+2+3+4 = 10 sum(X) = 0+1+2+3+4 = 30 sum(Y) = 10+12+27+10+30 = 89 sum(Y) = 10+12+27+10+30 = 1973 sum(XY) = 0.10 + 1.12 + 2.27 + 3.10 + 4.30 = 10.89 n =5 Xbar = sumX / n = 10 / 5 = 2 Ybar = sumY / n = 89 / 5 = 17.8 gradient m = [ n sumXY - sumX sumY ] / [ n sumX - (sumX) ] = (5.10.89 - 10.89) / (5.30 - 10)

= (54.45 - 890) / (150 - 100) = -835.55 / 50 = -16.711 Equation is y = mx + c Ybar = m.Xbar + c 17.8 =-16.711(2) + c c = 17.8 +33.422 = 51.222 Therefore the equation of the regressed line is y = (-16.711)x + 51.222

Assignment Set 2MC0074 Statistical and Numerical methods using C++ (Book ID: B0812)

2. If

2 is approximated by 0.667, find the absolute and relative errors? 3

Absolute, relative and percentage errors An error is usually quantified in two different but related ways. One is known as absolute error and the other is called relative error. Let us suppose that true value of a data item is denoted by xt and its approximate value is denoted by xa. Then, they are related as follows: True value xt = Approximate value xa + Error The error is then given by: Error = xt - xa The error may be negative or positive depending on the values of xt and xa. In error analysis, what is important is the magnitude of the error and not the sign and, therefore, we normally consider what is known as absolute error which is denoted by e a = | x t xa | In general absolute error is the numerical difference between the true value of a quantity and its approximate value. In many cases, absolute error may not reflect its influence correctly as it does not take into account the order of magnitude of the value. In view of this, the concept of relative error is introduced which is nothing but the normalized absolute error. The relative error is defined as

er =

= absolute error of 2/3= 0.001666666... relative error of 2/3 = 0.0024999 approx.

3. If , denote forward, backward and central difference operator, E and , are respectively the shift and average operators, in the analysis of data with equal spacing h, show that (1) 1 + 2 2 = 1 + 2 2 2

(2) E1/2 = +

2

(3) =

2 2 + 1+ 4 2

From the definition of operators, we have

= Therefore

1+

= Also

..(1)

=

..(2)

From equation (1) and (2)

1 + 2 2 =

=

+

=

= =E1 Thus we get

= 4. Find a real root of the equation x3 4x 9 = 0 using the bisection method.

First Let x0 be 1 and x1 be 3 F(x0) = x3 4x -9 =1 4 9 = -12 < 0 F(x1) =27 12 9 =6> 0 Therefore, the root lies between 1 and 3 Now we try with x2 =2 F(x2) = 8 8 9 = -9 < 0 Therefore, the root lies between 2 and 3 X3 = (x1+x2)/2 =(3+2)/2 = 2.5 F(x3) = 15.625 10 9 = - 3.375 < 0 Therefore, the root lies between 2.5 and 3 X4 = (x1+x3)/2 = 2.75

5. Find Newtons difference interpolation polynomial for the following data: x f(x) 0.1 1.40 0.2 1.56 0.3 1.76 0.4 2.00 0.5 2.28

Forward difference table

Here

p=

We have Newtons forward interpolation formula as

y=

(1) From the table substitute all the values in equation (1)

y = 1.40 + (10x 1) (0.16) + y = 2x2 + x + 1.28 This is the required Newtons interpolating polynomial.

6. Evaluate value of .

1 + x 20

1

dx

using Trapezoidal rule with h = 0.2. Hence determine the

, which is known as the trapezoidal rule. The trapezoidal rule uses trapezoids to approximate the curve on a subinterval. The area of a trapezoid is the width times the average height, given by the sum of the function values at the endpoints, divided by two. Therefore: 0.2( f(0) + 2f(0.2) + 2f(0.4) + 2f(0.6) + 2f(0.8) + f(1) ) / 2 = 0.2( 1 + 2*(0.96154) + 2(0.86207) + 2(0.73529) + 2(0.60976) + 0.5) / 2 = 0.78373 The integrand is the derivative of the inverse tangent function. In particular, if we integrate from 0 to 1, the answer is pi/4 . Consequently we can use this integral to approximate pi. Multiplyingby four, we get an approximation for pi: 3.1349

Master of Computer Application (MCA) Semester 3 MC0075 Computer Networks 4 Credits

Assignment Set 1MC0075 Computer Networks (Book ID: B0813 & B0814)

1. Discuss the advantages and disadvantages of synchronous and asynchronous transmission. Synchronous & Asynchronous transmission:: Synchronous Transmission: Synchronous is any type of communication in which the parties communicating are "live" or present in the same space and time. A chat room where both parties must be at their computer, connected to the Internet, and using software to communicate in the chat room protocols is a synchronous method of communication. E-mail is an example of an asynchronous mode of communication where one party can send a note to another person and the recipient need not be online to receive the e-mail. Synchronous mode of transmissions are illustrated in figure 3.11

Figure :Synchronous and Asynchronous Transmissions The two ends of a link are synchronized, by carrying the transmitters clock information along with data. Bytes are transmitted continuously, if there are gaps then inserts idle bytes as padding Advantage: This reduces overhead bits It overcomes the two main deficiencies of the asynchronous method, that of inefficiency and lack of error detection. Disadvantage: For correct operation the receiver must start to sample the line at the correct instant Application: Used in high speed transmission example: HDLC Asynchronous transmission:

Asynchronous refers to processes that proceed independently of each other until one process needs to "interrupt" the other process with a request. Using the client- server model, the server handles many asynchronous requests from its many clients. The client is often able to proceed with other work or must wait on the service requested from the server.

Figure : Asynchronous Transmissions Asynchronous mode of transmissions is illustrated in figure 3.12. Here a Start and Stop signal is necessary before and after the character. Start signal is of same length as information bit. Stop signal is usually 1, 1.5 or 2 times the length of the information signal Advantage: The character is self contained & Transmitter and receiver need not be synchronized Transmitting and receiving clocks are independent of each other Disadvantage: Overhead of start and stop bits False recognition of these bits due to noise on the channel Application: If channel is reliable, then suitable for high speed else low speed transmission Most common use is in the ASCII terminals 2. Describe the ISO-OSI reference model and discuss the importance of every layer. Ans:2 The OSI Reference Model:

This reference model is proposed by International standard organization (ISO) as a a first step towards standardization of the protocols used in various layers in 1983 by Day and Zimmermann. This model is called Open system Interconnection (OSI) reference model. It is referred OSI as it deals with connection open systems. That is the systems are open for communication with other systems. It consists of seven layers. Layers of OSI Model : The principles that were applied to arrive at 7 layers: 1. A layer should be created where a different level of abstraction is needed. 2. Each layer should perform a well defined task. 3. The function of each layer should define internationally standardized protocols 4. Layer boundaries should be chosen to minimize the information flow across the interface. 5. The number of layers should not be high or too small.

Figure : ISO - OSI Reference Model The ISO-OSI reference model is as shown in figure 2.5. As such this model is not a network architecture as it does not specify exact services and protocols. It just tells what each layer should do and where it lies. The bottom most layer is referred as physical layer. ISO has produced standards for each layers and are published separately. Each layer of the ISO-OSI reference model are discussed below:

The best known example of this is Ethernet. This layer manages the interaction of devices with a shared medium. Other examples of data link protocols are HDLC and ADCCP for point-to-point or packet-switched networks and Aloha for local area networks. On IEEE 802 local area networks, and some non-IEEE 802 networks such as FDDI, this layer may be split into a Media Access Control (MAC) layer and the IEEE 802.2 Logical Link Control (LLC) layer. It arranges bits from the physical layer into logical chunks of data, known as frames. This is the layer at which the bridges and switches operate. Connectivity is provided only among locally attached network nodes forming layer 2 domains for unicast or broadcast forwarding. Other protocols may be imposed on the data frames to create tunnels and logically separated layer 2 forwarding domain. The data link layer might implement a sliding window flow control and acknowledgment mechanism to provide reliable delivery of frames; that is the case for SDLC and HDLC, and derivatives of HDLC such as LAPB and LAPD. In modern practice, only error detection, not flow control using sliding window, is present in modern data link protocols such as Point-to-Point Protocol (PPP), and, on local area networks, the IEEE 802.2 LLC layer is not used for most protocols on Ethernet, and, on other local area networks, its flow control and acknowledgment mechanisms are rarely used. Sliding window flow control and acknowledgment is used at the transport layers by protocols such as TCP. 3. Network Layer The Network layer provides the functional and procedural means of transferring variable length data sequences from a source to a destination via one or more networks while maintaining the quality of service requested by the Transport layer. The Network layer performs network routing functions, and might also perform fragmentation and reassembly, and report delivery errors. Routers operate at this layer sending data throughout the extended network and making the Internet possible. This is a logical addressing scheme values are chosen by the network engineer. The addressing scheme is hierarchical. The best known example of a layer 3 protocol is the Internet Protocol (IP). Perhaps its easier to visualize this layer as managing the sequence of human carriers taking a letter from the sender to the local post office, trucks that carry sacks of mail to other post offices or airports, airplanes that carry airmail between major cities, trucks that distribute mail sacks in a city, and carriers that take a letter to its destinations. Think of fragmentation as splitting a large document into smaller envelopes for shipping, or, in the case of the network layer, splitting an application or transport record into packets. The major tasks of network layer are listed It controls routes for individual message through the actual topology.

Finds the best route. Finds alternate routes. It accomplishes buffering and deadlock handling.

4. Transport Layer The Transport layer provides transparent transfer of data between end users, providing reliable data transfer while relieving the upper layers of it. The transport layer controls the reliability of a given link through flow control, segmentation/desegmentation, and error control. Some protocols are state and connection oriented. This means that the transport layer can keep track of the segments and retransmit those that fail. The best known example of a layer 4 protocol is the Transmission Control Protocol (TCP). The transport layer is the layer that converts messages into TCP segments or User Datagram Protocol (UDP), Stream Control Transmission Protocol (SCTP), etc. packets. Perhaps an easy way to visualize the Transport Layer is to compare it with a Post Office, which deals with the dispatch and classification of mail and parcels sent. Do remember, however, that a post office manages the outer envelope of mail. Higher layers may have the equivalent of double envelopes, such as cryptographic Presentation services that can be read by the addressee only. Roughly speaking, tunneling protocols operate at the transport layer, such as carrying non-IP protocols such as IBMs SNA or Novells IPX over an IP network, or end-to-end encryption with IP security (IP sec). While Generic Routing Encapsulation (GRE) might seem to be a network layer protocol, if the encapsulation of the payload takes place only at endpoint, GRE becomes closer to a transport protocol that uses IP headers but contains complete frames or packets to deliver to an endpoint. The major tasks of Transport layer are listed below: It locates the other party It creates a transport pipe between both end-users. It breaks the message into packets and reassembles them at the destination. It applies flow control to the packet stream.

5. Session Layer The Session layer controls the dialogues/connections (sessions) between computers. It establishes, manages and terminates the connections between the local and remote application. It provides for either full-duplex or half-duplex operation, and establishes check pointing, adjournment, termination, and restart procedures. The OSI model made this layer responsible for "graceful close" of sessions, which is a property of TCP, and also for session check pointing and recovery, which is not usually used in the Internet protocols suite. The major tasks of session layer are listed It is responsible for the relation between two end-users. It maintains the integrity and controls the data exchanged between the endusers. The end-users are aware of each other when the relation is established (synchronization). It uses naming and addressing to identify a particular user. It makes sure that the lower layer guarantees delivering the message (flow control).

6. Presentation Layer The Presentation layer transforms the data to provide a standard interface for the Application layer. MIME encoding, data encryption and similar manipulation of the presentation are done at this layer to present the data as a service or protocol developer sees fit. Examples of this layer are converting an EBCDICcoded text file to an ASCII-coded file, or serializing objects and other data structures into and out of XML. The major tasks of presentation layer are listed below: It translates the language used by the application layer. It makes the users as independent as possible, and then they can concentrate on conversation.

7. Application Layer (end users)

The application layer is the seventh level of the seven-layer OSI model. It interfaces directly to the users and performs common application services for the application processes. It also issues requests to the presentation layer. Note carefully that this layer provides services to user-defined application processes, and not to the end user. For example, it defines a file transfer protocol, but the end user must go through an application process to invoke file transfer. The OSI model does not include human interfaces. The common application services sub layer provides functional elements including the Remote Operations Service Element (comparable to Internet Remote Procedure Call), Association Control, and Transaction Processing (according to the ACID requirements). Above the common application service sub layer are functions meaningful to user application programs, such as messaging (X.400), directory (X.500), file transfer (FTAM), virtual terminal (VTAM), and batch job manipulation (JTAM).

3. Explain the following with respect to Data Communications: A) Fourier analysis B) Band limited signals C) Maximum data rate of a channel A) Fourier analysis : In 19th century, the French mathematician Fourier proved that any periodic function of time g (t) with period T can be constructed by summing a number of cosines and sines.

(3.1) Where f=1/T is the fundamental frequency, and are the sine and cosine amplitudes of the nth harmonics. Such decomposition is called a Fourier series. B) Band limited signals : Consider the signal given in figure 3.1(a). Figure shows the signal that is the ASCII representation of the character b which consists of the bit pattern 01100010 along with its harmonics.

Figure 3.1: (a) A binary signal (b-e) Successive approximation of original signal Any transmission facility cannot pass all the harmonics and hence few of the harmonics are diminished and distorted. The bandwidth is restricted to low frequencies consisting of 1, 2, 4, and 8 harmonics and then transmitted. Figure 3.1(b) to 3.1(e) shows the spectra and reconstructed functions for these bandlimited signals. Limiting the bandwidth limits the data rate even for perfect channels. However complex coding schemes that use several voltage levels do exist and can achieve higher data rates. C) Maximum data rate of a channel : In 1924, H. Nyquist realized the existence of the fundamental limit and derived the equation expressing the maximum data for a finite bandwidth noiseless channel. In 1948, Claude Shannon carried Nyquist work further and extended it to the case of a channel subject to random noise. In communications, it is not really the amount of noise that concerns us, but rather the amount of noise compared to the level of the desired signal. That is, it is the ratio of signal to noise power that is important, rather than the noise power alone. This Signal-to-Noise Ratio (SNR), usually expressed in decibel (dB), is one of the most important specifications of any communication system. The decibel is a logarithmic unit used for comparisons of power levels or voltage levels. In order to understand the implication of dB, it is important to know that a sound level of zero dB corresponds to the threshold of hearing, which is the smallest sound that can be heard. A normal speech conversation would measure about 60 dB.

If an arbitrary signal is passed through the Low pass filter of bandwidth H, the filtered signal can be completely reconstructed by making only 2H samples per second. Sampling the line faster than 2H per second is pointless. If the signal consists of V discrete levels, then Nyquist theorem states that, for a noiseless channel Maximum data rate = 2H.log2 (V) bits per second. (3.2) For a noisy channel with bandwidth is again H, knowing signal to noise ratio S/N, the maximum data rate according to Shannon is given as Maximum data rate = H.log2 (1+S/N) bits per second. (3.3)

4. Explain what all facilities FTP offers beyond the transfer function? FTP offers many facilities beyond the transfer function itself:

Interactive Access: It provides an interactive interface that allows humans to easily interact with remote servers. For example: A user can ask for listing of all files in a directory on a remote machine. Also a client usually responds to the input help by showing the user information about possible commands that can be invoked. Format (Representation) Specification: FTP allows the client to specify the type of format of stored data. For example: the user can specify whether a file contains text or binary integers and whether a text files use ASCII or EBCDIC character sets. Authentication Control: FTP requires clients to authorize themselves by sending a login name and password to the server before requesting file transfers. The server refuses access to clients that cannot supply a valid login and password.

5. What is the use of IDENTIFIER and SEQUENCE NUMBER fields of echo request and echo reply message? Explain. Echo Request and Echo Reply message format: The echo request contains an optional data area. The echo reply contains the copy of the data sent in the request message. The format for the echo request and echo reply is as shown in figure 5.2.

Figure 5.2 echo request and echo reply message format The field OPTIONALDATA is a variable length that contains data to be returned to the original sender. An echo reply always returns exactly the same data as ws to receive in the request. Field IDENTIFIER and SEQUENCE NUMBER are used by the sender to match replies to requests. The value of the TYPE field specifies whether it is echo request when equal to 8 or echo reply when equal to 0.

6. In what conditions is ARP protocol used? Explain. ARP protocol : In computer networking, the Address Resolution Protocol (ARP) is the standard method for finding a hosts hardware address when only its network layer address is known. ARP is primarily used to translate IP addresses to Ethernet MAC addresses. It is also used for IP over other LAN technologies, such as Token Ring, FDDI, or IEEE 802.11, and for IP over ATM. ARP is used in four cases of two hosts communicating:

1. When two hosts are on the same network and one desires to send a packet to the other 2. When two hosts are on different networks and must use a gateway/router to reach the other host 3. When a router needs to forward a packet for one host through another router 4. When a router needs to forward a packet from one host to the destination host on the same network The first case is used when two hosts are on the same physical network. That is, they can directly communicate without going through a router. The last three cases are the most used over the Internet as two computers on the internet are typically separated by more than 3 hops. Imagine computer A sends a packet to computer D and there are two routers, B & C, between them. Case 2 covers A sending to B; case 3 covers B sending to C; and case 4 covers C sending to D. ARP is defined in RFC 826. It is a current Internet Standard, STD 37.

Assignment Set 2

MC0075 Computer Networks (Book ID: B0813 & B0814)

1. Discuss the physical description of the different transmission mediums. Physical description: [[Guided transmission media]] Twisted pair=> Physical description>> _ Two insulated copper wires arranged in regular spiral pattern. _ Number of pairs are bundled together in a cable. _ Twisting decreases the crosstalk interference between adjacent pairs in the cable, by using different. _ twist length for neighboring pairs. Coaxial cable>> Physical description>> _ Consists of two conductors with construction that allows it to operate over a wider range of frequencies, compared to twisted pair. _ Hollow outer cylindrical conductor surrounding a single inner wire conductor. _ Inner conductor held in place by regularly spaced insulating rings or solid dielectrical material. _ Outer conductor covered with a jacket or shield. _ Diameter from 1 to 2.5 cm. _ Shielded concentric construction reduces interference and crosstalk. _ Can be used over longer distances and support more stations on a shared line than twisted pair. Optical fiber=> Physical description>> 1. Core _ Innermost section of the fiber

_ One or more very thin (dia - 8-100 mm). 2. Cladding _ Surrounds each strand _ Plastic or glass coating with optical properties different from core _ Interface between core and cladding prevents light from escaping the core 3. Jacket _ Outermost layer, surrounding one or more claddings _ Made of plastic and other materials _ Protects from environmental elements like moisture, abrasions, and crushing

[[Wireless Transmission]] Terrestrial microwave=>> Physical description: _ Parabolic dish antenna, about 3m in diameter. _ Fixed rigidly with a focused beam along line of sight to receiving antenna. _ With no obstacles, maximum distance (d, in km) between antennae can be d = 7:14 Rt(Kh) [[where h is antenna height and K is an adjustment factor to account for the bend in microwave due to earth's curvature, enabling it to travel further than the line of sight; typically K = 4/3]] _ Two microwave antennae at a height of 100m may be as far as 7:14 Rt(Kh) _ Long distance microwave transmission is achieved by a series of microwave relay towers Satellite microwave=>> Physical description: _ Communication satellite is a microwave relay station between two or more ground stations (also called earth stations). _ Satellite uses different frequency bands for incoming (uplink) and outgoing (downlink) data. _ A single satellite can operate on a number of frequency bands, known as transponder channels or transponders. _ Geosynchronous orbit (35,784 km). _ Satellites cannot be too close to each other to avoid interference. _ Current standard requires a 4_ displacement in the 4/6 ghz band and 3_ displacement at 12/14 Ghz.

_ This limits the number of available satellites. Broadcast radio=> Physical description: _ Omnidirectional transmission. _ No need for dish antennae. 2. Describe the following Medium Access Control Sub Layers Multiple access protocols: A) Pure ALOHA B) Slotted ALOHA Pure ALOHA or Unslotted ALOHA: The ALOHA network was created at the University of Hawaii in 1970 under the leadership of Norman Abramson. The Aloha protocol is an OSI layer 2 protocol for LAN networks with broadcast topology. The first version of the protocol was basic: If you have data to send, send the data If the message collides with another transmission, try resending later

Figure 7.3: Pure ALOHA

Figure 7.4: Vulnerable period for the node: frame The Aloha protocol is an OSI layer 2 protocol used for LAN. A user is assumed to be always in two states: typing or waiting. The station transmits a frame and checks the channel to see f it was successful. If so the user sees the reply and continues to type. If the frame transmission is not successful, the user waits and retransmits the frame over and over until it has been successfully sent. Let the frame time denote the amount of time needed to transmit the standard fixed length frame. We assume the there are infinite users and generate the new frames according Poisson distribution with the mean N frames per frame time. So if N>1 the users are generating the frames at higher rate than the channel can handle. Hence all frames will suffer collision. Hence the range for N is 0>1, many retransmissions and hence G>N. Under all loads: throughput S is just the offered load G times the probability of successful transmission P0

The probability that k frames are generated during a given frame time is given by Poisson distribution

So the probability of zero frames is just . The basic throughput calculation follows a Poisson distribution with an average number of arrivals of 2G arrivals per two frame time. Therefore, the lambda parameter in the Poisson distribution becomes 2G. Hence Hence the throughput We get for G = 0.5 resulting in a maximum throughput of 0.184, i.e. 18.4%. Pure Aloha had a maximum throughput of about 18.4%. This means that about 81.6% of the total available bandwidth was essentially wasted due to losses from packet collisions.

Slotted ALOHA or Impure ALOHA An improvement to the original Aloha protocol was Slotted Aloha. It is in 1972, Roberts published a method to double the throughput of an pure ALOHA by uses discrete timeslots. His proposal was to divide the time into discrete slots corresponding to one frame time. This approach requires the users to agree to the frame boundaries. To achieve synchronization one special station emits a pip at the start of each interval similar to a clock. Thus the capacity of slotted ALOHA increased to the maximum throughput of 36.8%. The throughput for pure and slotted ALOHA system is as shown in figure 7.5. A station can send only at the beginning of a timeslot, and thus collisions are reduced. In this case, the average number of aggregate arrivals is G arrivals per 2X seconds. This leverages the lambda parameter to be G. The maximum throughput is reached for G = 1.

Figure 7.5: Throughput versus offered load traffic. With Slotted Aloha, a centralized clock sent out small clock tick packets to the outlying stations. Outlying stations were only allowed to send their packets immediately after receiving a clock tick. If there is only one station with a packet to send, this guarantees that there will never be a collision for that packet. On the other hand if there are two stations with packets to send, this algorithm guarantees that there will be a collision, and the whole of the slot period up to the next clock tick is wasted. With some mathematics, it is possible to demonstrate that this protocol does improve the overall channel utilization, by reducing the probability of collisions by a half. It should be noted that Alohas characteristics are still not much different from those experienced today by Wi-Fi, and similar contention-based systems that have no carrier sense capability. There is a certain amount of inherent inefficiency in these systems. It is typical to see these types of networks throughput break down significantly as the number of users and message burstiness increase. For these reasons, applications which need highly deterministic load behavior often use token-passing schemes (such as token ring) instead of contention systems For instance ARCNET is very popular in embedded applications. Nonetheless, contention based systems also have significant advantages, including ease of management and speed in initial communication. Slotted Aloha is used on low bandwidth tactical Satellite communications networks by the US Military; subscriber based Satellite communications networks, and contact less RFID technologies.

3. Discuss the different types of noise with suitable example . Noise: Noise is a third impairment. It can be define as unwanted energy from sources other than the transmitter. Thermal noise is caused by the random motion of the electrons in a wire and is unavoidable. Consider a signal as shown in figure 3.5, to which a noise shown in figure 3.6, is added may be in the channel.

Figure 3.5: Signal

Figure 3.6: Noise

Figure 3.7: Signal + Noise At the receiver, the signal is recovered from the received signal and is shown in figure 3.7. That is signals are reconstructed by sampling. Increased data rate implies "shorter" bits with higher sensitivity to noise Source of Noise Thermal: Agitates the electrons in conductors, and is a function of the temperature. It is often referred to as white noise, because it affects uniformly the different frequencies. The thermal noise in a bandwidth W is Where T=temperature, and k= Boltzmanns constant = 1.38 10-23 Joules/degrees Kelvin.

Signal to noise ratio:

Typically measured at the receiver, because it is the point where the noise is to be removed from the signal.

Intermodulation: Results from interference of different frequencies sharing the same medium. It is caused by a component malfunction or a signal with excessive strength is used. For example, the mixing of signals at frequencies f1 and f2 might produce energy at the frequency f1 + f2 . This derived signal could interfere with an intended signal at frequency f1 + f2 . Cross talk: Similarly cross talk is a noise where foreign signal enters the path of the transmitted signal. That is, cross talk is caused due to the inductive coupling between two wires that are close to each other. Sometime when talking on the telephone, you can hear another conversation in the background. That is cross talk. Impulse: These are noise owing to irregular disturbances, such as lightning, flawed communication elements. It is a primary source of error in digital data.

4. What is Non-repudiation? Define cryptanalysis. Non-repudiation: Non-repudiation, or more specifically non-repudiation of origin, is an important aspect of digital signatures. By this property an entity that has signed some information cannot at a later time deny having signed it. Similarly, access to the public key only does not enable a fraudulent party to fake a valid signature. It deals with signatures. Not denying or reneging. Digital signatures and certificates provide nonrepudiation because they guarantee the authenticity of a document or message. As a result, the sending parties cannot deny that they sent it (they cannot repudiate it). Nonrepudiation can also be used to ensure that an e-mail message was opened. Example: how does one prove that the order was placed by the customer. Cryptanalysis: The main constraint on cryptography is the ability of the code to perform the necessary transformation. From the top-secret military files, to the protection of private notes between friends, various entities over the years have found themselves in need of disguises for their transmissions for many different reasons. This practice of disguising or scrambling messages is called encryption.

In cryptography, a digital signature or digital signature scheme is a type of asymmetric cryptography used to simulate the security properties of a signature in digital, rather than written, form. Digital signature schemes normally give two algorithms, one for signing which involves the users secret or private key, and one for verifying signatures which involves the users public key. The output of the signature process is called the "digital signature."

5. Explain mask-address pair used in update message. Discuss importance of path attributes. Update message

Figure 7.4 BGP UPDATE message format The BGP peers after establishing a TCP connection sends OPEN message and acknowledge then. Then use UPDATE message to advertise new destinations that are reachable or withdraw previous advertisement when a destination has become unreachable. The UPDATE message format is as shown in Figure 7.4. Each UPDATE message is divided into two parts: 1. List of previously advertised destinations that are being withdrawn 2. List of new destinations being advertised. Fields labeled variable do not have fixed size. Update message contains following fields: WITHDRAWN LEN: is a 2-byte that specifies the size of withdrawn destinations field. If no withdrawn destination then its value =0 WITHDRAWN DESTINATIONS: IP addresses of withdrawn destinations. PATH LEN: is similar to WITHDRAWN LEN, but it specifies the length of path attributes that are associated with new destinations being advertised.

Figure 7.5 BGP compress format This single octet LEN specifies the number of bits in the mask, assuming contiguous mask. The address that follows is also compressed and only those octets are covered by mask is included. Depending on the value of LEN the number of octets in the address field is listed in table 7.1. A zero length is used for default router. Table 7.1 LEN Less than 8 9 to 16 17 to 24 25 to 32 Path Attribute Path attributes are factored to reduce the size of the update message. That is attributes apply to all destinations. If any destinations have different attributes then, they must be advertised in a separate update message. The path attribute (4-octet) contains a list of items and each item is of the form given in figure 7.6(a). Number of octets in address 1 2 3 4

(type, length, value) Figure 7.6 (a) path attribute item The type is 2 bytes long. The format of the type field of an item in path attribute is given in figure 7.6(b).

Figure 7.6 (b) BGP 2-octet type field of path attribute FLAG Bits 0 1 2 3 5-7 Description Zero for required attribute, set to 1 if optional Set to 1 for transitive, zero for no transitive Set to 1 for complete, zero for partial Set to 1 for length is 2 octets, zero for one Unused and must be zero

Figure 7.7 Flag bits of type field of path attribute Type code 1 2 3 4 5 6 7 Description Specify origin of the path information List of AS on path to destination Next hop to use for destination Discriminator used for multiple AS exit points Preference used within an AS Indication that routes have been aggregated ID of AS that aggregates routes

Figure 7.8 Type codes of type field The values of flag bits and type code of type field of an item in path attribute is given in figure 7.7 and figure 7.8 respectively. A length field follows type field may be either 1 or 2 bytes long depending on flag bit 3, which specifies the size of the length field. Then the contents of length field gives the size of the value filed. Importance of path attributes:

1. Path information allows a receiver to check for routing loops. The sender can specify exact path through AS to the destination. If any AS is listed more than once then there is a routing loop. 2. Path information allows a receiver to implement policy constraints. A receiver can examine the path so that they should not pass through untrusted AS. 3. Path information allows a receiver to know the source of all routes.

6. Explain the following with respect to E-Mail: A) Architecture B) Header format A) Architecture : E-mail system normally consists of two sub systems 1. the user agents 2. the message transfer agents The user agents allow people to read and send e-mails. The message transfer agents move the messages from source to destination. The user agents are local programs that provide a command based, menu-based, or graphical method for interacting with e-mail system. The message transfer agents are daemons, which are processes that run in background. Their job is to move datagram e-mail through system. A key idea in e-mail system is the distinction between the envelope and its contents. The envelope encapsulates the message. It contains all the information needed for transporting the message like destinations address, priority, and security level, all of which are distinct from the message itself.

B) Header format The header is separated from the body by a blank line. Envelopes and messages are illustrated in figure 8.1. 9 Fig. 8.1: E-mail envelopes and messages The message header fields that are used in an example of figure 8.1.

consists of following fields From: The e-mail address, and optionally name, of the sender of the message. To: one or more e-mail addresses, and optionally name, of the receivers of the message. Subject: A brief summary of the contents of the message. Date: The local time and date when the message was originally sent. E-mail system based on RFC 822 contains the message header as shown in figure 8.2. The figure gives the fields along with their meaning.

Fig. 8.2: Fields used in the RFC 822 message header The fields in the message header of E-mail system based on RFC 822 related to message transport are given in figure 8.3. The figure gives the fields along with their meaning.

Fig. 8.3: Header Fields related to message Transport