
Controlled copy

Release ID: QSCO-GDPROG.doc / 07.02.2003 C3: Protected

Good Programming Concepts


Table of Contents

1.0 Introduction

1.1 Purpose and Scope

2.0 Background Information

2.1 Desirable Characteristics in Programs

2.2 Measuring and Verifying Desirable Characteristics

2.3 Detailed Design Parameters

2.3.1 Program structuring

2.3.2 Usability

2.3.3 Integrity

3.0 Programming Conventions/Guidelines

3.1 Introduction

3.2 Code Layout and Presentation Aspects

3.3 Identification Headers

3.4 Naming Conventions

3.5 Data Declarations

3.5.1 Header Files/ Libraries

3.5.2 Global Declarations

3.5.3 Local Declarations

3.5.4 Array Declarations

3.5.5 Function Declarations

3.6 Statement Construction

3.7 Defensive Programming and Error Tolerance

3.7.1 Input-output

3.7.2 Security

3.7.3 Handling abnormal situations

3.8 Modifiability/Flexibility

3.9 On-Screen Error and Help Messages

3.10 Efficiency

3.11 Commenting Source Code



1.0 Introduction

1.1 Purpose and Scope

Programs are written to meet their specifications, but that cannot be the only goal, because the same specifications can be met by several programs written in different ways. What, then, are the other goals that the programmer should meet? Given alternative ways of meeting specifications, how does one judge which is the better way of coding? This document identifies desirable characteristics in a program and practices to achieve them.

2.0 Background Information

2.1 Desirable Characteristics in Programs

Apart from functional correctness, some of the characteristics desirable in a program are:

Efficiency: The amount of computing resources and code required by a program to perform a function

Flexibility: Effort required to modify an operational program (also called modifiability)

Integrity: The extent to which access to software or data by unauthorized persons can be controlled

Interoperability: Effort required to couple one system with another

Maintainability: Effort required to locate and fix an error in an otherwise operational program

Portability: Effort required to transfer a program from one hardware configuration and/or environment to another

Reliability: The extent to which a program can be expected to perform its intended function with the required precision, consistently

Reusability: Extent to which a program can be used in other applications related to the packaging and scope of the functions that the program performs

Testability: Effort required to test a program to ensure that it performs its intended function

Usability: Effort required to learn, operate, prepare input and interpret output of a program

High cohesion and low coupling are also desirable characteristics of programs. It may be impossible for a program to possess all the stated characteristics, so some prioritization has to be done based on the application and the environment in which the program is to function.



2.2 Measuring and Verifying Desirable Characteristics

Defining desirable characteristics is one thing; specifying ways of measuring and verifying them is a different thing altogether. Correctness and, to some extent, reliability can be checked through a unit test, but it is difficult to verify the other characteristics in a given program.

The general practice is to define programming conventions or guidelines which theoretically enhance the above characteristics. Then, by means of code reviews, walkthroughs and inspections, one checks whether the given program adheres to these conventions. An auxiliary benefit of such reviews is that they highlight potential problems that no one would have suspected.

2.3 Detailed Design Parameters

It should be remembered that even before the first line of code is written in a program, a lot of decisions have already been taken about the program as part of the Low Level or Detailed Design stage. When the specifications of a program are handed over to the programmer, it is on the assumption that the design has taken care of program structuring, usability and integrity (security) considerations. Each of these is explained in the following paragraphs:

2.3.1 Program structuring

The program should be structured into hierarchically organized logical segments (sometimes called modules, paragraphs or functions) having the following characteristics:

Each segment or module is of the right size

Each segment or module performs a distinct function (this is a characteristic of high cohesion). A good design would have the following arrangement:

Top-level segments focus on high-level control, directing program flow

Going down the hierarchy, the emphasis on control reduces, with increasing focus on detail or specific actions

The lowest segments are very specific and action-oriented, having little or no program control. Typically they contain statements such as database calls, reads, calculations, etc.

Interdependence between segments is low (i.e. coupling is low)

Each segment has a single entry point and a single exit point

Data type declarations (for example, #define INT int in C) are used to take care of data storage differences

Separate modules to handle platform and OS specific function calls



Wherever possible, values should be passed to a called program or function by value only, and not by address or reference. This prevents inadvertent modification of arguments within a function. Arguments should be passed by reference only when absolutely required.
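The difference between the two ways of passing arguments can be sketched in C as follows. This is only an illustration: the Account type and the function names add_fee, deposit and balance_of are hypothetical and do not come from any Company Standard.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
typedef struct
{
  long balance;
} Account;

/* Pass by value: the function works on its own copy of 'amount',
   so the caller's variable cannot be changed by accident. */
long add_fee(long amount, long fee)
{
  amount += fee;   /* modifies the local copy only */
  return amount;
}

/* Pass by address only because the caller's object must be updated. */
void deposit(Account *account, long amount)
{
  if (account != NULL)
  {
    account->balance += amount;
  }
}

/* Read-only access: a pointer to const avoids copying the structure
   while still preventing modification inside the function. */
long balance_of(const Account *account)
{
  long balance = 0;

  if (account != NULL)
  {
    balance = account->balance;
  }
  return balance;
}
```

After a call such as add_fee(amount, 5), the caller's amount is unchanged, whereas deposit(&account, 250) deliberately changes the caller's object.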

2.3.2 Usability

This is a major determinant of how “user-friendly” the program will be perceived to be. Factors directly enhancing usability are:

Functionality: The set of features that the user perceives in the program.

Error Tolerance: The degree of tolerance shown by the program towards mistakes committed by the user.

Informativeness: The degree to which the program is informative in terms of providing help facilities, informative prompts and error messages.

Intuitiveness: The degree to which the program follows a logical or common sense based sequence of actions; in other words, the degree to which the user can operate the program without specific training.

Effort: The amount of effort required to operate the software (the lesser the better). This includes providing input (data, commands), operating the software and interpreting the output.

2.3.3 Integrity

The program should restrict access to data and other resources to authorized users only. For any function, access should be provided only to the data required to accomplish that function.

If the program design does not provide for the above aspects in the manner outlined, the programmer must take it up with the designer and get it resolved.

3.0 Programming Conventions/Guidelines

3.1 Introduction

This Section outlines concepts that are generic in nature and applicable to most software tools and platforms. Platform specific conventions and guidelines are covered under the relevant Company Standard.

A Project may adopt its own standards based on customer requirement. In such cases, this fact should be clearly documented and made available to all team members as well as the quality personnel. Some of the items that may require project-specific conventions are -- naming conventions, standard screen structure (headers, footers, border etc), report formats, date formats (DD/MM/YY, MM/DD/YY etc), other user interface standards (for example, conventions for functions keys, standard exit and entry screens etc.), attributes of display items (position on screen, font, size, attributes, screen size, colors used, etc.).



3.2 Code Layout and Presentation Aspects

Code layout deals with the structure of the code and the way it is laid out. It affects the readability of the code and the ease of modifying it. Since these details are closely coupled with the code, standards or guidelines must be adhered to during the initial coding phase itself; it is very difficult to modify or beautify code at a later date.

1. Layout guidelines should be followed from the initial coding phase, since layout affects both readability and ease of modification.

2. A good layout should bring out the logical program structure through appropriate use of blank lines, spaces and indentation

3. Only one statement per line should be coded even if the language allows for more than one statement per line.

4. Whenever possible, the following parts of the code should be indented and aligned properly through the use of tabs and/or spaces:

A block of code following a control structure (applied recursively to nested blocks)

Statements that continue to the next line

Comments

Spaces and parentheses should be used to enhance readability in the following parts of the code:

Expressions involving array references, pointers, indices and subscripts,

Parameters / arguments of functions / sub-programs / routines

Logical expressions involving use of relational operators (like <, >), Boolean operations (like AND, OR) when execution order is not obvious

Blank lines should be used to space out the following parts of a program

Heading, titles

Beginning of a new paragraph or function

Changes made to fix bugs

Statement label to which control jumps occur

Groups of assignment statements (aligned around the “=” sign)

For data declarations the following guidelines should be followed:

Only one data item per line should be declared to increase flexibility

Variables of the same type should be defined together, in alphabetical order

Common variables / structures used across modules should be declared in a separate library file and included

When fixing bugs:

The statement(s) in error should be commented out, with a comment identifying the bug report identifier

The changed statement(s) should follow, with a blank line on either side

At the beginning of the program/module, a line or paragraph should be added for each bug fixed, giving the bug report identifier

Lines fixing the bugs can be tagged with the bug report identifier in languages that provide this feature

IF-THEN-ELSE should be coded as follows:

IF and the corresponding ELSE (and ENDIF) should be aligned

The statements in the THEN and ELSE branches should be indented to the right

The layout style should be durable enough to handle changes over time; for example, minimal indentation (2 positions) leaves room for new nested lines, long comments and long variable names
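The layout guidelines in this section can be illustrated with a short C function. This is a sketch only: the function shipping_cost and all the constants in it are hypothetical values chosen for the example. It shows one statement per line, aligned if/else, 2-position indentation, blank lines between logical groups and parentheses that make the order of evaluation explicit.

```c
#define BASE_COST           50   /* hypothetical flat charge */
#define BASE_WEIGHT_KG       5   /* weight covered by the flat charge */
#define EXTRA_COST_PER_KG    8   /* hypothetical surcharge per extra kg */
#define EXPRESS_MULTIPLIER   2   /* hypothetical express factor */

/* Returns the shipping cost for a parcel of the given weight. */
int shipping_cost(int weight_kg, int is_express)
{
  int cost;

  if (weight_kg <= BASE_WEIGHT_KG)
  {
    cost = BASE_COST;
  }
  else
  {
    cost = BASE_COST + ((weight_kg - BASE_WEIGHT_KG) * EXTRA_COST_PER_KG);
  }

  if (is_express)
  {
    cost = cost * EXPRESS_MULTIPLIER;
  }

  return cost;
}
```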

3.3 Identification Headers

Each program should contain certain basic information about itself, its author, revision history etc. While this will not add to its functionality, it is of immense help during maintenance. Some of the aspects that could be contained are as follows (this is only a suggested list, not a mandatory one):

******************************************************************************
* PROGRAM NAME:
*
* FUNCTIONALITY:
*
* IMPORTANT PARAMETERS:
*
* FILES USED:
*   Logical     Physical    Opening     Purpose
*   Name        Name        Mode        Served
*
*
* FUNCTIONS USED and CALLED:
*   Name        Type        Purpose
*
*
* DEVELOPMENT HISTORY:
*   Revision    Main        Date of       Comments
*   Number      Author      Completion
*
*
* REMARKS:
*
*
******************************************************************************

3.4 Naming Conventions

Conventions bring about uniformity in the way program entities like variables, symbolic constants, procedures and functions are named. The benefit of adhering to naming conventions is that somebody going through the program can get an idea of the purpose of various entities from their names, which enhances program readability. The naming convention for a particular language or tool is covered under its relevant Company Standard. In the absence of a Company Standard, the following guidelines may be followed:

Procedure names should consist of a strong verb followed by an object. For example, PrintReport(), CalcMthRevenue(), CheckOrderInfo(), etc.

Function names should describe the value returned by the function. For example, NextCustomerId(), Cost(), PrinterReady(), etc.

Variable names should be relevant to the nature of the variable, with a length of not more than 15 characters.

Variable names should be meaningful and should carry a prefix or suffix that identifies their type; for example, integer variable names could start with I.

A hyphen or underscore should be used to separate the words in a variable name and so make it meaningful; for example, End_of_file.



3.5 Data Declarations

A program consists of two basic entities -- Data and Instructions. Data elements or structures should be declared and initialized before (executable) instructions.

3.5.1 Header Files/ Libraries

All header files and libraries used in the program (whether standard or user defined) should be declared, if necessary with the relevant path.

3.5.2 Global Declarations

The number of global declarations used should be minimized so as to reduce coupling between modules. Care should be taken to ensure that global declarations are not misused within the sub-programs.

3.5.3 Local Declarations

It is advisable to initialize local variables to avoid inadvertent carry over of values from outside the module.

All unique or complex variables or data structures should be described through appropriate comments, clarifying the reason for such complexity.

3.5.4 Array Declarations

Arrays having more than three dimensions are cumbersome to handle and should be avoided. Array bounds must be specified such that no overflow or excessive memory usage takes place at run-time.

3.5.5 Function Declarations

Functions and their parameters should be declared taking care to ensure that no type mismatches occur during runtime between the calling and called module/function/procedure.

The number of parameters passed between functions or procedures or modules should be reduced as much as possible, in order to reduce coupling.

The sub-section on Code Layout (3.2) describes how data declarations should be laid out.

3.6 Statement Construction

The source code should be uniformly written in a case (UPPER, lower or Mixed) appropriate to the language used. Structured programming constructs should be used to the maximum extent possible.



Only standard (ANSI) features should be used. If for some reason, features specific to a particular compiler are used, then they should be highlighted through comments.

Parentheses should be used to clarify the intent of arithmetic and logical expressions to the reader (even if the compiler does not require them).
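As a minimal C sketch of why explicit grouping matters: in C, == binds more tightly than &, so a bit test written without parentheses does not mean what it appears to mean. The flag FLAG_READY and both function names are hypothetical.

```c
#include <stdbool.h>

#define FLAG_READY 0x04u   /* hypothetical status bit */

/* Without the inner parentheses, 'status & FLAG_READY == FLAG_READY'
   would be parsed as 'status & (FLAG_READY == FLAG_READY)', that is,
   'status & 1', which tests the wrong bit. */
bool is_ready(unsigned status)
{
  return (status & FLAG_READY) == FLAG_READY;
}

/* Parentheses that the compiler does not need, added for the reader. */
bool in_range_or_forced(int value, int low, int high, bool forced)
{
  return ((value >= low) && (value <= high)) || forced;
}
```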

The usage of temporary variables to store intermediate results should be reduced, if not avoided. This is because they occupy extra memory space and affect program readability.

Mixing variable types in an arithmetic statement, though supported by the compiler, should be avoided, since the implicit type conversions add overhead and can cause an unexpected loss of precision.

Care should be taken in coding integer arithmetic, since the results can sometimes be unpredictable (due to truncation and rounding off).
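The truncation problem can be seen in a small percentage calculation. The two C functions below are a hypothetical illustration; they assume that part is not negative, that total is greater than zero and that the values are small enough for part * 100 not to overflow an int.

```c
/* Integer division truncates, so the order of operations matters:
   (part / total) * 100 is 0 whenever part < total.
   Multiplying first keeps the significant digits. */
int percent_truncated(int part, int total)
{
  return (part * 100) / total;
}

/* Rounds to the nearest whole percent instead of truncating.
   Assumes part >= 0 and total > 0. */
int percent_rounded(int part, int total)
{
  return ((part * 100) + (total / 2)) / total;
}
```

For two parts out of three, the truncating version gives 66 and the rounding version gives 67; dividing before multiplying would give 0.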

The program should ensure that there is no “dead code” which will never be accessed. This also helps in making the program precise and complete in itself.

Altering the loop control variable within the loop should be avoided at all costs.

Excessive nesting of loops and IF statements should be avoided (typically, more than three levels is not advisable)

Complicated and negative conditions in an IF statement should be avoided and, if possible, replaced by simpler equivalents. If the language permits symbolic conditions (for example, level 88 in COBOL), then these should be used extensively.

If the program logic calls for nesting multiple IF statements, then the most frequently occurring conditions should be tested first. This improves program efficiency. Additionally, it may be possible that after one condition is satisfied, it is unnecessary to check for other conditions. There is an elegant way of accomplishing this; rather than nesting multiple IF statements, a paragraph that contains the IF statements should be created and exited when one condition is satisfied.
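In C, one way to sketch this idea is an else-if chain inside a small function: the conditions are tested in order, testing stops as soon as one is satisfied, and the function still has a single exit point. The type OrderClass, the function classify_order and the thresholds are hypothetical, and the assumption that small orders are the most frequent case is made only for the example.

```c
/* Hypothetical order categories. */
typedef enum
{
  ORDER_SMALL,
  ORDER_MEDIUM,
  ORDER_LARGE,
  ORDER_INVALID
} OrderClass;

/* Classifies an order by quantity. The case assumed to be the most
   frequent is tested first; once a condition is satisfied, the
   remaining conditions are not evaluated. */
OrderClass classify_order(int quantity)
{
  OrderClass result;

  if ((quantity >= 1) && (quantity <= 10))
  {
    result = ORDER_SMALL;
  }
  else if ((quantity > 10) && (quantity <= 100))
  {
    result = ORDER_MEDIUM;
  }
  else if (quantity > 100)
  {
    result = ORDER_LARGE;
  }
  else
  {
    result = ORDER_INVALID;
  }

  return result;
}
```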

The following strategy may be followed with regard to GOTO statement:

They should be avoided, if possible, and used only in those cases where they reduce the complexity of the code and enhance its readability. For example, going to the EXIT paragraph of a PERFORM loop in a COBOL program.

The following, however, should be avoided:

Jumping from inside a loop to a point outside the loop

Backward jumps i.e. jumping to an instruction which precedes the current instruction

The label of the statement to which control is jumping should be in capitals (where the syntax of the language permits it)

The GOTO statement should be in a separate line by itself (to draw attention)

The statement to which control is jumping should be preceded and followed by blank lines



3.7 Defensive Programming and Error Tolerance

In the course of program execution several unexpected things may happen, like wrong or erroneous data being fed-in, rounding off or truncation errors, divide-by-zero, security access violations, file corruption etc. Hence it is a good programming practice to include provisions to detect and correct, if possible, different types of abnormal situations and errors at every stage. This is the principle of defensive programming. However, it should be noted that if carried to extreme lengths, defensive programming can result in inefficient code having bad readability; hence this concept has to be used judiciously.

Defensive programming practices can be classified under three heads -- Input-output, security, and handling abnormal situations. These are explained in the following sub-sections:

3.7.1 Input-output

Some good practices are:

When input is required from the user, an appropriate message should be made available giving him/her the type and width of input required (in many languages, the prompt is accompanied by characters or a box denoting the maximum field width).

Using uniform and simple format specifications when displaying and storing data.

Validating the input by checking for different types of valid and invalid values

Asking the user to reconfirm when the action requested has irrecoverable consequences (like deleting a file or record)

Providing preview facilities before printing data

Storing spaces and/or flushing buffer variables before re-using them
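Input validation of the kind listed above might look like the following in C. This is a sketch under stated assumptions, not a prescribed routine: the function name parse_int_in_range is hypothetical, and the caller is assumed to have already read the text (for example with fgets) and removed the trailing newline.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parses 'text' as a decimal integer in the range [min, max].
   Returns true and stores the value in *out only if the whole string
   is a valid number within the range; otherwise *out is left untouched. */
bool parse_int_in_range(const char *text, int min, int max, int *out)
{
  bool valid = false;
  char *end = NULL;
  long value;

  if ((text != NULL) && (out != NULL) && (*text != '\0'))
  {
    errno = 0;
    value = strtol(text, &end, 10);

    if ((errno != ERANGE)      /* did not overflow a long */
        && (end != text)       /* at least one digit was read */
        && (*end == '\0')      /* no trailing characters */
        && (value >= min)
        && (value <= max))
    {
      *out = (int)value;
      valid = true;
    }
  }

  return valid;
}
```

A caller would typically repeat the prompt, stating the expected type and range, until the function returns true.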

3.7.2 Security


Restricting program access to authorized users through IDs, Passwords, etc.

Checking for security permissions before accessing key resources

Locking files and records to help data integrity

Taking appropriate actions when unauthorized usage is detected. Some of these could be, for example, terminating the session, initiating a warning message to the system administrator, etc.

3.7.3 Handling abnormal situations


1. Checking for rounding, truncation and overflow of numbers after critical calculations

2. Checking for overflows in subscripts/indexes of arrays and tables

3. Providing fail-safe features like check-pointing, crash-recovery etc.



4. Performing a clean exit whenever the program terminates (normally or abnormally). This calls for the following actions:

Releasing resources obtained (memory, printer, tape drive, record-locks, etc.)

Closing all files and devices used

Flushing all system buffers

Displaying appropriate messages
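A clean exit can be sketched in C by routing every path, normal or abnormal, through one clean-up point. The function count_lines and the buffer size are hypothetical. The forward jump to a single clean-up label is one of the limited uses of GOTO allowed under section 3.6, and the label is laid out as described there.

```c
#include <stdio.h>
#include <stdlib.h>

#define READ_BUFFER_SIZE 4096   /* hypothetical buffer size */

/* Counts the newline characters in the file at 'path'.
   Returns 0 on success and -1 on any failure. Every resource that was
   obtained is released on every path through the function. */
int count_lines(const char *path, long *line_count)
{
  FILE *file = NULL;
  char *buffer = NULL;
  long count = 0;
  size_t bytes_read;
  size_t i;
  int status = -1;   /* assume failure until the work is complete */

  if ((path == NULL) || (line_count == NULL))
  {
    goto CLEANUP;
  }

  file = fopen(path, "rb");
  if (file == NULL)
  {
    goto CLEANUP;
  }

  buffer = malloc(READ_BUFFER_SIZE);
  if (buffer == NULL)
  {
    goto CLEANUP;
  }

  while ((bytes_read = fread(buffer, 1, READ_BUFFER_SIZE, file)) > 0)
  {
    for (i = 0; i < bytes_read; i++)
    {
      if (buffer[i] == '\n')
      {
        count++;
      }
    }
  }

  if (!ferror(file))
  {
    *line_count = count;
    status = 0;
  }

CLEANUP:

  free(buffer);   /* free(NULL) does nothing */
  if (file != NULL)
  {
    fclose(file);
  }
  return status;
}
```

Displaying a message to the user is left to the caller here, which can act on the returned status.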

3.8 Modifiability/Flexibility

Requirements change, expectations change, and so it is only natural that programs have to be modified to suit the new specifications. This means that the program should be flexible enough to be modified with little effort (easier said than done!!).

Flexibility is determined to a very large extent by the process design; a design that results in high cohesion and low coupling is relatively easy to modify.

With regard to coding, one of the major factors constraining modifiability is hard-coding, i.e. the extent to which values have been hard-coded in the program. The greater the degree of hard-coding, the lower the modifiability.

Some practices which can enhance flexibility are:

Replacing literal constants by symbolic constants, which have been separately defined. For example, suppose a program has to deal with an array consisting of 100 elements, a better way of doing this is shown below (the example is in ‘C’ language, but the idea can be extended to other languages)

#define ARRAY_LIMIT 100

for (i = 0; i < ARRAY_LIMIT; i++)
{
  /* process element i of the array */
}

The advantages of using the symbolic constant (ARRAY_LIMIT) instead of the literal constant (100) in the example are:

1. It is clear that the upper limit for the loop is the size of the array

2. If the array size changes from 100 to 200, only the #define statement has to be changed

Creating file-based error and help messages (this is explained further in sub-section 3.9)

Using a data-dictionary or a separate header/library file for all variable declarations used in different programs (this is very useful in COBOL where all record structures are stored in a separate file and included by means of a COPY statement)

Converting settings related to screen and visual characteristics into parameters, stored in a header or control file. Some of the key aspects that need to be covered are foreground color, background color, border color, cursor, text color, font, size etc.

Replacing multiple instances of the same repeated code by a function or procedure. Incidentally, this also improves efficiency.

Using a CASE-like construct instead of nesting several levels of IF statements, if permitted by the program logic.
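In C, the CASE-like construct is the switch statement. The sketch below uses a hypothetical TxnType enumeration and txn_label function; adding a new transaction type then means adding one case rather than restructuring nested IF statements.

```c
/* Hypothetical transaction codes. */
typedef enum
{
  TXN_DEPOSIT,
  TXN_WITHDRAWAL,
  TXN_TRANSFER
} TxnType;

/* Returns a display label for the given transaction type. */
const char *txn_label(TxnType type)
{
  const char *label;

  switch (type)
  {
    case TXN_DEPOSIT:
      label = "Deposit";
      break;

    case TXN_WITHDRAWAL:
      label = "Withdrawal";
      break;

    case TXN_TRANSFER:
      label = "Transfer";
      break;

    default:   /* defensive: unexpected value */
      label = "Unknown";
      break;
  }

  return label;
}
```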

3.9 On-Screen Error and Help Messages

This section describes text-based error and help messages (though most of the ideas are applicable for visual messages also). These have a strong bearing on user-friendliness. Guidelines to be followed are:

1. Message content and timing should be appropriate and specific to the context, and free from spelling or grammatical errors.

2. The message should not get cleared until there is some action from the user (like hitting a key). This ensures that the user notices the message.

3. The message should provide a reference or id that can be used when consulting a reference manual for further information.

4. Error messages should be displayed at a constant location on the screen (the user can then look at that portion of the screen to check if something has gone wrong). If this is not possible, the message should be displayed in a manner that is bound to attract the user’s attention. This can be done using different visual attributes (color, font, size, etc.), or characteristics (bold, reverse video, blinking etc.)

5. Special care should be taken to ensure that an error message says what is wrong in a manner that the user can understand and relate to. For example, “ERROR # 109: Master record missing for this customer-id!” is much better than “ERROR # 109: Rec. not found!”

6. If possible, the message should state what possibly could be done to correct the error, and provide a reference to related topics or a help index that can be used to obtain further information.

7. In order to facilitate maintenance, embedding messages in the code should be avoided. A better strategy is to store help and error messages in separate indexed files, and then display the appropriate message during program execution. This provides the following benefits:

Messages can be easily located and changed

Messages can be modified without editing the source code and re-compiling

Support for multiple languages (German, French, etc.) can be easily provided

The same message can be used when the context is similar. This reduces coding effort and makes modification easier
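A file-based message lookup could be sketched in C as shown below. Everything about the format is an assumption made for illustration: each line of the message file is taken to be of the form id|text, ids are taken to be positive numbers, and the function name find_message is hypothetical. A real indexed file would avoid the sequential scan used here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MESSAGE_LINE_MAX 256   /* hypothetical maximum line length */

/* Looks up message 'id' in a catalog whose lines have the assumed
   form "<id>|<text>". Copies the text into 'out' (at most size - 1
   characters) and returns 1 if found, 0 otherwise. */
int find_message(FILE *catalog, int id, char *out, size_t size)
{
  char line[MESSAGE_LINE_MAX];
  char *separator;
  char *text;
  int found = 0;

  if ((catalog != NULL) && (out != NULL) && (size > 0))
  {
    rewind(catalog);
    while (!found && (fgets(line, sizeof line, catalog) != NULL))
    {
      separator = strchr(line, '|');
      if (separator != NULL)   /* lines without a separator are skipped */
      {
        *separator = '\0';     /* split the id from the text */
        if (atoi(line) == id)
        {
          text = separator + 1;
          text[strcspn(text, "\r\n")] = '\0';   /* drop the line ending */
          strncpy(out, text, size - 1);
          out[size - 1] = '\0';
          found = 1;
        }
      }
    }
  }

  return found;
}
```

Because the text lives in the file, a message can be reworded or translated without editing or recompiling the program that displays it.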

Throughout the application, help should be available to the user through the same key or icon. All help messages should appear in the same location on the screen.



During input, the interface should:

Highlight the input in error

Provide the user with a list (box) of valid values

Beep on and/or disallow invalid types of data (e.g. alphabetic characters in numeric fields)

In case of Windows based applications,

The error box should have an appropriate title;

The help window should have the look and feel of a Windows Help screen. This means support for aspects like hypertext etc.

3.10 Efficiency

This is an aspect that has to be carefully tackled because, if wrongly done, it can cause more harm than good. Some of the guidelines to be kept in mind are:

Optimization should start with the design. Fine-tuning the design can lead to improvements in efficiency that are several orders of magnitude greater than what can be achieved through code optimization. So, given a choice, it is better to concentrate on the design than on the code.

Code optimization should be taken up only after all other issues like functionality, readability etc., have been satisfactorily resolved. Optimization should never be at the cost of readability, portability and maintainability, unless there are very compelling reasons (in which case, there should be sufficient documentation to explain the optimization done and the reasons).

Repeating or common code should be replaced by a function or paragraph (containing the repeating code) and called whenever required. This enhances efficiency and improves readability and maintainability

If possible, an attempt should be made to reduce file I/O requests since they slow down execution. This can be done by replacing several logical reads by one physical read of a large size. To the extent possible, buffered I/O should be used (if permitted by the compiler).
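In C, one way to apply this is to give a stream a large buffer with the standard setvbuf function, so that many small logical reads are served from one physical read. The sketch below is hypothetical: the function name open_buffered and the 64 KB size are assumptions, and a suitable size should be chosen by measurement.

```c
#include <stdio.h>

#define IO_BUFFER_SIZE (64 * 1024)   /* hypothetical size; tune by measurement */

/* Opens 'path' for reading as a fully buffered stream with a large
   buffer. Returns NULL if the file cannot be opened or the buffer
   cannot be set. The caller closes the stream with fclose. */
FILE *open_buffered(const char *path)
{
  FILE *file = NULL;

  if (path != NULL)
  {
    file = fopen(path, "rb");
  }

  if (file != NULL)
  {
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(file, NULL, _IOFBF, IO_BUFFER_SIZE) != 0)
    {
      fclose(file);
      file = NULL;
    }
  }

  return file;
}
```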

Dead-code should be removed (after suitable investigation!)

Mixing data types in arithmetic statements, even if permitted by the compiler, should be avoided since the implicit conversions can take a toll on efficiency. It may be worthwhile to use the facilities provided by the language to convert explicitly from one type to another (INT, REAL etc.)

If the programming language or tool used supports special features that enhance program efficiency without sacrificing readability, portability and maintainability, they should be used. It is always safer to let the tool optimize the code than to do it manually. Sometimes the compiler itself flags possible problems in the code; these warnings should never be ignored.

Use the special features that the programming language provides in support of algorithms or functions. If the programming language supports recursion, then recursion should be used where applicable.



3.11 Commenting Source Code

It has been estimated that in a large project, the effort that goes into creating program-related documentation (including comments) is about twice as much as the effort required to code!!

There are several important points that have to be kept in mind in this regard. These are:

1. Comments cannot improve the readability of badly written source code. The better approach is to write code that is self-explanatory, and add comments to make it read better.

2. It is better to have no comments than comments which:

Are inaccurate, misleading or out-dated

Merely repeat what is said in the source code

Disturb the flow of the source code, thus interrupting the reading process

Are excessively verbose, or in other ways irritate the person reading the program

Like source code, comments also need maintenance (otherwise they become outdated and hence useless). This means that comments have to be written such that they are easy to change.

Whenever possible, the following aspects of the code should be commented:

Identification headers (i.e., function name, purpose, author, functions called, parameters passed and received, modification history, limitations etc.)

Data declarations (especially symbolic constants, arrays and array limits, units of measure of numeric data, special variables, flags, loop counters etc.)

Assumptions made about input and output variables (range of allowable values, accuracy)

Assumptions made in other parts of the program (for example, user interface)

Beginning and end of a block of code - function, module, control structures (for, while, until); more so when such structures are nested

Changes made to fix bugs in earlier versions

Any code whose purpose and functioning is not obvious, and which has been added to optimize performance or other parameters of interest (a better solution would be to rewrite the complex code thus obviating the necessity for comments)

Code that gets around an error or undocumented feature in the language or environment

Code that violates programming conventions (along with reasons for the violation)

Global data being used (and modified) in a function or program

Comments describing a block or paragraph of code, should ideally describe what the block of code is trying to do, rather than how. This helps in several ways as follows:



The information it contains is one that is not present in the code; so there is no redundancy.

Since it focuses more on what, it is likely to remain relevant even if the how changes due to changes in the code

One line of comment can summarize a large number of source code lines; this makes it easier to write and read

Good comments have the following traits:

They say things about the code, which the code cannot say by itself

They are accurate, concise and provide valuable information

They focus on what the code does rather than how it is being done (as stated earlier in this section)

They are easy to write and modify

They stand apart from the code, and do not interrupt the flow during reading (important comments have blank lines before and after them; ordinary comments are indented to the right to enhance readability)

They are physically close to the code they are providing information about

A better way of developing comments without extra effort is to adopt the following approach:

Write the program logic in Program Design Language or pseudo-code (with each logical statement in a separate line)

Convert the pseudo-code statements into comment lines

Follow each pseudo-code statement by code that implements the pseudo-code

Add comments to explain variables, and other aspects outlined in the preceding paragraphs
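The approach can be sketched with a small C function in which each pseudo-code statement has become a comment line, followed by the code that implements it. The function average_of_positives is a hypothetical example; it assumes that values is not NULL whenever count is greater than zero.

```c
/* Returns the average of the positive values in 'values',
   or 0.0 if there are none. */
double average_of_positives(const int *values, int count)
{
  double average = 0.0;
  int i;                /* loop counter */
  int positives = 0;    /* number of positive values seen so far */
  long sum = 0;         /* running total of the positive values */

  /* For each value in the list */
  for (i = 0; i < count; i++)
  {
    /* If the value is positive, add it to the running total */
    if (values[i] > 0)
    {
      sum += values[i];
      positives++;
    }
  }

  /* If at least one positive value was found, compute the average */
  if (positives > 0)
  {
    average = (double)sum / positives;
  }

  return average;
}
```

The comments state what each step is for, so they stay valid even if the way the step is coded changes later.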
