mt311 (oct 2007) java application development variables – binding, type, and scope expression...

MT311 (Oct 2007) Java Application Development

Variables – binding, type, and scope

Expression

Tutorial 6

Tutor Information

Edmund Chiu (Group 2) Email: [email protected] Please begin your email subject with [MT311] Webpage: http://www.geocities.com/gianted

Part I

Variable Attributes

Variables

A variable has six attributes:
– Name
– Address
– Value
– Type
– Lifetime
– Scope

Variable Name

Issues that affect the choice of names:

– Maximum length of a name: a small limit forces short names, which reduce readability. Example: access_level is better than acclvl

– Connector characters: some languages allow _ in names to improve readability. In a case-sensitive language, mixed case can be used instead. Example: access_level and accessLevel are both better than accesslevel

– Case sensitivity: if Edge and edge are different variables, they are easily confused

– Reserved words / keywords: too many reserved words make programs harder to write

Advantages and Disadvantages of No Reserved Word

Disadvantage: if very common keywords (e.g. IF) are not reserved, a program that uses such a keyword as a variable name is very difficult to understand.

Advantages of having no reserved words:
– the programmer never has to avoid an identifier because it has been reserved
– extending the language does not break old programs. With reserved words, a newly reserved word may already be used as an identifier in existing code, which would then no longer be valid.

Variable Name Limit in Different Languages

Address

The location of memory space for a variable, sometimes called l-value

The time at which the address is fixed is called the address binding time. It may be:
– load time, when the program is loaded into memory
– run time, when a specific statement is executed

Type

Type specifies the range of possible values of a variable

Primitive data types are defined when the language is designed.

User defined types are defined by users when a program is written.

Value

Value of a variable is the contents of the memory address associated with the variable (r-value)

Some languages do not check whether the value stored in a variable is within the valid range, which sacrifices reliability for efficiency

Lifetime and Scope

The lifetime of a variable begins when memory space is allocated to the variable

The lifetime ends when the space becomes unavailable

The scope of a variable is the range of statements in which the variable can be referenced

Example of Lifetime and Scope

The abc declared in class A is an instance variable: its lifetime starts when an object of class A is created and ends when the object is destroyed

The scope of abc in class A is every statement in the class. In methodB, where a local variable has the same name, the instance variable is referenced as this.abc

The abc declared in methodB starts its lifetime when the method is invoked and ends when the method returns

The scope of abc in methodB is the statements of methodB only

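class A {
    int abc; // instance variable of class A
    ...
    public void methodA(int a) {
        abc = a; // refers to the instance variable
    }
    public void methodB(int a) {
        int abc = a; // local variable, hides the instance variable
        ...
    }
    ...
}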
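A runnable variant of the example may make the two lifetimes easier to see. This is only a sketch: the class name ScopeDemo, the return value of methodB and the extra assignment to this.abc are assumptions added for illustration.

```java
public class ScopeDemo {
    int abc; // instance variable: lives as long as the ScopeDemo object

    void methodA(int a) {
        abc = a; // no local abc here, so this is the instance variable
    }

    int methodB(int a) {
        int abc = a;        // local variable: created per call, hides the field
        this.abc = abc + 1; // this.abc still reaches the hidden instance variable
        return abc;         // the local abc disappears when methodB returns
    }

    public static void main(String[] args) {
        ScopeDemo obj = new ScopeDemo();
        obj.methodA(5);
        System.out.println(obj.abc);         // 5
        System.out.println(obj.methodB(10)); // 10
        System.out.println(obj.abc);         // 11
    }
}
```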

Part II

Type Binding and Type Checking

Binding

Binding refers to the association between:
– an attribute and an entity
   – association of a type with a variable: type binding
   – association of an address with a variable: address/storage binding
– an operation and a symbol (association of a meaning with a symbol)

Binding Time

A binding can take place at any of the following times:
– Language design time: e.g., the meaning of an operation is defined when the language is designed
– Language implementation time: e.g., different compilers may implement a type differently, such as with a different range
– Compile time: the type of a variable is usually bound when the program is compiled
– Link time: in C, the relative address of a function is bound when the program is linked
– Load time: the absolute address of a function is bound when the program is loaded

Binding Time (cont'd)

– Run time: local variables of a function are usually bound to storage when the function is invoked

In general, there are two types of binding:
– Static binding: the binding occurs before run time and remains unchanged throughout program execution
– Dynamic binding: the binding occurs during run time and may change while the program executes

Static Type Bindings

Many programming languages, such as C, Pascal, Fortran and Cobol, mainly use static type binding

– The data type of a variable is defined using either explicit declaration or implicit declaration

– The compiler will bind the data type to a variable when the declaration statement is read

– The data type will not be changed throughout the whole program execution

An advantage of static type binding is that the type of a variable is known at compile time and thus the compiler can detect errors due to incompatible types

Dynamic Type Binding

Some other languages like APL and SNOBOL4 use dynamic type binding

– the data type of a variable is determined at run time by the interpreter

– the data type of the variable can be changed during the execution of the program

Advantage
– Generic subroutines can be written: the same subroutine can support different data types

Disadvantages
– the type checking mechanism of static type binding cannot be used
– type information must be maintained and checked at run time, which makes the running cost very high

Storage Bindings and Variable Lifetime

The lifetime of a variable is the time during which the variable is bound to a specific memory location

– It begins when the memory location is allocated to the variable and ends when the memory location is deallocated

Four categories of variables can be distinguished according to their lifetime

– Static variables
– Stack-dynamic variables
– Explicit heap-dynamic variables
– Implicit heap-dynamic variables

Static Variables

They are bound to specific memory locations when the program is loaded into memory, before program execution begins

They remain bound to the same memory locations until program execution ends

In Java, all static attributes of a class are static variables

Static variables are efficient and accessible throughout the whole program

However, they reduce flexibility and cannot support recursive subroutines. Their storage is also held for the whole run and cannot be shared with other uses
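The Java case can be sketched as follows. The class StaticCounter and its fields are hypothetical names chosen for illustration: the static field has a single storage location shared by every object, while each object gets its own copy of id.

```java
public class StaticCounter {
    static int created = 0; // one copy for the whole program run
    int id;                 // one copy per object

    StaticCounter() {
        created++;          // every constructor call updates the same location
        id = created;
    }

    public static void main(String[] args) {
        StaticCounter first = new StaticCounter();
        StaticCounter second = new StaticCounter();
        System.out.println(first.id + " " + second.id + " " + created);
    }
}
```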

Stack-Dynamic Variables

They are bound to storage at run time, when their declaration statements are elaborated (reached)

The memory for stack-dynamic variables is allocated from the run-time stack

Type is statically bound to the variable

In Java, local variables of primitive types are stack-dynamic

Stack-dynamic variables support recursive subroutines and allow deallocated memory space to be reused
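Recursion is the usual illustration of why stack-dynamic storage matters. In this sketch (the class and method names are assumptions), every active call of factorial has its own copy of the local variable result, which would not be possible if result were bound to a single static location.

```java
public class RecursionDemo {
    static long factorial(int n) {
        long result; // stack-dynamic: a fresh copy for every active call
        if (n <= 1) {
            result = 1;
        } else {
            result = n * factorial(n - 1); // the inner call uses its own result
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // 120
    }
}
```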

Explicit Heap-Dynamic Variables

The storage is allocated and deallocated by explicit instructions or system functions in the program

– C uses malloc and free; C++ uses new and delete
– Java uses the "new" keyword (there is no explicit deallocation; it is done by garbage collection)
– The memory is usually allocated from the heap
– The variables are usually accessed through pointers or references

In Java, objects are explicit heap-dynamic variables

Explicit heap-dynamic variables are usually used for dynamic structures, but pointers and references are more difficult to use correctly. The cost of allocation and deallocation is also a consideration

Implicit Heap-Dynamic Variables

The storage, type and values are all bound at runtime only when the variables are assigned values.

Implicit heap-dynamic variables give the highest degree of flexibility

The runtime overhead of maintaining all dynamic attributes is very high.

Error detection (e.g. type checking) by the compiler is difficult, if not impossible

Type Checking

Type checking is the activity of ensuring that the operands of an operator, or the arguments of a function, are of compatible types

– A type error occurs if an argument of an inappropriate type is passed to a function parameter for which no type checking has been done

If a language has only static type binding, type checking can nearly always be done by the compiler (static type checking)

Dynamic type checking is much more expensive and complicated

– Example: a C++ union can hold values of different types at different times, so checking its use would require dynamic type checking

Strong Typing

A language is strongly typed if all type errors are always detected, either at compile time or at run time

– C, C++, FORTRAN and COBOL are NOT strongly typed
– Java, Pascal and Ada are nearly strongly typed

Some languages convert one or more operands to a different type to make them compatible in an operation. This implicit conversion is called a coercion

– Too few coercions make the language less flexible
– Too many coercions make it hard for the compiler to detect programming errors and weaken the purpose of having a strongly typed language

Type Compatibility

Different languages have their rules to decide which data types are mutually compatible

Two fundamental rules for type compatibility:
– Name type compatibility: two variables are of the same type only if they are declared with the same type name
– Structure type compatibility: two variables are of the same type if their types have the same structure

Example of Type Compatibility

With name type compatibility, only A and B are of the same type

With structure type compatibility, all three variables are of the same type

type
    arraytype1 = array [1..10] of integer;
    arraytype2 = array [1..10] of integer;
var
    A, B: arraytype1;
    C: arraytype2;
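Java classes behave like the name-compatibility case. The sketch below mirrors the Pascal declarations with two hypothetical classes, ArrayType1 and ArrayType2, that have identical structure; the helper sameType is also an assumption added for illustration.

```java
public class NameCompatDemo {
    static class ArrayType1 { int[] data = new int[10]; }
    static class ArrayType2 { int[] data = new int[10]; } // same structure, different name

    // True only when both objects were created from the same named class.
    static boolean sameType(Object x, Object y) {
        return x.getClass() == y.getClass();
    }

    public static void main(String[] args) {
        ArrayType1 a = new ArrayType1();
        ArrayType1 b = new ArrayType1();
        ArrayType2 c = new ArrayType2();
        a = b;    // accepted: both are declared with the same type name
        // a = c; // rejected by the compiler: incompatible types
        System.out.println(sameType(a, b) + " " + sameType(a, c));
    }
}
```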

Comparison of Type Compatibility

Type Compatibility (cont'd)

Few programming languages use strict name or structure type compatibility.

– Pascal uses a slight variation of name type compatibility called declaration equivalence: a programmer may declare a type to be equivalent to another type

– C uses a variation of structure type compatibility – name type compatibility is used for union and struct types

– Ada uses a variation of name type compatibility but provides subtypes (compatible with parent type) and derived types (not compatible with parent type).

Scope

The scope of a variable is the range of statements in which the variable can be referenced
– In static scoping, the compiler determines the scope of each variable by inspecting the program text
– In dynamic scoping, the scope of a variable can only be determined at run time. The calling sequence of the subprograms affects the scope of a variable.

Part III

Variable Scope and Lifetime

Static Scope

In static-scoped languages with nested subprograms, a reference to a variable is resolved in this way:

– The local variables are searched first
– If the variable is not found, we search the program that defines the subprogram (static parent)
– If the variable is still not found, we continue to the program that defines that program (static ancestor), until the outermost program has been searched
– Example (see the Scope Example below): sub1 is the static parent of sub2 and Big is a static ancestor of sub2
– y is not declared in sub2, so the y of sub1 is used in sub2

In C and Java,
– We only need to search the local variables and the global (C) or instance (Java) variables
– Variables declared in a block are local to that block

Scope Example

procedure Big;
    var y : integer;
    procedure sub1;
        var x, y : integer;
        procedure sub2;
            var x : integer;
        begin
            x := 1;
            y := 2;
        end;
    begin
        ...
    end;
begin
    ...
end.

Dynamic Scope

With dynamic scoping, references to variables are resolved by the calling sequence of the subprograms

– Assume that Big calls Sub2: y in Sub2 then refers to the y of Big instead of that of Sub1, because it is Big that called Sub2
– If there were no y in Big, we would look for y in the program that called Big
– Thus, with dynamic scoping, the same subprogram may refer to different variables depending on who called it
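Java itself is statically scoped, so dynamic scoping can only be simulated. The sketch below is one possible simulation, not how any real language implements it: each call pushes a map of its local variables onto a stack, and resolve searches the stack from the newest frame. All names (frames, resolve, big, sub1, sub2) are assumptions that mirror the Big/Sub1/Sub2 example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class DynamicScopeDemo {
    // One frame of local variables per active call; the newest frame is first.
    static final Deque<Map<String, Integer>> frames = new ArrayDeque<>();

    // Dynamic scoping: search the chain of callers, newest frame first.
    static Map<String, Integer> resolve(String name) {
        for (Map<String, Integer> frame : frames) {
            if (frame.containsKey(name)) return frame;
        }
        throw new IllegalStateException("unbound variable: " + name);
    }

    static void sub2() {                  // declares x only
        Map<String, Integer> locals = new HashMap<>();
        locals.put("x", 0);
        frames.push(locals);
        resolve("x").put("x", 1);         // x := 1 (local x)
        resolve("y").put("y", 2);         // y := 2 (y of the most recent caller that has one)
        frames.pop();
    }

    static void sub1() {                  // declares x and y, then calls sub2
        Map<String, Integer> locals = new HashMap<>();
        locals.put("x", 0);
        locals.put("y", 0);
        frames.push(locals);
        sub2();
        frames.pop();
    }

    // Returns the value of Big's own y after the call chain has finished.
    static int big(boolean callSub2Directly) {
        Map<String, Integer> locals = new HashMap<>();
        locals.put("y", 0);
        frames.push(locals);
        if (callSub2Directly) sub2(); else sub1();
        int y = locals.get("y");
        frames.pop();
        return y;
    }

    public static void main(String[] args) {
        System.out.println(big(true));  // 2: sub2 changed the y of Big
        System.out.println(big(false)); // 0: sub2 changed the y of sub1 instead
    }
}
```

The same sub2 body updates a different y depending on the caller, which is the behaviour described in the last bullet above.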

Disadvantages of Dynamic Scoping

Dynamic scoping is less reliable because all local variables of the calling subprogram are visible from the called program. That makes information hiding impossible

The compiler cannot check type compatibility because it does not know where a non-local variable is declared

Referencing non-local variables is more expensive

The program is more difficult to read because the identity of a non-local variable is difficult to trace by just reading the program source code

Scope and Lifetime

Scope and lifetime are not always related
– A static variable declared inside a function in C or C++ is statically bound to the scope of that function, but its lifetime extends over the entire execution of the program
– A variable's scope does not extend into a called subprogram, but its lifetime continues while the called subprogram executes, and the variable can be accessed again after that subprogram returns

Referencing Environments

The referencing environment is the other side of the scope concept
– We take the point of view of a program statement and list all variables that are visible in that statement
– When a variable name is declared in both the caller and the called subprogram, the caller's variable is temporarily hidden

Example in Referencing Environment (Static Scope)

procedure Ex is
    A, B : Integer;
    ...
    procedure Sub1 is
        X, Y : Integer;
    begin { Sub1 }
        ... -- POINT 1
    end;
    procedure Sub2 is
        X : Integer;
        ...
        procedure Sub3 is
            X : Integer;
        begin { Sub 3 }
            ... -- POINT 2
        end;
    begin { Sub 2 }
        ... -- POINT 3
    end;
begin { Ex }
    ... -- POINT 4
end.

If static scoping is used, the referencing environments are

– At point 1: X and Y of Sub1, A and B of Ex
– At point 2: X of Sub3 (X of Sub2 is hidden), A and B of Ex
– At point 3: X of Sub2, A and B of Ex
– At point 4: A and B of Ex

Example in Referencing Environment (Dynamic Scope)

void sub1() {
    int a, b;
    ... // POINT 1
} // end of sub1

void sub2() {
    int b, c;
    ... // POINT 2
    sub1();
} // end of sub2

void main() {
    int c, d;
    ... // POINT 3
    sub2();
} // end of main

If dynamic scoping is used, the referencing environments are

– At point 1: a and b of sub1, c of sub2, d of main (c of main and b of sub2 are hidden)
– At point 2: b and c of sub2, d of main (c of main is hidden)
– At point 3: c and d of main

Part IV

Data Types

Primitive Data Types

Nearly all programming languages provide a set of primitive data types

– They are the building blocks of user-defined data types

The common primitive data types
– Numeric types include integers, floating-point numbers, and sometimes decimal types
– Boolean types: usually stored in one byte instead of one bit for efficient access
– Character types

C is a special case in which the differences among these types are very vague
– There is no Boolean type, and integer and character types are interchangeable
– This is flexible (better writability), but the type checking mechanism is weakened (lower reliability)

Character String Types

When designing a String type, we need to consider:
– Should a String be an array of characters or a primitive data type?
– Should strings have static or dynamic length?
   – Dynamic length strings require more complex storage management: the storage for a string grows and shrinks dynamically
   – Limited dynamic strings as in C do not need a run-time descriptor, because the end of the string is marked by a special character
– What operations are allowed on a String?

Character String Types in Common Languages

C/C++ use char arrays (limited dynamic length) to store character strings
– An end-of-string character (\0) is placed at the end of the string
– Operations are provided through a standard library

Java supports strings through the String class (constant strings) and the StringBuffer class (changeable strings, like arrays of characters)

Fortran 95 treats strings as a primitive data type
– Assignment, relational, catenation and substring operations are available
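The difference between the two Java classes can be seen in a short sketch. The class name StringDemo and the helper build are assumptions; concat, append and setCharAt are standard methods of String and StringBuffer.

```java
public class StringDemo {
    // Builds a string by changing a StringBuffer in place.
    static String build() {
        StringBuffer buf = new StringBuffer("MT");
        buf.append("311");     // the same buffer object grows
        buf.setCharAt(0, 'm'); // individual characters can be overwritten
        return buf.toString();
    }

    public static void main(String[] args) {
        String s = "MT";
        String t = s.concat("311"); // s is unchanged; a new String is returned
        System.out.println(s + " " + t + " " + build());
    }
}
```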

User-Defined Ordinal Types

Enumeration type (C/Pascal)
– Increases readability and reliability (if the enumeration type limits the values that can be stored in the variable)
– All possible values (named constants) are provided in the definition
– In C, the enumeration constants are implicitly assigned integer values, or explicitly assigned other values; a variable of such a type will then accept any integer value

Subrange type (Pascal/Ada)
– A contiguous subsequence of an ordinal type
– Subrange types require the compiler to generate range-checking code for every assignment to a variable

Array Types

Issues in arrays
– Array indexes (subscripts): if enumerated types are used as subscripts, the program is more readable and reliable. If the index range is checked implicitly (e.g., in Java), it is also more reliable
– Subscript range bounds: the lower bound in C-based languages is fixed at 0. Some other languages fix it at 1
– Initialization at storage allocation: increases writability
– Multi-dimensional arrays: rectangular or jagged
– Arrays with no subscript bounds (unconstrained arrays): useful in defining methods, but not used in C (pointer access is used instead)
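A few of these issues can be observed directly in Java. The sketch below uses hypothetical names (ArrayDemo, inRange); it relies only on the standard behaviour that an out-of-range subscript raises ArrayIndexOutOfBoundsException.

```java
public class ArrayDemo {
    // Uses Java's implicit range check to report whether index is a valid subscript.
    static boolean inRange(int[] arr, int index) {
        try {
            int unused = arr[index]; // checked against 0 .. arr.length - 1 at run time
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] marks = {70, 85, 90};                  // initialised at allocation
        int[][] jagged = {{1}, {2, 3}, {4, 5, 6}};   // rows of different lengths
        System.out.println(inRange(marks, 0));       // lower bound is 0
        System.out.println(inRange(marks, 3));       // one past the last element
        System.out.println(jagged[2].length);
    }
}
```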

Array Categories

Static Array
– Subscript range is statically bound
– Storage allocation is static (done before run time)

Fixed Stack-Dynamic Array
– Subscript range is statically bound
– Storage allocation is done at declaration elaboration time
– More space efficient than a static array

Stack-Dynamic Array
– Subscript range is dynamically bound and storage allocation is dynamic (done during run time)
– However, once the range and storage are bound, they remain fixed during the lifetime of the variable

Array Categories (cont'd)

Fixed Heap-Dynamic Array
– Similar to fixed stack-dynamic, but the range is dynamically bound and the storage is allocated from the heap
– Like a stack-dynamic array, range and storage are fixed once bound

Heap-Dynamic Array
– Binding of both subscript ranges and storage allocation is dynamic and can change freely during the array's lifetime

Addresses of Array Elements

The address of an element in a one-dimensional array
– address(arr[k]) = address(arr[0]) + k * element_size

In a multi-dimensional array, the storage is actually laid out in a one-dimensional way
– Different formulas are used for row major order and column major order arrays
– For row major order
   address(arr[i,j]) = address(arr[0,0]) + i * col_in_row * element_size + j * element_size
– For column major order
   address(arr[i,j]) = address(arr[0,0]) + j * row_in_col * element_size + i * element_size
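The two formulas can be checked with a small calculation. This is a sketch with assumed names (ArrayAddress, rowMajor, columnMajor) and zero-based subscripts; the base address 1000 and the 3 x 4 array of 4-byte elements are made-up sample values.

```java
public class ArrayAddress {
    // address(arr[i,j]) for row major order
    static long rowMajor(long base, int i, int j, int colInRow, int elementSize) {
        return base + (long) i * colInRow * elementSize + (long) j * elementSize;
    }

    // address(arr[i,j]) for column major order
    static long columnMajor(long base, int i, int j, int rowInCol, int elementSize) {
        return base + (long) j * rowInCol * elementSize + (long) i * elementSize;
    }

    public static void main(String[] args) {
        // A 3 x 4 array of 4-byte elements starting at address 1000, element [1,2]
        System.out.println(rowMajor(1000, 1, 2, 4, 4));    // 1000 + 16 + 8 = 1024
        System.out.println(columnMajor(1000, 1, 2, 3, 4)); // 1000 + 24 + 4 = 1028
    }
}
```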

Associative Arrays

Associative arrays (or hashes) are built into Perl and are available in Java through library classes
– They store associations of keys to values
– The associated value can be easily retrieved by using the key (good for searching)
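In Java, one common choice is java.util.HashMap. The sketch below uses made-up keys and values; the class AssocDemo and the method sampleMarks are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class AssocDemo {
    // Builds a small key-to-value table of sample data.
    static Map<String, Integer> sampleMarks() {
        Map<String, Integer> marks = new HashMap<>();
        marks.put("alice", 82);
        marks.put("bob", 67);
        marks.put("bob", 70); // storing under an existing key replaces the old value
        return marks;
    }

    public static void main(String[] args) {
        Map<String, Integer> marks = sampleMarks();
        System.out.println(marks.get("bob"));         // value retrieved by key
        System.out.println(marks.containsKey("eve")); // absent key
    }
}
```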

Record Types

In C, we use struct to form a record type that includes fields of different types
– The use of operators in C increases writability

In Java, records can be defined as data classes, with data members as the record fields

To refer to record fields, some languages allow elliptical references: the record name can be omitted in a reference to a field

The operations allowed on record types are limited: referencing, taking the address and assignment are the typical operations

Assignment to a record is sometimes done through a collection of values, e.g., rec := ("ABC", 100);
– Supported by Ada and C (in initialization only)

Union Types

A union is a type that allows a variable to store values of different data types at different times during program execution

– It allows heterogeneous data structures to be defined, such as a tree whose nodes hold either integers or floating-point numbers
– In Java, inheritance from a root class can do the same task and is more reliable

In some languages (C/C++), a union is a new data type in itself. It is called a free union

Other languages require the union to be embedded in a record type, together with a tag field that indicates the current type. It is called a discriminated union

Type checking is difficult because it must be dynamic
– If no type checking is employed, there is no assurance that the value is of the intended type

Example of Discriminated Union Type

type Node (Tag : Boolean) is
    record
        case Tag is
            when true => Count : Integer;
            when false => Sum : Float;
        end case;
    end record;

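[Figure: descriptor of the discriminated union Node. The TAG field (BOOLEAN) selects an entry in a case table: True leads to COUNT (INTEGER), False leads to SUM (FLOAT). Each entry records a name and type, with an address and offset.]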

Pointer Types

A pointer type is one in which the variables contain memory addresses, or null

– Null is a special value that means the pointer cannot be used to reference any memory location

Pointers provide the power of indirect addressing and a method of dynamic storage management

– The storage of the variable pointed to is bound at run time, and the memory is allocated from the heap
– Dereferencing a pointer accesses, indirectly, the variable that the pointer points to
– The data type of the values a pointer can point to is usually predefined

Pointer Problem – Dangling Pointers

Occurs when a pointer contains the address of a heap-dynamic variable that has been deallocated:

– Pointer p1 is set to point at a new heap-dynamic variable
– Pointer p2 is assigned p1's value
– The heap-dynamic variable is explicitly deallocated through p1 (and p1 is set to null); p2 still holds the old address and is now dangling

The deallocated location may have been reallocated to a new heap-dynamic variable of a different data type, so a type checking error occurs if the dangling pointer is used

Even if no type checking error occurs, the dangling pointer has no legitimate relation to the new variable

The dangling pointer can change the value of the new heap-dynamic variable

If the location is now used by the storage management system, it may hold a pointer in a chain, and changing its value will cause the storage manager to fail

Pointer Problem – Lost Heap-Dynamic Variables

Occurs when an allocated heap-dynamic variable is no longer accessible to the user program

– Pointer p1 is set to point to a new heap-dynamic variable
– Pointer p1 is then set to point to another new heap-dynamic variable, so the original heap-dynamic variable is now inaccessible
– The inaccessible variable is also called garbage

Languages that require explicit deallocation of dynamic variables share the above problem, which is also known as memory leakage

Reference Types

A reference type is a safer version of a pointer
– In C++, reference types provide two-way communication between the caller and the called function without explicit dereferencing
– In Java, references refer to class instances, and arithmetic on references is not allowed
– In Java, a reference can be reassigned to refer to a different class instance
– No dangling references can occur in Java because class instances are deallocated implicitly (by garbage collection)

Part V

Expressions

Arithmetic Expressions

In programming languages, arithmetic expressions consist of
– Operators (unary: single operand; binary: two operands; ternary: three operands, as in the C conditional operator ? :)
– Operands, parentheses, and function calls

Operator Evaluation Order (highest precedence first)
– Fortran: ** > *, / > all +, -
– C-based languages: postfix ++, -- > prefix ++, --, unary +, - > *, /, % > binary +, -
– Ada: **, abs > *, /, mod, rem > unary +, - > binary +, -

Associativity
– Fortran: all operators are left associative except **, which is right associative
– C-based languages: *, /, %, and binary +, - are left associative; ++, -- and unary +, - are right associative
– Ada: all operators are left associative except **, which is non-associative

Example of Operator Evaluation Order and Associativity

Consider the expression 3 * 4 – 5 – 6 * 2 – 7
– Using mathematical precedence and associativity:
   ( ( (3*4) – 5 ) – (6*2) ) – 7 = -12
– If the precedence of – is higher than * and left associativity is used for all operators:
   ( 3 * ( (4-5) – 6 ) ) * (2-7) = 105
– If the precedence of – is higher than * and right associativity is used for all operators:
   3 * ( ( 4 – (5-6) ) * (2-7) ) = -75
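The three results can be reproduced in Java, whose own rules match the mathematical case. The alternative rules are imitated here with explicit parentheses; the class and method names are assumptions.

```java
public class PrecedenceDemo {
    // Java's own precedence: * before -, left to right
    static int standard() {
        return 3 * 4 - 5 - 6 * 2 - 7;
    }

    // "- binds tighter than *", left associative, written out with parentheses
    static int minusFirstLeft() {
        return (3 * ((4 - 5) - 6)) * (2 - 7);
    }

    // "- binds tighter than *", right associative, written out with parentheses
    static int minusFirstRight() {
        return 3 * ((4 - (5 - 6)) * (2 - 7));
    }

    public static void main(String[] args) {
        System.out.println(standard());        // -12
        System.out.println(minusFirstLeft());  // 105
        System.out.println(minusFirstRight()); // -75
    }
}
```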

Operand Evaluation Order

Side effects of functions and operators make operand evaluation order important

Example: int a=9, b; b = a + a++;
– In C, no operand evaluation order is defined for this expression, so either 18 or 19 may be produced, depending on how the compiler is implemented
– In Java, operands are evaluated in left-to-right order, which eliminates the above problem
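The Java behaviour can be confirmed with the same statement. In this sketch (class and method names are assumptions), the left operand a is read as 9 before a++ takes effect.

```java
public class OperandOrderDemo {
    // Returns {a, b} after executing the statement from the example.
    static int[] evaluate() {
        int a = 9;
        int b = a + a++; // left operand read first (9), then a++ yields 9 and sets a to 10
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int[] result = evaluate();
        System.out.println("a = " + result[0] + ", b = " + result[1]); // a = 10, b = 18
    }
}
```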

Overloaded Operators

The same operator can denote different operations depending on the situation
– Example: * can mean both multiplication and dereferencing in C
– This is called operator overloading

Problems
– Poor readability; errors are also harder to find, because leaving out an operand may still yield a compilable program

Another commonly overloaded operator is /
– It is used for both floating-point and integer division
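The overloading of / is easy to observe in Java, where the operand types select the operation. The names below are assumptions made for this sketch.

```java
public class DivisionDemo {
    static int intDiv(int x, int y) {
        return x / y; // both operands are int: integer division, fraction discarded
    }

    static double floatDiv(double x, double y) {
        return x / y; // both operands are double: floating-point division
    }

    public static void main(String[] args) {
        System.out.println(intDiv(7, 2));   // 3
        System.out.println(floatDiv(7, 2)); // 3.5
    }
}
```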

Type Conversion

When the compiler finds that the operands of an operator are not of the same type, coercion (implicit type conversion) occurs according to the language specification
– In most common languages, mixed-mode arithmetic expressions are allowed without any restriction
– In Java, byte and short operands are coerced to int whenever virtually any operator is applied to them
– Too much coercion makes errors difficult to detect, while too little makes the language less flexible

Explicit type conversion can also be requested: type casting
– Overflow or underflow problems may occur as a result of casting
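Both kinds of conversion can be sketched in Java. The class and method names are assumptions; the sample values are chosen so that the narrowing casts lose information.

```java
public class ConversionDemo {
    // byte operands are coerced to int before the addition, so the result is an int
    static int addBytes(byte x, byte y) {
        return x + y;
    }

    // Explicit narrowing cast: only the low 8 bits are kept
    static byte narrow(int value) {
        return (byte) value;
    }

    // Explicit cast from double to int drops the fraction
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(addBytes((byte) 100, (byte) 100)); // 200, no byte overflow
        System.out.println(narrow(300));                      // 44
        System.out.println(truncate(3.99));                   // 3
    }
}
```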

Relational Expressions

A relational operator compares the values of its two operands
– ==, !=, >, <, >=, <=
– Overloading relational operators for different data types is very common

A relational expression gives a Boolean result

Boolean Expressions

Boolean expressions consist of Boolean variables/constants, relational expressions and Boolean operators (AND/OR/NOT)

In C, AND has a higher precedence than OR and is left associative

In Ada, the two operators have the same precedence

Precedence of Operators in C-based language

(highest precedence)
Postfix ++, --
Unary +, -, prefix ++, --, !
*, /, %
Binary +, -
<, >, <=, >=
==, !=
&&
||
(lowest precedence)

Short-Circuit Evaluation

In short-circuit evaluation, the result of an expression is determined without evaluating all of the operands of the && / || operators
– If the first expression in && is false, the second expression need not be evaluated
– Similarly, if the first expression in || is true, the second expression need not be evaluated

Advantages
– More efficient, as sometimes there is no need to evaluate the whole expression
– We can use the first expression to avoid evaluating the second expression when it is undefined. Example: (x>0) && (log(x)>0)
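The log example can be tried in Java with a counter that records whether the second operand was evaluated. The names ShortCircuitDemo, calls, logPositive and check are assumptions for this sketch.

```java
public class ShortCircuitDemo {
    static int calls = 0; // how many times the second operand was evaluated

    static boolean logPositive(double x) {
        calls++;
        return Math.log(x) > 0;
    }

    // Mirrors (x>0) && (log(x)>0): logPositive runs only when x > 0 is true
    static boolean check(double x) {
        return (x > 0) && logPositive(x);
    }

    public static void main(String[] args) {
        System.out.println(check(-1.0) + " after " + calls + " calls");
        System.out.println(check(10.0) + " after " + calls + " calls");
    }
}
```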

Assignment Statements

Simple assignment
– e.g., a = 10;

Conditional Targets
– e.g., flag ? count1 : count2 = 0;

Multiple Targets
– e.g., a, b, c := d;
– Increases writability and efficiency because the value needs to be loaded into a register only once

Compound Assignment Operators increase writability
– e.g., sum += value;

Unary Assignment Operators
– e.g., - count ++;
– Be aware of the precedence: this is evaluated as – (count ++)

Assignment as Expression

For example: while ( (ch=getchar()) != EOF) {…}
– ch stores the character returned by the function
– The assignment expression itself also returns a value: the value assigned to the left side

This increases writability and efficiency because the assigned value is already in a register when the enclosing expression uses it

However, it decreases readability

Also, the expression may return an incompatible/invalid value, which makes mistakes easy
– Example: we may mistakenly write if (x=y) instead of if (x==y)
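The same idiom exists in Java, where Reader.read() returns -1 at the end of the input. The sketch below uses assumed names (AssignExprDemo, countChars) and reads from a StringReader so that it needs no external input. Java also rejects if (x = y) for int operands, because the condition must be a boolean.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class AssignExprDemo {
    // Counts characters using an assignment inside the loop condition.
    static int countChars(String text) {
        StringReader in = new StringReader(text);
        int ch;
        int count = 0;
        try {
            // (ch = in.read()) stores the character and also yields it for the comparison
            while ((ch = in.read()) != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChars("MT311")); // 5
    }
}
```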
