MT311 (Oct 2007) Java Application Development
Variables – binding, type, and scope
Expression
Tutorial 6
Tutor Information
Edmund Chiu (Group 2)
Email: [email protected] (please begin your email subject with [MT311])
Webpage: http://www.geocities.com/gianted
Part I
Variable Attributes
Variables
There are six attributes of variables:
– Name
– Address
– Value
– Type
– Lifetime
– Scope
Variable Name
Issues affecting the choice of names:
– Maximum length of a name: a short limit forces short names, which reduces readability. Example: access_level is better than acclvl
– Connector characters: some languages allow _ in names to improve readability. Mixed case can serve the same purpose if the language is case-sensitive. Example: access_level and accessLevel are better than accesslevel
– Case sensitivity: if Edge and edge are different variables, they are easily confused
– Reserved words / keywords: too many reserved words make programs hard to write
Advantages and Disadvantages of No Reserved Words
Disadvantage: if very common keywords (e.g. IF) are not reserved, a program that uses such a keyword as a variable name is very difficult to understand.
Advantages: if there are no reserved words,
– the programmer never has the problem of choosing an identifier that turns out to be reserved
– extending the language cannot break old programs. With reserved words, a newly reserved word may already be used as an identifier in old programs, so existing code may no longer be valid.
Variable Name Limit in Different Languages
Address
The address is the location of the memory space for a variable, sometimes called its l-value
The time when this address is fixed is called the address binding time, which may be
– load time, when the program is loaded into memory
– run time, when a specific statement is executed
Type
Type specifies the range of possible values of a variable
Primitive data types are defined when the language is designed.
User defined types are defined by users when a program is written.
Value
Value of a variable is the contents of the memory address associated with the variable (r-value)
Some languages do not check whether the value stored in a variable is within the valid range of its type, sacrificing reliability for efficiency
Lifetime and Scope
The lifetime of a variable begins when memory space is allocated to the variable
The lifetime ends when the space becomes unavailable
The scope of a variable is the range of statements in which the variable can be referenced
Example of Lifetime and Scope
The instance variable abc of class A starts its lifetime when an object of class A is created and ends when the object is destroyed
The scope of the instance variable abc is all statements of class A. In methodB, where a local variable has the same name, the instance variable is referenced as this.abc
The local variable abc in methodB starts its lifetime when the method is invoked and ends when the method returns
The scope of the local abc is the statements in methodB only
class A {
    int abc; // instance variable
    ...
    public void methodA(int a) {
        abc = a; // refers to the instance variable
    }
    public void methodB(int a) {
        int abc = a; // local variable, hides the instance variable
        ...
    }
    ...
}
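The hiding of an instance variable by a local variable of the same name can be tried out with a small runnable sketch. The class name ScopeDemo and the return value of methodB are assumptions added for illustration; they are not part of the slide's class A.

```java
// Hypothetical class mirroring the slide's class A (names are illustrative).
class ScopeDemo {
    int abc = 1;                 // instance variable: lives as long as the object

    void methodA(int a) {
        abc = a;                 // no local abc here, so this is the instance variable
    }

    int methodB(int a) {
        int abc = a;             // local variable: hides the instance variable
        return abc + this.abc;   // this.abc still reaches the hidden instance variable
    }

    public static void main(String[] args) {
        ScopeDemo obj = new ScopeDemo();
        obj.methodA(10);
        System.out.println(obj.methodB(5)); // local 5 + instance 10, prints 15
        System.out.println(obj.abc);        // methodB did not change it, prints 10
    }
}
```

The local abc disappears when methodB returns, while the instance variable keeps its value for as long as the object exists.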
Part II
Type Binding and Type Checking
Binding
Binding refers to the association between:
– an attribute and an entity
  association of a type to a variable: type binding
  association of an address to a variable: address/storage binding
– an operation and a symbol (association of a meaning to a symbol)
Binding Time
Binding can happen at any of the following times:
– Language design time
  e.g., the meaning of an operation is defined when the language is designed
– Language implementation time
  e.g., different compilers may implement a type differently, such as with a different range
– Compile time
  the type of a variable is usually bound when the program is compiled
– Link time
  in C, the relative address of a function is bound when the program is linked
– Load time
  the absolute address of a function is bound when the program is loaded
Binding Time (cont'd)
– Run time
  local variables in a function are usually bound to storage when the function is invoked
In general, there are two types of binding:
– Static binding
  the binding occurs before run time and remains unchanged throughout program execution
– Dynamic binding
  the binding occurs during run time and may change while the program executes
Static Type Bindings
Many programming languages, such as C, Pascal, Fortran and Cobol, mainly use static type binding
– The data type of a variable is specified by either an explicit or an implicit declaration
– The compiler binds the data type to the variable when it reads the declaration
– The data type does not change during program execution
An advantage of static type binding is that the type of a variable is known at compile time, so the compiler can detect errors caused by incompatible types
Dynamic Type Binding
Some other languages, such as APL and SNOBOL4, use dynamic type binding
– the data type of a variable is determined at run time by the interpreter
– the data type of a variable can change during the execution of the program
Advantage
– Generic subroutines can be written: the same subroutine can support different data types
Disadvantages
– the compile-time type checking of static type binding cannot be used
– types must be tracked and checked at run time, which makes the running cost very high
Storage Bindings and Variable Lifetime
The lifetime of a variable is the time during which the variable is bound to a specific memory location
– It begins when the memory location is allocated to the variable and ends when the memory location is deallocated
Four categories of variables can be distinguished according to their lifetime
– Static variables
– Stack-dynamic variables
– Explicit heap-dynamic variables
– Implicit heap-dynamic variables
Static Variables
They are bound to specific memory locations when the program is loaded into memory, before program execution begins
They remain bound to the same memory locations until program execution ends
In Java, all static attributes of a class are static variables
Static variables are efficient and accessible throughout the whole program
However, they reduce flexibility and cannot be used as the local variables of recursive subroutines. They also occupy their storage for the whole run, so it cannot be shared with other uses
Stack-Dynamic Variables
They are bound to storage at run time when their declaration statements are elaborated (reached)
The memory for stack-dynamic variables is allocated from the run-time stack
The type is statically bound to the variable
In Java, local variables of primitive types are stack-dynamic
Stack-dynamic variables support recursive subroutines and allow deallocated memory space to be reused
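Why stack-dynamic variables support recursion can be seen in a minimal Java sketch: every invocation gets its own copy of the local variables. The class and variable names below are assumptions chosen for illustration.

```java
// Illustrative sketch: each call of factorial has its own stack-dynamic locals.
class StackDynamicDemo {
    static int factorial(int n) {
        int local = n;                 // bound to new storage on every invocation
        if (n <= 1) {
            return 1;
        }
        int rest = factorial(n - 1);   // deeper calls cannot overwrite this call's 'local'
        return local * rest;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // prints 120
    }
}
```

If local were a static variable, all invocations would share one memory location and the deeper calls would overwrite the value needed after they return.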
Explicit Heap-Dynamic Variables
The storage is allocated and deallocated by explicit instructions in the program
– C uses malloc/calloc and free; C++ uses new and delete
– Java uses the "new" keyword (there is no explicit deallocation; it is done by garbage collection)
– The memory is usually allocated from the heap
– The variables are usually accessed through pointers or references
In Java, objects are explicit heap-dynamic variables
Explicit heap-dynamic variables are usually used for dynamic structures, but pointers and references are usually more difficult to use correctly. The cost of allocation and deallocation is also a consideration
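A short Java sketch of explicit heap allocation follows. The Counter class and the method sharedValue are hypothetical names introduced only for this illustration.

```java
// Illustrative sketch: objects are created explicitly with new and reached through references.
class HeapDemo {
    static class Counter {
        int value;
    }

    static int sharedValue() {
        Counter c1 = new Counter();  // storage allocated from the heap at run time
        Counter c2 = c1;             // copies the reference, not the object
        c2.value = 42;               // changes the single object both references point to
        c1 = null;                   // the object stays alive: c2 still refers to it
        return c2.value;
    }

    public static void main(String[] args) {
        System.out.println(sharedValue()); // prints 42
    }
}
```

There is no deallocation statement: once no reference to the Counter object remains, the garbage collector may reclaim it.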
Implicit Heap-Dynamic Variables
The storage, type and values are all bound at runtime only when the variables are assigned values.
Implicit heap-dynamic variables give the highest degree of flexibility
The runtime overhead of maintaining all dynamic attributes is very high.
The error detection (e.g. type checking) by compiler is difficult if not impossible
Type Checking
Type checking is the activity of ensuring that the operands of an operator, or the arguments of a function, are of compatible types
– A type error occurs when a value of an inappropriate type is used as an operand or argument and no type checking has caught it.
If a language has only static type binding, nearly all type checking can be done by the compiler (static type checking)
Dynamic type checking is much more expensive and complicated
– Example: a C++ union can hold values of different types at different times, so checking it would require dynamic type checking
Strong Typing
A language is strongly typed if all type errors are always detected, either at compile time or at run time
– C, C++, FORTRAN and COBOL are NOT strongly typed
– Java, Pascal and Ada are nearly strongly typed
Some languages convert one or all operands to a different type to make them compatible in an operation. This conversion is called a coercion
– Too few coercions may make the language less flexible
– Too many coercions make it hard for the compiler to detect programming errors and weaken the purpose of having a strongly typed language
Type Compatibility
Different languages have their own rules to decide which data types are mutually compatible
Two fundamental rules for type compatibility:
– Name type compatibility
  Two variables are of the same type only if they are declared with the same type name.
– Structure type compatibility
  Two variables are of the same type if their types have the same structure.
Example of Type Compatibility
Under name type compatibility, only A and B are of the same type
Under structure type compatibility, all three variables are of the same type
type
  arraytype1 = array [1..10] of integer;
  arraytype2 = array [1..10] of integer;
var
  A, B: arraytype1;
  C: arraytype2;
Comparison of Type Compatibility
Type Compatibility (cont'd)
Few programming languages use strict name or structure type compatibility.
– Pascal uses a slight variation of name type compatibility called declaration equivalence: a programmer may declare a type to be equivalent to another type
– C uses a variation of structure type compatibility – name type compatibility is used for union and struct types
– Ada uses a variation of name type compatibility but provides subtypes (compatible with parent type) and derived types (not compatible with parent type).
Scope
The scope of a variable is the range of statements in which the variable can be referenced
– In static scoping, the compiler determines the scope of each variable by inspecting the program text
– In dynamic scoping, the scope of a variable can only be determined at run time. The calling sequence of the subprograms affects the scope of a variable.
Part III
Variable Scope and Lifetime
Static Scope
In static-scoped languages with nested subprograms, a reference to a variable is resolved in this way:
– The local variables are searched first
– If the variable is not found, we search the program unit that defines the subprogram (its static parent)
– If the variable is still not found, we continue with the unit that defines that unit (a static ancestor), and so on until the outermost program has been searched
– Example (see next page): sub1 is the static parent of sub2, and Big is a static ancestor of sub2
– y is not declared in sub2, so the y of sub1 is used in sub2
In C and Java,
– we only need to search the local variables and the global (C) or instance (Java) variables
– variables declared in a block are local to that block.
Scope Example
procedure Big;
  var y : integer;
  procedure sub1;
    var x, y : integer;
    procedure sub2;
      var x : integer;
    begin
      x := 1;
      y := 2;
    end;
  begin
    ...
  end;
begin
  ...
end;
Dynamic Scope
With dynamic scoping, references to variables are resolved according to the calling sequence of the subprograms
– Assume that Big calls sub2: y in sub2 then refers to the y of Big instead of that of sub1, because it is Big that called sub2
– If there were no y in Big, we would need to check whether the program that called Big provides a y
– Thus, with dynamic scoping, the same subprogram may refer to different variables in different calls
Disadvantages of Dynamic Scoping
Dynamic scoping is less reliable because all local variables of the calling subprogram are visible to the called subprogram. That makes information hiding impossible
The compiler cannot check type compatibility because it does not know where a non-local variable is declared
Referencing non-local variables is more expensive
The program is more difficult to read because the identity of a non-local variable is difficult to trace by just reading the program source code
Scope and Lifetime
Scope and lifetime are not always related
– A static variable declared inside a function in C and C++ is statically bound to the scope of that function, but its lifetime extends over the entire execution of the program
– The scope of a local variable does not extend into a called subprogram, but its lifetime continues while the called subprogram executes, and the variable can be accessed again after the subprogram returns
Referencing Environments
The referencing environment is the other side of the scope concept
– We take the point of view of a program statement and list all variables that are visible in that statement
– If a variable name is declared in both the caller and the called subprogram, the caller's variable is temporarily hidden
Example in Referencing Environment (Static Scope)
procedure Ex is
  A, B : Integer;
  ...
  procedure Sub1 is
    X, Y : Integer;
  begin { Sub1 }
    ... -- POINT 1
  end;
  procedure Sub2 is
    X : Integer;
    ...
    procedure Sub3 is
      X : Integer;
    begin { Sub3 }
      ... -- POINT 2
    end;
  begin { Sub2 }
    ... -- POINT 3
  end;
begin { Ex }
  ... -- POINT 4
end.
If static scoping is used, the referencing environments are
– At point 1: X and Y of Sub1, A and B of Ex
– At point 2: X of Sub3 (X of Sub2 is hidden), A and B of Ex
– At point 3: X of Sub2, A and B of Ex
– At point 4: A and B of Ex
Example in Referencing Environment (Dynamic Scope)
void sub1() {
    int a, b;
    ... // POINT 1
} // end of sub1
void sub2() {
    int b, c;
    ... // POINT 2
    sub1();
} // end of sub2
void main() {
    int c, d;
    ... // POINT 3
    sub2();
} // end of main
If dynamic scoping is used, the referencing environments are
– At point 1: a and b of sub1, c of sub2, d of main (c of main and b of sub2 are hidden)
– At point 2: b and c of sub2, d of main (c of main is hidden)
– At point 3: c and d of main
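Java itself is statically scoped, so the dynamic-scope lookup can only be simulated. The sketch below models the call chain main, sub2, sub1 as a stack of frames and resolves each name by searching the most recent activation first. The class DynamicScopeDemo and its helpers enter, leave and resolve are hypothetical names introduced for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative simulation of dynamic scoping with an explicit call stack.
class DynamicScopeDemo {
    // Each frame maps a variable name to the subprogram that declared it.
    static final Deque<Map<String, String>> callStack = new ArrayDeque<>();

    static void enter(String owner, String... names) {
        Map<String, String> frame = new HashMap<>();
        for (String name : names) {
            frame.put(name, owner);
        }
        callStack.push(frame);           // the newest activation goes on top
    }

    static void leave() {
        callStack.pop();
    }

    // Dynamic scoping: search the most recent activation first, then its callers.
    static String resolve(String name) {
        for (Map<String, String> frame : callStack) {  // iterates from the top of the stack
            if (frame.containsKey(name)) {
                return frame.get(name);
            }
        }
        return null;                     // not visible anywhere in the call chain
    }

    // Referencing environment at POINT 1: main called sub2, which called sub1.
    static String atPoint1() {
        enter("main", "c", "d");
        enter("sub2", "b", "c");
        enter("sub1", "a", "b");
        String env = "a:" + resolve("a") + " b:" + resolve("b")
                   + " c:" + resolve("c") + " d:" + resolve("d");
        leave();
        leave();
        leave();
        return env;
    }

    public static void main(String[] args) {
        System.out.println(atPoint1()); // prints a:sub1 b:sub1 c:sub2 d:main
    }
}
```

The result agrees with the list for point 1: b of sub2 and c of main are hidden by more recent declarations of the same names.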
Part IV
Data Types
Primitive Data Types
Nearly all programming languages provide a set of primitive data types
– They are the building blocks of user-defined data types
The common primitive data types:
– Numeric types include integers, floating-point numbers, and sometimes decimal types
– Boolean types: usually stored in one byte instead of one bit for efficient access
– Character types
C is unusual in that the differences among these types are very vague
– There is no boolean type, and integer and character types are interchangeable
– This is flexible (better writability), but it weakens type checking (lower reliability)
Character String Types
When designing a String type, we need to consider:
– Should a String be an array of characters or a primitive data type?
– Should strings have static or dynamic length?
  Dynamic length strings require more complex storage management: the storage for a string grows and shrinks dynamically
  Limited dynamic strings as in C do not need a run-time descriptor, because the end of the string is marked by a terminator
– What operations are allowed for a String?
Character String Types in Common Languages
C/C++ use char arrays (limited dynamic length) to store character strings
– An end-of-string character (\0) is placed at the end of the string
– Operations are provided through a standard library
Java supports strings through the String class (constant strings) and the StringBuffer class (changeable strings, like arrays of characters)
Fortran 95 treats strings as a primitive data type
– Assignment, relational, catenation and substring operations are available
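The difference between Java's constant String and changeable StringBuffer can be shown in a few lines. The class name StringDemo and the method build are assumptions made for this sketch; concat, append and setCharAt are standard library methods.

```java
// Illustrative sketch: String objects never change, StringBuffer objects do.
class StringDemo {
    static String build() {
        String s = "abc";
        String t = s.concat("def");      // returns a new String; s is left unchanged

        StringBuffer sb = new StringBuffer("abc");
        sb.append("def");                // modifies the buffer in place
        sb.setCharAt(0, 'X');            // individual characters can be replaced

        return s + "|" + t + "|" + sb;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints abc|abcdef|Xbcdef
    }
}
```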
User-Defined Ordinal Types
Enumeration type (C/Pascal)
– Increases readability and reliability (if the enumeration type limits the values that can be stored in the variable)
– All possible values (named constants) are listed in the definition
– In C, the enumeration constants are implicitly assigned integer values, or explicitly assigned other values. A variable of such a type will then accept any integer value
Subrange type (Pascal/Ada)
– A contiguous subsequence of an ordinal type
– Subrange types require the compiler to generate range-checking code for every assignment to a variable
Array Types
Issues in arrays
– Array indexes (subscripts)
  If enumerated types are used as subscripts, the program is more readable and reliable
  If the index range is checked implicitly (e.g., in Java), the program is also more reliable
– Subscript range bounds
  The lower bound in C-based languages is 0. Some other languages fix it at 1.
– Initialization at storage allocation: increases writability
– Multi-dimensional arrays: rectangular or jagged
– Arrays with no subscript bounds (unconstrained arrays)
  useful in defining methods, but not used in C (pointer access)
Array Categories
Static Array
– Subscript range is statically bound
– Storage allocation is static (done before run time)
Fixed Stack-Dynamic Array
– Subscript range is statically bound
– Storage allocation is done at declaration elaboration time
– More space efficient than a static array
Stack-Dynamic Array
– Subscript range is dynamically bound and storage allocation is dynamic (done during run time)
– However, once the range and storage are bound, they remain fixed during the lifetime of the variable
Array Categories
Fixed Heap-Dynamic Array
– Similar to fixed stack-dynamic, but the range is dynamically bound and the storage is allocated from the heap
– Like a stack-dynamic array, range and storage are fixed once bound
Heap-Dynamic Array
– Binding of both subscript ranges and storage allocation is dynamic and can change freely during the array's lifetime
Addresses of Array Elements
The address of an element in a one-dimensional array:
– address(arr[k]) = address(arr[0]) + k * element_size
In a multi-dimensional array, the storage is actually laid out in a one-dimensional way
– Different formulas are used for row major order and column major order
– For row major order
  address(arr[i,j]) = address(arr[0,0]) + i * col_in_row * element_size + j * element_size
– For column major order
  address(arr[i,j]) = address(arr[0,0]) + j * row_in_col * element_size + i * element_size
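The address formulas can be checked with a small calculation. In the sketch below the class ArrayAddress, its method names and the sample base address 1000 are assumptions for illustration; Java does not expose real memory addresses, so plain numbers stand in for them.

```java
// Illustrative sketch of the address formulas, using numbers in place of real addresses.
class ArrayAddress {
    static long oneDim(long base, int k, int elementSize) {
        return base + (long) k * elementSize;
    }

    // colsPerRow corresponds to col_in_row in the formula.
    static long rowMajor(long base, int i, int j, int colsPerRow, int elementSize) {
        return base + (long) i * colsPerRow * elementSize + (long) j * elementSize;
    }

    // rowsPerCol corresponds to row_in_col in the formula.
    static long columnMajor(long base, int i, int j, int rowsPerCol, int elementSize) {
        return base + (long) j * rowsPerCol * elementSize + (long) i * elementSize;
    }

    public static void main(String[] args) {
        // A 3 x 4 array of 4-byte elements starting at address 1000, element [1,2].
        System.out.println(rowMajor(1000, 1, 2, 4, 4));    // 1000 + 1*4*4 + 2*4 = 1024
        System.out.println(columnMajor(1000, 1, 2, 3, 4)); // 1000 + 2*3*4 + 1*4 = 1028
    }
}
```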
Associative Arrays
Associative arrays (or hashes) are used in Perl and Java
– They store associations from a key to a value
– The associated value can be easily retrieved by using the key (good for searching)
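In Java, an associative array is usually written with the standard HashMap class. The class name AssocDemo and the sample names and ages are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: a HashMap associates keys with values.
class AssocDemo {
    static Map<String, Integer> ages() {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Ann", 30);             // store the association key -> value
        ages.put("Bob", 25);
        return ages;
    }

    public static void main(String[] args) {
        Map<String, Integer> ages = ages();
        System.out.println(ages.get("Ann"));         // prints 30
        System.out.println(ages.containsKey("Zed")); // prints false
        System.out.println(ages.get("Zed"));         // prints null: no such key
    }
}
```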
Record Types
In C, we use struct to form a record type that includes fields of different types
– The use of operators in C increases writability
In Java, records can be defined as data classes, with data members as the record fields
To refer to record fields, some languages allow elliptical references: the record name can be omitted in a reference to a field
The operations allowed on record types are limited: referencing, taking the address and assignment are the typical operations
Assignment to a record is sometimes done through a collection of values, e.g., rec := (“ABC”, 100);
– Supported by Ada and C (in initialization only)
Union Types
A union is a type that allows a variable to store values of different data types at different times during program execution
– It allows heterogeneous data structures to be defined, such as tree structures of integers or floating-point numbers
– In Java, inheritance from a root class can do the same task and is more reliable
In some languages (C/C++), a union is a data type in its own right and carries no type tag. It is called a free union
Other languages require a union to be embedded in a record type together with a tag. It is called a discriminated union
Type checking is difficult: it must be dynamic
– If no type checking is employed, there is no assurance that the value is of the intended type
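How inheritance from a root class can stand in for a union in Java may be sketched as follows. The classes Node, CountNode, SumNode and UnionDemo are hypothetical names; this is only one possible way to model the idea.

```java
// Illustrative sketch: a root class with one subclass per alternative replaces a union.
abstract class Node {
}

class CountNode extends Node {
    final int count;
    CountNode(int count) { this.count = count; }
}

class SumNode extends Node {
    final double sum;
    SumNode(double sum) { this.sum = sum; }
}

class UnionDemo {
    static String describe(Node n) {
        // The run-time class of the object plays the role of the tag.
        if (n instanceof CountNode) {
            return "count=" + ((CountNode) n).count;
        }
        if (n instanceof SumNode) {
            return "sum=" + ((SumNode) n).sum;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe(new CountNode(3)));  // prints count=3
        System.out.println(describe(new SumNode(2.5)));  // prints sum=2.5
    }
}
```

A wrong cast, such as treating a SumNode as a CountNode, fails with a ClassCastException at run time instead of silently reinterpreting the stored value, which is why this approach is more reliable than a free union.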
Example of Discriminated Union Type
type Node (Tag : Boolean) is
  record
    case Tag is
      when true => Count : Integer;
      when false => Sum : Float;
    end case;
  end record;
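[Figure: descriptor of the discriminated union Node. The TAG field (BOOLEAN) selects an entry of a case table: True leads to COUNT (INTEGER), False leads to SUM (FLOAT). Each entry records Name, Type, Offset and Address.]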
Pointer Types
A pointer type is one in which the variables contain memory addresses, or null
– Null is a special value meaning that the pointer does not currently reference any memory space
Pointers provide the power of indirect addressing and a method of dynamic storage management
– The storage of the variable pointed to is bound at run time, and the memory is allocated from the heap
– Dereferencing a pointer gives indirect access to the variable that the pointer points to
– The data type of the values the pointer points to is usually fixed in the pointer's declaration
Pointer Problem – Dangling Pointers
A dangling pointer is a pointer that contains the address of a heap-dynamic variable that has been deallocated:
– Pointer p1 is set to point at a new heap-dynamic variable
– Pointer p2 is assigned p1's value
– The variable is then explicitly deallocated through p1 (and p1 is set to null), so p2 is now dangling
The deallocated location may have been reallocated to some new heap-dynamic variable with a different data type: a type error occurs if the dangling pointer is used
Even if no type error occurs, the dangling pointer has no proper relation to the new variable, yet it can change the value of the new heap-dynamic variable
If the freed location is being used by the storage management system, it may hold a pointer in a chain: changing its value can cause the storage manager to fail
Pointer Problem – Lost Heap-Dynamic Variables
A lost heap-dynamic variable is an allocated heap-dynamic variable that is no longer accessible to the user program
– Pointer p1 is set to point to a new heap-dynamic variable
– Pointer p1 is then set to point to another new heap-dynamic variable, so the original heap-dynamic variable is now inaccessible
– We also call the inaccessible variable garbage
Languages that require explicit deallocation of dynamic variables share this problem, which is also known as memory leakage
Reference Types
A reference type is a safer version of a pointer
– In C++, reference types provide two-way communication between the caller and the called function without explicit dereferencing
– In Java, references refer to class instances, and arithmetic on references is not allowed
– In Java, a reference can be reassigned to refer to a different class instance.
– No dangling references occur in Java because Java class instances are implicitly deallocated
Part V
Expressions
Arithmetic Expressions
In programming languages, arithmetic expressions consist of
– Operators (unary: single operand, binary: two operands, and ternary, e.g. ? : in C)
– Operands, parentheses, and function calls
Operator Evaluation Order (highest precedence first)
– Fortran: ** > *, / > all +, -
– C-based languages: postfix ++, -- > prefix ++, --, unary +, - > *, /, % > binary +, -
– Ada: **, abs > *, /, mod, rem > unary +, - > binary +, -
Associativity
– Fortran: all operators are left associative except **, which is right associative
– C-based languages: *, /, %, and binary +, - are left associative; ++, -- and unary +, - are right associative
– Ada: all operators are left associative except **, which is non-associative
Example of Operator Evaluation Order and Associativity
Consider the expression 3 * 4 – 5 – 6 * 2 – 7
– Using mathematical precedence and associativity
  ( ( (3*4) – 5 ) – (6*2) ) – 7 = -12
– If the precedence of – is higher than * and left associativity is used for all operators
  ( 3 * ( (4-5) – 6 ) ) * (2-7) = 105
– If the precedence of – is higher than * and right associativity is used for all operators
  3 * ( ( 4 – (5-6) ) * (2-7) ) = -75
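The three results can be verified by writing the parentheses out explicitly in Java. The class and method names are assumptions for this sketch; only the first grouping matches what Java does with the unparenthesized expression.

```java
// Illustrative sketch: the same operands grouped under three different rules.
class PrecedenceDemo {
    static int mathematical() {
        return (((3 * 4) - 5) - (6 * 2)) - 7;    // usual precedence, left associative
    }

    static int minusFirstLeft() {
        return (3 * ((4 - 5) - 6)) * (2 - 7);    // - binds tighter than *, left associative
    }

    static int minusFirstRight() {
        return 3 * ((4 - (5 - 6)) * (2 - 7));    // - binds tighter than *, right associative
    }

    public static void main(String[] args) {
        System.out.println(3 * 4 - 5 - 6 * 2 - 7); // Java's own rules, prints -12
        System.out.println(mathematical());        // prints -12
        System.out.println(minusFirstLeft());      // prints 105
        System.out.println(minusFirstRight());     // prints -75
    }
}
```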
Operand Evaluation Order
Side effects of functions and operators make the operand evaluation order important
Example: int a=9, b; b = a + a++;
– In C, no specific operand evaluation order is defined, so either 18 or 19 can be produced, depending on how the compiler is implemented
– In Java, operands are evaluated in left-to-right order, which eliminates the above problem
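Because Java fixes the operand evaluation order as left to right, the example has exactly one result there. The sketch below, with assumed class and method names, also swaps the operands to show where the value 19 comes from.

```java
// Illustrative sketch: Java evaluates operands strictly from left to right.
class OperandOrderDemo {
    static int plainFirst() {
        int a = 9;
        int b = a + a++;   // left operand read as 9, then a++ yields 9 (a becomes 10)
        return b;
    }

    static int incrementFirst() {
        int a = 9;
        int b = a++ + a;   // a++ yields 9 and makes a 10, then the right operand is 10
        return b;
    }

    public static void main(String[] args) {
        System.out.println(plainFirst());     // prints 18
        System.out.println(incrementFirst()); // prints 19
    }
}
```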
Overloaded Operators
The same operator can denote different operations in different situations
– Example: in C, * can mean both multiplication and dereferencing
– This is called operator overloading
Problems
– Poor readability, and errors are harder to find because leaving out an operand may still yield a compilable program
Another commonly overloaded operator is /
– It is used for both floating-point and integer division
Type Conversion
When the compiler finds that the operands are not of the same type, coercion (implicit type conversion) occurs according to the language specification
– In most common languages, mixed-mode arithmetic expressions are allowed without any restriction
– In Java, byte and short are coerced to int whenever an operator is applied to them
– Too much coercion makes errors difficult to detect, while too little makes the language less flexible
Explicit type conversion can also be made: type casting
– Overflow or underflow may occur as a result of casting
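Coercion and casting in Java can be observed with a few one-line methods. The class ConversionDemo and its method names are assumptions introduced for this sketch.

```java
// Illustrative sketch: implicit coercion versus explicit casting in Java.
class ConversionDemo {
    static int addBytes(byte x, byte y) {
        return x + y;          // both operands are coerced to int, so the sum is an int
    }

    static double mixed(int i, double d) {
        return i + d;          // mixed mode: the int operand is coerced to double
    }

    static byte narrow(int v) {
        return (byte) v;       // explicit cast: only the low 8 bits are kept
    }

    public static void main(String[] args) {
        byte x = 100, y = 100;
        System.out.println(addBytes(x, y));  // prints 200, too large for a byte
        System.out.println(mixed(1, 0.5));   // prints 1.5
        System.out.println(narrow(300));     // overflow: prints 44
    }
}
```

Writing byte z = x + y; without a cast is rejected by the Java compiler, because the sum has type int.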
Relational Expressions
A relational operator is an operator that compares the values of its two operands
– ==, !=, >, <, >=, <=
– Overloading relational operators for different data types is very common
A relational expression gives a boolean result
Boolean Expressions
Boolean expressions consist of Boolean variables/constants, relational expressions and Boolean operators (AND/OR/NOT)
In C, AND has a higher precedence than OR and is left associative
In Ada, the two operators have the same precedence
Precedence of Operators in C-based language
(highest precedence)
– Postfix ++, --
– Unary +, -, prefix ++, --, !
– *, /, %
– Binary +, -
– <, >, <=, >=
– ==, !=
– &&
– ||
(lowest precedence)
Short-Circuit Evaluation
Short-circuit evaluation is evaluation in which the result of an expression is determined without evaluating all of the operands of the && and || operators
– If the first expression in && is false, the second expression need not be evaluated
– Similarly, if the first expression in || is true, the second expression need not be evaluated
Advantages
– More efficient, as sometimes there is no need to evaluate the whole expression
– We can use the first expression to avoid evaluating the second expression when it would be undefined
  Example: (x>0) && (log(x)>0)
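The guard (x>0) && (log(x)>0) can be tried in Java, whose && operator is short-circuit. To make the skipped evaluation visible, the sketch wraps Math.log in a hypothetical helper countedLog that counts its calls; the class name and the counter are assumptions for illustration.

```java
// Illustrative sketch: && skips its second operand when the first one is false.
class ShortCircuitDemo {
    static int logCalls = 0;

    static double countedLog(double x) {
        logCalls++;                        // records that the second operand was evaluated
        return Math.log(x);
    }

    static boolean positiveLog(double x) {
        return (x > 0) && (countedLog(x) > 0);
    }

    public static void main(String[] args) {
        System.out.println(positiveLog(-1)); // prints false
        System.out.println(logCalls);        // prints 0: the logarithm was never computed
        System.out.println(positiveLog(10)); // prints true
        System.out.println(logCalls);        // prints 1
    }
}
```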
Assignment Statements
Simple assignment
– e.g., a = 10;
Conditional Targets
– e.g., flag ? count1 : count2 = 0;
Multiple Targets
– e.g., a, b, c := d;
– Increases writability and efficiency because the value needs to be loaded into a register only once
Compound Assignment Operators
– e.g., sum += value;
– Increase writability
Unary Assignment Operators
– e.g., - count ++;
– Be aware of the precedence: this means – (count ++)
Assignment as Expression
For example: while ( (ch=getchar()) != EOF) {…}
– ch stores the character returned by the function
– The assignment expression itself also yields a value: the value assigned to the left side
This increases writability and efficiency because the assigned value is already in a register when it is tested
However, it decreases readability
Also, the expression may yield an incompatible or invalid value, which makes mistakes easy
– Example: we may write if (x=y) instead of if (x==y)
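The same idiom exists in Java, where Reader.read() returns -1 at the end of the input. The sketch below reads from a string instead of standard input so that it is self-contained; the class AssignExprDemo and the method countChars are assumptions for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative sketch: an assignment used as an expression inside a loop condition.
class AssignExprDemo {
    static int countChars(String text) {
        Reader in = new StringReader(text);
        int ch;
        int count = 0;
        try {
            // ch receives the character, and the assigned value is compared with -1.
            while ((ch = in.read()) != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChars("hello")); // prints 5
    }
}
```

Java reduces the risk of the if (x=y) mistake: when x and y are of type int, the condition is not boolean and the compiler rejects it.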