part 4 - expressions and statement
DESCRIPTION
Programming languagesTRANSCRIPT
Louden, 2003 1
Chapter 7 - Control I: Chapter 7 - Control I: Expressions and StatementsExpressions and Statements
Programming Languages:Principles and Practice, 2nd Ed.Kenneth C. Louden
Chapter 7 K. Louden, Programming Languages 2
Expression and Statement: Basic Expression and Statement: Basic introductionintroduction "Control" is the general study of the
semantics of execution paths through code: what gets executed, when, and in what order.
Most important control issue in modern languages: procedure/function/method call and return, studied in Chapter 8.
Here we study more localized control issues in expressions and statements, and:
Exception handling, which involves non-local control but has a local component too.
IntroductionIntroduction An expression represents a single data item--
usually a number. The expression may consist of a single entity,
such as a constant or variable, or it may consist of some combination of such entities, interconnected by one or more operators.
Expressions can also represent logical conditions which are either true or false.
However, in C, the conditions true and false are represented by the integer values 1 and 0, respectively.
Chapter 7 K. Louden, Programming Languages 3
Several simple expressions are given below:a + b x = y t = u + v x <= y ++j
The first expression, which employs the addition operator (+), represents the sum of the values assigned to variables a and b.
The second expression involves the assignment operator (=), and causes the value represented by y to be assigned to x.
Chapter 7 K. Louden, Programming Languages 4
In the third expression, the value of the expression (u + v) is assigned to t.
The fourth expression takes the value 1 (true) if the value of x is less than or equal to the value of y.
Otherwise, the expression takes the value 0 (false). Here, <= is a relational operator that compares the values of x and y.
The final example causes the value of j to be increased by 1. Thus, the expression is equivalent to
j = j + 1
Chapter 7 K. Louden, Programming Languages 5
The increment (by unity) operator ++ is called a unary operator, because it only possesses one operand.
A statement causes the computer to carry out some definite action.
There are three different classes of statements in C: expression statements, compound statements, and control statements.
An expression statement consists of an expression followed by a semicolon. The execution of such a statement causes the associated expression to be evaluated. For example:
a = 6; c = a + b;
++j;Chapter 7 K. Louden, Programming Languages 6
The first two expression statements both cause the value of the expression on the right of the equal sign to be assigned to the variable on the left.
The third expression statement causes the value of j to be incremented by 1.
Again, there is no restriction on the length of an expression statement: such a statement can even be split over many lines, so long as its end is signaled by a semicolon.
Chapter 7 K. Louden, Programming Languages 7
A compound statement consists of several individual statements enclosed within a pair of braces { }.
The individual statements may themselves be expression statements, compound statements, or control statements.
Unlike expression statements, compound statements do not end with semicolons.
A typical compound statement is shown below:{
pi = 3.141593; circumference = 2. * pi * radius;
area = pi * radius * radius; }
Chapter 7 K. Louden, Programming Languages 8
This particular compound statement consists of three expression statements, but acts like a single entity in the program in which it appears.
A symbolic constant is a name that substitutes for a sequence of characters.
The characters may represent either a number or a string.
When a program is compiled, each occurrence of a symbolic constant is replaced by its corresponding character sequence.
Symbolic constants are usually defined at the beginning of a program, by writing
#define NAME textChapter 7 K. Louden, Programming Languages 9
where NAME represents a symbolic name, typically written in upper-case letters, and text represents the sequence of characters that is associated with that name.
Note that text does not end with a semicolon, since a symbolic constant definition is not a true C statement.
In fact, during compilation, the resolution of symbolic names is performed (by the C preprocessor) before the start of true compilation.
For instance, suppose that a C program contains the following symbolic constant definition:
#define PI 3.141593Chapter 7 K. Louden, Programming Languages 10
Suppose, further, that the program contains the statement
area = PI * radius * radius;
During the compilation process, the preprocessor replaces each occurrence of the symbolic constant PI by its corresponding text. Hence, the above statement becomes
area = 3.141593 * radius * radius;
Chapter 7 K. Louden, Programming Languages 11
Symbolic constants are particularly useful in scientific programs for representing constants of nature, such as the mass of an electron, the speed of light, etc.
Since these quantities are fixed, there is little point in assigning variables in which to store them.
Chapter 7 K. Louden, Programming Languages 12
Chapter 7 K. Louden, Programming Languages 13
Expressions : Expounded DefinitionExpressions : Expounded Definition In their purest form, expressions do not involve
control issues: subexpressions can be evaluated in arbitrary order, and the order does not affect the result. Functional programming tries to achieve this goal for whole programs.
Of course, there must always be a few expressions that can modify the execution/evaluation process: if-then-else expressions, short-circuit boolean operators, case/switch expressions.
If these could have arbitrary evaluation order, programs would become non-deterministic: any of a number of different outcomes might be possible. Usually this is not desirable, but see later.
Chapter 7 K. Louden, Programming Languages 14
Side EffectsSide Effects A side effect is any observable change to
memory, input or output. A program without any side effect is useless. Side effects expose evaluation order:
class Order{ static int x = 1; public static int getX() { return x++; } public static void main(String[] args) { System.out.println( x+getX() ); }}
This prints 2, but the corresponding C program will usually print 3!
Referential transparency limits side effects, so this can't happen for r. t. expressions.
Chapter 7 K. Louden, Programming Languages 15
Prefix notation Prefix notation
Ordinary prefixFunction name precedes its arguments f(a,b,c)Example:(a+b)*(c/d) becomes *(+(a,b),/(c,d))
A variant (Cambridge Polish or fully parenthesized) moved the left paren before the operand and deletes the commas Example: (a+b)*(c/d) becomes (*(+a b) (/c d)) - LISP
Polish, allows parens to be dropped Parens are unnecessary if number of args is fixed and knownExample: *+ab/cd Named because the Polish mathematician Lukasiewiez invented the notation.Difficult to readWorks for any number of operands (unlike infix notation)Easy to decode mathematically.
Chapter 7 K. Louden, Programming Languages 16
Postfix Notation (suffix or reverse Polish) notation Not used in programming languages, but frequently for execution time representationEasily evaluated using a stack - Easy code generation
Infix Notation Suitable only for binary operationsCommon use in mathematics and programming languages
Chapter 7 K. Louden, Programming Languages 17
Problems with infix Since only works for binary operations, others must use prefix
(or postfix) making translation worse ambiguity: parens, precedence Comparison of notations Infix is a natural representation,
but requires complex implicit rules and doesn't work for non-binary operatorsIn the absence of implicit rules, large number of parens are required
prefix and Cambridge Polish require large number of parens Polish requires no parens, but requires you know the arity of
each operatorHard to read
Chapter 7 K. Louden, Programming Languages 18
Functional Side EffectsFunctional Side Effects Two Possible Solutions to the Problem:
1. Write the language definition to disallow functional side effects
– No two-way parameters in functions– No non-local references in functions– Advantage: it works!– Disadvantage: Programmers want the
flexibility of two-way parameters (what about C?) and non-local references
Chapter 7 K. Louden, Programming Languages 19
Functional Side EffectsFunctional Side Effects2. Write the language definition to demand that
operand evaluation order be fixed– Disadvantage: limits some compiler
optimizations
Chapter 7 K. Louden, Programming Languages 20
Overloaded OperatorsOverloaded Operators C++ and Ada allow user-defined
overloaded operators Potential problems:
– Users can define nonsense operations– Readability may suffer, even when the
operators make sense
Chapter 7 K. Louden, Programming Languages 21
Type ConversionsType Conversions Def: A narrowing conversion is one that
converts an object to a type that cannot include all of the values of the original type e.g., float to int
Def: A widening conversion is one in which an object is converted to a type that can include at least approximations to all of the values of the original type e.g., int to float
Chapter 7 K. Louden, Programming Languages 22
Type ConversionsType Conversions Def: A mixed-mode expression is one that has
operands of different types Def: A coercion is an implicit type conversion The disadvantage of coercions:
– They decrease in the type error detection ability of the compiler
In most languages, all numeric types are coerced in expressions, using widening conversions
In Ada, there are virtually no coercions in expressions
Chapter 7 K. Louden, Programming Languages 23
Type ConversionsType Conversions Explicit Type Conversions Often called casts
Chapter 7 K. Louden, Programming Languages 24
Eager evaluation For each operation node in the expression tree, first evaluate (or
generate code to do so) each operand, then apply the operation. Sounds good - but complications: Z+(y==0?x:x/y)If we evaluate the operands of condition first, we do what the condition was set up to avoid.
Sometimes optimizations make use of the fact that association can be changed. Sometimes, reordering causes problems:
adding large and small values together - lose small ones due to number of significant digits which can be stored
can be confusing to reader - exception thrown or side effects seen to be out of order
Evaluation Order Bottom-up evaluation of syntax tree For function calls: all arguments evaluated before call. translator may rearrange the order of computation so it is more
efficient. If order of evaluation is unspecified, not portable
Chapter 7 K. Louden, Programming Languages 25
Short Circuit EvaluationShort Circuit Evaluation Suppose Java did not use short-circuit
evaluation Problem: table look-upindex = 1;while (index <= length) && (LIST[index] != value)
index++;Problem: divide by zero
Chapter 7 K. Louden, Programming Languages 26
Short Circuit EvaluationShort Circuit Evaluation C, C++, and Java: use short-circuit evaluation for
the usual Boolean operators (&& and ||), but also provide bitwise Boolean operators that are not short circuit (& and |)
Ada: programmer can specify either (short-circuit is specified with and then and or else)
Short-circuit evaluation exposes the potential problem of side effects in expressions e.g. (a > b) || (b++ / 3)
Chapter 7 K. Louden, Programming Languages 27
Delayed order evaluation used in functional languages. Don't evaluate
until actually needed.Example:
function sq(x:integer):integer; begin sq = x*x; end; sq(i++) Becomes sq(i++) = (i++)*(i++) is evaluated
twice.
Chapter 7 K. Louden, Programming Languages 28
StrictnessStrictness An evaluation order for expressions is strict if all
subexpressions of an expression are evaluated, whether or not they are needed to determine the value of the result, non-strict otherwise.
Arithmetic is almost always strict. Every language has at least a few non-strict
expressions (?:, &&, || in Java). Some languages use a form of non-strictness called
normal-order evaluation: no expression is ever evaluated until it is needed (Haskell). Also called delayed evaluation.
A form of strict evaluation called applicative-order is more common: "bottom-up" or "inside-out".
Still leaves open whether left-to-right or not.
Chapter 7 K. Louden, Programming Languages 29
Function callsFunction calls Obey evaluation rules like other
expressions. Applicative order: evaluate all arguments
(left to right?), then call the procedure. Normal order: pass in unevaluated
representations of the arguments. Only evaluate when needed.
With side effects, order makes a difference.
Representation of argument value also makes a difference (value or reference?).
Chapter 7 K. Louden, Programming Languages 30
ExamplesExamples C and Scheme: no explicit order required for
subexpressions or arguments to calls. Java always says left to right, but warns against
using that knowledge. Case/switch/cond expressions imply a top-down
order:(define (f) (cond (#t 1) (#t 2)))
Theoretically, this could return either 1 or 2 (non-determinism—the "guarded if" of text).
Java and C outlaw this: no duplicate cases.
Chapter 7 K. Louden, Programming Languages 31
Sequencing and StatementsSequencing and Statements A sequence of expressions makes no sense
without side effects. Thus, a referentially transparent program should not need sequences.
Both ML and Scheme have sequences:(e1;e2;…) [ML] and (begin e1 e2 …) [Scheme].
What about a let expression? Is there an implied sequence? (let val x = e1 in e2 end;)
Applicative order would say yes: e1 is an argument to a call: (fn x => e2) e1.
Normal order would say no: only evaluate e1 if the value of x is actually needed in e2.
Statements by definition imply sequencing, since there is no computed value.
Chapter 7 K. Louden, Programming Languages 32
StatementsStatements Can be viewed as expressions with no
value. Java, C have very few: if, while, do, for,
switch, plus "expression statements." Scheme: valueless expressions also exist:
define, set! (some versions give these values).
ML: "valueless" expressions have value (). What about val and fun? Declarations may be neither expressions
nor statements.
Chapter 7 K. Louden, Programming Languages 33
SummarySummary Every language has three major program
components: expressions, statements, and declarations.
Expressions are executed for their values (but may have side effects), and may or may not be sequenced.
Statements are executed solely for their side effects, and they must be sequenced.
Declarations define names; they can also give values to those names. They may or may not be viewed by a language as expressions or statements.