Programming Languages: The Beginning
In the beginning...
• Computers were very expensive; programmers were cheap
• Programming was by plugboards or binary numbers (on-off switches)
• Memories were maybe 4K and up
• Cycle times were in milliseconds
• No compromises for programmer's ease
Early compromises
• Assembly language
  – reduced human error
  – programs were just as efficient
• FORTRAN
  – the compiler generated very efficient code
  – easier to read, understand, debug
  – need to compile was still extra overhead
  – the idea was: compile once, run many times
The Automation Principle
• Automate mechanical, tedious, or error-prone activities.
  – Higher-level languages are an example
  – Assembly language automates writing binary code
  – FORTRAN automates writing assembly language or binary code
Textbook, p. 10
FORTRAN
• Efficiency was everything
• Card oriented, with information in fixed columns
• First language to catch on in a big way
• Because it was first, FORTRAN has many, many "mistakes"
• Algol 60 was a great leap forward
Expressions
• In FORTRAN, an expression...
  – In a WRITE statement, could be c (constant), v (variable), or v+c, or v-c
  – As a parameter, could be c or v
  – As an array subscript, could be c, v, c*v, v+c, v-c, c*v+c, or c*v-c
  – On the RHS of an assignment, could be anything
• In Algol 60, an expression is an expression is an expression!
The Regularity Principle
• Regular rules, without exceptions, are easier to learn, use, describe, and implement.
  – Arithmetic expressions in FORTRAN vs. Algol are an example
  – BNF is a great aid to imposing regularity
Textbook, p. 11
Lexical Conventions
• Reserved words, used in most modern languages (C, Pascal, Java)
  – Easier for experts, somewhat harder for novices
• Keywords, unambiguously marked
  – Hard to type and often hard to read
• Keywords in context (FORTRAN, PL/1)
  – If it makes sense as a keyword, it's a keyword, otherwise it's something else
The Reserved Word Controversy
• FORTRAN had no reserved words
• IF (I) = 1 was an array assignment
• Advantages of reserved words
  – Easier for the compiler writer
  – Helps avoid ambiguities in the language
• Disadvantages of reserved words
  – Programmer has to know them all
  – Few reserved words imply less language power?
Other FORTRAN "Mistakes"
• The FORTRAN compiler ignored blanks
  – DO 50 I=1,10 became DO50I=1,10
• Did not require variable declarations
  – Misspellings were automatically new variables
• Earliest versions did not have subprograms
  – But they did have callable library routines
• DO, IF, and GO TO were the only control structures
The Impossible Error Principle
• Making errors impossible to commit is preferable to detecting them after their commission.
  – In FORTRAN, variables were declared simply by appearing in a program
  – DO 50 I = 1. 10 (a period typed in place of the comma) is an assignment to DO50I
  – In general, the earlier an error can be detected, the better
Textbook, p. 12
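The DO50I case can be made concrete with a small sketch. The function below is a hypothetical, much-simplified classifier, not a description of how any real FORTRAN compiler worked: it removes blanks the way early FORTRAN did, then decides between a DO loop and an assignment purely by whether a comma follows the `=`.

```python
import re

def classify_fortran_statement(stmt: str) -> str:
    """Toy classifier: strip blanks, then tell a DO loop from an assignment."""
    squeezed = stmt.replace(" ", "")  # early FORTRAN ignored blanks

    # DO <label> <var> = <start> , <end>  (the comma is what makes it a loop)
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", squeezed)
    if loop:
        label, var, start, end = loop.groups()
        return f"loop to label {label}: {var} from {start} to {end}"

    # Anything of the form <name> = <value> is an assignment; because
    # variables need no declaration, DO50I is accepted as a new variable.
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", squeezed)
    if assign:
        name, value = assign.groups()
        return f"assignment: {name} = {value}"

    return "unrecognized"

print(classify_fortran_statement("DO 50 I = 1, 10"))
print(classify_fortran_statement("DO 50 I = 1. 10"))
```

With a comma the statement is read as a loop; with a period it silently becomes an assignment of 1.10 to a variable named DO50I, and no error is reported.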
Algol 60
• Introduced the idea of a virtual machine
• Used BNF to define syntax formally
• Introduced nested scopes
• Introduced free-format programs
• Introduced recursion
• Required declarations of all variables
• Introduced if-then-else and flexible loops
Algol 60 was three+ languages
• The language definition was given in the reference language
• Algorithms were published using the publication language, in which keywords were boldface
• Programs were in a hardware representation
  – Often looked like: 'IF' X < Y 'THEN' X := X + 1
  – Difficult to type, difficult to read
Algol 60 Shortcomings
• Algol 60 had no input/output!
  – I/O was considered to be too hardware-specific
  – Most implementations “borrowed” FORTRAN's I/O
• Used “call by name” semantics
  – powerful, but hard to implement
  – sometimes hard to understand
• Few but very flexible control structures were considered to be “baroque”
Metalanguages
• A metalanguage is a language used to talk about a language (usually a different one)
• We can use English as its own metalanguage (e.g. describing English grammar in English)
• It is essential to distinguish between the metalanguage terms and the object language terms
BNF
• BNF stands for either Backus-Naur Form or Backus Normal Form
• BNF is a metalanguage used to describe the grammar of a programming language
• BNF is formal and precise
• BNF is essential in compiler construction
• There are many dialects of BNF in use, but…
• …the differences are almost always minor
BNF
• < > indicate a nonterminal that needs to be further expanded, e.g. <variable>
• Symbols not enclosed in < > are terminals; they represent themselves, e.g. if, while, (
• The symbol ::= means is defined as
• The symbol | means or; it separates alternatives, e.g. <addop> ::= + | -
BNF uses recursion
• <integer> ::= <digit> | <integer> <digit>
  or
  <integer> ::= <digit> | <digit> <integer>
• Many people find recursion confusing
• "Extended BNF" allows repetition as well as recursion
• Repetition is often more efficient when using BNF to construct a compiler
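As a rough illustration of recursion versus repetition, the sketch below recognizes the same set of strings two ways. The function names are made up for this example; the first follows the right-recursive rule directly, the second follows an Extended BNF style rule with repetition.

```python
DIGITS = set("0123456789")

def is_integer_recursive(s: str) -> bool:
    """<integer> ::= <digit> | <digit> <integer>"""
    if len(s) == 0:
        return False                    # at least one digit is required
    if s[0] not in DIGITS:
        return False
    if len(s) == 1:
        return True                     # base case: a single <digit>
    return is_integer_recursive(s[1:])  # <digit> followed by a shorter <integer>

def is_integer_iterative(s: str) -> bool:
    """Extended BNF style: <integer> ::= <digit> { <digit> }"""
    if len(s) == 0:
        return False
    for ch in s:                        # the { } repetition becomes a loop
        if ch not in DIGITS:
            return False
    return True

print(is_integer_recursive("2024"), is_integer_iterative("2024"))
print(is_integer_recursive("12a"), is_integer_iterative("12a"))
```

Both functions accept and reject the same strings; the loop version simply avoids one call per digit.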
BNF Examples I
• <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• <if statement> ::= if ( <condition> ) <statement> | if ( <condition> ) <statement> else <statement>
BNF Examples II
• <unsigned integer> ::= <digit> | <unsigned integer> <digit>
• <integer> ::= <unsigned integer> | + <unsigned integer> | - <unsigned integer>
BNF Examples III
• <identifier> ::= <letter> | <identifier> <letter> | <identifier> <digit>
• <block> ::= { <statement list> }
• <statement list> ::= <statement> | <statement list> <statement>
BNF Examples IV
• <statement> ::= <block> | <assignment statement> | <break statement> | <continue statement> | <do statement> | <for loop> | <goto statement> | <if statement> | . . .
Extended BNF
• The following are pretty standard:
  – [ ] enclose an optional part of the rule
  – { } mean the enclosed can be repeated any number of times (including zero)
• The textbook uses a different notation:
  – x* means repeat x zero or more times
  – x+ means repeat x one or more times
  – { } enclose alternatives, usually listed vertically
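The optional and repetition brackets map closely onto regular-expression operators, which gives a quick way to try a rule out. The sketch below assumes an Extended BNF rewrite of the signed-integer rule, `<integer> ::= [ + | - ] <digit> { <digit> }`; that rewrite and the function name are illustrative, not taken from the slides.

```python
import re

# Assumed EBNF rule:  <integer> ::= [ + | - ] <digit> { <digit> }
#   [ ... ]  optional part          ->  ?
#   { ... }  zero or more repeats   ->  *
INTEGER = re.compile(r"[+-]?[0-9][0-9]*")

def is_signed_integer(text: str) -> bool:
    """Return True when the whole string matches the rule above."""
    return INTEGER.fullmatch(text) is not None

for sample in ["+42", "-7", "42", "+", "4-2"]:
    print(sample, is_signed_integer(sample))
```

This only works because the rule has no nesting; rules such as `<block>` that contain themselves need a real parser rather than a regular expression.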
Limitations of BNF
• No easy way to impose length limitations, such as maximum length of variable names
• No way to impose distributed requirements, such as, a variable must be declared before it is used
• Describes only syntax, not semantics
• Nothing clearly better has been devised
Functions and Procedures
• In mathematics, a function:
  – returns a value
  – has no side effects (except maybe I/O)
  – does not alter its parameters
• A procedure (a.k.a. subroutine):
  – does not return a value
  – is called precisely for its side effects
  – may (probably does) alter its parameters
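A minimal sketch of the distinction, using two made-up routines: `average` behaves like a mathematical function, while `normalize` behaves like a procedure that exists only to change its parameter.

```python
def average(values):
    """Function: returns a value and leaves its parameter alone."""
    return sum(values) / len(values)

def normalize(values):
    """Procedure: returns nothing; called for its effect on `values`."""
    total = sum(values)
    for i in range(len(values)):
        values[i] = values[i] / total   # alters the caller's list in place

data = [1.0, 3.0]
mean = average(data)    # data is still [1.0, 3.0]
normalize(data)         # data is now [0.25, 0.75]
print(mean, data)
```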
C Has No Procedures
• Functions may return void (no value)
• Functions cannot alter the caller’s arguments, because C passes parameters by value
• This is inadequate for real programs
• There are workarounds, such as passing pointers to values
• Hence, “functions” in C are seriously distorted
Predicates
• A predicate is a binary (two-valued) function
• In some languages, a predicate returns “true” or “false”
• In other languages (Snobol IV, Prolog, Icon), a predicate “succeeds” or “fails”
Methods
• A procedure or function is called directly
• In an O-O language, an object has methods
• A message is sent to an object
• The object decides what to do about the message
• Typically, the object chooses a method to execute
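The message-sending view can be sketched as follows. The classes and the `send` helper are hypothetical; `send` just makes explicit the lookup that an object-oriented language normally performs behind the call syntax.

```python
class Dog:
    def speak(self):
        return "Woof"

class Cat:
    def speak(self):
        return "Meow"

def send(receiver, message):
    """Send `message` to `receiver`; the receiver's class picks the method."""
    method = getattr(receiver, message, None)
    if method is None:
        return f"{type(receiver).__name__} does not understand '{message}'"
    return method()

print(send(Dog(), "speak"))
print(send(Cat(), "speak"))
print(send(Cat(), "fetch"))
```

The caller names only the message; which code runs depends on the object that receives it.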
Scope Rules
• The scope of a variable is the part of a program in which it is defined and accessible
• FORTRAN: scope is the enclosing subprogram
• Prolog: scope is the (one) enclosing clause
• Java: scope is from point of definition to the closing }
• Algol, Pascal: scopes are nested
Nested Scopes
begin
    int x, y;
    -- int x and int y are defined here
    begin
        float x, z;
        -- int y, float x, float z are defined here
        -- this is a “hole” in the scope of int x
    end
    -- int x, int y are defined here
end
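A rough Python analogue of the nested blocks above, under one stated assumption: Python has no block scope, so a nested function stands in for the inner begin...end. The names `outer` and `inner` are illustrative.

```python
def outer():
    x, y = 1, 2                  # int x, int y are defined here

    def inner():
        x, z = 1.5, 2.5          # this x hides the outer x (the "hole")
        return (x, y, z)         # y still refers to outer's y

    seen_inside = inner()
    return seen_inside, (x, y)   # the outer x is visible again, unchanged

inside, outside = outer()
print(inside)    # (1.5, 2, 2.5)
print(outside)   # (1, 2)
```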
Actual and Formal Parameters
• Parameters are passed to subprograms in a variety of ways
• Actual parameters are the values used in a call to the subprogram
• Formal parameters are the names used for those values in the subprogram
Parameter Transmission
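• Call by reference (FORTRAN, Pascal var parameters)
• Call by value (C, Pascal, Java; a Java object parameter is a reference passed by value, so the callee can still modify the object)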
• Call by name (Algol 60)
• Call by value-result (Ada)
• Call by unification (Prolog)
Call by Reference
• Every value (data item) is stored at some particular machine address
• The subprogram is given that address
• The subprogram directly manipulates the original data item
• This is the most efficient way to pass parameters
• In FORTRAN, could alter “constants” this way
Example of Call by Reference
procedure A
    int X
    X = 5
    call B (X)
end

procedure B (Y)
    Y = Y + 1
end

X is stored in only one place (that place is in A), and B is made to refer to that place.
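Python has no call by reference for plain variables, so the sketch below simulates it: a one-element list plays the role of the single storage location for X, and B is handed that location rather than a copy. The names `x_cell` and `y_cell` are made up for the illustration.

```python
def B(y_cell):
    y_cell[0] = y_cell[0] + 1   # Y = Y + 1, written through to X's storage

def A():
    x_cell = [5]                # X = 5, stored in exactly one place
    B(x_cell)                   # call B (X): B receives the place itself
    return x_cell[0]

print(A())   # 6: the change made inside B is visible in A
```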
Call by Value
• Subprograms are given a copy of the actual parameter
• Subprograms are free to change their copy
• The changed value is not copied back
• Safe, but not efficient or flexible
• C uses call-by-value exclusively
  – workaround: pass a pointer to the data item
Example of Call by Value
procedure A
    int X
    X = 5
    call B (X)
end

procedure B (Y)
    Y = Y + 1
end

Value is copied down when B is called, and never copied back up.
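The same pair of procedures behaves differently under call by value. In this sketch the Python integer stands in for the copied value: assigning to `y` inside B rebinds only B's local name, so nothing flows back to A. B returns its local result purely so the difference can be observed.

```python
def B(y):
    y = y + 1          # changes only B's own copy
    return y           # returned just to show what B saw

def A():
    x = 5
    seen_in_b = B(x)   # the value 5 is handed down to B
    return x, seen_in_b

print(A())   # (5, 6): B computed 6, but A's x is still 5
```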
Call by Name
• Used in Algol 60, hardly anywhere else
• Uses the copy rule: the subprogram acts as if each use of the formal parameter were replaced by a textual copy of the actual parameter
• Mathematically, this is very neat
• Doesn’t play well with scope rules
• Violates information hiding (names matter)
• Difficult to implement and usually inefficient
Example of Call by Name
int a, r;

function fiddle (int x) {
    int a = 3, b = 5;
    return a * x + b;   // means a * (a+1) + b
}

a = 9;
r = fiddle (a+1);   // should return 17
print (r);
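The interaction with scope rules can be tried out in a sketch with two hypothetical variants of fiddle. The first pastes the argument text into the body, so the `a` in `a+1` is captured by the local `a` and the result is the 17 quoted above. The second passes a thunk (a parameterless function), the technique commonly used to implement call by name; there the argument is re-evaluated in the caller's scope, so it sees a = 9.

```python
a = 9   # the caller's a

def fiddle_textual(arg_text):
    """Naive copy rule: paste the argument text into the body, then evaluate."""
    body = f"a * ({arg_text}) + b"
    # The pasted `a` is looked up among fiddle's locals, where a = 3.
    return eval(body, {}, {"a": 3, "b": 5})

def fiddle_thunk(x):
    """x is a thunk: calling x() evaluates the argument in the caller's scope."""
    a, b = 3, 5
    return a * x() + b          # x() uses the caller's a, not this local a

r_textual = fiddle_textual("a+1")       # 3 * (3 + 1) + 5
r_thunk = fiddle_thunk(lambda: a + 1)   # 3 * (9 + 1) + 5
print(r_textual, r_thunk)               # 17 35
```

The two answers differ only because of which `a` the argument refers to, which is why the names chosen inside a subprogram matter under call by name.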
Algol Supported Recursion
• Four rules for recursion:
  – Always check for base cases first
  – Recur only with a simpler case
  – Avoid global variables
  – Don’t look down
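A small sketch of the four rules applied to summing a list; the function name `total` is made up for the example.

```python
def total(values):
    """Sum a list recursively."""
    if not values:                         # rule 1: check the base case first
        return 0
    # rule 2: recur only on a simpler case (a list one element shorter)
    # rule 3: everything needed arrives as a parameter; no global variables
    # rule 4: trust the recursive call to be correct instead of tracing into it
    return values[0] + total(values[1:])

print(total([1, 2, 3, 4]))   # 10
```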