Sebesta Chapter 3.1 – 3.4: Syntax and Semantics
COS 301
Programming Languages
Chapter 3 Topics
• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Dynamic Semantics
Introduction
• Syntax: the form or structure of the expressions, statements, and program units
• Semantics: the meaning of the expressions, statements, and program units
• Syntax and semantics provide a language’s definition
A language that is simple to parse for the compiler is also simple to parse for the human programmer.
N. Wirth
Describing Syntax
• Descriptions of syntax are intended to communicate facts about a language to an audience. Who?
– Programmers want to find out what legal programs look like
– Implementers want an exact, detailed definition
– Tools such as parser and scanner generators need an exact, detailed definition in a particular, machine-readable form
– Tools often need ambiguity eliminated, while people often prefer a more readable grammar
Some Terminology
• Any language (human, computer, or otherwise) consists of a set of strings called sentences
• The syntactic rules of a language specify which strings are members of the language
– This does not of course preclude a language from having an infinite number of such strings
• Human languages are quite complex compared to computer languages
Some Terminology
• A sentence is a string of characters over some alphabet
– The alphabet can be as small as two symbols, e.g. {0, 1}
• A language is a set of sentences
• A lexeme is the lowest-level syntactic unit of a language (e.g., 1.0, *, sum, begin)
– Formal syntactic descriptions of a language are usually separated into lexical and syntactic rules
– Lexical rules specify how numeric literals, language operators, keywords, etc. are formed
• A token type is a category of lexemes (e.g., identifier)
Tokens and Lexemes
• Lexemes are partitioned into groups or types such as identifiers, operators, integer literals, etc.
• Often the term “token” is used in place of lexeme

index = 2 * count + 17;

Lexeme   Token          Value
index    identifier     "index"
=        assignment
2        int_literal    2
*        mult_op
count    identifier     "count"
+        add_op
17       int_literal    17
;        semicolon
Lexical and Syntactic Rules
• Lexical and syntactic rules are specified separately because they are specified by different types of grammars and are recognized by different types of automata
• In particular, lexical rules are equivalent to regular expressions and are specified by very restricted grammars called regular grammars
Formal Definition of Languages
• Languages can be formally defined in two different ways:
1. Recognizers
2. Generators
Formal Definition of Languages
• Recognizers
– A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language
– Example: the syntax analysis part of a compiler
• Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is syntactically correct by comparing it to the structure of the generator
– Example: a grammar
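As a toy illustration of a grammar acting as a generator, here is a minimal Python sketch. The grammar itself is hypothetical (it is not from the slides): a single nonterminal S generating the language { aⁿbⁿ : n ≥ 1 }.

```python
import random

# Hypothetical toy grammar: S -> aSb | ab, generating { a^n b^n : n >= 1 }.
RULES = {"S": ["aSb", "ab"]}

def generate(rng):
    """Repeatedly rewrite the leftmost S until only terminals remain."""
    form = "S"
    while "S" in form:
        i = form.index("S")
        form = form[:i] + rng.choice(RULES["S"]) + form[i + 1:]
    return form

rng = random.Random(0)
sentences = [generate(rng) for _ in range(5)]
print(sentences)  # each string has the shape a...ab...b
```

Every string the generator emits is, by construction, a sentence of the language; a recognizer for the same language would instead take a string and answer yes or no.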
Recognizers and Generators
• There is a close relationship between recognizers and generators of a language
• Given a context-free grammar (a generator) we can algorithmically construct a recognizer (a parser)
• Many such systems have been constructed
• The oldest (and still in wide use) is yacc
– Yet Another Compiler Compiler
The Chomsky Hierarchy
• Noam Chomsky developed the idea of formal grammars in the late 1950s
• Four levels of grammar:
1. Regular
2. Context-free
3. Context-sensitive
4. Unrestricted (recursively enumerable)
• We will use only regular and context-free grammars
• Chomsky’s work has been extended into a set of 9 levels distinguished by recognition automata
BNF and Context-Free Grammars
• Context-Free Grammars
– Grammars are language generators, originally meant to describe the syntax of natural languages
– A context-free grammar defines a class of languages called context-free languages
– CFGs are the most powerful grammars that are amenable to computation
• Backus-Naur Form (1959)
– Invented by John Backus and Peter Naur to describe Algol 60
– BNF grammars are equivalent to context-free grammars
– A similar notation was actually used over 2,000 years ago to describe the structure of Sanskrit (one of the most regular of human languages)
BNF Grammar: Formalism
• The grammar of a programming language is a set {P, T, N, S} with four members:
1. A set of productions: P
2. A set of terminal symbols: T
3. A set of nonterminal symbols: N
4. A start symbol: S ∈ N
• A production has the form A → ω, where A ∈ N and ω ∈ (N ∪ T)*
• Note that N and T are disjoint sets
BNF Fundamentals
• BNF is a metalanguage (a language used to describe another language)
• BNF uses abstractions to represent classes of syntactic structures
• A simple assignment statement might be represented by the abstraction <assign>:
<assign> -> <var> = <expression>
• A production or rule shows how a nonterminal can be expanded
• A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals
• Terminals cannot be expanded further
BNF Notation
• Nonterminals are often enclosed in angle brackets
– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_exp> then <stmt>
BNF Rules or Productions
• A production is a rule for rewriting that can be applied to a string of symbols called a sentential form
– The nonterminal symbols N identify grammatical categories such as identifier, integer, expression, program
– The start symbol S identifies the principal grammatical category (usually Program)
– The terminal symbols T are the lexemes or tokens from which programs are constructed
• An abstraction (or nonterminal symbol) can have more than one RHS
<stmt> → <single_stmt> | begin <stmt_list> end
Describing Lists
• Syntactic lists are described using recursion
<ident_list> → ident
             | ident, <ident_list>
• A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
Metasymbols
• The symbol → is used after the left nonterminal of a rule. Alternate (original) notation uses ::=
• Nonterminals may be written in angle brackets or with a distinctive font, e.g. <statement>
• Selection (OR) is designated by the | character
• Parentheses may be used for grouping ( ... )
• Note that there are several different written styles for BNF but all are fundamentally equivalent
Definition: A Language
• The language L defined by a BNF grammar G = {P,T,N,S} is the set of all terminal strings that can be derived from the start symbol in zero or more steps.
An Example Grammar
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
An Example Derivation
<program> => <stmts>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <term> + <term>
          => a = <var> + <term>
          => a = b + <term>
          => a = b + const
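The derivation above can be replayed mechanically. This sketch rewrites the leftmost occurrence of each rule's LHS (Python's `str.replace` with `count=1` does exactly that), using the rules in the order they are applied above.

```python
# Replay the leftmost derivation of "a = b + const" from the grammar above.
# Each step is a (LHS, RHS) pair; applying it rewrites the leftmost LHS.
steps = [
    ("<program>", "<stmts>"),
    ("<stmts>", "<stmt>"),
    ("<stmt>", "<var> = <expr>"),
    ("<var>", "a"),
    ("<expr>", "<term> + <term>"),
    ("<term>", "<var>"),
    ("<var>", "b"),
    ("<term>", "const"),
]

form = "<program>"
for lhs, rhs in steps:
    assert lhs in form            # the rule must apply to the current form
    form = form.replace(lhs, rhs, 1)  # count=1: leftmost occurrence only
    print("=>", form)

print(form)  # → a = b + const
```

Each intermediate value of `form` is a sentential form; the final one contains only terminals and so is a sentence.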
Derivations
• Every string of symbols in a derivation is a sentential form
• A sentence is a sentential form that has only terminal symbols
• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded
• A derivation may be neither leftmost nor rightmost
Example
• Given G below, does the string cbab belong to L(G)? In other words, is there a way to derive cbab from the start symbol?
• G = { T, V, P, S }
T = { a, b, c }
V = { A, B, C, W }
S = { W }
• P consists of the rules:
1. W → AB    or  <W> ::= <A><B>
2. A → Ca        <A> ::= <C>a
3. B → Ba        <B> ::= <B>a
4. B → Cb        <B> ::= <C>b
5. B → b         <B> ::= b
6. C → cb        <C> ::= cb
7. C → b         <C> ::= b
Leftmost derivation
• Begin with the start symbol W and apply production rules, expanding the leftmost nonterminal:
W => AB        Rule 1
AB => CaB      Rule 2
CaB => cbaB    Rule 6
cbaB => cbab   Rule 5
Rightmost derivation
• Begin with the start symbol W and apply production rules, expanding the rightmost nonterminal:
W => AB        Rule 1
AB => Ab       Rule 5
Ab => Cab      Rule 2
Cab => cbab    Rule 6
A shorter version of G
• Using selection in the RHS
G = { T, V, P, S }
T = { a, b, c }
V = { A, B, C, W }
S = { W }
1. W → AB            <W> ::= <A><B>
2. A → Ca            <A> ::= <C>a
3. B → Ba | Cb | b   <B> ::= <B>a | <C>b | b
4. C → cb | b        <C> ::= cb | b
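The membership question for cbab can also be answered by brute force: search over sentential forms, starting from W. This sketch encodes G with uppercase letters as the nonterminals; since every rule of G is non-contracting (no RHS is shorter than its LHS), any form longer than the target can be pruned safely.

```python
from collections import deque

# The grammar G from the slides; uppercase = nonterminals W, A, B, C.
RULES = {
    "W": ["AB"],
    "A": ["Ca"],
    "B": ["Ba", "Cb", "b"],
    "C": ["cb", "b"],
}

def in_language(target, start="W"):
    """Breadth-first search over sentential forms derivable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for i, sym in enumerate(form):
            for rhs in RULES.get(sym, ()):   # expand every nonterminal
                new = form[:i] + rhs + form[i + 1:]
                # prune: no rule shrinks a form, so longer forms are dead ends
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return False

print(in_language("cbab"))  # → True  (the derivation shown above)
print(in_language("abc"))   # → False (no sentence of G starts with a)
```

This is exponential in general and only a conceptual device; real recognizers (parsers) exploit the structure of the grammar instead of searching blindly.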
Parse Tree
• A tree representation of a derivation
<program>
 └ <stmts>
    └ <stmt>
       ├ <var> ⇒ a
       ├ =
       └ <expr>
          ├ <term> ⇒ <var> ⇒ b
          ├ +
          └ <term> ⇒ const
Parse trees
• In a parse tree:– Each internal node of the tree corresponds to a step
in the derivation.– Each child of a node represents a right-hand side of
a production.– Each leaf node represents a symbol of the derived
string, reading from left to right.
A Grammar for Assignment Statements
<assign> ::= <id> = <expr>
<id> ::= A | B | C
<expr> ::= <id> + <expr>
         | <id> * <expr>
         | ( <expr> )
         | <id>
Example derivation
• A = B * ( A + C )
<assign> => <id> = <expr>
         => A = <expr>
         => A = <id> * <expr>
         => A = B * <expr>
         => A = B * ( <expr> )
         => A = B * ( <id> + <expr> )
         => A = B * ( A + <expr> )
         => A = B * ( A + <id> )
         => A = B * ( A + C )
Ambiguity in Grammars
• A grammar is ambiguous when it generates a sentential form that has two or more distinct parse trees
An ambiguous grammar
• Simple assignment statements
<assign> ::= <id> = <expr>
<id> ::= A | B | C
<expr> ::= <expr> + <expr>
         | <expr> * <expr>
         | ( <expr> )
         | <id>
Ambiguity
A small difference
• Ambiguous:
<assign> ::= <id> = <expr>
<id> ::= A | B | C
<expr> ::= <expr> + <expr>
         | <expr> * <expr>
         | ( <expr> )
         | <id>
• Not ambiguous:
<assign> ::= <id> = <expr>
<id> ::= A | B | C
<expr> ::= <id> + <expr>
         | <id> * <expr>
         | ( <expr> )
         | <id>
What causes ambiguity?
• In the example above the unambiguous grammar allows the expression to grow only on the right
• Whether a grammar is ambiguous is undecidable in general, but there are some useful indicators, such as the existence of more than one leftmost or rightmost derivation for some sentence
• Parsers can use extra-grammatical information to resolve ambiguity
An Unambiguous Expression Grammar
• If we use the parse tree to indicate precedence levels of the operators, we cannot have ambiguity
<expr> → <expr> - <term> | <term>
<term> → <term> / const | const
[Figure: parse tree for const - const / const, with the - operator at the top of the tree and the / operator below it]
Precedence of Operators
• Operator a has higher precedence than operator b if operator a should be evaluated before operator b in all parenthesis-free expressions involving only the two operators
– Ex: 5 * 4 + 3 = 23,  5 + 4 * 3 = 17
• “Evaluated before” means lower in the parse tree
No Precedence (right to left evaluation)
• In this grammar any parse tree with multiple operators has the rightmost operator lowest in the tree
<assign> ::= <id> = <expr>
<id> ::= A | B | C
<expr> ::= <id> + <expr>
         | <id> * <expr>
         | ( <expr> )
         | <id>
• In A + B * C the multiplication will be evaluated first
• In A * B + C the addition will be evaluated first
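A tiny sketch makes the right-to-left behavior concrete. It assumes single-letter identifiers and a hypothetical environment mapping them to values; because `<expr>` recurses only on the right, every later operator sits lower in the tree and is applied first.

```python
def eval_right(tokens, env):
    """Evaluate per the right-growing grammar:
       <expr> ::= <id> + <expr> | <id> * <expr> | <id>"""
    head = env[tokens[0]]
    if len(tokens) == 1:
        return head
    op, rest = tokens[1], eval_right(tokens[2:], env)  # rest binds tighter
    return head + rest if op == "+" else head * rest

env = {"A": 2, "B": 3, "C": 5}           # hypothetical values
print(eval_right(list("A+B*C"), env))    # 2 + (3 * 5) → 17
print(eval_right(list("A*B+C"), env))    # 2 * (3 + 5) → 16
```

Note that A * B + C gives 16, not the 11 that conventional precedence would produce: with this grammar there is no precedence at all, only rightmost-first grouping.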
Precedence of C++ Operators - 1

Level 1 (no associativity):
  ::       Scoping operator                 Class::age = 2;

Level 2 (left to right):
  ()       Grouping operator                (a + b) / 4;
  []       Array access                     array[4] = 2;
  ->       Member access from a pointer     ptr->age = 34;
  .        Member access from an object     obj.age = 34;
  ++       Post-increment                   for( i = 0; i < 10; i++ ) ...
  --       Post-decrement                   for( i = 10; i > 0; i-- ) ...

Level 3 (right to left):
  !        Logical negation                 if( !done ) ...
  ~        Bitwise complement               flags = ~flags;
  ++       Pre-increment                    for( i = 0; i < 10; ++i ) ...
  --       Pre-decrement                    for( i = 10; i > 0; --i ) ...
  -        Unary minus                      int i = -1;
  +        Unary plus                       int i = +1;
  *        Dereference                      data = *ptr;
  &        Address of                       address = &obj;
  (type)   Cast to a given type             int i = (int) floatNum;
  sizeof   Return size in bytes             int size = sizeof(floatNum);

Level 4 (left to right):
  ->*      Member pointer selector          ptr->*var = 24;
  .*       Member object selector           obj.*var = 24;

Level 5 (left to right):
  *        Multiplication                   int i = 2 * 4;
  /        Division                         float f = 10 / 3;
  %        Modulus                          int rem = 4 % 3;
Precedence of C++ Operators - 2

Level 6 (left to right):
  +        Addition                         int i = 2 + 3;
  -        Subtraction                      int i = 5 - 1;

Level 7 (left to right):
  <<       Bitwise shift left               int flags = 33 << 1;
  >>       Bitwise shift right              int flags = 33 >> 1;

Level 8 (left to right):
  <        Comparison less-than                  if( i < 42 ) ...
  <=       Comparison less-than-or-equal-to      if( i <= 42 ) ...
  >        Comparison greater-than               if( i > 42 ) ...
  >=       Comparison greater-than-or-equal-to   if( i >= 42 ) ...

Level 9 (left to right):
  ==       Comparison equal-to              if( i == 42 ) ...
  !=       Comparison not-equal-to          if( i != 42 ) ...

Level 10 (left to right):
  &        Bitwise AND                      flags = flags & 42;

Level 11 (left to right):
  ^        Bitwise exclusive OR             flags = flags ^ 42;

Level 12 (left to right):
  |        Bitwise inclusive (normal) OR    flags = flags | 42;
Precedence of C++ Operators - 3

Level 13 (left to right):
  &&       Logical AND                          if( conditionA && conditionB ) ...

Level 14 (left to right):
  ||       Logical OR                           if( conditionA || conditionB ) ...

Level 15 (right to left):
  ? :      Ternary conditional (if-then-else)   int i = (a > b) ? a : b;

Level 16 (right to left):
  =        Assignment operator                  int a = b;
  +=       Increment and assign                 a += 3;
  -=       Decrement and assign                 b -= 4;
  *=       Multiply and assign                  a *= 5;
  /=       Divide and assign                    a /= 2;
  %=       Modulo and assign                    a %= 3;
  &=       Bitwise AND and assign               flags &= new_flags;
  ^=       Bitwise exclusive OR and assign      flags ^= new_flags;
  |=       Bitwise inclusive OR and assign      flags |= new_flags;
  <<=      Bitwise shift left and assign        flags <<= 2;
  >>=      Bitwise shift right and assign       flags >>= 2;

Level 17 (left to right):
  ,        Sequential evaluation operator       for( i = 0, j = 0; i < 10; i++, j++ ) ...
Associativity
• Associativity specifies whether operators of equal precedence should be evaluated left-to-right or right-to-left
– Ex: (left) 5 - 4 - 3 = 1 - 3 = -2
– (right) 2 ** 3 ** 2 = 2 ** 9 = 512
Associativity
• A grammar can be used to define both associativity and precedence among the operators in an expression. Consider the conventional rules:
+ and - are left-associative operators
*, %, and / are left-associative but have higher precedence than + and -
Exponentiation (**) is right-associative and has the highest precedence
• Consider this grammar G:
Expr -> Expr + Term | Expr - Term | Term
Term -> Term * Factor | Term / Factor | Term % Factor | Factor
Factor -> Primary ** Factor | Primary
Primary -> 0 | ... | 9 | ( Expr )
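The grammar G maps directly onto a recursive-descent evaluator, one function per nonterminal. This is a sketch, not a production parser: left-associative levels (Expr, Term) use a loop, while the right-associative ** level (Factor) uses recursion, mirroring the shape of the rules.

```python
def evaluate(src):
    """Evaluate a single-digit expression per grammar G above."""
    # Tokenize: digits, parentheses, and operators; '**' is one token.
    toks, i = [], 0
    while i < len(src):
        if src[i] == " ":
            i += 1
        elif src.startswith("**", i):
            toks.append("**"); i += 2
        else:
            toks.append(src[i]); i += 1
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take():
        nonlocal pos
        pos += 1
        return toks[pos - 1]

    def expr():                      # Expr -> Expr (+|-) Term | Term
        v = term()
        while peek() in ("+", "-"):  # iteration => left associativity
            v = v + term() if take() == "+" else v - term()
        return v

    def term():                      # Term -> Term (*|/|%) Factor | Factor
        v = factor()
        while peek() in ("*", "/", "%"):
            op, f = take(), factor()
            v = v * f if op == "*" else v / f if op == "/" else v % f
        return v

    def factor():                    # Factor -> Primary ** Factor | Primary
        v = primary()
        if peek() == "**":
            take()
            v = v ** factor()        # recursion on the right => right assoc
        return v

    def primary():                   # Primary -> 0 | ... | 9 | ( Expr )
        if peek() == "(":
            take()
            v = expr()
            take()                   # consume ')'
            return v
        return int(take())

    return expr()

print(evaluate("4**2**3+5*6+7"))     # 4**(2**3) + 30 + 7 → 65573
print(evaluate("5-4-3"))             # (5-4)-3 → -2
```

Because left recursion in Expr and Term is coded as iteration, this is really the EBNF form of G (discussed later); a literally left-recursive procedure would never terminate.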
[Figure: parse tree for 4**2**3+5*6+7]
Determining precedence and associativity
• Precedence is determined by the length of the shortest derivation from the start symbol to the operator
– Shorter derivations mean lower precedence
• Associativity is determined by the use of left or right recursion
– Left:
Expr → Expr + Term | Expr - Term | Term
– Right:
Factor → Primary ** Factor | Primary
Some design choices
• C++ has 17 distinct levels of precedence, Java has 16, C has 15
– In all three languages some operators associate to the left and others to the right
a = b < c ? *p + b * c : 1 << d;
– Perhaps too many?
• Pascal has only 5
a <= 0 or 100 <= a    (Error!)
– Perhaps too few?
Some design choices - 2
• Smalltalk has no precedence and all operators are left-associative
– (minor aside: Smalltalk actually does not have operators at all, but a number of binary message characters are defined in the grammar. The meaning of + is determined by the classes that implement it)
• APL has no precedence and all operators are right-associative
Where syntax meets semantics
• Parse trees are where syntax meets semantics
• We want the structure of the parse tree to correspond to the semantics of the string it generates
The Dangling Else Problem

IfStatement -> if ( Expression ) Statement
             | if ( Expression ) Statement else Statement
Statement -> Assignment | IfStatement | Block
Block -> { Statements }
Statements -> Statements Statement | Statement
Example
• With which ‘if’ does the following ‘else’ associate?
if (x < 0)
    if (y < 0) y = y - 1;
    else y = 0;
• Answer: either one!
Parse Trees
[Figure: the two possible parse trees for the example above]
Solving the dangling else problem
1. Algol 60, C, C++, Pascal: associate each else with the closest if; use { } or begin…end to override.
2. Algol 68, Modula, Ada, Visual Basic: use an explicit delimiter to end every conditional (e.g., if…fi)
3. Java: rewrite the grammar to limit what can appear in a conditional:
IfThenStatement -> if ( Expression ) Statement
IfThenElseStatement -> if ( Expression ) StatementNoShortIf else Statement
The category StatementNoShortIf includes all statements except IfThenStatement.
Audiences
• Grammars are a means of communicating information to an audience
– Programmers want to find out what legal programs look like
– Implementers want an exact, detailed definition
– Tools such as parser and scanner generators need an exact, detailed definition in a particular, machine-readable form
– Tools often need ambiguity eliminated, while people often prefer a more readable grammar
• Grammars therefore can vary with the audience
Levels of Precedence and Complexity
• C, C++, and Java have a large number of operators and precedence levels
• For each precedence level we need to introduce a new nonterminal
• The grammar can get large and difficult to read
• Instead of using a large grammar, we can:
– Write a smaller ambiguous grammar, and
– Specify precedence and associativity separately
Extended BNF
• BNF was developed in the late 1950s and is still very widely used
• However, the original BNF has a few minor inconveniences, such as recursion instead of iteration and verbose selection syntax
– Note that for some applications, such as recursive descent parsing, left recursion is forbidden
• Extended BNF (EBNF) increases readability and writability
– Expressive power is unchanged: still CFGs
• Several variations exist
Extended BNF: Optional parts
• Optional parts are placed in brackets [ ]
<proc_call> → ident ( [<expr_list>] )
• Replaces
<proc_call> → ident ( )
            | ident ( <expr_list> )
Extended BNF: Alternative RHS
• Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
<term> → <term> (+|-) const
• Replaces
<term> → <term> + const
       | <term> - const
Extended BNF: Recursion
• Repetitions (0 or more) are placed inside braces { }
<ident> → letter {letter|digit}
• Replaces
<ident> → letter
        | <ident> letter
        | <ident> digit
BNF and EBNF
• BNF
<expr> → <expr> + <term>
       | <expr> - <term>
       | <term>
<term> → <term> * <factor>
       | <term> / <factor>
       | <factor>
• EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
EBNF and Associativity
• Note that the production:
Expr -> Term { ( + | - ) Term }
does not seem to specify the left associativity that we have in
Expr -> Expr + Term | Expr - Term | Term
• In EBNF left recursion is usually assumed
– Explicit recursion is used for right-associative operators
– Some EBNF grammars may specify associativity outside of the grammar
Recent Variations in EBNF
• Alternative RHSs are put on separate lines
• Use of a colon instead of =>
• Use of opt for optional parts
• Use of oneof for choices
EBNF to BNF
• We can always rewrite an EBNF grammar as a BNF grammar. E.g.,
A -> x { y } z
• can be rewritten:
A -> x A' z
A' -> ε | y A'
• Note that ε is the standard symbol used in grammars for the “empty string”
• Rewriting EBNF rules with ( ), [ ] can be done in a similar fashion
• While EBNF is no more powerful than BNF, its rules are often simpler and clearer for human readers
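A quick sanity check of the rewrite above: for each repetition count n, the EBNF rule A -> x { y } z and its BNF rewrite (A -> x A' z, A' -> ε | y A') produce the same string.

```python
def ebnf_sentence(n):
    """EBNF reading: x, then { y } repeated n times, then z."""
    return "x" + "y" * n + "z"

def bnf_sentence(n):
    """BNF reading: unfold A' -> y A' n times, then take the ε alternative."""
    a_prime = ""
    for _ in range(n):
        a_prime += "y"         # one application of A' -> y A'
    return "x" + a_prime + "z"  # finally A' -> ε, plug into A -> x A' z

assert all(ebnf_sentence(n) == bnf_sentence(n) for n in range(6))
print(ebnf_sentence(0), ebnf_sentence(3))  # → xz xyyyz
```

The two grammars generate string-for-string the same language, which is what "no more powerful" means here.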
Syntax Diagrams
• Similar to EBNF
• Introduced by Jensen and Wirth with Pascal in 1975
[Figures: syntax diagrams for expressions with addition, a more complex example, and an expression grammar, from http://en.wikipedia.org/wiki/Syntax_diagram]
Static Semantics
• Context-free grammars (CFGs) cannot describe all of the syntax of programming languages
• “Static semantics” has only an indirect relationship with meaning
– Static semantic rules deal with the legal form of programs (syntax)
– Most rules deal with typing systems
– Called “static” because the analysis is done at compile time
• Dynamic semantics describe the meaning (runtime behavior) of a program
Static Semantics
• Typical items for static semantic analysis:
– The type of the RHS of an assignment must match the type of the LHS (lvalue)
– All variables must be declared before being referenced
• The first restriction can be expressed in BNF, but only in a very cumbersome way
• The second restriction cannot be expressed in BNF at all
– Consider the common-sense meaning of “context-free” and “context-sensitive”
Attribute Grammars
• Attribute Grammars (AGs) were developed by Donald Knuth in 1968
• AGs are additions to CFGs that carry semantic information on parse tree nodes
• CFG plus:
– Attributes
• Associated with terminals and nonterminals; similar to variables in that values can be assigned to them
– Attribute computation functions
• Aka semantic functions; associated with grammar rules, they specify how attribute values are computed
– Predicate functions
• State semantic rules; associated with grammar rules
Attribute Grammars : Definition
• Def: An attribute grammar is a context-free grammar G = {P, T, N, S} with the following additions:
– For each grammar symbol x ∈ N ∪ T there is a set A(x) of attribute values
• A(x) consists of two disjoint sets S(x) and I(x), called synthesized attributes and inherited attributes
– Each rule has a set of functions that define certain attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of predicates to check for attribute consistency
Synthesized Attributes
• Synthesized attributes are used to pass semantic information UP the parse tree
– Synthesized = computed
– For a grammar rule of the form X0 -> X1 ... Xn, the synthesized attributes of X0 are computed as a function f(A(X1), ..., A(Xn))
– The value of a synthesized attribute therefore depends only on the values of the attributes of that node's children
Inherited Attributes
• Inherited attributes are used to pass semantic information DOWN the parse tree
– Child nodes inherit from the parent
– For a grammar rule of the form X0 -> X1 ... Xn, the inherited attributes of Xj are computed as a function f(A(X0), ..., A(Xj-1))
– The value of an inherited attribute therefore depends only on the values of the attributes of the parent and (usually) the left siblings
Predicate Functions
• A predicate is a Boolean expression on the union of the attribute set {A(X0), ..., A(Xn)} and a set of literal values
• The only derivations allowed in an attribute grammar are those in which every predicate associated with a nonterminal is true.
• A false predicate indicates a rule violation
Attributed (decorated) parse trees
• The parse tree has a (possibly empty) set of attributes attached to each node
• When all attributes have been computed, the tree is fully attributed, or decorated
• Conceptually, the parse produces a parse tree, and attribute values are then computed in a second pass over it
Intrinsic Attributes
• Intrinsic attributes are synthesized attributes whose values are determined outside the parse tree
– Example: the type of a variable instance is taken from the symbol table
– The contents of the symbol table are determined by declaration statements
– Given an unattributed parse tree, initially the only attributes with values are the intrinsic attributes of the leaf nodes
– Given the intrinsic values, the semantic functions can compute the remaining attribute values
Attribute Grammars: Definition
• Let X0 → X1 ... Xn be a rule
• Functions of the form S(X0) = f(A(X1), ..., A(Xn)) define synthesized attributes
• Functions of the form I(Xj) = f(A(X0), ..., A(Xn)), for 1 <= j <= n, define inherited attributes
• Initially, there are intrinsic attributes on the leaves
Attribute Grammars: An Example
• Consider the Ada rule that the name at the end of a procedure statement must match the procedure name
• Syntax rule:
<proc_def> -> procedure <proc_name>[1] <proc_body> end <proc_name>[2]
• Predicate:
<proc_name>[1].string == <proc_name>[2].string
Attribute Grammar: Example 2
• Actual type
– A synthesized attribute associated with <expr> and <var>
– Intrinsic for <var>
– Synthesized from the children of <expr>
• Expected type
– Inherited; associated with the nonterminal <expr>
– Stores the type expected for the expression, as determined by the type of the variable on the left-hand side of the assignment
[Figure: parse tree of A = A + B]
Computing Attribute Values
• How are attribute values computed?
– If all attributes were inherited, the tree could be decorated in top-down order
– If all attributes were synthesized, the tree could be decorated in bottom-up order
– In many cases both kinds of attributes are used, and some combination of top-down and bottom-up must be used
• The general case is a complex problem that requires the construction of a dependency graph to determine evaluation order
Decorating the tree
1. <var>.actual_type <- lookup(A)   [Rule 4]
2. <expr>.exp_type <- <var>.actual_type   [Rule 1]
3. <var>[2].actual_type <- lookup(A)   [Rule 4]
4. <var>[3].actual_type <- lookup(B)   [Rule 4]
5. <expr>.actual_type <- (int | real)   [Rule 2]
6. <expr>.exp_type == <expr>.actual_type is either TRUE or FALSE   [Rule 2]
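The six decoration steps can be sketched directly in code. This is a toy: the symbol table contents are hypothetical (A declared int, B declared real), and the rule numbers in the comments refer to the slides' attribute grammar for assignment.

```python
# Hypothetical symbol table, filled in by earlier declaration statements.
symtab = {"A": "int", "B": "real"}

def lookup(name):
    """Intrinsic attribute: the declared type of a variable."""
    return symtab[name]

def decorate(lhs, op1, op2):
    """Decorate the tree for  lhs = op1 + op2  and run the type predicate."""
    lhs_actual = lookup(lhs)                       # step 1  [Rule 4]
    expr_expected = lhs_actual                     # step 2  [Rule 1], inherited
    t1, t2 = lookup(op1), lookup(op2)              # steps 3-4 [Rule 4]
    # step 5 [Rule 2]: int only if both operands are int, else real
    expr_actual = "int" if (t1, t2) == ("int", "int") else "real"
    return expr_expected == expr_actual            # step 6: the predicate

print(decorate("A", "A", "B"))  # → False: exp_type is int, actual_type is real
```

The False result is the predicate failing, i.e. the static semantic error for A = A + B under these declarations.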
[Figures: decorating the parse tree; the fully decorated tree]

Semantics
• There is no single widely acceptable notation or formalism for describing semantics
• Several needs for a methodology and notation for semantics:
– Programmers need to know what statements mean
– Compiler writers must know exactly what language constructs do
– Correctness proofs would be possible
– Compiler generators would be possible
– Designers could detect ambiguities and inconsistencies
Operational Semantics
• Operational Semantics
– Describes the meaning of a program by executing its statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement
• To use operational semantics for a high-level language, a virtual machine is needed
Operational Semantics
• Whatever happens when the program is compiled by compiler C and run on machine M
• A few problems with this:
– Architectural dependency reduces usefulness for people working with a different architecture
– Requires a precise semantic definition of machine M
– Not useful as a basis for standardization
Operational Semantics (continued)
• Uses of operational semantics:
– Language manuals and textbooks
– Teaching programming languages
• Two different levels of use:
– Natural operational semantics
– Structural operational semantics
• Evaluation:
– Good if used informally (language manuals, etc.)
– Extremely complex if used formally (e.g., VDL)
Denotational Semantics
• Based on recursive function theory
• The most abstract semantics description method
• Originally developed by Scott and Strachey (1970)
• Each type of statement in the abstract syntax is defined as a state-transforming function
• A program is a collection of functions operating on the program state
Denotational Semantics - continued
• The process of building a denotational specification for a language:
– Define a mathematical object for each language entity
– Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects
• The meaning of language constructs is defined by only the values of the program's variables
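The state-transforming idea can be sketched in a few lines. This is only an illustration of the flavor of denotational semantics, not a standard notation: each statement denotes a function from state (here, a dict of variable values) to state, and each expression denotes a function from state to value. All names are invented for the example.

```python
# Expressions denote functions: state -> value
def const(v):    return lambda state: v
def var(name):   return lambda state: state[name]
def add(e1, e2): return lambda state: e1(state) + e2(state)

# Statements denote functions: state -> state
def assign(name, expr):
    return lambda state: {**state, name: expr(state)}

def seq(s1, s2):
    # the meaning of "s1; s2" is function composition
    return lambda state: s2(s1(state))

# The meaning of the program   x = 1; y = x + 2
program = seq(assign("x", const(1)),
              assign("y", add(var("x"), const(2))))

print(program({}))  # → {'x': 1, 'y': 3}
```

Note that the program's meaning is just the final mapping of variables to values, echoing the bullet above: meaning is defined only in terms of the program's variables.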
Evaluation of Denotational Semantics
• Can be used to prove the correctness of programs
• Provides a rigorous way to think about programs
• Can be an aid to language design
• Has been used in compiler generation systems
• Because of its complexity, it is of little use to language users
Axiomatic Semantics
• Based on formal logic (predicate calculus)
• Original purpose: formal program verification
• Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions)
• The logic expressions are called assertions
Axiomatic Semantics
• Given a formal specification of a program P, it should be possible to mechanically prove that P is correct by deriving its semantics from the axioms
• Nice in theory, but very difficult and tedious in practice
• Some recent tools support axiomatic semantics:
– Java Modeling Language (JML)
– Haskell
– Spark/Ada
Evaluation of Axiomatic Semantics
• Developing axioms or inference rules for all of the statements in a language is difficult
• It is a good tool for correctness proofs and an excellent framework for reasoning about programs, but its usefulness in describing the meaning of a programming language to language users or compiler writers is limited