1 / 48 formal a language theory and describing semantics 886340 principles of programming languages...
TRANSCRIPT
1 / 48
Formal a Language Theory
and Describing Semantics
886340Principles of Programming Languages
4
2 / 48
Content
•Formal a Language Theory• Regular Expressions• Finite State Automaton (FA)
•Describing Semantics• Operation• Denotational• axiomatic
3 / 48
Review
• A grammar can be ambiguous– i.e. more than one parse tree for same string of
terminals– in a PL we want to base meaning on parse so– ambiguous parse -> ambiguous meaning ->
• Especially grammars for expressions– precedence associativity
4 / 48
Review
• Solution: encode precedence & associativity in grammar
– non-terminal for each level of precedence<term>* / <factor>
5 / 48
Review
• Context Free Grammars (CFGs) are used to specify the overall structure of a programming language:
– if/then/else, ...– brackets: ( ), { }, begin/end, ...
• Regular Grammars (RGs) are used to specify the structure of tokens:
– identifiers, numbers, keywords, ...
6 / 48
Formal a Language Theory
• Offers a way to describe computation problems formulated as language recognition problems
– Enables proofs of relative difficulty of certain computational problems
• Provides a mechanism to aid description of programming language constructs
– Regular expressions ~ PL tokens (e.g., keywords) - Finite state automata (FSAs)
– Context-free grammars ~ PL statements
7 / 48
Formal a Language Theory
• Recognizers for languages are more complex as the languages themselves become more complex
– Simple constructs correspond to FSAsKeywords, numerical constants
– More complex constructs correspond to Push-down Automata
If statements, looping statements, declarations– Even more complex constructs correspond to more complex automata
Type checking of use with declared type
8 / 48
Regular Expressions
• Formalism for describing simple PL constructs– reserved words– identifiers– numbers
• Simplest sort of structure• Recognized by a finite state automaton• Defined recursively
9 / 48
Regular Expressions
PL construct RE Notation Language
an empty { }
symbol a a {a}
null symbol {}
R,S regular exprsa,b terminals
R | S a|b (alternation)
LR LS
{a,b}
R,S regular exprsa,b terminals
RS ab (concatenation)
LRLS
{ab}
10 / 48
Regular Expressions
PL construct RE Notation LanguageR regular exprsa
R*a*
{} LR LR LR LR LR LR ...{ ,a,aa,aaa,...}
R regular exprsa
R+
a+LR LR LR LR LR LR ...
{a,aa,aaa,...}
Note: a = a = aPrecedence is {* +} ----concatenation ---- |high to low (all are left associative operators)
11 / 48
Regular Expressions
1 | 2 {1,2}1* | 2 {2, ,1,11,111,...}1 2* {1, 12, 122, 1222, ...}1 2* | 0+ {0,00,000,...,1,12,122,...}(1 | 2)* {,1,2,12,11,21,22,...}(0 | 1)+ {,1,01,001,011,111,...}
12 / 48
RE’s for PLs
• Let letter stand for a|b|c|...|z and digit stand for 0|1|2|3|4|5|6|7|8|9
– letter (letter | digit)* is identifier– digit+ is an integer constant– digit* . digit + is real number
• Which identifiers are described by– letter (letter | digit)*
? ABC 0C B% X1
Louden uses [0-9]
[a-z]([a-z] | [0-9])*
[0-9]+
[0-9]* . [0-9]+
13 / 48
Examples
• Which of the following are legal real numbers described by – digit * . digit +
? .5 1.5 2 4. 6.3 0.2
• Can see that simple PL constructs can be defined as regular expressions – Can you define a number in scientific notation as an RE? (e.g., 1.25e+20, .05e-3)
14 / 48
Finite State Automaton (FA)
• Described by<set of states, labelled transitions, start state, final
state(s)>
• Example:
15 / 48
Finite State Automaton (FA)
• FA accepts or recognizes an input string if there is a path from its start state to a final state such that the labels on the path are the terminals in that string.
• What strings are recognized?
16 / 48
Finite State Automaton (FA)
• Binary numbers containing a pair of adjacent 1’s:
• Recognizes: (0 | 1) * 1 1 (0 | 1) *
17 / 48
Finite State Automaton (FA)
• Binary numbers containing a pair of adjacent 1’s:
• Recognizes: (0 | 1) * 1 1 (0 | 1) *• FA1 and FA2 recognize the same set of strings, i.e.,
the same language! Therefore, FAs are not unique.
18 / 48
Finite State Automaton (FA)
• Exponent in scientific notation:
• Recognizes: E (+ | -) digit + | E digit +
19 / 48
Finite State Automaton (FA)
• Binary numbers which begin and end with a 1:
• Recognizes: 1(0|1)*1
20 / 48
Finite State Automaton (FA)
• Binary numbers containing at least one digit, in which all the 1’s precede all the 0’s:
• Recognizes: 0+ | 1+0*
21 / 48
• Recognition of a string– Is this given string in the language described
(recognized) by this given RE (FA)?
• Description of a language– Given an RE (FA), what language does it describe(recognize)?
• Codification of a language– Given a language, find an RE and an FA that
corresponds to it
Tasks for REs and FAs
22 / 48
Tasks for REs and FAs
• Recognition of a string– Given 10*, which of these strings is described by it:
1, 00, 10, 1000, 01– Given the following FA:
which of these strings is recognized by it: 1, 00, 10, 1000, 01
23 / 48
Tasks for REs and FAs
• Description of a language– What language is described by the following RE:
(01)+ | (10)+
– What language is recognized by the following FA:
24 / 48
Tasks for REs and FAs
• Description of a language– What language is recognized by the following FA:
25 / 48
Tasks for REs and FAs• Codification of a language
– Complex constants are parenthesized pairs of integers– Let digit = 0|1|2|3|4|5|6|7|8|9– The RE for complex constants is:
( digit + , digit + )– The FA for complex constants is:
26 / 48
Describing Semantics
27 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Attribute Grammars
•Attribute grammars (AGs) have additions to CFGs to carry some semantic info on parse tree nodes
•Primary value of AGs:•Static semantics specification•Compiler design (static semantics checking)
28 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Attribute Grammars : Definition
•Def: An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions:•For each grammar symbol x there is a set A(x) of attribute values
•Each rule has a set of functions that define certain attributes of the nonterminals in the rule
•Each rule has a (possibly empty) set of predicates to check for attribute consistency
29 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Attribute Grammars: Definition
•Let X0 X1 ... Xn be a rule•Functions of the form
S(X0) = f(A(X1), ... , A(Xn)) define synthesized attributes
•Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for i <= j <= n,
define inherited attributes
•Initially, there are intrinsic attributes on the leaves
30 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Attribute Grammars: An Example
•Syntax<assign> <var> = <expr><expr> <var> + <var> | <var><var> A | B | C
•actual_type: synthesized for <var> and <expr>
•expected_type: inherited for <expr>
31 / 48
Attribute Grammars: An Example
32 / 48
• A parse tree of the sentence A = A + B
Attribute Grammars: An Example
33 / 48
Attribute Grammars: An Example
34 / 48
• The flow of attributes in the tree• In this example, A is defined as a real and B is defined as an int.
Attribute Grammars: An Example
35 / 48
• A fully attributed parse tree
Attribute Grammars: An Example
36 / 48
Operational Semantics
• Operational Semantics• Describe the meaning of a program by executing its
statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement
• To use operational semantics for a high-level language, a virtual machine is needed
Copyright © 2012 Addison-Wesley. All rights reserved.
37 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Operational Semantics
•A hardware pure interpreter would be too expensive
•A software pure interpreter also has problems•The detailed characteristics of the particular computer would make actions difficult to understand
•Such a semantic definition would be machine- dependent
•A better alternative: A complete computer simulation
•The process:•Build a translator (translates source code to the machine code of an idealized computer)
•Build a simulator for the idealized computer
38 / 48
Denotational Semantics
• Based on recursive function theory• The process of building a denotational specification
for a language: - Define a mathematical object for each language entity
-Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects
• The meaning of language constructs are defined by only the values of the program's variables
Copyright © 2012 Addison-Wesley. All rights reserved.
39 / 48
• The meaning of binary numbers using denotational semantics, we associate the actual meaning (a decimal number)
• The semantic function, named Mbin
Mbin('0') = 0
Mbin('1') = 1
Mbin(<bin_num> '0') = 2 *Mbin(<bin_num>)
Mbin(<bin_num> '1') = 2 *Mbin(<bin_num>) + 1
Binary Numbers
40 / 48
Decimal Numbers
<dec_num> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' | <dec_num> ('0' | '1' | '2' | '3' |
'4' | '5' | '6' | '7' | '8' | '9')
Mdec('0') = 0, Mdec ('1') = 1, …, Mdec ('9') = 9
Mdec (<dec_num> '0') = 10 * Mdec (<dec_num>)
Mdec (<dec_num> '1’) = 10 * Mdec (<dec_num>) + 1
…
Mdec (<dec_num> '9') = 10 * Mdec (<dec_num>) + 9
Copyright © 2012 Addison-Wesley. All rights reserved.
41 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Axiomatic Semantics
•Based on formal logic (predicate calculus)
•Original purpose: formal program verification
•Axioms or inference rules are defined for each statement type in the language
•The logic expressions are called assertions
42 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Axiomatic Semantics (continued)
•An assertion before a statement (a precondition) states the relationships and constraints among variables that are true at that point in execution
•An assertion following a statement is a postcondition
•A weakest precondition is the least restrictive precondition that will guarantee the postcondition
43 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Axiomatic Semantics Form
•Pre-, post form: {P} statement {Q}
•examples•a = b + 1 {a > 1}•One possible precondition: {b > 10}•Weakest precondition: {b > 0}
•sum = 2 * x + 1 {sum > 1}•Weakest precondition: ???
44 / 48
• examples•a = b / 2 - 1 {a < 10}• Weakest precondition:??
•x = x + y - 3 {x > 10}• Weakest precondition: {y > 13 – x}
Axiomatic Semantics Form
45 / 48
• examples•y = 3 * x + 1;•x = y + 3;•{x < 10} • The precondition for the second assignment statement is y
< 7•The precondition for •the first assignment statement•3 * x + 1 < 7•x < 2 • So, {x < 2} is the precondition of both
Axiomatic Semantics Form
46 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Program Proof Process
•The postcondition for the entire program is the desired result•Work back through the program to the first statement. If the precondition on the first statement is the same as the program specification, the program is correct.
47 / 48Copyright © 2012 Addison-Wesley. All rights
reserved.
Summary
•BNF and context-free grammars are equivalent meta-languages•Well-suited for describing the syntax of programming languages
•An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language
•Three primary methods of semantics description•Operation, denotational, axiomatic
48 / 48
Conclusionand
Question