Formal Language Theory and Describing Semantics: 886340 Principles of Programming Languages 4

Upload: frederica-greene

Post on 17-Jan-2016



Formal Language Theory and Describing Semantics

886340 Principles of Programming Languages

4


Content

• Formal Language Theory
– Regular Expressions
– Finite State Automaton (FA)

• Describing Semantics
– Operational
– Denotational
– Axiomatic


Review

• A grammar can be ambiguous
– i.e. more than one parse tree for the same string of terminals
– in a PL we want to base meaning on the parse, so
– ambiguous parse -> ambiguous meaning

• Especially grammars for expressions
– precedence, associativity


Review

• Solution: encode precedence & associativity in the grammar
– one non-terminal for each level of precedence, e.g. <term> * <factor>, <term> / <factor>
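As a rough illustration of "one non-terminal per precedence level", the sketch below evaluates arithmetic with a small recursive-descent parser. The grammar in the comments is an assumption in the usual textbook style (only `<term>` and `<factor>` appear on the slide; `<expr>`, the function names, and the use of iteration in place of left recursion are illustrative choices).

```python
import re

def tokenize(text):
    # Integers, the four operators, and parentheses; whitespace is skipped.
    return re.findall(r"\d+|[-+*/()]", text)

def parse_expr(tokens, pos):
    # <expr> -> <term> { (+ | -) <term> }   lowest precedence, left associative
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos

def parse_term(tokens, pos):
    # <term> -> <factor> { (* | /) <factor> }   binds tighter than + and -
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs if op == "*" else value / rhs
    return value, pos

def parse_factor(tokens, pos):
    # <factor> -> number | ( <expr> )
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        value, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tokens[pos].isdigit():
        return int(tokens[pos]), pos + 1
    raise SyntaxError(f"unexpected token {tokens[pos]!r}")

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return value
```

Because `*` and `/` are handled one level below `+` and `-`, `evaluate("2 + 3 * 4")` groups the multiplication first, and the loops make `8 - 3 - 2` group from the left.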


Review

• Context Free Grammars (CFGs) are used to specify the overall structure of a programming language:
– if/then/else, ...
– brackets: ( ), { }, begin/end, ...

• Regular Grammars (RGs) are used to specify the structure of tokens:
– identifiers, numbers, keywords, ...


Formal Language Theory

• Offers a way to describe computation problems formulated as language recognition problems
– Enables proofs of the relative difficulty of certain computational problems

• Provides a mechanism to aid description of programming language constructs
– Regular expressions ~ PL tokens (e.g., keywords), recognized by finite state automata (FSAs)
– Context-free grammars ~ PL statements


Formal Language Theory

• Recognizers for languages become more complex as the languages themselves become more complex
– Simple constructs correspond to FSAs: keywords, numerical constants
– More complex constructs correspond to push-down automata: if statements, looping statements, declarations
– Even more complex constructs correspond to more complex automata: type checking of use against declared type


Regular Expressions

• Formalism for describing simple PL constructs
– reserved words
– identifiers
– numbers

• Simplest sort of structure
• Recognized by a finite state automaton
• Defined recursively


Regular Expressions

PL construct                          RE notation                        Language
an empty string                       ε                                  {ε}
symbol a                              a                                  {a}
null symbol                           ∅                                  {}
R, S regular exprs; a, b terminals    R | S, e.g. a | b (alternation)    L_R ∪ L_S, e.g. {a, b}
R, S regular exprs; a, b terminals    RS, e.g. ab (concatenation)        L_R L_S, e.g. {ab}


Regular Expressions

PL construct                  RE notation     Language
R regular expr; a terminal    R*, e.g. a*     {ε} ∪ L_R ∪ L_R L_R ∪ L_R L_R L_R ∪ ..., e.g. {ε, a, aa, aaa, ...}
R regular expr; a terminal    R+, e.g. a+     L_R ∪ L_R L_R ∪ L_R L_R L_R ∪ ..., e.g. {a, aa, aaa, ...}

Note: aε = εa = a

Precedence, from high to low, is {* +}, then concatenation, then | (all are left associative operators)


Regular Expressions

1 | 2        {1, 2}
1* | 2       {2, ε, 1, 11, 111, ...}
1 2*         {1, 12, 122, 1222, ...}
1 2* | 0+    {0, 00, 000, ..., 1, 12, 122, ...}
(1 | 2)*     {ε, 1, 2, 12, 11, 21, 22, ...}
(0 | 1)+     {0, 1, 01, 001, 011, 111, ...}
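The example languages above can be spot-checked with Python's `re` module. This is only a sketch: the patterns are the slide's REs with the spaces removed, the sample strings are a small hand-picked subset of each language, and `in_language` is a helper name introduced here.

```python
import re

# Slide RE (spaces removed) -> a few strings that should belong to its language.
EXAMPLES = {
    "1|2": ["1", "2"],
    "1*|2": ["", "1", "11", "111", "2"],
    "12*": ["1", "12", "122", "1222"],
    "12*|0+": ["0", "00", "000", "1", "12", "122"],
    "(1|2)*": ["", "1", "2", "12", "11", "21", "22"],
    "(0|1)+": ["0", "1", "01", "001", "011", "111"],
}

def in_language(pattern, s):
    # fullmatch requires the whole string to match, which is what
    # "the string is in the language of the RE" means.
    return re.fullmatch(pattern, s) is not None

for pattern, members in EXAMPLES.items():
    for s in members:
        print(repr(pattern), repr(s), in_language(pattern, s))
```

Python's `re` uses the same precedence as the slide (`*` and `+` bind tightest, then concatenation, then `|`), so `12*|0+` is read as `(1(2*))|(0+)`. Note that the empty string is in the language of `(1|2)*` but not of `(0|1)+`.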


RE’s for PLs

• Let letter stand for a|b|c|...|z and digit stand for 0|1|2|3|4|5|6|7|8|9
– letter (letter | digit)* is an identifier
– digit+ is an integer constant
– digit* . digit+ is a real number

• Which identifiers are described by
– letter (letter | digit)* ?
  ABC   0C   B%   X1

Louden uses [0-9]:
  [a-z]([a-z] | [0-9])*
  [0-9]+
  [0-9]* . [0-9]+
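The identifier question can be tried out directly. The sketch below assumes that "letter" also covers upper-case letters, since the candidate strings are upper case; with the literal `a|b|...|z` definition none of the four candidates would match. `IDENTIFIER` and `is_identifier` are names introduced for this illustration.

```python
import re

# letter (letter | digit)*, with letter widened to A-Z as well (an assumption).
IDENTIFIER = r"[A-Za-z]([A-Za-z]|[0-9])*"

def is_identifier(s):
    return re.fullmatch(IDENTIFIER, s) is not None

for candidate in ["ABC", "0C", "B%", "X1"]:
    print(candidate, is_identifier(candidate))
```

Under that assumption, `0C` fails because it starts with a digit and `B%` fails because `%` is neither a letter nor a digit.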


Examples

• Which of the following are legal real numbers described by
– digit* . digit+ ?
  .5   1.5   2   4.   6.3   0.2

• Simple PL constructs can be defined as regular expressions
– Can you define a number in scientific notation as an RE? (e.g., 1.25e+20, .05e-3)
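A quick way to explore both questions is to translate the REs into Python patterns. `REAL` is a direct transcription of `digit* . digit+`; `SCIENTIFIC` is only one possible answer to the exercise, assuming the mantissa is a real number as defined above, the exponent marker may be `e` or `E`, and the sign is optional.

```python
import re

REAL = r"[0-9]*\.[0-9]+"              # digit* . digit+
SCIENTIFIC = REAL + r"[eE][+-]?[0-9]+"  # one candidate RE, not the only one

def matches(pattern, s):
    return re.fullmatch(pattern, s) is not None

for s in [".5", "1.5", "2", "4.", "6.3", "0.2"]:
    print(s, matches(REAL, s))

for s in ["1.25e+20", ".05e-3"]:
    print(s, matches(SCIENTIFIC, s))
```

`2` is rejected because the RE requires a decimal point, and `4.` is rejected because at least one digit must follow the point.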


Finite State Automaton (FA)

• Described by <set of states, labelled transitions, start state, final state(s)>

• Example:


Finite State Automaton (FA)

• FA accepts or recognizes an input string if there is a path from its start state to a final state such that the labels on the path are the terminals in that string.

• What strings are recognized?


Finite State Automaton (FA)

• Binary numbers containing a pair of adjacent 1’s:

• Recognizes: (0 | 1) * 1 1 (0 | 1) *


Finite State Automaton (FA)

• Binary numbers containing a pair of adjacent 1’s:

• Recognizes: (0 | 1)* 1 1 (0 | 1)*

• FA1 and FA2 recognize the same set of strings, i.e., the same language! Therefore, FAs are not unique.
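A table-driven simulation makes the "path from start state to final state" idea concrete. The three-state deterministic automaton below is one possible FA for this language, written for illustration; it is not claimed to be FA1 or FA2 from the slides, and the state names are hypothetical. The last lines compare it against the RE on every binary string up to length 8.

```python
import re
from itertools import product

# (state, input symbol) -> next state
TRANSITIONS = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",  # q0: no pair yet, last symbol was not 1
    ("q1", "0"): "q0", ("q1", "1"): "q2",  # q1: no pair yet, last symbol was 1
    ("q2", "0"): "q2", ("q2", "1"): "q2",  # q2: a pair of adjacent 1's has been seen
}
START = "q0"
FINAL = {"q2"}

def accepts(s):
    state = START
    for ch in s:
        state = TRANSITIONS[(state, ch)]
    return state in FINAL

def all_binary_strings(max_len):
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

# The FA and the RE should agree on every string tried.
agree = all(
    accepts(s) == (re.fullmatch(r"(0|1)*11(0|1)*", s) is not None)
    for s in all_binary_strings(8)
)
print(agree)
```

Agreement on a finite sample is evidence rather than proof that the automaton and the RE describe the same language.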


Finite State Automaton (FA)

• Exponent in scientific notation:

• Recognizes: E (+ | -) digit + | E digit +


Finite State Automaton (FA)

• Binary numbers which begin and end with a 1:

• Recognizes: 1(0|1)*1


Finite State Automaton (FA)

• Binary numbers containing at least one digit, in which all the 1’s precede all the 0’s:

• Recognizes: 0+ | 1+0*


Tasks for REs and FAs

• Recognition of a string
– Is this given string in the language described (recognized) by this given RE (FA)?

• Description of a language
– Given an RE (FA), what language does it describe (recognize)?

• Codification of a language
– Given a language, find an RE and an FA that correspond to it


Tasks for REs and FAs

• Recognition of a string
– Given 10*, which of these strings is described by it: 1, 00, 10, 1000, 01
– Given the following FA, which of these strings is recognized by it: 1, 00, 10, 1000, 01


Tasks for REs and FAs

• Description of a language
– What language is described by the following RE: (01)+ | (10)+
– What language is recognized by the following FA:


Tasks for REs and FAs

• Description of a language
– What language is recognized by the following FA:


Tasks for REs and FAs

• Codification of a language
– Complex constants are parenthesized pairs of integers
– Let digit = 0|1|2|3|4|5|6|7|8|9
– The RE for complex constants is: ( digit+ , digit+ )
– The FA for complex constants is:
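One FA that fits this RE can be written as a transition table, sketched below. The state numbers are hypothetical, whitespace inside the constant is assumed not to be allowed, and any missing transition is treated as rejection.

```python
import re

def char_class(ch):
    # All ten digits share the same transitions, so map them to one label.
    return "digit" if ch in "0123456789" else ch

# (state, input class) -> next state; state 5 is the only final state.
TRANSITIONS = {
    (0, "("): 1,
    (1, "digit"): 2,
    (2, "digit"): 2,
    (2, ","): 3,
    (3, "digit"): 4,
    (4, "digit"): 4,
    (4, ")"): 5,
}

def accepts(s):
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no transition on this symbol: reject
            return False
    return state == 5

# The same language as a Python pattern, for comparison.
COMPLEX_RE = r"\([0-9]+,[0-9]+\)"
```

States 2 and 4 loop on digits, which is how the `digit+` parts of the RE show up in the automaton.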


Describing Semantics


Copyright © 2012 Addison-Wesley. All rights reserved.

Attribute Grammars

• Attribute grammars (AGs) have additions to CFGs to carry some semantic info on parse tree nodes

• Primary value of AGs:
– Static semantics specification
– Compiler design (static semantics checking)


Attribute Grammars : Definition

• Def: An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions:
– For each grammar symbol x there is a set A(x) of attribute values
– Each rule has a set of functions that define certain attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of predicates to check for attribute consistency


Attribute Grammars: Definition

• Let X0 -> X1 ... Xn be a rule

• Functions of the form S(X0) = f(A(X1), ... , A(Xn)) define synthesized attributes

• Functions of the form I(Xj) = f(A(X0), ... , A(Xn)), for 1 <= j <= n, define inherited attributes

• Initially, there are intrinsic attributes on the leaves


Attribute Grammars: An Example

• Syntax
  <assign> -> <var> = <expr>
  <expr> -> <var> + <var> | <var>
  <var> -> A | B | C

• actual_type: synthesized for <var> and <expr>

• expected_type: inherited for <expr>


Attribute Grammars: An Example


Attribute Grammars: An Example

• A parse tree of the sentence A = A + B


Attribute Grammars: An Example


Attribute Grammars: An Example

• The flow of attributes in the tree
• In this example, A is defined as a real and B is defined as an int.


Attribute Grammars: An Example

• A fully attributed parse tree
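The attribute flow for A = A + B can be imitated in a few lines of code. The semantic rules used below are assumptions in the usual textbook style, since the rule table itself is not part of this transcript: a sum is int only if both operands are int (otherwise real), `expected_type` of `<expr>` is copied from the left-hand `<var>`, and the predicate requires `actual_type == expected_type`. The declared type of C and all function names are hypothetical.

```python
# Intrinsic attributes: declared types. A and B follow the slide; C is made up.
DECLARED = {"A": "real", "B": "int", "C": "int"}

def var_actual_type(name):
    # Intrinsic attribute on a leaf, looked up from the declaration.
    return DECLARED[name]

def expr_actual_type(operands):
    # Synthesized attribute for <expr> -> <var> + <var> | <var>,
    # computed from the children of the node.
    types = [var_actual_type(v) for v in operands]
    if len(types) == 1:
        return types[0]
    return "int" if all(t == "int" for t in types) else "real"

def check_assign(target, operands):
    # <assign> -> <var> = <expr>
    expected = var_actual_type(target)   # inherited by <expr> from the left-hand <var>
    actual = expr_actual_type(operands)  # synthesized from the right-hand <var>s
    return {"expected_type": expected, "actual_type": actual, "ok": expected == actual}

print(check_assign("A", ["A", "B"]))  # the sentence A = A + B
print(check_assign("B", ["A", "B"]))  # B = A + B, which violates the predicate
```

Under these assumed rules, A = A + B passes the predicate because both the expected and the actual type are real, while B = A + B does not.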


Operational Semantics

• Operational Semantics
– Describe the meaning of a program by executing its statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement

• To use operational semantics for a high-level language, a virtual machine is needed


Operational Semantics

• A hardware pure interpreter would be too expensive

• A software pure interpreter also has problems
– The detailed characteristics of the particular computer would make actions difficult to understand
– Such a semantic definition would be machine-dependent

• A better alternative: a complete computer simulation

• The process:
– Build a translator (translates source code to the machine code of an idealized computer)
– Build a simulator for the idealized computer


Denotational Semantics

• Based on recursive function theory

• The process of building a denotational specification for a language:
– Define a mathematical object for each language entity
– Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects

• The meaning of language constructs is defined by only the values of the program's variables


Binary Numbers

• To describe the meaning of binary numbers using denotational semantics, we associate each one with its actual meaning (a decimal number)

• The semantic function, named Mbin:

Mbin('0') = 0

Mbin('1') = 1

Mbin(<bin_num> '0') = 2 * Mbin(<bin_num>)

Mbin(<bin_num> '1') = 2 * Mbin(<bin_num>) + 1
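The four equations for Mbin translate almost line for line into a recursive function. The sketch below takes the numeral as a Python string; the name `m_bin` and the error handling for strings that are not binary numerals are additions for this illustration.

```python
def m_bin(numeral: str) -> int:
    """Map a binary numeral (a string of '0'/'1') to the number it denotes."""
    if numeral == "" or any(ch not in "01" for ch in numeral):
        raise ValueError(f"not a binary numeral: {numeral!r}")
    if numeral == "0":   # Mbin('0') = 0
        return 0
    if numeral == "1":   # Mbin('1') = 1
        return 1
    prefix, last = numeral[:-1], numeral[-1]
    if last == "0":      # Mbin(<bin_num> '0') = 2 * Mbin(<bin_num>)
        return 2 * m_bin(prefix)
    return 2 * m_bin(prefix) + 1  # Mbin(<bin_num> '1') = 2 * Mbin(<bin_num>) + 1

print(m_bin("110"))
```

For `"110"` the recursion unfolds as 2 * Mbin('11') = 2 * (2 * Mbin('1') + 1) = 2 * 3 = 6.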


Decimal Numbers

<dec_num> -> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
           | <dec_num> ('0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9')

Mdec('0') = 0, Mdec('1') = 1, …, Mdec('9') = 9

Mdec(<dec_num> '0') = 10 * Mdec(<dec_num>)

Mdec(<dec_num> '1') = 10 * Mdec(<dec_num>) + 1

…

Mdec(<dec_num> '9') = 10 * Mdec(<dec_num>) + 9


Axiomatic Semantics

•Based on formal logic (predicate calculus)

•Original purpose: formal program verification

•Axioms or inference rules are defined for each statement type in the language

•The logic expressions are called assertions


Axiomatic Semantics (continued)

•An assertion before a statement (a precondition) states the relationships and constraints among variables that are true at that point in execution

•An assertion following a statement is a postcondition

•A weakest precondition is the least restrictive precondition that will guarantee the postcondition


Axiomatic Semantics Form

• Pre-, post form: {P} statement {Q}

• Examples
– a = b + 1 {a > 1}
  One possible precondition: {b > 10}
  Weakest precondition: {b > 0}
– sum = 2 * x + 1 {sum > 1}
  Weakest precondition: ???
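The difference between "a precondition" and "the weakest precondition" can be explored by brute force over a finite range of integers. This is an exploratory sketch rather than a proof: the helper names and the range are arbitrary, and `x > 0` is just one candidate answer to the open exercise, obtained by substituting 2 * x + 1 for sum in the postcondition.

```python
def establishes(pre, assign, post, domain):
    """True if every start value satisfying pre leads to a state satisfying post."""
    return all(post(assign(v)) for v in domain if pre(v))

def is_weakest(pre, assign, post, domain):
    """True if pre holds for exactly the start values that lead to post."""
    return all(pre(v) == post(assign(v)) for v in domain)

DOMAIN = range(-100, 101)

# a = b + 1 {a > 1}
assign_a = lambda b: b + 1
post_a = lambda a: a > 1
print(establishes(lambda b: b > 10, assign_a, post_a, DOMAIN))  # valid precondition
print(is_weakest(lambda b: b > 10, assign_a, post_a, DOMAIN))   # but not the weakest
print(is_weakest(lambda b: b > 0, assign_a, post_a, DOMAIN))

# sum = 2 * x + 1 {sum > 1}, candidate weakest precondition x > 0
print(is_weakest(lambda x: x > 0, lambda x: 2 * x + 1, lambda s: s > 1, DOMAIN))
```

`b > 10` guarantees the postcondition but excludes start values such as b = 5 that would also work, which is why it is not the weakest.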


Axiomatic Semantics Form

• Examples
– a = b / 2 - 1 {a < 10}
  Weakest precondition: ??
– x = x + y - 3 {x > 10}
  Weakest precondition: {y > 13 - x}
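For an assignment statement, the weakest precondition is obtained by substituting the right-hand side for the assigned variable in the postcondition and simplifying. Worked out for the two examples (treating / as ordinary real division, which is an assumption; the first result is a candidate answer to the open question):

```latex
\begin{align*}
a = b/2 - 1,\ \{a < 10\}: &\quad b/2 - 1 < 10 \iff b/2 < 11 \iff b < 22 \\
x = x + y - 3,\ \{x > 10\}: &\quad x + y - 3 > 10 \iff y > 13 - x
\end{align*}
```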


Axiomatic Semantics Form

• Example
  y = 3 * x + 1;
  x = y + 3;
  {x < 10}

• The precondition for the second assignment statement is y < 7, since y + 3 < 10

• The precondition for the first assignment statement follows from 3 * x + 1 < 7, which gives x < 2

• So, {x < 2} is the precondition of the two-statement sequence


Program Proof Process

• The postcondition for the entire program is the desired result

• Work back through the program to the first statement. If the precondition on the first statement is the same as the program specification, the program is correct.


Summary

• BNF and context-free grammars are equivalent meta-languages
– Well-suited for describing the syntax of programming languages

• An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language

• Three primary methods of semantics description
– Operational, denotational, axiomatic


Conclusion and Question