describing syntax and semantics chapter 3
Describing Syntax and Semantics
Chapter 3
Describing Syntax and Semantics
SYNTAX - the form of the expressions, statements and program units in a programming language
SEMANTICS - the meaning of those expressions, statements and program units
Describing Syntax
The strings of a language are called sentences or statements.
The lowest level syntactic units of a programming language are called lexemes (identifiers, constants, operators, etc.).
A token is a category of lexemes.
Describing Syntax
index = 2 * count + 17;
LEXEME    TOKEN
index     identifier
=         equal_sign
2         int_constant
*         mult_op
count     identifier
+         plus_op
17        int_constant
;         semicolon
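The lexeme/token pairing above can be reproduced with a small scanner. The following is a minimal sketch, not a production lexer: the token names come from the slide, while the regular expressions and the function name `tokenize` are assumptions made for illustration.

```python
import re

# Token names follow the slide; the patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("identifier",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("int_constant", r"\d+"),
    ("equal_sign",   r"="),
    ("mult_op",      r"\*"),
    ("plus_op",      r"\+"),
    ("semicolon",    r";"),
    ("skip",         r"\s+"),   # whitespace separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (lexeme, token) pairs; raise on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs

for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Note that `index` and `count` are different lexemes that fall into the same token category, identifier.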
Describing Syntax
Languages may be formally defined by one of two methods: recognition or generation.
Describing Syntax
Given a language, L, that uses the alphabet Σ, a recognizer will indicate whether a given string of characters from Σ is in the language L.
A compiler is an example of a recognizer.
Describing Syntax
A generator is a device which can be used to generate sentences or statements of a language.
Useful for humans as a guide for the generation of valid statements in the language.
A grammar is an example of a generator.
Describing Syntax
Noam Chomsky, linguist, described four language generation devices, called grammars, that can be used to generate four different classes of languages.
Regular grammars - used to describe tokens
Context-free grammars - used to describe whole programming languages
Describing Syntax
John Backus introduced a new formal notation for describing programming language syntax. (ALGOL 58)
The notation was modified by Peter Naur.
Backus-Naur form (BNF)
Describing Syntax
Metalanguage - a language that can be used to describe another language
BNF is a metalanguage for the description of programming languages.
Describing Syntax
BNF uses abstractions to denote various syntactic structures.
<assign> is an abstraction which denotes a valid assignment statement
Describing Syntax
BNF uses rules or productions to describe valid programs, statements and expressions within the language.
<assign> → <var> := <expression>
is a rule which describes a valid assignment statement in Pascal.
LHS: <assign>   RHS: <var> := <expression>
Describing Syntax
GRAMMAR - a set of rules or productions which describe a language
RULE or PRODUCTION - describes valid programs, statements and expressions within the language.
NON-TERMINAL - an abstraction which can be expanded by the application of some rule
TERMINAL - corresponds to lexemes or tokens of the language
Describing Syntax
BNF rule or production:
<assign> → <var> := <expression>
LHS: <assign>   RHS: <var> := <expression>
The non-terminal on the LHS may be replaced with the string of terminals and non-terminals on the RHS.
Describing Syntax
BNF rule or production:
<id> → A | B | C
Multiple definitions can be written in a single rule by separating the different definitions with the OR symbol (|).
Describing Syntax
BNF rules or productions for lists:
<id_list> → <id> | <id>, <id_list>
Recursion is used to describe rules for variable length lists. A rule is recursive if its LHS also appears in its RHS
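The recursive list rule can be read in both directions described earlier: as a generator that produces sentences and as a recognizer that accepts or rejects them. The sketch below assumes the identifiers A, B and C from the preceding slide; the function names, the `max_len` cut-off and the coin-flip choice between alternatives are illustrative assumptions.

```python
import random

IDS = ("A", "B", "C")   # <id> -> A | B | C

def generate_id_list(rng, max_len=4):
    """Generate one sentence of <id_list> by choosing between its two alternatives."""
    ident = rng.choice(IDS)
    if max_len <= 1 or rng.random() < 0.5:
        return ident                                          # <id_list> -> <id>
    return ident + ", " + generate_id_list(rng, max_len - 1)  # <id_list> -> <id>, <id_list>

def is_id_list(text):
    """Recognize <id_list> by mirroring the recursive rule."""
    head, sep, rest = text.partition(",")
    if head.strip() not in IDS:
        return False
    if not sep:
        return True            # matched the non-recursive alternative
    return is_id_list(rest)    # matched "<id>," so the rest must be an <id_list>

rng = random.Random(0)
sentence = generate_id_list(rng)
print(sentence, is_id_list(sentence))
```

Because the rule refers to itself, the same two alternatives cover lists of any length, which is the point of using recursion here.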
Describing Syntax
Derivations: Sentences in a language may be generated through a sequence of applications of the rules, beginning with a start symbol.
Parse Trees: The hierarchical syntactic structure of a sentence in the language can be described with a parse tree.
Describing Syntax
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt>; <stmt_list>
<stmt> → <var> := <expression>
<var> → A | B | C
<expression> → <var> + <var> | <var> - <var> | <var>
Example 3.1
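A derivation with this grammar can be traced mechanically. The sketch below encodes the five rules as a Python dictionary and performs a leftmost derivation, replacing the leftmost non-terminal at each step. The dictionary layout, the function name `leftmost_derivation` and the list of alternative indices are assumptions chosen for illustration.

```python
# Each non-terminal maps to its list of alternatives (each alternative is a list of symbols).
GRAMMAR = {
    "<program>":    [["begin", "<stmt_list>", "end"]],
    "<stmt_list>":  [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>":       [["<var>", ":=", "<expression>"]],
    "<var>":        [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}

def leftmost_derivation(start, choices):
    """Expand the leftmost non-terminal once per entry in choices.

    Each entry selects which alternative of that non-terminal's rule to apply.
    Returns every sentential form produced, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation("<program>", [0, 0, 0, 0, 0, 1, 2])
for step in steps:
    print("=>", step)
```

The last sentential form contains only terminals (`begin A := B + C end`), so it is a sentence of the language.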
Grammar for Simple Assignment Statements
<assign> → <id> := <expr>
<id> → A | B | C
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
Example 3.2
An Ambiguous Grammar for Simple Assignment Statements
<assign> → <id> := <expr>
<id> → A | B | C
<expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
Example 3.3
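A grammar is ambiguous when some sentence has more than one parse tree. One way to see this for the ambiguous expression rule is to count the distinct trees for a token sequence. The sketch below does that by trying every operator as the root of the tree; the function name `count_parse_trees` and the space-separated token format are assumptions made for illustration.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees that derive the token sequence from <expr> in the ambiguous grammar."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        span = tokens[lo:hi]
        total = 0
        if len(span) == 1 and span[0] in ("A", "B", "C"):
            total += 1                              # <expr> -> <id>
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":
            total += count(lo + 1, hi - 1)          # <expr> -> ( <expr> )
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                # <expr> -> <expr> + <expr> | <expr> * <expr>, with tokens[mid] as the root
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("A + B * C".split()))   # 2
```

The two trees for `A + B * C` correspond to the groupings `(A + B) * C` and `A + (B * C)`, so the grammar alone does not fix operator precedence.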
More on RECURSION IN GRAMMAR RULES
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
When a BNF rule has the non-terminal symbol of its left-hand side appearing as the rightmost symbol on its right-hand side, the rule is said to be right recursive.
RECURSION IN GRAMMAR RULES
A rule which is right recursive can be used to specify right associativity (meaning that operators of equal precedence are evaluated from right to left).
Many programming languages specify that addition, subtraction, multiplication and division follow rules of left associativity. Those that contain an exponentiation operator usually specify that right associativity be used for exponentiation.
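The effect of associativity on a result can be shown by grouping the same operands both ways. The helper names `fold_left` and `fold_right` below are assumptions for illustration; Python itself is used as the example language because its subtraction is left associative and its `**` operator is right associative.

```python
from functools import reduce
import operator

def fold_left(op, operands):
    """Group as ((a op b) op c), i.e. left associativity."""
    return reduce(op, operands)

def fold_right(op, operands):
    """Group as (a op (b op c)), i.e. right associativity."""
    return reduce(lambda acc, x: op(x, acc), reversed(operands))

# Subtraction: the left-associative grouping matches Python's own result.
print(fold_left(operator.sub, [10, 4, 3]), 10 - 4 - 3)    # 3 3
print(fold_right(operator.sub, [10, 4, 3]))               # 9

# Exponentiation: the right-associative grouping matches Python's own result.
print(fold_right(operator.pow, [2, 3, 2]), 2 ** 3 ** 2)   # 512 512
print(fold_left(operator.pow, [2, 3, 2]))                 # 64
```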
An Unambiguous Grammar with Operator Precedence
<assign> → <id> := <expr>
<id> → A | B | C
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>
Example 3.4
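To see how the layered rules for expr, term and factor force `*` to bind more tightly than `+`, the grammar can be turned into a small hand-written parser. This is a sketch under stated assumptions: a recursive-descent parser cannot call a left-recursive rule directly, so each left-recursive rule is rewritten here as a loop that builds the same left-leaning tree. The function name `parse_expr`, the nested-tuple tree format and the space-separated token input are illustrative choices, not part of the original slides.

```python
def parse_expr(tokens):
    """Parse a token list with the precedence grammar; return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        # <expr> -> <expr> + <term> | <term>, written as a loop
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():
        # <term> -> <term> * <factor> | <factor>, written as a loop
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        # <factor> -> ( <expr> ) | <id>
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("A", "B", "C"):
            advance()
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expr("A + B * C".split()))   # ('+', 'A', ('*', 'B', 'C'))
print(parse_expr("A + B + C".split()))   # ('+', ('+', 'A', 'B'), 'C')
```

The multiplication ends up lower in the tree than the addition, and repeated additions group to the left, which matches the precedence and left associativity the grammar encodes.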
Syntax Graphs
[Figure: syntax graph for expr, built from term with + and - branches]