1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix

Download 1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix

Post on 21-Dec-2015

212 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

<ul><li> Slide 1 </li> <li> 1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix Hernandez-Campos Jan 16 The University of North Carolina at Chapel Hill </li> <li> Slide 2 </li> <li> 2 COMP 144 Programming Language Concepts Felix Hernandez-Campos Phases of Compilation </li> <li> Slide 3 </li> <li> 3 COMP 144 Programming Language Concepts Felix Hernandez-Campos Syntax Analysis Syntax:Syntax: Websters definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses) The syntax of a programming languageThe syntax of a programming language Describes its form i.e. Organization of tokens (elements) Formal notation Context Free Grammars (CFGs) </li> <li> Slide 4 </li> <li> 4 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Formal definition of tokens A set of tokens is a set of strings over an alphabetA set of tokens is a set of strings over an alphabet {read, write, +, -, *, /, :=, 1, 2, , 10, , 3.45e-3, } A set of tokens is a regular set that can be defined by comprehension using a regular expressionA set of tokens is a regular set that can be defined by comprehension using a regular expression For every regular set, there is a deterministic finite automaton (DFA) that can recognize itFor every regular set, there is a deterministic finite automaton (DFA) that can recognize it i.e. determine whether a string belongs to the set or not Scanners extract tokens from source code in the same way DFAs determine membership </li> <li> Slide 5 </li> <li> 5 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Regular Expressions A regular expression (RE) is:A regular expression (RE) is: A single character The empty string, The concatenation of two regular expressions Notation: RE 1 RE 2 (i.e. RE 1 followed by RE 2 ) The union of two regular expressions Notation: RE 1 | RE 2 The closure of a regular expression Notation: RE* * is known as the Kleene star * represents the concatenation of 0 or more strings </li> <li> Slide 6 </li> <li> 6 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Token Definition Example Numeric literals in PascalNumeric literals in Pascal Definition of the token unsigned_number digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit* unsigned_number unsigned_integer ( (. unsigned_integer ) | ) ( ( e ( + | | ) unsigned_integer ) | ) Recursion is not allowed!Recursion is not allowed! </li> <li> Slide 7 </li> <li> 7 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit* unsigned_number unsigned_integer ( (. unsigned_integer ) | ) ( ( e ( + | | ) unsigned_integer ) | ) Regular expression forRegular expression for Decimal numbers number </li> <li> Slide 8 </li> <li> 8 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit* unsigned_number unsigned_integer ( (. unsigned_integer ) | ) ( ( e ( + | | ) unsigned_integer ) | ) Regular expression forRegular expression for Decimal numbers number ( + | | ) unsigned_integer ( ( unsigned_integer ) | ) </li> <li> Slide 9 </li> <li> 9 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit* unsigned_number unsigned_integer ( (. unsigned_integer ) | ) ( ( e ( + | | ) unsigned_integer ) | ) Regular expression forRegular expression for Identifiers identifier </li> <li> Slide 10 </li> <li> 10 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit* unsigned_number unsigned_integer ( (. unsigned_integer ) | ) ( ( e ( + | | ) unsigned_integer ) | ) Regular expression forRegular expression for Identifiers identifier letter ( letter | digit | )* letter a | b | c | | z </li> <li> Slide 11 </li> <li> 11 COMP 144 Programming Language Concepts Felix Hernandez-Campos Context Free Grammars CFGsCFGs Add recursion to regular expressions Nested constructions Notation expression identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator + | - | * | / Terminal symbols Non-terminal symbols Production rule (i.e. substitution rule) terminal symbol terminal and non-terminal symbols </li> <li> Slide 12 </li> <li> 12 COMP 144 Programming Language Concepts Felix Hernandez-Campos Backus-Naur Form Backus-Naur Form (BNF)Backus-Naur Form (BNF) Equivalent to CFGs in power CFG expression identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator + | - | * | / BNF expression identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator + | - | * | / </li> <li> Slide 13 </li> <li> 13 COMP 144 Programming Language Concepts Felix Hernandez-Campos Extended Backus-Naur Form Extended Backus-Naur Form (EBNF)Extended Backus-Naur Form (EBNF) Adds some convenient symbols Union| Kleene star* Meta-level parentheses( ) It has the same expressive power </li> <li> Slide 14 </li> <li> 14 COMP 144 Programming Language Concepts Felix Hernandez-Campos Extended Backus-Naur Form Extended Backus-Naur Form (EBNF)Extended Backus-Naur Form (EBNF) It has the same expressive power BNF digit 0 digit 1 digit 9 unsigned_integer digit unsigned_integer digit unsigned_integer EBNF digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer digit digit * </li> <li> Slide 15 </li> <li> 15 COMP 144 Programming Language Concepts Felix Hernandez-Campos Derivations A derivation shows how to generate a syntactically valid stringA derivation shows how to generate a syntactically valid string Given a CFG Example: CFG expression identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator + | - | * | / Derivation of slope * x + intercept </li> <li> Slide 16 </li> <li> 16 COMP 144 Programming Language Concepts Felix Hernandez-Campos Derivation Example Derivation of slope * x + interceptDerivation of slope * x + intercept expression expression operator expression expression operator intercept expression operator intercept expression + intercept expression + intercept expression operator expression + intercept expression operator expression + intercept expression operator x + intercept expression operator x + intercept expression * x + intercept expression * x + intercept slope * x + intercept slope * x + intercept expression * slope * x + intercept Identifiers were not derived for simplicity </li> <li> Slide 17 </li> <li> 17 COMP 144 Programming Language Concepts Felix Hernandez-Campos Parse Trees A parse is graphical representation of a derivationA parse is graphical representation of a derivation ExampleExample </li> <li> Slide 18 </li> <li> 18 COMP 144 Programming Language Concepts Felix Hernandez-Campos Ambiguous Grammars Alternative parse treeAlternative parse tree same expression same grammar This grammar is ambiguousThis grammar is ambiguous </li> <li> Slide 19 </li> <li> 19 COMP 144 Programming Language Concepts Felix Hernandez-Campos Designing unambiguous grammars Specify more grammatical structureSpecify more grammatical structure In our example, left associativity and operator precedence 10 4 3 means (10 4) 3 3 + 4 * 5 means 3 + (4 * 5) </li> <li> Slide 20 </li> <li> 20 COMP 144 Programming Language Concepts Felix Hernandez-Campos Example Parse tree for 3 + 4 * 5Parse tree for 3 + 4 * 5 Exercise: parse tree forExercise: parse tree for - 10 / 5 * 8 4 - 5 - 10 / 5 * 8 4 - 5 </li> <li> Slide 21 </li> <li> 21 COMP 144 Programming Language Concepts Felix Hernandez-Campos Java Language Specification Available on-lineAvailable on-line http://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.html http://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.htmlhttp://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.html ExamplesExamples Comments: http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 Multiplicative Operators: http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 Unary Operators: http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990 </li> <li> Slide 22 </li> <li> 22 COMP 144 Programming Language Concepts Felix Hernandez-Campos Reading Assignment Scotts Chapter 2Scotts Chapter 2 Section 2.1.2 Section 2.1.3 Java language specificationJava language specification Chapter 2 (Grammars) Glance at chapter 3 Glance at sections 15.17, 15.18 and 15.15 </li> </ul>