Binary Studio Academy PRO: ANTLR course by Alexander Vasiltsov (lesson 2)

ANTLR 4 Grammars by Alexander Vasiltsov

Upload: binary-studio

Post on 12-Jul-2015


Category: Software


TRANSCRIPT

Page 1: Binary Studio Academy PRO: ANTLR course by Alexander Vasiltsov (lesson 2)

ANTLR 4

Grammars

by Alexander Vasiltsov

Page 2

EBNF

● lexeme “::=” description - a lexeme and its definition (“=” may be used instead of “::=”)

● ‘...’ - text element: a character or group of characters

● A B - element A followed by element B (concatenation)

● A | B - element A or B (choice)

● [A] - element A may be present or absent (optional)

● {A} - zero or more A elements (repetition)

● (A B) - grouping of elements

Page 3

ANTLR Notation
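
The content of this slide did not survive in the transcript. As a rough sketch only, the EBNF operators from the previous page are usually written in ANTLR 4 as shown here. The grammar name and all rule names are hypothetical and chosen purely for illustration.

```antlr
grammar Notation;

// EBNF "rule ::= description" becomes "rule : description ;" in ANTLR
greeting : 'hello' ID ;          // '...' is a text element; A B is concatenation
answer   : 'yes' | 'no' ;        // A | B is choice
ret      : 'return' ID? ';' ;    // EBNF [A] (optional) is written A?
block    : '{' greeting* '}' ;   // EBNF {A} (zero or more) is written A*
names    : ID (',' ID)+ ;        // ( ... ) groups elements; + means one or more

// Lexer rules start with an upper-case letter
ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;        // whitespace is matched and discarded
```

The one operator here with no counterpart in the EBNF list is `+` (one or more), which ANTLR provides alongside `?` and `*`.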

Page 4

Grammar patterns

Sequence of elements

Choice between multiple alternatives

Token dependency - the presence of one token requires the presence of its counterpart somewhere in the phrase

Nested phrase - a self-similar language construct

Page 5

Sequence

This is a finite or arbitrarily long sequence of tokens or subphrases

Sequence with terminator

Sequence with separator
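
A minimal sketch of the three sequence forms named above. The slide's own examples are not in the transcript, so the grammar name, rule names and token definitions below are assumptions made for illustration.

```antlr
grammar Sequences;

// Plain sequence: one or more integers, e.g. "1 2 3"
ints   : INT+ ;

// Sequence with terminator: every element is followed by ';', e.g. "x; y;"
stats  : (stat ';')* ;

// Sequence with separator: ',' appears only between elements, e.g. "1, 2, 3"
values : INT (',' INT)* ;

stat   : ID ;

ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Note the difference between the last two forms: a terminator follows every element, including the last one, while a separator never trails the final element.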

Page 6

Choice (Alternatives)

This is a set of alternative phrases
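
As an illustration, alternatives are separated with `|` inside a rule. The rules below are a hypothetical sketch and are not taken from the slide.

```antlr
grammar Choice;

// A type name is exactly one of three keywords
type : 'float' | 'int' | 'void' ;

// Alternatives may be whole phrases, not only single tokens
stat : ID '=' INT ';'    // assignment
     | 'print' ID ';'    // print statement
     | ';'               // empty statement
     ;

ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```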

Page 7

Token Dependency

The presence of one token requires the presence of one or more subsequent tokens
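
Matched brackets are the usual case of this pattern. The following sketch uses assumed rule names to show an opening token that obliges a closing token to appear later in the same rule.

```antlr
grammar Dependency;

// An opening '[' requires a closing ']' later in the phrase, e.g. "[1 2 3]"
vector : '[' INT+ ']' ;

// Likewise '(' requires ')', here around an optional argument list, e.g. "f(1, 2)"
call   : ID '(' (INT (',' INT)*)? ')' ;

ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```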

Page 8

Nested Phrase

This is a self-similar language structure
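
In a grammar, self-similarity shows up as a rule that refers to itself. A minimal sketch, with a hypothetical `expr` rule:

```antlr
grammar Nested;

// expr appears inside its own definition, so phrases can nest
// to any depth: a[1], ((x)), a[b[2]]
expr : ID '[' expr ']'   // array index whose index is itself an expression
     | '(' expr ')'      // parenthesized expression
     | ID
     | INT
     ;

ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

The `object`, `array` and `value` rules of the JSON grammar later in this lesson follow the same pattern, except that the recursion there is indirect: `value` refers to `object`, which refers back to `value` through `pair`.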

Page 9

Common lexical structures

Page 10

Lexical Starter Kit (1)

Page 11

Lexical Starter Kit (2)

Page 12

Lexical Starter Kit (3)
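
The three starter-kit slides are empty in this transcript. As a sketch of what such a kit typically covers (identifiers, numbers, string literals, comments, whitespace), and not a reconstruction of the slides, a lexer grammar might look like this. All rule names and the exact character sets are assumptions.

```antlr
lexer grammar StarterKit;

// Identifier: a letter or underscore, then letters, digits or underscores
ID     : [a-zA-Z_] [a-zA-Z_0-9]* ;

// Numbers: integers, and floats such as 1.5, 1. and .5
INT    : [0-9]+ ;
FLOAT  : [0-9]+ '.' [0-9]*
       | '.' [0-9]+
       ;

// String literal with a few escape sequences; no raw line breaks allowed
STRING : '"' (ESC | ~["\\\r\n])* '"' ;
fragment ESC : '\\' ["\\nrt] ;

// Comments and whitespace are matched and thrown away
LINE_COMMENT : '//' ~[\r\n]* -> skip ;
COMMENT      : '/*' .*? '*/' -> skip ;   // .*? is non-greedy: stops at the first */
WS           : [ \t\r\n]+ -> skip ;
```

Rules marked `fragment` never become tokens on their own; they only serve as building blocks for other lexer rules.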

Page 13

Line between lexer and parser

● Match and discard anything in the lexer that the parser does not need to see at all

● Match common tokens such as identifiers, keywords, strings, and numbers in the lexer

● Lump together into a single token type those lexical structures that the parser does not need to distinguish

● Lump together anything that the parser can treat as a single entity

● On the other hand, if the parser needs to pull apart a lump of text to process it, the lexer should pass the individual components as tokens to the parser
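
These guidelines can be illustrated with a small, hypothetical log format in which each entry is an IP address followed by a quoted message. The format, the grammar name and the rule names are all assumptions made for this sketch.

```antlr
grammar Boundary;

log   : entry* EOF ;
entry : ip STRING ;

// The parser needs the four numbers individually, so the lexer passes
// the components of the address as separate tokens.
ip    : INT '.' INT '.' INT '.' INT ;

// If the parser only needed to recognize an address as a whole,
// a single lexer rule would be enough (hypothetical alternative):
//   IP : INT '.' INT '.' INT '.' INT ;

INT     : [0-9]+ ;
STRING  : '"' ~["\r\n]* '"' ;        // the whole message is one entity for the parser
COMMENT : '#' ~[\r\n]* -> skip ;     // the parser never needs to see comments
WS      : [ \t\r\n]+ -> skip ;       // nor whitespace
```

Which variant fits depends on what the application does with the address: splitting it in the parser costs more tokens but saves re-parsing the text later.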

Page 14

JSON Reference

http://json.org

Page 15

JSON grammar (1)

grammar JSON;

json
    : object
    | array
    ;

object
    : '{' pair (',' pair)* '}'
    | '{' '}'                     // empty object
    ;

pair: STRING ':' value ;

array
    : '[' value (',' value)* ']'
    | '[' ']'                     // empty array
    ;

value
    : STRING
    | NUMBER
    | object                      // recursion
    | array                       // recursion
    | 'true'                      // keywords
    | 'false'
    | 'null'
    ;

Page 16

JSON grammar (2)

STRING : '"' (ESC | ~["\\])* '"' ;
fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX : [0-9a-fA-F] ;

NUMBER
    : '-'? INT '.' [0-9]+ EXP?    // 1.35, 1.35E-9, 0.3, -4.5
    | '-'? INT EXP                // 1e10, -3e4
    | '-'? INT                    // -3, 45
    ;

fragment INT : '0' | [1-9] [0-9]* ;   // no leading zeros
fragment EXP : [Ee] [+\-]? INT ;      // \- since - means "range" inside [...]

WS : [ \t\n\r]+ -> skip ;

Page 17

Typical JSON

Page 18

Parse tree