xtext the very least

The very least about Xtext Juri Luca De Coi Saint-Etienne, France, 16-05-2011


Page 1: Xtext the Very Least

The very least about Xtext

Juri Luca De Coi

Saint-Etienne, France, 16-05-2011

Page 2: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 4: Xtext the Very Least

Xtext (I)

Xtext is a language development framework

– i.e., a technology supporting the activity of developing languages

Given the Xtext grammar of a language, it provides you with the following for that language:

• an Eclipse editor with

– content assistance

– quick fixes

– template proposals

– outline

– hyperlinking

– syntax coloring

– project wizard

• a code generator

• a serializer and code formatter

• a scoping and linking framework

• a validator

• an AST builder

• a parser

Page 5: Xtext the Very Least

Xtext (II)

The more you want, the more you have to pay, BUT

• if you are fine with the (reasonable) defaults, your amount of work will be pretty low

• otherwise, Xtext is highly configurable

– Each automatically generated class can be replaced in a non-invasive way

What do we want?

• A parser

• An AST builder

We will (almost) only focus on Xtext’s grammar language

Page 6: Xtext the Very Least

Technical remark

Xtext is based on Ecore

Knowledge of Ecore is required to exploit Xtext’s full potential

Ecore is the core of the Eclipse Modeling Framework Project (EMF)

– EMF is “a modeling framework and code generation facility for building tools and other applications based on a structured data model”

I will try to leave Ecore out as much as possible

I will skip some (most) parts of Xtext

Page 7: Xtext the Very Least

Xtext’s grammar language (XGL)

• A language to describe (textual) languages

• An Xtext grammar describes

– the syntax of the target language

– the structure of the target AST

Page 8: Xtext the Very Least

The syntax of the target language

Not surprisingly, XGL distinguishes between

• the lexical level, which specifies the language’s tokens by means of keywords and terminal rules, and is exploited by the lexer (a.k.a. scanner or tokenizer)

• the syntactic level, which specifies the language’s grammar by means of parser rules, and is exploited by the parser

Page 9: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 10: Xtext the Very Least

Terminal rules (I)

terminal NAME : expression ;

expression can contain

1. keywords

– single- or double-quoted

– can have any length

– can contain arbitrary characters

• including the escape sequences \b, \t, \n, \f, \r, \", \' and \\

• Unicode escape sequences (e.g., \u123) are not supported

– EX: 'foo', "\foo", '"', "'"

Page 11: Xtext the Very Least

Terminal rules (II)

2. wildcard (.)

– An arbitrary character

– EX: .

3. rule calls

– Terminal rules can only point to other terminal rules

– EX: ID (assuming that ID is the name of a terminal rule)

4. character ranges (..)

– Extremes are included

– EX: 'a'..'z', 'A'..'Z', '0'..'9'

Page 12: Xtext the Very Least

Terminal rules (III)

5. until token (->)

– All input between the preceding and the following token (extremes are included)

– EX: '/*' -> '*/'

6. negated token (!)

– Any input other than the following token

– EX: !'\n'

7. cardinality operators (?, *, + or nothing)

– EX: '^'?, '\r'*, '9'+
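For readers who think in regular expressions, the until token and the negated token have rough analogues in java.util.regex. The sketch below is only an illustration of what the two operators match. It is plain Java, not code that Xtext generates or uses, and the class name TerminalAnalogies is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: regex analogues of two terminal-rule operators.
// Plain java.util.regex; nothing here is part of Xtext.
final class TerminalAnalogies {
    // '/*' -> '*/' : everything from "/*" up to and including the first "*/"
    static final Pattern UNTIL = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // !'\n' : exactly one character that is not a newline
    static final Pattern NOT_NEWLINE = Pattern.compile("[^\\n]");

    // Returns the first block comment found in the input, or null if there is none.
    static String firstComment(String input) {
        Matcher m = UNTIL.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstComment("a /* b */ c */"));
        System.out.println(NOT_NEWLINE.matcher("x").matches());
        System.out.println(NOT_NEWLINE.matcher("\n").matches());
    }
}
```

The reluctant quantifier (.*?) is what makes the match stop at the first closing delimiter, which mirrors the "extremes are included" behaviour of the until token.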

Page 13: Xtext the Very Least

Terminal rules (IV)

8. groups (token sequences)

– EX: 'a' . ID (assuming that ID is the name of a terminal rule)

9. alternatives (|)

– EX: ' ' | '\t' | '\r' | '\n'

Page 14: Xtext the Very Least

Operator priority

Operators ordered by decreasing priority:

1. Character ranges (..)

2. Until token and negated token (->, !)

3. Cardinality operators (?, *, + or nothing)

4. Groups (token sequences)

5. Alternatives (|)

Parentheses (()) can override default priorities

EX:

terminal ID:
  '^'?
  ('a'..'z'|'A'..'Z'|'_')
  ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
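To check one's reading of the ID rule and of the operator priorities, it can help to restate the rule as a regular expression. The following is a minimal sketch in plain Java. The class name IdTerminal and the helper isId are hypothetical, and the regex is a hand translation, not something Xtext produces.

```java
import java.util.regex.Pattern;

// Hand translation of the ID terminal rule into a Java regex (illustration only).
final class IdTerminal {
    // '^'? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
    static final Pattern ID = Pattern.compile("\\^?[a-zA-Z_][a-zA-Z_0-9]*");

    // True if the whole text is matched by the ID pattern.
    static boolean isId(String text) {
        return ID.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"foo", "^class", "_x9", "9x", "a-b"}) {
            System.out.println(s + " -> " + isId(s));
        }
    }
}
```

Each parenthesized alternative of character ranges becomes one character class, the optional caret becomes \^?, and the trailing * carries over unchanged.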

Page 15: Xtext the Very Least

Technical remark

NOTE: Terminal rules can hide each other

The order of terminal rules is crucial

This is especially important when mixing

• newly introduced rules and

• rules from imported grammars (cf. below)
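The effect of rule order can be illustrated with a deliberately simplified model: the rules are tried in declaration order and the first one that matches the whole text claims it. This is an assumption made for the sake of the example, not a description of the lexer Xtext actually generates. The rule names ID and BOOL, the class TerminalOrderDemo and its helpers are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Simplified model of terminal rules hiding each other (not Xtext's real lexer).
final class TerminalOrderDemo {
    static final Pattern ID = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");
    static final Pattern BOOL = Pattern.compile("true|false");

    // Rules in the order "ID, then BOOL".
    static Map<String, Pattern> idFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("ID", ID);
        rules.put("BOOL", BOOL);
        return rules;
    }

    // Rules in the order "BOOL, then ID".
    static Map<String, Pattern> boolFirst() {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("BOOL", BOOL);
        rules.put("ID", ID);
        return rules;
    }

    // Returns the name of the first rule (in declaration order) matching the whole text, or null.
    static String classify(Map<String, Pattern> rules, String text) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(text).matches()) {
                return rule.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(classify(idFirst(), "true"));   // ID hides BOOL
        System.out.println(classify(boolFirst(), "true")); // BOOL gets its chance
    }
}
```

In this model, declaring ID first means BOOL can never be produced, because every text BOOL matches is also matched by ID.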

Page 16: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 17: Xtext the Very Least

Parser rules (I)

name : expression ;

expression can contain

1. keywords

– single- or double-quoted

– can have any length

– can contain arbitrary characters

• including the escape sequences \b, \t, \n, \f, \r, \", \' and \\

• Unicode escape sequences (e.g., \u123) are not supported

– EX: 'foo', "\foo", '"', "'"

Page 18: Xtext the Very Least

Parser rules (II)

2. rule calls

– EX: ID (assuming that ID is the name of a rule)

3. cardinality operators (?, *, + or nothing)

– EX: '^'?, '\r'*, '9'+

4. groups (token sequences)

– EX: 'a' ID

5. unordered groups (&)

– Elements can appear in any order but only once

– Elements with cardinality * or + must appear continuously without interruption

– EX: 'a' & ID*

Page 19: Xtext the Very Least

Parser rules (III)

6. alternatives (|)

– EX: ' ' | '\t' | '\r' | '\n'

Page 20: Xtext the Very Least

Operator priority

Operators ordered by decreasing priority:

1. Cardinality operators (?, *, + or nothing)

2. Groups (token sequences)

3. Unordered groups (&)

4. Alternatives (|)

Parentheses (()) can override default priorities

EX:

Action:
  '{' TypeRef (
    '.' ID ('='|'+=') 'current'
  )? '}' ;

Page 21: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 22: Xtext the Very Least

The structure of the resulting AST

• You typically (should) know the AST you want before defining a textual representation for it

• You will now learn how to instruct Xtext to build the ASTs you want

• Let us start with the classical example

Page 23: Xtext the Very Least

Arithmetical expressions (I)

Page 24: Xtext the Very Least

Arithmetical expressions (II)

• We have to define a corresponding textual representation

– i.e., we have to define a corresponding grammar

• To keep things easy, let us define a grammar which

– does not consider operator priorities

– does not consider operator associativity

– requires parentheses to be specified explicitly

EX:

• not 1 + 2 * (3 – 4 / 5)

• but 1 + (2 * (3 – (4 / 5)))

Page 25: Xtext the Very Least

Arithmetical expressions (III)

Expression ::= IntOrPar ( FactorSign IntOrPar | TermSign IntOrPar )?

IntOrPar ::= INT | '(' Expression ')'

INT ::= '0' | '1'..'9' '0'..'9'*

FactorSign ::= '*' | 'multiply' | '/' | 'divide'

TermSign ::= '+' | 'plus' | '-' | 'minus'

Page 26: Xtext the Very Least

Arithmetical expressions (IV)

How to instruct Xtext to build this AST out of 1 + (2 * (3 – (4 / 5)))?

Page 27: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 28: Xtext the Very Least

Return types

• Each rule should specify a return type

• The return type defaults to

– ecore::EString (for terminal and data type rules, cf. below)

– the rule’s name (otherwise)

EX:

IntOrPar returns Expression: … ;
terminal INT returns ecore::EInt: … ;

• The Xtext framework will create a Java class for each (non-existing) return type

• The parser generated by the Xtext framework will create an instance of such a class whenever applying a rule with such a return type

Page 29: Xtext the Very Least

Enumeration rules

TermSign and FactorSign are enumerations

• Enumerations can be specified by (enumeration) rules

enum TermSign: PLUS='+' | PLUS='plus' |
  MINUS='-' | MINUS='minus';

• If the value is omitted, you will get equal name and value

• The first enumeration value is the default one

• The Xtext framework will create an enumeration for each enumeration rule

• The parser generated by the Xtext framework will create an enumeration value whenever applying the corresponding enumeration rule

It is (theoretically) possible to

• use alternative literals

• reference a value twice

In practice, Xtext complains
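The intent of the TermSign rule, several literals mapping to the same enumeration value, can be sketched in plain Java. This is not the enumeration Xtext generates. The name TermSignSketch and the method fromLiteral are hypothetical and only illustrate the literal-to-value mapping.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of an enumeration with alternative literals.
enum TermSignSketch {
    PLUS, MINUS; // PLUS is declared first, so it plays the role of the default value

    private static final Map<String, TermSignSketch> LITERALS = new HashMap<>();

    static {
        LITERALS.put("+", PLUS);
        LITERALS.put("plus", PLUS);
        LITERALS.put("-", MINUS);
        LITERALS.put("minus", MINUS);
    }

    // Maps a parsed literal to its enumeration value; unknown literals are rejected.
    static TermSignSketch fromLiteral(String literal) {
        TermSignSketch sign = LITERALS.get(literal);
        if (sign == null) {
            throw new IllegalArgumentException("Unknown literal: " + literal);
        }
        return sign;
    }

    public static void main(String[] args) {
        System.out.println(fromLiteral("plus"));
        System.out.println(fromLiteral("-"));
    }
}
```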

Page 30: Xtext the Very Least

Terminal rules

• Terminal and data type rules (cf. below) return ecore::EString by default

• You probably want the following rule to return an integer

terminal INT: '0' | '1'..'9' '0'..'9'*;

To this end, you have to

1. declare the return type in the rule

terminal INT returns ecore::EInt: … ;

2. create a value converter (VC)

3. create a value converter service (VCS)

4. register the VC at the VCS

Page 31: Xtext the Very Least

Creating a value converter

Create a class implementing IValueConverter

/* Responsible for the string-to-value conversion */
X toValue(String, AbstractNode)

/* Responsible for the value-to-string conversion */
String toString(X)

• X is the return type of the grammar rule

• ValueConverterExceptions signal conversion errors

IValueConverter and ValueConverterException belong to package org.eclipse.xtext.conversion

AbstractNode belongs to package org.eclipse.xtext.parsetree
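The two conversion directions can be sketched without Xtext on the classpath. In the sketch below, ConverterSketch is a hypothetical stand-in for IValueConverter: it drops the AbstractNode parameter and uses IllegalArgumentException where a real converter would presumably throw a ValueConverterException. IntConverterSketch is likewise an illustrative name.

```java
// Hypothetical stand-in for Xtext's IValueConverter, reduced to plain Java
// so that the sketch runs on its own. It is NOT the Xtext interface.
interface ConverterSketch<T> {
    T toValue(String text);     // string-to-value conversion
    String toString(T value);   // value-to-string conversion
}

// Converts the text matched by an INT-like rule to an Integer and back.
final class IntConverterSketch implements ConverterSketch<Integer> {
    @Override
    public Integer toValue(String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("Cannot convert empty text to an integer");
        }
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException e) {
            // A real converter would signal the problem with a ValueConverterException.
            throw new IllegalArgumentException("Not an integer: " + text, e);
        }
    }

    @Override
    public String toString(Integer value) {
        return value.toString();
    }

    public static void main(String[] args) {
        IntConverterSketch converter = new IntConverterSketch();
        System.out.println(converter.toValue("42") + 1);
        System.out.println(converter.toString(Integer.valueOf(7)));
    }
}
```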

Page 32: Xtext the Very Least

Creating a value converter service

Create a class implementing IValueConverterService

• The easiest way is by extending AbstractDeclarativeValueConverterService

• Extend DefaultTerminalConverters if you imported grammar Terminals (cf. below)

IValueConverterService belongs to package org.eclipse.xtext.conversion

AbstractDeclarativeValueConverterService belongs to package org.eclipse.xtext.conversion.impl

DefaultTerminalConverters belongs to package org.eclipse.xtext.common.services

Terminals belongs to package org.eclipse.xtext.common

Page 33: Xtext the Very Least

Registering VCs at VCSs

Declare as many VCS fields as IValueConverters you need

@Inject private type name;

• type implements IValueConverter

• name is an arbitrary name

Declare as many VCS methods as grammar rules you handle

@ValueConverter(rule = "rule")
public IValueConverter<returnType> rule() {
  return converter;
}

• rule is the name of the grammar rule

• returnType is the type returned by converter

• converter is the IValueConverter responsible for rule

Inject belongs to package com.google.inject

ValueConverter belongs to package org.eclipse.xtext.conversion

Page 34: Xtext the Very Least

Simple actions

IntOrPar returns Expression:
  '(' Expression ')' |
  {Integer} value=INT;

The Xtext framework will

• create a class Expression

• create a class Integer (extending Expression)

• add a field value of type ecore::EInt to class Integer

The parser generated by the Xtext framework will

• in the first case, return the created Expression

• in the second case

– create an Integer

– assign the parsed INT to its field value

– return the created Integer

The right-hand side of an assignment can be any of

• a rule call

• a keyword

• a cross-reference (cf. below)

• an alternative of the former

Page 35: Xtext the Very Least

Field assignment

• The operator = assigns atomic values to fields

• The operator += assigns multiple values to fields

Pair:
  values+=Element ',' values+=Element;

• The operator ?= assigns Boolean values to fields

Wrapper: isNull?='null' | inner=Wrapped;

The Xtext framework will add

• a list field (with values of the proper type) for each assignment with the += operator

• a boolean field for each assignment with the ?= operator

The parser generated by the Xtext framework will

• add elements to such a list whenever creating the corresponding object

• initialize such a boolean field to false (resp. true) if the parser does not consume (resp. consumes) the assignment’s right-hand side

Page 36: Xtext the Very Least

Assigned actions

Expression: IntOrPar (
  {Factor.left=current} sign=FactorSign right=IntOrPar |
  {Term.left=current} sign=TermSign right=IntOrPar
)? ;

The Xtext framework will

• create classes Factor and Term (extending Expression)

• add to them the fields left, sign and right (of the proper type)

The parser generated by the Xtext framework will

• if there is no optional part, return the created Expression

• otherwise

– create a Factor or Term

– assign the parsed IntOrPar to its field left

– go on as expected
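To make the behaviour of simple and assigned actions concrete, here is a hand-written recursive-descent sketch in plain Java that builds the same kind of tree for the arithmetic grammar. It is not the parser Xtext generates. All names (ExprSketch, IntLiteral in place of the slides' Integer, and so on) are illustrative, only the symbolic signs are supported, and INT is simplified to any run of ASCII digits.

```java
// Hand-written illustration of the AST construction described above (not generated by Xtext).
final class ExprSketch {
    interface Expression {}

    static final class IntLiteral implements Expression { // the slides call this class Integer
        final int value;
        IntLiteral(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static abstract class BinaryNode implements Expression {
        final Expression left; final char sign; final Expression right;
        BinaryNode(Expression left, char sign, Expression right) {
            this.left = left; this.sign = sign; this.right = right;
        }
        @Override public String toString() { return "(" + left + " " + sign + " " + right + ")"; }
    }

    static final class Factor extends BinaryNode {
        Factor(Expression l, char s, Expression r) { super(l, s, r); }
    }

    static final class Term extends BinaryNode {
        Term(Expression l, char s, Expression r) { super(l, s, r); }
    }

    private final String input;
    private int pos = 0;

    private ExprSketch(String input) { this.input = input; }

    // Parses the whole text; trailing input is an error.
    static Expression parse(String text) {
        ExprSketch parser = new ExprSketch(text);
        Expression result = parser.expression();
        parser.skipWs();
        if (parser.pos != text.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // Expression: IntOrPar ( {Factor.left=current} ... | {Term.left=current} ... )?
    private Expression expression() {
        Expression current = intOrPar();
        skipWs();
        if (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '*' || c == '/') { pos++; return new Factor(current, c, intOrPar()); }
            if (c == '+' || c == '-') { pos++; return new Term(current, c, intOrPar()); }
        }
        return current; // no optional part: return what IntOrPar created
    }

    // IntOrPar: '(' Expression ')' | {Integer} value=INT
    private Expression intOrPar() {
        skipWs();
        if (pos < input.length() && input.charAt(pos) == '(') {
            pos++;
            Expression inner = expression();
            skipWs();
            if (pos >= input.length() || input.charAt(pos) != ')') {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') pos++;
        if (start == pos) throw new IllegalArgumentException("Expected INT or '(' at position " + pos);
        return new IntLiteral(Integer.parseInt(input.substring(start, pos)));
    }

    // Whitespace plays the role of a hidden token here.
    private void skipWs() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse("1 + (2 * (3 - (4 / 5)))"));
    }
}
```

The variable current mirrors the keyword current in the assigned actions: the already parsed IntOrPar becomes the left field of the newly created Factor or Term.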

Page 37: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 38: Xtext the Very Least

Hidden tokens

• Can be defined at (parser) rule- or grammar-level

– Rule-level hidden tokens override grammar-level ones

• Are automatically skipped when processing the rule/grammar

EX: Expression returns Expression hidden(WS): … ;

• Grammar-level: when importing one single grammar, its hidden tokens are reused

• Rule-level: hidden tokens defined for a calling rule are reused for called rules (unless they define their own hidden tokens)

Page 39: Xtext the Very Least

Data type rules

They are parser rules which

• contain neither assignments nor actions

• only call terminal or data type rules

The AST builder simply concatenates the parsed text

Why should we use data type rules instead of terminal rules?

• They allow hidden tokens

• They allow backtracking

Page 40: Xtext the Very Least

References: Motivation

• In a language, it is often the case that the same entity is referred to over and over

EX: Variables and methods in Java programs

• You do not want the AST builder to create new instances of the entity whenever a reference is found

• You rather want the AST builder to point to the entity created at definition-time

Page 41: Xtext the Very Least

References: Syntax

field=[type|rule]

where

• field is the field of the object created by the AST builder which is supposed to refer to an entity

• type is the class of the referred entity

• rule is a grammar rule specifying the string representation of the reference

– If omitted (with the preceding |), org.eclipse.xtext.common.Terminals.ID is assumed

Notice that

• references can only be used within assignments

• entities of different classes can have the same string representation

• cross-references across file boundaries are supported

– as long as the referenced entities are on the classpath

Page 42: Xtext the Very Least

References: (Default) Semantics

• In order to be referenceable, entities must have a field name

• Reference resolution is based on qualified names

• An entity’s qualified name is computed by concatenating

– the qualified name of the entity’s container

– a dot (.)

– the entity’s name
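The concatenation rule for qualified names can be written down in a few lines of plain Java. This is only a sketch of the rule as stated on the slide, not Xtext's own name computation. The class NamedEntity and its fields are hypothetical, and it assumes every container has a name.

```java
// Illustration of the default qualified-name rule (not Xtext code).
final class NamedEntity {
    final String name;
    final NamedEntity container; // null for a top-level entity

    NamedEntity(String name, NamedEntity container) {
        this.name = name;
        this.container = container;
    }

    // Container's qualified name, a dot, then the entity's own name.
    // A top-level entity is identified by its plain name.
    String qualifiedName() {
        return container == null ? name : container.qualifiedName() + "." + name;
    }

    public static void main(String[] args) {
        NamedEntity org = new NamedEntity("org", null);
        NamedEntity example = new NamedEntity("example", org);
        NamedEntity foo = new NamedEntity("Foo", example);
        System.out.println(foo.qualifiedName());
    }
}
```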

Page 43: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header

• Getting Xtext up & running

Page 44: Xtext the Very Least

The header of Xtext grammars

Consists of declarations of

• the grammar’s name

and possibly

• imported grammars

• grammar-scoped hidden tokens

• imported Ecore packages

• the Ecore package to generate

The first rule in a grammar (entry rule) is assumed to be its entry point

• i.e., it is the first rule the parser generated by Xtext will try to apply

Page 45: Xtext the Very Least

Name and imported grammars

The grammar’s name

• Xtext grammar names follow Java’s naming conventions

The grammar file must have the same name as the grammar it contains (and extension .xtext)

EX: grammar org.eclipse.xtext.Xtext

Imported grammars

• The current grammar can reuse (or override) rules defined in other grammars

EX: with org.eclipse.xtext.common.Terminals

Page 46: Xtext the Very Least

Hidden tokens and imported packages

Grammar-scoped hidden tokens

• Are declared just like hidden tokens for rules

EX: hidden(WS)

Imported Ecore packages

• You do not really need to care about them

• Just do not be scared if you see something like

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Page 47: Xtext the Very Least

The package to generate

• Among other things, Xtext creates an Ecore package (whatever it is)

• Just keep in mind that

– A name and a namespace URI are required to create an Ecore package

– You must provide Xtext with such data

EX: generate myDsl "http://www.univStEtienne.fr/mydsl/MyDsl"

Page 48: Xtext the Very Least

Exercise

To test your understanding of XGL, have a look at XGL’s Xtext grammar

http://dev.eclipse.org/viewcvs/viewvc.cgi/org.eclipse.tmf/org.eclipse.xtext/plugins/org.eclipse.xtext/src/org/eclipse/xtext/Xtext.xtext?root=Modeling_Project&view=markup

Page 49: Xtext the Very Least

Outline

• Introduction

• How to specify the target language

– Terminal rules

– Parser rules

• How to specify the target AST

– The working example

– Return types

• Further features

• The header of Xtext grammars

• Getting Xtext up & running

Page 50: Xtext the Very Least

Getting Xtext up & running (I)

Install Eclipse

1. Download Eclipse Modeling Tools – http://www.eclipse.org/downloads/

2. Start Eclipse Modeling Tools

3. Click on Install Modeling Components (the fifth icon from the left on the icon bar right below the menu bar)

4. Select Xtext

Page 51: Xtext the Very Least

Getting Xtext up & running (II)

Create an Xtext project

1. File → New → Project… → Xtext → Xtext project

2. Choose a meaningful project name, language name and file extension

3. Uncheck the Create generator project box

4. Click on Finish

5. Add http://download.itemis.com/antlr-generator-3.0.1.jar to the project’s classpath

Page 52: Xtext the Very Least

Getting Xtext up & running (III)

Generate the language artifacts

1. Replace the content of the automatically opened grammar file with your grammar

2. Locate the file GenerategrammarName.mwe2 next to the grammar file in the package explorer view

3. Choose Run As → MWE2 Workflow from its context menu

4. Possibly add your converters and converter service to the non-ui project

– Add the following method to the class grammarNameRuntimeModule

@Override
public Class<? extends IValueConverterService> bindIValueConverterService() {
  return converterService.class;
}

Page 53: Xtext the Very Least

Getting Xtext up & running (IV)

Run the generated IDE plug-in

1. Right-click on the Xtext project and choose Run As → Eclipse Application

– This will spawn a new Eclipse workbench

2. Create a new project

3. Create a new file with the file extension you chose in the beginning

– This will open the generated entity editor

4. Enjoy the editor