Managed Compiler
Language framework in a managed environment
Microsoft Development Center Copenhagen
Author: João Filipe Gama de Magalhães
E-mail: [email protected]
February 2008
Overview
1. Objectives
2. Compiler Structure
3. Compiler Implementation
4. Developed Solution
5. Demo
6. Conclusions
7. Questions
Objectives
• Introduce compiler construction
• Explain the differences between a managed (.NET) compiler and a native compiler
• Introduce some of the tools used
• Give directions on some of the most important design decisions
Compiler Structure
• A compiler converts an input (source) file into an output (target) file in a process called translation
• Composed of four fundamental parts (lexer, parser, semantic analyzer and code generator)
[Figure: source file → Lexical Analyzer → tokens → Syntactical Analyzer → AST → Semantic Analyzer → Code Generator → target file]
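The four parts can be sketched end to end on a toy language. The following is a minimal illustration, not the compiler presented in this talk: the `TinyCompiler` class, its node types and the stack-machine target ("PUSH"/"ADD") are all assumptions made for the example. It compiles sums such as "1 + 2 + 3".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical four-stage pipeline for sums such as "1 + 2 + 3".
public static class TinyCompiler
{
    // AST: a number leaf or an addition of two sub-trees.
    public abstract class Node { }
    public sealed class Num : Node { public string Text; }
    public sealed class Add : Node { public Node Left, Right; }

    // 1. Lexical analysis: characters -> tokens.
    public static List<string> Lex(string source)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            if (char.IsWhiteSpace(c)) { i++; }
            else if (c == '+') { tokens.Add("+"); i++; }
            else if (char.IsDigit(c))
            {
                int start = i;
                while (i < source.Length && char.IsDigit(source[i])) i++;
                tokens.Add(source.Substring(start, i - start));
            }
            else throw new FormatException("illegal char '" + c + "'");
        }
        return tokens;
    }

    // 2. Syntactical analysis: tokens -> AST (left-associative sums).
    public static Node Parse(List<string> tokens)
    {
        int pos = 0;
        Node left = ParseNumber(tokens, ref pos);
        while (pos < tokens.Count)
        {
            if (tokens[pos] != "+") throw new FormatException("expected '+'");
            pos++;
            left = new Add { Left = left, Right = ParseNumber(tokens, ref pos) };
        }
        return left;
    }

    private static Node ParseNumber(List<string> tokens, ref int pos)
    {
        if (pos >= tokens.Count || !char.IsDigit(tokens[pos][0]))
            throw new FormatException("expected a number");
        return new Num { Text = tokens[pos++] };
    }

    // 3. Semantic analysis: every literal must fit in a 32-bit integer.
    public static void Check(Node node)
    {
        if (node is Num num)
        {
            if (!int.TryParse(num.Text, out _))
                throw new OverflowException("literal out of range: " + num.Text);
        }
        else if (node is Add add) { Check(add.Left); Check(add.Right); }
    }

    // 4. Code generation: AST -> text for an imaginary stack machine.
    public static string Generate(Node node)
    {
        if (node is Num num) return "PUSH " + num.Text;
        var add = (Add)node;
        return Generate(add.Left) + "\n" + Generate(add.Right) + "\nADD";
    }

    // Runs the four stages in order.
    public static string Compile(string source)
    {
        Node ast = Parse(Lex(source));
        Check(ast);
        return Generate(ast);
    }
}
```

Calling `TinyCompiler.Compile("1 + 2")` returns the three lines "PUSH 1", "PUSH 2" and "ADD". Each stage consumes only the output of the previous one, which is what allows a generated lexer and parser to be swapped in for the hand-written ones.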
Compiler Implementation
Overview
• Construction limited to managed (.NET) tools
• Lexer and parser defined in LEX/YACC syntax (allows reuse of old parsers and lexers)
• C#-backed parser, semantic analyzer and code generator
• Use of design patterns to simplify semantic analysis and code generation
GPLEX Scanner Generator I
• Created at Queensland University of Technology (QUT)
• LEX/FLEX-like syntax
• Generates C# code
• Compatible with GPPG generated parsers
• Easy to use
GPLEX Scanner Generator II
%using gppg;
%option minimize
%namespace Microsoft.Dynamics.PBvNext.Modelling.ExpressionCompiler

/* independent scanner section start */
%x COMMENT
%x STRING
%x METADATA

White0     [ \t\r\f\v]
White      {White0}|\n

CmntStart  \/\*
CmntEnd    \*\/
ABStar     [^\*\n]*
ABStar2    [^\"\*\n]*
NONl       [^\n]*
StrStart   \"
StrEnd     \"
GPLEX Scanner Generator III
%{
%}

if      { return (int)Tokens.KWIF; }
switch  { return (int)Tokens.KWSWITCH; }
case    { return (int)Tokens.KWCASE; }

[_][a-zA-Z0-9_\.]+  { return (int)Tokens.IDENT; }
[0-9]+              { return (int)Tokens.NUMBER; }

{CmntStart}{ABStar}\**{CmntEnd}  { return (int)Tokens.LEX_COMMENT; }
{CmntStart}{ABStar}\**           { BEGIN(COMMENT); return (int)Tokens.LEX_COMMENT; }
<COMMENT>\n                      |
<COMMENT>{ABStar}\**             { return (int)Tokens.LEX_COMMENT; }
/* the end of the comment */
<COMMENT>{ABStar}\**{CmntEnd}    { BEGIN(INITIAL); return (int)Tokens.LEX_COMMENT; }

/* all the other cases */
.  { yyerror("illegal char"); return (int)Tokens.LEX_ERROR; }

%{
%}
GPPG Parser Generator I
• Also created at Queensland University of Technology (QUT)
• YACC/Bison-like syntax
• Also generates C# code
• Compatible with GPLEX-generated scanners
• Also easy to use
GPPG Parser Generator II
%using Microsoft.Dynamics.PBvNext.DataStructure.ExpressionAST
%namespace Microsoft.Dynamics.PBvNext.Modelling.ExpressionCompiler
%valuetype LexValue
%partial

%union {
    public string str;
    public AstNode node;
}

%{
    public RootNode rootNode;
%}

%token KWIF KWSWITCH KWCASE
%token STR CMT META IDENT NUMBER

%left BARBAR AMPAMP
%left '!'
%left NEQ EQ
%left '-' '+'
GPPG Parser Generator III
%%

E : B  { rootNode = new RootNode((ExpressionNode) $1.node); }
  |    { rootNode = new RootNode(); }
  ;

B : B AMPAMP B  { $$.node = new BinaryBooleanExpressionNode(BooleanOperationType.AND, (ExpressionNode) $1.node, (ExpressionNode) $3.node); }
  ;

A : A '+' A      { $$.node = new BinaryArithmeticExpressionNode(ArithmeticOperationType.PLUS, (ArithmeticExpressionNode) $1.node, (ArithmeticExpressionNode) $3.node); }
  | A '+' error  { throw new ParsingError("Error reducing literal expression"); }
  ;

L : ATTR   { $$.node = new AttributeNode($1.str); }
  | CONST  { $$.node = new ConstantNode(Int32.Parse($1.str)); }
  | error  { throw new ParsingError("Error reducing literal expression"); }
  ;

CONST : NUMBER  { $$.str = $1.str; } ;

ATTR : IDENT  { $$.str = $1.str; } ;
GPPG and GPLEX
// creates a new scanner
Scanner scanner = new Scanner();

// creates a new parser
Parser parser = new Parser();

// sets the scanner for the parser
parser.scanner = scanner;

// sets the stream to parse
scanner.buffer = new Scanner.StringBuff(this.Value);

// parses the file
parser.Parse();

// retrieves the output root node
this.RootNode = parser.rootNode;
Visitor Pattern I
• One of the 23 GoF design patterns
• A technique used to separate an algorithm from the object structure it operates on
• Uses the concepts of “visitor” (algorithm) and “visitable” (structure) to implement that separation
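The separation can be sketched with a two-node expression tree and two unrelated algorithms running over it. This is a simplified illustration under assumed names (`NumberNode`, `SumNode`, `EvalVisitor`, `PrintVisitor`); unlike the `Accept` method shown later for `ProductNode`, each node here is visited only once, after its children.

```csharp
using System;
using System.Collections.Generic;

// Visitor: the algorithm side, with one Visit overload per node type.
public interface IVisitor
{
    void Visit(NumberNode node);
    void Visit(SumNode node);
}

// Visitable: the structure side. A node only knows how to accept a visitor.
public abstract class AstNode
{
    public abstract void Accept(IVisitor visitor);
}

public sealed class NumberNode : AstNode
{
    public int Value;
    public NumberNode(int value) { Value = value; }
    public override void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public sealed class SumNode : AstNode
{
    public AstNode Left, Right;
    public SumNode(AstNode left, AstNode right) { Left = left; Right = right; }

    // Children first, then the node itself (post-order traversal).
    public override void Accept(IVisitor visitor)
    {
        Left.Accept(visitor);
        Right.Accept(visitor);
        visitor.Visit(this);
    }
}

// First algorithm: evaluates the expression.
public sealed class EvalVisitor : IVisitor
{
    private readonly Stack<int> stack = new Stack<int>();
    public void Visit(NumberNode node) { stack.Push(node.Value); }
    public void Visit(SumNode node) { stack.Push(stack.Pop() + stack.Pop()); }
    public int Result { get { return stack.Peek(); } }
}

// Second algorithm: prints the expression. No node class had to change.
public sealed class PrintVisitor : IVisitor
{
    private readonly Stack<string> stack = new Stack<string>();
    public void Visit(NumberNode node) { stack.Push(node.Value.ToString()); }
    public void Visit(SumNode node)
    {
        string right = stack.Pop();
        string left = stack.Pop();
        stack.Push("(" + left + " + " + right + ")");
    }
    public string Result { get { return stack.Peek(); } }
}
```

Adding a third algorithm (for example a type checker) means writing one more `IVisitor` implementation; the node classes stay untouched.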
Visitor Pattern II
Visitor Pattern III
// implementation of the IVisitable (Element) in the class ProductNode
public override void Accept(IVisitor visitor)
{
    // first visit
    visitor.Visit(this);

    if (!visitor.State)
        return;

    foreach (ProductElementNode node in productElementList)
    {
        node.Accept(visitor);
    }

    // second visit (optional)
    visitor.Visit(this);
}
Semantic Analysis I
• Implementing semantic analysis as a visitor is simple yet powerful
• For a complex language, a stack may be needed to keep state between visits; a (hash-based) symbol table is also useful
• Multiple visits (passes) may be required
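The stack and the hash-based symbol table mentioned above are often combined into a scoped symbol table. The sketch below is one possible shape, assuming block-structured scopes; the `ScopedSymbols` class and its method names are invented for the example and are not part of the presented compiler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scoped symbol table: a stack of hash sets, one per open block.
public sealed class ScopedSymbols
{
    private readonly Stack<HashSet<string>> scopes = new Stack<HashSet<string>>();

    // Called by the visitor on the first visit of a block node.
    public void EnterScope() { scopes.Push(new HashSet<string>()); }

    // Called by the visitor on the second visit of the same block node.
    public void LeaveScope() { scopes.Pop(); }

    // Declares a name in the innermost scope.
    // Returns false when the name is already declared in that scope.
    public bool Declare(string name) { return scopes.Peek().Add(name); }

    // Looks a name up from the innermost scope outwards
    // (Stack<T> enumerates from the top of the stack down).
    public bool IsDefined(string name)
    {
        foreach (HashSet<string> scope in scopes)
            if (scope.Contains(name)) return true;
        return false;
    }
}
```

A semantic visitor would throw its duplicate-definition error whenever `Declare` returns false, and its undefined-symbol error whenever `IsDefined` returns false.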
Semantic Analysis II
public void Visit(ProductNode productNode)
{
    // tests if the symbol is already defined
    if (symbols.Contains(productNode.ProductValue))
        throw new DuplicateValueException(productNode);
}

public void Visit(VariableNode variableNode)
{
    // tests if the symbol is already defined in the local table
    if (localSymbols.Contains(variableNode.VariableValue))
        throw new DuplicateValueException(variableNode);

    if (variableNode.Array)
    {
        // tests if the array size is valid
        if (variableNode.ArrayIndex < 1)
            throw new InvalidArrayException(variableNode);
    }
}
Code Generation I
• The Visitor pattern is a natural choice when XML is the target output
• A stack is required for layout management (in case the AST is not well organized)
• A dual visit is required in order to close the XML elements
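The dual visit can be sketched with the standard `System.Xml.XmlWriter`: the first visit opens an element, the second closes it, and a stack tells the two apart. The `ItemNode` and `XmlGenerator` types and the "item"/"name" vocabulary are assumptions for the example; for brevity the node accepts the concrete generator instead of a visitor interface.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical AST node with a name and children.
public sealed class ItemNode
{
    public string Name;
    public List<ItemNode> Children = new List<ItemNode>();
    public ItemNode(string name) { Name = name; }

    // Visits the node before and after its children (the dual visit).
    public void Accept(XmlGenerator visitor)
    {
        visitor.Visit(this);
        foreach (ItemNode child in Children) child.Accept(visitor);
        visitor.Visit(this);
    }
}

public sealed class XmlGenerator
{
    private readonly XmlWriter writer;

    // Nodes whose start element has been written but not yet closed.
    private readonly Stack<ItemNode> open = new Stack<ItemNode>();

    public XmlGenerator(XmlWriter writer) { this.writer = writer; }

    public void Visit(ItemNode node)
    {
        if (open.Count > 0 && ReferenceEquals(open.Peek(), node))
        {
            // second visit: the node is on top of the stack, so close it
            open.Pop();
            writer.WriteEndElement();
        }
        else
        {
            // first visit: open the element and remember it
            open.Push(node);
            writer.WriteStartElement("item");
            writer.WriteAttributeString("name", node.Name);
        }
    }

    // Convenience entry point: generates the XML text for a whole tree.
    public static string Generate(ItemNode root)
    {
        var text = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(text, settings))
        {
            root.Accept(new XmlGenerator(writer));
        }
        return text.ToString();
    }
}
```

Because `XmlWriter` tracks open elements itself, the visitor never needs to know the element name on the second visit; it only has to know that this is the second visit.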
Code Generation II
public void Visit(ProductNode productNode)
{
    // tests if it is the first visit
    if (testVisit(productNode))
    {
        xmlTextWriter.WriteStartElement("product");
        xmlTextWriter.WriteAttributeString("value", productNode.ProductValue);
        ...
    }
    // in case it is the second visit
    else
        xmlTextWriter.WriteEndElement();
}
XML Interpretation I
• Recursive descent of the XML document structure
• This method is more hard-coded and less flexible than using visitors
• Some state (memory) may be necessary, depending on the XML structure
• A dual visit is not necessary
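A recursive descent over the DOM can be sketched with `System.Xml.XmlDocument`. The element and attribute names ("item", "name") and the `ItemNode` and `XmlInterpreter` types are assumptions for the example. The hard-coded name check shows why this approach is less flexible than a visitor: every element kind needs its own parsing method.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical AST node rebuilt from the XML document.
public sealed class ItemNode
{
    public string Name;
    public List<ItemNode> Children = new List<ItemNode>();
}

public static class XmlInterpreter
{
    // Parses the XML text and descends from the document element.
    public static ItemNode Load(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);
        return ParseItem(document.DocumentElement);
    }

    // One method per element kind; each child element triggers a recursive call.
    private static ItemNode ParseItem(XmlElement element)
    {
        if (element.Name != "item")
            throw new XmlException("unexpected element <" + element.Name + ">");

        var node = new ItemNode { Name = element.GetAttribute("name") };
        foreach (XmlNode child in element.ChildNodes)
        {
            // skip whitespace, comments and text; only elements are descended into
            XmlElement childElement = child as XmlElement;
            if (childElement != null)
                node.Children.Add(ParseItem(childElement));
        }
        return node;
    }
}
```

Each element is entered exactly once and its closing tag needs no handling, which is why no dual visit is required here.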
XML Interpretation II
private ProductNode ParseProduct(XmlNode productNode)
{
    // creates the product node
    ProductNode node = new ProductNode();

    // retrieves all the attributes from the node
    XmlAttributeCollection xmlAttributeCollection = productNode.Attributes;

    foreach (XmlAttribute attribute in xmlAttributeCollection)
    {
        // gets the name of the attribute
        String name = attribute.Name;

        // gets the value of the attribute
        String value = attribute.Value;

        if (name.Equals("value"))
            node.ProductValue = value;
        // the remaining attribute cases are elided on the original slide
    }

    // gets the child nodes
    XmlNodeList list = productNode.ChildNodes;

    foreach (XmlNode xmlNode in list)
    {
        ProductElementNode productElementNode = ParseProductElement(xmlNode);
        node.ProductElementList.Add(productElementNode);
    }

    return node;
}
Compiler Services I
• Implementing syntax highlighting is straightforward
• Using the lexer, it is possible to associate a color with each token to be colorized
• Auto-completion services (“Intellisense”) are more complex and require a well-implemented compiler to be really usable
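The token-to-color association can be sketched as a dictionary lookup applied to the lexer output. In this illustration a regular expression stands in for the generated scanner; the token kinds, the color names and the `Highlighter` class are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Highlighter
{
    // Hypothetical token kinds and the color associated with each.
    private static readonly Dictionary<string, string> Colors =
        new Dictionary<string, string>
        {
            { "comment", "green" },
            { "keyword", "blue" },
            { "number", "darkred" },
        };

    // Stand-in lexer: each named group plays the role of a token kind.
    private static readonly Regex Lexer = new Regex(
        @"(?<comment>/\*.*?\*/)|(?<keyword>\b(?:if|switch|case)\b)|(?<number>\b[0-9]+\b)",
        RegexOptions.Singleline);

    // A colored region of the source text.
    public struct Span
    {
        public int Start;
        public int Length;
        public string Color;
    }

    // Returns one span per token that has a color; other text stays uncolored.
    public static List<Span> Colorize(string source)
    {
        var spans = new List<Span>();
        foreach (Match match in Lexer.Matches(source))
        {
            foreach (KeyValuePair<string, string> entry in Colors)
            {
                if (match.Groups[entry.Key].Success)
                {
                    spans.Add(new Span
                    {
                        Start = match.Index,
                        Length = match.Length,
                        Color = entry.Value
                    });
                    break;
                }
            }
        }
        return spans;
    }
}
```

An editor would apply each returned span to its text buffer. No parser or AST is involved, which is what keeps highlighting simple compared with code completion.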
Compiler Services II
• Code completion raises many problems
• The parser must be able to handle “all” erroneous input in order to be able to reduce it
• Locating the context is a very complex task
• Performance is an issue; an incremental AST is the solution (not easy to implement)
Adapter and Plug-ins I
• Another of the 23 GoF design patterns
• “Adapts” the interface of a class into the one that a client expects
• Provides a simple way of combining a legacy implementation with a modern one
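A minimal sketch of the pattern, assuming a legacy generator whose interface differs from the one the compiler expects. `LegacyGenerator`, `ICodeGenerator` and `LegacyGeneratorAdapter` are hypothetical names, and the upper-casing "generation" is only a placeholder behavior.

```csharp
using System;
using System.Text;

// Hypothetical legacy component: appends its output to a caller-supplied buffer.
public sealed class LegacyGenerator
{
    public void Emit(string text, StringBuilder output)
    {
        output.Append(text.ToUpperInvariant());
    }
}

// Interface the (hypothetical) modern compiler expects from any generator.
public interface ICodeGenerator
{
    string Generate(string source);
}

// Adapter: exposes the legacy component through the expected interface
// without touching the legacy source code.
public sealed class LegacyGeneratorAdapter : ICodeGenerator
{
    private readonly LegacyGenerator legacy = new LegacyGenerator();

    public string Generate(string source)
    {
        var buffer = new StringBuilder();
        legacy.Emit(source, buffer);
        return buffer.ToString();
    }
}
```

Client code depends only on `ICodeGenerator`, so the legacy implementation can later be replaced by a new one without changing the client.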
Adapter and Plug-ins II
Adapter and Plug-ins III
• Gives a new level of abstraction
• No implementation source code required
• Restricting interaction to the interface keeps interference between both parts low
• Separates two levels of behavior, internal and external (no intrusion)
Adapter and Plug-ins IV
• Combining a plug-in system and the Visitor pattern may provide a whole new level of flexibility to the compiler
• Can be used in:
– Dynamic targeted code generation (multiple target languages)
– Interpretation of multiple source codes
– Reuse of old interpreters and generators (Adapter pattern)
– etc.
Developed Solution
Introduction
• Project developed as a Master's thesis
• Timeline from March 2007 to June 2007
• Small language for product configuration
• Language developed completely in managed code
• Complete framework for external use (André’s project)
Project Description
• Language for product configuration
• Object Orientation (OO)
• Declarative Syntax
• Syntax highlighting and code completion services
• Managed Environment (.NET CLR)
• Compiler (modeling) + interpreter (configuration)
The Pml Language
• Defines the product block as the base block, equivalent to a class, with extension support
• Supports the definition of the BOM and Route structures
• Inheritance applies to the product variables, the BOM structure, the Route structure and the constraints
• Contains namespaces for context separation
The Modeling Environment
[Figure: modeling environment diagram. Labels: Client, Compiler Tools, Pml Compiler, Lexical Analyzer, Syntactical Analyzer, Semantic Analyzer, Code Generator, AST, Pml Code Tools, Syntax Highlighter, Code Completer, Product Model, Compiled Product Model]
The Compiler
• The lexical and syntactical analyzers are based on GPLEX and GPPG, respectively
• The semantic analyzer is based on a recursive descent of the Pml AST using C# code
• Multiple output formats (XPML, XCML), and it is possible to add others
[Figure: Pml Compiler Input / Output. PML File → PML Compiler → XPML File, XCML File, other Pml output compliant file]
The Compiler Tools
• Provide a simple way to produce and compile Pml code
• Use the current location of the caret to send information to the client
• Rely on the context information to provide accurate completion data
• Based on dictionaries (hash tables) to provide a fast and responsive system
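A dictionary-based completion lookup might look like the sketch below. It assumes that the context at the caret has already been resolved to a string key (for example the name of the enclosing block); the `CompletionIndex` class and that keying scheme are assumptions for the example, not a description of the actual Compiler Tools.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical completion index: context name -> symbols valid in that context.
public sealed class CompletionIndex
{
    private readonly Dictionary<string, SortedSet<string>> byContext =
        new Dictionary<string, SortedSet<string>>();

    // Registers a symbol as valid in the given context.
    public void Add(string context, string symbol)
    {
        SortedSet<string> symbols;
        if (!byContext.TryGetValue(context, out symbols))
        {
            symbols = new SortedSet<string>(StringComparer.Ordinal);
            byContext[context] = symbols;
        }
        symbols.Add(symbol);
    }

    // Returns the symbols of the context that start with the typed prefix,
    // in ordinal order. An unknown context yields an empty list.
    public List<string> Complete(string context, string prefix)
    {
        var result = new List<string>();
        SortedSet<string> symbols;
        if (byContext.TryGetValue(context, out symbols))
        {
            foreach (string symbol in symbols)
                if (symbol.StartsWith(prefix, StringComparison.Ordinal))
                    result.Add(symbol);
        }
        return result;
    }
}
```

The context lookup is a single hash access, so response time depends mainly on how many symbols one context holds rather than on the size of the whole model.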
The Configuration Environment
[Figure: configuration environment diagram. Labels: Client, API, Configuration Engine, Interpretation Tools, Compiled Product Model, AST, Adapter Abstraction Layer, Adapters, Microsoft Constraint Solver Adapter, Microsoft Parallel Constraint Solver Adapter, Microsoft Constraint Solver, Microsoft Parallel Constraint Solver]
The Interpretation Tools
• Supports both XCML and XPML model formats
• Uses the .NET XML parsing library (DOM based)
• Outputs a simple AST used by the various visitors
• Faster than interpreting Pml code directly
Developed Solution
36Language framework in a managed environmentFebruary 2008
The Adapter Abstraction Layer
• Provides the necessary support to load multiple constraint solvers
• .NET Reflection based
• “Uses” the Adapter pattern to load the adapters
• Requires the Visitor pattern to adapt the AST contents to the selected solver
• Reference implementation contains support for two Microsoft-based constraint solvers
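Reflection-based loading of adapters can be sketched with `System.Reflection`: scan an assembly for concrete types that implement the adapter interface and instantiate them. The `ISolverAdapter` interface, the toy `RangeSolverAdapter` and the `AdapterLoader` class are assumptions for the example and say nothing about the real solver APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Interface the (hypothetical) configuration engine expects from every solver.
public interface ISolverAdapter
{
    string Name { get; }
    bool IsSatisfiable(int lower, int upper);
}

// Toy adapter standing in for a real constraint solver adapter.
public sealed class RangeSolverAdapter : ISolverAdapter
{
    public string Name { get { return "range"; } }

    // "Solves" the constraint lower <= x <= upper.
    public bool IsSatisfiable(int lower, int upper) { return lower <= upper; }
}

public static class AdapterLoader
{
    // Instantiates every concrete ISolverAdapter with a public
    // parameterless constructor found in the given assembly.
    public static List<ISolverAdapter> Load(Assembly assembly)
    {
        var adapters = new List<ISolverAdapter>();
        foreach (Type type in assembly.GetTypes())
        {
            if (typeof(ISolverAdapter).IsAssignableFrom(type)
                && type.IsClass
                && !type.IsAbstract
                && type.GetConstructor(Type.EmptyTypes) != null)
            {
                adapters.Add((ISolverAdapter)Activator.CreateInstance(type));
            }
        }
        return adapters;
    }
}
```

In a plug-in setup the assembly would typically come from `Assembly.LoadFrom` on a file in a plug-in directory; here `typeof(AdapterLoader).Assembly` is enough to try the loader out.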
The Configuration Engine
• Main entry point for the configuration process
• Controls the interpretation of the compiled models
• Calls the Adapter Abstraction Layer (AAL) to load and run the various constraint solvers
• Controls the external API calls
Demo
Conclusions
• Building a good compiler in a managed environment is straightforward
• Using design patterns in some of the compilation steps can lead to a simple and clean design
• Syntax highlighting is a useful compiler tool that is easy to implement
• Code completion “in extremis” is a complex task and requires a good compiler design
• Combining the Visitor and Adapter patterns with a plug-in system is ideal for a flexible compiler
Questions