scaling xtext - xtextcon 2015

Scaling Xtext

Lieven Lemiengre

Sigasi

● IDE for Hardware Description Languages
  ○ VHDL, (System)Verilog
● Using Xtext for 4 years
● Large user base
  ○ (commercial, free, students)

http://www.youtube.com/watch?v=vwnDAzct4kc

Our company goal

● Assist hardware designers
● High quality interactive front-end compiler
  ○ Instant feedback
    ■ parsing, semantic analysis, linting, style checking
  ○ IDE Services
    ■ visualisations
    ■ design exploration
    ■ documentation generation
  ○ Integrate with ecosystem
    ■ other compilers, simulators, synthesizers

Visualisations

http://www.youtube.com/watch?v=lhT_oJC9UMU

The challenge

● Pre-specified languages
● Large projects
  ○ > 250 KLOC is not uncommon
  ○ design + external libraries
  ○ big files
    ■ some libraries are distributed as 1 file
    ■ generated hardware cores

Adopting Xtext

● Started with the early Xtext 2.0 snapshots
● Initial performance analysis
  ○ Clean build performance of a big project (330k LOC)
    ■ > 20 minutes → < 1 min
    ■ > 2 GB → ~ 1 GB memory
  ○ Editing big files (> 1 MB)
    ■ unusable → usable with reduced editor
● Xtext framework improvements
● Measure → analyze → improve or cheat
  ○ faster build
  ○ reduce memory usage
  ○ UI responsiveness

Improving performance

Overview

● Analyzing build performance
  ○ Analyze the build
    ■ Macro build measurements
    ■ Key performance points
      ● Reduce workload
      ● Parallelize the build
  ○ Tracking performance
  ○ VHDL linking in depth
● Analyzing UI issues
  ○ Monitoring the UI thread
  ○ Safeguards

Analyzing builds: builder overview

[Diagram: build pipeline. Stages: Global indexing, Linking validation, Custom validation, Builder participants. Labels: resource changes, resource descriptions, Global index, warnings, errors, Eclipse resources.]


Global indexing

[Diagram: resource changes, resource descriptions, Global index]

● Usage
  ○ Location of exported declarations
  ○ Incremental compilation
● Implementation
  ○ IResourceDescriptionsStrategy
  ○ Default: all declarations
  ○ Customize!
  ○ Runs before linking!
● IResourceDescriptions & IEObjectDescriptions
  ○ Always in memory
  ○ Persisted to disk @ shutdown


Linking validation

● Usage
  ○ Determine the IScope of all cross references
  ○ Link the cross reference or create a linking error
● Implementation
  ○ ILinkingService, IScopeProvider, LazyLinkingResource
  ○ Requires global index for global scope
  ○ Direct link or link to global scope
  ○ Linking may trigger linking and resource loading

[Diagram: linking errors, Eclipse resources]


Custom validation

● Usage
  ○ Execute all custom validations
  ○ Creates errors / warnings
● Implementation
  ○ AbstractDeclarativeValidator
  ○ Execute validations using reflection
  ○ Works against linked model
  ○ May trigger linking & resource loading

[Diagram: errors, warnings, Eclipse resources]


[Diagram: the build pipeline again (Global indexing, Linking validation, Custom validation), with open questions]

● iterations?
● order?

Analyzing builds: metrics

● For each build
  ○ # of files being built
  ○ timing: Global index, Linking, Validation, Individual builder participants
● Instrument by overriding ClusteringBuilderState & XtextBuilder
● Example:

Building 134 resources, timing: {
  global index=1806,
  linking=378,
  validation=823,
  totalLinkingAndValidation=1364
}
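The measurement itself needs little machinery. The sketch below is not the Sigasi instrumentation: `BuildTimings` and its methods are hypothetical names, and it only shows one way to accumulate wall-clock time per phase and print a line in the format of the example. In an Xtext setup, calls like `time(...)` would presumably wrap the phases inside the overridden builder classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall-clock milliseconds per build phase. */
class BuildTimings {
    // LinkedHashMap keeps phases in the order they were first recorded.
    private final Map<String, Long> millisByPhase = new LinkedHashMap<>();

    /** Runs the work, adds its duration to the phase total, and returns the work's result. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(phase, (System.nanoTime() - start) / 1_000_000);
        }
    }

    /** Adds a duration that was measured elsewhere. */
    void record(String phase, long millis) {
        millisByPhase.merge(phase, millis, Long::sum);
    }

    /** Formats one log line for a build of the given number of resources. */
    String report(int resourceCount) {
        StringBuilder line = new StringBuilder("Building " + resourceCount + " resources, timing: {");
        String separator = "";
        for (Map.Entry<String, Long> entry : millisByPhase.entrySet()) {
            line.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = ", ";
        }
        return line.append('}').toString();
    }
}
```

Logging one such line per build makes it possible to compare phases across builds and to spot regressions in a nightly run.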

Analyzing builds: resource loads

● Observation:
  ○ Most time spent in resource loads
  ○ Certain files are loaded multiple times?!
● Solutions
  ○ Reduce memory pressure
  ○ Make loading faster

[Diagram: resources. Global indexing: LOAD. Linking validation: POTENTIAL RELOADS. Custom validation: POTENTIAL RELOADS.]

Memory pressure

[Diagram: Global index, ResourceSet]

Memory pressure?
● Size of EMF models
  ○ All the resources loaded during the build
● Size of global index
  ○ Always loaded
  ○ Depends on number of open projects

Memory pressure: EMF models

Reduce EMF size
  ○ Watch out for the inferred model
    http://www.sigasi.com/content/view-complexity-your-xtext-ecore-model
  ○ Avoid
    ■ EMF classes with just one list of things
      ● ListOfThings : '(' things+=Thing (',' things+=Thing)* ')';
      ● class ListOfThings { contains Thing[] things }
    ■ Often unused fields
  ○ Code duplication in grammar vs efficient model
  ○ Fine-grained control with Xcore model

Memory pressure: Global index

In YourResourceDescriptionStrategy
  ○ Export foo, foo.rec, foo.rec.field1, foo.rec.field2
  ○ Add user-data: someType & anotherType
  ○ To reduce memory usage: don't export child elements
    ■ export foo.rec + hash of fields
    ■ export foo + hash of contents of foo
    ■ these elements can no longer be linked without loading the resource

package foo is
  type rec is record
    field1 : someType;
    field2 : anotherType(X downto Y);
  end record;
end package;
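The "hash of fields" idea can be sketched in plain Java. This is an illustration under assumptions, not the Sigasi implementation: `ContentFingerprint` is a hypothetical helper, and the signature strings such as `field1:someType` are an invented encoding. The point is that a single short string, which could be stored as user data on the exported parent, still changes whenever a child changes, so dependent resources can be invalidated without indexing every child.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

/** Hypothetical helper: one fingerprint for all children of an exported element. */
class ContentFingerprint {

    /** Returns a SHA-256 hex digest over the child signatures, in declaration order. */
    static String of(List<String> childSignatures) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String signature : childSignatures) {
                digest.update(signature.getBytes(StandardCharsets.UTF_8));
                // Separator byte, so ["ab", "c"] and ["a", "bc"] hash differently.
                digest.update((byte) 0);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

For the record above, the input could be the two entries `field1:someType` and `field2:anotherType(X downto Y)`. The trade-off is the one named on the slide: the fields themselves are no longer in the index, so linking to them requires loading the resource.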

Optimize loading

● What is a resource load?
  ○ parse
  ○ build EMF model & install EMF proxies
  ○ build Node model

Optimize loading


● Parallelise
  ○ parse multiple files simultaneously
  ○ ~3 times faster loads on a 4-core machine
  ○ only loading, not linking
● Cache
  ○ serialize EMF and Node model in a cache
  ○ originally 3-4 times faster loads
  ○ now 1.5x (no backtracking, simplified grammar)
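Parallel loading can be sketched with a plain thread pool. The code below is a generic illustration, not Xtext API: `ParallelLoader` is a hypothetical name and the `parse` function stands in for whatever turns one file into a model. It relies on the property stated on the slide, namely that loading (unlike linking) of one file does not depend on other files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper: parses independent inputs on a fixed-size thread pool. */
class ParallelLoader {

    /** Applies parse to every input concurrently and returns the results in input order. */
    static <I, O> List<O> loadAll(List<I> inputs, Function<I, O> parse, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                Callable<O> task = () -> parse.apply(input);
                futures.add(pool.submit(task));
            }
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get()); // waits until this input has been parsed
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("parsing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The parse function must not touch shared mutable state; anything that resolves references across files belongs in a later, sequential step.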

Linking

● Language specific
  ○ VHDL vs Verilog
● Avoiding linking
  ○ library files, only linked when used in user-code
● Many iterations
  ○ lazy linking vs eager linking
  ○ From 40% of build time to 20%

Custom Validation

● Combine validations to avoid model traversals
● Keep validations local; global validations moved into a builder participant
● Avoid validation
  ○ disabled validations
  ○ libraries: errors & warnings are suppressed anyway

● Monitor
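Combining validations means walking the model once and running every check at each node, instead of one walk per check. The sketch below uses invented types: `ModelNode` stands in for a model element and `SinglePassValidator` is a hypothetical name. It is not the Xtext validator API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a model element with children. */
class ModelNode {
    final String name;
    final List<ModelNode> children = new ArrayList<>();

    ModelNode(String name, ModelNode... kids) {
        this.name = name;
        for (ModelNode kid : kids) {
            children.add(kid);
        }
    }
}

/** Runs all registered checks during a single depth-first traversal. */
class SinglePassValidator {
    private final List<BiConsumer<ModelNode, List<String>>> checks = new ArrayList<>();
    int visitedNodes; // number of node visits, to confirm there is only one pass

    SinglePassValidator add(BiConsumer<ModelNode, List<String>> check) {
        checks.add(check);
        return this;
    }

    List<String> validate(ModelNode root) {
        List<String> issues = new ArrayList<>();
        visit(root, issues);
        return issues;
    }

    private void visit(ModelNode node, List<String> issues) {
        visitedNodes++;
        for (BiConsumer<ModelNode, List<String>> check : checks) {
            check.accept(node, issues);
        }
        for (ModelNode child : node.children) {
            visit(child, issues);
        }
    }
}
```

With n checks the number of node visits stays at the size of the model rather than n times that size, which matters most for the large generated files mentioned earlier.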

Track performance

● Nightly build
● Log build times

VHDL linking in depth

History of VHDL linking
● 1st version
  ○ AbstractDeclarativeScopeProvider
  ○ best effort scoping
● 2nd version
  ○ removed reflection
● 3rd version
  ○ special rule-based internal Java DSL
  ○ first attempt to be 100% correct
● 4th version
  ○ batch/eager linking
  ○ type errors


foo(baz) <= bar(bak.f("?"), ('1', 2)) + 1;

● Most elements in an expression are overloaded
  ○ subprograms, literals, enum literals
● foo(baz)
  ○ 9 kinds of meanings
  ○ 4 of them can have subprogram overloading
● overloading includes return type
● overload resolution is very hard
  ○ you have to find 1 unambiguous solution
  ○ resolve all cross-references together
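A toy model shows why the references have to be resolved together. The sketch below is deliberately simplified and does not implement the resolution rules of the VHDL standard: `Signature` and `OverloadResolver` are hypothetical names, and types are plain strings. Each argument arrives with a set of possible types (because it may be overloaded itself), the context supplies the set of acceptable result types, and a call resolves only if exactly one candidate fits both.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical subprogram signature; types are plain names for illustration. */
class Signature {
    final String name;
    final List<String> parameterTypes;
    final String returnType;

    Signature(String name, List<String> parameterTypes, String returnType) {
        this.name = name;
        this.parameterTypes = parameterTypes;
        this.returnType = returnType;
    }
}

class OverloadResolver {

    /** Returns the only candidate that fits arguments and context, or empty if none or several fit. */
    static Optional<Signature> resolve(List<Signature> candidates,
                                       List<Set<String>> possibleArgumentTypes,
                                       Set<String> expectedReturnTypes) {
        List<Signature> viable = new ArrayList<>();
        for (Signature candidate : candidates) {
            if (expectedReturnTypes.contains(candidate.returnType)
                    && argumentsFit(candidate, possibleArgumentTypes)) {
                viable.add(candidate);
            }
        }
        return viable.size() == 1 ? Optional.of(viable.get(0)) : Optional.empty();
    }

    private static boolean argumentsFit(Signature candidate, List<Set<String>> possibleArgumentTypes) {
        if (candidate.parameterTypes.size() != possibleArgumentTypes.size()) {
            return false;
        }
        for (int i = 0; i < possibleArgumentTypes.size(); i++) {
            if (!possibleArgumentTypes.get(i).contains(candidate.parameterTypes.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Two functions that differ only in return type stay ambiguous until the surrounding expression narrows the expected result, which is why a single cross-reference cannot be resolved in isolation.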


Xtext lazy linking
● good
  ○ declarative: only a few rules
  ○ fine-grained: can be good for performance
  ○ re-use: scoping, auto-complete, serialisation
● bad
  ○ hard to debug
    ■ can call itself
    ■ lots of caching
    ■ indirection, huge stack traces
  ○ performance
    ■ build context for every cross-reference


Batch/Eager linking
● good
  ○ simple top-down algorithm
  ○ natural fit for VHDL
  ○ well described in literature
● bad
  ○ resolve 1 reference = resolve all references
  ○ a lot of extra Xtext customisation
    ■ auto-complete & serialization?
    ■ linking errors?


Our hybrid approach
● Eager/batch linking of design units
  ○ Big files are partially scoped
  ○ Parent-scope of a design unit is the global scope
  ○ Local scoping is executed eagerly
● Global scope
  ○ Import declarations of other design units
  ○ The only query is: find design unit x.y
    ■ load resource of x.y
    ■ create an 'ExternalScope' & cache it
● Always load dependent resources
  ○ needed for validation, hovers, highlighting anyway
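The "load, create an ExternalScope, cache it" step amounts to a memoizing lookup keyed by design unit name. The sketch below is an assumption about the shape of such a cache, not the Sigasi code: `ExternalScopeCache`, `scopeOf`, and `invalidate` are hypothetical names, and the scope type is left generic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache: one scope per design unit, created on first request. */
class ExternalScopeCache<S> {
    private final Map<String, S> scopes = new HashMap<>();
    private final Function<String, S> loader;

    /** The loader stands in for "load the resource of x.y and build its scope". */
    ExternalScopeCache(Function<String, S> loader) {
        this.loader = loader;
    }

    /** Returns the cached scope, loading it once if it is not cached yet. */
    S scopeOf(String designUnit) {
        S scope = scopes.get(designUnit);
        if (scope == null) {
            // Explicit get/put instead of computeIfAbsent: the loader may
            // request other scopes from this cache while it is running.
            scope = loader.apply(designUnit);
            scopes.put(designUnit, scope);
        }
        return scope;
    }

    /** Drops a cached scope, for example after the design unit's resource changed. */
    void invalidate(String designUnit) {
        scopes.remove(designUnit);
    }
}
```

This version is single-threaded by design; a build that links in parallel would need its own synchronization around the cache.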

VHDL linking in depth

Conclusion
● Easier to implement, debug and optimize
● Type error reporting during linking
● Memory intensive?
  ○ Every dependency is loaded
  ○ OK in practice for VHDL
● A lot of Xtext customisation!
  ○ A lot of classes are affected
  ○ Forward compatible?

UI responsiveness

● Measuring: detect a blocked UI thread
  ○ initially Svelto https://github.com/dragos/svelto
  ○ now our own method & logging
  ○ Eclipse Mars
● Improvements
  ○ UI is for drawing only!
  ○ Make sure everything is cancellable
● Safeguards
  ○ certain services should never be executed on the UI thread => check & log
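A "check & log" safeguard can be very small. The sketch below is hypothetical: `UiThreadGuard` and `reportIfOnUiThread` are invented names, and the UI-thread test is passed in as a predicate so the class has no toolkit dependency. In an SWT-based IDE that predicate could plausibly be `Display.getCurrent() != null`; that wiring is an assumption and is not shown on the slide.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical safeguard: logs when an expensive service runs on the UI thread. */
class UiThreadGuard {
    private final BooleanSupplier onUiThread;
    private final Consumer<String> log;

    UiThreadGuard(BooleanSupplier onUiThread, Consumer<String> log) {
        this.onUiThread = onUiThread;
        this.log = log;
    }

    /** Logs a message and returns true if the caller is on the UI thread; otherwise returns false. */
    boolean reportIfOnUiThread(String serviceName) {
        if (!onUiThread.getAsBoolean()) {
            return false;
        }
        log.accept("Service '" + serviceName + "' was called on the UI thread");
        return true;
    }
}
```

Calling the guard at the entry point of services such as linking or resource loading turns an occasional UI freeze into a log entry that names the offending service.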


Lightweight Editor (fallback)

● Syntax-highlighting + markers
● For files > 1 MB
● Based on ContentTypes extension point

Two ContentTypes (based on file size)

<extension point="org.eclipse.core.contenttype.contentTypes">
  <content-type ...
      describer="com.sigasi...FullVhdlContentDescriber"
      name="VHDL editor">
    <describer class="...FullVhdlContentDescriber" />
  </content-type>
  <content-type ...
      describer="com.sigasi....LightweightVhdlContentDescriber"
      name="Lightweight VHDL editor">
    <describer class="...LightweightVhdlContentDescriber" />
  </content-type>
</extension>
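The slides do not show how the two describers decide. One plausible approach, sketched here as an assumption, is to count bytes from the input stream that a content describer receives and stop as soon as the threshold is passed. `SizeProbe` is a hypothetical helper; how the two describer classes would map its answer to the valid or invalid result of Eclipse's `IContentDescriber` is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper for a size-based content describer. */
class SizeProbe {
    /** The 1 MB threshold named on the slide. */
    static final long LIGHTWEIGHT_LIMIT_BYTES = 1024L * 1024L;

    /** Returns true if the stream holds more than limit bytes; stops reading once the limit is passed. */
    static boolean exceeds(InputStream in, long limit) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
                if (total > limit) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under this assumption the full describer would accept a file only when `exceeds` is false and the lightweight describer only when it is true, so exactly one of the two content types matches any given file.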

Future work

● Continuous process

● Cache global index info per resource?

● Linking without node model?

● StoredResources

Come talk to us about...

● Documentation generation
● Fancy linking algorithms / type systems
● Graphical views
● Cross-language support
● Testing Xtext plugins
● Lexical macros
● Managing a large number of validations
● ...
