Graph-based analysis of JavaScript
source code repositories
Gábor Szárnyas
Graph Processing devroom @ FOSDEM 2018
JAVASCRIPT
Latest standard: ECMAScript 2017
STATIC ANALYSIS
Static source code analysis is a software testing approach performed without compiling and executing the program itself.
[Figure: development workflow. Labels: Development, Static analysis, Compilation, Unit and integration tests, Version Control System, Codacy, CodeClimate, etc.]
STATIC ANALYSIS TOOLS
JavaScript
o ESLint
o Facebook Flow
o Tern.js
o TAJS
C
o lint -> linters
Java
o FindBugs
o PMD
PERFORMANCE CONSIDERATIONS
Checking global rules is computationally expensive
Slow for large projects, difficult to integrate even into CI
Workaround #1: no global rules (ESLint)
Workaround #2: batching (e.g. once a day)
Workaround #3: custom algorithms (e.g. Flow)
[Figure: unit tests vs. code analysis, shown with day and night icons]
PROJECT GOALS
Goal
Static analysis for JavaScript applications
Design considerations
Custom analysis rules
o Both global and local
o Extensible
High-performance
o “real-time” responses
ARCHITECTURE AND WORKFLOW
PROPOSED APPROACH
Design considerations
Custom analysis rules
High-performance
Approach
Use a declarative query language
Use incremental processing
o in lieu of batch execution
o file-granularity
o maintain results
[Figure: analyzer with numbered steps and a change set (Δ)]
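The file-granularity idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `IncrementalAnalyzer`, `analyze_file`, and the toy rule (flagging lines that contain `/ 0`) are hypothetical names invented for the example, and only local (per-file) rules are covered.

```python
import hashlib
from typing import Callable, Dict, List


def analyze_file(source: str) -> List[str]:
    """Toy per-file rule (hypothetical): report lines containing '/ 0'."""
    return [
        f"line {number}: possible division by zero"
        for number, line in enumerate(source.splitlines(), start=1)
        if "/ 0" in line
    ]


class IncrementalAnalyzer:
    """Maintains per-file results and re-analyzes only files whose content changed."""

    def __init__(self, rule: Callable[[str], List[str]]) -> None:
        self.rule = rule
        self.hashes: Dict[str, str] = {}         # path -> content hash seen last time
        self.results: Dict[str, List[str]] = {}  # path -> maintained findings
        self.analyzed_count = 0                  # how many times the rule actually ran

    def update(self, files: Dict[str, str]) -> Dict[str, List[str]]:
        # Forget files that disappeared from the workspace.
        for path in list(self.hashes):
            if path not in files:
                del self.hashes[path]
                del self.results[path]
        # Re-run the rule only where the content hash differs.
        for path, source in files.items():
            digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
            if self.hashes.get(path) != digest:
                self.hashes[path] = digest
                self.results[path] = self.rule(source)
                self.analyzed_count += 1
        return self.results
```

Calling `update` repeatedly with the current workspace contents keeps the findings up to date while skipping unchanged files. Global rules would additionally need cross-file results to be maintained, which this sketch does not attempt.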
ARCHITECTURE
[Figure: architecture diagram. Labels: Client, VCS, Workspace (a JavaScript source tree with files such as Main.js, Dependency.js, Parser.js, ChangeProcessor.js, CommandParser.js, FileIterator.js, ConnectionMgr.js, DependencyMgr.js), Analysis server, Abstract Syntax Tree, Abstract Semantic Graph, Graph database, Analysis rules, Validation report.]
CODE PROCESSING STEPS: CODE
[Figure: processing pipeline. code -> (tokenizer) -> tokens -> (parser) -> AST -> (scope analyzer) -> ASG]
Code: a sequence of statements:
var foo = 1 / 0
CODE PROCESSING STEPS: TOKENS
Tokens: the shortest meaningful character sequences
var foo = 1 / 0
Token   Token type
var     VAR (Keyword)
foo     IDENTIFIER (Ident)
=       ASSIGN (Punctuator)
1       NUMBER (NumericLiteral)
/       DIV (Punctuator)
0       NUMBER (NumericLiteral)
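The tokenization step can be illustrated with a small regular-expression tokenizer. This is a sketch for the single example statement only; it is not the Shift tokenizer, and the `TOKEN_SPEC` table and `tokenize` function are hypothetical names covering just the token types listed in the table.

```python
import re
from typing import List, Tuple

# Order matters: the keyword must be tried before the general identifier rule.
TOKEN_SPEC = [
    ("VAR", r"\bvar\b"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("ASSIGN", r"="),
    ("DIV", r"/"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> List[Tuple[str, str]]:
    """Split the source into (token text, token type) pairs, dropping whitespace."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r} at {position}")
        if match.lastgroup != "SKIP":
            tokens.append((match.group(), match.lastgroup))
        position = match.end()
    return tokens
```

For `var foo = 1 / 0`, `tokenize` yields the six tokens from the table, in source order.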
CODE PROCESSING STEPS: AST
Abstract Syntax Tree
o Tree representation of the grammar structure of a sequence of tokens.
Module
  items -> VariableDeclarationStatement
    declaration -> VariableDeclaration
      declarators -> VariableDeclarator
        binding -> BindingIdentifier (name = "foo")
        init -> BinaryExpression (operator = "Div")
          left -> LiteralNumericExpression (value = 1.0)
          right -> LiteralNumericExpression (value = 0.0)
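The tree for `var foo = 1 / 0` can be written down as a small data structure. The sketch below assumes a generic `Node` class with labelled child edges; `Node`, `build_example_ast`, and `walk` are hypothetical helpers for illustration, not the Shift parser's Java classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Node:
    type: str
    attributes: Dict[str, object] = field(default_factory=dict)
    # Each child is stored together with the label of the edge leading to it.
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


def build_example_ast() -> Node:
    """Hand-built AST for `var foo = 1 / 0`, using the node types from the slide."""
    left = Node("LiteralNumericExpression", {"value": 1.0})
    right = Node("LiteralNumericExpression", {"value": 0.0})
    init = Node("BinaryExpression", {"operator": "Div"}, [("left", left), ("right", right)])
    binding = Node("BindingIdentifier", {"name": "foo"})
    declarator = Node("VariableDeclarator", children=[("binding", binding), ("init", init)])
    declaration = Node("VariableDeclaration", children=[("declarators", declarator)])
    statement = Node("VariableDeclarationStatement", children=[("declaration", declaration)])
    return Node("Module", children=[("items", statement)])


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Pre-order traversal yielding (depth, node) pairs."""
    yield depth, node
    for _, child in node.children:
        yield from walk(child, depth + 1)
```

Iterating over `walk(build_example_ast())` visits the eight AST nodes of the one-line example, starting at `Module`.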
CODE PROCESSING STEPS: ASG
Abstract Semantic Graph
o Not necessarily a tree
o Has scopes & semantic info
o Cross edges
[Figure: the AST of the example (Module with items, declaration, declarators, binding, init, left, right edges) extended with a GlobalScope node; additional edge labels: variables, references, children, declarations, node, astNode.]
AST VS. ASG
var foo = 1 / 0
1 LOC -> 20+ nodes
PATTERN MATCHING
Declarative graph patterns with Cypher
    MATCH (binding:BindingIdentifier)
      <-[:binding]-()-->
      (be:BinaryExpression)
      -[:right]->(right:LNExpression)
    WHERE be.operator = 'Div'
      AND right.value = 0.0
    RETURN binding
[Figure: the VariableDeclarator subgraph of the example (BindingIdentifier name = "foo", BinaryExpression operator = "Div", LNExpression value = 1.0, LNExpression value = 0.0) with the match result highlighted: binding, be, right.]
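To make the semantics of the pattern concrete, the sketch below evaluates the same shape by hand over a tiny in-memory graph. It is an illustration only: a graph database evaluates the Cypher query itself, and the dictionary-based graph encoding and the `find_division_by_zero` function are assumptions made for this example. `LNExpression` is used as the label because the query on the slide abbreviates `LiteralNumericExpression` that way.

```python
from typing import Dict, List, Tuple

Node = Dict[str, object]      # must contain "label"; other keys are properties
Edge = Tuple[int, str, int]   # (source node id, relationship type, target node id)

EXAMPLE_NODES: Dict[int, Node] = {
    1: {"label": "VariableDeclarator"},
    2: {"label": "BindingIdentifier", "name": "foo"},
    3: {"label": "BinaryExpression", "operator": "Div"},
    4: {"label": "LNExpression", "value": 1.0},
    5: {"label": "LNExpression", "value": 0.0},
}
EXAMPLE_EDGES: List[Edge] = [(1, "binding", 2), (1, "init", 3), (3, "left", 4), (3, "right", 5)]


def find_division_by_zero(nodes: Dict[int, Node], edges: List[Edge]) -> List[int]:
    """Return ids of BindingIdentifier nodes initialised with a division by literal zero."""
    matches = []
    for parent, edge_type, binding in edges:
        # (binding:BindingIdentifier)<-[:binding]-()
        if edge_type != "binding" or nodes[binding]["label"] != "BindingIdentifier":
            continue
        for source, _, be in edges:
            # ()-->(be:BinaryExpression) WHERE be.operator = 'Div'
            if source != parent or nodes[be]["label"] != "BinaryExpression":
                continue
            if nodes[be].get("operator") != "Div":
                continue
            for start, right_type, right in edges:
                # (be)-[:right]->(right:LNExpression) WHERE right.value = 0.0
                if (start == be and right_type == "right"
                        and nodes[right]["label"] == "LNExpression"
                        and nodes[right].get("value") == 0.0):
                    matches.append(binding)
    return matches
```

On the example graph the function returns the id of the `foo` binding. The nested loops show why a declarative query is attractive: the pattern is three lines in Cypher, and the database is free to choose an efficient evaluation order.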
WORKFLOW
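[Figure: workflow. Source code from the version control system and the developer's IDE passes through the tokenizer, parser and scope analyzer (source code -> tokens -> AST -> ASG); a transformation with traceability loads the result into the graph database.]
Technologies: Git, Visual Studio Code (version control system, IDE); ShapeSecurity Shift (parsing); Java, Cypher (transformation); Neo4j (graph database)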
USE CASES: TYPE INFERENCING
function foo(x, y) {
  return (x + y);
}
function bar(a, b) {
  return foo(b, a);
}
var quux = bar("goodbye", "hello");
Source: http://marijnhaverbeke.nl/blog/tern.html
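How a type for `quux` can be derived from the example above is sketched below. This is a deliberately small call-site evaluation over a hand-written expression encoding; it is neither Tern's algorithm nor the approach used in the project. The tuple encoding, the `FUNCTIONS` table, and `infer_type` are hypothetical, and recursive functions are not handled.

```python
from typing import Dict, List, Tuple

# Expressions are tuples: ("lit", value), ("var", name),
# ("add", left, right), ("call", function name, [arguments]).
Functions = Dict[str, Tuple[List[str], tuple]]

FUNCTIONS: Functions = {
    "foo": (["x", "y"], ("add", ("var", "x"), ("var", "y"))),
    "bar": (["a", "b"], ("call", "foo", [("var", "b"), ("var", "a")])),
}


def infer_type(expr: tuple, env: Dict[str, str], functions: Functions) -> str:
    """Return "string", "number" or "unknown" for the given expression."""
    kind = expr[0]
    if kind == "lit":
        return "string" if isinstance(expr[1], str) else "number"
    if kind == "var":
        return env.get(expr[1], "unknown")
    if kind == "add":
        left = infer_type(expr[1], env, functions)
        right = infer_type(expr[2], env, functions)
        if "string" in (left, right):
            return "string"   # `+` with a string operand concatenates
        if left == right == "number":
            return "number"
        return "unknown"
    if kind == "call":
        parameters, body = functions[expr[1]]
        argument_types = [infer_type(argument, env, functions) for argument in expr[2]]
        # Evaluate the callee's return expression with the argument types bound.
        return infer_type(body, dict(zip(parameters, argument_types)), functions)
    raise ValueError(f"unknown expression kind: {kind}")
```

Evaluating the call `bar("goodbye", "hello")` propagates the string arguments through `bar` into `foo`, so `quux` is inferred to be a string. The point of the example is that the answer depends on code in other functions, which is what makes such rules global.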
USE CASES: GLOBAL ANALYSIS
Reachability:
dead code detection
async/await (ECMAScript 2017)
potential division by zero
TECH DETAILS
IMPORTS AND EXPORTS
FIXPOINT ALGORITHMS
Lots of propagation algorithms
"Run to completion" scheduling
o Mix of Java code and Cypher
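A propagation algorithm that runs to a fixpoint can be sketched with a worklist. The example below is a generic illustration, not the project's Java and Cypher code: `propagate` and the idea of pushing sets of facts (for instance, possible types) along "value flows from source to target" edges are assumptions made for the sketch.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def propagate(initial: Dict[str, Set[str]],
              flows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Least fixpoint of facts[target] >= facts[source] for every flow edge."""
    successors: Dict[str, List[str]] = {}
    facts: Dict[str, Set[str]] = {node: set(values) for node, values in initial.items()}
    for source, target in flows:
        successors.setdefault(source, []).append(target)
        facts.setdefault(source, set())
        facts.setdefault(target, set())

    worklist = deque(facts)  # start with every node once
    while worklist:
        node = worklist.popleft()
        for target in successors.get(node, []):
            size_before = len(facts[target])
            facts[target] |= facts[node]
            # Re-schedule the target only if it learned something new.
            if len(facts[target]) != size_before:
                worklist.append(target)
    return facts
```

The loop terminates because the fact sets only grow and are bounded by the facts present initially; cycles in the flow graph are handled by re-queuing nodes until nothing changes.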
EFFICIENT INITIALIZATION
Initial build of the graph with Cypher was slow
Generate CSV and bulk load
Two files: nodes, relationships
    $NEO4J_HOME/bin/neo4j-admin import \
      --database=db \
      --nodes=nodes.csv \
      --relationships=relationships.csv
10× speedup
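Generating the two CSV files can be sketched as follows. The header fields `:ID`, `:LABEL`, `:START_ID`, `:END_ID` and `:TYPE` follow the header format of the Neo4j import tool; everything else (the `write_import_files` function, the single `name` property, the tuple layout of the input) is a simplifying assumption for this example rather than the project's actual exporter.

```python
import csv
from pathlib import Path
from typing import Iterable, Tuple


def write_import_files(nodes: Iterable[Tuple[int, str, str]],
                       relationships: Iterable[Tuple[int, int, str]],
                       directory: Path) -> None:
    """Write nodes.csv and relationships.csv for a bulk import.

    nodes: (id, label, name) triples; relationships: (start id, end id, type) triples.
    """
    with open(directory / "nodes.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id:ID", ":LABEL", "name"])
        for node_id, label, name in nodes:
            writer.writerow([node_id, label, name])

    with open(directory / "relationships.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([":START_ID", ":END_ID", ":TYPE"])
        for start_id, end_id, relationship_type in relationships:
            writer.writerow([start_id, end_id, relationship_type])
```

The resulting files are the ones passed to the `--nodes` and `--relationships` options of the import command shown on the slide.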
REGULAR PATH QUERIES
Transitive closure over certain combinations of relationship types
Workaround:
o Start transaction
o Add proxy relationships
o Calculate transitive closure
o Rollback transaction
openCypher proposal for path patterns
(:A)-/[:R1 :R2 :R3]+/->(:B)
[Figure: nodes A and B connected by a repeated (*) path]
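The proxy-relationship workaround can be mimicked in memory. The sketch assumes the path pattern means "the sequence R1, R2, R3 repeated one or more times"; the function names are hypothetical, Cypher's relationship-uniqueness rule is not enforced, and the proxy pairs live in a temporary set instead of a rolled-back transaction.

```python
from typing import List, Sequence, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relationship type, target)


def follow(sources: Set[str], edges: List[Edge], rel_type: str) -> Set[str]:
    """Nodes reachable from `sources` over one relationship of the given type."""
    return {target for source, kind, target in edges if kind == rel_type and source in sources}


def proxy_edges(edges: List[Edge], sequence: Sequence[str]) -> Set[Tuple[str, str]]:
    """One proxy pair (start, end) for every path that follows `sequence` exactly once."""
    nodes = {source for source, _, _ in edges} | {target for _, _, target in edges}
    proxies: Set[Tuple[str, str]] = set()
    for start in nodes:
        frontier = {start}
        for rel_type in sequence:
            frontier = follow(frontier, edges, rel_type)
        proxies.update((start, end) for end in frontier)
    return proxies


def transitive_closure(pairs: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Naive closure: keep joining pairs until no new pair appears."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def match_repeated_sequence(edges: List[Edge], sequence: Sequence[str]) -> Set[Tuple[str, str]]:
    """Pairs connected by the sequence repeated one or more times; `edges` is not modified."""
    return transitive_closure(proxy_edges(edges, sequence))
```

Because the proxies are never written into `edges`, the input graph is unchanged afterwards, which plays the role of the rollback step in the workaround.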
INCREMENTAL QUERIES
OPENCYPHER SYSTEMS
"The openCypher project aims to deliver a full and open specification of the industry's most widely adopted graph database query language: Cypher." (late 2015)
Research prototypes with incremental processing
o Graphflow (University of Waterloo)
o ingraph (incremental graph engine)
(Source: Keynote talk @ GraphConnect NYC 2017)
FOSDEM 2017: INGRAPH
STATE OF INGRAPH IN 2018
Covers a substantial fragment of openCypher
o MATCH, OPTIONAL MATCH, WHERE
o WITH, functions, aggregations
o CREATE, DELETE
Features on the roadmap
o MERGE, REMOVE, SET
o List comprehensions
G. Szárnyas:
Incremental View Maintenance for Property Graph Queries,
SIGMOD SRC, 2018
J. Marton, G. Szárnyas, D. Varró:
Formalising openCypher Graph Queries in Relational Algebra,
ADBIS, Springer, 2017
RELATED PROJECTS
JQASSISTANT
Dirk Mahler,
Pushing the evolution of software analytics
with graph technology,
Neo4j blog, 2017
Code comprehension: software to graph
SLIZAA
slizaa uses Neo4j/jQAssistant and provides a front end with a
bunch of specific tools and viewers to provide an easy-to-use
in-depth insight of your software's architecture.
Gerd Wütherich,
Core concepts,
slizaa
SLIZAA: ECLIPSE IDE
SLIZAA: XTEXT OPENCYPHER
Xtext-based grammar
Used in the ingraph compiler
Now has a scope analyzer
Works in the Eclipse IDE and web UI
WRAPPING UP
PUBLICATIONS
Soma Lucz:
Static analysis algorithms
for JavaScript,
Bachelor’s thesis, 2017
Dániel Stein:
Graph-based source code
analysis of JavaScript
repositories,
Master’s thesis, 2016
CONCLUSION
Some interesting analysis rules require a
global view of the code
Good use case for graph databases
o Property graph
o Cypher language
Very good use case for incremental queries
o Incrementality on multiple levels
RELATED RESOURCES
Codemodel-Rifle github.com/ftsrg/codemodel-rifle
ingraph engine github.com/ftsrg/ingraph
Shape Security’s Shift parser github.com/shapesecurity/shift-java
Slizaa openCypher Xtext github.com/slizaa/slizaa-opencypher-xtext
Thanks to Ádám Lippai, Soma Lucz, Dániel Stein, Dávid Honfi and the ingraph team.
VISUAL STUDIO CODE INTEGRATION
Language Server Protocol (LSP) allows
portable implementation
USE CASES: CFG
Control Flow Graph
o Graph representation of every possible statement sequence
Basis for type inferencing and test generation
[Figure: example control flow graph made of statement nodes, an if node with a condition, and the exits error and done.]
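A control flow graph and two simple analyses on it can be sketched as follows. The adjacency-dictionary encoding, the node names, and the functions `reachable`, `dead_nodes` and `all_paths` are assumptions for illustration; they do not reflect how the project stores or queries its CFG.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical CFG: node -> list of successor nodes.
EXAMPLE_CFG: Dict[str, List[str]] = {
    "s1": ["if"],
    "if": ["s2", "error"],  # the condition decides between the two branches
    "s2": ["done"],
    "error": [],
    "done": [],
    "s3": ["done"],         # no incoming edge: never executed
}


def reachable(cfg: Dict[str, List[str]], entry: str) -> Set[str]:
    """All nodes reachable from the entry node (depth-first search)."""
    seen = {entry}
    stack = [entry]
    while stack:
        node = stack.pop()
        for successor in cfg.get(node, []):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def dead_nodes(cfg: Dict[str, List[str]], entry: str) -> Set[str]:
    """Nodes that no execution starting at the entry can reach."""
    all_nodes = set(cfg) | {s for successors in cfg.values() for s in successors}
    return all_nodes - reachable(cfg, entry)


def all_paths(cfg: Dict[str, List[str]], node: str,
              path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Every statement sequence from `node` to an exit, visiting each node at most once."""
    path = path + (node,)
    if not cfg.get(node):
        yield path
        return
    for successor in cfg[node]:
        if successor not in path:  # skip back edges so loops do not recurse forever
            yield from all_paths(cfg, successor, path)
```

`dead_nodes` corresponds to dead code detection through reachability, and `all_paths` enumerates the statement sequences that test generation would need to cover.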