
Post on 23-Dec-2015

V. Winter, J. Guerrero, A. James, C. Reinke

LINKING SYNTACTIC AND SEMANTIC MODELS OF JAVA SOURCE CODE WITHIN A PROGRAM TRANSFORMATION SYSTEM

OUTLINE

• Introduction

• Motivation: The need for static analysis

• Why transformation systems are interesting in this setting

• Creating a rule in PMD

• Creating a rule in Sextant

• GPS-Traverse

• Overview

• Example: Constructing a call-graph

• Technical details of GPS-Traverse

SOURCE-CODE ANALYSIS

• Is heavily employed across the public and private sectors, including:

• the top 5 commercial banks

• 5 of the top 7 computer software companies

• 3 of the top 5 commercial aerospace and defense industry leaders

• the 3 largest armed services of the US

• 3 of the leading 4 accounting firms

• 2 of the top 3 insurance companies

SOURCE-CODE ANALYSIS

• It has been argued that source-code analysis can play an important role with respect to software assurance within an Agile development process

• The FDA is recommending (and may eventually mandate) the use of static-analysis tools for the development of medical device software.

• GrammaTech’s CodeSonar is a static-analysis tool that the FDA is currently using to investigate failures in recalled medical devices.

STATIC-ANALYSIS TOOLS

• Are frequently rule-based

• Utilize a variety of software models (e.g., AST, call-graph, control-flow graph)

• In an OO implementation, involve traversals of object-structures using the visitor pattern.

• Make use of pattern recognition (e.g., matching).

• May transform source-code (e.g., inserting markers/annotations to control analysis)

• Query software models

• Aggregate information
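A minimal sketch of the visitor-based traversal and aggregation listed above. The miniature AST (Node, WhileNode, BlockNode, CallNode) and the CallCollector rule are hypothetical names introduced only for illustration; real tools generate their node classes from a grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature AST; not the node classes of any real tool.
interface Node {
    <R> R accept(Visitor<R> v);
}

interface Visitor<R> {
    R visitWhile(WhileNode n);
    R visitBlock(BlockNode n);
    R visitCall(CallNode n);
}

final class WhileNode implements Node {
    final Node body;
    WhileNode(Node body) { this.body = body; }
    public <R> R accept(Visitor<R> v) { return v.visitWhile(this); }
}

final class BlockNode implements Node {
    final List<Node> statements;
    BlockNode(Node... statements) { this.statements = Arrays.asList(statements); }
    public <R> R accept(Visitor<R> v) { return v.visitBlock(this); }
}

final class CallNode implements Node {
    final String name;
    CallNode(String name) { this.name = name; }
    public <R> R accept(Visitor<R> v) { return v.visitCall(this); }
}

// A "rule" that traverses the object structure and aggregates call names.
final class CallCollector implements Visitor<Void> {
    final List<String> calls = new ArrayList<>();

    public Void visitWhile(WhileNode n) {
        n.body.accept(this);
        return null;
    }

    public Void visitBlock(BlockNode n) {
        for (Node s : n.statements) {
            s.accept(this);
        }
        return null;
    }

    public Void visitCall(CallNode n) {
        calls.add(n.name);
        return null;
    }
}
```

Each node type dispatches to its own visit method, so a new rule is a new Visitor implementation and the node classes stay untouched.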

CREATING A RULE IN PMD

Avoid using while-loops without curly braces

CREATING A RULE IN PMD

• Step 1: Figure out what to look for. In this case we want to capture the convention that while-loops must use braces.

• Construct a compilation unit containing an instance of the syntactic property you want to detect.

class Example {
    void bar() {
        while (baz)
            buz.doSomething();
    }
}

AST GENERATION

• PMD uses JavaCC to generate an AST (Abstract Syntax Tree) corresponding to the source code.

CompilationUnit
 TypeDeclaration
  ClassDeclaration:(package private)
   UnmodifiedClassDeclaration(Example)
    ClassBody
     ClassBodyDeclaration
      MethodDeclaration:(package private)
       ResultType
       MethodDeclarator(bar)
        FormalParameters
       Block
        BlockStatement
         Statement
          WhileStatement
           Expression
            PrimaryExpression
             PrimaryPrefix
              Name:baz
           Statement
            StatementExpression:null
             PrimaryExpression
              PrimaryPrefix
               Name:buz.doSomething
              PrimarySuffix
               Arguments

PATTERN SELECTION

• Select and generalize the smallest portion of the AST containing the pattern in which you are interested. Make sure you discriminate good patterns from bad patterns (e.g., blocks versus no blocks). Consult the Java grammar as needed.

CompilationUnit
 TypeDeclaration
  ClassDeclaration:(package private)
   UnmodifiedClassDeclaration(Example)
    ClassBody
     ClassBodyDeclaration
      MethodDeclaration:(package private)
       ResultType
       MethodDeclarator(bar)
        FormalParameters
       Block
        BlockStatement
         Statement
          WhileStatement
           Expression
            PrimaryExpression
             PrimaryPrefix
              Name:baz
           Statement
            StatementExpression:null
             PrimaryExpression
              PrimaryPrefix
               Name:buz.doSomething
              PrimarySuffix
               Arguments

CREATE RULE

public class WhileLoopsMustUseBracesRule extends AbstractRule {
    public Object visit(ASTWhileStatement node, Object data) {
        // Child 0 is the loop condition; child 1 is the loop body.
        SimpleNode firstStmt = (SimpleNode) node.jjtGetChild(1);
        if (!hasBlockAsFirstChild(firstStmt)) {
            addViolation(data, node);
        }
        return super.visit(node, data);
    }
}

CREATE PATTERN MATCHER

// pattern matcher
private boolean hasBlockAsFirstChild(SimpleNode node) {
    return (node.jjtGetNumChildren() != 0 && (node.jjtGetChild(0) instanceof ASTBlock));
}
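The rule and its pattern matcher can be tried without a PMD installation using the sketch below. AstNode is a hypothetical stand-in for PMD's SimpleNode (a node kind plus ordered children), and WhileBraceCheck is an illustrative name; neither is part of PMD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for PMD's SimpleNode: a node kind plus ordered children.
final class AstNode {
    final String kind;
    final List<AstNode> children;

    AstNode(String kind, AstNode... children) {
        this.kind = kind;
        this.children = Arrays.asList(children);
    }
}

final class WhileBraceCheck {
    // Mirrors hasBlockAsFirstChild: the body Statement must wrap a Block.
    static boolean hasBlockAsFirstChild(AstNode statement) {
        return !statement.children.isEmpty()
                && statement.children.get(0).kind.equals("Block");
    }

    // Collects every WhileStatement whose body (child 1, after the
    // Expression at child 0) is not a Block.
    static List<AstNode> violations(AstNode root) {
        List<AstNode> found = new ArrayList<>();
        collect(root, found);
        return found;
    }

    private static void collect(AstNode node, List<AstNode> found) {
        if (node.kind.equals("WhileStatement")
                && !hasBlockAsFirstChild(node.children.get(1))) {
            found.add(node);
        }
        for (AstNode child : node.children) {
            collect(child, found);
        }
    }
}
```

The index arithmetic (child 1 of the while node, child 0 of the statement) is exactly the part that a syntax-based pattern language can express declaratively.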

ADD RULE TO RULESET

• Add the newly created rule to the PMD ruleset

<?xml version="1.0"?>
<ruleset name="My custom rules"
    xmlns="http://pmd.sf.net/ruleset/1.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://pmd.sf.net/ruleset/1.0.0 http://pmd.sf.net/ruleset_xml_schema.xsd"
    xsi:noNamespaceSchemaLocation="http://pmd.sf.net/ruleset_xml_schema.xsd">
  <rule name="WhileLoopsMustUseBracesRule"
      message="Avoid using 'while' statements without curly braces"
      class="WhileLoopsMustUseBracesRule">
    <description>
      Avoid using 'while' statements without using curly braces
    </description>
    <priority>3</priority>
    <example>
<![CDATA[
public void doSomething() {
    while (true)
        x++;
}
]]>
    </example>
  </rule>
</ruleset>

IN SEXTANT

Avoid using while-loops without curly braces

CREATE BASIC RULE PATTERN

strategy WhileLoopsMustUseBracesRule:

    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
    →
    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]

ADD SPECIFIC PATTERN CONSTRAINT

strategy WhileLoopsMustUseBracesRule:

    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
    →
    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
    if { not(<Statement>_1 = Statement[:] <Block>_1 [:]) }

ADD METRIC/ACTION

strategy WhileLoopsMustUseBracesRule:

    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
    →
    Statement[:] while ( <Expression>_1 ) <Statement>_1 [:]
    if {
        not(<Statement>_1 = Statement[:] <Block>_1 [:])
        andalso sml.addViolation(<Statement>_1)
    }

OBSERVATIONS

• Primitive operations in transformation systems include:

• Parsing

• Matching

• Traversal

• The software models that transformation systems typically operate on are terms – either concrete or abstract syntax trees.

• This makes the foundational framework of transformation systems well-suited for rule-based source-code analysis systems, especially systems whose rules have syntax-based specifications.

SEMANTIC RULES

Use equals() instead of == to compare objects

JAVA’S INTEGER CACHE

• Some rules require semantic analysis

• The implementation of such rules requires the ability to query semantic models (i.e., software models other than an AST)

package p1;

public class A {
    static void myEq(Integer x, Integer y) {
        System.out.println(x == y);
    }

    public static void main(String[] args) {
        myEq(100, 100);
        myEq(200, 200);
    }
}
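A small sketch of why the two calls above behave differently. The Java Language Specification guarantees that boxing an int in the range -128 to 127 yields a cached object; outside that range the result of == depends on the JVM and its settings. The class and method names (IntegerCacheDemo, sameObject, sameValue) are illustrative only.

```java
final class IntegerCacheDemo {
    // Identity comparison: true only if both references denote the same object.
    static boolean sameObject(Integer x, Integer y) {
        return x == y;
    }

    // Value comparison: what the "use equals()" rule recommends.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        // Boxed values in -128..127 come from the cache, so this is true.
        System.out.println(sameObject(100, 100));
        // Outside the cached range; typically false on a default JVM.
        System.out.println(sameObject(200, 200));
        // Comparing by value is true regardless of caching.
        System.out.println(sameValue(200, 200));
    }
}
```

Detecting this defect requires knowing that x and y have a reference type, which is semantic information that the AST alone does not provide.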

GPS-TRAVERSE

Linking Syntactic and Semantic Models within a Transformation System

GPS-TRAVERSE

• GPS-Traverse

• enables contextual information to be transparently tracked during transformation.

• is a collection of transformations whose purpose is to associate terms with the contexts in which they are defined

• This association is based on:

• Structural properties

• Nested classes

• Local classes

• Anonymous classes

• Frame variables currently in scope

• Generic variables currently in scope

NESTED CLASSES

package p1;

class B1 {
    class B2 {
        class B3 {
            int x;
        }
    }
}

FIELDS VERSUS LOCAL VARIABLES

class B {
    int x = 1;
    void f() {
        {
            ... x ...      // refers to the field x
            int x = 2;
            ... x ...      // refers to the local variable x
        }
    }
}
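A runnable version of the same situation, as a sketch. ShadowDemo and the variable names before, after, and viaThis are illustrative and not from the slides. It shows that the meaning of the simple name x depends on where in the block it occurs.

```java
final class ShadowDemo {
    int x = 1; // field

    int[] f() {
        int before = x;        // no local x declared yet: refers to the field
        int x = 2;             // from here on, the local variable shadows the field
        int after = x;         // refers to the local variable
        int viaThis = this.x;  // the field is still reachable through this
        return new int[] { before, after, viaThis };
    }
}
```

Calling new ShadowDemo().f() returns the values 1, 2, 1, so a tool that resolves x must track which frame variables are in scope at each point of the traversal.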

GENERIC TYPES VERSUS STANDARD TYPES

class C<T> {
    class T {
        T T; // field T of type <T>
    }
}

IN SUMMARY…

• GPS-Traverse: term → context

• In turn, a tuple of the form (term, context) provides the basis for a variety of semantic analysis functions

• A particularly useful such analysis function is called resolution

RESOLUTION

• Resolution is a semantic analysis function that operates on terms denoting references

• The resolution function used by Java is highly complex and involves:

• Static evaluation

• Type analysis

• Overloading, overriding, shadowing

• Generic analysis

• Local analysis

• Visibility – public, protected, package private, private

• Subtyping

• Imports: single-type, on-demand, and static

USES OF RESOLUTION

• Resolution is a prerequisite for a variety of software-based analysis and manipulation activities such as:

• Bootstrapping semantic models

• Software metrics

• API usage analysis

• Refactoring

• Slicing

• Migration – a well-formed complement of slicing

• Join point recognition

• Resolution-informed transformation is well-suited for many of these activities

• And finally, resolution-informed transformation can also play a key role in the construction of semantic models of software such as the call graph of a software system

EXAMPLE: CALL GRAPH

package p1;

public class A extends C {
    class innerA extends B1 {
        void g(byte b) {
            f(b + 0);
            f(0);
        }
    }
}

class B1 extends B2 {
    private void f(int x) { }
}

class B2 {
    void f(long x) { }
    void f(short x) { }
}

class C {
    void f(int x) { }
}
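The sketch below instruments the example so that the resolved targets can be observed at run time. The CallLog and CallGraphDemo classes are additions for illustration, and the package declaration and public modifier are dropped so that everything fits in one file. Under Java's rules both arguments have type int; B1.f(int) is private and therefore not a member of innerA; C.f(int) is not considered because innerA already has inherited members named f; and of those, only B2.f(long) is applicable.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper that records which method body actually ran.
final class CallLog {
    static final List<String> calls = new ArrayList<>();
}

class A extends C {
    class innerA extends B1 {
        void g(byte b) {
            f(b + 0); // b + 0 has type int
            f(0);     // 0 has type int
        }
    }
}

class B1 extends B2 {
    // Private, so not inherited by innerA and not a candidate in g.
    private void f(int x) { CallLog.calls.add("B1.f(int)"); }
}

class B2 {
    void f(long x)  { CallLog.calls.add("B2.f(long)"); }
    void f(short x) { CallLog.calls.add("B2.f(short)"); }
}

class C {
    // A member of the enclosing class A, but hidden from the search
    // because innerA itself has members named f.
    void f(int x) { CallLog.calls.add("C.f(int)"); }
}

final class CallGraphDemo {
    public static void main(String[] args) {
        new A().new innerA().g((byte) 1);
        System.out.println(CallLog.calls); // [B2.f(long), B2.f(long)]
    }
}
```

The call-graph edges for g are therefore two edges to B2.f(long), a result that depends on visibility, inheritance, nesting, and overloading at once.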

TECHNICAL DETAILS

Bascinet, the TL System, and Sextant

BASCINET

• A NetBeans-based IDE supporting the development of TL programs

• Syntax-directed editors for TL, ML, and EBNF files

• Code-folding for both TL and ML

• Hyperlinks from MLton compiler output to ML source code

• Integrated with third-party visualization tools such as Cytoscape, Graphviz, and TreeMap

• Solves some key system-level problems:

• Discrete concurrent (forgetful) application of a transformation to a file hierarchy

{ transformation } x {file1, file2, …}

• Continuous sequential (stateful) application of a transformation to a file hierarchy

state1 = transformation( state0, file1)

state2 = transformation( state1, file2)
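The two application modes can be sketched as a map and a fold. This is a minimal illustration, not Bascinet's implementation; the ApplicationModes class and its method names are assumptions, and files are represented simply by their names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

final class ApplicationModes {
    // Discrete (forgetful): each file is transformed independently, so the
    // calls could run in any order or concurrently.
    static <R> List<R> discrete(Function<String, R> transformation, List<String> files) {
        List<R> results = new ArrayList<>();
        for (String file : files) {
            results.add(transformation.apply(file));
        }
        return results;
    }

    // Continuous (stateful): the state produced for one file is the input
    // state for the next, so the files are processed sequentially.
    static <S> S continuous(BiFunction<S, String, S> transformation, S state0, List<String> files) {
        S state = state0;
        for (String file : files) {
            state = transformation.apply(state, file);
        }
        return state;
    }
}
```

A per-file style check fits the discrete mode, while building a whole-program model such as a call graph needs the continuous mode because information accumulates across files.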

THE TL SYSTEM

• Input: GLR Parser

• Output: Abstract Prettyprinter

• TL – A language for specifying higher-order transformation

• First-order matching on concrete syntax trees

• First-order and higher-order generic traversals

• Standard combinators plus special-purpose combinators

• Modular

• Partially type-checked

• ML – A functional programming language tightly integrated with TL

• Computation is expressed in terms of modules written in TL and ML.

TL

• The terms being manipulated are concrete syntax trees

• The computational unit is the conditional rewrite rule:

term_lhs → term_rhs if { condition }

• Rules (also called strategies) can be bound to identifiers:

r: term_lhs → term_rhs if { condition }

• Strategies can be constructed by composing rules using a variety of combinators:

r1 <+ r2

r1 <; r2

• Strategies can be applied to terms using traversals and iterators:

TDL myStrategy myTerm
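A sketch of how such combinators can be modeled in Java. It assumes that <+ is left-biased choice, that <; is left-to-right sequential composition, and that TDL is a top-down, left-to-right traversal which leaves a node unchanged where the strategy does not apply; TL's precise semantics may differ. Term, Strategy, orElse, then, topDown, and rename are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical term representation: a label with ordered subterms.
final class Term {
    final String label;
    final List<Term> kids;

    Term(String label, Term... kids) { this(label, Arrays.asList(kids)); }
    Term(String label, List<Term> kids) { this.label = label; this.kids = kids; }

    @Override public String toString() {
        return kids.isEmpty() ? label : label + kids;
    }
}

interface Strategy {
    // An empty result means the strategy did not apply to the term.
    Optional<Term> apply(Term t);

    // Models r1 <+ r2: try this strategy; use other only if this one fails.
    default Strategy orElse(Strategy other) {
        return t -> {
            Optional<Term> r = apply(t);
            return r.isPresent() ? r : other.apply(t);
        };
    }

    // Models r1 <; r2: apply this strategy, then other to its result.
    default Strategy then(Strategy other) {
        return t -> apply(t).flatMap(other::apply);
    }

    // Models TDL: try the strategy at the root, then at each subterm from
    // left to right; nodes where it fails are kept unchanged.
    static Term topDown(Strategy s, Term t) {
        Term root = s.apply(t).orElse(t);
        List<Term> kids = new ArrayList<>();
        for (Term k : root.kids) {
            kids.add(topDown(s, k));
        }
        return new Term(root.label, kids);
    }

    // Helper rule for the example: rewrites one label to another.
    static Strategy rename(String from, String to) {
        return t -> t.label.equals(from)
                ? Optional.of(new Term(to, t.kids))
                : Optional.<Term>empty();
    }
}
```

For example, Strategy.topDown(Strategy.rename("a", "b").orElse(Strategy.rename("c", "d")), new Term("f", new Term("a"), new Term("c"), new Term("e"))) rewrites the first two subterms and leaves the third alone.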

import_closed GPS.Locator

module CyclomaticComplexity

strategy initialize: ...

strategy outputResults: ...

strategy collectMetrics: TDL ( GPS.Locator.enter <; ccAnalysis <; GPS.Locator.exit )

strategy ccAnalysis: MethodCC <+ ConstructorCC

strategy MethodCC: ...

strategy ConstructorCC: ...

end // module

GPS-TRAVERSE

• Transformationally maintains a semantic model which can be queried in a variety of ways:

• getContextKey

• getEnclosingContextKey

• currentContextType

• enclosingContextType

• withinContextType

• inMethod

• isGeneric

• isLocalGeneric

• isVar
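One way to picture how such queries can be answered is a stack of contexts that is pushed on entering a declaration and popped on leaving it. The sketch below is an assumption-laden illustration, not the GPS-Traverse implementation: the ContextTracker class, the Kind enum, and the dotted key format are all hypothetical, and only three of the listed queries are modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

final class ContextTracker {
    enum Kind { PACKAGE, CLASS, METHOD }

    private static final class Frame {
        final Kind kind;
        final String key;
        Frame(Kind kind, String key) { this.kind = kind; this.key = key; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    // Called when the traversal enters a declaration. The dotted key
    // format is an assumption made for this sketch.
    void enter(Kind kind, String name) {
        String key = stack.isEmpty() ? name : stack.peek().key + "." + name;
        stack.push(new Frame(kind, key));
    }

    // Called when the traversal leaves the declaration again.
    void exit() {
        stack.pop();
    }

    // Key of the innermost context; assumes at least one context was entered.
    String getContextKey() {
        return stack.peek().key;
    }

    // Key of the context around the innermost one, or null if there is none.
    String getEnclosingContextKey() {
        Iterator<Frame> it = stack.iterator();
        it.next(); // skip the current context
        return it.hasNext() ? it.next().key : null;
    }

    // True if any context on the stack is a method.
    boolean inMethod() {
        for (Frame f : stack) {
            if (f.kind == Kind.METHOD) {
                return true;
            }
        }
        return false;
    }
}
```

Entering p1, B1, B2, and B3 in turn yields the context key p1.B1.B2.B3 with enclosing key p1.B1.B2, which mirrors the nested-classes example earlier in the deck.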

strategy CallGraph:
    <SelectorOptExpression>_methodCall
    →
    <SelectorOptExpression>_methodCall
    if {
        isMethodCall <SelectorOptExpression>_methodCall
        andalso sml.GPS_inMethod()
        andalso <key>_methodContext = sml.GPS_getContextKey()

        // semantic query
        andalso <key>_calledMethod = sml.resolve( <key>_methodContext , <SelectorOptExpression>_methodCall )
        andalso sml.outputPP( <key>_methodContext )
        andalso sml.output(" calls ")
        andalso sml.outputPP( <key>_calledMethod )
    }

strategy isMethodCall:
    // basic call
    SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
    →
    SelectorOptExpression[:] <TypeArgsOpt>_1 <Id>_1 <Arguments>_1 [:]
    <+
    // embedded call
    ...

Questions?

THE END
