Post on 31-Dec-2015

1

Example application: source code analysis

[Figure: File Types in Alfresco Source; axis ticks 1, 10, 100, 1000, 10000 (logarithmic scale)]

125 file types; 8029 files; 4689 non-Java; 1112 svn revisions

2

Querying Software Artefacts

[Diagram labels: source code, build scripts, version history, spreadsheets, databases, config files, web pages, bug reports; parsers, software repository, query engine; dashboard, IDE plugin, Excel add-in; analyst, developer, manager]

3

The problem

design query language and engine for accessing vast repository of different types of source artefact

libraries of queries: tailor framework to different types of artefact

4

Tough problem!

Difficulties:
- does not scale
- efficient queries extremely hard to write
- specific to one kind of source artefact

Dozens of attempts, in industry and academia since 1984: databases, Prolog, domain-specific query languages

18 man-years of research at University of Oxford, 1996-2005, to discover ingredients of solution

15 man-years to implement an industrial product

3 patents pending, several more in pipeline

5

SemmleCode: the power of .QL

6

The query language .QL

Object-oriented, for creating libraries of queries

Recursive queries, as in logic programming

Syntax familiar to Java and SQL developers

On top of any traditional relational database

Syntax-highlighting, error-checking and auto-completion

7

How it works

[Diagram labels: .QL library, .QL query, RDBMS, procedural SQL, java / jar, bytecode for search, XML files, template for RDBMS, Semmle optimiser]

8

Demo

The source we shall explore:

• Alfresco: Enterprise Content Management
• Spring: Java/JEE Application Framework
• Builds on Tomcat, JBoss, …

Demo parts:

• out-of-the-box
• writing your own queries
• querying XML config files

Vital statistics:

50553 Java methods
6647 Java types
516 XML files

9

Using SemmleCode out-of-the-box

115 pre-packaged queries

Find common bug patterns: e.g. compareTo/equals, cloning, serialisation, internationalization

Compute metrics: 42 different metrics, including Robert Martin’s package metrics

Examine dependencies: e.g. cyclic package dependencies

Visualization: pie charts, bar charts, tables, graphs, warnings/errors
- easy navigation to source
- exportable for generating reports

10

Writing queries of your own: select

from Method m
where m.fromSource() and
      m.hasName("compareTo") and
      not m.getDeclaringType().getAMethod().hasName("equals")
select m, "missing equals?"

In general:

from <variable-declarations>
where <conditions>
select <results>
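
The from-where-select shape above can be read as "range over candidates, filter, project". The following is a minimal Python sketch of that reading, not .QL itself: the `JavaType` and `Method` records, the `add_method` helper and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class JavaType:
    # Hypothetical stand-in for a type stored in the repository.
    name: str
    methods: list = field(default_factory=list)


@dataclass(eq=False)
class Method:
    # Hypothetical stand-in for a method stored in the repository.
    name: str
    declaring_type: JavaType
    from_source: bool = True


def add_method(declaring_type, name, from_source=True):
    """Create a method and register it on its declaring type."""
    method = Method(name, declaring_type, from_source)
    declaring_type.methods.append(method)
    return method


def missing_equals(methods):
    """Mirror of the slide's query: compareTo declared without equals."""
    return [
        (m, "missing equals?")                      # select <results>
        for m in methods                            # from <variable-declarations>
        if m.from_source                            # where <conditions>
        and m.name == "compareTo"
        and not any(x.name == "equals" for x in m.declaring_type.methods)
    ]


# Toy data (invented): Price lacks equals, Version has both methods.
price = JavaType("Price")
version = JavaType("Version")
all_methods = [
    add_method(price, "compareTo"),
    add_method(version, "compareTo"),
    add_method(version, "equals"),
]

results = missing_equals(all_methods)
for m, message in results:
    print(m.declaring_type.name, m.name, message)
```

The `any(...)` test plays the role of `getAMethod()`: a call that can yield many methods, of which one match is enough.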

11

Writing queries of your own: aggregates

select sum (CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())

In general:

agg( T1 x1, …, Tn xn | condition | expr )
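
As a rough illustration of the general form `agg( x | condition | expr )`, here is a small Python sketch. The `agg` helper, the `CompilationUnit` record and the sample numbers are assumptions for this example and are not part of SemmleCode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompilationUnit:
    # Hypothetical record; the field names are chosen for this sketch only.
    path: str
    from_source: bool
    lines_of_code: int


def agg(combine, domain, condition, expr):
    """Combine expr(x) over every x in domain that satisfies condition(x)."""
    return combine(expr(x) for x in domain if condition(x))


# Invented sample data: two source files and one library class.
units = [
    CompilationUnit("src/Foo.java", True, 120),
    CompilationUnit("src/Bar.java", True, 80),
    CompilationUnit("lib/Baz.class", False, 500),
]

# sum (CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())
total = agg(sum, units, lambda cu: cu.from_source, lambda cu: cu.lines_of_code)

# The same shape works for other aggregates, for example max.
largest = agg(max, units, lambda cu: cu.from_source, lambda cu: cu.lines_of_code)

print(total, largest)
```

Because the range, the filter and the expression are stated inside the aggregate itself, no separate group-by clause is needed in this style.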

12

Writing queries of your own: recursion

from RefType s, RefType t, RefType it
where it.hasName("PasswordInputTag") and
      it.hasSupertype*(s) and
      it.hasSupertype*(t) and
      t.hasSupertype(s)
select t, s

In general, can write recursive predicate definitions
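
The `hasSupertype*` chaining in the query above denotes the reflexive transitive closure of the direct-supertype relation. A minimal Python sketch of that idea follows; the hierarchy is invented for illustration (only the name `PasswordInputTag` comes from the slide), and the traversal is one plain way to compute the closure, not a description of how Semmle's engine evaluates it.

```python
def supertypes_star(start, direct):
    """Reflexive transitive closure of the direct-supertype relation from start."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical hierarchy: every name except PasswordInputTag is made up.
direct = {
    "PasswordInputTag": ["BaseInputTag"],
    "BaseInputTag": ["BaseTag", "Validating"],
    "BaseTag": ["Object"],
    "Validating": [],
    "Unrelated": ["Object"],
}

# it.hasSupertype*(s) and it.hasSupertype*(t) and t.hasSupertype(s)
above = supertypes_star("PasswordInputTag", direct)
edges = sorted((t, s) for t in above for s in direct.get(t, ()) if s in above)

for t, s in edges:
    print(t, "->", s)
```

The result is the set of inheritance edges in the hierarchy above `PasswordInputTag`; types outside that hierarchy, such as `Unrelated` here, do not appear.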

13

Queries in .QL

from-where-select: autocompletion, typechecking, emptiness tests

aggregates: arbitrary nesting, no group-by needed

recursion: implicit with chaining; or explicit

14

Defining new classes in .QL

class ClassAttribute extends XMLAttribute {
    ClassAttribute() { this.getName() = "class" }
    string getClassName() { this.getValue() = result }
    RefType getType() { result.getQualifiedName() = this.getClassName() }
    predicate noType() { not exists(this.getType()) }
}

from ClassAttribute ca
where ca.noType() and
      ca.getClassName().matches("org.alfresco%")
select ca, ca.getClassName() + " not found"

15

Classes in .QL

classes are logical properties: “constructor” specifies characteristic property

methods: body is relation between this, result and parameters; more than one result allowed

predicates: methods without a result; body is relation between this and parameters
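
One way to picture these three points is to model a class as a filter over existing values, a method as a generator that may yield zero or more results, and a predicate as a boolean test. The Python sketch below does this for the ClassAttribute example; the `XmlAttribute` record, the `KNOWN_TYPES` set and the sample attributes are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XmlAttribute:
    # Hypothetical record standing in for an attribute in an XML config file.
    name: str
    value: str


# Invented stand-in for the qualified names of types found in the code base.
KNOWN_TYPES = {"org.alfresco.example.Known"}


def is_class_attribute(attr):
    """Characteristic property: the 'constructor' ClassAttribute()."""
    return attr.name == "class"


def get_type(attr):
    """Method as a relation: yields every type whose qualified name matches."""
    for qualified_name in KNOWN_TYPES:
        if qualified_name == attr.value:
            yield qualified_name


def no_type(attr):
    """Predicate: not exists(this.getType())."""
    return next(get_type(attr), None) is None


attributes = [
    XmlAttribute("class", "org.alfresco.example.Known"),
    XmlAttribute("class", "org.alfresco.example.Missing"),
    XmlAttribute("id", "org.alfresco.example.Other"),
    XmlAttribute("class", "com.elsewhere.Missing"),
]

# from ClassAttribute ca where ca.noType() and ca.getClassName().matches("org.alfresco%")
report = [
    attr.value + " not found"
    for attr in attributes
    if is_class_attribute(attr)
    and no_type(attr)
    and attr.value.startswith("org.alfresco")
]

print(report)
```

In this reading no objects are constructed: `is_class_attribute` only selects those attributes that already exist and have the characteristic property.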

16

The key points of .QL

classes are predicates
inheritance is implication
nondeterministic expressions

recursion with super-simple semantics

syntax familiar to SQL and Java programmers

designed for creating libraries of queries

excellent error checking and IDE integration

17

Concluding remarks

18

Couldn’t you use LINQ instead of .QL?

Different design goals: ORM versus libraries of queries

LINQ does not provide recursion

LINQ cannot do the optimisations across multiple queries that are key to efficiency in .QL

“Fortunately, there is light in the darkness. Based on decades of programming language research, the brilliant team at Semmle has created an elegant, industrial strength object-oriented query language called .QL with full support for recursive queries and aggregation… .QL has all the requisites to become a runaway success.”

(Erik Meijer, Creator of LINQ, Microsoft)

19

Too good to be true?

Jeff Ullman, 1991:

It is not possible for a query language to be seriously logical and seriously object-oriented at the same time.

key breakthroughs are Semmle’s proprietary technology:
- design of .QL
- optimisations on “bytecode for search”

20

Wrapping up

Java is not enough: source code analysis tools must process a multitude of artefacts

libraries of queries: a means to achieve such heterogeneous tools

.QL: object-oriented queries over trees and graphs made fast and easy
