XSnippet: Mining For Sample Code
Naiyana Tansalarak and Kajal Claypool
Presented by: Shan Li CISC864
Topics
Overview of the research: purposes, contributions, approaches
Details of the approaches
Overview of the Research
Purpose: to provide sample code so that new developers can learn a technology quickly.
Approach: mine sample code from existing software systems.
Overview of the Research cont.
Steps in the approach:
Range of queries: generalized or specialized?
Ranking heuristics for context-sensitive and context-independent retrieval.
For example: a constructor function versus a constructor function of DOM.
Mining algorithms:
The BFSMINE algorithm, restricted to the scope of a single method.
Extensions to the BFSMINE algorithm.
Approaches: The Snippet Mining Process
Figure 1: A high-level view of the snippet mining process
Approaches cont.
The goal of snippet mining is to mine, from a given code sample repository, all code snippets that satisfy a given user query Q.
The SelectionAgent pre-selects a set of code model instances (cmi) using a B+ tree index defined on all types declared or referred to in the code sample repository.
The MiningAgent invokes the BFSMINE algorithm for every pre-selected code model instance.
Approaches cont.
The BFSMINE algorithm traverses a code model instance and produces as output a set of paths P that represent the final code snippets returned to the user.
On completion of the BFSMINE phase, the MiningAgent passes the collection of paths P to the PruningAgent.
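The three agents described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the class `CodeModelInstance`, the function names, the plain dictionary standing in for the B+ tree index, and the duplicate-dropping pruning rule are all hypothetical and are not taken from the paper. The mining step is a placeholder, not BFSMINE itself.

```python
from collections import defaultdict


class CodeModelInstance:
    """Hypothetical stand-in for a code model instance (cmi)."""

    def __init__(self, name, types, paths):
        self.name = name
        self.types = set(types)  # types declared or referred to
        self.paths = paths       # pre-baked candidate paths, for illustration only


def build_type_index(instances):
    """Map each type name to the instances that mention it.

    Assumption: a dict stands in for the B+ tree index from the slides.
    """
    index = defaultdict(list)
    for cmi in instances:
        for type_name in sorted(cmi.types):
            index[type_name].append(cmi)
    return index


def select(index, query_type):
    """SelectionAgent role: pre-select instances that mention the query type."""
    return index.get(query_type, [])


def mine(cmi, query_type):
    """MiningAgent role: placeholder that keeps paths ending in the query type."""
    return [path for path in cmi.paths if path and path[-1] == query_type]


def prune(paths):
    """PruningAgent role: assumed here to drop duplicate paths, keeping order."""
    seen, kept = set(), []
    for path in paths:
        key = tuple(path)
        if key not in seen:
            seen.add(key)
            kept.append(path)
    return kept


def run_query(instances, query_type):
    """Selection, then mining per instance, then pruning."""
    index = build_type_index(instances)
    paths = []
    for cmi in select(index, query_type):
        paths.extend(mine(cmi, query_type))
    return prune(paths)
```

The point of the index is that mining, the expensive step, only runs on instances that mention the query type at all.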
Approaches cont.: Queries
The query returns all snippets s containing code that instantiates a type tq:
(1) all code that instantiates tq;
(2) code in which the instantiation of tq depends on the code context, i.e. via a static method.
[Figure: example of the query; not preserved in the transcript]
Approaches cont.: Queries
Type-based instantiation query: the query type tq is instantiated from any type in the context CT(m).
T(s) denotes the lexically visible types in the code snippet s, and CT(m) denotes the type context of the method m.
CT(m): the set of all inherited types, all types visible in the scope of the method, and the types of all local fields.
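The type context can be sketched as a set union. The names `MethodInfo`, `type_context`, and `matches_type_context` are hypothetical. The matching rule (every type the snippet needs must be available in CT(m)) is an assumption made for illustration, since the exact condition is not legible in the slide text.

```python
from dataclasses import dataclass, field


@dataclass
class MethodInfo:
    """Hypothetical description of the method m the developer is editing."""
    inherited_types: set = field(default_factory=set)
    visible_types: set = field(default_factory=set)
    local_field_types: set = field(default_factory=set)


def type_context(m):
    """CT(m): inherited types, types visible in the method, and local field types."""
    return m.inherited_types | m.visible_types | m.local_field_types


def matches_type_context(snippet_types, m):
    """Assumed rule: the snippet qualifies when T(s) is a subset of CT(m)."""
    return set(snippet_types) <= type_context(m)
```

For example, a method whose context contains `File` and `DocumentBuilder` would accept a snippet that needs only those two types and reject one that also needs `InputStream`.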
Approaches cont.: Queries
Parent-based instantiation query: s denotes a snippet, CP(s) the parent context of the snippet, and CP(m) the parent context of the method m.
CP(m): the set containing the superclass extended by the source class C that contains m, together with all interfaces implemented by C.
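A minimal sketch of the parent context follows. `SourceClass`, `parent_context`, and `shares_parent` are hypothetical names. Treating a non-empty overlap between CP(s) and CP(m) as a match is an assumption for illustration, not a rule confirmed by the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SourceClass:
    """Hypothetical record for the source class C that contains a method."""
    name: str
    superclass: Optional[str] = None
    interfaces: frozenset = field(default_factory=frozenset)


def parent_context(containing_class):
    """CP(m): the superclass of C plus every interface C implements."""
    parents = set(containing_class.interfaces)
    if containing_class.superclass is not None:
        parents.add(containing_class.superclass)
    return parents


def shares_parent(snippet_class, method_class):
    """Assumed rule: CP(s) and CP(m) have at least one parent in common."""
    return bool(parent_context(snippet_class) & parent_context(method_class))
```

Under this assumption, a snippet taken from a class that extends `ViewPart` would match a method whose class also extends `ViewPart`, and would not match a class with an unrelated superclass and no shared interfaces.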
Approaches cont.: Source Code Model
A graph representation of the structure of source code.
Nodes: type nodes, object nodes, and method nodes.
Edges: inheritance, implement, composite, method, assignment, and parameter edges.
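One possible in-memory form of such a model is a labelled directed graph. The class `CodeModel` and its methods are hypothetical, and the direction and meaning given to each edge in the usage note are assumptions. Only the node and edge kind names come from the slide.

```python
from collections import defaultdict

NODE_KINDS = {"type", "object", "method"}
EDGE_KINDS = {"inheritance", "implement", "composite",
              "method", "assignment", "parameter"}


class CodeModel:
    """Hypothetical labelled directed graph for a source code model."""

    def __init__(self):
        self.nodes = {}                     # node name -> node kind
        self.out_edges = defaultdict(list)  # source -> [(edge kind, target)]
        self.in_edges = defaultdict(list)   # target -> [(edge kind, source)]

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def add_edge(self, source, kind, target):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.out_edges[source].append((kind, target))
        self.in_edges[target].append((kind, source))

    def predecessors(self, name):
        """Names of nodes with an edge pointing at the given node."""
        return [source for _, source in self.in_edges[name]]
```

As a usage example, a statement such as `doc = builder.parse(file)` might be recorded as a method node `parse` with parameter edges from `builder` and `file` and an assignment edge to `doc`. Keeping incoming edges alongside outgoing ones makes it cheap to walk backwards from a queried node.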
Approaches cont.: BFSMINE Algorithm
Given a user query, the goal of the BFSMINE algorithm is to determine, for every node instance nq with Domain(nq) = {tq}, the types and eventually the code segments that instantiate nq and hence the query type tq.
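The breadth-first idea can be sketched as a backward traversal from the query node. This is a simplified illustration and not the authors' BFSMINE: the function `bfs_mine`, the `predecessors` mapping, and the `max_length` bound are assumptions, and each result is a linear path even where a real snippet would need several inputs at once.

```python
from collections import deque


def bfs_mine(predecessors, query_node, max_length=5):
    """Collect paths that end at query_node, walking edges backwards.

    predecessors maps a node name to the nodes that feed into it.
    A path is finished when its first node has no unvisited
    predecessor or when it reaches max_length nodes.
    """
    complete = []
    queue = deque([[query_node]])
    while queue:
        path = queue.popleft()
        head = path[0]
        # Skip nodes already on the path so that cycles terminate.
        parents = [p for p in predecessors.get(head, []) if p not in path]
        if not parents or len(path) >= max_length:
            complete.append(path)
            continue
        for parent in parents:
            queue.append([parent] + path)
    return complete
```

With `predecessors = {"doc": ["parse"], "parse": ["builder", "file"]}`, calling `bfs_mine(predecessors, "doc")` yields one path through `builder` and one through `file`, each ending at `doc`.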
Approaches cont.: Extensions to BFSMINE
Personal Comments: Strengths
User-defined queries: results range from context-independent retrieval to various degrees of context-sensitive retrieval.
The BFSMINE algorithm is based on a graph that represents the source code model, which allows mining across method boundaries.
Ranking heuristics (length, frequency, context) provide best-fit code snippets.
Multiple code samples can match the same query: context-independent retrieval ranks them by length and frequency, and context-sensitive retrieval ranks them by context.
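The three heuristics can be combined into a single sort key, sketched below. The snippet representation (a dict with `lines` and `types`), the function `rank_snippets`, and the ordering choices (more context overlap first, then more frequent, then shorter) are assumptions made for illustration. The slides name the heuristics but not how they are computed or combined.

```python
from collections import Counter


def rank_snippets(snippets, context_types=frozenset()):
    """Rank unique snippets by context overlap, then frequency, then length.

    Each snippet is a dict with "lines" (list of str) and "types" (set of str).
    With an empty context, only frequency and length affect the order.
    """
    frequency = Counter(tuple(s["lines"]) for s in snippets)

    # Keep the first occurrence of each distinct snippet body.
    unique = {}
    for s in snippets:
        unique.setdefault(tuple(s["lines"]), s)

    def sort_key(s):
        lines = tuple(s["lines"])
        overlap = len(set(s["types"]) & set(context_types))
        return (-overlap, -frequency[lines], len(lines))

    return sorted(unique.values(), key=sort_key)
```

Calling it without `context_types` mimics context-independent retrieval. Passing the types available at the developer's cursor mimics context-sensitive retrieval.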
Personal Comments: Potential Weaknesses
Results: is it possible to provide semantic ranking? Why? The returned code snippets probably have no logical connection among them; each is just a chunk of code.
Validation approach: to show that code snippets help developers, the authors use a group test with two groups under the same conditions, except that one group uses the code snippets and the other does not.
Limited?