API Usage Pattern Extraction Using Semantic Similarity
DESCRIPTION
A project on API usage pattern extraction that exploits semantic similarity among API usage code examples.
TRANSCRIPT
SEMANTIC NETWORK BASED API USAGE PATTERN EXTRACTION & LEARNING
Mohammad Masudur Rahman
Department of Computer Science
University of Saskatchewan
PRESENTATION OVERVIEW
Introduction
Motivating Example
Background Concepts
Proposed Approach
Semantic Network of Source Code
API Usage Pattern Extraction
Pattern Learning & Visualization
Experimental Results & Discussions
Threats to Validity
Conclusion & Future Works
INTRODUCTION
API (Application Programming Interface) Libraries
API Documentation, API Browser, forums
API Usage learning for developers
Existing projects using APIs
API Usage Patterns
WHAT IS API USAGE PATTERN?
A frequent and consistent sequence of API method calls and field accesses
Performs a particular programming task
Widely used in multiple projects
Widely accepted by the developer community
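To make the definition concrete, the sketch below treats a pattern as an ordered sequence of call names and measures how many usages contain it in that order. The call sequences, the `<init>` naming for constructors, and the function names are illustrative assumptions, not data or code from the presented tool.

```python
from typing import List, Sequence


def contains_in_order(usage: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if every call in `pattern` occurs in `usage` in the same order."""
    remaining = iter(usage)
    # `call in remaining` advances the iterator, so order is enforced.
    return all(call in remaining for call in pattern)


def support(usages: List[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of usages that contain the pattern as an ordered subsequence."""
    if not usages:
        return 0.0
    return sum(contains_in_order(u, pattern) for u in usages) / len(usages)


# Hypothetical call sequences, one per project (not taken from the study).
usages = [
    ["FileInputStream.<init>", "BufferedInputStream.<init>",
     "BufferedInputStream.read", "BufferedInputStream.close"],
    ["BufferedInputStream.<init>", "BufferedInputStream.available",
     "BufferedInputStream.read", "BufferedInputStream.close"],
    ["BufferedInputStream.<init>", "BufferedInputStream.read"],
]
pattern = ["BufferedInputStream.<init>", "BufferedInputStream.read",
           "BufferedInputStream.close"]

print(support(usages, pattern))  # two of the three usages contain the pattern
```

A sequence that recurs across many projects with high support is what the slide calls frequent and consistent.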
API USAGE PATTERN
BIG QUESTION?
How to extract the API usage patterns from the source code?
SEMANTIC WEB OR NETWORK
Where does the author of a particular software manual live?
MOTIVATING EXAMPLE
RESEARCH QUESTIONS
RQ 1: Can semantic network technologies represent the semantics of OO source code properly?
RQ 2: Can this representation be used for API usage pattern extraction and learning?
BACKGROUND CONCEPTS
API Usage Patterns
API Usage Violation & Anomalies
Semantic Web
Semantic Network of Source Code
Resource Description Framework (RDF)
RDF Statement or Triples
RDF TRIPLE (BUILDING BLOCK OF SEMANTIC WEB OR NETWORK)
Subject Predicate Object
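A small sketch may help show how triples chain into a network that can answer the earlier question about where a manual's author lives. It uses plain Python tuples rather than an RDF library, and every resource and predicate name (`ex:hasAuthor`, `ex:livesIn`, `ex:Alice`, and so on) is a made-up placeholder.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical facts; the names are placeholders, not a real vocabulary.
triples = [
    Triple("ex:SoftwareManual", "ex:hasAuthor", "ex:Alice"),
    Triple("ex:Alice", "ex:livesIn", "ex:Saskatoon"),
    Triple("ex:SoftwareManual", "ex:describes", "ex:SomeLibrary"),
]


def objects_of(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return the objects of all triples matching a subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


def author_residences(graph: List[Triple], manual: str) -> List[str]:
    """Follow manual -> author -> residence across two triples."""
    places: List[str] = []
    for author in objects_of(graph, manual, "ex:hasAuthor"):
        places.extend(objects_of(graph, author, "ex:livesIn"))
    return places


print(author_residences(triples, "ex:SoftwareManual"))
```

The answer is not stored in any single statement. It emerges from following two triples that share the node `ex:Alice`, which is the property the approach relies on when source code is represented the same way.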
PROPOSED APPROACH FOR API USAGE PATTERN EXTRACTION & LEARNING
Fig: Workflow of the proposed approach (steps numbered 1 to 9 in the original diagram)
Inputs: API Class List, OSS Projects
Decision: Contains API? (Yes / No)
Components: Source code parser, Semantic Network Builder, API Pattern Explorer, API Usage Pattern Manager, RDF Pattern Visualizer, Pattern Source Skeleton Builder
Data passed between components: API Classes, Source files, Parsed Expressions, RDF Files, Patterns
SOURCE CODE SEMANTIC NETWORK
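Fig: Building the semantic network of source code
Input: Java Source code
Components: AST Parser (Javaparser), API Expression selection rules, RDF Maker, Apache Jena Framework
Intermediate data: Java Expressions, RDF Triples
Output: RDF Network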
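The step from parsed expressions to RDF triples can be sketched as follows. The deck names Javaparser and Apache Jena for this stage; the sketch uses neither and works on hand-written expression records instead. The record layout, the single selection rule (keep only expressions on variables whose type is in the API class list), and the predicate names `instanceOf` and `calls` are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def build_triples(expressions: List[Dict[str, str]], api_classes: Set[str]) -> List[Triple]:
    """Turn simplified expression records into (subject, predicate, object) triples."""
    var_types: Dict[str, str] = {}
    triples: List[Triple] = []
    for expr in expressions:
        if expr["kind"] == "object_creation":
            # Remember the declared type so later calls can be resolved.
            var_types[expr["var"]] = expr["type"]
            if expr["type"] in api_classes:
                triples.append((expr["var"], "instanceOf", expr["type"]))
        elif expr["kind"] == "method_call":
            receiver_type = var_types.get(expr["receiver"])
            # Assumed selection rule: skip calls on non-API or unknown receivers.
            if receiver_type in api_classes:
                triples.append(
                    (expr["receiver"], "calls", f"{receiver_type}.{expr['method']}")
                )
    return triples


# Hypothetical output of a parser for one method body.
expressions = [
    {"kind": "object_creation", "var": "fis", "type": "FileInputStream"},
    {"kind": "object_creation", "var": "bis", "type": "BufferedInputStream"},
    {"kind": "method_call", "receiver": "bis", "method": "read"},
    {"kind": "method_call", "receiver": "log", "method": "info"},
    {"kind": "method_call", "receiver": "bis", "method": "close"},
]

for triple in build_triples(expressions, {"BufferedInputStream"}):
    print(triple)
```

With `BufferedInputStream` as the only selected API class, the `FileInputStream` creation and the `log.info` call are filtered out, leaving three triples that all hang off the node `bis`.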
API USAGE PATTERN EXTRACTION
Fig: Pattern extraction flow
All Usages of an API Class
Common Sub-graph Selection
Candidate API Usage Patterns
Pattern Score > threshold?
Yes: Selected API Usage Patterns
No: Discarded
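The selection step can be approximated in a few lines. The slides do not define the pattern score or the sub-graph matching procedure, so this sketch makes two simplifying assumptions: each usage is a set of triples already normalised to type names (so identical triples compare equal across projects), and the score of a candidate is the fraction of usages that contain it. The triples and the threshold value are invented for the example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


def candidate_patterns(usages: List[Graph]) -> Set[Graph]:
    """Collect the non-empty triple sets shared by each pair of usages."""
    candidates: Set[Graph] = set()
    for a, b in combinations(usages, 2):
        common = a & b
        if common:
            candidates.add(common)
    return candidates


def score(pattern: Graph, usages: List[Graph]) -> float:
    """Assumed score: fraction of usages that contain every triple of the pattern."""
    return sum(pattern <= u for u in usages) / len(usages)


def select_patterns(usages: List[Graph], threshold: float) -> Dict[Graph, float]:
    """Keep candidates whose score exceeds the threshold; the rest are discarded."""
    scored = {p: score(p, usages) for p in candidate_patterns(usages)}
    return {p: s for p, s in scored.items() if s > threshold}


# Hypothetical normalised usages of one API class.
T = "BufferedInputStream"
create, read, close = (T, "calls", "<init>"), (T, "calls", "read"), (T, "calls", "close")
available, mark = (T, "calls", "available"), (T, "calls", "mark")

usages = [
    frozenset({create, read, close}),
    frozenset({create, read, close, available}),
    frozenset({create, read}),
    frozenset({create, mark}),
]

for pattern, s in select_patterns(usages, threshold=0.6).items():
    print(sorted(pattern), s)
```

Intersecting triple sets is a stand-in for true common sub-graph selection, which would also have to align differently named nodes. The threshold trades pattern size against generality: a larger candidate such as create, read, close appears in fewer usages and is the first to be discarded.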
EXPERIMENTAL RESULTS
25 open source projects
3 API libraries (java.io, java.util, java.awt)
250 API classes selected
API usages found for 113 API classes
Patterns found for 76 API classes
776 patterns in total
API USAGE PATTERNS
SOURCE CODE SKELETON
Fig: BufferedInputStream Usage Pattern
EXPERIMENTAL RESULTS
Project      #Class  #M&C   #ATCF  #ADCF  #ATPF  #ADPF
Ant-Contrib  186     1388   96     23     1865   280
AOI          461     6489   218    55     1651   494
Groimp       1202    13875  132    41     1632   407
JFreechart   1059    12368  507    38     6841   410
JHotdraw7    689     7330   310    49     2547   462
#M&C = Methods & Constructors, #ATCF = Total API classes, #ADCF = Distinct API classes, #ATPF = Total API patterns found, #ADPF = Distinct API patterns found
PATTERNS PER CLASS
Fig: # patterns extracted per class comparison
RESULTS DISCUSSION
RQ 1: Can semantic network technologies represent the semantics of OO source code properly?
Graph-based API usage extraction by Nguyen et al. (FSE 2009): incomplete semantics for edges and attributes
Source code ontology by Wursch et al. (ICSE 2010): does not represent the complete source code
The proposed approach captures expression level syntax and semantics
Focuses on API usage patterns
RESULTS DISCUSSION
RQ 2: Can this representation be used for API usage pattern extraction and learning?
Successfully extracts 776 patterns for 76 API classes from 25 open source projects
A promising approach that merits further exploration for API usage pattern extraction
Visualization of the RDF network helps in learning:
Source code as visual entities rather than lines
A more comprehensive idea about OO source code
Applicable to complex OO relationships
Very useful for quick learning
THREATS TO VALIDITY
Representing complete semantics is a non-trivial task
More expressions are needed for a more accurate representation
RDF pattern visualization has to fit within a limited display
Users need to be introduced to RDF conventions
CONCLUSION & FUTURE WORKS
Applicability of semantic web technologies for API usage pattern extraction
Semantic representation for learning by the developers
Real-world user study
Extracted patterns for automatic code completion in the IDE
Extracted patterns for API violation and anomaly detection
THANK YOU!!!
REFERENCES
[1] Semantic web diagram. URL http://www.w3.org/Talks/2002/10/16-sw/slide7-0.html
[2] Tung Thanh Nguyen, Hoan Anh Nguyen, Nam H. Pham, Jafar M. Al-Kofahi, and Tien N. Nguyen. Graph-based mining of multiple object usage patterns. In Proc. ESEC/FSE, 2009, pages 383-392.
[3] M. Wursch, G. Ghezzi, G. Reif, and H. C. Gall. Supporting developers with natural language queries. In Proc. ICSE, 2010, pages 165-174.
[4] Tao Xie and Jian Pei. MAPO: mining API usages from open source repositories. In Proc. MSR, 2006, pages 574-57
[5] Semantic web technology. URL http://www.w3.org/2001/sw
[6] Visual learning style. URL http://www.learning-styles-online.com/style/visual-spatial.
[7] Apache Jena framework. URL http://jena.apache.org/
[8] Javaparser: Java 1.5 parser and AST. URL http://code.google.com/p/javaparser/
[9] RDF-gravity tool. URL http://semweb.salzburgresearch.at/apps/rdf-gravity/