Visualizing Results of Data Mining Source Code by Mike McCallie

Upload: alagan

Post on 26-Feb-2016


Page 1: Visualizing Results of Data Mining Source Code

Visualizing Results of Data Mining Source Code

by Mike McCallie

Page 2: Visualizing Results of Data Mining Source Code

Thoughts

I want to combine Data Mining tools + Visualization tools

I am motivated to use information in various forms to make informed decisions

I believe inherent software structure (compilable source code) has an advantage over free-form text from a data mining perspective

I wish to “mine” data from source code and “build” visual models of code representation that are useful from a software engineer’s perspective

Page 3: Visualizing Results of Data Mining Source Code

Exploring tools at Moose for Data exploration

Page 4: Visualizing Results of Data Mining Source Code

Exploring “Code City” for Visual Representation

Classes are represented as buildings in the city. Packages are depicted as the districts in which the buildings reside.

CodeCity is programmed in VisualWorks Smalltalk on top of the Moose platform and uses OpenGL for rendering
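The city metaphor can be sketched in a few lines of Python. This is only an illustration of the idea, not CodeCity's actual implementation (which is Smalltalk on Moose). The `ClassInfo` and `Building` types, the sample package and class names, and the choice to map method count to height and attribute count to base size are all assumptions made for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ClassInfo:
    """Hypothetical record produced by a mining step for one class."""
    package: str
    name: str
    num_methods: int
    num_attributes: int


@dataclass
class Building:
    """One class drawn as a building (mapping below is an assumption)."""
    name: str
    height: int  # assumed mapping: number of methods
    base: int    # assumed mapping: number of attributes, at least 1


def build_city(classes):
    """Group classes into districts (one per package), one building per class."""
    districts = defaultdict(list)
    for c in classes:
        building = Building(c.name, c.num_methods, max(1, c.num_attributes))
        districts[c.package].append(building)
    return dict(districts)


# Made-up sample data, not taken from any real system.
classes = [
    ClassInfo("app.model", "Invoice", 12, 5),
    ClassInfo("app.model", "Customer", 7, 4),
    ClassInfo("app.ui", "MainWindow", 30, 0),
]

city = build_city(classes)
for package, buildings in city.items():
    print(f"District {package}")
    for b in buildings:
        print(f"  {b.name}: height={b.height}, base={b.base}")
```

A real renderer would then lay the districts out in 2D and draw each building in 3D; the sketch stops at the data model because that is the part the slide describes.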

Page 5: Visualizing Results of Data Mining Source Code

Conceptual Model

Source Code → Data Mining “Engine” (with “Mining” Algorithms) → Data Output → Visualization “Engine” → Visual Results
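A minimal sketch of this conceptual model, assuming Python source as a stand-in for the code being mined and a plain-text bar chart as a stand-in for the visualization. The function names `mine`, `count_methods` and `visualize`, and the sample source, are hypothetical; the thesis itself proposes Moose and Code City for these roles.

```python
import ast

# Made-up sample input standing in for "Source Code".
SAMPLE_SOURCE = '''
class Account:
    def deposit(self, amount): ...
    def withdraw(self, amount): ...

class Ledger:
    def post(self, entry): ...
'''


def count_methods(tree):
    """A "mining" algorithm: number of methods defined directly in each class."""
    return {
        node.name: sum(isinstance(item, ast.FunctionDef) for item in node.body)
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
    }


def mine(source, algorithms):
    """Data mining "engine": parse once, run every algorithm on the syntax tree."""
    tree = ast.parse(source)
    return {name: algorithm(tree) for name, algorithm in algorithms.items()}


def visualize(data_output):
    """Visualization "engine": render each metric as a text bar chart."""
    lines = []
    for metric, values in data_output.items():
        lines.append(metric)
        for cls, value in sorted(values.items()):
            lines.append(f"  {cls:<10} {'#' * value} ({value})")
    return "\n".join(lines)


data_output = mine(SAMPLE_SOURCE, {"methods per class": count_methods})
visual_results = visualize(data_output)
print(visual_results)
```

Keeping the data output as a plain dictionary between the two engines mirrors the diagram: either side can be swapped out without touching the other.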

Page 6: Visualizing Results of Data Mining Source Code

Thesis Approach – Part 1

Theoretical Discussion

◦ Data mining and visualization investigation

◦ 80’s and 90’s focus on program comprehension

  What worked

  What were dead-ends (as important as what worked IMHO)

◦ Literature review on program comprehension

  Gestalt principles were explored in previous class

◦ Results of past empirical studies

Page 7: Visualizing Results of Data Mining Source Code

Thesis Approach – Part 1

Motivating Scenario

◦ Problem that is not too big, but not too small

◦ “Bob the programmer was given the assignment to add enhancement X to legacy system Y.”

◦ Bob has ability to mine data from source code and visualize results

◦ Question: What information is MOST relevant for Bob to succeed? (bound problem)

Page 8: Visualizing Results of Data Mining Source Code

Thesis Approach – Part 2

Implementation

◦ Moose tools for software analysis

◦ Code City for software visualization

◦ Source Code Analysis:

  Public domain: Analyzing JHotDraw

  Private domain: Analyzing 20+ year old legacy system at present employer

Page 9: Visualizing Results of Data Mining Source Code

JHotDraw Framework

Classes Model

Design Patterns

Role-Model-Enhanced Class Model

Page 10: Visualizing Results of Data Mining Source Code

Thesis Approach – Part 3

• Empirical Study – Compare resultant artifacts

JHotDraw Source Code → Data Mining “Engine” + Visualization “Engine” → JHotDraw Artifacts → Compare to existing JHotDraw artifacts

Legacy System Source Code → Data Mining “Engine” + Visualization “Engine” → Legacy System Artifacts → Compare to existing Legacy System “expertise”

Page 11: Visualizing Results of Data Mining Source Code

Thesis Approach – Part 4

• Results and Conclusions…

“Rule of Thumb” Mathematical Model

“I am very curious how close I can come to a workable mathematical model based on the findings of my empirical study”

Page 12: Visualizing Results of Data Mining Source Code

A big thank you to my thesis committee

Dr. Parvathi Chundi

Dr. Bill Mahoney

Dr. Harvey Siy

Page 13: Visualizing Results of Data Mining Source Code

And thank you for your time as well…

Questions, Comments, Concerns, Observations, Puns, Jokes, Limericks, etc.