reverse engineering of software architecture
© 2012 Fraunhofer USA, Inc.
Software Architecture in Evolution and Reverse Engineering of Legacy systems
Mikael Lindvall, Dharma Ganesan
Software Architecture and Embedded Systems Division
Fraunhofer Center for Experimental Software Engineering Maryland (FC-MD)
Your Presenters
• Mikael Lindvall, PhD: Division director, more than 13 years at FC-MD, co-invented FC-MD’s reverse engineering approach, analyzed e.g. NASA’s Space Network (10 MLOC Ada, C++, etc.). Review board member for the SN replacement system (SGSS).
• Dharma Ganesan, PhD: Research scientist, more than 8 years at FC-MD, co-invented FC-MD’s reverse engineering and testing approach, analyzed NASA’s Core Flight Software, GMSEC, Climate Modeling System, etc.
Fraunhofer Center – Maryland (FC-MD)
• Applied Research and Tech Transfer, non-profit
– US incorporated
• Affiliated with
– University of Maryland, College Park
– Fraunhofer Germany
• Close ties to NASA
– Goddard Space Flight Center around the corner
• Focus on Software Engineering
– Especially Software Quality
• Business model: Applied research services
Fraunhofer Center – Maryland (FC-MD) at MSquare
Clients ask Fraunhofer to determine
• If their software architecture/design rules are met
• The risk involved if they change the software
• If their software meets certain regulations
• If their software has defects
• If their software is efficient
• Etc.
Today: How reverse engineering can be used to deal with legacy systems, illustrated with different kinds of examples from different systems
Reverse Engineering at Fraunhofer
• Developed an approach to analyze, visualize, and describe legacy software
– Structure and behavior
– Methods and tools
– Support from NASA IV&V
• Analyzed legacy software systems e.g.
– NASA’s Space Network (Ground segment)
– NASA’s Core Flight Software
– NASA’s GMSEC
• More than 10 years
Background: Software architecture
• Software architecture (SA) deals with components, connectors, and protocols
• SA is a multi-dimensional artifact
– Each dimension corresponds to one concern (e.g. database interaction concern)
• SA is represented by a collection of views
– Development/Implementation view
– Runtime view
Our Model of SA and RE
• Development views
– Components of a development view
• Directories/files/functions/database tables
– Connectors of a development view
• Function calls, includes, variable accesses, etc.
• Runtime views
– Components of a runtime view
• Tasks, processes
– Connectors of a runtime view
• Sockets, queues, shared memory, software bus, etc.
• Create views from source code to answer questions!
The Fraunhofer RE Method
• Software architecture is influenced and inspired by external entities (EE)
– Programming language libraries
– COTS and Frameworks
• Reverse Engineering is driven by EE
• A knowledge base of EE based on ~24 real-world systems
– Several NASA systems and systems from other companies
SAVE: Sample Software Architecture Visualization and Evaluation Tools
(Depends on development environment)
Tool | Type | Purpose
Understand | Commercial | Extracts code-level dependencies and metrics from source code
RPA | Research | Queries the dependency models using relational algebra
Prefuse | Research | Visualizes the content of the knowledge base
Similarity | Research | Determines similarity among files
FindBugs | Open source | Detects defects in Java code; other tools are used to detect defects in other languages
SAVE | Research | Imports and visualizes dependency models tagged by similarity, metrics, defects, and knowledge; detects architecture violations (compares actual to planned)
Example: Common Ground System (CGS)
• Ground System implemented in C/C++
• Developed by the Johns Hopkins University Applied Physics Laboratory (JHU/APL)
• 10 years old
• Product line for three different NASA missions
• Works well
• Software Quality is very important
Exploring actual architecture
The Common Ground System
[Figure: Assessment CSCI Telemetry Data Flow Diagram (LHerrera 08/03). Elements shown include the Telemetry CSCI, the Planning & Scheduling CSCI, SSC POC clients, MOPs/I&T users, level_zero, indexer, archive_server, spooler, merger, Plotter, eng_dump, instant_replay, gap_reporter, and the Web Data Server, connected by telemetry packet files, indexes, points files, and archive server directives. The Timekeeping System is expanded separately.]
Manually drawn diagram from system documentation
A Typical application: eng_dump
Automatically drawn diagram based on source code
eng_dump’s use of Common
App_Specific
Automatically drawn diagram based on source code
Detecting high level violations
The Common Ground System
[Figure: the Assessment CSCI Telemetry Data Flow Diagram (LHerrera 08/03) repeated, with the label "Common" added]
Violations of architecture:
Common depends on eng_dump
ASED
Automatically drawn diagram (actual) based on source code
Checking Design Rules
Planned Architecture: eng_dump
[Figure: Application-Specific Modules, Client, and the socket; annotations: encapsulation of client/server interface, encapsulation of socket communications]
Manually drawn diagram based on design rule
The Actual Architecture: eng_dump (“components” collapsed)
Automatically drawn diagram based on source code
Mapping planned and actual using patterns
The Actual Architecture vs. The Planned: eng_dump
[Figure: planned and actual architecture of eng_dump overlaid. Legend: dependency in actual, not in planned; dependency in planned, not in actual. Label: Client]
Who does socket communicate with?
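The comparison behind the two legend entries can be sketched as a set difference over dependency pairs. This is a minimal illustration, not the SAVE implementation; the `Dependency` type, the `difference` function, and any component names passed to it are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One directed dependency between two named components (hypothetical type). */
typedef struct {
    const char *from;
    const char *to;
} Dependency;

/* Returns 1 if `dep` occurs in `set`, comparing component names. */
static int contains(const Dependency *set, size_t n, Dependency dep)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(set[i].from, dep.from) == 0 &&
            strcmp(set[i].to, dep.to) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Copies into `out` every dependency of `a` that is missing from `b` and
 * returns how many were copied. `out` must have room for `na` entries.
 *   difference(actual, planned)  gives "in actual, not in planned"
 *   difference(planned, actual)  gives "in planned, not in actual"
 */
size_t difference(const Dependency *a, size_t na,
                  const Dependency *b, size_t nb,
                  Dependency *out)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < na; i++) {
        if (!contains(b, nb, a[i])) {
            out[count++] = a[i];
        }
    }
    return count;
}
```

Calling `difference` twice with the arguments swapped yields both kinds of mismatch shown in the legend.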
Adding Components and Layers to Common
Common; across all applications
Automatically drawn diagram based on source code
Suggested Target Architecture for Common
Basic rule: A lower layer cannot access a higher layer
Manually drawn diagram
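The basic layering rule lends itself to a mechanical check. The sketch below assumes each component has already been assigned a layer number (0 = lowest); the `LayeredDependency` type and `count_back_links` function are hypothetical names for illustration, not part of the tooling described in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* A dependency whose endpoints carry their layer number (0 = lowest layer).
 * The numbering scheme is an assumption of this sketch. */
typedef struct {
    const char *from;
    int from_layer;
    const char *to;
    int to_layer;
} LayeredDependency;

/* Counts dependencies that break the rule "a lower layer cannot access a
 * higher layer", i.e. those whose source layer is below the target layer. */
size_t count_back_links(const LayeredDependency *deps, size_t n)
{
    assert(deps != NULL || n == 0);
    size_t violations = 0;
    for (size_t i = 0; i < n; i++) {
        if (deps[i].from_layer < deps[i].to_layer) {
            violations++;
        }
    }
    return violations;
}
```

Dependencies within one layer or from a higher to a lower layer are accepted; only upward accesses are counted.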
Components and Layers
Manual refactoring of files into components and layers
Layers with dependencies
Let’s see how this target structure maps to the actual implementation
Automatically drawn diagram, manual layout
Back links from lower layer to higher layer
Automatically drawn diagram, manual layout
Analyzing embedded software
Case Study: CARA Medical Device
[Figure: Blood Pressure Monitor and CARA Software]
Sample analysis needs at FDA
• What is the architecture of the software in general?
– Is the software putrid?
• Where is a certain “Safety Function” located?
– Is it present at all?
• Once located:
– What is the quality of that “Safety Function”?
• From various perspectives (cloning, look and feel, etc.)
– Does the architecture allow for “modularized verification” of the “Safety Function”?
– If not, can the architecture be refactored to facilitate verification using detailed but time-consuming static analysis tools?
Analysis Types
• Goal: Analysis of Architectural Quality
• Variability Management
– OS/Hardware Abstraction
• Reverse architecture of module dependencies
• Reverse architecture of task dependencies
• Analysis of Testability
Summary Generator Output
CARA has
• Several keywords of Windows libraries (e.g. windows.h)
• Several keywords of VxWorks libraries (e.g. vxworks.h)
• Multitasking, because it has the taskSpawn keyword
• Inter-process communication, because it has the msgQSend/msgQReceive keywords
• Semaphores, because it has the semBCreate and semTake keywords
• A GUI, because it has keywords of GUI libraries (e.g. afxwin.h)
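A keyword-driven summary of this kind can be approximated by matching source text against a small table that maps keywords to the features they indicate. The sketch below is an assumption about how such a generator might work, not the Fraunhofer tool itself; `KbEntry`, `KNOWLEDGE_BASE`, and `summarize` are hypothetical names, and the table holds only the keywords named on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One knowledge-base entry: a keyword and the feature it indicates. */
typedef struct {
    const char *keyword;
    const char *feature;
} KbEntry;

/* Tiny illustrative knowledge base built from the keywords on the slide. */
static const KbEntry KNOWLEDGE_BASE[] = {
    {"windows.h",  "Windows libraries"},
    {"vxworks.h",  "VxWorks libraries"},
    {"taskSpawn",  "multitasking"},
    {"msgQSend",   "inter-process communication"},
    {"semBCreate", "semaphores"},
    {"afxwin.h",   "GUI"},
};
#define KB_SIZE (sizeof KNOWLEDGE_BASE / sizeof KNOWLEDGE_BASE[0])

/*
 * Stores in `features` (capacity `max`) the feature of every entry whose
 * keyword occurs anywhere in `source`, in knowledge-base order, and returns
 * the number stored. Plain substring matching: comments and string literals
 * are not filtered out.
 */
size_t summarize(const char *source, const char **features, size_t max)
{
    assert(source != NULL && features != NULL);
    size_t count = 0;
    for (size_t i = 0; i < KB_SIZE && count < max; i++) {
        if (strstr(source, KNOWLEDGE_BASE[i].keyword) != NULL) {
            features[count++] = KNOWLEDGE_BASE[i].feature;
        }
    }
    return count;
}
```

A real knowledge base would be far larger and would likely tokenize the source rather than rely on substring search.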
Views of SAVE Light
Analysis of OS Variability
• Fraunhofer has a knowledge base (KB) of OS functions/files for different OS types
• KB was used for CARA analysis
• Automatically identified OS types of CARA:
– VxWorks
– Simulation of VxWorks APIs using Windows APIs
Analysis of OS Variability
sim.cpp
src/AD_Reader.c
src/BP_Reader.c
src/CARA_CUII.c
src/CARA_Calculations.c
src/CARA_DA.c
src/CARA_Globals.c
src/CARA_Globals.h
src/CARA_Interface.c
src/CARA_Macroes.c
src/CARA_Main.c
src/CARA_Timer.c
src/CARA_Types.h
src/COM_Reader.cpp
src/VxWorksSim.h
src/dscud.h
#ifdef WIN32
#include "VxWorksSim.h"
#endif
• CARA architecture lacks an OS abstraction
• OS concerns present in several files
Analysis of OS Variability
File Name | Count of #if WIN32
src/Interface.c | 40
src/Cara_Main.c | 6
src/Cara_DA.c | 5
• Unnecessary complexity to manage OS variants
• The 40 #if directives could have been avoided if the architecture had an OS abstraction layer (OSAL)
OSAL in NASA CFS
• One generic interface for different OS types
• Implementations for each OS type
• At build time, developers can select the OS of interest
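The idea of one generic interface with per-OS implementations selected at build time can be sketched as follows. This is not the cFS OSAL API; `osal_task_delay_ms` and `app_wait_between_samples` are hypothetical names, and only Windows and POSIX branches are shown. A VxWorks implementation would be a further branch behind the same declaration.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L  /* expose nanosleep() in strict ISO C modes */
#endif

#include <assert.h>
#include <stddef.h>

/* ---- Generic interface: the only declaration application code sees. ---- */
int osal_task_delay_ms(unsigned int milliseconds);  /* returns 0 on success */

/* ---- One implementation per OS type, chosen at build time. ---- */
#if defined(_WIN32)
#include <windows.h>

int osal_task_delay_ms(unsigned int milliseconds)
{
    Sleep(milliseconds);
    return 0;
}
#else  /* POSIX systems */
#include <time.h>

int osal_task_delay_ms(unsigned int milliseconds)
{
    struct timespec request;
    request.tv_sec = (time_t)(milliseconds / 1000u);
    request.tv_nsec = (long)(milliseconds % 1000u) * 1000000L;
    return nanosleep(&request, NULL) == 0 ? 0 : -1;
}
#endif

/* Application code calls the generic interface and contains no #if. */
int app_wait_between_samples(unsigned int period_ms)
{
    assert(period_ms > 0u);
    return osal_task_delay_ms(period_ms);
}
```

The conditional compilation is confined to one place, so application files no longer need their own OS-specific directives.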
Summary of OS Variability
• CARA lacks an OSAL
– Complexity to support Windows and VxWorks
• OS variability analysis helped the FDA to run CodeSonar (static analysis tool) on CARA
• Identified the right compiler switch to work around the missing VxWorks-related files
Analysis of Hardware Abstraction
Snippet from CARA documentation:
Because the CARA system was targeted for a specific embedded system platform, test execution was not possible on the development machines.
A test platform was prepared with facilities to simulate sensor inputs and monitor system responses.
Initial difficulties setting up the test hardware postponed the start of testing for a number of months.
Once these problems were addressed, there remained a limited amount of time available to test the increment.
Analysis of Hardware Abstraction
double CARA_READ_EMF(void)
{
#ifndef TEST
    /* Read the EMF value from the A/D board. */
    float Actual_Value;
    Actual_Value = AD(EMF_CHANNEL);
    Actual_Value = (float)(Actual_Value * (-1.0));
    dprintf("EMF %f\n", Actual_Value);
    return (double) Actual_Value;
#else
    return CARA_EMF_VALUE;
#endif
}
• Lack of hardware abstraction layer (HAL)
• Testing could have been facilitated better with a HAL (e.g. stubs)
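One way a HAL could remove the #ifndef TEST switch is to route the A/D read through an interface that tests replace with a stub. The sketch below is an assumption, not CARA code: `AdcHal`, `cara_read_emf`, and `STUB_ADC` are hypothetical names, the channel number is a placeholder, and the debug print from the original function is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical HAL: application code reads A/D channels only through this. */
typedef struct {
    float (*read_channel)(int channel);
} AdcHal;

#define EMF_CHANNEL 0  /* placeholder; the real channel number is not given */

/* Same sign inversion as CARA_READ_EMF, with no conditional compilation. */
double cara_read_emf(const AdcHal *hal)
{
    assert(hal != NULL && hal->read_channel != NULL);
    float actual_value = hal->read_channel(EMF_CHANNEL);
    return (double)(actual_value * -1.0f);
}

/* Test stub standing in for the A/D board: returns a fixed reading. */
static float stub_read_channel(int channel)
{
    return channel == EMF_CHANNEL ? 2.5f : 0.0f;
}

const AdcHal STUB_ADC = { stub_read_channel };
```

On the target, a second `AdcHal` instance would wrap the real board access, so the same `cara_read_emf` runs unchanged on both the development machine and the device.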
Extraction of Runtime Views
• A pool of relations is semi-automatically extracted using our KB and regular expressions
• Extracted relations are stored in a relational database (text file)
• Relational algebraic operators (e.g., Union, Transitive Closure) are used to extract runtime views
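The two operators named above can be illustrated on a small boolean matrix representation of a binary relation. This is a sketch under assumptions: the slides store relations in a text file and query them with RPA, whereas the `Relation` type, its fixed size, and the function names here are invented for the example. The closure uses Warshall's algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTITIES 8

/* A binary relation over entities 0..MAX_ENTITIES-1;
 * pairs[i][j] is true when entity i relates to entity j. */
typedef struct {
    bool pairs[MAX_ENTITIES][MAX_ENTITIES];
} Relation;

/* Union: a pair is in the result if it is in `a` or in `b`. */
Relation relation_union(const Relation *a, const Relation *b)
{
    assert(a != NULL && b != NULL);
    Relation result;
    for (size_t i = 0; i < MAX_ENTITIES; i++) {
        for (size_t j = 0; j < MAX_ENTITIES; j++) {
            result.pairs[i][j] = a->pairs[i][j] || b->pairs[i][j];
        }
    }
    return result;
}

/* Transitive closure (Warshall): adds (i, j) whenever j is reachable
 * from i through one or more pairs of the input relation. */
Relation transitive_closure(const Relation *r)
{
    assert(r != NULL);
    Relation closure = *r;
    for (size_t k = 0; k < MAX_ENTITIES; k++) {
        for (size_t i = 0; i < MAX_ENTITIES; i++) {
            for (size_t j = 0; j < MAX_ENTITIES; j++) {
                if (closure.pairs[i][k] && closure.pairs[k][j]) {
                    closure.pairs[i][j] = true;
                }
            }
        }
    }
    return closure;
}
```

For instance, the union of a "calls" relation and an "includes" relation gives a combined "depends on" relation, and its closure answers which entities are reachable indirectly.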
Task creation view
[Figure: tasks CARA_Main, CARA_Log_Svc, CARA_CUII_Svc, CARA_Display_Svc, CARA_B2B_Broker, CARA_Warn_Svc, CARA_Alarm_Svc, CARA_KVO_SERVICE, BP_Reader, COM_Reader, and AD_Reader; three tasks are marked <<optional>>. Legend: Task; Creation of Task]
Task-Queue-Task View
[Figure: tasks CARA_Main, CARA_Log_Svc, CARA_CUII_Svc, CARA_Display_Svc, CARA_B2B_Broker, CARA_Warn_Svc, CARA_Alarm_Svc, CARA_KVO_SERVICE, BP_Reader, COM_Reader, and AD_Reader (three marked <<optional>>), connected through the message queues CARA_DSPQ_ID, CARA_MSGQ_ID, CUII_MSGQ_ID, CARA_LOGQ_ID, and BP_Reader_Q_ID. Legend: Task; Read from msg queue; Write to msg queue]
Task-Queue-Task View
• Useful to reason about overall complexity
• Useful to reason about testability of each task
– Can tasks be unit tested independently?
• Tasks also communicate using shared variables (see demo)
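A task-queue-task view can be derived by joining "task writes to queue" facts with "task reads from queue" facts on the queue name. The sketch below is illustrative only: the `QueueAccess` and `TaskLink` types and the `join_on_queue` function are hypothetical, and it does not cover communication through shared variables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A task/queue pair, used for both "writes to" and "reads from" facts. */
typedef struct {
    const char *task;
    const char *queue;
} QueueAccess;

/* A derived edge: `sender` can reach `receiver` through `queue`. */
typedef struct {
    const char *sender;
    const char *queue;
    const char *receiver;
} TaskLink;

/*
 * Joins the write facts with the read facts on the queue name and stores
 * up to `max_out` resulting links in `out`. Returns the number stored.
 */
size_t join_on_queue(const QueueAccess *writes, size_t n_writes,
                     const QueueAccess *reads, size_t n_reads,
                     TaskLink *out, size_t max_out)
{
    assert(writes != NULL && reads != NULL && out != NULL);
    size_t count = 0;
    for (size_t w = 0; w < n_writes; w++) {
        for (size_t r = 0; r < n_reads; r++) {
            if (count < max_out &&
                strcmp(writes[w].queue, reads[r].queue) == 0) {
                out[count].sender = writes[w].task;
                out[count].queue = writes[w].queue;
                out[count].receiver = reads[r].task;
                count++;
            }
        }
    }
    return count;
}
```

A task whose only inputs arrive through queues found this way is a candidate for independent unit testing, since a test can feed its input queue directly.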
Identifying high-risk software modules by combining information
Structures
Defects
Clones
© 2007 Fraunhofer CESE
High Level Architecture
• The high level architecture seems fairly organized and clean; there are, however, worries on the left
High Priority Bugs
[Figure: bug counts per component]
• Number of high priority bugs for each high level component
– mocclient is the buggiest package, with 118 bugs
– dsdm, shareclient, and oamclient also contain many highly severe bugs
Duplicates
• Number of duplicates shared among high level components (duplicate dependencies)
– mocclient and oamclient share most duplicates
– snif shares duplicates with many other components
– Same high priority FindBugs bug in several cloned files
[Figure: duplicate counts between components; callout: Cloned bugs]
Conclusion
• Overview of reverse engineering
• Knowledge-based reverse engineering is a simple, yet promising idea
• Architectural analysis offers complementary views on FDA’s static analysis
– Helps to configure static analysis
– Helps to plan static analysis on small portions
• Detected issues of CARA
– Lack of OS abstraction layer
– Lack of hardware abstraction layer
– File structure and task structure are not symmetric
Summary
• Architecture can be reverse engineered using external dependencies
• Multiple views are required to reason about software quality
• Specified architecture can be compared with the actual architecture
• Fraunhofer has a wealth of experience in product lines, architecture, and reverse engineering
Sample Publications
• Developing an Approach for Analyzing and Verifying System Communication, Aerospace Conference, 2009
• Verifying Architectural Design Rules of the Flight Software Product Line, Software Product Line Conference (SPLC), 2009
• Connecting Research and Practice: An Experience Report on Research Infusion with SAVE, Innovations in Systems and Software Engineering: A NASA Journal, 2010
• D. Ganesan. Software Architecture Discovery for Testability, Performance, and Maintainability of Industrial Systems. PhD Thesis, Vrije Universiteit Amsterdam, 2012 (http://dare.ubvu.vu.nl/handle/1871/32693)
Contact information
• www.fc-md.umd.edu
• Mikael Lindvall
– 240-487-2902
• Dharma Ganesan
– 240-487-2915