A Framework for the Assessment and Selection of Software Components and Connectors in COTS-based Architectures
Jesal Bhuta, Chris Mattmann
{jesal, mattmann}@usc.edu
USC Center for Systems & Software Engineering
http://csse.usc.edu
February 13, 2007
Outline
Motivation and Context
COTS Interoperability Evaluation Framework
Demonstration
Experimentation & Results
Conclusion and Future Work
COTS-Based Applications Growth Trend
The number of systems using OTS components is steadily increasing
– USC e-Services projects show the proportion of CBAs rising from 28% in 1997 to 70% in 2002
– The Standish Group's 2000 survey found a similar trend (54%) in industry [Standish 2001, Extreme Chaos]
[Chart: CBA growth trend, 1997-2002 — percentage of CBAs per year among USC e-Services projects, alongside the Standish Group results]
COTS Integration: Issues
COTS products are built with their own sets of assumptions, which are not always compatible
– Example: integrating a Java-based Customer Relationship Management (CRM) system with Microsoft SQL Server: the CRM supports JDBC, while SQL Server supports ODBC
[Diagram: Java CRM (speaks JDBC) connected to Microsoft SQL Server (speaks ODBC) — mismatched database access interfaces]
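One period-typical workaround for this particular mismatch was Sun's JDBC-ODBC bridge driver, shipped with contemporary JDKs, which lets a Java application drive an ODBC data source through the JDBC API. A minimal sketch, assuming a hypothetical ODBC Data Source Name crm_db and a customers table:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BridgeExample {
    public static void main(String[] args) throws Exception {
        // Load Sun's JDBC-ODBC bridge driver (bundled with JDKs of this era).
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");

        // "crm_db" is a hypothetical ODBC DSN configured to point at SQL Server.
        Connection conn = DriverManager.getConnection("jdbc:odbc:crm_db");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT name FROM customers");
        while (rs.next()) {
            System.out.println(rs.getString("name"));
        }
        conn.close();
    }
}
```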
Case Study [Garlan et al. 1995]
Develop a software architecture toolkit
COTS selected:
– OBST, a public-domain object-oriented database
– InterViews, a GUI toolkit
– SoftBench, an event-based tool integration mechanism
– Mach RPC interface generator, an RPC mechanism
Estimated integration effort: 6 months and 1 person-year
Actual integration effort: 2 years and 5 person-years
Problem: Reduced Trade-Off Space
Detailed interoperability assessment is effort-intensive
– Requires detailed analysis of interfaces and COTS characteristics, plus prototyping
A large number of COTS products is available in the market
– Over 100 CRM solutions and over 50 databases yield 5,000 possible combinations
As a result, interoperability assessment is often neglected until late in the development cycle
Together these shrink the trade-off space: medium- and low-priority requirements end up chosen over the cost of integrating the COTS products
[Diagram: selection funnel — a large number of COTS choices is filtered first by high-priority functional criteria, then by medium- and low-priority functional criteria, leaving COTS product types A, B, C, and D]
Statement of Purpose
To develop an efficient and effective COTS interoperability assessment framework by:
1. Utilizing existing research and observations to introduce concepts for representing COTS products
2. Developing rules that define when specific interoperability mismatches could occur
3. Synthesizing (1 and 2) to develop a comprehensive framework for performing interoperability assessment early (late inception) in the system development cycle
Efficient: Acting or producing effectively with a minimum of unnecessary effort
Effective: Producing the desired effect (effort reduction during COTS integration)
Proposed Framework: Scope
Specifically addresses the problem of technical interoperability
Does not address non-technical interoperability issues:
– Human-computer interaction incompatibilities
– Inter/intra-organization incompatibilities
IRR – Inception Readiness Review; LCO – Life Cycle Objective Review; LCA – Life Cycle Architecture Review; IOC – Initial Operational Capability
[Boehm 2000]
[Diagram: development timeline (Inception, Elaboration, Construction) with anchor points IRR, LCO, LCA, and IOC — the proposed framework is applied after architecture conceptualization and COTS software product identification, ahead of detailed analysis & prototyping and integration & testing; this is the high return-on-investment area]
Motivating Example: Large-Scale Distributed Scenario
Manage and disseminate digital content (planetary science data)
Data disseminated at multiple intervals
Two user classes separated by geographically distributed networks (the Internet):
– Scientists from the European Space Agency (ESA)
– External users
[Diagram: deployment architecture — within each organization's intranet, a digital asset management system and additional planetary data feed a query manager, data store, and data retrieval components, exchanging digital content and metadata over high-voluminous-data connectors C1 and C2; a third high-voluminous-data connector (C3) carries data between NASA JPL (Pasadena, USA) and ESA (Madrid, Spain), and on to external user systems composed of custom/COTS components]
Interoperability Evaluation Framework Interfaces
[Diagram: framework interfaces — the developer provides the COTS components and proposed system architecture to the COTS Interoperability Evaluator (StudioI), which applies the COTS representation attributes and the integration rules and strategies; it outputs a COTS interoperability analysis report plus estimated lines of glue code, which the COCOTS glue-code estimation model [Abts 2002] turns into cost and effort estimates to integrate the COTS products]
COTS Representation Attributes
COTS General Attributes (4): Name, Role*, Type, Version

COTS Interface Attributes* (14): Binding*, Communication Language Support*, Control Inputs*, Control Outputs*, Control Protocols*, Error Handling Inputs*, Error Handling Outputs*, Extensions*, Data Inputs*, Data Outputs*, Data Protocols*, Data Format*, Data Representation*, Packaging*

COTS Internal Assumption Attributes (16): Backtracking, Control Unit, Component Priorities, Concurrency, Distribution, Dynamism, Encapsulation, Error Handling Mechanism, Implementation Language*, Layering, Preemption, Reconfiguration, Reentrant, Response Time, Synchronization, Triggering Capability

COTS Dependency Attributes* (6): Communication Dependency*, Communication Incompatibilities*, Deployment Language*, Execution Language Support*, Underlying Dependency*, Same Node Incompatibilities*

* indicates the attribute or attribute set can have multiple values
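To make the attribute sets concrete, here is a deliberately simplified data-structure sketch; the field names paraphrase the attribute sets above and are not the framework's actual schema (only a subset of the 40 attributes is shown):

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified carrier for the four attribute sets above;
// starred attributes become lists so they can hold multiple values.
public class CotsDefinition {
    // COTS general attributes (4)
    String name;
    List<String> roles;                    // e.g. "Platform"
    String type;                           // e.g. "Third-party component"
    String version;

    // COTS interface attributes (14), one entry per named interface
    Map<String, CotsInterface> interfaces;

    // COTS internal assumption attributes (16) -- subset shown
    boolean backtracking;
    String controlUnit;                    // e.g. "Central"
    String concurrency;                    // e.g. "Multi-threaded"
    String errorHandlingMechanism;         // e.g. "Notification"

    // COTS dependency attributes (6) -- subset shown
    List<String> communicationDependencies;
    List<String> underlyingDependencies;   // e.g. "Linux", "Windows"
}

// Per-interface attributes -- subset of the 14 shown.
class CotsInterface {
    String binding;                        // e.g. "Runtime Dynamic"
    List<String> controlInputs;            // e.g. "Procedure call", "Trigger"
    List<String> dataProtocols;            // e.g. "HTTP"
    List<String> errorInputs;              // error notifications accepted
    List<String> errorOutputs;             // e.g. "Logs", "HTTP Error Codes"
}
```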
COTS Definition Example: Apache 2.0
COTS General Attributes (4):
  Name: Apache
  Role: Platform
  Type: Third-party component
  Version: 2.0

COTS Dependency Attributes (4):
  Communication Dependency: None
  Deployment Language: Binary
  Execution Language Support: CGI
  Underlying Dependencies: Linux, Unix, Windows, Solaris (OR)

Interface Attributes (14), Backend Interface / Web Interface:
  Binding: Runtime Dynamic / Topologically Dynamic
  Communication Language Support: C, C++
  Control Inputs: Procedure call, Trigger
  Control Outputs: Procedure call, Trigger, Spawn
  Control Protocols: None
  Error Inputs: –
  Error Outputs: Logs / HTTP Error Codes
  Data Inputs: Data access, Procedure call, Trigger
  Data Outputs: Data access, Procedure call, Trigger
  Data Protocols: HTTP
  Data Format: N/A / N/A
  Data Representation: ASCII, Unicode, Binary (both interfaces)
  Extensions: Supports Extensions
  Packaging: Executable Program / Web service

COTS Internal Assumption Attributes (16):
  Backtracking: No
  Control Unit: Central
  Component Priorities: No
  Concurrency: Multi-threaded
  Distribution: Single-node
  Dynamism: Dynamic
  Encapsulation: Encapsulated
  Error Handling Mechanism: Notification
  Implementation Language: C++
  Layering: None
  Preemption: Yes
  Reconfiguration: Offline
  Reentrant: Yes
  Response Time: Bounded
  Synchronization: Asynchronous
  Triggering Capability: Yes
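Continuing the illustrative sketch from the previous slide (same hypothetical CotsDefinition and CotsInterface classes, not the tool's real input format), the Apache 2.0 definition might be populated like this:

```java
import java.util.Arrays;
import java.util.HashMap;

public class ApacheDefinitionExample {
    static CotsDefinition apache20() {
        CotsDefinition d = new CotsDefinition();
        d.name = "Apache";
        d.roles = Arrays.asList("Platform");
        d.type = "Third-party component";
        d.version = "2.0";
        // OR-choice: any one of these platforms satisfies the dependency.
        d.underlyingDependencies = Arrays.asList("Linux", "Unix", "Windows", "Solaris");
        d.controlUnit = "Central";
        d.concurrency = "Multi-threaded";
        d.errorHandlingMechanism = "Notification";

        CotsInterface web = new CotsInterface();
        web.binding = "Topologically Dynamic";
        web.dataProtocols = Arrays.asList("HTTP");
        web.errorOutputs = Arrays.asList("HTTP Error Codes");

        d.interfaces = new HashMap<String, CotsInterface>();
        d.interfaces.put("Web Interface", web);
        return d;
    }
}
```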
COTS Interoperability Evaluation Framework
[Diagram: tool architecture — a project analyst defines the architecture and COTS combinations through the Architecting User Interface Component; the COTS Definition Generator populates a COTS Definition Repository, whose COTS definitions, together with the deployment architecture, drive the Integration Analysis Component; integration rules come from the Integration Rules Repository; the COTS Connector Selector exchanges connector queries/responses and connector options with the Level of Service Connector Selection Framework; the interoperability analysis framework and COTS selection framework together produce the COTS Interoperability Analysis Report]
Integration Rules
Interface analysis rules
– Example: ‘Failure due to incompatible error communication’
Internal assumption analysis rules
– Example: ‘Data connectors connecting components that are not always active’
Dependency analysis rules
– Example: ‘Parent node does not support dependencies required by the child components’
Each rule includes: pre-conditions, results
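One way such rules could be encoded, as a hedged sketch (the interface and the Architecture/Connector stand-ins are illustrative, not the tool's actual design):

```java
import java.util.List;

// Illustrative rule shape: pre-conditions evaluated over an architecture,
// plus the mismatch reported when they hold.
interface IntegrationRule {
    boolean preconditionsHold(Architecture arch);
    String result();
}

// Minimal stand-ins for the artifacts a rule inspects.
class Architecture {
    List<CotsDefinition> components;
    List<Connector> connectors;
}

class Connector {
    CotsDefinition from, to;
    String kind;   // e.g. "data", "control", "data+control"
}
```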
Integration Rules: Interface Analysis
‘Failure due to incompatible error communication’
– Pre-conditions:
  Two components (A and B) communicating via data and/or control (bidirectional)
  One component's (A) error handling mechanism is ‘notify’
  The two components have incompatible error output/error input methods
– Result:
  A failure in component A will not be communicated to component B, causing a permanent block or failure in component B
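Expressed against the illustrative rule shape above, this rule might look like the following (the attribute accesses mirror the representation attributes; the matching logic is a simplification):

```java
import java.util.Collections;

// Simplified check for 'failure due to incompatible error communication':
// fires when A notifies on errors but B accepts none of A's error outputs.
class ErrorCommunicationRule implements IntegrationRule {
    private final CotsDefinition a, b;
    private final CotsInterface aSide, bSide;   // the two communicating interfaces

    ErrorCommunicationRule(CotsDefinition a, CotsInterface aSide,
                           CotsDefinition b, CotsInterface bSide) {
        this.a = a; this.aSide = aSide;
        this.b = b; this.bSide = bSide;
    }

    public boolean preconditionsHold(Architecture arch) {
        boolean aNotifies = "Notification".equals(a.errorHandlingMechanism);
        // Incompatible when no error output of A matches an error input of B.
        boolean incompatible = Collections.disjoint(aSide.errorOutputs, bSide.errorInputs);
        return aNotifies && incompatible;
    }

    public String result() {
        return "A failure in " + a.name + " will not reach " + b.name
                + ", risking a permanent block or failure in " + b.name;
    }
}
```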
Integration Rules: Internal Assumption Analysis
‘Data connectors connecting components that are not always active’
– Pre-conditions:
  Two components connected via a data connector
  One of the components does not have a central control unit
– Result: potential data loss
[Diagram: Component A connected to Component B via a pipe]
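The same illustrative shape fits this rule; again a simplification, assuming ‘always active’ corresponds to a central control unit as in the pre-conditions:

```java
// Simplified check for 'data connectors connecting components that are not
// always active': a data connector endpoint without a central control unit
// may not be running when data arrives, so data can be lost.
class DataConnectorActivityRule implements IntegrationRule {
    private final Connector c;

    DataConnectorActivityRule(Connector c) { this.c = c; }

    public boolean preconditionsHold(Architecture arch) {
        boolean isDataConnector = c.kind.contains("data");
        boolean bothCentral = "Central".equals(c.from.controlUnit)
                           && "Central".equals(c.to.controlUnit);
        return isDataConnector && !bothCentral;
    }

    public String result() {
        return "Potential data loss between " + c.from.name + " and " + c.to.name;
    }
}
```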
Integration Rules: Dependency Analysis
‘Parent node does not support dependencies required by the child components’
– Pre-condition: a component in the system requires one or more software components to function
– Result: the component will not function as expected
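And a sketch of the dependency rule under the same illustrative shape (real dependency resolution, e.g. OR-choices among platforms, is richer than this):

```java
import java.util.List;

// Simplified check for 'parent node does not support dependencies required
// by the child components': every underlying dependency of a deployed
// component must be present on its parent node.
class NodeDependencyRule implements IntegrationRule {
    private final CotsDefinition component;
    private final List<String> softwareOnParentNode;

    NodeDependencyRule(CotsDefinition component, List<String> softwareOnParentNode) {
        this.component = component;
        this.softwareOnParentNode = softwareOnParentNode;
    }

    public boolean preconditionsHold(Architecture arch) {
        // Treats every listed dependency as required; OR-choices such as
        // "Linux OR Windows" would need any-of matching instead.
        return !softwareOnParentNode.containsAll(component.underlyingDependencies);
    }

    public String result() {
        return component.name + " will not function as expected (unsatisfied dependencies)";
    }
}
```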
Voluminous Data Intensive Interaction Analysis
An extension-point implementation of the Level of Service Connector Selector
Distribution connector profiles (DCPs)
– Data access, distribution, and streaming metadata [Mehta et al. 2000] captured for each profiled connector
– Can be generated manually, or using an automatic process
Distribution scenarios
– Constraint queries phrased against the architectural vocabulary of data distribution: Total Volume, Number of Users, Number of User Types, Delivery Intervals, Data Types, Geographic Distribution, Access Policies, Performance Requirements
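As a sketch only (the field names paraphrase the eight scenario dimensions above), a distribution scenario could be carried in a small record:

```java
import java.util.List;

// Hypothetical carrier for the eight scenario dimensions listed above.
class DistributionScenario {
    long totalVolumeGB;                // total volume
    int numberOfUsers;
    int numberOfUserTypes;
    int deliveryIntervals;
    List<String> dataTypes;            // e.g. "planetary science imagery"
    String geographicDistribution;     // e.g. "WAN/Internet"
    String accessPolicies;             // e.g. "restricted to ESA scientists"
    String performanceRequirements;    // e.g. "bounded delivery time"
}
```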
Voluminous Data Intensive Interaction Analysis
Need to understand the relationship between the scenario dimensions and the connector metadata
– If we understood the relationship, we would know which connectors to select for a given scenario
The current approach allows both Bayesian inference and linear equations as means of relating the connector metadata to the scenario dimensions (a toy sketch of the linear flavor follows below)
For our motivating example:
– 3 connectors, C1-C3
– Profiled 12 major OTS connector technologies, including bbFTP, GridFTP, UDP bursting technologies, FTP, etc.
– Applied the selection framework to “rank” the most appropriate of the 12 OTS connector solutions for given example scenarios
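A toy sketch of the linear-equation flavor of this idea; the weights, chosen dimensions, and fit functions are invented for illustration, and the framework's actual model is more involved:

```java
import java.util.HashMap;
import java.util.Map;

// Toy linear scoring: each connector gets a weighted sum of per-dimension
// fit values in [0,1]; higher scores rank the connector as more appropriate.
class ConnectorRanker {
    // Invented weights for three of the scenario dimensions.
    static final double W_VOLUME = 0.4, W_USERS = 0.3, W_GEOGRAPHY = 0.3;

    // A fit function compares one connector's profile metadata against one
    // scenario dimension; left abstract here.
    interface Fit { double of(String connectorName, DistributionScenario s); }

    static Map<String, Double> rank(Iterable<String> connectors, DistributionScenario s,
                                    Fit volume, Fit users, Fit geography) {
        Map<String, Double> scores = new HashMap<String, Double>();
        for (String c : connectors) {
            scores.put(c, W_VOLUME * volume.of(c, s)
                        + W_USERS * users.of(c, s)
                        + W_GEOGRAPHY * geography.of(c, s));
        }
        return scores;
    }
}
```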
Voluminous Data Intensive Interaction Analysis
Precision-recall analysis
– Evaluated the framework against 30 real-world data distribution scenarios: 10 high-volume, 9 medium-volume, and 11 low-volume
– Used expert analysis to develop an “answer key” for each scenario: a set of “right” connectors and a set of “wrong” connectors
Applied the Bayesian and linear-programming connector selection algorithms
– Clustered the ranked connector lists using k-means clustering (k=2) to derive a comparable answer key for each algorithm
– Bayesian selection algorithm: 80% precision; linear programming: 48%
– The Bayesian algorithm is more “white box”, the linear algorithm more “black box”; the white-box approach performed better
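For reference, a minimal sketch of the precision and recall computations used in this style of evaluation (the set contents come from the expert answer key and the clustered algorithm output):

```java
import java.util.HashSet;
import java.util.Set;

class PrecisionRecall {
    // precision = |selected ∩ right| / |selected|
    static double precision(Set<String> selected, Set<String> right) {
        Set<String> hits = new HashSet<String>(selected);
        hits.retainAll(right);
        return selected.isEmpty() ? 0.0 : (double) hits.size() / selected.size();
    }

    // recall = |selected ∩ right| / |right|
    static double recall(Set<String> selected, Set<String> right) {
        Set<String> hits = new HashSet<String>(selected);
        hits.retainAll(right);
        return right.isEmpty() ? 0.0 : (double) hits.size() / right.size();
    }
}
```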
Demonstration
Experiment 1
Conducted in a graduate software engineering course on 8 projects
– 6 projects were COTS-based applications: 2 web-based (3-tier) projects, 1 shared-data project, 1 client-server project, 1 web-service interaction project, and 1 single-user system
– Teams applied this framework before the RLCA* milestone on their respective projects
– Data collected using surveys: immediately after the interoperability assessment, and after the completion of the project
* Rebaselined Life Cycle Architecture
Experiment 1 Results
Dependency Accuracy*
  Pre-framework application:  mean 79.3%, std. dev. 17.9
  Post-framework application: mean 100%, std. dev. 0
  P-value: 0.017

Interface Accuracy**
  Pre-framework application:  mean 76.9%, std. dev. 14.4
  Post-framework application: mean 100%, std. dev. 0
  P-value: 0.0029

Actual Assessment Effort
  Projects using this framework:                mean 1.53 hrs, std. dev. 1.71
  Equivalent projects not using this framework: mean 5 hrs, std. dev. 3.46
  P-value: 0.053

Actual Integration Effort
  Projects using this framework:                mean 9.5 hrs, std. dev. 2.17
  Equivalent projects not using this framework: mean 18.2 hrs, std. dev. 3.37
  P-value: 0.0003

* Accuracy of dependency assessment: 1 – (number of unidentified dependencies / total number of dependencies)
** Accuracy of interface assessment: 1 – (number of unidentified interface interaction mismatches / total number of interface interactions)
Accuracy: a quantitative measure of the magnitude of error [IEEE 1990]
Experiment 2 – Controlled Experiment
                              Treatment Group   Control Group
Number of students            75                81
On-campus students            60                65
DEN students                  15                16
Average experience            1.473 years       1.49 years
Average on-campus experience  0.54 years        0.62 years
Average DEN experience        5.12 years        5 years
Experiment 2 – Cumulative Results
Hypothesis IH1: Dependency Accuracy
  Treatment group (75): mean 100%, std. dev. 0
  Control group (81):   mean 72.5%, std. dev. 11.5
  P-value < 0.0001 (t = 20.7; sdev = 8.31; DOF = 154)

Hypothesis IH2: Interface Accuracy
  Treatment group (75): mean 100%, std. dev. 0
  Control group (81):   mean 80.5%, std. dev. 13.0
  P-value < 0.0001 (t = 13.0; sdev = 9.37; DOF = 154)

Hypothesis IH3: Actual Assessment Effort
  Treatment group (75): mean 72.8 min, std. dev. 28.8
  Control group (81):   mean 185 min, std. dev. 104
  P-value < 0.0001 (t = -9.04; sdev = 77.5; DOF = 154)
Experiment 2 – On-Campus Results
Hypothesis IH1: Dependency Accuracy
  Treatment group (60): mean 100%, std. dev. 0
  Control group (65):   mean 72.6%, std. dev. 11.8
  P-value < 0.0001 (t = 17.9; sdev = 8.50; DOF = 123)

Hypothesis IH2: Interface Accuracy
  Treatment group (60): mean 100%, std. dev. 0
  Control group (65):   mean 80.4%, std. dev. 12.6
  P-value < 0.0001 (t = 12.0; sdev = 9.12; DOF = 123)

Hypothesis IH3: Actual Assessment Effort
  Treatment group (60): mean 67.1 min, std. dev. 23.1
  Control group (65):   mean 183 min, std. dev. 100
  P-value < 0.0001 (t = -8.75; sdev = 74.2; DOF = 123)
Conclusion and Future Work
Results (so far) indicate a “sweet spot” in small e-services projects
Framework-based tool automates initial interoperability analysis:
– Interface, internal assumption, and dependency mismatches
Further experimental analysis is ongoing:
– Different software development domains
– Projects with greater COTS complexity
Additional quality-of-service extensions
Questions