Open Source Software
Objects in the mirror are closer than they appear….
Phil Laplante, PhD, PE
Professor of Software Engineering
January 2007 Penn State Open Source Software Initiative 2
Overview
• What is open source software?
• Open source software processes and evolution
• Business models
• Open source software research
• Penn State University Open Source Software Initiative
OSI Open Source Definition
• Free Redistribution
• Include Source Code
• Derived Works under Same License
• Integrity of the Author's Work
• No discrimination against persons or groups
• No discrimination against Fields of Endeavor
• Distribution of License
• License Not Specific to a Product
• License Must Not Restrict Other Software
• Technology Neutral
Open Source Repositories
• SourceForge (140K projects, 1.5M users)
• Freshmeat
• Google
• GOCCR (Government Open Code Collaborative Repository)
• RubyForge
• Many more…
Why Open Source Software (OSS)?
• No black boxes.
• Availability of source code and ability to modify it.
• The right to use the software in any way.
• The software is not dependent on a single entity (e.g. Microsoft).
• The right to redistribute with modifications (under the constraints of the license).
• Quality: “Given enough eyeballs, all bugs are shallow,” Eric Raymond (same goes for security flaws).
• Financial issues: no per-copy cost to use (e.g. compare Open Office to MS Office license).
Philosophy of OSS
• Main ideas based on Eric Raymond’s ‘The Cathedral and the Bazaar’ (1999)
• Raymond contrasts:
– Traditional software development, in which a few people plan a cathedral in splendid isolation, with …
– The new collaborative “bazaar” form of OSS development
OSS Organizations
• Free Software Foundation (GNU)
• Open Source Initiative (OSI)
• European Commission's Open Source Observatory: clearinghouse of info for the EU
• Mozilla, Apache, Eclipse Foundations
• Many other country- and state-based initiatives
US Cultural Acceptance
• 6 Years Ago
– Linux is a “tool” for Network (Unix) Admins, a “toy” for Enterprise Network Admins, and a model for academia
• 4 Years Ago
– Linux used for cost savings and network efficiency
– Open Source tools for Network Administration
– OSS like Firefox, Tomcat, Eclipse, etc. found in many enterprises
• 2 Years Ago
– Real Business Applications begin to appear
• Today
– Widespread distribution on the desktop
– Widely used software development tools
Distros
• A distro (= distribution) is a collection of OSS based on a core Linux kernel and a series of tools and scripts.
• In 2003, over 350 distros were available.
• Starving at the buffet table.
• Companies have emerged to manage distros.
Desktop
• The Open Source desktop has made significant strides in the past 2-3 years.
• Distros now provide alternatives that can completely replace all tools typically found on a fully-loaded MS workstation.
– RedHat (KDE and Gnome)
– SuSE (KDE, Gnome, Ximian)
– Others.
– Open Office, virus checkers, graphics editors, etc.
Examples
• CRM/SFA – SugarCRM
• ERP – Compiere
• BI/KM – Pentaho
• Reporting – Jaspersoft
• Mail – Zimbra
• Phone Switches – Asterisk/Digium
• Radio Network Operations – Salem Communications
• Magnatune – Record Label
• More...
Software Development Tools
• Open source languages like PHP, Perl, Ruby, Python, (now) Java used extensively in industry
• Struts framework for the model-view-controller architecture
• Swing structure for managing (typically) business objects.
• CVS, Subversion for source code control.
• Maven and Ant for builds
• Bugzilla for bug tracking
• And much, much more
Global Cultural Acceptance
• Peru – Cost of desktop 4 times IT Budget
• 34 other countries banning or limiting use of Microsoft
• Software Libre – Global Initiative
• Munich – Displacing Microsoft desktops
• Red Flag – China is evolving its own Linux variant for use across its government and country.
• Many more...
3 Types of OSS Projects (Nakakoji et al., 2002)
• Exploration-Oriented
– Objective: Sharing innovation & knowledge
– Control Style: Cathedral-like; central control
– Community Structure: Project leader; many readers
– Major Problems: Subject to split
– Examples: GNU systems, JUN, Perl
• Utility-Oriented
– Objective: Satisfying an individual need
– Control Style: Bazaar-like; decentralized control
– Community Structure: Many peripheral developers; peer support to passive users
– Major Problems: Difficult to choose the right program
– Examples: Linux system (excluding the kernel, which is Exploration-Oriented)
• Service-Oriented
– Objective: Providing stable services
– Control Style: Council-like; central control
– Community Structure: Core members instead of a project leader; many passive users that develop systems for end users
– Major Problems: Less innovation
– Examples: Apache, PostgreSQL
F/OSS System Size Categories
• OSS can be:
– Small (<5K SLOC)
– Medium (5K – 100K SLOC)
– Large (100K – 1000K SLOC)
– Very Large (>1M SLOC)
• The large and very large systems are the fewest in number but the most widely known.
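The size bands above can be sketched as a small classifier. This is only an illustration: the function name is hypothetical, and the handling of the exact boundaries (5K, 100K, 1M SLOC) is an assumption, since the slide does not say which band a boundary value belongs to.

```python
def classify_by_sloc(sloc: int) -> str:
    """Map a source-lines-of-code count to one of the four size bands.

    Boundary handling is an assumption: 5K counts as medium,
    100K as large, and exactly 1M still counts as large.
    """
    if sloc < 0:
        raise ValueError("SLOC must be non-negative")
    if sloc < 5_000:
        return "small"
    if sloc < 100_000:
        return "medium"
    if sloc <= 1_000_000:
        return "large"
    return "very large"
```

For example, `classify_by_sloc(250_000)` falls in the "large" band.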
Standard Definitions for Software Maintenance and Evolution
• Software Maintenance: “…correction of errors, and implementation of modifications needed to allow an existing system to perform new tasks, and to perform old ones under new conditions…” [J. Dvorak, “Conceptual Entropy and its Effect on Class Hierarchies,” Computer, pp. 59-63, 1994]
• Software Evolution: “…the dynamic behavior of programming systems as they are maintained and enhanced over their lifetimes.” [L.A. Belady and M.M. Lehman, “A Model of Large Program Development,” IBM Systems J., vol. 15, no. 1, pp. 225-252, 1976]
No. (Year) | Brief Name | Lehman’s Law
I (1974) | Continuing Change | Software must be continually adapted, or else it becomes progressively less satisfactory in use.
II (1974) | Increasing Complexity | As software evolves, its complexity increases unless work is done to maintain or reduce it.
III (1974) | Self Regulation | Global software evolution processes are self-regulating.
IV (1978) | Conservation of Organizational Stability | Unless feedback mechanisms are appropriately adjusted, the average effective global activity rate in an evolving software system tends to remain constant over product lifetime.
V (1978) | Conservation of Familiarity | Generally, the incremental growth and long-term growth rate of software systems tend to decline.
VI (1991) | Continuing Growth | The functional capability of software must continually increase to maintain user satisfaction over the system lifetime.
VII (1996) | Declining Quality | The quality of software will decline unless it is rigorously adapted to accommodate changes in the operational environment.
VIII (1996) | Feedback System (recognized 1971, formulated 1996) | Software evolution processes are multi-level, multi-loop, multi-agent feedback systems.
Characteristics of OSS Systems
• Punctuated evolution with periods of intensive rewrite
• According to Lehman’s Laws, the mean growth rate would be expected to decrease over time, however…
• Larger OSS projects have been shown to sustain super-linear growth (Linux kernel)
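One way to check a size history for super-linear growth is to fit size ≈ a·t^b on a log-log scale and look at the exponent b: b > 1 suggests super-linear growth, and b < 1 suggests the declining growth rate that Lehman's laws predict. The sketch below is an assumption about how such a check might be done, not the method used in the cited studies. The function name is hypothetical and the data is synthetic, for illustration only.

```python
import math

def growth_exponent(times, sizes):
    """Least-squares slope of log(size) against log(time).

    If size grows roughly like a * time**b, the slope estimates b.
    All times and sizes must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic size histories (SLOC per release), made up for illustration.
releases = [1, 2, 3, 4, 5, 6]
quadratic = [1000 * t ** 2 for t in releases]    # super-linear: exponent near 2
sqrt_like = [1000 * t ** 0.5 for t in releases]  # slowing growth: exponent near 0.5

print(growth_exponent(releases, quadratic) > 1)  # True for this synthetic series
print(growth_exponent(releases, sqrt_like) > 1)  # False for this synthetic series
```

On real release data the fit is noisy, so the exponent is best read as a rough indicator, not a test.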
Types of OSS Development Processes
• Requirements Analysis and Specification
– Takes the form of threaded messages and web site discussions
• CVS, System Build and Incremental Release Review
– Concurrent version systems play an essential role for coordination of decentralized code
Types of OSS Development Processes cont.
• Maintenance as evolutionary redevelopment, reinvention and revitalization
– Sharing, examining, modifying and redistributing concepts and techniques through minor improvements and mutations across many releases with short life cycles.
– End users often act as developers or maintainers: they continually produce mutations that allow the system to adapt to what the “user-developers” want it to do.
Types of OSS Development Processes cont.
• Project management
– Organized in an interlinked layered meritocracy: a hierarchical organizational form that centralizes and concentrates certain kinds of authority, trust and respect for experience and accomplishment within the team
– Operates as a dynamically organized but loosely coupled virtual enterprise.
Evolutionary Patterns for OSS Projects
[Figure: four evolution diagrams, with arrows labeled "feedback", "incorporate", "patch", and "released public versions".]
• (A) GNU WingNut: Exploration-Oriented; single branch; feedback from community
• (B) Apache: Service-Oriented; single branch; patches merged through control
• (C) Linux: Utility-Oriented; multiple versions coexist; tournament style
• (D) JUN: Exploration-Oriented; single branch; feedback from community
Traditional and OSS Development: Essential Distinction
• Use and Reiteration of F/OSS Public Licenses
– Essential component that enables transfer and practice of OSS development
– >50% of the 70,000 OSS projects use the GNU GPL
• GPL preserves and reiterates the beliefs and practices of:
– sharing, examining, modifying and redistributing OSS systems
– OSS assets as property rights for collective freedom.
Studying Evolution in OSS
• Explore the relationship between change rate and complexity during the evolution of an OSS system.
• Look for a correlation between high complexity and low stability components
• Understand properties of OSS for purposes of quality assessment and selection of candidate SW (can you do this with closed source software?)
Simple Metrics
• Lines of code, delta LOC
• Number of touches per file
• Cyclomatic complexity of methods (or functions)
• Use various OO metrics (e.g. the CK suite)
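As a rough illustration of the cyclomatic complexity metric listed above, the sketch below approximates McCabe's measure for Python functions by counting decision points with the standard `ast` module and adding one. The function name and the set of counted node types are assumptions. Dedicated tools count more cases, and the code studied in this talk is not Python.

```python
import ast

# Node types treated as one decision point each (an approximation).
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                   ast.IfExp, ast.comprehension)

def cyclomatic_complexity(source: str) -> dict:
    """Return {function name: approximate cyclomatic complexity}.

    Complexity = 1 + number of decision points inside the function.
    Nested functions are also counted toward their enclosing function.
    """
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1
            for child in ast.walk(node):
                if isinstance(child, _DECISION_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra branches.
                    score += len(child.values) - 1
            result[node.name] = score
    return result

SAMPLE = '''
def sign(x):
    if x > 0 and x < 10:
        return 1
    for _ in range(3):
        pass
    return 0
'''

print(cyclomatic_complexity(SAMPLE))  # {'sign': 4}
```

Here `sign` scores 4: one for the function itself, one for the `if`, one for the `and`, and one for the `for` loop.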
Chidamber-Kemerer (CK) Suite
• weighted methods per class
• depth of inheritance tree
• number of children
• lack of cohesion in methods
• coupling between objects
• response for a class
• More…
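Three of the CK metrics are simple enough to sketch with Python class introspection: depth of inheritance tree (DIT), number of children (NOC), and weighted methods per class (WMC) with every method given weight 1. The helper names and the `Shape` classes are hypothetical. Counting `object` as depth 0 and using unit weights are assumptions, and real CK tools for Java may use other conventions.

```python
import inspect

def depth_of_inheritance(cls) -> int:
    """DIT: longest path from cls up to the root class; object has depth 0."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)

def number_of_children(cls) -> int:
    """NOC: number of immediate subclasses of cls."""
    return len(cls.__subclasses__())

def weighted_methods(cls) -> int:
    """WMC with unit weights: methods defined directly on cls."""
    return sum(1 for value in vars(cls).values() if inspect.isfunction(value))

# Hypothetical class hierarchy used only to exercise the helpers.
class Shape:
    def area(self):
        return 0.0
    def describe(self):
        return "shape"

class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r ** 2

class Square(Shape):
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

class UnitCircle(Circle):
    def __init__(self):
        super().__init__(1.0)

print(depth_of_inheritance(UnitCircle))  # 3
print(number_of_children(Shape))         # 2
print(weighted_methods(Circle))          # 2
```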
Cycles and Coupling
Dependency graph showing a) high coupling with cycles; b) low coupling and no cycles.
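The contrast in the caption can be made concrete with a small sketch that counts dependency edges as a crude coupling measure and searches for a cycle with depth-first search. The graph encoding, function names, and component names are all hypothetical, and edge count is only one of several possible coupling measures.

```python
def coupling(graph) -> int:
    """Crude coupling measure: total number of distinct dependency edges."""
    return sum(len(set(deps)) for deps in graph.values())

def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    graph maps each component to the components it depends on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:          # back edge: dep is on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Hypothetical component graphs: (a) tangled, (b) layered.
tangled = {"ui": ["core", "io"], "core": ["io", "ui"], "io": ["core"]}
layered = {"ui": ["core"], "core": ["io"], "io": []}

print(coupling(tangled), find_cycle(tangled))  # 5 ['core', 'io', 'core']
print(coupling(layered), find_cycle(layered))  # 2 None
```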
[Figure: class diagram of the Java imaging API (JAI 1.1), showing its classes and interfaces, for example PlanarImage, OpImage, RenderedImageAdapter, OperationRegistry, and ParameterBlockJAI.]
Example: Imaging vs Non-Imaging SW

Application      CCD    DMS    LCOM     MC      LOD-F  WLC    BCC
Non-Imaging #1   381    0.711  86.603   11.643  140    0.08   2.125
Non-Imaging #2   968    0.438  13.077   11.066  316    0.106  1.541
Non-Imaging #3   3071   0.324  284.19   15.157  3440   0.021  2.918
Average          1473   0.491  128      12.62   1299   0.069  2.19
Imaging #1       2749   0.528  144.625  13.379  4290   0      2.825
Imaging #2       787    0.315  72.897   8.86    252    0.008  2.677
Imaging #3       2143   0.637  108.662  9.869   177    0.078  2.353
Average          1893   0.493  109      10.71   1573   0.029  2.62
Difference       28%    0%     -17%     -18%    21%    -137%  20%

CCD: cumulative component dependency; DMS: distance from main sequence (A+I=1); LCOM: lack of cohesion in methods; MC: McCabe complexity; WLC: weighted lines of code; BCC: byte code complexity
Example: Imaging vs Non-Imaging SW: Lack of Cohesion in Methods (LCOM)
[Figure: chart of LCOM (y-axis 0 to 300) for applications 1 to 3, comparing non-imaging software with imaging software.]
Structure 101
• Tool by Headway Software for analyzing code structure
• Replaced the “Review” tool, which determined more than 60 different structural measures.
• "fat" refers to the interdependencies in a given package, and
• "tangle" refers to cyclic dependencies between packages.
XS Breakdown - XS Contribution Related to Metric and Scope
[Figure: chart of percent XS (y-axis 0 to 100) by release (x-axis 1 to 39), with series Avg XS, Tangled (design), Fat (design), Fat (leaf), Fat (class), and Fat (method).]
Evolutionary Trajectory
[Figure: XS and size (lines of code and/or number of classes) plotted against time, with three slopes marked m1, m2, and m3. Annotations: code refactoring initiative (one or more iterations possible); design restructuring initiative (one or more iterations possible); architecture reengineering needed.]
• m1 = code rot rate
• m2 = design decay rate
• m3 = architectural degradation
Evolutionary Trajectory II
[Figure: XS plotted against time, annotated with code refactoring initiatives A, B, and C and a design restructuring initiative.]
Whether it is a code refactoring initiative or a design restructuring initiative depends on which is reduced (fat for code, tangle for design).
Major Players
• Unisys
• IBM
• Sun
• Novell
• RedHat
• SpikeSource
• Oracle
• Microsoft
• SCO
– …
Business Models
• OSS may change business models in the acquisition and use of software.
• The value proposition of every part of the chain will be altered, directly affecting the costs, profitability, deliverables and expectations of each.
Food-Chain Terminology
• Software Firm
• Resellers
– VAR, Wholesaler, Retailers, Web
• Vendors, Partners, and Consultants
– Hardware Sales
– Implementors
– Integrators
– Developers
– Training/Doc
• Software Teams
• Resellers?
• Community
– Virtual Teams
– Commercial Open Source Company
• Community Partners & Consultants
– Value Added Integrator (VAI), Web
– Implementor
– Integrator
– Consultancy
Proprietary Software Companies – How Do They Make Money?
• Software License Sales
• Support on Software Sales
• Professional Services
– Typically an after-thought or necessity
– Large clients or large contracts
– Expands when software sales slump... usually at expense of partners
• Partner Models
– Training & Certification
• Documentation
Open Source Software Teams – How Will They Make Money?
• Licensing
• Support
• Professional Services
• Product Certification
– Training
– Documentation
Other Entities (PSU?) – How Can They Make Money from OSS?
• Support
• Education
• Strategy
• Product Review/Selection
• Product Validation/Surety
• Product Implementation/Configuration
• Product customization (not a layer or wrapper... but the base, source code)
• Training (Core Product and Customization)
• Documentation
Industries Prime for Open Source
• Network Operating System
– All/All Public and Private Sector
  • Federal Agencies
  • State Agencies
  • Financial
  • Automotive
  • etc.
• Application Layer
– Non-Regulated Industry Sectors
– Non-Profits, Education, Legal
– SMBs
– Retail POS
Licensing
• Roughly 100+ types of licenses currently in use claiming to be Open Source variants.
• OSI has approved 58 license types as being compliant with their stated criteria/goals.
• 4 “baseline” types from the late 90s:
– GPL
– LGPL
– BSD
– MIT
Some Research Areas
• OSS business models
• Architecture modernization using Open Source Components
• Metrics for comparison and evaluation of open source code
• Verification and validation approaches for open source code
• Evolution of code structure in open source repositories
• Dynamics of open source communities
• More…
Penn State University Open Source Software Initiative
Mission
• To make the Greater Philadelphia region an international focal point for open source software innovation, commercialization, and research.
Participants
• Penn State faculty
• Industry partners*
• Government agencies (local and national)
• Open source entities (e.g. OSF, GNU)
• Faculty from other regional universities
• Guests and visitors
* Several interested partners are already identified
Activities
• Research
• Evangelization
• Best practices dissemination
• Fostering industry partnerships
• Training
Funding Model
• Corporate sponsors will provide majority of funding (need lead sponsor)
• Looking for three-year commitment totaling ??? per year to fund:
– Research Associate / Graduate Students
– Faculty release time
– Lab and leasehold expenses
– Travel and publications expenses
• Grants from local economic development sources will also be sought
Affiliated Faculty
• Phillip A. Laplante (Director), PE, PhD (Stevens)
• Colin Neill, PhD (Swansea)
• Raghvinder Sangwan, PhD (Temple)
• Pam Vercellone-Smith, PhD (Delaware)
Acknowledgements
• The following individuals contributed significantly to this presentation:
– Dr. Pam Vercellone-Smith
– Tom Costello, President, Upstreme, Inc.
Recent Publications
• Raghvinder S. Sangwan, Phillip A. Laplante and Pamela Vercellone-Smith, “Measuring the Complexity of Design in Real-Time Imaging Software,” to appear, Proc. 3rd Conference on Real-Time Image Processing, San Jose, 2007.
• Colin J. Neill, “Will Commercialization of Open Source Drive the Volunteers Away?” IT Professional, Vol. 9, No. 1, January-February 2007, pp. 50-52.
• Albert Elcock and Phillip A. Laplante, “Testing Without Requirements,” to appear, Innovations in Systems and Software Engineering, Fall 2006.
• Magnus E. Larsson and Phillip A. Laplante, “On the Complexity of Design in Imaging Software,” Proc. 11th IEEE Conference on Engineering of Complex Computer Systems, Palo Alto, CA, August 2006, pp. 37-42. (50%) (Larsson is PSGV student)
• Melissa M. Simmons, Pam Vercellone-Smith and Phillip A. Laplante, “Understanding Open Source Software through Software Archaeology: The Case of Nethack,” Proc. 30th NASA Software Engineering Workshop, Columbia, MD, April 2006.
References
• Capiluppi, Andrea, Alvaro E. Faria, and Juan Ramil. 2005. Proceedings of the Ninth European Conference on Software Maintenance and Reengineering.
• Godfrey, Michael and Qiang Tu. 2001. Growth, Evolution and Structural Change in Open Source Software.
• Goldman, R. and Richard Gabriel. 2005. Innovation Happens Elsewhere. Morgan Kaufmann.
• Koch, Stephan. 2005. Evolution of Open Source Software Systems – A large scale investigation. Proceedings of the 1st Int’l Conference on OSS.
• Nakakoji, Kumiyo, Yasuhiro Yamamoto, Yoshiyuki Nishinaka, Kouichi Kishida and Yunwen Ye. May 2002. Evolution Patterns of Open-Source Software Systems and Communities. Proceedings of the International Workshop on Principles of Software Evolution. ACM Press.
• Raymond, Eric S. 2001. The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. O’Reilly.
• Scacchi, Walt. 2004. “Understanding Open Source Software Evolution”, revised version to appear in N.H. Madhavji, M.M. Lehman, J.F. Ramil and D. Perry (eds.), Software Evolution, John Wiley and Sons Inc, New York, 2004.
• Warren, Robert, Omar Nafees and John Champaign. 2004. CS846: Topics in Software Evolutions and Design. Case Study – Nethack. http://www.cs.uwaterloo.ca/~omnafees/cs846/group_asst_1/NafeesWarreenChampaign-nethack-report.pdf