
Open Source Software

Objects in the mirror are closer than they appear….

Phil Laplante, PhD, PE

Professor of Software Engineering

January 2007 Penn State Open Source Software Initiative 2

Overview
• What is open source software?
• Open source software processes and evolution
• Business models
• Open source software research
• Penn State University Open Source Software Initiative

What is Open Source Software?

OSI Open Source Definition
• Free Redistribution
• Include Source Code
• Derived Works under Same License
• Integrity of the Author's Work
• No Discrimination against Persons or Groups
• No Discrimination against Fields of Endeavor
• Distribution of License
• License Not Specific to a Product
• License Must Not Restrict Other Software
• Technology Neutral

Open Source Repositories
• SourceForge (140K projects, 1.5M users)
• Freshmeat
• Google
• GOCCR (Government Open Code Collaborative Repository)
• RubyForge
• Many more…

Why Open Source Software (OSS)?
• No black boxes.
• Availability of source code and ability to modify it.
• The right to use the software in any way.
• The software is not dependent on a single entity (e.g. Microsoft).
• The right to redistribute with modifications (under the constraints of the license).
• Quality: “Given enough eyeballs, all bugs are shallow,” Eric Raymond (the same goes for security flaws).
• Financial issues: no per-copy cost to use (e.g. compare Open Office to an MS Office license).

Philosophy of OSS
• Main ideas based on work in Eric Raymond’s ‘The Cathedral and the Bazaar’ (1999)
• Raymond contrasts:
– Traditional software development of a few people planning a cathedral in splendid isolation with …
– The new collaborative “bazaar” form of OSS development

OSS Organizations
• Free Software Foundation (GNU)
• Open Source Initiative (OSI)
• European Commission's Open Source Observatory: clearinghouse of info for the EU
• Mozilla, Apache, Eclipse Foundations
• Many other country- and state-based initiatives

US Cultural Acceptance
• 6 Years Ago
– Linux is a “tool” for Network (Unix) Admins, a “toy” for Enterprise Network Admins, and a model for academia
• 4 Years Ago
– Linux used for cost savings and network efficiency
– Open Source tools for Network Administration
– OSS like Firefox, Tomcat, Eclipse, etc. found in many enterprises
• 2 Years Ago
– Real Business Applications begin to appear
• Today
– Widespread distribution on the desktop
– Widely used software development tools

Distros
• A distro (= distribution) is a collection of OSS based on a core Linux kernel and a series of tools and scripts.
• In 2003, over 350 distros were available.
• Starving at the buffet table.
• Companies have emerged to manage distros.

Desktop
• The Open Source desktop has made significant strides in the past 2-3 years.
• Distros now provide alternatives that can completely replace all tools typically found on a fully-loaded MS workstation.
– RedHat (KDE and Gnome)
– SuSE (KDE, Gnome, Ximian)
– Others.
– Open Office, virus checkers, graphics editors, etc.

Examples
• CRM/SFA – SugarCRM
• ERP – Compiere
• BI/KM – Pentaho
• Reporting – Jaspersoft
• Mail – Zimbra
• Phone Switches – Asterisk/Digium
• Radio Network Operations – Salem Communications
• Record Label – Magnatune
• More...

Software Development Tools
• Open source languages like PHP, Perl, Ruby, Python, (now) Java used extensively in industry
• Struts framework for the model-view-controller architecture
• Spring framework for managing (typically) business objects.
• CVS, Subversion for source code control.
• Maven and Ant for builds
• Bugzilla for bug tracking
• And much, much more

Global Cultural Acceptance
• Peru – Cost of desktop 4 times IT Budget
• 34 other countries banning or limiting use of Microsoft
• Software Libre – Global Initiative
• Munich – Displacing Microsoft desktops
• Red Flag – China is evolving its own Linux variant for use across its government and country.
• Many more...

Open Source Software Processes and Evolution

3 Types of OSS Projects (Nakakoji et al., 2002)
• Exploration-Oriented
– Objective: Sharing innovation & knowledge
– Control style: Cathedral-like; central control
– Community structure: Project leader; many readers
– Major problems: Subject to split
– Examples: GNU systems, JUN, Perl
• Utility-Oriented
– Objective: Satisfying an individual need
– Control style: Bazaar-like; decentralized control
– Community structure: Many peripheral developers; peer support to passive users
– Major problems: Difficult to choose the right program
– Examples: Linux system (excluding the kernel, which is exploration-oriented)
• Service-Oriented
– Objective: Providing stable services
– Control style: Council-like; central control
– Community structure: Core members instead of a project leader; many passive users that develop systems for end users
– Major problems: Less innovation
– Examples: Apache, PostgreSQL

F/OSS System Size Categories
• OSS can be:
– Small (<5K SLOC)
– Medium (5K – 100K SLOC)
– Large (100K – 1000K SLOC)
– Very Large (>1M SLOC)
• The large and very large systems are the fewest in number but the most widely known.
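The size categories above can be expressed as a small lookup. This is a minimal sketch, not part of the original slides: the function name `size_category` is hypothetical, and because the slide's ranges share their endpoints (5K, 100K, 1000K), the choice of which category owns each boundary value is an assumption.

```python
def size_category(sloc: int) -> str:
    """Map a source-lines-of-code count to the slide's size categories.

    Boundary handling is an assumption: 5K counts as Medium,
    100K counts as Large, and exactly 1000K still counts as Large.
    """
    if sloc < 0:
        raise ValueError("SLOC cannot be negative")
    if sloc < 5_000:
        return "Small"
    if sloc < 100_000:
        return "Medium"
    if sloc <= 1_000_000:
        return "Large"
    return "Very Large"


if __name__ == "__main__":
    # Illustrative sizes only; these are not measurements of real projects.
    for sloc in (1_200, 42_000, 750_000, 3_500_000):
        print(sloc, size_category(sloc))
```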

Standard Definitions for Software Maintenance and Evolution:
• Software Maintenance: “…correction of errors, and implementation of modifications needed to allow an existing system to perform new tasks, and to perform old ones under new conditions…” [J. Dvorak, “Conceptual Entropy and its Effect on Class Hierarchies,” Computer, pp. 59-63, 1994]
• Software Evolution: “…the dynamic behavior of programming systems as they are maintained and enhanced over their lifetimes.” [L.A. Belady and M.M. Lehman, “A Model of Large Program Development,” IBM Systems J., vol. 15, no. 1, pp. 225-252, 1976]

Lehman’s Laws (No., year, brief name, law)
• I (1974) Continuing Change: Software must be continually adapted, or else it becomes progressively less satisfactory in use.
• II (1974) Increasing Complexity: As software evolves, its complexity increases unless work is done to maintain or reduce it.
• III (1974) Self Regulation: Global software evolution processes are self-regulating.
• IV (1978) Conservation of Organizational Stability: Unless feedback mechanisms are appropriately adjusted, the average effective global activity rate in an evolving software system tends to remain constant over the product lifetime.
• V (1978) Conservation of Familiarity: Generally, the incremental growth and long-term growth rate of software systems tend to decline.
• VI (1991) Continuing Growth: The functional capability of software must continually increase to maintain user satisfaction over the system lifetime.
• VII (1996) Declining Quality: The quality of software will decline unless it is rigorously adapted to accommodate changes in the operational environment.
• VIII (1996) Feedback System (recognized 1971, formulated 1996): Software evolution processes are multi-level, multi-loop, multi-agent feedback systems.

Characteristics of OSS Systems
• Punctuated evolution with periods of intensive rewrite
• According to Lehman’s Laws, the mean growth rate would be expected to decrease over time, however…
• Larger OSS projects have been shown to sustain super-linear growth (Linux kernel)
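One way to test a size history for super-linear growth is to fit size ≈ a·t^b on a log-log scale and check whether the exponent b exceeds 1. The sketch below assumes that power-law model, which the slides do not specify. The function name `growth_exponent` is hypothetical, and the series is generated from a known formula rather than measured from any project.

```python
import math


def growth_exponent(times, sizes):
    """Least-squares slope of log(size) against log(time).

    Under the assumed model size = a * t**b, the slope estimates b:
    b > 1 suggests super-linear growth, b < 1 a declining growth rate.
    """
    if len(times) != len(sizes) or len(times) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic series built from a formula (exponent 1.5), not real project data.
releases = list(range(1, 11))
synthetic_sizes = [1000 * t ** 1.5 for t in releases]
exponent = growth_exponent(releases, synthetic_sizes)

if __name__ == "__main__":
    print(f"estimated exponent: {exponent:.3f}")
```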

Types of OSS Development Processes
• Requirements Analysis and Specification
– Takes the form of threaded messages and web site discussions
• CVS, System Build and Incremental Release Review
– Concurrent version systems play an essential role for coordination of decentralized code

Types of OSS Development Processes cont.
• Maintenance as evolutionary redevelopment, reinvention and revitalization
– Sharing, examining, modifying and redistributing concepts and techniques through minor improvements and mutations across many releases with short life cycles.
– End users often act as developers or maintainers and continually produce mutations that allow the system to adapt to what the “user-developers” want it to do.

Types of OSS Development Processes cont.
• Project management
– Organized in an interlinked layered meritocracy: a hierarchical organizational form that centralizes and concentrates certain kinds of authority, trust and respect for experience and accomplishment within the team
– Operates as a dynamically organized but loosely coupled virtual enterprise.


Over 60% of OSS developers participate in 2-10 projects.

Evolutionary Patterns for OSS Projects
[Figure: evolution diagrams for two projects; diagram labels: Feedback, Incorporate]
• (A) GNU WingNut: Exploration-Oriented; single branch; feedback from community
• (B) Apache: Service-Oriented; single branch; patches merged through control

Evolutionary Patterns for OSS Projects
[Figure: evolution diagrams for two projects; diagram labels: patch, released public versions]
• (C) Linux: Utility-Oriented; multiple versions coexist; tournament style
• (D) JUN: Exploration-Oriented; single branch; feedback from community

Traditional and OSS Development: Essential Distinction
• Use and Reiteration of F/OSS Public Licenses
– Essential component that enables transfer and practice of OSS development
– >50% of the 70,000 OSS projects use the GNU GPL
• GPL preserves and reiterates the beliefs and practices of:
– sharing, examining, modifying and redistributing OSS systems
– OSS assets as property rights for collective freedom.

Studying Evolution in OSS
• Explore the relationship between change rate and complexity during the evolution of an OSS system.
• Look for a correlation between high-complexity and low-stability components.
• Understand properties of OSS for purposes of quality assessment and selection of candidate software (can you do this with closed source software?)


Simple Metrics

• Lines of code, delta LOC

• Number of touches per file

• Cyclomatic complexity of methods (or functions)

• Use various OO metrics (e.g. the CK suite)
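The first three metrics above are simple enough to sketch directly. The code below is an illustrative approximation, not the tooling used in the presentation: the helper names are hypothetical, lines of code are counted as non-blank lines, and cyclomatic complexity is approximated as one plus the number of decision points found in a Python syntax tree.

```python
import ast
from collections import Counter

# Node types treated as decision points in this approximation.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def count_loc(source: str) -> int:
    """Count non-blank lines (one of several possible LOC definitions)."""
    return sum(1 for line in source.splitlines() if line.strip())


def delta_loc(old_source: str, new_source: str) -> int:
    """Change in line count between two versions of a file."""
    return count_loc(new_source) - count_loc(old_source)


def touches_per_file(commits):
    """Count how many commits touched each file.

    `commits` is an iterable of file-name lists, one list per commit.
    """
    counts = Counter()
    for files in commits:
        counts.update(set(files))
    return counts


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' contributes two extra branches.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for _ in range(n):
        pass
    return "other"
'''

if __name__ == "__main__":
    print("LOC:", count_loc(SAMPLE))
    print("complexity:", cyclomatic_complexity(SAMPLE))
```

Real metric tools differ in what they count as a line or a branch, so absolute values from this sketch are comparable only with each other.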

Chidamber Kemerer (CK) Suite
• weighted methods per class
• depth of inheritance tree
• number of children
• lack of cohesion in methods
• coupling between objects
• response for a class
• More…
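Two of the CK metrics, depth of inheritance tree (DIT) and number of children (NOC), can be read straight off a class hierarchy. The sketch below does this for Python classes using introspection. The class names are hypothetical, and counting `object` as depth 0 is a convention chosen here, not something the slides specify.

```python
def depth_of_inheritance(cls) -> int:
    """DIT: length of the longest path from cls up to the root class.

    Convention assumed here: `object` itself has depth 0.
    """
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


def number_of_children(cls) -> int:
    """NOC: number of immediate subclasses of cls."""
    return len(cls.__subclasses__())


# Hypothetical hierarchy used only to exercise the two functions.
class Image:
    pass


class RasterImage(Image):
    pass


class VectorImage(Image):
    pass


class TiledImage(RasterImage):
    pass


if __name__ == "__main__":
    for cls in (Image, RasterImage, VectorImage, TiledImage):
        print(cls.__name__, depth_of_inheritance(cls), number_of_children(cls))
```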

High Cohesion, Low Coupling

Low Cohesion, High Coupling

Cycles and Coupling

Dependency graph showing a) high coupling with cycles; b) low coupling and no cycles.
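The contrast in the figure can be checked mechanically on a dependency graph. The sketch below uses a depth-first search to find a cycle and counts outgoing dependencies as a simple coupling measure. The component names and both graphs are hypothetical, and this is one common approach rather than the method used to draw the figure.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    `graph` maps each component to the components it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(graph) | {dep for deps in graph.values() for dep in deps}
    colour = {node: WHITE for node in nodes}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            if colour[dep] == GREY:
                # dep is already on the current path: a cycle is closed.
                return path[path.index(dep):] + [dep]
            if colour[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in sorted(nodes):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def coupling(graph):
    """Number of distinct outgoing dependencies per component."""
    return {node: len(set(deps)) for node, deps in graph.items()}


# Hypothetical graphs: (a) tangled with cycles, (b) layered without cycles.
tangled = {"ui": ["core", "io"], "core": ["io", "ui"], "io": ["core"]}
layered = {"ui": ["core"], "core": ["io"], "io": []}

if __name__ == "__main__":
    print("tangled cycle:", find_cycle(tangled), coupling(tangled))
    print("layered cycle:", find_cycle(layered), coupling(layered))
```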

Java imaging API (JAI 1.1)
[Figure: UML class diagram of the Java imaging API, version 1.1, showing the inheritance and interface relationships among its classes, for example PlanarImage, OpImage, RenderedImage, RenderableImage, ROI, Warp, Interpolation, BorderExtender, ParameterList, PropertySource and OperationRegistry.]

Example: Imaging vs Non-Imaging SW

Application     CCD      DMS      LCOM     MC       LOD-F    WLC      BCC
Non-Imaging #1  381      0.711    86.603   11.643   140      0.08     2.125
Non-Imaging #2  968      0.438    13.077   11.066   316      0.106    1.541
Non-Imaging #3  3071     0.324    284.19   15.157   3440     0.021    2.918
Average         1473     0.491    128      12.62    1299     0.069    2.19
Imaging #1      2749     0.528    144.625  13.379   4290     0        2.825
Imaging #2      787      0.315    72.897   8.86     252      0.008    2.677
Imaging #3      2143     0.637    108.662  9.869    177      0.078    2.353
Average         1893     0.493    109      10.71    1573     0.029    2.62
Difference      28%      0%       -17%     -18%     21%      -137%    20%

CCD: cumulative component dependency; DMS: distance from the main sequence (A + I = 1); LCOM: lack of cohesion in methods; MC: McCabe complexity; WLC: weighted lines of code; BCC: byte code complexity
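The two Average rows can be recomputed from the per-application rows as a sanity check. The sketch below transcribes the six data rows from the table above; the variable and function names are hypothetical, and nothing here reproduces how the original figures were generated.

```python
from statistics import mean

METRICS = ["CCD", "DMS", "LCOM", "MC", "LOD-F", "WLC", "BCC"]

# Rows transcribed from the table above, in column order.
NON_IMAGING = [
    [381, 0.711, 86.603, 11.643, 140, 0.08, 2.125],
    [968, 0.438, 13.077, 11.066, 316, 0.106, 1.541],
    [3071, 0.324, 284.19, 15.157, 3440, 0.021, 2.918],
]
IMAGING = [
    [2749, 0.528, 144.625, 13.379, 4290, 0, 2.825],
    [787, 0.315, 72.897, 8.86, 252, 0.008, 2.677],
    [2143, 0.637, 108.662, 9.869, 177, 0.078, 2.353],
]


def column_means(rows):
    """Average each metric column across the given applications."""
    return {name: mean(col) for name, col in zip(METRICS, zip(*rows))}


non_imaging_avg = column_means(NON_IMAGING)
imaging_avg = column_means(IMAGING)

if __name__ == "__main__":
    print(f"{'Metric':6} {'Non-Imaging':>12} {'Imaging':>12}")
    for name in METRICS:
        print(f"{name:6} {non_imaging_avg[name]:12.3f} {imaging_avg[name]:12.3f}")
```

The recomputed means agree with the Average rows to within the rounding used on the slide.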

Example: Imaging vs Non-Imaging SW: Lack of Cohesion in Methods (LCOM)
[Bar chart: LCOM (axis 0 to 300) for applications 1, 2 and 3; series: Non-Imaging Software, Imaging Software]

Structure 101
• Tool by Headway Software for analyzing code structure
• Replaced the “Review” tool, which determined more than 60 different structural measures.
• “Fat” refers to the interdependencies in a given package, and
• “tangle” refers to cyclic dependencies between packages.

XS Breakdown - XS Contribution Related to Metric and Scope
[Chart: percent XS (axis 0 to 100) by release (1 to 39); series: Avg XS, Tangled (design), Fat (design), Fat (leaf), Fat (class), Fat (method)]

Evolutionary Trajectory
[Figure: XS plotted against time and size (lines of code and/or number of classes), with three slopes m1, m2 and m3]
• m1 = code rot rate
• m2 = design decay rate
• m3 = architectural degradation
• Annotations: code refactoring initiative (one or more iterations of these possible); design restructuring initiative (one or more iterations of these possible); architecture reengineering needed

Evolutionary Trajectory II
[Figure: XS plotted against time; annotations: code refactoring initiative A, code refactoring initiative B, design restructuring initiative, code refactoring initiative C]
Whether it is a code refactoring initiative or a design restructuring depends on which is reduced (fat for code, tangle for design).

Business models

Major Players
• Unisys
• IBM
• Sun
• Novell
• RedHat
• SpikeSource
• Oracle
• Microsoft
• SCO
– …

Business Models
• OSS may change business models in the acquisition and use of software.
• The value proposition of every part of the chain will be altered, directly impacting the costs, profitability, deliverables and expectations of each.

Food-Chain Terminology
• Software Firm
• Resellers
– VAR, Wholesaler, Retailers, Web
• Vendors, Partners, and Consultants
– Hardware Sales
– Implementors
– Integrators
– Developers
– Training/Doc
• Software Teams
• Resellers?
• Community
– Virtual Teams
– Commercial Open Source Company
• Community Partners & Consultants
– Value Added Integrator (VAI), Web
– Implementor
– Integrator
– Consultancy

Proprietary Software Companies – How Do They Make Money?
• Software License Sales
• Support on Software Sales
• Professional Services
– Typically an after-thought or necessity
– Large clients or large contracts
– Expands when software sales slump... usually at expense of partners
• Partner Models
– Training & Certification
• Documentation

Open Source Software Teams – How Will They Make Money?
• Licensing
• Support
• Professional Services
• Product Certification
– Training
– Documentation

Other Entities (PSU?) -- How Can They Make Money from OSS?
• Support
• Education
• Strategy
• Product Review/Selection
• Product Validation/Surety
• Product Implementation/Configuration
• Product customization (not a layer or wrapper... but the base, source code)
• Training (Core Product and Customization)
• Documentation

Industries Prime for Open Source
• Network Operating System
– All/All Public and Private Sector
  • Federal Agencies
  • State Agencies
  • Financial
  • Automotive
  • etc
• Application Layer
– Non-Regulated Industry Sectors
– Non-Profits, Education, Legal
– SMBs
– Retail POS

Licensing
• Roughly 100+ types of licenses currently in use claiming to be Open Source variants.
• OSI has approved 58 license types as being compliant with their stated criteria/goals.
• 4 “baseline” types from the late 90's:
– GPL
– LGPL
– BSD
– MIT

Open Source Software Research

Some Research Areas
• OSS business models
• Architecture modernization using Open Source Components
• Metrics for comparison and evaluation of open source code
• Verification and validation approaches for open source code
• Evolution of code structure in open source repositories
• Dynamics of open source communities
• More…

Penn State University Open Source Software Initiative

Mission

• To make the Greater Philadelphia region an international focal point for open source software innovation, commercialization, and research.

Participants
• Penn State faculty
• Industry partners*
• Government agencies (local and national)
• Open source entities (e.g. OSF, GNU)
• Faculty from other regional universities
• Guests and visitors

* Several interested partners are already identified

Activities
• Research
• Evangelization
• Best practices dissemination
• Fostering industry partnerships
• Training

Funding Model
• Corporate sponsors will provide the majority of funding (need lead sponsor)
• Looking for a three-year commitment totaling ??? per year to fund:
– Research Associate / Graduate Students
– Faculty release time
– Lab and leasehold expenses
– Travel and publications expenses
• Grants from local economic development sources will also be sought

Affiliated Faculty
• Phillip A. Laplante (Director), PE, PhD (Stevens)
• Colin Neill, PhD (Swansea)
• Raghvinder Sangwan, PhD (Temple)
• Pam Vercellone-Smith, PhD (Delaware)

Acknowledgements
• The following individuals contributed significantly to this presentation:
– Dr. Pam Vercellone-Smith
– Tom Costello, President, Upstreme, Inc.

Recent Publications
• Raghvinder S. Sangwan, Phillip A. Laplante and Pamela Vercellone-Smith, “Measuring the Complexity of Design in Real-Time Imaging Software,” to appear, Proc. 3rd Conference on Real-Time Image Processing, San Jose, 2007.
• Colin J. Neill, “Will Commercialization of Open Source Drive the Volunteers Away?” IT Professional, Vol. 9, No. 1, January-February 2007, pp. 50-52.
• Albert Elcock and Phillip A. Laplante, “Testing Without Requirements,” to appear, Innovations in Systems and Software Engineering, Fall 2006.
• Magnus E. Larsson and Phillip A. Laplante, “On the Complexity of Design in Imaging Software,” Proc. 11th IEEE Conference on Engineering of Complex Computer Systems, Palo Alto, CA, August 2006, pp. 37-42. (50%) (Larsson is a PSGV student)
• Melissa M. Simmons, Pam Vercellone-Smith and Phillip A. Laplante, “Understanding Open Source Software through Software Archaeology: The Case of Nethack,” Proc. 30th NASA Software Engineering Workshop, Columbia MD, April 2006.

References
• Capiluppi, Andrea, Alvaro E. Faria, and Juan Ramil. 2005. Proceedings of the Ninth European Conference on Software Maintenance and Reengineering.
• Godfrey, Michael and Qiang Tu. 2001. Growth, Evolution and Structural Change in Open Source Software.
• Goldman, R. and Richard Gabriel. 2005. Innovation Happens Elsewhere. Morgan Kaufmann.
• Koch, Stephan. 2005. Evolution of Open Source Software Systems – A large scale investigation. Proceedings of the 1st Int’l Conference on OSS.
• Nakakoji, Kumiyo, Yasuhiro Yamamoto, Yoshiyuki Nishinaka, Kouichi Kishida and Yunwen Ye. May 2002. Evolution Patterns of Open-Source Software Systems and Communities. Proceedings of the International Workshop on Principles of Software Evolution. ACM Press.

References
• Raymond, Eric S. 2001. “The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary”. O’Reilly.
• Scacchi, Walt. 2004. “Understanding Open Source Software Evolution”, revised version to appear in N.H. Madhavji, M.M. Lehman, J.F. Ramil and D. Perry (eds.), Software Evolution, John Wiley and Sons Inc, New York, 2004.
• Warren, Robert, Omar Nafees and John Champaign. 2004. CS846: Topics in Software Evolutions and Design. Case Study – Nethack. http://www.cs.uwaterloo.ca/~omnafees/cs846/group_asst_1/NafeesWarreenChampaign-nethack-report.pdf