conceptual framework for agent-based modeling and simulation: the computer experiment

28
Conceptual Framework for Age Conceptual Framework for Age nt-Based Modeling and Simula nt-Based Modeling and Simula tion: tion: The Computer Experiment The Computer Experiment Yongqin Gao Yongqin Gao Vincent Freeh Vincent Freeh Greg Madey Greg Madey CSE Department CSE Department CS Department CS Department CSE Departmen CSE Departmen t University of Notre Dame University of Notre Dame NCSU NCSU Univer Univer sity of Notre Dame sity of Notre Dame NAACSOS Conference NAACSOS Conference Pittsburgh, PA Pittsburgh, PA June 25, 2003 June 25, 2003 Supported in part by the Supported in part by the National Science Foundation - Digital Society & Techn National Science Foundation - Digital Society & Techn

Upload: lindsay

Post on 17-Mar-2016

44 views

Category:

Documents


1 download

DESCRIPTION

Conceptual Framework for Agent-Based Modeling and Simulation: The Computer Experiment. Yongqin GaoVincent Freeh Greg Madey CSE DepartmentCS Department CSE Department University of Notre DameNCSU University of Notre Dame NAACSOS Conference Pittsburgh, PA - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Conceptual Framework for AgenConceptual Framework for Agent-Based Modeling and Simulation:t-Based Modeling and Simulation: The Computer ExperimentThe Computer Experiment

Yongqin GaoYongqin Gao Vincent FreehVincent Freeh Greg Madey Greg MadeyCSE DepartmentCSE Department CS DepartmentCS Department CSE Department CSE DepartmentUniversity of Notre DameUniversity of Notre Dame NCSUNCSU University of Notre Da University of Notre Dameme

NAACSOS ConferenceNAACSOS ConferencePittsburgh, PAPittsburgh, PAJune 25, 2003June 25, 2003

Supported in part by the Supported in part by the National Science Foundation - Digital Society & Technology PrograNational Science Foundation - Digital Society & Technology Programm

Page 2: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

The Computer The Computer ExperimentExperiment

Page 3: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Agent-Based Simulation as Agent-Based Simulation as a Component of the a Component of the

Scientific MethodScientific MethodModeling(Hypothesis)

Agent -BasedSimulation(Experiment)

Observation

Social NetworkModel of F/OSS

Grow ArtificialSourceForge

Analysis ofSourceForge

Data

Page 4: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

OutlineOutline• Investigation: Free/Open Source Software Investigation: Free/Open Source Software

(F/OSS)(F/OSS)• Conceptual framework(s)Conceptual framework(s)• Model descriptionModel description• ER modelER model• BA modelBA model• BA model with constant fitnessBA model with constant fitness• BA model with dynamic fitnessBA model with dynamic fitness• SummarySummary

Page 5: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Open Source Software (OSS)Open Source Software (OSS)• Free …Free …

– to view sourceto view source– to modifyto modify– to shareto share– of costof cost

• ExamplesExamples– ApacheApache– PerlPerl– GNUGNU– LinuxLinux– SendmailSendmail– PythonPython– KDEKDE– GNOMEGNOME– MozillaMozilla– Thousands moreThousands more

LinuxGNU

Savannah

Page 6: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Free Open Source Software Free Open Source Software (F/OSS)(F/OSS)

• DevelopmentDevelopment– Mostly volunteerMostly volunteer– Global teamsGlobal teams– Virtual teamsVirtual teams– Self-organized - often peer-based meritocracy Self-organized - often peer-based meritocracy – Self-managed - but often a “charismatic” Self-managed - but often a “charismatic”

leaderleader– Often large numbers of developers, testers, Often large numbers of developers, testers,

support help, end user participationsupport help, end user participation– Rapid, frequent releasesRapid, frequent releases– Mostly unpaidMostly unpaid

Page 7: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

TypicalTypicalCharismatCharismat

ic ic Leaders?Leaders?

Linus TolvaldsLinux

Larry WallPerl

Richard StallmanGNU Manifesto

Eric RaymondCathedral and Bazaar

Page 8: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

F/OSS: SignificanceF/OSS: Significance• Contradicts traditional Contradicts traditional

wisdom:wisdom:– Software engineeringSoftware engineering– Coordination, large Coordination, large

numbersnumbers– Motivation of developersMotivation of developers– QualityQuality– SecuritySecurity– Business strategyBusiness strategy

• Almost everything is done Almost everything is done electronically and electronically and available in digital form available in digital form

• Opportunity for Social Opportunity for Social Science Research -- large Science Research -- large amounts of online data amounts of online data availableavailable

• Research issues:Research issues:– Understanding motivesUnderstanding motives– Understanding Understanding

processesprocesses– Intellectual propertyIntellectual property– Digital divideDigital divide– Self-organizationSelf-organization– Government policyGovernment policy– Impact on innovationImpact on innovation– EthicsEthics– Economic modelsEconomic models– Cultural issuesCultural issues– International factorsInternational factors

Page 9: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

SourceForgeSourceForge• VA Software• Part of OSDN• Started 12/1999• Collaboration tools• 58,685 Projects• 80,000 Developers• 590,00 Registered Users

Page 10: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

SavannahSavannah• Uses SourceForge Software • Free Software Foundation•1,508 Projects•15,265 Registered Users

Page 11: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

F/OSS: Importance

Major Component of e-Technology Infrastructure with major presence in

e-Commercee-Sciencee-Governmente-Learning

Apache has over 65% market share of Internet Web serversLinux on over 7 million computersMost Internet e-mail runs on SendmailTens of thousands of quality productsPart of product offerings of companies like IBM, Apple

Apache in WebSphere, Linux on mainframe, FreeBSD in OSXCorporate employees participating on OSS projects

Page 12: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Free/Open Source Free/Open Source SoftwareSoftware

• Seems to challenge traditional economic Seems to challenge traditional economic assumptionsassumptions

• Model for software engineeringModel for software engineering• New business strategiesNew business strategies

– Cooperation with competitorsCooperation with competitors– Beyond trade associations, shared industry research, Beyond trade associations, shared industry research,

and standards processes — shared product and standards processes — shared product development!development!

• Virtual, self-organizing and self-managing teamsVirtual, self-organizing and self-managing teams• Social issues, e.g., digital divide, international Social issues, e.g., digital divide, international

participationparticipation• Government policy issues, e.g., US software Government policy issues, e.g., US software

industry, impact on innovation, security, industry, impact on innovation, security, intellectual property intellectual property

Page 13: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Research ModelResearch Model

Parameter Values

Structural Features

Parameter Values

Cross Validation

Structural Features

Combined Data MiningParameter Values

Understanding the Social and Task

Dynamics that Predict Developer Behaviors

Social Network Analysis : Longitudinal

Study of Preferential Attachment and Dynamic

Attachment

Conceptual Explanatory Model of

OSS: Agent-Based Modeling and Simulation

Page 14: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Data Collection — Data Collection — MonthlyMonthly

• Web crawler (scripts)Web crawler (scripts)– PythonPython– PerlPerl– AWKAWK– SedSed

• MonthlyMonthly• Since Jan 2001 Since Jan 2001 • ProjectIDProjectID• DeveloperIDDeveloperID• Almost 2 million recordsAlmost 2 million records• Relational databaseRelational database

PROJ|DEVELOPER8001|dev3788001|dev89758001|dev99728002|dev276508005|dev313518006|dev125098007|dev193958007|dev46228007|dev356118008|dev8975

Page 15: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

15850 dev[46]dev[83] 15850 dev[46]

dev[48]

15850 dev[46]dev[56]

15850 dev[46]dev[58]

6882 dev[58]dev[47]

6882 dev[47]dev[79]

6882 dev[47]dev[52]

6882 dev[47]dev[55]

7028 dev[46]dev[99]

7028 dev[46]dev[51]

7028 dev[46]dev[57]

7597 dev[46]dev[45]

7597 dev[46]dev[72]

7597 dev[46]dev[55]

7597 dev[46]dev[58]

7597 dev[46]dev[61]

7597 dev[46]dev[64]7597 dev[46]

dev[67]

7597 dev[46]dev[70]

9859 dev[46]dev[49]9859 dev[46]

dev[53]

9859 dev[46]dev[54]

9859 dev[46]dev[59]

dev[46]

dev[83] dev[56]

dev[48]

dev[52]

dev[79]

dev[72]

dev[51]

dev[57]

dev[55]

dev[99]

dev[47]

dev[58]

dev[53]

dev[58]

dev[65]

dev[45]

dev[70]

dev[67]

dev[59]

dev[54]

dev[49]

dev[64]

dev[61]

Project 6882

Project 9859

Project 7597

Project 7028

Project 15850

F/OSS Developers - Social Network ComponentDevelopers are nodes / Projects are links

24 Developers5 Projects

2 Linchpin Developers1 Cluster

Page 16: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Models of the F/OSS Social Models of the F/OSS Social NetworkNetwork

(Alternative Hypotheses)(Alternative Hypotheses)• General model featuresGeneral model features– Agents are nodes on a graph (developers or projects) Agents are nodes on a graph (developers or projects) – Behaviors: Create, join, abandon and idleBehaviors: Create, join, abandon and idle– Edges are relationships (joint project participation)Edges are relationships (joint project participation)– Growth of network: random or types of preferential Growth of network: random or types of preferential

attachment, formation of clustersattachment, formation of clusters– FitnessFitness – Network attributes: diameter, average degree, Network attributes: diameter, average degree,

power law, clustering coefficientpower law, clustering coefficient• Four specific modelsFour specific models

– ER (random graph) ER (random graph) – BA (scale free)BA (scale free)– BA ( + constant fitness)BA ( + constant fitness)– BA ( + dynamic fitness)BA ( + dynamic fitness)

Page 17: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

ER model – degree ER model – degree distributiondistribution

• Degree distribution is binomial distribution while it is power law in empirical data

• R2 = 0.9712 for developer network

• R2 = 0.9815 for project network

Page 18: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

ER model - diameterER model - diameter• Average degree is Average degree is

decreasing while it decreasing while it is increasing in is increasing in empirical dataempirical data

• Diameter is Diameter is increasing while it increasing while it is decreasing in is decreasing in empirical dataempirical data

Page 19: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

ER model – clustering ER model – clustering coefficientcoefficient

• Clustering Clustering coefficient is coefficient is relatively low around relatively low around 0.4 while it is around 0.4 while it is around 0.7 in empirical data.0.7 in empirical data.

• Clustering Clustering coefficient is coefficient is decreasing while it is decreasing while it is increasing in increasing in empirical dataempirical data

Page 20: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

ER model – cluster ER model – cluster distributiondistribution

• Cluster distribution in ER Cluster distribution in ER model also have power model also have power law distribution with Rlaw distribution with R22 as as 0.6667 (0.9953 without 0.6667 (0.9953 without the major cluster) while the major cluster) while RR22 in empirical data is in empirical data is 0.7457 (0.9797 without 0.7457 (0.9797 without the major cluster)the major cluster)

• The actual distribution is The actual distribution is different from empirical different from empirical datadata

• The later models (BA and The later models (BA and further models) have further models) have similar behaviorssimilar behaviors

Page 21: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

BA model – degree BA model – degree distributiondistribution

• Power laws in degree Power laws in degree distribution, similar to distribution, similar to empirical data (+ for empirical data (+ for simulated data and x simulated data and x for empirical data).for empirical data).

• For developer For developer distribution: simulated distribution: simulated data has Rdata has R22 as 0.9798 as 0.9798 and empirical data has and empirical data has RR22 as 0.9712. as 0.9712.

• For project distribution: For project distribution: simulated data has Rsimulated data has R22 as 0.6650 and empirical as 0.6650 and empirical data has Rdata has R22 as 0.9815. as 0.9815.

Page 22: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

BA model – diameter and BA model – diameter and CCCC

• Small diameter Small diameter and high clustering and high clustering coefficient like coefficient like empirical dataempirical data

• Diameter and Diameter and clustering clustering coefficient are both coefficient are both decreasing like decreasing like empirical dataempirical data

Page 23: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

BA model with constant BA model with constant fitnessfitness

• Power laws in degree Power laws in degree distribution, similar to distribution, similar to empirical data (+ for empirical data (+ for simulated data and x for simulated data and x for empirical data).empirical data).

• For developer distribution: For developer distribution: simulated data has R2 as simulated data has R2 as 0.9742 and empirical data 0.9742 and empirical data has R2 as 0.9712.has R2 as 0.9712.

• For project distribution: For project distribution: simulated data has R2 as simulated data has R2 as 0.7253 and empirical data 0.7253 and empirical data has R2 as 0.9815.has R2 as 0.9815.

• Diameter and CC are Diameter and CC are similar to simple BA model.similar to simple BA model.

Page 24: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

BA model with dynamic BA model with dynamic fitnessfitness

• Power laws in degree diPower laws in degree distribution, similar to estribution, similar to empirical data (+ for simmpirical data (+ for simulated data and x for eulated data and x for empirical data).mpirical data).• For developer distributiFor developer distribution: simulated data has on: simulated data has R2 as 0.9695 and empiriR2 as 0.9695 and empirical data has R2 as 0.971cal data has R2 as 0.9712.2.• For project distribution:For project distribution: simulated data has R2 simulated data has R2 as 0.8051 and empirical as 0.8051 and empirical data has R2 as 0.9815.data has R2 as 0.9815.

Page 25: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Advantage of BA with dynamic Advantage of BA with dynamic fitnessfitness

• Intuition: Fitness should decreasing Intuition: Fitness should decreasing with time.with time.

• Statistics: project has life cycle Statistics: project has life cycle behavior which can not be replicated behavior which can not be replicated by BA model with constant fitness by BA model with constant fitness but can be replicated by BA model but can be replicated by BA model with dynamic fitnesswith dynamic fitness

Page 26: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Conceptual FrameworkConceptual FrameworkAgent-Based Modeling and Agent-Based Modeling and

Simulation as Components of Simulation as Components of the Scientific Methodthe Scientific Method

Observation

Hypothesis

Experiment

Page 27: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

SummarySummary• We use ABM to model and simulate the We use ABM to model and simulate the

SourceForge collaboration network.SourceForge collaboration network.• Conceptual framework is proposed for Conceptual framework is proposed for

agent-based modeling and simulation.agent-based modeling and simulation.• Case study of this framework: Case study of this framework:

SourceForge study through ER, BA, BA SourceForge study through ER, BA, BA with constant fitness and BA with with constant fitness and BA with dynamic fitness.dynamic fitness.

Page 28: Conceptual Framework for Agent-Based Modeling and Simulation:  The Computer Experiment

Thank youThank you