unsupervised machine learning in agent-based modeling · 2018-03-02 · agent-based modeling is...

,-

II. Agent-Based ModelAgent-basedmodelingis“aformofcomputationalmodelingwherebyaphenomenonismodeledintermsofagentsandtheirinteractions”[1].IntheWolfSheepPredationmodel,wolves(black)andsheep(white)moverandomlyonafieldofgrassanddirtpatches.Whenawolfandasheepoccupythesamespace,thewolfeatsthesheep.Ifawolforsheepconsumesasufficientamountoffood,itreproduces.Ifawolforsheepgoeslongenoughwithoutfood,itdiesout.Afinaldetail:Ifthe‘grass?’toggleisON,grasscanbedepletedbythesheep(resultinginabrownpatch,whichwilleventuallygrowback).If‘grass?’ isOFF,thengrassisaninfiniteresourceforthesheep.

I. OverviewAgent-basedmodels(ABMs)areusedbyresearchersinavarietyoffieldstomodelnaturalphenomena.InanABM,awiderangeofbehaviorsandoutcomescanbeobservedbasedontheparametersofthemodel.Inmanycases,thesebehaviorscanbecategorizedintodiscreteoutcomesidentifiablebyhumanobservers.Ourgoalwastouseclusteringalgorithmstoidentifythoseoutcomesfrommodeloutputdata.Ifthistaskcanbecompletedreliablybyacomputer,itwillmakethetaskofinvestigatinganABMeasierforhumanusers.Forthisproject,weusedtheWolfSheepPredationmodelthatcanbefoundinthemodelslibraryofNetlogo 5.3.1[2].Fordataanalysisandvisualization,weusedPython3.5.2 andthescikit-learnandmatplotlib libraries.

Unsupervised Machine Learning in Agent-Based ModelingLuke Robinson and Forrest Stonedahl*

Mathematics and Computer Science Department, Augustana College *Faculty Advisor

V. Overarching Goals and Future Work

VI. References[1]Wilensky,U.,&Rand,W.(2015). AnIntroductiontoagent-based

modeling:modelingnatural,social,andengineeredcomplexsystemswithNetLogo.Cambridge,Mass.:MITPress.

[2]Wilensky,U.1999.NetLogo.http://ccl.northwestern.edu/netlogo/CenterforConnectedLearningandComputer-BasedModeling,NorthwesternUniversity.Evanston,IL.

[3]Sklearn.cluster.DBSCAN.RetrievedFebruary13,2017,fromhttp://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html

[4]2.3.Clustering.RetrievedFebruary13,2017,fromhttp://scikit-learn.org/stable/modules/clustering.html

,-

BehaviorSpace Experiment

DBSCANDBSCANisshortfor‘Density-BasedSpatialClusteringofApplicationswithNoise’[3]. Thisalgorithmfindscoresamplesofhighdensityandexpandsclustersfromthese.Anepsilon valueof50setsthemaximumdistancebetweentwopointstobeinthesameneighborhood.Avalueof30formin_samples specifieshowmanydatapointsmustbeinaneighborhoodforittobeconsideredthecoreofacluster.Theblackdotsonthediagonalfringewereclassifiedasnoise(notpartofanycluster).

BehaviorSpace isatoolinNetlogo thatallowstheusertoperformanexperimentthatcarriesoutmanyrunsofamodelwithdifferentcombinationsofthemodel’sparameters.ThedataforthisprojectcamefromaBehaviorSpace experimentwiththefollowingparameters:

Outputmeasures:countsheep,countwolves

Stopcondition:notany?turtlesorcountsheep>1500

Timelimit: 1500stepsTotalruns: 24,300

K-Means(n_clusters=2)

silhouettescore=.86K-MeansK-Meansrequirestheusertospecifythenumberofclusters,anditclustersdatabytryingtoseparatesamplesintogroupsofequalvariance[4].Ourdataformstwoclusters:simulationswherethesheeppopulationexploded,andsimulationswhereitdidn’t.K-Meansgenerallyfoundthesetwoclusters,butmisclassifiedsomepointsontheboundary(thebluepointsbetween800and1100sheep).

Affinity PropagationTheAffinityPropagationclusteringalgorithmchooses‘exemplars’(shownaslargedotsatleft),whicharepointsthatbestrepresentothersaroundthem,andthenformsclustersbygrowingthemoutwardsfromtheexemplars[4].Duetothetimecomplexityofthisalgorithmandthelargedatasetthatweused,wehadtorunthealgorithmonarandomsubsetwithonly2,000datapoints.Thesparserdatasetcontributedtowhythisalgorithmfoundmoreclustersthantheotheralgorithms.

ABM ClusteringAlgorithmonOutputData

BehaviorSpaceExperiment

Cluster1Cluster2Cluster3

ClusterEvaluations

ClusterVisualizations

AnalysisofParameters

ParameterVisualizations

Theflowchartaboveshowstheprocessofevaluatingagent-basedmodelsusingunsupervisedlearningtechniques.Therearemanyopportunitiesforfutureworkonthisproject.Ideally,thisisaniterativeprocessthatcanberefinedtohelpamodelerbuildthemostinformativemodelandgainafullunderstandingofit.ResearchcouldbedoneintothebestwaystoadjusttheparametersoftheABMbasedontheparameterdistributionsthatappearinclusterings ofthemodel'soutputdata.Also,attemptscouldbemadetoautomatesomeorallofthisprocesstofurtherreducetheamountofworkfortheresearcher.

,-

WeinvestigatedthetwoclustersthatmadeuptheK-meansclusteringbyexaminingthedistributionoftheparametersfromtheABMthatledtoarunbeinginitsrespectivecluster.

Thehistogramsformostofthevariablesshowedanevendistributionforbothclusters,withthenotableexceptionsofthe‘wolfgainfromfood’and‘grass’variables.

IV. Parameter Analysis

When‘wolfgainfromfood’hasahighervalue,therunismorelikelytobeincluster1 (red),wherethesheeppopulationdidnotexplode.Thereverseisalsotrue.Thismakessensebecausepresumablythewolves’increasedgainfromfoodenabledthemtogrowinpopulationandlimitthesheeppopulationgrowth.

Incluster1,¾ofthetrialshadgrass-depletionenabled.Incluster2,basicallyalltrialshadgrassdepletiondisabled.Thispointstoakeyoutcomefromthismodel,whichisthatanecosystemwiththreeinteractingspeciesismorestable(lesslikelytohaveapopulationexplosion)thanan2-speciesecosystem.

DBSCAN(epsilon=50,min_samples=30)

silhouettescore=.66

Silhouette ScoresThesilhouettescoreisaclusteringmetricrangingfrom-1(worst)to1(best),bycomparingdistancesbetweenpointswithinaclusterandbetweendifferentclusters[4].

["sheep-reproduce"357]["initial-number-sheep"[8010120]]["wolf-gain-from-food"[12828]]["wolf-reproduce"357]["show-energy?"false]["sheep-gain-from-food"357]["initial-number-wolves"[301070]]["grass-regrowth-time"[201040]]["grass?"truefalse]

AffinityPropagation(damping=.75)

silhouettescore=.60

III. Clustering Algorithms

'grass?'=OFF(nograssdepletion)

'grass?'=ON(grassdepletion)

unsupervised machine learning in agent-based modeling · 2018-03-02 · agent-based modeling is...

Documents