[ieee 2012 international conference on advances in social networks analysis and mining (asonam 2012)...

6
A Domain Specific Language Approach for Agent-Based Social Network Modeling Enrico Franchi Dipartimento di Ingegneria dell’Informazione Universit` a degli Studi di Parma Parma, Italy Email: [email protected] Abstract—Although in the past twenty years agent-based modeling has been widely adopted as a research tool in the fields of social and political sciences, there is lack of software instruments specifically created for social network simulations. Restricting the field of interest specifically to social network models and simulations instead of supporting general agent-based ones, allows for the creation of easier to use, more focused tools. In this work, we propose PyNetSYM, an agent-based mod- eling framework designed to be friendly to programmers and non-programmers alike. PyNetSYM provides a domain-specific language to specify social network simulations expressed as agent-based models. PyNetSYM was created to deal with large simulations and to work effortlessly with other social network analysis toolkits. I. I NTRODUCTION In the last ten years, the pervasive adoption of social networking sites has deeply changed the web and such sites became an unprecedented social phenomenon [1]. According to [2], web sites have attracted users with very weak interest in technology, including people who, before the social network- ing revolution, were not even regular users of either popular Internet services or computers in general. In the preceding decade, Agent-Based Modeling (ABM) was widely adopted as a research tool in the fields of social and political sciences, because a multi-agent system is a suitable means for modeling and performing simulations on complex systems: a model consists of a set of agents whose execution emulates the behavior of the various individuals that constitute the system [3]. Multi-agent systems are especially appropriate to model systems (i) characterized by a high degree of localization and distribution and (ii) dominated by discrete decisions. In particular, ABM gave important results in social science because it represents and simulates not only the behavior of individuals or groups of individuals but also their interactions that concur to the emergent behavior [4], [5]. In parallel with the theoretical development of ABM a number of software platforms were developed to ease the task of running the simulations; among these the most popular are Swarm [6], Mason [7], RePast [8] and NetLogo [9]. Neither of these tools is specifically tailored for social network simulations. In this paper, instead, we introduce a different kind of ABM tool specifically created for network generation and general processes over networks. The tool we propose does not aim at being a general purpose agent-based modeling tool, thus overall remaining a simpler software system, whereas it is extensible where it really matters (e.g., supporting different representations for networks, from simple memory-based ones to pure disk-based storage for huge simulations). Simulation parameters can be easily set both programmati- cally or as command line arguments, so that simulations can be easily run with different starting parameters, for example to perform sensitivity analysis or any kind of batch execution. People with little programming skills shall be able to specify and run their simulations, but skilled programmers must not feel limited. Consequently, we designed an agent- based Domain-Specific Language (DSL) created to express and execute (i) social network generation models and (ii) gen- eral processes over social networks. The system focuses on expressing communicative acts among the nodes in the net- work, as they can be used to express any process conveniently (e.g., link creation/removal or information/disease diffusion). In Section II we expose the motivations for this work more thoroughly and we compare it with the ABM tools mentioned before. In Section III we describe the logical meta-model of the simulation toolkit, and in Section IV and Section V we present the DSL design and the concurrency model, respec- tively. Eventually, in Section VI we give some experimental results on our system and draw some conclusions. II. MOTIVATIONS OF THE WORK At the beginning of this Section we briefly explain what ABM is and, specifically, the general structure of an agent- based simulation; then, in Subsection II-B, we put ABM in the perspective of social network modeling, i.e., we present the peculiar traits of social network simulations and how they impact on designing an ABM of a social network processes. Eventually, we discuss the state of the art of ABM and how it is related to our project. A. Agent-Based Simulation Fundamentals An agent-based simulation has three main components: (a)a set of agents,(b) a set of agent relationships and (c) the agents environment [10]. Although the actual definition of agent is still debated [11][12], there is general consensus: (i) that agents are autonomous, i.e., they operate without human intervention; 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 978-0-7695-4799-2/12 $26.00 © 2012 IEEE DOI 10.1109/ASONAM.2012.102 606 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 978-0-7695-4799-2/12 $26.00 © 2012 IEEE DOI 10.1109/ASONAM.2012.102 607

Upload: e

Post on 09-Dec-2016

215 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

A Domain Specific Language Approach forAgent-Based Social Network Modeling

Enrico Franchi

Dipartimento di Ingegneria dell’Informazione

Universita degli Studi di Parma

Parma, Italy

Email: [email protected]

Abstract—Although in the past twenty years agent-basedmodeling has been widely adopted as a research tool in thefields of social and political sciences, there is lack of softwareinstruments specifically created for social network simulations.Restricting the field of interest specifically to social networkmodels and simulations instead of supporting general agent-basedones, allows for the creation of easier to use, more focused tools.

In this work, we propose PyNetSYM, an agent-based mod-eling framework designed to be friendly to programmers andnon-programmers alike. PyNetSYM provides a domain-specificlanguage to specify social network simulations expressed asagent-based models. PyNetSYM was created to deal with largesimulations and to work effortlessly with other social networkanalysis toolkits.

I. INTRODUCTION

In the last ten years, the pervasive adoption of social

networking sites has deeply changed the web and such sites

became an unprecedented social phenomenon [1]. According

to [2], web sites have attracted users with very weak interest in

technology, including people who, before the social network-

ing revolution, were not even regular users of either popular

Internet services or computers in general.

In the preceding decade, Agent-Based Modeling (ABM)

was widely adopted as a research tool in the fields of social

and political sciences, because a multi-agent system is a

suitable means for modeling and performing simulations on

complex systems: a model consists of a set of agents whose

execution emulates the behavior of the various individuals that

constitute the system [3]. Multi-agent systems are especially

appropriate to model systems (i) characterized by a high

degree of localization and distribution and (ii) dominated by

discrete decisions. In particular, ABM gave important results

in social science because it represents and simulates not only

the behavior of individuals or groups of individuals but also

their interactions that concur to the emergent behavior [4], [5].

In parallel with the theoretical development of ABM a

number of software platforms were developed to ease the task

of running the simulations; among these the most popular

are Swarm [6], Mason [7], RePast [8] and NetLogo [9].

Neither of these tools is specifically tailored for social network

simulations.

In this paper, instead, we introduce a different kind of ABM

tool specifically created for network generation and general

processes over networks. The tool we propose does not aim

at being a general purpose agent-based modeling tool, thus

overall remaining a simpler software system, whereas it is

extensible where it really matters (e.g., supporting different

representations for networks, from simple memory-based ones

to pure disk-based storage for huge simulations).

Simulation parameters can be easily set both programmati-

cally or as command line arguments, so that simulations can

be easily run with different starting parameters, for example

to perform sensitivity analysis or any kind of batch execution.

People with little programming skills shall be able to

specify and run their simulations, but skilled programmers

must not feel limited. Consequently, we designed an agent-

based Domain-Specific Language (DSL) created to express

and execute (i) social network generation models and (ii) gen-

eral processes over social networks. The system focuses on

expressing communicative acts among the nodes in the net-

work, as they can be used to express any process conveniently

(e.g., link creation/removal or information/disease diffusion).

In Section II we expose the motivations for this work more

thoroughly and we compare it with the ABM tools mentioned

before. In Section III we describe the logical meta-model of

the simulation toolkit, and in Section IV and Section V we

present the DSL design and the concurrency model, respec-

tively. Eventually, in Section VI we give some experimental

results on our system and draw some conclusions.

II. MOTIVATIONS OF THE WORK

At the beginning of this Section we briefly explain what

ABM is and, specifically, the general structure of an agent-

based simulation; then, in Subsection II-B, we put ABM in

the perspective of social network modeling, i.e., we present

the peculiar traits of social network simulations and how they

impact on designing an ABM of a social network processes.

Eventually, we discuss the state of the art of ABM and how

it is related to our project.

A. Agent-Based Simulation Fundamentals

An agent-based simulation has three main components: (a) a

set of agents, (b) a set of agent relationships and (c) the agentsenvironment [10].

Although the actual definition of agent is still debated

[11][12], there is general consensus: (i) that agents are

autonomous, i.e., they operate without human intervention;

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

978-0-7695-4799-2/12 $26.00 © 2012 IEEE

DOI 10.1109/ASONAM.2012.102

606

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

978-0-7695-4799-2/12 $26.00 © 2012 IEEE

DOI 10.1109/ASONAM.2012.102

607

Page 2: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

(ii) that agents are reactive, i.e., they can react to external

events; (iii) that agents are pro-active or have goal-driven

behavior; and (iv) that agents provide a defined interface to an

arbitrary system. However, the fact that agents in simulations

actually have these properties, and especially true autonomy

and pro-activeness, is questioned in [13]; for a thorough review

of agents in the context of social network, we refer to [14].

When using ABM to model social networks, two kinds

of relationships should be taken into account, i.e., system-level relationships and network-level relationships. The former

represents all mutual dependencies, interactions and message

passing among the agents in the system, the latter are the

actual relationships in the social network.

In ABM many different environments may be simulated,

according to the actual model. However, in the case of social

network simulations, the only true environment is the social

network itself and there is little need for other kinds of

environment.

B. Towards Agent-Based Modeling of Social Networks

In the present Subsection we introduce the design prin-

ciples of PyNetSYM1, a social network simulation system

we propose that has the following defining goals: (i) it must

support both small and large networks; (ii) simulations shall

be effortlessly run on remote machines; (iii) it must be

easy to use, even for people without a strong programming

background; (iv) deploying a large simulation should not be

significantly harder than deploying a small one.

A PyNetSYM simulation has four main components that can

all be changed independently: (i) the user interface, (ii) the

simulation engine, (iii) the simulation model and (iv) the

network database. The simulation model needs to be specified

for each simulation and is the only part that has to be written

by the user. The user interface is responsible to take input

from the user, e.g., simulation parameters or information on

the analysis to perform. The simulation engine defines the

concurrency model of the simulation, the scheduling strategies

of the agents and the communication model among the agents.

The network database actually holds a representation of the

social network; it may be in-memory or on persistent storage,

depending on the size of the simulation. Multiple network

database implementations are provided and the framework can

be extended with additional ones.

As a general comment, it should be said that, although

PyNetSYM provides an interface to and abstracts from the

actual representation, memory and CPU issues remain. Let

us consider two different network representation that are rea-

sonable extremes regarding RAM usage. While the NetworkX

library provides many useful algorithms and has a thorough

support for edge and node attributes, a network of order

n = 106 and size m ∼ 10·n, when represented as a NetworkX

graph occupies 4-5 GB of RAM. On the other hand, the same

network, represented as a sparse matrix occupies less than 2

MB, but with this approach the attributes are much harder to

1https://github.com/rik0/pynetsym

manage and, moreover, there are no specific algorithms for

complex network analysis. Different situations have different

trade-offs and consequently we are working to make the choice

of underlying representation as transparent as possible.

Large scale simulations typically require more resources

than those available on a desktop-class machine and, con-

sequently, need to be deployed on external more powerful

machines or clusters. In order to simplify the operation, we

designed our system so that a simulation can be entirely

specified in a single file that can be easily copied or even

sent by email.

Considering that the simulations are frequently run on

remote machines, we opted for a command line interface,

because a traditional GUI becomes more complex in this

scenario. An added benefit is that batch executions are also

greatly simplified. We also support Read-Eval-Print Loop

(REPL) interaction to dynamically interact with the simulation.

In order to allow people without a strong programming

background to easily write simulations, we decided to create

a Domain-Specific Language (DSL). A DSL is a language

providing syntactic constructs over a semantic model tailored

towards a specific domain (e.g., building software). The idea

is that DSLs offer significant gains in productivity, because

they allow the developers to write code that looks more

natural with respect to the problem at hand than the code

written in General-Purpose Language (GPL) with a suitable

library. Moreover, a well thought DSL may be more easily

understandable and perhaps even written by domain experts.

DSLs are usually categorized in internal and external: an

external DSL is a new language that is compiled or interpreted

(e.g., makefile), an internal DSL is represented within the

syntax of a GPL, essentially using the compiler/interpreter and

runtime of the host language. A more thorough discussion of

the subject can be found in [15].

Traditionally, DSLs were sparingly used to solve engineer-

ing problems. Nowadays, with the widespread diffusion of

very expressive languages such as Ruby, Haskell or Groovy

the general attitude towards DSLs has changed ([16], [17]),

especially because writing internal DSLs in such languages

significantly lowers the developing costs to a point that is

comparable to an API based solution [18].

Consequently, we decided to develop an internal DSL over

a very high level programming language, in order to provide

an environment that is friendly enough for scientists without

strong programming background, without being limiting for

the others. Moreover, the internal DSL approach greatly sim-

plifies external library usage, since all the libraries available

for the host language are available for the DSL as well, and

we think that accessing high quality scientific libraries and

using an expressive language are both features of paramount

importance. The actual description of our DSL is detailed in

Section IV.

C. State of the Art of ABM

In this Subsection, we review the most widespread ABM

toolkits in more detail and compare our design decisions with

607608

Page 3: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

theirs, where appropriate.

NetLogo focuses on the 2-dimensional grid topology where

situated agents (turtles) interact with the environment and

among themselves. A problem we have found with this ap-

proach is that the visualization leaks in the rest of the model.

Eventually, we feel that the NetLogo language lacks: (i) many

interesting features found in general purpose languages that

may be useful to implement algorithms and (ii) many useful

libraries (e.g., proper social network analysis toolkits, statis-

tical toolkits). In fact, using network analytic or statistical

techniques is hardly uncommon inside a model, and many of

these algorithms may well be as complicated to implement as

the simulation itself, especially considering that NetLogo is

also aimed at non-programmers.

In fact, out approach is more similar to that of ReLogo

which is a Groovy DSL inspired by NetLogo and built upon

the RePast Symphony framework. Since ReLogo models are

written in Groovy, they are able to call any Java library

providing the required functionality. However, ReLogo is still

designed with the 2D grid in mind and it does not easily inter-

operate with the rest of the RePast framework [19].

Mason, RePast and Swarm follow an approach different

from our own. They are mostly libraries or frameworks for

general purpose languages (Java for Mason and RePast and

Objective-C or Java for Swarm) and this is a clear advan-

tage when interfacing with external libraries. However, all

considered, those systems are more complex to learn because

they have rich and elaborate APIs [20]. Moreover, simulations

are software projects typically split in more files and rather

complex deployment.

III. PYNETSYM RUNTIME MODEL

A sound DSL design methodology suggests to create a

semantic model, that is to say an in-memory object model

of the same subject that the DSL describes [16]. When the

host language for an internal DSL (or the implementation

language for an external one) is an object oriented language,

the semantic model is essentially the library or framework that

the DSL manipulates.

In our case, however, instead of pure objects, we have

active objects with their own thread of control and the se-

mantic model is essentially the runtime model of the agent-

oriented application that executes the simulation. All the

communication occurs through message passing; the default

behavior of these objects is waiting for messages to arrive and

then processing the incoming message with the appropriate

handlers. However, full agent behavior (autonomous, goal-

oriented, pro-active) can be implemented simply by overriding

the main run-loop in a subclass.

The main entry point for a simulation is the SIMULATION

class, that is one of the few domain classes not inheriting

from AGENT. Its main responsibilities are: (i) to define the

simulation parameters (e.g., the number of simulation steps);

(ii) to build a command line option parser that can take

the parameter values from the command line; (iii) to define

which objects should cover determined crucial roles for the

simulation, such as the CONFIGURATOR object; (iv) to create

the ADDRESS-BOOK, that resolves the agent addresses to the

actual agents, the NODE-MANAGER and other simulation spe-

cific infrastructural agents; (v) to actually run the simulation.

The NODE-MANAGER is responsible for (i) creating the

new agents, passing them the appropriate initialization param-

eters and (ii) monitoring them, so that their termination (ex-

ceptional or not) is managed. Typically, in huge simulations

we do not want the idle agents to be in RAM and we shut them

down after a while. The NODE-MANAGER tracks their non-

exceptional termination and is able to bring them back should

they receive a message. The CONFIGURATOR defines the

initial state for the simulation, e.g., reading the initial network

from a file, using a pre-defined topology or simply creating nnode-agents of a given class (BASICCONFIGURATOR).

In order to find social network simulations to test our

software, in [21] and [22]: (i) we analyzed many probabilistic

social network generation models (ii) we singled out a general

structure that is shared by most models and (iii) we based

our semantic model on that structure. Eventually, we applied

it successfully to non-generative simulations, such as disease

infection over networks simulations.

The general structure of the simulation is presented in Fig. 1.

A simulation is fully described providing the specifications

of: (i) the selection process of the groups of nodes to create,

activate or destroy, which is performed by the ACTIVATOR

agent; (ii) the behavior of the nodes themselves When a node

is activated, it performs some actions according to its behavior,

e.g., propagate information. Notice that the general structure

does not change significantly even when introducing some

agent-ness in the simulations, e.g., introducing goal-directed

behavior, until we do not account for pro-active agents.

IV. DEFINING A DSL FOR NETWORK SIMULATIONS

In Subsection II-B we mentioned the advantages of a DSL in

terms of ease of development for both programmers and non-

programmers because of the increased language expressivity.

In Section III the semantic model underlying the DSL has

been described, while in this Section we show the general

structure of our DSL. We also present an example using a

pseudo-language; although the syntax is slightly modified in

order to improve typesetting in the two-column layout, its

semantics is basically that of a Python program, the actual host

language we used. Our choice of language was motivated by

the following Python features: (i) focus in readability; (ii) ease

of use; (iii) availability of useful libraries for scientific com-

putations and consequently (iv) widespread adoption in many

scientific areas; (v) solid choice of concurrency frameworks;

(vi) powerful REPL implementations. However, from our

point of view, the host language is almost an implementation

detail, since other object oriented high level languages such

as Ruby or Groovy could have provided the same features.

Although the pseudo-language is mostly self-explanatory,

here we review some aspects that may nonetheless be surpris-

ing, especially for readers with experience in static languages

such as Java or C++:

608609

Page 4: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

«Agent»clock: Clock «Agent»

activator: NodeActivator

«Agent»node_manager: NodeManager

«Agent»node1: Node

«Agent»node2: Node

«Agent»node3: Node

«Agent»node4: Node

1: tick 2: create

4: terminate 3a: activate

3b: activate

«Agent»node5: Node

«create»

Fig. 1. Interaction diagram of the main agents in the Social Network Generation System.

(a) class A(B) means that class A is a subclass of class B;

(b) the explicitly specified formal parameter “self” is the

object the method is invoked on;

(c) methods can be invoked with named parameters;

(d) set ({a, b}) and hash-map literals ( {k : "v", . . .});(e) everything defined directly in the class body becomes a

class attribute;

(f) instance attributes declarations are omitted for brevity;

(g) classes are first class objects: when called, are factories;

(h) classes can be defined inside other classes.

As a consequence of properties (e), (g) and (h), an inner

class can be invoked as a method of the enclosing class and

returns the appropriate instance object.

Our simulations have two distinct logical elements:

1) the essentially imperative/object oriented description of

the agents behavior (e.g., the nodes and the ACTIVATOR)

2) the mostly declarative description of the simulation struc-

ture: (i) presence/absence of an ACTIVATOR; (ii) choice

of the CONFIGURATOR and its options; and (iii) simula-

tion options specification

The agents behavior is specified using directly the host

language for maximum expressivity. Agents have an address,

specified in their “id” field. Moreover, the AGENT base

class provides methods to send and receive synchronous and

asynchronous messages. Message handlers are invoked when

the corresponding message is received and can be overridden.

When send is called with additional parameters, they are sent

along the message and processed by the target.

The other logical element defining a model is the SIMULA-

TION class. A SIMULATION is not an agent and is essentially

the executable specification of the simulation that collates

together all the other elements. A SIMULATION object has a

run method that is called to execute the simulation. When

run is called, both command line and programmatically

passed parameters are taken into account, i.e., it processes the

“simulation options” field and creates a command line parser

that can parse command line options according to the spec-

ification. When a Simulation instantiates agents, it inspects

their class to see if they need any simulation options (e.g.,

they have an “options” field or a field whose name ends with

options), and, if so, it passes them the requested parameters.

If the configuration is wrong or incomplete (e.g., because of a

1: class BA-NODE(Node)2: handler ACTIVATE (self)3: skip = {self.id}4: while LEN(self.edges) < self.starting edges do5: target = graph.pref attachment()6: if target not in skip then7: self.send(target, "link_to", self.id)8: skip = skip ∪ {target}9: end if

10: end while11: end handler12: end class13: class BA-ACTIVATOR(Activator)14: options = {"starting_edges"}15: handler TICK (self)16: answ = self.send(NODEMANAGER.name,17: "create_node", cls=BA-NODE,18: parameters= [self.starting edges])

19: self.send(answ.get(), "activate")20: end handler21: end class22: class BA-SIMULATION(DE-Simulation)23: activator = BA-ACTIVATOR

24: simulation options= {25: (network size, {default: 100, type: int}),26: (starting edges, {default: 5, type: int})}27: class CONFIGURATOR(BasicConfigurator)28: node class = BA-NODE

29: node options = {"starting_edges"}30: end class31: end class

Fig. 2. Barabasi-Albert (BA) model simulation implementation.

missing or wrongly typed parameter), the simulation fails as

early as possible, ideally at “compile time” (i.e., when classes

are evaluated).

In Fig. 2 we show the code for a simulation implementing

the Barabasi-Albert (BA) model [23]. Line numbers reported

in the following paragraphs refer to Fig. 2. The BA model

starts with n0 nodes and no edges. At each step a new node

with m random links is added. The m links are directed

towards nodes with a probability proportional to their degree,

this strategy is called preferential attachment.In Lines 22–31 the simulation configuration is specified.

Line 23 specifies that the BA-ACTIVATOR (defined in lines

13–21) is the activator to be used. Since classes are first class

609610

Page 5: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

citizens, they can be stored in variables like any other object.

Lines 24–26 specify the command line options, their default

values and the type of the parameters. When values for some

options are not specified neither when calling run, nor in the

command line, the default value is used.

The CONFIGURATOR object (Lines 27–30) is responsible

for creating the starting configuration for the simulation:

in this case, it is a subclass of BASICCONFIGURATOR, so

all the starting nodes: (i) have the same type, BA-NODE

(line 28), and (ii) are passed option values specified in the

“node options” variable (line 29). This is an example of what

we already mentioned: we declaratively specify that nodes

created at the CONFIGURATOR’s request should be passed the

“starting edges” option value. In the BASICCONFIGURATOR

definition we specify that it also reads the “network size”

property to choose the initial number of nodes to create:

essentially CONFIGURATOR inherits the value of its options

attribute from BASICCONFIGURATOR.

In Lines 1–12 and 13–21 there are the specifications for the

nodes and the activator respectively. The nodes are specified

providing a handler for the an Activate message. Moreover, in

the NODE class we defined handlers for the link to message,

so that the graph is updated with the information.

The ACTIVATOR accepts a Tick message from the clock. In

Line 15 the ACTIVATOR sends the NODE-MANAGER: (i) the

class of the agent to create in the “cls” parameter and (ii) the

arguments to pass to their constructor with the “parameters”

parameter. Send calls return a “placeholder” for the actual

return value, that can be queried calling get. In this case, we

have the semantics of a synchronous message, because the

sender has to wait for the receiver to process the message. On

the other hand, the call on line 19 is asynchronous.

V. PYNETSYM CONCURRENCY MODEL

In Section III we defined the runtime model of our sim-

ulation framework. Essentially, we assumed that agents do

have their own thread of control, but did not specify how

concurrency was implemented. The traditional approach in

agent-based frameworks is to implement concurrency mainly

with threads and to provide cooperative multitasking as an

option [24].

On the other contrary, we decided to use gevent2, a net-

working and concurrency framework that uses coroutines to

implement cooperative tasks, called in gevent lingo “green-

lets”. Coroutines are a generalization of subroutines allowing

multiple entry points for suspending and resuming execution

at certain locations [25]. Gevent also provides the greenlets

scheduler. Frameworks such as gevent are popular for writing

programs with very high concurrency, since greenlets: (i) are

less expensive to create and to destroy; and (ii) do not use

memory structures in kernel space. In fact, greenlets live

entirely in the user-space, thus context switches between

different greenlets are inexpensive as well.

2http://www.gevent.org

TABLE IGREENLET AND THREAD EFFICIENCY COMPARISON

# items Thread execution (s) Greenlet execution (s)100 0.022 0.002500 0.110 0.0091000 0.224 0.018

10000 2.168 0.180100000 21.85 1.796

1000000 223.3 18.24

This is particularly relevant in our case since in ABM it is

required to potentially support millions of concurrent agents,

and, since greenlets use no kernel resources, their number is

limited only by the amount of available RAM. On the other

hand, experience suggests that a modern operating system can

roughly support few thousand threads.

In PyNetSYM an agent is essentially a greenlet with a

mailbox where it receives messages. However, since gevent

also provides cooperative networking primitives, agents can

transparently communicate with agents on remote machines.

Moreover, PyNetSYM promotes a programming model where

idle agents do not necessarily store state in-memory, and

consequently they can be destroyed and recreated only when

needed, thus using even less RAM.

In Table I we report execution times of a simple bench-

mark performed with a different number of threads/greenlets.

Essentially, for this simple benchmark, greenlets are roughly

ten times faster than threads. In the benchmark program, we

implemented reactive agents both with threads and greenlets.

The main agent spawns a new agent and sends it a message

with the numeric value of the number of agents to spawn,

then it waits for a message on its own queue. When receives

a message with value n, each secondary agent: (i) spawns a

new agent and sends it a message with the value n−1 if n �= 0;

(ii) on the other hand, if n = 0, it means that all the agents

have been created and sends a message to the main thread.

Notice that the agents do not exist all at the same time: once

they have received and sent a message, they terminate. Further

comparisons between greenlet and thread performances can be

found in [26].

VI. CONCLUSION

In this paper, we present PyNetSYM, a novel language

and runtime for network specific simulations. PyNetSYM is

built for the generative approach to science typical of Agent-

Based Modeling (ABM). We believe there is a strong need

for tools that are both easy to use (especially for people

with little programming background but with a significant

expertise in social sciences) and able to tackle large scale

simulations, using remote high-performance machines and

potentially distributing the computation on multiple servers.

Therefore, while our primary goal is maintaining our toolkit

simple and easy to use, efficiency is our second priority, since

nowadays there is a wide interest on networks of huge size.

Specifically, we created PyNetSYM: (i) to easily support

both small and huge networks, using either in-memory and

610611

Page 6: [IEEE 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012) - Istanbul (2012.08.26-2012.08.29)] 2012 IEEE/ACM International Conference on Advances

on-disk network representations, and (ii) to be as easy to use

both on personal machines or on remote servers.

We designed PyNetSYM so that all the entities, both the

infrastructural ones and those under simulation, are agents:

defining the network simulation is essentially a matter of

specifying (i) the behavior of the nodes and (ii) a few

additional simulation parameters (e.g., storage strategy and

user-customizable options).

Given the merits of Domain-Specific Languages (DSLs)

in general, and specifically those concerning how to make

development easier both for programmers and non program-

mers alike, we decided to create a DSL to drive PyNetSYM

simulations, so that it is possible to write programs that are

machine-executable, high-level formulations of the problem to

solve. Specifically, our DSL is an internal DSL over Python.

As PyNetSYM provides the simulation engine, the simu-

lation can be written in a simple file using our DSL. Thus,

PyNetSYM models are very easy to deploy (copying around

a single file is sufficient) and can also be effortlessly shared

between researchers. Moreover, PyNetSYM models can also

be written and executed interactively from a Read-Eval-Print

Loop (REPL), that is extremely useful when exploring the re-

sults of a simulation, because it makes possible to interactively

and dynamically perform analysis and visualize data.

We also implemented some classic network generation

models with PyNetSYM and found that the resulting programs

are readable and fluent; in fact, they are very similar to a

pseudo-code formulation of the models, even though they are

efficiently executable. At the present moment we are using

PyNetSYM to simulate the behavior of users in a novel fully

distributed social networking platform, in order to understand

the condition under which the information propagates to the

intended recipients. Although, the study is at a very early stage,

we are satisfied with PyNetSYM modeling features.

Our results show that our approach is successful in provid-

ing a friendly and easy to use environment to perform agent-

based simulations over social networks. Agent-based simula-

tion is a powerful conceptual modeling tool for social network

simulations and with the present work we contribute a natural

and expressive way to specify social network simulations using

a DSL.

ACKNOWLEDGMENT

Stimulating discussions with Prof. A. Poggi are gratefully

acknowledged.

REFERENCES

[1] D. Boyd and N. B. Ellison, “Social network sites: Definition, history, andscholarship,” Journal of Computer-Mediated Communication, vol. 13,no. 1, pp. 210–230, Jan. 2008.

[2] D. Stroud, “Social networking: An age-neutral commodity — Socialnetworking becomes a mature web application,” Journal of Direct, Dataand Digital Marketing Practice, vol. 9, no. 3, pp. 278–292, Mar. 2008.

[3] H. Van Dyke Parunak, R. Savit, and R. Riolo, “Agent-based modeling vs.equation-based modeling: A case study and users’ guide,” in Multi-AgentSystems and Agent-Based Simulation, ser. Lecture Notes in ComputerScience, J. Sichman, R. Conte, and N. Gilbert, Eds. Springer Berlin /Heidelberg, 1998, vol. 1534, pp. 277–283.

[4] R. Axelrod, “Simulation in social sciences,” in Handbook of Research onNature-Inspired Computing for Economics and Management, J. Rennard,Ed. Hershey, PA, USA: IGI Global, 2007, pp. 90–100.

[5] J. M. Epstein, “Agent-based computational models and generative socialscience,” Complexity, vol. 4, no. 5, pp. 41–60, 1999.

[6] N. Minar, R. Burkhart, C. Langton, and M. Askenazi, “The swarmsimulation system: a toolkit for building multi-agent simulations,” SantaFe Institute, Santa Fe, Tech. Rep., 1996.

[7] S. Luke, C. Cioffi-Revilla, L. Panait, K. Sullivan, and G. Balan, “Mason:A multiagent simulation environment,” Simulation, vol. 81, no. 7, pp.517–527, 2005.

[8] M. North, T. Howe, N. Collier, and J. Vos, “A declarative modelassembly infrastructure for verification and validation,” in AdvancingSocial Simulation: The First World Congress. Springer Japan, 2007,pp. 129–140.

[9] S. Tisue and U. Wilensky, “NetLogo: A simple environment for mod-eling complexity,” in International Conference on Complex Systems,Boston, MA, USA, 2004, pp. 16–21.

[10] C. M. Macal and M. J. North, “Tutorial on agent-based modelling andsimulation,” Journal of Simulation, vol. 4, no. 3, pp. 151–162, 2010.

[11] M. Genesereth and S. Ketchpel, “Software agents,” Communications ofthe ACM, vol. 37, no. 7, pp. 48–53, 1994.

[12] M. Wooldridge and N. Jennings, “Intelligent agents: Theory and prac-tice,” Knowledge engineering review, vol. 10, no. 2, pp. 115–152, 1995.

[13] A. Drogoul, D. Vanbergue, and T. Meurisse, “Multi-agent based sim-ulation: Where are the agents?” in Multi-Agent-Based Simulation II,J. Sichman, F. Bousquet, and P. Davidsson, Eds. Springer Berlin /Heidelberg, 2003, pp. 43–49.

[14] E. Franchi and A. Poggi, “Multi-Agent Systems and Social Net-works,” in Business Social Networking: Organizational, Managerial,and Technological Dimensions, M. Cruz-Cunha, G. D. Putnik, N. Lopes,G. Patrıcia, and E. Miranda, Eds. Hershey, PA: IGI Global, 2011.

[15] M. Mernik, J. Heering, and A. Sloane, “When and how to developdomain-specific languages,” ACM computing surveys (CSUR), vol. 37,no. 4, pp. 316–344, 2005.

[16] M. Fowler, Domain-Specific Languages, ser. Addison-Wesley SignatureSeries (Fowler). Addison-Wesley Professional, 2010.

[17] D. Ghosh, DSLs in Action. Manning Publications, 2010.[18] T. Kosar, P. E. Martınez, Lpez, P. A. Barrientos, and M. Mernik, “A

preliminary study on various implementation approaches of domain-specific language,” Information and Software Technology, vol. 50, no. 5,pp. 390–405, 2008.

[19] S. Lytinen and S. Railsback, “The Evolution of Agent-based SimulationPlatforms: A Review of NetLogo 5.0 and ReLogo,” in Proceedingsof the Fourth International Symposium on Agent-Based Modeling andSimulation, Vienna, Austria, 2012.

[20] S. F. Railsback, S. L. Lytinen, and S. K. Jackson, “Agent-based simula-tion platforms: Review and development recommendations,” Simulation,vol. 82, no. 9, pp. 609–623, 2006.

[21] E. Franchi, “Towards agent-based models for synthetic social networkgeneration,” in Proceedings of the First International Conference onVirtual and Networked Organizations Emergent Technologies and Tools(ViNOrg ’11), Ofir, Portugal, 2011, pp. 1–10.

[22] F. Bergenti, E. Franchi, and A. Poggi, “Selected Models for Agent-basedSimulation of Social Networks,” in Proceedings of the 3rd Symposiumon Social Networks and Multiagent Systems (SNAMAS ’11), D. Kazakovand G. Tsoulas, Eds. York: Society for the Study of ArtificialIntelligence and the Simulation of Behaviour, 2011, pp. 27–32.

[23] A. L. Barabasi and R. Albert, “Emergence of Scaling in RandomNetworks,” Science, vol. 286, pp. 509–512, 1999.

[24] F. Bellifemine, G. Caire, A. Poggi, and G. Rimassa, “JADE: A softwareframework for developing multi-agent applications. Lessons learned,”Information and Software Technology, vol. 50, no. 1-2, pp. 10–21, Jan.2008.

[25] A. L. D. Moura and R. Ierusalimschy, “Revisiting coroutines,” ACMTransactions on Programming Languages and Systems, vol. 31, no. 2,pp. 6:1–6:31, 2009.

[26] R. M. Friborg, J. M. Bjrndalen, and B. Vinter, “Three Unique Im-plementations of Processes for PyCSP,” in Communicating ProcessArchitectures 2009 - WoTUG-32, ser. Concurrent Systems Engineering,P. H. Welch, H. Roebbers, J. F. Broenink, F. R. M. Barnes, C. G. Ritson,A. T. Sampson, G. S. Stiles, and B. Vinter, Eds. IOS Press, 2009,vol. 67, pp. 277–293.

611612