Automated Experimentation in Social Informatics

Maurizio Marchese

Department of Information Engineering and Computer Science (DISI), University of Trento, Italy

Workshop on Automated Experimentation, February 23, 2010, Edinburgh

Early work (ca. 1990)

G. Jacucci, M. Marchese and C. Uhrik, "Composing Simulations by expert rules: modeling plasma sprayed films", in "Knowledge Based Hybrid Systems in Engineering and Manufacturing", edited by I. Mezgar and P. Bertok, Elsevier Science Publisher B.V., North-Holland, 1993

[Figure: plasma sprayed films at porosity 4%, 10% and 17%]

Early work (ca. 1990)

What we were missing

• Computational power

• Appropriate level of abstraction

• Appropriate specification languages

Social Informatics

• Social Informatics (SI) refers, among other things, to the body of research and study that examines the uses of information technologies in social contexts

• Two examples:
  ▫ P2P systems in the eResponse domain
  ▫ Discovery of scientific communities

Use case 1: Automated Experimentation in eResponse

Automated Experimentation in eResponse

• In emergency contexts…
  ▫ Large number of actors involved
  ▫ Geographically dispersed agents collaborate and coordinate
  ▫ “Live” experimentation is difficult and expensive

Automated Experimentation in eResponse

• Automated Experimentation
  ▫ Makes it possible to explore different architectures for information sharing
  ▫ Makes it possible to explore dynamic and flexible interaction patterns

• OpenKnowledge use case
  ▫ Development of a simulation environment through which different information-gathering strategies are modelled and simulated

• An interaction-driven mechanism relying on a distributed infrastructure (OK Kernel)

• Enables peers to be found and coordinated by publishing, discovering and executing interaction models (IMs), i.e. multi-party conversational protocols specified in the Lightweight Coordination Calculus (LCC)

A Flood Case Study: Prealarm sensor network

• An Emergency Monitoring System (EMS):
  ▫ gathers data from sensors placed in the town
  ▫ checks weather information in order to enrich the data needed to predict the evolution of a potential flood
  ▫ sends an alarm to the emergency coordinator when the situation becomes critical
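The three EMS duties listed above (gather sensor data, check the weather, alert the coordinator) can be sketched as one monitoring step. This is only an illustrative sketch in Python, not the LCC interaction model used in the project: the names `Reading`, `poll_sensors`, `is_critical` and `monitor_once`, the units, and the threshold rule are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Reading:
    sensor_id: str
    water_level_m: float  # hypothetical unit: metres


def poll_sensors(sensors: Iterable[Callable[[], Reading]]) -> List[Reading]:
    """Ask every sensor (reporter) for its current reading."""
    return [read() for read in sensors]


def is_critical(readings: List[Reading], rain_forecast_mm: float,
                level_threshold_m: float = 2.0,
                rain_threshold_mm: float = 30.0) -> bool:
    """Toy criticality rule (an assumption, not the project's flood model).

    Critical if any sensor exceeds the level threshold, or if heavy rain is
    forecast while the mean level is already above half the threshold.
    """
    if not readings:
        return False
    levels = [r.water_level_m for r in readings]
    if max(levels) >= level_threshold_m:
        return True
    mean_level = sum(levels) / len(levels)
    return (rain_forecast_mm >= rain_threshold_mm
            and mean_level >= level_threshold_m / 2)


def monitor_once(sensors, weather: Callable[[], float],
                 alarm: Callable[[List[Reading]], None]) -> bool:
    """One EMS cycle: gather data, enrich with weather, alert if critical."""
    readings = poll_sensors(sensors)
    forecast = weather()
    if is_critical(readings, forecast):
        alarm(readings)  # notify the emergency coordinator
        return True
    return False
```

In a simulation, `sensors`, `weather` and `alarm` would be supplied by simulated peers, so the same step can be exercised without any live deployment.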

Poll-reporter LCC interaction model

A Flood Case Study: sensor network

A Flood Case Study: Evacuation

• Agents (e.g., emergency subordinates such as fire-fighters) move to specific locations assigned by the coordinator

The e-Response System Architecture

Experiments Configuration

Exp N° | Scenario Description | Runs | Variable Settings
1      | Centralized          | 20   | Number of Moving Peers = 1; Paths = 1 per run; Flooding Law = fixed; Topology Nodes = 60 per run; Number of Reporter peers (per node) = 1
2      | Decentralized        | 20   |
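A configuration like the one in the table can be captured as data and handed to a runner that repeats each scenario. The sketch below is a minimal illustration under stated assumptions: `ExperimentConfig`, `make_configs` and `run_all` are hypothetical names, the simulator is passed in as a stub, and the sketch assumes that both scenarios share the same variable settings, which the table does not state explicitly.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    exp_id: int
    scenario: str            # "centralized" or "decentralized"
    runs: int
    moving_peers: int
    paths_per_run: int
    flooding_law: str
    topology_nodes: int
    reporters_per_node: int


def make_configs():
    # Assumption: experiment 2 reuses the variable settings of experiment 1.
    shared = dict(runs=20, moving_peers=1, paths_per_run=1,
                  flooding_law="fixed", topology_nodes=60,
                  reporters_per_node=1)
    return [ExperimentConfig(1, "centralized", **shared),
            ExperimentConfig(2, "decentralized", **shared)]


def run_all(configs, simulate, base_seed=0):
    """Run simulate(config, rng) once per run, with a distinct seed per run.

    `simulate` is a hypothetical callable standing in for the simulator.
    Seeding each run separately keeps the whole experiment replicable.
    """
    results = {}
    for cfg in configs:
        outcomes = []
        for run in range(cfg.runs):
            rng = random.Random(base_seed + 1000 * cfg.exp_id + run)
            outcomes.append(simulate(cfg, rng))
        results[cfg.exp_id] = outcomes
    return results
```

Keeping the configuration separate from the simulator is what lets the same experiment be rerun with a different architecture or a different number of peers.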

Experimental Results

Automated Experimentation benefits

• Explore diverse parameters in complex environments

• Inject fault conditions (e.g., disruption of communication channels and inaccurate signaling)

• Test the conditions under which a P2P architecture improves overall performance and robustness over traditional centralized architectures

• Test whether a specific IT platform (OK Kernel) can support the coordination of team members at an emergency site (e.g., reporters as mobile agents)
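Fault injection of the kind mentioned above (disrupted channels, inaccurate signaling) can be illustrated by wrapping message delivery in a channel that drops or perturbs values. This is a sketch under assumptions: `LossyChannel` and its parameters are hypothetical and are not part of the OK Kernel.

```python
import random


class LossyChannel:
    """Wraps a delivery callback and injects two kinds of faults.

    drop_prob: probability that a message is lost (channel disruption).
    noise:     maximum absolute error added to a value (inaccurate signal).
    """

    def __init__(self, deliver, drop_prob=0.0, noise=0.0, seed=None):
        self._deliver = deliver
        self.drop_prob = drop_prob
        self.noise = noise
        self._rng = random.Random(seed)
        self.dropped = 0

    def send(self, value: float) -> bool:
        """Return True if the (possibly perturbed) value was delivered."""
        if self._rng.random() < self.drop_prob:
            self.dropped += 1
            return False
        noisy = value + self._rng.uniform(-self.noise, self.noise)
        self._deliver(noisy)
        return True
```

Running the same scenario with increasing `drop_prob` is one simple way to compare how centralized and decentralized configurations degrade.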

Use case 2: Scientific Community Discovery

Scientific Community Discovery

• In research, researchers write contributions together and publish their advances at an event or in a journal.
• Their contributions refer to other contributions, some contributions are organized in collections, and so on.
• Together they form a large network with interesting relations and a community structure that could be used (among other things) to improve two main aspects of research: search and assessment.
  ▫ search for a contribution, or group contributions; find people working on similar content, or events related to a contribution
  ▫ measure the impact within a specific community (normalizing the current metrics), narrow the search space down to a community structure, and so on

Community Detection Process 1

[Figure: Conf A and Conf B linked by an edge of weight 1 (one author in common); the resulting Conference Network (nodes 1 to 6) is split into Community A and Community B]
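The first step of this process, linking two conferences when they share authors, can be sketched as follows. The function name `conference_network` and the input layout (a mapping from conference name to author names) are assumptions for illustration; the author names in the usage note are made up.

```python
from itertools import combinations


def conference_network(authors_by_conf):
    """Build a weighted conference graph.

    Two conferences are linked when they share at least one author;
    the edge weight is the number of authors in common.
    """
    edges = {}
    for a, b in combinations(sorted(authors_by_conf), 2):
        common = len(set(authors_by_conf[a]) & set(authors_by_conf[b]))
        if common:
            edges[(a, b)] = common
    return edges
```

For example, if "Conf A" has authors {ann, bob} and "Conf B" has authors {bob, cat}, the network contains a single edge between them with weight 1, matching the "one author in common" case.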

% of Common Authors Overlapping Between Communities

TELETEACHING/HUM_INT (chi,hicss)

AI/DB (icai,aaai)

ROBOTIC/M.MEDIA (icra,icpr)

TELECOM (icc,globecom)

APPLIED COMPUTING/CRYPTO (sac,compsac)

SOFTWARE ENGINEERING (kbse,icse)

DIST. SYSTEM/COMPILER (ipps,iccS)

GENETIC AND EVO ALG (cec,gecco)

HUMAN-COMP INTER (icchp,hci)

Overview of the DBLP Network

Normalizing Metrics

[Chart: h-index values 18.8, 24.6, 38.6, 25, 14.25, 13, 25.2, 21.2, 20.4]

[Chart: Citations values 1332.8, 2846, 6676.6, 3587, 749.5, 579.8, 3008, 2055.2, 1989.4]
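One possible reading of "normalizing metrics" is to express a researcher's h-index relative to the typical value in their community, so that communities with different citation habits become comparable. The sketch below assumes that reading; `community_baselines` and `normalized_h` are hypothetical names, and the test values used with them are made up, not taken from the charts.

```python
from statistics import mean


def community_baselines(h_by_community):
    """Mean h-index per community (communities with no members are skipped)."""
    return {c: mean(values) for c, values in h_by_community.items() if values}


def normalized_h(h_index, community, baselines):
    """h-index divided by the community mean: 1.0 means 'typical'."""
    return h_index / baselines[community]
```

Under this scheme an h-index of 8 in a community whose mean is 4 scores higher than an h-index of 20 in a community whose mean is 20.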

Community Detection Process 2

[Figure: Citation Network (scientific contributions C1 to C5), Authorship Network (people P1 to P3), Affiliation Network (organizations O1, O2)]

Complete Network = Citation ∪ Authorship ∪ Affiliation

Apply clustering (Newman’s Cluster Algorithm) to find communities

[Figure: Main Network. Node types: S. Contribution, Person, Organization. Edge types: Authorship, Affiliation, Citation]
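The second process (take the union of the citation, authorship and affiliation networks, then cluster) can be sketched in plain Python. The slides do not say which of Newman's algorithms was used; the sketch assumes a simplified greedy modularity agglomeration in the spirit of Newman's fast algorithm, stopping at the first merge that no longer improves modularity. All function names are hypothetical, and edge types and directions are ignored.

```python
from itertools import combinations


def union_network(*edge_sets):
    """Undirected union of several edge lists (self-loops are dropped)."""
    return {frozenset(e) for edges in edge_sets for e in edges if e[0] != e[1]}


def modularity(edges, communities):
    """Newman's modularity Q for an unweighted, undirected graph."""
    m = len(edges)
    if m == 0:
        return 0.0
    label = {n: i for i, com in enumerate(communities) for n in com}
    inside = [0] * len(communities)   # edges with both ends in community i
    degree = [0] * len(communities)   # sum of node degrees in community i
    for e in edges:
        u, v = tuple(e)
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def greedy_communities(edges):
    """Start from singletons; repeatedly apply the merge that raises Q most."""
    nodes = sorted({n for e in edges for n in e})
    communities = [{n} for n in nodes]
    best = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for i, j in combinations(range(len(communities)), 2):
            merged = [c for k, c in enumerate(communities) if k not in (i, j)]
            merged.append(communities[i] | communities[j])
            candidates.append((modularity(edges, merged), i, j, merged))
        q, _, _, merged = max(candidates, key=lambda t: t[:3])
        if q <= best:       # no merge improves modularity: stop
            break
        best, communities = q, merged
    return communities
```

This brute-force version recomputes modularity for every candidate merge, which is fine for a toy network but far too slow for a network the size of DBLP; it is meant only to make the procedure concrete.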

Issues

• Access to (distributed) datasets
• Provenance of data
• Experiment model (executable specification) in order to replicate results
• Adapt the experiment model (for instance to new metrics)

Thank you
