Automated Experimentation in Social Informatics
DESCRIPTION
The slides of the invited talk that Maurizio Marchese of the LiquidPub team gave at the Workshop on Automated Experimentation at the e-Science Institute, Edinburgh, February 24th, 2010
TRANSCRIPT
Automated Experimentation in Social Informatics
Maurizio Marchese
Department of Information Engineering and Computer Science (DISI), University of Trento, Italy
Workshop on Automated Experimentation, February 23, 2010, Edinburgh
Early work (ca. 1990)
G. Jacucci, M. Marchese and C. Uhrik, "Composing Simulations by expert rules: modeling plasma sprayed films", in "Knowledge Based Hybrid Systems in Engineering and Manufacturing", edited by I. Mezgar and P. Bertok, Elsevier Science Publisher B.V., North-Holland, 1993
Porosity 4%
Porosity 10%
Porosity 17%
Early work (ca. 1990)
What we were missing
• Computational power
• Appropriate level of abstraction
• Appropriate specification languages
Social Informatics
• Social Informatics (SI) refers, among other things, to the body of research and study that examines the uses of information technologies in social contexts
• Two examples:
▫ P2P systems in eResponse domain
▫ Discovery of Scientific Communities
Use case 1: Automated Experimentation in eResponse
Automated Experimentation in eResponse
• In emergency contexts…
▫ A large number of actors are involved
▫ Geographically dispersed agents collaborate and coordinate
▫ “Live” experimentation is difficult and expensive
Automated Experimentation in eResponse
• Automated Experimentation
▫ Enables the exploration of different architectures for information sharing
▫ Enables the exploration of dynamic and flexible interaction patterns
• OpenKnowledge use-case
▫ Development of a simulation environment through which different information gathering strategies are modelled and simulated
• An interaction-driven mechanism relying on a distributed infrastructure (OK Kernel)
• Enables peers to find each other and coordinate by publishing, discovering and executing interaction models (IM), i.e. multi-party conversational protocols, specified in Lightweight Coordination Calculus (LCC)
A Flood Case Study: Prealarm sensor network
• An Emergency Monitoring System (EMS):
▫ gathers data from sensors placed in the town
▫ checks weather information in order to enrich the data needed to predict the evolution of a potential flood
▫ sends an alarm to the emergency coordinator when the situation becomes critical
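The monitoring steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the OpenKnowledge implementation: the names `SensorReading`, `flood_risk` and `check_alarm`, the way weather data adjusts the risk, and the threshold value are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SensorReading:
    node_id: str          # hypothetical identifier of the sensor location
    water_level_m: float  # hypothetical water level in metres


def flood_risk(readings, rain_mm_per_h):
    """Combine sensor data with weather information into one score.

    The formula is an assumption for illustration: the mean water level,
    increased by 1% for every mm/h of forecast rain.
    """
    level = mean(r.water_level_m for r in readings)
    return level * (1.0 + 0.01 * rain_mm_per_h)


def check_alarm(readings, rain_mm_per_h, critical=2.0):
    """Return an alarm message for the coordinator, or None if not critical."""
    risk = flood_risk(readings, rain_mm_per_h)
    if risk >= critical:
        return f"ALARM: flood risk {risk:.2f} >= {critical:.2f}"
    return None


readings = [SensorReading("n1", 1.8), SensorReading("n2", 2.0)]
calm = check_alarm(readings, rain_mm_per_h=0.0)    # below threshold, no alarm
storm = check_alarm(readings, rain_mm_per_h=20.0)  # weather pushes risk over it
```

The same readings trigger an alarm only once the weather information is taken into account, which is the enrichment step the slide describes.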
Poll-reporter LCC interaction model
A Flood Case Study: sensor network
A Flood Case Study: Evacuation
• Agents (e.g., emergency subordinates such as fire-fighters) move to specific locations assigned by the coordinator
The e-Response System Architecture
Experiments Configuration
Exp N° | Scenario Description | Runs | Variable Settings
1 | Centralized | 20 | Number of Moving Peers = 1; Paths = 1 x run; Flooding Law = fixed; Topology Nodes = 60 x run; Number of Reporter peers (x node) = 1
2 | Decentralized | 20 |
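An experiment configuration like the one above could be written down as data, so that every run can be repeated exactly. The sketch below is hypothetical: the field names, the seeding scheme, and the assumption that the listed variable settings apply to both scenarios are not taken from the slides.

```python
from dataclasses import dataclass, asdict
import random


@dataclass(frozen=True)
class ExperimentConfig:
    exp_no: int
    scenario: str                # "Centralized" or "Decentralized"
    runs: int = 20
    moving_peers: int = 1
    paths_per_run: int = 1
    flooding_law: str = "fixed"
    topology_nodes: int = 60
    reporters_per_node: int = 1


def run_seeds(config, base_seed=0):
    """Derive one reproducible random seed per run (assumed scheme)."""
    rng = random.Random(base_seed * 1000 + config.exp_no)
    return [rng.randrange(2**32) for _ in range(config.runs)]


experiments = [
    ExperimentConfig(1, "Centralized"),
    ExperimentConfig(2, "Decentralized"),
]
seeds = {c.exp_no: run_seeds(c) for c in experiments}
settings = asdict(experiments[0])  # plain dict, easy to store with the results
```

Storing the settings and seeds next to the results is one simple way to make a simulated run replicable.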
Experimental Results
Automated Experimentation benefits
• Explore diverse parameters in complex environments
• Inject fault conditions (e.g., disruption of communication channels and inaccurate signaling)
• Test the conditions under which a P2P architecture improves overall performance and robustness over traditional centralized architectures
• Test whether a specific IT platform (OK Kernel) can support the coordination of team members at an emergency site (e.g., reporters as mobile agents)
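Fault injection of the kind mentioned above (disrupted channels, inaccurate signalling) can be sketched as a wrapper around message delivery. This is a minimal sketch under stated assumptions: `FaultyChannel`, `delivery_rate` and their parameters are hypothetical names, not part of the OK Kernel.

```python
import random


class FaultyChannel:
    """Delivers numeric messages between peers, with injectable faults.

    drop_prob : probability that a message is lost (disrupted channel)
    noise_sd  : standard deviation of Gaussian noise added to the payload
                (inaccurate signalling)
    """

    def __init__(self, drop_prob=0.0, noise_sd=0.0, seed=0):
        self.drop_prob = drop_prob
        self.noise_sd = noise_sd
        self.rng = random.Random(seed)  # seeded so that runs are repeatable

    def send(self, value):
        """Return the value as received, or None if the message was dropped."""
        if self.rng.random() < self.drop_prob:
            return None
        if self.noise_sd > 0:
            value += self.rng.gauss(0.0, self.noise_sd)
        return value


def delivery_rate(channel, n_messages=1000):
    """Fraction of messages that arrive over the given channel."""
    delivered = sum(channel.send(1.0) is not None for _ in range(n_messages))
    return delivered / n_messages


healthy = delivery_rate(FaultyChannel(drop_prob=0.0))
degraded = delivery_rate(FaultyChannel(drop_prob=0.3, seed=42))
```

Running the same scenario over a healthy and a degraded channel gives a direct comparison of how robust an architecture is to communication faults.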
Use case 2: Scientific Community Discovery
Scientific Community discovery:
• In research, researchers write contributions together and publish their advances in some event or journal.
• Their contributions refer to other contributions, some contributions are organized in collections, and so on.
• They create a big network with interesting relations and a community structure that could be used (among other things) to improve two main aspects of research: search and assessment.
▫ search for a contribution, or group contributions; find people working on similar content, or events that are related to a contribution,
▫ measure the impact on a specific community (normalize the actual metrics), narrow down the search space into a community structure, and so on.
Community Detection Process 1
[Figure: Conference Network. Conf A and Conf B are linked by an edge of weight 1 (one author in common); the numbered nodes 1 to 6 are grouped into Community A and Community B.]
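The conference network in this figure links two conferences with a weight equal to the number of authors they have in common. A minimal sketch of that construction follows; the function name `conference_network` and the toy author sets are made up for illustration and are not DBLP data.

```python
from itertools import combinations


def conference_network(authors_by_conf):
    """Build weighted edges between conferences.

    authors_by_conf maps a conference name to the set of its authors.
    The weight of an edge is the number of authors the two conferences
    share; pairs with no common author get no edge.
    """
    edges = {}
    for a, b in combinations(sorted(authors_by_conf), 2):
        common = authors_by_conf[a] & authors_by_conf[b]
        if common:
            edges[(a, b)] = len(common)
    return edges


# Hypothetical toy data
toy = {
    "Conf A": {"alice", "bob"},
    "Conf B": {"bob", "carol"},
    "Conf C": {"dave"},
}
edges = conference_network(toy)
```

Here "Conf A" and "Conf B" share one author and get an edge of weight 1, while "Conf C" stays isolated.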
% of Common Authors Overlapping Between Communities
TELETEACHING/HUM_INT (chi, hicss)
AI/DB (icai, aaai)
ROBOTIC/M.MEDIA (icra, icpr)
TELECOM (icc, globecom)
APPLIED COMPUTING/CRYPTO (sac, compsac)
SOFTWARE ENGINEERING (kbse, icse)
DIST. SYSTEM/COMPILER (ipps, iccs)
GENETIC AND EVO ALG (cec, gecco)
HUMAN-COMP INTER (icchp, hci)
Overview of the DBLP Network
Normalizing Metrics
[Chart: h-index. Values as extracted: 18.8, 24.6, 38.6, 25, 14.25, 13, 25.2, 21.2, 20.4]
[Chart: Citations. Values as extracted: 1332.8, 2846, 6676.6, 3587, 749.5, 579.8, 3008, 2055.2, 1989.4]
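Since h-index and citation levels differ widely, one way to compare researchers across communities is to scale a metric by the community's typical value. The sketch below computes the standard h-index and then divides by the community mean; that particular normalization is an assumption, since the slides do not state which one was used, and the citation lists are invented.

```python
from statistics import mean


def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def normalized_h(author_citations, community_citations):
    """Author's h-index divided by the mean h-index of the community.

    Dividing by the community mean is an assumption made for this sketch.
    The community is expected to have a non-zero mean h-index.
    """
    community_mean = mean(h_index(c) for c in community_citations)
    return h_index(author_citations) / community_mean


# Hypothetical citation counts per paper
author = [10, 8, 5, 4, 3]
community = [
    [10, 8, 5, 4, 3],
    [3, 1, 0],
    [6, 6, 6, 6, 6, 6],
]
score = normalized_h(author, community)
```

A score above 1 means the author's h-index is higher than the community average, whatever the absolute level in that field.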
Community Detection Process 2
[Figure: three-step process.]
1. Build three networks: Citation Network (Scientific Contributions, C1 to C5), Authorship Network (People, P1 to P3) and Affiliation Network (Organizations, O1 and O2)
2. Complete Network = Citation ∪ Authorship ∪ Affiliation
3. Apply Clustering (Newman’s Cluster Algorithm) to find Communities
Main Network legend: nodes are S. Contribution, Person and Organization; edges are Authorship, Affiliation and Citation
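The process in this figure, taking the union of the citation, authorship and affiliation networks and then clustering the result, can be sketched in plain Python. The slides only say "Newman's Cluster Algorithm"; this sketch assumes a greedy modularity-maximization variant, written naively for small graphs, and the edges below are hypothetical rather than read from the figure.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman's modularity Q for an undirected, unweighted graph."""
    m = len(edges)
    label = {n: i for i, c in enumerate(communities) for n in c}
    inside = [0] * len(communities)  # edges with both ends in community i
    degree = [0] * len(communities)  # sum of node degrees in community i
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def greedy_communities(edges):
    """Merge the pair of communities that raises Q most, until none does."""
    nodes = sorted({n for e in edges for n in e})
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda t: t[0])
        if q <= best_q:
            break  # no merge improves modularity any further
        best_q, communities = q, merged
    return communities


# Hypothetical toy networks (C = contribution, P = person, O = organization)
citation = {("C1", "C2"), ("C3", "C4")}
authorship = {("P1", "C1"), ("P1", "C2"), ("P2", "C3"), ("P2", "C4")}
affiliation = {("P1", "O1"), ("P2", "O2")}

complete = citation | authorship | affiliation  # union of the three networks
found = greedy_communities(complete)
```

On this toy graph the two groups are disconnected, so merging across them can only lower Q; the procedure ends with one community around P1 and one around P2. For real datasets a library implementation would be preferable to this quadratic search.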
Issues
• Access to (distributed) datasets
• Provenance of data
• Experiment model (executable specification) in order to replicate results
• Adapt the experiment model (for instance to new metrics)
Thank you