human and social computation

Post on 12-Nov-2014


Putting Humans in the Loop: Human Computation for Natural Resources Management

New Developments in IT & Water, Amsterdam, Nov 5 2012

Piero Fraternali

Politecnico di Milano, Italy

piero.fraternali@polimi.it

Outline

• Human Computation
– Origins
– Forms
    • Crowdsourcing
    • Games with a purpose
    • Solution space exploration
    • Social Mobilization
• Examples in Natural Resource & Water Management
– Open Issues
– Research Projects
– Conclusions and outlook

Human Computation: a definition

• According to von Ahn:
– Combine humans and computers to solve large-scale problems that neither can solve alone, taking advantage of human cycles
• According to Wikipedia:
– Human-based computation is a computer science technique in which a computational process performs its function by outsourcing certain steps to humans. This approach uses differences in abilities and alternative costs between humans and computer agents to achieve symbiotic human-computer interaction.

Early example: CAPTCHA

• Stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”
• Luis von Ahn et al. coined the term in 2000
• A program that can tell whether a user is a human or a computer
• Humans and machines have complementary skills

The disciplines of HC

Forms of HC: crowdsourcing

• Crowdsourcing is a distributed model that assigns tasks traditionally undertaken by employees or contractors to an undefined crowd
– Split the task into micro-tasks
– Assign them to performers in the crowd
– Collect partial results into the final one
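The split / assign / collect cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the API of any crowdsourcing platform: the function names, the round-robin assignment policy and the `simulated_worker` stand-in for a human performer are all assumptions made for the example.

```python
from itertools import cycle


def split_into_microtasks(items, batch_size):
    """Split one large task (a list of items) into micro-tasks of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def assign_round_robin(microtasks, performers):
    """Pair each micro-task with a performer, cycling through the crowd (assumed policy)."""
    if not performers:
        raise ValueError("at least one performer is required")
    return list(zip(cycle(performers), microtasks))


def collect(assignments, do_microtask):
    """Run every micro-task and merge the partial results into the final result."""
    final = {}
    for performer, microtask in assignments:
        final.update(do_microtask(performer, microtask))
    return final


def simulated_worker(performer, microtask):
    """Hypothetical stand-in for a human: records who handled each item."""
    return {item: f"handled by {performer}" for item in microtask}
```

In a real deployment `do_microtask` would publish the micro-task to the crowd and wait for the answer; the rest of the flow stays the same.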

Paid Crowdsourcing: Amazon Mechanical Turk

Forms of HC: GWAPs

• Games with a Purpose (GWAPs)
– Exploit the billions of hours that people spend online playing computer games to solve complex problems that involve human intelligence [vA06, LvA09].
– Useful tasks are embedded in a playful experience where human judgment is exploited consciously or unconsciously

Types of Games [Luis von Ahn and Laura Dabbish, CACM 2008]

Three generic game structures

• Output agreement:
– Players must type the same output
• Input agreement:
– Players must decide whether they have the same input
• Inversion problem:
– P1 generates output from input
– P2 looks at P1's output and guesses P1's input

Output Agreement: ESP Game

• Players look at common input
• Need to agree on output
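A minimal sketch of the output-agreement rule: two players type labels for the same image, and the round ends as soon as one player types a label the other has already entered. The function name, the round-by-round interleaving and the normalization (lowercasing, trimming) are assumptions made for this example; the `taboo` set models words that players are not allowed to use.

```python
from itertools import zip_longest


def first_agreement(guesses_a, guesses_b, taboo=frozenset()):
    """Return the first label typed by both players, or None if they never agree."""
    seen_a, seen_b = set(), set()
    for a, b in zip_longest(guesses_a, guesses_b):
        # Process player A's guess, then player B's guess, for this round.
        for guess, own, other in ((a, seen_a, seen_b), (b, seen_b, seen_a)):
            if guess is None:
                continue  # this player has run out of guesses
            word = guess.strip().lower()
            if word in taboo:
                continue  # taboo words never count as agreement
            own.add(word)
            if word in other:
                return word
    return None
```

The agreed word is the useful by-product: it is a label for the image that two independent people both considered appropriate.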

Input Agreement: TagATune

• Sometimes it is difficult to type identical output (e.g., “describe this song”)
• Show players the same or different input, let them describe it, and ask them whether they have the same input
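The input-agreement rule can be sketched as follows. The sketch assumes that a round is won only when both players guess correctly whether their inputs were the same, and that descriptions are kept only from won rounds; the record layout and function names are hypothetical.

```python
def round_won(same_input, guess_p1, guess_p2):
    """A round is won only if both players correctly guess 'same' (True) or 'different' (False)."""
    return guess_p1 == same_input and guess_p2 == same_input


def harvest_descriptions(rounds):
    """Keep the free-text descriptions from won rounds only (assumed filtering policy)."""
    kept = []
    for r in rounds:
        if round_won(r["same_input"], r["guess_p1"], r["guess_p2"]):
            kept.extend(r["descriptions"])
    return kept
```

The intuition is that players can only win consistently if their descriptions are informative, so descriptions from won rounds are more likely to be accurate.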

Inversion Problem: Peekaboom

• Non-symmetric players
• Input: image with word
• Player 1 slowly reveals the picture
• Player 2 tries to guess the word

Sketchness

• Puzzle game, guess and draw (Pictionary, iSketch…)
• Players take turns drawing the shapes of objects inside an image to make the other players guess the object
• Two roles: Sketcher & Guesser
• Objectives: object detection, garment segmentation and tagging

Forms of HC: solution space exploration

• Combinatorial problems with intractable solution spaces, in which humans can help the heuristic core in pruning
– Protein folding: proteins fold from long chains into small balls, each in a very specific shape
– The shape is the lowest-energy configuration, which is the most stable
– The fold shape is very important for understanding interactions with other molecules
– Extremely expensive computationally! (too many degrees of freedom)
• A Mason-Pfizer monkey virus retroviral protease was modeled by Foldit gamers in just three weeks

Forms of HC: social mobilization

• Social Mobilization
– Problems with time constraints, where the efficiency of task spreading and of solution finding is essential
– The DARPA Network Challenge [PRP+10] is an example of the problem and of the techniques employed to face it
– The solution comes from the nature of the reward mechanism and from the social ties of humans
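One reward mechanism of this kind is the recursive incentive scheme described in [PRP+10]: the person who finds a target receives a reward, the person who recruited the finder receives half of that, and so on up the recruitment chain. The sketch below illustrates the halving rule; the function name and the default amount are assumptions chosen for the example.

```python
def recursive_rewards(chain, finder_reward=2000.0):
    """Split rewards along a recruitment chain.

    chain[0] is the finder, chain[1] recruited the finder, chain[2] recruited
    chain[1], and so on. Each person receives half of what the previous one got.
    """
    return {person: finder_reward / (2 ** depth) for depth, person in enumerate(chain)}
```

Because the payouts form a geometric series, the total paid along any chain stays below twice the finder's reward, however long the chain is. This lets the organizer promise a reward for recruiting as well as for finding without risking an unbounded payout.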

HC & Natural Resource Management

• Objectives
– Collect and validate data
– Extract information from data
– Involve people in resource usage planning and management
– Change people’s behavior
• Approaches
– Passive: mine information from users’ existing activity traces
– Active: engage people in ad hoc tasks
• Ultimate goals
– Obtain “better data” for predictive models, planning and management tools: more accurate, at finer time/space resolution, in real time …
– Make “better decisions”: more participative, less conflicting, capable of promoting social change

Monitoring waterways: CreekWatch

• Problem: obtain simple yet useful parameters on watershed conditions in a vast territory at low cost
• Solution: geo-localized mobile + Web application
– Developed at IBM Research Almaden, 4000+ users, 25 countries
– The city of San Jose, CA, uses it to prioritize pollution cleanup efforts
• Collected data are found to have good quality

Predicting population dynamics with Twitter data

• Problem: obtaining impact of population on territory at high temporal resolution

• Can be used to detect events and to estimate water consumption bursts, waste production, etc.

• Solution: using low cost geo-localized data sources (e.g., tweets) together with structured and high cost sources (e.g., mobile phone traces)

http://www.streamreasoning.org/demos/london2012

Predicting snow level with Flickr images

• Problem: predicting the incidence of natural phenomena using user generated content

• Solution: using Flickr photos tagged with “snow” to estimate snowfall (precision 100% with 7 snow photos)
– H. Zhang, M. Korayem, D. J. Crandall, G. LeBuhn: Mining photo-sharing websites to study ecological phenomena. WWW 2012
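As a rough illustration of how tagged photos can become a prediction, the sketch below counts photos tagged “snow” per day and geographic cell and flags snow where the count reaches a threshold. It is not the method of the cited paper: the record layout (`day`, `cell`, `tags`), the function name and the simple count threshold are assumptions made for the example, with the default of 7 borrowed from the figure quoted above.

```python
from collections import Counter


def predicted_snow(photos, min_photos=7):
    """Return the (day, cell) pairs with at least min_photos photos tagged 'snow'."""
    counts = Counter(
        (photo["day"], photo["cell"])
        for photo in photos
        if "snow" in photo["tags"]
    )
    return {key for key, n in counts.items() if n >= min_photos}
```

Raising `min_photos` trades recall for precision: fewer places are flagged, but the flagged ones are backed by more independent evidence.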

Using social deliberation tools for participatory planning

• Problem: letting a large crowd of citizens propose solutions or deliberate on proposals about public goods
• Solution: large-scale deliberation and idea management tools
– IdeaScale.com, MIT’s Deliberatorium…

Open problems

• Humans, like machines, can make errors
– Cognitive bias, fatigue
• Unlike machines, humans can cheat
– Classification of attacks
– Spammer detection
• Techniques for improving output quality are in use:
– Voting schemes
– Worker quality modeling and vote weighting (requires ground truth or machine learning models and iterative / selective labeling of data)
– Micro-flows, workers’ pre-task testing
– Task-to-worker assignment, active learning
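The first two quality techniques above can be sketched as follows: plain majority voting, and weighted voting in which each worker's vote counts in proportion to an accuracy estimated on items with known answers (ground truth). The function names, the data layout and the default weight of 0.5 for unknown workers are assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(votes):
    """votes: list of (worker, label) pairs. Return the most frequent label."""
    return Counter(label for _, label in votes).most_common(1)[0][0]


def estimate_accuracy(answers, gold):
    """answers: {worker: {item: label}}; gold: {item: true_label}.

    Return each worker's fraction of correct answers on the gold items they judged.
    """
    accuracy = {}
    for worker, labels in answers.items():
        judged = [item for item in labels if item in gold]
        if judged:
            correct = sum(labels[item] == gold[item] for item in judged)
            accuracy[worker] = correct / len(judged)
    return accuracy


def weighted_vote(votes, accuracy, default=0.5):
    """Sum each worker's accuracy per label and return the label with the highest total."""
    totals = defaultdict(float)
    for worker, label in votes:
        totals[label] += accuracy.get(worker, default)
    return max(totals, key=totals.get)
```

With votes `[("w1", "logo"), ("w2", "no_logo"), ("w3", "no_logo")]`, majority voting picks `"no_logo"`. If `w1` has estimated accuracy 0.95 and the others 0.4, weighted voting picks `"logo"`, since one reliable worker outweighs two unreliable ones.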

Examples of ongoing projects

Politecnico di Milano

The CrowdSearcher crowd engagement framework

Human task design: Tips on workplaces from friends

Human task execution with Facebook & Doodle

CUbRIK Project

• FP7 Integrating Project
• Goals:
– Advance the architecture of multimedia search
– Exploit the human contribution in multimedia search
– Use open-source components provided by the community
– Start up a search business ecosystem
• http://www.cubrikproject.eu/


Multimedia processing with crowd

Detecting logo images in videos

Experimental evaluation

• Three experimental settings:
– No human intervention
– Logo validation performed by domain experts
– Non-expert crowd on Facebook
• Experiment size:
– 40 people involved
– 50 task instances generated
– 70 collected answers
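Settings like these are commonly compared by precision and recall. As a reminder of how the two measures are computed, here is a minimal sketch. The function name and the use of video-frame identifiers are assumptions made for the example, and the numbers in the usage note are invented for illustration, not taken from the experiment.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a set of retrieved items against the relevant ones.

    precision = correct retrieved / all retrieved
    recall    = correct retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a pipeline returns frames `{f1, f2, f3, f4}` and the logo really appears in `{f1, f2, f5}`, precision is 2/4 and recall is 2/3. Accepting a wrong logo image during validation adds false matches to the retrieved set, which lowers precision even when recall improves.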

Experimental evaluation

[Figure: precision/recall plot, both axes from 0 to 1, with series Aleve, Chunky and Shout, each shown for the No Crowd, Experts and Crowd settings]

Experimental evaluation

[Figure: precision/recall plot for Aleve, Chunky and Shout under the No Crowd, Experts and Crowd settings, annotated “Precision decreases”]

Reasons for the wrong inclusion:

• Geographical location of the users
• Expertise of the involved users

Experimental evaluation

[Figure: precision/recall plot for Aleve, Chunky and Shout under the No Crowd, Experts and Crowd settings, annotated “Precision decreases”]

• Similarity between two logos in the data set

Future directions & outlook

• Find problems where crowd support can be useful, e.g.,
– Urban water demand prediction: smart meters are costly and not deployed. Household data can be used to build models
• Design crowd interaction
– Not only IT: engagement, incentives, ethical and legal issues
• Collect and clean up data
• Integrate crowd model and data with (e.g., water) system models
• Check validity


References

• Panos Ipeirotis (New York University) and Praveen Paritosh (Google). Managing Crowdsourced Human Computation.
• [LvA09] Edith Law and Luis von Ahn. Input-agreement: a new mechanism for collecting data using human computation games. In Proc. CHI 2009, 2009.
• [vA06] Luis von Ahn. Games with a purpose. Computer, 39:92–94, 2006.
• [vAMM+08] Luis von Ahn, Ben Maurer, Colin McMillen, David Abraham, and Manuel Blum. reCAPTCHA: Human-based character recognition via web security measures. Science, 321(5895):1465–1468, 2008.
• [PRP+10] Galen Pickard, Iyad Rahwan, Wei Pan, Manuel Cebrian, Riley Crane, Anmol Madan, and Alex Pentland. Time critical social mobilization: The DARPA Network Challenge winning strategy. CoRR, abs/1008.3172, 2010.
• J. Trant. Exploring the potential for social tagging and folksonomy in art museums: proof of concept. New Rev. Hypermed. Multimed., 12(1):83–105.
• Firas Khatib et al. Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nature Structural & Molecular Biology, 2011.
• S. Kim, C. Robson, T. Zimmerman, J. Pierce, and E. M. Haber. Creek Watch: pairing usefulness and usability for successful citizen science. In Proceedings of the 29th Int. Conf. on Human Factors in Computing Systems, pages 2125–2134, New York, NY, 2011.
