CROWDSOURCING Massimo Poesio Part 2: Games with a Purpose

Upload: lawrence-malone

Post on 13-Jan-2016


CROWDSOURCING

Massimo Poesio

Part 2: Games with a Purpose


GAMES WITH A PURPOSE

• Luis von Ahn pioneered a new approach to resource creation on the Web: GAMES WITH A PURPOSE, or GWAP, in which people, as a side effect of playing, perform tasks ‘computers are unable to perform’ (sic)


GWAP vs OPEN MIND COMMONSENSE vs MECHANICAL TURK

• GWAP do not rely on altruism or financial incentives to entice people to perform certain actions

• The key property of games is that PEOPLE WANT TO PLAY THEM


EXAMPLES OF GWAP

• Games at www.gwap.com

– ESP

– Verbosity

– TagATune

• Other games

– Peekaboom

– Phetch


ESP

• The first GWAP developed by von Ahn and their group (2003 / 2004)

• The problem: obtain accurate descriptions of images to be used

– To train image search engines

– To develop machine learning approaches to vision

• The goal: label the majority of the images on the Web


ESP: the game


ESP: THE GAME

• Two partners are picked at random from the large number of players online

• They are not told who their partner is, and can’t communicate with them

• They are both shown the same image

• The goal: guess how their partner will describe the image, and type that description

– Hence, the ESP game

• If any of the strings typed by one player matches the string typed by the other player, they score points
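The matching rule described above can be sketched in a few lines. This is only an illustration, not the ESP implementation: the function names, the normalization step and the value of `POINTS_PER_MATCH` are assumptions made for the example.

```python
POINTS_PER_MATCH = 100  # hypothetical value; the slides do not give the real scoring


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so trivially different strings match."""
    return " ".join(label.lower().split())


def agreed_labels(guesses_a: list[str], guesses_b: list[str]) -> set[str]:
    """Labels typed by both partners for the same image."""
    return {normalize(g) for g in guesses_a} & {normalize(g) for g in guesses_b}


def round_score(guesses_a: list[str], guesses_b: list[str]) -> int:
    """Both partners score as soon as any of their strings match."""
    return POINTS_PER_MATCH if agreed_labels(guesses_a, guesses_b) else 0
```

For example, `agreed_labels(["Dog", "grass "], ["cat", "dog"])` contains only `"dog"`, so that round scores.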


THE TASK


SCORING BY MATCHING


THE CHALLENGE: SCORES

• One of the motivating factors is to try to score as many points as possible

• Hourly, daily, weekly, and monthly scores are shown


SCORES


THE CHALLENGE: TIMING

• Partners try to agree on as many images as they can during 2 ½ minutes

• The thermometer on the side indicates how many images they have agreed on

• If they agree on 15 images they score bonus points


TABOO WORDS

• To ensure the production of a large number of specific labels, some words are declared TABOO and not allowed

• Taboo words are obtained from the game itself: any word that has been agreed upon by players who were shown a picture earlier becomes a taboo word for that image
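A possible way to picture the taboo mechanism: each image carries a set of taboo words, guesses in that set are rejected, and every newly agreed label is added to the set for later players. The function names and the case-insensitive comparison are assumptions for this sketch.

```python
def allowed_guesses(guesses: list[str], taboo: set[str]) -> list[str]:
    """Drop guesses that are taboo for this image (case-insensitive)."""
    return [g for g in guesses if g.lower() not in taboo]


def add_agreed_label(taboo: set[str], agreed_label: str) -> set[str]:
    """Return a new taboo set that also contains the label just agreed on."""
    return taboo | {agreed_label.lower()}


# After one pair agrees on "Dog", later players must find a different label.
taboo_for_image = add_agreed_label(set(), "Dog")
remaining = allowed_guesses(["dog", "Puppy", "DOG"], taboo_for_image)
```

Because the set only grows, later pairs are pushed towards more specific labels, which is the effect the slide describes.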


TABOO WORDS


PASSING


GOOD LABELS, COMPLETING AN IMAGE

• A label is considered “good” when more than N players produce it (with N a parameter of the game)

• An image is “done” when its list of taboo words is so extensive that most players pass on it
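The two criteria above could be expressed roughly as follows. The slides only say "more than N players" and "most players pass", so reading "most" as more than half (`pass_fraction=0.5`) is an assumption, as are the function names.

```python
from collections import Counter


def good_labels(label_counts: Counter, n: int) -> set[str]:
    """Labels produced by more than n players (n is a parameter of the game)."""
    return {label for label, count in label_counts.items() if count > n}


def is_done(passes: int, plays: int, pass_fraction: float = 0.5) -> bool:
    """An image is 'done' when most players who see it pass.

    'Most' is interpreted here as more than pass_fraction of the plays;
    that threshold is an assumption, not a value from the slides.
    """
    return plays > 0 and passes / plays > pass_fraction
```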


IMPLEMENTATION

• Pre-recorded game play

– Especially at the beginning, and at quiet times, there won’t always be players to pair with

– In these cases a player is paired against a recorded ‘hand’ of a previous game with the same picture

• Cheating

– Players could cheat in a number of ways, including agreeing on labels / playing against themselves

– A number of mechanisms are in place against those cases

• Selecting images
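Pairing a live player against a recorded ‘hand’ can be sketched as a replay of timestamped guesses. The slides do not say how the recording is stored, so the `(seconds, label)` representation and the function name below are assumptions for illustration.

```python
def replay_match(recorded, live):
    """Find the first agreement between a recorded hand and a live player.

    recorded, live: lists of (seconds_since_round_start, label) tuples.
    Returns (time, label) for the first label that both sides have typed
    by that time, or None if they never agree.
    """
    # Merge both streams into one timeline; 0 marks the recording, 1 the live player.
    events = sorted(
        [(t, 0, label) for t, label in recorded]
        + [(t, 1, label) for t, label in live]
    )
    seen = (set(), set())
    for t, who, label in events:
        seen[who].add(label)
        if label in seen[1 - who]:
            return t, label
    return None
```

With `recorded = [(2.0, "dog"), (5.0, "grass")]` and a live player typing `"grass"` after one second, the match only happens at 5.0 seconds, when the recorded partner ‘types’ the same word.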


SOME STATISTICS

• In the 4 months between August 9th 2003 and December 10th 2003

– 13,630 players

– 1.2 million labels for 293,760 images

– 80% of players played more than once

• By 2008:

– 200,000 players

– 50 million labels


ANALYSIS

• The numbers indicate that the game is fun to play

• Exciting factors:

– Playing with a partner

– Playing against time


QUALITY OF THE LABELS

• For IMAGE SEARCH:

– choose 10 labels among those produced and look at which images are returned

• Compare labels produced by players with labels produced by participants in an experiment

– 15 participants, 20 images among the 1000 with more than 5 labels

– 83% of game labels also produced by participants

• Manual assessment of labels (‘would you use these labels to describe this image?’)

– 15 participants, 20 images

– 85% of words rated useful


GOOGLE IMAGE LABELLER


THE TASK


RESULTS


VERBOSITY

• … or, the game approach to collecting commonsense knowledge

• Motivation: slow progress both on CYC (5 million facts collected) and on Open Mind Commonsense (around 700,000 facts)


THE GAME

• Based on an existing game, TABOO:

– Players have to guess a word

– One of the players gives hints concerning the word

• In Verbosity, you have two players, the DESCRIBER and the GUESSER, and a SECRET WORD


THE GAME


TEMPLATES IN VERBOSITY

• As in Open Mind Commonsense, templates are used to ensure that the relations / properties of interest are collected

• The Describer produces hints by filling in a template


GUESSING ATTRIBUTES


PRODUCING A DESCRIPTION


TEMPLATES

• _ is a kind of _

• _ is used for _

• _ is typically near/in/on _

• _ is the opposite of _ / _ is related to _
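One way to see how a filled-in template serves both as a hint for the Guesser and as a collected fact: the Describer supplies only the second blank, the Guesser sees the sentence with the secret word still blanked out, and the game stores the full sentence. The dictionary keys and function names are hypothetical; only the template wording comes from the slide.

```python
TEMPLATES = {
    "kind_of": "{word} is a kind of {filler}",
    "used_for": "{word} is used for {filler}",
    "opposite_of": "{word} is the opposite of {filler}",
    "related_to": "{word} is related to {filler}",
}


def make_hint(template_key: str, filler: str) -> str:
    """What the Guesser sees: the secret word stays hidden behind '_'."""
    return TEMPLATES[template_key].format(word="_", filler=filler)


def make_fact(template_key: str, secret_word: str, filler: str) -> str:
    """What the game can store once the secret word is filled back in."""
    return TEMPLATES[template_key].format(word=secret_word, filler=filler)
```

For the secret word "dog", the Describer might fill in "animal": the Guesser reads "_ is a kind of animal" and the stored fact is "dog is a kind of animal".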


EMULATION

• As in the ESP game, pre-recorded games are used when a player cannot be paired with another player

• The asymmetry of the game causes a problem not encountered in the ESP game

– Describer: can just repeat behavior of previous describer

– Guesser: not so easy


RESULTS

• The only published results I’m aware of predate the actual release of the game, so I don’t know about the QUANTITY

• Quality:

– Ask six raters whether 200 facts collected using Verbosity are ‘true’

– Around 85% success


PEEKABOOM

• Objective: collect data about the presence of objects in images in order to train vision algorithms for object detection


THE GAME

• Two players

• They take turns at playing ‘Peek’ and ‘Boom’

• ‘Boom’ gets a picture with an associated word; ‘Peek’ has to guess what the associated word is

• ‘Boom’ reveals parts of the picture to ‘Peek’ by clicking on it (each click reveals a circular area with a radius of 20 pixels)
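The reveal mechanism can be illustrated with a simple geometric test: a pixel is visible to ‘Peek’ if it lies within 20 pixels of any of Boom’s clicks. The 20-pixel radius is from the slide; the function name and the choice to count the boundary as revealed are assumptions.

```python
REVEAL_RADIUS = 20  # pixels, as stated on the slide


def is_revealed(pixel, clicks, radius=REVEAL_RADIUS):
    """True if the pixel lies inside the circle around any click.

    pixel: (x, y); clicks: iterable of (x, y) click positions.
    Squared distances are compared to avoid a square root.
    """
    px, py = pixel
    return any(
        (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2 for cx, cy in clicks
    )
```

The set of clicks is itself useful data, since the areas Boom chooses to reveal indicate where the named object is in the image.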


THE GAME: PEEK


THE GAME


PINGS


HINTS


IMPLEMENTATION

• Images and their labels come from ESP

• Cheating:

– Player queue (wait until next ‘matching interval’ – one every 10 seconds – to start playing)

– IP address checks (to make sure players are not paired with themselves)

– Blocking bots: ‘seed images’ (previously annotated) and blacklist
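The player queue and the IP check might fit together roughly like this: at each matching interval, everyone who has been waiting is shuffled and paired, and two players with the same IP address are never paired. This is a sketch under those assumptions, not the Peekaboom code; the function name and data layout are invented for the example.

```python
import random


def pair_waiting_players(waiting, rng=random):
    """Pair players collected during one matching interval.

    waiting: list of (player_id, ip_address) tuples.
    Returns (pairs, leftover); leftover players wait for the next interval.
    """
    pool = list(waiting)
    rng.shuffle(pool)  # random order makes it hard to arrange a partner
    pairs, leftover = [], []
    while pool:
        first = pool.pop()
        # Look for any remaining player coming from a different IP address.
        idx = next(
            (i for i, other in enumerate(pool) if other[1] != first[1]), None
        )
        if idx is None:
            leftover.append(first)
        else:
            pairs.append((first, pool.pop(idx)))
    return pairs, leftover
```

Batching players per interval means that two people who press ‘play’ at the same moment are not guaranteed to meet, which is the point of the queue.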


EVALUATION: USER STATISTICS

• Usage:

– 1 month in 2005

– 14,153 players

– 1,122,998 completed rounds

– Average person played around 158 images (or 72 minutes)


EVALUATION: ACCURACY OF DATA

• Accuracy of bounding boxes

– Choose 50 images played by at least two pairs

– Have four volunteers make bounding boxes

– OVERLAP(A,B) = AREA(A∩B) / AREA(A∪B)

– Average: 0.75

• Accuracy of pings

– 50 images as above

– Three subjects decide if ping is ‘inside the object’

– Result: 100%
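The overlap measure used for the bounding boxes (area of the intersection divided by area of the union) can be computed as below for axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box representation and the function names are assumptions for the sketch.

```python
def area(box):
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def overlap(a, b):
    """OVERLAP(A, B) = AREA(A intersect B) / AREA(A union B)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = area(a) + area(b) - intersection
    return intersection / union if union else 0.0
```

Two identical boxes give 1.0 and disjoint boxes give 0.0; two 10×10 boxes shifted by half their width share 50 of 150 units of area, giving one third.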


SOME GENERAL LESSONS

• von Ahn & Dabbish (2008) discuss the general approach and some lessons they took from their work


THREE TEMPLATES

• OUTPUT-AGREEMENT GAMES

– Generalization of ESP

• INVERSION-PROBLEM GAMES

• INPUT-AGREEMENT GAMES


OUTPUT AGREEMENT GAMES

• Two strangers are chosen among all potential players. They cannot see each other or communicate with each other.

• In each round, both are given the same input

• Game instructions say that players should produce the same output as their partners

• Winning condition: they produce the same output, possibly after a few attempts

E.g.: ESP GAME.


INVERSION PROBLEM GAMES

• Two strangers are chosen among all potential players. They cannot see each other or communicate with each other.

• In each round, one player is designated as the DESCRIBER whereas the other is designated as the GUESSER. The output from the describer should help the guesser guess the original input

• WINNING CONDITION: The guesser correctly guesses the input originally assigned to the describer.

E.g.: VERBOSITY. Based on ‘20 Questions’.


INPUT AGREEMENT GAMES

• Two strangers are chosen among all potential players. They cannot see each other or communicate with each other.

• In each round, both are given input that is known by the game (but not by the players) to be the same or different

• Game instructions say that players should produce output describing their input so that they can decide whether input is same or different

• Winning condition: playing partners correctly decide whether input is same or different.

E.g.: TagATune.
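The input-agreement winning condition can be written down compactly. In this sketch the game knows whether the two inputs are the same, each player votes ‘same’ or ‘different’ after exchanging descriptions, and the descriptions are kept as data only when both votes are right. Keeping descriptions only from won rounds is an assumption, as are the names.

```python
def play_round(inputs_same, descriptions_a, descriptions_b, vote_a, vote_b):
    """Evaluate one input-agreement round.

    inputs_same: ground truth known to the game, hidden from the players.
    vote_a, vote_b: each player's guess (True means 'same input').
    Returns (won, kept_descriptions).
    """
    won = vote_a == inputs_same and vote_b == inputs_same
    kept = list(descriptions_a) + list(descriptions_b) if won else []
    return won, kept
```

Players can only decide correctly if their descriptions are informative, so honest description is the best strategy.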


INCREASE ENJOYMENT

• Games designed so as to make the task enjoyable

• GWAPs by von Ahn et al. attempt to do this by giving players a CHALLENGE:

– TIMED RESPONSE

– SCORE KEEPING

– SKILL LEVELS

– HIGH SCORE LEVELS


OUTPUT ACCURACY

• Mechanisms to ensure correctness and avoid collusion (e.g., always producing the same label)

– Random matching (players don’t know each other’s identity)

– Player testing (assess quality of particular player’s input by matching his output against already annotated data)

– Repetition (output only considered correct if many players produced it)

– Taboo
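Player testing, as listed above, amounts to scoring a player on the items for which the answer is already known. A minimal sketch, assuming answers and gold annotations are stored as dictionaries keyed by item; the names and the 0.8 trust threshold are hypothetical.

```python
TRUST_THRESHOLD = 0.8  # hypothetical cut-off, not taken from the slides


def player_accuracy(answers: dict, gold: dict) -> float:
    """Fraction of a player's answers on already-annotated items that match."""
    checked = [item for item in answers if item in gold]
    if not checked:
        return 0.0
    return sum(answers[item] == gold[item] for item in checked) / len(checked)


def is_trusted(answers: dict, gold: dict) -> bool:
    """Accept a player's other output only if accuracy on known items is high."""
    return player_accuracy(answers, gold) >= TRUST_THRESHOLD
```

Items without a gold annotation are ignored when computing the accuracy, so mixing a few known items into normal play is enough to test a player.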


MISCELLANEOUS

• Other useful ideas

• Evaluation

– Efficiency: THROUGHPUT (T)

– ‘Enjoyability’: AVERAGE LIFETIME PLAY (ALP)

– Combined measure: EXPECTED CONTRIBUTION = T * ALP
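As a worked example of the combined measure, the sketch below multiplies throughput by average lifetime play, taking care that both use the same time unit. The figures in the usage line are invented for illustration and are not measurements of any real game.

```python
def expected_contribution(throughput_per_hour: float, alp_minutes: float) -> float:
    """EXPECTED CONTRIBUTION = T * ALP.

    throughput_per_hour: problem instances solved per human-hour of play (T).
    alp_minutes: average total time a player spends on the game (ALP).
    The result is the number of instances an average player is expected to solve.
    """
    return throughput_per_hour * (alp_minutes / 60)


# Illustrative numbers only: 200 labels per hour, 90 minutes of lifetime play.
labels_per_player = expected_contribution(200, 90)
```

With these made-up figures an average player would contribute 300 labels over their lifetime of play.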


OTHER GAMES

• On gwap.com

– TagATune

• Elsewhere:

– FoldIt

– Karaoke Callout

– Phetch

– Spectral Game


FOLDIT


THE PROBLEM: PROTEIN FOLDING


REPRESENTING PROTEIN STRUCTURE

[Figure with five panels: wire diagram; ribbon diagram; ball & stick of featured area; space filling (van der Waals); surface representation (GRASP image; blue: positive, red: negative). Source: Petsko, G.A. and Ringe, D. (2004), Protein Structure and Function, figure 5-5, p. 173.]


EVALUATION


PROBLEMS SOLVED BY FOLDIT PLAYERS


GWAPs for NLP

• Lexical Resource Creation:

– (Verbosity)

– Jeux de Mots

– Groningen Meaning Bank

• Corpus annotation:

– The GIVE challenge

– Phatris

– Phrase Detectives (next lecture)

– The sentiment game


JEUX DE MOTS


JEUX DE MOTS

• A game to acquire a ‘lexical-semantic network’: a knowledge base with information about

– Concepts

– Their lexical associations

– Their conceptual relations (ISA, PART-OF, etc.)

• Developed by Mathieu Lafourcade

• Since 2007


BASICS

• A two-player game

• The players do not know each other (as in Verbosity etc.)


ENTERING LEXICAL ASSOCIATIONS


GAME PLAY AND SCORING

[Diagram, built up over three slides. Labels: target word + instructions (one box per player); player 1; player 2; propositions (one set per player); = intersection; accordance; reward.]
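The game-play diagram suggests that the two players’ propositions for the same target word are intersected and that the terms they share are what counts. A rough sketch of that idea follows; strengthening a weighted relation by one for each shared term is an assumption made for the example, and the function name, relation label and data layout are hypothetical.

```python
from collections import Counter


def record_game(network: Counter, target: str, relation: str,
                propositions_1: list[str], propositions_2: list[str]) -> set[str]:
    """Intersect both players' propositions and strengthen the shared relations.

    network maps (target, relation, term) to a weight.
    Returns the set of terms both players proposed.
    """
    common = set(propositions_1) & set(propositions_2)
    for term in common:
        network[(target, relation, term)] += 1
    return common


network = Counter()
shared = record_game(network, "chat", "associated",
                     ["souris", "animal"], ["animal", "chien"])
```

Here only "animal" was proposed by both players, so only the relation between "chat" and "animal" gains weight.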


RESULTS OF A GAME


RESULTS SO FAR

• 1,375,432 games played since 2007

– Over 9 million relations entered

• Results of game(s): dictionary called DIKO


THE GIVE CHALLENGE


THE GIVE CHALLENGE

• Generating Instructions in Virtual Environments

• A shared task for the NLG community

• Users evaluate systems by playing a game in which the instructions are generated by NLG systems


REFERENCES

• L. von Ahn and L. Dabbish (2008). Designing games with a purpose. Communications of the ACM, 51(8), 58–67.

• L. von Ahn and L. Dabbish (2004). Labeling images with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 319–326.

• L. von Ahn, R. Liu, and M. Blum (2006). Peekaboom: a game for locating objects in images. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 55–64.

• www.gwap.com

• Luis von Ahn’s talk on Human Computation at Google talks