
Programming with the Wisdom of the Crowd

DANIEL W. BAROWY

DANIEL G. GOLDSTEIN

SIDDHARTH SURI

EMERY D. BERGER

iCalorieCounter

Photograph Your Food

OK

1. take a photo

2. algorithms (???) / machine learning / humans

3. return estimate (230.3 kcal)

VoxPL

A programming language for estimates that lets you utilize workers as if they are just ordinary functions (like cloud functions!).

Automatically handles scheduling, payment, and quality.


iCalorieCounter

Photograph Your Food

OK

1. take a photo

2. crowdsourcing (VoxPL)

3. return estimate (230.3 kcal)

Why VoxPL?

Challenge: work quality

i.e., do workers agree?

Is this a giraffe?

yes / no

yes, no, yes

Take the majority opinion

Labeling tasks

Is this a giraffe? (Yes / No)

Which ones don’t belong?

What does this plate say? (XXXXXX)

≈ answer choices drawn from a finite, low-cardinality set.

“majority opinion” is inadequate (AutoMan: A Platform for Integrating Digital and Human Computation, Barowy, Curtsinger, Berger, McGregor; OOPSLA ’12)

Is this a giraffe?

yes / no

yes, no, yes

P(majority) = 1 !!!
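One way to read the slide above: with three workers and only two answer choices, some answer always wins a majority, even if every worker clicks at random. The short check below enumerates all possible vote combinations to confirm this. It is an illustration of that reading, not code from VoxPL or AutoMan, and the helper name `has_majority` is made up for this sketch.

```python
import itertools
from collections import Counter

def has_majority(votes):
    """True if one answer received strictly more than half of the votes."""
    top_count = Counter(votes).most_common(1)[0][1]
    return top_count > len(votes) / 2

# Every way three workers can answer a yes/no question (2^3 = 8 outcomes).
outcomes = list(itertools.product(["yes", "no"], repeat=3))

# Fraction of outcomes in which a majority exists.
p_majority = sum(has_majority(v) for v in outcomes) / len(outcomes)
print(p_majority)  # 1.0
```

Because a majority exists in every outcome, seeing one tells us nothing about whether the workers actually looked at the picture.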

How many calories?

331 10 352

“majority opinion” is meaningless

Estimation tasks

How many calories?

Where is this person’s nose?

How hot is it in this photo?

≈ answer choices drawn from an infinite (or practically infinite), high-cardinality set.

How many calories?

331 10 352

mean = 231

median = 331

Challenge: outliers

One large outlier can skew the mean!

median: must corrupt more than 1/2 of sample to skew estimate.
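The arithmetic on this slide is easy to reproduce. The sketch below recomputes the mean and median of the three answers shown, then repeats the calculation with one answer replaced by an absurdly large value. The `corrupted` list is a hypothetical example added here, not data from the talk.

```python
from statistics import mean, median

answers = [331, 10, 352]          # the three worker answers on the slide
corrupted = [331, 10_000, 352]    # hypothetical: one wildly wrong answer

for label, data in (("slide", answers), ("corrupted", corrupted)):
    print(label, "mean =", mean(data), "median =", median(data))
```

The single bad answer drags the mean from 231 to 3561, while the median only moves from 331 to 352.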

VoxPL default estimator: L1 median

(x: 253, y: 134)

L1 median: the point that minimizes the sum of distances to all sample points
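For one-dimensional answers the L1 median is the ordinary median. For two-dimensional answers such as click coordinates there is no closed form, and a common way to approximate it is Weiszfeld's iteration. The sketch below uses that iteration as an assumption; the slides do not say how VoxPL computes the L1 median, and the function name `l1_median` and the `clicks` data are hypothetical.

```python
import math

def l1_median(points, iterations=200, eps=1e-9):
    """Approximate the L1 (geometric) median of 2-D points with Weiszfeld's iteration."""
    # Start from the centroid of the points.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        wsum = wx = wy = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            w = 1.0 / max(d, eps)  # guard against division by zero at a sample point
            wsum += w
            wx += w * px
            wy += w * py
        nx, ny = wx / wsum, wy / wsum
        if math.hypot(nx - x, ny - y) < eps:  # stop once the estimate no longer moves
            break
        x, y = nx, ny
    return x, y

# Hypothetical clicks: five near (253, 134) and one far-away outlier.
clicks = [(250, 130), (253, 134), (256, 138), (251, 137), (254, 131), (900, 900)]
centroid = (sum(p[0] for p in clicks) / len(clicks),
            sum(p[1] for p in clicks) / len(clicks))
print("centroid :", centroid)
print("L1 median:", l1_median(clicks))
```

The centroid is pulled far toward the outlier, while the L1 median stays inside the cluster of plausible clicks.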

Challenge: what is a “good” estimate?

“Good” is domain-specific.

A few extra sprinkles don’t matter for calorie counting!

What would it take to trust that the median value of 244 is a good estimate?

[Number line with ticks at 200, 300, and 400 kcal; the median is marked at 244]

95 out of 100 times we ask the crowd, the estimate is between 194 and 294 kcal.

I.e., a confidence interval.

The set of estimates that are indistinguishable due to sampling error.

The set of opinions that are not likely to belong to the Homers or the Benders.

“Donut contains 244±50 kcal.”
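A confidence interval around a median can be approximated by resampling the answers already collected. The sketch below uses a percentile bootstrap, which is an assumption on our part: the talk names the bootstrap only later, and VoxPL's actual procedure may differ. The `answers` list and the `bootstrap_ci` helper are hypothetical.

```python
import random
from statistics import median

def bootstrap_ci(sample, estimator=median, confidence=0.95, resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for estimator(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Re-estimate on many resamples drawn with replacement from the sample.
    stats = sorted(estimator(rng.choices(sample, k=n)) for _ in range(resamples))
    tail = (1 - confidence) / 2
    lo = stats[int(tail * resamples)]
    hi = stats[min(resamples - 1, int((1 - tail) * resamples))]
    return lo, hi

# Hypothetical calorie guesses from 15 workers.
answers = [180, 200, 210, 220, 230, 240, 244, 244, 250, 260, 270, 290, 300, 331, 352]
estimate = median(answers)
lo, hi = bootstrap_ci(answers)
print(f"{estimate} kcal, 95% CI [{lo}, {hi}]")
```

A symmetric statement such as "244±50 kcal" then corresponds to checking that both ends of the interval lie within 50 kcal of the estimate.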

How to get tighter estimates?

[Number line with ticks at 200, 300, and 400 kcal; the median is marked at 244]

“Donut contains 244±25 kcal.”

Ask more people.

Error decreases as sample size increases.

±25 calories is a great estimate!
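The claim that error shrinks with sample size can be checked with a small simulation. The sketch below assumes a hypothetical crowd whose guesses are normally distributed around 244 kcal with a standard deviation of 100 kcal. It simulates many crowds of a given size and measures how far their medians spread. The distribution, its parameters, and the function name are illustrative assumptions, not measurements from the talk.

```python
import random
from statistics import median

def median_half_width(n, trials=500, seed=0):
    """Half-width of the central 95% range of medians from simulated crowds of size n."""
    rng = random.Random(seed)
    medians = sorted(
        median(rng.gauss(244, 100) for _ in range(n))  # one simulated crowd of n workers
        for _ in range(trials)
    )
    lo = medians[int(0.025 * trials)]
    hi = medians[int(0.975 * trials) - 1]
    return (hi - lo) / 2

for n in (10, 40, 160):
    print(n, "workers -> roughly ±", round(median_half_width(n), 1), "kcal")
```

In this model the half-width falls roughly with the square root of the crowd size, so halving the error takes about four times as many workers.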

iCalorieCounter

Photograph Your Food

OK

VoxPL (±25 kcal, L1 median, $5.00)

230.3±111 > ±25

230.3±49 > ±25

230.3±22 good!
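The sequence above (±111, then ±49, then ±22) suggests a loop that keeps collecting answers until the interval is tight enough or the money runs out. The sketch below is one possible shape for such a loop, not VoxPL's scheduler. The batch size, the per-answer cost, the percentile-bootstrap width, the function names, and returning `None` when the budget is exhausted are all assumptions made for illustration.

```python
import random
from statistics import median

def half_width(answers, rng, resamples=500):
    """Half the width of a 95% percentile-bootstrap interval around the median."""
    n = len(answers)
    meds = sorted(median(rng.choices(answers, k=n)) for _ in range(resamples))
    return (meds[int(0.975 * resamples) - 1] - meds[int(0.025 * resamples)]) / 2

def estimate_until_tight(ask_worker, target, budget, cost_per_answer, batch=10, seed=0):
    """Collect answers in batches until the interval half-width is at most `target`."""
    rng = random.Random(seed)
    answers, spent = [], 0.0
    # Only buy another batch if it still fits in the budget.
    while spent + batch * cost_per_answer <= budget + 1e-9:
        answers.extend(ask_worker() for _ in range(batch))
        spent += batch * cost_per_answer
        width = half_width(answers, rng)
        if width <= target:
            return median(answers), width, spent
    return None  # assumption: give up once the next batch would exceed the budget

# Simulated crowd (hypothetical): guesses scattered around 230 kcal.
crowd = random.Random(1)
result = estimate_until_tight(lambda: crowd.gauss(230, 80),
                              target=25, budget=5.00, cost_per_answer=0.05)
print(result)  # (estimate, half-width, dollars spent), or None if over budget
```

Under these assumptions the programmer states only the target and the budget, and the loop decides how many workers to ask.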

VoxPL

def numCalories(url: String) = estimate (
  budget = 5.00,
  confidenceInterval = SymmetricCI(25),
  text = "How many calories are in the food pictured?",
  imageUrl = url
)

VoxPL also lets you compose estimates

numCalories(breakfast) + numCalories(lunch) + numCalories(dinner)

(how? bootstrap)
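The slide says only that composition uses the bootstrap. One plausible reading, sketched below as an assumption rather than a description of VoxPL's internals, is to resample each meal's answers independently, recompute each median, add them up, and read the interval off the distribution of sums. The three answer lists and the helper `bootstrap_sum_ci` are hypothetical.

```python
import random
from statistics import median

def bootstrap_sum_ci(samples, resamples=2000, seed=0):
    """Point estimate and 95% percentile-bootstrap CI for a sum of medians."""
    rng = random.Random(seed)
    totals = sorted(
        # Resample every sample independently, then add the resampled medians.
        sum(median(rng.choices(s, k=len(s))) for s in samples)
        for _ in range(resamples)
    )
    point = sum(median(s) for s in samples)
    return point, totals[int(0.025 * resamples)], totals[int(0.975 * resamples) - 1]

# Hypothetical worker answers (kcal) for three meals.
breakfast = [300, 320, 331, 352, 360]
lunch = [480, 500, 520, 540, 610]
dinner = [650, 700, 720, 750, 900]

point, lo, hi = bootstrap_sum_ci([breakfast, lunch, dinner])
print(f"daily total: {point} kcal, 95% CI [{lo}, {hi}]")
```

The same resampling idea extends to other arithmetic on estimates, since the interval comes from recomputing the whole expression on each resample.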

Calorie Counter

How good are VoxPL's estimates?

• 208 images of school lunches w/ground truth kcal

• VoxPL: desired ±50 kcal; actual mean error: ±51.5 kcal

• Competitive with professional nutritionists!

Calorie Counter

How cheap are the estimates?

VoxPL: ±50 cost: $1.28

Nutritionist: ±50 cost: $100-200/hr

Programming with the Wisdom of the Crowd

Harnessing the crowd to do estimates (+ labeling) with automatic budgeting, scheduling & quality control

http://automan-lang.org