
Page 1

Algorithmic Crowdsourcing

Denny Zhou

Microsoft Research Redmond

December 9, 2013, NIPS 2013 Workshop on Crowdsourcing, Lake Tahoe

Page 2

Main collaborators

John Platt (Microsoft)

Xi Chen (UC Berkeley)

Qiang Liu (UC Irvine)

Chao Gao (Yale)

Nihar Shah (UC Berkeley)

12/9/2013 NIPS 2013 Workshop on Crowdsourcing 2

Page 3

Machine learning + crowdsourcing

• Almost all machine learning applications need training labels

• By crowdsourcing we can obtain many labels in a short time at very low cost

Page 4

[Figure-only slide; no text in transcript]

Page 5

Repeated labeling: Orange (O) vs. Mandarin (M)

[Figure: fruit photos, each with repeated labels from multiple workers, e.g. M, O, O, M, O, O, O, M, M, O]

Page 6

Repeated labeling: Orange (O) vs. Mandarin (M)

[Same figure; animation step]

Page 7

Repeated labeling: Orange (O) vs. Mandarin (M)

[Same figure; animation step]

Page 8

How to make assumptions?

• Intuitively, label quality depends on worker ability and item difficulty. But,

– How to measure worker ability?

– How to measure item difficulty?

– How to combine worker ability and item difficulty?

– How to infer worker ability and item difficulty?

– How to infer ground truth?

Page 9

A single assumption for all questions

Our assumption: measurement objectivity

[Figure: two objects A and B, with A twice the size of B]

Invariance: no matter which measurement scale we use, A is twice as large as B

Page 10

A single assumption for all questions

Our assumption: measurement objectivity

[Figure: A is twice B]

How do we formulate this invariance for mental measurement?

Page 11

A single assumption for all questions

Our assumption: measurement objectivity

[Figure: A is twice B]

Assume a set ℵ of equally difficult questions, and let 𝑅𝐴 and 𝑊𝐴 denote the numbers of right and wrong answers from worker A (likewise 𝑅𝐵 and 𝑊𝐵 for B). On this set, A's ability relative to B is measured by the ratio

(𝑅𝐴/𝑊𝐴) / (𝑅𝐵/𝑊𝐵)

Page 12

A single assumption for all questions

Our assumption: measurement objectivity

[Figure: A is twice B]

Assume another set ℵ′ of equally difficult questions, with counts 𝑅′𝐴, 𝑊′𝐴, 𝑅′𝐵, 𝑊′𝐵. On this set, the relative ability is

(𝑅′𝐴/𝑊′𝐴) / (𝑅′𝐵/𝑊′𝐵)

Page 13

A single assumption for all questions

Our assumption: measurement objectivity

Measurement objectivity requires that the relative ability be the same no matter which question set is used:

(𝑅𝐴/𝑊𝐴) / (𝑅𝐵/𝑊𝐵) = (𝑅′𝐴/𝑊′𝐴) / (𝑅′𝐵/𝑊′𝐵)
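The invariance can be checked with a small numeric example (all counts are made up for illustration): on either question set, the ratio of the two workers' right/wrong odds is the same.

```python
# Hypothetical counts on set ℵ: worker A answers 20 right / 5 wrong,
# worker B answers 8 right / 4 wrong.
R_A, W_A = 20, 5      # odds 4.0
R_B, W_B = 8, 4       # odds 2.0

# A harder set ℵ': both workers' odds drop, but by the same factor.
R_A2, W_A2 = 10, 5    # odds 2.0
R_B2, W_B2 = 5, 5     # odds 1.0

ratio1 = (R_A / W_A) / (R_B / W_B)        # relative ability on ℵ
ratio2 = (R_A2 / W_A2) / (R_B2 / W_B2)    # relative ability on ℵ'
assert ratio1 == ratio2 == 2.0            # A is "twice" B on either set
```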

Page 14

A single assumption for all questions

Our assumption: measurement objectivity

For multiclass labeling, we instead count the number of misclassifications from one class to another.

Page 15

Measurement objectivity assumption leads to a unique model!

𝑃(𝑋𝑖𝑗 = 𝑘 ∣ 𝑌𝑗 = 𝑐) = (1/𝑍) exp[𝜎𝑖(𝑐, 𝑘) + 𝜏𝑗(𝑐, 𝑘)]

where worker 𝑖 assigns label 𝑋𝑖𝑗 = 𝑘 to item 𝑗 whose true label is 𝑌𝑗 = 𝑐; 𝜎𝑖 is the worker confusion matrix and 𝜏𝑗 is the item confusion matrix.
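As a sketch of how this model is evaluated (the confusion parameters below are hypothetical, and `label_distribution` is an illustrative helper, not code from the talk), the conditional label distribution is a softmax of the summed worker and item scores:

```python
import numpy as np

def label_distribution(sigma_i, tau_j, c):
    """P(X_ij = k | Y_j = c) proportional to exp(sigma_i[c, k] + tau_j[c, k])."""
    scores = sigma_i[c] + tau_j[c]     # row c of each confusion matrix
    scores = scores - scores.max()     # subtract max for numerical stability
    p = np.exp(scores)
    return p / p.sum()                 # normalize by Z

# Hypothetical 3-class worker confusion parameters (diagonal-dominant,
# i.e. a mostly accurate worker) and a neutral "easy" item.
sigma_i = np.array([[2.0, 0.5, 0.0],
                    [0.5, 2.0, 0.5],
                    [0.0, 0.5, 2.0]])
tau_j = np.zeros((3, 3))

p = label_distribution(sigma_i, tau_j, c=0)
assert abs(p.sum() - 1.0) < 1e-9   # a proper distribution
assert p[0] == p.max()             # the true class is most likely
```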

Page 16

Estimation procedure

• First, estimate the worker and item confusion matrices by maximizing marginal likelihood

• Then, estimate the labels by using Bayes’ rule with the estimated confusion matrices

The two steps can be seamlessly unified in EM!

Page 17

Expectation-Maximization (EM)

• Initialize label estimates via majority vote

• Iterate until convergence:

– Given the estimates of labels, estimate worker and item confusion matrices

– Given the estimates of worker and item confusion matrices, estimate labels
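The loop above can be sketched as follows. This is a deliberately simplified, worker-only variant (no item confusion terms, flat label prior) in the spirit of Dawid-Skene; all names and the toy data are illustrative:

```python
import numpy as np

def em_aggregate(X, n_classes, n_iter=50):
    """X[i, j] = label worker i gave item j (-1 if unobserved).
    Returns soft label estimates q[j, c], approximately P(Y_j = c)."""
    n_workers, n_items = X.shape
    # Initialize label estimates via majority vote (soft vote fractions).
    q = np.ones((n_items, n_classes)) / n_classes
    for j in range(n_items):
        votes = X[:, j][X[:, j] >= 0]
        if len(votes):
            q[j] = np.bincount(votes, minlength=n_classes) / len(votes)
    for _ in range(n_iter):
        # Given label estimates, estimate worker confusion matrices
        # (with a small additive smoothing count to avoid zeros).
        conf = np.full((n_workers, n_classes, n_classes), 1e-2)
        for i in range(n_workers):
            for j in range(n_items):
                if X[i, j] >= 0:
                    conf[i, :, X[i, j]] += q[j]
        conf /= conf.sum(axis=2, keepdims=True)
        # Given confusion matrices, re-estimate labels (flat prior).
        log_q = np.zeros((n_items, n_classes))
        for i in range(n_workers):
            for j in range(n_items):
                if X[i, j] >= 0:
                    log_q[j] += np.log(conf[i, :, X[i, j]])
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
    return q

# Toy data: three workers, four items, binary labels;
# worker 2 disagrees with the other two almost everywhere.
X = np.array([[0, 1, 1, 0],
              [0, 1, 1, 1],
              [1, 0, 0, 1]])
labels = em_aggregate(X, n_classes=2).argmax(axis=1)
```

On the items where the first two workers agree, EM recovers the majority answer while learning that the third worker's labels are systematically flipped.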

Page 18

Prevent overfitting

• Equivalently reformulate our solution as minimax conditional entropy

• Prevent overfitting by a natural regularization

Page 19

Minimax conditional entropy

• True label distribution: 𝑄(𝑌)

• Define two 4-dimensional tensors:

– Empirical confusion tensor

– Expected confusion tensor

Page 20

Minimax conditional entropy

• Jointly estimate 𝑃 and 𝑄 by

subject to

Page 21

Minimax conditional entropy

• Exactly recover the previous model via the dual of maximum entropy

𝑃(𝑋𝑖𝑗 = 𝑘 ∣ 𝑌𝑗 = 𝑐) = (1/𝑍) exp[𝜎𝑖(𝑐, 𝑘) + 𝜏𝑗(𝑐, 𝑘)]

• Estimate true labels by minimizing maximum entropy (= maximizing likelihood)

Nothing but Lagrange multipliers!

Page 22

Regularization

• Move from exact matching to approximate matching:

(empirical − expected worker confusion) ≈ 0, ∀𝑖, 𝑘, 𝑐

(empirical − expected item confusion) ≈ 0, ∀𝑗, 𝑘, 𝑐

Page 23

Regularization

• Move from exact matching to approximate matching:

• Penalize large fluctuations:

Page 24

Experimental results

• Bluebirds data (error rates)

o Belief propagation: Variational Inference for Crowdsourcing (Liu et al., NIPS 2012)
o Other methods in the literature cannot outperform Dawid & Skene (1979)
o Some are even worse than majority voting
o Data: The Multidimensional Wisdom of Crowds (Welinder et al., NIPS 2010)

Page 25

Experimental results

• Web search data (error rates)

o Latent trait analysis code from: http://www.machinedlearnings.com
o Data: Learning from the Wisdom of Crowds by Minimax Entropy (Zhou et al., NIPS 2012)

Page 26

Crowdsourced ordinal labeling

• Ordinal labels: web search, product rating

• Our assumption: adjacency confusability

[Figure: ordinal scale 1–2–3–4–5; adjacent levels are likely to be confused, distant levels are unlikely to be confused]
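One simple way to encode adjacency confusability (an illustrative parameterization, not necessarily the one used in the talk) is a confusion matrix whose entries decay with distance on the ordinal scale:

```python
import numpy as np

def ordinal_confusion(n_levels, spread=1.0):
    """Hypothetical adjacency-confusable matrix: P(report k | true c)
    decays exponentially with the ordinal distance |c - k|."""
    c = np.arange(n_levels)[:, None]
    k = np.arange(n_levels)[None, :]
    scores = -np.abs(c - k) / spread
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)   # each row is a distribution

P = ordinal_confusion(5)
# Adjacent levels are more likely to be confused than distant ones:
assert P[0, 1] > P[0, 4]
assert np.allclose(P.sum(axis=1), 1.0)
```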

Page 27

Ordinal minimax conditional entropy

• Minimax conditional entropy with the ordinal-based worker and item constraints:

for all Δ and 𝛻 taking values from {≥,<}

Page 28

Constraints: indirect label comparison

Page 29

Ordinal labeling model

• Obtain the same model except confusion matrices are subtly structured by

• Fewer parameters, lower model complexity

Page 30

Experimental results

Web search data

o Latent trait analysis code from: http://www.machinedlearnings.com
o Entropy(M): regularized minimax conditional entropy for multiclass labels
o Entropy(O): regularized minimax conditional entropy for ordinal labels
o Data: Learning from the Wisdom of Crowds by Minimax Entropy (Zhou et al., NIPS 2012)

Page 31

Experimental results

Price estimation data

o Latent trait analysis code from: http://www.machinedlearnings.com
o Entropy(M): regularized minimax conditional entropy for multiclass labels
o Entropy(O): regularized minimax conditional entropy for ordinal labels
o Data: 7 price ranges from least expensive to most expensive (Liu et al., NIPS 2013)

Page 32

Why doesn't the latent trait model work?

ordinal label = score range

"When solving a given problem, try to avoid solving a more general problem as an intermediate step."

- Vladimir Vapnik

Page 33

Error bounds: problem setup

• Observed Data:

• Unknown true labels:

• Unknown workers’ accuracies:

• Simplified model of Dawid and Skene (1979) and minimax conditional entropy

Page 34

Dawid-Skene estimator (1979)

• Complete likelihood:

• Marginal likelihood:

• Estimating workers’ accuracy by

• Estimating true labels by (plug-in)
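For the binary, one-coin special case, the plug-in step has a familiar closed form: weight each worker's vote by the log-odds of their estimated accuracy and take the sign of the weighted sum. This sketch assumes labels in {-1, +1} and hypothetical accuracy estimates:

```python
import numpy as np

def plug_in_labels(X, p_hat):
    """Plug-in step of a one-coin Dawid-Skene-style model.
    X: workers x items matrix of votes in {-1, +1};
    p_hat: estimated accuracy of each worker."""
    w = np.log(p_hat / (1 - p_hat))   # log-odds weight per worker
    return np.sign(w @ X)             # weighted majority per item

X = np.array([[ 1,  1, -1],
              [ 1, -1, -1],
              [-1,  1, -1]])
p_hat = np.array([0.9, 0.6, 0.55])    # hypothetical accuracy estimates
labels = plug_in_labels(X, p_hat)     # array([ 1.,  1., -1.])
```

Note how the most accurate worker (0.9) dominates the second item even though the other two workers split against each other.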

Page 35

Theorem (lower bound)

For any estimator, there exists a least favorable

(parameter space)

Page 36

Theorem (upper bound)

Under mild assumptions, the Dawid-Skene estimator is optimal in Wald's sense

Page 37

Budget-optimal crowdsourcing

We propose the following formulation:

• Given 𝑛 biased coins, we want to know which are biased to heads and which are biased to tails

• We have a budget of 𝑚 tosses in total

• Our goal is to maximize the accuracy of prediction based on observed tossing outcomes

Optimistic Knowledge Gradient Policy for Optimal Budget Allocation in Crowdsourcing (Chen et al, ICML 2013)
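A toy simulation of this formulation might look like the following. It uses a simple uncertainty-driven baseline policy (always toss the coin whose posterior is most ambiguous), which is only a stand-in for the optimistic knowledge gradient policy of Chen et al.; the sizes, priors, and seed are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 1000                       # n coins, budget of m tosses
theta = rng.uniform(0.1, 0.9, n)      # hidden heads probabilities

# Beta(1, 1) prior on each coin's heads probability.
heads = np.ones(n)
tails = np.ones(n)
for _ in range(m):
    mean = heads / (heads + tails)
    i = np.argmin(np.abs(mean - 0.5))  # toss the most ambiguous coin
    if rng.random() < theta[i]:
        heads[i] += 1
    else:
        tails[i] += 1

# Predict each coin's bias direction from its posterior mean.
pred = (heads / (heads + tails)) > 0.5
acc = np.mean(pred == (theta > 0.5))
```

Clearly biased coins drift away from a posterior mean of 1/2 after a few tosses, so the policy spends most of the budget on the coins that are genuinely hard to call.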

Page 38

Summary

Measurement Objectivity

Maximum Likelihood

Maximum Conditional Entropy

Minimum Conditional Entropy

Minimax Conditional Entropy

Regularized Minimax Conditional Entropy

Minimax optimal rates

Budget-optimal crowdsourcing

Page 39

• Chao Gao and Dengyong Zhou. Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels. Microsoft Research Technical Report MSR-TR-2013-110, October 2013

• Dengyong Zhou, Qiang Liu, John Platt and Chris Meek. Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy. Submitted.

• Xi Chen, Qihang Lin, and Dengyong Zhou. Optimistic Knowledge Gradient Policy for Optimal Budget Allocation in Crowdsourcing, in Proceedings of the 30th International Conference on Machine Learning (ICML), 2013

• Dengyong Zhou, John Platt, Sumit Basu, and Yi Mao. Learning from the Wisdom of Crowds by Minimax Entropy, in Advances in Neural Information Processing Systems (NIPS), December 2012

http://research.microsoft.com/en-us/projects/crowd/
