
  • CS 5306 / INFO 5306:

    Crowdsourcing and Human Computation

    Lecture 21, 11/14/17

    Haym Hirsh

  • [Diagram omitted]

    Long-term goal: Integrating human and machine intelligence

    Using human computation in artificial intelligence

    Using artificial intelligence in human computation

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Why AI for human computation:

    Difficult to manage a large number of tasks across diverse workers

    Allowing for more complex workflows

    Ease of use

    Efficiency gains

    Making sense of differing inputs

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Modeling worker skill:

    Given gold standard questions for which answers are known

    Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing, Oleson, D., Sorokin, A., Laughlin, G.P., Hester, V., Le, J., Biewald, L., HCOMP 2011

  • Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing, Oleson, D., Sorokin, A., Laughlin, G.P., Hester, V., Le, J., Biewald, L., HCOMP 2011

    Gold questions:

    Must be disguised to look like other tasks (even as other tasks might change, such as in wording)

    There must be enough of them; otherwise workers learn to recognize the gold questions, resulting in incorrect accuracy estimates

    Can be used as a tool to teach workers, by giving the correct answer when they give a wrong one

    Can create questions targeting common errors

    Can automate gold question creation ("pyrite" questions), as sketched below:

    Mutate questions so that answers change in known ways (such as yes -> no)

    Take questions with a strong consensus for a single answer as new gold questions
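    A minimal Python sketch of the two automated ("pyrite") gold-creation ideas above, assuming yes/no tasks. The mutate_yes_no helper and the 0.9 consensus threshold are illustrative assumptions, not details from the paper.

    from collections import Counter

    def promote_consensus_gold(responses_by_item, threshold=0.9):
        """Turn high-consensus items into new gold questions.

        responses_by_item: dict mapping item id -> list of worker answers.
        Returns a dict mapping item id -> presumed-correct answer.
        """
        gold = {}
        for item, answers in responses_by_item.items():
            top, count = Counter(answers).most_common(1)[0]
            if count / len(answers) >= threshold:
                gold[item] = top
        return gold

    def mutate_yes_no(question, answer):
        """Mutate a yes/no question so the known answer flips.

        A toy mutation: negate the question text. Real mutations would
        be domain-specific (e.g. swapping an image or a key phrase).
        """
        flipped = "no" if answer == "yes" else "yes"
        return "Is it NOT the case that: " + question, flipped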

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Modeling worker skill:

    Given gold standard questions for which answers are known

    Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing, Oleson, D., Sorokin, A., Laughlin, G.P., Hester, V., Le, J., Biewald, L., HCOMP 2011

    Collective assessment using Expectation Maximization:

    Maximum likelihood estimation of observer error-rates using the EM algorithm, Dawid, A.P., Skene, A.M., Applied Statistics 28(1):20-28, 1979

  • Maximum likelihood estimation of observer error-rates using the EM algorithm, Dawid, A.P., Skene, A.M., Applied Statistics

    Goal: For each worker w, correct answer a, and possible response r, estimate the error rate P(worker w gives response r | true answer is a)

    Approach: Set the estimates at random

    Iteratively improve:

    Compute a weighted majority vote over the workers' responses

    Compare responses to the currently estimated correct answers

    Adjust the error-rate estimates
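    A runnable Python sketch of this EM loop, assuming binary labels and a single accuracy number per worker. The paper's model estimates a full per-worker confusion matrix over all answer/response pairs; this "one-coin" simplification is only illustrative.

    from collections import defaultdict

    def dawid_skene_binary(labels, n_iters=20, prior=0.5):
        """labels: list of (worker, item, response) with response in {0, 1}.

        Returns (p_true, acc): per-item P(true label = 1) and per-worker
        estimated accuracy.
        """
        by_item = defaultdict(list)
        for w, i, r in labels:
            by_item[i].append((w, r))
        workers = {w for w, _, _ in labels}
        acc = {w: 0.7 for w in workers}  # initial guess ("set at random")

        p_true = {}
        for _ in range(n_iters):
            # E-step: posterior over each item's true label, given accuracies.
            for i, votes in by_item.items():
                p1, p0 = prior, 1.0 - prior
                for w, r in votes:
                    p1 *= acc[w] if r == 1 else 1.0 - acc[w]
                    p0 *= acc[w] if r == 0 else 1.0 - acc[w]
                p_true[i] = p1 / (p1 + p0)
            # M-step: re-estimate each worker's accuracy against the posterior.
            correct = defaultdict(float)
            total = defaultdict(float)
            for w, i, r in labels:
                correct[w] += p_true[i] if r == 1 else 1.0 - p_true[i]
                total[w] += 1.0
            acc = {w: correct[w] / total[w] for w in workers}
        return p_true, acc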

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Collective Assessment using Expectation Maximization Maximum likelihood estimation of observer error-rates using the EM algorithm, Dawid AP,

    Skene AM. Applied statistics. 1979 Jan 1:20-8.

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Collective assessment using Expectation Maximization:

    Maximum likelihood estimation of observer error-rates using the EM algorithm, Dawid, A.P., Skene, A.M., Applied Statistics 28(1):20-28, 1979

    Whose vote should count more: Optimal integration of labels from labelers of unknown expertise, Whitehill, J., Ruvolo, P., Bergsma, J., Wu, T., and Movellan, J., NIPS 2009. Learn task difficulties

    The multidimensional wisdom of crowds, Welinder, P., Branson, S., Belongie, S., and Perona, P., NIPS 2010. Learn other task parameters

    An algorithm that finds truth even if most people are wrong, Prelec, D., Seung, S., 2007. Elicit meta-knowledge: weight workers by how well they predict the crowd (Bayesian Truth Serum); a toy version is sketched after this list

    Crowdsourcing control: Moving beyond multiple choice, Lin, C.H., Mausam, and Weld, D.S., UAI 2012. Allow tasks without a fixed set of answers
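    A toy sketch of prediction-based weighting in the spirit of Bayesian Truth Serum, assuming binary answers: each worker reports an answer plus a predicted fraction of the crowd answering "yes", and workers whose predictions track the actual crowd get more weight. This is a simplification for illustration, not the BTS scoring rule itself.

    def prediction_weighted_vote(reports):
        """reports: list of (answer, predicted_yes_fraction), answer in {0, 1}."""
        actual_yes = sum(a for a, _ in reports) / len(reports)
        # Weight each worker by how close their crowd prediction was.
        weights = [1.0 / (1e-6 + abs(pred - actual_yes)) for _, pred in reports]
        yes_weight = sum(w for (a, _), w in zip(reports, weights) if a == 1)
        return 1 if yes_weight > sum(weights) / 2 else 0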

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Collective Assessment using Expectation Maximization Maximum likelihood estimation of observer error-rates using the EM algorithm,

    Dawid AP, Skene AM. Applied statistics. 1979 Jan 1:20-8. Whose vote should count more: Optimal integration of labels from labelers of

    unknown expertise, Whitehill, J.; Ruvolo, P.; Bergsma, J.; Wu, T.; and Movellan, J., NIPS 2009. Learn task difficulties

    The multidimensional wisdom of crowds, Welinder, P.; Branson, S.; Belongie, S.; and Perona, NIPS 2010. Learn other task parameters

    An algorithm that finds truth even if most people are wrong, Prelec D, Seung S., 2007. Elicit meta-knowledge: Weight workers by how well they predict the crowd (Bayesian Truth

    Serum) Crowdsourcing control: Moving beyond multiple choice, Lin, C. H.; Mausam; and

    Weld, D. S. UAI 2012. Allow tasks without a fixed set of answers

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Collective assessment using Expectation Maximization:

    Learning from crowds, Raykar, V.C., Yu, S., Zhao, L.H., and Valadez, G., Journal of Machine Learning Research 11:1297-1322, 2010. Weight workers not by label accuracy but by the accuracy of the outcome of machine learning on the data

    Bayesian bias mitigation for crowdsourcing, Wauthier, F.L., and Jordan, M.I., NIPS 2011. Learn worker biases (and how to weight them) from the outcomes of machine learning on the data

    Vox populi: Collecting high-quality labels from a crowd, Dekel, O., Shamir, O., COLT 2009. Prune workers whose responses change the outcomes of machine learning

    Good learners for evil teachers, Dekel, O., Shamir, O., ICML 2009. Integrate labeling fallibility into the machine learning algorithm

    False discovery rate control and statistical quality assessment of annotators in crowdsourced ranking, ICML 2016. Learn systematic biases of workers (like preferring to click left-side answers)

    Optimality of belief propagation for crowdsourced classification, Ok, J., Oh, S., Shin, J., Yi, Y., ICML 2016. Optimality of algorithms when each worker is given only two tasks

    Better learning algorithms

    Better analyses

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Workflow optimization: By hand

    How many votes: Estimating (a, d), where a = answer and d = difficulty

    Decision-theoretic control of crowd-sourced workflows, Dai, P., Mausam, and Weld, D.S., AAAI 2010

    POMDP-based control of workflows for crowdsourcing, Dai, P., Lin, C.H., Mausam, and Weld, D.S., Artificial Intelligence 202:52-85, 2013
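    A minimal Python sketch of the "how many votes" decision, assuming binary answers and a known, uniform worker accuracy. The cited POMDP controllers also model task difficulty and plan ahead; here we simply stop once the Bayesian posterior is confident enough or the budget runs out. get_vote() is a hypothetical stand-in for posting the task to a crowd platform.

    def collect_votes(get_vote, accuracy=0.8, prior=0.5,
                      confidence=0.95, budget=10):
        p = prior  # current P(true answer = 1)
        for _ in range(budget):
            if p >= confidence or 1.0 - p >= confidence:
                break  # confident enough; stop buying votes
            vote = get_vote()  # returns 0 or 1
            like1 = accuracy if vote == 1 else 1.0 - accuracy
            like0 = 1.0 - accuracy if vote == 1 else accuracy
            p = p * like1 / (p * like1 + (1.0 - p) * like0)
        return (1 if p >= 0.5 else 0), p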

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Workflow optimization:

    How many votes: Exact exponent in optimal rates for crowdsourcing, Gao, C., Lu, Y., Zhou, D., ICML 2016

    Accuracy/cost tradeoff in a database setting: CrowdScreen: Algorithms for filtering data with humans, Parameswaran, A.G., Garcia-Molina, H., Park, H., Polyzotis, N., Ramesh, A., Widom, J., SIGMOD 2012

    Value of a worker's judgment based on the information it gives about the answer: Pay by the bit: An information-theoretic metric for collective human judgment, Waterhouse, T.P., CSCW 2013

    Also assess the value of machine intelligence: Combining human and machine intelligence in large-scale crowdsourcing, Kamar, E., Hacker, S., and Horvitz, E., AAMAS 2012

    Switching workflows: Dynamically switching between synergistic workflows for crowdsourcing, Lin, C.H., Mausam, and Weld, D.S., AAAI 2012
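    A sketch of the information-theoretic view of a worker's value, assuming binary answers: the mutual information between the worker's response and the true answer, computed from an accuracy estimate. This is a simplified reading of the "pay by the bit" idea, not the paper's exact metric.

    import math

    def entropy(p):
        if p in (0.0, 1.0):
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    def bits_per_judgment(accuracy, prior=0.5):
        """Mutual information I(response; answer) for a symmetric worker."""
        # P(response = 1) when the worker is right with prob `accuracy`.
        p_resp = prior * accuracy + (1 - prior) * (1 - accuracy)
        # I(R; A) = H(R) - H(R | A); H(R | A) = H(accuracy) here.
        return entropy(p_resp) - entropy(accuracy)

    # e.g. bits_per_judgment(0.9) is about 0.53 bits per judgment,
    # while a coin-flip worker (accuracy 0.5) contributes 0.0 bits.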

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Active learning, i.e. what to label:

    Get another label? Improving data quality and data mining using multiple, noisy labelers, Sheng, V.S., Provost, F., and Ipeirotis, P.G., KDD 2008. Which items to relabel to improve learning

    To re(label), or not to re(label), Lin, C.H., Mausam, and Weld, D.S., HCOMP 2014. Is it better to relabel an item or to label something new? (A toy version of this decision is sketched below)

    Proactive learning: Cost-sensitive active learning with multiple imperfect oracles, Donmez, P., and Carbonell, J.G., CIKM 2008. Weigh the tradeoff between worker accuracy and cost

    Efficiently learning the accuracy of labeling sources for selective sampling, Donmez, P., Carbonell, J.G., and Schneider, J., KDD 2009. Weigh the value of learning more about worker accuracy (exploration vs. exploitation)

    Bayesian bias mitigation for crowdsourcing, Wauthier, F.L., and Jordan, M.I., NIPS 2011. Learn worker biases (and how to weight them) from the outcomes of machine learning on the data
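    A toy Python sketch of the relabel-vs-label-new decision, assuming each labeled item carries a posterior P(label = 1) from aggregation (e.g. the EM sketch earlier). If the most uncertain item is still ambiguous, relabel it; otherwise spend the next question on a new item. The 0.3 uncertainty threshold is an illustrative assumption, not from the paper.

    def next_task(posteriors, new_items, uncertainty_threshold=0.3):
        """posteriors: dict item -> P(label = 1). new_items: unlabeled pool."""
        if posteriors:
            # Most uncertain item = posterior closest to 0.5.
            item, p = min(posteriors.items(), key=lambda kv: abs(kv[1] - 0.5))
            if abs(p - 0.5) < uncertainty_threshold:
                return ("relabel", item)
        if new_items:
            return ("label_new", new_items[0])
        return ("done", None)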

  • "Artificial intelligence and collective intelligence", Weld, D.S., Lin, C.H. and Bragg, J., 2015

    Selecting the best worker for the task:

    Optimistic knowledge gradient policy for optimal budget allocation in crowdsourcing, Chen, X., Lin, Q., and Zhou, D., ICML 2013. Adapt assignments to learned worker accuracy (though without a limit on how many tasks a worker can be given)

    Budget-optimal crowdsourcing using low-rank matrix approximations, Karger, D.R., Oh, S., and Shah, D., Allerton Conference on Communication, Control, and Computing 2011. How to allocate tasks based on probability of error and cost

    Online task assignment in crowdsourcing markets, Ho, C.-J., and Vaughan, J.W., AAAI 2012; Adaptive task assignment for crowdsourced classification, Ho, C.-J., Jabbari, S., and Vaughan, J.W., ICML 2013. Tasks fall into categories; workers have (initially unknown) ability on each type of task and a maximum number of tasks they can be given (see the sketch below)

    Bayesian bias mitigation for crowdsourcing, Wauthier, F.L., and Jordan, M.I., NIPS 2011
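    A minimal sketch of capacity-constrained task assignment in the Ho et al. setting, assuming tasks fall into categories and we hold current per-category accuracy estimates for each worker. A greedy pass gives each task to the most accurate worker with capacity left; the cited papers use more careful adaptive and online algorithms.

    def assign_tasks(tasks, accuracy, capacity):
        """tasks: list of (task_id, category).
        accuracy: dict (worker, category) -> estimated accuracy.
        capacity: dict worker -> max number of tasks they can be given.
        """
        remaining = dict(capacity)
        assignment = {}
        for task_id, cat in tasks:
            candidates = [w for w in remaining if remaining[w] > 0]
            if not candidates:
                break  # every worker is at capacity
            # Unknown (worker, category) pairs default to a 0.5 prior.
            best = max(candidates, key=lambda w: accuracy.get((w, cat), 0.5))
            assignment[task_id] = best
            remaining[best] -= 1
        return assignment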
