Structured Labeling to Facilitate Concept Evolution in Machine Learning (CHI 2014)

Presenter: Hillol Sarker
Authors: Todd Kulesza, Saleema Amershi, Rich Caruana, Danyel Fisher, Denis Charles


TRANSCRIPT

Motivation: Machine Learning

We want to train a machine according to some target concept.

Supervised machine learning needs consistently labeled data (e.g., spam filtering, email prioritization), which is difficult to obtain.
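As a toy illustration of the motivating example (not the paper's system), a minimal keyword-count "spam filter" trained from labeled examples; all function names and data here are invented for illustration:

```python
from collections import Counter

def train(labeled_emails):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in labeled_emails:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Label an email by which class's vocabulary it matches more."""
    words = text.lower().split()
    score = {lbl: sum(c[w] for w in words) for lbl, c in counts.items()}
    return max(score, key=score.get)

model = train([
    ("win free prize now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("monday project meeting notes", "ham"),
])
print(classify(model, "free prize inside"))  # spam
```

If labelers disagree on what counts as "spam" (the concept-evolution problem the paper studies), the training pairs become inconsistent and any such classifier degrades.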

Outline: Introduction | Preliminary Study | Incorporate Feedback | Study | Result | Conclusion

Problem: Labeling Consistency Is Compromised

Labeler: expertise, familiarity with the concept, judgment ability

Data: contains ambiguity; changing distribution

Concept: changes over time

Example?

Semantic Location

Concept Evolution

Existing Approaches: Machine Learning

Noise-tolerant algorithms; multiple labelers with majority voting, weighting schemes, or pairwise comparison (A is a better fit than B)

Problem: no human judgment involved
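The multiple-labeler schemes named above (majority voting and weighting) can be sketched as follows; the weight values are illustrative, e.g., a labeler's past accuracy:

```python
from collections import Counter

def majority_vote(labels):
    """Resolve disagreement between labelers by simple majority."""
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(labels, weights):
    """Weight each labeler's vote, e.g., by estimated reliability."""
    tally = Counter()
    for label, w in zip(labels, weights):
        tally[label] += w
    return tally.most_common(1)[0][0]

print(majority_vote(["yes", "yes", "no"]))                   # yes
print(weighted_vote(["yes", "no", "no"], [0.9, 0.3, 0.3]))   # yes
```

Note how the weighted scheme lets one trusted labeler outvote two unreliable ones, yet neither scheme captures the human judgment the paper argues is missing.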

Approach

Conducted a series of formative studies to investigate concept evolution in practice.

Observations and feedback from these studies informed the final prototype; feedback on the initial labeling software was incorporated.

Designed a study to evaluate the proposed Structured Labeling.

Preliminary Study 1

Researchers/practitioners who create guidelines for labelers; interviewed 2 of them.

Feedback: the guideline-creation process is iterative and evolves as new data is observed (e.g., examples with multiple interpretations).

Preliminary Study 2

Recruited 11 machine learning experts for a binary-choice labeling task using prototype software.

Preliminary Study 3

Conducted with 9 of the previous 11 participants, four weeks apart, using the same prototype software; same content but shuffled order.

[Chart legend: not significant difference / significant difference]

Incorporating Feedback into the Study Software

Study Software Interface

The experiment tested 3 interface conditions:

Baseline: traditional, mutually exclusive “Yes”, “No”, “Could be”

Structured: manual structuring (Structured Labeling)

Assisted: Structured Labeling + automated assistance

Study Procedure

15 participants; 108 items to label; fixed task order: cooking, travel, and gardening.

Brief introduction; time to practice; interactions logged in each interface; a questionnaire after completing each task and another after completing all 3 tasks.

Result: Groups

Group count: Structured > Baseline (p<0.001); Manual > Baseline (p<0.001); Assisted > Baseline (p<0.001)

Pages per group: “Could be” < “Yes” or “No”; “Yes” < “No”

Result: Revision

Revisited count: Manual > Baseline (p<0.005); Assisted > Baseline (p<0.005)

Revised count: Structured > Baseline (p<0.011); Manual > Baseline (p<0.006); Assisted > Baseline (p<0.024)

[Chart: first half vs. last half of the task]

Result: Label Quality

Metric: ARI (Adjusted Rand Index), which measures agreement over pairs of items that should end up together, out of all possible pairs.

Label quality: Manual > Baseline (p=0.02); Assisted > Baseline (p=0.02); no significant difference between Manual and Assisted (p=0.394)
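A minimal sketch of the pair-counting ARI used as the label-quality metric here (scikit-learn's `adjusted_rand_score` computes the same quantity):

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected pair-counting agreement between two labelings."""
    pairs = lambda counts: sum(comb(c, 2) for c in counts.values())
    sum_ab = pairs(Counter(zip(labels_a, labels_b)))  # pairs grouped together in both
    sum_a, sum_b = pairs(Counter(labels_a)), pairs(Counter(labels_b))
    total = comb(len(labels_a), 2)                    # all possible pairs
    expected = sum_a * sum_b / total                  # expected agreement by chance
    max_index = (sum_a + sum_b) / 2
    return (sum_ab - expected) / (max_index - expected)

# Same grouping under different group names: perfect agreement
print(adjusted_rand_index([0, 0, 1, 1], ["a", "a", "b", "b"]))  # 1.0
```

The chance correction is what makes ARI suitable for comparing labelers: two people who group items the same way score 1.0 regardless of what they call their groups, while random groupings score near 0.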

Result: Labeling Speed

Manual < Baseline (p=0.003); Assisted < Baseline (p<0.001), i.e., structured labeling is slower.

Feedback

Participants ranked each tool from favorite to least favorite.

“How often did your concept change?” asked on a Likert scale.

Summary

Structured Labeling helps people evolve concepts and increases label consistency at the cost of speed.

It can also help machine learning algorithms, e.g., by weighting groups (“definitely yes” vs. “yes”).
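One way the group weighting mentioned above could feed into a learner is as per-example sample weights; the group-to-weight mapping and helper below are invented for illustration, not taken from the paper:

```python
# Hypothetical mapping from structured-label groups to training weights
GROUP_WEIGHTS = {
    "definitely yes": 1.0,
    "yes": 0.7,
    "could be": 0.3,   # ambiguous items contribute weakly
    "no": 0.7,
    "definitely no": 1.0,
}

def to_training_examples(labeled_items):
    """Turn (item, group) pairs into (item, binary_label, weight) triples."""
    examples = []
    for item, group in labeled_items:
        target = 0 if "no" in group else 1
        examples.append((item, target, GROUP_WEIGHTS[group]))
    return examples

print(to_training_examples([("page1", "definitely yes"), ("page2", "could be")]))
# [('page1', 1, 1.0), ('page2', 1, 0.3)]
```

Most learners that accept a `sample_weight` argument could then consume these triples directly, so confident labels shape the decision boundary more than ambiguous ones.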

Contribution

Concept evolution causes inconsistent labeling; this work is the first to show its importance.

Critique of the Work

Fixed task order (cooking, travel, and gardening): possible carry-over effects

Limited to supervised learning

Assisted structuring: not always possible, and may bias decisions

Thank You