Frame the Crowd: Global Visual Features Labeling Boosted with Crowdsourcing Information
DESCRIPTION
Presentation of our submission for the Crowdsourcing task of the MediaEval 2013 Workshop.
TRANSCRIPT
Frame the Crowd: Global Visual Features Labeling boosted with Crowdsourcing Information
Presentation: Michael Riegler, AAU; Mathias Lux, AAU; Christoph Kofler, TU Delft
Framing
• Similar intentions for taking pictures lead to similar framings of the images
Example 1
Example 2
Idea
• Solve the problem with a global visual features approach based on the framing theory
– Always available and free (besides computation time)
• Workers' reliability from crowdsourcing information
• Transfer learning
Visual Classifier
• Modification of the LIRE framework
• Search-based
• 12 global features
• Feature selection
• Feature combination – late fusion
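The late-fusion combination mentioned above can be sketched as score-level averaging across the per-feature classifiers. This is a minimal illustration, not the actual LIRE-based implementation; the feature names and equal weights are assumptions.

```python
def late_fusion(score_lists, weights=None):
    """Combine per-feature classification scores by weighted averaging.

    score_lists: one list of scores per global feature, aligned by image.
    weights: optional per-feature weights; defaults to a uniform average.
    """
    n = len(score_lists)
    if weights is None:
        weights = [1.0 / n] * n
    # Fuse image-by-image: weighted sum of each feature's score.
    return [sum(w * s for w, s in zip(weights, scores))
            for scores in zip(*score_lists)]

# Hypothetical scores from three global features for four test images:
cedd  = [0.9, 0.2, 0.6, 0.4]
fcth  = [0.8, 0.3, 0.5, 0.5]
gabor = [0.7, 0.1, 0.7, 0.3]
fused = late_fusion([cedd, fcth, gabor])
```

Fusing at the score level (rather than concatenating feature vectors) lets each of the 12 features keep its own distance metric before combination.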
Workers’ Reliability
• Calculate the reliability of a worker: #(agrees with majority vote) / #(total votes by worker)
• Used as weight for the votes
• Together with self-reported familiarity as feature vector
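The reliability formula above can be sketched in a few lines. The vote tuple format is a hypothetical illustration of the idea, not the submission's actual data layout.

```python
from collections import Counter, defaultdict

def worker_reliability(votes):
    """Reliability per worker: #(agrees with majority vote) / #(total votes).

    votes: list of (worker_id, item_id, label) tuples.
    """
    # Determine the majority label per item (ties broken arbitrarily).
    per_item = defaultdict(Counter)
    for worker, item, label in votes:
        per_item[item][label] += 1
    majority = {item: c.most_common(1)[0][0] for item, c in per_item.items()}

    # Count, per worker, how often they agree with the majority.
    agree, total = Counter(), Counter()
    for worker, item, label in votes:
        total[worker] += 1
        if label == majority[item]:
            agree[worker] += 1
    return {w: agree[w] / total[w] for w in total}
```

The resulting score can then serve directly as the vote weight described above.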
Runs
1. Reliability measure for workers
2. Visual information with MMSys model
3. Visual information with low-fidelity worker votes of the Fashion10000 dataset model
4. Visual information with the Fashion10000 dataset newly labeled by the method of run #1
5. Visual-information-based decision for unclear results of run #1
MediaEval Results
[Bar chart: F1 scores (0–1) for Label 1 and Label 2 across the five runs: Crowdsourcing, Visual + MMSys, Visual + F10000 low, Visual + F10k new labeled, Crowd + Visual]
Weighted F1 Score (WF1)
• The F1 score of each class, weighted by the class frequency
• Helps to interpret the results better
• Can compensate for differences between imbalanced classes
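The weighted F1 metric described above can be sketched as follows; the per-class F1 values and supports in the example are illustrative, not results from the paper.

```python
def weighted_f1(f1_per_class, support_per_class):
    """Average the per-class F1 scores, each weighted by its class support.

    This compensates for class imbalance: a frequent class contributes
    proportionally more to the overall score than a rare one.
    """
    total = sum(support_per_class)
    return sum(f1 * n for f1, n in zip(f1_per_class, support_per_class)) / total

# Hypothetical biased setting: Label 1 dominates the dataset.
wf1 = weighted_f1([0.9, 0.4], [800, 200])  # ≈ 0.8
```

This matches the "weighted" averaging convention also offered by common evaluation libraries (e.g., scikit-learn's `f1_score(average='weighted')`).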
Cross Validation Results
[Bar chart: F1 and Weighted F1 scores (0–1) for Label 1 and Label 2 across the five runs: Crowdsourcing, Visual + MMSys, Visual + F10000 low, Visual + F10k new labeled, Crowd + Visual]
Conclusion
• Calculating the workers' reliability performs well
– It is well known that metadata leads to better results
• Transfer learning works well
– Crowdsourcing can boost visual classification
• With visual features, even a small amount of labeled data leads to good results
• The usefulness of framing is reflected in the results
• Label 1 is very well detectable with global visual features, but Label 2 is not (concept detection)
• The weighted F1 score can help to understand the results better