tool criticism

9
Tool Criticism Myriam C. Traub, Jacco van Ossenbruggen, Lynda Hardman Information Access

Upload: myriam-traub

Post on 03-Aug-2015

51 views

Category:

Science


0 download

TRANSCRIPT

Page 1: Tool Criticism

Tool CriticismMyriam C. Traub, Jacco van Ossenbruggen, Lynda Hardman!Information Access

Page 2: Tool Criticism

“Amsterdam”

1642

“Amfterdam”

1624

“?”

16xx?

2

First mention of …

1618

… in the OCRed newspaper archive of the KB?

“Amslerdam”

1673

earliest !document

Page 3: Tool Criticism

01

Digital Humanities

✤ digital archives / large data collections!

✤ collection bias / digitization policy!

✤ digital representation ≠ physical object!

✤ tools are imperfect !

✤ source cannot be considered independent from tools!

✤ need methods to detect tool-induced bias

3

Page 4: Tool Criticism

01

Source criticism

✤ well established method in the humanities to detect bias in a source!✤ Who is the author?!✤ Is the information current?!✤ Is the information objective

and credible?…!

✤ need a similar approach for tool-induced bias: !! tool criticism

4

Page 5: Tool Criticism

01

Methodology

✤ interviews with humanities scholars!

✤ classification of common research tasks!

✤ lack of trust blocks progress!

✤ use case: digital newspaper archive of KB !

✤ no formal OCR evaluation!

✤ useful for scholars?!

✤ mismatch between two perspectives

5

Page 6: Tool Criticism

We care about average performance

on representative subsets for generic

cases.

I care about actual performance

on my non-representative subset

for my specific query.

6

Two different perspectives of quality evaluation

Page 7: Tool Criticism

No silver bullet

✤ we propose novel strategies that solve part of the problem:!✤ critical attitude !(awareness and better support)!

✤ transparency !(provenance, open source, documentation, …)!

✤ alternative quality metrics!(taking research context into account)

7

M. Traub, J. van Ossenbruggen, L. Hardman, !Impact Analysis of the OCR Quality Problem in Digital Archives, TPDL2015 (under review)

Page 8: Tool Criticism

8

“Amsterdam”

Context-aware quality indicatorsquery

data

task

Page 9: Tool Criticism

01

Future work

✤ What strategies should a tool support to help scholars discover and dealing with bias?!

✤ What is a good way of estimating uncertainty for a specific task?!

✤ Can we crowdsource (part of the data for) better estimates?!

✤ What is a good way of conveying the estimated impact to scholars?

9

… questions?