University of Sheffield NLP
Teamware: A Collaborative, Web-based Annotation Environment
Kalina Bontcheva, Milan Agatonovic
University of Sheffield
GATE Summer School - July 27-31, 2009
Hands-on Preparation
• Go to the FIG’09 Wiki: http://gate.ac.uk/wiki/Wiki.jsp?page=FIG09
• Under Resources, Teamware lecture, click on the link to the Teamware install
• Log in using your user name (from your reg. pack): <cics-account-id>-annotator
• Click on the link “Annotation Editor” to download and prepare the software for our first hands-on
• When it opens, leave it as is until we need it
Outline
• Why Teamware?
• What’s Teamware?
• Teamware for annotation
• Teamware for quality assurance and curation
• Teamware for defining workflows, running automatic services, managing annotation projects
• Outlook
From Annotation Tools to Collaborative Annotation Workflows
We have lots and lots of tools and algorithms for annotation; what we need is:
1. methodological instead of purely technological
2. multi-role instead of single role
3. assistive instead of autonomous
4. service-orientated, not monolithic
5. usable by non-specialists
GATE Teamware
• Research users in several EU projects
• External users at IRF and Matrixware
• Interest from other commercial users as well
GATE Teamware: Annotation Workflows on the Web
GATE Teamware is:
• Collaborative, social, Web 2.0, has behaviour mining using Machine Learning
• Parallel and distributed (using web services)
• Scalable (via service replication)
• Workflow based with business process integration
Teamware – Layer Cake
[Diagram: the Teamware architecture in three layers]
• Teamware Executive Layer: Workflow Management; Authentication and User Management
• Services Layer: GATE Document Service; GATE Annotation Services; GATE Ontology Service; GATE Machine Learning API
• User Interface Layer: Manual Annotation User Interface; Schema Annotation UI; Ontology Annotation UI; Data Curation User Interface; Annotation Diff UI; ANNIC UI; Document Browser; Language Engineer User Interface; GATE Developer UI
Division of Labour: A Multi-role Methodology
• (Human) annotators: labour has to be cheap!
  - Bootstrap the annotation process with JAPE rules or mixed-initiative learning
• Curators (or super-annotators)
  - Reconcile differences between annotators, using IAA, AnnDiff, curator UI
• Manager
  - Defines annotation guidelines and schemas
  - Chooses relevant automatic services to pre-process
  - Toolset including performance benchmarking, progress monitoring tools, small linguistic customisations
  - Defines the workflow, manages annotators, liaises with language engineers and sys admins
• Sys admin
  - Sets up the Teamware system, users, etc.
• Language engineer
  - Uses GATE Developer to create bespoke services and deploy them online
Teamware: Manual Annotation Tool
Manual Annotation Process
• Annotator logs into Teamware
• Clicks on “Open Annotation Editor”
• Requests an annotation task (first button)
• Annotates the assigned document
• When done, presses the “Finish task” button
• To save work and return to the task later, press the “Save” button, then close the UI. The next time a task is requested, the same document will be assigned so it can be finished
• Depending on the project setup, it might be possible to reject a document and then ask for another one to annotate (Reject button)
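The annotator's steps above amount to a small task lifecycle. The Python sketch below models it under stated assumptions: the class `TaskQueue`, its method names and the document names are hypothetical and are not Teamware's API. It only illustrates that a saved task is handed back on the next request, while a finished or rejected task makes way for another document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaskQueue:
    """Hypothetical model of task assignment; not Teamware's real API."""
    pending: List[str]                       # documents not yet assigned
    allow_reject: bool = True                # depends on the project setup
    in_progress: Dict[str, str] = field(default_factory=dict)  # annotator -> document
    finished: List[str] = field(default_factory=list)

    def request_task(self, annotator: str) -> Optional[str]:
        # A saved (unfinished) task is always handed back first.
        if annotator in self.in_progress:
            return self.in_progress[annotator]
        if not self.pending:
            return None
        doc = self.pending.pop(0)
        self.in_progress[annotator] = doc
        return doc

    def finish_task(self, annotator: str) -> None:
        # "Finish task": the document is done, so a new one can be requested.
        self.finished.append(self.in_progress.pop(annotator))

    def reject_task(self, annotator: str) -> None:
        # "Reject": only possible if the project setup allows it.
        if not self.allow_reject:
            raise PermissionError("rejecting documents is disabled for this project")
        # The document goes back to the pool; a real system would also avoid
        # offering it to the same annotator again.
        self.pending.append(self.in_progress.pop(annotator))


queue = TaskQueue(pending=["doc1.xml", "doc2.xml"])
first = queue.request_task("alice")    # a new document is assigned
again = queue.request_task("alice")    # saved and reopened: same document
queue.finish_task("alice")
second = queue.request_task("alice")   # a different document this time
```

Keeping the "one open task per annotator" rule inside `request_task` is a design choice of this sketch; it mirrors the behaviour described on the slide, where saving and closing the UI never loses the assignment.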
Hands-on
• Open a web browser and go to Teamware
• Log in using your user name (from your reg. pack): <cics-account-id>-annotator
• Open the annotation UI
• Try requesting tasks, editing annotations, saving your work, asking for another task, etc.
• This is what Teamware looks like to a human annotator
Teamware for Curators
• Still being developed, so the UI is in transition
• Identify whether there are differences between annotators using IAA
• Inspect differences in detail using AnnDiff
• Edit and reconcile differences if required
• New curator UI in Teamware is under development; currently available in Developer
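To make "inspect differences in detail" concrete, here is a minimal Python sketch of a pairwise annotation diff. It is a simplification written for illustration, not the AnnDiff implementation: the `Ann` tuple, the `diff` function and the four labels are assumptions of this sketch, and annotation features are ignored. Exact matches are paired first, then overlapping annotations of the same type, and whatever is left is reported as missing or spurious.

```python
from typing import List, NamedTuple, Optional, Tuple


class Ann(NamedTuple):
    start: int      # character offset where the annotation begins
    end: int        # character offset where it ends (exclusive)
    ann_type: str   # annotation type, e.g. "Person"


def overlaps(a: Ann, b: Ann) -> bool:
    """True if both annotations have the same type and share some text."""
    return a.ann_type == b.ann_type and a.start < b.end and b.start < a.end


def diff(key: List[Ann], response: List[Ann]) -> List[Tuple[str, Optional[Ann], Optional[Ann]]]:
    """Pair up two annotators' annotations and label each pair."""
    rows: List[Tuple[str, Optional[Ann], Optional[Ann]]] = []
    unmatched = list(response)
    remaining = []
    # Pass 1: identical span and type.
    for k in key:
        if k in unmatched:
            unmatched.remove(k)
            rows.append(("correct", k, k))
        else:
            remaining.append(k)
    # Pass 2: same type, overlapping span.
    for k in remaining:
        partner = next((r for r in unmatched if overlaps(k, r)), None)
        if partner is None:
            rows.append(("missing", k, None))
        else:
            unmatched.remove(partner)
            rows.append(("partial", k, partner))
    # Anything the second annotator produced that was never paired.
    rows.extend(("spurious", None, r) for r in unmatched)
    return rows


annotator_a = [Ann(0, 5, "Person"), Ann(10, 19, "Location"), Ann(25, 30, "Date")]
annotator_b = [Ann(0, 5, "Person"), Ann(10, 15, "Location"), Ann(40, 45, "Person")]
rows = diff(annotator_a, annotator_b)
labels = [label for label, _, _ in rows]
```

With these two hypothetical annotators the rows come out as one exact match, one partial match, one annotation only the first annotator made, and one only the second made, which is the kind of table a curator would then reconcile.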
IAA: Do my annotators agree?
IAA: Results
IAA: Recap
• The IAA on IE tasks, such as named entity recognition, should be measured using f-measure across all annotators
• For classification tasks, use Kappa to measure IAA
• For details, see the evaluation lecture and the GATE user guide
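The two measures named above can be sketched in a few lines of Python. This is an illustration under assumptions, not GATE's IAA plugin: annotations are reduced to hypothetical `(start, end, type)` tuples, matching is strict (identical span and type), and only one pair of annotators is compared; with more annotators one option would be to average over all pairs.

```python
from collections import Counter
from typing import Hashable, Sequence, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, annotation type)


def f_measure(a: Set[Span], b: Set[Span]) -> float:
    """Strict F1, treating annotator `a` as the key and `b` as the response."""
    if not a and not b:
        return 1.0  # nothing to annotate and nothing annotated: full agreement
    matches = len(a & b)
    precision = matches / len(b) if b else 0.0
    recall = matches / len(a) if a else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both annotators must label the same non-empty item list")
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each annotator's label distribution.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1 - expected)


# Named entity task: two of three annotations match exactly.
ner_a = {(0, 5, "Person"), (10, 19, "Location"), (25, 30, "Date")}
ner_b = {(0, 5, "Person"), (10, 19, "Location"), (40, 45, "Person")}
ner_agreement = f_measure(ner_a, ner_b)

# Classification task: four items, the annotators disagree on one.
kappa = cohen_kappa(["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "neg"])
```

Swapping the two annotators exchanges precision and recall but leaves F1 unchanged, so it does not matter which one is treated as the key. Kappa additionally discounts the agreement expected by chance, which is why it suits classification tasks with a fixed label set.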
AnnDiff: Finding the differences
Where are these in Teamware?
• Only visible to curators and their managers
• Resources/Documents menu
• Select the corpus worked on
• Iterate through each document
• Run IAA and AnnDiff, as required
• Try for yourself:
  - Log in as <cics-user-name>-curator
  - Corpus: annie-demo
  - The first or second documents
Forthcoming curator facilities
• Have a corpus-level view of IAA
• Extended AnnDiff to allow easy reconciliation of the differences between 2 annotators
• Currently prototyped in Developer
• Will be made available in Teamware soon
New AnnDiff in Developer
Beyond Pair-wise Reconciliation
• AnnDiff only handles 2 sets of annotations at a time – we often need more!
• Towards an in-place, content-based reconciliation interface
Current UI Prototype
Teamware for Managers
• Defining workflows
• Running annotation projects
• Tracking progress
Teamware Workflows
• The whole process is controlled by a workflow manager
• A workflow may be simple:
  - Give the document to a human annotator
  - Information curator checks a sample of documents for QC
• or more complex:
  - Invoke one or more web services to produce automatic annotations
  - Pass each document to 2 annotators
  - Information curator quickly checks the level of agreement between the annotators and reconciles any differences
  - Annotated documents are used to train an ML model
  - When the model is good enough, start making suggestions to the annotators
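The first three steps of the more complex workflow can be sketched as a small Python pipeline. Everything here is hypothetical and stands in for what the workflow manager does: documents are plain dictionaries, `tokeniser_stub` stands in for a web service, and the function names are made up for this sketch. The model-training steps are left out.

```python
from itertools import cycle
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def run_workflow(docs: List[Doc], services: List[Callable[[Doc], None]],
                 annotators: List[str], per_doc: int = 2) -> List[Doc]:
    """Run automatic services, then assign each document to `per_doc` annotators."""
    if per_doc > len(annotators):
        raise ValueError("not enough annotators for the requested redundancy")
    rotation = cycle(annotators)  # round-robin keeps the workload balanced
    for doc in docs:
        for service in services:
            service(doc)          # automatic pre-annotation
        doc["assigned_to"] = [next(rotation) for _ in range(per_doc)]
    return docs


def curator_queue(docs: List[Doc]) -> List[str]:
    """Names of documents whose annotators produced different annotation sets."""
    queue = []
    for doc in docs:
        distinct = {frozenset(anns) for anns in doc.get("manual", {}).values()}
        if len(distinct) > 1:
            queue.append(doc["name"])
    return queue


def tokeniser_stub(doc: Doc) -> None:
    # Stand-in for a real web service call.
    doc.setdefault("automatic", []).append("Token")


docs = run_workflow([{"name": "d1"}, {"name": "d2"}, {"name": "d3"}],
                    [tokeniser_stub], ["ann1", "ann2", "ann3"])

# Simulated manual results: agreement on d1, disagreement on d2, d3 not done yet.
docs[0]["manual"] = {"ann1": {(0, 5, "Person")}, "ann2": {(0, 5, "Person")}}
docs[1]["manual"] = {"ann3": {(0, 5, "Person")}, "ann1": set()}
to_reconcile = curator_queue(docs)
```

Here the curator queue uses exact set equality as the agreement check; in practice a curator would look at an agreement score and decide per document whether reconciliation is needed.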
Workflow Templates
Defining new workflows
• Select Projects/WF Templates
• Opens the WF wizard
• Choose which services you want to run
• Choose whether you want manual annotation, how many annotators per doc, …
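The choices the wizard asks for can be pictured as a small record. The Python sketch below is an assumption made for illustration: `WorkflowTemplate`, its field names and the service name are hypothetical and do not reflect how Teamware stores templates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class WorkflowTemplate:
    """Hypothetical record of the wizard's choices; not Teamware's format."""
    name: str
    services: List[str] = field(default_factory=list)  # automatic services to run
    manual_annotation: bool = False                     # include a manual step?
    annotators_per_doc: int = 1                         # used only for manual steps

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the template is usable."""
        problems = []
        if not self.services and not self.manual_annotation:
            problems.append("template has neither automatic services nor manual annotation")
        if self.manual_annotation and self.annotators_per_doc < 1:
            problems.append("manual annotation needs at least one annotator per document")
        return problems


# A semi-automatic template: one (made-up) service, then double annotation.
template = WorkflowTemplate(
    name="double-annotation",
    services=["pre-annotation-service"],
    manual_annotation=True,
    annotators_per_doc=2,
)
template_problems = template.validate()

# A template with nothing selected is flagged.
empty_problems = WorkflowTemplate(name="empty").validate()
```

Separating the template from the project run matches the slides: the template fixes what happens to each document, while the corpus, annotators and curators are chosen only when the project is started.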
Setting up a Manual Annotation Project
• Upload the schemas
• Upload the documents
• Define the Workflow template
• Run the project, choosing the corpus, the annotators, curators, etc.
• DEMO!
Setting up an Automatic Annotation Project
• Configure the web service(s)
• Define the Workflow template
• Run the project, choosing the corpus
• DEMO!
Semi-automatic Projects
• Just combine the two sets of steps
Teamware: Monitoring Project Progress
Outlook
• Teamware is still under active development
• Many features subject to change
• If you’d like further information or to try it with your data for a particular project, please contact Hamish and Kalina