UIMA Introduction SHARPn Summit June 11, 2012

Page 1: UIMA Introduction

UIMA Introduction

SHARPn Summit June 11, 2012

Page 2: UIMA Introduction

Outline

UIMA Terminology (not just TLAs)
Parts of a UIMA pipeline
Running a pipeline
Viewing annotations interactively

Page 3: UIMA Introduction

UIMA Terminology

CAS
XCAS
JCAS
View
Analysis Engine (AE) / Annotator
XML output: XCAS, XMI
Type System
JCasGen
CAS Visual Debugger (CVD)
CPE (Collection Processing Engine)

Page 4: UIMA Introduction

UIMA

Framework
– Defining data types
– Passing data from one component to another

Tooling
– Viewing results
– Debugging
– Editing XML visually

Page 5: UIMA Introduction

Data Through a Pipeline

Type System
– Defines the data types passed along

CAS (Common Analysis Structure)
– Container for the data passed along
– Created by UIMA from the Type System
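The relationship between the Type System and the CAS can be pictured with a small plain-Java model. This is only a conceptual sketch and does not use the UIMA API: `ToyCas`, `Span`, and the type names `Token` and `Sentence` are hypothetical names chosen for illustration. It shows a container that holds the document text plus annotations stored as begin/end offsets, and that only accepts annotation types declared up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a CAS: document text plus stand-off annotations. Not the UIMA API. */
class ToyCas {
    /** One annotation: a type name plus begin/end character offsets into the text. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    private final String text;
    private final List<String> declaredTypes;            // plays the role of the Type System
    private final List<Span> spans = new ArrayList<>();

    ToyCas(String text, List<String> declaredTypes) {
        this.text = text;
        this.declaredTypes = declaredTypes;
    }

    /** Adds an annotation, rejecting types that were not declared and offsets outside the text. */
    void add(String type, int begin, int end) {
        if (!declaredTypes.contains(type)) {
            throw new IllegalArgumentException("Undeclared type: " + type);
        }
        if (begin < 0 || end > text.length() || begin > end) {
            throw new IllegalArgumentException("Bad offsets: " + begin + "," + end);
        }
        spans.add(new Span(type, begin, end));
    }

    /** Returns the text covered by each annotation of the given type, in insertion order. */
    List<String> coveredText(String type) {
        List<String> out = new ArrayList<>();
        for (Span s : spans) {
            if (s.type.equals(type)) {
                out.add(text.substring(s.begin, s.end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas("Family history of hyperlipidemia.",
                Arrays.asList("Token", "Sentence"));
        cas.add("Sentence", 0, 33);
        cas.add("Token", 0, 6);
        cas.add("Token", 7, 14);
        System.out.println(cas.coveredText("Token"));
    }
}
```

In real UIMA the types are declared in a Type System descriptor and the framework creates the CAS; the sketch only mirrors that division of roles.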

Page 6: UIMA Introduction

Parts of a UIMA Pipeline

Collection Reader
– Read input document

Analysis Engine(s) / Annotator(s)
– Process document

CAS Consumer
– Output data

Page 7: UIMA Introduction

Tying a Pipeline Together

CPE descriptor (Collection Processing Engine)
– Collection Reader
– Analysis Engine(s)
– CAS Consumer

Aggregate analysis engine
– Multiple Analysis Engines and their order

Page 8: UIMA Introduction

Pipeline Example

UIMA term | Example
Collection Reader | Read files from a dir
Analysis Engine | Sentence detector
Analysis Engine | Tokenizer annotator
Analysis Engine | Part of Speech tagger
CAS Consumer | Output tokens to DB
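The flow in this example can be mimicked in a few lines of plain Java. The sketch below is a conceptual model only and does not use UIMA classes: `ToyPipeline`, `Doc`, and `Engine` are hypothetical names, the "reader" is a list of strings rather than a directory, the "database" is an in-memory list, and the part-of-speech tagger is left out to keep it short. It illustrates that each stage reads and extends a shared container, and that the order of the engines matters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the reader -> engines -> consumer flow. Not the UIMA API. */
class ToyPipeline {
    /** Shared container passed from stage to stage (the role the CAS plays). */
    static final class Doc {
        final String text;
        final List<String> sentences = new ArrayList<>();
        final List<String> tokens = new ArrayList<>();

        Doc(String text) {
            this.text = text;
        }
    }

    /** Role of an Analysis Engine: reads the container and adds results to it. */
    interface Engine {
        void process(Doc doc);
    }

    /** Naive sentence detector: splits on whitespace that follows '.', '!' or '?'. */
    static final Engine SENTENCE_DETECTOR = doc -> {
        for (String s : doc.text.split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                doc.sentences.add(s);
            }
        }
    };

    /** Naive tokenizer: works on the sentences found earlier, so it must run second. */
    static final Engine TOKENIZER = doc -> {
        for (String sentence : doc.sentences) {
            for (String t : sentence.split("[\\s.!?]+")) {
                if (!t.isEmpty()) {
                    doc.tokens.add(t);
                }
            }
        }
    };

    /** Runs every input through the engines in order and collects the tokens. */
    static List<String> run(List<String> inputs, List<Engine> engines) {
        List<String> sink = new ArrayList<>();       // stands in for the database
        for (String input : inputs) {                // "collection reader"
            Doc doc = new Doc(input);
            for (Engine engine : engines) {          // ordered list, like an aggregate
                engine.process(doc);
            }
            sink.addAll(doc.tokens);                 // "CAS consumer"
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(
                Arrays.asList("Family history of hyperlipidemia. No chest pain."),
                Arrays.asList(SENTENCE_DETECTOR, TOKENIZER)));
    }
}
```

Swapping the two engines produces no tokens, because the tokenizer finds no sentences to work on. That ordering dependency is what an aggregate descriptor records.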

Page 9: UIMA Introduction

UIMA plugin for Eclipse

Provides visual editors for descriptors
– Mini GUI for selecting options
– Rather than editing XML directly

An “update site” exists for installing the plugin: http://www.apache.org/dist/incubator/uima/eclipse-update-site

Page 10: UIMA Introduction

UIMA Tooling Options

Tools:
– CPE Configurator
– CVD (CAS Visual Debugger)

Options:
– Command line scripts/.bat files
– Run within Eclipse

Page 11: UIMA Introduction

Running a Pipeline - CPE

cTAKES provides a script and a .bat file: runctakesCPE

Choose a CPE descriptor, such as test_plaintext.xml from cTAKESdesc/cdpdesc/collection_processing_engine

Page 12: UIMA Introduction

Viewing Annotations - CVD

Viewing annotations using the CVD
– Load the Type System
– Load the XCAS or XMI

Page 13: UIMA Introduction

Annotation Viewers

UIMA tools
– CVD (CAS Visual Debugger)
– Annotation viewer

Viewing XML output
– Any XML viewer
– Any text editor
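Because the output is ordinary XML, it can also be inspected with a few lines of code using the JDK's built-in XML parser. The sketch below is a rough illustration under stated assumptions: the sample string is a hand-written, heavily simplified XMI-like fragment, the `ex` namespace and the `Sentence` and `Token` element names are hypothetical, and `SpanLister` is not part of UIMA. It lists every element that carries `begin` and `end` attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Lists elements that look like annotations (they have begin and end attributes). */
class SpanLister {
    /** Returns "elementName[begin,end]" for each matching element, in document order. */
    static List<String> listSpans(String xml) {
        Document dom;
        try {
            dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        NodeList all = dom.getElementsByTagName("*");   // every element in the document
        List<String> out = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("begin") && e.hasAttribute("end")) {
                out.add(e.getTagName() + "[" + e.getAttribute("begin")
                        + "," + e.getAttribute("end") + "]");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-written sample; real XMI output contains more elements and attributes.
        String sample =
                "<xmi:XMI xmlns:xmi='http://www.omg.org/XMI' xmlns:ex='http:///example.ecore'>"
              + "<ex:Sentence xmi:id='1' begin='0' end='33'/>"
              + "<ex:Token xmi:id='2' begin='0' end='6'/>"
              + "</xmi:XMI>";
        for (String span : listSpans(sample)) {
            System.out.println(span);
        }
    }
}
```

For interactive browsing, the CVD and the annotation viewer remain the more convenient options, since they highlight each annotation in the document text.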

Page 14: UIMA Introduction

Questions?

http://uima.apache.org/

Page 15: UIMA Introduction

Supplemental slides follow

Page 16: UIMA Introduction

Options to Run a Pipeline

CPE GUI

CVD GUI
– Single Aggregate Analysis Engine
– No Collection Reader

Instantiate a CpeDescription, produce a CollectionProcessingEngine from it, and invoke its process() method

uimaFIT
– Removes dependency on XML

Page 17: UIMA Introduction

Creating a New Annotator

Within Eclipse
– Create Java project
– Right click -> Add UIMA Nature
– Add UIMA jars to .classpath (Build Path)
– Create Analysis Engine (AE) descriptor
– Add types to AE descriptor, or optionally create separate Type System descriptor
– Write code!

Page 18: UIMA Introduction

Running an AE in CVD

Using CVD to run an Analysis Engine
– No Collection Reader
– Single Analysis Engine (can be an aggregate)
– No CAS Consumer
– Load an Analysis Engine
– Paste/type in text to process

Family history of hyperlipidemia.

Page 19: UIMA Introduction

Modifying a parameter

UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.

Page 20: UIMA Introduction

Links

Getting started with UIMA http://uima.apache.org/doc-uima-annotator.html

UIMA Update site for use in Eclipse http://www.apache.org/dist/incubator/uima/eclipse-update-site