
UIMA for NLP based Researchers’ Workplaces in Medical Domains

Manuela Kunze, Dietmar Rösner

University of Magdeburg, Germany


Researchers’ Workplace in Medical Domains

Manuela Kunze, UIMA Workshop 2008

[Figure: the researchers’ workplace combines a GUI with UIMA components. It processes autopsy protocols from forensic medicine and epicrises (clinical reports) from psychotherapists.]


Outline

• use cases
  1. Processing clinical reports
  2. Processing autopsy protocols
• architecture
  – preprocessing
  – analyses
  – presentation of results
• summary


Corpus: Clinical Reports

• clinical reports: summary reports of diagnoses and treatment for specific patients
• research question posed by psychotherapists:
  – to detect significant changes in the distribution of diagnoses
    • e.g. related to the fundamental changes of the socio-political system after 1989 in former East Germany
• started with a feasibility study
  – diagnostic summaries (parts of epicrises)
  – spanning a time period of 20 years
  – ca. 1000 summaries


• a diagnostic summary can contain the following parts:
  – psychopathological symptoms
  – relevant personality traits and interpersonal problems
  – diagnostic label
  – (optional) related social incidents


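Two examples:

Schwere depressive Störung im Zusammenhang mit beruflicher und partnerschaftlicher Konfliktsituation bei schizoider Persönlichkeit mit narzisstischen Anteilen.
(English: Severe depressive disorder in connection with an occupational and partnership conflict situation, in a schizoid personality with narcissistic traits.)

Ängstlich-depressives Syndrom mit multiplen Somatisierungen.
(English: Anxious-depressive syndrome with multiple somatizations.)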


Corpus: Autopsy Protocols

• 120 forensic autopsy protocols
  – ca. 1 million running word forms
  – strictly defined format and content
• the protocols contain different parts, each with its own sublanguage:
  – findings
  – histological findings
  – background
  – causes of death
  – discussion
  – …
• research question:
  – detection of injury patterns and creation of the corresponding statistics


Preprocessing


clinical reports
• splitting:
  – documents contain the collected diagnostic summaries made in a year
• CPE (Collection Processing Engine):
  – detection of a diagnostic summary
  – diagnostic printer

autopsy protocols
• anonymisation:
  – names of persons, locations, birth dates
• CPE:
  – detection of sensitive data
  – replacement by placeholders
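
The anonymisation step for autopsy protocols (detect sensitive data, then replace it with placeholders) can be illustrated with a minimal sketch. It is not the CPE used in the project: the name and location lists, the date format, and the placeholder strings `[PERSON]`, `[LOCATION]` and `[DATE]` are all assumptions made for this example, and the pattern replaces every date rather than only birth dates.

```python
import re

# Hypothetical gazetteers; a real system would need far larger lists.
PERSON_NAMES = {"Max Mustermann", "Erika Musterfrau"}
LOCATIONS = {"Magdeburg", "Halle"}

# Assumed German date format such as 03.11.1957.
DATE_PATTERN = re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")


def anonymise(text: str) -> str:
    """Replace detected sensitive spans with typed placeholders."""
    text = DATE_PATTERN.sub("[DATE]", text)
    # Longer entries first, so multi-word names are replaced before substrings.
    for name in sorted(PERSON_NAMES, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", "[PERSON]", text)
    for place in sorted(LOCATIONS, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(place)}\b", "[LOCATION]", text)
    return text


if __name__ == "__main__":
    print(anonymise("Max Mustermann, geboren am 03.11.1957 in Magdeburg"))
```

Typed placeholders keep the information that a person, place or date was mentioned, which later annotators may still need, while removing the identifying value itself.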


Analyses of Documents


Analysis Module

• general (medical) analysis engines
  – structure tagger
  – POS tagger
  – Gazetteer annotator
  – GermaNet annotator
  – UMLS annotator
• processing of epicrises
  – discourse marker annotator
  – subfragment annotator
  – synonym annotator
  – rule-based classifier
  – OpenNLP MaxEnt classifier
• processing of autopsy protocols
  – Personal data annotator
  – Traumata annotator
  – Weapons annotator
  – Summary annotator
  – Criminal offense signs annotator
  – Context based analysis


Analyses of Documents: General Medical Tools

• structure tagger: sentence boundaries, numbers, abbreviations, …

• POS tagger: word categories, stems, case, number

• Gazetteer annotator: lists of syndromes, symptoms, diseases, …

• GermaNet annotator: information about GermaNet synsets

• UMLS annotator: information about concepts from the UMLS Metathesaurus
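
To make the idea of a list-based annotator concrete, here is a minimal sketch of gazetteer lookup that produces stand-off annotations with character offsets, similar in spirit to annotations stored in a CAS. It does not use the UIMA API; the `Annotation` class, the three-entry `GAZETTEER` and its category labels are assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    begin: int         # character offset where the span starts
    end: int           # character offset just past the span
    label: str         # category taken from the gazetteer
    covered_text: str  # the matched surface form


# Hypothetical mini-gazetteer: surface form -> category.
GAZETTEER = {
    "Syndrom": "syndrome",
    "Depression": "disease",
    "Somatisierungen": "symptom",
}


def annotate(text: str) -> list[Annotation]:
    """Return one annotation per whole-word gazetteer match, in text order."""
    found = []
    for term, label in GAZETTEER.items():
        for match in re.finditer(rf"\b{re.escape(term)}\b", text):
            found.append(Annotation(match.start(), match.end(), label, match.group()))
    return sorted(found, key=lambda a: a.begin)


if __name__ == "__main__":
    for annotation in annotate("Ängstlich-depressives Syndrom mit multiplen Somatisierungen."):
        print(annotation)
```

Because the word lists live outside the matching code, a domain expert could extend them without touching the annotator logic.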


Analyses of Documents: Processing Clinical Reports

• discourse marker annotator
• subfragment annotator
  – annotates the different parts of a diagnostic summary


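Example: Akute depressive Symptomatik im Zusammenhang mit Partnerschaftskonflikt auf der Basis einer primär neurotischen Fehlentwicklung bei selbstunsicherer depressiv strukturierter abhängiger Persönlichkeit.
(English: Acute depressive symptoms in connection with a partnership conflict, on the basis of a primarily neurotic maldevelopment, in a self-insecure, depressively structured, dependent personality.)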
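
One plausible way discourse markers could drive the subfragment annotation is sketched below: each marker closes the current part of the summary and opens the next one. The marker list and the part labels (`symptoms`, `related_incident`, `basis`, `personality`) are assumptions made for this example, not the rules used in the project.

```python
import re

# Hypothetical mapping: discourse marker -> label of the part that follows it.
MARKERS = {
    "im Zusammenhang mit": "related_incident",
    "auf der Basis": "basis",
    "bei": "personality",
}

MARKER_PATTERN = re.compile("|".join(rf"\b{re.escape(m)}\b" for m in MARKERS))


def split_subfragments(summary: str) -> list[tuple[str, str]]:
    """Split a diagnostic summary into (label, text) pairs at discourse markers."""
    fragments = []
    label = "symptoms"  # assumption: text before the first marker names the symptoms
    start = 0
    for match in MARKER_PATTERN.finditer(summary):
        piece = summary[start:match.start()].strip(" .,")
        if piece:
            fragments.append((label, piece))
        label = MARKERS[match.group()]
        start = match.end()
    tail = summary[start:].strip(" .,")
    if tail:
        fragments.append((label, tail))
    return fragments


if __name__ == "__main__":
    example = (
        "Akute depressive Symptomatik im Zusammenhang mit Partnerschaftskonflikt "
        "auf der Basis einer primär neurotischen Fehlentwicklung bei "
        "selbstunsicherer depressiv strukturierter abhängiger Persönlichkeit."
    )
    for part in split_subfragments(example):
        print(part)
```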


• synonym annotator
  – domain-specific synonymous terms
    • e.g. psychosomatic disorders, depression
• fragment classifier
  – OpenNLP classifier
  – rule-based classifier
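
A rule-based fragment classifier that relies on synonym normalisation might look like the following sketch. The synonym table, the class names and the fallback label `unclassified` are assumptions made for this example; the actual rules and the OpenNLP-based classifier are not reproduced here.

```python
from collections import Counter

# Hypothetical synonym table: lower-cased surface term -> canonical class.
SYNONYMS = {
    "depressive störung": "depression",
    "depressive symptomatik": "depression",
    "depressives syndrom": "depression",
    "somatisierungen": "psychosomatic disorder",
}


def classify(fragment: str) -> str:
    """Return the class of the first synonym found in the fragment."""
    lowered = fragment.lower()
    for term, concept in SYNONYMS.items():
        if term in lowered:
            return concept
    return "unclassified"


def distribution(fragments: list[str]) -> Counter:
    """Count how often each class occurs, e.g. for all summaries of one year."""
    return Counter(classify(fragment) for fragment in fragments)


if __name__ == "__main__":
    print(distribution([
        "Schwere depressive Störung",
        "Akute depressive Symptomatik",
        "multiple Somatisierungen",
    ]))
```

Mapping different wordings to one class is what makes counts comparable across years, which is the prerequisite for studying changes in the distribution of diagnoses.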


Analyses of Documents: Processing Autopsy Protocols


• Personal Data Annotator: age, weight

• Traumata Annotator: fractures, hematomas, stab wounds, etc.

• Criminal Offense Signs Annotator: signs of a criminal offense (e.g. ‘stab canal’)

• Weapons Annotator: thrusting, baton, …

• Summary Annotator: cause of death and manner of death

• Context based Analysis: relations between injuries and their respective locations
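
The context-based analysis relates injuries to their locations. A deliberately simple sketch of such a relation step is shown below: within each sentence, every injury term is paired with the nearest body-location term by token distance. The word lists and the nearest-term heuristic are assumptions made for this example and not the method used in the project.

```python
import re

# Hypothetical term lists; real annotators would supply these spans.
INJURIES = {"Fraktur", "Hämatom", "Stichwunde"}
BODY_LOCATIONS = {"Schädel", "Rippe", "Oberarm", "Brustkorb"}


def relate_injuries(text: str) -> list[tuple[str, str]]:
    """Pair each injury with the closest location term in the same sentence."""
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = re.findall(r"\w+", sentence)
        locations = [(i, t) for i, t in enumerate(tokens) if t in BODY_LOCATIONS]
        if not locations:
            continue  # no location mentioned, so nothing to relate
        for i, token in enumerate(tokens):
            if token in INJURIES:
                _, nearest = min(locations, key=lambda pos: abs(pos[0] - i))
                relations.append((token, nearest))
    return relations


if __name__ == "__main__":
    print(relate_injuries("Hämatom am linken Oberarm. Fraktur der dritten Rippe rechts."))
```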


Processing of Results

• different CAS Consumers
  – text summaries
  – for indexing: UIMA search engine, Lucene
• different user interfaces
  – presentation of annotations
  – search engines


• a generic interface to the UIMA Search Engine
  – input:
    • directory of the indexed files
    • directory of the CAS files
    • type system descriptor
    • XML descriptor for indexing
    • XML-based description of possible values for features
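
The five inputs listed above can be pictured as one configuration object. The sketch below is only an illustration: the class `SearchInterfaceConfig`, its field names, and in particular the XML layout for feature values (`<features>`, `<feature name="…">`, `<value>`) are assumptions, since the slides do not show the actual file formats.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SearchInterfaceConfig:
    index_dir: Path                      # directory of the indexed files
    cas_dir: Path                        # directory of the CAS files
    type_system_descriptor: Path         # type system descriptor (XML)
    index_descriptor: Path               # XML descriptor for indexing
    feature_values: dict[str, list[str]] # allowed values per feature


def parse_feature_values(xml_text: str) -> dict[str, list[str]]:
    """Read allowed feature values from the assumed XML layout."""
    root = ET.fromstring(xml_text)
    return {
        feature.get("name"): [value.text for value in feature.findall("value")]
        for feature in root.findall("feature")
    }


# Hypothetical example document; feature and value names are invented.
EXAMPLE_XML = """
<features>
  <feature name="Trauma.kind">
    <value>fracture</value>
    <value>hematoma</value>
  </feature>
</features>
"""

if __name__ == "__main__":
    config = SearchInterfaceConfig(
        index_dir=Path("index"),
        cas_dir=Path("cas"),
        type_system_descriptor=Path("typesystem.xml"),
        index_descriptor=Path("index-build.xml"),
        feature_values=parse_feature_values(EXAMPLE_XML),
    )
    print(config.feature_values)
```

Knowing the possible feature values in advance lets a user interface offer them as selectable options instead of requiring free-text queries.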


Summary

• Why UIMA?

– modular architecture, interfaces

– strict separation of resources and processing methods

– simple changes by domain experts are possible
