-
A Semantic Approach for Digital Long-Term Preservation of Electronic Health Documents
Stephan Kiefer, Fraunhofer Institute for Biomedical Engineering
-
Fraunhofer Institute for Biomedical Engineering
• One of the 60 R&D institutes of the Fraunhofer Society
• 360 projects annually, 120 with industry, in the fields:
– Biotechnology and Biobanking Technology
– Cryo-Biotechnology
– Stem Cell Research
– Neuromonitoring, Neuroprosthetics
– Miniaturized Systems
– Ultrasound Systems
– eHealth, Telemedicine, e-Infrastructures
• Fraunhofer Biomaterial Archives
-
Motivation and Background
• Transition from paper-based to fully electronic health records.
• Data need to be kept for decades.
• Lifelong electronic health records need to be accessible and usable for 100 years or more.
• Media, formats, standards, applications and operating systems will become obsolete.
• Data volumes produced in the healthcare sector are immense.
=> Digital preservation requires planning.
© Fraunhofer IBMT (Photo: Bernd Müller).
-
Economical Solutions for Long-Term Digital Preservation
3 use cases: Healthcare, Clinical Studies, Financial Services
4 innovations:
1. EVALUATE cost, value and risk
2. AUTOMATE the preservation lifecycle
3. PROTECT with content-aware data protection
4. SCALE using ICT innovations
http://www.ensure-fp7.eu
-
Typical Preservation Workflow
-
ENSURE OAIS*-based System Architecture
*OAIS: Reference Model for an Open Archival Information System, ISO 14721:2003
AIP, SIP, DIP: OAIS Information Packages of the archive
-
OAIS and the Role of Ontologies in Long-Term Digital Preservation (LTDP)
• OAIS Archival Information Package:
– Content Information: Data Object together with its Representation Information
– Preservation Description Information: Reference, Fixity, Provenance and Context Information (e.g. PREMIS)
– Packaging Information (e.g. XFDU)
• Our approach: modelling metadata through ontologies
• Focus on information discovery and retrieval
• Other potential usage: security/privacy, provenance management, preservation planning
• A strong need to address ontology ageing.
Hope: ontologies can be leveraged to trigger actions on the archive to ensure consistency and usability
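The three parts of an OAIS Archival Information Package listed on this slide can be pictured as a small data structure. The following is only a sketch under stated assumptions: the class and field names, the `build_aip` helper, the placeholder provenance and context strings, and the choice of SHA-256 for fixity are illustrative and are not taken from the ENSURE system.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class ContentInformation:
    data_object: bytes          # the bits being preserved
    representation_info: str    # how to interpret them, e.g. a format name


@dataclass
class PreservationDescriptionInformation:
    reference: str     # identifier of the content
    fixity: str        # SHA-256 hex digest (assumed algorithm)
    provenance: str    # origin and history of the content
    context: str       # relation to other content


@dataclass
class ArchivalInformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptionInformation
    packaging_info: str         # e.g. the name of the packaging format

    def fixity_ok(self) -> bool:
        """Recompute the digest and compare it with the stored fixity value."""
        digest = hashlib.sha256(self.content.data_object).hexdigest()
        return digest == self.pdi.fixity


def build_aip(data: bytes, fmt: str, ref: str) -> ArchivalInformationPackage:
    """Hypothetical helper that fills in the fixity value at ingest time."""
    pdi = PreservationDescriptionInformation(
        reference=ref,
        fixity=hashlib.sha256(data).hexdigest(),
        provenance="hypothetical source system",
        context="hypothetical patient record",
    )
    return ArchivalInformationPackage(ContentInformation(data, fmt), pdi, "XFDU")
```

A periodic call to `fixity_ok()` is one simple way an archive could detect silent corruption of the stored data object.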
-
Concept for an initial Long-Term Preservation Archive Ontology (LTPAO)
OAIS archive specific knowledge:
• Metadata format
• Preservation Description Information
• AIP related metadata
[Ontology diagram; recoverable labels listed below]
Classes: nie:InformationElement, nie:DataSource, nfo:DataContainer, nfo:Folder, nfo:Document, ltpao:PreservationInformationPackage, ltpao:ArchivalInformationPackage, ltpao:SubmissionInformationPackage, ltpao:DisseminationInformationPackage, ltpao:PreservationDescriptionInformation, ltpao:Reference, ltpao:Provenance, ltpao:Context, ltpao:Fixity, ltpao:AccessRights, ltpao:AIPRole, ltpao:AIPStandardRole, ltpao:AIPStorletRole, ltpao:AIPTransformationRole, ltpao:PackageManifest, ltpao:XFDUManifest, ltpao:PREMISManifest, ltpao:XIPManifest
Property pairs: ltpao:hasParent / ltdpao:isChildOf, ltpao:hasRole / ltdpao:isRoleOf, ltpao:hasContentDataObject / isContentDataObjectOf, ltpao:descriptedBy / ltdpao:descripes, ltpao:hasManifest / ltdpao:isManifestOf
Properties with datatype: ltpao:aipCopyId, ltpao:aipLogicalId, ltpao:aipVersionId, ltpao:aipName, ltpao:externalId (xsd:string); ltpao:dateLastFixityValidation (xsd:dateTime)
ENSURE Preservation Ontologies
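To make the idea of AIP metadata as ontology instances concrete, the sketch below describes one AIP with some of the LTPAO properties named in the diagram and serialises it as N-Triples. It is a minimal illustration only: the namespace URI, the AIP URI and all values are hypothetical, and the assumption that these properties attach directly to an ltpao:ArchivalInformationPackage instance is not confirmed by the slide.

```python
LTPAO = "http://example.org/ltpao#"   # hypothetical namespace URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"


def literal(value: str, datatype: str) -> str:
    """Format a typed literal, escaping backslashes, quotes and line breaks."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"^^<{XSD}{datatype}>'


def describe_aip(aip_uri, logical_id, version_id, name, last_fixity_check):
    """Return (subject, predicate, object) triples for one AIP instance."""
    s = f"<{aip_uri}>"
    return [
        (s, f"<{RDF_TYPE}>", f"<{LTPAO}ArchivalInformationPackage>"),
        (s, f"<{LTPAO}aipLogicalId>", literal(logical_id, "string")),
        (s, f"<{LTPAO}aipVersionId>", literal(version_id, "string")),
        (s, f"<{LTPAO}aipName>", literal(name, "string")),
        (s, f"<{LTPAO}dateLastFixityValidation>",
         literal(last_fixity_check, "dateTime")),
    ]


def to_ntriples(triples) -> str:
    """Serialise triples as N-Triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = describe_aip(
    "http://example.org/aip/1",        # hypothetical AIP URI
    "aip-0001", "1", "CT study",       # hypothetical values
    "2012-01-01T00:00:00Z",
)
print(to_ntriples(triples))
```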
-
Preservation Ontology Framework
[Architecture diagram; recoverable labels listed below]
Components:
• Preservation Ontology Framework Ingest
• Preservation Ontology Framework Ontologies Registry (Ontologies, Views, RDF Templates)
• Ontologies Manager (Maintenance, Versioning)
• RDF Storage
• Metadata store
• Full-text Index
• Preservation Runtime
• Information Selector
• Semantic Indexer: MIME-type identification, full-text and metadata extraction, RDF and PREMIS encoding
• Packager
• Access
• Semantic Search & Query Interface
• Data Retriever
• Preservation Digital Asset Lifecycle Management (PDALM)
• Content-Aware Data Protection
• Preservation-Aware Storage Services
Messages and requests: Ingest request, Access request, requestIngest, openSearchGUI, prepareIndex (files location), package (files location), commitIndex (files location, AIP Id), storeRDF (RDF/XML), triggerReindex/Transformation, startAccess (AIP IDs), prepareDIP (files location), dpFilter (AIP Map), getOntologies (Views), getRDFTemplates, synchronise, ontology update request, Query (SPARQL), Query Results (SPARQL Results XML)
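The diagram labels mention SPARQL queries and results returned as SPARQL Results XML. As a hedged illustration of the access side, the sketch below parses a result document in the W3C SPARQL Query Results XML Format into rows, for example to collect the AIP identifiers that a later access step would need. The query text, the `ltpao` namespace URI, the sample result and the function name are all hypothetical; no real endpoint is contacted.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical query, shown for context only; it is not executed here.
QUERY = """
PREFIX ltpao: <http://example.org/ltpao#>
SELECT ?aip ?name WHERE {
  ?aip a ltpao:ArchivalInformationPackage ;
       ltpao:aipName ?name .
}
"""

# Hypothetical result document of the kind an RDF store could return.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="aip"/><variable name="name"/></head>
  <results>
    <result>
      <binding name="aip"><uri>http://example.org/aip/1</uri></binding>
      <binding name="name"><literal>CT study</literal></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text: str) -> list[dict[str, str]]:
    """Turn a SPARQL Results XML document into a list of variable bindings."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]   # first child element: uri, literal or bnode
            row[binding.get("name")] = value.text or ""
        rows.append(row)
    return rows


rows = parse_results(SAMPLE)
aip_ids = [row["aip"] for row in rows]
```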
-
Semantic Search & Query Interface
-
Semantic Search & Query Interface (2)
-
ENSURE Preservation Ontologies
• Digital objects represented as instances of ontologies
• NEPOMUK Information Element ontologies used as upper-level ontologies
-
Concept for a DICOM Data Model Ontology
DICOM Data Model specific knowledge:
• Represents logical structure of DICOM elements and real world relations
• Concentration on attributes that are relevant for later searches
[Ontology diagram; recoverable labels listed below]
Classes: nie:InformationElement, nfo:DataContainer, nfo:Media, nfo:Visual, nfo:Image, ddmo:DICOM_IOD, ddmo:DICOM_DIR, ddmo:DICOM_IE, ddmo:Patient_IE, ddmo:Visit_IE, ddmo:Study_IE, ddmo:Serie_IE, ddmo:Equipment_IE, ddmo:Image_IE, ddmo:DICOM_Module, ddmo:Patient_Module, ddmo:Visit_Module
Properties: ddmo:includeIE, ddmo:belongsToIOD, ddmo:includeModule, ddmo:belongsToIE, ddmo:includePatientModule, ddmo:belongsToPatientIE, ddmo:includeVisitModule, ddmo:belongsToVisitIE, ddmo:belongsToFileset, ddmo:decriptesFileset, ddmo:isSubjectOf, ddmo:makesVisit, ddmo:includeStudy, ddmo:createsStudy, ddmo:containsSerie, ddmo:containsImage
Attributes: ddmo:attributePatientID, ddmo:attributePatientName (xsd:string); ddmo:attributeVisitDate (xsd:dateTime)
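The real-world relations captured by such an ontology can be followed to answer searches such as "all images of a given patient". The sketch below stores a few statements as plain tuples and walks from patient to study to series to image. It uses property names from the diagram, but their direction and the classes they connect are assumptions here (patient isSubjectOf study, study containsSerie series, series containsImage image), and all `ex:` instances and identifier values are hypothetical.

```python
# Hypothetical instance data; property directions are assumed, not confirmed.
TRIPLES = {
    ("ex:patient1", "rdf:type", "ddmo:Patient_IE"),
    ("ex:patient1", "ddmo:attributePatientID", "P-0001"),
    ("ex:patient1", "ddmo:isSubjectOf", "ex:study1"),
    ("ex:study1", "ddmo:containsSerie", "ex:serie1"),
    ("ex:serie1", "ddmo:containsImage", "ex:image1"),
    ("ex:serie1", "ddmo:containsImage", "ex:image2"),
    ("ex:patient2", "rdf:type", "ddmo:Patient_IE"),
    ("ex:patient2", "ddmo:attributePatientID", "P-0002"),
    ("ex:patient2", "ddmo:isSubjectOf", "ex:study2"),
    ("ex:study2", "ddmo:containsSerie", "ex:serie2"),
    ("ex:serie2", "ddmo:containsImage", "ex:image3"),
}


def objects(subject: str, predicate: str) -> list[str]:
    """All objects of statements with the given subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def images_for_patient_id(patient_id: str) -> list[str]:
    """Follow patient -> study -> series -> image for one patient ID."""
    patients = [s for s, p, o in TRIPLES
                if p == "ddmo:attributePatientID" and o == patient_id]
    images = []
    for patient in patients:
        for study in objects(patient, "ddmo:isSubjectOf"):
            for serie in objects(study, "ddmo:containsSerie"):
                images.extend(objects(serie, "ddmo:containsImage"))
    return sorted(images)
```

In an RDF store the same traversal would normally be written as a single SPARQL graph pattern; the nested loops only make the path explicit.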
-
DICOM Data Model Ontology (2)
• Integration of established terminologies/taxonomies or domain ontologies
• Example: ICD-10 codes (International Classification of Diseases) as subclasses of the attribute ‘Admitting_Diagnosis’
ddmo:Admitting_Diagnosis
icd:InfectiousAndParasiticDiseases
icd:Neoplasms
icd:MalignantNeoplasms
icd:InSituNeoplasms
icd:BenignNeoplasms
icd:NeoplasmsOfUncertainOrUnknownBehaviour
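Modelling ICD-10 groups as subclasses allows a search for a broad diagnosis to match documents annotated with a more specific one. The sketch below shows that idea with a simple parent lookup. The nesting of the four neoplasm groups under icd:Neoplasms follows the ICD-10 chapter structure but is an assumption about this ontology, and the document identifiers and function names are hypothetical.

```python
# child class -> parent class (assumed hierarchy, see lead-in)
SUBCLASS_OF = {
    "icd:InfectiousAndParasiticDiseases": "ddmo:Admitting_Diagnosis",
    "icd:Neoplasms": "ddmo:Admitting_Diagnosis",
    "icd:MalignantNeoplasms": "icd:Neoplasms",
    "icd:InSituNeoplasms": "icd:Neoplasms",
    "icd:BenignNeoplasms": "icd:Neoplasms",
    "icd:NeoplasmsOfUncertainOrUnknownBehaviour": "icd:Neoplasms",
}

# Hypothetical archived documents and their admitting diagnosis class.
DOCUMENTS = {
    "doc-1": "icd:MalignantNeoplasms",
    "doc-2": "icd:BenignNeoplasms",
    "doc-3": "icd:InfectiousAndParasiticDiseases",
}


def is_subclass_of(cls, ancestor) -> bool:
    """True if cls equals ancestor or reaches it by following parent links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def search(diagnosis_class: str) -> list[str]:
    """Documents whose diagnosis is the given class or any of its subclasses."""
    return sorted(doc for doc, cls in DOCUMENTS.items()
                  if is_subclass_of(cls, diagnosis_class))
```

With this data, a search for icd:Neoplasms also returns the documents annotated with the malignant and benign subclasses, which a plain keyword match on the class name would miss.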
-
Conclusions
• Long-term accessibility and usability of eHealth data is an issue in healthcare organisations.
• Ontologies and semantic web technologies can help to manage preservation metadata more effectively.
• They enable searching and accessing digital objects with evolving search profiles (ongoing research).
• They also have the potential to model preservation knowledge as a whole, including evolving privacy policies.
• The influence of ontology updates on the consistency of the archive must be carefully analysed.
-
Questions?
Contact:
Stephan Kiefer, Fraunhofer Institute for Biomedical Engineering, Dep. Telematics & Intelligent Health Systems, St. Ingbert, Germany, [email protected]
Eliot Salant (co-ordinator), IBM Haifa Research Labs, [email protected]
http://www.ensure-fp7.eu
The research leading to these results has received funding from the European Community's FP7 Research Framework Programme under grant agreement n° 270000.