Metadata Schema for CERIF-2000
Andrei Lopatenko
Vienna University of Technology
http://derpi.tuwien.ac.at/~andrei
What we have now
An SGML DTD that describes CERIF data (old version of CERIF)
SGML is used for data exchange between national institutions and ERGO
The SGML DTD covers only the old version of CERIF (projects)
Strictly defined structure and semantics of elements
What we need
Metadata format to describe the CERIF-2000 data (with new entities, attributes)
Because data descriptions differ between countries and institutions, it should be possible to extend the schema and to express the meaning of the new elements
Possible solution
Semantic Web: RDF (Resource Description Framework) to encode data
DAML + OIL (DARPA Agent Markup Language + Ontology Inference Layer) to express the semantics of classes and attributes
Advantages
A direct path to a Knowledge Management solution
A possible way to solve the problems of differing vocabularies and classifications. Ready to work in a heterogeneous distributed environment
Easy to implement compared with KIF/KQML or Description Logic solutions
Advantages
XML experience can be used to develop Semantic Web (SW) solutions
XML compatibility keeps the solution close to industry solutions
The semantic richness of SW makes it possible to develop advanced information retrieval over SW-encoded data
Already developed tools can be applied
Disadvantages
XML experience is not enough. Developers need to be trained in SW technologies
Not as powerful as complete Description Logic solutions
Not as efficient on huge volumes of data as traditional database technologies (replication)
DAML + OIL
Allows hierarchical relations between classes of data to be described
Allows classes of data to be specified (creating a vocabulary!) using slot restrictions
Example: a “Workshop” is an “Event”
An “EU project” is a “Project” whose “funding organization” attribute has as its value an object of class “European Funding Organization”
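The two examples on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not DAML + OIL itself: the dictionaries, function names, and the instance identifiers `funder_1` and `funder_2` are hypothetical, and the restriction is used here to check an instance, whereas a DAML + OIL reasoner could also use it to classify one.

```python
# Hypothetical in-memory model; class and slot names follow the slide's example.
SUBCLASS_OF = {
    "Workshop": "Event",
    "EU project": "Project",
}

# Slot restrictions: class -> {slot name: class required for the slot value}.
RESTRICTIONS = {
    "EU project": {"funding organization": "European Funding Organization"},
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def satisfies(cls, slots, types):
    """Check that every restricted slot of cls holds a value of the required class.

    slots maps slot name -> value identifier; types maps identifier -> class.
    """
    for slot, required in RESTRICTIONS.get(cls, {}).items():
        value = slots.get(slot)
        if value is None or not is_a(types.get(value), required):
            return False
    return True


# Assumed sample instances, for illustration only.
types = {
    "funder_1": "European Funding Organization",
    "funder_2": "National Funding Organization",
}
print(is_a("Workshop", "Event"))
print(satisfies("EU project", {"funding organization": "funder_1"}, types))
print(satisfies("EU project", {"funding organization": "funder_2"}, types))
```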
DAML + OIL
Distributed ontologies
Example: my (AURIS-MM) project is a subClassOf CERIF:Project
Tools for ontology checking (Description Logics, CLOS-based theory for DAML)
Tools for ontology development
Tools for ontology visualization
DAML + OIL
Advanced information retrieval solutions
Implemented and tested
Projects: EU projects (On-To-Knowledge, KA3:IAF), DARPA project CAKE, WebScript, DAML Services, Knowledge Creation tools for DAML, ASCS, etc.
See www.cordis.lu, www.darpa.mil, www.daml.org, derpi.tuwien.ac.at/~andrei/DAML.htm
DAML + OIL
Developed the first version of the ontology: http://derpi.tuwien.ac.at/~andrei/cerif-rdf-dc-mn.daml
Mapping (as subclass relations and axioms) to other well-known schemas (Dublin Core and MathNet)
Tested for simple information retrieval operations (but including semantic information)
DAML + OIL example of schema
<daml:Class rdf:ID="http://derpi.tuwien.ac.at/~andrei/cerif-rdf-dc-mn.daml#CERIF.Workshop">
  <rdfs:label>CERIF.Workshop</rdfs:label>
  <rdfs:comment />
  <oiled:creationDate>16:19:57 07.08.2001</oiled:creationDate>
  <rdfs:subClassOf>
    <daml:Class rdf:about="http://derpi.tuwien.ac.at/~andrei/cerif-rdf-dc-mn.daml#CERIF.Event" />
  </rdfs:subClassOf>
</daml:Class>
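To show how such a class definition can be read by a program, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The slide omits the namespace declarations, so the enclosing `rdf:RDF` element and the namespace URIs below are assumptions added to make the fragment parseable; the `oiled:creationDate` and `rdfs:comment` elements are left out for brevity, and `subclass_pairs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumed; the slide does not show the declarations.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "daml": "http://www.daml.org/2001/03/daml+oil#",
}
BASE = "http://derpi.tuwien.ac.at/~andrei/cerif-rdf-dc-mn.daml#"

# The class definition from the slide, wrapped in an assumed rdf:RDF root.
DOC = f"""<rdf:RDF xmlns:rdf="{NS['rdf']}" xmlns:rdfs="{NS['rdfs']}" xmlns:daml="{NS['daml']}">
  <daml:Class rdf:ID="{BASE}CERIF.Workshop">
    <rdfs:label>CERIF.Workshop</rdfs:label>
    <rdfs:subClassOf>
      <daml:Class rdf:about="{BASE}CERIF.Event" />
    </rdfs:subClassOf>
  </daml:Class>
</rdf:RDF>"""


def subclass_pairs(xml_text):
    """Return (class, superclass) identifier pairs found in an RDF/XML document."""
    root = ET.fromstring(xml_text)
    rdf = "{" + NS["rdf"] + "}"
    pairs = []
    for cls in root.findall("daml:Class", NS):
        name = cls.get(rdf + "ID") or cls.get(rdf + "about")
        for parent in cls.findall("rdfs:subClassOf/daml:Class", NS):
            pairs.append((name, parent.get(rdf + "about")))
    return pairs


for child, parent in subclass_pairs(DOC):
    print(child, "subClassOf", parent)
```

A plain XML parser like this only reads the syntax; it does not resolve identifiers or draw inferences the way an RDF or DAML + OIL tool would.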
DAML + OIL
Easy creation of custom vocabularies based on shared vocabularies
Easy specification of which classes a given object is an instance of (multiple classes are possible)
DAML + OIL
Example: publications database
Classes for researchers: Dissertation, Conference article, Journal article, Journal with evaluations, Patent
Classes for university administration:
Class A (score 2): International Patent
Class B (score 1): Journal Article in an International journal which is a Journal with Evaluation
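The publications example, where one object belongs to a researcher-facing class and an administrative class at the same time, might look like this as a rough Python sketch. The record fields (`type`, `international`, `journal_international`, `journal_evaluated`) are hypothetical names chosen for illustration; in DAML + OIL the administrative classes would be defined declaratively rather than with `if` statements.

```python
# Scores per administrative class, as given on the slide.
SCORES = {"Class A": 2, "Class B": 1}


def classes_of(pub):
    """Return all classes a publication record belongs to.

    The first entry is the researcher-facing class; an administrative
    class is appended when one of the two rules from the slide applies.
    """
    classes = [pub["type"]]
    if pub["type"] == "Patent" and pub.get("international"):
        classes.append("Class A")
    elif (pub["type"] == "Journal article"
          and pub.get("journal_international")
          and pub.get("journal_evaluated")):
        classes.append("Class B")
    return classes


def score(pub):
    """Sum the scores of the administrative classes of a publication."""
    return sum(SCORES.get(c, 0) for c in classes_of(pub))


# Assumed sample records, for illustration only.
patent = {"type": "Patent", "international": True}
article = {"type": "Journal article",
           "journal_international": True,
           "journal_evaluated": True}
print(classes_of(patent), score(patent))
print(classes_of(article), score(article))
```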
DAML + OIL
A hierarchy of slots was created, which makes information retrieval clearer and easier to implement
Example: full-text search operations based on the “full-text description” slot (attribute)
project_abstract, project_title, project_description are subslots of “full-text description”
If a new slot “project_last_year_summary” is added, specifying it as a subslot of “full-text description” is enough to include it in full-text search
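A minimal sketch of the subslot idea, assuming slots are stored as plain strings in a dictionary: the search function never lists concrete slot names, so registering a new subslot is the only change needed to include it. The helper names and sample records are hypothetical.

```python
# Slot hierarchy from the slide: slot -> parent slot.
SUBSLOT_OF = {
    "project_abstract": "full-text description",
    "project_title": "full-text description",
    "project_description": "full-text description",
}


def is_subslot(slot, ancestor):
    """Return True if slot equals ancestor or is a (transitive) subslot of it."""
    while slot is not None:
        if slot == ancestor:
            return True
        slot = SUBSLOT_OF.get(slot)
    return False


def full_text_search(records, term):
    """Return ids of records where a subslot of 'full-text description' contains term."""
    term = term.lower()
    hits = []
    for rec_id, slots in records.items():
        if any(is_subslot(name, "full-text description") and term in value.lower()
               for name, value in slots.items()):
            hits.append(rec_id)
    return hits


# Assumed sample data, for illustration only.
records = {
    "p1": {"project_title": "Semantic Web for CERIF"},
    "p2": {"project_last_year_summary": "Ontology work"},
}
before = full_text_search(records, "ontology")

# Registering the new slot is the only change required.
SUBSLOT_OF["project_last_year_summary"] = "full-text description"
after = full_text_search(records, "ontology")
print(before, after)
```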
DAML + OIL
Example of class hierarchy from extended CERIF [figure]
RDF
DAML + OIL specifies the schema. It is also possible to encode data (“instances”) in DAML
For EuroCRIS we propose using RDF as the encoding format
The RDF description should be consistent with the DAML + OIL schema
RDF
Developed a toolset to export/import data: CERIF database <-> CERIF RDF
Toolset to query CERIF RDF data (for now only very simple information retrieval operations, but distributed and with semantics)
A toolset to get data from CERIF RDF and put it into a Prolog knowledge base is being developed
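As an illustration of what moving RDF data into a Prolog knowledge base can involve, the sketch below turns (subject, predicate, object) triples into Prolog facts. It is not the toolset mentioned on the slide: the `rdf/3` predicate name, the helper names, and the sample triple's property name are assumptions.

```python
def prolog_atom(text):
    """Quote a string as a Prolog atom, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_prolog(triples):
    """Render (subject, predicate, object) triples as rdf/3 facts, one per line."""
    return "\n".join(
        "rdf({}, {}, {}).".format(*(prolog_atom(part) for part in triple))
        for triple in triples
    )


# Assumed sample triple; the property name is hypothetical.
triples = [
    ("CERIF.Workshop", "subClassOf", "CERIF.Event"),
]
print(to_prolog(triples))
```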
Current work
RDF version of CERIF-2000
Knowledge Management solution for research, with RDF as the data store
New advanced information retrieval possibilities for CERIF
Proposal
For testing, try using DAML + OIL and RDF for data sharing and distributed retrieval operations between different EuroCRIS organizations
Create and deploy an advanced IR solution based on CERIF RDF and compatible with any CERIF database. Make it free and a part of the CERIF implementation