Page 1
Drexel University, College of Engineering
ACHIEVING SEMANTIC INTEROPERABILITY WITH
HYDROLOGIC ONTOLOGIES FOR THE WEB
6th International Conference on HydroScience and Engineering
Michael Piasecki, Luis Bermudez
Page 2
Content
• Overview of metadata
• Metadata interoperability problems
• A possible solution: Hydrologic Ontologies for the Web
Page 3
Metadata
• Answers the what, when, where, how, who and why of the described data.
• Helps to discover, access, evaluate and use data.
Example: Creator: USGS; Keyword: Gage Height
Page 4
Hydrologic Information Communities (HIC) need a metadata agreement
What descriptors can be used?
• Keyword or topic
• Author or Creator
Which possible values?
• Gage height or water elevation
Is there any metadata agreement available to describe hydrologic data?
Page 5
Metadata Specifications related to Hydrology
• ISO-19115:2003
• FGDC-STD-001-1998
• Ecological Metadata Language
• Geography Markup Language
• USGS Hydrologic Markup Language
• Earth Science Markup Language
• Dublin Core Metadata Initiative
Page 6
What is the problem with these?
Problem 1: Metadata specifications lack domain-specific elements
For example:
• They do not say whether area and outlet location should be defined when a watershed is being described.
• They do not incorporate a list of possible stations and variables related to surface water collected by a particular HIC.
Page 7
HIC A creates an HTML form to collect metadata
[Diagram: ISO 19115 elements (EX_GeographicIdentifier > geographicIdentifier > MD_Identifier > code …; Descriptive Keywords > MD_Keywords > keyword; Citation …) filled in through HIC A's form with free-text entries such as "#24", "#23", "W #34", "#56", "Water elev." and "Stage height". Result: not consistent.]
Page 8
Need to incorporate domain vocabulary to get consistent metadata
[Diagram: the same ISO 19115 elements (EX_GeographicIdentifier > geographicIdentifier > MD_Identifier > code …; Descriptive Keywords > MD_Keywords > keyword; Citation …) filled in from vocabulary lists: "#23", "#34", "#56" and "discharge", "stage height". Result: consistent.]
Page 9
Problem 2: Metadata standards do not solve semantic heterogeneities
Metadata repository, search for: Stage Height
• Metadata (ISO) about dataset X: keyword = Stage Height, thesaurusName = GCMD
• Metadata (FGDC) about dataset Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
Result: finds only data set X and not data set Y.
Page 10
Possible solutions to our problems
How to incorporate domain vocabulary in metadata specifications?
• Create a new metadata specification.
• Rewrite a previous one and extend it.
• Hardcode semantics into the application.
• Dynamic extension with ontologies.
Page 11
Extending Metadata Specifications to meet specific needs of a HIC
• Express metadata specifications and vocabularies in ontologies.
• Use the knowledge inference capabilities of ontologies to link the metadata elements with selected vocabulary terms.
Page 12
Ontologies: specification of conceptualizations
Example:
1. Properties of real-world objects are identified.
2. Similarities are identified.
3. Concepts are created
4. and are expressed as a class.
5. Classes are related.
[Diagram: River and Lake with the properties "has water", "is inland body" and "has a defined channel"; class Body of Water with subclasses River and Lake.]
Page 13
Web Ontology Language: OWL
W3C Recommendation since 02/2004
[Diagram: Body of Water with subclasses River and Lake, expressed in XML]
<owl:Class rdf:ID="Body_of_Water"/>
<owl:Class rdf:ID="River"> <rdfs:subClassOf rdf:resource="#Body_of_Water"/> </owl:Class>
<owl:Class rdf:ID="Lake"> <rdfs:subClassOf rdf:resource="#Body_of_Water"/> </owl:Class>
Page 14
Metadata specs expressed in ontologies
UML:
MD_Metadata
+ fileIdentifier[0..1] : CharacterString
+ language[0..1] : CharacterString
…
+ identificationInfo 1..* (to MD_Identification)
MD_Identification
…
+ abstract : CharacterString
…
XML: classes, datatype properties, object properties
<owl:Class> … </owl:Class>
<owl:Class> … </owl:Class>
Page 15
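Class: Hydrologic Unit
Subclasses: Region, Subregion, Accounting Unit, Cataloging Unit
[Diagram: Schuylkill (Cataloging Unit) is part of Lower Delaware (Accounting Unit), which is part of Delaware (Subregion), which is part of Mid Atlantic (Region).]
isPartOf is transitive (expressed in XML), so further isPartOf relations can be inferred, e.g. Schuylkill is part of Mid Atlantic.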
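The transitive inference on this slide can be sketched in a few lines of Python. This is only an illustration, not the reasoner used by the authors: the dictionary `IS_PART_OF` and the function `infer_is_part_of` are hypothetical names, and the facts are the four hydrologic units named on the slide.

```python
# Direct isPartOf facts from the slide, stored as child -> parent.
# (Hypothetical data structure; an OWL reasoner would read these from the ontology.)
IS_PART_OF = {
    "Schuylkill": "Lower Delaware",
    "Lower Delaware": "Delaware",
    "Delaware": "Mid Atlantic",
}


def infer_is_part_of(unit, facts=IS_PART_OF):
    """Return every unit that `unit` is part of, directly or by transitivity."""
    ancestors = []
    seen = {unit}  # guards against accidental cycles in the facts
    current = unit
    while current in facts and facts[current] not in seen:
        current = facts[current]
        seen.add(current)
        ancestors.append(current)
    return ancestors


print(infer_is_part_of("Schuylkill"))
```

Only the three direct facts are stored; the relation between Schuylkill and Mid Atlantic is derived, which is the point of declaring the property transitive.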
Page 16
More about knowledge inference
[Diagram: stations A and B lie in W; station C lies in Y.]
<Station rdf:ID="A"> <isPartOf rdf:resource="#W"/> </Station>
<Station rdf:ID="B"> <isPartOf rdf:resource="#W"/> </Station>
<Station rdf:ID="C"> <isPartOf rdf:resource="#Y"/> </Station>
<owl:Class rdf:ID="W-Station"> <!-- type of station that has property isPartOf = W -->
  <owl:equivalentClass><owl:Restriction>
    <owl:onProperty rdf:resource="#isPartOf"/> <owl:hasValue rdf:resource="#W"/>
  </owl:Restriction></owl:equivalentClass>
</owl:Class>
How to infer the stations that are only in W?
A program can infer: W-Stations = A, B
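As a rough sketch of what such a program does, the class W-Station can be treated as a membership test over the isPartOf facts. The names `STATION_IS_PART_OF` and `members_of` are assumptions made for this example; a real system would delegate this step to an OWL reasoner.

```python
# isPartOf facts for the three stations on the slide (station -> area).
STATION_IS_PART_OF = {"A": "W", "B": "W", "C": "Y"}


def members_of(area, facts=STATION_IS_PART_OF):
    """Return the stations whose isPartOf value equals `area`, sorted by ID."""
    return sorted(station for station, part_of in facts.items() if part_of == area)


print(members_of("W"))  # stations classified as W-Station
```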
Page 17
Dynamic extension with ontologies
e.g. Restrict the descriptor code to only have W-station values
Metadata specifications (XML): MD_Identifier (+ code : CharacterString …), extended by MD_Identifier_Extension (+ code : CharacterString …)
Domain vocabularies (XML): W-station (isPartOf = W); stations A and B in W, station C in Y
Restriction: onProperty: code, allValuesFrom: W-station
A program could infer: code = A, B
Dynamic HTML form using the extension
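A minimal sketch of how a dynamic form could use the restriction, assuming the allowed values for `code` have already been inferred from the vocabulary. The functions `allowed_codes`, `render_code_field` and `is_valid_code` are hypothetical and only illustrate the idea of generating the form from the ontology instead of hard-coding the list.

```python
from html import escape

# Station vocabulary from the slide (station -> area it is part of).
STATION_IS_PART_OF = {"A": "W", "B": "W", "C": "Y"}


def allowed_codes(area):
    """Values permitted for the `code` descriptor: stations that are part of `area`."""
    return sorted(s for s, part_of in STATION_IS_PART_OF.items() if part_of == area)


def render_code_field(area):
    """Build an HTML <select> that offers only the permitted station codes."""
    options = "".join(
        f'<option value="{escape(code)}">{escape(code)}</option>'
        for code in allowed_codes(area)
    )
    return f'<select name="code">{options}</select>'


def is_valid_code(code, area):
    """Check a submitted value against the same restriction."""
    return code in allowed_codes(area)


print(render_code_field("W"))
```

If a station is later added to W in the vocabulary, the form and the validation change with it and no application code has to be edited.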
Page 18
Ontologies provide means to resolve semantic heterogeneities
Page 19
Use of ontologies to map metadata specifications
<owl:Class rdf:about="&iso;MD_Keywords">
  <owl:equivalentClass rdf:resource="&fgdc;Keywords"/>
</owl:Class>
<owl:DatatypeProperty rdf:about="&iso;title">
  <owl:equivalentProperty rdf:resource="&fgdc;title"/>
</owl:DatatypeProperty>
Page 20
Use of ontologies to solve semantic heterogeneities among different domain vocabularies
<gcmd:Variable rdf:about="&gcmd;Stage_Height">
  <owl:sameAs rdf:resource="&noaa;stage"/>
  <owl:sameAs rdf:resource="&usgs;gage_Height"/>
  <owl:differentFrom rdf:resource="&events;Stage_Height"/>
</gcmd:Variable>
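Because owl:sameAs is symmetric and transitive, the statements above place the GCMD, NOAA and USGS terms in one group, while the events term stays separate. The sketch below illustrates that grouping with a small union-find; the prefixed strings stand in for the full URIs, and `build_groups` and `same_concept` are names invented for this example.

```python
# sameAs and differentFrom statements from the slide, with prefixes in place of URIs.
SAME_AS = [
    ("gcmd:Stage_Height", "noaa:stage"),
    ("gcmd:Stage_Height", "usgs:gage_Height"),
]
DIFFERENT_FROM = [("gcmd:Stage_Height", "events:Stage_Height")]


def build_groups(pairs):
    """Merge terms linked by sameAs into groups (symmetric and transitive)."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for left, right in pairs:
        parent[find(left)] = find(right)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())


def same_concept(a, b, groups):
    """True if both terms are identical or fall into one sameAs group."""
    return a == b or any(a in group and b in group for group in groups)


GROUPS = build_groups(SAME_AS)
print(same_concept("noaa:stage", "usgs:gage_Height", GROUPS))
```

The NOAA and USGS terms were never linked directly; they end up equivalent only through the GCMD term.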
Page 21
Semantic Interoperability
Metadata repository, e.g. search for Stage Height
• Metadata (ISO) about dataset X: keyword = Stage Height, thesaurusName = GCMD
• Metadata (FGDC) about dataset Y: Theme_Keyword = Gage Height, Theme_Keyword_Thesaurus = USGS
Mappers:
• Hydrologic vocabulary mapper: USGS, GCMD
• Metadata specifications mapper: FGDC, ISO
Result: finds data sets X and Y.
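The effect of the two mappers can be illustrated with a small Python sketch. The lookup tables `ELEMENT_MAP` and `TERM_MAP` are hand-written stand-ins for what the ontologies would provide, and the record layout is an assumption made for this example.

```python
# Specification mapper: which element of each standard holds the keyword.
ELEMENT_MAP = {("ISO", "keyword"): "keyword", ("FGDC", "Theme_Keyword"): "keyword"}
# Vocabulary mapper: terms from different thesauri that denote the same concept.
TERM_MAP = {"stage height": "stage_height", "gage height": "stage_height"}

RECORDS = [
    {"dataset": "X", "spec": "ISO",
     "fields": {"keyword": "Stage Height", "thesaurusName": "GCMD"}},
    {"dataset": "Y", "spec": "FGDC",
     "fields": {"Theme_Keyword": "Gage Height", "Theme_Keyword_Thesaurus": "USGS"}},
]


def canonical(term):
    """Map a term to its shared concept; unknown terms map to themselves."""
    key = term.strip().lower()
    return TERM_MAP.get(key, key)


def literal_search(query):
    """Plain string matching, as in a repository without mappers."""
    return [r["dataset"] for r in RECORDS if query in r["fields"].values()]


def semantic_search(query):
    """Match on the mapped element and the mapped concept."""
    wanted = canonical(query)
    hits = []
    for record in RECORDS:
        for element, value in record["fields"].items():
            is_keyword = ELEMENT_MAP.get((record["spec"], element)) == "keyword"
            if is_keyword and canonical(value) == wanted:
                hits.append(record["dataset"])
                break
    return hits


print(literal_search("Stage Height"), semantic_search("Stage Height"))
```

The literal search returns only data set X, while the mapped search returns X and Y.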
Page 22
Why is XML Schema not good enough?
Page 23
E.g. defining that a watershed has only one outlet location and only one unique identifier
…
<xsd:element name="watershed" type="watershedType"/>
<xsd:complexType name="watershedType">
  <xsd:sequence>
    <xsd:element name="outletLoc" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
    <xsd:element name="id" type="xsd:nonNegativeInteger" minOccurs="1" maxOccurs="1"/>
  </xsd:sequence>
</xsd:complexType>
XML Schema cannot express semantics.
Page 24
XML Schema cannot express semantics
Valid XML document:
… <watershed> <outletLoc>567</outletLoc> <name>X</name> <id>101</id> </watershed> …
Valid XML document:
… <watershed> <outletLoc>838</outletLoc> <name>X</name> <id>101</id> </watershed> …
Semantically they are not correct: both describe watershed X with id 101, but 567 <> 838.
XML Schema is good for validating the structure of a document, but not the semantics.
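The kind of check that structural validation misses can be sketched as follows. Both fragments are parsed without complaint, and only a rule that compares documents notices the contradiction. The function `find_conflicts` is a hypothetical helper; an ontology-based system would express the rule declaratively instead of in code.

```python
import xml.etree.ElementTree as ET

# The two watershed fragments from the slide.
DOC_1 = "<watershed><outletLoc>567</outletLoc><name>X</name><id>101</id></watershed>"
DOC_2 = "<watershed><outletLoc>838</outletLoc><name>X</name><id>101</id></watershed>"


def find_conflicts(documents):
    """Return watershed ids that are reported with more than one outlet location."""
    outlets_by_id = {}
    for text in documents:
        watershed = ET.fromstring(text)  # each fragment parses as well-formed XML
        watershed_id = watershed.findtext("id")
        outlets_by_id.setdefault(watershed_id, set()).add(watershed.findtext("outletLoc"))
    return {wid: sorted(outlets) for wid, outlets in outlets_by_id.items() if len(outlets) > 1}


print(find_conflicts([DOC_1, DOC_2]))
```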
Page 25
Hydrologic Ontologies will help to:
• Extend standards
• Solve semantic heterogeneities
• Interoperate between systems
  • e.g. Find a numerical model and data to compute runoff for a specific location with a specific resolution.
• System engineering benefits
  • Efforts are not duplicated because the conceptual models can be reused and shared.
  • Semantics do not need to be hard-coded in computer programs.
Page 26
Acknowledgements
• Drexel Team (Luis Bermudez, Saiful Islam, Bora Beran)
• Stephane Fellah (Member ISO TC 211 Canada team) will submit 19115 in OWL to ISO as a draft document
• NOPP NAG 13 0040 (Web based dissemination portal)
• NSF GEO Directorate grant from EAR division to create Hydrologic Metadata for CUAHSI, prototype Hydrologic Information System (HIS), in the Neuse River Basin
• Discussion lists: Protégé, Jena, W3C