Semantic Support for Complex Ecosystem Research Environments
Deborah McGuinness (1), Paulo Pinheiro (1), Henrique Santos (1, 2), Matthew Klawonn (1), Katherine Chastain (1)
(1) Rensselaer Polytechnic Institute, USA; (2) Universidade de Fortaleza, Brazil
AGU, December 2015
Outline
• Problem Statement
• Foundational Technologies
  – Long-standing semantic tools
  – Custom solutions
• Recent Developments
• Conclusions
• Future Directions
Problem Statement
• In large projects, how should data be:
  – Integrated with other relevant data and metadata?
  – Interpreted?
• And also
  – Accessed, shared, and visualized?
• Examples of data types in projects we work on:
  – Environmental monitoring
  – Architecture science and ecology
Foundational Technologies
• Ontologies: For capturing context
  – PROV-O
  – OBOE
  – VSTO
  – HASNetO
• Apache Solr: For storage and retrieval
• Contextualized CSVs: For data annotation
• D3.js (JavaScript): For metadata visualization
The Human-Aware Sensor Network Ontology
[Figure: HASNetO class diagram. Classes shown: vstoi:Detector, vstoi:DetachableDetector, vstoi:AttachedDetector, vstoi:Instrument, vstoi:Platform, vstoi:Deployment, hasneto:SensingPerspective, hasneto:DataCollection, oboe:Characteristic, oboe:Entity, oboe:Measurement, prov:Activity, prov:Agent. Properties shown: hasPerspectiveCharacteristic, perspectiveOf, hasDataCollection, hasneto:hasMeasurement, of-characteristic, wasAssociatedWith, startedAtTime and endedAtTime (both xsd:dateTime).]
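As a rough illustration of how a few of these classes relate, the sketch below models a deployment, its data collections, and their measurements as Python dataclasses. It is a simplification under stated assumptions: the field names, the string identifiers, and the cardinalities are illustrative choices, not the ontology's formal definitions, and the instrument and platform names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Measurement:
    """Stands in for oboe:Measurement: one value of one characteristic."""
    characteristic: str  # identifier of an oboe:Characteristic (assumed to be a string)
    value: float


@dataclass
class DataCollection:
    """Stands in for hasneto:DataCollection, treated as an activity with times."""
    started_at: datetime                 # corresponds to startedAtTime
    ended_at: Optional[datetime] = None  # corresponds to endedAtTime; None while running
    measurements: List[Measurement] = field(default_factory=list)


@dataclass
class Deployment:
    """Stands in for vstoi:Deployment; instrument and platform are plain labels here."""
    instrument: str
    platform: str
    data_collections: List[DataCollection] = field(default_factory=list)

    def all_measurements(self) -> List[Measurement]:
        """Flatten the measurements of every data collection in this deployment."""
        return [m for dc in self.data_collections for m in dc.measurements]


# Hypothetical usage: one data collection with two measurements.
dc = DataCollection(started_at=datetime(2015, 2, 12, 9, 30, tzinfo=timezone.utc))
dc.measurements.append(Measurement("EC-WindSpeed", 0.99))
dc.measurements.append(Measurement("EC-WindDirection", 217.9))
deployment = Deployment("weather-station", "tower", [dc])
print(len(deployment.all_measurements()))
```

Keeping the data collection as its own object, rather than attaching measurements directly to the deployment, mirrors the separate hasneto:DataCollection class in the diagram.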
HADatAc
• Human-Aware Data Acquisition Framework
• A web application based on Apache Solr and the Play Framework
• Goal: To provide a one-stop shop for combined data and metadata management, markup, integration, retrieval, and visualization
• Uses ontologies combined with limited human markup to achieve this goal
• Can be deployed on a laptop or server, depending on the user's needs
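Since retrieval is built on Apache Solr, a faceted query can be sketched as a plain HTTP request to Solr's select handler. The example below only builds the request URL using standard Solr parameters (`q`, `rows`, `wt`, `facet`, `facet.field`). The host, the core name `measurements`, and the field names `characteristic` and `unit` are assumptions for illustration, not HADatAc's actual schema.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_facet_query(base_url, core, text="*:*", facet_fields=(), rows=10):
    """Build a Solr select URL that requests facet counts for the given fields."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        # Solr accepts one facet.field parameter per field to facet on.
        params.extend(("facet.field", name) for name in facet_fields)
    return f"{base_url.rstrip('/')}/solr/{core}/select?{urlencode(params)}"


# Hypothetical core and field names.
url = build_facet_query(
    "http://localhost:8983",
    "measurements",
    facet_fields=["characteristic", "unit"],
)
print(url)
```

Sending this URL to a running Solr instance with a matching core would return the matching documents plus per-field value counts, which is what a faceted search interface displays as clickable filters.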
Combining Data and Metadata
[Figure: screenshots annotated with the callouts "Mouse over" (twice), "Metadata-based faceted search", "Measurement metadata", and "Metadata about the metadata".]
Data Privacy
• In addition to nice visualization, integration, and retrieval features, HADatAc has sophisticated privacy mechanisms
• Data can be opened at different access levels to anonymous and pre-registered users.
Data Privacy cont.
Ease of Use
== START-PREAMBLE ==
@base <http://localhost#> .
@prefix hasneto: <http://hadatac.org/ont/hasneto#> .
@prefix hadatac: <http://hadatac.org/ont/hadatac#> .
<example-kb> a hadatac:KnowledgeBase; hadatac:hasHost "http://localhost"^^xsd:anyURI .
<dataCollection-example01> a hasneto:DataCollection; prov:startedAtTime "2015-02-12T09:30:00Z"^^xsd:dateTime .
<deployment-example01> hasneto:hasDataCollection <dataCollection-example01> .
<example01-dataset01> a vstoi:Dataset; prov:wasGeneratedBy <dataCollection-example01>; hadatac:hasMeasurementType <mt0>,<mt1> .
<mt0> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 3; oboe:ofCharacteristic hadatac-entities:EC-WindDirection; oboe:usesStandard oboe-standards:Degree .
<mt1> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 2; oboe:ofCharacteristic hadatac-entities:EC-WindSpeed; oboe:usesStandard oboe-standards:MeterPerSecond .
<ts0> hadatac:atColumn 0 .
== END-PREAMBLE ==
TimeStamp,Record,WindSpdAve_ms,WindDir,WindSpd_ms_Min,WindSpdGust_ms_Max,AirTemp_C_Avg,RH_Pct_Avg,BaroPress_hPa_Avg,Rain_mm_Tot,Hail_Hits_Tot
2015-02-12T09:30:00Z,0,0.99,217.9,0.3,1.7,-4.5,66.58,995,0,0
2015-02-12T09:45:00Z,1,1.112,227.8,0.1,2.1,-4.372,66.45,995,0,0
2015-02-12T10:00:00Z,2,1.169,222.2,0.3,2.6,-4.146,65.98,995,0,0
• Work with CSV files
• Automate data transfer across the web, including large amounts of data
• Retrieval (e.g., faceted search) and visualization tools are automatically usable with uploaded data
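To make the contextualized CSV format concrete, the sketch below splits such a file into its preamble and its data rows, then uses the `hadatac:atColumn` annotations to attach a characteristic and unit to each value. It is a minimal illustration, not HADatAc's parser: the regular expression stands in for a real Turtle parser and only handles lines shaped like the ones on the slide, the sample is trimmed to four columns, and the function names are hypothetical.

```python
import csv
import io
import re

START, END = "== START-PREAMBLE ==", "== END-PREAMBLE =="

# Trimmed copy of the slide's example (only the first four CSV columns).
SAMPLE = """== START-PREAMBLE ==
<mt0> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 3; oboe:ofCharacteristic hadatac-entities:EC-WindDirection; oboe:usesStandard oboe-standards:Degree .
<mt1> a hadatac:MeasurementType; time:inDateTime <ts0>; hadatac:atColumn 2; oboe:ofCharacteristic hadatac-entities:EC-WindSpeed; oboe:usesStandard oboe-standards:MeterPerSecond .
<ts0> hadatac:atColumn 0 .
== END-PREAMBLE ==
TimeStamp,Record,WindSpdAve_ms,WindDir
2015-02-12T09:30:00Z,0,0.99,217.9
2015-02-12T09:45:00Z,1,1.112,227.8
"""

# Naive pattern: column index, then characteristic, then unit, in that order.
MT_PATTERN = re.compile(
    r"hadatac:atColumn\s+(\d+);\s*oboe:ofCharacteristic\s+([\w:-]+);"
    r"\s*oboe:usesStandard\s+([\w:-]+)"
)


def split_file(text):
    """Return (preamble, csv_body) using the START and END marker lines."""
    head, _, rest = text.partition(END)
    preamble = head.split(START, 1)[1]
    return preamble.strip(), rest.lstrip("\n")


def column_annotations(preamble):
    """Map a column index to its (characteristic, unit) pair."""
    return {int(col): (char, unit) for col, char, unit in MT_PATTERN.findall(preamble)}


def annotated_rows(text):
    """Yield one dict per CSV row: characteristic -> (value, unit)."""
    preamble, body = split_file(text)
    columns = column_annotations(preamble)
    reader = csv.reader(io.StringIO(body))
    next(reader)  # skip the CSV header row
    for row in reader:
        yield {char: (float(row[col]), unit) for col, (char, unit) in columns.items()}


rows = list(annotated_rows(SAMPLE))
print(rows[0])
```

Because the annotations refer to columns by position, the CSV body itself stays an ordinary file that existing tools can still read once the preamble is stripped.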
Conclusions
• Various ontologies were presented with the intent to show how they capture context in big data projects
• HADatAc was introduced, along with some of its key functionalities.
HADatAc is a cross-platform web service which integrates annotated data sets with other relevant data and metadata, and surrounds them with retrieval (faceted search) and visualization tools as well as privacy controls.
Future Steps
• Refine the HASNetO vocabulary and test it against a constantly growing HASNetO-based knowledge base.
• Continue to add functionality to HADatAc
  – More visualization tools
  – Enhanced search capabilities
  – Integration with lab information management systems (potentially for use with sciences other than medicine)
More Information
• Contact Information
  – Deborah McGuinness: [email protected]
  – Paulo Pinheiro: [email protected]
  – Matt Klawonn: [email protected]