Weather Station Data Publication at Irstea: An Implementation Report
DESCRIPTION
MIA network meeting, 14 October 2014, Montpellier.
TRANSCRIPT
www.irstea.fr
To better assert its missions, Cemagref becomes Irstea
Catherine ROUSSEY, Stephan BERNARD, Géraldine ANDRE,
Oscar CORCHO, Gil DE SOUSA, Daniel BOFFETY,
Jean-Pierre CHANET
October 13th 2014
Weather Station Data Publication at Irstea: An Implementation Report
Thanks to
Jean Paul CALBIMONT,
W3C SSN Working Group and SSN reviewers
2
Outline
• Irstea needs
• a data provider
• From open data to linked open data
• State of the art about meteorological dataset publication
• Dataset
• Weather dataset from the Montoldre weather station
• CSV files
• Model the data, use standard vocabularies
• Semantic Sensor Network (SSN) ontology
• Networks of ontologies around SSN: SSN + GeoSPARQL + locn, SSN + AWS + Climate and Forecast, SSN + QU + Time
• Convert data to linked data representation
• Conclusion and Perspectives
3
Irstea: an environmental data provider
Irstea uses and provides several datasets.
Its teams belong to several environmental observatories.
• Database on avalanches
• BDOH, a database on hydrology: https://bdoh.irstea.fr/
• Data on soil pollution
Scientific data may be used by other public and research institutes.
Scientific data → open data (non-proprietary format) → linked open data (linked RDF)
4
What is Open Data?
Open data is data that can be freely used, reused and redistributed by
anyone - subject only, at most, to the requirement to attribute and
sharealike.
• Availability and Access: the data must be available as a whole and
at no more than a reasonable reproduction cost, preferably by
downloading over the internet. The data must also be available in a
convenient and modifiable form.
• Reuse and Redistribution: the data must be provided under terms
that permit reuse and redistribution including the intermixing with other
datasets.
• Universal Participation: everyone must be able to use, reuse and
redistribute - there should be no discrimination against fields of
endeavour or against persons or groups.
source: Open Data Handbook,
http://opendatahandbook.org/en/what-is-open-data/
5
What is 5 star Open Data?
source: Tim Berners-Lee, http://5stardata.info/
6
How to build 5 star Open Data
1. Prepare Stakeholders
2. Select a dataset
3. Model the data.
4. Specify an appropriate open data license
5. Create good URIs for Linked Data
6. Use standard vocabularies
7. Convert data to a Linked Data representation
8. Provide machine access to data
9. Announce the new data sets on an authoritative domain
10. Recognize the social contract
Hyland, B., Atemezing, G., & Villazón-Terrazas, B. (2014). Best Practices for Publishing Linked Data. W3C Working Group Note. http://www.w3.org/TR/ld-bp/
7
Linked Open Data cloud
An extension of the current Web, where data are given well-defined and explicitly represented meaning, so that they can be shared and used by humans and machines, better enabling them to work in cooperation.
And clear principles on how to publish data.
8
State of the Art SSN FOR PUBLISHING METEOROLOGICAL DATA
Vocabularies used alongside SSN (feature of interest, spatial, time):
• AEMET (Agencia Estatal de Meteorología): AEMET, WGS84, GeoBuddies, W3C Time
• Swiss Experiment project: SWEET, WGS84, QUDT
• ACORN-SAT (Australian Bureau of Meteorology): WGS84, UK Intervals, DUL, Data Cube
• SMEAR (Finnish Station for Measuring Ecosystem Atmosphere Relations): SWEET, GeoNames, WGS84, DUL, Data Cube, Situation Theory
9
Irstea Weather Station MONTOLDRE
Montoldre, in the centre of France
Vantage Pro 2 station from Davis Instruments
Sensors and the properties they observe:
• temperature sensor: outdoor temperature
• atmospheric pressure sensor: external pressure
• air humidity sensor: outdoor relative humidity
• weathervane: wind direction
• anemometer: wind speed
• rain gauge: precipitation quantity + precipitation rate
• solar radiation sensor: solar radiation
Measurements from 2010 to 2013, one every 30 minutes
Conversion of CSV files
10
Irstea Weather Station
11
Semantic Sensor Network Ontology
12
Network of Ontologies
Semantic Sensor Network: the backbone
• Sensing Device: ontology for meteorological sensors (aws)
• Feature of Interest: Climate and Forecast (cf-feature + cf-property)
• Platform location: GeoSPARQL and Location Core Vocabulary (geosparql + locn)
• Observation: W3C Time Ontology (time)
• Observation value: Library of Quantity Kinds and Units (qu + dim)
• Dolce Ultra Light (dul)
13
Description of Weather Station SSN + LOCATION + GEOMETRY
What is a weather station?
It is an ssn:Platform and an ssn:System.
• A Platform is not the set of software used to manage the sensor nodes.
A Platform is an entity to which other entities can be attached.
Where is the weather station?
The location is always associated with a Platform individual.
• The WGS84 vocabulary does not distinguish between the spatial feature and its geometrical representation (a point). A spatial feature may have several geometrical representations depending on the scale (point, polygon, etc.).
Spatial queries: where are the sensors near "Clermont Ferrand"?
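A rough sketch of what the "near Clermont Ferrand" question computes. On the published dataset this would be a spatial query over the GeoSPARQL geometries; the plain-Python great-circle distance below is only a stand-in for that. The coordinates are approximate, illustrative values (assumptions, not taken from the dataset), and the helper names are hypothetical.

```python
import math

# Approximate (latitude, longitude) in decimal degrees: illustrative assumptions.
CLERMONT_FERRAND = (45.78, 3.08)
PLATFORMS = {"lesPalanquinsVP2_1": (46.33, 3.43)}  # the Montoldre station


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points (spherical Earth)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def platforms_near(platforms, centre, radius_km):
    """Names of the platforms whose point geometry lies within radius_km of centre."""
    lat, lon = centre
    return sorted(
        name
        for name, (plat, plon) in platforms.items()
        if haversine_km(lat, lon, plat, plon) <= radius_km
    )


nearby = platforms_near(PLATFORMS, CLERMONT_FERRAND, radius_km=100)
```

Because the location is attached to the Platform, the sensors "near" a place are found through the platform that hosts them.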
14
Description of Weather Station SSN + LOCATION + GEOMETRY
15
Description of sensors SSN + AWS + CF-PROPERTY
Which type of sensor?
• It is hard to find the specific type of sensor.
• Documentation is incomplete and not precise enough.
What type of phenomenon does the sensor observe?
Cf-property individuals are not declared as instances of the ssn:Property class.
No problem: the constraint on the property ssn:observes will infer that these individuals are instances of the ssn:Property class.
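A minimal sketch of the inference mentioned above, assuming the "constraint" on ssn:observes behaves like an rdfs:range of ssn:Property (the slide does not say how it is modelled). Triples are plain Python tuples; the sensor and property individuals are hypothetical names used for illustration.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """Apply the RDFS range rule once: (s p o) and (p rdfs:range C) => (o rdf:type C)."""
    ranges = {s: o for s, p, o in triples if p == RDFS_RANGE}
    inferred = set(triples)
    for s, p, o in triples:
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred


triples = {
    # Assumed schema statement.
    ("ssn:observes", RDFS_RANGE, "ssn:Property"),
    # Hypothetical data: the cf-property individual has no declared type.
    ("ex:temperatureSensor", "ssn:observes", "cf-property:air_temperature"),
}

closure = infer_range_types(triples)
```

This only helps when a reasoner (or a pre-processing step like the one above) actually runs. On an endpoint without a reasoner, the inferred rdf:type triples would have to be materialised before loading.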
Which station do the sensors belong to?
The property ssn:onPlatform should be used between a sensor and the weather station.
• Query: how many sensors are onPlatform lesPalanquinsVP2_1? No results.
16
Description of Sensors
17
Description of Observation SSN (DUL) + CF-FEATURE +CF-PROPERTY+ QU
Observation describes the context of measurement.
Which sensor makes the measurement?
What is measured?
What is the measured data?
What is the unit of the data?
• DUL properties and QU properties are redundant: which ones should be used, and why?
• Lots of (blank) nodes between the observation and the data value
• Hard to find a URI pattern for observations: at_Time_of_Plateform_Sensor_on_Property
A sensor (rain gauge) can observe several properties, so the property must be part of the URI.
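The URI pattern above could be generated along these lines. This is a sketch under assumptions: the base URI, the timestamp format, and the sensor and property names are hypothetical, and only the at_..._of_..._on_... shape comes from the slide.

```python
from datetime import datetime
from urllib.parse import quote

# Hypothetical base URI, not the one used by the published dataset.
BASE = "http://example.org/weather/observation/"


def observation_uri(when, platform, sensor, prop):
    """Build an observation URI following at_Time_of_Platform_Sensor_on_Property."""
    # Compact timestamp without colons, so nothing needs percent-encoding.
    stamp = when.strftime("%Y-%m-%dT%H%M%S")
    local_name = f"at_{stamp}_of_{platform}_{sensor}_on_{prop}"
    return BASE + quote(local_name, safe="")


when = datetime(2012, 5, 3, 14, 30)
quantity_uri = observation_uri(when, "lesPalanquinsVP2_1", "rainGauge", "rainfall_amount")
rate_uri = observation_uri(when, "lesPalanquinsVP2_1", "rainGauge", "rainfall_rate")
```

Including the property keeps the two rain gauge observations made at the same time distinct, which a time + platform + sensor pattern alone would not.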
18
Description of Observation
19
Description of Observation SSN + TIME
Observation describes the context of measurement.
When was the measurement made?
A measurement can be an instantaneous event: temperature, pressure, humidity.
A measurement may be an interval event: precipitation quantity, precipitation rate, wind direction, wind speed, solar radiation.
• Lack of documentation (wind direction)
Aggregation queries:
Find the unusual days.
On which days is the average temperature above the expected monthly temperature?
Find the days when the farmer cannot go to work (too much precipitation or wind).
Give me the dates on which the daily precipitation quantity is above a threshold.
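The last aggregation question can be sketched in plain Python as follows. This is an illustration of the grouping logic only, not the query run on the endpoint; the sample values are invented for the example, and each interval is assigned to the calendar day of its timestamp (an assumption).

```python
from collections import defaultdict
from datetime import datetime


def days_above_threshold(observations, threshold_mm):
    """Return the dates whose summed precipitation exceeds threshold_mm.

    observations: iterable of (timestamp, precipitation_mm) pairs,
    one pair per 30-minute interval.
    """
    totals = defaultdict(float)
    for timestamp, amount_mm in observations:
        totals[timestamp.date()] += amount_mm
    return sorted(day for day, total in totals.items() if total > threshold_mm)


# Invented sample data, for illustration only.
sample = [
    (datetime(2012, 5, 3, 14, 0), 6.0),
    (datetime(2012, 5, 3, 14, 30), 5.5),
    (datetime(2012, 5, 4, 9, 0), 1.0),
]
wet_days = days_above_threshold(sample, threshold_mm=10.0)
```

In SPARQL 1.1 the same shape would be a GROUP BY on the day with SUM and a HAVING filter.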
20
Time Instant Observation
21
Time Interval Observation
22
Convert data to linked data representation
TRANSFORMATION FROM CSV TO RDF
• Timestamps and duration creation
• Wind direction conversion
• Split by month
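The three steps above might look like the sketch below. Everything specific here is an assumption made for illustration: the column names (date, time, wind_dir), the date format, wind direction recorded as a 16-point compass label, and each timestamp marking the end of a 30-minute interval. The real CSV layout is not given in the slides.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

# 16-point compass rose, clockwise from north (22.5 degrees per step).
COMPASS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def wind_direction_degrees(point):
    """Convert a compass label such as 'SSW' to degrees from north."""
    return COMPASS.index(point) * 22.5


def interval_for(date_text, time_text, minutes=30):
    """Return (start, end) of the interval that ends at the row timestamp."""
    end = datetime.strptime(f"{date_text} {time_text}", "%Y-%m-%d %H:%M")
    return end - timedelta(minutes=minutes), end


def split_by_month(csv_text):
    """Group converted rows under a 'YYYY-MM' key, one group per output file."""
    months = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        start, end = interval_for(row["date"], row["time"])
        months[end.strftime("%Y-%m")].append({
            "start": start.isoformat(),
            "end": end.isoformat(),
            "wind_dir_deg": wind_direction_degrees(row["wind_dir"]),
        })
    return dict(months)


sample_csv = "date,time,wind_dir\n2012-01-31,23:30,N\n2012-02-01,00:00,SSW\n"
by_month = split_by_month(sample_csv)
```

Each monthly group can then be serialised to its own RDF file, which keeps individual files small enough to load and review.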
23
Provide Machine Access to Data DEMO
http://ontology.irstea.fr
Select weather data
SPARQL endpoint
http://ontology.irstea.fr/weather/snorql/
RDF server: Jena Fuseki
No reasoner
Dataset
8 types of measurement × 48 measurements per day × 365 days × 4 years = 560,640 observations
9,300,000 triples
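The dataset size above can be checked with a few lines of arithmetic. The triples-per-observation ratio is derived here from the two figures on the slide; it is not a number stated by the authors.

```python
measurement_types = 8
measurements_per_day = 24 * 60 // 30  # one every 30 minutes
days_per_year = 365                   # as on the slide (leap day ignored)
years = 4                             # 2010 to 2013

observations = measurement_types * measurements_per_day * days_per_year * years

triples = 9_300_000
triples_per_observation = triples / observations

print(observations)
print(round(triples_per_observation, 1))
```

Roughly 16 to 17 triples per observation is consistent with the remark that many intermediate nodes sit between an observation and its data value.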
24
Recommendations
• Find a set of ontologies that are built to be connected together
• Never create a new class, just reference existing classes from other ontologies
• Good URIs are not so easy to design
• Define a pattern (see Cool URIs)
• Create URIs for individuals with / only (or #?)
• No blank nodes, so that the dataset can be browsed
• Review your dataset with several reviewers (ssn workshop)
25
Conclusion & Perspectives
Not so easy to do it well!
Promote our dataset
• Find a suitable licence
• Publish it on Datahub
Use it as a benchmark to run aggregation queries
New dataset about hydrology
Query a dataset in French, in natural language
One day to publish a dataset? OK, we did it in 6 months.
Thanks for your attention!
27
W3C Semantic Sensor Network Incubator Group: SSN-XG
SSN-XG: March 2009
41 participants from 16 organisations, including leading names in ontologies and sensor networks: CSIRO, Wright State University, OGC, DERI, OEG, Knoesis, etc.
Objectives:
• Propose a unified model of sensor data and metadata
• Review the state of the art on existing sensor ontologies
• Propose methods for developing intelligent applications that work on sensor data
Result:
an ontology that integrates several existing ontologies, validated in projects.
Final Report 28 June 2011
http://www.w3.org/2005/Incubator/ssn/XGR-ssn-20110628/
28
Semantic Sensor Network Ontology
OWL 2 format, available on the web and documented
(!!) Sensor-oriented only, compatible with OGC standards
Aligned with the upper-level ontology Dolce Ultra Light (DUL)
• Eases integration with other ontologies
SSN is never used alone (!!): each application reuses only a subset of the ontology
Modular ontology based on design patterns
• Import only the necessary parts
• Eases the evolution of the ontology
Addresses several use cases (4)
• Allows several levels of description
• "Redundancy" is intentional and necessary
Semantic Sensor Network Ontology: http://www.w3.org/2005/Incubator/ssn/ssnx/ssn
M. Compton et al. The SSN ontology of the W3C semantic sensor network incubator group. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 17, December 2012, pp. 25–32
29
Ontology Design Pattern: ODP SSO STIMULUS SENSOR OBSERVATION
A sensor is anything that observes
How does it sense?
What is sensed?
What senses?
30
Ontology Design Pattern: SSO in SSN STIMULUS SENSOR OBSERVATION
A sensor is anything that observes
How does it sense?
What is sensed?
What senses?
31
DUL and SSN