Realising the Potential of Algal Biomass Production through Semantic Web and Linked Data

I-Semantics 2012, 5th September 2012, Graz. Realising the Potential of Algal Biomass Production through Semantic Web and Linked data: The LEAPS Framework. Monika Solanki, Knowledge Based Engineering Lab, Birmingham City University, UK. Joint work with Johannes Skarka, Karlsruhe Institute of Technology, ITAS.

Upload: monika-solanki

Post on 29-Aug-2014


DESCRIPTION

Presentation at I-Semantics 2012, Graz

TRANSCRIPT

Page 1: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monika.solanki@bcu.ac.uk, I-Semantics 2012, 5th September 2012, Graz

Realising the Potential of Algal Biomass Production through Semantic Web and Linked data

The LEAPS Framework

Monika Solanki
Knowledge Based Engineering Lab
Birmingham City University, UK

Joint work with
Johannes Skarka
Karlsruhe Institute of Technology, ITAS

Outline

1. Motivation
2. Modelling Algal Biomass Knowledge
3. Lifting XML datasets to Linked data
4. System Architecture
5. Querying Linked Algal Biomass Data
6. Conclusion and Future work

Motivation


Algal biomass as biofuels

  • Extensive research is being undertaken in the search and production of naturally viable and sustainable energy sources
  • The idea that algae biomass based biofuels could serve as an alternative to fossil fuels has been embraced by councils across the globe
  • Major companies, government bodies and dedicated non-profit organisations are getting involved
  • The domain is a rich source of data/information/knowledge

http://www.algalbiomass.org
http://www.eaba-association.eu
http://www.enalgae.eu

Algal biomass as biofuels: Observations

  • No systematic analysis of the algae biomass potential for North-Western Europe
  • Most of the knowledge is buried in various formats of images, spreadsheets, proprietary data sources and grey literature
  • Lack of a knowledge-level infrastructure that is equipped with the capabilities to provide semantic grounding to the datasets for algal biomass
  • Low levels of motivation among stakeholders for datasets to be interlinked, shared and reused within the biomass community

LEAPS: A Potential Solution
Linked Entities for Algal Plant Sites

  • Motivating the use of Semantic Web technologies and LOD for the algal biomass domain
  • Laying out a set of ontological requirements for knowledge representation that support the publication of algal biomass data
  • Elaborating on how algal biomass datasets are transformed to their corresponding RDF model representation
  • Interlinking the generated RDF datasets along spatial dimensions with other datasets on the Web of data
  • Visualising the linked datasets via an end-user LOD REST Web service

EnAlgae: Energetic Algae

  • Aims to reduce CO2 emissions and dependency on unsustainable energy sources in North West Europe
  • 4-year strategic initiative of the Interreg IVb NWE programme
  • 19 partners and 14 observers across 7 EU states
  • Coordinated set of activities focussing on sharing best practice, developing effective stakeholder engagement and encouraging transnational cooperation

http://www.enalgae.eu

EnAlgae: Some of the objectives

  • Accelerate development of sustainable technologies for biomass production
  • Create a network of pilot-scale algal facilities across NWE in order to address the current lack of verifiable information on algal productivity
  • Maintain an up-to-date inventory in which pilots collect and share data in a standardised manner
  • Combine information across the entire algal bioenergy delivery chain into a comprehensive and user-friendly Decision Support System for practitioners, policy makers and investors

http://www.enalgae.eu

SW and Linked data for Algal Biomass

  • Algal biomass data manifests itself across several facets
  • The value/supply chain ranges from cultivation of algae to production of biofuels and other products
  • Cultivation, harvesting, processing and fuel production further involve several intermediate processes
  • Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies
  • Each of the facets consumes and produces a large volume of unstructured data and information

Algal Supply Chain

SW, Linked data and the Algal Supply Chain

Competency questions for stage 1 datasets
Data driven

  • Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kgs, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
  • Which are the top ten algal operation sites with the lowest impact on global warming potential?
  • For a given algal operation site, which are the first five most cost-effective combinations of light, water, nutrients and CO2 sources?
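
The second competency question is a plain "lowest n" ranking. The following is a minimal Python sketch of that ranking over in-memory records, not the framework's actual implementation; the `SiteImpact` record, its `gwp` field and the sample values are hypothetical and only illustrate the shape of the question.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteImpact:
    site_id: str  # hypothetical site identifier
    gwp: float    # hypothetical global warming potential score (lower is better)


def lowest_impact_sites(sites, n=10):
    """Return the n sites with the smallest GWP, ordered from lowest to highest."""
    return heapq.nsmallest(n, sites, key=lambda s: s.gwp)


# Made-up sample data, for illustration only.
sites = [
    SiteImpact(f"site-{i}", gwp)
    for i, gwp in enumerate([4.2, 1.1, 3.3, 0.7, 2.5])
]

top = lowest_impact_sites(sites, n=3)
print([s.site_id for s in top])
```

With real data the same ranking would more likely be expressed in the query layer (for example with ORDER BY and LIMIT in SPARQL); the sketch only makes the intent of the question explicit.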

Modelling Algal Biomass Knowledge

Ontological requirements

Ontologies needed to represent:

  • Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
  • Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
  • Units and Measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement, i.e. kg/hectare or kg/annum
  • Territorial units for statistics: core concepts of the NUTS system
  • Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
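
The units requirement (conventional units plus bespoke ratio units such as kg/hectare) can be illustrated with a small value-plus-unit data structure. This is a hedged Python sketch only: the `Unit` and `Quantity` classes are hypothetical and do not reproduce the QUDT vocabulary or the LEAPS ontologies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    symbol: str  # e.g. "kg" or "ha"

    def per(self, other: "Unit") -> "Unit":
        """Derive a bespoke ratio unit such as kg/ha."""
        return Unit(f"{self.symbol}/{other.symbol}")


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: Unit

    def divide(self, other: "Quantity") -> "Quantity":
        """Divide two quantities, keeping track of the resulting ratio unit."""
        return Quantity(self.value / other.value, self.unit.per(other.unit))


KG = Unit("kg")
HA = Unit("ha")

# Made-up figures: 60000 kg of biomass from a 2 ha site.
areal_yield = Quantity(60000.0, KG).divide(Quantity(2.0, HA))
print(areal_yield.value, areal_yield.unit.symbol)
```

Keeping the numeric value and the unit as separate parts of one quantity is the same separation that lets a query compare numbers while still reporting the unit alongside them.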

Ontologies for Algal Biomass: Reuse

  • Spatial Data: WGS84, spatial relations, Geonames, NeoGeo
  • Geometries: WGS84, extended NeoGeo
  • Units and Measurements: extended QUDT

http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit

Ontologies for Algal Biomass: Reuse

Ontologies for Algal Biomass: Domain knowledge

  • Ontologies for modelling spatial knowledge, units and measurements were reused
  • Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial
  • Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
  • The design was very strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain

Ontologies for Algal Biomass: Domain knowledge

Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Lifting XML datasets to Linked data: Raw data

Lifting XML datasets to Linked data

First step

  • The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
  • Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
      • This brings uniformity in the format of representation of the datasets and in the process of transformation
      • Important computations that are part of the final datasets are performed

Lifting XML datasets to Linked data

Second step

  • The original data sources had several limitations, and a one-to-one transformation was not possible
      • The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
      • In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
      • The site, source and NUTS identifiers in the datasets were string literals rather than URIs
  • A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
  • It utilises a complex underlying data structure to facilitate the transformation
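
The second step can be sketched as follows: select elements with XPath, mint URIs from the string identifiers, and emit triples. This is an illustrative Python sketch under stated assumptions, not the bespoke parser described above. The XML element names, the `http://example.org/biomass/` base URI and the property names are all hypothetical, and the output is written as simple N-Triples-style strings.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/biomass/"  # hypothetical base URI

# Hypothetical XML layout; the real ArcGIS-specific format is not shown in the slides.
SAMPLE_XML = """
<Pipelines>
  <Pipeline id="P1"><SiteID>S1</SiteID><SourceID>C7</SourceID></Pipeline>
  <Pipeline id="P2"><SiteID>S1</SiteID><SourceID>C9</SourceID></Pipeline>
</Pipelines>
"""


def mint(kind, local_id):
    """Turn a string identifier into a URI reference under the base URI."""
    return f"<{BASE}{kind}/{local_id}>"


def lift(xml_text):
    """Select pipeline elements with XPath and emit one triple per link."""
    root = ET.fromstring(xml_text)
    triples = []
    for pipe in root.findall("./Pipeline"):
        pipeline = mint("pipeline", pipe.get("id"))
        site = mint("site", pipe.findtext("SiteID"))
        source = mint("co2source", pipe.findtext("SourceID"))
        # Hypothetical property URIs, used only for this illustration.
        triples.append(f"{pipeline} <{BASE}def/hasSite> {site} .")
        triples.append(f"{pipeline} <{BASE}def/hasSource> {source} .")
    return triples


for triple in lift(SAMPLE_XML):
    print(triple)
```

Because sites and sources become URIs rather than string literals, the same resource can be referenced from several datasets, which is what makes the later interlinking possible.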

Lifting XML datasets to Linked data

  • Four datasets were transformed and stored in distributed triple store repositories
  • The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query the dataset
  • We retrieved the dataset dump and curated it in our local triple store as a separate repository
  • The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data

Lifting XML datasets to Linked data

System Architecture

Architecture: Main components

  • Parsing modules: lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine: producing the linked data representation of the datasets
  • Triple store: OWLIM-SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web Interface

Querying Linked Algal Biomass Data

  • Most queries over the datasets are based on retrieving knowledge centred around location information
  • The queries are federated across the various repositories holding the linked data

Representative Query

Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kgs, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.

Typical Query

    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER((?emissionValue < 130000)
          && (contains(str(?region), "UKM61"))
          && (?arealYield > 30)
          && (?totalCO2CostValue < 5000))
    }
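
The logic of the federated query is a join of sites and CO2 sources through the pipeline records, followed by the four filter conditions. The Python sketch below mirrors that logic over in-memory records as an illustration only; the record classes, field names and sample data are hypothetical, while the thresholds are taken from the representative query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    site_id: str
    region: str         # region URI as a string (hypothetical values below)
    areal_yield: float  # tons per hectare


@dataclass(frozen=True)
class Source:
    source_id: str
    emission: float     # kg of CO2


@dataclass(frozen=True)
class Pipeline:
    site_id: str
    source_id: str
    total_co2_cost: float  # GBP per ton of CO2


def matching_sites(sites, sources, pipelines, region_code="UKM61"):
    """Join sites and sources via pipelines, then apply the query's filters."""
    site_by_id = {s.site_id: s for s in sites}
    source_by_id = {c.source_id: c for c in sources}
    results = []
    for p in pipelines:
        site = site_by_id.get(p.site_id)
        source = source_by_id.get(p.source_id)
        if site is None or source is None:
            continue  # a pipeline with a dangling identifier joins to nothing
        if (source.emission < 130000
                and region_code in site.region
                and site.areal_yield > 30
                and p.total_co2_cost < 5000):
            results.append((site.site_id, source.source_id))
    return results


# Made-up sample data, for illustration only.
sites = [Site("S1", "http://example.org/nuts/UKM61", 35.0),
         Site("S2", "http://example.org/nuts/UKM62", 40.0)]
sources = [Source("C1", 100000.0), Source("C2", 200000.0)]
pipelines = [Pipeline("S1", "C1", 4000.0),
             Pipeline("S1", "C2", 3000.0),
             Pipeline("S2", "C1", 1000.0)]

print(matching_sites(sites, sources, pipelines))
```

The shared `site_id` and `source_id` fields play the role of the `?siteID` and `?sourceID` variables, which is how the three SERVICE blocks are tied together in the query.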

Related Efforts, Conclusions and Future Work

Related efforts

  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based Decision support tool BIOPOLE for biomass plants feeding district heating systems
  • BioKDF: Bioenergy knowledge discovery framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions

  • Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
  • As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications

Summary

The LEAPS framework exploits SW and LD for the algal biomass community by:

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work

  • Several other datasets need to be integrated once they become available
  • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks

Page 3: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Motivation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal biomass as biofuels

Extensive research is being undertaken in the search andproduction of naturally viable and sustainable energysourcesThe idea that algae biomass based biofuels could serve asan alternative to fossil fuels has been embraced bycouncils across the globeMajor companies government bodies and dedicated nonprofit organisations are getting involvedThe domain is a rich source of datainformationknowledge

httpwwwalgalbiomassorghttpwwweaba-associationeu

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal biomass as biofuels Observations

No systematic analysis of the algae biomass potential forNorth-Western EuropeMost of the knowledge buried in various formats of imagesspreadsheets proprietary data sources and grey literatureLack of a knowledge level infrastructure that is equippedwith the capabilities to provide semantic grounding to thedatasets for algal biomassLow levels of motivation among stakeholders for datasetsto be interlinked shared and reused within the biomasscommunity

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

LEAPS A Potential SolutionLinked Entities for Algal Plant Sites

motivate the use of Semantic Web technologies and LODfor the algal biomass domainlaying out a set of ontological requirements for knowledgerepresentation that support the publication of algalbiomass dataelaborating on how algal biomass datasets are transformedto their corresponding RDF model representationinterlinking the generated RDF datasets along spatialdimensions with other datasets on the Web of datavisualising the linked datasets via an end user LOD RESTWeb service

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Energetic Algae

Aims to reduce CO2 emissions and dependency onunsustainable energy sources in North West Europe4 Year Strategic initiative of Interreg IVb NWE programme

19 partners and 14 Observers across 7 EU states

Coordinated set of activities focussing on sharing bestpractice developing effective stakeholder engagement andencouraging transnational cooperation

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Some of the objectives

Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW and Linked data for Algal Biomass

Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW Linked data and the Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Competency questions for stage 1 datasetsData driven

Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Modelling Algal Biomass Knowledge

Ontological requirements

Ontologies needed to represent:
• Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
• Geometries: area of the cultivation site; extents, polygons, linear and ring arrays
• Units and Measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement, e.g. kg/hectare or kg/annum
• Territorial units for statistics: core concepts of the NUTS system
• Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines

Ontologies for Algal Biomass: Reuse

• Spatial Data: WGS84, spatial relations, Geonames, NeoGeo
• Geometries: WGS84, extended NeoGeo
• Units and Measurements: extended QUDT

http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit
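As a rough illustration of the units-and-measurements requirement, the sketch below writes one measured value as N-Triples. It assumes the quantity pattern that the typical query in this deck navigates: a domain property points to a quantity node, which has a qudt:quantityValue node carrying qudt:numericValue and qudt:unit. Every namespace URI, the site URI and the unit name are placeholders invented for the example, not identifiers from the LEAPS ontologies.

```python
# Sketch only: all namespace URIs below are placeholders chosen for illustration.
QUDT = "http://example.org/qudt#"          # stand-in for the QUDT schema namespace
SITE = "http://example.org/biomass/site#"  # hypothetical domain ontology namespace
UNIT = "http://example.org/biomass/unit#"  # hypothetical namespace for bespoke units
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def quantity_triples(subject, prop, value, unit, node_id):
    """Return N-Triples lines for subject -> quantity -> value node (number + unit)."""
    quantity = f"_:q{node_id}"    # blank node for the quantity
    value_node = f"_:v{node_id}"  # blank node for the quantity value
    return [
        f"<{subject}> <{prop}> {quantity} .",
        f"{quantity} <{QUDT}quantityValue> {value_node} .",
        f'{value_node} <{QUDT}numericValue> "{value}"^^<{XSD_DOUBLE}> .',
        f"{value_node} <{QUDT}unit> <{unit}> .",
    ]


# Hypothetical site with an areal yield expressed in a bespoke unit.
lines = quantity_triples(
    subject="http://example.org/biomass/site/42",
    prop=SITE + "hasArealYield",
    value=31.5,
    unit=UNIT + "TonnePerHectare",
    node_id=1,
)
print("\n".join(lines))
```

Keeping the number and the unit on the same value node is what lets a query filter on the number while still reporting which unit it is expressed in.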

Ontologies for Algal Biomass: Domain knowledge

• Ontologies for modelling spatial knowledge, units and measurements were reused
• Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial
• Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
• The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain

Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Lifting XML datasets to Linked data: Raw data

Lifting XML datasets to Linked data: First step

• The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
• Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
  • This brings uniformity to the format in which the datasets are represented
  • Important computations that are part of the final datasets are performed during the transformation

Lifting XML datasets to Linked data: Second step

• The original data sources had several limitations, and a one-to-one transformation was not possible
  • The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
  • In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
  • The site, source and NUTS identifiers in the datasets were string literals rather than URIs
• A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
• It utilises a complex underlying data structure to facilitate the transformation
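The second step can be pictured with a minimal sketch; it is not the LEAPS parser. It uses the XPath subset in Python's standard library to select pipeline records, mints URIs from the string identifiers, and emits direct site-to-source links so that later queries need not go through the pipeline records. The XML layout, the URI base and the property names are all assumptions made for the example, since the slides do not show the real ArcGIS export format.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shape; the real ArcGIS-specific format is not shown in the slides.
XML = """
<pipelines>
  <pipeline id="P1"><site>S7</site><source>C3</source><nuts>UKM61</nuts></pipeline>
  <pipeline id="P2"><site>S7</site><source>C9</source><nuts>UKM61</nuts></pipeline>
</pipelines>
"""

BASE = "http://example.org/biomass/"        # placeholder URI base
SUPPLIED_BY = BASE + "ontology#suppliedBy"  # hypothetical property
IN_NUTS = BASE + "ontology#inNUTSRegion"    # hypothetical property


def mint(kind, local_id):
    """Turn a string identifier into a URI under the placeholder base."""
    return f"{BASE}{kind}/{local_id}"


def lift(xml_text):
    """Select pipeline records with XPath and emit N-Triples linking sites to sources."""
    root = ET.fromstring(xml_text)
    triples = set()  # a set drops statements repeated across pipeline records
    for pipe in root.findall("./pipeline"):
        site = mint("site", pipe.findtext("site"))
        source = mint("source", pipe.findtext("source"))
        region = mint("nuts", pipe.findtext("nuts"))
        triples.add(f"<{site}> <{SUPPLIED_BY}> <{source}> .")
        triples.add(f"<{site}> <{IN_NUTS}> <{region}> .")
    return sorted(triples)


triples = lift(XML)
for line in triples:
    print(line)
```

Once the identifiers are URIs rather than string literals, the same site or region can be referenced from several datasets and linked to external resources.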

Lifting XML datasets to Linked data

• Four datasets were transformed and stored in distributed triple store repositories
• The NUTS regions dataset in RDF was available, but there was no SPARQL endpoint or service to query the dataset
• We retrieved the dataset dump and curated it in our local triple store as a separate repository
• The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data

System Architecture

Architecture: Main components

• Parsing modules: lifting the data from their original formats to RDF
• Ontologies
• Linking engine: producing the linked data representation of the datasets
• Triple store: OWLIM-SE 5.0
• REST Web services
• SPARQL endpoints
• Web Interface

Querying Linked Algal Biomass Data

• Most queries over the datasets are based on retrieving knowledge centred around location information
• The queries are federated across the various repositories holding the linked data

Representative Query

Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.

Typical Query

WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID .
    ?site site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related .
  }
  FILTER((?emissionValue < 130000)
      && (contains(str(?region), "UKM61"))
      && (?arealYield > 30)
      && (?totalCO2CostValue < 5000))
}
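To make the join logic of the federated query easier to follow, here is a small in-memory analogue. It is a sketch only: the three lists stand in for the biomass, co2source and pipeline repositories, every record and value is invented for illustration, and the function name matching_sites is hypothetical. The pipeline records carry the site and source identifiers, so they act as the join table, and the four conditions mirror the FILTER in the query.

```python
# Hypothetical records standing in for the repositories; all values are invented.
sites = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 20.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 40.0},
]
sources = [
    {"sourceID": "C1", "emissionValue": 90000.0},
    {"sourceID": "C2", "emissionValue": 500000.0},
]
pipelines = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 4200.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 3000.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 1000.0},
    {"siteID": "S3", "sourceID": "C1", "totalCO2CostValue": 2500.0},
]


def matching_sites(sites, sources, pipelines, region_code):
    """Join sites and sources through pipeline records, then apply the four conditions."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {c["sourceID"]: c for c in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # a pipeline record with no matching site or source joins nothing
        if (source["emissionValue"] < 130000
                and region_code in site["region"]
                and site["arealYield"] > 30
                and pipe["totalCO2CostValue"] < 5000):
            results.append((site["siteID"], source["sourceID"]))
    return results


result = matching_sites(sites, sources, pipelines, "UKM61")
print(result)
```

In the federated version each SERVICE block plays the role of one of these lists, and the shared ?siteID and ?sourceID variables perform the same join.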

Related Efforts, Conclusions and Future Work

Related efforts

• AquaFuels: a taxonomy of algal strains, available as PDF
• BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
• BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
• Reegle: various energy-related datasets as linked open data and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions

• Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
• The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
• As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications

Summary

The LEAPS framework exploits SW and LD for the algal biomass community by:

• enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
• proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
• defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
• using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
• providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work

• Several other datasets need to be integrated once they become available
• One of the core datasets: algal strains from Algaebase
• Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
• Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks

  • Motivation
  • Modelling Algal Biomass Knowledge
  • Lifting XML datasets to Linked data
  • System Architecture
  • Querying Linked Algal Biomass Data
  • Conclusion and Future work
Page 4: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal biomass as biofuels

Extensive research is being undertaken in the search andproduction of naturally viable and sustainable energysourcesThe idea that algae biomass based biofuels could serve asan alternative to fossil fuels has been embraced bycouncils across the globeMajor companies government bodies and dedicated nonprofit organisations are getting involvedThe domain is a rich source of datainformationknowledge

httpwwwalgalbiomassorghttpwwweaba-associationeu

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal biomass as biofuels Observations

No systematic analysis of the algae biomass potential forNorth-Western EuropeMost of the knowledge buried in various formats of imagesspreadsheets proprietary data sources and grey literatureLack of a knowledge level infrastructure that is equippedwith the capabilities to provide semantic grounding to thedatasets for algal biomassLow levels of motivation among stakeholders for datasetsto be interlinked shared and reused within the biomasscommunity

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

LEAPS A Potential SolutionLinked Entities for Algal Plant Sites

motivate the use of Semantic Web technologies and LODfor the algal biomass domainlaying out a set of ontological requirements for knowledgerepresentation that support the publication of algalbiomass dataelaborating on how algal biomass datasets are transformedto their corresponding RDF model representationinterlinking the generated RDF datasets along spatialdimensions with other datasets on the Web of datavisualising the linked datasets via an end user LOD RESTWeb service

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Energetic Algae

Aims to reduce CO2 emissions and dependency onunsustainable energy sources in North West Europe4 Year Strategic initiative of Interreg IVb NWE programme

19 partners and 14 Observers across 7 EU states

Coordinated set of activities focussing on sharing bestpractice developing effective stakeholder engagement andencouraging transnational cooperation

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Some of the objectives

Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW and Linked data for Algal Biomass

Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW Linked data and the Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Competency questions for stage 1 datasetsData driven

Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Modelling Algal BiomassKnowledge

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontological requirements

Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT

httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology

spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf

httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Domainknowledge

Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain

Ontologies for Algal Biomass Domainknowledge

Ontologies available at httppurlorgbiomassontologies

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Designing URIs for Algal Biomass Data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets toLinked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked dataRaw data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format

brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Second stepThe original data sources had several limitations and aone-to-one transformation was not possible

The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs

A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF

Ontologies

Linking engine producing the linkeddata representation of the datasets

Triple store OWLIM SE 50

REST Web services

SPARQL endpoints

Web Interface

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Querying Linked Algal Biomass Data

Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query

Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region

Typical QueryWHERE

SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite

siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit

SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source

co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue

continued

Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline

pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit

SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion

owlsameAs relatedFILTER((emissionValue lt 130000)

ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related EffortsConclusions and

Future Work

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets

httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet

httpdatareegleinfo

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Summary

The LEAPS framework exploits SW and LD for the algalbiomass community

enabling the screening of data for promising individualplant sites and provides base data for more detailedplanning purposesproposing a set of domain specific ontologies for algalplant sites CO2 and pipelines to be shared and extendedby the communitydefining a linked data publishing architecture thattransforms raw data in disparate formats to a uniform XMLrepresentationusing a set of well established and domain specificontologies as metadata to transform it further into linkeddataproviding various data access options such as a SPARQLendpoint an interactive Google map interface and a RESTAPI for making the data accessible to stakeholders

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Future WorkSeveral other datasets need to be integrated once theybecome availableOne of the core datasets - algal strains from AlgaebaseMultifaceted visualisation of the integrated datasets tofacilitate the uptake of the framework by stakeholdersRule based reasoning to model and inference domainspecific constraints

httpwwwalgaebaseorg

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Many Thanks

  • Motivation
  • Modelling Algal Biomass Knowledge
  • Lifting XML datasets to Linked data
  • System Architecture
  • Querying Linked Algal Biomass Data
  • Conclusion and Future work
Page 5: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal biomass as biofuels Observations

No systematic analysis of the algae biomass potential forNorth-Western EuropeMost of the knowledge buried in various formats of imagesspreadsheets proprietary data sources and grey literatureLack of a knowledge level infrastructure that is equippedwith the capabilities to provide semantic grounding to thedatasets for algal biomassLow levels of motivation among stakeholders for datasetsto be interlinked shared and reused within the biomasscommunity

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

LEAPS A Potential SolutionLinked Entities for Algal Plant Sites

motivate the use of Semantic Web technologies and LODfor the algal biomass domainlaying out a set of ontological requirements for knowledgerepresentation that support the publication of algalbiomass dataelaborating on how algal biomass datasets are transformedto their corresponding RDF model representationinterlinking the generated RDF datasets along spatialdimensions with other datasets on the Web of datavisualising the linked datasets via an end user LOD RESTWeb service

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Energetic Algae

Aims to reduce CO2 emissions and dependency onunsustainable energy sources in North West Europe4 Year Strategic initiative of Interreg IVb NWE programme

19 partners and 14 Observers across 7 EU states

Coordinated set of activities focussing on sharing bestpractice developing effective stakeholder engagement andencouraging transnational cooperation

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Some of the objectives

Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW and Linked data for Algal Biomass

Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW Linked data and the Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Competency questions for stage 1 datasetsData driven

Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Modelling Algal BiomassKnowledge

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontological requirements

Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT

httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology

spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf

httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Domainknowledge

Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain

Ontologies for Algal Biomass Domainknowledge

Ontologies available at httppurlorgbiomassontologies

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Designing URIs for Algal Biomass Data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets toLinked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked dataRaw data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format

brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Second stepThe original data sources had several limitations and aone-to-one transformation was not possible

The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs

A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF

Ontologies

Linking engine producing the linkeddata representation of the datasets

Triple store OWLIM SE 50

REST Web services

SPARQL endpoints

Web Interface

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Querying Linked Algal Biomass Data

Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query

Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region

Typical QueryWHERE

SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite

siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit

SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source

co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue

continued

Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline

pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit

SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion

owlsameAs relatedFILTER((emissionValue lt 130000)

ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related EffortsConclusions and

Future Work

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related efforts

  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
  • BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions

  • Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum.
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations.
  • As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications.

Summary

The LEAPS framework exploits SW and LD for the algal biomass community by

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work

  • Several other datasets need to be integrated once they become available.
  • One of the core datasets: algal strains from Algaebase.
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders.
  • Rule-based reasoning to model and infer domain-specific constraints.

http://www.algaebase.org

Many Thanks

Page 6: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

LEAPS: A Potential Solution
Linked Entities for Algal Plant Sites

  • motivating the use of Semantic Web technologies and LOD for the algal biomass domain
  • laying out a set of ontological requirements for knowledge representation that support the publication of algal biomass data
  • elaborating on how algal biomass datasets are transformed to their corresponding RDF model representation
  • interlinking the generated RDF datasets along spatial dimensions with other datasets on the Web of data
  • visualising the linked datasets via an end-user LOD REST Web service

EnAlgae: Energetic Algae

  • Aims to reduce CO2 emissions and dependency on unsustainable energy sources in North West Europe
  • 4-year strategic initiative of the Interreg IVB NWE programme
  • 19 partners and 14 observers across 7 EU states
  • Coordinated set of activities focussing on sharing best practice, developing effective stakeholder engagement and encouraging transnational cooperation

http://www.enalgae.eu

EnAlgae: Some of the objectives

  • Accelerate development of sustainable technologies for biomass production
  • Create a network of pilot-scale algal facilities across NWE in order to address the current lack of verifiable information on algal productivity
  • Maintain an up-to-date inventory in which pilots collect and share data in a standardised manner
  • Combine information across the entire algal bioenergy delivery chain into a comprehensive and user-friendly Decision Support System for practitioners, policy makers and investors

http://www.enalgae.eu

SW and Linked data for Algal Biomass

  • Algal biomass data manifests itself across several facets.
  • The value/supply chain ranges from cultivation of algae to production of biofuels and other products.
  • Cultivation, harvesting, processing and fuel production further involve several intermediate processes.
  • Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies.
  • Each of the facets consumes and produces a large volume of unstructured data and information.

Algal Supply Chain

SW, Linked data and the Algal Supply Chain

Competency questions for stage 1 datasets
Data driven

  • Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
  • Which are the top ten algal operation sites with the lowest impact on global warming potential?
  • For a given algal operation site, which are the five most cost-effective combinations of light, water, nutrients and CO2 sources?

Modelling Algal Biomass Knowledge

Ontological requirements

Ontologies needed to represent:

  • Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
  • Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
  • Units and measurements: conventional measurement units such as kg for quantities and hectares for area; bespoke units of measurement, e.g. kg/hectare or kg/annum
  • Territorial units for statistics: core concepts of the NUTS system
  • Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
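The units-and-measurements requirement can be made concrete with the quantity pattern that the deck's federated query traverses: a site points to a quantity, which points to a quantity value carrying a number and a unit. The sketch below emits such triples as N-Triples strings. The `SITE` and `UNIT` namespaces, the URI layout and the function name are assumptions made for illustration, and the QUDT schema namespace is assumed to be `http://qudt.org/schema/qudt#`; none of these are confirmed by the slides.

```python
# Illustrative namespaces; the actual LEAPS namespaces are not given in the slides.
SITE = "http://example.org/biomass/site#"
UNIT = "http://example.org/biomass/unit#"
QUDT = "http://qudt.org/schema/qudt#"  # assumed QUDT schema namespace
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def areal_yield_triples(site_uri, value, unit_uri):
    """Return N-Triples lines: site -> quantity -> quantity value -> number and unit."""
    quantity = f"{site_uri}/arealYield"
    quantity_value = f"{quantity}/value"
    return [
        f"<{site_uri}> <{SITE}hasArealYield> <{quantity}> .",
        f"<{quantity}> <{QUDT}quantityValue> <{quantity_value}> .",
        f'<{quantity_value}> <{QUDT}numericValue> "{value}"^^<{XSD_DOUBLE}> .',
        f"<{quantity_value}> <{QUDT}unit> <{unit_uri}> .",
    ]


if __name__ == "__main__":
    # A bespoke unit such as tonnes per hectare gets its own (hypothetical) URI.
    for line in areal_yield_triples(
        "http://example.org/biomass/site/S1", 32.5, UNIT + "TonnePerHectare"
    ):
        print(line)
```

Keeping the unit on the quantity value, rather than baking it into the property name, is what lets a query filter on the number while still reporting the unit.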

Ontologies for Algal Biomass: Reuse

  • Spatial data: WGS84, spatial relations, Geonames, NeoGeo
  • Geometries: WGS84, extended NeoGeo
  • Units and measurements: extended QUDT

http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit

Ontologies for Algal Biomass: Reuse

Ontologies for Algal Biomass: Domain knowledge

  • Ontologies for modelling spatial knowledge, units and measurements were reused.
  • Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial.
  • Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development.
  • The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain.

Ontologies for Algal Biomass: Domain knowledge

Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Lifting XML datasets to Linked data: Raw data

Lifting XML datasets to Linked data

First step

  • The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS.
  • Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format. This step
      • brings uniformity to the format in which the datasets are represented
      • performs, in the process of transformation, important computations that are part of the final datasets

Lifting XML datasets to Linked data

Second step

  • The original data sources had several limitations, and a one-to-one transformation was not possible:
      • The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset.
      • In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset.
      • The site, source and NUTS identifiers in the datasets were string literals rather than URIs.
  • A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented.
  • It utilises a complex underlying data structure to facilitate the transformation.
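A minimal sketch of the lifting idea follows: pipeline records are selected with an XPath expression, string identifiers are turned into URIs, and a direct site-to-source link is emitted so that later queries need not go through the pipeline dataset. It is not the LEAPS parser. The XML element names, the base URI, the `hasCO2Source` property and the function names are all hypothetical, since the slides do not show the ArcGIS export schema or the parser's internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout and base URI; the real export schema is not shown in the slides.
SAMPLE_XML = """
<Pipelines>
  <Pipeline><SiteID>S1</SiteID><SourceID>C7</SourceID></Pipeline>
  <Pipeline><SiteID>S1</SiteID><SourceID>C9</SourceID></Pipeline>
</Pipelines>
"""
BASE = "http://example.org/biomass/"


def mint_uri(kind, identifier):
    """Turn a string identifier into a URI under the assumed base namespace."""
    return f"{BASE}{kind}/{identifier}"


def lift_pipelines(xml_text):
    """Select pipeline records via XPath and emit N-Triples linking sites to CO2 sources."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    # ElementTree supports a subset of XPath; ".//Pipeline" matches all descendants.
    for record in root.findall(".//Pipeline"):
        site = mint_uri("site", record.findtext("SiteID"))
        source = mint_uri("co2source", record.findtext("SourceID"))
        # Hypothetical property expressing the direct site-to-source relation.
        triples.append(f"<{site}> <{BASE}ontology#hasCO2Source> <{source}> .")
    return triples


if __name__ == "__main__":
    for line in lift_pipelines(SAMPLE_XML):
        print(line)
```

Minting URIs at this stage is what makes the identifiers linkable to other datasets, which string literals would not allow.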

Lifting XML datasets to Linked data

  • Four datasets were transformed and stored in distributed triple store repositories.
  • The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query the dataset.
  • We retrieved the dataset dump and curated it in our local triple store as a separate repository.
  • The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data.

Lifting XML datasets to Linked data


System Architecture


System Architecture


Architecture: Main components

  • Parsing modules: lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine: producing the linked data representation of the datasets
  • Triple store: OWLIM-SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web Interface

Page 8: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

EnAlgae Some of the objectives

Accelerate development of sustainable technologies forBiomass productionCreate a network of pilot scale algal facilities across NWEin order to address the current lack of verifiable informationon algal productivityMaintain an up to date inventory in which pilots collect andshare data in a standardised mannerCombine information across the entire algal bioenergydelivery chain into a comprehensive and user friendlyDecision Support System for practitioners policy makersand investors

httpwwwenalgaeeu

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW and Linked data for Algal Biomass

Algal biomass data manifests itself across several facetsThe valuesupply chain ranges from cultivation of algae toproduction of biofuels and other productsCultivation harvesting processing and fuel productionfurther involves several intermediate processesEvery stage in the algal supply chain is governed byrequirements regulatory policies and strategiesEach of the facets consumes and produces a large volumeof unstructured data and information

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

SW Linked data and the Algal Supply Chain

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Competency questions for stage 1 datasetsData driven

Which are the algal operation sites with CO2 sources thathave CO2 emissions less than 130000 kgs where totalcosts of supplying CO2 is lower then 5000 GBP per ton ofCO2 areal yield is greater than 30 tons per hectare andwhich are located within the NUTS region ldquoUKM61rdquoSupplement the data with supporting information about theregionWhich are the top ten algal operation sites with the lowestimpact on global warming potentialFor a given algal operation site which are the first five mostcost effective combinations of light water nutrients andCO2 sources

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Modelling Algal BiomassKnowledge

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontological requirements

Ontologies needed to representSpatiality location of possible algae cultivation siteslocation of the sources of consumables (CO2 nutrientsand water)Geometries area of the cultivation site - extentspolygons linear and ring arraysUnits and Measurements conventional measurementunits such as Kgs for quantities and hectares for areabespoke units of measurements ie Kgshectare orKgsannumTerritorial units for statistics core concepts of the NUTSsystemDomain specific knowledge algae cultivation sites CO2sources pipelines

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT

httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology

spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf

httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Domainknowledge

Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain

Ontologies for Algal Biomass Domainknowledge

Ontologies available at httppurlorgbiomassontologies

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Designing URIs for Algal Biomass Data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets toLinked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked dataRaw data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format

brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Second stepThe original data sources had several limitations and aone-to-one transformation was not possible

The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs

A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF

Ontologies

Linking engine producing the linkeddata representation of the datasets

Triple store OWLIM SE 50

REST Web services

SPARQL endpoints

Web Interface

Querying Linked Algal Biomass Data

  • Most queries over the datasets are based on retrieving knowledge centred around location information
  • The queries are federated across the various repositories holding the linked data

Representative Query

Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region “UKM61”? Supplement the data with supporting information about the region.

Typical Query

WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
          site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related .
  }
  FILTER((?emissionValue < 130000)
      && (contains(str(?region), "UKM61"))
      && (?arealYield > 30)
      && (?totalCO2CostValue < 5000))
}
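To make the join logic of the query easier to follow, here is a small sketch that reproduces it over in-memory records instead of SPARQL repositories. Everything in it is illustrative: the three lists stand in for the biomass, co2source and pipeline repositories, all values are invented, and the function name and parameters are hypothetical. As in the query, sites and sources are connected through the pipeline records via their shared `siteID` and `sourceID`, and the four FILTER conditions are applied to each joined row.

```python
# Toy stand-ins for three of the repositories; all values are invented.
SITES = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 42.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 25.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 50.0},
]
SOURCES = [
    {"sourceID": "C1", "emissionValue": 90000.0},
    {"sourceID": "C2", "emissionValue": 200000.0},
]
PIPELINES = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 3200.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 1500.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 2800.0},
    {"siteID": "S3", "sourceID": "C1", "totalCO2CostValue": 4100.0},
]


def matching_sites(sites, sources, pipelines, nuts_code="UKM61",
                   max_emission=130000, min_yield=30, max_cost=5000):
    """Join sites and sources through pipelines, then apply the query's filters."""
    sites_by_id = {s["siteID"]: s for s in sites}
    sources_by_id = {s["sourceID"]: s for s in sources}
    results = []
    for pipe in pipelines:
        site = sites_by_id.get(pipe["siteID"])
        source = sources_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # no join partner, so the row is dropped as in SPARQL
        if (source["emissionValue"] < max_emission
                and nuts_code in site["region"]      # contains(str(?region), "UKM61")
                and site["arealYield"] > min_yield
                and pipe["totalCO2CostValue"] < max_cost):
            results.append((site["siteID"], source["sourceID"]))
    return results


if __name__ == "__main__":
    print(matching_sites(SITES, SOURCES, PIPELINES))
```

In the federated setting each of these lists lives behind its own SERVICE endpoint, and the query engine performs the equivalent join on the shared variables.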

Related Efforts, Conclusions and Future Work

Related efforts
  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
  • BioKDF: Bioenergy knowledge discovery framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions
  • Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations
  • As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications

Summary

The LEAPS framework exploits SW and LD for the algal biomass community by:

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work
  • Several other datasets need to be integrated once they become available
      • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks


SW and Linked data for Algal Biomass

  • Algal biomass data manifests itself across several facets
  • The value/supply chain ranges from cultivation of algae to production of biofuels and other products
  • Cultivation, harvesting, processing and fuel production further involve several intermediate processes
  • Every stage in the algal supply chain is governed by requirements, regulatory policies and strategies
  • Each of the facets consumes and produces a large volume of unstructured data and information

Algal Supply Chain

SW, Linked data and the Algal Supply Chain

Competency questions for stage 1 datasets: Data driven

  • Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region “UKM61”? Supplement the data with supporting information about the region.
  • Which are the top ten algal operation sites with the lowest impact on global warming potential?
  • For a given algal operation site, which are the five most cost-effective combinations of light, water, nutrients and CO2 sources?

Modelling Algal Biomass Knowledge

Ontological requirements

Ontologies needed to represent:
  • Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
  • Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
  • Units and Measurements: conventional measurement units, such as kg for quantities and hectares for area; bespoke units of measurement, i.e. kg/hectare or kg/annum
  • Territorial units for statistics: core concepts of the NUTS system
  • Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
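As an illustration of the units and measurements requirement, the sketch below emits the chain of triples that the representative query traverses for a measured value: subject, quantity, quantity value, then number and unit. The `qudt:quantityValue`, `qudt:numericValue` and `qudt:unit` property names are taken from that query; the function, the blank node labels, the `ex:` prefix and the `ex:TonnePerHectare` unit are hypothetical and only stand in for the bespoke units mentioned above.

```python
def quantity_triples(subject, prop, value, unit, label):
    """Return Turtle-style triples for one measured quantity.

    Shape: subject -> quantity -> quantity value -> (numeric value, unit).
    `label` is used to build blank node identifiers and is purely illustrative.
    """
    quantity = f"_:{label}"
    quantity_value = f"_:{label}Value"
    return [
        f"{subject} {prop} {quantity} .",
        f"{quantity} qudt:quantityValue {quantity_value} .",
        f'{quantity_value} qudt:numericValue "{value}"^^xsd:double .',
        f"{quantity_value} qudt:unit {unit} .",
    ]


if __name__ == "__main__":
    # ex:site1 and ex:TonnePerHectare are made-up names for this example.
    for triple in quantity_triples("ex:site1", "site:hasArealYield",
                                   42.0, "ex:TonnePerHectare", "yield1"):
        print(triple)
```

Keeping the unit next to the numeric value is what lets a query compare values such as areal yield or CO2 cost without guessing the unit from context.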

Ontologies for Algal Biomass: Reuse

  • Spatial Data: WGS84, spatial relations, Geonames, NeoGeo
  • Geometries: WGS84, extended NeoGeo
  • Units and Measurements: extended QUDT

http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit


Ontologies for Algal Biomass: Domain knowledge

  • Ontologies for modelling spatial knowledge, units and measurements were reused
  • Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial
  • Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
  • The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain

Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Raw data

Competency questions for stage 1 datasets: Data driven

  • Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.
  • Which are the top ten algal operation sites with the lowest impact on global warming potential?
  • For a given algal operation site, which are the first five most cost-effective combinations of light, water, nutrients and CO2 sources?

Modelling Algal Biomass Knowledge

Ontological requirements

Ontologies are needed to represent:

  • Spatiality: location of possible algae cultivation sites; location of the sources of consumables (CO2, nutrients and water)
  • Geometries: area of the cultivation site (extents, polygons, linear and ring arrays)
  • Units and measurements: conventional measurement units, such as kg for quantities and hectares for area; bespoke units of measurement, i.e. kg/hectare or kg/annum
  • Territorial units for statistics: core concepts of the NUTS system
  • Domain-specific knowledge: algae cultivation sites, CO2 sources, pipelines
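The distinction between conventional and bespoke units can be illustrated with a small Python sketch. It does not use the QUDT vocabulary itself; the `Quantity` class and the `areal_yield` helper are hypothetical names chosen for illustration, and they only mirror the idea that a quantity pairs a numeric value with a unit and that a bespoke unit such as kg/hectare is derived from two conventional ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A numeric value paired with a unit label (hypothetical helper type)."""
    numeric_value: float
    unit: str  # e.g. "kg", "ha", or a derived unit such as "kg/ha"


def areal_yield(mass: Quantity, area: Quantity) -> Quantity:
    """Derive a per-area quantity (a bespoke unit) from a mass and an area."""
    if area.numeric_value <= 0:
        raise ValueError("area must be positive")
    return Quantity(mass.numeric_value / area.numeric_value,
                    f"{mass.unit}/{area.unit}")


# Illustrative values only, not taken from the datasets.
harvest = Quantity(60000.0, "kg")
plot = Quantity(2.0, "ha")
print(areal_yield(harvest, plot))
```

Keeping the unit next to the number is what allows a later query to compare values safely, since the unit can be checked before the comparison is made.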

Ontologies for Algal Biomass: Reuse

  • Spatial data: WGS84, spatial relations, Geonames, NeoGeo
  • Geometries: WGS84, extended NeoGeo
  • Units and measurements: extended QUDT

http://www.w3.org/2003/01/geo/wgs84_pos
http://www.ordnancesurvey.co.uk/oswebsite/ontology/spatialrelations.owl
http://www.geonames.org/ontology/ontology_v2.2.1.rdf
http://geovocab.org/geometry
http://qudt.org/1.1/vocab/dimensionalunit


Ontologies for Algal Biomass: Domain knowledge

  • Ontologies for modelling spatial knowledge, units and measurements were reused.
  • Discovering vocabularies that conceptualise the domain knowledge for algal biomass was non-trivial.
  • Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development.
  • The design was strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain.

Ontologies available at http://purl.org/biomass/ontologies

Designing URIs for Algal Biomass Data

Lifting XML datasets to Linked data

Lifting XML datasets to Linked data: Raw data

Lifting XML datasets to Linked data: First step

  • The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS.
  • Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format. This step:
      • brings uniformity to the format of representation of the datasets
      • performs, in the process of transformation, important computations that are part of the final datasets

Lifting XML datasets to Linked data: Second step

  • The original data sources had several limitations, and a one-to-one transformation was not possible:
      • The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset. In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset.
      • The site, source and NUTS identifiers in the datasets were string literals rather than URIs.
  • A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented.
  • It utilises a complex underlying data structure to facilitate the transformation.
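The second step can be sketched in Python as follows. This is not the LEAPS parser: the XML element names, the `http://example.org/biomass/` base URI, the `vocab#` property names and the direct site-to-source link are all assumptions made for illustration. The sketch only shows the general pattern of selecting values with XPath-style expressions, turning string identifiers into URIs, and emitting triples.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple
from urllib.parse import quote

Triple = Tuple[str, str, str]

BASE = "http://example.org/biomass/"   # hypothetical base URI
VOCAB = BASE + "vocab#"                # hypothetical vocabulary namespace

# Hypothetical XML layout; the real ArcGIS-specific format is not shown in the slides.
SAMPLE_XML = """<pipelines>
  <pipeline id="P1"><siteID>S 17</siteID><sourceID>C42</sourceID></pipeline>
  <pipeline id="P2"><siteID>S18</siteID><sourceID>C42</sourceID></pipeline>
</pipelines>"""


def mint_uri(kind: str, identifier: str) -> str:
    """Turn a string-literal identifier into a URI, percent-encoding unsafe characters."""
    return BASE + kind + "/" + quote(identifier.strip(), safe="")


def lift(xml_text: str) -> List[Triple]:
    """Select pipeline records with an XPath expression and emit triples."""
    root = ET.fromstring(xml_text)
    triples: List[Triple] = []
    for pipe in root.findall("./pipeline"):
        pipe_uri = mint_uri("pipeline", pipe.get("id", ""))
        site_uri = mint_uri("site", pipe.findtext("siteID", default=""))
        source_uri = mint_uri("source", pipe.findtext("sourceID", default=""))
        triples.append((pipe_uri, VOCAB + "hasSite", site_uri))
        triples.append((pipe_uri, VOCAB + "hasSource", source_uri))
        # Assumed extra link: lets a site's sources be found without
        # traversing the pipeline record.
        triples.append((site_uri, VOCAB + "suppliedBy", source_uri))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise URI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(lift(SAMPLE_XML)))
```

Minting URIs in one helper keeps the identifier scheme consistent across the site, source and pipeline datasets, which is what makes the resulting resources linkable.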

Lifting XML datasets to Linked data

  • Four datasets were transformed and stored in distributed triple store repositories.
  • The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query the dataset.
  • We retrieved the dataset dump and curated it in our local triple store as a separate repository.
  • The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data.

System Architecture

Architecture: Main components

  • Parsing modules lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine producing the linked data representation of the datasets
  • Triple store: OWLIM SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web Interface

Querying Linked Algal Biomass Data

  • Most queries over the datasets are based on retrieving knowledge centred around location information.
  • The queries are federated across the various repositories holding the linked data.

Representative query:

Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.

Typical Query

    WHERE {
      SERVICE <http://localhost/repositories/biomass> {
        ?site a site:OperationSite ;
              site:inNUTSRegion ?region ;
              geo:location ?loc .
        ?loc geo:lat ?lat .
        ?loc geo:long ?long .
        ?site site:hasSiteID ?siteID ;
              site:hasArealYield ?z .
        ?z qudt:quantityValue ?y .
        ?y qudt:numericValue ?arealYield .
        ?y qudt:unit ?unit .
      }
      SERVICE <http://localhost/repositories/co2source> {
        ?source a co2:CO2Source ;
                co2:hasSourceID ?sourceID ;
                co2:hasCO2Emission ?emission .
        ?emission qudt:quantityValue ?emissionQty .
        ?emissionQty qudt:numericValue ?emissionValue .
      }
      SERVICE <http://localhost/repositories/pipeline> {
        ?pipe a pipe:Pipeline ;
              pipe:hasSiteID ?siteID ;
              pipe:hasSourceID ?sourceID ;
              pipe:hasTotalCO2Cost ?cost .
        ?cost qudt:quantityValue ?qty .
        ?qty qudt:numericValue ?totalCO2CostValue .
        ?qty qudt:unit ?totalCO2CostUnit .
      }
      SERVICE <http://localhost/repositories/region> {
        ?regionID a ramon:NUTSRegion ;
                  owl:sameAs ?related .
      }
      FILTER ((?emissionValue < 130000)
              && (contains(str(?region), "UKM61"))
              && (?arealYield > 30)
              && (?totalCO2CostValue < 5000))
    }
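The logic of the query, joining sites and CO2 sources through the pipeline records and then applying the four filter conditions, can be mimicked over in-memory records. The Python sketch below is only an illustration of that join-and-filter logic, not of how a triple store evaluates a federated query; the record layout, the sample values and the function name `candidate_pairs` are assumptions, while the thresholds come from the representative query.

```python
from typing import Dict, List, Tuple

# Illustrative records standing in for the three repositories (made-up values).
SITES: List[Dict] = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM62", "arealYield": 40.0},
]
SOURCES: List[Dict] = [
    {"sourceID": "C1", "emissionValue": 120000.0},
    {"sourceID": "C2", "emissionValue": 500000.0},
]
PIPELINES: List[Dict] = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 4200.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 3000.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 1000.0},
]


def candidate_pairs(sites, sources, pipelines, nuts_code="UKM61",
                    max_emission=130000, min_yield=30,
                    max_cost=5000) -> List[Tuple[str, str]]:
    """Join sites and sources via pipelines, then apply the query's filters."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {s["sourceID"]: s for s in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # pipeline refers to a record that is not present
        if (source["emissionValue"] < max_emission
                and nuts_code in site["region"]
                and site["arealYield"] > min_yield
                and pipe["totalCO2CostValue"] < max_cost):
            results.append((site["siteID"], source["sourceID"]))
    return results


print(candidate_pairs(SITES, SOURCES, PIPELINES))
```

The shared `siteID` and `sourceID` values play the same role here as the shared variables across the `SERVICE` blocks: they are what ties a site, a source and a pipeline together.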

Related Efforts, Conclusions and Future Work

Related efforts

  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
  • BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions

  • Investigation into using algal biomass as an alternative source of fuel is gaining widespread momentum.
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure the valuable knowledge harnessed through its operations.
  • As research in the sector progresses, a wealth of information will become available that could be exploited by domain-specific applications.

Summary

The LEAPS framework exploits SW and LD for the algal biomass community by:

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work

  • Several other datasets need to be integrated once they become available.
      • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks

Page 15: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

Spatial Data WGS84 spatial relations GeonamesNeoGeoGeometries WGS84 extended NeoGeoUnits and Measurements extended QUDT

httpwwww3org200301geowgs84_poshttpwwwordnancesurveycoukoswebsiteontology

spatialrelationsowlhttpwwwgeonamesorgontologyontology_v221rdf

httpgeovocaborggeometryhttpqudtorg11vocabdimensionalunit

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Reuse

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Ontologies for Algal Biomass Domainknowledge

Ontologies for modelling spatial knowledge units andmeasurements were reusedDiscovering vocabularies conceptualising the domainknowledge for algal biomass was non trivialConcepts and relationships for algal biomass had to bedefined from ground-up in accordance to the principles ofontology developmentThe design was very strongly guided by feedback fromquestionnaires made available to the stakeholdersinterviews with domain experts providers of raw datasetsand grey literature from the algal biomass and biofuelsdomain

Ontologies for Algal Biomass Domainknowledge

Ontologies available at httppurlorgbiomassontologies

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Designing URIs for Algal Biomass Data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets toLinked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked dataRaw data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format

brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Second stepThe original data sources had several limitations and aone-to-one transformation was not possible

The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs

A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

  • Four datasets were transformed and stored in distributed triple store repositories
  • The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query it
  • We retrieved the dataset dump and curated it in our local triple store as a separate repository
  • The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data

Lifting XML datasets to Linked data

System Architecture

Architecture: Main components
  • Parsing modules: lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine: producing the linked data representation of the datasets
  • Triple store: OWLIM-SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web Interface

Querying Linked Algal Biomass Data
  • Most queries over the datasets are based on retrieving knowledge centered around location information
  • The queries are federated across the various repositories holding the linked data

Representative Query

Which are the algal operation sites with CO2 sources that have CO2 emissions less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.

Typical Query

WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
          site:inNUTSRegion ?region ;
          geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
          site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }
  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
            co2:hasSourceID ?sourceID ;
            co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }
  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
          pipe:hasSiteID ?siteID ;
          pipe:hasSourceID ?sourceID ;
          pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }
  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
              owl:sameAs ?related .
  }
  FILTER((?emissionValue < 130000)
      && (contains(str(?region), "UKM61"))
      && (?arealYield > 30)
      && (?totalCO2CostValue < 5000))
}
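
What the federated query computes can be mimicked with a small Python sketch over in-memory records. This is only an illustration of the join and filter logic, not how the triple store evaluates SPARQL. The record values and the example.org region URIs are invented; the field names mirror the query variables. Sites and CO2 sources are joined through the pipeline records (shared siteID and sourceID), and the four FILTER conditions are then applied.

```python
# Hypothetical records standing in for three of the repositories.
sites = [
    {"siteID": "S1", "region": "http://example.org/nuts/UKM61", "arealYield": 35.0},
    {"siteID": "S2", "region": "http://example.org/nuts/UKM61", "arealYield": 25.0},
    {"siteID": "S3", "region": "http://example.org/nuts/UKL12", "arealYield": 40.0},
]
sources = [
    {"sourceID": "C1", "emissionValue": 120000.0},
    {"sourceID": "C2", "emissionValue": 500000.0},
]
pipelines = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 4200.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 3000.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 1000.0},
    {"siteID": "S3", "sourceID": "C1", "totalCO2CostValue": 1000.0},
]


def candidate_sites(sites, sources, pipelines, nuts_code="UKM61"):
    """Join sites and sources via pipelines, then apply the query's filters."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {c["sourceID"]: c for c in sources}
    results = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is None or source is None:
            continue  # no join partner, as with an unmatched SPARQL pattern
        if (source["emissionValue"] < 130000
                and nuts_code in site["region"]
                and site["arealYield"] > 30
                and pipe["totalCO2CostValue"] < 5000):
            results.append((site["siteID"], source["sourceID"]))
    return results


if __name__ == "__main__":
    print(candidate_sites(sites, sources, pipelines))
```

Each pipeline record plays the role of the shared ?siteID and ?sourceID variables in the query: it is the only place where a site and a source meet.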

Related Efforts, Conclusions and Future Work

Related efforts
  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based decision support tool, BIOPOLE, for biomass plants feeding district heating systems
  • BioKDF: Bioenergy knowledge discovery framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions
  • Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
  • As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications

Summary

The LEAPS framework exploits Semantic Web (SW) and Linked Data (LD) technologies for the algal biomass community by:
  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work
  • Several other datasets need to be integrated once they become available
      • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks

Page 16: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

Ontologies for Algal Biomass Reuse

Ontologies for Algal Biomass: Domain knowledge
  • Ontologies for modelling spatial knowledge, units and measurements were reused
  • Discovering vocabularies conceptualising the domain knowledge for algal biomass was non-trivial
  • Concepts and relationships for algal biomass had to be defined from the ground up, in accordance with the principles of ontology development
  • The design was very strongly guided by feedback from questionnaires made available to the stakeholders, interviews with domain experts, providers of raw datasets, and grey literature from the algal biomass and biofuels domain

Ontologies for Algal Biomass: Domain knowledge

Ontologies available at http://purl.org/biomass/ontologies

Page 19: Realising the Potential of Algal Biomass Production   through Semantic Web and Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Designing URIs for Algal Biomass Data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets toLinked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked dataRaw data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

First stepThe first part of the data processing and the potentialcalculation are performed in a GIS-based model which wasdeveloped for this purpose using ArcGISRaw datasets with various origins and formats -transformed using bespoke computational algorithms to anArchGIS specific XML format

brings uniformity in the format of representation of thedatasets and in the process of transformationimportant computations that are part of the final datasetsare performed

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Second stepThe original data sources had several limitations and aone-to-one transformation was not possible

The XML data sources related the biomass production sitesand the CO2 sources via the pipeline datasetIn order to query for all sources that supplied CO2 to aspecific site the query would have to be made via thepipeline datasetThe site source and NUTS identifiers in the datasets werestring literals rather than URIs

A bespoke parser that exploits XPath to selectively querythe XML datasets and generate linked data wasimplementedIt utilises a complex underlying data structure to facilitatethe transformation

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF

Ontologies

Linking engine producing the linkeddata representation of the datasets

Triple store OWLIM SE 50

REST Web services

SPARQL endpoints

Web Interface

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Querying Linked Algal Biomass Data

Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query

Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region

Typical QueryWHERE

SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite

siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit

SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source

co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue

continued

Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline

pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit

SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion

owlsameAs relatedFILTER((emissionValue lt 130000)

ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related EffortsConclusions and

Future Work

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related efforts
  • AquaFuels: a taxonomy of algal strains, available as PDF
  • BioEnerGIS: a GIS-based decision support tool (BIOPOLE) for biomass plants feeding district heating systems
  • BioKDF: Bioenergy Knowledge Discovery Framework from the US Department of Energy
  • Reegle: various energy-related datasets as linked open data, and a SPARQL endpoint to access the datasets

http://www.aquafuels.eu
http://www.bioenergis.eu
https://bioenergykdf.net
http://data.reegle.info

Conclusions
  • Investigations into using algal biomass as an alternative source of fuel are gaining widespread momentum
  • The algal biomass community currently does not employ any knowledge representation techniques to formalise and structure valuable knowledge harnessed through their operations
  • As research in the sector progresses, a wealth of information will be available that could be exploited by domain-specific applications

Summary

The LEAPS framework exploits Semantic Web and Linked Data for the algal biomass community by
  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation, using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work
  • Several other datasets need to be integrated once they become available
      • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks

Lifting XML datasets to Linked data

Raw data

First step
  • The first part of the data processing and the potential calculation are performed in a GIS-based model, which was developed for this purpose using ArcGIS
  • Raw datasets with various origins and formats are transformed, using bespoke computational algorithms, to an ArcGIS-specific XML format
      • This brings uniformity in the format of representation of the datasets and in the process of transformation
      • Important computations that are part of the final datasets are performed

Second step
  • The original data sources had several limitations, and a one-to-one transformation was not possible
      • The XML data sources related the biomass production sites and the CO2 sources via the pipeline dataset
      • In order to query for all sources that supplied CO2 to a specific site, the query would have to be made via the pipeline dataset
      • The site, source and NUTS identifiers in the datasets were string literals rather than URIs
  • A bespoke parser that exploits XPath to selectively query the XML datasets and generate linked data was implemented
  • It utilises a complex underlying data structure to facilitate the transformation
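The second step can be sketched in a few lines. This is only an illustration under stated assumptions, not the bespoke LEAPS parser: the XML layout, the base URI `http://example.org/leaps/` and the property `hasCO2Source` are hypothetical, since the slides do not show the ArcGIS export or the ontology namespaces. The sketch shows the two ideas named on the slide: selecting values with XPath, and turning string identifiers into URIs so that a site links directly to its CO2 sources.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real ArcGIS-specific export is not shown in the slides.
XML = """<pipelines>
  <pipeline id="P1"><siteID>S1</siteID><sourceID>C1</sourceID><nuts>UKM61</nuts></pipeline>
  <pipeline id="P2"><siteID>S1</siteID><sourceID>C2</sourceID><nuts>UKM61</nuts></pipeline>
</pipelines>"""

# Hypothetical base URI used to mint resource and property URIs.
BASE = "http://example.org/leaps/"


def lift(xml_text, base=BASE):
    """Mint URIs from string identifiers and emit N-Triples-style lines."""
    root = ET.fromstring(xml_text)
    triples = set()
    # ElementTree supports a limited XPath subset, which is enough for this selection.
    for pipe in root.findall("./pipeline"):
        site = f"<{base}site/{pipe.findtext('siteID')}>"
        source = f"<{base}source/{pipe.findtext('sourceID')}>"
        region = f"<{base}nuts/{pipe.findtext('nuts')}>"
        # Direct site-to-source link, so queries need not go via the pipeline data.
        triples.add(f"{site} <{base}vocab#hasCO2Source> {source} .")
        triples.add(f"{site} <{base}vocab#inNUTSRegion> {region} .")
    return sorted(triples)


for line in lift(XML):
    print(line)
```

Collecting the triples in a set removes the duplicate region statement that the two pipeline records would otherwise produce for the same site.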

  • Four datasets were transformed and stored in distributed triple store repositories
  • The NUTS regions dataset was available in RDF, but there was no SPARQL endpoint or service to query the dataset
  • We retrieved the dataset dump and curated it in our local triple store as a separate repository
  • The transformed datasets interlinked resources defining sites, CO2 sources, pipelines, regions and NUTS data

System Architecture

Architecture: Main components
  • Parsing modules: lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine: producing the linked data representation of the datasets
  • Triple store: OWLIM-SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web Interface
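A client could reach one of the SPARQL endpoints over plain HTTP. The sketch below assumes the endpoint follows the SPARQL 1.1 Protocol, in which a GET request carries the query text in a `query` parameter. The helper name `sparql_get_url` and the placeholder query are hypothetical; the endpoint address is the one that appears in the SERVICE clauses of the query slides, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL with the query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Placeholder query; a real client would send the federated query instead.
url = sparql_get_url("http://localhost/repositories/biomass",
                     "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(url)

# Round trip: decoding the URL recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```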

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

Four datasets were transformed and stored in distributedtriple store repositoriesThe NUTS regions dataset in RDF was available but therewas no SPARQL endpoint or service to query the datasetWe retrieved the dataset dump and curated it in our localtriple store as a separate repositoryThe transformed datasets interlinked resources definingsites CO2 sources pipelines regions and NUTS data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Lifting XML datasets to Linked data

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

System Architecture

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Architecture Main componentsParsing modules lifting the datafrom their original formats to RDF

Ontologies

Linking engine producing the linkeddata representation of the datasets

Triple store OWLIM SE 50

REST Web services

SPARQL endpoints

Web Interface

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Querying Linked Algal Biomass Data

Most queries over the datasets are based on retrievingknowledge centered around location informationThe queries are federated across the various repositoriesholding the linked dataRepresentative Query

Which are the algal operation sites with CO2 sources that haveCO2 emissions less than 130000 kgs where total costs ofsupplying CO2 is lower then 5000 GBP per ton of CO2 arealyield is greater than 30 tons per hectare and which are locatedwithin the NUTS region ldquoUKM61rdquo Supplement the data withsupporting information about the region

Typical QueryWHERE

SERVICE lthttplocalhostrepositoriesbiomassgt site a siteOperationSite

siteinNUTSRegion regiongeolocation loc locgeolat latloc geolong longsite sitehasSiteID siteIDsitehasArealYield zz qudtquantityValue yy qudtnumericValue arealYieldy qudtunit unit

SERVICE lthttplocalhostrepositoriesco2sourcegt source a co2CO2Source

co2hasSourceID sourceIDco2hasCO2Emission emissionemission qudtquantityValue emissionQtyemissionQty qudtnumericValue emissionValue

continued

Typical QuerySERVICE lthttplocalhostrepositoriespipelinegt pipe a pipePipeline

pipehasSiteID siteIDpipehasSourceID sourceIDpipehasTotalCO2Cost costcost qudtquantityValue qtyqty qudtnumericValue totalCO2CostValueqty qudtunit totalCO2CostUnit

SERVICE lthttplocalhostrepositoriesregiongt regionID a ramonNUTSRegion

owlsameAs relatedFILTER((emissionValue lt 130000)

ampamp (contains(str(region) UKM61))ampamp (arealYield gt 30)ampamp (totalCO2CostValue lt 5000))

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related EffortsConclusions and

Future Work

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets

httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet

httpdatareegleinfo

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications

Summary

The LEAPS framework exploits the Semantic Web (SW) and Linked Data (LD) for the algal biomass community by

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 sources and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform it further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders

Future Work

  • Several other datasets need to be integrated once they become available
  • One of the core datasets: algal strains from AlgaeBase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org

Many Thanks


Lifting XML datasets to Linked data

System Architecture

Architecture: Main components

  • Parsing modules, lifting the data from their original formats to RDF
  • Ontologies
  • Linking engine, producing the linked data representation of the datasets
  • Triple store: OWLIM-SE 5.0
  • REST Web services
  • SPARQL endpoints
  • Web interface
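The lifting step performed by the parsing modules can be pictured with a small example. This is a minimal Python sketch, not the LEAPS implementation: the XML layout, the `example.org` namespaces and the blank-node naming are all assumptions made for illustration, and only the property names (`hasSiteID`, `inNUTSRegion`, `hasArealYield`, `quantityValue`, `numericValue`, `unit`) are taken from the query shown in these slides. The QUDT namespace URI is likewise assumed.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces; the real LEAPS ontologies may use different URIs.
SITE = "http://example.org/leaps/site#"
REGION = "http://example.org/leaps/region/"
QUDT = "http://qudt.org/schema/qudt#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical uniform XML record for one operation site.
SAMPLE_XML = """
<sites>
  <site id="S001" region="UKM61">
    <arealYield unit="t/ha">42.5</arealYield>
  </site>
</sites>
"""


def lit(value, datatype=None):
    """Format an N-Triples literal (no escaping; the sample values are simple)."""
    text = f'"{value}"'
    return f"{text}^^<{XSD}{datatype}>" if datatype else text


def lift(xml_text):
    """Turn each <site> element into N-Triples lines."""
    lines = []
    for el in ET.fromstring(xml_text).iter("site"):
        sid = el.get("id")
        site = f"<{SITE}{sid}>"
        yield_node, value_node = f"_:yield{sid}", f"_:value{sid}"
        areal = el.find("arealYield")
        triples = [
            (site, f"<{RDF_TYPE}>", f"<{SITE}OperationSite>"),
            (site, f"<{SITE}hasSiteID>", lit(sid)),
            (site, f"<{SITE}inNUTSRegion>", f"<{REGION}{el.get('region')}>"),
            (site, f"<{SITE}hasArealYield>", yield_node),
            (yield_node, f"<{QUDT}quantityValue>", value_node),
            (value_node, f"<{QUDT}numericValue>", lit(float(areal.text), "double")),
            (value_node, f"<{QUDT}unit>", lit(areal.get("unit"))),
        ]
        lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return lines


ntriples = lift(SAMPLE_XML)
print("\n".join(ntriples))
```

The two blank nodes reproduce the quantity-then-value chain that the query walks with `qudt:quantityValue` and `qudt:numericValue`, so a numeric value always travels with its unit.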

Querying Linked Algal Biomass Data

  • Most queries over the datasets retrieve knowledge centred on location information
  • The queries are federated across the various repositories holding the linked data

Representative Query

Which are the algal operation sites with CO2 sources that have CO2 emissions of less than 130000 kg, where the total cost of supplying CO2 is lower than 5000 GBP per ton of CO2, the areal yield is greater than 30 tons per hectare, and which are located within the NUTS region "UKM61"? Supplement the data with supporting information about the region.

Typical Query

WHERE {
  SERVICE <http://localhost/repositories/biomass> {
    ?site a site:OperationSite ;
      site:inNUTSRegion ?region ;
      geo:location ?loc .
    ?loc geo:lat ?lat .
    ?loc geo:long ?long .
    ?site site:hasSiteID ?siteID ;
      site:hasArealYield ?z .
    ?z qudt:quantityValue ?y .
    ?y qudt:numericValue ?arealYield .
    ?y qudt:unit ?unit .
  }

  SERVICE <http://localhost/repositories/co2source> {
    ?source a co2:CO2Source ;
      co2:hasSourceID ?sourceID ;
      co2:hasCO2Emission ?emission .
    ?emission qudt:quantityValue ?emissionQty .
    ?emissionQty qudt:numericValue ?emissionValue .
  }

  SERVICE <http://localhost/repositories/pipeline> {
    ?pipe a pipe:Pipeline ;
      pipe:hasSiteID ?siteID ;
      pipe:hasSourceID ?sourceID ;
      pipe:hasTotalCO2Cost ?cost .
    ?cost qudt:quantityValue ?qty .
    ?qty qudt:numericValue ?totalCO2CostValue .
    ?qty qudt:unit ?totalCO2CostUnit .
  }

  SERVICE <http://localhost/repositories/region> {
    ?regionID a ramon:NUTSRegion ;
      owl:sameAs ?related .
  }

  FILTER((?emissionValue < 130000)
    && (contains(str(?region), "UKM61"))
    && (?arealYield > 30)
    && (?totalCO2CostValue < 5000))
}
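What ties the SERVICE blocks together is the pair of shared variables, siteID and sourceID: a pipeline row only contributes to the result when both a site and a CO2 source with matching identifiers exist. A minimal Python sketch of that join follows, assuming three in-memory lists stand in for the biomass, co2source and pipeline repositories. Every record below is invented for illustration.

```python
# Hypothetical records standing in for three repositories; all values are invented.
sites = [
    {"siteID": "S1", "region": "UKM61", "arealYield": 42.0},
    {"siteID": "S2", "region": "UKM61", "arealYield": 25.0},
]
sources = [
    {"sourceID": "C1", "emissionValue": 90000},
    {"sourceID": "C2", "emissionValue": 400000},
]
pipelines = [
    {"siteID": "S1", "sourceID": "C1", "totalCO2CostValue": 1200.0},
    {"siteID": "S1", "sourceID": "C2", "totalCO2CostValue": 800.0},
    {"siteID": "S2", "sourceID": "C1", "totalCO2CostValue": 3000.0},
    {"siteID": "S9", "sourceID": "C1", "totalCO2CostValue": 10.0},  # no such site
]


def join(sites, sources, pipelines):
    """Combine rows that agree on siteID and sourceID, as the shared variables do."""
    site_by_id = {s["siteID"]: s for s in sites}
    source_by_id = {c["sourceID"]: c for c in sources}
    joined = []
    for pipe in pipelines:
        site = site_by_id.get(pipe["siteID"])
        source = source_by_id.get(pipe["sourceID"])
        if site is not None and source is not None:
            # Merge the three partial rows into one solution row.
            joined.append({**site, **source, **pipe})
    return joined


rows = join(sites, sources, pipelines)
for row in rows:
    print(row["siteID"], row["sourceID"], row["totalCO2CostValue"])
```

The pipeline dataset acts as the link table here, which is why the pipeline ontology carries both identifiers. The thresholds in the FILTER are then applied to the joined rows.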








monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related EffortsConclusions and

Future Work

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

Related effortsAquaFuels a taxonomy of algal strains available as PDFBioEnerGIS a GIS based Decision support toolBIOPOLE for biomass plants feeding district heatingsystemsBioKDF Bioenergy knowledge discovery framework fromthe US department of EnergyReegle various energy related datasets as linked opendata and a SPARQL endpoint to access the datasets

httpwwwaquafuelseuhttpwwwbioenergiseuhttpsbioenergykdfnet

httpdatareegleinfo

monikasolankibcuacuk I-Semantics 2012 5th September 2012 Graz

ConclusionsInvestigations into using algal biomass as an alternativesource of fuel is gaining widespread momentumThe Algal biomass community currently does not employany knowledge representation techniques to formalise andstructure valuable knowledge harnessed through theiroperationsAs research in the sector progresses a wealth ofinformation will be available that could be exploited bydomain specific applications


Summary

The LEAPS framework exploits Semantic Web (SW) and Linked Data (LD) technologies for the algal biomass community by:

  • enabling the screening of data for promising individual plant sites and providing base data for more detailed planning purposes
  • proposing a set of domain-specific ontologies for algal plant sites, CO2 and pipelines, to be shared and extended by the community
  • defining a linked data publishing architecture that transforms raw data in disparate formats to a uniform XML representation
  • using a set of well-established and domain-specific ontologies as metadata to transform that representation further into linked data
  • providing various data access options, such as a SPARQL endpoint, an interactive Google map interface and a REST API, for making the data accessible to stakeholders
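
As a rough illustration of the "uniform XML to linked data" step summarised above, the sketch below lifts one XML record into N-Triples using only the Python standard library. It is not the LEAPS implementation: the XML element names, the `http://example.org/leaps/` base URI and the `AlgalPlantSite` class are hypothetical placeholders. Only the RDF, RDFS and W3C WGS84 geo URIs are existing vocabularies.

```python
import xml.etree.ElementTree as ET

# Hypothetical base URI and class name; the actual LEAPS ontologies are not shown here.
BASE = "http://example.org/leaps/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Assumed shape of the uniform XML representation (illustrative only).
SAMPLE_XML = """
<sites>
  <site id="s1">
    <name>Example Site</name>
    <lat>52.48</lat>
    <long>-1.89</long>
  </site>
</sites>
"""


def lift(xml_text):
    """Return a list of N-Triples lines, four per <site> element."""
    triples = []
    for site in ET.fromstring(xml_text).iter("site"):
        subject = f"<{BASE}site/{site.get('id')}>"
        # Type the resource with the (hypothetical) ontology class.
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}ontology#AlgalPlantSite> .")
        # Literal escaping is omitted for brevity; real data would need it.
        triples.append(f'{subject} <{RDFS_LABEL}> "{site.findtext("name")}" .')
        for tag in ("lat", "long"):
            triples.append(f'{subject} <{GEO}{tag}> "{site.findtext(tag)}" .')
    return triples


if __name__ == "__main__":
    for line in lift(SAMPLE_XML):
        print(line)
```

The resulting triples could then be loaded into a triple store and exposed through a SPARQL endpoint, one of the access options listed in the summary.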


Future Work
  • Several other datasets need to be integrated once they become available
  • One of the core datasets: algal strains from Algaebase
  • Multifaceted visualisation of the integrated datasets to facilitate the uptake of the framework by stakeholders
  • Rule-based reasoning to model and infer domain-specific constraints

http://www.algaebase.org


Many Thanks
