caepia 2011 linked data methodology
http://www.weso.es http://www.bcn.cl
An architecture and process of implantation for Linked Data
environments
A case study for the Library of Congress of Chile
Francisco Cifuentes, José María Álvarez, Christian Sifaqui, José Emilio Labra
TLDE-CAEPIA 2011
Overview: this talk in 1 minute
Why? Linked Open Data in Public Administrations
How? Proposal of Architecture; Adoption process
Where? Library of Congress - Chile
Linked Open Data in Public Administrations
Government data & actions can be supervised
Improve transparency & confidence
Linked Open Data in Public Administrations
Public value (generates citizen experience)
Research & Collaboration
Reuse data
Linked Open Data in Public Administrations
Public information belongs to citizens
Financed by public resources
Return on investment
Linked Open Data in Public Administrations
Legislation is public information…
…and must be in the public domain
Everyone is affected by laws
OK, Linked Open Data is good! But…
Architecture & Adoption Process
There is huge interest in publishing LOD
Practical guidelines & methodologies?
Our proposal:
Architecture of Linked Open Data
Implementation methodology
Considerations in Public Administrations context
Large volumes of data
Semistructured content
Contents of general interest
High expectations
New projects should not interfere
Small teams in large organizations
Low semantic expertise
Linked Open Data Architecture
[Figure: architecture diagram]
Server side: Operating System; Web Server; Web Application Server; RDF Storage; Cache; DB; SPARQL Endpoint; Output RDF Graph; Ontologies; Documentation Portal; Update RDF Graph Service
Client side: Web Browser; Semantic Application
Adoption Process
Phases, laid out over time:
Contextualization
Ontology design
RDF Graph Modeling
SPARQL Endpoint Implementation
RDF Graph Implementation
Update Graph Service
Documentation Web Portal
Non-functional Requirements
Optional: Data Visualization & demos
OK, you propose an architecture & adoption process, but…
Contextualization
Library of Congress - Chile
Contextualization
Leychile 2008
Juridical certainty
LOD in Leychile: Natural extension
Improve interoperability (more formats)
Create domain ontologies
Complex queries through SPARQL endpoint
Contextualization
Publish Linked Open Data – 5 stars
Norms and relationships in a global RDF graph
Infrastructure for future developments
First stage, pilot project
Contextualization
≈ 300,000 norms and their relationships (modifications, concordances, etc.)
First stage ⇒ only the main metadata of norms: title, important dates, types, relationships
We exclude body text (articles, chapters, etc.)
Contextualization
Definition of domain model: norms, relationships, types of norms, metadata
Functional Requirements for Bibliographic Records (FRBR)
Output formats: RDFa, RDF/XML, JSON, N3, …
Domain Ontologies
Small Ontology about Norms
RDF Graph Modeling
A norm can be modified by another norm
Decree 296, published 1995-02-17
Art. 1. abc.
Art. 2. def.
Art. 3. ghi.
Decree 12066, published 2005-05-15
Art. 1. Modify Decree 296 in the following way: in Art. 1, replace “a” with “xyz”.
Now, Decree 296 should be:
Decree 296
Art. 1. xyzbc.
Art. 2. def.
Art. 3. ghi.
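The modification above can be traced in a few lines of code. This is only an illustrative sketch: the `apply_substitution` helper and the dictionary layout are assumptions made for this example and are not part of the BCN data model.

```python
# Illustrative sketch: each version of a norm is kept as a separate snapshot.
# The helper name and the data layout are hypothetical.

def apply_substitution(articles, article_no, old, new):
    """Return a new version in which `old` is replaced by `new` in one article."""
    updated = dict(articles)  # copy, so the original version is preserved
    updated[article_no] = updated[article_no].replace(old, new, 1)
    return updated

# Decree 296 as published on 1995-02-17
decree_296_original = {1: "abc.", 2: "def.", 3: "ghi."}

# Decree 12066, Art. 1: replace "a" with "xyz" in Art. 1 of Decree 296
decree_296_latest = apply_substitution(decree_296_original, 1, "a", "xyz")

for number, text in decree_296_latest.items():
    print(f"Art. {number}. {text}")
```

Because the original snapshot is never overwritten, both the text as published and the text as amended remain available.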
RDF Graph Modeling
Careful URI Design
Expressiveness
RDF Graph Modeling
Decree 296: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/
Original: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@1995-02-17
Latest version: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@2005-05-10
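The three URIs above suggest a regular pattern. The sketch below rebuilds them from their parts; the segment names (country, norm type, issuer, publication date, number, language, version date) are an inferred reading of these examples, not a documented specification, and the function names are hypothetical.

```python
# Sketch of the URI pattern suggested by the examples on the slide.
# Segment meanings are inferred, not taken from official documentation.

BASE = "http://datos.bcn.cl/recurso"

def norm_uri(country, norm_type, issuer, published, number):
    """URI of the norm itself, independent of any version."""
    return f"{BASE}/{country}/{norm_type}/{issuer}/{published}/{number}/"

def version_uri(norm, language, version_date):
    """URI of one dated version of a norm in a given language."""
    return f"{norm}{language}@{version_date}"

decree_296 = norm_uri("cl", "DTO", "ministerio-del-interior", "1995-02-17", 296)
original = version_uri(decree_296, "es", "1995-02-17")
latest = version_uri(decree_296, "es", "2005-05-10")

print(decree_296)
print(original)
print(latest)
```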
Links to other datasets (countries for international treaties): DBpedia, GeoNames
Reuse of vocabularies / ontologies: SKOS, DC, FOAF, DBpedia, ORG
Triplestore: OpenLink Virtuoso
SPARQL Endpoint
SPARQL Endpoint
Example of query
Find all norms issued by a municipality between 1995 and 2000 that were modified after 2005.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>
SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" && ?pubDate < "2000" && ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
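A query like this is normally sent to the endpoint over HTTP. The sketch below prepares such a request following the SPARQL Protocol convention of a `query` parameter on a GET request. The endpoint address `http://example.org/sparql` is a placeholder assumption, since the slides do not give the real endpoint URL, and `build_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: the slides do not state the real address.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>
SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" && ?pubDate < "2000" && ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
"""

def build_request(endpoint, query):
    """Prepare a GET request that asks for JSON-formatted SELECT results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_request(ENDPOINT, QUERY)
# Against a real endpoint, urllib.request.urlopen(request) would return the
# result bindings; no network call is made in this sketch.
```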
RDF Graph Implementation
http://code.google.com/p/weso-desh/
We developed a Linked Data Frontend (WESO-DESH)
Content negotiation based on HTTP 303 See Other
Definition of URIs based on regular expressions
Easy configuration
Support for CONSTRUCT, ASK & DESCRIBE
Delegates output formats to SPARQL Endpoint
Result caching
GUI for administration backend (in progress)
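Two of the features listed above, regular-expression URI rules and 303-based content negotiation, can be sketched together. This is not WESO-DESH code: the `/page/` and `/data/` document paths, the rule itself, and the `negotiate` function are assumptions for illustration, and quality values (`q=`) in the Accept header are ignored for brevity.

```python
import re

# Hypothetical rule: any path under /recurso/ identifies a resource.
RESOURCE_PATTERN = re.compile(r"^/recurso/(?P<rest>.+)$")

# Hypothetical document locations for each supported media type.
DOCUMENT_PREFIXES = {
    "text/html": "/page/",
    "application/rdf+xml": "/data/",
}
DEFAULT_MEDIA_TYPE = "text/html"

def negotiate(path, accept_header):
    """Return (status, location) for a request on a resource URI."""
    match = RESOURCE_PATTERN.match(path)
    if match is None:
        return 404, None
    # Take media types in the order listed, dropping parameters such as q-values.
    requested = [item.split(";")[0].strip() for item in accept_header.split(",")]
    media_type = next(
        (m for m in requested if m in DOCUMENT_PREFIXES), DEFAULT_MEDIA_TYPE
    )
    # 303 See Other: the resource is not a document, so point to one about it.
    return 303, DOCUMENT_PREFIXES[media_type] + match.group("rest")

print(negotiate("/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/",
                "application/rdf+xml"))
```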
RDF Graph Implementation
WESO-DESH (Linked Data Frontend)
Output HTML+RDFa
XML Configuration
Update Graph Service
Automatic extraction & transformation process to update the RDF Graph
Based on Pentaho Kettle ETL (ETL = Extraction, Transformation, Loading)
Executes Transformations in threads
Configuration in XML
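The idea of running transformations in threads can be illustrated without Kettle. The sketch below is a plain Python stand-in, not the actual update service: the record layout, the sample title, and the function names are assumptions, while the two predicates are the ones used in the example query of this talk.

```python
from concurrent.futures import ThreadPoolExecutor

DC_TITLE = "http://purl.org/dc/elements/1.1/title"
PUBLISH_DATE = "http://datos.bcn.cl/ontologies/bcn-norms#publishDate"

def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def transform(record):
    """Turn one extracted record (a dict, hypothetical layout) into N-Triples lines."""
    subject = f"<{record['uri']}>"
    return [
        f'{subject} <{DC_TITLE}> "{escape_literal(record["title"])}" .',
        f'{subject} <{PUBLISH_DATE}> "{escape_literal(record["published"])}" .',
    ]

def run_etl(records, workers=4):
    """Transform records in a thread pool, keeping the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(transform, records))
    return [triple for batch in batches for triple in batch]

# Sample record; the title text is made up for the example.
records = [{
    "uri": "http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/",
    "title": "Decree 296",
    "published": "1995-02-17",
}]

triples = run_etl(records)
print("\n".join(triples))
```

In a real deployment the resulting triples would then be loaded into the triplestore; that step is left out here.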
Documentation
Documentation Web Portal: TYPO3 CMS
Sections:
URI construction guidelines
Example queries
Output formats
Ontology documentation
etc.
Non-Functional Requirements
Response time: cache system, profiling
Security & privacy: different views and access levels of the RDF Graph
Others:
Internationalization
Accessibility
Use of standards
Optional: Data visualization
http://www.weso.es/lodviz/
Prototype tool: LODViz (Linked Open Data Visualization)
Based on HTML5 (pattern library)
Work in progress
Results
Public Dataset Catalogs Faceted Browser - CTIC Foundation
Five stars Linked Open Data
Conclusions
First stage finished
> 300,000 norms exported
≈ 8 million triples, ≈ 27 triples per norm
200 to 400 triples added each day
3 tools in development
WESO DESH - Linked data frontend
WESO RUD – RDF Updater
LODVIZ – Linked Open Data Visualization
Proposed methodology of Linked Open Data
Future Work
Library of Congress of Chile
More datasets: Biographies, Geographical data
History of Law
Improve documentation
WESO Research group
Semantic search engine
Entity extraction & reconciliation in text
Resource Recommendation
Provenance & graph views
The End
http://www.weso.es
http://www.bcn.cl
More Information
Main Team
Francisco Cifuentes
Member of WESO Research Group and Library of Congress of Chile
http://www.weso.es/~fcifuentes
José María Álvarez
Member of WESO Research Group
http://josemalvarez.es
Christian Sifaqui
Head of Systems and Network information services, Library of Congress of Chile
http://sifaqui.blogspot.com/
José Emilio Labra
Associate Professor at the University of Oviedo and Head of WESO Research Group
http://www.di.uniovi.es/~labra/
Credits
Most of the pictures were obtained from the Internet.
Transparency image: http://2.bp.blogspot.com/--wFwsKwMgAg/TjSDXOLCTzI/AAAAAAAAOzQ/qvBtbShckdI/s1600/11.2.bmp
Euros: Minuto digital. http://www.minutodigital.com/wp-content/uploads/euros-300x196.jpg
Library: http://ffernandez.files.wordpress.com/2010/04/biblioteca.jpg
FRBR: http://cucataloging.blogspot.com/
Contextualization: http://tentblogger.com/right-advertisers/
Documentation: http://susops.blogspot.com/2010/07/power-of-documentation.html