Linking the world with Python and Semantics @tati_alchueyr (Globo.com) 25th July 2012, FISL 13

Upload: tatiana-al-chueyr

Post on 15-Jan-2015

DESCRIPTION

Introduction on how to use open data and Python, with examples of RDFLib, SuRF and RDF-Alchemy. http://softwarelivre.org/fisl13

TRANSCRIPT

Page 1: Linking the world with Python and Semantics

Linking the world with Python and Semantics
@tati_alchueyr (Globo.com)
25th July 2012, FISL 13

Page 2: Linking the world with Python and Semantics

how do you store your data?

Page 3: Linking the world with Python and Semantics

how do you store your data?

[ ] data... what data?!
[ ] raw files (csv, json, xml)
[ ] database (e.g. relational database)
[ ] graphs (e.g. Resource Description Framework)
[ ] other...

Page 4: Linking the world with Python and Semantics

how do you search for...?

Apartments near English-Portuguese bilingual childcare in Rio de Janeiro state.

ERP service providers with offices in São Paulo and New York.

Researchers working on artificial intelligence in the Southeast of Brazil.

GNU GPL software for image processing, developed from 2009 to 2010, with Brazilian developers among its authors.

Page 8: Linking the world with Python and Semantics

what do the searches above have in common?

Page 9: Linking the world with Python and Semantics

linked open data in 2007

Page 10: Linking the world with Python and Semantics

linked open data in 2008

Page 11: Linking the world with Python and Semantics

linked open data in 2009

Page 12: Linking the world with Python and Semantics

linked open data in 2011

Page 13: Linking the world with Python and Semantics

traditional RDBMS

Page 14: Linking the world with Python and Semantics

linked data graph

Page 15: Linking the world with Python and Semantics

linked data modelling

Page 16: Linking the world with Python and Semantics

modelling

Page 17: Linking the world with Python and Semantics

modelling

Page 18: Linking the world with Python and Semantics

querying RDB

select bookID, authorName
from books, authors
where books.aid = authors.aid
  and books.isbn = '006251587X';
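A minimal sketch of the relational query, using Python's built-in sqlite3 and an in-memory database. The table layout is inferred from the query itself and the rows are illustrative, so treat both as assumptions rather than the schema used in the talk.

```python
import sqlite3

# In-memory database; the schema is inferred from the slide's query (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (aid INTEGER PRIMARY KEY, authorName TEXT);
    CREATE TABLE books   (bookID INTEGER PRIMARY KEY, isbn TEXT, aid INTEGER);
""")
# Illustrative rows only.
conn.execute("INSERT INTO authors VALUES (1, 'Tim Berners-Lee')")
conn.execute("INSERT INTO books VALUES (10, '006251587X', 1)")


def authors_of(isbn):
    """Return (bookID, authorName) rows for the given ISBN."""
    sql = """
        SELECT books.bookID, authors.authorName
        FROM books, authors
        WHERE books.aid = authors.aid AND books.isbn = ?
    """
    # The ISBN is passed as a bound parameter, not pasted into the SQL text.
    return conn.execute(sql, (isbn,)).fetchall()


rows = authors_of("006251587X")
print(rows)
```

The join condition (books.aid = authors.aid) is what the graph version expresses as a link between two resources.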

Page 19: Linking the world with Python and Semantics

querying RDF

select ?authName ?authEmail
where {
  <amazon:book#006251587X> <amazon:hasAuthor> <foaf:name#TimBerners-Lee> .
  <foaf:name#TimBerners-Lee> <foaf:name> ?authName .
  <foaf:name#TimBerners-Lee> <foaf:email> ?authEmail
}
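To make the idea of triple patterns concrete, here is a tiny stdlib-only matcher. It is not a SPARQL engine, only a sketch of how patterns with ?variables are matched against (subject, predicate, object) triples; the identifiers and the e-mail address are placeholders.

```python
# Triples as (subject, predicate, object) tuples; all values are placeholders.
TRIPLES = [
    ("book:006251587X", "hasAuthor", "person:TimBL"),
    ("person:TimBL", "name", "Tim Berners-Lee"),
    ("person:TimBL", "email", "timbl@example.org"),
]


def is_var(term):
    return term.startswith("?")


def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None on conflict."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable keeps its earlier value if it already has one.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def solve(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            for extended in [match(pattern, triple, bindings)]
            if extended is not None
        ]
    return solutions


results = solve(
    [
        ("book:006251587X", "hasAuthor", "?author"),
        ("?author", "name", "?authName"),
        ("?author", "email", "?authEmail"),
    ],
    TRIPLES,
)
print(results)
```

Each pattern narrows the set of candidate bindings, which is the same role the three lines inside the where { } block play.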

Page 20: Linking the world with Python and Semantics

globo.com developers before using web semantics

Page 21: Linking the world with Python and Semantics

globo.com developers while learning web semantics

(?w ?t ?f)

Page 22: Linking the world with Python and Semantics

globo.com developers after using web semantics

Page 23: Linking the world with Python and Semantics

sample of hard-to-test code

Page 24: Linking the world with Python and Semantics

approach #1: query isolation
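One way to read this approach, as a hedged sketch: keep the SPARQL text in one place and pass the function that talks to the endpoint in as a parameter, so a test can swap in a fake. The names BANDS_BY_GENRE, bands_by_genre and fake_runner are hypothetical, as is the example.org URI.

```python
# Query text lives in one place, separate from the code that uses it.
BANDS_BY_GENRE = """
SELECT DISTINCT ?who
WHERE {{ ?who <http://dbpedia.org/ontology/genre> <{genre}> . }}
"""


def bands_by_genre(run_query, genre_uri):
    """Build the query and delegate execution to the injected run_query callable."""
    bindings = run_query(BANDS_BY_GENRE.format(genre=genre_uri))
    return [row["who"]["value"] for row in bindings]


# In a test, a fake runner stands in for the real endpoint call.
def fake_runner(query_text):
    assert "http://dbpedia.org/resource/Metal" in query_text
    return [{"who": {"type": "uri", "value": "http://example.org/band/1"}}]


bands = bands_by_genre(fake_runner, "http://dbpedia.org/resource/Metal")
print(bands)
```

In production code run_query would wrap the real endpoint client; plain string formatting is only acceptable here because the genre URI is trusted input.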

Page 25: Linking the world with Python and Semantics
Page 26: Linking the world with Python and Semantics

approach #2: data as object (DAO)
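A rough sketch of the data-access-object idea: callers receive plain Python objects and never see query text. Band, BandDAO and fake_endpoint are hypothetical names, and the returned rows are placeholders, not DBpedia data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    uri: str
    label: str


class BandDAO:
    """Data access object: returns Band objects and hides the SPARQL."""

    def __init__(self, run_query):
        # run_query: callable taking query text, returning a list of binding dicts
        self._run_query = run_query

    def find_by_genre(self, genre_uri):
        query = (
            "SELECT ?band ?label WHERE { "
            f"?band <http://dbpedia.org/ontology/genre> <{genre_uri}> ; "
            "<http://www.w3.org/2000/01/rdf-schema#label> ?label . }"
        )
        rows = self._run_query(query)
        return [Band(r["band"]["value"], r["label"]["value"]) for r in rows]


def fake_endpoint(_query):
    # Stand-in for a real SPARQL endpoint; returns one placeholder row.
    return [{"band": {"value": "http://example.org/band/1"},
             "label": {"value": "Example Band"}}]


dao = BandDAO(fake_endpoint)
found = dao.find_by_genre("http://dbpedia.org/resource/Metal")
print(found)
```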

Page 27: Linking the world with Python and Semantics
Page 28: Linking the world with Python and Semantics

Y U NO make

SPARQL queries?!

Page 29: Linking the world with Python and Semantics

Y U NO make

data access easy?!

Page 30: Linking the world with Python and Semantics

Y U NO make

things testable?!

Page 31: Linking the world with Python and Semantics

product developers evaluating web semantics

Page 32: Linking the world with Python and Semantics
Page 33: Linking the world with Python and Semantics

fact 1: we don't have an out-of-the-box solution

Page 34: Linking the world with Python and Semantics

fact 2: but we do have some options

Page 35: Linking the world with Python and Semantics

some options

#1: create a solution from scratch

#2: study existing solutions and then
[ ] contribute to them
[ ] develop on top of them
[ ] goto #1

Page 36: Linking the world with Python and Semantics

the final decision is not only ours

Page 37: Linking the world with Python and Semantics

but we chose to start from #2

#2: study existing solutions and then (...)

Page 38: Linking the world with Python and Semantics

ok, lmgfy

Page 39: Linking the world with Python and Semantics

a few results from google

ActiveRDF

active-semantic

Django4Store

Django-RDF

Django-RDFAlchemy

Djubby

EasyRDF

Jena

FuXi

Oort

Pymantic

PyRdfa

pysparql

RDFAlchemy

RdfLib

Redland

semantic-django

SPARQLWrapper

Sparrow

Sparta

SuRF

Page 41: Linking the world with Python and Semantics

{ ?project :by_author ?author .
  ?author :works_at :globocom . }

Page 42: Linking the world with Python and Semantics

{?project :use_language :python . }

Page 43: Linking the world with Python and Semantics

{ ?project :use_language :python ;
           :last_commit ?commit .
  FILTER (?commit >= "2011-12-01"^^xsd:date) }

Page 44: Linking the world with Python and Semantics

relation between these tools

Page 45: Linking the world with Python and Semantics

team filtering

Page 46: Linking the world with Python and Semantics

SPARQLWrapper
problem: list all predicates of a class

from SPARQLWrapper import SPARQLWrapper, JSON

# List all predicates of dbonto:Band
# (OPTION (TRANSITIVE, ...) is a Virtuoso extension, not standard SPARQL)
query = """
SELECT DISTINCT ?subject
FROM <http://dbpedia.org>
{
    ?subject rdfs:domain ?object .
    <http://dbpedia.org/ontology/Band> rdfs:subClassOf ?object
        OPTION (TRANSITIVE, t_distinct, t_step('step_no') as ?n, t_min (0) ).
}"""

# the slide also shows http://live.dbpedia.org/sparql
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for result in results["results"]["bindings"]:
    print(result["subject"]["value"])

Page 47: Linking the world with Python and Semantics

SPARQLWrapper

abstract endpoint
returns dict
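For readers who have not seen the returned dict, the sketch below walks a hand-written payload laid out like the standard SPARQL JSON results format (head.vars plus results.bindings). The URIs are placeholders, not real DBpedia output, and column is a hypothetical helper.

```python
import json

# Hand-written payload in the SPARQL JSON results layout; URIs are placeholders.
payload = json.loads("""
{
  "head": {"vars": ["subject"]},
  "results": {
    "bindings": [
      {"subject": {"type": "uri", "value": "http://example.org/prop/a"}},
      {"subject": {"type": "uri", "value": "http://example.org/prop/b"}}
    ]
  }
}
""")


def column(results, var):
    """Collect the plain values bound to one variable, skipping unbound rows."""
    return [row[var]["value"]
            for row in results["results"]["bindings"]
            if var in row]


subjects = column(payload, "subject")
print(subjects)
```

Every binding is itself a small dict with a type and a value, which is why the slide code indexes ["subject"]["value"].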

Page 48: Linking the world with Python and Semantics

SPARQLWrapper

Ok, not different from what we have...

Page 49: Linking the world with Python and Semantics

SPARQLWrapper

just a wrapper around a SPARQL server
well tested ;)

Page 50: Linking the world with Python and Semantics

SPARQLWrapper
problem: list all subjects given ?p ?o

from SPARQLWrapper import SPARQLWrapper, JSON

# List all instances (e.g. bands) with genre Metal
query = """
PREFIX db: <http://dbpedia.org/resource/>
PREFIX dbonto: <http://dbpedia.org/ontology/>

SELECT DISTINCT ?who
FROM <http://dbpedia.org>
WHERE {
    ?who dbonto:genre db:Metal .
}"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for result in results["results"]["bindings"]:
    print(result["who"]["value"])

Page 51: Linking the world with Python and Semantics

RdfLib
problem: list all subjects given ?p ?o

import rdflib
import rdfextras.store.SPARQL

# SPARQL endpoint setup
# (newer rdflib releases ship SPARQLStore in rdflib.plugins.stores.sparqlstore)
endpoint = "http://dbpedia.org/sparql"
store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
graph = rdflib.Graph(store)

# Definitions
genre = rdflib.URIRef("http://dbpedia.org/ontology/genre")
metal = rdflib.URIRef("http://dbpedia.org/resource/Metal")

# Query
for label in graph.subjects(genre, metal):
    print(label)

Page 52: Linking the world with Python and Semantics

RdfLib
abstract endpoint
returns dict
namespace

import rdflib
import rdfextras.store.SPARQL

# SPARQL endpoint setup
endpoint = "http://dbpedia.org/sparql"
store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
graph = rdflib.Graph(store)

# Namespaces to clear up definitions
DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
DB = rdflib.Namespace("http://dbpedia.org/resource/")

# Query
for label in graph.subjects(DBONTO.genre, DB.Metal):
    print(label)
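The namespace convenience can be imitated in a few lines of plain Python. This is a rough sketch of the idea only, not rdflib's implementation; UriNamespace is a hypothetical name.

```python
class UriNamespace(str):
    """A base URI whose attributes and keys expand to full URIs."""

    def term(self, name):
        return self + name

    def __getattr__(self, name):
        # Only reached for attributes that str does not already define.
        if name.startswith("__"):
            raise AttributeError(name)
        return self.term(name)

    def __getitem__(self, name):
        # Replaces str indexing so DB["Metal"] expands instead of slicing.
        return self.term(name)


DBONTO = UriNamespace("http://dbpedia.org/ontology/")
DB = UriNamespace("http://dbpedia.org/resource/")

print(DBONTO.genre)
print(DB["Metal"])
```

One caveat of attribute-style access: names that collide with str methods (title, format, count) resolve to the method, so the DB["..."] form is the safe one for those terms.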

Page 53: Linking the world with Python and Semantics

RdfLib
abstract endpoint
returns dict
namespace

related Graph query methods: subjects, predicates, objects, subject_predicates, subject_objects, predicate_objects

Page 54: Linking the world with Python and Semantics

RdfLib
abstract endpoint
returns dict
namespace

import rdflib
import rdfextras.store.SPARQL

# SPARQL endpoint setup
endpoint = "http://dbpedia.org/sparql"
store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
graph = rdflib.Graph(store)

# Namespaces to clear up definitions
DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
DB = rdflib.Namespace("http://dbpedia.org/resource/")

# Using triples
for musician, _, _ in graph.triples((None, DBONTO.genre, DB.Metal)):
    print(musician)

Page 55: Linking the world with Python and Semantics

RdfLib
abstract endpoint
returns dict
namespace
query by triples

Page 56: Linking the world with Python and Semantics

RdfLib
abstract endpoint
returns dict
namespace
query by triples
add / remove

import rdflib

# N-Triples fixture file
graph = rdflib.Graph()
graph.parse("fixture_genre_metal.nt", format="nt")

# Namespaces
DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
DB = rdflib.Namespace("http://dbpedia.org/resource/")

# Add triples
graph.add((DB.AndrewsMedina, DBONTO.genre, DB.Metal))
graph.add((DB.Siminino, DBONTO.genre, DB.Metal))
graph.add((DB.Herman, DBONTO.genre, DB.Metal))

# Remove a triple
graph.remove((DB.AndrewsMedina, DBONTO.genre, DB.Metal))

Page 57: Linking the world with Python and Semantics

RdfLib

concentrates on providing the core RDF types and interfaces; stores, parsers and serializers come in through a plugin interface

Page 58: Linking the world with Python and Semantics

RdfLib

makes testing simple: fixtures can be loaded from n3 files, and triples can be added and removed

Page 59: Linking the world with Python and Semantics

RdfLib

but...

each triple query requires a new connection to the SPARQL endpoint

Page 60: Linking the world with Python and Semantics

RdfLib

therefore

too many accesses to the SPARQL endpoint

Page 61: Linking the world with Python and Semantics

RdfLib

and...

doesn't provide an ORM (object relational mapping)

Page 62: Linking the world with Python and Semantics

SuRF
abstract endpoint
returns dict
namespace
query by triples
add / remove
ORM

from surf import Store, Session, ns, query

store = Store(reader='sparql_protocol',
              endpoint='http://dbpedia.org/sparql')
session = Session(store, {})
session.enable_logging = False

ns.register(db='http://dbpedia.org/resource/')
ns.register(dbonto='http://dbpedia.org/ontology/')

# The slide used ns.DB here; on DBpedia the MusicalArtist class
# lives in the ontology namespace.
MusicalArtist = session.get_class(ns.DBONTO['MusicalArtist'])

artistas_metal = MusicalArtist.get_by(dbonto_genre=ns.DB["Metal"])

print(artistas_metal)

Page 63: Linking the world with Python and Semantics

SuRF
problem: list all subjects given ?p ?o
ORM
composed queries

from surf import Store, Session, ns, query

store = Store(reader='sparql_protocol',
              endpoint='http://dbpedia.org/sparql')
session = Session(store, {})

ns.register(db='http://dbpedia.org/resource/')
ns.register(dbonto='http://dbpedia.org/ontology/')

query_surf = query.select("?who").distinct()
query_surf.where(("?who", ns.DBONTO.genre, ns.DB.Metal))

metal_bands = session.default_store.execute(query_surf)

for band in metal_bands:
    print(band)

Page 64: Linking the world with Python and Semantics

SuRF

various approaches: ORM or programmatically

Page 65: Linking the world with Python and Semantics

SuRF

simple ORM: no need to redeclare TTL definitions

Page 66: Linking the world with Python and Semantics

SuRF

“complex” queries using lazy evaluation
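What lazy evaluation buys can be shown with a small stand-alone sketch: conditions pile up on a query object and the backend is contacted only when the results are iterated. This does not reproduce SuRF's internals; LazyQuery and fake_fetch are hypothetical names.

```python
class LazyQuery:
    """Collects conditions and only calls the backend when iterated."""

    def __init__(self, fetch, conditions=()):
        self._fetch = fetch  # callable: tuple of conditions -> iterable of rows
        self._conditions = tuple(conditions)

    def filter(self, **conditions):
        # Returns a new query object; nothing is executed here.
        extra = tuple(sorted(conditions.items()))
        return LazyQuery(self._fetch, self._conditions + extra)

    def __iter__(self):
        # First point at which the backend is contacted.
        return iter(self._fetch(self._conditions))


calls = []


def fake_fetch(conditions):
    # Stand-in for an endpoint call; records what it was asked for.
    calls.append(conditions)
    return ["band-1", "band-2"]


lazy = LazyQuery(fake_fetch).filter(genre="Metal").filter(country="Brazil")
calls_before_iteration = len(calls)
rows = list(lazy)
print(calls_before_iteration, calls, rows)
```

Two filter calls end up as a single backend request, which is the property that matters when each request is a round trip to a SPARQL endpoint.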

Page 67: Linking the world with Python and Semantics

SuRF

documentation & community

Page 68: Linking the world with Python and Semantics

SuRF

but...

no django-style models

Page 69: Linking the world with Python and Semantics

SuRF

verbose syntax

Page 70: Linking the world with Python and Semantics

RDFAlchemy
problem: list all subjects given ?p ?o

from rdfalchemy.sparql import SPARQLGraph
from rdflib import Namespace

endpoint = "http://dbpedia.org/sparql"
graph = SPARQLGraph(endpoint)

DB = Namespace("http://dbpedia.org/resource/")
DBONTO = Namespace("http://dbpedia.org/ontology/")

metal_bands = graph.subjects(predicate=DBONTO.genre, object=DB.Metal)

for band in metal_bands:
    print(band)

Page 71: Linking the world with Python and Semantics

RDFAlchemy
abstract endpoint
returns dict
namespace
query by triples
add / remove
ORM
django-like

from rdfalchemy.sparql import SPARQLGraph
from rdfalchemy import rdfSubject, rdfSingle
from rdflib import Namespace

DB = Namespace('http://dbpedia.org/resource/')
DBONTO = Namespace("http://dbpedia.org/ontology/")
RDFS = Namespace('http://www.w3.org/2000/01/rdf-schema#')

endpoint = "http://live.dbpedia.org/sparql"
graph = SPARQLGraph(endpoint)
rdfSubject.db = graph

class MusicalArtist(rdfSubject):
    rdfs_label = rdfSingle(RDFS.label, 'label')
    genre = rdfSingle(DBONTO.genre, 'genre')

metal_artists = MusicalArtist.filter_by(genre=DB.Metal)

for band in metal_artists:
    print(band)

Page 72: Linking the world with Python and Semantics

RDFAlchemy

django-like models
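The general mechanism behind declarative, django-like models can be sketched with Python descriptors over a plain list of triples. This illustrates the pattern only and is not RDFAlchemy's code; Single, Subject and the ex: identifiers are hypothetical.

```python
class Single:
    """Descriptor mapping a Python attribute to one predicate URI."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Return the first object found for (subject, predicate), or None.
        for s, p, o in instance.triples:
            if s == instance.uri and p == self.predicate:
                return o
        return None


class Subject:
    """Base class: one instance wraps one subject URI in a triple list."""

    def __init__(self, uri, triples):
        self.uri = uri
        self.triples = triples

    @classmethod
    def filter_by(cls, triples, **conditions):
        subjects = sorted({s for s, _, _ in triples})
        candidates = [cls(s, triples) for s in subjects]
        return [c for c in candidates
                if all(getattr(c, k) == v for k, v in conditions.items())]


GENRE = "http://dbpedia.org/ontology/genre"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class MusicalArtist(Subject):
    label = Single(LABEL)
    genre = Single(GENRE)


# Placeholder data, not taken from DBpedia.
TRIPLES = [
    ("ex:band1", GENRE, "ex:Metal"), ("ex:band1", LABEL, "Band One"),
    ("ex:band2", GENRE, "ex:Jazz"), ("ex:band2", LABEL, "Band Two"),
]

metal = MusicalArtist.filter_by(TRIPLES, genre="ex:Metal")
print([artist.label for artist in metal])
```

The class body is the only place where predicates are named, which is what makes the model read like a Django model declaration.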

Page 73: Linking the world with Python and Semantics

RDFAlchemy

simple syntax

Page 74: Linking the world with Python and Semantics

RDFAlchemy

but...

non-lazy

Page 75: Linking the world with Python and Semantics

RDFAlchemy

we have to redeclare, as Python classes, all the data already described in TTL files

Page 76: Linking the world with Python and Semantics

semantic-django
abstract endpoint
returns dict
namespace
query by triples
add / remove
ORM
django-like

# Classes similar to Django models are created from TTL
# files (using manage.py)

class BaseLugar(BaseEntidade):
    latitude = models.UriField()
    longitude = models.UriField()
    geonameid = models.UriField()
    tem_mapa = models.UriField()
    apelido = models.UriField()
    ImagemMapa = models.UriField()
    genero_gramatical = models.UriField()

    class Meta:
        semantic_graph = 'http://semantica.globo.com/base/Lugar'

Page 77: Linking the world with Python and Semantics

semantic-django

https://github.com/rfloriano/semantic-django

Page 78: Linking the world with Python and Semantics

semantic-django

dream of many product developers

Page 79: Linking the world with Python and Semantics

semantic-django

but...

just started to be developed

Page 80: Linking the world with Python and Semantics

study existing solutions, and now?

[ ] contribute to them
[ ] develop on top of them
[ ] create a solution from scratch
[ ] other, _________________

Page 81: Linking the world with Python and Semantics

grab your post-it, it's review time!

tools: SuRF, RDFAlchemy, RDFlib, semantic-django, (...)

columns: =) / =( / comments

post-its: no models, shows query, models not lazy, nice API, django like, namespace, low layer, just started, my favorite, my choice

Page 83: Linking the world with Python and Semantics

any questions...?

@tati_alchueyr