RDF and triplestores CMSC 461 Michael Wilson

Upload: kevin-dawson

Post on 03-Jan-2016


RDF and triplestores
CMSC 461
Michael Wilson

Reasoning
- Relational databases allow us to reason about data that is organized in a specific way
  - Data that models specific relationships
  - Data that is very cleanly structured
- What other reasoning methods are available to us?

Metadata
- “Data about data”
  - Data that describes other data
  - Gives context
- Example metadata:
  - Image EXIF data (geolocation, rotation, etc.)
  - User statistics
  - Last saved information in a file

What’s so important?
- The context that we gather from metadata often allows us to understand a much greater picture
  - Can correlate and tie metadata together
  - Calculate statistics on metadata
  - Understand trends
  - Infinite possibilities

The depth of metadata
- Many systems have their own way of storing metadata
- Database tables may be organized to house specific metadata
- This does not lend itself well to discovering new types of metadata
  - Person may have age, DOB
  - Later want to add new types (friends, Facebook ID, Twitter ID, etc.)
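As a rough illustration of the point above, here is a minimal sketch using Python's built-in sqlite3 module. The `person` table, its columns, and the sample row are invented for this example. It shows that once a table is organized around known metadata (age, DOB), a new kind of metadata (a Twitter ID) forces a change to the table definition itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# A table organized around the metadata we knew about up front.
# (Table name, columns, and data are hypothetical.)
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, age INTEGER, dob TEXT)")
conn.execute("INSERT INTO person VALUES ('alice', 30, '1985-06-01')")


def column_names(connection, table):
    """Return the column names of a table via SQLite's table_info pragma."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "person")

# A new type of metadata shows up later: the schema itself has to change.
conn.execute("ALTER TABLE person ADD COLUMN twitter_id TEXT")
conn.execute("UPDATE person SET twitter_id = '@alice' WHERE name = 'alice'")

after = column_names(conn, "person")
row = conn.execute("SELECT name, age, dob, twitter_id FROM person").fetchone()
```

Every further attribute (friends, Facebook ID, and so on) would need another schema change or another table, which is the reorganization cost the slide is pointing at.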

Metadata structures
- RDF
  - Resource Description Framework
- OWL
  - Web Ontology Language
  - Ontology: established vocabulary to describe knowledge within a domain
- RDF is more widely used

Schemas
- RDF and other structured metadata formats allow us to establish a common language to describe different sorts of metadata
- We can make schemas that describe
  - Social media
  - Physical location
  - Job details
- Moreover, we can tie them all to one subject
  - Doesn’t require database reorganization

Why is that cool?
- What this means is that we can tie any arbitrary sets of data together with very little work on our part
- We make a schema that describes a new domain, and staple that information onto an existing subject
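A minimal sketch of "stapling" a new domain onto an existing subject, in plain Python. The `geo:`, `job:`, and `social:` prefixes stand in for separate vocabularies and are made up for this example, as are the subject and values. The existing facts are never touched when the new domain is added.

```python
# Each fact is a (subject, predicate, object) tuple. The prefixes below are
# hypothetical vocabularies, one per domain.
facts = {
    ("person:alice", "geo:livesIn", "Baltimore"),
    ("person:alice", "job:title", "Database Administrator"),
}

# Later we describe a new domain (social media) and attach it to the same
# subject. Nothing about the existing facts has to be reorganized.
facts |= {
    ("person:alice", "social:twitterId", "@alice"),
    ("person:alice", "social:friendOf", "person:bob"),
}


def describe(subject, triples):
    """Collect everything known about a subject, grouped by predicate."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


alice = describe("person:alice", facts)
# Which vocabularies now say something about this one subject?
domains = sorted({predicate.split(":")[0] for predicate in alice})
```

Here `domains` ends up listing all three vocabularies for the one subject, with no step comparable to altering a table.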

Triples
- Within these schemas, data is conceptually organized as <subject> <predicate> <object>
  - Subject
    - The subject of the expression
  - Predicate
    - The relationship between the subject and object
  - Object
    - The direct object of the expression
- These expressions are called “triples”

Triple examples
- Examples?
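The slide leaves the examples open. As one possible answer, here is a small sketch that models a triple as a Python `NamedTuple` and prints a few of them in the <subject> <predicate> <object> form. All names and values are invented for illustration and are not taken from the lecture.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One <subject> <predicate> <object> statement."""
    subject: str
    predicate: str
    object: str


# Hypothetical example statements.
examples = [
    Triple("Baltimore", "isLocatedIn", "Maryland"),
    Triple("Alice", "hasAge", "30"),
    Triple("Alice", "isFriendOf", "Bob"),
    Triple("photo42.jpg", "wasTakenBy", "Alice"),
]


def as_sentence(triple):
    """Render a triple in the angle-bracket notation used on the slides."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.object}>"


sentences = [as_sentence(t) for t in examples]
for sentence in sentences:
    print(sentence)
```

Note that the object of one triple ("Alice" in the last line) can be the subject of others, which is what lets separate statements link up into a graph.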

Storing triples
- Since we are often interested in large amounts of data, we need to think about how to store these
- Triplestores
  - Pretty obvious
  - What do these give us over doing something like storing the information in a database?
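To make the idea of a triplestore concrete, here is a toy in-memory version in plain Python. It is only a sketch: `TinyTripleStore` and its methods are names invented here, and real triplestores use far more elaborate indexing and persistence. The one thing it shows is the basic operation, which is matching a pattern where any position may be left open.

```python
from collections import defaultdict


class TinyTripleStore:
    """A toy in-memory triplestore; None in a pattern acts as a wildcard."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)  # subject -> triples, for faster lookup

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return all stored triples that fit the pattern, in sorted order."""
        # Use the subject index when a subject is given, otherwise scan everything.
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return sorted(
            (s, p, o)
            for s, p, o in candidates
            if (predicate is None or p == predicate) and (obj is None or o == obj)
        )


store = TinyTripleStore()
store.add("alice", "age", "30")
store.add("alice", "friendOf", "bob")
store.add("bob", "friendOf", "carol")

friendships = store.match(predicate="friendOf")
about_alice = store.match(subject="alice")
```

Because every fact has the same three-part shape, a new predicate needs no change to the store, only another call to `add`.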

Triplestore querying
- Triplestores can also be queried
  - SQL is more limited for the kinds of queries we’d like to be able to make
- SPARQL
  - The acronym stands for: SPARQL Protocol and RDF Query Language

SPARQL
- SPARQL is a SQL-like query language
  - Allows us to query on the various schemas we have assigned to our subjects
- SPARQL queries can look surprisingly readable

SPARQL example

PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}
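To show what this query asks for, the sketch below evaluates the same graph pattern by hand over a small list of triples in plain Python. The sample data and the `city:` and `country:` identifiers are invented for this illustration. A real SPARQL engine plans and executes the joins itself; this is only meant to make the role of the variables ?x and ?y visible.

```python
# Hypothetical sample data, written as (subject, predicate, object) tuples.
triples = [
    ("city:Nairobi", "abc:cityname", "Nairobi"),
    ("city:Nairobi", "abc:isCapitalOf", "country:Kenya"),
    ("country:Kenya", "abc:countryname", "Kenya"),
    ("country:Kenya", "abc:isInContinent", "abc:Africa"),
    ("city:Paris", "abc:cityname", "Paris"),
    ("city:Paris", "abc:isCapitalOf", "country:France"),
    ("country:France", "abc:countryname", "France"),
    ("country:France", "abc:isInContinent", "abc:Europe"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]


results = []
for x, predicate, capital in triples:
    if predicate != "abc:cityname":              # ?x abc:cityname ?capital
        continue
    for y in objects(x, "abc:isCapitalOf"):      # ?x abc:isCapitalOf ?y
        if "abc:Africa" not in objects(y, "abc:isInContinent"):
            continue                             # ?y abc:isInContinent abc:Africa
        for country in objects(y, "abc:countryname"):  # ?y abc:countryname ?country
            results.append((capital, country))
```

With this sample data only the Nairobi/Kenya pair satisfies every part of the pattern; the Paris/France pair is dropped by the continent condition.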

Querying power
- Using SPARQL, you can make extremely deep, powerful queries and reason very intuitively on the data present in a triplestore
- Organizing data this way allows computers to actually be able to reason on data as well

Caveats
- All this tech is SUPER new
  - All tied very heavily into the Semantic Web
    - Basically introduce a system like this into the web at large
    - Metadata stored about web pages, computers can reason about them
- Much of this is a moving target
  - Not a whole lot of production applications using this stuff yet

Tools
- There are a few triplestore servers and other tools you can use
- Jena
  - Apache project
  - Framework that allows for Semantic Web concepts to be employed
  - Can query using SPARQL
  - Jena can use Postgres in the background

More tools
- RDFLib
  - https://github.com/RDFLib
  - Python library for RDF
  - Can run entirely in memory
  - Good for experimentation purposes and more