RDF and triplestores
CMSC 461
Michael Wilson
Reasoning
  Relational databases allow us to reason about data that is organized in a specific way
    Data that models specific relationships
    Data that is very cleanly structured
  What other reasoning methods are available to us?
Metadata
  “Data about data”
  Data that describes other data
  Gives context
  Example metadata:
    Image EXIF data (geolocation, rotation, etc.)
    User statistics
    Last-saved information in a file
What’s so important?
  The context that we gather from metadata often allows us to understand a much greater picture
    Can correlate and tie metadata together
    Calculate statistics on metadata
    Understand trends
  Infinite possibilities
The depth of metadata
  Many systems have their own way of storing metadata
  Database tables may be organized to house specific metadata
  This does not lend itself well to discovering new types of metadata
    Person may have age, DOB
    Later want to add new types (friends, Facebook ID, Twitter ID, etc.)
Metadata structures
  RDF
    Resource Description Framework
  OWL
    Web Ontology Language
    Ontology: an established vocabulary to describe knowledge within a domain
  RDF is more widely used
Schemas
  RDF and other structured metadata formats allow us to establish a common language to describe different sorts of metadata
  We can make schemas that describe
    Social media
    Physical location
    Job details
  Moreover, we can tie them all to one subject
    Doesn’t require database reorganization
Why is that cool?
  What this means is that we can tie any arbitrary sets of data together with very little work on our part
  We make a schema that describes a new domain, and staple that information onto an existing subject
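A minimal sketch of the "stapling" idea in plain Python, assuming triples are kept as tuples in a set. The subject identifier and the person: and social: vocabulary names are hypothetical, made up for illustration; they are not taken from a real schema.

```python
# Hypothetical subject identifier and vocabularies, for illustration only.
PERSON = "person:alice"

# Facts recorded under an initial "person" vocabulary.
triples = {
    (PERSON, "person:age", 30),
    (PERSON, "person:dateOfBirth", "1994-05-17"),
}

# Later, a "social" vocabulary is introduced. Nothing that already exists
# is restructured; the new facts are simply added for the same subject.
triples |= {
    (PERSON, "social:twitterId", "@alice"),
    (PERSON, "social:friendOf", "person:bob"),
}


def describe(subject, store):
    """Return every (predicate, object) pair recorded for a subject."""
    pairs = [(p, o) for s, p, o in store if s == subject]
    return sorted(pairs, key=lambda pair: pair[0])


for predicate, obj in describe(PERSON, triples):
    print(predicate, obj)
```

In a relational design, the second batch of facts would typically need new columns or a new table; here it is only more rows of the same shape.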
Triples
  Within these schemas, data is conceptually organized as <subject> <predicate> <object>
    Subject
      The thing the expression is about
    Predicate
      The relationship between the subject and object
    Object
      The direct object of the expression: the value or thing the subject is related to
  These expressions are called “triples”
Triple examples
  Examples?
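The slide leaves the examples open, so here are a few possible ones, written as a small Python sketch. The Triple type and the predicate names (isCapitalOf, isInContinent, countryname) are assumptions chosen for illustration, not part of any standard vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One <subject> <predicate> <object> expression."""
    subject: str
    predicate: str
    obj: str  # named obj to avoid shadowing Python's built-in object


# Illustrative triples; the predicate names are hypothetical.
examples = [
    Triple("Cairo", "isCapitalOf", "Egypt"),
    Triple("Egypt", "isInContinent", "Africa"),
    Triple("Egypt", "countryname", "Egypt"),
]

for t in examples:
    print(f"<{t.subject}> <{t.predicate}> <{t.obj}>")
```

Note that the object of one triple (Egypt) can be the subject of another, which is how separate statements link up into a graph.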
Storing triples
  Since we are often interested in large amounts of data, we need to think about how to store these
Triplestores
  Pretty obvious: databases built specifically to store triples
  What do these give us over doing something like storing the information in a relational database?
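One possible answer, sketched in plain Python: because every fact has the same three-part shape, a single pattern-matching operation can serve any query, with no per-table schema. TinyTripleStore is a hypothetical toy written for this sketch, not the API of a real triplestore; real servers add indexing, persistence, and a query language on top.

```python
class TinyTripleStore:
    """Toy in-memory triplestore (illustrative only)."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        """Record one (subject, predicate, object) fact."""
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        for triple in self._triples:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


store = TinyTripleStore()
store.add("Cairo", "isCapitalOf", "Egypt")
store.add("Nairobi", "isCapitalOf", "Kenya")
store.add("Egypt", "isInContinent", "Africa")

# The same method answers questions about any position in the triple.
capitals = sorted(s for s, _, _ in store.match(p="isCapitalOf"))
about_egypt = list(store.match(s="Egypt"))
print(capitals)
print(about_egypt)
```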
Triplestore querying
  Triplestores can also be queried
  SQL is more limited for the kinds of queries we’d like to be able to make
SPARQL
  The acronym stands for:
    SPARQL Protocol and RDF Query Language
SPARQL
  SPARQL is a SQL-like query language
  Allows us to query on the various schemas we have assigned to our subjects
  SPARQL queries can look surprisingly readable
SPARQL example
  PREFIX abc: <http://example.com/exampleOntology#>
  SELECT ?capital ?country
  WHERE {
    ?x abc:cityname ?capital ;
       abc:isCapitalOf ?y .
    ?y abc:countryname ?country ;
       abc:isInContinent abc:Africa .
  }
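To make the matching behind such a query concrete, here is a rough sketch in plain Python of how its four triple patterns could be evaluated against a handful of triples. This is not how a real SPARQL engine is implemented; the sample data, the city: and country: identifiers, and the helper names match_pattern and solve are all assumptions made for illustration. Strings starting with ? play the role of variables.

```python
# Hypothetical sample data using the predicate names from the query.
triples = [
    ("city:Cairo", "cityname", "Cairo"),
    ("city:Cairo", "isCapitalOf", "country:Egypt"),
    ("country:Egypt", "countryname", "Egypt"),
    ("country:Egypt", "isInContinent", "Africa"),
    ("city:Paris", "cityname", "Paris"),
    ("city:Paris", "isCapitalOf", "country:France"),
    ("country:France", "countryname", "France"),
    ("country:France", "isInContinent", "Europe"),
]

# The WHERE clause, written out as four triple patterns.
patterns = [
    ("?x", "cityname", "?capital"),
    ("?x", "isCapitalOf", "?y"),
    ("?y", "countryname", "?country"),
    ("?y", "isInContinent", "Africa"),
]


def match_pattern(pattern, triple, bindings):
    """Extend bindings so pattern equals triple, or return None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep any value it was already bound to.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, triples):
    """Return every variable binding that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in triples:
                result = match_pattern(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return solutions


# SELECT ?capital ?country
rows = [(b["?capital"], b["?country"]) for b in solve(patterns, triples)]
print(rows)  # [('Cairo', 'Egypt')]
```

The shared variables ?x and ?y are what join the patterns together, much as join conditions do in SQL.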
Querying power
  Using SPARQL, you can make extremely deep, powerful queries and reason very intuitively on the data present in a triplestore
  Organizing data this way allows computers to actually be able to reason on data as well
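As a small illustration of machine reasoning over triples, the sketch below applies one hand-written rule to derive a fact that was never stored explicitly. The rule, the predicate names, and the function name infer_city_continents are assumptions made for this example; real reasoners work from ontologies such as OWL rather than hard-coded rules.

```python
# Explicitly stored facts (illustrative).
facts = {
    ("Cairo", "isCapitalOf", "Egypt"),
    ("Egypt", "isInContinent", "Africa"),
}


def infer_city_continents(facts):
    """Rule: X isCapitalOf Y and Y isInContinent C => X isInContinent C."""
    inferred = set(facts)
    for x, p1, y in facts:
        if p1 != "isCapitalOf":
            continue
        for y2, p2, c in facts:
            if p2 == "isInContinent" and y2 == y:
                inferred.add((x, "isInContinent", c))
    return inferred


all_facts = infer_city_continents(facts)
new_facts = all_facts - facts
print(new_facts)  # {('Cairo', 'isInContinent', 'Africa')}
```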
Caveats
  All this tech is SUPER new
  All tied very heavily into the Semantic Web
    Basically, the idea is to introduce a system like this into the web at large
    Metadata is stored about web pages so that computers can reason about them
  Much of this is a moving target
  Not a whole lot of production applications using this stuff yet
Tools
  There are a few triplestore servers and other tools you can use
  Jena
    Apache project
    Framework that allows for Semantic Web concepts to be employed
    Can query using SPARQL
    Jena can use Postgres in the background
More tools
  RDFLib
    https://github.com/RDFLib
    Python library for RDF
    Can run entirely in memory
    Good for experimentation purposes and more