on-demand rdf graph databases in the cloud

Post on 31-Jul-2015


On-Demand RDF Graph Databases in the Cloud

A webinar with Marin Dimitrov, CTO of Ontotext

Jun 11th, 2015

On-Demand RDF Graph Databases in the Cloud #1Jun 2015

Today’s Topics

• The Self-Service Semantic Suite (S4)
• RDF graph databases
• On-demand RDF databases in the Cloud
• Technical details
• Demo
• Roadmap
• Q&A session

About Ontotext

• Provides products & solutions for content enrichment and metadata management
  – 70 employees, headquarters in Sofia (Bulgaria)
  – Sales presence in London, NYC, Washington & Boston
• Major clients and industries
  – Media & Publishing
  – Health Care & Life Sciences
  – Cultural Heritage & Digital Libraries
  – Government
  – Education

Some of our clients


Ontotext’s vision for smart data management

Graph Database
• Flexible RDF graph data model
• Ontology metadata layer

Semantic Search
• Semantic, exploratory search
• Metadata driven content

Text Mining & Interlinking
• People, locations, organisations, topics
• Discover implicit relations
• Reuse open knowledge graphs

Ontotext and AstraZeneca

Profile
• Global bio-pharma company
• $28 billion in sales in 2012
• $4 billion in R&D across three continents

Goals
• Efficient design of new clinical studies
• Quick access to all of the data
• Improved evidence based decision-making
• Strengthen the knowledge feedback loop
• Enable predictive science

Challenges
• Over 7,000 studies and 23,000 documents are difficult to obtain
• Searches returning 1,000 – 10,000 results
• Document repositories not designed for reuse
• Tedious process to arrive at evidence based decisions

Ontotext and LMI

Profile
• Established in 1961 to enable federal agencies
• Specializes in logistics, financial, infrastructure & information management

Goals
• Unlock large collections of complex documents
• Improve analyst productivity
• Create an application they can sell to US Federal agencies

Challenges
• Analysts taking hours to find, download and search documents, using inaccurate keyword searches
• Needed a knowledge base to search quickly and guide the analysts – highly relevant searches

• Extracts knowledge from collection of documents
• Uses GraphDB to intuitively search and filter
• More than 90% savings in analyst time
• Accurate results

The Self-Service Semantic Suite (S4)


What is S4?

• Capabilities for text analytics, content enrichment and smart data management
  – Text analytics for news, life sciences and social media
  – RDF graph database as-a-service
  – Access to large open knowledge graphs
• Available on-demand, anytime, anywhere
  – Simple RESTful services
• Simple pay-per-use pricing
  – No upfront commitments

What is S4?

Today’s webinar focus

Benefits

• Enables quick prototyping
  – Instantly available, no provisioning & operations required
  – Focus on building applications, don’t worry about infrastructure
• Free tier!
• Easy to start, shorter learning curve
  – Various add-ons, SDKs and demo code
• Based on enterprise semantic technology

Getting started in minutes

1. Register a personal account at s4.ontotext.com
2. Generate an API key pair
3. Check out the docs, demos & code at docs.s4.ontotext.com
4. Contact us with questions!

Text analytics with S4

• Text analytics services
  – News annotation
  – News categorisation
  – Biomedical
  – Twitter
• Entity linking & disambiguation
  – Mappings to DBpedia & GeoNames instances
  – Mappings to biomedical data sources (LinkedLifeData)
• HTML, MS Word, XML, plain text input
• Simple JSON output

News analytics example

S4 result

Knowledge graphs with S4

• SPARQL query endpoint to the FactForge semantic data warehouse
  – 500 million entities / 5 billion triples
• Key LOD datasets integrated
  – DBpedia, Freebase/WikiData, GeoNames, WordNet
  – Dublin Core, SKOS, PROTON ontologies and vocabularies

Knowledge graph query example

SPARQL query using DBpedia data

RDF Data Management


RDF for smart data management

• Standards compliance
  – Based on a mature set of W3C standards: RDF/S, OWL, SPARQL
  – Portability & interoperability
• Schema-less data integration, easy querying of diverse data
• Complex & exploratory queries
• Infer implicit relations in the graph
• Reuse open knowledge graphs (Linked Open Data)

A visual view of RDF data

Inference
• Sub-properties
• Sub-classes
• Transitive relations
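The inference labels on this slide (sub-classes, transitive relations) can be illustrated with a toy sketch. It is not GraphDB's reasoner, only plain Java that derives implicit superclass relations from explicit ones; the class name `SubClassInference` and the `ex:` terms are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy illustration of transitive inference over subclass statements. */
final class SubClassInference {
    // Explicit statements: child class -> its directly asserted parent classes.
    private final Map<String, Set<String>> directParents = new HashMap<>();

    void addSubClassOf(String child, String parent) {
        directParents.computeIfAbsent(child, k -> new TreeSet<>()).add(parent);
    }

    /** Returns every superclass of cls, both asserted and inferred by transitivity. */
    Set<String> superClassesOf(String cls) {
        Set<String> result = new TreeSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.push(cls);
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            for (String parent : directParents.getOrDefault(current, Set.of())) {
                // Only follow a parent the first time it is seen (also stops on cycles).
                if (result.add(parent)) {
                    toVisit.push(parent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SubClassInference kb = new SubClassInference();
        kb.addSubClassOf("ex:Capital", "ex:City");   // asserted
        kb.addSubClassOf("ex:City", "ex:Location");  // asserted
        // "ex:Capital is a kind of ex:Location" is never asserted, only inferred.
        System.out.println(kb.superClassesOf("ex:Capital"));
    }
}
```

A real RDF database applies such rules inside the engine, so queries see the inferred statements without any application code.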

GraphDB by Ontotext

• High performance RDF database
• Full SPARQL 1.1 support
• Various reasoning profiles, including custom rules
• Efficient data integration (“sameAs” optimisations)
• Efficient deletion of statements & their inferences
• Geo-spatial indexing & querying with SPARQL
• RDF Rank, full-text search, 3rd party plugins

On-demand RDF Databases in the Cloud


RDF database in the Cloud with S4

• Ideal for customers who are…
  – Still evaluating and testing RDF technology
  – In the early phase of adoption / POC
• Enterprise grade RDF database in the Cloud
  – No need for upfront payments for licenses & hardware
  – Pay only for what you use, when you use it
  – Operational within minutes
  – No need for complex planning: use as many DB instances as needed, for as long as needed
  – Timely upgrades to the latest version
• Self-managed and fully managed options

Self-managed RDF DB in the Cloud

• Available from AWS Marketplace
• Variety of hardware configurations
  – 2 to 8 CPU cores / 8 to 61 GB RAM
  – IOPS performance & encryption (EBS)
• Manage large data volumes
• Pay-per-hour pricing

Fully managed RDF DB in the Cloud

• Low-cost graph DBaaS available 24/7
• Ideal for small & moderate data & query volumes
  – Database options: 1M, 10M, 50M, 250M and 1B triples
• Instantly deploy new databases when needed
• Zero administration
  – Automated operations, maintenance & upgrades
• Users pay only for the actual database utilisation
• Standard OpenRDF REST API

Fully managed RDF DB in the Cloud

Database type    Max triples
micro            1 million
XS               10 million
S                50 million
M                250 million
L                1 billion
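As a rough illustration of how the size table could be used when planning, the sketch below picks the smallest database type that fits a given triple count. The limits are taken from the table. The class and method names are hypothetical and not part of any S4 SDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Picks a database type from the size table on the slide. */
final class DatabaseTiers {
    // Database type -> maximum triples, in ascending order (values from the slide).
    private static final Map<String, Long> MAX_TRIPLES = new LinkedHashMap<>();
    static {
        MAX_TRIPLES.put("micro", 1_000_000L);
        MAX_TRIPLES.put("XS", 10_000_000L);
        MAX_TRIPLES.put("S", 50_000_000L);
        MAX_TRIPLES.put("M", 250_000_000L);
        MAX_TRIPLES.put("L", 1_000_000_000L);
    }

    /** Returns the smallest type that can hold tripleCount, or empty if none can. */
    static Optional<String> smallestTierFor(long tripleCount) {
        if (tripleCount < 0) {
            throw new IllegalArgumentException("tripleCount must not be negative");
        }
        for (Map.Entry<String, Long> tier : MAX_TRIPLES.entrySet()) {
            if (tripleCount <= tier.getValue()) {
                return Optional.of(tier.getKey());
            }
        }
        return Optional.empty(); // larger than the biggest fully managed option
    }

    public static void main(String[] args) {
        System.out.println(smallestTierFor(30_000_000L).orElse("none"));
    }
}
```

Whether inferred statements count towards the limit is not stated on the slide, so the count passed in is an assumption the caller has to settle.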


Self-managed vs fully managed

• …

Use cases

• …

Technical Details


Fully managed RDF DB in the Cloud

• Cloud native architecture, running on AWS
• Designed for elasticity & high availability
  – More resources added whenever needed
  – Failed nodes replaced immediately
• GraphDB is the RDF DB engine
  – OpenRDF REST API
• Isolation of the multi-tenant databases
  – Docker containers
  – Private NAS volumes (EBS) for data storage

OpenRDF REST API

Resource                                       Operations               Comments
/repositories                                  GET                      Get info on DB repos
/repositories/<REPOSITORY>                     GET, POST, PUT, DELETE   Create*, delete, query a repository
/repositories/<REPOSITORY>/size                GET                      Gets the number of triples in a repository
/repositories/<REPOSITORY>/statements          GET, POST, PUT, DELETE   Add, read, update, delete statements
/repositories/<REPOSITORY>/rdf-graphs/<GRAPH>  GET, POST, PUT, DELETE   Same as above, for a single named graph
/settings                                      GET, PUT                 Configure the DB*
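The resource table can be turned into a small URL helper. This is a minimal sketch using only the Java standard library. `OpenRdfPaths` and its methods are hypothetical names, and the endpoint passed in is assumed to be the database URL used in the curl slides.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds resource URLs for the paths listed in the REST API table. */
final class OpenRdfPaths {
    private final String serviceEndpoint;

    OpenRdfPaths(String serviceEndpoint) {
        // Drop a trailing slash so the joined paths never contain "//".
        this.serviceEndpoint = serviceEndpoint.endsWith("/")
                ? serviceEndpoint.substring(0, serviceEndpoint.length() - 1)
                : serviceEndpoint;
    }

    String repositories() {
        return serviceEndpoint + "/repositories";
    }

    String repository(String repositoryId) {
        return repositories() + "/" + encodeSegment(repositoryId);
    }

    String size(String repositoryId) {
        return repository(repositoryId) + "/size";
    }

    String statements(String repositoryId) {
        return repository(repositoryId) + "/statements";
    }

    String namedGraph(String repositoryId, String graphName) {
        return repository(repositoryId) + "/rdf-graphs/" + encodeSegment(graphName);
    }

    // URLEncoder targets form data and emits "+" for spaces; a path segment needs "%20".
    private static String encodeSegment(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        OpenRdfPaths paths = new OpenRdfPaths("https://example.org/user/db/");
        System.out.println(paths.statements("my repo"));
    }
}
```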

Uploading data (OpenRDF Workbench)


Uploading data (curl)

API_KEY=…
KEY_SECRET=…
USER=…
DATABASE=…
REPOSITORY=…

SERVICE_ENDPOINT="https://$API_KEY:$KEY_SECRET@rdf.s4.ontotext.com/$USER/$DATABASE"

# Upload an RDF/XML file into the repository
curl -X POST -H "Content-Type: application/rdf+xml;charset=UTF-8" -T example.rdf "$SERVICE_ENDPOINT/repositories/$REPOSITORY/statements"
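For readers who prefer plain Java to curl, the same upload request can be sketched with the JDK's built-in HTTP client types (Java 11+). The sketch only builds the request and does not send it. It assumes the service accepts HTTP Basic authentication, which is what curl sends for credentials embedded in the URL. `UploadRequestSketch` is a hypothetical name.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds (but does not send) the upload request shown in the curl example. */
final class UploadRequestSketch {

    /** Encodes the API key pair as an HTTP Basic Authorization header value. */
    static String basicAuth(String apiKey, String keySecret) {
        String credentials = apiKey + ":" + keySecret;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** POSTs RDF/XML content to a repository's /statements resource. */
    static HttpRequest uploadRequest(String statementsUrl, String apiKey,
                                     String keySecret, String rdfXml) {
        return HttpRequest.newBuilder(URI.create(statementsUrl))
                .header("Authorization", basicAuth(apiKey, keySecret))
                .header("Content-Type", "application/rdf+xml;charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(rdfXml, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and credentials; substitute real values before sending.
        HttpRequest request = uploadRequest(
                "https://example.org/user/db/repositories/repo/statements",
                "key", "secret", "<rdf:RDF/>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to `HttpClient.newHttpClient().send(...)`. Keeping the credentials in a header rather than in the URL also keeps them out of logged URLs.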

Uploading data (Java / OpenRDF SDK)

import java.io.File;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.manager.RemoteRepositoryManager;
import org.openrdf.rio.RDFFormat;

String dbaasURL = "<dbaas URL>";
String repositoryId = "<repository ID>";
String pathToTheFile = "<pathToTheFile>";
String apiKey = "<api-key>";
String apiPass = "<api-pass>";

// The base URI to resolve any relative URIs that are in the data against.
String baseURI = "http://www.example.org";

// Create a RemoteRepositoryManager
RemoteRepositoryManager manager = RemoteRepositoryManager.getInstance(dbaasURL, apiKey, apiPass);

// Open a connection to the repository
Repository repository = manager.getRepository(repositoryId);
RepositoryConnection repositoryConnection = repository.getConnection();

// Upload RDF data
File fileToUpload = new File(pathToTheFile);
repositoryConnection.add(fileToUpload, baseURI, RDFFormat.RDFXML);

// Close the connection
repositoryConnection.close();

Querying data (OpenRDF Workbench)


Querying data (curl)

API_KEY=…
KEY_SECRET=…
USER=…
DATABASE=…
REPOSITORY=…
SERVICE_ENDPOINT="https://$API_KEY:$KEY_SECRET@rdf.s4.ontotext.com/$USER/$DATABASE"

SPARQL_QUERY="…"

# --data-urlencode escapes spaces, braces and other reserved characters in the query
curl -X POST -H "Accept: application/sparql-results+xml" --data-urlencode "query=$SPARQL_QUERY" "$SERVICE_ENDPOINT/repositories/$REPOSITORY"
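A SPARQL query sent as a form field has to be URL-encoded, because it normally contains spaces, braces and question marks. The minimal Java sketch below shows the encoding step only. `SparqlFormBody` is a hypothetical helper, and the query is a generic example rather than the one used in the webinar.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Form-encodes a SPARQL query for a POST to a repository endpoint. */
final class SparqlFormBody {

    /** Returns an application/x-www-form-urlencoded body with a single "query" field. */
    static String queryBody(String sparql) {
        return "query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Generic example query: list ten arbitrary statements.
        System.out.println(queryBody("SELECT * WHERE { ?s ?p ?o } LIMIT 10"));
        // prints query=SELECT+*+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D+LIMIT+10
    }
}
```

The resulting string is sent as the request body with an `Accept` header naming the desired result format, as in the curl example.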

Demo


Demo scenario

• (Create database)
• Create a repository
• Upload sample data
• Query the data
• Explore data with a 3rd party tool

Create a database

• Micro, XS, S, M, or L
• R/O access to Open Data services or open knowledge graphs

Create a repository


Inference ruleset

Cache distribution

Uploading data (OpenRDF Workbench)


Sample data (European country populations)



Querying data (OpenRDF Workbench)


Exploring data (GraphRover)


Roadmap


Work in progress

• Various improvements (backup & export)
• Gradually introduce XS, S, M and L databases
• Increased availability
  – Cross-datacenter replication
• Integration with the GraphDB Workbench

GraphDB Workbench


Key Takeaways

• S4 provides an enterprise RDF DBaaS
• Free DBs up to 1M triples
• Instantly available whenever needed
• Easy to use: OpenRDF REST services
• Zero administration: automated operations, maintenance & upgrades
• Resilient design, high availability
• Check out http://s4.ontotext.com

Additional S4 resources

• Online documentation
  – http://docs.s4.ontotext.com/
• Sample code & demos on GitHub
  – https://github.com/Ontotext-AD/S4
• Helpdesk
  – http://support.s4.ontotext.com/
• Twitter
  – @Ontotext_S4

Thank you!

On-Demand RDF Graph Databases in the Cloud

A link to the recording will be sent out shortly

Jun 11th, 2015


DBaaS architecture on AWS
