Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB Fundamentals Ontotext Webinar Jan 26, 2017

Upload: ontotext

Post on 07-Apr-2017


Page 1: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB Fundamentals

Ontotext Webinar, Jan 26, 2017

Page 2: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Presentation Outline

• Welcome

• RDF and RDFS Overviews

• SPARQL Overview

• Ontology Overview

• Ontology Modeling

• GraphDB™ Installation

• Performance Tuning and Scalability

• GraphDB™ Workbench and RDF4J

• Loading Data

• Rule Sets and Reasoning Strategies

• Extensions


Page 4: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

What is RDF?

Resource Description Framework (RDF) is a graph data model that
• Formally describes the semantics, or meaning, of information
• Represents metadata, i.e., data about data

The RDF data model consists of triples
• That represent links (or edges) in an RDF graph
• Where the structure of each triple is Subject, Predicate, Object

Example triples:

Subject   Predicate      Object
br:Fred   br:hasSpouse   br:Wilma .
br:Fred   br:hasAge      25 .

‘br:’ refers to the namespace ‘http://bedrock/’, so that ‘br:Fred’ expands to <http://bedrock/Fred>, a Uniform Resource Identifier (URI).
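The prefix expansion described on this slide can be sketched in a few lines of Python. This is only an illustration: the `PREFIXES` table and the `expand` helper are hypothetical names, and only the `br:` namespace comes from the slide.

```python
# Hypothetical helper names; only the br: namespace is taken from the slide.
PREFIXES = {"br": "http://bedrock/"}


def expand(term: str) -> str:
    """Expand a prefixed name such as 'br:Fred' into a full URI in angle brackets."""
    prefix, _, local = term.partition(":")
    if prefix in PREFIXES:
        return f"<{PREFIXES[prefix]}{local}>"
    return term  # plain literals such as 25 are returned unchanged


# Each triple is a (subject, predicate, object) tuple.
triples = [
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
    ("br:Fred", "br:hasAge", "25"),
]
expanded = [tuple(expand(t) for t in triple) for triple in triples]
```

Here `expanded` holds the same two triples with every prefixed name replaced by its full URI.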

Page 5: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

An Example of an RDF Model

[Figure: RDF graph of the Bedrock example. Nodes: Fred Flintstone, Wilma Flintstone, Pebbles Flintstone, Pearl Slaghoople, Barney Rubble, Betty Rubble, Bamm-Bamm Rubble, Roxy Rubble, Chip, Rock Quarry, Bedrock, Cobblestone County, Prehistoric America. Edge labels: hasSpouse, hasChild, worksFor, livesIn, locatedIn, partOf.]

Page 6: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

What is RDFS?

RDF Schema (RDFS)
• Adds
  − Concepts such as Resource, Literal, Class, and Datatype
  − Relationships such as subClassOf, subPropertyOf, domain, and range
• Provides the means to define
  − Classes and properties
  − Hierarchies of classes and properties
• Includes “entailment rules”, i.e., axioms to infer new triples from existing ones

Page 7: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Applying RDFS To Infer New Triples

Asserted triples:

br:hasSpouse a rdf:Property ;
    rdfs:domain br:Human ;
    rdfs:range br:Human .
br:Fred br:hasSpouse br:Wilma .
br:Human a rdfs:Class ;
    rdfs:subClassOf br:Mammal .

Inferred from rdfs:domain and rdfs:range:

br:Fred a br:Human .
br:Wilma a br:Human .

Inferred from rdfs:subClassOf:

br:Fred a br:Mammal .
br:Wilma a br:Mammal .
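The inference on this slide can be reproduced with a small forward-chaining loop. The sketch below is a simplification, not GraphDB's implementation: it applies only three RDFS entailment rules (domain, range, subClassOf) to triples held as Python tuples, and `rdfs_closure` is a hypothetical name.

```python
def rdfs_closure(triples):
    """Apply three RDFS entailment rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # rdfs2: (p rdfs:domain C) and (s p o) entail (s a C)
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "a", o2))
                # rdfs3: (p rdfs:range C) and (s p o) entail (o a C)
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "a", o2))
                # rdfs9: (C rdfs:subClassOf D) and (s a C) entail (s a D)
                if p == "a" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "a", o2))
        if new <= inferred:
            return inferred
        inferred |= new


asserted = {
    ("br:hasSpouse", "a", "rdf:Property"),
    ("br:hasSpouse", "rdfs:domain", "br:Human"),
    ("br:hasSpouse", "rdfs:range", "br:Human"),
    ("br:Fred", "br:hasSpouse", "br:Wilma"),
    ("br:Human", "a", "rdfs:Class"),
    ("br:Human", "rdfs:subClassOf", "br:Mammal"),
}
new_triples = rdfs_closure(asserted) - asserted
```

With these inputs `new_triples` contains exactly the four type statements listed on the slide. A full RDFS reasoner applies more rules than these three and would derive additional triples.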


Page 9: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

What is SPARQL?

SPARQL is a SQL-like query language for RDF graph data with the following query types:
• SELECT returns tabular results
• CONSTRUCT creates a new RDF graph based on query results
• ASK returns true (‘yes’) if the query has a solution, otherwise false (‘no’)
• DESCRIBE returns RDF graph data about a resource; useful when the query client does not know the structure of the RDF data in the data source

SPARQL Update adds operations that modify graphs:
• INSERT inserts triples into a graph
• DELETE deletes triples from a graph

Page 10: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Using SPARQL to Insert Triples

To create an RDF graph, perform these steps:
• Define prefixes to URIs with the PREFIX keyword.
• Use INSERT DATA to signify you want to insert statements. Write the subject-predicate-object statements (triples).
• Execute this query.

PREFIX br: <http://bedrock/>
INSERT DATA {
  br:fred br:hasSpouse br:wilma .
  br:fred br:hasChild br:pebbles .
  br:wilma br:hasChild br:pebbles .
  br:pebbles br:hasSpouse br:bamm-bamm ;
             br:hasChild br:roxy, br:chip .
}

[Figure: the resulting graph, with nodes :fred, :wilma, :pebbles, :bamm-bamm, :roxy and :chip linked by :hasSpouse and :hasChild edges.]

Page 11: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Using SPARQL to Select Triples

To access the RDF graph you just created, perform these steps:
• Define prefixes to URIs with the PREFIX keyword.
• Use SELECT to signify you want to select certain information, and WHERE to signify your conditions, restrictions and filters.
• Execute this query.

PREFIX br: <http://bedrock/>
SELECT ?subject ?predicate ?object
WHERE { ?subject ?predicate ?object }

Results (the br:hasChild rows shown on the slide):

Subject      Predicate     Object
br:fred      br:hasChild   br:pebbles
br:pebbles   br:hasChild   br:roxy
br:pebbles   br:hasChild   br:chip
br:wilma     br:hasChild   br:pebbles

Page 12: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Using SPARQL to Find Fred’s Grandchildren

To find Fred’s grandchildren, first find out if Fred has any grandchildren:
• Define prefixes to URIs with the PREFIX keyword.
• Use ASK to discover whether Fred has a grandchild, and WHERE to signify your conditions.

PREFIX br: <http://bedrock/>
ASK WHERE {
  br:fred br:hasChild ?child .
  ?child br:hasChild ?grandChild .
}

Result: YES

Page 13: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Using SPARQL to Find Fred’s Grandchildren

Now that we know he has at least one grandchild, perform these steps to find the grandchild(ren):
• Define prefixes to URIs with the PREFIX keyword.
• Use SELECT to signify you want to select a grandchild, and WHERE to signify your conditions.

PREFIX br: <http://bedrock/>
SELECT ?grandChild WHERE {
  br:fred br:hasChild ?child .
  ?child br:hasChild ?grandChild .
}

Results:

grandChild
1. br:roxy
2. br:chip
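To see what the two-pattern WHERE clause does, the sketch below evaluates it over the six Bedrock triples in plain Python. It is a toy matcher, not how a SPARQL engine is implemented; `match` and `solve` are hypothetical names. Each pattern extends the variable bindings produced by the previous one, which is how the shared `?child` variable joins the two patterns.

```python
TRIPLES = {
    ("br:fred", "br:hasSpouse", "br:wilma"),
    ("br:fred", "br:hasChild", "br:pebbles"),
    ("br:wilma", "br:hasChild", "br:pebbles"),
    ("br:pebbles", "br:hasSpouse", "br:bamm-bamm"),
    ("br:pebbles", "br:hasChild", "br:roxy"),
    ("br:pebbles", "br:hasChild", "br:chip"),
}


def match(pattern, binding):
    """Yield bindings that extend `binding` so that `pattern` matches a stored triple."""
    for triple in TRIPLES:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate


def solve(patterns):
    """Join the patterns left to right, like the triple patterns in a WHERE clause."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return bindings


WHERE = [
    ("br:fred", "br:hasChild", "?child"),
    ("?child", "br:hasChild", "?grandChild"),
]
solutions = solve(WHERE)
ask = bool(solutions)                                            # ASK
grandchildren = sorted(s["?grandChild"] for s in solutions)      # SELECT ?grandChild
```

ASK only needs to know whether `solutions` is non-empty, while SELECT projects the requested variable out of every solution.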


Page 15: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

What is an Ontology?

An ontology is a formal specification that provides sharable and reusable knowledge representation.

Examples of formal specifications include:
• Taxonomies
• Vocabularies
• Thesauri
• Topic Maps
• Logical Models

Page 16: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

What is in an Ontology?

An ontology specification includes descriptions of
• Concepts and properties in a domain
• Relationships between concepts
• Constraints on how the relationships can be used
• Individuals as members of concepts

Page 17: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

The Benefits of an Ontology

Ontologies provide:
• A common understanding of information
• Explicit domain assumptions

These provisions are valuable because ontologies:
• Support data integration for analytics
• Apply domain knowledge to data
• Support interoperation of applications
• Enable model-driven applications
• Reduce the time and cost of application development
• Improve data quality, i.e., metadata and provenance

Page 18: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

OWL Overview

The Web Ontology Language (OWL) adds more powerful ontology modelling means to RDF/RDFS
• Providing
  − Consistency checks: Are there logical inconsistencies?
  − Satisfiability checks: Are there classes that cannot have instances?
  − Classification: What is the type of an instance?
• Adding identity equivalence and identity difference
  − Such as sameAs, differentFrom, equivalentClass, equivalentProperty
• Offering more expressive class definitions, such as
  − Class intersection, union, complement, disjointness
  − Cardinality restrictions
• Offering more expressive property definitions, such as
  − Object and datatype properties
  − Transitive, functional, symmetric, inverse properties
  − Value restrictions


Page 20: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

A Methodology for Ontologies

"Ontology Development 101" by Noy & McGuinness (2001) is a popular, practical seven-step methodology for developing an ontology.

• Step 1: Identify the domain and scope
• Step 2: Consider re-using existing ontologies
• Step 3: Enumerate important terms
• Step 4: Define the classes and class hierarchy
• Step 5: Define the properties of classes
• Step 6: Define property facets
• Step 7: Create instances

Page 21: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 1: Identify the Domain and Scope

To help identify the domain and scope of the ontology, answer these questions:
• What is the domain of the ontology?
• What is the purpose of the ontology?
• Who are the users and maintainers?
• What questions will the ontology answer?

Some say the last question is the most important (the Competency Questions approach).

Page 22: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 2: Consider Re-using Existing Ontologies

Ontologies are re-usable and extensible, and there are a number of existing ontologies that you might consider:
• Your existing ontology
• Widely used ontologies
  − Such as Dublin Core, FOAF, SKOS, Geo (WGS84)
• Upper Level Ontologies
  − Such as Cyc, UMBEL, DOLCE, SUMO, PROTON
• Linked Open Data
• Specialized domain ontologies

Page 23: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 3: Enumerate Important Terms

Terminology is useful for domain modeling. Start collecting terminology based on interviews and domain documentation.

Page 24: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 4: Define Class and Class Hierarchy

To help define the class and class hierarchy, determine which type of modeling to use.

Three types of modeling are:
• Top-down modeling
  − Use it when the general domain concepts are known
• Bottom-up modeling
  − Use it when there is a great variety of concepts and no clear overarching general concepts at the outset
• Hybrid modeling
  − Use it when you need both top-down and bottom-up modeling, which is often the case

Ontotext, AD and Keen Analytics, LLC. All Rights Reserved

Page 25: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 5: Define Properties of Classes

Define the properties of classes, such as:
• Intrinsic properties
  − For example, color, mass, density
• Extrinsic properties
  − For example, name, location
• Parts
• Relationships to other individuals

Page 26: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 6: Define Property Facets

Define property facets, such as:
• Property Type
  − Is it symmetric? Is it transitive? Is it a datatype or an object property?
• Cardinality
  − Is the property optional or essential? Is the property a one-to-many relationship?
• Domain
  − From which classes does this property point?
• Range
  − To which classes does this property point?

Page 27: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Step 7: Create Instances

Create instances of classes
• For example, :Fred a :Human

Creating instances
• Tests the domain ontology
• May expose modeling issues
  − Which can be addressed by iterative refinement
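Steps 4 to 7 can be tried out on paper or in a few lines of code before committing to an ontology language. The sketch below is a toy model under stated assumptions: the `Property` dataclass, the `Company` class and `RockQuarry` instance, and the `assert_statement` helper are all hypothetical, and domain and range are treated as checks here, whereas in RDFS they drive inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    domain: str              # class the property points from (Step 6)
    range: str               # class the property points to (Step 6)
    symmetric: bool = False  # property type facet (Step 6)


CLASSES = {"Human", "Company"}                                          # Step 4
HAS_SPOUSE = Property("hasSpouse", "Human", "Human", symmetric=True)    # Steps 5-6
WORKS_FOR = Property("worksFor", "Human", "Company")
INSTANCES = {"Fred": "Human", "Wilma": "Human", "RockQuarry": "Company"}  # Step 7

# Every instance must belong to a declared class.
assert set(INSTANCES.values()) <= CLASSES


def assert_statement(prop, subject, obj):
    """Check a statement against the property's facets and return the resulting triples."""
    if INSTANCES.get(subject) != prop.domain:
        raise ValueError(f"{subject} is not in the domain of {prop.name}")
    if INSTANCES.get(obj) != prop.range:
        raise ValueError(f"{obj} is not in the range of {prop.name}")
    triples = [(subject, prop.name, obj)]
    if prop.symmetric:
        # A symmetric property also holds in the opposite direction.
        triples.append((obj, prop.name, subject))
    return triples


spouse_triples = assert_statement(HAS_SPOUSE, "Fred", "Wilma")
```

Creating instances this way surfaces modeling problems early: a statement such as Fred worksFor Wilma raises an error because Wilma is not a Company.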


Page 29: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Editions

• GraphDB™ Free

• GraphDB™ Standard

• GraphDB™ Cloud

• GraphDB™ as-a-Service (S4)

• GraphDB™ Enterprise


Page 30: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Free Installation

http://ontotext.com/products/graphdb/

Page 31: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Free Edition Installation Overview

Step 1:
• On Windows - Download and run the GraphDB .exe file, then follow the on-screen installer prompts.
• On Mac OS - Download and open the GraphDB .dmg file. Copy the program from the virtual disk to your hard disk Applications folder.
• On Linux - Download the GraphDB .rpm or .deb file. Install the package with sudo rpm -i or sudo dpkg -i followed by the name of the downloaded package.

Step 2:
• Start the database by clicking the application icon. The GraphDB Server and Workbench open at http://localhost:7200/.

Page 32: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Free Edition Workbench New Repository

Create a new repository by:
• Launching the GraphDB™ Workbench at http://localhost:7200
• Selecting “Setup”
• Selecting “Repositories”
• Configuring the new repository

Page 33: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Free Edition Workbench New Repository


Page 34: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Workbench Execute Queries

Test the repository by
• Selecting “SPARQL”
• Submitting queries: first insert data (1), then query it (2)


Page 36: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Performance Tuning: Memory

With regard to performance tuning
• Memory is the most important factor
  − More memory results in better performance
• To specify the maximum amount of heap space used by a JVM, use the -Xmx virtual machine parameter.
  − The -Xmx value should be about 2/3 of the system memory.
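As a worked example of the two-thirds guideline, the helper below turns a machine's memory size into a -Xmx flag. The function name is hypothetical and the rounding down to whole gigabytes is an assumption of this sketch, not an Ontotext recommendation.

```python
def recommended_xmx(system_memory_gb: int) -> str:
    """Return a -Xmx flag set to roughly two thirds of system memory, in whole gigabytes."""
    heap_gb = max(1, (2 * system_memory_gb) // 3)
    return f"-Xmx{heap_gb}g"


flag_24 = recommended_xmx(24)  # a machine with 24 GB of RAM
flag_16 = recommended_xmx(16)  # a machine with 16 GB of RAM
```

For 24 GB of system memory this yields -Xmx16g, and for 16 GB it yields -Xmx10g; the remaining third is left for the operating system and off-heap structures.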

Page 37: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Performance Tuning: Memory

From GraphDB 7.2 on, you no longer have to
− Configure the cache-memory, tuple-index-memory and predicate-memory, or size every repository and calculate the amount of memory dedicated to it, thanks to a new cache strategy called the single global page cache.
− Calculate the entity pool memory when giving the JVM max heap memory parameter to GraphDB. All entity pool structures now reside off-heap, i.e., outside of the normal JVM heap.

To activate the old behaviour, you can still enable on-heap allocation with

-Dgraphdb.epool.onheap=true

Page 38: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Performance Tuning: Load

Each dataset has its own “geometry.” Technicians must gain experience with each dataset in order to refine the loading process. The following is a typical initialisation life-cycle:

1. Configure a repository for best loading performance with many estimated parameters.
2. Load data.
3. Examine dataset properties.
4. Refine loading configuration.
5. Reload data and measure improvement.

Unless the repository has to answer queries during the initialisation phase, it can be configured with the minimum number of options and indices:

enablePredicateList = false (unless the dataset has a large number of predicates)
enable-context-index = false
in-memory-literal-properties = false

Tip: You can also use the LoadRDF Parallel Bulk Loader (video)

Page 39: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Scalability: GraphDB™ Enterprise

GraphDB™ Enterprise edition provides scalability
• Replication / High Availability cluster
• Improved concurrent querying and scalability
• Resilience for failover


Page 41: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Workbench and RDF4J

GraphDB™ Workbench is a web-based administration tool. It is similar to RDF4J Workbench, but
• Has more features
• Is more intuitive and easier to use

GraphDB™ Workbench functions include
• Managing GraphDB™ repositories
• Loading and exporting data
• Monitoring query execution
• Managing connectors and users

Page 42: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Workbench

On the following slide is an example of the GraphDB™ Workbench screen.
• Access the GraphDB™ Workbench from a browser.
• The splash page provides a summary of the installed GraphDB™ Workbench.
• The Workbench has a side menu bar with convenient drop-down menus organized under “Import”, “Explore”, “SPARQL”, “Monitor”, “Setup” and “Help”.

Page 43: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Create New Repository

Create a new repository by:
• Selecting “Setup”
• Selecting “Repositories”
• Configuring the new repository, which includes GraphDB-specific configuration settings not available in RDF4J

Page 44: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Execute Queries With GraphDB™ Workbench

Selecting the SPARQL menu displays the SPARQL query editor, which
• Allows you to render your query results as a Table, Pivot Table, or Google Chart

Page 45: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Workbench Query Editor


Page 46: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Query Monitoring: Abort Query

GraphDB™ allows you to abort long-running queries while they are executing.

For example, if you submit a long-running query and would like to halt it, perhaps to modify and resubmit it, you do not have to wait until it completes.

From the side menu panel, select Monitor, then Queries.


Page 48: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading Data

Loading data may be accomplished by using
• GraphDB™ Workbench
  − To upload individual files
  − To upload bulk data from a directory
• LoadRDF Parallel Loader

Page 49: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading Data

Supported File Formats

Page 50: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading data through the GraphDB Workbench

To load a local file:
• Select Import -> RDF.
• Open the Local files tab and click the Select files icon to choose the file you want to upload.
• Click the Import button.
• Enter the import settings in the pop-up window.

Page 51: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading Local Files


Page 52: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading a database server file

• Create a folder named graphdb-import in your user home directory.
• Copy all data files you want to load into the GraphDB database to this folder.
• Go to the GraphDB Workbench.
• Select Data -> Import.
• Open the Server files tab.
• Select the files you want to import.
• Click the Import button.

Page 53: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

LoadRDF Parallel Bulk Loader

The LoadRDF Parallel Bulk Loader
• Features fast loading of large datasets into new repositories
• Is not intended for updating existing repositories
• Is easy to use:
  − Enter loadrdf <config.ttl> <serial|parallel> <files...>
    ▪ For example, “./loadrdf.sh config.ttl parallel example.ttl”
  − The “Serial Load” option pipelines the parse, entity resolution, and load tasks.
  − The “Parallel Load” option batch processes the parse, entity resolution, and load tasks.

Page 54: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Other ways to load data

• By pasting data in the Text area tab of the Import page.
• By pasting a data URL in the Remote content tab of the Import page.
• By executing an INSERT query in the SPARQL -> SPARQL Query page.

Page 55: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading tabular data using OntoRefine


Page 56: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Loading tabular data using OntoRefine



Page 58: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Reasoning Strategies

• Forward Chaining
  − Inferences pre-computed
  − Faster query performance
  − Slower load times
  − More memory/disk space required
  − Updates are expensive (truth maintenance is non-trivial)
• Backward Chaining
  − Inferences performed as needed at query time
  − Slower query performance
  − Faster load times
• Hybrid Reasoning
  − Partial forward chaining at data loading time + partial backward chaining at query time
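The trade-off between the two strategies can be shown on a tiny class hierarchy. In this sketch (hypothetical names throughout, and `br:Animal` is an invented class added for illustration) forward chaining materialises every inferred type up front, while backward chaining stores nothing extra and proves a single statement on demand.

```python
SUBCLASS_OF = {"br:Human": "br:Mammal", "br:Mammal": "br:Animal"}  # br:Animal is hypothetical
ASSERTED_TYPES = {"br:Fred": "br:Human"}


def forward_chain():
    """Materialise every inferred type at load time (more storage, fast lookups)."""
    materialised = set()
    for individual, cls in ASSERTED_TYPES.items():
        while cls is not None:
            materialised.add((individual, cls))
            cls = SUBCLASS_OF.get(cls)
    return materialised


def backward_chain(individual, goal):
    """Prove one type statement at query time by walking up the class hierarchy."""
    cls = ASSERTED_TYPES.get(individual)
    while cls is not None:
        if cls == goal:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


materialised = forward_chain()
is_animal = backward_chain("br:Fred", "br:Animal")
```

With forward chaining a query is a set lookup in `materialised`, but any change to the hierarchy means recomputing it. With backward chaining nothing is stored, but every query pays for the walk.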

Page 59: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GraphDB™ Reasoning Optimizations

• Fast (incremental) inserts (assertions) and deletes (retractions)
  − Most triplestores perform an expensive full re-compute on updates
• Reasoning on insert: forward chaining optimization
  − Rule sets compiled to fast Java code
  − Every statement is passed through all rules. The first check is in-memory, reducing the need for lookups
• Delete optimization: smooth (incremental) delete
  − Truth maintenance minimizes the re-compute, but the required dependency tracking is expensive
  − GraphDB optimizes deletes by using backward chaining to derive delete dependencies dynamically
  − This backward search stops at axioms or ontology triples (see onto:schemaTransaction to control it)
  − Inferred triples without alternative support are retracted, recursively

Page 60: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

owl:sameAs Optimisation

• sameAs is useful in semantic data integration
  − Often independent agencies mint different URLs for the same entity
  − sameAs, an equivalence relation, declares them the same (“smushing”)
  − All statements of URL X in an equivalence cluster are “copied” to all Y in the same cluster
  − Such inference causes a combinatorial explosion of statements
  − If unchecked, it degrades memory use and query time performance
• sameAs optimisation
  − Compact representation: statements are made against clusters, not against individual URLs
  − Backward chaining finds all solutions across the cluster
  − Query results are compacted by picking one representative from the cluster (option disableSameAs=true)
  − disableSameAs=false means “Expand results over equivalent URIs”
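The compact representation can be pictured with a union-find structure: URIs linked by sameAs are merged into one cluster, and each statement is stored once against the cluster's representative. This is a sketch of the general idea only, not GraphDB's internal data structure, and the URIs and function names are made up for the example.

```python
def find(parent, node):
    """Return the representative of node's cluster, compressing the path as we go."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def build_clusters(same_as_pairs):
    """Merge URIs connected by sameAs into equivalence clusters."""
    parent = {}
    for a, b in same_as_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI as representative (an arbitrary choice).
            low, high = sorted((root_a, root_b))
            parent[high] = low
    return parent


def compact(triples, parent):
    """Store each statement once, against cluster representatives."""
    return {(find(parent, s), p, find(parent, o)) for s, p, o in triples}


same_as = [("a:Fred", "b:FredFlintstone"), ("b:FredFlintstone", "c:F_Flintstone")]
triples = {
    ("a:Fred", "br:hasSpouse", "br:Wilma"),
    ("b:FredFlintstone", "br:hasSpouse", "br:Wilma"),
    ("c:F_Flintstone", "br:hasAge", "25"),
}
clusters = build_clusters(same_as)
compacted = compact(triples, clusters)
```

Three statements about three equivalent URIs collapse to two statements about one representative. Expanding the results over all cluster members at query time corresponds to the behaviour described for disableSameAs=false.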

Page 61: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Rule Sets

A rule set consists of
• Prefixes (namespace prefixes)
• Axiomatic triples
• Custom rules

Pre-defined rule sets are
• empty: no reasoning, GraphDB™ operates as a plain RDF store
• rdfs: standard RDFS semantics
• owl-horst: RDFS + D-Entailment + some OWL, tractable
• owl-max: RDFS with most of OWL Lite
• owl2-rl: conformant OWL2 RL profile except for D-Entailment (types)
• owl2-ql: reasoning over large volumes of data


Page 63: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Ontotext GraphDB Connectors

• Provide extremely fast full-text, range, and faceted search, and aggregations
• Utilize an external engine like Lucene, Solr or Elasticsearch
• Flexible schema mapping: index only what you need
• Real-time synchronization of data in GraphDB and the external engine
• Connector management via SPARQL
• Data querying and update via SPARQL
• Based on the GraphDB plug-in architecture

Page 64: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Interface

• All interaction via SPARQL queries
  − INSERT for creating connectors
  − SELECT for getting connector configuration parameters
  − INSERT/SELECT/DELETE for managing and querying RDF data

Page 65: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Connectors – Primary Features

• Maintaining an index that is always in sync with the data stored in GraphDB
• Multiple independent instances per repository
• The entities for synchronization are defined by:
  − a list of fields (on the Lucene side) and property chains (on the GraphDB side) whose values will be synchronised
  − a list of rdf:type's of the entities for synchronisation
  − a list of languages for synchronisation (the default is all languages)
  − additional filtering by property and value
• Full-text search using native Lucene queries

Page 66: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Connectors – Primary Features

• Snippet extraction: highlighting of search terms in the search result
• Faceted search, e.g. Europeana Food and Drink
• Sorting by any preconfigured field
• Paging of results using offset and limit
• Custom mapping of RDF types to Lucene types
• Specifying which Lucene analyzer to use (the default is Lucene's StandardAnalyzer)
• Boosting an entity by the [numeric] value of one or more predicates
• Custom scoring expressions at query time to evaluate score based on Lucene

Page 67: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

TinkerPop Blueprints Support

• Blueprints (Apache TinkerPop, aka Gremlin) is a popular API for accessing graph databases
• It is supported by Hadoop, Neo4j, Titan, etc.
• GraphDB has supported Blueprints since version 7.0 for accessing RDF databases
• It represents RDF as a simplified version of the Property Graph model
• In this way you can use graph programming frameworks, or ready-made graph exploration software like Linkurious

Page 68: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

RDF Rank

RDF Rank is a GraphDB™ extension that
• Is similar to PageRank and identifies “important” nodes in an RDF graph based on their interconnectedness
• Is accessed using the rank:hasRDFRank system predicate
• Has an incremental mode (Incremental RDF Rank) that is useful for frequently changing data

For example, to select the top 100 important nodes in the RDF graph:

PREFIX rank: <http://www.ontotext.com/owlim/RDFRank#>
SELECT ?n
WHERE { ?n rank:hasRDFRank ?r }
ORDER BY DESC(?r)
LIMIT 100
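Since the slides only say that RDF Rank is similar to PageRank, the sketch below shows the textbook PageRank iteration on the Bedrock hasChild edges to convey the idea of ranking by interconnectedness. It is a generic illustration with assumed parameters (damping 0.85, 50 iterations), not GraphDB's RDF Rank algorithm.

```python
def rank(edges, damping=0.85, iterations=50):
    """Iterative PageRank over (source, target) edges; returns node -> score."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [t for s, t in edges if s == n] for n in nodes}
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes without outgoing links spread their score evenly over all nodes.
        dangling = sum(score[n] for n in nodes if not out_links[n])
        nxt = {
            n: (1 - damping) / len(nodes) + damping * dangling / len(nodes)
            for n in nodes
        }
        for n in nodes:
            for target in out_links[n]:
                nxt[target] += damping * score[n] / len(out_links[n])
        score = nxt
    return score


edges = [
    ("br:fred", "br:pebbles"),
    ("br:wilma", "br:pebbles"),
    ("br:pebbles", "br:roxy"),
    ("br:pebbles", "br:chip"),
]
scores = rank(edges)
# Equivalent of ORDER BY DESC(?r) LIMIT 100 over the computed scores.
top = sorted(scores, key=scores.get, reverse=True)[:100]
```

In this small graph br:pebbles ranks first because it receives links from both parents, which mirrors what ordering by rank:hasRDFRank does on a real repository.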

Page 69: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

GeoSPARQL Support


GeoSPARQL is a standard for representing and querying geospatial linked data from the Open Geospatial Consortium, using the Geography Markup Language

• A small topological ontology in RDFS/OWL for representation

• Simple Features, RCC8, and DE-9IM (a.k.a. Egenhofer) topological relationship vocabularies and ontologies for qualitative reasoning

• A SPARQL query interface using a set of Topological SPARQL extension functions for quantitative reasoning

Page 70: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

Support and FAQs: [email protected]

Additional resources:

Ontotext:
• Community Forum and Evaluation Support: http://stackoverflow.com/questions/tagged/graphdb
• GraphDB Website and Documentation: http://graphdb.ontotext.com
• Whitepapers, Fundamentals: http://ontotext.com/knowledge-hub/fundamentals/

SPARQL, OWL, and RDF:
• RDF: http://www.w3.org/TR/rdf11-concepts/
• RDFS: http://www.w3.org/TR/rdf-schema/
• SPARQL Overview: http://www.w3.org/TR/sparql11-overview/
• SPARQL Query: http://www.w3.org/TR/sparql11-query/
• SPARQL Update: http://www.w3.org/TR/sparql11-update

Page 71: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

For Further Information

• Peio Popov, North America Sales and Business Development
  − [email protected]
  − 1.929.239.0659
• Ilian Uzunov, Europe Sales and Business Development
  − [email protected]
  − 359.888.772.248

Page 72: Transforming your Graph Analytics with GraphDB (Petar Ivanov)

The End

GraphDB™ Fundamentals