Virtuoso Universal Server Overview

© 2008 OpenLink Software, All rights reserved. Virtuoso Product Family. Orri Erling - Program Manager, Virtuoso

Upload: rumito

Post on 15-Jan-2015


DESCRIPTION

Virtuoso Universal Server Overview Product Family

TRANSCRIPT

Page 1: Virtuoso Universal Server Overview


Virtuoso Product Family

Orri Erling - Program Manager, Virtuoso

Page 2: Virtuoso Universal Server Overview

Virtuoso Product Categories

Virtual Database Engine

Native Data Management (multi-model covering: SQL, RDF, XML, and Free Text)

Discussion Platform

Mail Proxy Services

Client Connectivity Kit

Virtuoso Universal Server

Page 3: Virtuoso Universal Server Overview

Virtual Database Engine

RDF, XML, SQL Conceptual Views over:

External ODBC or JDBC accessible SQL Data Sources

External XML based Data Sources

External SOAP or RESTful Web Services

External RDF Data (e.g. Oracle)

Custom Data Sources via Server Extensions API

Page 4: Virtuoso Universal Server Overview

Virtual Database Engine Contd.

SQL Queries over Remote SQL, RDF, XML, and Web Services based Data Sources

SPARQL Queries over Remote SQL, RDF, XML, and Web Services based Data Sources

XQuery/XPath Queries over Remote RDF, SQL, and XML based Data Sources

Web Services based access to Remote RDF, SQL, XML, and other Web Services based Data Sources

Page 5: Virtuoso Universal Server Overview

Virtual Database Engine Contd.

Distributed Query Optimization:

Locality Sensitive Query Cost Optimization (Collocated Joins, Pass-Through Queries, and Array Parameters)

Deductively Abstracts SQL Dialect Differences (via ODBC and JDBC metadata call exploitation)

Message Latency Factored into Cost Model

Hash Joins Used When Appropriate, Replacing Multiple Remote Lookups with Single Sequential Read

2-Phase Commit for Distributed Transactions: MS DTC for Windows, Tuxedo on Unix
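The cost-model bullets on this slide can be illustrated with a small sketch. This is not Virtuoso's optimizer: the function names and the cost constants are hypothetical, and the only idea taken from the slide is that message latency is part of the cost, so one sequential remote read feeding a local hash join can beat many per-row remote lookups.

```python
def lookup_join_cost(outer_rows, latency, row_cost):
    """Cost of probing the remote table once per outer row (one round trip each)."""
    return outer_rows * (latency + row_cost)


def hash_join_cost(remote_rows, latency, row_cost):
    """Cost of one sequential read of the remote table, hashed locally."""
    return latency + remote_rows * row_cost


def choose_join(outer_rows, remote_rows, latency=20.0, row_cost=1.0):
    """Pick the cheaper strategy under this toy model (constants are assumptions)."""
    lookup = lookup_join_cost(outer_rows, latency, row_cost)
    hashed = hash_join_cost(remote_rows, latency, row_cost)
    return "hash" if hashed < lookup else "lookup"


# Many outer rows: paying latency once wins. Few outer rows: lookups win.
print(choose_join(outer_rows=1000, remote_rows=5000))
print(choose_join(outer_rows=2, remote_rows=5000))
```

With the assumed constants, 1000 outer rows against a 5000-row remote table favour the hash join, while 2 outer rows favour direct lookups.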

Page 6: Virtuoso Universal Server Overview


Virtual Database Engine Contd.

ATTACH TABLE Statement incorporates Remote Table, Indexes and Statistics into Local Virtuoso Schema

Allows Incorporation of SQL Functions and Stored Procedures from Remote Relational Database Engines

Support for Remote XML, Full Text Indexing for Oracle, Microsoft SQL Server


Page 7: Virtuoso Universal Server Overview


Native Data Management – Relational (RDBMS)

Native SQL 92/2K Engine

Rich Procedure Language (PSM-95 based)

Database Engine Optimized for SMP Performance

Native Full Text Indexing


Page 8: Virtuoso Universal Server Overview

RDBMS Features - Transactions

Full ACID Properties

Checkpoint + Roll Forward Log, Optional Archiving of Logs

Uncommitted/Read Committed/Repeatable/Serializable Isolations

Non-blocking Read Committed Shows Latest Committed Versions of Uncommitted Updated Rows

Can Work as XA/MS DTC Resource Manager
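The non-blocking read committed bullet can be pictured with a toy row object. This is only a sketch of the visible behaviour, assuming a row keeps its last committed value beside a pending update; the class and method names are hypothetical and say nothing about how Virtuoso stores versions internally.

```python
_NO_UPDATE = object()  # sentinel: no uncommitted update is pending


class VersionedRow:
    """Hypothetical row that keeps the last committed value next to a pending one."""

    def __init__(self, value):
        self.committed = value
        self.pending = _NO_UPDATE

    def write(self, value):
        # The writer's change stays invisible to other readers until commit.
        self.pending = value

    def read_committed(self):
        # Readers do not wait for the writer; they get the last committed value.
        return self.committed

    def commit(self):
        if self.pending is not _NO_UPDATE:
            self.committed = self.pending
            self.pending = _NO_UPDATE

    def rollback(self):
        self.pending = _NO_UPDATE


row = VersionedRow("old")
row.write("new")
print(row.read_committed())  # still the committed value while the update is open
row.commit()
print(row.read_committed())
```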

Page 9: Virtuoso Universal Server Overview


RDBMS Features - SQL

Full SQL 92 with many 2K Features

SQLX, XPath, XSLT, XQuery

SQL 2K Objects, Implementation in SQL/Java/.NET

Transparent Mixing of Local and Remote Tables


Page 10: Virtuoso Universal Server Overview


RDBMS Features – Query Optimization

Cost Based Optimization

On The Fly Sampling of Table/Column/Literal Key Cardinalities

Fixed Statistics for Deterministic Query Plans

Loop/Hash/Merge Join

SQL Options for Explicitly Specifying Query Plan


Page 11: Virtuoso Universal Server Overview

RDBMS Features - Storage Engine

Rows Stored At Leaves of Primary Key Index Tree

Non PK Indexes Refer to Row By Value of PK

Bitmap Index

Full Text Index

Striping Across Disks, No Separate Files Per Table/Key

Incremental Online Backup
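The first two storage bullets describe a layout in which a secondary index stores the primary key value rather than a physical row pointer. A minimal sketch of that idea follows, using plain Python dictionaries in place of index trees; the table, column names, and class are hypothetical.

```python
class Table:
    """Toy table: dictionaries stand in for the index trees described on the slide."""

    def __init__(self):
        self.rows = {}      # primary key -> full row (the PK index holds the row)
        self.by_email = {}  # secondary index: email -> primary key value

    def insert(self, pk, name, email):
        self.rows[pk] = {"id": pk, "name": name, "email": email}
        self.by_email[email] = pk

    def find_by_email(self, email):
        # Two steps: the secondary index yields the PK, the PK index yields the row.
        pk = self.by_email.get(email)
        return None if pk is None else self.rows[pk]


t = Table()
t.insert(1, "Ada", "ada@example.org")
print(t.find_by_email("ada@example.org"))
```

One consequence of this design is that moving a row inside the primary key structure does not invalidate secondary index entries, since they hold key values and not locations.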

Page 12: Virtuoso Universal Server Overview


RDBMS Features - Run Time Hosting

User Defined Type via Java or .NET Objects Hosted in Process

User Defined Types Persisted in LOB Columns

Java/.NET Methods Called Transparently From SQL

C Based Plugin Mechanism for Adding SQL Functions


Page 13: Virtuoso Universal Server Overview


RDBMS Features - Security

SQL Role Based Security, Column/Table/View/Procedure Level

Row Level Security With Policy Functions

A Policy Function Can Add Extra Conditions to Queries/Updates Depending on User, Time, Other Considerations
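As a rough illustration of the policy function idea, the sketch below adds an extra condition to a query depending on the user. It is written in Python for readability and is not Virtuoso's policy API: the function names, the `orders` table, and the `owner` column are all hypothetical.

```python
def owner_policy(user, table):
    """Hypothetical policy: return an extra condition for this user, or None."""
    if user == "dba":
        return None  # administrators see every row
    if table == "orders":
        return "owner = '" + user.replace("'", "''") + "'"
    return None


def render(table, conditions):
    """Build a SELECT where every condition is parenthesised and ANDed together."""
    sql = "SELECT * FROM " + table
    if conditions:
        sql += " WHERE " + " AND ".join("(" + c + ")" for c in conditions)
    return sql


def secured_query(user, table, conditions=(), policy=owner_policy):
    """Append the policy condition, if any, to the caller's own conditions."""
    extra = policy(user, table)
    combined = list(conditions) + ([extra] if extra else [])
    return render(table, combined)


print(secured_query("alice", "orders", ["total > 10"]))
print(secured_query("dba", "orders"))
```

A policy could equally depend on time of day or other session state; only the user is modelled here.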


Page 14: Virtuoso Universal Server Overview

Data Center Features - Clustering

Combine Multiple Servers for Massive Scale and Parallelism

All Servers Show the Same SQL/RDF Data and Application Logic, A SQL or Web Client Can Connect to Any for the Same Service

Data Partitioning Specifiable Index by Index

Optional Replicated Storage of Partitions for More Load Balancing, Fault Tolerance

Shared Nothing Architecture, Works With Commodity Hardware and Networks
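The partitioning and replication bullets can be sketched as follows. The slide does not say which partitioning function Virtuoso uses, so the CRC32 hash, the node names, and the "next node holds the replica" rule below are assumptions chosen only to make the idea concrete.

```python
import zlib


def partition_of(key, n_partitions):
    """Stable hash partitioning: the same key always maps to the same partition."""
    return zlib.crc32(str(key).encode("utf-8")) % n_partitions


def nodes_for(key, nodes, replicas=1):
    """Nodes holding the key: a primary plus optional replicas on following nodes.

    Assumes replicas <= len(nodes) so that all returned nodes are distinct.
    """
    first = partition_of(key, len(nodes))
    return [nodes[(first + i) % len(nodes)] for i in range(replicas)]


cluster = ["node1", "node2", "node3"]  # hypothetical node names
print(nodes_for("customer:42", cluster, replicas=2))
```

Because every node can compute the same placement from the key alone, a client can connect to any node and the request can still be routed to where the data is.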

Page 15: Virtuoso Universal Server Overview


Data Center Features - Query Parallelization

Latency: One Message Round Trip is 20 Single Row Random Lookups

Virtuoso Divides Queries into Collocated Fragments, Ships All Filtering, Aggregation, Joining to Where the Data Is.

Sends Arrays of Hundreds of Operations at a Time, Whenever Possible
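The arithmetic behind these bullets is easy to work through. Taking the slide's ratio (one message round trip costs as much as 20 single-row random lookups) as the only given, the sketch below compares sending each lookup in its own message with sending them in arrays. The function name and the example sizes are illustrative assumptions.

```python
import math

ROUND_TRIP_COST = 20  # in single-row-lookup units, the ratio stated on the slide


def total_cost(lookups, batch_size):
    """Lookup work plus one message round trip per batch, in lookup units."""
    round_trips = math.ceil(lookups / batch_size)
    return lookups + round_trips * ROUND_TRIP_COST


print(total_cost(1000, 1))    # 21000: every lookup pays a round trip
print(total_cost(1000, 200))  # 1100: five round trips for the same work
```

Under this model, batching 200 operations per message makes the same 1000 lookups roughly 19 times cheaper, which is why shipping arrays of operations matters more than speeding up the individual lookup.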


Page 16: Virtuoso Universal Server Overview


Data Center Features - Transactions

Full ACID Properties

Two Phase Commit with Single Phase Optimization

Detection of Distributed Deadlocks Without Timing Out

Each Cluster Node Keeps Own Transaction Log

No External Monitor, Virtuoso Handles Distributed Recovery Cycle By Itself

Transactions/Logging Can Be Disabled for Bulk Load, etc.


Page 17: Virtuoso Universal Server Overview


Data Center Features - Parallel SQL

Transparent Map-Reduce Style Execution of Specified Partitioned SQL Functions/Procedures

PL Extensions for Async Remote Execution of SQL Code, With and Without Transactional Semantics
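For readers unfamiliar with the map-reduce style mentioned above, here is a generic sketch of the pattern: a function runs on each partition in parallel and the partial results are then combined. It uses Python threads over in-memory lists and is not Virtuoso PL; the helper name and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def map_reduce(partitions, map_fn, reduce_fn, initial):
    """Run map_fn on every partition concurrently, then fold the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(map_fn, partitions))
    result = initial
    for partial in partials:
        result = reduce_fn(result, partial)
    return result


# Each inner list stands in for the rows held by one cluster node.
partitions = [[1, 2, 3], [4, 5], [6]]
print(map_reduce(partitions, sum, lambda a, b: a + b, 0))  # 21
```

The point of the pattern is that the per-partition step runs where the data lives and only small partial results travel between nodes.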


Page 18: Virtuoso Universal Server Overview


Data Center Features - Futures

Dynamic Deployment, Adding and Removing Cluster Nodes Without Interruption of Service

Keeping Data in Small, Self-Contained, Easily Relocatable Mini-Partitions


Page 19: Virtuoso Universal Server Overview


SQL Client Connectivity - Data Access Drivers

Cross Platform ODBC 3.0 Drivers

JDBC 2.0 Drivers

OLE-DB Provider

ADO.NET Provider

XMLA Provider


Page 20: Virtuoso Universal Server Overview

Native Data Management - XML

Native XML Data Type

SQLX + Oracle Compatible XML Functions in SQL

Document Centric Persistence of XML with Special Support in Text Index

XSLT

XQuery

XML Views – XML Mapping Schema based Views of SQL Data Sources

Page 21: Virtuoso Universal Server Overview

Native RDF Data Management

Native RDF Quad Storage (Physical Quads)

SQL Enhanced With RDF IRI and Typed/Language Tagged Data

Bitmap Indices and Key Compression for Compact Storage

Selectable Index Scheme, Optionally Allows Queries Against Union of All Graphs

Optional Full Text Index of Literals

Reuses SQL Cost Model and Execution Engine With RDF Tailored Statistics

Page 22: Virtuoso Universal Server Overview

RDF Data Services – Client Connectivity

SPARQL Protocol

Jena Storage Provider

Sesame Storage Provider

Redland Storage Provider

Linq2Rdf Storage Provider

SPASQL: SPARQL execution within SQL Processor

Plethora of Built-In Functions, Stored Procedures, Web Services
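Of the connectivity options listed, the SPARQL Protocol is the simplest to show without any client library: a query is sent as the `query` parameter of an HTTP request, optionally with `default-graph-uri`. The sketch below only builds the request URL and sends nothing; the endpoint address and graph IRI are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sparql_get_url(endpoint, query, default_graph=None):
    """Build a SPARQL Protocol GET URL; the parameter names come from the W3C spec."""
    params = [("query", query)]
    if default_graph is not None:
        params.append(("default-graph-uri", default_graph))
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and graph; adjust to the server actually in use.
url = sparql_get_url(
    "http://localhost:8890/sparql",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5",
    default_graph="http://example.org/graph",
)
print(url)
print(parse_qs(urlsplit(url).query)["query"][0])  # round-trips to the original query
```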

Page 23: Virtuoso Universal Server Overview

RDF Data Services – SPARQL

Full SPARQL, Language and Protocol Support

Jena Compatible SPARUL for Create Graph, Insert, Update, and Delete

Extensions for Aggregates & Grouping

Nested Queries, SQL-Like Existence and Value Subqueries

Expressions in Result Sets

Path Expressions for Compact Notation, Also in Expressions

Full Text & XPath Magic Predicate Extensions

Page 24: Virtuoso Universal Server Overview

RDF Data Services – Inference

Backward Chaining Inference Support, No Materialization of Entailed Triples needed for:

Subclass and Subproperty Hierarchies

OWL sameAs for Instances, Classes and Properties

OWL equivalentClass and equivalentProperty

Inference Enabled at Query or Individual Triple Pattern Level
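The phrase "no materialization of entailed triples" can be made concrete with a toy subclass example. Instead of storing an inferred type triple for every superclass, the sketch expands the class asked for into its subclasses at query time and matches only the stored triples. The data, class names, and function names are hypothetical, and this illustrates the general technique only, not Virtuoso's inference engine.

```python
def subclasses_of(cls, subclass_of):
    """Return cls plus every direct or indirect subclass, found at query time."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in subclass_of:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def instances_of(cls, type_triples, subclass_of):
    """Match stored type triples against the expanded class set; nothing is added."""
    wanted = subclasses_of(cls, subclass_of)
    return sorted(subject for subject, c in type_triples if c in wanted)


subclass_of = [("Dog", "Mammal"), ("Mammal", "Animal")]  # (subclass, superclass)
types = [("rex", "Dog"), ("flipper", "Mammal"), ("oak", "Plant")]

print(instances_of("Animal", types, subclass_of))  # ['flipper', 'rex']
```

The stored data never changes, so turning inference on or off per query, as the last bullet describes, only changes how the pattern is expanded.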

Page 25: Virtuoso Universal Server Overview

Linked Data Services - RDF-ization Middleware

Declarative RDF Views (or Covers) over SQL Data

In-Built RDF Middleware (Sponger) for RDF-ization of Harvested Web Content (bulk ingest or “on the fly”)

Extended SPARQL Against Mapped and Stored RDF

RDF-ization Cartridges for 30+ non RDF data sources: Used by SPARQL Processor, Used by in-built Content Crawler

Cache Invalidation based on HTTP Caching Rules

Configurable URI dereferencing via pragmas for node selection and path traversal

Page 26: Virtuoso Universal Server Overview


Linked Data Services - Deployment

URL Rewrite Rules combined with SPARQL for flexible association of URIs and RDF Data Sets

Proxy (or wrapper) URIs construction for materializing Linked Data “on the fly” from existing Web information resources

REST or SOAP based Web Services that expose functionality to Web Clients such as OpenLink Data Explorer, Marbles, Zitgist Data Explorer, DISCO, Tabulator etc.
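The first bullet, URL rewrite rules combined with SPARQL, can be sketched as a rule that turns a requested path into a SPARQL DESCRIBE request for the matching resource IRI. The rule table, the `/resource/` path layout, the `example.org` IRIs, and the `/sparql` path are all assumptions for illustration; Virtuoso's own rule configuration is not shown here.

```python
import re
from urllib.parse import quote, unquote

# Hypothetical rule: /resource/<name> describes http://example.org/resource/<name>
RULES = [
    (re.compile(r"^/resource/(?P<name>[A-Za-z0-9_]+)$"),
     "http://example.org/resource/{name}"),
]


def rewrite(path, sparql_path="/sparql"):
    """Return the internal SPARQL URL for a requested path, or None if no rule fits."""
    for pattern, iri_template in RULES:
        match = pattern.match(path)
        if match:
            iri = iri_template.format(**match.groupdict())
            query = "DESCRIBE <" + iri + ">"
            return sparql_path + "?query=" + quote(query, safe="")
    return None


target = rewrite("/resource/Berlin")
print(target)
print(unquote(target))
```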


Page 27: Virtuoso Universal Server Overview

RDF Data Services – RDF Views over SQL Data Sources

SPARQL Data Definition Statements for RDB Mapping

Declare Correspondences Between Graph/Triple Patterns and SQL Objects

Specify Mapping Between URIs and Keys, Supporting All Data Types, Multipart Keys

Not Restricted to Table per Class and Column per Property

Use Arbitrary Joins, SQL Functions and Search Conditions

Automatically Generate Basic Class per Table, Property per Column Mapping of Given SQL Schema
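The basic "class per table, property per column" mapping in the last bullet can be shown with a small sketch over SQLite. It deliberately differs from RDF Views in one respect: it materializes triples from rows to make the correspondence visible, whereas the views described here translate queries into SQL. The IRI prefix, the `product` table, and the function name are hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical IRI prefix


def table_to_triples(conn, table, key_column):
    """Map each row to a subject IRI, the table to a class, each column to a property."""
    cursor = conn.execute('SELECT * FROM "' + table + '"')
    columns = [description[0] for description in cursor.description]
    key_index = columns.index(key_column)
    triples = []
    for row in cursor:
        subject = BASE + table + "/" + str(row[key_index])  # IRI built from the key
        triples.append((subject, "rdf:type", BASE + "schema/" + table))
        for column, value in zip(columns, row):
            if column != key_column and value is not None:
                triples.append((subject, BASE + "schema/" + table + "#" + column, value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.5)")
for triple in table_to_triples(conn, "product", "id"):
    print(triple)
```

The subject IRI is derived from the key, which is the same key-to-URI correspondence the mapping bullets describe; richer mappings would add joins, functions, and multipart keys.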

Page 28: Virtuoso Universal Server Overview

RDF Data Services - RDF Views Contd.

Evaluate Arbitrary SPARQL Against an RDF View

In One Query, Some Graphs May Come from Views, Others From Stored RDF

RDF Views Generate a Single SQL Statement, The IRI Generation and IRI Parsing is Only in Selection and Constant Expressions

SQL Has Full Optimization Possibilities and the Generated SQL Does not Depend on Virtuoso Specifics

Hence, RDF Views Are Efficient for Querying Remote, non-Virtuoso SQL Data

Page 29: Virtuoso Universal Server Overview

RDF Data Services - Clustering

Cluster-Optimized RDF Loader and SPARUL

RDF-Aware Data Partitioning

Automatic Statistics Sampling Across Cluster for Best Query Plan

Page 30: Virtuoso Universal Server Overview

RDF Benchmarks

Bundled With:

TPC-H With SPARQL Extensions and RDF Views

LUBM

Berlin SPARQL Benchmark with Triples and with RDF Views

Page 31: Virtuoso Universal Server Overview


Web Services Platform – HTTP Services

HTTP/1.1 and HTTPS Server for Static and Dynamic Content

Dynamic Web Pages in PHP, Virtuoso SQL Procedures, ASP.NET, Others

SOAP and REST Web Services in Virtuoso PL, Java, .NET

DAV


Page 32: Virtuoso Universal Server Overview

Web Services Platform - WebDAV

Documents Stored in Virtuoso Database

ACL Based plus Unix Style Security, SQL User Accounts and Roles Own Documents and Collections

Automatic RDF Metadata Extraction

Optional Full Text Indexing and Versioning

Dynamic Collections for Alternate Views of Directory Hierarchy

Page 33: Virtuoso Universal Server Overview

Web Services Platform – SOAP & REST

SOAP 1.1/1.2 End Points Exposing SQL Procedures in All SOAP Styles

Automatic WSDL Generation

SQL Extensions for Declaring Full XML Schema Signatures for End Points

Exposing Java and .NET via SOAP

Dynamic Web Pages and XML Functions for REST Services

XMLA for SQL Access over SOAP

Page 34: Virtuoso Universal Server Overview

Web Services Platform - Dynamic Server Pages

Configure a Virtual Directory as Executable

Publish Dynamic Web Pages in PHP, Virtuoso PL, Ruby, Perl, ASP.NET Without Using External Web Server

Page 35: Virtuoso Universal Server Overview


Administration Services

Web Interface for Setup of Web End Points, SQL, XML, RDF Functions

SQL Functions for Full Programmatic Admin Access

Simple Tuning, Only Specify File Layout, Number of Threads, and Amount of Memory to Use


Page 36: Virtuoso Universal Server Overview

Virtuoso RDF Applications

DBpedia

BIO2RDF

Neurocommons

Zitgist, Pingthesemanticweb, Musicbrainz

Page 37: Virtuoso Universal Server Overview

Product

Open Source and Closed Source Versions, Closed Source Adds Virtual Database and Clustering

All Code, Applications, Samples, Docs in Single Download

Minimal Installation Consists of Single Executable + Config File

Web Admin Interface and Bundled ODS Collaborative Apps Suite

Available for All Linux, Unix, Windows, 32 and 64 bit

Available Preinstalled on Amazon EC2, With Optional Preloaded DBpedia, BIO2RDF, Other RDF Data Sets