A PRACTICAL INTRODUCTION TO SADI SEMANTIC WEB SERVICES AND HYDRA QUERY TOOL Alexandre Riazanov, CTO IPSNP Computing Inc Oslo University, Sep 23, 2015

Upload: alexandre-riazanov

Post on 12-Apr-2017


Page 1: A practical introduction to SADI semantic Web services and HYDRA query tool

A PRACTICAL INTRODUCTION TO

SADI SEMANTIC WEB SERVICES

AND HYDRA QUERY TOOL

Alexandre Riazanov, CTO, IPSNP Computing Inc

Oslo University, Sep 23, 2015

Page 2: A practical introduction to SADI semantic Web services and HYDRA query tool

PLAN OF THE TALK

• A brief reminder of the previous episode: data federation with SADI and HYDRA.

• RDF and OWL as syntactic foundations of service I/O and functionality descriptions.

• Query execution with automatic service discovery and reasoning.

• Resource publishing process with SADI, with a detailed practical example (time permitting).

Page 3: A practical introduction to SADI semantic Web services and HYDRA query tool

DATA FEDERATION: QUERYING MULTIPLE HETEROGENEOUS SOURCES AS A SINGLE DB

Page 4: A practical introduction to SADI semantic Web services and HYDRA query tool

QUERY EXAMPLES

• Find the names of drugs that contain chemical category Y as active ingredients.

• Find documents mentioning enzyme activity X, extract info on protein mutations and visualize mutations on 3D structure.

• Annotate a DNA sequence X with molecular functions of proteins produced by the corresponding gene.

• Find patients with precondition X diagnosed with infections Y resulting from procedure Z.

• Find patients diagnosed with X while taking drug C.

Page 5: A practical introduction to SADI semantic Web services and HYDRA query tool

HOW WE DO IT WITH HYDRA AND SADI SEMANTIC WEB SERVICES

Page 6: A practical introduction to SADI semantic Web services and HYDRA query tool

A HIGH LEVEL VIEW OF THE HYDRA APPROACH

● Given a SPARQL query, HYDRA analyses it by using an intelligent logic-based algorithm (proprietary, unlike SADI itself).

● HYDRA requests descriptions of potentially useful services from available SADI service registries.

● HYDRA processes the descriptions and figures out which services have to be invoked, on what data and in what order.

SPARQL is a W3C standard semantic query language -- much more intuitive than SQL.

Page 7: A practical introduction to SADI semantic Web services and HYDRA query tool

HOW IS THIS ALL POSSIBLE?

• Key ingredient: the SADI framework for Semantic Web services (Semantic Automated Discovery and Integration).

• SADI services are:

– RESTful services

– consuming and producing one format -- RDF,

– with semantic descriptions (in OWL) fully defining their functionality.

Page 8: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: RDF

• W3C RDF = Resource Description Framework

• Standardised graph-based data model and a few standard rendering formats.

• Nodes = objects (URIs) and data values like “abc”^^xsd:string or “123”^^xsd:integer.

• Edges: binary relations.

Page 9: A practical introduction to SADI semantic Web services and HYDRA query tool

RDF EXAMPLES

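@prefix mt: <http://localhost:8080/medical_terminology.owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/patient#1234> rdf:type mt:Patient .
<http://example.com/patient#1234> mt:has_mass _:hm .
_:hm rdf:type mt:Measurement .
_:hm mt:has_value "92.0"^^xsd:float .
_:hm mt:has_units mt:kg .

The same kind of description in the more compact Turtle notation:

@prefix mt: <http://localhost:8080/medical_terminology.owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/patient#1234> a mt:Person ;
    mt:has_mass [ a mt:Measurement ;
                  mt:has_value "92.0"^^xsd:float ;
                  mt:has_units mt:kg ] .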

The original XML-based rendering format is also popular.
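As a rough illustration of the graph data model, the patient description above can be held in plain Python as a set of (subject, predicate, object) triples. This is only a sketch: the tuple encoding, the constants and the `objects` helper are assumptions made for illustration and do not come from any RDF library.

```python
# Sketch: an RDF graph as a set of (subject, predicate, object) triples.
# The encoding below is an illustrative assumption, not a real RDF API.
MT = "http://localhost:8080/medical_terminology.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"

patient = "http://example.com/patient#1234"
hm = "_:hm"  # blank node standing for the mass measurement

graph = {
    (patient, RDF_TYPE, MT + "Patient"),
    (patient, MT + "has_mass", hm),
    (hm, RDF_TYPE, MT + "Measurement"),
    (hm, MT + "has_value", ("92.0", XSD_FLOAT)),  # typed literal as (lexical form, datatype)
    (hm, MT + "has_units", MT + "kg"),
}


def objects(graph, subject, predicate):
    """Return every o such that (subject, predicate, o) is an edge of the graph."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


# Follow two edges: patient --has_mass--> measurement --has_value--> literal.
mass_node = objects(graph, patient, MT + "has_mass")[0]
lexical, datatype = objects(graph, mass_node, MT + "has_value")[0]
print(float(lexical), datatype.rsplit("#", 1)[1])
```

Each tuple is one edge of the graph; the blank node `_:hm` plays the role of the bracketed `[ ... ]` group in the compact notation.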

Page 10: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: OWL

• W3C OWL = Web Ontology Language

• Essentially, extends RDF with definitions and other axioms for classes (types of objects) and properties (binary relations).

• Most useful axiom types -- class and property hierarchies:

Patient subClassOf Person

loves subPropertyOf knows

• SADI reuses property restriction syntax:

has_MRN exactly 1 string

Page 11: A practical introduction to SADI semantic Web services and HYDRA query tool

SADI SERVICE I/O

• Input: RDF description of an input object.

• Output: another RDF graph providing more (computed or retrieved) info about the input object or linking it to other objects.

• Since all SADI services “talk the same language” (RDF), they are 100% syntactically interoperable:

– the output of one SADI service can be directly consumed by any other SADI service.

Page 12: A practical introduction to SADI semantic Web services and HYDRA query tool

COMPLETE SEMANTIC DESCRIPTIONS OF SERVICE FUNCTIONALITY

SADI services publish semantic descriptions of their I/O that completely define what the service expects and can accept as input, and what RDF assertions the service can output.

• Unique and extremely powerful property: it facilitates completely automatic discovery and orchestration of services.

Page 13: A practical introduction to SADI semantic Web services and HYDRA query tool

Example: computeBMI service I/O

Page 14: A practical introduction to SADI semantic Web services and HYDRA query tool

SEMANTIC FUNCTIONALITY DESCRIPTION

• OWL syntax is repurposed to define what RDF graphs are acceptable as input, and what RDF graphs may be produced in the output.

• Input(computeBMI) = Person and (has_height exactly 1 (Measurement and (has_value exactly 1 float)))

• Output(computeBMI) = has_BMI exactly 1 float

Page 15: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE INPUT CLASS

• Specifies what kind of objects (RDF descriptions) the service expects in the input. OWL syntax is convenient for such definitions.

• Almost always just an enumeration of attributes of the input objects the SADI service expects.

● If the input class is defined as

Person and
    (has_height exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {m}))) and
    (has_mass exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {kg})))

… the service expects something like this in the input:

patient1234 a Person ;
    has_height [ a Measurement ; has_value "1.7"^^xsd:float ; has_units m ] ;
    has_mass [ a Measurement ; has_value "92.0"^^xsd:float ; has_units kg ] .
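To make the "enumeration of attributes" reading concrete, here is a minimal sketch that tests whether a node in a small triple set has the shape required by the computeBMI input class. It is a simplification under stated assumptions: the triple encoding and helper names are invented for illustration, and "exactly 1" is checked by simply counting the triples present (a closed-world shortcut, whereas OWL itself uses open-world semantics).

```python
# Sketch: does a node look like a valid computeBMI input?
# Triples are (subject, property, object); "a" stands for rdf:type.
graph = {
    ("patient1234", "a", "Person"),
    ("patient1234", "has_height", "_:h"),
    ("_:h", "a", "Measurement"),
    ("_:h", "has_value", 1.7),
    ("_:h", "has_units", "m"),
    ("patient1234", "has_mass", "_:m"),
    ("_:m", "a", "Measurement"),
    ("_:m", "has_value", 92.0),
    ("_:m", "has_units", "kg"),
}


def values(graph, node, prop):
    """All objects attached to node through prop."""
    return [o for (s, p, o) in graph if s == node and p == prop]


def is_measurement(graph, node, unit):
    """Measurement and (has_value exactly 1 float) and (has_units exactly 1 {unit})."""
    vals = values(graph, node, "has_value")
    return (
        "Measurement" in values(graph, node, "a")
        and len(vals) == 1
        and isinstance(vals[0], float)
        and values(graph, node, "has_units") == [unit]
    )


def is_compute_bmi_input(graph, node):
    """Person with exactly one height in m and exactly one mass in kg."""
    heights = values(graph, node, "has_height")
    masses = values(graph, node, "has_mass")
    return (
        "Person" in values(graph, node, "a")
        and len(heights) == 1
        and is_measurement(graph, heights[0], "m")
        and len(masses) == 1
        and is_measurement(graph, masses[0], "kg")
    )


print(is_compute_bmi_input(graph, "patient1234"))
```

Removing any one of the nine triples, for example the `has_units kg` edge, makes the check fail, which mirrors the idea that the input class lists everything the service needs to see.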

Page 16: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE OUTPUT CLASS

• A SADI service advertises itself by publishing its output class specifying what the service promises to produce as the output.

• The class must enumerate attributes that the service will add to the input object. This fully semantically defines what the service does!

● If the output class is defined as

has_BMI exactly 1 float

… service clients can expect something like this in the output:

patient1234 has_BMI "31.83"^^xsd:float
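The value 31.83 in this example follows from the height and mass used earlier (1.7 m, 92.0 kg). A minimal sketch of the computation is shown below; the function names and the triple-shaped return value are assumptions for illustration and are not the talk's actual service code.

```python
# Sketch of the computeBMI business logic (illustrative, not the original service).

def compute_bmi(height_m: float, mass_kg: float) -> float:
    """BMI = mass in kg divided by the square of the height in m."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / height_m ** 2


def compute_bmi_service(subject: str, height_m: float, mass_kg: float):
    """Return the single statement promised by the output class: subject has_BMI value."""
    return (subject, "has_BMI", round(compute_bmi(height_m, mass_kg), 2))


print(compute_bmi_service("patient1234", 1.7, 92.0))
# prints ('patient1234', 'has_BMI', 31.83)
```

The service adds exactly one attribute to the input object, which is all the output class promises.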

Page 17: A practical introduction to SADI semantic Web services and HYDRA query tool

DIGRESSION: SPARQL

• W3C SPARQL - standard query language for the RDF data model.

• SPARQL clients are programs that execute SPARQL queries, typically on RDF triplestores.

PREFIX mt: <http://localhost:8080/medical_terminology.owl#>
SELECT ?mass {
    <http://example.com/patient#1234> a mt:Person ;
        mt:has_mass [ a mt:Measurement ; mt:has_value ?mass ; mt:has_units mt:kg ] .
}
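What a SPARQL client does with such a query can be approximated by matching triple patterns against a graph and collecting variable bindings. The sketch below is a toy matcher written for this purpose, not a real SPARQL engine; the short names in the data and the `?m` variable (standing in for the bracketed blank node of the query) are assumptions.

```python
# Sketch: naive matching of triple patterns, the core of SPARQL SELECT evaluation.
GRAPH = {
    ("patient1234", "a", "Person"),
    ("patient1234", "has_mass", "_:hm"),
    ("_:hm", "a", "Measurement"),
    ("_:hm", "has_value", "92.0"),
    ("_:hm", "has_units", "kg"),
}


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None if impossible."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def match(patterns, graph, binding=None):
    """Yield every binding under which all patterns are triples of the graph."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], graph, extended)


QUERY = [
    ("patient1234", "a", "Person"),
    ("patient1234", "has_mass", "?m"),
    ("?m", "a", "Measurement"),
    ("?m", "has_value", "?mass"),
    ("?m", "has_units", "kg"),
]

masses = [b["?mass"] for b in match(QUERY, GRAPH)]
print(masses)  # prints ['92.0']
```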

• HYDRA is also a SPARQL client, but for virtual RDF DBs.

Page 18: A practical introduction to SADI semantic Web services and HYDRA query tool

AUTOMATIC SERVICE DISCOVERY

• With the I/O descriptions, a sufficiently intelligent client can figure out that it can call the service if the client has to satisfy a query condition like this:

patient1234 has_BMI ?bmi_value

• The query condition suggests that a service with has_BMI in the output may be useful if called on the object patient1234

• To make the call, the client must have enough information about patient1234 : according to the input class, has_height and has_mass must be attached to it and sent to the service.
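The two discovery steps described above (find services whose output mentions the wanted property, then check that enough is known about the object to call them) can be sketched as follows. The registry layout, the field names `needs` and `attaches`, and the second registry entry are hypothetical; real SADI registries hold OWL descriptions, and HYDRA's own algorithm is proprietary.

```python
# Sketch: discovering a service from a query condition such as
#   patient1234 has_BMI ?bmi_value
REGISTRY = [
    {"name": "computeBMI", "needs": {"has_height", "has_mass"}, "attaches": {"has_BMI"}},
    # Hypothetical second entry, present only so the lookup has something to skip.
    {"name": "someOtherService", "needs": {"has_name"}, "attaches": {"has_address"}},
]


def discover(registry, wanted_property):
    """Services whose output class mentions the wanted property."""
    return [s for s in registry if wanted_property in s["attaches"]]


def can_call(service, known_properties):
    """True if everything the input class asks for is already known."""
    return service["needs"] <= set(known_properties)


known_about_patient = {"has_height", "has_mass"}
for service in discover(REGISTRY, "has_BMI"):
    print(service["name"], can_call(service, known_about_patient))
```

With only `has_height` known, `can_call` would return False, and the client would have to look for another service that can supply `has_mass` first.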

Page 19: A practical introduction to SADI semantic Web services and HYDRA query tool

QUERY, EXECUTION, ANSWERS

Query:

SELECT ?bmi_value
FROM <.......rdf>   # seed data
{ patient1234 a Person; has_BMI ?bmi_value }

Execution (HYDRA):

● the seed data in the FROM clause describes the heights and weights of some people, including patient1234, using has_height and has_mass;

● since has_BMI appears in the query, HYDRA looks for all services in the available registries that can attach has_BMI and finds computeBMI;

● patient1234 satisfies the input condition of computeBMI, so HYDRA calls it;

● computeBMI returns patient1234 has_BMI “32.3”, so HYDRA can return an answer: ?bmi_value = “32.3”

Page 20: A practical introduction to SADI semantic Web services and HYDRA query tool

MULTIPLE SERVICES

• Suppose we don’t know the patient’s height and mass, but can retrieve them from a DB by the patient’s medical record number (MRN).

• We write another SADI service, patientInfo:

Output(patientInfo) =
    (has_height exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {m}))) and
    (has_mass exactly 1 (Measurement and (has_value exactly 1 float) and (has_units exactly 1 {kg})))

Input(patientInfo) = Person and (has_MRN exactly 1 string)

Page 21: A practical introduction to SADI semantic Web services and HYDRA query tool

AUTOMATIC SERVICE COMPOSITION

• HYDRA can figure out automatically that the output of patientInfo can be submitted to computeBMI, and that the composition of the two services can answer the query:

SELECT ?bmi_value { ?patient a Person ; has_MRN "1234" ; has_BMI ?bmi_value }

(no has_height or has_mass anywhere!)
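One simple way to picture the composition is backward chaining: start from the property the query asks for and work back until only known properties are needed. The sketch below does this over a two-entry service table. It is an illustrative assumption, not HYDRA's actual (proprietary) algorithm, and the `needs`/`attaches` fields are invented shorthand for the input and output classes.

```python
# Sketch: ordering service calls by backward chaining from the wanted property.
SERVICES = {
    "patientInfo": {"needs": {"has_MRN"}, "attaches": {"has_height", "has_mass"}},
    "computeBMI": {"needs": {"has_height", "has_mass"}, "attaches": {"has_BMI"}},
}


def plan(goal, known, services, pending=frozenset()):
    """Return service names to call, in order, so that `goal` becomes known.

    Returns [] if the goal is already known and None if no chain exists.
    """
    if goal in known:
        return []
    for name, service in services.items():
        if goal not in service["attaches"] or name in pending:
            continue
        steps, have, feasible = [], set(known), True
        for need in sorted(service["needs"]):
            sub_plan = plan(need, have, services, pending | {name})
            if sub_plan is None:
                feasible = False
                break
            for step in sub_plan:
                if step not in steps:
                    steps.append(step)
                have |= services[step]["attaches"]
        if feasible:
            return steps + [name]
    return None


print(plan("has_BMI", {"has_MRN"}, SERVICES))
# prints ['patientInfo', 'computeBMI']
```

The query mentions only `has_MRN` and `has_BMI`; the intermediate `has_height` and `has_mass` appear solely in the service descriptions, which is what lets the client chain the two calls.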

Page 22: A practical introduction to SADI semantic Web services and HYDRA query tool

INTELLIGENT (REASONING-ENABLED) QUERY EXECUTION

● Some queries are too complex unless generality can be exploited:

➢ For example, a query concerning all antibiotics requires generalisation; otherwise, every type of antibiotic would have to be enumerated in the query.

● A much better way to do this is to import a classification of drugs and use it in query execution.

● HYDRA facilitates such reasoning and even more complex reasoning with rules.

Page 23: A practical introduction to SADI semantic Web services and HYDRA query tool

(TINY) REASONING EXAMPLE

The query defines ?patient as a Patient instead of a Person:

?patient a Patient ; has_MRN "1234" ; ...

● HYDRA is still able to call patientInfo on the Patient instance, say patient1234, if there is an axiom Patient subClassOf Person. It infers patient1234 a Person, which can be used as input to patientInfo.

● The axiom can be included in the definition of Output(patientInfo), or specified separately.
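The inference step in this tiny example can be sketched in a few lines: compute every class an individual belongs to by following subClassOf axioms, then test the service's input condition against the inferred types. The data structures and function names are assumptions made for illustration; a real client would use an OWL reasoner.

```python
# Sketch: using "Patient subClassOf Person" to accept a Patient as patientInfo input.
SUBCLASS_OF = {("Patient", "Person")}
PATIENT_INFO_INPUT = {"type": "Person", "needs": {"has_MRN"}}  # simplified input class


def inferred_types(asserted, subclass_of):
    """All classes an individual belongs to, following subClassOf axioms to a fixpoint."""
    types = set(asserted)
    changed = True
    while changed:
        changed = False
        for sub, sup in subclass_of:
            if sub in types and sup not in types:
                types.add(sup)
                changed = True
    return types


def accepts(input_class, asserted_types, properties, subclass_of):
    """True if the individual has the required (possibly inferred) type and properties."""
    return (
        input_class["type"] in inferred_types(asserted_types, subclass_of)
        and input_class["needs"] <= set(properties)
    )


# patient1234 is asserted to be a Patient only.
print(accepts(PATIENT_INFO_INPUT, {"Patient"}, {"has_MRN"}, SUBCLASS_OF))  # prints True
print(accepts(PATIENT_INFO_INPUT, {"Patient"}, {"has_MRN"}, set()))        # prints False
```

Without the axiom the second call fails, which is why the axiom has to be available to the client at query time.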

Page 24: A practical introduction to SADI semantic Web services and HYDRA query tool

RESOURCE PUBLISHING WITH SADI (1)

• Specify the source of data / software you want to publish with SADI. For example, a patient database and a BMI computation algorithm.

• Model data semantically: find ontologies describing your domains and decide how your data will be expressed in the terms of these ontologies.

Page 25: A practical introduction to SADI semantic Web services and HYDRA query tool

RESOURCE PUBLISHING WITH SADI (2)

• Define your services' I/O semantically: decide how to describe the operation of your services in the terms of the domain ontologies, i.e., what will be written in the input and output classes.

• Code the business logic of your services in Java, Perl or Python. If a service wraps a DB, convert the input RDF into a query and the query results back to RDF. The coding effort is usually tiny compared to the modelling.

• Overall development costs may be considerable, but this cost is well amortized because SADI services are highly reusable, due to their unprecedented degree of interoperability and discoverability.
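The "wrap a DB" step (input RDF to query, query results back to RDF) can be sketched for a CSV-backed patientInfo service as below. Everything specific here is an assumption for illustration: the column names, the sample rows, the triple encoding and the blank-node naming. The talk's own implementation is a Java servlet whose code is not reproduced in this transcript.

```python
# Sketch: a patientInfo-style lookup that turns has_MRN triples into
# has_height / has_mass descriptions, using an in-memory CSV.
import csv
import io

# Hypothetical sample data; the real patientsDB.csv layout is not shown here.
PATIENTS_CSV = """mrn,name,height_m,mass_kg
1234,Alice Example,1.7,92.0
5678,Bob Example,1.8,75.0
"""


def load_patients(text):
    """Index CSV rows by medical record number."""
    return {row["mrn"]: row for row in csv.DictReader(io.StringIO(text))}


def patient_info(input_triples, patients):
    """For each (subject, has_MRN, mrn) triple, emit measurement triples for that subject."""
    output = []
    for subject, prop, value in input_triples:
        if prop != "has_MRN" or value not in patients:
            continue  # nothing to look up for this triple
        row = patients[value]
        for prop_name, column, unit in (
            ("has_height", "height_m", "m"),
            ("has_mass", "mass_kg", "kg"),
        ):
            node = f"_:{column}_{value}"  # blank node for the measurement
            output += [
                (subject, prop_name, node),
                (node, "a", "Measurement"),
                (node, "has_value", float(row[column])),
                (node, "has_units", unit),
            ]
    return output


patients = load_patients(PATIENTS_CSV)
result = patient_info([("patient1234", "a", "Person"), ("patient1234", "has_MRN", "1234")], patients)
for triple in result:
    print(triple)
```

The output attaches new statements to the same subject that came in, so a downstream service such as computeBMI can consume it directly.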

Page 26: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (1)

● Specify the source of data / software you want to publish with SADI.

➢Database (CSV file) containing patient MRN, name, height, weight, etc. We will use it to implement patientInfo.

➢BMI computation algorithm: BMI = mass (kg) / (height (m))^2.

Page 27: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (2)

● Model data semantically: find ontologies describing your domains and decide how your data will be expressed in the terms of these ontologies.

➢ Create ontology clinical_terms.owl in Protégé:

➢ Classes: Person, Patient, Measurement, Units

➢ Properties: has_BMI, has_MRN, has_height, has_mass, has_value, has_units.

➢ Individuals: m, kg.

➢ RDF data sample:

patient1234 a Patient ;
    has_MRN "1234"^^xsd:string ;
    has_height [ a Measurement ; has_value "1.7"^^xsd:float ; has_units m ] ; . . .

Page 28: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (3)

Background ontology medical_terminology.owl

Deploy:

cp medical_terminology.owl /var/lib/tomcat7/webapps/ROOT/

URL: http://localhost:8080/medical_terminology.owl

Page 29: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (4)

● Define your services' I/O semantically: decide how to describe the operation of your services in the terms of the domain ontologies, i.e., what will be written in the input and output classes.

➢ I/O ontologies: patientInfo.owl and computeBMI.owl, importing medical_terminology.owl

Page 30: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (5)

● Code the business logic of your services in Java, Perl or Python.

➢There is a good open-source Java library for creating SADI services as Java Servlets.

➢Skeleton code for a service is generated automatically; we just have to fill in the body of one method.

➢The library takes care of all the HTTP connectivity issues, parses the input RDF to a simple abstract representation (Jena), and renders the output RDF.

➢The compiled WAR file can be immediately deployed on a servlet container (Tomcat, Jetty, etc).

➢SADI services take only 10-15 min to code (if the business logic is simple or already programmed).

Page 31: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (6)

Edit pom.xml and run service skeleton creation plug-in:

Page 32: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (7)

Just add your business logic code in processInput():

Page 33: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (8)

Source database patientsDB.csv:

Page 34: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (9)

Finished processInput() for service patientInfo :

Page 35: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (10)

Finished processInput() for service computeBMI :

Page 36: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (11)

• Deploy the services:

cp target/my-sadi-services.war /var/lib/tomcat7/webapps/

• Test service description availability (HTTP GET):

Page 37: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (12)

Test RDF for the services:

Page 38: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (13)

Service test runs with HTTP POST:
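The screenshots for this step are not part of the transcript. As a rough, hedged sketch of what such a test run involves, the snippet below prepares an HTTP POST carrying a small RDF/XML document using only the Python standard library. The endpoint URL (including the servlet path), the choice of RDF/XML as the payload format, and the sample input are all assumptions; substitute whatever your deployment actually exposes. The request is built but not sent, so the snippet runs without a server.

```python
# Sketch: preparing an HTTP POST to a SADI-style service with the standard library.
import urllib.request

# Hypothetical endpoint: the context path follows the WAR name from the
# deployment step; the servlet path is a guess.
SERVICE_URL = "http://localhost:8080/my-sadi-services/patientInfo"

# Hypothetical input: one Person identified by an MRN, serialised as RDF/XML.
INPUT_RDF = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:mt="http://localhost:8080/medical_terminology.owl#">
  <mt:Person rdf:about="http://example.com/patient#1234">
    <mt:has_MRN rdf:datatype="http://www.w3.org/2001/XMLSchema#string">1234</mt:has_MRN>
  </mt:Person>
</rdf:RDF>
"""


def build_invocation(url, rdf_xml):
    """Build (but do not send) a POST request carrying the input RDF."""
    return urllib.request.Request(
        url,
        data=rdf_xml,
        headers={"Content-Type": "application/rdf+xml", "Accept": "application/rdf+xml"},
        method="POST",
    )


def invoke(request):
    """Send the request and return the response body; needs a running service."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


request = build_invocation(SERVICE_URL, INPUT_RDF)
print(request.get_method(), request.full_url)
```

Calling `invoke(request)` against a deployed service would return the output RDF; a plain GET on the same URL corresponds to the service description check from the previous step.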

Page 39: A practical introduction to SADI semantic Web services and HYDRA query tool

PRACTICAL EXAMPLE (14)

Running HYDRA command line application:

Page 40: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA PACKAGING

• Java API - can be embedded in something else.

• Command line application - convenient for small experiments.

• Web service (Java servlet) with

– JSON-based protocol

– Java client-side API.

Page 41: A practical introduction to SADI semantic Web services and HYDRA query tool

REMEMBER OUR BIG VISION?

Page 42: A practical introduction to SADI semantic Web services and HYDRA query tool

BIGGER VISION: SELF-SERVICE AD HOC QUERYING OF FEDERATED DATA

Page 43: A practical introduction to SADI semantic Web services and HYDRA query tool

THERE ARE NO FUNDAMENTAL OBSTACLES TO SELF-SERVICE QUERYING BECAUSE ...

● HYDRA implements semantic querying:

○ users need not know how the source data is organised or accessed.

● HYDRA can apply concept hierarchies and rules:

○ syntactically simple queries for complex questions.

We just need an adequate user interface for building queries.

Page 44: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA QUERY COMPOSITION GUI PRINCIPLES

● Queries are rendered as highly readable graphs.

● A lot of query composition is done by entering keyphrases in English;

○ HYDRA GUI suggests (sub)graphs implementing a given keyphrase.

● Nodes can be deleted/added manually;

○ the system suggests possibilities (navigation).

Page 45: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI SCREENSHOTS

Page 46: A practical introduction to SADI semantic Web services and HYDRA query tool

READABLE QUERY DESCRIPTION

Page 47: A practical introduction to SADI semantic Web services and HYDRA query tool

EMPTY CANVAS

Page 48: A practical introduction to SADI semantic Web services and HYDRA query tool

SERVICE REGISTRY

Note that we added allPatients that enumerates all patients with their MRN.

Page 49: A practical introduction to SADI semantic Web services and HYDRA query tool

KEYPHRASE INPUT

Page 50: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI PROPOSES QUERY GRAPHS

Page 51: A practical introduction to SADI semantic Web services and HYDRA query tool

THE USER CAN CONFIRM THE WHOLE GRAPH OR SOME PARTS OF IT

Page 52: A practical introduction to SADI semantic Web services and HYDRA query tool

ADDING MNEMONIC VARIABLE NAME

Page 53: A practical introduction to SADI semantic Web services and HYDRA query tool

MNEMONIC VARIABLE NAME ADDED

Page 54: A practical introduction to SADI semantic Web services and HYDRA query tool

MORE KEYPHRASE INPUT

Page 55: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI PROPOSES GRAPH AUGMENTATIONS

Page 56: A practical introduction to SADI semantic Web services and HYDRA query tool

VARIABLE NAME

Page 57: A practical introduction to SADI semantic Web services and HYDRA query tool

VARIABLE NAME ADDED

Page 58: A practical introduction to SADI semantic Web services and HYDRA query tool

MANUALLY ADDING RELATIONS

A numeric comparison (<) is used here, but it could be any kind of relation.

Page 59: A practical introduction to SADI semantic Web services and HYDRA query tool

EXTENDED GRAPH

Page 60: A practical introduction to SADI semantic Web services and HYDRA query tool

SPECIFYING A DATA VALUE

Page 61: A practical introduction to SADI semantic Web services and HYDRA query tool

EXTENDED GRAPH

The query is ready. It finds all patients with 20 < BMI < 30 and outputs their BMI values and MRNs.

Page 62: A practical introduction to SADI semantic Web services and HYDRA query tool

HYDRA GUI GENERATES SPARQL FROM QUERY GRAPHS
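The generated SPARQL itself is only shown in a screenshot. To give a feel for the idea, here is a minimal sketch that turns a list of graph edges and comparison filters into SPARQL text for the "20 < BMI < 30" query built in the preceding slides. The function, the edge encoding and the exact property names are assumptions for illustration; this is not the HYDRA GUI's generator.

```python
# Sketch: rendering a small query graph (edges plus filters) as SPARQL text.
PREFIX = "PREFIX mt: <http://localhost:8080/medical_terminology.owl#>"


def to_sparql(edges, filters, select):
    """Each edge (s, p, o) becomes a triple pattern; each filter a FILTER clause."""
    lines = [PREFIX, "SELECT " + " ".join(select), "WHERE {"]
    lines += [f"  {s} {p} {o} ." for s, p, o in edges]
    lines += [f"  FILTER ({expr})" for expr in filters]
    lines.append("}")
    return "\n".join(lines)


edges = [
    ("?patient", "a", "mt:Patient"),
    ("?patient", "mt:has_MRN", "?mrn"),
    ("?patient", "mt:has_BMI", "?bmi"),
]
filters = ["?bmi > 20", "?bmi < 30"]

query = to_sparql(edges, filters, ["?mrn", "?bmi"])
print(query)
```

Nodes of the graph map to variables or constants, edges to triple patterns, and the manually added comparison relations to FILTER expressions.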

Page 63: A practical introduction to SADI semantic Web services and HYDRA query tool

EXECUTING THE QUERY

Page 64: A practical introduction to SADI semantic Web services and HYDRA query tool

ANSWERS

Page 65: A practical introduction to SADI semantic Web services and HYDRA query tool

SAVING THE ANSWERS AS AN EXCEL SPREADSHEET

Page 66: A practical introduction to SADI semantic Web services and HYDRA query tool

THANK YOU!

Further materials/services are available on request:

• Live and recorded demos.

• Publications on previous (academic) case studies.

• Training/consulting.

• http://ipsnp.com/ (Canada) and http://ipsnp.co/ (UK)