![Page 1: From Web 1.0 Web 3.0: Is RDF access to RDB enough? Vipul Kashyap vkashyap1@partners.org Senior Medical Informatician, Clinical Informatics R&D Partners](https://reader030.vdocuments.mx/reader030/viewer/2022032803/56649e2d5503460f94b1cf22/html5/thumbnails/1.jpg)
From Web 1.0 to Web 3.0: Is RDF access to RDB enough?
Vipul Kashyap, vkashyap1@partners.org
Senior Medical Informatician, Clinical Informatics R&D, Partners Healthcare System
Martin Flanagan, [email protected]
CTO, InSilico Discovery
W3C Workshop on RDF Access to Relational Databases, October 26th, 2007
Outline
• Position
• Use Case Scenario
• Solution Approach
• A Generalized Framework for RDF Access
• Next Steps:
— Proposed Roadmap
— Research Topics
Position
There is a need for a generalized framework (format, representation language, algebra?) for RDF access to:
(A) Relational Databases
(B) Tabular Data Sources, e.g., Excel Spreadsheets
(C) Web Services
Motivation:
(A) Large amounts of “tabular” data and an increasing number of web services in the Healthcare and Life Sciences
(B) Learn from the relational database success story: Declarative query language + Algebra + Opportunities for optimization
(C) Potential for providing incremental value, increasing the adoption and acceptance of the Semantic Web.
Use Case Scenario: Biological Explanations for Statistical Correlations
• What is the location of a given gene, e.g., CPNE1, on the Human Genome?
Data Repository: NCBI Entrez
Access Mechanism: Web Services
• For what gene(s) is a given SNP, e.g., rs6060535, in the upstream regulatory region?
Data Repository: RDBMS containing dbSNP and regulatory region data
Access Mechanism: JDBC/SQL
• What genes have been found to be "coexpressed" with CPNE1, and in what study?
Data Repository: Excel Spreadsheet containing the co-expression patterns of various genes in various studies
Access Mechanism: .NET API, MS Office API
Solution Approach
• Ontology based RDF query specification
• Mapping Framework
— Relational Databases
— Excel Spreadsheets
— Web Services
• Query Translations and Execution
Illustrations of a working system based on the Semantic Discovery System by InSilico Discovery (http://www.insilicodiscovery.com)
Ontology based RDF Query Specification
SPARQL Query Generated:
prefix example: <http://www.semanticdiscoverysystems.com/Example.owl#>
prefix ns: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
select distinct ?v0 ?v1
where {
  ?v0 ns:type example:gene .
  ?v0 example:has_gene_region ?v1 .
  ?v0 example:gname 'CPNE'
}
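To make the semantics of the generated query concrete, the following is a minimal sketch of how its three triple patterns could be evaluated against a handful of in-memory triples. It is not how the Semantic Discovery System evaluates queries; the sample triples (`gene1`, `region1`, and so on) and the function names are assumptions made for illustration, and only the class and property names come from the query itself.

```python
# Namespaces taken from the query; the sample triples below are hypothetical.
EX = "http://www.semanticdiscoverysystems.com/Example.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

TRIPLES = [
    (EX + "gene1", RDF_TYPE, EX + "gene"),
    (EX + "gene1", EX + "gname", "CPNE"),
    (EX + "gene1", EX + "has_gene_region", EX + "region1"),
    (EX + "gene2", RDF_TYPE, EX + "gene"),
    (EX + "gene2", EX + "gname", "OTHER"),
    (EX + "gene2", EX + "has_gene_region", EX + "region2"),
]

# The three triple patterns of the query; terms starting with "?" are variables.
QUERY = [
    ("?v0", RDF_TYPE, EX + "gene"),
    ("?v0", EX + "has_gene_region", "?v1"),
    ("?v0", EX + "gname", "CPNE"),
]


def match_pattern(pattern, triple, binding):
    """Return `binding` extended so that `pattern` matches `triple`, or None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or reject if it is already bound differently.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def evaluate(patterns, triples):
    """Evaluate a conjunction of triple patterns by nested-loop matching."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


def select_distinct(variables, patterns, triples):
    """Project the solutions onto `variables` and drop duplicate rows."""
    rows = {tuple(b[v] for v in variables) for b in evaluate(patterns, triples)}
    return sorted(rows)


RESULT = select_distinct(["?v0", "?v1"], QUERY, TRIPLES)
```

With the sample data, only `gene1` satisfies all three patterns, so `RESULT` holds a single row pairing it with `region1`.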
Mapping to Relational Databases
Mapping to Oracle Databases
Mapping to Gene Names Mediator Class
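The slide shows the mapping only as a screenshot, so here is a rough sketch of the general idea of exposing relational rows as RDF-style triples. It uses an in-memory SQLite database in place of Oracle, and the table `gene_names`, its columns, the `MAPPING` dictionary, and the subject-naming scheme are all assumptions for illustration, not the mediator definitions used by the system.

```python
import sqlite3

EX = "http://www.semanticdiscoverysystems.com/Example.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: one table maps to one class, columns map to properties.
MAPPING = {
    "table": "gene_names",
    "key_column": "gene_id",
    "class": EX + "gene",
    "columns": {"name": EX + "gname"},
}


def rows_to_triples(conn, mapping):
    """Turn every row of the mapped table into (subject, predicate, object) triples."""
    columns = [mapping["key_column"], *mapping["columns"]]
    # Identifiers come from the trusted mapping above, not from user input.
    sql = (
        f"SELECT {', '.join(columns)} FROM {mapping['table']} "
        f"ORDER BY {mapping['key_column']}"
    )
    triples = []
    for row in conn.execute(sql):
        subject = EX + f"{mapping['table']}/{row[0]}"
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:  # NULL columns produce no triple
                triples.append((subject, mapping["columns"][column], value))
    return triples


def demo():
    """Build a throwaway database with placeholder rows and map it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE gene_names (gene_id INTEGER PRIMARY KEY, name TEXT)"
        )
        conn.executemany(
            "INSERT INTO gene_names VALUES (?, ?)", [(1, "CPNE1"), (2, None)]
        )
        return rows_to_triples(conn, MAPPING)
    finally:
        conn.close()
```

Keeping the mapping as data rather than code is one way to keep it declarative, which is in the spirit of the position stated earlier in the talk.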
Mapping to Web Services
Mapping to GetGenomeLocations in gene_regions Mediator class
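As a rough illustration of what mapping a web service result into the ontology could look like, the sketch below converts a JSON response into triples. The function `get_genome_locations_stub` is a local stand-in that makes no network call, and its payload shape, the placeholder coordinates, and the `chromosome`, `start`, and `stop` properties are assumptions; only `gname` and `has_gene_region` appear in the slides.

```python
import json

EX = "http://www.semanticdiscoverysystems.com/Example.owl#"


def get_genome_locations_stub(gene_name):
    """Stand-in for a remote call; returns a canned JSON payload (placeholder values)."""
    canned = {"CPNE1": [{"chromosome": "chr-demo", "start": 1, "stop": 2}]}
    return json.dumps({"gene": gene_name, "locations": canned.get(gene_name, [])})


def response_to_triples(payload):
    """Map one service response to gene and gene-region triples."""
    data = json.loads(payload)
    gene = EX + "gene/" + data["gene"]
    triples = [(gene, EX + "gname", data["gene"])]
    for number, location in enumerate(data["locations"], start=1):
        region = f"{gene}/region/{number}"
        triples.append((gene, EX + "has_gene_region", region))
        # Hypothetical region properties, one per field of the payload.
        for key in ("chromosome", "start", "stop"):
            triples.append((region, EX + key, location[key]))
    return triples


TRIPLES = response_to_triples(get_genome_locations_stub("CPNE1"))
```

Because the service is read-only and returns a record-like structure, each response can be treated much like a row of a table, which is what makes a shared mapping approach plausible.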
Mapping to Excel Spreadsheets
Mapping to Spreadsheet Data
Mapping to Gene Names Mediator Class
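A spreadsheet can be handled in much the same way as a table. The sketch below assumes the co-expression sheet has been exported as CSV text (the slides mention the .NET and MS Office APIs instead), and the column names, the placeholder rows, and the `coexpressed_with` and `reported_in_study` properties are hypothetical.

```python
import csv
import io

EX = "http://www.semanticdiscoverysystems.com/Example.owl#"

# Hypothetical export of the co-expression sheet; all values are placeholders.
SHEET = """gene,coexpressed_with,study
CPNE1,GENE_X,study-1
CPNE1,GENE_Y,study-2
"""

# Hypothetical column-to-property mapping.
COLUMN_TO_PROPERTY = {
    "gene": EX + "gname",
    "coexpressed_with": EX + "coexpressed_with",
    "study": EX + "reported_in_study",
}


def sheet_to_triples(text):
    """Treat each spreadsheet row as one co-expression record with three properties."""
    triples = []
    for number, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        subject = EX + f"coexpression/{number}"
        for column, prop in COLUMN_TO_PROPERTY.items():
            value = (row.get(column) or "").strip()
            if value:  # empty cells produce no triple
                triples.append((subject, prop, value))
    return triples


TRIPLES = sheet_to_triples(SHEET)
```

Each row becomes its own record node here because a co-expression observation ties together three things (two genes and a study), which a single triple cannot express.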
Query Translation and Execution
Translators
This one SPARQL statement ‘joins’ data from NCBI, Excel, and Oracle: “who did what assay matching this sequence data …”
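One way to picture the ‘join’ across sources is as a natural join over variable bindings returned by each translator. The sketch below is an assumption about the general technique, not a description of the actual execution engine; the three `fetch_*` functions are stubs returning placeholder rows, and a hash join was chosen simply because it is short to write.

```python
from functools import reduce


def natural_join(left, right):
    """Hash-join two lists of bindings (dicts) on the variables they share.

    Assumes every row within one list has the same set of keys.
    """
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    index = {}
    for row in right:
        index.setdefault(tuple(row[v] for v in shared), []).append(row)
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in index.get(tuple(l_row[v] for v in shared), [])
    ]


# Stubs standing in for the per-source translators; all values are placeholders.
def fetch_from_web_service():
    return [{"gene": "CPNE1", "region": "region-placeholder"}]


def fetch_from_database():
    return [
        {"gene": "CPNE1", "snp": "snp-placeholder"},
        {"gene": "GENE_Z", "snp": "snp-other"},
    ]


def fetch_from_spreadsheet():
    return [{"gene": "CPNE1", "coexpressed_with": "GENE_X", "study": "study-1"}]


def run_federated_query():
    """Collect partial results from each source and join them on shared variables."""
    partial_results = [
        fetch_from_web_service(),
        fetch_from_database(),
        fetch_from_spreadsheet(),
    ]
    return reduce(natural_join, partial_results)


ROWS = run_federated_query()
```

In this sketch the join happens in the mediator after each source has answered; pushing selections down to the sources is the kind of optimization opportunity that an algebra would make explicit.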
A Generalized Framework for RDF Access
Ontology Classes and Properties: Gene, GeneRegion, has_gene_region, gname
RDB specific classes: oracle.mdl
Web service specific classes: ncbi.mdl, keg.mdl
Mediator Framework Classes: gene.mdl, gene_region.mdl, gene_names.mdl, …
Excel specific classes: excel.mdl
The SDS Platform is based on the Mediator Definition Language work done by Val Tannen and his students at U. Pennsylvania.
It was earlier implemented in the K3 system and was widely used in Pharma.
Conclusions
• Need to think of various types of structured/semi-structured/tabular data sources in a holistic manner:
— XML Documents (GRDDL Transforms)
— Relational Databases
— Web Services
— Excel Spreadsheets
— Other “Tabular” and “Tree” data sources
• Potential for providing value beyond relational databases
• Accelerate the transition to the Semantic Web
• Increase Adoption and Acceptance
Next Steps: Proposed Roadmap
[Diagram labels: RDF; XML; Relational Databases; WSDL; Excel Spreadsheets; Generalized Transformation Language; GRDDL; Relational Algebra]
Next Steps: Research
• Extension of Relational Algebra?
— XQuery
— RDF
— GRDDL Transformations
— WSDL
— Read only Web Service Choreography/Composition
• What aspects of the above can be “webified”?
— Access Transformation Languages
— Mapping Languages: Is XQuery or RDF enough?
• Existing efforts in Mediator research
— E.g., Mediator Definition Language (MDL)