Beyond the Models: Applying Semantic Technologies Across the Enterprise
Eric Little, PhD, Chief Data Officer, OSTHUS
[email protected]
V.2.2

Post on 21-Aug-2020

Slide 2

The Current Situation Across Enterprises

Many challenges exist for data to be captured, integrated and shared

• Data silos
• Incompatible instruments and software systems, proprietary data formats
• Legacy architectures are brittle and rigid
• SME knowledge resides in people’s heads, little common vocabulary
• Data schemas are not explicitly understood
• Lack of common vision between business units and scientists

Slide 3

The challenge of big data is here, and it is growing

• By 2020 there will be 2.3 zettabytes of annual traffic on the Internet (ZB = 1,000,000,000,000,000,000,000 bytes)
• The volume of business data worldwide is estimated to double every 1.2 years
• Since 2012, more than 90 percent of the Fortune 500 have funded big data initiatives
• 100 terabytes of data is uploaded daily to Facebook
• Data production will be 44 times greater in 2020 than it was in 2009

Storing/retrieving that amount of data is one challenge. Analyzing even a fraction of it is an even bigger challenge.

Big Data’s Impacts

If each gigabyte in a zettabyte were a brick, 258 Great Walls of China (made of 3.8B bricks) could be built.
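
The figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes), takes the rounded 3.8B brick count at face value, and uses hypothetical function names.

```python
import math

ZETTABYTE = 10**21  # bytes, the decimal definition given on the slide
GIGABYTE = 10**9    # bytes, decimal definition (assumption)


def gigabytes_per_zettabyte() -> int:
    """Number of gigabyte-sized 'bricks' in one zettabyte."""
    return ZETTABYTE // GIGABYTE


def walls_buildable(bricks: int, bricks_per_wall: float) -> float:
    """How many walls of a given size the bricks would cover."""
    return bricks / bricks_per_wall


def implied_doubling_time(growth_factor: float, years: float) -> float:
    """Doubling time (in years) implied by a total growth factor over a period."""
    return years / math.log2(growth_factor)


bricks = gigabytes_per_zettabyte()
print(bricks)                                            # bricks per zettabyte
print(round(walls_buildable(bricks, 3.8e9)))             # walls, rounded brick count
print(round(implied_doubling_time(44, 2020 - 2009), 2))  # years per doubling
```

With the rounded 3.8B figure the ratio comes out at about 263 walls; the slide's 258 is consistent with a slightly larger, less rounded brick count of roughly 3.87B. A 44-fold increase between 2009 and 2020 implies a doubling time of about two years for overall data production, which is a different measure from the 1.2-year figure quoted for business data.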

Slide 4

The Common Big Data Fallacy

Hypothesis:

If I have more data at my fingertips, then I will have more answers

Well... Actually... No.

One major hurdle: “Real-world data […] is messy data, filled with inconsistencies, potential biases, and noise.” (Copping & Li, Harvard Business Review, Nov 29, 2016)

Need a new approach to Big Data

Slide 5

Understanding the 4V’s of Big Data

Volume: Normally the focus, but Big Data analysis is more than just size

Velocity: Performance is critical to success

Variety: Data complexity is increasing (model complexity)

Veracity: Uncertainty abounds, which requires statistics and probabilities

Majority of Big Data analytics approaches treat these two V’s

Semantic technologies provide clear advantages

Mathematical clustering techniques provide clear advantages

Slide 6

Moving to Smart Data – Enter Semantics

Smart data can be added to existing systems
• Does not require replacement of existing tech

Smart data provides a separation of:
• Model layer
• Data layer

• Link to the model layer
• Leave data in place
• Smart data links information from the models to instance-level data

Smart Data uses metadata in order to capture logical context about data
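
The separation of model layer and data layer can be pictured with a small sketch. Everything here is an assumption made for illustration: the example.org identifiers, the column names, and the `describe` helper are hypothetical and do not come from the talk. The point is only that the rows stay untouched while metadata links them to shared model concepts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """A term in the model layer (hypothetical vocabulary)."""
    iri: str
    label: str
    unit: Optional[str] = None


# Model layer: shared definitions, independent of any one system.
MODEL = {
    "weight": Concept("http://example.org/model/bodyWeight", "Body weight", "kg"),
    "subject": Concept("http://example.org/model/subjectId", "Subject identifier"),
}

# Data layer: rows stay where they are, with their local column names.
LEGACY_ROWS = [
    {"PAT_NO": "P-001", "WT": 71.5},
    {"PAT_NO": "P-002", "WT": 64.0},
]

# Metadata that links local columns to model concepts.
COLUMN_TO_CONCEPT = {"PAT_NO": "subject", "WT": "weight"}


def describe(row: dict) -> list:
    """Read one row through the model layer without modifying the row."""
    described = []
    for column, value in row.items():
        concept = MODEL[COLUMN_TO_CONCEPT[column]]
        described.append((concept.label, value, concept.unit))
    return described


for row in LEGACY_ROWS:
    print(describe(row))
```

Because the mapping lives outside the data, a second system with different column names would only need its own entry in the metadata, not a migration.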

Slide 7

Semantic Spectrum of Knowledge Organization Systems

Sources

• Deborah L. McGuinness. "Ontologies Come of Age". In Dieter Fensel, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster, editors. Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2003.
• Michael Uschold and Michael Gruninger. "Ontologies and semantics for seamless connectivity". SIGMOD Rec. 33, 4 (December 2004), 58-64. DOI=http://dx.doi.org/10.1145/1041410.1041420
• Leo Obrst. "The Ontology Spectrum". Book section in Roberto Poli, Michael Healy, and Achilles Kameas, "Theory and Applications of Ontology: Computer Applications". Springer Netherlands, 17 Sep 2010.
• Leo Obrst and Mills Davis. "Semantic Wave 2008 Report: Industry Roadmap to Web 3.0 & Multibillion Dollar Market Opportunities". 2008.

Slide 8

Ontologies provide a background for computations

Humans logically structure their world

Ontologies help to capture that structure

Background Beliefs

Ontologies capture important logical structures
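
One way to picture how an ontology supplies background structure for computation is a class hierarchy that software can traverse. The sketch below is a toy with hypothetical class names that do not come from the talk. It infers that an individual belongs to every superclass of its asserted class, and it assumes the hierarchy contains no cycles.

```python
# Hypothetical toy ontology: each class points to its direct superclass.
SUBCLASS_OF = {
    "ClinicalTrialSubject": "Patient",
    "Patient": "Person",
    "Person": "Agent",
}

# Asserted facts about individuals (also hypothetical).
INSTANCE_OF = {"subject-17": "ClinicalTrialSubject"}


def superclasses(cls: str) -> list:
    """Walk the subclass chain upward, collecting every ancestor."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def is_a(individual: str, cls: str) -> bool:
    """True if the individual's asserted class is, or inherits from, cls."""
    asserted = INSTANCE_OF[individual]
    return cls == asserted or cls in superclasses(asserted)


print(superclasses("ClinicalTrialSubject"))
print(is_a("subject-17", "Person"))
```

Only one fact is asserted about the individual, yet a query for persons would still find it, which is the kind of background knowledge a flat table of labels does not carry.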

But where does Machine Learning fit in?

Slide 10

As data sources continue to increase, so too do new algorithmic approaches

Data Variety & Veracity are driving new innovations

More data is now better

Specialized hardware is evolving to match needs

Machine Learning and Deep Learning on Image Data

From: https://medium.com/@anthony_sarkis/the-age-of-the-algorithm-why-ai-progress-is-faster-than-moores-law-2fb7d5ae7943

Switch to Deep Learning Approach

Slide 11

THE MOVE FROM BIG DATA TO BIG ANALYSIS

[Figure labels: Statistical, Semantics, Machine Learning, Reasoning]

Slide 12

Big Analysis Requires Hybrid Architectures

Semantic DBs

Unstructured Docs

Structured Data

Cloud DBs (NoSQL)

Analytics

Dashboards & Reports

Integration Layer

Slide 13

Two Extremes of a Spectrum of Possible Solutions for Big Data

Data Warehouse
+ Proven enterprise technology
- Big DWHs require too great an effort
- Not all data is suitable for rigid DWHs

Data Lake
+ Great flexibility and very little effort to store all sorts of data
- Data lakes are too loose a construct
- Tremendous efforts on retrieval

Slide 14

Data Science (machine learning, text analytics, clustering etc.)

Make Data FAIR (Findable, Accessible, Interoperable, Reusable)

Linked Open Data & Open APIs

Semantic Graph DB (Knowledge Graph)

Operational DBs

Unstructured Documents

Analytics Tools: simulations, statistics, reasoning

Visualization: dashboards, exploration, search

Semi-structured Data

Instrument Data

Lightweight Semantic Integration Layer (semantic RMDM, APIs, semantic indexing, data annotation, catalogues, metadata and linking)

Reporting: regulatory, internal, external

Slide 15

Enter LeapAnalysis

Slide 16

LeapAnalysis

NOSQL Excel

Any kind of data source can be supported directly

Queries, Rules, Patterns, etc.

Big Analysis Concept: Semantics + Statistics

True Federated Analytics Across The Enterprise

Ref Data

Slide 17

Main Topics to Consider

• Companies must speed up the process of integrating data
• Cleaning or integrating data before you know its value is wasteful
• Making data just “smart” can make it very slow
• The world is moving to decentralization

• Virtualization
• Federation
• Complex problem solving
• Pattern/model reuse

Slide 18

How LeapAnalysis Works

[Diagram labels: Open Source Data; Aligned Data Sources; LA Alignment Store (MongoDB); Data Integration Model; User Query Workspaces; Query Response; Subject Domain; Domain Models; Reference Model (Virtuoso RDF); Patient Data (CSV File System); Patient Data (MSSQL)]

SPARQL Query

prefix core: <http://vocab.rd.astrazeneca.net/core/>
prefix bdm: <http://vocab.rd.astrazeneca.net/bdm/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?subject ?gender ?indication ?age ?height ?weight WHERE {
  ?subject rdf:type core:Subject .
  ?subject core:hasGender ?gender .
  ?subject core:hasIndication ?indication .
  ?subject bdm:hasAge ?age .
  ?subject bdm:hasHeight ?height .
  ?subject bdm:hasWeight ?weight .
}
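
The query above is written against the model, not against any one source. A rough sketch of how such a query could be answered in a federated way follows. It is not LeapAnalysis's actual implementation: an in-memory SQLite table stands in for MSSQL, an in-memory string for the CSV file system, and a plain dict for the alignment store, and all column names and values are invented for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for the CSV file system source (hypothetical columns and values).
CSV_TEXT = "subj,sex,age_years\nS1,F,54\nS2,M,61\n"

# Stand-in for the MSSQL source; in-memory SQLite keeps the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id TEXT, gender TEXT, age INTEGER)")
conn.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                 [("S3", "F", 47), ("S4", "M", 39)])

# Stand-in for the alignment store: model property -> local column, per source.
ALIGNMENTS = {
    "csv": {"subject": "subj", "core:hasGender": "sex", "bdm:hasAge": "age_years"},
    "sql": {"subject": "patient_id", "core:hasGender": "gender", "bdm:hasAge": "age"},
}


def query_csv(properties):
    """Answer the query from the CSV source using its alignment."""
    mapping = ALIGNMENTS["csv"]
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    return [{p: row[mapping[p]] for p in properties} for row in reader]


def query_sql(properties):
    """Translate the query to SQL using the alignment, then run it."""
    mapping = ALIGNMENTS["sql"]
    # Column names come from the trusted alignment dict, not from user input.
    columns = ", ".join(mapping[p] for p in properties)
    cursor = conn.execute(f"SELECT {columns} FROM patients")
    return [dict(zip(properties, record)) for record in cursor]


def federated_select(properties):
    """Send one model-level query to every source and merge the answers."""
    return query_csv(properties) + query_sql(properties)


for result in federated_select(["subject", "core:hasGender", "bdm:hasAge"]):
    print(result)
```

The data never leaves its source system; only the alignment knows that `sex` and `gender` mean the same model property. Note that the CSV source returns ages as strings while the SQL source returns integers, so a fuller version would also need to align datatypes.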

Slide 19

CONNECTING DATA, PEOPLE AND ORGANIZATIONS

Contact Information:

Email: [email protected]
Web: www.osthus.com
www.biganalysis.com
Twitter: OntoEric