lotico oct 2010
Introduction to Semantic Web Dean Allemang Chief Scientist, TopQuadrant Inc. [email protected]

Upload: dallemang

Post on 12-May-2015

DESCRIPTION

TopQuadrant Chief Scientist Dean Allemang presenting "Introduction to Semantic Web" at Lotico's San Francisco meetup, Oct 27, 2010.

TRANSCRIPT

Page 1: Lotico oct 2010

Introduction to Semantic Web

Dean Allemang, Chief Scientist, TopQuadrant Inc.

[email protected]

Page 2: Lotico oct 2010

© Copyright 2007-2010 TopQuadrant Inc. Slide 2

Corporate Overview

- Formed in 2001; privately held
- First Semantic Web consulting firm in the U.S.
- Products: TopBraid Suite, a Semantic Web application development platform; 600+ customers
- Solution Services: workshops (Solution Envisioning, Ontology Modeling); jumpstarts to large implementations
- Semantic Web Training: 700+ people trained
- International Locations: Alexandria, VA; Mountain View, CA; TopQuadrant Korea – Seoul, S. Korea
- Strategic Partnerships: Oracle, Franz, CTG

Page 3: Lotico oct 2010

Semantic Representation of Data and Models (Ontologies)

A model of the concepts and relationships between concepts within a specific domain.

World Wide Web Consortium (W3C) Semantic Web Standards:
- RDF
- RDFS
- SPARQL
- OWL

Page 4: Lotico oct 2010

Semantic Web: Make web content machine-readable!

“The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.[W3C 2001] ”

“The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [Tim Berners-Lee et al 2001]

Page 5: Lotico oct 2010

What could the Web do?

Web page interaction – uses people as its medium!

Page 6: Lotico oct 2010

What could the Web do? (cont.)

Can this sort of interaction become part of the Web itself?

Page 7: Lotico oct 2010

How could the Web do it?

Built-in by the Webmaster

Agree upon an “interlingua”

Page 8: Lotico oct 2010

A new Web of terminology

Use the same technology that maps web pages to terminology to map terminologies to one another

What’s the Interlingua for the Interlingua?

Page 9: Lotico oct 2010

What about people? Don’t they make the world go round?

[Diagram labels: Human, Agent, Internet, Annotated Web Page, Ontology. Caption: computers and people… better cooperation]

Source: Phil Windridge, WSWS (2004)

Page 10: Lotico oct 2010

The Web: The World’s Largest Information System!

How did it get so big? What is special about The Web?

Page 11: Lotico oct 2010

Semantic Web Standards Stack

Page 12: Lotico oct 2010

How Semantic Languages Work

Bring information together

Draw inferences

RDF

RDFS

OWL

Page 13: Lotico oct 2010

What is RDF? Distribution of data

ID  Model No.  Division               Product Line   Manufacture location  SKU     InStock
1   ZX-3       Manufacturing support  Paper machine  Sacramento            FB3524  23
2   ZX-3P      Manufacturing support  Paper machine  Sacramento            KD5243  4
3   ZX-3S      Manufacturing support  Paper machine  Sacramento            IL4028  34
4   B-1430     Control Engineering    Feedback Line  Elizabeth             KS4520  23
5   B-1430X    Control Engineering    Feedback Line  Elizabeth             CL5934  14
6   B-1431     Control Engineering    Active Sensor  Seoul                 KK3945  0
7   DBB-12     Accessories            Monitor        Hong Kong             ND5520  100
8   SP-1234    Safety                 Safety Valve   Cleveland             HI4554  4
9   SPX-1234   Safety                 Safety Valve   Cleveland             OP5333  14

Page 14: Lotico oct 2010

Distribute by rows?

1  ZX-3    Manufacturing support  Paper machine  Sacramento  FB3524  23

4  B-1430  Control Engineering    Feedback Line  Elizabeth   KS4520  23

7  DBB-12  Accessories            Monitor        Hong Kong   ND5520  100

Needs common schema - which column is which?

Page 15: Lotico oct 2010

Distribute by columns?

Division: Manufacturing support, Manufacturing support, Manufacturing support, Control Engineering, Control Engineering, Control Engineering, Accessories, Safety, Safety

In Stock: 23, 4, 34, 23, 14, 0, 100, 4, 14

Model No.: ZX-3, ZX-3P, ZX-3S, B-1430, B-1430X, B-1431, DBB-12, SP-1234, SPX-1234

Needs to reference entities – which thing are we talking about?

Page 16: Lotico oct 2010

Distribute by cells!?

Needs to reference both schema and entities

Most flexible – can distribute data in any way at all!

Row 7, Division: Accessories

Row 4, Product Line: Feedback Line

Row 1, Model: ZX-3

Page 17: Lotico oct 2010

Distribute by cells!?

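Row 7, Division: Accessories

Subject: 7 (the row); Predicate: Division (the column); Object: Accessories (the cell value)

Subjects and predicates are identified by URIs

Triple Store
• Store,
• Index, and
• Federate these triples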
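The cell-by-cell distribution above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `http://example.org/products/` namespace, the underscore-style column identifiers, and the helper names `row_to_triples` and `lookup` are hypothetical and do not come from the slides, and a plain Python set stands in for a real triple store.

```python
# Hypothetical base URI; the slides do not name a real namespace.
BASE = "http://example.org/products/"

# Column names from the product table, rewritten as URI-friendly identifiers.
COLUMNS = ["Model_No", "Division", "Product_Line",
           "Manufacture_location", "SKU", "InStock"]


def row_to_triples(row_id, values):
    """Emit one (subject, predicate, object) triple per table cell."""
    subject = BASE + "row/" + str(row_id)
    return {(subject, BASE + "col/" + column, value)
            for column, value in zip(COLUMNS, values)}


row4 = row_to_triples(4, ["B-1430", "Control Engineering", "Feedback Line",
                          "Elizabeth", "KS4520", "23"])
row7 = row_to_triples(7, ["DBB-12", "Accessories", "Monitor",
                          "Hong Kong", "ND5520", "100"])

# Cells can live anywhere: split them arbitrarily across two "sites".
all_cells = sorted(row4 | row7)
site_a = set(all_cells[::2])
site_b = set(all_cells[1::2])

# Federating the sites is a set union; no shared table layout is needed,
# because every cell carries its own row and column reference.
store = site_a | site_b


def lookup(store, row_id, column):
    """Return the values stored for one row/column pair."""
    subject = BASE + "row/" + str(row_id)
    predicate = BASE + "col/" + column
    return [o for s, p, o in store if s == subject and p == predicate]


print(lookup(store, 7, "Division"))
```

Because each triple names both its entity and its property, the way the cells were split between `site_a` and `site_b` does not matter once they are merged.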

Page 18: Lotico oct 2010

Thinking “Outside of the Table”

01-02 9.3 6.4 3.5 3.0 2.8

Agricultural production-crops4 01 7.6 5.3 3.0 2.4 2.3

Agricultural production - livestock4 02 1.7 1.1 0.5 0.6 0.6

Agricultural services 07 12.4 7.6 4.4 3.2 4.8

1.4 0.8 0.4 0.4 0.6

Oil and gas extraction 13 1.0 0.5 0.2 0.3 0.5

Nonmetallic minerals mining6 14 0.4 0.3 0.2 0.1 0.1

49.8 32.7 23.5 9.3 17.1

Adapted from a slide by Dean Allemang

From Tables to Linked Data

Page 19: Lotico oct 2010

Representing Data in Graphs

Graph = nodes linked by labeled edges

Page 20: Lotico oct 2010

What is RDFS?

RDFS is the schema language for RDF

Type inferences can be made, based on the schema

Page 21: Lotico oct 2010

Why is RDFS useful?

RDFS allows us to talk about classes of instances

It provides inferences, e.g.,

Best Western is a Hotel (and hence, anything we know about Hotels applies to Best Western)

RDFS is in RDF (it’s its own schema language!)
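A rough sketch of the kind of type inference RDFS provides: if Best Western is a Hotel, and Hotel is declared a subclass of something more general, then Best Western also belongs to the more general class. The class names `Lodging` and `Business` and the function `infer_types` are hypothetical, introduced only for illustration; a real RDFS reasoner covers more rules than this single subclass rule.

```python
# Triples as (subject, predicate, object). "Lodging" and "Business" are
# made-up superclasses used only to show the inference.
triples = {
    ("BestWestern", "rdf:type", "Hotel"),
    ("Hotel", "rdfs:subClassOf", "Lodging"),
    ("Lodging", "rdfs:subClassOf", "Business"),
}


def infer_types(triples):
    """Apply one RDFS rule until nothing new appears:
    (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D).
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for c, q, d in list(inferred):
                if q == "rdfs:subClassOf" and c == o:
                    new_triple = (s, "rdf:type", d)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


closure = infer_types(triples)
types = sorted(o for s, p, o in closure
               if s == "BestWestern" and p == "rdf:type")
print(types)
```

Note that the schema statements are themselves triples in the same set as the data, which mirrors the point that RDFS is expressed in RDF.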

Page 22: Lotico oct 2010

A little RDF(S) goes a long way

A model of government agencies and departments. Such models are called Ontologies.

[Diagram nodes: gov:EPA, gov:FERC, gov:DoE, gov:department, gov:agency, gov:body; brm:Business Area, brm:Line Of Business, brm:subfunction, brm:Resource Mgmt, brm:s2citizens, brm:Energy; eGOV:capability, eGOV:Standard, eGOV:Service Spec, eGOV:Remote Reporting, eGOV:web service; eGovOS:project]

Page 23: Lotico oct 2010

Ontologies are the means to separate “what is common” from “what is different”

From Tim Berners-Lee, ISWC 2003

Semantic map: Connecting silo domains

Page 24: Lotico oct 2010

OWL (formerly DAML+OIL)

What is OWL?

The “Web Ontology Language”

W3C Standard

[Diagram labels: DAML (DARPA), OIL (EU, various), DAML+OIL, OWL (W3C), RDF]

Became a Recommendation in February 2004

... for Owl, wise though he was in many ways, able to read and write and spell his own name ...

Page 25: Lotico oct 2010

Function              Start Date    OperatedBy
Air And Radiation     Jan 1, 1994   EPA
Financial Management  Aug 31, 1999  OMB
Compensation          Oct 1, 2003   OMB
…                     …             …

<Agency name="EPA">
  <OperatesLOB>SWER</OperatesLOB>
  <OperatesLOB>Water Quality Reporting</OperatesLOB>
  <OperatesLOB>Acid Rain Monitoring</OperatesLOB>
</Agency>

Page 26: Lotico oct 2010

OWL can specify rich relationships: equivalence, inverse, unique, …

Q: What is operatedBy EPA? A: AirAndRadiation
Q: What is operatedBy EPA? A: SWER, AirAndRadiation
Q: What is operatedBy EPA? A: SWER
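One way to read the three answers above is as the same question asked against the table alone, against both sources merged, and against the XML alone. The sketch below assumes that `operatesLOB` (from the XML) is declared the inverse of `operatedBy` (from the table); that declaration, the property spellings, and the helper names are assumptions made for illustration, and the code handles only this one inverse rule rather than OWL in general.

```python
# One triple taken from the table source, one from the XML source.
table_triples = {("AirAndRadiation", "operatedBy", "EPA")}
xml_triples = {("EPA", "operatesLOB", "SWER")}

# Assumed inverse declaration, in the spirit of owl:inverseOf.
INVERSE = {"operatedBy": "operatesLOB", "operatesLOB": "operatedBy"}


def apply_inverse(triples, inverse):
    """For every (s, p, o) whose p has an inverse q, also assert (o, q, s)."""
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


def operated_by(triples, agency):
    """Answer 'What is operatedBy <agency>?' over a set of triples."""
    return sorted(s for s, p, o in triples
                  if p == "operatedBy" and o == agency)


table_only = operated_by(table_triples, "EPA")
merged = operated_by(apply_inverse(table_triples | xml_triples, INVERSE), "EPA")
xml_only = operated_by(apply_inverse(xml_triples, INVERSE), "EPA")

print(table_only)
print(merged)
print(xml_only)
```

Without the inverse declaration, the XML data would contribute nothing to a query phrased in terms of `operatedBy`, even though it describes the same relationship from the other direction.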

Page 27: Lotico oct 2010

SPARQL

SPARQL Protocol and RDF Query Language

Query Language (like SQL is for databases, XQuery is for XML, etc.)

Extracts information from a graph using pattern matching

“Four-star lodging in New York”
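To make "pattern matching against a graph" concrete, here is a minimal sketch of how a query such as "four-star lodging in New York" could be evaluated. The data, the `ex:` names, and the functions `match_pattern` and `query` are all hypothetical; this toy matcher stands in for a SPARQL engine and is not how any particular product implements it.

```python
# A tiny graph of made-up lodging data.
graph = [
    ("ex:HotelA", "rdf:type", "ex:Lodging"),
    ("ex:HotelA", "ex:starRating", 4),
    ("ex:HotelA", "ex:city", "New York"),
    ("ex:HotelB", "rdf:type", "ex:Lodging"),
    ("ex:HotelB", "ex:starRating", 3),
    ("ex:HotelB", "ex:city", "New York"),
    ("ex:HotelC", "rdf:type", "ex:Lodging"),
    ("ex:HotelC", "ex:starRating", 4),
    ("ex:HotelC", "ex:city", "Boston"),
]


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Try to extend bindings so that pattern matches triple; None on failure."""
    extended = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if p_term in extended and extended[p_term] != t_term:
                return None
            extended[p_term] = t_term
        elif p_term != t_term:
            return None
    return extended


def query(graph, patterns):
    """Join the triple patterns one by one, carrying variable bindings along."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly: SELECT ?h WHERE { ?h rdf:type ex:Lodging ;
#                               ex:starRating 4 ; ex:city "New York" . }
results = query(graph, [
    ("?h", "rdf:type", "ex:Lodging"),
    ("?h", "ex:starRating", 4),
    ("?h", "ex:city", "New York"),
])
print(results)
```

Each pattern narrows the set of candidate bindings for `?h`, so only the resource that satisfies all three patterns remains.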

Page 28: Lotico oct 2010

Find data in a graph

Page 29: Lotico oct 2010

TopBraid Suite™

Complete Semantic Application Lifecycle Support (Ontology Modeling and Application Development to Enterprise Application Deployment and Use)

- Semantic Web Modeling and Application Development Environment
- Enterprise Platform for Semantic Web Applications
- Semantic Web Application Assembly Toolkit

Page 30: Lotico oct 2010

Data.Gov

Data.gov – collection of government data sets made available to the world to use

Page 31: Lotico oct 2010

Sample Dataset: #1329

Page 32: Lotico oct 2010

FHEO Filed Cases

What does the data look like?

Page 33: Lotico oct 2010

Metadata for the data

What are the fields? What do they mean?

Page 34: Lotico oct 2010

How can we use it?

Dataset 1329: FHEO Filings (data.gov)

Counties/Locations

SPARQL Rules

FHEO Data Description

TopBraid Suite uses Semantic Web Standards to mash up data

Page 35: Lotico oct 2010

Display data mash-up

Rules for color coding encoded in SPARQL Rules

Yahoo! Map served up through TopBraid Ensemble™