semantic web

70
Semantic Web John Davies Head of Next Generation Web Research, BT

Upload: adanna

Post on 22-Mar-2016

95 views

Category:

Documents


0 download

DESCRIPTION

Semantic Web. John Davies Head of Next Generation Web Research, BT. Overview of this talk. History of the (Semantic) Web Semantic Web Languages XML RDF(S) OWL Ontologies Semantic Web Applications Knowledge Management Web Services. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Semantic Web

Semantic WebJohn DaviesHead of Next Generation Web Research, BT

Page 2: Semantic Web

Overview of this talk

• History of the (Semantic) Web• Semantic Web Languages

– XML– RDF(S)– OWL

• Ontologies• Semantic Web Applications

– Knowledge Management– Web Services

Page 3: Semantic Web

History of the (Semantic) Web

• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN

• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

Page 4: Semantic Web

• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web– E.g., article in May 2001 issue of Scientific

American…

Page 5: Semantic Web

Scientific American, May 2001:

Page 6: Semantic Web

10000

100000

1000000

10000000

100000000D

ez 9

4

Jun

95

Dez

95

Jun

96

Dez

96

Jun

97

Dez

97

Jun

98

Dez

98

Jun

99

Dez

99

Jun

00

Dez

00

Jun

01

Dez

01

Jun

02

Dez

02

Jun

03

Dez

03

Time

Web

Ser

ver N

umbe

r

[ Source: http://www.zakon.org/robert/internet/timeline/ ]

„Web data transfer larger than FTP data transfer“„Kifer, Lausen, Woo, Logical foundations of object-orientedand frame-based languages“„A. Borgida, On the relative expressiveness of descriptionLogics and predicate logic“

... Semantic Web HISTORY

„W3C Semantic Web Standardization:Work on Web Ontology Language (OWL)“

„W3C standardization of Semantic Web startsWork on Resource Description Framework (RDF)Work on RDF Schema (RDFS)“

10.2.2004: Resource Description Framework (RDF)Resource Description Framework (RDF)Web Ontology Language (OWL)Web Ontology Language (OWL)become W3C recommendationsbecome W3C recommendations

Semantic WebSemantic Web = Web + Data base technology + Knowledge Representation

„W3C Standardization of XML starts“

„Research projects on Web Ontologiesstart EU : On-To-Knowledge (01/00)and US (DARPA): DAML (07/00)“

Page 7: Semantic Web

Semantic Web

„The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in co-operation.“

[Berners-Lee et al., 2001]

Page 8: Semantic Web

Semantic Web Vision

Page 9: Semantic Web

Where we are Today: the Syntactic Web

[Hendler & Miller 02]

Page 10: Semantic Web

The Syntactic Web is…

• A hypermedia, a digital library– A library of documents called (web pages)

interconnected by a hypermedia of links• A database, an application platform

– A common portal to applications accessible through web pages, and presenting their results as web pages

• A platform for multimedia– BBC Radio 4 anywhere in the world! Terminator 3

trailers!• A naming scheme

– Unique identity for those documents

[Goble 03]

Page 11: Semantic Web

i.e. the Syntactic Web is…

• A place where – computers do the presentation (easy) and – people do the linking and interpreting (hard).

• Why not get computers to do more of the hard work?

[Goble 03]

Page 12: Semantic Web

Hard Work using the Syntactic Web…• Complex queries involving background knowledge

– Find information about “animals that use sonar but are not either bats, dolphins or whales”

• Locating information in data repositories– Travel enquiries– Prices of goods and services– Results of human genome experiments

• Delegating complex tasks to web “agents”– Book me a holiday next weekend somewhere warm, not

too far away, and where they speak French or English

, e.g., Barn Owl

Page 13: Semantic Web

What is the Problem?• Consider a typical web page:

• Markup consists of: – rendering

information (e.g., font size and colour)

– Hyper-links to related content

• Semantic content is accessible to humans but not (easily) to computers…

Page 14: Semantic Web

What information can we see…WWW2002The eleventh international world wide web conferenceSheraton waikiki hotel, Honolulu, hawaii, USA7-11 may 2002, 1 location 5 days learn interactRegistered participants coming fromaustralia, canada, chile denmark, france, germany, ghana, hong kong,,

norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire

Register nowOn the 7th May Honolulu will provide the backdrop of the eleventh

international world wide web conference. This prestigious event..Speakers confirmedTim berners-lee Tim is the well known inventor of the Web, …Ian FosterIan is the pioneer of the Grid, the next generation internet …

Page 15: Semantic Web

What information can a machine see…

Page 16: Semantic Web

XMLUser definable and domain specific markup

<H1>Knowledge Management</H1><UL>

<LI>Manager: John Davies<LI>Project: SEKT </UL>

HTML:

<research-topic><title>Knowledge Management</title><manager>John Davies</manager><project>SEKT</project>

</research-topic>

XML:

Page 17: Semantic Web

XML: Document = labelled tree

course

teachertitle students

name http

<course date=“...”><title>...</title><teacher>...</teacher>

<name>...</name><http>...</http>

<students>...</students></course>

=

• DTD: simple grammars to describe legal trees

• node = label + contents

Page 18: Semantic Web

XML example<play> <title>The Life and Death of King John</title> <Dramatis Personae> <persona>The Earl of PEMBROKE</persona> <persona>The Earl of ESSEX</persona> …… </Dramatis Personae> <Stagedir>SCENE England, the Court.</Stagedir> <act>Act 1 <scene>Scene I. <speech> <speaker>JOHN</speaker> <line>Now, Chatillon, what would France with us?</line> </speech>

Page 19: Semantic Web

Solution: XML markup with “meaningful” tags?

<name>

</name><location> </location>

<date> </date><slogan> </

slogan><participants>

</participants>

Page 20: Semantic Web

But What About…<conf>

</conf><place> </place>

<date> </date><strapline> </

strapline><participants>

</participants>

Page 21: Semantic Web

XML: limitations for semantic markup

XML per se makes no commitment on:• Domain specific ontological vocabulary

• Which words shall we use to describe a given set of concepts?• Ontological modelling primitives

• How can we combine these concepts, e.g. “car is a-kind-of (subclass-of) vehicle”

requires pre-arranged agreement on vocab and primitives

Only feasible for closed collaboration– agents in a small & stable community– pages on a small & stable intranet

.. not for sharable Web-resources

Page 22: Semantic Web

Limitations of the Web today

Machine-to-human, not machine-to-machine

Page 23: Semantic Web

XML is a first step

• Semantic markup– HTML layout– XML content

• Metadata– within documents, not across documents– prescriptive, not descriptive– No commitment on vocabulary and modelling

primitives• RDF is the next step

Page 24: Semantic Web

Resource Description Framework (RDF)• A standard of W3C• Relationships between documents• Consisting of triples or sentences:

– <subject, property, verb>– <Tolkien, wrote, The Lord of the Rings>

• RDFS extends RDF with standard “ontology vocabulary”:– Class, Property– Type, subClassOf– domain, range

Page 25: Semantic Web

An example

“Tolkein wrote ISBN00001047582”

hasWritten (‘http://www.famouswriters.org/tolkein/’, http://www.books.org/ISBN00001047582’)

Page 26: Semantic Web

RDF and RDFS• RDFS defines the ontology

– classes and their properties and relationships– what concepts do we want to reason about and how are they

related– there are authors, and authors write books

• RDF defines the instances of these classes and their properties– Mark Twain is an author– Mark Twain wrote “Adventures of Tom Sawyer”– “Adventures of Tom Sawyer” is a book

• Notation: RDF(S) = RDF + RDFS

Page 27: Semantic Web

hasName(‘http://www.famouswriters.org/twain/mark’,“Mark Twain”)

hasWritten(‘http://www.famouswriters.org/twain/mark’,‘http://www.books.org/ISBN00001047582’)

title(‘http://www.books.org/ISBN00001047582’,“The Adventures of Tom Sawyer”)

XML version:<rdf:Description rdf:about=http://www.famouswriters.org/twain/mark>

<s:hasName>Mark Twain</s:hasName><s:hasWritten rdf:resource=http://www.books.org/ISBN0001047/>

</rdf:Description>

RDF

Page 28: Semantic Web

twain/mark /ISBN000010475hasWritten

“Mark Twain”

“The Adventures of Tom Sawyer”

hasName title

An example RDF data graph

Page 29: Semantic Web

RDF(S) definitions

subclassof(FamousWriter, Writer)

type(‘http://www.books.org/ISBN00001047582’,‘http://www.description.org/schema#Book’)

Page 30: Semantic Web

An example RDF Schema

Writer hasWritten Book

FamousWriter

/twain/mark ../ISBN00010475

Schema(RDFS)Data(RDF)

hasWrittentype

subClassOf

domain range

type

Annotation of WWW resources and semantic links

Page 31: Semantic Web

Conclusions about RDF(S)

• Next step up from plain XML:– (small) ontological commitment to

modeling primitives– possible to define vocabulary

• However:– no precisely described meaning– no inference model

Page 32: Semantic Web

Web Ontology Language Requirements

Desirable features identified for Web Ontology Language:• Extends existing Web standards

– Such as XML, RDF, RDFS• Easy to understand and use

– Should be based on familiar KR idioms• Formally specified • Of “adequate” expressive power• Possible to provide automated reasoning support

Page 33: Semantic Web

OWL Language• OWL is based on Description Logics knowledge representation

formalism• OWL (DL) benefits from many years of DL research:

– Well defined semantics– Formal properties well understood (complexity, decidability)– Known reasoning algorithms– Implemented systems (highly optimised)

• Three species of OWL– OWL full is union of OWL syntax and RDF– OWL DL restricted to FOL fragment – OWL Lite is “easier to implement” subset of OWL DL

• OWL DL based on SHIQ Description Logic

Page 34: Semantic Web

Why OWL?

• OWL = Web Ontology Language• Owl’s superior intelligence is known

throughout the Hundred Acre Wood, as are his talents for Writing, Spelling, other Educated and Special tasks.

• "My spelling is Wobbly. It's good spelling, but it Wobbles, and the letters get in the wrong places."

Page 35: Semantic Web

• a philosophical discipline• a branch of philosophy that deals with the nature

and the organisation of reality• Science of Being (Aristotle, Metaphysics, IV, 1)• Tries to answer the questions:

• What characterizes being?• Eventually, what is being?

Ontology: Origins and History

Page 36: Semantic Web

What (for our purposes) are Ontologies?

Ontologies provide a shared and common understanding of a domain– a shared specification of a conceptualisation– ‘concept map’– for WWW resources – defined using RDF(S) or OWL

Page 37: Semantic Web

Ontology as Taxonomy

Living Beings

Plants

InvertebratesVertebrates

Animals

Taxonomy is a classification system where each node has only one parent – simple ontology

Page 38: Semantic Web

Ontology of People and their Roles

Employee

Manager Expert Analyst

Programme Mgr Project Mgr

funds

advises

Contractor

Typically, we want a richer ontology with more relationships between concepts:

Page 39: Semantic Web

Structure of an OntologyOntologies typically have two distinct components:• Names for important concepts and relationships in the

domain– Elephant is a concept whose members are a kind of

animal– Herbivore is a concept whose members are exactly

those animals who eat only plants or parts of plants • Background knowledge/constraints on the domain

– Adult_Elephants weigh at least 2,000 kg– No individual can be both a Herbivore and a Carnivore

Page 40: Semantic Web

Why develop an ontology?• To make define web resources more precisely and make

them more amenable to machine processing• To make domain assumptions explicit

– Easier to change domain assumptions– Easier to understand and update legacy data

• To separate domain knowledge from operational knowledge– Re-use domain and operational knowledge separately

• A community reference for applications• To share a consistent understanding of what information

means

Page 41: Semantic Web

Types of Ontologies [Guarino, 98]

Describe very general concepts like space, time, event, which are independent of a particular problem or domain. It seems reasonable to have unified top-level ontologies

for large communities of users.

Describe the vocabulary related to a

generic domain by specializing the concepts

introduced in the top-level ontology.

Describe the vocabulary related to a

generic task or activity by specializing the top-level ontologies.

These are the most specific ontologies. Concepts in application ontologies often correspond to roles played by

domain entities while performing a certain activity.

Page 42: Semantic Web

Ontologies - Some Examples• General purpose ontologies:

– WordNet / EuroWordNet, http://www.cogsci.princeton.edu/~wn– The Upper Cyc Ontology, http://www.cyc.com/cyc-2-1/index.html– IEEE Standard Upper Ontology, http://suo.ieee.org/

• Domain and application-specific ontologies:– RDF Site Summary RSS, http://groups.yahoo.com/group/rss-dev/files/schema.rdf– RETSINA Calendering Agent,

http://ilrt.org/discovery/2001/06/schemas/ical-full/hybrid.rdf– AIFB Web Page Ontology, http://ontobroker.semanticweb.org/ontos/aifb.html– Dublin Core, http://dublincore.org/– UMLS, http://www.nlm.nih.gov/research/umls/– Open Biological Ontologies: http://obo.sourceforge.net/

• Ontologies in a wider sense– Agrovoc, http://www.fao.org/agrovoc/– Art and Architecture, http://www.getty.edu/research/tools/vocabulary/aat/– UNSPSC, http://eccma.org/unspsc/

• DAML.org library with all kinds of different ontologies!

Page 43: Semantic Web

RSS (RDF Site Summary)

• RDF Site Summary (RSS) is a lightweight multipurpose extensible metadata description and syndication format.  

• The underlying RDF(S) ontology is extremely simple, mainly consisting of:

title

link

channel

descriptionlanguage

item

hasItem

title

linkdescription

Page 44: Semantic Web

UMLS (Unified Medical Language System) (I)

• provided by the US National Library of Medicine (NLM), a database of medical terminology

• terms from several medical databases (MEDLINE, SNOMED International, Read Codes, etc.) are unified so that different terms are identified as the same medical concept

• access at http://umlsks.nlm.nih.gov/

Page 45: Semantic Web

UMLS (Unified Medical Language System) (II)

• UMLS Knowledge Sources:– Metathesaurus provides the concordance of

medical concepts: • 730,000 concepts• 1.5 million concept names in different source

vocabularies

Page 46: Semantic Web

Dublin Core

• The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models

• Ontology includes elements like– TITLE– CREATOR– SUBJECT– DESCRIPTION– PUBLISHER– DATE...

Simple set of elementsthat people can agree on!

see http://dublincore.org/

Page 47: Semantic Web

Open Biological Ontologies

• Various ontologies in the biological domain• obo.sourceforge.net• e.g. Gene Ontology (www.geneontology.org)

“Biologists currently waste a lot of time and effort in searching for all of the available information about each small area of research. This is hampered further by the wide variations in terminology that may be common usage at any given time, and that inhibit effective searching by computers as well as people. For example, if you were searching for new targets for antibiotics, you might want to find all the gene products that are involved in bacterial protein synthesis, and that have significantly different sequences or structures from those in humans. But if one database describes these molecules as being involved in 'translation', whereas another uses the phrase 'protein synthesis', it will be difficult for you — and even harder for a computer — to find functionally equivalent terms. The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases.”

• Hundreds of classes

Page 48: Semantic Web

Ontology and Logic• Reasoning over ontologies• Inferencing capabilities

X is author of Y Y is written by X

X is supplier to Y; Y is supplier to Z X and Z are part of the same supply chain

Cars are a kind of vehicle;Vehicles have 2 or more wheels

Cars have 2 or more wheels

Page 49: Semantic Web

Proof and Trust

Page 50: Semantic Web

Semantic Web Vision

Page 51: Semantic Web

Semantic Web areas of application

• Semantic Web & Knowledge Management– SEKT (sekt.semanticweb.org)

• Semantic Web-enabled Web Services– SWWS (swws.semanticweb.org)– DIP (dip.semanticweb.org)

Page 52: Semantic Web

The knowledge explosion!

Knowledge workers overwhelmed with information:• from intranets, emails, external newslines …• but may still lack the information they require

They need information identified:• by semantics, not just keywords

– precise and complete• by their interests and their task context• in a form appropriate to their current physical context

– mobile phone, PDA, blackberry

Page 53: Semantic Web

Semantic Web & KM

• Making WWW information machine processable– annotation via ontologies & metadata– offers prospect of enhanced knowledge

management• “Rank all the documents containing the word Tolkien”• “Show me the non-fiction books written by Tolkien

about philology before 1940”• Data integration

– significant research & technology challenges are outstanding

Page 54: Semantic Web

SEKT

• EU collaborative project– €12m budget, 12 partners

• Application of Semantic Web technologies to Knowledge Management

• www.sekt-project.com

Page 55: Semantic Web

Search engine trends• Desktop search:

– Google moving to support searching the desktop. – Microsoft are moving into Google’s space, and vice versa.

• Categorisation: – Increasingly small ranking quality differentiators– One new differentiator is organising the results by categorisation (e.g. clusty.com)

• Integrated search: – Build into OA apps - less overhead to initiate search– MS would hope to dominate by embedding search into all the Office apps

• Seamless search– firing off “implicit queries” based on user activity (blinx.com) – less overhead to access information (don’t have to stop what you’re doing). – Ideally combining search of desktop and web

• Personalised search: – tweaking the search based on user’s prior searches or profile of some kind.

Page 56: Semantic Web

Beyond search, beyond documents

• sub-document level analysis of information • a long list of documents is rarely the ultimate

information need of the end user– “there’s too much relevant information!”– support for the next step - the analysis of the returned

information– e.g. key points on a topic from a large document you

don’t want to read– e.g. creation of a digest of information from multiple

documents about Bush’s statements on a given topic

Page 57: Semantic Web

Searching by concept

Page 58: Semantic Web

Identifying information entities- people, companies …

Page 59: Semantic Web

The semantic desktop

• Integrated with day-to-day business processes– automatic knowledge delivery based on current

context• Multiple devices• Personal profile• Current activity• Current location

– integrated with the tools which support business processes

• Supporting on-the-fly metadata creation– as a side effect of data creation

Page 60: Semantic Web

Major research challenges• Improve automation of ontology and metadata

generation• Research and develop techniques for ontology

management and evolution• Develop highly-scalable solutions • Research sound inferencing despite inconsistent

models• Develop semantic knowledge access tools• Develop methodology for deployment

Page 61: Semantic Web

Semantic Web-enabled Web Services

WWWstatic, unstructured info

Web Servicescomputational objects

Semantic Webstructured info

SWWS - intelligent service discovery, interoperation, composition

Page 62: Semantic Web

Current Web Services

• UDDI, WDSL, SOAP– Web Service discovery and description– No semantic (formal) description– Don’t support automatic

• web service discovery• mediation• composition into complex services• negotiation

Page 63: Semantic Web

Future Web Services -exploiting the Semantic Web

• OWL-S– an OWL-based language for WS description– US-based consortium

• WSMF - Web Services Modelling Framework– EU initiative (DIP project)– Extends and enhances OWL-S capability– P2P approach with emphasis on mediation– www.wsmo.org

Page 64: Semantic Web

Semantic Web Services• Automatic discovery

Find a book selling service

• Automatic invocationPurchase the latest Ian McEwan book

• Automatic composition and interoperationPurchase the cheapest, latest Ian McEwan book and the latest Keane

CD and have them both delivered within 48 hours to Innsbruck

• Automatic execution monitoringWhat is the status of my book order?

Page 65: Semantic Web

Semantic Web Services - benefits

• More flexible use of internal IT systems• Cost savings via software re-use• Repurposing legacy systems• Easier B2B integration along supply chain• Software as a commodity

– Web-based services– Usage-based charging

Page 66: Semantic Web

Semantic Web and AI?• No anthropomorphic claims• As with today’s WWW

– large, inconsistent, distributed• Requirements

– scalable, robust, decentralised– tolerant, mediated

• As with WWW, Semantic Web will (need to) adapt fast

Page 67: Semantic Web

What the analysts say• Gartner

– Long time before the majority of the public Web is “semantic” (as expected)

– In many areas (content management, life sciences, government, media, industry and market information), Semantic Web will be adopted much earlier (2006)

– Business Impact: Improve content management (costs and quality), information access, system interoperability, database integration and data quality.

• IDC– “Hottest” emerging technology in 2005

• “information-intensive enterprises should do more than just take note”

Page 68: Semantic Web

Summary• The emergence of the Semantic Web

– machine-processable information– Language stack: XML/RDF(S)/OWL– Ontologies

• Semantic Web for KM– next generation WWW-based KM tools (inside)

• Semantic Web for Web Services– automating Web Services processes (buy/sellside)

“… great implications for a huge range of industrial and social applications” Gartner Group

Page 69: Semantic Web

Acknowledgements

• York Sure, University of Karlsruhe.• Frank van Harmelen, Vrije Universiteit Amsterdam.• Ian Horrocks, University of Manchester.

Page 70: Semantic Web

Thanks for your time & attention

• Any questions?• Here’s 3 for you:

– What are the semantic web layers?– What is the key difference between the

semantic and syntactic web?– Name 3 ontologies in use today

John DaviesNext Generation Web Research, BT

[email protected]