Jim Hendler's Presentation at SSSW 2011

Tetherless World Constellation The Semantic Web: Reclaiming the vision Jim Hendler RPI http://www.cs.rpi.edu/~hendler @jahendler

Upload: sssw2011

Post on 09-May-2015

TRANSCRIPT

Page 1: Jim Hendler's Presentation at SSSW 2011

Tetherless World Constellation

The Semantic Web: Reclaiming the vision

Jim Hendler, RPI

http://www.cs.rpi.edu/~hendler @jahendler

Page 2: Jim Hendler's Presentation at SSSW 2011

Original Vision

Who first conceived of the Semantic Web?

Tim Berners-Lee (WWW Geneva, 1994)

• “… documents on the web describe real objects and imaginary concepts, and give particular relationships between them... For example, a document might describe a person. The title document to a house describes a house and also the ownership relation with a person. ... This means that machines, as well as people operating on the web of information, can do real things. For example, a program could search for a house and negotiate transfer of ownership of the house to a new owner. The land registry guarantees that the title actually represents reality.” – Tim Berners-Lee, plenary presentation at WWW Geneva, 1994

Page 3: Jim Hendler's Presentation at SSSW 2011

Revisiting the Vision…

Page 4: Jim Hendler's Presentation at SSSW 2011

A busy decade+

>200 Semantic Web talks since 2000

Page 5: Jim Hendler's Presentation at SSSW 2011

Beyond XML: Agent Semantics

• DARPA will lead the way with the development of Agent Markup Language (DAML)
– a “semantic” language that ties the information on a page to machine readable semantics (ontology)
• Currently being explored at University level
– SHOE (Maryland), Ontobroker (Karlsruhe), OWL (Washington Univ)
– Largely grows from past DARPA programs (I3, ARPI)
• But not transitioning
– W3C focused on short-term gain: HTML/XML

<Title> Beyond XML
<subtitle> agent semantics </subtitle> </title>

<USE-ONTOLOGY ID="PPT-ontology" VERSION="1.0" PREFIX="PP" URL="http://iwp.darpa.mil/ppt..html">

<CATEGORY NAME="pp.presentation" FOR="http://iwp.darpa.mil/jhendler/agents.html">

<RELATION-VALUE POS1="Agents" POS2="/jhendler">

<ONTOLOGY ID="powerpoint-ontology" VERSION="1.0" DESCRIPTION="formal model for powerpoint presentations">

<DEF-CATEGORY NAME="Title" ISA="Pres-Feature">
<DEF-CATEGORY NAME="Subtitle" ISA="Pres-Feature">

<DEF-RELATION NAME="title-of" SHORT="was written by">
<DEF-ARG POS=1 TYPE="presentation">
<DEF-ARG POS=2 TYPE="presenter">

Prehistory: 1st funding talk Oct. 1999

Page 6: Jim Hendler's Presentation at SSSW 2011

Brighton, Mar 2002

This leads to a radically new view of interoperation

Distributed, partially mapped, inconsistent -- but very flexible!

[Diagram: a network of nodes connected by many links, each labelled “uses”]

Page 7: Jim Hendler's Presentation at SSSW 2011

Berners-Lee et al, 2001

(May 21, 2001)

Page 8: Jim Hendler's Presentation at SSSW 2011

Southampton, 1/03 www.mindswap.org

Web “travel agents”

How many cows are there in Texas?

Query processed: 73 answers found
Google document search finds 235,312 possible page hits.
Http://www…/CowTexas.html claims the answer is 289,921,836
A database entitled “Texas Cattle Association” can be queried for the answer, but you will need “authorization as a state employee.”
A computer program that can compute that number is offered by the State of Texas Cattleman’s Cooperative, click here to run program.
...
The “sex network” can answer anything that troubles you, click here for relief...
The “UFO network” claims that “all cows in Texas have been replaced by aliens”

“Agent” Markup Language

Page 9: Jim Hendler's Presentation at SSSW 2011

Brighton, Mar 2002

Making Markup Easier

Page 10: Jim Hendler's Presentation at SSSW 2011

Brighton, Mar 2002

Animal ontology

Page 11: Jim Hendler's Presentation at SSSW 2011

Brighton, Mar 2002

Use that markup in query/portal interfaces

Page 12: Jim Hendler's Presentation at SSSW 2011

Southampton, 1/03 www.mindswap.org

Services need Web Logics

2001: Semantic Web Services

Page 13: Jim Hendler's Presentation at SSSW 2011

Southampton, 1/03 www.mindswap.org

Services off the desktop

2003: Semantic Web Services

Page 14: Jim Hendler's Presentation at SSSW 2011

The famous layercake

Tim Berners-Lee, 2001

Page 15: Jim Hendler's Presentation at SSSW 2011

The “approved” layercake (ca. 2006)

Page 16: Jim Hendler's Presentation at SSSW 2011

SPARUL

SAWSDL WSMO WSMOX WSMOL

(Layercake, ISWC 2010)

Page 17: Jim Hendler's Presentation at SSSW 2011

DAML

Notional Schedule

Now

Later

2001: We will change the world!

Page 18: Jim Hendler's Presentation at SSSW 2011

So where have we got to

• Semantic Web technology use has exceeded even my wildest expectations
– What is different now?
• Semantic Search
– All the big kids are playing!
• Advertising drives Web markets
– “Markets are created by disaggregating the producer and the consumer”
• “Buzz” around data on the Web
– esp. Open Government Data

Page 19: Jim Hendler's Presentation at SSSW 2011

Sem Web 2010

April 2010

Page 20: Jim Hendler's Presentation at SSSW 2011

Semantic Web 2010

July 2010

Page 21: Jim Hendler's Presentation at SSSW 2011

Sem Web 2010

August 2010

Page 22: Jim Hendler's Presentation at SSSW 2011

Sem Web 2010

July 2010

Page 23: Jim Hendler's Presentation at SSSW 2011

Sem Web 2010

August 2010

Page 24: Jim Hendler's Presentation at SSSW 2011

Enterprise Semantic Web

Page 25: Jim Hendler's Presentation at SSSW 2011

Example: OGP (Open Graph Protocol) use growing quickly

Facebook incentivizing use of RDFa like buttons

15,178 sites of top 1,000,000 as of 3/3/11

Facebook is encouraging developers to use the RDFa version

Oct 2010: FB reports RDFa is ~ 10-15% of > 3,000,000 likes per day!
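
A minimal sketch (not from the slides) of how a consumer might read Open Graph Protocol properties out of a page's RDFa-style meta tags, using only Python's standard library. The sample page, its example.org URL, and the names OpenGraphParser and extract_open_graph are assumptions made for illustration.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property")
        content = attr_map.get("content")
        if prop and prop.startswith("og:") and content is not None:
            self.properties[prop] = content


def extract_open_graph(html):
    """Return a dict of og:* properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """
<html prefix="og: http://ogp.me/ns#">
<head>
<title>Example talk page</title>
<meta property="og:title" content="The Semantic Web: Reclaiming the vision" />
<meta property="og:type" content="article" />
<meta property="og:url" content="http://example.org/talks/sssw2011" />
<meta name="viewport" content="width=device-width" />
</head>
<body></body>
</html>
"""

if __name__ == "__main__":
    for key, value in extract_open_graph(SAMPLE_PAGE).items():
        print(key, "=", value)
```

The point of the sketch is how little machinery is involved: a handful of property/content pairs in the page head is enough to make the page a node in someone else's graph.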

Page 26: Jim Hendler's Presentation at SSSW 2011

Because they want the links!

The network is where their money is made! (predicted >$5B of advertising in next two years)

Page 27: Jim Hendler's Presentation at SSSW 2011

Creates a platform for SW-powered apps

Page 28: Jim Hendler's Presentation at SSSW 2011

They said it couldn’t be done

• We have proven a number of our critics (mostly) wrong

Page 29: Jim Hendler's Presentation at SSSW 2011

The Shirky fallacy

• Folksonomy will win
– Tagging the technology of choice
• Tagging has largely failed to meet its promise
– Tagging doesn’t achieve goals without “social context”
• Example: Flickr tag “James”; Amazon tag “My-…”

The Network effect requires links (Hendler & Golbeck, JWS, 2008)

Page 30: Jim Hendler's Presentation at SSSW 2011

The database community fallacy

• The semantic web will never scale: 1,000,000 triples and things go to heck

Winner of the 2009 Billion Triples Challenge

Just plain wrong!!

Page 31: Jim Hendler's Presentation at SSSW 2011

“ad hoc” data integration

example: Linked Open Govt Data

More than 50 of these at http://logd.tw.rpi.edu
See also http://data.gov and http://data.gov.uk

Page 32: Jim Hendler's Presentation at SSSW 2011

And we do things the DB community struggles with

Page 33: Jim Hendler's Presentation at SSSW 2011

Another Shirky criticism

• This is just a make-work program to keep AI scientists busy doing what they’ve always done
• Cannot create an ontology at Web Scale
• AI never works so it won’t this time
• Logic and reasoning will not work on the Web because people disagree and because logic isn’t powerful enough for what is needed
– (ok, he called it syllogism, but we know what he meant)

Page 34: Jim Hendler's Presentation at SSSW 2011

The “bottom” of the Semantic Web

• What is seeing the most use??

RDFa

Page 35: Jim Hendler's Presentation at SSSW 2011

The success of “Linked Data”

• Maturation of RDF technologies
– SPARQL endpoints
• Fits Web development models
– RDFa
• Works well with current search paradigms
– A little semantics goes a long way
• BUT WHAT IS STUNNING IS JUST HOW LITTLE!
– Equality via same URI
– RDFa mostly w/DBMS not triple store
– Not only no reasoning, but hardly any “principled” inferencing!
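
A minimal sketch (not from the slides) of what “equality via same URI” amounts to in practice: two independently published datasets are integrated simply by grouping statements on identical subject URIs, with no reasoning at all. The datasets, the example.org URIs, the ex: predicates and the values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical triples from two independently published datasets.
BUDGET_DATA = [
    ("http://example.org/agency/A", "ex:budget", "100"),
    ("http://example.org/agency/B", "ex:budget", "250"),
]
STAFF_DATA = [
    ("http://example.org/agency/A", "ex:employees", "40"),
    ("http://example.org/agency/C", "ex:employees", "75"),
]


def merge_by_uri(*datasets):
    """Group (subject, predicate, object) triples by subject URI.

    Two statements are about the same thing only if their subject URIs
    are character-for-character identical; nothing is inferred.
    """
    merged = defaultdict(dict)
    for triples in datasets:
        for subject, predicate, obj in triples:
            merged[subject][predicate] = obj
    return dict(merged)


if __name__ == "__main__":
    for uri, facts in merge_by_uri(BUDGET_DATA, STAFF_DATA).items():
        print(uri, facts)
```

Only agency A ends up with facts from both sources, because only its URI appears in both. This is the whole of the “semantics” in play, which is why a plain DBMS is often enough.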

Page 36: Jim Hendler's Presentation at SSSW 2011

The bad news…

• The ontology story is still confused

Page 37: Jim Hendler's Presentation at SSSW 2011

Ontology: the OWL DL view

• Ontology as Barad-Dur (Sauron's tower):
– Extremely powerful!
– Patrolled by Orcs
• Let one little hobbit in, and the whole thing could come crashing down

inconsistency

Decidable Logic basis

Page 38: Jim Hendler's Presentation at SSSW 2011

Ontology: the linked-data view

• Ontology and the tower of Babel
– We will build a tower to reach the sky
– We only need a little ontological agreement
• Who cares if we all speak different languages?

Genesis 11:7-8 Let us go down, and there confound their language, that they may not understand one another's speech. So the Lord scattered them abroad from thence upon the face of all the earth: and they left off to build the city.

Page 39: Jim Hendler's Presentation at SSSW 2011

OWL has had successes

• Examples from Clark and Parsia (2011)
– Decision-support tool for sales people to automate policy-driven cross-selling recommendations at very large US bank built out of RDF integrated data, OWL reasoning, and Pellet
– At global 25 company (another bank) OWL and Pellet form the core of a bank-wide Entitlements service to represent, analyze, and query every access control policy for the entire bank, globally, in 50+ legal jurisdictions
• And many other companies could claim similar
– But most of these sorts of systems are still just coming out of prototype phase
– And most are still more “expert” system than Web app

Page 40: Jim Hendler's Presentation at SSSW 2011

The tough love stuff

• OWL is succeeding to a large degree as a KR standard
– Building “expert systems” as a business has never gone away; OWL improves tooling
• But it is largely failing in bringing representation to the WWW
– cf. so-called “misuse” of owl:sameAs >> “proper” use
– cf. rdfs:Class >> owl:Class
– cf. it is rare that ontologies link to others
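
A minimal sketch (not from the slides) of why loose use of owl:sameAs matters to anyone who treats it formally: because sameAs is symmetric and transitive, a single loose link merges everything on both sides into one identity. The identifiers below are hypothetical, and the union-find closure stands in for what a reasoner would compute.

```python
def same_as_clusters(same_as_pairs):
    """Compute the equivalence classes implied by a list of sameAs links."""
    parent = {}

    def find(x):
        # Find the representative of x, compressing paths along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical links: the first asserts true identity between two records of
# a city; the others are "related to" links published as sameAs.
STRICT_LINKS = [("a:city/Springfield", "b:Springfield")]
LOOSE_LINKS = STRICT_LINKS + [
    ("b:Springfield", "c:doc/springfield-guide"),
    ("c:doc/springfield-guide", "d:doc/travel-guides"),
]

if __name__ == "__main__":
    print(same_as_clusters(STRICT_LINKS))
    print(same_as_clusters(LOOSE_LINKS))
```

With the strict links the city has two names; with the loose links the city, a guide about it, and a collection of guides all become “the same thing”. Linked-data applications that only follow links are unaffected, which is one way to read the gap between the two communities.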

Page 41: Jim Hendler's Presentation at SSSW 2011

The gap is growing

• Linked-Data-based applications are growing in size, number and importance on the Web
– But the “vocabulary” story is still unclear
• cf. Name a major application powered by DBpedia
• Ontology research is turning OWL into a usable KR standard
– But the linking story is still unclear

No linking without vocabularies
No network effect without links

Page 42: Jim Hendler's Presentation at SSSW 2011

What I think we MUST do

• Bridging the gap between the linked-data and ontology views requires some key research challenges to be addressed
– DL (and FOL) are useful formalisms for KR&R, but do not address the needs of the Web!
– Empirical comparisons are useful in scaling systems, but do not address the needs of an academic community!

Page 43: Jim Hendler's Presentation at SSSW 2011

My Challenge to you

• A sufficient formalism for Semantic Web applications must
– Provide a model that accounts for linked data
• What is the equivalent of a DB calculus?
– Provide a means for evaluating incomplete reasoners
• In practice we must be able to model A-box effects as formally as T-box technologies
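
A minimal sketch (not from the slides) of one way “evaluating incomplete reasoners” could be made concrete: compare the type assertions an incomplete reasoner derives from an A-box against those a complete one derives, and report recall. The toy T-box, the A-box, the one-step reasoner and the recall measure are all assumptions chosen for illustration, not a proposed formalism.

```python
# Hypothetical T-box: each class maps to its direct superclass.
TBOX = {"Dog": "Mammal", "Mammal": "Animal", "Animal": "LivingThing"}

# Hypothetical A-box: asserted (individual, class) memberships.
ABOX = {("rex", "Dog"), ("tweety", "Animal")}


def complete_types(tbox, abox):
    """Follow the subclass chain to the top for every asserted membership."""
    inferred = set(abox)
    for individual, cls in abox:
        seen = {cls}  # guards against cycles in the subclass chain
        while cls in tbox and tbox[cls] not in seen:
            cls = tbox[cls]
            seen.add(cls)
            inferred.add((individual, cls))
    return inferred


def one_step_types(tbox, abox):
    """An incomplete reasoner: applies the subclass rule exactly once."""
    inferred = set(abox)
    for individual, cls in abox:
        if cls in tbox:
            inferred.add((individual, tbox[cls]))
    return inferred


def recall(candidate, reference, asserted):
    """Fraction of the reference's new entailments that the candidate found."""
    expected = reference - asserted
    if not expected:
        return 1.0
    return len((candidate - asserted) & expected) / len(expected)


if __name__ == "__main__":
    full = complete_types(TBOX, ABOX)
    partial = one_step_types(TBOX, ABOX)
    print("recall of one-step reasoner:", recall(partial, full, ABOX))
```

Note that the score depends on the A-box as much as on the T-box: a different mix of individuals over the same class hierarchy would give the same incomplete reasoner a different recall, which is the sense in which A-box effects need modelling too.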

Page 44: Jim Hendler's Presentation at SSSW 2011

Be bold!

• A sufficient formalism for Semantic Web applications must also
– Define what an ontology is / Web ontologies really are
• Including external referents / linking between terms
• Including ontology alignment / partial mapping
• Including non-expressive formalisms / real-world “errors”

Page 45: Jim Hendler's Presentation at SSSW 2011

Start it this week!

• One idea on how to get there
– Define common problems that offer features of interest to both communities
– Compare approaches with respect to performance
– Develop hybrids that have best features of both as necessary
– Repeat

(thanks Bettina!)

Page 46: Jim Hendler's Presentation at SSSW 2011

Summary

• The infrastructure needs of intelligent systems are now being met by a combination of Semantic Web, Linked Data, Web Services and Rule-based systems
– Knowledge engineering can be jumpstarted from existing terminologies/ontologies, semi-structured systems, and other Web resources
– Web Services (esp WSDL, SAWSDL) provide "wrappers" and other methods to let "legacy" systems play with agents
– Reasoners and rule-based systems are scaling in new ways, and receiving some standardization

• So where are all the agents???

IADIS-2008

Page 47: Jim Hendler's Presentation at SSSW 2011

Regaining the Vision?

• The Semantic Web is here, it is working, and it will continue to do so
– OGP, schema.org, govt data
• But, for it to move to the next level and be all that we as a community have aspired for
– We must revisit and update the early visions for the modern web
– We must unify the “competing” models of linked-data and machine-readable vocabularies
– We must step up to some critical research challenges

Page 48: Jim Hendler's Presentation at SSSW 2011

Appendix

• Research Challenges (ca. 2008)

Page 49: Jim Hendler's Presentation at SSSW 2011

Research Challenges

• What is the Web culture?
– Design/use/analysis are connected to "cultural stereotypes" (Think HSBC ads)
• What are the cultural stereotypes in the emerging online community?
• What level of "knowledge" is needed by Web users?
– Is this dependent on application? User community?
– Is expressivity a plus, minus, non-issue?
• Especially in an open system (previous AI systems were "closed")

Page 50: Jim Hendler's Presentation at SSSW 2011

Research Challenges

• Computational challenges as "end user" support
– Scaling
– Semantic Web HCI (What do we show "real users"?)
• What are the trade-offs in use
– Virtually all AI literature assumes a high-cost, high-value model
– The Semantic Web is showing us alternative models
• What are the trade-offs, analyses
• If more and more of what we see includes integrated data from multiple sources, will that change the trust models
– Do we need to expose provenance? Will "provider" model be changed?

Page 51: Jim Hendler's Presentation at SSSW 2011

Research Challenges

• Who are the "experts"
– What level of expertise is needed to become "dangerous" with this new technology?
• What is the "ecosystem" (what is the equivalent of Web developer/web master/web user?)
• If more and more of what we see includes integrated data from multiple sources, will that change the trust models
– Do we need to expose provenance? Will "provider" model be changed?
• Formal vs. informal models of ontology
– I didn't discuss "folksonomy" but a key aspect is "social context" (Hendler & Golbeck, 08)
• Can social contexts use

Page 52: Jim Hendler's Presentation at SSSW 2011

Research Challenges

The Biggie

Page 53: Jim Hendler's Presentation at SSSW 2011

Questions?