

Resource Identity and Semantic Extensions: Making Sense of Ambiguity

David Booth, Ph.D.

Cleveland Clinic (contractor)

Semantic Technology Conference

25-June-2010

Latest version of these slides: http://dbooth.org/2010/ambiguity/

Also available: Companion paper

Outline

• Part 0: Myths about resource identity

• Part 1: RDF semantics and ambiguity

– Interpretations

– Interpretations of a URI

• Part 2: Constraining ambiguity through URI declarations

– Bounding ambiguity

– Ambiguity and owl:sameAs

• Part 3: Determining resource identity

– Semantic extensions

PART 0:

Myths About Resource Identity

This section makes some observations that will be further explained in later sections.

URIs as names for resources

• In the semantic web, URIs are used as names for resources

– Separate from a URI's use as a locator

• “Resource” == “Thing” == Universal class

– E.g., people, proteins, medications, concepts, etc.

• But which resource does a URI name?

[Figure: the URI http://example/#apple pointing to a resource marked “?”]

Resource identity

• Which resource does a given URI denote?

• Because a URI can denote any resource, this question is central to RDF semantics

• This is the question of resource identity – the determination of which resource a URI denotes

Myth 1: A URI denotes only one resource

Myth:

• “By design a URI identifies one resource”

– W3C Architecture of the World Wide Web

Reality:

• True as an ideal

• True for one interpretation of one RDF graph, but . . .

• Different interpretations of the same graph may map the same URI to different resources

• Different graphs may permit different interpretations

Myth 2: RDF semantics are global

Myth (a variation of #1):

• There is only one giant graph, with global semantics

• E.g., “owl:sameAs makes a very strong statement”

– The implication is that it must hold universally

Reality:

• RDF semantics are defined for a given graph

• There are many graphs

• The “meaning” of a URI depends on the graph

– A URI may denote different resources in different graphs

Myth 3: Resource ambiguity is due to sloppiness

Myth:

• A URI's resource identity can be uniquely defined if you are precise enough

Reality:

• Ambiguity is unavoidable

– . . . with vanishingly few exceptions

• Always possible to make ever finer resource distinctions

• See examples in In Defense of Ambiguity by Pat Hayes and Harry Halpin

Myth 4: Truth is absolute

Myth:

• “If your RDF models the world as flat, then it is wrong”

Reality:

• “Truth” is irrelevant; what matters is usefulness

• Different apps have different needs

– Flat world model may be best for street navigation: precise enough, and simpler than round world model

• Different apps need different models

PART 1:

RDF Semantics and Ambiguity

This section examines some consequences of standard RDF semantics.

“Interpretations” in RDF Semantics

• An interpretation maps URIs to resources

[Figure: an interpretation mapping the URIs http://example/#plum, http://example/#apple, http://example/#pear, http://example/#banana and http://example/#orange to resources]
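
A minimal sketch of this idea in Python, assuming a toy model in which resources are stood in for by plain strings. The resource labels and the `denote` helper are illustrative assumptions, not part of RDF semantics itself.

```python
# Toy model: an interpretation is a mapping from URIs to resources.
# Resources are represented by descriptive strings purely for illustration;
# real resources are things in the world, not strings.
interpretation = {
    "http://example/#plum": "a plum",
    "http://example/#apple": "an apple",
    "http://example/#pear": "a pear",
    "http://example/#banana": "a banana",
    "http://example/#orange": "an orange",
}


def denote(interp, uri):
    """Return the resource that the interpretation `interp` assigns to `uri`."""
    return interp[uri]


print(denote(interpretation, "http://example/#apple"))
```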

An interpretation applied to a single URI

• An interpretation maps that URI to one resource

– Associates the name with a particular resource

[Figure: an interpretation mapping the URI http://example/#apple to one resource]

Multiple interpretations

• RDF semantics does not constrain a graph to a unique interpretation

• Different interpretations may map the same URI to different resources

[Figure: interpretations i1, i2 and i3 mapping the URI http://example/#apple to different resources]

Many interpretations

• There may be many interpretations

– Potentially infinite

[Figure: many interpretations of the URI http://example/#apple]

RDF semantics constrains the possible interpretations for a given graph

• For a given graph, RDF semantics constrains the possible interpretations

[Figure: possible interpretations of the URI http://example/#apple]

Adding assertions reduces the set of possible interpretations

• By merging RDF graphs, constraints of both graphs must be satisfied

[Figure: possible interpretations of the URI http://example/#apple]
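
The narrowing effect of merging can be sketched with a deliberately small toy model. Everything here is an assumption for illustration: the finite universe `RESOURCES`, the vocabulary `URIS`, and the modelling of a graph as a list of Python predicates over interpretations. Real RDF interpretations are far richer, and the set of interpretations is generally not finite.

```python
from itertools import product

# Hypothetical finite universe of resources and vocabulary of URIs.
RESOURCES = ["red apple", "green apple", "pear"]
URIS = ["ex:apple", "ex:fruit"]


def all_interpretations():
    """Enumerate every mapping from URIS to RESOURCES."""
    for combo in product(RESOURCES, repeat=len(URIS)):
        yield dict(zip(URIS, combo))


def satisfying(graph):
    """Interpretations that satisfy every constraint in `graph`."""
    return [i for i in all_interpretations() if all(c(i) for c in graph)]


# Each "graph" is modelled as a list of constraints on interpretations.
graph_a = [lambda i: "apple" in i["ex:apple"]]
graph_b = [lambda i: "red" in i["ex:apple"]]
merged = graph_a + graph_b  # merging requires both sets of constraints to hold

# Unconstrained, graph A alone, then A merged with B.
print(len(satisfying([])), len(satisfying(graph_a)), len(satisfying(merged)))
# prints: 9 6 3
```

Each added constraint can only remove interpretations from the satisfying set, never add new ones.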

“Interpretations of a URI”

• For a given graph, “Interpretations of a URI” == the set of resources obtained by applying all possible interpretations to that URI

[Figure: interpretations of the URI http://example/#apple]

Resource ambiguity

• For a given RDF graph, a URI's resource is ambiguous if there exists more than one possible interpretation for that URI

– I.e., the possible interpretations map that URI to more than one resource

• Referent of a URI is almost always ambiguous!

– But that's okay – it's just life

[Figure: interpretations of the URI http://example/#apple]
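
The definitions of “interpretations of a URI” and of ambiguity can be written down directly. This is a sketch under the assumption that the possible interpretations of a graph are available as a finite list of dictionaries; the three interpretations and their resource labels are hypothetical.

```python
def interpretations_of(uri, interpretations):
    """Set of resources that the given interpretations assign to `uri`."""
    return {interp[uri] for interp in interpretations}


def is_ambiguous(uri, interpretations):
    """A URI is ambiguous if its possible referents number more than one."""
    return len(interpretations_of(uri, interpretations)) > 1


# Three hypothetical interpretations that all satisfy some graph.
APPLE = "http://example/#apple"
i1 = {APPLE: "the apple on my desk"}
i2 = {APPLE: "apples in general"}
i3 = {APPLE: "a picture of an apple"}
possible = [i1, i2, i3]

print(is_ambiguous(APPLE, possible))  # True: three distinct referents
print(is_ambiguous(APPLE, [i1]))      # False: only one referent remains
```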

Interpretations of different URIs may overlap

• URIs X and Y may map to some of the same resources

[Figure: overlapping sets “Interpretations of X” and “Interpretations of Y”]

Effect of owl:sameAs

• X owl:sameAs Y

• Limits the interpretations for X and Y to the intersection

[Figure: the assertion X owl:sameAs Y shown on the overlap of the sets “Interpretations of X” and “Interpretations of Y”]
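
In set terms, the effect of owl:sameAs is an intersection. The sketch below assumes the candidate referents of each URI are known as finite sets; the resource names `r1` to `r4` and the helper `apply_same_as` are hypothetical.

```python
# Hypothetical sets of resources that each URI could denote before
# the owl:sameAs assertion is added.
interp_x = {"r1", "r2", "r3"}
interp_y = {"r2", "r3", "r4"}


def apply_same_as(x_candidates, y_candidates):
    """X owl:sameAs Y forces both URIs to denote the same resource,
    so each is limited to the candidates the two have in common."""
    shared = x_candidates & y_candidates
    return shared, shared


new_x, new_y = apply_same_as(interp_x, interp_y)
print(sorted(new_x))  # ['r2', 'r3']
```

If the two sets are disjoint, the intersection is empty and the graph containing the owl:sameAs assertion has no possible interpretation.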

PART 2:

Constraining Ambiguity through

URI Declarations

This section proposes a standard way to constrain resource ambiguity.

URI Declarations

• A URI declaration provides a definition for a resource denoted by a URI

– See “URI Declaration in Semantic Web Architecture”

• Definition is provided by a set of core assertions

• Core assertions constrain the possible interpretations for the URI

• URI declaration should be provided via the URI's follow-your-nose location

– See “Cool URIs for the Semantic Web”

Why URI Declarations?

• Easy to know what definition to use

– Dereference the URI to find its URI declaration (usually)

• Permits all users of the URI to share the same definition

– Stabilizes meaning / Avoids semantic drift

• Resource ambiguity is precisely bounded

– Interpretations can still vary within bounds

Bounding the interpretations of a URI X

• URI declaration bounds the interpretations of URI X

• Use of X in graph A further limits the possible interpretations

[Figure: the set “Interpretations of X in graph A” nested inside the set “Interpretations of X consistent with X's URI declaration”]

Interpretations of a URI X in different graphs

• Same URI may have different possible interpretations in different graphs

– E.g., URI X is used in graphs A and B

• All are within the bounds of X's URI declaration

• When graphs are merged, the possible interpretations for X are limited to the intersection

[Figure: the sets “In graph A” and “In graph B” overlapping in “In A+B”, all inside the set “In URI declaration”]

Inconsistent combined graphs

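• URI X is used in graphs A, B and C

– Graph A+B is consistent

– Graph B+C is consistent

• Graph A+C (or A+B+C) is inconsistent: no possible interpretations

[Figure: the sets “In A”, “In B” and “In C” inside X's URI declaration; A and B overlap in “In A+B”, B and C overlap in “In B+C”, and A and C do not overlap]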
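
The A, B, C situation can be reproduced with finite sets. This is a sketch under stated assumptions: the candidate referents `r1` to `r4`, the `in_graph` table, and the helper names are hypothetical, and “consistent” is checked only with respect to URI X.

```python
from functools import reduce

# Hypothetical candidate referents for URI X permitted by each graph,
# all within the bounds of X's URI declaration.
declaration = {"r1", "r2", "r3", "r4"}
in_graph = {
    "A": {"r1", "r2"},
    "B": {"r2", "r3"},
    "C": {"r3", "r4"},
}


def merged_candidates(names):
    """Candidates for X after merging the named graphs: the intersection."""
    return reduce(set.__and__, (in_graph[n] for n in names), set(declaration))


def is_consistent(names):
    """The merged graph is consistent (for X) if some candidate remains."""
    return bool(merged_candidates(names))


print(is_consistent(["A", "B"]))       # True
print(is_consistent(["B", "C"]))       # True
print(is_consistent(["A", "C"]))       # False
print(is_consistent(["A", "B", "C"]))  # False
```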

Splitting identities

• Use of A+C (or A+B+C) together requires splitting X's identity, e.g.:

– Mint new URI Xab to replace X in graph A to make A'

– Mint new URI Xbc to replace X in graph C to make C'

– Then merge graph A' with graph C'

• See http://dbooth.org/2007/splitting/

[Figure: the sets “X in A”, “X in B” and “X in C” inside X's URI declaration, with “Xab” in A+B and “Xbc” in B+C]
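
The splitting steps above amount to a URI rename followed by a merge. The sketch below assumes graphs are represented as sets of (subject, predicate, object) tuples; the `ex:` names, the example triples, and the `rename_uri` helper are hypothetical.

```python
def rename_uri(graph, old, new):
    """Return a copy of `graph` with every occurrence of `old` replaced by `new`."""
    return {
        tuple(new if term == old else term for term in triple)
        for triple in graph
    }


# Hypothetical graphs whose uses of ex:X cannot be reconciled.
graph_a = {("ex:X", "ex:color", "red")}
graph_c = {("ex:X", "ex:color", "green")}

graph_a_prime = rename_uri(graph_a, "ex:X", "ex:Xab")  # A'
graph_c_prime = rename_uri(graph_c, "ex:X", "ex:Xbc")  # C'
merged = graph_a_prime | graph_c_prime                 # merge A' with C'

for triple in sorted(merged):
    print(triple)
```

After the rename, the two conflicting statements are about two different URIs, so the merged graph no longer forces one referent to satisfy both.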

Trade-off: precision versus reusability

• Broader URI declaration:

– Permits the URI to be used in more applications

– Causes more downstream contradictions when the URI is re-used in other graphs and those graphs are later combined

• Narrower URI declaration:

– Restricts the URI to fewer applications

– Reduces likelihood of downstream contradictions

• Recommendation:

– Choose the degree of precision that will best attract the community of applications that you wish to attract

– See also discussion of “clumping” in http://dbooth.org/2007/uri-decl/20100615.htm#clumping

PART 3:

Determining Resource Identity

This section proposes a standard process for determining resource identity.

Determining resource identity

1. Select assertions – what graph?

1.a. Recursively merge ontologies and URI declarations

– Ontologies and URI declarations should be cached!

2. Apply RDF semantics

• Constrains the possible interpretations for each URI

3. Select an interpretation

RDF Semantics only defines step 2!
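
The three steps can be sketched as a small pipeline. All names here are assumptions made for illustration: `DECLARATIONS` stands in for documents that would normally be dereferenced from the web, assertions are modelled as Python predicates, the universe of resources is finite, and step 3 simply takes the first candidate where a real application would apply its own policy.

```python
# Hypothetical store of URI declarations and ontologies, keyed by URI.
# Each entry: (constraints, URIs of further documents to merge).
DECLARATIONS = {
    "ex:apple": ([lambda i: "apple" in i["ex:apple"]], ["ex:fruit-ont"]),
    "ex:fruit-ont": ([lambda i: i["ex:apple"] != "pear"], []),
}
RESOURCES = ["red apple", "green apple", "pear"]


def fetch(uri, cache):
    """Look up a declaration, caching it so it is retrieved only once."""
    if uri not in cache:
        cache[uri] = DECLARATIONS.get(uri, ([], []))
    return cache[uri]


def select_assertions(graph, uris, cache):
    """Steps 1 and 1.a: start from the graph, then recursively merge
    the declarations and ontologies reachable from `uris`."""
    assertions, pending, seen = list(graph), list(uris), set()
    while pending:
        uri = pending.pop()
        if uri in seen:
            continue
        seen.add(uri)
        constraints, imports = fetch(uri, cache)
        assertions += constraints
        pending += imports
    return assertions


def apply_semantics(assertions, uri):
    """Step 2: keep the candidate referents that satisfy every assertion."""
    return [r for r in RESOURCES if all(a({uri: r}) for a in assertions)]


def select_interpretation(candidates):
    """Step 3: application-specific choice; here, simply the first."""
    return candidates[0] if candidates else None


graph = [lambda i: "green" not in i["ex:apple"]]
cache = {}
assertions = select_assertions(graph, ["ex:apple"], cache)
candidates = apply_semantics(assertions, "ex:apple")
print(candidates, select_interpretation(candidates))
```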

Resource identity with RDF semantics

[Figure: flow diagram from “Available assertions” (formal and informal assertions, with the examples <http://example#apple> ... and rdf:comment " ... " .) through “1. Select assertions”, “1.a. Get ontologies & URI declarations” and “2. Apply RDF semantics” to “Possible interpretations”, followed by “3. Select an interpretation”]

Semantic extensions

• Define additional entailment rules and constraints

– E.g., OWL or FruitOnt

• Must be monotonic

– All previous entailments still hold

• Further limit the set of possible interpretations

• Typically triggered by a predicate URI
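
A rule-based extension triggered by a predicate URI might look like the following sketch. The `EXTENSIONS` registry and the `entail` function are hypothetical, graphs are sets of (subject, predicate, object) tuples, and the single rule covers only the symmetry of owl:sameAs, a small fragment of its full semantics.

```python
def same_as_rule(graph):
    """Toy fragment of owl:sameAs: symmetry only."""
    return {(o, p, s) for (s, p, o) in graph if p == "owl:sameAs"}


# Hypothetical registry: predicate URI -> rule that derives extra triples.
EXTENSIONS = {"owl:sameAs": same_as_rule}


def entail(graph, extensions):
    """Close `graph` under the rules triggered by the predicates it uses."""
    closure = set(graph)
    while True:
        triggered = {p for (_, p, _) in closure if p in extensions}
        derived = set().union(*(extensions[p](closure) for p in triggered))
        new = derived - closure
        if not new:
            return closure
        closure |= new


graph = {
    ("ex:X", "owl:sameAs", "ex:Y"),
    ("ex:X", "rdf:type", "ex:Fruit"),
}
base = entail(graph, {})              # no extensions applied
extended = entail(graph, EXTENSIONS)  # owl:sameAs rule triggered

print(base <= extended)  # True: every previous entailment still holds
```

Because rules only ever add triples to the closure, this style of extension is monotonic by construction.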

Resource identity under semantic extensions

1. Select assertions – what graph?

1.a. Recursively merge ontologies and URI declarations

– Ontologies and URI declarations should be cached!

2. Apply RDF semantics + semantic extensions

• Predicate URI triggers the use of semantic extensions:

– Opaque plug-in, or

– Set of rules

3. Select an interpretation

Resource identity under semantic extensions

[Figure: flow diagram from “Available assertions” (formal and informal assertions, with the examples <http://example#apple> ... and rdf:comment " ... " .) through “1. Select assertions”, “1.a. Get ontologies & URI declarations” and “2. Apply RDF + extension semantics” (semantic extensions, e.g. OWL, FruitOnt) to “Possible interpretations”, followed by “3. Select an interpretation”]

Summary

• Part 0: Myths about resource identity

• Part 1: RDF semantics and ambiguity

– Interpretations

– Interpretations of a URI

• Part 2: Constraining ambiguity through URI declarations

– Bounding ambiguity

– Ambiguity and owl:sameAs

• Part 3: Determining resource identity

– Semantic extensions

Questions?


BACKUP SLIDES


Splitting URI X resource identity

• What if you really want to combine graphs A+B+C?

• URI X may be split into two URIs, e.g.:

• In graph AB = A+B, change all X to X1

• In graph BC = B+C, change all X to X2

[Figure: the sets “In graph A”, “In graph B” and “In graph C”, with overlaps “In A+B” and “In B+C”]


Effect of owl:sameAs

• X owl:sameAs Y

• Each URI has a set of possible interpretations

• owl:sameAs limits this set to the intersection

[Figure: overlapping sets “Interpretations for X” and “Interpretations for Y”]

Brief Description

• This presentation shows how ambiguity fits within standard RDF semantics, explains how it relates to owl:sameAs, and proposes a standard operational sequence for determining the referent of a URI.

Abstract

• What does a URI denote? How should its referent be determined, even in the presence of semantic extensions that affect the interpretation of an RDF graph? How should ambiguity be viewed?

• One view is that a given URI has no fixed referent, but may denote different things in different contexts. Another is that each URI should have a URI declaration that precisely delimits its interpretation. Some suggest reusing existing URIs in new contexts, while others prefer to mint new URIs and then allow owl:sameAs assertions to indicate that two URIs denote the same thing.

• This presentation sheds light on these issues by explaining how ambiguity of a URI's referent fits within standard RDF semantics and how this ambiguity applies to the use of owl:sameAs, and by proposing a standard operational sequence for determining the intended referent of a URI, even in the presence of semantic extensions.