Every Identity, its Ontology
@azaroth42
IIIF: Interoperability
Robert Sanderson
Semantic Architect
J. Paul Getty Trust
[email protected] / @azaroth42
The shared identity of the concept of the fictional person
Dr Strangelove: How I learned to stop worrying
and love inconsistency
Overview
• Linked Open Data and Identity
• Philosophical Challenges
• Practical Challenges
• Practical Philosophy
• A Philosophy of Practicality
Linked Open Data’s Potential
Linked Open Data achieves its potential when institutions:
• link outside of their own data (⭐⭐⭐⭐⭐),
• trust other organizations to manage, publish and maintain data which they use
Linked Open Data’s Challenges
Commonly cited:
• Amount of data to transform
• Data is mostly “strings”, not “things”
• Cost of new management system
• Cost of new business workflows
• Difficulty of data enrichment
• Institutional reluctance to embrace change, trust, imperfection
Identity
We need to understand the entity before we can reuse its identifier!
Questions:
1. What constitutes “identity”?
2. How does one describe entities?
3. How does one discover identifiers?
LOD Identity Fundamentals
• Open World Assumption
  • What is not stated is unknown, not false
  • No single agent or observer has complete knowledge in a distributed system
• Identifier space is infinite
  • No formal character limit for IRIs
  • Even practical limit is very large (65536 ^ length)
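To give a feel for the scale of that practical limit, here is a minimal sketch that counts the strings available at a given length. It takes the figure of 65536 possible characters per position as an assumption; the function name is hypothetical.

```python
# Assumption taken from the bullet above: 65536 possible characters per position.
ALPHABET_SIZE = 65536


def identifier_space(length: int) -> int:
    """Number of distinct character strings of exactly `length` characters."""
    return ALPHABET_SIZE ** length


# Python integers are arbitrary precision, so even long identifiers can be counted.
for n in (1, 2, 8):
    print(n, identifier_space(n))
```

Even at two characters the space already exceeds four billion candidates, so collisions between independently minted identifiers are not the practical problem.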
LOD Identity Fundamentals
• IRIs are globally unique
• IRIs used for identifying entities and relationships
• No identity for instance of a relationship
• Only one contextual identity (named graph) per statement, with inconsistent use
• Anyone may make assertions about any entity
Every Identity, …
http://www.getty.edu/art/collection/objects/249050/
… some Philosophy
• RDF falls in Plato’s “Universals” space
• Same relationship had by many entities
• No relationship instances
• Fictional entities and relationships ok
1. Indiscernibility of Identicals
for each object a:
    for each object b:
        if a === b:
            for each property P:
                P(a) === P(b)
Or … owl:sameAs
Open World Ramifications
If we know that a owl:sameAs b
And discover that a property x
Then we know that b property x
The rule is an effect of identity, it doesn’t help us determine identity.
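The inference above can be sketched in a few lines. This is an illustration only: triples are plain tuples, the `ex:` identifiers are made up, and only the subject position is rewritten (a full reasoner would also handle objects and the symmetry and transitivity of owl:sameAs itself).

```python
SAME_AS = "owl:sameAs"


def propagate_same_as(triples):
    """Copy statements about a onto b (and back) whenever a owl:sameAs b."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new statement is produced
        changed = False
        for (a, p, b) in list(inferred):
            if p != SAME_AS:
                continue
            for (s, q, o) in list(inferred):
                if q == SAME_AS:
                    continue
                for x, y in ((a, b), (b, a)):
                    if s == x and (y, q, o) not in inferred:
                        inferred.add((y, q, o))
                        changed = True
    return inferred


known = {
    ("ex:a", SAME_AS, "ex:b"),        # we know that a owl:sameAs b
    ("ex:a", "ex:property", "ex:x"),  # and discover that a property x
}
result = propagate_same_as(known)
print(("ex:b", "ex:property", "ex:x") in result)  # True
```

Note that the function needs the owl:sameAs statement as input. It applies identity; it does nothing to establish it.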
2. Identity of Indiscernibles
object a === object b if:
    for each property P:
        P(a) === P(b)
Or: If two entities share all of their properties, they are the same entity.
Open World Ramifications (1)
Uh-oh!
• There are infinite (potential) properties
• We cannot compute indiscernibility as the for loop on the properties would run forever
len(Ψ) = ∞
Indiscernibility: (∀ P ∈ Ψ)(P(a) = P(b)) → a = b
Open World Ramifications (2)
Uh-oh!!
• There are infinite (potential) properties
• [Imagine the loop could run in zero time]
• Any different property would prevent identity
• The likelihood of encountering indiscernibles is 1/∞ … or 0
Open World Ramifications (3)
Uh-oh!!!
• There are infinite (potential) properties
• Any property not asserted is just not known locally and could be known elsewhere
• To compute, you need complete knowledge of an infinite set of instances and infinite properties, and zero cost comparison.
Escaping the Infinite Loop?
But …
• Finite asserted properties
• Finite set of publishers
• Finite changes over time
Can’t we iterate over only the properties actually asserted?
Escaping the Infinite Loop?
Still need the big-triplestore-in-the-sky with all assertions from all publishers.
Answer: Google can do it!
Google, will you run a big triplestore for us?
Google is Disinclined to Acquiesce
to your Request
Escaping the Infinite Loop?
Also trivial to construct a failing case:
let Ψ = [rdfs:label]
a rdfs:label “Unknown”
b rdfs:label “Unknown”
Should not conclude that a owl:sameAs b
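The failing case can be reproduced with a naive check that iterates only over the asserted properties. This is a sketch: the function name, the `ex:` identifiers and the dictionary layout are all assumptions made for illustration.

```python
def naive_same(a, b, properties, asserted):
    """Declare a and b identical if they agree on every listed property.

    `asserted` maps (entity, property) -> value. A missing key means
    "not stated", which this naive check treats like any other value.
    """
    return all(asserted.get((a, p)) == asserted.get((b, p)) for p in properties)


psi = ["rdfs:label"]  # the only property anyone has asserted
asserted = {
    ("ex:a", "rdfs:label"): "Unknown",
    ("ex:b", "rdfs:label"): "Unknown",
}

print(naive_same("ex:a", "ex:b", psi, asserted))  # True
```

The check passes, yet two records that are both labelled “Unknown” are almost certainly different entities. Restricting the loop to asserted properties makes it terminate, but it does not make the conclusion sound.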
(╯ರ ~ ರ)╯︵┻━┻
angry tableflip
Practical Philosophy: John Locke
“You cannot know an entity’s identity, only its qualities.” (paraphrased)
This rings true:
<urn:uuid:493650E7-ACBB-40EC-B141-4F2B6C660A71>
More Properties, More Identity?
Identity is a relationship that admits of degree:
• Less than 100% identity is resemblance
• The more resemblance, the more certain the identity relation
skos:exactMatch
• “high degree of confidence that the concepts can be used interchangeably across a wide range of applications”
skos:closeMatch
• “sufficiently similar that they can be used interchangeably in some applications”
… its Ontology
Resemblance?
• Given “sufficient resemblance”, we can conclude identity for practical purposes
• Resemblance is via shared properties
• To compute resemblance, we must understand the properties shared by candidate entities
• Properties are given as predicates in LOD
• Need for shared ontology?
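One way to picture resemblance via shared properties is a simple overlap score. The sketch below uses Jaccard overlap of (property, value) pairs. The property names and the numeric thresholds are assumptions chosen for illustration; SKOS itself defines no numeric cut-offs.

```python
def resemblance(desc_a: dict, desc_b: dict) -> float:
    """Jaccard overlap of (property, value) pairs between two descriptions."""
    pairs_a = set(desc_a.items())
    pairs_b = set(desc_b.items())
    if not pairs_a and not pairs_b:
        return 0.0  # nothing asserted, nothing to compare
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def match_level(score: float, exact: float = 0.9, close: float = 0.5) -> str:
    """Map a score to a mapping relation (thresholds are hypothetical)."""
    if score >= exact:
        return "skos:exactMatch"
    if score >= close:
        return "skos:closeMatch"
    return "no match"


# Two descriptions of the same person that use the same property names
# but differ in how the label is written.
record_a = {"label": "Lewis Carroll", "birthYear": "1832", "deathYear": "1898"}
record_b = {"label": "Carroll, Lewis", "birthYear": "1832", "deathYear": "1898"}

score = resemblance(record_a, record_b)
print(score, match_level(score))  # 0.5 skos:closeMatch
```

The comparison only works because both records use the same property names. With different predicates the overlap would be empty, which is the case for a shared ontology.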
Porridge Too Hot? Too Cold?
Too few properties:
• Sufficiency of resemblance impossible
Too many properties:
• Amount of information overwhelming
• More likely to run into incompatible properties
_:Porridge crm:P51_has_former_or_current_owner
_:Papa Bear?
Understanding can then be increased by not only looking at the one entity, but where it fits within the graph of connected entities.
Now you have many resemblance problems.
Graph Isomorphism
Graphs are unlikely to have the same shape, even with a shared ontology.
Different organizations:
• know different information
• are from different domains
• have different foci
• have different contexts for the work
Costs / Values
Reuse
• Philosophically: Infinite
• Automated: Expensive
• Manual: Very Expensive
Reinvention
• Free as in Kittens!
Cheap, Fast, Good: Pick One!
And forget about picking Cheap!
Every Identity, its Ontology
In the absence of continuous community pressure, demonstration of value, and in-house expertise, even well-intentioned organizations will create their own identities and ontologies for describing entities.
Example: Lewis Carroll
Cultural Heritage Sector
• Getty ULAN
• Library of Congress NAF
• Bibliothèque nationale de France
• Deutsche Nationalbibliothek
• British Library
• ISNI
• VIAF
• SNAC
• …
Industry
• MusicBrainz (LinkedBrainz)
• IMDb (LinkedMDB)
• DBpedia
• Wikidata
• Google / Freebase
• Genealogics
• Quora
• ReadSocial
• …
Practical Philosophy
Perfect is the Enemy of the Good
We could stop requiring perfection in our use of others’ data:
• skos:exactMatch, not owl:sameAs
• Data that is good enough
• And contribute improvements!
• Persistence, not Permanence
• Target is Comprehension, not Inference
Sufficiency of Resemblance
We could publish a set of rules per class by which sufficiency of resemblance can be determined:
• Which properties must overlap?
• Which properties must be exactly the same?
• Which properties can be ignored?
• Which relationships must match?
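As a rough sketch of what a published per-class rule might look like, the block below encodes the first three questions as sets of property names. The rule structure, the property names and the function are all hypothetical; relationships are not modelled separately and would need to be treated as properties whose values are identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class ResemblanceRule:
    must_equal: set = field(default_factory=set)    # value sets must be identical
    must_overlap: set = field(default_factory=set)  # at least one shared value
    ignore: set = field(default_factory=set)        # never compared


def sufficiently_resembles(a: dict, b: dict, rule: ResemblanceRule) -> bool:
    """a and b map property name -> set of values."""
    for p in rule.must_equal - rule.ignore:
        if p not in a or p not in b or a[p] != b[p]:
            return False
    for p in rule.must_overlap - rule.ignore:
        if not (a.get(p, set()) & b.get(p, set())):
            return False
    return True


# Hypothetical rule for a Person class.
PERSON_RULE = ResemblanceRule(
    must_equal={"birthYear"},
    must_overlap={"name"},
    ignore={"note"},
)

carroll_a = {"name": {"Lewis Carroll", "Charles Lutwidge Dodgson"},
             "birthYear": {"1832"}, "note": {"local cataloguing note"}}
carroll_b = {"name": {"Carroll, Lewis", "Lewis Carroll"}, "birthYear": {"1832"}}
other = {"name": {"Lewis Carroll"}, "birthYear": {"1833"}}

print(sufficiently_resembles(carroll_a, carroll_b, PERSON_RULE))  # True
print(sufficiently_resembles(carroll_a, other, PERSON_RULE))      # False
```

Publishing the rule as data rather than burying it in code lets a consumer see why two records were, or were not, treated as the same entity.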
Resemblance Services
We could publish services to make it easier to discover and reconcile our identities:
• Auto-complete / type-ahead
• OpenRefine reconciliation
• Embeddable widgets
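The core of a type-ahead service is small. Here is a minimal in-memory sketch, assuming a hypothetical label-to-identifier index with placeholder `example.org` identifiers; a real service would sit behind an HTTP endpoint and rank results rather than sort them alphabetically.

```python
def type_ahead(prefix: str, labels: dict, limit: int = 5) -> list:
    """Return (label, identifier) pairs whose label starts with `prefix`."""
    needle = prefix.casefold()  # case-insensitive comparison
    hits = [(label, iri) for label, iri in labels.items()
            if label.casefold().startswith(needle)]
    return sorted(hits)[:limit]


# Hypothetical index; the identifiers are placeholders, not real records.
LABELS = {
    "Carroll, Lewis": "http://example.org/agent/1",
    "Carracci, Annibale": "http://example.org/agent/2",
    "Cassatt, Mary": "http://example.org/agent/3",
}

for label, iri in type_ahead("carr", LABELS):
    print(label, iri)
```

Returning the identifier alongside the label is the point: the developer picks a name and gets a reusable identity back, with no string matching left to do on their side.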
Shared Infrastructure
We could contribute to shared infrastructure for discovery and change management:
• Shared infrastructure, decentralized publication
• Notifications when data changes
• Notifications when identities are used
  • With links back from the identity
• Separate publishing / discovery concerns
Philosophy of Practicality
Five Laws of LOD
• Linked Open Data is for Use
• Every Developer, her Data
• Every Data, its Application
• Save the time of the Developer
• LOD [Community] is a growing Organism
Patrick Hochstenbach, @hochstenbach
https://www.flickr.com/photos/harris77/3357537737
Linked Open Usable Data!
• Strict identity matching is impossible
  • Target is skos:exactMatch, not owl:sameAs
• Shared ontologies are more important than precision
  • Target is comprehension, not inference
• Build services & infrastructure to enable reconciliation
  • Target audience of LOD is Developers
Pick Usable not Perfect!
Thank You!
Rob [email protected] / @azaroth42
Discuss!