Every Identity, Its Ontology

Robert Sanderson, Semantic Architect, J. Paul Getty Trust
rsanderson@getty.edu / @azaroth42

Upload: robert-sanderson

Post on 20-Mar-2017

TRANSCRIPT

@azaroth42

rsanderson@getty.edu

IIIF: Interoperability

Every Identity, Its Ontology

Every Identity, its Ontology

Robert Sanderson
Semantic Architect
J. Paul Getty Trust

rsanderson@getty.edu / @azaroth42

The shared identity of the concept of the fictional person

Dr Strangelove: How I learned to stop worrying and love inconsistency

Overview

• Linked Open Data and Identity
• Philosophical Challenges
• Practical Challenges
• Practical Philosophy
• A Philosophy of Practicality

Linked Open Data’s Potential

Linked Open Data achieves its potential when institutions:
• link outside of their own data (⭐⭐⭐⭐⭐),
• trust other organizations to manage, publish and maintain data which they use

Linked Open Data’s Challenges

Commonly cited:
• Amount of data to transform
• Data is mostly “strings”, not “things”
• Cost of new management system
• Cost of new business workflows
• Difficulty of data enrichment
• Institutional reluctance to embrace change, trust, imperfection

Identity

We need to understand the entity before we can reuse its identifier!

Questions:
1. What constitutes “identity”?
2. How does one describe entities?
3. How does one discover identifiers?

LOD Identity Fundamentals

• Open World Assumption
  • What is not stated is unknown, not false
  • No single agent or observer has complete knowledge in a distributed system

• Identifier space is infinite
  • No formal character limit for IRIs
  • Even practical limit is very large (65536 ^ length)

LOD Identity Fundamentals

• IRIs are globally unique
• IRIs used for identifying entities and relationships
• No identity for instance of a relationship
• Only one contextual identity (named graph) per statement, with inconsistent use
• Anyone may make assertions about any entity

Every Identity, …

http://www.getty.edu/art/collection/objects/249050/

… some Philosophy

• RDF falls in Plato’s “Universals” space
• Same relationship had by many entities
• No relationship instances
• Fictional entities and relationships ok

1. Indiscernibility of Identicals

for each object a:
    for each object b:
        if a === b:
            for each property P:
                P(a) === P(b)

Or … owl:sameAs

Open World Ramifications

If we know that      a owl:sameAs b
And discover that    a property x
Then we know that    b property x

The rule is an effect of identity; it doesn’t help us determine identity.
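
A minimal sketch of this rule in Python, assuming triples are plain tuples. The function name `propagate_same_as` and the `ex:` identifiers are invented for illustration. It copies every statement about one side of an owl:sameAs link onto the other side; it never produces a new owl:sameAs link, which is the point of the slide.

```python
SAME_AS = "owl:sameAs"

def propagate_same_as(triples):
    """Return the triples plus those implied by owl:sameAs (subject position only)."""
    result = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in result if p == SAME_AS]
        for a, b in pairs:
            for (s, p, o) in list(result):
                if p == SAME_AS:
                    continue
                # owl:sameAs is symmetric, so copy statements in both directions.
                for src, dst in ((a, b), (b, a)):
                    if s == src and (dst, p, o) not in result:
                        result.add((dst, p, o))
                        changed = True
    return result

# Hypothetical data mirroring the slide: a sameAs b, a property x.
known = {("ex:a", SAME_AS, "ex:b"), ("ex:a", "ex:property", "x")}
inferred = propagate_same_as(known)
print(("ex:b", "ex:property", "x") in inferred)  # True
```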

2. Identity of Indiscernibles

object a === object b if:
    for each property P:
        P(a) === P(b)

Or: If two entities share all of their properties, they are the same entity.

Open World Ramifications (1)

Uh-oh!
• There are infinite (potential) properties
• We cannot compute indiscernibility as the for loop on the properties would run forever

len(Ψ) = ∞
Indiscernibility: (∀ P ∈ Ψ)(P(a) = P(b)) → a = b
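
Written side by side, the two principles are converses of each other. In the notation of the slide, with Ψ the set of all properties and the numbering taken from the slide titles:

```latex
\begin{align*}
\text{1. Indiscernibility of identicals:} \quad
  & a = b \;\rightarrow\; (\forall P \in \Psi)\,\bigl(P(a) = P(b)\bigr) \\
\text{2. Identity of indiscernibles:} \quad
  & (\forall P \in \Psi)\,\bigl(P(a) = P(b)\bigr) \;\rightarrow\; a = b
\end{align*}
```

Only the second has to finish the loop over Ψ before it can conclude anything, which is why an infinite Ψ is a problem for it and not for the first.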

Open World Ramifications (2)

Uh-oh!!
• There are infinite (potential) properties
• [Imagine the loop could run in zero time]
• Any different property would prevent identity
• The likelihood of encountering indiscernibles is 1/∞ … or 0

Open World Ramifications (3)

Uh-oh!!!
• There are infinite (potential) properties
• Any property not asserted is just not known locally and could be known elsewhere
• To compute, you need complete knowledge of an infinite set of instances and infinite properties, and zero cost comparison.

Escaping the Infinite Loop?

But …
• Finite asserted properties
• Finite set of publishers
• Finite changes over time

Can’t we iterate over only the properties actually asserted?
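
One way to read that question, as a sketch: compare only the properties that the two descriptions actually assert, and treat a property asserted on one side only as unknown and not as a difference, per the Open World Assumption. The function name `compare_asserted`, the dictionary layout, and the sample values are assumptions made for illustration.

```python
def compare_asserted(a, b):
    """Compare two descriptions given as {property: set_of_values} dicts.

    Returns (agree, differ, unknown): properties with equal values,
    properties with different values, and properties asserted on one side only.
    """
    agree, differ, unknown = set(), set(), set()
    for prop in set(a) | set(b):
        if prop in a and prop in b:
            (agree if a[prop] == b[prop] else differ).add(prop)
        else:
            # Open world: a missing statement is unknown, not false.
            unknown.add(prop)
    return agree, differ, unknown

# Hypothetical descriptions from two publishers.
a = {"rdfs:label": {"Lewis Carroll"}, "ex:born": {"1832"}}
b = {"rdfs:label": {"Lewis Carroll"}, "ex:died": {"1898"}}
agree, differ, unknown = compare_asserted(a, b)
print(sorted(agree), sorted(differ), sorted(unknown))
```

The loop terminates, but its answer covers only the statements gathered locally, so it still depends on having collected the assertions of every publisher.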

Escaping the Infinite Loop?

Still need the big-triplestore-in-the-sky with all assertions from all publishers.

Answer: Google can do it!

Google, will you run a big triplestore for us?

Google is Disinclined to Acquiesce to your Request

Escaping the Infinite Loop?

Also trivial to construct a failing case:

let Ψ = [rdfs:label]
a rdfs:label “Unknown”
b rdfs:label “Unknown”

Should not conclude that    a owl:sameAs b
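
The failing case can be reproduced in a few lines. This sketch restricts Ψ to rdfs:label, as on the slide, and applies the naive rule; `naive_same` and the example comments are hypothetical.

```python
def naive_same(a, b, psi):
    """Identity of indiscernibles restricted to the finite property list psi."""
    return all(a.get(p) == b.get(p) for p in psi)

psi = ["rdfs:label"]
a = {"rdfs:label": "Unknown"}  # hypothetical: a record for an unidentified artist
b = {"rdfs:label": "Unknown"}  # hypothetical: a record for an unidentified place

print(naive_same(a, b, psi))  # True
```

The check succeeds, so the rule would assert a owl:sameAs b even though nothing says the two records describe the same thing.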

(╯ರ ~ ರ)╯︵┻━┻

angry tableflip

Practical Philosophy: John Locke

“You cannot know an entity’s identity, only its qualities.” (paraphrased)

This rings true:
<urn:uuid:493650E7-ACBB-40EC-B141-4F2B6C660A71>

More Properties, More Identity?

Identity is a relationship that admits of degree:
• Less than 100% identity is resemblance
• The more resemblance, the more certain the identity relation

skos:exactMatch
• “high degree of confidence that the concepts can be used interchangeably across a wide range of applications”

skos:closeMatch
• “sufficiently similar that they can be used interchangeably in some applications”

… its Ontology

Resemblance?

• Given “sufficient resemblance”, we can conclude identity for practical purposes
• Resemblance is via shared properties
• To compute resemblance, we must understand the properties shared by candidate entities
• Properties are given as predicates in LOD
• Need for shared ontology?
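
As a rough illustration of computing resemblance from shared properties, the sketch below scores two descriptions by the Jaccard overlap of their (predicate, value) pairs. Jaccard similarity is a choice made here for illustration, not something the talk prescribes; the function name, the `ex:` predicates, and the sample records are likewise assumptions.

```python
def resemblance(a, b):
    """Jaccard overlap of two sets of (predicate, value) pairs, in [0, 1]."""
    if not a and not b:
        return 0.0  # nothing asserted on either side: no evidence of resemblance
    return len(a & b) / len(a | b)

# Hypothetical records from two publishers that share an ontology.
record_a = {("rdfs:label", "Lewis Carroll"), ("ex:born", "1832"), ("ex:died", "1898")}
record_b = {("rdfs:label", "Lewis Carroll"), ("ex:born", "1832"), ("ex:occupation", "author")}

print(resemblance(record_a, record_b))  # 0.5
```

The score is only meaningful if both sides use the same predicates for the same properties, which is where the question of a shared ontology comes from.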

Porridge Too Hot? Too Cold?

Too few properties:
• Sufficiency of resemblance impossible

Too many properties:
• Amount of information overwhelming
• More likely to run into incompatible properties

_:Porridge crm:P51_has_former_or_current_owner _:Papa Bear?

Understanding can then be increased by looking not only at the one entity, but also at where it fits within the graph of connected entities.

Now you have many resemblance problems.

Graph Isomorphism

Graphs unlikely to have the same shape, even with a shared ontology.

Different organizations:
• know different information
• are from different domains
• have different foci
• have different contexts for the work

Costs / Values

Reuse
• Philosophically Infinite
• Automated: Expensive
• Manual: Very Expensive

Reinvention
• Free as in Kittens!

Cheap, Fast, Good: Pick One!
And forget about picking Cheap!

Every Identity, its Ontology

In the absence of continuous community pressure, demonstration of value, and in-house expertise, even well-intentioned organizations will create their own identities and ontologies for describing entities.

Example: Lewis Carroll

Cultural Heritage Sector
• Getty ULAN
• Library of Congress NAF
• Bibliothèque nationale de France
• Deutsche Nationalbibliothek
• British Library
• ISNI
• VIAF
• SNAC
• …

Industry
• MusicBrainz (LinkedBrainz)
• IMDb (LinkedMDB)
• DBpedia
• Wikidata
• Google / Freebase
• Genealogics
• Quora
• ReadSocial
• …

Practical Philosophy

Perfect is the Enemy of the Good

We could stop requiring perfection in our use of others’ data:
• skos:exactMatch, not owl:sameAs
• Data that is good enough
• And contribute improvements!

• Persistence, not Permanence
• Target is Comprehension, not Inference

Sufficiency of Resemblance

We could publish a set of rules per class by which sufficiency of resemblance can be determined:
• Which properties must overlap?
• Which properties must be exactly the same?
• Which properties can be ignored?
• Which relationships must match?
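
Such a rule set could be published as plain data. The sketch below is one hypothetical shape for it: the `ResemblanceRules` class, the `ex:` property names, and the `sufficiently_resembles` function are all invented for illustration and do not correspond to any existing vocabulary or service. Relationships are treated like any other property here, with IRIs as values.

```python
from dataclasses import dataclass, field

@dataclass
class ResemblanceRules:
    must_match: set = field(default_factory=set)    # value sets must be exactly the same
    must_overlap: set = field(default_factory=set)  # value sets must share at least one value
    ignore: set = field(default_factory=set)        # never considered

def sufficiently_resembles(a, b, rules):
    """a and b map property -> set of values; rules is a ResemblanceRules."""
    # Design choice in this sketch: a required property missing on either side fails.
    for prop in rules.must_match - rules.ignore:
        if prop not in a or prop not in b or a[prop] != b[prop]:
            return False
    for prop in rules.must_overlap - rules.ignore:
        if not (a.get(prop, set()) & b.get(prop, set())):
            return False
    return True

# Hypothetical rules for a "person" class and two candidate records.
person_rules = ResemblanceRules(
    must_match={"ex:born"},
    must_overlap={"ex:name"},
    ignore={"ex:note"},
)
a = {"ex:name": {"Lewis Carroll", "Charles Dodgson"}, "ex:born": {"1832"}, "ex:note": {"x"}}
b = {"ex:name": {"Lewis Carroll"}, "ex:born": {"1832"}, "ex:note": {"y"}}
print(sufficiently_resembles(a, b, person_rules))  # True
```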

Resemblance Services

We could publish services to make it easier to discover and reconcile our identities:
• Auto-complete / type-ahead
• OpenRefine reconciliation
• Embeddable widgets
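
The simplest of these services, type-ahead, amounts to a prefix lookup from labels to identifiers. A minimal in-memory sketch, assuming a sorted list of (label, IRI) pairs; the labels and `https://example.org/` IRIs are placeholders, and a real service would sit behind an HTTP endpoint.

```python
from bisect import bisect_left

# Hypothetical (lowercased label, IRI) pairs, kept sorted by label.
INDEX = sorted([
    ("lewis carroll", "https://example.org/person/1"),
    ("lewis hine", "https://example.org/person/2"),
    ("louise bourgeois", "https://example.org/person/3"),
])

def suggest(prefix, limit=10):
    """Return up to `limit` (label, IRI) pairs whose label starts with prefix."""
    prefix = prefix.lower()
    # A 1-tuple sorts before any 2-tuple with the same first element,
    # so this finds the first entry whose label is >= prefix.
    start = bisect_left(INDEX, (prefix,))
    matches = []
    for label, iri in INDEX[start:]:
        if not label.startswith(prefix):
            break
        matches.append((label, iri))
        if len(matches) == limit:
            break
    return matches

print(suggest("Lewis"))
```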

Shared Infrastructure

We could contribute to shared infrastructure for discovery and change management:
• Shared infrastructure, decentralized publication
• Notifications when data changes
• Notifications when identities are used
  • With links back from the identity
• Separate publishing / discovery concerns

Philosophy of Practicality

Five Laws of LOD

• Linked Open Data is for Use
• Every Developer, her Data
• Every Data, its Application
• Save the time of the Developer
• LOD [Community] is a growing Organism

Patrick Hochstenbach, @hochstenbach

https://www.flickr.com/photos/harris77/3357537737

Linked Open Usable Data!

• Strict identity matching is impossible
  • Target is skos:exactMatch, not owl:sameAs

• Shared ontologies are more important than precision
  • Target is comprehension, not inference

• Build services & infrastructure to enable reconciliation
  • Target audience of LOD is Developers

Pick Usable not Perfect!

Thank You!

Rob Sanderson
rsanderson@getty.edu / @azaroth42

Discuss!