NLP Linked Open Data "Is A" Solution
DESCRIPTION
Demonstration of solving the "is a" natural language problem using linked open data, the DBPedia.org RDF store, and Semantic Web technologies.
TRANSCRIPT
LINKED OPEN DATA SERVICES
Is a “cat” a “mammal”? … true
Is a “lizard” a “reptile”? … true
Is a “cat” a “reptile”? … false
Is a “lizard” an “animal”? … true
…..
Four types of “IS A” query
• Shallow query TO rdf:type: Is a “Cat” an “Animal”?
  • “Cat” -> rdf:type -> http://dbpedia.org/ontology/Animal
• Deep query THROUGH rdf:type: Is a “Cat” a “Eukaryote”?
  • “Cat” -> rdf:type -> http://dbpedia.org/ontology/Animal
  • “Animal” -> rdfs:subClassOf -> http://dbpedia.org/ontology/Eukaryote
Four types of “IS A” query
• Shallow query TO dcterms:subject: Is a “Cat” a “Feline”?
  • “Cat” -> dcterms:subject -> http://dbpedia.org/resource/Category:Felines
• Deep query THROUGH dcterms:subject: Is a “Cat” a “Felid”?
  • “Cat” -> dcterms:subject -> http://dbpedia.org/resource/Category:Felines
  • http://dbpedia.org/resource/Category:Felines -> skos:broader -> http://dbpedia.org/resource/Category:Felids
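The four query shapes above could be expressed as SPARQL ASK queries. The following is a minimal sketch that only builds the query text. It assumes a SPARQL 1.1 endpoint with property-path support for the deep variants; `build_ask` and the `PATHS` table are hypothetical names, not part of the original service, and no claim is made about what DBPedia.org returns for any given pair.

```python
PREFIXES = (
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
)

# (link predicate, hierarchy predicate) for each family of "is a" query
PATHS = {
    "type": ("rdf:type", "rdfs:subClassOf"),
    "category": ("dcterms:subject", "skos:broader"),
}


def build_ask(term_uri, target_uri, kind, deep=False):
    """Build an ASK query; hypothetical helper for illustration only."""
    link, parent = PATHS[kind]
    # Shallow: one hop TO the target. Deep: one hop, then zero or more
    # hops THROUGH the hierarchy predicate (SPARQL 1.1 property path).
    path = f"{link}/{parent}*" if deep else link
    return f"{PREFIXES}ASK {{ <{term_uri}> {path} <{target_uri}> }}"


if __name__ == "__main__":
    print(build_ask(
        "http://dbpedia.org/resource/Cat",
        "http://dbpedia.org/ontology/Eukaryote",
        "type",
        deep=True,
    ))
```

Note that with the `*` path operator a deep query also succeeds on a direct link, so it is a superset of the shallow query.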
CONVERT TRIPLES TO NEW VOCABULARY
Re-link the Linked Data
Original Graph from DBPedia.org
[Diagram: Term -> rdf:type -> Ontology Resource -> rdfs:subClassOf -> Ontology Resource (Structured Hierarchy); Term -> dcterms:subject -> Category Resource -> skos:broader -> Category Resource (Wikipedia Categories)]
Normalize Labels for Search
• http://dbpedia.org/page/United_States
  • “united states”
• http://dbpedia.org/ontology/PopulatedPlace
  • “populated place”
• http://dbpedia.org/class/yago/CountriesBorderingTheAtlanticOcean
  • “countries bordering the atlantic ocean”
• http://dbpedia.org/resource/Category:Former_British_colonies
  • “former british colonies”
[Diagram: Term / Resource -> rdfs:label -> String: Label]
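The label normalization shown above can be sketched in a few lines. This is an illustration under assumptions, not the original implementation: `normalize_label` is a hypothetical name, and the rules (take the last path segment, drop a `Category:` prefix, replace underscores, split camel case, lowercase) are inferred from the four examples on the slide.

```python
import re
from urllib.parse import unquote


def normalize_label(uri):
    """Derive a lowercase search label from a resource URI (assumed rules)."""
    # Last path segment, with any percent-encoding decoded.
    name = unquote(uri.rstrip("/").rsplit("/", 1)[-1])
    # Wikipedia category resources carry a "Category:" prefix.
    if name.startswith("Category:"):
        name = name[len("Category:"):]
    name = name.replace("_", " ")
    # Insert a space at each lowercase/digit-to-uppercase boundary.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Collapse runs of whitespace and lowercase the result.
    return " ".join(name.split()).lower()


if __name__ == "__main__":
    for uri in (
        "http://dbpedia.org/page/United_States",
        "http://dbpedia.org/ontology/PopulatedPlace",
        "http://dbpedia.org/class/yago/CountriesBorderingTheAtlanticOcean",
        "http://dbpedia.org/resource/Category:Former_British_colonies",
    ):
        print(normalize_label(uri))
```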
Update Types to Vulcan Vocabulary
prefix: halo-uri http://halo.vulcan.com/lod/2013/11/isa-vocabulary#
• http://dbpedia.org/page/United_States
  • halo-uri:term
  • halo-uri:term-id
• http://dbpedia.org/ontology/PopulatedPlace
  • halo-uri:rdf-type
• http://dbpedia.org/resource/Category:Former_British_colonies
  • halo-uri:wikipedia-category
[Diagram: Term / Resource -> rdf:type -> Halo Vocabulary]
Add “Is A” Connection for Graphs
prefix: halo-uri http://halo.vulcan.com/lod/2013/11/isa-vocabulary#
• http://dbpedia.org/page/United_States
  • halo-uri:isA http://dbpedia.org/ontology/Place
• http://dbpedia.org/page/United_States
  • halo-uri:isA http://dbpedia.org/resource/Category:Republic
[Diagram: Term -> halo-uri:isA -> Type, Category, or Graph]
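One way to picture the re-linking step is as a function from original triples to triples in the Halo vocabulary. The sketch below is an assumption about how that conversion might look, not the original code: `relink` and `OBJECT_CLASS` are hypothetical names, triples are plain `(s, p, o)` tuples, and `halo-uri:term-id` is left out because the slides do not say what value it holds.

```python
HALO = "http://halo.vulcan.com/lod/2013/11/isa-vocabulary#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"

# Original predicate -> Halo class assigned to the object resource.
OBJECT_CLASS = {
    RDF_TYPE: HALO + "rdf-type",
    DCT_SUBJECT: HALO + "wikipedia-category",
}


def relink(triples):
    """Convert rdf:type / dcterms:subject triples to halo-uri:isA triples."""
    out = []
    for s, p, o in triples:
        if p not in OBJECT_CLASS:
            continue  # predicates outside the two families are ignored
        out.append((s, RDF_TYPE, HALO + "term"))    # subject is a term
        out.append((o, RDF_TYPE, OBJECT_CLASS[p]))  # object keeps its origin
        out.append((s, HALO + "isA", o))            # the single new link
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(out))


if __name__ == "__main__":
    us = "http://dbpedia.org/page/United_States"
    for triple in relink([
        (us, RDF_TYPE, "http://dbpedia.org/ontology/Place"),
        (us, DCT_SUBJECT, "http://dbpedia.org/resource/Category:Republic"),
    ]):
        print(triple)
```

Collapsing both predicates into one `isA` link is what lets a single query pattern cover types and categories; the Halo class on the object preserves which family the link came from.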
Add “Is A” Connection for Ontology
prefix: halo-uri http://halo.vulcan.com/lod/2013/11/isa-vocabulary#
• http://dbpedia.org/page/United_States
  • halo-uri:isA http://dbpedia.org/ontology/Place
[Diagram: Term -> halo-uri:isA -> each level of the Ontology Tree: Place, Populated Place, Country, Thing]
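Linking a term to every level of the ontology tree amounts to walking the parent links and emitting one `halo-uri:isA` triple per ancestor. The following sketch assumes the hierarchy is available as a simple parent map; `ancestors` and `isa_links` are hypothetical names, and the Country / Populated Place / Place / Thing ordering in the usage example is illustrative, taken from the labels on the slide.

```python
def ancestors(resource, parents):
    """Return all ancestors of a resource, tolerating cycles in the data."""
    seen = []
    stack = list(parents.get(resource, ()))
    while stack:
        node = stack.pop()
        if node in seen or node == resource:
            continue  # already visited, or a cycle back to the start
        seen.append(node)
        stack.extend(parents.get(node, ()))
    return seen


def isa_links(term, direct_types, parents):
    """Emit one isA triple for each direct type and each of its ancestors."""
    targets = []
    for direct in direct_types:
        for candidate in [direct] + ancestors(direct, parents):
            if candidate not in targets:
                targets.append(candidate)
    return [(term, "halo-uri:isA", target) for target in targets]


if __name__ == "__main__":
    # Illustrative hierarchy only; not read from DBPedia.org.
    parents = {
        "Country": ["PopulatedPlace"],
        "PopulatedPlace": ["Place"],
        "Place": ["Thing"],
    }
    for triple in isa_links("United_States", ["Country"], parents):
        print(triple)
```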
Add “Is A” Connection for Categories
prefix: halo-uri http://halo.vulcan.com/lod/2013/11/isa-vocabulary#
• http://dbpedia.org/page/United_States
  • halo-uri:isA http://dbpedia.org/resource/Category:Republic
[Diagram: Term -> halo-uri:isA -> Category Tree: Countries, Republics, Political Theories, Philosophies. “Is A” connections are not applied higher up the category tree.]
New Graph with Links to DBPedia.org
[Diagram: Term (Halo type, Halo label) -> halo-uri:isA -> Ontology graph (Halo type, Halo label) and Category graph (Halo type, Halo label); side label: Search Terms and Context]
Custom graph replaces hierarchy
Original Graph from DBPedia.org
[Diagram: Term -> rdf:type -> Ontology Resource -> rdfs:subClassOf -> Ontology Resource (Structured Hierarchy); Term -> dcterms:subject -> Category Resource -> skos:broader -> Category Resource (Wikipedia Categories)]
EXAMPLE: 1
Query TO and THROUGH rdf:type
Query TO and THROUGH rdf:type
• Is a “cat” a “mammal”?
• Is a “cat” an “animal”?
• Is a “lizard” a “reptile”?
• Is a “lizard” an “animal”?
[Diagram: Cat, Lizard, Mammal, Reptile, and Animal connected by halo-uri:isA links]
Query TO and THROUGH rdf:type
• Is a “cat” a “reptile”?
• Is a “cat” an “animal”?
[Diagram: candidate halo-uri:isA links Cat -> Animal and Cat -> Reptile]
Query TO and THROUGH rdf:type
• Is a “cat” a “mal”?
[Diagram: halo-uri:isA links Cat -> Animal and Cat -> Mammal, marked “?”]
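The “mal” question illustrates what happens when the domain label is matched with an unanchored, case-insensitive regular expression: a fragment matches every label that contains it. The sketch below reproduces that behaviour locally. It assumes Python's `re` module as a stand-in for the store's regex engine (the two agree for plain alphabetic patterns like these), and `matching_domains` is a hypothetical helper.

```python
import re


def matching_domains(pattern, domain_labels, exact=False):
    """Return the labels a domain pattern would match.

    exact=False mimics an unanchored, case-insensitive regex filter;
    exact=True anchors the pattern so only whole-label matches count.
    """
    if exact:
        pattern = f"^{re.escape(pattern)}$"
    return [
        label for label in domain_labels
        if re.search(pattern, label, re.IGNORECASE)
    ]


if __name__ == "__main__":
    labels = ["animal", "mammal", "reptile"]
    print(matching_domains("mal", labels))              # ['animal', 'mammal']
    print(matching_domains("mal", labels, exact=True))  # []
    print(matching_domains("mammal", labels, exact=True))  # ['mammal']
```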
Query TO and THROUGH rdf:type
• XML: http://halo.vulcan.com:8080/isa/cat/type/animal.xml
• XML: http://halo.vulcan.com:8080/isa/cat/type-graph/animal.xml
<item>
  <id>9c5eebf630d626279fa6acbe1f50c9b9</id>
  <term>cat</term>
  <domain>animal</domain>
  <match>true</match>
  <triples>
    <s>http://dbpedia.org/resource/Cat</s>
    <p>http://halo.vulcan.com/lod/2013/11/isa-vocabulary#isA</p>
    <o>http://dbpedia.org/ontology/Animal</o>
    <search>
      <p>http://www.w3.org/2000/01/rdf-schema#label</p>
      <o>animal</o>
    </search>
  </triples>
</item>
Query TO and THROUGH rdf:type
• JSON: http://halo.vulcan.com:8080/isa/cat/type/animal.json
• JSON: http://halo.vulcan.com:8080/isa/cat/type-graph/animal.json
{
  "id":"9c5eebf630d626279fa6acbe1f50c9b9",
  "term":"cat",
  "domain":"animal",
  "match":true,
  "triples":[{
    "s":"http://dbpedia.org/resource/Cat",
    "p":"http://halo.vulcan.com/lod/2013/11/isa-vocabulary#isA",
    "o":"http://dbpedia.org/ontology/Animal",
    "search":{
      "p":"http://www.w3.org/2000/01/rdf-schema#label",
      "o":"animal"
    }
  }]
}
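A client consuming the JSON response only needs the `match` flag and, optionally, the supporting triples. The sketch below parses the sample response shown above from an embedded string rather than calling the service; `summarize` is a hypothetical helper, and the field names are taken from that sample.

```python
import json

# The sample response from the slide, embedded so the sketch runs offline.
SAMPLE = """
{
  "id": "9c5eebf630d626279fa6acbe1f50c9b9",
  "term": "cat",
  "domain": "animal",
  "match": true,
  "triples": [{
    "s": "http://dbpedia.org/resource/Cat",
    "p": "http://halo.vulcan.com/lod/2013/11/isa-vocabulary#isA",
    "o": "http://dbpedia.org/ontology/Animal",
    "search": {
      "p": "http://www.w3.org/2000/01/rdf-schema#label",
      "o": "animal"
    }
  }]
}
"""


def summarize(payload):
    """Return (match flag, list of matched object URIs) from a response."""
    data = json.loads(payload)
    objects = [triple["o"] for triple in data.get("triples", [])]
    return data["match"], objects


if __name__ == "__main__":
    matched, objects = summarize(SAMPLE)
    print(matched, objects)
```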
Query TO and THROUGH rdf:type
• Halo.Vulcan SPARQL: Is a “cat” a “mammal”?
PREFIX halo: <http://halo.vulcan.com/lod/2013/11/isa-vocabulary#>
# rdf: is predeclared by Virtuoso; declared here so the query is portable
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?p ?o ?domainLabel WHERE { GRAPH ?G {
  ?term halo:isA ?o .
  ?term ?p ?o .
  ?term <http://www.w3.org/2000/01/rdf-schema#label> ?termLabel .
  ?o <http://www.w3.org/2000/01/rdf-schema#label> ?domainLabel .
  ?o rdf:type halo:rdf-type .
  FILTER (regex(str(?termLabel), '^cat$', 'i')) .
  FILTER (regex(str(?domainLabel), 'mammal', 'i'))
}} LIMIT 100
EXAMPLE: 2
Query TO and THROUGH category
Query TO and THROUGH category
• Is a “cat” an “animal”?
[Diagram: Cat -> halo-uri:isA -> Invasive animal species; Animals described in 1758; Domesticated animals]
Searches are Returned with Triples
"triples":[{
  "s":"http://dbpedia.org/resource/Cat",
  "p":"http://halo.vulcan.com/lod/2013/11/isa-vocabulary#isA",
  "o":"http://dbpedia.org/resource/Category:Invasive_animal_species",
  "search":{
    "p":"http://www.w3.org/2000/01/rdf-schema#label",
    "o":"invasive animal species"
  }
},{
  "s":"http://dbpedia.org/resource/Cat",
  "p":"http://halo.vulcan.com/lod/2013/11/isa-vocabulary#isA",
  "o":"http://dbpedia.org/resource/Category:Animals_described_in_1758",
  "search":{
    "p":"http://www.w3.org/2000/01/rdf-schema#label",
    "o":"animals described in 1758"
  }
}]
Query TO and THROUGH category
• Halo.Vulcan SPARQL: Is a “cat” an “animal”?
PREFIX halo: <http://halo.vulcan.com/lod/2013/11/isa-vocabulary#>
# rdf: is predeclared by Virtuoso; declared here so the query is portable
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?p ?o ?domainLabel WHERE { GRAPH ?G {
  ?term halo:isA ?o .
  ?term ?p ?o .
  ?term <http://www.w3.org/2000/01/rdf-schema#label> ?termLabel .
  ?o <http://www.w3.org/2000/01/rdf-schema#label> ?domainLabel .
  ?o rdf:type halo:wikipedia-category .
  FILTER (regex(str(?termLabel), '^cat$', 'i')) .
  FILTER (regex(str(?domainLabel), 'animal', 'i'))
}} LIMIT 100
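The type query and the category query differ only in the term label, the domain label, and the Halo class of the object, so the query text could be generated from those three inputs. The sketch below is one way to do that and is not the original service code: `build_isa_query`, `SAFE_LABEL`, and the `exact_domain` flag are hypothetical. Labels are restricted to lowercase letters, digits, and spaces so that nothing needs escaping inside the quoted regex.

```python
import re

HALO = "http://halo.vulcan.com/lod/2013/11/isa-vocabulary#"
SAFE_LABEL = re.compile(r"[a-z0-9 ]+")
RESOURCE_CLASSES = ("rdf-type", "wikipedia-category")


def build_isa_query(term, domain, resource_class, exact_domain=False):
    """Build the SELECT query text for an "is a" lookup (assumed shape)."""
    if resource_class not in RESOURCE_CLASSES:
        raise ValueError(f"unknown resource class: {resource_class!r}")
    for label in (term, domain):
        # Reject anything that could break out of the quoted regex.
        if not SAFE_LABEL.fullmatch(label):
            raise ValueError(f"unsupported label: {label!r}")
    # Anchoring the domain gives literal matching instead of substring matching.
    domain_pattern = f"^{domain}$" if exact_domain else domain
    return "\n".join([
        f"PREFIX halo: <{HALO}>",
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?p ?o ?domainLabel WHERE { GRAPH ?G {",
        "  ?term halo:isA ?o .",
        "  ?term ?p ?o .",
        "  ?term rdfs:label ?termLabel .",
        "  ?o rdfs:label ?domainLabel .",
        f"  ?o rdf:type halo:{resource_class} .",
        f"  FILTER (regex(str(?termLabel), '^{term}$', 'i')) .",
        f"  FILTER (regex(str(?domainLabel), '{domain_pattern}', 'i'))",
        "}} LIMIT 100",
    ])


if __name__ == "__main__":
    print(build_isa_query("cat", "animal", "wikipedia-category"))
```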
ADDITIONAL SERVICES
More graphs, flexible service points, and unexpected features…
There are 4 graphs in Virtuoso
• http://halo.vulcan.com:8890/conductor/sparql_graph.vspx
• http://halo.vulcan.com:8890/isa/rdf-type
  • ~87,069 Triples
• http://halo.vulcan.com:8890/isa/rdf-type-graph
  • ~101,823 Triples
• http://halo.vulcan.com:8890/isa/category-type
  • ~292,239 Triples
• http://halo.vulcan.com:8890/isa/category-type-graph
  • ~560,906 Triples
652,677 NORMALIZED TRIPLES ACROSS ALL GRAPHS
“Find All” Service Points
Why query every instance? Just ask the service for all relations to a term in a given graph.
• All “is a” matches in rdf-type
  • http://halo.vulcan.com:8080/isa/cat/type/.xml
• All “is a” matches in rdf-type graph
  • http://halo.vulcan.com:8080/isa/cat/type-graph/.xml
• All “is a” matches in categories
  • http://halo.vulcan.com:8080/isa/cat/category/.xml
• All “is a” matches in categories graph
  • http://halo.vulcan.com:8080/isa/cat/category-graph/.xml
• All “is a” matches in every graph
  • http://halo.vulcan.com:8080/isa/cat/.xml
Unanticipated Features
• Absolute matching on domain for “is a” relations
• Add a URL for literal matching and update SPARQL regex
• Spacing and special characters need to be URL encoded because each call is a ‘GET’
• Wikipedia categories are of poor quality
  • User defined and often inaccurate
  • Very specific: “Animal species described in 1705”
  • Cyclical: “Republics -> Countries -> United States -> Republics …”
  • All paths lead to: Philosophy
• Plural categories make for difficult literal matching and odd “is a” statements: Is a cat a felines?
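Cyclical categories matter because any traversal of the category hierarchy has to guard against looping forever. A depth-first search that tracks the current path is one common way to detect such loops. The sketch below assumes the hierarchy is held as a plain mapping from a category to its linked categories; `find_cycle` is a hypothetical helper and the edge data simply restates the cycle quoted above.

```python
def find_cycle(start, broader):
    """Return the first cycle reachable from start as a node list, or None."""
    path = []        # nodes on the current DFS path, in order
    on_path = set()  # same nodes, for fast membership tests
    done = set()     # nodes fully explored without finding a cycle

    def visit(node):
        path.append(node)
        on_path.add(node)
        for parent in broader.get(node, ()):
            if parent in on_path:
                # Close the loop: slice from the first occurrence onward.
                return path[path.index(parent):] + [parent]
            if parent not in done:
                cycle = visit(parent)
                if cycle:
                    return cycle
        on_path.discard(node)
        done.add(node)
        path.pop()
        return None

    return visit(start)


if __name__ == "__main__":
    broader = {
        "Republics": ["Countries"],
        "Countries": ["United States"],
        "United States": ["Republics"],
    }
    print(find_cycle("Republics", broader))
```

The recursive form is fine for small illustrative graphs; a graph the size of the full category set would likely need an iterative version to avoid Python's recursion limit.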
Example Service URLs
Query rdf-types graph:
• Domain-range Query – http://halo.vulcan.com:8080/isa/cat/type/animal.json
• Term isa * Query – http://halo.vulcan.com:8080/isa/cat/type/.json
Query rdf-types and parents graph:
• Domain-range Query – http://halo.vulcan.com:8080/isa/lizard/type-graph/reptile.json
• Term isa * Query – http://halo.vulcan.com:8080/isa/lizard/type-graph/.json
Query category graph:
• Domain-range Query – http://halo.vulcan.com:8080/isa/cat/category/animal.json
• Term isa * Query – http://halo.vulcan.com:8080/isa/cat/category/.json
Query category and parents graph:
• Domain-range Query – http://halo.vulcan.com:8080/isa/cat/category-graph/animal.json
• Term isa * Query – http://halo.vulcan.com:8080/isa/cat/category-graph/.json
Query all IsA graphs (every associated entity type, category, and parent):
• Domain-range Query – http://halo.vulcan.com:8080/isa/cat/animal.json
• Term isa * Query – http://halo.vulcan.com:8080/isa/cat/.json
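The service URLs above follow a regular pattern: term, optional graph name, optional domain, then the format extension. A small builder makes that pattern explicit and handles the URL encoding that GET calls require. This is a sketch under assumptions: `service_url` is a hypothetical helper, and percent-encoding spaces as `%20` is assumed to be what the service accepts.

```python
from urllib.parse import quote

BASE = "http://halo.vulcan.com:8080/isa"
GRAPHS = ("type", "type-graph", "category", "category-graph")


def service_url(term, graph=None, domain=None, fmt="json"):
    """Build a service URL; omit graph for all graphs, domain for 'find all'."""
    if graph is not None and graph not in GRAPHS:
        raise ValueError(f"unknown graph: {graph!r}")
    parts = [BASE, quote(term, safe="")]
    if graph:
        parts.append(graph)
    # An empty domain yields ".json" / ".xml", the "Term isa *" form.
    leaf = quote(domain, safe="") if domain else ""
    parts.append(f"{leaf}.{fmt}")
    return "/".join(parts)


if __name__ == "__main__":
    print(service_url("cat", "type", "animal"))
    print(service_url("lizard", "type-graph", fmt="xml"))
    print(service_url("united states", "category"))
```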