2015 eswc framebase presentation
FrameBase: Representing N-ary Relations Using Semantic Frames
Jacobo Rouces, Aalborg University
Gerard de Melo, Tsinghua University
Katja Hose, Aalborg University
Jacobo Rouces, Gerard De Melo, Katja Hose
02/06/15 2
Ways to represent N-ary relations.
● Using direct binary relations
– Used by “default” in most KBs. Dereified.
● RDF reification
– YAGO, YAGO2s
● Subproperties
– Proposed in [Nguyen et al., WWW 2014]
● Neo-Davidsonian representations
– Used to an extent in most KBs that include events: Freebase, FrameBase
Ways to represent N-ary relations
Direct Binary Relations
● Pairwise properties around an event (unreified)
✗ From N up to N(N-1) triples:
person1 gotMarriedWith person2
person1 gotMarriedInPlace place
person2 gotMarriedInPlace place
person1 gotMarriedOnDate time
person2 gotMarriedOnDate time
person1 ceremonyType marriageCeremonyType
person2 ceremonyType marriageCeremonyType
place holdWeddingOnDate time
✗ Without events, connections are unknown:
Sarkozy gotMarriedWith Carla_Bruni
Sarkozy gotMarriedWith Cécilia_Attias
Sarkozy gotMarriedOnDate 2007
Sarkozy gotMarriedOnDate 1996
Which date belongs to which marriage?
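The ambiguity can be made concrete with a small sketch. This is illustrative Python, not something from the slides: the helper name `objects` is an assumption, and the triples are the four listed above.

```python
# Direct binary triples from the slide: no event node ties a spouse to a date.
triples = [
    ("Sarkozy", "gotMarriedWith", "Carla_Bruni"),
    ("Sarkozy", "gotMarriedWith", "Cécilia_Attias"),
    ("Sarkozy", "gotMarriedOnDate", "2007"),
    ("Sarkozy", "gotMarriedOnDate", "1996"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# The only available join is on the shared subject, so every spouse
# ends up paired with every date.
candidates = [
    (spouse, date)
    for spouse in objects("Sarkozy", "gotMarriedWith")
    for date in objects("Sarkozy", "gotMarriedOnDate")
]

for spouse, date in candidates:
    print(spouse, date)
```

The join yields every spouse/date combination, and nothing in the data says which pairs are the real ones.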
Ways to represent N-ary relations
Direct Binary Relations
[Figure: graph over entities e1, e2, e3 linked by direct binary relations]
e1 p e2 .
e2 q e3 .
e3 r e4 .
Ways to represent N-ary relations
RDF reification
[Figure: a Statement node linking e1 and e2]
● Original triple:
e1 p e2
● Reified with additional triples, where r signifies the triple:
r rdf:type rdf:Statement
r rdf:subject e1
r rdf:predicate p
r rdf:object e2
– RDF reification differs from (general) reification, where the new entity r would signify not a triple but the event or frame evoked by a property.
● This other kind is central to FrameBase and is covered later.
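As a minimal sketch of the expansion described above, the function below turns one triple into its four reification triples. The function name `reify` and the tuple encoding are assumptions made for illustration; the property names are the standard RDF reification vocabulary (`rdf:subject`, `rdf:predicate`, `rdf:object`).

```python
def reify(statement_id, triple):
    """Expand one (s, p, o) triple into the four RDF reification triples."""
    s, p, o = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    ]


# The statement node r stands for the triple itself, not for an event.
reified = reify("r", ("e1", "p", "e2"))
for t in reified:
    print(*t)
```

Every reified triple costs four triples, which is the overhead that the later slides on YAGO-style reification point to.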
Ways to represent N-ary relations
RDF reification
[Figure, built up over two slides: entities e1, e2, e3; in the second build the relations between them are reified through separate Statement nodes]
Ways to represent N-ary relations
RDF reification
● Possible third way: reifying a primary triple (YAGO). But:
✗ 4-fold overhead when using pure RDF, or need for quads. Lower triplestore performance and cumbersome queries.
✗ The advantage (also including the direct binary relation) holds only for the primary pair. For the other direct binary relations, more reifications are needed.
✗ Which one is the primary pair? Can the user replicate the choice?
✗ Mixing metadata with data leads to ambiguity and errors in LOD:
Would something like “:factId :time 2013” mean that Einstein won the Nobel Prize in the 21st century, or that the triple was created at that time?
✗ Non-unique triple IDs when several instances of the event share the primary pair.
Ways to represent N-ary relations
Neo-Davidsonian representation
● Reified properties (connecting properties around an event).
✔ N+1 triples:
event type Marriage
event partner Sarkozy
event partner Carla_Bruni
event time 2007
event location Paris
event manner civilCeremony
✔ Unlike the case with direct binary predicates, events can be separated:
event2 type Marriage
event2 partner Sarkozy
event2 partner Cécilia_Attias
event2 time 1996
A.k.a. Neo-Davidsonian representation
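The sketch below shows how an event node removes the ambiguity. It is an illustration only: the helper names `make_event` and `marriage_dates` are assumptions, and the property names are normalized to `type`, `partner`, and `time`.

```python
def make_event(event_id, event_type, roles):
    """Return N+1 triples: one type triple plus one triple per role filler."""
    triples = [(event_id, "type", event_type)]
    triples += [(event_id, role, filler) for role, filler in roles]
    return triples


kb = make_event("event1", "Marriage", [
    ("partner", "Sarkozy"), ("partner", "Carla_Bruni"), ("time", "2007"),
]) + make_event("event2", "Marriage", [
    ("partner", "Sarkozy"), ("partner", "Cécilia_Attias"), ("time", "1996"),
])


def marriage_dates(kb, a, b):
    """Return the times of events in which both a and b fill a partner role."""
    events = {s for s, p, o in kb if p == "partner" and o == a}
    events &= {s for s, p, o in kb if p == "partner" and o == b}
    return sorted(o for s, p, o in kb if s in events and p == "time")


print(marriage_dates(kb, "Sarkozy", "Carla_Bruni"))
print(marriage_dates(kb, "Sarkozy", "Cécilia_Attias"))
```

Because each date hangs off an event rather than off a person, the query returns only the date of the event that both partners share.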
Ways to represent N-ary relations
Neo-Davidsonian representation
● Example from http://purl.org/vocab/bio/0.1/Marriage
_:e a bio:Marriage ;
    dc:date "1903" ;
    bio:partner dbpedia:Albert_Einstein ;
    bio:partner dbpedia:Mileva_Mari%C4%87 ;
    bio:place dbpedia:Bern .
Ways to represent N-ary relations
Neo-Davidsonian representation
[Figure: an Event node with a type edge, linked to e1, e2, e3]
Ways to represent N-ary relations
● Using different representations is troublesome:
✗ Low recall when querying
– The user may use a different schema to model the query
✗ Alignment hindered
– Ontology alignment systems usually search for direct equivalences between classes, properties, etc.
FrameBase
● Core: RDFS schema to represent knowledge using a neo-Davidsonian approach with a wide and extensible vocabulary of
– frames (events, situations, eventualities…)
– frame elements (outgoing properties representing frame-specific semantic roles)
● Vocabulary based on NLP resources (FrameNet + WordNet)
– This provides a connection with natural language and semantic role labeling systems.
● Inference rules to provide direct binary predicates
?f a :frame-Separating-partition.v
?f :fe-Separating-Whole ?s
?f :fe-Separating-Parts ?o
IFF
?s :isPartitionedIntoParts ?o
We will explain these points now...
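A minimal sketch of how such a rule could derive a direct binary predicate from a frame instance. It uses plain Python tuples rather than a real inference engine; the function name `dereify` and the `ex:` entities are hypothetical, while the frame, frame-element, and predicate names are those on the slide.

```python
FRAME = ":frame-Separating-partition.v"
WHOLE = ":fe-Separating-Whole"
PARTS = ":fe-Separating-Parts"
DIRECT = ":isPartitionedIntoParts"


def dereify(triples):
    """For each frame instance with Whole and Parts, derive a direct triple."""
    frames = {s for s, p, o in triples if p == "a" and o == FRAME}
    derived = set()
    for f in frames:
        wholes = [o for s, p, o in triples if s == f and p == WHOLE]
        parts = [o for s, p, o in triples if s == f and p == PARTS]
        derived |= {(w, DIRECT, part) for w in wholes for part in parts}
    return derived


# Hypothetical data: one frame instance f1 with both frame elements filled.
kb = {
    ("f1", "a", FRAME),
    ("f1", WHOLE, "ex:Country"),
    ("f1", PARTS, "ex:Provinces"),
}
print(dereify(kb))
```

In this direction the rule has a single triple as its conclusion, so it needs no new nodes.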
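FrameBase: Core schema
[Figure: a frame instance node with a type edge to its frame class, linked to e1, e2, e3 by frame elements. Labels: FRAME CLASS, FRAME ELEMENT (FRAME-SPECIFIC SEMANTIC ROLES), FRAME INSTANCE]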
FrameBase: Core schema
● Problems using FrameNet:
✗ Coverage is limited
✗ Some frames and FEs are too general
☞ Create microframes with lexical units (LUs)
✗ Too many near-equivalent frames now! Sparsity.
☞ We must cluster near-equivalent senses by aligning and extending with WordNet (algorithm in the paper)
● Using synsets and lexical-semantic pointers we group
– Synonyms
– Near-equivalent senses
– Morphosemantic variations, e.g. nominalizations
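The grouping step can be pictured as computing connected components over near-equivalence links. The sketch below uses a generic union-find and is not the algorithm from the paper; the function name `cluster` and the links between lexical units are illustrative assumptions.

```python
def cluster(items, links):
    """Group items into connected components of the given pairwise links."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent pointers to the root, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical near-equivalence links (synonymy, nominalization).
items = ["defect.v", "defection.n", "desert.v", "desertion.n",
         "retreat.v", "withdraw.v", "withdrawal.n"]
links = [("defect.v", "desert.v"), ("defect.v", "defection.n"),
         ("desert.v", "desertion.n"), ("retreat.v", "withdraw.v"),
         ("withdraw.v", "withdrawal.n")]
clusters = cluster(items, links)
print(clusters)
```

With these assumed links, the seven lexical units collapse into two clusters, which is the kind of sparsity reduction the slide describes.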
FrameBase: Core schema
[Figure: microframes of :frame-Quitting_a_place and the WordNet lemmas attached to them]
LU-microframes: ..defect.v, ..defection.n, ..desert.v, ..desertion.n
Synset-microframes: ..desertion_n_00055315, ..defect_v_02584097, ..abandon_v_00614057, ..deserter_n_10007109, ..deserter_n_10006842
LU-microframes: ..retreat.v, ..withdraw.v, ..withdrawal.n
Synset-microframes: ..receding_n_00057486, ..pullback_n_00056688, ..withdraw_v_01994442, ..withdrawal_n_00053913
Lemma labels shown in the figure:
deserter, turncoat, apostate, ratter, recreant, renegade
desertion, abandonment, defection
deserter, defector
defect, desert
abandon, desert, desolate, forsake
pullback, receding, recession
withdraw, retire, retreat
draw back, pull back, move back, recede, pull away
withdrawal
FrameBase: Reification-dereification rules
● Challenge using the neo-Davidsonian representation: the reification provided by frames is necessary when more than two slots/arguments are filled, but in other cases it is not.
✗ Overhead in querying and storing.
FrameBase: Reification-dereification rules
☞ Solution in FrameBase: two-layered structure.
– Create two levels of reification, and inference rules that connect them.
● Reified knowledge using frames and frame elements
● Dereified knowledge using direct binary predicates
– Rules are definite clauses (easy for inference engines)
[Figure: an Event node with a type edge, linked to e1, e2, e3]
?f a :frame-Separating-partition.v
AND ?f :fe-Separating-Whole ?s
AND ?f :fe-Separating-Parts ?o
IFF
?s ..-isPartitionedIntoParts ?o
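Read in the other direction, the rule has to introduce a frame instance for each direct binary triple. The sketch below illustrates that step under stated assumptions: the function name `reify_direct`, the blank-node naming scheme, and the `ex:` entities are hypothetical, and the direct predicate is written `:isPartitionedIntoParts` because its prefix is elided on the slide.

```python
from itertools import count

FRAME = ":frame-Separating-partition.v"
WHOLE = ":fe-Separating-Whole"
PARTS = ":fe-Separating-Parts"
DIRECT = ":isPartitionedIntoParts"

_fresh = count(1)  # source of fresh blank-node labels (an assumption)


def reify_direct(triple):
    """Turn one direct binary triple into a frame instance with two roles."""
    s, p, o = triple
    if p != DIRECT:
        return []
    f = f"_:f{next(_fresh)}"
    return [(f, "a", FRAME), (f, WHOLE, s), (f, PARTS, o)]


frame_triples = reify_direct(("ex:Country", DIRECT, "ex:Provinces"))
for t in frame_triples:
    print(*t)
```

Whether the dereified triples are materialized or the reified ones is a storage decision; the sketch only shows that the two layers carry the same information for two-argument cases.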
Example: Win_prize frame
[Figure, built up over four slides: a frame instance of type :frame-Win_prize-win.v. Node labels: yago:A_Einstein, yago:Nobel_Prize, 1921^^xsd:date, yago:Photoelectric_effect, and a second frame instance of type frame:Working_on-work.n with 1905^^xsd:date. Frame-element edge labels: ...-competitor, fe-Win_prize-competition, fe-Win_prize-prize, ...-time, ...-explanation, fe-Working_on-agent, ...-domain, ...-time. The third build marks several edges with “?”; the final build shows the direct binary predicates winsByCompetitor, winsAtTime, isWonAtTime and worksAtTime in their place.]
BEYOND TIME AND LOCATION!
FrameBase: Reification-dereification rules
● FrameBase: two-layered structure:
☞ Create two levels of reification, and inference rules that connect them.
● Reified knowledge using frames and frame elements
● Dereified knowledge using direct binary predicates
– Rules are Horn clauses (good for inference engines)
– Around 15,000 rules and direct binary predicates are created automatically.
– Different storage strategies are possible.
?f a :frame-Separating-partition.v
AND ?f :fe-Separating-Whole ?s
AND ?f :fe-Separating-Parts ?o
IFF
?s ..-isPartitionedIntoParts ?o
FrameBase: Integration rules
● Integration rules from source KBs can be created with SPARQL CONSTRUCT queries (and optionally an RDFier)
CONSTRUCT {
  _:e a framebase:frame-People_by_jurisdiction-citizen.n .
  _:e framebase:fe-People_by_jurisdiction-Person ?person .
  _:e framebase:fe-People_by_jurisdiction-Jurisdiction ?country .
} WHERE {
  ?person freebase:people.person.nationality ?country .
}
● More examples in the DeRiVE 2015 paper “Representing Specialized Events with FrameBase”
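To show what the CONSTRUCT query above produces, here is a rough Python emulation over tuples. It does not replace a SPARQL engine: the function name `integrate_nationality`, the blank-node labels, and the `ex:` entities are assumptions, while the property and frame names are copied from the query.

```python
from itertools import count

NATIONALITY = "freebase:people.person.nationality"
FRAME = "framebase:frame-People_by_jurisdiction-citizen.n"
PERSON = "framebase:fe-People_by_jurisdiction-Person"
JURISDICTION = "framebase:fe-People_by_jurisdiction-Jurisdiction"


def integrate_nationality(source_triples):
    """Emit one frame instance (three triples) per nationality triple."""
    fresh = count(1)
    out = []
    for person, prop, country in source_triples:
        if prop != NATIONALITY:
            continue  # other source properties need their own rules
        e = f"_:e{next(fresh)}"  # fresh blank node per solution, as in CONSTRUCT
        out += [(e, "a", FRAME), (e, PERSON, person), (e, JURISDICTION, country)]
    return out


# Hypothetical source data.
source = [
    ("ex:alice", NATIONALITY, "ex:Denmark"),
    ("ex:alice", "ex:unrelated", "ex:Something"),
    ("ex:bob", NATIONALITY, "ex:China"),
]
integrated = integrate_nationality(source)
for t in integrated:
    print(*t)
```

Each matching source triple yields its own frame instance, mirroring how a blank node in a CONSTRUCT template is instantiated once per query solution.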
Results
● RDFS schema of size 250,407 triples
– Using FrameNet-WordNet mapping with precision = 0.789
– It provides 19,376 frames with lexical labels
● A total of 18,357 microframes
– 11,939 LU-microframes
– 6,418 synset-microframes
– Grouped into 8,145 logical clusters:
● sets of microframes whose elements are linked by a logical near-equivalence relation.
● We automatically generate 14,930 reification-dereification rules for the same number of direct binary predicates.
– Human-readable
– 86.59% ± 6.41% were correct.
Data
● More information: http://framebase.org
● Data is open-source.
– License: CC-BY 4.0 International
– Everybody is welcome to publish their data using the FrameBase schema!
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement No. FP7-SEC-2012-312651 (ePOOLICE project). Additional funding was provided by National Basic Research Program of China Grants 2011CBA00300, 2011CBA00301, and NSFC Grants 61033001, 61361136003.
Conclusion
● FrameBase offers a reusable, wide-range, semantically rich, natural-language-related and extensible schema for representation of n-ary relations, events, situations, processes, natural kinds, etc. (in general: frames).
● Two levels of representation: reified and dereified.
● Future work:
– Automatic integration of source KBs
– Interfacing with NL and QA (SEMAFOR).