

Unsupervised Induction of Frame-Semantic Representations

Ashutosh Modi, Ivan Titov, Alexandre Klementiev
amodi@mmci.uni-saarland.de, titov@mmci.uni-saarland.de, aklement@mmci.uni-saarland.de

Approximate MAP inference
o Initialize with one frame per predicate
o Iteratively merge frames (greedy search)
o Each frame merge involves greedy role alignment
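The greedy search above can be sketched as follows. This is only an illustration under stated assumptions: `greedy_merge`, `toy_score` and the `FILLERS` table are hypothetical names, the toy objective stands in for the model's posterior (which the poster does not spell out), and the role alignment performed at each merge is omitted.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Set

Frame = FrozenSet[str]  # a frame is represented as a set of predicate lemmas


def greedy_merge(predicates: List[str],
                 score: Callable[[Set[Frame]], float]) -> Set[Frame]:
    """Start with one frame per predicate, then repeatedly apply the best merge."""
    frames: Set[Frame] = {frozenset([p]) for p in predicates}
    current = score(frames)
    while len(frames) > 1:
        best_gain, best_frames = 0.0, None
        for a, b in combinations(frames, 2):
            candidate = (frames - {a, b}) | {a | b}
            gain = score(candidate) - current
            if gain > best_gain:
                best_gain, best_frames = gain, candidate
        if best_frames is None:  # no merge improves the objective: stop
            break
        frames, current = best_frames, current + best_gain
    return frames


# Hypothetical toy data: argument fillers observed with each predicate.
FILLERS = {
    "climb": {"hill", "Everest"},
    "ascend": {"hill"},
    "eat": {"apple"},
}


def toy_score(frames: Set[Frame]) -> float:
    """Assumed stand-in objective: reward predicate pairs that share fillers."""
    total = 0.0
    for frame in frames:
        for a, b in combinations(sorted(frame), 2):
            shared = len(FILLERS[a] & FILLERS[b])
            total += shared if shared else -1.0
    return total


result = greedy_merge(["climb", "ascend", "eat"], toy_score)
```

With this toy objective, `climb` and `ascend` end up in one frame because they share the filler `hill`, while `eat` stays on its own.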

For a collection of sentences:

1.  Identify predicates

2.  Identify arguments

3.  Label predicates with frames

4.  Label arguments with roles

Steps 1 and 2: oracle; can be handled by heuristics (Lang and Lapata, ACL’11)

Steps 3 and 4: our focus, modeled jointly


Task Definition

In frame semantics, a semantic frame is a conceptual structure describing a situation, object, or event along with associated properties and participants. In this work we focus on inducing frames and their roles from unlabeled data.

The Model

Frame Induction
• Cluster predicates (each cluster is a frame)

Key Signals
Related predicates have:
• Similar argument fillers
• Similar mapping between syntax and semantics

Role Induction
• Associate arguments with syntactic signatures (argkeys)
• Cluster argkeys (each argkey cluster is a role)

Key Signals
• Argkeys with similar argument fillers clustered together
• Most roles occur once (per predicate occurrence)
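To make the notion of an argkey concrete, here is a small sketch. It assumes, based only on the labels shown on the poster (for example `ACT:LEFT:NSUBJ`), that an argkey joins voice, position relative to the predicate, and dependency relation. `make_argkey`, `fillers_by_argkey` and the `OBSERVED` list are hypothetical names, and the tuples simply transcribe the poster's example labels.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def make_argkey(voice: str, position: str, relation: str) -> str:
    """Join the three components into a label such as 'ACT:LEFT:NSUBJ'."""
    return ":".join(part.upper() for part in (voice, position, relation))


# (argument filler, voice, position, relation), copied from the poster's example.
OBSERVED = [
    ("Hillary", "act", "left", "nsubj"),
    ("Everest", "act", "left", "dobj"),
    ("John", "act", "left", "nsubj"),
    ("hill", "act", "left", "dobj"),
    ("Mary", "act", "left", "nsubj"),
    ("hill", "act", "left", "prep_up"),
]


def fillers_by_argkey(
        observed: Iterable[Tuple[str, str, str, str]]) -> Dict[str, Set[str]]:
    """Collect the set of argument fillers seen with each argkey."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for filler, voice, position, relation in observed:
        table[make_argkey(voice, position, relation)].add(filler)
    return dict(table)


fillers = fillers_by_argkey(OBSERVED)
# Argkeys that share fillers are candidates for the same role.
shared = fillers["ACT:LEFT:DOBJ"] & fillers["ACT:LEFT:PREP_UP"]
```

Here `ACT:LEFT:DOBJ` and `ACT:LEFT:PREP_UP` share the filler `hill`, which is the kind of signal that would place them in the same role cluster.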

Assumptions
o We consider verbal predicates only
o Every verb belongs to a single frame, i.e. we do not model polysemy
o Priors encode sparsity of selectional preferences and predicates over frames

o Related to Titov and Klementiev, ACL’11, EACL’12
o Can be extended to share alternation patterns across frames (as in EACL’12)
o Can be extended to induce cross-cutting clusters of argument fillers and multi-word expressions (as in ACL’11)

[Figure labels: frame evoking element (a predicate); frame elements (semantic roles)]

Evaluation

Evaluation was done on the FrameNet corpus: 158,048 sentences with 3,474 unique verbal predicates and 722 gold frames.

[Panels: Qualitative Evaluation; Quantitative Evaluation, with result tables for role labeling and frame labeling]

METRICS
• Purity (PU): extent to which predicted cluster occurrences share the same gold label
• Collocation (CO): extent to which gold label is assigned to a single cluster
• F1: harmonic mean of PU and CO
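The three metrics can be computed as in the sketch below. The poster describes them only in words, so the formulas are an assumption: the commonly used clustering definitions, where PU credits each predicted cluster with its most frequent gold label and CO credits each gold label with its most frequent predicted cluster. `purity_collocation_f1` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def purity_collocation_f1(predicted: Sequence[Hashable],
                          gold: Sequence[Hashable]) -> Tuple[float, float, float]:
    """Return (PU, CO, F1) for parallel sequences of cluster ids and gold labels."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(gold)
    joint = Counter(zip(predicted, gold))  # size of each cluster/label overlap

    best_in_cluster: Dict[Hashable, int] = {}  # largest overlap per predicted cluster
    best_for_label: Dict[Hashable, int] = {}   # largest overlap per gold label
    for (cluster, label), count in joint.items():
        best_in_cluster[cluster] = max(best_in_cluster.get(cluster, 0), count)
        best_for_label[label] = max(best_for_label.get(label, 0), count)

    pu = sum(best_in_cluster.values()) / n
    co = sum(best_for_label.values()) / n
    f1 = 2 * pu * co / (pu + co)
    return pu, co, f1


# Six occurrences, two predicted clusters, three gold labels.
pu, co, f1 = purity_collocation_f1([0, 0, 0, 1, 1, 1],
                                   ["A", "A", "B", "B", "B", "C"])
```

For this toy input PU is 4/6, CO is 5/6 and F1 is 20/27. Putting every occurrence in one cluster maximizes CO at the cost of PU, and one cluster per occurrence does the reverse, which is why the harmonic mean is reported.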

Our unsupervised model simultaneously induces frames (clusters of predicates) and roles (clusters of argkeys) by exploiting distributional context.

Related to Levin classes

Pruned with a coarse cosine heuristic.

Example: the Intentional_traversing frame

Predicates: Cut, Climb, Ascend, Ford, Traverse

Semantic roles: Self_mover, Goal, Source, Co-participant

Example sentences:
Hillary climbed Everest.
John ascended the hill.
Mary climbed up the hill.

Generative Story

for each frame f:                          [For each frame we generate a prior partition of argkeys]
    for each occurrence of frame f:
        p ∼ φ_f                            [Draw a predicate]
        for every role r ∈ B_f:
            if [n ∼ Unif(0, 1)] = 1:       [Check if the role appears at least once]
                GenArgument(f, r)          [Generate first argument]
                while [n ∼ ψ_{f,r}] = 1:   [If role appears more than once]
                    GenArgument(f, r)      [Generate more arguments]

GenArgument(f, r):
    k_{f,r} ∼ Unif(1, ..., |r|)            [Draw argkey]
    x_{f,r} ∼ θ_{f,r}                      [Draw argument filler]
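The generative story for one frame can be simulated with a few lines of code. This is a sketch under assumptions, not the authors' implementation: the parameters are fixed toy values (`PHI`, `ROLES`, `THETAS`, `PSIS` are hypothetical) rather than draws from the priors, and `[n ∼ Unif(0, 1)] = 1` is read as a fair coin flip over {0, 1}.

```python
import random
from typing import Dict, List, Tuple

Argument = Tuple[str, str]  # (argkey, argument filler)


def weighted_draw(dist: Dict[str, float], rng: random.Random) -> str:
    """Draw one key from a discrete distribution given as {item: probability}."""
    items = list(dist)
    return rng.choices(items, weights=[dist[i] for i in items], k=1)[0]


def gen_argument(argkeys: List[str], theta: Dict[str, float],
                 rng: random.Random) -> Argument:
    argkey = rng.choice(argkeys)        # draw argkey uniformly from the role
    filler = weighted_draw(theta, rng)  # draw argument filler for the role
    return argkey, filler


def gen_occurrence(phi: Dict[str, float], roles: List[List[str]],
                   thetas: List[Dict[str, float]], psis: List[float],
                   rng: random.Random) -> Tuple[str, List[Argument]]:
    predicate = weighted_draw(phi, rng)  # draw a predicate
    arguments: List[Argument] = []
    for argkeys, theta, psi in zip(roles, thetas, psis):
        if rng.random() < 0.5:           # does the role appear at least once?
            arguments.append(gen_argument(argkeys, theta, rng))
            while rng.random() < psi:    # does the role appear again?
                arguments.append(gen_argument(argkeys, theta, rng))
    return predicate, arguments


# Hypothetical fixed parameters for a single frame.
PHI = {"climb": 0.6, "ascend": 0.4}
ROLES = [["ACT:LEFT:NSUBJ"], ["ACT:LEFT:DOBJ", "ACT:LEFT:PREP_UP"]]
THETAS = [{"Hillary": 0.4, "John": 0.3, "Mary": 0.3},
          {"Everest": 0.5, "hill": 0.5}]
PSIS = [0.05, 0.05]  # small values: most roles occur once per occurrence

sample = gen_occurrence(PHI, ROLES, THETAS, PSIS, random.Random(0))
```

Keeping the continuation probabilities in `PSIS` small mirrors the key signal that most roles occur once per predicate occurrence.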

Parameters

For each frame f:
    φ_f ∼ DP(γ, H^(P))            [Distribution of lexical units]
    B_f ∼ CRP(α)                  [Partition of argkeys]
    for each role r ∈ B_f:
        θ_{f,r} ∼ DP(β, H^(A))    [Distribution of argument fillers]
        ψ_{f,r} ∼ Beta(η0, η1)    [Geometric distribution of duplicate roles]
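As an illustration of two of these priors, the sketch below draws a partition of argkeys from a Chinese restaurant process and one duplicate-role parameter from a Beta distribution. The function name `sample_crp_partition` and all numeric values are assumptions chosen for the example. The Dirichlet process draws are left out.

```python
import random
from typing import List, Sequence


def sample_crp_partition(items: Sequence[str], alpha: float,
                         rng: random.Random) -> List[List[str]]:
    """Seat items one by one: join a cluster with probability proportional to
    its size, or open a new cluster with probability proportional to alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    clusters: List[List[str]] = []
    for item in items:
        weights = [len(c) for c in clusters] + [alpha]
        choice = rng.choices(range(len(clusters) + 1), weights=weights, k=1)[0]
        if choice == len(clusters):
            clusters.append([item])  # open a new cluster (a new role)
        else:
            clusters[choice].append(item)
    return clusters


ARGKEYS = ["ACT:LEFT:NSUBJ", "ACT:LEFT:DOBJ", "ACT:LEFT:PREP_UP",
           "ACT:LEFT:PREP_WITH", "PASS:LEFT:NSUBJPASS", "PASS:RIGHT:PREP_BY"]

rng = random.Random(0)
partition = sample_crp_partition(ARGKEYS, alpha=1.0, rng=rng)  # one draw of B_f
psi = rng.betavariate(1.0, 20.0)  # one draw of a duplicate-role probability
```

A smaller `alpha` favors fewer, larger roles, and a Beta prior with most of its mass near zero encodes the expectation that roles are rarely duplicated.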

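Occurrence counts

Example sentences and the argkeys of their arguments:
John ascended the hill. John: ACT:LEFT:NSUBJ; hill: ACT:LEFT:DOBJ
Mary climbed up the hill. Mary: ACT:LEFT:NSUBJ; hill: ACT:LEFT:PREP_UP
Hillary climbed Everest. Hillary: ACT:LEFT:NSUBJ; Everest: ACT:LEFT:DOBJ

Resulting frame, with the argument fillers counted for each argkey cluster:
Predicates: climb; ascend
ACT:LEFT:NSUBJ (Hillary, John, Mary)
ACT:LEFT:DOBJ; ACT:LEFT:PREP_UP (Everest, Hill)
ACT:LEFT:PREP_WITH (Cousin)
Other argkeys shown: PASS:LEFT:NSUBJPASS; PASS:RIGHT:PREP_BY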