dependency parsing prashanth mannem prashanth@research.iiit.ac.in

Post on 02-Jan-2016

238 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Dependency Parsing

Prashanth Mannemprashanth@research.iiit.ac.in

Outline

April 20, 2023Dependency Parsing (P. Mannem)2

Introduction Phrase Structure Grammar Dependency Grammar Comparison

Dependency Parsing Formal definition

Parsing Algorithms Introduction Dynamic programming Constraint satisfaction Deterministic search

Introduction

April 20, 2023Dependency Parsing (P. Mannem)3

The syntactic parsing of a sentence consists of finding the correct syntactic structure of that sentence in a given formalism/grammar.

Dependency Grammar (DG) and Phrase Structure Grammar (PSG) are two such formalisms.

Phrase Structure Grammar (PSG)

April 20, 2023Dependency Parsing (P. Mannem)4

Breaks sentence into constituents (phrases) Which are then broken into smaller constituents

Describes phrase structure, clause structure

E.g.. NP, PP, VP etc..

Structures often recursive The clever tall blue-eyed old … man

Draw the phrase structure tree

Red figures on the screens indicated falling stocks

April 20, 2023Dependency Parsing (P. Mannem)5

Phrase Structure Tree

April 20, 2023Dependency Parsing (P. Mannem)6

Dependency Grammar

April 20, 2023Dependency Parsing (P. Mannem)7

Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies

Interested in grammatical relations between individual words (governing & dependent words)

Does not propose a recursive structure Rather a network of relations

These relations can also have labels

Draw the dependency tree

Red figures on the screens indicated falling stocks

April 20, 2023Dependency Parsing (P. Mannem)8

Dependency Tree

April 20, 2023Dependency Parsing (P. Mannem)9

Dependency Tree Example

April 20, 2023Dependency Parsing (P. Mannem)10

Phrasal nodes are missing in the dependency structure when compared to constituency structure.

Dependency Tree with Labels

April 20, 2023Dependency Parsing (P. Mannem)11

Comparison

April 20, 2023Dependency Parsing (P. Mannem)12

Dependency structures explicitly represent Head-dependent relations (directed arcs) Functional categories (arc labels) Possibly some structural categories (parts-of-

speech)

Phrase structure explicitly represent Phrases (non-terminal nodes) Structural categories (non-terminal labels) Possibly some functional categories (grammatical

functions)

Learning DG over PSG

April 20, 2023Dependency Parsing (P. Mannem)13

Dependency Parsing is more straightforward Parsing can be reduced to labeling each token w i with

wj

Direct encoding of predicate-argument structure Fragments are directly interpretable

Dependency structure independent of word order Suitable for free word order languages (like Indian

languages)

Outline

April 20, 2023Dependency Parsing (P. Mannem)14

Introduction Phrase Structure Grammar Dependency Grammar Comparison

Dependency Parsing Formal definition

Parsing Algorithms Introduction Dynamic programming Constraint satisfaction Deterministic search

Dependency Tree

April 20, 2023Dependency Parsing (P. Mannem)15

Formal definition An input word sequence w1…wn

Dependency graph D = (W,E) where W is the set of nodes i.e. word tokens in the input seq. E is the set of unlabeled tree edges (wi, wj) (wi, wj є W).

(wi, wj) indicates an edge from wi (parent) to wj (child).

Task of mapping an input string to a dependency graph satisfying certain conditions is called dependency parsing

Well-formedness

April 20, 2023Dependency Parsing (P. Mannem)16

A dependency graph is well-formed iff

Single head: Each word has only one head.

Acyclic: The graph should be acyclic.

Connected: The graph should be a single tree with all the words in the sentence.

Projective: If word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B).

Non-projective dependency tree

April 20, 2023Dependency Parsing (P. Mannem)17

Ram saw a dog yesterday which was a Yorkshire Terrier

* Crossing lines

English has very few non-projective cases.

Outline

April 20, 2023Dependency Parsing (P. Mannem)18

Introduction Phrase Structure Grammar Dependency Grammar Comparison and Conversion

Dependency Parsing Formal definition

Parsing Algorithms Introduction Dynamic programming Constraint satisfaction Deterministic search

Dependency Parsing

April 20, 2023Dependency Parsing (P. Mannem)19

Dependency based parsers can be broadly categorized into Grammar driven approaches

Parsing done using grammars. Data driven approaches

Parsing by training on annotated/un-annotated data.

Dependency Parsing

April 20, 2023Dependency Parsing (P. Mannem)20

Dependency based parsers can be broadly categorized into Grammar driven approaches

Parsing done using grammars. Data driven approaches

Parsing by training on annotated/un-annotated data.

These approaches are not mutually exclusive

Covington’s Incremental Algorithm

April 20, 2023Dependency Parsing (P. Mannem)21

Incremental parsing in O(n2) time by trying to link each new word to each preceding one [Covington 2001]:

PARSE(x = (w1, . . . ,wn))1. for i = 1 up to n2. for j = i − 1 down to 13. LINK(wi , wj )

Covington’s Incremental Algorithm

April 20, 2023Dependency Parsing (P. Mannem)22

Incremental parsing in O(n2) time by trying to link each new word to each preceding one [Covington 2001]:

PARSE(x = (w1, . . . ,wn))1. for i = 1 up to n2. for j = i − 1 down to 13. LINK(wi , wj )

Different conditions, such as Single-Head and Projectivity, can be incorporated into the LINK operation.

Parsing Methods

April 20, 2023Dependency Parsing (P. Mannem)23

Three main traditions Dynamic programming

CYK, Eisner, McDonald

Constraint satisfaction Maruyama, Foth et al., Duchier

Deterministic search Covington, Yamada and Matsumuto, Nivre

Dynamic Programming

Basic Idea: Treat dependencies as constituents. Use, e.g. , CYK parser (with minor modifications)

April 20, 2023Dependency Parsing (P. Mannem) 24

Dependency Chart Parsing

April 20, 2023Dependency Parsing (P. Mannem)25

Grammar is regarded as context-free, in which each node is lexicalized

Chart entries are subtrees, i.e., words with all their left and right dependents

Problem: Different entries for different subtrees spanning a sequence of words with different heads

Time requirement: O(n5)

Generic Chart Parsing

April 20, 2023Dependency Parsing (P. Mannem)26

for each of the O(n2) substrings,for each of O(n) ways of splitting it,

for each of S analyses of first half

for each of S analyses of second half,for each of c ways of combining them: combine, & add result to chart if best

[cap spending] + [at $300 million] = [[cap spending] [at $300 million]]

S analyses S analyses cS2 analysesof which we keep S

O(n3 S2 c)

Slide from [Eisner, 1997]

Headed constituents ...

April 20, 2023Dependency Parsing (P. Mannem)27

... have too many signatures.

How bad is (n3S2c)?

For unheaded constituents, S is constant: NP, VP ...

(similarly for dotted trees). So (n3).

But when different heads different signatures,the average substring has (n) possible heads

and S=(n) possible signatures. So (n5).

Slide from [Eisner, 1997]

Dynamic Programming Approaches

April 20, 2023Dependency Parsing (P. Mannem)28

Original version [Hays 1964] (grammar driven) Link grammar [Sleator and Temperley 1991]

(grammar driven) Bilexical grammar [Eisner 1996] (data driven) Maximum spanning tree [McDonald 2006] (data

driven)

Eisner 1996

April 20, 2023Dependency Parsing (P. Mannem)29

Two novel aspects: Modified parsing algorithm Probabilistic dependency parsing

Time requirement: O(n3) Modification: Instead of storing subtrees, store

spans Span: Substring such that no interior word links

to any word outside the span. Idea: In a span, only the boundary words are

active, i.e. still need a head or a child One or both of the boundary words can be

active

Example

April 20, 2023Dependency Parsing (P. Mannem)30

Red figures on the screen indicated falling stocks_ROOT_

Example

April 20, 2023Dependency Parsing (P. Mannem)31

Spans:

Red figures on the screen indicated falling stocks_ROOT_

indicated falling stocksRed figures

Assembly of correct parse

April 20, 2023Dependency Parsing (P. Mannem)32

Red figures on the screen indicated falling stocks_ROOT_

Red figures

Start by combining adjacent words to minimal spans

figures on on the

Assembly of correct parse

April 20, 2023Dependency Parsing (P. Mannem)33

Red figures on the screen indicated falling stocks_ROOT_

Combine spans which overlap in one word; this word must be governed by a word in the left or right span.

on the the screen+ → on the screen

Assembly of correct parse

April 20, 2023Dependency Parsing (P. Mannem)34

Red figures on the screen indicated falling stocks_ROOT_

Combine spans which overlap in one word; this word must be governed by a word in the left or right span.

figures on + →on the screen figures on the screen

Assembly of correct parse

April 20, 2023Dependency Parsing (P. Mannem)35

Red figures on the screen indicated falling stocks_ROOT_

Combine spans which overlap in one word; this word must be governed by a word in the left or right span.

Invalid span

Red figures on the screen

Assembly of correct parse

April 20, 2023Dependency Parsing (P. Mannem)36

Red figures on the screen indicated falling stocks_ROOT_

Combine spans which overlap in one word; this word must be governed by a word in the left or right span.

+ → indicated falling stocksfalling stocksindicated falling

Eisner’s Model

April 20, 2023Dependency Parsing (P. Mannem)37

Recursive Generation Each word generates its actual dependents Two Markov chains:

Left dependents Right dependents

Eisner’s Model

April 20, 2023Dependency Parsing (P. Mannem)38

where tw(i) is ith tagged word

lc(i) & rc(i) are the left and right children of ith word

wherelcj(i) is the jth left child of the ith wordt(lcj-1(i)) is the tag of the preceding left child

McDonald’s Maximum Spanning Trees

April 20, 2023Dependency Parsing (P. Mannem)39

Score of a dependency tree = sum of scores of dependencies

Scores are independent of other dependencies If scores are available, parsing can be

formulated as maximum spanning tree problem Two cases:

Projective: Use Eisner’s parsing algorithm. Non-projective: Use Chu-Liu-Edmonds algorithm [Chu

and Liu 1965, Edmonds 1967] Uses online learning for determining weight

vector w

Parsing Methods

April 20, 2023Dependency Parsing (P. Mannem)40

Three main traditions Dynamic programming

CYK, Eisner, McDonald

Constraint satisfaction Maruyama, Foth et al., Duchier

Deterministic parsing Covington, Yamada and Matsumuto, Nivre

Constraint Satisfaction

April 20, 2023Dependency Parsing (P. Mannem)41

Uses Constraint Dependency Grammar Grammar consists of a set of boolean

constraints, i.e. logical formulas that describe well-formed trees

A constraint is a logical formula with variables that range over a set of predefined values

Parsing is defined as a constraint satisfaction problem

Constraint satisfaction removes values that contradict constraints

Constraint Satisfaction

April 20, 2023Dependency Parsing (P. Mannem)42

Parsing is an eliminative process rather than a constructive one such as in CFG parsing

Constraint satisfaction in general is NP complete

Parser design must ensure practical efficiency Different approaches

Constraint propagation techniques which ensure local consistency [Maruyama 1990]

Weighted CDG [Foth et al. 2000, Menzel and Schroder 1998]

Maruyama’s Constraint Propagation

April 20, 2023Dependency Parsing (P. Mannem)43

Three steps:1) Form initial constraint network using a “core”

grammar2) Remove local inconsistencies3) If ambiguity remains, add new constraints and

repeat step 2

Weighted Constraint Parsing

April 20, 2023Dependency Parsing (P. Mannem)44

Robust parser, which uses soft constraints Each constraint is assigned a weight between

0.0 and 1.0 Weight 0.0: hard constraint, can only be

violated when no other parse is possible Constraints assigned manually (or estimated

from treebank) Efficiency: uses a heuristic transformation-

based constraint resolution method

Transformation-Based Constraint Resolution

April 20, 2023Dependency Parsing (P. Mannem)45

Heuristic search Very efficient Idea: first construct arbitrary dependency

structure, then try to correct errors Error correction by transformations Selection of transformations based on

constraints that cause conflicts Anytime property: parser maintains a

complete analysis at anytime → can be stopped at any time and return a complete analysis

Parsing Methods

April 20, 2023Dependency Parsing (P. Mannem)46

Three main traditions Dynamic programming

CYK, Eisner, McDonald

Constraint satisfaction Maruyama, Foth et al., Duchier

Deterministic parsing Covington, Yamada and Matsumuto, Nivre

Deterministic Parsing

April 20, 2023Dependency Parsing (P. Mannem)47

Basic idea: Derive a single syntactic representation

(dependency graph) through a deterministic sequence of elementary parsing actions

Sometimes combined with backtracking or repair

Motivation: Psycholinguistic modeling Efficiency Simplicity

Shift-Reduce Type Algorithms

April 20, 2023Dependency Parsing (P. Mannem)48

Data structures: Stack [. . . ,wi ]S of partially processed tokens Queue [wj , . . .]Q of remaining input tokens

Parsing actions built from atomic actions: Adding arcs (wi → wj , wi ← wj ) Stack and queue operations

Left-to-right parsing Restricted to projective dependency graphs

Yamada’s Algorithm

April 20, 2023Dependency Parsing (P. Mannem)49

Three parsing actions: Shift [. . .]S [wi , . . .]Q

[. . . , wi ]S [. . .]Q Left [. . . , wi , wj ]S [. . .]Q

[. . . , wi ]S [. . .]Q wi → wj

Right [. . . , wi , wj ]S [. . .]Q

[. . . , wj ]S [. . .]Q wi ← wj

Multiple passes over the input give time complexity O(n2)

Yamada and Matsumoto

April 20, 2023Dependency Parsing (P. Mannem)50

Parsing in several rounds: deterministic bottom-up O(n2)

Looks at pairs of words 3 actions: shift, left, right

Shift: shifts focus to next word pair

Yamada and Matsumoto

April 20, 2023Dependency Parsing (P. Mannem)51

Right: decides that the right word depends on the left word

Left: decides that the left word depends on the right one

Parsing Algorithm

April 20, 2023Dependency Parsing (P. Mannem)52

Go through each pair of words Decide which action to take

If a relation was detected in a pass, do another pass

E.g. the little girl First pass: relation between little and girl Second pass: relation between the and girl

Decision on action depends on word pair and context

Nivre’s Algorithm

April 20, 2023Dependency Parsing (P. Mannem)53

Four parsing actions:Shift [. . .]S [wi , . . .]Q

[. . . , wi ]S [. . .]Q

Reduce [. . . , wi ]S [. . .]Q Эwk : wk → wi

[. . .]S [. . .]QLeft-Arc [. . . , wi ]S [wj , . . .]Q ¬Эwk : wk → wi

[. . .]S [wj , . . .]Q wi ← wj

Right-Arc [. . . ,wi ]S [wj , . . .]Q ¬Эwk : wk → wj

[. . . , wi , wj ]S [. . .]Q wi → wj

Nivre’s Algorithm

April 20, 2023Dependency Parsing (P. Mannem)54

Characteristics: Arc-eager processing of right-dependents Single pass over the input gives time worst case

complexity O(2n)

Example

April 20, 2023Dependency Parsing (P. Mannem)55

Red figures on the screen indicated falling stocks_ROOT_S Q

Example

April 20, 2023Dependency Parsing (P. Mannem)56

Red figures on the screen indicated falling stocks_ROOT_S Q

Shift

Example

April 20, 2023Dependency Parsing (P. Mannem)57

Red

figures on the screen indicated falling stocks_ROOT_S Q

Left-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)58

Red

figures on the screen indicated falling stocks_ROOT_S Q

Shift

Example

April 20, 2023Dependency Parsing (P. Mannem)59

Red

figures on the screen indicated falling stocks_ROOT_S Q

Right-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)60

Red

figures on the screen indicated falling stocks_ROOT_S Q

Shift

Example

April 20, 2023Dependency Parsing (P. Mannem)61

Red

figures on the

screen indicated falling stocks_ROOT_S Q

Left-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)62

Red

figures on the

screen indicated falling stocks_ROOT_S Q

Right-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)63

Red

figures on the

screen indicated falling stocks_ROOT_S Q

Reduce

Example

April 20, 2023Dependency Parsing (P. Mannem)64

Red

figures on

the

screen indicated falling stocks_ROOT_S Q

Reduce

Example

April 20, 2023Dependency Parsing (P. Mannem)65

Red

figures

on

the

screen indicated falling stocks_ROOT_S Q

Left-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)66

Red

figures

on

the

screen indicated falling stocks_ROOT_S Q

Right-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)67

Red

figures

on

the

screen indicated falling stocks_ROOT_S Q

Shift

Example

April 20, 2023Dependency Parsing (P. Mannem)68

Red

figures

on

the

screen indicated falling

stocks_ROOT_S Q

Left-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)69

Red

figures

on

the

screen indicated falling

stocks_ROOT_S Q

Right-arc

Example

April 20, 2023Dependency Parsing (P. Mannem)70

Red

figures

on

the

screen indicated falling

stocks_ROOT_S Q

Reduce

Example

April 20, 2023Dependency Parsing (P. Mannem)71

Red

figures

on

the

screen indicated

falling

stocks_ROOT_S Q

Reduce

Classifier-Based Parsing

April 20, 2023Dependency Parsing (P. Mannem)72

Data-driven deterministic parsing: Deterministic parsing requires an oracle. An oracle can be approximated by a classifier. A classifier can be trained using treebank data.

Learning algorithms: Support vector machines (SVM) [Kudo and Matsumoto

2002, Yamada and Matsumoto 2003,Isozaki et al. 2004, Cheng et al. 2004, Nivre et al. 2006]

Memory-based learning (MBL) [Nivre et al. 2004, Nivre and Scholz 2004]

Maximum entropy modeling (MaxEnt) [Cheng et al. 2005]

Feature Models

April 20, 2023Dependency Parsing (P. Mannem)73

Learning problem: Approximate a function from parser states,

represented by feature vectors to parser actions, given a training set of gold standard derivations.

Typical features: Tokens and POS tags of :

Target words Linear context (neighbors in S and Q) Structural context (parents, children, siblings in G)

Can not be used in dynamic programming algorithms.

Dependency Parsers for download

MST parser by Ryan McDonald Malt parser by Joakim Nivre Stanford parser

April 20, 2023Dependency Parsing (P. Mannem)74

Summary

April 20, 2023Dependency Parsing (P. Mannem)75

Provided an intro to dependency parsing and various dependency parsing algorithms

Read up Nivre’s and McDonald’s tutorial on dependency parsing at ESSLLI’ 07

April 20, 2023Dependency Parsing (P. Mannem)76

top related