Universal Semantic Parsing

Siva Reddy†∗ Oscar Täckström‡ Slav Petrov‡ Mark Steedman†† Mirella Lapata††

†Stanford University ‡Google Inc. ††University of Edinburgh

[email protected], {oscart, slav}@google.com, {steedman, mlap}@inf.ed.ac.uk

Abstract

Universal Dependencies (UD) offer a uniform cross-lingual syntactic representation, with the aim of advancing multilingual applications. Recent work shows that semantic parsing can be accomplished by transforming syntactic dependencies to logical forms. However, this work is limited to English, and cannot process dependency graphs, which allow handling complex phenomena such as control. In this work, we introduce UDEPLAMBDA, a semantic interface for UD, which maps natural language to logical forms in an almost language-independent fashion and can process dependency graphs. We perform experiments on question answering against Freebase and provide German and Spanish translations of the WebQuestions and GraphQuestions datasets to facilitate multilingual evaluation. Results show that UDEPLAMBDA outperforms strong baselines across languages and datasets. For English, it achieves a 4.9 F1 point improvement over the state of the art on GraphQuestions.

1 Introduction

The Universal Dependencies (UD) initiative seeks to develop cross-linguistically consistent annotation guidelines as well as a large number of uniformly annotated treebanks for many languages (Nivre et al., 2016). Such resources could advance multilingual applications of parsing, improve comparability of evaluation results, enable cross-lingual learning, and more generally support natural language understanding.

∗Work done at the University of Edinburgh

Seeking to exploit the benefits of UD for natural language understanding, we introduce UDEPLAMBDA, a semantic interface for UD that maps natural language to logical forms, representing underlying predicate-argument structures, in an almost language-independent manner. Our framework is based on DEPLAMBDA (Reddy et al., 2016), a recently developed method that converts English Stanford Dependencies (SD) to logical forms. The conversion process is illustrated in Figure 1 and discussed in more detail in Section 2. Whereas DEPLAMBDA works only for English, UDEPLAMBDA applies to any language for which UD annotations are available.1 Moreover, DEPLAMBDA can only process tree-structured inputs, whereas UDEPLAMBDA can also process dependency graphs, which make it possible to handle complex constructions such as control. The different treatments of various linguistic constructions in UD compared to SD also require different handling in UDEPLAMBDA (Section 3.3).

Our experiments focus on Freebase semantic parsing as a testbed for evaluating the framework's multilingual appeal. We convert natural language to logical forms which in turn are converted to machine-interpretable formal meaning representations for retrieving answers to questions from Freebase. To facilitate multilingual evaluation, we provide translations of the English WebQuestions (Berant et al., 2013) and GraphQuestions (Su et al., 2016) datasets to German and Spanish. We demonstrate that UDEPLAMBDA can be used to derive logical forms for these languages using a minimal amount of language-specific knowledge. Aside from developing the first multilingual semantic parsing tool for Freebase, we also experimentally show that UDEPLAMBDA outperforms strong baselines across

1 As of v1.3, UD annotations are available for 47 languages at http://universaldependencies.org.

languages and datasets. For English, it achieves the strongest result to date on GraphQuestions, with competitive results on WebQuestions. Our implementation and translated datasets are publicly available at https://github.com/sivareddyg/udeplambda.

2 DEPLAMBDA

Before describing UDEPLAMBDA, we provide an overview of DEPLAMBDA (Reddy et al., 2016), on which our approach is based. DEPLAMBDA converts a dependency tree to its logical form in three steps: binarization, substitution, and composition, each of which is briefly outlined below. Algorithm 1 describes the steps of DEPLAMBDA in lines 4-6, whereas lines 2 and 3 are specific to UDEPLAMBDA.

Binarization A dependency tree is first mapped to a Lisp-style s-expression indicating the order of semantic composition. Figure 1(b) shows the s-expression for the sentence Disney won an Oscar for the movie Frozen, derived from the dependency tree in Figure 1(a). Here, the sub-expression (dobj won (det Oscar an)) indicates that the logical form of the phrase won an Oscar is derived by composing the logical form of the label dobj with the logical form of the word won and the logical form of the phrase an Oscar, derived analogously. The s-expression can also be interpreted as a binarized tree with the dependency label as the root node, and the left and right expressions as subtrees.

A composition hierarchy is employed to impose a strict traversal ordering on the modifiers to each head in the dependency tree. As an example, won has three modifiers in Figure 1(a), which according to the composition hierarchy are composed in the order dobj > nmod > nsubj. In constructions like coordination, this ordering is crucial to arrive at the correct semantics. Lines 7-17 in Algorithm 1 describe the binarization step.
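To make the binarization step concrete, the following sketch builds the s-expression of Figure 1(b) in Python. It is a hypothetical illustration, not the authors' implementation: the composition hierarchy below is a toy fragment covering only the labels in the example, and the tree encoding is invented for the sketch.

    # Toy composition hierarchy; the real hierarchy orders all UD labels.
    HIERARCHY = ["dobj", "nmod", "nsubj", "compound", "det", "case"]

    def binarize(tree):
        """tree = (head_word, [(label, subtree), ...]); returns a Lisp-style s-expression."""
        head, children = tree
        expr = head
        # Modifiers earlier in HIERARCHY are composed first and end up innermost.
        for label, child in sorted(children, key=lambda c: HIERARCHY.index(c[0])):
            expr = "({} {} {})".format(label, expr, binarize(child))
        return expr

    oscar = ("Oscar", [("det", ("an", []))])
    frozen = ("Frozen", [("compound", ("movie", [])), ("det", ("the", [])), ("case", ("for", []))])
    won = ("won", [("nsubj", ("Disney", [])), ("dobj", oscar), ("nmod", frozen)])
    print(binarize(won))
    # (nsubj (nmod (dobj won (det Oscar an)) (case (det (compound Frozen movie) the) for)) Disney)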

Substitution Each symbol in the s-expression is substituted with a lambda expression encoding its semantics. Words and dependency labels are assigned different types of expressions. In general, words have expressions of the following kind:

ENTITY ⇒ λx.word(x_a); e.g. Oscar ⇒ λx.Oscar(x_a)
EVENT ⇒ λx.word(x_e); e.g. won ⇒ λx.won(x_e)
FUNCTIONAL ⇒ λx.TRUE; e.g. an ⇒ λx.TRUE

Here, the subscripts ·_a and ·_e denote the types of individuals (Ind) and events (Event), respectively, whereas x denotes a paired variable (x_a, x_e) of type Ind × Event.

Figure 1: The mapping of a dependency tree to its logical form with the intermediate s-expression.

(a) The dependency tree for Disney won an Oscar for the movie Frozen in the Universal Dependencies formalism (part-of-speech tags: propn verb det propn adp det noun propn; arcs: nsubj, dobj, nmod, det, case, compound, rooted at won).

(b) The binarized s-expression for the dependency tree:
(nsubj (nmod (dobj won (det Oscar an)) (case (det (compound Frozen movie) the) for)) Disney)

(c) The composed lambda-calculus expression:
λx.∃yzw. won(x_e) ∧ Disney(y_a) ∧ Oscar(z_a) ∧ Frozen(w_a) ∧ movie(w_a) ∧ arg1(x_e, y_a) ∧ arg2(x_e, z_a) ∧ nmod.for(x_e, w_a)

Roughly speaking, proper nouns and adjectives invoke ENTITY expressions, verbs and adverbs invoke EVENT expressions, and common nouns invoke both ENTITY and EVENT expressions (see Section 3.3), while remaining words invoke FUNCTIONAL expressions. DEPLAMBDA enforces the constraint that every s-expression is of the type η = Ind × Event → Bool, which simplifies the type system considerably.
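This unlexicalized assignment can be pictured as a small lookup on part-of-speech tags. The sketch below is a deliberate simplification of the actual Tregex-based rules; the tag sets shown are illustrative.

    # Hypothetical simplification: map a UD part-of-speech tag to expression kinds.
    def expression_kinds(upos):
        if upos in {"PROPN", "ADJ"}:
            return ["ENTITY"]            # λx.word(x_a)
        if upos in {"VERB", "ADV"}:
            return ["EVENT"]             # λx.word(x_e)
        if upos == "NOUN":
            return ["ENTITY", "EVENT"]   # common nouns evoke both (Section 3.3)
        return ["FUNCTIONAL"]            # λx.TRUE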

Expressions for dependency labels glue the semantics of heads and modifiers to articulate predicate-argument structure. These expressions in general take one of the following forms:

COPY ⇒ λ f g x.∃y. f(x) ∧ g(y) ∧ rel(x, y); e.g. nsubj, dobj, nmod, advmod
INVERT ⇒ λ f g x.∃y. f(x) ∧ g(y) ∧ rel_i(y, x); e.g. amod, acl
MERGE ⇒ λ f g x. f(x) ∧ g(x); e.g. compound, appos, amod, acl
HEAD ⇒ λ f g x. f(x); e.g. case, punct, aux, mark.

As an example of COPY, consider the lambda expression for dobj in (dobj won (det Oscar an)): λ f g x.∃y. f(x) ∧ g(y) ∧ arg2(x_e, y_a). This expression takes two functions f and g as input, where f represents the logical form of won and g represents the logical form of an Oscar. The predicate-argument structure arg2(x_e, y_a) indicates that the arg2 of the event x_e, i.e. won, is the individual y_a, i.e. the entity Oscar. Since arg2(x_e, y_a) mimics the dependency structure dobj(won, Oscar), we refer to the expression kind evoked by dobj as COPY.

Expressions that invert the dependency direction are referred to as INVERT (e.g. amod in running horse); expressions that merge two subexpressions without introducing any relation predicates are referred to as MERGE (e.g. compound in movie Frozen); and expressions that simply return the parent expression semantics are referred to as HEAD (e.g. case in for Frozen). While this generalization applies to most dependency labels, several labels take a different logical form not listed here, some of which are discussed in Section 3.3. Sometimes the mapping of a dependency label to a lambda expression may depend on surrounding part-of-speech tags or dependency labels. For example, amod acts as INVERT when the modifier is a verb (e.g. in running horse), and as MERGE when the modifier is an adjective (e.g. in beautiful horse).2 Lines 26-32 in Algorithm 1 describe the substitution procedure.

Composition The final logical form is computed by beta-reduction, treating expressions of the form (f x y) as the function f applied to the arguments x and y. For example, (dobj won (det Oscar an)) results in λx.∃z. won(x_e) ∧ Oscar(z_a) ∧ arg2(x_e, z_a) when the expression for dobj is applied to those for won and (det Oscar an). Figure 1(c) shows the logical form for the s-expression in Figure 1(b). The binarized s-expression is recursively converted to a logical form as described in lines 18-25 in Algorithm 1.
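The interplay of substitution and composition can be sketched in a few lines of Python by encoding logical forms as closures that accumulate predicate strings and treating beta-reduction as ordinary function application. This is a hypothetical illustration (the actual system uses Cornell SPF and Tregex rules); it reproduces the (dobj won (det Oscar an)) example above.

    import itertools

    fresh = ("y{}".format(i) for i in itertools.count())    # supply of fresh variables

    def entity(word):  return lambda x: ["{}({}_a)".format(word, x)]   # ENTITY: λx.word(x_a)
    def event(word):   return lambda x: ["{}({}_e)".format(word, x)]   # EVENT:  λx.word(x_e)
    def functional():  return lambda x: []                             # FUNCTIONAL: λx.TRUE

    def copy(rel):
        # COPY: λ f g x.∃y. f(x) ∧ g(y) ∧ rel(x_e, y_a)
        def apply(f, g):
            def lf(x):
                y = next(fresh)
                return f(x) + g(y) + ["{}({}_e,{}_a)".format(rel, x, y)]
            return lf
        return apply

    def head():
        # HEAD: λ f g x. f(x)
        return lambda f, g: (lambda x: f(x))

    # (dobj won (det Oscar an))
    dobj, det = copy("arg2"), head()
    lf = dobj(event("won"), det(entity("Oscar"), functional()))
    print(" ∧ ".join(lf("x")))   # won(x_e) ∧ Oscar(y0_a) ∧ arg2(x_e,y0_a)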

3 UDEPLAMBDA

We now introduce UDEPLAMBDA, a semantic interface for Universal Dependencies.3 Whereas DEPLAMBDA only applies to English Stanford Dependencies, UDEPLAMBDA takes advantage of the cross-lingual nature of UD to facilitate an (almost) language-independent semantic interface. This is accomplished by restricting the binarization, substitution, and composition steps described above to rely solely on information encoded in the UD representation. As shown in Algorithm 1, lines 4-6 are common to both DEPLAMBDA and UDEPLAMBDA, whereas lines 2 and 3 apply only to UDEPLAMBDA. Importantly, UDEPLAMBDA is designed to not rely on lexical forms in a language to assign lambda expressions, but only on information contained in dependency labels and part-of-speech tags.

2 We use Tregex (Levy and Andrew, 2006) for substitution mappings and Cornell SPF (Artzi, 2013) as the lambda-calculus implementation. For example, in running horse, the tregex /label:amod/=target < /postag:verb/ matches amod to its INVERT expression λ f g x.∃y. f(x) ∧ g(y) ∧ amod_i(y_e, x_a).

3 In what follows, all references to UD are to UD v1.3.

Algorithm 1: UDEPLAMBDA Steps

1  Function UDepLambda(depTree):
2      depGraph = Enhancement(depTree)           # see Figure 2(a) for a depGraph
3      bindedTree = SplitLongDistance(depGraph)  # see Figure 2(b) for a bindedTree
4      binarizedTree = Binarization(bindedTree)  # see Figure 1(b) for a binarizedTree
5      logicalForm = Composition(binarizedTree)
6      return logicalForm

7  Function Binarization(tree):
8      parent = GetRootNode(tree)
9      (label1, child1), (label2, child2), ... = GetChildNodes(parent)
10     sortedChildren = SortUsingLabelHierarchy((label1, child1), (label2, child2), ...)
11     binarizedTree.root = parent
12     for label, child in sortedChildren:
13         temp.root = label
14         temp.left = binarizedTree
15         temp.right = Binarization(child)
16         binarizedTree = temp
17     return binarizedTree

18 Function Composition(binarizedTree):
19     mainLF = Substitution(binarizedTree.root)
20     if binarizedTree has left and right children:
21         leftLF = Composition(binarizedTree.left)
22         rightLF = Composition(binarizedTree.right)
23         mainLF = BetaReduce(mainLF, leftLF)
24         mainLF = BetaReduce(mainLF, rightLF)
25     return mainLF

26 Function Substitution(node):
27     logicalForms = []
28     for tregexRule, template in substitutionRules:
29         if tregexRule.match(node):
30             lf = GenLambdaExp(node, template)
31             logicalForms.add(lf)
32     return logicalForms


However, some linguistic phenomena are language specific (e.g. pronoun-dropping) or lexicalized (e.g. every and the in English have different semantics, despite both being determiners) and are not encoded in the UD schema. Furthermore, some cross-linguistic phenomena, such as long-distance dependencies, are not part of the core UD representation. To circumvent this limitation, a simple enhancement step enriches the original UD representation before binarization takes place (Section 3.1). This step adds missing syntactic information and long-distance dependencies to the dependency tree, thereby creating a graph. Whereas DEPLAMBDA is not able to handle graph-structured input, UDEPLAMBDA is designed to work with dependency graphs as well (Section 3.2). Finally, several constructions differ in structure between UD and SD, which requires different handling in the semantic interface (Section 3.3).

3.1 Enhancement

Both Schuster and Manning (2016) and Nivre et al. (2016) note the necessity of an enhanced UD representation to enable semantic applications. However, such enhancements are currently only available for a subset of languages in UD. Instead, we rely on a small number of enhancements for our main application, semantic parsing for question answering, with the hope that this step can be replaced by an enhanced UD representation in the future. Specifically, we define three kinds of enhancements: (1) long-distance dependencies; (2) types of coordination; and (3) refined question word tags. These correspond to line 2 in Algorithm 1.

First, we identify long-distance dependencies in relative clauses and control constructions. We follow Schuster and Manning (2016) and find these using the labels acl (relative) and xcomp (control). Figure 2(a) shows the long-distance dependency in the sentence Anna wants to marry Kristoff. Here, marry is provided with its missing nsubj (dashed arc). Second, UD conflates all coordinating constructions to a single dependency label, conj. To obtain the correct coordination scope, we refine conj to conj:verb, conj:vp, conj:sentence, conj:np, and conj:adj, similar to Reddy et al. (2016). Finally, unlike the PTB tags (Marcus et al., 1993) used by SD, the UD part-of-speech tags do not distinguish question words. Since these are crucial to question answering, we use a small lexicon to refine the tags for determiners (DET), adverbs (ADV) and pronouns (PRON) to DET:WH, ADV:WH and PRON:WH, respectively. Specifically, we use a list of 12 (English), 14 (Spanish) and 35 (German) words, respectively. This is the only part of UDEPLAMBDA that relies on language-specific information. We hope that, as the coverage of morphological features in UD improves, this refinement can be replaced by relying on morphological features, such as the interrogative feature (INT).
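The tag-refinement enhancement amounts to a lexicon lookup. The sketch below is hypothetical: the wh-word lists shown are short illustrative samples, not the actual 12/14/35-entry lexicons used in the system.

    # Hypothetical sketch: refine DET/ADV/PRON tags of question words to *:WH.
    WH_LEXICON = {
        "en": {"what", "which", "who", "whom", "whose", "where", "when", "why", "how"},
        "de": {"was", "welche", "welcher", "welches", "wer", "wen", "wem", "wo", "wann", "wie", "warum"},
        "es": {"qué", "cuál", "cuáles", "quién", "quiénes", "dónde", "cuándo", "cómo", "cuántos"},
    }

    def refine_question_tags(tokens, lang):
        """tokens: list of (word, upos) pairs; returns pairs with refined tags."""
        refined = []
        for word, tag in tokens:
            if tag in {"DET", "ADV", "PRON"} and word.lower() in WH_LEXICON[lang]:
                tag += ":WH"
            refined.append((word, tag))
        return refined

    print(refine_question_tags([("What", "DET"), ("language", "NOUN")], "en"))
    # [('What', 'DET:WH'), ('language', 'NOUN')]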

3.2 Graph Structures and BIND

To handle graph structures that may result from the enhancement step, such as those in Figure 2(a), we propose a variable-binding mechanism that differs from that of DEPLAMBDA.

Figure 2: The original and enhanced dependency trees for Anna wants to marry Kristoff. (a) With the long-distance dependency: marry receives its missing nsubj. (b) With variable binding: a placeholder Ω fills the nsubj of marry and is bound to Anna via the pseudo-label BIND.

This is indicated in line 3 of Algorithm 1. First, each long-distance dependency is split into independent arcs as shown in Figure 2(b). Here, Ω is a placeholder for the subject of marry, which in turn corresponds to Anna as indicated by the binding of Ω via the pseudo-label BIND. We treat BIND like an ordinary dependency label with semantics MERGE and process the resulting tree as usual, via the s-expression:

(nsubj (xcomp wants (nsubj (mark (dobj marry Kristoff) to) Ω)) (BIND Anna Ω)),

with the lambda-expression substitutions:

wants, marry ∈ EVENT; to ∈ FUNCTIONAL;
Anna, Kristoff ∈ ENTITY;
mark ∈ HEAD; BIND ∈ MERGE;
xcomp = λ f g x.∃y. f(x) ∧ g(y) ∧ xcomp(x_e, y_e).

These substitutions are based solely on unlexicalized context. For example, the part-of-speech tag PROPN of Anna invokes an ENTITY expression.

The placeholder Ω has semantics λx.EQ(x, ω), where EQ(u, ω) is true iff u and ω are equal (have the same denotation), which unifies the subject variable of wants with the subject variable of marry.

After substitution and composition, we get:

λz.∃xywv. wants(z_e) ∧ Anna(x_a) ∧ arg1(z_e, x_a) ∧ EQ(x, ω) ∧ marry(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, v_a) ∧ EQ(v, ω) ∧ Kristoff(w_a) ∧ arg2(y_e, w_a).

This expression may be simplified further by replacing all occurrences of v with x and removing the unification predicates EQ, which results in:

λz.∃xyw. wants(z_e) ∧ Anna(x_a) ∧ arg1(z_e, x_a) ∧ marry(y_e) ∧ xcomp(z_e, y_e) ∧ arg1(y_e, x_a) ∧ Kristoff(w_a) ∧ arg2(y_e, w_a).

This expression encodes the fact that Anna is the arg1 of the marry event, as desired. DEPLAMBDA, in contrast, cannot handle graph-structured input, since it lacks a principled way of generating s-expressions from graphs. Even given the above s-expression, BIND in DEPLAMBDA is defined in a way such that the composition fails to unify v and x, which is crucial for the correct semantics. Moreover, the definition of BIND in DEPLAMBDA does not have a formal interpretation within the lambda calculus, unlike ours.
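The post-composition simplification (unify variables linked through the same ω and drop the EQ conjuncts) is mechanical. The following is a hypothetical sketch over a simple conjunct encoding invented for illustration.

    # Sketch: unify variables linked through EQ(·, ω) and drop the EQ conjuncts.
    # A logical form is encoded as a list of (predicate, argument-tuple) conjuncts,
    # with arguments written as "v_a" / "v_e"; this encoding is made up for the sketch.
    def simplify(conjuncts):
        canonical, mapping = {}, {}
        for pred, args in conjuncts:
            if pred == "EQ":
                var, omega = args
                canonical.setdefault(omega, var)      # first variable seen becomes canonical
                mapping[var] = canonical[omega]
        def rename(arg):
            base, _, sort = arg.partition("_")
            new = mapping.get(base, base)
            return new + ("_" + sort if sort else "")
        return [(pred, tuple(rename(a) for a in args))
                for pred, args in conjuncts if pred != "EQ"]

    lf = [("wants", ("z_e",)), ("Anna", ("x_a",)), ("arg1", ("z_e", "x_a")), ("EQ", ("x", "ω")),
          ("marry", ("y_e",)), ("xcomp", ("z_e", "y_e")), ("arg1", ("y_e", "v_a")), ("EQ", ("v", "ω")),
          ("Kristoff", ("w_a",)), ("arg2", ("y_e", "w_a"))]
    print(simplify(lf))   # arg1(y_e, v_a) becomes arg1(y_e, x_a); the EQ conjuncts disappear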

3.3 Linguistic Constructions

Below, we highlight the most pertinent differences between UDEPLAMBDA and DEPLAMBDA, stemming from the different treatment of various linguistic constructions in UD versus SD.

Prepositional Phrases UD uses a content-head analysis, in contrast to SD, which treats function words as heads of prepositional phrases. Accordingly, the s-expression for the phrase president in 2009 is (nmod president (case 2009 in)) in UDEPLAMBDA and (prep president (pobj in 2009)) in DEPLAMBDA. To achieve the desired semantics,

λx.∃y. president(x_a) ∧ president_event(x_e) ∧ arg1(x_e, x_a) ∧ 2009(y_a) ∧ prep.in(x_e, y_a),

DEPLAMBDA relies on an intermediate logical form that requires some post-processing, whereas UDEPLAMBDA obtains the desired logical form directly through the following entries:

in ∈ FUNCTIONAL; 2009 ∈ ENTITY; case ∈ HEAD;
president = λx. president(x_a) ∧ president_event(x_e) ∧ arg1(x_e, x_a);
nmod = λ f g x.∃y. f(x) ∧ g(y) ∧ nmod.in(x_e, y_a).

Other nmod constructions, such as possessives (nmod:poss), temporal modifiers (nmod:tmod) and adverbial modifiers (nmod:npmod), are handled similarly. Note how the common noun president evokes both entity and event predicates above.
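The specialization of nmod to nmod.in above suggests that the relation name is read off the preposition attached via case. The paper does not spell out this mechanism, so the sketch below is only one plausible reading, with a made-up tree encoding.

    def nmod_relation(modifier_node):
        """modifier_node = (word, [(label, (word, children)), ...]).
        Returns the relation name contributed by a case-marked nmod modifier."""
        _, children = modifier_node
        markers = [child[0] for label, child in children if label == "case"]
        return "nmod." + markers[0].lower() if markers else "nmod"

    print(nmod_relation(("2009", [("case", ("in", []))])))   # nmod.in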

Passives DEPLAMBDA gives special treatment to passive verbs, identified by the fine-grained part-of-speech tags in the PTB tag set together with dependency context. For example, An Oscar was won is analyzed as λx. won.pass(x_e) ∧ Oscar(y_a) ∧ arg1(x_e, y_a), where won.pass represents a passive event. However, UD does not distinguish between active and passive forms.4

4 UD encodes voice as a morphological feature, but most syntactic analyzers do not produce this information yet.

While the labels nsubjpass or auxpass indicate passive constructions, such clues are sometimes missing, such as in reduced relatives. We therefore opt to not have separate entries for passives, but aim to produce identical logical forms for active and passive forms when possible (for example, by treating nsubjpass as a direct object). With the following entries,

won ∈ EVENT; an, was ∈ FUNCTIONAL; auxpass ∈ HEAD;
nsubjpass = λ f g x.∃y. f(x) ∧ g(y) ∧ arg2(x_e, y_a),

the lambda expression for An Oscar was won becomes λx. won(x_e) ∧ Oscar(y_a) ∧ arg2(x_e, y_a), identical to that of its active form. However, not having a special entry for passive verbs may have undesirable side effects. For example, in the reduced-relative construction Pixar claimed the Oscar won for Frozen, the phrase the Oscar won ... will receive the semantics λx. Oscar(y_a) ∧ won(x_e) ∧ arg1(x_e, y_a), which differs from that of an Oscar was won. We leave it to the target application to disambiguate the interpretation in such cases.
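In code, the treatment of passives reduces to a label-to-predicate mapping in which nsubjpass shares arg2 with dobj. The sketch below is hypothetical (the actual rules are Tregex-based) and only illustrates why actives and passives end up with the same core predicates.

    # Hypothetical sketch: nsubjpass maps to arg2, like dobj.
    CORE_ARGS = {"nsubj": "arg1", "dobj": "arg2", "nsubjpass": "arg2"}

    def core_predicates(event_word, arguments):
        """arguments: list of (dependency_label, entity_word) pairs."""
        preds = ["{}(x_e)".format(event_word)]
        for i, (label, word) in enumerate(arguments):
            preds += ["{}(y{}_a)".format(word, i), "{}(x_e,y{}_a)".format(CORE_ARGS[label], i)]
        return preds

    print(core_predicates("won", [("dobj", "Oscar")]))        # active:  "... won an Oscar"
    print(core_predicates("won", [("nsubjpass", "Oscar")]))   # passive: "An Oscar was won"
    # both yield won(x_e), Oscar(y0_a), arg2(x_e,y0_a)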

Long-Distance Dependencies As discussed in Section 3.2, we handle long-distance dependencies evoked by clausal modifiers (acl) and control verbs (xcomp) with the BIND mechanism, whereas DEPLAMBDA cannot handle control constructions. For xcomp, as seen earlier, we use the mapping λ f g x.∃y. f(x) ∧ g(y) ∧ xcomp(x_e, y_e). For acl we use λ f g x.∃y. f(x) ∧ g(y), to conjoin the main clause and the modifier clause. However, not all acl clauses evoke long-distance dependencies, e.g. in the news that Disney won an Oscar, the clause that Disney won an Oscar is a subordinating conjunction of news. In such cases, we instead assign acl the INVERT semantics.

Questions Question words are marked with the enhanced part-of-speech tags DET:WH, ADV:WH and PRON:WH, which are all assigned the semantics λx.$word(x_a) ∧ TARGET(x_a). The predicate TARGET indicates that x_a represents the variable of interest, that is, the answer to the question.

3.4 Limitations

In order to achieve language independence, UDEPLAMBDA has to sacrifice semantic specificity, since in many cases the semantics is carried by lexical information. Consider the sentences John broke the window and The window broke. Although it is the window that broke in both cases, our inferred logical forms do not canonicalize the relation between broke and window. To achieve this, we would have to make the substitution of nsubj depend on lexical context, such that when window occurs as the nsubj of broke, the predicate arg2 is invoked rather than arg1.

Page 6: Universal Semantic Parsing · Universal Semantic Parsing Siva Reddy Oscar T ackstr¨ om¨ Slav Petrov Mark Steedman Mirella Lapata Stanford University Google Inc. University of Edinburgh

Figure 3: The ungrounded graphs for What language do the people in Ghana speak? (a, English; predicates speak.arg1, speak.arg2, people.arg1, people.nmod.in), Welche Sprache wird in Ghana gesprochen? (b, German; gesprochen.arg2, gesprochen.nmod.in) and ¿Cuál es la lengua de Ghana? (c, Spanish; lengua.arg1, lengua.nmod.de), and (d) the corresponding grounded Freebase graph (location.country.official_language, type language.human_language).

UDEPLAMBDA does not address this problem, and leaves it to the target application to infer the context-sensitive semantics of arg1 and arg2. To measure the impact of this limitation, we present UDEPLAMBDASRL in Section 4.4, which addresses this problem by relying on semantic roles from semantic role labeling (Palmer et al., 2010).

Other constructions that require lexical information are quantifiers like every, some and most, negation markers like no and not, and intensional verbs like believe and said. UD does not have special labels to indicate these. We discuss how to handle quantifiers in this framework in the supplementary material.

Although in the current setup UDEPLAMBDA rules are hand-coded, the number of rules is only proportional to the number of UD labels, making rule-writing manageable.5 Moreover, we view UDEPLAMBDA as a first step towards learning rules for converting UD to richer semantic representations such as PropBank, AMR, or the Parallel Meaning Bank (Palmer et al., 2005; Banarescu et al., 2013; Abzianidze et al., 2017).

4 Cross-lingual Semantic Parsing

To study the multilingual nature of UDEPLAMBDA, we conduct an empirical evaluation on question answering against Freebase in three different languages: English, Spanish, and German. Before discussing the details of this experiment, we briefly outline the semantic parsing framework employed.

5 UD v1.3 has 40 dependency labels, and the number of substitution rules in UDEPLAMBDA is 61, with some labels having multiple rules, and some representing lexical semantics.

4.1 Semantic Parsing as Graph Matching

UDEPLAMBDA generates ungrounded logical forms that are independent of any knowledge base, such as Freebase. We use GRAPHPARSER (Reddy et al., 2016) to map these logical forms to their grounded Freebase graphs, via corresponding ungrounded graphs. Figures 3(a) to 3(c) show the ungrounded graphs corresponding to logical forms from UDEPLAMBDA, each grounded to the same Freebase graph in Figure 3(d). Here, rectangles denote entities, circles denote events, rounded rectangles denote entity types, and edges between events and entities denote predicates or Freebase relations. Finally, the TARGET node represents the set of values of x that are consistent with the Freebase graph, that is, the answer to the question.

GRAPHPARSER treats semantic parsing as a graph-matching problem with the goal of finding the Freebase graphs that are structurally isomorphic to an ungrounded graph and ranking them according to a model. To account for structural mismatches, GRAPHPARSER uses two graph transformations: CONTRACT and EXPAND. In Figure 3(a) there are two edges between x and Ghana. CONTRACT collapses one of these edges to create a graph isomorphic to Freebase. EXPAND, in contrast, adds edges to connect the graph in the case of disconnected components. The search space is explored by beam search and model parameters are estimated with the averaged structured perceptron (Collins, 2002) from training data consisting of question-answer pairs, using answer F1-score as the objective.
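The learning loop can be pictured as a standard averaged structured perceptron (Collins, 2002) in which the beam candidate with the highest answer F1 serves as a surrogate gold target. The sketch below is hypothetical: the candidate objects, the candidates search function and the way the surrogate gold is chosen are assumptions, not details given in the paper.

    from collections import defaultdict

    def train(questions, candidates, answer_f1, epochs=5):
        """candidates(q, w) is assumed to run beam search under weights w and return
        objects with a sparse .features dict; answer_f1(q, c) scores a candidate's answer."""
        w, w_sum, steps = defaultdict(float), defaultdict(float), 0
        for _ in range(epochs):
            for q in questions:
                beam = candidates(q, w)
                if not beam:
                    continue
                score = lambda c: sum(w[f] * v for f, v in c.features.items())
                predicted = max(beam, key=score)
                surrogate = max(beam, key=lambda c: answer_f1(q, c))   # highest-F1 candidate
                if answer_f1(q, predicted) < answer_f1(q, surrogate):
                    for f, v in surrogate.features.items():
                        w[f] += v
                    for f, v in predicted.features.items():
                        w[f] -= v
                for f, v in w.items():                                 # accumulate for averaging
                    w_sum[f] += v
                steps += 1
        return {f: v / max(steps, 1) for f, v in w_sum.items()}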

4.2 Datasets

We evaluate our approach on two public benchmarks of question answering against Freebase: WebQuestions (Berant et al., 2013), a widely used benchmark consisting of English questions and their answers, and GraphQuestions (Su et al., 2016), a recently released dataset of English questions with both their answers and grounded logical forms.

While WebQuestions is dominated by simple entity-attribute questions, GraphQuestions contains a large number of compositional questions involving aggregation (e.g. How many children of Eddard Stark were born in Winterfell?) and comparison (e.g. In which month does the average rainfall of New York City exceed 86 mm?). The number of training, development and test questions is 2644, 1134, and 2032, respectively, for WebQuestions and 1794, 764, and 2608 for GraphQuestions.

To support multilingual evaluation, we created translations of WebQuestions and GraphQuestions to German and Spanish. For WebQuestions two professional annotators were hired per language, while for GraphQuestions we used a trusted pool of 20 annotators per language (with a single annotator per question). Examples of the original questions and their translations are provided in Table 1.

4.3 Implementation Details

Here we provide details on the syntactic analyzers employed, our entity resolution algorithm, and the features used by the grounding model.

Dependency Parsing The English, Spanish, and German Universal Dependencies (UD) treebanks (v1.3; Nivre et al., 2016) were used to train part-of-speech taggers and dependency parsers. We used a bidirectional LSTM tagger (Plank et al., 2016) and a bidirectional LSTM shift-reduce parser (Kiperwasser and Goldberg, 2016). Both the tagger and parser require word embeddings. For English, we used GloVe embeddings (Pennington et al., 2014) trained on Wikipedia and the Gigaword corpus. For German and Spanish, we used SENNA embeddings (Collobert et al., 2011; Al-Rfou et al., 2013) trained on Wikipedia corpora (589M words German; 397M words Spanish).6 Measured on the UD test sets, the tagger accuracies are 94.5 (English), 92.2 (German), and 95.7 (Spanish), with corresponding labeled attachment parser scores of 81.8, 74.7, and 82.2.

Entity Resolution We follow Reddy et al. (2016) and resolve entities in three steps: (1) potential entity spans are identified using seven handcrafted part-of-speech patterns; (2) each span is associated with potential Freebase entities according to the Freebase/KG API; and (3) the 10-best entity linking lattices, scored by a structured perceptron, are input to GRAPHPARSER, leaving the final disambiguation to the semantic parsing problem.

6 https://sites.google.com/site/rmyeid/projects/polyglot.

WebQuestions

en  What language do the people in Ghana speak?
de  Welche Sprache wird in Ghana gesprochen?
es  ¿Cuál es la lengua de Ghana?

en  Who was Vincent van Gogh inspired by?
de  Von wem wurde Vincent van Gogh inspiriert?
es  ¿Qué inspiró a Van Gogh?

GraphQuestions

en  NASA has how many launch sites?
de  Wie viele Abschussbasen besitzt NASA?
es  ¿Cuántos sitios de despegue tiene NASA?

en  Which loudspeakers are heavier than 82.0 kg?
de  Welche Lautsprecher sind schwerer als 82.0 kg?
es  ¿Qué altavoces pesan más de 82.0 kg?

Table 1: Example questions and their translations.

        WebQuestions        GraphQuestions
k       en    de    es      en    de    es

1       89.6  82.8  86.7    47.2  39.9  39.5
10      95.7  91.2  94.0    56.9  48.4  51.6

Table 2: Structured perceptron k-best entity linking accuracies on the development sets.

Table 2 shows the 1-best and 10-best entity disambiguation F1-scores for each language and dataset.
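As a hypothetical illustration of step (1) of the entity resolution pipeline (Section 4.3), candidate spans can be found by matching patterns over the part-of-speech sequence. The two patterns below are made up for illustration and are not the seven patterns used in the system.

    import re

    PATTERNS = [r"(PROPN )+", r"(NOUN )+"]   # illustrative only

    def candidate_spans(tokens):
        """tokens: list of (word, upos) pairs; returns word spans whose tag sequence matches."""
        tag_string = " ".join(tag for _, tag in tokens) + " "
        spans = []
        for pattern in PATTERNS:
            for m in re.finditer(pattern, tag_string):
                start = tag_string[:m.start()].count(" ")
                length = m.group(0).strip().count(" ") + 1
                spans.append(" ".join(w for w, _ in tokens[start:start + length]))
        return spans

    print(candidate_spans([("the", "DET"), ("people", "NOUN"), ("in", "ADP"), ("Ghana", "PROPN")]))
    # ['Ghana', 'people']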

Features We use features similar to Reddy et al. (2016): basic features of words and Freebase relations, and graph features crossing ungrounded events with grounded relations, ungrounded types with grounded relations, and the ungrounded answer type crossed with a binary feature indicating if the answer is a number. In addition, we add features encoding the semantic similarity of ungrounded events and Freebase relations. Specifically, we used the cosine similarity of the translation-invariant embeddings of Huang et al. (2015).7
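The similarity feature itself is just a cosine between two embedding vectors, one for the ungrounded event word and one for the Freebase relation. The function below is a trivial sketch; the embedding lookup is assumed to come from the translation-invariant embeddings.

    import numpy as np

    def similarity_feature(event_vec, relation_vec):
        """Cosine similarity between an ungrounded event embedding and a
        Freebase relation embedding."""
        denom = np.linalg.norm(event_vec) * np.linalg.norm(relation_vec)
        return float(event_vec @ relation_vec / denom) if denom else 0.0

    print(similarity_feature(np.array([1.0, 0.0]), np.array([1.0, 1.0])))   # ≈ 0.707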

4.4 Comparison Systems

We compared UDEPLAMBDA to four versions of GRAPHPARSER that operate on different representations, in addition to prior work.

SINGLEEVENT This model resembles the learning-to-rank model of Bast and Haussmann (2015). An ungrounded graph is generated by connecting all entities in the question with the TARGET node, representing a single event. Note that this baseline cannot handle compositional questions, or those with aggregation or comparison.

7 http://128.2.220.95/multilingual/data/.

                  WebQuestions         GraphQuestions
Method            en    de    es       en    de    es

SINGLEEVENT       48.5  45.6  46.3     15.9   8.8  11.4
DEPTREE           48.8  45.9  46.4     16.0   8.3  11.3
CCGGRAPH          49.5   –     –       15.9    –    –
UDEPLAMBDA        49.5  46.1  47.5     17.7   9.5  12.8
UDEPLAMBDASRL     49.8  46.2  47.0     17.7   9.1  12.7

Table 3: F1-scores on the test data.

DEPTREE An ungrounded graph is obtained directly from the original dependency tree. An event is created for each parent and its dependents in the tree. Each dependent is linked to this event with an edge labeled with its dependency relation, while the parent is linked to the event with an edge labeled arg0. If a word is a question word, an additional TARGET predicate is attached to its entity node.
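A hypothetical sketch of the DEPTREE construction, using the same toy tree encoding as in the earlier binarization sketch (head word plus labeled children); the edge and node naming is invented for the sketch.

    def deptree_graph(tree, question_words=frozenset()):
        """tree = (word, [(label, subtree), ...]); returns (edges, target_nodes)."""
        edges, targets = [], set()
        def visit(node):
            head, children = node
            if head in question_words:
                targets.add(head)                      # TARGET predicate on the question word
            if not children:
                return
            event = "event:" + head                    # one event per parent
            edges.append((event, "arg0", head))        # parent linked with arg0
            for label, child in children:
                edges.append((event, label, child[0])) # dependent linked with its relation
                visit(child)
        visit(tree)
        return edges, targets

    tree = ("speak", [("nsubj", ("people", [("nmod", ("Ghana", [("case", ("in", []))]))])),
                      ("dobj", ("language", [("det", ("What", []))]))])
    print(deptree_graph(tree, {"What"}))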

CCGGRAPH This is the CCG-based semantic representation of Reddy et al. (2014). Note that this baseline exists only for English.

UDEPLAMBDASRL This is similar to UDEPLAMBDA except that instead of assuming nsubj, dobj and nsubjpass correspond to arg1, arg2 and arg2, we employ semantic role labeling to identify the correct interpretation. We used the systems of Roth and Woodsend (2014) for English and German and Björkelund et al. (2009) for Spanish, trained on the CoNLL-2009 dataset (Hajič et al., 2009).8

4.5 Results

Table 3 shows the performance of GRAPHPARSER with these different representations. Here and in what follows, we use the average F1-score of predicted answers (Berant et al., 2013) as the evaluation metric. We first observe that UDEPLAMBDA consistently outperforms the SINGLEEVENT and DEPTREE representations in all languages.

For English, performance is on par with CCGGRAPH, which suggests that UDEPLAMBDA does not sacrifice too much specificity for universality. With both datasets, results are lower for German compared to Spanish. This agrees with the lower performance of the syntactic parser on the German portion of the UD treebank.

8 The parser accuracies (%) are 87.33, 81.38 and 79.91 for English, German and Spanish, respectively.

Method                                   GraphQ.   WebQ.

SEMPRE (Berant et al., 2013)              10.8      35.7
JACANA (Yao and Van Durme, 2014)           5.1      33.0
PARASEMPRE (Berant and Liang, 2014)       12.8      39.9
QA (Yao, 2015)                              –       44.3
AQQU (Bast and Haussmann, 2015)             –       49.4
AGENDAIL (Berant and Liang, 2015)           –       49.7
DEPLAMBDA (Reddy et al., 2016)              –       50.3

STAGG (Yih et al., 2015)                    –       48.4 (52.5)
BILSTM (Ture and Jojic, 2016)               –       24.9 (52.2)
MCNN (Xu et al., 2016)                      –       47.0 (53.3)
AGENDAIL-RANK (Yavuz et al., 2016)          –       51.6 (52.6)

UDEPLAMBDA                                 17.7      49.5

Table 4: F1-scores on the English GraphQuestions and WebQuestions test sets (results with additional task-specific resources in parentheses).

While UDEPLAMBDASRL performs better than UDEPLAMBDA on WebQuestions for English, we do not see large performance gaps in other settings, suggesting either that GRAPHPARSER is able to learn context-sensitive semantics of ungrounded predicates or that the datasets do not contain ambiguous nsubj, dobj and nsubjpass mappings. Finally, while these results confirm that GraphQuestions is much harder than WebQuestions, we note that both datasets predominantly contain single-hop questions, as indicated by the competitive performance of SINGLEEVENT on both datasets.

Table 4 compares UDEPLAMBDA with previously published models, which exist only for English and have been mainly evaluated on WebQuestions. These are either symbolic like ours (first block) or employ neural networks (second block). Results for models using additional task-specific training resources, such as ClueWeb09, Wikipedia, or SimpleQuestions (Bordes et al., 2015), are shown in parentheses. On GraphQuestions, we achieve a new state-of-the-art result with a gain of 4.9 F1 points over the previously reported best result. On WebQuestions we are 2.1 points below the best model using comparable resources, and 3.8 points below the state of the art. Most related to our work is the English-specific system of Reddy et al. (2016). We attribute the 0.8 point difference in F1-score to their use of the more fine-grained PTB tag set and Stanford Dependencies.

5 Related Work

Our work continues the long tradition of building logical forms from syntactic representations initiated by Montague (1973). The literature is rife with attempts to develop semantic interfaces for HPSG (Copestake et al., 2005), LFG (Kaplan and Bresnan, 1982; Dalrymple et al., 1995; Crouch and King, 2006), TAG (Kallmeyer and Joshi, 2003; Gardent and Kallmeyer, 2003; Nesson and Shieber, 2006), and CCG (Baldridge and Kruijff, 2002; Bos et al., 2004; Artzi et al., 2015). Unlike existing semantic interfaces, UDEPLAMBDA uses dependency syntax, a widely available syntactic resource.

A common trend in previous work on semantic interfaces is the reliance on rich typed feature structures or semantic types coupled with strong type constraints, which can be very informative but are unavoidably language specific. Instead, UDEPLAMBDA relies on generic unlexicalized information present in dependency treebanks and uses a simple type system (one type for dependency labels, and one for words) along with a combinatory mechanism, which avoids type collisions. Earlier attempts at extracting semantic representations from dependencies have mainly focused on language-specific dependency representations (Spreyer and Frank, 2005; Simov and Osenova, 2011; Hahn and Meurers, 2011; Reddy et al., 2016; Falke et al., 2016; Beltagy, 2016), and multi-layered dependency annotations (Jakob et al., 2010; Bedaride and Gardent, 2011). In contrast, UDEPLAMBDA derives semantic representations for multiple languages in a common schema directly from Universal Dependencies. This work parallels a growing interest in creating other forms of multilingual semantic representations (Akbik et al., 2015; Vanderwende et al., 2015; White et al., 2016; Evang and Bos, 2016).

We evaluate UDEPLAMBDA on semantic parsing for question answering against a knowledge base. Here, the literature offers two main modeling paradigms: (1) learning task-specific grammars that directly parse language to a grounded representation (Zelle and Mooney, 1996; Zettlemoyer and Collins, 2005; Berant et al., 2013; Flanigan et al., 2014; Pasupat and Liang, 2015; Groschwitz et al., 2015); and (2) converting language to a linguistically motivated task-independent representation that is then mapped to a grounded representation (Kwiatkowski et al., 2013; Reddy et al., 2014; Krishnamurthy and Mitchell, 2015; Gardner and Krishnamurthy, 2017). Our work belongs to the latter paradigm, as we map natural language to Freebase indirectly via logical forms. Capitalizing on natural-language syntax affords interpretability, scalability, and reduced duplication of effort across applications (Bender et al., 2015). Our work also relates to the literature on parsing multiple languages to a common executable representation (Cimiano et al., 2013; Haas and Riezler, 2016). However, existing approaches still map to the target meaning representations (more or less) directly (Kwiatkowski et al., 2010; Jones et al., 2012; Jie and Lu, 2014).

6 Conclusions

We introduced UDEPLAMBDA, a semantic interface for Universal Dependencies, and showed that the resulting semantic representation can be used for question answering against a knowledge base in multiple languages. We provided translations of benchmark datasets in German and Spanish, in the hope of stimulating further multilingual research on semantic parsing and question answering in general. We have only scratched the surface when it comes to applying UDEPLAMBDA to natural language understanding tasks. In the future, we would like to explore how this framework can benefit applications such as summarization (Liu et al., 2015) and machine reading (Sachan and Xing, 2016).

Acknowledgements

This work greatly benefited from discussions with Michael Collins, Dipanjan Das, Federico Fancellu, Julia Hockenmaier, Tom Kwiatkowski, Adam Lopez, Valeria de Paiva, Martha Palmer, Fernando Pereira, Emily Pitler, Vijay Saraswat, Nathan Schneider, Bonnie Webber, Luke Zettlemoyer, and the members of ILCC at the University of Edinburgh, the Microsoft Research Redmond NLP group, the Stanford NLP group, and the UW NLP and Linguistics group. We thank Reviewer 2 for useful feedback. The authors would also like to thank the Universal Dependencies community for the treebanks and documentation. This research is supported by a Google PhD Fellowship to the first author. We acknowledge the financial support of the European Research Council (Lapata; award number 681760).

References

Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, Pierre Ludmann, Duc-Duy Nguyen, and Johan Bos. 2017. The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations. In Proceedings of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Valencia, Spain, pages 242–247.

Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yunyao Li, Shivakumar Vaithyanathan, and Huaiyu Zhu. 2015. Generating High Quality Proposition Banks for Multilingual Semantic Role Labeling. In Proceedings of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing. Association for Computational Linguistics, Beijing, China, pages 397–407.

Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed Word Representations for Multilingual NLP. In Proceedings of Computational Natural Language Learning. Sofia, Bulgaria, pages 183–192.

Yoav Artzi. 2013. Cornell SPF: Cornell Semantic Parsing Framework. arXiv:1311.3011 [cs.CL].

Yoav Artzi, Kenton Lee, and Luke Zettlemoyer. 2015. Broad-coverage CCG Semantic Parsing with AMR. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1699–1710.

Jason Baldridge and Geert-Jan Kruijff. 2002. Coupling CCG and Hybrid Logic Dependency Semantics. In Proceedings of the Association for Computational Linguistics. pages 319–326.

Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract Meaning Representation for Sembanking. In Linguistic Annotation Workshop and Interoperability with Discourse. Sofia, Bulgaria, pages 178–186.

Hannah Bast and Elmar Haussmann. 2015. More Accurate Question Answering on Freebase. In Proceedings of the ACM International Conference on Information and Knowledge Management. pages 1431–1440.

Paul Bedaride and Claire Gardent. 2011. Deep Semantics for Dependency Structures. In Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics. pages 277–288.

Islam Beltagy. 2016. Natural Language Semantics Using Probabilistic Logic. Ph.D. thesis, Department of Computer Science, The University of Texas at Austin.

Emily M. Bender, Dan Flickinger, Stephan Oepen, Woodley Packard, and Ann Copestake. 2015. Layers of Interpretation: On Grammar and Compositionality. In Proceedings of the International Conference on Computational Semantics. Association for Computational Linguistics, London, UK, pages 239–249.

Jonathan Berant, Andrew Chou, Roy Frostig, and Percy Liang. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1533–1544.

Jonathan Berant and Percy Liang. 2014. Semantic Parsing via Paraphrasing. In Proceedings of the Association for Computational Linguistics. pages 1415–1425.

Jonathan Berant and Percy Liang. 2015. Imitation Learning of Agenda-Based Semantic Parsers. Transactions of the Association for Computational Linguistics 3:545–558.

Anders Björkelund, Love Hafdell, and Pierre Nugues. 2009. Multilingual Semantic Role Labeling. In Proceedings of Computational Natural Language Learning (CoNLL 2009): Shared Task. Association for Computational Linguistics, Boulder, Colorado, pages 43–48.

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. 2015. Large-scale Simple Question Answering with Memory Networks. CoRR abs/1506.02075.

Johan Bos, Stephen Clark, Mark Steedman, James R. Curran, and Julia Hockenmaier. 2004. Wide-Coverage Semantic Representations from a CCG Parser. In Proceedings of the Conference on Computational Linguistics. pages 1240–1246.

Philipp Cimiano, Vanessa Lopez, Christina Unger, Elena Cabrio, Axel-Cyrille Ngonga Ngomo, and Sebastian Walter. 2013. Multilingual Question Answering over Linked Data (QALD-3): Lab Overview. In Information Access Evaluation. Multilinguality, Multimodality, and Visualization. Springer, Valencia, Spain, volume 8138.

Michael Collins. 2002. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1–8.

Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research 12:2493–2537.

Ann Copestake, Dan Flickinger, Carl Pollard, and Ivan A. Sag. 2005. Minimal Recursion Semantics: An Introduction. Research on Language and Computation 3(2-3):281–332.

Dick Crouch and Tracy Holloway King. 2006. Semantics via F-structure Rewriting. In Proceedings of the LFG'06 Conference. CSLI Publications, page 145.

Mary Dalrymple, John Lamping, Fernando C. N. Pereira, and Vijay A. Saraswat. 1995. Linear Logic for Meaning Assembly. In Proceedings of Computational Logic for Natural Language Processing.

Kilian Evang and Johan Bos. 2016. Cross-lingual Learning of an Open-domain Semantic Parser. In Proceedings of the Conference on Computational Linguistics. The COLING 2016 Organizing Committee, Osaka, Japan, pages 579–588.

Tobias Falke, Gabriel Stanovsky, Iryna Gurevych, and Ido Dagan. 2016. Porting an Open Information Extraction System from English to German. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 892–898.

Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. 2014. A Discriminative Graph-Based Parser for the Abstract Meaning Representation. In Proceedings of the Association for Computational Linguistics. pages 1426–1436.

Claire Gardent and Laura Kallmeyer. 2003. Semantic Construction in Feature-based TAG. In Proceedings of the European Chapter of the Association for Computational Linguistics. pages 123–130.

Matt Gardner and Jayant Krishnamurthy. 2017. Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge. In Proceedings of the Association for the Advancement of Artificial Intelligence.

Jonas Groschwitz, Alexander Koller, and Christoph Teichmann. 2015. Graph Parsing with S-graph Grammars. In Proceedings of the Association for Computational Linguistics. pages 1481–1490.

Carolin Haas and Stefan Riezler. 2016. A Corpus and Semantic Parser for Multilingual Natural Language Querying of OpenStreetMap. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, California, pages 740–750.

Michael Hahn and Detmar Meurers. 2011. On Deriving Semantic Representations from Dependencies: A Practical Approach for Evaluating Meaning in Learner Corpora. In Proceedings of the International Conference on Dependency Linguistics (Depling 2011). Barcelona, pages 94–103.

Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, et al. 2009. The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages. In Proceedings of Computational Natural Language Learning: Shared Task. Association for Computational Linguistics, pages 1–18.

Kejun Huang, Matt Gardner, Evangelos Papalexakis, Christos Faloutsos, Nikos Sidiropoulos, Tom Mitchell, Partha P. Talukdar, and Xiao Fu. 2015. Translation Invariant Word Embeddings. In Proceedings of the Empirical Methods in Natural Language Processing. Lisbon, Portugal, pages 1084–1088.

Max Jakob, Markéta Lopatková, and Valia Kordoni. 2010. Mapping between Dependency Structures and Compositional Semantic Representations. In Proceedings of the Fifth International Conference on Language Resources and Evaluation.

Zhanming Jie and Wei Lu. 2014. Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations. In Proceedings of the Conference on Computational Linguistics. Dublin City University and Association for Computational Linguistics, Dublin, Ireland, pages 1291–1301.

Bevan Keeley Jones, Mark Johnson, and Sharon Goldwater. 2012. Semantic Parsing with Bayesian Tree Transducers. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Stroudsburg, PA, USA, pages 488–496.

Laura Kallmeyer and Aravind Joshi. 2003. Factoring Predicate Argument and Scope Semantics: Underspecified Semantics with LTAG. Research on Language and Computation 1(1-2):3–58.

Ronald M. Kaplan and Joan Bresnan. 1982. Lexical-Functional Grammar: A Formal System for Grammatical Representation. Formal Issues in Lexical-Functional Grammar, pages 29–130.

Eliyahu Kiperwasser and Yoav Goldberg. 2016. Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations. Transactions of the Association for Computational Linguistics 4:313–327.

Jayant Krishnamurthy and Tom M. Mitchell. 2015. Learning a Compositional Semantics for Freebase with an Open Predicate Vocabulary. Transactions of the Association for Computational Linguistics 3:257–270.

Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. 2010. Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1223–1233.

Tom Kwiatkowski, Eunsol Choi, Yoav Artzi, and Luke Zettlemoyer. 2013. Scaling Semantic Parsers with On-the-Fly Ontology Matching. In Proceedings of the Empirical Methods on Natural Language Processing. pages 1545–1556.

Roger Levy and Galen Andrew. 2006. Tregex and Tsurgeon: Tools for Querying and Manipulating Tree Data Structures. In Proceedings of LREC. pages 2231–2234.

Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. 2015. Toward Abstractive Summarization Using Semantic Representations. In Proceedings of the North American Chapter of the Association for Computational Linguistics. pages 1077–1086.

Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics 19(2):313–330.

Richard Montague. 1973. The Proper Treatment of Quantification in Ordinary English. In K. J. J. Hintikka, J. M. E. Moravcsik, and P. Suppes, editors, Approaches to Natural Language, Springer Netherlands, volume 49 of Synthese Library, pages 221–242.

Rebecca Nesson and Stuart M. Shieber. 2006. Simpler TAG Semantics Through Synchronization. In Proceedings of the 11th Conference on Formal Grammar. Center for the Study of Language and Information, Malaga, Spain, pages 129–142.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher D. Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman. 2016. Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Joakim Nivre et al. 2016. Universal Dependencies 1.3. LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University in Prague.

Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics 31(1):71–106.

Martha Palmer, Daniel Gildea, and Nianwen Xue. 2010. Semantic Role Labeling. Synthesis Lectures on Human Language Technologies 3(1):1–103.

Panupong Pasupat and Percy Liang. 2015. Compositional Semantic Parsing on Semi-Structured Tables. In Proceedings of the Association for Computational Linguistics. pages 1470–1480.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Doha, Qatar, pages 1532–1543.

Barbara Plank, Anders Søgaard, and Yoav Goldberg. 2016. Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss. In Proceedings of the Annual Meeting of the Association for Computational Linguistics. Berlin, Germany, pages 412–418.

Siva Reddy, Mirella Lapata, and Mark Steedman. 2014. Large-scale Semantic Parsing without Question-Answer Pairs. Transactions of the Association for Computational Linguistics 2:377–392.

Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, and Mirella Lapata. 2016. Transforming Dependency Structures to Logical Forms for Semantic Parsing. Transactions of the Association for Computational Linguistics 4:127–140.

Michael Roth and Kristian Woodsend. 2014. Composition of Word Representations Improves Semantic Role Labelling. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pages 407–413.

Mrinmaya Sachan and Eric Xing. 2016. Machine Comprehension Using Rich Semantic Representations. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Berlin, Germany, pages 486–492.

Sebastian Schuster and Christopher D. Manning. 2016. Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Kiril Simov and Petya Osenova. 2011. Towards Minimal Recursion Semantics over Bulgarian Dependency Parsing. In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011. RANLP 2011 Organising Committee, Hissar, Bulgaria, pages 471–478.

Kathrin Spreyer and Anette Frank. 2005. Projecting RMRS from TIGER Dependencies. In Proceedings of the HPSG 2005 Conference. CSLI Publications.

Yu Su, Huan Sun, Brian Sadler, Mudhakar Srivatsa, Izzeddin Gur, Zenghui Yan, and Xifeng Yan. 2016. On Generating Characteristic-rich Question Sets for QA Evaluation. In Proceedings of the Empirical Methods in Natural Language Processing. Austin, Texas, pages 562–572.

Ferhan Ture and Oliver Jojic. 2016. Simple and Effective Question Answering with Recurrent Neural Networks. CoRR abs/1606.05029.

Lucy Vanderwende, Arul Menezes, and Chris Quirk. 2015. An AMR Parser for English, French, German, Spanish and Japanese and a New AMR-annotated Corpus. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Demonstrations. Association for Computational Linguistics, Denver, Colorado, pages 26–30.


Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2016. Universal Decompositional Semantics on Universal Dependencies. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 1713–1723.

Kun Xu, Siva Reddy, Yansong Feng, Songfang Huang, and Dongyan Zhao. 2016. Question Answering on Freebase via Relation Extraction and Textual Evidence. In Proceedings of the Association for Computational Linguistics. Association for Computational Linguistics, Berlin, Germany, pages 2326–2336.

Xuchen Yao. 2015. Lean Question Answering over Freebase from Scratch. In Proceedings of the North American Chapter of the Association for Computational Linguistics. pages 66–70.

Xuchen Yao and Benjamin Van Durme. 2014. Information Extraction over Structured Data: Question Answering with Freebase. In Proceedings of the Association for Computational Linguistics. pages 956–966.

Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, and Xifeng Yan. 2016. Improving Semantic Parsing via Answer Type Inference. In Proceedings of the Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 149–159.

Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao. 2015. Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base. In Proceedings of the Association for Computational Linguistics. pages 1321–1331.

John M. Zelle and Raymond J. Mooney. 1996. Learning to Parse Database Queries Using Inductive Logic Programming. In Proceedings of the Association for the Advancement of Artificial Intelligence. pages 1050–1055.

Luke S. Zettlemoyer and Michael Collins. 2005. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars. In Proceedings of Uncertainty in Artificial Intelligence. pages 658–666.


Universal Semantic Parsing: Supplementary Material

Siva Reddy†  Oscar Täckström‡  Slav Petrov‡  Mark Steedman††  Mirella Lapata††

†Stanford University  ‡Google Inc.  ††University of Edinburgh

Abstract

This supplementary material to the main paper provides an outline of how quantification can be incorporated in the UDEPLAMBDA framework.

1 Universal Quantification

Consider the sentence Everybody wants to buy a house,1 whose dependency tree in the Universal Dependencies (UD) formalism is shown in Figure 1(a). This sentence has two possible readings: either (1) every person wants to buy a different house; or (2) every person wants to buy the same house. The two interpretations correspond to the following logical forms:

(1) ∀x. person(xa) → [∃zyw. wants(ze) ∧ arg1(ze,xa) ∧ buy(ye) ∧ xcomp(ze,ye) ∧ house(wa) ∧ arg1(ye,xa) ∧ arg2(ye,wa)];

(2) ∃w. house(wa) ∧ (∀x. person(xa) → [∃zy. wants(ze) ∧ arg1(ze,xa) ∧ buy(ye) ∧ xcomp(ze,ye) ∧ arg1(ye,xa) ∧ arg2(ye,wa)]).

In (1), the existential variable w is in the scope of the universal variable x (i.e. the house is dependent on the person). This reading is commonly referred to as the surface reading. Conversely, in (2) the universal variable x is in the scope of the existential variable w (i.e. the house is independent of the person). This reading is also called the inverse reading. Our goal is to obtain the surface reading logical form in (1) with UDEPLAMBDA. We do not aim to obtain the inverse reading, although this is possible with the use of Skolemization (Steedman, 2012).
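To see concretely how the two readings come apart, the following toy Python check (an illustration added here, not part of the paper or the UDEPLAMBDA implementation; the model and names are invented) evaluates both readings against a small model in which each person wants to buy a different house:

# Toy model: person -> set of houses that person wants to buy (invented data)
people = {"ann", "bob"}
wants_to_buy = {"ann": {"h1"}, "bob": {"h2"}}
houses = set().union(*wants_to_buy.values())

# Surface reading (1): for every person there is some house they want to buy.
surface = all(len(wants_to_buy[p]) > 0 for p in people)

# Inverse reading (2): there is a single house that every person wants to buy.
inverse = any(all(h in wants_to_buy[p] for p in people) for h in houses)

print(surface, inverse)  # True False: reading (1) holds, reading (2) fails

On this model the surface reading is true while the inverse reading is false, which is why the two logical forms are not equivalent.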

In UDEPLAMBDA, lambda expressions for words, phrases and sentences are all of the form λx. . . . But from (1), it is clear that we need to express variables bound by quantifiers, e.g. ∀x, while still providing access to x for composition. This demands a change in the type system, since the same variable cannot be both lambda bound and quantifier bound; that is, we cannot have formulas of the form λx . . . ∀x . . . In this material, we first derive the logical form for the example sentence using the type system from our main paper (Section 1.1) and show that it fails to handle universal quantification. We then modify the type system slightly to allow derivation of the desired surface-reading logical form (Section 1.2). This modified type system is a strict generalization of the original type system.2 Fancellu et al. (2017) present an elaborate discussion of the modified type system and how it can handle negation scope and its interaction with universal quantifiers.

[Figure 1: The dependency tree for Everybody wants to buy a house and its enhanced variants. (a) Original dependency tree. (b) Enhanced dependency tree. (c) Enhanced dependency tree with universal quantification.]

1 Example borrowed from Schuster and Manning (2016).

2 Note that this treatment has yet to be added to our implementation, which can be found at https://github.com/sivareddyg/udeplambda.


1.1 With Original Type System

We will first attempt to derive the logical form in (1) using the default type system of UDEPLAMBDA. Figure 1(b) shows the enhanced dependency tree for the sentence, where BIND has been introduced to connect the implied nsubj of buy (BIND is explained in the main paper in Section 3.2). The s-expression corresponding to the enhanced tree is:

(nsubj (xcomp wants (mark (nsubj (dobj buy (det house a)) Ω) to)) (BIND everybody Ω)).

With the following substitution entries,

wants, buy ∈ EVENT;
everybody, house ∈ ENTITY;
a, to ∈ FUNCTIONAL;
Ω = λx.EQ(x,ω);
nsubj = λfgx.∃y. f(x) ∧ g(y) ∧ arg1(xe,ya);
dobj = λfgx.∃y. f(x) ∧ g(y) ∧ arg2(xe,ya);
xcomp = λfgx.∃y. f(x) ∧ g(y) ∧ xcomp(xe,ya);
mark ∈ HEAD;
BIND ∈ MERGE,

the lambda expression after composition becomes:

λz.∃xywv. wants(ze) ∧ everybody(xa) ∧ arg1(ze,xa) ∧ EQ(x,ω) ∧ buy(ye) ∧ xcomp(ze,ye) ∧ arg1(ye,va) ∧ EQ(v,ω) ∧ house(wa) ∧ arg2(ye,wa).

This expression encodes the fact that x and v are in unification, and can thus be further simplified to:

(3) λz.∃xyw. wants(ze) ∧ everybody(xa) ∧ arg1(ze,xa) ∧ buy(ye) ∧ xcomp(ze,ye) ∧ arg1(ye,xa) ∧ house(wa) ∧ arg2(ye,wa).

However, the logical form (3) differs from the desired form (1). As noted above, UDEPLAMBDA with its default type, where each s-expression must have the type η = Ind × Event → Bool, cannot handle quantifier scoping.
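The limitation can also be seen operationally. The sketch below (an illustration added here under simplifying assumptions, not the UDEPLAMBDA implementation: BIND, Ω and the functional words are omitted, and generated variable names stand in for the paper's y, w, v) mimics the single-type regime, where every expression denotes a predicate over one variable and each dependency label merely conjoins parent and child predicates under an implicit existential. The composed result is necessarily a flat conjunction like (3), with no position at which a universal quantifier could be introduced:

import itertools

_fresh = itertools.count()

def fresh():
    # Fresh names for the implicitly existentially bound variables.
    return f"v{next(_fresh)}"

# An expression of the single type eta = Ind x Event -> Bool is modelled as a
# function from a variable name to the list of conjuncts holding of it.
def word(pred, slot):  # EVENT words use slot "e", ENTITY words slot "a"
    return lambda x: [f"{pred}({x}{slot})"]

def relation(label):  # nsubj, dobj, xcomp: conjoin parent and child predicates
    def combine(parent, child):
        def expr(x):
            y = fresh()
            return parent(x) + child(y) + [f"{label}({x}e,{y}a)"]
        return expr
    return combine

wants, buy = word("wants", "e"), word("buy", "e")
everybody, house = word("everybody", "a"), word("house", "a")
nsubj, dobj, xcomp = relation("arg1"), relation("arg2"), relation("xcomp")

# (nsubj (xcomp wants (dobj buy house)) everybody)
sentence = nsubj(xcomp(wants, dobj(buy, house)), everybody)
print("λz.∃... " + " ∧ ".join(sentence("z")))
# Every conjunct sits under a single λz.∃... prefix, as in (3); "everybody"
# has no way to contribute a ∀ that takes scope over the rest.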

1.2 With Higher-order Type System

Following Champollion (2010), we make a slight modification to the type system. Instead of using expressions of the form λx. . . . for words, we use either λf.∃x. . . . or λf.∀x. . . ., where f has type η. As argued by Champollion, this higher-order form makes quantification and negation handling sound and simpler in Neo-Davidsonian event semantics. Following this change, we assign the following lambda expressions to the words in our example sentence:

everybody = λf.∀x. person(x) → f(x);
wants = λf.∃x. wants(xe) ∧ f(x);
to = λf. TRUE;
buy = λf.∃x. buy(xe) ∧ f(x);
a = λf. TRUE;
house = λf.∃x. house(xa) ∧ f(x);
Ω = λf. f(ω).

Here everybody is assigned universal quantifier semantics. Since the UD representation does not distinguish quantifiers, we need to rely on a small (language-specific) lexicon to identify these. To encode quantification scope, we enhance the label nsubj to nsubj:univ, which indicates that the subject argument of wants contains a universal quantifier, as shown in Figure 1(c).

This change of semantic type for words and s-expressions forces us to also modify the semantic type of dependency labels, in order to obey the single-type constraint of DEPLAMBDA (Reddy et al., 2016). Thus, dependency labels will now take the form λPQf. . . ., where P is the parent expression, Q is the child expression, and the return expression is of the form λf. . . . Following this change, we assign the following lambda expressions to dependency labels:

nsubj:univ = λPQf. Q(λy. P(λx. f(x) ∧ arg1(xe,ya)));
nsubj = λPQf. P(λx. f(x) ∧ Q(λy. arg1(xe,ya)));
dobj = λPQf. P(λx. f(x) ∧ Q(λy. arg2(xe,ya)));
xcomp = λPQf. P(λx. f(x) ∧ Q(λy. xcomp(xe,ya)));
det, mark = λPQf. P(f);
BIND = λPQf. P(λx. f(x) ∧ Q(λy. EQ(y,x))).

Notice that the lambda expression of nsubj:univ differs from that of nsubj. In the former, the lambda variables inside Q take scope over the variables in P (i.e. the universal quantifier variable of everybody has scope over the event variable of wants), whereas in the latter the opposite holds.

The new s-expression for Figure 1(c) is:

(nsubj:univ (xcomp wants (mark (nsubj (dobj buy (det house a)) Ω) to)) (BIND everybody Ω)).

Substituting with the modified expressions, and performing composition and simplification, leads to the expression:

(6) λf.∀x. person(xa) → [∃zyw. f(z) ∧ wants(ze) ∧ arg1(ze,xa) ∧ buy(ye) ∧ xcomp(ze,ye) ∧ house(wa) ∧ arg1(ye,xa) ∧ arg2(ye,wa)].

This expression is identical to (1) except for the outermost term λf and the conjunct f(z). By applying (6) to λx.TRUE, we obtain (1), which completes the treatment of universal quantification in UDEPLAMBDA.
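As a sanity check of the higher-order regime, the sketch below (again an added illustration under simplifying assumptions, not the UDEPLAMBDA implementation: BIND, Ω and the functional words are omitted, and generated variable names replace the paper's z, y, w) threads the continuation f through composition. Because nsubj:univ lets the child expression outscope the parent, everybody inserts its ∀ above the wants event, and applying the result to a trivially true continuation mirrors the final application to λx.TRUE:

import itertools

_fresh = itertools.count()

def fresh():
    return f"x{next(_fresh)}"

# Words are now functions from a continuation f (of type eta) to a formula string.
def existential(pred, slot):
    def expr(f):
        x = fresh()
        return f"∃{x}.{pred}({x}{slot}) ∧ {f(x)}"
    return expr

def universal(pred):
    def expr(f):
        x = fresh()
        return f"∀{x}.{pred}({x}a) → [{f(x)}]"
    return expr

# Dependency labels have the shape lambda P Q f. ...
def nsubj_univ(P, Q):
    # The quantified child Q outscopes the parent P.
    return lambda f: Q(lambda y: P(lambda x: f(x) + f" ∧ arg1({x}e,{y}a)"))

def label(rel):
    # Default labels: the parent P outscopes the child Q.
    def combine(P, Q):
        def expr(f):
            def parent_body(x):
                child_body = Q(lambda y: f"{rel}({x}e,{y}a)")
                return f"{f(x)} ∧ {child_body}"
            return P(parent_body)
        return expr
    return combine

dobj, xcomp = label("arg2"), label("xcomp")

everybody = universal("person")
wants, buy = existential("wants", "e"), existential("buy", "e")
house = existential("house", "a")

# (nsubj:univ (xcomp wants (dobj buy house)) everybody)
sentence = nsubj_univ(xcomp(wants, dobj(buy, house)), everybody)
print(sentence(lambda z: "TRUE"))  # apply to λx.TRUE, as in the final step
# ∀x0.person(x0a) → [∃x1.wants(x1e) ∧ TRUE ∧ arg1(x1e,x0a) ∧ ∃x2.buy(x2e) ∧
#   xcomp(x1e,x2a) ∧ ∃x3.house(x3a) ∧ arg2(x2e,x3a)]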

References

Lucas Champollion. 2010. Quantification and negation in event semantics. Baltic International Yearbook of Cognition, Logic and Communication 6(1):3.

Federico Fancellu, Siva Reddy, Adam Lopez, and Bonnie Webber. 2017. Universal Dependencies to Logical Forms with Negation Scope. arXiv preprint.


Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, and Mirella Lapata. 2016. Transforming Dependency Structures to Logical Forms for Semantic Parsing. Transactions of the Association for Computational Linguistics 4:127–140.

Sebastian Schuster and Christopher D. Manning. 2016. Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), Paris, France.

Mark Steedman. 2012. Taking Scope: The Natural Semantics of Quantifiers. MIT Press.