locality and the complexity of minimalist derivation tree languages

20
Locality and the Complexity of Minimalist Derivation Tree Languages Thomas Graf Department of Linguistics University of California, Los Angeles [email protected] http://tgraf.bol.ucla.edu Abstract. Minimalist grammars provide a formalization of Minimalist syntax which allows us to study how the components of said theory affect its expressivity. A central concern of Minimalist syntax is the locality of the displacement operation Move. In Minimalist grammars, however, Move is unbounded. This paper is a study of the repercussions of limiting movement with respect to the number of slices a moved constituent is allowed to cross, where a slice is the derivation tree equivalent of the phrase projected by a lexical item in the derived tree. I show that this locality condition 1) has no effect on weak generative capacity 2) has no effect on a Minimalist derivation tree language’s recognizability by top-down automata 3) renders Minimalist derivation tree languages strictly locally testable, whereas their unrestricted counterparts aren’t even locally threshold testable. Key words: Minimalist grammars, locality, subregular tree languages, first-order logic, top-down tree automata Introduction Even though Minimalist grammars (MGs) weren’t introduced by Stabler [16] with the sole intent of scrutinizing the merits of ideas put forward by syntacti- cians in the wake of Chomsky’s Minimalist Program [2], a lot of work on MGs certainly focuses on this aspect [cf. 17]. Recently, considerable attention has also been directed towards the role played by derivation trees in MGs [4, 7, 8]. It is now known that every MG’s derivation tree language is regular and “almost” closed under intersection with regular tree languages (some refinement of cat- egory labels is usually required), but it is still an open question which class of tree languages approximates them reasonably well. This paper combines both research strands by taking the linguistically motivated restriction to local move- ment as its vantage point for an examination of the structural complexity of Minimalist derivation tree languages (MDTLs). The main result is that while bounding the distance of movement leaves weak generative capacity unaffected, the complexity of MDTLs is lowered to a degree where they become strictly lo- cally testable. Since MGs are fully characterized by their MDTLs, lowering the

Upload: duonghanh

Post on 11-Jan-2017

227 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Locality and the Complexity of Minimalist Derivation Tree Languages

Locality and the Complexity ofMinimalist Derivation Tree Languages

Thomas Graf

Department of LinguisticsUniversity of California, Los Angeles

[email protected]

http://tgraf.bol.ucla.edu

Abstract. Minimalist grammars provide a formalization of Minimalistsyntax which allows us to study how the components of said theory affectits expressivity. A central concern of Minimalist syntax is the localityof the displacement operation Move. In Minimalist grammars, however,Move is unbounded. This paper is a study of the repercussions of limitingmovement with respect to the number of slices a moved constituentis allowed to cross, where a slice is the derivation tree equivalent ofthe phrase projected by a lexical item in the derived tree. I show thatthis locality condition 1) has no effect on weak generative capacity 2)has no effect on a Minimalist derivation tree language’s recognizabilityby top-down automata 3) renders Minimalist derivation tree languagesstrictly locally testable, whereas their unrestricted counterparts aren’teven locally threshold testable.

Key words: Minimalist grammars, locality, subregular tree languages,first-order logic, top-down tree automata

Introduction

Even though Minimalist grammars (MGs) weren’t introduced by Stabler [16]with the sole intent of scrutinizing the merits of ideas put forward by syntacti-cians in the wake of Chomsky’s Minimalist Program [2], a lot of work on MGscertainly focuses on this aspect [cf. 17]. Recently, considerable attention has alsobeen directed towards the role played by derivation trees in MGs [4, 7, 8]. It isnow known that every MG’s derivation tree language is regular and “almost”closed under intersection with regular tree languages (some refinement of cat-egory labels is usually required), but it is still an open question which class oftree languages approximates them reasonably well. This paper combines bothresearch strands by taking the linguistically motivated restriction to local move-ment as its vantage point for an examination of the structural complexity ofMinimalist derivation tree languages (MDTLs). The main result is that whilebounding the distance of movement leaves weak generative capacity unaffected,the complexity of MDTLs is lowered to a degree where they become strictly lo-cally testable. Since MGs are fully characterized by their MDTLs, lowering the

Page 2: Locality and the Complexity of Minimalist Derivation Tree Languages

2

upper bound from regular to strictly locally testable may prove useful for vari-ous practical applications operating directly on the derivation trees, in particularparsing [9, 18, 21].

The paper is laid out as follows: After some general technical remarks, Sec. 2introduces MGs, their derivation trees, and the important concept of slices.Building on these notions, I define movement-free and k-local MGs, and I provethat every MG can be converted into a k-local one. The complexity of thesederivation trees is then studied with respect to several subregular languagesclasses in Sec. 3. I first show that MDTLs can be recognized by l-r-deterministictop-down tree automata, but not by sensing tree automata, which entails non-recognizability by deterministic top-down tree automata. Furthermore, everyk-local MG G has a strictly locally κ-testable derivation tree language, with thevalue of κ depending on several parameters of G. This result is followed by ademonstration that unrestricted MDTLs are not locally threshold testable. How-ever, they are definable in first-order logic with predicates for left child, rightchild, proper dominance, and equivalence (Sec. 4).

1 Preliminaries and Notation

As usual, N denotes the set of non-negative integers. A tree domain is a finitesubset D of N∗ such that, for w ∈ N∗ and j ∈ N, wj ∈ D implies both w ∈ Dand wi ∈ D for all i < j. Every n ∈ D is called a node. Given nodes m,n ∈ D,m immediately dominates n iff n = mi, i ∈ N. In this case we also say m is themother of n, or conversely, n is a daughter of m. The transitive closure of theimmediate dominance relation is called (proper) dominance. A node that doesnot dominate any other nodes is a leaf, and the unique node that isn’t dominatedby any nodes is called the root.

Now let Σ be a ranked alphabet, i.e. every σ ∈ Σ has a unique non-negativerank (arity); Σ(n) is the set of all n-ary symbols in Σ. A Σ-tree is a pair T :=〈D, `〉, where D is a tree domain and ` : D → Σ is a function assigning each noden a label drawn from Σ such that `(n) ∈ Σ(d) iff n has d daughters. Usuallythe alphabet will not be indicated in writing when it is irrelevant or can beinferred from the context. Sometimes trees will be given in functional notationsuch that f(t1, . . . , tn) is the tree whose root node is labeled f and immediatelydominates trees t1, . . . , tn. I denote by TΣ the set of all trees such that for n ≥ 0,f(t1, . . . , tn) is in TΣ iff f ∈ Σ(n) and ti ∈ TΣ , 1 ≤ i ≤ n. A tree language issome subset of TΣ .

A context C is a Σ ∪ -tree, where is a new symbol that appears onexactly one leaf of C, designating it as the port of C. A context C with occurring in the configuration c := σ(t1, . . . ,, . . . , tn), σ ∈ Σ(n) and each ti aΣ-tree, can be composed with context C ′ (written C · C ′) by replacing c withσ(t1, . . . , C

′, . . . , tn). This extends naturally to all cases where C = and C ′

is a tree rather than a context. Given a Σ-tree t := 〈D, `〉 and some node uof t, t|u := 〈D|u, `〉 denotes the subtree rooted by u in t, such that D|u :=n ∈ D | u = n or u dominates n and dominance and the labeling function are

Page 3: Locality and the Complexity of Minimalist Derivation Tree Languages

3

preserved. For any tree t with nodes m and n of t such that either m = nor m dominates n, Ct[m,n) is the context obtained from t|m by replacing t|nby a port. If s and t are trees, r the root of s and u some node of s, thens[u← t] := Cs[r, u) · t.

Let m and n be nodes of some tree t. A path from m to n is a sequence ofnode 〈i0, . . . , ik〉 such that i0 = m, ik = n, and for all j < k, ij is the mother orthe daughter of ij+1. A path containing k nodes is of length k− 1. The distancebetween nodes m and n is the length of the shortest path from m to n. Thedepth of a tree t is identical to the greatest distance between the root of t andone of its leafs.

We now move on to defining the strictly locally testable languages, followingthe exposition in [21]. For each Σ-tree and choice of k ≥ 1, we define its k-factors,or more precisely, its k-prefixes, k-forks and k-suffixes as follows:

pk(σ(t1, . . . , tn)) :=

σ if k = 1 or σ has no children

σ(pk−1(t1), . . . , pk−1(tn)) otherwise

fk(σ(t1, . . . , tn)) :=

∅if σ(t1, . . . , tn) is of

depth d < k − 1

pk(σ(t1, . . . , tn)) ∪⋃ni=1 fk(ti) otherwise

sk(σ(t1, . . . , tn)) :=

σ(t1, . . . , tn) ∪⋃ni=1 sk(ti)

if σ(t1, . . . , tn) is of depth

d < k − 1⋃mi=1 sk(ti) otherwise

A tree language L ⊆ TΣ is strictly locally k-testable (in SLk) iff there exist threefinite subsets R, F , and S, such that t ∈ L iff pk−1(t) ∈ R, fk(t) ⊆ F , andsk−1 ⊆ S. A language is local (in LOC) iff it is in SL2. It is locally k-thresholdtestable (in LTTk) iff furthermore each k-factor must appear a specific numberof times, counting up to some fixed threshold. When the threshold is set to 1, L islocally k-testable (in LTk). We say that L belongs to one of these classes iff thereis some k such that L is k-testable in the intended sense. Finally, L is regular iffit is the range of a transduction computed by some linear tree transducer withdomain TΣ (the reader is referred to [5] for further details).

Definition 1. A linear tree transducer is a 5-tuple A := 〈Σ,Ω,Q,Q′, ∆〉, whereΣ and Ω are finite ranked alphabets, Q is a finite set of states, Q′ ⊆ Q the set offinal states, and ∆ is a set of productions of the form f(q1(x1), . . . , qn(xn)) →q(t), where f ∈ Σ(n), q1, . . . , qn, q ∈ Q, t ∈ TΩ∪x1,...,xn, and each xi occurs atmost once in t.

It is a well-known fact that LOC ⊂ SL ⊂ LT ⊂ LTT ⊂ REG.

2 Minimalist Grammars

2.1 Introduction and Examples

MGs are a highly lexicalized formalism. Every lexical item (LI) comes equippedwith a linear sequence of features that have to be “checked”, or equivalently,

Page 4: Locality and the Complexity of Minimalist Derivation Tree Languages

4

“erased” in the right order. Features come in two varieties that can only bechecked by the operations Merge and Move, respectively. Merge conjoins trees,while Move displaces subtrees. A very simple MG, for example, is instantiatedby the following lexicon.

man :: n the :: = n d ε :: = v + nom tJohn :: d the :: = n d − nom ε :: = t cJohn :: d − nom the :: = n d − top ε :: = t + top cJohn :: d − top killed :: = d = d v

The first component of an LI denotes its phonetic exponent, the second oneits feature string. Features without a prefix represent categories (n for noun, d fordeterminer, and so on). A category feature, say n, of LI l is checked wheneverMerge combines l with another LI l′ such that the respective first uncheckedfeatures of l and l′ are n and the matching selector feature = n. This is alsothe only feature configuration in which Merge may apply. Hence Merge couldcombine the and man, yielding the man, which in turn can be merged with killed,but not the and John (no compatible features at all), or killed and the (the firstunchecked feature of the is = n, which is incompatible with the = d on killed).The feature combinatorics of the Move operation are essentially the same, withthe only difference being that Move applies to features prefixed with + and −(licensor and licensee, respectively). Note that Merge introduces new materialinto the derivation, whereas Move merely displaces old material — intuitively,the subtree headed by an LI l with some licensee feature −f as its first uncheckedfeature is moved into the specifier of the closest LI l′ that c-commands l and has+f as its first unchecked feature.

An utterance is well-formed if it can be assigned a derivation in which allfeatures were checked except for the category feature of the last LI to be merged,which must be a so-called final category (usually c). The MG above generatesthe following eight sentences, and only those (assuming that c is the only finalcategory):

(1) a. John/The man killed John/the man.

b. John/The man, John/the man killed.

A derivation tree for one of the sentences with topicalization is given in Fig. 1.

Despite its simplicity, the feature calculus controlling Move allows for a daz-zling array of movement configurations, in particular remnant movement, inwhich some XP is extracted from some YP via Move before YP itself is movedto a higher position. Remnant movement allows for an elegant reanalysis ofcases where apparently non-phrasal constituents end up in positions reserved forphrases, such as in the German example below.

(2) [CP Gekusstikissed

hatjhas

[TP derthe

HansJohn

diethe

Maria tj ti.]]Mary.

’John kissed Mary.’

Page 5: Locality and the Complexity of Minimalist Derivation Tree Languages

5

move

\ merge

move

\ merge

merge

John :: d − nom merge

merge

man :: n the :: = n d − top

killed :: = d = d v

ε :: = v + nom t

ε :: = t + top c

CP

DPi

D′

D

the

NP

N′

N

man

C′

C TP

DPj

D′

D

John

T′

T VP

tj V′

V

killed

ti

Fig. 1. Left: derivation tree of The man, John killed, depicted in the final formatadopted in this paper and with slices indicated by color; Right: Corresponding X′-tree

Instead of having just the V-head gekusst move into SpecCP (which is at oddswith standard assumptions about phrase structure), one can fall back to a rem-nant movement analysis: the object moves out of the VP, followed by movementof the remaining VP into SpecCP. At this point the VP’s only phonetic expo-nent is its V-head, so that the end result is indistinguishable from the scenariowhere only the V-head had undergone movement. For our purposes, remnantmovement is of interest because of the crucial role it plays in the proof thatevery instance of Move that spans arbitrary distances can be decomposed intoa sequence of local Move steps (Thm. 1).

2.2 Minimalist Grammars, Derivation Trees, and Slices

As the focus of this paper is on the derivation trees of MGs rather than the phrasestructure trees derived via Merge and Move, the details of both operations are ofinterest to us only in so far as they have ramifications for the shape of derivationtrees or the string yield (which will be important for Thm 1). Nonetheless I givea full definition of the formalism here, staying close to the chain-based expositionof [19]. After that I formally define MDTLs and introduce the notion of slices.

Definition 2. A Minimalist grammar is a 6-tuple G := 〈Σ,Feat ,F ,Types,Lex ,Op〉, where

– Σ 6= ∅ is the alphabet,– Feat is the union of a non-empty set Base of basic features (also called cate-

gory features) and its prefixed variants = f | f ∈ Base, +f | f ∈ Base,−f | f ∈ Base of selector, licensor, and licensee features, respectively,

– F ⊆ Base is a set of final categories,

Page 6: Locality and the Complexity of Minimalist Derivation Tree Languages

6

– Types := ::, : distinguishes lexical from derived expressions,– the lexicon Lex is a finite subset of Σ∗ × :: × Feat∗,– and Op is the set of generating functions to be defined below.

A chain is a triple in Σ∗ × Types × Feat∗, and C denotes the set of all chains(whence Lex ⊂ C). Non-empty sequences of chains will be referred to as expres-sions, the set of which is called E.

The set Op of generating functions consists of the operations merge andmove, which are the respective unions of the following functions, with s, t ∈ Σ∗,· ∈ Types, f ∈ Base, γ ∈ Feat∗, δ ∈ Feat+, and chains α1, . . . , αk, ι1, . . . , ιk,0 ≤ k, l:

s :: = fγ t · f, ι1, . . . , ιk merge1st : γ, ι1, . . . , ιk

s : = fγ, α1, . . . , αk t · f, ι1, . . . , ιl merge2ts : γ, α1, . . . , αk, ι1, . . . , ιl

s · = fγ, α1, . . . , αk t · fδ, ι1, . . . , ιl merge3s : γ, α1, . . . , αk, t : δ, ι1, . . . , ιl

s : +fγ, α1, . . . , αi−1,t : −f, αi+1, . . . , αkmove1

ts : γ, α1, . . . , αi−1, αi+1, αk

s : +fγ, α1, . . . , αi−1,t : −fδ, αi+1, . . . , αkmove2

s : γ, α1, . . . , αi−1, t : δ, αi+1, . . . , αk

Furthermore, all chains must satisfy the Shortest Move Constraint (SMC), ac-cording to which no two chains in the domain of move display the same li-censee feature −f as their first feature. The string language generated by G isL(G) := σ | 〈σ · c〉 ∈ closure(Lex ,Op), · ∈ Types, c ∈ F.

MGs and MCFGs [15] have the same weak generative capacity [6, 12]. In fact,for every MG there exists a strongly equivalent MCFG, but not the other wayround. In a certain sense, then, one may view MGs as a narrowly restrictedway of specifying MCFGs. A more peculiar fact about MGs is that in orderfor l ∈ Lex to occur in a well-formed derivation, its feature string must be in+f,= f | f ∈ Base∗ ×Base×−f | f ∈ Base∗. I will implicitly invoke thisfact several times throughout the paper.

I now turn to the derivation trees of an MG, defining them in two steps.

Definition 3. Given an MG G := 〈Σ,Feat , F,Types,Lex ,Op〉, the largest sub-set of TE satisfying the following conditions is called the string-annotated deriva-tion tree language sder(G) of G:

– For every leaf node n, `(n) = 〈l〉, l ∈ Lex.– For every subtree m(d1, . . . , dn), n ≥ 1, op(d1, . . . , dn) is defined for exactly

one op ∈ merge,move and `(n) = op(d1, . . . , dn).– For n the root node, `(n) = 〈σ : c〉, where σ ∈ Σ∗ and c ∈ F .

Page 7: Locality and the Complexity of Minimalist Derivation Tree Languages

7

Definition 4. Given an MG G, its Minimalist derivation tree language mder(G)is the set of trees obtained from sder(G) by the map µ:

– µ(〈l〉) = l, where l ∈ Lex– µ(e(e1, . . . , en)) = op(µ(e1), . . . , µ(en)), where e, e1, . . . , en ∈ E, n ≥ 1, and

op is the unique operation in Op such that op(e1, . . . , en) = e

As was first observed in [8], every MDTL is regular. The basic idea is to equip abottom-up tree automaton with states corresponding to the feature componentsof the string-annotated derivation trees (there are only finitely many thanks tothe SMC) and have its transition rules recast the conditions imposed on Mergeand Move by the feature calculus. Interestingly, an MG’s set of derived trees —which is not regular — can be obtained from its MDTL in an efficient way usinga multi bottom-up tree transducer. This in turn means that MDTLs are the keyto capturing MGs by finite-state means.

In [4], the concept of slices is introduced. Intuitively, slice(l) is the derivationtree equivalent of the phrase projected by l in the derived tree (using the standardlinguistic notion of projection). That is to say, slices mark the subpart of thederivation that l has control over by virtue of its selector and licensor features(cf. Fig. 1 on page 5). From this perspective a derivation tree language is simplythe result of combining slices in all possible ways such that none of conditionsimposed on Merge and Move by the feature calculus are violated.

Definition 5. Given a Minimalist derivation tree T := 〈D, `〉 and LI l occurringin T , the slice of l is the pair slice(l) := 〈S, `〉, where S ⊆ D, l ∈ S, and if noden ∈ D immediately dominates a node s ∈ S, then n ∈ S iff the operation denotedby `(n) erased a selector or licensor feature on l. The unique n ∈ S that is notdominated by any n′ ∈ S is called the slice root of l.

The following properties hold of slices for every MG G [cf. 4]:

– Let |γ| denote the length of the longest γ such that there is some l ∈ LexGwith feature string γcδ, γ, δ ∈ Feat∗, c ∈ Base. Then for every t ∈ mder(G)and slice s := 〈S, `〉 of t, 1 ≤ |S| ≤ |γ|+ 1.

– Every tree t ∈ mder(G) is partitioned into slices.– All slices are continuous (i.e. if node y is the child of x and the mother of z

such that x and z belong to the same slice s, then y also belongs to s).

As the order of siblings is irrelevant in derivation trees, I stipulate for thesake of simplicity that all slices are strictly right-branching. Moreover, I assumethat derivation trees are strictly binary branching, as this simplifies the math atvarious points in this paper — a small change which is easily accommodated bymapping move(t) to move(\, t), where \ is a new symbol not in Lex ∪Op.

Now given an LI l, node n is the kth node of slice(l) iff the shortest pathfrom n to l is the sequence 〈n1, . . . , nk, l〉 such that n = n1 and no ni is a leftdaughter, 1 ≤ i ≤ k. Furthermore, n is associated to feature f iff n is the ith

node of slice(l) and f is the ith feature of l. Two features g and h are said tomatch iff there is a feature f ∈ Base such that either g = +f and h = −f org = = f and h = f . Finally, a node n matches an LI l iff for some feature g of l,n is associated to a feature matching g.

Page 8: Locality and the Complexity of Minimalist Derivation Tree Languages

8

2.3 Locality and Subclasses of Derivation Tree Languages

If one builds on the notion of slices, a natural way of imposing locality condi-tions on movement suggests itself. Take any MG G, t ∈ mder(G), and sup-pose that l is an LI occurring in t with licensee features −f1, . . . ,−fn. Letmove[l] := 〈move1, . . . ,moven〉 be the sequence of Move nodes in t such thatthe operation denoted by movei checked feature −fi on l, 1 ≤ i ≤ n. An MG Gis k-local or of locality rank k, k ≥ 1, iff it holds for every t ∈ mder(G) and LI loccurring in t with move[l] := 〈move1, . . . ,moven〉 that no more than k−1 slicesintervene between mi and mi+1 in 〈m0,m1, . . . ,mn〉 := 〈l〉 ·move[l], 0 ≤ i < n.Equivalently, the shortest path from movei to movei+1 may contain at mostk left branches. If G is k-local, we also say that movement in G is k-local ork-bounded. An MG G is movement-free iff no tree in mder(G) contains a nodelabeled move. It is unrestricted iff it is neither movement-free nor k-local for anyk ≥ 1. This terminology extends to MDTLs in the natural way. By mdtl[merge]and mdtl[merge,move(k)] I denote, respectively, the class of all movement-freeMDTLs and the class of all k-local MDTLs. The class of all MDTLs is simplydenoted by mdtl[merge,move].

The notion of k-boundedness is motivated by the linguistic assumption thatmovement is successive-cyclic. In (3), for instance, the wh-phrase may not movefrom its base position to the beginning of the utterance in one fell swoop butrather has to land in every CP.

(3) Which professori did John say [CP ti that Bill told him [CP ti that Maryhas a crush on ti ]]

This assumption can be used to explain various syntactic, semantic and evenmorphological phenomena, for instance embedded inversion, stranding of quan-tifiers, binding ambiguities and complementizer agreement [cf. 3]. The followingcontrast provides a very simple example.

(4) a. [CP2Wheni did John ti tell you [CP1

whoj Mary will meet tj?] ]

b. * [CP2Wheni did John tell you [CP1

whoj Mary will ti meet tj?] ]

In (4a), when originates in the matrix clause and moves directly to the corre-sponding SpecCP position, while who does the same in the embedded clause.In (4b), when is supposed to originate in the embedded clause together withwho, but now they cannot both move into CP-specifiers. In one case, who movesfirst, so that it fills SpecCP1. As a consequence, when cannot move to SpecCP2

due to successive cyclicity requiring it to go through the SpecCP1 first, whichis already filled by who. In the other case, where moves first but on its way toSpecCP2 it leaves a trace in SpecCP1 so that the latter is no longer availableas a landing site to who. The details of the analysis have changed significantlyover the years, and recently it has even been argued that CPs do not matter forcyclicity [3], but the basic idea that long-distance movement is in fact a sequenceof local movement steps has remained unaltered.

If one ignores adjuncts, the kind of cyclicity involved in the examples abovecan be reinterpreted as wh-movement being 3-local: every clause consists of the

Page 9: Locality and the Complexity of Minimalist Derivation Tree Languages

9

slices CP, TP, VP, so movement from one CP to the next creates paths con-taining 3 left branches. The analogy between successive-cyclic movement andthe restriction to k-locality is far from perfect, of course. Among other things,successive-cyclic movement is relativized to specific positions, whereas k-localityonly cares about distance. But if the value of k is carefully chosen and combinedwith certain regular constraints on derivation trees [4, 7] that prevent movementfrom skipping, say, CP slices, a reasonably close approximation can be achieved.

Surprisingly, every MG can be translated into a weakly equivalent 1-localone. The idea is that instead of having subtree s move directly to its targetposition t, s can hitch a ride by being selected by another LI l. As long as thestring component of l is empty, this will have no effect on the string yield. Thisreduces unbounded movement to k-local movement, which in turn can easily bereduced to 1-local movement.

Theorem 1. For every unrestricted MG G, there is an MG G′ of locality rank1 that defines the same string language.

Proof. I sketch two linear (bottom-up) transducers τ and τ ′. The former doesmost of the work by translating every t ∈ mder(G) into its correspondingmax(E)-local t′, where max(E) is the maximum length of expressions gener-ated by G (i.e. the maximum number of chains per expression). The latter thenrewrites t′ as the 1-local t′′. Since MDTLs are regular, and the image of a regularlanguage under a linear transduction is itself regular, the output of τ · τ ′ is, too.This set is then intersected with mder(G′′), where LexG′′ := Ωτ ′ \ OpG′′ andFG′′ := FG. The result of this intersection can then be automatically convertedinto a new MG, our G′ [4, 7].

The idea underlying τ is that unbounded movement can be localized bycreating intermediary landing sites which have no effect on the string language.The main task of the transducer is to insert those intermediate landing sitesand transfer (a subset of) the features of each moving element to its landingsite. By definition there are at most n = max(E) − 1 moving elements at anygiven point in the derivation. Each state of τ consists of 1) n components qi thatmemorize the feature string of the moving items and which of them have alreadybeen reinstantiated, 2) the expression the current node evaluates too (modulothe string component), 3) a boolean flag b that indicates whether some LI hadthe new licensee feature −s0 added to its feature string.

Suppose we are at a node in the derivation tree t that belongs to the sliceof LI l := σ :: γcδ and that there are 0 ≤ m ≤ n moving LIs li := σ ::γiciδi, where 1 ≤ i ≤ m and c, ci ∈ Base and σ, σi ∈ Σ and δi is of theform −fi1 , . . . ,−fik , k ≥ 1. For each i and 1 ≤ u ≤ v ≤ k, we define new LIslci [u, v] := ε :: = c +fiu c −si −fiu δi[u+1, v], where −si is a new licensee featureand δi[u + 1, v] := −fiu+1

, . . . ,−fiv . Given a feature string φ := f1, . . . , fk, wefurthermore let q(j, φ) := f1, . . . , fj • fj+1 . . . fk, 0 ≤ j ≤ k.

The transducer τ now has to perform the following steps: First, if l is notitself among the moving elements, τ non-deterministically replaces it by l :=ε :: γc − s0 and switches the boolean flag b in its state from 0 to 1. If either

Page 10: Locality and the Complexity of Minimalist Derivation Tree Languages

10

m = 0 and b = 1 or m ≥ 1 and b = 0, τ aborts at the slice root of l/l.

Second, τ replaces each li carrying more than one licensee feature by li :=σi :: γci − fi1 and stores q(0, δi[1, k]) in qi. Third, when τ reaches the slice

root of l/l, it inserts C(lcm[um, vm]) · . . . · C(lc1[u1, v1]), where C(lci [ui, vi]) :=move(\,merge(, lci [ui, vi])), and ui must be such that q(ui − 1, δi[1, k]) is thestring stored in qi. The value of vi is chosen non-deterministically — in orderfor τ not to abort it must hold that every fij can get checked later on withoutfurther intermediate movement sites for all j ≤ vi but not for j = vi + 1 (thisis easily verified, as τ keeps track of licensor features in the second componentof its states). With the insertion of C(lci [ui, vi]), qi is updated to q(vi, δi[1, k]).So far then, τ has introduced new slices slice(lci [ui, vi]) that function as the

respective landing sites for each li, or rather, their impoverished counterpart li.The fourth step requires τ to insert the context C(sc[m, b]) above C(lm[um, vm]),where sc[j, b] := ε :: = c + s1−b . . . + sj c for 1 ≤ j ≤ n, and C(sc[j, b]) :=move(\,)·︸ ︷︷ ︸j + b times

merge(, sc[j, b]).

This enforces remnant movement of l/l and all li, allowing each slice(lci [k]) tomove freely later on without carrying along any other parts of the derived tree(which would induce a change in the string yield). The procedure as outlinedabove is iterated (with the value of c varying with l) until no more features needto be checked off. Since there are at most n = max(E)−1 moving elements in G,

no LI li (including l/l) has to cross more than n slices in order to check its −fi1feature against lci [u, v] or its −si feature against sc[j, b]. Thus every instance ofmove is max(E)-bounded.

All instances of (k + 1)-bounded movement, 1 ≤ k ≤ max(E), can be made1-local as follows. Assume that slices slice(l1), . . . slice(lk) intervene between l :=σ :: γcδ and its next occurrence. Then τ ′ has to prefix δ with k new movementlicensee features −l1 . . . − lk, and for each lci [u, v] (1 ≤ i ≤ k) add +li to theend of its feature string and move(\,) above its slice root. If several LIs movethrough a slice, the number of licensor features and Move nodes has to be adaptedaccordingly. Crucially, both the number of moving elements and the distancebetween LIs and individual occurrences is finitely bounded, so this strategy caneasily be carried out by a non-deterministic linear transducer. ut

3 (Un)Definability in Some Subregular Language Classes

3.1 Deterministic Top-Down Automata

Since MDTLs are regular, they can be recognized by non-deterministic top-downtree automata [cf. 5]. As we will see now, top-down non-determinism can be dis-pensed with only if it is compensated for by unbounded look-ahead. I considertwo common variants of the standard deterministic top-down tree automaton(DTDA), both of which are more powerful than DTDAs but do not recognizeall regular languages. One is the sensing tree automaton (STA) [11], which mayalso take the labels of a node’s children into account in order to decide which

Page 11: Locality and the Complexity of Minimalist Derivation Tree Languages

11

states should be assigned to them, while the other is the l-r-deterministic DTDA(lrDTDA) [13], which allows for a limited kind of non-determinism. The classesof languages recognized by STAs and lrDTDAs are incomparable, but can easilybe characterized in descriptive terms. For this reason, I focus on the languagesthemselves rather than the automata, and no further technical details of the lat-ter will be discussed here (the interested reader is referred to [11] and referencestherein).

Definition 6. Given a node v of some Σ-tree t, lsibt(u) is the string consistingof the label of u’s left sister (if it exists) followed by the label of u, and rsibt(u) isthe string consisting of the label of u and the label of its right sister (if it exists).Let u1, . . . , un be the shortest path of nodes extending from the root to v suchthat u1 is the root and un = v. Let and ♣ be two new symbols not in Σ. Byspinet(v) we denote the string recursively defined by

spinet(u1) = lsibt(u1)rsibt(u1)

spinet(u1, . . . , un) = spinet(u1, . . . , un−1) ♣ lsibt(un) rsibt(un)

A regular tree language L is spine-closed iff it holds for all trees s, t ∈ L andnodes u and v belonging to s and t, respectively, that spines(u) = spinet(v)implies s[u← t|v] ∈ L.

Definition 7. A regular tree language L is homogeneous iff it holds that ift[u ← a(t1, t2)] ∈ L, t[u ← a(s1, t2)] ∈ L and t[u ← a(t1, s2)] ∈ L, then alsot[u← a(s1, s2)] ∈ L.

Proposition 1. A regular tree language L is recognizable by

– an STA iff L is spine-closed [10].– an lrDTDA iff L is homogeneous [13].

Thanks to these characterizations, results for MDTLs are easily obtained.

Theorem 2. mdtl[merge] and the class of tree languages recognized by STAsare incomparable.

Proof. Let grammar G be defined by the following LIs (with names in squarebrackets for reference) and FG := a, b:

[a0] a :: a [a1] a :: = a a [a2] a :: = a = a a[b0] b :: b [b1] b :: = b b [b2] b :: = b = b b

Consider the derivation tree ta := merge1(merge2(a0, a1),merge3(a0, a2)) andits counterpart tb := merge1(merge2(b0, b1),merge3(b0, b2)) — the indices arefor the reader’s convenience. Even though spineta(merge2) = spinetb(merge2), itholds that merge1(merge2(b0, b1),merge3(a0, a2)) /∈ mder(G), so mder(G) is notspine-closed. ut

Theorem 3. Every L ∈ mdtl[merge,move] is recognized by some lrDTDA.

Page 12: Locality and the Complexity of Minimalist Derivation Tree Languages

12

Proof. I show that every MDTL is homogeneous. For a = move, closure is triv-ially satisfied. So let a = merge. Merge depends only on the distribution ofcategory and selector features, and there is no way to distribute these over t1,t2, s1 and s2 such that the fourth tree would be an illicit instance of Merge: Inorder for Merge to be licensed, one of t1 or t2 must have some category featurec as its first unchecked feature, and the other one the matching selector feature= c. Assume w.l.o.g. that t2 carries the selector feature = c. Then s1 must alsohave feature c, and s2 feature = c. We also know that s1 and t1 on the one handand s2 and t2 on the other agree on all features following these selector/licensorfeatures, since the derivations differ only w.r.t. the subtree rooted by a. It followsthat Merger of s1 and s2 is licit, so the required closure property obtains. ut

Martens et al. [11] point out a peculiar property of languages recognized bylrTDAs but not by STAs: in order to determine which states should be assignedto the children of the root, one has to look arbitrarily deep into at least oneof the subtrees dominated by the root. This is indeed typical of unrestrictedMDTLs, where movement features at the very bottom of a derivation introducedependencies that — given the impoverished nature of the interior node labels —cannot be predicted deterministically in a top-down fashion without unboundedlook-ahead. The class of lrTDAs overshoots the mark, though, as it fails to drawa distinction even between mdtl[merge,move] and mdtl[merge].

3.2 Strictly Local and Locally Threshold Testable Languages

Let us now traverse the subregular hierarchy from the bottom instead, startingwith LOC and subsequently moving on to SL and LTT. As lrTDAs before, localsets lack the granularity to distinguish any of the subclasses of MDTLs. Butwhere the lrTDAs universally succeeded, local sets universally fail.

Theorem 4. mdtl[merge] and the class of local sets are incomparable.

Proof. Consider any movement-free MDTL L with a derivation containing a sub-tree of the form t := merge(l,merge). As we require slices to be right-branching,l contains no selector or licensor features. Furthermore, the finiteness of the lexi-con establishes an upper bound |γ|+1 on the size of slices. However, LOC = SL2,so if L ∈ LOC, t could be composed with itself arbitrarily often, yielding slicesof unbounded size. ut

The shortcomings of LOC can be circumvented, though, by extending the sizeof the locality domain, i.e. by moving to SLk for some sufficiently large k > 2.Let |δ| be the maximum number of licensee features that may occur on a singleLI, analogously to |γ|. Given a k-local MG, set κ := (|γ|+ 1) ∗ (|δ| ∗ k + 1) + 1.

Theorem 5. Every L ∈ mdtl[merge,move(k)] is strictly κ-local.

Before we may proceed to the actual proof, the notion of occurrences mustbe introduced. Intuitively, the occurrences of an LI l are merely the Move nodesin the derivation tree that operated on one of l’s licensee features. It does not

Page 13: Locality and the Complexity of Minimalist Derivation Tree Languages

13

take much insight to realize that the first occurrence of l has to be some Movenode that dominates it (otherwise l’s licensee features could not be operatedon) and is not included in slice(l) (no LI may license its own movement). Onecan even require the first occurrence to be the very first Move node satisfyingthese properties, thanks to the SMC (the reader might want to reflect on thisfor a moment). The reasoning is similar for all other occurrences, with the soleexception that closeness is now relativized to the previous occurrence. In moreformal terms: Given an LI l := σ :: γcδ with c ∈ Base and δ := −f1, . . . ,−fn,its occurrences occi, 1 ≤ i ≤ n, are such that

– occ1 is the first node labeled move that matches −f1 and properly dominatesthe slice root of l.

– occi is the first node labeled move that matches −fi and properly dominatesocci−1.

Note that every well-formed MDTL obeys the following two conditions:

– M1 : For every LI l with 1 ≤ n ≤ |δ| licensee features, there exist nodesm1, . . . ,mn labeled move such that mi is the ith occurrence of l, 1 ≤ i ≤ n.

– M2 : For every node m labeled move, there is exactly one LI l such that mis an occurrence of l.

In fact, the implication holds in both directions.

Lemma 1. For every MG G it holds that if t ∈ TLexG∪merge,move is a com-bination of well-formed slices and respects all constraints on the distribution ofMerge nodes, then it is well-formed iff M1 and M2 are satisfied.

Proof. As just discussed the left-to-right direction poses little challenge. In theother direction, I show that µ−1 is well-defined on t and maps it to a well-formeds ∈ sder(G). For LIs and Merge nodes, µ−1 is well-defined by assumption if itis well-defined for Move nodes. From the definition of move and the SMC itfollows that µ−1(move(\, t2)) (the expression returned by move when applied toµ−1(t2)) is well-defined only if the root of t2 is an expression consisting of at leasttwo chains such that 1) its first chain has some feature +f as its first featureand 2) the feature component of exactly one chain begins with −f . However,the former follows from the well-formedness of slices, while the latter is enforcedby M2; in particular, if the SMC were violated, some Move node would be anoccurrence for more than one LI. This establishes that µ−1 is well-defined forall nodes. Now µ−1(t) can be ungrammatical only if the label of the root nodecontains some licensor or licensee features. The former is ruled out by M2 and theinitial assumption that all slices are well-formed, whence every licensor featureis instantiated by a Move node. In the latter case, there must be some licenseefeature without an occurrence in t, which is blocked by M1. ut

Now we can finally move on to the proof of Thm. 5.

Proof. Given some k-local MG G with L := mder(G) ∈ mdtl[merge,move(k)],let κ-factors(L) be the set containing all κ-factors of L, and F the corresponding

Page 14: Locality and the Complexity of Minimalist Derivation Tree Languages

14

strictly κ-local language built from these κ-factors. It is obvious that F ⊇ L, soone only needs to show that F ⊆ L. Trivially, t ∈ L iff t ∈ F for all trees t ofdepth d ≤ κ. For this reason, only trees of size greater than κ will be considered.

Assume towards a contradiction that F 6⊆ L, i.e. there is a t such thatF 3 t /∈ L. Clearly F 3 t /∈ L iff some condition enforced by merge or move onthe combination of slices is violated, as the general restrictions on tree geometry(distribution of labels, length and directionality of slices) are always satisfied byvirtue of κ always exceeding |γ| + 1. I now consider all possible cases. In eachcase, I use the fact that the constraints imposed by merge and move operateover a domain of bounded size less than κ, so that if t ∈ F violated one of them,one of its κ-factors would have to exhibit this violation, which is impossible asκ-factors(F ) = κ-factors(L).

Case 1 [Merge]: merge is illicit only if there is an internal node n labeledmerge in slice(l) such that the shortest path from n to l is of length 1 ≤ i, theith feature of l is +f for some f ∈ Base, and there is no LI l′ such that the leftdaughter of n belongs to slice(l′) and l′ carries feature f . But the size of everyslice is at most |γ|+ 1, so the distance between n and l is at most |γ|, and thatbetween n and l′ at most |γ|+1. Hence a factor of size |γ|+2 is sufficient, whichis less than κ. So if we found such a configuration, it would be part of someκi ∈ κ-factors(F ) = κ-factors(L). Contradiction.

Case 2 [Move]: Conditions M1 and M2 can be split into three subcases.Case 2.1 [Too few occurrences]: Assume that LI l has j ≤ |δ| licensee

features but only i < j occurrences. Since L ∈ mdtl[merge,move(k)], the short-est path spanning from any LI to its last occurrence includes nodes from at most|δ| ∗ k+ 1 distinct slices. Since the size of no slice minus its LI exceeds |γ|, somefactor κi of size greater than (|γ| ∗ |δ| ∗ k) + (|γ|+ 1) ≤ κ must exhibit the illicitconfiguration, yet κi /∈ κ-factors(L).

Case 2.2 [Too many Move nodes]: Assume that for some Move node mthere is no LI l such that m is an occurrence of l. This is simply the reverse ofCase 2.1, where we obtain a violation if it holds for no LI l in any κi that m is oneof its occurrences. But then at least one of these κi cannot be in κ-factors(L).

Case 2.3 [SMC violation]: The SMC is violated whenever there are twodistinct items l and l′ for which Move node m is an occurrence. As 2.2, this isjust a special case of 2.1. ut

We now have a very good approximation of mdtl[merge,move(k)] for anychoice of k > 0. They are not local, or equivalently, strictly 2-locally testable,but they are strictly κ-locally testable, where κ depends on k and the maxima oflicensor and licensee features, respectively. But what about mdtl[merge,move]in general?

To readers acquainted with MGs it will hardly be surprising that unrestrictedMDTLs are not strictly locally testable. Nor is it particularly difficult to demon-strate that they even fail to be locally threshold testable. In [1], it was proved thatclosure under k-guarded swaps is a necessary condition for a language to be de-finable in FOmod [S1, S2] — that is to say, first-order logic with unary predicatesfor all labels, binary predicates for the left child and right child relations, respec-

Page 15: Locality and the Complexity of Minimalist Derivation Tree Languages

15

tively, and the ability to perform modulo counting. Evidently FOmod [S1, S2] is aproper extension of FO[S1, S2], and definability in the latter fully characterizesthe locally threshold testable languages [20]. So no language that isn’t closedunder k-guarded swaps is locally threshold testable.

Definition 8. Let t := C ·∆1 ·∆·∆2 ·T be the composition of trees C := Ct[a, x),∆1 := Ct[x, y), ∆ := Ct[y, x

′), ∆2 := Ct[x′, y′) and T := t|y′ . The vertical swap

of t between [x, y) and [x′, y′) is the tree t′ := C ·∆2 ·∆ ·∆1 · T . If the subtreesrooted at x and x′ are identical up to and including depth k, and the same holdsfor the subtrees rooted at y and y′, then the vertical swap is k-guarded.

Theorem 6. mdtl[merge,move] and the class of tree languages definable inFOmod [S1, S2] are incomparable.

Proof. Consider a grammar containing (at least) the following four items:

a :: a a :: a − b a :: = a a a :: = a + b a

I restrict my attention to those derivation trees in which movement occurs ex-actly once. Pick any k ∈ N. Then there is some derivation tree that can befactored as above such that ∆1 contains the movement node at some depthm > k, ∆2 contains the corresponding LI a :: a − b at some depth n > k,C = ∆ = T , and the depth of ∆ and T exceeds k. Given this configuration,the vertical swap of ∆1 and ∆2 is k-guarded, yet t′ := C · ∆2 · ∆ · ∆1 · T isnot a Minimalist derivation tree, as the movement node no longer dominatesa :: a − b, thereby negating closure under k-guarded swaps. ut

The insufficiency of FOmod [S1, S2] puts a strong lower bound on the complex-ity of mdtl[merge,move]. In the next section, I show that enriching FO[S1, S2]with proper dominance and equivalence is all it takes to make mdtl[merge,move]first-order definable.

4 Definability in First-Order Logic

I start with an FO[S1, S2] theory of mdtl[merge], which is then extended toFO[S1, S2, <,≈] for mdtl[merge,move]. Given an MG G, FO[S1, S2] is definedover ordered binary branching trees in the standard way, with the signaturecontaining a unary predicate p for each p ∈ Λ := LexG ∪OpG ∪ \ and binarypredicates S1 and S2 for the left and right child relation, respectively. The equiv-alence relation is superfluous for mdtl[merge]. I write x /1 y instead of S1(x, y),and similarly for S2. Moreover, x / y iff x /1 y ∨ x /2 y.

First a number of constraints are established to ensure that every node hasexactly one label drawn from Λ, and that the arity of the labels is respected(it suffices only to restrict nullary symbols to leaves, as this entails that binarysymbols can be assigned only to interior nodes). Furthermore, \ may be assignedto a node if and only if it is the left daughter of a Move node.

∀x

[( ∨u∈Λ

u(x))∧∧u∈Λ

(u(x)→

∧v∈Λ\u

¬v(x))]

Page 16: Locality and the Complexity of Minimalist Derivation Tree Languages

16

∀x[ ∨u∈Λ\merge,move

u(x)↔ ¬∃y[x / y]]

∀x∀y[\(y)↔ move(x) ∧ x /1 y

]As was pointed out in Sec. 2, MDTLs can be viewed as the result of combining

the slices defined by LIs in all possible ways such that the constraints of thefeature calculus are respected. Hence I first define the shape of slices beforemoving on to the feature conditions enforced by Merge. To simplify this task, Iuse n φ(x) as a shorthand for “φ holds at the node reached from x by takingn steps down the right branch”. The analogous n φ(x) moves us down theleft branch instead, while n φ(x) moves us upwards only along a right branch.Intuitively, , and can be viewed as first-order implementations of modaldiamond operators.

0 φ(x)↔ φ(x)

n φ(x)↔ ∃y[x /2 y∧ n−1 φ(y)]

Recall that all slices are strictly right-branching and never exceed size |γ|+1.This is equivalent to saying that there is no node that is at least |γ|+ 1 S2-stepsaway from a node satisfying a tautology >.

¬∃x[|γ|+1 >(x)]

Next, every interior node n must be licensed by a feature of the LI of the slicecontaining n. Again a special notational device proves useful: for any feature f ,fi(x) holds iff for some l ∈ LexG whose ith feature is f , l(x) is true (the indexwill be suppressed whenever the position of the feature is irrelevant). Now letslr i(x)↔

∨f∈Base = fi(x) and lcr i(x)↔

∨f∈Base +fi(x).

∀x[(

merge(x)→∨

1≤i≤|γ|

i slr i(x))∧(

move(x)→∨

1≤i≤|γ|

i lcr i(x))]

Besides the evident restriction on the distribution of merge and move, the for-mula above also ensures that no l ∈ Λ without selector or licensor features canever be a right leaf.

We still have to establish a minimum size on slices, though, which is easilyaccomplished by requiring every selector/licensor feature to license a uniqueinterior node.

∀x[ ∧1≤i≤|γ|

((slr i(x)→i merge(x)

)∧(lcr i(x)→i move(x)

))]Note that this also prevents every LI with selector or licensor features fromoccurring on a left branch. The topmost slice in the derivation is also subject tothe condition that the category of its LI must be final.

∀x[¬∃y[y / x]→

∨c∈F

0≤i≤|γ|

i c(x)]

Page 17: Locality and the Complexity of Minimalist Derivation Tree Languages

17

So far, then, our first-order theory enforces the correct minimum/maximumsize of slices for every l ∈ LexG and fixes their branching direction and nodelabels. For mdtl[merge], it only remains to capture the feature dependenciesimposed by Merge: the category feature of the LI of the slice on the left branchhas to match the selector feature of the LI found along the right branch.

∀x[

merge(x)→∧

c∈Base

(1

∨0≤i≤|γ|

i c(x)↔∨

1≤j≤|γ|

j = cj(x))

)]

Extending this basis to unrestricted MDTLs is surprisingly easy using thenotion of occurrences we encountered earlier on. First, proper dominance andequivalence are added to the signature of FO[S1, S2], yielding FO[S1, S2, <,≈].As before, I use infix notation for all binary relations, so instead of < (x, y) Iwrite x /+ y. For every i ≤ |δ|, matchi(x, y) denotes that x is associated to afeature that matches the ith licensee feature of y.

matchi(x, y)↔∨f∈Base

( ∧c∈Base

1≤j≤|γ|+1

(cj(y)→ −fj+i(y)

)∧move(x) ∧

∨1≤g≤|γ|

g +fg(x))

Furthermore, the predicate x J y ↔ ∃z,∃z′[(x/+z∨x ≈ z)∧z/1z′∧(z/+y∨z ≈

y)]

holds of x and y iff x properly dominates y and they belong to different slices.Building on these two notions, it is a straightforward task to recast the definitionof occurrences in first-order terms.

occ1(x, l)↔ match1 (x, l) ∧ x J l ∧ ¬∃y[x /+ y ∧match1 (y, l) ∧ y J l

]

occi(x, l)↔ x /+ l ∧matchi(x, l) ∧ ∃y[x /+ y ∧ occi−1(y, l)∧

¬∃z[x /+ z ∧ z /+ y ∧matchi(z, l)

]]In line with Lem. 1, constraining the distribution of move requires but three

formulas that demand, respectively, that every licensee feature has a matchingmove node, that every mode node has a matching licensee feature, and that nomovement node can be matched against more than one licensee feature (SMC).It is only this very last condition that depends on the equivalence predicate asthere is no other first-order definable way of distinguishing nodes (the use ofequivalence in the definition of J is merely a matter of convenience and caneasily be avoided).

∀x[ ∧

c∈Base1≤i≤|γ|+1

(ci(x)→

∧f∈Base0≤j≤|δ|

(− fi+j(x)→ ∃y

[occj(y, x)

]))]

Page 18: Locality and the Complexity of Minimalist Derivation Tree Languages

18

∀x[move(x)→ ∃l

[ ∨1≤i≤|δ|

occi(x, l)]]

∀x∀l[ ∧1≤i≤|δ|

(occi(x, l)→ ∀l′

[ ∧j∈[|δ|]\0,i

¬occj(x, l′)∧(occi(x, l

′)→ l ≈ l′)])]

Conclusion

The results reported herein highlight the rather indirect relation between MDTLsand the string languages they derive. MGs without movement yield context-freestring languages, whereas even bounded movement is sufficient to generate allmultiple context-free languages. At the level of tree languages, however, bothmovement-free and k-local MGs are strictly locally testable, whereas unrestrictedmovement leads to an increase in complexity that pushes MDTLs out of therealm of local threshold testability (see Fig. 2 on the next page).1

As my results posit a split between k-local and unrestricted MGs on the levelof derivation trees, they seem to vindicate the assumption commonly made bysyntacticians that locality restrictions on movement are a fundamental prop-erty of natural language that keeps computational complexity in check. On theother hand, weak generative capacity remains unaffected, and the locality rankis immaterial, as all local grammars can be made 1-local. Further work is neededbefore a full understanding can be reached as to how derivational complexity mayinteract with string language complexity, what measure of complexity should beused, and how this relates to syntactic proposals.

It must also be pointed out that alternative representations of Minimalistderivation trees could conceivably paint a different picture. Eventually, one wouldlike to have a better understanding as to which aspects of a derivation tree lan-guage genuinely reflect the complexity of the derivational machinery underlyingthe MG formalism and which are just notational quirks. By probing differentformats for Minimalist derivation trees we might also unearth new connections

1 The strictly local nature of movement-free and k-local MDTLs also implies that theycan be recognized by deterministic tree-walking automata. I conjecture that thisdoes not carry over to unrestricted MDTLs unless the automata are enriched withtwo weak pebbles. In particular, non-deterministic tree-walking automata cannotrecognize unrestricted MDTLs: The fundamental problem one faces while siftingthrough a derivation tree with unrestricted movement in a sequential manner is thateither 1) the automaton has to keep track of an unbounded number of features whenperforming a brute-force search for an LI matching a given movement node, or 2)it gets lost in the derivation tree and cannot make its way back to the movementnode in question. This makes it impossible to ensure that every movement node is anoccurrence for exactly one LI, and non-determinism offers no remedy. The additionof two pebbles, on the other hand, allows the automaton to mark the movement nodeand the LI that was inspected last, so that the automaton can always find its wayback and can infer from the position of the second pebble which LIs have alreadybeen looked at.

Page 19: Locality and the Complexity of Minimalist Derivation Tree Languages

19

REG = Star-Free

lrDTDASTA

DTDA

FO[S1, S2, <]

LTT = FO[S1, S2]

FOmod [S1, S2]

SL

LOC

mdtl[merge,move]

mdtl[merge]

mdtl[merge,move(k)]

[14]

[20]

[11]

Thm.5

Sec. 4Thm. 3

Thm

.6

[11]

Thm. 4

Thm.2

a b a is properly included in ba b a and b are incomparable

Fig. 2. MDTLs in the subregular space of strictly binary branching tree languages(references omitted for obvious relations)

between MGs and Tree Adjoining Grammar, an area that has recently enjoyedincreased interest.

Acknowledgments My thanks go to Ed Stabler and the three anonymousreviewers for their helpful criticism. The research reported herein was supportedby a DOC-fellowship of the Austrian Academy of Sciences.

Bibliography

[1] Benedikt, M., Segoufin, L.: Regular tree languages definable in FO and inFOmod. ACM Transactions in Computational Logic 11, 1–32 (2009)

[2] Chomsky, N.: The Minimalist Program. MIT Press, Cambridge, Mass.(1995)

[3] den Dikken, M.: Arguments for successive-cyclic movement throughSpecCP. A critical review. Linguistic Variation Yearbook 9, 89–126 (2009)

[4] Graf, T.: Closure properties of minimalist derivation tree languages. In:Pogodalla, S., Prost, J.P. (eds.) LACL 2011. LNAI, vol. 6736, pp. 96–111(2011)

Page 20: Locality and the Complexity of Minimalist Derivation Tree Languages

20

[5] Gecseg, F., Steinby, M.: Tree Automata. Academei Kaido, Budapest (1984)[6] Harkema, H.: A characterization of minimalist languages. In: de Groote, P.,

Morrill, G., Retore, C. (eds.) Logical Aspects of Computational Linguistics(LACL’01), LNAI, vol. 2099, pp. 193–211. Springer, Berlin (2001)

[7] Kobele, G.M.: Minimalist tree languages are closed under intersection withrecognizable tree languages (2011), to appear in Proceedings of LACL2011

[8] Kobele, G.M., Retore, C., Salvati, S.: An automata-theoretic approach tominimalism. In: Rogers, J., Kepser, S. (eds.) Model Theoretic Syntax at 10.pp. 71–80 (2007)

[9] Mainguy, T.: A probabilistic top-down parser for Minimalist grammars(2010), arXiv:1010.1826v1

[10] Martens, W.: Static Analysis of XML Transformation- and Schema Lan-guages. Ph.D. thesis, Hasselt University (2006)

[11] Martens, W., Neven, F., Schwentick, T.: Deterministic top-down tree au-tomata: Past, present, and future. In: Proceedings of Logic and Automata2008. pp. 505–530 (2008)

[12] Michaelis, J.: Transforming linear context-free rewriting systems into mini-malist grammars. LNAI 2099, 228–244 (2001)

[13] Nivat, M., Podelski, A.: Minimal ascending and descending tree automata.SIAM Journal on Computing 26, 39–58 (1997)

[14] Potthoff, A., Thomas, W.: Regular tree languages without unary symbolsare star-free. In: Proceedings of the 9th International Symposium on Fun-damentals of Computation Theory. pp. 396–405 (1993)

[15] Seki, H., Matsumura, T., Fujii, M., Kasami, T.: On multiple context-freegrammars. Theoretical Computer Science 88, 191–229 (1991)

[16] Stabler, E.P.: Derivational minimalism. In: Retore, C. (ed.) Logical Aspectsof Computational Linguistics, LNCS, vol. 1328, pp. 68–95. Springer, Berlin(1997)

[17] Stabler, E.P.: Computational perspectives on minimalism. In: Boeckx, C.(ed.) Oxford Handbook of Linguistic Minimalism, pp. 617–643. Oxford Uni-versity Press, Oxford (2011)

[18] Stabler, E.P.: Top-down recognizers for MCFGs and MGs. In: Proceedingsof the 2011 Workshop on Cognitive Modeling and Computational Linguis-tics (2011), to appear

[19] Stabler, E.P., Keenan, E.: Structural similarity. Theoretical Computer Sci-ence 293, 345–363 (2003)

[20] Thomas, W.: Languages, automata and logic. In: Rozenberg, G., Salomaa,A. (eds.) Handbook of Formal Languages, vol. 3, pp. 389–455. Springer,New York (1997)

[21] Verdu-Mas, J.L., Carrasco, R.C., Calera-Rubio, J.: Parsing with probabilis-tic strictly locally testable tree languages. IEEE Transactions on PatternAnalysis and Machine Intelligence 27, 1040–1050 (2005)