-
8/13/2019 A Proposed Approach to handling unbounded dependencies in automatic parsers
1/149
University of Alexandria
Faculty of Arts
English Language Department
A Proposed Approach to Handling Unbounded
Dependencies in Automatic Parsers
A THESIS SUBMITTED TO THE ENGLISH LANGUAGE
DEPARTMENT, FACULTY OF ARTS, THE UNIVERSITY OF
ALEXANDRIA IN FULFILLMENT OF THE REQUIREMENTS FOR
THE DEGREE OF MASTER OF ARTS IN COMPUTATIONAL
LINGUISTICS
By
Ramy Muhammad Magdi Ragab Abdel Azim
Supervised by:
Dr. Sameh Al-Ansary
Associate Professor of Computational
Linguistics
Department of Phonetics and Linguistics
Faculty of Arts
Alexandria University
Dr. Heba Labib
Assistant Professor of Linguistics
Department of English Language and
Literature
Faculty of Arts
Alexandria University
to the memory of
Professor Hassan Atiyya Taman
(2010)
Contents
Abstract
Acknowledgements
Symbols and Abbreviations
List of Figures
List of Tables
1. INTRODUCTION
   1.1. Motivation
   1.2. The Problem
   1.3. Aims and Contributions
   1.4. Thesis Structure
   1.5. UDs Defined
   1.6. The Class of UDs
      1.6.1. Strong UDs
      1.6.2. Weak UDs
   1.7. Nomenclature
2. UDS AND SYNTACTIC FORMALISMS
   2.1. Derivational Approaches
   2.2. Generalized Phrase Structure Grammar (GPSG)
   2.3. Head-driven Phrase Structure Grammar (HPSG)
   2.4. Categorial Grammar (CG)
   2.5. Lexical Functional Grammar (LFG)
   2.6. Towards an Ontology of Gaps
      2.6.1. Gaps between Objects and Subjects
      2.6.2. The Distribution of Gaps
      2.6.3. The Ontology
3. Parsing and Formal Languages
   3.1. The Concept of a Formal Language
   3.2. Defining a Generative Grammar
   3.3. Formal Grammars and their Relation to Formal Languages
   3.4. The Chomsky Hierarchy
   3.5. Automata
   3.6. Parsing Theories and Strategies
   3.7. The Universal Parsing Problem
   3.8. Major Parsing Directions
   3.9. Top-down Parsing
   3.10. Bottom-up Parsing
   3.11. The Cocke-Kasami-Younger Algorithm
   3.12. The Earley Algorithm
   3.13. Statistical or Grammarless Parsing
   3.14. Text vs. Grammar Parsing: the Nivre Model
   3.15. Text Parsing and the Problem of UDs
4. UDs Parsing Complexity
   4.1. The Rimell-Clark-Steedman (RCS) Test
   4.2. The Parsers Set
"In the beginning was the word. But by the time the second word was
added to it, there was trouble. For with it came syntax, the thing that
tripped up so many people."
John Simon, Paradigms Lost
This is a fertile area of research, in which definitive answers have not
yet been found.
Sag & Wasow, Syntactic Theory: A Formal Introduction
Abstract
Unbounded dependencies (UDs) represent a set of syntactic constructions in the
English language that pose a number of challenges for syntactic and computational
analysis. Unbounded dependencies cover such constructions as wh-questions,
relative clauses, topicalized sentences, tough-movement clauses, it-clefts and many
more. Though each of these constructions may have received considerable attention
in the syntactic literature, such treatments largely lacked an awareness of the unity
of these constructions and the like-minded behavior that makes them form a
coherent whole.
This thesis explores the linguistic nature of UDs and how they have been handled
within the current flurry of syntactic theories. The thesis provides analyses of UDs
within the Principles & Parameters model (as representative of derivational
approaches to syntax), Generalized Phrase Structure Grammars, Head-driven Phrase
Structure Grammars, Lexical-functional Grammars, and Categorial Grammars (as
representatives of non-derivational approaches). The thesis then offers a newly
devised gaps-ontology that aims at gathering all the information and rules related to
the behavior of gaps in unbounded dependencies into one integral theoretical entity
that can be utilized in computational environments.
The thesis claims that the problem of UDs parsing is basically a computational
problem, not a syntactic one; i.e. the solution lies in the parsing strategy and
techniques used, not in the theoretical underpinnings of the different syntactic
analyses available. Accordingly, the thesis proposes two types of solutions to the parsing
problem of UDs: the first introduces modifications to the architectural design of the
universal parser, subscribing to the highly useful technique of modularity and thus
devising what the thesis calls a Small-scale Latent Parser. The other proposes
processing modifications represented by the techniques of gap-threading and
memoization. The thesis also claims that top-down parsing cannot be endorsed as a
possible strategy for parsing UDs, and thus favors bottom-up parsing strategies
instead.
Acknowledgments
My interests in computer science and the study of computational linguistics were
triggered 13 years ago when I began my work with Dr. Nabil Ali. Dr. Ali, an engineer
by training and the father of Arabic informatics and computational linguistics, brought
to my attention many important works and gave me the opportunity to see what a real
computational system looks like. The late Prof. Hasan Taman, the original supervisor
of this thesis, is the one who should be credited with its current organization. He
insisted, against my disposition to work on theoretical issues alone, on a
problem-solving method that finds a problem and proposes solutions, which explains
the title of the thesis itself (his exact phrasing). Prof. Taman's belief in me and in my
academic abilities was crucial in infusing me with the spirit that made me work on
this thesis and recover from so many bouts of despair. May his soul rest in peace.
Prof. Azza el-Khouly's and Prof. Sahar Hamouda's kindness and support made this
thesis see the light of day. Dr. Sameh al-Ansary's patience, unflinching support and
understanding also revitalized the hope of finishing this thesis. Without him I would
not have been able to finish the thesis in the first place, not to mention his comments
and suggestions that improved the outlook and organization of the thesis. My debt to
him will always be remembered.
Also, Prof. Olga Matar's kind approval to be one of the examiners brought me such
happiness because she was the first one I hoped could supervise my work, even before
Prof. Taman, but unfortunately at that time she was unable to slot me into her already
full schedule of graduate thesis supervisions. Dr. Heba Labib's sweet kindness and
Symbols and Abbreviations
_ Underscores represent the position(s) of gaps in a sentence.
/ Represents the SLASH feature, in which the slashed feature on the
right-hand side of the slash is missing.
e Null or empty categories.
⊕ Adding up (list concatenation) in HPSG.
A↑B A category A missing somewhere within it a B, within
Moortgat's version of CG.
↓ In LFG, a variable that refers to the lexical item being categorized.
↑=↓ In LFG, an equation meaning that the features of the nodes below and
above are being shared.
λ Lambda, a symbol referring to a string consisting of zero elements.
L Language in formal language theory.
G Grammar in the theory of formal languages.
VN Nonterminal variables.
VT Terminal variables.
L(G) The language generated by a grammar G in formal language theory.
(N, Σ, S, P) Elements of a formal grammar G.
→ The left-hand side elements are rewritten as the right-hand side
elements, e.g. S → NP VP.
x ∈ S x belongs to, or is a member of, S.
S, Σ S refers to the root of a sentence; in formal language theory, Σ refers to
the terminals of a sentence, in contrast to N, which refers to non-terminals.
• In the Earley parsing algorithm, the dot is used on the right-hand side
of a grammar rule to tell us where the rule has reached or to what
extent it has progressed, e.g. S → • VP, [0, 0]
[1] Boxed numbers, or tags, in AVMs indicate structure sharing in HPSG.
-
8/13/2019 A Proposed Approach to handling unbounded dependencies in automatic parsers
13/149
03
NLP Natural Language Processing
UDs Unbounded Dependencies
ST Syntactic Theory
GPSG Generalized Phrase Structure Grammar
LFG Lexical Functional Grammar
HPSG Head-driven Phrase Structure Grammar
CG(s) Categorial Grammar(s)
CCG Combinatory Categorial Grammar
TG Transformational Grammar
ATN Augmented Transition Networks
PSG Phrase Structure Grammar
GB Government and Binding theory
P&P Principles and Parameters theory
MP Minimalist Program
TP A clause consisting of an NP and a VP.
C Complementizer within a P&P context.
CP Complementizer phrase within a P&P context.
DP Determiner Phrase within a P&P context.
SPEC Specifier within a P&P context.
CF-PSG Context-free Phrase Structure Grammar.
FFP Foot Feature Principle within a GPSG context.
ID Immediate Dominance rules within a GPSG context.
LP Linear Precedence within a GPSG context.
HFP Head Feature principle within a GPSG context.
CSLI Stanford University's Center for the Study of Language and Information.
QUE A feature of questions in HPSG.
REL A feature of relative clauses in HPSG.
INHER Inheritance feature in HPSG.
AVM Attribute Value Matrix in HPSG and Unification Grammars.
SYNSEM Syntax-semantics interface in HPSG.
SPR Specifiers in HPSG.
3sg Third person singular in HPSG.
List of Figures and Tables
Figures
(1) The Class of Unbounded Dependency Constructions.
(2) A derivational analysis of the sentence Who do you think Jim kissed?
(3) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(4) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(5) A derivational analysis of the sentence Who do you think Jim kissed? (modified).
(6) A derivational analysis of the sentence Which city did Ian visit?
(7) Tree geometry of the structure of a UD in GPSG.
(8) A GPSG analysis of the sentence Sandy we want to succeed.
(9) An HPSG analysis of the sentence Kim, we know Sandy claims Dana hates.
(10) An attribute value matrix (AVM) for the verb sees in HPSG.
(11) An HPSG structural description (SD) of gaps in UDs.
(12) A CG analysis of the sentences Whom do you think he loves? and Who do you think loves him?
(13) A CG analysis of the sentence Who Jo hits?
(14) An LFG analysis of the sentence What Rachel thinks Ross put on the shelf?
(15) The c-structure of What Rachel thinks Ross put on the table?
(16) The f-structure of What Rachel thinks Ross put on the table?
(17) C-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)
(18) F-structure of What did the strange, green entity seem to try to quickly hide? (Asudeh 2009)
(19) A subject-predicate analysis of the topicalized sentence The others I know are genuine (CGEL).
(20) A proposed gaps-ontology.
(21) GAPS AVM.
(22) The Chomsky Hierarchy and its corresponding automata.
(23) A top-down analysis of the sentence Book that flight.
(24) A bottom-up analysis of the sentence Book that flight.
(25) A CKY parsing of the sentence Book the flight through Houston.
(26) An illustration of an attachment ambiguity in the sentence I shot the elephant in my pajamas.
(27) Components of a language processing system.
(28) The structure of a compiler within a language processing system.
(29) The parser within the compiler.
(30) Small-scale Latent Parser.
(31) GAPS AVM.
(32) Flowchart of the UDs SPL algorithm.
(33) Gap-threading in the sentence John, Sally gave a book to.
(34) A parse of the sentence Who do you claim that you like? using Python.
(35) A parser blueprint incorporating all proposed modifications.
Tables
(1) Position/function of Gaps.
(2) Multi-locus Gaps.
(3) Formal elements of a PSG.
(4) Chomsky hierarchy grammars and their corresponding automata.
(5) An Earley algorithm analysis of the sentence Book that flight.
(6) Examples of the seven types of UDs used in the RCS Test.
(7) Parser accuracy on the UDs corpus according to the RCS Test.
Chapter 1: Introduction
1.1. Motivation

Since the beginning of the year 2000 up until the end of 2003, I worked on natural
language processing (NLP) solutions for two major companies in Egypt. My first-hand
experience with actual large-scale parsers made me aware of some problems facing
those parsers in the processing of certain grammatical constructions. I decided, back
then, to tackle one of the most difficult problems facing those parsers: unbounded
dependencies.
Complex syntactic phenomena stand out as a challenge to computational
implementation in NLP applications. The challenge resides in the problematic nature of
these phenomena: they are syntactically rich in detail and, as a consequence of this
complexity, they are interleaved with many other linguistic phenomena. In addition,
they exhibit a sufficiently perplexing tendency towards being polymorphous and
diverse. Unbounded dependencies (or, alternatively, long-distance dependencies, filler-
gap constructions, wh-movement constructions, A-bar dependency constructions,
extraction dependencies, etc.) are classic examples of how complex and theoretically as
well as computationally challenging these syntactic phenomena can be. Terry
Winograd (Winograd 1983) gives us an unequivocal statement about the significance of
UDs to the then-current syntactic theory. He says:
The need to account for this phenomenon [UDs] is one of the major forces
shaping grammar formalisms. It was one of the motivations for the original
idea of transformations, and in some recent versions of TG, the only
remaining transformations are those needed to handle it. The hold register
in ATN grammars, the distant binding arrows of LFG, and the derived
categories of PSG are other examples of special devices that have been
added on top of simpler underlying mechanisms in order to handle it.
(Winograd 1983: 478)

Since the 1970s, it has been generally assumed that a number of grammatical
constructions show such uniform behavior and architecture that they should be
considered en masse. Chomsky (1977) notes that the rule of wh-movement has, inter
alia, the following general characteristics:

1- it leaves a gap.
2- where there is a bridge, there is an apparent violation of subjacency.
3- it observes wh-islands. (Chomsky 1977: 86)
Grammatical phenomena that fall under the rubric of UDs cover the following
constructions: topicalization, wh-questions, wh-relatives, it-clefts, tough movement, etc.
The most important feature marking all these constructions is the existence of gaps, as
Chomsky noted above.
UDs represent a unique class of grammatical constructions that require some
especially devised mechanisms in order to successfully process them syntactically and
computationally. A basic example of UDs is given in the following sentence:
(a) Sam, I think he told me he tried to understand __.

The above sentence can be represented in the following, largely theory-neutral, tree diagram:
-
8/13/2019 A Proposed Approach to handling unbounded dependencies in automatic parsers
20/149
21
[Tree diagram: a largely theory-neutral phrase-structure tree of the sentence
Sam, I think he told me he tried to understand ___, with the fronted NP Sam
linked to the gap site after understand.]
Sentence (a) above is a topicalized sentence in which the object is fronted to add
emphasis to the intended message of the construction. The fronting of Sam, i.e. its
displacement from the normal object position in English (an SVO language), left a
trace in the position of the displaced object that tells us about the history, or original
constitution, of the structure before the displacement process. This trace is usually
marked with a hyphen or a dash representing the displaced element. This account
somehow subscribes to a movement-based hypothesis that is part of the derivational
approach to UDs evidenced in the TG, GB, P&P and MP theories of syntax.1
1 The example above and the subsequent explanation should not be taken as a sign of the researcher's
subscription to the Chomskyan model and its various manifestations and developments. On the contrary,
the present work openly criticizes those approaches and spots many deficiencies in them, as will be seen
in Chapter 2.
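The filler-gap relation in (a) can be made concrete with a small illustrative sketch. The following toy function (my own, not drawn from any of the parsers discussed in this thesis) "undoes" topicalization by moving the fronted filler back into the gap site marked by the underscore:

```python
# Toy sketch of the filler-gap relation in a topicalized sentence:
# the fronted NP ("Sam") is reunited with the gap site ("__") left
# behind in the canonical object position of an SVO clause.
def undo_topicalization(tokens):
    """Return the canonical (untopicalized) word order.

    tokens: the topicalized sentence as a list of words, with the
    filler first and "__" marking the gap.
    """
    filler, rest = tokens[0], tokens[1:]
    gap = rest.index("__")   # locate the gap site
    rest[gap] = filler       # put the filler back into the gap
    return rest

sent = "Sam I think he told me he tried to understand __".split()
print(" ".join(undo_topicalization(sent)))
# prints: I think he told me he tried to understand Sam
```

In spirit, the gap-threading technique proposed later in this thesis generalizes this idea: rather than relocating a single known filler after the fact, the parser threads information about pending fillers and expected gaps through the derivation itself.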
1.2. The Problem

UDs represent a unique instance in the history of contemporary syntactic theory
(henceforth: ST) and NLP. In fact, they became the raison d'être of a handful of
extremely influential syntactic formalisms and a number of novel computational
theorems and techniques. Robust syntactic formalisms such as Generalized Phrase
Structure Grammar (henceforth: GPSG), Lexical Functional Grammar(s) (henceforth:
LFG), Head-driven Phrase Structure Grammar(s) (henceforth: HPSG), and modern
Categorial Grammar(s) all owe, some way or another, many of their formative concepts
and notational devices to studies of UDs. Ivan Sag (Sag 1982) expresses this fact
succinctly by saying that:
Few linguists would take seriously a theory of grammar which did not
address the fundamental problems of English that were dealt with in the
framework of standard transformational grammar by such rules as There-
insertion, It-extraposition, Passive, Subject-Subject raising, and Subject-Object
raising. (Sag 1982, p. 427)
UDs happen to be one of those constructions. This is not the whole picture, though.
UDs form an integrated component in most syntactic theories that have attained a
considerable degree of maturity. Their internal complexity and the sophistication needed
to handle them formally and computationally made them a benchmark against which
the validity, expressive power, and exhaustiveness of treatment of any given syntactic
theory are gauged. Nonetheless, only a few works have paid attention to handling UDs
in a uniform manner, i.e. works dealing with UDs as a uniform whole, surveying their
treatment in different syntactic formalisms that subscribe to different linguistic
frameworks are quite meager.1
As regards the computational handling of UDs, there have been various attempts at
unraveling their syntactic complexity through computers. The basic idea was to test the
robustness of a particular grammar formalism or computational system (Oltmans 1999).
The idea of robustness is of the essence here. A computational system is deemed robust
if it exhibits graceful behavior in the presence of exceptional conditions. Robustness in
NLP is concerned with the system's behavior when input falls outside of its initial
coverage. For instance, if the system is fed with rules describing and specifying the
behavior and structure of relative clauses in English, it will not be negatively affected if
these rules are not covered in full. But the question remains: why study UDs from a
computational viewpoint? The answer to this question seems to be unanimous in the
computational literature. UDs have been always identified in computational linguistics
works as a problem. Charniak (1993) mentions the following concerning UDs:
Another standard problem with CFGs is long distance dependencies... This
problem can be solved within a CFG, although it gets a bit complicated. (Charniak
1993: 8-9)
In Mellish et al. (1994) the situation is even more clear-cut:
The problem is more severe when we come to consider long distance
dependencies, or more correctly unbounded dependencies in which two unrelated
pieces of structure may be arbitrarily far apart and not in the same level in the tree.
(Mellish et al. 1994: 129-130)
1 Only recently have Robert Levine and Thomas Hukari produced a uniform treatment of UDs in their full
manifestations in their: R. Levine & T. Hukari (2006) The Unity of Unbounded Dependency
Constructions. CSLI Publications, Stanford University. Unfortunately, I was unable to secure a copy of
the book, but I read a detailed academic review of it by Robert Borsley. However, the main thrust of the
book is on the syntax-theoretic aspects of UDs within the framework of HPSG, without any reference to
computational issues (see Borsley 2009).
Pereira (1981) finds that one of the most important benefits of connecting parsing
with deduction is the [h]andling of gaps and unbounded dependencies on the fly
without adding special mechanisms. Such an excessive interest in and engagement with
UDs gives us a clear, unhampered view of the status of UDs as a computational
problem. There seems to be a common realization amongst computational linguists and
syntacticians of the problematic nature of UDs, a fact that precipitated many of the
current theoretical frameworks both in pure syntax and in computational linguistics.
Statistically speaking, there is a common belief that UDs and similar phenomena do
not represent a sizable portion of any general large-scale corpus, hence the neglect of
their treatment. Surprisingly, however, around three quarters of the Wall Street Journal
corpus (WSJC) in the Penn Treebank (PTB) involve non-local dependencies, which happen
to include UDs most of the time. The internal sophistication of UDs, their typological
diversity, the existence of gaps, and their considerable corpus frequency lay bare
UDs not only as an engaging problem (syntactically and computationally) but as a
compelling one as well.
1.3. Aims and Contributions

The main goal of this thesis is to provide outlines for solutions of the problem of UDs
as a computational problem. The overall aims of the thesis can be summarized in the
following points:

- Placing UDs in their proper position as regards simple, non-theoretic,
grammatical analysis.
- Uncovering the role of UDs in the formation and evolution of many syntactic
theories and formalisms.
- Laying hands on the key element(s) that could enable us to unravel the
grammatical complexity of UDs.
- Proposing a syntax-theoretic solution for UDs in terms of a proposed
gaps-ontology.
- Highlighting the complexity of processing UDs computationally, i.e. UDs as a
parsing problem.
- Proposing two types of solutions regarding the automatic parsing of UDs: the
first has to do with the overall parser design (some tweaking and modifications
of the parser architecture), while the second offers two parsing techniques that
may enable the parser to process UDs in a robust and efficient way.
However, before embarking on discussing the general outlines of my study, it is of
paramount importance to examine a question of method which confronts the researcher
at the outset. A linguist who has been trained on the dynamics and sophisticated details
of the many linguistic theories currently available while hardly having any formal
training in computability theory or computer science is unlikely to offer any detailed or
profoundly technical treatment of a phenomenon such as UDs from a computational
viewpoint. Besides, in order to prepare a serious, computationally viable study of UDs,
a linguist needs an intricate set of computational tools that can only be secured and
afforded by such large commercial/research entities as IBM, Microsoft, Carnegie
Mellon, etc., not to mention the academic and technical expertise that cannot be obviated.
Thus, what a linguist can do on their own is to adumbrate certain guidelines that relate
to the interface between theoretical and computational linguistics. Many a research
project has been marred because its author was unable to resist the temptation of going
computational: a temptation that normally leads to a chaotic morass of computational
nuances that, with the wisdom of hindsight, prove quite hard to disentangle. This
aptitude towards things computational can be ascribed to the current hype surrounding
anything that has to do with computers, without the researcher having any proper
knowledge, training or experience to do so.
I have attempted to get around this dilemma by focusing on the theoretical syntactic
issues that relate directly to computational parsing, offering a broad, semi-technical
approach to solutions. As such, none of the arguments or proposals in the
computational section of this work should be judged as technical; they are just a
number of theoretical postulations, conjectures and refutations on how, in my opinion
and according to my knowledge of computer science, these problems can be solved.
1.4. Thesis Structure

The thesis is broadly divided into three sections: the first focuses on the extensive
theoretical backdrop of the phenomenon, providing an eclectic approach towards a
uniform view of one of the lynchpin components of both the theoretical division of the
work and the computational one: gaps. The second represents a rough treatment of the
computational and parsing problems involved, using the first section as a
springboard. The third represents the researcher's contribution to the problem of UDs
parsing by giving two sets of proposed modifications on the architectural and
processing levels of the parser.
The first section can be seen as the syntax-theoretic part that deals with the
definitions, typology and grammatical analysis of UDs (Chapters 1 and 2) and how a
number of syntactic theories and formalisms dealt with them. In addition to the
analytical exposition, this section is permeated with critiques of those theories and
formalisms in their treatment of UDs, along with an attempt (a perfunctory one though)
at digging up their intellectual milieus and methodological underpinnings (Chapter 2).
Section 2.6 proposes a gaps-ontology in which an eclectic, but hopefully harmonious,
mélange of the theoretical component of gaps and gaps handling is offered. This
concludes the syntax-theoretic section of the thesis.
The second section of the thesis focuses on parsing theory and its roots in the study
of formal languages (Chapter 3). Sections 3.6-3.14 discuss the various strategies and
techniques of parsing available in the literature. Chapter 4 considers the complexity of
UDs parsability as evidenced in a recent computational experiment. Sections 4.3-4.5
examine the architecture and design of mainstream parsers and how they are built.
The final section of the thesis represents the contributions part of the work where the
proposed modifications mentioned earlier are found. Chapter 5 proposes architectural
and design modifications to the universal parser by introducing the notion of
modularity and by devising a Small-scale Latent parser. Chapter 6 proposes the next set
of modifications that relate to the processing of the parser itself. This final section
concludes with a brief account of the conclusions of the thesis.
1.5. UDs Defined
The syntactic phenomenon of unbounded dependencies has been, as alluded to above, a
major springboard for many theoretical proposals and syntactic formalisms. Naturally,
this multiplicity of origins generated concomitantly a multiplicity of definitions and
designations in the literature. First, I will look at the different definitions of UDs and
how these differences can be accounted for. Then I will survey the various designations
found in the relevant syntactic literature.
The concept of "unbounded dependencies" was first introduced by Gerald Gazdar
(1981) to refer to a set of syntactic structures handled within transformational
frameworks in terms of movement or, more specifically, wh-movement. The use of the
adjective "unbounded" in such contexts, however, goes back to J. Bresnan (1976)
during the heyday of transformational approaches to grammatical analysis. Originally,
however, the idea of "unboundedness" is a mathematical concept used in algebraic and
computational studies of unbounded operators, set theory, number theory and
algorithmics (Gowers 2009). The mathematical undertones of the term will be
discussed in the following section.
Crystal (2008) defines an unbounded dependency as
[a] term used in some theories of grammar (such as GPSG) to refer to a
construction in which a syntactic relationship holds between two
constituents such that there is no restriction on the structural distance
between them (e.g. a restriction which would require that both be
constituents of the same clause); also called a long-distance clause. In
English, cleft sentences, topicalization, wh-questions and relative clauses
have been proposed as examples of constructions which involve this kind
of dependency; for instance, a wh-constituent may occur at the beginning
of a main clause, while the construction with which it is connected may be
one, two or more clauses away, as in What has John done?/What do they
think John has done?/ What do they think we have said John has done?,
etc. In GB theory, unbounded dependencies are analyzed in terms of
movement. In GPSG, use is made of the feature SLASH. The term is
increasingly used outside the generative context.1 (Crystal 2008: 501)
Crystal's definition deserves a while of analytical contemplation. First, we need to
establish the fact that Crystal (2008) is a relatively basic specialized dictionary targeted
at professional as well as lay readers. This means that encountering detailed
argumentative analyses of linguistic phenomena would be a rare incident in his work.
He establishes his definition of UDs upon an abstract postulate that describes UDs as
having a syntactic relationship between two constituents "such that there is no
restriction on the structural distance between them." The idea of having no restriction
on the structural distance between two dependencies is a mathematically or logically
oriented idea rather than a natural language based one. In other words, natural language
cannot permit such infinitely continuous clausal concatenations. It has to have a bound
(i.e. a sentence must end somewhere in a linguistic text). The idea of unboundedness is
thus a potentiality rather than an actuality. Mathematically-oriented thinking about
language, however, has a natural proclivity towards abstraction and higher-order
1 The final two sentences in Crystal's definition are interesting from an error-analysis viewpoint,
however. First, he describes GB as handling UDs in terms of movement, which is essentially correct,
and then adds that UDs are handled in GPSG through the feature SLASH. But the feature SLASH, as we
will see later, is postulated in GPSG to account primarily for the existence of gaps in UDs, whereas
Crystal presents movement only as the main technique for handling UDs in GB. This implicitly, and
incorrectly, suggests that GB theory has no mechanism for handling gaps. Second, Crystal describes
GPSG, HPSG, LFG and CG as theories "outside the generative context." In fact all these theories are
"generative" in essence; they are only non-transformational.
language. A more linguistically-real term would be "long-distance dependencies",
which was later adopted by most non-transformational syntactic theories and syntactic
formalisms handling the phenomenon of UDs.
Trask (1993) defines UDs in a more poised manner. He notes how UDs present "a
major headache for syntactic analysis," and that "all sorts of special machinery have
been postulated to deal with them." He takes a more development-oriented approach to
the handling of the phenomenon: for example, he mentions that classical TG made
liberal use of the theoretically problematic unbounded movement rules, and that GB
and GPSG both reanalyzed UDs in terms of chains of local dependencies. GB used
traces, and GPSG came up with the feature SLASH. LFG, on the other hand, used arcs in
its f-structures. I shall deal with all these formative concepts in more detail later in this
work.
Matthews (1997) defines the phenomenon of UDs as a "[r]elation between syntactic
elements that is not subject to a restriction on the complexity of intervening structures."
His definition is a restriction-based one, bearing in mind the formative concepts of
island and cross-over constraints.
Another definition, based on the psycho-syntactic realization of UDs, is found in Slack
(1990). According to him, UDs represent a unique linguistic phenomenon; he writes:
One linguistic phenomenon which, more than any other, focuses on the
problem of addressing structural configurations is that of unbounded
dependency. Typically, in sentences like The boy who John gave the book
to __ last week was Bill, the phrase The boy is taken as the filler for the
missing argument, or gap, of the gave predicate, as indicated by the
underline. At the level of constituent structure there are no constraints on
the number of lexical items that can intervene between a filler and its
corresponding gap. (Slack 1990: 268)
Slack (1990) dissects the phenomenon of UDs in a more profound manner. He states
that UDs belong to a class of linguistic phenomena in which the structural address of
an element is determined by information which is only accessible over some arbitrary
distance in the structure. According to him, it is necessary to determine the address of
the gap to which a filler belongs. The arbitrariness of the distance separating gaps
from their fillers in the input strings makes the specification of the set of potential
predicate-argument relations that the filler can be involved in (and thus the
identification of a direct address of the gap) quite an impossible task (ibid.).
The foregoing definitions can be classified as non-partisan, i.e. they do not subscribe to
any particular syntactic theory, framework or formalism. Also, being mostly dictionary
entries, they are naturally confined by the constraints of brevity, simplicity and
neutrality. Apart from encyclopedic definitions, it should be noted that the
study of UDs has originally been formulated within more arcane journal articles and
research monographs. For that matter Gazdar et al. (1985) presents the first
perspicuous and formally rigorous definition of UDs. I shall not dwell further on GPSG
and its treatment of UDs for I have included a whole section dedicated to this classic
and most influential treatment of UDs (see 3.2.).
1.6. The Class of UDs:
Any rigorous treatment of the phenomenon of Unbounded Dependencies should rest on
a uniform, holistic comprehension of its nature. By "holistic" I refer to the necessity of
treating UDs in an undivided manner; i.e. studying relative clauses, wh-questions or
topicalized constructions separately will not shed enough light on the nature and
dynamics of the phenomenon. The study of UDs should be applied to the complete set
of constructions recognized and classified as unbounded dependency constructions.
These constructions are included within the following two subsets: strong UDs and
weak UDs.
1.6.1. Strong UDs:
In what sense is the first subset of UDs "strong"? "Strength" here is rather a misnomer
for compatibility or isomorphism. They are strong because they require the filler and
the gap to be of the same syntactic category. According to Pollard & Sag (1994: 157-
158), the first subset clearly represents strong UDs because there is an overt constituent
in a non-argument position (sentences 1-5 group A) (normally the wh-phrase) that is
strongly associated with the gap indicated by "_". Strong UDs include the following
structures:
GROUP (A)
Topicalization:
(1) This sort of problem_i, my mother_j is difficult to talk to _j about _i.1
Wh-questions:
(2) Which violin_i are these sonatas_j difficult for them to play _j on _i?
Wh-relative clauses:
(3) This is the book_i that the man_j we told the story to _j bought _i.
It-clefts:
1Underscores and small subscripts (j, i, etc.) in this and the following sentences represent gaps or empty
elements (traces of nominal or pronominal antecedents); this is a notational convention found in the
majority of syntactic analyses of UDs and similar grammatical constructions.
(4) It is Kim who_i Sandy loves _i.
Pseudo-clefts:
(5) This is what_i Kim loves _i.
1.6.2. Weak UDs:
Weak UDs, on the other hand, have no overt filler in a non-argument position
(sentences 1-4 group B); instead they have a constituent in an argument position that is
"loosely" co-referential with the gap or the trace. Weak UDs include the following
structures:
GROUP (B)
Tough movement:
(1) Sandy_i is hard to love _i.
Purpose infinitives:
(2) I bought it_i for Sandy to eat _i.
Non-wh relatives:
(3) This is the politician_i Sandy loves _i.
Non-wh clefts:
(4) It's Kim_i Sandy loves _i.
Two important points have to be mentioned here. First, UDs are indeed unbounded,
which means that the dependency may, theoretically speaking, extend ad infinitum.
Second, there is a syntactic category-matching condition between the filler and the gap,
especially in strong UDs. The following examples illustrate these two points:
(1)
a) Kim_i, Sandy trusts _i.
b) [On Kim]_i, Sandy depends _i.
(2)
a) Kim_i, Chris knows Sandy trusts _i.
b) [On Kim]_i, Chris knows Sandy depends _i.
(3)
a) Kim_i, Dana believes Chris knows Sandy trusts _i.
b) [On Kim]_i, Dana believes Chris knows Sandy depends _i.
In (1) the gap is an argument of the main clause, in (2) it is an argument of an
embedded complement clause, and in (3) it is an argument of a complement clause
within a complement clause. Mathematically speaking, there is no bound on the depth
of embedding. The following diagram represents the above-mentioned in a clearer
style.
Figure (1) The Class of Unbounded Dependency Constructions
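The unboundedness illustrated in (1)-(3) above is a matter of recursive embedding: each additional complement clause wraps the same gap-containing core one level deeper. A minimal sketch (the helper `embed_ud` and its example predicates are my own illustrative inventions, not part of any formalism discussed here):

```python
# Hypothetical helper illustrating that embedding depth is unbounded:
# each (subject, verb) pair wraps the gap-containing core clause in one
# more complement clause, mirroring examples (1a)-(3a) above.

def embed_ud(filler, core, embedders):
    """Build a topicalized UD sentence with len(embedders) levels of
    clausal embedding between the filler and the gap."""
    clause = core
    for subject, verb in reversed(embedders):
        clause = f"{subject} {verb} {clause}"
    return f"{filler}, {clause}."

print(embed_ud("Kim", "Sandy trusts _", []))                    # as in (1a)
print(embed_ud("Kim", "Sandy trusts _", [("Chris", "knows")]))  # as in (2a)
print(embed_ud("Kim", "Sandy trusts _",
               [("Dana", "believes"), ("Chris", "knows")]))     # as in (3a)
```

Nothing in the function bounds the length of `embedders`; this is the formal sense in which the dependency may, in principle, extend ad infinitum, even though any actual sentence must end somewhere.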
Evidently, the class of UDs has a rich taxonomical structure that justifies its
complexity. As noted above, studying each of the branches in the above tree diagram
on its own will yield insubstantial insights into UDs. As a first approximation, what
gathers all these different syntactic constructions under a uniform category is
the existence of a "gap" somewhere in the construction. Paradoxical as it might seem,
the existence of gaps or missing elements in the sentence is the common denominator
that holds all the above branches under one node: UDs. That is why I allocate a
special section to the handling of gaps in UDs later in this work (see Chapter 4). I
also found that an eclectic theory of gaps might be a step towards a better and more
profound comprehension of the phenomenon of UDs and the more general
phenomenon of gapping.
For the sake of brevity and clarity, the present work will focus
mainly on strong UDs throughout the proposed analyses and critiques. Weak UDs will
be mentioned sporadically throughout the work, though they will not receive a proper
treatment in their own right. The partial exclusion of weak UDs from the work will
hardly affect the treatment of the overall phenomenon. Strong UDs have all the features
that we need in order to analyze UDs. Weak UDs, on the other hand, are more of a
subset of strong UDs: a fact that makes setting aside the handling of weak UDs a
reasonable move in the footsteps of Ockham's razor.
1.7. Nomenclature
UDs have been variously termed in the literature. Y. Falk (2006: 316) recognized the
following designations: extraction, long-distance dependencies, wh dependencies
(or wh-movement), A' dependencies (or A' movement), syntactic binding,
operator movement, and constituent control. The concept owes its multifarious
terminological manifestations to different realizations of its nature and functions. Each
linguistic school or syntactic formalism saw UDs according to its defining
characteristics and theoretical grounding. Transformational theories (such as
Chomsky's GB, P&P and MP), for instance, have essentially a dynamic, movement-
based conception of most linguistic constructions; a fact which clearly explains the use
of such terms as "wh-movement, A' movement, syntactic binding," etc. On the contrary,
non-transformational theories (such as GPSG, HPSG, CG) proceed from a static
monostratal1 conception of linguistic constructions, hence their use of such terms as
"unbounded dependency constructions and long-distance dependencies."
Terminologically speaking, the term "extraction" is the only common ground where
transformational and non-transformational theories meet (on the use of extraction in
non-transformational contexts see Sag 1994).
1 This term refers to the idea that syntactic structures are essentially monostratal, i.e. they consist of only
one level of representation, which is a surface, apparent level. The Chomskyan postulate of a deep
structure is irrevocably repudiated within this monostratal framework. Gazdar et al. (1985) was the first
unequivocal statement of this theorem, on which the whole frameworks of GPSG, HPSG and
DCG are based. For more details see Horrocks (1987), Gazdar et al. (1985), Sag et al. (1994), Sag et al. (2003),
Brown ed. (2006).
Chapter 2:
UDs and Syntactic Formalisms
2.1. Derivational Approaches to UDs
Most of the reviews of the literature I came across in academic theses and books
dealing with UDs, from a historiographical viewpoint, seem to be a disparate collation
of information that hardly precipitates profound understanding or evaluation of the
intellectual context that spawned and fostered the growth of syntactic theory. This is
not the case here, I hope. My conviction is that syntactic theory (and its handling of
UDs) can hardly be understood or profoundly appreciated without a firm belief in the
utility of coming to grips with the intellectual milieu that made such scholarly feats
possible. Fortunately, the historiography of UDs in both the syntactic and the
computational realms is variegated enough to help build a mosaic that is
informative, insightful and sufficiently panoramic. I believe, thus, along with Tomalin
(2006) that
[i]t could hardly be claimed that to consider the aims and goals of
contemporary generative grammar, without first attempting to comprehend something of the intellectual context out of which the theory developed, is
to labour in a penumbra of ineffectual superficiality. (Tomalin 2006: 20)
Another important factor that necessitates this line of research has to do with UDs
themselves. The study of UDs has been a major formative force in the field, a fact that
made it a prerequisite (and a keepsake) for anyone embarking on a serious study of
syntactic theory. The inherent complexity of unbounded dependency constructions,
the challenges they posed to syntacticians of different persuasions, and the various
analytical strategies and tools proposed to handle them endowed these constructions
with a level of significance unprecedented in the field. That is why I adopted a
historical-cum-theoretical approach in studying them, because, as far as I can see, this
is the approach that is the most felicitous and the most enlightening as well.
Historically, UDs have been studied according to two different approaches: the
transformational and the non-transformational.1 Transformational approaches analyze
UDs from a movement-based perspective. The gap left behind by the filler of a UD is marked with an
underscore (as in [a] below); the filler itself changes its location through a series of movements
until it reaches the leftmost position in the tree.
(a)
1. Which car does John think you should purchase_?
2. That book you should read_.
3. This is the car which_ John told me he thinks I should purchase_.
4. Whom do you think Jim kissed _?
Sentence (4) (see Carnie 2006: 325) can be represented according to a transformational
(derivational) framework like the following2:
1 Transformational approaches have also been known as derivational approaches, because they depend
on processes that derive, via transformations, the final output of a sentence from certain hypothesized
deep structures to their final realizations as surface structures (see Bussmann 1996; Trask 1993; Radford
2003).
2 The version used here is a recent version of the transformational enterprise known as P&P (Principles
and Parameters), which is the version before the last emendation stated in Chomsky's The Minimalist
Program (1995).
Figure (2)
According to TG analyses, this is the original deep representation of the sentence,
where the wh-word is situated at the bottom of the tree. This means that in order to
move the wh-word to its proper position a number of movements have to take place. These
movements can also be illustrated in the following tree (see Carnie 2006: 326):
Figure (4)
The two arcs in the above figure represent the two hops Carnie just referred to. Now,
we can have the correct S-structure, where the wh-phrase will be situated at its rightful
initial position in the tree, as shown in figure (4) (Carnie: 328):
Figure (5)
Fodor (1978) pointed out that the effects of wh-movement are not strictly local. The s-
structure position of a wh-phrase can be arbitrarily far from its d-structure position. The
sentence Which city did Ian visit? can serve as an example:
Figure (6)
The analysis proceeds by creating the appropriate CP structure, attaching the phrase
which city in the [SPEC, CP] position. It then accounts for did by attaching it to the
C position, and handles the verb visit by identifying it as a verb that requires an NP.
The analysis thus recognizes an antecedent for which there is no argument position.
Here the wh-trace (t) comes in, attached to a post-verbal NP node (Gorrell 1995:
132-133). The fundamental line of argument evident in transformational analyses
proceeds from a psychological springboard entrenched in hypothetical reasoning that
hardly accounts for the computational handling we aspire to study.
2.2. UDs in Generalized Phrase Structure Grammar (GPSG)
The domineering nature of Noam Chomsky's transformational grammar generated a
sense of dissatisfaction among leading younger linguists during the early 80s. Gerald
Gazdar was one of those leading linguists. Back at that time linguists began to call what
is now GPSG "Gazdar Grammar". Gerald Gazdar, however, did not like that, nor did his
collaborators: Ewan Klein, Geoffrey Pullum and Ivan Sag. Their main focus was on the
study of PSGs (Phrase Structure Grammars), but they did not have a specific name for
what they were doing. After attending a talk by Emmon Bach called Generalized
It has to be noted that GPSG was something of a revolution against Chomskyan TG
(Gazdar et al. 1985; Horrocks 1987; Borsley 1999; Falk 2006). And since the class of
UDs was one of the constructions that TG adherents used as proof of the inadequacy of
the class of Phrase Structure Grammars (PSGs) for describing natural language syntax,
Gazdar and his collaborators decided to show that this assumption was basically
mistaken (Falk 2006).1 Thus the earliest work in GPSG dealt with UDs in considerable
detail.
Gazdar's paper opened up new avenues of research in theoretical linguistics and formal
computer science, producing four years later the seminal and foundational work by
Gazdar, Klein, Pullum and Sag (1985).2 In Gazdar et al. (1985) we encounter the
first formally perspicuous exposition of the nature of UDs. According to Gazdar et al.
(1985: 137) an unbounded dependency construction is one in which
(i) a syntactic relation of some kind holds between the
substructures in the construction, and
(ii) the structural distance between these two substructures is
not restricted to some finite domain (e.g. by a requirement
that both be substructures of the same simple clause).
1 GPSG was a frontal attack on transformational grammar. It not only attacked the lynchpin
concept of transformations, but also showed how unfounded other sacrosanct concepts,
such as Deep Structure vs. Surface Structure, are. Another attack was against the permeating
psychologism of TG and its claim to universality. As such, and against this backdrop, GPSG
was founded on a monostratal model (a model that accepts no dualisms or hypothesized deep
vs. surface dichotomies) with an intricate use of set-theoretic concepts, precisely to cleanse its
syntactic model of any possible trace of psychologism.
In spite of the rigorous nature of GKPS, the first chapter has the air of a revolutionary
manifesto, and it is by far the authors' clearest statement of what GPSG is (see Gazdar et al.
1985: 1-16).
2 Sometimes abbreviated as GKPS, based on the authors' initials.
(iii) topicalization, relative clauses, constituent questions, free
relatives, clefts, and various other constructions in English
have been taken to involve a dependency of this kind.
According to Gazdar et al. (1985: 137), it is analytically useful to think of such
constructions, conceptualized in terms of tree geometry (in the usual way, root up and
leaves down), as having three parts: the top, the middle and the bottom. The top is the
substructure which introduces the dependency, the middle is the domain of the structure
that the dependency spans, and the bottom is the substructure in which the dependency
ends, or is eliminated. Gazdar et al. (1985: 138) illustrate their proposed tree geometry
as follows:
Figure (7) Tree geometry of the structure of a UD in GPSG
Gazdar et al.'s (1985: 138) theory of UDs claims that the principles which govern the
bottom and the middle are completely general in character, in that all types of UDs
receive the same treatment. The idea is that the proposed analysis of UDs will be
focused on the middle of the construction, which involves no more than the feature
SLASH along with feature instantiation principles. Of these principles the Foot Feature
Principle (FFP) is the most important.
The central claim of GPSG analysis of unbounded dependencies is that these
dependencies are simply a global consequence of a linked series of mother-daughter
feature correspondences.
The main formative components of GPSG are a set of metarules that generate other
rules, such as Immediate Dominance (ID)/Linear Precedence (LP) rules, along with
feature instantiation principles, such as the FFP and the Head Feature Principle (HFP),
and features such as SLASH. The feature SLASH is our mainstay in the analysis of UDs, because
it represents and accounts for the behavior of the most significant element in an
unbounded dependency construction: gaps. But what is a SLASH?
When we write down in quasi-algebraic notation that we have, for instance, a set
A/B, this means that the set A lacks or is missing the element B. The SLASH or [/] is
originally an algebraic symbol for a missing element. The value of the SLASH feature
will be a category corresponding to a gap dominated by the categories bearing a
SLASH specification. A gap is created by some Immediate Dominance (ID) rule which
introduces a constituent that has a SLASH feature; the feature-matching principles of
GPSG push it down the head path of the category on which it first appears, and a
multiplicity of metarules allow it eventually to be cashed out as a gap at the bottom of a
nonlocal tree structure (see Levine 1989: 124-5). The best way to come to grips with
the effects of the FFP apropos slash categories is to inspect an example of its
application. Consider the following ID rules:
According to the above rules and according to feature instantiation principles, we can
predict that the resulting structures will be the following:
Though the above notation seems a little difficult to follow, it is actually very
straightforward. Rule (e.) above, for instance, refers to a verb phrase (VP) missing (/) a
noun phrase (NP), in this case an object, which conforms with ID rule number (45),
which deals with transitive verbs that take a prepositional object as part of their
1 Numbers in square brackets refer to a list of rules provided as an appendix in Gazdar et al.
(1985: 245-9).
subcategorization, such as approve of, where the prepositional phrase itself lacks this
object (PP/NP).
Now, we need to see an example illustrating all the formal nuances mentioned
above. A topicalized sentence like (a) will suffice.
(a) Sandy we want to succeed.
The unmarked ordering of this sentence would read We want Sandy to
succeed. However, a topicalized structure such as (a) within the framework of GPSG
can be represented according to the following tree:
Figure (8)
The basic idea in the GPSG analysis of UDs is that the constituent containing the gap
has a missing-element feature (Falk 2006). This is represented by the [+NULL] e
above. The constituent headed by want is a VP/NP (a verb phrase with a missing object). The
e (empty) is a pronominal that refers back to Sandy. The whole clausal constituent
containing this VP/NP is S/NP, since it is missing the same NP as the VP it dominates.
As a result of the above feature sharing, the same element occupies the filler and gap
positions at the same time, without any indication or sign of movement. This
movement-less approach to UDs, along with a solid formal apparatus (ID & LP rules,
metarules, FCRs, FSDs and FIPs), established GPSG as a suitable alternative to the
much disputed TG framework. However, GPSG was short-lived: its sophisticated
formalism and nuanced quasi-algebraic treatment of complex phenomena such as UDs
made it forbidding to the majority of linguists during the 80s. This was not the end
of GPSG, though, for it continued its existence, as we shall see in the next section, in
a different guise, this time as the much more successful framework of HPSG (Head-
driven Phrase Structure Grammar).
2.3. UDs in Head-driven Phrase Structure Grammar (HPSG)
According to Sag et al. (1999: 435), HPSG was formulated in an intellectually eclectic
environment at Stanford's Center for the Study of Language and Information (CSLI).
During the 1980s, CSLI was incubating a number of theories, approaches and
frameworks that aimed at formulating a kaleidoscopic view of language and its
mechanisms. Sag and Pollard established their theory of HPSG on a variety of theories
and formalisms: situation semantics, data type theory, TG, GPSG, CG and Unification
Grammars. This eclectic formation endowed HPSG with an undeniable flexibility on
the theoretical and formal levels.
There are three recognized hallmarks in the history of HPSG: the publication of Pollard
and Sag (1987), Pollard and Sag (1994), and Sag and Wasow (1999). These are
hallmarks in the sense that they marked definitive changes in the views of the
authors or in the formal apparatus of HPSG in general.
Unlike GPSG, HPSG shifted its attention from rules to features. This is clearly
manifested in its adoption of Unification Grammars' use of typed (or sorted) feature
structures. A typed feature structure consists of features representing linguistic entities
(words, phrases, sentences) and values that identify the dimensions of those features.
For example, the feature PERSON in a given feature structure has three values: 1st, 2nd,
and 3rd. Accordingly, the word you has the property "second person", and this is
represented by the feature-value pair [PERSON 2nd]. Pollard and Sag (1994: 8) suggest
that the role of their proposed linguistic theory is to give a precise specification of
which feature structures are to be considered admissible. Also according to their
view, the types of linguistic entities that correspond to the admissible feature structures
constitute the predictions of the theory.
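The admissibility idea can be illustrated with a toy sketch in which plain Python dictionaries stand in for typed feature structures. The `TYPES` table and the `admissible` function are hypothetical simplifications of mine, not HPSG's actual type system:

```python
# Toy model of typed feature structures: a type declares which features
# it bears and which values are admissible for each; the "theory" then
# admits only those feature structures that conform to their type.

TYPES = {
    "pronoun": {"PERSON": {"1st", "2nd", "3rd"},
                "NUMBER": {"sg", "pl"}},
}

def admissible(typename, fs):
    spec = TYPES[typename]
    return (set(fs) <= set(spec)                 # only declared features
            and all(value in spec[feat]          # only declared values
                    for feat, value in fs.items()))

# "you" carries the feature-value pair [PERSON 2nd]:
you = {"PERSON": "2nd"}
print(admissible("pronoun", you))                # True
print(admissible("pronoun", {"PERSON": "4th"}))  # False: inadmissible value
```

In Pollard and Sag's terms, the admissible structures are the theory's predictions; everything the type declarations rule out is predicted not to occur.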
UDs have received considerable treatment within HPSG. This could be ascribed to
two reasons: the first has to do with the incremental theoretical prerequisiteness of
UDs as a sophisticated syntactic phenomenon that many see as a testing-ground for any
proposed syntactic theory or formalism (see Winograd 1983; Falk 2006). The second
reason has to do with the importance of UDs within the previous contributory
progenitor, GPSG.1 However, HPSG took the analysis of UDs some steps further. In
HPSG UDs get more than the single wh-feature they used to get in GPSG.
1 It has to be noted that Ivan Sag, one of the original expositors of GPSG, later became the
central figure in HPSG work through his 1987 and 1994 publications with Carl Pollard.
In HPSG they get two distinct features: QUE and REL for questions and relative
constructions (Pollard and Sag 1994: 159). This separation could be accounted for on
the ground that the only information that needs to be kept track of in an interrogative
dependency is the nominal-object corresponding to the wh-phrase, while in a relative
dependency the referential index of the relative pronoun is all that is required (see
Pollard and Sag 1994).
Another difference relates to the realization of feature structures in both GPSG and
HPSG. In GPSG, foot features take the same kind of value, which is normally a
syntactic category, while in HPSG, nonlocal features take sets as values.1 According
to Pollard and Sag (1994: 159) this strategy enables HPSG to deal with more
sophisticated UDs, such as multiple UDs, as in the following sentences:
1- [A violin this well crafted]_1, even [the most difficult sonata]_2 will be easy to play _2 on _1.
2- This is a problem which_1 John_2 is difficult to talk to _2 about _1.
It is noteworthy that in HPSG strong UDs are analyzed in terms of
a filler-gap conception. This peculiar conception underscores the centrality of the
concept of the gap in any treatment of UDs. This is why I think HPSG is ahead of
most other syntactic theories in the analysis of UDs: precisely because of this gap-based
analysis. This competitive edge will be accounted for more clearly later in this work
(see ch.?).
1 Again the mathematical, especially algebraic, influence on syntactic theory is very much
manifested in this instance, where the use of sets is borrowed from Cantorian set theory.
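The advantage of set-valued SLASH over a single-valued foot feature can be sketched as follows. This is a toy model of the bookkeeping only; the `combine` and `discharge` helpers are hypothetical names of mine:

```python
# Toy model: with SLASH as a set, one constituent can carry several
# pending dependencies at once, as in "play _2 on _1" in sentence 1 above.

def combine(*daughter_slashes):
    """Mother's SLASH is the union of its daughters' SLASH sets."""
    return set().union(*daughter_slashes)

def discharge(slash, filler):
    """A filler binds off exactly one member of the SLASH set."""
    assert filler in slash, "no matching pending dependency"
    return slash - {filler}

# "play _2 on _1": two gaps introduce two pending NP dependencies.
vp_slash = combine({"NP[sonata]"}, {"NP[violin]"})
# Higher in the tree, each filler discharges its own dependency:
remaining = discharge(vp_slash, "NP[sonata]")
done = discharge(remaining, "NP[violin]")
print(sorted(vp_slash), done)   # two pending at once, then none
```

A single-valued feature, as in GPSG, could record only one missing category at a time; the set value is what lets both dependencies in the multiple-UD sentences coexist on the same node.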
Take the following sentence (P&S 1994: 160) as an example of how HPSG
analyzes a topicalized clause of the strong UDs type:
1- Kim_1, we know Sandy claims Dana hates _1.
Figure (8)
The analysis provided above looks similar, to a great extent, to Gazdar's bottom-
middle-top model (see figure 6). In HPSG, the bottom of the arboreal skeleton is where
the dependency is introduced, because at the bottom there exists the terminal node that
triggers the whole unbounded dependency. This terminal node is associated with a
special sign that must be nonempty. In interrogative dependencies, this sign is an
interrogative pronoun (what, which, where, etc.) with a nonempty value for the QUE
nonlocal feature, while in relative dependencies, the sign is a relative word (e.g. who,
which) having also a nonempty value for the REL nonlocal feature.
What really distinguishes HPSG from previous theories or formalisms is its reliance
on associativity: it attempts to associate linguistic objects with each other by a number
of concepts and techniques. Central to these is the concept of inheritance hierarchy, the
embodiment of which can be seen in the above tree diagram (figure 8). Instead of the
crude movement transformations in all versions of Transformational Grammars, we get
here a more computationally sound technique where the traits of a certain linguistic
object are inherited from one object to another. The SLASH category in the above tree,
for example, is being inherited from one stratum of analysis to the other by inserting
boxed numbers and the feature INHER. So the SLASH feature at the bottom of the
dependency passes from daughter to mother up the tree, and the top is where the
dependency is discharged or bound off (Pollard and Sag 1994: 160-161). As with
GPSG, HPSG is more inclined towards computational implementation, because it
originally availed itself of many computational models and procedures, and it has to
be noted here that the concept of inheritance is a genuinely computational procedure that
HPSG incorporated into its theoretical architectonic.1
HPSG uses a number of features to construct what it considers to be a complete
description of a given linguistic entity. For the description of the syntax-semantics
interface, for example, it employs a feature SYNSEM that represents the syntactic as
well as the semantic content of a particular lexical item. This is realized via what HPSG
1 The idea of inheritance is directly borrowed from computer science, especially from work on Genetic Algorithms, which resorts to biological jargon and concepts such as inheritance, evolution, and survival of the fittest (see Dopico et al. 2009).
Head-driven phrase structure grammar is a monostratal theory of natural
language grammar, based on richly specified lexical descriptions which
combine according to a small set of abstract combinatory principles stated as
formulae in a constraint logic regulating, for the most part, the satisfaction of
valence and other properties of syntactic heads. These constraints, applying
locally, determine the flow of information, encoded as feature specifications,
through arbitrarily complex syntactic representations, and capture all
syntactic dependencies, both local and non-local, in elegant and compact form requiring no derivational apparatus.
This theoretically rich definition deserves an equally rich analysis. The first fact about HPSG in this definition is that it is monostratal, which means that it does not subscribe to derivational or transformational theories of natural language grammar (see fn. 1 on p. 28 above). This, of course, recalls the early beginnings of GPSG (Gazdar 1982). The second feature that really characterizes HPSG is its lexicalism: as Levine (2003) puts it, HPSG is based on richly specified lexical descriptions. This highlights HPSG's attention to the value of lexical items as bearers of information and as the glue that binds linguistic descriptions together. In fact, HPSG is head-driven because it relies on lexical heads, such as sees above, in its descriptions of linguistic entities. Finally, the definition hints at HPSG's recourse to mathematical and logico-mathematical jargon in its descriptions of local and non-local (UDs) syntactic dependencies in an elegant and compact form.1 Implied here is
1 Note here also the use of "elegant and compact", which is a commonplace description in mathematical and logico-mathematical literature. A mathematical proof, for instance, has to be elegant and compact in the sense that it admits of no logical fallacies, internal inconsistencies, or needless tortuous sub-proofs.
the idea that a derivational apparatus as in GB, P&P and MP formalisms is
essentially inelegant and incompact.
HPSG, then, looks at UDs as filler-gap constructions (Pollard & Sag 1994), or as constructions with gaps (GAPs) that can be resolved by detecting the sites or positions of those gaps and relating them to their fillers via inheritance. This is realized by stipulating what HPSG calls the GAP Principle (Sag & Wasow 1999; Carnie 2003). The GAP Principle states the following:
A well-formed phrase structure licensed by a headed rule other than the Head-Filler Rule must satisfy the following SD1:
Figure (11)
This means that the mother's GAP feature subsumes all the GAP values of its daughters. The symbol ⊕ in the diagram above simply refers to the arithmetical notion of "adding up", but this time the entities added are not single linguistic objects but lists of linguistic objects (Sag & Wasow 1999: 351). The boxed n above is likewise the arithmetical indication of the idea of "any number of". Gaps in HPSG will be more
thoroughly, and comparatively, explored later along with other syntactic frameworks.
1 SDs stand for Structural Descriptions, which are the amalgamation of constraints from lexical entries, grammar rules, and relevant grammatical principles. See Sag & Wasow (1999: 68).
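The list summation at the heart of the GAP Principle can be rendered as a small sketch. The function names below are my own illustrative choices; the only idea taken from Sag & Wasow is that a mother's GAP list is the append (⊕) of the GAP lists of its n daughters.

```python
# Toy rendering of the GAP Principle: for a phrase licensed by a headed
# rule (other than the Head-Filler Rule), the mother's GAP value is the
# list summation (⊕) of its daughters' GAP values.

def list_sum(*gap_lists):
    """⊕: append any number of lists of (linguistic-object) values."""
    result = []
    for lst in gap_lists:
        result += lst
    return result

def mother_gap(daughter_gaps):
    # the mother subsumes all the GAP values of its daughters
    return list_sum(*daughter_gaps)

# "Kim, Sandy likes _": only the VP daughter contributes a gap
assert mother_gap([[], ["NP"]]) == ["NP"]
# a gapless local tree: the mother's GAP list is empty
assert mother_gap([[], [], []]) == []
```

Note that ⊕ operates on lists, not sets: a clause with two gaps yields a two-element GAP list, and order is preserved.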
2.4. Categorial Grammar(s)
From a historical vantage point, Categorial Grammars (or CGs) antedate all generative
theories of syntax. CG was first formulated within a strictly logical backdrop: it was
Kasimir Ajdukiewicz, the famous Polish logician and algebraist of the Lvov-Warsaw
school of logic and mathematics, who introduced the idea of functional syntax in his
Die syntaktische Konnexität (1935). But Ajdukiewicz's treatment was strictly logico-mathematical, a fact which made his work quite forbidding for linguists.1 Two decades later, Yehoshua Bar-Hillel (1953), also a logician, came along with a revived interest in Ajdukiewicz's CG, this time combining it with many insights and methods from American linguistics of the 1950s. This new combination of the ideas and methods of mathematical logic and structural linguistics spawned a novel interest in CG in the USA and on the Continent. The interesting thing about Bar-Hillel's revival of CG is his
belief in the suitability of CG for machine translation purposes. That explains why
computational linguists tend to prefer CG, and other like-minded formalisms, over syntactic theories bereft of such computational aptitude.
Being an offshoot of advanced logical and formal studies, CG's emphasis on the semantics of natural languages is naturally expected. Unlike other formalisms and theories of syntax, CG has no separate module for semantic processing, for it sees semantics as an inherently inextricable component of syntactic description. In other words, syntax and semantics in CG are one and the same thing: every rule of syntax is,
1 Besides being an excruciating reading even for the initiated in mathematical logic, Ajdukiewicz's paper appeared in a Polish philosophical journal and has therefore been unknown to most linguists (Y. Bar-Hillel 1953: 1).
inherently, a rule of semantics (Wood 1993: 3). CG has the following properties (Wood 1993: 3-5):
(1) It sees language in terms of functions and arguments rather than of constituent structure.
(2) Syntax and semantics are integral.
(3) It is monotonic (monostratal), i.e. it avoids destructive devices such as movement or deletion rules, which characterize transformational grammars.
(4) It takes the move towards lexicalism to its logical extreme, i.e. the syntactic behavior of any linguistic item is directly encoded in its lexical category specifications.
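Property (1), language as functions and arguments, can be made concrete with a toy sketch of the two classical application rules of CG. The tuple encoding of categories and the function names are my own illustrative choices, not part of any standard CG implementation.

```python
# Categories are either atomic strings ("np", "s") or triples
# (result, slash, argument): ("s", "\\", "np") encodes s\np, and a
# transitive verb like "likes" gets (s\np)/np.

def forward(fn, arg):
    """Forward application: A/B followed by B yields A."""
    if isinstance(fn, tuple) and fn[1] == "/" and fn[2] == arg:
        return fn[0]
    return None  # the categories do not combine

def backward(arg, fn):
    """Backward application: B followed by A\\B yields A."""
    if isinstance(fn, tuple) and fn[1] == "\\" and fn[2] == arg:
        return fn[0]
    return None

# "Jo likes Kim": likes = (s\np)/np
likes = (("s", "\\", "np"), "/", "np")
vp = forward(likes, "np")   # likes + Kim  =>  s\np
s = backward("np", vp)      # Jo + [likes Kim]  =>  s
assert vp == ("s", "\\", "np")
assert s == "s"
```

Note how the derivation doubles as a semantic composition: each application of a functor category to its argument mirrors the application of a function to its semantic argument, which is exactly the sense in which syntax and semantics are "one and the same thing" in CG.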
The other peculiar aspect of the relation between CG and UDs is the somewhat troubled history between the two. Ironically, Bar-Hillel lost faith in CG because he found out that it was unable to process discontinuous constructions (such as UDs) (Wood 1993: 23, 104). But the theory of CG during the 1960s was not developed enough to handle sophisticated syntactic constructions such as UDs. From that early date, the intractability of UDs was recognized as a processing fact that any syntactic theory or formalism has to efficiently and rigorously account for.
Classical CG did not offer any straightforward method of dealing with UDs (Wood 1993: 104). However, Ades and Steedman (1982) used the recursive power of generalized composition to reach what they called a "derivational constituent" which "can be utilized to apply backwards to the fronted object giving the correct semantic interpretation" (Wood 1993: 105). A sentence like Who(m) do you think he loves? can be represented, according to Ades and Steedman (1982), in the following way
Figure (12)
Recent advances in CG produced the more elaborate type-logical categorial grammar. What interests me most in this more advanced formalism is its proposal of a novel procedure to handle gaps in UDs. Bob Carpenter (1997) adopts Moortgat's approach to UDs to account for the existence of gaps and how they should be treated within a CG-based framework. As Carpenter (1997: 203) mentions, Moortgat's analysis rests on proposing an additional binary category constructor, ↑, that can be used to construct categories of the form A↑B. Such a category denotes a category A with a B missing somewhere within it. For instance, s↑np is a sentence from which a noun phrase has been extracted. The extraction constructor A↑B is a generic form for both A/B and A\B that may be instantiated as follows:
s↑np = s/np or s\np
which indicate a sentence lacking a noun phrase on the right or left frontiers. The use of
the SLASH feature in CG is similar to that in GPSG and HPSG; the difference lies in
the adoption of feature structures and AVMs in HPSG and the adoption of the Lambek
calculus (a semi-algebraic linear formalism) in CG. An example of how advanced CG handles a UD can be of use here. The phrase Who Jo hits is formally represented in CG according to the following schema (see Carpenter 1997: 206).
Figure (13) A representation of who Jo hits?
The postulation of (s↑np) at the beginning of the relative or interrogative clause (under who) is the notational tool that unravels the unboundedness of the structure by postulating that there is a missing noun phrase somewhere in the construction.
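The generic character of the extraction constructor can be shown in the same spirit with a small sketch. The tuple encoding and the function name are purely illustrative, not Carpenter's or Moortgat's notation.

```python
# Toy rendering of Moortgat's extraction constructor ↑ as described by
# Carpenter (1997): A↑B stands for a category A missing a B somewhere,
# and may be instantiated directionally as A/B or A\B.

def instantiations(a, b):
    """All directional instantiations of the generic category a↑b."""
    return [(a, "/", b), (a, "\\", b)]

# s↑np: a sentence lacking an np on its right or left frontier
assert instantiations("s", "np") == [("s", "/", "np"), ("s", "\\", "np")]
```

The point of the generic constructor is precisely that a parser need not decide in advance on which frontier the noun phrase is missing; both directional categories are licensed by the single form s↑np.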
2.5. Lexical Functional Grammar (LFG)
This is the fourth syntactic theory through which I try to explain and unravel the nature
of UDs. LFG is one of the most prominent theories of grammar belonging to the
generative tradition. It is also one of the theories that subscribe to a non-transformational agenda. Being non-transformational boosted the theory's potential for a rigorous treatment of UDs. This is due to the fact that most non-transformational
theories and formalisms are more inclined towards formalisms that are couched in
mathematical or semi-mathematical terms. This is the case with LFG.
But in what sense is LFG different from the other theories mentioned above? It differs from GPSG and CG in that LFG is, in fact, a complete theory of language syntax, with a separate explanatory module for the study of language acquisition, universals, and cognitive aspects. This is not the case with GPSG or CG, because both of them, and especially GPSG, pose ruthless critiques of the psychologism prevalent in GB and P&P, and both are more devoted to such applications as computational linguistics and AI. LFG is similar to HPSG in that the latter also sustains certain claims to universality and psychological reality. But all of these theories share a staunch rejection of transformational rules and assumptions. They also share an avid interest in lexicalism: the four of them (GPSG, HPSG, CG, LFG) see the lexicon as the springboard for any viable and true grammatical analysis.
As opposed to GB and P&P, the non-transformational approaches mentioned above
see lexical categories as the keys with which we can unravel syntactic riddles,
especially the riddle of UDs. That also accounts for the high importance of UDs
analyses within the frameworks of all those theories. GPSG proposed the Head Feature
Principle, which restores to lexical items their due powers instead of ascribing all
powers to extra-linguistic features and movements as is the case with transformational
grammars (Falk 2001). HPSG, which is a more stringent framework than GPSG
(Carnie 2008), bases the entire linguistic analysis on the head sign, which is an
instantiation of a certain lexical item or word. CG is even more extreme on the issue
of lexicalism; that is why it derives its analytical momentum from certain atomic
lexical categories.
LFG is also lexical or lexicalist because the lexicon plays a major role in it. In LFG
(Dalrymple ELL2 2006) the lexicon is richly structured, with lexical relations rather
than transformations or operations on phrase structure trees as a means of capturing
linguistic generalizations. Yehuda Falk (2003) adds to the major tenets of LFG what he
calls the Lexical Integrity Principle, which states the following:
Words are the atoms out of which syntactic structure is built. Syntactic
rules cannot create words or refer to the internal structures of words, and
each terminal node (or leaf of the tree) is a word. (Falk 2003: 4)
The other aspect of LFG has to do with its emphasis on functionalism. The
functional part of LFG means that grammatical functions (or grammatical relations)
such as subject and object are primitives of the theory, not defined in terms of structural
configurations or semantic roles1 (Dalrymple 2006). LFG grants such grammatical
functions as subject and object a rather universal character where such abstract
grammatical functions are at play in the structure of all languages no matter how
dissimilar they might appear. The theory assumes that as languages obey certain
universal principles as regards abstract syntactic structures, they do the same thing
regarding the principles of functional organization (Dalrymple 2001). So much, then, for LFG's nomenclature, i.e. the "lexical" and "functional" epithets.
1 This is the standard view of transformational approaches. According to this view, subject and object are not part of the syntax vocabulary, i.e. they are extra-configurational. Those grammatical functions or relations derive from the phrase structure they happen to occur in. If subjects, for example, can be controlled, this control, according to this view, is attributed to the structural lineaments of the position where the subject occurs. For a more in-depth discussion, see Falk (2003).
C-structure and F-structure:
The two divisions of the formal architecture of LFG are constituent structure (c-
structure) and functional structure (f-structure). The c-structure is concerned with the
description of syntactic structure while the f-structure details the semantic-cum-
functional structure of the linguistic entities concerned. The formal machinery of c-
structure depends on X-bar syntax with the addition of a number of techniques and
concepts that characterize the LFG theory and its formalism. C-structure can be
illustrated according to the following figure (Falk 2003) analyzing the following clause:
What Rachel thinks Ross put on the shelf
Figure (14)
According to this description the empty category (e) is tied to, or bound with, the antecedent filler by what LFG calls metavariables, represented by the up and down arrows. The double arrow notation has been abandoned in more recent versions of LFG,
which incorporate functional components into the tree; this can be illustrated by the following example:
Figure (15) The c-structure of What Rachel thinks Ross put on the table?
The corresponding f-structure looks like the following:
Figure (16) The f-structure of What Rachel thinks Ross put on the table?
The previous descriptions are classic representations of UDs, due to Kaplan and Bresnan (1982) and Kaplan and Zaenen (1989) respectively.
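The structure-sharing idea behind these f-structures can be mimicked with a nested attribute-value sketch. The attribute names below are loosely modeled on the cited analyses but are my own simplifications; the point is only that the displaced constituent (here FOCUS) and the within-clause grammatical function it fills are one and the same object.

```python
# A toy f-structure for "What Rachel thinks Ross put on the shelf",
# rendered as nested Python dicts. Attribute names are illustrative.

f_structure = {
    "PRED": "think<SUBJ, COMP>",
    "FOCUS": {"PRED": "what"},        # the displaced constituent
    "SUBJ": {"PRED": "Rachel"},
    "COMP": {
        "PRED": "put<SUBJ, OBJ, OBL>",
        "SUBJ": {"PRED": "Ross"},
        "OBL": {"PRED": "on the shelf"},
    },
}

# The unbounded dependency: the embedded OBJ is identified (token-
# identical, not merely equal) with the matrix FOCUS.
f_structure["COMP"]["OBJ"] = f_structure["FOCUS"]

assert f_structure["COMP"]["OBJ"] is f_structure["FOCUS"]
```

The identity assertion is the crux: the "unboundedness" amounts to allowing the path from FOCUS down to the within-clause function (here COMP OBJ) to be arbitrarily long, which is what Kaplan and Zaenen's functional-uncertainty equations express.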
More recent advances in LFG tend to be more detailed and hence more sophisticated. The following analysis from Asudeh (2009) is a case in point. The
clause What did the strange, green entity seem to try to quickly hide? gets the following constituent and functional descriptions respectively:
Figure (17) C-structure of What did the strange, green entity seem to try to quickly hide?
(Asudeh 2009)
Figure (18) F-structure of What did the strange, green entity seem to try to quickly hide?
(Asudeh 2009)
The interesting thing about this analysis, however, is that it not only shows how LFG handles the phenomenon of UDs but also covers a host of other syntactic phenomena such as adjunction, raising, and control.
To sum up, early LFG (Kaplan and Bresnan 1982) analyzed UDs in terms of c-
structure that explicitly drew the relation between a displaced constituent and its
corresponding gap via the double arrow notation. However, Kaplan and Zaenen (1989)
showed that the previous treatment was deficient in accounting for functional constraints on UDs (Dalrymple 2001). This led them to incorporate f-structure components into their analysis of UDs, thus abandoning the double arrow notation, as seen in figure (15) above.
2.6. Towards an Ontology of Gaps
The previous accounts raise a serious question about the various treatments of UDs. But despite the many moot points among the theories and formalisms briefly described in the previous sections, the one thing all of them tend to agree upon is that the key to unlocking the sophistication of unbounded constructions lies in providing a rigorous account of gaps (a.k.a. empty categories, null elements, missing elements, SLASH categories, traces). A correct and rigorous account of gaps will be the liaison between the purely theoretical treatment of UDs and computational implementation. This is because dealing with gaps represents a well-defined problem, and all computational theorizing and implementation is based on problem-solving. Thus we first need to identify what might be called an ontology o