David Lebeaux, Language Acquisition and the Form of the Grammar (2000)


LANGUAGE ACQUISITION AND THE FORM OF THE GRAMMAR



This book was originally selected and revised to be included in the World Theses Series

(Holland Academic Graphics, The Hague), edited by Lisa L.-S. Cheng.



LANGUAGE ACQUISITION

AND THE FORM OF THE GRAMMAR

JOHN BENJAMINS PUBLISHING COMPANY

PHILADELPHIA/AMSTERDAM

DAVID LEBEAUX

NEC Research Institute



The paper used in this publication meets the minimum requirements of

American National Standard for Information Sciences Permanence of

Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data

Lebeaux, David.

Language acquisition and the form of the grammar / David Lebeaux

p. cm.

Includes bibliographical references and index.

1. Language acquisition. 2. Generative grammar. I. Title.

P118.L38995 2000

401.93--dc21 00-039775

ISBN 90 272 2565 6 (Eur.) / 1 55619 858 2 (US)

© 2000 John Benjamins B.V.

No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any

other means, without written permission from the publisher.

John Benjamins Publishing Co., P.O. Box 75577, 1070 AN Amsterdam, The Netherlands

John Benjamins North America, P.O. Box 27519, Philadelphia, PA 19118-0519, USA



Table of Contents

Acknowledgments
Preface
Introduction

Chapter 1
A Re-Definition of the Problem
1.1 The Pivot/Open Distinction and the Government Relation
1.1.1 Braine's Distinction
1.1.2 The Government Relation
1.2 The Open/Closed Class Distinction
1.2.1 Finiteness
1.2.2 The Question of Levels
1.3 Triggers
1.3.1 A Constraint
1.3.2 Determining the base order of German
1.3.2.1 The Movement of NEG (syntax)
1.3.2.2 The Placement of NEG (Acquisition)

Chapter 2
Project-α, Argument-Linking, and Telegraphic Speech
2.1 Parametric variation in Phrase Structure
2.1.1 Phrase Structure Articulation
2.1.2 Building Phrase Structure (Pinker 1984)
2.2 Argument-linking
2.2.1 An ergative subsystem: English nominals
2.2.2 Argument-linking and Phrase Structure: Summary
2.3 The Projection of Lexical Structure
2.3.1 The Nature of Projection
2.3.2 Pre-Project-α representations (acquisition)
2.3.3 Pre-Project-α representations and the Segmentation Problem
2.3.4 The Initial Induction: Summary
2.3.5 The Early Phrase Marker (continued)
2.3.6 From the Lexical to the Phrasal Syntax
2.3.7 Licensing of Determiners
2.3.8 Submaximal Projections

Chapter 3
Adjoin-α and Relative Clauses
3.1 Introduction
3.2 Some general considerations
3.3 The Argument/Adjunct Distinction, Derivationally Considered
3.3.1 RCs and the Argument/Adjunct Distinction
3.3.2 Adjunctual Structure and the Structure of the Base
3.3.3 Anti-Reconstruction Effects
3.3.4 In the Derivational Mode: Adjoin-α
3.3.5 A Conceptual Argument
3.4 An Account of Parametric Variation
3.5 Relative Clause Acquisition
3.6 The Fine Structure of the Grammar, with Correspondences: The General Congruence Principle
3.7 What the Relation of the Grammar to the Parser Might Be

Chapter 4
Agreement and Merger
4.1 The Complement of Operations
4.2 Agreement
4.3 Merger or Project-α
4.3.1 Relation to Psycholinguistic Evidence
4.3.2 Reduced Structures
4.3.3 Merger, or Project-α
4.3.4 Idioms
4.4 Conclusion

Chapter 5
The Abrogation of DS Functions: Dislocated Constituents and Indexing Relations
5.1 Shallow Analyses vs. the Derivational Theory of Complexity
5.2 Computational Complexity and The Notion of Anchoring
5.3 Levels of Representation and Learnability
5.4 Equipollence
5.5 Case Study I: Tavakolian's results and the Early Nature of Control
5.5.1 Tavakolian's Results
5.5.2 Two Solutions
5.5.3 PRO as Pro, or as a Neutralized Element
5.5.4 The Control Rule, Syntactic Considerations: The Question of C-command
5.5.5 The Abrogation of DS functions
5.6 Case Study II: Condition C and Dislocated Constituents
5.6.1 The Abrogation of DS Functions: Condition C
5.6.2 The Application of Indexing
5.6.3 Distinguishing Accounts
5.7 Case Study III: Wh-Questions and Strong Crossover
5.7.1 Wh-questions: Barriers framework
5.7.2 Strong Crossover
5.7.3 Acquisition Evidence
5.7.4 Two possibilities of explanation
5.7.5 A Representational Account
5.7.6 A Derivational Account, and a Possible Compromise

References

Index



There are two ways of painting two trees together. Draw

a large tree and add a small one; this is called fu lao

(carrying the old on the back). Draw a small tree and add

a large one; this is called hsieh yu (leading the young by

the hand). Old trees should show a grave dignity and an

air of compassion. Young trees should appear modest and

retiring. They should stand together gazing at each other.

Mai-mai Sze

The Way of Chinese Painting



Acknowledgments

This book had its origins as a linguistics thesis at the University of Massachusetts.

First of all, I would like to thank my committee: Tom Roeper, for scores

of hours of talk, for encouragement, and for his unflagging conviction of the

importance of work in language acquisition; Edwin Williams, for the example of

his work; Lyn Frazier, for an acute and creative reading; and Chuck Clifton, for

a psychologist's view. More generally, I would like to thank the faculty and

students of the University of Massachusetts, for making it a place where creative

thinking is valued. The concerns and orientation of this book are very much

molded by the training that I received there.

Further back, I would like to thank the people who got me interested in all

of this in the first place: Steve Pinker, Jorge Hankamer, Jane Grimshaw, Annie

Zaenen, Merrill Garrett and Susan Carey. I would also like to thank Noam

Chomsky for encouragement throughout the years.

Since the writing of the thesis, I have had the encouragement and advice of

many fine colleagues. I would especially like to thank Susan Powers, Alan

Munn, Cristina Schmitt, Juan Uriagereka, Anne Vainikka, Ann Farmer, and Ana-

Teresa Perez-Leroux. I am also indebted to Sandiway Fong, as well as Bob

Krovetz, Christiane Fellbaum, Kiyoshi Yamabana, Piroska Csuri, and the NEC

Research Institute for a remarkable environment in which to pursue the research

further.

I would also like to thank Mamoru Saito, Hajime Hoji, Peggy Speas,

Juergen Weissenborn, Clare Voss, Keiko Muromatsu, Eloise Jelinek, Emmon

Bach, Jan Koster, and Ray Jackendoff.

Finally, I would like to thank my parents, Charles and Lillian Lebeaux, my

sister, Debbie Lebeaux, and my sons, Mark and Theo. Most of all, I would like

to thank my wife Pam, without whom this book would have been done badly, if

at all. This book is dedicated to her, with love.



Preface

What is the best way to structure a grammar? This is the question that I started

out with in the writing of my thesis in 1988. I believe that the thesis had a

marked effect in its answering of this question, particularly in the creation of the

Minimalist Program by Chomsky (1993) a few years later.

I attempted real answers to the question of how to structure a grammar, and

the answers were these:

(i) In acquisition, the grammar is arranged along the lines of subgrammars.

These grammars are arranged so that the child passes from one to the next,

and each succeeding grammar contains the last. I shall make this clearer

below.

(ii) In addition, in acquisition, the child proceeds to construct his/her grammar

from derivational endpoints (Chapter 5). From the derivational endpoints,

the child proceeds to construct the entire grammar. This may be forward or

backward, depending on what the derivational endpoint is. If the derivational

endpoint, or anchorpoint, is DS, then the construction is forward; if the

derivational endpoint or anchorpoint is S-structure or the surface, then the

construction proceeds backward.

The above two proposals were the main proposals made about the

acquisition sequence. There were many proposals made about the syntax. Of

these, the main architectural proposals were the following.

(iii) The acquisition sequence and the syntax (in particular, the syntactic

derivation) are not to be considered in isolation from each other, but

rather are tightly yoked. The acquisition sequence can be seen as the result

of derivational steps or subsequences (as can be seen in Chapters 2, 3, and

4). This means that the acquisition sequence gives unique purchase onto the

derivation itself, including the adult derivation.

(iv) Phrase structure is not given as is, nor is derived top-down, but rather is

composed (Speas 1990). This phrase structure composition (Lebeaux 1988)

is not strictly bottom-up, as in Chomsky's (1995) Merge, but rather (a) involves

the intermingling of units, (b) is grammatically licensed, and not simply

geometrical (bottom-up) in character (in a way which will become clearer

below), and (c) involves, among other transformations, the transformation

Project-α (Chapter 4).

(v) Two specific composition operations (and the beginnings of a third) are

proposed. Adjoin-α (Chapter 3) adds adjuncts to the basic

nuclear clause structure (Conjoin-α is also suggested in that chapter). In

further work, this is quite similar to the Adjunction operation of Joshi and

Kroch, and the Tree Adjoining Grammars (Joshi 1985; Joshi and Kroch

1985; Frank 1992), though the proposals are independent and

are not exactly the same. The second new composition operation is

Project-α (Chapter 4), which is an absolutely new operation in the field. It

projects open class structure into a closed class frame, and constitutes the

single most radical syntactic proposal of this book.

(vi) Finally, composition operations, and the variance in the grammar as a

whole, are linked to the closed class set: elements like the, a, to, of, etc.

In particular, each composition operation requires the satisfaction of a

closed class element, and a closed class element is implicated in

each parameter.

These constitute some of the major proposals that are made in the course of this

thesis. In this preface I would like to both lay out these proposals in more detail,

and compare them with some of the other proposals that have been made since

the publication of this thesis in 1988. While this thesis played a major role in the

coming of the Minimalist Program (Chomsky 1993, 1995), the ideas of the thesis

warrant a renewed look by researchers in the field, for they have provocative

implications for the treatment of language acquisition and the composition of

phrase structure.

Let us start to outline the differences of this thesis with respect to later

proposals, not with respect to language acquisition, but with respect to syntax. In

particular, let us start with parts (iv) and (v) above: that the phrase marker is

composed from smaller units.

A similar proposal is made with Chomsky's (1995) Merge. However, here,

unlike Merge:

(1) The composition is not simply bottom-up, but involves the possible

intermingling of units.

(2) The composition is syntactically triggered in that all phrase structure

composition involves the satisfaction of closed class elements

(Chapters 3 and 4), and is not simply the geometric putting together

of two units, as in Merge, and

(3) The composition consists of two operations among others (these are

the only two that are developed in this thesis), Adjoin-α and

Project-α.

With respect to the idea that all composition operations are syntactically triggered

by features, let us take the operation Adjoin-α. This takes two structures and

adjoins the second into the first.

(1) s1: the man met the woman

s2: who loved him

Adjoin-α → the man met the woman who loved him

This shows the intermingling of units, as the second is intermeshed with the first.

However, I argue here (Chapter 4), that it also shows the satisfaction of closed

class elements, in an interesting way. Let us call the wh-element of the relative

clause, who here, the relative clause linker.

It is a proposal of this thesis that the adjunction operation itself involves the

satisfaction of the relative clause linker (who), by the relative clause head (the

woman), and it is this relation, which is the relation of Agreement, which composes

the phrase marker. The relative clause linker is part of the closed class set. This

relative clause linker is satisfied in the course of Agreement; thus the composition

operation is put into a 1-to-1 relation with the satisfaction of a closed class

head. (This proposal, so far as I know, is brand new in the literature.)

(2) Agree Relative head/relativizer ↔ Adjoin-α

This goes along with the proposal (Chapter 4), which was taken up in the

Minimalist literature (Chomsky 1992, 1995), that movement involves the

satisfaction of closed class features. The proposal here, however, is that composition,

as well as movement, involves the satisfaction of a closed class feature (in

particular, Agreement). In the position here, taken up in the Minimalist literature,

the movement of an element to the subject position is put into a 1-to-1

correspondence with agreement (Chapter 4 again).

(3) Agree Subject/Predicate ↔ Move NP (Chapter 4)

The proposal here is thus more thoroughgoing than that in the minimalist

literature, in that both the composition operation and the movement operation are

triggered by Agreement, and the satisfaction of closed class features. In the

minimalist literature, it is simply movement which is triggered by the satisfaction

of closed class elements (features); phrase structure composition is done simply

geometrically (bottom-up). Here, both are done through the satisfaction of

Agreement. This is shown below.

(4)                            Minimalism                             Lebeaux (1988)

Movement                       syntactic (satisfaction of features)   syntactic (satisfaction of features)

Phrase Structure Composition   asyntactic (geometric)                 syntactic (satisfaction of features)

This proposal (Lebeaux 1988) links the entire grammar to the closed class set:

both the movement operations and the composition operations are linked to

this set.

The set of composition operations discussed in this thesis is not intended to

be exhaustive, merely representative. Along with Adjoin-α, which Chomsky-adjoins

elements into the representation (Chapter 3), let us take the second, yet

more radical phrase structure composition operation, Project-α. This is not

equivalent to Speas' (1990) Project-α, but rather projects an open class structure

into a closed class frame. The open class structure also represents pure thematic

structure, and the closed class structure, pure Case structure.

This operation, for a simple partial sentence, looks like (5) (see Lebeaux

1988, 1991, 1997, 1998 for further extensive discussion).

The operation projects the open class elements into the closed class (Case)

frame. It also projects up the Case information from Determiner to DP, and

unifies the theta information, from the theta subtree, into the Case Frame, so that

it appears on the DP node.

The Project-α operation was motivated in part by the postulation of a

subgrammar in acquisition (Chapters 2, 3, and 4), in part by the remarkable

speech error data of Garrett (Chapter 4, Garrett 1975), and in part by idioms

(Chapter 4). This operation is discussed at much greater length in further

developments by myself (Lebeaux 1991, 1997, 1998).

I will discuss the subgrammar underpinnings of the

Project-α approach in more detail later in this preface. For now, I would simply like to point to

the remarkable speech error data collected by Merrill Garrett (1975, 1980), the

MIT corpus, which anchors this approach.


(5) Theta subtree (open class):

[V [N(agent) man] [V [V see] [N(patient) woman]]]

Case Frame (closed class):

[VP [DP(+nom) [Det(+nom) the] [NP e]] [V [V see] [DP(+acc) [Det(+acc) a] [NP e]]]]

Result of Project-α:

[VP [DP(+agent, +nom) [Det(+nom) the] [NP(+agent) man]] [V [V see] [DP(+patient, +acc) [Det(+acc) a] [NP(+patient) woman]]]]

Garrett and Shattuck-Hufnagel collected a sample of 3400 speech errors. Of

these, by far the most interesting class is the so-called morpheme-stranding

errors. These are absolutely remarkable in that they show the insertion of open

class elements into a closed class frame. Thus, empirically, the apparent importance

of open class and closed class items is reversed: rather than open class

items being paramount, closed class items are paramount, and guide the derivation.

Open class elements are put into slots provided by closed class elements, in

Garrett's remarkable work. A small sample of Garrett's set is shown below.



(6) Speech errors (stranded morpheme errors), Garrett (personal communication)

(permuted elements underlined)

Error                               Target

my frozers are shoulden             my shoulders are frozen

that just a back trucking out       a truck backing out

McGovern favors pushing busters     favors busting pushers

but the clean's twoer               two's cleaner

his sink is shipping                ship is sinking

the cancel has been practiced       the practice has been cancelled

she's got her sets sight            sights set

a puncture tiring device            tire puncturing device

As can be seen, these errors can only arise at a level where open class elements

are inserted into a closed class frame. The insertion does not take place correctly

(a speech error), so that the open class elements end up in permuted slots

(e.g. a puncture tiring device).
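The slot-filling picture behind these errors can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the closed class frame is encoded as a format string, the open class stems are supplied separately, and the names FRAME and fill_frame are hypothetical conveniences, not Garrett's notation or that of this thesis.

```python
# A closed class frame: the determiner, the bound morpheme -ing and the
# particle are fixed; {0} and {1} are slots for open class stems.
# (Hypothetical encoding, chosen only for illustration.)
FRAME = "a {0} {1}ing out"


def fill_frame(frame, stems):
    """Insert open class stems, in order, into the slots of a closed class frame."""
    return frame.format(*stems)


stems = ("truck", "back")

# Correct insertion: each stem lands in its intended slot.
target = fill_frame(FRAME, stems)

# Stranding error: the stems are permuted, while the closed class
# material (a, -ing, out) stays where the frame put it.
error = fill_frame(FRAME, tuple(reversed(stems)))

print(target)
print(error)
```

The point of the sketch is that the bound morpheme never moves with its stem: it belongs to the frame, so permuting the stems leaves it stranded.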

Garrett summarizes this as follows:

why should the presence of a syntactically active bound morpheme be

associated with an error at the level described in [(6)]? Precisely because the

attachment of a syntactic morpheme to a particular lexical stem reflects a

mapping from a functional level [i.e. grammatical functional, i.e. my theta

subtree, D. L.] to a positional level of sentence planning

This summarizes the two phrase structure composition operations that I propose

in this thesis: Adjoin-α and Project-α. As can be seen, these involve (1) the intermingling

of structures (and are not simply bottom up), and (2) satisfaction of

closed class elements. Let us now turn to the general acquisition side of the

problem.

It was said above that this thesis was unique in that the acquisition sequence

and the syntax (in particular, the syntactic derivation) were not considered

in isolation, but rather in tandem. The acquisition sequence can be viewed as the

output of derivational processes. Therefore, to the extent to which the derivation

is partial, the corresponding stage of the acquisition sequence can be seen as a

subgrammar of the full grammar. The yoking of the acquisition sequence and the

syntax is therefore the following:

(7) Acquisition: subgrammar approach

Syntax: phrase structure composition from smaller units



The subgrammar approach means that children literally have a smaller grammar

than the adult. The grammar increases over time by adding new structures (e.g.

relative clauses, conjunctions), and by adding new primitives of the representational

vocabulary, as in the change from pure theta composed speech, to theta

and Case composed speech.

The addition of new structures (e.g. relative clauses and conjunctions)

may be thought of as follows. A complex sentence like that in (8) may be

thought of as a triple: the two units, and the operation composing them (8b).

(8) a. The man saw the woman who loved him.

b. (the man saw the woman (rooted), who loved him, Adjoin-α)

Therefore a subgrammar, if it is lacking the operation joining the units, may be

thought of as simply taking one of the units (let us say the rooted one) and

letting go of the other unit (plus letting go of the operation itself). This is

possible and necessary because it is the operation itself which joins the units: if

the operation is not present, one or the other of the units must be chosen. The

subgrammar behind (8a), but lacking the Adjoin-α operation, will therefore generate

the structure in (9) (assuming that it is the rooted structure which is chosen).

(9) The man saw the woman.

This is what is wanted.
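The reasoning from (8) to (9) can be put as a small Python sketch. It rests on strong simplifications that are mine, not the thesis's: the two units are plain strings, adjunction is modelled as attachment at the right edge rather than as a structural operation, and the names Derivation, adjoin, generate, GRAMMAR_1 and GRAMMAR_2 are hypothetical.

```python
from typing import NamedTuple


class Derivation(NamedTuple):
    rooted: str     # the rooted unit
    adjunct: str    # the unit to be adjoined
    operation: str  # name of the operation composing the two units


def adjoin(rooted, adjunct):
    """Toy stand-in for Adjoin-alpha: attach the adjunct at the right edge."""
    return f"{rooted} {adjunct}"


OPERATIONS = {"Adjoin-alpha": adjoin}

# A grammar is modelled here only by the set of operations it contains.
GRAMMAR_1 = frozenset()                  # lacks Adjoin-alpha
GRAMMAR_2 = frozenset({"Adjoin-alpha"})  # Grammar 1 plus Adjoin-alpha


def generate(derivation, grammar):
    """Compose the units if the grammar has the operation; otherwise keep the rooted unit."""
    if derivation.operation in grammar:
        return OPERATIONS[derivation.operation](derivation.rooted, derivation.adjunct)
    return derivation.rooted


example_8b = Derivation("the man saw the woman", "who loved him", "Adjoin-alpha")
print(generate(example_8b, GRAMMAR_1))
print(generate(example_8b, GRAMMAR_2))
```

With the operation missing, the triple in (8b) collapses to its rooted unit, which is the sentence in (9); with the operation present, the full sentence in (8a) results.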

Note that the subgrammar approach (in acquisition), and the phrase structure

composition approach (in syntax itself) are in perfect parity. The phrase structure

composition approach gives the actual operation dividing the subgrammar from

the supergrammar. That is, with respect to this operation (Adjoin-α), the

grammars are arranged in two circles: Grammar 1 containing the grammar itself,

but without Adjoin-α, and Grammar 2 containing the grammar including Adjoin-α.

(10) [ Grammar 2 (w/ Adjoin-α) [ Grammar 1 ] ]

The above is a case of adding a new operation.

The case of adding another representational primitive is yet more interesting.



Let us assume that the initial grammar is a pure representation of theta relations.

At a later stage, Case comes in. This hypothesis is one of the layering of vocabulary:

one type of representational vocabulary comes in, and does not displace,

but rather is added to, another.

(11) Stage I      Stage II

theta    →    theta + Case

The natural lines along which this representational addition takes place are

precisely given by the operation Project-α. The derivation may again be thought

of as a triple: the two composing structures, one a pure representation of theta

relations, and one a pure representation of Case, and the operation composing them.

(12) ((man (see woman)), (the __ (see (a __))), Project-α)

The instances of see in the theta tree and the Case frame each contain partial

information, which is unified in the Project-α operation.

The subgrammar is one of the two representational units: in this case, the unit

(man (see woman)). That is a sort of theta representation or telegraphic speech.

The sequence from Grammar 0 to Grammar 1 is therefore given by the addition

of Project-α.

(13) [ Grammar 1 (w/ Project-α) [ Grammar 0 ] ]
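The unification step in (12) can be sketched in Python as follows. The sketch flattens both structures into parallel lists of positions, which discards the hierarchy that the actual operation works on, and the names THETA_SUBTREE, CASE_FRAME, project and spell_out are hypothetical; it is meant only to show open class items filling closed class slots while theta and Case information end up on the same node.

```python
# Theta subtree (open class), flattened: (word, theta role).
THETA_SUBTREE = [("man", "agent"), ("see", None), ("woman", "patient")]

# Case frame (closed class), flattened: (word, Case).
CASE_FRAME = [("the", "nom"), ("see", None), ("a", "acc")]


def project(theta_subtree, case_frame):
    """Toy stand-in for Project-alpha: unify the two structures position by position."""
    if len(theta_subtree) != len(case_frame):
        raise ValueError("theta subtree and Case frame must have matching positions")
    merged = []
    for (open_word, role), (closed_word, case) in zip(theta_subtree, case_frame):
        if role is None and case is None:
            # Verb position: the verb is present in both structures and must match.
            if open_word != closed_word:
                raise ValueError("verb mismatch between theta subtree and Case frame")
            merged.append({"cat": "V", "words": [open_word]})
        else:
            # Nominal position: the open class noun fills the empty slot after
            # the determiner; theta and Case features land on the same DP.
            merged.append({"cat": "DP", "words": [closed_word, open_word],
                           "theta": role, "case": case})
    return merged


def spell_out(structure):
    """Read the words off the unified structure, left to right."""
    return " ".join(word for node in structure for word in node["words"])


# Grammar 0: the theta subtree alone, i.e. telegraphic speech.
telegraphic = " ".join(word for word, _ in THETA_SUBTREE)

# Grammar 1: the theta subtree projected into the Case frame.
full = project(THETA_SUBTREE, CASE_FRAME)

print(telegraphic)
print(spell_out(full))
```

Verbal inflection is left out of the sketch, so the spelled-out verb stays in its bare form.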

The full pattern of stage-like growth is shown in the chart below:

(14) Acquisition: Subgrammar Approach

Add construction operations to simplified tree:    Relative clauses, Conjunction (not discussed here)

Add primitives to representational vocabulary:     Theta → Theta + Case

As can be seen, the acquisition sequence and the syntax (the syntactic derivation) are tightly yoked.

Another way of putting the arguments above is in terms of distinguishing



accounts. I wish to distinguish the phrase structure operations here from Merge;

and the acquisition subgrammar approach here from the alternative, which is the

Full Tree, or Full Competence, Approach (the Full Tree Approach holds that the

child does not start out with a substructure, but rather has the full tree, at all

stages of development). Let us see how the accounts are distinguished, in turn.

Let us start with Chomsky's Merge. According to Merge, the (adult) phrase

structure tree, as in Montague (1974), is built bottom-up, taking individual

units and joining them together, and so on. The chief property of Merge is that

it is strictly bottom-up. Thus, for example, in a right-branching structure like "see

the big man", Merge would first take "big" and "man" and Merge them together,

then add "the" to "big man", and then add "see" to the resultant.

(15) Application of Merge to "see the big man" (V Det Adj N):

Step 1: [N [Adj big] [N man]]

Step 2: [DP [Det the] [NP [Adj big] [N man]]]

Step 3: [VP [V see] [DP [Det the] [NP [Adj big] [N man]]]]

The proposal assayed in this thesis (Lebeaux 1988) would, however, have a

radically different derivation. It would take the basic structure as being the basic

government relation: (see man). This is the primitive unit (unlike with Merge).

To this, "the" and "big" may be added, by separate transformations, Project-α

and Adjoin-α, respectively.



(16) a. Project-α

Theta subtree: [V [V see] [N man]]

Case Frame: [V [V (see)] [DP [Det the] [NP e]]]

Result: (V see (DP the man))

b. Adjoin-α

(V see (DP the man)) + (ADJ big) → (V see (DP the (NP big man)))

How can these radically distinct accounts (Lebeaux 1988 and Merge) be

empirically distinguished? I would suggest in two ways. First, conceptually the

proposal here (as in Chomsky 1975 [1955], 1957, and Tree Adjoining Grammars,

Kroch and Joshi 1985) takes information nuclei as its input structures, not

arbitrary pieces of string. For example, for the structure "The man saw the

photograph that was taken by Stieglitz", the representation here would take the

two clausal nuclear structures, shown in (17) below, and adjoin them. This is not

true for Merge, which does not deal in nuclear units.

(17) s1: the man saw the photograph

s2: that was by Stieglitz

Adjoin-α → the man saw the photograph that was by Stieglitz

Even more interesting nuclear units are implicated in the transformation

Project-α, where the full sentence is decomposed into a nuclear unit which is the

theta subtree, and the Case Frame.



(18) The man saw the woman

Theta subtree: (man (see woman))

Case Frame: (the __ (see (a __)))

The structure in (18), "the man saw the woman", is composed of a basic nuclear

unit, (man (see woman)), which is telegraphic speech (as argued for in Chapter 2).

No such nuclear unit exists in the Merge derivation of "the man saw the

woman": that is, in the Merge derivation, (man (see woman)) does not exist as

a substructure of ((the man) (saw (the woman))).

This is the conceptual argument for preferring the composition operation

here over Merge. In addition, there are two simplicity arguments, of which I will

give just one here.

The simplicity argument has to do with a set of structures that children

produce which are called replacement sequences (Braine 1976). In these sequences,

the child is trying to reach (output) some structure which is somewhat too

difficult for him/her. To make it, therefore, he or she first outputs a substructure,

and then the whole structure. Examples are given below: the first line is the first

outputted structure, and the second line is the second outputted structure, as the

child attempts to reach the target (which is the second line).

(19) see ball (first output)

see big ball (second output and target)

(20) see ball (first output)

see the ball (second output and target)

What is striking about these replacement sequences is that the child does not

simply first output random substrings of the final target, but rather that the first

output is an organized part of the second. Thus in both (19) and (20), what the

child has done is first isolate out the basic government relation, (see ball), and

then added to it: with "big" and "the", respectively.

The particular simplifications chosen are precisely what we would expect with

the substructure approach outlined here, and crucially not with Merge. With the

substructure approach outlined here (Chapters 2 and 4), what the child (or adult) first

has in the derivation is precisely the structure (see ball), shown in example (21).



(21) [V [V see] [N(+patient) ball]]

To this structure are then added other elements, by Project-α or Adjoin-α. Thus,

crucially, the first structure in (19) and (20) actually exists as a literal substructure

of the final form (line 2), and thus could help the child in deriving the

final form. It literally goes into the derivation.

By contrast, with Merge, the first line in (19) and (20) never underlies the

second line. It is easy to see why. Merge is simply bottom-up: it extends the

phrase marker. Therefore, the phrase structure composition derivation underlying

(20) line 2, is simply the following (Merge derivation).

(22) Merge derivation underlying (20) line 2

(N ball)

(DP (D the) (N ball))

(see (DP (D the) (N ball)))

However, this derivation crucially does not have the first line of (20) (see (ball))

as a subcomponent. That is, (see (ball)) does not go into the making of (see

(the ball)), in the Merge derivation, but it does in the substructure derivation.

But this is a strong argument against Merge. For the first line of the

outputted sequence of (20), (see ball), is presumably helping the child in

reaching the ultimate target (see (the ball)). But this is impossible with Merge,

for the first line in (20) does not go into the making of the second line, according
to the Merge derivation.

That is, Merge cannot explain why (see ball) would help the child get to the

target (see (the ball)), since (see ball) is not part of the derivation of (see (the

ball)), in the Merge derivation. It is part of the sub-derivation in the substructure

approach outlined here, because of the operation Project-α.
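The contrast between the two derivations can be made concrete with a small sketch. The Python below is only an illustration and not part of the original argument: trees are encoded as nested tuples (an assumed encoding), `merge_steps` follows (22), and `substructure_steps` is a simplified stand-in for the substructure derivation, in which the structure of (21) exists first and the determiner is added afterwards. The helper names `terminals`, `stages`, and `has_stage` are hypothetical.

```python
# Illustrative sketch only. Trees are nested tuples of the form
# (label, child, child, ...); a bare string is a word (terminal).

def terminals(tree):
    """Return the words of a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    words = []
    for child in children:
        words.extend(terminals(child))
    return words

# Merge derivation of "see the ball", following (22): strictly bottom-up.
merge_steps = [
    ("N", "ball"),
    ("DP", ("D", "the"), ("N", "ball")),
    ("V", ("V", "see"), ("DP", ("D", "the"), ("N", "ball"))),
]

# Simplified substructure derivation (an assumption for illustration):
# the structure (see ball) of (21) comes first, and the closed class
# element "the" is added to it afterwards.
substructure_steps = [
    ("V", ("V", "see"), ("N", "ball")),
    ("V", ("V", "see"), ("DP", ("D", "the"), ("N", "ball"))),
]

def stages(steps):
    """Spell out the word string of each intermediate structure."""
    return [" ".join(terminals(step)) for step in steps]

def has_stage(steps, utterance):
    """True if the utterance is the yield of some intermediate structure."""
    return utterance in stages(steps)

print(stages(merge_steps))         # ['ball', 'the ball', 'see the ball']
print(stages(substructure_steps))  # ['see ball', 'see the ball']
print(has_stage(merge_steps, "see ball"))         # False
print(has_stage(substructure_steps, "see ball"))  # True
```

Under these assumptions, the child's first output in (20) corresponds to an intermediate structure only in the substructure derivation, which is the point of the argument in the text.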

The above (see Chapters 2, 3, and 4) differentiates the sort of phrase

structure composition operations found here from Merge. This is in the domain

of syntax, though I have used language acquisition argumentation. In the
domain of language acquisition proper, the proposal of this thesis (the
hypothesis of substructures) must be contrasted with the alternative, which
holds that the child is outputting the full tree, even when the child is potentially

just in the one word stage: this may be called the Full Tree Hypothesis. These



differential possibilities are shown below. (For much additional discussion, see

Lebeaux 1991, 1997, 1998, in preparation.)

(23)                        Lebeaux (1988)                  Distinguished from
     Syntax                 phrase structure composition    Both: (1) no composition; (2) Merge
     Language Acquisition   subgrammar approach             Full Tree Approach

Let us now briefly distinguish the proposals here from the Full Tree Approach.
In the Full Tree Approach, the structure underlying a child sentence like "ball"
or "see ball" might be the following in (24). In contrast, the substructure

approach (Lebeaux, 1988) would assign the radically different representation,

given in (25).

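(24) Full Tree Approach

[Tree diagram: a full clausal skeleton, with IP dominating TP, AgrSP, AgrOP, and VP in turn, and with the DP, D, NP, T, AgrS, AgrO, and V nodes all present; every terminal position is empty (e) except the one filled by ball.]

(25) Substructure Approach

          V
        /   \
       V     N [+patient]
             |
            ball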

How can these approaches be distinguished? That is, how can a choice be made

between (25), the substructure approach, and (24), the Full Tree approach? I

would suggest briefly at least four ways (to see full argumentation, consult
Lebeaux 1997, to appear; Powers and Lebeaux 1998).

First, the subgrammar approach, but not the full tree approach, has some

notion of simplicity in representation and derivation. Simplicity is a much used
notion in science, for example in deciding between two equally empirically

adequate theories. The Full Tree Approach has no notion of simplicity: in

particular, it has no idea of how the child would proceed from simpler structures

to more complex ones. On the other hand, the substructure theory has a strong

proposal to make: the child proceeds from simpler structures over time to those

which are more complex. Thus the subgrammar point of view makes a strong

proposal linked to simplicity, while the Full Tree hypothesis makes none.

A second argument has to do with the closed class elements, and may be

broken up into two subarguments. The first of these arguments is that, in the Full

Tree Approach, there is no principled reason for the exclusion of closed class

elements in early speech (telegraphic speech). That is, both the open class and

closed class nodes exist, according to the Full Tree Hypothesis, and there is no

principled reason why initial speech would simply be open class, as it is. That is,

given the Full Tree Hypothesis, since the full tree is present, lexical insertion
could take place just as easily in the closed class nodes as the open class nodes.
The fact that it doesn't leaves the Full Tree approach with no principled reason

why closed class items are lacking in early speech.

A second reason concerning closed class items has to do with the

special role that they have in structuring an utterance, as shown by the work of

Garrett (1975, 1980), and Gleitman (1990). Since the Full Tree Approach gives

open and closed class items the same status, it has no explanation for why closed

class items play a special role in processing and acquisition. The substructure

approach, with Project-α, on the other hand, faithfully models the difference, by
having open class and closed class elements initially on different representations,



which are then fused (for additional discussion, see Chapter 4, and Lebeaux

1991, 1997, to appear).

A third argument against the Full Tree Approach has to do with structures
like "see ball" (natural) vs. "see big" (unnatural), given below.

(26) see ball (natural and common)

see big (unnatural and uncommon)

Why would an utterance like "see ball" be natural and common for the child
(maintaining the government relation), while "see big" is unnatural and uncommon?
There is a common sense explanation for this: "see ball" maintains the
government relation (between a verb and a complement), while "see" and "big"
have no natural relation. While this fact is obvious, it cannot be accounted for

with the Full Tree Approach. The reason is that the Full Tree Approach has all

nodes potentially available for use: including the adjectival ones. Thus there

would be no constraint on lexically inserting "see" and "big" (rather than "see"
and "ball"). On the substructure approach, on the other hand, there is a marked
difference: "see" and "ball" are on a single primitive substructure (the theta
tree), while "see" and "big" are not.

A fourth argument against the Full Tree Approach and for the substructure

approach comes from a paper by Laporte-Grimes and Lebeaux (1993). In this

paper, the authors show that the acquisition sequence proceeds almost sequentially
in terms of the geometric complexity of the phrase marker. That is, children
first output binary branching structures, then doubly binary branching, then triply

binary branching, and so on. This complexity result would be unexpected with

the Full Tree Approach, where the full tree is always available.

This concludes the four arguments against the Full Tree Approach, and for

the substructure approach in acquisition. The substructure approach (in acquisi-

tion) and the composition of the phrase marker (in syntax) form the two main

proposals of this thesis.

Aside from the main lines of argumentation, which I have just given, there

are a number of other proposals in this thesis. I just list them here.

(1) One main proposal which I take up in all of Chapter 5 is that the acquisition

sequence is built up from derivational endpoints. In particular, for some purposes,
the child's derivation is anchored in the surface, and only goes part of the

way back to DS. The main example of this can be seen with dislocated constitu-

ents. In examples like (27a) and (b), exemplifying Strong Crossover and a

Condition C violation respectively, the adult would not allow these constructions,

while the child does.



(27) a. *Which man_i did he_i see t? (OK for child)

b. *In John's_i house, he_i put a book t. (OK for child)

It cannot be simply said, for cases like (27b), that Condition C does not apply in the
child's grammar, because it does, in nondislocated structures (Carden 1986b).
The solution to this puzzle (and there exist a large number of similar puzzles
in the acquisition literature; see Chapter 5) is that Condition C in general applies

over direct c-command relations, including at D-Structure (Lebeaux 1988, 1991,

1998), and that the child analyzes structures like (27b) as if they were dislocated

at all levels of representation, thus never triggering Condition C (a similar

analysis holds of Strong Crossover, construed as a Condition C type constraint,

at DS, van Riemsdijk and Williams 1981). That is, the child derivation, unlike

the adult, does not have movement, but starts out with the element in a dislocat-

ed position, and indexes it to the trace. This explains the lack of Condition C and

Crossover constraints (shown in Chapter 5). It does so by saying that the child's

derivation is shallow: anchored at SS or the surface, and the dislocated item is

never treated as if it were fully back in the DS position.

This is the shallowness of the derivation, anchored in SS (discussed in

Chapter 5).

(2) A number of proposals are made in Chapter 2. One main proposal concerns
the theta tree. In order to construct the tree, one takes a lexical entry, and does

lexical insertion of open class items directly into that. This is shown in (28).

(28)        V
          /   \
         N     V
         |    / \
        man  V   N [patient]
             |   |
            see woman

This means that the sequence between the lexicon and the syntax is in fact a

continuum: the theta subtree constitutes an intermediate structure between those

usually thought to be in the lexicon, and those in the syntax. This is a radical

proposal.

A second proposal made in Chapter 2 is that X′ projections project up as far
as they need to. Thus if one assumed the X′-theory of Jackendoff (1977) (as I
did in this thesis; recall that Jackendoff had 3 X′ levels), then an element
might project up to the single bar level, double bar level, or all the way up to the

triple bar level, as needed.



(29) N′′′
      |
     N′′
      |
     N′
      |
     N

This was called the hypothesis of submaximal projections.

A final proposal of Chapter 2 is that the English nominal system is ergative.

That is, a simple intransitive noun phrase like that in (30), with the subject in the
subject position (of the noun phrase) is always derived from a DS in which the

subject is a DS object. Crucially, this includes not simply unaccusative verbs (i.e.

nominals from unaccusative verbs) but unergative verbs as well (such as sleeping

and swimming).

(30) a. John's sleeping

derived from: the sleeping of John (subject internal)

b. John's swimming

derived from: the swimming of John (subject internal)

This means that the English nominal system is actually ergative in character,
a startling result.

Some final editorial comments. For space reasons in this series, Chapter 5 in the

original thesis has been deleted, and Chapter 6 has been re-numbered Chapter 5.

Second, I have maintained the phrase structure nodes of the original trees, rather

than trying to update them with the more recent nodes. The current IP is

therefore generally labelled S (sentence), the current DP is generally labelled NP

(noun phrase), and the current CP is sometimes labelled S′ (S-bar, the old name
for CP). Third, the term "dislocation" in Chapter 5 is intended to be neutral by

itself between moved and base-generated. The argument of that section is that

wh-elements which are moved by the adult, are base generated in dislocated

positions by the child. Finally, I would like to thank Lisa Cheng and Anke de

Looper for helpful editorial assistance.



Introduction

This work arose out of an attempt to answer three questions:

I. Is there a way in which the Government-Binding theory of Chomsky (1981)

can be formulated so that the leveling in it is more essential than in the
current version of the theory?

II. What is the relation between the sequence of grammars that the child adopts,

and the basic formation of the grammar, and is there such a relation?

III. Is there a way to anchor Chomsky's (1981) finiteness claim that the set of

possible human grammars is finite, so that it becomes a central explanatory

factor in the grammar itself?

The work attempts to accomplish the following:

I. To provide for an essentially leveled theory, in two ways: by showing that

DS and SS are clearly demarcated by positing operations additional to

Move-α which relate them, and by suggesting that there is an ordering in

addition by vocabulary, the vocabulary of description (in particular, Case

and theta theory) accumulating over the derivation.

II. To relate this syntactically argued for leveling to the acquisition theory,

again in two ways: by arguing that the external levels (DS, the Surface, PF)

may precede S-structure with respect to the induction of structure, and by

positing a general principle, the General Congruence Principle, which relates
acquisition stages and syntactic levels.

III. To give the closed class elements a crucial role to play: with respect to

parametric variation, they are the locus of the specification of parametric

difference, and with respect to the composition of the phrase marker: it is

the need for closed class (CC) elements to be satisfied which gives rise to

phrase marker composition from more primitive units, and which initiates

Move-α as well.

In terms of syntactic content, Chapters 2–4 deal with phrase structure (both the
acquisition and the syntactic analysis thereof), and Chapter 5 deals with the

interaction of indexing functions, Control and Binding Theory, with levels of

representation, particularly as it is displayed in the acquisition sequence.



Thematically, a number of concerns emerge throughout. A major concern is

with closed class elements and finiteness. With respect to parametric variation,
I suggest that closed class elements are the locus of parametric variation. This

guarantees finiteness of possible grammars in UG, since the set of possible

closed class elements is finite.1 With respect to phrase structure composition, it

is the closed class elements, and the necessity for their satisfaction, which require

the phrase marker to be composed, and initiate movement as well (e.g. Move-wh

is in a 1-to-1 correspondence with the lexical necessity: Satisfy +wh feature).

The phrase marker composition has some relation to the traditional generalized

transformations of Chomsky (1957), and they may apply (in the case of

Adjoin-α) after movement. But the composition that occurs is of a strictly
limited sort, where the units are demarcated according to the principles of GB.

Finally, closed class elements form a fixed frame into which the open class (OC)

elements are projected (Chapters 1, 2, and 4). More exactly, they form a Case

frame into which a theta sub-tree is projected (Chapter 4). This rule, I call

Merger (or Project-α).

A second theme is the relation of stages in acquisition to levels of grammat-

ical representation. Since the apparent difficulty of any theory which involves

the learning of transformations,2 the precise nature of the relation of the acquisi-

tion sequence to the structure of the grammar has remained murky, without a

theory of how the grammatical acquisition sequence interacts with, or displays

the structure of the grammar, and with, perhaps, many theoreticians believing

that any correspondence is otiose. Yet there is considerable reason to believe that

there should be such a correspondence. On theoretical grounds, this would be

expected for the following reason: The child in his/her induction of the grammar

is not handed information from all levels in the grammar at once, but rather from

particular picked out levels: the external levels of Chomsky (class lectures, 1985),
DS, LF, and PF or the surface.

These are contrasted to the internal level, S-structure. Briefly, information
from the external levels is available to the child: about LF because of the paired

meaning interpretation, from the surface in the obvious fashion, and from DS,

construed here simply as the format of lexical forms, which are presumably

given by UG. As such, the child's task (still!) involves the interpolation of

operations and levels between these relatively fixed points. But, this then means

1. Modulo the comments in Chapter 1, footnote 1.

2. Because individual transformations are no longer sanctioned in the grammar. I do not believe,

however, that the jury is yet in on the type of theory that Wexler and Culicover (1980) envisage.



that the acquisition sequence must build on these external levels, and display the

structure of the levels, perhaps in a complex fashion.

A numerical argument leads in the same direction: namely, that the acquisition
theory, in addition to being a parametric theory, should contain some

essential reference to, and reflect, the structure of the grammar. Suppose that, as

above, the closed class elements and their values are identified with the possible

parameters. Let us (somewhat fancifully) set the number at 25, and assume that

they are binary. This would then give 2^25 target grammars in UG (≈ 33.5 million),
a really quite small finite system. But consider the range of acquisition sequences
involved. If parameters are independent (a common assumption), then any
of these 25 parameters could be set first, then any of the remaining 24, and so
on. This gives 25! possible acquisition sequences for the learning of a single
language (≈ 1.5 × 10^25), a truly gigantic number. That is, the range of acquisition

sequences would be much larger than the range of possible grammars, and

children might be expected to display widely divergent intermediate grammars in

their path to the final common target, given independence. Yet they do nothing

of the sort; acquisition sequences in a given language look remarkably similar.

All children pass through a stage of telegraphic speech, and similar sorts of

errors are made in structures of complementation, in the acquisition of Control,

and so on. There is no wide fecundity in the display of intermediate grammars.
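The arithmetic behind this numerical argument can be checked directly. The short Python sketch below simply evaluates the two quantities under the text's own (admittedly fanciful) assumption of 25 independent binary parameters; the variable names are illustrative only.

```python
import math

n_parameters = 25  # the text's "somewhat fanciful" figure

# Each binary parameter doubles the number of possible target grammars.
target_grammars = 2 ** n_parameters

# If parameters are independent, they can be set in any order,
# giving n! possible acquisition sequences for a single language.
acquisition_sequences = math.factorial(n_parameters)

print(target_grammars)                 # 33554432
print(f"{acquisition_sequences:.2e}")  # 1.55e+25

# How many setting orders there are per possible grammar.
print(acquisition_sequences // target_grammars)
```

On these assumptions the space of setting orders exceeds the space of grammars by many orders of magnitude, which is what makes the observed uniformity of acquisition sequences striking.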

The way that has been broached in the acquisition literature to handle this

has been the so-called linking of parameters, where the setting of a single

parameter leads to another being set. This could restrict the range of acquisition

sequences. But, the theories embodying this idea have tended to have a rather

idiosyncratic and fragmentary character, and have not been numerous. The

suggestion in this work is that there is substructuring, but this is not in the

lexical-parametric domain itself (conceived of as the set of values for the closed

class (CC) elements), but in the operational domain with which this lexical
domain is associated. An example of this association was given above with the

relation of the wh-movement to the satisfaction of the +wh feature; another

example would be with satisfaction of the relative clause linker (the wh-element

itself), which either needs or does not need to be satisfied in the syntax. This

gives rise either to languages in which the relative forms a constituent with the
head (English-type languages), or to languages in which it is splayed out after the
main proposition (correlative languages).

(1)  Lexical Domain                                       Operational Domain
     +wh must be satisfied by SS                          Move-wh applies in syntax
     +wh may not be satisfied by SS                       Move-wh applies at LF
     Relative Clause linker must be satisfied by SS       English-type language
     Relative Clause linker may not be satisfied by SS    Correlative language

The theory of this work suggests that all operations are dually specified in the

lexical domain (requiring satisfaction of a CC lexical element) and in the

operational domain.

The acquisition sequence reflects the structure of the grammar in two ways:

via the General Congruence Principle, which states that the stages in acquisition

are in a congruence relation with the structure of parameters (see Chapter 3 for

discussion), and via the use of the external levels (DS, PF, LF) as anchoring

levels for the analysis: essentially, as the inductive basis. The General
Congruence Principle is discussed in Chapters 2–4, the possibility of broader

anchoring levels, in Chapter 5. The latter point of view is somewhat distinct

from the former, and (to be frank) the exact relation between them is not yet

clear to the author. It may be that the General Congruence Principle is a special

case, when the anchoring level is DS, or it may be that these are autonomous
principles. I leave this question open.

The third theme of this work has to do with levels or precedence relations

in the grammar. In particular, with respect to two issues: (a) Is it possible to

make an argument that the grammar is essentially derivational in character, rather

than in the representational mode (cf. Chomsky's 1981 discussion of Move-α)?

(b) Is there any evidence of intermediate levels, of the sort postulated in van

Riemsdijk and Williams (1981)? I believe that considering a wider range of

operations than Move-α may move this debate forward. In particular, I propose
two additional operations of phrase structure composition: Adjoin-α, which
adjoins adjuncts in the course of the derivation, and Project-α, which relates the

lexical syntax to the phrasal. With respect to these operations, two types of

precedence relations do seem to hold. First, operation/default organization holds

within an operation type. This is the case with Adjoin-α and its corresponding default,
Conjoin-α (i.e., two of the types of generalized transformations in Chomsky
1957 are organized as a single operation type, with an operation/default relation
between them). The other precedence relation is vocabulary layering, and this
holds between different operations, for example, Case and theta theory (see
Chapters 2, 3, and 4 for discussion). Further, operations like Adjoin-α may follow
Move-α, and this explains the anti-Reconstruction facts of van Riemsdijk and



Williams (1981); such facts cannot be easily explained in the representational

mode (see Chapter 3).

In general, throughout this work I will interleave acquisition data and theory

with pure syntactic theory, since I do not really differentiate between them.

Thus, the proposal having to do with Adjoin-α was motivated by pure syntactic

concerns (the anti-Reconstruction facts, and the attempt to get a simple descrip-

tion of licensing), but was then carried over into the acquisition sphere. The

proposal having to do with the operation of Project-α (or Merger) was formulated
first in order to give a succinct account of telegraphic speech (and, to a lesser
degree, to account for speech error data), and was then carried over into the
syntactic domain. To the extent to which this type of work is successful, the two
areas, pure syntactic theory and acquisition theory, may be brought much closer,

perhaps identified.



Chapter 1

A Re-Definition of the Problem

1.1 The Pivot/Open Distinction and the Government Relation

For many years language acquisition research has been a sort of weak sister in

grammatical research. The reason for this, I believe, lies not so much in its own

intrinsic weakness (for a theoretical tour de force, see Wexler and Culicover

1980, see also Pinker 1984), but rather, as in other unequal sibships, in relation.

This relation has not been a close one; moreover the lionizing of the theoretical

importance of language acquisition as the conceptual ground of linguistic

theorizing has existed in uneasy conscience alongside a real practical lack of

interest. Nor is the fault purely on the side of theoretical linguistics: the acquisition
literature, especially on the psychological side, is notorious for having

drifted further and further from the original goal of explaining acquisition, i.e.

the sequence of mappings which take the child from G0 to the terminal grammar

Gn, to the study of a different sort of creature altogether, Child Language (see

Pinker 1984, for discussion and a diagnostic).

1.1.1 Braine's Distinction

Nonetheless, even in the psychological literature, especially early on, there were
a number of proposals of quite far-reaching importance which would, or could,
have (had) a direct bearing on linguistic theory, and which pointed the way to
theories far more advanced than those available at the time. One example is
Braine's (1963a) postulation of pivot-open structures in early grammars. Braine

essentially noticed and isolated three properties of early speech: for a large

number of children, the vocabulary divided into two classes, which he called

"pivot" and "open". The pivot class was "closed class", partly in the sense that it
applies in the adult grammar (e.g., containing prepositions, pronouns, etc.) but
partly also in the broader sense: it was a class that contained a small set of
words which couldn't be added on to, even though these words corresponded to



those which would ordinarily be thought of as open class (e.g. come); these

words operated on a comparatively large number of open class elements. An
example of the Braine data is given below.

(1) Steven's word combinations

    want baby            see record
    want car             see Stevie
    want do
    want get             whoa cards
    want glasses         whoa jeep
    want head
    want high            more ball
    want horsie          more book
    want jeep
    want more            there ball
    want page            there book
    want pon             there doggie
    want purse           there doll
    want ride            there high
    want up              there momma
    want byebye car      there record
                         there trunk
    it ball              there byebye car
    it bang              there daddy truck
    it checker           there momma truck
    it daddy
    it Dennis            that box
    it X etc.            that Dennis
                         that X etc.
    get ball
    get Betty            here bed
    get doll             here checker
                         here doll
    see ball             here truck
    see doll
                         bunny do
                         daddy do
                         momma do



The second property of the pivot/open distinction noticed by Braine was that

pivot and open are positional classes, occurring in a specified position with
respect to each other, though the positional ordering was specific to the pivot
element itself (P1 Open, Open P2, etc.) and hence not to be captured by a
general phrase structure rewrite rule: S → Pivot Open. This latter fact was used
by critical studies of the time (Fodor, Bever, and Garrett 1974, for example) to
argue that Braine's distinction was somehow incoherent, since the one means of

capturing such a distinction, phrase structure rules, required a general collapse

across elements in the pivot class which was simply not available in the data.

The third property of the pivot/open distinction was that the open class

elements were generally optional, while the pivot elements were not.
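The second property, that position is fixed per pivot word rather than for the pivot class as a whole, can be illustrated with a small sketch. The Python below is only an illustration: the utterances are a handful taken from Steven's data in (1), while the choice of which words count as pivots (`pivots`) and the helper name `pivot_positions` are assumptions made for the example, not Braine's own classification procedure.

```python
from collections import defaultdict

# A few two-word combinations from Steven's data in (1).
utterances = [
    "want baby", "want car",
    "see ball", "see doll",
    "more ball", "more book",
    "bunny do", "daddy do", "momma do",
]

# Assumed pivot words for this illustration.
pivots = {"want", "see", "more", "do"}

def pivot_positions(utterances, pivots):
    """Record, for each pivot word, the positions in which it occurs."""
    positions = defaultdict(set)
    for utterance in utterances:
        first, second = utterance.split()
        if first in pivots:
            positions[first].add("first")
        if second in pivots:
            positions[second].add("second")
    return dict(positions)

positions = pivot_positions(utterances, pivots)

# Each pivot word is positionally consistent in this sample...
each_pivot_fixed = all(len(p) == 1 for p in positions.values())

# ...but the pivot class as a whole is not, so a single rule
# "S -> Pivot Open" cannot describe every pivot.
class_positions = set().union(*positions.values())
single_rule_suffices = len(class_positions) == 1

print(each_pivot_fixed)      # True
print(single_rule_suffices)  # False
```

In this sample, "want", "see", and "more" occur only first and "do" only second, which is the P1 Open versus Open P2 pattern described in the text.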

1.1.2 The Government Relation

What is interesting from the perspective of current theory is just how closely

Braine managed to isolate analogs not to the phrase structure rule descriptions

popular at that time, but to the central relation primitives of the current theory.

Thus the relation of pivot to open classes may be thought of as that between

governor and governed element, or perhaps more generally that of head to

complement; something like a primitive predication or small clause structure (in

the extended sense of Kayne 1984) appears to be in evidence in these early

structures as well:

(2) Steven word utterances:

    it ball              that box
    it bang              that Dennis
    it checker           that doll
    it X, etc.           that Tommy
                         that truck

    there ball           here bed
    there book           here checker
    there doggie         here doll
    there X, etc.        here X, etc.

    Andrew word combinations:

    boot off             airplane all gone
    light off            Calico all gone
    pants off            Calico all done
    shirt off            salt all shut
    shoe off             all done milk
    water off            all done now
                         all gone juice
    clock on there       all gone outside
    up on there          all gone pacifier
    hot in there
    X in/on there, etc.

    Gregory word combinations:

    byebye plane         allgone shoe
    byebye man           allgone vitamins
    byebye hot           allgone egg
                         allgone lettuce
                         allgone watch
                         etc.

The third property that Braine notes, the optionality of the open constituent with

respect to the pivot, may also be regularized to current theory: it is simply the

idea that heads are generally obligatory while complements are not.

The idea that the child, very early on, is trying to determine the general

properties of the government relation in the language (remaining neutral for now
about whether this is case or theta government) is supported by two other facts
as well: the presence of what Braine calls "groping patterns" in the early data,
and the presence of what he calls "formulas of limited scope". The former can be
seen in the presence of the allgone constructions in Andrew's speech. The

latter refers simply to the fact that in the very early two-word grammars, the set

of relations possible between the two words appears limited in terms of the

semantic relations which hold between them. This may be thought of as showing
that the initial government relation is learned with respect to specific lexical
items, or cognitively specified subclasses, and is then collapsed between them.
See also later discussion. The presence of groping patterns (i.e., the presence,
in two-word utterances, of patterns in which the order of elements is not fixed for
lexically specific elements) corresponds to the original experimentation in

determining the directionality of government (Chomsky 1981, Stowell 1981). The

presence of groping patterns is problematic for any theory of grammar which

gives a prominent role to phrase structure rules in early speech, since the order

of elements must be fixed for all elements in a class. See, e.g., the discussion in

Pinker (1984), which attempts, unsuccessfully I believe, to naturalize this set of

data. To the extent to which phrase structure order is considered to be a deriva-

tive notion, and the government-of relation the primitive one, the presence of

lexically specific order difference is not particularly problematic, as long as the




directionality of government is assumed to be determined at first on a word-by-

word basis.

1.2 The Open/Closed Class Distinction

Braine's prescient analysis was attacked in the psychological literature on both

empirical and especially theoretical grounds; it was ignored in the linguistic

literature. The basis of the theoretical attack was that the pivot/open distinction,

being lexically specific with respect to distribution, would not be accommodated

in a general theory of phrase structure rules (as already mentioned above);
moreover, the particular form of the theory adopted by Braine posited a radical

discontinuity in the form of the grammar as it changed from a pivot/open

grammar to a standard Aspects-style PS grammar. This latter charge we may

partly defuse by noting that there is no need to suppose a radical discontinuity
in the form of the grammar as it changed over time: the pivot/open grammar is

simply contained as a subgrammar in all the later stages. However, we wish to

remain neutral, for now, on the general issue of whether such radical discontinu-

ities are possible. The proponents of such a view, especially the holders of the

view that the original grammar was essentially semantic (i.e. thematically

organized), held the view in either a more or less radical form. The more

extreme advocates (Schlesinger 1971) held not simply that there was a radical

discontinuity, but that the primitives of later stages (syntactic primitives like
case and syntactic categories like noun or noun phrase) were constructed out

of the primitives of the earlier stages: a position one may emphatically reject.

Other theoreticians, however, in particular Melissa Bowerman (Bowerman 1973,

1974) held that there was such a discontinuity, but without supposing any

construction of the primitives of the later stages from those of the earlier. We
return, in detail, to this possibility below.

More generally, however, the charge that the pivot/open class stage presents

a problem for grammatical description appears to dissolve once the government-

of relation is taken to be the primitive, rather than the learning of a collection of

(internally coherent) phrase structure rules.

However, more still needs to be said about Braine's data. For it is not

simply the case that a rudimentary government relation is being established, but

that this is overlaid, in a mysterious way, with the open/closed class distinction.

Thus it is not simply that the child is determining the government-of and
predicate-of relations in his or her language, but also that the class of governing


elements is, in some peculiar way, associated with a distributional class: namely,

that of closed class elements.

While the central place of the government-of relation in current theory gives

us insight into one-half of Braine's data, the role of the closed class/open class

distinction, though absolutely pervasive in both Braine's work and in all the

psycholinguistic literature (see Garrett 1975, Shattuck-Hufnagel 1974, Bradley

1979, for a small sample) has remained totally untouched. Indeed, even the

semantic literature, which has in general paid much more attention to the

specifier relation than transformational-generative linguistics, does not appear to

have anything to say that would account for the acquisition facts.

What could we say about the initial overlay of the closed class elements and the set of governors? The minimal assumption would be something like this:

(3) The set of canonical governors is closed class.

While this is an interesting possibility, it would involve, for example, including

prepositions and auxiliary verbs in the class of canonical governors, but not main

verbs. Suppose that we strengthen (3), nonetheless.

(4) Only closed class elements may govern.

What about verbs? Interestingly, a solution already exists in the literature: in fact, two of them. Stowell (1981) suggests that it is not the verb per se which governs

its complements, but rather the theta grid associated with it. Thus the comple-

ments are theta governed under coindexing with positions in the theta grid. And

while the class of verbs in a language is clearly open class and potentially

infinite, the class of theta grids is equally clearly finite: a member of a closed,

finite set of elements. Along the same lines, Koopman (1984) makes the

interesting, though at first glance odd, suggestion that it is not the verb which

Case-governs its complements, but Case-assigning features associated with the

verb. She does this in the context of a discussion of Stowell's Case adjacency

requirement for case assignment; a proposal which appears to be immediately

falsified by the existence of Dutch, a language in which the verb is VP final, but

the accusative marked object is at the left periphery of the VP. Koopman saves

Stowell's proposal by supposing that the Case-assigning features of the verb are

at the left periphery, though the verb itself is at the right. This idea, that the two

aspects of the verb are separable in this fashion, will be returned to, and supported,

below. What is crucial for present purposes is simply to note that the Case-governing

properties of the verb are themselves closed class, though the set of verbs is not. Thus both the Case-assigning and theta-assigning properties of the


verb are closed class, and we may assume that these, rather than some property

of the open class itself, enter into the government relation.

There is a second possibility, less theory-dependent. This is simply that, as

has often been noted, there is within the open part of the vocabulary of

language a subset which is potentially closed: this is the so-called basic vocabu-

lary of the language, used in the teaching of basic English, and other languages.

The verb say would presumably be part of this closed subset, but not the verb

mutter; the same would hold of their translations. The child's task may be viewed as centering on

the closed class elements in the less abstract sense of lexical items, if these are

included in the set.

1.2.1 Finiteness

While the syntactic conjecture that the Case features on the verb are governing

its object has been often enough made, the theoretical potential of such a

proposal has not been realized. In essence, this proposal reduces a property of an

open class of elements, namely verbs, to a property of a closed class of elements

(the Case features on verbs). Insofar as direction of government is treated as a

parameter of variation across languages, by reducing government directionality

to a property of a closed class set, the two sorts of finiteness, lexical and

syntactic, are joined together. The finiteness of syntactic variation (Chomsky

1981) is tied, in the closest possible way, to the necessary finiteness of a lexical

class (and the specifications associated with it).

Let us take another example. English allows wh-movement in the syntax;

Chinese, apparently, apportions it into LF (Huang 1982). This is a parametric

difference in the level of derivation at which a particular operation applies.

However, this may well be reducible to a parametric difference in a closed class

element. Let us suppose, following Chomsky (1986), that wh-movement is movement into the specifier position of C.

Ordinarily it is assumed that lexical selection (of the complement-taking

verb) is of the head. Let us assume, likewise, that the matrix verb must select for

a +/-wh feature in Comp. This, in turn, must regulate the possible appearance

of the wh-word appearing in the specifier position of C. We may assume that

some agreement relation holds between these two positions, in direct analog to

the agreement relation which exists generally between specifier and head

positions, e.g. with respect to case. Thus the presence of the overt wh-element in

Spec C is necessary to agree with, or saturate, the +wh feature which is base-generated in Comp. What then is the difference between English and Chinese? Just

this: the agreeing element in Comp must be satisfied at S-structure in English,


while it need only be satisfied at LF in Chinese. This difference, in turn, may

be traced to some intrinsic property of agreement in the two languages, we might hope.

(5) wonder [C'' who [C' Comp [I'' [NP John] [I' I [VP [V saw] [NP e]]]]]]

If this sketch of an analysis is correct, or something like it is, then the

parametric difference between English and Chinese with respect to wh-movement is reduced to a difference in the lexical specification of a closed class

element.1 Since the possible set of universal specifications associated with a

closed class set of elements is of necessity finite, the finiteness conjecture of

Chomsky (1981) would be vindicated in the strongest possible way. Namely, the

finiteness in parametric variation would be tied, and perhaps only tied, to the

finiteness of a group of necessarily finite lexical elements, and the information

associated with them.
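
The counting argument behind this claim can be made concrete with a minimal sketch. It assumes, purely for illustration, a toy inventory of two closed class specifications; the names `wh_agreement_level` and `case_direction` and their value sets are hypothetical labels for the wh-agreement and government-direction examples discussed above, not an inventory proposed in the text. If each closed class element admits only finitely many specifications, the number of possible grammars is the product of those finite counts, and so is itself finite.

```python
from itertools import product
from math import prod

# Hypothetical toy inventory: each closed class element is paired with the
# finite set of specifications it may carry. Both names and values are
# illustrative assumptions, not a claim about the actual inventory of UG.
CLOSED_CLASS_SPECS = {
    "wh_agreement_level": ("S-structure", "LF"),
    "case_direction": ("left", "right"),
}


def count_grammars(specs):
    """Number of grammars = product of the sizes of the finite value sets."""
    return prod(len(values) for values in specs.values())


def enumerate_grammars(specs):
    """List every combination of specifications as a dict (one per grammar)."""
    names = list(specs)
    return [
        dict(zip(names, combo))
        for combo in product(*(specs[name] for name in names))
    ]


if __name__ == "__main__":
    print(count_grammars(CLOSED_CLASS_SPECS))
    for grammar in enumerate_grammars(CLOSED_CLASS_SPECS):
        print(grammar)
```

Under these assumptions, adding a further closed class element multiplies the number of grammars by the size of its value set, but the total remains finite as long as the inventory and each value set are finite.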

1.2.2 The Question of Levels

There is a different aspect of this which requires note. The difference between

Chinese and English with respect to wh-movement is perhaps associated with

features on the closed class morpheme, but this shows up as a difference in the

appearance of the structure at a representational level. I believe that this is in

1. I should note that the term closed class element here is being used in a somewhat broader sense than usual, to encompass elements like the +wh feature. The finiteness in the closed class set cannot

be that of the actual lexical items themselves, since these may vary from language to language, but

in the schema which defines them (e.g. definite determiner, indefinite determiner, Infl, etc.).


general the case: namely, that while information associated with a closed class

element is at the root of some aspect of parametric variation, this difference often evidences itself in the grammar by a difference in the representational level at

which a particular operation applies. We may put this in the form of a proposal:

(6) The theory of UG is the theory of the parametric variation in the

specifications of closed class elements, filtered through a theory of

levels.

I will return throughout this work to more specific ways in which the conjecture

in (6) may be fleshed out, but I would like to return at this point to two aspects

which seem relevant. First is the observation made repeatedly by Chomsky (1981, 1986a), that while the set of possible human languages is (at least

conjecturally) finite, they appear to have a wide scatter in terms of surface

features. Why, we might ask, should this be the case? If the above conjecture (6)

is correct, it is precisely because of the interaction of the finite set of specifica-

tions associated with the closed class elements, and the rather huge surface

differences which would follow from having different operations apply at

different levels. The information associated with the former would determine the

latter; the latter would give rise to the apparent huge differences in the description

of the world's languages, but would itself be tied to a parametric variation

in a small, necessarily finite set.

How does language acquisition proceed under these circumstances? Briefly,

it must proceed in two ways: by determining the properties of lexical specifications

associated with the closed class set, the child determines the structure of the

levels; by determining the structure of the levels he or she determines the

properties of the closed class morphemes. The proposal that the discovery of

properties associated with closed class lexical items is central obviously owes a

lot to Borer's (1985) lexical learning hypothesis, that what the child learns, and

all that he/she learns is associated with properties of lexical elements. It consti-

tutes, in fact, a (fairly radical) strengthening of that proposal, in the direction of

finiteness. Thus while the original lexical learning hypothesis would not guaran-

tee finiteness in parametric variation, the version adopted in (6) would, and thus

may be viewed as providing a particular sort of grounding for Chomsky's

finiteness claim.

However, the proposal in (6) contains an additional claim as well: that the

difference in the specifications of closed class elements cashes out as a difference

in the level at which various operations apply. Thus it provides an outline of the way that the gross scatter of languages may be associated with a finite range.


1.3 Triggers

1.3.1 A Constraint

The theory of parametric variation or grammatical determination has often been

linked with a different theory: that of triggers (Roeper 1978b, 1982, Roeper and

Williams 1986). A trigger may be thought of, in the most general case, as a

piece of information, on the surface string, which allows the child to determine

some aspect of grammatical realization. The idea is an attractive one, in that it

suggests a direct connection between a piece of surface data and the underlying

projected grammar; it is also in danger, if left further undefined, of becoming nearly vacuous as a means of grammatical description. A trigger, as it is

commonly used, may apply to virtually any property of the surface string which

allows the child to make some determination about his or her grammar.

There is, as is usual in linguistic theory, a way to make an idea more

theoretically valuable: that is, by constraining it. This constraint may be either

right or wrong, but it should, in either case, sharpen the theoretical issues involved.

In line with the discussion earlier in the chapter, let us limit the content of

trigger in the following way:

(7) A trigger is a determination in the property of a closed class element.

Given the previous discussion, the differences in the look of the output grammar

may be large, given that a trigger has been set. The trigger-setting itself, however, is

aligned with the setting of the specification of a closed class element.

There are a number of instances of triggers in the input which must be re-examined

given (7) above; there are, however, at least two very good instances

of triggers in the above sense which have been proposed in the literature. The

first is Hyams' (1985, 1986, 1987) analysis of the early dropping of subjects in English. Hyams suggests that children start off with a grammar which is

essentially pro-drop, and that English-speaking children then move to an English-

type grammar, which is not. These correspond to developmental stages in which

children initially allow subjects to drop, filter out auxiliaries, and so on (as a first

step), to one in which they do not do so (as the second step). The means by

which children pass from the first grammar to the second, Hyams suggests, is by

means of the detection of expletives in the input. Such elements are generally

assumed not to exist in pro-drop languages; the presence of such elements would

thus allow the child to determine the type of the language that he or she was facing.
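
The logic of such a trigger can be sketched as a one-bit learner. This is only an illustrative rendering of the account as summarized above, not Hyams' own formalization: the class names `Utterance` and `Learner`, and the assumption that each input utterance arrives already annotated for whether its subject is an expletive, are hypothetical simplifications (detecting expletives from the raw string is itself a substantive problem that the sketch sets aside).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """An input sentence, with an assumed annotation for expletive subjects."""
    words: tuple
    has_expletive_subject: bool  # hypothetical annotation, taken as given


@dataclass
class Learner:
    """Learner whose only state is the setting of the pro-drop parameter."""
    pro_drop: bool = True  # initial grammar is pro-drop, on the account above

    def observe(self, utterance):
        # The trigger: an expletive in the input is taken as evidence that
        # the target language is not pro-drop, so the parameter is reset.
        if utterance.has_expletive_subject:
            self.pro_drop = False


def learn(inputs):
    """Run a fresh learner over a sequence of utterances and return it."""
    learner = Learner()
    for utterance in inputs:
        learner.observe(utterance)
    return learner


if __name__ == "__main__":
    english_like = [
        Utterance(("want", "cookie"), False),
        Utterance(("it", "is", "raining"), True),
    ]
    no_expletives = [Utterance(("want", "cookie"), False)]
    print(learn(english_like).pro_drop)
    print(learn(no_expletives).pro_drop)
```

In this sketch the parameter changes only on exposure to the triggering datum; input without expletives leaves the initial setting untouched, which is the sense in which the trigger is tied to the specification of a closed class element.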


1.3.2 Determining the base order of German

The other example of a trigger, in the sense of (7) above, is found in Roeper's

(1978b) analysis of German. While German sentences are underlyingly verb-final

(see Bierwisch 1963, Bach 1962, Koster 1975, and many others), the verb may

show up in either the second or final position.

(8) a. Ich sah ihn.

I saw him.

b. Ich glaube dass ich ihn gesehen habe.

I believe that I him seen have

Roeper's empirical data suggests that the child analyses German as verb-final at

a very early stage. However, this leaves the acquisition question open: how does

the child know that German is verb final?

Roeper proposes two possible answers:

(9) i. Children pay attention to the word order in embedded, not

matrix clauses.

ii. Children isolate the deep structure position of the verb by

reference to the placement of the word not, which is always at the end of the sentence.

At first, it appears that the solution (i) is far preferable. It is much more general,

for one thing, and it also allows a natural tie-in with theory: namely, Emonds'

(1975) conception, that various transformations apply in root clauses which are

barred from applying in embedded contexts. However,
