Language Research, Volume 31, Number 2, June 1995.0254-4474/309-334 309


Prepositions and the Names of Countries and Islands: A Local Grammar for the Automatic Analysis of Texts¹

Mylène Garrigues

The expression of movement in French has given rise to a large amount of linguistic research centred on the attempt to attribute a particular meaning to a given syntactical form. Thus, for example, the use of alternatives such as the prepositions à/sur is explained in terms of a semantic opposition, with à marking "the conclusion or goal of a displacement", and sur "the direction of a rapid displacement" ("le terme d'un mouvement" and "la direction d'un mouvement rapide" respectively, G. Gougenheim, 1962). This view, illustrated by a number of examples, presupposes that it is enough to learn the "meaning" of a preposition in order to acquire the necessary competence for its production in any context.

What is presented here is an analogous study which, however, leads to a dismantling of this view. The systematic examination of circumscribed points, which is, after all, necessary to the formalisation of a linguistic phenomenon for automatic processing, shows a remarkable absence of any such correlation between meaning and form, whilst revealing a highly complex pattern of lexico-syntactic relationships. This reversal of the initial position leads us to envisage the learning of the usage of a preposition in terms of the memorisation not of a meaning, but of a multitude of grammars corresponding to the different contexts, or even micro-contexts, in which it occurs. These grammars, of which we shall provide graphical representations, reveal also, and in a tangible way, the extreme complexity of our cognitive linguistic system.

1. Definition of the Predicate

Along more rigorous lines, the object of our research was to develop a formal grammar of the use of prepositional complements with the names of countries, of continents and of islands, in the context of a semantic predicate defined by the following scenario.

(1) A human being undertakes a displacement to an explicit destination (Ndest = Nland, Ncont or Nisl).

¹ This paper has been translated by Ivan Birks.

It was, in other words, a question of describing all the syntactic occurrences of this predicate by formulating a grammar of the constructions involving a verb of movement accompanied by these complements. We focused our attention on this particular type of complement because it enabled us:

- to reconsider the classic problem of the prepositions which accompany geographical nouns by listing them, and organising them much more rigorously than had been done previously;

- to describe two situations, of which one (countries and continents) is relatively regular, and the other (islands) is of exemplary complexity.

Over and above countries, continents and islands, we intend to extend our research into the domain of seas, regions, towns, etc. in order to cover, progressively, the geographical field in its entirety. We furthermore intend to work on the formulation of the complementary grammar, that of complements of origin (venir d'Italie/to come from Italy).

The elementary sentences (subject-verb-complement) which correspond to our scenario are, therefore, of the following type:

Jo a regagné la France
Jo has arrived back in/regained France

Jo vole vers les États-Unis
Jo is flying to the US

Jo se rend au Maroc
Jo is going to Morocco

Jo part en Afrique
Jo is leaving for Africa

La Sardaigne attire les touristes
Sardinia attracts tourists

Since we were dealing with sentences which incorporate prepositions of place, it was necessary to describe all the other scenarios involving these prepositions in order to avoid confronting ourselves subsequently with a muddled situation, incompatible with the process of formalisation.

To illustrate this observation, we have provided three examples from a series of similar scenarios (M. Garrigues, 1993) which we were careful to distinguish from (1):

(2) A human being undertakes a displacement for which no limit is established:

Jo parcourt la France
Jo is exploring France

Jo sillonne les Pays-Bas
Jo is travelling the length and breadth of the Netherlands

Jo survole les États-Unis
Jo is flying over the United States

(3) A human being undertakes an action. The action may or may not involve displacement, but a displacement is implicit in relation to the activity in question. The prepositional complement is in this case called a complement of scenery (J. P. Boons, A. Guillet, C. Leclère, 1976):

Jo travaille au Maroc
Jo is working in Morocco

Jo s'amuse en Angleterre
Jo is having fun in England

Jo a fait la guerre au Vietnam
Jo fought in Vietnam

(4) A human being displaces an object or a human being from one place to another:

Jo a emmené ses élèves en Angleterre
Jo has taken his pupils to England

Jo a envoyé un colis au Brésil
Jo has sent a parcel to Brazil

It will be noted, furthermore, that the definition of the predicate requires that the field of application it operates on should be made explicit (countries, continents or islands). We can observe that it is enough for us to change the field of application of our predicate, in order for a specific and unpredictable distribution of prepositions to emerge. If, for example, we apply the scenario of our predicate not to the geographical field of countries and islands, but to a highly limited spatial field, such as that of the rooms in a house, the result is the emergence of a particular grammar which governs its own prepositions and exceptions. We can compare the following examples:

Jo va dans la chambre      *à la chambre     (Jo goes into the bedroom)
Jo va dans la cuisine      à la cuisine      (Jo goes into the kitchen)
Jo va dans le vestibule    *au vestibule     (Jo goes into the hall)
Jo va dans le salon        au salon          (Jo goes into the living-room)

The same phenomenon can be observed in any other semantic field. If we take the case of a second field, just as restricted, that of "open spaces", the same arbitrariness reappears in the choice of the preposition, as the following examples indicate:

Jo va   sur la route             *dans la route            *à la route
Jo va   *sur la rue              dans la rue               *à la rue
Jo va   sur la place             *dans la place            ?à la place
Jo va   *sur le jardin public    dans le jardin public     au jardin public
Jo va   sur le terrain vague     dans le terrain vague     ?au terrain vague

(Jo goes onto/into/to the road/street/square/park/waste ground)

As we can see, no rule can be formulated which would enable us to predict globally these occurrences. There is, therefore, no alternative but to proceed by examining and describing each case one by one, within the context of the paradigm of each field taken into consideration. In view of this requirement, it is easy to appreciate that traditional grammars and manuals, whose purpose is not to be used in the context of the automatic processing of a language, make an appreciable, but fundamentally inadequate contribution. As linguistic theory evolves, however, towards the formalisation of the morpho-syntactic functions of a language, with a view to enabling the automatic processing of the latter, the development of such grammars will become necessary on a very large scale. The enormity of the task should soon become apparent.

Our method of formalisation consists in the representation of linguistic facts by means of graphs known as finite state automata (more specifically, DAGs or Directed Acyclic Graphs; M. Gross, 1989). We have, therefore, described and listed the verb-preposition-noun combinations resulting from scenario (1) using graphs. These graphs permit the regrouping of families of expressions according to their variations and common features. The graphs should be read from left (initial position) to right (final position). Each of the paths in the graph leading from the initial position to the final position corresponds to an acceptable expression. The degree of formalisation which results from this method of representation means that these grammars can be applied directly to the automatic analysis of texts. This is achieved by means of software applications which translate these graphs automatically into analysers capable of recognising in a text the expressions thus described.
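The reading of a graph as a set of paths can be illustrated with a minimal sketch. It is not the LADL software: the graph `TOY_GRAPH`, its state names and the helper functions are hypothetical, and the handful of words in it are taken from the example sentences of this paper. Each path from the initial state to the final state yields one accepted expression.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy graph: each state maps to a list of (label, next_state) edges.
Graph = Dict[str, List[Tuple[str, str]]]

TOY_GRAPH: Graph = {
    "start": [("Jo", "verb")],
    "verb": [("va", "prep"), ("se rend", "prep")],
    "prep": [("au", "masc"), ("en", "fem")],
    "masc": [("Maroc", "end"), ("Japon", "end")],
    "fem": [("Italie", "end"), ("Pologne", "end")],
    "end": [],
}


def paths(graph: Graph, state: str = "start", final: str = "end") -> List[List[str]]:
    """Enumerate the label sequences of every path from `state` to `final`."""
    if state == final:
        return [[]]
    found = []
    for label, next_state in graph[state]:
        for tail in paths(graph, next_state, final):
            found.append([label] + tail)
    return found


def expressions(graph: Graph) -> Set[str]:
    """The set of expressions described by the graph, one per path."""
    return {" ".join(labels) for labels in paths(graph)}


def recognise(graph: Graph, sentence: str) -> bool:
    """True if the sentence corresponds to some path of the graph."""
    return sentence in expressions(graph)
```

Because the graph is acyclic, the enumeration always terminates. A real analyser would match paths inside running text rather than compare whole sentences; that part is left out here.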

2. The Syntactic Permutations

The syntactic permutations of our predicate can be represented by three possible structures, applicable to all fields.

(a) Nhum V Prep Nloc
    Jo va (dans le salon, en Italie)
    Jo goes into the living room/into Italy

(b) Nhum V Nloc
    Jo regagne (sa chambre, l'Italie)
    Jo regains her bedroom/Italy

(c) Nloc V Nhum
    (Ce restaurant, l'Italie) attire les touristes
    This restaurant/Italy attracts tourists

Our grammar will account for the realisation of syntactic structures corresponding to (a) and (b). The syntactic structures of type (c) are less specific, and can be processed and added at a later stage. Furthermore, for reasons we have already provided, we have momentarily limited the Nloc destination to Nland, Ncont and Nisl, taken from the strictly geographical field which we propose to cover subsequently. Each field of application, even one as limited as that of the rooms of a house, possesses its own grammar, and, as a result, must be isolated and analysed in detail. We can conceive of no other satisfactory way of proceeding.

The existence of a lexicon grammar of French verbs enabled us to select the simple verbs relevant to the task in hand. In building up an inventory of verbs of movement we relied on the following tables:

For the prepositional permutation (a):

- Table T2 (M. Gross, 1975)
- Table T35L (J. P. Boons, A. Guillet, C. Leclère, 1976)
- Table 38L0 (A. Guillet, C. Leclère, 1992)

This category also comprises sentences with a composed or periphrastic verb form, such as faire un saut à (drop into), faire une excursion à (take a trip to), faire un tour à (have a wander in). Our graph includes only a few of these forms. We intend to create a sub-graph which accounts for them at a later stage.

For the two non-prepositional permutations (b) and (c):

- Table 38L1 (A. Guillet, C. Leclère, 1992)

3. Semantic Compatibility in V Nloc

Once we had made an inventory of these verbs we conducted a secondary selection based on factors of plausibility and semantic compatibility. The criterion of plausibility led us to discard, on the one hand, verbs such as pagayer (paddle/row) or amerrir (to land on water), since the plausibility of their occurrence in everyday language is slight (M. Garrigues, 1993). On the other hand we discarded verbs such as se glisser (slide into) and se faufiler (to worm into): even though the plausibility of their occurring is considerable in everyday language, it is slight in the context of our predicate:

? Jo pagaye vers l'Italie          Jo paddles towards Italy
Jo vogue vers l'Italie             Jo wanders towards Italy
? Jo s'est glissé en Italie        Jo slid into Italy
Jo s'est glissé dans la pièce      Jo slid into the room

We then eliminated verbs with zero plausibility, due to their semantic incompatibility, or minimal compatibility with the following noun (G. Gross, 1992). Marcher sur (to walk/march on), meaning aller vers (to go towards) is logically incompatible with the name of an island, and is less plausible with the name of a country than with the name of a town. This meaning of the verb marcher is, in the event, generally linked to a military context:

?? Les alliés marchent sur la Sardaigne    The allies march on Sardinia
? Les alliés marchent sur l'Italie         The allies march on Italy
Les alliés marchent sur Rome               The allies march on Rome

Similarly, monter (go up to) and descendre (go down to) are generally used with the names of towns, rather than with the names of countries, although it does seem to us that the two verbs present certain differences:

Monter à Paris            (to go up to Paris)
? Monter en Belgique      (to go up to Belgium)
Descendre à Toulouse      (to go down to Toulouse)
Descendre en Espagne      (to go down to Spain)

4. The Group V Prep Nloc

We have observed that in the case of the linguistic realisation of a phrase of type V Prep Nloc, the choice of the preposition is dependent on the prior selection of the verb and corresponding noun. For this reason we consider this group to constitute an indivisible phrasal unit. If, for example, we take the verb aller (go), and attempt to associate it with the three prepositions it is supposed to accept (à, dans, en: to/in), we realise that we must discover the noun, before producing the preposition retroactively. In other words the choice of the preposition is bound by a double constraint imposed simultaneously by the verb and the noun:

Aller   à la cuisine              dans la cuisine              en cuisine
Aller   *à la chambre             dans la chambre              *en chambre
Aller   à la salle de bains       dans la salle de bains       *en salle de bains
Aller   à la salle d'opération    dans la salle d'opération    en salle d'opération
Aller   *à la banlieue            ?dans la banlieue            en banlieue
Aller   aux États-Unis            *dans les États-Unis         *en États-Unis
Aller   *aux Vosges               dans les Vosges              *en Vosges
Aller   *à la Sardaigne           *dans la Sardaigne           en Sardaigne
Aller   à la Guadeloupe           *dans la Guadeloupe          en Guadeloupe
Aller   à la plage                *dans la plage               *en plage
Aller   à la forêt                dans la forêt                en forêt

(To go [à/dans/en: to/in] the kitchen, the bedroom, the bathroom, the operating theatre, the suburbs, the US, the Vosges mountains, Sardinia)

Inversely, if we establish a noun, and try to use three prepositions (à, vers, pour: to, towards, for) apt to precede it, before we can decide on the acceptability of these prepositions we must know to which verb this noun is linked.

Aller          à l'école      vers l'école     *pour l'école
Se rendre      à l'école      *vers l'école    *pour l'école
Partir         à l'école      vers l'école     pour l'école
S'acheminer    *à l'école     vers l'école     *pour l'école
Rentrer        à l'école      *vers l'école    *pour l'école

(to [go/leave/set out/return] [to/towards/for] school)
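The double constraint can be sketched as a lookup keyed on the pair (verb, noun) rather than on the preposition alone. The table below simply transcribes the école judgements given above; the names `ACCEPTABLE`, `prepositions_for` and `is_acceptable` are hypothetical and serve only as an illustration.

```python
from typing import Dict, Set, Tuple

# Acceptable prepositions for each (verb, noun) pair, transcribed from the
# "école" examples. Starred combinations are simply absent from the sets.
ACCEPTABLE: Dict[Tuple[str, str], Set[str]] = {
    ("aller", "école"): {"à", "vers"},
    ("se rendre", "école"): {"à"},
    ("partir", "école"): {"à", "vers", "pour"},
    ("s'acheminer", "école"): {"vers"},
    ("rentrer", "école"): {"à"},
}


def prepositions_for(verb: str, noun: str) -> Set[str]:
    """Prepositions licensed by this verb and this noun taken together."""
    return ACCEPTABLE.get((verb, noun), set())


def is_acceptable(verb: str, preposition: str, noun: str) -> bool:
    """True if the sequence V Prep N is listed as acceptable."""
    return preposition in prepositions_for(verb, noun)
```

Neither the verb nor the noun alone is a sufficient key: the same noun gives a different set of prepositions under each verb.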

It would appear that we are in the presence of an operation in two stages. The first involves a latent programme of signification defined in terms of a semantic predicate. The second intervenes at the moment of the linguistic realisation of the predicate, and can be seen as an incredibly complex procedure which establishes a relationship between nominal and verbal groups. Out of the calibration of the noun and verb results the production of a preposition. Thus the latent programme of signification is entrenched in an indivisible lexico-syntactic unit V Prep Nloc, which constitutes the utterance. As we saw with marcher sur, even the nature of the subject can interfere. We must, therefore, consider that the basic minimal unit determining the preposition is the simple sentence N0 V Prep Nloc. The implications in terms of cognitive mechanisms which result from this hypothesis are currently unverifiable. We can say, however, that it is more in line with the requirements of a process of automatic formalisation than the results achieved by traditional grammars. It is also true that in the field of artificial intelligence (R. C. Schank, 1982) the process leading to the generation of an action has been described as a sequence of discriminatory operations proceeding from more abstract forms of representation to more concrete ones, resulting in the realisation of an action. It is true that in the field of infra-linguistics the ultimate goal is not, as with artificial intelligence, an action, but an utterance. This goal could, however, be achieved by a similar process, starting with a predicate 'schema' and, by means of a series of successive forks in a 'syntactic script', arriving at a final utterance.

In order to produce the 'syntactic script' of our predicate in the form of an automaton, we started with an inventory of selected verbs of movement, as we explained above. We then organised the different names of the countries, continents and islands into groups (cf. Fig. 1).

Each group, or box, in our main graph bears a name indicating its morphosyntactic properties. The name in a shaded box (or node) refers to a sub-graph which contains a list of the relevant countries, and which would be consulted upon execution of the automaton. When the box is unshaded, as is the case with Israël (Israel), it means that the node is explicit, and does not refer to a sub-graph.
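The distinction between shaded and unshaded boxes can be sketched as follows. This is an illustrative model only: the `Node` class, the `is_subgraph` flag and the `SUBGRAPHS` table are assumptions, and the three names placed in `LandDetZ` are the ones quoted in the examples of section 4.1, not the full content of the graph.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    label: str
    is_subgraph: bool = False  # True models a shaded box, False an explicit one


# Hypothetical excerpt of one sub-graph: a name mapped to the words it lists.
SUBGRAPHS: Dict[str, List[str]] = {
    "LandDetZ": ["Panama", "Singapour", "Bahrein"],
}


def expand_node(node: Node) -> List[str]:
    """A shaded node is replaced by the content of its sub-graph when the
    automaton is executed; an explicit node stands for itself."""
    if node.is_subgraph:
        return SUBGRAPHS[node.label]
    return [node.label]


def expand_path(path: List[Node]) -> List[str]:
    """All word sequences obtained by expanding each node of one path."""
    choices = [expand_node(node) for node in path]
    return [" ".join(words) for words in product(*choices)]
```

For instance, `expand_path([Node("aller"), Node("à"), Node("LandDetZ", True)])` yields one expression per country listed in the sub-graph, whereas a path ending in the explicit node `Node("Israël")` yields a single expression.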

4.1. Names of Countries

- Graph LandDetZ. This graph contains the names of countries which are used without a determiner, i.e. with determiner 'zero' (cf. Fig. 2):

(Aller à, partir pour) (Panama, Singapour, Bahrein)
(to go to/leave for ...)

- Node Israel. Although Israël has determiner zero, it does not feature in the previous group of countries since it functions differently, as the following examples demonstrate:

Partir pour    Panama     Singapour     Israël
Aller à        Panama     Singapour     *Israël
Aller en       *Panama    *Singapour    Israël

Furthermore, as we can see in our basic automaton NVPrepN (cf. Fig. 1), Israël does not function like the names of countries whose article disappears when used with the preposition en (cf. graph LandVowel (Fig. 4) and graph LandFemCons (Fig. 5)):

Aller en        Israël    Italie     Grande-Bretagne
Avancer vers    Israël    *Italie    *Grande-Bretagne
Partir pour     Israël    *Italie    *Grande-Bretagne

(to go to/advance towards/leave for ...)

Given that the name of this country functions in a unique fashion, we placed it separately, in a specific node of the automaton.

[Fig. 1. NVPrepN]

[Fig. 2. PaysDetZ: Bahrein, Hong-Kong, Formose, Macao, Singapour, Panama]

- Node Etats-Unis (United States). This node contains the country names which are plural, and used with the plural definite article:

Aller aux (États-Unis, Pays-Bas [Netherlands])

- Graph LandMasCons. This graph contains the country names which are masculine, begin with a consonant, and are used with the masculine singular definite article (cf. Fig. 3):

Aller au (Bangladesh, Japon, Maroc [Morocco])

- Graph LandVowel. This graph contains the country names which begin with a vowel, and are used with the elided definite article, or without a determiner (cf. Fig. 4):

Partir pour l'Angleterre/Aller en Angleterre
(To leave for/go to England)

- Graph LandFemCons. This graph contains the country names which are feminine, begin with a consonant, and are used with the feminine definite article, or with no determiner (cf. Fig. 5):

Partir pour la Pologne/Aller en Pologne
(To leave for/go to Poland)

- Graph LandClass. This graph contains the country names which are plural, with a classifier ('Etats'/States, 'Pays'/Countries), and which, unlike the node Etats-Unis, do not take the preposition à, but do take the prepositions de, vers, sur, pour (cf. Fig. 6).

Aller aux (Pays-Bas, *Pays Baltes)
Aller dans les (Pays-Bas, Pays Baltes)

This sub-set also corresponds to the only country names which function with the preposition dans, and, as such, has been re-used in the context of the general graph NVDansN (cf. Fig. 11), devoted to this preposition.

- Graph Cont. This graph contains the names of the continents, and their variants (cf. Fig. 7).

[Fig. 3. PaysMasCons: Bangladesh, Bénin, Brésil, Burkina Faso, Burundi, Cameroun, Canada, Chili, Congo, Costa Rica, Danemark, Japon, Gabon, Ghana, Guatemala, Honduras, Groenland, Kenya, Koweït, Laos, Liban, Liberia, Luxembourg, Mali, Maroc, Mexique, Mozambique, Népal, Nicaragua, Niger, Nigeria, Paraguay, Pérou, Portugal, Qatar, Royaume-Uni, Rwanda, Salvador, Sénégal, Soudan, Sri-Lanka, Surinam, Swaziland, Tchad, Togo, Venezuela, Vietnam, Yémen, Zaïre, Zimbabwe, Machrek, Maghreb, Moyen-orient, Proche-orient, Pôle nord, Pôle sud]

[Fig. 4. PaysVoc: Afghanistan, Afrique du Sud, Albanie, Algérie, Allemagne, Angleterre, Angola, Arabie Saoudite, Argentine, Australie, Autriche, Écosse, Égypte, Équateur, Espagne, Éthiopie, Inde, Indochine, Indonésie, Irak, Iran, Irlande, Islande, Italie, Ouganda, URSS, Uruguay, Occident, Orient, Extrême-orient]

[Fig. 5. PaysFemCons: Belgique, Birmanie, Bolivie, Bosnie, Bulgarie, République Centrafricaine, Chine, Colombie, Communauté des États Indépendants, CEI, Communauté Économique Européenne, CEE, Corée, Côte d'Ivoire, République Dominicaine, Croatie, Estonie, Finlande, France, Gambie, Grande-Bretagne, Grèce, Guinée, Hollande, Hongrie, Jordanie, Lettonie, Libye, Lituanie, Malaisie, Mauritanie, Mongolie, Namibie, Norvège, Nouvelle Calédonie, Nouvelle Zélande, Papouasie, Pologne, Polynésie Française, République Fédérale Allemande, RFA, Roumanie, Russie, Serbie, Sierra Leone, Slovaquie, Slovénie, Somalie, Suède, Suisse, Syrie, Tanzanie, Tchécoslovaquie, République tchèque, Terre Adélie, Tunisie, Turquie, Yougoslavie, Zambie]

[Fig. 6. PaysClass: États de l'Afrique de l'Est, États de l'Afrique de l'Ouest, États de l'Est africain, États de l'Ouest africain, Pays Baltes, pays du Commonwealth, pays de la Communauté européenne, pays de l'Est, pays d'Europe centrale, pays du Levant, pays du Sud]

[Fig. 7. Cont]

The classification of country and continent names is facilitated by a certain number of identifiable, morphological criteria (masculine/feminine, vowel/consonant, etc.) which correspond to a relative regularity in their utilisation. The distributional organisation of the names of islands constitutes a problem of considerably greater complexity. It is this phenomenon that we intend to account for now.
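The relative regularity of country names can be sketched as a two-step lookup: a name belongs to one class, and each class fixes the form of the complement after aller. The sketch restricts itself to aller and to a few members per class quoted in this section; the dictionaries `CLASS_MEMBERS` and `ALLER_FORM` and the function `aller_phrase` are hypothetical names introduced for illustration.

```python
from typing import Dict, Set

# A few members of each class, taken from the examples and figures above.
CLASS_MEMBERS: Dict[str, Set[str]] = {
    "LandDetZ": {"Panama", "Singapour", "Bahrein"},
    "Israel": {"Israël"},
    "EtatsUnis": {"États-Unis", "Pays-Bas"},
    "LandMasCons": {"Bangladesh", "Japon", "Maroc"},
    "LandVowel": {"Angleterre", "Italie"},
    "LandFemCons": {"Pologne", "Grande-Bretagne"},
    "LandClass": {"Pays Baltes"},
}

# Form of the complement after "aller" for each class, as given in the text.
ALLER_FORM: Dict[str, str] = {
    "LandDetZ": "à {}",
    "Israel": "en {}",
    "EtatsUnis": "aux {}",
    "LandMasCons": "au {}",
    "LandVowel": "en {}",
    "LandFemCons": "en {}",
    "LandClass": "dans les {}",
}


def aller_phrase(name: str) -> str:
    """Build 'aller + complement' for a country name listed in some class."""
    for class_name, members in CLASS_MEMBERS.items():
        if name in members:
            return "aller " + ALLER_FORM[class_name].format(name)
    raise KeyError(name)
```

Class membership itself still has to be listed name by name, which is why Israël sits in a class of its own. No comparable table can be written for island names.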

4.2. Names of Islands

Independently of all syntactic considerations, the name of an island comprises a classifier of variable extension (l'île de [the island of] / les îles de [the X isles/islands]). The possible permutations are the following:

Singular constructions:

Det 'zero'-Nîle: Belle-Île est à une heure de bateau
Det-Nîle: La Guadeloupe est très à la mode en ce moment
L'île-Nîle: L'île Maurice est idéale pour les vacances
L'île de-Nîle: L'île de Pâques est célèbre pour ses statues
L'île de la/le-Nîle: L'île du Diable est dangereuse

Plural constructions:

Det-Nîle: Les Açores sont sous une dépression
Les îles-Nîle: Les îles Baléares sont plus fréquentables au printemps
Les îles de-Nîles: Les îles de Saint Pierre et Miquelon sont françaises
Les îles de Det-Nîle: Les îles du Cap-Vert sont difficiles à situer
L'archipel des Nîles: L'archipel des Comores est au nord de Madagascar

Classifiers can often be reduced, and sometimes even suppressed, although, as the following examples demonstrate, this procedure is unpredictable.

(L'île de la, la) Guadeloupe est très courue
(L'île de la, *la) Tortue est peu fréquentée
(Les îles, les) Maldives sont devenues plus accessibles
(Les îles, *les) Ioniennes sont connues

If, on the other hand, we try to expand a classifier, the results are equally unpredictable.

La (Corse, Crète) est très fréquentée
L'île de (*Corse, Crète) est très fréquentée
Les (Comores, Philippines) sont intéressantes
Les îles (*Comores, Philippines) sont intéressantes

The system of classifiers for island names can, therefore, only be developed on the basis of a detailed, case by case, examination. This general description would, however, not be adequate for the purposes of creating the necessary graphs for our grammar. A further complication arises when, as is the case with our current research, these place names are combined with verbs of movement. The interference between our general description, and the preposition gives rise to new, and equally unpredictable, restrictions. Thus we find the following examples:

Aller (à l'île de, *à la) Crète
Aller (à l'île de, à) Chypre
(to go to (the island of) Crete/Cyprus)

Unlike Chypre, la Crète is also to be found in the graph IslFemSing (cf. Fig. 10). However, as the following example shows, none of the other names contained in that graph could accompany la Crète in the graph IslPrep (cf. Fig. 8).

Aller (en, *à l'île de) (Corse, Sardaigne)
Aller (en, à l'île de) Crète
(to go to (the island of) Corsica/Sardinia/Crete)

Furthermore, a change of preposition can result in a change of behaviour. The following example illustrates the consequences of the substitution of à (to) by dans (in):

Aller à l'île de la Réunion, à la Réunion
Aller dans l'île de la Réunion, *dans la Réunion

These observations led us to adopt the following method in the construction of our graphs:

- Graph IslPrep. Graph of the island names which combine with prepositions other than dans. We placed the singular islands in the upper half of the automaton and the plural islands in the lower half (Fig. 8).

Débarquer à l'île de Ré
Se diriger vers les îles Canaries

- Graph IslDetZ. Graph of the island names which combine with prepositions other than dans, and which permit 'zero' determination (cf. Fig. 9).

Aller à (l'île de) Malte

- Graph IslFemSing. Graph of the feminine island names which take the preposition en (in/to) (cf. Fig. 10).

Aller en Corse

[Fig. 8. IlePrep]

Parallel to the graph NVPrepN (cf. Fig. 1), we devoted a general graph (NVDansN) to the use of the preposition dans (cf. Fig. 11).

[Fig. 9. IleDetZ]

[Fig. 10. IleFemSing: Corse, Crète, Guadeloupe, Haïti, Martinique, Sardaigne, Sicile]

[Fig. 11. NVDansN]

It is worth noticing that, with the exception of the category of names in graph LandClass, country names will not take the preposition dans. On the other hand, this preposition is used extensively with island names. It is important to notice that the mere introduction of the preposition dans in the general description entailed a binary re-organisation of the context to the left (V), and to the right (Nloc), resulting in a specific description. For the verbs, we had to extract from the verbal paradigm of our basic automaton NVPrepN (cf. Fig. 1) those verbs which function with the preposition dans accompanied by an island name:

(aller, partir, se rendre) dans les Baléares
(*avancer, *voler, *appareiller) dans les Baléares

On the other hand, for the nouns, we had to re-organise the graphs of the island names in function of the modifications entailed by the use of the preposition dans. Thus, of those island names listed in the graph IslFemSing (cf. Fig. 10), some cannot be used with dans, whereas others can, but only with the extended classifier:

aller en (Corse, Crète, Sardaigne)
aller dans la (*Corse, *Crète, *Sardaigne)
aller dans l'île de (*Corse, Crète, *Sardaigne)

Furthermore, among the island names listed in the graph IslPrep (Fig. 8), we notice that the group Barbade, Martinique, Trinité, etc. admits the possibility of two alternative classifiers, île de la / Det-Nisl:

aller à (l'île de la, la) (Barbade, Martinique, Trinité)
to go to [the island of (the) / ∅ (the)] (Barbados, Martinique, Trinidad)

However, with the preposition dans, this alternation is no longer available:

aller dans (l'île de la, *la) (Barbade, Martinique, Trinité)

On the other hand, for the group of plural island names (Aléoutiennes, Bahamas, etc.), from the same graph IslPrep, both alternatives for the classifier (Les îles / Det-Nisls) are acceptable, even with the preposition dans:

Aller (à les îles, à les) (Bahamas, Baléares, Seychelles)
Aller (dans les îles, dans les) (Bahamas, Baléares, Seychelles)

[Fig. 12. IleDans1; legible entries include Bornéo, Ceylan, Elbe, Guernesey, Java, Malte, Porquerolles, Rhodes, Saint-Vincent, Sainte-Lucie, Sumatra, Saint-Pierre et Miquelon]

Our approach was, therefore, to take the graphs of the island names

developed for use with prepositions other than dans, and to carry out in each

those alterations made necessary by the new left context and

the preposition dans. Thus we arrived at the following graphs:

Graph NVDansN derived from NVPrepN (cf. Fig. 11)

Graph IslDans1 derived from IslDetZ (cf. Fig. 12)


Fig. 13. IleDans3

Graph IslDans3 derived from IslPrep (cf. Fig. 13)

The graph IslDans2 is not derived, and includes those island names which

cannot be shortened (cf. Fig. 14).

We intend subsequently to extend this set of graphs devoted to the use of the

preposition dans to include graphs of the names of départements (≈

regions), since these are used regularly with this preposition.


Fig. 14. IleDans2

Fig. 15. NVN

The syntactic structure (b) of our predicate is represented in the graph

NVN (cf. Fig. 15). This graph provides a formal representation of sequences

such as:

Atteindre (les États-Unis, la Barbade)

To reach (the United States, Barbados)


Certain graphs (or modules) of our grammar could subsequently be reused

in the grammars of neighbouring scenarios. It is therefore

desirable, from a methodological point of view, to proceed in a highly

structured, modular fashion. These selective, limited grammars will, by linking

up with each other, progressively encompass all the scenarios which include

a locative preposition. On a more general level, the accumulation of

thousands of local grammars will ultimately encompass the morpho-syntactic

system of our language in its entirety. This approach, as developed and

used at the LADL, is made possible by the use of finite automata for

the representation of linguistic phenomena, as described above, and by the

existence of applications software (the INTEX system) which enables them to

be put into practice.
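The modular organisation described here, in which one graph calls another by name, can be illustrated with a short sketch. It is written in Python under stated assumptions and does not reproduce the INTEX format: the sub-graph names V_dans and IslDans are hypothetical, and each graph is reduced to a list of paths whose items are either words or names of other graphs.

```python
# A toy modular grammar: each named graph is a list of paths, and an item
# in a path is either a word or the name of another graph.
# "V_dans" and "IslDans" are hypothetical sub-graph names.
GRAMMAR = {
    "NVDansN": [["V_dans", "dans", "IslDans"]],
    "V_dans": [["aller"], ["partir"], ["se rendre"]],
    "IslDans": [["les", "Baléares"], ["l'île de la", "Martinique"]],
}


def expand(symbol, grammar):
    """Enumerate the word sequences generated by a word or a named graph."""
    if symbol not in grammar:              # a plain word, not a graph name
        return [[symbol]]
    results = []
    for path in grammar[symbol]:
        partial = [[]]
        for item in path:                  # concatenate the expansions
            partial = [p + s for p in partial for s in expand(item, grammar)]
        results.extend(partial)
    return results


if __name__ == "__main__":
    for words in expand("NVDansN", GRAMMAR):
        print(" ".join(words))
```

Because IslDans is referenced only by name, the same module could be called from another graph without being copied, which is the kind of reuse the paragraph above has in view.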

5. Computerisation and Implementation

An automata editor (FSGRAPH), implemented under the INTEX system

of textual engineering, enables the user to create graphs directly on the

computer screen. This means of construction is very easy to use, and

permits the immediate correction of the graph by adding or removing

elements inside the boxes, or nodes, and by adding or removing

the paths linking them. The advantage of this tool is that it

allows for the modular construction of the grammar, and for a direct

representation of the linguistic classification of phenomena (cf. D. Maurel,

1989). It is also capable of the automatic conversion of a graph into a

recognition automaton.
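To give an idea of what reading a graph as a recognition automaton involves, here is a minimal sketch in Python. It is an assumption-laden illustration and not the FSGRAPH conversion itself: the graph is a dictionary of boxes, the node names are hypothetical, and the content is limited to the NVN sequences cited earlier.

```python
# Each node maps to (words allowed in the box, successor nodes).
# A box set to None is empty and consumes no word. Node names are
# hypothetical; the words come from the NVN examples in the text.
NVN = {
    "start": (None, ["V"]),
    "V": ({"atteindre"}, ["DetPl", "DetFemSing"]),
    "DetPl": ({"les"}, ["NPl"]),
    "NPl": ({"États-Unis"}, ["end"]),
    "DetFemSing": ({"la"}, ["NFemSing"]),
    "NFemSing": ({"Barbade"}, ["end"]),
    "end": (None, []),
}


def recognises(graph, tokens, node="start"):
    """Return True if the tokens spell out a path from node to 'end'."""
    words, successors = graph[node]
    if words is not None:
        if not tokens or tokens[0] not in words:
            return False
        tokens = tokens[1:]                # the box consumes one word
    if node == "end":
        return not tokens                  # every word must be consumed
    return any(recognises(graph, tokens, nxt) for nxt in successors)
```

With this encoding, recognises(NVN, ["atteindre", "la", "Barbade"]) succeeds, while a sequence mixing the two paths, such as ["atteindre", "les", "Barbade"], is rejected.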

We have provided a list of examples of automatic recognition carried out

on the basis of such grammars. The corpus consists of a day's AFP

(Agence France Presse) dispatches (cf. Fig. 16). The system automatically

lemmatises the inflected form of each verb, and furnishes these forms as a

list of concordances.
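The lemmatisation and concordance step can be pictured with the following sketch in Python. It is only an illustration under simplifying assumptions, not the INTEX procedure: the lemma dictionary and the list of PrepNloc pairs are tiny hypothetical samples, and the verb is required to stand immediately before the preposition.

```python
import re

# Hypothetical samples: inflected verb form -> lemma, and a few
# (preposition, place name) pairs standing in for the PrepNloc graphs.
LEMMAS = {"va": "aller", "allé": "aller", "arrivé": "arriver"}
PREP_NLOC = {("en", "Chine"), ("en", "Israël"), ("au", "Japon")}


def concordances(text, width=20):
    """Return (lemma, left context, match, right context) for each hit."""
    tokens = list(re.finditer(r"\w+", text))
    rows = []
    for verb, prep, noun in zip(tokens, tokens[1:], tokens[2:]):
        lemma = LEMMAS.get(verb.group().lower())
        if lemma and (prep.group().lower(), noun.group()) in PREP_NLOC:
            left = text[max(0, verb.start() - width):verb.start()]
            match = text[verb.start():noun.end()]
            right = text[noun.end():noun.end() + width]
            rows.append((lemma, left, match, right))
    return rows


if __name__ == "__main__":
    for row in concordances("Il est arrivé en Israël hier."):
        print(row)
```

Each row pairs the lemma with the matched sequence and a window of context on either side, which is the general layout of a concordance line.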

It is also worth noticing that the PrepNloc elements of the automata we

have provided can be found as the complements of various verbs

(téléphoner en Chine), and as the complements of substantives derived from

verbs (l'arrivée au Japon, the arrival in Japan) and of non-derivational

forms (l'ambassade de France en Chine, the French embassy in China). This

demonstrates the considerable generalising power of this form of

representation.


Before going to Asia, where the first

after his arrival in Ethiopia

against the arrival in France of Japanese indu

at the moment of her arrival in Greece, gave birth in a h

had arrived in Israel as a result of

managed to reach Greece and 35 people als

I intend to return to Argentina but Japan interest

He (...) would return to Italy to become manag

6 young Albanians are taking refuge in Italy

was not able to go to France, said his spoke

are going to go to Sweden this summer

managed to go to Syria

managed to go to Yugoslavia

are invited to come to Europe", declared M. A

Fig. 16. Concordances, dépêches de l'AFP (Concordances, AFP dispatches)

References

Boons, Jean-Paul, Guillet, Alain, Leclère, Christian (1976) La structure des

phrases simples en français I: Constructions intransitives, 378, Geneva:

Droz.

Garrigues, Mylène (1993) Méthode de paramétrage des dictionnaires et

grammaires électroniques, Doctoral dissertation, Université Paris 7,

LADL.

Gougenheim, Georges (1962) Système grammatical de la langue française,

Paris: Editions d'Artrey.

Gross, Gaston (1992) Classes d'objets et description des langues, Report LLI,

Université de Paris XIII, Villetaneuse.

Gross, Maurice (1975) Méthodes en syntaxe, 414, Paris: Hermann.

Gross, Maurice (1989) 'The use of finite automata in the lexical representation

of natural language', Electronic Dictionaries and Automata in

Computational Linguistics, 34-50, Berlin-New York: Springer.

Guillet, Alain, Leclère, Christian (1992) La structure des phrases simples en

français. II. Constructions transitives locatives, 445, Geneva: Droz.

Maurel, Denis (1988) 'Grammaire des dates, Étude préliminaire à leur

traitement automatique', Linguisticae Investigationes, 12.1, 101-128.

Schank, Roger C. (1990) Tell Me a Story: A New Look at Real and Artificial

Memory, New York: Macmillan Publishing Company.

Silberztein, Max (1993) Dictionnaires électroniques et analyse automatique de

textes: Le système INTEX, 233, Paris: Masson.

LADL, Université Paris 7

2 place Jussieu, Tour Centrale,

F 75251, Paris Cedex 05

France
