word typology
Abbreviations
AGT – Agentive
DIST – Distal Demonstrative
F – Feminine
NF – Non-feminine
NONVIS – Non-visual
PN – Personal Name
PL – Plural
PRES – Present
PST – Past Tense
SG – Singular
Contents
1. Introduction
2. What is a word?
3. Kinds of words
3.1. The orthographic word
3.2. The phonological word
3.3. The grammatical word
3.4. Clitics
4. Mongsen Ao
4.1. The phonological word
4.2. The grammatical word
4.3. Clitics
5. Tariana
5.1. Nasalization
5.2. Aspiration
5.3. Vowel harmony
6. Conclusions
7. References
1. Introduction
For many years the definition of ‘word’ has been based on Eurocentric parameters inherited
from the classical tradition, that is, from the Latin and Ancient Greek grammarians. According
to Matthews (2003:268), in Latin dictio ‘word’ was the smallest unit of speech. For the Greeks and
the Romans, “the word was the basic unit for the statement of morphological patterns”, Dixon &
Aikhenvald (2003:2). For the ancient Latin grammarians, for example, Romam deleui, ‘I destroyed
Rome’, Matthews (2003:268), was two words. But is ‘word’ a universal concept? Perhaps the
definition seen above only works for a few European languages, those of Latin origin in particular.
It may also work for isolating languages such as Chinese or Vietnamese, which have little or no
inflectional morphology, but does it work for agglutinative or polysynthetic languages? We could
take an example from Mohawk, an Iroquoian language: ashakotya'tawitsherahetkvhta'se1 ‘He ruined
her dress’ (lit. He made the-thing-that-one-puts-on-one's body ugly for her). Is this a word?
Orthography plays tricks on us when we have to define ‘word’; the difference between Spanish and
Catalan, two similar languages, shows how tricky orthography can be. How many words do we have in
menja-t’ho (Catalan for “[you] eat it”) and cómetelo (Spanish for “[you] eat it”)? What if we wrote
*menjatho and *come-te-lo instead? Is a ‘word’ determined by orthography alone?
The aim of this essay is to define ‘word’, to identify what determines a word, and to consider how
clitics should be treated. We will take a look at the criteria used to establish words in two recently
written grammars, those of Tariana (2003) and Mongsen Ao (2007). Neither language has an
established orthography, and all the examples given in the books are phonetic transcriptions.
The two grammars do not offer the same kind of information on ‘word’, so the sections devoted to
each language here differ in content and structure.
1 http://en.wikipedia.org/wiki/Synthetic_language
2. What is a word?
Perhaps we could go to several dictionaries and take a look at a standard definition of ‘word’.
The following definitions are taken from three different dictionaries:
1- Word: “a sound or combination of sounds that has a meaning and is spoken or written.”
Merriam-Webster;
2- Word: “Significant unit consisting of one or more morphemes, characterized by a certain
functional autonomy within a sentence or phrase.”2 Diccionari de l’Institut d’Estudis
Catalans;
3- Word: “Discourse segment usually unified by the accent, the meaning and initial and final
potential breaks.”3 Diccionario de la Lengua española of the Real Academia Española.
The first definition is quite clear and simple, but it overlooks the fact that sign languages
also have words. A more plausible definition would therefore start something like: “a sound or a
sign or combination of sounds and signs…”. The second definition starts with “Significant unit
consisting of one or more morphemes…”. Since a morpheme can be defined as the minimal unit
with meaning, is the plural morpheme –s therefore a word? As for the third
definition, I would say that it is too abstract: what comprises this discourse segment?
A more linguistic approach to the meaning of ‘word’ would be:
“A word is a unit which is a constituent at the phrase level and above. It is sometimes identifiable
according to such criteria as being the minimal possible unit in a reply; having features such as regular
stress pattern and phonological changes conditioned by or blocked at word boundaries; being the
largest unit resistant to insertion of new constituents within its boundaries, or being the smallest
constituent that can be moved within a sentence without making the sentence ungrammatical.”
Glossary of Linguistic Terms, by SIL.
Certainly this definition seems to be a mixture of the definitions given above by the three
dictionaries, but, as with the other definitions, there is a feeling of abstractness to it, and I think
2 Original → Mot: unitat significativa que consta d’un o més morfemes, caracteritzada per una certa autonomia funcional dintre d’una oració o un sintagma.
3 Original → Palabra: segmento del discurso unificado habitualmente por el acento, el significado y pausas potenciales inicial y final.
that ‘word’ is a more palpable concept. Without wishing, or daring, to define ‘word’, I can affirm
that right now, as I am typing, I am writing many words and reading them at the same time, and
I am sure that many of them can be easily recognized.
However, the dictionary definitions we have seen, where ‘word’ is defined as a unit
of a language, still seem made by and for some European languages which have a synthetic
structure. But what about languages with polysynthetic morphology? Yup’ik, an Eskimo language
of Alaska, Mithun (1999:37), has words that, in their orthographic appearance, are roughly
equivalent to words in English, but it also has many words that are not. Among its nouns, for
example, we can take a look at the following: araq, which means ‘ash made from birch tree fungus
or other special plant products and then mixed with chewing tobacco’; caginraq, meaning ‘skin or
pelt of caribou taken just after the long winter hair has been shed in spring’; or andpartak ‘spruce
root stretched above water, from which hang a line of snares just above the water's surface, to
catch waterfowl’. These ‘words’ in Yup’ik look like English words; however, they express
concepts that in English, and in many other European languages, cannot be expressed in a
single orthographic word. These ‘words’ in Yup’ik “consist of multiple meaningful parts or
morphemes”, Mithun (1999:38). The following example, from the same source, is broken down into its morphemes:
(1) kaipiallrulliniuk
kaig-piar-llru-llini-u-k
Be.hungry-really-PAST-apparently-INDICATIVE-they.two
‘the two of them were apparently really hungry’
Languages such as Yup’ik that show a high number of morphemes per word are described as
polysynthetic. What are we to think of an example such as (1)? If we look at the Yup’ik,
we have one orthographic word, but if we look at the English translation we have a
sentence. At the same time, we must consider that Yup’ik writing convention states that this
‘word’ has to be written as one, with no spaces, while English writing convention says that we
have to place spaces between what English speakers consider separate words.
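
The contrast between one orthographic word and several morphemes in example (1) can be made concrete with a small sketch. The segmentation and glosses are those given in the example; the function name `align` and the idea of counting morphemes per word as a rough index of synthesis are illustrative assumptions, not something taken from Mithun.

```python
# Example (1) as segmented and glossed in the text.
segmented = "kaig-piar-llru-llini-u-k"
gloss = "be.hungry-really-PAST-apparently-INDICATIVE-they.two"

def align(segmented_form, gloss_line, sep="-"):
    """Pair each morpheme with its gloss (hypothetical helper).

    Both lines must contain the same number of hyphen-separated parts.
    """
    morphemes = segmented_form.split(sep)
    glosses = gloss_line.split(sep)
    if len(morphemes) != len(glosses):
        raise ValueError("morpheme and gloss counts differ")
    return list(zip(morphemes, glosses))

pairs = align(segmented, gloss)
morphemes_per_word = len(pairs)  # one orthographic word, several morphemes

for morpheme, meaning in pairs:
    print(f"{morpheme:8} {meaning}")
print("morphemes in this word:", morphemes_per_word)
```

The English translation of the same example needs a whole sentence, which is the asymmetry the paragraph above describes.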
According to Mithun (1999:38), “the best criterion is usually the judgment of native
speakers.” She affirms that it does not matter whether the speaker knows much grammar or not:
speakers are normally able to repeat a sentence word by word without much hesitation,
including all the needed stops between words. Mithun also states that in natural speech, speakers
rarely pause in the middle of words; if they are distracted in the middle of a word, they normally
go back to the beginning of the word, and they do so instinctively. At the same time, we must
consider that speakers might know the meaning of a whole word without being able to
split a word made up of several morphemes and give the meaning of each morpheme. All
these observations may apply to a word such as example (1). One of the features of
polysynthetic languages is that the morphemes that make up a word are not always fully
transparent, as in the case of the example seen above; therefore, it is very plausible that Yup’ik
speakers perceive this example as a single word and that is why it is written as one.
Words can also be identified by stress: a word has no more than one primary stress.
Mithun (1999:38), again, says that stress in Chitimacha, a language isolate from Louisiana (United
States), falls on the first syllable of a word, so every stressed syllable we hear signals the
start of a new word. In Tuscarora, a Northern Iroquoian language, it falls instead on the
penultimate syllable. Words can also be identified in some languages by their
morphological structure: in Yup’ik, for example, verbs always begin with a root morpheme like kaig-
‘be hungry’ (as seen in the previous example) and end with a pronoun like –k, Mithun (1999:38).
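
The idea that fixed stress reveals word boundaries can be sketched as follows. This is only an illustration under strong assumptions: the syllables are invented placeholders (not real Chitimacha or Tuscarora data), primary stress is marked with a leading "ˈ", every word has exactly one primary stress, and words under the penultimate rule have at least two syllables. The function names are hypothetical.

```python
STRESS = "ˈ"  # assumed marker for primary stress at the start of a syllable

def split_initial_stress(syllables):
    """Word-initial stress: each stressed syllable opens a new word."""
    words = []
    for syl in syllables:
        if syl.startswith(STRESS) or not words:
            words.append([syl])
        else:
            words[-1].append(syl)
    return words

def split_penultimate_stress(syllables):
    """Penultimate stress: a word ends one syllable after its stressed syllable."""
    words, current = [], []
    close_after_this = False
    for syl in syllables:
        current.append(syl)
        if close_after_this:
            words.append(current)
            current = []
            close_after_this = False
        elif syl.startswith(STRESS):
            close_after_this = True
    if current:  # leftover syllables with no complete word
        words.append(current)
    return words

# Invented syllable streams, for illustration only.
initial_stream = ["ˈta", "ka", "mi", "ˈno", "sa"]
penult_stream = ["ta", "ˈka", "mi", "ˈno", "sa"]

print(split_initial_stress(initial_stream))
print(split_penultimate_stress(penult_stream))
```

Both streams are cut into a three-syllable word followed by a two-syllable word; only the position of the stress inside each word differs.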
Berg (1989:41) defines ‘word’ as “a unit of phonological and morphological constancy and
syntagmatic mobility.” Berg notes that in an orthographic word such as fotu ‘head’ from Muna, an
Austronesian language from Indonesia, the four phonemes have a fixed order that cannot be
changed without altering the meaning or creating a nonexistent word. He points out
that morphological constancy is shown by the fact that in a word such as no-feka-nggela-hi-e-mo
‘she has already made it clean’, the order of the morphemes is fixed: this is the only possible way
of arranging these morphemes. Conversely, if we take a group of
elements and change their order, and the result is still grammatical and
meaningful (though probably with a different meaning from the original order), then we are not
dealing with a ‘word’, but with a sentence made up of separate words.
As seen so far, defining ‘word’ is not easy, especially if we want the definition to be as
inclusive as possible, taking into account all the typologically different kinds of languages. Berg’s
explanation is quite convincing, but, whatever definition of ‘word’ we decide on, perhaps the best
criterion for recognizing words is, as Mithun says, “usually the judgment of native
speakers”. But one could ask: why not do it with the help and knowledge of linguists?
3. Kinds of words
3.1. The orthographic word
The orthographic word would be what we write between spaces, according to Western society
conventions and especially in reference to those languages that use the Latin alphabet. Pike
(1947:89), as quoted by Dixon & Aikhenvald (2003:7-8), says that a word is “the smallest unit
arrived at for a particular language as the most convenient type of grammatical entity to separate
by spaces; in general, it constitutes one of those units of a particular language which actually or
potentially may be pronounced by itself”. Interpreting this quote, I would put it briefly as
reaffirming the convention that grammatical words are written between spaces.
However, conventions are just that: established practices that need not be consistent. As pointed
out by Dixon & Aikhenvald (2003:8), the convention in English, for example, is to write “cannot” as
one word, but to write “must not” as two orthographic words.
Dixon & Aikhenvald (2003:8) explain how Van Wyk (1967:230) describes different conventions
used in Bantu languages for writing word divisions, with the following examples:
1. Disjunctivism: “according to which relatively simple, and, therefore, relatively short,
linguistic units are written and regarded as words” → re tlo e bua ka thipa ya gagwe;
2. Conjunctivism: “according to which simple units are joined to form long words with
complex morphological structures” → retloebua kathipa yagagwe.
Both examples mean the same: ‘we shall skin it with his knife’.
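
The two conventions can be compared with a short sketch. The units and their grouping are read off the two spellings quoted above; the variable and function names are illustrative, and nothing here is meant as an analysis of the language itself.

```python
# Units from the example, grouped as the conjunctive spelling joins them.
groups = [
    ["re", "tlo", "e", "bua"],
    ["ka", "thipa"],
    ["ya", "gagwe"],
]

def disjunctive(unit_groups):
    """Write every simple unit as its own orthographic word."""
    return " ".join(unit for group in unit_groups for unit in group)

def conjunctive(unit_groups):
    """Join the units of each group into one long orthographic word."""
    return " ".join("".join(group) for group in unit_groups)

d = disjunctive(groups)
c = conjunctive(groups)
print(d, "->", len(d.split()), "orthographic words")
print(c, "->", len(c.split()), "orthographic words")
```

The same sequence of units yields eight orthographic words under one convention and three under the other, which is why a count of spaces alone cannot settle what a word is.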
However, conventions can be different in other writing systems. The Japanese language uses
two syllabaries, hiragana and katakana, together with logographic characters, kanji. It is entirely
possible to write Japanese using only the syllabaries, but in any case the convention is to write
all the words together, without spaces.
(2) ゆうがた おじいさん が 山 から もどって きました4
4 http://life.ou.edu/stories/momotarou.html
yuugataojiisangayamakaramodottekimashita
‘In the evening the old man returned from the mountain’
According to my personal knowledge of the Japanese language, this would split into the
following:
ゆうがた→ yuugata → evening;
おじいさん→ ojii-san → the old man;
が→ ga → subject marker;
山→ yama → mountain;
から→ kara → from;
もどって→ modotte → returning (the -te form of modoru ‘to return’);
きました→ kimashita → came.
The Japanese orthographic system is just a convention that differs from ours. Japanese
speakers can nonetheless identify orthographic words, and the proof is that, if required,
they can write the same sentence in romaji5, where they have to use spaces, like this:
(3) yuugata ojii-san ga yama kara modotte kimashita
Carme Junyent, one of my professors of linguistics during my B.A. in Barcelona, used to ask us
in class: “how many words are there in Sant Joan Despí6?” Despí could be translated as ‘of the
Pine’. So, once again, we could write “Sant Joan Des Pi”, “Sant Joandespí” or
“Santjoandespí”. The name follows the rule, explained above, that its constituents cannot be
reordered; therefore it could all be considered one word.
One of the problems that arise with spaces is where to set them when there are particles such
as clitics and affixes and also with compounds. This is one of the big difficulties when setting
orthographic rules for a language. Newman (2000:729) notes that in Hausa, a Chadic language
from Africa, the general practice seems to be to write “noun.of noun” or “adjective.of noun”
compounds as separate words, e.g., gidan sauro ‘mosquito net’, farin jini ‘popularity’, but verb +
5 The Romanization of the Japanese characters.
6 A city near Barcelona.
noun compounds with hyphens, e.g., a-ci-balbal ‘an oil-burning lamp’, kas-dafi ‘potion that makes
one impervious to poison’. The notion of the orthographic word is therefore highly variable.
3.2. The phonological word
As defined by Dixon & Aikhenvald (2003:13), a phonological word “is a phonological unit larger
than the syllable (in some languages it may minimally be just one syllable)”. The phonological word
must have at least one of the following features:
(a) Segmental features (internal syllabic and segmental structure); phonetic realizations in
terms of this; word boundary phenomena; pause phenomena. According to Dixon &
Aikhenvald (2003:14), in some Australian languages, for example, a root or suffix may have
one or more syllables but every phonological word must have at least two syllables. Dixon
& Aikhenvald (2003:13) talk about Walmatjari, a Pama-Nyungan language from Australia,
where a disyllabic verb root may take a zero tense-mood suffix, e.g. luwa-ø ‘hit!’ (the
allomorph of imperative for the conjugation to which this verb root belongs is zero),
whereas a monosyllabic root must take a suffix that is at least one syllable in extent, e.g.
ya-nta ‘go!’ (here the imperative allomorph is -nta).
(b) Prosodic features (stress or accent and/or tone assignment); prosodic features such as
nasalization, retroflexion, vowel harmony. In many languages, stress (or accent) provides
one criterion for phonological word. Many languages have fixed stress, for example on the
first or last or penultimate or antepenultimate syllable of a phonological word. It should
then be possible to ascertain the position of word boundaries from the location of stress.
(c) Phonological rules (some apply only within a phonological word, others specifically across a phonological word boundary). In some
languages the optimum analysis involves recognizing underlying forms for roots and
affixes and then a number of phonological rules which generate the surface forms. Each
rule applies to a certain syntagmatic extent. Many rules apply just within the phonological
word while some apply across a phonological word boundary.
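
The minimal-word requirement mentioned under (a) can be illustrated with the two Walmatjari imperatives cited there. This is a deliberately simplified sketch: it counts vowel letters as syllables and picks the imperative allomorph from the syllable count alone, whereas the text makes clear that the allomorph also depends on the conjugation of the verb root. The helper names and the vowel inventory used for counting are assumptions made for the illustration.

```python
VOWELS = set("aiu")  # assumed vowel letters; each one counts as a syllable nucleus

def syllable_count(form):
    """Count syllables as the number of vowel letters (simplification)."""
    return sum(1 for ch in form if ch in VOWELS)

def imperative(root):
    """Zero suffix for a disyllabic root, -nta for a monosyllabic one.

    Covers only the two roots cited in the text.
    """
    suffix = "ø" if syllable_count(root) >= 2 else "nta"
    return f"{root}-{suffix}"

def surface_syllables(glossed_form):
    """Syllables of the pronounced form, ignoring the zero morpheme."""
    return syllable_count(glossed_form.replace("ø", ""))

for root in ("luwa", "ya"):
    form = imperative(root)
    print(form, "->", surface_syllables(form), "syllables")
```

In both cases the resulting phonological word has two syllables, so the monosyllabic root never surfaces as a word on its own.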
These features are not mutually exclusive; they may interact. For example, many phonological
rules work in terms of stress assignment within a word; the appearance of certain phonemes at
certain positions within a phonological word may be a consequence of the operation of certain
phonological rules.
3.3. The grammatical word
For Dixon & Aikhenvald (2003:19), a grammatical word consists of a number of grammatical
elements that always occur together in a fixed order and have an established coherence and
meaning. Grammatical words are used to express grammatical relationships with other words
within a sentence.
In order to better understand the difference between a grammatical word and a phonological
word, an English example is helpful. In English, “don’t” is one phonological
word, but it consists of two grammatical words, “do” and “n’t”. In this case, then, two
grammatical words make up one phonological word, and together they
coincide with one orthographic word.
Phonological and grammatical words coincide in most cases, but not always. Aikhenvald
(2008:51) affirms that examples of one grammatical word forming several phonological words
include nominal and verbal compounds, full reduplication of simple roots, and reduplication in
compounds.
In Manambu, a Sepik language from Papua New Guinea, Aikhenvald (2008:51), disyllabic and
trisyllabic nominal compounds form one phonological word, e.g. du-tá:kw (man-woman) ‘people’,
man-tá:b (leg-hand) ‘arms and legs’, bapa-tá:kw (moon-woman) ‘lady moon’, takw-a-ñán (woman-
child) ‘girl’, and many others. The stress typically falls on the final syllable. A compound of three
syllables or more may form two phonological words, e.g. kamí kamná:gw (fish food) ‘foodstuff’
(the free form of “fish” is kami:), babáy dú (maternal.grandparent man) ‘maternal grandfather’.
Alternatively, such a compound may form one word, with its main stress on the last syllable of
the second component. The first component then retains a weaker secondary stress on the originally
stressed syllable, e.g. vyaketà-yaké (beautiful-fully) ‘very beautiful’ (said, for example, of a
woman). All these compounds form one grammatical word since no other constituent can
intervene between their components.
Another example, as mentioned above, is with full reduplication of simple roots. Continuing
with Manambu, if a verbal, an adjectival, or a nominal root undergoes full reduplication and the
resulting structure is four syllables long, it is treated as two disyllabic phonological words in terms
of stress assignment, e.g. kwasá-kwasá ‘very small’, wuké-wuké-k ‘in order to hear’. The stress on
the last phonological word is the same as the non-reduplicated form in isolation.
Reduplication in compounds is also another case where one grammatical word forms more
than one phonological word. A compound comprised of two verbs, or one of its components, can
undergo full reduplication. For example, the second component of the compound ve-semél- (see-
dummy.root) ‘look for’ can be reduplicated; the resulting form consists of two phonological words:
ve-semél-semél- ‘look everywhere’. If both components undergo full reduplication, and the initial
compound contains more than two syllables, the last vowel of the first component is dropped. The
reduplicated components form separate phonological words. So, full reduplication of kui-taka-
produces two phonological words kui-tak-kui-taka- (give-put-give-put) ‘give away repeatedly’.
As explained by Aikhenvald (2008:53), there are also cases of two or three grammatical words
forming one phonological word. This occurs with noun phrases and complex predicates that
form one phonological word, and with phonological words containing clitics.
3.4. Clitics
Clitics are a complex issue. I would define them as particles that are bound to a word and that
sometimes behave as morphemes and sometimes as independent words: they are halfway
between the two categories. They are always host-dependent. In similar terms, Dixon &
Aikhenvald (2003:25) define clitics as “[referring] to something that is a grammatical word but not
a complete phonological word (for example, it does not take stress). A clitic is attached to a host
phonological word, as a sort of optional extra. There are some items that can have the form either
of a clitic or of a full phonological word.”
So, if clitics are halfway between a morpheme and a word, how should they be considered?
Portuguese, one of the languages that I am able to speak, has proclitics, enclitics
and mesoclitics. We could discuss whether proclitics or enclitics are separate words or not, and
orthography plays an important role in that, but what about mesoclitics? In Portuguese a
mesoclitic is used with the “Futuro do Presente” and “Futuro do Pretérito” verbal tenses. An example
in the “Futuro do Presente” would be the following:
(4) queixar ‘to complain’ → queixarei ‘I will complain’ → queixar-me-ei ‘I will complain’ (reflexive, with the clitic me).
Here, Portuguese splits the infinitive form from the future tense morpheme to insert the
mesoclitic me. The orthography is tricky, as it does not hide the clitic within the word; if
we wrote *queixarmeei, it would be much easier to affirm that this is a word, an
agglutinated one. But even with the correct orthography we have a word that is split apart by
two hyphens, and perhaps they are telling us, at least from an orthographic point of view, that
everything belongs to one unit, a single word.
Catalan, perhaps, offers a good example of the dual behavior of clitics, and also shows us
that hyphens are just a convention and do not necessarily tell us whether the clitic is a word by
itself or forms a single word together with what it is attached to. In the vast
majority of Catalan dialects, the final r of the infinitive is silent, and it remains silent even when
another word follows the infinitive.
(5) parlar [pərˈla] ‘to speak’ → parlar bé [pərˈla ˈβe] ‘to speak well’.
Depending on the dialect, the pronunciation changes when a clitic is introduced. Here we have the same
sentence in two different dialects:
(6) parlar-lo bé [pərˈlarlu7 ˈβe] ‘to speak it well’
(7) parla’l bé [pərˈlal ˈβe] ‘to speak it well’
The o of the clitic lo is epenthetic, which is why it can be omitted in example (7). In this
same example, the verb behaves, phonetically speaking, in the same way as in example (5), that is,
the final r of the infinitive remains silent, meaning that the clitic is behaving as if it were a word. In
example (6), on the other hand, the final r of the infinitive is pronounced, meaning that what
follows behaves differently from example (7). Since Catalan does not silence r’s within a word, we
could consider that in example (6) the verb and clitic form a single unit, and therefore that –lo is a
morpheme.
7 Vowel reduction.
Above we said that clitics do not take stress, Dixon & Aikhenvald (2003:25). The
Catalan spoken in Majorca, however, breaks this rule: it moves the stress to the clitic, while keeping
the pronunciation of the final r of the infinitive.
(8) parlar-lo bé [pərˌlarˈlo8 ˈβe] ‘to speak it well’
After seeing these examples, how should we consider clitics? Even in light of the
Catalan examples, it is my belief that clitics should not be considered words, as they cannot stand
by themselves and always need to be attached to a host, which is a word that can stand by
itself.
4. Mongsen Ao
Mongsen Ao is a Tibeto-Burman language spoken by the Ao people in Nagaland, in north-
eastern India. According to India’s 1991 Census, Ao is spoken by 170,000 people. It is estimated
that perhaps forty percent of that number speak Mongsen as their first language, Coupe (2007:1).
Coupe, the author of the grammar, uses ‘word’ very broadly, following Dixon & Aikhenvald’s
conventions, so I will structure this chapter in the same way as chapter 3.
A word in Mongsen Ao may consist of just one syllable forming a single morpheme, which is a
phonological word and a grammatical word at the same time, e.g. pùʔ ‘carried on the back’ or nì
‘I’, Coupe (2007:49). The word in Mongsen can be identified on the basis of phonological and
grammatical criteria, according to the parameters defined in the previous chapter, which are
based on Dixon & Aikhenvald’s (2003) premises.
4.1. The phonological word
The phonological word in Mongsen Ao has the following features:
(a) Segmental features: Diphthongs may form across syllable boundaries within phonological
words that also constitute grammatical words e.g. rà ‘come’ + -iʔ causative → raiʔ ‘caused
8 The Catalan from Majorca does not generally make vowel reduction in this case.
to come’, Coupe (2007:49). In Mongsen Ao, a single phonological
word can on rare occasions consist of two grammatical words. Coupe (2007:49) shows one such
case, in which the juxtaposed vowels of two otherwise separate phonological and
grammatical words, á-límá (prefix-world) ‘world’ and the proximate demonstrative i ‘this’,
diphthongize to form one phonological word comprising two grammatical words, [a-li-maj]
‘this world’, pronounced without a hiatus. In Mongsen Ao, a sequence of vowels of
identical quality and tone in a stream of speech signals a boundary between two words.
Sequences of identical consonants, however, may occur both within a stream of speech
and within a word. According to Coupe (2007:50), only voiceless unaspirated stops, voiced
nasals and the voiced retroflex approximant may occur at the end of a syllable, and,
therefore, at the end of a word, but it seems that this is not a sufficient criterion for the
identification of a word boundary because any allowable coda constituent may occur
internally.
(b) Prosodic features: Tone sandhi processes operate across morpheme boundaries in
phonological words and across grammatical word boundaries. Coupe (2007:51) notes that
Mongsen Ao has a glottal stop prosody, which seems to indicate a word boundary
and therefore the limit of a grammatical and a phonological word.
(c) Phonological processes: Some dissimilatory processes are restricted to the word-internal
environment. For example, as Coupe (2007:51) points out, a high front vowel occurring as
the final linear constituent of a verb root is pronounced as a schwa in the environment
before the high front vowel of a suffixed irrealis marker -ì, as in hlì ‘buy’ + -ì → [ləj] ‘will
buy’. This dissimilation is motivated by the need to keep the grammatical information that
would otherwise be lost through the deletion of one of the two identical vowels, Coupe
(2007:51). Dissimilatory changes of this nature never occur across phonological or
grammatical word boundaries, so this is also a way of identifying words.
4.2. The grammatical word
The grammatical word is recognized by the following features, Coupe (2007:51):
(a) Isolatability: As in many other languages, speakers of Mongsen Ao are able to identify and
give meanings to grammatical words but, for them, it is harder to split words into
morphemes or to give meaning to them.
(b) Immutability: words, and their affixes, have an established order and breaking this order
results in a change of meaning or the total loss of it.
(c) Potential pause: pauses are an important element, which may occur between words, but
not within words.
4.3. Clitics
A clitic in Mongsen Ao generally has the following characteristics, Coupe (2007:52):
(a) It has low selectivity with respect to the lexical category of its host;
(b) It cannot occur independently of its host, meaning that it cannot function as an
independent phonological word;
(c) A clitic’s scope of relevance is the word, phrase or clause;
(d) It can only occur as the last element of the word, phrase or clause;
(e) The clitic and its host do not admit an intervening pause, meaning that it is treated as part
of the word and not as a separate word.
5. Tariana
Tariana is a language spoken in the Vaupés river basin in northwestern Brazil. It is, in fact, the
only language belonging to the Arawak family in the region. It is an endangered language, which,
according to Aikhenvald (2003:XVII), had some 100 speakers in 2003, the year the grammar was
published.
Aikhenvald (2003:40) states that “the main criterion for a phonological word in Tariana is
primary stress.” Prosodic processes play an important role in delimiting the phonological word,
where different prosodic classes of morphemes have different rules with respect to stress. Some
other important prosodic processes are nasalization, aspiration and vowel harmony.
In Tariana, a phonological word contains one primary stress; it may also contain one or more
secondary stresses, Aikhenvald (2003:40). The prosodic classes of morphemes in Tariana are roots,
affixes and clitics. Roots and proclitics can form a phonological word on their own, while
affixes and enclitics cannot. Proclitics are monosyllabic, never take secondary stress, and are
attached to the following word if they are not in focus. Enclitics always take a secondary stress.
The following minimal pair, which contrasts a phonological word containing a suffix with one
containing an enclitic, is found in Aikhenvald’s grammar (2003:41):
(9) íri-ne (blood-PL) ‘those of blood, the Tariana’, consists of a root and a suffix, and is a
phonological word with one primary stress;
(10) íri=nè (blood=comitative) ‘with blood’ consists of a root and an enclitic, and is a
phonological word with one primary and one secondary stress.
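
The stress criterion in examples (9) and (10) can be checked mechanically. The sketch below assumes the transcription convention visible in those examples, where an acute accent marks primary stress and a grave accent marks secondary stress; the function name `stress_profile` is hypothetical, and the code only counts accent marks rather than analysing Tariana prosody.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent: primary stress in the transcription
GRAVE = "\u0300"  # combining grave accent: secondary stress in the transcription

def stress_profile(form):
    """Return (primary, secondary) stress counts for a transcribed form."""
    # NFD splits accented vowels into base letter plus combining accent.
    decomposed = unicodedata.normalize("NFD", form)
    return decomposed.count(ACUTE), decomposed.count(GRAVE)

suffixed = "íri-ne"   # example (9): root plus suffix
encliticized = "íri=nè"  # example (10): root plus enclitic

for form in (suffixed, encliticized):
    primary, secondary = stress_profile(form)
    print(form, "primary:", primary, "secondary:", secondary)
```

Each form has a single primary stress and therefore counts as one phonological word; only the form with the enclitic carries an additional secondary stress.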
According to Aikhenvald (2003:45-46), in Tariana, the word-delimiting prosodic parameters
are:
1- Primary stress: which determines the number of phonological words;
2- Secondary stress: indicative of clitic boundaries; indicative of the structure of a
phonological word;
3- Nasalization: which operates within the boundaries of a phonological word;
4- Aspiration floating: indicative of the beginning of a phonological word, and of clitic
boundaries within it (i.e., of its internal structure);
5- Aspiration: indicative of the word-initial position;
6- Vowel harmony: indicative of word-final, and of word-initial positions;
7- Post-tonic vowel reduction: indicative of word-final position.
From the above, nasalization, aspiration and vowel harmony merit a more detailed
explanation:
5.1. Nasalization
If a phonological word contains a nasal vowel or a nasal consonant, all vowels (except ɨ), the
voiced dental stop d, the lateral l, the flap ɾ and the alveopalatal glide y within the word limits are
replaced with a homorganic nasal in normal and in rapid speech registers: [ɾ] → [n], [y] → [n], [d]
→ [n], e.g. -toréta 'roll dough' is realized as [toneta]; kenowa-na 'a tree-like plant (unidentified)'
as [kenowa-na], Aikhenvald (2003:42). It would seem that this nasalization found in Tariana is the
result of a Tucano accent, since, according to Aikhenvald (2003:42), nasalization as word prosody
is a characteristic of East Tucano languages.
5.2. Aspiration
Tariana presents two aspiration phenomena. The first is the loss of aspiration within a
phonological word. If adjacent syllables across a clitic boundary contain aspirated stops, nasals
or the bilabial glide wh, any, but only one, of these is aspirated, Aikhenvald (2003:42). The
following are some of the processes that delimit the phonological word boundary:
(a) The sequence of a prefix and a root, di-pha (3sgnf-fall), and a clitic, =kha ‘away’, should
have resulted in di-pha=kha, but it is pronounced as either [di-pha=ka] or [di-pa=kha],
Aikhenvald (2003:42).
(b) A phonological word consisting of the root wyaka ‘far’, the enclitic =tha ‘frustrative’ and
the enclitic =mhana ‘remote past non-visual’ should have resulted in
wyaka=tha=mhana (‘it is –perceived non-visually– as being far in vain’) but is pronounced
as either [wyaka=ta=mhana] or [wyaka=tha=mana], Aikhenvald (2003:42).
(c) Aspirated voiced stops and aspirated nasals may lose their aspiration in a non-word initial
position, e.g. di-nu-mha (3sgnf-come-pres.nonvis) ‘he is coming’ can be pronounced as [di-
nu-ma], Aikhenvald (2003:43).
(d) Tariana speakers pronounce words like di-ayahya=kha (3sgnf-swim=away) ‘he swam
away’ as [di-ayahya-ka]. According to Aikhenvald (2003:43), this process is typical of clitic
boundaries.
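The alternation in (a) and (b) can be sketched as a small Python function. This is a simplified illustration, not Aikhenvald's formal rule: it assumes aspirated consonants are written as digraphs ending in h, it uses an illustrative inventory (ph, th, kh, mh, nh, wh) that is my own assumption, and it ignores the requirement that the syllables be adjacent. The name deaspirated_variants is hypothetical.

```python
import re

# Illustrative inventory of aspirated consonants written as digraphs;
# this list is an assumption, not Aikhenvald's full inventory.
ASPIRATED = re.compile(r"ph|th|kh|mh|nh|wh")


def deaspirated_variants(form):
    """Return the variants in which only one consonant keeps its aspiration."""
    matches = list(ASPIRATED.finditer(form))
    if len(matches) < 2:
        return [form]  # fewer than two aspirated consonants: nothing to resolve
    variants = []
    for keep in matches:
        pieces, last = [], 0
        for m in matches:
            pieces.append(form[last:m.start()])
            # Keep the digraph for the chosen consonant; drop the h elsewhere.
            pieces.append(m.group() if m is keep else m.group()[0])
            last = m.end()
        pieces.append(form[last:])
        variants.append("".join(pieces))
    return variants


print(deaspirated_variants("di-pha=kha"))  # ['di-pha=ka', 'di-pa=kha']
```

Applied to wyaka=tha=mhana, the same function yields the two pronunciations given in (b).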
The second phenomenon is aspiration floating, as Aikhenvald (2003:43) calls it. In a sequence
of two syllables (on a prefix-root boundary or within an enclitic), one of which contains an
aspirated stop, nasal or bilabial glide and the other an unaspirated voiceless stop, the aspirated
consonant may lose its aspiration; then the coda of the first syllable is h. This process does not
apply if:
(a) the first syllable is not stressed;
or
(b) it does not contain a long vowel.
Aspiration floating does not apply within a root, for example maratahka and not [maratakha]
‘quick wave’, karahta and not [karatha] ‘lung’. As Aikhenvald (2003:43) states, an aspirated stop, a
nasal or a wh on the prefix-root boundary is most often the result of h-metathesis.
Similarly, the sequence of clitics =thaka ‘frustrative=recent past visual’ can be pronounced as
[=tahka], and the sequence =mha=ka ‘recent past non-visual’ can be pronounced as [=mahka],
Aikhenvald (2003:43). There is just one case in which aspiration floating takes place within what
appears to be one morpheme: the enclitic =botha, =butha ‘conditional’ is often pronounced as
=bohta or =buhta, Aikhenvald (2003:44). According to Aikhenvald (2003:44), this clitic appears
to be a combination of two clitics, =bo ‘counterfactual’ and =tha ‘frustrative’. The application
of aspiration floating within =botha may therefore constitute evidence that this clitic is
bimorphemic in origin.
Floating aspiration is evidence of a clitic boundary within a phonological word, and of a prefix-
root boundary.
5.3. Vowel harmony
Vowel harmony is realized within the phonological word, and indicates its boundaries. Tariana
has three types of vowel harmony, all of which occur in normal and rapid speech, Aikhenvald
(2003:44).
(a) Regressive translaryngeal vowel harmony in independent personal pronouns: according to
Aikhenvald (2003:44), independent pronouns consist of a cross-referencing prefix and a
deictic element -há. In the rapid speech register, the vowel of the first syllable elides and
aspirated consonants emerge as a result. The morpheme -ha triggers nasalization of the
following vowel.
(b) Word-initial vowel harmony: If the root begins with h and is preceded by a prefix, a vowel
identical to that of the prefix is inserted after h, e.g. slow register: [duhéni]; normal to
rapid register [du-huéni] or [dhuéni] (3sgf-ear) ‘her ear’. This can be interpreted as an
instance of translaryngeal vowel harmony. Two unstressed vowels at the beginning of a
phonological word may be pronounced the same as the second vowel, e.g. [yarumakási],
[yəarumakási], may be pronounced as [yurumakási] ‘clothing’; and [di-sepatá] as [di-
sapatá] ‘he suffers’, Aikhenvald (2003:45).
(c) Word-final vowel harmony within enclitics: In normal to rapid speech, enclitics which have
their secondary stress on the last syllable assimilate the vowel in the pre-tonic syllable to
the one in the stressed syllable, e.g. -nakù ‘topical non-subject’ becomes -nukù, -pidenà
‘remote past inferred’ becomes -pidanà. This only happens when these enclitics occur at
the end of the word. Vowel harmony of this type is therefore associated with the end of a
phonological word, Aikhenvald (2003:45).
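The word-final harmony in (c) can be illustrated with a short Python sketch. It rests on several assumptions of my own: the grave accent marks secondary stress, the vowel inventory is reduced to the five plain letters a, e, i, o, u, and each vowel letter counts as one syllable nucleus. The function name harmonize_enclitic and its word_final parameter are hypothetical.

```python
import unicodedata

VOWELS = "aeiou"  # simplified vowel inventory (an assumption)
GRAVE = "\u0300"  # combining grave accent: assumed to mark secondary stress


def harmonize_enclitic(enclitic, word_final=True):
    """Copy the stressed final vowel onto the pre-tonic vowel of an enclitic."""
    if not word_final:
        return enclitic  # the process is described only for word-final enclitics
    chars = list(unicodedata.normalize("NFD", enclitic))
    vowel_idx = [i for i, c in enumerate(chars) if c in VOWELS]
    if len(vowel_idx) < 2:
        return enclitic  # no pre-tonic vowel to assimilate
    last = vowel_idx[-1]
    # The secondary stress must fall on the last syllable.
    if last + 1 >= len(chars) or chars[last + 1] != GRAVE:
        return enclitic
    chars[vowel_idx[-2]] = chars[last]
    return unicodedata.normalize("NFC", "".join(chars))


print(harmonize_enclitic("-nakù"))    # -nukù
print(harmonize_enclitic("-pidenà"))  # -pidanà
```

Passing word_final=False leaves the enclitic unchanged, which mirrors the observation that this harmony marks the end of a phonological word.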
In Tariana, a phonological word may consist of one or more grammatical word(s). If a
phonological word contains a proclitic and/or an enclitic, then it may consist of more than one
grammatical word. According to Aikhenvald (2003:54), roots can form grammatical and
phonological words on their own, and they have a primary stress. Prefixes and suffixes cannot
form grammatical or phonological words on their own, Aikhenvald (2003:54). Prefixes are all
monosyllabic; most suffixes are monosyllabic, but some are disyllabic. Prefixes are atonic; some
suffixes are stressed, and some are not, Aikhenvald (2003:54). Proclitics can form grammatical and
phonological words on their own. They are monosyllabic, never have secondary stress and are
procliticized to the following word, if they are not in focus, Aikhenvald (2003:54).
6. Conclusions
Staying within the Latin alphabet, as we have seen, an orthographic word is what we write
between spaces. A phonological word, as defined by Dixon & Aikhenvald (2003:13), “is a
phonological unit larger than the syllable” and a grammatical word, according to Dixon &
Aikhenvald (2003:19), consists of a number of grammatical elements that always occur together,
appear in a fixed order and have an established coherence and meaning. In this essay we have
seen how sometimes phonological and grammatical words coincide, and how sometimes they do
not, as in the case of the Manambu language where a grammatical word may form more than one
phonological word. Manambu does not have an established orthography; however, following the
orthographic examples given by Aikhenvald, one grammatical word is equal to an orthographic
word, even if this grammatical word forms more than one phonological word. As mentioned
above, an orthographic word is what we have between spaces. Therefore, I would consider
compound words written with hyphens one orthographic word, as Aikhenvald does, since there
are no spaces in between.
Even though the several kinds of words are well defined as concepts, I would say that in
practice applying those concepts is not that easy or clear. Still, I like to recall what
Mithun (1999:38) states regarding the identification of a ‘word’: “the best criterion is usually the
judgment of native speakers”. It is true that when speaking, if we get stuck in the middle
of a word, we go back and repeat the whole word, and not just half of it; it seems that in our
mind it is very clear what a word is.
Since Mongsen Ao and Tariana do not have an established orthography, there are no
orthographic examples, and I cannot tell whether the orthographic word would represent the
phonological word, the grammatical word or both. However, in sections 4 and 5 we have seen
many instances of the phonological word and the grammatical word being different in these two
languages.
Regarding clitics, I have defined them as particles that are halfway between a word
and a morpheme. I have thought a lot about the Catalan examples (6), (7) and (8), and I have
changed my mind several times regarding the stress shift in Catalan as spoken
in Majorca. Looking at all the definitions and examples in this essay, my guess would be that in the
Majorcan variety, when the clitic is attached to the host, the stress shifts because the
clitic forms a single phonological word together with the host. If they were not the same
phonological unit, the stress would not be able to shift and leave the host without
stress. There is no stress shift in examples (6) and (7), which may mean that the clitic is not
assimilated into the phonological word. All the examples, however, are single grammatical
words, since they follow the pattern of grammatical elements being joined together in a
fixed order with an established coherence and meaning. They are also all single
orthographic words, as there are no spaces between the host and the clitic.
7. References
Institut d'Estudis Catalans. (25 / 5 / 2014). Diccionari de la llengua catalana. Retrieved from http://dlc.iec.cat
Real Academia Española. (25 / 5 / 2014). Diccionario de la lengua española. Retrieved from http://lema.rae.es/drae/
Aikhenvald, A. Y. (2003). A grammar of Tariana, from Northwest Amazonia. Cambridge: Cambridge University Press.
Aikhenvald, A. Y. (2008). The Manambu Language of East Sepik, Papua New Guinea. Oxford, New York: LaTrobe University.
Coupe, A. (2007). A grammar of Mongsen Ao. Berlin: Mouton de Gruyter.
Dixon, R.M.W., & Aikhenvald, A. Y. (2003). Word: a typological framework. In R.M.W. Dixon, & A. Y. Aikhenvald, Word. A cross-linguistic typology (p. 1-41). Cambridge: Cambridge University Press.
Loos, E. E. (29 / 5 / 2014). Glossary of linguistic terms. Retrieved from http://www-01.sil.org/linguistics/GlossaryOfLinguisticTerms/
Matthews, P. (2003). What can we conclude? In R.M.W. Dixon, & A. Y. Aikhenvald, Word. A cross-linguistic typology (p. 266-281). Cambridge: Cambridge University Press.
Merriam-Webster. (25 / 5 / 2014). Merriam-Webster dictionary. Retrieved from http://www.merriam-webster.com
Mithun, M. (1999). The languages of Native North America (Reprinted 2001 ed.). Cambridge: Cambridge University Press.
Newman, P. (2000). The Hausa Language. New Haven - London: Yale University Press.
SIL International. (29 / 5 / 2014). Ethnologue. Retrieved from http://www.ethnologue.com