Inducing Place Feature Distinctions from the Distribution of Consonants
Thomas Mayer
Research Center Deutscher Sprachatlas, Philipps-Universität Marburg
CUNY Conference on the Feature in Phonology and Phonetics, New York City, Jan 16–18, 2013
Phoneme classification I
Two methods of classifying phonemes have been discussed in the early phonological literature (cf. Fischer-Jørgensen 1952): phonemes can be classified ...
1. according to their constituent parts (their distinctive features) and
2. according to their possibilities of combination (their distribution or relations in the speech chain)
On (1): Trubetzkoy (1967[1939]:219) considered the first method the most important task and claimed that a classification based on different possibilities of combination cannot give each phoneme a unique definition in all languages
Mayer: Inducing place feature distinctions from the distribution of consonants
Phoneme classification II
On (2): Sapir (1925:48) claimed that a place in the system can be found for each sound “because of a general feeling of its phonetic relationships (such as parallelism, contrast, combination, imperviousness to combination, and so on) to all other sounds”
Bloomfield (1933:129–130) considered the classification by distinctive features structurally irrelevant “because they group the phonemes according to the linguist’s notion of their physiological character, and not according to the parts which the several phonemes play in the working of the language”
Distributional correlates
- Acoustic (Jakobson et al. 1969[1952]) and articulatory (Chomsky and Halle 1968) correlates of features have been discussed in the literature
- Distributional criteria are sometimes used to discriminate classes of sounds: “The major distinction between the two classes [of vowels and glides] is distributional: vowels are heads of syllables, but glides cannot be heads of syllables.” (Halle 2003:318)
Are there also distributional correlates of features?
- Several methods have been discussed in the literature to discriminate vowels from consonants on distributional grounds: Sukhotin 1962; Ellison 1990; Ellison 1994; Powers 1997; Goldsmith and Xanthos 2009; Calderone 2009; Mayer 2012
Overview
1 Similar Place Avoidance (SPA) as a universal tendency in languages
2 Can we use SPA to infer place distinctions in consonants?
3 Discussion of the results
4 Conclusions
Phonotactic constraints on stems in languages
- It has long been noticed for Semitic languages (see Bachra 2001 for a historical account) that there are constraints on the co-occurrence of consonants within stems
- In his Grammatik des arabischen Vulgärdialectes von Aegypten, Spitta-Bey (1880), referring to an older source, remarks that “the Arabic language tends to combine those letters in a word whose points of formation are remotely distant, such as gutturals and dentals.” (my translation);
  e.g., √k-t-b (dorsal – coronal – labial)
- Greenberg (1950) confirms this tendency in a quantitative study of root morphemes in (Standard) Arabic (see also the references to similar work on other Semitic languages in Bachra 2001)
Similar Place Avoidance (SPA)
- a similar restriction on the structure of morphemes has been postulated for other languages as well (e.g., Twaddell 1939; Twaddell 1940 for German)
- several studies show that English also exhibits this tendency: e.g., Berkley (1994), Dmitrieva et al. (2008)
SPA as a universal tendency?
- Pozdniakov and Segerer (2007) found strong support for such a constraint in a wider range of languages (mostly West African)
- Mayer et al. (2010) conducted a more representative study of about 4,500 languages from all parts of the world
- more generally for surface forms: successive consonants with the same place of articulation tend to be avoided in stems in languages like English: a form like bit (labial, coronal) conforms to the tendency, whereas map (labial, labial) does not
- Hansson (2010:127) lists a number of studies that show support for SPA in other languages as well
Twaddell’s (1940:46) results for German for stressed syllables (37,500 word forms)
Data
- Database (Version 12) from the Automated Similarity Judgment Program (ASJP; Wichmann et al. 2010)
  • contains Swadesh list items (ranging from 1 to 100 depending on the language) for over 4,500 languages (altogether 188,475 word forms)
  • uses standard ASCII characters to encode the sounds of the world’s languages, but merges some of the distinctions made by the IPA (place of articulation is sufficiently distinguished)
  • stress, tone and vowel length are not recorded in the database
- a comprehensive list of 1,958 Maltese roots (3-, 4-consonantal, weak) (Spagnol 2011)
- CELEX database for English (52,447 types) (Baayen et al. 1995)
Automated statistical analysis
- a succession of places of articulation is defined as a pair of consonants separated by exactly one vowel
- before and after the succession, either a word boundary (#) or a vowel has to appear
- the following regular expression is used to extract C-C successions: [#|V]CVC[#|V]
- each consonant is assigned to one of the three major place of articulation (PoA) categories L (labial), C (coronal) and D (dorsal)
- the succession counts are summarized in a square matrix where the rows represent the preceding PoA and the columns the following PoA
- each matrix cell contains the number of times the respective PoA succession was observed in the corpus
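The extraction steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used in the study: the vowel inventory `VOWELS`, the function names `successions` and `poa_matrix`, and the use of lookaround assertions are choices made for this example. The consonant-to-place mapping follows Table 1. Lookarounds are used so that overlapping successions (the vowel after the second consonant may start the next succession) are all counted.

```python
import re
from collections import Counter

VOWELS = "aeiouE3"  # assumption: vowel symbols used for this example

# Consonant symbols mapped to L (labial), C (coronal), D (dorsal), following Table 1
POA = {
    **dict.fromkeys("pbmfvw", "L"),
    **dict.fromkeys("84tdszcnSZCjTlLry5", "C"),
    **dict.fromkeys("kgxNqGX7h", "D"),
}
CONS = "".join(re.escape(c) for c in POA)

# C1 V C2, with a word boundary (#) or a vowel on either side.
# C2 and the right context sit inside a lookahead, so C2 can serve
# as C1 of the next succession (e.g. t-k and k-n in "takan").
PATTERN = re.compile(
    rf"(?<=[#{VOWELS}])([{CONS}])[{VOWELS}](?=([{CONS}])[#{VOWELS}])"
)


def successions(word):
    """Return all (C1, C2) successions in a single word form."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(f"#{word}#")]


def poa_matrix(words):
    """Count place-of-articulation successions; keys are (row, column) pairs."""
    counts = Counter()
    for word in words:
        for c1, c2 in successions(word):
            counts[POA[c1], POA[c2]] += 1
    return counts


# Toy word list (invented for illustration, not taken from the ASJP data)
matrix = poa_matrix(["bit", "takan", "map"])
```

Consonant clusters are skipped by construction: a form such as "stak" yields no succession, because neither s nor t is followed by a single vowel and flanked by a boundary or vowel.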
Place of Articulation distinctions
LCD          PTCK  ASJP transcription            IPA transcription
L (labial)   P     p; b; m; f; v; w              p, ɸ; b, β; m; f; v; w
C (coronal)  T     8; 4; t; d; s; z; c; n; S; Z  θ, ð; n̪; t; d; s; z; ts, dz; n; ʃ; ʒ
             C     C; j; T; l; L; r; y; 5        ʧ; ʤ; c, ɟ; l; ʟ, ɭ, ʎ; r, ɾ; j; ɲ
D (dorsal)   K     k; g; x; N; q; G; X; 7; h     k; g; x, ɣ; ŋ; q; ɢ; χ, ʁ, ħ, ʕ; ʔ; h, ɦ
Table 1: Assignment of consonants to symbols. All varieties of “click” sounds have been ignored.
Automated statistical analysis
       P2                 ¬P2
P1     A: n(P1 → P2)      B: n(P1 → ¬P2)
¬P1    C: n(¬P1 → P2)     D: n(¬P1 → ¬P2)
Table 2: Contingency table for the articulation place (P) succession from C1 to C2 (P1 → P2).
The succession counts were used to calculate φ coefficients, where A, B, C and D correspond to the four cells in Table 2.
\[
\phi = \frac{A \cdot D - B \cdot C}{\sqrt{(A + B) \cdot (C + D) \cdot (A + C) \cdot (B + D)}} \qquad (1)
\]
φ values lie in the interval [-1, 1]; negative values indicate that a succession occurs less often than expected under independence, positive values that it occurs more often
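As a worked illustration of equation (1), the following Python sketch derives the four cells A, B, C and D of Table 2 from a dictionary of succession counts and computes φ for one succession. The function name `phi`, the dictionary layout, the toy counts, and the convention of returning 0.0 when the denominator is zero are all assumptions made for this example.

```python
from math import sqrt


def phi(counts, first, second):
    """phi coefficient for the succession first -> second.

    counts maps (preceding, following) pairs to observed frequencies.
    """
    total = sum(counts.values())
    a = counts.get((first, second), 0)                            # first -> second
    row = sum(n for (p1, _), n in counts.items() if p1 == first)
    col = sum(n for (_, p2), n in counts.items() if p2 == second)
    b = row - a                                                   # first -> not second
    c = col - a                                                   # not first -> second
    d = total - a - b - c                                         # not first -> not second
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # Assumption: an undefined phi (empty margin) is reported as 0.0
    return (a * d - b * c) / denom if denom else 0.0


# Toy counts, invented for illustration: same-place successions are rare
toy_counts = {("L", "L"): 2, ("L", "C"): 8, ("C", "L"): 8, ("C", "C"): 2}
phi_same = phi(toy_counts, "L", "L")   # A=2, B=8, C=8, D=2  ->  -0.6
phi_diff = phi(toy_counts, "L", "C")   # A=8, B=2, C=2, D=8  ->  +0.6
```

In the toy data the labial-labial succession receives a negative φ (avoided) and the labial-coronal succession a positive one (preferred), which is the pattern SPA predicts.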
Comparison of results across languages
      P     T     C     K
P   −15   +11    +5    −5
T   +12   −10    −5   +13
C    +8    −5    −6    +8
K    −3    +8    +5   −15
Table 3: Pozdniakov and Segerer (2007)
Table 4: Mayer et al. (2010)
Consonants with the same place of articulation tend not to co-occur in CVC sequences!
Problems with the ASJP database
- SPA has been claimed for non-derived forms (or stems). The database does not necessarily contain stems but citation forms (which might involve inflectional markers) → this adds noise to the results
- the number of word forms (and thus the number of consonant successions) per language is very small (40 or fewer for most languages)
However ...
- I consider the database to be a representative sample of word forms from the languages of the world
- check for the overall tendency in the data and the influence of the number of successions on the results
Distribution across languages
[Left panel: “Phi value for individual place categories”; y-axis: phi value; x-axis: labial, coronal, dorsal]
[Right panel: “Average phi values as a function of amount of data”; x-axis: number of consonant successions for languages; y-axis: average phi value]
Figure 1: Boxplots for all languages (left); scatter plot of the average φ value depending on the number of consonant successions (right)
Mapping the results
Figure 2: Map of the 3,871 languages with more than 20 successions and their behavior with respect to SPA.
Is the tendency for SPA universal?
- we have seen that many languages adhere to SPA. The question is whether this can be regarded as a universal tendency
- Dryer test to check for universal features (cf. Dryer 1989; Dryer 1992; Dryer 2003; Cysouw 2005)
- the idea is that the majority of genera in each of the six macro areas of the world (Eurasia, Africa, South East Asia, Australia/New Guinea, North America, South America) should be in accordance with SPA
Figure 3: Dryer’s macro areas
Dryer test: results
                             ll        lc        ld        cl        cc        cd        dl        dc        dd        ll+cc+dd  genera
Africa                neg    244       44        164       52        259       62        139       76        218       186       281
                      pos    37        237       117       229       22        219       142       205       63        95
                      ratio  0.868327  0.156584  0.58363   0.185053  0.921708  0.220641  0.494662  0.270463  0.775801  0.661922
Australia/New Guinea  neg    132       22        92        18        140       39        68        47        115       97        152
                      pos    20        130       60        134       12        113       84        105       37        55
                      ratio  0.868421  0.144737  0.605263  0.118421  0.921053  0.256579  0.447368  0.309211  0.756579  0.638158
Eurasia               neg    54        11        30        12        57        14        40        14        52        44        63
                      pos    9         52        33        51        6         49        23        49        11        19
                      ratio  0.857143  0.174603  0.47619   0.190476  0.904762  0.222222  0.634921  0.222222  0.825397  0.698413
North America         neg    76        18        41        22        76        21        35        24        67        56        88
                      pos    12        70        47        66        12        67        53        64        21        32
                      ratio  0.863636  0.204545  0.465909  0.25      0.863636  0.238636  0.397727  0.272727  0.761364  0.636364
South America         neg    70        16        43        18        70        19        35        30        61        50        83
                      pos    13        67        40        65        13        64        48        53        22        33
                      ratio  0.843373  0.192771  0.518072  0.216867  0.843373  0.228916  0.421687  0.361446  0.73494   0.60241
South East Asia       neg    27        1         19        11        29        1         9         7         25        23        30
                      pos    3         29        11        19        1         29        21        23        5         7
                      ratio  0.9       0.033333  0.633333  0.366667  0.966667  0.033333  0.3       0.233333  0.833333  0.766667
Figure 4: Number of genera in the six macro areas and their ratio of genera that adhere to SPA (NegRatio). Each genus is represented by the language with the highest number of successions in the ASJP database. It can be seen at a glance that all six macro areas show a similar behavior with respect to SPA.
Intermediate summary
- the cross-linguistic data from the ASJP database suggest that SPA is indeed a universal principle
- the “Dryer test” shows that the tendency is significant at the 0.05 level; “the logic behind this method is that there is only one chance in [2^6 = 64] that all six areas will exhibit a given preference” (Dryer 2003:110, see also Dryer 1989:270). This corresponds to p = 1/64 = 0.015625 < 0.05, i.e., the agreement of all six areas is unlikely to be due to chance.
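The arithmetic behind the Dryer test can be sketched in a few lines of Python. The ratios below are copied from the ll+cc+dd column of the Dryer test results; treating that column as the overall measure of adherence to SPA, the 0.5 majority threshold, and the function name `dryer_test` are assumptions of this example rather than details confirmed by the source.

```python
# Ratio of genera in accordance with SPA per macro area
# (values from the ll+cc+dd column of the Dryer test results)
NEG_RATIO = {
    "Africa": 0.661922,
    "Australia/New Guinea": 0.638158,
    "Eurasia": 0.698413,
    "North America": 0.636364,
    "South America": 0.60241,
    "South East Asia": 0.766667,
}


def dryer_test(ratios, alpha=0.05):
    """Return (areas in favor, chance probability of full agreement, significant?)."""
    # An area counts as "in favor" if the majority of its genera adhere to SPA
    areas_in_favor = sum(ratio > 0.5 for ratio in ratios.values())
    # Under the null hypothesis each area shows the preference with probability 1/2,
    # so full agreement of n areas has probability (1/2)^n
    p_all_agree = 0.5 ** len(ratios)
    significant = areas_in_favor == len(ratios) and p_all_agree < alpha
    return areas_in_favor, p_all_agree, significant


result = dryer_test(NEG_RATIO)   # (6, 0.015625, True)
```

With six areas the smallest attainable probability is 1/64, so the test can only reach significance at the 0.05 level when all six areas agree.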
Further question:
given that SPA is a universal tendency of languages, can we infer the place of articulation distinctions on the basis of the distribution of consonants in CVC sequences?
Automated statistical analysis
- a succession of consonants is defined as a pair of consonants separated by exactly one vowel
- before and after the succession, either a word boundary (#) or a vowel has to appear
- the following regular expression is used to extract C1-C2 successions: [#|V]CVC[#|V]
- the succession counts are summarized in a square matrix where the rows represent the first consonant (C1) and the columns the second consonant (C2) in the CVC sequence
- each matrix cell contains the number of times the respective consonant succession was observed in the corpus
- this matrix can be interpreted as a dissimilarity matrix
Automated statistical analysis
       C2                 ¬C2
C1     A: n(C1 → C2)      B: n(C1 → ¬C2)
¬C1    C: n(¬C1 → C2)     D: n(¬C1 → ¬C2)
Table 5: Contingency table for the consonant (C) succession from C1 to C2.
The succession counts were used to calculate φ coefficients, where A, B, C and D correspond to the four cells in Table 5.
\[
\phi = \frac{A \cdot D - B \cdot C}{\sqrt{(A + B) \cdot (C + D) \cdot (A + C) \cdot (B + D)}} \qquad (2)
\]
Visual inspection of the clustering results
- the φ values of consonant pairs are used as a distance matrix (the higher the φ value, the more distant the consonants are in terms of their place feature)
- consonants have been clustered using the Ward clustering algorithm (Ward 1963)
- for the visualization of the results, dendrograms are used, which are a common technique for visually presenting linguistic classes (Powers 1997). Dendrograms are a visual representation of an agglomerative hierarchical clustering.
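The clustering step can be illustrated with a small pure-Python sketch. It is not the implementation used for the figures: the source does not say how negative or asymmetric φ values are turned into distances, so averaging φ(x → y) and φ(y → x) and rescaling from [-1, 1] to [0, 1] are assumptions, as are the function names and the toy φ values. The merging itself uses the Lance-Williams update for Ward's criterion on squared distances.

```python
from itertools import combinations


def phi_to_distance(phi, symbols):
    """Symmetrize phi and rescale it to [0, 1] (both steps are assumptions)."""
    dist = {}
    for x, y in combinations(symbols, 2):
        mean_phi = (phi[x, y] + phi[y, x]) / 2
        dist[frozenset((x, y))] = (mean_phi + 1) / 2
    return dist


def ward(dist, symbols):
    """Agglomerative clustering; returns the list of merges in order."""
    size = {frozenset([s]): 1 for s in symbols}
    # Ward's update is defined on squared distances between clusters
    d2 = {frozenset(frozenset([s]) for s in pair): d ** 2
          for pair, d in dist.items()}
    merges = []
    while len(size) > 1:
        pair = min(d2, key=d2.get)          # closest pair of clusters
        i, j = pair
        new = i | j
        for k in size:
            if k in pair:
                continue
            n_i, n_j, n_k = size[i], size[j], size[k]
            # Lance-Williams update for Ward's minimum-variance criterion
            d2[frozenset((k, new))] = (
                (n_i + n_k) * d2[frozenset((i, k))]
                + (n_j + n_k) * d2[frozenset((j, k))]
                - n_k * d2[pair]
            ) / (n_i + n_j + n_k)
        # Drop all distances that still refer to the two merged clusters
        d2 = {p: v for p, v in d2.items() if i not in p and j not in p}
        size[new] = size.pop(i) + size.pop(j)
        merges.append((set(i), set(j)))
    return merges


# Toy phi values, invented for illustration: same-place pairs avoid each other
symbols = ["p", "b", "t", "d"]
place = {"p": "L", "b": "L", "t": "C", "d": "C"}
toy_phi = {(x, y): (-0.4 if place[x] == place[y] else 0.2)
           for x in symbols for y in symbols if x != y}
merges = ward(phi_to_distance(toy_phi, symbols), symbols)
```

In the toy data the two labials and the two coronals are merged first, and the two place classes are joined only in the final step; the list of merges is exactly the information a dendrogram displays. Note that Ward's criterion strictly presupposes Euclidean distances, which rescaled φ values need not be.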
Figure 5: Maltese verbal roots (cf. Greenberg 1950:178 for Proto-Semitic and McCarthy 1994:204 for Arabic; cf. Weitzman 1987 for an MDS of Arabic and Hebrew root consonants)
[Dendrogram cluster labels: coronal, labial, dorsal]
Figure 6: All word forms in the ASJP database
Place of Articulation distinctions
LCD          PTCK  ASJP transcription            IPA transcription
L (labial)   P     p; b; m; f; v; w              p, ɸ; b, β; m; f; v; w
C (coronal)  T     8; 4; t; d; s; z; c; n; S; Z  θ, ð; n̪; t; d; s; z; ts, dz; n; ʃ; ʒ
             C     C; j; T; l; L; r; y; 5        ʧ; ʤ; c, ɟ; l; ʟ, ɭ, ʎ; r, ɾ; j; ɲ
D (dorsal)   K     k; g; x; N; q; G; X; 7; h     k; g; x, ɣ; ŋ; q; ɢ; χ, ʁ, ħ, ʕ; ʔ; h, ɦ
Table 6: Assignment of consonants to symbols. All varieties of “click” sounds have been ignored.
Peripheral vs. non-peripheral consonants
- The results lend support to the distinction between peripheral (labial and dorsal) vs. non-peripheral (coronal) consonants in phonological theory (cf. Rice 1994)
- On articulatory and acoustic grounds, a similar distinction was already made by Jakobson (1939), who distinguished grave (labials and velars) from acute (dentals and palatals) consonants
- This contradicts the Lingual (coronal and dorsal vs. labial) structure proposed by Clements and Hume (1995)
Figure 7: English lemmata in CELEX
Conclusions
- SPA seems to be a universal tendency among languages
  • the more successions, the higher the tendency to be in accordance with the principle
  • the Dryer test shows that it is indeed a statistical universal
- Place of articulation features can be induced from the distribution of consonants in CVC sequences
  • consonants with identical place features cluster together, as can be seen in the dendrogram, with a major distinction between coronals and non-coronals
- do features exist? There is evidence for distributional correlates of features (at least with respect to their discrimination)
- data-driven clustering methods on soft constraints can be used as another type of evidence for feature organization
Thank you for your attention!
Acknowledgments
I would like to thank Bernhard Wälchli, Frans Plank, Christian Rohrdantz, Janet Grijzenhout, Michael Hund, Michael Cysouw and Miriam Butt for valuable comments and suggestions.
This work was funded by the Research Initiative CALD at the University of Konstanz and the DFG project “Algorithmic corpus-based approaches to typological comparison” at the LMU Munich and the Philipps University of Marburg.
References
Baayen, R. Harald et al. 1995. “The CELEX Lexical Database (CD-ROM)”. Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA.
Bachra, Bernard N. 2001. The Phonological Structure of the Verbal Roots in Arabic and Hebrew. Studies in Semitic Languages and Linguistics. Leiden: Brill.
Berkley, Deborah. 1994. “The OCP and gradient data”. In: Studies in the Linguistic Sciences 24, pp. 59–72.
Bloomfield, Leonard. 1933. Language. New York: Henry Holt and Co.
Calderone, Basilio. 2009. “Learning phonological categories by Independent Component Analysis”. In: Journal of Quantitative Linguistics 16(2), pp. 132–156.
Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English. New York: Harper & Row.
Clements, George Nick and Elizabeth Hume. 1995. “The internal organization of segments”. In: Handbook of Phonological Theory. Ed. by John Goldsmith. Oxford: Blackwell, pp. 245–306.
Cysouw, Michael. 2005. “Quantitative methods in typology”. In: Quantitative Linguistics: An International Handbook. Ed. by Gabriel Altmann et al. Berlin: Mouton de Gruyter, pp. 554–578.
Dmitrieva, Olga et al. 2008. “Gradient OCP and harmonic alignment in English phonotactics”.
Dryer, Matthew S. 1989. “Large linguistic areas and language sampling”. In: Studies in Language 13(2), pp. 257–292.
Dryer, Matthew S. 1992. “The Greenbergian word order correlation”. In: Language 68(1), pp. 80–138.
Dryer, Matthew S. 2003. “Significant and non-significant implicational universals”. In: Linguistic Typology 7(1), pp. 108–128.
Ellison, Mark T. 1990. Discovering planar segregations. Tech. rep. 90/5. University of Western Australia.
Ellison, Mark T. 1994. “The machine learning of phonological structure”. PhD thesis. University of Western Australia.
Fischer-Jørgensen, Eli. 1952. “On the definition of phoneme categories on a distributional basis”. In: Acta Linguistica 7, pp. 8–39.
Goldsmith, John and Aris Xanthos. 2009. “Learning phonological categories”. In: Language 85(1), pp. 4–38.
Greenberg, Joseph H. 1950. “The patterning of root morphemes in Semitic”. In: Word 6, pp. 161–182.
Halle, Morris. 2003. Phonological features.
Hansson, Gunnar. 2010. Consonant Harmony. Berkeley: University of California Press.
Jakobson, Roman. 1939. “Observations sur le classement phonologique des consonnes”. In: Proceedings of the 3rd International Congress of Phonetic Sciences, pp. 34–41.
Jakobson, Roman et al. 1969[1952]. Preliminaries to Speech Analysis: the Distinctive Features and their Correlates. ’s-Gravenhage: Mouton.
Mayer, Thomas. 2012. “The induction of phonological structure”. PhD thesis. University of Konstanz.
Mayer, Thomas et al. 2010. “Consonant co-occurrence in stems across languages: Automatic analysis and visualization of a phonotactic constraint”. In: Proceedings of the ACL 2010 Workshop on NLP and Linguistics: Finding the Common Ground, pp. 67–75.
McCarthy, John. 1994. “The phonetics and phonology of Semitic pharyngeals”. In: Papers in Laboratory Phonology III: Phonological Structure and Phonetic Form. Ed. by Patricia A. Keating. Cambridge: Cambridge University Press, pp. 191–233.
Powers, David M. W. 1997. “Unsupervised learning of linguistic structure”. In: International Journal of Corpus Linguistics 2(1), pp. 91–131.
Pozdniakov, Konstantin and Guillaume Segerer. 2007. “Similar Place Avoidance: A statistical universal”. In: Linguistic Typology 11(2), pp. 307–348.
Rice, Keren. 1994. “Peripherals in consonants”. In: Canadian Journal of Linguistics 39(3), pp. 191–216.
Sapir, Edward. 1925. “Sound patterns in language”. In: Language 1, pp. 37–51.
Spagnol, Michael. 2011. “A Tale of Two Morphologies. Verb structure and argument alternations in Maltese”. PhD thesis. University of Konstanz.
Spitta-Bey, Wilhelm. 1880. Grammatik des arabischen Vulgärdialectes von Aegypten. Leipzig: Hinrichs.
Sukhotin, Boris V. 1962. “Eksperimental’noe vydelenie klassov bukv s pomoscju EVM”. In: Problemy strukturnoj lingvistiki 234, pp. 189–206.
Trubetzkoy, N. S. 1967[1939]. Grundzüge der Phonologie. 4. Auflage. Göttingen: Vandenhoeck & Ruprecht.
Twaddell, William F. 1939. “Combinations of consonants in stressed syllables in German”. In: Acta Linguistica 1, pp. 189–199.
Twaddell, William F. 1940. “Combinations of consonants in stressed syllables in German (continued)”. In: Acta Linguistica 2, pp. 31–50.
Ward, Joe H. Jr. 1963. “Hierarchical grouping to optimize an objective function”. In: Journal of the American Statistical Association 58(1), pp. 236–244.
Weitzman, Michael. 1987. “Statistical patterns in Hebrew and Arabic roots”. In: Journal of the Royal Asiatic Society 119(1), pp. 15–22.
Wichmann, Søren et al. 2010. “The ASJP Database (version 12)”. URL: http://email.eva.mpg.de/~wichmann/ASJPHomePage.htm