![Page 1: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/1.jpg)
Self-Organization of the Sound Inventories: An Explanation
based on Complex Networks
![Page 2: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/2.jpg)
Overview of the Talk
• Motivation
• Approach & Objective
• Principle of Occurrence in Consonant Inventories
• Principle of Co-Occurrence in Consonant Inventories
• Findings
• Conclusions and Future Work
![Page 3: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/3.jpg)
Sabda Brahma: Sound is Eternity
sabda-brahma su-durbodham pranendriya-mano-mayam ananta-param gambhiram durvigahyam samudra-vat
– Sound is eternal and also very difficult to comprehend. It manifests within the life air, the senses, and the mind. It is unlimited and unfathomable, just like the ocean.
![Page 4: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/4.jpg)
Signals and Symbols
• Several living organisms can produce sound
– They emit sound signals to communicate
– These signals are mapped to certain symbols (meanings) in the brain
– E.g., mating calls, danger alarms
![Page 5: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/5.jpg)
Human Communication
• Human beings also produce sound signals
• Unlike other organisms, they can concatenate these sounds to produce new messages – Language
• Language is one of the primary causes, and one of the primary effects, of human intelligence
![Page 6: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/6.jpg)
Human Speech Sounds
• Human speech sounds are called phonemes – the smallest units of a language
• Phonemes are characterized by certain distinctive features, such as:
I. Place of articulation
II. Manner of articulation
III. Phonation
[Figure: Mermelstein’s Model]
![Page 7: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/7.jpg)
Types of Phonemes
• Vowels: /a/, /i/, /u/
• Consonants: /p/, /t/, /k/
• Diphthongs: /ai/
![Page 8: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/8.jpg)
Choice of Phonemes
• How does a language choose a set of phonemes to build its sound inventory?
• Is the process arbitrary?
• Certainly Not!
• What are the forces affecting this choice?
![Page 9: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/9.jpg)
Forces of Choice
• Speaker: desires “ease of articulation”
• Listener / Learner: desires “perceptual contrast” / “ease of learnability”
[Figure: a speaker and a listener / learner exchanging the sound /a/]
A Linguistic System – How does it look?
The forces shaping the choice are opposing – Hence there has to be a non-trivial solution
![Page 10: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/10.jpg)
Vowels: A (Partially) Solved Mystery
• Languages choose vowels based on maximal perceptual contrast.
• For instance, if a language has three vowels, then in more than 95% of the cases they are /a/, /i/, and /u/.
[Figure: the vowels /a/, /i/, and /u/, with each pair labeled “Maximally Distinct”]
![Page 11: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/11.jpg)
Consonants: A puzzle
• Research: from 1929 to date
• No single satisfactory explanation of the organization of the consonant inventories
– The set of features that characterize consonants is much larger than that of vowels
– No single force is sufficient to explain this organization
– Rather a complex interplay of forces goes on in shaping these inventories
[Figure: Jigsaw]
![Page 12: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/12.jpg)
The Approach & Objective
• We adopt a Complex Network Approach to attack the problem of consonant inventories
• We try to figure out the principle of the distribution of the occurrence of consonants over languages
• We also attempt to figure out the co-occurrence patterns (if any) that are found across the consonant inventories
![Page 13: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/13.jpg)
Principle of Occurrence
• PlaNet – The “Phoneme-Language Network”
– A bipartite network N=(VL,VC,E)
– VL : Nodes representing languages of the world
– VC : Nodes representing consonants
– E : Set of edges which run between VL and VC
• There is an edge e ∈ E between two nodes vl ∈ VL and vc ∈ VC if the consonant c occurs in the language l.
[Figure: The Structure of PlaNet. Language nodes L1 to L4 are linked to the consonant nodes /m/, /ŋ/, /p/, /d/, /s/, and /θ/.]
![Page 14: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/14.jpg)
Construction of PlaNet
• Data Source : UCLA Phonological Inventory Database (UPSID)
• Number of nodes in VL is 317
• Number of nodes in VC is 541
• Number of edges in E is 7022
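The construction can be illustrated with a minimal Python sketch. The four toy inventories below are hypothetical and are not taken from UPSID; the function name `build_planet` is also an assumption made for illustration. Each (language, consonant) pair becomes one edge of the bipartite network, and the degree of a node is the number of edges attached to it.

```python
from collections import Counter

def build_planet(inventories):
    """Build a bipartite language-consonant network from consonant inventories.

    inventories: dict mapping a language name to an iterable of consonants.
    Returns (edges, language_degree, consonant_degree).
    """
    # One edge per (language, consonant) pair; the set removes duplicates.
    edges = {(lang, c) for lang, cons in inventories.items() for c in cons}
    # Degree of a language node = size of its consonant inventory.
    language_degree = Counter(lang for lang, _ in edges)
    # Degree of a consonant node = number of languages it occurs in.
    consonant_degree = Counter(c for _, c in edges)
    return edges, language_degree, consonant_degree

# Hypothetical toy data, for illustration only.
toy_inventories = {
    "L1": ["m", "p", "s"],
    "L2": ["m", "p", "d"],
    "L3": ["m", "ŋ"],
    "L4": ["p", "s", "θ"],
}
edges, language_degree, consonant_degree = build_planet(toy_inventories)
print(len(edges), dict(consonant_degree))
```

Applied to the full UPSID data, the same counting would be expected to give the node and edge totals listed on this slide.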
![Page 15: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/15.jpg)
Degree Distribution
• Degree of a node is defined as the number of edges connected to the node.
• Degree Distribution (DD) is the fraction of nodes, pk, having degree equal to k.
• The Cumulative Degree Distribution (CDD) is the fraction of nodes, Pk, having degree greater than or equal to k.
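These definitions can be sketched in a few lines of Python. The function names and the toy degree list are assumptions for illustration; the cumulative version counts nodes whose degree is at least k.

```python
from collections import Counter

def degree_distribution(degrees):
    """Return {k: p_k}, the fraction of nodes whose degree equals k."""
    n = len(degrees)
    return {k: count / n for k, count in sorted(Counter(degrees).items())}

def cumulative_degree_distribution(degrees):
    """Return {k: P_k}, the fraction of nodes whose degree is at least k."""
    n = len(degrees)
    return {k: sum(1 for d in degrees if d >= k) / n
            for k in sorted(set(degrees))}

# Hypothetical degrees of four nodes.
toy_degrees = [1, 1, 2, 3]
print(degree_distribution(toy_degrees))
print(cumulative_degree_distribution(toy_degrees))
```

The cumulative form is often preferred for plotting because it is less noisy in the tail of the distribution.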
![Page 16: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/16.jpg)
Degree Distribution of PlaNet
[Figure, left panel: pk plotted against language inventory size (degree k)]
pk = beta(k) with α = 7.06 and β = 47.64:
$$p_k = \frac{\Gamma(54.7)}{\Gamma(7.06)\,\Gamma(47.64)}\, k^{6.06} (1-k)^{46.64}$$
kmin = 5, kmax = 173, kavg = 21
[Figure, right panel: log-log plot of Pk against the degree of a consonant, k]
$$P_k = k^{-0.71}$$
with an exponential cut-off
DD of the language nodes follows a β-distribution
DD of the consonant nodes follows a power-law with an exponential cut-off
The distribution of consonants over languages follows a power-law
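As a rough check of the β-distribution formula, the density can be evaluated numerically. This sketch assumes the argument is a degree rescaled into the open interval (0, 1); the slides do not state how that rescaling was done, so the variable `x` here is an assumption. The function name `beta_pdf` is also illustrative.

```python
import math

def beta_pdf(x, a=7.06, b=47.64):
    """Beta density with the fitted parameters; x is assumed to lie in (0, 1)."""
    if not 0.0 < x < 1.0:
        return 0.0
    # log of Gamma(a + b) / (Gamma(a) * Gamma(b)), computed stably via lgamma.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

# Midpoint-rule integration over (0, 1); a proper density integrates to about 1.
steps = 10000
area = sum(beta_pdf((i + 0.5) / steps) for i in range(steps)) / steps
print(round(area, 4))
```

With a = 7.06 and b = 47.64, the exponents a - 1 and b - 1 are 6.06 and 46.64, and a + b is 54.7, matching the constants in the formula.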
![Page 17: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/17.jpg)
Preferential Attachment: The Key to Power Law
• Power law distributions observed in
– Social Networks
– Biological Networks
– Internet Graphs
– Citation Networks
• These distributions emerge due to preferential attachment
[Figure: the rich get richer]
![Page 18: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/18.jpg)
Synthesis of PlaNet
Given: VL = {L1, L2, ..., L317}, sorted in ascending order of their degrees, and 541 unlabeled nodes in VC.
Step 0: All nodes in VC have degree 0.
Step t+1:
Choose a language node Lj (in order) with cardinality kj (inventory size)
for c running from 1 to kj do
Connect Lj preferentially with a consonant node Ci ∈ VC, to which it is not already connected, with probability
$$\Pr(C_i) = \frac{d_i^{\alpha} + \epsilon}{\sum_{x \in V^*} (d_x^{\alpha} + \epsilon)}$$
where di = degree of node Ci at step t, V* = subset of VC not connected to Lj at t, and ε is the smoothing parameter.
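The synthesis procedure can be sketched in Python as follows. This is one possible reading, not necessarily the authors' implementation: degrees are frozen at the start of each step (the "degree at step t"), and the candidate set V* shrinks as the current language gets connected. The function name, the default parameter values, and the toy inventory sizes are assumptions for illustration.

```python
import random

def synthesize_planet(inventory_sizes, n_consonants, alpha=1.44, eps=0.5, seed=0):
    """Grow a bipartite network by preferential attachment.

    inventory_sizes: list of language degrees k_j (each at most n_consonants).
    Returns (consonant_degrees, edges) with edges as (language_index, consonant_index).
    """
    rng = random.Random(seed)
    degree = [0] * n_consonants          # Step 0: all consonant nodes have degree 0
    edges = []
    for j, k_j in enumerate(sorted(inventory_sizes)):   # languages in ascending degree
        snapshot = degree[:]             # degrees d_i at step t
        available = list(range(n_consonants))            # V*: not yet linked to L_j
        for _ in range(k_j):
            # Unnormalized weights d_i^alpha + eps; choices() normalizes over V*.
            weights = [snapshot[i] ** alpha + eps for i in available]
            pick = rng.choices(available, weights=weights, k=1)[0]
            available.remove(pick)       # never connect the same pair twice
            degree[pick] += 1
            edges.append((j, pick))
    return degree, edges

# Hypothetical small run: 4 languages, 10 consonant nodes.
degrees, edges = synthesize_planet([3, 5, 2, 4], n_consonants=10, seed=1)
print(sorted(degrees, reverse=True))
```

The smoothing parameter ε keeps the probability of picking a degree-0 consonant non-zero, which is what allows new consonants to enter the network at all.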
![Page 19: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/19.jpg)
The Preferential Mechanism of Synthesis
[Figure: the synthesized network with language nodes L1 to L4, shown after step 3 and after step 4]
![Page 20: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/20.jpg)
Simulation Result
The parameters α and ε are 1.44 and 0.5 respectively.
The results are averaged over 100 runs
[Figure: log-log plot of Pk against degree k for PlaNet, PlaNetsyn, and PlaNetrand]
![Page 21: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/21.jpg)
Principle of Co-occurrence
• Consonants tend to co-occur in groups or communities
• These groups tend to be organized around a few distinctive features (based on: manner of articulation, place of articulation & phonation) – Principle of feature economy
If a language has some of the following plosives in its inventory, then it will also tend to have the others:
– bilabial: /b/ (voiced), /p/ (voiceless)
– dental: /d/ (voiced), /t/ (voiceless)
![Page 22: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/22.jpg)
How to Capture these Co-occurrences?
• PhoNet – the “Phoneme-Phoneme Network”
– A weighted network N = (VC, E)
– VC : Nodes representing consonants
– E : Set of edges which run between the nodes in VC
• There is an edge e ∈ E between two nodes vc1, vc2 ∈ VC if the consonants c1 and c2 co-occur in a language. The number of languages in which c1 and c2 co-occur defines the edge-weight of e. The number of languages in which c1 occurs defines the node-weight of vc1.
[Figure: a fragment of PhoNet showing the consonant nodes /kw/, /k′/, /k/, and /d′/ with their node weights and edge weights]
![Page 23: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/23.jpg)
Construction of PhoNet
• Data Source : UPSID
• Number of nodes in VC is 541
• Number of edges is 34012
[Figure: PhoNet]
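A minimal Python sketch of this construction is given below. The toy inventories are hypothetical and are not UPSID data; `build_phonet` is an illustrative name. Node weights count the languages a consonant occurs in, and edge weights count the languages in which a pair of consonants co-occurs.

```python
from collections import Counter
from itertools import combinations

def build_phonet(inventories):
    """Return (node_weight, edge_weight) for the consonant co-occurrence network.

    inventories: iterable of consonant collections, one per language.
    Edge keys are sorted consonant pairs, so each pair appears only once.
    """
    node_weight = Counter()
    edge_weight = Counter()
    for consonants in inventories:
        unique = sorted(set(consonants))
        node_weight.update(unique)                    # one count per language
        edge_weight.update(combinations(unique, 2))   # every co-occurring pair
    return node_weight, edge_weight

# Hypothetical toy inventories.
toy = [["k", "kw", "p"], ["k", "p"], ["k", "kw"]]
node_weight, edge_weight = build_phonet(toy)
print(node_weight["k"], edge_weight[("k", "kw")])
```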
![Page 24: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/24.jpg)
Community Structures in PhoNet
• Radicchi et al. algorithm (for unweighted networks)
– Counts the number of triangles that an edge is a part of. Inter-community edges will have a low count, so they are removed.
• Modification for a weighted network like PhoNet
– Look for triangles where the weights on the edges are comparable.
– If the weights are comparable, the group of consonants co-occurs frequently; otherwise it does not.
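– Measure strength S for each edge (u,v) in PhoNet, where S is
$$S = \frac{w_{uv}}{\sqrt{\sum_{i \in V_C - \{u,v\}} (w_{ui} - w_{vi})^2}} \quad \text{if } \sqrt{\sum_{i \in V_C - \{u,v\}} (w_{ui} - w_{vi})^2} > 0, \text{ else } S = \infty$$
– Remove edges with S less than a threshold η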
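The edge-strength computation and thresholding can be sketched as follows. Two assumptions are made that the slides do not state: a missing edge is treated as weight 0, and the communities are read off as the connected components that remain after weak edges are removed. All names and the toy weights are illustrative.

```python
import math

def edge_strength(weights, nodes, u, v):
    """Strength S of edge (u, v); weights maps frozenset({a, b}) to an edge weight."""
    def w(a, b):
        return weights.get(frozenset((a, b)), 0)   # assumption: absent edge = 0
    spread = math.sqrt(sum((w(u, i) - w(v, i)) ** 2
                           for i in nodes if i not in (u, v)))
    return math.inf if spread == 0 else w(u, v) / spread

def communities(weights, nodes, eta):
    """Drop edges with S < eta and return the remaining connected components."""
    adjacency = {n: set() for n in nodes}
    for pair in weights:
        u, v = sorted(pair)
        if edge_strength(weights, nodes, u, v) >= eta:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                      # depth-first traversal
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical weights: a, b, c co-occur strongly; d is weakly tied to c.
toy_nodes = ["a", "b", "c", "d"]
toy_weights = {frozenset(("a", "b")): 10, frozenset(("a", "c")): 10,
               frozenset(("b", "c")): 10, frozenset(("c", "d")): 1}
print(communities(toy_weights, toy_nodes, eta=1))
```

In this toy network the edge between c and d has a small weight but very different neighbourhoods on its two ends, so its strength falls below the threshold and d is split off.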
![Page 25: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/25.jpg)
Community Formation
[Figure: a six-node weighted network (edge weights from 10 to 110) is converted to edge strengths S (from 0.06 to 11.11); thresholding with η > 1 then leaves the communities]
For different values of η we get different sets of communities
![Page 26: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/26.jpg)
Consonant Societies!
[Figure: consonant communities obtained at η = 1.25, η = 0.72, η = 0.60, and η = 0.35]
![Page 27: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/27.jpg)
Evaluation of the Communities: Occurrence Ratio
• Hypothesis: The communities obtained from the algorithm should be found frequently in UPSID
• We define the occurrence ratio of a community C in a language L to capture the “intensity” of occurrence:
$$O_L = \frac{M}{N - (R_{\mathrm{top}} - 1)}$$
– N is the number of consonants in C (ranked in ascending order of frequency of occurrence), M is the number of consonants of C that occur in a language L, and Rtop is the rank of the highest-ranking consonant in L that is also present in C
– If a high-frequency consonant is present in L, it is not necessary that a low-frequency one is also present; but if a lower-frequency one is present, then the higher-frequency ones are expected to be present as well
![Page 28: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/28.jpg)
Computing Occurrence Ratio: An Example
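Community C, ranked: /kw/ (R = 1), /kh/ (R = 2), /k/ (R = 3)
• L1 contains /kw/, /kh/, /k/: M = 3, N = 3, Rtop = 1, so OL = 3/3 = 1
• L2 contains /kh/, /k/ (/kw/ absent): M = 2, N = 3, Rtop = 2, so OL = 2/2 = 1
• L3 contains /kw/, /kh/ (/k/ absent): M = 2, N = 3, Rtop = 1, so OL = 2/3 ≈ 0.67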
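The example can be reproduced with a short Python sketch. It assumes that rank 1 is the least frequent consonant of the community and that the first two languages differ only in whether that rank-1 consonant is present; the value 0 for a language containing no member of C is also an assumption, since Rtop is undefined in that case. The function name is illustrative.

```python
def occurrence_ratio(community_ranked, inventory):
    """O_L = M / (N - (R_top - 1)) for one language.

    community_ranked: consonants of C listed from rank 1 upward.
    inventory: set of consonants of the language L.
    """
    present = [c in inventory for c in community_ranked]
    m = sum(present)                      # M: members of C found in L
    if m == 0:
        return 0.0                        # assumption: no member present
    n = len(community_ranked)             # N: size of C
    r_top = present.index(True) + 1       # rank of the first present member
    return m / (n - (r_top - 1))

C = ["kw", "kh", "k"]
print(occurrence_ratio(C, {"kw", "kh", "k"}))   # all three present
print(occurrence_ratio(C, {"kh", "k"}))         # rank-1 consonant absent
print(occurrence_ratio(C, {"kw", "kh"}))        # rank-3 consonant absent
```

Missing a rare (low-rank) consonant shrinks the denominator and is therefore not penalized, whereas missing a frequent one lowers the ratio.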
![Page 29: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/29.jpg)
Average Occurrence Ratio
• A given community C has an occurrence ratio OL in each language L in UPSID
• We average this ratio over all L as
$$O_{av} = \frac{\sum_{L \in \mathrm{UPSID}} O_L}{L_{occur}}$$
where Loccur is the number of languages where at least one of the members of C has occurred
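The averaging step can be sketched in Python as below. Languages that contain no member of C contribute nothing to the sum and are not counted in Loccur; that reading, the function name, and the toy inventories are assumptions for illustration.

```python
def average_occurrence_ratio(community_ranked, inventories):
    """Average O_L over the languages that contain at least one member of C."""
    n = len(community_ranked)
    total, l_occur = 0.0, 0
    for inventory in inventories:
        present = [c in inventory for c in community_ranked]
        m = sum(present)
        if m == 0:
            continue                       # language does not count toward L_occur
        r_top = present.index(True) + 1    # rank of the first present member of C
        total += m / (n - (r_top - 1))     # O_L for this language
        l_occur += 1
    return total / l_occur if l_occur else 0.0

# Hypothetical data: three languages containing members of C, one containing none.
C = ["kw", "kh", "k"]
languages = [{"kw", "kh", "k"}, {"kh", "k"}, {"kw", "kh"}, {"p", "t"}]
print(average_occurrence_ratio(C, languages))
```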
![Page 30: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/30.jpg)
Results of the Evaluation
Consonants show patterns of co-occurrence in 80% or more of the world’s languages
[Figure annotations: η > 0.3, Oav > 0.8]
![Page 31: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/31.jpg)
The Binding Force of the Communities: Feature Economy
• Feature Entropy: The idea is borrowed from information theory
• For a community C of size N, let there be pf consonants for which a particular feature f is present and qf other consonants for which f is absent. The probability that a consonant chosen from C has f is pf /N, and the probability that it does not have f is qf /N, or (1 − pf /N)
• Feature entropy can therefore be defined as
$$FE = \sum_{f \in F} \left( -\frac{p_f}{N} \log \frac{p_f}{N} - \frac{q_f}{N} \log \frac{q_f}{N} \right)$$
where F is the set of all features present in the consonants in C
• Essentially, this is the number of bits needed to transmit the entire information about C through a channel.
![Page 32: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/32.jpg)
Computing Feature Entropy
Lower FE -> C1 economizes on the number of features
Higher FE -> C2 does not economize on the number of features
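The comparison can be illustrated with a small Python sketch of the feature entropy definition. The two communities and their feature assignments below are hypothetical simplifications chosen for illustration, base-2 logarithms are assumed because the quantity is described in bits, and 0·log 0 is taken to be 0.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a feature present with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0                        # convention: 0 * log 0 = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def feature_entropy(community):
    """community: dict mapping each consonant to its set of features."""
    n = len(community)
    all_features = set().union(*community.values())
    return sum(
        binary_entropy(sum(f in feats for feats in community.values()) / n)
        for f in all_features
    )

# Hypothetical feature assignments, for illustration only.
c1 = {"p": {"voiceless", "plosive", "bilabial"},
      "t": {"voiceless", "plosive", "dental"},
      "k": {"voiceless", "plosive", "velar"}}
c2 = {"b": {"voiced", "plosive", "bilabial"},
      "s": {"voiceless", "fricative", "alveolar"},
      "m": {"voiced", "nasal", "bilabial"}}
print(feature_entropy(c1), feature_entropy(c2))
```

Features shared by every member of a community contribute zero entropy, so a community built around a few common features scores lower than one that mixes many unrelated features.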
![Page 33: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/33.jpg)
If the Inventories had Evolved by Chance!
• Construction of PhoNetrand
– For each consonant c let the frequency of occurrence in UPSID be denoted by fc.
– Let there be 317 bins each corresponding to a language in UPSID.
– fc bins are then chosen uniformly at random and the consonant c is packed into these bins without repetition.
– Thus the consonant inventories of the 317 languages corresponding to the bins are generated.
– PhoNetrand can be constructed from these new consonant inventories similarly as PhoNet.
• Cluster PhoNetrand by the method proposed earlier
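The random baseline can be sketched in Python as follows. The frequencies in the example are hypothetical, and the function name and seed parameter are assumptions for illustration; each consonant is placed into fc distinct language bins chosen uniformly at random.

```python
import random

def random_inventories(frequencies, n_languages=317, seed=0):
    """Generate random consonant inventories that preserve each frequency f_c.

    frequencies: dict mapping a consonant to its number of languages (f_c),
    with every f_c at most n_languages.
    Returns a list of n_languages sets of consonants.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_languages)]
    for consonant, f_c in frequencies.items():
        # sample() draws f_c distinct bins, so a consonant is never repeated in a bin
        for bin_index in rng.sample(range(n_languages), f_c):
            inventories[bin_index].add(consonant)
    return inventories

# Hypothetical frequencies, for illustration only.
toy_frequencies = {"k": 280, "p": 260, "kw": 40}
randomized = random_inventories(toy_frequencies)
print(sum("kw" in inv for inv in randomized))
```

Because only the per-consonant frequencies are preserved, any co-occurrence structure found in the resulting network arises by chance alone.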
![Page 34: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/34.jpg)
Comparison between PhoNet and PhoNetrand
[Figure: average feature entropy plotted against community size for PhoNet and PhoNetrand]
The curve shows the average feature entropy of the communities of a particular size versus the community size
![Page 35: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/35.jpg)
Our Findings
• The distribution of the occurrence of consonants over languages follows a power-law behavior;
• A preferential attachment-based model can reproduce this distribution of occurrence to a very close approximation (mean error ~0.01);
• The patterns of co-occurrence of the consonants, reflected through communities in PhoNet, are observed in 80% or more of the world's languages;
• Such patterns of co-occurrence would not have emerged if the consonant inventories had evolved just by chance.
![Page 36: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/36.jpg)
The Epilogue
• How to explain preferential attachment?
– Perhaps it is due to the linguistic heterogeneity involved in the process of language change (at the microscopic level)
– Consonants belonging to languages that are prevalent among the speakers in one generation have a higher (and higher) chance of getting transmitted to the speakers of the subsequent generations
– The above heterogeneity manifests as preferential attachment at the mesoscopic level
• What is the cause of the origin of feature economy?
– Perhaps it is the outcome of the interplay of functional forces, such as perceptual contrast and ease of learnability, that is reflected as feature economy
[Figure: Indo-European family of languages]
![Page 37: Self-Organization of the Sound Inventories: An Explanation based on Complex Networks](https://reader033.vdocuments.mx/reader033/viewer/2022051401/56649cf45503460f949c1d34/html5/thumbnails/37.jpg)
Thank you!