The Large-Scale Structure of Semantic Networks

A. Tuba Baykara, Cognitive Science, 2002700187

Page 1: The Large-Scale Structure of Semantic Networks

The Large-Scale Structure of Semantic Networks

A. Tuba Baykara, Cognitive Science

2002700187

Page 2: The Large-Scale Structure of Semantic Networks


Overview

1) Introduction
2) Analysis of 3 semantic networks and their statistical properties
   - Associative Network
   - WordNet
   - Roget’s Thesaurus
3) The Growing Network Model proposed by the authors
   - Undirected Growing Network Model
   - Directed Growing Network Model
4) Psychological Implications of the findings
5) General Discussion and Conclusions

Page 3: The Large-Scale Structure of Semantic Networks


1) Introduction

Semantic Network: a network in which concepts are represented as hierarchies of interconnected nodes, which are linked to characteristic attributes.

Important to understand their structure, because they reflect the organization of meaning and language.

Statistical similarities are important because of their implications for language evolution and/or acquisition.

Would a model grown by similar principles have the same statistical properties? → the Growing Network Model

Page 4: The Large-Scale Structure of Semantic Networks

1) Introduction: Predictions related to the model

1- It would have the same characteristics:

* Degree distribution would follow a power law → some concepts would have many more connections than others
* Addition of new concepts would not change this structure → scale-free (vs. small-world!!)

2- Previously added (early-acquired) concepts would have higher connectivity than later-added (acquired) concepts.

Page 5: The Large-Scale Structure of Semantic Networks

1) Introduction: Terminology

Graph, network
– Node, edge (undirected link), arc (directed link), degree
– Avg. shortest path (L), diameter (D), clustering coefficient (C), degree distribution P(k)

Small-world network, random graph

Page 6: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: a. Associative Network

“The University of South Florida Word Association, Rhyme and Word Fragment Norms”

More than 6,000 participants; ~750,000 responses to 5,019 cues (stimulus words).

The great majority of these words are nouns (76%), but adjectives (13%), verbs (7%), and other parts of speech are also represented. In addition, 16% are identified as homographs.

Page 7: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: a. Associative Network

Examples:

Cue: BOOK _______   →   Response: READ
Cue: SUPPER _______ →   Response: LUNCH

Page 8: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: a. Associative Network

          DINNER  SUPPER   EAT   LUNCH   FOOD   MEAL
DINNER      -      0.54   0.11   0.10   0.09   0.09
SUPPER     0.55     -     0.02   0.03   0.17   0.01
EAT                        -            0.41   0.02
LUNCH      0.27    0.02   0.08    -     0.20   0.06
FOOD               0.41   0.01           -     0.02
MEAL       0.21    0.06   0.06   0.06   0.49    -

Note: for simplicity, the networks were constructed with all arcs and edges unlabeled and equally-weighted.

Forward & backward strength imply directions.

(when SUPPER was normed, it produced LUNCH as a target with a forward strength of .03)
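(A minimal sketch of this construction, not from the original study: the cue-response triples and the use of networkx are illustrative assumptions.)

```python
# Build directed and undirected associative networks from
# (cue, response, forward strength) norms. Data are made up.
import networkx as nx

norms = [
    ("DINNER", "SUPPER", 0.54), ("SUPPER", "DINNER", 0.55),
    ("SUPPER", "LUNCH", 0.03), ("MEAL", "FOOD", 0.49),
]

directed = nx.DiGraph()
for cue, response, strength in norms:
    # Arcs are unlabeled and equally weighted, as in the slides;
    # the strength is recorded but not used as an edge weight.
    directed.add_edge(cue, response, strength=strength)

# Undirected version: an edge whenever two words are associatively
# related, regardless of direction.
undirected = directed.to_undirected()

print(directed.number_of_edges(), undirected.number_of_edges())
```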

Page 9: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: a. Associative Network

I) Undirected network: word nodes were joined by an edge if associatively related, regardless of associative direction.

[Figure: the shortest path from VOLCANO to ACHE is highlighted.]

Page 10: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: a. Associative Network

II) Directed network: words x and y were joined by an arc from x to y if cue x evoked y as an associative response.

[Figure: all shortest directed paths from VOLCANO to ACHE are shown.]

Page 11: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: b. Roget’s Thesaurus

1911 edition with 29,000 words in 1,000 semantic categories.

A connection is made between a word and a semantic category only if that word belongs to that category → a bipartite graph.

Page 12: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: b. Roget’s Thesaurus

[Figure: the words calculator, numbering, accounting, computer, imitation, map, design, perspective, chalk, and monochrome shown twice: as a bipartite graph linking them to the categories NUMERATION, REPRESENTATION, and PAINTING, and as the corresponding unipartite graph in which two words are linked directly whenever they share a category.]
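(A sketch of the bipartite-to-unipartite conversion shown above; the word-category assignments are guesses read off the figure, and networkx's bipartite projection is assumed.)

```python
# Project a word-category bipartite graph onto its word nodes:
# two words become linked iff they share at least one category.
import networkx as nx
from networkx.algorithms import bipartite

memberships = {  # hypothetical reading of the figure
    "NUMERATION": ["calculator", "numbering", "accounting", "computer"],
    "REPRESENTATION": ["imitation", "map", "design", "perspective"],
    "PAINTING": ["chalk", "monochrome", "design"],
}

B = nx.Graph()
for category, words in memberships.items():
    B.add_node(category, bipartite=0)
    for word in words:
        B.add_node(word, bipartite=1)
        B.add_edge(category, word)

word_nodes = {n for n, d in B.nodes(data=True) if d["bipartite"] == 1}
W = bipartite.projected_graph(B, word_nodes)   # unipartite word graph
print(sorted(W.edges()))
```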

Page 13: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: c. WordNet

Developed by George Miller at the Cognitive Science Laboratory, Princeton University: http://wordnet.princeton.edu

Based on relations between synsets (sets of synonyms); at the time it contained more than 120k word forms and 99k meanings.

Example: the noun "computer" has 2 senses in WordNet:
1. computer, computing machine, computing device, data processor, electronic computer, information processing system (a machine for performing calculations automatically)
2. calculator, reckoner, figurer, estimator, computer (an expert at calculation (or at operating calculating machines))

Page 14: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: c. WordNet

Links connect word forms and their meanings according to relationships between word forms such as:

– SYNONYMY
– POLYSEMY
– ANTONYMY
– HYPERNYMY (computer is a kind of machine/device/object)
– HYPONYMY (digital computer / Turing machine … is a kind of computer)
– HOLONYMY (a computer is part of a platform)
– MERONYMY (CPU / chip / keyboard … is part of a computer)

Links can be established in any desired way, so WordNet was treated as an undirected graph.

Page 15: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties

I) How sparse are the 3 networks?
⟨k⟩: avg. number of connections per node. In all of them, a node is connected to only a small percentage of the other nodes.

II) How connected are the networks?
Undirected A/N: completely connected. Directed A/N: the largest connected component contains 96% of all words. WordNet & Thesaurus: 99%.

All further analyses were done with these components!
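(A sketch of how these two checks can be run with networkx; the random graph here is a stand-in, not the actual norms data.)

```python
# Sparsity: average degree <k>; connectedness: largest component.
import networkx as nx

G = nx.gnp_random_graph(5000, 0.004, seed=1, directed=True)  # stand-in

k_avg = sum(d for _, d in G.out_degree()) / G.number_of_nodes()
print(f"<k> = {k_avg:.1f} of {G.number_of_nodes() - 1} possible neighbors")

# Largest strongly connected component of the directed graph;
# all further analyses would be restricted to this component.
giant = max(nx.strongly_connected_components(G), key=len)
print(f"largest SCC: {len(giant) / G.number_of_nodes():.0%} of nodes")
```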

Page 16: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties

[Table: summary statistics of the four networks.]

Page 17: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties

III) Short path length (L) and diameter (D)
In WordNet & Thesaurus, L & D were based on a sample of 10,000 words; in the A/N, all words were considered. L & D are close to those of random graphs of equivalent size, as expected.

IV) Local clustering (C)
To measure its C, the directed A/N was regarded as undirected. To calculate C for the Thesaurus, the bipartite graph was converted into a unipartite graph. C of all 4 networks is much higher than in random graphs.
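(A sketch of these measurements, with the same workarounds: sampling words for L and D, and an undirected graph for C. The Watts-Strogatz graph is a stand-in for the real networks.)

```python
import random
import networkx as nx

G = nx.connected_watts_strogatz_graph(10000, 22, 0.3, seed=1)  # stand-in

# L and D estimated from a sample of source words, as for WordNet
# and the Thesaurus (here 100 sources instead of 10,000 words).
sources = random.sample(list(G.nodes()), 100)
dists = [d for s in sources
         for d in nx.single_source_shortest_path_length(G, s).values()
         if d > 0]
L = sum(dists) / len(dists)     # average shortest-path length
D = max(dists)                  # (sampled) diameter

C = nx.average_clustering(G)    # local clustering coefficient
print(f"L = {L:.2f}, D = {D}, C = {C:.3f}")
```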

Page 18: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties

V) Power-law degree distribution P(k) ~ k^(-γ)

• All distributions are plotted in log-log coordinates, with the line showing the best-fitting power-law distribution.
• γ for the in-degree distribution of the directed A/N is lower than for the rest.

These semantic networks are scale-free!
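(A sketch of the log-log fit behind these plots; the Barabási-Albert graph is a stand-in, and the plain least-squares fit is a crude estimator compared to a careful analysis.)

```python
# Estimate the power-law exponent gamma from the degree distribution
# by fitting a line to log P(k) vs. log k.
import numpy as np
import networkx as nx

G = nx.barabasi_albert_graph(5000, 5, seed=1)   # stand-in scale-free graph
degrees = np.array([d for _, d in G.degree()])

ks, counts = np.unique(degrees, return_counts=True)
pk = counts / counts.sum()                      # empirical P(k)

slope, intercept = np.polyfit(np.log(ks), np.log(pk), 1)
print(f"estimated gamma = {-slope:.2f}")        # P(k) ~ k^(-gamma)
```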

Page 19: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties / Summary

Sparsity & high connectivity: on average, words are related to only a few other words.

Local clustering: connections between words are coherent and transitive: if x–y and y–z, then often x–z.

Short path length and diameter: language is expressive and flexible (through polysemy, homonymy, …).

Power-law degree distribution: language hosts hubs as well as many words connected to few others.

Page 20: The Large-Scale Structure of Semantic Networks


3) The Growing Network Model

Inspired by Barabási & Albert (1999). Incorporates both growth and preferential attachment.

Aim: to see whether the same mechanisms are at work in real-life semantic networks as in artificial ones.

Might be applied to lexical development in children, to the growth of semantic structures across languages, or even to language evolution.

Page 21: The Large-Scale Structure of Semantic Networks


3) The Growing Network Model

Assumptions:

– Children learn concepts through semantic differentiation: a new concept differentiates an already existing one, acquiring a similar but different meaning, with a different pattern of connectivity.
– More complex concepts get differentiated more.
– More frequent concepts get involved in differentiation more often.

Page 22: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: Structure

Nodes are words, and connections are semantic associations/relations.

Nodes differ in their utility → frequency of use.

Over time, new nodes are added and attached to existing nodes probabilistically, according to:
– Locality principle: new links are added only into a local neighborhood → a set of nodes with a common neighbor.
– Size principle: new connections go to neighborhoods that already have a large number of connections.
– Utility principle: new connections within a neighborhood land on nodes with high utility (the rich-get-richer phenomenon).

Page 23: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: a. Undirected GN Model

Aim: to grow a network with n nodes; the number of nodes at time t is n(t). Start with a fully connected network of M nodes (M << n). At each step, add a node i with M links (M chosen for a desired avg. density of connections) into a local neighborhood H_i → the set of neighbors of i, including i itself.

Choose a neighborhood according to the size principle:

P(H_i) = k_i(t) / Σ_j k_j(t)

k_i(t): degree of node i at time t; the sum ranges over all current n(t) nodes in the network.

Page 24: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: a. Undirected GN Model

Connect to a node j in the neighborhood of node i according to the utility principle:

P(j | H_i) = u_j / Σ_l u_l   (the sum ranges over all nodes l in H_i)

If all utilities are equal, make the connection randomly:

P(j | H_i) = 1 / (k_i(t) + 1)

Stop when n nodes are reached.

u_j = log(f_j + 1); f_j taken from the Kučera & Francis (1967) frequency counts.
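(A minimal sketch of the undirected GN model as I read these two slides; this is not the authors' code, and the word utilities are random stand-ins for the log(f+1) frequencies.)

```python
# Grow a network of n nodes: start from an M-clique, then repeatedly
# pick a neighborhood H_i by the size principle and attach a new node
# to M members of H_i chosen by the utility principle.
import random

def grow_network(n, M, utility):
    neighbors = {i: set(range(M)) - {i} for i in range(M)}  # M-clique
    while len(neighbors) < n:
        new = len(neighbors)
        # Size principle: P(H_i) = k_i / sum_j k_j
        nodes = list(neighbors)
        i = random.choices(nodes, weights=[len(neighbors[v]) for v in nodes])[0]
        # Locality principle: connect only inside H_i = {i} U neighbors(i)
        H = list(neighbors[i] | {i})
        targets = set()
        while len(targets) < min(M, len(H)):
            # Utility principle: P(j | H_i) proportional to u_j
            targets.add(random.choices(H, weights=[utility[v] for v in H])[0])
        neighbors[new] = set()
        for j in targets:
            neighbors[new].add(j)
            neighbors[j].add(new)
    return neighbors

utility = [random.random() + 0.1 for _ in range(150)]  # stand-in for log(f+1)
net = grow_network(n=150, M=2, utility=utility)
print(sum(len(v) for v in net.values()) // 2, "edges")
```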

Page 25: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: a. Undirected GN Model

[Figure: the growth process and a small resulting network with n = 150, M = 2.]

Page 26: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: b. Directed GN Model

Very similar to the undirected GN model: insert nodes with M arcs instead of edges.

The same equations apply the locality, size, and utility principles, since:

k_i = k_i^in + k_i^out

Difference → the direction principle: the majority (!) of arcs point from new nodes to existing nodes. The probability that an arc points away from the new node is α, where α > 0.5 is assumed, so most arcs will point toward existing nodes.
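(A small sketch of the direction principle only, under the same assumptions as above.)

```python
import random

def orient(new_node, existing_node, alpha=0.95):
    # With probability alpha (> 0.5) the arc points away from the
    # new node, i.e. toward the existing node.
    if random.random() < alpha:
        return (new_node, existing_node)
    return (existing_node, new_node)
```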

Page 27: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: Model Results

Due to computational constraints, the GN model was compared only with the A/N.

n = 5,018; M = 11 and M = 12 in the undirected and directed GN models, respectively.

The only free parameter in the directed GN model, α, was set to 0.95.

The networks produced by the model are similar to the A/N in terms of their L, D, and C, with the same low γ for the in-degree distribution as in the directed A/N.

Page 28: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: Model Results

It was also checked whether the same results would be produced when the directed GN model was converted into an undirected one (why!?).

Convert all arcs into edges, with M = 11 and α = 0.95 → results similar to the undirected GN model.

The degree distribution follows a power law.

Page 29: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: Argument

L, C, and γ from the artificial networks were expected to match those of the real-life networks because of:

– the incorporation of growth
– the incorporation of preferential attachment (locality, size & utility principles)

Do models without growth fail to produce such power laws? → Analyze the co-occurrence of words within a large corpus.

Latent Semantic Analysis (LSA): the meaning of words can be represented by vectors in a high-dimensional space.

Landauer & Dumais (1997) have already shown that local neighborhoods in semantic space capture semantic relations between words.
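(A sketch of the LSA-style construction this argument refers to: a truncated SVD of a word-document count matrix, with words linked when their vectors are sufficiently similar. All data here are synthetic, and the threshold is arbitrary.)

```python
# LSA-style vectors from a word x document matrix, then a network
# from cosine similarities between word vectors.
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(0.3, size=(2000, 500)).astype(float)  # word-doc counts

U, s, Vt = np.linalg.svd(X, full_matrices=False)
vectors = U[:, :50] * s[:50]          # keep 50 latent dimensions

unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
sims = unit @ unit.T                  # cosine similarity matrix
np.fill_diagonal(sims, 0.0)

degrees = (sims > 0.5).sum(axis=1)    # link words above a threshold
print("mean degree:", degrees.mean())
```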

Page 30: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: LSA Results

Higher L, D, and C than in real-life semantic networks.

Very different degree distributions: they do not follow a power law, and the slope of the best-fitting line is difficult to interpret.

Page 31: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: LSA Results

Analysis of the TASA corpus (>10 million words) using the LSA vector representation:

[Figure: degree distributions for three word sets: all words from the A/N that appear in TASA; the most frequent words in TASA; all words from LSA (>92k) represented as vectors.]

Page 32: The Large-Scale Structure of Semantic Networks

3) The Growing Network Model: LSA Results

The absence of a power-law degree distribution implies that LSA does not produce hubs.

In contrast, a growing model provides a principled explanation for the origin of the power law: words with high connectivity acquire even more connections over time.

Page 33: The Large-Scale Structure of Semantic Networks


4) Psychological Implications

The number of connections a node has is related to the time at which the node was introduced into the network.

Predictions:
– Concepts that are learned early in life will have more connections than concepts learned later.
– Concepts with high utility (frequency) will receive more links than concepts with lower utility.

Page 34: The Large-Scale Structure of Semantic Networks

4) Psychological Implications: Analysis of AoA-related data

To test the predictions, two data sets were analyzed:

I) Age-of-acquisition ratings (Gilhooly & Logie, 1980)
AoA effect: early-acquired words are retrieved from memory more rapidly than late-acquired words. In a study of 1,944 words, adults were required to estimate the age at which they thought they first learned each word, on a rating scale from 100 to 700 (700 = a very late-learned concept).

II) Picture-naming norms (Morrison, Chappell & Ellis, 1997)
Estimates of the age at which 75% of children could successfully name the object depicted by a picture.

Page 35: The Large-Scale Structure of Semantic Networks

4) Psychological Implications: Analysis of AoA-related data

Predictions are confirmed!

[Figure: mean number of connections by AoA, with standard error bars around the means.]

Page 36: The Large-Scale Structure of Semantic Networks

4) Psychological Implications: Discussion

Important consequences for psychological research on AoA and word frequency:

– Weakens the claims that:
  • AoA affects mainly the speech-output system
  • AoA & word frequency exert their effects on behavioral tasks independently

– Confirms the findings that:
  • early-acquired words show short naming latencies and lexical-decision latencies
  • AoA affects semantic tasks
  • AoA is mere cumulative frequency

Page 37: The Large-Scale Structure of Semantic Networks

4) Psychological Implications: Correlational Analysis of Findings

Early-acquired words have more semantic connections (they are more central in an underlying semantic network) → early-acquired words have higher degree centrality.

Centrality can also be measured by computing the eigenvector of the adjacency matrix with the largest eigenvalue (see the sketch below).

Analysis of how degree centrality, word frequency, and AoA from previous rating & naming studies correlate with 2 databases:

– Naming-latency database of 796 words
– Lexical-decision-latency database of 2,905 words
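(A sketch of that centrality measure: power iteration converging to the eigenvector of the adjacency matrix with the largest eigenvalue. The toy adjacency matrix is illustrative.)

```python
import numpy as np

def eigenvector_centrality(A, iters=200):
    # Repeatedly multiply by A and renormalize; converges to the
    # dominant eigenvector for a connected, non-bipartite graph.
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(eigenvector_centrality(A).round(3))
```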

Page 38: The Large-Scale Structure of Semantic Networks

4) Psychological Implications: Correlational Analysis of Findings

• Centrality correlates negatively with latencies.
• AoA correlates positively with latencies.
• Word frequency correlates negatively with latencies.
• When the effects of word frequency and AoA are partialled out, the centrality-latency correlation remains significant → there must be other variables at work.

Page 39: The Large-Scale Structure of Semantic Networks

5) General Discussion and Conclusions

Weakness of correlational analysis: the direction of causation is unknown:

– Because a word is acquired early, it will have more connections
vs.
– Because a word has more connections, it will be acquired early

A connectionist model can produce similar results: early-acquired words are learned better.

Page 40: The Large-Scale Structure of Semantic Networks

5) General Discussion and Conclusions

Power-law degree distributions in semantic networks can be understood through semantic growth processes → hubs.

Non-growing semantic representations such as LSA do not produce such a distribution per se.

Early-acquired concepts have richer connections → confirmed by AoA norms.

Page 41: The Large-Scale Structure of Semantic Networks


References

Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509-512.

Gilhooly, K. J., & Logie, R. H. (1980). Age-of-acquisition, imagery, concreteness, familiarity, and ambiguity measures for 1,944 words. Behavior Research Methods & Instrumentation, 12, 395-427.

Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.

Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.

Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology, 50A, 528-559.

Page 42: The Large-Scale Structure of Semantic Networks

Thanks for your attention!

Questions / comments are appreciated.

Page 43: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: c. WordNet

Number of words, synsets, and senses:

POS         Unique Word-Strings   Synsets   Total Sense Pairs
Noun             114,648           79,689       141,690
Verb              11,306           13,508        24,632
Adjective         21,436           18,563        31,015
Adverb             4,669            3,664         5,808
Totals           152,059          115,424       203,145

Page 44: The Large-Scale Structure of Semantic Networks

2) Analysis of 3 Semantic Networks: Statistical Properties

With N nodes and average degree ⟨k⟩ = pN:
– If ⟨k⟩ = pN < 1, the graph is composed of isolated trees.
– If ⟨k⟩ > 1, a giant cluster appears.
– If ⟨k⟩ ≥ ln(N), the graph is totally connected.
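(A sketch illustrating the three regimes with Erdős-Rényi graphs G(N, p), where ⟨k⟩ = pN.)

```python
import math
import networkx as nx

N = 1000
for k in (0.5, 2.0, 1.5 * math.log(N)):   # <k> < 1, > 1, >= ln(N)
    G = nx.gnp_random_graph(N, k / N, seed=1)
    giant = max(nx.connected_components(G), key=len)
    print(f"<k> = {k:.1f}: largest component has {len(giant)} of {N} nodes")
```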

Page 45: The Large-Scale Structure of Semantic Networks


Roget’s Thesaurus

WORDS EXPRESSING ABSTRACT RELATIONS

WORDS RELATING TO SPACE

WORDS RELATING TO MATTER

WORDS RELATING TO THE INTELLECTUAL FACULTIES

WORDS RELATING TO THE VOLUNTARY POWERS

WORDS RELATING TO THE SENTIMENT AND MORAL POWERS