quantifying perceived morphological relatedness · 2016-01-22 · vaas [vas] vase (large) vaasje...


Quantifying Perceived Morphological Relatedness

Kathleen Currie Hall, Claire Allen, Tess Fairburn, Kevin McMullin, Michael Fry, Masaki Noguchi

Department of Linguistics, University of British Columbia

[email protected]


Background: Alternations

•  Much of phonological analysis relies on identifying alternations between sounds (to then determine the factors that govern such alternations).

•  Alternations occur when what is assumed to be a single morpheme has multiple allomorphs, i.e., different phonological shapes on the surface.

•  In order to identify alternations, we must first be able to determine that elements with differing forms are in fact the "same" underlying morpheme.

ACL / CLA 2014 2


An Example: Dutch

Nom. Sg.        Gloss            Diminutive         Gloss
tas [tɑs]       bag              tasje [tɑʃə]       handbag
poes [pus]      cat ('puss')     poesje [puʃə]      kitten ('pussy')
meid [mɛɪt]     maid             meisje [mɛɪʃə]     girl
vaas [vas]      vase (large)     vaasje [vaʃə]      small vase
ijs [ɛɪs]       ice              ijsje [ɛɪʃə]       ice cream
glas [ɡlɑs]     glass            glasje [ɡlɑʃə]     lens (on spectacles)


1. The orthographic similarity between the nom. sg. and diminutive forms makes it relatively easy to surmise that the diminutives are formed by adding <je> to a base, which is visible in the nom. sg. forms.


2. But note that the actual pronunciation of the diminutives involves [ʃ], not [s], such that we seem to have multiple allomorphs of the base morpheme: e.g., [pus] ~ [puʃ], [ɛɪs] ~ [ɛɪʃ], etc.


3. Also note that the extent to which the "diminutive" form is transparently related to the nom. sg. form varies – poes ~ poesje may be fairly obvious, but what about ijs ~ ijsje? glas ~ glasje?


In order to make the claim that [s] ~ [ʃ] alternate with each other in Dutch, and are therefore (e.g.) allophonic or subject to a particular phonological rule / analysis, one needs to know that speakers of Dutch recognize these as allomorphs of the same morpheme.


A Question About Alternations

•  This issue is pervasive in phonology: the assumption is that surface variants of a single underlying morphological form are evidence of phonological rules / constraints, i.e., predictable phonological processes.

•  Some have claimed that alternations are crucial or even the only reliable way to identify allophony / predictable phonological patterns (e.g., Silverman 2006, Lu 2012).

•  The question is: how can we test speakers' awareness of the extent to which pairs of words are in fact morphologically related, i.e., contain a morpheme in common?


Further Claims About Alternations

•  Johnson & Babel (2010), in a study comparing phonological relations between [s] and [ʃ] in English and Dutch: "Note that in English [s] and [ʃ] sometimes alternate...through morphophonological alternations (oppress ~ oppression, confess ~ confession), but alternations of this type are infrequent in English and the phonemic contrast between /s/ and /ʃ/ is a very salient aspect of the English phonological system."

(emphasis added)


Frequency of Alternations?

•  If alternations are crucial, then we cannot discount them if they exist.

•  If the frequency of alternations matters, we need to have a way to count them.

•  One way of doing this might be to count up the number of morphemes that sometimes have [s] and sometimes have [ʃ], and compare that to the number that do not alternate.

•  Again, though, this relies on being able to identify alternations: face, face-lift, facial, typeface, surface, facet, façade, superficial...
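The counting idea above can be sketched in a few lines. This is only an illustration: the mini-lexicon, its rough transcriptions, and the helper name `alternates` are assumptions made for the example, and grouping words under one morpheme presupposes exactly the relatedness judgment that is in question.

```python
# Hypothetical mini-lexicon: morpheme -> surface allomorphs (rough IPA).
# The groupings are illustrative assumptions, not data from the study.
ALLOMORPHS = {
    "face": ["feɪs", "feɪʃ"],     # face ~ facial
    "press": ["pɹɛs", "pɹɛʃ"],    # press ~ pressure
    "mess": ["mɛs"],
    "wash": ["wɑʃ"],
}

def alternates(forms, a="s", b="ʃ"):
    """True if two allomorphs are identical except that one has `a`
    wherever the other has `b`."""
    for x in forms:
        for y in forms:
            if len(x) == len(y) and x != y and all(
                p == q or (p, q) == (a, b) for p, q in zip(x, y)
            ):
                return True
    return False

alternating = [m for m, forms in ALLOMORPHS.items() if alternates(forms)]
proportion = len(alternating) / len(ALLOMORPHS)
print(alternating, proportion)
```

The count is only as good as the lexicon fed into it: whether facet or superficial belongs under the same entry as face is the open question.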


Measuring Morphological Relatedness

•  Much of the work on morphological relatedness is focused on morphological processing and the determination of what role morphology itself plays as compared to phonology or semantics (e.g., Murrell & Morton 1974, Marslen-Wilson et al. 1994, Frost et al. 2000, inter alia).

•  Morphological relatedness has been shown to cause priming in a variety of tasks and studies.


•  But morphologically "related" or "unrelated" words in these studies are generally very clear-cut cases.
   – e.g., Marslen-Wilson et al. (1994) have a very strict set of criteria that determine whether words are morphologically related:
      •  the derived form has a "recognizable" affix;
      •  the stem without the affix matches the free stem;
      •  the words share the same etymological source word.


Intermediate Cases?

•  Gonnerman et al. (2007) recognize this tendency and investigate intermediate degrees of morphological relatedness.

•  Furthermore, they claim that apparent morphological priming effects can be accounted for in a connectionist model as the combination of phonological and semantic factors.

•  Their stimuli are carefully pre-classified into groups of high / medium / low semantic or phonological similarity based on pre-testing, and they do find gradient degrees of priming in these conditions.


Our Experiment

•  Use a wider range of, and more automatic measurements of, phonological & semantic relatedness.

•  Goal: Ability to quantify the degree to which two words are perceived as being morphologically related, without having to do further behavioural studies.


Word Similarity

•  The task: Use an AXB task to have people intuitively pair a key word (X) with one of two choices (A, B) according to "similarity."

•  No training / feedback / information given about what might count as being "similar."

Example trial display:

FACE

faced          facial

Press '1' if the centre word is more like the word on the left; press '5' if it is more like the word on the right.

Rationale

•  If two words are indeed morphologically related, they should share both sound and meaning.

•  Words that are more closely morphologically related to the key word should be picked more often as being "similar" to the key than words that are not (closely) morphologically related, because they are similar in both aspects.

•  Follows Gonnerman et al. (2007), but with a more overt measure (conscious judgment, not priming).

Stimuli

•  180 key words

•  Each has a set of 9 comparison words, intended to show a range of relatedness to the key:
   – Inflected form
   – Derived 1 ("transparent")
   – Derived 2 ("opaque")
   – Meaning 1 ("primary")
   – Meaning 2 ("secondary")
   – Rhyme
   – Cohort
   – Unrelated 1
   – Unrelated 2

(matched for number of syllables and lexical category)

Stimuli - Examples

KEY                          PRESS          RIGHT
Inflected form               pressed        rights
Derived 1 ("transparent")    pressure       rightful
Derived 2 ("opaque")         expressway     righteousness
Meaning 1 ("primary")        push           entitlement
Meaning 2 ("secondary")      media          correct
Rhyme                        mess           flight
Cohort                       preppy         rhyme
Unrelated 1                  table          apple
Unrelated 2                  sofa           orange

Design

•  All pairs for a given keyword are formed (8+7+6+5+4+3+2+1 = 36) and put in two orders (36 * 2 = 72).

•  Any given participant sees only two of these 72 pairs per key word (and never both orders for a single pair)
   – stimuli are split into 36 arbitrary groups, with each participant being in one group

•  180 key words * 2 pairs each = 360 trials per participant
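The arithmetic of the design can be checked with a short sketch. The pair and trial counts follow the slides. The particular way pairs are dealt out to the 36 groups is a hypothetical scheme chosen only to satisfy the stated constraint (two pairs per key word, never both orders of one pair); the slides do not say how the actual split was done.

```python
from itertools import combinations

# Comparison-word types from the Stimuli slide (short labels are ours).
TYPES = ["inflected", "derived1", "derived2", "meaning1", "meaning2",
         "rhyme", "cohort", "unrelated1", "unrelated2"]

unordered = list(combinations(TYPES, 2))               # 9 choose 2 = 36 pairs
ordered = unordered + [(b, a) for a, b in unordered]   # both orders = 72

# Hypothetical assignment: group g gets pair g in one order and a
# different pair, (g + 18) % 36, in the reversed order.
groups = [(ordered[g], ordered[36 + (g + 18) % 36]) for g in range(36)]

n_key_words = 180
trials_per_participant = n_key_words * 2

print(len(unordered), len(ordered), trials_per_participant)
```

Under this scheme every one of the 72 ordered pairs is used by exactly one group.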


Participants

•  91 English-speaking participants have been run so far

•  randomly assigned to one of the 36 groups, so not all groups are equally represented
   – not all individual word pairs are equally represented

•  participants receive $10 for participating

•  wide range of time-to-completion: anywhere from 15 minutes to 50 minutes


Results

KEY                          PRESS          RIGHT            Percent
Inflected form               pressed        rights           83.31%
Derived 1 ("transparent")    pressure       rightful         75.45%
Derived 2 ("opaque")         expressway     righteousness    63.66%
Meaning 1 ("primary")        push           entitlement      62.20%
Meaning 2 ("secondary")      media          correct          53.57%
Rhyme                        mess           flight           46.63%
Cohort                       preppy         rhyme            35.78%
Unrelated 1                  table          apple            14.59%
Unrelated 2                  sofa           orange           15.30%

•  For any given comparison word type, we can calculate what percentage of the time that type was chosen when it was a choice.
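As a minimal sketch of that calculation: each trial offers two comparison types, so a type's percentage is the number of times it was chosen divided by the number of trials in which it was offered. The trial records and the function name `percent_chosen` below are invented for illustration and are not the study's data.

```python
from collections import Counter

# Hypothetical trial records: (type on left, type on right, type chosen).
trials = [
    ("inflected", "rhyme", "inflected"),
    ("derived1", "inflected", "inflected"),
    ("rhyme", "cohort", "rhyme"),
    ("inflected", "unrelated1", "unrelated1"),
    ("cohort", "derived1", "derived1"),
]

def percent_chosen(trials):
    """Percentage of the trials offering a type in which it was chosen."""
    offered = Counter()
    chosen = Counter()
    for left, right, choice in trials:
        offered[left] += 1
        offered[right] += 1
        chosen[choice] += 1
    return {t: 100.0 * chosen[t] / offered[t] for t in offered}

rates = percent_chosen(trials)
print(rates)
```

Because every trial has exactly one winner and one loser, the percentages across types are not independent of one another.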


•  An overall linear regression, predicting percent chosen from word type, is statistically significant (p < 0.001), with an adjusted R² of 0.82, and each word type is also significant.


•  Post-hoc t-tests showed that most types were significantly different from the following type in the table; the exceptions (n.s.) were Derived-2 vs. Meaning-1 and the two unrelated forms.

Interpretation

•  The basic morphological / semantic / phonological categories that linguists are assuming do seem to have psychological merit.

•  Inflected forms are the most similar, though note that even they are picked only 83% of the time as being more similar than other forms!
   – In all pairwise comparisons, inflected forms are picked significantly more often than the other choice, though other choices were picked at least occasionally, for all other word types.


•  Transparent derivationally related forms are more similar than semantically related forms, but more opaque forms seem to pattern like other semantic relations, rather than having much of an additional boost.
   – Note that in pairwise comparisons, if a derived2 form and a meaning1 form were the two choices, the derived2 form was selected significantly more frequently (498 vs. 415; binomial test p < 0.05).
   – Also note that derived2 forms seemed to have more competition from other morphologically related forms, while meaning1 forms had more competition from phonologically related forms.
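The binomial comparison reported above (498 vs. 415 selections) can be checked with an exact two-sided test against a 50/50 null. This sketch uses only the standard library; the function name is ours, and the slides do not say which variant of the test (one- or two-sided, exact or approximate) was actually run.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test against p = 0.5.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k.
    """
    observed = comb(n, k)
    tail = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return tail / 2 ** n

# Counts from the slide: derived2 chosen 498 times, meaning1 415 times.
p_value = binomial_two_sided_p(498, 498 + 415)
print(p_value)
```

The resulting p-value falls below 0.05, in line with the significance level reported on the slide.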


•  Semantically related forms are more similar than phonologically related forms.
   – Matches e.g. priming and ERP results in Radeau et al. (1998).

•  Rhymes are more related than cohorts.
   – Matches findings that rhymes prime more than cohorts (see discussion in Radeau et al. 1998);
   – Does not match findings in spoken word processing that cohorts get more activation than rhymes (e.g., Marslen-Wilson & Zwitserlood 1989, Allopenna et al. 1998), but of course the current experiment involved written words.


Analysis

•  For each pair of words (i.e., a key + comparison), we have three independently calculable measures:

   1.  orthographic similarity (Khorsi 2012)
   2.  transcription similarity (based on Khorsi 2012)
   3.  semantic similarity (Banerjee & Pedersen 2003)

•  We will use these scores to predict, in a logistic regression, which of the two comparison words will be chosen.

•  For each of the scores, as the score between a key and a comparison increases, the probability of that comparison word’s being chosen on any given trial also increases.


Judged Word-Relatedness vs. Orthographic Similarity

[Figure: scatter plot of AXB Relatedness Score (y-axis, 0.00 to 1.00) against Spelling Relatedness Score (x-axis, ticks from −50 to 25).]

The AXB Relatedness Score is the percentage of the time that a particular word was picked as being most similar to the key; it is plotted as a function of the pair's spelling relatedness.

R² = 0.36

Judged Word-Relatedness vs. Transcription Similarity

[Figure: scatter plot of AXB Relatedness Score (y-axis, 0.00 to 1.00) against Transcription Relatedness Score (x-axis, ticks from −60 to 20).]

The AXB Relatedness Score is the percentage of the time that a particular word was picked as being most similar to the key; it is plotted as a function of the pair's transcription relatedness.

R² = 0.24

Judged Word-Relatedness vs. Semantic Similarity

[Figure: scatter plot of AXB Relatedness Score (y-axis, 0.00 to 1.00) against Log Non-Zero Semantic Relatedness Score (x-axis, ticks from 0 to 12).]

The AXB Relatedness Score is the percentage of the time that a particular word was picked as being most similar to the key; it is plotted as a function of the pair's logged semantic relatedness.

R² = 0.15

Principal Components Analysis

•  To eliminate collinearity among the three variables (orthography, pronunciation, and semantics), a principal components analysis was run.

•  The first two components accounted for 94% of the variance in these measures, and were subsequently used in the logistic regression.

•  The first component was based entirely on spelling and transcription, with the two relatively equally weighted, while the second component was entirely based on semantics.


Logistic Regression

•  These principal components were put into a logistic regression.

•  This model attempts to predict, given the two principal components of two word pairs (Key + Comparison 1 vs. Key + Comparison 2), which of the two comparison words would be chosen as being "more similar" to the Key.


Accuracy of the Model

•  The best-fit logistic regression model (based on the lowest AIC) relies on all four of the factors that were entered into it (2 PCs for each of the 2 comparison words), as well as interactions (up to 3-way).

•  The classification accuracy for the original input data is 73%.
   – i.e., it correctly predicts which of the two comparison words will be selected 73% of the time
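Classification accuracy of this kind is typically computed by thresholding each predicted probability and comparing the result with the observed choice. The sketch below assumes a 0.5 threshold and uses made-up probabilities and outcomes; neither the threshold nor the data come from the slides.

```python
def classification_accuracy(predicted_probs, first_was_chosen, threshold=0.5):
    """Share of trials where the thresholded prediction matches the choice.

    predicted_probs: model probability that comparison word 1 is chosen.
    first_was_chosen: True if the participant chose comparison word 1.
    """
    correct = sum(
        (p >= threshold) == chosen
        for p, chosen in zip(predicted_probs, first_was_chosen)
    )
    return correct / len(predicted_probs)

# Hypothetical predictions and observed choices for five trials.
probs = [0.9, 0.7, 0.4, 0.2, 0.6]
observed = [True, False, False, False, True]
accuracy = classification_accuracy(probs, observed)
print(accuracy)
```

Since the slide's figure is for the original input data, it describes fit to the training trials and says nothing yet about new word pairs.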


Making Predictions

•  Even more usefully, we can use the model to generate predicted "perceived morphological relatedness" scores.

•  A new word pair (Key + Comparison) is put into the model, using calculated orthographic, transcription, and semantic similarity (transformed by the same PCA loadings as the original data).

•  Predicted perceived morphological relatedness is the model's calculated probability that this comparison word would be chosen as being more similar, as compared to the average probability that a known unrelated word would be chosen.


Example

•  New pair: TABLE-tabulate
   – Orthographic sim.: -9.643768
   – Transcription sim.: -29.671627
   – Semantic sim.: 87


Subject these to the loadings from the original Principal Components Analysis, to turn them into...

•  New pair: TABLE-tabulate
   – Orthographic sim. and transcription sim.: -0.5629 ('form' principal component)
   – Semantic sim.: 0.1537 ('semantic' principal component)
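Applying stored PCA loadings to a new pair usually means standardizing each raw score with the training-set mean and standard deviation and then taking a weighted sum. The sketch below shows that step. Every number in `MEANS`, `SDS`, and `LOADINGS` is a hypothetical placeholder, as is the choice to log the semantic score first, so the output is not expected to reproduce the component values on the slide.

```python
import math

# Hypothetical training-set statistics and loadings; only the three raw
# scores for TABLE-tabulate further down come from the slides.
MEANS = {"orth": -20.0, "trans": -30.0, "sem": 3.0}
SDS = {"orth": 15.0, "trans": 15.0, "sem": 2.5}
LOADINGS = {
    "form": {"orth": 0.7071, "trans": 0.7071, "sem": 0.0},
    "semantic": {"orth": 0.0, "trans": 0.0, "sem": 1.0},
}

def project(scores):
    """Standardize raw similarity scores and apply the stored loadings."""
    z = {name: (scores[name] - MEANS[name]) / SDS[name] for name in scores}
    return {
        pc: sum(weights[name] * z[name] for name in z)
        for pc, weights in LOADINGS.items()
    }

# Assumption: the semantic score is logged before the PCA.
raw = {"orth": -9.643768, "trans": -29.671627, "sem": math.log(87)}
components = project(raw)
print(components)
```

The placeholder loadings mirror the structure described earlier: one component weighting spelling and transcription about equally, and one based on semantics alone.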


•  New pair: TABLE-tabulate
   – Form component: -0.5629
   – Semantic component: 0.1537

•  Compare to values for the average unrelated KEY-comparison pair:
   – Form component: -1.2104
   – Semantic component: -0.432


•  What is the probability that TABLE-tabulate would be chosen as compared to TABLE-(unrelated)?
   – 50% = TABLE-tabulate has the same likelihood as an unrelated pair; it is also "unrelated"
   – 100% = TABLE-tabulate is highly related

•  Actual value in this case: 71%
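The comparison against an average unrelated pair can be sketched with a simplified logistic model. The component values are the ones given on the slides. The weights are hypothetical, and the model here has main effects only, whereas the fitted model also includes interactions, so this sketch is not expected to return the 71% reported above.

```python
import math

# Component values from the slides: (form, semantic).
TABLE_TABULATE = (-0.5629, 0.1537)
AVERAGE_UNRELATED = (-1.2104, -0.432)

# Hypothetical weights; the fitted coefficients are not on the slides.
W_FORM, W_SEMANTIC = 1.0, 1.0

def predicted_relatedness(pair, baseline=AVERAGE_UNRELATED):
    """Probability that `pair` is chosen over the baseline pair."""
    z = (W_FORM * (pair[0] - baseline[0])
         + W_SEMANTIC * (pair[1] - baseline[1]))
    return 1.0 / (1.0 + math.exp(-z))

score = predicted_relatedness(TABLE_TABULATE)
print(score)
```

A pair identical to the baseline scores exactly 0.5, which matches the slide's reading of 50% as "unrelated".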


Testing the Model

•  10 new key words with their various comparison words.

•  Probability of selection, as compared to average unrelated form?

•  Based on orthographic, transcription, and semantic relatedness.

Type of Comparison     Probability
inflected              90.28%
derived-1              85.69%
derived-2              72.26%
meaning-1              73.21%
meaning-2              73.72%
rhyme                  80.97%
cohort                 67.67%
unrelated-1            48.35%
unrelated-2            52.74%


•  Most of the categories emerge from the model in the order that would be expected, with unrelated forms at chance (about 50%) and inflected forms at about 90%.


•  Rhymes are over-predicted relative to the behavioural results; the model predicts they will be chosen over an average unrelated word 81% of the time.


•  The predicted value for rhymes is marginally, but not significantly, different from both the derived-1 forms and the derived-2 forms (p < 0.07).


•  There is no significant difference between the derived-2 and meaning-1 forms (which matches the behavioural results), but there is also none between meaning-1 and meaning-2.


•  Not perfect yet, but a good first step toward being able to predict perceived morphological relatedness.


Conclusions

•  An explicit similarity judgment task can reveal perceived morphological similarity.

•  Judgments of naïve English speakers do line up with intuitions of linguists:
   – Morphology > Semantics Only > Phonology Only
   – Inflected > Derived
   – Transparent > Opaque
   – Primary Meaning > Secondary Meaning
   – Rhyme > Cohort
   – Derived 2 =?= Primary Meaning

•  The perceived relatedness of novel word pairs can in turn be estimated from measures of their spelling, transcription, & semantic similarity.

•  Knowing / estimating perceived relatedness can help us identify phonological alternations.

ACL / CLA 2014 53

Page 54: Quantifying Perceived Morphological Relatedness · 2016-01-22 · vaas [vas] vase (large) vaasje [vaʃə] small vase ijs [ɛɪs] ice ijsje [ɛɪʃə] ice cream glas [ɡlɑs] glass

Future Directions

•  Investigate the role of the type of derived form, e.g., did antonyms get picked less often than other derived forms? (Compare use / usefulness or complex / complication vs. wash / unwashable or infect / disinfectant.)

•  Use auditory stimuli.

•  Compare other metrics of similarity (e.g., with auditory stimuli, use phonetic similarity; try other measures of semantic relatedness).

•  Try a similar design with an implicit measure (e.g., priming).

•  Develop a metric for word-to-word similarity; the current one is based on similarity to a KEY (root) word.

THANK YOU!

Thanks especially to Molly Babel, Michael McAuliffe, Blake Allen, Rose-Marie Déchaine, the UBC Speech in Context Lab, and funding from SSHRC.

References

•  Allopenna, Paul D., James S. Magnuson & Michael K. Tanenhaus. 1998. Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language 38.419-39.

•  Banerjee, Satanjeev & Ted Pedersen. 2003. Extended gloss overlaps as a measure of semantic relatedness. Proceedings of the 18th International Joint Conference on Artificial Intelligence.

•  Bauer, Laurie. 1997. Evaluative morphology: A search for universals. Studies in Language 21.533-75.

•  Booij, Geert. 1995. The phonology of Dutch. Oxford: Oxford University Press.

•  Brysbaert, Marc & Boris New. 2009. Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods 41.977-90.

•  Frost, Ram, Avital Deutsch, Orna Gilboa, Michal Tannenbaum & William Marslen-Wilson. 2000. Morphological priming: Dissociation of phonological, semantic, and morphological factors. Memory & Cognition 28.1277-88.

•  Gonnerman, Laura M., Mark S. Seidenberg & Elaine S. Andersen. 2007. Graded semantic and phonological similarity effects in priming: Evidence for a distributed connectionist approach to morphology. Journal of Experimental Psychology: General 136.323-45.

•  Johnson, Keith & Molly Babel. 2010. On the perceptual basis of distinctive features: Evidence from the perception of fricatives by Dutch and English speakers. Journal of Phonetics 38.127-36.

•  Khorsi, Ahmed. 2012. On morphological relatedness. Natural Language Engineering, 1-19.

•  Lu, Yu-an. 2012. The role of alternation in phonological relationships. Stony Brook University doctoral dissertation.

•  Marslen-Wilson, William, Lorraine Komisarjevsky Tyler, Rachelle Waksler & Lianne Older. 1994. Morphology and meaning in the English mental lexicon. Psychological Review 101.3-33.

•  Marslen-Wilson, William & Pienie Zwitserlood. 1989. Accessing spoken words: The importance of word onsets. Journal of Experimental Psychology: Human Perception and Performance 15.576-85.

•  Murrell, Graham A. & John Morton. 1974. Word recognition and morphemic structure. Journal of Experimental Psychology 102.963-68.

•  Radeau, Monique, Mireille Besson, Elisabeth Fonteneau & Sao Luis Castro. 1998. Semantic, repetition, and rime priming between spoken words: Behavioral and electrophysiological evidence. Biological Psychology 48.183-204.

•  Silverman, Daniel. 2006. A critical introduction to phonology: Of sound, mind, and body. London/New York: Continuum.

Priming vs. Overt Judgment?

•  An overt judgment task serves as a comparison to the better-documented effects of priming.

•  Priming effects have in fact not been found in certain conditions, e.g., Marslen-Wilson et al. (1994):
   –  morphologically related words didn't prime unless they were also semantically related
   –  phonologically related words that were not morphologically or semantically related did not reliably prime

Stimuli Selection

•  Inflected form: the highest-frequency form that arose in a Google ngram search for key_INF

•  Derived 1: only one affix; often given in WordNet as a "derivationally related form" to the key; generally felt to be relatively transparent by at least two native speakers of English

•  Derived 2: (at least) two affixes; verified in an etymological dictionary as being from the same root

•  Meaning 1: a word that appeared in or matched the definition associated with the highest-frequency entry in WordNet

•  Meaning 2: as for Meaning 1, but with a secondary meaning, preferably of a different lexical category

•  Rhyme: all material from the stressed vowel to the end of the word identical (insofar as possible)

•  Cohort: all material from the beginning of the word through the stressed vowel identical (insofar as possible)

•  Unrelated 1 & 2: did not have any obvious phonological or semantic connection to the key word; had the same number of syllables and same lexical category as each other, so that when they were paired, there was no obvious advantage to picking one over the other
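The Rhyme and Cohort criteria above can be stated as a small check. This is only a sketch: the representation (a phoneme list plus the index of the stressed vowel) and the three example words are assumptions for illustration, not items from the actual stimulus list, and the "insofar as possible" leeway is not modelled.

```python
# Each word is a hypothetical (phonemes, stressed_vowel_index) pair.
WASH = (["w", "ɑ", "ʃ"], 1)
SQUASH = (["s", "k", "w", "ɑ", "ʃ"], 3)
WASP = (["w", "ɑ", "s", "p"], 1)


def is_rhyme(key, other):
    """True if everything from the stressed vowel to the end is identical."""
    (key_ph, key_stress), (other_ph, other_stress) = key, other
    return key_ph != other_ph and key_ph[key_stress:] == other_ph[other_stress:]


def is_cohort(key, other):
    """True if everything from the start through the stressed vowel is identical."""
    (key_ph, key_stress), (other_ph, other_stress) = key, other
    return (key_ph != other_ph
            and key_ph[:key_stress + 1] == other_ph[:other_stress + 1])


print(is_rhyme(WASH, SQUASH), is_cohort(WASH, WASP))
```

Under these definitions a word can be a rhyme or a cohort of the key, but not both, unless it is identical to the key.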

What Else Was Chosen?

Raw counts. Rows: what was a choice? Columns: what was chosen?

              Inflected   Derived1   Derived2   Meaning1   Meaning2      Rhyme     Cohort Unrelated1 Unrelated2
Inflected          5947        307        172        288        202        131         49         20         21
Derived1            586       5360        256        325        241        166         87         30         37
Derived2            711        624       4634        415        350        271        178         42         60
Meaning1            578        561        498       4521        311        331        251         75         88
Meaning2            693        643        528        543       3846        393        288         89        103
Rhyme               785        701        677        619        505       3473        319        149        133
Cohort              849        803        720        658        600        651       2527        165        188
Unrelated1          858        864        867        866        798        765        681       1042        420
Unrelated2          887        857        916        807        839        765        674        472       1050

What Else Was Chosen?

Proportions of choices. Rows: what was a choice? Columns: what was chosen?

              Inflected   Derived1   Derived2   Meaning1   Meaning2      Rhyme     Cohort Unrelated1 Unrelated2
Inflected         0.833      0.043      0.024      0.040      0.028      0.018      0.007      0.003      0.003
Derived1          0.083      0.756      0.036      0.046      0.034      0.023      0.012      0.004      0.005
Derived2          0.098      0.086      0.636      0.057      0.048      0.037      0.024      0.006      0.008
Meaning1          0.080      0.078      0.069      0.627      0.043      0.046      0.035      0.010      0.012
Meaning2          0.097      0.090      0.074      0.076      0.540      0.055      0.040      0.012      0.014
Rhyme             0.107      0.095      0.092      0.084      0.069      0.472      0.043      0.020      0.018
Cohort            0.119      0.112      0.101      0.092      0.084      0.091      0.353      0.023      0.026
Unrelated1        0.120      0.121      0.121      0.121      0.111      0.107      0.095      0.146      0.059
Unrelated2        0.122      0.118      0.126      0.111      0.115      0.105      0.093      0.065      0.144

Each row sums to 1 (100%). Chance: 50% on the diagonal, and 6.25% in each other cell of the row.
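The proportions and the chance baseline can be recomputed from the raw counts. The sketch below assumes that each row of the raw-count table is simply divided by its row total, and that chance means a 50% pick within each two-way choice, with the other 50% spread evenly over the eight possible competitors; the function names are illustrative.

```python
CATEGORIES = ["Inflected", "Derived1", "Derived2", "Meaning1", "Meaning2",
              "Rhyme", "Cohort", "Unrelated1", "Unrelated2"]

# Raw counts copied from the table: row = what was a choice, column = what was chosen.
RAW_COUNTS = [
    [5947, 307, 172, 288, 202, 131, 49, 20, 21],
    [586, 5360, 256, 325, 241, 166, 87, 30, 37],
    [711, 624, 4634, 415, 350, 271, 178, 42, 60],
    [578, 561, 498, 4521, 311, 331, 251, 75, 88],
    [693, 643, 528, 543, 3846, 393, 288, 89, 103],
    [785, 701, 677, 619, 505, 3473, 319, 149, 133],
    [849, 803, 720, 658, 600, 651, 2527, 165, 188],
    [858, 864, 867, 866, 798, 765, 681, 1042, 420],
    [887, 857, 916, 807, 839, 765, 674, 472, 1050],
]


def row_proportions(counts):
    """Divide every cell by its row total, so that each row sums to 1."""
    return [[cell / sum(row) for cell in row] for row in counts]


def chance_row(n_categories, diagonal_index):
    """Expected proportions for one row if every two-way choice were a coin flip."""
    off_diagonal = 0.5 / (n_categories - 1)
    return [0.5 if i == diagonal_index else off_diagonal
            for i in range(n_categories)]


PROPORTIONS = row_proportions(RAW_COUNTS)
for name, row in zip(CATEGORIES, PROPORTIONS):
    print(name, [round(p, 3) for p in row])
```

Comparing a row of PROPORTIONS with chance_row(9, i) shows how far the diagonal sits above 0.5 for the related categories and how far below it sits for the two unrelated ones.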


Pairwise Comparisons: Inflected Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
inflected-derived1     586            307            6.6E-21                  sig
inflected-derived2     711            172            1.7E-78                  sig
inflected-meaning1     578            288            3.6E-23                  sig
inflected-meaning2     693            202            1.3E-63                  sig
inflected-rhyme        785            131            3.0E-114                 sig
inflected-cohort       849            49             2.2E-189                 sig
inflected-unrelated1   858            20             2.5E-224                 sig
inflected-unrelated2   887            21             1.9E-231                 sig

When it was a choice, the inflected form was always chosen more often than any other non-inflected form.
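A sketch of how such a p value can be computed: an exact two-sided binomial test against a chance rate of 0.5. The slides do not say which implementation or sidedness was used, so this is an assumed variant, and its output need not match the reported values digit for digit.

```python
from math import comb


def binomial_two_sided_p(num_choice_1, num_choice_2):
    """Exact two-sided binomial p value under H0: both options equally likely."""
    n = num_choice_1 + num_choice_2
    smaller = min(num_choice_1, num_choice_2)
    # Probability mass of the smaller tail, kept as exact integers until the end.
    tail = sum(comb(n, i) for i in range(smaller + 1))
    # Under p = 0.5 the distribution is symmetric, so the other tail is identical.
    return min(1.0, 2 * tail / 2 ** n)


print(binomial_two_sided_p(586, 307))  # inflected vs. derived1
print(binomial_two_sided_p(472, 420))  # unrelated1 vs. unrelated2
```

Integer arithmetic is used on purpose: with roughly 900 trials per comparison, the tail probabilities are far too small to accumulate safely as floats.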

Pairwise Comparisons: Derived 1 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
derived1-inflected     307            586            6.6E-21                  sig
derived1-derived2      624            256            3.4E-36                  sig
derived1-meaning1      561            325            2.0E-15                  sig
derived1-meaning2      643            241            6.2E-43                  sig
derived1-rhyme         701            166            6.9E-79                  sig
derived1-cohort        803            87             6.6E-146                 sig
derived1-unrelated1    864            30             1.3E-213                 sig
derived1-unrelated2    857            37             8.6E-204                 sig

When it was a choice, the derived1 form was always chosen more often than any other non-derived1 form, except for inflected forms.

Pairwise Comparisons: Derived 2 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
derived2-inflected     172            711            1.7E-78                  sig
derived2-derived1      256            624            3.4E-36                  sig
derived2-meaning1      498            415            6.6E-03                  sig
derived2-meaning2      528            350            2.1E-09                  sig
derived2-rhyme         677            271            9.4E-41                  sig
derived2-cohort        720            178            6.5E-78                  sig
derived2-unrelated1    867            42             2.4E-201                 sig
derived2-unrelated2    916            60             1.5E-197                 sig

When it was a choice, the derived2 form was always chosen more often than any other non-derived2 form, except for inflected and derived1 forms.

Pairwise Comparisons: Meaning 1 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
meaning1-inflected     288            578            3.6E-23                  sig
meaning1-derived1      325            561            2.0E-15                  sig
meaning1-derived2      415            498            6.6E-03                  sig
meaning1-meaning2      543            311            1.8E-15                  sig
meaning1-rhyme         619            331            6.4E-21                  sig
meaning1-cohort        658            251            9.3E-43                  sig
meaning1-unrelated1    866            75             2.4E-171                 sig
meaning1-unrelated2    807            88             3.2E-146                 sig

When it was a choice, the meaning1 form was chosen more often than any other non-meaning1 form, except for morphologically related forms.

Pairwise Comparisons: Meaning 2 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
meaning2-inflected     202            693            1.3E-63                  sig
meaning2-derived1      241            643            6.2E-43                  sig
meaning2-derived2      350            528            2.1E-09                  sig
meaning2-meaning1      311            543            1.8E-15                  sig
meaning2-rhyme         505            393            2.1E-04                  sig
meaning2-cohort        600            288            5.2E-26                  sig
meaning2-unrelated1    798            89             3.2E-143                 sig
meaning2-unrelated2    839            103            4.0E-144                 sig

When it was a choice, the meaning2 form was chosen more often than the sound-related or the unrelated forms.

Pairwise Comparisons: Rhyme Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
rhyme-inflected        131            785            3.0E-114                 sig
rhyme-derived1         166            701            6.9E-79                  sig
rhyme-derived2         271            677            9.4E-41                  sig
rhyme-meaning1         331            619            6.4E-21                  sig
rhyme-meaning2         393            505            2.1E-04                  sig
rhyme-cohort           651            319            7.0E-27                  sig
rhyme-unrelated1       765            149            2.0E-100                 sig
rhyme-unrelated2       765            133            1.6E-108                 sig

When it was a choice, the rhyme form was chosen more often than the cohort and unrelated forms.

Pairwise Comparisons: Cohort Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
cohort-inflected       49             849            2.2E-189                 sig
cohort-derived1        87             803            6.6E-146                 sig
cohort-derived2        178            720            6.5E-78                  sig
cohort-meaning1        251            658            9.3E-43                  sig
cohort-meaning2        288            600            5.2E-26                  sig
cohort-rhyme           319            651            7.0E-27                  sig
cohort-unrelated1      681            165            3.9E-75                  sig
cohort-unrelated2      674            188            6.6E-65                  sig

When it was a choice, the cohort form was chosen more often than the unrelated forms.

Pairwise Comparisons: Unrelated 1 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
unrelated1-inflected   20             858            2.5E-224                 sig
unrelated1-derived1    30             864            1.3E-213                 sig
unrelated1-derived2    42             867            2.4E-201                 sig
unrelated1-meaning1    75             866            2.4E-171                 sig
unrelated1-meaning2    89             798            3.2E-143                 sig
unrelated1-rhyme       149            765            2.0E-100                 sig
unrelated1-cohort      165            681            3.9E-75                  sig
unrelated1-unrelated2  472            420            8.8E-02                  NS

The unrelated 1 forms were never chosen significantly more often than any other category. When unrelated 1 and unrelated 2 forms were pitted against each other, the difference between them was not significant.

Pairwise Comparisons: Unrelated 2 Forms

Comparison             Num. Choice 1  Num. Choice 2  p value (binomial test)  Significant? (alpha = 0.05)
unrelated2-inflected   21             887            1.9E-231                 sig
unrelated2-derived1    37             857            8.6E-204                 sig
unrelated2-derived2    60             916            1.5E-197                 sig
unrelated2-meaning1    88             807            3.2E-146                 sig
unrelated2-meaning2    103            839            4.0E-144                 sig
unrelated2-rhyme       133            765            1.6E-108                 sig
unrelated2-cohort      188            674            6.6E-65                  sig
unrelated2-unrelated1  420            472            8.8E-02                  NS

The unrelated 2 forms were never chosen significantly more often than any other category. When unrelated 2 and unrelated 1 forms were pitted against each other, the difference between them was not significant.

Spelling / Transcription Similarity

•  Based on Khorsi (2012).

•  Sums the log of the inverse of the frequency of occurrence of each letter (phoneme) in the longest common shared sequence between two words, and subtracts the log of the inverse of the frequency of each letter (phoneme) that is not shared.
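A minimal sketch of a Khorsi-style score on spellings. Several things here are assumptions for illustration: the "longest common shared sequence" is taken to be the longest contiguous substring, unshared letters from both words are penalised, and letter frequencies come from a six-word toy lexicon rather than a real corpus.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Toy lexicon (hypothetical); a real analysis would use corpus frequencies.
TOY_LEXICON = ["use", "usefulness", "wash", "unwashable", "infect", "disinfectant"]


def letter_frequencies(words):
    """Relative frequency of each letter across the word list."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def khorsi_similarity(word_1, word_2, freq):
    """Sum of log(1/freq) over shared letters minus the same sum over unshared ones."""
    matcher = SequenceMatcher(None, word_1, word_2, autojunk=False)
    m = matcher.find_longest_match(0, len(word_1), 0, len(word_2))
    shared = word_1[m.a:m.a + m.size]
    unshared = (word_1[:m.a] + word_1[m.a + m.size:]
                + word_2[:m.b] + word_2[m.b + m.size:])

    def information(letter):
        return math.log(1.0 / freq[letter])

    return (sum(information(c) for c in shared)
            - sum(information(c) for c in unshared))


FREQ = letter_frequencies(TOY_LEXICON)
print(khorsi_similarity("wash", "unwashable", FREQ))
print(khorsi_similarity("wash", "infect", FREQ))
```

The same function works on transcriptions if words are given as sequences of phoneme symbols and the frequency table is built over phonemes. Scores can be negative when the unshared material outweighs the shared material.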

Semantic Similarity

•  Currently using the extended Lesk similarity metric (Banerjee & Pedersen 2003).

•  The similarity score is the sum of the squares of the lengths (in words) of the word sequences that overlap between the two definitions, ignoring function words and other extremely common words.

•  Examples:
   –  Two definitions, 1 word overlapping: Score = 1
   –  Two definitions, 2 non-adjacent words overlapping: Score = 1 + 1 = 2
   –  Two definitions, 2 adjacent words overlapping: Score = 2² = 4

•  The "extended" score takes into account the definitions themselves as well as those of related words (e.g., hypernyms, hyponyms, meronyms).
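The scoring rule for a single pair of definitions can be sketched as follows. This covers only the overlap score, not the "extended" part that also walks related WordNet entries. The stopword list and example glosses are made up for illustration, and dropping stopwords before matching is a simplification.

```python
from difflib import SequenceMatcher

# Illustrative stopword list only; not the one used in the study.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "to", "in", "that", "is"}


def gloss_overlap_score(gloss_a, gloss_b, stopwords=STOPWORDS):
    """Sum of squared lengths of the word sequences shared by two definitions."""
    a = [w for w in gloss_a.lower().split() if w not in stopwords]
    b = [w for w in gloss_b.lower().split() if w not in stopwords]
    score = 0
    while True:
        matcher = SequenceMatcher(None, a, b, autojunk=False)
        m = matcher.find_longest_match(0, len(a), 0, len(b))
        if m.size == 0:
            return score
        score += m.size ** 2
        # Swap each matched run for a unique placeholder so that it cannot be
        # matched again and its neighbours do not become falsely adjacent.
        a[m.a:m.a + m.size] = [object()]
        b[m.b:m.b + m.size] = [object()]


print(gloss_overlap_score("a small red vase", "the red container"))
print(gloss_overlap_score("frozen water dessert", "frozen sweet dessert"))
print(gloss_overlap_score("frozen dessert served cold", "a sweet frozen dessert"))
```

Squaring the run length is what makes two adjacent shared words worth more than two scattered ones.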

Judged Word-Relatedness vs. Frequency

The AXB Relatedness Score is the percentage of the time that a particular word was picked as being most similar to the key. It is plotted against the log of that word's token frequency (from the SUBTLEX corpus, Brysbaert & New 2009).

[Figure: scatterplot of AXB Relatedness Score (y-axis, 0.00 to 1.00) against Log of Non-Zero Frequency of Chosen Word (x-axis); R² = 0.0003]

Principal Components Analysis

Summary:

                         Component 1   Component 2   Component 3
Standard deviation       1.3523781     0.9994649     0.41490159
Proportion of variance   0.6096422     0.3329767     0.05738111
Cumulative proportion    0.6096422     0.9426189     1.00000000

Loadings (Key vs. Word 1):

                         Component 1   Component 2   Component 3
Spelling                 0.707                       0.707
Transcription            0.705                       -0.706
Log(Semantic)                          0.998

Note: Signs of loadings for component 1 were forced to be positive, such that higher relatedness scores are associated with a greater chance of being picked.
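The rows of the summary table are linked by simple arithmetic: each component's proportion of variance is its squared standard deviation divided by the sum of all squared standard deviations. The sketch below checks this against the reported numbers; the function names are illustrative.

```python
# Standard deviations of the three components, copied from the summary table.
STD_DEVS = [1.3523781, 0.9994649, 0.41490159]


def variance_proportions(std_devs):
    """Share of total variance carried by each component."""
    variances = [sd ** 2 for sd in std_devs]
    total = sum(variances)
    return [v / total for v in variances]


def cumulative(values):
    """Running total of a list of proportions."""
    running, out = 0.0, []
    for v in values:
        running += v
        out.append(running)
    return out


PROPS = variance_proportions(STD_DEVS)
CUMULATIVE = cumulative(PROPS)
print([round(p, 7) for p in PROPS])
print([round(c, 7) for c in CUMULATIVE])
```

The first two components together carry about 94% of the variance, which fits with only pca1 and pca2 appearing as predictors in the regression.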

Logistic Regression Details

choose_first ~ (key_word1_pca1 + key_word1_pca2 + key_word2_pca1 + key_word2_pca2)^3

Term                                          Estimate    Std. Error   z-value   Pr(>|z|)   Sig
(Intercept)                                   0.045958    0.014511     3.167     0.00154    **
key_word1_pca1                                0.591003    0.011585     51.014    < 0.001    ***
key_word1_pca2                                0.630035    0.015789     39.903    < 0.001    ***
key_word2_pca1                                -0.553259   0.011512     -48.06    < 0.001    ***
key_word2_pca2                                -0.586177   0.015457     -37.923   < 0.001    ***
key_word1_pca1:key_word1_pca2                 -0.259545   0.010852     -23.917   < 0.001    ***
key_word1_pca1:key_word2_pca1                 0.029583    0.008666     3.414     0.000641   ***
key_word1_pca1:key_word2_pca2                 0.007293    0.012211     0.597     0.550371
key_word1_pca2:key_word2_pca1                 0.033681    0.01208      2.788     0.005302   **
key_word1_pca2:key_word2_pca2                 0.004188    0.015915     0.263     0.792459
key_word2_pca1:key_word2_pca2                 0.218311    0.010606     20.585    < 0.001    ***
key_word1_pca1:key_word1_pca2:key_word2_pca1  0.009531    0.008124     1.173     0.240731
key_word1_pca1:key_word1_pca2:key_word2_pca2  0.012206    0.010817     1.128     0.259132
key_word1_pca1:key_word2_pca1:key_word2_pca2  -0.032065   0.0082       -3.91     9.22E-05   ***
key_word1_pca2:key_word2_pca1:key_word2_pca2  -0.041444   0.010783     -3.843    0.000121   ***

AIC: 31188
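To show how the fitted terms combine, the sketch below turns a set of PCA scores into a predicted probability of choosing the first word. It assumes a standard logit link and uses the estimates as printed in the table; the input scores in the usage lines are made-up values.

```python
import math

W1P1, W1P2 = "key_word1_pca1", "key_word1_pca2"
W2P1, W2P2 = "key_word2_pca1", "key_word2_pca2"

# Estimates copied from the table; each key lists the predictors that are
# multiplied together for that term (the empty tuple is the intercept).
COEFFICIENTS = {
    (): 0.045958,
    (W1P1,): 0.591003,
    (W1P2,): 0.630035,
    (W2P1,): -0.553259,
    (W2P2,): -0.586177,
    (W1P1, W1P2): -0.259545,
    (W1P1, W2P1): 0.029583,
    (W1P1, W2P2): 0.007293,
    (W1P2, W2P1): 0.033681,
    (W1P2, W2P2): 0.004188,
    (W2P1, W2P2): 0.218311,
    (W1P1, W1P2, W2P1): 0.009531,
    (W1P1, W1P2, W2P2): 0.012206,
    (W1P1, W2P1, W2P2): -0.032065,
    (W1P2, W2P1, W2P2): -0.041444,
}


def prob_choose_first(scores):
    """Predicted probability that the first word is chosen (logit link assumed)."""
    logit = 0.0
    for predictors, estimate in COEFFICIENTS.items():
        product = 1.0
        for name in predictors:
            product *= scores[name]
        logit += estimate * product
    return 1.0 / (1.0 + math.exp(-logit))


NEUTRAL = {W1P1: 0.0, W1P2: 0.0, W2P1: 0.0, W2P2: 0.0}
print(prob_choose_first(NEUTRAL))
print(prob_choose_first({**NEUTRAL, W1P1: 1.0}))
```

Raising a score for word 1 pushes the probability above one half, and raising the matching score for word 2 pushes it below, in line with the opposite signs of the main effects.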
