
Upload: michael-dalton

Post on 04-Jan-2016


Capturing patterns of linguistic interaction in a parsed corpus
A methodological case study

Sean Wallis
Survey of English Usage
University College London

Capturing linguistic interaction...

• Parsed corpus linguistics
• Intra-structural priming
• Experiments
  – Attributive AJPs before a noun
  – Embedded postmodifying clauses
  – Sequential postmodifying clauses
  – Speech vs. writing
• Conclusions
• The handout explains the analytical method in more detail (so read it later!)

Parsed corpus linguistics

• An example tree from ICE-GB (spoken)

[Figure: parse tree for ICE-GB text unit S1A-006 #23]

Parsed corpus linguistics

• Three kinds of evidence may be obtained from a parsed corpus
  – Frequency evidence of a particular known rule, structure or linguistic event
  – Coverage evidence of new rules, etc.
  – Interaction evidence of the relationship between rules, structures and events
• This evidence is necessarily framed within a particular grammatical scheme
  – How might we evaluate this grammar?


Intra-structural priming

• Priming effects within a structure
  – Study repeating an additive step in structures
• Consider
  – a phrase or clause that may (in principle) be extended ad infinitum
    • e.g. an NP with a noun head
  – a single additive step applied to this structure
    • e.g. add an attributive AJP before the head
  – Q. What is the effect of repeatedly applying this operation to the structure?

[Figure: NP tree with head noun "ship" (N), extended step by step with the attributive AJPs "tall", "very green" and "old"]
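To make the "repeated additive step" concrete, the sketch below counts how many NPs contain at least x attributive AJPs. It is only an illustration: the toy NP list and the helper name `at_least_counts` are assumptions made here, not data or code from the talk, and real counts would come from queries over the parsed corpus.

```python
from collections import Counter

# Hypothetical toy data (NOT ICE-GB): each NP is reduced to the list of its
# attributive AJPs. Only the number of AJPs per NP is used below.
nps = [
    [],
    ["tall"],
    ["tall", "very green"],
    ["tall", "very green", "old"],
    [],
    ["tall"],
]


def at_least_counts(nps):
    """Return a list F where F[x] is the number of NPs with at least x AJPs."""
    exact = Counter(len(ajps) for ajps in nps)
    max_x = max(exact, default=0)
    return [sum(n for k, n in exact.items() if k >= x) for x in range(max_x + 1)]


F = at_least_counts(nps)
print(F)  # [6, 4, 2, 1]
```

Counting "at least x" rather than "exactly x" means each F[x] is the pool of NPs in which the x-th step was taken, which is the quantity a step-by-step analysis needs.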

Experiment 1: analysis of results

• Sequential probability analysis
  – calculate probability of adding each AJP
  – error bars: Wilson intervals
  – probability falls
    • second < first
    • third < second
  – decisions interact
  – Every AJP added makes it harder to add another

[Figure: probability of adding each successive AJP, with error bars; x-axis 0 to 5, y-axis probability 0.00 to 0.20]
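The analysis on this slide can be sketched as follows. The sketch assumes that the probability of adding the x-th AJP is estimated as F(x) / F(x-1), where F(x) is the number of NPs with at least x AJPs, and it uses the standard Wilson score interval for the error bars. The frequencies are invented for illustration and are not the ICE-GB results; the function names are likewise hypothetical.

```python
import math


def wilson_interval(p, n, z=1.959964):
    """Wilson score interval for an observed proportion p out of n cases."""
    if n <= 0:
        raise ValueError("n must be positive")
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def sequential_probabilities(F):
    """Return (x, p, n) triples with p = F[x] / F[x-1] and n = F[x-1]."""
    return [(x, F[x] / F[x - 1], F[x - 1])
            for x in range(1, len(F)) if F[x - 1] > 0]


# Hypothetical 'at least x' frequencies, NOT the figures from the talk.
F = [10000, 2000, 200, 10]

for x, p, n in sequential_probabilities(F):
    lower, upper = wilson_interval(p, n)
    print(f"x={x}: p={p:.4f}  95% interval [{lower:.4f}, {upper:.4f}]")
```

Note that the sample size for each step is the number of cases that reached the previous step, so the intervals widen quickly as x grows.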

Experiment 1: explanations?

• Feedback loop: for each successive AJP, it is more difficult to add a further AJP
  – logical-semantic constraints
    • tend to say the tall green ship
    • do not tend to say tall short ship or green tall ship
  – communicative economy
    • once speaker said tall green ship, tends to only say ship
  – memory/processing constraints
    • unlikely: this is a small structure, as are AJPs

Experiment 1: speech vs. writing

• Spoken vs. written subcorpora
  – Same overall pattern
  – Spoken data tends to have fewer attributive AJPs
    • Support for communicative economy or memory/processing hypotheses?
  – Significance tests
    • Paired 2x1 Wilson tests (Wallis 2011)
    • first and second observed spoken probabilities are significantly smaller than written

[Figure: probability of adding each successive AJP, written vs. spoken; x-axis 0 to 5, y-axis probability 0.00 to 0.25]
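One way to compare a spoken and a written probability at the same step is an interval on their difference built from the two Wilson intervals (the Newcombe-Wilson construction). This is offered as a stand-in sketch only: it may differ in detail from the paired test described in Wallis (2011), and the proportions and sample sizes below are invented.

```python
import math


def wilson_interval(p, n, z=1.959964):
    """Wilson score interval for an observed proportion p out of n cases."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def newcombe_wilson_diff(p1, n1, p2, n2, z=1.959964):
    """Return (d, lower, upper) for the difference d = p1 - p2."""
    l1, u1 = wilson_interval(p1, n1, z)
    l2, u2 = wilson_interval(p2, n2, z)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return d, lower, upper


# Hypothetical values: spoken p=0.18 (n=60000), written p=0.22 (n=70000).
d, lower, upper = newcombe_wilson_diff(0.18, 60000, 0.22, 70000)
significant = lower > 0 or upper < 0  # the interval excludes zero
print(f"d={d:.4f}  interval [{lower:.4f}, {upper:.4f}]  significant={significant}")
```

The difference is treated as significant at the chosen error level when the interval does not contain zero.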

Experiment 2: preverbal AVPs

• Consider adverb phrases before a verb
  – Results very different
    • Probability does not fall significantly between first and second AVP
    • Probability does fall between third and second AVP
  – Possible constraints
    • (weak) communicative
    • (weak) semantic
  – Further investigation needed

[Figure: probability of adding each successive preverbal AVP; x-axis 0 to 4, y-axis probability 0.00 to 0.10]

Experiment 3: postmodifying clauses

• Another way to specify nouns in English
  – add clause after noun to explicate it
    • the ship [that was in the port]
    • the ship [called Ariadne]
  – may be embedded
    • the ship [that was in the port [we visited last week]]
  – or successively postmodified
    • the ship [called Ariadne][that was in the port]
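The difference between embedding and sequential postmodification can be illustrated with a toy tree. The `NP` and `Clause` classes below are hypothetical simplifications invented for this sketch, not the ICE-GB annotation scheme: an NP holds its postmodifying clauses, and a clause holds the NPs found inside it.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    text: str
    nps: list = field(default_factory=list)  # NPs contained in this clause


@dataclass
class NP:
    head: str
    postmods: list = field(default_factory=list)  # clauses postmodifying the head


def sequential_count(np):
    """Number of postmodifying clauses attached to the same head."""
    return len(np.postmods)


def embedding_depth(np):
    """Length of the longest chain of postmodifying clauses nested inside one another."""
    if not np.postmods:
        return 0
    inner = (embedding_depth(n) for c in np.postmods for n in c.nps)
    return 1 + max(inner, default=0)


# the ship [that was in the port [we visited last week]]
embedded = NP("ship", [Clause("that was in the port",
                              [NP("port", [Clause("we visited last week")])])])

# the ship [called Ariadne][that was in the port]
sequential = NP("ship", [Clause("called Ariadne"),
                         Clause("that was in the port", [NP("port")])])

print(sequential_count(embedded), embedding_depth(embedded))      # 1 2
print(sequential_count(sequential), embedding_depth(sequential))  # 2 1
```

Both examples contain two postmodifying clauses, but they differ in where the second clause attaches: to a noun inside the first clause, or to the original head.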

Experiment 3: (i) embedding

• Probability of adding a further embedded postmodifying clause falls with size
  – All data
    • second < first
    • third < first
  – Spoken
    • second < first
  – Written
    • third < second
• Compare with effect of sequential postmodification of same head

[Figure: probability of adding each successive embedded postmodifying clause, for written, spoken and all data; x-axis 0 to 4, y-axis probability 0.00 to 0.10]


Experiment 3: (ii) sequential

• Probability of sequential postmodifying falls and, for spoken data, falls, then rises
  – All data
    • second < first
  – Spoken
    • third > second
  – Option: count conjoins separately or treat as single item
    • Either way, results show similar pattern
  – Negative feedback: the ‘in for a penny’ effect

[Figure: probability of adding each successive sequential postmodifying clause, written vs. spoken; x-axis 0 to 5, y-axis probability 0.00 to 0.15]

Experiment 3: (iii) embed vs. seq

• Embedded vs. sequential postmodification
    • embedding > sequence (second level)
  – It is slightly easier to modify the latest head than a more remote one:
    • semantic constraints?
    • backtracking cost?
  – Third level
    • embedding < sequence (if counting conjoins)
    • long sequences seem to be easier to construct than comparable layers of embedding

[Figure: probability of adding each successive postmodifying clause, embedding vs. sequential; x-axis 0 to 5, y-axis probability 0.00 to 0.15]

Conclusions

• A method for evaluating interactions along grammatical axes
  – General purpose, robust, structural
  – More abstract than ‘linguistic choice’ experiments
  – Depends on a concept of grammatical distance along an axis, based on the chosen grammar
• Method has philosophical implications
  – Grammar viewed as outcome of linguistic choices
  – Linguistics as an evaluable observational science
    • Signature (trace) of language production decisions
  – A unification of theoretical and corpus linguistics?

Potential applications

• Corpus linguistics
  – Optimising existing grammatical framework
    • e.g. coordination, compound nouns
  – Comparing genres/languages/periods
• Theoretical linguistics
  – Comparing different grammars, same language
• Psycholinguistics
  – Search for evidence of language production constraints in spontaneous speech corpora
    • speech and language therapy
    • language acquisition and development

References

Nelson, G., Wallis, S. & Aarts, B. (2002) Exploring natural language. Benjamins.

Pickering, M. & Ferreira, V. (2008) Structural priming. Psychological Bulletin 134, 427–459.

Wallis, S.A. (2011) Comparing χ² tests for separability. Survey of English Usage.

• For explanation of the analysis method see the handout!

• For more detail and a draft of the full paper see http://corplingstats.wordpress.com