Methods for Identifying Relations and Associations in Text Stuart G. Towns and Richard Watson Todd King Mongkut’s University of Technology Thonburi DRAL 2017

  • Methods for Identifying Relations and Associations in Text

    Stuart G. Towns and Richard Watson Todd
    King Mongkut’s University of Technology Thonburi

    DRAL 2017

  • Previous Research into Writing Quality

    What are the linguistic features found in proficient writing?

    What makes good writing good?

    How do we teach students to be good writers?

    How are linguistic features expressed in different levels of proficiency?

    What linguistic features correlate with human judgements of writing quality?

    What is Basic English?

    What does English Proficiency mean?

  • Linguistic Features for Writing Quality

    Syntactic Complexity
    • Clause Length
    • # of words before the main verb

    Lexical Diversity & Sophistication
    • Type-Token Ratios
    • Word Rarity

    Cohesion
    • Repetition & Reference
    • Paraphrases

    Coherence

    ?
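
    Of the features listed on this slide, the type-token ratio is the simplest to compute: the number of distinct word forms divided by the total number of words. The sketch below is only an illustration, not the measure used in any of the studies mentioned. The function name, the tokenisation rule, and the sample sentence are assumptions.

```python
import re

def type_token_ratio(text: str) -> float:
    """Return distinct word forms (types) divided by total words (tokens)."""
    # Assumed tokenisation: lowercase alphabetic runs, allowing inner apostrophes.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Invented sample sentence: 9 tokens, 6 types.
sample = "the house is the house that the camera shows"
ttr = type_token_ratio(sample)
print(f"{ttr:.2f}")
```

    Raw type-token ratios fall as texts get longer, so comparisons are usually only meaningful between texts of similar length.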

  • Research on Writing Quality

    Pulitzer Prize Winner

  • Connectedness of Concepts in a Text

    Cohesion: explicit connected concepts in the text

    Coherence: implicit connections in the mind of the reader

    Reiterations
    • Repetition
    • Reference

    Relations
    • Synonyms
    • Hyponyms
    • Meronyms

    Associations
    • Any other connection

  • “Moonrise Kingdom” opens with no music — just the sound of raindrops falling on the roof of a preternaturally cozy house, which the camera gently leads the audience through as the family members inside go about their rainy day business.

    Bathed in apple reds, egg yolk yellows and an air of studied eccentricity, the house is immediately recognizable as yet another habitat created by Wes Anderson, a film director whose obsession with material culture, nostalgia and nursery comforts borders on the fetishistic.

    Of course, for viewers who happen to share Anderson’s taste for boldly framed, bespoke productions — in which everything looks (and most probably is) lovingly handmade and artisanal, “Moonrise Kingdom” will simply offer yet another chance to live, at least for a little while, in the kind of universe only Anderson can create.

    (You can almost smell the damp canvas and wood polish in that opening sequence.)

    Those who long ago wrote off the writer-director as insufferably mannered and arcane — the usual term of art is “twee” — well, they’re welcome to stay out in the rain.

    That opening scene house has a name, by the way: Summer’s End, which turns out to aptly capture a vaguely autumnal tale of young love that takes place in early September 1965 — a time of Ford Falcons and mothers who smoked.

  • Associations: PPWs vs Bloggers

    Moonrise Kingdom

    Author   Blog or PPW?   % Assoc.
    MO       PPW            56%
    DA       PPW            40%
    HD       PPW            38%
    MG       PPW            31%
    AF       Blog           26%
    FB       Blog           23%
    FI       Blog           16%
    CN       Blog           15%

    Ghost Protocol

    Author   Blog or PPW?   % Assoc.
    HD       PPW            43%
    DA       PPW            41%
    MO       PPW            41%
    MG       PPW            32%
    CT       Blog           30%
    CN       Blog           30%
    FI       Blog           24%
    AF       Blog           22%

  • Methods for Identifying Relations and Associations in Text

    Relations:
    • Oxford American Writer's Thesaurus
    • WordNet (Princeton)
    • UCREL Semantic Analysis System

    Associations:
    • Word association database at Small World of Words
    • Near Neighbors LSA tool hosted at the University of Colorado
    • MI scores from the Corpus of Contemporary American English (COCA)

  • Creating a Word List

    These criteria were followed to identify 73 words to be analyzed:

    • Must be content words only, excluding adverbs, proper nouns, auxiliary verbs, phrasal verbs, or other idioms (e.g. of course, by the way)

    • Must be separate entries in dictionaries and thesauruses (e.g. rainy-day was broken up into separate words but egg-yolk was not)

    • Must not be one of the top 250 most common English words

  • Three Phases of this Study

    • Phase 1: Identify all possible word pair connections per source, then determine which word pair connections are relevant to our text.

    • Phase 2: Determine which of the six sources returned the most valid results.

    • Phase 3: Sum the results from all six sources to see if the word pairs matched the researcher’s intuition of connectedness between concepts in the text.

  • Phase 1

    Example: Finding relevant connections

    In the Thesaurus: The house is the audience at a theatre

    In our text: The audience is watching the movie that shows a house

    audience – house

  • Phase 1

    Identify all possible word pair connections per source, then determine which word pair connections are relevant to our text.

    Source             Identified   Relevant   % Relevant
    Thesaurus          21           6          29%
    WordNet            32           12         38%
    Semantic Tagging   21           10         48%
    Small World        45           23         51%
    LSA                34           19         56%
    MI Scores          28           15         54%

    (Note: 73 words -> 2,628 possible combinations)
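
    The note's figure of 2,628 is the number of unordered pairs that can be drawn from 73 words, and each "% Relevant" value is the relevant count divided by the identified count. The check below is a minimal sketch: the counts are copied from the table, while the placeholder word labels and variable names are assumptions, since the 73 words themselves are not listed on the slides.

```python
from itertools import combinations
from math import comb

N_WORDS = 73  # size of the word list reported on the slides

# Placeholder labels only; the actual 73 words are not given in the transcript.
words = [f"word{i:02d}" for i in range(N_WORDS)]

# Every unordered pair of distinct words is a candidate connection.
pairs = list(combinations(words, 2))
assert len(pairs) == comb(N_WORDS, 2)
print(len(pairs))  # 2628

# (identified, relevant) counts per source, copied from the Phase 1 table.
phase1 = {
    "Thesaurus": (21, 6),
    "WordNet": (32, 12),
    "Semantic Tagging": (21, 10),
    "Small World": (45, 23),
    "LSA": (34, 19),
    "MI Scores": (28, 15),
}

# Percentage of identified pairs judged relevant, rounded to a whole percent.
percent = {
    source: round(100 * relevant / identified)
    for source, (identified, relevant) in phase1.items()
}
for source, value in percent.items():
    print(f"{source:<18}{value}%")
```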

  • Phase 2

    Determine which of the six sources returned the most valid results.

    • Assumption: An exceptionally well-written text should have high connectedness and should be well organized, i.e., the words in connected pairs should be closer together than they would be in a random order.

    • Point-biserial correlation between the distance between the words in a word pair and whether or not the pair was a match.
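
    A point-biserial correlation relates a continuous variable (here, the distance between the two words of a pair) to a binary one (whether the pair was a match). The sketch below uses the textbook formula with only the standard library. The function name and the toy data are invented for illustration and are not taken from the study.

```python
from math import sqrt
from statistics import mean, pstdev

def point_biserial(values, flags):
    """Point-biserial correlation between a continuous and a binary variable."""
    group1 = [v for v, f in zip(values, flags) if f]
    group0 = [v for v, f in zip(values, flags) if not f]
    n1, n0, n = len(group1), len(group0), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one observation")
    spread = pstdev(values)  # population standard deviation of all values
    if spread == 0:
        raise ValueError("values must not all be identical")
    # r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2)
    return (mean(group1) - mean(group0)) / spread * sqrt(n1 * n0 / n**2)

# Invented toy data: distance in words between the members of each pair,
# and whether a source flagged the pair as connected (1) or not (0).
distances = [3, 5, 8, 40, 55, 60, 72, 90]
is_match = [1, 1, 1, 0, 1, 0, 0, 0]

r = point_biserial(distances, is_match)
print(f"{r:.2f}")
```

    A negative value means that matched pairs tend to sit closer together than unmatched ones, which is the direction the slide's assumption predicts.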

  • Phase 2

    Point-Biserial Correlation between distances and relevance

    Source                     PBS Correlation
    Thesaurus                  -0.17**
    WordNet                    -0.04*
    Semantic Tagging           -0.10**
    Small World of Words       -0.24**
    Latent Semantic Analysis   -0.02
    COCA MI Scores             -0.12**

    * p < .05   ** p < .00001

  • Phase 3

    Sum the results from all six sources to see if the word pairs matched the researcher’s intuition of connectedness between concepts.

    audience – viewers
    reds – yellows
    damp – rain

    Found by…    # Word Pairs   % Relevant
    6 Sources    0              n/a
    5 Sources    0              n/a
    4 Sources    3              100%
    3 Sources    8              88%
    2 Sources    27             52%
    1 Source     91             29%
    0 Sources    2,499          0%
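
    Summing across sources amounts to counting, for each word pair, how many sources returned it. The sketch below shows one way to do that tally and then checks the totals in the table. The function name, the example pairs, and their assignment to sources are all invented for illustration; only the per-row counts are copied from the slide.

```python
from collections import Counter

def tally_sources(hits_by_source):
    """Count, for each unordered word pair, how many sources returned it."""
    counts = Counter()
    for pairs in hits_by_source.values():
        # Sort each pair so ("roof", "house") and ("house", "roof") match, and
        # use a set so a source counts at most once per pair.
        counts.update({tuple(sorted(pair)) for pair in pairs})
    return counts

# Invented hits, not the study's data.
hits_by_source = {
    "Thesaurus": [("house", "roof")],
    "WordNet": [("roof", "house"), ("camera", "director")],
    "Small World": [("house", "roof"), ("rain", "raindrops")],
    "MI Scores": [("raindrops", "rain"), ("camera", "director"), ("house", "roof")],
}
counts = tally_sources(hits_by_source)
for pair, n_sources in counts.most_common():
    print(pair, n_sources)

# Number of word pairs found by each number of sources, from the Phase 3 table.
pairs_per_count = {6: 0, 5: 0, 4: 3, 3: 8, 2: 27, 1: 91, 0: 2499}
total_pairs = sum(pairs_per_count.values())          # all candidate pairs
found_pairs = total_pairs - pairs_per_count[0]       # found by at least one source
print(total_pairs, found_pairs)
```

    With the table's counts, the rows sum to the 2,628 candidate pairs, of which 129 were found by at least one source.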

  • Phase 1 Finding Summary

    Phase 1: The association database (Small World) and the corpus-based sources (LSA and MI) returned better results than the other sources, as they identified:

    • More word pairs

    • A higher percentage of relevant word pairs

    • Unique reiterations, relations, and associations

  • Phase 2 Finding Summary

    • Small World was the most valid source (with respect to distance between words), with WordNet and LSA being the least valid.

    However, WordNet and LSA identified the greatest number of movie terms.

    • Limitation: This study only considered word pairs, not word chains or word networks.

  • Phase 3 Finding Summary

    The more often a word pair was found, the more likely it was a relevant word pair for our text.

    These methods reduced the number of potential word pairs from 2,628 to a more manageable 129.

    These six methods therefore show promise for future full automation of the identification of relations and associations in a text.

  • Thank you!