a memory-sensitive classification model of errors in early...

Acquiring a second language (L2) as an adult is notoriously difficult. By understanding where individual learners make mistakes, we can improve the efficiency and durability of L2 learning.

• Linguistic factors:
  • E.g., cognates and concrete words are easier (de Groot & Keijzer, 2000), while interlingual homographs are harder (Dijkstra, Timmermans & Schriefers, 2000)

• Memory factors:
  • Since language is learned, it must be stored in memory.
  • What improves memory in general should also improve memory for language
  • Spaced repetition: words (and other items) are remembered better when they are encountered repeatedly, with temporal gaps in between (vs. repeated all at once).
    • Longer gaps are better (e.g. Cepeda et al. 2006)
    • Robust over seconds, minutes, days, weeks, years (e.g. Cepeda et al. 2008)
    • Applies to a wide variety of materials (e.g. Donovan & Radosevich, 1999)
    • Including language (e.g. Ullman & Lovelett, 2018)

  • Retrieval practice: recalling information from memory makes that information easier to recall in the future
    • Duolingo frequently prompts users to retrieve from memory
    • Retrieval practice enhances the efficacy of spaced repetition

By better understanding the factors that influence learning and retention of L2, systems like Duolingo can:
• Devote more resources to the most difficult aspects of the L2 (for each learner)
• Schedule review of learned material when it is of most benefit to the learner
• Leverage their own users’ data to improve understanding of the learning process, and in turn improve learning outcomes

A Memory-Sensitive Classification Model of Errors in Early Second Language Learning

INTRODUCTION

Brendan Tomoschuk, Jarrett T. Lovelett
University of California, San Diego

Contact: [email protected], [email protected]

ENGINEERED FEATURES

MODEL

RESULTS & DISCUSSION

DATA

REFERENCES

                AUROC    F1       Log-loss
SLAM English    .7730    .1899    .3580
English         .8286    .4242    .3191
SLAM French     .7707    .2814    .3952
French          .8228    .4416    .3561
SLAM Spanish    .7456    .1753    .3862
Spanish         .8027    .4353    .3571
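The three metrics reported above can be computed from predicted error probabilities roughly as follows. This is a minimal sketch in plain Python, assuming binary labels (1 = error) and a 0.5 decision threshold for F1; the threshold and the toy data are assumptions, not values taken from the poster.

```python
import math

def auroc(labels, scores):
    """Chance that a random error trial outscores a random non-error trial (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(labels, scores, threshold=0.5):
    """Harmonic mean of precision and recall for the "error" class (threshold is assumed)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def log_loss(labels, scores, eps=1e-15):
    """Mean negative log-likelihood of the true labels; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, s in zip(labels, scores):
        s = min(max(s, eps), 1 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1 - s)
    return total / len(labels)

# Toy example (hypothetical data): four trials with predicted error probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
metrics = (auroc(labels, scores), f1_score(labels, scores), log_loss(labels, scores))
print(metrics)
```

Higher AUROC and F1 are better; lower log-loss is better.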

[Figure 2 plot: three panels (English, French, Spanish); y-axis: Feature; x-axis: Importance (0.000 to 0.100); legend: Engineered, New, Old. Features shown: userMeanError, userVarError, countries, timePerToken, days, userTrial, tokenMeanError, time, tokenVarError, format:prevFormat, lagTr1Tr2, nthOccurance, stemLag1, tokenLag1, stemLag1:stemLag2, lagTr1Tr2:morphoComplexity, stemLag1:stemLag2:lagTr1Tr2, stemLag2, tokenLag2, sentLength.]

The first 30 days of each user’s learning are broken down into:
• Training: each user’s first 80% of sessions
• Development: the next 10% of each user’s data
• Test: the final 10% of exercises for each user
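The per-user chronological split described above might be sketched as follows. The record fields `user` and `time` are hypothetical names, and the split is done per record here; the poster describes the units as sessions, data, and exercises, so the exact granularity is an assumption.

```python
from collections import defaultdict

def split_by_user(records, train_pct=80, dev_pct=10):
    """Split each user's records chronologically into train/dev/test portions."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["user"]].append(rec)

    train, dev, test = [], [], []
    for recs in by_user.values():
        ordered = sorted(recs, key=lambda r: r["time"])  # earliest first
        n = len(ordered)
        train_end = n * train_pct // 100             # first 80%
        dev_end = n * (train_pct + dev_pct) // 100   # next 10%
        train.extend(ordered[:train_end])
        dev.extend(ordered[train_end:dev_end])
        test.extend(ordered[dev_end:])               # final 10%
    return train, dev, test

# Toy data (hypothetical): two users with ten records each, given out of order.
records = [{"user": u, "time": t} for u in ("a", "b") for t in range(10, 0, -1)]
train, dev, test = split_by_user(records)
print(len(train), len(dev), len(test))
```

Splitting within each user, rather than across users, means every user contributes to all three sets, and the test portion always comes later in time than the training portion.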

Linguistic
• orthoLength: Word length in characters
• phonLength: Word length in phonemes
• orthoNei: Number of orthographic word neighbors
• phonNei: Number of phonological word neighbors
• logWordFreq: Log-transformed word frequency
• logOrthoNeiFreq: Average log-transformed word frequency of orthographic neighbors
• logPhonNeiFreq: Average log-transformed word frequency of phonological neighbors
• Edit Distance: Levenshtein distance between translations of word
• Interlingual homograph: Whether a given translation was identical to a different word in the source language
• morphoComplexity: Number of morphological features
• Concreteness: Subject ratings of how perceptible an entity is

Memory
• nthOccurance: Number of times a token has been seen
• userTrial: Number of trials a user has seen
• tokenLag1: Amount of time since token last seen
• tokenLag2: Amount of time between last time a word has been seen and the time before that
• stemLag1: Amount of time since stemmed token has been seen
• stemLag2: Amount of time between last time a stemmed token has been seen and the time before that
• morphoLag1: Amount of time since morphological features have last been seen
• lagTr1Tr2: Amount of time between first and second trials containing that token

Categorical
• pos: Part of speech
• format: Trial format (see Figure 1)
• prevFormat: Previous trial format
• client: User’s client (collapsed to mobile or web)
• userMeanError: Average of a user’s accuracy across trials
• userVarError: Variance in a user’s accuracy across trials

Interactions
• stemLag1 x stemLag2
• stemLag1 x stemLag2 x lagTr1Tr2
• lagTr1Tr2 x morphoComplexity
• lagTr1Tr2 x morphoLag1
• format x prevFormat
• orthoNei x format
• phonNei x format
• format x client
• morphoComplexity x pos
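As an illustration of the memory features, the sketch below derives nthOccurance, tokenLag1, and tokenLag2 from a trial log. The input fields (`user`, `token`, `time`) are hypothetical, and counting the current trial as the nth occurrence is an assumption; the poster does not specify either.

```python
def add_lag_features(trials):
    """Return copies of the trials, in time order, with per-user, per-token memory features."""
    history = {}  # (user, token) -> times of earlier trials containing that token
    out = []
    for trial in sorted(trials, key=lambda r: r["time"]):
        seen = history.setdefault((trial["user"], trial["token"]), [])
        feats = dict(trial)
        # Assumption: the current trial counts as the nth occurrence.
        feats["nthOccurance"] = len(seen) + 1
        # Time since the token was last seen (None on first exposure).
        feats["tokenLag1"] = trial["time"] - seen[-1] if seen else None
        # Time between the last sighting and the one before it.
        feats["tokenLag2"] = seen[-1] - seen[-2] if len(seen) >= 2 else None
        out.append(feats)
        seen.append(trial["time"])
    return out

# Toy log (hypothetical): one user sees the same token at times 0, 1, and 4.
trials = [
    {"user": "u", "token": "gato", "time": 4},
    {"user": "u", "token": "gato", "time": 0},
    {"user": "u", "token": "gato", "time": 1},
]
feats = add_lag_features(trials)
print(feats[2])
```

The stem-based lags (stemLag1, stemLag2) would follow the same pattern with a stemmed token as the key.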

Three groups were analyzed separately:
• English-speaking learners of Spanish
• English-speaking learners of French
• Spanish-speaking learners of English

POPULATIONS

THREE SETS

Reverse Translate, Reverse Tap, Listen

Random forest classifier
• Each decision tree branched a number of times equal to the square root of the total number of features
• An ensemble of 1000 trees was created for each of the three language datasets
• Each tree branched until leaves were pure (contained only a single label: “error” or “no error”)
• Out-of-bag error was used to estimate prediction error of the classifier
• The classifier was trained in Python 3, using sklearn.ensemble.RandomForestClassifier()
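A configuration along these lines could look like the sketch below. The square-root rule is read here as `max_features="sqrt"` (features considered per split), which is an interpretation rather than something the poster states outright; the synthetic data and `random_state` are placeholders, not the authors' setup.

```python
import random
from sklearn.ensemble import RandomForestClassifier

random.seed(0)
# Synthetic stand-in data (hypothetical): two features, label 1 = "error".
X = [[random.random(), random.random()] for _ in range(300)]
y = [1 if row[0] + 0.1 * random.random() > 0.6 else 0 for row in X]

clf = RandomForestClassifier(
    n_estimators=1000,     # ensemble of 1000 trees
    max_features="sqrt",   # assumed reading of the square-root rule
    max_depth=None,        # grow each tree until its leaves are pure
    min_samples_leaf=1,
    bootstrap=True,        # needed for out-of-bag estimation
    oob_score=True,
    random_state=0,        # placeholder seed for reproducibility
)
clf.fit(X, y)

oob_error = 1.0 - clf.oob_score_                     # out-of-bag error estimate
error_prob = clf.predict_proba([[0.9, 0.5]])[0][1]   # P(error) for one new trial
print(oob_error, error_prob)
```

The predicted probability of the "error" class is the quantity that metrics such as AUROC and log-loss are computed on.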

• The more time users spend per token (on average) within an exercise (timePerToken), the more likely they are to make errors in that exercise
• Users make more errors on average the longer they’ve spent using the app (days, userTrial), perhaps because item difficulty also increases with experience.
• Words that repeat more often (nthOccurance) are remembered better.
• The more time that passed since the previous occurrence of a word, the higher the error rate (tokenLag1, tokenLag2)
  • Contrary to the spacing effect: perhaps more consideration of full item history is needed (or the gaps were too long; see Cepeda et al. 2008)
• There seems to be a cost to switching formats: error rates are higher when the current task type is different from the previous one (format:prevFormat)
• Future models will include ablation experiments and word embeddings

• Most important features: userMeanError; userVarError:
  • mean and variance of each user’s error rate (under each combination of levels of a small set of features)
  • Computational savings over fitting a more comprehensive random effect structure (i.e. random effects for all users, all tokens, and all user-token combinations, at minimum)
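The user-level summaries might be computed along these lines. This is a minimal sketch in which the grouping feature (`format`), the field names, and the use of population variance are all assumptions; the poster does not say which small set of features was used.

```python
from collections import defaultdict
from statistics import mean, pvariance

def user_error_stats(train_trials, group_keys=("format",)):
    """Mean and variance of each user's error rate within each level of the grouping features."""
    groups = defaultdict(list)
    for trial in train_trials:
        key = (trial["user"],) + tuple(trial[k] for k in group_keys)
        groups[key].append(trial["error"])  # 1 = error, 0 = no error
    return {
        key: {"userMeanError": mean(errs), "userVarError": pvariance(errs)}
        for key, errs in groups.items()
    }

# Toy training trials (hypothetical): one user, two exercise formats.
train_trials = [
    {"user": "u", "format": "listen", "error": 0},
    {"user": "u", "format": "listen", "error": 1},
    {"user": "u", "format": "listen", "error": 1},
    {"user": "u", "format": "listen", "error": 0},
    {"user": "u", "format": "reverse_tap", "error": 0},
]
stats = user_error_stats(train_trials)
print(stats[("u", "listen")])
```

Each trial then looks up its own (user, format) key to obtain the two features, which is far cheaper than estimating a separate random effect per user and token.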

Figure 1. Examples of Duolingo exercises and error markings present in the data

Figure 2. Importance measures for each of the top 20 features.

Table 2. Model outcomes compared to SLAM baselines.

Table 1. Names and descriptions of the engineered features

Ben Ambridge, Anna L. Theakston, Elena V. M. Lieven, and Michael Tomasello. 2006. The distributed learning effect for children's acquisition of an abstract syntactic construction. Cognitive Development, 21(2), 174–193.
Harry P. Bahrick and Elizabeth Phelps. 1987. Retention of Spanish vocabulary over 8 years. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13(2), 344–349.
David A. Balota, Janet M. Duchek, and Ronda Paullin. 1989. Age-related differences in the impact of spacing, lag, and retention interval. Psychology and Aging, 4(1), 3–9.
Marc Brysbaert, Amy Beth Warriner, and Victor Kuperman. 2014. Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911.
Shana K. Carpenter. 2009. Cue strength as a moderator of the testing effect: The benefits of elaborative retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(6), 1563–1569.
Shana K. Carpenter and Edward L. DeLosh. 2006. Impoverished cue support enhances subsequent retention: Support for the elaborative retrieval explanation of the testing effect. Memory and Cognition, 268–276.
Nicholas J. Cepeda, Harold Pashler, Edward Vul, John T. Wixted, and Doug Rohrer. 2006. Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132(3), 354–380.
Nicholas J. Cepeda, Edward Vul, Doug Rohrer, John T. Wixted, and Harold Pashler. 2008. Spacing effects in learning: A temporal ridgeline of optimal retention. Psychological Science, 19(11), 1095–1102.
William L. Cull. 2000. Untangling the benefits of multiple study opportunities and repeated testing for cued recall. Applied Cognitive Psychology, 14(3), 215–235.
Annette De Groot and Rineke Keijzer. 2000. What is hard to learn is easy to forget: The roles of word concreteness, cognate status, and word frequency in foreign-language vocabulary learning and forgetting. Language Learning, 50(1), 1–56.
Ton Dijkstra, Mark Timmermans, and Herbert Schriefers. 2000. On being blinded by your other language: Effects of task demands on interlingual homograph recognition. Journal of Memory and Language, 42(4), 445–464.
John J. Donovan and David J. Radosevich. 1999. A meta-analytic review of the distribution of practice effect: Now you see it, now you don't. Journal of Applied Psychology, 84(5), 795–805.
Hermann Ebbinghaus. 1964. Memory: A contribution to experimental psychology (H. A. Ruger, C. E. Bussenius, & E. R. Hilgard, Trans.). New York, NY: Dover. (Original work published in 1885).
Jeffrey D. Karpicke and Henry L. Roediger. 2007. Expanding retrieval practice promotes short-term retention, but equally spaced retrieval enhances long-term retention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(4), 704–719.
Thomas K. Landauer and Robert A. Bjork. 1978. Optimum rehearsal patterns and name learning. In M. M. Gruneberg, P. E. Morris, & R. N. Sykes (Eds.), Practical aspects of memory (pp. 625–632). London: Academic Press.
Viorica Marian, James Bartolotti, Sarah Chabal, and Anthony Shook. 2012. CLEARPOND: Cross-linguistic easy-access resource for phonological and orthographic neighborhood densities. PLoS ONE, 7(8), e43230.
Cornelius P. Rea and Vito Modigliani. 1985. The effect of expanded versus massed practice on the retention of multiplication facts and spelling lists. Human Learning: Journal of Practical Research and Applications, 4(1), 11–18.
B. Settles, C. Brust, E. Gustafson, M. Hagiwara, and N. Madnani. 2018. Second Language Acquisition Modeling. In Proceedings of the NAACL-HLT Workshop on Innovative Use of NLP for Building Educational Applications (BEA). ACL.
Michael T. Ullman and Jarrett T. Lovelett. 2016. Implications of the declarative/procedural model for improving second language learning: The role of memory enhancement techniques. Second Language Research, 39(1), 39–65.
Eleanor Vander Linde, Barbara A. Morrongiello, and Carolyn Rovee-Collier. 1985. Determinants of retention in 8-week-old infants. Developmental Psychology, 21(4), 601–61.

Settles et al. 2018
