Correcting errors produced by French speakers writing in English: An illustration with misplaced adverbs


Upload: yaholo

Post on 23-Feb-2016

DESCRIPTION

Correcting errors produced by French speakers writing in English: An illustration with misplaced adverbs. Workshop LORIA, Nancy, 17-18 June 2010. Marie Garnier, Cultures Anglo-Saxonnes, Université Toulouse 2, France. P. Saint-Dizier, IRIT, CNRS, France.

TRANSCRIPT

Slide 1

Correcting errors produced by French speakers writing in English:
An illustration with misplaced adverbs

Workshop LORIA, Nancy, 17-18 June 2010
Marie Garnier, Cultures Anglo-Saxonnes, Université Toulouse 2, France
P. Saint-Dizier, IRIT, CNRS, France

Slide 2: Introduction

CorrecTools project
- Objective: develop correction rules for grammatical errors produced by French speakers writing in a foreign language, with an application to English (errors that grammar checkers neither detect nor correct)
- Didactic perspective: inclusion of dynamically generated explanations (grammar, several corrections, etc.) and possibly argumentation
- Possible extension to style

First experiment: errors linked to misplaced adverbs (adjuncts)
- Motivations for the correction of such errors
- Their automatic correction

Slide 3: Project Overview

- Target: French speakers
- Audience: the general public as well as professionals
- Exploratory corpus:
  - variety of types of documents, domains, authors
  - around 100,000 words (errors are manually detected and annotated)
- Classification of errors:
  - A priori choice: system of categories based on linguistic criteria (NP, PP, VP, Clause and Sentence) (Albert et al., 2009)

Slide 4: Parameters of the construction of a corpus

General methodology
- Construction of the corpus: first step of an error analysis methodology
- Designed in accordance with our objective (representativeness of errors and types of situations)

Parameters taken into consideration:
- Level of control
- Type of document
- Authors and target audience
- Fields or domains of document production

Slide 5: Description of parameters

Type of documents and level of control:
- From short spontaneous productions (e.g. emails, posts) to longer professional productions
- Quasi-continuum from low level to high level of control
  - Emails, blogs = low level of control
  - Web pages = average level of control
  - Professional productions = high level of control
- Variations exist within groups
- Around 200 pages (90 pages of internet productions, 110 pages of professional productions, 100,000 words), 79 authors

Slide 6: Constraints on the classification of errors

Methods
- Two main methods (Ellis, 2008):
  - Errors categorized according to linguistic criteria (i.e. syntax/morphology/lexicon, parts of speech, linguistic systems such as determination, expression of future, etc.)
  - Errors categorized according to the observation of surface phenomena (i.e. omission, addition, wrong use, etc.)
- Possibility of ad hoc categories (study of a limited number of error types concerning a specific group of learners)

Slide 7: Constraints on our classification system

- Categories should describe most types of errors (not ad hoc)
- Categories should be designed according to linguistic criteria (descriptions used to analyze the source of errors)
- Categories should be understood by most annotators and users
- The classification system should show internal coherence (linguistic, cognitive)
- Categories could be portable to other languages

Slide 8: Presentation of our error categorization system

- Main categories: syntactic phrases that contain the errors (NP, VP, PP, Sentence and Clause)
- Internal categories: finer distinctions designed after observation of the nature of errors. This leads to about 40 subclasses.
- Analyze the reasons for and sources of errors for a better correction.

Slide 9: Error categories: a few examples

NOUN PHRASE

Adjective
- Position of adjective w.r.t. noun
    The carrying of weapons is permitted in fifty states different. → The carrying of weapons is permitted in fifty different states.
- Order of adjectives in a complex construction
    European academic and industrial partners → Academic and industrial European partners
- Position of the adverb modifying an adjective (exceptional construction)
    A quite detailed analysis → Quite a detailed analysis

Determination
- Choice of article
    A Merovingian necropolis was built on exact site of the villa. → A Merovingian necropolis was built on the exact site of the villa.

NN construction
- Ungrammatical NN construction
    The objects properties → The properties of the objects
- Excessive NN stacking
    Security object granularity → The granularity of security objects

Table 4.

Slide 10: CALQUE: Frequency table

Category                                     Number of calque errors
Lexical & lex. choice calques                200
  Incorrect lexical choice of preposition     62
  Determiner                                  30
  Adverbs                                     12
  Modals                                      26
  Incorrect idiomatic expression              70
Structural calques                           105
  Incorrect position of adverbs               38
  Incorrect position of adjectives             7
  Argument omissions                          52
  Incorrect passive forms                      8
Stylistic calques                            122
  Incorrect temporal sequence                 26
  Incorrect choice of aspect                  20
  Punctuation errors                          76

Slide 11: Distribution of errors: a sample

Columns: Public., Emails, Learner product., Reports, TOTAL

NN Constructions        554611     TOTAL 10.5%
Choice of article       249259     TOTAL 9.3%
Choice of preposition   1827163    TOTAL 8.9%
Position of adverb      16086      TOTAL 4.2%
Transitivity            62112      TOTAL 4.2%
TOTAL                   47%, 22.9%, 39.2%, 50%     TOTAL 37.1%

Table 5. Main types of errors in the corpus

Slide 12: About other language pairs

The same remarks apply, but with quite different error categories:

French → Spanish (Mathilde Janier)
- Temporal agreement: Éramos en los tiempos, nos vamos con destino a Lyón → Éramos en los tiempos, nos fuimos con destino a Lyón
- Future with cuando: cuando seré más vieja → cuando sea más vieja

Spanish → English (Astrid Rojas)
- Realmente espero ir el próximo año → Really I hope I can go there next year (I really hope)
- Tengo 20 años → I have 20 years.
- The grammar of pronouns and reflexives is quite different in Spanish, leading to forms such as "David is me", a calque of "David soy yo".

French → German (Camille Albert)
- Ich habe gern die Suppe → Ich habe die Suppe gern.

Slide 13: Proposal for an annotation schema

- Attempt to reflect the parameters involved in error detection and correction by human correctors
- Annotations are in XML format
- The aim is to derive correction rules from annotations, possibly through machine-learning techniques

Slide 14: Error annotations: a preliminary proposal

One element tags the group of words involved in the error. Its attributes:
- comprehension: indicates if the segment is understandable (0 to 4)
- grammaticality: indicates how ungrammatical the error is (0 to 2)
- categ: main category of the error (lexical, syntactic, stylistic, semantic, textual)
- source: transfer, overgeneralization, erroneous rule

Table 1. Delimitation and characterization of an error

Slide 15

One element tags the text fragment involved in the correction; another tags each correction. Attributes:
- surface: size of the text fragment affected by the correction (minimal, average, maximal)
- grammar: indicates if the correction proposed is standard (by-default, alternative, unlikely)
- meaning: indicates if the meaning has been altered (yes, somewhat, no)
- var-size: indicates increase/decrease in number of words
- change: indicates the nature of the change (lexical, syntactic, stylistic, semantic, textual)
- comp: indicates if the correction is easy to understand (yes, average, no)
- fix: indicates whether the error is specific or not (yes, no)
- qualif: indicates the certainty level of the annotator (high, average, low)
- correct: gives the correction

NB: This schema is more complex than those used in other projects (ICLE and FreeText, NICT Japanese Learner English, Cambridge Learner Corpus), but the purposes are very different.

Table 2. Delimitation and characterization of correction(s)

Slide 16

Example of an annotated error with multiple corrections:
*We need to index efficiently the soundtrack of multimedia documents
We need to
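As an illustration of how the attributes listed in Tables 1 and 2 could fit together, here is a minimal Python sketch that builds one annotated sentence with the standard `xml.etree.ElementTree` module. The element names (`sentence`, `correction-zone`, `error-zone`, `correction`), the nesting, and all attribute values are assumptions made for this sketch: the transcript gives the attribute names but not the tag names, and the values below are not those of the original annotated example.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names: the slides describe what each element tags,
# but the tag names themselves do not appear in this transcript.
SENTENCE_TAG = "sentence"
CORRECTION_ZONE_TAG = "correction-zone"
ERROR_TAG = "error-zone"
CORRECTION_TAG = "correction"


def annotate(before, error_text, after, error_attrs, corrections):
    """Build an XML element for one sentence containing one annotated error.

    before / after: text surrounding the erroneous fragment.
    error_attrs:    attributes of the error element (Table 1).
    corrections:    one attribute dict per proposed correction (Table 2).
    """
    sentence = ET.Element(SENTENCE_TAG)
    sentence.text = before
    zone = ET.SubElement(sentence, CORRECTION_ZONE_TAG)
    zone.tail = after
    error = ET.SubElement(zone, ERROR_TAG, error_attrs)
    error.text = error_text
    for attrs in corrections:
        ET.SubElement(zone, CORRECTION_TAG, attrs)
    return sentence


# Illustrative values only, chosen from the value ranges given in the tables.
example = annotate(
    before="We need to ",
    error_text="index efficiently the soundtrack of multimedia documents",
    after="",
    error_attrs={"comprehension": "4", "grammaticality": "1",
                 "categ": "syntactic", "source": "transfer"},
    corrections=[
        {"surface": "minimal", "grammar": "by-default", "meaning": "no",
         "var-size": "0", "change": "syntactic", "comp": "yes", "fix": "no",
         "qualif": "high",
         "correct": "efficiently index the soundtrack of multimedia documents"},
        {"surface": "minimal", "grammar": "alternative", "meaning": "no",
         "var-size": "0", "change": "syntactic", "comp": "yes", "fix": "no",
         "qualif": "average",
         "correct": "index the soundtrack of multimedia documents efficiently"},
    ],
)

xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Keeping every correction as a sibling element with its own attributes makes it straightforward to rank or filter proposals later, for instance by the `grammar` or `qualif` values.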

Table 3. Example of an annotated error

Slide 17: The case of misplaced adverbs

- Distribution and type of errors in the corpus
- Responses offered by grammar checkers
- A correction strategy

Slide 18: Type of errors

Function / Type / Example

Adjuncts
- VP modifiers
  - Manner: *To index efficiently the distributional data
  - Degree: *His father resembles strongly his own character
  - Means or Instrument: *Our system is able to derive automatically information
- Clausal
  - Connective: *They exhibit nevertheless the dependency relationships observed in the source parse tree

Focusing Modifiers
- Additive: *The treatment of this official day exemplifies also an awnswer to associations
- Restrictive: ?in order to hand down exclusively family memories

Table 6. Errors linked to adverbs

Morphology: mostly prototypical -ly adverbs + simple or complex other adverbs (well, nevertheless...)

Slide 19: Grammar checkers

From payware to freeware, from professional websites to research projects:

After the Deadline, Paper Rater, TwinMarker, SpellCheckPlus, LanguageTool, Grammar Expert +, GrammarCheckAnywhere, Ginger, Word 2007, Grammar Slammer...

With those systems, on error samples from the corpus: best result = 19.3%

Misplaced adverbs in the VP are in general neither detected nor corrected.

Slide 20: Error sources

- Syntactic transfer (Ellis, 2008): adverb placed between main verb and complement (L1 influence)
    Ex: *It won't change completely the life of its citizens
- Generalization from exceptional cases: in English, adverbs can be found after the verb when the complement is long (Huddleston and Pullum, 2002) or when there is no complement (intransitive VP)
    Ex: She ate slowly.
    Ex: She waited anxiously for the results of the exam she had had such a hard time preparing for.

Slide 21: Towards automatic correction

Grammatical and linguistic framework:
- Descriptive grammar: The Cambridge Grammar of the English Language, R. Huddleston & G. K. Pullum (2002)
- Prescriptive grammar: Grammaire Explicative de l'Anglais, P. Larreya & C. Rivière (2005)

Overview of grammatical rules and tendencies governing adverb placement

Slide 22: Parameters involved in correction rules

Weight
- Length of AdvP (long head adverb and/or modification)
    ? She would very erratically tell her story.
- Presence/absence of complements after the verb + length
    ? She was slowly eating.
    She has slowly opened the door to the second guestroom. vs She has opened it slowly.

Semantics
- Adjunct type (Manner, Degree, Act-related...)
    They deliberately had stopped the train.
- Scope of the adverb (VP-oriented, Clause-oriented)
    Sadly they were arguing about the children.
    ? They were arguing sadly about the children.

Slide 23

Syntax
- "Simple" verbs vs prepositional verbs vs phrasal verbs
    She has opened the door slowly.
    She has slowly given up cigarettes.

Prosody
- Prosodically integrated vs prosodically detached
    *Anxiously she waited for the results.
    Anxiously, she waited for the results.

(Other works on parameters of adverb placement include: Kampers-Manhe, 1994; Engels, 2004)

Slide 24: Tests with native speakers

Columns: Sentences; OK; Incorrect (1), (2); Best Choice

1. Slowly she has opened the door.     x
2. She slowly has opened the door.     x
3. She has slowly opened the door.     x
4. She has opened the door slowly.     x x

(1) Grammatical but unnatural and/or changes original meaning
(2) Ungrammatical

Table 7. Sample from NS tests

Slide 25: Error patterns and correction rules

Manner adverbs used as adjuncts (VP-oriented)

Ex: *Slowly she has opened the door.
(1) Slowly, she has opened the door.
(2) She has opened the door slowly.
(3) She has slowly opened the door.

Correction: pattern for detection + rewriting under conditions, with preferences:

Adverb(+manner), NP1, {Auxiliary}, Verb, NP2 →
    Adverb(+manner), [,], NP1, {Auxiliary}, Verb, NP2, {preference: 1}
    NP1, {Auxiliary}, Verb, NP2, Adverb(+manner), {preference: 2}
    NP1, {Auxiliary}, Adverb(+manner), Verb, NP2, {preference: 3}

Slide 26

Ex: *She anxiously was waiting for the results.
(1) She was anxiously waiting for the results.
(2) She was waiting for the results anxiously.

Rewriting rule:

NP1, Adverb(+manner), {Auxiliary}, Verb, NP2 →
    NP1, {Auxiliary}, Verb, NP2, Adverb(+manner), {preference: 1}
    NP1, {Auxiliary}, Adverb(+manner), Verb, NP2, {preference: 2}
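The two rewriting rules for manner adverbs can be sketched as a small pattern-and-rewrite table. This is a minimal Python illustration, not the project's implementation: it assumes the sentence has already been chunked into labelled constituents (NP recognition is precisely one of the open difficulties), it reads `{Auxiliary}` as a slot that must be filled, and the names `RULES`, `render` and `suggest_corrections` are hypothetical. The preference values are those written in the rules.

```python
from typing import Dict, List, Sequence, Tuple

Chunk = Tuple[str, str]  # (label, text), e.g. ("ADV", "slowly")

# Each rule pairs an error pattern with rewrites ordered by preference.
# "," in a rewrite attaches a comma to the preceding constituent.
RULES = [
    {   # *Slowly she has opened the door.
        "pattern": ("ADV", "NP1", "AUX", "V", "NP2"),
        "rewrites": [
            (1, ("ADV", ",", "NP1", "AUX", "V", "NP2")),
            (2, ("NP1", "AUX", "V", "NP2", "ADV")),
            (3, ("NP1", "AUX", "ADV", "V", "NP2")),
        ],
    },
    {   # *She anxiously was waiting for the results.
        "pattern": ("NP1", "ADV", "AUX", "V", "NP2"),
        "rewrites": [
            (1, ("NP1", "AUX", "V", "NP2", "ADV")),
            (2, ("NP1", "AUX", "ADV", "V", "NP2")),
        ],
    },
]


def render(order: Sequence[str], slots: Dict[str, str]) -> str:
    """Linearize constituents in the given order and restore sentence casing."""
    words: List[str] = []
    for item in order:
        if item == ",":
            words[-1] += ","          # comma follows the previous constituent
        else:
            words.append(slots[item])
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


def suggest_corrections(chunks: Sequence[Chunk]) -> List[Tuple[int, str]]:
    """Return (preference, corrected sentence) pairs, best first.

    Chunk texts are expected in lower case; an empty list means that no
    error pattern matched.
    """
    labels = tuple(label for label, _ in chunks)
    slots = dict(chunks)
    for rule in RULES:
        if labels == rule["pattern"]:
            return [(pref, render(order, slots))
                    for pref, order in sorted(rule["rewrites"])]
    return []


for pref, text in suggest_corrections(
        [("ADV", "slowly"), ("NP1", "she"), ("AUX", "has"),
         ("V", "opened"), ("NP2", "the door")]):
    print(pref, text)
```

Storing the rules as data rather than as code keeps the detection pattern and its ranked rewrites side by side, which mirrors the way the rules are stated on the slides and would make it easy to add conditions (adjunct type, complement length) later.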

Slide 27

Difficulties:
- Deal with the recognition of NPs
- Possible interactions with other functions of adverbs (ex: He loves only his work, focusing modifier)

Await testing and implementation using (software platform for the identification of textual semantic structures):
- Evaluation of : Annotation of errors + correction proposals.

Slide 28: Perspectives

Further research on adverbs:
- Other functions, e.g. modifiers of adjectives and adverbs, focusing modifiers (might interact with existing error patterns)
- Internal syntax of AdvPs

Develop the explanation aspects of the project:
- Generate argumentations to deal with multiple correction propositions (Garnier et al., 2009)
- Design dynamically generated explanations for errors linked to adverbs
- Investigate cognitive aspects of error correction

Correction of NN errors (ex: the meaning utterance) and other types of errors
- Requires knowledge from different areas (lexical, ontological, domain knowledge, etc.)

Slide 29

More information on:
http://www.irit.fr/recherches/ILPL/webct/ct.html

Slide 30

Thank you