Pedagogical applications of corpus data for English for General and Specific Purposes

Pascual Pérez-Paredes, Universidad de Murcia, Campus Mare Nostrum. Université Catholique de Louvain, FIAL (lecture open to researchers and students): Wednesday 4 December, 12:45 (room ERAS 56).

Upload: pascual-perez-paredes

Post on 21-Jun-2015

DESCRIPTION

FIAL (lecture open to researchers and students): "Pedagogical applications of corpus data for English for General and Specific Purposes", Wednesday 4 December, 12:45 (room ERAS 56). UCL, Louvain-la-Neuve

TRANSCRIPT

Page 1: Pedagogical applications of corpus data for English for General and Specific Purposes

Pedagogical applications of corpus data for English for General and Specific Purposes

Pascual Pérez-Paredes, Universidad de Murcia, Campus Mare Nostrum

Université Catholique de Louvain, FIAL (lecture open to researchers and students): Wednesday 4 December, 12:45 (room ERAS 56).

Page 2: Pedagogical applications of corpus data for English for General and Specific Purposes

Pedagogical applications of corpus data for English for General and Specific Purposes

Université Catholique de Louvain, FIAL (lecture open to researchers and students): Wednesday 4 December, 12:45 (room ERAS 56).

perezparedes.blogspot.com

Page 3: Pedagogical applications of corpus data for English for General and Specific Purposes

Outline
1. Background: corpora
2. The SACODEYL-BACKBONE approach at a glance
3. Getting down to annotating
   1. Backbone Annotator: download and installation
   2. Texts and CMT
   3. Guided annotation
4. BACKBONE
5. Specific purposes: LADEX

Page 4: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 5: Pedagogical applications of corpus data for English for General and Specific Purposes

Corpus

A principled collection of texts representative of a given language or of a particular language domain.
-Language research purposes
-Applied purposes: teaching, learning, dictionary making, testing…

view.byu.edu/corpora.asp

sacodeyl.inf.um.es/sacodeyl-search/

webapps.ael.uni-tuebingen.de/backbone-search/

Page 6: Pedagogical applications of corpus data for English for General and Specific Purposes

Corpora in language education

ReCALL special issue: Researching uses of corpora for language teaching and learning. Boulton & Pérez-Paredes (2014).

-Indirect uses:
▫Thorndike and Lorge’s Teacher’s Word Book of 30,000 Words (1944)
▫West’s General Service List (1953)
▫Gougenheim (e.g. 1958) and colleagues’ work on the Français Fondamental
▫Cobuild work led by John Sinclair (1987)
▫Routledge Frequency Dictionaries
▫Coxhead’s Academic Word List (2000)
▫Martinez and Schmitt’s (2012) Phrasal Expressions List

Page 7: Pedagogical applications of corpus data for English for General and Specific Purposes

TaLC Lancaster 1994

(1) computers and storage at the time were improving dramatically;

(2) there was a new interest in authentic data and usage in language education; and

(3) there was a consensus that learners were adopting new, more active roles in their learning process.

Page 8: Pedagogical applications of corpus data for English for General and Specific Purposes

Imagine … today

Page 9: Pedagogical applications of corpus data for English for General and Specific Purposes

•Braun (2005, 2007): pedagogically motivated corpora

(a) provide a more systematic range of material than individual texts or scattered collections of activities and, if well-designed, (b) offer a wider range of idiolects than the average material.

•Braun (2006): thematic annotation, including topic keys and section titles, is particularly useful in the implementation of pedagogically motivated corpora.

Page 10: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 11: Pedagogical applications of corpus data for English for General and Specific Purposes

•Pérez-Paredes & Alcaraz (2009)

For the time being, the natural corpus playground continues to be tertiary education.

Our motivation: CL in the language classroom.

The resulting annotated corpus can be seen as being integrative of language data and annotated pedagogy.

Pedagogy can be annotated and, subsequently, accessed by corpus users.

Page 12: Pedagogical applications of corpus data for English for General and Specific Purposes

What is possible (Alcaraz and Pérez-Paredes 2008)

Linguistic analysis of interest in FLT

------> Linguistics comes first

------> DDL materials

Concordances and corpus

Researcher/Linguist

End user

Page 13: Pedagogical applications of corpus data for English for General and Specific Purposes

What is feasible (Alcaraz and Pérez-Paredes 2008)

•Pedagogical analysis (and annotation) of language corpora

------> Pedagogy comes first

------> Pedagogy-driven DDL

Material developer/Teacher/Learner

End user

Page 14: Pedagogical applications of corpus data for English for General and Specific Purposes

[Diagram labels: Corpus, Language Data, Annotation, Language, Metadata, Pedagogy]

Page 15: Pedagogical applications of corpus data for English for General and Specific Purposes

www.um.es/sacodeyl

Page 16: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 17: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 18: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 19: Pedagogical applications of corpus data for English for General and Specific Purposes

sacodeyl.inf.um.es/sacodeyl-search/

webapps.ael.uni-tuebingen.de/backbone-search/

Page 20: Pedagogical applications of corpus data for English for General and Specific Purposes

•A default annotation tree has been developed by the teachers and researchers in SACODEYL

Page 21: Pedagogical applications of corpus data for English for General and Specific Purposes

What categories does this default category tree contain?

•Topics
•Grammatical
•Lexical
•Style
•CEF Level
•…

Page 22: Pedagogical applications of corpus data for English for General and Specific Purposes

Annotator-friendly GUI

Page 23: Pedagogical applications of corpus data for English for General and Specific Purposes

Multilanguage

•Supports genuinely multilingual annotation

Page 25: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 26: Pedagogical applications of corpus data for English for General and Specific Purposes

What is XML TEI format?
▫TEI: Text Encoding Initiative
▫This is a format for storing corpora
▫Has been promoted by OTA (Oxford Text Archive)
▫Is a continuously growing format (more than 50 versions released so far, currently TEI P5)
▫Is rapidly spreading among the available tools
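As a rough illustration of what a TEI-encoded text looks like and how it can be read programmatically, the sketch below parses a minimal TEI P5 document with Python's standard library. The sample document, the `type="section"` attribute and the function name `list_sections` are assumptions made for illustration only; they are not taken from the files that Backbone Annotator produces.

```python
import xml.etree.ElementTree as ET

# TEI P5 elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A deliberately tiny, hypothetical TEI document (not a Backbone corpus file).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample interview</title></titleStmt>
      <publicationStmt><p>Unpublished example</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="section" n="1"><p>I live in Murcia.</p></div>
      <div type="section" n="2"><p>I study English at school.</p></div>
    </body>
  </text>
</TEI>"""


def list_sections(tei_xml: str) -> list[tuple[str, str]]:
    """Return (section number, text) pairs for every <div> directly in the body."""
    root = ET.fromstring(tei_xml)
    sections = []
    for div in root.findall(".//tei:body/tei:div", TEI_NS):
        # Collect all text inside the div and normalise whitespace.
        text = " ".join("".join(div.itertext()).split())
        sections.append((div.get("n", ""), text))
    return sections


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_TEI)
    # The header carries the metadata; the body carries the language data.
    print(root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS))
    for number, text in list_sections(SAMPLE_TEI):
        print(number, text)
```

The point of the sketch is only that a TEI file keeps metadata (the header) and text (the body) together in one XML document, so any XML-aware tool can read both.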

Page 27: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Research)

•TeiPublisher
“This tool is an XML-based repository that allows the publication of TEI corpora to the public community and offers a search tool.”

•Dexter
“This is another annotation tool that uses TEI as the format for the annotated files.”

Page 28: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Research)

•Oxygen XML Editor and XMLSpy
“These are XML editors that allow the modification of the TEI files without any limitation”

(These are complex for non-advanced users)

Page 29: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Research)

•TAPoR (http://portal.tapor.ca/)
“The Text Analysis Portal for Research (TAPoR) is a gateway to tools for sophisticated analysis and retrieval, along with representative texts for experimentation.”

Page 30: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Research)

•TokenX
http://www.unl.edu/libr/etext/tokenx.shtml
“Is a text visualization, analysis, and play tool”

•WordHoard
http://wordhoard.northwestern.edu/userman/index.html
“Is a tool for annotating or tagging texts by morphological, lexical, prosodic, and narratological criteria and for determining frequency information”

Page 31: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Research)

•XAIRA
XAIRA (XML Aware Information Retrieval Architecture) is an open source tool for constructing high-quality linguistically-motivated search interfaces to large collections of XML documents.

Page 32: Pedagogical applications of corpus data for English for General and Specific Purposes

•The XAIRA search

Page 33: Pedagogical applications of corpus data for English for General and Specific Purposes

TEI Tools (Classroom)

•A more interesting orientation.

How can I use the annotation in the classroom?

Backbone Search Tool
www.um.es/backbone

Page 35: Pedagogical applications of corpus data for English for General and Specific Purposes

Download BACKBONE Annotator + Install + CMT config

http://www.um.es/backbone/

Pérez-Paredes, P., and Alcaraz-Calero, J. M. (2009). Developing annotation solutions for online Data Driven Learning. ReCALL 21, 55..

Page 36: Pedagogical applications of corpus data for English for General and Specific Purposes

How do I create a corpus?

Page 37: Pedagogical applications of corpus data for English for General and Specific Purposes

How can I add a new document to the current corpus?

1. Add document …

2. Select the text format/encoding

3. Select the new document

Page 38: Pedagogical applications of corpus data for English for General and Specific Purposes

What does the text format mean?

•Four main text formats are supported:
▫Plain text (written), .txt
▫Oral text in Backbone Transcriptor format
▫Oral text in SACODEYL Transcriptor format
▫XML text in TEI standard format (text in special XML files)

Page 39: Pedagogical applications of corpus data for English for General and Specific Purposes

What does the text encoding mean?

This is the form in which the characters of the text are stored, which matters for multilingual texts.

(On Windows, ANSI by default.)
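Why the encoding choice matters can be shown with a short sketch. It assumes that "ANSI" means the Windows-1252 code page (`cp1252`), which is the usual case on Western European Windows systems but is an assumption here; the sample string and the function names are invented for illustration.

```python
# Assumption: "ANSI" is taken to mean Windows-1252 (cp1252).
ANSI = "cp1252"
UTF8 = "utf-8"

SAMPLE = "décembre, Pérez"


def encode_as(text: str, encoding: str) -> bytes:
    """Store the text as bytes using the given encoding."""
    return text.encode(encoding)


def read_with(data: bytes, encoding: str) -> str:
    """Interpret stored bytes using the given encoding."""
    return data.decode(encoding)


if __name__ == "__main__":
    ansi_bytes = encode_as(SAMPLE, ANSI)
    utf8_bytes = encode_as(SAMPLE, UTF8)

    # Same characters, different bytes on disk.
    print(len(ansi_bytes), len(utf8_bytes))

    # Reading UTF-8 bytes as ANSI silently garbles accented letters.
    print(read_with(utf8_bytes, ANSI))

    # Reading ANSI bytes as UTF-8 fails outright.
    try:
        read_with(ansi_bytes, UTF8)
    except UnicodeDecodeError as error:
        print("cannot decode:", error.reason)
```

Selecting the encoding that matches how the file was actually saved avoids both outcomes when a document is added to the corpus.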

Page 40: Pedagogical applications of corpus data for English for General and Specific Purposes

Selecting the text to annotate

• Select a document and annotate it

1. Open document…

2. Select the document

Page 41: Pedagogical applications of corpus data for English for General and Specific Purposes

Information shown in the working document

•Section number
•Categories applied to this section (annotations)
•Speaker (only in oral text)
•Transcription

Page 42: Pedagogical applications of corpus data for English for General and Specific Purposes

What is a section?
•A stretch of text that is “whateverly” motivated.
•A fragment that could be useful in whatever context.
•A section can be established in any kind of text (oral and written) by inserting the special character (#), which divides texts into sections.
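The idea of marking section boundaries with `#` can be sketched in a few lines. This is only an illustration of the principle under stated assumptions: the function name `split_into_sections` and the sample text are invented, and how Backbone Annotator itself handles the marker internally is not described in these slides.

```python
SECTION_MARK = "#"  # the special character described above


def split_into_sections(text: str, mark: str = SECTION_MARK) -> list[str]:
    """Split a text into sections at every marker, dropping empty pieces."""
    pieces = (piece.strip() for piece in text.split(mark))
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    raw = "Hello, my name is Ana. # I live in Murcia. # I like football."
    for number, section in enumerate(split_into_sections(raw), start=1):
        print(number, section)
```

Each resulting section is the unit to which categories are later applied.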

Page 43: Pedagogical applications of corpus data for English for General and Specific Purposes

Intuitive Annotation Process
•Drag and drop to annotate a section

Page 44: Pedagogical applications of corpus data for English for General and Specific Purposes

What is a Keyword?

•“… [a] keyword is a stretch of language (a word, more than one word or a whole paragraph) that the annotator associates to a category…”

Pérez-Paredes and Alcaraz, ReCALL, 2009 Vol 21. (1)

Page 45: Pedagogical applications of corpus data for English for General and Specific Purposes

What are Keywords?
•BACKBONE Annotator supports the annotation of keywords
•Just select text and apply a category by right-clicking
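One way to picture a keyword annotation is as a span of characters inside a section that is linked to a category. The sketch below is a hypothetical data model, not the tool's actual storage format; the class `Keyword`, the function `annotate_keyword` and the category label are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    """A stretch of text inside a section, linked to one category."""
    start: int      # index of the first character
    end: int        # index just past the last character
    category: str


def annotate_keyword(section: str, phrase: str, category: str) -> Keyword:
    """Link the first occurrence of `phrase` in `section` to `category`."""
    start = section.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} does not occur in the section")
    return Keyword(start, start + len(phrase), category)


if __name__ == "__main__":
    section = "I have lived in Murcia for ten years."
    keyword = annotate_keyword(section, "have lived", "Grammatical/Present perfect")
    print(section[keyword.start:keyword.end], "->", keyword.category)
```

Storing offsets rather than a copy of the text means a keyword can be a single word, several words or a whole paragraph, as the definition above allows.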

Page 46: Pedagogical applications of corpus data for English for General and Specific Purposes

Selective View

•Offers a selective view of the information in order to facilitate the organization.

Page 47: Pedagogical applications of corpus data for English for General and Specific Purposes

Section title
•Drag and drop the special “Title” category onto the desired section.
•The title is shown in a tooltip when the cursor is placed on the section. (No tooltip = no title)

Page 48: Pedagogical applications of corpus data for English for General and Specific Purposes

Extensible annotation
•Supports customization of the annotation
•Users can add their own annotation taxonomy or remove any annotation category

Page 49: Pedagogical applications of corpus data for English for General and Specific Purposes

How can I add a new category?

▫Select the parent category (e.g. Topics)
▫Press the Add Cat. button
▫Fill in

Page 50: Pedagogical applications of corpus data for English for General and Specific Purposes

Page 51: Pedagogical applications of corpus data for English for General and Specific Purposes

How can I remove a category?

Select the category to remove (e.g. Topic)

Be careful …
All the associated children will also be removed
All the annotations with those tags will also be removed

Press the Delete Cat. button.

Page 52: Pedagogical applications of corpus data for English for General and Specific Purposes

How can I reorder the categories?

Select the category to reorder (e.g. Topic)

Press Up Cat. or Down Cat. to move it.
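The category operations described on these slides (adding under a parent, removing a category together with its children and the annotations that use them, and moving a category up) can be modelled as a small tree. This is a minimal sketch under stated assumptions: the `Category` class, its method names and the `annotations` mapping (section number to set of category names) are hypothetical and do not reflect the tool's internals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    children: list[Category] = field(default_factory=list)

    def add(self, name: str) -> Category:
        """Add a new child category under this one."""
        child = Category(name)
        self.children.append(child)
        return child

    def names(self) -> set[str]:
        """This category's name plus the names of all its descendants."""
        found = {self.name}
        for child in self.children:
            found |= child.names()
        return found

    def remove(self, name: str, annotations: dict[int, set[str]]) -> None:
        """Delete a direct child, its descendants and every annotation using them."""
        matches = [c for c in self.children if c.name == name]
        if not matches:
            raise ValueError(f"no child category named {name!r}")
        target = matches[0]
        removed_names = target.names()
        self.children = [c for c in self.children if c is not target]
        for tags in annotations.values():
            tags -= removed_names  # annotations with removed tags disappear too

    def move_up(self, name: str) -> None:
        """Swap a direct child with the sibling just before it."""
        index = [c.name for c in self.children].index(name)
        if index > 0:
            siblings = self.children
            siblings[index - 1], siblings[index] = siblings[index], siblings[index - 1]


if __name__ == "__main__":
    root = Category("root")
    topics = root.add("Topics")
    topics.add("Family")
    root.add("Grammatical")
    root.add("Lexical")

    annotations = {1: {"Family", "Grammatical"}, 2: {"Topics"}}
    root.remove("Topics", annotations)
    root.move_up("Lexical")
    print([c.name for c in root.children], annotations)
```

The sketch makes the warning above concrete: removing one category can silently strip annotations from many sections, so it is worth checking what depends on a category before deleting it.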

Page 53: Pedagogical applications of corpus data for English for General and Specific Purposes

How can I customize a category?

Select the category to customize (e.g. Topic)

Double-click it

Page 54: Pedagogical applications of corpus data for English for General and Specific Purposes

Can I manage metadata?

Page 55: Pedagogical applications of corpus data for English for General and Specific Purposes

What if I find mistakes?
•Supports editing of the inserted texts.
•Uses the XML TEI standard for encoding corpora.

Page 56: Pedagogical applications of corpus data for English for General and Specific Purposes

Integration

•Backbone Annotator is integrated with
▫Backbone Transcriptor
▫Backbone CMT
▫Backbone Search
▫SACODEYL VRP

Page 57: Pedagogical applications of corpus data for English for General and Specific Purposes

Resource Management
•Offers enrichment of text with external resources, e.g. HTML links, videos, audio, etc.

Page 58: Pedagogical applications of corpus data for English for General and Specific Purposes

Where is the information stored?
•Remember: all the information is stored in one file, the corpus file that you have created.

[Diagram labels: Corpus, Language Data, Annotation, Language, Metadata, Pedagogy]

Page 59: Pedagogical applications of corpus data for English for General and Specific Purposes

Make your corpus collaborative

Page 60: Pedagogical applications of corpus data for English for General and Specific Purposes

Make your corpus collaborative

Page 61: Pedagogical applications of corpus data for English for General and Specific Purposes

Make your corpus collaborative

Page 63: Pedagogical applications of corpus data for English for General and Specific Purposes

Backbone

•Pedagogic Corpora for Content and Language Integrated Learning. Insights from the BACKBONE Project. The EUROCALL Review, 20, 2, September 2012

•Kurt Kohn, Applied English Linguistics, University of Tübingen (Germany)

http://eurocall.webs.upv.es/index.php?m=menu_00&n=news_20_2

Page 64: Pedagogical applications of corpus data for English for General and Specific Purposes

webapps.ael.uni-tuebingen.de/backbone-search/

Page 66: Pedagogical applications of corpus data for English for General and Specific Purposes

Specific uses: Legal-administrative language and immigration

This project aims to fill the existing gap in linguistic studies that combine legal language characterisation with the cultural and social implications of immigration, from a multilingual angle (English, Italian, French and Spanish).

The project will contribute to the definition of the immigrant in each society, encouraging the debate on solidarity from a linguistic perspective.

Our starting point is the compilation, tagging and annotation of a multilingual corpus comprising a collection of representative documents used in immigration (EU and non-EU citizens), issued by the different public administrations and institutions in Spain, the UK, France and Italy, ranging from 2007 to 2011.

Page 67: Pedagogical applications of corpus data for English for General and Specific Purposes

• 1. Compilation and organisation of legal-administrative binding documents for immigrants in all the countries involved.

• 2. Contrastive analysis of all those terminological, phraseological and discursive aspects which can help us shape the cultural identity of administrators and immigrants.

• 3. Multilingual study of the legal-administrative language analysed in the research corpus textual typology.

• 4. Contrastive characterisation of the foreign user and cultural implications.

Page 68: Pedagogical applications of corpus data for English for General and Specific Purposes

•LADEX Annotator (Multilingual automatic tagging) + Manual collaborative annotation

•http://www.um.es/languagecorpora

Page 69: Pedagogical applications of corpus data for English for General and Specific Purposes

Annotation Aim
•Why are you annotating?
•What is the purpose of your annotation?
•What use will you make of your annotation?

Page 70: Pedagogical applications of corpus data for English for General and Specific Purposes

Discussion and debate

•Pedagogical annotation vs. morphological tagging paradigm
•Learner-centered vs. researcher-oriented
•Indirect applications of language corpora vs. direct applications
•Constraints of traditional CL in the language classroom

Page 71: Pedagogical applications of corpus data for English for General and Specific Purposes

Discussion and debate

•Cognitive demands of traditional CL in the language classroom: learner as a researcher and as a traveller

• Is CL an extra hassle in language classrooms? (Mauranen 2004)

•Customization of language corpus/collection of texts

•Mediation role of corpus-based resources in the FLT classroom

•Authenticity issues (Widdowson)

Page 72: Pedagogical applications of corpus data for English for General and Specific Purposes

References and further reading

• Braun, S. 2005. “From pedagogically relevant corpora to authentic language learning contents”. ReCALL 17/1: 47-64.

• Braun, S. 2006. “ELISA - a pedagogically enriched corpus for language learning purposes”. In Corpus Technology and Language Pedagogy: New Resources, New Tools, New Methods, Frankfurt M: Peter Lang. (eds) 25-47.

• Braun, S. 2007. “Integrating corpus work into secondary education: from data-driven learning to needs-driven corpora”. ReCALL 19/3: 307-328.

• Mauranen, A. 2004. “Spoken - general: Spoken corpus for an ordinary learner”. In How to Use Corpora in Language Teaching, Sinclair, J. McH. (ed.), 89–105.

• Pérez-Paredes, P. and Alcaraz, J.M. 2009. “Developing annotation solutions for online data-driven learning”. ReCALL 21/1.

• Römer, U. 2008. “Corpora and Language Teaching”. In Corpus Linguistics. An International Handbook, Lüdeling, A. & Kytö, M. (eds.). Berlin: Mouton de Gruyter.

• Widdowson, H.G. 2003. Defining Issues in English Language Teaching. Oxford: Oxford University Press.
