LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions. Ciara R. Wigham & Thierry Chanier, Clermont Université. LRL: http://lrl.univ-bpclermont.fr/ Publications: http://hal.archives-ouvertes.fr/LRL PPT: http://edutice.archives-ouvertes.fr/edutice-00778274 4th WorldCALL Conference, 10-13 July 2013, Glasgow


Page 1: LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions Ciara R. Wigham & Thierry Chanier Clermont Université

LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions

Ciara R. Wigham & Thierry Chanier, Clermont Université

LRL: http://lrl.univ-bpclermont.fr/

Publications: http://hal.archives-ouvertes.fr/LRL

PPT: http://edutice.archives-ouvertes.fr/edutice-00778274

4th WorldCALL Conference, 10-13 July 2013, Glasgow

Page 2

[Map of corpora: name (year); partner countries; languages]

Simuligne (2001); UK-FR; fre
Copéas (2005); UK-FR; eng
Tridem (2005-06); UK-FR-USA; eng, fre
Ecofralin (2008); CO-FR; fre, spa
VMT-teamC (2006); UK-USA-SG; math
INFRAL (2009); DE-FR; deu, fra
FAVI (2006-08); FR; fra
ARCHI21 (2011); FR; eng, fra
SLIC (2013); USA-FR; fra

Page 3

Data validity & reliability in CALL research?

• Problem in Social Sciences and CALL:
▫ visibility, accessibility and validity of research data
▫ data representative / anecdotal?
▫ no access to data when reading a publication
▫ links between data and publications

Page 4

CALL data from online learning situations

• CALL data is often:
▫ not contextualised – pedagogical & technological situations (Kern et al., 2004)
▫ tangled in specific software using proprietary formats
• Replication for interaction analysis in online learning near impossible:
▫ variables that are difficult to control
▫ replication does not imply that phenomenon previously observed will reoccur (Reffay et al., 2012)

Page 5

Open space for sharing research data concerning online multimodal interactions

Mulce project 2007-2010 & LETEC

Multimodal Corpora Exchange

Page 6

Research data quality: Mulce project

• Interoperability:
▫ Structured and coherent data sets => analyses can be completed by researchers who did not participate in the course
• Sustainability:
▫ Independent from online platforms
▫ Stored in independent formalisms
• Open access to research data & appropriate licences
• Accessibility:
▫ Finding the research data through standard metadata – OLAC (Open Language Archives Community)

Page 7

Learner Corpora / LETEC

• Learner Corpora (see Granger, 2002; Meunier et al., 2011)
▫ SLA research
▫ learners' productions
▫ test situations (Reffay et al., 2008)
▫ learner-native speaker comparative studies (Boulton et al., 2012)
• LEarning and TEaching Corpora
▫ all participants considered (learners, tutors, etc.)
▫ interaction data
▫ context

Page 8

LETEC Components

[Diagram labels: Instantiation; Pedagogical scenario; Research protocol; Public licence; Private licence; Analyses; Context; ethics & rights]

"A LETEC corpus collects in a systematic and structured way all the data from interactions which occur during a course which is partially or entirely online. These data are enriched by technical, pedagogical and scientific information as well as information about the participants and are organized to allow contextualized analyses to be performed." (Mulce-documentation, 2013)

Page 9

Methodology for building a LETEC

Page 10

Staged process

[Diagram of the stages; surviving label: Data analyses]

Page 11

Illustration of methodology

• European project KA2 Languages
• CLIL approach (Content and Language Integrated Learning)
▫ Architecture + French / English L2
• Hybrid course "Building Fragile Spaces": 5-day studio Feb. 2011
• 17 students, 2 architecture tutors, 1 EFL tutor, 1 FFL tutor

Working with external partners: exchanges

Page 12

Stage 1: Design

Page 13

Elaboration of research areas

• Interplay between verbal and nonverbal modes
• Role of nonverbal in identity construction
• Interplay between textchat & voicechat modalities

Support for L2 verbal participation and production

Wigham (2012) – PhD Thesis http://tel.archives-ouvertes.fr/tel-00762382

Stage 1: Design

Page 14

Pedagogical Design

• Macro-task: collaboratively elaborate a model in a synthetic world (Second Life) as a response to an architectural problem brief
• Architectural studio, hybrid CLIL approach
• 4 workgroups

Stage 1: Design

[Diagram labels: Learning design; Online environments; Participants' roles; Learning & support activities]

Page 15

Learning & support activities

Activity | Architecture objectives | L2 objectives
Introduction to Second Life | Introduce students to multimodal nature of SL | Establish a communication protocol
Collaborative building activity | Introduce students to building techniques to help them develop their model | Develop L2 communication techniques concerning the referencing of objects
Group reflective session | Develop critical thinking by negotiation; distinguish pertinent information for overall problem identification in their design brief | Help students to skill-up their L2; acquire domain-specific vocabulary; develop a professional discourse

Stage 1: Design

Detailed in: Rodrigues et al., in press; Wigham & Chanier, 2013

Page 16

Research protocol

• Research protocol design
▫ Protocol for data collection
▫ Researchers' roles
▫ Timetable of research activities

Stage 1: Design

researcher

Wigham & Chanier, 2013 ReCALL

Page 17

Stage 2: Data Collection

Page 18

Data collection & coverage

Phase | Data collected | Environment | Data type | Quantity & coverage of data
pre-course | Pre-questionnaires | Kwiksurveys | Spreadsheet file | 17 student questionnaires
during course | Session data | Second Life | Video screen captures | 20 group sessions & 2 presentation sessions, 19h40m
during course | Session data | VoiceForum | Audio recordings | 64 forum messages
post-course | Post-questionnaires | Kwiksurveys | Spreadsheet file | 16 student questionnaires
post-course | Semi-directive interviews | Skype | Audio recordings | 5 student interviews, 2h30

Stage 2: Data collection

Page 19

Stage 3: Data Organisation, diffusion

Page 20

LETEC global corpus: content packaging

Primary data (anonymised)
▫ Each resource has an ID and a description

Manifest: structured data
▫ Structured Interaction Data Model (Mce_sid, 2011)
▫ XML information about each component of the corpus

General metadata (OLAC standards)

Environments used

▫ Information on participants: language biographies and group organisation
▫ Description of the environment, course length, participants, tools
▫ Activities described in the pedagogical scenario

Stage 3: Data organisation
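
The idea of a manifest that gives every resource an ID and a description, alongside participant information, can be pictured with a minimal Python sketch. The element and attribute names below (`manifest`, `resources`, `resource`, `participants`, and so on) are illustrative assumptions only; they are not taken from the Mce_sid schema, and the sample identifiers and file names are invented.

```python
import xml.etree.ElementTree as ET


def build_manifest(corpus_id, resources, participants):
    """Build a toy manifest tree (hypothetical element names, not Mce_sid)."""
    root = ET.Element("manifest", {"identifier": corpus_id})

    # Each resource carries an ID, a file reference and a description.
    res_el = ET.SubElement(root, "resources")
    for res in resources:
        item = ET.SubElement(res_el, "resource", {"id": res["id"], "href": res["href"]})
        ET.SubElement(item, "description").text = res["description"]

    # Participants are listed with their role and workgroup.
    part_el = ET.SubElement(root, "participants")
    for person in participants:
        ET.SubElement(part_el, "participant", {
            "id": person["id"],
            "role": person["role"],
            "group": person["group"],
        })
    return root


# Invented sample data for illustration.
manifest = build_manifest(
    "example-corpus",
    resources=[{"id": "res-001", "href": "video/session1.mp4",
                "description": "Screen capture of a group session"}],
    participants=[{"id": "p01", "role": "tutor", "group": "g1"},
                  {"id": "p02", "role": "learner", "group": "g1"}],
)
xml_text = ET.tostring(manifest, encoding="unicode")
print(xml_text)
```

Keeping the description next to the file reference is what would let a researcher who did not take part in the course locate and interpret each resource.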

Page 21

Corpus deposit

• Mulce corpus repository (Mulce-repository, 2013)

Stage 3: Data organisation

Page 22

Corpus diffusion

• Description of corpus; interface to browse structure; zip file to download

Stage 3: Data organisation

Page 23

Stage 4: Transcription, analyses, publications

Page 24

Multimodal data transcription

[Diagram labels: verbal mode; non verbal mode; audio; textchat; proxemic transmission; radio transmission; public; private]

Not detailed here; see Wigham & Chanier (2013), ReCALL 25(1)

Stage 4: Data transcription & diffusion

Page 25

Elaboration of transcription methodology

• Characterized by communication modes & modalities
▫ Systematic approach to studying online environments
• New environments = new modalities
▫ Added to transcription methodology

Communication mode | Communication modality | Act type and transcription code | Explanation
verbal | audio | audio act (tpa) | verbal turn in the public audio channel
verbal | audio | silence (sil) | interval between two audio acts greater than three seconds
verbal | textchat | textchat act (tpc) | message entered in the textchat window
nonverbal | proxemics | movement (mvt) | avatar movement in the environment, e.g. avatar sits down, flies, walks backwards
nonverbal | proxemics | entrance into / exit from environment (es) | avatar enters or exits the synthetic world
nonverbal | kinesics | kinesic (kin) | avatar gestures and movements made by an avatar's body part, e.g. nod, point, clap
nonverbal | production | production (prod) | production or display of an object in the SL environment

Stage 4: Data transcription & diffusion
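
The silence rule in the transcription table (an interval between two audio acts greater than three seconds) lends itself to a small worked example. The Python sketch below is an assumption about how such a rule could be applied automatically; the `Act` class, its field names and the sample timings are hypothetical and are not part of the authors' transcription workflow.

```python
from dataclasses import dataclass

SILENCE_THRESHOLD_S = 3.0  # from the table: gap "greater than three seconds"


@dataclass
class Act:
    act_type: str       # transcription code, e.g. "tpa" or "sil"
    start: float        # seconds from the start of the session
    end: float
    speaker: str = ""   # left empty for silences


def insert_silences(audio_acts):
    """Return the audio acts in time order with a 'sil' act in each long gap."""
    timeline = []
    latest_end = None
    for act in sorted(audio_acts, key=lambda a: a.start):
        # A gap is measured from the latest end seen so far, so overlapping
        # turns do not create spurious silences.
        if latest_end is not None and act.start - latest_end > SILENCE_THRESHOLD_S:
            timeline.append(Act("sil", latest_end, act.start))
        timeline.append(act)
        latest_end = act.end if latest_end is None else max(latest_end, act.end)
    return timeline


# Invented sample turns: a 0.5 s gap (no silence) and a 5.5 s gap (silence).
acts = [
    Act("tpa", 0.0, 2.5, "tutor"),
    Act("tpa", 3.0, 4.0, "student1"),
    Act("tpa", 9.5, 11.0, "tutor"),
]
timeline = insert_silences(acts)
for act in timeline:
    print(act.act_type, act.start, act.end, act.speaker)
```

Only the second gap exceeds the threshold, so a single `sil` act is placed between the second and third audio acts.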

Page 26

Multimodal transcription using ELAN

[Screenshot labels: video screen capture; multimodal transcription aligned using timeline; participants & modality; view of annotations for one participant in one modality]

Max Planck Institute for Psycholinguistics (2001). ELAN [software]. The Netherlands: Max Planck Institute for Psycholinguistics. [http://www.lat-mpi.eu/tools/elan/]

Stage 4: Data Analyses
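
ELAN stores its transcriptions as XML (.eaf files), in which each tier holds annotations that point to shared time slots. As a hedged illustration of how time-aligned annotations could be read back for analysis, the sketch below parses a tiny inline sample with Python's standard library. The tier names (`tutor_tpa`, `student1_tpc`) and the annotation texts are invented, and the sample assumes every time slot carries a `TIME_VALUE`, which real files do not guarantee.

```python
import xml.etree.ElementTree as ET

# Minimal, invented sample in the shape of an ELAN .eaf document.
EAF_SAMPLE = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="4000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="5200"/>
  </TIME_ORDER>
  <TIER TIER_ID="tutor_tpa">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello everyone</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="student1_tpc">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>hi</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_annotations(eaf_text):
    """Return (start_ms, end_ms, tier_id, text) tuples sorted by start time."""
    root = ET.fromstring(eaf_text.strip())
    # Map each time slot ID to its value in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((
                slots[ann.get("TIME_SLOT_REF1")],
                slots[ann.get("TIME_SLOT_REF2")],
                tier.get("TIER_ID"),
                ann.findtext("ANNOTATION_VALUE", default=""),
            ))
    return sorted(rows)


rows = read_annotations(EAF_SAMPLE)
for row in rows:
    print(row)
```

Flattening the tiers into one time-ordered list is one way to view acts from several participants and modalities side by side.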

Page 27

Production & deposit of LETEC distinguished corpus

• Particular analysis of a selected part of the global LETEC corpus
▫ Chanier, T., Saddour, I. & Wigham, C.R. (2012). (dir.) Distinguished Corpus: Transcription of Verbal and Nonverbal Interactions of the Second Life Reflection archi21-slrefl-av-j2. Mulce.org : Clermont Université. [oai : mulce.org:mce-archi21-slrefl-av-j2 ; http://repository.mulce.org]
• Only contains transformed data (= the transcriptions)
• Refers to a selection of the original data in global corpus (= videos)
• Software used for transcription cited (= ELAN)

Stage 4: Data transcription & diffusion

Page 28

Why does structuring a corpus help analysis?

• Common technical structures to hold interaction data
▫ Data linked
▫ Analyses at different levels, in context whilst maintaining a global view of the course
• XML structure allows standard forms of annotation / coding & different analysis software to be used
▫ Tatiana (2008)
▫ Calico (2009)

Stage 4: Research Analyses

Page 29

An analysis example

• Interplay between textchat & voicechat
• Textchat modality acts as an adjunct to the audio modality
▫ e.g. when technical problems exist, and in opening & closing sequences of sessions (Liddicoat, 2011; Palomeque, 2011)
• Monomodal textchat environments – auto-correction, negotiation of meaning and corrective feedback
• Learner overload (Deutschmann & Panichi, 2009)

Multimodal environments? (Hampel & Stickler, 2012)

Can the textchat serve for L2 feedback provision?

Stage 4: Research Analyses

Wigham & Chanier (in print) CALL Journal

Page 30

An example of modality interplay

Page 31

Characterisation of textchat functions

Wigham & Chanier (in print) CALL Journal

Stage 4: Research Analyses

Page 32

Characterisation of textchat functions

• Data coding facilitated by XML schemas

Stage 4: Research Analyses

Page 33

Feedback in textchat

• 17% of acts contain feedback (49 acts)
• Primarily concerns lexical and grammatical non-target-like forms (cf. Tudini, 2003)
• Predominant use of recasts (32/49 instances)

EFL Session | Technical | Socialisation | Conversation management | Task | Form
Es-j3 | 3 | 7 | 9 | 41 | 17
Sc-j2 | 26 | 5 | 7 | 76 | 16
Sc-j3 | 2 | 9 | 4 | 36 | 16

Stage 4: Research Analyses
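
The figures on this slide can be cross-checked with a few lines of arithmetic. The Python sketch below sums the table per function category and computes the share of recasts. It rests on two assumptions that the slide does not state: that the five columns are mutually exclusive categories of textchat acts, and that the "Form" column corresponds to the acts containing feedback.

```python
# Counts copied from the table above (EFL sessions by textchat function).
counts = {
    "Es-j3": {"Technical": 3, "Socialisation": 7, "Conversation management": 9, "Task": 41, "Form": 17},
    "Sc-j2": {"Technical": 26, "Socialisation": 5, "Conversation management": 7, "Task": 76, "Form": 16},
    "Sc-j3": {"Technical": 2, "Socialisation": 9, "Conversation management": 4, "Task": 36, "Form": 16},
}

# Sum each category across the three sessions.
column_totals = {}
for session_counts in counts.values():
    for category, n in session_counts.items():
        column_totals[category] = column_totals.get(category, 0) + n

total_acts = sum(column_totals.values())

# Assumption: "Form" acts are the feedback acts mentioned in the bullets.
form_share = column_totals["Form"] / total_acts

# Recasts among the feedback acts, as given on the slide (32/49).
recast_share = 32 / 49

print(column_totals)
print(f"total coded acts: {total_acts}")
print(f"share of Form acts: {form_share:.1%}")
print(f"share of recasts in feedback: {recast_share:.1%}")
```

Under these assumptions the "Form" column sums to 49, matching the number of feedback acts quoted in the first bullet, and recasts account for roughly two thirds of the feedback.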

Page 34

Results of textchat feedback study

• EFL tutor's strategic choice to use textchat: reduces cognitive load
▫ Non-expertise in content matter
• Language form vs communicative meaning
▫ Recasts, as they remain in the textchat window
▫ Recasts so as not to interrupt content communication
• Students' management of multiple modalities

Stage 4: Research Analyses

Page 35

Publication of analyses & deposit of associated distinguished corpus

• Production of distinguished corpus:
▫ Wigham, C.R. (2013). (dir.) Distinguished Corpus: Interplay between textchat and audio modalities during the Second Life Reflective Sessions. Mulce.org : Clermont Université. [oai : mulce.org:mce-archi21-modality-textchat ; http://repository.mulce.org]
• Analysed data presented in parallel with results
▫ Wigham, C.R. & Chanier, T. (in print). Interactions between text chat and audio modalities for L2 communication and feedback in the synthetic world Second Life. CALL Journal
• Distinguished corpora can be cited in articles
• Explicit connections between data and publications enhance the quality of CALL research

Stage 4: Publication

Page 36

Conclusion: Sustaining CALL research

• Reuse of data for cumulative or contrastive analyses
▫ Rodrigues & Wigham (in print) – text chat & problematic vocabulary points
▫ Natural language processing techniques
• Facilitated by:
▫ structured XML formalisms render online interaction data autonomous from any platform, in tool-agnostic form
▫ interactions described by modes & modalities -> not specific to an online environment
• Reuse of LETEC in corpus linguistics (TEI-CMC)

Conclusion

Page 37

Perspectives

• Documented and selected materials in their original context – basis for reflection in pedagogical corpora
• Integration of pedagogical corpora into teacher-training classrooms

Conclusion

Page 38

Contact: [email protected]

[email protected]

Website: http://lrl.univ-bpclermont.fr/

Mulce-documentation: http://mulce.org

Mulce-repository: http://repository.mulce.org

Thank you!

Page 39

Corpus metadata

• Inform researchers about:
▫ conditions under which the corpus was built
▫ how to use the corpus
▫ the corpus' content
▫ licences for re-using the corpus
• Used for web harvesting
▫ corpus becomes visible to whole community (OLAC, Clarin)
▫ corpus can be cited

Stage 3: Data organisation
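
OLAC harvesting is built on the OAI-PMH protocol, in which a harvester requests a record by its identifier. As a hedged illustration, the Python sketch below only builds such a request URL; it does not contact any server. The endpoint `http://repository.example.org/oai` is a placeholder, since the slides give the repository address but not its OAI-PMH path, and writing the identifier without spaces as well as using `olac` as the metadata prefix are assumptions.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real OAI-PMH path of the repository is not given.
BASE_URL = "http://repository.example.org/oai"


def get_record_url(base_url, identifier, metadata_prefix="olac"):
    """Build an OAI-PMH GetRecord request URL for one corpus record."""
    query = urlencode({
        "verb": "GetRecord",              # standard OAI-PMH verb
        "identifier": identifier,         # OAI identifier of the corpus
        "metadataPrefix": metadata_prefix,  # assumed prefix for OLAC records
    })
    return f"{base_url}?{query}"


# Identifier as cited for the distinguished corpus, written without spaces.
url = get_record_url(BASE_URL, "oai:mulce.org:mce-archi21-slrefl-av-j2")
print(url)
```

A stable identifier of this kind is what allows the corpus to be cited and located independently of any particular web page.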

Page 40

Characterisation of textchat functions

Analyses

• Data coding facilitated by XML schemas

Wigham & Chanier (in print) CALL Journal

Page 41

Data coverage

• 6 sessions (3 FFL, 3 EFL)
• 4h30m of screen recordings

Analyses

Groups analysed | Audio acts | Textchat acts
EFL | 450 | 423
FFL | 386 | 64

[Bar chart: number of tpa acts and number of tpc acts per session (GS-j2, GS-j3, GE-j3, GL-j3, GA-j2, GA-j3); vertical axis from 0 to 300]

Page 42

Perspectives

• Documented and selected materials in their original context – basis for reflection
• Inter-disciplinary project