GLOCALL 2015
Globalization and Localization in Computer-Assisted Language Learning
The future of Vocabprofiling
Tom Cobb
Université du Québec à Montréal
www.LEXTUTOR.ca
GLOCALL 2015
Globalization and Localization in Computer-Assisted Language Learning
The future & meaning of Vocabprofiling
Tom Cobb
Université du Québec à Montréal
www.LEXTUTOR.ca
3
Backgrounder
• Vocabprofiling is an activity performed on my website Lextutor
• www.lextutor.ca
• Which is probably the thing I am best known for + the reason you have asked me here
• So the framing question of my presentation is this:
4
Is Lextutor a bunch of software, or a coherent theory of language acquisition?
5
A clue.
6
Take 2 on a framing issue with a bit of “localization”
• Korean learners are interested in vocabulary lists?
• Don’t try to stop this interest
• Instead give them the right lists and a good way to use them
7
First some scene-setting stats
So A LOT of people use this
8
Korea always in daily Top 10 user countries
9
And elsewhere in GloCall land...
10
11
At a rate of…
12
What do these folks look at?
Basically every routine gets significant use
13
14
What do these folks look at?
But a few routines get the lion’s share
VOCABPROFILE group
VOCAB TESTS
CONCORDANCE group
17
So what is Vocabprofiling?
(A.k.a. VP-ing)
=> a procedure for matching learner to text through word frequency
18
What is word frequency?
What is concordancing?
19
20
Word frequency in a general corpus
= Best predictor of word knowledge
21
Which eventually gives us…
Etc.
22
…From which we can determine
The frequency of the words the learners know
vs.
the frequency of the words in a given text
23
1. The frequency of the words the learners know
24
2. The frequency of the words in a given text
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
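A text profile of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not Lextutor's code: the band lists below are tiny and their membership is invented for the example, whereas real profiling relies on full 1,000-word-family lists derived from a corpus. The names `BANDS`, `tokenize` and `profile` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini band lists. Band membership here is invented for
# illustration and is NOT taken from the BNC or BNC-COCA lists.
BANDS = {
    "K1": {"there", "are", "time", "on", "the", "of", "without", "a",
           "person", "may", "not", "be", "for", "more", "than", "hours"},
    "K2": {"strict", "limits", "charge"},
    "K3": {"arrested", "detained"},
}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def profile(text, bands=BANDS):
    """Return the percentage of tokens in each band, plus 'off-list'."""
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter()
    for token in tokens:
        # The first band that contains the token wins; otherwise off-list.
        label = next((name for name, words in bands.items() if token in words),
                     "off-list")
        counts[label] += 1
    return {label: 100 * n / len(tokens) for label, n in counts.items()}

sample = "There are strict time limits on the detention of persons without charge."
for band, share in sorted(profile(sample).items()):
    print(f"{band}: {share:.1f}%")
```

With real band lists in place of the toy ones, the same loop gives the K1, K2, K3… percentages that the learner profile is later matched against.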
25
So when we know the profile of the learner,
+ the profile of the text,
+ that 95% of words must be known for minimal comprehension
…we can make some fairly reliable and very useful predictions
26
Using this simple multiplication formula
Band    Learner profile   x   Text profile   =   Predicted % of words comprehended
K1      70%               x   80%            =   56%
K2      50%               x   10%            =    5%
K3      40%               x    5%            =    2%
K4      20%               x    5%            =    1%
                              ------             ------
Total                         100%               64%
With <95% of words known, even just to level of passive meaning recognition, comprehension will be minimal
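The multiplication on this slide can be written out as a short Python sketch. It assumes both profiles are given as proportions per frequency band; the function name `predicted_coverage` is made up for the example and the figures are the ones from the slide.

```python
def predicted_coverage(learner, text):
    """Sum over bands: (share of the band the learner knows) x (share of the text in that band)."""
    return sum(learner.get(band, 0.0) * share for band, share in text.items())

# Figures from the slide, expressed as proportions.
learner = {"K1": 0.70, "K2": 0.50, "K3": 0.40, "K4": 0.20}
text = {"K1": 0.80, "K2": 0.10, "K3": 0.05, "K4": 0.05}

coverage = predicted_coverage(learner, text)
print(f"{coverage:.0%}")  # 64%
```

A band missing from the learner profile is treated as unknown (0), which is a cautious assumption rather than anything stated on the slide.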
27
Have you ever seen something like this in your learners’ reading materials?
28
Case 1: Can this learner read this text?
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
29
Yes
The learner knows all the words sampled at 1k-2k-3k
The text is 95% 1k-3k words, so he probably knows all those
The learner knows > ½ of the words at 4k-5k
The text is 5% 4k-5k words
The learner will know > half of these = 3%
The learner knows 95+3=98% of words in the text
30
Case 2: Can this learner read this text?
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
31
This learner will know most of the k1-k2 words (90% of the text), but half or less of the words at k3 and up
But 10% of words in this text are k3 and up!
This learner will know about 93% of the words
32
Here is this text with 93% of words known
lextutor.ca/cloze/vp/
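One way to get a feel for partial coverage is to blank out the words a learner is assumed not to know. The sketch below is an assumption about how such a display could be produced, not a description of the routine at the address above; `mask_unknown`, `coverage` and the small known-word set are hypothetical.

```python
import re

def mask_unknown(text, known_words, blank="____"):
    """Replace every word not in known_words with a blank; punctuation is kept."""
    def replace(match):
        word = match.group(0)
        return word if word.lower() in known_words else blank
    return re.sub(r"[A-Za-z]+", replace, text)

def coverage(text, known_words):
    """Proportion of word tokens in the text that are in known_words."""
    tokens = re.findall(r"[A-Za-z]+", text)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t.lower() in known_words) / len(tokens)

# Hypothetical learner vocabulary for one sentence of the sample text.
known = {"the", "police", "must", "bring", "before", "a", "to", "time"}
sentence = "The police must bring the suspect before a magistrate to extend the time limit."
print(mask_unknown(sentence, known))
print(f"{coverage(sentence, known):.0%} of words known")
```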
33
So what can this learner do with this text?
First let us nuance the 95% empirical finding
34
From empirical research, we know that…
With <90% of words known, comprehension is difficult/impossible
With 90-95% known, comprehension is possible with resources (dictionary, discussion, content lecturer, seminar discussion…)
With >95% known, (1) independent reading is feasible
(2) between 95-98%, fluency will improve
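These thresholds can be turned into a small lookup, sketched below. The three labels paraphrase the slide. How to treat a value of exactly 95% is an assumption made here (it is counted as "with resources"), and the function name `reading_outlook` is invented for the example.

```python
def reading_outlook(coverage):
    """Map the proportion of known words (0 to 1) to the outlook given on the slide."""
    if coverage < 0.90:
        return "difficult or impossible"
    if coverage <= 0.95:  # assumption: exactly 95% still counts as "with resources"
        return "possible with resources"
    return "independent reading feasible"

for known in (0.57, 0.93, 0.98):
    print(f"{known:.0%}: {reading_outlook(known)}")
```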
35
So, again, can this learner read this text?
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
YES for a with-resources intensive reading task
NO for a reading examination
NO if the goal is fluency development
36
Case 3: Can this learner read this text?
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
37
The learner knows 70% of K1 words
The text is 74% K1 words
So the K1 words he knows make up about .7 x .7 ≈ 50% of the text
He knows a handful of words after K1
The text is 27% after K1
At most he has 20% of these
So the 2k-5k words he knows make up .27 x .2 = 5.4% of the text
So he will know < 60% of the words in the text
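Written out as one sum, and using the unrounded 74% figure for K1 rather than .7, the Case 3 estimate comes to roughly the same result:

```latex
\[
\text{coverage} \approx
\underbrace{0.70 \times 0.74}_{\text{K1}}
+ \underbrace{0.20 \times 0.27}_{\text{beyond K1}}
= 0.518 + 0.054
= 0.572 \approx 57\%
\]
```

Either way the total stays well under the 90% floor for comprehension with resources.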
38
Here is this text with 68% of words known…
lextutor.ca/cloze/vp/
39
What happens if you give this text to this learner?
There are strict time limits on the detention of persons without charge. An arrested person may not be detained without charge for more than 24 hours, unless a serious arrestable offence has been committed. If a serious arrestable offence has been committed a superintendent can extend the period to 36 hours to secure or preserve evidence by continued questioning. When a serious arrestable offence has been committed and the suspect needs to be held in custody beyond the 36 hour period, the police must bring the suspect before a magistrate to extend the time limit to a maximum of 60 hours.
40
Have you ever seen something like this in your learners’ readers?
This is an, um… undesigned reading experience
With what objective?
41
Don’t believe this works?
Test this idea with your learners
Here are 7 handy pre-profiled texts, with 95% cut-offs at different k-levels
(pic is link)
http://www.lextutor.ca/vp/comp/samples.html
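The "95% cut-off" of a text can be read off its profile by adding up band shares until 95% is reached. The sketch below assumes the profile is an ordered list of (band, share) pairs; `cutoff_level` is a hypothetical name and the figures are illustrative, not taken from the seven sample texts.

```python
def cutoff_level(band_shares, target=0.95):
    """Return the first band at which cumulative coverage reaches the target, else None."""
    cumulative = 0.0
    for band, share in band_shares:
        cumulative += share
        # Small tolerance so floating-point rounding does not miss an exact 95%.
        if cumulative >= target - 1e-9:
            return band
    return None

# Illustrative profile: 80% K1, 10% K2, 5% K3, 5% K4.
text_profile = [("K1", 0.80), ("K2", 0.10), ("K3", 0.05), ("K4", 0.05)]
print(cutoff_level(text_profile))  # K3
```

A learner who knows the bands up to and including the cut-off level should, on this reasoning, meet the 95% criterion for that text.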
42
So with Testing+VP we can find level-appropriate and task-appropriate texts
Only find?
Where will we find interesting texts for beginners?
With VP we can also create such texts
43
44
45
Original Edited-to-a-profile
46
Plus this neat little feature
47
Ongoing VP work
• BNC lists supplemented by BNC-COCA
– Brit and US English
• Frequency joined by Greco-Latin / Anglo-Saxon indicators
– A.k.a. multi- and single-syllable words
• Incorporation of multi-word units
– “a lot” should be one k-1 unit,
– not one k-1 + one k-3
• Differentiation of homoforms
– River bank and money bank should be two words at two k-levels
• AND FINALLY Move to Mobile
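The multi-word-unit point can be illustrated with a small pre-processing step that joins known phrases into single tokens before profiling. This is a sketch under simple assumptions, not the method used on Lextutor: the unit list is a hypothetical two-entry sample, only two-word units are handled, and matching is purely by form.

```python
# Hypothetical mini-list of multi-word units; underscores mark a merged unit.
MULTIWORD_UNITS = {("a", "lot"): "a_lot", ("of", "course"): "of_course"}

def merge_units(tokens, units=MULTIWORD_UNITS):
    """Greedy left-to-right merge of two-word units into single tokens."""
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in units:
            merged.append(units[pair])
            i += 2  # skip both words of the unit
        else:
            merged.append(tokens[i])
            i += 1
    return merged

print(merge_units(["she", "reads", "a", "lot", "of", "course"]))
# ['she', 'reads', 'a_lot', 'of_course']
```

Form-based matching over-merges ("a lot of course material" would wrongly yield `of_course`), and telling `bank_1` from `bank_2` needs context, which is why these remain listed as ongoing work.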
48
49
The final vocab lists we need
1,000 List 3,000 List ?something course… …… something_of_a
bank_1 …… …
of_course bank_2
50
General move to phrases in Lextutor
51
So the Future of Vocabprofiling?
Ever more sophisticated analysis, without leaving teachers behind
Cut some words in two (bank)
And join some words into phrases (a lot)
Allow teachers to build collective library of texts on Lextutor, by topic x and 95% k-level y
Adapt VP as Web search engine to find texts of keywords x, y, z and VP k1=x, k2=y, k3=z with Google API
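A shared library searchable by topic and 95% k-level could be as simple as a filtered list. The sketch below is purely illustrative: the `LibraryText` record, the `find_texts` function and the three entries are hypothetical, and no web search or Google API call is involved.

```python
from dataclasses import dataclass

@dataclass
class LibraryText:
    title: str
    keywords: set
    cutoff_level: int  # k-level at which 95% coverage is reached

def find_texts(library, wanted_keywords, max_level):
    """Return texts that carry all wanted keywords and reach 95% at or below max_level."""
    wanted = set(wanted_keywords)
    return [t for t in library
            if wanted <= t.keywords and t.cutoff_level <= max_level]

# Hypothetical library entries.
library = [
    LibraryText("Police powers", {"law", "police"}, 3),
    LibraryText("Pets at home", {"animals"}, 1),
    LibraryText("Court procedure", {"law"}, 5),
]

for text in find_texts(library, ["law"], max_level=3):
    print(text.title)  # Police powers
```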
52
Need more?
<<new yesterday>> – Horst & Cobb chapter on VP-ing in…
53
So we have seen some useful VP tips and tricks from Lextutor
• But is it more?
• Thesis:
Text computing + language acquisition are tightly connected in 2015
ALL LEXTUTOR ROUTINES ARE INSTANCES OF ‘DATA DRIVEN LANGUAGE LEARNING’ (DD-LL)
54
DD-LL in the broader scheme
57
Input as data
Large supplies of ‘Comprehensible input’ in real life are not simple to come by
While waiting, ‘Comprehensible computer input’ can do some of the job + create readiness
We saw VP predict text readability
Computing can also make language comprehensible in several other ways
58
Concordance 1. For word meaning
If you can’t infer a meaning in one context, maybe in another
http://www.lextutor.ca/hyp/1/
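For readers who have not seen one, a concordance simply lines up every occurrence of a word with its surrounding context. The following is a minimal keyword-in-context sketch, not the Lextutor concordancer. The function name `concordance`, the fixed character window and the second sentence of the sample are assumptions made for the example.

```python
import re

def concordance(text, keyword, width=30):
    """Return keyword-in-context lines with `width` characters either side of each hit."""
    lines = []
    pattern = rf"\b{re.escape(keyword)}\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

sample = ("An arrested person may not be detained without charge for more "
          "than 24 hours. The charge was later dropped.")
for line in concordance(sample, "charge"):
    print(line)
```

Seeing the same word in several contexts at once is what lets a learner infer a meaning that one context alone does not give away.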
59
Concordance 2. To expose collocations
60
Concordance 3. For what does not exist in a language
61
Concordance 4. As data-linked writing tool
62
63
TTS (text-to-speech) to (1) slow speech down and (2) allow repetition
Generated straight from text without pre-recording
64
Code-link texts to external resources
Many generated by algorithms straight from text without recording
http://www.lextutor.ca/hyp/2/
65
So if ‘comprehensible input’ is a theory of language acquisition
• Then Data-Driven Language-Learning is a theory about how to assure a reliable supply of this
– Especially where native speakers (NSs) are in short supply
• And our best guess yet as to a principled use of the computer in language learning
66
But wait! Does DD-LL ‘work’?
• Ongoing meta-analysis of a wide range of DD-LL applications with the great Alex Boulton
67
205 recent DDL studies investigated
56 studies selected for analysis
(selected for pre-post or experimental-control quantitative design)
68
149 studies thrown out for…
• Small n-size
• Unreported or incalculable Std Deviations
• Plagiarism of other studies / students’ PhDs
• Math errors
Leaving these in the final line-up
69
Many of the 56 from GloCALL zone
Nam, 2010; Yoon & Jo, 2014
70
71
A wide range of RQs
Does DDL help with grammar in academic writing?
Does DDL help with synonym use in academic writing?
Does DDL help with preposition choice in academic writing?
Does DDL help with writing in academic writing?
Does DDL help with word collocations in academic writing?
Does DDL help with word connotations in academic writing?
Does DDL help with phraseology in academic writing?
Does DDL help with spelling in academic writing?
Does "corpus-based collocation instruction" help with collocations?
Does DDL help with lexical knowledge (definitions and translations)?
Does DDL help with interpreting?
etc…
72
Boiled down to a common measure: e.s. (effect size = comparing means in light of overall standard deviation)
73
So, over 56 good studies, pre vs. post means, or control vs. experimental means, are on average 1.46 Std Devs apart
So e.s. = 1.46 would be something like this
(about a 15% difference with typical SDs)
Group A (Control or Pre): 69, 69, 68, 64, 69, 87
M = 71, SD = 8
Group B (Experimental or Post): 83, 84, 85, 86, 87, 70
M = 82.5, SD = 6
74
e.s. = difference in means / average SD = 11.5 / 7 ≈ 1.6
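The illustration can be checked with Python's standard `statistics` module. The sketch follows the slide's shortcut of dividing the difference in means by the plain average of the two sample SDs. That shortcut is an assumption about how the figure was reached; Cohen's d, the more usual formulation, uses a pooled SD and gives a very similar value here. The function name `effect_size` is hypothetical.

```python
from statistics import mean, stdev

group_a = [69, 69, 68, 64, 69, 87]   # control or pre
group_b = [83, 84, 85, 86, 87, 70]   # experimental or post

def effect_size(control, treatment):
    """Difference in means divided by the average of the two sample SDs."""
    average_sd = (stdev(control) + stdev(treatment)) / 2
    return (mean(treatment) - mean(control)) / average_sd

print(f"M_A = {mean(group_a)}, SD_A = {stdev(group_a):.1f}")
print(f"M_B = {mean(group_b)}, SD_B = {stdev(group_b):.1f}")
print(f"e.s. = {effect_size(group_a, group_b):.2f}")
```

With unrounded SDs the result comes out near 1.6, in the same neighbourhood as the 1.46 average reported across the 56 studies.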
75
So DD-LL ‘works’ for many learning objectives
The only problem is, usually in some amazingly adapted version of concordancing
Which can be found where?
Not exactly the delight of commercial app-developers
Obviously I see Lextutor as providing accessible DDL software
Through a sustained program of continuous development
A particular example
76
78
The End
Thank You
Questions now, or via <[email protected]>
• Tom Cobb
• Université du Québec à Montréal
• www.LEXTUTOR.ca