
Studie z aplikované lingvistiky/Studies in Applied Linguistics, 2011/1, pp. 101-119


TSCHIRNER, Erwin: Reasonable Expectations: Frameworks of Reference, Proficiency Levels, Educational Standards

TSCHIRNER, Erwin (2008): Vernünftige Erwartungen: Referenzrahmen, Kompetenzniveaus, Bildungsstandards. Zeitschrift für Fremdsprachenforschung, 19.2, pp. 187–208.¹

In 2003, Germany instituted National Educational Standards (Bildungsstandards) for the lower secondary school system (grades 9 and 10) and in 2007 it decided to develop National Educational Standards for the upper secondary school system (Abitur). For foreign languages, both sets of standards are or will be based to a large extent on the Common European Framework of Reference for Languages (CEFR). This paper focuses on expected proficiency levels in speaking. The only large-scale study establishing proficiency levels to date at school is the DESI study which looked at scholastic achievement in German and English in grade 9 across all school types. The test used to evaluate speaking proficiency in English was the PhonePass Set-10 test. This paper discusses the validity of the results obtained and questions the rationality of some of the speaking proficiency expectations for foreign languages at German secondary schools.

In most European countries, the CEFR has become the basis for curricular decisions, for the development of textbooks and other teaching materials, for test and evaluation purposes, and increasingly, for national educational standards. In Germany, the educational standards have been based on, and to a large extent consist of, the level descriptors of the CEFR. The levels to be reached at certain major points in a pupil’s career have been established a priori. This paper looks at the rationality of these levels. There are very few studies of how long it takes for certain groups of students to reach particular proficiency levels. The DESI study (Schröder – Harsch – Nold, 2006) claims to have gathered empirical evidence for one major exit point in secondary schools: the end of 9th grade. It claims to have established levels for all four skills. This paper will look at the speaking skill and will argue that as far as speaking is concerned the study’s results are not plausible. It will further argue that this seems to be due to the test that was used.

As far as setting educational standards is concerned, there seems to be a great deal of wishful thinking involved. In the state of Bavaria, e.g., the level to be reached by the end of Gymnasium, the most academically oriented school form in Germany ending with the Abitur, is B2 in all skills except reading, where it is C1. Students studying a foreign language as a core subject in grades 11 and 12 (Leistungskurs) are expected to reach C1 in all four skills. The levels posited are the same regardless of language and regardless of how long the language has been studied.

¹ This paper is a slightly revised English version of TSCHIRNER, E. (2008): Vernünftige Erwartungen: Referenzrahmen, Kompetenzniveaus, Bildungsstandards. Zeitschrift für Fremdsprachenforschung, 19.2, pp. 187–208.


For English, for example, which starts in grade 3 in elementary school, a B2 level of speaking is expected. The same level is expected after three years of studying the third foreign language even if it is, e.g., Russian (Bayerisches Staatsministerium für Unterricht und Kultus, 2003). There is a wealth of studies, however, showing that it takes a great deal of time to reach high levels of oral proficiency and that both type and intensity of exposure as well as language distance play as much of a role as simple length of time (Rifkin, 2003; Swender, 2003; Tschirner, 2007a).

There are hardly any empirical studies that look at the relationship between length of time and proficiency level achieved with respect to the CEFR. However, there is a large body of research that looks at length and type of exposure and proficiency level attained with respect to the Oral Proficiency Guidelines of the American Council on the Teaching of Foreign Languages (ACTFL) (see Tschirner and Heilenman [1998] and Tschirner [2007a] for an overview). Because of the clear correspondences between the CEFR and ACTFL levels, these studies ought to be exploited thoroughly to arrive at preliminary proficiency standards for various languages at various grade levels.

In the first section of my paper I will focus on the correspondences between the CEFR and the ACTFL guidelines to establish the usefulness of using empirical results referring to ACTFL levels for European educational standards based on the CEFR. In the second section I will present the research results referring to ACTFL oral proficiency levels, especially with respect to the twin factors established above, i.e., time and quality of exposure as well as language distance. In the third section I will present the results of the DESI study and will compare them with the results of section two as well as with major international standardized tests. Because it will become clear that both research results and the expectations of major standardized tests do not agree with the results of the DESI study, the test used to establish oral proficiency in that study, Versant English, will be critically examined. It will be argued that this test does not measure oral proficiency as it is conceptualized in the CEFR and the ACTFL guidelines. In the final part of my paper, I will offer an alternative approach to establishing educational standards based on empirical research and I will suggest reasonable proficiency levels for the grade levels in question.

1. Correspondences between ACTFL and CEFR

The CEFR scales and descriptors are a product of the Swiss National Science Research Council project “Evaluation and self-evaluation of foreign-language competences at interfaces of the Swiss education system” directed by Günther Schneider and Brian North. North deconstructed existing proficiency scales (among them the ACTFL and the ILR scale, see below) to gain a total of 2,000 individual descriptors, to put them into categories, and to sort them according to level. He then held a number of workshops with a total of 290 English, French, and German teachers to review the appropriateness of his categories and of his sorting. Next, he asked teachers to assess a total of 2,700 students with the help of these descriptors. Using a Rasch model, he assigned difficulty values to the descriptors and arrived at nine fairly evenly spaced levels, i.e., A1, A2, A2+, B1, B1+, B2, B2+, C1 and C2 (Schneider, 2001). On the basis of his statistical analyses, North finally developed the scales found in the CEFR.
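To illustrate what assigning difficulty values with a Rasch model means, here is a minimal sketch of the simplest (dichotomous) form of the model. The text does not say which Rasch variant was used, so the dichotomous form is an assumption, and the descriptor names and logit values below are hypothetical, not data from the project.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a positive judgement under the dichotomous Rasch model.

    Both arguments are on the same logit scale; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical difficulty values (in logits) for three descriptors.
# These are illustrative only and are not taken from the study.
descriptor_difficulty = {
    "descriptor_easy": -1.5,
    "descriptor_mid": 0.0,
    "descriptor_hard": 2.0,
}

learner_ability = 0.0  # hypothetical learner located at 0 logits

for name, difficulty in descriptor_difficulty.items():
    p = rasch_probability(learner_ability, difficulty)
    print(f"{name}: P(can do) = {p:.2f}")
```

Because descriptors and learners sit on one common scale, cut points on that scale can then be used to group descriptors into levels.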

The ACTFL Oral Proficiency Guidelines derive from and are compatible with the U.S. Interagency Language Roundtable (ILR) Guidelines which are based on the guidelines of the U.S. Foreign Service Institute (FSI), developed in 1956. The FSI guidelines continue to be regarded as the ancestor of most existing speaking proficiency scales (Chalhoub-Deville – Fulcher, 2003; Fulcher, 1997; Herzog, 2003; Liskin-Gasparro, 2003). The ACTFL oral proficiency guidelines have been revised several times and now exist in their 1999 version. They distinguish between ten levels: Novice Low, Mid, and High, Intermediate Low, Mid, and High, Advanced Low, Mid, and High, and Superior (Breiner-Sanders et al., 2000). The descriptors describe oral proficiency from the very beginning to the most advanced levels of second language proficiency.

The CEFR scales are said to have been developed with the model of communicative competence by Bachman (1990) in mind (Schneider, 2001), with which the ACTFL scale is also compatible (Tschirner, 2001). Both scales rest on a great many broadly based expert judgments. The origin of both scales is the FSI scale of the USA. Correspondences between the ACTFL and FSI, now ILR, scales have been empirically validated and are constantly being revalidated. The correspondences between the CEFR and the ILR scale rest on the major role that the ILR and ACTFL scales have played in the development of the CEFR (both Brian North and John Trim were Mellon fellows at the National Language Resource Center (NLRC) in Washington, D. C., while developing the CEFR). Because both scales are fundamentally linked to the discussion of educational standards and form the basis of two major test systems that are very influential in the world, it is of great importance that correspondences are established between them.

Vandergrift (2006) investigated existing scales (including ACTFL and CEFR) for the Common Framework of Reference for Languages of Canada. He proposes the following correspondences between CEFR and ACTFL based on a careful interpretation of their descriptors: A1 and A2 with Novice High and Intermediate High, B1 and B2 with Advanced Low and Advanced Mid, and C1 and C2 with Advanced High and Superior, respectively. Very similar correspondences were proposed by Tschirner (2005) who suggests correlating A2 with Intermediate Mid and B1 with Intermediate High. North (Majima, 2005) also correlates B1 with Intermediate High and Advanced Mid with B2 and suggests that C1 is the equivalent of Superior. On the basis of the original nine point CEFR scale established by North, Tschirner (2007b) does a microanalysis of the ACTFL and CEFR speaking descriptors and suggests the following correspondence table for speaking. (See table 1 for a slightly modified version). (For a side by side listing of ACTFL and CEFR level descriptions see Appendix 1).

CEFR    A1   A2   A2+   B1   B1+   B2   B2+   C1   C2
ACTFL   NH   IL   IM    IH   AL    AM   AH    S

Table 1: Correspondences between CEFR and ACTFL scales of oral proficiency.
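The correspondences in Table 1 can be held in a simple lookup table, which is convenient when ACTFL-based results are to be re-expressed in CEFR terms. The sketch below pairs the two rows in the order in which they are listed. That pairing is an assumption, since the ACTFL row has eight entries for nine CEFR levels, and C2 is therefore left without an entry. The function name `actfl_for` is hypothetical.

```python
from typing import Optional

CEFR_LEVELS = ["A1", "A2", "A2+", "B1", "B1+", "B2", "B2+", "C1", "C2"]
ACTFL_LEVELS = ["NH", "IL", "IM", "IH", "AL", "AM", "AH", "S"]

# ASSUMPTION: the rows of Table 1 are paired in listing order.
# zip() stops at the shorter list, so C2 receives no entry.
CEFR_TO_ACTFL = dict(zip(CEFR_LEVELS, ACTFL_LEVELS))


def actfl_for(cefr_level: str) -> Optional[str]:
    """Return the ACTFL speaking level paired with a CEFR level, or None."""
    return CEFR_TO_ACTFL.get(cefr_level)


for level in CEFR_LEVELS:
    print(level, "->", actfl_for(level))
```

The pairing agrees with the correspondences named in the text (B1 with Intermediate High, B2 with Advanced Mid, C1 with Superior).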

Both sets of scales assume that the growth of oral proficiency may be described on the basis of four categories: (1) the length and complexity of the oral discourse; (2) the number and types of topics; (3) the variety and complexity of speech acts; and (4) the increasing mastery of the formal aspects of the spoken language, its grammar, vocabulary, pronunciation and socio-cultural appropriateness. The ACTFL Oral Proficiency Guidelines, e.g., assume a progression within these categories as follows.

Discourse or organizational competence (Bachman, 1990) starts with the ability to use words and phrases and strings of words to express intended meaning (Novice). At the Intermediate level, learners are able to use sentence-length utterances and strings of utterances; at the Advanced level, they are able to connect sentences using anaphora, subordination and other linguistic means to establish textual cohesion in predictable narrative and descriptive texts, and at the Superior level, in less predictable reasoned texts containing argumentation, supported opinion, and hypotheses.

Thematic competence develops from the most elementary personal and social needs and concepts (Novice) to basic personal topics (Intermediate), to the ability to talk about concrete and factual topics (Advanced), and to discussing abstract topics (Superior).

Functional competence starts with the ability to respond to very simple questions using words and strings of words and phrases (Novice). It continues to develop to the ability to ask and answer personal questions and fulfill basic everyday transactions (Intermediate). It develops further to the ability to narrate and describe in all time frames and to deal with everyday transactions that include a complication (Advanced), and finally to the ability to reason and argue, to support an opinion and to hypothesize (Superior).

Formal competence, finally, develops from the ability to be minimally comprehensible (Novice) to the ability to establish cohesion through the use of sentence-length utterances and strings of utterances (Intermediate), to the ability to distinguish between formal and informal varieties of speech and to explicitly mark temporal and aspectual relationships as well as the information structure of descriptive and narrative texts, e.g. through the use of word order, subordination, anaphora and case (Advanced), and finally to the ability to produce complex educated speech containing complex structures such as passive and subjunctive forms with very few noticeable difficulties and errors in pronunciation, lexis, grammar or sociolinguistic conventions (Superior).

The first major study looking at proficiency levels of U.S. college graduates with a B.A. in foreign languages and cultures was completed in 1967 (Carroll, 1967). In the 40 years since, a great many studies were undertaken to look at validity and reliability issues with respect to the OPI (e.g., Malone, 2003; Surface – Dierdorff, 2003; Watanabe, 1998), the relationship between level and length of study depending on native language(s) and target language as well as type of learning or instruction (e.g., Swender, 2003; Tschirner – Heilenman, 1998; Tschirner, 2007a), the relationship between level and the acquisition of particular linguistic structures (e.g., Magnan, 1988; Mikhailova, 2006; Norris, 1996; Rifkin, 2002; Rubio, 2003; Tschirner, 1996; Watanabe, 2003), as well as correspondences between the OPI and other standardized tests (e.g., Halleck – Moder, 1995; Kenyon – Tschirner, 2000). In the next section, I will focus on studies that have looked at questions relating to length of time, types of learning and language distance.

2. Length of time, types of learning, language distance

A number of studies have tried to determine how much time and what types of learning are required to reach a particular level. Time requirements vary according to type of learning and the distance between native and target language, among other variables.


The U.S. Foreign Service Institute (Foreign Service Institute, 2006), e.g., categorizes languages in three difficulty levels depending on their linguistic and cultural distance from American English. Category I, so-called “easy languages”, includes the Romance and Germanic languages (except German); Category II, “difficult languages”, includes the Slavic languages and languages such as Hindi, Modern Greek, Hungarian, and Vietnamese; while Category III, “super difficult languages”, includes Arabic, Chinese, Japanese, and Korean. German is in a category by itself, in between Categories I and II.

Class size at the FSI is very small, up to a maximum of six students. Instruction takes place within intensive teaching and learning environments (22 hours of instruction plus 18 hours of task-based homework assignments per week). Students are on average 40 years of age, they have at least an M.A. degree and they are highly motivated. With category I languages, students commonly reach Advanced High in 960 hours (including 600 classroom hours). For category II languages, it takes them 1,760 hours (including 1,100 classroom hours). They require on average 3,520 hours (including 2,200 classroom hours) to reach the same level for “super difficult” languages. With these languages, students typically spend their complete second year in the target country. To reach Advanced High in German, students typically study German for 1,200 hours (including 750 classroom hours) (see table 2).

Category   I     German   II     III
AH         960   1200     1760   3520

Table 2: Suggested number of hours of study for students at the Foreign Service Institute (USA) to reach the level Advanced High (AH) with Category I, II and III languages and German.
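To make the scale of these figures more tangible, the total hours in Table 2 can be converted into weeks of full-time study. The sketch below assumes a constant weekly load of 40 hours, i.e. the 22 hours of instruction plus 18 hours of homework mentioned above. The constant load and the function name `weeks_of_study` are assumptions made for illustration.

```python
# Weekly load as described for the FSI: instruction plus homework.
HOURS_PER_WEEK = 22 + 18

# Total hours of study to reach Advanced High (Table 2).
TOTAL_HOURS_TO_AH = {
    "Category I": 960,
    "German": 1200,
    "Category II": 1760,
    "Category III": 3520,
}


def weeks_of_study(total_hours: int, hours_per_week: int = HOURS_PER_WEEK) -> float:
    """Convert total study hours into weeks, assuming a constant weekly load."""
    return total_hours / hours_per_week


for language_group, hours in TOTAL_HOURS_TO_AH.items():
    print(f"{language_group}: {hours} h = {weeks_of_study(hours):.0f} weeks")
```

Under this assumption the Category III figure corresponds to 88 weeks of full-time study, four times the 24 weeks for Category I.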

These estimates are largely supported by research. The lower levels, from Novice Low (NL) to Intermediate Mid (IM), are reached relatively quickly for related languages. Tschirner and Heilenman (1998), e.g., found that it takes an average of 150 classroom hours to reach Intermediate Low (IL) and an average of 300 hours to reach IM for native speakers of English studying languages such as German, French or Spanish in college. To cross the threshold to advanced levels of oral proficiency, however, appears to be relatively difficult and frequently requires an immersion period of at least six months in a country where that particular language is spoken. For example, only students who studied in Russia for at least one semester reached a level of Advanced Low (AL) or higher by the time they finished their B.A. degree in Russian (Rifkin, 2003).

Swender (2003) summarizes the results of five Ivy League colleges that had their language majors (N = 501) tested over a period of five years (1998–2002) at the time of their college graduation. Many of these students had participated in a one-year study abroad program. 17% of Spanish, French and Italian majors (N = 442) were IM by the time they graduated, 57% were Intermediate High (IH) or AL, and 26% were Advanced Mid (AM) or higher. Similar results were achieved by German and Russian majors (N = 39): 26% were IM, 44% were IH or AL, and 30% were AM or higher. Slightly lower but still very acceptable results were achieved by Chinese and Japanese majors (N = 20): 35% were IM, 50% were IH or AL, and 15% were AM or higher.


3. Deutsch Englisch Schülerleistungen International (DESI)

Studies investigating oral proficiency levels of U.S. college students show that the average level reached at the end of a B.A. degree program in a foreign language after four years of intensive study of the foreign language, including its literature and culture, is IH (B1). Even this level is frequently only achieved if students spend at least two semesters abroad studying at a university in the target country. Approximately a third of language graduates reach the level AM (B2), and only if they spend at least a year studying in the target country. The DESI study in Germany claims to have yielded very similar results for ninth grade secondary school students (Schröder – Harsch – Nold, 2006). In the following, I will take a critical look at the study, specifically its results relating to the acquisition of oral proficiency and I will try to evaluate their credibility.

The DESI study was a German study investigating foreign language achievement of secondary school students in Germany completing ninth grade in English and German. The study was funded by the Kultusministerkonferenz (conference of state education ministers) (KMK) and was supposed to be a national study complementing the international PISA study. The study was organized by a consortium of researchers under the leadership of the Deutsches Institut für Internationale Pädagogische Forschung (DIPF). 11,000 ninth grade students of all school forms ranging from Hauptschule and Realschule to Gymnasium were tested at the beginning and at the end of the school year 2003/2004. The skills tested in English were oral proficiency, reading and listening proficiency, writing proficiency, language awareness, intercultural competence and the ability to reconstruct written texts (Kultusministerkonferenz, 2006).

According to the DESI consortium, the results of the test of oral proficiency may be interpreted most reliably according to the CEFR. Their test shows, they claim, that roughly two thirds of secondary school students reach the level A2, one third reaches B1, and nine per cent reach B2. The test that was used by the DESI consortium to establish oral proficiency levels was Versant English, a computer based test that is claimed to have been linked to the CEFR. The DESI consortium contends that their results support the CEFR levels established by the German Educational Standards for graduation from Hauptschule, the least academic of the three German school forms (DESI-Konsortium, 2006).

In the following, I will first compare the results of the DESI study with standardized international tests of English to establish their plausibility. In the section thereafter, I will discuss Versant English and I will take a critical look at the way it was linked to the CEFR to determine how reliable this linking process may have been.

Tannenbaum and Wylie (2005) linked some of the most important international English tests with the levels B1 and C1 of the CEFR: the Test of English as a Foreign Language (TOEFL), the Test of Spoken English (TSE), the Test of Written English (TWE) and the Test of English for International Communication (TOEIC). Table 3 shows the results of this study. It also shows the Cambridge certificates that have been linked to the CEFR, including the Key English Test (KET) (A2), the Preliminary English Test (PET) (B1), the First Certificate in English (FCE) (B2), the Certificate in Advanced English (CAE) (C1) and the Certificate of Proficiency in English (CPE) (C2). In addition, table 3 presents the results of the DESI oral proficiency test for all ninth grade students and separately for ninth grade Gymnasium students (Schröder et al., 2006). Finally, the table presents the results of the US college students of German at graduation as presented in section 2. These students studied German at college level for a total of 700 classroom hours and many of them spent a year studying at university in a German-speaking country (Swender, 2003).

CEFR                           A1    A2    B1    B2    C1    C2
TSE                            -     -     45    -     55    -
TWE                            -     -     4.5   -     5.5   -
TOEFL (paper and pencil)       -     -     457   -     560   -
TOEIC                          -     -     550   -     880   -
Cambridge Certificate          -     KET   PET   FCE   CAE   CPE
DESI 9th grade (all)           28%   32%   25%   8%    1%    0.5%
DESI 9th grade (Gymn.)         5%    20%   50%   24%   1%    0.5%
B.A. (700 h) + 1 year abroad   -     26%   44%   30%   -     -

Table 3: Comparing international tests of English with the results of the DESI study and the OPI study by Swender (2003).
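Statements of the kind "x% would pass a test linked to level L" rest on the share of students at level L or above. The following sketch recomputes these cumulative shares from the two DESI rows of Table 3 as printed. The helper name `share_at_or_above` is hypothetical, and figures derived this way may deviate somewhat from the rounded percentages quoted in the running text.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Percentages per CEFR level, taken from the DESI rows of Table 3.
DESI_SHARES = {
    "all": [28.0, 32.0, 25.0, 8.0, 1.0, 0.5],
    "gymnasium": [5.0, 20.0, 50.0, 24.0, 1.0, 0.5],
}


def share_at_or_above(shares: list, level: str) -> float:
    """Sum the percentages of all levels from `level` upwards."""
    start = LEVELS.index(level)
    return sum(shares[start:])


for group, shares in DESI_SHARES.items():
    for level in ("A2", "B1", "B2"):
        print(f"{group}: {share_at_or_above(shares, level)}% at {level} or above")
```

For the Gymnasium row this gives about three quarters of students at B1 or above and about one quarter at B2 or above.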

According to the DESI study, 72% of all ninth grade students would pass the Key English Test of the University of Cambridge ESOL Examinations (KET), 40% would pass the Preliminary English Test (PET) and almost 10% would pass the First Certificate in English (FCE). Moreover, every fourth ninth grade Gymnasium student would pass the FCE. To pass the FCE, it is recommended that students complete 600 classroom hours taught in small groups by native speakers of English who use English as the language of instruction. Commonly, these students need to be highly motivated to pass the FCE after this relatively short amount of time. Learners who pass the FCE

• understand texts from a wide variety of sources;
• use English to make notes while someone is speaking in English;
• talk to people about a wide variety of topics;
• understand people talking in English on radio or television programs (University of Cambridge ESOL Examinations).

The DESI consortium maintains that 25% of ninth grade Gymnasium students fulfill these requirements. In addition, their results amount to the claim that 40% of all ninth grade Hauptschule and 75% of Gymnasium students would pass the TOEFL test with at least 457 points or the TOEIC test with 550 points. Finally, if their claims regarding CEFR levels were true, this would amount to asserting that ninth grade German Gymnasium students reach the same levels as US B.A. foreign language students at graduation after four years of college including one complete year at a university in a German-speaking country. This does not seem plausible. In the next section therefore I will take a critical look at the test that yielded these results.


4. Versant English

Versant English is a commercial test of Ordinate Corporation in Menlo Park, CA, USA (now part of Pearson; prior to 2006, the test was called PhonePass SET-10). The test uses voice recognition technology and is fully automated, including its evaluation. It is given over the telephone and takes about 10 minutes. It is mostly a test of pronunciation even though the developers claim that it evaluates “facility in spoken English” because test takers allegedly need to have highly automated listening and speaking routines to do well on the test. Versant English consists of five parts, of which only the first four are evaluated (Ordinate, 2007).

Part 1 consists of 12 sentences. These sentences are grouped in sets of four that are loosely connected thematically to avoid ambiguities. In addition, they are meant to be simple sentences, both lexically and structurally. Here is an example of a set of four sentences.

1) Traffic is a huge problem in Southern California.
2) The endless city has no coherent mass transit system.
3) Sharing rides was going to be the solution to rush-hour traffic.
4) Most people still want to drive their own cars, though.

The test chooses eight of these 12 sentences at random, which candidates have to read out loud. Part 2 consists of 16 spoken sentences. Candidates are asked to repeat these sentences out loud. The sentences were recorded by native speakers of American English speaking a variety of standard regional accents. They are three to fifteen words long. Sentences are only heard. Here are three example sentences.

• War broke out.
• It’s supposed to rain tomorrow, isn’t it?
• There are three basic ways in which a story might be told to someone.

Part 3 consists of simple spoken questions which are presumed to test both receptive and productive vocabulary. They are not supposed to presuppose any special kind of knowledge and were developed in such a way that a 12-year-old native speaker of English should be able to answer them. Each question contains 3–4 lexical units. Short responses are expected. Here are three typical questions.

• What season comes before spring?
• What is frozen water called?
• Does a tree usually have fewer trunks or branches?

Part 3 of Versant English consists of 24 questions. For the DESI study, this part was reduced to 16 questions.


Part 4 consists of 10 scrambled sentences which candidates have to say in the correct order. They consist of short simple sentences which are divided in three parts. They are heard only. Here are three typical examples.

• in / bed / stay
• Ralph / this photograph / could convince
• we wondered / would fit in here / whether the new piano

Part 5 is not evaluated because it has to be rated by human raters. It consists of two questions that need to be answered orally. Candidates have 30 seconds per question. Questions asked are related to “family life” and “personal choices”. The members of the DESI consortium developed their own questions for this part. They also changed the number of questions to three and reduced the time to answer each question to 20 seconds. As with the original test, however, these answers were not rated and did not influence the final score.

The vocabulary and grammatical structures contained in Versant English were selected on the basis of speech samples from native speakers. 540 conversations between native speakers of American English were recorded and analyzed. These conversations were chosen to represent a broad selection of all important regional accents of American English and to represent men and women equally. The resulting test sentences were proofed by native speakers of British and Australian English so that lexical Americanisms could be avoided. Proofreaders, however, proofed these sentences only in writing.

The answers of test takers are automatically evaluated according to an internal algorithm the test developers keep secret. They reveal only that there are four categories: sentence structure, vocabulary, fluency, and pronunciation. Sentence structure is evaluated in parts 2 (sentence repetition) and 4 (scrambled sentences). Vocabulary is evaluated in part 3 (short answers). Both fluency and pronunciation are evaluated in parts 1 (reading out loud), 2 (sentence repetition), and 4 (scrambled sentences). Table 4 shows the distribution of rating criteria across tasks.

                      Sentence structure   Vocabulary   Fluency   Pronunciation
Reading out loud      -                    -            x         x
Sentence repetition   x                    -            x         x
Short answers         -                    x            -         -
Scrambled sentences   x                    -            x         x

Table 4: Versant English tasks and rating criteria.

Fifty per cent of the final result is based on sentence structure and vocabulary. As with everything else, vocabulary is evaluated automatically. Points are given only if the expected words are contained in the response. The other fifty per cent of the final score is based on pronunciation and fluency. Points for pronunciation are given for correct pronunciation of consonants and vowels and for correctly marking word and sentence stress. Correctness is defined according to the pronunciation and stress marking of educated native speakers of American English. Fluency is judged by calculating how long it takes the candidate to start speaking, how rapidly he or she speaks, and the length and positioning of his or her pauses.

It is unclear what exactly the test measures, but it is surely not oral proficiency or communicative competence. Pronunciation and fluency are rated in parts 1 (reading out loud) and 2 (sentence repetition). Parts 3 and 4 rate pronunciation and fluency as well as basic listening comprehension, basic vocabulary, and basic structural competence. The only part where functional competence may come into play at all, and then only to a very small degree, is the part that is not rated, i.e., part 5. Some studies have even been critical of Versant English as a test of pronunciation. Hincks (2001), e.g., found that candidates who were instructed to speak rapidly received better results simply on account of speaking rapidly.

There are very few studies of Versant English. A few articles in the voice recognition literature mention it as a test of pronunciation (e.g. de Wet – van der Walt – Niesler, 2007), and three articles by Hincks question its validity as a test of pronunciation (Hincks, 2001; 2002; 2003). Two studies set out to link Versant English with the CEFR (de Jong – Bernstein, 2001; Suzuki – Harada, 2004). These two studies are discussed below.

The first study was presented by John de Jong and Jared Bernstein in 2001 at the 7th European Conference on Speech Communication and Technology in Aalborg, Denmark, and published in the conference proceedings (de Jong – Bernstein, 2001). Jared Bernstein was the founder and president of the Ordinate Corporation. The goal of this study was to correlate part 5, open questions, with the other four parts. As discussed previously, part 5 consists of two simple personal questions, the responses to which may take up to 30 seconds each, i.e., the sample that may be rated is at most one minute long. These samples were rated by three raters. One was from the Netherlands, one from Switzerland, and one from the U.K. Only one of them was a professional rater of English. Raters were trained to assign CEFR ratings by watching videotapes of speakers at each of the six CEFR levels, one video per level. Next, they were asked to listen to six audio recordings of Versant English part 5 samples and to decide jointly on one rating. Before they did that, they simplified the CEFR descriptors to better match the Versant English samples (cf. appendix 2). Then, they rated part 5 samples of 121 candidates. The raters had a very high interrater reliability. This may be due to the quality of their training. It may also be due to the fact that there were only three raters and that they were able to use clear and simple descriptors.

Correlations between part 5 and parts 1–4 were high (.84). This means that candidates who received higher scores in parts 1–4 tended to receive higher ratings in part 5 as well. It does not necessarily mean that parts 1–4 correlate with the levels of the CEFR, because the sample size was much too small even for CEFR standards and the descriptors used were not the actual CEFR descriptors but simplified versions thereof.

Moreover, the points on parts 1–4 associated with each CEFR level are widely spread and overlap considerably. As the scatter plot in de Jong and Bernstein, Figure 3 (2001, no page number) clearly shows, A1 is correlated with 38–59 points on parts 1–4, A2 with 43–62, B1 with 49–80, B2 with 58–80, and C1 with 69–80. Thus, e.g., 58–59 points are correlated with all levels from A1 to B2 and 69–80 points with all levels from B1 to C1. In addition, the small number of candidates at the levels B2 to C2 calls these results into question as well: only 11 candidates were rated B2, only 6 were rated C1, and only 2 were rated C2. Finally, the fact that at most one minute of speech could be rated per candidate makes any results very questionable.
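The overlap can be made concrete with a short sketch. It uses only the score ranges quoted above from de Jong and Bernstein's Figure 3; the function name `levels_for_score` is an illustrative assumption and not part of any Versant or Ordinate tool.

```python
# Score ranges on parts 1-4 per CEFR level, as quoted in the text above
# from the scatter plot in de Jong and Bernstein (2001).
SCORE_RANGES = {
    "A1": (38, 59),
    "A2": (43, 62),
    "B1": (49, 80),
    "B2": (58, 80),
    "C1": (69, 80),
}


def levels_for_score(score, ranges=SCORE_RANGES):
    """Return every CEFR level whose observed score range contains `score`.

    Hypothetical helper for illustration only.
    """
    return [level for level, (low, high) in ranges.items() if low <= score <= high]


print(levels_for_score(58))  # ['A1', 'A2', 'B1', 'B2']
print(levels_for_score(75))  # ['B1', 'B2', 'C1']
```

A single score of 58 is thus compatible with four of the five levels listed, which is the sense in which the score on parts 1–4 does not determine a CEFR level.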

The second study (Suzuki – Harada, 2004) was also financed and initiated by Ordinate and followed a similar pattern. Parts 1–4 were correlated with part 5. Samples from 268 non-native and 35 native speakers of English were rated by six raters. Interrater reliability again was high (.89), as was the correlation between parts 1–4 and part 5. Again, the correlation coefficient does not show the complete picture. Complete agreement between raters was 63%; 30% of ratings were off by one level, and 7% by two levels or more. It is also worth noting that four native speakers were rated B1 and four more were rated B2. Finally, the scatter plot published in the study (Suzuki – Harada, 2004) again shows the great overlap between points assigned for parts 1–4 and the assumed CEFR level of part 5. A1 was correlated with 25–51 points; A2 with 25–56; B1 with 31–77; B2 with 41–76; C1 with 51–80; and C2 with 71–76. Again, someone who received 51 points could have been at any level from A1 to C1, and anyone with 71–76 points could have been anything from B1 to C2. In addition, Ordinate (Ordinate, 2007) apparently uses very different cut points starting in 2007 than it did previously, presumably because of the study by Suzuki and Harada (see table 5).

        A1      A2      B1      B2      C1      C2
2001    40–49   50–55   56–61   62–67   68–72   73–80
2006    26–35   36–46   47–57   58–68   69–78   79–80

Table 5: CEFR cut points of Versant English in 2001 and 2006.
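To see what the change in cut points means for an individual candidate, the following sketch looks up the same score under both sets of cut points from table 5. The names `CUT_POINTS` and `cefr_level` are assumptions made for this illustration; scores below the lowest A1 cut point are simply reported as having no level.

```python
# Cut points taken from table 5; each entry is (level, lowest score, highest score).
CUT_POINTS = {
    2001: [("A1", 40, 49), ("A2", 50, 55), ("B1", 56, 61),
           ("B2", 62, 67), ("C1", 68, 72), ("C2", 73, 80)],
    2006: [("A1", 26, 35), ("A2", 36, 46), ("B1", 47, 57),
           ("B2", 58, 68), ("C1", 69, 78), ("C2", 79, 80)],
}


def cefr_level(score, year):
    """Return the CEFR level assigned to `score` under the cut points of `year`.

    Hypothetical helper; returns None if the score falls below the A1 range.
    """
    for level, low, high in CUT_POINTS[year]:
        if low <= score <= high:
            return level
    return None


for score in (45, 58, 75):
    print(score, cefr_level(score, 2001), cefr_level(score, 2006))
# 45 A1 A2
# 58 B1 B2
# 75 C2 C1
```

The same performance is thus placed one level higher at the lower end of the scale and one level lower at the top, depending on which set of cut points is applied.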

Both studies are plagued by the same problems: very short speech samples (less than a minute); an unclear relationship between the simplified CEFR descriptors used for Versant English and the original ones; a lopsided concentration of most test results at the lower levels from A1 to B1; and substantial overlap in the points associated with individual CEFR levels. Taken together, these problems cast more than a shadow of doubt on Ordinate’s claim that it has linked its test reliably to CEFR levels. The claim of the DESI consortium that it established CEFR oral proficiency levels for ninth-grade students in Germany therefore seems equally unsubstantiated and unreliable.

5. Reasonable expectations

In this paper I have tried to show that the results of the DESI study with respect to attainable proficiency levels of English in ninth grade in German public schools are not reliable enough to function as a basis for educational standards. It is of considerable importance to establish empirically what proficiency levels may be attained at various exit points within the German school system. American studies seem to indicate that German educational standards are far too ambitious, and unreasonably so, with respect to oral proficiency levels even for the first foreign language studied in school. Equally unreasonable are the assumptions that it does not matter how long one studies a foreign language or that all foreign languages may be acquired equally fast (see below). Educational standards that assume the same proficiency levels at a particular exit point, e.g. the Abitur, for all languages regardless of length of study are not reasonable. In the following, I will try to establish more reasonable proficiency standards for German Gymnasien based on the substantial body of research that has been done with respect to the ACTFL oral proficiency standards. As a point of departure, I will use the Bavarian proficiency standards for Gymnasien.

Row 1 in table 6 shows the grade levels of the Gymnasium in Bavaria. Row 2 shows the cumulative classroom hours for the first foreign language and row 3 the expected CEFR level for all students for all skills at the end of the year. Row 4 shows a more reasonable level based on the empirical studies discussed in section 1 of this paper. Rows 5–7 show the same three categories (classroom hours, expected level, more realistic level) for the second foreign language starting in grade 6; rows 8–11 show them for the third foreign language starting in grade 8; and rows 12–14 for any foreign language added in grade 11. Row 11 shows slightly different realistic levels for Russian as a third language, a more distant language for native speakers of German than Italian or Spanish, the more common third languages.

 1  Grade                       5     6     7     8     9     10    12
 2  FL1: Hours                  200   360   520   640   760   880   1200
 3  Expected Level              A1    A1+   A2    A2+   B1    B1+   B2+
 4  Realistic Level             A1    A1+   A2    A2+   A2+   B1    B1
 5  FL2: Hours                  -     160   320   480   600   720   1040
 6  Expected Level              -     A1    A2    A2+   B1    B1+   B2
 7  Realistic Level             -     A1    A2    A2+   A2+   B1    B1
 8  FL3: Hours                  -     -     -     160   320   480   800
 9  Expected Level              -     -     -     A2    A2+   B1+   B2
10  Realistic Level             -     -     -     A1+   A2    A2+   (B1)
11  Realistic Level: Russian    -     -     -     A1    A1+   A2    A2+
12  Late Beginning FL: Hours    -     -     -     -     -     -     360
13  Expected Level              -     -     -     -     -     -     B1
14  Realistic Level             -     -     -     -     -     -     A2+

Table 6: Classroom hours, expected and realistic CEFR levels in oral proficiency for foreign languages at Gymnasien in Bavaria
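The distance between expected and realistic levels for the first foreign language (rows 1, 3 and 4 of table 6) can be tallied with a small sketch. Treating the levels as an evenly spaced ordinal scale is an assumption made only for this illustration: one step on the scale does not correspond to a fixed amount of instruction time. The names `LEVELS` and `gap` are likewise illustrative.

```python
# Ordinal scale of the CEFR levels and plus-levels that occur in table 6.
LEVELS = ["A1", "A1+", "A2", "A2+", "B1", "B1+", "B2", "B2+"]

# Rows 1, 3 and 4 of table 6 (first foreign language).
GRADES = [5, 6, 7, 8, 9, 10, 12]
EXPECTED = ["A1", "A1+", "A2", "A2+", "B1", "B1+", "B2+"]
REALISTIC = ["A1", "A1+", "A2", "A2+", "A2+", "B1", "B1"]


def gap(expected, realistic):
    """Number of scale steps by which the expected level exceeds the realistic one."""
    return LEVELS.index(expected) - LEVELS.index(realistic)


gaps = {grade: gap(e, r) for grade, e, r in zip(GRADES, EXPECTED, REALISTIC)}
print(gaps)  # {5: 0, 6: 0, 7: 0, 8: 0, 9: 1, 10: 1, 12: 3}
```

On this reading, the two rows agree through grade 8 and diverge from grade 9 onward, with the largest gap at the Abitur.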

A2 in oral proficiency seems to be a reasonable goal, even or perhaps especially for children after 520 or even 360 hours of instruction, above all when they have been exposed to the first foreign language from grade 3 in elementary school. A2+ should be attainable after 640 hours and B1 after 880 hours. B1+ or B2 do not appear to be realistic goals even after 1200 classroom hours for students who have not spent at least a year abroad or have had a similar substantial immersion experience in bilingual school programs taught by native or near-native speakers of the target language in the target language. Even in such environments, only highly motivated students who are challenged to move beyond Basic Interpersonal Communication Skills (BICS) will move on to the higher levels of Cognitive Academic Language Proficiency (CALP) (Cummins, 1979). This requires substantial opportunities for academic presentations and discussions. Similar arguments apply to the second foreign language.

As far as the third foreign language is concerned, A2 appears to be a realistic goal after two years and A2+ after three years. Highly motivated students with an immersion experience may reach B1 at Abitur; most students probably will not. For more distant languages such as Russian, B1 does not seem to be a realistic goal, even for highly motivated students, unless they spend at least half a year in a Russian-speaking environment. For languages that are started in grade 11 and studied for a total of two years, the only reasonable goal appears to be A2+, unless a school year abroad is involved or there has been a substantial immersion experience at home or in a bilingual school program.

It is important to note that my estimates only apply to oral proficiency. The CEFR has to be commended for drawing attention to the fact that skills develop at different speeds (cf. also Tschirner, 1996). The educational standards of Bavaria, e.g., draw on this fact and posit a higher level of proficiency for reading than they do for the other skills. It appears reasonable to assume that higher proficiency levels are attainable for the receptive skills of reading and listening, perhaps even the levels that have been assumed for the various grade levels. Thus, although the expected levels for Gymnasium might not be achievable for speaking and writing, they may be for listening and reading. These levels, however, may not be reached even for the receptive skills until curricula and instructional approaches realize that more reading and listening experience is required to reach these levels than commonly assumed. Extended reading programs are required for reading proficiency, and extensive film, television and radio broadcast programs as well as frequent contact with speakers of the target language through school partnerships, tandem programs, and the use of modern Internet applications such as Facebook, Skype, and YouTube are required for listening proficiency.

In this paper I have argued that it is of critical importance to have reasonable expectations with respect to proficiency levels attainable at major exit points within the German school system. The formulation of educational standards on the basis of the CEFR has refocused our attention on proficiency, the ability to do something with the foreign language skills that one has. It is important that there are no discrepancies between what is expected and what may be attained given the conditions under which foreign language learning is taking place. There is a great number of studies with respect to proficiency levels and hours of instruction that have not been taken into consideration by those responsible for developing educational standards. I have argued that these studies may provide us with preliminary hypotheses that need to be investigated empirically. The first and to date only large-scale empirical investigation into proficiency levels, including oral proficiency, attainable within the German school system, the DESI study, has major flaws in its selection of appropriate measures of foreign language proficiency, especially with respect to oral proficiency. The measure chosen, Versant English, does not appear to be either a valid or a reliable measure of oral proficiency. The results of the DESI study, therefore, fail to provide us with a first approximation of what levels are possible at grade 9, the grade level under investigation.

Despite the fact that the DESI study failed to come up with reliable CEFR levels, many of its recommendations are valid. These include the notion that students and teachers alike have to speak much more English in class; that there should be a much greater focus on listening comprehension; that there is a lack of comprehensible input in many foreign language classrooms, especially through authentic listening materials such as film, television and radio broadcasts; and, finally, that more students should be enrolled in bilingual programs. In addition, I would add, it is important that schools integrate their exchange programs and other immersion experiences much more clearly into their curriculum, that they make them a cornerstone of their foreign language programs, and that they try to find ways to make such exchange experiences available to a much greater number of students than has been the case. Only then may it be possible for more than just a few students to reach the very ambitious goals that the German educational standards have set for them.

References:

BACHMAN, L. (2000): Fundamental considerations in language testing. Oxford: Oxford University Press.

BREINER-SANDERS, K. – LOWE, P. Jr. – MILES, J. – SWENDER, E. (2000): ACTFL proficiency guidelines – speaking, revised. Foreign Language Annals, 33, pp. 13–18.

BAYERISCHES STAATSMINISTERIUM FÜR UNTERRICHT UND KULTUS (2003): Lehrplan für Gymnasien in Bayern. Wolnzach: Kastner AG.

CARROLL, J. B. (1967): Foreign language proficiency levels attained by language majors near graduation from college. Foreign Language Annals, 1/2, pp. 131–151.

CHALHOUB-DEVILLE, M. – FULCHER, G. (2003): The Oral Proficiency Interview: A research agenda. Foreign Language Annals, 36, pp. 498–506.

CUMMINS, J. (1979): Cognitive/academic language proficiency, linguistic interdependence, the optimum age question and some other matters. Working Papers on Bilingualism, 19, pp. 121–129.

DESI-KONSORTIUM (2006): Unterricht und Kompetenzerwerb in Deutsch und Englisch. Zentrale Befunde der Studie Deutsch-Englisch-Schülerleistungen-International (DESI). Frankfurt am Main: Dipf.

FULCHER, G. (1997): The testing of L2 speaking. In: C. Clapham – D. Corson (eds.), Encyclopedia of language and education, vol. 7: Language testing and assessment. Dordrecht: Kluwer, pp. 75–85.

HALLECK, G. – MODER, C. (1995): Testing language and teaching skills of international teaching assistants: The limits of compensatory strategies. TESOL Quarterly, 29, pp. 733–758.

HERZOG, M. (2003): Impact of the proficiency scale and the Oral Proficiency Interview on the foreign language program at the Defense Language Institute Foreign Language Center. Foreign Language Annals, 36, pp. 566–571.

HINCKS, R. (2001): Using speech recognition to evaluate skills in spoken English [on-line]. Working Papers 49. Papers from the Fonetik 2001. Lund: Lund University Department of Linguistics, pp. 58–61 [cit 02-10-2008]. Accessible from WWW: <http://www.speech.kth.se/~hincks/papers/fon01.pdf>.

HINCKS, R. (2002): Speech recognition for language teaching and evaluating: A study of existing commercial products. Proceedings from ICSLP 2002. Denver, Center for Spoken Language Research, pp. 773–776 [cit 02-10-2008]. Accessible from WWW: <http://www.speech.kth.se/~hincks/papers/icslp02xxx.pdf>.

HINCKS, R. (2003): Speech technologies for pronunciation feedback and evaluation. ReCALL, 15, pp. 3–20.

DE JONG, J. H. A. L. – BERNSTEIN, J. (2001): Relating PhonePass overall scores to the Council of Europe Framework level descriptors. In: P. Dalsgard – B. Lindberg – H. Benner – Tan Zheng-hua (eds.), Proceedings of Eurospeech 2001 Scandinavia, 7th European Conference on Speech Communication and Technology, pp. 2803–2806.

KENYON, D. – TSCHIRNER, E. (2000): The rating of direct and semi-direct oral proficiency interviews: Comparing performance at lower proficiency levels. Modern Language Journal, 84, pp. 85–101.

KULTUSMINISTERKONFERENZ (2006): Stellungnahme der Kultusministerkonferenz zu den Ergebnissen der Studie „Deutsch Englisch Schülerleistungen International“ (DESI). Bonn, 03.03.2006 [on-line]. [cit 02-10-2008]. Accessible from WWW: <http://www.kmk.org/aktuell/pm060303_desi.htm>.

Language Continuum (2006) [on-line]. Washington, DC: Foreign Service Institute. [cit 02-10-2008]. Accessible from WWW: <http://fsitraining.state.gov/training/Language%20Continuum.pdf>.

LISKIN-GASPARRO, J. (2003): The ACTFL Proficiency Guidelines and the Oral Proficiency Interview: A brief history and analysis of their survival. Foreign Language Annals, 36, pp. 483–490.

MAGNAN, S. (1988): Grammar and the ACTFL Oral Proficiency Interview: Discussion and data. Modern Language Journal, 72, pp. 266–276.

MAJIMA, J. (ed.) (2006): Panel discussion with Brian North, Johanna Panthier, Neil Jones, Lynne Parmenter and Lukas Wertenschlag. Osaka University of Foreign Studies Japan-Europe International Symposium. A new direction in foreign language education: The potential of the Common European Framework of Reference for Languages. Osaka, 05.03.2006 [on-line]. [cit 02-10-2008]. Accessible from WWW: <http://homepage.mac.com/jmajima1/bukosite/cef/Symposium.html>.

MALONE, M. (2003): Research on the Oral Proficiency Interview: Analysis, synthesis, and future directions. Foreign Language Annals, 36, pp. 491–497.

MIKHAILOVA, J. (2006): Description in Russian: How the syntactical complexity of description in the OPI is different from description in the SOPI. Working Papers in Slavic Studies, vol. 6. Columbus, OH: Ohio State University.

NORRIS, J. (1996): A validation study of the ACTFL guidelines and the German Speaking Test. Manoa, HI: University of Hawaii at Manoa. [Unpublished master’s thesis].

ORDINATE (2007): Versant for English Technical Manual [on-line]. Harcourt Assessment, Inc. [cit 02-10-2008]. Accessible from WWW: <http://harcourtassessment.com/hai/Images/dotCom/VersantTest/Versant_English_TechManual.pdf>.

RIFKIN, B. (2002): A case study of the acquisition of narration in Russian: An investigation at the intersection of three disciplines. Slavic and East European Journal, 46, pp. 465–482.

RIFKIN, B. (2003): Oral proficiency learning outcomes and curricular design. Foreign Language Annals, 36, pp. 582–588.

RUBIO, F. (2003): Structure and complexity of oral narratives in advanced-level Spanish: A comparison of three learning backgrounds. Foreign Language Annals, 36, pp. 546–554.

SCHNEIDER, G. (2001): Kompetenzbeschreibungen für das Europäische Sprachenportfolio. Fremdsprachen Lehren und Lernen, 30, pp. 193–214.

SCHRÖDER, K. – HARSCH, C. – NOLD, G. (2006): DESI. Die sprachpraktischen Kompetenzen unserer Schülerinnen und Schüler im Bereich Englisch. Zentrale Befunde. Neusprachliche Mitteilungen, 03/2006, pp. 11–32.

SURFACE, E. – DIERDORFF, E. (2003): Reliability and the ACTFL Oral Proficiency Interview: Reporting indices of interrater consistency and agreement for 19 languages. Foreign Language Annals, 36, pp. 507–519.

SUZUKI, M. – HARADA, Y. (2004): A common testing framework for measuring spoken language skills of non-native speakers [on-line]. IWLeL 2004: An interactive workshop on language e-learning, pp. 115–122 [cit 02-10-2008]. Accessible from WWW: <http://dspace.wul.waseda.ac.jp/dspace/bitstream/2065/1402/1/13.pdf>.

SWENDER, E. (2003): Oral proficiency testing in the real world: Answers to frequently asked questions. Foreign Language Annals, 36, pp. 520–535.

TANNENBAUM, R. – WYLIE, E. C. (2005): Mapping English language proficiency test scores onto the Common European Framework. TOEFL Research Reports, 80. Princeton, NJ: ETS. [on-line]. [cit 02-10-2008]. Accessible from WWW: <http://www.ets.org/Media/Research/pdf/RR-05-18.pdf>.

TSCHIRNER, E. (1996): Scope and sequence: Rethinking beginning foreign language instruction. Modern Language Journal, 80, pp. 1–14.

TSCHIRNER, E. (2001): Die Evaluation fremdsprachlicher mündlicher Handlungskompetenz: Ein Problemaufriss. Fremdsprachen Lehren und Lernen, 30, pp. 87–115.

TSCHIRNER, E. (2005): Das ACTFL OPI und der Europäische Referenzrahmen. Babylonia, 2/2005, pp. 50–55.

TSCHIRNER, E. (2007a): The development of oral proficiency in a four-week intensive immersion program in Germany. Die Unterrichtspraxis/Teaching German, 40, pp. 111–117.

TSCHIRNER, E. (2007b): The Common European Framework of Reference and the ACTFL Oral Proficiency Guidelines: A comparison. Paper presented at the IES Faculty Seminar: Language Placement and Assessment. Granada, 19.–22. 4. 2007.

TSCHIRNER, E. – HEILENMAN, L. K. (1998): Reasonable expectations: Oral proficiency goals for intermediate students of German. Modern Language Journal, 82, pp. 147–158.

University of Cambridge ESOL Examinations. First Certificate of English [on-line]. [cit 20-10-2008]. Accessible from WWW: <http://www.cambridgeesol.org/exams/general-english/fce.html>.

VANDERGRIFT, L. (2006): Proposal for a Common Framework of Reference for Languages for Canada. Canadian Heritage [on-line]. [cit 02-10-2008]. Accessible from WWW: <http://www.pch.gc.ca/progs/lo-ol/pubs/new-nouvelles_perspectives/tdm_e.cfm>.

WATANABE, S. (1998): Concurrent validity and application of the ACTFL Oral Proficiency Interview in a Japanese language program. The Journal of the Association of Teachers of Japanese, 32/1, pp. 22–38.

WATANABE, S. (2003): Cohesion and coherence strategies in paragraph-length and extended discourse in Japanese Oral Proficiency Interviews. Foreign Language Annals, 36, pp. 555–565.

DE WET, F. – VAN DER WALT, C. – NIESLER, T. (2007): Automatic large-scale oral language proficiency assessment. Proceedings of the eighth annual conference of the International Speech Communication Association (Interspeech) [on-line]. Antwerp, Belgium, August 2007, pp. 218–221 [cit 02-10-2008]. Accessible from WWW: <http://www.dsp.sun.ac.za/~trn/reports/dewet+vanderwalt+niesler_interspeech07.pdf>.

Appendix 1: Side-by-side listing of select descriptors of the Common European Framework of Reference (CEFR) and the ACTFL Oral Proficiency Guidelines (ACTFL)

CEFR ACTFL

A1 Has a very basic repertoire of words and simple phrases related to personal details and particular concrete situations. Shows only limited control of a few simple grammatical structures and sentence patterns in a memorized repertoire. Can ask and answer questions about personal detail.

Conversation is restricted to topics such as basic personal information, basic objects and a limited number of activities, preferences and immediate needs. NH speakers respond to simple, direct questions or requests for information. They are able to express personal meaning by relying heavily on learned phrases or recombinations of these.

NH

A2 Uses basic sentence patterns with memorised phrases to communicate limited information in simple everyday situations. Can make self understood in very short utterances, even though pauses, false starts and reformulation are very evident. Can answer questions and respond to simple statements.

Able to handle successfully a limited number of uncomplicated communicative tasks by creating with the language in straightforward social situations. Able to express personal meaning by combining and recombining into short sentences what they know. Speech is characterized by frequent pauses, ineffective reformulations and self-corrections.

IL

A2+ Can initiate, maintain and close simple, restricted face-to-face conversation, asking and answering questions on topics of interest, pastimes and past activities. Can interact with reasonable ease in structured situations but participation in open discussion is fairly restricted.

Able to handle successfully a variety of uncomplicated communicative tasks expressing personal information covering self, family, home, daily activities, interests and personal preferences as well as physical and social needs. Able to ask a variety of questions to obtain simple information to satisfy basic needs.

IM

B1 Has enough language to get by, to express self with some hesitation and circumlocution on topics such as family, hobbies and interests, work, travel, and current events. Can link a series of shorter discrete simple elements into a connected linear sequence of points.

Able to handle successfully many uncomplicated tasks and social situations requiring an exchange of basic information related to work, school, recreation, particular interests and areas of competence. Able to narrate and describe using connected discourse of paragraph length with some consistency.

IH

B1+ Can exploit a basic repertoire of strategies to keep a conversation going. Can give brief comments on others’ views during discussion. Has a sufficient range of language to describe unpredictable situations. Generally good control of structures though with noticeable mother tongue influences.

Able to handle a variety of communicative tasks, although somewhat haltingly at times. Demonstrate the ability to narrate and describe in all major time frames but control of aspect may be lacking at times. Structure of the dominant language is still evident.

AL

B2 Able to give clear descriptions and express viewpoints on most general topics using some complex sentence forms to do so. Can use a limited number of cohesive devices to link utterances into clear, coherent discourse. Shows a relatively high degree of grammatical control. Does not make errors that cause misunderstanding.

Able to handle with ease and confidence a large number of communicative tasks. Able to narrate and describe in all major time frames by providing a full account, with good control of aspect. Speech is marked by substantial flow with much accuracy, clarity and precision. Intended message is conveyed without misrepresentation.

AM

C1 Has a good command of a broad range of language to express herself clearly in an appropriate style on a wide range of general, academic or professional topics. Can produce clear, well-structured speech, showing controlled use of organizational patterns, connectors and cohesive devices. Errors are rare.

Able to consistently explain in detail and narrate fully and accurately with linguistic ease, confidence and competence. Can provide a structured argument to support their opinions and they may construct hypotheses, but patterns of error appear. Can discuss some topics abstractly but are more comfortable discussing a variety of topics concretely.

AH

C2 Can create coherent and cohesive discourse making full and appropriate use of a variety of organizational patterns and a wide range of connectors and other cohesive devices. Maintains consistent grammatical control of complex language, even when attention is otherwise engaged.

Explain their opinions on a number of topics such as social and political issues, and provide structured argument to support their opinions. Able to construct and develop hypotheses to explore alternative possibilities. Demonstrate virtually no pattern of error in the use of basic structures.

S

Appendix 2: Simplified CEFR descriptors used to link Versant English to the CEFR

C2 Can express him/herself spontaneously at length with a natural colloquial flow. Consistent grammatical and phonological control of a wide range of complex language, including appropriate use of connectors and other cohesive devices.

C1 Can express him/herself fluently and spontaneously, almost effortlessly, with a smooth flow of language. Clear, natural pronunciation. Can vary intonation and stress for emphasis. High degree of accuracy; errors are rare. Controlled use of connectors and cohesive devices.

B2 Can produce stretches of language with a fairly even tempo; few noticeably long pauses. Clear pronunciation and intonation. Does not make errors which cause misunderstanding. Clear, coherent, linked discourse, though there may be some “jumpiness”.

B1 Can keep going comprehensibly, even though pausing for grammatical and lexical planning and repair may be very evident. Pronunciation is intelligible even if a foreign accent is sometimes evident and occasional mispronunciations occur. Reasonably accurate use of main repertoire associated with more predictable situations. Can link discrete, simple elements into a connected sequence.

A2 Can make him/herself understood in very short utterances, even though pauses, false starts and reformulation are very evident. Pronunciation is generally clear enough to be understood despite a noticeable foreign accent. Uses some simple structures correctly, but still systematically makes basic mistakes. Can link groups of words with simple connectors like “and”, “but” and “because”.

A1 Can manage very short, isolated, mainly pre-packaged utterances. Much pausing to search for expressions, to articulate less familiar words. Pronunciation is very foreign.