
Language Testing

http://ltj.sagepub.com/

Examining dialogue: another approach to content specification and to validating inferences drawn from test scores

Merrill Swain

Language Testing 2001 18: 275

DOI: 10.1177/026553220101800302

The online version of this article can be found at: http://ltj.sagepub.com/content/18/3/275

  

Published by:

http://www.sagepublications.com


Examining dialogue: another approach to content specification and to validating inferences drawn from test scores

Merrill Swain University of Toronto

In this article one aspect of the many interfaces between second language (L2) learning and L2 testing is examined. The aspect that is examined is the oral interaction – the dialogue – that occurs within small groups. Discussed from within a sociocultural theory of mind, the point is made that, in a group, performance is jointly constructed and distributed across the participants. Dialogues construct cognitive and strategic processes which in turn construct student performance, information which may be invaluable in validating inferences drawn from test scores. Furthermore, student dialogues provide opportunities for language learning, i.e., opportunities for the joint construction of knowledge. It is suggested that an examination of the content of these dialogues can provide test developers with targets for measurement. Other implications for L2 testing are also discussed.

I Introduction

In this article I examine one aspect of the many interfaces between the fields of second language (L2) assessment and L2 learning: small groups and the oral interaction that occurs within them.

Small groups consist of two or more individuals. Small groups are different from interview contexts where asymmetry in the exchanges is a given, and where ‘one person is solely responsible for beginning and ending the interaction [and] for ending one topic and introducing a new topic . . . ’ (van Lier, 1989: 489). Both language educators and language assessors are interested in what individuals say to each other in small groups. I am interested because what individuals say to each other can inform us about L2 learning processes and strategies. Language assessors are interested because there are tests which evaluate the performance of individuals as they interact in pairs or small groups. Importantly, some of those tests are high-stakes tests.

Address for correspondence: Merrill Swain, The Ontario Institute for Studies in Education of the University of Toronto, 252 Bloor St. W., Toronto, Ontario, M5S 1V6, Canada; email: MSwain@oise.utoronto.ca

Language Testing 2001 18 (3) 275–302 0265-5322(01)LT208OA 2001 Arnold


Given that interaction in small groups is one point of overlap among our interests, I discuss some of the research that I am currently engaged in and the theoretical orientation within which this research is being conducted. Small groups are important to them both. My goal is to suggest some of what we, as researchers, might discover about L2 learners and about L2 test-takers by examining their dialogues as they jointly construct their performance in group activities. Specifically I raise two issues for consideration:

1) Might these dialogues generate content which could serve as new targets for measurement?

2) Might analyses of the dialogues provide additional insights for examining the validity of the inferences we draw from a test and the uses we make of them?

It is perhaps important to say at this point that the ways in which we have been examining dialogue (e.g., Swain and Lapkin, 1998) are somewhat different from the text and discourse analyses that are now quite commonly applied in our respective fields. Text and discourse analyses focus on linguistic and interactional features of speaking, rather than on its content and on the underlying cognitive and strategic processes which both generate that talk, and that that talk generates. Our literatures are rich in the use of text and discourse analyses which have been used, for example, to examine the similarities and differences of ‘test-speak’ with that used outside of the testing situation (e.g., Lazarton, 1992), or of the performances generated by different oral tasks. A large number of features such as lexical density, fluency, structural complexity and turn-taking have been examined (e.g., Shohamy et al., 1993; Young and He, 1998). Also, the ways in which we have been examining dialogue are somewhat different from those of other verbal protocol techniques (Green, 1998) in that the data we have examined are the dialogues that occur between participating individuals, not those which occur in solo think-aloud reports and retrospective accounts. What I propose could complement these analyses as a source of validity evidence.

In the next section I provide two examples of high-stakes tests which make use of small groups, along with a sampling of some of the issues that have been considered by researchers in the testing field related to small group testing. This is done as a reminder of the importance of understanding the dynamics of small group interaction to assessment practices. In Section III I introduce some ideas emanating from a theoretical perspective – a sociocultural theory of mind – which has recently been attracting interest amongst some L2 learning researchers. In Section IV I then consider aspects of our research, which is being conducted from within this theoretical framework. The purpose of this is to explore the relevance of this theoretical orientation and research to language testing.

II Small group tests and related issues

A number of high and lower stakes tests incorporate a speaking section in which two or more candidates are required to talk to each other. For example, the main suite of UCLES’s five EFL Examinations each have a speaking component that involves interaction among the candidates. The non-interview-like part of the speaking section of the First Certificate in English, which lasts for approximately seven minutes, is described in their handbook as follows: ‘The candidates are given visual prompts (photographs, line drawings, diagrams, etc.) which generate discussion through engaging in tasks such as planning, problem solving, decision making, prioritising, speculating, etc.’ (UCLES, 1996: 84) (three minutes). The examiner then encourages a discussion among the participants of matters related to the theme of the visual prompt (four minutes).

A second example among high-stakes tests comes from Hong Kong. There, performance on the Hong Kong Use of English A/S level Examination determines whether a student can gain entry into Hong Kong’s tertiary institutions. Since 1994, this test includes a two-part oral component, the second part involving groups of four students interacting in a university-like setting, replicating, for example, a small academic seminar.

There are a number of reasons why some tests now include pairs or small groups of individuals interacting together to debate an issue or to solve a problem. Dissatisfaction with the oral interview as the sole means of assessing oral proficiency and a search for other tasks that elicit different aspects of oral proficiency (Shohamy et al., 1986) are concomitant reasons. An attempt to influence teaching practices (Hilsdon, 1991) or, alternatively, to mirror teaching practices has also played an important role. Economic reasons, too, have played their part: where there are many students to be tested, it can be less expensive to test them in groups (Berry, 2000).

Given that small group testing occurs in even one high-stakes test, as well as its reasoned use, it is surprising that so little validation work has been carried out. How are scores based on interaction among participants to be interpreted as an indication of individual performance ability? Can they be interpreted as individual performance ability at all? McNamara, in his thought-provoking paper entitled ‘“Interaction” in second language performance assessment: Whose performance?’ (1997) has raised precisely these questions. His questions encompass a broad range of interactions including those between interviewer and interviewee, and those between test-raters and test-takers. The research of Lumley and Brown (1996), for example, found that features of an interviewer’s language behaviour differentially support or handicap a test candidate’s performance. The point here is that performance is not solo performance, but rests on a joint construction by the participating individuals. I return to this point below.

Other researchers have raised different questions that impact on an interpretation of scores from group testing. Fulcher (1996), for example, questioned students about their reactions to oral tasks they had recently completed. The tasks included one-on-one interviews and a group discussion. Students indicated that they felt least anxious prior to carrying out the group discussion task, and that they considered the conversation that emerged during the group discussion to be more natural than in the one-on-one interviews. These affective responses translated into performance differences. On the basis of his G-study, Fulcher (1996: 36) states, however, that ‘while task does have a significant effect upon scores, this effect is so small that it does not seriously reduce the ability to generalize from one task to another’.

Berry (2000) has examined the relationship between extraversion and performance on a group oral test. What she has found is a complex relationship between the characteristics of the test-taker and those of the rest of the group. The scores of introverts are suppressed when ‘they take part in a discussion . . . in a group with a low mean level of extraversion, and are elevated when in a group with a high mean level of extraversion’ (p. 163, section 5.6.6). The reverse holds for extroverts. Berry’s findings caution us against interpreting individual scores as reflecting underlying linguistic abilities, and support an interpretation of ‘situated performances’.

This potential for unfair biases if students of differing compatibilities (or of differing abilities) are grouped together needs further investigation. Not only is there the likelihood of real performance differences of any one test-taker in different contexts, but there may also be induced biases, i.e., because of its embeddedness in different contexts, test-raters may unintentionally judge the same performance differently. Importantly, there is also the issue of ‘whose performance it is anyway’.

III Theoretical perspective: sociocultural theory of mind

Since the 1980s the notions of ‘input’ and ‘interaction’ have played a significant role in theorizing about the essential ingredients and conditions for L2 learning (e.g., Pica, 1994; Gass, 1997). This is familiar territory for most researchers of L2 learning. What is less familiar is the program of research that we (e.g. Swain, 1995, 2000; Kowal and Swain, 1997; Swain and Lapkin, 1998; 2000a; 2000b) have been pursuing since the mid-1990s. The purpose of our research has been to discover if, and how, what I have long called ‘output’ – that is, language production – serves L2 learning.

The basic argument is that the learner’s drive to communicate successfully in the target language pushes him or her to go beyond the cognitive activity that occurs in comprehension and to engage in more complete grammatical processing. In attempting to communicate, learners will create linguistic form and meaning, and in so doing, will discover the limitations of their current system. Depending on the individuals and the circumstances, noticing a gap in their linguistic knowledge may stimulate learning processes.

Certainly, for many of the learners we have recorded as they interacted while working together on tasks (e.g., Kowal and Swain, 1997), we have observed that those learners noticed gaps in their linguistic knowledge and they worked to fill them by turning to a dictionary or grammar book, by asking their peers or teacher, by generating and testing hypotheses, or by noting to themselves to pay attention to future relevant input. Our data show that these actions generated linguistic knowledge that was new for the learner or consolidated their existing knowledge (e.g., Swain and Lapkin, 1998). Importantly, it was the attempt to communicate, as distinct from comprehending, that focused the learner’s attention on what he or she did not know, or knew only imperfectly.

This view of output is embedded in the concept of language as a communicative activity. Since about 1995, individuals such as van Lier (2000) and Kramsch (1995) have argued against the continued use of terms like those of ‘input’ and ‘output’, claiming that they limit our understanding of L2 learning to an information-processing, machine-like perspective. These terms are based on a ‘conduit’ metaphor that sees language use simply as items which are transmitted as output from one source to be received as input elsewhere. Using this metaphor, they suggest, inhibits the development of a broader understanding of language use and language learning.

This is a valid and productive critique. In accepting it and moving forward, I am engaged in reworking the notion of output to incorporate it within a view that focuses on language learning and use as dialogue – dialogue with others, and dialogue with the self – serving both communicative and cognitive functions. I have come to this position through research and reading the works of Vygotsky (e.g., 1978; 1987) and those who have further developed his theoretical stance, including psychologists (e.g., Wertsch, 1985; 1991; Cole, 1996) and applied linguists (e.g. Lantolf and Appel, 1994; Lantolf, 2000a).

There are two important points I wish to make regarding this theoretical perspective, both of which are based on one of its primary premises: the origins of higher mental processes – that is, of cognitive functioning – are primarily social.

1 Higher cognitive processes as mediated activities

The first point is that higher cognitive processes – processes such as attending, predicting, planning, monitoring and inferencing – are mediated activities whose source is the interaction that occurs between individuals. That interaction initially takes the form of dialogue between individuals. According to Wertsch (1980), strategies that an individual participates in through social dialogue develop into strategic patterns of reasoning at the cognitive level, such that at a later stage ‘the individual has taken over the cognitive responsibilities of both agents who had formerly participated in the social dialogue’ (p. 159). These strategic processes become visible in the dialogue between individuals when they jointly engage in problem-solving activity. As Lantolf (2000b) states, ‘an extremely important implication of research on mediated learning [is that] . . . attending to the talk generated by learners during peer mediation allows us access to some of the specific cognitive processes learners deploy to learn a language’ (p. 20 of manuscript).

2 Knowledge as constructed through dialogue

The second point is that knowledge is constructed through dialogue. That dialogue may be with the self; it may be with others. The important point here is that dialogue mediates the construction of knowledge; through dialogue participants co-construct knowledge. In the case of researchers of L2 learning, we are interested in the construction of linguistic knowledge. And, in fact, our own research is oriented towards demonstrating that dialogue mediates L2 learning.

Most of us have little trouble in understanding that dialogue mediates our learning of such substantive areas as mathematics, science and history. In principle the notion of dialogue mediating the learning of language is no different. When learners engage in a joint activity, particularly one in which successful communication is important, efforts to use language to solve language problems can be observed in their dialogue. I have called this sort of knowledge-building dialogue ‘collaborative dialogue’.


The implications of this perspective for language testing are twofold. Let me mention them briefly here, and return to them after discussing some relevant L2 learning research. The first implication is that because cognitive and strategic processes are made visible in dialogue, then studying that dialogue will provide us with evidence of how participants in group interaction approach and process the task demands. If understanding these strategies and processes is important to an understanding of the construct being measured, then the dialogue amongst participants will be an important source of validation evidence. The second implication is that the process of interaction and the outcome of interaction is a joint achievement, not one of individual performance.

IV Current research

I now discuss our current research in which we have been examining closely the dialogues of pairs of students as they work collaboratively on different tasks. I would like to suggest that those dialogues offer opportunities for L2 learning (that is, they are a possible source of learning), that they offer evidence of the cognitive processes and strategies learners are using, and that they provide content that can establish targets for measurement.

In our current work in L2 learning, this neo-Vygotskian, socio-cognitive perspective has meant that we have designed our research to provide opportunities for students to work together in pairs on different tasks. We have been particularly interested in what students say to each other when they encounter a linguistic problem as they are doing the task: how do they go about solving it? And if they solve it, what evidence can we present that their solution provided an occasion for language learning? Learning is understood to be a continuous process of constructing and extending meaning that occurs during learners’ involvement in situated joint activities (Halliday and Matthiessen, 1999; Wells, 1999).

In our most recent studies (Swain and Lapkin, 1998; 2000a; 2000b), we had two main questions we wished to investigate. Our first question was whether the dialogues of the students were, indeed, a source of L2 learning. Our second question related to the attentional focus and cognitive processes that different tasks engendered. We wanted to know if one type of task would lead our students to focus more on language form than would another type of task.

We worked with two Grade 8 (aged 13–14) mixed-ability French immersion classes from the same school.¹ These two classes of Grade 8 Anglophone students had been in a French immersion program since kindergarten. Until Grade 3, all instruction was in French; thereafter, English language arts was introduced, and from about Grade 5 on, approximately 50% of instruction was in French and 50% in English.

Table 1 Stages of the research and time frame

Week 1
• pretest developed from pilot study was administered

Week 2
• informal instructor-led training session
• task done in pairs (focus on adjective agreement)

Week 3
• video-taped lesson; instructions; and modelling of task performance
• task done in pairs and tape-recorded (focus on reflexives)

Week 4
• tapes transcribed and class-specific posttest items developed

Week 5
• posttest was administered

Table 1 shows the stages of the research, and the time frame. Data collection took place over a five-week period. Because one of our goals was to trace the learning that occurred in the student dialogues, we administered both a pretest and posttest. The posttest consisted of the pretest plus items we constructed based on the language-related episodes (see definition below) occurring in the student dialogues. In other words, we used the content of the language-related episodes as the basis for the construction of new, additional posttest items. This was necessary because of our research question as to whether the student dialogues were, in fact, a source of L2 learning. We had found from previous research that in tasks like the jigsaw and dictogloss (see below for descriptions), we could not predict ahead of time precisely what aspects of language form and meaning each pair of students would focus on, no matter how precisely we thought we had structured the task, and no matter to what extent we intervened with such manipulations as, for example, using videoed mini-lessons. In other words, as each pair of students progressed through the task, they did so focusing on linguistic aspects that the story they created, created for them.

¹ The research was conducted during regular classroom time. In one class (30 students), pairs of students worked together on the dictogloss task; in the other class (35 students), pairs of students worked together on the jigsaw task (see below for description).


This meant that any pre-designed posttest we might have administered would misrepresent the knowledge we believed our learners would attain. This is the reason that in week 4 we transcribed the tapes of the students and constructed test items based on their dialogues. It is in this sense that I raised the question at the beginning of this article of whether the dialogues of test-takers doing group oral tasks might serve as the basis for developing new targets for measurement.

In order to be able to develop the new posttest items quickly, we had decided ahead of time on the format of the test items we would use. Two examples are given in Figure 1. The first example shows a multiple-choice item, and the second shows two related grammaticality judgement items. These items were, in fact, developed based on two of the language-related episodes (Examples 1 and 2) discussed below.

Figure 1 Two test item types: multiple-choice and grammaticality judgement


As can be seen in Table 1, during the second week, we conducted a session to familiarize the students with the particular type of task they would be doing again the following week when we tape-recorded them. To focus students’ attention on at least one aspect of French that has proven problematic for French immersion students and that was highlighted in the task itself, a mini-lesson on adjective agreement was given. Then the students did the task.

In the third week, the grammatical point focused on was the reflexive verb. A pre-recorded mini-lesson on French reflexive verbs was shown on video. The video also showed two students working together on a similar task, serving as a model for what the students were to do following the viewing of the video. The talk of each pair of students in the class was tape-recorded as they did their task. The stories the students wrote were collected. Later they were rated on a five-point scale for each of content, organization, vocabulary, morphology and syntax. The stories produced by each pair of students were scored by two experienced French immersion teachers. The two sets of ratings for each writing sample were averaged. For the descriptors of the five scales, see Swain and Lapkin, 2000a.
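The averaging of the two teachers' ratings can be sketched in a few lines of Python. This is only an illustration, not the procedure or software used in the study. The function name `average_ratings`, the assumed score range of 1 to 5, and the example scores are all hypothetical; only the five scale names and the use of two raters come from the description above.

```python
# Illustrative sketch only: names, score range (1-5) and data are assumptions.
SCALES = ("content", "organization", "vocabulary", "morphology", "syntax")


def average_ratings(rater_a, rater_b):
    """Return the mean of two raters' scores for each of the five scales."""
    for ratings in (rater_a, rater_b):
        for scale in SCALES:
            score = ratings[scale]
            # Assumed range: a five-point scale running from 1 to 5.
            if not 1 <= score <= 5:
                raise ValueError(f"{scale} score {score} is outside 1-5")
    return {scale: (rater_a[scale] + rater_b[scale]) / 2 for scale in SCALES}


# Invented scores for the story written by one pair of students.
rater_a = {"content": 4, "organization": 3, "vocabulary": 4,
           "morphology": 2, "syntax": 3}
rater_b = {"content": 5, "organization": 3, "vocabulary": 3,
           "morphology": 3, "syntax": 3}

averaged = average_ratings(rater_a, rater_b)
for scale in SCALES:
    print(f"{scale}: {averaged[scale]}")
```

Keeping the five scales separate, rather than summing them into one total, mirrors the description above, in which each writing sample receives a rating on each scale.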

We used two contrasting tasks in the study: a jigsaw task and a dictogloss task. The jigsaw task, a task in which each member of a pair of students holds information the other does not, is shown in Figure 2. One student in each pair held pictures numbered 1, 3, 5 and 7 and the other held those numbered 2, 4, 6 and 8. The students were required to construct the story told by the pictures by looking only at the cards each held. Typically the students worked through the cards sequentially, alternately telling each other what was in their pictures. Then together they wrote out the story. Pica et al. (1994) suggest that this type of two-way information exchange task is thought maximally to foster negotiation of meaning, a condition hypothesized to increase comprehensible input and therefore enhance L2 learning.

The dictogloss task we used is shown in Figure 3. The text of this dictogloss was read twice in French at normal speed to the students. Each student took notes on its content while they listened to the passage being read. Then pairs of students, using their notes, worked together to reconstruct the passage in writing. Teachers we had worked with had tried out dictogloss tasks in action research in their own classrooms. They had found that it provided a context in which students not only negotiated meaning, but also focused on linguistic form in context (Kowal and Swain, 1994; 1997).

Our intent was that the two tasks be as similar and comparable as possible in terms of content, so we constructed the dictogloss text from the pictures in the jigsaw task. We showed the series of eight pictures to three adult native speakers of French and asked them to narrate the story they saw unfolding. Combining their transcribed narratives gave us the text we used for the dictogloss (Figure 3).

Figure 2 Jigsaw task
Source: ‘The tricky alarm-clock’ I. Baltova 1994

Le réveil-matin de Martine

Il est six heures du matin et le soleil se lève. Martine dort tranquillement dans son lit. Elle fait de beaux rêves, la tête au pied du lit et les pieds sur l’oreiller. Quand le réveil sonne, Martine ne veut pas se lever. Elle sort son pied et avec le gros orteil, elle ferme le réveil. Elle se rendort tout de suite. Mais elle a le réveil qu’il faut pour ne pas être en retard. À six heures et deux minutes, une main mécanique tenant une petite plume sort du réveil et lui chatouille le pied. C’est efficace. Finalement Martine se lève. Elle se brosse les dents, se peigne les cheveux et s’habille pour prendre le chemin de l’école. Encore une journée bien commencée.

Translation of dictogloss task

It’s 6 a.m. and the sun is rising. Martine is sound asleep in her bed. She’s having sweet dreams, her head at the foot of the bed and her feet on the pillow. When the alarm clock rings, Martine doesn’t want to get up. She sticks her foot out and with her big toe she shuts off the alarm. She falls asleep again immediately. But she has the kind of alarm clock you need to prevent being late. At 6:02, a mechanical hand holding a small feather comes out of the alarm clock. It tickles her foot. To good effect! Finally Martine gets up. She brushes her teeth, combs her hair and gets dressed to go to school. Another great start to the day!

Figure 3 Dictogloss text and translation

Our working hypothesis about cognitive processes the tasks would engage was that although both tasks provided opportunities for the negotiation of meaning, the dictogloss task would lead students to focus more on linguistic form than the jigsaw task. As both tasks involve using language communicatively, we predicted that the additional focus on form which we thought the dictogloss task would promote would enhance the learning opportunities for the students who did that task. Thus, when this prediction was not borne out as revealed by the posttest results, an examination of the dialogue of the students became crucial in helping us understand why.

Because of our particular interest in collaborative dialogue – that dialogue which ensued when students encountered a linguistic problem and worked jointly to solve it – we identified in the transcripts all instances of language-related episodes. A language-related episode – an instance of collaborative dialogue – is a unit of analysis that emerged from our data, and that we defined as any part of a dialogue where students talk about the language they are producing, question their language use, or other- or self-correct their language production. Language-related episodes (LREs) thus entail discussion of meaning and form, but may emphasize one of these more than the other. In our analysis we distinguished ‘lexis-based’ and ‘form-based’ language-related episodes. Lexis-based LREs involve searching for French vocabulary and/or choosing among competing French vocabulary items. Form-based LREs involve focusing on spelling or an aspect of French morphology, syntax or discourse. (Conferencing among the research team members achieved consensus in identifying and classifying LREs.)

An example of a lexis-based LRE is shown in Example 1. Kim and Rick are two Grade 8 French immersion students. Here they are engaged in doing the jigsaw task; they are working on writing out the part of the story illustrated in picture 2 of Figure 2.

Example 1

1 Kim: Quelque chose uh . . . est sur l’ . . . quelque chose est sur l’oreiller.
(Something uh . . . is on the . . . something is on the pillow.)

2 Rick: Is that l’oreiller? [pointing to something in picture 2]
(Is that the pillow?)

3 Kim: No, this is l’oreiller. [pointing to it]
(No, this is the pillow)

4 Rick: Pillow?

5 Kim: Yeah, pillow’s oreiller.
[Yeah, the French word for pillow is oreiller.]

(from Swain and Lapkin, 1998)

In turn 1 Kim is working out what they might write down about this picture. However, Rick is not sure of the meaning of what Kim has said. Specifically, Rick is unsure about the meaning of l’oreiller and, in turn 2, Rick seeks to clarify its meaning by pointing to something in the picture and asking whether what he is pointing to is in fact an oreiller. His shift to English is significant; he could certainly ask his question in French. His shift allows him to focus on the French lexical item of importance here: l’oreiller; and his use of English frames the French word and holds it up for attention and reflection. At this point, both the French word and its referent in the picture are in focus, and remain so in turn 3. In turn 3, Kim tells Rick that what he has pointed to is not a pillow while at the same time pointing to the correct referent. Here l’oreiller is again highlighted through its embedding in an English sentence. In turn 4, Rick appears to be making the essential connection: that l’oreiller means pillow, which Kim confirms in turn 5.

In the L2 acquisition literature, this example would constitute a classic example of negotiation of meaning, where a confirmation request by Rick leads to input that is made comprehensible for him. This is hypothesized as being a necessary condition for learning to occur. However, what is going on here is more and different than that: it is a collaborative venture. Two students are engaged in a dialogue in which the meaning of a lexical item is concretized, used, translated into the native language and back again into the target language, and learned. This dialogue is not ‘enhancing’ learning, or leading to learning, it is learning. In the posttest item (shown in Figure 1), Kim and Rick both correctly checked l’oreiller. It is from data of this sort that we have inferred that learning occurred during the dialogues (LREs). Rick learned the meaning and perhaps the lexical item for ‘pillow’. For Kim, this LRE perhaps served to consolidate previous learning. In this brief LRE, we see evidence of the processes and strategies in which the students engage. Rick generates a hypothesis – ‘Is that l’oreiller?’ – and has his hypothesis disconfirmed, being provided with a correct solution – ‘No, this is l’oreiller’. What is going on here is much more than just input and output.

In Figure 4, items from Purpura’s (1998) taxonomy of cognitive strategies which are made visible in this short dialogue between Kim and Rick are shown. There is evidence of clarifying (line 1), verifying (line 3), translating (lines 4 and 5), inferencing (line 2), associating (lines 2 and 4), and linking with prior knowledge (entire LRE). These are strategies which occur dialogically here. We can see here an instantiation of Wertsch’s (1980) point that a large share of strategic activity in daily life has ‘distributed responsibility’. Our interpretations about the cognitive processes apparent in the dialogues would be strengthened through the collection of additional data (e.g., asking the participants what they thought about what they were doing). In our current research (e.g., Swain and Lapkin, 2001), we have done so.

Comprehending processes:

• Clarifying/verifying
– I try to improve my English by asking other people to tell me if I have understood or said something correctly.

• Translating
– When I’m learning new material in my second language (L2) I translate it into my native language.
– I learn new words in my L2 by translating them into my own language.

• Inferencing
– I try to improve my listening in my L2 by listening for the important words to help me understand better.
– I try to improve my listening in my L2 by guessing the meaning of new words from the situation.

Storing or memory processes:

• Associating
– I learn new words in my L2 by connecting the sound of any new word with an image or picture of it to help me remember it.

• Linking with prior knowledge
– When I’m learning new material in my L2 I try to connect what I am learning with what I already know.
– I try to improve my listening in English by using what I know from other situations to help me understand what is being said.

Figure 4 Items from Purpura’s (1998) cognitive strategies questionnaire

An example of a form-based LRE is shown as Example 2. This dialogue also occurred between two Grade 8 French immersion students doing the jigsaw task while they were writing out the story, and in reference to picture 8 (Figure 2):

Example 2

1 S1: Yvonne va à l’école.
(Yvonne goes to school.)

2 S2: Se part à l’école.
(Yvonne leaves [uses non-existent reflexive form] for school.)

3 S1: Oui. Elle . . . se marche.
(Yes. She . . . walks [uses non-existent reflexive form])

4 S2: Se part, parce que . . .
(Leaves [uses non-existent reflexive form], because . . . )

S2: Est-ce que c’est part ou se part?
(Is it leaves or leaves [in the non-existent reflexive form]?)

5 S1: Part.
(Leaves.)

6 S2: Part? Just part?
(Leaves? Just leaves?)

7 S1: Yeah.

8 S2: Ok. Yvonne part à l’école.
(Ok. Yvonne leaves for school.)

(from Swain and Lapkin, 2000a)

What is going on in this dialogue? It is important to remember, in interpreting this LRE, that these students watched on video a short mini-lesson on reflexive verbs just prior to doing the task. The effect of this on many of the students was to lead them to overgeneralize and use reflexive verbs where it is not possible to do so in French. This appears to be the case in this LRE. In turn 1, S1 proposes that they write Yvonne va à l’école. S1 chooses to use the all-purpose verb aller (‘to go’), a verb whose meaning and form she knows very well. However, her partner, S2, perhaps because she remembers the instructions to try to use reflexive verbs, suggests using a more specific verb, partir (‘to leave’), providing it, in turn 2, in the reflexive form. Se partir, however, is a non-existent form in French. Whether S1 realizes this or not, she appears not to like her partner’s suggestion and so offers yet another alternative in turn 3: se marche. The verb marcher also cannot be used in the reflexive form in French. S2, in turn 4, returns to her original suggestion of using se part, is about to explain why she wants to use se part, and then demonstrates her uncertainty with the verb form by asking: Est-ce que c’est part ou se part? (‘Is it leaves without the reflexive pronoun, or is it leaves with the reflexive pronoun?’), that is, ‘Is it part or se part?’. S1, in turn 5, responds by saying that the correct form is part, that is, the correct form is without the reflexive pronoun. Still unsure, S2, in turn 6, asks Part? Just part? (‘Leaves, just leaves without the reflexive pronoun?’). Note here, again, the switch to English – Just part? – which once more has the effect of highlighting and focusing attention on the verb form in question. In turn 7, S1 reassures her partner that part is the correct form and so, in turn 8, S2 uses the form correctly and completes the sentence they had started to write back at turn 1. In the posttest item shown in Figure 1, S1 and S2 accurately stated that the first sentence (Les garçons partent pour l’école) was ‘certainly correct’ and that the second sentence (Les garçons se partent pour l’école) was ‘certainly incorrect’, evidence of the learning that occurred during the dialogue.

In this LRE, the issue is not one of comprehension as in the previous example. Here retrieval (e.g., word repetition) and production processes play more of a role. Additionally, we see ample evidence of hypothesis generation and testing. The students are trying to find the correct and best way to express their intended meaning. What they wrote down – Yvonne part à l’école – is a more precise and sophisticated way to express what they wanted to say than what they started with. How they got there is clearly a collaborative effort, and the question ‘whose performance is it anyway?’ is a good question to ask.

The purpose of having given these two examples of language-related episodes was four-fold. First, I wanted to show how rich even these very brief dialogues are in informing us of the mental strategies and processes students use in performing these tasks. Secondly, I wanted to provide them as examples of the two ways in which we classified the language-related episodes that occurred in our data: as lexis-based LREs or as form-based LREs. We used LREs as a unit of analysis because we had set up the tasks with the expectation that they would lead students to focus their attention on form and meaning differentially across tasks. Thirdly, because we used LREs as our unit of analysis in attempting to understand the strategies the students used in carrying out the tasks, it is important to see examples of LREs, so as to better understand the findings. And, fourthly, LREs were used as the basis for developing posttest items (see above).

The results of our analysis are shown in Table 2. There were no statistically significant differences between those doing the dictogloss task and those doing the jigsaw task in the number and percent of lexis-based LREs, nor in the number and percent of form-based LREs. In other words, quite different from our expectations, students responded similarly to the two tasks with respect to the attention they paid to form and meaning. The very reason for our having developed the tasks in the way that we did was, in effect, not confirmed. We could not have known that but for our examination of the students’ dialogues. Examining their dialogues in this way led us to rethink the nature of our tasks. In the end, we believe that the fact that both tasks involved producing a ‘polished’ written product in the target language led both sets of students to focus equally on language form (Swain and Lapkin, 2000a).

Table 2 Language-related episodes (LREs)ᵃ

                             Class J               Class D
                             N    M     s.d.       N    M     s.d.     Sig.ᵇ
Count of total episodes      12   8.8   8.0        14   9.2   4.2      ns
Count of lexis-based LREs    12   4.0   3.7        14   3.7   2.3      ns
Count of form-based LREs     12   4.8   4.5        14   5.5   2.9      ns
Percent lexis-based LREs     12   41%   21%        14   40%   19%      ns
Percent form-based LREs      12   59%   21%        14   60%   19%      ns

Notes: ᵃ A language-related episode is any part of a dialogue where students talk about the language they are producing, question their language use, or other- or self-correct. ᵇ Two-tail t-test, p < 0.05. Class J: pairs of students who did the jigsaw task. Class D: pairs of students who did the dictogloss task.
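As a rough check on the ‘ns’ entries in Table 2, the comparison can be recomputed from the published summary statistics alone. The sketch below is an illustration, not the authors’ analysis: the article says only ‘two-tail t-test’, so the pooled-variance (Student) form is an assumption, as are the hypothetical function name `pooled_t` and the use of the tabulated two-tailed .05 critical value of 2.064 for 12 + 14 − 2 = 24 degrees of freedom.

```python
import math

# (N, mean, s.d.) for Class J (jigsaw) and Class D (dictogloss), copied from Table 2.
TABLE_2 = {
    "total episodes": ((12, 8.8, 8.0), (14, 9.2, 4.2)),
    "lexis-based LREs": ((12, 4.0, 3.7), (14, 3.7, 2.3)),
    "form-based LREs": ((12, 4.8, 4.5), (14, 5.5, 2.9)),
    "percent lexis-based": ((12, 41.0, 21.0), (14, 40.0, 19.0)),
    "percent form-based": ((12, 59.0, 21.0), (14, 60.0, 19.0)),
}

# Two-tailed critical value of Student's t at alpha = .05 with 24 df (from t tables).
T_CRIT_DF24 = 2.064


def pooled_t(group_a, group_b):
    """Student's t from two (n, mean, sd) summaries; pooled variance is an assumption."""
    n1, m1, s1 = group_a
    n2, m2, s2 = group_b
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_err


results = {name: pooled_t(j, d) for name, (j, d) in TABLE_2.items()}

for name, t in results.items():
    verdict = "significant" if abs(t) > T_CRIT_DF24 else "ns"
    print(f"{name:22s} t = {t:6.2f}  {verdict}")
```

Under these assumptions every |t| falls well below the critical value, which agrees with the ‘ns’ column. Because the standard deviations of the two classes differ noticeably, a Welch-type test that does not pool the variances might be a more cautious choice; it is not shown here.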

An interesting feature of the data that appear in Table 2 is that the standard deviations of Class D – consisting of the pairs of students who did the dictogloss task – are, in general, considerably smaller than those of Class J – consisting of the pairs of students who did the jigsaw task. Levene’s test for equality of variances showed that the range in the total number of LREs was smaller for Class D than Class J (p < .05), suggesting that the dictogloss task constrains student responses to a greater degree than the jigsaw task (Swain and Lapkin, 2000a).² I return to this point below.

² Of course, it is also possible that the composition of the students constrained the range of responses as opposed to the characteristics of the tasks. However, all students were given a French cloze pretest to establish comparability of the two classes. There were no statistical differences (p > .05) between the average scores of the two classes. Also, the two classes were described by their teachers and the researcher who collected the classroom data as ‘interchangeable’.

To summarize up to this point, we have so far seen that the students’ dialogues:

1) provide opportunities for knowledge construction and thus can be a source of learning;

2) can serve as the basis for developing test items; and

3) make visible some of the cognitive and strategic activities of learners as they jointly undertake a task.

We used language-related episodes as a unit of analysis to understand whether the two tasks we used led to a differential focus on language form and meaning.

We also conducted a separate analysis focusing on the students’ use of English, their first language (L1), in performing the two tasks. Our goal in doing so was to uncover the strategic purposes for which English was used, and to discover whether the two tasks led to the differential use of English. Information about how and why the students used their L1 can provide us with insights about the tasks and the students’ final product (i.e., their written stories).

Based on what the students said in English, we developed a set of categories (Swain and Lapkin, 2000b). These categories were influenced by our theoretical orientation: students’ use of L1 would serve strategic purposes as a cognitive tool, mediating task performance (see also Anton and DiCamilla, 1998). We discovered that students used their L1 for three principal purposes: (1) to move the task along; (2) to focus attention; and (3) for interpersonal interaction. All instances of English use were accounted for within this rubric (for details of how the coding scheme was developed, see Swain and Lapkin, 2000b).

What we mean by ‘moving the task along’ is that students used English to, for example, figure out what they were supposed to be doing in the task, figure out the order of events in the story (particularly in the dictogloss task), or develop an understanding of the story. Example 3 is illustrative.

Example 3

J1: Is that a foot? Yeah, ok, it’s a foot.

(from Swain and Lapkin, 2000b)

In this example, as in many others, it is clear that the student is externalizing his own internalized dialogue: he asks a question and answers it himself. The presence of his partner may be the reason he spoke out loud. However, what he says is also for himself as he works out an understanding of one of his pictures (picture 2) in the jigsaw task. Examples such as these make obvious the mediating role of dialogue in problem solving, and reflect the very process of comprehension the student is undergoing. As can be seen in Table 3, we found that students who did the dictogloss task were twice as likely as the students who did the jigsaw task to use their L1 to develop an understanding of the task: 22% of the turns in which English was used vs. 10% respectively.

As they talked, students also used their L1 to focus attention. By this we mean that the students used their L1 to search for vocabulary, to focus on language form, to retrieve grammatical information, etc. An example of a search for vocabulary is shown in Example 4.

Example 4

1 J1: Et elle est tickelee. How do you say ‘tickled’?
(And she is tickled. How do you say ‘tickled’?)

2 J2: Chatouillée.
(Tickled.)

3 J1: Ok. Chatouillée, chatouillée. How do you say ‘foot’?
(Ok. Tickled, tickled. How do you say ‘foot’?)

4 J2: Le pied.
(Foot.)

5 J1: Ah, chatouillée les pieds.
(Ah, tickled her feet.)

(from Swain and Lapkin, 2000b)

Here we see one student, J1, searching for lexical items. She uses English to identify and search for the words she needs to construct a phrase in French. In Purpura’s terms, her repetition of chatouillée in turn 3 is an example of a ‘storing or memory process’, something that she doesn’t need to do with the French word for ‘foot’, a word that is certainly known to her. (Not only do we see here the retrieval and storage of lexical items in this dialogue, but we can also see the way in which the phrase chatouillée les pieds (‘tickled her feet’) was constructed – incorrectly – on a word-by-word basis.) As shown in Table 3, the jigsaw students were twice as likely to use English to search for vocabulary as the dictogloss students: 27% of the turns in which English was used vs. 14% respectively. (No differences were statistically significant.)

Table 3 Mean percentage of English turns used for understanding and vocabulary search by task

Use of L1              Task
                       Jigsaw    Dictogloss
Understanding          10        22
Vocabulary search      27        14
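The ‘twice as likely’ comparisons can be read off Table 3 as ratios of the mean percentages. The following sketch only restates that arithmetic; the dictionary layout and variable names are illustrative and do not come from the article.

```python
# Mean percentage of English (L1) turns by purpose and task, copied from Table 3.
table_3 = {
    "understanding": {"jigsaw": 10, "dictogloss": 22},
    "vocabulary search": {"jigsaw": 27, "dictogloss": 14},
}

# Dictogloss pairs relative to jigsaw pairs, for developing an understanding.
understanding_ratio = (
    table_3["understanding"]["dictogloss"] / table_3["understanding"]["jigsaw"]
)

# Jigsaw pairs relative to dictogloss pairs, for vocabulary search.
vocabulary_ratio = (
    table_3["vocabulary search"]["jigsaw"] / table_3["vocabulary search"]["dictogloss"]
)

print(f"understanding: dictogloss/jigsaw = {understanding_ratio:.2f}")    # 2.20
print(f"vocabulary search: jigsaw/dictogloss = {vocabulary_ratio:.2f}")   # 1.93
```

Both ratios are close to 2, which is the sense in which each group was ‘twice as likely’; the differences themselves were not statistically significant.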

These data suggest that the dictogloss and jigsaw tasks made quite different processing demands on the students, even though both required the students to produce a written story. The dictogloss task made more demands on the students’ abilities to comprehend and remember than did the jigsaw task; whereas the jigsaw task made more demands on the students’ productive abilities than did the dictogloss task. As I pointed out earlier, we had developed these two tasks to be as similar as possible in content, with the expectation that there would be a differential focus on form and meaning. Our analysis of the dialogues demonstrated that this was not the case. However, the analysis of the use of English in the student dialogues suggests that an important variable was the nature of the stimulus materials used: textual vs. visual. On the one hand, the dictogloss task provided the students with a French text, thus providing a set of French vocabulary and structures that were drawn on to do the task. However, the students could not proceed with the task until they had made sense of the French text they had heard. Thus student comprehension processes were ‘taxed’. The jigsaw pictures, on the other hand, although easy to understand, provided no language model. So, for the jigsaw students, processes of lexical retrieval and the creation of linguistic structures were of greater importance. Of course, to state these conclusions unequivocally, an analysis of what functions French was used for would also be necessary.

Perhaps particularly intriguing in examining the use of English in the students’ dialogues is our finding that task performance interacted with student proficiency. As shown in Table 4, for those students whose written stories were judged as above the median rating on the scale of language (an average of the scale ratings on vocabulary, morphology and syntax), the amount of English use was approximately the same across tasks: 15% and 18%. However, for those whose written stories were judged as below the median rating on the language scale, considerably more English was used by the jigsaw students than the dictogloss students: 41% vs. 25%. The same pattern is shown for the content scale. The dictogloss task, in effect, ‘evened the playing field’ for students with respect to proficiency. We think that the dictogloss ‘evened the playing field’ because of the nature of the stimulus (input) material: a target language text.

Additionally, the performance of the jigsaw students relative to the dictogloss students was ‘all over the map’. Although no statistically significant differences were found between the average ratings of the dictogloss and jigsaw students on the content, organization, vocabulary and syntax scales, the standard deviations for each of these measures were much greater for the jigsaw students than for the dictogloss students, significantly so in the case of the vocabulary ratings. As I stated earlier, we also found that there was more variation in the number of language-related episodes produced by the jigsaw pairs than the dictogloss pairs. It would appear, then, that the two tasks we used have the potential for creating more or less variation in the learners’ performance.

Table 4 Percentage of first language turns by student dyads who are above or below median language and content ratings on their written stories

                  Jigsaw                                       Dictogloss
Story rating      Number     Mean     s.d.   Percent           Number     Mean     s.d.   Percent
                  of pairs   rating          of L1 turns       of pairs   rating          of L1 turns
Languageᵃ
  Above median    5          4.0      .55    15                6          3.2      .42    18
  Below median    5          2.6      .79    41                6          1.9      .66    25
Content
  Above median    5          4.0      .71    15                7          3.0      .57    20
  Below median    5          2.0      .71    41                5          1.6      .55    22

Note: ᵃ Average of ratings for vocabulary, morphology and syntax.

V Summary

To sum up and pull together the various threads of this article, a few preliminary comments are first in order. In this article, I have discussed the student dialogues in two respects: (1) the information we gleaned concerning the cognitive and strategic processes the tasks invoked and (2) the information they provided, as the locus of language learning, about possible targets for the measurement of task outcomes. I have not discussed explicitly the issue of dialogue as the focus of measurement itself. Many scales for rating oral proficiency have been developed; others, like Skehan (1998), have applied measures of fluency, accuracy and complexity; still others, like Young (2000), have examined aspects of interactional competence. What I have discussed in this article could also be applied directly to the dialogues that occur in oral group testing as a measure of the strategic and cognitive uses of an L2, aspects of language performance that are surely crucial in problem-solving tasks, and tasks that attempt to simulate academic linguistic performance. That said, I now summarize and contextualize the points I have made.

First, I have suggested that one point of contact between the fields of L2 learning and testing is what happens in small groups. For language testers, it is important to be able to measure accurately the performance of test-takers interacting in a small group setting. What I have suggested is that, in a group, the performance is jointly constructed and distributed across the participants. We have seen that dialogues construct cognitive and strategic processes which in turn construct student performance. One implication for testing is, minimally, that serious thought needs to be given to the most adequate and fair means of scoring the linguistic activity and its product that derives from group interaction. It also means that in a testing situation, who one is paired or grouped with is not unimportant.

Secondly, our research has shown that the dialogues of student dyads can be a source of learning, of the joint construction of knowledge. Language testers, too, are interested in language learning. One aspect of measuring group performance might be to measure the knowledge that has been jointly constructed. What seems fairly certain is that an examination of the content of the dialogues of test-takers could provide test-developers with targets for measurement, should thinking about the outcomes of group performance in this way be productively pursued. I imagine that thinking further along these lines might most productively be pursued within group performances where the task is tightly specified, and an attempt to replicate academic and problem-solving contexts is at issue.

Thirdly, there are many situations in which both L2 learning and testing researchers might find it useful to understand the cognitive and strategic processes underlying performance. One example from the L2 learning literature is Wesche and Paribakht’s (2000) study. Here, they report on their study of the acquisition of word knowledge by students as the students carried out different types of text-based vocabulary exercises, any of which could serve as test items. Each exercise was expected to promote learning of different aspects of word knowledge. Wesche and Paribakht used think-aloud and retrospective reports to uncover the learners’ cognitive processes. Their conclusion is that ‘learners tended to work from the principle of minimal effort . . . they did not necessarily follow all the instructions provided or engage themselves in the mental processes envisaged’ (2000: 207). This conclusion is important for understanding lexical acquisition and is also important for language testers to consider.

The use of qualitative methods for seeking validation evidence has increased in recent years. As Banerjee and Luoma (1997: 276) point out, ‘Qualitative validation techniques can provide information on the content of the test, the properties of the test tasks, and the processes involved in test taking and assessing.’ These techniques include expert judgement, introspective (including think-aloud protocols) and retrospective accounts of test-takers and test-raters, interviews, and text and discourse analysis of performance. Each of these has a rich theoretical and research literature supporting its use.

The recording and examination of the dialogue of individuals jointly doing a task provides test-developers and test-researchers with additional insights to aid in the interpretation of test scores and to make recommendations about appropriate uses. In the examples I have given, I have shown that, by examining the students’ dialogues, assumptions we made about how task performance would be achieved were shown to be wrong; that different tasks differentially engaged comprehension and production processes; and that one task constrained the use of certain cognitive and strategic processes in ways in which the other task did not. These are important things to know about tasks, whether the tasks are to be used for research purposes, pedagogical purposes or testing purposes.

Green (1998: 49), writing about verbal protocol analysis, states that:

Under some circumstances, reports generated by two individuals working on a task can be useful and can serve to make explicit information that might not be apparent within a protocol generated by an individual working alone on a task . . . The difficulty with paired reports is that the presence of another individual changes the way in which the task would be approached by an individual working alone on that task. Two individuals working together on a task interact, and each modifies the behaviour of the other. The manner in which the task is solved by a pair may differ enormously from the way in which either individual might solve the task alone.

In response to this, I have three points to make. First, Green’s claims are, in fact, empirical questions, and they should be investigated. Secondly, investigating these questions in the context of small group vs. individual oral testing would be a useful validation endeavour. Thirdly, I do not think that these investigations should stop with oral testing: they can be studied in the context of solo or joint writing, as we have done; or in the context of solo or joint understanding of the meaning of a written or spoken text. To consider these joint activities as tasks for tests would fit into current ideas about integrating language skills in test-tasks (e.g., Chapelle, 1998). They would also more faithfully mirror regular, daily classroom and non-classroom activity. Furthermore, because students have reported less anxiety in group situations and because, in other disciplines at least (e.g., in mathematics), group performance has been superior to individual performance (Webb, 1993), group testing just might be, under the right circumstances, a means of ‘biasing for best’ (Swain, 1983).

What seems certain is that research in L2 learning and research in L2 testing have much in common (Bachman and Cohen, 1998). Both sets of researchers need to understand and measure language performance in groups and we both need to know how to interpret that performance, which includes understanding the processes and strategies underlying that performance. In the final analysis, both sets of researchers need to know that whoever is doing the task is engaging in construct-relevant processes while doing so. This is why we must pay serious attention to each other’s theories and research.

Editors’ comments

This article was the Samuel Messick Memorial Lecture and opening plenary address at the 22nd Annual Language Testing Research Colloquium held in Vancouver, British Columbia in March, 2000. Although invited for publication in Language Testing, the paper was also peer reviewed in accordance with normal practice. The editors believe that the SLA perspective it offers represents an important contribution to thinking in language testing. The theme of the conference was ‘Interdisciplinary interfaces with language testing’.

Acknowledgements

The author would like to thank the following people for their valuable feedback on earlier drafts of this paper: Lindsay Brooks, Andrew Cohen, Alister Cumming, Jean Handscombe, Jim Lantolf, Sharon Lapkin, Tim McNamara and Helen Moore.

VI References

Anton, M. and DiCamilla, F. 1998: Socio-cognitive functions of L1 collaborative interaction in the L2 classroom. The Canadian Modern Language Review 54, 314–42.

Bachman, L.F. and Cohen, A.D., editors, 1998: Interfaces between second language acquisition and language testing research. Cambridge: Cambridge University Press.

Banerjee, J. and Luoma, S. 1997: Qualitative approaches to test validation. In Clapham, C. and Corson, D., editors, Encyclopedia of language and education, Volume 7: Language testing and assessment. Dordrecht: Kluwer Academic, 275–87.

Berry, V. 2000: An investigation into how individual differences in personality affect the complexity of language test tasks. Unpublished PhD thesis, King’s College, University of London.

Chapelle, C. 1998: Construct definition and validity inquiry in SLA research. In Bachman, L.F. and Cohen, A.D., editors, Interfaces between second language acquisition and language testing research. Cambridge: Cambridge University Press, 32–70.

Cole, M. 1996: Cultural psychology. Cambridge, MA: Belknap Press of Harvard University Press.

Fulcher, G. 1996: Testing tasks: issues in task design and the group oral. Language Testing 13, 23–51.

Gass, S. 1997: Input, interaction and the second language learner. Mahwah, NJ: Lawrence Erlbaum.

Green, A. 1998: Verbal protocol analysis in language testing research: a handbook. Cambridge: Cambridge University Press.

Halliday, M.A.K. and Matthiessen, C.M.I.M. 1999: Construing experience through meaning: a language-based approach to cognition. London: Cassell.


Hilsdon, J. 1991: The group oral exam: advantages and limitations. In Ald-erson, J.C. and North, B., editors, Language testing in the 1990s. Lon-don: Modern English Publications and the British Council, 189–97.

Kowal, M. and Swain, M. 1994: Using collaborative language productiontasks to promote students’ language awareness. Language Awareness3, 73–93.

—— 1997: From semantic to syntactic processing: how can we promotemetalinguistic awareness in the French immersion classroom? InJohnson, R.K. and Swain, M., editors, Immersion education: inter-national perspectives. Cambridge: Cambridge University Press,284–309.

Kramsch, C. 1995: The applied linguist and the foreign language teacher:can they talk to each other? Australian Review of Applied Linguistics18, 1–16.

Lantolf, J.P., editor, 2000a: Sociocultural theory and second languagelearning. Oxford: Oxford University Press.

—— 2000b: Second language learning as a mediated process. LanguageTeaching 33, 79–96.

Lantolf, J.P. and Appel, G., editors, 1994: Vygotskian approaches to secondlanguage research. Norwood, NJ: Ablex, 1–32.

Lazarton, A. 1992: The structural organization of a language interview: aconversation analytic perspective. System 20, 373–86.

Lumley, T. and Brown, A. 1996: Specific-purpose language performancetests: task and interaction. In Wigglesworth, G. and Elder, C., editors,The language testing cycle: from inception to washback. AustralianReview of Applied Linguistics, Series S, Number 13, 105–36.

McNamara, T. 1997: ‘Interaction’ in second language performance assess-ment: whose performance? Applied Linguistics 18, 446–66.

Pica, T. 1994: Research on negotiation: what does it reveal about second-language learning conditions, processes and outcomes? LanguageLearning 44, 493–527.

Pica, T., Kanagy, R. and Falodun, J. 1994: Choosing and using communi-cation tasks for second language instruction. In Crookes, G. and Gass,S., editors, Tasks and language learning: integrating theory and prac-tice. Clevedon, Avon: Multilingual Matters, 9–34.

Purpura, J.E. 1998: The development and construct validation of an instru-ment designed to investigate selected cognitive background character-istics of test-takers. In Kunnan, A.J., editor, Validation in languageassessment. Mahwah, NJ: Lawrence Erlbaum, 111–39.

Shohamy, E., Donitsa-Schmidt, S. and Waizer, R. 1993: The effect of the elicitation mode on the language samples obtained on oral tests. Paper presented at the Language Testing Research Colloquium, Cambridge, UK. Available from the authors.

Shohamy, E., Reves, T. and Bejarano, Y. 1986: Introducing a new comprehensive test of oral proficiency. English Language Teaching Journal 40, 212–20.

Skehan, P. 1998: A cognitive approach to language learning. Oxford: Oxford University Press.

Swain, M. 1983: Large-scale communicative language testing: a case study. Language Learning and Communication 2, 133–47. Reprinted in Savignon, S. and Berns, M., editors, Initiatives in communicative language teaching. Reading, MA: Addison Wesley, 185–201.

—— 1995: Three functions of output in second language learning. In Cook, G. and Seidlhofer, B., editors, Principle and practice in applied linguistics: studies in honour of H.G. Widdowson. Oxford: Oxford University Press, 125–44.

—— 2000: The output hypothesis and beyond: mediating acquisition through collaborative dialogue. In Lantolf, J.P., editor, Sociocultural theory and second language learning. Oxford: Oxford University Press, 97–114.

Swain, M. and Lapkin, S. 1998: Interaction and second language learning: two adolescent French immersion students working together. Modern Language Journal 82, 320–37.

—— 2000a: Focus on form through collaborative dialogue: exploring task effects. In Bygate, M., Skehan, P. and Swain, M., editors, Researching pedagogic tasks: second language learning, teaching, and testing. London: Longman, 99–118.

—— 2000b: Task-based second language learning: the uses of the first language. Language Teaching Research 4, 253–76.

—— 2001: What learners notice in their reformulated writing, what they learn from it, and their insights into the process. Paper presented at the AAAL Annual Conference, St. Louis, MO. Available from the authors.

UCLES (University of Cambridge Local Examinations Syndicate) 1996: First certificate in English handbook. Cambridge: UCLES.

van Lier, L. 1989: Reeling, writhing, drawling, stretching, and fainting in coils: oral proficiency interviews as conversation. TESOL Quarterly 23, 489–508.

—— 2000: From input to affordance: social-interactive learning from an ecological perspective. In Lantolf, J.P., editor, Sociocultural theory and second language learning. Oxford: Oxford University Press, 245–59.

Vygotsky, L.S. 1978: Mind in society: the development of higher psychological processes. Cambridge, MA: Harvard University Press.

—— 1987: Thought and speech. In Rieber, R.W. and Carton, A.S., editors, The collected works of L.S. Vygotsky: Volume 1. New York: Plenum, 243–85.

Webb, N.W. 1993: Collaborative group versus individual assessment in mathematics: processes and outcomes. Educational Assessment 1, 131–52.

Wells, G. 1999: Dialogic inquiry: towards a sociocultural practice and theory of education. Cambridge: Cambridge University Press.

Wertsch, J.V. 1980: The significance of dialogue in Vygotsky’s account of social, egocentric, and inner speech. Contemporary Educational Psychology 5, 150–62.

—— 1985: Vygotsky and the social formation of mind. Cambridge, MA: Harvard University Press.

—— 1991: Voices of the mind: a sociocultural approach to mediated action. Cambridge, MA: Harvard University Press.

Wesche, M. and Paribakht, S. 2000: Reading-based exercises in second language vocabulary learning: an introspective study. Modern Language Journal 84, 196–213.

Young, R. 2000: Interactional competence: challenges for validity. Paper presented at a joint symposium of the Language Research Colloquium and the American Association for Applied Linguistics, Vancouver, British Columbia.

Young, R. and He, A. 1998: Talking and testing: discourse approaches to the assessment of oral proficiency. Amsterdam, PA: John Benjamins.
