Language Testing 2005 22 (2) 174–210 10.1191/0265532205lt299oa © 2005 Edward Arnold (Publishers) Ltd

Reading to learn and reading to integrate: new tasks for reading comprehension tests?
Latricia Trites Murray State University and Mary McGroarty Northern Arizona University

Address for correspondence: Latricia Trites, Assistant Professor, Murray State University, Department of English and Philosophy, 7C Faculty Hall, Murray, KY 42071, USA; email: [email protected]

To address the concern that most traditional reading comprehension tests only measure basic comprehension, this study designed measures to assess more complex reading tasks: Reading to Learn and Reading to Integrate. The new measures were taken by 251 participants: 105 undergraduate native speakers of English, 106 undergraduate nonnative speakers, and 40 graduate nonnative speakers. The research subproblems included determination of the influence of overall basic reading comprehension level, native language background, medium of presentation, level of education, and computer familiarity on Reading to Learn and Reading to Integrate measures; and the relationships among measures of Basic Comprehension, Reading to Learn, and Reading to Integrate. Results revealed that native language background and level of education had a significant effect on performance on both experimental measures, while other independent variables did not. While all reading measures showed some correlation, Reading to Learn and Reading to Integrate had lower correlations with Basic Comprehension, suggesting a possible distinction between Basic Comprehension and the new measures.

I Introduction

Each year, thousands of international students apply to American universities in the hope of obtaining a degree from an English-speaking university, and one of the hurdles they face is attaining a ‘passing’ score on the Test of English as a Foreign Language (TOEFL). Although not designed as a gatekeeper by Educational Testing Service (ETS), the TOEFL is often used as such by many institutions of higher education across the USA (Educational Testing Service, 1997). Prior to 2000, the test assessed basic reading and listening comprehension, as well as grammatical ability. While these skills are essential, they represent the minimum needed to succeed in higher education. ETS, aware of this minimum standard and, as Bachman (2000) mentions, the need for task authenticity, embarked on a large-scale project to redesign TOEFL to better reflect the academic language skills required in higher education. Among other goals, the TOEFL 2000 project (Enright et al., 1998) outlined plans to establish reading tasks for four distinct purposes:

• finding information;
• achieving basic comprehension;
• learning from texts; and
• integrating information.

The latter two purposes for reading represent a departure from traditional reading tests and constitute more complex tasks that require more cognitive processing. Tasks appropriate to measure these new purposes needed to be developed and validated.

The project reported here pursued the creation and evaluation of these new task types, the development of scoring rubrics, and the evaluation of native language effects on task and test performance (Educational Testing Service, 1998). In addition, because these were new reading tasks, some evidence for their validity was sought by establishing a baseline for native speakers and then comparing that baseline to performance of nonnative speakers. The TOEFL 2000 reading construct paper (Enright et al., 1998) suggested that a Reading to Learn task would require students to recognize the larger rhetorical frame organizing the information in a given text and carry out a task demonstrating awareness of this larger organizing frame. Enright et al. (1998) hold that, in reading to learn, readers must integrate and connect information presented by the author with what they already know. Thus, readers must rely on background knowledge of text structures to form a Situation Model, a representation of the content, and a Text Model, a representation of the rhetorical structures of the text, as postulated by van Dijk and Kintsch (1983) and discussed by Perfetti (1997). Goldman (1997: 362) asserted that, to learn from texts, readers must have an awareness of text structure and know how to use it to aid comprehension. Reading to Learn can be assessed in a variety of ways. McNamara and Kintsch (1996) suggested that inferencing and sorting tasks requiring readers to process the text based on domain-specific knowledge of the text structures could yield a representation of the readers’ ability to learn from the text. Hence, we postulated that one useful means of assessment would be to have participants recall information and reproduce information relationships reflecting their concept of text structure (Enright et al., 1998: 46–48). For the Reading to Learn task, we assessed readers’ knowledge model through their ability to recall and categorize information from a single text (Enright et al., 1998: 57).

Another goal of the project was to assess Reading to Integrate information, which requires readers to integrate information from multiple sources on the same topic. Reading to Integrate goes a step further than Reading to Learn because readers must integrate the rhetorical and contextual information found across the texts and generate their own representation of this interrelationship (Perfetti, 1997). Therefore, readers must assess the information presented in all sources read and accept or reject pieces of it as they create their own understanding. One means of assessing integration of information found in typical university assignments is the open-ended task of generating a synthesis based on one or more texts (Enright et al., 1998: 48–49). We used a writing task, specifically a writing prompt that elicited the reader’s perception of the authors’ communicative purposes (Enright et al., 1998: 56) as well as amount of information retained from two texts, to test Reading to Integrate.

II Related literature

Recent research has begun to explore the development of tasks that distinguish the constructs of Reading to Learn from basic comprehension. Researchers (van Dijk and Kintsch, 1983; McNamara and Kintsch, 1996; Goldman, 1997) have determined that reading to learn requires an interaction between the Text Model of a text and its Situation Model, thus resulting in a more difficult measure. These researchers further suggest that Reading to Learn can be assessed through measures that go beyond recall, summarization and text-based multiple-choice questions.

The construct of Reading to Integrate requires that readers not only integrate the Text Model with the Situation Model, but also that they create what Perfetti (1997: 346) calls a Documents Model, consisting of two critical elements: ‘An Intertext Model that links texts in terms of their rhetorical relations to each other and a Situations Model that represents situations described in one or more text with links to the texts’. He argues that the use of multiple texts, as opposed to a single text, brings into clearer focus the relationship between the Text Model and the Situation Model. This again suggests that Reading to Integrate should be more difficult than Reading to Learn.

Because these constructs go beyond basic comprehension, Reading to Learn and Reading to Integrate are hypothesized to be more difficult reading tasks than Reading to Find Information and Reading for Basic Comprehension. Perfetti (1997) further suggests that Reading to Integrate is a more difficult task than Reading to Learn because it requires not only an integration of a Text Model and a Situation Model but an integration of multiple Text Models and multiple Situation Models. Thus, current reading theory suggests a difficulty hierarchy of reading tasks based on the level of integration necessary to complete the tasks successfully. Several studies (Perfetti et al., 1995; 1996; Britt et al., 1996; Wiley and Voss, 1999) have attempted to move beyond basic comprehension and examine readers’ ability to integrate the information from multiple texts into one cohesive knowledge base by having students make connections, compare or contrast information across texts.

Additionally, recent research has addressed the effects of computers on reading and assessment; such research is relevant to the current project because the new TOEFL is administered via computers. Reading-medium studies have shown that the only effect that computers have on reading is related to task (Reinking and Schreiner, 1985; Reinking, 1988; van den Berg and Watt, 1991; Lehto et al., 1995; Perfetti et al., 1995; 1996; Britt et al., 1996; Foltz, 1996; Wiley and Voss, 1999). Taylor et al. (1998) found that, after minimal computer training, familiarity with technology did not have a significant effect on examinees’ performance on TOEFL-like questions. Because of the relevance of computer familiarity to TOEFL administration, a brief measure of computer familiarity was included in the research.

For this project, we asked three research questions:

1) Is performance on a measure of Reading to Learn affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative speakers of English) or level of education (graduate versus undergraduate)?

2) Is performance on a measure of Reading to Integrate affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative) or level of education (graduate versus undergraduate)?

3) To what extent are measures of finding information/basic reading comprehension, Reading to Learn and Reading to Integrate related?


III Methods

1 Participants

Two hundred and fifty-one participants, the majority undergraduates, volunteered to take part in this study. The sample consisted of 105 undergraduate native speakers of English (NSUs), 106 undergraduate nonnative speakers (NNSUs) and 40 graduate nonnative speakers (NNSGs) of English at a midsized southwestern university. All data were collected between February and October 1999. All undergraduate participants were recruited through large undergraduate classes in the areas enrolling most NNSs (business administration, hotel management, engineering, social sciences and humanities). We tested all NNSs accessible at the institution at the time of data collection; compared to a national sample of international students from the prior academic year, we had a relatively larger proportion of undergraduate relative to graduate students. Nearly all undergraduate participants were young adults, with an average age of 21. Nonnative speakers were also recruited from students enrolled in the summer intensive English program, which is made up of students needing to increase TOEFL scores to at least 500 in order to enroll at a university. We included 46 participants (32% of NNS sample) with TOEFL scores below 500 in the nonnative sample. Graduate nonnative speakers (n = 40) were recruited from the entire university population and had an average age of 30.75. Nonnative speakers represented a range of language backgrounds. One third were Japanese, with other Asian, Germanic and Romance languages also substantially represented. Both the relatively modest sample size and the all-volunteer nature of the participant sample preclude direct generalization to the worldwide TOEFL population, but participants were representative of the levels of international students at the institution where they were enrolled. Participants who completed all four data collection sessions received a payment of US$10 per hour (US$40 for the entire project).

2 Instruments

This project used three existing instruments, two to determine initial reading levels and one to assess levels of computer familiarity, and two new instruments, one for Reading to Learn and one for Reading to Integrate; these were developed especially for the project. Each of the new measures also served as the basis for an additional measure of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants’ computer familiarity was determined through an 11-item questionnaire based on a longer 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.
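Coefficient alpha, the reliability index reported here and for the other measures, can be computed from an item-by-respondent score matrix. The following is a minimal sketch of the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the toy responses are illustrative assumptions, not data or code from the study.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents' item-score lists.

    responses: one list per respondent, each holding k item scores.
    (Hypothetical helper for illustration; not from the study.)
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same k items")

    # Variance of each item across respondents (columns of the matrix).
    item_variances = [pvariance(col) for col in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = pvariance([sum(row) for row in responses])

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (invented): four respondents, three questionnaire items.
toy = [[4, 3, 4], [2, 2, 3], [3, 3, 3], [1, 2, 1]]
alpha = cronbach_alpha(toy)
```

Population variance is used for both the items and the totals; using sample variance for both would give the same alpha, because the correction factor cancels in the ratio.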

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales such as the Flesch–Kincaid, Coleman–Liau and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants’ ability to read to learn. Spivey (1997: 69) suggests that readers’ categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer’s (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers’ ability to recognize and manipulate the structure of the texts, include specific information and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing such as the creation of rhetorical style, grammaticality or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).

¹ Students received no points for information improperly placed or for information not found in the text.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.
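The Reading to Learn weighting described above (10 points for problems and solutions, five for causes and effects, one for examples, and nothing for misplaced or unsupported information) can be illustrated with a short sketch. The data layout, the function name, the idea identifiers and the rule that each idea is credited only once are assumptions made for illustration; the actual rubric is the one in the article's Appendix 1.

```python
# Point weights per chart category, as stated in the text.
WEIGHTS = {"problem": 10, "solution": 10, "cause": 5, "effect": 5, "example": 1}


def score_chart(entries, answer_key):
    """Score one participant's chart.

    entries: list of (category, idea_id) pairs written on the chart.
    answer_key: dict mapping each idea_id found in the text to its
        correct category (hypothetical representation of a scoring guide).
    """
    score = 0
    credited = set()
    for category, idea_id in entries:
        if idea_id in credited:
            continue  # assumption: a repeated idea earns credit only once
        # No points for ideas absent from the text or placed in the
        # wrong category.
        if answer_key.get(idea_id) == category:
            score += WEIGHTS[category]
            credited.add(idea_id)
    return score


# Invented example: one correct problem, one misplaced cause,
# one correct example, and one idea that is not in the text.
key = {"p1": "problem", "c1": "cause", "e1": "example"}
chart = [("problem", "p1"), ("effect", "c1"), ("example", "e1"), ("cause", "zz")]
total = score_chart(chart, key)  # 10 + 0 + 1 + 0 = 11
```

Under this weighting, a single correctly placed problem or solution outweighs several accurate examples, which mirrors the hierarchy of propositions the rubric is meant to reflect.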

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each, of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, were also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.
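The article does not specify how the equal-ability halves were formed. As a minimal sketch of one common approach, the code below ranks participants on their baseline reading score and deals successive pairs to the two media, alternating which medium receives the higher scorer. The rank-and-alternate scheme, the function name and the scores are all illustrative assumptions, not the authors' documented procedure.

```python
from statistics import mean


def matched_split(baseline):
    """Split participants into two subgroups with similar baseline scores.

    baseline: dict mapping participant id -> baseline reading score.
    Returns (paper_ids, computer_ids).
    """
    ranked = sorted(baseline, key=baseline.get, reverse=True)
    paper, computer = [], []
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        # Alternate which subgroup gets the higher scorer of each pair
        # so that neither subgroup is systematically favoured.
        if (i // 2) % 2 == 1:
            pair.reverse()
        paper.append(pair[0])
        if len(pair) == 2:
            computer.append(pair[1])
    return paper, computer


# Invented baseline scores for eight participants.
scores = {"a": 60, "b": 58, "c": 55, "d": 54, "e": 50, "f": 49, "g": 40, "h": 39}
paper_ids, computer_ids = matched_split(scores)
paper_mean = mean(scores[p] for p in paper_ids)
computer_mean = mean(scores[p] for p in computer_ids)
```

Whatever assignment scheme is used, the resulting subgroups would still need to be compared statistically on the baseline measure before the medium comparison is interpreted.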

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures; the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading, with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session’s seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1 and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure, the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                            n      Mean      sd     kMax

Nelson–Denny
  NSU                          105    126.48   16.46     156
  NNSU                         106     67.24   31.91     156
  NNSG                          40     88.88   21.88     156
  Total participants           251     95.47   36.93     156

TOEFL Reading Comprehension
  NSU                          105     61.30    4.24      67
  NNSU                         106     50.30    8.53      67
  NNSG                          40     57.15    4.55      67
  Total participants           251     55.99    8.19      67

Computer familiarity
  NSU                          104     38.08    3.60      44
  NNSU                         104     34.82    5.99      44
  NNSG                          40     35.63    6.02      44
  Total participants           248     36.31    5.33      44

Note: kMax = number of items or maximum possible score.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                              n      Mean       sd    kMax

Reading to Learn (chart)
  NSU                            105     51.85    19.86     241
  NNSU                           106     31.73    19.50     241
  NNSG                            40     44.68    19.27     241
  Total participants             251     42.21    21.64     241

Basic Comprehension Test 1
  NSU                            105     16.98     2.47      20
  NNSU                           106     11.73     4.25      20
  NNSG                            40     14.98     3.50      20
  Total participants             251     14.44     4.23      20

Reading to Integrate (synthesis)
  NSU                            101     63.65    11.05      80
  NNSU                           103     37.24    21.76      80
  NNSG                            40     53.60    21.03      80
  Total participants             244     50.86    21.63      80

Basic Comprehension Test 2
  NSU                            105     15.91     2.78      20
  NNSU                           106      9.75     4.54      20
  NNSG                            40     12.85     3.61      20
  Total participants             251     12.82     4.72      20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

Table 3 Range of scores for new measures for three participant groups

Group                              n    Minimum    Maximum    kMax

Reading to Learn (chart)
  NSU                            105         14        120     241
  NNSU                           106          0         86     241
  NNSG                            40          3         94     241
  Total participants             251          0        120     241

Reading to Integrate (synthesis)
  NSU                            101         38         80      80
  NNSU                           103          0         80      80
  NNSG                            40          5         80      80
  Total participants             244          0         80      80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation, and test order as possible contributing factors.4 Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.
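The F column of Table 4 follows directly from its mean squares: each F is the effect's mean square divided by the error mean square. The short Python sketch below recomputes the ratios from the values reported in Table 4. The function and variable names are ours and purely illustrative, and agreement is only expected to within rounding of the published mean squares.

```python
# Illustrative sketch: recompute the F ratios of Table 4 from the reported
# mean squares (F = MS_effect / MS_error). All names here are hypothetical.

MS_ERROR = 389.11  # error mean square reported in Table 4 (df = 239)

# effect -> (reported mean square, reported F)
TABLE_4 = {
    "Group": (10934.64, 28.10),
    "Medium": (1215.86, 3.13),
    "Test order": (294.98, 0.76),
    "Group x medium": (167.40, 0.43),
    "Group x test order": (2.19, 0.01),
    "Medium x test order": (391.73, 1.01),
    "Group x medium x test order": (287.65, 0.74),
}


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return the F ratio for one effect."""
    return ms_effect / ms_error


# Recomputed F for every effect in the table.
recomputed = {name: f_ratio(ms, MS_ERROR) for name, (ms, _) in TABLE_4.items()}

if __name__ == "__main__":
    for name, (_, reported) in TABLE_4.items():
        print(f"{name:30s} reported F = {reported:6.2f}   recomputed F = {recomputed[name]:6.2f}")
```

Only the Group ratio is large relative to the others, which is consistent with group membership being the sole significant main effect.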

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
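The pattern of post hoc contrasts described above can be illustrated with the usual Scheffé criterion: a pairwise contrast is significant when (difference / standard error)² exceeds (k − 1) times the critical F for the omnibus test. The Python sketch below applies that rule to the mean differences and standard errors reported in Table 5. The critical value of about 3.03 for F(.05; 2, 239) is an approximate tabled value that we supply as an assumption; it is not taken from the study, and the names used are hypothetical.

```python
# Illustrative sketch of the Scheffe criterion applied to the Table 5 contrasts.
# F_CRIT_05 is an approximate tabled value for F(.05; 2, 239) and is an
# assumption of this sketch, not a figure reported in the study.

K_GROUPS = 3        # NSU, NNSU, NNSG
F_CRIT_05 = 3.03    # assumed critical F with (2, 239) degrees of freedom

# (group 1, group 2) -> (mean difference, standard error), from Table 5
CONTRASTS = {
    ("NSU", "NNSU"): (20.12, 2.72),
    ("NSU", "NNSG"): (7.17, 3.67),
    ("NNSU", "NNSG"): (12.95, 3.66),
}


def scheffe_significant(diff: float, se: float,
                        k: int = K_GROUPS, f_crit: float = F_CRIT_05) -> bool:
    """Return True if the contrast exceeds the Scheffe critical value."""
    return (diff / se) ** 2 > (k - 1) * f_crit


# Significance decision for each pairwise contrast.
results = {pair: scheffe_significant(d, se) for pair, (d, se) in CONTRASTS.items()}

if __name__ == "__main__":
    for pair, significant in results.items():
        print(pair, "significant" if significant else "not significant")
```

Under these assumptions the sketch flags the NSU–NNSU and NNSU–NNSG contrasts but not the NSU–NNSG contrast, in line with the results described for Table 5.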

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                           Type III sum of squares    df    Mean square        F

Group                                           21869.29     2       10934.64   28.10*
Medium                                           1215.86     1        1215.86    3.13
Test order                                        294.98     1         294.98    0.76
Group × medium                                    334.81     2         167.40    0.43
Group × test order                                  4.37     2           2.19    0.01
Medium × test order                               391.73     1         391.73    1.01
Group × medium × test order                       575.29     2         287.65    0.74
Error                                           92997.45   239         389.11

Note: *p < .05

4 Test order was added as an additional variable to double-check that our counterbalancing had been effective in controlling for any practice effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group      n     Group      n     Mean difference    Standard error

NSU      105     NNSU     106              20.12*              2.72
                 NNSG      40               7.17               3.67
NNSU     106     NNSG      40              12.95*              3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244^a) (univariate analysis of variance)

Source                           Type III sum of squares    df    Mean square        F

Group                                           36294.33     2       18147.17   55.82^b
Medium                                            192.95     1         192.95    0.59
Test order                                         97.83     1          97.83    0.30
Group × medium                                     11.82     2           5.91    0.02
Group × test order                                 30.14     2          15.07    0.05
Medium × test order                              1098.72     1        1098.72    3.38
Group × medium × test order                      1488.58     2         744.29    2.29
Error                                           75430.37   232         325.13

Notes: ^a n size reduced for Reading to Integrate because of anomalous testing session; ^b p < .05

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
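The concern that pooled correlations may be an artifact of combining groups can be illustrated with a toy example. In the Python sketch below, two invented groups each show no within-group correlation between two measures, yet the pooled sample correlates strongly simply because the groups differ in their means. The scores are hypothetical and are not drawn from the study's data.

```python
# Illustrative sketch with invented scores: pooling groups that differ in
# level can produce a high correlation even when none exists within groups.
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (measure A, measure B) scores for a lower- and a higher-scoring group.
low_group = [(40, 30), (40, 50), (60, 30), (60, 50)]
high_group = [(110, 100), (110, 120), (130, 100), (130, 120)]


def correlate(pairs):
    """Correlate the two measures for a list of (a, b) score pairs."""
    return pearson_r([a for a, _ in pairs], [b for _, b in pairs])


r_low = correlate(low_group)                   # within the lower group
r_high = correlate(high_group)                 # within the higher group
r_pooled = correlate(low_group + high_group)   # both groups combined

if __name__ == "__main__":
    print(f"within low group:  r = {r_low:.2f}")
    print(f"within high group: r = {r_high:.2f}")
    print(f"pooled sample:     r = {r_pooled:.2f}")
```

This is why within-group correlations are a useful check before interpreting coefficients computed on a heterogeneous total sample.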

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244^a)

Group      n     Group      n     Mean difference    Standard error

NSU      101     NNSU     103            26.41^b               2.53
                 NNSG      40            10.05^b               3.37
NNSU     103     NNSG      40            16.36^b               3.36

Notes: ^a n size reduced for Reading to Integrate because of anomalous testing session; ^b p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                                TOEFL Reading    Basic Comprehension    Basic Comprehension    Reading     Reading to
                                Comprehension    Test 1                 Test 2                 to Learn    Integrate^a

Nelson–Denny                    .90^b            .85^b                  .84^b                  .66^b       .69^b
TOEFL Reading comprehension     1.00             .85^b                  .84^b                  .64^b       .69^b
Basic comprehension 1                            1.00                   .84^b                  .68^b       .68^b
Basic comprehension 2                                                   1.00                   .68^b       .70^b
Reading to Learn                                                                               1.00        .59^b

Notes: ^a n size reduced for Reading to Integrate because of anomalous testing session; ^b p < .05

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels – high, middle (mid), and low reading ability scorers – based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
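The outlier screen mentioned above can be sketched as follows. For each case, the squared Mahalanobis distance from the group centroid is computed and compared with a Chi-Square critical value whose degrees of freedom equal the number of predictors. The Python sketch below does this for two predictors; the scores are invented, and the alpha level of .001 is our assumption (the study does not report the criterion used). With two degrees of freedom, the critical value has the closed form −2 ln(alpha).

```python
# Illustrative sketch of a Mahalanobis-distance outlier screen for two
# predictors. Scores are hypothetical; alpha = .001 is an assumed criterion.
import math


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2 x 2 covariance matrix
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d' S^-1 d written out for the 2 x 2 case.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi_square_critical_df2(alpha):
    """Critical Chi-Square value for 2 degrees of freedom (exact closed form)."""
    return -2.0 * math.log(alpha)


# Hypothetical (Reading to Learn, Reading to Integrate) scores for one group.
scores = [(52, 64), (31, 37), (45, 54), (60, 70), (20, 30), (48, 50), (39, 45), (55, 66)]

d_squared = mahalanobis_sq(scores)
critical = chi_square_critical_df2(0.001)
outliers = [pt for pt, d2 in zip(scores, d_squared) if d2 > critical]

if __name__ == "__main__":
    print(f"critical value = {critical:.2f}")
    print("outliers:", outliers)
```

In very small samples such as this toy one, no case can exceed the criterion, so the screen is only informative with group sizes closer to those in the study.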

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
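The three-way split just described amounts to a simple rule on the scaled TOEFL Reading Comprehension score. A minimal Python sketch of that rule follows; the function name is ours and purely illustrative.

```python
# Illustrative sketch of the high / mid / low split on the scaled TOEFL
# Reading Comprehension score described above. The function name is hypothetical.

def toefl_reading_level(scaled_score: int) -> str:
    """Classify a scaled TOEFL Reading Comprehension score as high, mid, or low."""
    if scaled_score >= 56:   # corresponds to a total score of 550 or above
        return "high"
    if scaled_score >= 50:   # corresponds to a total score of 500 to 549
        return "mid"
    return "low"             # 49 and below, i.e. a total score below 500


if __name__ == "__main__":
    for score in (60, 56, 55, 50, 49, 42):
        print(score, toefl_reading_level(score))
```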

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group        n     Mean on TOEFL reading comprehension       sd

High (≥ 56)                 57                                   59.68     3.03
Mid (50–55)                 42                                   52.81     1.67
Low (≤ 49)                  44                                   42.11     5.47
Total                      143                                   52.26     8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
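The relationship between the eigenvalue and Wilks’ Lambda reported above can be checked with a short calculation. Wilks’ Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, and Bartlett’s approximation converts it to a Chi-Square statistic. The Python sketch below uses only the first (dominant) eigenvalue, so it is an approximation under that stated assumption. The function names are ours, and the output is expected to agree with the reported figures only to within rounding.

```python
# Illustrative sketch: Wilks' Lambda from discriminant-function eigenvalues and
# Bartlett's Chi-Square approximation. Only the first eigenvalue reported for
# the nonnative speakers is used, so results are approximate by assumption.
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue)."""
    lam = 1.0
    for eigenvalue in eigenvalues:
        lam *= 1.0 / (1.0 + eigenvalue)
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


# Nonnative speakers: eigenvalue .92, N = 143, two predictors, three groups.
lambda_nns = wilks_lambda([0.92])
chi_square_nns = bartlett_chi_square(lambda_nns, n_cases=143, n_predictors=2, n_groups=3)

if __name__ == "__main__":
    print(f"Wilks' Lambda = {lambda_nns:.2f}")
    print(f"Chi-Square (approx.) = {chi_square_nns:.1f}")
```

Under these assumptions the sketch gives a Lambda of about .52 and a Chi-Square of roughly 91, close to the reported values.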

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
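The movement between categories can be tallied directly from the counts in Table 10: cases on the diagonal kept their initial level, cases below the diagonal moved to a higher level, and cases above it moved to a lower level. The Python sketch below performs that tally. The names are hypothetical, and the rows and columns are assumed to be ordered high, mid, low as in the table.

```python
# Illustrative sketch: tally stayers and movers from the Table 10 counts.
# Rows are initial levels, columns are predicted levels, both ordered
# high, mid, low. All names are hypothetical.

TABLE_10_COUNTS = [
    [37, 17, 3],    # initially high
    [13, 18, 11],   # initially mid
    [2, 6, 36],     # initially low
]


def reclassification_summary(counts):
    """Count cases that stayed, moved to a higher level, or moved lower."""
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # A predicted column to the left of the initial row means a higher level.
    higher = sum(counts[i][j] for i in range(size) for j in range(size) if j < i)
    lower = sum(counts[i][j] for i in range(size) for j in range(size) if j > i)
    return {"same": same, "higher": higher, "lower": lower, "total": total}


summary = reclassification_summary(TABLE_10_COUNTS)

if __name__ == "__main__":
    for key in ("same", "higher", "lower"):
        share = 100.0 * summary[key] / summary["total"]
        print(f"{key:7s} {summary[key]:4d}  ({share:.1f}%)")
```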

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group        Predicted group membership for Reading to         Initial
                             Learn/Reading to Integrate Composite              classification total

                             High        Mid         Low

Count
  High (≥ 56)                  37         17           3                        57
  Mid (50–55)                  13         18          11                        42
  Low (≤ 49)                    2          6          36                        44
  Reclassification total       52         41          50                       143

Percentage
  High (≥ 56)                64.9       29.8         5.3                     100.0
  Mid (50–55)                31.0       42.9        26.2                     100.0
  Low (≤ 49)                  4.5       13.6        81.8                     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
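The cut points for the native speaker split can be reconstructed from the sample statistics. The Python sketch below takes the Nelson–Denny mean and standard deviation reported for the full native speaker group in Table 1 (126.48 and 16.46) and places the boundaries half a standard deviation either side of the mean. That these particular statistics were the ones used to set the cut points is our assumption, and the names are illustrative.

```python
# Illustrative sketch: a three-way split at .5 standard deviations either side
# of the sample mean. Using the Table 1 Nelson-Denny statistics for the full
# native speaker group is an assumption of this sketch.

ND_MEAN = 126.48   # Nelson-Denny mean for native speakers (Table 1)
ND_SD = 16.46      # Nelson-Denny standard deviation (Table 1)

LOW_CUT = ND_MEAN - 0.5 * ND_SD    # about 118.25
HIGH_CUT = ND_MEAN + 0.5 * ND_SD   # about 134.71


def nelson_denny_level(score: int) -> str:
    """Classify a whole-number Nelson-Denny score as high, mid, or low."""
    if score >= HIGH_CUT:
        return "high"
    if score <= LOW_CUT:
        return "low"
    return "mid"


if __name__ == "__main__":
    print(f"low cut = {LOW_CUT:.2f}, high cut = {HIGH_CUT:.2f}")
    for score in (118, 119, 134, 135):
        print(score, nelson_denny_level(score))
```

For whole-number scores, these boundaries reproduce the reported ranges of 118 and below, 119 to 134, and 135 and above.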

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group        n     Mean on Nelson–Denny       sd

High (≥ 135)                39                  141.26      5.19
Mid (119–134)               37                  125.65      5.09
Low (≤ 118)                 25                  102.88     10.98
Total                      101                  126.04     16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group        Predicted group membership for Reading to         Initial
                             Learn/Reading to Integrate Composite              classification total

                             High        Mid         Low

Count
  High (≥ 135)                 28          3           8                        39
  Mid (119–134)                10          7          20                        37
  Low (≤ 118)                   5          7          13                        25
  Reclassification total       43         17          41                       101

Percentage
  High (≥ 135)               71.8        7.7        20.5                     100.0
  Mid (119–134)              27.0       18.9        54.1                     100.0
  Low (≤ 118)                20.0       28.0        52.0                     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
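The percentage panel of Table 12 is obtained by dividing each count by its row total, so that each row shows how one initial reading ability group was redistributed on the composite. A minimal Python sketch of that calculation follows, with hypothetical names and the rows and columns assumed to be ordered high, mid, low.

```python
# Illustrative sketch: convert the Table 12 counts to row percentages.
# Rows are initial levels, columns are predicted levels (high, mid, low).

TABLE_12_COUNTS = {
    "high": [28, 3, 8],
    "mid": [10, 7, 20],
    "low": [5, 7, 13],
}


def row_percentages(counts):
    """Return each row expressed as percentages of its own total."""
    result = {}
    for level, row in counts.items():
        row_total = sum(row)
        result[level] = [round(100.0 * cell / row_total, 1) for cell in row]
    return result


percentages = row_percentages(TABLE_12_COUNTS)

if __name__ == "__main__":
    for level, row in percentages.items():
        print(f"{level:5s}", row)
```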

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.

Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures; some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997: August/September. Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997: November/December. On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997: December 29. Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points): 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points): 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Examples for Problems (1 point): 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain: Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon: Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples for Causes (1 point): 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point): 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples for Solutions (1 point): 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
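The arithmetic behind the 241-point maximum can be checked with a short calculation. The following is a minimal sketch, not the authors' scoring procedure: the point values and maxima are those listed in the rubric, while the dictionary layout, the function names, the assignment of the 10-point and 5-point example maxima to the Causes and Solutions columns, and the capping of each category at its maximum are assumptions made for illustration. The alternative credit for examples "with overarching cause/solution listed above" is not modeled.

```python
# Point values and maxima as listed in the Appendix 1b rubric.
# The layout and names below are hypothetical, for illustration only.
CATEGORY_CAPS = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_CAPS = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}
FULL_POINTS = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}
ONE_WORD_POINTS = {"problems": 6, "causes": 3, "effects": 3, "solutions": 6}


def max_total():
    """Sum of all category maxima and example maxima in the rubric."""
    return sum(CATEGORY_CAPS.values()) + sum(EXAMPLE_CAPS.values())


def chart_score(full, one_word, examples):
    """Score one completed chart from counts of credited responses.

    Each argument maps a category name to a count: correct full
    responses, correct one-word responses, and accurate examples
    (1 point each). Capping at the rubric maximum is an assumption.
    """
    total = 0
    for category, cap in CATEGORY_CAPS.items():
        main = (full.get(category, 0) * FULL_POINTS[category]
                + one_word.get(category, 0) * ONE_WORD_POINTS[category])
        total += min(main, cap)
        total += min(examples.get(category, 0), EXAMPLE_CAPS[category])
    return total
```

Under these assumptions, the category maxima (40 + 55 + 30 + 90 = 215) and the example maxima (5 + 10 + 6 + 5 = 26) sum to the 241 points stated in the rubric heading.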


Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Lined space for the participant’s essay]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
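Because every band in the Macrostructures scale is defined by the combined number of macrostructures identified across the two texts, the band assignment can be expressed as a simple lookup. The sketch below is an illustration under that reading of the rubric; the function name, its parameters, and the `attempted` flag are hypothetical, and the band values are taken as printed above.

```python
def macrostructure_band(count_text_a, count_text_b, attempted=True):
    """Map macrostructures identified in two texts (0-4 each) to a band score.

    Band values follow the rubric as printed; the function itself is a
    hypothetical illustration, not the authors' scoring tool.
    """
    if not attempted:
        return 0  # "No Response"
    for count in (count_text_a, count_text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text contributes between 0 and 4 macrostructures")
    total = count_text_a + count_text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6 or 7)
    if total >= 4:
        return 15  # Fair (total 4 or 5)
    if total >= 2:
        return 10  # Poor (total 2 or 3)
    return 5       # Very Poor (total 0 or 1)
```

For instance, a response that identifies 3 macrostructures in one text and 1 in the other has a total of 4 and falls in the Fair band, matching the parenthetical cases listed in the rubric.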


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


higher education. ETS, aware of this minimum standard and, as Bachman (2000) mentions, the need for task authenticity, embarked on a large-scale project to redesign TOEFL to better reflect the academic language skills required in higher education. Among other goals, the TOEFL 2000 project (Enright et al., 1998) outlined plans to establish reading tasks for four distinct purposes:

• finding information;
• achieving basic comprehension;
• learning from texts; and
• integrating information.

The latter two purposes for reading represent a departure from traditional reading tests and constitute more complex tasks that require more cognitive processing. Tasks appropriate to measure these new purposes needed to be developed and validated.

The project reported here pursued the creation and evaluation of these new task types, the development of scoring rubrics, and the evaluation of native language effects on task and test performance (Educational Testing Service, 1998). In addition, because these were new reading tasks, some evidence for their validity was sought by establishing a baseline for native speakers and then comparing that baseline to performance of nonnative speakers. The TOEFL 2000 reading construct paper (Enright et al., 1998) suggested that a Reading to Learn task would require students to recognize the larger rhetorical frame organizing the information in a given text and carry out a task demonstrating awareness of this larger organizing frame. Enright et al. (1998) hold that, in reading to learn, readers must integrate and connect information presented by the author with what they already know. Thus, readers must rely on background knowledge of text structures to form a Situation Model, a representation of the content, and a Text Model, a representation of the rhetorical structures of the text, as postulated by van Dijk and Kintsch (1983) and discussed by Perfetti (1997). Goldman (1997: 362) asserted that, to learn from texts, readers must have an awareness of text structure and know how to use it to aid comprehension. Reading to Learn can be assessed in a variety of ways. McNamara and Kintsch (1996) suggested that inferencing and sorting tasks requiring readers to process the text based on domain-specific knowledge of the text structures could yield a representation of the readers’ ability to learn from the text. Hence, we postulated that one useful means of assessment would be to have participants recall information and reproduce information relationships reflecting their concept of text structure (Enright et al., 1998: 46–48). For the Reading to Learn task, we assessed readers’ knowledge model through their ability to recall and categorize information from a single text (Enright et al., 1998: 57).

Another goal of the project was to assess Reading to Integrate information, which requires readers to integrate information from multiple sources on the same topic. Reading to Integrate goes a step further than Reading to Learn because readers must integrate the rhetorical and contextual information found across the texts and generate their own representation of this interrelationship (Perfetti, 1997). Therefore, readers must assess the information presented in all sources read and accept or reject pieces of it as they create their own understanding. One means of assessing integration of information found in typical university assignments is the open-ended task of generating a synthesis based on one or more texts (Enright et al., 1998: 48–49). We used a writing task, specifically a writing prompt that elicited the reader’s perception of the authors’ communicative purposes (Enright et al., 1998: 56) as well as amount of information retained from two texts, to test Reading to Integrate.

II Related literature

Recent research has begun to explore the development of tasks that distinguish the constructs of Reading to Learn from basic comprehension. Researchers (van Dijk and Kintsch, 1983; McNamara and Kintsch, 1996; Goldman, 1997) have determined that reading to learn requires an interaction between the Text Model of a text as well as its Situation Model, thus resulting in a more difficult measure. These researchers further suggest that Reading to Learn can be assessed through measures that go beyond recall, summarization, and text-based multiple-choice questions.

The construct of Reading to Integrate requires that readers not only integrate the Text Model with the Situation Model, but also that they create what Perfetti (1997: 346) calls a Documents Model, consisting of two critical elements: ‘An Intertext Model that links texts in terms of their rhetorical relations to each other and a Situations Model that represents situations described in one or more text with links to the texts’. He argues that the use of multiple texts, as opposed to a single text, brings into clearer focus the relationship between the Text Model and the Situation Model. This again suggests that Reading to Integrate should be more difficult than Reading to Learn.


Because these constructs go beyond basic comprehension, Reading to Learn and Reading to Integrate are hypothesized to be more difficult reading tasks than Reading to Find Information and Reading for Basic Comprehension. Perfetti (1997) further suggests that Reading to Integrate is a more difficult task than Reading to Learn because it not only requires an integration of a Text Model and a Situation Model, but requires an integration of multiple Text Models and multiple Situation Models. Thus, current reading theory suggests a difficulty hierarchy of reading tasks based on the level of integration necessary to complete the tasks successfully. Several studies (Perfetti et al., 1995; 1996; Britt et al., 1996; Wiley and Voss, 1999) have attempted to move beyond basic comprehension and examine readers’ ability to integrate the information from multiple texts into one cohesive knowledge base by having students make connections, compare, or contrast information across texts.

Additionally, recent research has addressed the effects of computers on reading and assessment; such research is relevant to the current project because the new TOEFL is administered via computers. Reading-medium studies have shown that the only effect that computers have on reading is related to task (Reinking and Schreiner, 1985; Reinking, 1988; van den Berg and Watt, 1991; Lehto et al., 1995; Perfetti et al., 1995; 1996; Britt et al., 1996; Foltz, 1996; Wiley and Voss, 1999). Taylor et al. (1998) found that, after minimal computer training, familiarity with technology did not have a significant effect on examinees’ performance on TOEFL-like questions. Because of the relevance of computer familiarity to TOEFL administration, a brief measure of computer familiarity was included in the research.

For this project, we asked three research questions:

1) Is performance on a measure of Reading to Learn affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative speakers of English), or level of education (graduate versus undergraduate)?

2) Is performance on a measure of Reading to Integrate affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative), or level of education (graduate versus undergraduate)?

3) To what extent are measures of finding information/basic reading comprehension, Reading to Learn, and Reading to Integrate related?


III Methods

1 Participants

Two hundred and fifty-one participants, the majority undergraduates, volunteered to take part in this study. The sample consisted of 105 undergraduate native speakers of English (NSUs), 106 undergraduate nonnative speakers (NNSUs), and 40 graduate nonnative speakers (NNSGs) of English at a midsized southwestern university. All data were collected between February and October 1999. All undergraduate participants were recruited through large undergraduate classes in the areas enrolling most NNSs (business administration, hotel management, engineering, social sciences, and humanities). We tested all NNSs accessible at the institution at the time of data collection; compared to a national sample of international students from the prior academic year, we had a relatively larger proportion of undergraduate relative to graduate students. Nearly all undergraduate participants were young adults, with an average age of 21. Nonnative speakers were also recruited from students enrolled in the summer intensive English program, which is made up of students needing to increase TOEFL scores to at least 500 in order to enroll at a university. We included 46 participants (32% of NNS sample) with TOEFL scores below 500 in the nonnative sample. Graduate nonnative speakers (n = 40) were recruited from the entire university population and had an average age of 30.75. Nonnative speakers represented a range of language backgrounds. One third were Japanese, with other Asian, Germanic, and Romance languages also substantially represented. Both the relatively modest sample size and the all-volunteer nature of the participant sample preclude direct generalization to the worldwide TOEFL population, but participants were representative of the levels of international students at the institution where they were enrolled. Participants who completed all four data collection sessions received a payment of US$10 per hour (US$40 for the entire project).

2 Instruments

This project used three existing instruments, two to determine initial reading levels and one to assess levels of computer familiarity, and two new instruments, one for Reading to Learn and one for Reading to Integrate; these were developed especially for the project. Each of the new measures also served as the basis for an additional measure of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants’ computer familiarity was determined through an 11-item questionnaire based on a longer, 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation, but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales such as the Flesch–Kincaid, Coleman–Liau, and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.
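As an illustration of how one of the named readability scales yields a grade level, the sketch below applies the standard published Flesch–Kincaid grade-level formula to word, sentence, and syllable counts. The study does not report how its readability figures were computed, so the function name and the counts used in the usage note are hypothetical and are not taken from the study texts.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts.

    Standard formula:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    Counting words, sentences, and syllables is left to the caller.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a 1200-word passage (not from the study texts).
grade = flesch_kincaid_grade(words=1200, sentences=60, syllables=1980)
```

With these invented counts (20 words per sentence and 1.65 syllables per word), the result falls within the grade 11.0 to 12.0 range the texts were matched on.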

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate, and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants’ ability to read to learn. Spivey (1997: 69) suggests that readers’ categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify, and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects, and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories; five points for correct information supplied in the cause and effect categories; and one point for accurate examples. This weighting reflects Meyer’s (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising, and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers’ ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing, such as the creation of rhetorical style, grammaticality, or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect, or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

¹ Students received no points for information improperly placed or for information not found in the text.

² We recognize that tasks requiring high inference measures, plus extensive norming and revision of the scoring rubric, pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), of 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
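The weighting described for the Reading to Learn chart (10 points for problems and solutions, five for causes and effects, one for examples) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the category keys, the function name, and the sample counts are hypothetical, and the actual rubric in Appendix 1 may apply further rules that are not modelled here.

```python
# Point values per chart category, as stated in the text.
WEIGHTS = {
    "problem": 10,
    "solution": 10,
    "cause": 5,
    "effect": 5,
    "example": 1,
}


def score_chart(correct_counts):
    """Sum weighted points for correctly placed entries.

    correct_counts maps a category to the number of entries a rater
    accepted as correct; misplaced or unsupported entries are simply
    not counted, since they earn no points.
    """
    return sum(WEIGHTS[category] * count
               for category, count in correct_counts.items())


# Hypothetical chart: 2 problems, 1 solution, 3 causes, 0 effects, 4 examples.
sample_score = score_chart(
    {"problem": 2, "solution": 1, "cause": 3, "effect": 0, "example": 4}
)
```

Because higher-order categories carry ten times the weight of examples, a reader who identifies the main problems and solutions outscores one who lists many details without the governing structure.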

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each, of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, was also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers, and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first, and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed, and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered, and perhaps scored, by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed, and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated the scores of that session’s seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal variables (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                          n      Mean      sd      kMax

Nelson–Denny
NSU                          105    126.48   16.46     156
NNSU                         106     67.24   31.91     156
NNSG                          40     88.88   21.88     156
Total participants           251     95.47   36.93     156

TOEFL Reading Comprehension
NSU                          105     61.30    4.24      67
NNSU                         106     50.30    8.53      67
NNSG                          40     57.15    4.55      67
Total participants           251     55.99    8.19      67

Computer familiarity
NSU                          104     38.08    3.60      44
NNSU                         104     34.82    5.99      44
NNSG                          40     35.63    6.02      44
Total participants           248     36.31    5.33      44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                          n      Mean      sd      kMax

Reading to Learn (chart)
NSU                          105     51.85   19.86     241
NNSU                         106     31.73   19.50     241
NNSG                          40     44.68   19.27     241
Total participants           251     42.21   21.64     241

Basic Comprehension Test 1
NSU                          105     16.98    2.47      20
NNSU                         106     11.73    4.25      20
NNSG                          40     14.98    3.50      20
Total participants           251     14.44    4.23      20

Reading to Integrate (synthesis)
NSU                          101     63.65   11.05      80
NNSU                         103     37.24   21.76      80
NNSG                          40     53.60   21.03      80
Total participants           244     50.86   21.63      80

Basic Comprehension Test 2
NSU                          105     15.91    2.78      20
NNSU                         106      9.75    4.54      20
NNSG                          40     12.85    3.61      20
Total participants           251     12.82    4.72      20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

Table 3 Range of scores for new measures for three participant groups

Group                          n    Minimum   Maximum   kMax

Reading to Learn (chart)
NSU                          105       14       120      241
NNSU                         106        0        86      241
NNSG                          40        3        94      241
Total participants           251        0       120      241

Reading to Integrate (synthesis)
NSU                          101       38        80       80
NNSU                         103        0        80       80
NNSG                          40        5        80       80
Total participants           244        0        80       80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn, with group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing, whether participants took Reading to Learn or Reading to Integrate first, had no significant effect.

3 Research Question 2

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum     df    Mean square       F
                                of squares

Group                             21869.29        2      10934.64     28.10*
Medium                             1215.86        1       1215.86      3.13
Test order                          294.98        1        294.98       .76
Group × medium                      334.81        2        167.40       .43
Group × test order                    4.37        2          2.19       .01
Medium × test order                 391.73        1        391.73      1.01
Group × medium × test order         575.29        2        287.65       .74
Error                             92997.45      239        389.11

Note: *p < .05.

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
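The F ratios reported for the group effect in Tables 4 and 6 follow from the tabled sums of squares and degrees of freedom: each mean square is a sum of squares divided by its degrees of freedom, and F is the effect mean square divided by the error mean square. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and the input values are read from the tables with decimal points restored.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Group effect on Reading to Learn (Table 4).
f_learn = f_ratio(21869.29, 2, 92997.45, 239)

# Group effect on Reading to Integrate (Table 6).
f_integrate = f_ratio(36294.33, 2, 75430.37, 232)
```

Both results agree with the tabled F values to within rounding of the published sums of squares.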

4 Research Question 3

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group     n     Group     n     Mean difference   Standard error

NSU     105     NNSU    106         20.12*             2.72
                NNSG     40          7.17              3.67
NNSU    106     NNSG     40         12.95*             3.66

Note: *p < .05.

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum     df    Mean square       F
                                of squares

Group                             36294.33        2      18147.17     55.82ᵇ
Medium                              192.95        1        192.95       .59
Test order                           97.83        1         97.83       .30
Group × medium                       11.82        2          5.91       .02
Group × test order                   30.14        2         15.07       .05
Medium × test order                1098.72        1       1098.72      3.38
Group × medium × test order        1488.58        2        744.29      2.29
Error                             75430.37      232        325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
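The caution that pooled correlations can be an artifact of combining groups can be illustrated with a small Python sketch. The data below are entirely synthetic and chosen only to make the effect visible: two groups whose scores are uncorrelated within each group but whose means differ on both measures. The function name and values are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Synthetic scores on two measures for a lower- and a higher-scoring group.
low_group = ([0, 1, 2], [1, 0, 1])
high_group = ([10, 11, 12], [11, 10, 11])

r_low = pearson_r(*low_group)      # correlation within the lower group
r_high = pearson_r(*high_group)    # correlation within the higher group
r_pooled = pearson_r(low_group[0] + high_group[0],
                     low_group[1] + high_group[1])
```

In this contrived case the within-group correlations are zero while the pooled correlation is close to one, because the between-group difference in means dominates the combined variance. This is the reason for inspecting correlations group by group as well as for the total sample.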

5 Discriminant analysis

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group     n     Group     n     Mean difference   Standard error

NSU     101     NNSU    103         26.41ᵇ             2.53
                NNSG     40         10.05ᵇ             3.37
NNSU    103     NNSG     40         16.36ᵇ             3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic            Basic            Reading     Reading to
                               Comprehension    Comprehension    Comprehension    to Learn    Integrateᵃ
                                                Test 1           Test 2

Nelson–Denny                       .90ᵇ             .85ᵇ             .84ᵇ           .66ᵇ         .69ᵇ
TOEFL Reading Comprehension       1.00              .85ᵇ             .84ᵇ           .64ᵇ         .69ᵇ
Basic Comprehension 1                              1.00              .84ᵇ           .68ᵇ         .68ᵇ
Basic Comprehension 2                                               1.00            .68ᵇ         .70ᵇ
Reading to Learn                                                                   1.00          .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels, high, middle (mid), and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
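The three-way split for nonnative speakers amounts to a simple threshold rule on the TOEFL Reading Comprehension scaled score. A minimal Python sketch, with a hypothetical function name and the cut points taken from the text (56 and above, 50 to 55, 49 and below):

```python
def classify_nonnative(scaled_reading_score):
    """Assign a reading ability level from a TOEFL Reading scaled score."""
    if scaled_reading_score >= 56:   # corresponds to a total score of 550 or more
        return "high"
    if scaled_reading_score >= 50:   # corresponds to 500 to 549
        return "mid"
    return "low"                     # below 500


levels = [classify_nonnative(score) for score in (60, 56, 55, 50, 49, 42)]
```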

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group      n     Mean on TOEFL Reading Comprehension     sd

High (≥ 56)               57                  59.68                     3.03
Mid (50–55)               42                  52.81                     1.67
Low (≤ 49)                44                  42.11                     5.47
Total                    143                  52.26                     8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
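The reported eigenvalue, Wilks’ Lambda, and Chi-Square for the nonnative speakers are linked by standard formulas, which the following Python sketch works through. It assumes that Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions and that the Chi-Square is Bartlett’s approximation, −(N − 1 − (p + g)/2) ln Λ, with p = 2 predictors and g = 3 groups. The second, negligible function is ignored because its eigenvalue is not reported, and whether the software used exactly this approximation is an assumption.

```python
from math import log


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue)."""
    lam = 1.0
    for eigenvalue in eigenvalues:
        lam /= 1.0 + eigenvalue
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2
    return -multiplier * log(lam)


# Nonnative speakers: first eigenvalue .92, N = 143, 2 predictors, 3 groups.
lam_nonnative = wilks_lambda([0.92])
chi2_nonnative = bartlett_chi_square(lam_nonnative, n_cases=143,
                                     n_predictors=2, n_groups=3)
```

Under these assumptions the computed values fall close to the reported Lambda of .52 and Chi-Square of 90.93; the small gap is consistent with rounding and with the omitted second function.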

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
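The reclassification totals can be tallied directly from the cell counts in Table 10. In the Python sketch below, rows are the initial TOEFL-based groups and columns are the groups predicted from the composite, both ordered high, mid, low; the function and variable names are hypothetical.

```python
# Cell counts from Table 10 (rows: initial group; columns: predicted group).
table10_counts = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]


def reclassification_summary(matrix):
    """Count cases that stayed, moved to a higher, or moved to a lower category."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    same = sum(matrix[i][i] for i in range(size))
    # A predicted column to the left of the diagonal is a higher category.
    higher = sum(matrix[i][j] for i in range(size) for j in range(i))
    lower = sum(matrix[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "same": same, "higher": higher, "lower": lower}


summary = reclassification_summary(table10_counts)
percent_same = 100 * summary["same"] / summary["total"]
```

Dividing each count by the total gives the percentages quoted in the paragraph, and the three categories necessarily sum to the full sample.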

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group      Predicted group membership for Reading to      Initial
                           Learn/Reading to Integrate Composite           classification total

                           High       Mid       Low

Count
High (≥ 56)                 37         17         3                         57
Mid (50–55)                 13         18        11                         42
Low (≤ 49)                   2          6        36                         44
Reclassification total      52         41        50                        143

Percentage
High (≥ 56)               64.9       29.8       5.3                      100.0
Mid (50–55)               31.0       42.9      26.2                      100.0
Low (≤ 49)                 4.5       13.6      81.8                      100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension, high, mid, and low, based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
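The native speaker cut scores can be reconstructed from the half-standard-deviation rule. The Python sketch below assumes the mean and standard deviation are those reported for the full native speaker group in Table 1 (126.48 and 16.46) and that whole-number scores above or below the resulting cut points define the high and low groups; the exact rounding rule is not stated in the source, and the function names are hypothetical.

```python
def half_sd_cutoffs(mean, sd, width=0.5):
    """Return (low_cut, high_cut) placed `width` SDs either side of the mean."""
    return mean - width * sd, mean + width * sd


def classify_native(score, low_cut, high_cut):
    """Assign a reading ability level from a Nelson-Denny score."""
    if score > high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "mid"


# Assumed inputs: Nelson-Denny mean and SD for native speakers (n = 105).
low_cut, high_cut = half_sd_cutoffs(126.48, 16.46)
levels = [classify_native(s, low_cut, high_cut) for s in (135, 134, 119, 118)]
```

With these inputs the cut points fall between 118 and 119 and between 134 and 135, which matches the score bands given in the paragraph.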

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group     n       Mean on Nelson–Denny     sd

High (≥135)               39      141.26                   5.19
Mid (119–134)             37      125.65                   5.09
Low (≤118)                25      102.88                   10.98
Total                     101     126.04                   16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.
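The reported Wilks’ Lambda can be cross-checked against the reported eigenvalue. The sketch below assumes the standard identity relating Lambda to the eigenvalues of the discriminant functions, and assumes that the 98.7% figure is the first function's share of the summed eigenvalues. The second eigenvalue is not reported in the text; it is backed out here for illustration only.

```latex
\Lambda = \prod_{k=1}^{2} \frac{1}{1+\lambda_k},
\qquad \lambda_1 = .31,
\qquad \frac{\lambda_1}{\lambda_1+\lambda_2} = .987
\;\Rightarrow\;
\lambda_2 = \lambda_1 \cdot \frac{1-.987}{.987} \approx .004

\Lambda \approx \frac{1}{(1.31)(1.004)} \approx .76
```

Under these assumptions the two reported values agree to two decimal places.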

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group     Predicted group membership for Reading to     Initial classification
                          Learn/Reading to Integrate Composite          total
                          High        Mid         Low

Count
High (≥135)               28          3           8                     39
Mid (119–134)             10          7           20                    37
Low (≤118)                5           7           13                    25
Reclassification total    43          17          41                    101

Percentage
High (≥135)               71.8        7.7         20.5                  100.0
Mid (119–134)             27.0        18.9        54.1                  100.0
Low (≤118)                20.0        28.0        52.0                  100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
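The percentage panel of Table 12 is the count panel expressed as a share of each row total. A short sketch, with an illustrative dictionary layout and function names that are not part of the original study:

```python
# Counts from Table 12: initial group -> predicted [High, Mid, Low] on the composite.
TABLE_12 = {
    "High": [28, 3, 8],
    "Mid": [10, 7, 20],
    "Low": [5, 7, 13],
}


def row_percentages(counts):
    """Express each cell as a percentage of its row total, to one decimal."""
    return {
        group: [round(100 * n / sum(row), 1) for n in row]
        for group, row in counts.items()
    }


def predicted_share(counts, column):
    """Percentage of all participants predicted into one column (0 = High)."""
    total = sum(sum(row) for row in counts.values())
    in_column = sum(row[column] for row in counts.values())
    return round(100 * in_column / total, 1)


for group, percentages in row_percentages(TABLE_12).items():
    print(group, percentages)
print("predicted mid:", predicted_share(TABLE_12, 1))
```

The second function gives the share of native speakers predicted into the mid group on the composite, which is the figure the discussion later uses when it describes the mid category as nearly eliminated.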

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth (26.2%) of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but nearly one third (31.0%) could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may, then, overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24) in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analysis: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems. Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems        Causes          Effects         Solutions

Examples        Examples        Examples        Examples


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
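The 241-point ceiling follows from the category maxima in the rubric. The sketch below tallies a chart score from counts of creditable responses; it is a simplified illustration, not the scoring procedure used in the study. The function and variable names are hypothetical, and the partial-credit rules (reduced points for one-word responses, and the alternative credit for an example that restates an overarching cause or solution) are deliberately left out.

```python
# Per-category rules as listed in the Appendix 1b rubric: (points each, maximum).
CATEGORY_RULES = {
    "problems": (10, 40),
    "causes": (5, 55),
    "effects": (5, 30),
    "solutions": (10, 90),
}

# Maximum number of 1-point examples credited in each category.
EXAMPLE_CAPS = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def chart_score(full_responses, examples):
    """Score a chart from counts of creditable full responses and examples.

    Simplifying assumption: every response earns full credit; the one-word
    and overarching-example rules in the rubric are not modeled.
    """
    total = 0
    for category, (points, cap) in CATEGORY_RULES.items():
        total += min(points * full_responses.get(category, 0), cap)
        total += min(examples.get(category, 0), EXAMPLE_CAPS[category])
    return total


# A chart listing every rubric item in every category reaches the ceiling.
full_chart = {"problems": 4, "causes": 11, "effects": 6, "solutions": 9}
print(chart_score(full_chart, EXAMPLE_CAPS))
```

Capping each category separately means that extra responses in one column cannot compensate for gaps in another, which matches the per-column maxima printed in the rubric.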


Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

______________________________________________________________

______________________________________________________________

______________________________________________________________


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

MacrostructuresParticipants will receive one point for each macrostructureaccurately identified in each text with a minimum score of 4 awardedfor the inability to accurately identify any macrostructures

25 Excellent (Total 8)  Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7)  Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5)  Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3)  Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1)  Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response  Participant does not attempt response.
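The macrostructure bands above amount to a lookup on the total number of macrostructures identified across the two texts. The following is a minimal sketch of that lookup, not the authors' scoring procedure; the function name `macrostructure_subscore` and the use of `None` to represent a missing response are assumptions made for illustration.

```python
from typing import Optional


def macrostructure_subscore(text_a: Optional[int], text_b: Optional[int]) -> int:
    """Map macrostructures identified per text (0-4 each) to a rubric band.

    Hypothetical helper: None stands for "participant does not attempt
    response", which the rubric scores as 0.
    """
    if text_a is None or text_b is None:
        return 0
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text contains at most 4 macrostructures")
    total = text_a + text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (Total 6-7)
    if total >= 4:
        return 15  # Fair (Total 4-5)
    if total >= 2:
        return 10  # Poor (Total 2-3)
    return 5       # Very Poor (Total 0-1)
```

Under this reading, `macrostructure_subscore(4, 2)` and `macrostructure_subscore(3, 3)` fall in the same band, since the rubric lists both combinations under Good.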

210 New tasks for reading comprehension tests

Use of relevant details

5 Excellent  Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good  Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair  Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor  Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor  No details used or listed.

0 No Response  Participant does not attempt response.


(Enright et al., 1998: 46–48). For the Reading to Learn task, we assessed readers' knowledge model through their ability to recall and categorize information from a single text (Enright et al., 1998: 57).

Another goal of the project was to assess Reading to Integrate information, which requires readers to integrate information from multiple sources on the same topic. Reading to Integrate goes a step further than Reading to Learn because readers must integrate the rhetorical and contextual information found across the texts and generate their own representation of this interrelationship (Perfetti, 1997). Therefore, readers must assess the information presented in all sources read and accept or reject pieces of it as they create their own understanding. One means of assessing integration of information found in typical university assignments is the open-ended task of generating a synthesis based on one or more texts (Enright et al., 1998: 48–49). We used a writing task, specifically a writing prompt that elicited the reader's perception of the authors' communicative purposes (Enright et al., 1998: 56) as well as amount of information retained from two texts, to test Reading to Integrate.

II Related literature

Recent research has begun to explore the development of tasks that distinguish the constructs of Reading to Learn from basic comprehension. Researchers (van Dijk and Kintsch, 1983; McNamara and Kintsch, 1996; Goldman, 1997) have determined that reading to learn requires an interaction between the Text Model of a text as well as its Situation Model, thus resulting in a more difficult measure. These researchers further suggest that Reading to Learn can be assessed through measures that go beyond recall, summarization, and text-based multiple-choice questions.

The construct of Reading to Integrate requires that readers not only integrate the Text Model with the Situation Model, but also that they create what Perfetti (1997: 346) calls a Documents Model, consisting of two critical elements: 'An Intertext Model that links texts in terms of their rhetorical relations to each other and a Situations Model that represents situations described in one or more text with links to the texts'. He argues that the use of multiple texts, as opposed to a single text, brings into clearer focus the relationship between the Text Model and the Situation Model. This again suggests that Reading to Integrate should be more difficult than Reading to Learn.


Because these constructs go beyond basic comprehension, Reading to Learn and Reading to Integrate are hypothesized to be more difficult reading tasks than Reading to Find Information and Reading for Basic Comprehension. Perfetti (1997) further suggests that Reading to Integrate is a more difficult task than Reading to Learn because it not only requires an integration of a Text Model and a Situation Model, but requires an integration of multiple Text Models and multiple Situation Models. Thus, current reading theory suggests a difficulty hierarchy of reading tasks based on the level of integration necessary to complete the tasks successfully. Several studies (Perfetti et al., 1995; 1996; Britt et al., 1996; Wiley and Voss, 1999) have attempted to move beyond basic comprehension and examine readers' ability to integrate the information from multiple texts into one cohesive knowledge base by having students make connections, compare, or contrast information across texts.

Additionally, recent research has addressed the effects of computers on reading and assessment; such research is relevant to the current project because the new TOEFL is administered via computers. Reading-medium studies have shown that the only effect that computers have on reading is related to task (Reinking and Schreiner, 1985; Reinking, 1988; van den Berg and Watt, 1991; Lehto et al., 1995; Perfetti et al., 1995; 1996; Britt et al., 1996; Foltz, 1996; Wiley and Voss, 1999). Taylor et al. (1998) found that, after minimal computer training, familiarity with technology did not have a significant effect on examinees' performance on TOEFL-like questions. Because of the relevance of computer familiarity to TOEFL administration, a brief measure of computer familiarity was included in the research.

For this project, we asked three research questions:

1) Is performance on a measure of Reading to Learn affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative speakers of English), or level of education (graduate versus undergraduate)?

2) Is performance on a measure of Reading to Integrate affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative), or level of education (graduate versus undergraduate)?

3) To what extent are measures of finding information/basic reading comprehension, Reading to Learn, and Reading to Integrate related?


III Methods

1 Participants

Two hundred and fifty-one participants, the majority undergraduates, volunteered to take part in this study. The sample consisted of 105 undergraduate native speakers of English (NSUs), 106 undergraduate nonnative speakers (NNSUs), and 40 graduate nonnative speakers (NNSGs) of English at a midsized southwestern university. All data were collected between February and October 1999. All undergraduate participants were recruited through large undergraduate classes in the areas enrolling most NNSs (business administration, hotel management, engineering, social sciences, and humanities). We tested all NNSs accessible at the institution at the time of data collection; compared to a national sample of international students from the prior academic year, we had a relatively larger proportion of undergraduate relative to graduate students. Nearly all undergraduate participants were young adults, with an average age of 21. Nonnative speakers were also recruited from students enrolled in the summer intensive English program, which is made up of students needing to increase TOEFL scores to at least 500 in order to enroll at a university. We included 46 participants (32% of NNS sample) with TOEFL scores below 500 in the nonnative sample. Graduate nonnative speakers (n = 40) were recruited from the entire university population and had an average age of 30.75. Nonnative speakers represented a range of language backgrounds. One third were Japanese, with other Asian, Germanic, and Romance languages also substantially represented. Both the relatively modest sample size and the all-volunteer nature of the participant sample preclude direct generalization to the worldwide TOEFL population, but participants were representative of the levels of international students at the institution where they were enrolled. Participants who completed all four data collection sessions received a payment of US$10 per hour (US$40 for the entire project).

2 Instruments

This project used three existing instruments, two to determine initial reading levels and one to assess levels of computer familiarity, and two new instruments, one for Reading to Learn and one for Reading to Integrate; these were developed especially for the project. Each of the new measures also served as the basis for an additional measure of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants' computer familiarity was determined through an 11-item questionnaire based on a longer, 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.
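Coefficient alpha, the reliability index reported here and for the new measures, can be computed from item variances and the variance of the total score. The sketch below uses the standard formula alpha = k/(k − 1) × (1 − Σ item variances / total-score variance); the function name `coefficient_alpha` and the toy responses are hypothetical and are not data from the study.

```python
from statistics import variance


def coefficient_alpha(responses):
    """Cronbach's coefficient alpha for a list of respondents' item scores.

    `responses` holds one row per respondent and one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item questionnaire answered by four respondents.
toy_responses = [[2, 3], [4, 3], [3, 5], [5, 5]]
alpha = coefficient_alpha(toy_responses)  # equals 8/13 for this toy data
```

The same computation applies to interrater reliability when each rater is treated as an "item" and each scored response as a row.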

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales, such as the Flesch–Kincaid, Coleman–Liau, and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate, and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants' ability to read to learn. Spivey (1997: 69) suggests that readers' categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify, and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects, and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer's (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising, norming the scoring rubric, and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

¹ Students received no points for information improperly placed or for information not found in the text.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing, such as the creation of rhetorical style, grammaticality, or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect, or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising, norming the scoring rubric, and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each, one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
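The weighting described for the Reading to Learn chart (10 points for problems and solutions, five for causes and effects, one for examples) can be sketched as a weighted sum. This is only an illustration of the stated weights, not the authors' scoring guide; the names `WEIGHTS` and `chart_score` and the example counts are hypothetical, and the counts are assumed to include only correct, correctly placed entries, since misplaced information earned no points.

```python
# Points per correct, correctly placed chart entry, as stated in the text.
WEIGHTS = {"problem": 10, "solution": 10, "cause": 5, "effect": 5, "example": 1}


def chart_score(correct_counts):
    """Weighted Reading to Learn chart score from counts of correct entries."""
    unknown = set(correct_counts) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown chart categories: {sorted(unknown)}")
    return sum(WEIGHTS[category] * n for category, n in correct_counts.items())


# Hypothetical participant: 2 problems, 1 solution, 3 causes, 2 effects, 4 examples.
example_counts = {"problem": 2, "solution": 1, "cause": 3, "effect": 2, "example": 4}
score = chart_score(example_counts)  # 30 + 25 + 4 = 59
```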

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each, of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, were also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures; the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure, the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

Table 1 Descriptive statistics for existing measures for three participant groups

Group                          n      Mean      sd       kMax

Nelson–Denny
  NSU                          105    126.48    16.46    156
  NNSU                         106     67.24    31.91    156
  NNSG                          40     88.88    21.88    156
  Total participants           251     95.47    36.93    156

TOEFL Reading Comprehension
  NSU                          105     61.30     4.24     67
  NNSU                         106     50.30     8.53     67
  NNSG                          40     57.15     4.55     67
  Total participants           251     55.99     8.19     67

Computer familiarity
  NSU                          104     38.08     3.60     44
  NNSU                         104     34.82     5.99     44
  NNSG                          40     35.63     6.02     44
  Total participants           248     36.31     5.33     44

Note: kMax = number of items or maximum possible score.


• task novelty: no participant reported ever doing such a task previously;

• time allowed for task completion; and

• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that, if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                              n      Mean     sd       kMax

Reading to Learn (chart)
  NSU                              105    51.85    19.86    241
  NNSU                             106    31.73    19.50    241
  NNSG                              40    44.68    19.27    241
  Total participants               251    42.21    21.64    241

Basic Comprehension Test 1
  NSU                              105    16.98     2.47     20
  NNSU                             106    11.73     4.25     20
  NNSG                              40    14.98     3.50     20
  Total participants               251    14.44     4.23     20

Reading to Integrate (synthesis)
  NSU                              101    63.65    11.05     80
  NNSU                             103    37.24    21.76     80
  NNSG                              40    53.60    21.03     80
  Total participants               244    50.86    21.63     80

Basic Comprehension Test 2
  NSU                              105    15.91     2.78     20
  NNSU                             106     9.75     4.54     20
  NNSG                              40    12.85     3.61     20
  Total participants               251    12.82     4.72     20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.
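As a consistency check on Table 2, each "Total participants" mean should equal the mean of the three group means weighted by group size. The sketch below assumes the tabled means are reported to two decimal places (for example, 51.85 for the NSU Reading to Learn mean); the function name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Overall mean from (n, group_mean) pairs, weighting each mean by n."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# (n, mean) for NSU, NNSU, NNSG, read from Table 2 under the stated assumption.
reading_to_learn = [(105, 51.85), (106, 31.73), (40, 44.68)]
bc1 = [(105, 16.98), (106, 11.73), (40, 14.98)]

rtl_total = round(pooled_mean(reading_to_learn), 2)  # tabled total: 42.21
bc1_total = round(pooled_mean(bc1), 2)               # tabled total: 14.44
```

Both values reproduce the tabled totals, which supports reading the means with two decimal places.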


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Table 3 Range of scores for new measures for three participant groups

Group                              n      Minimum    Maximum    kMax

Reading to Learn (chart)
  NSU                              105    14         120        241
  NNSU                             106     0          86        241
  NNSG                              40     3          94        241
  Total participants               251     0         120        241

Reading to Integrate (synthesis)
  NSU                              101    38          80         80
  NNSU                             103     0          80         80
  NNSG                              40     5          80         80
  Total participants               244     0          80         80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

Because group membership was a combined measure that included both native language background as well as level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing, whether participants took Reading to Learn or Reading to Integrate first, had no significant effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares    df     Mean square    F

Group                           21869.29                     2    10934.64       28.10*
Medium                           1215.86                     1     1215.86        3.13
Test order                        294.98                     1      294.98        0.76
Group × medium                    334.81                     2      167.40        0.43
Group × test order                  4.37                     2        2.19        0.01
Medium × test order               391.73                     1      391.73        1.01
Group × medium × test order       575.29                     2      287.65        0.74
Error                           92997.45                   239      389.11

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.
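The F values in Table 4 follow from the sums of squares and degrees of freedom: each mean square is the sum of squares divided by its df, and F is the effect mean square divided by the error mean square. The sketch below recomputes two of them, assuming the tabled sums of squares carry two decimal places; the function name `f_ratio` is hypothetical.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F statistic as (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Values read from Table 4 under the stated assumption.
SS_ERROR, DF_ERROR = 92997.45, 239

f_group = f_ratio(21869.29, 2, SS_ERROR, DF_ERROR)   # tabled F: 28.10
f_medium = f_ratio(1215.86, 1, SS_ERROR, DF_ERROR)   # tabled F: 3.13
```

The recomputed values agree with the tabled ones to within rounding of the reported sums of squares.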

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error
NSU     105   NNSU    106   20.12*            2.72
              NNSG    40    7.17              3.67
NNSU    106   NNSG    40    12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by group, medium and test order (n = 244a) (univariate analysis of variance)

Source                         Type III sum of squares   df    Mean square   F
Group                          36294.33                  2     18147.17      55.82b
Medium                         192.95                    1     192.95        0.59
Test order                     97.83                     1     97.83         0.30
Group × medium                 11.82                     2     5.91          0.02
Group × test order             30.14                     2     15.07         0.05
Medium × test order            1098.72                   1     1098.72       3.38
Group × medium × test order    1488.58                   2     744.29        2.29
Error                          75430.37                  232   325.13

Notes: a: n size reduced for Reading to Integrate because of anomalous testing session; b: p < .05

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
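The concern that pooled correlations can be an artifact of combining groups can be illustrated with a toy example. The scores below are invented purely for illustration and are not data from this study: two hypothetical groups each show a within-group correlation of .50, yet pooling them produces a correlation near 1 because the groups differ in mean level on both measures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores on two measures for a lower- and a higher-scoring
# group (invented numbers, NOT taken from the study).
low_x, low_y = [1, 2, 3], [2, 1, 3]
high_x, high_y = [11, 12, 13], [12, 11, 13]

r_low = pearson_r(low_x, low_y)
r_high = pearson_r(high_x, high_y)
r_pooled = pearson_r(low_x + high_x, low_y + high_y)

print(f"within low group:  r = {r_low:.2f}")
print(f"within high group: r = {r_high:.2f}")
print(f"pooled groups:     r = {r_pooled:.2f}")
```

This is why correlations computed separately for each group can be lower than those for the total population.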

5 Discriminant analysis

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244a)

Group   n     Group   n     Mean difference   Standard error
NS      101   NNSU    103   26.41b            2.53
              NNSG    40    10.05b            3.37
NNSU    103   NNSG    40    16.36b            3.36

Notes: a: n size reduced for Reading to Integrate because of anomalous testing session; b: p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                              TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading    Reading to
                              Comprehension    Test 1                Test 2                to Learn   Integrate a
Nelson–Denny                  .90b             .85b                  .84b                  .66b       .69b
TOEFL Reading Comprehension   1.00             .85b                  .84b                  .64b       .69b
Basic Comprehension 1                          1.00                  .84b                  .68b       .68b
Basic Comprehension 2                                                1.00                  .68b       .70b
Reading to Learn                                                                           1.00       .59b

Notes: a: n size reduced for Reading to Integrate because of anomalous testing session; b: p < .05

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL Reading Comprehension   sd
High (≥56)              57    59.68                                 3.03
Mid (50–55)             42    52.81                                 1.67
Low (≤49)               44    42.11                                 5.47
Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
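The reclassification totals can be tallied directly from the cell counts of Table 10. The following is a minimal sketch of that bookkeeping, assuming rows are initial groups and columns are predicted groups, both ordered high, mid, low; the function and variable names are ours.

```python
# Cell counts from Table 10 (nonnative speakers).
# Rows: initial reading ability group; columns: predicted group on the
# composite. Both are ordered high, mid, low.
table10 = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]


def tally_reclassification(matrix):
    """Count participants who stayed, moved up, or moved down."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    same = sum(matrix[i][i] for i in range(size))
    # With categories ordered high -> low, a column index smaller than
    # the row index means a move to a higher category.
    higher = sum(matrix[i][j] for i in range(size) for j in range(size) if j < i)
    lower = sum(matrix[i][j] for i in range(size) for j in range(size) if j > i)
    return {"total": total, "same": same, "higher": higher, "lower": lower}


result = tally_reclassification(table10)
for key in ("same", "higher", "lower"):
    share = 100 * result[key] / result["total"]
    print(f"{key:6s} {result[key]:3d} ({share:.1f}%)")
```

The same function can be applied to the native speaker counts in Table 12.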

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Predicted group membership refers to the Reading to Learn/Reading to Integrate Composite.

Reading ability group     Predicted high   Predicted mid   Predicted low   Initial classification total
Count
High (≥56)                37               17              3               57
Mid (50–55)               13               18              11              42
Low (≤49)                 2                6               36              44
Reclassification total    52               41              50              143
Percentage
High (≥56)                64.9             29.8            5.3             100.0
Mid (50–55)               31.0             42.9            26.2            100.0
Low (≤49)                 4.5              13.6            81.8            100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
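As a rough check on the three-way split, the half-standard-deviation bounds can be computed from the sample mean (126.04) and standard deviation (16.52) reported in Table 11. The sketch below assumes whole-number Nelson–Denny scores; the classifier simply encodes the cut points stated in the text, which appear to be rounded versions of the computed bounds. Function and variable names are ours.

```python
# Sample statistics for the native speaker group (Table 11).
mean_nd = 126.04
sd_nd = 16.52

# Bounds at half a standard deviation either side of the mean.
lower_bound = mean_nd - 0.5 * sd_nd
upper_bound = mean_nd + 0.5 * sd_nd


def classify_nelson_denny(score):
    """Apply the cut points reported in the text to a whole-number score."""
    if score >= 135:
        return "high"
    if score >= 119:
        return "mid"
    return "low"


print(f"mean - 0.5 sd = {lower_bound:.2f}")
print(f"mean + 0.5 sd = {upper_bound:.2f}")
print(classify_nelson_denny(140), classify_nelson_denny(125), classify_nelson_denny(110))
```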

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd
High (≥135)             39    141.26                 5.19
Mid (119–134)           37    125.65                 5.09
Low (≤118)              25    102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,6 justifying the composite calculations.
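The reported Wilks' Lambda values are consistent with the reported eigenvalues. Wilks' Lambda for a set of discriminant functions is the product of 1/(1 + eigenvalue) over those functions, and each function's share of variance is its eigenvalue divided by the sum of eigenvalues. The sketch below, with names of our own choosing, infers the small second eigenvalue from the first function's reported share and recovers Lambda to two decimals for both groups. It is an illustration of the relationship, not a reproduction of the SPSS output.

```python
def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue)."""
    value = 1.0
    for eigenvalue in eigenvalues:
        value *= 1.0 / (1.0 + eigenvalue)
    return value


def second_eigenvalue(first, first_share):
    """Infer the second eigenvalue from the first function's share of
    variance, assuming exactly two discriminant functions."""
    return first * (1.0 - first_share) / first_share


# Reported values: eigenvalue and share of variance of the first function.
nonnative = [0.92, second_eigenvalue(0.92, 0.999)]
native = [0.31, second_eigenvalue(0.31, 0.987)]

print(f"nonnative speakers: Lambda = {wilks_lambda(nonnative):.2f}")
print(f"native speakers:    Lambda = {wilks_lambda(native):.2f}")
```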

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Predicted group membership refers to the Reading to Learn/Reading to Integrate Composite.

Reading ability group     Predicted high   Predicted mid   Predicted low   Initial classification total
Count
High (≥135)               28               3               8               39
Mid (119–134)             10               7               20              37
Low (≤118)                5                7               13              25
Reclassification total    43               17              41              101
Percentage
High (≥135)               71.8             7.7             20.5            100.0
Mid (119–134)             27.0             18.9            54.1            100.0
Low (≤118)                20.0             28.0            52.0            100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension, and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did, in fact, perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. U.S. News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas (regulations)
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry and/or plants
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's written response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total: 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total: 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total: 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total: 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total: 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
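The three subscales in this rubric (Integration 0–50, Macrostructures 0–25, Use of relevant details 0–5) add up to the 0–80 analytic scale. The sketch below shows one way a scorer might tally a response. The band values come from the rubric above; the function names and input format are illustrative assumptions, not part of the original scoring guide.

```python
# Macrostructure bands from the rubric: (minimum combined total, awarded score).
MACRO_BANDS = [(8, 25), (6, 20), (4, 15), (2, 10), (0, 5)]


def macrostructure_score(text_a, text_b, attempted=True):
    """Map macrostructures identified in each text (0-4 each) to a band score."""
    if not attempted:
        return 0  # "No Response" band
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has four macrostructures at most")
    total = text_a + text_b
    # The last band has a minimum of 0, so a match is always found.
    return next(score for minimum, score in MACRO_BANDS if total >= minimum)


def integrate_total(integration, macrostructures, details):
    """Sum the three subscores after checking each against its rubric values."""
    if integration not in (0, 10, 20, 30, 40, 50):
        raise ValueError("integration is scored in steps of 10 from 0 to 50")
    if macrostructures not in (0, 5, 10, 15, 20, 25):
        raise ValueError("macrostructures is scored in steps of 5 from 0 to 25")
    if details not in range(6):
        raise ValueError("use of relevant details is scored from 0 to 5")
    return integration + macrostructures + details
```

For example, a response that identifies four macrostructures in one text and two in the other falls in the Good band (20 points) regardless of which text was the stronger one.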

Because these constructs go beyond basic comprehension, Reading to Learn and Reading to Integrate are hypothesized to be more difficult reading tasks than Reading to Find Information and Reading for Basic Comprehension. Perfetti (1997) further suggests that Reading to Integrate is a more difficult task than Reading to Learn because it not only requires an integration of a Text Model and a Situation Model, but requires an integration of multiple Text Models and multiple Situation Models. Thus, current reading theory suggests a difficulty hierarchy of reading tasks based on the level of integration necessary to complete the tasks successfully. Several studies (Perfetti et al., 1995; 1996; Britt et al., 1996; Wiley and Voss, 1999) have attempted to move beyond basic comprehension and examine readers' ability to integrate the information from multiple texts into one cohesive knowledge base by having students make connections, compare, or contrast information across texts.

Additionally, recent research has addressed the effects of computers on reading and assessment; such research is relevant to the current project because the new TOEFL is administered via computers. Reading-medium studies have shown that the only effect that computers have on reading is related to task (Reinking and Schreiner, 1985; Reinking, 1988; van den Berg and Watt, 1991; Lehto et al., 1995; Perfetti et al., 1995; 1996; Britt et al., 1996; Foltz, 1996; Wiley and Voss, 1999). Taylor et al. (1998) found that, after minimal computer training, familiarity with technology did not have a significant effect on examinees' performance on TOEFL-like questions. Because of the relevance of computer familiarity to TOEFL administration, a brief measure of computer familiarity was included in the research.

For this project, we asked three research questions:

1) Is performance on a measure of Reading to Learn affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative speakers of English), or level of education (graduate versus undergraduate)?

2) Is performance on a measure of Reading to Integrate affected by medium of presentation (paper versus computer), technology familiarity, native language (native versus nonnative), or level of education (graduate versus undergraduate)?

3) To what extent are measures of finding information/basic reading comprehension, Reading to Learn, and Reading to Integrate related?

III Methods

1 Participants

Two hundred and fifty-one participants, the majority undergraduates, volunteered to take part in this study. The sample consisted of 105 undergraduate native speakers of English (NSUs), 106 undergraduate nonnative speakers (NNSUs), and 40 graduate nonnative speakers (NNSGs) of English at a midsized southwestern university. All data were collected between February and October 1999. All undergraduate participants were recruited through large undergraduate classes in the areas enrolling most NNSs (business administration, hotel management, engineering, social sciences, and humanities). We tested all NNSs accessible at the institution at the time of data collection; compared to a national sample of international students from the prior academic year, we had a relatively larger proportion of undergraduate relative to graduate students. Nearly all undergraduate participants were young adults, with an average age of 21. Nonnative speakers were also recruited from students enrolled in the summer intensive English program, which is made up of students needing to increase TOEFL scores to at least 500 in order to enroll at a university. We included 46 participants (32% of NNS sample) with TOEFL scores below 500 in the nonnative sample. Graduate nonnative speakers (n = 40) were recruited from the entire university population and had an average age of 30.75. Nonnative speakers represented a range of language backgrounds. One third were Japanese, with other Asian, Germanic, and Romance languages also substantially represented. Both the relatively modest sample size and the all-volunteer nature of the participant sample preclude direct generalization to the worldwide TOEFL population, but participants were representative of the levels of international students at the institution where they were enrolled. Participants who completed all four data collection sessions received a payment of US$10 per hour (US$40 for the entire project).

2 Instruments

This project used three existing instruments, two to determine initial reading levels and one to assess levels of computer familiarity, and two new instruments, one for Reading to Learn and one for Reading to Integrate; these were developed especially for the project. Each of the new measures also served as the basis for an additional measure of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants' computer familiarity was determined through an 11-item questionnaire based on a longer, 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.
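Coefficient alpha, the reliability index reported here and for the new measures below, can be computed from item-level responses as k/(k − 1) × (1 − sum of item variances / variance of total scores). The following is a minimal sketch using only the Python standard library. The function name and the input layout (one row of item scores per participant) are assumptions made for illustration, and it is not the software the authors used.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Coefficient alpha for rows of item scores (one row per participant)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every participant must answer every item")
    # Variance of each item across participants (columns of the matrix).
    item_variances = [pvariance(column) for column in zip(*responses)]
    # Variance of each participant's total score.
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Population variances are used throughout; using sample variances instead gives the same alpha because the correction factor cancels in the ratio.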

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation, but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales, such as the Flesch–Kincaid, Coleman–Liau, and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate, and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants' ability to read to learn. Spivey (1997: 69) suggests that readers' categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify, and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects, and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer's (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising, and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement, because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing, such as the creation of rhetorical style, grammaticality, or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect, or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).

¹ Students received no points for information improperly placed or for information not found in the text.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.
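The Reading to Learn weighting described above can be sketched as a small scoring function. The per-response weights (10, 5, 5, 10, 1) are taken from the text, and the category maxima (40, 55, 30, 90) from the Appendix 1b rubric. Treating the remaining 26 points as a single cap on examples, and the function and dictionary names themselves, are simplifying assumptions for illustration; the actual scoring guide also distinguishes one-word responses and overarching examples, which this sketch ignores.

```python
# Points per correct, correctly placed response in each chart category.
WEIGHTS = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10, "examples": 1}

# Category maxima as listed in the Appendix 1b rubric.
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}

THEORETICAL_MAX = 241
# Assumption: whatever the four main categories leave over is available to examples.
CATEGORY_MAX["examples"] = THEORETICAL_MAX - sum(CATEGORY_MAX.values())


def reading_to_learn_score(correct_counts):
    """Weighted chart score from counts of correct responses per category."""
    score = 0
    for category, count in correct_counts.items():
        if category not in WEIGHTS:
            raise KeyError(f"unknown chart category: {category}")
        if count < 0:
            raise ValueError("counts cannot be negative")
        # No points are deducted for incorrect responses, so only correct
        # responses are counted, capped at the category maximum.
        score += min(WEIGHTS[category] * count, CATEGORY_MAX[category])
    return score
```

Under these assumptions, a chart with 2 problems, 3 causes, 1 effect, 2 solutions, and 4 examples correct would score 20 + 15 + 5 + 20 + 4 = 64 of the 241 possible points.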

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each, of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, were also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³
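The text does not describe how the matched subgroups were formed, so the following is only one plausible sketch of the idea: rank participants on the baseline reading score, deal them into the two media in an ABBA pattern, and then compare the subgroups with an independent-samples t statistic. The ABBA assignment, the pooled-variance form of the test, and all names are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance


def split_matched(baseline_scores):
    """Split participants into two subgroups balanced on a baseline score.

    baseline_scores maps participant id -> baseline reading score.
    Ranked participants are assigned paper, computer, computer, paper, ...
    """
    ranked = sorted(baseline_scores.items(), key=lambda item: item[1], reverse=True)
    paper, computer = [], []
    for position, (participant, _score) in enumerate(ranked):
        (paper if position % 4 in (0, 3) else computer).append(participant)
    return paper, computer


def pooled_t(group_a, group_b):
    """Independent-samples t statistic using the pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
```

A t statistic near zero on the baseline measure is what the balance check described above is looking for; judging significance would additionally require comparing the statistic with a t distribution on n_a + n_b − 2 degrees of freedom.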

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                          n      Mean     sd      kMax

Nelson–Denny
NSU                            105    126.48   16.46   156
NNSU                           106     67.24   31.91   156
NNSG                            40     88.88   21.88   156
Total participants             251     95.47   36.93   156

TOEFL Reading Comprehension
NSU                            105     61.30    4.24    67
NNSU                           106     50.30    8.53    67
NNSG                            40     57.15    4.55    67
Total participants             251     55.99    8.19    67

Computer familiarity
NSU                            104     38.08    3.60    44
NNSU                           104     34.82    5.99    44
NNSG                            40     35.63    6.02    44
Total participants             248     36.31    5.33    44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the maximum possible point total (80) was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                              n      Mean     sd      kMax

Reading to Learn (chart)
NSU                                105    51.85    19.86   241
NNSU                               106    31.73    19.50   241
NNSG                                40    44.68    19.27   241
Total participants                 251    42.21    21.64   241

Basic Comprehension Test 1
NSU                                105    16.98     2.47    20
NNSU                               106    11.73     4.25    20
NNSG                                40    14.98     3.50    20
Total participants                 251    14.44     4.23    20

Reading to Integrate (synthesis)
NSU                                101    63.65    11.05    80
NNSU                               103    37.24    21.76    80
NNSG                                40    53.60    21.03    80
Total participants                 244    50.86    21.63    80

Basic Comprehension Test 2
NSU                                105    15.91     2.78    20
NNSU                               106     9.75     4.54    20
NNSG                                40    12.85     3.61    20
Total participants                 251    12.82     4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

Table 3 Range of scores for new measures for three participant groups

Group                              n      Minimum   Maximum   kMax

Reading to Learn (chart)
NSU                                105    14        120       241
NNSU                               106     0         86       241
NNSG                                40     3         94       241
Total participants                 251     0        120       241

Reading to Integrate (synthesis)
NSU                                101    38         80        80
NNSU                               103     0         80        80
NNSG                                40     5         80        80
Total participants                 244     0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn, with group status, medium of text presentation, and test order as possible contributing factors.4 Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
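The group F ratio in Table 4 and the standard errors in Table 5 can be rechecked from the reported sums of squares. The following is a minimal sketch, not the authors' SPSS procedure. It assumes the decimal placement shown in the comments, takes group sizes and means from Table 2, and uses an approximate critical value of 3.03 for F(2, 239) at the .05 level from standard F tables, which is an assumption rather than a figure reported in the study. The name `scheffe_contrast` is hypothetical.

```python
from math import sqrt

# Table 4 (Reading to Learn): sums of squares and degrees of freedom.
# Decimal placement is assumed (e.g. 21869.29, not 2186929).
SS_GROUP, DF_GROUP = 21869.29, 2
SS_ERROR, DF_ERROR = 92997.45, 239

MS_GROUP = SS_GROUP / DF_GROUP   # mean square for the group effect
MS_ERROR = SS_ERROR / DF_ERROR   # mean square error
F_GROUP = MS_GROUP / MS_ERROR    # F ratio for the group main effect

# Approximate critical F(.05; 2, 239) from standard tables (assumption).
F_CRIT = 3.03

# Reading to Learn group sizes and means as read from Table 2.
GROUPS = {"NSU": (105, 51.85), "NNSU": (106, 31.73), "NNSG": (40, 44.68)}


def scheffe_contrast(a, b, ms_error=MS_ERROR, k_groups=3):
    """Return (mean difference, standard error, Scheffe F) for groups a and b."""
    n_a, mean_a = GROUPS[a]
    n_b, mean_b = GROUPS[b]
    diff = mean_a - mean_b
    # Standard error of a pairwise difference under the pooled error term.
    se = sqrt(ms_error * (1 / n_a + 1 / n_b))
    # Scheffe statistic: squared t ratio divided by (number of groups - 1).
    f_scheffe = (diff / se) ** 2 / (k_groups - 1)
    return diff, se, f_scheffe


if __name__ == "__main__":
    print(f"F(group) = {F_GROUP:.2f}")
    for a, b in [("NSU", "NNSU"), ("NSU", "NNSG"), ("NNSG", "NNSU")]:
        diff, se, f_s = scheffe_contrast(a, b)
        verdict = "significant" if f_s > F_CRIT else "not significant"
        print(f"{a} vs {b}: diff = {diff:.2f}, SE = {se:.2f}, "
              f"Scheffe F = {f_s:.2f} ({verdict})")
```

Under these assumptions the sketch gives the same pattern of contrasts as described for Table 5: the two contrasts involving the nonnative speaker undergraduates exceed the critical value, and the native speaker undergraduate versus nonnative speaker graduate contrast does not.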

3 Research Question 2

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                  2     10934.64      28.10*
Medium                           1215.86                  1      1215.86       3.13
Test order                        294.98                  1       294.98       0.76
Group × medium                    334.81                  2       167.40       0.43
Group × test order                  4.37                  2         2.19       0.01
Medium × test order               391.73                  1       391.73       1.01
Group × medium × test order       575.29                  2       287.65       0.74
Error                           92997.45                  239     389.11

Note: *p < .05.

4 Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n      Group   n      Mean difference   Standard error

NSU     105    NNSU    106    20.12*            2.72
               NNSG     40     7.17             3.67
NNSU    106    NNSG     40    12.95*            3.66

Note: *p < .05.

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                  2     18147.17      55.82ᵇ
Medium                            192.95                  1       192.95       0.59
Test order                         97.83                  1        97.83       0.30
Group × medium                     11.82                  2         5.91       0.02
Group × test order                 30.14                  2        15.07       0.05
Medium × test order              1098.72                  1      1098.72       3.38
Group × medium × test order      1488.58                  2       744.29       2.29
Error                           75430.37                  232     325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.

5 Discriminant analysis

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group   n      Group   n      Mean difference   Standard error

NSU     101    NNSU    103    26.41ᵇ            2.53
               NNSG     40    10.05ᵇ            3.37
NNSU    103    NNSG     40    16.36ᵇ            3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading     Reading to
                               Comprehension    Test 1                Test 2                to Learn    Integrateᵃ

Nelson–Denny                   .90ᵇ             .85ᵇ                  .84ᵇ                  .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ                  .84ᵇ                  .64ᵇ        .69ᵇ
Basic Comprehension 1                           1.00                  .84ᵇ                  .68ᵇ        .68ᵇ
Basic Comprehension 2                                                 1.00                  .68ᵇ        .70ᵇ
Reading to Learn                                                                            1.00        .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels: high, middle (mid), and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
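The three-way split just described amounts to a simple threshold rule on the scaled TOEFL Reading Comprehension score. The sketch below restates it in code for illustration only; the function name `toefl_reading_level` is hypothetical and is not part of the study's materials.

```python
def toefl_reading_level(scaled_score):
    """Map a TOEFL Reading Comprehension scaled score to a reading ability group.

    Thresholds follow the cut points described in the text:
    56 and above = high, 50 to 55 = mid, 49 and below = low.
    """
    if scaled_score >= 56:   # corresponds to a total score of about 550 or more
        return "high"
    if scaled_score >= 50:   # corresponds to a total score of about 500 to 549
        return "mid"
    return "low"             # corresponds to a total score below about 500


if __name__ == "__main__":
    for score in (49, 50, 55, 56):
        print(score, toefl_reading_level(score))
```

Keeping the rule in one function makes the boundary cases explicit: 55 is the highest mid score and 49 the highest low score.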

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension   sd

High (≥56)               57     59.68                                 3.03
Mid (50–55)              42     52.81                                 1.67
Low (≤49)                44     42.11                                 5.47

Total                    143    52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
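The eigenvalue, Wilks' Lambda, and Chi-Square reported for the nonnative speakers are linked by standard formulas, which the sketch below illustrates. It is a rough consistency check under stated assumptions, not a reproduction of the SPSS output: it assumes two predictors (Reading to Learn and Reading to Integrate), three groups, 143 cases, and Bartlett's chi-square approximation, and it infers the second eigenvalue from the first function's reported share of variance. All function names are hypothetical. Because the inputs are rounded to two decimals, the results only approximate the published values.

```python
from math import log


def second_eigenvalue(first, share_first):
    """Infer the second eigenvalue from the first and its share of variance."""
    return first * (1 - share_first) / share_first


def wilks_lambda(eigenvalues):
    """Wilks' Lambda for all functions: product of 1 / (1 + eigenvalue)."""
    lam = 1.0
    for ev in eigenvalues:
        lam /= 1.0 + ev
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * log(lam)


if __name__ == "__main__":
    # Nonnative speakers: eigenvalue .92 accounting for 99.9% of variance.
    ev1 = 0.92
    ev2 = second_eigenvalue(ev1, 0.999)
    lam = wilks_lambda([ev1, ev2])
    chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
    print(f"Wilks' Lambda = {lam:.2f}, chi-square = {chi2:.1f}")
```

The same functions can be applied to the native speaker results (eigenvalue .31, 98.7% of variance, 101 cases) as a comparable check.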

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
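The tallies of participants who stayed in their category, moved up, or moved down can be recomputed directly from the cell counts in Table 10. The sketch below is illustrative: the function name `reclassification_summary` is hypothetical, and the matrix is entered with rows as the initial basic comprehension group and columns as the predicted group, both ordered high, mid, low.

```python
# Table 10 counts: rows = initial group, columns = predicted group,
# both ordered high, mid, low.
TABLE_10 = [
    [37, 17, 3],    # initially high
    [13, 18, 11],   # initially mid
    [2, 6, 36],     # initially low
]


def reclassification_summary(counts):
    """Count cases that stayed, moved to a higher, or moved to a lower category."""
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # A predicted column to the left of the diagonal is a higher category.
    higher = sum(counts[i][j] for i in range(size) for j in range(i))
    # A predicted column to the right of the diagonal is a lower category.
    lower = sum(counts[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "same": same, "higher": higher, "lower": lower}


if __name__ == "__main__":
    summary = reclassification_summary(TABLE_10)
    for key in ("same", "higher", "lower"):
        share = 100 * summary[key] / summary["total"]
        print(f"{key}: {summary[key]} ({share:.1f}%)")
```

The same function can be applied to the native speaker counts in Table 12 by passing that table's three rows in the same order.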

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                          Predicted group membership for Reading to
                          Learn/Reading to Integrate Composite          Initial
Reading ability group     High       Mid        Low                     classification total

Count
High (≥56)                37         17          3                      57
Mid (50–55)               13         18         11                      42
Low (≤49)                  2          6         36                      44
Reclassification total    52         41         50                      143

Percentage
High (≥56)                64.9       29.8        5.3                    100.0
Mid (50–55)               31.0       42.9       26.2                    100.0
Low (≤49)                  4.5       13.6       81.8                    100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
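The native speaker cut scores follow from the half-standard-deviation rule. As a worked illustration, the sketch below assumes the cut points were derived from the full native speaker undergraduate sample in Table 1 (n = 105, mean 126.48, sd 16.46, with decimal placement assumed); the names `nelson_denny_level`, `LOWER_CUT`, and `UPPER_CUT` are hypothetical.

```python
# Nelson-Denny mean and sd for native speaker undergraduates (Table 1, n = 105).
# Decimal placement is assumed.
MEAN, SD = 126.48, 16.46

LOWER_CUT = MEAN - 0.5 * SD   # about 118.25
UPPER_CUT = MEAN + 0.5 * SD   # about 134.71


def nelson_denny_level(score):
    """Classify a Nelson-Denny score by its distance from the sample mean."""
    if score >= UPPER_CUT:    # integer scores of 135 and above
        return "high"
    if score <= LOWER_CUT:    # integer scores of 118 and below
        return "low"
    return "mid"              # integer scores from 119 to 134


if __name__ == "__main__":
    print(f"cut points: {LOWER_CUT:.2f} and {UPPER_CUT:.2f}")
    for score in (118, 119, 134, 135):
        print(score, nelson_denny_level(score))
```

With whole-number scores, these two cut points reproduce the reported boundaries of 118 and below, 119 to 134, and 135 and above.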

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny   sd

High (≥135)              39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤118)               25     102.88                 10.98
Total                    101    126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                          Predicted group membership for Reading to
                          Learn/Reading to Integrate Composite          Initial
Reading ability group     High       Mid        Low                     classification total

Count
High (≥135)               28          3          8                      39
Mid (119–134)             10          7         20                      37
Low (≤118)                 5          7         13                      25
Reclassification total    43         17         41                      101

Percentage
High (≥135)               71.8        7.7       20.5                    100.0
Mid (119–134)             27.0       18.9       54.1                    100.0
Low (≤118)                20.0       28.0       52.0                    100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.

Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the 'two disciplines' problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
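The point weighting stated in the directions can be illustrated with a minimal sketch. The names `WEIGHTS` and `chart_score` are hypothetical and are not taken from the study. The sketch assumes a rater has already counted the correct entries in each category, and it ignores the per-category maxima and the partial credit for one-word answers that appear in the full rubric in Appendix 1b.

```python
# Hypothetical illustration of the weighting in the directions:
# 10 points per problem or solution, 5 per cause or effect, 1 per example.
WEIGHTS = {"problem": 10, "solution": 10, "cause": 5, "effect": 5, "example": 1}


def chart_score(correct_counts):
    """Return the weighted total for a chart.

    correct_counts maps a category name to the number of correct entries a
    rater accepted. Incorrect entries are simply not counted, which matches
    the 'no penalty for incorrect responses' rule.
    """
    return sum(WEIGHTS[category] * n for category, n in correct_counts.items())


# Example: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
sample = {"problem": 2, "cause": 3, "effect": 1, "solution": 2, "example": 4}
print(chart_score(sample))  # 2*10 + 3*5 + 1*5 + 2*10 + 4*1 = 64
```

Under this weighting a single correct problem or solution outweighs several examples, so the total mainly reflects how well the reader captured the upper levels of the text structure.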

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points), 1 word (6 points), 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points), 1 word (3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points), 1 word (3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points), 1 word (6 points), 90 maximum
Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the written response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
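The way the three subscores combine into the 0 to 80 analytic scale can be sketched as follows. This is an illustrative reading of the rubric, not the authors' scoring procedure: the function names are hypothetical, the integration and detail scores are assumed to come from a human rater, and the macrostructure bands follow the rubric's totals (8; 6 to 7; 4 to 5; 2 to 3; 0 to 1).

```python
def macrostructure_band(total_identified, attempted=True):
    """Map the number of macrostructures identified across both texts (0-8)
    to the rubric's macrostructure subscore."""
    if not attempted:
        return 0  # "No Response"
    if not 0 <= total_identified <= 8:
        raise ValueError("total_identified must be between 0 and 8")
    if total_identified == 8:
        return 25  # Excellent
    if total_identified >= 6:
        return 20  # Good
    if total_identified >= 4:
        return 15  # Fair
    if total_identified >= 2:
        return 10  # Poor
    return 5  # Very Poor (0 or 1 identified)


def synthesis_score(integration, total_identified, details):
    """Sum the three subscores for an attempted response.

    integration: rater-assigned band, one of 0, 10, 20, 30, 40, 50
    details:     rater-assigned band, an integer from 0 to 5
    """
    if integration not in (0, 10, 20, 30, 40, 50):
        raise ValueError("integration must be a rubric band")
    if details not in range(6):
        raise ValueError("details must be between 0 and 5")
    return integration + macrostructure_band(total_identified) + details


# A 'Good' integration, 6 macrostructures identified, 'Good' use of details.
print(synthesis_score(40, 6, 4))  # 40 + 20 + 4 = 64
```

The maximum, 50 + 25 + 5 = 80, matches the range of the analytic scale, and the integration subscore dominates the total by design.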

III Methods

1 Participants

Two hundred and fifty-one participants, the majority undergraduates, volunteered to take part in this study. The sample consisted of 105 undergraduate native speakers of English (NSUs), 106 undergraduate nonnative speakers (NNSUs), and 40 graduate nonnative speakers (NNSGs) of English at a midsized southwestern university. All data were collected between February and October 1999. All undergraduate participants were recruited through large undergraduate classes in the areas enrolling most NNSs (business administration, hotel management, engineering, social sciences, and humanities). We tested all NNSs accessible at the institution at the time of data collection; compared to a national sample of international students from the prior academic year, we had a relatively larger proportion of undergraduate relative to graduate students. Nearly all undergraduate participants were young adults, with an average age of 21. Nonnative speakers were also recruited from students enrolled in the summer intensive English program, which is made up of students needing to increase TOEFL scores to at least 500 in order to enroll at a university. We included 46 participants (32% of NNS sample) with TOEFL scores below 500 in the nonnative sample. Graduate nonnative speakers (n = 40) were recruited from the entire university population and had an average age of 30.75. Nonnative speakers represented a range of language backgrounds. One third were Japanese, with other Asian, Germanic, and Romance languages also substantially represented. Both the relatively modest sample size and the all-volunteer nature of the participant sample preclude direct generalization to the worldwide TOEFL population, but participants were representative of the levels of international students at the institution where they were enrolled. Participants who completed all four data collection sessions received a payment of US$10 per hour (US$40 for the entire project).

2 Instruments

This project used three existing instruments, two to determine initial reading levels and one to assess levels of computer familiarity, and two new instruments, one for Reading to Learn and one for Reading to Integrate; these were developed especially for the project. Each of the new measures also served as the basis for an additional measure of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants' computer familiarity was determined through an 11-item questionnaire based on a longer, 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.
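For readers less familiar with coefficient alpha, the standard formula is alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below computes it from a small made-up response matrix; the function name and the data are hypothetical and do not come from the study.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Cronbach's coefficient alpha.

    rows: one list of item scores per respondent, all of equal length.
    Uses sample variances throughout; the ratio is unchanged as long as
    item and total variances are computed the same way.
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up responses: 5 respondents x 4 questionnaire items (1-5 scale).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(coefficient_alpha(responses), 2))
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when items are unrelated, which is why values such as .87 are read as indicating a consistent scale.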

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales such as the Flesch–Kincaid, Coleman–Liau, and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.
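As a rough illustration of one of the readability scales named above, the Flesch–Kincaid grade level is conventionally computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) - 15.59. The sketch below takes the three counts as inputs rather than counting syllables itself, since syllable counting requires a heuristic or a dictionary. The function names and the sample counts are hypothetical; the study does not report how its readability figures were calculated.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts of a text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def in_target_band(grade, low=11.0, high=12.0):
    """Check a grade estimate against the 11.0 to 12.0 range cited above."""
    return low <= grade <= high


# Made-up counts for a 600-word passage: 30 sentences, 1000 syllables.
grade = flesch_kincaid_grade(words=600, sentences=30, syllables=1000)
print(round(grade, 1), in_target_band(grade))
```

Because the formula rewards long sentences and polysyllabic words, two passages on the same topic can be matched by adjusting either factor, which is one reason to check several scales rather than one.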

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate, and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants' ability to read to learn. Spivey (1997: 69) suggests that readers' categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify, and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects, and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer's (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising, and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

¹ Students received no points for information improperly placed or for information not found in the text.

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing such as the creation of rhetorical style, grammaticality, or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect, or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), of 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on the number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
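The reliabilities reported for the three measures are all coefficient alpha values. As a minimal sketch of how such a value is computed, the function below treats each rater (or each test item) as a column and each scored response as a row. The three-rater data set is hypothetical and is not taken from the study; the use of population variances is an assumption that does not affect the result, since the same divisor appears in numerator and denominator.

```python
from statistics import pvariance

def coefficient_alpha(ratings):
    """Cronbach's coefficient alpha.

    ratings: list of rows, one per scored response; each row holds the
    score assigned by each rater (raters play the role of 'items').
    """
    k = len(ratings[0])  # number of raters (or items)
    if k < 2:
        raise ValueError("need at least two raters or items")
    # variance of each rater's scores across all responses
    rater_vars = [pvariance([row[j] for row in ratings]) for j in range(k)]
    # variance of the summed score for each response
    total_var = pvariance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(rater_vars) / total_var)

# Hypothetical scores from three raters on five syntheses (0-80 scale)
scores = [
    [62, 60, 63],
    [35, 38, 36],
    [78, 80, 77],
    [12, 10, 15],
    [50, 49, 52],
]
alpha = coefficient_alpha(scores)
print(f"coefficient alpha = {alpha:.3f}")
```

Because alpha depends on the ratio of rater variances to total-score variance, raters who rank responses consistently produce values close to 1 even when their raw scores differ by a few points.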

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups, each of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, was also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³
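The matching-and-checking step described above can be sketched as follows. The study does not report how the matched subgroups were formed, so the A-B-B-A dealing scheme in `split_matched` is purely an assumption for illustration, and the baseline scores are hypothetical. `independent_t` computes the pooled-variance t statistic for two independent samples; the resulting value would then be compared with a tabled critical value for the given degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def split_matched(scores):
    """Assign participants to two media so baseline scores are balanced.

    Sorts scores from high to low and deals them out in an A-B-B-A
    pattern (an assumed scheme, not the authors' documented procedure).
    Returns two lists of scores: (paper, computer).
    """
    ordered = sorted(scores, reverse=True)
    paper, computer = [], []
    for i, s in enumerate(ordered):
        # positions 0, 3, 4, 7, 8, ... go to paper; 1, 2, 5, 6, ... to computer
        (paper if i % 4 in (0, 3) else computer).append(s)
    return paper, computer

def independent_t(a, b):
    """Student's t for two independent samples (pooled variance)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical TOEFL Reading Comprehension scaled scores
baseline = [67, 64, 61, 60, 58, 55, 53, 50, 48, 47, 44, 41]
paper, computer = split_matched(baseline)
t, df = independent_t(paper, computer)
print(f"t({df}) = {t:.2f}")
```

A t value near zero, as here, indicates that the two media subgroups start from comparable basic comprehension levels.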

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

3 Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.
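The skewness and kurtosis screening mentioned at the start of this section can be illustrated with a short sketch. The function below uses simple population moments; statistical packages typically apply small-sample corrections, so their values will differ slightly. The article does not state which limits were treated as normal, so any threshold (a common rule of thumb is an absolute value below about 2) is an assumption, and the data are hypothetical.

```python
from statistics import mean

def skew_kurtosis(xs):
    """Return (skewness, excess kurtosis) based on population moments."""
    n = len(xs)
    m = mean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in xs) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical chart scores for a small group of readers
sample = [14, 32, 41, 47, 52, 55, 60, 66, 75, 98]
skew, kurt = skew_kurtosis(sample)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```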

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                  n     Mean     sd      kMax

Nelson–Denny
NSU                    105   126.48   16.46   156
NNSU                   106    67.24   31.91   156
NNSG                    40    88.88   21.88   156
Total participants     251    95.47   36.93   156

TOEFL Reading Comprehension
NSU                    105    61.30    4.24    67
NNSU                   106    50.30    8.53    67
NNSG                    40    57.15    4.55    67
Total participants     251    55.99    8.19    67

Computer familiarity
NSU                    104    38.08    3.60    44
NNSU                   104    34.82    5.99    44
NNSG                    40    35.63    6.02    44
Total participants     248    36.31    5.33    44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                  n     Mean     sd      kMax

Reading to Learn (chart)
NSU                    105    51.85   19.86   241
NNSU                   106    31.73   19.50   241
NNSG                    40    44.68   19.27   241
Total participants     251    42.21   21.64   241

Basic Comprehension Test 1
NSU                    105    16.98    2.47    20
NNSU                   106    11.73    4.25    20
NNSG                    40    14.98    3.50    20
Total participants     251    14.44    4.23    20

Reading to Integrate (synthesis)
NSU                    101    63.65   11.05    80
NNSU                   103    37.24   21.76    80
NNSG                    40    53.60   21.03    80
Total participants     244    50.86   21.63    80

Basic Comprehension Test 2
NSU                    105    15.91    2.78    20
NNSU                   106     9.75    4.54    20
NNSG                    40    12.85    3.61    20
Total participants     251    12.82    4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

Table 3 Range of scores for new measures for three participant groups

Group                  n     Minimum   Maximum   kMax

Reading to Learn (chart)
NSU                    105   14        120       241
NNSU                   106    0         86       241
NNSG                    40    3         94       241
Total participants     251    0        120       241

Reading to Integrate (synthesis)
NSU                    101   38         80        80
NNSU                   103    0         80        80
NNSG                    40    5         80        80
Total participants     244    0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
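The F ratios behind this conclusion follow directly from the sums of squares and degrees of freedom reported in Table 4: each effect's mean square is divided by the error mean square. The sketch below recomputes them from the published values. The dictionary keys are illustrative labels, and small discrepancies in the last decimal place relative to the printed table reflect rounding of the published sums of squares.

```python
# Sums of squares and df transcribed from Table 4 (Reading to Learn)
table4 = {
    "Group": (21869.29, 2),
    "Medium": (1215.86, 1),
    "Test order": (294.98, 1),
    "Group x medium": (334.81, 2),
    "Group x test order": (4.37, 2),
    "Medium x test order": (391.73, 1),
    "Group x medium x test order": (575.29, 2),
}
ss_error, df_error = 92997.45, 239

# Error mean square: the denominator of every F ratio
ms_error = ss_error / df_error

# F = (SS_effect / df_effect) / MS_error
f_ratios = {name: (ss / df) / ms_error for name, (ss, df) in table4.items()}

for name, f in f_ratios.items():
    print(f"{name:30s} F = {f:6.2f}")
```

Only the group effect yields an F ratio far above 1; the ratios for medium, test order, and all interactions stay small, which is consistent with the pattern of significance described above.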

3 Research Question 2

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                    2   10934.64      28.10*
Medium                           1215.86                    1    1215.86       3.13
Test order                        294.98                    1     294.98       0.76
Group × medium                    334.81                    2     167.40       0.43
Group × test order                  4.37                    2       2.19       0.01
Medium × test order               391.73                    1     391.73       1.01
Group × medium × test order       575.29                    2     287.65       0.74
Error                           92997.45                  239     389.11

Note: *p < .05.

4 Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
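The post hoc contrasts for Reading to Integrate can be reproduced approximately from published values: the group means and sizes in Table 2 and the error mean square in Table 6. In the sketch below, the standard error of each pairwise difference is the square root of MS_error × (1/n1 + 1/n2), and a contrast is judged significant under the Scheffé criterion when (difference / SE)² / (k − 1) exceeds the critical F. The critical value 3.03 is an assumed approximation of F(2, 232) at the .05 level taken from standard tables, not a figure reported in the article.

```python
from math import sqrt

ms_error, k = 325.13, 3  # error mean square from Table 6; k = number of groups
n = {"NSU": 101, "NNSU": 103, "NNSG": 40}                    # group sizes, Table 2
mean_rti = {"NSU": 63.65, "NNSU": 37.24, "NNSG": 53.60}      # group means, Table 2
F_CRIT = 3.03  # assumed approximate critical F(2, 232), alpha = .05

def scheffe(g1, g2):
    """Return (mean difference, standard error, significant?) for one contrast."""
    diff = mean_rti[g1] - mean_rti[g2]
    se = sqrt(ms_error * (1 / n[g1] + 1 / n[g2]))
    f_s = (diff / se) ** 2 / (k - 1)  # Scheffe-adjusted F for the contrast
    return diff, se, f_s > F_CRIT

for pair in [("NSU", "NNSU"), ("NSU", "NNSG"), ("NNSG", "NNSU")]:
    diff, se, sig = scheffe(*pair)
    print(f"{pair[0]} vs {pair[1]}: diff = {diff:.2f}, SE = {se:.2f}, significant = {sig}")
```

The same function applied to the Reading to Learn means and the Table 4 error term would show the one non-significant contrast (native speaker undergraduates versus nonnative speaker graduates) reported for Research Question 1.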

4 Research Question 3

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error

NSU     105   NNSU    106   20.12*            2.72
              NNSG     40    7.17             3.67
NNSU    106   NNSG     40   12.95*            3.66

Note: *p < .05.

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                    2   18147.17      55.82ᵇ
Medium                            192.95                    1     192.95       0.59
Test order                         97.83                    1      97.83       0.30
Group × medium                     11.82                    2       5.91       0.02
Group × test order                 30.14                    2      15.07       0.05
Medium × test order              1098.72                    1    1098.72       3.38
Group × medium × test order      1488.58                    2     744.29       2.29
Error                           75430.37                  232     325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
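The concern that pooled correlations may be an artifact of combining groups can be made concrete with a small sketch. The two groups below are hypothetical, not data from the study: within each group the two measures are only weakly related, yet because the groups differ in mean level on both measures, the correlation for the combined sample is very high.

```python
from math import sqrt
from statistics import mean

def pearson_r(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (basic comprehension, new measure) score pairs
lower_group = [(40, 30), (42, 26), (44, 32), (46, 28), (48, 31)]
higher_group = [(60, 62), (62, 58), (64, 65), (66, 60), (68, 63)]

r_lower = pearson_r(lower_group)
r_higher = pearson_r(higher_group)
r_pooled = pearson_r(lower_group + higher_group)
print(f"within groups: {r_lower:.2f}, {r_higher:.2f}; pooled: {r_pooled:.2f}")
```

This is why within-group correlations are the more informative basis for judging whether two measures tap the same construct.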

5 Discriminant analysis

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group   n     Group   n     Mean difference   Standard error

NSU     101   NNSU    103   26.41ᵇ            2.53
              NNSG     40   10.05ᵇ            3.37
NNSU    103   NNSG     40   16.36ᵇ            3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading     Reading to
                               Comprehension    Test 1                Test 2                to Learn    Integrateᵃ

Nelson–Denny                    .90ᵇ             .85ᵇ                  .84ᵇ                  .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00              .85ᵇ                  .84ᵇ                  .64ᵇ        .69ᵇ
Basic Comprehension Test 1                      1.00                   .84ᵇ                  .68ᵇ        .68ᵇ
Basic Comprehension Test 2                                            1.00                   .68ᵇ        .70ᵇ
Reading to Learn                                                                            1.00         .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
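The outlier screening described above can be sketched for the two-predictor case (Reading to Learn and Reading to Integrate). Each case's squared Mahalanobis distance from the group centroid is compared with a chi-square critical value whose degrees of freedom equal the number of predictors. The p = .001 criterion (13.816 for df = 2) is the conventional choice and is an assumption here, as the article does not state the level used; the scores are hypothetical.

```python
from statistics import mean

CHI2_CRIT = 13.816  # chi-square critical value for df = 2 at p = .001 (assumed criterion)

def mahalanobis_sq(cases):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(cases)
    mx = mean(c[0] for c in cases)
    my = mean(c[1] for c in cases)
    # sample variances and covariance
    sxx = sum((c[0] - mx) ** 2 for c in cases) / (n - 1)
    syy = sum((c[1] - my) ** 2 for c in cases) / (n - 1)
    sxy = sum((c[0] - mx) * (c[1] - my) for c in cases) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in cases:
        dx, dy = x - mx, y - my
        # d' S^-1 d with the inverse of the 2 x 2 covariance matrix written out
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical (Reading to Learn, Reading to Integrate) scores for one group
scores = [(52, 64), (31, 37), (45, 54), (60, 70), (20, 25), (38, 50)]
d2 = mahalanobis_sq(scores)
outliers = [i for i, d in enumerate(d2) if d > CHI2_CRIT]
print("flagged cases:", outliers)
```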

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
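The three-way grouping rule for nonnative speakers can be written as a short function over the TOEFL Reading Comprehension scaled score. The function name is illustrative; the cut points are those stated above.

```python
def toefl_reading_level(scaled_score):
    """Classify a TOEFL Reading Comprehension scaled score into a reading ability group."""
    if scaled_score >= 56:   # corresponds to a total score of 550 or above
        return "high"
    if scaled_score >= 50:   # corresponds to 500-549
        return "mid"
    return "low"             # corresponds to below 500

# Hypothetical scaled scores for four test-takers
for score in (61, 56, 52, 47):
    print(score, toefl_reading_level(score))
```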

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL Reading Comprehension   sd

High (≥56)               57   59.68                                  3.03
Mid (50–55)              42   52.81                                  1.67
Low (≤49)                44   42.11                                  5.47
Total                   143   52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
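The eigenvalue, Wilks' Lambda, and Chi-Square values reported for the nonnative speaker analysis are linked by standard formulas, which the sketch below applies. Wilks' Lambda is the product of 1 / (1 + eigenvalue) over the discriminant functions, and the Chi-Square is obtained here through Bartlett's approximation, −(N − 1 − (p + g) / 2) ln Λ, with N cases, p predictors, and g groups. That the reported Chi-Square was computed in this way is an assumption; only the first function's rounded eigenvalue is used, so the result matches the published figure only approximately.

```python
from math import log

def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam

def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * log(lam)

# Nonnative speakers: eigenvalue .92, N = 143, two predictors, three groups
lam = wilks_lambda([0.92])
chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
print(f"Wilks' Lambda = {lam:.2f}, chi-square = {chi2:.2f}")
```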

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
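The reclassification totals in this paragraph can be tallied from the cell counts of Table 10. In the sketch below, rows are the initial TOEFL Reading Comprehension groups and columns are the groups predicted from the composite, both ordered high, mid, low; cells on the diagonal are unchanged classifications, cells to the left of the diagonal are moves to a higher category, and cells to the right are moves to a lower one. Variable names are illustrative.

```python
LEVELS = ["high", "mid", "low"]

# Counts from Table 10: rows = initial group, columns = predicted group
table10 = [
    [37, 17, 3],    # initially high
    [13, 18, 11],   # initially mid
    [2, 6, 36],     # initially low
]

total = sum(sum(row) for row in table10)
same = sum(table10[i][i] for i in range(3))
# A lower index means a higher level, so column < row is a move upward
higher = sum(table10[r][c] for r in range(3) for c in range(3) if c < r)
lower = sum(table10[r][c] for r in range(3) for c in range(3) if c > r)

for label, count in [("unchanged", same), ("higher", higher), ("lower", lower)]:
    print(f"{label:10s} {count:4d}  {100 * count / total:5.1f}%")

# Row percentages, as in the lower panel of Table 10
for level, row in zip(LEVELS, table10):
    print(level, [round(100 * c / sum(row), 1) for c in row])
```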

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥56)               37       17        3        57
Mid (50–55)              13       18       11        42
Low (≤49)                 2        6       36        44
Reclassification total   52       41       50       143

Percentage
High (≥56)               64.9     29.8      5.3     100.0
Mid (50–55)              31.0     42.9     26.2     100.0
Low (≤49)                 4.5     13.6     81.8     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
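The native speaker cut points can be approximated from sample statistics. The sketch below assumes the authors used the Nelson–Denny mean and standard deviation for all 105 native speaker undergraduates as given in Table 1 (126.48 and 16.46); this is an inference, since the article does not say which values were used. Under that assumption, half a standard deviation on either side of the mean yields boundaries that agree with the reported whole-score groups.

```python
def sd_cutoffs(sample_mean, sample_sd, width=0.5):
    """Lower and upper boundaries placed `width` standard deviations from the mean."""
    return sample_mean - width * sample_sd, sample_mean + width * sample_sd

def nelson_denny_level(score, low_cut, high_cut):
    """Classify a Nelson-Denny score relative to the two boundaries."""
    if score > high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "mid"

# Assumed inputs: NSU mean and sd on the Nelson-Denny from Table 1
low_cut, high_cut = sd_cutoffs(126.48, 16.46)
print(f"boundaries: {low_cut:.2f} and {high_cut:.2f}")

for score in (118, 119, 134, 135):
    print(score, nelson_denny_level(score, low_cut, high_cut))
```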

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd

High (≥135)              39   141.26                  5.19
Mid (119–134)            37   125.65                  5.09
Low (≤118)               25   102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥135)              28        3        8        39
Mid (119–134)            10        7       20        37
Low (≤118)                5        7       13        25
Reclassification total   43       17       41       101

Percentage
High (≥135)              71.8      7.7     20.5     100.0
Mid (119–134)            27.0     18.9     54.1     100.0
Low (≤118)               20.0     28.0     52.0     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures; some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillside, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
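The point values stated in the directions above amount to a weighted tally of correct chart entries. The sketch below illustrates only that tally; `WEIGHTS`, `chart_score` and the sample response are hypothetical, and the sketch deliberately ignores the partial credit for single-word answers and the per-category maxima that the full rubric in Appendix 1b applies.

```python
# Point values per correct entry, as stated in the chart directions.
WEIGHTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum weighted points for correct entries; incorrect entries carry no penalty."""
    return sum(WEIGHTS[category] * n for category, n in correct_counts.items())


# Hypothetical response: number of correct entries per category.
response = {"problems": 2, "causes": 3, "effects": 1, "solutions": 4, "examples": 5}
print(chart_score(response))  # 2*10 + 3*5 + 1*5 + 4*10 + 5*1 = 85
```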


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured)/growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water/water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter Environmental Laws (resolutions/acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Examples for Problems (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples for Causes (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• TN Luttrell corp building permit situation

Examples for Solutions (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

(Nineteen ruled lines are provided for the response.)


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
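Every combination listed in the macrostructure rubric reduces to the total number of macrostructures identified across the two texts. The sketch below shows that mapping to a band label; the function name `macrostructure_band` and its `attempted` flag are illustrative assumptions, and the sketch returns labels only, leaving the numeric values to the rubric itself.

```python
def macrostructure_band(text_a, text_b, attempted=True):
    """Map macrostructures identified in each text (0-4) to a rubric band label."""
    if not attempted:
        return "No Response"
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has at most 4 macrostructures")
    total = text_a + text_b
    if total == 8:
        return "Excellent"
    if total >= 6:
        return "Good"
    if total >= 4:
        return "Fair"
    if total >= 2:
        return "Poor"
    return "Very Poor"


print(macrostructure_band(4, 2))  # Good
print(macrostructure_band(3, 1))  # Fair
```

Using the total rather than the individual pairs keeps the rule short, and it agrees with each pairing the rubric spells out (for example, 4 and 0, 3 and 1, and 2 and 2 all give Fair).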


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.

Page 6: Reading to learn and reading to integrate: new tasks for ...englishvls.hunnu.edu.cn/Downloads/LangTst/tst_014.pdfintegrate: new tasks for reading comprehension tests? Latricia Trites

of basic reading comprehension related directly to the text included in the new task. Thus, each participant completed a total of seven different instruments.

a Existing instruments: Initial levels of reading comprehension were determined based on the Nelson–Denny Reading Test (Nelson–Denny), Form G, used to identify the reading levels of the NSs, and three retired versions of the Institutional TOEFL Reading Comprehension Section (TOEFL Reading Comprehension), used to identify the reading levels of the NNSs. Although each of these tests was used to assess reading levels in the population for which it had been developed, all 251 participants took both tests in order to provide comparative data. All 251 participants also completed a brief computer familiarity questionnaire.

Participants’ computer familiarity was determined through an 11-item questionnaire based on a longer 23-item questionnaire previously developed by ETS (Eignor et al., 1998). In the present study, we used only the 11 items that loaded the most heavily on the major factors resulting from administration to a large sample of TOEFL participants. For these 11 items, developers determined the reliability to be .93 using a split-half method (Eignor et al., 1998: 22). This brief questionnaire took approximately 5 minutes to complete; reliability in our sample, using coefficient alpha, was .87.
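For readers unfamiliar with coefficient alpha, the standard formula is alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below applies that formula to a small made-up data set; the function name `cronbach_alpha` and the sample responses are hypothetical and are not the study's data or software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a table with one row per respondent, one column per item."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores do not vary, alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items.
responses = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(responses), 3))  # 0.667
```

In this toy case each item has variance 1 and the totals (3, 3, 6) have variance 3, so alpha is 2 × (1 - 2/3), about .67.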

b Texts used for new measures: In developing the new tasks, we selected texts that would conform to the design specifications of TOEFL 2000. They were problem/solution texts, recommended as one of the potentially relevant text types for TOEFL 2000 (Enright et al., 1998). Longer texts were used because these represented more challenging and authentic academic tasks (Enright et al., 1998). We used one 1200-word and two 600-word texts. The longer text (Tennesen, 1997) was used to assess Reading to Learn, and the two 600-word texts (Monks, 1997; Zimmerman, 1997) were used to assess Reading to Integrate. We chose these text lengths based on work by Meyer (1985a) and further research by the first author indicating that natural science texts between 1200 and 1500 words included representation of all necessary macro-rhetorical structures of problem/solution texts, with or without explicit signaling. While 1200–1500 word texts provide optimal representation of the macro-rhetorical structures, texts of 600 words provide all the basic macro-rhetorical structures present in problem/solution texts. Thus, these lengths were long enough for adequate argumentation but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales, such as the Flesch–Kincaid, Coleman–Liau, and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.
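As an illustration of how one of the named readability scales works, the Flesch–Kincaid grade level is commonly computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) - 15.59. The sketch below applies that published formula to hypothetical counts; the function names and the counts are assumptions for illustration, and the study does not report which software or counts were used.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts of words, sentences, and syllables."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def within_grade_band(grade, low=11.0, high=12.0):
    """True when a grade level falls inside the band reported for the study texts."""
    return low <= grade <= high


# Hypothetical counts for a 600-word text: 30 sentences and 1000 syllables.
grade = flesch_kincaid_grade(words=600, sentences=30, syllables=1000)
print(round(grade, 2), within_grade_band(grade))
```

With these invented counts the text averages 20 words per sentence and about 1.67 syllables per word, which places it just under grade 12.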

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants' ability to read to learn. Spivey (1997: 69) suggests that readers' categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer's (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement, because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing, such as the creation of rhetorical style, grammaticality or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

¹ Students received no points for information improperly placed or for information not found in the text.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
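The weighting scheme described for the Reading to Learn chart (10, five and one points) can be illustrated with a short sketch. This is not the authors' scoring guide, which relied on trained raters and a normed rubric; the function name and the example response are hypothetical, and the sketch assumes a rater has already decided which chart entries are correct and correctly placed.

```python
# Point weights taken from the description above: 10 for problems and
# solutions, 5 for causes and effects, 1 for examples.
WEIGHTS = {"problem": 10, "solution": 10, "cause": 5, "effect": 5, "example": 1}


def score_chart(correct_counts):
    """Sum weighted points for correctly placed chart entries.

    correct_counts maps a category name to the number of entries a rater
    judged correct and correctly placed. Misplaced entries, or entries not
    found in the text, earn nothing and are simply left out of the counts.
    """
    unknown = set(correct_counts) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(WEIGHTS[cat] * n for cat, n in correct_counts.items())


# Hypothetical response: 2 problems, 1 solution, 3 causes, 2 effects, 4 examples.
example = {"problem": 2, "solution": 1, "cause": 3, "effect": 2, "example": 4}
print(score_chart(example))  # 2*10 + 1*10 + 3*5 + 2*5 + 4*1 = 59
```

Because the higher-order categories carry ten times the weight of examples, a response listing many details but few problems or solutions would score low under this scheme.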

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each of equal ability, as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, was also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading, with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1 and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure, the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                            n      Mean      sd       kMax

Nelson–Denny
  NSU                            105    126.48    16.46    156
  NNSU                           106     67.24    31.91    156
  NNSG                            40     88.88    21.88    156
  Total participants             251     95.47    36.93    156

TOEFL Reading Comprehension
  NSU                            105     61.30     4.24     67
  NNSU                           106     50.30     8.53     67
  NNSG                            40     57.15     4.55     67
  Total participants             251     55.99     8.19     67

Computer familiarity
  NSU                            104     38.08     3.60     44
  NNSU                           104     34.82     5.99     44
  NNSG                            40     35.63     6.02     44
  Total participants             248     36.31     5.33     44

Note: kMax = number of items or maximum possible score.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                            n      Mean      sd       kMax

Reading to Learn (chart)
  NSU                            105     51.85    19.86    241
  NNSU                           106     31.73    19.50    241
  NNSG                            40     44.68    19.27    241
  Total participants             251     42.21    21.64    241

Basic Comprehension Test 1
  NSU                            105     16.98     2.47     20
  NNSU                           106     11.73     4.25     20
  NNSG                            40     14.98     3.50     20
  Total participants             251     14.44     4.23     20

Reading to Integrate (synthesis)
  NSU                            101     63.65    11.05     80
  NNSU                           103     37.24    21.76     80
  NNSG                            40     53.60    21.03     80
  Total participants             244     50.86    21.63     80

Basic Comprehension Test 2
  NSU                            105     15.91     2.78     20
  NNSU                           106      9.75     4.54     20
  NNSG                            40     12.85     3.61     20
  Total participants             251     12.82     4.72     20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn, with group status, medium of text presentation and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium or test order combinations. The only significant main effect was group membership.

Table 3 Range of scores for new measures for three participant groups

Group                            n      Minimum    Maximum    kMax

Reading to Learn (chart)
  NSU                            105    14         120        241
  NNSU                           106     0          86        241
  NNSG                            40     3          94        241
  Total participants             251     0         120        241

Reading to Integrate (synthesis)
  NSU                            101    38          80         80
  NNSU                           103     0          80         80
  NNSG                            40     5          80         80
  Total participants             244     0          80         80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
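As an illustration of how the F ratios in an ANOVA source table are derived, the sketch below recomputes the Group effect for Reading to Learn from the Group and Error rows of Table 4 (sums of squares 21869.29 and 92997.45, with 2 and 239 degrees of freedom). The function name is hypothetical, and the calculation is a simple check on the tabled values rather than a reanalysis of the data.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return (mean square effect, mean square error, F) for one source row."""
    ms_effect = ss_effect / df_effect    # mean square = sum of squares / df
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error


# Group and Error rows as reported in Table 4 (Reading to Learn).
ms_group, ms_error, f_group = f_ratio(21869.29, 2, 92997.45, 239)
print(round(ms_group, 2), round(ms_error, 2), round(f_group, 2))
```

The same division applies to every other row of the table; each effect's mean square is compared against the single error mean square.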

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

Table 4 Performance on Reading to Learn measure by groups, medium and test order (n = 251) (univariate analysis of variance)

Source                           Type III sum    df     Mean square    F
                                 of squares

Group                            21869.29          2    10934.64       28.10*
Medium                            1215.86          1     1215.86        3.13
Test order                         294.98          1      294.98        0.76
Group × medium                     334.81          2      167.40        0.43
Group × test order                   4.37          2        2.19        0.01
Medium × test order                391.73          1      391.73        1.01
Group × medium × test order        575.29          2      287.65        0.74
Error                            92997.45        239      389.11

Note: *p < .05.

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group    n      Group    n      Mean difference    Standard error

NSU      105    NNSU     106    20.12*             2.72
                NNSG      40     7.17              3.67
NNSU     106    NNSG      40    12.95*             3.66

Note: *p < .05.

Table 6 Performance on Reading to Integrate measure by groups, medium and test order (n = 244ᵃ) (univariate analysis of variance)

Source                           Type III sum    df     Mean square    F
                                 of squares

Group                            36294.33          2    18147.17       55.82ᵇ
Medium                             192.95          1      192.95        0.59
Test order                          97.83          1       97.83        0.30
Group × medium                      11.82          2        5.91        0.02
Group × test order                  30.14          2       15.07        0.05
Medium × test order               1098.72          1     1098.72        3.38
Group × medium × test order       1488.58          2      744.29        2.29
Error                            75430.37        232      325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group    n      Group    n      Mean difference    Standard error

NS       101    NNSU     103    26.41ᵇ             2.53
                NNSG      40    10.05ᵇ             3.37
NNSU     103    NNSG      40    16.36ᵇ             3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL            Basic            Basic            Reading     Reading to
                               Reading          Comprehension    Comprehension    to Learn    Integrateᵃ
                               Comprehension    Test 1           Test 2

Nelson–Denny                   .90ᵇ             .85ᵇ             .84ᵇ             .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ             .84ᵇ             .64ᵇ        .69ᵇ
Basic Comprehension 1                           1.00             .84ᵇ             .68ᵇ        .68ᵇ
Basic Comprehension 2                                            1.00             .68ᵇ        .70ᵇ
Reading to Learn                                                                  1.00        .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid) and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance, treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
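The outlier screen mentioned above can be sketched as follows for the two-predictor case. This is an illustrative reconstruction, not the authors' SPSS procedure: the function names and the sample scores are hypothetical, and the .001 criterion with 2 degrees of freedom is an assumption about how the Chi-Square comparison was applied. With two predictors, the Chi-Square critical value can be written exactly as -2 ln(alpha).

```python
import math
from statistics import mean


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each (x, y) pair from the group centroid."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Sample variances and covariance (n - 1 in the denominator).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2 x 2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, alpha=0.001):
    """Indices of cases whose squared distance exceeds the df = 2 critical value."""
    critical = -2 * math.log(alpha)      # about 13.82 when alpha = .001
    return [i for i, d in enumerate(mahalanobis_sq(points)) if d > critical]


# Hypothetical (Reading to Learn, Reading to Integrate) score pairs.
scores = [(52, 64), (31, 37), (45, 54), (60, 70), (28, 40), (49, 58)]
print(flag_outliers(scores))
```

In a sample this small no case can exceed the criterion, so the sketch only demonstrates the calculation; the screen becomes informative with group sizes like those in the study.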

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid and low. These reading ability groups of high, mid and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
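The three-way grouping just described amounts to a simple threshold rule on the TOEFL Reading Comprehension scaled score. A minimal sketch follows; the function name is hypothetical, and the cut points (56 and above, 50 to 55, 49 and below) are taken directly from the text.

```python
def toefl_reading_level(scaled_score):
    """Classify a TOEFL Reading Comprehension scaled score as high, mid or low."""
    if scaled_score >= 56:
        return "high"   # corresponds to 550 and above
    if scaled_score >= 50:
        return "mid"    # corresponds to 500-549
    return "low"        # corresponds to below 500


for score in (61, 55, 50, 42):
    print(score, toefl_reading_level(score))
```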

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension    sd

High (≥ 56)               57    59.68                                  3.03
Mid (50–55)               42    52.81                                  1.67
Low (≤ 49)                44    42.11                                  5.47
Total                    143    52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid and low reading ability. We compared initial reading ability levels with high, mid and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
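The relationship between the reported eigenvalue, Wilks' Lambda and the Chi-Square statistic can be sketched with the standard formulas: Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, and Bartlett's approximation gives Chi-Square = -(N - 1 - (p + g)/2) ln(Lambda). The sketch below assumes two predictors (Reading to Learn and Reading to Integrate) and three groups, and uses only the first eigenvalue, so it approximates rather than exactly reproduces the SPSS output. The function names are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's Chi-Square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(lam)


# Nonnative speakers: eigenvalue .92, N = 143; 2 predictors and 3 groups assumed.
lam = wilks_lambda([0.92])
chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
print(round(lam, 2), round(chi2, 1))
```

Under these assumptions the computed Lambda rounds to the reported .52, and the Chi-Square falls within a small rounding margin of the reported 90.93.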

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
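The reclassification figures can be tallied directly from the cell counts in Table 10. The sketch below is a simple cross-check, with a hypothetical function name; rows are initial reading ability groups and columns are predicted groups, both ordered high, mid, low.

```python
# Cell counts from Table 10 (rows: initial group; columns: predicted group).
TABLE_10 = [
    [37, 17, 3],    # initial high
    [13, 18, 11],   # initial mid
    [2, 6, 36],     # initial low
]


def reclassification_summary(matrix):
    """Count cases that stayed, moved to a higher, or moved to a lower category."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    same = sum(matrix[i][i] for i in range(size))
    # A predicted column to the left of the diagonal is a higher category.
    higher = sum(matrix[i][j] for i in range(size) for j in range(i))
    lower = sum(matrix[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "same": same, "higher": higher, "lower": lower}


summary = reclassification_summary(TABLE_10)
for key in ("same", "higher", "lower"):
    print(key, summary[key], round(100 * summary[key] / summary["total"], 1))
```

Row-wise percentages, such as the 64.9% of the high group who remained high, follow in the same way by dividing each diagonal cell by its row total.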

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group       Predicted group membership for Reading to        Initial
                            Learn/Reading to Integrate Composite             classification
                            High         Mid          Low                    total

Count
  High (≥ 56)               37           17            3                      57
  Mid (50–55)               13           18           11                      42
  Low (≤ 49)                 2            6           36                      44
  Reclassification total    52           41           50                     143

Percentage
  High (≥ 56)               64.9         29.8          5.3                   100.0
  Mid (50–55)               31.0         42.9         26.2                   100.0
  Low (≤ 49)                 4.5         13.6         81.8                   100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
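The half-standard-deviation split can be illustrated with the native speaker Nelson–Denny statistics from Table 1 (mean 126.48, sd 16.46, n = 105). The sketch below is one plausible reading of the procedure: the function name is hypothetical, and the rounding rule (rounding the lower bound down and the upper bound up to whole scores) is an assumption that happens to reproduce the reported cut scores.

```python
import math


def half_sd_cut_scores(sample_mean, sample_sd, width=0.5):
    """Return (highest 'low' score, lowest 'high' score) for a three-way split."""
    lower_bound = sample_mean - width * sample_sd
    upper_bound = sample_mean + width * sample_sd
    # Assumed rounding: whole-number scores at or below the lower bound are low,
    # whole-number scores at or above the upper bound are high.
    return math.floor(lower_bound), math.ceil(upper_bound)


low_max, high_min = half_sd_cut_scores(126.48, 16.46)
print(low_max, high_min)  # low: <= 118, mid: 119-134, high: >= 135
```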

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group     n       Mean on Nelson–Denny     sd
High (≥135)               39      141.26                   5.19
Mid (119–134)             37      125.65                   5.09
Low (≤118)                25      102.88                   10.98
Total                     101     126.04                   16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating that the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function (see footnote 6), justifying the composite calculations.
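The reported eigenvalue, variance share, Wilks' Lambda, and Chi-Square value can be checked against one another. The sketch below rests on several assumptions that are not stated in the text: that there are two discriminating variables (the Reading to Learn and Reading to Integrate scores), that the second eigenvalue is whatever remains of the variance share, and that the Chi-Square value comes from Bartlett's approximation. Because the inputs are rounded, only approximate agreement should be expected.

```python
import math

# Values reported in the text for the native speaker analysis.
EIGENVALUE_1 = 0.31
SHARE_1 = 0.987   # proportion of variance attributed to function 1

# Assumption: function 2 accounts for the remaining share of the summed
# eigenvalues.
EIGENVALUE_2 = EIGENVALUE_1 * (1 - SHARE_1) / SHARE_1

# Wilks' Lambda for all functions is the product of 1 / (1 + eigenvalue).
wilks_lambda = 1.0
for eigenvalue in (EIGENVALUE_1, EIGENVALUE_2):
    wilks_lambda *= 1.0 / (1.0 + eigenvalue)

# Bartlett's approximation: chi2 = -(N - 1 - (p + g) / 2) * ln(Lambda),
# with p discriminating variables and g groups (p = 2 is an assumption).
N, P, G = 101, 2, 3
chi_square = -(N - 1 - (P + G) / 2) * math.log(wilks_lambda)
degrees_of_freedom = P * (G - 1)

if __name__ == "__main__":
    print(round(wilks_lambda, 2), round(chi_square, 1), degrees_of_freedom)
```

Under these assumptions the recomputed Lambda rounds to the reported .76, and the Chi-Square value lands close to the reported 26.39.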

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at –.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group     Predicted group membership for Reading to      Initial
                          Learn/Reading to Integrate Composite           classification total
                          High       Mid        Low
Count
  High (≥135)             28         3          8          39
  Mid (119–134)           10         7          20         37
  Low (≤118)              5          7          13         25
  Reclassification total  43         17         41         101
Percentage
  High (≥135)             71.8       7.7        20.5       100.0
  Mid (119–134)           27.0       18.9       54.1       100.0
  Low (≤118)              20.0       28.0       52.0       100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower-level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.
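The near disappearance of the mid category for native speakers can be seen by looking at the predicted-group totals in Table 12 rather than at the diagonal. The sketch below uses only the counts printed in that table; the function name is illustrative.

```python
# Cell counts from Table 12. Keys are the initial Nelson-Denny groups;
# each list holds the predicted High, Mid, and Low counts.
TABLE_12 = {
    "High": [28, 3, 8],
    "Mid": [10, 7, 20],
    "Low": [5, 7, 13],
}
PREDICTED_LEVELS = ["High", "Mid", "Low"]


def predicted_group_shares(table):
    """Percentage of all participants predicted into each level."""
    rows = list(table.values())
    n = sum(sum(row) for row in rows)
    column_totals = [sum(row[j] for row in rows)
                     for j in range(len(PREDICTED_LEVELS))]
    return {level: round(100 * total / n, 1)
            for level, total in zip(PREDICTED_LEVELS, column_totals)}


if __name__ == "__main__":
    print(predicted_group_shares(TABLE_12))
```

The column totals (43, 17, and 41 of 101) show most participants falling into the high or low predicted groups, which is the dichotomy described in the paragraph above.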

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
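The point values stated in these directions can be written as a small scoring function. This is a simplified sketch of the directions only: it ignores the partial credit and category maxima set out in the Appendix 1b scoring rubric, and the function name and the sample counts are hypothetical.

```python
# Point values from the chart completion directions.
POINTS_PER_CORRECT_RESPONSE = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_completion_score(correct_counts):
    """Sum points over correct responses; incorrect ones carry no penalty.

    `correct_counts` maps each category to the number of correct
    responses a rater accepted in that category.
    """
    unknown = set(correct_counts) - set(POINTS_PER_CORRECT_RESPONSE)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(POINTS_PER_CORRECT_RESPONSE[category] * count
               for category, count in correct_counts.items())


if __name__ == "__main__":
    # Hypothetical response: 2 problems, 3 causes, 1 effect, 1 solution,
    # and 4 examples judged correct.
    sample = {"problems": 2, "causes": 3, "effects": 1,
              "solutions": 1, "examples": 4}
    print(chart_completion_score(sample))
```

Only correct responses enter the function, which mirrors the statement that incorrect responses are not penalized.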


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points), 40 maximum
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points), 90 maximum
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Examples for Problems (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell Corp. building permit situation

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Examples for Causes (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples for Solutions (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV, identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[19 ruled lines for the essay response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total: 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total: 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total: 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total: 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total: 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


lengths were long enough for adequate argumentation but not so long that they were excessively redundant (Enright et al., 1998). Texts were also matched for readability according to standard readability scales such as the Flesch–Kincaid, Coleman–Liau and Bormuth scales, and averaged a minimum of grade level 11.0 to 12.0 on these scales. Also, all texts pertained to natural and social sciences; each text covered environmental issues such as air and water pollution (Enright et al., 1998). Thus, text topics were similar across tasks.

c New instruments used in the study: Three new reading measures were used in this study to assess Reading to Learn, Reading to Integrate and Basic Comprehension. Trites (2000: Chapters 2 and 3) presents a more extensive review of literature and rationale for development of the new measures.

• Reading to Learn: The first new measure, completion of a chart, was used to determine participants' ability to read to learn. Spivey (1997: 69) suggests that readers' categorization of information in text offers insight into their cognitive processes and their making of meaning. We designed a measure to be used with a 1200-word text that students read on either paper or computer. Students were asked to recall, identify and categorize information from the text on a chart reflecting macro-rhetorical structures, called macrostructures in this study (problems and solutions), and other types of information from problem/solution texts (causes, effects and examples), categories based on the work of Meyer (1985a). The scoring rubric, based on work by Meyer (1985b) and later modified by Jamieson et al. (1993), awarded points only for the upper levels of textual structure represented on the chart (for task and scoring rubric, see Appendix 1). We weighted the information supplied on the chart as follows: 10 points for correct information in the problem and solution categories, five points for correct information supplied in the cause and effect categories, and one point for accurate examples. This weighting reflects Meyer's (1985b) hierarchical levels, which characterize problem and solution propositions as higher order structures, while the other categories represent lower order propositions.¹ The theoretical maximum score for this scale was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

¹ Students received no points for information improperly placed or for information not found in the text.

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing such as the creation of rhetorical style, grammaticality or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

² We recognize that tasks requiring high inference measures plus extensive norming and revision of the scoring rubric pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
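The Reading to Learn weighting described above (10 points for problem and solution entries, five for cause and effect entries, one for examples) can be illustrated with a minimal sketch. The function name `chart_score`, the dictionary layout and the sample counts are hypothetical and are not taken from the study's scoring guide; the sketch only shows how the stated weights combine into a total.

```python
# Category weights as stated for the Reading to Learn chart.
# The data layout below is an assumption made for illustration only.
WEIGHTS = {"problem": 10, "solution": 10, "cause": 5, "effect": 5, "example": 1}


def chart_score(correct_counts):
    """Return the weighted total for a dict of category -> number of correct entries.

    Misplaced or unsupported information is assumed to have been excluded
    from the counts already, since it earned no points.
    """
    unknown = set(correct_counts) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(WEIGHTS[category] * n for category, n in correct_counts.items())


# Hypothetical response: 2 problems, 1 solution, 3 causes, 2 effects, 4 examples.
example_response = {"problem": 2, "solution": 1, "cause": 3, "effect": 2, "example": 4}
example_total = chart_score(example_response)
print(example_total)
```

With these hypothetical counts the problem and solution entries contribute 30 points, the cause and effect entries 25, and the examples 4, which shows how strongly the upper levels of textual structure dominate the total.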

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups each, of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, were also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³
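The balance check on the two medium subgroups can be sketched as a pooled-variance independent t-test. The scores below are invented for illustration and are not the study's data, and the choice of the pooled (Student) form is an assumption, since the text does not say which variant was run. The critical value 2.306 is the standard two-tailed .05 value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's independent-samples t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical baseline reading scores for two matched subgroups.
paper = [52, 48, 55, 50, 45]
computer = [51, 49, 54, 51, 46]

t_stat, df = pooled_t(paper, computer)
T_CRITICAL_DF8 = 2.306  # two-tailed, alpha = .05, df = 8
balanced = abs(t_stat) < T_CRITICAL_DF8
print(f"t({df}) = {t_stat:.3f}, subgroups balanced: {balanced}")
```

A non-significant result of this kind is what allows the two media to be compared without adjusting for initial reading level.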

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each) either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1 and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                          n      Mean     sd      kMax

Nelson–Denny
NSU                            105    126.48   16.46   156
NNSU                           106     67.24   31.91   156
NNSG                            40     88.88   21.88   156
Total participants             251     95.47   36.93   156

TOEFL Reading Comprehension
NSU                            105     61.30    4.24    67
NNSU                           106     50.30    8.53    67
NNSG                            40     57.15    4.55    67
Total participants             251     55.99    8.19    67

Computer familiarity
NSU                            104     38.08    3.60    44
NNSU                           104     34.82    5.99    44
NNSG                            40     35.63    6.02    44
Total participants             248     36.31    5.33    44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

Table 2 Descriptive statistics for new measures for three participant groups

Group                              n      Mean    sd      kMax

Reading to Learn (chart)
NSU                                105    51.85   19.86   241
NNSU                               106    31.73   19.50   241
NNSG                                40    44.68   19.27   241
Total participants                 251    42.21   21.64   241

Basic Comprehension Test 1
NSU                                105    16.98    2.47    20
NNSU                               106    11.73    4.25    20
NNSG                                40    14.98    3.50    20
Total participants                 251    14.44    4.23    20

Reading to Integrate (synthesis)
NSU                                101    63.65   11.05    80
NNSU                               103    37.24   21.76    80
NNSG                                40    53.60   21.03    80
Total participants                 244    50.86   21.63    80

Basic Comprehension Test 2
NSU                                105    15.91    2.78    20
NNSU                               106     9.75    4.54    20
NNSG                                40    12.85    3.61    20
Total participants                 251    12.82    4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups. The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

Table 3 Range of scores for new measures for three participant groups

Group                              n      Minimum   Maximum   kMax

Reading to Learn (chart)
NSU                                105    14        120       241
NNSU                               106     0         86       241
NNSG                                40     3         94       241
Total participants                 251     0        120       241

Reading to Integrate (synthesis)
NSU                                101    38         80        80
NNSU                               103     0         80        80
NNSG                                40     5         80        80
Total participants                 244     0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background as well as level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing, whether participants took Reading to Learn or Reading to Integrate first, had no significant effect.
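As a consistency check on the ANOVA summarized in Table 4, each F ratio can be recomputed as the effect's mean square divided by the error mean square. The sketch below is illustrative only: it assumes the mean squares and F values in Table 4 carry two decimal places (for example, an error mean square of 389.11), and the variable names are hypothetical.

```python
# Mean squares read from Table 4, assuming two decimal places throughout.
MS_ERROR = 389.11

mean_squares = {
    "group": 10934.64,
    "medium": 1215.86,
    "test order": 294.98,
    "group x medium": 167.40,
    "group x test order": 2.19,
    "medium x test order": 391.73,
    "group x medium x test order": 287.65,
}

# F values as printed in Table 4, under the same decimal assumption.
reported_f = {
    "group": 28.10,
    "medium": 3.13,
    "test order": 0.76,
    "group x medium": 0.43,
    "group x test order": 0.01,
    "medium x test order": 1.01,
    "group x medium x test order": 0.74,
}

# F = MS_effect / MS_error for every source of variance.
computed_f = {source: ms / MS_ERROR for source, ms in mean_squares.items()}

for source, f_value in computed_f.items():
    print(f"{source:30s} F = {f_value:6.2f} (reported {reported_f[source]:.2f})")
```

Under these assumptions the recomputed ratios agree with the printed values to within rounding, and only the group effect is large relative to the error term.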

Table 4 Performance on Reading to Learn measure by groups, medium and test order (n = 251) (univariate analysis of variance)

Source                           Type III sum of squares   df    Mean square   F

Group                            21869.29                  2     10934.64      28.10*
Medium                            1215.86                  1      1215.86       3.13
Test order                         294.98                  1       294.98        .76
Group × medium                     334.81                  2       167.40        .43
Group × test order                   4.37                  2         2.19        .01
Medium × test order                391.73                  1       391.73       1.01
Group × medium × test order        575.29                  2       287.65        .74
Error                            92997.45                  239     389.11

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group    n      Group    n      Mean difference   Standard error

NSU      105    NNSU     106    20.12*            2.72
                NNSG      40     7.17             3.67
NNSU     106    NNSG      40    12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium and test order (n = 244ᵃ) (univariate analysis of variance)

Source                           Type III sum of squares   df    Mean square   F

Group                            36294.33                  2     18147.17      55.82ᵇ
Medium                             192.95                  1       192.95        .59
Test order                          97.83                  1        97.83        .30
Group × medium                      11.82                  2         5.91        .02
Group × test order                  30.14                  2        15.07        .05
Medium × test order               1098.72                  1      1098.72       3.38
Group × medium × test order       1488.58                  2       744.29       2.29
Error                            75430.37                  232     325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
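The concern that pooled correlations can be inflated by combining groups with different means can be made concrete with a small constructed example. The data below are entirely artificial and chosen so the arithmetic is easy to follow: within each hypothetical group the two measures are uncorrelated, yet the pooled sample shows a very high correlation simply because the groups differ in level on both measures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Artificial scores on two measures for a low-scoring and a high-scoring group.
group_a_x, group_a_y = [1, 2, 1, 2], [2, 1, 1, 2]
group_b_x, group_b_y = [11, 12, 11, 12], [12, 11, 11, 12]

r_a = pearson_r(group_a_x, group_a_y)
r_b = pearson_r(group_b_x, group_b_y)
# Pooling the groups adds between-group variance to both measures.
r_pooled = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

print(f"within A: {r_a:.2f}, within B: {r_b:.2f}, pooled: {r_pooled:.2f}")
```

This is why the within-group correlations, rather than the pooled ones, are the more informative evidence about how the measures relate.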

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group    n      Group    n      Mean difference   Standard error

NSU      101    NNSU     103    26.41ᵇ            2.53
                NNSG      40    10.05ᵇ            3.37
NNSU     103    NNSG      40    16.36ᵇ            3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading     Reading to
                               Comprehension    Test 1                Test 2                to Learn    Integrateᵃ

Nelson–Denny                   .90ᵇ             .85ᵇ                  .84ᵇ                  .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ                  .84ᵇ                  .64ᵇ        .69ᵇ
Basic Comprehension 1                           1.00                  .84ᵇ                  .68ᵇ        .68ᵇ
Basic Comprehension 2                                                 1.00                  .68ᵇ        .70ᵇ
Reading to Learn                                                                            1.00        .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels, high, middle (mid) and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid and low. These reading ability groups of high, mid and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
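The three-way split for nonnative speakers amounts to a simple threshold rule on the TOEFL Reading Comprehension scaled score (56 and above, 50 to 55, 49 and below). A minimal sketch follows; the function name `reading_level` and the sample scores are hypothetical.

```python
def reading_level(scaled_score):
    """Classify a TOEFL Reading Comprehension scaled score as high, mid or low."""
    if scaled_score >= 56:
        return "high"  # corresponds to a total score of 550 or above
    if scaled_score >= 50:
        return "mid"   # corresponds to 500-549
    return "low"       # corresponds to below 500


# Hypothetical scaled scores, including the boundary values.
sample_scores = [67, 56, 55, 50, 49, 31]
levels = [reading_level(score) for score in sample_scores]
print(list(zip(sample_scores, levels)))
```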

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension   sd

High (≥56)               57     59.68                                 3.03
Mid (50–55)              42     52.81                                 1.67
Low (≤49)                44     42.11                                 5.47

Total                    143    52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid and low reading ability. We compared initial reading ability levels with high, mid and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
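The eigenvalue and Wilks' Lambda reported for a discriminant analysis are tied together by a standard identity: Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, and the canonical correlation of a function is the square root of eigenvalue/(1 + eigenvalue). The sketch below applies the identity to the nonnative speaker result, reading the reported eigenvalue as .92 and Lambda as .52. Treating the single reported function as the only contributor is an assumption, and the function names are hypothetical.

```python
from math import prod, sqrt


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    return prod(1 / (1 + ev) for ev in eigenvalues)


def canonical_correlation(eigenvalue):
    """Canonical correlation associated with one discriminant function."""
    return sqrt(eigenvalue / (1 + eigenvalue))


# Nonnative speaker analysis: one function, eigenvalue read as .92.
nns_lambda = wilks_lambda([0.92])
nns_canonical_r = canonical_correlation(0.92)

print(f"Wilks' Lambda = {nns_lambda:.2f}")
print(f"canonical correlation = {nns_canonical_r:.2f}")
```

A Lambda near .5 means that roughly half of the variance in the discriminant scores is not accounted for by differences among the three reading ability groups, which fits a result in which many participants are reclassified.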

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
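The reclassification summary can be tallied directly from the cell counts in Table 10. In the sketch below, rows are the initial basic comprehension groups and columns are the predicted composite groups, both ordered high, mid, low; the variable names are hypothetical. Summing the cells gives 91 participants on the diagonal and 52 off it, the counts that correspond to the reported 64% and 36.4%.

```python
# Cell counts from Table 10: rows = initial group, columns = predicted group,
# both ordered high, mid, low.
counts = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]

size = len(counts)
total = sum(sum(row) for row in counts)

# Diagonal cells: same category on both classifications.
unchanged = sum(counts[i][i] for i in range(size))
# A smaller column index than row index means a higher predicted category.
moved_up = sum(counts[i][j] for i in range(size) for j in range(size) if j < i)
moved_down = sum(counts[i][j] for i in range(size) for j in range(size) if j > i)

for label, n in [("unchanged", unchanged), ("moved up", moved_up), ("moved down", moved_down)]:
    print(f"{label:10s} {n:3d} ({100 * n / total:.1f}%)")
```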

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥ 56)              37       17       3        57
Mid (50–55)              13       18       11       42
Low (≤ 49)               2        6        36       44
Reclassification total   52       41       50       143

Percentage
High (≥ 56)              64.9     29.8     5.3      100.0
Mid (50–55)              31.0     42.9     26.2     100.0
Low (≤ 49)               4.5      13.6     81.8     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
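The three-way split described above amounts to a simple cutoff rule. The sketch below is illustrative only: `classify_reading_ability` and the two cutoff dictionaries are hypothetical names, the Nelson–Denny cutoffs are those stated in the text, and the TOEFL cutoffs are read from the group labels in Table 9 on the assumption that the high and low labels mean "56 and above" and "49 and below".

```python
def classify_reading_ability(score, low_max, high_min):
    """Assign a basic comprehension group from a test score and two cutoffs."""
    if score >= high_min:
        return "high"
    if score <= low_max:
        return "low"
    return "mid"


# Cutoffs stated in the text for native speakers (Nelson-Denny).
NELSON_DENNY = {"low_max": 118, "high_min": 135}
# Cutoffs assumed from the Table 9 group labels for nonnative speakers (TOEFL reading).
TOEFL_READING = {"low_max": 49, "high_min": 56}

for score in (118, 119, 134, 135):
    print(score, classify_reading_ability(score, **NELSON_DENNY))
```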

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd
High (≥ 135)             39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤ 118)              25     102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating that the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.
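The reported eigenvalues, Wilks' Lambda values, and Chi-Square values can be cross-checked against one another. The sketch below rests on assumptions that the article does not state: that Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, that the second eigenvalue can be backed out from the reported share of variance, and that the Chi-Square comes from Bartlett's approximation with two predictors and three groups. The function names are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


def second_eigenvalue(first, first_share):
    """Back out the second eigenvalue from the first one's share of variance."""
    return first * (1.0 - first_share) / first_share


# Nonnative speakers: eigenvalue .92 with 99.9% of variance, n = 143.
nns_lambda = wilks_lambda([0.92, second_eigenvalue(0.92, 0.999)])
nns_chi = bartlett_chi_square(nns_lambda, n_cases=143, n_predictors=2, n_groups=3)

# Native speakers: eigenvalue .31 with 98.7% of variance, n = 101.
ns_lambda = wilks_lambda([0.31, second_eigenvalue(0.31, 0.987)])
ns_chi = bartlett_chi_square(ns_lambda, n_cases=101, n_predictors=2, n_groups=3)

print(f"nonnative: lambda = {nns_lambda:.2f}, chi-square = {nns_chi:.1f}")
print(f"native:    lambda = {ns_lambda:.2f}, chi-square = {ns_chi:.1f}")
```

Under these assumptions the Lambda values round to the reported .52 and .76, and the Chi-Square values land within about half a point of the reported 90.93 and 26.39; the small gaps are consistent with the eigenvalues having been rounded to two decimals.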

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate correlated with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥ 135)             28       3        8        39
Mid (119–134)            10       7        20       37
Low (≤ 118)              5        7        13       25
Reclassification total   43       17       41       101

Percentage
High (≥ 135)             71.8     7.7      20.5     100.0
Mid (119–134)            27.0     18.9     54.1     100.0
Low (≤ 118)              20.0     28.0     52.0     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
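The percentage panel of Table 12 follows from the count panel by dividing each cell by its row total. As a minimal sketch (the counts are transcribed from the table; `row_percentages` is a hypothetical helper, not part of the original analysis):

```python
# Counts from Table 12: predicted high, mid, low for each initial group.
TABLE_12_COUNTS = {
    "High": [28, 3, 8],
    "Mid": [10, 7, 20],
    "Low": [5, 7, 13],
}


def row_percentages(counts):
    """Express each count as a percentage of its row total, to one decimal."""
    total = sum(counts)
    return [round(100.0 * n / total, 1) for n in counts]


table_12_percentages = {
    group: row_percentages(counts) for group, counts in TABLE_12_COUNTS.items()
}
for group, percentages in table_12_percentages.items():
    print(group, percentages)
```

Row percentages are the natural reading here because each row answers the question of where members of one initial group ended up on the composite.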

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did, in fact, perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
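The point scheme in the directions can be sketched as a simple weighted count. This is an illustration only, not the scoring procedure used in the study: `chart_score` and `POINTS` are hypothetical names, partial credit for single-word responses (described in the Appendix 1b rubric) is not modeled, and the item counts in the usage example are those implied by the category maxima in Appendix 1b (40, 55, 30, and 90 points, plus 26 example points in total).

```python
# Point values per correct response, as stated in the task directions.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Total chart score from the number of correct responses per category.

    Incorrect responses carry no penalty, so only correct counts are needed.
    """
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Counts implied by the rubric maxima: 4 problems, 11 causes, 6 effects,
# 9 solutions, and 26 examples across all categories.
full_marks = chart_score(
    {"problems": 4, "causes": 11, "effects": 6, "solutions": 9, "examples": 26}
)
print(full_marks)
```

With those counts the total equals the 241 points listed as the maximum in the Appendix 1b rubric.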


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter Environmental Laws (resolutions/acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's written response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well-developed and accurate introduction and/or a conclusion yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well-developed and accurate Text Model and Situation Model for each of the texts yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion OR participant confuses texts and sees separate texts as one issue (problem).


0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other) OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other) OR accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
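As a worked illustration of how the three subscores in this rubric combine, the sketch below maps the number of macrostructures identified in each text to the macrostructure band and sums the three subscores into the 0–80 synthesis total. The function names are hypothetical and are not part of the original scoring guide; the integration and detail subscores are assumed to have been assigned by a rater, and the lowest attempted macrostructure band is taken from the band table (5 points).

```python
def macrostructure_points(text1_count, text2_count, attempted=True):
    """Map macrostructures identified per text (0-4 each) to the rubric band."""
    if not attempted:
        return 0  # No Response
    for count in (text1_count, text2_count):
        if not 0 <= count <= 4:
            raise ValueError("each text contains four macrostructures")
    total = text1_count + text2_count
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)


def synthesis_total(integration, macrostructures, details):
    """Sum the three rater-assigned subscores (maximum 50 + 25 + 5 = 80)."""
    if integration not in (0, 10, 20, 30, 40, 50):
        raise ValueError("integration is scored in steps of 10 from 0 to 50")
    if macrostructures not in (0, 5, 10, 15, 20, 25):
        raise ValueError("macrostructures are scored in steps of 5 from 0 to 25")
    if details not in range(6):
        raise ValueError("details are scored from 0 to 5")
    return integration + macrostructures + details
```

For example, a synthesis rated Good on integration (40), with 3 macrostructures identified in one text and 2 in the other, and Fair on details (3), would total `synthesis_total(40, macrostructure_points(3, 2), 3)`.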


was 241, which would result from maximum points given in all categories. The first author and two research assistants spent 35–40 hours creating, revising, and norming the scoring rubric and developing the scoring guide (Trites, 2000: Chapter 3). To determine interrater reliability, we used coefficient alpha rather than percentage of agreement because percentage of agreement inflates the likelihood of chance agreement (Hayes and Hatch, 1999). After norming, overall interrater reliability was .99 (coefficient alpha), with similarly high alpha coefficients for all subcategories.²

² We recognize that tasks requiring high inference measures, plus extensive norming and revision of the scoring rubric, pose feasibility issues in large-scale testing. Further research is needed to determine whether and how such scoring procedures could be adapted in standardized testing for numerous test-takers.

• Reading to Integrate: The second new measure assessed Reading to Integrate. The task used to assess Reading to Integrate required participants to read two 600-word texts and compose a written synthesis. The prompt asked students to make connections across the range of ideas presented; thus, we asked readers to synthesize information rather than summarize or make comparisons (Wiley and Voss, 1999). This synthesis was scored based on an analytic scale ranging from 0 to 80, reflecting readers' ability to recognize and manipulate the structure of the texts, include specific information, and express connections across texts through the use of cohesive devices (for task and scoring rubric, see Appendix 2). The test was designed to measure the integration of content from both readings and did not assess other aspects of writing, such as the creation of rhetorical style, grammaticality, or mechanics. The rubric was composed of three subcategories: integration ability, macrostructure recognition, and use of relevant details. The integration subscore was awarded the highest point values because this was the predominant skill being tested. It scored participants on their ability to make connections across texts based on the manipulation of the textual frames in both texts. The second subcategory awarded points for the ability to recognize and articulate the macrostructures (problem, cause, effect, or solution) present in each text. This subcategory was similar to the categorizing task used in the Reading to Learn measure, with the additional constraint that participants had to express the connections overtly. The third subcategory in the scoring rubric analysed the ability to use relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising and norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
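Coefficient alpha, the reliability index reported for all three measures, can be computed from a participants-by-items score matrix (or a participants-by-raters matrix for interrater reliability). The following is a minimal sketch using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the function name and the small data set are hypothetical and do not reproduce the study's data or software.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Cronbach's coefficient alpha.

    scores: one row per participant, one column per item (or per rater).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: three participants, two items.
example_alpha = coefficient_alpha([[1, 2], [2, 1], [3, 3]])
```

When every item ranks participants identically, the item variances account for a small share of the total-score variance and alpha approaches 1; values such as the .84 to .99 reported here indicate that most of the observed variance is shared across items or raters.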

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups, each of equal ability, as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, were also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures; the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on a rate of 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                   n      Mean      sd       kMax

Nelson–Denny
  NSU                   105    126.48    16.46    156
  NNSU                  106     67.24    31.91    156
  NNSG                   40     88.88    21.88    156
  Total participants    251     95.47    36.93    156

TOEFL Reading Comprehension
  NSU                   105     61.30     4.24     67
  NNSU                  106     50.30     8.53     67
  NNSG                   40     57.15     4.55     67
  Total participants    251     55.99     8.19     67

Computer familiarity
  NSU                   104     38.08     3.60     44
  NNSU                  104     34.82     5.99     44
  NNSG                   40     35.63     6.02     44
  Total participants    248     36.31     5.33     44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                   n      Mean     sd       kMax

Reading to Learn (chart)
  NSU                   105    51.85    19.86    241
  NNSU                  106    31.73    19.50    241
  NNSG                   40    44.68    19.27    241
  Total participants    251    42.21    21.64    241

Basic Comprehension Test 1
  NSU                   105    16.98     2.47     20
  NNSU                  106    11.73     4.25     20
  NNSG                   40    14.98     3.50     20
  Total participants    251    14.44     4.23     20

Reading to Integrate (synthesis)
  NSU                   101    63.65    11.05     80
  NNSU                  103    37.24    21.76     80
  NNSG                   40    53.60    21.03     80
  Total participants    244    50.86    21.63     80

Basic Comprehension Test 2
  NSU                   105    15.91     2.78     20
  NNSU                  106     9.75     4.54     20
  NNSG                   40    12.85     3.61     20
  Total participants    251    12.82     4.72     20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

Table 3 Range of scores for new measures for three participant groups

Group                   n      Minimum    Maximum    kMax

Reading to Learn (chart)
  NSU                   105    14         120        241
  NNSU                  106     0          86        241
  NNSG                   40     3          94        241
  Total participants    251     0         120        241

Reading to Integrate (synthesis)
  NSU                   101    38          80         80
  NNSU                  103     0          80         80
  NNSG                   40     5          80         80
  Total participants    244     0          80         80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
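The F ratios in Table 4 can be checked from the reported sums of squares and degrees of freedom, since each F is the effect mean square divided by the error mean square. The sketch below is an illustrative recomputation for three of the effects, not the authors' SPSS procedure; the variable and function names are hypothetical, and small differences from the printed values reflect rounding in the published table.

```python
# Type III sums of squares and degrees of freedom transcribed from Table 4.
TABLE_4_EFFECTS = {
    "Group": (21869.29, 2),
    "Medium": (1215.86, 1),
    "Test order": (294.98, 1),
}
ERROR_SS, ERROR_DF = 92997.45, 239


def f_ratio(effect_ss, effect_df, error_ss, error_df):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (effect_ss / effect_df) / (error_ss / error_df)


f_values = {
    name: round(f_ratio(ss, df, ERROR_SS, ERROR_DF), 2)
    for name, (ss, df) in TABLE_4_EFFECTS.items()
}
```

Only the Group effect yields a large ratio (about 28), which is consistent with group membership being the sole significant main effect.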

3 Research Question 2

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum    df     Mean square    F
                                of squares

Group                           21869.29          2    10934.64       28.10*
Medium                           1215.86          1     1215.86        3.13
Test order                        294.98          1      294.98        0.76
Group × medium                    334.81          2      167.40        0.43
Group × test order                  4.37          2        2.19        0.01
Medium × test order               391.73          1      391.73        1.01
Group × medium × test order       575.29          2      287.65        0.74
Error                           92997.45        239      389.11

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group    n      Group    n      Mean difference    Standard error

NSU      105    NNSU     106    20.12*             2.72
                NNSG      40     7.17              3.67
NNSU     106    NNSG      40    12.95*             3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum    df     Mean square    F
                                of squares

Group                           36294.33          2    18147.17       55.82ᵇ
Medium                            192.95          1      192.95        0.59
Test order                         97.83          1       97.83        0.30
Group × medium                     11.82          2        5.91        0.02
Group × test order                 30.14          2       15.07        0.05
Medium × test order              1098.72          1     1098.72        3.38
Group × medium × test order      1488.58          2      744.29        2.29
Error                           75430.37        232      325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
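The point that pooled correlations can be inflated by combining groups with different means can be illustrated with a small numerical sketch. The scores below are entirely hypothetical and are not drawn from the study; `pearson_r` is a plain implementation of the Pearson product-moment correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores on two measures for a lower- and a higher-scoring group.
low_x, low_y = [1, 2, 3], [2, 1, 3]
high_x, high_y = [11, 12, 13], [12, 11, 13]

within_low = pearson_r(low_x, low_y)
within_high = pearson_r(high_x, high_y)
pooled = pearson_r(low_x + high_x, low_y + high_y)
```

In this toy case the two measures are only moderately related within each group, yet the pooled coefficient is much higher because the between-group difference in means dominates the combined variance. This is the reason for examining correlations separately for each participant group.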

5 Discriminant analysis

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group    n      Group    n      Mean difference    Standard error

NSU      101    NNSU     103    26.41ᵇ             2.53
                NNSG      40    10.05ᵇ             3.37
NNSU     103    NNSG      40    16.36ᵇ             3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic            Basic            Reading     Reading to
                               Comprehension    Comprehension    Comprehension    to Learn    Integrateᵃ
                                                Test 1           Test 2

Nelson–Denny                   .90ᵇ             .85ᵇ             .84ᵇ             .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ             .84ᵇ             .64ᵇ        .69ᵇ
Basic Comprehension 1                           1.00             .84ᵇ             .68ᵇ        .68ᵇ
Basic Comprehension 2                                            1.00             .68ᵇ        .70ᵇ
Reading to Learn                                                                  1.00        .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension    sd

High (≥56)                57    59.68                                  3.03
Mid (50–55)               42    52.81                                  1.67
Low (≤49)                 44    42.11                                  5.47

Total                    143    52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
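The reported eigenvalue and Wilks' Lambda are linked by a standard identity: Lambda is the product of 1 / (1 + eigenvalue) over the discriminant functions. The sketch below applies it to the first function only, on the assumption that any further function contributes negligibly (it accounts for about 0.1% of the variance here, and its eigenvalue is not reported). The function name is hypothetical, and this is a consistency check rather than a reproduction of the SPSS output.

```python
from math import prod


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over functions."""
    return prod(1.0 / (1.0 + value) for value in eigenvalues)


# Eigenvalue of the first discriminant function for the nonnative speakers.
nonnative_lambda = wilks_lambda([0.92])
```

With an eigenvalue of .92 the identity gives a value that rounds to the reported .52, so the two statistics are mutually consistent. Lower values of Lambda indicate greater separation among the group centroids.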

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
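The reclassification figures can be tallied directly from the cell counts in Table 10. The sketch below assumes rows are the initial TOEFL reading ability groups and columns are the predicted groups on the composite, both ordered high, mid, low; the names are hypothetical.

```python
# Counts transcribed from Table 10 (rows: initial group; columns: predicted).
TABLE_10_COUNTS = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]


def reclassification_summary(table):
    """Count cases that stayed, moved to a higher, or moved to a lower category."""
    size = len(table)
    total = sum(sum(row) for row in table)
    same = sum(table[i][i] for i in range(size))
    # Categories run from high to low, so a column index below the row
    # index means the predicted category is higher than the initial one.
    higher = sum(table[i][j] for i in range(size) for j in range(i))
    lower = total - same - higher
    return {"total": total, "same": same, "higher": higher, "lower": lower}


summary = reclassification_summary(TABLE_10_COUNTS)
percent_moved = 100 * (summary["higher"] + summary["lower"]) / summary["total"]
```

Summing the diagonal gives the participants whose category did not change, and the two off-diagonal triangles give the upward and downward moves.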

b Discriminant analysis for native speakers Because one of ourgoals in this project was to probe the possible validity of these newmeasures by assessing performance of two groups native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Predicted group membership for Reading to Learn/Reading to Integrate Composite:

| Reading ability group | High | Mid | Low | Initial classification total |
|---|---|---|---|---|
| Count | | | | |
| High (≥56) | 37 | 17 | 3 | 57 |
| Mid (50–55) | 13 | 18 | 11 | 42 |
| Low (≤49) | 2 | 6 | 36 | 44 |
| Reclassification total | 52 | 41 | 50 | 143 |
| Percentage | | | | |
| High (≥56) | 64.9 | 29.8 | 5.3 | 100.0 |
| Mid (50–55) | 31.0 | 42.9 | 26.2 | 100.0 |
| Low (≤49) | 4.5 | 13.6 | 81.8 | 100.0 |

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135, as mid if their scores ranged from 119 to 134, and as low if their scores fell at or below 118. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
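The three-way split described here can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: `half_sd_cutoffs` and `classify` are hypothetical names, the sample standard deviation is assumed, and the default thresholds are the cutoffs reported in the text (135 and 118).

```python
from statistics import mean, stdev


def half_sd_cutoffs(scores, width=0.5):
    """Return (lower, upper) bounds at mean -/+ width sample standard deviations."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation (assumption)
    return m - width * s, m + width * s


def classify(score, low_max=118, high_min=135):
    """Assign a Nelson-Denny score to high, mid, or low using the reported cutoffs."""
    if score >= high_min:
        return "high"
    if score <= low_max:
        return "low"
    return "mid"


if __name__ == "__main__":
    # Toy scores, purely illustrative; not data from the study.
    toy = [100, 120, 140]
    print(half_sd_cutoffs(toy))
    print([classify(s) for s in (118, 119, 134, 135)])
```

Applied to the Table 11 totals (mean 126.04, sd 16.52), the half-standard-deviation band runs from about 117.8 to 134.3, which is consistent with the reported cutoffs, although Table 11 describes the 101 retained cases rather than the full group of 105.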

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

| Reading ability group | n | Mean on Nelson–Denny | sd |
|---|---|---|---|
| High (≥135) | 39 | 141.26 | 5.19 |
| Mid (119–134) | 37 | 125.65 | 5.09 |
| Low (≤118) | 25 | 102.88 | 10.98 |
| Total | 101 | 126.04 | 16.52 |

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.


showed moderate correlations with the alternate function,⁶ justifying the composite calculations.
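The reported eigenvalues and Wilks' Lambda values can be checked against each other, since Wilks' Lambda for the full set of functions is the product of 1/(1 + eigenvalue) over all functions. The sketch below assumes exactly two discriminant functions per analysis (three groups, two measures) and infers the second eigenvalue from the first function's share of variance; the function names are hypothetical.

```python
def implied_eigenvalues(first_eigenvalue, first_share):
    """Eigenvalues of two functions, given the first and its share of variance.

    Assumes exactly two functions, so share = ev1 / (ev1 + ev2).
    """
    second = first_eigenvalue * (1.0 - first_share) / first_share
    return [first_eigenvalue, second]


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    result = 1.0
    for ev in eigenvalues:
        result *= 1.0 / (1.0 + ev)
    return result


def canonical_correlation(eigenvalue):
    """Canonical correlation associated with a single discriminant function."""
    return (eigenvalue / (1.0 + eigenvalue)) ** 0.5


if __name__ == "__main__":
    nonnative = implied_eigenvalues(0.92, 0.999)
    native = implied_eigenvalues(0.31, 0.987)
    print(round(wilks_lambda(nonnative), 2))
    print(round(wilks_lambda(native), 2))
    print(round(canonical_correlation(0.92), 2), round(canonical_correlation(0.31), 2))
```

Under these assumptions the values agree with the reported Lambdas of .52 for nonnative speakers and .76 for native speakers, and they show how much weaker the group separation was in the native speaker sample.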

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Predicted group membership for Reading to Learn/Reading to Integrate Composite:

| Reading ability group | High | Mid | Low | Initial classification total |
|---|---|---|---|---|
| Count | | | | |
| High (≥135) | 28 | 3 | 8 | 39 |
| Mid (119–134) | 10 | 7 | 20 | 37 |
| Low (≤118) | 5 | 7 | 13 | 25 |
| Reclassification total | 43 | 17 | 41 | 101 |
| Percentage | | | | |
| High (≥135) | 71.8 | 7.7 | 20.5 | 100.0 |
| Mid (119–134) | 27.0 | 18.9 | 54.1 | 100.0 |
| Low (≤118) | 20.0 | 28.0 | 52.0 | 100.0 |

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task (multiple-choice basic comprehension) overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may, then, overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading


Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.
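The shrinking of the mid category can be seen by comparing the row margins (initial groups) with the column margins (predicted groups) of Tables 10 and 12. The sketch below is illustrative only; `marginal_shares` is a hypothetical helper, and the matrices are the Count blocks of the two tables.

```python
# Count blocks of Tables 10 and 12: rows = initial group, columns = predicted
# group on the composite (order: High, Mid, Low).
TABLE_10 = [[37, 17, 3], [13, 18, 11], [2, 6, 36]]   # nonnative speakers
TABLE_12 = [[28, 3, 8], [10, 7, 20], [5, 7, 13]]     # native speakers


def marginal_shares(counts):
    """Return (initial, predicted) group shares as percentages of all cases."""
    total = sum(sum(row) for row in counts)
    initial = [round(100 * sum(row) / total, 1) for row in counts]
    predicted = [round(100 * sum(col) / total, 1) for col in zip(*counts)]
    return initial, predicted


if __name__ == "__main__":
    for name, table in (("nonnative", TABLE_10), ("native", TABLE_12)):
        initial, predicted = marginal_shares(table)
        print(name, "initial:", initial, "predicted:", predicted)
```

For native speakers the mid share falls from 37 of 101 cases initially to 17 of 101 after reclassification, which is the 16.8% cited above; for nonnative speakers the mid column changes far less (42 to 41 of 143).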

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact


perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time


would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen


as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each
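The point scheme in these directions amounts to a weighted count of correct responses. A minimal sketch follows, assuming a rater has already tallied the number of correct entries per category; the function name and dictionary keys are hypothetical, and the sketch deliberately ignores the per-category maxima and the partial credit for one-word answers that belong to the separate scoring rubric in Appendix 1b.

```python
# Points per correct response, as stated in the chart completion directions.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum the points for the number of correct responses in each category.

    correct_counts maps a category name to a non-negative count; incorrect
    responses carry no penalty, so they are simply not counted.
    """
    total = 0
    for category, n_correct in correct_counts.items():
        if category not in POINTS:
            raise ValueError(f"unknown category: {category}")
        if n_correct < 0:
            raise ValueError("counts must be non-negative")
        total += POINTS[category] * n_correct
    return total


if __name__ == "__main__":
    # Hypothetical tally for one participant, for illustration only.
    tally = {"problems": 2, "causes": 3, "effects": 1, "solutions": 1, "examples": 4}
    print(chart_score(tally))
```

The heavier weights on problems and solutions mean that a participant who identifies the main frame of the text outscores one who lists many isolated examples.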

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

[Rotated rubric table; cell layout not recoverable. Column headings: Problems (10 points; 1 word, 6 points), 40 maximum; Causes (5 points; 1 word, 3 points), 55 maximum; Effects (5 points; 1 word, 3 points), 30 maximum; Solutions (10 points; 1 word, 6 points), 90 maximum. Each column is followed by an Examples section (1 point per example, with additional credit for some examples given with the overarching cause or solution listed above). The cells list acceptable responses on air pollution in the National Parks, including smog, acid rain, ground-level ozone, automobile and power plant emissions, decreased visibility, the 1977 Clean Air Act Amendments, and regional haze regulations proposed by the EPA.]


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
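The three rubric components can be combined into a single synthesis score. The sketch below is illustrative only: the function names are hypothetical, and it assumes the total is the simple sum of the Integration (0–50), Macrostructures (0–25) and Use of relevant details (0–5) points, which is consistent with the maximum of 80 reported for Reading to Integrate but is not spelled out in the rubric itself.

```python
# Hypothetical helpers; point bands are taken from the Appendix 2b rubric.

def macrostructure_points(total_identified, attempted=True):
    """Map the number of macrostructures identified across both texts (0-8)
    to the rubric's macrostructure points."""
    if not attempted:
        return 0                      # "No Response"
    if not 0 <= total_identified <= 8:
        raise ValueError("total_identified must be between 0 and 8")
    if total_identified == 8:
        return 25                     # Excellent (Total 8)
    if total_identified >= 6:
        return 20                     # Good (Total 6-7)
    if total_identified >= 4:
        return 15                     # Fair (Total 4-5)
    if total_identified >= 2:
        return 10                     # Poor (Total 2-3)
    return 5                          # Very Poor (Total 0-1)


def reading_to_integrate_score(integration, macro_total, details):
    """Assumed total: integration points + macrostructure points + detail points."""
    if integration not in (0, 10, 20, 30, 40, 50):
        raise ValueError("integration must be one of 0, 10, 20, 30, 40, 50")
    if details not in range(0, 6):
        raise ValueError("details must be between 0 and 5")
    return integration + macrostructure_points(macro_total) + details


if __name__ == "__main__":
    # A synthesis rated Poor on integration, with 5 macrostructures and Fair details.
    print(reading_to_integrate_score(integration=20, macro_total=5, details=3))
```

Keeping the band lookup separate from the total makes it easy to audit each subscore against the rubric wording.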


relevant details as support in the written synthesis. The first author and two research assistants spent 30 hours revising, norming the scoring rubric and developing a decision guide, resulting in an overall interrater reliability of .99 (coefficient alpha), with similarly high alphas for all subcategories.

• Basic Comprehension: The third construct was measured by multiple-choice tests related specifically to the texts used in the new tasks. These tests were created by TOEFL Test Development staff and followed current TOEFL reading section specifications. We used two multiple-choice tests, Basic Comprehension Test 1 (BC1) and Basic Comprehension Test 2 (BC2), 20 items each: one for the longer passage used to assess Reading to Learn and one for the two passages used to assess Reading to Integrate. Both were scored based on number of items answered correctly. Reliability on BC1, calculated based on 251 participants, was .84 (coefficient alpha). Inadvertently, the order of the texts used in BC2 was different for the two different media; however, reliability on both versions of the test was high. For those who took BC2 based on paper texts (n = 127), reliability was .84 (coefficient alpha); for those who took BC2 based on computerized texts (n = 124), reliability was .86 (coefficient alpha).
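For readers unfamiliar with coefficient alpha, the following is a minimal sketch of the standard Cronbach's alpha formula. The ratings are invented for illustration (three hypothetical raters scoring five syntheses) and are not data from this study; the function name is likewise an assumption.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a table with one row per case and one column
    per item (or rater): k/(k-1) * (1 - sum(item variances) / variance(totals))."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items or raters")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented ratings: each row is one synthesis, each column one rater.
ratings = [
    [60, 62, 61],
    [35, 33, 36],
    [78, 80, 79],
    [50, 49, 52],
    [20, 22, 21],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(ratings), 3))
```

The same function applies to item-level data from a multiple-choice test, with one column per item scored 0 or 1.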

3 Design for data collection

This study used a 2 × 2 repeated measures design to examine performance on the new reading tasks. Native speaker undergraduates and nonnative speaker undergraduates were divided into two groups, each of equal ability as determined by performance on the baseline standardized measures of reading comprehension (Nelson–Denny or TOEFL). Half of each group read texts on paper; the other half read the same texts on a computer screen. A smaller group of nonnative speaker graduates, equally divided, was also included for a comparison between performance by graduate and undergraduate nonnative speakers. Additionally, the administration of the new measures was counterbalanced to control for any practice effect.

a Procedures: All participants met with the researchers in four sessions, each lasting about an hour. The first two sessions were devoted to administering the existing instruments. During Session 1, participants received an introduction to the study and took one of the two standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.3
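The matching procedure is not described in detail, so the following is only one plausible way to form balanced subgroups and check them: rank participants on the baseline measure, alternate assignment within successive pairs, then compute an independent-samples t statistic. The scores, function names and the alternating scheme are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def split_matched(scores):
    """Rank scores and alternate which medium receives the higher score of
    each successive pair, so both subgroups have similar baselines."""
    ranked = sorted(scores, reverse=True)
    paper, computer = [], []
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        targets = (paper, computer) if (i // 2) % 2 == 0 else (computer, paper)
        for group, score in zip(targets, pair):
            group.append(score)
    return paper, computer


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Invented baseline reading scores for twelve participants.
baseline = [67, 64, 61, 58, 57, 55, 52, 50, 47, 44, 41, 38]

if __name__ == "__main__":
    paper, computer = split_matched(baseline)
    print(paper, computer, round(pooled_t(paper, computer), 3))
```

A t statistic near zero indicates that the two media subgroups start from comparable reading levels, which is the property the t-tests in the study were used to confirm.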

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

3 Although responses could have been entered, and perhaps scored, by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure, the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

Table 1 Descriptive statistics for existing measures for three participant groups

Group                  n      Mean      sd      kMax

Nelson–Denny
NSU                    105    126.48    16.46   156
NNSU                   106     67.24    31.91   156
NNSG                    40     88.88    21.88   156
Total participants     251     95.47    36.93   156

TOEFL Reading Comprehension
NSU                    105     61.30     4.24    67
NNSU                   106     50.30     8.53    67
NNSG                    40     57.15     4.55    67
Total participants     251     55.99     8.19    67

Computer familiarity
NSU                    104     38.08     3.60    44
NNSU                   104     34.82     5.99    44
NNSG                    40     35.63     6.02    44
Total participants     248     36.31     5.33    44

Note: kMax = number of items or maximum possible score.

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                  n      Mean     sd      kMax

Reading to Learn (chart)
NSU                    105    51.85    19.86   241
NNSU                   106    31.73    19.50   241
NNSG                    40    44.68    19.27   241
Total participants     251    42.21    21.64   241

Basic Comprehension Test 1
NSU                    105    16.98     2.47    20
NNSU                   106    11.73     4.25    20
NNSG                    40    14.98     3.50    20
Total participants     251    14.44     4.23    20

Reading to Integrate (synthesis)
NSU                    101    63.65    11.05    80
NNSU                   103    37.24    21.76    80
NNSG                    40    53.60    21.03    80
Total participants     244    50.86    21.63    80

Basic Comprehension Test 2
NSU                    105    15.91     2.78    20
NNSU                   106     9.75     4.54    20
NNSG                    40    12.85     3.61    20
Total participants     251    12.82     4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar, except that they address the two different new reading measures (Reading to Learn and Reading to Integrate), we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with

Table 3 Range of scores for new measures for three participant groups

Group                  n      Minimum   Maximum   kMax

Reading to Learn (chart)
NSU                    105    14        120       241
NNSU                   106     0         86       241
NNSG                    40     3         94       241
Total participants     251     0        120       241

Reading to Integrate (synthesis)
NSU                    101    38         80        80
NNSU                   103     0         80        80
NNSG                    40     5         80        80
Total participants     244     0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

group status, medium of text presentation, and test order as possible contributing factors.4 Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.
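As a reading aid for the ANOVA table, each F value is the effect's mean square (sum of squares divided by its degrees of freedom) divided by the error mean square. The sketch below recomputes three of the F ratios from the sums of squares reported in Table 4, read with two decimal places; the function name is hypothetical, and small discrepancies in the second decimal can arise from rounding in the printed table.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Sums of squares and degrees of freedom as reported in Table 4.
ERROR = (92997.45, 239)
EFFECTS = {
    "Group": (21869.29, 2),
    "Medium": (1215.86, 1),
    "Test order": (294.98, 1),
}

if __name__ == "__main__":
    for name, (ss, df) in EFFECTS.items():
        print(f"{name}: F = {f_ratio(ss, df, *ERROR):.2f}")
```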

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
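The pattern of significant contrasts can be checked with a short sketch of the Scheffé criterion for pairwise comparisons. It uses the error mean square from Table 4 (389.11) and the group sizes and mean differences from Table 5; the critical F value of about 3.03 for (2, 239) degrees of freedom at the .05 level is an approximation taken from standard tables, not a figure reported in the study, and the function name is hypothetical.

```python
from math import sqrt

MS_ERROR = 389.11   # error mean square, Table 4
K_GROUPS = 3        # NSU, NNSU, NNSG
F_CRIT = 3.03       # assumed approximate critical F(2, 239) at alpha = .05


def scheffe(mean_diff, n1, n2):
    """Return the standard error of a pairwise difference and whether the
    difference exceeds the Scheffe critical difference."""
    se = sqrt(MS_ERROR * (1 / n1 + 1 / n2))
    critical_diff = sqrt((K_GROUPS - 1) * F_CRIT) * se
    return se, abs(mean_diff) > critical_diff


# (label, mean difference, n of first group, n of second group) from Table 5.
CONTRASTS = [
    ("NSU vs NNSU", 20.12, 105, 106),
    ("NSU vs NNSG", 7.17, 105, 40),
    ("NNSU vs NNSG", 12.95, 106, 40),
]

if __name__ == "__main__":
    for label, diff, n1, n2 in CONTRASTS:
        se, significant = scheffe(diff, n1, n2)
        print(f"{label}: SE = {se:.2f}, significant = {significant}")
```

Under these assumptions the computed standard errors agree with those printed in Table 5, and only the contrast between native speaker undergraduates and nonnative speaker graduates falls short of the critical difference.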

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation,

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                    2   10934.64      28.10*
Medium                           1215.86                    1    1215.86       3.13
Test order                        294.98                    1     294.98       0.76
Group × medium                    334.81                    2     167.40       0.43
Group × test order                  4.37                    2       2.19       0.01
Medium × test order               391.73                    1     391.73       1.01
Group × medium × test order       575.29                    2     287.65       0.74
Error                           92997.45                  239     389.11

Note: *p < .05

4 Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n      Group   n      Mean difference   Standard error

NSU     105    NNSU    106    20.12*            2.72
               NNSG     40     7.17             3.67
NNSU    106    NNSG     40    12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244a) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                    2   18147.17      55.82b
Medium                            192.95                    1     192.95       0.59
Test order                         97.83                    1      97.83       0.30
Group × medium                     11.82                    2       5.91       0.02
Group × test order                 30.14                    2      15.07       0.05
Medium × test order              1098.72                    1    1098.72       3.38
Group × medium × test order      1488.58                    2     744.29       2.29
Error                           75430.37                  232     325.13

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
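The concern that pooled correlations may be inflated by combining groups with different means can be illustrated with a small sketch. The scores below are invented purely for illustration (they are not data from this study): within each hypothetical group the two measures correlate only moderately, yet the pooled correlation is very high because the groups differ in level on both measures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented scores on two measures for a lower- and a higher-scoring group.
low_x, low_y = [1, 2, 3, 4], [2, 1, 4, 3]
high_x, high_y = [11, 12, 13, 14], [12, 11, 14, 13]

r_low = pearson_r(low_x, low_y)
r_high = pearson_r(high_x, high_y)
r_pooled = pearson_r(low_x + high_x, low_y + high_y)

if __name__ == "__main__":
    print(round(r_low, 2), round(r_high, 2), round(r_pooled, 2))
```

This is why group-level correlations are a useful check before interpreting a pooled coefficient.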

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research,

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244a)

Group   n      Group   n      Mean difference   Standard error

NSU     101    NNSU    103    26.41b            2.53
               NNSG     40    10.05b            3.37
NNSU    103    NNSG     40    16.36b            3.36

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                                TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading    Reading to
                                Comprehension    Test 1                Test 2                to Learn   Integrate a

Nelson–Denny                     .90b             .85b                  .84b                  .66b       .69b
TOEFL Reading Comprehension     1.00              .85b                  .84b                  .64b       .69b
Basic Comprehension 1                            1.00                   .84b                  .68b       .68b
Basic Comprehension 2                                                  1.00                   .68b       .70b
Reading to Learn                                                                             1.00        .59b

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
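The outlier check can be sketched as follows for two predictors. The score pairs are invented for illustration, the function names are hypothetical, and the .001 significance level is an assumption (a commonly used convention for this check), since the criterion applied in the study is not stated. For two degrees of freedom the chi-square critical value has the closed form -2 ln(alpha), so no statistical library is needed.

```python
from math import log
from statistics import mean


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) pair from the sample
    centroid, using the sample covariance matrix."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T for a 2x2 matrix
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_critical_df2(alpha):
    """Chi-square critical value for 2 degrees of freedom: P(X > x) = exp(-x/2)."""
    return -2 * log(alpha)


# Invented (Reading to Learn, Reading to Integrate) score pairs.
scores = [(52, 64), (31, 40), (45, 55), (60, 70), (28, 35), (47, 50), (39, 58), (55, 61)]

if __name__ == "__main__":
    cutoff = chi2_critical_df2(0.001)
    flagged = [pair for pair, d in zip(scores, mahalanobis_sq_2d(scores)) if d > cutoff]
    print(round(cutoff, 3), flagged)
```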

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
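The grouping rule for nonnative speakers can be written as a short function. This is a minimal sketch with a hypothetical function name; the cut points are the scaled TOEFL Reading Comprehension scores given in the paragraph above.

```python
def toefl_reading_level(scaled_score):
    """Classify a TOEFL Reading Comprehension scaled score as high (56 and
    above), mid (50 to 55) or low (49 and below)."""
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


if __name__ == "__main__":
    for score in (60, 55, 50, 49):
        print(score, toefl_reading_level(score))
```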

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group   n      Mean on TOEFL Reading Comprehension   sd

High (≥ 56)              57    59.68                                 3.03
Mid (50–55)              42    52.81                                 1.67
Low (≤ 49)               44    42.11                                 5.47
Total                   143    52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.

Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).
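The reported eigenvalue, Wilks' Lambda and Chi-Square value are linked by standard formulas: Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, and Bartlett's approximation gives Chi-Square = -[N - 1 - (p + g)/2] ln(Lambda), where N is the number of cases, p the number of predictors and g the number of groups. The sketch below applies these formulas to the rounded values reported for the nonnative speakers (N = 143, two predictors, three groups); the function names are hypothetical, and because the inputs are rounded the results only approximate the published figures.

```python
from math import log, prod


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue)."""
    return prod(1 / (1 + ev) for ev in eigenvalues)


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * log(lam)


if __name__ == "__main__":
    # Only the first eigenvalue is reported; any second function is negligible.
    print(round(wilks_lambda([0.92]), 2))
    print(round(bartlett_chi_square(0.52, n_cases=143, n_predictors=2, n_groups=3), 1))
```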

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
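The reclassification counts can be tallied directly from the cell counts of Table 10. In the sketch below (function name hypothetical), rows are the initial reading ability groups and columns the predicted groups, both ordered high, mid, low; entries on the diagonal stayed in place, entries below it moved to a higher category, and entries above it moved to a lower one.

```python
def reclassification_summary(matrix):
    """Count cases that kept, raised or lowered their category.
    Rows = initial level, columns = predicted level, ordered high, mid, low."""
    size = len(matrix)
    same = sum(matrix[i][i] for i in range(size))
    higher = sum(matrix[i][j] for i in range(size) for j in range(size) if j < i)
    lower = sum(matrix[i][j] for i in range(size) for j in range(size) if j > i)
    return {"total": same + higher + lower, "same": same, "higher": higher, "lower": lower}


# Cell counts from Table 10 (nonnative speakers).
TABLE_10 = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]

if __name__ == "__main__":
    summary = reclassification_summary(TABLE_10)
    for key, count in summary.items():
        print(key, count, f"{100 * count / summary['total']:.1f}%")
```

The same function can be applied to any square classification matrix whose categories are listed from highest to lowest.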

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                          Predicted group membership for Reading to
                          Learn/Reading to Integrate Composite         Initial
Reading ability group     High      Mid       Low                      classification total

Count
High (≥ 56)               37        17         3                        57
Mid (50–55)               13        18        11                        42
Low (≤ 49)                 2         6        36                        44
Reclassification total    52        41        50                       143

Percentage
High (≥ 56)               64.9      29.8       5.3                     100.0
Mid (50–55)               31.0      42.9      26.2                     100.0
Low (≤ 49)                 4.5      13.6      81.8                     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
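The native speaker cut points follow from the half standard deviation rule. The sketch below assumes the rule was applied to the full native speaker sample statistics in Table 1 (mean 126.48, sd 16.46), which yields a band of roughly 118.25 to 134.71 and therefore the integer boundaries stated above; the function names are hypothetical.

```python
def half_sd_cutoffs(sample_mean, sample_sd):
    """Lower and upper cut points at 0.5 standard deviations from the mean."""
    return sample_mean - 0.5 * sample_sd, sample_mean + 0.5 * sample_sd


# Nelson-Denny mean and sd for the native speaker undergraduates (Table 1).
LOW_CUT, HIGH_CUT = half_sd_cutoffs(126.48, 16.46)


def nelson_denny_level(score):
    """Classify an integer Nelson-Denny score relative to the half-sd band."""
    if score > HIGH_CUT:
        return "high"
    if score < LOW_CUT:
        return "low"
    return "mid"


if __name__ == "__main__":
    print(round(LOW_CUT, 2), round(HIGH_CUT, 2))
    for score in (135, 134, 119, 118):
        print(score, nelson_denny_level(score))
```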

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n      Mean on Nelson–Denny   sd

High (≥ 135)             39    141.26                  5.19
Mid (119–134)            37    125.65                  5.09
Low (≤ 118)              25    102.88                 10.98
Total                   101    126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

showed moderate correlations with the alternate function6 justifyingthe composite calculations
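The reported eigenvalue, Wilks' Lambda, and Chi-Square value can be cross-checked against one another. The sketch below is not the authors' computation: it assumes the standard relation Λ = Π 1/(1 + λᵢ) and Bartlett's chi-square approximation, treats the analysis as having two predictors and three groups, and infers the unreported second eigenvalue from the 98.7% share of variance attributed to the first function.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def bartlett_chi_square(n_cases, n_predictors, n_groups, lam):
    """Bartlett's approximation: -[N - 1 - (p + g) / 2] * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


first = 0.31                             # reported eigenvalue of function 1
second = first * (1 - 0.987) / 0.987     # inferred, not reported in the text
lam = wilks_lambda([first, second])
chi2 = bartlett_chi_square(101, 2, 3, lam)  # n = 101; p = 2 and g = 3 are assumptions
print(round(lam, 2), round(chi2, 1))
```

Under these assumptions Lambda comes out at about .76 and the chi-square at about 26.7, close to the reported 26.39; the small gap is what one would expect from working with a rounded eigenvalue.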

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Predicted group membership for Reading to Learn/Reading to Integrate Composite

Reading ability group      High     Mid      Low      Initial classification total

Count
High (≥135)                28       3        8        39
Mid (119–134)              10       7        20       37
Low (≤118)                 5        7        13       25
Reclassification total     43       17       41       101

Percentage
High (≥135)                71.8     7.7      20.5     100.0
Mid (119–134)              27.0     18.9     54.1     100.0
Low (≤118)                 20.0     28.0     52.0     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
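The reclassification figures quoted in the text follow directly from the counts in Table 12. As a minimal sketch (the data structure and function name are illustrative, not part of the study), the cross-tabulation can be tallied like this:

```python
LEVELS = ["high", "mid", "low"]  # ordered from highest to lowest

# Rows: initial Nelson-Denny group; columns: predicted group (high, mid, low).
TABLE_12 = {
    "high": [28, 3, 8],
    "mid": [10, 7, 20],
    "low": [5, 7, 13],
}


def summarize_reclassification(table):
    """Count participants who stayed put, moved to a higher level, or moved lower."""
    same = higher = lower = 0
    for i, initial in enumerate(LEVELS):
        for j, count in enumerate(table[initial]):
            if j == i:
                same += count
            elif j < i:          # predicted level is above the initial level
                higher += count
            else:                # predicted level is below the initial level
                lower += count
    return same, higher, lower


same, higher, lower = summarize_reclassification(TABLE_12)
total = same + higher + lower
for label, n in (("same", same), ("higher", higher), ("lower", lower)):
    print(label, n, round(100 * n / total, 1))
```

The tallies reproduce the 48, 22, and 31 participants (47.5%, 21.8%, and 30.7% of 101) reported for the native speaker sample.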

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated their ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may, then, overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated their ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as 'sophisticated discourse processes and critical thinking skills' (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks, such as the Reading to Learn and Reading to Integrate measures, into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the 'two disciplines' problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
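The point scheme in the directions amounts to a weighted tally of correct responses. The following is a minimal sketch of that arithmetic; the dictionary layout and function name are hypothetical, and the sketch ignores the per-column maxima and reduced one-word credit that the scoring rubric in Appendix 1b applies.

```python
# Points per correct response, as stated in the directions to participants.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum the points for the number of correct responses in each category."""
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Hypothetical response: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
response = {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
print(chart_score(response))
```

Because incorrect responses carry no penalty, only the counts of correct entries enter the sum.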


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' emissions (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water; water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
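The rubric's stated ceiling of 241 points can be checked from its column maxima. The sketch below is a reading of the rubric, not the authors' scoring procedure: it assumes the '1 word' values are reduced credit for one-word responses, assigns the example maxima of 5, 10, 6, and 5 to Problems, Causes, Effects, and Solutions respectively, and leaves out the '5 or 10 points with overarching cause or solution' rule. All names are hypothetical.

```python
FULL_CREDIT = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}
ONE_WORD_CREDIT = {"problems": 6, "causes": 3, "effects": 3, "solutions": 6}
COLUMN_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def column_score(column, full_items=0, one_word_items=0, examples=0):
    """Score one chart column, capping item points and example points separately."""
    base = (FULL_CREDIT[column] * full_items
            + ONE_WORD_CREDIT[column] * one_word_items)
    return min(base, COLUMN_MAX[column]) + min(examples, EXAMPLE_MAX[column])


def maximum_total():
    """Highest score obtainable across all four columns and their examples."""
    return sum(COLUMN_MAX.values()) + sum(EXAMPLE_MAX.values())


print(maximum_total())
print({c: COLUMN_MAX[c] // FULL_CREDIT[c] for c in COLUMN_MAX})
```

Dividing each column maximum by its full-credit value gives 4, 11, 6, and 9 scorable items, which matches the number of entries the rubric lists under Problems, Causes, Effects, and Solutions.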


Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's written response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total = 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total = 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total = 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total = 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total = 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
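The macrostructure bands reduce to a lookup on the combined number of macrostructures identified across the two texts. A minimal sketch of that mapping follows; the function name and signature are hypothetical, and the sketch follows the banded table (which awards 5 points for a total of 0 or 1) rather than the introductory sentence's stated minimum.

```python
def macrostructure_score(found_text_a, found_text_b, attempted=True):
    """Map macrostructures identified in each text (0-4 each) to the rubric band."""
    for found in (found_text_a, found_text_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has four macrostructures; expected 0-4")
    if not attempted:
        return 0            # No Response
    total = found_text_a + found_text_b
    if total == 8:
        return 25           # Excellent
    if total >= 6:
        return 20           # Good
    if total >= 4:
        return 15           # Fair
    if total >= 2:
        return 10           # Poor
    return 5                # Very Poor


for pair in ((4, 4), (4, 2), (3, 1), (3, 0), (1, 0)):
    print(pair, macrostructure_score(*pair))
```

Every combination spelled out in the rubric (for example, 4 in one text and 2 in the other, or 3 and 1) falls into the band given by its total, so the per-text breakdown does not need separate handling.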


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


standardized basic reading comprehension measures (Nelson–Denny or TOEFL Reading Comprehension). Students completed the computer familiarity questionnaire and the Nelson–Denny Test at the same testing session because the Nelson–Denny was shorter than the TOEFL Reading Comprehension. During Session 2, participants took the other standardized basic reading comprehension measure.

Next, each participant group was subdivided into two subgroups for computer-based or paper reading of the texts for the new tasks. The subgroups were matched on their performance on initial reading measures: the Nelson–Denny was used for native speakers and the TOEFL Reading Comprehension was used for nonnative speakers. Independent t-tests run on these reading measures showed no significant difference in basic comprehension for the newly created subgroups assigned to each medium, ensuring that they were balanced for initial reading levels. Participants stayed in the same subgroups for the duration of the study. To ensure uniformity of response mode, all participants, whether they read the source texts on the computer or on paper, responded to the reading tasks using paper and pencil format.³

The last two sessions, each lasting approximately one hour, were dedicated to administration of the new measures. The Reading to Learn session took slightly longer to administer because administrative procedures were longer for this novel task. The new tasks were counterbalanced to control for practice effect; thus, half of the participants took the Reading to Learn measure first and half took the Reading to Integrate measure first. During Session 3, we administered the first new measure (for ease of discussion, Reading to Learn is discussed first) and BC1. At this session, students were given 12 minutes to read a 1200-word passage, either on computer or on paper. We limited the time allowed for reading based on 100 words per minute, thought to be ample (Grabe, personal communication, 1998). After examinees read the text, they were given 4 minutes to take notes on a half sheet of paper. Participants were instructed to take minimal notes due to the time constraints. Next, the text was removed and examinees were allowed 15 minutes to complete a chart based on the reading with the aid of their notes. After completing this Reading to Learn activity, participants were allowed to use the text and were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

³ Although responses could have been entered and perhaps scored by computer, this would have introduced factors not directly related to our research questions and remains an area for further study.

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session's seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal variables (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures (Nelson–Denny and TOEFL Reading Comprehension), the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

Table 1 Descriptive statistics for existing measures for three participant groups

Group                   n      Mean      sd       kMax

Nelson–Denny
NSU                     105    126.48    16.46    156
NNSU                    106     67.24    31.91    156
NNSG                     40     88.88    21.88    156
Total participants      251     95.47    36.93    156

TOEFL Reading Comprehension
NSU                     105     61.30     4.24     67
NNSU                    106     50.30     8.53     67
NNSG                     40     57.15     4.55     67
Total participants      251     55.99     8.19     67

Computer familiarity
NSU                     104     38.08     3.60     44
NNSU                    104     34.82     5.99     44
NNSG                     40     35.63     6.02     44
Total participants      248     36.31     5.33     44

Note: kMax = number of items or maximum possible score.


• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                   n      Mean     sd       kMax

Reading to Learn (chart)
NSU                     105    51.85    19.86    241
NNSU                    106    31.73    19.50    241
NNSG                     40    44.68    19.27    241
Total participants      251    42.21    21.64    241

Basic Comprehension Test 1
NSU                     105    16.98     2.47     20
NNSU                    106    11.73     4.25     20
NNSG                     40    14.98     3.50     20
Total participants      251    14.44     4.23     20

Reading to Integrate (synthesis)
NSU                     101    63.65    11.05     80
NNSU                    103    37.24    21.76     80
NNSG                     40    53.60    21.03     80
Total participants      244    50.86    21.63     80

Basic Comprehension Test 2
NSU                     105    15.91     2.78     20
NNSU                    106     9.75     4.54     20
NNSG                     40    12.85     3.61     20
Total participants      251    12.82     4.72     20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar (except that they address the two different new reading measures, Reading to Learn and Reading to Integrate), we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with

Table 3 Range of scores for new measures for three participant groups

Group                   n      Minimum    Maximum    kMax

Reading to Learn (chart)
NSU                     105    14         120        241
NNSU                    106     0          86        241
NNSG                     40     3          94        241
Total participants      251     0         120        241

Reading to Integrate (synthesis)
NSU                     101    38          80         80
NNSU                    103     0          80         80
NNSG                     40     5          80         80
Total participants      244     0          80         80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
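The logic of the pairwise post hoc comparison can be sketched numerically. The following Python fragment is a minimal illustration, not the authors' SPSS procedure: it takes the error mean square from the Reading to Learn ANOVA (389.11, read with two decimal places) and the group sizes and means reported for the three groups, computes the standard error of each pairwise difference, and compares the difference with a Scheffé critical difference. The critical F value (3.03 for 2 and 239 degrees of freedom at alpha = .05) is an approximate tabled value supplied here as an assumption, and the names `scheffe_contrast`, `GROUPS`, and `F_CRIT` are hypothetical.

```python
import math

# Values as reported in the study: error mean square from the Reading to Learn
# ANOVA, and (n, mean) for each group on the Reading to Learn measure.
MS_ERROR = 389.11
GROUPS = {
    "NSU": (105, 51.85),
    "NNSU": (106, 31.73),
    "NNSG": (40, 44.68),
}

# ASSUMPTION: approximate tabled critical F(2, 239) at alpha = .05.
F_CRIT = 3.03


def scheffe_contrast(a, b, ms_error=MS_ERROR, f_crit=F_CRIT, k=3):
    """Return (mean difference, standard error, critical difference)."""
    n_a, mean_a = GROUPS[a]
    n_b, mean_b = GROUPS[b]
    diff = abs(mean_a - mean_b)
    # Standard error of a difference between two group means.
    se = math.sqrt(ms_error * (1 / n_a + 1 / n_b))
    # Scheffe criterion: a pairwise difference is significant when it
    # exceeds sqrt((k - 1) * F_crit) standard errors.
    critical = math.sqrt((k - 1) * f_crit) * se
    return diff, se, critical


for pair in [("NSU", "NNSU"), ("NSU", "NNSG"), ("NNSU", "NNSG")]:
    diff, se, critical = scheffe_contrast(*pair)
    verdict = "significant" if diff > critical else "not significant"
    print(f"{pair[0]} vs {pair[1]}: diff={diff:.2f}, se={se:.2f}, "
          f"critical={critical:.2f} -> {verdict}")
```

Under these assumptions, the sketch flags the same two contrasts as significant that the text reports, and leaves the native speaker undergraduate versus nonnative speaker graduate contrast nonsignificant.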

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation,

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                           Type III sum    df     Mean square    F
                                 of squares

Group                            21869.29          2    10934.64       28.10*
Medium                            1215.86          1     1215.86        3.13
Test order                         294.98          1      294.98        0.76
Group × medium                     334.81          2      167.40        0.43
Group × test order                   4.37          2        2.19        0.01
Medium × test order                391.73          1      391.73        1.01
Group × medium × test order        575.29          2      287.65        0.74
Error                            92997.45        239      389.11

Note: *p < .05.
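As a worked check on how the F column in Table 4 is obtained, each F ratio is the effect's mean square divided by the error mean square. The short Python sketch below recomputes the ratios from the Table 4 values (read with two decimal places); the names `EFFECTS` and `f_ratio` are illustrative only, and small discrepancies in the last digit reflect rounding of the published mean squares.

```python
MS_ERROR = 389.11  # error mean square from Table 4 (239 df)

# (source, mean square, F as reported in Table 4)
EFFECTS = [
    ("Group", 10934.64, 28.10),
    ("Medium", 1215.86, 3.13),
    ("Test order", 294.98, 0.76),
    ("Group x medium", 167.40, 0.43),
    ("Group x test order", 2.19, 0.01),
    ("Medium x test order", 391.73, 1.01),
    ("Group x medium x test order", 287.65, 0.74),
]


def f_ratio(ms_effect, ms_error=MS_ERROR):
    """F ratio for one effect: effect mean square over error mean square."""
    return ms_effect / ms_error


for source, ms_effect, reported in EFFECTS:
    print(f"{source:<28} F = {f_ratio(ms_effect):6.2f} (reported {reported:.2f})")
```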

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.


computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we calculated a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group    n      Group    n      Mean difference    Standard error

NSU      105    NNSU     106    20.12*             2.72
                NNSG      40     7.17              3.67
NNSU     106    NNSG      40    12.95*             3.66

Note: *p < .05.

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                           Type III sum    df     Mean square    F
                                 of squares

Group                            36294.33          2    18147.17       55.82ᵇ
Medium                             192.95          1      192.95        0.59
Test order                          97.83          1       97.83        0.30
Group × medium                      11.82          2        5.91        0.02
Group × test order                  30.14          2       15.07        0.05
Medium × test order               1098.72          1     1098.72        3.38
Group × medium × test order       1488.58          2      744.29        2.29
Error                            75430.37        232      325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.


related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research,

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group    n      Group    n      Mean difference    Standard error

NSU      101    NNSU     103    26.41ᵇ             2.53
                NNSG      40    10.05ᵇ             3.37
NNSU     103    NNSG      40    16.36ᵇ             3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.


Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension    Basic Comprehension    Reading     Reading to
                               Comprehension    Test 1                 Test 2                 to Learn    Integrateᵃ

Nelson–Denny                   .90ᵇ             .85ᵇ                   .84ᵇ                   .66ᵇ        .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ                   .84ᵇ                   .64ᵇ        .69ᵇ
Basic Comprehension 1                           1.00                   .84ᵇ                   .68ᵇ        .68ᵇ
Basic Comprehension 2                                                  1.00                   .68ᵇ        .70ᵇ
Reading to Learn                                                                              1.00        .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05.

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers, TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group had 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance, treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating that the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension    sd

High (≥56)                57    59.68                                  3.03
Mid (50–55)               42    52.81                                  1.67
Low (≤49)                 44    42.11                                  5.47
Total                    143    52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.


Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).
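The reported eigenvalue, Wilks' Lambda, and Chi-Square values are linked by standard formulas, which can be sketched as follows. This is a hedged illustration rather than a reproduction of the SPSS output: it assumes that Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, that the Chi-Square is Bartlett's approximation, and that the analysis used two predictors (Reading to Learn and Reading to Integrate) and three groups with n = 143. The function names are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over functions."""
    lam = 1.0
    for eigenvalue in eigenvalues:
        lam *= 1.0 / (1.0 + eigenvalue)
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' Lambda.

    Returns (chi-square, degrees of freedom).
    """
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2
    chi_square = -multiplier * math.log(lam)
    df = n_predictors * (n_groups - 1)
    return chi_square, df


# ASSUMPTION: only the first eigenvalue (.92) is used, since it is reported
# to account for 99.9% of the variance; the second is treated as negligible.
lam = wilks_lambda([0.92])
chi_square, df = bartlett_chi_square(0.52, n_cases=143, n_predictors=2, n_groups=3)
print(f"Lambda from eigenvalue: {lam:.2f}")
print(f"Chi-square from reported Lambda: {chi_square:.2f} on {df} df")
```

With these inputs the computed Lambda rounds to the reported .52, and the Chi-Square falls within rounding distance of the reported 90.93; the small gap is expected because Lambda is published to only two decimal places.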

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
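The reclassification totals in this paragraph can be tallied directly from the Table 10 counts. The sketch below is illustrative only (the names `TABLE_10` and `tally_reclassification` are hypothetical): rows are the initial TOEFL reading ability groups, columns are the predicted groups on the composite, both ordered high, mid, low.

```python
# Counts from Table 10. Rows: initial group (high, mid, low);
# columns: predicted group on the composite (high, mid, low).
TABLE_10 = [
    [37, 17, 3],
    [13, 18, 11],
    [2, 6, 36],
]


def tally_reclassification(matrix):
    """Count cases that stayed in, moved above, or moved below their initial group."""
    same = higher = lower = 0
    for row_index, row in enumerate(matrix):
        for col_index, count in enumerate(row):
            if col_index == row_index:
                same += count      # diagonal: classification unchanged
            elif col_index < row_index:
                higher += count    # predicted category is higher than initial
            else:
                lower += count     # predicted category is lower than initial
    total = same + higher + lower
    return {"same": same, "higher": higher, "lower": lower, "total": total}


result = tally_reclassification(TABLE_10)
for key in ("same", "higher", "lower"):
    share = 100 * result[key] / result["total"]
    print(f"{key}: {result[key]} ({share:.1f}%)")
```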

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group     Predicted group membership for Reading to      Initial
                          Learn/Reading to Integrate Composite           classification total
                          High       Mid        Low

Count
High (≥56)                37         17          3                        57
Mid (50–55)               13         18         11                        42
Low (≤49)                  2          6         36                        44
Reclassification total    52         41         50                       143

Percentage
High (≥56)                64.9       29.8        5.3                     100.0
Mid (50–55)               31.0       42.9       26.2                     100.0
Low (≤49)                  4.5       13.6       81.8                     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
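The three-way split can be illustrated with a short calculation. The sketch below assumes the cut points were taken at half a standard deviation on either side of the full native speaker sample mean on the Nelson–Denny (mean 126.48, sd 16.46, n = 105, as reported in Table 1); the function names are hypothetical, and the exact rounding rule the authors applied is not stated in the text.

```python
# ASSUMPTION: Nelson-Denny mean and sd for the full native speaker sample
# (n = 105), read from Table 1 with two decimal places.
NELSON_DENNY_MEAN = 126.48
NELSON_DENNY_SD = 16.46


def sd_cut_points(mean, sd, width=0.5):
    """Return (lower, upper) cut points at mean -/+ width * sd."""
    return mean - width * sd, mean + width * sd


def classify(score, lower, upper):
    """Assign a reading ability level relative to the two cut points."""
    if score > upper:
        return "high"
    if score < lower:
        return "low"
    return "mid"


lower, upper = sd_cut_points(NELSON_DENNY_MEAN, NELSON_DENNY_SD)
print(f"cut points: {lower:.2f} and {upper:.2f}")
for score in (118, 119, 134, 135):
    print(score, classify(score, lower, upper))
```

For whole-number scores, these cut points reproduce the bands given in the text: 118 and below is low, 119 to 134 is mid, and 135 and above is high.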

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating that the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd

High (≥135)               39    141.26                   5.19
Mid (119–134)             37    125.65                   5.09
Low (≤118)                25    102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.


showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52.0%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group     Predicted group membership for Reading to      Initial
                          Learn/Reading to Integrate Composite           classification total
                          High       Mid        Low

Count
High (≥135)               28          3          8                        39
Mid (119–134)             10          7         20                        37
Low (≤118)                 5          7         13                        25
Reclassification total    43         17         41                       101

Percentage
High (≥135)               71.8        7.7       20.5                     100.0
Mid (119–134)             27.0       18.9       54.1                     100.0
Low (≤118)                20.0       28.0       52.0                     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task (multiple-choice basic comprehension) overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.
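The reclassification patterns discussed above amount to cross-tabulating each reader's initial basic comprehension band against the band assigned on the Reading to Learn/Reading to Integrate Composite. The following Python sketch shows only that tabulation step. It does not perform the discriminant analysis itself, and the function names and the eight band labels are hypothetical illustrations, not data or procedures from the study.

```python
from collections import Counter

BANDS = ("low", "mid", "high")

def reclassification_table(initial, reclassified):
    """Count readers by (initial band, band on the new composite)."""
    counts = Counter(zip(initial, reclassified))
    return {a: {b: counts[(a, b)] for b in BANDS} for a in BANDS}

def percent_retained(table, band):
    """Percentage of an initial band that stays in the same band."""
    row_total = sum(table[band].values())
    return 100.0 * table[band][band] / row_total if row_total else 0.0

# Hypothetical labels for eight readers (assumption, not study data).
initial      = ["low", "low", "mid", "mid", "mid", "mid", "high", "high"]
reclassified = ["low", "mid", "low", "mid", "mid", "high", "high", "mid"]

table = reclassification_table(initial, reclassified)
print(table["mid"])                    # one dropped, two stayed, one rose
print(percent_retained(table, "mid"))  # 50.0
```

Reading a row of the table shows how many readers from one initial band dropped, stayed, or rose, which is the kind of comparison the percentages in this section summarize.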

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillside, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points): 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/groundwater; water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points): 90 maximum
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Examples for Problems (1 point): 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Examples for Causes (1 point): 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point): 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples for Solutions (1 point): 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
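The 241-point theoretical maximum follows from the category ceilings printed in the rubric: 40 + 55 + 30 + 90 for the four chart categories, plus 5 + 10 + 6 + 5 for their examples. The Python sketch below checks that arithmetic and shows one way a capped total could be computed. It is a simplified illustration under stated assumptions: the function and dictionary names are hypothetical, the one-word partial credit (6 or 3 points) is ignored, and so is the option of awarding full category points to an example that names an overarching cause or solution.

```python
# Ceilings and per-item values as printed in the rubric.
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}
POINTS_PER_ITEM = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}

def theoretical_maximum():
    """Sum of all category and example ceilings."""
    return sum(CATEGORY_MAX.values()) + sum(EXAMPLE_MAX.values())

def chart_score(items, examples):
    """Capped total for counts of fully credited items and examples.

    Simplification (assumption): no one-word partial credit and no
    'overarching' example rule; every example is worth 1 point.
    """
    total = 0
    for category, per_item in POINTS_PER_ITEM.items():
        total += min(items.get(category, 0) * per_item, CATEGORY_MAX[category])
        total += min(examples.get(category, 0), EXAMPLE_MAX[category])
    return total

print(theoretical_maximum())  # 241
print(chart_score({"problems": 2, "causes": 3, "effects": 1, "solutions": 2},
                  {"causes": 4}))  # 64
```

The second call corresponds to a chart with two problems, three causes, one effect, two solutions and four cause examples: 20 + 15 + 5 + 20 + 4 = 64 points.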


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Lined space provided for the participant's response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).


0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
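Every band in the Macrostructures rubric can be read off the combined count of macrostructures identified across the two texts. The Python sketch below expresses that mapping. The function name and the `attempted` flag are hypothetical, and the sketch follows the band table as printed, which awards 5 points for a combined count of 0 or 1.

```python
def macrostructure_points(text_a, text_b, attempted=True):
    """Map macrostructures identified per text (0-4 each) to rubric points."""
    if not attempted:
        return 0  # No Response
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has exactly four macrostructures")
    total = text_a + text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good
    if total >= 4:
        return 15  # Fair
    if total >= 2:
        return 10  # Poor
    return 5       # Very Poor

print(macrostructure_points(4, 2))  # 20
print(macrostructure_points(3, 1))  # 15
```

Because only the combined count matters, the alternative splits listed in each band (for example, 4 and 2 versus 3 and 3) all land in the same band.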


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


were given 15 minutes to answer BC1. Following these new testing sessions, 49 participants were selected for a related interview concerning the cognitive processes used in task completion (for further details, see Trites, 2000: Chapter 6).

During Session 4, students were given 12 minutes to read two short texts (600 words each), either on computer or paper. After participants read the assigned texts, they were given 4 minutes to take one-half page of notes (Enright et al., 1998). Next, the texts were removed and participants were asked to demonstrate Reading to Integrate by writing a synthesis of the texts with the aid of their notes (15 minutes allowed for this task). After completing the Reading to Integrate task, participants were allowed to see the texts again and answered BC2 (15 minutes allowed for this task). In one Reading to Integrate session, for unknown reasons, six of the seven participants read only one text. Because we cannot explain the cause of this anomalous session, we have eliminated scores from the session’s seven participants from subsequent analyses, thus slightly reducing the N size for the Reading to Integrate measure.

b Variables used in study: The six independent variables included three nominal (Native Language Background, Medium of Text Presentation, and Level of Education) and three interval variables (Nelson–Denny, TOEFL Reading Comprehension, and Computer Familiarity). The four dependent variables were Reading to Learn, Reading to Integrate, BC1, and BC2.

IV Results

First, we present the descriptive statistics for all reading measures, followed by a systematic analysis of independent variables that might affect participant performance on the new measures. Scatterplots were checked for all reading measures to ensure normality of data. Kurtosis and skewness levels for all reading measures were found to be within normal limits, indicating a relatively normal distribution. Descriptive statistics for all existing measures are shown in Table 1. Means for these measures show a consistent pattern: the native speaker undergraduates had the highest mean, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On the reading measures, Nelson–Denny and TOEFL Reading Comprehension, the nonnative speaker undergraduates showed the largest variance in performance, while on the computer familiarity measure the variance of both nonnative speaker groups was substantially larger than that of the native speakers.
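As an illustration of the skewness and kurtosis screening mentioned above, the following Python sketch computes both statistics from a list of scores using population moments. It is only a sketch: the function name is hypothetical, the choice of moment-based estimators is an assumption, and no cut-off for ‘normal limits’ is coded because the text does not state which limits were applied.

```python
def skewness_and_excess_kurtosis(scores):
    """Return (skewness, excess kurtosis) from population moments.

    For a normal distribution both values are close to 0.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # variance
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    if m2 == 0:
        raise ValueError("scores have zero variance")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical scores, symmetric around 3 (not data from the study).
skew, kurt = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
print(round(skew, 3), round(kurt, 3))  # 0.0 -1.3
```

A symmetric set of scores gives a skewness of zero, as in the example; the negative excess kurtosis reflects the flat, evenly spread values.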

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

Group                 n      Mean     sd      kMax

Nelson–Denny
NSU                   105    126.48   16.46   156
NNSU                  106     67.24   31.91   156
NNSG                   40     88.88   21.88   156
Total participants    251     95.47   36.93   156

TOEFL Reading Comprehension
NSU                   105     61.30    4.24    67
NNSU                  106     50.30    8.53    67
NNSG                   40     57.15    4.55    67
Total participants    251     55.99    8.19    67

Computer familiarity
NSU                   104     38.08    3.60    44
NNSU                  104     34.82    5.99    44
NNSG                   40     35.63    6.02    44
Total participants    248     36.31    5.33    44

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                  n     Mean    sd      kMax

Reading to Learn (chart)
  NSU                  105   51.85   19.86   241
  NNSU                 106   31.73   19.50   241
  NNSG                  40   44.68   19.27   241
  Total participants   251   42.21   21.64   241

Basic Comprehension Test 1
  NSU                  105   16.98    2.47    20
  NNSU                 106   11.73    4.25    20
  NNSG                  40   14.98    3.50    20
  Total participants   251   14.44    4.23    20

Reading to Integrate (synthesis)
  NSU                  101   63.65   11.05    80
  NNSU                 103   37.24   21.76    80
  NNSG                  40   53.60   21.03    80
  Total participants   244   50.86   21.63    80

Basic Comprehension Test 2
  NSU                  105   15.91    2.78    20
  NNSU                 106    9.75    4.54    20
  NNSG                  40   12.85    3.61    20
  Total participants   251   12.82    4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with

Table 3 Range of scores for new measures for three participant groups

Group                  n     Minimum   Maximum   kMax

Reading to Learn (chart)
  NSU                  105   14        120       241
  NNSU                 106    0         86       241
  NNSG                  40    3         94       241
  Total participants   251    0        120       241

Reading to Integrate (synthesis)
  NSU                  101   38         80        80
  NNSU                 103    0         80        80
  NNSG                  40    5         80        80
  Total participants   244    0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
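The post hoc contrasts described above can be checked approximately from the published summary statistics. The sketch below is not the authors' SPSS procedure; it assumes the standard Scheffé criterion for pairwise contrasts, takes the group means and sizes from Table 2 and the error mean square from Table 4 (read as 389.11), and uses an approximate tabled critical value of F(2, 239) = 3.03 at the .05 level, which is our assumption rather than a figure from the article.

```python
import math

# Summary values read from Tables 2 and 4 (Reading to Learn).
MS_ERROR = 389.11   # error mean square from the univariate ANOVA
K_GROUPS = 3        # NSU, NNSU, NNSG
F_CRIT = 3.03       # ASSUMPTION: approximate critical F(2, 239) at alpha = .05

# group label -> (n, mean on Reading to Learn)
groups = {
    "NSU": (105, 51.85),
    "NNSU": (106, 31.73),
    "NNSG": (40, 44.68),
}


def scheffe_contrast(a, b):
    """Return (mean difference, standard error, significant?) for groups a and b."""
    n_a, mean_a = groups[a]
    n_b, mean_b = groups[b]
    diff = mean_a - mean_b
    se = math.sqrt(MS_ERROR * (1 / n_a + 1 / n_b))
    # Scheffe criterion: |diff| / se must exceed sqrt((k - 1) * F_crit).
    critical_ratio = math.sqrt((K_GROUPS - 1) * F_CRIT)
    return diff, se, abs(diff) / se > critical_ratio


for a, b in [("NSU", "NNSU"), ("NSU", "NNSG"), ("NNSU", "NNSG")]:
    diff, se, sig = scheffe_contrast(a, b)
    print(f"{a} vs {b}: diff = {diff:.2f}, se = {se:.2f}, significant = {sig}")
```

Under these assumptions the standard errors agree with those reported in Table 5, and only the contrast between native speaker undergraduates and nonnative speaker graduates falls short of the criterion.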

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation,

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                    2   10934.64      28.10*
Medium                           1215.86                    1    1215.86       3.13
Test order                        294.98                    1     294.98       0.76
Group × medium                    334.81                    2     167.40       0.43
Group × test order                  4.37                    2       2.19       0.01
Medium × test order               391.73                    1     391.73       1.01
Group × medium × test order       575.29                    2     287.65       0.74
Error                           92997.45                  239     389.11

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.


computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
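As a reading aid for Table 6, each F value in a univariate ANOVA of this kind is the effect's mean square divided by the error mean square. The sketch below recomputes the ratios from the tabled mean squares (decimal placement as we read the table); the dictionary keys and function name are ours, chosen for illustration.

```python
# Mean squares read from Table 6 (Reading to Integrate).
MS_ERROR = 325.13

mean_squares = {
    "Group": 18147.17,
    "Medium": 192.95,
    "Test order": 97.83,
    "Group x medium": 5.91,
    "Group x test order": 15.07,
    "Medium x test order": 1098.72,
    "Group x medium x test order": 744.29,
}


def f_ratio(ms_effect, ms_error=MS_ERROR):
    """F statistic for one effect: effect mean square over error mean square."""
    return ms_effect / ms_error


f_values = {source: round(f_ratio(ms), 2) for source, ms in mean_squares.items()}

for source, f_value in f_values.items():
    print(f"{source}: F = {f_value:.2f}")
```

Only the group effect yields a large ratio; the remaining ratios are small, in line with the statement that no other main effect or interaction was significant.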

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error

NSU     105   NNSU    106   20.12*            2.72
              NNSG     40    7.17             3.67
NNSU    106   NNSG     40   12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                    2   18147.17      55.82ᵇ
Medium                            192.95                    1     192.95       0.59
Test order                         97.83                    1      97.83       0.30
Group × medium                     11.82                    2       5.91       0.02
Group × test order                 30.14                    2      15.07       0.05
Medium × test order              1098.72                    1    1098.72       3.38
Group × medium × test order      1488.58                    2     744.29       2.29
Error                           75430.37                  232     325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05


related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
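The concern that pooled correlations may partly reflect between-group differences can be illustrated with a small constructed example. The data below are invented for illustration only and are not the study's scores: within each of three hypothetical groups the two measures are uncorrelated by construction, yet pooling the groups, whose means differ, produces a very high correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical within-group deviations, chosen so that they are uncorrelated.
dev_x = [-1, -1, 1, 1]
dev_y = [-1, 1, -1, 1]
group_means = [0, 10, 20]   # hypothetical group means on both measures

within_r = []
pooled_x, pooled_y = [], []
for mean in group_means:
    xs = [mean + d for d in dev_x]
    ys = [mean + d for d in dev_y]
    within_r.append(pearson_r(xs, ys))
    pooled_x += xs
    pooled_y += ys

pooled_r = pearson_r(pooled_x, pooled_y)
print("within-group r:", within_r)
print("pooled r:", round(pooled_r, 3))
```

The example is deliberately extreme; it shows only the direction of the effect that motivates examining correlations group by group.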

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research,

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group   n     Group   n     Mean difference   Standard error

NSU     101   NNSU    103   26.41ᵇ            2.53
              NNSG     40   10.05ᵇ            3.37
NNSU    103   NNSG     40   16.36ᵇ            3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05


Table 8 Correlations for all reading measures for all participants (n = 251)

                              TOEFL Reading   Basic Comprehension   Basic Comprehension   Reading    Reading to
                              Comprehension   Test 1                Test 2                to Learn   Integrateᵃ

Nelson–Denny                  .90ᵇ            .85ᵇ                  .84ᵇ                  .66ᵇ       .69ᵇ
TOEFL Reading comprehension   1.00            .85ᵇ                  .84ᵇ                  .64ᵇ       .69ᵇ
Basic comprehension 1                         1.00                  .84ᵇ                  .68ᵇ       .68ᵇ
Basic comprehension 2                                               1.00                  .68ᵇ       .70ᵇ
Reading to Learn                                                                          1.00       .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
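The outlier screen mentioned above can be sketched as follows for two predictors. This is an illustration rather than the authors' procedure: the score pairs are invented, and the .001 criterion is an assumption (the article does not state the alpha level used). With two predictors, the squared Mahalanobis distance is referred to a Chi-Square distribution with 2 degrees of freedom, whose critical value has the closed form -2 ln(alpha).

```python
import math


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample centroid."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form d' S^-1 d, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_critical_df2(alpha):
    """Critical Chi-Square value for 2 df: the survival function is exp(-x / 2)."""
    return -2.0 * math.log(alpha)


# HYPOTHETICAL (Reading to Learn, Reading to Integrate) score pairs.
scores = [(52, 60), (31, 35), (45, 50), (60, 72), (28, 41), (39, 44), (47, 66), (55, 58)]

d2 = mahalanobis_sq(scores)
cutoff = chi2_critical_df2(0.001)   # ASSUMPTION: alpha = .001
outliers = [pair for pair, dist in zip(scores, d2) if dist > cutoff]
print("cutoff:", round(cutoff, 3), "outliers:", outliers)
```

In practice the screen would be run separately within each reading ability group, as the text describes.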

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid,


and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL reading comprehension   sd

High (≥56)               57   59.68                                 3.03
Mid (50–55)              42   52.81                                 1.67
Low (≤49)                44   42.11                                 5.47
Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.


Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).
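The reported Wilks' Lambda and Chi-Square value for the nonnative speaker analysis can be related to the eigenvalue through standard formulas. The sketch below assumes Bartlett's Chi-Square approximation and uses only the first eigenvalue, read as .92 (the second function is described as negligible); because the published figures are rounded, only approximate agreement should be expected. Function names are ours.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    lam = 1.0
    for eigenvalue in eigenvalues:
        lam *= 1.0 / (1.0 + eigenvalue)
    return lam


def bartlett_chi_square(n_cases, n_predictors, n_groups, lam):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(lam)


# Nonnative speakers: 143 cases, 2 predictors (the two new measures), 3 ability groups.
lam = wilks_lambda([0.92])
chi2 = bartlett_chi_square(143, 2, 3, lam)
print(f"Wilks' Lambda = {lam:.2f}, Chi-Square = {chi2:.1f}")
```

The resulting Lambda rounds to the reported value, and the Chi-Square value is close to the reported one, with the small gap plausibly due to rounding of the eigenvalue.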

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
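The reclassification tallies in this paragraph follow directly from the cell counts of Table 10. The sketch below, with function and variable names of our own choosing, sums the diagonal (unchanged), the cells where the predicted category ranks above the initial one (moved up), and the cells where it ranks below (moved down).

```python
# Cell counts from Table 10. Rows: initial TOEFL-based group (High, Mid, Low);
# columns: group predicted from the composite (High, Mid, Low).
TABLE_10 = [
    [37, 17, 3],    # initial High
    [13, 18, 11],   # initial Mid
    [2, 6, 36],     # initial Low
]


def reclassification_summary(counts):
    """Tally unchanged, upward, and downward reclassifications."""
    same = higher = lower = 0
    for initial, row in enumerate(counts):
        for predicted, count in enumerate(row):
            if predicted == initial:
                same += count
            elif predicted < initial:   # index 0 is the highest category
                higher += count
            else:
                lower += count
    return {
        "total": same + higher + lower,
        "same": same,
        "higher": higher,
        "lower": lower,
        "moved": higher + lower,
    }


summary = reclassification_summary(TABLE_10)
for key in ("same", "higher", "lower", "moved"):
    share = 100 * summary[key] / summary["total"]
    print(f"{key}: {summary[key]} ({share:.1f}%)")
```

The same function can be applied to the native speaker counts in Table 12.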

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                           Predicted group membership for Reading to
                           Learn/Reading to Integrate Composite        Initial
Reading ability group      High      Mid       Low                     classification total

Count
  High (≥56)               37        17         3                       57
  Mid (50–55)              13        18        11                       42
  Low (≤49)                 2         6        36                       44
  Reclassification total   52        41        50                      143

Percentage
  High (≥56)               64.9      29.8       5.3                    100.0
  Mid (50–55)              31.0      42.9      26.2                    100.0
  Low (≤49)                 4.5      13.6      81.8                    100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
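The native speaker cut scores can be reproduced approximately from the sample statistics. The sketch below assumes the split used the Nelson–Denny mean and standard deviation reported for all 105 native speakers in Table 1 (read as 126.48 and 16.46) and that whole-number scores were compared against the unrounded boundaries; the article does not spell out either detail, so both are assumptions.

```python
# Nelson-Denny summary statistics for the native speaker group (Table 1).
NSU_MEAN = 126.48
NSU_SD = 16.46
HALF_WIDTH = 0.5   # boundaries lie this many standard deviations from the mean

lower_bound = NSU_MEAN - HALF_WIDTH * NSU_SD   # about 118.25
upper_bound = NSU_MEAN + HALF_WIDTH * NSU_SD   # about 134.71


def classify(score):
    """Assign a Nelson-Denny score to the high, mid, or low reading ability group."""
    if score > upper_bound:
        return "high"
    if score > lower_bound:
        return "mid"
    return "low"


for score in (118, 119, 134, 135):
    print(score, classify(score))
```

Under these assumptions the boundaries reproduce the stated categories: 135 and above is high, 119 to 134 is mid, and 118 and below is low.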

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd

High (≥135)              39   141.26                  5.19
Mid (119–134)            37   125.65                  5.09
Low (≤118)               25   102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.


showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                           Predicted group membership for Reading to
                           Learn/Reading to Integrate Composite        Initial
Reading ability group      High      Mid       Low                     classification total

Count
  High (≥135)              28         3         8                       39
  Mid (119–134)            10         7        20                       37
  Low (≤118)                5         7        13                       25
  Reclassification total   43        17        41                      101

Percentage
  High (≥135)              71.8       7.7      20.5                    100.0
  Mid (119–134)            27.0      18.9      54.1                    100.0
  Low (≤118)               20.0      28.0      52.0                    100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading


Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact


perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time


would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points), 1 word (6 points): 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points), 1 word (3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points), 1 word (3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points), 1 word (6 points): 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN/Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
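As a check on the arithmetic, the 241-point ceiling of the Reading to Learn rubric can be recovered from the category and example maxima listed in Appendix 1b. The following is a minimal Python sketch; the dictionary layout and the function name are illustrative assumptions and are not part of the original scoring materials.

```python
# Maxima as listed in the Appendix 1b rubric.
# The names and the dictionary structure are illustrative assumptions.
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def max_total() -> int:
    """Sum the category maxima and the example maxima."""
    return sum(CATEGORY_MAX.values()) + sum(EXAMPLE_MAX.values())


print(max_total())  # 241
```

The category maxima alone contribute 215 points and the examples a further 26, which matches the stated total of 241.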


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant’s response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations, and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
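The three rubric components above can be combined into a single Reading to Integrate score. The sketch below is a hedged illustration in Python: it assumes the total is the simple sum of the three band scores (consistent with the 80-point maximum, 50 + 25 + 5, reported for this measure), and all function and variable names are hypothetical rather than taken from the original scoring procedure.

```python
# Band scores taken from the Appendix 2b rubric.
# Function and variable names are hypothetical.
INTEGRATION_BANDS = (0, 10, 20, 30, 40, 50)
DETAIL_BANDS = (0, 1, 2, 3, 4, 5)


def macrostructure_score(total_identified: int) -> int:
    """Map the number of macrostructures identified (0-8) to a band score.

    Applies to attempted responses only; the rubric assigns 0 when the
    participant does not attempt a response.
    """
    if not 0 <= total_identified <= 8:
        raise ValueError("total_identified must be between 0 and 8")
    if total_identified == 8:
        return 25
    if total_identified >= 6:
        return 20
    if total_identified >= 4:
        return 15
    if total_identified >= 2:
        return 10
    return 5


def reading_to_integrate_total(integration: int, macro_total: int, details: int) -> int:
    """Assumed composite: sum of the three rubric components (maximum 80)."""
    if integration not in INTEGRATION_BANDS:
        raise ValueError("integration must be one of the rubric bands")
    if details not in DETAIL_BANDS:
        raise ValueError("details must be one of the rubric bands")
    return integration + macrostructure_score(macro_total) + details
```

For example, a response rated Good on integration (40), with six macrostructures identified (band 20) and Fair use of details (3), would receive 63 points under this assumed summation.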


showed the largest variance in performance, while on the computer familiarity measure, the variance of both nonnative speaker groups was substantially larger than that of the native speakers.

The same pattern emerged for the means on the new measures (see Table 2) as for the existing measures. The native speaker undergraduate group performed better on all new measures than both of the nonnative speaker groups. The nonnative speaker graduate group performed better than the nonnative speaker undergraduate group on all measures as well. This robust pattern of performance was also found in the variance of three of the four new measures. On BC1 and BC2, the performance of the native speaker undergraduates showed the least amount of variance, followed by the nonnative speaker graduates, followed by the nonnative speaker undergraduates. On Reading to Integrate, the native speaker undergraduate group showed substantially less variance than the nonnative speaker groups; however, the variance of the two nonnative speaker groups was almost identical. On Reading to Learn, all three groups showed considerable variance.

Table 1 Descriptive statistics for existing measures for three participant groups

| Measure | Group | n | Mean | sd | kMax |
|---|---|---|---|---|---|
| Nelson–Denny | NSU | 105 | 126.48 | 16.46 | 156 |
| | NNSU | 106 | 67.24 | 31.91 | 156 |
| | NNSG | 40 | 88.88 | 21.88 | 156 |
| | Total participants | 251 | 95.47 | 36.93 | 156 |
| TOEFL Reading comprehension | NSU | 105 | 61.30 | 4.24 | 67 |
| | NNSU | 106 | 50.30 | 8.53 | 67 |
| | NNSG | 40 | 57.15 | 4.55 | 67 |
| | Total participants | 251 | 55.99 | 8.19 | 67 |
| Computer familiarity | NSU | 104 | 38.08 | 3.60 | 44 |
| | NNSU | 104 | 34.82 | 5.99 | 44 |
| | NNSG | 40 | 35.63 | 6.02 | 44 |
| | Total participants | 248 | 36.31 | 5.33 | 44 |

Note: kMax = number of items or maximum possible score.

Table 3 reveals the range of awarded points achieved by all participant groups. The nature of the Reading to Learn point system created a maximum possible point value (241) that no participant achieved. We speculate that there are at least three possible causes of the discrepancy between the theoretical maximum and the range of observed scores:

• task novelty: no participant reported ever doing such a task previously;
• time allowed for task completion; and
• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

| Measure | Group | n | Mean | sd | kMax |
|---|---|---|---|---|---|
| Reading to Learn (chart) | NSU | 105 | 51.85 | 19.86 | 241 |
| | NNSU | 106 | 31.73 | 19.50 | 241 |
| | NNSG | 40 | 44.68 | 19.27 | 241 |
| | Total participants | 251 | 42.21 | 21.64 | 241 |
| Basic Comprehension Test 1 | NSU | 105 | 16.98 | 2.47 | 20 |
| | NNSU | 106 | 11.73 | 4.25 | 20 |
| | NNSG | 40 | 14.98 | 3.50 | 20 |
| | Total participants | 251 | 14.44 | 4.23 | 20 |
| Reading to Integrate (synthesis) | NSU | 101 | 63.65 | 11.05 | 80 |
| | NNSU | 103 | 37.24 | 21.76 | 80 |
| | NNSG | 40 | 53.60 | 21.03 | 80 |
| | Total participants | 244 | 50.86 | 21.63 | 80 |
| Basic Comprehension Test 2 | NSU | 105 | 15.91 | 2.78 | 20 |
| | NNSU | 106 | 9.75 | 4.54 | 20 |
| | NNSG | 40 | 12.85 | 3.61 | 20 |
| | Total participants | 251 | 12.82 | 4.72 | 20 |

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

Table 3 Range of scores for new measures for three participant groups

| Measure | Group | n | Minimum | Maximum | kMax |
|---|---|---|---|---|---|
| Reading to Learn (chart) | NSU | 105 | 14 | 120 | 241 |
| | NNSU | 106 | 0 | 86 | 241 |
| | NNSG | 40 | 3 | 94 | 241 |
| | Total participants | 251 | 0 | 120 | 241 |
| Reading to Integrate (synthesis) | NSU | 101 | 38 | 80 | 80 |
| | NNSU | 103 | 0 | 80 | 80 |
| | NNSG | 40 | 5 | 80 | 80 |
| | Total participants | 244 | 0 | 80 | 80 |

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn, with group status, medium of text presentation, and test order as possible contributing factors.4 Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background as well as level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing, whether participants took Reading to Learn or Reading to Integrate first, had no significant effect.
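The reported statistics can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes the group F ratio from the sums of squares in Table 4 and applies the usual Scheffé criterion to the mean differences and standard errors in Table 5. The critical value of 3.03 for F(2, 239) at the .05 level is an approximate tabled value that we assume here, and the function names are illustrative; this is not the authors' analysis script.

```python
import math

# Values reported in Table 4 (Reading to Learn ANOVA).
SS_GROUP, DF_GROUP = 21869.29, 2
SS_ERROR, DF_ERROR = 92997.45, 239

# Assumed, approximate tabled critical value of F(2, 239) at alpha = .05.
F_CRIT = 3.03


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def scheffe_significant(mean_diff, std_error, n_groups, f_crit):
    """Scheffe criterion: |diff| / SE must exceed sqrt((k - 1) * F_crit)."""
    return abs(mean_diff) / std_error > math.sqrt((n_groups - 1) * f_crit)


# Mean differences and standard errors reported in Table 5.
CONTRASTS = {
    ("NSU", "NNSU"): (20.12, 2.72),
    ("NSU", "NNSG"): (7.17, 3.67),
    ("NNSU", "NNSG"): (12.95, 3.66),
}

group_f = f_ratio(SS_GROUP, DF_GROUP, SS_ERROR, DF_ERROR)
results = {
    pair: scheffe_significant(diff, se, 3, F_CRIT)
    for pair, (diff, se) in CONTRASTS.items()
}
```

Under these assumptions the recomputed F is close to the tabled 28.10, and the criterion flags the NSU/NNSU and NNSU/NNSG contrasts but not the NSU/NNSG contrast, in line with the pattern described in the text.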

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

| Source | Type III sum of squares | df | Mean square | F |
|---|---|---|---|---|
| Group | 21869.29 | 2 | 10934.64 | 28.10* |
| Medium | 1215.86 | 1 | 1215.86 | 3.13 |
| Test order | 294.98 | 1 | 294.98 | 0.76 |
| Group × medium | 334.81 | 2 | 167.40 | 0.43 |
| Group × test order | 4.37 | 2 | 2.19 | 0.01 |
| Medium × test order | 391.73 | 1 | 391.73 | 1.01 |
| Group × medium × test order | 575.29 | 2 | 287.65 | 0.74 |
| Error | 92997.45 | 239 | 389.11 | |

Note: *p < .05

4 Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

| Group | n | Group | n | Mean difference | Standard error |
|---|---|---|---|---|---|
| NSU | 105 | NNSU | 106 | 20.12* | 2.72 |
| | | NNSG | 40 | 7.17 | 3.67 |
| NNSU | 106 | NNSG | 40 | 12.95* | 3.66 |

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

| Source | Type III sum of squares | df | Mean square | F |
|---|---|---|---|---|
| Group | 36294.33 | 2 | 18147.17 | 55.82ᵇ |
| Medium | 192.95 | 1 | 192.95 | 0.59 |
| Test order | 97.83 | 1 | 97.83 | 0.30 |
| Group × medium | 11.82 | 2 | 5.91 | 0.02 |
| Group × test order | 30.14 | 2 | 15.07 | 0.05 |
| Medium × test order | 1098.72 | 1 | 1098.72 | 3.38 |
| Group × medium × test order | 1488.58 | 2 | 744.29 | 2.29 |
| Error | 75430.37 | 232 | 325.13 | |

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
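The concern that pooling groups can inflate correlations can be illustrated with a small constructed example. The Python sketch below uses invented scores for two hypothetical groups that differ only in their mean level; it does not use the study data. Within each group the two measures correlate at .50, yet the pooled correlation is close to 1 because the between-group difference dominates the variance.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented scores for a lower-scoring and a higher-scoring group.
# The second group repeats the first with both measures shifted by 20 points.
low_x = [1, 2, 3, 4, 5]
low_y = [3, 1, 4, 2, 5]
high_x = [x + 20 for x in low_x]
high_y = [y + 20 for y in low_y]

within = pearson_r(low_x, low_y)
pooled = pearson_r(low_x + high_x, low_y + high_y)
```

In this constructed case `within` is 0.5 for either group while `pooled` exceeds 0.98, which is why per-group correlations are a useful complement to correlations computed over the total population.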

5 Discriminant analysis

Because we were interested in determining how constructs differedwe sought additional analyses to help us better characterize the new constructs Of the several possible statistical methods that could have been employed two are most plausible multivariateanalysis of variance usually associated with experimental research

Table 7 Post hoc Scheffeacute for Reading to Integrate measure (n 244a)

Group n Group n Mean difference Standard error

NS 101 NNSU 103 2641b 253NNSG 40 1005b 337

NNSU 103 NNSG 40 1636b 336

Notes an size reduced for Reading to Integrate because of anomalous testingsession bp 05

Latricia Trites and Mary McGroarty 191

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading    Reading to
                               Comprehension    Test 1                Test 2                to Learn   Integrateᵃ
Nelson–Denny                   .90ᵇ             .85ᵇ                  .84ᵇ                  .66ᵇ       .69ᵇ
TOEFL Reading Comprehension    1.00             .85ᵇ                  .84ᵇ                  .64ᵇ       .69ᵇ
Basic Comprehension 1                           1.00                  .84ᵇ                  .68ᵇ       .68ᵇ
Basic Comprehension 2                                                 1.00                  .68ᵇ       .70ᵇ
Reading to Learn                                                                            1.00       .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels, high, middle (mid), and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group had 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance, treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid,


and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
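The three-way split described above can be expressed as a simple banding rule. The following is a minimal sketch, assuming integer scaled scores on the TOEFL Reading Comprehension section; the function name `toefl_reading_band` is hypothetical and is not part of the original analysis, which was run in SPSS.

```python
def toefl_reading_band(scaled_score: int) -> str:
    """Return the reading ability band for a TOEFL Reading Comprehension scaled score.

    Cut points follow the text: 56 and above is high, 50 to 55 is mid,
    49 and below is low. The function itself is illustrative only.
    """
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


if __name__ == "__main__":
    # A few illustrative scores, one per band.
    for score in (59, 52, 42):
        print(score, toefl_reading_band(score))
```

Scores at the boundaries (50 and 56) fall into the higher of the two adjacent bands, which matches the "greater than or equal to" wording in the text.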

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL reading comprehension   sd
High (≥ 56)              57   59.68                                 3.03
Mid (50–55)              42   52.81                                 1.67
Low (≤ 49)               44   42.11                                 5.47
Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.


Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).
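The reported eigenvalue, Wilks' Lambda, and Chi-Square are linked by standard textbook relations, and a short calculation shows that the published figures are mutually consistent. The sketch below is an illustration only: it assumes two predictors (Reading to Learn and Reading to Integrate), three groups, and n = 143, and it uses Bartlett's chi-square approximation, which the article does not name explicitly. The function names are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def canonical_correlation(eigenvalue):
    """Canonical correlation associated with one discriminant function."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


if __name__ == "__main__":
    # Nonnative speaker analysis: eigenvalue .92 (rounded, as reported).
    lam = wilks_lambda([0.92])
    chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
    print(f"Wilks' Lambda   = {lam:.2f}")
    print(f"canonical r     = {canonical_correlation(0.92):.2f}")
    print(f"chi-square (df = 4, approx.) = {chi2:.1f}")
```

Because the eigenvalue is rounded to two decimals in the text, the recomputed Chi-Square agrees with the reported 90.93 only approximately.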

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
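The reclassification tallies can be recomputed directly from the cell counts of a classification table. The sketch below is illustrative: it takes the nonnative speaker counts from Table 10 (rows are initial groups, columns are predicted groups, both ordered high, mid, low), and the function name `reclassification_summary` is hypothetical.

```python
# Cell counts from Table 10: rows = initial group, columns = predicted group,
# both in the order high, mid, low.
TABLE_10 = [
    [37, 17, 3],   # initially high
    [13, 18, 11],  # initially mid
    [2, 6, 36],    # initially low
]


def reclassification_summary(table):
    """Count cases that stayed in place, moved up, or moved down."""
    size = len(table)
    total = sum(sum(row) for row in table)
    same = sum(table[i][i] for i in range(size))
    # With categories ordered high -> low, a column index smaller than the
    # row index means the case was predicted into a higher category.
    higher = sum(table[i][j] for i in range(size) for j in range(i))
    lower = sum(table[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "same": same, "higher": higher, "lower": lower}


if __name__ == "__main__":
    summary = reclassification_summary(TABLE_10)
    for key in ("same", "higher", "lower"):
        share = 100.0 * summary[key] / summary["total"]
        print(f"{key:>6}: {summary[key]:3d} ({share:.1f}%)")
```

The same function applies unchanged to the native speaker counts in Table 12.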

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                          Predicted group membership for
                          Reading to Learn/Reading to
                          Integrate Composite              Initial
Reading ability group     High      Mid       Low          classification total

Count
High (≥ 56)                37        17         3            57
Mid (50–55)                13        18        11            42
Low (≤ 49)                  2         6        36            44
Reclassification total     52        41        50           143

Percentage
High (≥ 56)              64.9      29.8       5.3         100.0
Mid (50–55)              31.0      42.9      26.2         100.0
Low (≤ 49)                4.5      13.6      81.8         100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension, high, mid, and low, based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
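The half-standard-deviation split can be illustrated with a short calculation. This sketch uses the Nelson–Denny mean (126.04) and standard deviation (16.52) reported in Table 11 for the 101 cases retained in the discriminant analysis; the original split was based on the full native speaker sample (n = 105), whose mean and standard deviation are not reported here, so the computed boundaries are an approximation of the published bands rather than a reproduction of them. The function name `sd_band_cutoffs` is hypothetical.

```python
def sd_band_cutoffs(mean, sd, width=0.5):
    """Return (lower, upper) boundaries at mean -/+ width * sd."""
    return mean - width * sd, mean + width * sd


if __name__ == "__main__":
    # Values from Table 11 (n = 101); an assumption standing in for the
    # unreported statistics of the full n = 105 sample.
    lower, upper = sd_band_cutoffs(mean=126.04, sd=16.52)
    print(f"lower boundary: {lower:.2f}")
    print(f"upper boundary: {upper:.2f}")
```

The upper boundary is consistent with the reported high band (135 and above). The lower boundary falls just under 118, so the reported low band (118 and below) presumably reflects the slightly different statistics of the full sample.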

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd
High (≥ 135)             39   141.26                  5.19
Mid (119–134)            37   125.65                  5.09
Low (≤ 118)              25   102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session


showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                          Predicted group membership for
                          Reading to Learn/Reading to
                          Integrate Composite              Initial
Reading ability group     High      Mid       Low          classification total

Count
High (≥ 135)               28         3         8            39
Mid (119–134)              10         7        20            37
Low (≤ 118)                 5         7        13            25
Reclassification total     43        17        41           101

Percentage
High (≥ 135)             71.8       7.7      20.5         100.0
Mid (119–134)            27.0      18.9      54.1         100.0
Low (≤ 118)              20.0      28.0      52.0         100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading


Comprehension scores, some could perform well on the new measures; some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact


perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time


would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analysis: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. U.S. News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems      Causes        Effects       Solutions

Examples      Examples      Examples      Examples

Ap

pen

dix

1b

Rea

din

g t

o L

earn

sco

rin

g r

ub

ric

0ndash2

41 t

ota

l po

ssib

le

Pro

ble

ms

(10

po

ints

) 1

wo

rd

Cau

ses

(5p

oin

ts)

1 w

ord

E

ffec

ts (

5 p

oin

ts)

1 w

ord

S

olu

tio

ns

(10

po

ints

en

d)

1 w

ord

(6 p

[Appendix 1b: Reading to Learn scoring key. The chart was printed sideways and its column layout is not recoverable here; point values and entries are listed without column assignment.]

Point values: ... points) 40 maximum; (3 points) 55 maximum; (3 points) 30 maximum; (6 points) 90 maximum

• Air pollution/smog in the National Parks
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil
• Aquatic life injured (damaged)
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Identification of pollution sources
• Industry modifications: installation of devices (scrubbers) to reduce pollution
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• Public outcry over stance of big business/government
• National Parks limited jurisdiction of pollution sources outside of the parks
• Objecting to construction permits
• Public pressure to protect parks
Appendix 1b (continued)

[Rotated chart; column layout is not recoverable here. Point values and entries are listed without column assignment.]

Point values: Examples (1 point) 5 maximum; Examples (1 point, or 5 points with overarching cause listed above); Examples (1 point) 6 maximum; Examples (1 point, or 10 points with overarching solution listed above); Examples (1 point) 10 maximum; Examples (1 point) 5 maximum

• Great Smoky Mountain National Park
• Grand Canyon National Park
• Los Angeles
• Atlanta
• New York
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• TN Luttrell corp. building permit situation
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Southern California Edison Plant, Laughlin, NV
• Southern California Edison Plant in Laughlin, NV, identified as point source
• Tennessee Luttrell Kilns, TN
• Power plants in TN and OH river valley
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
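The macrostructure bands in the rubric can be read as a lookup on the combined number of macrostructures identified across the two texts. The Python sketch below encodes that lookup. The function names are hypothetical, and summing the three subscores into a single total is an assumption (it is consistent with an 80-point maximum, since 50 + 25 + 5 = 80), not a procedure stated in the rubric itself.

```python
def macrostructure_score(count_text1, count_text2, responded=True):
    """Map macrostructures identified in each text (0-4 each) to a rubric band."""
    if not responded:
        return 0  # "No Response" band
    for count in (count_text1, count_text2):
        if not 0 <= count <= 4:
            raise ValueError("each text has four macrostructures")
    total = count_text1 + count_text2
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)


def reading_to_integrate_total(integration, macrostructures, details):
    """Assumed composite: simple sum of the three rubric subscores."""
    return integration + macrostructures + details


best_possible = reading_to_integrate_total(50, macrostructure_score(4, 4), 5)
```

Under this reading, a response that integrates excellently, identifies all eight macrostructures, and uses details excellently reaches the 80-point ceiling.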


• task novelty: no participant reported ever doing such a task previously;

• time allowed for task completion; and

• space on the response sheet: space constraints may have limited the amount of information that participants could include.

Future research would need to address these issues. However, for the Reading to Integrate measure, the full range of possible point totals was achieved by at least one participant in each group.

1 Computer familiarity

The overall plan for the analyses was to check the influence of the independent variables on the dependent measures, with computer familiarity being addressed first. Initially, we had proposed that if computer familiarity was significantly different across groups, it would be entered into all calculations as a covariate. To determine this, it was necessary to conduct an Analysis of Variance (ANOVA) for computer familiarity across the six participant/medium subgroups.

Table 2 Descriptive statistics for new measures for three participant groups

Group                   n     Mean    sd      kMax

Reading to Learn (chart)
  NSU                   105   51.85   19.86   241
  NNSU                  106   31.73   19.50   241
  NNSG                   40   44.68   19.27   241
  Total participants    251   42.21   21.64   241

Basic Comprehension Test 1
  NSU                   105   16.98    2.47    20
  NNSU                  106   11.73    4.25    20
  NNSG                   40   14.98    3.50    20
  Total participants    251   14.44    4.23    20

Reading to Integrate (synthesis)
  NSU                   101   63.65   11.05    80
  NNSU                  103   37.24   21.76    80
  NNSG                   40   53.60   21.03    80
  Total participants    244   50.86   21.63    80

Basic Comprehension Test 2
  NSU                   105   15.91    2.78    20
  NNSU                  106    9.75    4.54    20
  NNSG                   40   12.85    3.61    20
  Total participants    251   12.82    4.72    20

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.
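For readers less familiar with the procedure, the F statistic of a one-way ANOVA is the ratio of between-group to within-group mean squares. The following Python sketch computes it from scratch. The scores are invented purely for illustration and are not the study's questionnaire data; the function name is hypothetical, and the study itself used a statistical package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on lists of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Between-group sum of squares: group size times squared distance of
    # each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each score from its
    # own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three made-up subgroups (illustration only).
subgroups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova_f(subgroups)
```

A significant F only indicates that some difference exists among the subgroups; locating which pairs differ requires a post hoc procedure such as the Scheffé test used in the study.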

Because both Research Questions 1 and 2 are similar – except that they address the two different new reading measures, Reading to Learn and Reading to Integrate – we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

Table 3 Range of scores for new measures for three participant groups

Group                   n     Minimum   Maximum   kMax

Reading to Learn (chart)
  NSU                   105   14        120       241
  NNSU                  106    0         86       241
  NNSG                   40    3         94       241
  Total participants    251    0        120       241

Reading to Integrate (synthesis)
  NSU                   101   38         80        80
  NNSU                  103    0         80        80
  NNSG                   40    5         80        80
  Total participants    244    0         80        80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn, with group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
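The mean differences examined in the post hoc contrasts follow directly from the group means on Reading to Learn. As a quick arithmetic check, the Python sketch below recomputes them, reading the Table 2 means as 51.85 (NSU), 31.73 (NNSU), and 44.68 (NNSG). The function name is hypothetical, and the sketch does not reproduce the Scheffé significance test itself.

```python
from itertools import combinations

# Reading to Learn group means, read from Table 2.
reading_to_learn_means = {"NSU": 51.85, "NNSU": 31.73, "NNSG": 44.68}


def pairwise_differences(group_means):
    """Absolute mean difference for every pair of groups, rounded to 2 places."""
    return {
        (first, second): round(abs(group_means[first] - group_means[second]), 2)
        for first, second in combinations(group_means, 2)
    }


differences = pairwise_differences(reading_to_learn_means)
```

The three values agree with the mean differences listed in Table 5, with the smallest gap falling between the native speaker undergraduates and the nonnative speaker graduates.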

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                    2   10934.64      28.10*
Medium                           1215.86                    1    1215.86       3.13
Test order                        294.98                    1     294.98       0.76
Group × medium                    334.81                    2     167.40       0.43
Group × test order                  4.37                    2       2.19       0.01
Medium × test order               391.73                    1     391.73       1.01
Group × medium × test order       575.29                    2     287.65       0.74
Error                           92997.45                  239     389.11

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
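Each F value in the ANOVA summary is the effect's mean square divided by the error mean square, where a mean square is a sum of squares divided by its degrees of freedom. The Python sketch below recomputes two entries, reading the Table 6 sums of squares as 36294.33 (group), 192.95 (medium), and 75430.37 (error, df = 232). The function name is hypothetical, and small discrepancies from the printed values reflect rounding in the published table.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F statistic: effect mean square divided by error mean square."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Sums of squares and degrees of freedom as read from Table 6.
SS_ERROR, DF_ERROR = 75430.37, 232

f_group = f_ratio(36294.33, 2, SS_ERROR, DF_ERROR)   # close to the printed 55.82
f_medium = f_ratio(192.95, 1, SS_ERROR, DF_ERROR)    # close to the printed 0.59
```

The contrast between the two results illustrates the finding in the text: group membership accounts for far more variance than medium of presentation.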

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error

NSU     105   NNSU    106   20.12*            2.72
              NNSG     40    7.17             3.67
NNSU    106   NNSG     40   12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                    2   18147.17      55.82ᵇ
Medium                            192.95                    1     192.95       0.59
Test order                         97.83                    1      97.83       0.30
Group × medium                     11.82                    2       5.91       0.02
Group × test order                 30.14                    2      15.07       0.05
Medium × test order              1098.72                    1    1098.72       3.38
Group × medium × test order      1488.58                    2     744.29       2.29
Error                           75430.37                  232     325.13

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
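The concern that pooled correlations can be an artifact of combining groups is easy to demonstrate numerically. In the Python sketch below, two invented groups each show zero correlation between two measures, yet pooling them produces a very high correlation simply because the groups differ in mean level on both measures. All scores are hypothetical and chosen only to make the effect visible; they are not drawn from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical scores: within each group the two measures are unrelated.
low_group = ([1, 2, 3, 4], [2, 4, 1, 3])
high_group = ([11, 12, 13, 14], [12, 14, 11, 13])

r_low = pearson_r(*low_group)
r_high = pearson_r(*high_group)
# Pooling the groups mixes between-group mean differences into the correlation.
r_pooled = pearson_r(low_group[0] + high_group[0], low_group[1] + high_group[1])
```

This is why examining the correlations separately for each participant group, as the authors did, gives a more conservative picture than the pooled coefficients.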

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

Group   n     Group   n     Mean difference   Standard error

NSU     101   NNSU    103   26.41ᵇ            2.53
              NNSG     40   10.05ᵇ            3.37
NNSU    103   NNSG     40   16.36ᵇ            3.36

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                              TOEFL Reading   Basic Comprehension   Basic Comprehension   Reading    Reading to
                              Comprehension   Test 1                Test 2                to Learn   Integrateᵃ

Nelson–Denny                  .90ᵇ            .85ᵇ                  .84ᵇ                  .66ᵇ       .69ᵇ
TOEFL Reading Comprehension   1.00            .85ᵇ                  .84ᵇ                  .64ᵇ       .69ᵇ
Basic Comprehension 1                         1.00                  .84ᵇ                  .68ᵇ       .68ᵇ
Basic Comprehension 2                                               1.00                  .68ᵇ       .70ᵇ
Reading to Learn                                                                          1.00       .59ᵇ

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers, TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
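The three-way split for nonnative speakers amounts to a simple threshold rule on the TOEFL Reading Comprehension scaled score. A minimal Python sketch of that rule follows; the function name is hypothetical, and the cut points (56 and above for high, 50 to 55 for mid, 49 and below for low) are taken from the description above.

```python
def classify_toefl_reading(scaled_score):
    """Assign a reading ability group from a TOEFL Reading scaled score."""
    if scaled_score >= 56:
        return "high"  # corresponds to 550 or above on the total-score scale
    if scaled_score >= 50:
        return "mid"   # corresponds to 500-549
    return "low"       # corresponds to below 500


example_groups = [classify_toefl_reading(score) for score in (60, 56, 55, 50, 49)]
```

Because the boundaries are fixed in advance by admission practice rather than derived from the sample, the resulting groups need not be equal in size.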

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL reading comprehension   sd

High (≥ 56)              57   59.68                                 3.03
Mid (50–55)              42   52.81                                 1.67
Low (≤ 49)               44   42.11                                 5.47

Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
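The eigenvalue, Wilks' Lambda, and Chi-Square reported for the nonnative speaker analysis are arithmetically linked. Wilks' Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions, and Bartlett's approximation converts it to a Chi-Square statistic. The Python sketch below applies both formulas using only the first function's eigenvalue (.92), since the second is not reported. Treating the analysis as having two predictors and three groups is an assumption, as is the use of Bartlett's approximation; the function names are hypothetical.

```python
from math import log, prod


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over functions."""
    return prod(1.0 / (1.0 + value) for value in eigenvalues)


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's Chi-Square approximation for testing Wilks' Lambda."""
    multiplier = n_cases - 1 - (n_predictors + n_groups) / 2
    return -multiplier * log(lam)


# Only the first eigenvalue is reported; the second is negligible here.
lam = wilks_lambda([0.92])
chi_square = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
```

Under these assumptions the sketch gives a Lambda of about .52 and a Chi-Square of about 91, close to the reported values; the small gap is attributable to rounding of the eigenvalue and the omitted second function.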

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
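The reclassification totals can be tallied directly from the cell counts in Table 10: the diagonal holds participants who kept their initial category, cells below the diagonal hold those who moved to a higher category, and cells above it hold those who moved lower. The Python sketch below performs that tally; the function name is hypothetical, and the row and column order (High, Mid, Low) follows the table.

```python
# Table 10 counts: rows = initial group (High, Mid, Low),
# columns = predicted group on the composite (High, Mid, Low).
nonnative_counts = [
    [37, 17, 3],
    [13, 18, 11],
    [2, 6, 36],
]


def reclassification_summary(matrix):
    """Count participants who stayed, moved to a higher, or moved to a lower category."""
    size = len(matrix)
    stayed = sum(matrix[i][i] for i in range(size))
    # Categories are ordered High, Mid, Low, so a smaller column index than
    # row index means reclassification into a higher category.
    moved_higher = sum(matrix[i][j] for i in range(size) for j in range(i))
    moved_lower = sum(matrix[i][j] for i in range(size) for j in range(i + 1, size))
    return {"stayed": stayed, "higher": moved_higher, "lower": moved_lower}


summary = reclassification_summary(nonnative_counts)
share_reclassified = round(100 * (summary["higher"] + summary["lower"]) / 143, 1)
```

The tally shows that downward reclassification was more common than upward reclassification among the nonnative speakers.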

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group    Predicted group membership for Reading to      Initial
                         Learn/Reading to Integrate Composite           classification
                         High      Mid       Low                        total

Count
  High (≥ 56)            37        17         3                          57
  Mid (50–55)            13        18        11                          42
  Low (≤ 49)              2         6        36                          44
  Reclassification total 52        41        50                         143

Percentage
  High (≥ 56)            64.9      29.8       5.3                       100.0
  Mid (50–55)            31.0      42.9      26.2                       100.0
  Low (≤ 49)              4.5      13.6      81.8                       100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
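Reading the split for native speakers as half a standard deviation on either side of the sample mean, the boundaries can be recomputed from the Nelson–Denny mean and standard deviation for this sample (126.04 and 16.52, as given in Table 11). The Python sketch below does so; the function name is hypothetical, and the exact rounding rule the authors applied to arrive at cut scores of 118 and 135 is not stated, so the result is only an approximate check.

```python
def half_sd_cutoffs(sample_mean, sample_sd, width=0.5):
    """Lower and upper boundaries at `width` standard deviations from the mean."""
    return sample_mean - width * sample_sd, sample_mean + width * sample_sd


# Nelson-Denny sample mean and standard deviation for the native speakers.
lower_bound, upper_bound = half_sd_cutoffs(126.04, 16.52)
```

The boundaries come out near 117.8 and 134.3, which sit close to the reported cut scores separating the low, mid, and high groups.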

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd

High (≥ 135)             39   141.26                  5.19
Mid (119–134)            37   125.65                  5.09
Low (≤ 118)              25   102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group    Predicted group membership for Reading to      Initial
                         Learn/Reading to Integrate Composite           classification
                         High      Mid       Low                        total

Count
  High (≥ 135)           28         3         8                          39
  Mid (119–134)          10         7        20                          37
  Low (≤ 118)             5         7        13                          25
  Reclassification total 43        17        41                         101

Percentage
  High (≥ 135)           71.8       7.7      20.5                       100.0
  Mid (119–134)          27.0      18.9      54.1                       100.0
  Low (≤ 118)            20.0      28.0      52.0                       100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
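The percentage panel of Table 12 is each count divided by its row total, that is, by the number of participants initially placed in that reading ability group. The Python sketch below recomputes the row percentages from the counts; the function name is hypothetical.

```python
# Table 12 counts: rows = initial group (High, Mid, Low),
# columns = predicted group on the composite (High, Mid, Low).
native_counts = [
    [28, 3, 8],
    [10, 7, 20],
    [5, 7, 13],
]


def row_percentages(matrix):
    """Express each cell as a percentage of its row total, to one decimal place."""
    return [
        [round(100 * count / sum(row), 1) for count in row]
        for row in matrix
    ]


native_percentages = row_percentages(native_counts)
```

Reading across the middle row makes the instability of the mid group plain: more than half of those participants fell into the low category on the composite.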

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may, then, overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
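The point values stated in the directions above can be expressed as a small scoring routine. The following Python sketch is illustrative only and is not the scoring procedure used in the study: the names `POINTS` and `chart_score` are hypothetical, and it assumes a rater has already counted the correct responses in each category (incorrect responses are simply not counted, since they carry no penalty).

```python
# Point values per correct response, as stated in the task directions.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum points for the number of correct responses in each category.

    correct_counts maps a category name to the count of correct responses;
    an unknown category name raises KeyError.
    """
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Hypothetical participant: 2 problems, 4 causes, 3 effects,
# 1 solution, and 6 examples judged correct.
response = {"problems": 2, "causes": 4, "effects": 3, "solutions": 1, "examples": 6}
print(chart_score(response))  # 20 + 20 + 15 + 10 + 6 = 71
```

The sketch deliberately ignores the reduced credit for one-word answers and the category maximums given in the scoring rubric of Appendix 1b.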


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points), 1 word (6 points), 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes (5 points), 1 word (3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects (5 points), 1 word (3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions (10 points), 1 word (6 points), 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations, and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
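The three parts of the Reading to Integrate rubric combine into a single score whose maximum (50 + 25 + 5) equals the 80 points reported as kMax for Reading to Integrate in Table 3. The Python sketch below is one hypothetical way to encode the rubric bands; the function names are illustrative and not taken from the study, and simple addition of the three parts is an assumption inferred from that maximum of 80.

```python
VALID_INTEGRATION = (0, 10, 20, 30, 40, 50)


def macrostructure_score(found_text_a, found_text_b, attempted=True):
    """Map macrostructures identified in each text (0-4 each) to a rubric band."""
    if not attempted:
        return 0                      # "No Response" band
    for found in (found_text_a, found_text_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has at most four macrostructures")
    total = found_text_a + found_text_b
    if total == 8:
        return 25                     # Excellent
    if total >= 6:
        return 20                     # Good (total 6 or 7)
    if total >= 4:
        return 15                     # Fair (total 4 or 5)
    if total >= 2:
        return 10                     # Poor (total 2 or 3)
    return 5                          # Very Poor (total 0 or 1)


def reading_to_integrate_score(integration, found_text_a, found_text_b,
                               details, attempted=True):
    """Add the three rubric parts (assumed additive; maximum 80)."""
    if not attempted:
        return 0
    if integration not in VALID_INTEGRATION:
        raise ValueError("integration must be one of 0, 10, 20, 30, 40, 50")
    if details not in range(0, 6):
        raise ValueError("details must be an integer from 0 to 5")
    return integration + macrostructure_score(found_text_a, found_text_b) + details


print(reading_to_integrate_score(40, 4, 2, 4))  # 40 + 20 + 4 = 64
```

Because the macrostructure band depends only on the total across both texts, combinations such as 4 and 2 or 3 and 3 fall in the same band, as the rubric's parenthetical alternatives indicate.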


The resulting ANOVA (F = 4.70, p < .05) showed a significant difference between subgroups on the computer familiarity questionnaire; therefore, a post hoc Scheffé test was done to locate significant contrasts. After analysis of all possible subgroup contrasts, the post hoc Scheffé revealed that the only significant difference in subgroups appeared between the native speaker undergraduates and nonnative speaker undergraduates who read texts on paper. Hence, although there was one significant contrast, it occurred in two subgroups reading on paper, not in any of the subgroups who read on computer. All groups generally scored high on computer familiarity, although, as noted, variance of the nonnative groups was greater. It was thus established that computer familiarity had no significant effect on participants who read texts on computer, so we did not use computer familiarity as a covariate in further analyses and proceeded to the three research questions of central interest to this study.

Because Research Questions 1 and 2 are similar, except that they address the two different new reading measures (Reading to Learn and Reading to Integrate), we approached them in the same manner, through ANOVA, to identify the independent variables that could have significantly affected the results on the new measures.

Table 3 Range of scores for new measures for three participant groups

Group                  n      Minimum    Maximum    kMax

Reading to Learn (chart)
NSU                    105    14         120        241
NNSU                   106    0          86         241
NNSG                   40     3          94         241
Total participants     251    0          120        241

Reading to Integrate (synthesis)
NSU                    101    38         80         80
NNSU                   103    0          80         80
NNSG                   40     5          80         80
Total participants     244    0          80         80

Notes: kMax = number of items or maximum possible score; n size reduced for Reading to Integrate because of anomalous testing session.

2 Research Question 1

The first research question asked if performance on a measure of Reading to Learn was affected by medium of presentation, computer familiarity, native language, or level of education. We calculated a univariate ANOVA with Type III sums of squares on Reading to Learn with group status, medium of text presentation, and test order as possible contributing factors.4 Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.
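Each F ratio in Table 4 is the effect's mean square (sum of squares divided by degrees of freedom) divided by the error mean square. The Python sketch below recomputes these ratios as a consistency check. It is illustrative only: the decimal placement of the table values is an assumption, since the extracted table lost its decimal points, and the names `TABLE_4` and `f_ratio` are hypothetical.

```python
# (sum of squares, df, reported F) per effect; decimal placement is assumed.
TABLE_4 = {
    "Group": (21869.29, 2, 28.10),
    "Medium": (1215.86, 1, 3.13),
    "Test order": (294.98, 1, 0.76),
    "Group x medium": (334.81, 2, 0.43),
    "Group x test order": (4.37, 2, 0.01),
    "Medium x test order": (391.73, 1, 1.01),
    "Group x medium x test order": (575.29, 2, 0.74),
}
ERROR_SS, ERROR_DF = 92997.45, 239


def f_ratio(ss, df, error_ss=ERROR_SS, error_df=ERROR_DF):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss / df) / (error_ss / error_df)


recomputed = {name: f_ratio(ss, df) for name, (ss, df, _) in TABLE_4.items()}
for name, (_, _, reported) in TABLE_4.items():
    diff = abs(recomputed[name] - reported)
    print(f"{name:30s} recomputed F = {recomputed[name]:6.2f}  "
          f"reported F = {reported:5.2f}  |diff| = {diff:.3f}")
```

Under the assumed decimal placement, every recomputed ratio agrees with the reported value to within about 0.01, which is consistent with rounding in the published table.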

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation,

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           21869.29                  2     10934.64      28.10*
Medium                          1215.86                   1     1215.86       3.13
Test order                      294.98                    1     294.98        0.76
Group × medium                  334.81                    2     167.40        0.43
Group × test order              4.37                      2     2.19          0.01
Medium × test order             391.73                    1     391.73        1.01
Group × medium × test order     575.29                    2     287.65        0.74
Error                           92997.45                  239   389.11

Note: *p < .05

4 Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.


computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we calculated a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error

NSU     105   NNSU    106   20.12*            2.72
              NNSG    40    7.17              3.67

NNSU    106   NNSG    40    12.95*            3.66

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244a) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F

Group                           36294.33                  2     18147.17      55.82b
Medium                          192.95                    1     192.95        0.59
Test order                      97.83                     1     97.83         0.30
Group × medium                  11.82                     2     5.91          0.02
Group × test order              30.14                     2     15.07         0.05
Medium × test order             1098.72                   1     1098.72       3.38
Group × medium × test order     1488.58                   2     744.29        2.29
Error                           75430.37                  232   325.13

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05


related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
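The point about correlations being sensitive to variance can be illustrated with a toy data set. The scores below are hypothetical, not study data, and are deliberately chosen to exaggerate the effect: within each group the two measures are uncorrelated, yet pooling groups whose means differ produces a high overall correlation.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (basic comprehension, new measure) score pairs per group.
GROUPS = {
    "low": [(40, 22), (42, 18), (44, 24), (46, 20)],
    "mid": [(50, 42), (52, 38), (54, 44), (56, 40)],
    "high": [(60, 62), (62, 58), (64, 64), (66, 60)],
}


def within_group_correlations(groups):
    """Correlation computed separately inside each group."""
    return {name: pearson(*zip(*pairs)) for name, pairs in groups.items()}


def pooled_correlation(groups):
    """Correlation computed after combining all groups into one sample."""
    pooled = [pair for pairs in groups.values() for pair in pairs]
    return pearson(*zip(*pooled))


if __name__ == "__main__":
    print("within groups:", within_group_correlations(GROUPS))
    print("pooled:", round(pooled_correlation(GROUPS), 2))
```

This is why the per-group correlations, rather than the total-sample values, are the more informative basis for judging how distinct the new measures are.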

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research,

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244a)

Group   n     Group   n     Mean difference   Standard error

NSU     101   NNSU    103   26.41b            2.53
              NNSG    40    10.05b            3.37

NNSU    103   NNSG    40    16.36b            3.36

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05


Table 8 Correlations for all reading measures for all participants (n = 251)

                                TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading    Reading to
                                Comprehension    Test 1                Test 2                to Learn   Integrate a

Nelson–Denny                    .90b             .85b                  .84b                  .66b       .69b
TOEFL Reading Comprehension     1.00             .85b                  .84b                  .64b       .69b
Basic Comprehension 1                            1.00                  .84b                  .68b       .68b
Basic Comprehension 2                                                  1.00                  .68b       .70b
Reading to Learn                                                                             1.00       .59b

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle [mid], and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid,


and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
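The three-way split described above amounts to a simple threshold rule on the scaled TOEFL Reading Comprehension score. A minimal sketch, with a hypothetical function name and hypothetical example scores:

```python
def toefl_reading_level(scaled_score):
    """Map a scaled TOEFL Reading Comprehension score to a reading ability group.

    Cut points follow the text: 56 and above is high, 50 to 55 is mid,
    49 and below is low.
    """
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for score in (42, 49, 50, 55, 56, 60):
        print(score, toefl_reading_level(score))
```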

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL Reading Comprehension   sd

High (≥56)              57    59.68                                 3.03
Mid (50–55)             42    52.81                                 1.67
Low (≤49)               44    42.11                                 5.47

Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.


Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).
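The reported eigenvalue, Wilks’ Lambda, and Chi-Square are linked by standard formulas, so they can be cross-checked. The sketch below assumes two predictors (Reading to Learn and Reading to Integrate), three groups, and n = 143; it infers the second eigenvalue from the first function's 99.9% share of variance and uses Bartlett's chi-square approximation. Function names are hypothetical, and the result should match the reported values only approximately because the inputs are rounded.

```python
import math


def second_eigenvalue(first, share_first):
    """Infer the second eigenvalue from the first and its share of total variance."""
    return first * (1.0 - share_first) / share_first


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    result = 1.0
    for value in eigenvalues:
        result *= 1.0 / (1.0 + value)
    return result


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


if __name__ == "__main__":
    eigenvalues = [0.92, second_eigenvalue(0.92, 0.999)]
    lam = wilks_lambda(eigenvalues)
    chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
    print(f"Wilks' Lambda ~ {lam:.2f}, chi-square ~ {chi2:.1f}")
```

With these inputs Lambda comes out near .52 and the chi-square near 91, in line with the reported .52 and 90.93.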

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
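The reclassification tallies can be derived directly from the Table 10 cell counts. In this sketch (hypothetical names), rows are the initial TOEFL-based groups and columns the predicted groups, both ordered high, mid, low; the diagonal gives participants who stayed in place, cells below it those who moved up, and cells above it those who moved down.

```python
# Table 10 counts: rows = initial group, columns = predicted group (high, mid, low).
TABLE_10 = [
    [37, 17, 3],   # initial high
    [13, 18, 11],  # initial mid
    [2, 6, 36],    # initial low
]


def reclassification_summary(matrix):
    """Count cases that stayed, moved to a higher level, or moved to a lower level."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    same = sum(matrix[i][i] for i in range(size))
    # Index 0 is the highest level, so column index < row index means moving up.
    higher = sum(matrix[i][j] for i in range(size) for j in range(size) if j < i)
    lower = sum(matrix[i][j] for i in range(size) for j in range(size) if j > i)
    return {"total": total, "same": same, "higher": higher, "lower": lower}


def as_percentages(summary):
    """Express each tally as a percentage of the total, rounded to one decimal."""
    total = summary["total"]
    return {key: round(100.0 * value / total, 1)
            for key, value in summary.items() if key != "total"}


if __name__ == "__main__":
    summary = reclassification_summary(TABLE_10)
    print(summary)
    print(as_percentages(summary))
```

From these counts, 91 of 143 participants stay on the diagonal, 21 move up, and 31 move down, so 52 in all (36.4%) change category.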

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                        Predicted group membership for
                        Reading to Learn/Reading to
                        Integrate Composite               Initial
Reading ability group   High      Mid       Low           classification total

Count
High (≥56)              37        17        3             57
Mid (50–55)             13        18        11            42
Low (≤49)               2         6         36            44
Reclassification total  52        41        50            143

Percentage
High (≥56)              64.9      29.8      5.3           100.0
Mid (50–55)             31.0      42.9      26.2          100.0
Low (≤49)               4.5       13.6      81.8          100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
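The half-standard-deviation split can be sketched as follows. The mean (126.04) and sd (16.52) are the Table 11 totals for the 101 retained cases, so the authors' exact figures for all 105 participants may differ slightly; rounding the bounds to whole scores is an assumption made here to reproduce the reported bands, and the function names are hypothetical.

```python
def half_sd_bounds(mean, sd, width=0.5):
    """Return (lower, upper) bounds at mean -/+ width standard deviations."""
    return mean - width * sd, mean + width * sd


def nelson_denny_level(score, lower, upper):
    """Classify a whole-number score against bounds rounded to whole scores.

    Assumption: scores at or below the rounded lower bound are low, and
    scores above the rounded upper bound are high.
    """
    if score <= round(lower):
        return "low"
    if score > round(upper):
        return "high"
    return "mid"


if __name__ == "__main__":
    lower, upper = half_sd_bounds(126.04, 16.52)
    print(f"bounds: {lower:.2f} to {upper:.2f}")
    for score in (118, 119, 134, 135):
        print(score, nelson_denny_level(score, lower, upper))
```

Under these assumptions the bounds fall at about 117.78 and 134.30, giving the reported bands of 118 and below, 119 to 134, and 135 and above.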

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd

High (≥135)             39    141.26                 5.19
Mid (119–134)           37    125.65                 5.09
Low (≤118)              25    102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.


showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.
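The row percentages for native speakers, and the share of the whole sample predicted as mid, follow from the Table 12 counts. A minimal sketch with hypothetical names:

```python
LEVELS = ("high", "mid", "low")

# Table 12 counts: initial group -> counts predicted as (high, mid, low).
TABLE_12 = {
    "high": [28, 3, 8],
    "mid": [10, 7, 20],
    "low": [5, 7, 13],
}


def row_percentages(counts):
    """Percentage of each initial group falling into each predicted level."""
    return {group: [round(100.0 * cell / sum(row), 1) for cell in row]
            for group, row in counts.items()}


def predicted_share(counts, level):
    """Percentage of all participants predicted into the given level."""
    column = LEVELS.index(level)
    total = sum(sum(row) for row in counts.values())
    predicted = sum(row[column] for row in counts.values())
    return round(100.0 * predicted / total, 1)


if __name__ == "__main__":
    for group, percentages in row_percentages(TABLE_12).items():
        print(group, percentages)
    print("predicted mid share:", predicted_share(TABLE_12, "mid"))
```

Only 17 of the 101 native speakers (16.8%) land in the predicted mid category, which is the near-disappearance of the middle group noted in the interpretation.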

6 Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                        Predicted group membership for
                        Reading to Learn/Reading to
                        Integrate Composite               Initial
Reading ability group   High      Mid       Low           classification total

Count
High (≥135)             28        3         8             39
Mid (119–134)           10        7         20            37
Low (≤118)              5         7         13            25
Reclassification total  43        17        41            101

Percentage
High (≥135)             71.8      7.7       20.5          100.0
Mid (119–134)           27.0      18.9      54.1          100.0
Low (≤118)              20.0      28.0      52.0          100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension, and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading


Comprehension scores, some could perform well on the new measures, some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact


perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time


would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems. Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

| Problems | Causes | Effects | Solutions |
|---|---|---|---|
| Examples | Examples | Examples | Examples |

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points; 40 maximum)
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points; 55 maximum)
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys) and/or plants
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points; 30 maximum)
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points; 90 maximum)
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point; 5 maximum)
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell Corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Causes: Examples (1 point; 10 maximum)
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point; 6 maximum)
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point; 5 maximum)
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
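The point scheme of the Reading to Learn chart can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and dictionary names are hypothetical, the "1 word" values (6 or 3 points) are read here as reduced credit for single-word responses, and the alternative example credit ("5 points with overarching cause", "10 points with overarching solution") is left out. The per-category maxima are those of the rubric and sum to the stated total of 241.

```python
# Hypothetical sketch of the Appendix 1b point scheme (names are illustrative).
# Assumption: "1 word" entries receive the reduced credit shown in the rubric.
RUBRIC = {
    "problems":  {"full": 10, "one_word": 6, "cap": 40},
    "causes":    {"full": 5,  "one_word": 3, "cap": 55},
    "effects":   {"full": 5,  "one_word": 3, "cap": 30},
    "solutions": {"full": 10, "one_word": 6, "cap": 90},
}

# Examples earn 1 point each, up to the per-category maximum in the rubric.
EXAMPLE_CAPS = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def score_category(category: str, n_full: int, n_one_word: int = 0) -> int:
    """Points for one chart column, capped at the rubric maximum."""
    rule = RUBRIC[category]
    raw = n_full * rule["full"] + n_one_word * rule["one_word"]
    return min(raw, rule["cap"])


def score_examples(category: str, n_examples: int) -> int:
    """One point per correct example, capped at the rubric maximum."""
    return min(n_examples, EXAMPLE_CAPS[category])


def max_total() -> int:
    """Highest possible chart score under the caps above."""
    return sum(r["cap"] for r in RUBRIC.values()) + sum(EXAMPLE_CAPS.values())


if __name__ == "__main__":
    print(max_total())                      # total possible under the caps
    print(score_category("problems", 2, 1)) # two full answers, one 1-word answer
```

Capping each column separately, rather than capping the grand total, mirrors the way the rubric lists a maximum beside every category.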

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other), OR accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
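The Macrostructures band of the rubric is a mapping from the number of macrostructures identified across the two texts (0 to 8) to a band score. A minimal sketch of that mapping follows; the function name and the `attempted` flag are hypothetical, and the band boundaries are taken from the "Total" ranges listed in the rubric.

```python
def macrostructure_score(found_text1: int, found_text2: int,
                         attempted: bool = True) -> int:
    """Map macrostructures identified in each text (0-4) to the rubric band."""
    if not attempted:
        return 0  # "No Response" band
    for found in (found_text1, found_text2):
        if not 0 <= found <= 4:
            raise ValueError("each text has at most four macrostructures")
    total = found_text1 + found_text2
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)


if __name__ == "__main__":
    for pair in [(4, 4), (4, 2), (3, 1), (2, 0), (0, 0)]:
        print(pair, macrostructure_score(*pair))
```

Because only the combined total matters, combinations such as 4 and 2 or 3 and 3 fall in the same band, which is what the parenthetical alternatives in the rubric spell out.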

group status, medium of text presentation, and test order as possible contributing factors.⁴ Table 4 shows that there were no significant interactions for any of the group, medium, or test order combinations. The only significant main effect was group membership.

Because group membership was a combined measure that included both native language background and level of education, post hoc analysis was needed to identify the significant contrasts. Table 5 shows that there was a significant difference in performance on the Reading to Learn measure between the native speaker undergraduate and the nonnative speaker undergraduate groups, as well as a significant difference between the nonnative speaker undergraduate and nonnative speaker graduate groups. There was no significant difference in performance between the native speaker undergraduate and the nonnative speaker graduate groups. Therefore, the answer to Research Question 1 is that native language background and level of education did have a significant effect on performance on the Reading to Learn measure, but that medium of text presentation did not. Further, order of testing (whether participants took Reading to Learn or Reading to Integrate first) had no significant effect.

Table 4 Performance on Reading to Learn measure by groups, medium, and test order (n = 251) (univariate analysis of variance)

| Source | Type III sum of squares | df | Mean square | F |
|---|---|---|---|---|
| Group | 21869.29 | 2 | 10934.64 | 28.10* |
| Medium | 1215.86 | 1 | 1215.86 | 3.13 |
| Test order | 294.98 | 1 | 294.98 | .76 |
| Group × medium | 334.81 | 2 | 167.40 | .43 |
| Group × test order | 4.37 | 2 | 2.19 | .01 |
| Medium × test order | 391.73 | 1 | 391.73 | 1.01 |
| Group × medium × test order | 575.29 | 2 | 287.65 | .74 |
| Error | 92997.45 | 239 | 389.11 | |

Note: *p < .05

⁴ Test order was added as an additional variable to double check that our counterbalancing had been effective in controlling for any practice effect.

3 Research Question 2

The second research question, related to the first, asked if performance on Reading to Integrate was affected by medium of presentation, computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we calculated a univariate ANOVA on the Reading to Integrate measure with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
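The F values in the two ANOVA tables follow from the usual relations: a mean square is a sum of squares divided by its degrees of freedom, and F is the effect mean square divided by the error mean square. The short check below uses the Group and Error rows of Tables 4 and 6. The decimal placement of the table values is an assumption made here, since the decimal points are not visible in the flattened tables; it was chosen so that the sums of squares, degrees of freedom, and mean squares agree with one another. The function names are illustrative only.

```python
def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F = effect mean square / error mean square."""
    return ms_effect / ms_error


# Group and Error rows (decimal placement assumed, see lead-in).
READING_TO_LEARN = {"ms_group": 10934.64, "ms_error": 389.11}      # Table 4
READING_TO_INTEGRATE = {"ms_group": 18147.17, "ms_error": 325.13}  # Table 6

if __name__ == "__main__":
    for label, row in [("Reading to Learn", READING_TO_LEARN),
                       ("Reading to Integrate", READING_TO_INTEGRATE)]:
        f_value = f_ratio(row["ms_group"], row["ms_error"])
        print(f"{label}: F(group) = {f_value:.2f}")
```

The same two functions reproduce the remaining rows, for example the Medium row of Table 4 (1215.86 / 389.11).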

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

| Group | n | Group | n | Mean difference | Standard error |
|---|---|---|---|---|---|
| NSU | 105 | NNSU | 106 | 20.12* | 2.72 |
| | | NNSG | 40 | 7.17 | 3.67 |
| NNSU | 106 | NNSG | 40 | 12.95* | 3.66 |

Note: *p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244ᵃ) (univariate analysis of variance)

| Source | Type III sum of squares | df | Mean square | F |
|---|---|---|---|---|
| Group | 36294.33 | 2 | 18147.17 | 55.82ᵇ |
| Medium | 192.95 | 1 | 192.95 | .59 |
| Test order | 97.83 | 1 | 97.83 | .30 |
| Group × medium | 11.82 | 2 | 5.91 | .02 |
| Group × test order | 30.14 | 2 | 15.07 | .05 |
| Medium × test order | 1098.72 | 1 | 1098.72 | 3.38 |
| Group × medium × test order | 1488.58 | 2 | 744.29 | 2.29 |
| Error | 75430.37 | 232 | 325.13 | |

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244ᵃ)

| Group | n | Group | n | Mean difference | Standard error |
|---|---|---|---|---|---|
| NS | 101 | NNSU | 103 | 26.41ᵇ | 2.53 |
| | | NNSG | 40 | 10.05ᵇ | 3.37 |
| NNSU | 103 | NNSG | 40 | 16.36ᵇ | 3.36 |

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses. We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.

Table 8 Correlations for all reading measures for all participants (n = 251)

| | TOEFL Reading Comprehension | Basic Comprehension Test 1 | Basic Comprehension Test 2 | Reading to Learn | Reading to Integrateᵃ |
|---|---|---|---|---|---|
| Nelson–Denny | .90ᵇ | .85ᵇ | .84ᵇ | .66ᵇ | .69ᵇ |
| TOEFL Reading Comprehension | 1.00 | .85ᵇ | .84ᵇ | .64ᵇ | .69ᵇ |
| Basic Comprehension 1 | | 1.00 | .84ᵇ | .68ᵇ | .68ᵇ |
| Basic Comprehension 2 | | | 1.00 | .68ᵇ | .70ᵇ |
| Reading to Learn | | | | 1.00 | .59ᵇ |

Notes: ᵃ n size reduced for Reading to Integrate because of anomalous testing session; ᵇ p < .05

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

| Reading ability group | n | Mean on TOEFL reading comprehension | sd |
|---|---|---|---|
| High (≥56) | 57 | 59.68 | 3.03 |
| Mid (50–55) | 42 | 52.81 | 1.67 |
| Low (≤49) | 44 | 42.11 | 5.47 |
| Total | 143 | 52.26 | 8.22 |

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
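The reported eigenvalue and Wilks' Lambda can be related through a standard identity: Lambda is the product of 1 / (1 + eigenvalue) over the discriminant functions, and the canonical correlation of a function is the square root of eigenvalue / (1 + eigenvalue). The sketch below applies the identity to the first function only, on the assumption (made here, not stated in the text) that the second function contributes almost nothing; the eigenvalues of .92 and .31 are the values reported for the nonnative and native speaker analyses, and the function names are illustrative.

```python
from math import prod, sqrt


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over functions."""
    return prod(1.0 / (1.0 + ev) for ev in eigenvalues)


def canonical_correlation(eigenvalue: float) -> float:
    """Canonical correlation for one discriminant function."""
    return sqrt(eigenvalue / (1.0 + eigenvalue))


if __name__ == "__main__":
    # First-function eigenvalues as reported; second functions are ignored here.
    for label, ev in [("nonnative speakers", 0.92), ("native speakers", 0.31)]:
        print(f"{label}: Lambda ~ {wilks_lambda([ev]):.2f}, "
              f"canonical r ~ {canonical_correlation(ev):.2f}")
```

Under this assumption the identity returns values that round to the reported Lambdas of .52 and .76, which suggests the published figures are internally consistent.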

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
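The reclassification counts can be recomputed directly from the cross-classification tables (Tables 10 and 12). In the sketch below, rows are the initial basic comprehension groups and columns are the groups predicted from the composite, both ordered high, mid, low; the function name is hypothetical. Cells on the diagonal are participants who kept their category, cells left of the diagonal moved to a higher category, and cells right of it moved lower.

```python
def reclassification_summary(table):
    """Summarize a square cross-classification table.

    Rows are initial groups and columns are predicted groups,
    both ordered from high to low.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    same = sum(table[i][i] for i in range(size))
    higher = sum(table[i][j] for i in range(size) for j in range(i))
    lower = sum(table[i][j] for i in range(size) for j in range(i + 1, size))
    return {"n": total, "same": same, "higher": higher, "lower": lower}


# Counts from Table 10 (nonnative speakers) and Table 12 (native speakers).
NONNATIVE = [[37, 17, 3], [13, 18, 11], [2, 6, 36]]
NATIVE = [[28, 3, 8], [10, 7, 20], [5, 7, 13]]

if __name__ == "__main__":
    for label, table in [("nonnative", NONNATIVE), ("native", NATIVE)]:
        summary = reclassification_summary(table)
        moved = summary["higher"] + summary["lower"]
        print(label, summary, f"moved: {moved / summary['n']:.1%}")
```

For the nonnative speaker table the diagonal sums to 91 and the off-diagonal cells to 52 (21 higher, 31 lower), which matches the 36.4% reclassification rate given in the text.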

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Predicted group membership for Reading to Learn/Reading to Integrate Composite, by initial reading ability group:

| Reading ability group | High | Mid | Low | Initial classification total |
|---|---|---|---|---|
| Count | | | | |
| High (≥56) | 37 | 17 | 3 | 57 |
| Mid (50–55) | 13 | 18 | 11 | 42 |
| Low (≤49) | 2 | 6 | 36 | 44 |
| Reclassification total | 52 | 41 | 50 | 143 |
| Percentage | | | | |
| High (≥56) | 64.9 | 29.8 | 5.3 | 100.0 |
| Mid (50–55) | 31.0 | 42.9 | 26.2 | 100.0 |
| Low (≤49) | 4.5 | 13.6 | 81.8 | 100.0 |

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
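The native speaker cut scores can be illustrated with the sample mean and standard deviation reported in Table 11 (126.04 and 16.52, with decimal placement assumed here). Half a standard deviation either side of the mean gives roughly 117.8 and 134.3; rounding these bounds to whole scores, which is an assumption of this sketch rather than a step described in the text, yields the reported boundaries of 118 and 134. The function names are illustrative.

```python
def half_sd_cuts(mean: float, sd: float, k: float = 0.5):
    """Integer cut scores at mean - k*sd and mean + k*sd (rounded)."""
    return round(mean - k * sd), round(mean + k * sd)


def classify(score: int, low_cut: int, high_cut: int) -> str:
    """Low at or below low_cut, mid up to high_cut, high above it."""
    if score <= low_cut:
        return "low"
    if score <= high_cut:
        return "mid"
    return "high"


if __name__ == "__main__":
    # Nelson-Denny sample statistics as reported for the native speakers.
    low_cut, high_cut = half_sd_cuts(126.04, 16.52)
    print(low_cut, high_cut)
    for score in (118, 119, 134, 135):
        print(score, classify(score, low_cut, high_cut))
```

With these cuts a score of 118 falls in the low group, 119 through 134 in the mid group, and 135 or above in the high group, matching the classification rules stated above.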

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

| Reading ability group | n | Mean on Nelson–Denny | sd |
|---|---|---|---|
| High (≥135) | 39 | 141.26 | 5.19 |
| Mid (119–134) | 37 | 125.65 | 5.09 |
| Low (≤118) | 25 | 102.88 | 10.98 |
| Total | 101 | 126.04 | 16.52 |

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers seen in Table 12 show a different pattern than that observed for nonnativespeakers Nearly three-fourths (718) of the native speakersclassified as high in the basic comprehension reading ability groupremained high on the Reading to LearnReading to IntegrateComposite For the mid group however only 189 remainedclassified as mid Just over half (52) of the low reading abilitygroup members remained low on the Reading to LearnReading to Integrate Composite As with the nonnative speakers participantsin the mid category on basic comprehension showed the mostfrequent reclassification Forty-eight of the 101 (475) nativespeaker participants remained in the initial classification categoriesTwenty-two (218) were reclassified into a higher category and 31(307) were reclassified into a lower category Thus 53 participants(525) over half of the sample were classified differently based ontheir performance on the Reading to LearnReading to IntegrateComposite

⁶ Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                            Predicted group membership for
                            Reading to Learn/Reading to
                            Integrate Composite               Initial
Reading ability group       High      Mid       Low           classification total

Count
High (≥ 135)                28        3         8             39
Mid (119–134)               10        7         20            37
Low (≤ 118)                 5         7         13            25
Reclassification total      43        17        41            101

Percentage
High (≥ 135)                71.8      7.7       20.5          100.0
Mid (119–134)               27.0      18.9      54.1          100.0
Low (≤ 118)                 20.0      28.0      52.0          100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
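The percentages and reclassification totals quoted for Table 12 follow directly from the cell counts. As a minimal sketch (function and variable names are illustrative, not part of the study), the counts can be tabulated as follows, with rows as initial Nelson–Denny groups and columns as groups predicted from the Composite:

```python
# Counts from Table 12; rows and columns are both ordered High, Mid, Low.
LABELS = ["High", "Mid", "Low"]
COUNTS = [
    [28, 3, 8],    # initially High
    [10, 7, 20],   # initially Mid
    [5, 7, 13],    # initially Low
]


def row_percentages(counts):
    """Percentage of each initial group falling in each predicted group."""
    return [[round(100 * c / sum(row), 1) for c in row] for row in counts]


def reclassification_summary(counts):
    """Count participants who stayed, moved up, or moved down a category."""
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # A column index below the row index means a higher predicted category.
    higher = sum(counts[i][j] for i in range(size) for j in range(i))
    lower = total - same - higher
    return {"total": total, "same": same, "higher": higher, "lower": lower}


summary = reclassification_summary(COUNTS)
for label, row in zip(LABELS, row_percentages(COUNTS)):
    print(label, row)
print(summary)
```

The summary gives 48 unchanged, 22 moved higher, and 31 moved lower out of 101, matching the figures in the text.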

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as 'sophisticated discourse processes and critical thinking skills' (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analysis: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the 'two disciplines' problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word = 6 points), 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word = 3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word = 3 points), 30 maximum
• Trees and plants affected (injured)/growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word = 6 points), 90 maximum
• Stricter environmental laws (resolutions/acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Examples for Problems (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from controlled burns
• Emissions from large cities

Examples for Causes (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples for Solutions (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
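The point scheme in Appendices 1a and 1b can be expressed as a small scoring routine. The sketch below is illustrative only, not the scoring procedure used in the study. It assumes that a one-word response earns the reduced value shown in the rubric, that each category is capped at its stated maximum, and that the example maxima attach to the categories as read here. The "overarching" example rule is left out, and the `score_chart` name and its input encoding are hypothetical.

```python
# Point values and caps as read from Appendices 1a and 1b.
FULL_POINTS = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}
ONE_WORD_POINTS = {"problems": 6, "causes": 3, "effects": 3, "solutions": 6}
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def score_chart(responses, examples):
    """Score one completed chart.

    responses: category -> list with "full" or "one_word" for each correct
               answer (hypothetical encoding; incorrect answers are simply
               not listed, since there is no penalty for them).
    examples:  category -> number of correct examples (1 point each).
    """
    total = 0
    for category, cap in CATEGORY_MAX.items():
        points = sum(
            FULL_POINTS[category] if kind == "full" else ONE_WORD_POINTS[category]
            for kind in responses.get(category, [])
        )
        total += min(points, cap)
        total += min(examples.get(category, 0), EXAMPLE_MAX[category])
    return total


# A chart listing every rubric item in full, with all examples, reaches the
# stated ceiling of 241 points.
perfect = {"problems": ["full"] * 4, "causes": ["full"] * 11,
           "effects": ["full"] * 6, "solutions": ["full"] * 9}
print(score_chart(perfect, EXAMPLE_MAX))
```

The category and example maxima sum to 241, which agrees with the total stated in the rubric heading.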


Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
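Every combination listed in the Macrostructures bands reduces to the total number of macrostructures identified across the two texts (0 to 8). A minimal sketch of that mapping follows; the function name and arguments are illustrative, and only the band cut-offs come from the rubric.

```python
def macrostructure_score(found_text_1, found_text_2, attempted=True):
    """Map macrostructures identified in each text (0-4) to the rubric score."""
    if not attempted:
        return 0  # No Response
    for found in (found_text_1, found_text_2):
        if not 0 <= found <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")
    total = found_text_1 + found_text_2
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)


print(macrostructure_score(4, 2))  # 4 in one text and 2 in the other
```

Working from the total keeps the function symmetric in the two texts, which matches the way the rubric lists each "or" case.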


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.



computer familiarity, native language, or level of education. Again, to ensure that counterbalancing of tests controlled for any practice effect, test order was added as an additional variable.

To answer this question, we proceeded to calculate a univariate ANOVA on the Reading to Integrate measure, with group status, medium of text presentation, and test order entered as possible contributing factors. The results (Table 6) show, as for Research Question 1, that there were no significant interactions for any of the group, medium, or test order combinations; the only significant main effect was group membership. The answer for Research Question 2 is that native language background and educational level had a significant effect on Reading to Integrate, but medium of text presentation did not. Post hoc analysis of group contrasts showed that all three groups were distinct in their performance on Reading to Integrate (see Table 7).
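Each F value in Table 6 is the effect's mean square (sum of squares divided by its degrees of freedom) divided by the error mean square. The following sketch recomputes the ratios from the printed sums of squares as a consistency check. The dictionary layout and function name are illustrative, and the decimal placement of the table values is an assumption, since the extracted text shows them without decimal points.

```python
# Sums of squares and degrees of freedom as read from Table 6.
TABLE_6 = {
    "Group": (36294.33, 2),
    "Medium": (192.95, 1),
    "Test order": (97.83, 1),
    "Group x medium": (11.82, 2),
    "Group x test order": (30.14, 2),
    "Medium x test order": (1098.72, 1),
    "Group x medium x test order": (1488.58, 2),
}
ERROR_SS, ERROR_DF = 75430.37, 232

# F values as printed in Table 6, for comparison.
REPORTED_F = {
    "Group": 55.82,
    "Medium": 0.59,
    "Test order": 0.30,
    "Group x medium": 0.02,
    "Group x test order": 0.05,
    "Medium x test order": 3.38,
    "Group x medium x test order": 2.29,
}


def f_ratios(effects, error_ss, error_df):
    """Return {effect: F}, where F = (SS / df) / (error SS / error df)."""
    ms_error = error_ss / error_df
    return {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}


ratios = f_ratios(TABLE_6, ERROR_SS, ERROR_DF)
for name, f_value in ratios.items():
    print(f"{name:30s} F = {f_value:6.2f}  (reported {REPORTED_F[name]:.2f})")
```

The recomputed ratios agree with the printed F values to within rounding, and the degrees of freedom sum to 243, that is, n − 1 for the 244 participants in this analysis.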

4 Research Question 3

The third research question asked to what extent measures of basic comprehension, Reading to Learn, and Reading to Integrate were related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses.

We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.

Table 5 Post hoc Scheffé for Reading to Learn measure (n = 251)

Group   n     Group   n     Mean difference   Standard error
NSU     105   NNSU    106   20.12             2.72
              NNSG    40     7.17             3.67
NNSU    106   NNSG    40    12.95             3.66

Note: p < .05

Table 6 Performance on Reading to Integrate measure by groups, medium, and test order (n = 244a) (univariate analysis of variance)

Source                          Type III sum of squares   df    Mean square   F
Group                           36294.33                  2     18147.17      55.82b
Medium                            192.95                  1       192.95        .59
Test order                         97.83                  1        97.83        .30
Group × medium                     11.82                  2         5.91        .02
Group × test order                 30.14                  2        15.07        .05
Medium × test order              1098.72                  1      1098.72       3.38
Group × medium × test order      1488.58                  2       744.29       2.29
Error                           75430.37                  232     325.13

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05
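As a quick consistency check on Table 6, each F ratio should equal the effect's mean square divided by the error mean square. The sketch below recomputes them, taking the sums of squares as 36294.33 for Group, 75430.37 for Error, and so on (values as read from Table 6 with two decimal places, which is an assumption about decimal placement). The names `EFFECTS`, `ERROR`, and `f_ratios` are illustrative only and do not come from the study.

```python
# Sums of squares and degrees of freedom as read from Table 6
# (decimal placement is an assumption of this sketch).
EFFECTS = {
    "Group": (36294.33, 2),
    "Medium": (192.95, 1),
    "Test order": (97.83, 1),
    "Group x medium": (11.82, 2),
    "Group x test order": (30.14, 2),
    "Medium x test order": (1098.72, 1),
    "Group x medium x test order": (1488.58, 2),
}
ERROR = (75430.37, 232)


def f_ratios(effects, error):
    """Return {source: F}, where F = (SS_effect / df_effect) / (SS_error / df_error)."""
    error_ss, error_df = error
    error_ms = error_ss / error_df
    return {name: (ss / df) / error_ms for name, (ss, df) in effects.items()}


if __name__ == "__main__":
    for source, f_value in f_ratios(EFFECTS, ERROR).items():
        print(f"{source:30s} F = {f_value:.2f}")
```

Under these assumptions the recomputed ratios agree with the tabled F values to within rounding, with Group the only large effect.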

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244a)

Group   n     Group   n     Mean difference   Standard error
NS      101   NNSU    103   26.41b            2.53
              NNSG    40    10.05b            3.37
NNSU    103   NNSG    40    16.36b            3.36

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

Table 8 Correlations for all reading measures for all participants (n = 251)

                              TOEFL Reading   Basic Comprehension   Basic Comprehension   Reading    Reading to
                              Comprehension   Test 1                Test 2                to Learn   Integrate a
Nelson–Denny                  .90b            .85b                  .84b                  .66b       .69b
TOEFL Reading Comprehension   1.00            .85b                  .84b                  .64b       .69b
Basic Comprehension 1                         1.00                  .84b                  .68b       .68b
Basic Comprehension 2                                               1.00                  .68b       .70b
Reading to Learn                                                                          1.00       .59b

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels, high, middle (mid), and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
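The three-way split described above can be written as a small lookup on the scaled TOEFL Reading Comprehension score. This is only a sketch of the stated cut points (56 and above, 50 to 55, 49 and below); the function name `toefl_reading_band` is hypothetical.

```python
def toefl_reading_band(scaled_score: int) -> str:
    """Map a scaled TOEFL Reading Comprehension score to a reading ability group.

    Cut points follow the text: high >= 56, mid 50-55, low <= 49.
    """
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


if __name__ == "__main__":
    for score in (60, 55, 50, 49, 42):
        print(score, toefl_reading_band(score))
```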

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL reading comprehension   sd
High (≥56)              57    59.68                                 3.03
Mid (50–55)             42    52.81                                 1.67
Low (≤49)               44    42.11                                 5.47
Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
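The reported eigenvalue, Wilks' Lambda, and Chi-Square are tied together by standard formulas, so they can be cross-checked. The sketch below assumes two predictors (Reading to Learn and Reading to Integrate), three groups, and n = 143. It uses the textbook relations Λ = ∏ 1/(1 + λᵢ) and Bartlett's approximation χ² = −[N − 1 − (p + g)/2] ln Λ. The second eigenvalue is inferred from the 99.9% share and is not a figure given in the article, and all function names are illustrative.

```python
import math


def remaining_eigenvalue(first: float, share_of_first: float) -> float:
    """Infer a second eigenvalue from the first one and its share of total variance."""
    return first * (1.0 - share_of_first) / share_of_first


def wilks_lambda(eigenvalues) -> float:
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    result = 1.0
    for value in eigenvalues:
        result /= 1.0 + value
    return result


def bartlett_chi_square(lam: float, n_cases: int, n_predictors: int, n_groups: int) -> float:
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


if __name__ == "__main__":
    first = 0.92                                   # reported eigenvalue
    second = remaining_eigenvalue(first, 0.999)    # inferred, not reported
    lam = wilks_lambda([first, second])
    chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
    print(f"Lambda ~ {lam:.2f}, chi-square ~ {chi2:.1f}")
```

Because the inputs are rounded to two decimals, the recomputed Chi-Square lands near, but not exactly on, the reported 90.93.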

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (63.6%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Predicted group membership for Reading to Learn/Reading to Integrate Composite

Reading ability group     High   Mid    Low    Initial classification total
Count
  High (≥56)              37     17     3      57
  Mid (50–55)             13     18     11     42
  Low (≤49)               2      6      36     44
  Reclassification total  52     41     50     143
Percentage
  High (≥56)              64.9   29.8   5.3    100.0
  Mid (50–55)             31.0   42.9   26.2   100.0
  Low (≤49)               4.5    13.6   81.8   100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension, high, mid, and low, based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
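The half-standard-deviation rule can be illustrated with the sample statistics reported in Table 11 (mean 126.04, sd 16.52). This is a sketch only: those statistics describe the n = 101 discriminant-analysis sample, whereas the reported cut scores of 118 and 135 were set on the full n = 105 sample, so the match is expected to be close rather than exact. The function names are hypothetical.

```python
def half_sd_cutoffs(mean: float, sd: float, width: float = 0.5):
    """Return (lower, upper) boundaries at mean -/+ width * sd."""
    return mean - width * sd, mean + width * sd


def nelson_denny_band(score: int) -> str:
    """Apply the cut scores reported in the text: high >= 135, mid 119-134, low <= 118."""
    if score >= 135:
        return "high"
    if score >= 119:
        return "mid"
    return "low"


if __name__ == "__main__":
    lower, upper = half_sd_cutoffs(126.04, 16.52)
    print(f"boundaries: {lower:.2f} and {upper:.2f}")
    print(nelson_denny_band(141), nelson_denny_band(126), nelson_denny_band(103))
```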

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd
High (≥135)             39    141.26                 5.19
Mid (119–134)           37    125.65                 5.09
Low (≤118)              25    102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Predicted group membership for Reading to Learn/Reading to Integrate Composite

Reading ability group     High   Mid    Low    Initial classification total
Count
  High (≥135)             28     3      8      39
  Mid (119–134)           10     7      20     37
  Low (≤118)              5      7      13     25
  Reclassification total  43     17     41     101
Percentage
  High (≥135)             71.8   7.7    20.5   100.0
  Mid (119–134)           27.0   18.9   54.1   100.0
  Low (≤118)              20.0   28.0   52.0   100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session
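The reclassification figures discussed for both groups can be tallied directly from the count rows of Tables 10 and 12. In the sketch below, rows are initial groups and columns are predicted groups, both ordered high, mid, low, so a cell to the left of the diagonal means a move to a higher category. The names `TABLE_10`, `TABLE_12`, and `reclassification_summary` are illustrative.

```python
# Count rows of Tables 10 and 12: rows = initial group, columns = predicted group,
# both ordered high, mid, low.
TABLE_10 = [[37, 17, 3], [13, 18, 11], [2, 6, 36]]   # nonnative speakers
TABLE_12 = [[28, 3, 8], [10, 7, 20], [5, 7, 13]]     # native speakers


def reclassification_summary(counts):
    """Tally participants who stayed, moved to a higher, or moved to a lower category."""
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # Predicted column index smaller than the row index means a higher category.
    higher = sum(counts[i][j] for i in range(size) for j in range(i))
    lower = total - same - higher
    return {"total": total, "same": same, "higher": higher, "lower": lower}


if __name__ == "__main__":
    for label, table in (("Nonnative", TABLE_10), ("Native", TABLE_12)):
        summary = reclassification_summary(table)
        moved = summary["higher"] + summary["lower"]
        print(label, summary, f"moved: {100 * moved / summary['total']:.1f}%")
```

For Table 10 the diagonal cells sum to 91 and the off-diagonal cells to 52 (21 higher, 31 lower); for Table 12 the corresponding figures are 48 and 53 (22 higher, 31 lower).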

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension, once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures, some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency.

For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analysis: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems      Causes        Effects       Solutions

Examples      Examples      Examples      Examples
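The point scheme in the directions above can be illustrated with a small tally. This is only a sketch under stated assumptions: the function name `chart_score` and the example response counts are hypothetical, and the sketch covers only the per-item point values given in the directions, not any further scoring rules the raters may have applied.

```python
# Point values per correct response, as stated in the Appendix 1a directions.
POINTS = {"problems": 10, "solutions": 10, "causes": 5, "effects": 5, "examples": 1}


def chart_score(correct_counts):
    """Sum points for correct responses only; incorrect responses carry no penalty.

    correct_counts maps a category name to the number of correct responses
    a rater credited in that category (hypothetical input format).
    """
    unknown = set(correct_counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Hypothetical response: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
example = {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
print(chart_score(example))  # 2*10 + 3*5 + 1*5 + 2*10 + 4*1 = 64
```

Because only correct responses are counted, an empty chart simply scores zero.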


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points), 1 word (6 points), 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points), 1 word (3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points), 1 word (3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points), 1 word (6 points), 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.



Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
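The Macrostructures bands in the rubric above depend only on how many of the four macrostructures were identified in each of the two texts, so the mapping can be sketched as a lookup on the combined total. The function name and the `attempted` flag below are hypothetical conveniences, not part of the published rubric; the sketch follows the band table (a response with no macrostructures identified falls in the Very Poor band) and ignores the other two rubric components.

```python
def macrostructure_score(text1_count, text2_count, attempted=True):
    """Map macrostructures identified in two texts (0-4 each) to the rubric band score."""
    if not attempted:
        return 0  # No Response
    for count in (text1_count, text2_count):
        if not 0 <= count <= 4:
            raise ValueError("each text has four macrostructures, so counts must be 0-4")
    total = text1_count + text2_count
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6 or 7)
    if total >= 4:
        return 15  # Fair (total 4 or 5)
    if total >= 2:
        return 10  # Poor (total 2 or 3)
    return 5       # Very Poor (total 0 or 1)


# Hypothetical participant: 3 macrostructures in one text, 2 in the other.
print(macrostructure_score(3, 2))  # 15
```

Every combination the rubric lists in parentheses (for example, 4 in one text and 0 in the other) reduces to the same combined total, which is why a single sum is sufficient here.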



related. We used correlational analysis as the first step in answering this question. Results for the total participant population (see Table 8) showed moderate to high correlations across all reading measures. However, the analyses done for Research Questions 1 and 2 revealed that group status had a significant effect on performance on Reading to Learn and Reading to Integrate. Further, we realize that correlations are sensitive to variance, so the high correlations seen in the total population could have been an artifact of combining the three groups. Therefore, we examined the correlations among all reading measures for each group (available in Trites, 2000: Appendix 1, pp. 230–33). While the reading measures were still correlated, often moderately, sometimes highly, magnitudes differed and sometimes dropped substantially. The text-specific multiple-choice measures, BC1 and BC2, consistently correlated more highly with the Nelson–Denny and TOEFL Reading Comprehension tests than with Reading to Learn and Reading to Integrate based on the same texts, suggesting a test method or construct effect. Because comparisons between different measures of basic comprehension were not a goal of the project, BC1 and BC2 were not used in further analyses.

We conclude that, as expected, all reading measures were related, but the lower correlations between Reading to Learn and Reading to Integrate and the traditional basic comprehension measures led us to consider further types of analysis to identify the possible distinctiveness of the new measures.
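The concern that pooled correlations can be an artifact of combining groups can be illustrated with a toy computation. The data below are synthetic and chosen only to make the effect visible; they are not the study's scores, and the three "groups" are hypothetical. Within each group the two measures are uncorrelated, yet the pooled correlation is very high because the groups differ in their means.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Synthetic groups: the same uncorrelated pattern shifted to three different centers.
offsets_x = [-1, -1, 1, 1]
offsets_y = [-1, 1, -1, 1]
groups = []
for center in (0, 10, 20):
    xs = [center + d for d in offsets_x]
    ys = [center + d for d in offsets_y]
    groups.append((xs, ys))

within_rs = [pearson_r(xs, ys) for xs, ys in groups]
pooled_x = [x for xs, _ in groups for x in xs]
pooled_y = [y for _, ys in groups for y in ys]
pooled_r = pearson_r(pooled_x, pooled_y)

print(within_rs)           # each within-group correlation is 0.0
print(round(pooled_r, 3))  # 0.985
```

This is the motivation for inspecting correlations separately by group before interpreting the magnitudes in a combined sample.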

5 Discriminant analysis

Because we were interested in determining how constructs differed, we sought additional analyses to help us better characterize the new constructs. Of the several possible statistical methods that could have been employed, two are most plausible: multivariate analysis of variance, usually associated with experimental research, and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

Table 7 Post hoc Scheffé for Reading to Integrate measure (n = 244a)

Group   n     Group   n     Mean difference   Standard error
NS      101   NNSU    103   26.41b            2.53
              NNSG    40    10.05b            3.37
NNSU    103   NNSG    40    16.36b            3.36

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05.

Table 8 Correlations for all reading measures for all participants (n = 251)

                               TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading to   Reading to
                               Comprehension    Test 1                Test 2                Learn        Integratea
Nelson–Denny                   .90b             .85b                  .84b                  .66b         .69b
TOEFL Reading comprehension    1.00             .85b                  .84b                  .64b         .69b
Basic comprehension 1                           1.00                  .84b                  .68b         .68b
Basic comprehension 2                                                 1.00                  .68b         .70b
Reading to Learn                                                                            1.00         .59b

Notes: a n size reduced for Reading to Integrate because of anomalous testing session; b p < .05.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels, high, middle (mid), and low reading ability scorers, based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box's M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
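The outlier screen described above, Mahalanobis distance evaluated against a Chi-Square distribution, can be sketched for the two-predictor case. Several things here are assumptions rather than details reported in the text: the .001 criterion, the use of exactly two predictors (so 2 degrees of freedom), the function names, and the scores, which are hypothetical.

```python
from math import log


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each (x, y) case from the group centroid."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular; distances are undefined")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_critical_df2(alpha):
    """Chi-square critical value for 2 degrees of freedom, where P(X > x) = exp(-x / 2)."""
    return -2 * log(alpha)


# Hypothetical (Reading to Learn, Reading to Integrate) scores for one group.
scores = [(120, 45), (135, 50), (98, 30), (150, 62),
          (110, 41), (142, 48), (88, 35), (127, 55)]
cutoff = chi2_critical_df2(0.001)  # assumed criterion; about 13.82
flagged = [i for i, d in enumerate(mahalanobis_sq(scores)) if d > cutoff]
print(round(cutoff, 2), flagged)
```

The closed-form critical value only holds for 2 degrees of freedom; with more predictors a chi-square table or statistical package would be needed.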

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
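The three-way split on the scaled TOEFL Reading Comprehension score can be written as a simple threshold rule. The function name is hypothetical; the cut points (56 and above, 50 to 55, 49 and below) are those stated in the paragraph above.

```python
def toefl_reading_group(scaled_score):
    """Classify a scaled TOEFL Reading Comprehension score as high, mid, or low."""
    if scaled_score >= 56:
        return "high"  # 56 and above (550 or more on the total-score scale)
    if scaled_score >= 50:
        return "mid"   # 50 to 55 (500 to 549)
    return "low"       # 49 and below (under 500)


# Hypothetical scaled scores for three participants.
print([toefl_reading_group(s) for s in (58, 52, 47)])  # ['high', 'mid', 'low']
```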

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group   n     Mean on TOEFL reading comprehension   sd
High (≥56)              57    59.68                                 3.03
Mid (50–55)             42    52.81                                 1.67
Low (≤49)               44    42.11                                 5.47
Total                   143   52.26                                 8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
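The reported eigenvalue, Wilks' Lambda, and Chi-Square value are linked by standard formulas, which the sketch below reproduces approximately. It assumes Lambda is the product of 1/(1 + eigenvalue) over the discriminant functions and that the Chi-Square statistic is Bartlett's approximation with two predictors and three groups; the text does not state which formula the software applied, so small differences from the published 90.93 are expected from rounding of the reported eigenvalue. The function names are hypothetical.

```python
from math import log


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2) * log(lam)


# Reported eigenvalue for the nonnative speaker analysis (n = 143).
lam = wilks_lambda([0.92])
chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
print(round(lam, 2))   # 0.52, matching the reported Lambda
print(round(chi2, 1))  # 91.0, close to the reported 90.93
```

The degrees of freedom for this test would be the number of predictors times one less than the number of groups, here 2 × 2 = 4.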

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
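The reclassification tallies can be derived directly from the counts in Table 10 by treating the table as a 3 × 3 cross-classification of initial group by predicted group. The sketch below is one way to do that arithmetic; the function name and dictionary keys are hypothetical.

```python
# Counts from Table 10: rows are initial groups, columns are predicted groups,
# both ordered high, mid, low.
TABLE_10 = [
    [37, 17, 3],   # initial high
    [13, 18, 11],  # initial mid
    [2, 6, 36],    # initial low
]


def reclassification_summary(matrix):
    """Count cases that stayed in, moved above, or moved below their initial group."""
    same = higher = lower = 0
    for initial, row in enumerate(matrix):
        for predicted, count in enumerate(row):
            if predicted == initial:
                same += count
            elif predicted < initial:
                higher += count  # a smaller index is a higher ability level
            else:
                lower += count
    total = same + higher + lower
    return {
        "same": same,
        "higher": higher,
        "lower": lower,
        "total": total,
        "pct_same": round(100 * same / total, 1),
        "pct_moved": round(100 * (higher + lower) / total, 1),
    }


print(reclassification_summary(TABLE_10))
```

For the Table 10 counts this gives 91 unchanged, 21 moved higher, and 31 moved lower out of 143, so 52 cases (36.4%) changed category.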

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥56)               37       17       3        57
Mid (50–55)              13       18       11       42
Low (≤49)                2        6        36       44
Reclassification total   52       41       50       143

Percentage
High (≥56)               64.9     29.8     5.3      100.0
Mid (50–55)              31.0     42.9     26.2     100.0
Low (≤49)                4.5      13.6     81.8     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension, high, mid, and low, based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
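The half-standard-deviation rule for the native speaker split can be checked with a short calculation. The mean (126.04) and standard deviation (16.52) used below are the totals reported in Table 11 for the 101 cases retained in the discriminant analysis; treating them as the basis for the cut points is an assumption, since the split may have been computed on all 105 participants. The function names are hypothetical.

```python
def half_sd_cut_points(mean, sd, width=0.5):
    """Return the lower and upper cut points at mean -/+ width standard deviations."""
    return mean - width * sd, mean + width * sd


def nelson_denny_group(score):
    """Classify a Nelson-Denny score using the bands reported in the text."""
    if score >= 135:
        return "high"
    if score >= 119:
        return "mid"
    return "low"


low_cut, high_cut = half_sd_cut_points(126.04, 16.52)
# Approximately 117.78 and 134.30, consistent with the reported
# bands of 118 and below, 119 to 134, and 135 and above.
print(round(low_cut, 2), round(high_cut, 2))
print([nelson_denny_group(s) for s in (141, 126, 103)])  # ['high', 'mid', 'low']
```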

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group   n     Mean on Nelson–Denny   sd
High (≥135)             39    141.26                 5.19
Mid (119–134)           37    125.65                 5.09
Low (≤118)              25    102.88                 10.98
Total                   101   126.04                 16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥135)              28       3        8        39
Mid (119–134)            10       7        20       37
Low (≤118)               5        7        13       25
Reclassification total   43       17       41       101

Percentage
High (≥135)              71.8     7.7      20.5     100.0
Mid (119–134)            27.0     18.9     54.1     100.0
Low (≤118)               20.0     28.0     52.0     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures, some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as 'sophisticated discourse processes and critical thinking skills' (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three: TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997: August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997: November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997: December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
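The point scheme in the directions above can be sketched as a small tally. This is only an illustration of the arithmetic: the dictionary layout, the function name `chart_score`, and the sample counts are hypothetical, and the sketch ignores the one-word partial credit and per-category maximums given in the scoring rubric.

```python
# Point values come from the task directions; everything else is illustrative.
POINTS = {"problems": 10, "solutions": 10, "causes": 5, "effects": 5, "examples": 1}


def chart_score(correct_counts):
    """Sum points for correct responses only; incorrect responses carry no penalty."""
    return sum(POINTS[category] * count for category, count in correct_counts.items())


# Hypothetical participant: counts of correct responses per category.
sample = {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
print(chart_score(sample))
```

Because only correct responses are counted, a participant who adds wrong entries to the chart receives the same score as one who leaves those cells blank.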

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points; 40 maximum)
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points; 55 maximum)
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points; 30 maximum)
• Trees and plants affected (injured); growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water; water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points; 90 maximum)
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Examples for Problems (1 point; 5 maximum)
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Examples for Causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from controlled burns
• Emissions from large cities

Examples for Causes (1 point; 10 maximum)
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples for Effects (1 point; 6 maximum)
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples for Solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples for Solutions (1 point; 5 maximum)
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other) OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
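Every combination listed in the Macrostructures bands reduces to the total number of macrostructures identified across the two texts, so the band lookup can be sketched as below. The function name and the decision to return only the band label (not a numeric score) are assumptions made for illustration; the No Response case is left to the scorer and is not modeled.

```python
def macrostructure_band(identified_text1, identified_text2):
    """Map macrostructures identified per text (0-4 each) to a rubric band label."""
    for count in (identified_text1, identified_text2):
        if not 0 <= count <= 4:
            raise ValueError("each text has four macrostructures, so counts run 0-4")
    total = identified_text1 + identified_text2
    if total == 8:
        return "Excellent"
    if total >= 6:
        return "Good"
    if total >= 4:
        return "Fair"
    if total >= 2:
        return "Poor"
    return "Very Poor"


# Hypothetical response: all four macrostructures in one text, two in the other.
print(macrostructure_band(4, 2))
```

Working from the total rather than from each listed pairing keeps the lookup short and treats equivalent pairings such as 3 and 3 or 4 and 2 identically, as the rubric does.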

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


Table 8 Correlations for all reading measures for all participants (n = 251)

                                TOEFL Reading    Basic Comprehension   Basic Comprehension   Reading     Reading to
                                Comprehension    Test 1                Test 2                to Learn    Integrate (a)
Nelson–Denny                    .90 (b)          .85 (b)               .84 (b)               .66 (b)     .69 (b)
TOEFL Reading Comprehension     1.00             .85 (b)               .84 (b)               .64 (b)     .69 (b)
Basic Comprehension 1                            1.00                  .84 (b)               .68 (b)     .68 (b)
Basic Comprehension 2                                                  1.00                  .68 (b)     .70 (b)
Reading to Learn                                                                             1.00        .59 (b)

Notes: (a) n size reduced for Reading to Integrate because of anomalous testing session; (b) p < .05.

and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20 cases; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.
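The outlier screen described above can be sketched for the two-predictor case (Reading to Learn and Reading to Integrate scores within one group). This is a minimal illustration, not the procedure the authors ran in SPSS: the score pairs are invented, and the alpha level of .001 is an assumption because the text does not state the criterion used. With two predictors the squared distance is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-c/2), so the cutoff has a closed form.

```python
import math


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each (x, y) pair from the group centroid."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # Quadratic form using the inverse of the 2x2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi_square_cutoff_df2(alpha=0.001):
    """Critical value for chi-square with 2 df, where P(X > c) = exp(-c / 2)."""
    return -2.0 * math.log(alpha)


# Hypothetical (Reading to Learn, Reading to Integrate) score pairs for one group.
scores = [(12, 30), (15, 34), (9, 25), (14, 29), (11, 33), (16, 38), (10, 27), (13, 31)]
cutoff = chi_square_cutoff_df2()
outliers = [pair for pair, d in zip(scores, mahalanobis_sq(scores)) if d > cutoff]
print(outliers)
```

A case is flagged only when its distance exceeds the cutoff; finding no such cases in any group corresponds to the result reported in the text.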

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
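The three-way split on the scaled TOEFL Reading Comprehension score can be written as a short lookup. The function name is hypothetical; the cut points (56 and above, 50 to 55, 49 and below) are the ones given in the text.

```python
def toefl_reading_group(scaled_score):
    """Return 'high', 'mid', or 'low' from a TOEFL Reading Comprehension scaled score."""
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


# Hypothetical scaled scores, one near each boundary.
for score in (49, 50, 55, 56):
    print(score, toefl_reading_group(score))
```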

Table 9 Descriptive statistics for nonnative speakers by TOEFL Reading Comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL Reading Comprehension    sd
High (≥56)               57     59.68                                  3.03
Mid (50–55)              42     52.81                                  1.67
Low (≤49)                44     42.11                                  5.47
Total                    143    52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.5 The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
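The eigenvalue, Wilks' Lambda, and Chi-Square figures reported for the nonnative speaker analysis are linked by standard formulas, which can be checked with a short calculation. The sketch assumes a single retained function, two predictors, and three groups, and it uses Bartlett's chi-square approximation, which is the usual form but is not named in the text. Because it works from the rounded eigenvalue, it is expected to land near the published values rather than match them exactly.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


def bartlett_chi_square(wilks, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(wilks)


lam = wilks_lambda([0.92])                    # rounded eigenvalue from the text
chi2 = bartlett_chi_square(lam, 143, 2, 3)    # N = 143, 2 predictors, 3 groups
print(round(lam, 2), round(chi2, 1))
```

The degrees of freedom for this test are the number of predictors times one less than the number of groups, here 2 × 2 = 4.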

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
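The reclassification counts follow directly from the cells of Table 10, so they can be tallied from the classification matrix. The sketch below uses the cell counts from that table; the function and variable names are illustrative. Tallying the cells this way is a useful cross-check on the totals quoted in the prose.

```python
# Rows: initial TOEFL reading ability group (high, mid, low).
# Columns: predicted group on the composite (high, mid, low). Counts from Table 10.
TABLE_10 = [
    [37, 17, 3],
    [13, 18, 11],
    [2, 6, 36],
]


def reclassification_summary(table):
    """Count cases that stayed in, moved above, or moved below their initial group."""
    size = len(table)
    total = sum(sum(row) for row in table)
    stayed = sum(table[i][i] for i in range(size))
    # Index 0 is "high", so a column index below the row index means a higher category.
    moved_up = sum(table[i][j] for i in range(size) for j in range(i))
    moved_down = sum(table[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "stayed": stayed, "moved_up": moved_up, "moved_down": moved_down}


summary = reclassification_summary(TABLE_10)
for key, value in summary.items():
    print(key, value, f"{100 * value / summary['total']:.1f}%")
```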

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Reading ability group    Predicted group membership for Reading to       Initial
                         Learn/Reading to Integrate Composite            classification total
                         High       Mid        Low
Count
  High (≥ 56)            37         17         3                         57
  Mid (50–55)            13         18         11                        42
  Low (≤ 49)             2          6          36                        44
  Reclassification total 52         41         50                        143
Percentage
  High (≥ 56)            64.9       29.8       5.3                       100.0
  Mid (50–55)            31.0       42.9       26.2                      100.0
  Low (≤ 49)             4.5        13.6       81.8                      100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135, as mid if their scores ranged from 119 to 134, and as low if their scores fell at or below 118. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
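The three-way split described above can be sketched as a small function. This is a minimal illustration, not the authors' procedure as implemented: the function name is hypothetical, and the mean and standard deviation used here are the Table 11 values for the 101 cases retained in the discriminant analysis, standing in for the unreported statistics of the full sample of 105. The resulting boundaries (about 117.8 and 134.3) therefore only approximate the published cut scores of 118 and 135.

```python
def classify_by_sd(score, mean, sd, k=0.5):
    """Assign high/mid/low by distance of k standard deviations from the mean."""
    upper = mean + k * sd
    lower = mean - k * sd
    if score > upper:
        return "high"
    if score < lower:
        return "low"
    return "mid"


# ASSUMPTION: Table 11 totals (n = 101) used as stand-ins for the full-sample values.
SAMPLE_MEAN = 126.04
SAMPLE_SD = 16.52

for nelson_denny_score in (140, 125, 100):
    group = classify_by_sd(nelson_denny_score, SAMPLE_MEAN, SAMPLE_SD)
    print(nelson_denny_score, group)
```

A relative cut of this kind depends entirely on the sample at hand, which is why the text stresses that the split was based on the sample data rather than on an external placement standard.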

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating that the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd
High (≥ 135)             39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤ 118)              25     102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group    Predicted group membership for Reading to       Initial
                         Learn/Reading to Integrate Composite            classification total
                         High       Mid        Low
Count
  High (≥ 135)           28         3          8                         39
  Mid (119–134)          10         7          20                        37
  Low (≤ 118)            5          7          13                        25
  Reclassification total 43         17         41                        101
Percentage
  High (≥ 135)           71.8       7.7        20.5                      100.0
  Mid (119–134)          27.0       18.9       54.1                      100.0
  Low (≤ 118)            20.0       28.0       52.0                      100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
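The reclassification tallies for both groups can be recomputed directly from the cell counts in Tables 10 and 12. The sketch below is only an illustration (the function name and data layout are ours, not part of the original analysis): rows are the initial basic comprehension groups and columns the predicted groups, both ordered High, Mid, Low.

```python
def reclassification_summary(counts):
    """Tally cases that stayed, moved up, or moved down between classifications.

    counts[i][j] is the number of cases initially in group i and predicted
    in group j, with groups ordered High, Mid, Low.
    """
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # Because groups run from High to Low, j < i means a move to a higher category.
    higher = sum(counts[i][j] for i in range(size) for j in range(i))
    lower = total - same - higher
    return {"total": total, "same": same, "higher": higher, "lower": lower}


table_10 = [[37, 17, 3], [13, 18, 11], [2, 6, 36]]  # nonnative speakers
table_12 = [[28, 3, 8], [10, 7, 20], [5, 7, 13]]    # native speakers

for label, table in (("Table 10", table_10), ("Table 12", table_12)):
    summary = reclassification_summary(table)
    moved = summary["higher"] + summary["lower"]
    print(label, summary, f"reclassified: {100 * moved / summary['total']:.1f}%")
```

For Table 10 the diagonal cells sum to 91 and the off-diagonal cells to 52 (21 higher, 31 lower); for Table 12 the corresponding figures are 48 and 53 (22 higher, 31 lower).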

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). By contrast, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks that was sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may, then, overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as 'sophisticated discourse processes and critical thinking skills' (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks, such as the Reading to Learn and Reading to Integrate measures, into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader's ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the 'two disciplines' problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples
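The point scheme in the directions can be sketched as a small scoring function. This is an illustration under stated assumptions, not the authors' scoring procedure: the per-category item limits and example caps below are inferred from the maxima listed in the Appendix 1b rubric (40, 55, 30, and 90 points; 5, 10, 6, and 5 example points), and the sketch ignores the rubric's reduced credit for one-word responses and the higher credit for examples that restate an overarching cause or solution. All names are hypothetical.

```python
# Points per correct response, as stated in the task directions.
POINTS = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}

# ASSUMPTION: creditable items per category, inferred from the rubric maxima
# (40 / 10, 55 / 5, 30 / 5, 90 / 10).
MAX_ITEMS = {"problems": 4, "causes": 11, "effects": 6, "solutions": 9}

# ASSUMPTION: one-point example caps per category as read from the rubric.
MAX_EXAMPLES = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def chart_score(items, examples):
    """Score a completed chart from counts of correct items and examples."""
    score = 0
    for category, points in POINTS.items():
        score += points * min(items.get(category, 0), MAX_ITEMS[category])
        score += min(examples.get(category, 0), MAX_EXAMPLES[category])
    return score


# A response with one problem, two causes, and three cause examples.
print(chart_score({"problems": 1, "causes": 2}, {"causes": 3}))

# The ceiling under these assumptions matches the rubric's stated 0-241 range.
print(chart_score(MAX_ITEMS, MAX_EXAMPLES))
```

Weighting problems and solutions twice as heavily as causes and effects rewards identification of the text's top-level frame, which is consistent with the task's emphasis on categorizing information rather than recalling details.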


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points): 40 maximum
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points): 90 maximum
• Stricter environmental laws (resolutions/acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point): 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Causes: Examples (1 point): 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point): 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point): 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Lined space for the participant's response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

Latricia Trites and Mary McGroarty 209

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other), OR accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
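The macrostructure bands above amount to a lookup on the combined number of macrostructures identified across the two texts. The following is a minimal sketch in Python, not the authors' scoring procedure; the function name `macrostructure_band` and the `attempted` flag are hypothetical, introduced only for illustration.

```python
def macrostructure_band(found_text1: int, found_text2: int, attempted: bool = True) -> int:
    """Map macrostructures identified in each text (0-4 each) to a rubric band score.

    Hypothetical helper: the band boundaries follow the totals listed in the
    rubric (8, 6-7, 4-5, 2-3, 0-1); everything else is an assumption.
    """
    if not attempted:
        return 0  # No Response
    for found in (found_text1, found_text2):
        if not 0 <= found <= 4:
            raise ValueError("each text has four macrostructures")
    total = found_text1 + found_text2
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good
    if total >= 4:
        return 15  # Fair
    if total >= 2:
        return 10  # Poor
    return 5       # Very Poor


if __name__ == "__main__":
    # Four in one text and none in the other totals 4, which the rubric lists under Fair.
    print(macrostructure_band(4, 0))
```

Because each band is defined by the total alone, the per-text combinations spelled out in the rubric (for example, 4 and 2 versus 3 and 3) do not need separate branches.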


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


and discriminant analysis, usually associated with descriptive research (Tabachnick and Fidell, 1996). The present research was conducted with samples of naturally occurring student groups and was not experimental. Moreover, we were interested in finding ways to compare participant performance on the measures of the new constructs, Reading to Learn and Reading to Integrate, with performance on more traditional measures of basic comprehension. Thus, we opted to use discriminant analysis because of its parsimony of description and clarity of interpretation (Stevens, 1996). Discriminant analysis, a technique recommended to describe group differences or predict group membership based on a comparison of multiple predictors (Huberty, 1994), has been used in other areas of applied linguistic research to investigate creation of a student profile of success or failure on Computer Assisted Language Learning (CALL) lessons (Jamieson et al., 1993) and accurate classification of text types into registers (Biber, 1993), among other purposes.

To further distinguish basic comprehension from the new constructs, we conducted discriminant analysis on each of the two language groups (native and nonnative) to determine whether Reading to Learn and Reading to Integrate would classify participants in the same way that Basic Comprehension would. We divided the native speaker and nonnative speaker groups into three levels (high, middle (mid), and low reading ability scorers) based on the basic comprehension measure chosen for that group (Nelson–Denny for native speakers; TOEFL Reading Comprehension for nonnative speakers). Research methodologists (Tabachnick and Fidell, 1996: 513) note that robustness is expected when the smallest group has at least 20; our smallest group was 25. To check the assumption of homogeneity of the variance/covariance matrices, we examined the outcomes of Box’s M Test and found them all nonsignificant (Klecka, 1980). Each group was checked for outliers using Mahalanobis distance treated as Chi-Square, and no outliers were found (Tabachnick and Fidell, 1996). Thus, the data met all assumptions required for use of discriminant analysis.

a Discriminant analysis for nonnative speakers: To organize the discriminant analysis in order to see if the Reading to Learn/Reading to Integrate Composite classified participants similarly to the measure of Basic Comprehension (for nonnative speakers, TOEFL Reading Comprehension), we divided the entire nonnative speaker group (n = 146) into three levels of basic comprehension: high, mid, and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
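The three-way split on the scaled TOEFL Reading Comprehension score can be written as a small classifier. This is a sketch under the cut scores stated above (56 and above, 50 to 55, 49 and below); the function name `toefl_reading_group` and the sample scores are hypothetical.

```python
from collections import Counter


def toefl_reading_group(scaled_score: int) -> str:
    """Assign a reading ability group from a scaled TOEFL Reading Comprehension score.

    Cut scores follow the text: high >= 56, mid 50-55, low <= 49.
    """
    if scaled_score >= 56:
        return "high"
    if scaled_score >= 50:
        return "mid"
    return "low"


if __name__ == "__main__":
    # Hypothetical scores, used only to show the grouping step.
    sample_scores = [61, 56, 55, 50, 49, 42]
    print(Counter(toefl_reading_group(score) for score in sample_scores))
```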

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks’ Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

Reading ability group    n      Mean on TOEFL reading comprehension    sd
High (≥56)               57     59.68                                  3.03
Mid (50–55)              42     52.81                                  1.67
Low (≤49)                44     42.11                                  5.47
Total                    143    52.26                                  8.22

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

⁵ We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
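The eigenvalue, Wilks’ Lambda, and Chi-Square values reported for this function are linked by standard formulas, which gives a quick plausibility check. The sketch below assumes the reported eigenvalue is 0.92, that the second function is negligible, and that two predictors and three groups entered the analysis; it uses Bartlett's chi-square approximation, which may differ slightly from the SPSS output the authors report.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


def bartlett_chi_square(wilks, n_cases, n_predictors, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(Lambda)."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(wilks)


if __name__ == "__main__":
    # Assumed inputs: eigenvalue 0.92, N = 143, p = 2 predictors, g = 3 groups.
    lam = wilks_lambda([0.92])
    chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
    print(round(lam, 2), round(chi2, 1))
```

Under these assumptions the computed Lambda rounds to the reported value, and the approximate Chi-Square lands close to the reported figure.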

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
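The tallies of participants who stayed, moved up, or moved down can be recomputed directly from the Table 10 counts. This is a minimal sketch; the function name `reclassification_summary` and the row/column ordering (high, mid, low) are choices made here for illustration, not part of the original analysis.

```python
def reclassification_summary(counts):
    """Summarize a square classification table ordered from highest to lowest group.

    Rows are initial groups, columns are predicted groups. Returns the number of
    participants who stayed, moved to a higher group, or moved to a lower group.
    """
    stayed = moved_up = moved_down = 0
    for row_index, row in enumerate(counts):
        for col_index, count in enumerate(row):
            if col_index == row_index:
                stayed += count
            elif col_index < row_index:  # predicted group ranks higher than the initial one
                moved_up += count
            else:
                moved_down += count
    return {"stayed": stayed, "up": moved_up, "down": moved_down}


# Counts from Table 10: rows and columns ordered high, mid, low.
TABLE_10 = [
    [37, 17, 3],
    [13, 18, 11],
    [2, 6, 36],
]

if __name__ == "__main__":
    summary = reclassification_summary(TABLE_10)
    total = sum(summary.values())
    for label, count in summary.items():
        print(label, count, f"{100 * count / total:.1f}%")
```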

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥56)               37       17       3        57
Mid (50–55)              13       18       11       42
Low (≤49)                2        6        36       44
Reclassification total   52       41       50       143

Percentage
High (≥56)               64.9     29.8     5.3      100.0
Mid (50–55)              31.0     42.9     26.2     100.0
Low (≤49)                4.5      13.6     81.8     100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
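The Nelson–Denny cut scores can be checked against the sample statistics. The sketch below reads the reported distance as half a standard deviation on either side of the mean and, as an assumption, plugs in the mean and sd reported in Table 11 for the 101 cases retained in the discriminant analysis, which may differ slightly from the values for the full sample used to set the cuts. The function names are hypothetical.

```python
def sd_cut_points(mean: float, sd: float, distance: float = 0.5):
    """Return (lower, upper) cut points at `distance` standard deviations from the mean."""
    return mean - distance * sd, mean + distance * sd


def nelson_denny_group(score: int) -> str:
    """Assign a group using the cut scores stated in the text (>= 135, 119-134, <= 118)."""
    if score >= 135:
        return "high"
    if score >= 119:
        return "mid"
    return "low"


if __name__ == "__main__":
    # Assumed inputs: mean and sd taken from the Total row of Table 11.
    lower, upper = sd_cut_points(mean=126.04, sd=16.52)
    print(round(lower, 2), round(upper, 2))
    print(nelson_denny_group(135), nelson_denny_group(119), nelson_denny_group(118))
```

The computed cut points fall within a point of the published boundaries, which is consistent with a half-standard-deviation split.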

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd
High (≥135)              39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤118)               25     102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

⁶ Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                         Predicted group membership for Reading to
                         Learn/Reading to Integrate Composite
Reading ability group    High     Mid      Low      Initial classification total

Count
High (≥135)              28       3        8        39
Mid (119–134)            10       7        20       37
Low (≤118)               5        7        13       25
Reclassification total   43       17       41       101

Percentage
High (≥135)              71.8     7.7      20.5     100.0
Mid (119–134)            27.0     18.9     54.1     100.0
Low (≤118)               20.0     28.0     52.0     100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24) in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analysis: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only There is no penaltyfor incorrect responses Points will be awarded in the following manner

Problems and Solutions 10 points eachCauses and Effects 5 points eachExamples 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points), 40 maximum
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys) and/or plants
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points), 90 maximum
• Stricter environmental laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
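The point values and category maxima in the Appendix 1b rubric can be tallied mechanically. The Python sketch below is illustrative only: the function name, the dictionary layout, and the assumption that a rater has already counted full-credit responses, one-word responses, and examples per category are ours, not part of the original scoring procedure. It also does not model the alternative credit for an overarching cause or solution entered in the examples row. Summing the stated maxima reproduces the 241-point ceiling.

```python
# Illustrative tally of a Reading to Learn score from rater counts.
# Point values and maxima come from the rubric; the data layout and
# function name are hypothetical.

RUBRIC = {
    # category: (full-credit points, one-word points, category maximum)
    "problems": (10, 6, 40),
    "causes": (5, 3, 55),
    "effects": (5, 3, 30),
    "solutions": (10, 6, 90),
}

# Examples earn 1 point each, capped per category.
EXAMPLE_MAXIMA = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def reading_to_learn_score(full, one_word, examples):
    """Sum capped category scores.

    Each argument maps a category name to the number of correct
    responses the rater credited in that category.
    """
    total = 0
    for category, (full_pts, word_pts, cap) in RUBRIC.items():
        raw = full.get(category, 0) * full_pts + one_word.get(category, 0) * word_pts
        total += min(raw, cap)  # response points cannot exceed the category cap
        total += min(examples.get(category, 0), EXAMPLE_MAXIMA[category])
    return total


# Ceiling implied by the rubric: 215 response points plus 26 example points.
MAX_SCORE = sum(cap for _, _, cap in RUBRIC.values()) + sum(EXAMPLE_MAXIMA.values())
print(MAX_SCORE)
```

For instance, one full-credit problem, two one-word causes, and one effect example would score 10 + 6 + 1 = 17 under these assumptions.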


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations, and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well-developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well-developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
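The Macrostructures bands depend only on the combined number of macrostructures identified across the two texts, so the mapping can be written as a small lookup. The Python sketch below is a hedged illustration: the function name and the `attempted` flag are our own, and it follows the band table (scores 25, 20, 15, 10, 5, 0) rather than the prose remark about a minimum score.

```python
def macrostructure_score(text_a, text_b, attempted=True):
    """Map macrostructures identified in each text (0-4 each) to a band score.

    text_a, text_b: number of macrostructures accurately identified per text.
    attempted: False when the participant gave no response at all.
    """
    if not attempted:
        return 0  # "No Response" band
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has at most 4 macrostructures")
    total = text_a + text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)
```

Because every combination listed in a band sums to that band's total range (for example, 4 + 2 and 3 + 3 both give 6), the per-text split does not need to be checked separately.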


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


and low. These reading ability groups of high, mid, and low were based on typical TOEFL Reading Comprehension score levels required for program entry. Participants were classified as high if their scores were greater than or equal to 550, or 56 and above on the scaled score on the TOEFL Reading Comprehension (550 is the cut score often used for graduate entry). Participants were classified as mid if scores ranged between 500 and 549, or 50 to 55 on the TOEFL Reading Comprehension. Participants were classified as low if their scores fell below 500, or 49 and below on TOEFL Reading Comprehension (500 is a minimum TOEFL score sometimes used for undergraduate admission, often with the proviso that students enroll in ESL classes either prior to official enrollment or concurrently). For our entire nonnative speaker group, descriptive statistics on basic comprehension reading ability group membership levels appear in Table 9.
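The three-way grouping described above amounts to two cut scores on either scale. As a minimal sketch (the function name and the `scale` labels are hypothetical, chosen only for illustration), the rule could be expressed in Python as:

```python
def toefl_reading_group(score, scale="section"):
    """Return 'high', 'mid', or 'low' for a TOEFL score.

    scale="section":     scaled Reading Comprehension score (cut-offs 56 and 50).
    scale="three_digit": the equivalent three-digit score (cut-offs 550 and 500).
    """
    cutoffs = {"section": (56, 50), "three_digit": (550, 500)}
    if scale not in cutoffs:
        raise ValueError("scale must be 'section' or 'three_digit'")
    high_min, mid_min = cutoffs[scale]
    if score >= high_min:
        return "high"
    if score >= mid_min:
        return "mid"
    return "low"
```

A scaled section score of 55, for example, falls in the mid group, while 49 falls in the low group.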

Table 9 Descriptive statistics for nonnative speakers by TOEFL reading comprehension reading ability groups (n = 143)

| Reading ability group | n | Mean on TOEFL reading comprehension | sd |
| --- | --- | --- | --- |
| High (≥ 56) | 57 | 59.68 | 3.03 |
| Mid (50–55) | 42 | 52.81 | 1.67 |
| Low (≤ 49) | 44 | 42.11 | 5.47 |
| Total | 143 | 52.26 | 8.22 |

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

We ran SPSS Discriminant Analysis with initial grouping variables of high, mid, and low reading ability. We compared initial reading ability levels with high, mid, and low categories on the Reading to Learn/Reading to Integrate Composite, a new variable reflecting level of performance on the Reading to Learn and Reading to Integrate measures combined.⁵ The discriminant analysis yielded one discriminant function with an eigenvalue of .92, responsible for 99.9% of the variance in outcomes. Wilks' Lambda for this function was .52, significant at .001; the associated Chi-Square value was extremely large (90.93) and highly significant (p < .001), indicating the group centroids on the composite Reading to Learn/Reading to Integrate function for the three nonnative speaker reading ability groups were significantly different. Both Reading to Learn and Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

5 We first calculated two separate discriminant analyses, one for Reading to Learn and one for Reading to Integrate, but we found that both loaded on a single function, so we used the composite in subsequent analyses.
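The reported eigenvalue, Wilks' Lambda, and Chi-Square value are linked by standard textbook relations, which the sketch below uses as a rough consistency check. It rests on assumptions of ours: only the reported first eigenvalue is used (any further function is ignored), the chi-square is Bartlett's approximation, and two predictors (Reading to Learn and Reading to Integrate) and three groups are assumed. It is not a reconstruction of the SPSS run.

```python
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    lam = 1.0
    for ev in eigenvalues:
        lam *= 1.0 / (1.0 + ev)
    return lam


def bartlett_chi_square(wilks, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(wilks)


# Nonnative speaker analysis: eigenvalue .92, n = 143 (assumed 2 predictors, 3 groups).
lam = wilks_lambda([0.92])
chi2 = bartlett_chi_square(lam, n_cases=143, n_predictors=2, n_groups=3)
print(round(lam, 2), round(chi2, 1))
```

Under these assumptions the computed Lambda rounds to the reported .52 and the chi-square comes out at approximately 91, close to the reported 90.93; the small gap is what one would expect from rounding and the omitted second function.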

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 91 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 52 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.
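The stayed, moved-up, and moved-down figures follow directly from the counts in Table 10. The short Python sketch below (function and variable names are ours) treats the table as a 3 × 3 matrix with rows as initial groups and columns as predicted groups, both ordered High, Mid, Low, and tallies the diagonal and the two off-diagonal triangles.

```python
# Rows: initial group (High, Mid, Low); columns: predicted group (High, Mid, Low).
TABLE_10_COUNTS = [
    [37, 17, 3],
    [13, 18, 11],
    [2, 6, 36],
]


def reclassification_summary(counts):
    """Count cases that stayed, moved to a higher group, or moved to a lower group."""
    stayed = up = down = 0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if i == j:
                stayed += n   # same category on both measures
            elif j < i:
                up += n       # predicted group is higher than the initial group
            else:
                down += n     # predicted group is lower than the initial group
    total = stayed + up + down
    return {
        "total": total,
        "stayed": stayed,
        "up": up,
        "down": down,
        "pct_changed": round(100.0 * (up + down) / total, 1),
    }


summary = reclassification_summary(TABLE_10_COUNTS)
print(summary)
```

With the Table 10 counts, the diagonal sums to 91 and the off-diagonal cells to 21 (higher) and 31 (lower), that is, 52 of 143 participants (36.4%) changing category.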

Table 10 Discriminant analysis comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

| Reading ability group | Predicted: High | Predicted: Mid | Predicted: Low | Initial classification total |
| --- | --- | --- | --- | --- |
| Count | | | | |
| High (≥ 56) | 37 | 17 | 3 | 57 |
| Mid (50–55) | 13 | 18 | 11 | 42 |
| Low (≤ 49) | 2 | 6 | 36 | 44 |
| Reclassification total | 52 | 41 | 50 | 143 |
| Percentage | | | | |
| High (≥ 56) | 64.9 | 29.8 | 5.3 | 100.0 |
| Mid (50–55) | 31.0 | 42.9 | 26.2 | 100.0 |
| Low (≤ 49) | 4.5 | 13.6 | 81.8 | 100.0 |

Note: Predicted columns show group membership on the Reading to Learn/Reading to Integrate Composite. Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
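The half-standard-deviation split can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the mean (126.04) and standard deviation (16.52) are the Table 11 totals for the 101 cases retained in the discriminant analysis, whereas the original split used the full sample of 105, so the bounds are approximate; the function names are hypothetical.

```python
def half_sd_bounds(mean, sd, width=0.5):
    """Lower and upper bounds of the mid band: mean -/+ width * sd."""
    return mean - width * sd, mean + width * sd


def nelson_denny_group(score, low_max=118, high_min=135):
    """Classify a Nelson-Denny score using the cut scores reported in the text."""
    if score >= high_min:
        return "high"
    if score <= low_max:
        return "low"
    return "mid"


lower, upper = half_sd_bounds(126.04, 16.52)
print(round(lower, 2), round(upper, 2))
```

The bounds come out near 117.8 and 134.3, which is consistent with the reported whole-number cut scores of 118 and below for low, 119 to 134 for mid, and 135 and above for high.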

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

| Reading ability group | n | Mean on Nelson–Denny | sd |
| --- | --- | --- | --- |
| High (≥ 135) | 39 | 141.26 | 5.19 |
| Mid (119–134) | 37 | 125.65 | 5.09 |
| Low (≤ 118) | 25 | 102.88 | 10.98 |
| Total | 101 | 126.04 | 16.52 |

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,⁶ justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category, and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

| Reading ability group | Predicted: High | Predicted: Mid | Predicted: Low | Initial classification total |
| --- | --- | --- | --- | --- |
| Count | | | | |
| High (≥ 135) | 28 | 3 | 8 | 39 |
| Mid (119–134) | 10 | 7 | 20 | 37 |
| Low (≤ 118) | 5 | 7 | 13 | 25 |
| Reclassification total | 43 | 17 | 41 | 101 |
| Percentage | | | | |
| High (≥ 135) | 71.8 | 7.7 | 20.5 | 100.0 |
| Mid (119–134) | 27.0 | 18.9 | 54.1 | 100.0 |
| Low (≤ 118) | 20.0 | 28.0 | 52.0 | 100.0 |

Note: Predicted columns show group membership on the Reading to Learn/Reading to Integrate Composite. Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
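The percentage rows of Table 12 are row percentages of the count rows, and the share of participants predicted into each group comes from the column totals. A brief Python sketch (names are ours, shown only to make the arithmetic explicit):

```python
# Rows: initial group (High, Mid, Low); columns: predicted group (High, Mid, Low).
TABLE_12_COUNTS = [
    [28, 3, 8],
    [10, 7, 20],
    [5, 7, 13],
]


def row_percentages(counts):
    """Percentage of each initial group assigned to each predicted group."""
    return [[round(100.0 * n / sum(row), 1) for n in row] for row in counts]


def predicted_shares(counts):
    """Percentage of all participants predicted into each group (column totals)."""
    total = sum(sum(row) for row in counts)
    column_totals = [sum(col) for col in zip(*counts)]
    return [round(100.0 * c / total, 1) for c in column_totals]


print(row_percentages(TABLE_12_COUNTS))
print(predicted_shares(TABLE_12_COUNTS))
```

The row percentages reproduce the table (for example, 28 of 39 is 71.8%), and the middle column total of 17 out of 101 gives the 16.8% of native speakers predicted as mid.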

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2), rather than production of responses, were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task (multiple-choice basic comprehension) overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower-level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures and some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the 'two disciplines' problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. U.S. News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each
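The point values in these directions amount to a weighted count of correct responses. A minimal sketch of that tally follows; the function name `chart_score` and the dictionary layout are illustrative assumptions rather than part of the original scoring materials, and the per-category maximums listed in Appendix 1b are not applied here.

```python
# Point values as stated in the task directions (Appendix 1a).
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum points for correct responses only.

    correct_counts maps a category name to the number of correct
    responses in that category. Incorrect responses are simply not
    counted, which mirrors the "no penalty" rule in the directions.
    """
    unknown = set(correct_counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Hypothetical response: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
score = chart_score(
    {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
)
```

For the hypothetical response above, the tally is 2 × 10 + 3 × 5 + 1 × 5 + 2 × 10 + 4 × 1 = 64 points.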

Problems Causes Effects Solutions

Examples Examples Examples Examples

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points; 40 maximum)
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points; 55 maximum)
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
  Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys) and/or plants
• Emissions from Kilns
• Unregulated pollution/smog (from other countries: Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points; 30 maximum)
• Trees and plants affected (injured); growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water; water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points; 90 maximum)
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries: Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
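The macrostructure bands above depend only on the combined number of macrostructures identified across the two texts. The sketch below illustrates that mapping; the function name `macrostructure_band` is a hypothetical helper, it returns the band label rather than a numeric score, and it assumes the participant attempted a response (the "No Response" band is handled outside it).

```python
def macrostructure_band(found_in_text_a, found_in_text_b):
    """Map macrostructures identified in each text (0 to 4) to a rubric band.

    The bands follow the totals in the rubric: 8 is Excellent, 6 or 7 is
    Good, 4 or 5 is Fair, 2 or 3 is Poor, and 0 or 1 is Very Poor.
    """
    for found in (found_in_text_a, found_in_text_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has exactly four macrostructures")
    total = found_in_text_a + found_in_text_b
    if total == 8:
        return "Excellent"
    if total >= 6:
        return "Good"
    if total >= 4:
        return "Fair"
    if total >= 2:
        return "Poor"
    return "Very Poor"


# A participant who identifies 4 macrostructures in one text and 1 in the other.
band = macrostructure_band(4, 1)
```

Because each alternative listed within a band sums to the same totals, a single sum reproduces every case the rubric enumerates, for example 4 and 1 (total 5) falling in the Fair band.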

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


Reading to Integrate loaded significantly on the discriminant function, at .86 for Reading to Learn and .78 for Reading to Integrate (p < .05).

Over half (64.9%) of the high reading ability group remained high on the new measure; less than half (42.9%) of the mid reading ability group remained classified as mid. Hence, the Reading to Learn/Reading to Integrate Composite was particularly influential in reclassifying the mid reading ability group and, to a lesser extent, the high group. However, most (81.8%) of the low reading ability group remained low on the Reading to Learn/Reading to Integrate Composite (see Table 10). Of the 143 nonnative speaker participants, 92 (64%) remained in the initial basic comprehension category on the composite; the rest moved, but in different directions. Twenty-one participants (14.7%) were classified into a higher category on the Reading to Learn/Reading to Integrate Composite than their initial basic comprehension level would have suggested, while 31 (21.7%) were reclassified into a lower category. Thus, 51 participants (36.4%), just over one third of the sample, were classified differently based on their Reading to Learn/Reading to Integrate Composite performance.

Table 10 Discriminant analysis: comparison of nonnative speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 143)

Predicted group membership for Reading to Learn/Reading to Integrate Composite

Reading ability group     High    Mid     Low     Initial classification total
Count
High (≥56)                37      17      3       57
Mid (50–55)               13      18      11      42
Low (≤49)                 2       6       36      44
Reclassification total    52      41      50      143
Percentage
High (≥56)                64.9    29.8    5.3     100.0
Mid (50–55)               31.0    42.9    26.2    100.0
Low (≤49)                 4.5     13.6    81.8    100.0

Note: Total n = 143 due to loss of three cases in anomalous Reading to Integrate session.

b Discriminant analysis for native speakers: Because one of our goals in this project was to probe the possible validity of these new measures by assessing performance of two groups, native as well as nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged between 119 and 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
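The three-way split described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the mean and standard deviation used in the usage lines are the Table 11 totals for the 101 participants retained in the discriminant analysis, which is an assumption about how closely they match the values the authors used for the full group of 105.

```python
def sample_cutoffs(mean, sd, width=0.5):
    """Return (low, high) cut points lying width * sd below and above the mean."""
    return mean - width * sd, mean + width * sd


def ability_group(score, low_cutoff=118, high_cutoff=135):
    """Classify a Nelson-Denny score using the integer cutoffs given in the text.

    Scores at or above high_cutoff are "high", scores at or below
    low_cutoff are "low", and everything in between is "mid".
    """
    if score >= high_cutoff:
        return "high"
    if score <= low_cutoff:
        return "low"
    return "mid"


# Cut points implied by the Table 11 totals (mean 126.04, sd 16.52).
low, high = sample_cutoffs(126.04, 16.52)

# Classifying a few hypothetical scores with the published cutoffs.
groups = [ability_group(s) for s in (141, 125, 118)]
```

Half a standard deviation on either side of the mean gives roughly 117.8 and 134.3, which is consistent with the reported boundaries of 118 and below for the low group and 135 and above for the high group.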

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd
High (≥135)              39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤118)               25     102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks' Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function,[6] justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

[6] Reading to Learn correlated with function 2 at -.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Predicted group membership for Reading to Learn/Reading to Integrate Composite

Reading ability group     High    Mid     Low     Initial classification total
Count
High (≥135)               28      3       8       39
Mid (119–134)             10      7       20      37
Low (≤118)                5       7       13      25
Reclassification total    43      17      41      101
Percentage
High (≥135)               71.8    7.7     20.5    100.0
Mid (119–134)             27.0    18.9    54.1    100.0
Low (≤118)                20.0    28.0    52.0    100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
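The reclassification figures reported for native speakers follow directly from the counts in Table 12: the diagonal holds participants who stayed in their initial category, and the cells on either side of it hold those who moved up or down. A minimal sketch of that arithmetic is given below; the variable and function names are illustrative assumptions, while the counts are those printed in the table.

```python
# Rows are initial groups, columns are predicted groups, both ordered
# high, mid, low. Counts are taken from Table 12.
TABLE_12_COUNTS = [
    [28, 3, 8],    # initially high
    [10, 7, 20],   # initially mid
    [5, 7, 13],    # initially low
]


def reclassification_summary(table):
    """Count participants who stayed, moved to a higher, or moved to a lower group.

    Index 0 is the highest category, so a predicted column to the left of
    the diagonal is a move up and a column to the right is a move down.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    same = sum(table[i][i] for i in range(size))
    higher = sum(table[i][j] for i in range(size) for j in range(i))
    lower = sum(table[i][j] for i in range(size) for j in range(i + 1, size))
    return {"total": total, "same": same, "higher": higher, "lower": lower}


summary = reclassification_summary(TABLE_12_COUNTS)
# summary == {"total": 101, "same": 48, "higher": 22, "lower": 31}
```

Dividing each count by the total of 101 gives the 47.5%, 21.8%, and 30.7% reported in the text.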

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses, because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well, despite middling performance on basic comprehension. This finding suggests that, for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.

Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task, multiple-choice basic comprehension, overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern of reclassification. For the low reading ability group, approximately half were misclassified, suggesting that the basic comprehension measure underrepresented their ability to categorize and synthesize information in academic texts. Most of the mid reading ability group were reclassified as lower on the Reading to Learn/Reading to Integrate Composite, showing that basic comprehension results overestimated ability to succeed on more challenging tasks. On the other hand, almost one third of the mid reading ability group did better on the Reading to Learn/Reading to Integrate Composite; for them, basic comprehension underrepresented their ability to complete more challenging reading tasks. For the high ability readers, nearly one third dropped on the Reading to Learn/Reading to Integrate Composite. For them, the basic comprehension measure overestimated their level of academic English proficiency. The other two thirds of the high reading ability group remained high, suggesting a higher threshold of academic English proficiency.

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension, once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension, and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures; some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as 'sophisticated discourse processes and critical thinking skills' (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric: 0–241 total possible

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
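The point values in Appendix 1a and the category maxima in the Appendix 1b rubric can be cross-checked with a short sketch. The function and dictionary names below are hypothetical, and the assignment of the example maxima to categories (5, 10, 6 and 5) is our reading of the rubric layout, not something the rubric states in a single place. Partial credit for one-word responses and the alternative credit for overarching causes or solutions are deliberately left out.

```python
# Illustrative sketch only: names are hypothetical, values are read from Appendices 1a and 1b.
POINTS = {"Problems": 10, "Causes": 5, "Effects": 5, "Solutions": 10}
CATEGORY_MAX = {"Problems": 40, "Causes": 55, "Effects": 30, "Solutions": 90}
# Assumed mapping of the example maxima to categories (1 point per example).
EXAMPLE_MAX = {"Problems": 5, "Causes": 10, "Effects": 6, "Solutions": 5}


def score_chart(correct_items, correct_examples):
    """Score a chart from counts of correct full responses and correct examples.

    Incorrect responses are simply not counted, since the directions state
    that there is no penalty for them.
    """
    total = 0
    for category, points in POINTS.items():
        item_points = correct_items.get(category, 0) * points
        total += min(item_points, CATEGORY_MAX[category])
        total += min(correct_examples.get(category, 0), EXAMPLE_MAX[category])
    return total


def maximum_score():
    """Sum the category and example maxima to get the highest possible score."""
    return sum(CATEGORY_MAX.values()) + sum(EXAMPLE_MAX.values())


print(maximum_score())
```

Under these assumptions the maxima sum to the 241 points given in the rubric heading.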


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Lined space for the participant's essay]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total: 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total: 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total: 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total: 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other), OR accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total: 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
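The macrostructure bands above depend only on the total number of macrostructures identified across the two texts, so the rubric can be expressed as a small lookup. The sketch below is illustrative; the function name and the `attempted` flag are our own, and it assumes four macrostructures per text, as the rubric does.

```python
# Illustrative sketch only: the function name and arguments are hypothetical.
def macrostructure_score(found_text_a, found_text_b, attempted=True):
    """Map counts of correctly identified macrostructures (0-4 per text) to a rubric score."""
    if not attempted:
        return 0  # "No Response"
    for found in (found_text_a, found_text_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")
    total = found_text_a + found_text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good
    if total >= 4:
        return 15  # Fair
    if total >= 2:
        return 10  # Poor
    return 5       # Very Poor: total of 0 or 1


print(macrostructure_score(4, 2))
```

Every combination listed in the rubric (for example, 4 in one text and 0 in the other for Fair) falls into the band implied by its total, which is why a single total suffices here.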


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


nonnative speakers, we conducted a parallel discriminant analysis for native speakers. Thus, for the native speakers, we followed the same procedure, dividing the entire native speaker group (n = 105) into three levels of basic comprehension (high, mid, and low) based on score distances of .5 standard deviations from the sample mean of the Nelson–Denny test for these participants. Native speakers need not take reading comprehension tests when entering the university, so the three-way split was based entirely on our sample data. Participants were classified as high if their scores on the Nelson–Denny were greater than or equal to 135. Participants were classified as mid if their scores ranged from 119 to 134 on the Nelson–Denny. Participants were classified as low if their scores fell at or below 118 on the Nelson–Denny. Descriptive statistics for the entire native speaker sample on basic comprehension group membership appear in Table 11.
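The three-way split can be illustrated with a minimal sketch. It uses the published cut-offs (135 and 118) directly and, for comparison, computes the band lying .5 standard deviations either side of the sample mean from the totals reported in Table 11 (mean 126.04, sd 16.52). The function and constant names are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: names are hypothetical; values come from the text and Table 11.
SAMPLE_MEAN = 126.04  # Nelson-Denny sample mean (Table 11, Total row)
SAMPLE_SD = 16.52     # Nelson-Denny sample standard deviation (Table 11, Total row)
HIGH_MIN = 135        # published cut-off: high if score >= 135
LOW_MAX = 118         # published cut-off: low if score <= 118


def half_sd_band(mean, sd, width=0.5):
    """Return the (lower, upper) bounds lying `width` SDs either side of the mean."""
    return mean - width * sd, mean + width * sd


def classify(score):
    """Assign a Nelson-Denny score to the high, mid, or low reading ability group."""
    if score >= HIGH_MIN:
        return "high"
    if score <= LOW_MAX:
        return "low"
    return "mid"


lower, upper = half_sd_band(SAMPLE_MEAN, SAMPLE_SD)
print(f"mean -/+ .5 SD: {lower:.2f} to {upper:.2f}")
print([classify(score) for score in (118, 119, 134, 135)])
```

The computed band is roughly 117.78 to 134.30, which is consistent with the whole-number cut-offs of 118 and 135 used for classification.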

The discriminant analysis yielded one discriminant function with an eigenvalue of .31, responsible for 98.7% of the variance in outcomes. Wilks’ Lambda for this function was .76, significant at .001; the associated Chi-Square value was large (26.39) and highly significant (p < .001), indicating the group centroids on the discriminant function for the three reading ability groups on the Reading to Learn/Reading to Integrate Composite were significantly different. Pooled within-groups correlations between discriminating variables showed that Reading to Learn correlated with the first discriminant function at a level of .81; Reading to Integrate correlated with the second discriminant function at .71. This contrasts with findings for the nonnative speakers, where scores on the combined new measures loaded significantly on only one discriminant function. For native speakers, then, there is evidence for two significant discriminant functions, although the first accounts for almost all of the variance. Although these two measures (Reading to Learn and Reading to Integrate) loaded on two separate discriminant functions, they still showed moderate correlations with the alternate function (see footnote 6), justifying the composite calculations.

Table 11 Descriptive statistics for native speakers by Nelson–Denny reading ability groups (n = 101)

Reading ability group    n      Mean on Nelson–Denny    sd
High (≥135)              39     141.26                  5.19
Mid (119–134)            37     125.65                  5.09
Low (≤118)               25     102.88                  10.98
Total                    101    126.04                  16.52

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
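The omnibus test reported for this discriminant analysis can be roughly cross-checked from the published figures. The sketch below rests on several assumptions of ours, none stated in the text: that a second eigenvalue accounts for the remaining 1.3% of the variance, that the Chi-Square value comes from Bartlett's approximation, and that the test has p(g − 1) = 4 degrees of freedom for two predictors and three groups. All function names are hypothetical.

```python
# Illustrative sketch only: the Bartlett approximation and the implied second
# eigenvalue are assumptions, not procedures reported in the study.
import math


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over all functions."""
    lam = 1.0
    for eigenvalue in eigenvalues:
        lam /= 1.0 + eigenvalue
    return lam


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for testing Wilks' Lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


def chi_square_sf_even_df(x, df):
    """Upper-tail chi-square probability, using the closed form valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


first = 0.31                              # reported eigenvalue
second = first * (1 - 0.987) / 0.987      # implied by the 98.7% share (assumption)
lam = wilks_lambda([first, second])
chi2 = bartlett_chi_square(lam, n_cases=101, n_predictors=2, n_groups=3)
p_value = chi_square_sf_even_df(chi2, df=2 * (3 - 1))
print(round(lam, 2), round(chi2, 2), p_value < 0.001)
```

Under these assumptions the recomputed Lambda rounds to the reported .76, and the recomputed Chi-Square lands close to, though not exactly on, the reported 26.39; the small gap is what one would expect from working with rounded inputs.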

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern from that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus, 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

                           Predicted group membership for Reading to
                           Learn/Reading to Integrate Composite
Reading ability group      High      Mid       Low       Initial classification total

Count
High (≥135)                28        3         8         39
Mid (119–134)              10        7         20        37
Low (≤118)                 5         7         13        25
Reclassification total     43        17        41        101

Percentage
High (≥135)                71.8      7.7       20.5      100.0
Mid (119–134)              27.0      18.9      54.1      100.0
Low (≤118)                 20.0      28.0      52.0      100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous Reading to Integrate session.
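The reclassification figures quoted in the text follow directly from the counts in Table 12. As a minimal sketch (function and variable names are ours), the cross-tabulation can be summarized as follows, with rows as the initial Nelson–Denny group and columns as the predicted group on the Composite, both ordered High, Mid, Low.

```python
# Illustrative sketch only: names are hypothetical; counts are those in Table 12.
LEVELS = ["High", "Mid", "Low"]  # ordered from highest to lowest
COUNTS = [
    [28, 3, 8],    # initially High
    [10, 7, 20],   # initially Mid
    [5, 7, 13],    # initially Low
]


def reclassification_summary(counts):
    """Count participants who stayed put, moved to a higher group, or moved lower."""
    same = higher = lower = 0
    for initial, row in enumerate(counts):
        for predicted, n in enumerate(row):
            if predicted == initial:
                same += n
            elif predicted < initial:   # a smaller index is a higher group
                higher += n
            else:
                lower += n
    return {"total": same + higher + lower, "same": same,
            "higher": higher, "lower": lower}


def row_percentages(counts):
    """Express each row as percentages of its initial classification total."""
    return [[round(100 * n / sum(row), 1) for n in row] for row in counts]


summary = reclassification_summary(COUNTS)
print(summary)
for level, row in zip(LEVELS, row_percentages(COUNTS)):
    print(level, row)
```

The summary reproduces the 48 unchanged, 22 higher, and 31 lower classifications reported for the 101 native speakers.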

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers, most (81.8%) of the participants with TOEFL Reading Comprehension below 50 remained classified as low on the Reading to Learn/Reading to Integrate Composite, suggesting the existence of a lower threshold of academic English proficiency. Participants below this threshold were unlikely to perform well on tasks assessing Reading to Learn and Reading to Integrate. Even tasks involving only selection (such as BC1 and BC2) rather than production of responses were difficult for this group. However, for the nonnative speaker readers in the mid and high reading ability groups, basic comprehension level was not nearly as consistent a predictor of performance on the Reading to Learn/Reading to Integrate Composite. This was especially striking for those in the mid reading ability group. Approximately one fourth of the mid group dropped into the low category on the Reading to Learn/Reading to Integrate Composite, indicating that they experienced more difficulty in completing the new measures, but approximately one fourth could indeed categorize and synthesize information relatively well despite middling performance on basic comprehension. This finding suggests that for nonnative speaker readers in the mid reading ability category, basic comprehension measures were insufficient to predict their performance on the new measures.


Almost two-thirds of the high basic comprehension nonnative speakers remained high on the Reading to Learn/Reading to Integrate Composite, but one third dropped, suggesting that they could not categorize and synthesize information from texts as well as they could recognize information. For this third, the recognition-only task (multiple-choice basic comprehension) overestimated ability to manipulate and synthesize information. For the high and mid reading ability nonnative speaker groups, basic comprehension-only tests may then overestimate academic English proficiency.

Results for the native speakers showed a more definitive pattern ofreclassification For the low reading ability group approximatelyhalf were misclassified suggesting that the basic comprehensionmeasure underrepresented their ability to categorize and synthesizeinformation in academic texts Most of the mid reading ability groupwere reclassified as lower on the Reading to LearnReading toIntegrate Composite showing that basic comprehension resultsoverestimated ability to succeed on more challenging tasks On theother hand almost one third of the mid reading ability group didbetter on the Reading to LearnReading to Integrate Composite forthem basic comprehension underrepresented their ability to com-plete more challenging reading tasks For the high ability readersnearly one third dropped on the Reading to LearnReading toIntegrate Composite For them the basic comprehension measureoverestimated their level of academic English proficiency The other two thirds of the high reading ability group remained highsuggesting a higher threshold of academic English proficiency

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures, some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating, useful, and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the role of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three, TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillside, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems      Causes        Effects       Solutions

Examples      Examples      Examples      Examples
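The point scheme in these directions can be sketched as a small scoring function. This is only an illustration, assuming a rater has already counted the correct entries in each column; the names `POINTS_PER_ENTRY` and `score_chart` are hypothetical, and the sketch ignores any refinements a fuller scoring rubric might apply, such as partial credit or category maximums.

```python
# Point values come from the task directions; all names here are hypothetical.
POINTS_PER_ENTRY = {"problems": 10, "solutions": 10, "causes": 5, "effects": 5}
POINTS_PER_EXAMPLE = 1


def score_chart(correct_entries, correct_examples):
    """Return the chart score from counts of correct responses.

    correct_entries: dict mapping a category name to its number of correct entries.
    correct_examples: total number of correct examples across all categories.
    Incorrect responses are simply not counted, so they carry no penalty.
    """
    unknown = set(correct_entries) - set(POINTS_PER_ENTRY)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    entry_points = sum(
        POINTS_PER_ENTRY[category] * count
        for category, count in correct_entries.items()
    )
    return entry_points + POINTS_PER_EXAMPLE * correct_examples


# Hypothetical response: 2 problems, 3 causes, 1 effect, 1 solution, 4 examples.
# 2*10 + 3*5 + 1*5 + 1*10 + 4*1 = 54
print(score_chart({"problems": 2, "causes": 3, "effects": 1, "solutions": 1}, 4))
```

Keeping the point values in one dictionary makes it easy to see how heavily the scheme weights problems and solutions relative to examples.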

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points): 40 maximum
• Air pollution/smog in the National Parks
• Opposition or ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points): 90 maximum
• Stricter environmental laws (resolutions/acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Examples of problems (1 point): 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain sources from Ohio, New York, Atlanta, etc.
• Grand Canyon sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Examples of causes (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from smokestacks (chimneys)
• Emissions from kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from controlled burns
• Emissions from large cities

Examples of causes (1 point): 10 maximum
• Navajo Generating Station, Page, AZ (power plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison plant, Laughlin, NV
• Tennessee Luttrell kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples of effects (1 point): 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Examples of solutions (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples of solutions (1 point): 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant’s essay]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
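The macrostructure bands above depend only on the combined count of macrostructures identified across the two texts. A minimal sketch of that mapping follows; the function name `macrostructure_band` and its parameters are hypothetical, and it returns the band label rather than a numeric score.

```python
def macrostructure_band(found_text_a, found_text_b, attempted=True):
    """Map macrostructures found in each text (0-4 apiece) to a rubric band.

    found_text_a, found_text_b: macrostructures accurately identified per text.
    attempted: False when the participant gave no response at all.
    """
    if not attempted:
        return "No Response"
    for found in (found_text_a, found_text_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")
    total = found_text_a + found_text_b
    if total == 8:
        return "Excellent"
    if total >= 6:
        return "Good"
    if total >= 4:
        return "Fair"
    if total >= 2:
        return "Poor"
    return "Very Poor"


# Every split listed under a band gives the same total, hence the same label.
print(macrostructure_band(4, 2))  # Good (total 6)
print(macrostructure_band(3, 1))  # Fair (total 4)
```

Working from the total reproduces each combination the rubric enumerates, for example 4 and 0, 3 and 1, and 2 and 2 all fall in the Fair band.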

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.

showed moderate correlations with the alternate function,6 justifying the composite calculations.

Results of discriminant analysis for native speakers, seen in Table 12, show a different pattern than that observed for nonnative speakers. Nearly three-fourths (71.8%) of the native speakers classified as high in the basic comprehension reading ability group remained high on the Reading to Learn/Reading to Integrate Composite. For the mid group, however, only 18.9% remained classified as mid. Just over half (52%) of the low reading ability group members remained low on the Reading to Learn/Reading to Integrate Composite. As with the nonnative speakers, participants in the mid category on basic comprehension showed the most frequent reclassification. Forty-eight of the 101 (47.5%) native speaker participants remained in the initial classification categories. Twenty-two (21.8%) were reclassified into a higher category and 31 (30.7%) were reclassified into a lower category. Thus 53 participants (52.5%), over half of the sample, were classified differently based on their performance on the Reading to Learn/Reading to Integrate Composite.

6 Reading to Learn correlated with function 2 at −.58; Reading to Integrate with function 1 at .71.

Table 12 Discriminant analysis: comparison of native speaker reading ability groups with Reading to Learn/Reading to Integrate Composite (n = 101)

Reading ability group    Predicted group membership for Reading to     Initial
                         Learn/Reading to Integrate Composite          classification total
                         High       Mid        Low

Count
High (≥135)              28         3          8                       39
Mid (119–134)            10         7          20                      37
Low (≤118)               5          7          13                      25
Reclassification total   43         17         41                      101

Percentage
High (≥135)              71.8       7.7        20.5                    100.0
Mid (119–134)            27.0       18.9       54.1                    100.0
Low (≤118)               20.0       28.0       52.0                    100.0

Note: Total n = 101 for discriminant analysis due to loss of four cases in anomalous reading to integrate session.
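The percentages and reclassification counts reported for Table 12 follow directly from its cell counts. The short sketch below recomputes them as an arithmetic check; the matrix layout and the function name `summarize_reclassification` are assumptions made for illustration, not part of the original analysis.

```python
# Cell counts from Table 12. Rows are the initial basic comprehension group,
# columns the predicted group, both ordered High, Mid, Low.
COUNTS = [
    [28, 3, 8],    # initially High
    [10, 7, 20],   # initially Mid
    [5, 7, 13],    # initially Low
]


def summarize_reclassification(counts):
    """Return totals, movement counts, and row percentages for a square table."""
    size = len(counts)
    total = sum(sum(row) for row in counts)
    same = sum(counts[i][i] for i in range(size))
    # A column index smaller than the row index is a move to a higher group.
    higher = sum(counts[i][j] for i in range(size) for j in range(i))
    lower = total - same - higher
    row_pct = [[round(100 * cell / sum(row), 1) for cell in row] for row in counts]
    return {"total": total, "same": same, "higher": higher,
            "lower": lower, "row_pct": row_pct}


summary = summarize_reclassification(COUNTS)
for key in ("same", "higher", "lower"):
    share = round(100 * summary[key] / summary["total"], 1)
    print(key, summary[key], f"{share}%")
# same 48 47.5%
# higher 22 21.8%
# lower 31 30.7%
```

The diagonal cells give the 48 participants who kept their initial classification, the cells below the diagonal the 22 who moved up, and the cells above it the 31 who moved down.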

V Interpretation

Analyses done to answer Research Questions 1 and 2 showed that performance on Reading to Learn and Reading to Integrate measures was significantly influenced by language background and level of education (graduate vs. undergraduate, for nonnative speakers only). Moreover, level of computer familiarity had no significant effect on Reading to Learn and Reading to Integrate performance.

Correlations showed that, as expected, all reading measures correlated to some degree, generally answering Research Question 3. The most interesting results came from the discriminant analyses because they showed a pattern of differential classification on the new tasks, sensitive to initial reading ability. This pattern differed for nonnative speakers and native speakers. Examination of the reclassifications based on the new tasks revealed some of the problems of using a basic comprehension-only test to predict performance on more challenging literacy tasks. These results also imply the need for further work on new measures of advanced literacy skills, such as Reading to Learn and Reading to Integrate, to reflect trends in construct-driven assessment (Pellegrino et al., 1999).

For the nonnative speakers most (818) of the participants withTOEFL Reading Comprehension below 50 remained classified as lowon the Reading to LearnReading to Integrate Composite suggestingthe existence of a lower threshold of academic English proficiencyParticipants below this threshold were unlikely to perform well ontasks assessing Reading to Learn and Reading to Integrate Even tasksinvolving only selection (such as BC1 and BC2) rather than produc-tion of responses were difficult for this group However for thenonnative speaker readers in the mid and high reading ability groupsbasic comprehension level was not nearly as consistent a predictor of performance on the Reading to LearnReading to IntegrateComposite This was especially striking for those in the mid readingability group Approximately one fourth of the mid group droppedinto the low category on the Reading to LearnReading to IntegrateComposite indicating that they experienced more difficulty in com-pleting the new measures but approximately one fourth could indeedcategorize and synthesize information relatively well despite mid-dling performance on basic comprehension This finding suggeststhat for nonnative speaker readers in the mid reading ability categorybasic comprehension measures were insufficient to predict theirperformance on the new measures

Latricia Trites and Mary McGroarty 197

Almost two-thirds of the high basic comprehension nonnativespeakers remained high on the Reading to LearnReading toIntegrate Composite but one third dropped suggesting that theycould not categorize and synthesize information from texts as well asthey could recognize information For this third the recognition-onlytask multiple-choice basic comprehension overestimated ability tomanipulate and synthesize information For the high and mid read-ing ability nonnative speaker groups basic comprehension-only testsmay then overestimate academic English proficiency

Results for the native speakers showed a more definitive pattern ofreclassification For the low reading ability group approximatelyhalf were misclassified suggesting that the basic comprehensionmeasure underrepresented their ability to categorize and synthesizeinformation in academic texts Most of the mid reading ability groupwere reclassified as lower on the Reading to LearnReading toIntegrate Composite showing that basic comprehension resultsoverestimated ability to succeed on more challenging tasks On theother hand almost one third of the mid reading ability group didbetter on the Reading to LearnReading to Integrate Composite forthem basic comprehension underrepresented their ability to com-plete more challenging reading tasks For the high ability readersnearly one third dropped on the Reading to LearnReading toIntegrate Composite For them the basic comprehension measureoverestimated their level of academic English proficiency The other two thirds of the high reading ability group remained highsuggesting a higher threshold of academic English proficiency

Considering results from both the nonnative speakers and native speakers, we conclude that the new tasks did assess something different from basic comprehension once a lower level threshold of basic academic English proficiency had been achieved. Examination of scatterplots based on discriminant functions indicated an obvious separation between participants who could and could not perform on the Reading to Learn/Reading to Integrate Composite, thus providing some tentative evidence for concurrent validity (Messick, 1989; Chapelle, 1999). We had hoped to find clear evidence of a hierarchy, suggesting that Reading to Learn was demonstrably more difficult than basic comprehension and Reading to Integrate demonstrably more difficult than Reading to Learn, but results did not yield an obvious hierarchy. Results did, however, suggest an even simpler pattern: a dichotomy. For those nonnative speakers above the lower threshold of English language proficiency, which in our data would be approximately 500, or 49 on the scaled TOEFL Reading Comprehension scores, some could perform well on the new measures, some could not, revealing two rather than three groups on the Reading to Learn/Reading to Integrate Composite. These results lead us to speculate that the new measures tap additional skills, such as ‘sophisticated discourse processes and critical thinking skills’ (Enright and Schedl, 1999: 24), in addition to language proficiency. For native speakers, the discriminant loading suggested a possible hierarchy of difficulty, but the scatterplots and classification tables revealed a dichotomy. Reclassification based on the Reading to Learn/Reading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category, with only 16.8% of the native speakers qualifying as mid on the Reading to Learn/Reading to Integrate Composite.

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful, and they suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This corresponds to the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that, for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks, if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems | Causes | Effects | Solutions
Examples | Examples | Examples | Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points each; 1-word response 6 points; 40 maximum)
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points each; 1-word response 3 points; 55 maximum)
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points each; 1-word response 3 points; 30 maximum)
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points each; 1-word response 6 points; 90 maximum)
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point each; 5 maximum)
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point each; 10 maximum)
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point each; 6 maximum)
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point each; 5 maximum)
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
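As an arithmetic check on the Reading to Learn rubric, the following minimal Python sketch adds up the category maxima and shows how capped category scores could be tallied. It is an illustration only, not the scoring routine used in the study. The function names are hypothetical. The sketch also assumes that the two blocks of specific examples (10 maximum and 5 maximum) belong to Causes and Solutions, that scores are capped at each maximum, and that the alternative credit for an example given with its overarching cause or solution can be ignored.

```python
# Maxima and point values as read from the Appendix 1b rubric.
# Assigning the 10- and 5-example maxima to causes and solutions
# is an assumption about the table layout.
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}
POINTS_FULL = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}
POINTS_ONE_WORD = {"problems": 6, "causes": 3, "effects": 3, "solutions": 6}


def score_category(category, full, one_word, examples):
    """Hypothetical helper: score one chart column.

    full     -- number of correct phrase or sentence responses
    one_word -- number of correct single-word responses (reduced credit)
    examples -- number of correct examples (1 point each)
    Capping at the rubric maxima is an assumption of this sketch.
    """
    main = full * POINTS_FULL[category] + one_word * POINTS_ONE_WORD[category]
    main = min(main, CATEGORY_MAX[category])
    return main + min(examples, EXAMPLE_MAX[category])


def max_total():
    """Highest possible Reading to Learn score under these assumptions."""
    return sum(CATEGORY_MAX.values()) + sum(EXAMPLE_MAX.values())


if __name__ == "__main__":
    print(max_total())
    print(score_category("problems", full=2, one_word=1, examples=3))
```

Under these assumptions the category maxima (40 + 55 + 30 + 90) and the example maxima (5 + 10 + 6 + 5) sum to the 241 points stated in the rubric heading.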


Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure, followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total: 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total: 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total: 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total: 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total: 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
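Each Macrostructures band above corresponds to a range of the combined count of macrostructures identified across the two texts. The sketch below, a minimal Python illustration with a hypothetical function name, maps a pair of per-text counts to the band label. It deliberately returns the label rather than a numeric score, and it assumes each text contains four macrostructures, as the rubric states.

```python
def macrostructure_band(found_text1, found_text2, attempted=True):
    """Hypothetical helper: return the rubric band label for the number of
    macrostructures accurately identified in each of the two texts (0-4)."""
    if not attempted:
        return "No Response"
    for count in (found_text1, found_text2):
        if not 0 <= count <= 4:
            raise ValueError("each text has four macrostructures (0-4)")
    total = found_text1 + found_text2
    if total == 8:
        return "Excellent"   # Total: 8
    if total >= 6:
        return "Good"        # Total: 6/7
    if total >= 4:
        return "Fair"        # Total: 4/5
    if total >= 2:
        return "Poor"        # Total: 2/3
    return "Very Poor"       # Total: 0/1


if __name__ == "__main__":
    for pair in [(4, 4), (4, 2), (3, 1), (2, 0), (1, 0)]:
        print(pair, macrostructure_band(*pair))
```

Because the bands depend only on the combined total, the combinations the rubric lists separately (for example, 4 in one text and 2 in the other, or 3 in both) fall into the same band.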

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.



For the nonnative speakers most (818) of the participants withTOEFL Reading Comprehension below 50 remained classified as lowon the Reading to LearnReading to Integrate Composite suggestingthe existence of a lower threshold of academic English proficiencyParticipants below this threshold were unlikely to perform well ontasks assessing Reading to Learn and Reading to Integrate Even tasksinvolving only selection (such as BC1 and BC2) rather than produc-tion of responses were difficult for this group However for thenonnative speaker readers in the mid and high reading ability groupsbasic comprehension level was not nearly as consistent a predictor of performance on the Reading to LearnReading to IntegrateComposite This was especially striking for those in the mid readingability group Approximately one fourth of the mid group droppedinto the low category on the Reading to LearnReading to IntegrateComposite indicating that they experienced more difficulty in com-pleting the new measures but approximately one fourth could indeedcategorize and synthesize information relatively well despite mid-dling performance on basic comprehension This finding suggeststhat for nonnative speaker readers in the mid reading ability categorybasic comprehension measures were insufficient to predict theirperformance on the new measures

Latricia Trites and Mary McGroarty 197

Almost two-thirds of the high basic comprehension nonnativespeakers remained high on the Reading to LearnReading toIntegrate Composite but one third dropped suggesting that theycould not categorize and synthesize information from texts as well asthey could recognize information For this third the recognition-onlytask multiple-choice basic comprehension overestimated ability tomanipulate and synthesize information For the high and mid read-ing ability nonnative speaker groups basic comprehension-only testsmay then overestimate academic English proficiency

Results for the native speakers showed a more definitive pattern ofreclassification For the low reading ability group approximatelyhalf were misclassified suggesting that the basic comprehensionmeasure underrepresented their ability to categorize and synthesizeinformation in academic texts Most of the mid reading ability groupwere reclassified as lower on the Reading to LearnReading toIntegrate Composite showing that basic comprehension resultsoverestimated ability to succeed on more challenging tasks On theother hand almost one third of the mid reading ability group didbetter on the Reading to LearnReading to Integrate Composite forthem basic comprehension underrepresented their ability to com-plete more challenging reading tasks For the high ability readersnearly one third dropped on the Reading to LearnReading toIntegrate Composite For them the basic comprehension measureoverestimated their level of academic English proficiency The other two thirds of the high reading ability group remained highsuggesting a higher threshold of academic English proficiency

Considering results from both the nonnative speakers and nativespeakers we conclude that the new tasks did assess something dif-ferent from basic comprehension once a lower level threshold ofbasic academic English proficiency had been achieved Examinationof scatterplots based on discriminant functions indicated an obviousseparation between participants who could and could not perform onthe Reading to LearnReading to Integrate Composite thus provid-ing some tentative evidence for concurrent validity (Messick 1989Chapelle 1999) We had hoped to find clear evidence of a hierarchysuggesting that Reading to Learn was demonstrably more difficultthan basic comprehension and Reading to Integrate demonstrablymore difficult than Reading to Learn but results did not yield anobvious hierarchy Results did however suggest an even simplerpattern a dichotomy For those nonnative speakers above the lowerthreshold of English language proficiency which in our data wouldbe approximately 500 or 49 on the scaled TOEFL Reading

198 New tasks for reading comprehension tests

Comprehension scores some could perform well on the newmeasures some could not revealing two rather than three groups onthe Reading to LearnReading to Integrate Composite These resultslead us to speculate that the new measures tap additional skills suchas lsquosophisticated discourse processes and critical thinking skillsrsquo(Enright and Schedl 1999 24) in addition to language proficiencyFor native speakers the discriminant loading suggested a possiblehierarchy of difficulty but the scatterplots and classification tablesrevealed a dichotomy Reclassification based on the Reading toLearnReading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category with only 168of the native speakers qualifying as mid on the Reading toLearnReading to Integrate Composite

VI Conclusions

While we acknowledge that there are some limitations to this project, such as the time required to administer and score new measures, the results are illuminating and useful and suggest some considerations for future research. The first consideration relates to appropriate classification of students based on academic reading abilities. This bears on the use of TOEFL as a gatekeeper by many institutions. For TOEFL test-takers whose current TOEFL scores would be in an intermediate to high intermediate range, the new tasks could assess additional abilities relevant to academic performance. For such participants, additional test development efforts are warranted. The majority of high reading ability nonnative speakers (64.9%) could perform the new measures; a third could not. Our results suggest that some unknown number of mid reading ability nonnative speakers are more capable of succeeding at more challenging academic reading tasks than their current level of basic comprehension assessment would indicate. At the same time, there are some high and mid reading ability participants who could not perform the more demanding tasks; admitting such students directly into university programs could result in failure. These results suggest that for most students at lower levels of basic comprehension (in our study, those with TOEFL scores below 490 to 500), development of new tasks is unnecessary. Nevertheless, our results indicate that a certain small percentage (in our sample, 8 students, 18.2%) of nonnative speakers classified as low reading ability on basic comprehension did in fact perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three, TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.


Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillsdale, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997: August/September. Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillsdale, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.


Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997: November/December. On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997: December 29. Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples


Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word: 6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word: 3 points); 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points; 1 word: 3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/groundwater, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word: 6 points); 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
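The arithmetic behind the Reading to Learn rubric can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring tool used in the study: the point values (10/5/5/10 for full responses, 6/3/3/6 for one-word responses), the category maxima (40/55/30/90) and the example maxima (5/10/6/5) are read from the rubric, while the names `RUBRIC`, `CategoryTally`, `score_category`, `score_chart` and `max_possible` are hypothetical. The alternative credit for examples that restate an overarching cause or solution is deliberately left out, because the rubric does not spell out how it combines with the other points.

```python
from dataclasses import dataclass

# Point values and caps as read from the Reading to Learn rubric.
# The layout of this table is an assumption made for illustration.
RUBRIC = {
    "problems": {"full": 10, "one_word": 6, "cap": 40, "example_cap": 5},
    "causes": {"full": 5, "one_word": 3, "cap": 55, "example_cap": 10},
    "effects": {"full": 5, "one_word": 3, "cap": 30, "example_cap": 6},
    "solutions": {"full": 10, "one_word": 6, "cap": 90, "example_cap": 5},
}


@dataclass
class CategoryTally:
    """Counts of correct responses a rater credited in one chart column."""
    full: int = 0       # correct phrase or sentence responses
    one_word: int = 0   # correct single-word responses (reduced credit)
    examples: int = 0   # correct examples (1 point each)


def score_category(name: str, tally: CategoryTally) -> int:
    """Score one column, capping responses and examples at their maxima."""
    rule = RUBRIC[name]
    responses = tally.full * rule["full"] + tally.one_word * rule["one_word"]
    return min(responses, rule["cap"]) + min(tally.examples, rule["example_cap"])


def score_chart(tallies: dict) -> int:
    """Sum the four column scores; columns missing from tallies score 0."""
    return sum(
        score_category(name, tallies.get(name, CategoryTally()))
        for name in RUBRIC
    )


def max_possible() -> int:
    """Highest attainable score implied by the caps."""
    return sum(rule["cap"] + rule["example_cap"] for rule in RUBRIC.values())
```

Under these assumptions `max_possible()` returns 241, which agrees with the "0–241 total possible" stated in the rubric heading and serves as a consistency check on the maxima.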


Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's written response]


Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
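The Macrostructures bands reduce to a lookup on the combined count of macrostructures identified across the two texts (0 to 4 per text). The following Python sketch shows that mapping; it is an illustration only, and the function name `macrostructure_score` and its parameters are hypothetical rather than part of the original scoring materials.

```python
def macrostructure_score(count_text_a: int, count_text_b: int,
                         attempted: bool = True) -> int:
    """Map macrostructure counts for two texts onto the rubric bands.

    Each count is the number of macrostructures (0 to 4) accurately
    identified in one text. A participant who attempts no response
    receives 0, as in the 'No Response' band.
    """
    if not attempted:
        return 0
    for count in (count_text_a, count_text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")

    total = count_text_a + count_text_b
    if total == 8:
        return 25  # Excellent: all 4 in both texts
    if total >= 6:
        return 20  # Good: total of 6 or 7
    if total >= 4:
        return 15  # Fair: total of 4 or 5
    if total >= 2:
        return 10  # Poor: total of 2 or 3
    return 5       # Very Poor: total of 0 or 1
```

Every pairing listed in the rubric falls into the band given by its total (for example, 4 and 2, or 3 and 3, both total 6 and score 20), so only the sum needs to be checked.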


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.


Results for the native speakers showed a more definitive pattern ofreclassification For the low reading ability group approximatelyhalf were misclassified suggesting that the basic comprehensionmeasure underrepresented their ability to categorize and synthesizeinformation in academic texts Most of the mid reading ability groupwere reclassified as lower on the Reading to LearnReading toIntegrate Composite showing that basic comprehension resultsoverestimated ability to succeed on more challenging tasks On theother hand almost one third of the mid reading ability group didbetter on the Reading to LearnReading to Integrate Composite forthem basic comprehension underrepresented their ability to com-plete more challenging reading tasks For the high ability readersnearly one third dropped on the Reading to LearnReading toIntegrate Composite For them the basic comprehension measureoverestimated their level of academic English proficiency The other two thirds of the high reading ability group remained highsuggesting a higher threshold of academic English proficiency

Considering results from both the nonnative speakers and nativespeakers we conclude that the new tasks did assess something dif-ferent from basic comprehension once a lower level threshold ofbasic academic English proficiency had been achieved Examinationof scatterplots based on discriminant functions indicated an obviousseparation between participants who could and could not perform onthe Reading to LearnReading to Integrate Composite thus provid-ing some tentative evidence for concurrent validity (Messick 1989Chapelle 1999) We had hoped to find clear evidence of a hierarchysuggesting that Reading to Learn was demonstrably more difficultthan basic comprehension and Reading to Integrate demonstrablymore difficult than Reading to Learn but results did not yield anobvious hierarchy Results did however suggest an even simplerpattern a dichotomy For those nonnative speakers above the lowerthreshold of English language proficiency which in our data wouldbe approximately 500 or 49 on the scaled TOEFL Reading

198 New tasks for reading comprehension tests

Comprehension scores some could perform well on the newmeasures some could not revealing two rather than three groups onthe Reading to LearnReading to Integrate Composite These resultslead us to speculate that the new measures tap additional skills suchas lsquosophisticated discourse processes and critical thinking skillsrsquo(Enright and Schedl 1999 24) in addition to language proficiencyFor native speakers the discriminant loading suggested a possiblehierarchy of difficulty but the scatterplots and classification tablesrevealed a dichotomy Reclassification based on the Reading toLearnReading to Integrate Composite nearly eliminated the mid basic comprehension reading ability category with only 168of the native speakers qualifying as mid on the Reading toLearnReading to Integrate Composite

VI Conclusions

While we acknowledge that there are some limitations to this projectsuch as the time required to administer and score new measuresthe results are illuminating useful and suggest some considerationsfor future research The first consideration relates to appropriateclassification of students based on academic reading abilities Thiscorresponds to the role of TOEFL as a gatekeeper by many institu-tions For TOEFL test-takers whose current TOEFL scores would bein an intermediate to high intermediate range the new tasks couldassess additional abilities relevant to academic performance Forsuch participants additional test development efforts are warrantedThe majority of high reading ability nonnative speakers (649)could perform the new measures a third could not Our resultssuggest that some unknown number of mid reading ability nonnativespeakers are more capable of succeeding at more challenging aca-demic reading tasks than their current level of basic comprehensionassessment would indicate At the same time there are some highand mid reading ability participants who could not perform the moredemanding tasks admitting such students directly into universityprograms could result in failure These results suggest that for moststudents at lower levels of basic comprehension (in our study thosewith TOEFL scores below 490 to 500) development of new tasks isunnecessary Nevertheless our results indicate that a certain smallpercentage (in our sample 8 students 182) of nonnative speakersclassified as low reading ability on basic comprehension did in fact

Latricia Trites and Mary McGroarty 199

perform adequately on more challenging tasks like Reading to Learnand Reading to Integrate Given the very large number of TOEFLtest-takers around the world this finding merits further research Itwould be potentially unfair to exclude such students from universitystudy based on basic comprehension-only measures such as the cur-rent TOEFL We would urge institutions that use TOEFL to considerscores on these new more demanding tasks if doing so wouldaddress institutional needs According to interview data collected inconjunction with this project (Trites 2000 Chapter 6) participantsperceived that successful completion of the new tasks required athorough understanding of the texts whereas the multiple-choicetests were less demanding requiring only superficial grasp of con-tent This finding corroborates the observation of Freedle and Kostin(1999 3) who note that often examinees do not need to comprehendthe accompanying text to answer the test item Therefore for allTOEFL test-takers incorporating more challenging tasks such as theReading to Learn and Reading to Integrate measures into typicalEnglish language instruction could have positive washback effects

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three. TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement, 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems, Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillside, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. US News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects, or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems    Causes    Effects    Solutions

Examples    Examples    Examples    Examples

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points): 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points): 90 maximum
Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Public outcry over stance of big business/government
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point): 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point): 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point): 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point): 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
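
As a rough check on the arithmetic of the Reading to Learn rubric, the category maxima can be summed in a short Python sketch. The dictionary and function names are illustrative assumptions and not part of the original scoring materials. Capping each category at its stated maximum is also an assumption about how the rubric would be applied, and the alternative 5- or 10-point credit for examples paired with an overarching cause or solution is not modeled.

```python
# Maxima for the four main categories, as stated in the rubric.
CATEGORY_MAXIMA = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}

# Maxima for the 1-point examples listed under each category.
EXAMPLE_MAXIMA = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def max_total():
    """Return the highest possible score implied by the stated maxima."""
    return sum(CATEGORY_MAXIMA.values()) + sum(EXAMPLE_MAXIMA.values())


def capped_score(category_points, example_points):
    """Sum raw points, capping each category at its rubric maximum.

    Both arguments are dicts keyed by category name; missing keys count as 0.
    The capping rule is an assumption made for this sketch.
    """
    total = 0
    for name, cap in CATEGORY_MAXIMA.items():
        total += min(category_points.get(name, 0), cap)
    for name, cap in EXAMPLE_MAXIMA.items():
        total += min(example_points.get(name, 0), cap)
    return total


if __name__ == "__main__":
    # The stated maxima add up to the 241 points given in the rubric title.
    print(max_total())
```

The sum of the four category maxima (215) and the four example maxima (26) agrees with the 0–241 range given in the rubric title.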

Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Lined space for the participant’s essay]

Appendix 2b Reading to Integrate scoring rubric

Integration

50  Excellent  Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40  Good  Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30  Fair  Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20  Poor  Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10  Very Poor  Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0  No Response  Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25  Excellent (Total 8)  Accurately identifies all 4 macrostructures present in both texts.

20  Good (Total 6/7)  Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15  Fair (Total 4/5)  Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10  Poor (Total 2/3)  Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5  Very Poor (Total 0/1)  Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0  No Response  Participant does not attempt response.
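
The Macrostructures bands above depend only on the combined number of macrostructures identified across the two texts, which the following Python sketch makes explicit. The function name and the `attempted` flag are hypothetical, introduced only for illustration. The sketch returns the band label rather than a numeric score, and it does not model the minimum-score provision stated in the rubric's introductory sentence.

```python
def macrostructure_band(count_text_a, count_text_b, attempted=True):
    """Map per-text macrostructure counts (0-4 each) to a rubric band label.

    Thresholds follow the 'Total' ranges in the rubric:
    8 Excellent, 6-7 Good, 4-5 Fair, 2-3 Poor, 0-1 Very Poor.
    """
    if not attempted:
        return "No Response"
    for count in (count_text_a, count_text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")
    total = count_text_a + count_text_b
    if total == 8:
        return "Excellent"
    if total >= 6:
        return "Good"
    if total >= 4:
        return "Fair"
    if total >= 2:
        return "Poor"
    return "Very Poor"


if __name__ == "__main__":
    # Every combination named in the rubric text falls in the expected band.
    for pair in [(4, 4), (4, 3), (3, 3), (4, 2), (3, 2), (4, 0), (2, 1), (1, 0)]:
        print(pair, macrostructure_band(*pair))
```

Because each band is defined by the total alone, combinations such as 4 and 0 or 3 and 1 land in the same band as 2 and 2, as the parenthetical alternatives in the rubric indicate.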

Use of relevant details

5  Excellent  Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4  Good  Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3  Fair  Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2  Poor  Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1  Very Poor  No details used or listed.

0  No Response  Participant does not attempt response.


perform adequately on more challenging tasks like Reading to Learnand Reading to Integrate Given the very large number of TOEFLtest-takers around the world this finding merits further research Itwould be potentially unfair to exclude such students from universitystudy based on basic comprehension-only measures such as the cur-rent TOEFL We would urge institutions that use TOEFL to considerscores on these new more demanding tasks if doing so wouldaddress institutional needs According to interview data collected inconjunction with this project (Trites 2000 Chapter 6) participantsperceived that successful completion of the new tasks required athorough understanding of the texts whereas the multiple-choicetests were less demanding requiring only superficial grasp of con-tent This finding corroborates the observation of Freedle and Kostin(1999 3) who note that often examinees do not need to comprehendthe accompanying text to answer the test item Therefore for allTOEFL test-takers incorporating more challenging tasks such as theReading to Learn and Reading to Integrate measures into typicalEnglish language instruction could have positive washback effects

The second area of consideration is the focus on additional rele-vant research For predictive validity it is important to determine thecorrelation between the Reading to LearnReading to IntegrateComposite measures and actual academic performance in universityclasses Correlations might differ depending on the degree of catego-rization and synthesis required by different major fields of studyAnother area for future research is related to the novelty of theReading to Learn task as an assessment technique Current pedagog-ical trends in literacy instruction emphasize the use of graphic organ-izers as a means to understand and manipulate information in textsTo our knowledge graphic organizers are typically used as classroomactivities rather than assessment tools They have good potential foruse in assessment if students are familiar with them and if appropri-ate scoring systems can be developed This project demonstrates that it is possible though labor intensive to develop reliable scoringsystems Further research is needed to explore refinement of this tasktype and related scoring systems for use in testing programs withlarge numbers of test-takers and scorers Although the Reading toIntegrate task (generating a synthesis) was more familiar the scoringsystem was innovative because it reflected a readerrsquos ability torecognize textual frames as well as integrate information If testdevelopers are interested in the abilities of nonnative speaker readersto perform such tasks further development of similar tasks andscoring systems is warranted While a substantial investment of time

200 New tasks for reading comprehension tests

would be required to refine the administration and scoring systemsneeded for more complex tasks such as these it should be weighedagainst the possible danger of under-representing the likelihood ofacademic success based on results of the basic comprehension-onlymeasures still most often used to assess academic proficiency

Acknowledgements

This project was funded by a grant from Educational Testing Serviceas part of the TOEFL 2000 effort We appreciate the cooperation ofthe ETS staff members who assisted in the development of passagespecific tasks and other aspects of the research however no officialendorsement of Educational Testing Service should be inferred

VII References

Bachman L 2000 Modern language testing at the turn of the century assur-ing that what we count counts Language Testing 17 1ndash42

Biber D 1993 Using register-diversified corpora for general language studiesComputational Linguistics 19 219ndash41

Britt M Rouet J and Perfetti C 1996 Using hypertext to study and reasonabout historical evidence In Rouet J Levonen J Dillon A and Spiro Reditors Hypertext and cognition Mahwah NJ Lawrence Erlbaum 43ndash72

Chapelle C 1999 Validity in language assessment Annual Review of AppliedLinguistics 19 1ndash19

Educational Testing Service 1997 TOEFL test and score manual PrincetonNJ Educational Testing Service

mdashmdash 1998 Draft TOEFL 2000 research agenda framework areas of researchResearch agenda three TOEFL 2000 internal document Princeton NJEducational Testing Service

Eignor D Taylor C Kirsch I and Jamieson J 1998 Development of ascale for assessing the level of computer familiarity of TOEFL exami-nees TOEFL Research Report No 60 Princeton NJ EducationalTesting Service

Enright M and Schedl M 1999 Reading for a reason using reader purposeto guide test design TOEFL 2000 Internal Report Princeton NJEducational Testing Service

Enright M Grabe W Mosenthal P Mulcahy-Ernt P and Schedl M1998 A TOEFL 2000 framework for testing reading comprehension aworking paper Princeton NJ Educational Testing Service

Foltz P 1996 Comprehension coherence and strategies in hypertext InRouet J Levonen J Dillon A and Spiro R editors Hypertext and cog-nition Mahwah NJ Lawrence Erlbaum 109ndash36

Latricia Trites and Mary McGroarty 201

Freedle R and Kostin I 1999 Does the text matter in a multiple choice testof comprehension The case for the construct validity of TOEFLrsquosminitalks Language Testing 16 2ndash32

Goldman S 1997 Learning from text reflections on the past and suggestionsfor the future Discourse Processes 23 357ndash98

Hayes J and Hatch J 1999 Issues in measuring reliability correlationversus percentage of agreement Written Communication 16 354ndash67

Huberty C 1994 Applied discriminant analysis New York John WileyJamieson J Campbell J Norfleet L and Berbisada N 1993 Reliability

of a computerized scoring routine for an open-ended task System 21305ndash22

Jamieson J Norfleet L and Berbisada N 1993 Successes failures anddropouts in computer-assisted language lessons Computer AssistedEnglish Language Learning Journal 4 12ndash20

Klecka W 1980 Discriminant analysis In Lewis-Beck M editorQuantitative applications in the social sciences Volume 19 NewburyPark CA Sage

Lehto M Zhu W and Carpenter B 1995 The relative effectiveness ofhypertext and text International Journal of Human-ComputerInteraction 7 293ndash313

McNamara D and Kintsch W 1996 Learning from texts effects of prior knowl-edge and text coherence Discourse Processes 22 247ndash88

Messick S 1989 Validity In Linn RL editor Educational measurement 3rdedition New York American Council on Education Macmillan 13ndash103

Meyer B 1985a Prose analyses purposes procedures and problems InBritton B and Black J editors Understanding expository text HillsideNJ Lawrence Erlbaum 11ndash64

mdashmdash 1985b Prose analysis purposes procedures and problems Part 2 InBritton B and Black J editors Understanding expository text HillsideNJ Lawrence Erlbaum 269ndash97

Monks V 1997 AugustSeptember Two views same waterway NationalWildlife 35 36ndash37

Pellegrino J Baxter G and Glaser R 1999 Addressing the lsquotwo disci-plinesrsquo problem linking theories of cognition and learning with assess-ment and instructional practice Review of Research in Education 24307ndash53

Perfetti C 1997 Sentences individual differences and multiple texts threeissues in text comprehension Discourse Processes 23 337ndash55

Perfetti C Britt MA and Georgi M 1995 Text-based learning andreasoning studies in history Hillside NJ Lawrence Erlbaum

Perfetti C Marron M and Foltz P 1996 Sources of comprehension failuretheoretical perspectives and case studies In Cornoldi C and Oakhill Jeditors Reading comprehension difficulties processes and interventionMahwah NJ Lawrence Erlbaum 137ndash65

202 New tasks for reading comprehension tests

Reinking D 1988 Computer-mediated text and comprehension differencesthe role of reading time reader preference and estimation of learningReading Research Quarterly 23 484ndash500

Reinking D and Schreiner R 1985 The effects of computer-mediated texton measures of reading comprehension and reading behavior ReadingResearch Quarterly 20 536ndash53

Spivey N 1997 The constructivist metaphor reading writing and themaking of meaning San Diego CA Academic Press

Stevens J 1996 Applied multivariate statistics for the social sciences 3rdedition Mahwah NJ Lawrence Erlbaum

Tabachnick B and Fidell L 1996 Using multivariate statistics 3rd editionNew York Harper Collins

Taylor C Jamieson J Eignor D and Kirsch I 1998 The relationshipbetween computer familiarity and performance on computer-basedTOEFL test tasks TOEFL Research Report No 61 Princeton NJEducational Testing Service

Tennesen M 1997 NovemberDecember On a clear day National Parks 7126ndash9

Trites L 2000 Beyond basic comprehension reading to learn and reading tointegrate for native and non-native speakers Unpublished doctoraldissertation Northern Arizona University Flagstaff AZ

Van den Berg S and Watt J 1991 Effects of educational setting on studentresponses to structured hypertext Journal of Computer-BasedInstruction 18 118ndash24

Van Dijk TA and Kintsch W 1983 Strategies of discourse comprehensionNew York Academic Press

Wiley J and Voss J 1999 Constructing arguments from multiple sourcestasks that promote understanding and not just memory for text Journalof Educational Psychology 91 310ndash11

Zimmerman T 1997 December 29 Filter it with billions and billions of oys-ters how to revive the Chesapeake Bay US News and World Report 12363

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems | Causes | Effects | Solutions

Examples | Examples | Examples | Examples
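The point scheme stated in the directions can be illustrated with a short sketch. It assumes only the per-response values given above (10, 5 and 1 points); the function name `chart_score`, the dictionary layout and the sample counts are hypothetical, and the sketch ignores the per-category maxima and the reduced credit for one-word answers that the full rubric in Appendix 1b applies.

```python
# Points per correct response, as stated in the Appendix 1a directions.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_score(correct_counts):
    """Sum the points for correct responses in each category.

    correct_counts maps a category name to the number of correct
    responses. Incorrect responses are simply not counted, which
    mirrors the 'no penalty' rule in the directions.
    """
    unknown = set(correct_counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    if any(n < 0 for n in correct_counts.values()):
        raise ValueError("counts must be non-negative")
    return sum(POINTS[cat] * n for cat, n in correct_counts.items())


# Hypothetical response: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
sample = {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
print(chart_score(sample))
```

For the hypothetical counts shown, the sum is 20 + 15 + 5 + 20 + 4 points.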

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points), 1 word (6 points): 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points), 1 word (3 points): 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points), 1 word (3 points): 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil and/or plants
• Aquatic life injured (damaged)

Solutions (10 points), 1 word (6 points): 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas (regulations)
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Public outcry over stance of big business/government
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications: Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant’s response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts. (Or 4 in one and 2 in the other.)

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other. (Or 4 in one and 1 in the other.) OR accurately identifies 2 of the four macrostructures in both texts. (Or 4 in one and 0 in the other, or 3 in one and 1 in the other.)

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other. (Or 3 in one and 0 in the other.) Accurately identifies 1 of the four macrostructures in both texts. (Or 2 in one and 0 in the other.)

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
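The Macrostructures bands in this rubric amount to a lookup on the total number of macrostructures identified across the two texts. The following is a minimal sketch of that lookup, not the authors' scoring procedure: the function name `macrostructure_band` and its parameters are hypothetical, and the band values (25, 20, 15, 10, 5, 0) are taken exactly as printed in the rubric.

```python
def macrostructure_band(found_text_a, found_text_b, attempted=True):
    """Map macrostructure counts for two texts to a rubric band value.

    Each text has four macrostructures, so each count must be 0-4.
    Band values are used exactly as they appear in the rubric.
    """
    if not attempted:
        return 0  # No Response
    for count in (found_text_a, found_text_b):
        if not 0 <= count <= 4:
            raise ValueError("each count must be between 0 and 4")
    total = found_text_a + found_text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6 or 7)
    if total >= 4:
        return 15  # Fair (total 4 or 5)
    if total >= 2:
        return 10  # Poor (total 2 or 3)
    return 5       # Very Poor (total 0 or 1)


# 3 macrostructures in one text and 1 in the other gives a total of 4.
print(macrostructure_band(3, 1))
```

Keying the bands on the total rather than on each pair of counts reproduces every combination the rubric lists, for example 4 and 0, 3 and 1, or 2 and 2 all fall in the Fair band.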


perform adequately on more challenging tasks like Reading to Learn and Reading to Integrate. Given the very large number of TOEFL test-takers around the world, this finding merits further research. It would be potentially unfair to exclude such students from university study based on basic comprehension-only measures such as the current TOEFL. We would urge institutions that use TOEFL to consider scores on these new, more demanding tasks if doing so would address institutional needs. According to interview data collected in conjunction with this project (Trites, 2000: Chapter 6), participants perceived that successful completion of the new tasks required a thorough understanding of the texts, whereas the multiple-choice tests were less demanding, requiring only superficial grasp of content. This finding corroborates the observation of Freedle and Kostin (1999: 3), who note that often examinees do not need to comprehend the accompanying text to answer the test item. Therefore, for all TOEFL test-takers, incorporating more challenging tasks such as the Reading to Learn and Reading to Integrate measures into typical English language instruction could have positive washback effects.

The second area of consideration is the focus on additional relevant research. For predictive validity, it is important to determine the correlation between the Reading to Learn/Reading to Integrate Composite measures and actual academic performance in university classes. Correlations might differ depending on the degree of categorization and synthesis required by different major fields of study. Another area for future research is related to the novelty of the Reading to Learn task as an assessment technique. Current pedagogical trends in literacy instruction emphasize the use of graphic organizers as a means to understand and manipulate information in texts. To our knowledge, graphic organizers are typically used as classroom activities rather than assessment tools. They have good potential for use in assessment if students are familiar with them and if appropriate scoring systems can be developed. This project demonstrates that it is possible, though labor intensive, to develop reliable scoring systems. Further research is needed to explore refinement of this task type and related scoring systems for use in testing programs with large numbers of test-takers and scorers. Although the Reading to Integrate task (generating a synthesis) was more familiar, the scoring system was innovative because it reflected a reader’s ability to recognize textual frames as well as integrate information. If test developers are interested in the abilities of nonnative speaker readers to perform such tasks, further development of similar tasks and scoring systems is warranted. While a substantial investment of time would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.

VII References

Bachman, L. 2000: Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1–42.

Biber, D. 1993: Using register-diversified corpora for general language studies. Computational Linguistics 19, 219–41.

Britt, M., Rouet, J. and Perfetti, C. 1996: Using hypertext to study and reason about historical evidence. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 43–72.

Chapelle, C. 1999: Validity in language assessment. Annual Review of Applied Linguistics 19, 1–19.

Educational Testing Service 1997: TOEFL test and score manual. Princeton, NJ: Educational Testing Service.

Educational Testing Service 1998: Draft TOEFL 2000 research agenda framework: areas of research. Research agenda three: TOEFL 2000 internal document. Princeton, NJ: Educational Testing Service.

Eignor, D., Taylor, C., Kirsch, I. and Jamieson, J. 1998: Development of a scale for assessing the level of computer familiarity of TOEFL examinees. TOEFL Research Report No. 60. Princeton, NJ: Educational Testing Service.

Enright, M. and Schedl, M. 1999: Reading for a reason: using reader purpose to guide test design. TOEFL 2000 Internal Report. Princeton, NJ: Educational Testing Service.

Enright, M., Grabe, W., Mosenthal, P., Mulcahy-Ernt, P. and Schedl, M. 1998: A TOEFL 2000 framework for testing reading comprehension: a working paper. Princeton, NJ: Educational Testing Service.

Foltz, P. 1996: Comprehension, coherence, and strategies in hypertext. In Rouet, J., Levonen, J., Dillon, A. and Spiro, R., editors, Hypertext and cognition. Mahwah, NJ: Lawrence Erlbaum, 109–36.

Freedle, R. and Kostin, I. 1999: Does the text matter in a multiple choice test of comprehension? The case for the construct validity of TOEFL’s minitalks. Language Testing 16, 2–32.

Goldman, S. 1997: Learning from text: reflections on the past and suggestions for the future. Discourse Processes 23, 357–98.

Hayes, J. and Hatch, J. 1999: Issues in measuring reliability: correlation versus percentage of agreement. Written Communication 16, 354–67.

Huberty, C. 1994: Applied discriminant analysis. New York: John Wiley.

Jamieson, J., Campbell, J., Norfleet, L. and Berbisada, N. 1993: Reliability of a computerized scoring routine for an open-ended task. System 21, 305–22.

Jamieson, J., Norfleet, L. and Berbisada, N. 1993: Successes, failures, and dropouts in computer-assisted language lessons. Computer Assisted English Language Learning Journal 4, 12–20.

Klecka, W. 1980: Discriminant analysis. In Lewis-Beck, M., editor, Quantitative applications in the social sciences, Volume 19. Newbury Park, CA: Sage.

Lehto, M., Zhu, W. and Carpenter, B. 1995: The relative effectiveness of hypertext and text. International Journal of Human-Computer Interaction 7, 293–313.

McNamara, D. and Kintsch, W. 1996: Learning from texts: effects of prior knowledge and text coherence. Discourse Processes 22, 247–88.

Messick, S. 1989: Validity. In Linn, R.L., editor, Educational measurement. 3rd edition. New York: American Council on Education/Macmillan, 13–103.

Meyer, B. 1985a: Prose analyses: purposes, procedures, and problems. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 11–64.

Meyer, B. 1985b: Prose analysis: purposes, procedures, and problems: Part 2. In Britton, B. and Black, J., editors, Understanding expository text. Hillside, NJ: Lawrence Erlbaum, 269–97.

Monks, V. 1997, August/September: Two views, same waterway. National Wildlife 35, 36–37.

Pellegrino, J., Baxter, G. and Glaser, R. 1999: Addressing the ‘two disciplines’ problem: linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education 24, 307–53.

Perfetti, C. 1997: Sentences, individual differences, and multiple texts: three issues in text comprehension. Discourse Processes 23, 337–55.

Perfetti, C., Britt, M.A. and Georgi, M. 1995: Text-based learning and reasoning: studies in history. Hillside, NJ: Lawrence Erlbaum.

Perfetti, C., Marron, M. and Foltz, P. 1996: Sources of comprehension failure: theoretical perspectives and case studies. In Cornoldi, C. and Oakhill, J., editors, Reading comprehension difficulties: processes and intervention. Mahwah, NJ: Lawrence Erlbaum, 137–65.

202 New tasks for reading comprehension tests

Reinking D 1988 Computer-mediated text and comprehension differencesthe role of reading time reader preference and estimation of learningReading Research Quarterly 23 484ndash500

Reinking D and Schreiner R 1985 The effects of computer-mediated texton measures of reading comprehension and reading behavior ReadingResearch Quarterly 20 536ndash53

Spivey N 1997 The constructivist metaphor reading writing and themaking of meaning San Diego CA Academic Press

Stevens J 1996 Applied multivariate statistics for the social sciences 3rdedition Mahwah NJ Lawrence Erlbaum

Tabachnick B and Fidell L 1996 Using multivariate statistics 3rd editionNew York Harper Collins

Taylor C Jamieson J Eignor D and Kirsch I 1998 The relationshipbetween computer familiarity and performance on computer-basedTOEFL test tasks TOEFL Research Report No 61 Princeton NJEducational Testing Service

Tennesen M 1997 NovemberDecember On a clear day National Parks 7126ndash9

Trites L 2000 Beyond basic comprehension reading to learn and reading tointegrate for native and non-native speakers Unpublished doctoraldissertation Northern Arizona University Flagstaff AZ

Van den Berg S and Watt J 1991 Effects of educational setting on studentresponses to structured hypertext Journal of Computer-BasedInstruction 18 118ndash24

Van Dijk TA and Kintsch W 1983 Strategies of discourse comprehensionNew York Academic Press

Wiley J and Voss J 1999 Constructing arguments from multiple sourcestasks that promote understanding and not just memory for text Journalof Educational Psychology 91 310ndash11

Zimmerman T 1997 December 29 Filter it with billions and billions of oys-ters how to revive the Chesapeake Bay US News and World Report 12363

Appendix 1a Chart completion task

Directions Complete the following chart Fill in as much detail aspossible from the text read by categorizing the information into thedifferent areas on the chart Do not use single words for yourresponses form your responses in phrases or complete sentencesInclude examples from the text

Make no judgments about the accuracy of causes or effects or theeffectiveness of the solutions mentioned in the text Solutions are seen

Latricia Trites and Mary McGroarty 203

204 New tasks for reading comprehension tests

as any action taken in response to the problem(s) Solutions can takemany forms such as proposed solutions attempted solutions or failedsolutions Also space is provided under each category for examplesExamples are specific examples found in the text that are used by theauthor(s) to exemplify the problems causes effects or solutions inthe text However there may not be examples for every category

Points will be awarded for correct responses only There is no penaltyfor incorrect responses Points will be awarded in the following manner

Problems and Solutions 10 points eachCauses and Effects 5 points eachExamples 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples

Latricia Trites and Mary McGroarty 205

Ap

pen

dix

1b

Rea

din

g t

o L

earn

sco

rin

g r

ub

ric

0ndash2

41 t

ota

l po

ssib

le

Pro

ble

ms

(10

po

ints

) 1

wo

rd

Cau

ses

(5p

oin

ts)

1 w

ord

E

ffec

ts (

5 p

oin

ts)

1 w

ord

S

olu

tio

ns

(10

po

ints

en

d)

1 w

ord

(6 p

oin

ts)

40

max

imu

m(3

po

ints

) 5

5 m

axim

um

(3 p

oin

ts)

30

max

imu

m(6

po

ints

) 9

0 m

axim

um

bullA

ir p

ollu

tio

nS

mo

g i

n t

he

bullS

ulf

ur

nit

rog

en e

mis

sio

ns

bullTre

es a

nd

pla

nts

aff

ecte

dS

tric

ter

En

vir

on

men

tal

Law

s

Nati

on

al

Park

sbull

Aci

d r

ain

(in

jure

d)

gro

wth

hin

der

ed(r

eso

luti

on

s

acts

)

bullO

pp

osit

ion

or

Ign

ori

ng

bull

Gro

un

d-l

evel

ozo

ne

bullV

isib

ilit

y d

ecre

ased

bull19

77 C

lean

Air

Act

(or

lack

of

coo

per

atio

n)

to

Urb

an

In

du

str

ial

em

issio

ns

bullM

eta

ls l

oo

sen

ed

into

wat

ers

Am

en

dm

en

ts l

ab

elin

gN

Ps

envi

ron

men

tal s

tan

dar

ds

bullA

uto

mo

bile

emis

sio

ns

(su

rfac

eg

rou

nd

wat

er

as C

lass I

are

as

(reg

ula

tio

ns)

bullP

ow

er

pla

nts

fac

tori

esp

lan

ts

wat

er p

ollu

ted

)bull

Reg

ion

al h

aze

reg

ula

tio

ns

bullN

o t

rue

lsquopo

int

sou

rces

rsquoem

issi

on

sbull

Nu

trie

nts

rem

oved

pro

po

sed

by

the

EPA

(myri

ad

of

sm

aller

po

llu

tio

n

bullE

mis

sio

ns

fro

m S

mo

kesta

cks

(lea

ched

) fr

om

so

il bull

Red

ucti

on

of

allo

wab

leso

urc

es

rath

er t

han

on

e (c

him

neys)

and

or

pla

nts

po

llu

tio

n s

tan

dard

s f

rom

la

rge

sou

rce)

bullE

mis

sio

ns

fro

m K

iln

sbull

Pu

blic o

utc

ryo

ver

stan

ce

ind

ust

rybull

Nati

on

al

Park

s l

imit

ed

bull

Un

reg

ula

ted

po

lluti

on

sm

og

o

f b

ig b

usi

nes

sg

ove

rnm

ent

bullC

lean

Air

Act

set

ju

risd

icti

on

of

po

lluti

on

(f

rom

oth

er c

ou

ntr

ies

Mex

ico

)bull

Aq

uati

c l

ife in

jure

d (

dam

aged

)vis

ibilit

y g

oals

sou

rces

ou

tsid

e o

f th

e p

arks

bullS

mo

ke f

rom

Co

ntr

olled

bu

rns

bullO

bje

cti

ng

to c

on

str

ucti

on

bullE

mis

sio

ns

fro

m la

rge

citi

esp

erm

its

bullId

en

tifi

cati

on

of

po

lluti

on

so

urc

es

bullIn

du

str

y m

od

ificati

on

s

Insta

llati

on

o

f d

evic

es

(scr

ub

ber

s)

to r

edu

ce

po

lluti

on

bullP

ub

lic p

ressu

re t

op

rote

ct p

arks

206 New tasks for reading comprehension testsA

pp

en

dix

1b

(co

nti

nu

ed)

Exam

ple

s (

1 p

oin

t) 5

maxim

um

Exam

ple

s (

1 p

oin

t o

r 5 p

oin

ts

Exam

ple

s (

1 p

oin

t) 6

maxim

um

Exam

ple

s (

1 p

oin

t o

r 10 p

oin

ts

wit

h o

vera

rch

ing

cau

se

wit

h o

vera

rch

ing

so

luti

on

liste

d a

bo

ve)

liste

d a

bo

ve)

bullG

reat

Sm

oky M

ou

nta

inbull

Au

tom

ob

ile

emis

sio

ns

bullLe

aves

of

pla

nts

tu

rnin

g

bull19

77 C

lean

Air

Act

N

atio

nal

Par

kbull

Po

wer

pla

nts

fac

tori

es

pu

rple

an

d b

row

n (

stip

plin

g)

Am

en

dm

en

ts

lab

elin

gN

Ps

bullG

ran

d C

an

yo

np

lan

ts e

mis

sio

ns

bullH

ind

erin

g p

ho

tosyn

thesis

of

as C

lass I

are

as

Nat

ion

al P

ark

bullE

mis

sio

ns

fro

m

pla

nts

an

d t

rees

bullS

mo

ky M

ou

nta

in S

ou

rces

S

mo

kesta

cks

(ch

imn

eys)

bullR

ed

ucti

on

of

vis

ibilit

yat

bullR

egio

nal

haz

ere

gu

lati

on

s

fro

m O

hio

New

Yo

rk

bullE

mis

sio

ns

fro

m K

iln

sG

ran

d C

an

yo

np

rop

ose

d b

y th

e E

PA

Atl

anta

etc

bull

Un

reg

ula

ted

po

lluti

on

sm

og

bull

Vis

ibili

ty r

educ

ed 9

0 o

f bull

Red

ucti

on

of

allo

wab

lebull

Gra

nd

Can

yon

So

urc

es(f

rom

oth

er c

ou

ntr

ies

Mex

ico

)th

e da

ysp

ollu

tio

n s

tan

dard

s f

rom

fro

m C

A N

V U

T A

Z N

M

bullS

mok

e fr

om C

on

tro

lled

bu

rns

bullS

ee v

ague

blu

e m

asse

sin

du

stry

and

Mex

ico

bullE

mis

sio

ns

fro

m la

rge

citi

esbull

See

hal

f as

far

as

in 1

919

bullT

N L

utt

rell

corp

bu

ildin

g

per

mit

sit

uat

ion

Exam

ple

s (

1 p

oin

t)

10 m

axim

um

Exam

ple

s (

1 p

oin

t)

5 m

axim

um

bullN

avaj

o G

ener

atin

g S

tati

on

bull

Ob

ject

ing

to

kiln

s in

TN

Pag

e A

Z (

Po

wer

Pla

nt

AZ

)bull

Res

earc

her

s u

sin

g s

cien

tifi

cbull

Po

wer

pla

nts

in T

N a

nd

OH

tech

no

log

y to

ID s

ou

rces

rive

r va

lley

(rad

ioac

tive

iso

top

es)

bullS

ou

ther

n C

alif

orn

ia E

dis

on

bull

So

uth

ern

Cal

ifo

rnia

Ed

iso

nP

lan

t L

aug

hlin

NV

Pla

nt

in L

aug

hlin

NV

bull

Ten

nes

see

Lutt

rell

Kiln

s T

Nid

enti

fied

as

po

int

sou

rce

bullIn

du

stry

in A

Zbull

Scru

bber

sin

stal

led

at

bullC

ars

in C

AN

avaj

o G

ener

atin

g P

lan

tbull

Sm

oke

stac

ks in

NM

NV

UT

bull10

r

edu

ctio

n o

f p

ollu

tio

nbull

Los

An

gel

esin

10ndash

15 y

ears

(g

oal

of

no

bullA

tlan

tam

an-m

ade

po

lluti

on

)bull

New

Yo

rk

Latricia Trites and Mary McGroarty 207

Appendix 2a Reading to Integrate task integration activity

Directions For the next 15 minutes reflect on what you read and compose a short essay that combines the information in thematerial read and makes connections across the range of ideas andarguments presented

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well-developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well-developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
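The macrostructure bands above amount to a lookup on the combined number of macrostructures identified across the two texts. The following Python sketch illustrates that reading of the rubric. The function name and arguments are illustrative assumptions rather than part of the original scoring materials, and the band totals (6–7, 4–5, 2–3, 0–1) are inferred from the combinations listed under each band.

```python
def macrostructure_band(text_a: int, text_b: int, attempted: bool = True) -> int:
    """Return the rubric score (0-25) for macrostructures identified.

    text_a, text_b: macrostructures accurately identified in each text (0-4).
    attempted: False when the participant gave no response at all.
    """
    if not attempted:
        return 0  # No Response
    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has at most 4 macrostructures")
    total = text_a + text_b
    if total == 8:
        return 25  # Excellent: all 4 in both texts
    if total >= 6:
        return 20  # Good: total 6-7
    if total >= 4:
        return 15  # Fair: total 4-5
    if total >= 2:
        return 10  # Poor: total 2-3
    return 5       # Very Poor: total 0-1
```

Under this sketch, every combination named in a band (for example, 4 in one text and 2 in the other) falls in the same band as the other combinations with the same total, which is consistent with how the rubric groups them.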

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
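The percentage thresholds in the details rubric can be approximated from counts of relevant and irrelevant details. The sketch below is a simplification under stated assumptions: a rater has already classified each detail, "multiple" is taken to mean at least two, and the function name is hypothetical. Whether details are used effectively remains a rater judgment that the code does not capture.

```python
def details_band(relevant: int, irrelevant: int, attempted: bool = True) -> int:
    """Approximate the 'Use of relevant details' score (0-5) from detail counts.

    relevant: details the rater judged relevant and accurate.
    irrelevant: details judged irrelevant, erroneous, or fabricated.
    """
    if not attempted:
        return 0  # No Response
    if relevant < 0 or irrelevant < 0:
        raise ValueError("counts must be non-negative")
    total = relevant + irrelevant
    if total == 0:
        return 1  # Very Poor: no details used or listed
    if relevant == 0:
        return 2  # Poor: only erroneous or fabricated details
    if irrelevant == 0 and relevant >= 2:
        return 5  # Excellent: multiple relevant details, nothing irrelevant
    if relevant / total > 0.5:
        return 4  # Good: more than 50% of details are relevant
    return 3      # Fair: 50% or less of details are relevant
```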


would be required to refine the administration and scoring systems needed for more complex tasks such as these, it should be weighed against the possible danger of under-representing the likelihood of academic success based on results of the basic comprehension-only measures still most often used to assess academic proficiency.

Acknowledgements

This project was funded by a grant from Educational Testing Service as part of the TOEFL 2000 effort. We appreciate the cooperation of the ETS staff members who assisted in the development of passage-specific tasks and other aspects of the research; however, no official endorsement of Educational Testing Service should be inferred.


Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems    Causes      Effects     Solutions

Examples    Examples    Examples    Examples
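The point scheme stated in the directions is a weighted count of correct responses. As a minimal sketch, assuming a rater has already tallied the number of correct entries per category, the raw chart score could be computed as follows. The names `POINTS` and `chart_points` are hypothetical, and only correct responses are counted because the directions specify no penalty for incorrect ones.

```python
from typing import Mapping

# Point values stated in the chart completion directions.
POINTS = {
    "problems": 10,
    "solutions": 10,
    "causes": 5,
    "effects": 5,
    "examples": 1,
}


def chart_points(correct: Mapping[str, int]) -> int:
    """Sum the points for correct responses, keyed by chart category."""
    unknown = set(correct) - set(POINTS)
    if unknown:
        raise KeyError(f"unknown categories: {sorted(unknown)}")
    if any(n < 0 for n in correct.values()):
        raise ValueError("counts must be non-negative")
    return sum(POINTS[category] * n for category, n in correct.items())
```

For instance, two correct problems, three causes, one effect, one solution, and four examples would be passed as `{"problems": 2, "causes": 3, "effects": 1, "solutions": 1, "examples": 4}`.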

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points; 1 word, 6 points; 40 maximum)
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points; 1 word, 3 points; 55 maximum)
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys) and/or plants
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points; 1 word, 3 points; 30 maximum)
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points; 1 word, 6 points; 90 maximum)
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications/Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM, and Mexico
• TN Luttrell corp. building permit situation

Causes: Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Causes: Examples (1 point), 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Solutions: Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
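The category maxima in Appendix 1b (40, 55, 30, and 90 for Problems, Causes, Effects, and Solutions; 5, 10, 6, and 5 for their examples) add up to the stated 241 total possible points. The Python sketch below checks that arithmetic and applies the caps together with the reduced credit for one-word responses (6 or 3 points). It is an illustration under assumptions: the function and dictionary names are hypothetical, the inputs are rater tallies, and the rule that an example earns full category credit when paired with its overarching cause or solution is not modeled.

```python
from typing import Mapping

FULL = {"problems": 10, "causes": 5, "effects": 5, "solutions": 10}
ONE_WORD = {"problems": 6, "causes": 3, "effects": 3, "solutions": 6}
CATEGORY_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def max_total() -> int:
    """Highest possible Reading to Learn score implied by the maxima."""
    return sum(CATEGORY_MAX.values()) + sum(EXAMPLE_MAX.values())


def reading_to_learn_score(
    full: Mapping[str, int],
    one_word: Mapping[str, int],
    examples: Mapping[str, int],
) -> int:
    """Score tallies of full responses, one-word responses, and examples."""
    score = 0
    for category in FULL:
        main = (
            FULL[category] * full.get(category, 0)
            + ONE_WORD[category] * one_word.get(category, 0)
        )
        # Each category and its examples are capped at the listed maximum.
        score += min(main, CATEGORY_MAX[category])
        score += min(examples.get(category, 0), EXAMPLE_MAX[category])
    return score
```

Capping per category means that listing more items than the rubric anticipates cannot push a score past 241.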

Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]


Page 29: Reading to learn and reading to integrate: new tasks for ...englishvls.hunnu.edu.cn/Downloads/LangTst/tst_014.pdfintegrate: new tasks for reading comprehension tests? Latricia Trites

Freedle R and Kostin I 1999 Does the text matter in a multiple choice testof comprehension The case for the construct validity of TOEFLrsquosminitalks Language Testing 16 2ndash32

Goldman S 1997 Learning from text reflections on the past and suggestionsfor the future Discourse Processes 23 357ndash98

Hayes J and Hatch J 1999 Issues in measuring reliability correlationversus percentage of agreement Written Communication 16 354ndash67

Huberty C 1994 Applied discriminant analysis New York John WileyJamieson J Campbell J Norfleet L and Berbisada N 1993 Reliability

of a computerized scoring routine for an open-ended task System 21305ndash22

Jamieson J Norfleet L and Berbisada N 1993 Successes failures anddropouts in computer-assisted language lessons Computer AssistedEnglish Language Learning Journal 4 12ndash20

Klecka W 1980 Discriminant analysis In Lewis-Beck M editorQuantitative applications in the social sciences Volume 19 NewburyPark CA Sage

Lehto M Zhu W and Carpenter B 1995 The relative effectiveness ofhypertext and text International Journal of Human-ComputerInteraction 7 293ndash313

McNamara D and Kintsch W 1996 Learning from texts effects of prior knowl-edge and text coherence Discourse Processes 22 247ndash88

Messick S 1989 Validity In Linn RL editor Educational measurement 3rdedition New York American Council on Education Macmillan 13ndash103

Meyer B 1985a Prose analyses purposes procedures and problems InBritton B and Black J editors Understanding expository text HillsideNJ Lawrence Erlbaum 11ndash64

mdashmdash 1985b Prose analysis purposes procedures and problems Part 2 InBritton B and Black J editors Understanding expository text HillsideNJ Lawrence Erlbaum 269ndash97

Monks V 1997 AugustSeptember Two views same waterway NationalWildlife 35 36ndash37

Pellegrino J Baxter G and Glaser R 1999 Addressing the lsquotwo disci-plinesrsquo problem linking theories of cognition and learning with assess-ment and instructional practice Review of Research in Education 24307ndash53

Perfetti C 1997 Sentences individual differences and multiple texts threeissues in text comprehension Discourse Processes 23 337ndash55

Perfetti C Britt MA and Georgi M 1995 Text-based learning andreasoning studies in history Hillside NJ Lawrence Erlbaum

Perfetti C Marron M and Foltz P 1996 Sources of comprehension failuretheoretical perspectives and case studies In Cornoldi C and Oakhill Jeditors Reading comprehension difficulties processes and interventionMahwah NJ Lawrence Erlbaum 137ndash65

202 New tasks for reading comprehension tests

Reinking D 1988 Computer-mediated text and comprehension differencesthe role of reading time reader preference and estimation of learningReading Research Quarterly 23 484ndash500

Reinking D and Schreiner R 1985 The effects of computer-mediated texton measures of reading comprehension and reading behavior ReadingResearch Quarterly 20 536ndash53

Spivey N 1997 The constructivist metaphor reading writing and themaking of meaning San Diego CA Academic Press

Stevens J 1996 Applied multivariate statistics for the social sciences 3rdedition Mahwah NJ Lawrence Erlbaum

Tabachnick B and Fidell L 1996 Using multivariate statistics 3rd editionNew York Harper Collins

Taylor C Jamieson J Eignor D and Kirsch I 1998 The relationshipbetween computer familiarity and performance on computer-basedTOEFL test tasks TOEFL Research Report No 61 Princeton NJEducational Testing Service

Tennesen M 1997 NovemberDecember On a clear day National Parks 7126ndash9

Trites L 2000 Beyond basic comprehension reading to learn and reading tointegrate for native and non-native speakers Unpublished doctoraldissertation Northern Arizona University Flagstaff AZ

Van den Berg S and Watt J 1991 Effects of educational setting on studentresponses to structured hypertext Journal of Computer-BasedInstruction 18 118ndash24

Van Dijk TA and Kintsch W 1983 Strategies of discourse comprehensionNew York Academic Press

Wiley J and Voss J 1999 Constructing arguments from multiple sourcestasks that promote understanding and not just memory for text Journalof Educational Psychology 91 310ndash11

Zimmerman T 1997 December 29 Filter it with billions and billions of oys-ters how to revive the Chesapeake Bay US News and World Report 12363

Appendix 1a Chart completion task

Directions Complete the following chart Fill in as much detail aspossible from the text read by categorizing the information into thedifferent areas on the chart Do not use single words for yourresponses form your responses in phrases or complete sentencesInclude examples from the text

Make no judgments about the accuracy of causes or effects or theeffectiveness of the solutions mentioned in the text Solutions are seen

Latricia Trites and Mary McGroarty 203

204 New tasks for reading comprehension tests

as any action taken in response to the problem(s) Solutions can takemany forms such as proposed solutions attempted solutions or failedsolutions Also space is provided under each category for examplesExamples are specific examples found in the text that are used by theauthor(s) to exemplify the problems causes effects or solutions inthe text However there may not be examples for every category

Points will be awarded for correct responses only There is no penaltyfor incorrect responses Points will be awarded in the following manner

Problems and Solutions 10 points eachCauses and Effects 5 points eachExamples 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples

Latricia Trites and Mary McGroarty 205

Ap

pen

dix

1b

Rea

din

g t

o L

earn

sco

rin

g r

ub

ric

0ndash2

41 t

ota

l po

ssib

le

Pro

ble

ms

(10

po

ints

) 1

wo

rd

Cau

ses

(5p

oin

ts)

1 w

ord

E

ffec

ts (

5 p

oin

ts)

1 w

ord

S

olu

tio

ns

(10

po

ints

en

d)

1 w

ord

(6 p

oin

ts)

40

max

imu

m(3

po

ints

) 5

5 m

axim

um

(3 p

oin

ts)

30

max

imu

m(6

po

ints

) 9

0 m

axim

um

bullA

ir p

ollu

tio

nS

mo

g i

n t

he

bullS

ulf

ur

nit

rog

en e

mis

sio

ns

bullTre




Reinking, D. 1988: Computer-mediated text and comprehension differences: the role of reading time, reader preference, and estimation of learning. Reading Research Quarterly 23, 484–500.

Reinking, D. and Schreiner, R. 1985: The effects of computer-mediated text on measures of reading comprehension and reading behavior. Reading Research Quarterly 20, 536–53.

Spivey, N. 1997: The constructivist metaphor: reading, writing, and the making of meaning. San Diego, CA: Academic Press.

Stevens, J. 1996: Applied multivariate statistics for the social sciences, 3rd edition. Mahwah, NJ: Lawrence Erlbaum.

Tabachnick, B. and Fidell, L. 1996: Using multivariate statistics, 3rd edition. New York: Harper Collins.

Taylor, C., Jamieson, J., Eignor, D. and Kirsch, I. 1998: The relationship between computer familiarity and performance on computer-based TOEFL test tasks. TOEFL Research Report No. 61. Princeton, NJ: Educational Testing Service.

Tennesen, M. 1997, November/December: On a clear day. National Parks 71, 26–9.

Trites, L. 2000: Beyond basic comprehension: reading to learn and reading to integrate for native and non-native speakers. Unpublished doctoral dissertation, Northern Arizona University, Flagstaff, AZ.

Van den Berg, S. and Watt, J. 1991: Effects of educational setting on student responses to structured hypertext. Journal of Computer-Based Instruction 18, 118–24.

Van Dijk, T.A. and Kintsch, W. 1983: Strategies of discourse comprehension. New York: Academic Press.

Wiley, J. and Voss, J. 1999: Constructing arguments from multiple sources: tasks that promote understanding and not just memory for text. Journal of Educational Psychology 91, 310–11.

Zimmerman, T. 1997, December 29: Filter it with billions and billions of oysters: how to revive the Chesapeake Bay. U.S. News and World Report 123, 63.

Appendix 1a Chart completion task

Directions: Complete the following chart. Fill in as much detail as possible from the text read by categorizing the information into the different areas on the chart. Do not use single words for your responses; form your responses in phrases or complete sentences. Include examples from the text.

Make no judgments about the accuracy of causes or effects or the effectiveness of the solutions mentioned in the text. Solutions are seen as any action taken in response to the problem(s). Solutions can take many forms, such as proposed solutions, attempted solutions, or failed solutions. Also, space is provided under each category for examples. Examples are specific examples found in the text that are used by the author(s) to exemplify the problems, causes, effects, or solutions in the text. However, there may not be examples for every category.

Points will be awarded for correct responses only. There is no penalty for incorrect responses. Points will be awarded in the following manner:

Problems and Solutions: 10 points each
Causes and Effects: 5 points each
Examples: 1 point each

Problems     Causes     Effects     Solutions

Examples     Examples   Examples    Examples
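The point scheme in the directions above can be illustrated with a short tally. This is only a sketch: the function name, the dictionary layout, and the sample counts are hypothetical, and it applies just the per-item values from the directions (no category maxima and no reduced credit for one-word answers).

```python
# Point values as stated in the Appendix 1a directions.
POINTS = {"problems": 10, "solutions": 10, "causes": 5, "effects": 5, "examples": 1}


def chart_score(correct_counts):
    """Sum points for correct responses; incorrect responses carry no penalty."""
    unknown = set(correct_counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return sum(POINTS[category] * n for category, n in correct_counts.items())


# Hypothetical participant: 2 problems, 3 causes, 1 effect, 2 solutions, 4 examples.
sample = {"problems": 2, "causes": 3, "effects": 1, "solutions": 2, "examples": 4}
print(chart_score(sample))  # 20 + 15 + 5 + 20 + 4 = 64
```

Because only correct responses are counted, a wrong entry simply never appears in the counts, which matches the no-penalty rule.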

Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points); 1 word (6 points); 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards (regulations)
• No true 'point sources' (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Examples (1 point); 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico
• TN Luttrell corp building permit situation

Causes (5 points); 1 word (3 points); 55 maximum
• Sulfur nitrogen emissions
• Acid rain
• Ground-level ozone
• Urban Industrial emissions
• Automobile emissions
• Power plants, factories, plants emissions
• Emissions from Smokestacks (chimneys) and/or plants
• Emissions from Kilns
• Unregulated pollution, smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples (1 point, or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants, factories, plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution, smog (from other countries, Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Examples (1 point); 10 maximum
• Navajo Generating Station, Page, AZ (Power Plant, AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin, NV
• Tennessee Luttrell Kilns, TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Effects (5 points); 1 word (3 points); 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water polluted)
• Nutrients removed (leached) from soil
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)
• Objecting to construction permits

Examples (1 point); 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions (10 points); 1 word (6 points); 90 maximum
• Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Identification of pollution sources
• Industry modifications
• Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Examples (1 point, or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry

Examples (1 point); 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin, NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
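The category maxima in the Appendix 1b rubric can be checked against the stated 0–241 range with a little arithmetic. The sketch below assumes the example maxima belong to the columns in the order problems (5), causes (10), effects (6), solutions (5). That assignment is a reading of the flattened table, not something the rubric states in prose, and the function and variable names are hypothetical.

```python
# Category maxima as read from the Appendix 1b column headers.
MAIN_MAX = {"problems": 40, "causes": 55, "effects": 30, "solutions": 90}
# Assumed column assignment of the "Examples ... maximum" headers.
EXAMPLE_MAX = {"problems": 5, "causes": 10, "effects": 6, "solutions": 5}


def capped_total(main_points, example_points):
    """Add up earned points, capping each category at its rubric maximum."""
    total = 0
    for category in MAIN_MAX:
        total += min(main_points.get(category, 0), MAIN_MAX[category])
        total += min(example_points.get(category, 0), EXAMPLE_MAX[category])
    return total


max_total = sum(MAIN_MAX.values()) + sum(EXAMPLE_MAX.values())
print(max_total)  # 215 + 26 = 241
```

Under this reading the main categories contribute 215 points and the examples 26, which agrees with the "0–241 total possible" heading.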

Appendix 2a Reading to Integrate task: integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the participant's essay]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6–7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4–5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2–3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0–1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
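The Macrostructures bands depend only on the combined count of macrostructures identified across the two texts (0 to 8), so the mapping can be written as a small lookup. The following is a minimal sketch with hypothetical function and parameter names. It follows the band totals listed in the rubric and treats an unattempted response as a separate case.

```python
def macrostructure_score(found_a, found_b, attempted=True):
    """Map macrostructures identified in each text (0-4 each) to a band score."""
    if not attempted:
        return 0  # No Response
    for found in (found_a, found_b):
        if not 0 <= found <= 4:
            raise ValueError("each text has four macrostructures")
    total = found_a + found_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (total 6-7)
    if total >= 4:
        return 15  # Fair (total 4-5)
    if total >= 2:
        return 10  # Poor (total 2-3)
    return 5       # Very Poor (total 0-1)


print(macrostructure_score(4, 2))  # total 6, Good band: 20
```

Every combination the rubric spells out, such as 4 in one text and 0 in the other for Fair, falls into the same band as its total, which is why a single sum is enough here.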

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.

Page 31: Reading to learn and reading to integrate: new tasks for ...englishvls.hunnu.edu.cn/Downloads/LangTst/tst_014.pdfintegrate: new tasks for reading comprehension tests? Latricia Trites

204 New tasks for reading comprehension tests

as any action taken in response to the problem(s) Solutions can takemany forms such as proposed solutions attempted solutions or failedsolutions Also space is provided under each category for examplesExamples are specific examples found in the text that are used by theauthor(s) to exemplify the problems causes effects or solutions inthe text However there may not be examples for every category

Points will be awarded for correct responses only There is no penaltyfor incorrect responses Points will be awarded in the following manner

Problems and Solutions 10 points eachCauses and Effects 5 points eachExamples 1 point each

Problems Causes Effects Solutions

Examples Examples Examples Examples

Latricia Trites and Mary McGroarty 205

Ap

pen

dix

1b

Rea

din

g t

o L

earn

sco

rin

g r

ub

ric

0ndash2

41 t

ota

l po

ssib

le

Pro

ble

ms

(10

po

ints

) 1

wo

rd

Cau

ses

(5p

oin

ts)

1 w

ord

E

ffec

ts (

5 p

oin

ts)

1 w

ord

S

olu

tio

ns

(10

po

ints

en

d)

1 w

ord

(6 p

oin

ts)

40

max

imu

m(3

po

ints

) 5

5 m

axim

um

(3 p

oin

ts)

30

max

imu

m(6

po

ints

) 9

0 m

axim

um

bullA

ir p

ollu

tio

nS

mo

g i

n t

he

bullS

ulf

ur

nit

rog

en e

mis

sio

ns

bullTre

es a

nd

pla

nts

aff

ecte

dS

tric

ter

En

vir

on

men

tal

Law

s

Nati

on

al

Park

sbull

Aci

d r

ain

(in

jure

d)

gro

wth

hin

der

ed(r

eso

luti

on

s

acts

)

bullO

pp

osit

ion

or

Ign

ori

ng

bull

Gro

un

d-l

evel

ozo

ne

bullV

isib

ilit

y d

ecre

ased

bull19

77 C

lean

Air

Act

(or

lack

of

coo

per

atio

n)

to

Urb

an

In

du

str

ial

em

issio

ns

bullM

eta

ls l

oo

sen

ed

into

wat

ers

Am

en

dm

en

ts l

ab

elin

gN

Ps

envi

ron

men

tal s

tan

dar

ds

bullA

uto

mo

bile

emis

sio

ns

(su

rfac

eg

rou

nd

wat

er

as C

lass I

are

as

(reg

ula

tio

ns)

bullP

ow

er

pla

nts

fac

tori

esp

lan

ts

wat

er p

ollu

ted

)bull

Reg

ion

al h

aze

reg

ula

tio

ns

bullN

o t

rue

lsquopo

int

sou

rces

rsquoem

issi

on

sbull

Nu

trie

nts

rem

oved

pro

po

sed

by

the

EPA

(myri

ad

of

sm

aller

po

llu

tio

n

bullE

mis

sio

ns

fro

m S

mo

kesta

cks

(lea

ched

) fr

om

so

il bull

Red

ucti

on

of

allo

wab

leso

urc

es

rath

er t

han

on

e (c

him

neys)

and

or

pla

nts

po

llu

tio

n s

tan

dard

s f

rom

la

rge

sou

rce)

bullE

mis

sio

ns

fro

m K

iln

sbull

Pu

blic o

utc

ryo

ver

stan

ce

ind

ust

rybull

Nati

on

al

Park

s l

imit

ed

bull

Un

reg

ula

ted

po

lluti

on

sm

og

o

f b

ig b

usi

nes

sg

ove

rnm

ent

bullC

lean

Air

Act

set

ju

risd

icti

on

of

po

lluti

on

(f

rom

oth

er c

ou

ntr

ies

Mex

ico

)bull

Aq

uati

c l

ife in

jure

d (

dam

aged

)vis

ibilit

y g

oals

sou

rces

ou

tsid

e o

f th

e p

arks

bullS

mo

ke f

rom

Co

ntr

olled

bu

rns

bullO

bje

cti

ng

to c

on

str

ucti

on

bullE

mis

sio

ns

fro

m la

rge

citi

esp

erm

its

bullId

en

tifi

cati

on

of

po

lluti

on

so

urc

es

bullIn

du

str

y m

od

ificati

on

s

Insta

llati

on

o

f d

evic

es

(scr

ub

ber

s)

to r

edu

ce

po

lluti

on

bullP

ub

lic p

ressu

re t

op

rote

ct p

arks

206 New tasks for reading comprehension testsA

pp

en

dix

1b

(co

nti

nu

ed)

Exam

ple

s (

1 p

oin

t) 5

maxim

um

Exam

ple

s (

1 p

oin

t o

r 5 p

oin

ts

Exam

ple

s (

1 p

oin

t) 6

maxim

um

Exam

ple

s (

1 p

oin

t o

r 10 p

oin

ts

wit

h o

vera

rch

ing

cau

se

wit

h o

vera

rch

ing

so

luti

on

liste

d a

bo

ve)

liste

d a

bo

ve)

bullG

reat

Sm

oky M

ou

nta

inbull

Au

tom

ob

ile

emis

sio

ns

bullLe

aves

of

pla

nts

tu

rnin

g

bull19

77 C

lean

Air

Act

N

atio

nal

Par

kbull

Po

wer

pla

nts

fac

tori

es

pu

rple

an

d b

row

n (

stip

plin

g)

Am

en

dm

en

ts

lab

elin

gN

Ps

bullG

ran

d C

an

yo

np

lan

ts e

mis

sio

ns

bullH

ind

erin

g p

ho

tosyn

thesis

of

as C

lass I

are

as

Nat

ion

al P

ark

bullE

mis

sio

ns

fro

m

pla

nts

an

d t

rees

bullS

mo

ky M

ou

nta

in S

ou

rces

S

mo

kesta

cks

(ch

imn

eys)

bullR

ed

ucti

on

of

vis

ibilit

yat

bullR

egio

nal

haz

ere

gu

lati

on

s

fro

m O

hio

New

Yo

rk

bullE

mis

sio

ns

fro

m K

iln

sG

ran

d C

an

yo

np

rop

ose

d b

y th

e E

PA

Atl

anta

etc

bull

Un

reg

ula

ted

po

lluti

on

sm

og

bull

Vis

ibili

ty r

educ

ed 9

0 o

f bull

Red

ucti

on

of

allo

wab

lebull

Gra

nd

Can

yon

So

urc

es(f

rom

oth

er c

ou

ntr

ies

Mex

ico

)th

e da

ysp

ollu

tio

n s

tan

dard

s f

rom

fro

m C

A N

V U

T A

Z N

M

bullS

mok

e fr

om C

on

tro

lled

bu

rns

bullS

ee v

ague

blu

e m

asse

sin

du

stry

and

Mex

ico

bullE

mis

sio

ns

fro

m la

rge

citi

esbull

See

hal

f as

far

as

in 1

919

bullT

N L

utt

rell

corp

bu

ildin

g

per

mit

sit

uat

ion

Exam

ple

s (

1 p

oin

t)

10 m

axim

um

Exam

ple

s (

1 p

oin

t)

5 m

axim

um

bullN

avaj

o G

ener

atin

g S

tati

on

bull

Ob

ject

ing

to

kiln

s in

TN

Pag

e A

Z (

Po

wer

Pla

nt

AZ

)bull

Res

earc

her

s u

sin

g s

cien

tifi

cbull

Po

wer

pla

nts

in T

N a

nd

OH

tech

no

log

y to

ID s

ou

rces

rive

r va

lley

(rad

ioac

tive

iso

top

es)

bullS

ou

ther

n C

alif

orn

ia E

dis

on

bull

So

uth

ern

Cal

ifo

rnia

Ed

iso

nP

lan

t L

aug

hlin

NV

Pla

nt

in L

aug

hlin

NV

bull

Ten

nes

see

Lutt

rell

Kiln

s T

Nid

enti

fied

as

po

int

sou

rce

bullIn

du

stry

in A

Zbull

Scru

bber

sin

stal

led

at

bullC

ars

in C

AN

avaj

o G

ener

atin

g P

lan

tbull

Sm

oke

stac

ks in

NM

NV

UT

bull10

r

edu

ctio

n o

f p

ollu

tio

nbull

Los

An

gel

esin

10ndash

15 y

ears

(g

oal

of

no

bullA

tlan

tam

an-m

ade

po

lluti

on

)bull

New

Yo

rk



Appendix 1b Reading to Learn scoring rubric (0–241 total possible)

Problems (10 points), 1 word (6 points), 40 maximum
• Air pollution/Smog in the National Parks
• Opposition or Ignoring (or lack of cooperation) to environmental standards
• No true ‘point sources’ (myriad of smaller pollution sources rather than one large source)
• National Parks limited jurisdiction of pollution sources outside of the parks

Causes (5 points), 1 word (3 points), 55 maximum
• Sulfur/nitrogen emissions
• Acid rain
• Ground-level ozone
Urban/Industrial emissions
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects (5 points), 1 word (3 points), 30 maximum
• Trees and plants affected (injured), growth hindered
• Visibility decreased
• Metals loosened into waters (surface/ground water, water polluted)
• Nutrients removed (leached) from soil and/or plants
• Public outcry over stance of big business/government
• Aquatic life injured (damaged)

Solutions (10 points end), 1 word (6 points), 90 maximum
Stricter Environmental Laws (resolutions, acts)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas (regulations)
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• Clean Air Act set visibility goals
• Objecting to construction permits
• Identification of pollution sources
• Industry modifications
Installation of devices (scrubbers) to reduce pollution
• Public pressure to protect parks

Appendix 1b (continued)

Problems: Examples (1 point), 5 maximum
• Great Smoky Mountain National Park
• Grand Canyon National Park
• Smoky Mountain Sources from Ohio, New York, Atlanta, etc.
• Grand Canyon Sources from CA, NV, UT, AZ, NM and Mexico

Causes: Examples (1 point or 5 points with overarching cause listed above)
• Automobile emissions
• Power plants/factories/plants emissions
• Emissions from Smokestacks (chimneys)
• Emissions from Kilns
• Unregulated pollution/smog (from other countries/Mexico)
• Smoke from Controlled burns
• Emissions from large cities

Effects: Examples (1 point), 6 maximum
• Leaves of plants turning purple and brown (stippling)
• Hindering photosynthesis of plants and trees
• Reduction of visibility at Grand Canyon
• Visibility reduced 90% of the days
• See vague blue masses
• See half as far as in 1919

Solutions: Examples (1 point or 10 points with overarching solution listed above)
• 1977 Clean Air Act Amendments labeling NPs as Class I areas
• Regional haze regulations proposed by the EPA
• Reduction of allowable pollution standards from industry
• TN Luttrell corp building permit situation

Examples (1 point), 10 maximum
• Navajo Generating Station, Page AZ (Power Plant AZ)
• Power plants in TN and OH river valley
• Southern California Edison Plant, Laughlin NV
• Tennessee Luttrell Kilns TN
• Industry in AZ
• Cars in CA
• Smokestacks in NM, NV, UT
• Los Angeles
• Atlanta
• New York

Examples (1 point), 5 maximum
• Objecting to kilns in TN
• Researchers using scientific technology to ID sources (radioactive isotopes)
• Southern California Edison Plant in Laughlin NV identified as point source
• Scrubbers installed at Navajo Generating Plant
• 10% reduction of pollution in 10–15 years (goal of no man-made pollution)
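The section maxima printed in the Appendix 1b rubric (40, 55, 30 and 90 for Problems, Causes, Effects and Solutions, plus 5, 6, 10 and 5 for the four capped Examples blocks) add up to the stated 241 points. The following is a minimal sketch of how a capped total could be computed. It assumes that 'maximum' means points beyond a section's cap are ignored. The dictionary keys and the function name are illustrative only and are not taken from the rubric.

```python
# Section maxima as printed in the rubric. The key names are hypothetical labels.
SECTION_MAXIMA = {
    "problems": 40,
    "causes": 55,
    "effects": 30,
    "solutions": 90,
    "examples_block_1": 5,
    "examples_block_2": 6,
    "examples_block_3": 10,
    "examples_block_4": 5,
}


def reading_to_learn_total(raw_points):
    """Cap each section's raw points at its printed maximum and sum the results.

    raw_points maps a section label to the points a rater awarded.
    Sections missing from raw_points count as 0 (an assumption of this sketch).
    """
    total = 0
    for section, maximum in SECTION_MAXIMA.items():
        points = raw_points.get(section, 0)
        if points < 0:
            raise ValueError(f"negative points for section {section!r}")
        total += min(points, maximum)
    return total
```

For instance, a response awarded 50 raw points for problems and 10 for causes would be counted as 40 + 10 under this capping assumption.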

Appendix 2a Reading to Integrate task integration activity

Directions: For the next 15 minutes, reflect on what you read and compose a short essay that combines the information in the material read and makes connections across the range of ideas and arguments presented.

[Ruled lines for the essay response]

Appendix 2b Reading to Integrate scoring rubric

Integration

50 Excellent: Integrates texts accurately and successfully on multiple levels and creates a true Documents Model. Generalizes at least two macrostructure concepts common across texts (this may be simply identifying the existence of a macrostructure followed by the supporting macrostructures from each text). Effectively integrates relevant support (i.e., details or macrostructures) to support generalizations and may still discuss each article separately to some degree. Integrates macrostructures present in both texts.

40 Good: Integrates through the creation of a well-developed and accurate introduction and/or a conclusion, yet summarizes articles separately, AND/OR generalizes one major macrostructure that is fully developed with support. Uses some relevant support.

30 Fair: Attempts to integrate through the creation of a partially developed, possibly inaccurate, introductory AND/OR concluding statement, usually related to the main topic of the articles, but has no substantive development. Creates a Text and Situation Model of each text separately. Does not make generalizations for integration beyond the one statement. May make evaluative or editorial statements; however, these may be inaccurate.

20 Poor: Creates a well-developed and accurate Text Model and Situation Model for each of the texts, yet attempts no integrative connection across the texts.

10 Very Poor: Ineffectively or inaccurately attempts to summarize each article in a very reporting style, revealing little or no use of background knowledge, contextualization, or evaluation. Response is simply a recall of information from the texts. No introduction or conclusion, OR participant confuses texts and sees separate texts as one issue (problem).

0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6/7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4/5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2/3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0/1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
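Every combination listed in the Macrostructures bands depends only on the combined number of macrostructures identified across the two texts (8; 6 or 7; 4 or 5; 2 or 3; 0 or 1). The band lookup can therefore be sketched as a function of that total. The sketch below takes the printed band values (25, 20, 15, 10, 5, 0) at face value, follows the band table only, and does not model the 'minimum score of 4' sentence. The function and parameter names are hypothetical.

```python
def macrostructure_band(count_text_a, count_text_b, responded=True):
    """Return the printed band value for the macrostructures identified in two texts.

    count_text_a and count_text_b are the numbers of macrostructures (0 to 4)
    accurately identified in each text. responded=False models 'No Response'.
    """
    if not responded:
        return 0
    for count in (count_text_a, count_text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")

    total = count_text_a + count_text_b
    if total == 8:
        return 25  # Excellent
    if total >= 6:
        return 20  # Good (Total 6 or 7)
    if total >= 4:
        return 15  # Fair (Total 4 or 5)
    if total >= 2:
        return 10  # Poor (Total 2 or 3)
    return 5       # Very Poor (Total 0 or 1)
```

Written this way, '4 in one and 2 in the other' and '3 of the four in both texts' land in the same band, as the rubric lists them.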

Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
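The count-based part of the 'Use of relevant details' scale (the 50% threshold, and whether any relevant details appear at all) can be sketched as follows. This is only an approximation under stated assumptions: the rubric also asks raters to judge whether details are used 'effectively', which the sketch ignores. 'Multiple' is assumed to mean at least two, and a single relevant detail with no irrelevant ones is assumed to fall in the Good band. All names are hypothetical.

```python
def relevant_details_band(relevant, irrelevant, responded=True):
    """Approximate the 0-5 'Use of relevant details' band from detail counts.

    relevant: number of relevant details used as support.
    irrelevant: number of irrelevant, erroneous or fabricated details.
    """
    if not responded:
        return 0  # No Response
    if relevant < 0 or irrelevant < 0:
        raise ValueError("counts must be non-negative")

    total = relevant + irrelevant
    if total == 0:
        return 1  # Very Poor: no details used or listed
    if relevant == 0:
        return 2  # Poor: only erroneous or fabricated details
    if irrelevant == 0 and relevant >= 2:
        return 5  # Excellent: multiple relevant details, nothing irrelevant
    if 2 * relevant > total:
        return 4  # Good: more than 50% of details are relevant
    return 3      # Fair: 50% or less of details are relevant
```

The integer comparison `2 * relevant > total` is used in place of a floating-point ratio so that an exact 50% share falls in the Fair band, as the rubric states.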

Page 33: Reading to learn and reading to integrate: new tasks for ...englishvls.hunnu.edu.cn/Downloads/LangTst/tst_014.pdfintegrate: new tasks for reading comprehension tests? Latricia Trites

206 New tasks for reading comprehension testsA

pp

en

dix

1b

(co

nti

nu

ed)

Exam

ple

s (

1 p

oin

t) 5

maxim

um

Exam

ple

s (

1 p

oin

t o

r 5 p

oin

ts

Exam

ple

s (

1 p

oin

t) 6

maxim

um

Exam

ple

s (

1 p

oin

t o

r 10 p

oin

ts

wit

h o

vera

rch

ing

cau

se

wit

h o

vera

rch

ing

so

luti

on

liste

d a

bo

ve)

liste

d a

bo

ve)

bullG

reat

Sm

oky M

ou

nta

inbull

Au

tom

ob

ile

emis

sio

ns

bullLe

aves

of

pla

nts

tu

rnin

g

bull19

77 C

lean

Air

Act

N

atio

nal

Par

kbull

Po

wer

pla

nts

fac

tori

es

pu

rple

an

d b

row

n (

stip

plin

g)

Am

en

dm

en

ts

lab

elin

gN

Ps

bullG

ran

d C

an

yo

np

lan

ts e

mis

sio

ns

bullH

ind

erin

g p

ho

tosyn

thesis

of

as C

lass I

are

as

Nat

ion

al P

ark

bullE

mis

sio

ns

fro

m

pla

nts

an

d t

rees

bullS

mo

ky M

ou

nta

in S

ou

rces

S

mo

kesta

cks

(ch

imn

eys)

bullR

ed

ucti

on

of

vis

ibilit

yat

bullR

egio

nal

haz

ere

gu

lati

on

s

fro

m O

hio

New

Yo

rk

bullE

mis

sio

ns

fro

m K

iln

sG

ran

d C

an

yo

np

rop

ose

d b

y th

e E

PA

Atl

anta

etc

bull

Un

reg

ula

ted

po

lluti

on

sm

og

bull

Vis

ibili

ty r

educ

ed 9

0 o

f bull

Red

ucti

on

of

allo

wab

lebull

Gra

nd

Can

yon

So

urc

es(f

rom

oth

er c

ou

ntr

ies

Mex

ico

)th

e da

ysp

ollu

tio

n s

tan

dard

s f

rom

fro

m C

A N

V U

T A

Z N

M

bullS

mok

e fr

om C

on

tro

lled

bu

rns

bullS

ee v

ague

blu

e m

asse

sin

du

stry

and

Mex

ico

bullE

mis

sio

ns

fro

m la

rge

citi

esbull

See

hal

f as

far

as

in 1

919

bullT

N L

utt

rell

corp

bu

ildin

g

per

mit

sit

uat

ion

Exam

ple

s (

1 p

oin

t)

10 m

axim

um

Exam

ple

s (

1 p

oin

t)

5 m

axim

um

bullN

avaj

o G

ener

atin

g S

tati

on

bull

Ob

ject

ing

to

kiln

s in

TN

Pag

e A

Z (

Po

wer

Pla

nt

AZ

)bull

Res

earc

her

s u

sin

g s

cien

tifi

cbull

Po

wer

pla

nts

in T

N a

nd

OH

tech

no

log

y to

ID s

ou

rces

rive

r va

lley

(rad

ioac

tive

iso

top

es)

bullS

ou

ther

n C

alif

orn

ia E

dis

on

bull

So

uth

ern

Cal

ifo

rnia

Ed

iso

nP

lan

t L

aug

hlin

NV

Pla

nt

in L

aug

hlin

NV

bull

Ten

nes

see

Lutt

rell

Kiln

s T

Nid

enti

fied

as

po

int

sou

rce

bullIn

du

stry

in A

Zbull

Scru

bber

sin

stal

led

at

bullC

ars

in C

AN

avaj

o G

ener

atin

g P

lan

tbull

Sm

oke

stac

ks in

NM

NV

UT

bull10

r

edu

ctio

n o

f p

ollu

tio

nbull

Los

An

gel

esin

10ndash

15 y

ears

(g

oal

of

no

bullA

tlan

tam

an-m

ade

po

lluti

on

)bull

New

Yo

rk

Latricia Trites and Mary McGroarty 207

Appendix 2a Reading to Integrate task integration activity

Directions For the next 15 minutes reflect on what you read and compose a short essay that combines the information in thematerial read and makes connections across the range of ideas andarguments presented

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

208 New tasks for reading comprehension tests

Appendix 2b Reading to Integrate scoring rubric

Integration50 Excellent Integrates texts accurately and successfully on

multiple levels and creates a true DocumentsModel Generalizes at least two macrostructureconcepts common across texts (this may besimply identifying the existence of amacrostructure followed by the supportingmacrostructures from each text) Effectivelyintegrates relevant support (ie details ormacrostructures) to support generalizations andmay still discuss each article separately to somedegree Integrates macrostructures present inboth texts

40 Good Integrates through the creation of a welldeveloped and accurate introduction andor aconclusion yet summarizes articles separatelyANDOR generalizes one major macrostructurethat is fully developed with support Uses somerelevant support

30 Fair Attempts to integrate through the creation of apartially developed possibly inaccurateintroductory ANDOR concluding statementusually related to the main topic of the articlesbut has no substantive development Creates aText and Situation Model of each text separatelyDoes not make generalizations for integrationbeyond the one statement May make evaluativeor editorial statements however these may beinaccurate

20 Poor Creates a well developed and accurate TextModel and Situation Model for each of thetexts yet attempts no integrative connectionacross the texts

10 Very Poor Ineffectively or inaccurately attempts tosummarize each article in a very reporting stylerevealing little or no use of backgroundknowledge contextualization or evaluationResponse is simply a recall of information fromthe texts No introduction or conclusion ORparticipant confuses texts and sees separatetexts as one issue (problem)

Latricia Trites and Mary McGroarty 209

0 No Response Participant does not attempt response oronly addresses one article

MacrostructuresParticipants will receive one point for each macrostructureaccurately identified in each text with a minimum score of 4 awardedfor the inability to accurately identify any macrostructures

25 Excellent Accurately identifies all 4 macrostructures(Total 8) present in both texts

20 Good Accurately identifies all 4 macrostructures in(Total 67) one text and 3 in the other OR accurately

identifies 3 of the four macrostructures inboth texts (Or 4 in one and 2 in the other)

15 Fair Accurately identifies 3 of the four(Total 45) macrostructures in one text and 2 of the four

in the other (Or 4 in one and 1 in the other)OR accurately identifies 2 of the fourmacrostructures in both texts (Or 4 in oneand 0 in the other or 3 in one and 1 in theother)

10 Poor Accurately identifies 2 of the macrostructures

(Total 23) in one text and 1 in the other (Or 3 in oneand 0 in the other) Accurately identifies 1 ofthe four macrostructures in both texts (Or 2in one and 0 in the other)

5 Very Poor Accurately identifies 1 of the four (Total 01) macrostructures in one text and 0 in the

other OR unable to accurately identify anymacrostructures in the texts

0 No Response Participant does not attempt response

210 New tasks for reading comprehension tests

Use of relevant details

5 Excellent Effectively uses multiple relevant details assupport no irrelevant or erroneous details

4 Good Effectively uses relevant details as supportmay include some inaccurate or irrelevantdetails (More than 50 of details arerelevant)

3 Fair Possibly uses one or two relevant details yetseveral irrelevant details or inaccurate detailserroneous or fabricated) appear in thesynthesis (50 or less of details arerelevant)

2 Poor Erroneous or fabricated details used inattempt to support arguments presented doesnot include relevant details

1 Very Poor No details used or listed0 No Response Participant does not attempt response

Page 34: Reading to learn and reading to integrate: new tasks for ...englishvls.hunnu.edu.cn/Downloads/LangTst/tst_014.pdfintegrate: new tasks for reading comprehension tests? Latricia Trites

Latricia Trites and Mary McGroarty 207

Appendix 2a Reading to Integrate task integration activity

Directions For the next 15 minutes reflect on what you read and compose a short essay that combines the information in thematerial read and makes connections across the range of ideas andarguments presented

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

mdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdashmdash

208 New tasks for reading comprehension tests

Appendix 2b Reading to Integrate scoring rubric

Integration50 Excellent Integrates texts accurately and successfully on

multiple levels and creates a true DocumentsModel Generalizes at least two macrostructureconcepts common across texts (this may besimply identifying the existence of amacrostructure followed by the supportingmacrostructures from each text) Effectivelyintegrates relevant support (ie details ormacrostructures) to support generalizations andmay still discuss each article separately to somedegree Integrates macrostructures present inboth texts

40 Good Integrates through the creation of a welldeveloped and accurate introduction andor aconclusion yet summarizes articles separatelyANDOR generalizes one major macrostructurethat is fully developed with support Uses somerelevant support

30 Fair Attempts to integrate through the creation of apartially developed possibly inaccurateintroductory ANDOR concluding statementusually related to the main topic of the articlesbut has no substantive development Creates aText and Situation Model of each text separatelyDoes not make generalizations for integrationbeyond the one statement May make evaluativeor editorial statements however these may beinaccurate

20 Poor Creates a well developed and accurate TextModel and Situation Model for each of thetexts yet attempts no integrative connectionacross the texts

10 Very Poor Ineffectively or inaccurately attempts tosummarize each article in a very reporting stylerevealing little or no use of backgroundknowledge contextualization or evaluationResponse is simply a recall of information fromthe texts No introduction or conclusion ORparticipant confuses texts and sees separatetexts as one issue (problem)


0 No Response: Participant does not attempt response or only addresses one article.

Macrostructures

Participants will receive one point for each macrostructure accurately identified in each text, with a minimum score of 4 awarded for the inability to accurately identify any macrostructures.

25 Excellent (Total 8): Accurately identifies all 4 macrostructures present in both texts.

20 Good (Total 6-7): Accurately identifies all 4 macrostructures in one text and 3 in the other, OR accurately identifies 3 of the four macrostructures in both texts (or 4 in one and 2 in the other).

15 Fair (Total 4-5): Accurately identifies 3 of the four macrostructures in one text and 2 of the four in the other (or 4 in one and 1 in the other), OR accurately identifies 2 of the four macrostructures in both texts (or 4 in one and 0 in the other, or 3 in one and 1 in the other).

10 Poor (Total 2-3): Accurately identifies 2 of the macrostructures in one text and 1 in the other (or 3 in one and 0 in the other). Accurately identifies 1 of the four macrostructures in both texts (or 2 in one and 0 in the other).

5 Very Poor (Total 0-1): Accurately identifies 1 of the four macrostructures in one text and 0 in the other, OR unable to accurately identify any macrostructures in the texts.

0 No Response: Participant does not attempt response.
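The macrostructure bands reduce to a lookup on the combined count of macrostructures identified across the two texts. The following is a minimal sketch of that lookup, not part of the original rubric: the function name `macrostructure_score` and its parameters are hypothetical, and it follows the band table as printed, in which totals of 0 or 1 both fall in the 5-point band.

```python
def macrostructure_score(text_a: int, text_b: int, attempted: bool = True) -> int:
    """Map per-text counts of accurately identified macrostructures
    (0-4 in each text) to the rubric band score.

    Hypothetical helper for illustration; the names are assumptions.
    """
    if not attempted:
        return 0  # No Response band

    for count in (text_a, text_b):
        if not 0 <= count <= 4:
            raise ValueError("each text has between 0 and 4 macrostructures")

    total = text_a + text_b
    if total == 8:
        return 25  # Excellent (Total 8)
    if total >= 6:
        return 20  # Good (Total 6-7)
    if total >= 4:
        return 15  # Fair (Total 4-5)
    if total >= 2:
        return 10  # Poor (Total 2-3)
    return 5       # Very Poor (Total 0-1)
```

For example, a response that identifies 4 macrostructures in one text and 2 in the other has a total of 6 and lands in the Good band, matching the parenthetical case listed in the rubric.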


Use of relevant details

5 Excellent: Effectively uses multiple relevant details as support; no irrelevant or erroneous details.

4 Good: Effectively uses relevant details as support; may include some inaccurate or irrelevant details. (More than 50% of details are relevant.)

3 Fair: Possibly uses one or two relevant details, yet several irrelevant details or inaccurate details (erroneous or fabricated) appear in the synthesis. (50% or less of details are relevant.)

2 Poor: Erroneous or fabricated details used in attempt to support arguments presented; does not include relevant details.

1 Very Poor: No details used or listed.

0 No Response: Participant does not attempt response.
