JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 29, NO. 1, PP. 81-92 (1992)

Accelerating the Development of Formal Thinking in Middle and High School Students II: Postproject Effects on Science Achievement

Michael Shayer and Philip S. Adey

King's College, University of London

Abstract

A one-year lag was found between the effect of an intervention intended to promote formal operational thinking in students initially 11 or 12 years of age and the appearance of substantial science achievement in the experimental groups. A one-year lag was also reported on cognitive development: Whereas at the end of the two-year intervention the experimental groups were up to 0.9σ ahead of the control groups, one year later the differential on Piagetian measures had disappeared, but the experimentals now showed better science achievement of even greater magnitude. Although the control groups showed normal distribution both on science achievement and cognitive development, the experimental groups showed bi- or trimodal distribution. Between one-half and one-quarter of the students involved in the experiment in different groups showed effects of the order of 2σ both on cognitive development and science achievement; some students appeared unaffected (compared with the controls), and others demonstrated modest effects on science achievement. An age/gender interaction is reported: the most substantial effects were found in boys initially aged 12+ and girls initially 11+. The only group to show no effects was boys initially aged 11+. It is suggested that the intervention methods may have favored the abstract analytical learning style as described by Cohen (1986).

Background

Recently Adey and Shayer (1990) conducted a study on the effects of an intervention program in the context of science instruction aimed at accelerating the rate of cognitive development of early adolescents (Cognitive Acceleration through Science Education,¹ CASE). Data on 11 experimental classes and the same number of comparable control classes were collected. Effect sizes equivalent to average class gains, over a two-year period, of 23 percentile points in relation to population norms (Shayer, Küchemann, & Wylam, 1976; Shayer & Wylam, 1978) were found. Yet no gains in science achievement were found in experimental classes during the period of intervention. This article reports on science achievement by the same students in the year following the intervention and relates the findings to some recent issues in the intervention literature.

¹ The CASE II Project (Cognitive Acceleration through Science Education) was funded by the British Economic and Social Research Council, contract C00232189, from 1984 to 1987.

© 1992 by the National Association for Research in Science Teaching. Published by John Wiley & Sons, Inc. CCC 0022-4308/92/001081-12$04.00

Shayer and Beasley (1987), in comments on the effects of the Instrumental Enrichment course of Feuerstein, Rand, Hoffman, and Miller (1980), argued that it had been a mistake, in the intervention literature, to look for and to assess school achievement during the course of, or retrospectively at the end of, an intervention. They also argued that it was a mistake in principle to assess achievement using rather general standardized tests of achievement. Rather, one should wait until the end of the program, when the intervention had produced its maximum effect, and then investigate the difference in learning achievement between experimental and control classes over the next three months to a year in the same science courses. This implies that the content of the achievement tests used at the end of this period should relate only to what has been taught subsequent to the intervention. Thus the test items would not relate back to earlier learning experiences preceding or contemporaneous with the intervention, and the effect of the intervention would not be diluted by artifacts of testing. Embretson (1987b) makes a similar argument in relation to the use of dynamic testing procedures for spatial skills. She also warned of various methodological pitfalls which in the past may have led to the effects of intervention being underestimated (Embretson, 1987a). It is necessary to make sure that ceiling or floor effects do not obscure the treatment effect, which means that a scoring procedure such as latent trait analysis should be used to make sure that a given score difference has the same meaning for students who scored initially low on a test as after they have improved. If the intention is to compare groups, then Embretson (1987b, p. 344) recommends the method of residualized gain scores, as discussed in Cronbach and Furby (1970), as ". . . the best measure of gain because it is uncorrelated with initial status." Although Cronbach and Furby do question the wisdom of estimating gains, on grounds of unreliability, their Eq. 21 for residualized gain scores does appear to remove the danger of systematic error being imported with the analysis.

An equation of posttest performance regressed on pretest results for control students is used (a) to predict and to describe the variation of gain scores for controls, and, likewise (b) to predict, for experimental students, their expected performance given their pretest score. The distribution of gain scores for students in experimental classes can then be inspected on the basis of results predicted from controls with comparable starting levels. Close inspection of the distribution of the residualized gain scores for the control students allows for the method itself to be checked for systematic error at all input values.
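The procedure in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' computation: it assumes an ordinary least-squares line fitted to the control students only, the function names `fit_line` and `residualized_gains` are hypothetical, and the pretest and posttest values are invented for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residualized_gains(pre, post, intercept, slope):
    """Observed posttest minus the posttest predicted from the pretest."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(pre, post)]

# Hypothetical pretest levels and posttest scores (not data from the study).
control_pre = [4.0, 5.0, 6.0, 7.0]
control_post = [40.0, 52.0, 58.0, 70.0]
experimental_pre = [4.5, 6.5]
experimental_post = [60.0, 66.0]

# The regression line comes from the controls alone ...
intercept, slope = fit_line(control_pre, control_post)

# ... and is then applied to both groups.
control_rgs = residualized_gains(control_pre, control_post, intercept, slope)
experimental_rgs = residualized_gains(
    experimental_pre, experimental_post, intercept, slope
)

print("control r.g.s.:", [round(r, 2) for r in control_rgs])
print("experimental r.g.s.:", [round(r, 2) for r in experimental_rgs])
```

By construction the control residuals sum to zero, so any systematic departure from zero at particular pretest levels would show up when their distribution is inspected, which is the check the text describes.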

Method

In the previous study (Adey & Shayer, 1990) students had been given a Piagetian pretest (National Foundation for Educational Research [NFER], 1979), a Piagetian posttest two academic years later at the close of the intervention (June, 1987), and at the same time a test of science achievement produced by the research team to cover common aspects of science learning. There were two cohorts in the study: students initially 11+ years of age and students initially 12+. All of the 12+ class groups and one 11+ group were already in high schools. Two 11+ class groups were in middle schools during the two-year intervention period. By the beginning of the present study all students were in high school, and those who had previously been in high school classes in general science were now in separate classes for biology, physics, or chemistry. These schools have a "mixed-ability" class policy. Either the children are separated according to their elementary school records into three different ability groups, and then allotted to classes at random within each group, or all children entering the school are placed in classes randomly. In most cases the classes were taught by a different teacher in 1987/88, and a combination of tracking and pupil choice had changed and mixed all the teaching groups. There was some loss of data compared with the previous study because some students changed schools, and in two cases schools did not provide all the data requested. But the data reported here are still representative of all the cohorts of students originally studied.

In English schools achievement is usually tested summatively at the end of each academic year in June by examinations relating to the whole year's learning. In assessing achievement it was thought important to use the same criteria as the schools themselves recognize. Thus each school's June 1988 examination results in science were used. Although this might be thought to lower the validity of what was assessed, it should be borne in mind that the work in all schools in England is aimed towards the same external examination (GCSE) at 16+, which means in practice that schools are using very similar examination criteria by the time it comes to the year preceding GCSE. Often they use questions from past external examination papers.

Following the discussion of gain scores in Cronbach and Furby (1970), differences in question difficulty and examining criteria were dealt with by calculating a regression equation of science examination percentage (1988) on preintervention Piagetian reasoning task (PRT) level (1985) for each control class, as a step to obtaining individual residualized gain scores (r.g.s.). Then for each control class its regression equation was used to predict the mean science percentage from the mean Piagetian pretest result for all control students, relying on the fact that for all correlations the regression line runs through the mean of both variables. The class mean science percentages were then adjusted by a constant to a mean common to all schools, intersecting with the mean of the Piagetian pretest. This enabled a single regression equation to be formed which included all control students. For each experimental class the same school constant was used to adjust the science examination percentages. For each individual the 1985 PRT value was entered into the total regression equation predicting the expected adjusted science examination percentage. The difference between the predicted and the obtained science score is the residualized gain score.
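One way to read the school-adjustment step is sketched below in Python. It is an interpretation under stated assumptions, not the authors' procedure: each control class gets its own least-squares line, the school constant is taken as the shift that moves that line's prediction at the all-controls pretest mean onto a chosen common mean, and the pooled line is refitted on the shifted control scores. The function names, the two invented schools, and the common mean of 50 are all hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def school_constants(control_classes, common_mean):
    """Shift each school's exam scale so its control line passes through
    common_mean at the pretest mean of all control students."""
    grand_pre = mean(x for pre, _ in control_classes.values() for x in pre)
    constants = {}
    for school, (pre, science) in control_classes.items():
        intercept, slope = fit_line(pre, science)
        constants[school] = common_mean - (intercept + slope * grand_pre)
    return constants

def pooled_control_line(control_classes, constants):
    """Single regression line over all controls after the school shifts."""
    pre_all, science_all = [], []
    for school, (pre, science) in control_classes.items():
        pre_all.extend(pre)
        science_all.extend(s + constants[school] for s in science)
    return fit_line(pre_all, science_all)

# Hypothetical control classes: school B marks 20 points more generously.
controls = {
    "A": ([4.0, 5.0, 6.0], [40.0, 50.0, 60.0]),
    "B": ([5.0, 6.0, 7.0], [70.0, 80.0, 90.0]),
}
constants = school_constants(controls, common_mean=50.0)
intercept, slope = pooled_control_line(controls, constants)

# A hypothetical experimental student in school B: pretest 6.0, exam 95%.
adjusted = 95.0 + constants["B"]
rgs = adjusted - (intercept + slope * 6.0)
print(constants, round(intercept, 2), round(slope, 2), round(rgs, 2))
```

In this toy case the two schools differ only in marking severity, so after the shift all control scores fall on one line and the experimental student's residualized gain is measured against it.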

In addition, the classes were given a final delayed Piagetian posttest in June or July, 1988. To avoid floor or ceiling effects all PRTs were scored using Rasch scaling, which was then linearly transformed to an equal-interval Piagetian scale. As an additional check that the method of residualized gain scores was not distorting the data, the comparison between experimental and control classes on PRTs was made using raw gains only.

Results

In Adey and Shayer (1990) it was found that the mean effect size on boys initially 12+ years old on Piagetian Reasoning Tasks (PRTs) was 0.91σ in relation to control classes. Since we now report aspects of the data which were not identified in Adey and Shayer, analysis of this group will be reported first.


Boys 12+ in 1985. In Table 1 are listed residualized gain scores for each class both for the immediate posttest on PRTs conducted in June 1987 and for science achievement in the year following the intervention, tested in school science examinations in June 1988.

The mean effect size on science achievement (1988) for the boys aged 12+ in 1985 (1.14σ) and the previous effect size of 0.91σ on PRTs in June 1987 are respectable by any criteria. But when the distributions of residualized gain scores are plotted for all control and all experimental students, it is immediately apparent that on both measures the pattern in the data is misrepresented by reporting mean gains for each experimental class. The bulk of the experimental students appear to be distributed in the same way as the control students, but there is a smaller group of experimentals (occurring in all the classes) in a separate distribution with much larger effects, standing clear of both controls and the other experimentals, as can be seen in Tables 2 and 3.

The statistical significance of the hypothesized three distributions of cognitive development (controls, high experimentals, and low experimentals) was tested by using the conservative chi-square test for normal distribution on the raw-score PRT gains. A convenient feature of this test is that the chi-square values are summed, interval by interval, so that the probability can be looked up as each interval is added. Accordingly, a normal distribution was calculated having the same mean and standard deviation as the controls. In Tables 2 to 6, row values of expected n, χ², etc., are collapsed where appropriate for performing the chi-square test. The total chi-square sum for the control group, as tested against normal distribution, was 3.74 (df = 3, p > 0.3). Thus the control gains show normal distribution (M = 0.39, SD = 1.01, n = 43). As can be seen in Table 2, the experimental group gave a cumulative chi-square sum of 12.38 (p < 0.01), up to and including a gain score of 1.49 levels, followed by a massive jump to χ² = 33.28 to take into consideration the 15 students with gain scores greater than 1.5 levels. It is consistent with this evidence to view the experimental gains as a bimodal distribution, with 22 students with gain scores 1.59 and below having a mean gain of 0.72 levels, and 14 students having a mean gain of 2.28 levels. The effect size of the high experimentals is 1.88σ in comparison with the controls, and that of the low experimentals is 0.33σ.
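The interval-by-interval summation described above can be illustrated with a short Python sketch. It assumes half-level intervals with everything below -0.5 and everything from 1.5 upward collapsed into single intervals, uses the control mean and standard deviation quoted in the text, and takes the 36 gain scores as transcribed from Table 2; the function names are hypothetical. With these assumptions the top interval comes out close to the expected n of 4.89 and χ² of 20.90 printed in Table 2, while the other intervals may differ slightly from the printed values.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def cumulative_chi_square(scores, edges, mean, sd):
    """Compare observed counts per interval with the counts expected under a
    normal distribution, summing chi-square from the lowest interval upward."""
    n = len(scores)
    bounds = [-math.inf] + list(edges) + [math.inf]
    rows, running = [], 0.0
    for low, high in zip(bounds[:-1], bounds[1:]):
        observed = sum(1 for s in scores if low <= s < high)
        expected = n * (normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd))
        term = (observed - expected) ** 2 / expected
        running += term
        rows.append({"low": low, "high": high, "observed": observed,
                     "expected": expected, "chi_square": term,
                     "cumulative": running})
    return rows

# Raw PRT gains of the experimental boys 12+, as transcribed from Table 2.
gains = [
    2.52, 2.61, 2.67, 2.68, 2.85, 2.02, 2.07, 2.17, 2.36, 2.49,
    1.59, 1.85, 1.85, 1.86, 1.91,
    1.06, 1.18, 1.28, 1.28, 1.31, 1.39, 1.39, 1.49,
    0.63, 0.70, 0.72, 0.81, 0.86, 0.94, 0.96,
    0.00, 0.05, 0.05, -0.52, -0.65, -0.76,
]

# Control mean and SD from the text; interval edges are an assumption.
rows = cumulative_chi_square(gains, [-0.5, 0.0, 0.5, 1.0, 1.5], 0.39, 1.01)
for row in rows:
    print(f"{row['low']:>5} to {row['high']:>4}: observed {row['observed']:2d}, "
          f"expected {row['expected']:.2f}, chi-square {row['chi_square']:.2f}, "
          f"cumulative {row['cumulative']:.2f}")
```

Looking up the probability for each cumulative value still requires a chi-square table or a statistics library; the sketch only reproduces the running sum.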

In Table 3 a similar pattern is shown for the residualized gain scores calculated from the regression equation for controls of science achievement (1988) on the pretest PRT level (1985). The total chi-square sum for the control group, as tested against normal distribution, was 4.38 (df = 3, p > 0.2), whereas it can be seen from Table 3 that a similar test on the experimental-group distribution appears to identify three groups. The lowest seven experimentals are distributed as though they were control students (χ² = 0.20). There is then an intermediate group significantly higher than the controls (at p < 0.01), and then a separate group of high experimentals shown by the great jump to χ² = 79.8. For the experimental group as a whole the mean residualized gain is 10.63%, giving a t value of 3.74 (p < 0.0005, 1-tail) for the difference from the controls. But if the experimental group is divided into two at a gain of 16%, the lower group (n = 24) has a mean r.g.s. of -0.3%, virtually the same as that of the controls (M = -0.01, SD = 9.37, n = 46). The mean r.g.s. of the upper group is 24.4% (n = 19), which is an effect size of 2.61σ compared with the controls.
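The effect sizes quoted for the boys 12+ appear to be differences in group means expressed in units of the control standard deviation. The Python sketch below works under that assumption (the function name `effect_size` is hypothetical) and uses the rounded means and standard deviations given in the text.

```python
def effect_size(experimental_mean, control_mean, control_sd):
    """Difference in means in units of the control standard deviation."""
    return (experimental_mean - control_mean) / control_sd

# Science achievement r.g.s. (%), upper experimental group against controls.
upper_science = effect_size(24.4, -0.01, 9.37)

# Raw PRT gains (levels), low and high experimental groups against controls.
low_prt = effect_size(0.72, 0.39, 1.01)
high_prt = effect_size(2.28, 0.39, 1.01)

print(round(upper_science, 2), round(low_prt, 2), round(high_prt, 2))
```

With these inputs the first two values round to the reported 2.61σ and 0.33σ. The third comes out near 1.87σ against the 1.88σ reported, a gap that is consistent with the published means having been rounded before they were printed.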

Alternatively, one could take the view from Table 3 that the distribution is trimodal, the seven lowest scores forming the first group, with a further group of low experimentals significantly different from the controls (range -2.0 to 15.3, n = 17) with a mean r.g.s. and effect size of about 0.5σ.

Table 1
Mean Residualized Gain Scores on Piagetian Reasoning Tasks (1987) and Science Achievement (1988)

                          PRT 1987                                   Science achievement 1988
Age in                    Experimental   Control        (E-C)       Experimental   Control        (E-C)
1985   Sex     School     N    Scoreᵃ    N    Scoreᵃ    σ           N    Score %   N    Score %   σ
11+    Boys    5/V         9   -0.59      2    0.24     -0.77        9     7.6      2    11.9     -0.33
               5/C         8   -0.21     10   -0.26      0.05        8     3.4     10    -0.2      0.27
               6/4         4    0.40     14    0.26      0.13        4     0.6     14    -0.7      0.01
               6/M         3    0.68      6    0.14      0.51        3    -7.9      6    -0.8     -0.54
               9          11   -0.11     15   -0.13      0.02       13     2.1     15    -0.4      0.19
11+    Girls   5/V         4    0.47      3   -0.14      0.79        5    15.1      3    -4.5      1.66**
               5/C         8   -0.76      3   -0.20     -0.73        8     4.6      3     1.5      0.26
               6/M         9    0.65     13    0.02      0.82*       9    11.8     13     1.0      0.91*
               9           9    0.34     13   -0.03      0.48        9    -0.1     13    -0.6      0.04
12+    Boys    3          15    0.67      7    0.24      0.44       15     0.6      6    -9.0      1.03*
               8           8    0.97      9   -0.45      1.44**      8    17.3      9     0.6      1.78**
               9           6    1.57     19    0.30      1.29*      11    10.7     23     0.8      1.05*
               11          7    0.75      8   -0.48      1.25*       9    21.3      8    -0.7      2.35**
12+    Girls   3          15   -0.02      5    0.55     -0.53       15     3.9      5    -2.1      0.53
               8          10    0.24      9   -0.87      1.03*      10     4.5      9    -0.4      0.43
               9           5    1.00      9    0.66      0.32        8     2.1     12     0.4      0.15
               11          3   -0.41      3    0.28     -0.64        3    10.2      3    12.0     -0.16

ᵃ Units are levels on an equal interval scale from 4 (= middle concrete) to 8 (= mature formal).
* p < 0.05. ** p < 0.01.

Table 2
Stem-and-Leaf Diagram of Experimental Raw Gain Scores, 1985-87, on PRTs, Tested for Normal Distribution Having the Same Mean and Standard Deviation as Controls: Boys 12+

Range   Gain scores                                Expected n   χ²      Cumulative χ²   p
 2.5    2.52 2.61 2.67 2.68 2.85
 2.0    2.02 2.07 2.17 2.36 2.49
 1.5    1.59 1.85 1.85 1.86 1.91                   4.89         20.90   33.28           <10^-?
 1.0    1.06 1.18 1.28 1.28 1.31 1.39 1.39 1.49    4.87          2.01   12.38           <0.01
 0.5    0.63 0.70 0.72 0.81 0.86 0.94 0.96         6.52          0.04   10.37           <0.02
 0.0    0.00 0.05 0.05                             6.92          2.22   10.33           <0.02
-0.5                                               5.89          5.89    8.11           <0.05
-1.0    -0.52 -0.65 -0.76                          6.92          2.22    2.22           >0.50

Table 3
Stem-and-Leaf Diagram of Experimental Residualized Gain Scores on Science Achievement, Tested for Normal Distribution Having the Same Mean and Standard Deviation as Controls: Boys 12+

Range   Residualized gain scores                        Expected n   χ²      Cumulative χ²   p
 40     44.9
 35
 30     30.2 32.3 34.5
 25     25.0 25.2 25.3 26.7
 20     20.6 22.3 22.6 23.6
 15     15.0 15.3 16.7 16.7 18.5 19.3 19.5 19.7 19.8
 10     10.0 11.3 11.6 13.0 13.7                        6.12         64.6    79.8            <10^-??
  5     5.8 7.0 9.4                                     6.70          2.04   15.18           <0.01
  0     0.3 1.2 2.2 3.6                                 8.68          2.52   13.14           <0.01
 -5     -0.3 -0.8 -2.0                                  8.68          3.72   10.62           <0.02
-10                                                     6.70          6.70    6.90           <0.10
-15     -10.3 -10.4 -13.0 -14.8                         6.12          0.20    0.20
-20
-25     -20.6 -22.7
-30     -30.7

Girls 12+ in 1985. Although the mean gains on both PRT 1987 and science achievement 1988 are positive for this group in relation to the controls, for PRT 1987 they fall far short of statistical significance (z = 0.84). The chi-square test gives a maximum cumulative sum of 5.62 (5.99 required for significance at the 0.05 level) in comparison with the controls on PRT, whereas for the controls themselves χ² = 2.34 (df = 2, p > 0.3). It is safest to regard this as a noneffect.

For science achievement, though, there is an effect. The distribution for the experimental girls can be read as bimodal like the 12+ boys diagram, or unimodal, with a modal value in the range +5 to +10% residualized gain score. There are seven with r.g. scores in excess of +15%, compared with an expected number of 3.1 given normal distribution with the same mean and standard deviation as the control group. From the experimental group 24 out of 34 had positive residual gain scores, compared with 12 out of 29 for the controls. For the difference between these proportions, z = 2.33 (p = 0.01, 1-tail). For the control group (M = -0.01, SD = 11.27, n = 29) the chi-square sum was 0.29 (p > 0.5), showing normal distribution as with all the control groups. For the experimentals the cumulative chi-square sum is 3.78 (df = 1, p = 0.05) including the intervals below a r.g.s. of 5%, and rises to 5.82 (p < 0.02) because of the excess of values over 15%. The overall effect size is 0.40σ in comparison with the controls if averaged over all experimentals. Alternatively, one could view the effect as due to the top seven students, for which the mean effect would be 2.24σ, and no effect for the rest.

Table 4
Stem-and-Leaf Diagram of Experimental Residualized Gain Scores on Science Achievement, Tested for Normal Distribution Having the Same Mean and Standard Deviation as Controls: Girls 12+

Range   Residualized gain scores               Expected n   χ²     Cumulative χ²   p
 35     36.1 38.5
 30
 25     25.4
 20     24.8
 15     15.1 17.3 19.9                         3.11         -
 10     10.3 10.5                              3.22         -
  5     5.2 5.5 6.9 7.5 7.7 8.0 8.6            4.88         2.04   5.82            <0.02
  0     0.8 0.9 1.1 1.9 2.0 3.5 3.5 4.8        5.78         0.85   3.78            =0.05
 -5     -0.1 -2.7 -4.5                         5.78         1.34   2.93
-10     -6.4 -6.6 -7.3 -7.7                    4.88         1.59   1.59
-15                                            6.34         -
-20     -17.4
-25
-30     -27.2
-35     -31.5

Girls 11+ in 1985. For this group the same pattern as for 12+ boys appears both for PRT scores and for science achievement. Table 5 gives a stem-and-leaf diagram, and details of the chi-square test. As with the 12+ boys, this analysis is done on raw gain scores. For the control group χ² = 0.92 (df = 1, p > 0.3), showing normal distribution. For the experimentals the difference from the controls only becomes significant when the scores exceeding 1.5 levels are entered (χ² = 5.72, df = 1, p < 0.02). The view that the effect is distributed among all the experimentals must be rejected, since the t value for the mean difference between all experimentals and the control group (M = 0.80, SD = 0.77, n = 32) is only 0.30. The effect is localized to the girls with gain scores in excess of 1.5 levels, as can be seen from the chi-square test for normal distribution shown in Table 5. The mean effect size for the top eight girls is 1.91σ.

Table 5
Stem-and-Leaf Diagram of Experimental Raw Gain Scores, 1985-87, on PRTs, Tested for Normal Distribution Having the Same Mean and Standard Deviation as Controls: Girls 11+

Range   Gain scores                                     Expected n   χ²     Cumulative χ²   p
 3.0    3.15
 2.5    2.94
 2.0    2.04 2.11 2.22 2.33                             1.74         -
 1.5    1.53 1.66 1.69                                  3.69         2.35   5.72            <0.02
 1.0    1.07 1.19                                       6.48         3.10   3.37            <0.10
 0.5    0.50 0.54 0.64 0.75 0.78 0.93 0.93 0.97 0.97    7.63         0.25   0.27
 0.0    0.00 0.17 0.24 0.35 0.41 0.46                   5.97         0.02   0.02
-0.5                                                    4.48         -
-1.0    -0.56 -0.85
-1.5    -1.39
-2.0    -1.56

For science achievement the bimodality of the residualized gain scores, as shown in Table 6, is even more obvious. From the experimental group 22 out of 31 had residualized gain scores which were positive, compared with 16 out of 32 for the controls. For the difference between these proportions z = 1.70 (p < 0.05, 1-tail). For the control group (M = -0.112, SD = 11.81) the chi-square total is 1.89 (df = 1, p > 0.1), showing normal distribution, whereas the experimental group shows significant difference from normal distribution with the same mean and standard deviation as the control group only above the value of +10%. This is clearly due to the seven girls with residualized gains of 20 and above. If the effect size is spread over all the experimentals, the value is a substantial 0.60σ in relation to the controls, but it seems more satisfactory to regard the effect as located in the seven high-scoring experimentals, in which case the effect size is 2.05σ.

Table 6
Stem-and-Leaf Diagram of Experimental Residualized Gain Scores on Science Achievement, Tested for Normal Distribution Having the Same Mean and Standard Deviation as Controls: Girls 11+

Range   Residualized gain scores        Expected n   χ²     Cumulative χ²   p
 30     30.2
 25     27.4
 20     20.3 21.3 21.5 23.9 24.7
 15     17.4                            3.11         -
 10     10.0 11.0 11.2 12.3 12.4        2.94         7.98   10.85           <0.001
  5     7.1 8.1 9.0 9.7                 4.30         -
  0     2.2 2.8 4.3 4.4 4.6             5.04         0.01    2.87           <0.10
 -5     -2.0 -3.5 -3.5                  5.06         2.07    2.86           <0.10
-10     -5.1 -9.0                       4.35         -
-15     -10.1 -11.7                     2.99         0.79    0.79
-20     -15.9 -17.3                     3.22         -
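The comparisons of proportions reported for the two groups of girls (24 of 34 against 12 of 29, and 22 of 31 against 16 of 32 with positive residualized gain scores) can be checked with a two-proportion z statistic. The source does not say which form of the test was used; the Python sketch below assumes the pooled-proportion form, which gives values close to the 2.33 and 1.70 reported. The function names are hypothetical.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two proportions,
    using the pooled proportion for the standard error."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / standard_error

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Counts of positive residualized gain scores as given in the text.
z_girls_12 = two_proportion_z(24, 34, 12, 29)
z_girls_11 = two_proportion_z(22, 31, 16, 32)

print(f"girls 12+: z = {z_girls_12:.3f}, p = {one_tailed_p(z_girls_12):.4f}")
print(f"girls 11+: z = {z_girls_11:.3f}, p = {one_tailed_p(z_girls_11):.4f}")
```

An unpooled standard error would give somewhat larger z values for these two comparisons, which is why the pooled form is assumed here.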

Boys 11+ in 1985. With this group there was no sign of bimodality in any of the data plotted. There was no difference in the distributions: For science achievement the chi-square total was 2.32 (df = 3, p > 0.3) in the test for the difference from the control distribution. The only difference which appeared to show in the data was for science achievement, with an effect size of 0.20σ. The proportion of the experimental group with residualized gain scores above zero was 22 out of 37, compared with 23 out of 47 for the controls. For the difference between these proportions, z = 0.97 (p = 0.17, 1-tail). Thus no effect can be claimed on behalf of these experimentals.

Result of delayed posttest PRT 1988. The mean residualized gain scores for experimental and control groups for the posttest PRT administered in 1987 and the delayed posttest given in summer 1988 are shown in Table 7. It can be seen that the substantial effects reported for cognitive development at the end of the intervention (Adey & Shayer, in press) have disappeared.

Table 7
Comparison of Mean Residualized Gain Scores (r.g.s.) and Effect Sizes for Experimental and Control Groups on Piagetian Reasoning Tasks: Posttest (1987) and Delayed Posttest (1988)

                     Posttest                  Delayed posttest
Group                r.g.s.ᵃ   Effect size     r.g.s.    Effect size
12+ Girls
  Controls            0.00                      0.00
  Experimentals       0.12      0.11σ          -0.08     -0.07σ
12+ Boys
  Controls           -0.01                      0.00
  Experimentals       0.90      0.91σ          -0.01     -0.01σ
11+ Girls
  Controls            0.00                      0.00
  Experimentals       0.19      0.26σ          -0.35     -0.43σ
11+ Boys
  Controls            0.00                      0.00
  Experimentals      -0.14     -0.13σ          -0.04     -0.04σ

ᵃ Units are levels on an equal interval scale from 4 (= middle concrete) to 8 (= mature formal).

Discussion

For Tables 2-6, whether the analysis was performed on raw gain scores or residualized gain scores, there was no difference found from normal distribution for the control classes. Thus no systematic errors are added by the method of analysis. For three of the four groups analyzed there were positive effects on science achievement in the year following the CASE intervention. In relation to the standard deviation of the control group science scores, the mean effect sizes were 12+ boys: 1.14σ, 12+ girls: 0.40σ, and 11+ girls: 0.60σ. But closer inspection of the data showed that it was misleading to average these effects over the whole experimental groups. In all experimental groups except 11+ boys there was evidence of bimodal or trimodal distributions in contrast to normal distribution in all control classes. It appears that the intervention affected a smaller proportion (between one-quarter and one-half) of the experimental students with a much larger effect, of the order of two standard deviations. We do not know a previous report in the intervention literature of data showing this pattern, and suggest it will be important in future to check for its presence.

Given the discovery of this pattern, the data on cognitive development reported in Adey and Shayer (1990) by the end of the two-year CASE intervention were reexamined. For the group with the largest effect size on cognitive development by summer 1987 (12+ boys), both of the groups in the bimodal distribution showed mean gains greater than the control group. The low experimental group of 22 students showed an effect of 0.33σ, and the upper group of 14 students had an effect of 2.24σ, in relation to the controls. Whereas earlier we were able only to report slight evidence of an effect on girls initially 11+, it is now seen that 7 of the 30 girls showed an effect 2σ greater than the controls. There was some evidence of a bimodal effect on girls initially 12+, but it just failed to show statistical significance.

There are two puzzling features in the data, which may be related. There is an age/gender interaction, and there is clear evidence that the effects of the intervention were very large, but confined to a minority of the students. The boys and girls were in the same classes during the two-year intervention, and so must have been exposed to the same teaching styles. It may be that the implicit style of the intervention techniques (Adey, Shayer, & Yates, 1989) interacted with the learning styles of the students.

It does appear that for the intervention methods of the CASE project there are ages where the effects are more likely to be substantial. There is little or no effect on cognitive development for the younger boys but the largest effect reported was for boys whose age ranged from an average of twelve-and-one-fourth at outset to 14 at the end of the intervention. In Adey and Shayer (1990) it was found that most of this increase occurred in the second year. But the largest effect for girls, both on cognitive development and subsequent science achievement, was on the group initially twelve and a quarter years old. This may be correlated with differential ages for brain-growth spurts for boys and girls, as suggested in Adey and Shayer (1990). But this does not explain the bimodal distributions of effects on the variables both of cognitive development and science achievement. Our suggestion, argued in Adey and Shayer (in press), is that the CASE lesson materials and teaching methodology may have concentrated on one learning style, abstract analytical. The large effect sizes were of the same order for both the boys and the girls in the groups affected, but there was a smaller proportion of girls showing the effect. This agrees with the differential proportion of adolescents showing the four different learning styles described by Cohen (1986).

The disappearance of the difference between experimentals and controls on cognitive development in the year following the intervention recalls the earlier Head Start research. There was a lag of a year between cognitive gains and science achievement for the experimental groups, and the control groups appeared to have lagged one year behind the experimentals on cognitive development. Has the intervention merely hastened the cognitive development of the experimentals without affecting its ultimate level? From the point of view of the student, higher school achievement in a crucial year should give a permanent advantage, so it can be claimed that the intervention methodology has delivered the major effect that was intended. The psychological meaning cannot be elucidated without gathering further evidence, but there is at least one study in the literature (Feuerstein et al., 1980) in which an intervention on adolescents of the same age range as the present study, giving quite modest effects after two years, resulted in further differential cognitive gains in the two years following. There may be a further lag effect to come.

With the publication of the CASE intervention materials entitled Thinking Science (Adey, Shayer, & Yates, 1989) there is the opportunity for further research and development with the factor of learning style given explicit attention. But the present implications of the work reported for classroom practice are unfamiliar and will be highlighted briefly. Everyone concerned with science curriculum development in schools, whether concerned with the disabling effect some students’ alternative conceptions can have on their learning if not addressed explicitly or wishing directly to improve the quality of students’ learning, has the aim of increasing student achievement. The question is: What kind of professional skills are required from science teachers if students’ science achievement is to improve? The implicit assumption of many science educators is that the quality of instruction needs to be improved. In Britain the government has legislated a new National Curriculum, and detailed lists of levels of 17 aspects of science attainment desired for all have been published (Department of Education and Science, 1989), which can readily be shown to be characteristic of what can be expected only from the top 30% of the ability range (Shayer & Adey, 1981).

We believe that the major limiting factor to the achievement of some 70% of students in science in secondary schools is the cognitive level at which they can process fresh learning information. We have found that the professional skills which need developing to address this factor are radically different from those hitherto developed in good science teaching practice. Specifically, they do involve starting where each individual learner is (many of the CASE lessons ask for much more class discussion than is usual, so that each student contributes to collaborative reflective learning), but they also involve structuring the subsequent intervention quite directively according to a strong Piagetian model. One principle is that new terminology, for example, the notions of a variable, values of variables, and the relationship between variables, is best introduced in contexts which require concrete modeling only. Thus the students can gain competence and confidence in its use well in advance of being presented with a situation requiring formal models, for which the vocabulary will then be an aid. Another strong element in teachers’ skill is that of using and recognizing the Piagetian reasoning patterns (formal schemata) which underlie science lessons. Thus, bearing in mind the pattern of control of variables as an element in the design of experiments, when students are introduced to the vocabulary of variables they are also given several instances where the right conclusion is that there is no relationship. In this way the question of how to approach the design of simple experiments has been prefigured. Further detail is given in Shayer (1986). The essential difference from existing science teaching expertise is that the underlying agenda is shifted back one level of abstraction, from the specifics of science concepts and algorithms to a raising of consciousness about the reasoning patterns the different contexts have in common.
Teachers responsible for average and below-average students will need to add this skill to their existing skills if their students’ science achievement is to improve substantially. The priority appears to be to develop and describe the growth of this skill, and our present work in progress (CASE III) involves close collaboration with science teachers working with 12- to 14-year-old students to that end.

References

Adey, P.S., & Shayer, M. (1990). Accelerating the development of formal thinking in middle and high school students. Journal of Research in Science Teaching, 27(3), 267-285.

Adey, P.S., Shayer, M., & Yates, C. (1989). Thinking science: Student and teachers’ materials for the CASE intervention. London: Macmillan.

Cohen, R.A. (1986). Conceptual styles and social change. Acton, MA: Copely.

Cronbach, L., & Furby, L. (1970). How should we measure change-or should we? Psychological Bulletin, 74, 68-80.

Department of Education and Science (1989). Science in the National Curriculum. London: Her Majesty’s Stationery Office.

Embretson, S.E. (1987a). Diagnostic testing by measuring learning processes: Psychometric considerations for dynamic testing. In N. Frederiksen, A. Lesgold, R. Glaser, & M. Shafto (Eds.), Diagnostic monitoring of skill and knowledge acquisition. Hillsdale, NJ: Erlbaum.

Embretson, S.E. (1987b). Improving the measurement of spatial aptitude by dynamic testing. Intelligence, 11, 333-358.

Feuerstein, R., Rand, Y., Hoffman, M., & Miller, R. (1980). Instrumental enrichment: An intervention programme for cognitive modifiability. Baltimore: University Park Press.

National Foundation for Educational Research (1979). Science reasoning tasks. Windsor: Author. (Available as Piagetian reasoning tasks from Science Reasoning, Room G.11, King’s College, University of London, Cornwall House, Waterloo Road, London SE1 8TX).

Shayer, M. (1986). Data processing and science investigation in schools. Research Papers in Education, 1(3), 237-253.

Shayer, M., & Adey, P.S. (1981). Towards a science of science teaching. London: Heinemann Educational Books.

Shayer, M., & Beasley, F. (1987). Does instrumental enrichment work? British Educational Research Journal, 13(2), 101-119.

Shayer, M., Küchemann, D.E., & Wylam, H. (1976). The distribution of Piagetian stages of thinking in British middle and secondary school children. British Journal of Educational Psychology, 46, 164-173.

Shayer, M., & Wylam, H. (1978). The distribution of Piagetian stages of thinking in British middle and secondary school children. II: 14-16 year-olds and sex differentials. British Journal of Educational Psychology, 48, 62-70.

Manuscript accepted February 1, 1990.