comparison of the development of formal thought in … · 2014. 7. 16. ·...

11
Developmental Psychology I999 vol. 35, No. 4, 1048-1058 Copyright 1999 by the American Psychological Association, Inc. 0012-1649/99/S3.00 Comparison of the Development of Formal Thought in Adolescent Cohorts Aged 10 to 15 Years (1967-1996 and 1972-1993) Andre Flieller University Nancy 2 Five standard Piagetian tests were administered to 180 adolescents between the ages of 10 and 15 years. The results were compared with those obtained in 1967 and in 1972 for similar participant samples. At equal ages, today's adolescents exhibited a higher level of cognitive development than the adolescents of 20 or 30 years ago. The amount of gain observed varied across tasks, being very large for combinatory thought but mixed for conservation. This acceleration of cognitive development can partially explain the continuous rise in intelligence test performance (Flynn effect). Many cohort comparisons have shown that the mean score on a variety of intelligence tests (e.g., Raven Progressive Matrices, Stanford-Binet Intelligence Scale, Wechsler Intelligence scales) has been rising by approximately 3 IQ points per decade for at least 60 years (see reviews by Flynn, 1984, 1987; Storfer, 1990). This rise has been noted in over 20 countries, including France, where the studies reported in the present article took place (e.g., Flieller, Manciaux, & Kop, 1995; Pry & Manderscheid, 1993; Wechsler, 1996). Whereas the phenomenon of rising IQ scores, known as the Flynn effect, is well documented, it remains puzzling to many scholars, who are divided as to its interpretation (see Neisser, 1998). Many environmental variables have been proposed to ex- plain the rise, including better nutrition (e.g., Lynn, 1990), the development of education (e.g., Teasdale & Owen, 1989), and the ever-increasing complexity of modern environments (e.g., Schooler, 1998). But none of these variables alone seems to be able to account for all of the observations, nor can any one variable explain performance changes of the magnitude observed. Accord- ing to Flynn (1987, 1996, 1998), the observed gains are too great to correspond to a genuine increase in cognitive abilities. One of Flynn's main arguments is that these massive IQ gains are not accompanied by similar progress in real life. A case in point is the fact that no improvement in scholastic achievement has been observed in the United States. Some approaches aimed at better explaining the Flynn effect have recently been proposed. For Neisser (1997), the one- dimensional conception of intelligence that underlies the debate about the Flynn effect should be discarded: "Different forms of intelligence are developed by different kinds of experience ... We are indeed very much smarter than our grandparents where visual I would like to thank Francois Longeot and Jacqueline Hornemann for the information they so willingly provided; Delphine Lambert and Youssef Tazouti for their valuable collaboration; Charles Brainerd and Douglas Detterman for their comments on an earlier version of this article; and Vivian Waltz for the English translation. Correspondence concerning this article should be addressed to Andre Flieller, Laboratory of Psychology, University Nancy 2, B.P. 33-97, F-54015 Nancy Cedex, France. Electronic mail may be sent to [email protected]. analysis is concerned, but not with respect to other aspects of intelligence" (p. 447). On her side, Greenfield (1998) proposed attempting to determine the historical and cultural changes that might account for higher test scores and testing their explanatory power by means of field and lab experiments. To illustrate, she reported studies demonstrating the impact of video games and computers on the development of visual-spatial skills, measured on tests where a strong Flynn effect was found (e.g., the Object Assembly subtest of the Wechsler Intelligence Scale for Children [WISC]). According to Williams (1998), the Flynn effect is the result of multiple variables, whose effects should be studied sep- arately. She advocated making finer-grained comparisons aimed at pinpointing specific changes that could be rooted in numerous variables. These proposals all point to the utility of conducting analytic studies of performance improvements on cognitive tests. The above authors suggest that test-dependent or even item-dependent variations in the extent of the rise can provide researchers with cues about what variables are at play in the Flynn effect. The present article takes such an analytic approach. It reports two studies comparing the performance on Piagetian tasks of same-age adolescents from different cohorts. As shown below, Piagetian tasks have a number of original characteristics that are not found in conventional tests. As such, any specific performance changes observed on this type of task may help explain the Flynn effect. Note that cohorts have never been compared using Piagetian tasks, although certain observations suggest that the transition from the preoperational stage to the concrete operational stage is now occurring at a faster pace (Bronkhart as cited in Reese & McClus- key, 1984; Ruffy, 1981). This article focuses on the transition to formal operational intelligence. The choice of this developmental period was motivated by the importance of environmental vari- ables in the acquisition of formal thought, a fact that Piaget (1972) himself acknowledged. What are the characteristics of Piagetian tasks? It is now well established that the measures supplied by these tasks are correlated (sometimes strongly) with those obtained on conventional intelli- gence tests like the WISC or Raven Progressive Matrices. The correlation has been observed on formal-level tasks (Dupont, Gendre, & Pauli, 1975; Keating, 1975; Lim, 1988, 1996; Longeot, 1969; Neimark, 1975) as well as on concrete-level ones. Cloutier 1048

Upload: others

Post on 07-Sep-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

Developmental PsychologyI999 vol. 35, No. 4, 1048-1058

Copyright 1999 by the American Psychological Association, Inc.0012-1649/99/S3.00

Comparison of the Development of Formal Thought in Adolescent CohortsAged 10 to 15 Years (1967-1996 and 1972-1993)

Andre FliellerUniversity Nancy 2

Five standard Piagetian tests were administered to 180 adolescents between the ages of 10 and 15 years.The results were compared with those obtained in 1967 and in 1972 for similar participant samples. Atequal ages, today's adolescents exhibited a higher level of cognitive development than the adolescentsof 20 or 30 years ago. The amount of gain observed varied across tasks, being very large for combinatorythought but mixed for conservation. This acceleration of cognitive development can partially explain thecontinuous rise in intelligence test performance (Flynn effect).

Many cohort comparisons have shown that the mean score on avariety of intelligence tests (e.g., Raven Progressive Matrices,Stanford-Binet Intelligence Scale, Wechsler Intelligence scales)has been rising by approximately 3 IQ points per decade for atleast 60 years (see reviews by Flynn, 1984, 1987; Storfer, 1990).This rise has been noted in over 20 countries, including France,where the studies reported in the present article took place (e.g.,Flieller, Manciaux, & Kop, 1995; Pry & Manderscheid, 1993;Wechsler, 1996).

Whereas the phenomenon of rising IQ scores, known as theFlynn effect, is well documented, it remains puzzling to manyscholars, who are divided as to its interpretation (see Neisser,1998). Many environmental variables have been proposed to ex-plain the rise, including better nutrition (e.g., Lynn, 1990), thedevelopment of education (e.g., Teasdale & Owen, 1989), and theever-increasing complexity of modern environments (e.g.,Schooler, 1998). But none of these variables alone seems to beable to account for all of the observations, nor can any one variableexplain performance changes of the magnitude observed. Accord-ing to Flynn (1987, 1996, 1998), the observed gains are too greatto correspond to a genuine increase in cognitive abilities. One ofFlynn's main arguments is that these massive IQ gains are notaccompanied by similar progress in real life. A case in point is thefact that no improvement in scholastic achievement has beenobserved in the United States.

Some approaches aimed at better explaining the Flynn effecthave recently been proposed. For Neisser (1997), the one-dimensional conception of intelligence that underlies the debateabout the Flynn effect should be discarded: "Different forms ofintelligence are developed by different kinds of experience . . . Weare indeed very much smarter than our grandparents where visual

I would like to thank Francois Longeot and Jacqueline Hornemann forthe information they so willingly provided; Delphine Lambert and YoussefTazouti for their valuable collaboration; Charles Brainerd and DouglasDetterman for their comments on an earlier version of this article; andVivian Waltz for the English translation.

Correspondence concerning this article should be addressed to AndreFlieller, Laboratory of Psychology, University Nancy 2, B.P. 33-97,F-54015 Nancy Cedex, France. Electronic mail may be sent [email protected].

analysis is concerned, but not with respect to other aspects ofintelligence" (p. 447). On her side, Greenfield (1998) proposedattempting to determine the historical and cultural changes thatmight account for higher test scores and testing their explanatorypower by means of field and lab experiments. To illustrate, shereported studies demonstrating the impact of video games andcomputers on the development of visual-spatial skills, measuredon tests where a strong Flynn effect was found (e.g., the ObjectAssembly subtest of the Wechsler Intelligence Scale for Children[WISC]). According to Williams (1998), the Flynn effect is theresult of multiple variables, whose effects should be studied sep-arately. She advocated making finer-grained comparisons aimed atpinpointing specific changes that could be rooted in numerousvariables.

These proposals all point to the utility of conducting analyticstudies of performance improvements on cognitive tests. Theabove authors suggest that test-dependent or even item-dependentvariations in the extent of the rise can provide researchers withcues about what variables are at play in the Flynn effect. Thepresent article takes such an analytic approach. It reports twostudies comparing the performance on Piagetian tasks of same-ageadolescents from different cohorts. As shown below, Piagetiantasks have a number of original characteristics that are not foundin conventional tests. As such, any specific performance changesobserved on this type of task may help explain the Flynn effect.Note that cohorts have never been compared using Piagetian tasks,although certain observations suggest that the transition from thepreoperational stage to the concrete operational stage is nowoccurring at a faster pace (Bronkhart as cited in Reese & McClus-key, 1984; Ruffy, 1981). This article focuses on the transition toformal operational intelligence. The choice of this developmentalperiod was motivated by the importance of environmental vari-ables in the acquisition of formal thought, a fact that Piaget (1972)himself acknowledged.

What are the characteristics of Piagetian tasks? It is now wellestablished that the measures supplied by these tasks are correlated(sometimes strongly) with those obtained on conventional intelli-gence tests like the WISC or Raven Progressive Matrices. Thecorrelation has been observed on formal-level tasks (Dupont,Gendre, & Pauli, 1975; Keating, 1975; Lim, 1988, 1996; Longeot,1969; Neimark, 1975) as well as on concrete-level ones. Cloutier

1048

Page 2: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

TRENDS IN ADOLESCENT COGNITIVE DEVELOPMENT 1049

and Goldschmid (1976), for example, found a correlation of .46between a formal operations test (proportion test) and the RavenProgressive Matrices. The exact factorial composition of Piaget'stasks is not known because of a number of methodological prob-lems, particularly concerning task sampling (see reviews by Car-roll, 1993; Gregoire, 1991; Lautrey, Ribaupierre, & Rieben, 1990).Some conclusions may nevertheless be drawn. A general and oftenstrong factor has been found in every factor analysis of Piagetiantasks (e.g., Jensen, 1980). However, group factors are also noted.For example, several studies (Inman & Secrest, 1981; Lautrey etal., 1990; Londeix, 1985) found evidence of an opposition betweenspatial reasoning tasks and logical reasoning tasks (classification,seriation, quantification of probabilities, prepositional logic, etc.).Likewise, a factor specific to conservation tasks has been found insome studies (Carroll, Kohlberg, & De Vries, 1984; Lautrey et al.,1990). The relations between these factors and those of conven-tional tests need to be clarified (Carroll, 1993). One can say,however, that spatial tasks, Piagetian or otherwise, have their ownspecificities. Moreover, a few studies (e.g., Humphreys, Rich, &Davey, 1985) have shown that Piagetian tasks have a higherloading on the nonverbal factor than on the verbal factor, andcertain authors (e.g., Rubin, Brown, & Priddle, 1978) take Piage-tian measures to be akin to measures of fluid intelligence.

Despite these similarities, Piagetian tasks have certain interest-ing properties. First, they do not have time limits; performancethus depends on reasoning ability, not execution speed. This pointis important, because a number of authors (e.g., Brand 1987a,1987b) have explained the Flynn effect in terms of an increase inresponding speed, and Flynn himself (1996), although he rejectedthis hypothesis, felt that it was "theoretically ideal" (p. 24). Sec-ond, a correct response is only accepted in Piagetian tasks if theparticipant is capable of justifying it. Although the validity ofusing children's explanations as a criterion for assessing theirresponses has been debated (see Brainerd, 1973a, 1977b), chil-dren's justifications can provide useful information in diachroniccomparisons. In particular, they may reveal changes in problem-solving strategies. Third, there is a hierarchical order among theitems in a given task that has been validated using various tech-niques (hierarchical analysis, Rasch analysis; see Bond, 1995;Green, Ford, & Flamer, 1971). The existence of this hierarchybrings out potential changes in the order of the items, and anychange of this type can be taken as an indication of a potentialvariation over time in the meaning of the items concerned. Last,Piagetian tasks measure the acquisition of specific concepts andways of reasoning, such as the concept of proportion at the formaloperational level. As a consequence, accurate detection of cogni-tive progress contributing to the Flynn effect is likely to be easierthan with traditional tests.

It should be noted that the above description does not implyadherence to Piaget's theory. Moreover, except perhaps for thesecond point, Piagetian tasks are not the only ones to have thefeatures mentioned above (e.g., like Piagetian items, many items inthe Stanford-Binet test are not timed). However, only a few testspossess all four properties, and they have rarely been used tocompare cohorts. Piagetian tasks thus have something original tooffer, and this puts them in a good position for supplying new datato account for the Flynn effect.

What can one expect to find if one compares the responses ofadolescents on the formal operations tasks to those of same-age

adolescents 20 years earlier? What might be the significance andutility of such observations?

If the rise in IQ scores corresponds in part to a faster pace ofcognitive development, this should come through on the Piagetiantasks. In other words, access to the formal operations level shouldoccur earlier today than 20 years ago. Improvements in perfor-mance on Piagetian tasks cannot be explained in terms of fasterprocessing speed or more intelligent guessing, two variables sug-gested by Brand (1987a, 1987b). Nor can an overall increase informal operations task scores be the result of the development ofspecific visual analysis skills (Neisser, 1997), because some ofthese tasks do not tap this type of ability (e.g., quantificationof probabilities).

However, the Piagetian tasks are not homogeneous, as somefactor analytic studies have shown, and they are affected to dif-ferent extents by past experience (for the formal operational level,see Larivee, Longeot, & Normandeau, 1989). An example isperformance on conservation tasks, which seems to be less af-fected by the level of schooling than other tasks, such as seriationor quantification of probabilities (Laurendeau-Bendavid, 1977).By contrast, combinational thinking problems, such as the permu-tation of tokens, seem to be highly dependent on the mathematicscurricula offered in the schools (Longeot, 1968). If performancehas truly improved more on tests with a visuospatial componentthan it has on other tests (Flieller, Jautz, & Kop, 1989; Lynn &Hampson, 1986), then this effect should show up on some Piage-tian tasks. For instance, the so-called mechanical curves problem(Nassefat, 1963), which relies on spatial representations, shouldgive rise to greater cross-cohort differences than the other formaloperations tasks.

These trends in access to formal thinking are not relevant to theFlynn effect alone but also contribute to the debate on the learningof Piagetian concepts, a debate that reached its peak in the 1970s(reviews reflecting different points of view have been published byseveral authors, e.g., Brainerd, 1973b, 1977a; Kuhn, 1974;Lefebvre-Pinard, 1976; Strauss, 1972, 1974). Some Piagetian au-thors contend that improvements in performance obtained in alaboratory setting cannot be equated with the spontaneous progressobserved during development. One of the merits of cohort com-parisons is that they bring out differences obtained under naturalconditions. Of course it remains to be shown that these differencesare not completely artifactual, which brings one back to the con-troversy about the Flynn effect.

If rising test scores reflect faster cognitive development, theresponses should be qualitatively the same for a given performancelevel: Correct answers should be justified by the same argumentsand even result from the same strategies, and incorrect answersshould be caused by the same reasoning errors. Qualitative differ-ences in the responses may be indicative of changes in the proce-dures used, making cohort comparisons problematic. Suppose, forexample, that only the adolescents in the most recent cohort use amathematical formula to solve a combination problem. Such adifference could hardly be regarded as a developmental difference.

It also follows from the hypothesized speed-up in the pace ofcognitive development that the item hierarchy for a given task willnot vary. The fact that, when ranked by difficulty, the order of theitems has not changed over time obviously does not suffice toprove the hypothesis. On the other hand, a significant modification

Page 3: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

1050 FLIELLER

in this order could lead one to suspect changes in the very meaningof the measures.

Surveys on the level of operational reasoning in adolescents thatcould potentially serve as a basis for cohort comparisons arescarce. Such studies would have to meet a number of conditionsregarding the samples and tasks, and these conditions are rarelysatisfied. In France, the only adequate data can be found in twonorming studies on the Logical Thought Development Scale. Thisscale was first standardized in 1967 by its author, who publishednorms for each task (Longeot, 1967). Because the norming groupsused by Longeot were small (30 participants per age group),Hornemann standardized Longeot's scale in 1972 on a largersample; however, this sample was limited to 10- to 12-year-olds,and norms were constructed only for the total score.

Therefore, I conducted two comparative studies. The first studywas based on the 1972 sample. Its specific aims were to test thehypothesis of an improvement in overall performance in formaloperational tasks and to compare this improvement with thatobserved on conventional tests. The second study was based on the1967 sample. It was aimed at duplicating the previous cohortcomparison on 13- to 15-year-olds by studying performancechanges task by task and making precise, qualitative comparisonsof the responses on certain items.

Given that both studies were based on Longeot's scale and thatthis test is not well-known outside of France, I describe the scalein the first part of this article, which serves as a general introduc-tion to the present investigations.

Longeot's Logical Thought Development Scale

Longeot's Echelle de Developpement de la Pensee Logique(Logical Thought Development Scale [LTDS]) was designed tomeasure the cognitive development of adolescents (see Longeot,1967, 1969). It comprises five tests, which are administered indi-vidually. Each test corresponds to the standardization of a problemused previously by Piaget and Inhelder in their experiments (e.g.,Inhelder & Piaget, 1958; Piaget & Inhelder, 1951; Piaget, Inhelder,& Szeminska, 1956).

In compliance with Piagetian tradition, the operational level(concrete, or C; intermediate, or /; formal A, or Fa; formal B, orFb) is assessed not only according to the children's ability to solvethe test problems but also on the basis of the explanations they giveto support their answers and their reactions to the experimenter'sobjections and countersuggestions. However, unlike Piaget's clin-ical method, the experimenter's interventions are strictly defined inthe test manual.

Conservation

The materials consist of three same-size balls—two of clay andone of metal—and two containers of water. There are three itemsin the test. The first assesses the dissociation of weight andvolume. The experimenter asks the participant to explain why thewater rises when the balls of clay are dropped in the water(explanation by weight or by volume) and to predict the height ofthe water in the containers (equal or unequal levels). Next, theexperimenter shows the participant one of the clay balls and themetal ball and asks him or her to predict the water levels inthe containers and justify the answer given. The second item tests

conservation of volume. The experimenter rolls a ball of clay intoa long sausage shape and asks the participant to predict the waterlevels if the ball of clay is dropped in one container and thesausage-shaped clay is dropped in the other. A justification isrequested. The same questions are asked after a second transfor-mation in which the sausage is broken up into 10 or so pieces.Objections are made if the answer is incorrect. The third item testsconservation of weight. The participant must predict the relativeweights of the untransformed ball of clay and the other ball of clayafter it is flattened and then broken into small pieces. Justificationsare requested, and objections are made to incorrect answers. Thethird item is given only following failure on both of the precedingitems; otherwise, the participant is credited with a success.

Level C is defined as success on conservation of weight only,and Level / is defined as success on conservation of volume or onvolume-weight dissociation. Participants who succeed on bothconservation of volume and volume-weight dissociation are de-fined as intermediate or better ( a / ) .

Permutation

The materials are tokens of different colors. The test has fouritems. In the first, the participant has to predict how many differentways a yellow token, a red token, and a blue token can be lined up(prediction). The participant is then asked to line up the tokens onthe table in all possible ways (execution). The experimenter makessuggestions in cases of failure. Only the execution phase is scored.Then a green token is added and the same procedure is followed(prediction followed by execution). The prediction of the numberof permutations is the second item, and execution is the third. Forthe fourth item, still another token is added (purple) and theparticipant is simply asked to predict the number of possiblepermutations of the five colors. The participant must explain themethod used, and the experimenter determines whether the partic-ipant is capable of generalizing the method to six and seven colors.

The scoring is based on response accuracy, the number ofsuggestions made by the experimenter, and the method used by theparticipant (e.g., the answer 12 on Item 2 is not accepted if it isobtained by doubling the permutations of three colors: 6 permu-tations for 3 colors, therefore 12 permutations for 4 colors).

On this test, Levels C and / are not distinguished: Participantswho succeed on the first item are classified at the concrete orintermediate level (C-/), and those who fail on this item areclassified as below C ( < Q . Level Fa is defined as success onItem 2 or 3, and Level Fb is defined by success on Item 4.

Quantification of Probabilities

In this test, two kinds of tokens are used: plain yellow tokensand yellow tokens with a large black X on one side. Each item usestwo collections of tokens; for example, a collection of four tokens,one of which has an X, and a collection of four tokens, two ofwhich have an X. The participant has to decide in which of the twocollections he or she will have a greater chance of picking a tokenwith an X on the first try, without looking.

The test includes eight items. In the four Level-C items, theparticipant has to compare the following proportions (where thenumerator is the number of tokens with an X and the denominatoris the total number of tokens): V* and %; Vs and 3/T, lA and Vi; %

Page 4: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

TRENDS IN ADOLESCENT COGNITIVE DEVELOPMENT 1051

and 3/7. The level is reached if two or more of these problems arecorrectly solved. In the single Level-/ item, the proportions Vz and% have to be compared. In the two Level-Fa items, % has to becompared with Vi, and 3/9 with %. The Fa level is reached if at leastone problem is correctly solved. Finally, the one Level-Fb iteminvolves the comparison of the proportions % and Vs.

Pendulum Oscillations

The materials consist of a stand, a hanging string, and fiveweights (50 g, 100 g, 150 g, 200 g, 250 g) that can be hung on thestring. The experimenter hangs a weight, sets it in motion, and thenexplains to the participant that different variables can affect theperiod of the pendulum: the length of the string, the weight of thependulum, the force of the initial push, and the height from whichthe pendulum is released. The participant is then asked to doexperiments to determine which variables account for the pendu-lum's motion and is allowed to perform whatever experiments heor she wishes. If the participant changes only one variable at atime, the experimenter proposes changing two—for, example, theweight and the length of the string (countersuggestion). When theparticipant has finished, the experimenter asks him or her to drawconclusions.

This test, which has only one item, is a Level-F& test. This levelis reached if and only if the participant proceeds in a systematicway by changing only one variable at a time. In cases of failure,the participant is rated as below Fb (<Fb).

Mechanical Curves

This test requires a small apparatus composed of a cylinderfastened to a stand that can be rotated around its horizontal axisusing a crank. The cylinder is covered with a standard sheet ofpaper (8 X 11 Vz in.). A pencil can be moved horizontally from oneend of the cylinder to the other along a bar attached on top of thestand. The task consists of imagining and drawing the line drawnby the pencil as it moves along the rotating cylinder. Thus, twosimultaneous movements must be coordinated: the horizontal mo-tion of the pencil and the rotation of the cylinder.

The test has six items: the pencil moves without cylinder rota-tion (Level O ; the cylinder rotates without pencil movement(Level C); the cylinder rotates and the pencil moves from one endto the other (Level /); the cylinder rotates and the pencil goes andcomes back (Level Fa); the cylinder does two full rotations andthe pencil goes and comes back (Level Fa); and the cylinder doestwo rotations and the pencil goes to one end but does not comeback (Level Fb).

Level C is reached when Item 1 or 2 is solved, Level / whenItem 3 is solved, Level Fa when Item 4 or 5 is solved, and LevelFb when Item 6 is solved.

Total Score and Overall Operational Level

To determine the total score, one must weight the items in sucha way that the different tests have the same weight at eachoperational level. For example, the conservation test and the per-mutation test have a single Level-C item, whereas the probabilitytest has four and the mechanical curves test has two. Thus, success

on these items is scored 2 on conservation and permutation, V2 onprobability, and 1 on mechanical curves.

In the Piagetian perspective, the total score is of little directinterest. As a consequence, it is converted into an overall opera-tional level by means of a table. The resulting operational levelsare C, /, Fa, and Fb. In a psychometric perspective, the total scoreobviously has a value of its own, and it is particularly interestingin cohort comparisons. For additional information on the LTDS,readers may wish to refer directly to Longeot's publications (1966,1967, 1969).

Study 1

The first study was designed to compare the Longeot test scoresof adolescents aged 10-13 years examined in 1972 versus 1993.

Method

Participants

1972 participants. Following the publication of the LTDS (Longeot,1967), Hornemann established the test norms using a larger sample thanLongeot's. Data collection took several years and was completed in 1972(J. Hornemann, personal communication, September 18, 1995). Her resultswere included in the 1974 LTDS manual. The sample was selected usingthe quota method and was representative of the French school populationat the time. It consisted of three age groups. The 10-year-old groupincluded 150 participants ranging in age from 9 years 6 months to 10years 5 months; the 11-year-old group included 180 participants whoranged in age from 10 years 6 months to 11 years 5 months; and the12-year-old group consisted of 180 participants ranging in age from 11years 6 months to 12 years 5 months.

1993 participants. The 1993 sample was drawn from nine elementaryschools and three junior high schools in the metropolitan area of Nancy,France. It included three age groups with 30 participants each: a 10-year-old group (ranging in age from 9 years 6 months to 10 years 5 months), an11-year-old group (ranging in age from 10 years 6 months to 11 years 5months), and a 12-year-old group (ranging from 11 years 6 months to 12years 5 months). The age groups were defined as in Hornemann, but themedians were slightly below the theoretical ones (9 years 9 months, 10years 11 months, and 11 years 10 months instead of 10 years, 11 years,and 12 years).

A weighting procedure using the weight command of SPSS (StatisticalPackage for the Social Sciences) was applied to each subsample (agegroup) in order to ensure representativeness with respect to national quotas.Four variables were controlled: sex, nationality, school grade, and father'soccupation. In other words, for each age group, the distribution of thestudents along these four variables was the same in the sample and inFrance as a whole.

Scale and Scoring

The study used the LTDS scale described above. In 1993, as well as in1972, the tests were scored by an assistant and checked by me, and Iadhered strictly to the scoring instructions in the manual. These instruc-tions, which have not been changed since the publication of the LTDS, arevery clear, and no interpretation problems were encountered.

Procedure

In 1972, the LTDS was administered in accordance with the standardprocedure (full administration of the scale). In 1993, an abbreviated versionhad to be used in order to comply with the 1-hr-per-student time limit set

Page 5: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

1052 FLIELLER

by the schools, where the adolescents were tested individually. To savetime, I did not give the pendulum test because 10- to 12-year-old childrenrarely succeed on this problem. For the same reason, the permutation testwas stopped when the participant failed the Fa-level item, and the proba-bility and mechanical curves tests were stopped when the participant failedon the /-level item. The procedure followed in 1993 on these three tests isauthorized in the LTDS manual. In other words, the dropped items areconsidered optional in the manual, although they were in fact administeredin 1972.

Results

Statistical Analyses

Because of the small sample size in 1993 (30 per age group), allstatistical processing was done with and without weighting. Theresults with weighted and unweighted procedures yielded no sub-stantive differences given that the quotas were controlled asstrictly as possible during data collection. The results presentedbelow were obtained with weighted statistical procedures. (Theresults of unweighted statistical procedures are available from theauthor.)

Psychometric Properties of the LTDS in 1993

Principal-components analysis of the LTDS. When standard-izing the LTDS, Hornemann did not study the psychometric prop-erties of the scale. To fill this gap, I analyzed the structure of theLTDS using a principal-components analysis on the 1993 sample.On the basis of Kaiser's criterion, only one factor could be re-tained, which accounted for 56% of the total variance. The factorloadings were .59 for conservation, .80 for permutation, .82 forprobability quantification, and .77 for mechanical curves.

Reliability of the LTDS tests. The reliability of the LTDS testswas measured by Cronbach's coefficient alpha. The followingvalues were obtained: .62 for conservation (three items), .63 forpermutation (four items), .81 for probability quantification (eightitems), .66 for mechanical curves (six items), and .83 for the wholescale.

Total Score and Operational Levels in Each Cohort

The LTDS manual supplies two types of data taken from Home-mann's standardization of the test: the distribution of the totalscore in the 1972 sample and the frequency table of the overalloperational level in that same sample.

Comparison of the total scores in 1972 and 1993 did not yield asignificant difference in the variance: In 1972, SD = 4.6; in 1993,SD = 4.8; Levene's test for equality of variance, F(l, 598) = 0.07,p = .80. In contrast, the mean of the total score increased signif-icantly, rising from 11.3 in 1972 to 12.4 in 1993, r(598) = 2.04,p < .03 (one-tailed). The differences in the means were equivalentat 0.23 SD. For the normalized scores (M = 100, SD = 15), theprogression was thus 3.5 points over 21 years, or 5 points in a30-year generation. No age-linked variation in this progressionwas found: Cohort x Age, tested by an analysis of variance, F{2,594) = 28.01, p = .26.

Table 1 gives the distribution of participants across the fouroperational levels by testing date. The percentage of 10- to 12-year-olds at Level C was lower in 1993 than in 1972: after

Table 1Overall Level of Development on the LTDS by Ageand Date of Administration

% of participants at each level

Age 10-12 years(Study 1)

Age 13-15 years(Study 2)

Level1972

(n = 510)1993

(n = 90)1967

(n = 90)1996

(« = 90)

ConcreteIntermediateFormal AFormal B

4644

91

3747133

2243269

12344015

Note. Percentages have been rounded off. LTDS = Echelle de Devel-oppement de la Pensee Logique (Logical Thought Development Scale).

combining Groups Fa and Fb, ^ ( 2 , N = 600) = 3.44, .05 < p <.10 (one-tailed).

Discussion

The use of an abbreviated version of the LTDS in 1993 slightlyreduced cross-cohort differences, which would have been a littlelarger if the pendulum test had been given and if the other tests hadbeen administered in their entirety. Therefore, the above results areconservative.

The improvement in performance found in the present study (5points in a generation) is not as great as that observed in nonverbaltests from three other French studies on children (to facilitatecomparisons, I standardized the scores and calculated the progres-sion over a 30-year period). Pry and Manderscheid (1993) noteda 15.2-point progression in one generation for 9- to 10-year-oldchildren tested on the Raven Coloured Progressive Matrices in1962 and 1993. For 7-year-olds tested in 1973 and 1992, Flieller etal. (1995) observed a gain of 11.4 points on the nonverbal part ofa time-limited group test, the Echelle Collective de Niveau Intel-lectuel (Collective Scale of Intellectual Level [ECNI]), and 3.1points on the verbal part. A comparison of the scores obtained bythe same sample of 11-year-old children on the revised version ofthe WISC (1981 norms) and on the 3rd edition of the WISC(WISC-III; 1996 norms) revealed a gain in one generation of 9.2IQ points on the Performance scale and 5 IQ points on the Verbalscale (Wechsler, 1996). The difference between the LTDS resultsand the nonverbal test results in these three studies is all the morenotable because, as I indicated earlier, Piagetian tasks are morestrongly correlated with nonverbal tests than with verbal ones. Thedifference may be due to the fact that visuospatial skills play alesser role in the LTDS than in the nonverbal tests in question. Thisinterpretation is consistent with Neisser's (1997) explanation ofthe Flynn effect. Regarding the execution-speed account proposedby Brand (1987a, 1987b), speed seems to play a secondary role inthe performance gains because they were as great on the LTDS andon the WISC-III Verbal scale (both untimed) as on the ECNIverbal tests (timed).

Study 2

The second study was primarily aimed at extending the com-parison to older adolescents (aged 13 to 15 years) and examining

Page 6: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

TRENDS IN ADOLESCENT COGNITIVE DEVELOPMENT 1053

the performance trends on each LTDS test, because this could notbe done with the results published by Hornemann. The data usedwere collected by Longeot himself in his investigation of theLTDS.

Method

Participants

1967 participants. In Longeot's (1967) statistical analysis of theLTDS, 210 participants between the ages of 9 and 15 years were examined(30 per age group). In the present study, only participants aged 13, 14,and 15 years were considered. The 10- to 12-year-old participants fromStudy 1 could not be included in Study 2, because the full LTDS was usedin the second study whereas an abbreviated version of it was used in thefirst. The age ranges in the groups were defined as in Hornemann (specifiedby F. Longeot in a personal communication, December 18, 1993): The13-year-old age group included participants aged 12 years 6 months to 13years 5 months; the 14-year-old age group included those aged 13 years 6months to 14 years 5 months; and the 15-year-old age group included thoseaged 14 years 6 months to 15 years 5 months (with a few participantsaged 15 years 6 months to 16 years 5 months).

The tests were administered in schools located in the Paris area. Longeothad matched each age group to its national distribution by school grade,thus controlling an important variable. However, because of the specificgeographical area in which the students lived, the representativeness of thesample cannot be fully guaranteed. Nonetheless, no significant differencewas found between Longeot's Parisian sample and Hornemann's nationalsample in the operational level distributions of the 10- to 12-year-olds:After combining groups Fa and Fb, x*(2, N = 600) = 0.43, p > .80.Moreover, if a bias did exist in the sample, it would certainly skew the dataupward (making the mean score higher in the sample than in the popula-tion), because the Paris area is socially more privileged than the rest ofFrance. Such a bias runs counter to the hypothesis tested here.

1996 participants. The participants in 1996 came from four junior highschools in the metropolitan area of Nancy, France. They were 13 (n =30), 14 (n = 30), and 15 (n = 30) years old. The age ranges were the sameas in Longeot (1967), but the medians were slightly below the theoreticalones (12 years 11 months, 13 years 8 months, and 14 years 11 monthsinstead of 13 years, 14 years, and 15 years).

As with the 1993 sample, I applied a weighting procedure (using theSPSS weight command) to the three subsamples (age groups) in order torepresent the national quotas for sex, nationality, school grade, and father'soccupation.

Scale and Scoring

In Study 2, the tests were scored by an assistant and checked by me, asin Study 1. This method is similar to the one used in 1967, where the testswere either scored by Longeot alone or by an assistant first and then againby Longeot. The instructions in the LTDS manual, written by Longeot,were strictly followed. It is worth noting that in a secondary analysis ofLongeot's data, Lautrey (1980b, p. 686) scored the tests of the 1967 samplehimself. He found a high degree of reliability (although he did not mentionthe reliability coefficient).

Procedure

The entire LTDS was administered in 1996. Thus, as in 1967, theparticipants took all five LTDS tests and all items in each test. Theadolescents were tested in their schools both in the 1967 study and in the1996 study.

Results

Statistical Analyses

As in Study 1, all statistical processing was done with andwithout weighting (except the hierarchical analysis of the LTDS,where weighting was not applicable). The results with weightedand unweighted procedures did not yield any substantial differ-ences, given that the quotas were controlled as strictly as possibleduring data collection. The results presented below are the onesobtained with weighted statistical procedures. (Unweighted resultsare available from the author upon request.)

Psychometric Properties of the LTDS

Principal-components analysis of the LTDS in 1996. UsingKaiser's criterion, only one component could be retained, whichaccounted for 50% of the total variance. The factor loadings on thefirst principal component were .61 for conservation, .75 for per-mutation, .73 for probability, .63 for the pendulum, and .78 formechanical curves. These values are close to those found for the10- to 12-year-olds in the 1993 sample.

Reliability of the LTDS tests in 1996. The reliabilities of thetests, as measured by Cronbach's alpha, were .42 for conservation,.49 for permutation, .76 for probability, .62 for mechanical curves,and .83 for the whole scale. Again, these values are close to thosefound in the 1993 sample, except for conservation and permuta-tion, the reliabilities of which were not as high for the 13- to15-year-olds as for the 10- to 12-year-olds.

Hierarchical analysis of the LTDS in 1967 and in 1996. Lon-geot (1967) studied the validity of the LTDS using a hierarchicalanalysis technique especially adapted to Piagetian tests. Accordingto this technique, responses to items of different levels must bestrictly ranked, whereas responses to items of the same level, beingof equal theoretical difficulty, need not be ranked. Hence, Longeotcalculated two indexes, called the interstage index and the intra-stage index. The two indexes have the same formula, i = 1 —(Eo -T- Ee), where Eo stands for the number of observed errors andEe is the number of expected errors in cases of response patternequiprobability.

However, the errors are not counted in the same way for the twoindexes. For the interstage index, the items are ordered by theo-retical difficulty. Success in an order that differs from the pre-defined ranking (e.g., a participant succeeds on a Level-/ item butfails on all Level-C items) is scored as an error. The higher thevalue of this index, the greater the validity of the scale. For theintrastage index, the items are ranked within each stage by theirrelative difficulty, observed empirically for the sample beingtested. Any successes that occur in a different order are scored aserrors. For example, if the scale has two concrete items, i and/ andin the sample one observes a success rate of 80% on ;' and 60% onj , then success on j accompanied by failure on i constitutes anerror. The higher the value of this index, the lower the validity ofthe scale. Note that this interpretation of the intrastage index isbased on the Piagetian framework used by Longeot. In the presentstudy, temporal variations in the intrastage index were not taken tobe indicative of a change in validity but, rather, to reflect a changein the order of item difficulty.

For the 1967 sample, Longeot published the results of thesecalculations on the 210 participants aged 9 to 15 years who were

Page 7: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

1054 FLIELLER

tested in that year. For 1993, the use of a stopping criterionprevented calculation of the interstage indexes, where there mayhave been cases of participants who, say, failed on all Level-/items of a given test but would have succeeded on a Level-F itemhad they been tested for that level. The hierarchical analysis of theLTDS was thus applied to the 1996 sample alone.

Looking at Table 2, one can see that the interstage index valuesin 1967 and 1996 are very close except for conservation, where theindex is at its maximum in 1996 (at 13-15 years, the probability ofmaking an interstage error on this test was virtually zero). Thesevery high values for the interstage index indicate that the LTDStests conform to the stage model, both in 1967 and in 1996.

The value of the intrastage index was higher in 1996 than in1967 on conservation, quantification of probabilities, and mechan-ical curves. This means that for these tests, items at the sameoperational level tended to develop in a fixed order in 1996 but notin 1967. In contrast, the intrastage index was lower in 1996 on thepermutation test. In 1967, the correct prediction of the number ofpermutations of four tokens tended to be subordinate to the abilityto produce the permutations; this was no longer true in 1996.

Operational Levels in Each Cohort

Table 1 gives the distribution of participants across the fouroverall operational levels. The percentage of 13- to 15-year-olds atLevel Fa or Fb was significantly higher in 1996 (55%) than in1967 (35%), x*(3, N = 180) = 8.02, p < .03 (one-tailed).

Table 3 gives the participants' operational levels on each test.On the conservation test, the operational levels of the participantswere significantly different in 1967 and 1996, ;^(3, N =180) = 7.56, p < .03 (two-tailed). There was both a progression(decrease in the percentage of participants at Level C and increasein the percentage of participants at Level 7) and a regression (dropin the percentage of participants at Level s / ) . In one of hispublications, Longeot (1966) mentioned that the weight-volumedissociation item was correctly solved by 50% of the 13-year-oldparticipants and by more than 70% of the 14- and 15-year-oldparticipants, which makes a success rate of at least 63% at age13-15 years. This percentage was only 47 in 1996, ^ ( l , N =180) = 5.05, p < .05 (two-tailed). Because the volume conserva-tion item seems to have been answered as well in 1996 as 30 yearsago (Longeot found 70% success at age 12 years, and I found 84%success at age 13-15 years in 1996), it is the drop in performance

Table 2Hierarchical Analysis of the LTDS Tests in 1967 and 1996

Table 313- to 15-Year-Olds' Level of Development on theLTDS Tests in 1967 and 1996

Test

ConservationPermutationProbabilityMechanical curves

Interstage and

1967

Interstage

0.850.940.940.98

Intrastage

0.560.490.300.39

intrastage indexes

1996

Interstage Intrastage

1.00 0,760.98 0.150.97 0.640.99 0.66

Test

ConservationConcreteIntermediate^Intermediate

Permutation<ConcreteConcrete or intermediateFormal AFormal B

ProbabilityConcreteIntermediateFormal AFormal B

Pendulum<Formal BFormal B

Mechanical curvesConcreteIntermediateFormal AFormal B

% of participants

1967(n = 90)

212851

12512611

31203118

7921

264132

1

at each level

1996(n = 90)

124742

1186714

19162640

6535

16452811

Note. The pendulum test was not analyzed because it included only oneitem. LTDS = Echelle de Developpement de la Pensee Logique (LogicalThought Development Scale).

Note. Percentages have been rounded off. LTDS = Echelle de Devel-oppement de la Pensee Logique (Logical Thought Development Scale).

on the weight-volume dissociation item that is responsible for thelower percentage of 2:/ participants on the conservation test. Notethat 38% of those who failed on this item had attained the formallevel (Fa or Fb) on the LTDS as a whole.

Improvement was observed on the other four tests. The percent-age of participants at the formal level (Fa or Fb) was higher in1996 than in 1967 on permutation, ^ ( 3 , N = 180) = 39.97, p <.00001 (two-tailed); quantification of probabilities, )?(3, N =180) = 11.17, p < .02 (two-tailed); the pendulum, / ( 3 , N =180) = 4.12, p < .05 (two-tailed); and mechanical curves, ^2(3>N = 180) = 9.74, p < .03 (two-tailed).

Cramer's V coefficient can be used to compare intercohortdifferences across tests. V was .20 for conservation, .48 for per-mutation, .25 for probabilities, .15 for the pendulum, and .23 formechanical curves.

Qualitative Study of Responses

Because the 1967 protocols were destroyed, a complete quali-tative study of the responses was not possible. Some data forcomparisons are nevertheless available.

The LTDS manual and Longeot's publications contain a greatdeal of information about the responses obtained in 1967 on thevarious LTDS items, both for correct answers (justifications) andfor incorrect answers (error types). This information was used byboth Longeot and myself to categorize and score the responses. Allresponse types observed in 1996 were also found by Longeotbecause no categorization problems arose in this study (this remarkalso applies to Study 1). On the weight-volume dissociation item,

Page 8: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

TRENDS IN ADOLESCENT COGNITIVE DEVELOPMENT 1055

for example, the adolescents who failed, whether in 1996 or in1967, thought that the rise in the water level was a function of theobject's weight, not of its volume. The only exception occurred forthe fourth item of the permutation test (prediction of the permu-tations of five tokens of different colors), where 1 adolescentapplied the mathematical formula Pn = nl. However, this was aspecial case (both of this girl's parents were mathematics teach-ers). The other students who succeeded on this item discovered thesolution using the types of reasoning observed by Longeot (e.g.,taking the result previously found for four colors and multiplyingit by five).

However, were the frequencies of the various types of responsesthe same as 30 years ago? A partial answer to this question can befound in the detailed information about two items supplied byLongeot (1967). For the first item (the prediction of the permuta-tions of four tokens), Longeot found six types of answers. Two(the answer 8 and the answer 12) were generalizations of thesolution to the preceding three-token permutation problem (6 = 3colors X 2): 8 was obtained by multiplying 4 colors by 2, and 12was obtained by multiplying the result for three colors also by 2(12 = 6 X 2). These are Level-C responses, as are the otherincorrect answers (e.g., "as many ways as colors," in this case, 4).The remaining three response types were Level Fa, which added 1point to the score. The correct answer (24) was relatively infre-quent at the ages considered here. The answers 12 and 16, althoughincorrect, reflected multiplicative reasoning at the formal level(12 = 4 X 3 and 16 = 4 X 4). Table 4 can be used to compare theresponse distributions on this item in 1967 and 1996. One can seethat the distributions are roughly the same for the two cohorts: Theanswer 8 was by far the most frequent among the Level-C re-sponses, and the answer 12 was clearly the predominant Faresponse.

The other item studied by Longeot was the third mechanicalcurve problem, in which the participant was asked to draw the linemade by the pencil as it moved to the end of the cylinder while thecylinder made one complete rotation. This is a Level-/ item. Thecorrect answer is obviously a diagonal line across the paper (77%answered correctly in 1967 versus 84% in 1996). Four incorrectresponse categories were observed. The most primitive responses(a horizontal line or a vertical line) resulted from failure to con-sider one of the two movements. Those participants who incor-rectly drew two perpendicular lines simply juxtaposed the twomovements. Participants who incorrectly drew a sinusoidal curve

Table 4Responses Given by 13- to 15-Year-Olds on the 2nd Item of thePermutation Test (Prediction of the No. of PermutationsWith 4 Tokens)

Response 1967 1996

Level-C answers812 (6 X 2)Other Level-C answers

Level-Fa answers12 (4 X 3)1624

4028

21118

223

367

14

Table 5Responses Given by 13- to 15-Year-Olds on the 3rd Item of theMechanical Curves Test (Pencil Moves to One Endand Cylinder Rotates Once)

Response 1967 1996

One horizontal or vertical lineTwo perpendicular linesSinusoidal curveNonrectilinear diagonal lineOther wrong answersRectilinear diagonal

96330

69

33233

76

attempted to compound the two movements. The fourth categoryof incorrect responses included various lines that moved in theright direction (from the upper left-hand corner to the lowerright-hand corner of the paper) but were not straight (wavy line,series of horizontal lines, etc.). The distributions of the responsetypes in 1967 and 1996 are given in Table 5. One can see again thatthe two cohorts are close together on the incorrect answer catego-ries. The only notable difference is the presence of a response notmentioned by Longeot in 1967—namely, a curved line connectingthe upper-left and lower-right corners of the paper. But this re-sponse is merely a variant of the fourth category described above.

Discussion

This study showed that the mean operational level of the 13- to15-year-old adolescents was higher in 1996 than in 1967. Thisfinding confirms the main result of the first study (note that the 10-to 12-year-old participants tested in 1993 were born in 1983, 1982,and 1981 as were the 13- to 15-year-old participants tested in1996). However, compared with Study 1, this study has twolimitations. First, it was not possible to compare the LTDS scoresbecause Longeot only published the overall operational level of hisparticipants. Second, the 1967 sample was not as representative asthe national sample of 1972. Nevertheless, given that the distribu-tions of the 10- to 12-year-olds' operational levels were the samein the two samples (as indicated above in the Method section), itseems legitimate to use the 1967 sample as a base for comparison.

The operational level of the 13- to 15-year-olds rose to variableextents on the different LTDS tests. The greatest increase occurredon the permutation test. Because of the small number of items(four), this test was not very reliable. Even so, the low reliabilitylevel cannot account for this particular trend because the changesobserved between 1967 and 1996 are one-directional. Anotherpossible reason for the rise could be related to school learning,although the mathematics curriculum in France does not appear toinclude any changes (such as the introduction of the concepts ofcombinatory analysis) likely to account for an improvement incombinatory thought processes. Besides, with one exception, theparticipants did not use a mathematical formula to solve thepermutation problems. Perhaps the schools promote this type ofthinking in an indirect manner. Longeot (1968) provided experi-mental evidence of such an indirect effect by showing that learningthe Cartesian product improves performance on permutation prob-lems. Note also that the presentation of data in table format ismuch more widespread in the teaching practices of the past few

Page 9: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

1056 FLIELLER

decades, so much so that table reading and building has become aneveryday activity in all disciplines, even as early as elementaryschool. More generally, one can assume that the development ofcombinatory thought is promoted by the use of systematic proce-dures (e.g., in grammar, using a tree to diagram the cases of a rule)and that today's children are encouraged more to use suchprocedures.

The scores on the probability, pendulum, and mechanical curvestests did not rise as much. The mechanical curves results are a littlesurprising. The literature on the Flynn effect would lead one tobelieve that progress on a spatial test would be greater than onother tests. Indeed, particularly large performance gains have beenfound on this type of test. However, the mechanical curves test isdifferent from tests like the WISC puzzles, in which substantialimprovements have been observed: Although the puzzles involvevisual perception skills, mechanical curves require building com-plex mental representations (note that mechanical curves had thehighest loading level on the overall factor in the 13- to 15-year-oldgroup).

Another unexpected result was the one found for the conserva-tion test, in which both an improvement (more participants atLevel / in 1996 than in 1967) and a drop (more failures on theweight-volume dissociation) were observed. This test, which hasonly three items, was not very reliable, and this fact may partiallyaccount for the mixed results obtained. Other variables may alsohave had an impact. Laurendeau-Bendavid (1977) noted thatschooling had less effect on conservation (number and area) thanon probability quantification. As a consequence, the unexpectedconservation test results could also be explained in terms of therespective roles of personal experience and school in the acquisi-tion of the operations. Such an account is in line with the hypoth-esis that school plays an important role in the Flynn effect. What-ever the case may be, the data collected cannot account for theparticular result obtained for the dissociation item: 79% of the 13-to 15-year-olds who failed on this item in 1996 succeeded onvolume conservation, and 38% reached the formal level on theLTDS as a whole. Further research is needed to confirm andexplain these observations.

If one considers each test separately, the striking finding is thestability of the responses. First, as far as one can tell, the 1996 and1967 adolescents reasoned in the same way, gave the same argu-ments, and made the same mistakes. And the item hierarchy for thedifferent operational levels (C, I, Fa, Fb) did not change over theperiod considered: The interstage indexes were high in both 1996and 1967. The only changes were on the intrastage indexes, whichleads one to suspect certain variations in the item-difficulty orderwithin the operational levels. This information is not sufficient tocontend that the formal thought scales measure the same dimen-sions today as they did 30 years ago, but this hypothesis is notfalsified by the data either. The rising performance seems tocorrespond to genuine progress in the reasoning abilities of ado-lescents. This point is important for the interpretation of the Flynneffect.

Conclusions

Both studies conducted here showed that Piagetian tests, liketraditional intelligence tests, are subject to the Flynn effect. Peri-

odical updating of the norms of these developmental scales istherefore required, just as it is for other tests.

The characteristics of the tasks used in the present study allowone to rule out certain explanations of the Flynn effect—namely,improved guessing and faster test execution speed, as hypothesizedby Brand (1987a, 1987b). Furthermore, Study 1 showed that themagnitude of the Flynn effect was the same on the LTDS as onverbal tests (e.g., Verbal scale of the WISC) and smaller on theLTDS than on nonverbal tests. This finding suggests that part ofthe Flynn effect can be ascribed to gains in visual analysis skills(Flieller et al., 1989; Neisser, 1997), which play a key role in manynonverbal tests but play a minimal role in the LTDS (e.g., thedifficulty of the pendulum task clearly lies in the type of reasoningthe adolescents have to do and not in their ability to visuallyperceive the problem data, which are obvious).

Both studies revealed progress on problems involving variouskinds of reasoning (e.g., combinatorial problems, proportion prob-lems, control-of-variables problems). Given that the cognitive abil-ities involved in such reasoning are constituents of intelligence, theFlynn effect can be partially interpreted in terms of a gain inintelligence and a speedup in cognitive development. This inter-pretation is supported by the fact that the adolescents observed inthe 1990s probably solved these problems using the same proce-dures and the same reasoning processes as the adolescents of thesame age observed in the late 1960s. Yet intelligence and cognitivedevelopment have many dimensions. The improvement observedhere of course only concerns some of these dimensions.

The two studies reported in this article do not allow me to makeany statements about the causes of this faster developmental pace.Given the role played by schooling in access to formal thought, thedevelopment of academic education is, besides other variables, aserious candidate for the explanation (in his 1996 publication,Flynn himself contended that education may have a major impactin certain countries and during certain periods). In France, wherethese two studies were conducted, secondary school enrollmentgrew considerably during the period in question (according tostatistics from the French Ministery of Education, the prevalenceof middle school attendance at age 15 years rose from 60% in 1967to 90% in 1993). Note that at the same age, a greater proportion ofthe adolescents tested in 1967 were in the primary school system(Grades 1 to 8), whereas more of the adolescents in 1993 had goneon to the secondary school system (Grades 6 to 12). The fasterpace of cognitive development may therefore be linked in part tothe increase in the type of schooling of adolescents. In addition, theadolescents' parents had also attended school for more years: Theaverage of 9 years in the Longeot and Hornemann samples went upto 11 years in the 1993 and 1996 samples (estimates based onnational statistics). Given that a study by Lautrey (1980a) showedthat children's operational level depends on the child-rearing prac-tices in the home, which themselves depend on the parents' edu-cation, it is possible that the children's faster cognitive develop-ment is due in part to their more educated parents.

If one assumes that the pace of cognitive development has infact increased, another conclusion can be drawn from this study.As I mentioned earlier, experiments with operational training ofparticipants instigated many debates in the 1970s. The GenevaSchool has repeatedly contested the ecological validity of experi-ments that demonstrate progress after short-term training. Cohortcomparisons offer an interesting alternative to the experimental

Page 10: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

TRENDS IN ADOLESCENT COGNITIVE DEVELOPMENT 1057

method for the learning versus development debate. In response tothe question Piaget called "the American question," they suggestthat the pace of cognitive development may indeed be rising.

References

Bond, T. G. (1995). Piaget and measurement I: The twain really do meet.Archives de Psychologie, 63, 71-87.

Brainerd, C. J. (1973a). Judgments and explanations as criteria for thepresence of cognitive structures. Psychological Bulletin, 79, 172-179.

Brainerd, C. J. (1973b). Neo-Piagetian training experiments revisited: Isthere any support for the cognitive-developmental stage hypothesis?Cognition, 2, 349-370.

Brainerd, C. J. (1977a). Cognitive development and concept learning: Aninterpretative review. Psychological Bulletin, 84, 917-939.

Brainerd, C. J. (1977b). Response criteria in concept development research.Child Development, 48, 360-366.

Brand, C. (1987a). British IQ: Keeping up with the times. Nature, 328, 761.Brand, C. (1987b). Bryter still and bryter? Nature, 328, 110.Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic

studies. New York: Cambridge University Press.Carroll, J. B., Kohlberg, L., & DeVries, R. (1984). Psychometric and

Piagetian intelligences: Toward resolution of controversy. Intelli-gence, 8, 67-91.

Cloutier, R., & Goldschmid, M. L. (1976). Individual differences in thedevelopment of formal reasoning. Child Development, 47, 1097-1102.

Dupont, J.-B., Gendre, F., & Pauli, L. (1975). Epreuves operatoires et testsfactoriels classiques: Contribution a I'etude de la structure des aptitudesmentales durant l'adolescence [Tests of operations and classic factorialtests: Contribution to the study of the structure of mental aptitudesduring adolescence]. Revue Europeenne des Sciences Sociales et Cah-iers Vilfredo Pareto, 13, 137-198.

Flieiler, A., Jautz, M., & Kop, J.-L. (1989). Les reponses au test Mosai'quea quarante ans d'intervalle [Answers on the Mosaique test after a periodof forty years], Enfance, 42, 7-22.

Flieller, A., Manciaux, M, & Kop, J.-L. (1995). Enquete "20 ans apres":Comparaison des competences cognitives de deux cohortes d'ecoliersde 7 ans, observees a vingt ans d'intervalle [The "20 years later" survey:Comparison of cognitive skills of two cohorts of 7-year-old students,observed 20 years apart]. Unpublished technical report. Nancy, France:University Nancy 2.

Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to1978. Psychological Bulletin, 95, 29-51.

Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests reallymeasure. Psychological Bulletin, 101, 171-191.

Flynn, J. R. (1996). What environmental factors affect intelligence: Therelevance of IQ gains over time. In D. K. Detterman (Ed.), The envi-ronment: Current topics in human intelligence (Vol. 5, pp. 17-29).Norwood. NJ: Ablex.

Flynn, J. R. (1998). IQ gains over time: Toward finding the causes. In U.Neisser (Ed.), The rising curve: Long-term gains in IQ and relatedmeasures (pp. 25-66). Washington, DC: American Psychological As-sociation.

Green, D. R., Ford, M. P., & Flamer, G. B. (Eds.). (1971). Measurementand Piaget. New York: McGraw Hill.

Greenfield, P. M. (1998). The cultural evolution of IQ. In U. Neisser (Ed.),The rising curve: Long-term gains in IQ and related measures (pp.81-123). Washington, DC: American Psychological Association.

Gregoire, J. (1991). Les epreuves Piagetiennes et les tests d'intelligencetraditionnels evaluent-ils une meme realite? Revue de litterature ettentative d'articulation [Do Piagetian tests and traditional intelligencetests measure the same reality? A review and an attempt to integrate].Psychologie et Psychometrie, 12, 30-50.

Humphreys, L. G., Rich, S. A., & Davey, T. C. (1985). A Piagetian test ofgeneral intelligence. Developmental Psychology, 21, 872-879.

Inhelder, B., & Piaget, J. (1958). The growth of logical thinking fromchildhood to adolescence. New York: Basic Books.

Inman, W. C , & Secrest, T. (1981). Piaget's data and Spearman's theory:An empirical reconciliation and its implication for academic achieve-ment. Intelligence, 5, 329-344.

Jensen, A. R. (1980). Bias in mental testing. London: Methuen.Keating, D. P. (1975). Precocious cognitive development at the level of

formal operations. Child Development, 46, 276-280.Kuhn, D. (1974). Inducing development experimentally: Comments on a

research paradigm. Developmental Psychology, 10, 590-600.Larivee, S., Longeot, F., & Normandeau, S. (1989). Apprentissage des

operations formelles: Une recension des recherches [The training offormal operations: A critical review]. L'Annee Psvchologique, 89, 553-584.

Laurendeau-Bendavid, M. (1977). Culture, schooling, and cognitive devel-opment: A comparative study of children in French Canada and Rwanda.In P. R. Dasen (Ed.), Piagetian psychology: Cross-cultural contributions(pp. 123-168). New York: Gardner Press.

Lautrey, J. (1980a). Classe sociale, milieu familial, intelligence [Socialclass, familial environment, intelligence]. Paris: Presses Universitairesde France.

Lautrey, J. (1980b). La variability intra-individuelle du niveau operatoire etses implications theoriques [The intraindividual variability of the oper-ational level and its theorical implications]. Bulletin de Psychologie, 33,685-697.

Lautrey, J., Ribaupierre, A. de, & Rieben, L. (1990). L'integration desaspects genetiques et differentiels du developpement cognitif [Integra-tion of genetic and differential aspects of the cognitive development]. InM. Reuchlin, F. Longeot, C. Marendaz, & T. Ohlmann (Eds.), Connattredifferemment (pp. 181-208). Nancy, France: Presses Universitaires deNancy.

Lefebvre-Pinard, M. (1976). Les experiences de Geneve sur l'apprentissage:Un dossier peu convaincant (meme pour un Piagetien) [The Genevanlearning experiments: An unconvincing file (even for a Piagetian psy-chologist)]. Canadian Psychological Review, 17, 103-109.

Lim, T. K. (1988). Relationships between standardized psychometric andPiagetian measures of intelligence at the formal operational level. Intel-ligence, 12, 167-182.

Lim, T. K. (1996). A cross-validation study of the factorial structureunderlying intelligence tests and Piagetian formal operational tests.Educational Psychology, 16, 453-461.

Londeix, H. (1985). Les relations entre psychologie genetique et psycholo-gie differentielle. II—L'analyse d'un test operatoire, l'ECDL [The re-lations between developmental psychology and differential psychology.II—Analysis of a test of Piagetian operations, the ECDL]. L 'OrientationScolaire et Professionnelle, 14, 105-144.

Longeot, F. (1966). Experimentation d'une echelle individuelle du devel-oppement de la pensee logique [Experimentation of an individual scaleof logical thought development]. B1NOP, 22, 306-319.

Longeot, F. (Ed.). (1967). Aspects differentiels de la psychologie genetique[Differential approach to developmental psychology] [Special issue].BINOP, 23.

Longeot, F. (1968). La pedagogie des mathematiques et le developpementdes operations formelles dans le second cycle de l'enseignement secon-daire [Teaching of mathematics and the development of formal opera-tions in high school], Enfance, 21, 379-381.

Longeot, F. (1969). Psychologie differentielle et theorie operatoire deVintelligence [Differential psychology and Piagetian theory of intelli-gence]. Paris: Dunod.

Lynn, R. (1990). The role of nutrition in secular increases in intelligence.Personality and Individual Differences, 11, 273-285.

Lynn, R., & Hampson, S. (1986). The rise of national intelligence: Evi-dence from Britain, Japan and the U.S.A. Personality and IndividualDifferences, 7, 23-32.

Page 11: Comparison of the Development of Formal Thought in … · 2014. 7. 16. · flieller@clsh.univ-nancy2.fr. analysis is concerned, but not with respect to other aspects of intelligence"

1058 FLIELLER

Manuel de I'Echelle de Developpement de la Pensee Logique [Manual ofLongeot's test]. (1974). Paris: Editions et Applications Psychologiques.

Nassefat, M. (1963). Etude quantitative sur revolution des operationsintellectuelles [A quantitative study on the development of cognitiveoperations]. Neuchatel, Switzerland: Delachaux & Niestle.

Neimark, E. D. (1975). Longitudinal development of formal operationsthought. Genetic Psychology Monographs, 91, 171-225.

Neisser, U. (1997). Rising scores on intelligence tests. American Scien-tist, 85, 440-447.

Neisser, U. (Ed.). (1998). The rising curve: Long-term gains in IQ and relatedmeasures. Washington, DC: American Psychological Association.

Piaget, J. (1972). Intellectual evolution from adolescence to adulthood.Human Development, 15, 1-12.

Piaget, J., & Inhelder, B. (1951). La genese de Videe de hasard chezI'enfant [The origin of the idea of chance in children]. Paris: PressesUniversitaires de France.

Piaget, J., Inhelder, B., & Szeminska, A. (1956). The child's conception ofspace. London: Routledge and Kegan Paul.

Pry, R., & Manderscheid, J. C. (1993). Efficience scolaire, raisonnementinductif et caracteristiques du milieu chez I'enfant de 9-10 ans [Aca-demic achievement, inductive reasoning, and environmental variables in9- to 10-year-old children]. L'Orientation Scolaire et Profession-nelle, 22, 235-244.

Reese, H. W., & McCluskey, K. A. (1984). Dimensions of historicalconstancy and change. In K. A. McCluskey & H. W. Reese (Eds.),Life-span developmental psychology: Historical and generational effects(pp. 17-45). Orlando, FL: Academic Press.

Rubin, K. H., Brown, I. D., & Priddle, R. L. (1978). The relationships betweenmeasures of fluid, crystallized, and Piagetian intelligence in elementary-school-aged children. Journal of Genetic Psychology, 132, 29-36.

Ruffy, M. (1981). Influence of social factors in the development of theyoung child's moral judgment. European Journal of Social Psychol-ogy, 11, 61-75.

Schooler, C. (1998). Environmental complexity and the Flynn effect. In U.Neisser (Ed.), The rising curve: Long-term gains in IQ and relatedmeasures (pp. 67-79). Washington, DC: American Psychological Asso-ciation.

Storfer, M. D. (1990). Intelligence and giftedness. San Francisco: Jossey-Bass.

Strauss, S. (1972). Inducing cognitive development and learning: A reviewof short-term traning experiments. I—The organismic developmentalapproach. Cognition, 1, 329-357.

Strauss, S. (1974). A reply to Brainerd. Cognition, 3, 155-185.Teasdale, T. W., & Owen, D. R. (1989). Continuing secular increases in

intelligence and stable prevalence of high intelligence levels. Intelli-gence, 13, 255-262.

Wechsler, D. (1996). W1SC-III: Echelle d'Intelligence de Wechsler pourEnfants, troisieme edition [WISC-III: Wechsler Intelligence Scale forChildren, 3rd ed., French adaptation]. Paris: Editions du Centre dePsychologie Appliquee.

Williams, W. M. (1998). Are we raising smarter children today? School-and home-related influences on IQ. In U. Neisser (Ed.), The risingcurve: Long-term gains in IQ and related measures (pp. 125-154).Washington, DC: American Psychological Association.

Received November 19, 1996Revision received December 8, 1998

Accepted December 8, 1998