

A. Ravenscroft et al. (Eds.): EC-TEL 2012, LNCS 7563, pp. 292–305, 2012. © Springer-Verlag Berlin Heidelberg 2012

eAssessment for 21st Century Learning and Skills

Christine Redecker*, Yves Punie, and Anusca Ferrari

Institute for Prospective Technological Studies (IPTS), European Commission, Joint Research Centre,

Edificio Expo, C/Inca Garcilaso 3, 41092 Seville, Spain

{Christine.Redecker,Yves.Punie,Anusca.Ferrari}@ec.europa.eu

Abstract. In the past, eAssessment focused on increasing the efficiency and effectiveness of test administration; improving the validity and reliability of test scores; and making a greater range of test formats susceptible to automatic scoring. Despite the variety of computer-enhanced test formats, eAssessment strategies have been firmly grounded in a traditional paradigm, based on the explicit testing of knowledge. There is a growing awareness that this approach is less suited to capturing “Key Competences” and “21st century skills”. Based on a review of the literature, this paper argues that, though there are still technological challenges, the more pressing task is to transcend the testing paradigm and conceptually develop (e)Assessment strategies that foster the development of 21st century skills.

Keywords: eAssessment, Computer-Based Assessment (CBA), competence-based assessment, key competences, 21st century skills.

1 Rethinking 21st Century Assessment

Assessment is an essential component of learning and teaching, as it allows the quality of both teaching and learning to be judged and improved [1]. It often determines the priorities of education [2]; it always influences practices and has backwash effects on learning [3]. Changes in curricula and learning objectives are ineffective if assessment practices remain the same [4], as learning and teaching tend to be modelled against the test [2]. Assessment is usually understood to have two purposes: formative and summative. Formative assessment aims to gather evidence about pupils' proficiency in order to influence teaching methods and priorities, whereas summative assessment is used to judge pupils' achievements at the end of a programme of work [2].

Assessment procedures in formal education and training have traditionally focused on examining knowledge and facts through formal testing [4], and do not easily lend themselves to grasping 'soft skills'. Lately, however, there has been a growing awareness that curricula – and with them assessment strategies – need to be revised to reflect more adequately the skills needed for life in the 21st century. The evolution of information and communication technologies (ICT) has led to a rapid shift in the skills that students need to acquire. Skills such as problem-solving, reflection, creativity, critical thinking, learning to learn, risk-taking, collaboration, and entrepreneurship are becoming increasingly important [5]. The relevance of these "21st century skills" [6] is, moreover, recognised in the European Key Competence Recommendation [7], which emphasises their transversal and over-arching role. To foster and develop these skills, assessment strategies should go beyond testing factual knowledge and aim to capture the less tangible themes underlying all Key Competences.

* The views expressed in this article are purely those of the authors and may not in any circumstances be regarded as stating an official position of the European Commission.

2 Looking beyond the e-Assessment Era

At the end of the eighties, Bunderson, Inouye and Olsen [8] forecasted four generations of computerized educational measurement, namely: Generation 1: Computerized testing (administering conventional tests by computer); Generation 2: Computerized adaptive testing (tailoring the difficulty or contents or an aspect of the timing on the basis of examinees’ responses); Generation 3: Continuous measurement (using calibrated measures to continuously and unobtrusively estimate dynamic changes in the student’s achievement trajectory); Generation 4: Intelligent measurement (producing intelligent scoring, interpretation of individual profiles, and advice to learners and teachers by means of knowledge bases and inferencing procedures).

[Fig. 1 shows a timeline (1990–2025) of e-Assessment strategies, running from Computer-Based Assessment (CBA) towards Embedded Assessment, and from efficient testing towards personalised learning and Learning 2.0. It charts: Generation 1, automated administration and/or scoring of conventional tests (multiple choice, short answer); Generation 2, adaptive testing; a "Generation Re-invention", transformative testing that uses technology to change test formats (games, simulations, virtual worlds, online collaboration, virtual laboratories, collaborative multimedia learning environments, peer assessment, and the replication of complex real-life situations within a test); Generation 3, continuous integrated assessment (data mining and analysis, learning analytics, behavioural tracking); and Generation 4, intelligent tutors providing automated, personalised feedback and tutoring.]

Fig. 1. Current and future e-Assessment strategies. Source: IPTS on the basis of [8-10]

These predictions from 1989 are not far off the mark. The first two generations of eAssessment or Computer-Based Assessment (CBA), which would more precisely be called Computer-Based Testing, have now become mainstream. The main challenge currently lies in making the transition to the latter two generations, the era of Embedded Assessment, which is based on the notion of "Learning Analytics", i.e. the interpretation of a wide range of data about students' proficiency in order to assess academic progress, predict future performance, and tailor education to individual students [11]. Although Learning Analytics is currently still in an experimental and developmental phase, embedded assessment could become a technological reality within the next five years [11].

However, the transition from computer-based testing to embedded assessment requires technological advances to be complemented with a conceptual shift in assessment paradigms. While the first two generations of CBA centre on the notion of testing and on the use of computers to improve the efficiency of testing procedures, generations 3 and 4 seamlessly integrate holistic and personalised assessment into learning. Embedded assessment allows learners to be continuously monitored and guided by the electronic environment which they use for their learning activities, thus merging formative and summative assessment within the learning process. Ultimately, with generation 4, learning systems will be able to provide instant and valid feedback and advice to learners and teachers concerning future learning strategies, based on the learners' individual learning needs and preferences. Explicit testing could thus become obsolete.

This conceptual shift in the area of eAssessment is paralleled by the overall pedagogical shift from knowledge-based to competence-based assessment, and by the recent focus on transversal and generic skills, which are less amenable to generation 1 and 2 eAssessment strategies. Generation 3 and 4 assessment formats may offer a viable avenue for capturing the more complex and transversal skills and competences crucial for work and life in the 21st century. To seize these opportunities, however, assessment paradigms need to shift towards enabling more personalised and targeted learning processes. Hence, the open question is how we can make this shift happen. This paper, based on an extensive review of the literature, provides an overview of current ICT-enabled assessment practices, with a particular focus on more recent developments in ICT-enhanced assessment tools that allow the recognition of 21st century skills.

3 The Testing Paradigm

3.1 Mainstream eAssessment Strategies

First and second generation tests have led to a more effective and efficient delivery of traditional assessments [9]. More recently, assessment tools have been enriched to include more authentic tasks and to allow for the assessment of constructs that have either been difficult to assess or which have emerged as part of the information age [12]. As the measurement accuracy of all of these test approaches depends on the quality of the items they include, item selection procedures – such as Item Response Theory or mathematical programming – play a central role in the assessment process [13]. First generation computer-based tests are already being administered widely for a variety of educational purposes, especially in the US [14], but increasingly also in Europe [15]. Second generation adaptive tests, which select test items based on the candidate's previous responses, allow for a more efficient administration mode (fewer items and less testing time) while maintaining measurement precision [9]. Different algorithms have been developed for the selection of test items. The most well-known type of algorithmic testing is Computerized Adaptive Testing (CAT), in which the algorithm is designed to provide an accurate point estimate of individual achievement [16]. CAT tests are very widespread, in particular in the US, where they are used for assessment at primary and secondary school level [10, 14, 17] and for admission to higher education [17]. Adaptive tests are also used in European countries, for instance the Netherlands [18] and Denmark [19].
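The item-selection logic behind a CAT can be sketched with a two-parameter (2PL) IRT model: pick the unused item with maximum Fisher information at the current ability estimate, then re-estimate ability from the responses observed so far. The item bank, parameter values and grid-search estimator below are purely illustrative, not those of any operational test.

```python
import math

def p_correct(theta, a, b):
    """2PL IRT: probability of a correct response at ability theta
    for an item with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information contributed by one item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_item(theta, bank, used):
    """Maximum-information selection: choose the unused item that is
    most informative at the current ability estimate."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

def estimate_theta(responses, bank):
    """Crude maximum-likelihood ability estimate over a grid of
    candidate theta values (sufficient for a sketch)."""
    grid = [g / 10.0 for g in range(-40, 41)]  # -4.0 .. 4.0
    def log_lik(theta):
        ll = 0.0
        for item_idx, correct in responses:
            p = p_correct(theta, *bank[item_idx])
            ll += math.log(p if correct else 1.0 - p)
        return ll
    return max(grid, key=log_lik)

# Hypothetical item bank: (discrimination a, difficulty b) pairs.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5), (1.3, -0.5)]
theta, used, responses = 0.0, set(), []
for correct in [True, True, False]:   # simulated examinee answers
    nxt = pick_next_item(theta, bank, used)
    used.add(nxt)
    responses.append((nxt, correct))
    theta = estimate_theta(responses, bank)
print(theta)
```

Operational systems add stopping rules (e.g. a target standard error) and exposure control on top of this core loop.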

3.2 Reliability of eAssessment

Advantages of computer-based testing over traditional assessment formats include, among others: paperless test distribution and data collection; efficiency gains; rapid feedback; machine-scorable responses; and standardized tools for examinees, such as calculators and dictionaries [17]. Furthermore, computer-based tests tend to have a positive effect on students' motivation, concentration and performance [20, 21]. More recently, they have also begun to provide learners and teachers with detailed reports that describe strengths and weaknesses, thus supporting formative assessment [20]. However, the reliability and validity of scores has been a major concern, particularly in the early phases of eAssessment, given the prevalence of multiple-choice formats in computer-based tests. Recent research indicates that scores are indeed generally higher in multiple choice tests than in short answer formats [22]. Some studies found no significant differences between student performance on paper and on screen [20, 23], whereas others indicate that paper-based and computer-based tests do not necessarily measure the same skills [10, 24].

Current research efforts concentrate on improving the reliability and validity of test scores by improving selection procedures for large item banks [13], by increasing measurement efficiency [25], by taking into account multiple basic abilities simultaneously [26], and by developing algorithms for automated language analysis that allow the electronic scoring of long free-text answers.

3.3 Automated Scoring

One of the main drivers of progress in eAssessment has been the improvement of automatic scoring techniques for free text answers [27] and dedicated written text assignments [28]. Automated scoring could dramatically reduce the time and costs associated with the assessment of complex skills such as writing, but its use must be validated against a variety of criteria for it to be accepted by test users and stakeholders [29].

Already, assignments in programming languages or other formal notations can be automatically assessed [30]. For short-answer free-text responses of around a sentence in length, automatic scoring has also been shown to be at least as good as that of human markers [31]. Similarly, automated scoring for highly predictable speech, such as a one-sentence answer to a simple question, correlates very highly with human ratings of speech quality, although this is not the case with longer and more open-ended responses [17]. Automated scoring1 is also used for scoring essay-length responses [10], where it is found to closely mimic the results of human scoring: the agreement of an electronic score with a human score is typically as high as the agreement between two human scores, and sometimes even higher [10, 17, 29]. However, these programmes tend to omit features that cannot be easily computed, such as content, organization and development [32]. Thus, while there is in general a high correlation between human and machine marking, discrepancies are higher for essays which exhibit more abstract qualities [33]. A further research line aims to develop programmes which mark short-answer free-text responses and give tailored feedback on incorrect and incomplete answers, inviting examinees to repeat the task immediately [34].

1 Examples include: Intelligent Essay Assessor (Pearson Knowledge Technologies), IntelliMetric (Vantage), Project Essay Grade (PEG) (Measurement, Inc.), and e-rater (ETS).
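The engines named above are proprietary, so as a toy illustration of the underlying idea only: a short-answer scorer can compare a student response to a reference answer by bag-of-words cosine similarity and map the similarity onto marks. The tokenizer, thresholds and rubric below are invented for the sketch; real systems use far richer linguistic features.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a real system would use proper NLP
    preprocessing (stemming, synonyms, syntax)."""
    return [w.strip(".,;:!?") for w in text.lower().split()]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def score_answer(student, reference, full_marks=2, threshold=0.45):
    """Award full marks above the similarity threshold, one mark
    above half of it, zero otherwise (an arbitrary rubric)."""
    sim = cosine(Counter(tokenize(student)), Counter(tokenize(reference)))
    if sim >= threshold:
        return full_marks
    return 1 if sim >= threshold / 2 else 0

reference = "Evaporation increases when temperature rises because molecules move faster"
print(score_answer("Higher temperature makes molecules move faster, so evaporation increases", reference))
print(score_answer("The sky is blue", reference))
```

This surface-overlap approach is exactly why such systems struggle with the "more abstract qualities" noted above: content organization and development leave little trace in a word-count vector.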

4 The Tutoring Paradigm

Integrated assessment is already implemented in some technology-enhanced learning (TEL) environments, where data-mining techniques can be used for formative assessment and individual tutoring. Similarly, virtual worlds, games, simulations and virtual laboratories allow the tracking of individual learners’ activity and can make learning behaviour assessable. Many TEL environments, tools and systems recreate learning situations which require complex thinking, problem-solving and collaboration strategies and thus allow for the development of 21st century skills. Some of these environments include the assessment of performance. Assessment packages for Learning Management Systems are currently under development to integrate self-assessment, peer-assessment and summative assessment, based on the automatic analysis of learner data [35]. Furthermore, data on student engagement in these environments can be used for embedded assessment, which refers to students engaging in learning activities while an assessment system draws conclusions based on their tasks [36]. Data-mining techniques are already used to evaluate university students' activity patterns in Virtual Learning Environments for diagnostic purposes. Analytical data mining can, for example, identify students who are at risk of dropping out or underperforming,2 generate diagnostic and performance reports,3 assess interaction patterns between students on collaborative tasks,4 and visualise collaborative knowledge work.5 It is expected that, in five years’ time, advances in data mining will enable the interpretation of data concerning students’ engagement, performance, and progress, in order to assess academic progress, predict future performance, and revise curricula and teaching strategies [11].

Although many of these programmes and environments are still experimental in scope and implementation, there are a number of promising technologies and related

2 For example: the Signals system at Purdue University, http://www.itap.purdue.edu/tlt/signals/ and the Academic Early Alert and Retention System at Northern Arizona University, http://www4.nau.edu/ua/GPS/student/
3 http://www.socrato.com/
4 http://research.uow.edu.au/learningnetworks/seeing/snapp/index.html
5 http://emergingmediainitiative.com/project/learning-analytics/
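Systems such as Signals combine engagement data into a risk indicator, but their actual models are not public. The following is therefore a minimal rule-based sketch, with hand-picked features, weights and cutoff, purely to illustrate how logged activity data might be turned into an at-risk flag.

```python
def risk_score(logins_per_week, avg_grade, assignments_missed):
    """Combine engagement features into a 0..1 risk score using
    hand-picked weights (illustrative, not a fitted model)."""
    login_risk = max(0.0, 1.0 - logins_per_week / 5.0)   # < 5 logins/week raises risk
    grade_risk = max(0.0, 1.0 - avg_grade / 100.0)
    missed_risk = min(1.0, assignments_missed / 4.0)      # 4+ missed = maximum risk
    return 0.3 * login_risk + 0.4 * grade_risk + 0.3 * missed_risk

def flag_at_risk(students, cutoff=0.5):
    """Return the names of students whose risk score exceeds the cutoff."""
    return [name for name, feats in students.items() if risk_score(*feats) > cutoff]

# Hypothetical cohort: (logins per week, average grade, missed assignments)
students = {
    "alice": (6, 85, 0),
    "bob":   (1, 55, 3),
    "carol": (3, 72, 1),
}
print(flag_at_risk(students))
```

A production system would fit the weights to historical outcome data (e.g. with logistic regression) rather than setting them by hand, but the pipeline shape — features from logs, a score, a threshold, an alert — is the same.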


assessment strategies that might soon give rise to integrated assessment formats that comprehensively capture 21st century skills.

4.1 Intelligent Tutoring Systems and Automated Feedback

Research indicates that the closer feedback is to the actual performance, the more powerful its impact on subsequent performance and learner motivation [37]. Timing is the obvious advantage of Intelligent Tutoring Systems (ITSs) [38], which adapt the level of difficulty of the tasks administered to the individual learner's progress and needs. Most programmes provide qualitative information on why particular responses are incorrect [37]. Although this information is in some cases fairly generic, some programmes search for patterns in student work to adjust the level of difficulty of subsequent exercises according to needs [39]. Huang et al. [40], for example, developed an intelligent argumentation assessment system for elementary school pupils. The system analyses the structure of students' scientific arguments posted on a Moodle discussion board and issues feedback in case of bias. In a first trial, the system was shown to be effective in classifying and improving students' argumentation levels and in assisting them in learning the core concepts. Currently, ITSs such as AutoTutor6 and GnuTutor7 [41] teach students by holding a conversation in natural language. There are versions of AutoTutor that guide interactive simulation in 3D micro-worlds, that detect and produce emotions, and that are embedded in games [42]. The latest version of AutoTutor can detect learners' boredom, confusion, and frustration by monitoring conversational cues, gross body language, and facial features [43]. GnuTutor, a simplified open-source variant of AutoTutor, aims to create a freely available, open-source ITS platform that can be used by schools and researchers [41].
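A standard technique in this family of tutors is Bayesian Knowledge Tracing, which maintains the probability that a student has mastered a skill and updates it after each observed answer, allowing the tutor to decide when to advance or remediate. The guess, slip and learn parameters below are arbitrary illustrative values, not those of any system cited here.

```python
def bkt_update(p_known, correct, guess=0.2, slip=0.1, learn=0.15):
    """One Bayesian Knowledge Tracing step: Bayes-update the probability
    that the student has mastered a skill after one observed answer,
    then allow for learning between practice opportunities."""
    if correct:
        evidence = p_known * (1 - slip)
        posterior = evidence / (evidence + (1 - p_known) * guess)
    else:
        evidence = p_known * slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - guess))
    return posterior + (1 - posterior) * learn

p = 0.3  # prior probability of mastery for this skill
for correct in [True, True, False, True]:   # simulated answer sequence
    p = bkt_update(p, correct)
print(round(p, 2))
```

A tutor using such estimates would, for instance, keep presenting practice items until the mastery probability crosses a threshold (0.95 is a commonly described choice), then move the student to the next skill.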

ITSs are being widely used in the US, where the most popular system, "Cognitive Tutor", provides differentiated instruction in mathematics, encouraging problem-solving behaviour, to half a million students in US middle and high schools. The programme selects mathematical problems for each student at an adapted level of difficulty. Correct solution strategies are annotated with hints. Students can access instruction that is directly relevant to the problem they are working on and the strategy they are following within that problem.8 Research indicates that students who used Cognitive Tutor significantly outscored their peers on national exams, an effect that was especially noticeable for students with limited English proficiency or special learning needs [44]. Research on the implementation of "eFit", a web-based intelligent tutoring system for mathematics at lower secondary schools in Germany, confirms this finding: children using the tutoring system significantly improved their arithmetic performance over a period of 9 months [45].

ITSs are also widely used to support reading. SuccessMaker's Reader's Workshop9 and Accelerated Reader10 are two very popular commercial reading software products for primary education in the US. They provide ICT-based instruction with animations and game-like scenarios. Assessment is embedded and feedback is automatic and instant. Learning can be customized for three different profiles and each lesson can be adapted to students' strengths and weaknesses. These programmes have been evaluated as having positive impacts on learning [39]. Another example is iSTART,11 a Web-based tutoring programme that teaches reading strategies to young, adolescent and college-aged students. Animated agents are used to develop comprehension strategies such as paraphrasing and predicting. As learners progress through the modules, they create self-explanations that are evaluated by the agent [46]. iSTART has been shown to improve the quality of students' self-explanations and this, in turn, was reflected in improved comprehension scores [47]. Similarly, an automated reading tutor for children starting to read in Dutch automatically assesses their reading levels, provides them with oral feedback at the phoneme, syllable or word level, and tracks where they are reading, for automated screen advancement or for direct feedback [48].

6 http://www.autotutor.org/
7 http://gnututor.com/
8 http://carnegielearning.com/static/web_docs/2010_Cognitive_Tutor_Effectiveness.pdf
9 http://www.successmaker.com/Courses/c_awc_rw.html
10 http://www.renlearn.com/ar/

4.2 Immersive Environments, Virtual Worlds, Games and Simulations

Immersive environments and games are particularly suitable for acquiring 21st century skills such as problem-solving, collaboration and inquiry, because what needs to be acquired is not explicit but must be inferred from the situation [49]. In these environments, the learning context is similar to the contexts within which students will apply their learning, thus promoting inquiry skills, making learning activities more motivating, and increasing the likelihood that acquired skills will transfer to real-world situations [46]. It has been recognised that immersive game-based learning environments lead to significantly better learning results than traditional learning approaches [50].

Assessment can be integrated into the learning process within virtual environments and games. A virtual world like Second Life, for example, uses a mixture of multiple choice and environment-interaction questions for both delivery and assessment of content, while encouraging an exploratory attitude to learning [51]. However, virtual worlds, games and simulations are primarily conceived of as learning rather than testing environments. In science education, for instance, computer simulations, scientific games and virtual laboratories provide opportunities for students to develop and apply skills and knowledge in more realistic contexts, and provide feedback in real time. Simulations may involve mini-laboratory investigations or "predict-observe-explain" demonstrations. Dynamic websites, such as Web of Inquiry,12 allow students to carry out scientific inquiry projects to develop and test their theories; learn scientific language, tools, and practices of investigation; engage in self-assessment; and provide feedback to peers [52]. Simulations provided by Molecular Workbench,13 for example, emulate phenomena that are too small or too fast to observe, such as chemical reactions or gas at the molecular level. These visual, interactive computational experiments for teaching and learning science can be customized and adapted by the teacher. Embedded assessments allow teachers to generate real-time reports and track students' learning progress. Programmes like these usually provide opportunities for students to reflect on their own actions and response patterns [39]. Some science-learning environments have embedded formative assessments that teachers can access immediately in order to gauge the effectiveness of their instruction and modify their plans accordingly [53].

11 iSTART stands for Interactive Strategy Training for Active Reading and Thinking, see: http://129.219.222.66/Publish/projectsiteistart.html
12 http://www.webofinquiry.org
13 http://mw.concord.org/modeler/index.html
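Embedded assessment of this kind rests on interpreting logged interactions. As a minimal sketch, assuming a hypothetical event schema for a simulated laboratory session, one can summarise per student how many logged actions were genuine inquiry moves versus trial-and-error experiment runs:

```python
from collections import Counter

# Hypothetical event log: each entry is (student, action).
events = [
    ("ana", "form_hypothesis"), ("ana", "run_experiment"),
    ("ana", "run_experiment"), ("ana", "revise_hypothesis"),
    ("ben", "run_experiment"), ("ben", "run_experiment"),
    ("ben", "run_experiment"),
]

# Which logged actions count as inquiry moves (an assumption of this sketch).
INQUIRY_ACTIONS = {"form_hypothesis", "revise_hypothesis"}

def inquiry_report(log):
    """Summarise, per student, how many actions were inquiry moves
    (hypothesis work) versus other activity (experiment runs)."""
    report = {}
    for student, action in log:
        counts = report.setdefault(student, Counter())
        counts["inquiry" if action in INQUIRY_ACTIONS else "other"] += 1
    return {s: dict(c) for s, c in report.items()}

print(inquiry_report(events))
```

Even this crude tally distinguishes a learner who cycles between hypotheses and experiments from one who only repeats experiment runs, which is the kind of signal a teacher-facing report in these environments would surface.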

Furthermore, a variety of recent educational games for science education could, in principle, integrate assessment and tutoring functionalities similar to ITSs. ARIES (Acquiring Research Investigative and Evaluative Skills) is a computerized educational tool which incorporates multiple learning principles, such as testing effects, generation effects, and formative feedback [54]. Another example is Quest Atlantis,14 which promotes causal reasoning skills, subject knowledge in physics and chemistry, and an understanding of how systems work at both macro and micro level.

Some game environments include feedback, tutoring and monitoring of progress. In River City, for example, students use their knowledge of biology and the results of tests conducted online with equipment such as virtual microscopes to investigate the mechanisms through which a disease is spreading in a simulated 18th century city. Prompts gradually fade as students acquire inquiry skills. Data-mining allows teachers to document gains in students' engagement, learning and self-efficacy [46, 55]. Similarly, the Virtual Performance Assessment project15 relies on simulated, game-like environments to assess students’ ability to perform scientific inquiry. The assessment is performed in authentic settings, which allow better observation and measurement of complex cognition and inquiry processes [6].

4.3 Authentic Tasks Employing Digital Tools

Practical tasks using mobile devices or online resources are another promising avenue for developing ICT-enabled assessment formats. A number of national pilots assess tasks that replicate real-life contexts and are solved by using common technologies, such as the internet, office and multimedia tools. In Denmark, for example, students at commercial and technical upper secondary schools have, since 2001, been sitting Danish language, Maths and Business Economics exams based on CD-ROMs with access to multimedia resources. The aim is to evaluate their understanding of the subjects and their ability to search, combine, analyse and synthesise information and work in an inter-disciplinary way.16 A new pilot was launched in 2009 which aims to develop the use of full Internet access during high-stakes formal assessments. Similarly, in the eSCAPE project, a 6-hour collaborative design workshop replaced school examinations for 16-year-old students in Design and Technology in eleven schools across England. Students work individually, but within a group context, and record assessment evidence via a handheld device in a short multimedia portfolio. The reliability of the assessment method was reported as very high [6, 56]. In the US, the College Work and Readiness Assessment (CWRA) was introduced in St. Andrew's School in Delaware to test students' readiness for college and work, and it quickly spread to other schools across the US. It consists of a single 90-minute task that students must accomplish by using a library of online documents. Students must address real-world dilemmas (like helping a town reduce pollution), make judgments that have economic, social, and environmental implications, and articulate a solution in writing.

14 http://atlantis.crlt.indiana.edu/
15 http://vpa.gse.harvard.edu/a-case-study-of-the-virtual-performance-assessment-project/
16 http://www.cisco.com/web/strategy/docs/education/DanishNationalAssessmentSystem.pdf

This approach has proved particularly fruitful for the assessment of digital competence. The Key Stage 3 ICT tests (UK), for example, require 14-year-old students to use multiple ICT tools in much the same way they are used in real work and academic environments [10]. The project led to the development of authentic tasks to assign to students, who completed tests of ICT skills in a virtual desktop environment [56]. Similarly, the iSkills17 assessment aims to measure students' critical thinking and problem-solving skills in a digital environment. In a one-hour exam, real-time, scenario-based tasks are presented that measure an individual's ability to navigate, critically evaluate and understand the wealth of information available through digital technology. The national ICT skills assessment programme in Australia [57] is designed to be an authentic performance assessment, mirroring students' typical 'real world' use of ICT. In 2005 and 2008, students completed tasks on computers using software that included a seamless combination of simulated and live applications.

5 The Conceptual Leap

As the examples outlined above illustrate, we are currently witnessing a great number of innovative developments, both in Computer-Based Testing and in embedded assessment and intelligent tutoring, which indicate promising avenues for capturing complex key competences and 21st century skills. In the past, research focused on the technological side, providing ever more tools, functionalities and algorithms to increase measurement accuracy and to create more complex and engaging learning environments with targeted feedback loops [10, 17, 36]. If data from the learning process itself could be used for assessment purposes in an objective, valid, reliable and comparable way, explicit testing would become obsolete. Should learning analytics thus, in the future, replace explicit testing, (e)Assessment would become far more closely interwoven with learning and teaching, and would have to respond to and respect the pedagogical concepts on which the learning process is based. To further exploit the potential of technologies, research efforts should therefore not only focus on increasing the efficiency, validity and reliability of ICT-enhanced assessment formats, but must also consider how the conceptual foundations of different pedagogical approaches translate into different eAssessment strategies.

17 http://www.ets.org/iskills/

Against this background, we need to reconsider the value of tests, examinations and, in particular, high-stakes summative assessments that evaluate students' performance in a single instance, on dedicated tasks that are limited in scope [39]. If traditional computer-based formats continue to play an important role in high-stakes tests, there is a risk that learning processes and outcomes will fail to address the key competences and complex transversal skills needed in the 21st century, despite the technological availability of embedded assessment formats and the implementation of competence-based curricula. With computer-based testing, formative and summative assessment formats have already diverged: schools are starting to use computer-based multiple-choice tests as a means of training for high-stakes examinations [58]. Since assessment practice tends to shape teaching and learning practices, the danger is that learners focus on "scoring well" rather than on developing the complex key competences that will be increasingly important in the future [5]. If new assessment strategies are not deployed, computer-based assessment will further widen the gap between learning and testing, and the competences to be acquired will not be given the relevance they deserve. If, on the other hand, formative and summative assessment become an integral part of the learning process, and digital learning environments become the main source for grading and certification, there is a need to better understand how digitally collected information should be used, evaluated and weighted to adequately reflect the performance of each individual learner. If genuinely pedagogical tasks, such as assessing and tutoring, are increasingly delegated to digital environments, these must be designed in such a way that they become a tool for teachers and learners to communicate effectively with one another.

Embedded assessment should be designed to respect and foster the primacy of pedagogy and the role of the teacher. Judgements on students' achievement and performance that draw on data collected in digital environments must rest on a transparent and fair process of interpreting and evaluating these data, a process that is mediated by digital applications and tools but ultimately lies in the hands of teachers and learners. Since teachers will be able to base their pedagogical decisions and judgements on a wider range of data than in the past, pedagogical principles for interpreting, evaluating, weighing and reflecting on these different kinds of data are needed. To facilitate the development and implementation of environments that effectively serve learners and teachers in developing 21st century skills, technology developers and educators therefore need to enter into a dialogue on the creation of effective learning pathways, ensuring that further progress in the technological development of environments, applications and tools for learning and assessment is guided by pedagogical principles that reflect the competence requirements of the 21st century.

6 Conclusions

This paper has argued for a paradigm shift in the use and deployment of Information and Communication Technologies (ICT) in assessment. In the past, eAssessment has mainly focused on increasing the efficiency and effectiveness of test administration; improving the validity and reliability of test scores; and making a greater range of test formats susceptible to automatic scoring. However, despite the variety of computer-enhanced test formats, eAssessment strategies have remained firmly grounded in the traditional assessment paradigm, which has dominated formal education and training for centuries and is based on the explicit testing of knowledge. Against the background of rapidly changing skill requirements in a knowledge-based society, education and training systems in Europe are becoming increasingly aware that curricula, and with them assessment strategies, need to refocus on fostering more holistic "Key Competences" and transversal or general skills, such as the so-called "21st century skills". ICT offer many opportunities for supporting assessment formats that can capture complex skills and competences which are otherwise difficult to assess. To seize these opportunities, research and development in eAssessment, and in assessment in general, have to transcend the testing paradigm and develop new concepts of embedded, authentic and holistic assessment. Thus, while there is still a need for technological advances in the development and deployment of integrated assessment formats, the more pressing task at the moment is to make the conceptual shift from traditional to 21st century assessment and to develop (e)Assessment pedagogies, frameworks, formats and approaches that reflect the core competences needed for life in the 21st century.

References

1. Ferrari, A., Cachia, R., Punie, Y.: Innovation and Creativity in Education and Training in the EU Member States: Fostering Creative Learning and Supporting Innovative Teaching. In: JRC-IPTS (2009)

2. NACCCE: All Our Futures: Creativity, Culture and Education (1999)

3. Ellis, S., Barrs, M.: The Assessment of Creative Learning. In: Sefton-Green, J. (ed.) Creative Learning, pp. 73–89. Creative Partnerships, London (2008)

4. Cachia, R., Ferrari, A., Ala-Mutka, K., Punie, Y.: Creative Learning and Innovative Teaching: Final Report on the Study on Creativity and Innovation in Education in EU Member States. In: JRC-IPTS (2010)

5. Redecker, C., Leis, M., Leendertse, M., Punie, Y., Gijsbers, G., Kirschner, P., Stoyanov, S., Hoogveld, B.: The Future of Learning: Preparing for Change. In: JRC-IPTS (2010)

6. Binkley, M., Erstad, O., Herman, J., Raizen, S., Ripley, M., Rumble, M.: Defining 21st Century Skills. In: Griffin, P., McGaw, B., Care, E. (eds.) Assessment and Teaching of 21st Century Skills, pp. 17–66. Springer, Heidelberg (2012)

7. Council of the European Union: Recommendation of the European Parliament and the Council of 18 December 2006 on key competences for lifelong learning. (2006/962/EC). Official Journal of the European Union, L394/10 (2006)

8. Bunderson, V.C., Inouye, D.K., Olsen, J.B.: The four generations of computerized educational measurement. In: Linn, R.L. (ed.) Educational Measurement, pp. 367–407. Macmillan, New York (1989)

9. Martin, R.: New possibilities and challenges for assessment through the use of technology. In: Scheuermann, F., Pereira, A.G. (eds.) Towards a Research Agenda on Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2008)


10. Bennett, R.E.: Technology for Large-Scale Assessment. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 48–55. Elsevier, Oxford (2010)

11. Johnson, L., Smith, R., Willis, H., Levine, A., Haywood, K.: The 2011 Horizon Report. The New Media Consortium (2011)

12. Pellegrino, J.W.: Technology and Learning - Assessment. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 42–47. Elsevier, Oxford (2010)

13. El-Alfy, E.S.M., Abdel-Aal, R.E.: Construction and analysis of educational tests using abductive machine learning. Computers and Education 51, 1–16 (2008)

14. Csapó, B., Ainley, J., Bennett, R., Latour, T., Law, N.: Technological Issues for Computer-Based Assessment. The University of Melbourne, Australia (2010)

15. Moe, E.: Introducing Large-scale Computerised Assessment: Lessons Learned and Future Challenges. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

16. Thompson, N.A., Weiss, D.J.: Computerized and Adaptive Testing in Educational Assessment. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

17. Bridgeman, B.: Experiences from Large-Scale Computer-Based Testing in the USA. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

18. Eggen, T.J.H.M., Straetmans, G.J.J.M.: Computerized Adaptive Testing of Arithmetic at the Entrance of Primary School Teacher Training College. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

19. Wandall, J.: National Tests in Denmark – CAT as a Pedagogic Tool. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

20. Ripley, M.: Transformational Computer-based Testing. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2009)

21. Garrett, N., Thoms, B., Alrushiedat, N., Ryan, T.: Social ePortfolios as the new course management system. On the Horizon 17, 197–207 (2009)

22. Park, J.: Constructive multiple-choice testing system. British Journal of Educational Technology 41, 1054–1064 (2010)

23. Hardré, P.L., Crowson, H.M., Xie, K., Ly, C.: Testing differential effects of computer-based, web-based and paper-based administration of questionnaire research instruments. British Journal of Educational Technology 38, 5–22 (2007)

24. Horkay, N., Bennett, R.E., Allen, N., Kaplan, B., Yan, F.: Does it Matter if I Take My Writing Test on Computer? An Empirical Study of Mode Effects in NAEP. Journal of Technology, Learning, and Assessment 5 (2006)

25. Frey, A., Seitz, N.N.: Multidimensional adaptive testing in educational and psychological measurement: Current state and future challenges. Studies in Educational Evaluation 35, 89–94 (2009)

26. Hartig, J., Höhler, J.: Multidimensional IRT models for the assessment of competencies. Studies in Educational Evaluation 35, 57–63 (2009)


27. Noorbehbahani, F., Kardan, A.A.: The automatic assessment of free text answers using a modified BLEU algorithm. Computers and Education 56, 337–345 (2011)

28. He, Y., Hui, S.C., Quan, T.T.: Automatic summary assessment for intelligent tutoring systems. Computers and Education 53, 890–899 (2009)

29. Weigle, S.C.: Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability. Language Testing 27, 335–353 (2010)

30. Amelung, M., Krieger, K., Rösner, D.: E-assessment as a service. IEEE Transactions on Learning Technologies 4, 162–174 (2011)

31. Butcher, P.G., Jordan, S.E.: A comparison of human and computer marking of short free-text student responses. Computers and Education 55, 489–499 (2010)

32. Ben-Simon, A., Bennett, R.E.: Toward a more substantively meaningful automated essay scoring. Journal of Technology, Learning and Assessment 6 (2007)

33. Hutchison, D.: An evaluation of computerised essay marking for national curriculum assessment in the UK for 11-year-olds. British Journal of Educational Technology 38, 977–989 (2007)

34. Jordan, S., Mitchell, T.: e-Assessment for learning? The potential of short-answer free-text questions with tailored feedback. British Journal of Educational Technology 40, 371–385 (2009)

35. Florián, B.E., Baldiris, S.M., Fabregat, R., De La Hoz Manotas, A.: A set of software tools to build an author assessment package on Moodle: Implementing the AEEA proposal. In: 10th IEEE International Conference on Advanced Learning Technologies, ICALT, pp. 67–69 (2010)

36. Ridgway, J., McCusker, S.: Challenges for Research in e-Assessment. In: Scheuermann, F., Pereira, A.G. (eds.) Towards a Research Agenda on Computer-Based Assessment. Office for Official Publications of the European Communities, Luxembourg (2008)

37. Nunan, D.: Technology Supports for Second Language Learning. In: International Encyclopedia of Education, vol. 8, pp. 204–209. Elsevier, Oxford (2010)

38. Ljungdahl, L., Prescott, A.: Teachers’ use of diagnostic testing to enhance students’ literacy and numeracy learning. International Journal of Learning 16, 461–476 (2009)

39. Looney, J.: Making it Happen: Formative Assessment and Educational Technologies. Promethean Thinking Deeper Research Papers 1 (2010)

40. Huang, C.J., Wang, Y.W., Huang, T.H., Chen, Y.C., Chen, H.M., Chang, S.C.: Performance evaluation of an online argumentation learning assistance agent. Computers and Education 57, 1270–1280 (2011)

41. Olney, A.M.: GnuTutor: An open source intelligent tutoring system based on AutoTutor. Cognitive and Metacognitive Educational Systems: Papers from the AAAI Fall Symposium FS-09-02, 70–75 (2009)

42. Graesser, A.: Autotutor and the world of pedagogical agents: Intelligent tutoring systems with natural language dialogue. In: 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22, vol. 3 (2009)

43. D’Mello, S., Craig, S., Fike, K., Graesser, A.: Responding to Learners’ Cognitive-Affective States with Supportive and Shakeup Dialogues. In: Jacko, J.A. (ed.) HCI International 2009, Part III. LNCS, vol. 5612, pp. 595–604. Springer, Heidelberg (2009)

44. Ritter, S., Anderson, J.R., Koedinger, K.R., Corbett, A.: Cognitive tutor: Applied research in mathematics education. Psychonomic Bulletin and Review 14, 249–255 (2007)

45. Graff, M., Mayer, P., Lebens, M.: Evaluating a web based intelligent tutoring system for mathematics at German lower secondary schools. Education and Information Technologies 13, 221–230 (2008)


46. Means, B., Roschelle, J.: An Overview of Technology and Learning. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 1–10. Elsevier, Oxford (2010)

47. DeFrance, N., Khasnabis, D., Palincsar, A.S.: Reading and Technology. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 150–157. Elsevier, Oxford (2010)

48. Duchateau, J., Kong, Y.O., Cleuren, L., Latacz, L., Roelens, J., Samir, A., Demuynck, K., Ghesquière, P., Verhelst, W., Hamme, H.V.: Developing a reading tutor: Design and evaluation of dedicated speech recognition and synthesis modules. Speech Communication 51, 985–994 (2009)

49. de Jong, T.: Technology Supports for Acquiring Inquiry Skills. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 167–171. Elsevier, Oxford (2010)

50. Barab, S.A., Scott, B., Siyahhan, S., Goldstone, R., Ingram-Goble, A., Zuiker, S.J., Warren, S.: Transformational play as a curricular scaffold: Using videogames to support science education. Journal of Science Education and Technology 18, 305–320 (2009)

51. Bloomfield, P.R., Livingstone, D.: Multi-modal learning and assessment in Second Life with quizHUD. In: Conference in Games and Virtual Worlds for Serious Applications, pp. 217–218 (2009)

52. Herrenkohl, L.R., Tasker, T., White, B.: Pedagogical practices to support classroom cultures of scientific inquiry. Cognition and Instruction 29, 1–44 (2011)

53. Delgado, C., Krajcik, J.: Technology and Learning - Supports for Subject Matter Learning. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 197–203. Elsevier, Oxford (2010)

54. Wallace, P., Graesser, A., Millis, K., Halpern, D., Cai, Z., Britt, M.A., Magliano, J., Wiemer, K.: Operation ARIES!: A computerized game for teaching scientific inquiry. Frontiers in Artificial Intelligence and Applications 200, 602–604 (2009)

55. Dede, C.: Technological Support for Acquiring Twenty-First-Century Skills. In: Peterson, P., Baker, E., McGaw, B. (eds.) International Encyclopedia of Education, vol. 8, pp. 158–166. Elsevier, Oxford (2010)

56. Ripley, M.: Transformational Computer-based Testing. In: Scheuermann, F., Björnsson, J. (eds.) The Transition to Computer-Based Assessment. New Approaches to Skills Assessment and Implications for Large-scale Testing. Office for Official Publications of the European Communities, Luxembourg (2009)

57. MCEECDYA: National Assessment Program. ICT Literacy Years 6 and 10 Report 2008. Ministerial Council for Education, Early Childhood Development and Youth Affairs (MCEECDYA) (2008)

58. Glas, C.A.W., Geerlings, H.: Psychometric aspects of pupil monitoring systems. Studies in Educational Evaluation 35, 83–88 (2009)