Boris Milašinović, Faculty of Electrical Engineering and Computing, University of Zagreb, Croatia
Some experiences using various assessment methods, with emphasis on the automatic evaluation of (programming) assignments
Boris Milašinović
Faculty of Electrical Engineering and Computing
University of Zagreb, Croatia
2
Motivation
Courses:
  Programming (and software engineering)
  Data Structures and Algorithms
  >= 700 students on each course
Old examination system:
  2 mid-term exams
  Classic written exam as an option
  => no continuous assessment!
How to introduce continuous assessment?
3
Continuous assessment
  Mid-term exams and final exam
  Periodical small tests
  Homeworks
  In-class assessment
Problems:
  Manual reviewing is usually more precise but time-consuming
  Lack of teaching staff
Solution:
  Using software for automatic evaluation of assignments
  Completely automatic reviewing would be too strict
  => Mid-term and final exams reviewed manually
  => Homeworks and quizzes (multiple-choice tests) with automatic evaluation
4
Grading scheme (1 point = 1% of final points)
Programming and software engineering (points):
  1st mid-term exam: 15
  2nd mid-term exam: 25
  Final exam: 30
  Homeworks: 3 x 2 = 6
  Quizzes: 6 x 3 = 18
  Class activity: 6
Algorithms and Data Structures (points):
  1st mid-term exam: 15
  2nd mid-term exam: 20
  Final exam: 30
  Homeworks: 3 x 3 = 9
  Quizzes: 6 x 3 = 18
  Class activity: 8
To pass, a student must collect at least 50 points in total and at least 8 (of 30) points on the final exam
Grades are awarded according to a Gaussian distribution
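The pass rule above can be written down as a small check. This is only a sketch: the function name `passes` and the dictionary keys are made up for illustration, while the thresholds and component maxima are the ones listed on the slide.

```python
def passes(total_points: float, final_exam_points: float) -> bool:
    """Return True if both pass thresholds from the slide are met."""
    min_total = 50  # at least 50 of 100 points overall
    min_final = 8   # at least 8 of 30 points on the final exam
    return total_points >= min_total and final_exam_points >= min_final


# Maximum points per component, as listed on the slide (keys are illustrative).
PSE_MAX = {"mid1": 15, "mid2": 25, "final": 30, "homeworks": 6, "quizzes": 18, "class": 6}
ADS_MAX = {"mid1": 15, "mid2": 20, "final": 30, "homeworks": 9, "quizzes": 18, "class": 8}

# Both schemes add up to 100 points, i.e. 1 point = 1% of the final score.
print(sum(PSE_MAX.values()), sum(ADS_MAX.values()))  # 100 100

print(passes(62, 7))  # False: enough points in total, too few on the final exam
print(passes(50, 8))  # True: both thresholds are met exactly
```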
5
Multiple-choice tests (Quizzes)
30 minutes, 12 questions, 5 possible answers (only 1 is correct)
Programming and software engineering: 6 quizzes (3 points per quiz)
  Correct answer: 0.25 points
  Incorrect answer: -0.05 points
Algorithms and Data Structures: 3 quizzes (6 points per quiz)
  Correct answer: 0.5 points
  Incorrect answer: -0.1 points
1 point is equal to 1% of the final score
To retain the initial quality, new questions had to be added every year to enlarge the question database.
The time span between the assessments should be reduced (all students should take the test on the same day)
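The scoring rule can be checked with a few lines of arithmetic. The sketch below assumes that an unanswered question is worth 0 points (the slide does not say so), and the function names are illustrative. It also computes the expected gain from guessing one question blindly among the 5 options.

```python
from fractions import Fraction  # exact arithmetic, avoids float rounding noise


def quiz_score(correct, incorrect, reward, penalty):
    """Score for a quiz; unanswered questions are assumed to be worth 0."""
    return correct * reward - incorrect * penalty


def expected_guess_gain(reward, penalty, options=5):
    """Expected points from guessing one question uniformly at random."""
    p_right = Fraction(1, options)
    return reward * p_right - penalty * (1 - p_right)


PSE = (Fraction(1, 4), Fraction(1, 20))  # +0.25 / -0.05 per question
ADS = (Fraction(1, 2), Fraction(1, 10))  # +0.5  / -0.1  per question

print(float(quiz_score(12, 0, *PSE)))    # 3.0  (maximum, 12 correct answers)
print(float(quiz_score(0, 12, *PSE)))    # -0.6 (minimum, 12 incorrect answers)
print(float(quiz_score(12, 0, *ADS)))    # 6.0
print(float(expected_guess_gain(*PSE)))  # 0.01
print(float(expected_guess_gain(*ADS)))  # 0.02
```

Under these assumptions the penalty almost cancels the benefit of blind guessing: the expected gain per guessed question is slightly above zero in both courses.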
6
Multiple-choice tests (Quizzes) – Histograms of results in 2007/08
[Figure: histograms of quiz results with fitted normal curves; x-axis: quiz score, y-axis: number of observations. Fitted curves are given as normal(x; mean; standard deviation).]
Programming and software engineering (data set PiPI2007Z, 861 cases):
  Quiz 1: normal(x; 2.1742; 0.6827) (859 observations)
  Quiz 2: normal(x; 1.858; 0.7399)
  Quiz 3: normal(x; 1.7513; 0.7998)
  Quiz 4: normal(x; 2.1584; 0.8334)
  Quiz 5: normal(x; 1.7771; 0.8729)
  Quiz 6: normal(x; 1.6478; 0.8756)
Algorithms and Data Structures (data set ASP_2007, 729 cases):
  Quiz 1: normal(x; 4.3102; 1.3663)
  Quiz 2: normal(x; 4.1034; 1.5903)
  Quiz 3: normal(x; 3.5207; 1.7958)
7
Suitability of multiple-choice tests (Quizzes)
Advantages:
  Simple to create and easy to run
  Quick (a quiz lasts 30 minutes)
  Good distribution of results
  Helps achieve continuous assessment
Disadvantages:
  Backwash effect: students tend to study the material needed to pass the test, not the material representing the core knowledge being taught
  Does not represent real knowledge of programming! Students learn how to recognize answers.
8
Homeworks
Students are given either whole programs or individual functions as programming assignments
On upload, students' code is joined with the code previously defined by the teachers and compiled.
Upon successful compilation, the program is run against the predefined tests and its output is compared with the expected results.
  Tests with fixed data sets
  Randomly generated inputs
Time to collect the assignment: 7-10 days
Time to solve: 2 days from the time of assignment collection
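The evaluation pipeline described above (join, compile, run against tests, compare output) can be sketched as follows. This is not the system used in the courses. It is a minimal illustration in Python, where the compile step is replaced by the interpreter rejecting invalid code. The harness, the function name `solve`, the input format and the ranges of the random inputs are all assumptions.

```python
import random
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical teacher-side code: reads integers from stdin and prints the
# result of the student's function `solve` (the name is an assumption).
TEACHER_HARNESS = """
import sys
numbers = [int(tok) for tok in sys.stdin.read().split()]
print(solve(numbers))
"""


def run_joined(student_code, stdin_text, timeout=5.0):
    """Join student code with the harness, run it, return stdout or None on failure."""
    with tempfile.TemporaryDirectory() as tmp:
        program = Path(tmp) / "submission.py"
        program.write_text(student_code + "\n" + TEACHER_HARNESS)
        try:
            proc = subprocess.run(
                [sys.executable, str(program)],
                input=stdin_text, capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return None
    return proc.stdout.strip() if proc.returncode == 0 else None


def grade(student_code, reference, fixed_inputs, random_cases=3, seed=0):
    """Run fixed and randomly generated tests; return (passed, total)."""
    rng = random.Random(seed)
    inputs = [list(case) for case in fixed_inputs]
    for _ in range(random_cases):
        inputs.append([rng.randint(-100, 100) for _ in range(rng.randint(1, 10))])
    passed = 0
    for numbers in inputs:
        expected = str(reference(numbers))  # teacher's reference solution
        actual = run_joined(student_code, " ".join(map(str, numbers)))
        passed += int(actual == expected)
    return passed, len(inputs)


student = "def solve(numbers):\n    return sum(numbers)\n"
print(grade(student, sum, [[1, 2, 3], [0], [-5, 5]]))  # (6, 6)
```

Returning the number of passed tests rather than a single pass/fail flag leaves room for partial credit.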
9
Avoiding ambiguities in homework test definitions
Switching to automatic evaluation can bring problems:
  Lack of experience (on the part of both students and teachers)
  Avoiding ambiguities in test definitions
Exams for automatic evaluation must be precisely defined
First test with automatic evaluation:
  300 e-mails received with complaints about the automatic evaluation process (more than 40%)
Now: fewer than 40 complaints per test (5%) => mostly requests to explain the errors in the student's solution
10
Homeworks - Histograms
[Figure: histograms of homework results with fitted normal curves; x-axis: homework score, y-axis: number of observations. Fitted curves are given as normal(x; mean; standard deviation).]
Algorithms and Data Structures (data set ASP_2007, 729 cases):
  Homework 1: normal(x; 2.3047; 0.9892)
  Homework 2: normal(x; 2.4412; 0.9108)
  Homework 3: normal(x; 2.4295; 1.0057)
Programming and software engineering (data set PiPI2007Z, 861 cases):
  Homework 1: normal(x; 1.7321; 0.6197)
  Homework 2: normal(x; 1.6099; 0.7232)
  Homework 3: normal(x; 1.4473; 0.8636)
11
Suitability of automatically evaluated homeworks
Advantages:
  Improves students' programming skills
  More time needed to create assignments, but no manual review afterwards
Disadvantages:
  A minor error can lead to zero points for the assignment
    More tests refine the grading scale
    Running pre-test examples before the final submission
  Poor distribution (although not unusual for homeworks)
  Easy to cheat
    Suspicious situations: similar solutions, too short a time between assignment collection and submission
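The two suspicious situations named above could be flagged automatically along the following lines. This is a rough sketch, not the procedure used in the courses. The `Submission` record, the 5-minute limit and the 0.9 similarity threshold are assumptions chosen for illustration, and plain text similarity is only a crude stand-in for a real plagiarism detector.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from itertools import combinations


@dataclass
class Submission:  # hypothetical record of one uploaded solution
    student: str
    collected_at: datetime  # when the assignment was collected
    submitted_at: datetime  # when the solution was uploaded
    source: str


def too_fast(sub, min_time=timedelta(minutes=5)):
    """Flag a solution submitted suspiciously soon after collection."""
    return sub.submitted_at - sub.collected_at < min_time


def similar_pairs(subs, threshold=0.9):
    """Return pairs of students whose sources are nearly identical as text."""
    pairs = []
    for a, b in combinations(subs, 2):
        ratio = SequenceMatcher(None, a.source, b.source).ratio()
        if ratio >= threshold:
            pairs.append((a.student, b.student))
    return pairs


# Made-up example data.
t0 = datetime(2008, 3, 1, 12, 0)
subs = [
    Submission("A", t0, t0 + timedelta(hours=3), "for(i=0;i<n;i++) s+=a[i];"),
    Submission("B", t0, t0 + timedelta(minutes=2), "for(i=0;i<n;i++) s+=a[i];"),
    Submission("C", t0, t0 + timedelta(hours=20), "return n*(n+1)/2;"),
]
print([s.student for s in subs if too_fast(s)])  # ['B']
print(similar_pairs(subs))                       # [('A', 'B')]
```

A flag raised this way is only a reason for a manual look, not proof of cheating.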
12
Correlation between results on each assignment and student's final grade – Algorithms and Data Structures (2007/08)
Columns: E1, E2 = 1st and 2nd mid-term exam; FE = final exam; Q1-Q3 = quizzes; H1-H3 = homeworks; CA = classroom activity; T = total; GR = grade
                     E1    E2    FE    Q1    Q2    Q3    H1    H2    H3    CA    T     GR
1st mid-term exam    1.00  0.67  0.64  0.53  0.51  0.37  0.14  0.24  0.31  0.45  0.80  0.75
2nd mid-term exam    0.67  1.00  0.68  0.47  0.55  0.37  0.12  0.22  0.31  0.48  0.84  0.80
Final exam           0.64  0.68  1.00  0.52  0.58  0.52  0.18  0.25  0.34  0.53  0.90  0.86
Quiz 1               0.53  0.47  0.52  1.00  0.53  0.41  0.20  0.27  0.39  0.42  0.66  0.56
Quiz 2               0.51  0.55  0.58  0.53  1.00  0.49  0.19  0.28  0.40  0.47  0.72  0.62
Quiz 3               0.37  0.37  0.52  0.41  0.49  1.00  0.21  0.29  0.44  0.41  0.61  0.51
Homework 1           0.14  0.12  0.18  0.20  0.19  0.21  1.00  0.23  0.26  0.11  0.26  0.21
Homework 2           0.24  0.22  0.25  0.27  0.28  0.29  0.23  1.00  0.40  0.26  0.37  0.27
Homework 3           0.31  0.31  0.34  0.39  0.40  0.44  0.26  0.40  1.00  0.28  0.48  0.31
Classroom activity   0.45  0.48  0.53  0.42  0.47  0.41  0.11  0.26  0.28  1.00  0.66  0.62
Total                0.80  0.84  0.90  0.66  0.72  0.61  0.26  0.37  0.48  0.66  1.00  0.92
Grade                0.75  0.80  0.86  0.56  0.62  0.51  0.21  0.27  0.31  0.62  0.92  1.00
13
Correlation between results on each assignment and student's final grade – Programming and software engineering (2007/08)
Columns: E1, E2 = 1st and 2nd mid-term exam; FE = final exam; Q1-Q6 = quizzes; H1-H3 = homeworks; CA = class activity; T = total number of points; GR = grade
                        E1    E2    FE    Q1    Q2    Q3    Q4    Q5    Q6    H1    H2    H3    CA    T     GR
1st mid-term exam       1.00  0.65  0.59  0.43  0.40  0.55  0.44  0.46  0.40  0.17  0.12  0.20  0.28  0.76  0.69
2nd mid-term exam       0.65  1.00  0.69  0.48  0.48  0.61  0.49  0.56  0.50  0.16  0.12  0.20  0.40  0.87  0.85
Final exam              0.59  0.69  1.00  0.52  0.55  0.64  0.59  0.68  0.64  0.18  0.17  0.33  0.48  0.91  0.87
Quiz 1                  0.43  0.48  0.52  1.00  0.55  0.56  0.50  0.48  0.51  0.24  0.20  0.29  0.41  0.62  0.55
Quiz 2                  0.40  0.48  0.55  0.55  1.00  0.59  0.57  0.55  0.55  0.21  0.22  0.34  0.50  0.64  0.57
Quiz 3                  0.55  0.61  0.64  0.56  0.59  1.00  0.57  0.62  0.59  0.22  0.21  0.32  0.42  0.74  0.68
Quiz 4                  0.44  0.49  0.59  0.50  0.57  0.57  1.00  0.70  0.65  0.29  0.29  0.43  0.51  0.68  0.56
Quiz 5                  0.46  0.56  0.68  0.48  0.55  0.62  0.70  1.00  0.70  0.24  0.24  0.41  0.53  0.74  0.66
Quiz 6                  0.40  0.50  0.64  0.51  0.55  0.59  0.65  0.70  1.00  0.23  0.25  0.42  0.51  0.70  0.59
Homework 1              0.17  0.16  0.18  0.24  0.21  0.22  0.29  0.24  0.23  1.00  0.24  0.25  0.25  0.27  0.20
Homework 2              0.12  0.12  0.17  0.20  0.22  0.21  0.29  0.24  0.25  0.24  1.00  0.22  0.24  0.24  0.16
Homework 3              0.20  0.20  0.33  0.29  0.34  0.32  0.43  0.41  0.42  0.25  0.22  1.00  0.32  0.38  0.26
Class activity          0.28  0.40  0.48  0.41  0.50  0.42  0.51  0.53  0.51  0.25  0.24  0.32  1.00  0.57  0.50
Total number of points  0.76  0.87  0.91  0.62  0.64  0.74  0.68  0.74  0.70  0.27  0.24  0.38  0.57  1.00  0.93
Grade                   0.69  0.85  0.87  0.55  0.57  0.68  0.56  0.66  0.59  0.20  0.16  0.26  0.50  0.93  1.00
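The slides do not state which correlation coefficient was used. Assuming it is the usual Pearson coefficient, each table entry would be computed from two per-student score columns as in the sketch below. The scores here are made up purely for illustration and are not taken from the course data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical scores of six students (not real course data).
final_exam = [8, 12, 15, 20, 24, 29]
homework_1 = [2, 0, 2, 2, 1, 2]
total = [41, 52, 60, 71, 80, 95]

print(round(pearson(final_exam, final_exam), 2))  # 1.0 (the table diagonal)
print(pearson(final_exam, total) > pearson(homework_1, total))  # True
```

In the toy data, as in the tables, the final exam tracks the total much more closely than a homework score does.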
14
Prediction of grades after 1/3 of the semester

Classification Matrix (Algorithms and Data Structures, 2007/08)
Based on: 1 homework, 1 quiz, 1 mid-term exam
Rows: Grades   Columns: Predicted grades
Grade   Percent   1     2     3     4     5
1       72.73%    112   0     38    4     0
2       0%        25    0     52    15    0
3       52.55%    23    2     103   67    1
4       68.78%    1     0     51    130   7
5       39.80%    0     0     0     59    39
Total   52.67%    161   2     244   275   47

Classification Matrix (Programming and software engineering, 2007/08)
Based on: 1 homework, 2 quizzes, 1 mid-term exam
Rows: Grades   Columns: Predicted grades
Grade   Percent   1     2     3     4     5
1       76.76%    218   0     60    6     0
2       0%        33    0     45    9     0
3       51%       35    0     102   63    0
4       73.13%    3     0     48    147   3
5       11.49%    0     0     2     75    10
Total   55.53%    289   0     257   300   13
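The "Percent" column and the "Total" row can be reproduced directly from the counts. The sketch below does this for the Algorithms and Data Structures matrix after 1/3 of the semester. The function names are illustrative and the numbers are copied from the slide.

```python
# Rows: actual grades 1..5, columns: predicted grades 1..5
# (Algorithms and Data Structures, after 1/3 of the semester).
MATRIX = [
    [112, 0, 38, 4, 0],
    [25, 0, 52, 15, 0],
    [23, 2, 103, 67, 1],
    [1, 0, 51, 130, 7],
    [0, 0, 0, 59, 39],
]


def per_grade_accuracy(matrix):
    """Share of each actual grade that was predicted correctly, in percent."""
    return [100 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Share of all students whose grade was predicted correctly, in percent."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


def predicted_totals(matrix):
    """Number of students predicted to receive each grade (column sums)."""
    return [sum(col) for col in zip(*matrix)]


print([round(p, 2) for p in per_grade_accuracy(MATRIX)])  # [72.73, 0.0, 52.55, 68.78, 39.8]
print(round(overall_accuracy(MATRIX), 2))                 # 52.67
print(predicted_totals(MATRIX))                           # [161, 2, 244, 275, 47]
```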
15
Prediction of grades after 2/3 of the semester

Classification Matrix (Algorithms and Data Structures, 2007/08)
Based on: 2 homeworks, 2 quizzes, 2 mid-term exams
Rows: Grades   Columns: Predicted grades
Grade   Percent   1     2     3     4     5
1       75.32%    116   19    18    1     0
2       17.39%    17    16    56    3     0
3       76.53%    4     7     150   35    0
4       69.31%    0     1     41    131   16
5       62.24%    0     0     0     37    61
Total   65.02%    137   43    265   207   77

Classification Matrix (Programming and software engineering, 2007/08)
Based on: 2 homeworks, 4 quizzes, 2 mid-term exams
Rows: Grades   Columns: Predicted grades
Grade   Percent   1     2     3     4     5
1       86.27%    245   7     28    4     0
2       14.95%    29    13    44    1     0
3       72.5%     15    3     145   37    0
4       75.62%    0     0     36    152   13
5       68.97%    0     0     0     27    60
Total   71.59%    289   23    253   221   73
16
Conclusion
Why is prediction not more accurate?
  Extremely poor prediction for grade 2!
  Reason: a significant number of students study just enough to pass
  Gaussian distribution of grades: 15%-35%-35%-15%
Nevertheless: very good prediction of the number of students that will pass the exam
What about various assessment methods?
  Requires more work but covers more aspects of assessment
  Eliminates the backwash effect
  Continuous assessment
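The claim about the number of passing students can be checked against the classification matrices. The sketch below assumes that grade 1 is the failing grade. The slides do not state this explicitly, but it is consistent with the 15%-35%-35%-15% split over grades 2 to 5. The grade-1 rows and column totals are copied from the matrices. The course abbreviations and the function name are illustrative.

```python
# (course, stage) -> (actual grade-1 row of the matrix, predicted grade-1 column total)
GRADE_1 = {
    ("ADS", "1/3"): ([112, 0, 38, 4, 0], 161),
    ("ADS", "2/3"): ([116, 19, 18, 1, 0], 137),
    ("PSE", "1/3"): ([218, 0, 60, 6, 0], 289),
    ("PSE", "2/3"): ([245, 7, 28, 4, 0], 289),
}
# Number of students in each matrix (sum of all its entries).
STUDENTS = {"ADS": 729, "PSE": 859}


def pass_counts(course, stage):
    """Return (actual, predicted) number of passing students (grades 2..5)."""
    row, predicted_fail = GRADE_1[(course, stage)]
    actual_fail = sum(row)
    total = STUDENTS[course]
    return total - actual_fail, total - predicted_fail


for key in GRADE_1:
    print(key, pass_counts(*key))
# ('ADS', '1/3') (575, 568)
# ('ADS', '2/3') (575, 592)
# ('PSE', '1/3') (575, 570)
# ('PSE', '2/3') (575, 570)
```

Under this assumption the predicted number of passing students is within about 3% of the actual number in all four cases, even though individual grades are predicted far less accurately.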