Survival of the fittest in the jungle of Open Educational Resources
15th of August 2014
a Master Thesis by
Sander Latour
These slides were used in the defense of my thesis, which concluded my master's in Artificial Intelligence. If you are interested in my thesis, contact me.
Open Educational Resources [1]
Learning objects that can freely be
reused, revised, remixed and redistributed.
[1] Daniel E Atkins, John S Brown and Allen L Hammond. Creative Common, 2007. A review of the open educational resources (OER) movement: Achievements, challenges, and new opportunities.
Let’s start by defining what Open Educational Resources are. Several definitions exist and a lot can be said about the term; as a starting point to that body of work I refer the reader to the article cited at the bottom of the slide. For our purposes it suffices to say that Open Educational Resources are learning objects that can freely be reused, revised (as in altered), remixed (as in used in different combinations) and redistributed. As a result, millions of learning objects are now available to teachers around the world, and that number is expected to start growing exponentially at some point.
Types of learning objects: textual objects, video objects, interactive objects
[Slide illustration: example learning objects, including the text “You should not focus on every detail. Stick to the bigger picture.”]
Depending on the definition, learning objects can mean anything from a single paragraph to entire courses. In the context of this thesis, learning objects can be textual explanations, video clips and interactive content.
OER Sequence
Typically, a teacher or teacher figure composes a sequence of these educational resources to present to a student.
However, many different sequences are possible, of varying lengths. The teacher aims to find exactly the sequence that maximizes how much the student learns from it.
[Slide illustration: OER sequences placed between a pre-test T1 and a post-test T2]
Put differently: if we measure the competence of a student on a particular topic before and after presenting the sequence of OER, then we are interested in the sequence for which we expect the difference between the two tests, the impact of the sequence, to be the biggest.
$$\mathrm{NLG} = \frac{T_2 - T_1}{T_{\max} - T_1} \qquad \text{(the “impact”)}$$
To be precise, we use the Normalized Learning Gain (NLG) metric for that: the gain that was actually achieved, expressed as a fraction of the gain that could theoretically have been achieved.
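A minimal sketch of that metric, assuming test scores are plain numbers between 0 and the maximum score. The function name and the handling of a perfect pre-test score are illustrative assumptions, not taken from the thesis.

```python
def normalized_learning_gain(pre, post, max_score):
    """Return (post - pre) / (max_score - pre): the achieved share of the possible gain."""
    if not 0 <= pre <= max_score or not 0 <= post <= max_score:
        raise ValueError("scores must lie between 0 and max_score")
    if pre == max_score:
        # Assumption: with a perfect pre-test nothing can be gained,
        # so the NLG is left undefined instead of dividing by zero.
        return None
    return (post - pre) / (max_score - pre)
```

With a three-question test, a student who answers one question correctly before the lesson and two afterwards gets `normalized_learning_gain(1, 2, 3) == 0.5`. A student who does worse on the post-test gets a negative value.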
$E(\mathrm{NLG}(\text{sequence}))$
Obviously, when we are composing the sequence for a student, we do not know what the result will be. Instead we have to estimate what we expect the impact to be. How can we make that estimate when we know very little about the content of the OER?
[Slide illustration: one measurement, NLG1, as input to the estimate E(NLG)]
One logical choice would be to present the sequence of material to some students and see how that goes.
[Slide illustration: eight measurements, NLG1 to NLG8, as input to the estimate E(NLG)]
But of course testing it on one student would not suffice. So perhaps we would test it on ten students, or ten thousand.
And then it might turn out that the sequence was not so good after all, which makes the students very sad. They could have learned more if you had presented them with a different sequence of material.
[Slide illustration: measurements NLG1 to NLG8 and the estimate E(NLG), with the teacher's remark: “I regret trying this”]
And that situation would make the teacher regret the decision to present that sequence.
Exploration vs. exploitation
Minimize regret online (“while learning”)
This brings us to the dilemma that is the starting point of my thesis. On the one hand, it is important not to give up on apparently bad sequences after only a few students. On the other hand, we don’t want to experiment too much either, since every poorly chosen sequence costs a student some learning, so it is also important to stick to the sequences you know are good. We need to balance these needs for exploration and exploitation in order to minimize the regret we accumulate while finding out which sequences are effective.
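One common way to make “regret” precise comes from the multi-armed bandit literature; it is given here as an illustration and is not necessarily the exact definition used in my thesis. The symbols are assumptions: $s_t$ is the sequence shown to the $t$-th student, $\mu_s$ is the expected NLG of sequence $s$, and $\mu^{*}$ is the expected NLG of the best sequence.

$$R_N = \sum_{t=1}^{N} \left( \mu^{*} - \mu_{s_t} \right), \qquad \mu^{*} = \max_{s} \mu_{s}$$

Every student who receives a less-than-best sequence adds to $R_N$, so a system that learns well shows built-up regret that flattens out over time.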
Survival of the fittest
UCB + a Genetic Algorithm [2]
[2] A.E. Eiben and J.E. Smith. Natural Computing, 2007. Introduction to Evolutionary Computing.
That concludes what I am trying to do. Now let’s look at how I am trying to do that. And we’ll pick up the pace a bit as well, as I explain what Upper Confidence Bound selection and Genetic Algorithms are.
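UCB-1 [3]
$$\overline{\mathrm{NLG}}_s + \sqrt{\frac{2 \ln(n)}{n_s}}$$
where $\overline{\mathrm{NLG}}_s$ is the average NLG of sequence $s$, $n$ is the total nr. of evaluations and $n_s$ is the nr. of times sequence $s$ was tried.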
[3] P. Auer, N. Cesa-Bianchi and P. Fischer. Machine learning, 2002. Finite-time analysis of the multiarmed bandit problem.
UCB tries to balance exploration and exploitation when selecting sequences to present. It calculates the value of the equation on this slide for every sequence and then picks the sequence with the highest value. The equation consists of two terms. The first term, before the ‘+’, represents how effective the sequence was in our past experience; it ensures that sequences with a high impact on learning are picked more often. The second term reflects how often we gave the sequence a chance: if we gave the sequence to only a few students, this term grows large as the total number of evaluations increases. It thus ensures that we keep exploring sequences to see how good or bad they actually are. This is already a big improvement over just testing each sequence on ten thousand students.
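The selection rule can be sketched in a few lines. This is a minimal illustration of UCB-1 as described in [3], not the code of my thesis; the function names and the `stats` layout (a dict from a sequence to its observed NLG values) are assumptions made for the example.

```python
import math


def ucb1_score(mean_nlg, times_tried, total_evaluations):
    """Average impact plus the exploration bonus sqrt(2 ln(n) / n_s)."""
    return mean_nlg + math.sqrt(2 * math.log(total_evaluations) / times_tried)


def select_sequence(stats):
    """Pick the sequence with the highest UCB-1 value.

    stats maps a sequence (a tuple of OER ids) to the list of NLG values
    observed for it so far.
    """
    # Standard UCB-1 initialisation: try every untried sequence once first.
    for sequence, observations in stats.items():
        if not observations:
            return sequence
    total = sum(len(observations) for observations in stats.values())
    return max(
        stats,
        key=lambda s: ucb1_score(sum(stats[s]) / len(stats[s]), len(stats[s]), total),
    )
```

For example, with `{("a",): [0.9, 0.9, 0.9, 0.9], ("b",): [0.5]}` the rule picks `("b",)`: its average is lower, but it has been tried only once, so its exploration bonus is still large.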
The impacts of these sequences are not independent
If these two are effective
then it makes sense to try this
However, UCB treats all sequences as completely independent options: according to this mechanism, the impact of one sequence provides no information about the impact of others. Looking at the four sequences on the slide, this is probably not true. The sequences have a lot in common, and if the two middle ones turn out to be effective, then it makes sense to try the fourth one as well. This is the type of generalization that you might recognize from playing the game Mastermind.
Genetic Algorithms
Population containing a subset of candidates
Candidates have a “fitness” value, i.e. how good is it?
Higher fitness means higher chance of reproduction
Produced offspring is a combination of both parents
Inspired by Darwinian evolution
Genetic algorithms are much better at making these generalizations, albeit implicitly. Let’s start with a general idea of what a genetic algorithm is. First off, it works with the notion of a population that contains a subset of the possible candidates, much like a population of animals contains a subset of all possible chromosomes. Each of these candidates has a fitness value (remember: survival of the fittest) that expresses how good the candidate is; in our case, how big the impact of a sequence on learning is. A higher fitness means the candidate has a higher chance of reproduction. Any offspring produced by that reproduction is a combination of both parents. In other words, a child sequence looks a lot like the two parent sequences.
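[Slide diagram: current population, then selecting the most promising sequence with UCB-1, $\overline{\mathrm{NLG}}_s + \sqrt{2 \ln(n) / n_s}$, then evaluation of impact with pre-test T1 and post-test T2]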
Now let’s put the pieces together. Instead of all possible sequences, we now have a population of sequences to choose from. UCB picks the most promising sequence to try. That sequence is then presented to a student, which provides us with a new measurement of the NLG (impact) of the sequence.
[Slide diagram: fitness values NLG1 to NLG5 of the current generation, then roulette selection of parents, then crossover & mutation, then offspring, then generational replacement with elite preservation (next generation = elite + offspring)]
As you now know, this impact determines the chance of that sequence reproducing. This is done by creating a metaphorical wheel of fortune, divided into regions that correspond to sequences. The size of each region is proportional to the fitness of its sequence, which makes very effective sequences far more likely to become a parent than ineffective ones. The selected parent pairs then recombine and produce two new sequences as their offspring. The offspring replace their parents in the next generation, thereby becoming selectable by UCB. This is called generational replacement. In my thesis I also implemented elite preservation, which ensures that the very best sequences of the current generation get a free pass to survive unchanged in the new generation: they are not replaced by children or altered in any other way. This allows us to keep selecting the best sequence found so far, while the new offspring can potentially contain an even better one.
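A minimal sketch of roulette selection and generational replacement with elite preservation. It is an illustration under stated assumptions rather than the implementation from my thesis: the function names are made up, `reproduce` is a caller-supplied function that must return at least one child per call, negative fitness values are clipped to zero, and a parent may be paired with itself.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one parent with probability proportional to its fitness."""
    # Assumption: a negative fitness (negative NLG) is clipped to zero, and a
    # tiny constant keeps the wheel valid when every fitness is zero.
    weights = [max(fitness[s], 0.0) + 1e-9 for s in population]
    return rng.choices(population, weights=weights, k=1)[0]


def next_generation(population, fitness, reproduce, n_elite, rng):
    """Generational replacement with elite preservation."""
    ranked = sorted(population, key=lambda s: fitness[s], reverse=True)
    new_population = ranked[:n_elite]  # the elite survives unchanged
    while len(new_population) < len(population):
        mother = roulette_select(population, fitness, rng)
        father = roulette_select(population, fitness, rng)
        # reproduce stands for crossover followed by mutation.
        for child in reproduce(mother, father, rng):
            if len(new_population) < len(population):
                new_population.append(child)
    return new_population
```

Sequences are represented as tuples so they can serve as keys of the `fitness` dict, and passing in a `random.Random` instance keeps runs reproducible.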
Genes and chromosome
Permutation encoding
… with varying length
… with partial permutations
Crossover: one-point crossover, append crossover
Mutation: swap mutation, addition mutation, deletion mutation
Now you might remember that when reproduction occurs in nature, the chromosome of the offspring is determined in part by the chromosome of the first parent and in part by the chromosome of the second parent. I’ve encoded our domain with the permutation encoding, which means the chromosome is the sequence, consisting of genes that represent the learning objects. In one-point crossover each parent sequence is split at a random point, resulting in two parts of the first parent and two parts of the second parent. These are then shuffled and recombined into the new sequences. If a parent sequence contains only one learning object, you cannot split it up. In that case the recombination takes place by placing the chromosomes of the parents before and after each other, creating two new sequences (append crossover). After crossover, a small mutation may occur on the offspring’s chromosome: either two learning objects are swapped in position, a new learning object is added to the sequence, or a random learning object is removed from the sequence.
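The operators can be sketched as follows. This is one possible reading of the description, not the code of my thesis. The assumptions are: a child is the head of one parent followed by the tail of the other, a learning object that would appear twice is kept only at its first position, and deletion never empties a sequence. All names are illustrative.

```python
import random


def dedupe(sequence):
    """Keep the first occurrence of each OER so the child stays a partial permutation."""
    return tuple(dict.fromkeys(sequence))


def crossover(parent_a, parent_b, rng):
    """One-point crossover, falling back to append crossover for length-1 parents."""
    if len(parent_a) < 2 or len(parent_b) < 2:
        # Append crossover: place the parents before and after each other.
        return dedupe(parent_a + parent_b), dedupe(parent_b + parent_a)
    cut_a = rng.randint(1, len(parent_a) - 1)
    cut_b = rng.randint(1, len(parent_b) - 1)
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return dedupe(child_1), dedupe(child_2)


def mutate(sequence, all_oer, rng, rate=0.05):
    """With probability `rate`, apply one swap, addition or deletion mutation."""
    if rng.random() >= rate:
        return sequence
    genes = list(sequence)
    unused = [oer for oer in all_oer if oer not in genes]
    options = []
    if len(genes) >= 2:
        options += ["swap", "delete"]  # assumption: never delete the last gene
    if unused:
        options.append("add")
    if not options:
        return sequence
    choice = rng.choice(options)
    if choice == "swap":
        i, j = rng.sample(range(len(genes)), 2)
        genes[i], genes[j] = genes[j], genes[i]
    elif choice == "add":
        genes.insert(rng.randint(0, len(genes)), rng.choice(unused))
    else:
        del genes[rng.randrange(len(genes))]
    return tuple(genes)
```

For instance, `crossover(("A",), ("B", "C"), rng)` uses append crossover and returns `(("A", "B", "C"), ("B", "C", "A"))` regardless of the random state.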
Evaluation & Results
Experiment with online “course”
That concludes the mechanism of the system I’ve created to minimize our regret while selecting sequences of learning material for students. I’ve also tested this in an experiment that took the form of a small online course.
Nim game
Curriculum: 4 lessons, 4 OER, pre-test T1 and post-test T2 (3 MC questions each)
Student groups: Low, High
Algorithm: 7 sequences in 1 generation, 10 evaluations in 1 generation, 2 elite members, 5% mutation
Participants: 237 total usable participants, voluntary participation, could stop at any moment, diverse crowd (not just students)
The course contained a curriculum about the mathematical game Nim. It consisted of four lessons; in each lesson a student was presented with a sequence of learning material composed from four possible OER. The pre-test and post-test each consisted of three multiple-choice questions, intended to keep the duration of the experiment manageable. The students were split into two groups: a low amount of prior knowledge and a high amount of prior knowledge. The optimal sequence was searched for both groups, more or less independently. The genetic algorithm was set up to have 7 sequences in each generation, with 10 evaluations before the next generation, two elite members in each generation and a 5% chance of mutation in the offspring’s chromosomes. The participants came from many different online sources, participated voluntarily and could stop at any moment. In total, 237 participants contributed to the data gathered.
Does the system learn to pick sequences with more learning impact over those with less impact?
Figure: Regret in Rules - Low (y-axis: built-up regret, x-axis: students)
Figure: Regret in Intuition - Low (y-axis: built-up regret, x-axis: students)
The system worked for the “Low” student groups
In “High” groups there was either too little data or a technical issue
It’s unknown how good the apparently best sequences really are
In each lesson, the system was successful in finding and staying with the sequences that afterwards appeared to be the best, at least for the Low student group. In the High student groups there was either too little data or a technical issue, and it is unknown whether the system would have worked otherwise. Furthermore, even though the system found the sequences that turned out to be the best so far, we do not know how good they actually are, since we have no benchmark or true value available.
Variance in the observed learning impact
Figure: Best sequence in “Rules” lesson for student group “Low” (y-axis: learning impact, x-axis: students)
Possible explanations: limited pre- and post-test, coarse division of students, independence assumption
It is reasonable to suspect that our estimated impact is far from perfect, given the large variation in observed learning impact. This graph shows the impact values for all students that went through the apparently best sequence of the first lesson. There are three possible explanations for that variation: 1) the pre- and post-test had too few questions to give a trustworthy measurement of competence; 2) the students were divided into only two groups, leaving each group very heterogeneous; 3) the lessons are assumed to be independent, so the system doesn’t take into account what material was presented to a student in a previous lesson. Of course, the third explanation cannot account for the variation in the first lesson.
In conclusion,
A possible approach for using learning impact in the assessment of OER has been presented and tested.
Many lessons can be drawn from the results, but the principle works.
I recommend that others continue on the path of using learning impact in the assessment of OER.