Download - Survival of the fittest in the jungle of OER

Survival of the fittest in the jungle of Open Educational Resources

15th of August 2014

a Master Thesis by

Sander Latour

These slides were used in the defense of my thesis, in closure of my master in Artificial Intelligence. If you are interested in my thesis, contact me.

Open Educational Resources [1]

Learning objects that can freely be reused, revised, remixed and redistributed.

[1] Daniel E Atkins, John S Brown and Allen L Hammond. Creative Common, 2007. A review of the open educational resources (OER) movement: Achievements, challenges, and new opportunities.

Let’s start by defining what Open Educational Resources are. There are several definitions available and a lot can be said about the term. As a starting point to that body of work, I refer the reader to the article cited at the bottom of the page. For our purposes it suffices to say that Open Educational Resources are learning objects that can freely be reused, revised (as in altered), remixed (as in used in different combinations) and redistributed. As a result, millions of learning objects are now available to teachers around the world, and this number is expected to start growing exponentially at some point.

Types of learning objects: textual objects, video objects, interactive objects

Figure: example learning objects. The textual example reads “You should not focus on every detail. Stick to the bigger picture.”

Depending on the definition, learning objects can mean anything from a single paragraph to entire courses. In the context of this thesis, learning objects can be textual explanations, video clips and interactive content.

OER Sequence

Typically, a teacher or teacher figure creates a sequence of these educational resources to present to a student.

However, there are many different sequences possible, of varying lengths. The teacher aims to find exactly that sequence that maximizes the learning done by the student as a result.

Figure: sequences of OER presented between a pre-test T1 and a post-test T2

Or, put differently: if we measured the competence of a student on a particular topic before and after the presentation of the sequence of OER, we would be interested in the sequence for which we expect the difference between the two tests, i.e. the impact of the sequence, to be the biggest.

$$\mathrm{NLG} = \frac{T_2 - T_1}{T_{\max} - T_1}$$

(the “impact” of the sequence)

To be precise, we use the Normalized Learning Gain (NLG) metric for that: the gain that was achieved, expressed as a percentage of the gain that theoretically could have been achieved.
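As a minimal sketch of this metric, assuming the common definition of normalized learning gain as (post-test minus pre-test) divided by (maximum score minus pre-test): the function and parameter names below are illustrative, and raising an error for a student who already scores the maximum on the pre-test is an assumption, not something taken from the thesis.

```python
def normalized_learning_gain(pre: float, post: float, max_score: float) -> float:
    """Return the achieved gain as a fraction of the gain that was still possible."""
    if pre >= max_score:
        # Assumption: with a perfect pre-test there is nothing left to gain,
        # so the metric is treated as undefined here.
        raise ValueError("NLG is undefined when the pre-test score is already maximal")
    return (post - pre) / (max_score - pre)


# A student who goes from 1 to 2 correct answers out of 3
# realizes half of the gain that was still available.
print(normalized_learning_gain(pre=1, post=2, max_score=3))  # 0.5
```

Note that the value can be negative when a student scores lower on the post-test than on the pre-test.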

E(NLG)

Obviously when we are collecting the sequence for a student, we do not know what the result would be. Instead we would have to make an estimate about what we expect the impact to be. How can we make that estimate, when we know very little about the content of the OER?

Figure: a single student's measurement NLG1 as an estimate of E(NLG)

One logical choice would be to present the sequence of material to some students and see how that goes.

Figure: measurements NLG1 to NLG8 from eight students as an estimate of E(NLG)

But of course testing it on one student would not suffice. So perhaps we would test it on ten students, or ten thousand.

Figure: measurements NLG1 to NLG8 from eight students as an estimate of E(NLG)

And then it might turn out that the sequence was not so good after all, which makes the students very sad. They could have learned more if you had presented them with a different sequence of material.

Figure: measurements NLG1 to NLG8 and the estimate E(NLG), with the teacher thinking “I regret trying this”

And that situation would make the teacher regret the decision to present that sequence.

Exploration vs. Exploitation

Minimize regret online (“while learning”)

E(NLG)

This brings us to the dilemma that is the starting point of my thesis. On the one hand, it is important not to give up on apparently bad sequences after only a few students. On the other hand, we also don’t want to experiment too much, since that would cost students too much learning, so it is important to also stick to the sequences you know are good. We need to balance these needs for exploration and exploitation in order to minimize the regret we will have while trying to find out which sequences are effective.
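To make the notion of regret concrete, here is a small sketch. It assumes the usual bandit-style definition, which may differ in detail from the one in the thesis: for every student, the regret is the expected impact of the best sequence minus the expected impact of the sequence that was actually presented. The sequence names and impact values are made up for illustration.

```python
def cumulative_regret(expected_impact, choices):
    """Return the built-up regret after each student.

    expected_impact: dict mapping a sequence name to its (hypothetical) true expected NLG.
    choices: the sequence presented to each successive student.
    """
    best = max(expected_impact.values())
    total = 0.0
    curve = []
    for sequence in choices:
        total += best - expected_impact[sequence]
        curve.append(total)
    return curve


# Two hypothetical sequences; presenting "B" costs 0.5 of expected impact each time.
impact = {"A": 0.75, "B": 0.25}
print(cumulative_regret(impact, ["B", "A", "B", "A"]))  # [0.5, 0.5, 1.0, 1.0]
```

A curve that flattens out means the system has stopped presenting inferior sequences. In a real experiment the true expected impacts are unknown, so such a curve can only be drawn afterwards from estimates.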

Survival of the fittest

UCB + a Genetic Algorithm [2]

[2] A.E. Eiben and J.E. Smith. Natural Computing, 2007. Introduction to Evolutionary Computing.

That concludes what I am trying to do. Now let’s look at how I am trying to do that. And we’ll pick up the pace a bit as well, as I explain what Upper Confidence Bound selection and Genetic Algorithms are.

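UCB-1 [3]

$$\overline{\mathrm{NLG}}_j + \sqrt{\frac{2 \ln(n)}{n_j}}$$

where $\overline{\mathrm{NLG}}_j$ is the average NLG observed for sequence $j$, $n$ is the total nr. of evaluations, and $n_j$ is the nr. of times sequence $j$ was tried.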

[3] P. Auer, N. Cesa-Bianchi and P. Fischer. Machine learning, 2002. Finite-time analysis of the multiarmed bandit problem.

UCB tries to balance exploration and exploitation when selecting sequences to present. It does so by calculating the value of the equation on this slide for every sequence, and then picking the sequence with the highest value. The equation consists of two terms. The first term, before the ‘+’, represents how effective the sequence was in our past experiences. That term ensures that sequences with a high impact on learning will be picked more often. The second term represents how often we gave the sequence a chance: if we only gave the sequence to a few students, then this term will become really big after a while. Thus, the second term ensures that we continue to explore sequences to see how good or bad they actually are. This is already a big improvement over just testing each sequence on ten thousand students.
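The selection rule can be sketched as follows, using the standard UCB-1 score (average reward plus the square root of 2 ln(n) divided by the number of tries). The data layout, a dict from sequence to the sum of observed NLG values and the number of tries, is an assumption made for this illustration, as is giving untried sequences an infinite score so that each is tried at least once.

```python
import math


def ucb1_score(mean_nlg, times_tried, total_evaluations):
    """Average impact so far plus an exploration bonus that shrinks with more tries."""
    if times_tried == 0:
        return math.inf  # assumption: untried sequences are always explored first
    return mean_nlg + math.sqrt(2 * math.log(total_evaluations) / times_tried)


def select_sequence(stats):
    """stats maps each sequence to (sum of observed NLG values, times tried)."""
    total = sum(tried for _, tried in stats.values())

    def score(item):
        _, (nlg_sum, tried) = item
        mean = nlg_sum / tried if tried else 0.0
        return ucb1_score(mean, tried, total)

    return max(stats.items(), key=score)[0]


# The second sequence has a slightly lower average but was tried only once,
# so its exploration bonus makes it the next one to present.
stats = {("a", "b"): (6.0, 10), ("b", "a"): (0.5, 1)}
print(select_sequence(stats))  # ('b', 'a')
```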

The impacts of these sequences are not independent

If these two are effective

then it makes sense to try this

However, UCB treats all sequences as completely independent options: according to this mechanism, the impact of one sequence provides no information about the impact of others. Looking at the four sequences on the slide, this is probably not true. The sequences have a lot in common, and if the two middle ones turn out to be effective, then it makes sense to try the fourth one as well. This is the type of generalization that you might recognize from playing the game Mastermind.

Genetic Algorithms

Population containing a subset of candidates

Candidates have a “fitness” value, i.e. how good is it?

Higher fitness means higher chance of reproduction

Produced offspring is a combination of both parents

Inspired by Darwinian evolution

Genetic algorithms are much better at making these generalizations, albeit implicitly. Let’s start with a general idea of what a genetic algorithm is. First off, it works with the notion of a population that contains a subset of the possible candidates, much like a population of animals contains a subset of all possible chromosomes. Each of these candidates has a fitness value (remember: survival of the fittest) that expresses how good the candidate is; in our case, how big the impact of a sequence is on learning. A higher fitness means the candidate has a higher chance of reproduction. Any offspring produced by that reproduction is a combination of both parents. In other words, a child sequence looks a lot like the two parent sequences.

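Figure: the UCB-1 score (average NLG plus exploration term) is used on the current population. Steps shown: Current population; Selecting most promising sequence; Evaluation of impact between pre-test T1 and post-test T2.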

Now let’s put the pieces together. Instead of all possible sequences, we now have a population of sequences to choose from. UCB picks the most promising sequence to try. That sequence is then presented to a student, which provides us with a new measurement of the NLG (impact) of the sequence.

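Figure: the measured impacts NLG1 to NLG5 of the sequences in the current generation determine the next generation in three steps.

Roulette selection of parents

Crossover & Mutation, producing offspring

Generational replacement with elite preservation: the next generation consists of the elite of the current generation plus the offspring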

As you now know, this impact determines the chance of that sequence reproducing. This is done by creating a metaphorical wheel of fortune, which is divided into regions corresponding to sequences. The size of each region is proportional to the fitness of the sequence. This makes very effective sequences far more likely to become a parent than ineffective sequences. The selected parent pairs then recombine and produce two new sequences as their offspring. The offspring then replace their parents in the generation, thereby becoming selectable by UCB. This is called generational replacement. In my thesis I also implemented elite preservation, which ensures that the very best sequences of the current generation get a free pass to survive unchanged into the new generation. They will not be replaced by children or changed in any other way. This allows us to continue to select the best sequence found so far, while the new offspring can potentially contain an even better one.
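The wheel of fortune and the replacement step can be sketched as below. This is an illustration under stated assumptions rather than the thesis implementation: sequences are tuples, `fitness` is a dict of average NLG values, negative fitness is clipped to zero so that it can serve as a wheel region (NLG can be negative), and `make_offspring` is a hypothetical callback that turns two parents into a list of children.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one candidate with probability proportional to its (clipped) fitness."""
    weights = [max(fitness[c], 0.0) for c in population]
    if sum(weights) == 0:
        return rng.choice(population)  # assumption: fall back to a uniform pick
    return rng.choices(population, weights=weights, k=1)[0]


def next_generation(population, fitness, make_offspring, n_elite, rng):
    """Generational replacement with elite preservation."""
    # The elite survive unchanged into the next generation.
    elite = sorted(population, key=lambda c: fitness[c], reverse=True)[:n_elite]
    children = []
    while len(elite) + len(children) < len(population):
        mother = roulette_select(population, fitness, rng)
        father = roulette_select(population, fitness, rng)
        children.extend(make_offspring(mother, father, rng))
    # Trim surplus children so the population size stays constant.
    return elite + children[: len(population) - len(elite)]


rng = random.Random(0)
population = [("a",), ("b",), ("c",), ("d",)]
fitness = {("a",): 0.9, ("b",): 0.1, ("c",): 0.5, ("d",): 0.0}
# Placeholder reproduction that simply copies the parents.
print(next_generation(population, fitness, lambda m, f, r: [m, f], 2, rng))
```

With `n_elite=2`, the first two entries of the result are always the two fittest sequences of the current generation.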

Genes and chromosome

Permutation encoding
… with varying length
… with partial permutations

One-point crossover, Append crossover

Swap mutation, Addition mutation, Deletion mutation

Now you might remember that when reproduction occurs in nature, the chromosome of the offspring is determined in part by the chromosome of the first parent and in part by the chromosome of the second parent. I’ve encoded our domain with the permutation encoding, which means the chromosome is the sequence, consisting of genes that represent the learning objects. In the one-point crossover, each parent sequence is split at a random point, resulting in two parts of the first parent and two parts of the second parent. These parts are then shuffled and recombined into the new sequences. If a parent sequence contains only one learning object, you cannot split it up. In that case the recombination takes place by placing the chromosomes of the parents before and after each other, creating two new sequences. After crossover, a small mutation may occur on the offspring’s chromosome: either two learning objects are swapped in position, a new learning object is added to the sequence, or a random learning object is removed from the sequence.
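A possible reading of these operators is sketched below. Several details are assumptions made for the illustration: sequences are tuples of learning-object names, each child takes the head of one parent and the tail of the other, repeated objects are dropped so that every child remains a partial permutation, and deletion is only allowed when at least one object would remain. The caller decides whether a mutation happens at all (for example with the mutation probability of the experiment).

```python
import random


def _dedupe(genes):
    """Keep the first occurrence of each gene so the result stays a partial permutation."""
    seen, out = set(), []
    for gene in genes:
        if gene not in seen:
            seen.add(gene)
            out.append(gene)
    return tuple(out)


def crossover(a, b, rng):
    """One-point crossover, or append crossover when a parent cannot be split."""
    if len(a) < 2 or len(b) < 2:
        return _dedupe(a + b), _dedupe(b + a)
    cut_a = rng.randrange(1, len(a))
    cut_b = rng.randrange(1, len(b))
    return _dedupe(a[:cut_a] + b[cut_b:]), _dedupe(b[:cut_b] + a[cut_a:])


def mutate(sequence, all_objects, rng):
    """Apply one swap, addition or deletion mutation, whichever is possible."""
    genes = list(sequence)
    unused = [obj for obj in all_objects if obj not in genes]
    options = []
    if len(genes) >= 2:
        options += ["swap", "delete"]
    if unused:
        options.append("add")
    if not options:
        return tuple(genes)
    kind = rng.choice(options)
    if kind == "swap":
        i, j = rng.sample(range(len(genes)), 2)
        genes[i], genes[j] = genes[j], genes[i]
    elif kind == "add":
        genes.insert(rng.randrange(len(genes) + 1), rng.choice(unused))
    else:
        del genes[rng.randrange(len(genes))]
    return tuple(genes)


rng = random.Random(1)
print(crossover(("a",), ("b",), rng))  # (('a', 'b'), ('b', 'a'))
print(mutate(("a", "b", "c"), ("a", "b", "c", "d"), rng))
```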

Evaluation & Results

Experiment with online “course”

That concludes the mechanism of the system I’ve created to minimize our regret while selecting sequences of learning material for students. I’ve also tested it in an experiment, which took the form of a small online course.

Curriculum: Nim game, 4 lessons, 4 OER, student groups Low and High, pre-test T1 and post-test T2 of 3 MC questions each

Algorithm: 7 sequences in 1 generation, 10 evaluations in 1 generation, 2 elite members, 5% mutation

Participants: 237 total usable participants, voluntary participation, could stop at any moment, diverse crowd (not just students)

The course contained a curriculum about the mathematical game Nim. The course consisted of four lessons, in each of which a student was presented with a sequence of learning material composed out of four possible OER. The pre-test and post-test each consisted of three multiple-choice questions, intended to keep the duration of the experiment doable. The students were split up into two groups: a low amount of prior knowledge and a high amount of prior knowledge. The optimal sequence was searched for both groups, more or less independently. The genetic algorithm was set up to have 7 sequences in each generation, with 10 evaluations before the next generation, two elite members in each generation and a 5% chance of mutation in the offspring’s chromosomes. The participants came from many different online sources, participated voluntarily and could stop at any moment. In total, 237 participants contributed to the data gathered.
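To get a feeling for the size of the search space in one lesson, the candidates can be enumerated. This sketch assumes that a sequence is a non-empty ordering of distinct learning objects (a partial permutation of the four OER); the object names are placeholders.

```python
from itertools import permutations


def all_sequences(objects):
    """Every non-empty ordering of distinct objects, of any length."""
    return [
        sequence
        for length in range(1, len(objects) + 1)
        for sequence in permutations(objects, length)
    ]


candidates = all_sequences(["oer1", "oer2", "oer3", "oer4"])
print(len(candidates))  # 64 = 4 + 12 + 24 + 24
```

Under that assumption, a generation of 7 sequences covers roughly a tenth of the candidates of a lesson at any one time.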

Does the system learn to pick sequences with more learning impact over those with less impact?

Figure: Regret in Rules - Low (y-axis: built-up regret; x-axis: students)

Figure: Regret in Intuition - Low (y-axis: built-up regret; x-axis: students)

The system worked for the “Low” student groups

In “High” groups there was either too little data or a technical issue

It’s unknown how good the apparently best sequences really are

In each lesson, the system was successful in finding and staying with the sequences that afterwards appeared to be the best, at least for the Low student group. In the High student groups there was either too little data or a technical issue, so it is unknown whether it would have worked otherwise. Furthermore, even though the system found the sequences that turned out to be the best so far, we do not know how good they actually are, since we have no benchmark or true value available.

Possible explanation

limited pre- and post-test
coarse division of students
independence assumption

Variance in the observed learning impact

Figure: Best sequence in “Rules” lesson for student group “Low” (y-axis: learning impact; x-axis: students)

It is not strange to suspect that our estimated impact is not perfect, given the large amount of variation in the observed learning impact. This graph shows the impact values for all students that went through the apparently best sequence of the first lesson. There are three possible explanations for that variation. 1) The pre- and post-test had too few questions to give a trustworthy measurement of competence. 2) The students were divided into only two groups, leaving each group very heterogeneous. 3) The lessons are assumed to be independent, so the system doesn’t take into account what material was presented to a student in a previous lesson. Of course, that last point doesn’t explain the variation in the first lesson.
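The first explanation can be illustrated with a little arithmetic. Assuming each test is scored as the number of correct answers out of three, and leaving out students with a perfect pre-test (for whom the gain is undefined under the usual definition), the measured impact can only take a handful of values:

```python
from fractions import Fraction


def possible_nlg_values(n_questions):
    """All NLG values that can occur when both tests have n_questions items."""
    values = set()
    for pre in range(n_questions):  # a perfect pre-test is skipped: NLG undefined
        for post in range(n_questions + 1):
            values.add(Fraction(post - pre, n_questions - pre))
    return sorted(values)


values = possible_nlg_values(3)
print([str(v) for v in values])
# ['-2', '-1', '-1/2', '0', '1/3', '1/2', '2/3', '1']
```

With only eight possible outcomes, ranging from -2 to 1, a single lucky or unlucky guess on one question moves a student's measured impact by a large step, which fits the spread seen in the graph.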

In conclusion,

A possible approach for using learning impact in the assessment of OER has been presented and tested.

Many lessons can be drawn from the results, but the principle works.

I recommend others to continue on the path of using learning impact in the assessment of OER.
