adaptive multilevel clustering model for the prediction of academic risk

Adaptive Multilevel Clustering Model for the Prediction of Academic Risk

Xavier Ochoa

Escuela Superior Politécnica del Litoral

http://www.slideshare.net/xaoch


Academic Risk

Failing Homework

Failing Lessons

Failing Courses

Failing Semesters

Drop Out / Kick out

Can Academic Boards mitigate academic risk and

assure quality standards?

• Are these risks to the academics or risks to quality

educational outcomes for the students?

• Are Academic Boards serving the institution or the

students?

http://environmentalrisk.org/dont-touch-this-button/

How do we predict Academic Risk?

How is academic risk predicted?

1. Historical data is collected about previous students and their outcomes

2. That data is used to train predictive models using machine learning techniques


3. The performance of each model is measured against student data not seen by the model before

4. The best model is selected and an academic risk prediction system is built around it


5. The information of a new student is provided to the system (and the model) and the academic risk is estimated

6. That information is provided to the student or counselor for their consideration
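The six steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the two candidate models (`train_base_rate`, `train_gpa_bands`) are hypothetical stand-ins for real machine learning models, and the Brier score is used as the performance measure.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and observed outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

def train_base_rate(train):
    """Hypothetical model 1: predict the overall failure rate for every student."""
    rate = sum(failed for _, failed in train) / len(train)
    return lambda gpa: rate

def train_gpa_bands(train):
    """Hypothetical model 2: predict the failure rate of the student's GPA band."""
    bands = {}
    for gpa, failed in train:
        bands.setdefault(int(gpa), []).append(failed)
    overall = sum(failed for _, failed in train) / len(train)
    rates = {band: sum(v) / len(v) for band, v in bands.items()}
    return lambda gpa: rates.get(int(gpa), overall)

# Step 1: historical records (synthetic here): (gpa on a 0-10 scale, failed flag).
rng = random.Random(0)
history = []
for _ in range(1000):
    gpa = rng.uniform(0, 10)
    failed = 1 if rng.random() < (1 - gpa / 10) else 0
    history.append((gpa, failed))

# Steps 2-3: train each candidate, then score it on students it has not seen.
rng.shuffle(history)
train, held_out = history[:700], history[700:]
candidates = {"base_rate": train_base_rate, "gpa_bands": train_gpa_bands}
models, scores = {}, {}
for name, trainer in candidates.items():
    models[name] = trainer(train)
    predictions = [models[name](gpa) for gpa, _ in held_out]
    scores[name] = brier_score(predictions, [failed for _, failed in held_out])

# Step 4: keep the model with the lowest (best) Brier score.
best_name = min(scores, key=scores.get)

# Step 5: estimate the risk of a new student with the selected model.
new_student_risk = models[best_name](8.5)
print(best_name, round(new_student_risk, 2))
```

Step 6, showing the estimate to the student or counselor, is outside the scope of the sketch.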

Arnold, K. E. & Pistilli, M. D. Course Signals at Purdue: Using learning analytics to increase student success. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, 2012, 267-270

Now we get the best model!

But not the best risk assessment for each individual student

80% accuracy is GREAT if you are among the 80% predicted accurately

80% accuracy means NOTHING if you are among the 20% predicted wrongly

We propose the use of Adaptive Models for Academic Risk Prediction

Use weaknesses and strengths of different models to switch models (or parameters)

to give the best possible prediction for each student

An example with a Clustering Model

Based on the data of Computer Science in ESPOL during the last 13 years

Simple Clustering Model

1. Define the variables that will be used to group students together

2. Define the variables that will be used to group semesters together

3. Create clusters of students and semesters

4. Calculate the failure rate of each cluster

5. Compare current students (and the proposed courses) and find which cluster is closer

6. Assign the failure rate of the closest cluster as the academic risk of the student
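A minimal sketch of these six steps follows. It assumes, purely for illustration, that "clustering" means grouping records that share the same discretized values of two hypothetical variables (`gpa_band` for the student, `n_courses` for the semester); the toy records are invented and the distance function is a plain squared difference, not necessarily what the actual model uses.

```python
from collections import defaultdict

# Hypothetical historical records: (gpa_band, n_courses, failed_any_course)
history = [
    (1, 4, 1), (1, 4, 1), (1, 5, 1), (1, 5, 0),
    (2, 4, 0), (2, 4, 1), (2, 5, 1), (2, 5, 1),
    (3, 4, 0), (3, 4, 0), (3, 5, 0), (3, 5, 1),
]

def build_clusters(records):
    """Steps 1-4: group records by the fixed variables and compute failure rates."""
    members = defaultdict(list)
    for gpa_band, n_courses, failed in records:
        members[(gpa_band, n_courses)].append(failed)
    return {key: sum(v) / len(v) for key, v in members.items()}

def predict_risk(failure_rates, gpa_band, n_courses):
    """Steps 5-6: find the closest cluster and return its failure rate."""
    closest = min(
        failure_rates,
        key=lambda k: (k[0] - gpa_band) ** 2 + (k[1] - n_courses) ** 2,
    )
    return failure_rates[closest]

failure_rates = build_clusters(history)

# A new student in GPA band 3 planning 6 courses: no exact match exists,
# so the closest cluster (band 3, 5 courses) supplies the estimate.
risk = predict_risk(failure_rates, gpa_band=3, n_courses=6)
print(risk)
```

Note that the grouping variables are hard-coded in `build_clusters`: the same clusters are used for every student, however many or few similar cases exist.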

Variables and model are defined at design time

There is nothing left to decide at runtime

Adaptive Multilevel Clustering Model

1. Create a hierarchy of factors that could be used to cluster students and semesters from less specific to more specific


2. Use the fact that the accuracy of prediction based on clustering models is usually related to the size of the cluster (more data produce better prediction)


3. Also, use the fact that the prediction accuracy is also related to the specificity of the cluster


4. For each student, find the optimal balance at runtime by selecting the most specific clustering variables that still produce a cluster of similar cases containing at least a minimum number of previous examples
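The selection rule in step 4 can be sketched as a search through the hierarchy that stops at the first level whose matching cluster is large enough. Everything named here is an assumption for illustration: the variable names, the exact-match lookup, the toy records, and the `min_size` parameter (the minimum number of previous examples). The list is ordered from most to least specific so the search can stop at the first hit.

```python
from collections import defaultdict

# Hypothetical hierarchy of clustering variables, most specific first.
LEVELS = [
    ("gpa_band", "n_courses", "course_set"),
    ("gpa_band", "n_courses"),
    ("gpa_band",),
    (),  # the whole population, as a last resort
]

def index_history(records):
    """Build one cluster table per level: key -> list of observed outcomes."""
    index = []
    for variables in LEVELS:
        clusters = defaultdict(list)
        for rec in records:
            clusters[tuple(rec[v] for v in variables)].append(rec["failed"])
        index.append(clusters)
    return index

def adaptive_risk(index, student, min_size):
    """Return (risk, level) from the most specific cluster with enough examples."""
    for level, (variables, clusters) in enumerate(zip(LEVELS, index)):
        outcomes = clusters.get(tuple(student[v] for v in variables), [])
        if len(outcomes) >= min_size:
            return sum(outcomes) / len(outcomes), level
    raise ValueError("not enough historical data at any level")

def rec(gpa_band, n_courses, course_set, failed):
    return {"gpa_band": gpa_band, "n_courses": n_courses,
            "course_set": course_set, "failed": failed}

history = (
    [rec(2, 5, "A", f) for f in (1, 0, 0, 0, 1, 0)]  # a common combination
    + [rec(2, 5, "B", f) for f in (1, 1)]             # a rare course set
    + [rec(1, 3, "C", f) for f in (1, 0)]             # a rare kind of student
)
index = index_history(history)

mainstream = {"gpa_band": 2, "n_courses": 5, "course_set": "A"}
unusual = {"gpa_band": 1, "n_courses": 6, "course_set": "Z"}

risk_a, level_a = adaptive_risk(index, mainstream, min_size=4)
risk_b, level_b = adaptive_risk(index, unusual, min_size=4)
print(level_a, risk_a)  # uses the most specific level
print(level_b, risk_b)  # falls back to a more generic level
```

The mainstream student is matched at the most specific level, while the unusual student falls back to a more generic level that still has enough examples.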

Cluster Size

Specificity

Student A: Very mainstream student taking a common semester (lots of similar cases)

Adaptive Multilevel Clustering:

Select the more specific clustering variables (Level of Mastery and Course code)

Student will be compared with very specific clusters with good size

Student B: Very odd student taking odd

semester (very few similar cases)

Adaptive Multilevel Clustering:

Select the more generic clustering variables (GPA and Number of Courses)

Student will be compared with generic clusters but with good size

Evaluation

The idea seems ok… but does it work?

Comparison with Simple Clustering

TABLE I. BRIER SCORE FOR THE TRADITIONAL CLUSTERING MODELS IN THE VALIDATION SET

                 Student Performance
Course Load      Level 1    Level 2    Level 3    Level 4
Level 1          0.2051     0.2321     0.3060     0.3682
Level 2          0.2166     0.2473     0.3201     0.3755

TABLE II. BRIER SCORE FOR THE TRADITIONAL CLUSTERING MODELS IN THE TEST SET

                 Student Performance
Course Load      Level 1    Level 2    Level 3    Level 4
Level 1          0.2706     0.2940     0.3208     0.3603
Level 2          0.2939     0.3257     0.3489     0.3710

diagonal means that X% of the students that have received an estimated risk of X ± 5% have failed at least one course in the semester. The closer the points are to the diagonal, the better the prediction and the lower the Brier score.

Fig. 4. Reliability plot for the prediction of the test set by the best-performing clustering model
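For readers unfamiliar with the two measures, the sketch below computes a Brier score (the mean squared difference between the predicted risk and the 0/1 outcome) and the points of a reliability plot using bins 10 percentage points wide, which corresponds to the X ± 5% grouping described above. The function names and the toy predictions are assumptions made for this example.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted risk and outcome (1 = failed)."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

def reliability_points(predicted, observed, width=0.10):
    """For each non-empty bin, return (mean predicted risk, observed failure rate)."""
    n_bins = round(1 / width)
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(predicted, observed):
        i = min(int(p / width), n_bins - 1)  # a prediction of 1.0 goes in the last bin
        bins[i].append((p, o))
    points = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            failure_rate = sum(o for _, o in members) / len(members)
            points.append((mean_p, failure_rate))
    return points

# Toy data: four students told "25% risk" (one failed) and
# four students told "75% risk" (three failed).
predicted = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
observed = [0, 0, 0, 1, 1, 1, 1, 0]

score = brier_score(predicted, observed)
points = reliability_points(predicted, observed)
print(score)
print(points)
```

In this toy case both points lie exactly on the diagonal, yet the Brier score is not zero: calibrated estimates of 25% and 75% still leave uncertainty about each individual student.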

3) Threshold Estimation: To estimate the value of the threshold that maximizes the performance of the adaptive multilevel clustering model, this model is run against the validation dataset with different values for the threshold (from 1 to 400). Figure 5 plots the final Brier score obtained versus the value of the threshold.

The size of the cluster that minimizes the Brier score (0.2021) of the adaptive multilevel clustering model in the validation set is 128. It is interesting to note that the Brier score rapidly decreases for small values of the threshold until values near 100. After 128, the Brier value slowly increases. This shape of the graph suggests that for values around 128, the sensitivity of the Brier score to changes in the value of the threshold is low. Moreover, it is better to overestimate the threshold than to use values too small.
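The empirical search described in this passage amounts to a one-dimensional parameter sweep. The sketch below reproduces its shape on synthetic data only: the record generator, the variable names (`gpa_band`, `load`) and the three-level hierarchy are all hypothetical, so the threshold it selects says nothing about the value of 128 reported for the real dataset.

```python
import random
from collections import defaultdict

LEVELS = [("gpa_band", "load"), ("gpa_band",), ()]  # hypothetical, specific to generic

def index_history(records):
    index = []
    for variables in LEVELS:
        clusters = defaultdict(list)
        for r in records:
            clusters[tuple(r[v] for v in variables)].append(r["failed"])
        index.append(clusters)
    return index

def adaptive_risk(index, student, threshold):
    """Failure rate of the most specific cluster holding at least `threshold` cases."""
    for variables, clusters in zip(LEVELS, index):
        outcomes = clusters.get(tuple(student[v] for v in variables), [])
        if len(outcomes) >= threshold:
            return sum(outcomes) / len(outcomes)
    everyone = index[-1][()]  # fallback: overall failure rate
    return sum(everyone) / len(everyone)

def brier_score(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

def make_records(rng, n):
    """Synthetic students: risk falls with GPA band and rises with course load."""
    records = []
    for _ in range(n):
        gpa_band = rng.randint(0, 4)
        load = rng.randint(3, 7)
        p_fail = min(0.95, max(0.05, 0.8 - 0.15 * gpa_band + 0.05 * (load - 5)))
        records.append({"gpa_band": gpa_band, "load": load,
                        "failed": int(rng.random() < p_fail)})
    return records

rng = random.Random(42)
index = index_history(make_records(rng, 2000))  # historical data
validation = make_records(rng, 500)             # students not seen before
observed = [s["failed"] for s in validation]

# Sweep the threshold from 1 to 400 and record the Brier score of each run.
scores = {}
for threshold in range(1, 401):
    predicted = [adaptive_risk(index, s, threshold) for s in validation]
    scores[threshold] = brier_score(predicted, observed)

best_threshold = min(scores, key=scores.get)
print(best_threshold, round(scores[best_threshold], 4))
```

Plotting `scores` against the threshold would give the synthetic counterpart of Figure 5.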

Fig. 5. Empirical estimation of the threshold parameter

It is also interesting that the lowest Brier score obtained for the adaptive multilevel clustering model (0.2021) is lower than the lowest score obtained by the traditional clustering models (0.2050) over the validation set.

4) Adaptive Multilevel Clustering Model: Once the threshold has been established (128), the adaptive multilevel clustering model is run over the test set. The Brier score obtained is 0.2646, lower than the 0.2706 obtained by the best traditional clustering model. Figure 6 presents the reliability plot of the adaptive model.

Fig. 6. Reliability plot for the prediction of the test set for the adaptive multilevel clustering model

To confirm that the selected threshold value is a good estimation of the real threshold for the test set, the adaptive multilevel clustering model is run over the test set with different values for the threshold. The results can be seen in

Brier Score = 0.27

High error for certain students

Comparison with Simple Clustering


Brier Score = 0.26

More distributed errors

Best possible forecast for each student

Conclusions

And what to do with all of these

Academic prediction systems are about the students, not the model

Vs.?

Adaptive Systems and User-Controlled Analysis

Collaborative process of LA design

Gracias / Thank you

Questions?

Xavier Ochoa

[email protected]

ariadne.cti.espol.edu.ec/xavier

Twitter: @xaoch
